Simple Linear Regression
The Simple Linear Regression model
import pandas as pd
import pylab as pl
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
# Data source: https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv
#### Load Data ####
df = pd.read_csv("FuelConsumption.csv")
cdf = df[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB','CO2EMISSIONS']]
# Engine size is the independent variable (X); CO2 emission is the dependent variable (y)
X = cdf[['ENGINESIZE']]
y = cdf[['CO2EMISSIONS']]
#### Split dataset into train and test ####
'''
Next, we split the dataset into a training set and a test set. We train the model on the training set and then check its performance on the test set.
For this, we use the train_test_split function from sklearn.model_selection.
We pass a test_size of 1/3, so roughly one third of the rows go to the test set and the remaining two thirds go to the training set.
We set random_state to a seed value of 0 so that the split is reproducible.
'''
from sklearn.model_selection import train_test_split
seed = 0
test_size = 1/3
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=seed)
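To see what the split does to row counts, here is a minimal sketch on a hypothetical 30-row array. `X_demo` and `y_demo` are made-up stand-ins, not the fuel consumption data. With 30 rows, a test_size of 1/3 should put 10 rows in the test set and 20 in the training set; on a larger dataset the same fraction applies to however many rows it has.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 30 rows, one feature (not the real dataset)
X_demo = np.arange(30, dtype=float).reshape(-1, 1)
y_demo = 2.0 * X_demo.ravel() + 1.0

# Same settings as above: one third held out, fixed seed for reproducibility
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=1/3, random_state=0
)

print("train rows:", len(X_tr))
print("test rows:", len(X_te))
```

Printing the lengths of X_train and X_test in the same way is a quick check that the split on the real data came out as intended.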
#### Fit Simple Linear Regression model to training set ####
'''
We use the LinearRegression class from sklearn.linear_model. First, we create a LinearRegression object, then call its fit method with X_train and y_train.
'''
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
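After fitting, the learned line can be read from the `coef_` (slope) and `intercept_` attributes. The sketch below uses hypothetical toy data generated from y = 2x + 1 (not the fuel consumption data), so the fitted slope and intercept should come out close to 2 and 1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data lying exactly on the line y = 2x + 1
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = 2.0 * X_demo.ravel() + 1.0

model = LinearRegression().fit(X_demo, y_demo)

slope = model.coef_[0]       # one coefficient per feature; here a single feature
intercept = model.intercept_

print("slope:", slope)
print("intercept:", intercept)
```

Note that in the script above y_train is a one-column DataFrame rather than a 1-D array, so regressor.coef_ is a 2-D array and regressor.intercept_ is a 1-element array; index into them accordingly.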
#### Predict the test set ####
y_pred = regressor.predict(X_test)
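The text mentions checking performance on the test set, and one common way to do that is with regression metrics from sklearn.metrics. This is a sketch on hypothetical values (`y_true` and `y_hat` are invented for illustration); the choice of metrics is an assumption, not something the original script specifies.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical actual and predicted values (not from the real dataset)
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.0, 7.5, 9.0])

mae = mean_absolute_error(y_true, y_hat)  # average absolute error
mse = mean_squared_error(y_true, y_hat)   # average squared error
r2 = r2_score(y_true, y_hat)              # 1.0 would be a perfect fit

print("MAE:", mae)
print("MSE:", mse)
print("R2:", r2)
```

On the real data, the same three calls would take y_test and y_pred as arguments.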
#### Visualizing the training set ####
plt.scatter(X_train, y_train, color='red')
plt.plot(X_train, regressor.predict(X_train), color='blue')
plt.xlabel('Engine Size')
plt.ylabel('Emission')
plt.show()
#### Visualizing the test set ####
plt.scatter(X_test, y_test, color='red')
# The fitted line is the same whichever X values it is drawn over, so we reuse X_train here
plt.plot(X_train, regressor.predict(X_train), color='blue')
plt.xlabel('Engine Size')
plt.ylabel('Emission')
plt.show()
#### Make new predictions ####
# Predict the CO2 emission for an engine size of 3.6
new_pred = regressor.predict([[3.6]])
print(new_pred)
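A prediction from a simple linear regression is just intercept + slope * x, which can be verified by hand. The sketch below uses a hypothetical four-row table (the values are invented, generated from y = 110 + 40x). It also passes the new value as a one-column DataFrame with the training column name, which avoids the feature-name warning that recent scikit-learn versions emit when a model fitted on a DataFrame is given a plain list.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy table following y = 110 + 40 * x (not the real dataset)
train = pd.DataFrame({
    "ENGINESIZE": [1.0, 2.0, 3.0, 4.0],
    "CO2EMISSIONS": [150.0, 190.0, 230.0, 270.0],
})
model = LinearRegression().fit(train[["ENGINESIZE"]], train["CO2EMISSIONS"])

# New observation with the same column name used during fitting
new_X = pd.DataFrame({"ENGINESIZE": [3.6]})
pred = model.predict(new_X)[0]

# The same value computed by hand from the fitted line
manual = model.intercept_ + model.coef_[0] * 3.6

print(pred, manual)
```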