Simple Linear Regression


The Simple Linear Regression model
import pandas as pd
import matplotlib.pyplot as plt
# Jupyter magic: render plots inline (remove this line when running as a plain script)
%matplotlib inline
# Dataset URL: https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv
#### Load Data ####
# The code expects the dataset to be saved locally as FuelConsumption.csv
df = pd.read_csv("FuelConsumption.csv")
cdf = df[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB','CO2EMISSIONS']]
# Use engine size as the independent variable (X) and CO2 emissions as the dependent variable (y)
X = cdf[['ENGINESIZE']]
y = cdf[['CO2EMISSIONS']]
#### split dataset into train and test ####
'''
Next, we split the dataset into a training set and a test set. The model is fitted on the training set, and its performance is then checked on the test set, which it has not seen during fitting.
For this, we use the train_test_split function from the sklearn.model_selection module.
We pass a test_size of 1/3, which means one third of the observations go to the test set and the remaining two thirds go to the training set.
We set random_state to a fixed seed (0) so that the split is reproducible.
'''
from sklearn.model_selection import train_test_split
seed = 0
test_size = 1/3
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = test_size, random_state = seed)
#### Fit Simple Linear Regression model to training set ####
'''
We use the LinearRegression class from the sklearn.linear_model module. First, we create a LinearRegression object, then we call its fit method with X_train and y_train to estimate the intercept and slope of the regression line.
'''
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
#### Predict the test set ####
y_pred = regressor.predict(X_test)
#### Visualizing the training set ####
plt.scatter(X_train, y_train, color = 'red')
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.xlabel('Engine Size')
plt.ylabel('Emission')
plt.show()
#### Visualizing the test set ####
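plt.scatter(X_test, y_test, color = 'red')
# The regression line is the one fitted on the training set, so it is drawn from X_train as before
plt.plot(X_train, regressor.predict(X_train), color = 'blue')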
plt.xlabel('Engine Size')
plt.ylabel('Emission')
plt.show()
#### Make new predictions ####
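# Predict the CO2 emission for an engine size of 3.6.
# The input is a DataFrame with the same column name used for fitting, so scikit-learn does not warn about missing feature names.
new_pred = regressor.predict(pd.DataFrame({'ENGINESIZE': [3.6]}))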
print(new_pred)
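
The walkthrough above says the test set is used to check the model's performance, but it stops at plotting. Below is a minimal sketch of how that check could look, using the mean absolute error and the R^2 score from sklearn.metrics. To keep it runnable without the CSV file, it fits the same kind of model on synthetic data: the generated values, the slope and intercept used to create them, and the names demo, X_demo, y_demo, and model are assumptions made for illustration and are not taken from the FuelConsumption dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data (an assumption, not the actual CSV):
# emissions grow roughly linearly with engine size, plus random noise.
rng = np.random.default_rng(0)
engine_size = rng.uniform(1.0, 8.0, size=300)
emissions = 125.0 + 39.0 * engine_size + rng.normal(0.0, 30.0, size=300)
demo = pd.DataFrame({"ENGINESIZE": engine_size, "CO2EMISSIONS": emissions})

X_demo = demo[["ENGINESIZE"]]   # 2-D feature table, as scikit-learn expects
y_demo = demo["CO2EMISSIONS"]   # 1-D target

# Same split settings as in the walkthrough: one third held out, fixed seed.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=1/3, random_state=0
)

model = LinearRegression().fit(X_tr, y_tr)
y_hat = model.predict(X_te)

# The fitted model is the line: emission = intercept + slope * engine_size
slope = model.coef_[0]
intercept = model.intercept_

# Error metrics computed on the held-out test set only.
mae = mean_absolute_error(y_te, y_hat)
r2 = r2_score(y_te, y_hat)

print(f"fitted line: emission = {intercept:.1f} + {slope:.1f} * engine_size")
print(f"MAE on test set: {mae:.1f}")
print(f"R^2 on test set: {r2:.3f}")
```

With the real data, the same two metric calls would take y_test and y_pred from the walkthrough. An R^2 close to 1 and a small mean absolute error relative to the range of emissions would suggest the line fits the test data well.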
