From the heart-disease dataset, we tried creating a sample linear regression model using two of the features provided from the dataset.

mcabanlit/linear-regression


Linear Regression

For this linear regression example, we will be using the heart disease dataset, which is a public health dataset that can be retrieved from Kaggle.

For this example, we use only two fields: trestbps (resting blood pressure in mm Hg) and thalach (maximum heart rate achieved). The two are not strongly correlated, but for demonstration purposes we use them to fit a linear regression with existing scikit libraries and also with manual calculations in Python.

To calculate the intercept, the b in y = mx + b, we use the following formula, where n is the number of observations:

  • Intercept = [(ΣY)(ΣX²) – (ΣX)(ΣXY)] / [n(ΣX²) – (ΣX)²]

To calculate the slope, the m in y = mx + b, we use the following formula:

  • Slope = [n(ΣXY) – (ΣX)(ΣY)] / [n(ΣX²) – (ΣX)²]
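The two formulas above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's own implementation: the function name `fit_line` and the sample points are made up for the example and do not come from the heart disease dataset.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)                               # ΣX
    sum_y = sum(ys)                               # ΣY
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # ΣXY
    sum_x2 = sum(x * x for x in xs)               # ΣX²

    # Both formulas share the same denominator: n(ΣX²) – (ΣX)²
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are identical; the slope is undefined")

    slope = (n * sum_xy - sum_x * sum_y) / denominator
    intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    return slope, intercept


# Hypothetical points lying exactly on y = 2x + 1 (not real dataset values)
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope}x + {intercept}")
```

Because the sample points lie exactly on a line, the fitted slope and intercept reproduce that line; with real data such as trestbps and thalach, the result is the line that minimizes the squared vertical distances.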

We then compare our manually calculated values with those computed by a library. The snippet below uses stats.linregress from SciPy:

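import matplotlib.pyplot as plt
from scipy import stats

# linear_table is assumed to be a pandas DataFrame, built earlier,
# that holds the two selected fields in its "X" and "Y" columns.
slope, intercept, r, p, std_err = stats.linregress(linear_table["X"], linear_table["Y"])
print(f"y = {slope}x + {intercept}")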
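Besides the slope and intercept, `linregress` returns `r`, the Pearson correlation coefficient, which measures how strongly the two fields are linearly related. As a rough sketch, it can be computed manually from the same sums used for the slope and intercept, plus ΣY². The function name `pearson_r` and the sample values are assumptions made for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("X or Y is constant; the correlation is undefined")
    return numerator / denominator


# Hypothetical values (not from the dataset): a perfectly linear relationship
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
r = pearson_r(xs, ys)
print(f"r = {r}")
```

A value of `r` near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one.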
