The data used in this project comes from Kaggle. It is the USA Housing dataset, which has 7 columns including the target variable "Price". The task is to predict house prices in the USA. I created this notebook to try a handful of ML regression algorithms via an sklearn pipeline.
The project includes basic EDA, outlier analysis, baseline model building, model comparison, an sklearn Pipeline to avoid data leakage, cross-validation and hyperparameter tuning using RandomizedSearchCV, and prediction.
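As a rough illustration of how a pipeline combined with RandomizedSearchCV helps avoid data leakage, here is a minimal sketch. It is not the notebook's actual code: the data is synthetic (a stand-in for the housing features and "Price"), and the choice of `Ridge`, the `alpha` search range, and the split sizes are assumptions made only for the example. Because the scaler lives inside the pipeline, it is re-fitted on the training folds only during each cross-validation split.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the housing features and the "Price" target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X @ np.array([3.0, -2.0, 1.5, 0.5, 4.0]) + rng.normal(scale=0.1, size=300)

# Hold out a test set that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scaling is a pipeline step, so it is fitted on the training folds only
# inside every cross-validation split.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("model", Ridge()),
])

# Randomized search samples 10 candidate values of the Ridge penalty.
search = RandomizedSearchCV(
    pipe,
    param_distributions={"model__alpha": loguniform(1e-3, 1e2)},
    n_iter=10,
    cv=5,
    scoring="r2",
    random_state=0,
)
search.fit(X_train, y_train)

# Score the refitted best pipeline on the untouched test set.
test_r2 = search.score(X_test, y_test)
print("best params:", search.best_params_)
print("held-out R^2:", round(test_r2, 3))
```

The `model__alpha` key follows sklearn's `<step name>__<parameter>` convention for addressing parameters of a pipeline step.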
The regression algorithms I tested in this notebook are:
- Linear Regression
- Robust Regression
- TheilSen Regression
- KNN Regressor
- Decision Tree Regressor
- Elastic Net
- Ridge/Lasso
- Stochastic Gradient Descent
- CatBoost
- LightGBM
- Gradient Boosting Regressor
- Random Forest Regressor
- AdaBoost Regressor
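A comparison across several of these algorithms could be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic rather than the Kaggle dataset, only a subset of the listed models is included (those available in sklearn itself), all hyperparameters are defaults or arbitrary picks, and mean cross-validated R^2 is an assumed choice of metric.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the housing data (assumption, not the real dataset).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, -1.0, 0.5, 3.0, 1.0]) + rng.normal(scale=0.2, size=200)

# A subset of the algorithms listed above, with illustrative settings.
models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(),
    "Lasso": Lasso(),
    "Elastic Net": ElasticNet(),
    "KNN Regressor": KNeighborsRegressor(),
    "Decision Tree Regressor": DecisionTreeRegressor(random_state=0),
    "Random Forest Regressor": RandomForestRegressor(
        n_estimators=50, random_state=0
    ),
}

# Evaluate every model inside the same scaling pipeline with 5-fold CV.
scores = {}
for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)
    scores[name] = cross_val_score(pipe, X, y, cv=5, scoring="r2").mean()

# Print models from best to worst mean cross-validated R^2.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<24} mean CV R^2 = {score:.3f}")
```

Wrapping each estimator in the same pipeline keeps the preprocessing identical across models, so differences in score reflect the estimators rather than the data preparation.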