Machine Learning from Scratch

Common machine learning algorithms implemented from scratch

Index:

  1. Linear Regression using Normal Equation
  2. Linear Regression with Gradient Descent
  3. Logistic Regression with Gradient Descent
  4. Decision Trees
  5. Regression Tree
  6. Logistic Regression using Newton's method
  7. Linear Regression with Ridge Regularization
  8. Perceptron
  9. Autoencoder NN from scratch and using Tensorflow
  10. Classifier Neural Network (square loss) from scratch and using Tensorflow
  11. Classifier Neural Network (cross-entropy) from scratch and using Tensorflow
  12. Gaussian Discriminant Analysis
  13. Naive Bayesian Classifier
  14. Expectation Maximization
  15. AdaBoost
  16. AdaBoost with Active Learning
  17. AdaBoost with missing data (on UCI datasets)
  18. Error Correcting Output Codes
  19. Gradient Boosted Trees
  20. Feature Selection
  21. PCA for Feature Reduction
  22. Logistic Regression with regularization
  23. HAAR Image feature extraction
  24. Support Vector Machine
  25. Dual Perceptron

Linear Regression (Normal Equation)

Linear Regression with a Mean Squared Error cost function. The weights are trained with the Normal Equation (closed-form solution).

Linear Regression for predicting Housing Price

Linear Regression for Email Spam detection
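
For reference, a minimal sketch of the closed-form solution, w = (X^T X)^(-1) X^T y with a bias column appended to X. It is illustrative and not necessarily the exact code in this repository.

```python
import numpy as np

def normal_equation(X, y):
    """Closed-form linear regression: w = (X^T X)^{-1} X^T y."""
    X = np.column_stack([np.ones(len(X)), X])    # prepend a bias column
    # Solving the normal equations is more stable than forming an explicit inverse
    return np.linalg.solve(X.T @ X, X.T @ y)

# hypothetical usage on a toy dataset
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
print(normal_equation(X, y))  # approximately [0, 2]
```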

Linear Regression (using Batch Gradient Descent)

Linear Regression for House Price and Spam Email prediction using Batch Gradient Descent.

Linear Regression with Gradient Descent-Spambase

Cost function

ROC curve


Linear Regression with Gradient Descent-Housing

Cost function
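
A minimal sketch of the batch update on the MSE cost (variable names and the learning-rate default are illustrative assumptions, not this repo's code):

```python
import numpy as np

def batch_gradient_descent(X, y, lr=0.01, epochs=1000):
    """Linear regression trained with batch gradient descent on the MSE cost."""
    X = np.column_stack([np.ones(len(X)), X])      # bias column
    w = np.zeros(X.shape[1])
    costs = []
    for _ in range(epochs):
        error = X @ w - y
        grad = X.T @ error / len(y)                # gradient of 0.5 * mean squared error
        w -= lr * grad
        costs.append(0.5 * np.mean(error ** 2))    # cost per epoch, as plotted above
    return w, costs
```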

Logistic Regression with Batch Gradient Descent

Logistic Regression - Spambase dataset

Log Likelihood

ROC curve
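
A minimal sketch of the gradient-ascent update on the log-likelihood (illustrative only; the step size and epoch count are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression_gd(X, y, lr=0.1, epochs=500):
    """Logistic regression with 0/1 labels, trained by batch gradient ascent."""
    X = np.column_stack([np.ones(len(X)), X])   # bias column
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)                      # predicted probabilities
        w += lr * X.T @ (y - p) / len(y)        # ascend the log-likelihood
    return w
```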

Decision Tree

Decision Tree to classify data points in the Spambase dataset.

Decision Tree
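
A minimal sketch of the entropy-based split criterion a decision tree like this typically uses (illustrative; not necessarily the exact splitting code in this repo):

```python
import numpy as np

def entropy(y):
    """Shannon entropy of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(y, feature_values, threshold):
    """Gain from splitting the node on feature_values <= threshold."""
    left = feature_values <= threshold
    right = ~left
    if left.sum() == 0 or right.sum() == 0:
        return 0.0
    n = len(y)
    return entropy(y) - (left.sum() / n) * entropy(y[left]) - (right.sum() / n) * entropy(y[right])
```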

Regression Tree

Regression Tree to predict continuous valued data for Housing price dataset.

Regression Tree

Logistic Regression using Newton's Method

Train logistic regression using Newton's method (each iteration takes a closed-form Newton step using the gradient and the Hessian).

Logistic Regression with Newton's method

Log likelihood
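
A minimal sketch of the Newton update for the logistic log-likelihood (illustrative; the iteration count is an assumption):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression_newton(X, y, iterations=10):
    """Logistic regression trained with Newton's method (IRLS-style updates)."""
    X = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(X.shape[1])
    for _ in range(iterations):
        p = sigmoid(X @ w)
        grad = X.T @ (y - p)                 # gradient of the log-likelihood
        S = np.diag(p * (1 - p))             # diagonal weight matrix
        H = -X.T @ S @ X                     # Hessian of the log-likelihood
        w -= np.linalg.solve(H, grad)        # Newton step: w <- w - H^{-1} grad
    return w
```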

Linear Regression with Ridge Regularization

Train Linear Regression with Ridge regularization to penalize large weights.

Linear Regression Ridge regularization - Housing

Linear Regression Ridge regularization - Spambase
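
A minimal sketch of the ridge closed form, w = (X^T X + lambda I)^(-1) X^T y (illustrative; leaving the bias unpenalized is a common convention, not necessarily this repo's choice):

```python
import numpy as np

def ridge_regression(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lambda * I)^{-1} X^T y."""
    X = np.column_stack([np.ones(len(X)), X])
    I = np.eye(X.shape[1])
    I[0, 0] = 0.0                    # do not penalize the bias term
    return np.linalg.solve(X.T @ X + lam * I, X.T @ y)
```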

Perceptron

Single-layer perceptron to classify a dataset with 0/1 labels.

Perceptron

Mistakes per Iteration
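
A minimal sketch of the perceptron learning rule, including the mistakes-per-iteration count plotted above (illustrative; function and variable names are assumptions):

```python
import numpy as np

def perceptron_train(X, y, epochs=50):
    """Perceptron learning rule on 0/1 labels (internally mapped to -1/+1)."""
    X = np.column_stack([np.ones(len(X)), X])   # bias column
    t = np.where(y == 1, 1, -1)
    w = np.zeros(X.shape[1])
    mistakes_per_epoch = []
    for _ in range(epochs):
        mistakes = 0
        for xi, ti in zip(X, t):
            if ti * (w @ xi) <= 0:      # misclassified: move the boundary toward the example
                w += ti * xi
                mistakes += 1
        mistakes_per_epoch.append(mistakes)
        if mistakes == 0:               # converged on linearly separable data
            break
    return w, mistakes_per_epoch
```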

Autoencoder Neural Network (scratch & tf)

Implemented a multilayer perceptron autoencoder, both from scratch and using TensorFlow.

AutoEncoder from Scratch

Loss per Epoch

AutoEncoder using Tensorflow

Loss per Epoch

Square Loss Classifier Neural Network (scratch & tf)

Implemented a multilayer perceptron classifier, both from scratch and using TensorFlow. Uses sigmoid activations with a square loss.

Classifier with Square Loss from scratch

Loss per Epoch

Classifier with Square Loss using tensorflow

Loss per Epoch

Cross-Entropy Classifier Neural Network

Implemented a multilayer perceptron classifier, both from scratch and using TensorFlow. Uses sigmoid activations in the hidden layer and a softmax output layer with a cross-entropy loss.

Classifier with Cross Entropy Loss from scratch

Loss per Epoch

Classifier with Cross Entropy Loss using Tensorflow

Loss per Epoch
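
A minimal sketch of the softmax output and cross-entropy loss used by these classifiers (the one-hidden-layer forward pass below is an illustrative assumption, not the repo's architecture):

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y_onehot):
    """Mean cross-entropy between predicted probabilities and one-hot labels."""
    return -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))

def forward(X, W1, b1, W2, b2):
    """Hypothetical one-hidden-layer forward pass: sigmoid hidden, softmax output."""
    hidden = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return softmax(hidden @ W2 + b2)
```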

Gaussian Discriminant Analysis

Implemented GDA, which fits a Gaussian distribution to each class to form a discriminant function for prediction.

GDA

Naive Bayesian Classifier

Implemented a Naive Bayes classifier that applies Bayes' rule with a Gaussian fitted to each feature's class-conditional data.

Naive Bayes Classifier (Gaussian)

Naive Bayes Classifier with Bernoulli

Naive Bayes Classifier with features broken down into 9 bins

Naive Bayes Classifier with features broken down into 4 bins

Naive Bayes Classifier on Polluted Dataset

Naive Bayes on missing data

We handle missing data in Naive Bayes by ignoring missing values in the Bernoulli probability calculations.
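
A minimal sketch of a Gaussian Naive Bayes fit/predict cycle (illustrative; class and method names are assumptions, not this repo's code):

```python
import numpy as np

class GaussianNaiveBayes:
    """Gaussian Naive Bayes: class priors plus a per-feature Gaussian per class."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.priors = {c: np.mean(y == c) for c in self.classes}
        self.means = {c: X[y == c].mean(axis=0) for c in self.classes}
        self.vars = {c: X[y == c].var(axis=0) + 1e-9 for c in self.classes}   # smoothed
        return self

    def predict(self, X):
        scores = []
        for c in self.classes:
            # log prior + sum of per-feature Gaussian log-likelihoods
            log_lik = -0.5 * np.sum(
                np.log(2 * np.pi * self.vars[c]) + (X - self.means[c]) ** 2 / self.vars[c],
                axis=1,
            )
            scores.append(np.log(self.priors[c]) + log_lik)
        return self.classes[np.argmax(np.array(scores), axis=0)]
```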

Expectation Maximization

Given data drawn from a mixture of Gaussians, the EM algorithm estimates which Gaussian each data point belongs to.

Expectation Maximization (mixture of Gaussian)

Expectation Maximization for Coin Flipping example

Given two biased coins and data points generated by picking one coin at random and flipping it d times, the EM algorithm estimates which coin produced each data point.
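
A minimal sketch of EM for the two-coin example, keeping the coin-selection prior uniform for simplicity (illustrative; the initial bias guesses are assumptions):

```python
import numpy as np

def em_two_coins(heads, flips, iterations=50):
    """EM for the two-biased-coins example.

    heads[i] = number of heads in the i-th sequence of `flips` tosses.
    The coin-selection prior is kept uniform (0.5 / 0.5) for simplicity.
    """
    heads = np.asarray(heads, dtype=float)
    theta = np.array([0.6, 0.5])                     # initial guesses for each coin's bias
    for _ in range(iterations):
        # E-step: likelihood of each sequence under each coin; the binomial
        # coefficient cancels when normalizing the responsibilities.
        lik = np.array([t ** heads * (1 - t) ** (flips - heads) for t in theta])
        resp = lik / lik.sum(axis=0)                 # shape (2, n): P(coin | sequence)
        # M-step: re-estimate each coin's bias from its responsibility-weighted counts
        theta = (resp @ heads) / (resp.sum(axis=1) * flips)
    return theta, resp
```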

AdaBoost

Boosting combines multiple weak learners into a strong model for prediction. I have used a simple 1-split decision tree (decision stump) as the weak learner for this AdaBoost implementation.
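
A minimal sketch of the AdaBoost weight-update loop (illustrative; `build_stump` is an assumed helper that returns the best weighted weak learner, not a function from this repo):

```python
import numpy as np

def adaboost(X, y, build_stump, rounds=100):
    """AdaBoost with decision stumps; y must be labelled -1/+1.

    `build_stump(X, y, weights)` is assumed to return a weak learner with a
    .predict(X) method chosen to minimize the weighted error.
    """
    n = len(y)
    weights = np.full(n, 1.0 / n)
    models, alphas = [], []
    for _ in range(rounds):
        stump = build_stump(X, y, weights)
        pred = stump.predict(X)
        err = np.sum(weights[pred != y])
        alpha = 0.5 * np.log((1 - err) / (err + 1e-12))
        weights *= np.exp(-alpha * y * pred)          # up-weight misclassified points
        weights /= weights.sum()
        models.append(stump)
        alphas.append(alpha)
    return models, alphas

def adaboost_predict(models, alphas, X):
    return np.sign(sum(a * m.predict(X) for m, a in zip(models, alphas)))
```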

AdaBoost with Optimal Thresholding

Optimal thresholding means searching over all decision stumps, i.e. all (feature, threshold) combinations, to find the one that gives the largest improvement in the weighted predictions.

Error at Each Round

Train/Test Error

AUC curve

ROC curve

AdaBoost with Random Thresholding

Random thresholding means picking a decision stump, i.e. a (feature, threshold) combination, at random.

Error at Each Round

Train/Test Error

AUC curve

ROC curve

AdaBoost on Polluted Data

Implemented AdaBoost with optimal decision stumps on the polluted dataset. Output stored in ./logs/out_polluted.txt

AdaBoost with Active Learning

Active learning is a technique in which we start with some percentage of randomly chosen data in the training set and then keep adding the data points with the least error to the training set. I have used AdaBoost with optimal decision stumps to implement active learning, starting with 5, 10, 15, 20, 30, and 50 percent of randomly chosen data.

AdaBoost on UCI Datasets

Implemented AdaBoost on popular UCI datasets, handling missing data in both datasets. Performance was tested using a fixed percentage of the data selected at random.

Error Correcting Output Codes

Error Correcting Output Codes (ECOC) uses a coding matrix to apply a binary classifier to a multi-class dataset. Each column in the coding matrix represents a subset of the labels, and each column gets its own model (in our case AdaBoost is the individual learner).
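
A minimal sketch of ECOC training and Hamming-distance decoding (illustrative; `train_binary` is an assumed helper standing in for the AdaBoost learner):

```python
import numpy as np

def ecoc_train(X, y, coding_matrix, train_binary):
    """Train one binary learner per column of the coding matrix.

    coding_matrix[k, j] in {-1, +1} says which side class k falls on for learner j.
    `train_binary(X, targets)` is an assumed helper returning a model with .predict(X).
    """
    models = []
    for j in range(coding_matrix.shape[1]):
        targets = coding_matrix[y, j]      # relabel each example by its class's code bit
        models.append(train_binary(X, targets))
    return models

def ecoc_predict(models, coding_matrix, X):
    """Predict the class whose codeword is closest (in Hamming distance) to the outputs."""
    outputs = np.column_stack([np.sign(m.predict(X)) for m in models])   # (n, columns)
    distances = np.array([np.sum(outputs != code, axis=1) for code in coding_matrix])
    return np.argmin(distances, axis=0)
```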

Bagging

Bagging involves creating small bags of x% of the data picked at random with replacement. A model is trained on each bag, and predictions are made from the average or the mode of the individual models' predictions, depending on the type of labels.
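
A minimal sketch of bagging with a majority-vote combiner (illustrative; `train_model` is an assumed helper, and the bag fraction is an assumption):

```python
import numpy as np

def bagging_train(X, y, train_model, n_bags=10, bag_fraction=0.6, seed=0):
    """Train one model per bootstrap sample (sampling with replacement).

    `train_model(X, y)` is an assumed helper returning a fitted model with .predict(X).
    """
    rng = np.random.default_rng(seed)
    bag_size = int(bag_fraction * len(y))
    models = []
    for _ in range(n_bags):
        idx = rng.integers(0, len(y), size=bag_size)   # sample indices with replacement
        models.append(train_model(X[idx], y[idx]))
    return models

def bagging_predict_classification(models, X):
    """Majority vote over the bagged models (assumes non-negative integer labels)."""
    preds = np.array([m.predict(X) for m in models])
    return np.apply_along_axis(lambda col: np.bincount(col.astype(int)).argmax(), 0, preds)
```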

Feature Selection

Select the top important features based on margin analysis.

PCA for Feature Reduction

PCA reduces the number of features in the dataset by constructing new features that are linear combinations of the original ones. We used sklearn's PCA implementation to reduce the dataset to 100 features and then ran Naive Bayes on them, obtaining good results.
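
A minimal, runnable sketch using sklearn's PCA; the random data and sklearn's GaussianNB below stand in for the repo's dataset and its own Naive Bayes implementation:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB

# Illustrative data standing in for the actual dataset (samples x features).
X = np.random.rand(500, 784)
y = np.random.randint(0, 2, size=500)

# Reduce to 100 principal components, then fit Naive Bayes on the reduced features.
X_reduced = PCA(n_components=100).fit_transform(X)
model = GaussianNB().fit(X_reduced, y)
print(model.predict(X_reduced[:5]))
```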

Logistic Regression with regularization

Logistic Regression with Ridge regularization

Logistic Regression with LASSO regularization

HAAR Image Feature Extraction

We extracted features from the MNIST dataset using the HAAR methodology:

  1. Create 100 rectangles at random positions and sizes within the image dimensions.
  2. Extract 2 features per rectangle by imposing these 100 rectangles on each image.
  3. We get 200 features per image.
  4. Feed these features and labels into a multi-class classifier to obtain predictions (in our case, AdaBoost with ECOC), as sketched below.
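
A sketch of one common way to compute the two features per rectangle, as the vertical and horizontal half-differences of the pixel sums inside the rectangle (this convention is an assumption; the image size defaults to MNIST's 28x28):

```python
import numpy as np

def random_rectangles(n=100, img_size=28, min_side=5, seed=0):
    """Generate n random rectangles (top, left, height, width) inside the image."""
    rng = np.random.default_rng(seed)
    rects = []
    for _ in range(n):
        h = rng.integers(min_side, img_size + 1)
        w = rng.integers(min_side, img_size + 1)
        top = rng.integers(0, img_size - h + 1)
        left = rng.integers(0, img_size - w + 1)
        rects.append((top, left, h, w))
    return rects

def haar_features(image, rectangles):
    """Two HAAR-like features per rectangle: vertical and horizontal half-differences."""
    features = []
    for top, left, h, w in rectangles:
        patch = image[top:top + h, left:left + w]
        upper, lower = patch[: h // 2].sum(), patch[h // 2:].sum()
        left_half, right_half = patch[:, : w // 2].sum(), patch[:, w // 2:].sum()
        features.append(upper - lower)             # vertical difference
        features.append(left_half - right_half)    # horizontal difference
    return np.array(features)                      # 2 features per rectangle
```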

Support Vector Machine

SVM using sklearn

SVM with HAAR features: use the extracted HAAR features from the MNIST dataset in an SVM

SVM with SMO from scratch

SVM on MNIST with HAAR features
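
A minimal sketch of the sklearn route on HAAR-style features (the random data stands in for the extracted features; the kernel and parameters are assumptions):

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative stand-in for the extracted HAAR features (200 per image) and digit labels.
X = np.random.rand(1000, 200)
y = np.random.randint(0, 10, size=1000)

# sklearn's SVC handles the multi-class case via one-vs-one internally.
model = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print(model.predict(X[:5]))
```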

K-Nearest Neighbors

Implemented KNN with different numbers of nearest neighbors, and with different kernels such as Gaussian, Cosine, and Polynomial.

Implemented KNN with probability density estimator

Implemented KNN with a fixed radius, measured by Euclidean distance.

Implemented KNN with feature selection by independently assessing the quality of each feature and adjusting the feature weights with each training instance.
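
A minimal sketch of Gaussian-kernel-weighted KNN for a single query point (illustrative; the bandwidth parameter is an assumption):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=5, bandwidth=1.0):
    """Predict the label of a single point with Gaussian-kernel-weighted KNN."""
    distances = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(distances)[:k]                    # indices of the k closest points
    weights = np.exp(-(distances[nearest] ** 2) / (2 * bandwidth ** 2))
    votes = {}
    for idx, w in zip(nearest, weights):
        votes[y_train[idx]] = votes.get(y_train[idx], 0.0) + w
    return max(votes, key=votes.get)                       # weighted majority vote
```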

Dual Perceptron

Implemented the Dual Perceptron with linear and Gaussian kernels.
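
A minimal sketch of the dual (kernel) perceptron, which keeps a per-example mistake counter instead of an explicit weight vector (illustrative; labels are assumed to be -1/+1):

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

def dual_perceptron_train(X, y, kernel, epochs=20):
    """Dual perceptron: alpha[i] counts how many times example i was misclassified."""
    n = len(y)
    alpha = np.zeros(n)
    K = np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])  # Gram matrix
    for _ in range(epochs):
        for i in range(n):
            # prediction for example i from the current mistake counts
            if y[i] * np.sum(alpha * y * K[:, i]) <= 0:
                alpha[i] += 1          # mistake: count example i one more time
    return alpha

def dual_perceptron_predict(X_train, y_train, alpha, kernel, x):
    score = sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y_train, X_train))
    return 1 if score > 0 else -1
```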
