This repository contains a series of machine learning projects from the Distributed Data Analytics lab course, part of the Data Analytics Master's program at the University of Hildesheim.
Contents:
- Introduction to Message Passing Interface (MPI)
- Point-to-point communication
- Collective communication
- K-means with MPI
- Gradient descent with MPI
- Counting neighbors with a MapReduce program in Hadoop
- TextRank with a MapReduce program in Hadoop
- Converting RGB images to grayscale with a MapReduce program in Hadoop
- Introduction to PyTorch
- Regression on the Wine Quality dataset
- Convolutional architecture in PyTorch
- Image classification on the Flower dataset
- Loading the dataset into RAM
- Multiprocessing in PyTorch
- Image classification on the Flower dataset with multiple workers
- Tracking and visualizing performance with TensorBoard
- More exercises with PyTorch
- Parameter updates without an optimizer
- Regression with a neural network on the California Housing dataset
- Learning rate optimization
- Activation functions: ReLU/Tanh
- More exercises with PyTorch
- Binary classification on the a9a dataset
- Multiprocessing with dataset partitions
- Binary classification on the Gisette dataset
- Image classification on the Flower dataset
- Effect of batch size on training time
- Effect of batch normalization on training accuracy
- Depthwise separable convolutional layer