This repository contains the homework assignments and other material for the Big Data Engineering course (AY 21/22) at the University of Naples Federico II.
All homework assignments were developed in a team of two.
- Homework1-MongoDB: design and development of a NoSQL database, using MongoDB Compass, to store the Yelp Dataset collections.
- Homework2-ApachePig: processing of the Yelp Dataset Reviews collection with Apache Pig, using the Pig Latin language.
- Homework3-ApacheSpark: distributed processing with Spark (PySpark), run on Google Colab, to analyze the Yelp Dataset collections.
- HomeworkFinale-KPMG: data analysis of MIUR and ISTAT open datasets on university enrollment, using Python for data pre-processing, MongoDB for storage, and Apache Spark for analysis.
- KPMG-UniversityTrends: Python processing of MIUR and ISTAT open datasets and an analysis of university trends with Pandas DataFrames, plus dashboards built in Microsoft Power BI.