Data Engineering Project - Python, PySpark & SQL - Azure Data Factory (ADF), DataBricks, Synapse Analytics, Azure Data Lake Storage (ADLS) Gen2, Power BI, Tableau and Looker Studio
Updated Oct 13, 2023 - Jupyter Notebook
Data Engineering Project using Tokyo Olympic Data
This project demonstrates an ETL pipeline using Microsoft Azure for IMDb Movie Rating Dataset analysis. It covers data extraction from Azure Blob Storage, transformation with Azure Databricks, and loading into Azure SQL using Azure Data Factory. The pipeline automates insights generation and is a practical example of cloud-based data engineering.
Development of a Data Pipeline using Azure Synapse
Real-Time Stock-Market Data Streaming Using Kafka
Conducted comprehensive data analysis on Retail Sales data leveraging Azure Core Services, Azure Databricks and Power BI
An end-to-end data engineering pipeline that fetches data from the Bing API, then cleans and transforms it with Azure Databricks. Sentiment analysis is performed in Azure ML and the data is visualized using Tableau.
A data project that uses Azure services to turn raw data from GitHub into reports. It uses Azure Data Factory for data ingestion, Databricks for PySpark transformations, Synapse Analytics for analysis, and Power BI for visualization.
Formula1 ADF pipeline
Tokyo-olympic-azure-data-engineering-end-to-end-project
Azure pipeline for data analytics on Tokyo Olympics data
This project builds an End-to-End Azure Data Engineering Pipeline, performing ETL and Analytics Reporting on the AdventureWorks2022LT Database.
This project builds an End-to-End Azure Data Engineering Pipeline, performing ETL and Analytics Reporting on the AdventureWorks2017LT Database.
Formula 1 race data engineering project that uses Azure services and Databricks to ingest and analyse the data.
A comprehensive ETL pipeline and sales analysis project leveraging Microsoft Azure and PySpark, designed to optimize e-commerce sales by providing actionable insights through detailed data analysis.
An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow and saves it on Azure Data Lake. Other processing takes place on Azure Data Factory, Azure Synapse and Tableau.
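Several of the projects listed above describe the same three-stage shape: extract raw files from storage, transform them, and load the result into a SQL store. The following is a minimal sketch of that pattern only. It is not taken from any of the listed repositories and uses no Azure SDK. An in-memory CSV string stands in for a file in Blob Storage or ADLS Gen2, plain Python stands in for the Databricks transformation step, and `sqlite3` stands in for Azure SQL. The sample rows, the `ratings` table, and the function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows standing in for a raw file in blob storage.
RAW_CSV = """title,year,rating
Movie A,2001,7.5
Movie B,,8.1
Movie C,2010,not_rated
Movie D,2015,6.0
"""


def extract(raw_text):
    """Extract stage: parse raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform stage: cast types and drop rows that fail conversion."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((row["title"], int(row["year"]), float(row["rating"])))
        except ValueError:
            # Missing year or non-numeric rating: skip the row.
            continue
    return cleaned


def load(rows, conn):
    """Load stage: write cleaned rows into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ratings (title TEXT, year INTEGER, rating REAL)"
    )
    conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(raw_text, conn):
    """Chain the three stages, as an orchestrator would."""
    load(transform(extract(raw_text)), conn)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the target SQL database
    run_pipeline(RAW_CSV, conn)
    print(conn.execute("SELECT COUNT(*), AVG(rating) FROM ratings").fetchone())
```

In a real Azure setup each stage would typically be a separate service or activity, for example a copy activity, a notebook, and a sink dataset. Keeping the stages as separate functions here mirrors that separation, so each one can be swapped or tested on its own.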