Udacity Data Engineering project: Data Warehouse
Updated Feb 14, 2021 · Python
Udacity Data Engineering Nanodegree Project 3
AWS pipeline examples from the Udacity Data Engineering Nanodegree.
A data warehouse on Amazon Redshift using a star schema to facilitate the analysis of user behaviour on a music streaming app.
This project is a data warehousing solution for Sparkify, a music streaming service. It extracts data from JSON logs and stores it in a star-schema data model in Amazon Redshift.
Create a Redshift data warehouse on AWS using a snowflake schema and the columnar `COPY` operation, via SQL and Python
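A minimal sketch of how such a pipeline typically issues the columnar `COPY` command from Python; the connection details, table name, S3 paths, and IAM role below are placeholders, not values from any of these repositories:

```python
import psycopg2

# Hypothetical connection details -- replace with your own cluster endpoint and credentials.
conn = psycopg2.connect(
    host="my-cluster.abc123.us-west-2.redshift.amazonaws.com",
    dbname="dev",
    user="awsuser",
    password="********",
    port=5439,
)

# COPY loads the JSON files from S3 into a staging table in parallel across slices.
copy_sql = """
    COPY staging_events
    FROM 's3://my-bucket/log_data'
    IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'
    FORMAT AS JSON 's3://my-bucket/log_json_path.json'
    REGION 'us-west-2';
"""

with conn, conn.cursor() as cur:
    cur.execute(copy_sql)
```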
Udacity Course, Data Engineering Nanodegree, 3rd Project, Data Warehouse with Amazon Redshift
A repository focused on high-end parallel pipelines that perform ETL across various data sources
Warehousing with Redshift
A music streaming startup, Sparkify, has grown its user base and song database and wants to move its processes and data onto the cloud. The data resides in S3, in a directory of JSON logs on user activity on the app, as well as a directory with JSON metadata on the songs in their app. The objective of the project is to create an ETL pipeline t…
The data is collected from IMDb and then transformed before loading into the warehouse
This project builds a cloud-based ETL pipeline for Sparkify to move data to a cloud data warehouse. It extracts song and user activity data from AWS S3, stages it in Redshift, and transforms it into a star-schema data model with fact and dimension tables, enabling efficient querying to answer business questions.
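A hedged sketch of the kind of staging-to-star-schema transformation such a pipeline runs inside Redshift; the table and column names are illustrative (loosely following the Sparkify song-play model), not taken from the repository, and `conn` is the psycopg2 connection from the `COPY` sketch above:

```python
# Illustrative staging-to-fact-table load; names and join conditions are assumptions.
songplay_insert = """
    INSERT INTO songplays (start_time, user_id, level, song_id, artist_id,
                           session_id, location, user_agent)
    SELECT TIMESTAMP 'epoch' + e.ts / 1000 * INTERVAL '1 second' AS start_time,
           e.user_id,
           e.level,
           s.song_id,
           s.artist_id,
           e.session_id,
           e.location,
           e.user_agent
    FROM staging_events e
    JOIN staging_songs s
      ON e.song = s.title AND e.artist = s.artist_name
    WHERE e.page = 'NextSong';
"""

with conn, conn.cursor() as cur:
    cur.execute(songplay_insert)  # populate the fact table from the staging tables
```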
Data Pipeline with Apache Airflow
Implement an ETL data pipeline that reads data from an S3 bucket and loads it into AWS Redshift using Airflow
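A rough sketch of such an Airflow DAG using the Amazon provider's `S3ToRedshiftOperator`; the bucket, key, table, and connection IDs are placeholders rather than the project's actual configuration:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator

# Placeholder bucket, table, and connection IDs -- adjust to your environment.
with DAG(
    dag_id="s3_to_redshift_etl",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_events = S3ToRedshiftOperator(
        task_id="load_staging_events",
        s3_bucket="my-data-bucket",
        s3_key="log_data/",
        schema="public",
        table="staging_events",
        copy_options=["FORMAT AS JSON 'auto'"],
        aws_conn_id="aws_default",
        redshift_conn_id="redshift_default",
    )
```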
Queryable API analyzing the pay gap between the WNBA and NBA by scraping multiple data sources using Python and Beautiful Soup
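A small sketch of the requests + Beautiful Soup scraping pattern such a project relies on; the URL and table markup here are hypothetical:

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical salary page -- the real project scrapes several sources.
resp = requests.get("https://example.com/wnba-salaries", timeout=30)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
salaries = []
for row in soup.select("table tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in row.find_all("td")]
    if len(cells) >= 2:
        salaries.append({"player": cells[0], "salary": cells[1]})

print(salaries[:5])
```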
Database Schema & ETL pipeline for Song Play Analysis | Bosch AI Talent Accelerator Scholarship Program
Sparkify - Data Pipelines with Airflow - Udacity Data Engineering Expert Track.
Airflow-orchestrated ETL (running in Docker containers) that pulls batch data from an API to a local Postgres database, loads it to AWS S3/Redshift provisioned by Terraform, and visualizes it in QuickSight.
☁️ Creating a Redshift Cluster using the AWS Python SDK
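A minimal sketch of provisioning a cluster with the AWS Python SDK (boto3); the identifiers, credentials, and IAM role ARN are placeholders:

```python
import boto3

# Placeholder region, identifiers, and credentials -- never hard-code real secrets.
redshift = boto3.client("redshift", region_name="us-west-2")

response = redshift.create_cluster(
    ClusterIdentifier="sparkify-cluster",
    ClusterType="multi-node",
    NodeType="dc2.large",
    NumberOfNodes=4,
    DBName="sparkify",
    MasterUsername="awsuser",
    MasterUserPassword="ReplaceMe123",
    IamRoles=["arn:aws:iam::123456789012:role/myRedshiftRole"],
)
print(response["Cluster"]["ClusterStatus"])  # e.g. "creating"
```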
In this project, we use Spark and data lakes to build an ETL pipeline for a data lake hosted on S3. The pipeline loads data from S3, processes it into analytics tables using Spark, and writes them back to S3, with the Spark job deployed on a cluster in AWS.
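A hedged PySpark sketch of that extract-transform-load flow; the bucket paths and column names are illustrative, not taken from the project:

```python
from pyspark.sql import SparkSession

# Bucket paths and column names are assumptions for illustration only.
spark = (
    SparkSession.builder
    .appName("sparkify-data-lake-etl")
    .getOrCreate()
)

# Extract: read raw JSON song data from S3.
songs = spark.read.json("s3a://my-input-bucket/song_data/*/*/*/*.json")

# Transform: build a deduplicated songs dimension table.
songs_table = (
    songs.select("song_id", "title", "artist_id", "year", "duration")
         .dropDuplicates(["song_id"])
)

# Load: write the table back to S3 as partitioned Parquet.
songs_table.write.mode("overwrite") \
    .partitionBy("year", "artist_id") \
    .parquet("s3a://my-output-bucket/songs/")
```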