In this project, we will load the Divvy trips dataset stored in S3 into AWS Redshift with the help of Airflow. Divvy is Chicago's bike-share system, which provides a fast and convenient way to get around without owning your own bike.
First, we need to complete the tasks below:
- Build the Airflow and Postgres images on Docker for Windows using docker-compose.
- Create a Redshift cluster using the AWS console (a boto3 sketch of the equivalent call follows this list).
- Upload the Divvy trips dataset files (link to files; pick any year) to an S3 bucket. My bucket location is s3://udacity-dend/data-pipelines (an upload sketch follows this list).
- Add the AWS and Redshift credentials as connections in Airflow. In aws_credentials, the login is your IAM user's access key ID and the password is the secret access key (a connection sketch follows this list).
- Switch on the DAGs in Airflow and verify whether the tables have been created in AWS Redshift.
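
This project creates the cluster through the AWS console, but a minimal boto3 sketch of the equivalent call looks roughly like this. The region, cluster identifier, node type, database name, and credentials below are all placeholder assumptions, not values from this project:

```python
import boto3

# Sketch: create a single-node Redshift cluster programmatically.
# Every identifier and credential here is a placeholder; substitute your own.
redshift = boto3.client("redshift", region_name="us-west-2")

response = redshift.create_cluster(
    ClusterIdentifier="divvy-cluster",   # hypothetical cluster name
    ClusterType="single-node",
    NodeType="dc2.large",
    DBName="divvy",                      # hypothetical database name
    MasterUsername="awsuser",            # placeholder
    MasterUserPassword="ReplaceMe1",     # placeholder; must meet Redshift password rules
    PubliclyAccessible=True,
)
print(response["Cluster"]["ClusterStatus"])
```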
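The trip files can be uploaded through the console or scripted; here is a boto3 sketch, assuming the CSVs sit in a local folder. The folder name is an illustrative assumption, while the bucket and key prefix match the location mentioned above:

```python
import os
import boto3

# Sketch: upload each local Divvy trip file to the project's S3 bucket.
s3 = boto3.client("s3")

local_dir = "divvy_trips_2018"  # hypothetical local folder of downloaded CSVs
for filename in os.listdir(local_dir):
    s3.upload_file(
        Filename=os.path.join(local_dir, filename),
        Bucket="udacity-dend",
        Key=f"data-pipelines/{filename}",
    )
```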
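The two connections can be added through the Airflow UI (Admin → Connections) or programmatically. A sketch of the programmatic route is below; the connection IDs aws_credentials and redshift are the names the DAGs expect, and every other value is a placeholder to replace with your own:

```python
from airflow import settings
from airflow.models import Connection

# Sketch: register both connections in Airflow's metadata database.
aws_conn = Connection(
    conn_id="aws_credentials",
    conn_type="aws",
    login="YOUR_ACCESS_KEY_ID",         # IAM user access key ID
    password="YOUR_SECRET_ACCESS_KEY",  # IAM user secret access key
)
redshift_conn = Connection(
    conn_id="redshift",
    conn_type="postgres",
    host="your-cluster.abc123.us-west-2.redshift.amazonaws.com",  # placeholder endpoint
    schema="divvy",                     # placeholder database name
    login="awsuser",                    # placeholder master username
    password="ReplaceMe1",              # placeholder master password
    port=5439,
)

session = settings.Session()
session.add(aws_conn)
session.add(redshift_conn)
session.commit()
```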
As you can see, our Divvy trips tables are created in Redshift according to the sql_statements file, which lives in the /dags/modules folder.
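
Beyond eyeballing the Redshift console, a quick check is to run a count query through Airflow's PostgresHook against the redshift connection. A sketch follows; the table name trips is an assumption about what sql_statements creates, so adjust it to your actual tables:

```python
from airflow.hooks.postgres_hook import PostgresHook

# Sketch: confirm a table exists and was loaded, e.g. from an ad-hoc script
# inside the Airflow container or a PythonOperator task.
redshift_hook = PostgresHook("redshift")
records = redshift_hook.get_records("SELECT COUNT(*) FROM trips")
print(f"trips table row count: {records[0][0]}")
```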