Data-analysis-using-AWS-services-Athena-Glue-S3-IAM-Quicksight

This is a simple end-to-end data analytics solution built on AWS services, from uploading a CSV file to an S3 bucket to visualizing the results in QuickSight. The dataset used in this project is the data science job salaries dataset from Kaggle: kaggle.com/datasets/ruchi798/data-science-job-salaries.

Objective

The main objective of this project is to identify the top 5 data science salaries in the US by job title, experience level, and employment type, along with the remote work ratio by job title.

Dataset

The dataset contains the following variables: work_year, experience_level, employment_type, job_title, salary, salary_currency, salary_in_usd, employee_residence, remote_ratio, company_location, and company_size.

The data analysis process is as follows:

Step 1- First, we create an IAM user to grant access permission to S3. In the search bar we type IAM, then go to Users > Add users.

1a- Set the user details and access type, then proceed to permissions.

1b- We choose "Attach existing policies directly", since we already have a policy set up.

1c- Review all the details and create the user.

1d- The user is successfully created.
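The README does not show the policy that was attached in step 1b. As a rough sketch only, a policy scoped to the two buckets from Step 2 might look like the document built below. The chosen actions (list, read, write) are an assumption, not the policy actually used in this project.

```python
import json

# Bucket names are taken from Step 2 of this README.
RAW_BUCKET = "data-science-salaries-bucket"
RESULT_BUCKET = "data-science-salaries-bucket-result"


def build_s3_policy(buckets):
    """Return an IAM policy document granting basic access to the given buckets."""
    bucket_arns = [f"arn:aws:s3:::{name}" for name in buckets]
    object_arns = [f"{arn}/*" for arn in bucket_arns]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arns,
            },
            {
                # Object-level actions apply to the keys inside each bucket.
                # Assumption: read and write access is enough for this workflow.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": object_arns,
            },
        ],
    }


policy = build_s3_policy([RAW_BUCKET, RESULT_BUCKET])
print(json.dumps(policy, indent=2))
```

The printed JSON could be pasted into the IAM policy editor. Splitting bucket-level and object-level actions into two statements keeps each action paired with the resource type it applies to.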

Step 2- Two S3 buckets are created. The data-science-salaries-bucket holds the raw file, while the data-science-salaries-bucket-result holds the query results from Athena.

2a- The CSV file is uploaded to the data-science-salaries-bucket.
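The upload in this project was done through the console. For readers who prefer to script it, the same step could be sketched with the boto3 S3 client as below. The object key ds_salaries.csv and the local path are hypothetical, and the client is passed in so the helper can be tried without AWS credentials.

```python
def upload_raw_csv(s3_client, local_path,
                   bucket="data-science-salaries-bucket",
                   key="ds_salaries.csv"):
    """Upload a local CSV to the raw bucket and return its S3 URI.

    s3_client is expected to behave like boto3.client("s3").
    The default key is a hypothetical file name, not taken from the project.
    """
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client upload method.
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


# Example call (requires boto3 and configured AWS credentials):
#   import boto3
#   print(upload_raw_csv(boto3.client("s3"), "ds_salaries.csv"))
```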

Step 3- Moving on to Athena. Before we can create our table, we need to choose the bucket where the query output will be sent. In the Athena query editor we select Settings > Manage > Browse S3 to choose the appropriate bucket.

3a- In the Athena data catalog, we select Create table > AWS Glue crawler > Add crawler to retrieve the data schema automatically.

3b- The crawler is successfully created.
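The crawler in step 3a was set up in the console. A scripted equivalent might look roughly like the sketch below, which builds the arguments for the boto3 Glue create_crawler call and then starts one run. The crawler name, role ARN, and database name are hypothetical placeholders. Only the bucket name comes from this README.

```python
def crawler_request(name, role_arn, database, bucket):
    """Build keyword arguments for the Glue create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,          # IAM role Glue assumes to read the bucket
        "DatabaseName": database,  # Data Catalog database that receives the table
        "Targets": {"S3Targets": [{"Path": f"s3://{bucket}/"}]},
    }


def create_and_start_crawler(glue_client, request):
    """Create the crawler, then start a single run of it.

    glue_client is expected to behave like boto3.client("glue").
    """
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])


request = crawler_request(
    name="data-science-salaries-crawler",                       # hypothetical
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",  # hypothetical
    database="data_science_salaries_db",                        # hypothetical
    bucket="data-science-salaries-bucket",                      # from Step 2
)
print(request["Targets"])
```

Once a run finishes, the crawler writes the inferred column names and types to the Data Catalog, which is what lets Athena query the CSV as a table.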

Step 4- The data is queried in Athena, and the results are written to the data-science-salaries-bucket-result.
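The queries themselves are not included in this README. As an illustration only, one possible reading of "top 5 by job title" is the five job titles with the highest average salary_in_usd among US companies. The sketch below builds such a query together with the arguments for the boto3 Athena start_query_execution call. The table and database names are hypothetical, and filtering on company_location rather than employee_residence is an assumption.

```python
# Assumption: "top 5" means highest average salary_in_usd per job_title,
# restricted to rows where company_location is 'US'.
TOP_TITLES_SQL = """
SELECT job_title,
       ROUND(AVG(salary_in_usd)) AS avg_salary_usd
FROM {table}
WHERE company_location = 'US'
GROUP BY job_title
ORDER BY avg_salary_usd DESC
LIMIT 5
"""


def athena_query_request(table, database,
                         result_bucket="data-science-salaries-bucket-result"):
    """Build keyword arguments for the Athena start_query_execution call."""
    return {
        "QueryString": TOP_TITLES_SQL.format(table=table).strip(),
        "QueryExecutionContext": {"Database": database},
        # Athena writes the result CSV under this S3 location.
        "ResultConfiguration": {"OutputLocation": f"s3://{result_bucket}/"},
    }


query_request = athena_query_request(
    table="data_science_salaries_bucket",  # hypothetical table name
    database="data_science_salaries_db",   # hypothetical database name
)
print(query_request["QueryString"])
```

With credentials in place, the request could be sent with boto3.client("athena").start_query_execution(**query_request). Swapping job_title for experience_level or employment_type in the SELECT and GROUP BY clauses would give the other breakdowns named in the objective.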

Step 5- Now QuickSight needs to access S3 to build the report. Before QuickSight can read the S3 bucket, we have to make sure it has permission to do so. We go to the account section by clicking the top-right menu > Manage QuickSight > Security & permissions > Manage, then select the S3 bucket.

5a- Next we set up a new data source to access S3 from QuickSight: New analysis > New dataset > S3 > upload the JSON manifest file > Import to SPICE.

5b- After the data is imported to SPICE, we create a report in QuickSight. Our interest was to identify the top 5 data science salaries in the US by job title, experience level, and employment type, along with the remote work ratio by job title.
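The JSON manifest uploaded in step 5a is not shown in this README. A minimal sketch of how such a file could be generated is given below, following the QuickSight S3 manifest layout (fileLocations plus globalUploadSettings). The object URI is a hypothetical placeholder, and the CSV settings assume a comma-delimited file with a header row.

```python
import json


def build_manifest(uris, contains_header=True):
    """Return a QuickSight S3 manifest for one or more CSV objects."""
    return {
        "fileLocations": [{"URIs": list(uris)}],
        "globalUploadSettings": {
            "format": "CSV",
            "delimiter": ",",
            "textqualifier": "\"",
            # The manifest stores this flag as a string, not a boolean.
            "containsHeader": "true" if contains_header else "false",
        },
    }


manifest = build_manifest(
    # Hypothetical object key; replace with the actual CSV in the bucket.
    ["s3://data-science-salaries-bucket-result/query-results.csv"]
)
print(json.dumps(manifest, indent=2))
```

Listing explicit URIs rather than a bucket-wide prefix keeps QuickSight from trying to load other objects stored in the same location.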

Conclusion

AWS provides a suite of powerful tools to analyze data effectively. By using Athena, Glue, IAM, and QuickSight, businesses can gain valuable insights into their data, make informed decisions, and optimize processes.
