# aws-glue-crawler

Here are 34 public repositories matching this topic...

Akanksha-tetwar / YouTube-Trending-video-analysis-ETL-using-AWS-Services

This project analyzes the Trending YouTube Video Statistics dataset from Kaggle and prepares it for use.

python aws aws-s3 aws-athena awslambda quicksight aws-glue-crawler awsglue

Updated Nov 7, 2022

KRISHNASAIRAJ / AWS-Driven-Sales-Performance-Outlook

The project aims to establish a robust data pipeline for tracking and analyzing sales performance using various AWS services. The process involves creating a DynamoDB database, implementing Change Data Capture (CDC), using Kinesis streams, and finally storing the data and querying it with Amazon Athena.

python aws-lambda dynamodb s3-bucket kinesis kinesis-firehose aws-athena glue-catalog aws-glue-crawler eventbridge-pipes

Updated Feb 11, 2024
Python

Shilpaar90 / AWS-Capturing-Schema-Changes-In-S3

A pipeline within AWS to capture schema changes in S3 files and to update them in a DB.

aws crawler aws-lambda dynamodb s3 aws-dynamodb aws-cloudwatch-logs aws-lambda-python aws-glue aws-eventbridge glue-catalog aws-glue-crawler

Updated Nov 30, 2021

subhamay-cloudworks / 0090-deutzia-cft

Creating an audit table for a DynamoDB table using CloudTrail, Kinesis Data Stream, Lambda, S3, Glue, Athena, and CloudFormation

aws-python-lambda aws-iam aws-cloudformation aws-cloudtrail aws-cloudwatch aws-athena aws-cloudwatch-logs aws-kinesis-stream aws-glue-crawler aws-iam-roles aws-iam-policies aws-s3-bucket aws-glue-data-catalog

Updated Jul 6, 2023
Python

mihirkudale / Stock-Market-Real-Time-Data-Engineering-Project

In this project, you will execute an End-To-End Data Engineering Project on Real-Time Stock Market Data using Kafka. We are going to use different technologies such as Python, Amazon Web Services (AWS), Apache Kafka, Glue, Athena, and SQL.

python aws csv kafka aws-s3 jupyter-notebook consumer amazon-ec2 aws-ec2 apache-kafka producer aws-athena stockmarket aws-glue-crawler stockmarketanalysis aws-glue-catalog

Updated May 23, 2024
Jupyter Notebook

AirtonLira / aws-bigdata-glue-athena

This project aims to collect, catalog, govern, process, and visualize data.

aws aws-cloudformation aws-athena aws-glue aws-glue-crawler

Updated Mar 28, 2023

SadafAsad / LinkedIn-Jobs-Analysis

Unveiling job market trends with Scrapy and AWS

python aws-s3 scrapy aws-ec2 aws-athena aws-quicksight aws-glue-crawler aws-glue-data-catalog

Updated Apr 5, 2024
Python

sumanthmalipeddi / spotify_trending_telugu

Collecting song, album, and artist details from Spotify at set intervals using the spotipy API, and performing ETL operations on them with Amazon cloud services

mysql aws data aws-lambda aws-s3 pandas-dataframe python3 data-engineering aws-cloudwatch aws-athena spotify-web-api etl-pipeline aws-glue-crawler

Updated Jun 10, 2024
Jupyter Notebook

aws-samples / automated-datastore-discovery-with-aws-glue

Automation framework to catalog AWS data sources using Glue

aws typescript aws-s3 dynamodb glue python3 data-catalog rds gdpr pii data-governance aws-cdk aws-glue-workflow aws-glue-crawler aws-glue-data-catalog

Updated May 24, 2024
Python

GabrielDan92 / AWS_Terraform_PySpark-ETL_Job

Terraform configuration that creates several AWS services, uploads data in S3 and starts the Glue Crawler and Glue Job.

aws terraform s3-bucket pyspark glue-job glue-catalog aws-glue-crawler

Updated Feb 10, 2022
Python

ShreyasLengade / serverless_etl_pipeline

Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.

aws-lambda aws-s3 data-engineering aws-kinesis aws-glue data-engineering-pipeline aws-glue-crawler aws-grafana aws-glue-data-catalog

Updated Jun 25, 2024
Python

Kartik-Banga / Automated-ETL-Pipeline-for-Playstore-Data

Implemented ETL pipeline on AWS for Playstore data using Lambda, Glue Crawlers, and Glue ETL Jobs. Orchestrated workflow with Step Functions and achieved seamless integration, optimal data merging, and enhanced data quality/accessibility.

python sql aws-lambda aws-s3 data-visualization pyspark data-engineering cloud-computing data-analysis powerbi data-cleaning aws-step-functions aws-glue aws-glue-crawler

Updated Jan 4, 2024

h-fuzzy-logic / data-analytics-spring

Open data and cloud computing to answer the question: Are we losing our spring days?

python jupyter aws-s3 pandas seaborn openscience aws-athena aws-glue-crawler

Updated Sep 9, 2023
Jupyter Notebook

sarah-zhan / data_pipeline_amazon_products

An end-to-end data pipeline built with AWS S3, Glue, Crawler, Athena, and Tableau visualization

aws s3-bucket tableau aws-athena aws-glue-crawler

Updated Mar 27, 2024
Jupyter Notebook

subhamay-cloudworks / 0053-bluebonnets-cft

Working with the Glue Data Catalog, running the crawler using S3 Event Notification, and creating the entire stack using AWS CloudFormation

aws-s3 aws-cloudformation aws-glue aws-glue-crawler

Updated May 8, 2023

Saurabhkhandebharad / BigData-SK

Analyzed a multicategory e-commerce store using big data techniques on a Kaggle dataset with the help of AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda and Power BI!

aws big-data aws-lambda power-bi pyspark aws-ec2 aws-cloudformation aws-athena kaggle-dataset aws-services end-to-end-pipeline end-to-end-project aws-glue-crawler aws-s3-bucket

Updated Sep 7, 2024
Python

masood2iq / Serverless-Framework-Athena-Glue-Deployment-on-Existing-S3-Bucket

AWS Athena, Glue Database, Glue Crawler deployment on existing S3 bucket through Serverless (sls) Framework.

serverless-framework aws-athena aws-glue simple-query aws-glue-crawler aws-s3-bucket

Updated Dec 13, 2022
JavaScript

dhvani-k / YouTrend_Insights_Analyzing_YouTube_Video_Landscape

An end-to-end solution for managing and analyzing YouTube video data from Kaggle, leveraging AWS services and visualized through Quicksight and Tableau

aws marketing youtube aws-lambda aws-s3 youtube-api aws-iam tableau content-strategy aws-athena aws-lambda-python aws-glue quicksight aws-glue-crawler user-insights

Updated Sep 23, 2023
Python

masood2iq / AWS-Athena-Glue-S3-CloudFormation-Deployment-AWSConsole

AWS Athena, Glue Database, Glue Crawler and S3 buckets deployment through CloudFormation stack on AWS console.

aws-cloudformation aws-athena aws-glue simple-query aws-glue-crawler aws-s3-bucket

Updated Dec 14, 2022

productiveAnalytics / aws-cdk-constructs-sandbox

Cloud Development Kit (AWS CDK) using TypeScript, Python and Java

aws-lambda aws-s3 python3 java8 aws-kinesis parquet firehose aws-athena aws-kinesis-firehose aws-glue aws-cdk cdk-constructs aws-glue-crawler

Updated Feb 19, 2022
Java
