aws-glue-crawler

Here are 34 public repositories matching this topic...

aws-samples / aws-glue-crawler-utilities

This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS CDK applications.

aws-glue aws-glue-crawler

Updated Dec 21, 2021
Python

aws-samples / amazon-rds-export-to-s3-automation

This repository contains source code for the AWS Database Blog post "Reduce data archiving costs for compliance by automating RDS snapshot exports to Amazon S3".

aws-lambda aws-kms aws-cloudformation amazon-rds amazon-sns amazon-s3 amazon-athena aws-backup aws-glue amazon-eventbridge aws-glue-crawler

Updated Apr 26, 2023

fermat01 / ETL-Data-Pipeline-using-AWS-EMR-Spark-Glue-Athena

ETL data pipeline using AWS services

aws aws-s3 aws-athena aws-emr-clusters aws-glue-crawler

Updated Aug 23, 2024
Python

Akanksha-tetwar / YouTube-Trending-video-analysis-ETL-using-AWS-Services

In this project I have used the Trending YouTube Video Statistics data from Kaggle to analyze and prepare it for usage.

python aws aws-s3 aws-athena awslambda quicksight aws-glue-crawler awsglue

Updated Nov 7, 2022

KRISHNASAIRAJ / AWS-Driven-Sales-Performance-Outlook

The project aims to establish a robust data pipeline for tracking and analyzing sales performance using various AWS services. The process involves creating a DynamoDB database, implementing Change Data Capture (CDC), utilizing Kinesis streams, and finally storing the data and querying it with Amazon Athena.

python aws-lambda dynamodb s3-bucket kinesis kinesis-firehose aws-athena glue-catalog aws-glue-crawler eventbridge-pipes

Updated Feb 11, 2024
Python

Shilpaar90 / AWS-Capturing-Schema-Changes-In-S3

A pipeline within AWS to capture schema changes in S3 files and to update them in a DB.

aws crawler aws-lambda dynamodb s3 aws-dynamodb aws-cloudwatch-logs aws-lambda-python aws-glue aws-eventbridge glue-catalog aws-glue-crawler

Updated Nov 30, 2021

subhamay-cloudworks / 0090-deutzia-cft

Creating an audit table for a DynamoDB table using CloudTrail, Kinesis Data Streams, Lambda, S3, Glue, Athena, and CloudFormation

aws-python-lambda aws-iam aws-cloudformation aws-cloudtrail aws-cloudwatch aws-athena aws-cloudwatch-logs aws-kinesis-stream aws-glue-crawler aws-iam-roles aws-iam-policies aws-s3-bucket aws-glue-data-catalog

Updated Jul 6, 2023
Python

mihirkudale / Stock-Market-Real-Time-Data-Engineering-Project

In this project, you will build an end-to-end data engineering project on real-time stock market data using Kafka. It uses technologies such as Python, Amazon Web Services (AWS), Apache Kafka, Glue, Athena, and SQL.

python aws csv kafka aws-s3 jupyter-notebook consumer amazon-ec2 aws-ec2 apache-kafka producer aws-athena stockmarket aws-glue-crawler stockmarketanalysis aws-glue-catalog

Updated May 23, 2024
Jupyter Notebook

AirtonLira / aws-bigdata-glue-athena

This project aims to collect, catalog, govern, process, and visualize data.

aws aws-cloudformation aws-athena aws-glue aws-glue-crawler

Updated Mar 28, 2023

SadafAsad / LinkedIn-Jobs-Analysis

Unveiling job market trends with Scrapy and AWS

python aws-s3 scrapy aws-ec2 aws-athena aws-quicksight aws-glue-crawler aws-glue-data-catalog

Updated Apr 5, 2024
Python

sumanthmalipeddi / spotify_trending_telugu

Collecting song, album, and artist details from the Spotify Music Application at specific intervals using the spotipy API and performing ETL operations using Amazon cloud services

mysql aws data aws-lambda aws-s3 pandas-dataframe python3 data-engineering aws-cloudwatch aws-athena spotify-web-api etl-pipeline aws-glue-crawler

Updated Jun 10, 2024
Jupyter Notebook

aws-samples / automated-datastore-discovery-with-aws-glue

Automation framework to catalog AWS data sources using Glue

aws typescript aws-s3 dynamodb glue python3 data-catalog rds gdpr pii data-governance aws-cdk aws-glue-workflow aws-glue-crawler aws-glue-data-catalog

Updated May 24, 2024
Python

GabrielDan92 / AWS_Terraform_PySpark-ETL_Job

Terraform configuration that creates several AWS services, uploads data to S3, and starts the Glue Crawler and Glue Job.

aws terraform s3-bucket pyspark glue-job glue-catalog aws-glue-crawler

Updated Feb 10, 2022
Python

ShreyasLengade / serverless_etl_pipeline

Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.

aws-lambda aws-s3 data-engineering aws-kinesis aws-glue data-engineering-pipeline aws-glue-crawler aws-grafana aws-glue-data-catalog

Updated Jun 25, 2024
Python

Kartik-Banga / Automated-ETL-Pipeline-for-Playstore-Data

Implemented ETL pipeline on AWS for Playstore data using Lambda, Glue Crawlers, and Glue ETL Jobs. Orchestrated workflow with Step Functions and achieved seamless integration, optimal data merging, and enhanced data quality/accessibility.

python sql aws-lambda aws-s3 data-visualization pyspark data-engineering cloud-computing data-analysis powerbi data-cleaning aws-step-functions aws-glue aws-glue-crawler

Updated Jan 4, 2024

h-fuzzy-logic / data-analytics-spring

Open data and cloud computing to answer the question: Are we losing our spring days?

python jupyter aws-s3 pandas seaborn openscience aws-athena aws-glue-crawler

Updated Sep 9, 2023
Jupyter Notebook

sarah-zhan / data_pipeline_amazon_products

An end-to-end data pipeline built with AWS S3, Glue, Crawler, Athena, and Tableau visualization

aws s3-bucket tableau aws-athena aws-glue-crawler

Updated Mar 27, 2024
Jupyter Notebook

subhamay-cloudworks / 0053-bluebonnets-cft

Working with the Glue Data Catalog and running the crawler using S3 Event Notification, and creating the entire stack using AWS CloudFormation

aws-s3 aws-cloudformation aws-glue aws-glue-crawler

Updated May 8, 2023

Saurabhkhandebharad / BigData-SK

Analyzed a multicategory e-commerce store using big data techniques on a Kaggle dataset with the help of AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda and Power BI!

aws big-data aws-lambda power-bi pyspark aws-ec2 aws-cloudformation aws-athena kaggle-dataset aws-services end-to-end-pipeline end-to-end-project aws-glue-crawler aws-s3-bucket

Updated Sep 7, 2024
Python

masood2iq / Serverless-Framework-Athena-Glue-Deployment-on-Existing-S3-Bucket

AWS Athena, Glue Database, Glue Crawler deployment on existing S3 bucket through Serverless (sls) Framework.

serverless-framework aws-athena aws-glue simple-query aws-glue-crawler aws-s3-bucket

Updated Dec 13, 2022
JavaScript
