A Very Long, Never-Ending Learning around Data Engineering & Machine Learning

Interesting Reads

Weekly Digest

The Data Engineering

Level 0

Level 1

Level 1.1

Gyaan

Infrastructure

Machine Learning

MLOPS

Project

Insightful

Paper

Distributed System

Crazy

The Snowflake Paper - Core idea is to build an enterprise-ready #datawarehouse solution for the #cloud 🎉📰📕
Most important points around Distributed #dataengineering Platform
Fundamentals of #distributedsystems Scaling - Avoiding Coordination 🎊♨️🔆
Technical Debt in #dataengineering #softwareengineering 🔕💡🔕
Paper on Wander Join: Online Aggregation via Random Walks (the join problem) 📃💭📑
The Delta Lake Paper - High-Performance ACID Table Storage 📋💡📋
Dynamo - Amazon's Highly Available Key-value Store #distributedsystem 💬💡🎉
An Efficient and Syntactically Idiomatic Approach to Management of Streams and Tables - A Single SQL for All 💡📩📩
Secure & Robust Machine Learning in #healthcare 💊🧪🥳
Progress in Medical Science using #deeplearning 💊💡💉
The Amazon Redshift Paper - A fast, fully managed, petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze large volumes of data using existing #businessintelligence tools 📂📰💭
Advancing #drugdiscovery via Artificial Intelligence 💊🏥🏥
Apache Calcite is a dynamic data management framework 🎉📚🎉
Lakehouse - A Paper on a New Generation of #datawarehouse Technology 💡🔎💡
Calvin: Fast Distributed Transactions for Partitioned Database Systems 📝📝
Presto or Trino - #SQL on Everything (The Design, Motivation & Performance) #presto 💭🎊💡
Design - Exactly Once Delivery & Transactional Messaging in Apache Kafka
Apache Kafka Paper: A Distributed Messaging System for Log Processing
Paper: Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size
Paper: Ground is an open-source data context service, a system to manage all the information that informs the use of data
Azure Data Lake Store (ADLS) is a fully managed, elastic, scalable, and secure file system that supports #hadoop Distributed File System (HDFS) and Cosmos semantics
An LFU (Least Frequently Used) cache eviction algorithm with O(1) runtime complexity
Above the Clouds: A Berkeley View of Cloud Computing - Paper
The Google File System - The Paper 🎉
Paper: Report on Distributed Deep Learning on Data Systems 📂
Crystal: A Unified Cache Storage System for Analytical Databases

Set 2

NA

Cloud

GeoCode-Craft/around-dataengineering
