lakeFS - Data version control for your data lake | Git for data
Apache Kyuubi is a distributed and multi-tenant gateway to provide serve...
data load tool (dlt) is an open-source Python library that makes data lo...
BitSail is a distributed high-performance data integration engine which ...
A few projects related to Data Engineering, including Data Modeling, Infras...
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Wareh...
Kylo is a data lake management software platform and framework for enabl...
Personal Data Engineering Projects
Generic Data Ingestion & Dispersal Library for Hadoop
Real Time Big Data / IoT Machine Learning (Model Training and Inference)...
U-SQL Examples and Issue Tracking
Samples and Docs for Azure Data Lake Store and Analytics
Apache Spark 3 - Structured Streaming Course Material
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Apache Spark Course Material