The Archives Unleashed Toolkit is an open-source toolkit for analyzing w...
Code repository for the "PySpark in Action" book
Detect common phrases in large amounts of text using a data-driven appro...
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated ...
A Python framework for data processing on GCP.
Apache Spark 3 - Structured Streaming Course Material
PySpark Code for Hands-on Learners
Relation Extraction using Deep Learning (CNN)
An Azure Databricks workshop leveraging the New York Taxi and Limousine ...
Big Data for Data Engineers Coursera Specialization from Yandex
Jupyter notebooks for PySpark tutorials given at University
A library for Spark DataFrame using MinIO Select API
Spark 2.0 Python Machine Learning examples
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis an...
JupyterLab extension that enables monitoring launched Apache Spark jobs ...