Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Haskell on Apache Spark.
The Internals of Spark SQL
The Internals of Spark Structured Streaming
Includes notes on using Apache Spark in general, notes on using Spark fo...
A boilerplate for writing PySpark Jobs
Infrastructures™ for Machine Learning Training/Inference in Production.
🏐 Apache Parquet for modern .NET
PySpark Cheat Sheet - example code to help you learn PySpark and develop...
Train and run PyTorch models on Apache Spark.
[PROJECT IS NO LONGER MAINTAINED] Wirbelsturm is a Vagrant and Puppet ba...
Morpheus brings the leading graph query language, Cypher, onto the leadi...
Serverless proxy for a Spark cluster
Spark Gotchas. A subjective compilation of Apache Spark tips and tricks
Fundamentals of Spark with Python (using PySpark), code examples