Groda Big Data

Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.


big_data

Big Data for beginners

Explore tutorials and interactive demonstrations on Big Data technologies such as Hadoop and Spark, most of them presented as Jupyter notebooks.

Setting Up Hadoop: Single-Node Configuration

Running Apache Spark in Standalone Mode

MapReduce Tutorials

PySpark Tutorials

Miscellaneous Tutorials

Virtualization and Cloud Automation

Big Data Learning Pathways

About this repository

Notebooks Testing and CI

Most of the executable Jupyter notebooks are tested on an Ubuntu virtual machine by an automated GitHub workflow. Successful executions are recorded in a log file named action_log.txt.

Current status: Run Notebooks on Ubuntu

The GitHub workflow is a starting point for what is known as Continuous Integration (CI) in DevOps and Platform Engineering circles.
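
To make the idea of notebook testing concrete, here is a minimal sketch of what such a job could do: execute each notebook with the `jupyter nbconvert` command-line tool and append the names of the notebooks that ran successfully to action_log.txt. This is not the repository's actual workflow, which is not shown here. The function names `build_command` and `run_notebooks`, the `runner` parameter, and the one-name-per-line log format are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def build_command(notebook: Path) -> List[str]:
    """Return a `jupyter nbconvert` command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        str(notebook),
    ]


def _default_runner(cmd: List[str]) -> int:
    """Run the command and return its exit code (0 means success)."""
    return subprocess.run(cmd).returncode


def run_notebooks(
    notebooks: Iterable[Path],
    log_path: Path = Path("action_log.txt"),
    runner: Callable[[List[str]], int] = _default_runner,
) -> List[Path]:
    """Execute each notebook and log the successful ones.

    Assumed log format: one notebook path per line, appended to `log_path`.
    Returns the notebooks whose execution exited with a non-zero code.
    """
    failed: List[Path] = []
    with log_path.open("a", encoding="utf-8") as log:
        for notebook in notebooks:
            if runner(build_command(notebook)) == 0:
                log.write(f"{notebook}\n")
            else:
                failed.append(notebook)
    return failed
```

A CI step could call `run_notebooks(sorted(Path(".").rglob("*.ipynb")))` and fail the build when the returned list is not empty. The `runner` argument exists only so the logic can be exercised without Jupyter installed, for example by passing a function that returns a fixed exit code.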

Open Source Agenda is not affiliated with "Groda Big Data" Project. README Source: groda/big_data
