PySpark-Tutorial provides basic algorithms using PySpark
PySpark is the Python API for Spark.
The purpose of this tutorial is to present basic distributed algorithms using PySpark.
PySpark programs can be run in two modes:
PySpark Interactive Mode: an interactive shell ($SPARK_HOME/bin/pyspark)
for basic testing and debugging; it is not intended
for production environments.
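As a sketch, an interactive session might look like this (assuming SPARK_HOME points at a local Spark installation; the sample expression is only illustrative):

```shell
# Start the interactive PySpark shell (assumes SPARK_HOME is set)
$SPARK_HOME/bin/pyspark

# Inside the shell, a SparkContext is available as `sc`, e.g.:
#   >>> sc.parallelize([1, 2, 3, 4]).map(lambda x: x * 2).collect()
#   [2, 4, 6, 8]
```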
PySpark Batch Mode: use the $SPARK_HOME/bin/spark-submit
command to run PySpark programs (suitable for both
testing and production environments).
Thank you!
Best regards,
Mahmoud Parsian