Gather Deployment

Gathers Python deployment, infrastructure and practices.

Requirements

  1. Docker
  2. Docker Compose

TensorFlow deployment

  1. Object Detection. Flask SocketIO + WebRTC

Stream from webcam using WebRTC -> Flask SocketIO to detect objects -> WebRTC -> Website.

  1. Object Detection. Flask SocketIO + OpenCV

Stream from OpenCV -> Flask SocketIO to detect objects -> OpenCV.

  1. Speech streaming. Flask SocketIO

Stream speech from microphone -> Flask SocketIO to do realtime speech recognition.

  1. Text classification. Flask + Gunicorn

Serve a TensorFlow text model using multi-worker Flask + Gunicorn.

  1. Image classification. TF Serving

Serve image classification model using TF Serving.

  1. Image Classification using Inception. Flask SocketIO

Stream image using SocketIO -> Flask SocketIO to classify.

  1. Object Detection. Flask + OpenCV

Webcam -> OpenCV -> Flask -> web dashboard.

  1. Face detection using MTCNN. Flask SocketIO + OpenCV

Stream from OpenCV -> Flask SocketIO to detect faces -> OpenCV.

  1. Face detection using MTCNN. OpenCV

Webcam -> OpenCV.

  1. Image classification using Inception. Flask + Docker

Serve a TensorFlow image model using multi-worker Flask + Gunicorn in a Docker container.

  1. Image classification using Inception. Flask + EC2 Docker Swarm + Nginx load balancer

Serve Inception on multiple AWS EC2 instances, scale using Docker Swarm, and load-balance using Nginx.

  1. Text classification. Hadoop streaming MapReduce

Batch processing to classify texts using a TensorFlow text model on Hadoop MapReduce.
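
The map and reduce steps of such a streaming job can be sketched roughly as below. This is a minimal illustration, not the project's code: `classify` is a hypothetical keyword rule standing in for the TensorFlow text model, and the `label<TAB>count` record format is an assumption. The sort between the two stages imitates the shuffle that Hadoop performs on mapper output.

```python
from itertools import groupby


def classify(text):
    """Hypothetical stand-in for the TensorFlow text model (keyword rule only)."""
    return "positive" if "good" in text.lower() else "negative"


def mapper(lines):
    """Emit one tab-separated `label<TAB>1` record per non-empty input line."""
    for line in lines:
        text = line.strip()
        if text:
            yield f"{classify(text)}\t1"


def reducer(lines):
    """Sum the counts per label; expects records already sorted by label."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for label, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{label}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__":
    sample = ["good movie", "bad plot", "really good"]
    shuffled = sorted(mapper(sample))  # imitates Hadoop's sort between stages
    for record in reducer(shuffled):
        print(record)
```

In an actual Hadoop Streaming job the mapper and reducer would usually be separate scripts that read lines from standard input and write records to standard output; here they are plain generators so the flow can be run locally.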

  1. Text classification. Kafka

Stream text to Kafka producer and classify using Kafka consumer.

  1. Text classification. Distributed TF using Flask + Gunicorn + Eventlet

Serve a text model on multiple machines using Distributed TF + Flask + Gunicorn + Eventlet. That is, Distributed TF splits a single neural network model across multiple machines to run the feed-forward pass.

  1. Text classification. Tornado + Gunicorn

Serve a TensorFlow text model using Tornado + Gunicorn.

  1. Text classification. Flask + Celery + Hadoop

Submit large texts using Flask, queue a Celery job to process them using Hadoop, and run the Hadoop MapReduce job as a delayed task.

  1. Text classification. Luigi scheduler + Hadoop

Submit large texts to the Luigi scheduler and run Hadoop inside Luigi, giving event-based Hadoop MapReduce.

  1. Text classification. Luigi scheduler + Distributed Celery

Submit large texts to the Luigi scheduler, run Hadoop inside Luigi, and delay processing.

  1. Text classification. Airflow scheduler + Elasticsearch + Flask

Schedule-based processing using Airflow; store the results in Elasticsearch and serve them using Flask.

  1. Text classification. Apache Kafka + Apache Storm

Stream from Twitter -> Kafka Producer -> Apache Storm, to do distributed minibatch realtime processing.

  1. Text classification. Dask

Batch processing to classify texts using a TensorFlow text model on Dask.

  1. Text classification. Pyspark

Batch processing to classify texts using a TensorFlow text model on PySpark.

  1. Text classification. Pyspark streaming + Kafka

Stream texts to Kafka Producer -> Pyspark Streaming, to do minibatch realtime processing.

  1. Text classification. Streamz + Dask + Kafka

Stream texts to Kafka Producer -> Streamz -> Dask, to do minibatch realtime processing.

  1. Text classification. FastAPI + Streamz + Water Healer

Group concurrent requests into realtime mini-batches to speed up text classification.
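
The idea of turning concurrent requests into mini-batches can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the Streamz or Water Healer API: `MicroBatcher`, `classify_batch`, `max_size`, and `max_wait` are hypothetical names, and `classify_batch` is a keyword rule standing in for one batched model call.

```python
import asyncio


def classify_batch(texts):
    """Hypothetical stand-in for a single batched model call."""
    return ["positive" if "good" in t.lower() else "negative" for t in texts]


class MicroBatcher:
    """Collect concurrent requests and answer them with one batched call."""

    def __init__(self, max_size=8, max_wait=0.01):
        self.max_size = max_size      # largest batch sent to the model
        self.max_wait = max_wait      # seconds to wait for more requests
        self.queue = asyncio.Queue()
        self.batch_sizes = []         # recorded only for inspection

    async def submit(self, text):
        """Called once per request; resolves when its batch has been processed."""
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((text, future))
        return await future

    async def run(self):
        """Background loop: wait for one request, then top up the batch."""
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self.queue.get()]
            deadline = loop.time() + self.max_wait
            while len(batch) < self.max_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            labels = classify_batch([text for text, _ in batch])
            self.batch_sizes.append(len(batch))
            for (_, future), label in zip(batch, labels):
                future.set_result(label)


async def demo():
    batcher = MicroBatcher()
    worker = asyncio.create_task(batcher.run())
    texts = ["good service", "slow delivery", "good value"]
    results = await asyncio.gather(*(batcher.submit(t) for t in texts))
    worker.cancel()
    return results, batcher.batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a web service, each request handler would await `submit`, so the model sees a few larger calls instead of many single-item ones. The trade-off is up to `max_wait` seconds of added latency per request.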

  1. Text classification. PyFlink

Batch processing to classify texts using a TensorFlow text model on Flink batch processing.

  1. Text classification. PyFlink + Kafka

Stream texts to Kafka Producer -> PyFlink Streaming, to do minibatch realtime processing.

  1. Object Detection. ImageZMQ

Stream from N camera clients using ImageZMQ -> N ImageZMQ worker processes -> single dashboard.

Simple Backend

  1. Flask
  2. Flask with MongoDB
  3. REST API Flask
  4. Flask Redis PubSub
  5. Flask MySQL with REST API
  6. Flask Postgres with REST API
  7. Flask Elasticsearch
  8. Flask Logstash with Gunicorn
  9. Flask SocketIO with Redis
  10. Multiple Flask with Nginx load balancer
  11. Multiple Flask SocketIO with Nginx load balancer
  12. RabbitMQ and multiple Celery with Flask
  13. Flask + Gunicorn + HAproxy

Apache stack

  1. Flask with Hadoop Map Reduce
  2. Flask with Kafka
  3. Flask with Hadoop Hive
  4. PySpark with Jupyter
  5. Apache Flink with Jupyter
  6. Apache Storm with Redis
  7. Apache Flink with Zeppelin and Kafka
  8. Kafka cluster + Kafka REST
  9. Spotify Luigi + Hadoop streaming

Simple data pipeline

  1. Streaming Tweepy to Elasticsearch
  2. Scheduled crawler using Luigi Spotify to Elasticsearch
  3. Airflow to Elasticsearch

Realtime ETL

  1. MySQL -> Apache NiFi -> Apache Hive
  2. PostgreSQL CDC -> Debezium -> KsqlDB

Unit test

  1. Pytest

Stress test

  1. Locust

Monitoring

  1. PostgreSQL + Prometheus + Grafana
  2. FastAPI + Prometheus + Loki + Jaeger

Mapping

Focused on Malaysia; for other countries, change the download links.

  1. OSRM Malaysia
  2. Maptiler Malaysia
  3. OSM Style Malaysia

Miscellaneous

  1. Elasticsearch + Kibana + Cerebro
  2. Jupyter notebook
  3. Jupyterhub
  4. Jupyterhub + Github Auth
  5. AutoPEP8
  6. Graph function dependencies
  7. MLFlow

Practice PySpark

  1. Simple PySpark SQL.
  • Simple PySpark SQL.
  1. Simple download dataframe from HDFS.
  • Create PySpark DataFrame from HDFS.
  1. Simple PySpark SQL with Hive Metastore.
  • Use PySpark SQL with Hive Metastore.
  1. Simple Delta lake.
  • Simple Delta lake.
  1. Delete Update Upsert using Delta.
  • Simple Delete Update Upsert using Delta lake.
  1. Structured streaming using Delta.
  • Simple structured streaming with Upsert using Delta streaming.
  1. Kafka Structured streaming using Delta.
  • Kafka structured streaming from PostgreSQL CDC using Debezium and Upsert using Delta streaming.
  1. PySpark ML text classification.
  • Text classification using Logistic regression and multinomial in PySpark ML.
  1. PySpark ML word vector.
  • Word vector in PySpark ML.

Practice PyFlink

  1. Simple Word Count to HDFS.
  • Simple Table API to do Word Count and sink into Parquet format in HDFS.
  1. Simple Word Count to PostgreSQL.
  • Simple Table API to do Word Count and sink into PostgreSQL using JDBC.
  1. Simple Word Count to Kafka.
  • Simple Table API to do Word Count and sink into Kafka.
  1. Simple text classification to HDFS.
  • Load trained text classification model using UDF to classify sentiment and sink into Parquet format in HDFS.
  1. Simple text classification to PostgreSQL.
  • Load trained text classification model using UDF to classify sentiment and sink into PostgreSQL.
  1. Simple text classification to Kafka.
  • Load trained text classification model using UDF to classify sentiment and sink into Kafka.
  1. Simple real time text classification upsert to PostgreSQL.
  • Simple real time text classification from Debezium CDC and upsert into PostgreSQL.
  1. Simple real time text classification upsert to Kafka.
  • Simple real time text classification from Debezium CDC and upsert into Kafka Upsert.
  1. Simple Word Count to Apache Hudi.
  • Simple Table API to do Word Count and sink into Apache Hudi in HDFS.
  1. Simple text classification to Apache Hudi.
  • Load trained text classification model using UDF to classify sentiment and sink into Apache Hudi in HDFS.
  1. Simple real time text classification upsert to Apache Hudi.
  • Simple real time text classification from Debezium CDC and upsert into Apache Hudi in HDFS.
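
The "real time text classification upsert" entries above combine change data capture with upsert semantics. A minimal in-memory sketch of that flow follows. It assumes simplified Debezium-style events that carry only `op`, `before`, and `after` (real events also include schema and source metadata). The `classify` function, the `id` and `text` columns, and the dictionary used as a sink are all hypothetical stand-ins for the trained model and the PostgreSQL, Kafka, or Hudi sinks.

```python
def classify(text):
    """Hypothetical stand-in for the trained text classification model."""
    return "positive" if "good" in text.lower() else "negative"


def apply_change(table, event, key="id"):
    """Apply one simplified Debezium-style change event to an in-memory table.

    `op` is "c" (create), "r" (snapshot read), "u" (update) or "d" (delete).
    Creates, reads and updates are all handled as an upsert on the key.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        row = event["after"]
        table[row[key]] = {**row, "sentiment": classify(row["text"])}
    elif op == "d":
        table.pop(event["before"][key], None)
    return table


events = [
    {"op": "c", "before": None, "after": {"id": 1, "text": "good phone"}},
    {"op": "c", "before": None, "after": {"id": 2, "text": "battery died"}},
    {"op": "u", "before": {"id": 2, "text": "battery died"},
     "after": {"id": 2, "text": "good battery now"}},
    {"op": "d", "before": {"id": 1, "text": "good phone"}, "after": None},
]

sink = {}
for event in events:
    apply_change(sink, event)

print(sink)
```

Because every write is keyed on the primary key, replaying the same event twice leaves the sink unchanged, which is the property that makes an upsert sink a natural fit for a CDC stream.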

Open Source Agenda is not affiliated with "Gather Deployment" Project. README Source: huseinzol05/Gather-Deployment