Live, real-time dashboard in a serverless Docker web app, deployed via Terraform with a built-in CI/CD trigger
Learn how to deploy a live, real-time dashboard in a Python web app powered by end-to-end infrastructure as code!
The repo sets up simulated real-time IoT data, 3 data pipelines, CI/CD trigger automation, and 40 GCP resources in total through Terraform, all in under 10 minutes!
Main Goal: Get something living and breathing on YOUR screen. Take away what's USEFUL FOR YOU, even if it's simply some code snippets! :scissors: :clipboard:
Questions Explored:
What you'll be making!
What you'll ALSO be making!
Component | Product Overview | Purpose | Azure Equivalents | AWS Equivalents |
---|---|---|---|---|
Cloud Storage | Object store for all kinds of file types | Store sensitive files such as tfstate and the private service account key. Raw data for ad hoc usage | File/Blob Storage | S3 |
Cloud Build | Build workflows for testing and deployment across multiple environments | Deploy, run CI/CD for, and destroy terraform-managed infrastructure | Pipelines | CodePipeline |
Compute Engine | Scalable virtual machines | Simulate devices registering to IoT Core | Virtual Machines | EC2 |
Cloud IoT Core | Manage, deploy, and ingest data from dispersed devices | Manages the simulated devices and ingests their data to Pub/Sub | IoT Hub | IoT Core |
Cloud Pub/Sub | Message queue for ingesting and delivering data to other services | Middleware that serves as a shock-absorber and funnels data for further transformation | Service Bus, Storage Queues | Kinesis |
Key Management Service | Managed encryption keys for secrets protection | Encrypts and decrypts the private service account key for each deployment | Key Vault | KMS |
Cloud Dataflow | Serverless stream and batch data processing | Loads data into parquet files and into a BigQuery table | Stream Analytics | Glue |
Cloud Functions | Event-driven serverless compute | Writes simulated temperature device data to Bigtable | Functions | Lambda Functions |
Cloud Bigtable | NoSQL database for large workloads | Stores simulated device data and is configured for time-series read operations | Table Storage | DynamoDB |
BigQuery | Serverless analytics data warehouse | Stores simulated device data for aggregate reporting metrics using standard SQL | Data Lake Analytics, Data Lake Store | Redshift/Athena depending on who you ask ;) |
Cloud Run | Run Docker containers in a fully-managed, serverless app | Hosts the dash app that visualizes simulated device data in real-time by querying Bigtable every second (see the sketch below this table) | Container Instances | Fargate |
Cloud IAM | Access control for managing cloud resources | Gives cloud build and terraform access to deploy and edit the services in scope | IAM | IAM |
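The Cloud Run row above is the core of the demo: a dash app that re-queries Bigtable on a one-second interval to refresh its chart. Below is a minimal, hedged sketch of that polling pattern, not the repo's actual app; the project, instance, table, and column family names (and the dash 2.x API) are placeholder assumptions.

```python
# Hypothetical sketch of the "query Bigtable every second" pattern (dash 2.x).
# Project, instance, table, and column family names are placeholders.
from dash import Dash, dcc, html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
from google.cloud import bigtable

bt_client = bigtable.Client(project="your-project-id")
table = bt_client.instance("iot-instance").table("device-temperature")

app = Dash(__name__)
app.layout = html.Div([
    dcc.Graph(id="live-graph"),
    dcc.Interval(id="poll", interval=1000, n_intervals=0),  # tick once per second
])

@app.callback(Output("live-graph", "figure"), [Input("poll", "n_intervals")])
def refresh(_n):
    # A real app would scan a row-key range keyed by device id and timestamp;
    # this sketch just reads the first 100 rows in key order.
    times, temps = [], []
    for row in table.read_rows(limit=100):
        cell = row.cells["sensor"][b"temperature"][0]
        times.append(cell.timestamp)
        temps.append(float(cell.value.decode("utf-8")))
    return go.Figure(data=[go.Scatter(x=times, y=temps, mode="lines+markers")])

if __name__ == "__main__":
    app.run_server(host="0.0.0.0", port=8080)
```

On Cloud Run the container only needs to listen on its assigned port, which is why the sketch binds to 0.0.0.0:8080.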
Sign up for a free trial OR use an existing GCP account
Manually fork the repo through the GitHub interface
Note: you will likely be prompted to manually enable the Cloud Build API
Note: The rest of these instructions are written for cloud shell
# set the project ID within cloud shell
gcloud config set project <PROJECT_ID>
git clone https://github.com/<your-github-username>/iot-python-webapp.git
# change directory into the repo
cd iot-python-webapp/
What your terminal should look like
# Example: bash ./initial_setup.sh -e [email protected] -u user_123 -p ferrous-weaver-256122 -s demo-service-account -g gcp_signup_name_3 -b master
# Notes: leave the GITHUB_BRANCH_NAME as "master" for this demo
# You can find the GCP_USERNAME for your project in the cloud shell terminal before the "@" in "realsww123@cloudshell"
# I recommend you investigate the script, which showcases actions that should NOT be managed by terraform
# It creates secret encryptions, terraform service accounts, and buckets as prerequisites to the terraform deployment
# append this syntax to the end of the bash command
# if you want to save your terminal output to a text file
####
2>&1 | tee SomeFile.txt
####
# template
bash ./initial_setup.sh [-e GITHUB_EMAIL] [-u GITHUB_USERNAME] [-p PROJECT_ID] [-s SERVICE_ACCOUNT_NAME] [-g GCP_USERNAME] [-b GITHUB_BRANCH_NAME]
Double check that the secrets file is uploaded to the bucket and that the terraform files reflect the command line arguments you set
# note: enabling apis may lag behind other services
# it is accounted for in the initial setup script above
gcloud builds submit --config=first_build.yaml
first_build.yaml
Verify the build completes successfully, then confirm the Cloud Run service exists:
gcloud beta run services list --platform managed
Click on the link to launch the web app
Instead of manually clicking through ever-changing interfaces in the console for any changes, this step looks through the code and robustly applies those changes in a transparent way. This allows for an easy-to-follow paper trail and rollbacks by simply rerunning a previous build.
Commit and push changes to your GitHub repo. This will automatically trigger a build using the logic in cloudbuild.yaml
# This will create a new commit to the master branch in github
# Note: MUST be the first commit to trigger build properly
# Any other commit will not reference the appropriate terraform config
# you recently created above
git status
git add --all
git commit -m "Update terraform config files"
git push origin master
Explore the cloud build history to verify a successful build
Check to see if the app exists after the cloudbuild history updates.
You should see an updated timestamp on the web app
gcloud beta run services list --platform managed
# deletes devices in IoT registry
# destroys terraform deployed resources
gcloud builds submit --config=destroy_build.yaml
Note: if you want to destroy everything, you can delete everything via the console OR simply delete the project you ran the deployment instructions in for a clean slate!
I store the tfstate in a remote storage bucket to prevent multiple deployments from overwriting each other
Bigtable was used to test how fast reads and writes are for time series data. It turns out each read/write takes less than 500ms on average, which is pretty fast for Python
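Purely as an illustration of how such a latency check might be reproduced (this is not the repo's code), the sketch below times one write and one read with the Python Bigtable client; all names are placeholders.

```python
# Hypothetical latency check for a single Bigtable write and read.
# Project, instance, table, and column family names are placeholders.
import datetime
import time
from google.cloud import bigtable

client = bigtable.Client(project="your-project-id")
table = client.instance("iot-instance").table("device-temperature")

row_key = f"device-001#{int(time.time())}".encode("utf-8")

start = time.perf_counter()
row = table.direct_row(row_key)
row.set_cell("sensor", "temperature", b"23.5",
             timestamp=datetime.datetime.now(datetime.timezone.utc))
row.commit()
write_ms = (time.perf_counter() - start) * 1000

start = time.perf_counter()
table.read_row(row_key)
read_ms = (time.perf_counter() - start) * 1000

print(f"write: {write_ms:.1f} ms, read: {read_ms:.1f} ms")
```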
Terraform has yet to create an official module dependency framework. It currently has resource dependencies, but it's an incredible amount of code overhead to implement for enabling Google APIs: click here. Thankfully, module dependency is on the official roadmap, so I'm leaving this in the backlog to enhance after this feature is released: click here
A Cloud Function writes to Bigtable because it's more than enough to handle 3 devices sending concurrent invocations. Dataflow is an alternative
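As a rough illustration of that choice, here is a hedged sketch of a Pub/Sub-triggered background Cloud Function writing one Bigtable row per device message; the instance, table, column family, and payload field names are assumptions, not the repo's exact code.

```python
# Hypothetical Pub/Sub-triggered Cloud Function that writes device messages to Bigtable.
# Instance, table, column family, and payload field names are placeholders.
import base64
import datetime
import json
from google.cloud import bigtable

client = bigtable.Client(project="your-project-id")
table = client.instance("iot-instance").table("device-temperature")

def handle_pubsub(event, context):
    """Background function entry point: one invocation per Pub/Sub message."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    # Row keys that combine device id and timestamp keep time-series reads cheap
    row_key = f"{payload['device_id']}#{payload['timestamp']}".encode("utf-8")
    row = table.direct_row(row_key)
    row.set_cell(
        "sensor",
        "temperature",
        str(payload["temperature"]).encode("utf-8"),
        timestamp=datetime.datetime.now(datetime.timezone.utc),
    )
    row.commit()
```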
Dataflow Java templates are used to write to BigQuery and GCS because they were easy as pie to implement
KMS is used to launch terraform services with specific role access AND for Cloud Run to access the IoT device registry. In a real-world context, it'd follow least-privilege access principles
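For illustration only, decrypting a KMS-wrapped service account key might look like the sketch below with the Python KMS client; the key ring, key, and file names are placeholders, and the repo may perform this step differently (for example via gcloud inside Cloud Build).

```python
# Hypothetical decryption of a KMS-encrypted service account key file.
# Key ring, key, and file names are placeholders.
from google.cloud import kms

client = kms.KeyManagementServiceClient()
key_name = client.crypto_key_path(
    "your-project-id", "global", "demo-keyring", "demo-key"
)

with open("service_account.json.enc", "rb") as enc_file:
    ciphertext = enc_file.read()

response = client.decrypt(request={"name": key_name, "ciphertext": ciphertext})

with open("service_account.json", "wb") as out_file:
    out_file.write(response.plaintext)
```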
There is no formal testing of this demo outside of multiple walkthroughs of the deployment instructions. My goal was to explore, not to create the most robust app for production on day one
KMS key rings can NOT be deleted, so that GCP has a record of key ring names that can't be used anymore. If you're going to redeploy, you must rename the key ring or it'll error out
An IoT registry cannot be force-deleted if devices are tied to it
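That is why the destroy build deletes the registered devices first. A hedged sketch of that cleanup with the Python Cloud IoT client (the project, region, and registry names are placeholders):

```python
# Hypothetical cleanup: delete every device in a registry, then the registry itself.
# Project, region, and registry names are placeholders.
from google.cloud import iot_v1

project_id = "your-project-id"
cloud_region = "us-central1"
registry_id = "iot-registry"

client = iot_v1.DeviceManagerClient()
registry_path = client.registry_path(project_id, cloud_region, registry_id)

# list_devices returns only device ids by default, so rebuild each full device path
for device in client.list_devices(parent=registry_path):
    device_path = client.device_path(project_id, cloud_region, registry_id, device.id)
    client.delete_device(name=device_path)

client.delete_device_registry(name=registry_path)
```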
Terraform support for Cloud Run still needs further development. Work is needed outside terraform to expose the app to the public internet
For Google APIs, if it's the first time enabling them, the build may error out and force you to manually enable them or rerun the terraform build
Managing secrets and setting up IAM at a granular level is a project of its own. You'll notice most of the roles grant wide permissions for demo purposes
Setting up good parameters for interoperability across modules requires robust, upfront repo planning
Dataflow jobs have to be replaced every time you redeploy infrastructure with terraform, even if you don't make any changes! This will disrupt the live data flow, so be mindful when redeploying
Terraform features typically lag a couple of months behind the release of a new GCP service
Next time, I would create a distinct Pub/Sub push subscription for the cloud function and pull subscriptions for the Dataflow jobs on the same topic to employ the proper throughput mechanisms
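A hedged sketch of what that split might look like with the Python Pub/Sub client; the topic and subscription names and the push endpoint URL are placeholders.

```python
# Hypothetical setup: one push subscription (for the Cloud Function) and one pull
# subscription (for Dataflow) on the same topic. All names and URLs are placeholders.
from google.cloud import pubsub_v1

project_id = "your-project-id"
subscriber = pubsub_v1.SubscriberClient()
topic_path = subscriber.topic_path(project_id, "iot-telemetry")

# Push: Pub/Sub POSTs each message to the function's HTTPS endpoint as it arrives
subscriber.create_subscription(
    request={
        "name": subscriber.subscription_path(project_id, "telemetry-push-to-function"),
        "topic": topic_path,
        "push_config": {"push_endpoint": "https://example-function-url.a.run.app/"},
    }
)

# Pull: Dataflow reads at its own pace, which suits high-throughput stream and batch loads
subscriber.create_subscription(
    request={
        "name": subscriber.subscription_path(project_id, "telemetry-pull-for-dataflow"),
        "topic": topic_path,
    }
)
```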
PLEASE EXPLICITLY VERSION ALL YOUR DEPENDENCIES SUCH AS TERRAFORM PROVIDER VERSIONS AND PYTHON PACKAGES OR THEY WILL BITE YOU IN THE BUTT
My stackshare decision!: Think twitter for developers
IoT Reference Example: The java equivalent of what this repo does
Another IoT Reference Example: Official GCP documentation for reference architecture
Terraform Cloud Build Example: If you want to focus on cloudbuild setup
IoT Pipeline Qwiklab: Where I got the device simulator scripts and general starting point
All feedback is welcome! You can use the issue tracker to submit bugs, ideas, etc. Pull requests are splendid!
My master branch will be protected, so no changes will come through without my formal approval.