Python library for parallel maps running directly on Kubernetes. Intended for running many expensive tasks (minutes in runtime). Alpha stage. Currently supports only Google Cloud.
Kubeface aims for reasonably efficient execution of many long-running Python tasks with medium-sized (up to a few gigabytes) inputs and outputs. Design choices and assumptions:
The primary motivating application has been neural network model selection for the MHCflurry project.
See example.py for a simple working example.
client.map(...) creates a job
If a user calls (where client is a kubeface.Client instance):
client.map(lambda x: x**2, range(10))
This creates a job containing 10 tasks. The return value is a generator that will yield the square of the numbers 0-9. The job is executed as follows:
A kubectl command is issued that creates a bare pod whose entrypoint (i.e. what runs in the pod) installs kubeface if necessary, then calls the command _kubeface-run-task <input-path> <output-path>.
The _kubeface-run-task command downloads the input file from cloud storage, runs the task, and uploads the result to the specified path.
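The flow above can be approximated locally. This is a minimal sketch, not kubeface's actual implementation: it assumes standard pickle for the inputs and outputs, a temporary local directory in place of cloud storage, and in-process calls in place of pods. The names local_map and run_task are hypothetical, and the function is passed to run_task directly rather than recovered from the input file, so the sketch does not depend on how kubeface serializes functions.

```python
import pickle
import tempfile
from pathlib import Path


def run_task(function, input_path, output_path):
    """Rough stand-in for `_kubeface-run-task <input-path> <output-path>`.

    Reads the task input, runs the function on it, and writes the result.
    """
    with open(input_path, "rb") as fd:
        argument = pickle.load(fd)
    result = function(argument)
    with open(output_path, "wb") as fd:
        pickle.dump(result, fd)


def local_map(function, iterable):
    """Hypothetical local analogue of client.map: one task per item.

    Returns a generator, so nothing runs until the results are iterated.
    """
    with tempfile.TemporaryDirectory() as storage:  # plays the role of a bucket
        output_paths = []
        for i, item in enumerate(iterable):
            input_path = Path(storage) / f"task-{i}-input.pkl"
            output_path = Path(storage) / f"task-{i}-output.pkl"
            with open(input_path, "wb") as fd:
                pickle.dump(item, fd)
            # Kubeface would create a pod here; this sketch runs in-process.
            run_task(function, input_path, output_path)
            output_paths.append(output_path)
        for output_path in output_paths:
            with open(output_path, "rb") as fd:
                yield pickle.load(fd)


squares = list(local_map(lambda x: x**2, range(10)))
print(squares)
```

Each item gets its own input and output file, mirroring the one-task-per-pod layout described above.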
Kubeface tasks execute in the context of a particular docker image, since they run in a kubernetes pod. You can use any docker image with python installed. If your docker image does not have kubeface installed, then by default kubeface will try to install itself using pip. This is inefficient since it will run for every task. If you plan on running many tasks, it's a good idea to create your own docker image with kubeface installed.
Kubeface writes out HTML and JSON status pages to cloud storage and logs to stdout. However, the best way to figure out what's going on with your job is to use kubernetes directly, via kubectl get pods and kubectl logs <pod-name>.
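When a job has many pods, it can help to summarize the output of kubectl rather than read it line by line. The following is a small sketch, not part of kubeface: it counts pods by phase from the JSON that kubectl get pods -o json prints. The helper names and the sample pod names are made up for illustration.

```python
import json
import subprocess
from collections import Counter


def summarize_pod_phases(pods_json):
    """Count pods by phase, given the text output of `kubectl get pods -o json`."""
    pods = json.loads(pods_json)["items"]
    return Counter(
        pod.get("status", {}).get("phase", "Unknown") for pod in pods
    )


def get_pods_json():
    """Run kubectl and return its JSON output (needs a configured cluster)."""
    return subprocess.check_output(
        ["kubectl", "get", "pods", "-o", "json"], text=True
    )


# Hand-written payload so the example runs without a cluster.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "example-task-0"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "example-task-1"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "example-task-2"}, "status": {"phase": "Succeeded"}},
    ]
})
phase_counts = summarize_pod_phases(sample)
print(dict(phase_counts))
```

Against a real cluster, summarize_pod_phases(get_pods_json()) would give the same kind of summary for the pods in the current namespace.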
From a checkout:
pip install -e .
To run the tests:
# Setting this environment variable is optional.
# If you set it, the tests will run against a real Google Cloud Storage bucket.
# See https://developers.google.com/identity/protocols/application-default-credentials#howtheywork;
# you need to get Application Default Credentials before writing to your bucket.
export KUBEFACE_STORAGE=gs://kubeface-test  # tests will write to gs://kubeface-test.

# Run tests:
nosetests
The kubeface-run command runs a job from the shell, which is useful for testing or simple tasks.
If you don’t already have a kubernetes cluster running, use a command like this to start one:
gcloud config set compute/zone us-east1-c
gcloud components install kubectl  # if you haven't already installed kubectl
gcloud container clusters create kubeface-cluster-$(whoami) \
    --scopes storage-full \
    --zone us-east1-c \
    --num-nodes=2 \
    --enable-autoscaling --min-nodes=1 --max-nodes=100 \
    --machine-type=n1-standard-16
You should see your cluster listed here: https://console.cloud.google.com/kubernetes/list
Then run this to set it as the default for your session:
gcloud config set container/cluster kubeface-cluster-$(whoami)
gcloud container clusters get-credentials kubeface-cluster-$(whoami)
Now launch a command:
kubeface-run \
    --expression 'value**2' \
    --generator-expression 'range(10)' \
    --kubeface-max-simultaneous-tasks 10 \
    --kubeface-backend kubernetes \
    --kubeface-worker-image continuumio/anaconda3 \
    --kubeface-kubernetes-task-resources-cpu 1 \
    --kubeface-kubernetes-task-resources-memory-mb 500 \
    --verbose \
    --out-csv /tmp/result.csv
If you kill the above command, you can run this to kill all the running pods in your cluster:
kubectl delete pods --all
When you’re done working, delete your cluster:
gcloud container clusters delete kubeface-cluster-$(whoami)