Cortex

Production infrastructure for machine learning at scale

v0.34.0

3 years ago

New features

Support handling GET, PUT, PATCH, and DELETE HTTP requests in Realtime APIs (docs) https://github.com/cortexlabs/cortex/pull/2111 https://github.com/cortexlabs/cortex/issues/2063 (RobertLucian)
Support running realtime API containers locally for debugging / development purposes (docs) https://github.com/cortexlabs/cortex/pull/2112 https://github.com/cortexlabs/cortex/issues/2077 (vishalbollu)
Support multiple gRPC services / methods (which can be named arbitrarily) in a single Realtime API (docs) https://github.com/cortexlabs/cortex/pull/2111 https://github.com/cortexlabs/cortex/issues/2063 (RobertLucian)
Support specifying a list of node groups on which a workload is allowed to run (see configuration docs for Realtime, Async, Batch, or Task APIs) https://github.com/cortexlabs/cortex/pull/2098 https://github.com/cortexlabs/cortex/issues/2034 (RobertLucian)
Support AWS GovCloud regions https://github.com/cortexlabs/cortex/pull/2118 https://github.com/cortexlabs/cortex/issues/2103 (vishalbollu)

Breaking changes

"predictor" has been renamed to "handler" throughout the product (API configuration and Python APIs). In addition, as a result of supporting additional HTTP method verbs, predict() has been renamed to handle_post() in Realtime APIs (handle_get(), handle_put(), handle_patch(), and handle_delete() are now also supported). For consistency, predict() has been renamed to handle_async() for Async APIs, and handle_batch() for Batch APIs. See the examples for Realtime, Async, and Batch APIs. Task APIs have not been changed.
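As a rough sketch of the rename, a Realtime API handler after this release might look like the following. Only the method names handle_post() and handle_get() come from the notes above. The class name Handler, the config argument, and the method bodies are illustrative assumptions, so check the linked examples for the exact signatures.

```python
class Handler:
    """Hypothetical Realtime API handler using the v0.34.0 method names."""

    def __init__(self, config):
        # `config` stands in for whatever configuration is passed at startup (assumed).
        self.greeting = config.get("greeting", "hello")

    def handle_post(self, payload):
        # Formerly predict(): handles POST requests.
        return {"message": f"{self.greeting}, {payload['name']}"}

    def handle_get(self):
        # Newly supported verb: handles GET requests.
        return {"status": "ok"}


# Call the methods directly, outside of Cortex, to see what they return.
handler = Handler({"greeting": "hi"})
print(handler.handle_post({"name": "cortex"}))
print(handler.handle_get())
```

Code written for earlier versions would need its predict() method renamed to handle_post(); handle_put(), handle_patch(), and handle_delete() would be added the same way as handle_get() here.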

Bug fixes

Fix invalid Async workload status during processing https://github.com/cortexlabs/cortex/pull/2106 https://github.com/cortexlabs/cortex/issues/2104 (RobertLucian)

Docs

Add docs for configuring Grafana alerts (RobertLucian)
Document how to create a Cortex cluster without administrator IAM access (vishalbollu)
Add docs for mirroring Cortex's docker images to a private repo (vishalbollu)

Misc

Support JSON output for the cortex cluster info command https://github.com/cortexlabs/cortex/pull/2089 https://github.com/cortexlabs/cortex/issues/2062 (RobertLucian)
Allow nodegroups to be scaled down to max_instances == 0 https://github.com/cortexlabs/cortex/pull/2095 (deliahu)

v0.33.0

3 years ago

New features

Allow specifying a CIDR range whitelist for APIs and the operator (docs) https://github.com/cortexlabs/cortex/pull/2071 https://github.com/cortexlabs/cortex/issues/2003 (vishalbollu)
Enable CORS for async, batch, and task APIs https://github.com/cortexlabs/cortex/pull/2082 https://github.com/cortexlabs/cortex/issues/2073 (deliahu)
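To illustrate what a CIDR range whitelist means in practice, here is a minimal sketch using Python's standard ipaddress module. It is not Cortex's implementation: the ALLOWED_RANGES values and the is_allowed() helper are hypothetical and only show how a client address can be checked against a list of CIDR ranges.

```python
import ipaddress

# Hypothetical whitelist; these ranges are examples, not Cortex defaults.
ALLOWED_RANGES = ["10.0.0.0/8", "203.0.113.0/24"]


def is_allowed(client_ip, cidr_ranges):
    """Return True if client_ip falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


print(is_allowed("10.1.2.3", ALLOWED_RANGES))
print(is_allowed("198.51.100.7", ALLOWED_RANGES))
```

The first address lies inside 10.0.0.0/8 and would be accepted; the second matches neither range and would be rejected.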

Breaking changes

The onnx predictor type has been replaced by the python predictor type; please use the python predictor type instead (all onnx models are fully supported by the python predictor type)

Bug fixes

Fix bug affecting async api consistency during heavy traffic https://github.com/cortexlabs/cortex/pull/2072 (RobertLucian)
Fix bug affecting async api updates https://github.com/cortexlabs/cortex/pull/2067 (vishalbollu)

Misc

Rename cortex cluster configure command to cortex cluster scale https://github.com/cortexlabs/cortex/pull/2040 https://github.com/cortexlabs/cortex/issues/1972 (RobertLucian)
Disable AZRebalance autoscaling group process https://github.com/cortexlabs/cortex/pull/2042 https://github.com/cortexlabs/cortex/issues/1349 (RobertLucian)
Add horizontal pod autoscaler to async API gateway https://github.com/cortexlabs/cortex/pull/2079 https://github.com/cortexlabs/cortex/issues/2078 (RobertLucian)
Rename async modules to async_api to avoid name collision with the reserved keyword in Python 3.7+ https://github.com/cortexlabs/cortex/pull/2066 https://github.com/cortexlabs/cortex/issues/2052 (vishalbollu)
Back up images to Docker Hub https://github.com/cortexlabs/cortex/pull/2081 (vishalbollu)
Add additional debugging info for cluster up failures https://github.com/cortexlabs/cortex/pull/2080 https://github.com/cortexlabs/cortex/issues/2027 (vishalbollu)
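The async module rename above follows from async becoming a reserved keyword in Python 3.7. The short check below, which uses only the standard library, shows why a module named async can no longer be imported while async_api can. The compile() calls only parse the import statements and do not require either module to exist.

```python
import keyword

# True on Python 3.7 and later, where `async` is a reserved keyword.
is_reserved = keyword.iskeyword("async")


def parses(source):
    """Return True if the given source compiles without a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


old_name_ok = parses("import async")      # fails to parse on 3.7+
new_name_ok = parses("import async_api")  # parses fine
print(is_reserved, old_name_ok, new_name_ok)
```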

v0.32.0

3 years ago

New features

Add gRPC support to realtime APIs (docs) https://github.com/cortexlabs/cortex/pull/1997 https://github.com/cortexlabs/cortex/issues/1056 (RobertLucian)
Add support for ONNX and TensorFlow predictor types in async APIs (docs) https://github.com/cortexlabs/cortex/pull/1996 https://github.com/cortexlabs/cortex/issues/1980 (miguelvr)
Support using ECR images from other AWS accounts and regions https://github.com/cortexlabs/cortex/pull/2011 https://github.com/cortexlabs/cortex/issues/1988 (vishalbollu)

Breaking changes

GCP support has been removed so that we can focus our efforts on improving the scalability, reliability, and security of Cortex on AWS. Cortex on GCP will still be available in v0.31. If you are currently using Cortex on GCP, our team will be happy to help you migrate to AWS or work with you to find alternative solutions. Please feel free to reach out to us on Slack or email us at [email protected] if you're interested.

Bug fixes

Fix memory plots on Grafana dashboards for realtime and batch APIs https://github.com/cortexlabs/cortex/pull/2024 https://github.com/cortexlabs/cortex/pull/2014 https://github.com/cortexlabs/cortex/issues/1970 (RobertLucian)

Docs

Misc docs improvements https://github.com/cortexlabs/cortex/pull/1994 (ospillinger)

Misc

Increase kubelet's registryPullQPS limit from 5 to 10 https://github.com/cortexlabs/cortex/pull/2023 https://github.com/cortexlabs/cortex/issues/1989 (miguelvr)
Pin the AMI version https://github.com/cortexlabs/cortex/pull/2010 https://github.com/cortexlabs/cortex/issues/1975 https://github.com/cortexlabs/cortex/issues/1615 (vishalbollu)

v0.31.1

3 years ago

Bug fixes

Fix preemptible node pools on GCP not autoscaling https://github.com/cortexlabs/cortex/pull/1981 (vishalbollu)
Fix replica autoscaler targeting incorrect deployments on operator restart https://github.com/cortexlabs/cortex/pull/1982 (miguelvr)
Fix replica autoscaler not being reinitialized for running APIs on operator restart on GCP https://github.com/cortexlabs/cortex/pull/1984 (vishalbollu)

v0.31.0

3 years ago

New features

Add support for AsyncAPI (experimental) (docs) https://github.com/cortexlabs/cortex/pull/1935 https://github.com/cortexlabs/cortex/issues/1610 (miguelvr)
Add support for multi-instance-type clusters to AWS/GCP providers (experimental) (aws/gcp docs) https://github.com/cortexlabs/cortex/pull/1951 (RobertLucian)
Allow users to duplicate/mirror traffic using shadow pipelines (docs) https://github.com/cortexlabs/cortex/pull/1948 https://github.com/cortexlabs/cortex/issues/1889 (vishalbollu)

Breaking changes

on_demand_backup in cluster configuration has been removed in favour of using a cluster with a mixture of spot and on-demand nodegroups. See multi-instance documentation for aws and gcp for more details.

Bug fixes

Fix Python client not respecting CORTEX_CLI_CONFIG_DIR environment variable for client-id.txt https://github.com/cortexlabs/cortex/pull/1953 (jackmpcollins)
Prevent threads from being stuck in DynamicBatcher https://github.com/cortexlabs/cortex/pull/1915 (cbensimon)
Fix unexpected cortex logs termination by increasing buffer size https://github.com/cortexlabs/cortex/pull/1939 (vishalbollu)
Decouple cluster deletion from EBS volume deletion for cortex cluster down https://github.com/cortexlabs/cortex/pull/1954 (deliahu)
Fix spot/on-demand GPU instances not joining the cluster by upgrading to eksctl 0.40.0 https://github.com/cortexlabs/cortex/pull/1955 (vishalbollu)
Prevent premature queue not found errors by preserving the SQS queue for several minutes after the job has completed https://github.com/cortexlabs/cortex/pull/1952 (vishalbollu)

Docs

Update docs https://github.com/cortexlabs/cortex/pull/1949 (ospillinger)

Misc

Configure a default cortex client to manage APIs from within cortex workloads https://github.com/cortexlabs/cortex/pull/1942 https://github.com/cortexlabs/cortex/issues/1644 (RobertLucian)
Save batch metrics to cloud to preserve job metrics history https://github.com/cortexlabs/cortex/pull/1940 (vishalbollu)

v0.30.0

3 years ago

New features

Record custom metrics from predictors and view them in Grafana (docs) https://github.com/cortexlabs/cortex/pull/1910 https://github.com/cortexlabs/cortex/issues/1897 (miguelvr)
Add granular pod metrics to the Grafana dashboards https://github.com/cortexlabs/cortex/pull/1905 (RobertLucian)
Add node metrics to Grafana dashboards https://github.com/cortexlabs/cortex/pull/1900 (miguelvr)

Breaking changes

Remove support for installing Cortex on your own Kubernetes Cluster https://github.com/cortexlabs/cortex/pull/1921 (RobertLucian)

Bug fixes

Fix bug where successfully completed jobs were marked as completed with errors https://github.com/cortexlabs/cortex/pull/1913 (vishalbollu)
Fix bug where batch jobs were being terminated unnecessarily https://github.com/cortexlabs/cortex/pull/1917 (vishalbollu)
Prevent cluster autoscaler from reallocating job pods https://github.com/cortexlabs/cortex/pull/1919 (vishalbollu)
Address AWS cluster up quota issues such as not enough NAT Gateways or EIPs https://github.com/cortexlabs/cortex/pull/1912 (RobertLucian)
Delete unused prometheus volume on cluster down https://github.com/cortexlabs/cortex/pull/1863 (miguelvr)
Create .cortex dir if not present https://github.com/cortexlabs/cortex/pull/1909 (RobertLucian)

Docs

Add docs for accessing dashboard through private load balancer (docs) https://github.com/cortexlabs/cortex/pull/1907 (deliahu)

Misc

Allow specifying paths for requirements.txt, conda-packages.txt & dependencies.sh (docs) https://github.com/cortexlabs/cortex/pull/1896 https://github.com/cortexlabs/cortex/pull/1927 https://github.com/cortexlabs/cortex/issues/1777 (miguelvr)
Log relevant kubernetes events to API specific log streams https://github.com/cortexlabs/cortex/pull/1906 https://github.com/cortexlabs/cortex/issues/833 (miguelvr)
Support credentials using AWS_SESSION_TOKEN with the CLI/Client (docs) https://github.com/cortexlabs/cortex/pull/1908 https://github.com/cortexlabs/cortex/pull/1920 https://github.com/cortexlabs/cortex/issues/1134 https://github.com/cortexlabs/cortex/issues/1865 (vishalbollu)
Provide auth to Operator and APIs by attaching IAM policies to the cluster (docs) https://github.com/cortexlabs/cortex/pull/1908 https://github.com/cortexlabs/cortex/issues/1858 (vishalbollu)

v0.29.0

3 years ago

New features

Add Grafana dashboard for APIs (docs) https://github.com/cortexlabs/cortex/pull/1867 https://github.com/cortexlabs/cortex/pull/1885 https://github.com/cortexlabs/cortex/pull/1890 https://github.com/cortexlabs/cortex/pull/1887 (miguelvr)
Support API autoscaling in GCP clusters (docs) https://github.com/cortexlabs/cortex/pull/1814 https://github.com/cortexlabs/cortex/pull/1879 https://github.com/cortexlabs/cortex/issues/1601 (miguelvr)
Support traffic splitting in GCP clusters (docs) https://github.com/cortexlabs/cortex/pull/1892 https://github.com/cortexlabs/cortex/issues/1660 (miguelvr)

Breaking changes

The default Docker images for APIs have been slimmed down to not include packages other than what Cortex requires to function. Therefore, when deploying APIs, it is now necessary to include the dependencies that your predictor needs in requirements.txt (docs) and/or dependencies.sh (docs).

Bug fixes

Disable dynamic batcher for TensorFlow predictor type https://github.com/cortexlabs/cortex/pull/1888 (miguelvr)
Support empty directory objects for models saved in S3/GCS https://github.com/cortexlabs/cortex/pull/1830 https://github.com/cortexlabs/cortex/issues/1829 (RobertLucian)
Fix bug which prevented Task APIs on GCP from being cleaned up after completion https://github.com/cortexlabs/cortex/pull/1871 (RobertLucian)

Docs

Add documentation for using a version of Python other than the default via dependencies.sh (docs) or custom images (docs) https://github.com/cortexlabs/cortex/pull/1862 https://github.com/cortexlabs/cortex/issues/1779 (RobertLucian)

Misc

Support deploying predictor Python classes from more environments (e.g. from separate Python files, AWS Lambda) https://github.com/cortexlabs/cortex/pull/1883 https://github.com/cortexlabs/cortex/commit/3a1b777d06e660a49b6223badda4c5e8b1fe4ec1 https://github.com/cortexlabs/cortex/issues/1824 https://github.com/cortexlabs/cortex/issues/1826 (vishalbollu)
Improve error logging for Batch and Task APIs https://github.com/cortexlabs/cortex/pull/1866 https://github.com/cortexlabs/cortex/issues/1833 (RobertLucian)

v0.28.0

3 years ago

New features

Support installing Cortex on an existing Kubernetes cluster (on AWS or GCP) (docs) https://github.com/cortexlabs/cortex/pull/1837 https://github.com/cortexlabs/cortex/issues/1808 (vishalbollu)

Breaking changes

The cloudwatch dashboard has been removed as a result of our switch to Prometheus for metrics aggregation. The dashboard will be replaced with an alternative in an upcoming release.

Bug fixes

Fix bug which can cause requests to APIs from a Python client to timeout during cluster autoscaling https://github.com/cortexlabs/cortex/pull/1841 https://github.com/cortexlabs/cortex/issues/1840 (RobertLucian)
Fix bug which can cause downscale_stabilization_period to be disregarded during downscaling https://github.com/cortexlabs/cortex/pull/1847 https://github.com/cortexlabs/cortex/issues/1846 (RobertLucian)

Misc

AWS credentials are no longer required to connect the CLI to the cluster operator. If you need to restrict access to your cluster operator, configure the operator's load balancer to be private by setting operator_load_balancer_scheme: internal in your cluster configuration file, and set up VPC Peering. We plan to support a new auth strategy in an upcoming release.
Improve S6 error code/signal handling https://github.com/cortexlabs/cortex/pull/1825 https://github.com/cortexlabs/cortex/issues/1703 (RobertLucian)

v0.27.0

3 years ago

New features

Add new API type TaskAPI for running arbitrary Python jobs (docs) https://github.com/cortexlabs/cortex/pull/1717 https://github.com/cortexlabs/cortex/issues/253 (miguelvr, RobertLucian)
Write Cortex's logs as structured logs, and allow use of Cortex's structured logger in predictors (supports adding extra fields) (aws docs, gcp docs) https://github.com/cortexlabs/cortex/pull/1778 https://github.com/cortexlabs/cortex/pull/1803 https://github.com/cortexlabs/cortex/pull/1804 https://github.com/cortexlabs/cortex/issues/1732 https://github.com/cortexlabs/cortex/issues/1563 (vishalbollu)
Support preemptible instances on GCP (docs) https://github.com/cortexlabs/cortex/pull/1791 https://github.com/cortexlabs/cortex/issues/1631 (RobertLucian)
Support private load balancers on GCP (docs) https://github.com/cortexlabs/cortex/pull/1786 https://github.com/cortexlabs/cortex/issues/1621 (deliahu)
Support GCP instances with multiple GPUs (docs) https://github.com/cortexlabs/cortex/pull/1789 https://github.com/cortexlabs/cortex/issues/1784 (deliahu)

Breaking changes

cortex logs now streams logs from a single replica at random when there are multiple replicas for an API. The recommended way to analyze production logs is via a dedicated logging tool (by default, logs are sent to CloudWatch on AWS and StackDriver on GCP)

Bug fixes

Misc Python client fixes https://github.com/cortexlabs/cortex/pull/1798 https://github.com/cortexlabs/cortex/pull/1782 https://github.com/cortexlabs/cortex/pull/1772 (vishalbollu, RobertLucian)

Docs

Document the shared /mnt directory for TensorFlow predictors https://github.com/cortexlabs/cortex/pull/1802 https://github.com/cortexlabs/cortex/issues/1792 (deliahu)
Misc GCP docs improvements https://github.com/cortexlabs/cortex/pull/1799 (deliahu)

Misc

Improve out-of-memory status reporting (RobertLucian)
Improve batch job cleanup process https://github.com/cortexlabs/cortex/pull/1797 https://github.com/cortexlabs/cortex/pull/1796 (vishalbollu)
Remove gRPC message send/receive limit https://github.com/cortexlabs/cortex/pull/1769 https://github.com/cortexlabs/cortex/issues/1740 (RobertLucian)

v0.26.0

3 years ago

New features

Support configuring the log level for APIs (docs) https://github.com/cortexlabs/cortex/pull/1741 https://github.com/cortexlabs/cortex/issues/1484 (RobertLucian)
Support creating a cluster in an existing AWS VPC (docs) https://github.com/cortexlabs/cortex/pull/1759 https://github.com/cortexlabs/cortex/issues/1142 (deliahu)
Support specifying the GCP network and subnet for the Cortex cluster (docs) https://github.com/cortexlabs/cortex/pull/1752 https://github.com/cortexlabs/cortex/issues/1738 (deliahu)
Support configuring shared memory size (shm) for inter-process communication (docs) https://github.com/cortexlabs/cortex/pull/1756 https://github.com/cortexlabs/cortex/issues/1638 (vishalbollu)

Breaking changes

The local provider has been removed. The best way to test your predictor implementation locally is to import it in a separate Python file and call your __init__() and predict() functions directly. The best way to test your API is to deploy it to a dev/test cluster.
Built-in support for API Gateway has been removed. If you need to create an https endpoint with valid certs, some options are to set up a custom domain or to manually create an API Gateway.
Prediction monitoring has been removed. We are exploring how to build a more powerful and customizable solution for this.
The predict CLI command has been deleted. curl, requests, etc. are the best tools for testing APIs.
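The local testing approach recommended above (calling __init__() and predict() directly) could look roughly like this. The PythonPredictor class name, the config keys, and the payload shape are assumptions made for illustration. In a real project the class would live in its own file and be imported into the test script; it is inlined here so the sketch is self-contained.

```python
class PythonPredictor:
    """Hypothetical predictor; in practice this would be imported from your predictor file."""

    def __init__(self, config):
        # Assumed config key; a real predictor would typically load a model here.
        self.threshold = config["threshold"]

    def predict(self, payload):
        # Toy logic standing in for real model inference.
        return "positive" if payload["score"] >= self.threshold else "negative"


# Exercise the predictor directly, without deploying an API.
predictor = PythonPredictor({"threshold": 0.5})
result = predictor.predict({"score": 0.9})
print(result)
```

This only checks the predictor logic; request handling and cluster behavior still need to be verified against a dev/test cluster, as the note above says.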

Bug fixes

For multi-model APIs, allow model names to share a prefix https://github.com/cortexlabs/cortex/pull/1745 https://github.com/cortexlabs/cortex/issues/1699 (RobertLucian)

Docs

Misc docs improvements (ospillinger)