Cortex deploys your machine learning models to your cloud infrastructure: you define a deployment with simple declarative configuration, and Cortex serves your models as JSON APIs on your AWS account. It also handles autoscaling, rolling updates, log streaming, and inference on CPUs or GPUs.
Cortex is actively maintained by Cortex Labs. We're a venture-backed team of infrastructure engineers and we're hiring.
Define your deployment using declarative configuration:
```yaml
# cortex.yaml

- kind: api
  name: my-api
  model: s3://my-bucket/my-model.zip
  request_handler: handler.py
  compute:
    min_replicas: 5
    max_replicas: 20
```
Customize request handling (optional):
```python
# handler.py

def pre_inference(sample, metadata):
    # transform the request payload before it's passed to the model
    return sample

def post_inference(prediction, metadata):
    # transform the model output before it's returned to the client
    return prediction
```
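For instance, a handler might scale an input feature and map the model's class index back to a label. A hypothetical sketch; the feature name, label set, and shape of `prediction` are assumptions for illustration, not part of Cortex's API:

```python
# handler.py -- a hypothetical example

LABELS = ["abc", "def", "ghi"]  # hypothetical label set

def pre_inference(sample, metadata):
    # Scale a numeric feature before it reaches the model
    # (the feature name "a" is borrowed from the curl example below).
    sample["a"] = sample["a"] / 100.0
    return sample

def post_inference(prediction, metadata):
    # Map a predicted class index to a human-readable label
    # (assumes the model returns an integer class index).
    return {"prediction": LABELS[int(prediction)]}
```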
Deploy to your cloud infrastructure:
```bash
$ cortex deploy
Deploying ...
Ready! https://amazonaws.com/my-api
```
Serve real-time predictions via scalable JSON APIs:

```bash
$ curl -d '{"a": 1, "b": 2, "c": 3}' https://amazonaws.com/my-api

{"prediction": "def"}
```
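The endpoint is plain HTTP, so any client works; a minimal sketch using Python's `requests` library, with the URL taken from the deploy output above:

```python
import requests

# URL comes from the `cortex deploy` output above
response = requests.post(
    "https://amazonaws.com/my-api",
    json={"a": 1, "b": 2, "c": 3},
)
print(response.json())  # e.g. {"prediction": "def"}
```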
- **Minimal declarative configuration:** Deployments can be defined in a single `cortex.yaml` file.
- **Autoscaling:** Cortex automatically scales APIs to handle production workloads.
- **Multi-framework:** Cortex supports TensorFlow, Keras, PyTorch, Scikit-learn, XGBoost, and more.
- **Rolling updates:** Cortex updates deployed APIs without any downtime.
- **Log streaming:** Cortex streams logs from your deployed models to your CLI.
- **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.