SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, and managed execution.
SkyPilot abstracts away cloud infrastructure burdens.
SkyPilot maximizes GPU availability for your jobs.
SkyPilot cuts your cloud costs.
SkyPilot supports your existing GPU, TPU, and CPU workloads, with no code changes.
Install with pip. We recommend the nightly build for the latest features, or installing from source:
pip install "skypilot-nightly[aws,gcp,azure,oci,lambda,runpod,fluidstack,paperspace,cudo,ibm,scp,kubernetes]" # choose your clouds
To get the latest release, use:
pip install -U "skypilot[aws,gcp,azure,oci,lambda,runpod,fluidstack,paperspace,cudo,ibm,scp,kubernetes]" # choose your clouds
Currently supported providers: AWS, Azure, GCP, OCI, Lambda Cloud, RunPod, Fluidstack, Paperspace, Cudo, IBM, Samsung, Cloudflare, and any Kubernetes cluster.
You can find our documentation here.
A SkyPilot task specifies: resource requirements, data to be synced, setup commands, and the task commands.
Once written in this unified interface (YAML or Python API), the task can be launched on any available cloud. This avoids vendor lock-in and makes it easy to move jobs to a different provider.
Paste the following into a file my_task.yaml:
resources:
  accelerators: V100:1  # 1x NVIDIA V100 GPU

num_nodes: 1  # Number of VMs to launch

# Working directory (optional) containing the project codebase.
# Its contents are synced to ~/sky_workdir/ on the cluster.
workdir: ~/torch_examples

# Commands to be run before executing the job.
# Typical use: pip install -r requirements.txt, git clone, etc.
setup: |
  pip install "torch<2.2" torchvision --index-url https://download.pytorch.org/whl/cu121

# Commands to run as a job.
# Typical use: launch the main program.
run: |
  cd mnist
  python main.py --epochs 1
Prepare the workdir by cloning:
git clone https://github.com/pytorch/examples.git ~/torch_examples
Launch with sky launch (note: access to GPU instances is needed for this example):
sky launch my_task.yaml
SkyPilot then performs the heavy lifting for you, including:
Syncing the local workdir to the VM
Running the task's setup commands to prepare the VM for running the task
Running the task's run commands
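The same task can also be expressed through the Python API mentioned earlier. The following is a minimal sketch rather than a definitive recipe: it assumes SkyPilot is installed, that cloud credentials are configured, and that ~/torch_examples has been cloned as described above. The names TASK_SPEC, launch_mnist, and the cluster name mnist-demo are hypothetical and chosen only for illustration; the exact return value of sky.launch may differ between SkyPilot versions.

```python
# Values copied from my_task.yaml so the two forms stay comparable.
# TASK_SPEC is an illustrative helper, not part of SkyPilot.
TASK_SPEC = {
    "accelerators": "V100:1",  # 1x NVIDIA V100 GPU
    "num_nodes": 1,            # Number of VMs to launch
    "workdir": "~/torch_examples",
    "setup": (
        'pip install "torch<2.2" torchvision '
        "--index-url https://download.pytorch.org/whl/cu121"
    ),
    "run": "cd mnist\npython main.py --epochs 1",
}


def launch_mnist(cluster_name="mnist-demo"):
    """Build the task with the Python API and launch it (hypothetical helper)."""
    # Imported lazily so this file can be loaded without SkyPilot installed.
    import sky

    task = sky.Task(
        setup=TASK_SPEC["setup"],
        run=TASK_SPEC["run"],
        workdir=TASK_SPEC["workdir"],
        num_nodes=TASK_SPEC["num_nodes"],
    )
    task.set_resources(sky.Resources(accelerators=TASK_SPEC["accelerators"]))
    # Provisions a cluster and runs the task; this incurs cloud costs.
    return sky.launch(task, cluster_name=cluster_name)
```

Calling launch_mnist() is roughly the programmatic counterpart of running sky launch my_task.yaml; nothing is provisioned until the function is actually called.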
Refer to Quickstart to get started with SkyPilot.
To learn more, see our Documentation and Tutorials.
Runnable examples: see the llm/ and examples/ directories.
Follow updates:
Read the research:
We are excited to hear your feedback!
For general discussions, join us on the SkyPilot Slack.
We welcome and value all contributions to the project! Please refer to CONTRIBUTING for how to get involved.