SkyPilot Versions

SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.

v0.2.2

1 year ago

What's Changed

This is a patch release with several bug fixes for TPU, Spot, On-prem, and Storage.

Detailed announcements will be made in 0.3.0.

v0.2.0

1 year ago

We are excited to release SkyPilot 0.2.0, which brings a host of new features along with many enhancements and fixes.

Highlights

  • Managed Spot is made much more robust and easier to use.
    • Try using sky spot launch on your existing YAMLs!
    • We've seen users running 1000s of spot jobs on a recurring schedule.
  • TPU Pods are now supported.
    • To use a TPU Pod, change, for example, accelerators: tpu-v2-8 to accelerators: tpu-v2-32.
  • Benchmark: use sky bench to easily measure the performance and cost of different cloud resources for your task.
  • Provisioning is sped up by ~1 minute.
  • Catalog is updated to V3 with 100s of resource changes and 1000s of price changes.
    • A100-80GB is now available on 3 clouds. Check out sky show-gpus -a for GPU prices.
    • No action needed as this will be automatically downloaded.
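
The TPU Pod switch described above comes down to editing a single field. As a minimal sketch, assuming a task spec is held as a Python dict that mirrors the YAML layout (the resources.accelerators nesting, the run command, and the helper name with_accelerator are illustrative assumptions, not SkyPilot APIs):

```python
import copy


def with_accelerator(task: dict, accelerator: str) -> dict:
    """Return a copy of `task` requesting a different accelerator.

    `task` is a plain dict standing in for a task YAML; this helper is
    hypothetical and does not call SkyPilot.
    """
    new_task = copy.deepcopy(task)  # leave the caller's spec untouched
    new_task.setdefault("resources", {})["accelerators"] = accelerator
    return new_task


# Hypothetical single-host TPU task.
single_host = {
    "resources": {"accelerators": "tpu-v2-8"},
    "run": "python train.py",  # placeholder command
}

# Same task, now requesting a TPU Pod slice.
pod = with_accelerator(single_host, "tpu-v2-32")
print(pod["resources"]["accelerators"])
```

Everything else in the spec stays the same; only the accelerator name changes.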

CLI & Task interface

New Features

  • Add zone support in YAML #1014
  • Add shell completion support for CLI #1162
  • Add --no-setup option to sky launch to allow for remounting of files without running setup commands again #1184
  • Add sky start --all to start all clusters #1065
  • Add glob support for sky storage delete #1117
  • Add --no-follow option to sky logs and sky spot logs (print logs so far and exit)

Enhancements

  • Show vCPUs in optimizer/benchmark messages #1076
  • Make entrypoint optional: for quick VM launching, no more sky launch <flags> '', simply do sky launch <flags> #1191
  • Make sky check automatically enable necessary GCP APIs (#1197, #1209); make it more robust for AWS checks (#1194)

Managed spot

New Features

  • sky spot launch now automatically translates file_mounts in a YAML to use cloud storage. #1081 #1215
    • This means the same YAML for on-demand resources launched by sky launch can now be launched by sky spot launch.
  • Add --retry-until-up for sky spot launch; improve the responsiveness of sky spot cancel #1098
  • Expose a $SKYPILOT_RUN_ID environment variable shared by all recoveries of the same spot job (useful for identifying it in Weights & Biases) #1196
    • See the last Note block in docs.
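
Because $SKYPILOT_RUN_ID stays the same across recoveries of one spot job, a training script can read it to keep logging under a single experiment identifier. A minimal sketch, assuming the script may also be run outside a managed spot job (the helper name stable_run_id and the "local-" fallback are hypothetical):

```python
import os
import uuid


def stable_run_id(environ=os.environ) -> str:
    """Return an identifier that survives spot recoveries.

    Uses SKYPILOT_RUN_ID when it is set; otherwise falls back to a
    random local ID (an assumption for runs outside a spot job).
    """
    run_id = environ.get("SKYPILOT_RUN_ID")
    if run_id:
        return run_id
    return f"local-{uuid.uuid4().hex[:8]}"


run_id = stable_run_id()
print(f"Logging under run id: {run_id}")
```

The returned value could then be handed to an experiment tracker as the run identifier, for example wandb.init(id=run_id, resume="allow") if the Weights & Biases client is installed, so a recovered job continues the same run rather than starting a new one.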

Enhancements

Fixes

TPU support

Provisioner

Enhancements

Fixes

On-prem

Enhancements

Fixes

Backend

Enhancements

Fixes

Misc. enhancements

Thanks to all Contributors!

New contributors

Many thanks to everyone who contributed to this release!

@Michaelvll, @concretevitamin, @infwinston, @michaelzhiluo, @WoosukKwon, @romilbhardwaj, @sumanthgenz, @ewzeng, @iojw, @franklsf95

v0.1.1

1 year ago

Highlights

This is the first release of SkyPilot, a framework for easily running machine learning workloads on any cloud through a unified interface. No knowledge of cloud offerings is required or expected: you simply define the workload and its resource requirements, and SkyPilot will automatically execute it on AWS, Google Cloud Platform or Microsoft Azure.

Key features

  • Run existing projects on the cloud with zero code changes
  • Easily provision VMs across multiple cloud platforms (AWS, Azure or GCP)
  • Easily manage multiple clusters to handle different projects
  • Quick access to cloud instances for development
  • Store datasets on the cloud and access them like you would on a local file system
  • No cloud lock-in – seamlessly run your code across cloud providers

Thanks

Many thanks to all those who contributed to this release! @concretevitamin @romilbhardwaj @Michaelvll @infwinston @michaelzhiluo @WoosukKwon @suquark @mraheja @gmittal @iojw @lhqing @franklsf95

Full Changelog: https://github.com/skypilot-org/skypilot/commits/v0.1.1