Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
The Ray 2.6.3 patch release contains fixes for Ray Serve and for Ray Core streaming generators.
🔨 Fixes:
- `serve run` help message (#37859) (#38018)
- `ray_serve_deployment_queued_queries` when client disconnects (#37965) (#38020)

📖 Documentation:
The Ray 2.6.2 patch release contains a critical fix for Ray's logging setup, as well as fixes for Ray Serve, Ray Data, and Ray Jobs.
🔨 Fixes:
- `request_timeout_s` from Serve config to the cluster (#37884) (#37903)

🔨 Fixes:
The Ray 2.6.1 patch release contains a critical fix for the cluster launcher and a compatibility update for the Ray Serve protobuf definition with Python 3.11, as well as documentation improvements.
⚠️ The cluster launcher in Ray 2.6.0 fails to start multi-node clusters. Please update to 2.6.1 if you plan to use the cluster launcher.
🔨 Fixes:

📖 Documentation:
- `@serve.batch`-decorated methods can stream responses.

🎉 New Features:
💫 Enhancements:
🔨 Fixes:
- `pg.ready()` task for pending trials that end up reusing an actor (#35748)
- `Dict[str, np.array]` batches in `DummyTrainer` read bytes calculation (#36484)

📖 Documentation:
- `dreambooth` example (#37102)

🏗 Architecture refactoring:
🎉 New Features:
- `Dataset.unique()` (#36655, #36802)
- `DataIterator.iter_batches()` (#36842) (#37260)
- `DataIterator.iter_batches()` (#36686)
- `ray.data.range_arrow()` (#35756)

💫 Enhancements:
- `Dataset.write_datasource()` (#36134)
- `Dataset.schema()` with new execution plan optimizer (#36740)
- `Dataset.streaming_split()` (#36908)

🔨 Fixes:
- `Dataset.streaming_split()` operator
- `Dataset.streaming_split()` (#36039)
- `Dataset.materialize()` and `Dataset.streaming_split()` (#36092)
- `Dataset.streaming_split()` (#36919)
- `BlockMetadata` (#37119)

📖 Documentation:
🏗 Architecture refactoring:
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
- `code-block` to `testcode` (#36483)

🏗 Architecture refactoring:
- `BatchPredictor` (#36947, #37178)

🔨 Fixes:
- `PENDING` trials (#35338)

📖 Documentation:

🏗 Architecture refactoring:
- `tune/automl` (#35557)
- `ray.tune.integration` (#35160)

💫 Enhancements:
- `@serve.batch`-decorated methods can stream responses.
- `@serve.batch` settings can be reconfigured dynamically.
- `max_concurrent_queries` and tail latencies under load.

🔨 Fixes:
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
🏗 Architecture refactoring:
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
Many thanks to all those who contributed to this release!
@ericl, @ArturNiederfahrenhorst, @sihanwang41, @scv119, @aslonnie, @bluecoconut, @alanwguo, @krfricke, @frazierprime, @vitsai, @amogkam, @GeneDer, @jovany-wang, @gjoliver, @simran-2797, @rkooo567, @shrekris-anyscale, @kevin85421, @angelinalg, @maxpumperla, @kouroshHakha, @Yard1, @chaowanggg, @justinvyu, @fantow, @Catch-Bull, @cadedaniel, @ckw017, @hora-anyscale, @rickyyx, @scottsun94, @XiaodongLv, @SongGuyang, @RocketRider, @stephanie-wang, @inpefess, @peytondmurray, @sven1977, @matthewdeng, @ijrsvt, @MattiasDC, @richardliaw, @bveeramani, @rynewang, @woshiyyya, @can-anyscale, @omus, @eax-anyscale, @raulchen, @larrylian, @Deegue, @Rohan138, @jjyao, @iycheng, @akshay-anyscale, @edoakes, @zcin, @dmatrix, @bryant1410, @WanNJ, @architkulkarni, @scottjlee, @JungeAlexander, @avnishn, @harisankar95, @pcmoritz, @wuisawesome, @mattip
The Ray 2.5.1 patch release adds wheels for macOS for Python 3.11. It also contains fixes for multiple components, along with fixes for our documentation.
🔨 Fixes:
🎉 New Features:
🔨 Fixes:
The Ray 2.5 release focuses on a number of enhancements and improvements across the Ray ecosystem, including:
💫 Enhancements:
- `air_verbosity` against None (#33871)
- `RunConfig.storage_path` to replace `SyncConfig.upload_dir` and `RunConfig.local_dir` (#33463)

🔨 Fixes:
- `test_tune_torch_get_device_gpu` race condition (#35004)

📖 Documentation:
- `convert_torch_code_to_ray_air` (#35224)

🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
- `torch.save()` (#35615) (#35790)

📖 Documentation:

🏗 Architecture refactoring:
- `ray.train` HuggingFace modules (#35270) (#35488)

🎉 New Features:

💫 Enhancements:
- `tune.ExperimentAnalysis` to pull experiment checkpoint files from the cloud if needed (#34461)

🔨 Fixes:
- `test_tune_torch_get_device_gpu` race condition (#35004)
- `tune/execution/checkpoint_manager` state serialization (#34368)
- `--smoke-test` (#34167)

📖 Documentation:
🏗 Architecture refactoring:
- `tabulate` package (#34789)

🎉 New Features:

💫 Enhancements:
- `LongPoll` updates (#34675)
- `ClassNode` and `FunctionNode` with `Application` in top-level Serve APIs (#34627)

🔨 Fixes:
- `app_msg` to empty string by default (#35646)

📖 Documentation:
- `RayServeHandle` and `RayServeSyncHandle` docstrings & typing (#34714)

🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
💫 Enhancements:
- `example-full.yaml` (#34487)

📖 Documentation:
- `pip` and `conda` requirements files (#34071)

💫 Enhancements:
- `--err` flag to query stderr logs from workers/actors instead of `--suffix=err` (#34300)
- `is_head_node` to state API and GcsNodeInfo (#34299)

Many thanks to all those who contributed to this release!
@vitsai, @XiaodongLv, @justinvyu, @Dan-Yeh, @dependabot[bot], @alanwguo, @grimreaper, @yiwei00000, @pomcho555, @ArturNiederfahrenhorst, @maxpumperla, @jjyao, @ijrsvt, @sven1977, @Yard1, @pcmoritz, @c21, @architkulkarni, @jbedorf, @amogkam, @ericl, @jiafuzha, @clarng, @shrekris-anyscale, @matthewdeng, @gjoliver, @jcoffi, @edoakes, @ethanabrooks, @iycheng, @Rohan138, @angelinalg, @Linniem, @aslonnie, @zcin, @wuisawesome, @Catch-Bull, @woshiyyya, @avnishn, @jjyyxx, @jianoaix, @bveeramani, @sihanwang41, @scottjlee, @YQ-Wang, @mattip, @can-anyscale, @xwjiang2010, @fedassembly, @joncarter1, @robin-anyscale, @rkooo567, @DACUS1995, @simran-2797, @ProjectsByJackHe, @zen-xu, @ashahab, @larrylian, @kouroshHakha, @raulchen, @sofianhnaide, @scv119, @nathan-az, @kevin85421, @rickyyx, @Sahar-E, @krfricke, @chaowanggg, @peytondmurray, @cadedaniel
Over the last few months, we have seen a flurry of innovative activity around generative AI models and large language models (LLM). To continue our effort to ensure Ray provides a pivotal compute substrate for generative AI workloads and addresses the challenges (as explained in our blog series), we have invested engineering efforts in this release to ensure that these open source LLM models and workloads are accessible to the open source community and performant with Ray.
This release includes new examples for training, batch inference, and serving with your own LLM.
- `ray.data.DatasetContext.get_current().execution_options.preserve_order = True`

💫 Enhancements:
- `TorchDetectionPredictor` (#32199)
- `artifact_location`, `run_name` to MLFlow integration (#33641)
- `*path` properties to `Result` and `ResultGrid` (#33410)
- `Preprocessor.transform` lazy by default (#32872)
- `BatchPredictor` lazy (#32510, #32796)
- `TempFileLock` util (#32862)
- `collate_fn` to `iter_torch_batches` (#32412)
- `Callable[[torch.Tensor], torch.Tensor]` to `TorchVisionTransform` (#32383)
- `DatasetIterator` torch tensors to correct device (#31753)

🔨 Fixes:
- `use_gpu` with `HuggingFacePredictor` (#32333)
- `Callback` raise `DeprecationWarning` (#33775)
- `Checkpoint.from_checkpoint` as developer API (#33094)
- `DatasetIterator` backwards compatibility (#32526)
- `CountVectorizer` failing with big data (#32351)
- `from_uri` (#32386)
- `dtype` type hint in `DLPredictor` methods (#32198)
- `set_preprocessor` (#33088)

📖 Documentation:
- `BatchPredictor.from_checkpoint` to docs (#32877)

🏗 Architecture refactoring:
- `TensorflowCheckpoint.get_model` `model_definition` parameter (#33776)

🎉 New Features:
- `collate_fn` to `Dataset.iter_torch_batches()` (#32412)
- `ignore_missing_paths` in reading Datasource (#33126)

💫 Enhancements:
- `tf_schema` parameter in `read_tfrecords()` and `write_tfrecords()` (#32857)
- `TorchVisionTransform` (#32383)
- `ArrowTensorArray` and `ArrowVariableShapedTensorArray` (#32143)

🔨 Fixes:
📖 Documentation:
🎉 New Features:
- `AccelerateTrainer` (#33269)
- `Trainer.restore` API for train experiment-level fault tolerance (#31920)

💫 Enhancements:
- `Trainer.restore` on errors raised by `trainer.fit()` (#33610)
- `CUDA_VISIBLE_DEVICES` (#33159)
- `train.torch.get_device()` (#32893)
- `torch.distributed` env vars (#32450)

🔨 Fixes:
- `DatasetIterator`, handle `device_map` (#32955)
- `test_torch_trainer` (#32963)
- `test_gpu` by sorting the devices (#33002)

📖 Documentation:
🏗 Architecture refactoring:
🎉 New Features:
- `OrderedDict` import (#33709)
- `experimental/output.py` (#33767)

💫 Enhancements:
- `Tuner.restore` (#32317)
- `Tuner.can_restore(path)` utility for checking if an experiment exists at a path/URI (#32003)
- `Tuner.restore` usage to prepare for `trainable` becoming a required arg (#32912)
- `on_experiment_end` hook for the final wait of `SyncCallback` sync processes (#33390)
- `remote_checkpoint_dir` upon actor reuse (#32420)
- `use_threads=False` in pyarrow syncing (#32256)
- `WandbLoggerCallback` actors to finish uploading to wandb on experiment end (#33174)

🔨 Fixes:
- `ray.data.Dataset` without lineage captured in trial config (#33565)
- `trial.__getstate__` (#32624)

📖 Documentation:
- `tune.run` API in logging messages when using the `Tuner` (#33642)
- `log_to_file` doc (#32128)

🏗 Architecture refactoring:
🎉 New Features:
💫 Enhancements:
- `log_to_stderr` option to logger and improve internal logging (#33597)

🔨 Fixes:

📖 Documentation:
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
💫 Enhancements:
🎉 New Features:
🔨 Fixes:
Many thanks to all those who contributed to this release!
@zjf2012, @christy, @fyrestone, @avnishn, @scottjlee, @sijieamoy, @jjyao, @sven1977, @jamesclark-Zapata, @cadedaniel, @jovany-wang, @pcmoritz, @MaskRay, @csivanich, @augray, @wuisawesome, @Wendi-anyscale, @maxpumperla, @shawnpanda, @DmitriGekhtman, @yuduber, @gjoliver, @ju2ez, @clarkzinzow, @brycehuang30, @iycheng, @justinvyu, @dmatrix, @edoakes, @tmbdev, @scottsun94, @jianoaix, @cool-RR, @prrajput1199, @amogkam, @ckw017, @alanwguo, @architkulkarni, @chaowanggg, @AmeerHajAli, @stephanie-wang, @bewestphal, @matthew29tang, @dbczumar, @sihanwang41, @ericl, @soumitrak, @matthewdeng, @Catch-Bull, @peytondmurray, @XiaodongLv, @bveeramani, @YQ-Wang, @Linniem, @ProjectsByJackHe, @woshiyyya, @c21, @shrekris-anyscale, @zcin, @Yard1, @can-anyscale, @kouroshHakha, @robertnishihara, @richardliaw, @krfricke, @shomilj, @ArturNiederfahrenhorst, @ijrsvt, @GokuMohandas, @jbedorf, @xwjiang2010, @anydayeol, @clarng, @davidxia, @rickyyx, @Siraj-Qazi, @kira-lin, @scv119, @chengscott, @angelinalg, @rkooo567, @rshin, @deanwampler, @gramhagen, @larrylian, @WeichenXu123, @simonsays1980
The Ray 2.3.1 patch release contains fixes for multiple components:
- `zip()` (https://github.com/ray-project/ray/pull/32795)
- `serve run` to use Ray Client instead of Ray Jobs (https://github.com/ray-project/ray/pull/32976)
- `max_concurrent_queries` being ignored when autoscaling (https://github.com/ray-project/ray/pull/32772 and https://github.com/ray-project/ray/pull/33022)
- `--block` (https://github.com/ray-project/ray/pull/32961)

💫 Enhancements:
- `set_preprocessor` method to `Checkpoint` (#31721)
- `save_checkpoints` to `upload_checkpoints` (#31582)
- `WandbLoggerCallback` example (#31625)
- `DLPredictor.call_model` `tensor` parameter to `inputs` (#30574)
- `use_gpu` to `HuggingFacePredictor` (#30945)
- `Checkpoint` improvements (#30948)
- `TensorflowCheckpoint.get_model` (#31203)

🔨 Fixes:
📖 Documentation:
🏗 Architecture refactoring:
🎉 New Features:
💫 Enhancements:
- `ds.map_batches()` (#30000)

🔨 Fixes:

📖 Documentation:
🎉 New Features:
💫 Enhancements:
- `NCCL_SOCKET_IFNAME` to blacklist `veth` (#31824)
- `RunConfig` is used when there are multiple places to specify it (#31959)
- `ScalingConfig` to be optional for `DataParallelTrainer`s if already in Tuner `param_space` (#30920)

🔨 Fixes:
- `Preprocessor` configs when using stream API (#31725)
- `fail_fast="raise"` (#30817)
- `SklearnTrainer` (#30593)

📖 Documentation:
🏗 Architecture refactoring:
💫 Enhancements:
- `validate_upload_dir` to Syncer (#30869)

🔨 Fixes:
- `AxSearch` save and nan/inf result handling (#31147)
- `AxSearch` search space conversion for fixed list hyperparameters (#31088)
- `Tuner.restore` (#30893)
- `sort_by_metric` with nested metrics (#30906)
- `fail_fast="raise"` (#30817)

📖 Documentation:

🏗 Architecture refactoring:
- `overwrite_trainable` argument in Tuner restore to `trainable` (#32059)

🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
🎉 New Features:
💫 Enhancements:
- `experimental_relax_shapes` (but `reduce_retracing` instead) (#29214)
- `__str__()` method to PolicyMap (#31098)
- `contrib` folder (#30992)
- `AlgorithmConfig.overrides()` to replace `multiagent->policies->config` and `evaluation_config` dicts (#30879)
- `deprecation_warning(.., error=True)` should raise `ValueError`, not `DeprecationWarning` (#30255)
- `gym.spaces.Text` serialization (#30794)
- `MultiAgentBatch` to `SampleBatch` in offline_rl.py (#30668)
- `Algorithm.train()` return Tune-style config dict (instead of AlgorithmConfig object) (#30591)

🔨 Fixes:
- `try_import_..()` (#31332)
- `tensorflow_probability` imports (#31331)
- `PolicyMap.__del__()` to also remove a deleted policy ID from the internal deque (#31388)
- `get_model_v2()` instead of `get_model()` with MADDPG (#30905)

📖 Documentation:
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
🎉 New Features:
💫 Enhancements:
- `ray status` and autoscaler (#32337)

🔨 Fixes:

📖 Documentation:
🎉 New Features:
📖 Documentation:
Many thanks to all those who contributed to this release!
@minerharry, @scottsun94, @iycheng, @DmitriGekhtman, @jbedorf, @krfricke, @simonsays1980, @eltociear, @xwjiang2010, @ArturNiederfahrenhorst, @richardliaw, @avnishn, @WeichenXu123, @Capiru, @davidxia, @andreapiso, @amogkam, @sven1977, @scottjlee, @kylehh, @yhna940, @rickyyx, @sihanwang41, @n30111, @Yard1, @sriram-anyscale, @Emiyalzn, @simran-2797, @cadedaniel, @harelwa, @ijrsvt, @clarng, @pabloem, @bveeramani, @lukehsiao, @angelinalg, @dmatrix, @sijieamoy, @simon-mo, @jbesomi, @YQ-Wang, @larrylian, @c21, @AndreKuu, @maxpumperla, @architkulkarni, @wuisawesome, @justinvyu, @zhe-thoughts, @matthewdeng, @peytondmurray, @kevin85421, @tianyicui-tsy, @cassidylaidlaw, @gvspraveen, @scv119, @kyuyeonpooh, @Siraj-Qazi, @jovany-wang, @ericl, @shrekris-anyscale, @Catch-Bull, @jianoaix, @christy, @MisterLin1995, @kouroshHakha, @pcmoritz, @csko, @gjoliver, @clarkzinzow, @SongGuyang, @ckw017, @ddelange, @alanwguo, @Dhul-Husni, @Rohan138, @rkooo567, @fzyzcjy, @chaokunyang, @0x2b3bfa0, @zoltan-fedor, @Chong-Li, @crypdick, @jjyao, @emmyscode, @stephanie-wang, @starpit, @smorad, @nikitavemuri, @zcin, @tbukic, @ayushthe1, @mattip
Ray 2.2 is a stability-focused release, featuring stability improvements across many Ray components.
🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
🏗 Architecture refactoring:
🎉 New Features:
- `select_columns()` to select a subset of columns (#29081)
- `write_tfrecords()` to write TFRecord files (#29448)
- `from_torch()` to create dataset from Torch dataset (#29588)
- `from_tf()` to create dataset from TensorFlow dataset (#29591)
- `batch_size` in `BatchMapper` (#29193)

💫 Enhancements:
- `include_paths` in `read_images()` to return image file path (#30007)
- `to_pandas()` and `to_dask()` (#29417)
- `read_tfrecords()` output from Pandas to Arrow format (#30390)
- `str` exclude in `Concatenator` (#29443)

🔨 Fixes:
- `random_shuffle()` (#29276)
- `random_shuffle_each_window()` (#29482)
- `iter_batches()` to not return empty batch (#29638)
- `map_batches()` to fetch input blocks on-demand (#29289)
- `take_all()` to not accept limit argument (#29746)
- `map_groups()` (#30172)
- `stats()` call causing Dataset schema to be unset (#29635)
- `batch_format` is not specified for `BatchMapper` (#30366)

📖 Documentation:
- `map_batches()` documentation about execution model and UDF pickle-ability requirement (#29233)
- `to_tf()` docstring (#29464)

🎉 New Features:
💫 Enhancements:
🔨 Fixes:
📖 Documentation:
🏗 Architecture refactoring:
🎉 New Features:
- `Tuner.restore` work with relative experiment paths (#30363)
- `Tuner.restore` from a local directory that has moved (#29920)

💫 Enhancements:
- `with_resources` takes in a `ScalingConfig` (#30259)
- `with_resources` in `with_parameters` (#29740)
- `trial_name_creator` and `trial_dirname_creator` to `TuneConfig` (#30123)
- `BaseTrainer` to `Trainable` once in the Tuner (#30355)
- `remote_checkpoint_dir` work with query strings (#30125)

🔨 Fixes:
- `ResourceChangingScheduler` dropping PGF args (#30304)
- `Tuner` (#29956)
- `TUNE_ORIG_WORKING_DIR` env variable (#30134)

📖 Documentation:
- (`ResultGrid` and `Result`) (#29072)

🏗 Architecture refactoring:
- `setup_wandb()` function (#29828)

🎉 New Features:
💫 Enhancements:
🔨 Fixes:
🎉 New Features:
💫 Enhancements:
- (`from_checkpoint()`) for directly instantiating instances from a checkpoint directory without knowing the original configuration used or any other information (having the checkpoint is sufficient). For a detailed overview, see here. (#28812, #29772, #29370, #29520, #29328)

🏗 Architecture refactoring:
🔨 Fixes:
📖 Documentation:
🎉 New Features:
💫 Enhancements:
- `entrypoint_num_cpus`, `entrypoint_num_gpus`, or `entrypoint_resources` (#28564, #28203)

🔨 Fixes:
- `num_cpus` required by tasks/actors by default (#30496)

📖 Documentation:
💫 Enhancements:
🎉 New Features:
- `ray list cluster-events`

🔨 Fixes:

💫 Enhancements:
Many thanks to all those who contributed to this release!
@shrekris-anyscale, @rickyyx, @scottjlee, @shogohida, @liuyang-my, @matthewdeng, @wjrforcyber, @linusbiostat, @clarkzinzow, @justinvyu, @zygi, @christy, @amogkam, @cool-RR, @jiaodong, @EvgeniiTitov, @jjyao, @ilee300a, @jianoaix, @rkooo567, @mattip, @maxpumperla, @ericl, @cadedaniel, @bveeramani, @rueian, @stephanie-wang, @lcipolina, @bparaj, @JoonHong-Kim, @avnishn, @tomsunelite, @larrylian, @alanwguo, @VishDev12, @c21, @dmatrix, @xwjiang2010, @thomasdesr, @tiangolo, @sokratisvas, @heyitsmui, @scv119, @pcmoritz, @bhavika, @yzs981130, @andraxin, @Chong-Li, @clarng, @acxz, @ckw017, @krfricke, @kouroshHakha, @sijieamoy, @iycheng, @gjoliver, @peytondmurray, @xcharleslin, @DmitriGekhtman, @andreichalapco, @vitrioil, @architkulkarni, @simon-mo, @ArturNiederfahrenhorst, @sihanwang41, @pabloem, @sven1977, @avivhaber, @wuisawesome, @jovany-wang, @Yard1