Release v1.3.0 Notes

Highlights

We are now testing and publishing Ray's scalability limits with each release. See https://github.com/ray-project/ray/tree/releases/1.3.0/benchmarks
Ray Client is now usable by default with any Ray cluster started by the Ray Cluster Launcher.

Ray Cluster Launcher

💫 Enhancements:

Observability improvements (#14816, #14608)
Worker nodes no longer killed on autoscaler failure (#14424)
Better validation for min_workers and max_workers (#13779)
Auto detect memory resource for AWS and K8s (#14567)
On autoscaler failure, propagate error message to drivers (#14219)
Avoid launching GPU nodes when the workload only has CPU tasks (#13776)
Autoscaler/GCS compatibility (#13970, #14046, #14050)
Testing (#14488, #14713)
Migration of configs to multi-node-type format (#13814, #14239)
Better config validation (#14244, #13779)
Node-type max workers defaults to infinity (#14201)
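
The min_workers/max_workers validation and the infinite per-node-type default listed above can be pictured with a small sketch. This is not the autoscaler's actual code: the function name `validate_node_type`, its parameters, and the error messages are hypothetical, and the only behavior taken from these notes is that an unset node-type max_workers is treated as unbounded.

```python
import math
from typing import Optional


def validate_node_type(name: str,
                       min_workers: int = 0,
                       max_workers: Optional[int] = None) -> dict:
    """Return a normalized node-type entry, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    # An unset per-type max_workers is treated as unbounded (infinity).
    effective_max = math.inf if max_workers is None else max_workers

    if min_workers < 0:
        raise ValueError(f"{name}: min_workers must be >= 0, got {min_workers}")
    if effective_max < min_workers:
        raise ValueError(
            f"{name}: max_workers ({effective_max}) is below "
            f"min_workers ({min_workers})")

    return {"name": name,
            "min_workers": min_workers,
            "max_workers": effective_max}


def is_valid(**kwargs) -> bool:
    """Convenience wrapper: True if the entry passes validation."""
    try:
        validate_node_type(**kwargs)
        return True
    except ValueError:
        return False
```

Under these assumptions, a node type declared with only min_workers passes validation and is unbounded above, while an entry whose max_workers is below its min_workers is rejected before any node is launched.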

🔨 Fixes:

AWS configuration (#14868, #13558, #14083, #13808)
GCP configuration (#14364, #14417)
Azure configuration (#14787, #14750, #14721)
Kubernetes (#14712, #13920, #13720, #14773, #13756, #14567, #13705, #14024, #14499, #14593, #14655)
Other (#14112, #14579, #14002, #13836, #14261, #14286, #14424, #13727, #13966, #14293, #14718, #14380, #14234, #14484)

Ray Client

💫 Enhancements:

Version checks for Python and client protocol (#13722, #13846, #13886, #13926, #14295)
Validate server port number (#14815)
Enable Ray client server by default (#13350, #13429, #13442)
Disconnect Ray upon client deactivation (#13919)
Convert Ray objects to Ray client objects (#13639)
Testing (#14617, #14813, #13016, #13961, #14163, #14248, #14630, #14756, #14786)
Documentation (#14422, #14265)
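
As a rough illustration of the Python and client protocol version checks listed above, the sketch below compares what a client reports against what a server expects. The names `ConnectionInfo` and `check_compatible`, the fields compared, and the protocol version strings are all assumptions made for this example; the real checks live in the PRs cited above and may compare different fields.

```python
from typing import NamedTuple, Tuple


class ConnectionInfo(NamedTuple):
    """Hypothetical handshake payload exchanged at connect time."""
    python_version: Tuple[int, int]  # (major, minor)
    protocol_version: str            # hypothetical protocol tag


def check_compatible(client: ConnectionInfo, server: ConnectionInfo) -> None:
    """Raise RuntimeError if the client and server should not talk."""
    # Pickled objects are not guaranteed portable across Python versions,
    # so this sketch requires an exact (major, minor) match.
    if client.python_version != server.python_version:
        raise RuntimeError(
            f"Python version mismatch: client {client.python_version}, "
            f"server {server.python_version}")
    if client.protocol_version != server.protocol_version:
        raise RuntimeError(
            f"Client protocol mismatch: client {client.protocol_version}, "
            f"server {server.protocol_version}")


def compatible(client: ConnectionInfo, server: ConnectionInfo) -> bool:
    """Boolean wrapper around check_compatible."""
    try:
        check_compatible(client, server)
        return True
    except RuntimeError:
        return False
```

Failing at connect time with a clear message is the point of such a check: a mismatch would otherwise surface later as an obscure deserialization error.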

🔨 Fixes:

Hook runtime context (#13750)
Fix mutual recursion (#14122)
Set gRPC max message size (#14063)
Monitor stream errors (#13386)
Fix dependencies (#14654)
Fix ray.get ctrl-c (#14425)
Report error deserialization errors (#13749)
Named actor refcounting fix (#14753)
RayTaskError serialization (#14698)
Multithreading fixes (#14701)

Ray Core

🎉 New Features:

We are now testing and publishing Ray's scalability limits with each release. Check out https://github.com/ray-project/ray/tree/releases/1.3.0/benchmarks.
[alpha] Ray-native Python-based collective communication primitives for Ray clusters with distributed CPUs or GPUs.

🔨 Fixes:

Ray is now using C++14.
Fixed high CPU breaking raylets with heartbeat missing errors (#13963, #14301)
Fixed high CPU issues from raylet during object transfer (#13724)
Improvements to placement group APIs, including better Java support (#13821, #13858, #13582, #15049)

Ray Data Processing

🎉 New Features:

Object spilling is turned on by default. Check out the documentation.
Dask-on-Ray and Spark-on-Ray are fully ready to use. Please try them out and give us feedback!
Dask-on-Ray is now compatible with Dask 2021.4.0.
Dask-on-Ray now works natively with dask.persist().

🔨 Fixes:

Various improvements in object spilling and memory management layer to support large scale data processing (#13649, #14149, #13853, #13729, #14222, #13781, #13737, #14288, #14578, #15027)
The lru_evict flag is now deprecated. The recommended alternative is object spilling.

🏗 Architecture refactoring:

Various architectural improvements in object spilling and memory management. For more details, check out the whitepaper.
Locality-aware scheduling is turned on by default.
Moved from centralized GCS-based object directory protocol to decentralized owner-to-owner protocol, yielding better cluster scalability.
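
To give a feel for what locality-aware scheduling means, here is a minimal sketch that places a task on the node already holding the most bytes of its arguments. It is an illustrative assumption about the policy, not Ray's implementation: the function `pick_node`, its parameters, and the tie-breaking rule are hypothetical.

```python
from typing import Dict, List


def pick_node(arg_ids: List[str],
              object_sizes: Dict[str, int],
              locations: Dict[str, List[str]],
              default_node: str) -> str:
    """Pick the node that holds the most bytes of the task's arguments.

    Hypothetical helper: `locations` maps an object ID to the nodes that
    hold a copy, `object_sizes` maps an object ID to its size in bytes.
    """
    local_bytes: Dict[str, int] = {}
    for obj in arg_ids:
        for node in locations.get(obj, []):
            local_bytes[node] = local_bytes.get(node, 0) + object_sizes.get(obj, 0)

    # With no known argument locations, fall back to the caller's node.
    if not local_bytes:
        return default_node

    # Break ties by node name so the choice is deterministic.
    return max(local_bytes, key=lambda node: (local_bytes[node], node))
```

Running a task where its largest arguments already live avoids transferring them over the network, which is the motivation for enabling this by default.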

RLlib

🎉 New Features:

R2D2 implementation for torch and tf. (#13933)
PlacementGroup support (all RLlib algos now return PlacementGroupFactory from Trainer.default_resource_request). (#14289)
Multi-GPU support for tf-DQN/PG/A2C. (#13393)

💫 Enhancements:

Documentation: Update documentation for Curiosity's support of continuous actions (#13784); CQL documentation (#14531)
Attention-wrapper works with images and supports prev-n-actions/rewards options. (#14569)
rllib rollout runs in parallel by default via Trainer’s evaluation worker set. (#14208)
Add env rendering (customizable) and video recording options (for non-local mode; >0 workers; +evaluation-workers) and episode media logging. (#14767, #14796)
Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522)
Example Scripts: Add coin game env + matrix social dilemma env + tests and examples (shoutout to Maxime Riché!). (#14208); Attention net (#14864); Serve + RLlib. (#14416); Env seed (#14471); Trajectory view API (enhancements and tf2 support). (#13786); Tune trial + checkpoint selection. (#14209)
DDPG: Add support for simplex action space. (#14011)
Others: on_learn_on_batch callback allows custom metrics. (#13584); Add TorchPolicy.export_model(). (#13989)

🔨 Fixes:

Trajectory View API bugs (#13646, #14765, #14037, #14036, #14031, #13555)
Test cases (#14620, #14450, #14384, #13835, #14357, #14243)
Others (#13013, #14569, #13733, #13556, #13988, #14737, #14838, #15272, #13681, #13764, #13519, #14038, #14033, #14034, #14308, #14243)

🏗 Architecture refactoring:

Remove all non-trajectory view API code. (#14860)
Obsolete UsageTrackingDict in favor of SampleBatch. (#13065)

Tune

🎉 New Features:

We added a new searcher HEBOSearcher (#14504, #14246, #13863, #14427)
Tune is now natively compatible with the Ray Client (#13778, #14115, #14280)
Tune now uses Ray’s Placement Groups under the hood. This will enable much faster autoscaling and training (for distributed trials) (#13906, #15011, #14313)

💫Enhancements:

Checkpointing improvements (#13376, #13767)
Optuna Search Algorithm improvements (#14731, #14387)
tune.with_parameters now works with Class API (#14532)

🔨 Fixes:

BOHB & Hyperband fixes (#14487, #14171)
Nested metrics improvements (#14189, #14375, #14379)
Fix non-deterministic category sampling (#13710)
Type hints (#13684)
Documentation (#14468, #13880, #13740)
Various issues and bug fixes (#14176, #13939, #14392, #13812, #14781, #14150, #14850, #14118, #14388, #14152, #13825, #13936)

SGD

Add fault tolerance during worker startup (#14724)

Serve

🎉 New Features:

Added metadata to default logger in backend replicas (#14251)
Added more metrics for ServeHandle stats (#13640)
Deprecated system-level batching in favor of @serve.batch (#14610, #14648)
Beta support for Serve with Ray client (#14163)
Use placement groups to bypass autoscaler throttling (#13844)
Deprecate client-based API in favor of process-wide singleton (#14696)
Add initial support for FastAPI ingress (#14754)

🔨 Fixes:

Fix ServeHandle serialization (#13695)

🏗 Architecture refactoring:

Refactor BackendState to support backend versioning and add more unit testing (#13870, #14658, #14740, #14748)
Optimize long polling to be per-key (#14335)
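
The idea behind per-key long polling can be sketched as follows: each key carries its own snapshot counter, and a client that reports the snapshot it last saw for each key receives only the keys that have changed since. This is a simplified, non-blocking illustration (a real long poll would wait until something changes); the class `KeyedPollHost` and its methods are hypothetical and are not Serve's actual classes.

```python
from typing import Any, Dict, Tuple


class KeyedPollHost:
    """Hypothetical host that tracks one snapshot counter per key."""

    def __init__(self) -> None:
        self._snapshot_ids: Dict[str, int] = {}
        self._values: Dict[str, Any] = {}

    def notify_changed(self, key: str, value: Any) -> None:
        """Store a new value and bump only that key's snapshot counter."""
        self._snapshot_ids[key] = self._snapshot_ids.get(key, 0) + 1
        self._values[key] = value

    def poll(self, seen: Dict[str, int]) -> Dict[str, Tuple[int, Any]]:
        """Return (snapshot_id, value) for each requested key that is newer
        than the snapshot the client reports having seen."""
        return {
            key: (self._snapshot_ids[key], self._values[key])
            for key, last_seen in seen.items()
            if key in self._snapshot_ids and self._snapshot_ids[key] > last_seen
        }
```

Because counters are kept per key, an update to one key does not wake or resend data to clients that only listen on other keys.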

Dashboard

🎉 New Features:

Dashboard now supports being served behind a reverse proxy. (#14012)
Disk and network metrics are added to prometheus. (#14144)

💫Enhancements:

Better CPU & memory information on K8s. (#14593, #14499)
Progress towards a new scalable dashboard. (#13790, #11667, #13763, #14333)

Thanks

Many thanks to all those who contributed to this release:
@geraint0923, @iycheng, @yurirocha15, @brian-yu, @harryge00, @ijrsvt, @wumuzi520, @suquark, @simon-mo, @clarkzinzow, @RaphaelCS, @FarzanT, @ob, @ashione, @ffbin, @robertnishihara, @SongGuyang, @zhe-thoughts, @rkooo567, @ezra-h, @acxz, @clay4444, @QuantumMecha, @jirkafajfr, @wuisawesome, @Qstar, @guykhazma, @devin-petersohn, @jeroenboeye, @ConeyLiu, @dependabot[bot], @fyrestone, @micahtyong, @javi-redondo, @Manuscrit, @mxz96102, @EscapeReality846089495, @WangTaoTheTonic, @stanislav-chekmenev, @architkulkarni, @Yard1, @tchordia, @zhisbug, @Bam4d, @niole, @yiranwang52, @thomasjpfan, @DmitriGekhtman, @gabrieleoliaro, @jparkerholder, @kfstorm, @andrew-rosenfeld-ts, @erikerlandson, @Crissman, @raulchen, @sumanthratna, @Catch-Bull, @chaokunyang, @krfricke, @raoul-khour-ts, @sven1977, @kathryn-zhou, @AmeerHajAli, @jovany-wang, @amogkam, @antoine-galataud, @tgaddair, @randxie, @ChaceAshcraft, @ericl, @cassidylaidlaw, @TanjaBayer, @lixin-wei, @lena-kashtelyan, @cathrinS, @qicosmos, @richardliaw, @rmsander, @jCrompton, @mjschock, @pdames, @barakmich, @michaelzhiluo, @stephanie-wang, @edoakes
