[DO NOT SUBMIT] diff 2.23 to 2.24 #48263
Draft: rynewang wants to merge 120 commits into releases/2.23.0 from releases/2.24.0
we use clang now; no gcc Signed-off-by: Lonnie Liu <[email protected]>
not being used anymore. images are built with wanda and `ray_ci` scripts Signed-off-by: Lonnie Liu <[email protected]>
old `bazel_tools` constraints are deprecated. Signed-off-by: Lonnie Liu <[email protected]>
to latest version of 1.6.1 required to upgrade bazel. old skylib uses platform constraints that are deprecated in newer versions of bazel. Signed-off-by: Lonnie Liu <[email protected]>
…ucing object size (#45309) Signed-off-by: Ruiyang Wang <[email protected]>
Package uploading is CPU-intensive work in the Dashboard, where it collects the whole 500 MiB working_dir and uploads it to the GCS. This can take 30s, during which the Dashboard event loop is blocked. This PR moves the uploading to another thread so that the event loop is no longer blocked. This PR also removes a dead reference to gcs_client in http_server_head.py. Signed-off-by: Ruiyang Wang <[email protected]>
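The idea of moving blocking work off the event loop can be sketched roughly as follows. This is a minimal illustration, not the Dashboard's actual code: `package_and_upload`, `heartbeat`, and the payload are hypothetical stand-ins, and `asyncio.to_thread` (Python 3.9+) is just one way to hand work to a thread.

```python
import asyncio
import hashlib
import time


def package_and_upload(payload: bytes) -> str:
    """Hypothetical stand-in for the blocking packaging/upload step."""
    time.sleep(0.2)  # simulate slow, blocking work
    return hashlib.sha256(payload).hexdigest()


async def heartbeat(ticks: list) -> None:
    """Records a timestamp periodically; it only runs if the loop is free."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main() -> tuple:
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    # Run the blocking function in a worker thread; the event loop keeps
    # serving other coroutines (here, the heartbeat) in the meantime.
    digest = await asyncio.to_thread(package_and_upload, b"working_dir contents")
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    return digest, len(ticks)


if __name__ == "__main__":
    digest, tick_count = asyncio.run(main())
    print(digest[:12], "heartbeats during upload:", tick_count)
```

If `package_and_upload` were called directly inside `main`, the heartbeat would record no ticks until the call returned, which is the blocking behavior the commit describes.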
…passed via NCCL in accelerated DAG (#45332)

This adds support for dynamically sized torch.Tensors to be passed between accelerated DAG nodes via NCCL. Specifically, the following code is now supported, whereas previously `shape` and `dtype` had to be explicitly passed to `TorchTensorType`.

```python
with InputNode() as inp:
    dag = sender.send.bind(inp)
    dag = dag.with_type_hint(TorchTensorType(transport="nccl"))
    dag = receiver.recv.bind(dag)

compiled_dag = dag.experimental_compile()
```

The feature works by creating a shared memory channel to pass the metadata for the shape and dtype of the tensor. The metadata is then used to create a buffer of the correct size on the NCCL receiver.

Initial microbenchmarks show this adds about 50% throughput overhead compared to statically declaring the shape and dtype, or about 160us/DAG call. This seems a bit higher than expected (see also #45319).

This also adds a few other fixes:
- Adds support for reusing actors to create new NCCL groups, which is needed if a DAG is torn down and a new one is created.
- Adds a lock to DAG teardown, to prevent the same NCCL group from getting destructed twice.
- User-defined TorchTensorType shape or dtype is now used as a hint for the buffer size, instead of a required size. Since buffers are currently static, an error will be thrown if the user tries to return a too-large tensor.

Part 1 of #45306, will follow up with a separate PR for nested tensors.

--------- Signed-off-by: Stephanie Wang <[email protected]> Co-authored-by: SangBin Cho <[email protected]> Co-authored-by: Kai-Hsun Chen <[email protected]>
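The metadata-first handshake described above can be illustrated without torch or NCCL. The following is a minimal sketch under stated assumptions: `queue.Queue` objects stand in for the shared memory channel and the NCCL transport, `array` stands in for a tensor, and `send`, `recv`, and `TYPECODES` are hypothetical names, not Ray APIs.

```python
import queue
from array import array

# Hypothetical mapping from dtype names to array typecodes.
TYPECODES = {"float32": "f", "int64": "q"}


def send(meta_channel, data_channel, values, shape, dtype):
    """Send shape/dtype metadata first, then the raw payload."""
    meta_channel.put((shape, dtype))
    data_channel.put(array(TYPECODES[dtype], values).tobytes())


def recv(meta_channel, data_channel):
    """Read the metadata, allocate a correctly sized buffer, then fill it."""
    shape, dtype = meta_channel.get()
    numel = 1
    for dim in shape:
        numel *= dim
    # Allocate the receive buffer from the metadata alone.
    buf = array(TYPECODES[dtype], [0]) * numel
    payload = data_channel.get()
    if len(payload) != numel * buf.itemsize:
        raise ValueError("payload size does not match the announced metadata")
    # Copy the payload bytes into the preallocated buffer.
    memoryview(buf).cast("B")[:] = payload
    return shape, buf


if __name__ == "__main__":
    meta, data = queue.Queue(), queue.Queue()
    send(meta, data, [1, 2, 3, 4, 5, 6], (2, 3), "int64")
    print(recv(meta, data))
```

The extra metadata round trip is what a statically declared shape and dtype would avoid, which is consistent with the overhead the commit message reports.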
so that they do not have to execute in sequential order Signed-off-by: Lonnie Liu <[email protected]>
This [commit](0de88e4) accidentally added these files to the `benchmarks/benchmarks/` directory instead of `benchmarks/`. This PR moves them back into the `benchmarks/` directory. Signed-off-by: khluu <[email protected]>
… store as artifact (#45363) - This is to use for automation from `product` repo - Builds `update_version` binary into a python zip file and upload it as an artifact in `release-automation` pipeline - Have `root_dir` as an arg for `update_version` since automation is using this on a cloned Ray repo --------- Signed-off-by: khluu <[email protected]>
#45392) Avoid pickling LanceFragment when creating read tasks for Lance, as this is expensive. Signed-off-by: Cheng Su <[email protected]>
…45210) Makes the "Experiment state snapshotting has been triggered multiple..." warning message less confusing, and removes the false positive log at the end of every run. Also deprecates the `TUNE_RESULT_DIR`, `RAY_AIR_LOCAL_CACHE_DIR`, and `local_dir` legacy settings. --------- Signed-off-by: Justin Yu <[email protected]> Co-authored-by: Cuong Nguyen <[email protected]>
not built or used anywhere anymore Signed-off-by: Lonnie Liu <[email protected]>
approved by @jjyao --------- Signed-off-by: khluu <[email protected]> Signed-off-by: kevin <[email protected]>
This PR removes several methods from BlockList and LazyBlockList that aren't used anywhere. Signed-off-by: Balaji Veeramani <[email protected]> Co-authored-by: Cuong Nguyen <[email protected]>
Some minor code cleanup separated from #45450, so that PR can focus on new changes only.
…45194) Currently calling get_runtime_context().get_actor_name() from driver will crash. Instead of crashing, this PR returns None in this case. Signed-off-by: 982945902 <[email protected]> Co-authored-by: Huaiwei Sun <[email protected]>
Fix compute config for microbenchmark_gpu_unstable. Closes #45322. --------- Signed-off-by: Stephanie Wang <[email protected]>
to version 1.14.0 Signed-off-by: Lonnie Liu <[email protected]>
not supported on newer version of bazel Signed-off-by: Lonnie Liu <[email protected]>
the flag already flipped its default to true in bazel 5.6.x , and it is removed in bazel 6.x Signed-off-by: Lonnie Liu <[email protected]>
fixes https://errorprone.info/bugpattern/DoubleBraceInitialization Signed-off-by: Lonnie Liu <[email protected]>
More recent versions of `jax` (e.g. `0.4.28`) will cause this to fail. Signed-off-by: Matthew Deng <[email protected]>
to 0.29.37; required for bazel upgrade. Signed-off-by: Lonnie Liu <[email protected]>
…eads and skip mixin buffer if not needed. (#45467)
so that we know which archive import it is talking about Signed-off-by: Lonnie Liu <[email protected]>
The _split_at_index function isn't used anywhere. This PR removes it. Signed-off-by: Balaji Veeramani <[email protected]>
cleaner to write, and easier to parse Signed-off-by: Lonnie Liu <[email protected]>
This package is not available for mac, so skip it on the mac platform. Test: - CI Signed-off-by: can <[email protected]>
and moving it out, as it is a very fundamental bazel package, not specific to ray. Signed-off-by: Lonnie Liu <[email protected]>
## Why are these changes needed?

Update the experimental feature guide on multi-container deployment approach for Ray Serve.

## Related issue number

Closes: #45026

## Checks

- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - [ ] Unit tests
  - [ ] Release tests
  - [ ] This PR is not tested :(

--------- Signed-off-by: dudeperf3ct <[email protected]>
Signed-off-by: lishuo121 <[email protected]>
…chanism (#45156) Signed-off-by: Cindy Zhang <[email protected]>
…n MultiAgentEnvRunner when sampling whole episodes. (#45617)
for bumping package versions up in the container and dodging CVEs. also upgrade `idna` and add missing `cupy-cuda11x` package in constraints. Signed-off-by: Lonnie Liu <[email protected]>
some packages are declared more than once. Signed-off-by: Lonnie Liu <[email protected]>
This PR adds multi-arg and kwarg support by serializing all positional args and kwargs and passing them through the channel. When the channel is read at runtime, the individual args are extracted first before being passed to the consuming tasks. Closes #42793 --------- Signed-off-by: Rui Qiao <[email protected]>
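One way to picture the approach is the following minimal sketch. It is not the accelerated DAG implementation: a `queue.Queue` stands in for the channel, `pickle` for the serializer, and `write_args`, `read_args`, and `select_input` are hypothetical helper names.

```python
import pickle
import queue


def write_args(channel, *args, **kwargs):
    """Serialize all positional args and kwargs as one channel value."""
    channel.put(pickle.dumps((args, kwargs)))


def read_args(channel):
    """Read one channel value and recover the original args and kwargs."""
    args, kwargs = pickle.loads(channel.get())
    return args, kwargs


def select_input(args, kwargs, key):
    """Extract a single input: an int selects a positional arg, a str a kwarg."""
    return args[key] if isinstance(key, int) else kwargs[key]


if __name__ == "__main__":
    channel = queue.Queue()
    write_args(channel, 1, "a", scale=2.5)
    args, kwargs = read_args(channel)
    # Each consuming task receives only the input it asked for.
    print(select_input(args, kwargs, 0), select_input(args, kwargs, "scale"))
```

Bundling everything into a single channel value keeps one write per DAG invocation, with the cost that every reader deserializes the full bundle before extracting its own input.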
…ult value is set (#45301) Currently it's unclear how the default value is set Signed-off-by: Jiajun Yao <[email protected]>
This code path deletes the release test working directory upon job completion. We found repeated cases where users want the data to be available for debugging purposes. Let's rely on the s3 policy to clean up the data after a few days. Test: - CI Signed-off-by: can <[email protected]>
Note that this support has not been removed completely yet; that will happen once I work on upgrading to python 3.12. Need to change some runtime environments to `oss-ci-base_build` since `forge` is using python 3.8. Test: - CI Signed-off-by: can <[email protected]>
Refactor ResourceManager and avoid it directly depending on concrete operators. --------- Signed-off-by: Hao Chen <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
… symlinks (#45618) New env var is called RAY_DASHBOARD_BUILD_FOLLOW_SYMLINKS. This is an advanced setting that should only be used with special Ray installations where the dashboard build files are symlinked to a different directory. This is not recommended for most users and can pose a security risk. Please reference the aiohttp docs here: https://docs.aiohttp.org/en/stable/web_reference.html#aiohttp.web.UrlDispatcher.add_static
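A rough sketch of how such a setting could be wired up is shown below. Only the env var name and aiohttp's `follow_symlinks` parameter of `add_static` come from the sources above; the helper name `follow_symlinks_enabled` and the accepted values (`"1"` or `"true"`) are assumptions for illustration, not Ray's actual parsing.

```python
import os

ENV_VAR = "RAY_DASHBOARD_BUILD_FOLLOW_SYMLINKS"


def follow_symlinks_enabled(environ=None) -> bool:
    """Return True only when the env var is explicitly switched on.

    Assumption: "1" or "true" (case-insensitive) enable the setting;
    anything else, including an unset variable, keeps it off.
    """
    environ = os.environ if environ is None else environ
    return environ.get(ENV_VAR, "0").strip().lower() in ("1", "true")


# With aiohttp installed, the flag would be passed to the static route, e.g.:
#   app.router.add_static("/static", build_dir,
#                         follow_symlinks=follow_symlinks_enabled())
# where `app` and `build_dir` are placeholders.

if __name__ == "__main__":
    print(follow_symlinks_enabled({ENV_VAR: "1"}))
    print(follow_symlinks_enabled({}))
```

Defaulting to off matches the commit's warning that following symlinks can pose a security risk and should be opt-in.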
Signed-off-by: hongchaodeng <[email protected]>
Add `oss` tag to container tests. Signed-off-by: Cindy Zhang <[email protected]>
…45217) This PR adds an example for stable diffusion model fine-tuning and serving using HPU. Moreover, it also covers how to adapt an existing HPU example to run on Ray, so that users can use Ray to run the examples on huggingface/optimum-habana. --------- Signed-off-by: Zhi Lin <[email protected]> Signed-off-by: Yunxuan Xiao <[email protected]> Signed-off-by: Samuel Chan <[email protected]> Co-authored-by: Yunxuan Xiao <[email protected]> Co-authored-by: Yunxuan Xiao <[email protected]> Co-authored-by: Samuel Chan <[email protected]> Co-authored-by: Peyton Murray <[email protected]>
…onGroup` (#45523) Signed-off-by: Yang, Bo <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
) Signed-off-by: hejialing.hjl <[email protected]>
…arnerGroup.update_from_batch()`. (#45419)
Add keys to a few cheap builds and tests that I noticed failing on people's PRs, so we can include them in microcheck. These tests are not covered in the scope of test_in_docker. Test: - CI Signed-off-by: can <[email protected]>
Signed-off-by: Jiajun Yao <[email protected]>
This PR is to add the telemetry recording for newly added datasources. Signed-off-by: Cheng Su <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
…/2.24.0 fast forward
Generated by release-automation bot --------- Signed-off-by: kevin <[email protected]> Signed-off-by: khluu <[email protected]>
No description provided.