[Doc] Fix doc lint (#49355)
Signed-off-by: dentiny <[email protected]>
dentiny authored Dec 19, 2024
1 parent 53d2145 commit 95fc8e3
Showing 3 changed files with 29 additions and 29 deletions.
6 changes: 3 additions & 3 deletions doc/source/ray-core/compiled-graph/quickstart.rst
@@ -31,7 +31,7 @@ Create a very simple actor that directly returns a given input using classic Ray
for _ in range(5):
    msg_ref = a.echo.remote("hello")
    ray.get(msg_ref)

start = time.perf_counter()
msg_ref = a.echo.remote("hello")
ray.get(msg_ref)
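
For readers skimming the diff, a minimal runnable version of the classic-Ray measurement this hunk excerpts might look like the following; the ``EchoActor`` definition and the ``ray.init()`` call are assumptions, since the diff only shows the call sites:

    import time
    import ray

    ray.init()

    @ray.remote
    class EchoActor:
        """Assumed actor definition; the diff only shows the calls to it."""
        def echo(self, msg):
            return msg

    a = EchoActor.remote()

    # Warm up so actor startup cost is not part of the measurement.
    for _ in range(5):
        ray.get(a.echo.remote("hello"))

    start = time.perf_counter()
    msg_ref = a.echo.remote("hello")
    ray.get(msg_ref)
    end = time.perf_counter()
    print(f"Execution takes {(end - start) * 1e6} us")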
@@ -68,7 +68,7 @@ Next, execute the DAG and measure the performance.
for _ in range(5):
    msg_ref = dag.execute("hello")
    ray.get(msg_ref)

start = time.perf_counter()
# `dag.execute` runs the DAG and returns a future. Use the `ray.get` API to retrieve the result.
msg_ref = dag.execute("hello")
@@ -81,7 +81,7 @@ Next, execute the DAG and measure the performance.
Execution takes 86.72196418046951 us

The performance of the same DAG improved by 10X. This improvement occurs because the function ``echo`` is cheap and thus highly affected by
-the system overhead. Due to various bookkeeping and distributed protocols, the classic Ray Core APIs usually have 1ms+ system overhead.
+the system overhead. Due to various bookkeeping and distributed protocols, the classic Ray Core APIs usually have 1ms+ system overhead.
Because the DAG is known ahead of time, Compiled Graph can pre-allocate all necessary
resources in advance and greatly reduce the system overhead.
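
A self-contained sketch of the Compiled Graph version the hunks above excerpt, reconstructed from the fragments in this diff; the actor definition and the ``InputNode``/``experimental_compile`` usage follow the Ray 2.32-era API and are an approximation, not the file's exact contents:

    import time
    import ray
    from ray.dag import InputNode

    ray.init()

    @ray.remote
    class EchoActor:
        def echo(self, msg):
            return msg

    a = EchoActor.remote()

    # Define the DAG once; bind() records the call instead of running it.
    with InputNode() as inp:
        dag = a.echo.bind(inp)

    # Compiling lets Ray pre-allocate buffers and channels up front.
    dag = dag.experimental_compile()

    # Warm up before timing.
    for _ in range(5):
        ray.get(dag.execute("hello"))

    start = time.perf_counter()
    msg_ref = dag.execute("hello")
    ray.get(msg_ref)
    end = time.perf_counter()
    print(f"Execution takes {(end - start) * 1e6} us")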

4 changes: 2 additions & 2 deletions doc/source/ray-core/compiled-graph/ray-compiled-graph.rst
@@ -7,10 +7,10 @@ Ray Compiled Graph
The API is available from Ray 2.32.

As large language models (LLMs) become common, programming distributed systems with multiple GPUs is essential.
-Ray APIs facilitate using multiple GPUs but have limitations such as:
+Ray APIs facilitate using multiple GPUs but have limitations such as:

* a high system overhead of over 1ms per task launch, which is unsuitable for high-performance tasks like LLM inference
-* no direct GPU-to-GPU RDMA communication, requiring external tools like NCCL.
+* no direct GPU-to-GPU RDMA communication, requiring external tools like NCCL.

Ray Compiled Graph gives you a classic Ray Core-like API but with:

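As a heavily hedged illustration of the NCCL point above: in the Ray 2.32-era experimental API, a compiled graph can be annotated so a GPU tensor moves directly between actors. ``TorchTensorType`` and ``with_type_hint`` are assumptions that may differ across Ray versions, and the sketch needs two GPU actors plus torch:

    import ray
    import torch
    from ray.dag import InputNode
    from ray.experimental.channel.torch_tensor_type import TorchTensorType

    ray.init()

    @ray.remote(num_gpus=1)  # requires a cluster with at least 2 GPUs
    class GPUWorker:
        def send(self, shape):
            return torch.zeros(shape, device="cuda")

        def recv(self, tensor):
            return tensor.shape

    sender = GPUWorker.remote()
    receiver = GPUWorker.remote()

    with InputNode() as shape:
        dag = sender.send.bind(shape)
        # Assumed API: ask the compiled graph to ship the tensor over NCCL
        # instead of serializing it through the object store.
        dag = dag.with_type_hint(TorchTensorType(transport="nccl"))
        dag = receiver.recv.bind(dag)

    dag = dag.experimental_compile()
    print(ray.get(dag.execute((4, 4))))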
48 changes: 24 additions & 24 deletions release/release_data_tests.yaml
@@ -19,8 +19,8 @@
run:
timeout: 3600
script: >
-python read_and_consume_benchmark.py
-s3://ray-benchmark-data-internal/imagenet/parquet --format parquet
+python read_and_consume_benchmark.py
+s3://ray-benchmark-data-internal/imagenet/parquet --format parquet
--iter-bundles
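
The ``script: >`` entries above use YAML folded block scalars, which is why this lint fix re-indents the continuation lines: every continuation must share the same indentation for the folding to produce one command. A small sketch, assuming PyYAML is installed, showing how the folded lines collapse into a single shell command:

    import yaml  # PyYAML, assumed installed for this illustration

    entry = """
    run:
      timeout: 3600
      script: >
        python read_and_consume_benchmark.py
        s3://ray-benchmark-data-internal/imagenet/parquet --format parquet
        --iter-bundles
    """

    parsed = yaml.safe_load(entry)
    # The folded scalar joins its continuation lines with single spaces,
    # yielding one command string (plus a trailing newline).
    print(parsed["run"]["script"])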
- name: read_images
@@ -35,9 +35,9 @@
timeout: 3600
script: >
python read_and_consume_benchmark.py
-s3://ray-benchmark-data-internal/imagenet/tfrecords --format tfrecords
+s3://ray-benchmark-data-internal/imagenet/tfrecords --format tfrecords
--iter-bundles
- name: read_from_uris
run:
timeout: 5400
@@ -62,8 +62,8 @@
- name: write_parquet
run:
timeout: 3600
-script: >
-python read_and_consume_benchmark.py
+script: >
+python read_and_consume_benchmark.py
s3://ray-benchmark-data/tpch/parquet/sf1000/lineitem --format parquet --write
###################
@@ -74,7 +74,7 @@
run:
timeout: 600
script: >
-python read_and_consume_benchmark.py
+python read_and_consume_benchmark.py
s3://ray-benchmark-data/tpch/parquet/sf10000/lineitem --format parquet --count
###############
@@ -268,38 +268,38 @@
- __suffix__: numpy
run:
script: >
-python read_and_consume_benchmark.py
-s3://ray-benchmark-data/tpch/parquet/sf100/lineitem --format parquet
+python read_and_consume_benchmark.py
+s3://ray-benchmark-data/tpch/parquet/sf100/lineitem --format parquet
--iter-batches numpy
- __suffix__: pandas
run:
script: >
-python read_and_consume_benchmark.py
-s3://ray-benchmark-data/tpch/parquet/sf100/lineitem --format parquet
+python read_and_consume_benchmark.py
+s3://ray-benchmark-data/tpch/parquet/sf100/lineitem --format parquet
--iter-batches pandas
- __suffix__: pyarrow
run:
script: >
-python read_and_consume_benchmark.py
-s3://ray-benchmark-data/tpch/parquet/sf100/lineitem --format parquet
+python read_and_consume_benchmark.py
+s3://ray-benchmark-data/tpch/parquet/sf100/lineitem --format parquet
--iter-batches pyarrow
- name: to_tf

run:
timeout: 2400
script: >
-python read_and_consume_benchmark.py
-s3://air-example-data-2/100G-image-data-synthetic-raw/ --format image
+python read_and_consume_benchmark.py
+s3://air-example-data-2/100G-image-data-synthetic-raw/ --format image
--to-tf image image
- name: iter_torch_batches

run:
timeout: 2400
script: >
-python read_and_consume_benchmark.py
-s3://air-example-data-2/100G-image-data-synthetic-raw/ --format image
+python read_and_consume_benchmark.py
+s3://air-example-data-2/100G-image-data-synthetic-raw/ --format image
--iter-torch-batches
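
These variations differ only in how the dataset is consumed. The benchmark script itself is not part of this diff; the following sketch shows the Ray Data calls its flags presumably map to, with paths and formats taken from the entries above:

    import ray

    # --iter-batches numpy / pandas / pyarrow: iterate batches in the
    # requested in-memory format.
    ds = ray.data.read_parquet("s3://ray-benchmark-data/tpch/parquet/sf100/lineitem")
    for batch in ds.iter_batches(batch_format="pandas"):
        pass  # the benchmark only consumes the batches

    # --iter-torch-batches: iterate batches converted to torch tensors.
    images = ray.data.read_images("s3://air-example-data-2/100G-image-data-synthetic-raw/")
    for batch in images.iter_torch_batches():
        pass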
###########
@@ -339,17 +339,17 @@
run:
timeout: 10800
script: >
-python dataset/sort_benchmark.py --num-partitions=1000 --partition-size=1e9
+python dataset/sort_benchmark.py --num-partitions=1000 --partition-size=1e9
--shuffle
variations:
- __suffix__: regular
- __suffix__: chaos
run:
-prepare: >
-python setup_chaos.py --chaos TerminateEC2Instance --kill-interval 600
---max-to-kill 2
+prepare: >
+python setup_chaos.py --chaos TerminateEC2Instance --kill-interval 600
+--max-to-kill 2
- name: sort
working_dir: nightly_tests
stable: False
@@ -370,8 +370,8 @@
- __suffix__: regular
- __suffix__: chaos
run:
-prepare: >
-python setup_chaos.py --chaos TerminateEC2Instance --kill-interval 900
+prepare: >
+python setup_chaos.py --chaos TerminateEC2Instance --kill-interval 900
--max-to-kill 3
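
For the chaos variations, the ``prepare`` command presumably runs before the benchmark body to inject failures. A hypothetical illustration of what executing the folded command amounts to (the release harness's actual mechanism is not shown in this diff):

    import shlex
    import subprocess

    # Folded YAML scalar from the entry above, joined into one command line.
    prepare = (
        "python setup_chaos.py --chaos TerminateEC2Instance "
        "--kill-interval 900 --max-to-kill 3"
    )

    # Hypothetical: run the chaos setup, which schedules up to 3 instance
    # terminations, one every 900 seconds, before the sort test starts.
    subprocess.run(shlex.split(prepare), check=True)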
