Adding support vLLM openai entrypoint benchmarking script #793

Edwinhr716 · 2024-08-31T00:05:03Z

Currently, the benchmarking script only works with the regular vLLM entrypoint. Since the benchmarking suite uses the OpenAI entrypoint instead, this change updates the script to support that entrypoint.
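A minimal sketch of how the two entrypoints can differ from the client side. It assumes the regular vLLM API server is queried at `/generate` and the OpenAI-compatible server at `/v1/completions`. The helper name `build_request`, its parameters, and the chosen payload fields are illustrative assumptions, not the code changed in this PR.

```python
from typing import Any, Dict, Tuple


def build_request(
    host: str,
    port: int,
    prompt: str,
    output_len: int,
    model: str,
    use_openai_entrypoint: bool = True,
) -> Tuple[str, Dict[str, Any]]:
    """Return (url, payload) for one completion request.

    Hypothetical helper: the endpoint paths and fields below are assumptions
    about a typical vLLM deployment, not taken from the benchmarking script.
    """
    base = f"http://{host}:{port}"
    if use_openai_entrypoint:
        # The OpenAI-compatible completions API expects a model name.
        payload = {
            "model": model,
            "prompt": prompt,
            "max_tokens": output_len,
            "temperature": 0.0,
        }
        return f"{base}/v1/completions", payload
    # The regular entrypoint takes the prompt plus sampling parameters.
    payload = {
        "prompt": prompt,
        "max_tokens": output_len,
        "temperature": 0.0,
    }
    return f"{base}/generate", payload
```

Defaulting `use_openai_entrypoint` to `True` mirrors the commit message "defaulted entrypoint in vllm case", though the actual default mechanism in the script may differ.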

benchmarks/benchmark/tools/profile-generator/container/latency_throughput_curve.sh

vivianrwu

For jetstream, can you change max_tokens: 1 to max_tokens: output_len under the benchmark_serving.py script? Other than that, lgtm for jetstream side changes!
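The requested change can be sketched as follows. With `max_tokens` hard-coded to 1, every request would ask for a single generated token regardless of the dataset's target output length. The function name `jetstream_payload` is hypothetical, and only the `prompt` and `max_tokens` fields are shown; the real payload in `benchmark_serving.py` may carry additional fields.

```python
from typing import Any, Dict


def jetstream_payload(prompt: str, output_len: int) -> Dict[str, Any]:
    """Build a JetStream request body (hypothetical helper).

    Before the requested change the body would have used "max_tokens": 1;
    here the per-request output length is passed through instead.
    """
    return {
        "prompt": prompt,
        "max_tokens": output_len,
    }
```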

Edwinhr716 and others added 4 commits August 26, 2024 14:32

changes to support openai entrypoint

01ad429

cat command to show results

7a487fb

defaulted entrypoint in vllm case, updated docs and versions

b982e4a

readded comment

cd95fd1

Edwinhr716 requested review from achandrasekar, ahg-g and annapendleton as code owners August 31, 2024 00:05

fixed lint issue

14f5f99

annapendleton reviewed Aug 31, 2024

benchmarks/benchmark/tools/profile-generator/container/latency_throughput_curve.sh

ran terraform fmt

abfc66a

annapendleton approved these changes Aug 31, 2024

fixed comments

e7278e8

vivianrwu reviewed Sep 3, 2024

changes to support jetstream

7983364

annapendleton merged commit c872599 into GoogleCloudPlatform:main Sep 3, 2024
7 checks passed
