[Bugfix][Frontend] Fix missing `/metrics` endpoint #6463

DarkLight1337 · 2024-07-16T07:17:33Z

Mount does not work on router, so I'm reverting the change in #5090 so that mount is performed on the application object. Since app is no longer global, I've moved the code into a function to be called after app is constructed.

To avoid similar issues in the future, I've added some tests to check that the endpoints remain accessible.

FIX #6461

github-actions · 2024-07-16T07:17:44Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only trigger fastcheck CI to run, which consists only a small and essential subset of tests to quickly catch errors with the flexibility to run extra individual tests on top (you can do this by unblocking test steps in the Buildkite run).

Full CI run is still required to merge this PR so once the PR is ready to go, please make sure to run it. If you need all test signals in between PR commits, you can trigger full CI as well.

To run full CI, you can do one of these:

Comment /ready on the PR
Add ready label to the PR
Enable auto-merge.

🚀

DarkLight1337 · 2024-07-16T07:23:26Z

@robertgshaw2-neuralmagic can you help update the test to check #4523? Since I haven't worked with prometheus metrics before, I'm worried that I might break something with this refactor.

tests/entrypoints/openai/test_basic.py

DarkLight1337 · 2024-07-19T04:00:12Z

@robertgshaw2-neuralmagic since this PR has been merged already, you can open a new PR if you need to update the tests to be more robust.

see vllm-project/vllm#6463

xfalcox · 2024-07-22T19:03:41Z

Can we get a new docker release with this fix?

ZeroYuJie · 2024-08-06T04:00:25Z

v0.5.4 docker release still missing /metrics endpoint

ZeroYuJie · 2024-08-06T05:51:12Z

I try use main branch to build docker image,The metrics information has become completely unavailable,no longer contains any statistical information related to the model.

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 7313.0
python_gc_objects_collected_total{generation="1"} 5988.0
python_gc_objects_collected_total{generation="2"} 45.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 922.0
python_gc_collections_total{generation="1"} 83.0
python_gc_collections_total{generation="2"} 6.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="14",version="3.10.14"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.3022793728e+010
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 7.00747776e+08
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.72291887821e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 18.240000000000002
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 22.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06

DarkLight1337 · 2024-08-06T05:53:52Z

I try use main branch to build docker image,The metrics information has become completely unavailable,no longer contains any statistical information related to the model.

Can you revert to the commit from this PR to ensure that the problem isn't caused by subsequent commits? (Notably #6883)

cc @robertgshaw2-neuralmagic

frittentheke · 2024-08-06T06:57:43Z

The fix https://github.com/vllm-project/vllm/pull/6463/files for the former issue #6461 also contained tests to ensure metrics are there. Are they not being used @DarkLight1337 ?

DarkLight1337 · 2024-08-06T07:00:44Z

The test only checks that the metrics endpoint exists. It doesn't check which metrics are being exposed.

frittentheke · 2024-08-06T07:03:14Z

The test only checks that the metrics endpoint exists. It doesn't check which metrics are being exposed.

That might be a cool thing to test then 🙄 ;-)

Signed-off-by: Alvant <[email protected]>

Add tests

cecf15b

DarkLight1337 marked this pull request as draft July 16, 2024 07:17

Update test

4defc08

Fix /metrics 404 error

fc19360

DarkLight1337 marked this pull request as ready for review July 16, 2024 07:30

DarkLight1337 requested a review from robertgshaw2-neuralmagic July 16, 2024 07:30

DarkLight1337 commented Jul 16, 2024

View reviewed changes

tests/entrypoints/openai/test_basic.py Outdated Show resolved Hide resolved

simon-mo mentioned this pull request Jul 16, 2024

v0.5.2, v0.5.3, v0.6.0 Release Tracker #6434

Closed

7 tasks

simon-mo approved these changes Jul 17, 2024

View reviewed changes

DarkLight1337 added 2 commits July 17, 2024 07:45

Merge branch 'upstream' into fix-metrics-endpoint

e22aea4

Update args

bef87a2

simon-mo added the ready ONLY add when PR is ready to merge/full CI is needed label Jul 19, 2024

simon-mo enabled auto-merge (squash) July 19, 2024 02:00

simon-mo merged commit 6366efc into vllm-project:main Jul 19, 2024
86 checks passed

DarkLight1337 deleted the fix-metrics-endpoint branch July 19, 2024 04:00

dtrifiro added a commit to opendatahub-io/vllm-tgis-adapter that referenced this pull request Jul 19, 2024

fix missing metrics endpoint

7a2d9eb

see vllm-project/vllm#6463

dtrifiro added a commit to opendatahub-io/vllm-tgis-adapter that referenced this pull request Jul 19, 2024

fix missing metrics endpoint

bac5266

see vllm-project/vllm#6463

dtrifiro mentioned this pull request Jul 19, 2024

fix missing metrics endpoint opendatahub-io/vllm-tgis-adapter#50

Merged

github-merge-queue bot pushed a commit to opendatahub-io/vllm-tgis-adapter that referenced this pull request Jul 19, 2024

fix missing metrics endpoint

b9e01ff

see vllm-project/vllm#6463

github-merge-queue bot pushed a commit to opendatahub-io/vllm-tgis-adapter that referenced this pull request Jul 19, 2024

fix missing metrics endpoint

0064080

see vllm-project/vllm#6463

dtrifiro added a commit to opendatahub-io/vllm-tgis-adapter that referenced this pull request Jul 19, 2024

fix missing metrics endpoint

5618c0d

see vllm-project/vllm#6463

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024

[Bugfix][Frontend] Fix missing /metrics endpoint (vllm-project#6463)

b7962ea

gnpinkert pushed a commit to gnpinkert/vllm that referenced this pull request Jul 26, 2024

[Bugfix][Frontend] Fix missing /metrics endpoint (vllm-project#6463)

7c7fdf6

cduk pushed a commit to cduk/vllm-pascal that referenced this pull request Aug 6, 2024

[Bugfix][Frontend] Fix missing /metrics endpoint (vllm-project#6463)

b3a181f

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[Bugfix][Frontend] Fix missing /metrics endpoint (vllm-project#6463)

09454ff

Signed-off-by: Alvant <[email protected]>

KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[Bugfix][Frontend] Fix missing /metrics endpoint (vllm-project#6463)

c59f3f4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bugfix][Frontend] Fix missing `/metrics` endpoint #6463

[Bugfix][Frontend] Fix missing `/metrics` endpoint #6463

DarkLight1337 commented Jul 16, 2024 •

edited

Loading

github-actions bot commented Jul 16, 2024

DarkLight1337 commented Jul 16, 2024

DarkLight1337 commented Jul 19, 2024

xfalcox commented Jul 22, 2024

ZeroYuJie commented Aug 6, 2024

ZeroYuJie commented Aug 6, 2024

DarkLight1337 commented Aug 6, 2024 •

edited

Loading

frittentheke commented Aug 6, 2024

DarkLight1337 commented Aug 6, 2024

frittentheke commented Aug 6, 2024 •

edited

Loading

[Bugfix][Frontend] Fix missing /metrics endpoint #6463

[Bugfix][Frontend] Fix missing /metrics endpoint #6463

Conversation

DarkLight1337 commented Jul 16, 2024 • edited Loading

github-actions bot commented Jul 16, 2024

DarkLight1337 commented Jul 16, 2024

DarkLight1337 commented Jul 19, 2024

xfalcox commented Jul 22, 2024

ZeroYuJie commented Aug 6, 2024

ZeroYuJie commented Aug 6, 2024

DarkLight1337 commented Aug 6, 2024 • edited Loading

frittentheke commented Aug 6, 2024

DarkLight1337 commented Aug 6, 2024

frittentheke commented Aug 6, 2024 • edited Loading

[Bugfix][Frontend] Fix missing `/metrics` endpoint #6463

[Bugfix][Frontend] Fix missing `/metrics` endpoint #6463

DarkLight1337 commented Jul 16, 2024 •

edited

Loading

DarkLight1337 commented Aug 6, 2024 •

edited

Loading

frittentheke commented Aug 6, 2024 •

edited

Loading