Actions: vllm-project/vllm

Add label on auto-merge enabled

1,285 workflow runs

[ Misc ] fp8-marlin channelwise via compressed-tensors
Add label on auto-merge enabled #85: Pull request #6524 auto_merge_enabled by mgoin
July 25, 2024 00:45 11s
[Bugfix] Fix decode tokens w. CUDA graph
Add label on auto-merge enabled #84: Pull request #6757 auto_merge_enabled by comaniac
July 24, 2024 20:40 13s
[Bugfix] Bump transformers to 4.43.2
Add label on auto-merge enabled #83: Pull request #6752 auto_merge_enabled by mgoin
July 24, 2024 18:55 13s
[Bugfix] Fix awq_marlin and gptq_marlin flags
Add label on auto-merge enabled #82: Pull request #6745 auto_merge_enabled by mgoin
July 24, 2024 18:48 15s
[Frontend] split run_server into build_server and run_server
Add label on auto-merge enabled #81: Pull request #6740 auto_merge_enabled by simon-mo
July 24, 2024 15:54 18s
Adding f-string to validation error which is missing
Add label on auto-merge enabled #80: Pull request #6748 auto_merge_enabled by comaniac
July 24, 2024 15:52 14s
[Bugfix] Miscalculated latency lead to time_to_first_token_seconds inaccurate.
Add label on auto-merge enabled #79: Pull request #6686 auto_merge_enabled by comaniac
July 24, 2024 15:44 11s
[Kernel] Tuned FP8 Kernels for Ada Lovelace
Add label on auto-merge enabled #78: Pull request #6677 auto_merge_enabled by mgoin
July 24, 2024 14:39 12s
[Bugfix]fix modelscope compatible issue
Add label on auto-merge enabled #77: Pull request #6730 auto_merge_enabled by simon-mo
July 24, 2024 12:04 12s
[Bugfix] Fix token padding for chameleon
Add label on auto-merge enabled #76: Pull request #6724 auto_merge_enabled by ywang96
July 24, 2024 04:47 11s
[SpecDecoding] Update MLPSpeculator CI tests to use smaller model
Add label on auto-merge enabled #75: Pull request #6714 auto_merge_enabled by njhill
July 24, 2024 04:38 13s
[Bugfix] Miscalculated latency lead to time_to_first_token_seconds inaccurate.
Add label on auto-merge enabled #74: Pull request #6686 auto_merge_enabled by Yard1
July 24, 2024 00:22 10s
[Core] Tweaks to model runner/input builder developer APIs
Add label on auto-merge enabled #73: Pull request #6712 auto_merge_enabled by comaniac
July 24, 2024 00:18 12s
[Bugfix] fix flashinfer cudagraph capture for PP
Add label on auto-merge enabled #72: Pull request #6708 auto_merge_enabled by Yard1
July 24, 2024 00:15 13s
[bitsandbytes]: support read bnb pre-quantized model
Add label on auto-merge enabled #71: Pull request #5753 auto_merge_enabled by mgoin
July 23, 2024 22:28 13s
[Bugfix] StatLoggers: cache spec decode metrics when they get collected.
Add label on auto-merge enabled #70: Pull request #6645 auto_merge_enabled by comaniac
July 23, 2024 21:58 13s
[CI] Add smoke test for non-uniform AutoFP8 quantization
Add label on auto-merge enabled #69: Pull request #6702 auto_merge_enabled by comaniac
July 23, 2024 19:19 11s
[Kernel] Tuned FP8 Kernels for Ada Lovelace
Add label on auto-merge enabled #68: Pull request #6677 auto_merge_enabled by mgoin
July 23, 2024 18:58 9s
[Bugfix] Miscalculated latency lead to time_to_first_token_seconds inaccurate.
Add label on auto-merge enabled #67: Pull request #6686 auto_merge_enabled by Yard1
July 23, 2024 17:12 52s
[Kernels] Add fp8 support to reshape_and_cache_flash
Add label on auto-merge enabled #66: Pull request #6667 auto_merge_enabled by Yard1
July 23, 2024 17:12 24s
[Misc] Support FP8 kv cache scales from compressed-tensors
Add label on auto-merge enabled #65: Pull request #6528 auto_merge_enabled by mgoin
July 23, 2024 02:38 10s
[Core] Reduce unnecessary compute when logprobs=None
Add label on auto-merge enabled #64: Pull request #6532 auto_merge_enabled by comaniac
July 23, 2024 02:19 11s
[Bugfix] Fix null modules_to_not_convert in FBGEMM Fp8 quantization
Add label on auto-merge enabled #63: Pull request #6665 auto_merge_enabled by robertgshaw2-redhat
July 23, 2024 00:58 10s
[Core] Modulize prepare input and attention metadata builder
Add label on auto-merge enabled #62: Pull request #6596 auto_merge_enabled by comaniac
July 22, 2024 23:16 11s
[Frontend] Kill the server on engine death
Add label on auto-merge enabled #61: Pull request #6594 auto_merge_enabled by Yard1
July 22, 2024 23:09 14s