Actions: vllm-project/vllm

Add label on auto-merge enabled

122 workflow run results

[Bugfix] fix flashinfer cudagraph capture for PP
Add label on auto-merge enabled #72: Pull request #6708 auto_merge_enabled by Yard1
July 24, 2024 00:15 13s
[bitsandbytes]: support read bnb pre-quantized model
Add label on auto-merge enabled #71: Pull request #5753 auto_merge_enabled by mgoin
July 23, 2024 22:28 13s
[Bugfix] StatLoggers: cache spec decode metrics when they get collected.
Add label on auto-merge enabled #70: Pull request #6645 auto_merge_enabled by comaniac
July 23, 2024 21:58 13s
[CI] Add smoke test for non-uniform AutoFP8 quantization
Add label on auto-merge enabled #69: Pull request #6702 auto_merge_enabled by comaniac
July 23, 2024 19:19 11s
[Kernel] Tuned FP8 Kernels for Ada Lovelace
Add label on auto-merge enabled #68: Pull request #6677 auto_merge_enabled by mgoin
July 23, 2024 18:58 9s
[Bugfix] Miscalculated latency lead to time_to_first_token_seconds inaccurate.
Add label on auto-merge enabled #67: Pull request #6686 auto_merge_enabled by Yard1
July 23, 2024 17:12 52s
[Kernels] Add fp8 support to reshape_and_cache_flash
Add label on auto-merge enabled #66: Pull request #6667 auto_merge_enabled by Yard1
July 23, 2024 17:12 24s
[Misc] Support FP8 kv cache scales from compressed-tensors
Add label on auto-merge enabled #65: Pull request #6528 auto_merge_enabled by mgoin
July 23, 2024 02:38 10s
[Core] Reduce unnecessary compute when logprobs=None
Add label on auto-merge enabled #64: Pull request #6532 auto_merge_enabled by comaniac
July 23, 2024 02:19 11s
[Bugfix] Fix null modules_to_not_convert in FBGEMM Fp8 quantization
Add label on auto-merge enabled #63: Pull request #6665 auto_merge_enabled by robertgshaw2-redhat
July 23, 2024 00:58 10s
[Core] Modulize prepare input and attention metadata builder
Add label on auto-merge enabled #62: Pull request #6596 auto_merge_enabled by comaniac
July 22, 2024 23:16 11s
[Frontend] Kill the server on engine death
Add label on auto-merge enabled #61: Pull request #6594 auto_merge_enabled by Yard1
July 22, 2024 23:09 14s
[Misc] Remove deprecation warning for beam search
Add label on auto-merge enabled #60: Pull request #6659 auto_merge_enabled by WoosukKwon
July 22, 2024 23:04 21s
[Bugfix][Kernel] Use int64_t for indices in fp8 quant kernels
Add label on auto-merge enabled #59: Pull request #6649 auto_merge_enabled by robertgshaw2-redhat
July 22, 2024 14:45 12s
[Bugfix] Fix vocab_size field access in LLaVA models
Add label on auto-merge enabled #58: Pull request #6624 auto_merge_enabled by DarkLight1337
July 22, 2024 03:40 14s
[ CI ] Awq Marlin Integration Tests
Add label on auto-merge enabled #57: Pull request #6627 auto_merge_enabled by robertgshaw2-redhat
July 22, 2024 01:01 10s
[ Kernel ] Enable fp8-marlin for fbgemm-fp8 models
Add label on auto-merge enabled #56: Pull request #6606 auto_merge_enabled by mgoin
July 20, 2024 18:32 11s
[Misc] Consolidate and optimize logic for building padded tensors
Add label on auto-merge enabled #55: Pull request #6541 auto_merge_enabled by DarkLight1337
July 20, 2024 03:37 11s
[ Misc ] fbgemm checkpoints
Add label on auto-merge enabled #54: Pull request #6559 auto_merge_enabled by mgoin
July 20, 2024 01:37 10s
[ Kernel ] FP8 Dynamic Per Token Quant - Add scale_ub
Add label on auto-merge enabled #53: Pull request #6593 auto_merge_enabled by mgoin
July 20, 2024 00:35 12s
[Core] Allow specifying custom Executor
Add label on auto-merge enabled #52: Pull request #6557 auto_merge_enabled by Yard1
July 19, 2024 23:34 14s
[Core] Allow specifying custom Executor
Add label on auto-merge enabled #51: Pull request #6557 auto_merge_enabled by Yard1
July 19, 2024 22:27 10s
[Bugfix] [SpecDecode] AsyncMetricsCollector: update time since last collection
Add label on auto-merge enabled #50: Pull request #6578 auto_merge_enabled by cadedaniel
July 19, 2024 20:53 11s
[ Kernel ] Enable Dynamic Per Token fp8
Add label on auto-merge enabled #49: Pull request #6547 auto_merge_enabled by robertgshaw2-redhat
July 19, 2024 18:34 13s
[Misc] Fix input_scale typing in w8a8_utils.py
Add label on auto-merge enabled #48: Pull request #6579 auto_merge_enabled by mgoin
July 19, 2024 14:31 11s