Upstream merge 24 12 16 #330
Merged
gshtras merged 102 commits into main from upstream_merge_24_12_16 on Dec 16, 2024
+10,604 −4,101
Commits
Commits on Dec 9, 2024
Commits on Dec 10, 2024
[Model] Add has_weight to RMSNorm and re-enable weights loading tracker for Mamba (vllm-project#10739)
[Bugfix] Fix xgrammar failing to read a vocab_size from LlavaConfig on PixtralHF. (vllm-project#11043)
Commits on Dec 11, 2024
Commits on Dec 12, 2024
Commits on Dec 13, 2024
[Core] support LoRA and prompt adapter in content-based hashing for Block Manager v2 prefix caching (vllm-project#8240)
[Bugfix] using len(tokenizer) instead of tokenizer.vocab_size in AllowedTokenIdsLogitsProcessor (vllm-project#11156)
[Distributed] Allow the placement group more time to wait for resources to be ready (vllm-project#11138)
Commits on Dec 14, 2024
[Performance][Core] Optimize the performance of evictor v1 and v2 by applying a priority queue and lazy deletion (vllm-project#7209)
Commits on Dec 15, 2024
Commits on Dec 16, 2024