Update ROCm vLLM to 0.4.3 #40

Merged

mawong-amd merged 393 commits into main from main_upstream_candidate_531_fp8

Jun 6, 2024

+54,524 -18,615

This pull request is big! We're only showing the most recent 250 commits

Commits on Apr 27, 2024

[BugFix] Fix return type of executor execute_model methods (vllm-project#4402)
njhill authored
[BugFix] Resolved Issues For LinearMethod --> QuantConfig (vllm-project#4418)
robertgshaw2-redhat authored

Commits on Apr 30, 2024

[Misc] Upgrade to torch==2.3.0 (vllm-project#4454)
mgoin authored
[Bugfix][Kernel] Fix compute_type for MoE kernel (vllm-project#4463)
WoosukKwon authored
[Core] Refactor gptq_marlin ops (vllm-project#4466)
jikunshang authored
[BugFix] fix num_lookahead_slots missing in async executor (vllm-project#4165)
leiwen83 and wenlei03 authored
[Doc] add visualization for multi-stage dockerfile (vllm-project#4456)
prashantgupta24 and ywang96 authored
[Kernel] Support Fp8 Checkpoints (Dynamic + Static) (vllm-project#4332)
authored
[Frontend] Support complex message content for chat completions endpoint (vllm-project#3467)
authored
[Frontend] [Core] Tensorizer: support dynamic num_readers, update version (vllm-project#4467)
alpayariyak authored
[Bugfix][Minor] Make ignore_eos effective (vllm-project#4468)
bigPYJ1151 authored
fix_tokenizer_snapshot_download_bug (vllm-project#4493)
kingljl authored

Commits on May 1, 2024

Commits on May 2, 2024

Commits on May 3, 2024

Commits on May 7, 2024

Commits on May 8, 2024

Commits on May 9, 2024

Commits on May 10, 2024

Commits on May 11, 2024

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (vllm-project#3734)
CatherineSue authored

Commits on May 12, 2024

[Model] Add support for IBM Granite Code models (vllm-project#4636)
yikangshen authored

Commits on May 13, 2024

Commits on May 15, 2024

Commits on May 16, 2024

Commits on May 18, 2024

Commits on May 19, 2024

Commits on May 20, 2024

Commits on May 21, 2024

Commits on May 22, 2024

Commits on May 23, 2024

Commits on May 24, 2024

Commits on May 29, 2024

Commits on May 30, 2024

Commits on May 31, 2024

Commits on Jun 5, 2024

Update linear.py
gshtras authored and mawong-amd committed

Commits on Jun 6, 2024