Upstream merge 24 10 21 #240
Merged
gshtras merged 299 commits into main from upstream_merge_24_10_21 on Oct 23, 2024
+14,673 -6,589
Commits
This pull request is big! We're only showing the most recent 250 commits
Commits on Sep 30, 2024
Commits on Oct 1, 2024
[CI][SpecDecode] Fix spec decode tests, use flash attention backend for spec decode CI tests. (vllm-project#8975)
[Bugfix] Fix Token IDs Reference for MiniCPM-V When Images are Provided With No Placeholders (vllm-project#8991)
Update benchmark_serving.py to read and write json-datasets, results in UTF8, for better compatibility with Windows (vllm-project#8997)
Commits on Oct 2, 2024
Commits on Oct 3, 2024
[BugFix] Enforce Mistral ToolCall id constraint when using the Mistral tool call parser (vllm-project#9020)
Commits on Oct 4, 2024
Commits on Oct 5, 2024
Commits on Oct 6, 2024
[Bugfix] Fix try-catch conditions to import correct Flash Attention Backend in Draft Model (vllm-project#9101)
Commits on Oct 7, 2024
[Hardware][CPU] Cross-attention and Encoder-Decoder models support on CPU backend (vllm-project#9089)
Commits on Oct 8, 2024
[CI/Build] Add examples folder into Docker image so that we can leverage the templates*.jinja when serving models (vllm-project#8758)
Commits on Oct 9, 2024
Commits on Oct 10, 2024
[Core] Add an environment variable which needs to be set explicitly to allow BlockSpaceManagerV1 (vllm-project#9149)
[CI/Build] Make the Dockerfile.cpu file's PIP_EXTRA_INDEX_URL Configurable as a Build Argument (vllm-project#9252)
Commits on Oct 11, 2024
Commits on Oct 12, 2024
Commits on Oct 13, 2024
Commits on Oct 14, 2024
Commits on Oct 15, 2024
[Bugfix] Fix vLLM UsageInfo and logprobs None AssertionError with empty token_ids (vllm-project#9034)
Commits on Oct 16, 2024
[Bugfix][Kernel] Prevent integer overflow in fp8 dynamic per-token quantize kernel (vllm-project#9425)
Commits on Oct 17, 2024
Commits on Oct 18, 2024
[Model] Add user-configurable task for models that support both generation and embedding (vllm-project#9424)
Commits on Oct 19, 2024
Commits on Oct 20, 2024
Commits on Oct 21, 2024