Disable auto enabling chunked prefill on ROCm platform on long contexts due to poor performance (#324)

Signed-off-by: Gregory Shtrasberg <[email protected]>
gshtras authored Dec 12, 2024
1 parent 7efa6e0 commit 405e730
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion vllm/engine/arg_utils.py
@@ -1063,7 +1063,8 @@ def create_engine_config(self,
     if (is_gpu and not use_sliding_window and not use_spec_decode
             and not self.enable_lora
             and not self.enable_prompt_adapter
-            and model_config.task != "embedding"):
+            and model_config.task != "embedding"
+            and not current_platform.is_rocm()):
         self.enable_chunked_prefill = True
         logger.warning(
             "Chunked prefill is enabled by default for models with "
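
The effect of the added clause can be illustrated with a small standalone sketch. It does not import vLLM: `PrefillInputs` and `auto_enable_chunked_prefill` are hypothetical names introduced here, and the fields simply stand in for the values the patched condition reads (`is_rocm` stands in for `current_platform.is_rocm()`). It assumes, per the commit title, that this condition is only reached on the long-context path, which lies outside the hunk shown.

```python
from dataclasses import dataclass


@dataclass
class PrefillInputs:
    """Hypothetical stand-in for the values the patched condition reads."""
    is_gpu: bool
    use_sliding_window: bool = False
    use_spec_decode: bool = False
    enable_lora: bool = False
    enable_prompt_adapter: bool = False
    task: str = "generate"
    is_rocm: bool = False  # stands in for current_platform.is_rocm()


def auto_enable_chunked_prefill(inputs: PrefillInputs) -> bool:
    """Mirror the condition after this commit: every clause must hold."""
    return (inputs.is_gpu
            and not inputs.use_sliding_window
            and not inputs.use_spec_decode
            and not inputs.enable_lora
            and not inputs.enable_prompt_adapter
            and inputs.task != "embedding"
            and not inputs.is_rocm)  # clause added by this commit


# Same configuration, differing only in platform.
print(auto_enable_chunked_prefill(PrefillInputs(is_gpu=True)))
print(auto_enable_chunked_prefill(PrefillInputs(is_gpu=True, is_rocm=True)))
```

With otherwise identical inputs, the first call reports that chunked prefill would be auto-enabled and the second, on ROCm, reports that it would not.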
