
Commit

removing --enable-chunked-prefill
arakowsk-amd authored Dec 17, 2024
1 parent 0881810 commit c75a4c4
Showing 1 changed file with 2 additions and 4 deletions.
docs/dev-docker/README.md: 2 additions & 4 deletions
@@ -261,8 +261,7 @@ Benchmark Meta-Llama-3.1-405B FP8 with input 128 tokens, output 128 tokens and t
     --num-scheduler-steps 10 \
     --tensor-parallel-size 8 \
     --input-len 128 \
-    --output-len 128 \
-    --enable-chunked-prefill false
+    --output-len 128

If you want to run Meta-Llama-3.1-405B FP16, please run

@@ -278,8 +277,7 @@ If you want to run Meta-Llama-3.1-405B FP16, please run
     --output-len 128 \
     --swap-space 16 \
     --max-model-len 8192 \
-    --max-num-batched-tokens 65536 \
-    --enable-chunked-prefill false
+    --max-num-batched-tokens 65536

For fp8 quantized Llama 3.1 8B/70B models:

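The commit message does not state why the flag was dropped, but a plausible motivation (an assumption, not confirmed by the diff) is that `--enable-chunked-prefill` behaves like a store-true boolean switch: passing a literal `false` after it is not consumed as the flag's value and is instead rejected as an unrecognized positional argument. A minimal Python argparse sketch of that failure mode, using a hypothetical parser that mirrors the flag:

```python
import argparse

# Hypothetical parser modeling a store-true boolean flag, as an
# illustration only; the real CLI's argument definitions may differ.
parser = argparse.ArgumentParser()
parser.add_argument("--enable-chunked-prefill", action="store_true")
parser.add_argument("--output-len", type=int)

# The bare flag parses fine and sets the option to True.
args = parser.parse_args(["--output-len", "128", "--enable-chunked-prefill"])
print(args.enable_chunked_prefill)  # True

# A trailing "false" is not consumed by the flag: argparse reports
# an unrecognized argument and exits with an error.
try:
    parser.parse_args(["--enable-chunked-prefill", "false"])
except SystemExit:
    print("rejected: 'false' is treated as an unrecognized argument")
```

Removing the trailing `false` (as this commit does) leaves a command that parses cleanly whether the flag is boolean-valued or a plain switch.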
