fix readme

GoogleCloudPlatform · Mar 8, 2024 · 961b0c7
1 parent af4bec4
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/benchmarks/inference-server/triton/README.md b/benchmarks/inference-server/triton/README.md
@@ -170,7 +170,7 @@ terraform apply
 | `model_id`           | Model used for inference.                                                                     | String  | `"meta-llama/Llama-2-7b-chat-hf"`         | No       |
 | `gpu_count`          | Parallelism based on number of gpus.                                                          | Number  | `1`                                       | No       |
 | `ksa`                | Kubernetes Service Account used for workload.                                                 | String  | `"default"`                               | No       |
-| `huggingface-secret` | Name of the kubectl huggingface secret token                                                  | String  | `"huggingface-secret"`                    | Yes       |
+| `huggingface_secret` | Name of the kubectl huggingface secret token                                                  | String  | `"huggingface-secret"`                    | Yes       |
 | `gcs_model_path`     | Path where model engine in gcs will be read from.     | String  |    null                                | Yes      |
 | `server_launch_command_string`     | Command to launc the Triton Inference Server     | String  |   "pip install sentencepiece protobuf && huggingface-cli login --token $HUGGINGFACE_TOKEN && /opt/tritonserver/bin/tritonserver --model-repository=/all_models/inflight_batcher_llm --disable-auto-complete-config --backend-config=python,shm-region-prefix-name=prefix0_"                                 | No      |