From e1f00f69e544ffd03c3840e7eec2bc938bf8efec Mon Sep 17 00:00:00 2001
From: Matthias Reso <13337103+mreso@users.noreply.github.com>
Date: Wed, 18 Sep 2024 01:08:00 +0000
Subject: [PATCH] Update vllm/lora readme

---
 examples/large_models/vllm/lora/Readme.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/large_models/vllm/lora/Readme.md b/examples/large_models/vllm/lora/Readme.md
index c592f23a73..0469d53567 100644
--- a/examples/large_models/vllm/lora/Readme.md
+++ b/examples/large_models/vllm/lora/Readme.md
@@ -55,7 +55,7 @@ The vllm integration uses an OpenAI compatible interface which lets you perform
 Curl:
 
 ```bash
-curl --header "Content-Type: application/json" --request POST --data @prompt.json http://localhost:8080/predictions/llama-8b-lora/1.0/v1
+curl --header "Content-Type: application/json" --request POST --data @prompt.json http://localhost:8080/predictions/llama-8b-lora/1.0/v1/completions
 ```
 
 Python + Request: