diff --git a/examples/usecases/RAG_based_LLM_serving/README.md b/examples/usecases/RAG_based_LLM_serving/README.md
index 17ff03b8c7..258305849b 100644
--- a/examples/usecases/RAG_based_LLM_serving/README.md
+++ b/examples/usecases/RAG_based_LLM_serving/README.md
@@ -359,7 +359,7 @@ The system architecture for the end-to-end solution using RAG based LLM serving
 
 ![RAG + LLM Deployment](https://raw.githubusercontent.com/pytorch/serve/master/examples/usecases/RAG_based_LLM_serving/assets/rag_llm.png "RAG + LLM Deployment")
 
-The steps for full deployment are mentioned in [Deploy.md](https://github.com/pytorch/serve/blob/master/examples/usecases/RAG_based_LLM_serving/Deploy.md)
+The steps for full deployment are mentioned in [Deployment Guide](https://github.com/pytorch/serve/blob/master/examples/usecases/RAG_based_LLM_serving/Deploy.md#Deploy-Llama-&-RAG-using-TorchServe)
 
 The code snippet which can chain the RAG endpoint with Llama endpoint is shown below
 
diff --git a/examples/usecases/RAG_based_LLM_serving/assets/rag_perf.png b/examples/usecases/RAG_based_LLM_serving/assets/rag_perf.png
index b9dc1cf320..fe7f2d2caa 100644
Binary files a/examples/usecases/RAG_based_LLM_serving/assets/rag_perf.png and b/examples/usecases/RAG_based_LLM_serving/assets/rag_perf.png differ