diff --git a/examples/usecases/RAG_based_LLM_serving/README.md b/examples/usecases/RAG_based_LLM_serving/README.md
index 17ff03b8c7..258305849b 100644
--- a/examples/usecases/RAG_based_LLM_serving/README.md
+++ b/examples/usecases/RAG_based_LLM_serving/README.md
@@ -359,7 +359,7 @@ The system architecture for the end-to-end solution using RAG based LLM serving
 ![RAG + LLM Deployment](https://raw.githubusercontent.com/pytorch/serve/master/examples/usecases/RAG_based_LLM_serving/assets/rag_llm.png "RAG + LLM Deployment")
 
 
-The steps for full deployment are mentioned in [Deploy.md](https://github.com/pytorch/serve/blob/master/examples/usecases/RAG_based_LLM_serving/Deploy.md)
+The steps for full deployment are described in the [Deployment Guide](https://github.com/pytorch/serve/blob/master/examples/usecases/RAG_based_LLM_serving/Deploy.md#Deploy-Llama-&-RAG-using-TorchServe).
 
 The code snippet which can chain the RAG endpoint with Llama endpoint is shown below
 
diff --git a/examples/usecases/RAG_based_LLM_serving/assets/rag_perf.png b/examples/usecases/RAG_based_LLM_serving/assets/rag_perf.png
index b9dc1cf320..fe7f2d2caa 100644
Binary files a/examples/usecases/RAG_based_LLM_serving/assets/rag_perf.png and b/examples/usecases/RAG_based_LLM_serving/assets/rag_perf.png differ