This project is a simple implementation of Retrieval Augmented Generation (RAG) using the Langchain and Chainlit frameworks. Data Helper is a bot that enables users to ask questions about the content of a webpage.
| Tools Used | Usage |
|---|---|
| Mistral 7B Instruct V2 | LLM for inference |
| Langchain | To chain various preprocessing steps |
| Cohere | Text embedding |
| ChromaDB | Vector store |
| Chainlit | Front-end interface |
- Extract web content from a URL using Langchain's `WebBaseLoader`.
- Split the web content into chunks using Langchain's `RecursiveCharacterTextSplitter`.
- Obtain a vector representation (embedding) for each chunk using Cohere's embedding model.
- Store these vectors in a vector store (e.g. ChromaDB).
- Compare the vector representation of the user's input (query) with all vectors in the vector store and retrieve the chunks corresponding to the top few most similar vectors.
- Feed these retrieved chunks into the large language model's prompt template as additional context, along with the user's query.
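The pipeline above can be sketched end to end in plain Python. This is a toy stand-in, not the project's actual code: a naive character splitter replaces `RecursiveCharacterTextSplitter`, a bag-of-words counter replaces Cohere's embedding model, and an in-memory list with cosine similarity replaces ChromaDB.

```python
import math
from collections import Counter

def split_text(text, chunk_size=40, chunk_overlap=8):
    """Naive character splitter standing in for RecursiveCharacterTextSplitter."""
    chunks, step = [], chunk_size - chunk_overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

def embed(text):
    """Toy bag-of-words 'embedding' standing in for Cohere's embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    """Return the top-k chunks most similar to the query (in-memory 'vector store')."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Hypothetical page text for illustration only.
page = ("Joel is a survivor. Joel dies in Part II. "
        "Ellie seeks revenge. The game is set in Seattle.")
chunks = split_text(page)
context = retrieve("How did Joel die?", chunks, k=2)
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: How did Joel die?"
```

In the real project, `embed` calls Cohere's API and the chunks live in ChromaDB, but the shape of the flow (split, embed, store, compare, retrieve, prompt) is the same.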
In this demo, I supplied The Last of Us fandom wiki page (https://thelastofus.fandom.com/wiki/The_Last_of_Us_Part_II) to the bot and asked the question "How did Joel die?". The retriever returned the 6 most similar chunks, which were fed to the prompt template as additional context.
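The way retrieved chunks are interpolated into a prompt might look like the sketch below. The template text and the chunk contents here are hypothetical, for illustration only; the project's actual template may differ.

```python
# Hypothetical prompt template; the actual template used in the project may differ.
TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(chunks, question):
    """Join the retrieved chunks and substitute them into the template."""
    context = "\n\n".join(chunks)
    return TEMPLATE.format(context=context, question=question)

# Example retrieved chunks (made up for this sketch).
retrieved = ["Joel is killed by Abby early in Part II.",
             "Ellie sets out to avenge Joel's death."]
prompt = build_prompt(retrieved, "How did Joel die?")
```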
The prompt fed to the LLM is shown below:
The Chainlit framework was employed to create a user interface, allowing users to input their queries.
I've noticed that parameters like chunk size, chunk overlap, and the prompt template significantly impact the generated output. I intend to conduct experiments with various parameters to improve the output further.
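One way to run such experiments is to sweep chunk size and overlap and observe how they change the number (and redundancy) of chunks produced. The naive splitter below is a stand-in for `RecursiveCharacterTextSplitter`, just to show the trade-off: larger overlap means more chunks (more redundancy), larger chunk size means fewer, coarser chunks.

```python
def split_text(text, chunk_size, chunk_overlap):
    """Naive character splitter standing in for RecursiveCharacterTextSplitter."""
    chunks, step = [], chunk_size - chunk_overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# Sweep a few (chunk_size, overlap) settings over a 1000-character document.
text = "x" * 1000
counts = {}
for chunk_size, overlap in [(200, 0), (200, 50), (500, 100)]:
    n = len(split_text(text, chunk_size, overlap))
    counts[(chunk_size, overlap)] = n
    print(f"chunk_size={chunk_size} overlap={overlap} -> {n} chunks")
```

Whether a given setting improves answers still has to be judged on the generated output, but a sweep like this makes the retrieval-side effects of each parameter visible.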