In this repo, we demonstrate how to use MinIO to build a Retrieval Augmented Generation (RAG) chat application on commodity hardware.
- Use MinIO to store the documents, the processed chunks, and the embeddings in a vector database.
- Use MinIO's bucket notification feature to trigger events when documents are added to or removed from a bucket
- A webhook that consumes the event, processes the documents using LangChain, and saves the metadata and chunked documents to a metadata bucket
- Trigger MinIO bucket notification events for newly added or removed chunked documents
- A webhook that consumes the events, generates embeddings, and saves them to the vector database (LanceDB), which is persisted in MinIO
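The webhooks above receive S3-style notification payloads from MinIO. A minimal sketch of parsing such a payload into `(event, bucket, key)` tuples — the field names are assumed from the S3 event notification format, and `parse_minio_event` is an illustrative helper, not part of this repo:

```python
from urllib.parse import unquote_plus

def parse_minio_event(payload: dict):
    """Extract (event_name, bucket, object_key) tuples from a MinIO
    bucket-notification payload (S3-style event format)."""
    events = []
    for record in payload.get("Records", []):
        event_name = record.get("eventName", "")
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name", "")
        # Object keys arrive URL-encoded (spaces as '+', '/' as '%2F')
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        events.append((event_name, bucket, key))
    return events

sample = {
    "Records": [{
        "eventName": "s3:ObjectCreated:Put",
        "s3": {"bucket": {"name": "documents"},
               "object": {"key": "reports%2Fq1+summary.pdf"}},
    }]
}
print(parse_minio_event(sample))
# [('s3:ObjectCreated:Put', 'documents', 'reports/q1 summary.pdf')]
```

In the application, a FastAPI route body would call a parser like this and dispatch to the chunking or embedding pipeline depending on whether the event is a create or a delete.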
- MinIO - Object store to persist all the data
- LanceDB - Serverless open-source vector database that persists data in an object store
- Ollama - To run the LLM and embedding model locally (OpenAI API compatible)
- Gradio - Interface through which to interact with the RAG application
- FastAPI - Server for the webhooks that receive bucket notifications from MinIO; also exposes the Gradio app
- LangChain & Unstructured - To extract useful text from our documents and chunk it for embedding
- LLM - Phi-3-128K (3.8B Parameters)
- Embeddings - Nomic Embed Text v1.5 (Matryoshka Embeddings/ 768 Dim, 8K context)
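Nomic Embed's Matryoshka property means the 768-dim vectors can be truncated to a shorter prefix and renormalized, trading a little retrieval quality for less storage in LanceDB. A sketch of that truncation step (the function name is illustrative, not from this repo):

```python
import math

def truncate_matryoshka(vec, dim):
    """Keep the first `dim` components of a Matryoshka embedding and
    renormalize to unit length, so cosine similarity still works."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

# Toy 4-dim vector truncated to 2 dims and renormalized
print(truncate_matryoshka([3.0, 4.0, 0.0, 0.0], 2))  # [0.6, 0.8]
```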
Install the required packages:

```shell
pip install -r requirements.txt
```
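Bucket notifications like those described above are typically wired up with the MinIO client (`mc`). A configuration sketch — the alias, endpoint, credentials, bucket, and webhook-target names are all assumptions; substitute the values from your deployment:

```shell
# Point mc at the local MinIO server (credentials are placeholders)
mc alias set myminio http://localhost:9000 minioadmin minioadmin

# Fire webhook events when objects are added to or removed from the bucket;
# the ARN refers to a webhook target configured on the MinIO server
mc event add myminio/documents arn:minio:sqs::docs-webhook:webhook \
  --event put,delete
```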
You can follow the steps described in the Notebook to run the application.