This repository contains a local Retrieval-Augmented Generation (RAG) system designed as a learning project. The goal is to build a simplified RAG system from scratch in order to understand the underlying components and their interactions. It is not intended to be a fully optimized or production-ready solution; rather, it serves as an educational tool for exploring concepts such as embedding, vector storage, and query handling.
- Docker and Docker Compose installed on your machine.
- An OpenAI API key, since the models currently used for embeddings and generation are OpenAI-based.
Before starting, ensure you have set the following environment variables:
export OPENAI_API_KEY="your-openai-api-key"

Alternatively, include this variable in a local .env file.
Replace your-openai-api-key with your actual OpenAI API key.
- Navigate to the project directory:
- Build and start the Docker containers:
docker compose up -d
A tutorial on how the database is set up and explanations for the settings can be found here.
Copy the example configuration file and edit it to specify your markdown source paths:
cp project_config_example.yaml project_config.yaml

Open project_config.yaml and replace the example paths with the paths to your desired markdown sources. Ensure the paths are correctly indented as shown in the example file.
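Once the config is loaded (e.g. with PyYAML's `yaml.safe_load`), the configured paths can be sanity-checked before ingestion. The sketch below assumes the parsed config is a dict; the `markdown_sources` key is a hypothetical name, so check project_config_example.yaml for the real schema:

```python
from pathlib import Path

def validate_source_paths(config: dict) -> list[Path]:
    """Return the configured markdown paths, raising if any are missing on disk."""
    # "markdown_sources" is a placeholder key; the actual structure is defined
    # in project_config_example.yaml.
    paths = [Path(p).expanduser() for p in config.get("markdown_sources", [])]
    missing = [p for p in paths if not p.exists()]
    if missing:
        raise FileNotFoundError(f"Configured markdown sources not found: {missing}")
    return paths
```

Failing fast here is cheaper than discovering a typo'd path halfway through embedding.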
Set up a local virtual environment and install the Python requirements using uv:
uv sync

This command will create a virtual environment and install all dependencies specified in the pyproject.toml file.
Alternatively, you can use another tool like poetry to install the needed libraries found as dependencies in the pyproject.toml file.
- Run the database setup script:
python rag_setup.py
This script will populate the RAG database with the necessary data.
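To give a rough idea of the kind of work this step performs, here is a simplified chunking sketch: the markdown sources are split into pieces before each piece is embedded and stored. The actual script's logic and names may differ; this is an illustration, not the project's implementation:

```python
def chunk_markdown(text: str, max_chars: int = 1000) -> list[str]:
    """Split a markdown document into heading-delimited chunks of bounded size.

    Simplified illustration: the real setup script may chunk differently
    before embedding each piece and writing it to the vector database.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in text.splitlines():
        # Start a new chunk at each heading, or when the size budget is exceeded.
        if (line.startswith("#") or size + len(line) > max_chars) and current:
            chunks.append("\n".join(current).strip())
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]
```

Chunking at heading boundaries keeps each stored embedding focused on a single topic, which tends to improve retrieval quality over embedding whole files.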
- Run the frontend script to start the Gradio application:
python frontend.py
This will start a local server where you can interact with the chat application. Keep in mind that every query retrieves the most similar embedding from the database and generates an answer, even when no stored embedding is actually relevant to the query.
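The retrieval step can be pictured as a nearest-neighbour lookup over the stored embeddings. The pure-Python sketch below (the project delegates this to the database) shows why irrelevant queries still produce an answer: the function always returns *some* index, even when the best similarity score is poor:

```python
import math

def most_similar(query_vec: list[float], stored: list[list[float]]) -> int:
    """Return the index of the stored embedding with the highest cosine
    similarity to the query vector."""
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0
    # max() always yields an index, regardless of how low the best score is.
    return max(range(len(stored)), key=lambda i: cosine(query_vec, stored[i]))
```

A production system would typically apply a similarity threshold and decline to answer when the top match falls below it; this learning project deliberately omits that step.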
Once the setup is complete, open your browser and navigate to the local server URL provided by Gradio to start using the chat application.