A NextJS application to chat with Ollama, Claude, or ChatGPT language models, with the option of using a RAG approach.
Some PDFs are not supported due to encoding issues; .txt and .md files work best.
- Normal query to the llm (without including files)
- Query to the llm including files (embeddings already calculated)

You can check out a video demo in the rag video showcase.
1. You should have ollama installed as a container (ollama installation instructions).
2. Install an ollama embedding model and configure it in the docker-compose.yaml file (environment: EMBEDDING_MODEL: embedding-model-name).

   2.1 To install the nomic-embed-text embedding model, execute the following:

        docker exec -it ollama bash
        ollama pull nomic-embed-text

   2.2 You can check the installed models with (from inside ollama's container):

        ollama list

3. Install a language model in ollama, for example DeepSeek's smallest model (inside ollama's container):

        ollama run deepseek-r1:1.5b

4. Create the nextjs container and the database container by executing the following in the root directory:

        docker compose up --build -d

5. Once the containers have been created, connect the ollama container to the docker network with the following command (a reachability sketch follows this list):

        docker network connect rag_app_network ollama
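As a quick sanity check of that last step, the nextjs container should now be able to reach ollama by its container name on the shared network. The sketch below is not part of the repo; it assumes ollama listens on its default port 11434 and uses Ollama's /api/tags endpoint to list installed models.

```ts
// Quick reachability check (illustrative only, not part of the repo).
// Run from inside the nextjs container, e.g. in a route handler or a small script.
async function listOllamaModels(): Promise<string[]> {
  // "ollama" resolves because both containers share rag_app_network (see the step above).
  const res = await fetch("http://ollama:11434/api/tags");
  if (!res.ok) throw new Error(`Ollama is not reachable: HTTP ${res.status}`);
  const data = (await res.json()) as { models: { name: string }[] };
  return data.models.map((m) => m.name);
}

// Should print something like ["nomic-embed-text:latest", "deepseek-r1:1.5b"].
listOllamaModels().then(console.log).catch(console.error);
```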
Open your browser and go to http://localhost:3000.
In the docker-compose.yaml file add your API keys:
ANTHROPIC_API_KEY: your-anthropic-api-key
OPENAI_API_KEY: your-openai-api-key
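These keys are ordinary environment variables, so inside the Next.js server code they can be read through process.env. The helper below is a minimal sketch with a name of my choosing, not the repo's actual code:

```ts
// Hypothetical helper: read the provider API keys injected via docker-compose.yaml.
// Only needed for the Claude / ChatGPT backends; ollama requires no key.
export function getProviderKeys(): { anthropicKey?: string; openaiKey?: string } {
  const anthropicKey = process.env.ANTHROPIC_API_KEY;
  const openaiKey = process.env.OPENAI_API_KEY;
  if (!anthropicKey && !openaiKey) {
    console.warn("No ANTHROPIC_API_KEY or OPENAI_API_KEY set; only ollama models will work.");
  }
  return { anthropicKey, openaiKey };
}
```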
Retrieval Augmented Generation (or RAG) combines language models with static data in an attempt to improve the precision and updatability of such systems.

We have three main components:
- Language model
- Embedding model
- Static information (e.g., files)
flowchart LR
info[Static information] -- embedding model --> emb_info[embedding representation<br>of the information]
query[user query] -- embedding model --> emb_query[embedding representation<br>of the query]
emb_info --> dist[pick near info embeddings<br> to query embedding]
emb_query --> dist
dist -- input:<br>(information from<br>closest embeddings) --> lm[Language model]
query -- input --> lm
- Using the embedding model, we process the document information to create embeddings that represent the information in an n-dimensional space.
- Initially I processed the whole file (bad performance); I then chose to split the contents into chunks of 8192 characters (file-utils.ts line 60). A chunking and retrieval sketch is shown after this list.
- When the user sends a query, an embedding is generated for the query.
- We compare the distance between the query embedding and the info chunk embeddings (in the n-dimensional space). The closer a chunk's embedding is to the query embedding, the more relevant its content should be to the query.
- Finally, we create the input to the language model using the relevant information with the following template (route.ts line 28); a prompt-building sketch follows below:
with the following context: {info chunks we retrieved} respond to the following question: {user's query}
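The chunking and retrieval steps above can be sketched roughly as follows. This is an illustrative simplification rather than the repo's actual file-utils.ts / route.ts code: the 8192-character chunk size matches the description above, but the function names, the Ollama endpoint (http://ollama:11434/api/embeddings), and the in-memory ranking (the project keeps its embeddings in the database container) are assumptions.

```ts
// Simplified sketch of the chunking + retrieval steps (not the repo's exact code).
// Assumes ollama is reachable over the docker network and EMBEDDING_MODEL is set
// as in docker-compose.yaml (e.g. nomic-embed-text).
const OLLAMA_URL = process.env.OLLAMA_URL ?? "http://ollama:11434";
const EMBEDDING_MODEL = process.env.EMBEDDING_MODEL ?? "nomic-embed-text";
const CHUNK_SIZE = 8192; // characters, as described above

// Split file contents into fixed-size character chunks.
export function chunkText(text: string, size: number = CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

// Ask the Ollama embedding model for the embedding of a piece of text.
export async function embed(text: string): Promise<number[]> {
  const res = await fetch(`${OLLAMA_URL}/api/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: EMBEDDING_MODEL, prompt: text }),
  });
  if (!res.ok) throw new Error(`Embedding request failed: HTTP ${res.status}`);
  const data = (await res.json()) as { embedding: number[] };
  return data.embedding;
}

// Cosine similarity between two embeddings: higher means closer in the n-dimensional space.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank the stored chunk embeddings against the query embedding and keep the top k.
export async function retrieveRelevantChunks(
  chunks: { text: string; embedding: number[] }[],
  query: string,
  k = 3,
): Promise<string[]> {
  const queryEmbedding = await embed(query);
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((c) => c.text);
}
```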
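And a minimal sketch of the final step: filling in the template above and sending the result to the language model. The function names are mine and the actual prompt construction lives around route.ts line 28; the Ollama /api/generate call stands in for whichever backend (ollama, Claude, or ChatGPT) is selected in the app.

```ts
// Hypothetical helpers: assemble the prompt from the template above and query the model.
// Fills in: "with the following context: ... respond to the following question: ..."
export function buildPrompt(relevantChunks: string[], userQuery: string): string {
  return (
    `with the following context: ${relevantChunks.join("\n\n")} ` +
    `respond to the following question: ${userQuery}`
  );
}

// Send the assembled prompt to an Ollama model, e.g. deepseek-r1:1.5b from the setup steps.
export async function askModel(prompt: string, model = "deepseek-r1:1.5b"): Promise<string> {
  const res = await fetch("http://ollama:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  if (!res.ok) throw new Error(`Generation request failed: HTTP ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response;
}

// Assumed flow:
// const answer = await askModel(buildPrompt(await retrieveRelevantChunks(chunks, query), query));
```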
If you're hosting the project on a server behind a reverse proxy like Nginx, you may get HTTP 413 (Content Too Large) errors.
To fix this, you can edit the server block to add:
server {
# ...
client_max_body_size 50M; # Set the desired size limit
# ...
}
If you face something similar to the following error when trying to run docker compose up --build -d:
...
failed to resolve source metadata for docker.io/library/node:18-alpine: error getting credentials - err: exec: "docker-credential-desktop": executable file not found in $PATH, out:
...

You may want to edit the file ~/.docker/config.json and remove the "credsStore" attribute from it.