Spring AI Retrieval Augmented Generation with Open AI

Introduction

Retrieval Augmented Generation (RAG) is a technique that integrates your data into the AI model's responses.

First, you need to upload the documents you wish to have analyzed in an AI respoinse into a Vector Database. This involves breaking down the documents into smaller segments because AI models typically only manage to process a few tens of kilobytes of custom data for generating responses. After splitting, these document segments are stored in the Vector Database.

The second step involves including data from the Vector Database that is pertinent to your query when you make a request to the AI model. This is achieved by performing a similarity search within the Vector Database to identify relevant content.

In the third step, you merge the text of your request with the documents retrieved from the Vector Database before sending it to the AI model. This process is informally referred to as 'stuffing the prompt'.

This project demonstrates Retrieval Augmented Generation in practice and can serve as the foundation for customizing to meet your specific requirements in your own project.

Endpoints

This project contains a web service with the following endpoints under http://localhost:8080

POST /data/load
GET /data/count
POST /data/delete
GET /qa

The /qa endpoint takes a question parameter which is the question you want to ask the AI model. The /qa endpoint also takes a stuffit boolean parameter, whose default it true, that will 'stuff the prompt' with similar documents to the question. When stuffing the prompt, this follows the RAG pattern.

Prerequisites

OpenAI Credentials

Create an account at OpenAI Signup and generate the token at API Keys.

The Spring AI project defines a configuration property named spring.ai.openai.api-key that you should set to the value of the API Key obtained from openai.com.

You can set this in the projects /resources/application.yml file or by exporting an environment variable, for example.

export SPRING_AI_OPENAI_API_KEY=<INSERT KEY HERE>

Note, the /resources/application.yml references the environment variable ${SPRING_AI_OPENAI_API_KEY}.

VectorStore

To run the PgVectorStore locally, using docker-compose. From the top project directory and run:

docker-compose up

Later starts Postgres DB on localhost and port 5432.

Then you can connect to the database (password: postgres) and inspect or alter the vector_store table content:

psql -U postgres -h localhost -p 5432

\l
\c vector_store
\dt

select count(*) from vector_store;

delete from vector_store;

You can connect to the pgAdmin on http://localhost:5050 as user: [email protected] and pass: admin. Then navigate to the Databases/vector_store/Schemas/public/Tables/vector_store.

The UI tool DBeaver is also a useful GUI for postgres.

Building and running

./mvnw spring-boot:run

Access the endpoints

The first thing you should do is load the data. The examples show usage with the HTTPie command line utility as it simplifies sending HTTP requests with data as compared to curl.

Loading, counting and deleting data

http POST http://localhost:8080/data/load

Next you can see how many document fragments were loaded into the Vector Store using

http http://localhost:8080/data/count

If you want to start over, for example because you changed in the code which document is being loaded, then execute

http POST http://localhost:8080/data/delete

Q&A over the document using QueryEngine

Send your question to the QueryEngine using

http --body --unsorted localhost:8080/qa/engine question=="What is the purpose of Carina?"

The response is

{
    "question": "What is the purpose of Carina?",
    "answer": "The purpose of Carina is to provide a safe, easy-to-use, online location-based care matching service. It serves individuals and families searching for home care or child care, as well as care professionals looking for good jobs. Carina is committed to building community and prioritizing people over profit."
}

Q&A over the document

Send you question to the AI Model using

http --body --unsorted localhost:8080/qa message==<insert question here>

Note that there are two equal signs == , that separate the key-value pairs

To ask the same question but without the similar documents stuffing the prompt, that is, not using the RAG pattern,

http --body --unsorted http://localhost:8080/qa message==<insert question here> stuffit==false

Examples

$ http --body --unsorted localhost:8080/qa question=="What is the purpose of Carina?"
{
    "question": "What is the purpose of Carina?",
    "answer": "The purpose of Carina is to provide a safe and easy-to-use online care matching service. It aims to connect care providers with individuals and families who are in need of home care or child care services. Carina prioritizes building community and supporting care workers by bringing good jobs to them. Its goal is to strengthen the care economy and support workers, individuals, and families in the process."
}

and without stuffing the prompt

$ http --body --unsorted localhost:8080/qa question=="What is the purpose of Carina?" stuffit==false
{
    "question": "What is the purpose of Carina?",
    "answer": "Carina is a constellation located in the southern sky. It does not have a specific purpose, but like other constellations, it serves as a way to organize and identify stars in the night sky. Constellations have been used for navigation, storytelling, and scientific observation throughout history."
}

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.mvn/wrapper		.mvn/wrapper
src/main		src/main
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
mvnw		mvnw
mvnw.cmd		mvnw.cmd
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Spring AI Retrieval Augmented Generation with Open AI

Introduction

Endpoints

Prerequisites

OpenAI Credentials

VectorStore

Building and running

Access the endpoints

Loading, counting and deleting data

Q&A over the document using QueryEngine

Q&A over the document

Examples

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

rd-1-2022/ai-openai-rag

Folders and files

Latest commit

History

Repository files navigation

Spring AI Retrieval Augmented Generation with Open AI

Introduction

Endpoints

Prerequisites

OpenAI Credentials

VectorStore

Building and running

Access the endpoints

Loading, counting and deleting data

Q&A over the document using QueryEngine

Q&A over the document

Examples

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages