This template demonstrates a FastAPI server with Retrieval-Augmented Generation (RAG). It uses LlamaIndex to retrieve relevant documents from an index, then generates accurate, context-aware answers with the OpenAI API.
- Query Endpoint: Accepts a user query, retrieves relevant information from the index via LlamaIndex, and generates a GPT-based response.
- Rebuild Index: Allows rebuilding the document index to reflect updated data.
- LlamaIndex Integration: Uses `llama_index` for efficient document storage and retrieval.
- GPT Response Generation: Leverages OpenAI's GPT API to generate contextually relevant answers.
- Clone the repository:

  ```shell
  git clone <repository_url>
  cd <repository_name>
  ```
- Install dependencies: Ensure you have Python 3.8+ and `pip` installed, then run:

  ```shell
  pip install -r requirements.txt
  ```
- Set up OpenAI API key: Add your OpenAI API key to the environment variable `OPENAI_API_KEY`. This can be done by adding the following line to your `.bashrc`, `.zshrc`, or a `.env` file:

  ```shell
  export OPENAI_API_KEY="your_openai_api_key"
  ```
- Run the FastAPI app with uvicorn:

  ```shell
  uvicorn src.app:app --reload
  ```

  The server will be available at http://127.0.0.1:8000.
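With the server running, the query endpoint can be exercised from Python. The `/query` path and the JSON body shape below are assumptions about this template's API:

```python
import json
import urllib.request

# Build a POST request against the (assumed) /query endpoint.
payload = json.dumps({"question": "What documents are indexed?"}).encode("utf-8")
req = urllib.request.Request(
    "http://127.0.0.1:8000/query",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the server running, send the request and read the JSON response:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```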