A customizable, pluggable, and scalable Retrieval-Augmented Generation (RAG) framework designed to streamline the process of building AI-powered applications that require knowledge-grounded text generation. This framework provides end-to-end components for retrieval, processing, and generation, while allowing users to plug in custom modules at every stage.
- Pluggable Design: Easily integrate custom modules for retrieval, preprocessing, and generation.
- Framework-Agnostic: Compatible with popular frameworks such as LangChain and LlamaIndex.
- Customizable Pipelines: Configure workflows tailored to your use case (e.g., question answering, summarization, etc.).
- Scalability: Deploy locally or in distributed cloud environments.
- Pre-built Modules: Includes commonly used retrieval and generation backends for quick setup.
- Integration Friendly: Supports REST APIs, message queues, and file-based inputs/outputs.
The default implementation of the RAG framework is designed for optimal performance with the following stack:
- Pipeline: Haystack, an AI orchestration framework for building customizable, production-ready LLM applications
- Vector Database: Milvus, a cloud-native vector database used for embedding storage
- Embedding Model: `sentence-transformers/all-MiniLM-L12-v2`, a pre-trained embedding model from Hugging Face used for vectorization
- LLM: `mistral-7b-v0.1`
- Document Parser: Docling, which parses documents and exports them to the desired format with ease and speed
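To make the default stack concrete, here is a minimal sketch of how these components could be wired together using Haystack 2.x and the `milvus-haystack` integration. This is an illustrative assumption, not PaRAGon's actual pipeline code; the endpoints, file paths, and the use of `OpenAIGenerator` against an OpenAI-compatible server are placeholders.

```python
# Illustrative sketch only -- not PaRAGon's actual pipeline code.
# Assumes Haystack 2.x, the milvus-haystack integration, and an
# OpenAI-compatible endpoint serving mistral-7b-v0.1.
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret
from milvus_haystack import MilvusDocumentStore
from milvus_haystack.milvus_embedding_retriever import MilvusEmbeddingRetriever

# Milvus Lite file by default; swap the URI for a hosted Milvus endpoint.
document_store = MilvusDocumentStore(connection_args={"uri": "./milvus_demo.db"})

template = """Answer the question using only the context below.
Context:
{% for doc in documents %}{{ doc.content }}
{% endfor %}
Question: {{ query }}
Answer:"""

rag = Pipeline()
rag.add_component(
    "embedder",
    SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L12-v2"),
)
rag.add_component("retriever", MilvusEmbeddingRetriever(document_store=document_store, top_k=3))
rag.add_component("prompt", PromptBuilder(template=template))
rag.add_component(
    "llm",
    OpenAIGenerator(
        api_base_url="http://localhost:8000/v1",   # placeholder local serving endpoint
        api_key=Secret.from_token("not-needed"),   # local servers typically ignore the key
        model="mistral-7b-v0.1",
    ),
)

rag.connect("embedder.embedding", "retriever.query_embedding")
rag.connect("retriever.documents", "prompt.documents")
rag.connect("prompt.prompt", "llm.prompt")

question = "What does the example PDF describe?"
result = rag.run({"embedder": {"text": question}, "prompt": {"query": question}})
print(result["llm"]["replies"][0])
```

Indexing (embedding the parsed documents and writing them into Milvus) would happen in a separate step before queries return useful context.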
Python 3.9+
Clone the repository:
git clone git@github.com:redhat-et/PaRAGon.git
cd PaRAGon
Install dependencies:
pip install -r requirements.txt
We have a simple demo script, `test/sanity_test.py`, that you can run to test the RAG pipeline.
Before running the script, ensure that you update the following variables in the `paragon/settings.py` file:
- `milvus_server_url` - Your hosted Milvus vector DB endpoint; if left at the default, an in-memory local deployment of Milvus Lite is used.
- `llm_base_url` - The LLM serving endpoint you would like to interact with. It can be either a locally running LLM or a hosted LLM.
- `llm` - The model name of the LLM being served at the URL specified above.
Optionally, you can change other settings in the `paragon/settings.py` script to suit your needs, such as `retrieval_embedding_model` for your choice of embedding model and `top_k` for the number of search results to return.
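For illustration, the relevant entries in `paragon/settings.py` might look roughly like this; the variable names come from the description above, but the values are placeholders to adjust for your environment:

```python
# Placeholder values -- adjust for your environment.
milvus_server_url = "./milvus_demo.db"        # local/in-memory Milvus Lite by default; point this at your hosted Milvus endpoint instead
llm_base_url = "http://localhost:8000/v1"     # LLM serving endpoint (local or hosted)
llm = "mistral-7b-v0.1"                       # model name served at llm_base_url
retrieval_embedding_model = "sentence-transformers/all-MiniLM-L12-v2"
top_k = 5                                     # number of search results to return
```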
Once all the settings have been updated, you can run the following:
python test/sanity_test.py
The `test/sanity_test.py` script includes example PDFs that are processed and converted into JSON format using Docling. These JSON files are then indexed as embeddings in the Milvus vector database. The script also provides sample queries that are run through the RAG pipeline, allowing you to observe the generated responses.
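As a rough sketch of the Docling step described above (not the script's actual code), converting a PDF into JSON for later indexing could look like this; the file names are placeholders:

```python
import json

from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("example.pdf")      # parse the PDF with Docling
doc_dict = result.document.export_to_dict()    # export the parsed document as a dict

# Write the JSON that would subsequently be embedded and indexed into Milvus.
with open("example.json", "w") as f:
    json.dump(doc_dict, f, indent=2)
```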