RAGViz (Retrieval Augmented Generation Visualization) is a tool that visualizes both document-level and token-level attention on the retrieved context fed to the LLM to ground answer generation.
- RAGViz provides an add/remove document functionality to compare the generated tokens when certain documents are not included in the context.
- Combining both functionalities lets users diagnose the effectiveness and influence of specific retrieved documents or sections of text on the LLM's answer generation.
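As a rough illustration of the token-level attention signal RAGViz visualizes, the sketch below collects attention scores over the context from a HuggingFace causal LM during generation. The prompt layout, model id, and layer/head averaging scheme are illustrative assumptions, not RAGViz's exact pipeline.

```python
# Hedged sketch: collect attention from generated tokens back onto the retrieved
# context using the HuggingFace transformers library.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-chat-hf"  # assumed backbone for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    attn_implementation="eager",  # ensures attention weights can be returned
)

context = "Retrieved document text goes here."
question = "What does the document say?"
prompt = f"{context}\n\nQuestion: {question}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=32,
    output_attentions=True,
    return_dict_in_generate=True,
)

# outputs.attentions holds, for each generation step, a tuple of per-layer
# attention tensors. Averaging over layers and heads gives one score per
# context token for each generated token.
num_context_tokens = inputs["input_ids"].shape[1]
first_step = outputs.attentions[0]                     # attentions for the first generated token
avg = torch.stack(first_step).mean(dim=(0, 1, 2))      # average over layers, batch, heads
context_scores = avg[-1, :num_context_tokens]          # attention from the new token to the context
```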
A basic demonstration of RAGViz is available here.
The following are the system configurations of our RAGViz demonstration:
- The Pile-CC English documents are used for retrieval
- Documents are partitioned into 4 DiskANN indexes on separate nodes, each with ~20 million documents
- Documents are embedded into feature vectors using AnchorDR. To use AnchorDR in RAGViz, you must follow the installation instructions on the repo here to ensure your Python environment is set up correctly; do this after running `pip install -r backend/requirements.txt`. A hedged sketch of the embedding step is included after this list.
- LLaMA-2 generation and attention output are done with vLLM and the HuggingFace transformers library
- The frontend UI is adapted from the Lepton search engine
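The sketch below shows one way the document-embedding step above could look. The model path, pooling strategy, and normalization choice are assumptions for illustration; follow the AnchorDR repo's installation instructions for the real setup.

```python
# Hedged sketch of encoding documents (or a query) into dense feature vectors
# with a generic HuggingFace encoder. The AnchorDR checkpoint path and mean
# pooling below are assumptions, not RAGViz's exact code.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_PATH = "path/to/AnchorDR"  # hypothetical local path to the AnchorDR checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
encoder = AutoModel.from_pretrained(MODEL_PATH)

def embed(texts: list[str]) -> torch.Tensor:
    """Encode a batch of texts into normalized dense vectors."""
    batch = tokenizer(texts, padding=True, truncation=True, max_length=128,
                      return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state         # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)             # ignore padding tokens
    vectors = (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # mean pooling (assumed)
    return torch.nn.functional.normalize(vectors, dim=-1)

# Vectors produced this way are what get sharded across the DiskANN indexes
# and compared against the embedded query at retrieval time.
```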
You can modify the snippets used for context in RAG by adding a new file and class in `backend/snippet`, adding it to `backend/ragviz.py` and `frontend/src/app/components/search.tsx`. We currently offer the following snippets (a hypothetical sketch of a new snippet class follows the list):
- Naive First: represent a document with its first 128 tokens
- Sliding Window: compute the inner product similarity between the query and each 128-token window of the document, and use the window most similar to the query to represent the document
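The sketch below illustrates what a new snippet class implementing the sliding-window strategy might look like. The class name, constructor arguments, and method signature are assumptions; mirror an existing class in `backend/snippet` and register it in `backend/ragviz.py` and `frontend/src/app/components/search.tsx` as described above.

```python
# Hypothetical snippet class for backend/snippet/ (interface is assumed).
import torch

class SlidingWindowSnippet:
    """Pick the 128-token window of a document most similar to the query."""

    def __init__(self, tokenizer, encoder, window_size: int = 128, stride: int = 64):
        self.tokenizer = tokenizer
        self.encoder = encoder          # assumed callable: token ids -> 1-D feature vector
        self.window_size = window_size
        self.stride = stride

    def snippet(self, document: str, query_vector: torch.Tensor) -> str:
        token_ids = self.tokenizer(document, add_special_tokens=False)["input_ids"]
        best_window, best_score = token_ids[: self.window_size], float("-inf")
        # Slide a fixed-size window over the document and score it against the query.
        for start in range(0, max(1, len(token_ids) - self.window_size + 1), self.stride):
            window = token_ids[start : start + self.window_size]
            score = torch.dot(self.encoder(window), query_vector).item()  # inner product
            if score > best_score:
                best_window, best_score = window, score
        return self.tokenizer.decode(best_window)
```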
New datasets for retrieval can be added using a new file and class in `backend/search`, and modifying `backend/ragviz.py` accordingly (a hypothetical sketch of such a class follows the dataset list below).
We currently provide implementations for the following datasets:
- ClueWeb22-B English documents
- Pile-CC dataset
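The sketch below shows one possible shape for a new retrieval backend under `backend/search`. The interface (a `search()` method returning ranked documents) and the ANN-index API are assumptions; mirror the existing dataset classes and wire the new class into `backend/ragviz.py`.

```python
# Hypothetical retrieval class for backend/search/ (interface is assumed).
class MyDatasetSearch:
    """Retrieve the top-k documents for an embedded query from a custom corpus."""

    def __init__(self, index, documents: list[str]):
        self.index = index          # e.g. an ANN index built over the corpus vectors
        self.documents = documents  # raw document texts, aligned with the index ids

    def search(self, query_vector, k: int = 10) -> list[str]:
        ids, _scores = self.index.query(query_vector, k)   # assumed ANN-index API
        return [self.documents[i] for i in ids]
```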
Any model supported by the HuggingFace transformers library can be used as the LLM backbone.
To apply vLLM for fast inference, the LLM backbone also needs to be supported by vLLM. A list of vLLM-supported models is available here.
You can set the path of the model used for RAG inside `backend/.env.example`. We used `meta-llama/Llama-2-7b-chat-hf` for the demo.
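As a minimal sketch, the snippet below shows how a configured model path could be consumed with vLLM for fast generation. The environment-variable name is an assumption made for illustration; check `backend/.env.example` for the actual key used by RAGViz.

```python
# Hedged sketch: load the model path from an environment variable and run
# generation with vLLM.
import os
from vllm import LLM, SamplingParams

model_path = os.environ.get("MODEL_PATH", "meta-llama/Llama-2-7b-chat-hf")  # variable name assumed

llm = LLM(model=model_path)
params = SamplingParams(temperature=0.0, max_tokens=128)
outputs = llm.generate(["Answer the question given the context: ..."], params)
print(outputs[0].outputs[0].text)
```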