We propose a generic RAG approach called Adaptive Note-Enhanced RAG (Adaptive-Note) for complex QA tasks. It comprises an iterative information collector, an adaptive memory reviewer, and a task-oriented generator, and follows a new Retriever-and-Memory paradigm.
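At a high level, the loop keeps collecting information, folds it into a note-like memory only when the update actually improves the note, and finally answers from the accumulated note. The sketch below is conceptual only (the `retriever`/`llm` interfaces are hypothetical, not this repository's API); the `max_step`, `max_fail_step`, and `top_k` names mirror the CLI flags used later in this README.

```python
# Conceptual sketch of the Retriever-and-Memory loop; see the paper and src/ for the real logic.
# `retriever` and `llm` are hypothetical helper objects, not classes from this repository.
def adaptive_note_qa(question, retriever, llm, max_step=3, max_fail_step=1, top_k=5):
    note, fails = "", 0
    for _ in range(max_step):
        query = llm.plan_query(question, note)        # iterative information collector
        passages = retriever.search(query, top_k)     # gather new evidence
        candidate = llm.update_note(note, passages)   # merge new knowledge into the note
        if llm.is_better(candidate, note, question):  # adaptive memory reviewer
            note, fails = candidate, 0
        else:
            fails += 1
            if fails >= max_fail_step:                # stop once updates stop helping
                break
    return llm.answer(question, note)                 # task-oriented generator
```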
Please create a `data/` directory and place all the corpus and evaluation files in it. All experimental datasets can be found here. For the ASQA corpus, please download it separately from this link due to its large size.
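For example (the subdirectory below is only the one referenced later in this README; add whatever other corpora you use):

```bash
mkdir -p data/corpus/asqa   # used later for the ASQA GTR embeddings
# ...then move the downloaded corpus and evaluation files under data/
```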
We offer three retrieval services as detailed in our paper:
- BM25 Retrieval Service using Elasticsearch for 2WikiMQA, MuSiQue, and HotpotQA.
- BGE Retrieval Service using FAISS for CRUD.
- GTR Retrieval Service using FAISS for ASQA.
First, install Elasticsearch 7.10 using the following commands:
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-linux-x86_64.tar.gz.sha512
shasum -a 512 -c elasticsearch-7.10.2-linux-x86_64.tar.gz.sha512
tar -xzf elasticsearch-7.10.2-linux-x86_64.tar.gz
cd elasticsearch-7.10.2/
./bin/elasticsearch # Start the server
pkill -f elasticsearch # To stop the server
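To confirm the server is running before you build any indexes, you can query its default HTTP port (9200):

```bash
# Should print cluster metadata as JSON if Elasticsearch is up
curl -X GET "http://localhost:9200"
```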
cd src/build_index/es
# 2WikiMQA
python index_2wiki.py
# MuSiQue
python index_musique.py
# HotpotQA
python index_hotpotqa.py
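These scripts do the indexing for you. As a rough illustration of what BM25 indexing with the Python Elasticsearch client looks like (the index name, field names, and corpus path below are hypothetical, not the repository's actual ones):

```python
# Illustrative sketch only; the repo's index_*.py scripts are authoritative.
# Assumes the elasticsearch 7.x Python client and a local server on port 9200.
import json
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

def passages(path):
    # Hypothetical corpus format: one JSON object per line with "title" and "text".
    with open(path) as f:
        for i, line in enumerate(f):
            doc = json.loads(line)
            yield {"_index": "2wikimqa", "_id": i,
                   "_source": {"title": doc["title"], "text": doc["text"]}}

# Bulk-index the passages; BM25 is Elasticsearch's default similarity for text fields.
helpers.bulk(es, passages("data/corpus/2wikimqa/corpus.jsonl"))
```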
For ASQA, our corpus and retrieval settings follow ALCE. Since generating GTR embeddings is resource-intensive, you can download the precomputed embeddings and place them in `data/corpus/asqa/` as follows:
wget https://huggingface.co/datasets/princeton-nlp/gtr-t5-xxl-wikipedia-psgs_w100-index/resolve/main/gtr_wikipedia_index.pkl
Build the index:
cd src/build_index/emb
python index.py --dataset asqa --model gtr-t5-xxl
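The `index.py` script handles this step. Conceptually, a dense index is just the passage embeddings stored in FAISS and searched by inner product; the sketch below is illustrative only and assumes the downloaded pickle holds a `(num_passages, dim)` float32 embedding matrix:

```python
# Illustrative sketch only; src/build_index/emb/index.py is authoritative.
import pickle
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

# Assumption: the pickle contains a (num_passages, dim) matrix of GTR passage embeddings.
with open("data/corpus/asqa/gtr_wikipedia_index.pkl", "rb") as f:
    passage_emb = np.asarray(pickle.load(f), dtype="float32")

index = faiss.IndexFlatIP(passage_emb.shape[1])  # inner-product similarity
index.add(passage_emb)

# Encode a query with the same GTR model and retrieve the top-5 passages.
encoder = SentenceTransformer("sentence-transformers/gtr-t5-xxl")
q = encoder.encode(["example question"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(q, dtype="float32"), k=5)
```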
Build the index for CRUD:
cd src/build_index/emb
python index.py --dataset crud --model bge-base-zh-v1.5 --chunk_size 512
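The BGE index is built the same way, except the Chinese CRUD corpus is first split into chunks (mirroring `--chunk_size 512`) and embedded with `bge-base-zh-v1.5`. A minimal sketch, with a hypothetical corpus path and character-level chunking (the real script may chunk differently):

```python
# Illustrative sketch only; the corpus path and chunking rule are assumptions.
import json
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("BAAI/bge-base-zh-v1.5")

def chunk(text, size=512):
    # Fixed-size character chunks; the repository's script may chunk by tokens instead.
    return [text[i:i + size] for i in range(0, len(text), size)]

chunks = []
with open("data/corpus/crud/corpus.jsonl") as f:  # hypothetical path
    for line in f:
        chunks.extend(chunk(json.loads(line)["text"]))

emb = encoder.encode(chunks, normalize_embeddings=True, batch_size=64)
index = faiss.IndexFlatIP(emb.shape[1])
index.add(np.asarray(emb, dtype="float32"))
faiss.write_index(index, "data/corpus/crud/bge.index")  # hypothetical output path
```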
You can configure your API key, URL, and other settings in the `./config/config.yaml` file.
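The file itself defines the exact keys; as a rough sketch of the kind of settings it holds (the key names below are illustrative, not necessarily the repository's):

```yaml
# Illustrative only; open ./config/config.yaml to see the actual keys it expects.
api_key: "sk-..."                        # your OpenAI-compatible API key
base_url: "https://api.openai.com/v1"    # endpoint URL for the chat API
```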
To run and evaluate for 2WikiMQA, MuSiQue, HotpotQA, and ASQA:
python main.py --method note --retrieve_top_k 5 --dataset asqa --max_step 3 --max_fail_step 1 --MaxClients 10 --model gpt-3.5-turbo-0125 --device cuda:0
The predicted results and evaluation metrics (F1 and EM) will be automatically saved in the `output/{dataset}/{method}/{model}` directory; the evaluation results can be found at the end of the saved prediction file.
For CRUD (crud_1doc, crud_2doc, crud_3doc):
python main.py --method note --retrieve_top_k 2 --dataset crud_1doc --max_step 3 --max_fail_step 1 --MaxClients 10 --model gpt-3.5-turbo-0125 --device cuda:0
The predicted results will be automatically saved in the `output/{dataset}/{method}/{model}` directory.
We follow CRUD-RAG and use the RAGQuestEval metric, which relies on GPT. Run the following command to perform the evaluation:
python metrics_questeval_crud.py --eval_path {saved predict file}
The evaluation results will be automatically saved in the `output/{dataset}/{method}/{model}/metric_questeval` directory.
Please cite the following paper if you find Adaptive-Note helpful!
@misc{wang2024retrieverandmemoryadaptivenoteenhancedretrievalaugmented,
      title={Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation},
      author={Ruobing Wang and Daren Zha and Shi Yu and Qingfei Zhao and Yuxuan Chen and Yixuan Wang and Shuo Wang and Yukun Yan and Zhenghao Liu and Xu Han and Zhiyuan Liu and Maosong Sun},
      year={2024},
      eprint={2410.08821},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.08821},
}