Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning

CRaFT: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning

Introduction

Refusal-Aware Instruction Tuning (RAIT) enables Large Language Models (LLMs) to refuse to answer questions they do not know. By replacing the responses to unknown questions in the training data with refusals such as "I don't know", RAIT improves the reliability of LLMs and reduces hallucination. RAIT typically decides which training samples to modify based on the correctness of the initial LLM's response. However, this crude approach can cause LLMs to refuse questions they could have answered correctly, a problem we call over-refusal. To address this issue, we introduce Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning (CRaFT). The framework of CRaFT is shown below.
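To make the baseline concrete, here is a minimal sketch of the correctness-based relabeling that the paragraph above describes for plain RAIT. It is not the CRaFT procedure and not code from this repository: the `Sample` class, the `model_was_correct` field, and the `vanilla_rait` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, replace

REFUSAL = "I don't know"


@dataclass(frozen=True)
class Sample:
    question: str
    answer: str
    # Hypothetical field: whether the initial LLM answered this question correctly.
    model_was_correct: bool


def vanilla_rait(samples):
    """Swap the target answer for a refusal wherever the initial model was wrong."""
    return [
        s if s.model_was_correct else replace(s, answer=REFUSAL)
        for s in samples
    ]


data = [
    Sample("What is the capital of France?", "Paris", True),
    Sample("Who wrote Ulysses?", "James Joyce", False),
]
tuned = vanilla_rait(data)
for s in tuned:
    print(f"{s.question} -> {s.answer}")
```

Because the decision depends on a single correctness flag per sample, a question the model answers correctly only some of the time is treated the same as one it never gets right, which is one way to picture how over-refusal can arise.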

Getting Started

Preparing the environment and data

Our code depends on xtuner and opencompass, so set up the corresponding environments first.

## xtuner
git clone https://github.com/InternLM/xtuner.git
cd xtuner
conda create --name xtuner-env python=3.10 -y
conda activate xtuner-env
pip install -e '.[all]'

## opencompass
git clone https://github.com/Zrc007/opencompass_CRaFT.git
cd opencompass_CRaFT
conda create --name opencompass python=3.10 -y
conda activate opencompass
pip install -e .

We conducted experiments on OEQA and MCQA separately. For OEQA, we used TriviaQA as the training set and both TriviaQA and NQ as test sets. For MCQA, we used MMLU as the training set and ARC as the test set. All four datasets are already preprocessed and stored under `dataset/preprocessed_dataset`.

CRaFT

CRaFT supports both OEQA and MCQA. The process for OEQA is as follows (MCQA follows the same order of steps):

Stage1: Query Knowledge State and Flow

export HF_HOME=your_HF_HOME_path
## get knowledge state (correctness and certainty) of initial model
(opencompass) ./scripts/stage1/OEQA/triviaqa_kq_init.sh

## run rehearsal training and get knowledge flow

### construct rehearsal training instructions
(opencompass) ./scripts/stage1/OEQA/triviaqa_rehearsal_instructions_construction.sh

### rehearsal training (you can also train with another framework and then add the resulting model to `compass_config/models`)
(xtuner-env) ./scripts/stage1/OEQA/triviaqa_rehearsal_SFT.sh
(xtuner-env) ./scripts/stage1/OEQA/triviaqa_rehearsal_convert.sh

### get knowledge flow
(opencompass) ./scripts/stage1/OEQA/triviaqa_kq_rehearsal.sh
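This README does not define how knowledge state or knowledge flow are computed; the definitions live in the paper and in the repository's scripts. As a rough mental model only, the sketch below assumes that a knowledge state is a (correctness, certainty) pair per sample and that the flow is the change in that pair between the initial model and the rehearsal-trained model. The class names, the value ranges, and the subtraction are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeState:
    # Assumption: both scores are normalized to [0, 1] for a single sample.
    correctness: float
    certainty: float


@dataclass(frozen=True)
class KnowledgeFlow:
    # Signed change in each score after rehearsal training (assumed definition).
    d_correctness: float
    d_certainty: float


def knowledge_flow(initial, rehearsed):
    """Difference between the rehearsal-trained and the initial knowledge state."""
    return KnowledgeFlow(
        d_correctness=rehearsed.correctness - initial.correctness,
        d_certainty=rehearsed.certainty - initial.certainty,
    )


init = KnowledgeState(correctness=0.25, certainty=0.5)
after = KnowledgeState(correctness=0.75, certainty=0.75)
flow = knowledge_flow(init, after)
print(flow)
```

Under this assumed reading, a positive `d_correctness` marks a sample that the model is likely to learn during tuning, which is the kind of signal a static correctness check on the initial model cannot provide.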

Stage2: Refusal-Aware Instruction Construction & Tuning

## Refusal-Aware instruction construction
(opencompass) ./scripts/stage2/OEQA/triviaqa_instructions_construction.sh

## Refusal-Aware Instruction Tuning
(xtuner-env) ./scripts/stage2/OEQA/triviaqa_CRaFT.sh
(xtuner-env) ./scripts/stage2/OEQA/triviaqa_CRaFT_convert.sh
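The `(opencompass)` and `(xtuner-env)` prefixes in the commands above appear to name the conda environment each script should run in. The following is a minimal sketch of a driver that runs the OEQA steps in the listed order through `conda run -n`, so the environments do not have to be switched by hand. It is not part of the repository: `OEQA_STEPS`, `build_command`, and `run_pipeline` are hypothetical names, and the sketch assumes it is launched from the repository root with `HF_HOME` already exported and the scripts marked executable.

```python
import shlex
import subprocess

# (conda environment, script) pairs, in the order given in this README for OEQA.
OEQA_STEPS = [
    ("opencompass", "./scripts/stage1/OEQA/triviaqa_kq_init.sh"),
    ("opencompass", "./scripts/stage1/OEQA/triviaqa_rehearsal_instructions_construction.sh"),
    ("xtuner-env", "./scripts/stage1/OEQA/triviaqa_rehearsal_SFT.sh"),
    ("xtuner-env", "./scripts/stage1/OEQA/triviaqa_rehearsal_convert.sh"),
    ("opencompass", "./scripts/stage1/OEQA/triviaqa_kq_rehearsal.sh"),
    ("opencompass", "./scripts/stage2/OEQA/triviaqa_instructions_construction.sh"),
    ("xtuner-env", "./scripts/stage2/OEQA/triviaqa_CRaFT.sh"),
    ("xtuner-env", "./scripts/stage2/OEQA/triviaqa_CRaFT_convert.sh"),
]


def build_command(env, script):
    """Wrap a script so that it executes inside the named conda environment."""
    return ["conda", "run", "-n", env, script]


def run_pipeline(steps, dry_run=True):
    """Print (dry run) or execute each step in order; return the command list."""
    commands = [build_command(env, script) for env, script in steps]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True aborts at the first failing step, since later steps
            # depend on the outputs of earlier ones.
            subprocess.run(cmd, check=True)
    return commands


commands = run_pipeline(OEQA_STEPS, dry_run=True)
```

The dry run only prints the commands, which makes it easy to check the order before committing to a long training job; pass `dry_run=False` to actually execute them.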

Evaluation

## OEQA
(opencompass) ./scripts/Eval/OEQA/triviaqa_eval.sh
(opencompass) ./scripts/Eval/OEQA/nq_eval.sh

## MCQA
(opencompass) ./scripts/Eval/MCQA/mmlu_eval.sh
(opencompass) ./scripts/Eval/MCQA/ARC_c_Test_eval.sh

Citation

@article{zhu2024utilize,
  title={Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning},
  author={Zhu, Runchuan and Ma, Zhipeng and Wu, Jiang and Gao, Junyuan and Wang, Jiaqi and Lin, Dahua and He, Conghui},
  journal={arXiv preprint arXiv:2410.06913},
  year={2024}
}
