Tool-augmented large language models (LLMs) have attracted widespread attention for their ability to access up-to-date knowledge and alleviate hallucination issues. Nowadays, advanced closed-source LLMs (e.g., ChatGPT) have demonstrated impressive tool-usage capabilities through prompting and in-context learning techniques. To empower open-source LLMs (e.g., LLaMA) with similar tool-manipulation capabilities, current efforts focus on either template-driven or token-triggered tool-usage. However, the former hampers LLMs' flexibility in addressing diverse user queries due to constrained tool interactions, while the latter limits generalizability when engaging with new tools, since tool-usage learning is based on task- and tool-specific datasets. To alleviate these concerns, in this paper we propose a decision-aware and generalizable tool-usage framework (DEER). Specifically, we first construct tool-usage samples with multiple decision branches via an automatic generation pipeline, thereby instilling decision-making awareness in LLMs under diverse scenarios. Meanwhile, we propose a novel tool sampling strategy to enhance the generalizability of LLMs over unseen tools. Extensive experiments demonstrate that our proposed DEER is effective and significantly outperforms baselines across various datasets.
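For intuition only, below is a minimal Python sketch of the tool sampling idea: each training sample pairs a query with a small candidate set of tools drawn from the seen-tool pool, and the gold tool is occasionally withheld so the model also learns to decline. All names and ratios here are illustrative assumptions, not the paper's exact procedure.

import random

def sample_candidate_tools(gold_tool, tool_pool, k=4, p_drop_gold=0.2):
    # Hypothetical sketch: mix the gold tool with random distractors so the
    # model cannot memorize a fixed tool list; occasionally drop the gold
    # tool entirely to exercise the "no suitable tool" decision branch.
    distractors = random.sample([t for t in tool_pool if t != gold_tool], k)
    if random.random() < p_drop_gold:
        candidates = distractors                     # gold tool withheld
    else:
        candidates = distractors[:-1] + [gold_tool]  # keep size at k
    random.shuffle(candidates)
    return candidates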
Install the libraries listed in requirements.txt:
pip install -r requirements.txt
cd ToolDEER/
python3 src/generate_tool.py --data_path data/raw/chatgpt_plugins.json --write_to_path data/processed/tool.json
Download the chatgpt_plugins.json from here, then move it to data/raw/.
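As a quick sanity check after this step, you can inspect the generated file. The snippet below only assumes tool.json is a JSON list; the exact per-entry schema emitted by src/generate_tool.py is not documented here, so treat the field layout as unknown.

import json

# Peek at the generated tool file (assumes it is a JSON list of tool entries).
with open("data/processed/tool.json") as f:
    tools = json.load(f)
print(f"{len(tools)} tools loaded")
print(json.dumps(tools[0], indent=2, ensure_ascii=False))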
Figure: The pipeline of our multi-decision sample generation.
cd ToolDEER/
python3 src/generate_dataset.py --tool_path data/processed/tool.json --general_path data/raw/${general_filename} --save_dir data/processed/
Download the general dataset from here.
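To make the "multiple decision branches" idea concrete, here is a hypothetical sketch of how one query can be turned into different training targets depending on whether a suitable tool exists. The real logic lives in src/generate_dataset.py; the function and branch wording below are assumptions for illustration.

def build_decision_sample(query, matched_tool=None):
    # Hypothetical illustration: the same input format yields different
    # targets, so the model learns *whether* to call a tool, not just how.
    if matched_tool is None:
        # Branch 1: no suitable tool -> answer the query directly.
        target = "No tool is needed; answer directly."
    else:
        # Branch 2: a suitable tool exists -> emit a tool invocation.
        target = f"Invoke tool: {matched_tool['name']}"
    return {"input": query, "output": target}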
export WANDB_MODE="disabled"
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 deepspeed --master_port=20002 src/train_lora.py \
--model_name_or_path ${llama2_7b_ckpt_dir}/ \
--data_path data/processed/train.json \
--bf16 False \
--output_dir outputs/ \
--num_train_epochs 10 \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 1 \
--save_strategy "steps" \
--save_steps 10000 \
--save_total_limit 10 \
--learning_rate 5e-4 \
--weight_decay 0. \
--warmup_ratio 0.04 \
--lr_scheduler_type "cosine" \
--logging_steps 20 \
--model_max_length 2048 \
--gradient_checkpointing True \
--lazy_preprocess True \
--deepspeed configs/stage2.json
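The training entry point is src/train_lora.py. For readers new to LoRA fine-tuning, the snippet below shows the standard peft-based setup; the rank, alpha, and target modules are assumptions for illustration, and the authoritative hyperparameters are those in the training script.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Standard LoRA attachment with peft (hyperparameters are assumptions;
# see src/train_lora.py for the values actually used).
base_model = AutoModelForCausalLM.from_pretrained("${llama2_7b_ckpt_dir}")  # placeholder path
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                     # low-rank dimension (assumed)
    lora_alpha=16,           # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable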
cd ToolDEER/
python3 src/inference.py --model_name_or_path ${llama2_7b_ckpt_dir}/ --lora_path outputs --data_path data/processed/valid.json
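If you want to load the trained adapter outside of src/inference.py, the standard peft pattern is shown below; the paths are placeholders matching the flags above.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter saved under outputs/.
base = AutoModelForCausalLM.from_pretrained("${llama2_7b_ckpt_dir}")  # placeholder path
tokenizer = AutoTokenizer.from_pretrained("${llama2_7b_ckpt_dir}")
model = PeftModel.from_pretrained(base, "outputs")
model = model.merge_and_unload()  # optional: fold the adapter into the base weights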
Figure: The comparison with baselines on unseen tools.
Figure: The comparison of diverse tool sampling strategies and corresponding sampling ratios.
If you find this work helpful to your research or applications, please feel free to cite our work.
@article{gui2024look,
  title={Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models},
  author={Gui, Anchun and Li, Jian and Dai, Yong and Du, Nan and Xiao, Han},
  journal={arXiv preprint arXiv:2402.16696},
  year={2024}
}
This project partially builds on ToolLLM and ToolAlpaca. Thanks for their wonderful work!