This is the official repository for the paper: "Exploring Self-supervised Logic-enhanced Training for Large Language Models".
- Python: 3.9
- CUDA: 11.7/11.8
Other Python packages can be installed with:

```bash
pip install -r requirements.txt
```
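To verify that the local environment matches these requirements, you can run a quick sanity check like the following (a minimal sketch, not part of the repository):

```python
# Sanity-check the Python / CUDA / PyTorch setup before launching any training job.
import sys
import torch

print("Python:", sys.version.split()[0])          # expected: 3.9.x
print("CUDA available:", torch.cuda.is_available())
print("CUDA (torch build):", torch.version.cuda)  # expected: 11.7 or 11.8
print("GPUs:", torch.cuda.device_count())
```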
This project relies on Hydra to manage configurations. The configuration files are located in `conf/`.
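For reference, a minimal Hydra entry point looks like the sketch below (illustrative only; the decorator arguments and config name are placeholders, and the actual entry points in this repo are `trainer_base_ds_mul.py` / `trainer_base_ds_mp.py`). At launch time, `-cp` selects the config directory, `-cn` selects the YAML file, and `key=value` arguments (e.g. `do_preprocess=True`) override fields inside it.

```python
# Minimal sketch of a Hydra entry point (illustrative, not the repo's exact code).
import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(config_path="conf", config_name="config")  # overridden by -cp / -cn at launch
def main(cfg: DictConfig) -> None:
    # Dump the fully resolved configuration, including any key=value overrides.
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    main()
```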
Typical usage:

```bash
# torch launcher
python trainer_base_ds_mul.py -cp <config_path> -cn <config_file_name>

# deepspeed launcher
deepspeed --include localhost:0,1,2,3 trainer_base_ds_mul.py -cp <config_path> -cn <config_file_name>
```
You can download all datasets for self-supervised training from the Huggingface repo, which also provides the processed datasets with the logically consistent pairs already constructed.
If you want to preprocess the datasets yourself, simply run:

```bash
python trainer_base_ds_mul.py -cp <config_path> -cn <config_file_name> do_preprocess=True
```
This stops the program once the datasets have been prepared. You can also remove `do_preprocess=True` so that training starts immediately. However, this is not recommended: preprocessing is time-consuming, and training usually runs in a distributed setting, so the other processes will be left waiting for the data to be ready.
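The sketch below shows the generic "rank 0 builds, everyone else waits at a barrier" pattern, only to illustrate why unprepared data stalls distributed training; it is not this repository's data-loading code, and `build_fn` / `cache_path` are placeholders.

```python
# Why on-the-fly preprocessing stalls distributed training: only one rank does the
# (slow) work while every other rank idles at a barrier until the data is ready.
# Generic pattern, not the repository's actual code.
import torch.distributed as dist


def load_or_build_dataset(build_fn, cache_path: str) -> str:
    rank = dist.get_rank() if dist.is_initialized() else 0
    if rank == 0:
        build_fn(cache_path)      # expensive preprocessing, done once
    if dist.is_initialized():
        dist.barrier()            # all other ranks wait here until rank 0 finishes
    return cache_path             # every rank now reads the cached dataset
```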
All configs for training different models are listed as follows:
- LLaMA-7B
  - Config: `conf/llama/wiki/llama_7b_merit_v1_pv91_v91_v5_0.yaml`
  - Weights: Huggingface Hub
- LLaMA-13B
  - Config: `conf/llama/wiki/llama_13b_merit_v1_pv91_v91_v5_0.yaml`
  - Weights: Huggingface Hub
- LLaMA-33B (QLoRA; see the QLoRA sketch after this list)
  - Normal data : Counterfactual data = 1:3
    - Config: `conf/llama/wiki/llama_30b_merit_v1_pv91_v91_v5_0.yaml`
    - Weights: Huggingface Hub
  - Normal data : Counterfactual data = 1:0
    - Config: `conf/llama/wiki/llama_30b_merit_v1_pv91_v91_v5_0_no_aug.yaml`
    - Weights: Huggingface Hub
  - Normal data : Counterfactual data = 1:1
    - Config: `conf/llama/wiki/llama_30b_merit_v1_pv91_v91_v5_0_1aug.yaml`
    - Weights: Huggingface Hub
- LLaMA-65B (QLoRA)
  - Normal data : Counterfactual data = 1:3
  - Config: `conf/llama/wiki/llama_65b_merit_v1_pv91_v91_v5_0.yaml`
- LLaMA-65B (Full parameter training w. Pipeline Parallel)
  - Config: `conf/llama/wiki/llama_65b_merit_v1_pv91_v91_v5_0_full_mp.yaml`
  - Note: For pipeline-parallel training, launch the program with `trainer_base_ds_mp.py`, and first convert the Huggingface weights to DeepSpeed's format via `convert2ckpt.py`.
- Falcon-40B
  - Config: `conf/rw/falcon_40b_merit_v1_pv91_v91_v5_0.yaml`
  - Weights: Huggingface Hub
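Several of the configs above train with QLoRA. As a point of reference, the snippet below is a minimal sketch of a typical QLoRA setup with `bitsandbytes` 4-bit NF4 quantization and `peft` LoRA adapters; the checkpoint name, LoRA rank, and target modules are placeholders and are not taken from the configs in this repo.

```python
# Minimal QLoRA-style setup: 4-bit NF4 base model + LoRA adapters via peft.
# The model id and hyperparameters below are placeholders, not the repo's values.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-30b",            # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()     # only the LoRA adapters are trainable
```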
Since there are too many evaluation configs in this repo, we list only one example here:

```bash
# LogiQA-v2 multiple-choice evaluation
python trainer_base_fsdp_v4.py -cp conf/llama/wiki/mc_eval/ -cn llama_30b_merit_v5_qlora_logiqav2_eval_mc_v1_0_test
```
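For context, multiple-choice evaluation of this kind is commonly done by scoring each candidate option with its length-normalized log-likelihood under the model and picking the highest-scoring one. The sketch below illustrates that generic procedure; it is not the repository's actual evaluation code, and `model` / `tokenizer` are assumed to be a Hugging Face causal LM and its tokenizer.

```python
# Generic multiple-choice scoring: pick the option with the highest
# length-normalized log-likelihood. Not the repository's exact evaluation code.
import torch


def score_option(model, tokenizer, context: str, option: str) -> float:
    ids = tokenizer(context + " " + option, return_tensors="pt").input_ids
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(ids).logits
    # Log-probability of each token given everything before it.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    option_lp = token_lp[:, ctx_len - 1:]   # keep only the option tokens
    return option_lp.mean().item()          # length-normalized score


def predict(model, tokenizer, context: str, options: list[str]) -> int:
    scores = [score_option(model, tokenizer, context, o) for o in options]
    return max(range(len(options)), key=lambda i: scores[i])


# Example usage (with a loaded causal LM and tokenizer):
# idx = predict(model, tokenizer, context="...", options=["A ...", "B ...", "C ...", "D ..."])
```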
If you find the repository and the paper helpful, please cite our papers:
```bibtex
@inproceedings{logicllm2023jiao,
  author    = {Fangkai Jiao and
               Zhiyang Teng and
               Bosheng Ding and
               Zhengyuan Liu and
               Nancy F. Chen and
               Shafiq R. Joty},
  title     = {Exploring Self-supervised Logic-enhanced Training for Large Language Models},
  booktitle = {{NAACL}},
  publisher = {Association for Computational Linguistics},
  year      = {2024},
}

@inproceedings{merit2022jiao,
  author    = {Fangkai Jiao and
               Yangyang Guo and
               Xuemeng Song and
               Liqiang Nie},
  title     = {MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning},
  booktitle = {Findings of {ACL}},
  pages     = {3496--3509},
  publisher = {Association for Computational Linguistics},
  year      = {2022},
}
```