Misaligning Reasoning with Answers - A Framework for Assessing LLM CoT Robustness [arXiv]
Enyi Jiang, Changming Xu, Nischay Singh, Gagandeep Singh
We recommend first creating a conda environment, activating it, and installing the dependencies from the provided requirements.txt:
conda create --name MATCHA
conda activate MATCHA
pip install -r requirements.txt
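As an optional sanity check after installation (this assumes PyTorch and Hugging Face Transformers are among the pinned requirements, which is not confirmed here), verify that both import cleanly:
python -c "import torch, transformers; print(torch.__version__, transformers.__version__)"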
Run the embedding-space attack on an open-source model:
python inference_emb_attack.py --dataset="gsm8k" --model="llama3-8B" --method="few_shot_cot" --qes_limit=0 --prompt_path="./basic_cot_prompts/math_word_problems" --random_seed=42 --multipath=1 --basic_cot True
Run the token-level attack:
python inference_tok_random_dual.py --dataset="gsm8k" --model="llama3-8B" --method="few_shot_cot" --qes_limit=0 --prompt_path="./basic_cot_prompts/math_word_problems" --random_seed=42 --multipath=1 --basic_cot True
Transfer attacks to closed-source models (queried via API):
python close_source_transfer.py --dataset="gsm8k" --model="gpt-3.5-turbo" --model2='deepseek' --method="few_shot_cot" --qes_limit=0 --prompt_path="basic_cot_prompts/math_word_problems" --random_seed=42 --multipath=1 --temperature=0.7 --basic_cot True --api_time_interval=2
Arguments:
--dataset: the dataset to evaluate on. Choices: [gsm8k, strategyqa, singleeq]
--model: the open-source model. Choices: ["llama3-8B", "mistral", "zephyr", "qwen", "deepseek"]
--method: the prompting method (few_shot_cot)
--qes_limit: the number of test questions
--prompt_path: the path to the prompt file
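For a quick smoke test, any model from the choices above can be substituted and the run can be shortened with a small --qes_limit (hypothetical values below; this assumes a positive --qes_limit caps the number of evaluated questions):
python inference_emb_attack.py --dataset="gsm8k" --model="mistral" --method="few_shot_cot" --qes_limit=10 --prompt_path="./basic_cot_prompts/math_word_problems" --random_seed=42 --multipath=1 --basic_cot True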
Parts of the code in this repo are based on
If you find this work useful, please cite the paper/repo:
@article{jiang2025misaligning,
title={Misaligning Reasoning with Answers--A Framework for Assessing LLM CoT Robustness},
author={Jiang, Enyi and Xu, Changming and Singh, Nischay and Singh, Gagandeep},
journal={arXiv preprint arXiv:2505.17406},
year={2025}
}