This repository contains the official implementation of "Improving Your Model Ranking on Chatbot Arena by Vote Rigging".
We simulate rigging on new votes beyond the
To set up the initial rigging environment, run the following command. It splits the complete voting records into historical votes (90%), which are used to build the initial simulated leaderboard, and other users' votes (10%), which are used to study the impact of concurrent voting from other users:
python initial_env.py
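As a rough illustration of the 90%/10% split described above, here is a minimal sketch. It assumes the votes are already ordered and that the split is a simple prefix/suffix cut; the function name `split_votes` and the tuple layout of a vote are hypothetical and are not taken from `initial_env.py`, which may shuffle, filter, or split differently.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_votes(
    votes: Sequence[T], historical_frac: float = 0.9
) -> Tuple[List[T], List[T]]:
    """Split votes into (historical, other_users), preserving order.

    Assumption: the first `historical_frac` of the records are treated as
    historical votes and the remainder as other users' votes.
    """
    if not 0.0 <= historical_frac <= 1.0:
        raise ValueError("historical_frac must be in [0, 1]")
    cut = round(len(votes) * historical_frac)
    return list(votes[:cut]), list(votes[cut:])


# Hypothetical vote records: (model_a, model_b, outcome).
example_votes = [("model_a", "model_b", "a_wins") for _ in range(10)]
historical, others = split_votes(example_votes)
print(len(historical), len(others))
```

The historical part would feed the initial leaderboard, while the held-out part stands in for votes that arrive concurrently with the rigged ones.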
To obtain the results under the idealized rigging scenario, run the following command:
python vote_rigging.py --rigging_mode omni_bt_diff --classifier_acc 1.0 --beta 1.0
The default rigging strategy is Omni-BT; other strategies can be selected with --rigging_mode. The supported values are t_random for T-Random; t_tie for T-Tie; t_normal for T-Normal; t_abstain for T-Abstain; omni_on for Omni-On; omni_bt_diff for Omni-BT (Relative); and omni_bt_abs for Omni-BT (Absolute). In addition, --classifier_acc controls the classification accuracy of the de-anonymizing function, and --beta controls the marginal probability of sampling the target model.
To explore the impact of concurrent votes from other users, run the following command:
python rigging_with_vo.py --rigging_mode omni_bt_diff
By default, the maximum number of other users' votes is 100,000.
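To build intuition for the --classifier_acc and --beta options described above, the following sketch counts how many simulated battles give an attacker a chance to rig a vote. It is a simplified model under stated assumptions, not the logic of `vote_rigging.py`: each battle is assumed to include the target model independently with probability `beta`, and the de-anonymizing function is assumed to identify it correctly with probability `classifier_acc`. The function name and its return values are hypothetical.

```python
import random
from typing import Tuple


def simulate_rigging_opportunities(
    num_battles: int, beta: float, classifier_acc: float, seed: int = 0
) -> Tuple[int, int]:
    """Return (battles containing the target, battles where it was identified).

    Assumptions: independent sampling per battle; identification succeeds
    with probability `classifier_acc` whenever the target is present.
    """
    rng = random.Random(seed)
    appeared = 0
    identified = 0
    for _ in range(num_battles):
        # The target model is sampled into this battle with probability beta.
        if rng.random() < beta:
            appeared += 1
            # The de-anonymizer recognizes it with probability classifier_acc.
            if rng.random() < classifier_acc:
                identified += 1
    return appeared, identified


# Idealized setting: the target is always sampled and always identified.
print(simulate_rigging_opportunities(1000, beta=1.0, classifier_acc=1.0))
# A less favorable setting, for comparison.
print(simulate_rigging_opportunities(1000, beta=0.1, classifier_acc=0.8))
```

Under this model, setting both values to 1.0 corresponds to the idealized scenario in which every battle is a usable rigging opportunity; lowering either value shrinks the share of votes the attacker can act on.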
First, initialize the voting environment with the following command:
python initial_env.py --classifier
To generate the training corpus, run the following example command, which queries Llama-3-8B-Instruct with prompts from the HC3 dataset:
python classifier/dataset_cur.py --output_dir hc3 --model_id meta-llama/Meta-Llama-3-8B-Instruct
With the training corpus prepared, run the following script to fine-tune a RoBERTa-based model:
python classifier/train.py --dataset hc3
Then, you can rig votes using the multi-class classifier with the following demo command:
python vote_rigging_classifier.py --dataset hc3 --model_path [REPLACE THIS WITH YOUR OWN MODEL PATH]
To detect malicious users, you can run the following command:
python detect_malicious_users.py --rigging_mode omni_bt_diff
For vote filtering, run the following command and use the --filter_threshold parameter to control the filtering threshold:
python vote_filtering.py --rigging_mode omni_bt_diff --filter_threshold 0.8
If you find our work interesting, please consider giving the repository a star ⭐ and citing it as:
@misc{min2025improvingmodelrankingchatbot,
title={Improving Your Model Ranking on Chatbot Arena by Vote Rigging},
author={Rui Min and Tianyu Pang and Chao Du and Qian Liu and Minhao Cheng and Min Lin},
year={2025},
eprint={2501.17858},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.17858},
}