Results of Peer Rank (PR) and Peer Discussion (PD) are stored in separate folders.
The peer_rank folder contains results of Peer Rank (PR).
(To Be Added)
The peer_discussion folder contains results of Peer Discussion (PD). By default, the directory tree is as follows:
- peer_discussion
|-- lfqa
| |-- all-data-{reviewer1}-{reviewer2}-{multi}-{explicit}.json
| |-- rating-{reviewer}-temperature-{temperature}-{explicit}.json
| |-- {reviewer1}-{reviewer2}-discussion-{multi}-{explicit}-log.txt
| |-- gather_all.py
|-- vicuna80
| |-- all-data-{reviewer1}-{reviewer2}-{multi}-{explicit}.json
| |-- {reviewer1}-{reviewer2}-discussion-{multi}-{explicit}-log.txt
| |-- gather_all.py
Each dataset has a corresponding result folder. The rating-* files for Vicuna80 are in the vicuna80 dataset folder.
Default files follow these naming rules:
rating-{reviewer}-temperature-{temperature}-{explicit}.json
- Files that follow this naming rule contain reviews generated by large language model (LLM) reviewers.
- {reviewer} is the name of the reviewer. {temperature} is the temperature used to generate the reviews. {explicit} indicates that the prompts contain explicit instructions.
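As an illustration of this naming rule, the following minimal sketch splits a rating filename back into its placeholders. The helper name parse_rating_filename and the example values are hypothetical. It assumes that {temperature} and {explicit} contain no hyphens, while {reviewer} may contain them.

```python
import re

# Pattern for rating-{reviewer}-temperature-{temperature}-{explicit}.json.
# Assumption: {temperature} and {explicit} contain no hyphens; {reviewer} may.
RATING_RE = re.compile(
    r"^rating-(?P<reviewer>.+)-temperature-(?P<temperature>[^-]+)"
    r"-(?P<explicit>[^-]+)\.json$"
)


def parse_rating_filename(filename):
    """Return the placeholder values as a dict, or None if the name does not match."""
    match = RATING_RE.match(filename)
    return match.groupdict() if match else None
```

For example, parse_rating_filename("rating-gpt-3.5-temperature-0.2-explicit.json") would give the reviewer "gpt-3.5", the temperature "0.2", and the explicit flag "explicit". These values are made up for the example and are not taken from the repository.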
all-data-{reviewer1}-{reviewer2}-{multi}-{explicit}.json
- Files that follow this naming rule contain all information about the two reviewers and their initial reviews.
- The discussion history is added to this file after the discussion between reviewer 1 and reviewer 2.
- {reviewer1}, {reviewer2}, and {explicit} are the same as above. {multi} indicates whether a system prompt is concatenated after each turn.
{reviewer1}-{reviewer2}-discussion-{multi}-{explicit}-log.txt
- Files that follow this naming rule contain only the discussion history between {reviewer1} and {reviewer2}, in which {reviewer1} leads the discussion.
- {reviewer1}, {reviewer2}, {multi}, and {explicit} are the same as above.
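To make the two discussion-related naming rules concrete, here is a small sketch that builds the expected filenames for one reviewer pair. The function name discussion_filenames is hypothetical, and the actual strings used for {multi} and {explicit} are not specified here, so callers pass them in as-is.

```python
def discussion_filenames(reviewer1, reviewer2, multi, explicit):
    """Return (all-data JSON name, discussion log name) for one reviewer pair.

    reviewer1 is the reviewer that leads the discussion.
    """
    all_data = f"all-data-{reviewer1}-{reviewer2}-{multi}-{explicit}.json"
    log = f"{reviewer1}-{reviewer2}-discussion-{multi}-{explicit}-log.txt"
    return all_data, log
```

Because {reviewer1} leads the discussion, swapping the two reviewer arguments produces a different pair of filenames.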
gather_all.py
- This script gathers the reviews of two reviewers (files named according to rule 1, rating-*.json) and generates a file that contains all the information (a file named according to rule 2, all-data-*.json).
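The gathering step can be pictured roughly as follows. This is not the code of gather_all.py, only a sketch under assumptions: the function name gather_reviews, the output keys "reviewer1" and "reviewer2", and the idea that each rating file is copied unchanged into the output are all hypothetical.

```python
import json
from pathlib import Path


def gather_reviews(rating_path_1, rating_path_2, output_path):
    """Combine two reviewers' rating files into one JSON file.

    Hypothetical layout: the output is a dict with one entry per reviewer,
    each holding the content of that reviewer's rating file unchanged.
    """
    combined = {}
    for key, path in (("reviewer1", rating_path_1), ("reviewer2", rating_path_2)):
        with open(path, encoding="utf-8") as f:
            combined[key] = json.load(f)
    Path(output_path).write_text(json.dumps(combined, indent=2), encoding="utf-8")
    return combined
```

The real script may organize the combined file differently, so check gather_all.py itself before relying on any particular layout.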