Results

Results of Peer Rank (PR) and Peer Discussion (PD) are stored in separate folders.

The peer_rank folder contains results of Peer Rank (PR).

(To Be Added)

The peer_discussion folder contains results of Peer Discussion (PD). By default, the directory tree is as follows:

- peer_discussion
|-- lfqa
|   |-- all-data-{reviewer1}-{reviewer2}-{multi}-{explicit}.json
|   |-- rating-{reviewer}-temperature-{temperature}-{explicit}.json
|   |-- {reviewer1}-{reviewer2}-discussion-{multi}-{explicit}-log.txt
|   |-- gather_all.py
|-- vicuna80
    |-- all-data-{reviewer1}-{reviewer2}-{multi}-{explicit}.json
    |-- {reviewer1}-{reviewer2}-discussion-{multi}-{explicit}-log.txt
    |-- gather_all.py

Each dataset has a corresponding result folder. rating-* files for Vicuna80 are in the vicuna80 dataset folder.

Default files follow these naming rules:

  1. rating-{reviewer}-temperature-{temperature}-{explicit}.json
    • Files that follow this naming rule contain reviews generated by large language model (LLM) reviewers.
    • {reviewer} is the name of the reviewer.
    • {temperature} is the temperature used to generate the reviews.
    • {explicit} indicates that the prompts contain explicit instructions.
  2. all-data-{reviewer1}-{reviewer2}-{multi}-{explicit}.json
    • Files that follow this naming rule contain all information about the two reviewers and their initial reviews.
    • The discussion history is added to this file after the discussion between reviewer 1 and reviewer 2.
    • {reviewer1}, {reviewer2}, and {explicit} are the same as above.
    • {multi} indicates whether a system prompt is concatenated after each turn.
  3. {reviewer1}-{reviewer2}-discussion-{multi}-{explicit}-log.txt
    • Files that follow this naming rule contain only the discussion history between {reviewer1} and {reviewer2}, in which {reviewer1} leads the discussion.
    • {reviewer1}, {reviewer2}, {multi}, and {explicit} are the same as above.
  4. gather_all.py
    • This script gathers the reviews of two reviewers (files named according to rule 1) and generates a file containing all information (a file named according to rule 2).
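
The naming rules above can be turned into file names mechanically. The sketch below is only an illustration and is not part of gather_all.py. The helper `result_paths` and the placeholder values (`modelA`, `modelB`, `0.2`, `multi`, `explicit`) are hypothetical, and it assumes both reviewers were run at the same temperature.

```python
from pathlib import Path

# Templates copied from naming rules 1-3.
RATING = "rating-{reviewer}-temperature-{temperature}-{explicit}.json"
ALL_DATA = "all-data-{reviewer1}-{reviewer2}-{multi}-{explicit}.json"
DISCUSSION_LOG = "{reviewer1}-{reviewer2}-discussion-{multi}-{explicit}-log.txt"


def result_paths(dataset_dir, reviewer1, reviewer2, temperature, multi, explicit):
    """Return the file paths implied by the naming rules for one reviewer pair.

    Hypothetical helper: assumes both reviewers share one temperature value.
    """
    root = Path(dataset_dir)
    pair = dict(reviewer1=reviewer1, reviewer2=reviewer2, multi=multi, explicit=explicit)
    return {
        # One rating file per reviewer (rule 1).
        "ratings": [
            root / RATING.format(reviewer=r, temperature=temperature, explicit=explicit)
            for r in (reviewer1, reviewer2)
        ],
        # Combined file for the pair (rule 2).
        "all_data": root / ALL_DATA.format(**pair),
        # Discussion log led by reviewer1 (rule 3).
        "discussion_log": root / DISCUSSION_LOG.format(**pair),
    }


# Placeholder values are hypothetical and chosen only for illustration.
paths = result_paths("peer_discussion/lfqa", "modelA", "modelB", "0.2", "multi", "explicit")
print(paths["all_data"].name)  # all-data-modelA-modelB-multi-explicit.json
```

Keeping the templates as plain format strings means the same constants can be reused to check whether an existing file matches a rule before reading it.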