This codebase contains the scripts used to collect the prompts in berkeley-nest/Nectar, as well as to generate the seven-wise comparisons. In addition, we include data from ablations and the corresponding Jupyter notebooks for visualizing the experimental data.
The code has been verified on Python 3.10, but other versions of Python are likely compatible.
Run the command below to install the required packages:

```
pip install -r requirements.txt
```
Code for collecting prompts is found in the prompts directory. Public datasets are compiled into a single dataset via create_data.ipynb, which also applies some basic duplicate-detection heuristics.
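For illustration, here is a minimal sketch of one common duplicate-detection heuristic: normalize each prompt, hash it, and keep only the first occurrence. This is an assumption about the general approach, not the exact logic in create_data.ipynb.

```python
import hashlib

def normalize(prompt: str) -> str:
    # Lowercase and collapse whitespace so trivially different copies match.
    return " ".join(prompt.lower().split())

def dedupe(prompts: list[str]) -> list[str]:
    # Keep the first occurrence of each normalized prompt.
    seen, unique = set(), []
    for p in prompts:
        h = hashlib.sha256(normalize(p).encode("utf-8")).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(p)
    return unique
```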
The prompts/visualizations directory contains the Jupyter notebook used to generate figures for the paper.
Code for distillation is found in the distillation directory. Distill.py contains the script used to run inference on various models in parallel.
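As a rough sketch of the parallel-inference pattern (the actual Distill.py may differ), the hypothetical query_model helper below stands in for a real model API call:

```python
from concurrent.futures import ThreadPoolExecutor

def query_model(model_name: str, prompt: str) -> str:
    # Hypothetical stand-in for an API call to the given model.
    raise NotImplementedError

def distill(models: list[str], prompt: str) -> dict[str, str]:
    # Fan the same prompt out to every model concurrently and
    # collect one completion per model.
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {m: pool.submit(query_model, m, prompt) for m in models}
        return {m: f.result() for m, f in futures.items()}
```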
Code for generating the seven-wise ratings is found in the rating directory. Specifically, rate.py contains the final code used to generate the Nectar ratings for all 180k rows.
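One step such a rating script needs is extracting a K-wise ranking from the judge model's free-form response. The sketch below is a hypothetical parser, assuming the judge ends its answer with a bracketed list of answer indices; it is not the actual rate.py logic:

```python
import re

def parse_ranking(judge_output: str, k: int = 7) -> list[int] | None:
    # Hypothetical parser: expects the judge response to contain a ranked
    # list of answer indices, e.g. "Ranking: [3, 1, 7, 2, 5, 4, 6]".
    match = re.search(r"\[([\d,\s]+)\]", judge_output)
    if match is None:
        return None
    ranking = [int(x) for x in re.findall(r"\d+", match.group(1))]
    # Reject malformed outputs that are not a permutation of 1..k.
    return ranking if sorted(ranking) == list(range(1, k + 1)) else None
```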
The rating/visualizations directory contains various Jupyter notebooks for generating visualizations. The data for these visualizations can be found in the rating/results directory. Each directory under rating/results contains a prompt_log.txt with the prompt used, a log_args.txt with the rating-script arguments used, and a rankings.jsonl with the rankings output by the experiment.
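Since rankings.jsonl is a JSON Lines file, it can be loaded one JSON object per line; the snippet below is a minimal example (the exact schema of each record depends on the experiment):

```python
import json

def load_rankings(path: str) -> list[dict]:
    # Each line of rankings.jsonl is one JSON object; the fields
    # vary by experiment, so records are returned as plain dicts.
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

# "<experiment>" is a placeholder for one of the result directory names.
rankings = load_rankings("rating/results/<experiment>/rankings.jsonl")
```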
The rating/experiments folder contains some extra scripts for running specific experiments. Their outputs are stored in the correspondingly named directories under rating/results.
- measure_k_position_bias.py: Measures how positional bias changes as K increases (see the sketch after this list).
- measure_k_to_pairwise.py: Measures how judgment agreement with pairwise ratings changes as K increases.
- rate_pointwise.py: Creates pointwise ratings instead of pairwise ratings.
- rate_verbose.py: Tests ratings with more explicit anti-verbosity prompting.
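As referenced above for measure_k_position_bias.py, one simple way to quantify positional bias is to check how often the judge's top-ranked answer falls at each presentation position: with shuffled inputs and an unbiased judge, each of the K positions should win about 1/K of the time. The function below is an illustrative sketch, not the script's actual metric:

```python
from collections import Counter

def position_bias(first_choice_positions: list[int], k: int) -> dict[int, float]:
    # Illustrative (hypothetical) bias measure: the fraction of judgments
    # whose top-ranked answer sat at each of the k presentation positions.
    # With an unbiased judge and shuffled inputs, every position should
    # receive roughly 1/k of the top picks.
    counts = Counter(first_choice_positions)
    n = len(first_choice_positions)
    return {pos: counts.get(pos, 0) / n for pos in range(1, k + 1)}
```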