Commit c08bfcb

test
natolambert committed Aug 23, 2024
1 parent 76d41a6 commit c08bfcb
Showing 2 changed files with 11 additions and 21 deletions.
16 changes: 8 additions & 8 deletions .github/workflows/push-image.yml
@@ -21,14 +21,14 @@ on:
 - 'Dockerfile'
 - 'setup.py'
 branches: [main]
-# pull_request: # note, comment this out for running on every push
-# # Also run on PRs that update the files in the image's directory (other than README).
-# branches: [main]
-# paths:
-# - 'rewardbench/**'
-# - 'scripts/**'
-# - 'Dockerfile'
-# - 'setup.py'
+pull_request: # note, comment this out for running on every push
+# Also run on PRs that update the files in the image's directory (other than README).
+branches: [main]
+paths:
+- 'rewardbench/**'
+- 'scripts/**'
+- 'Dockerfile'
+- 'setup.py'
 # workflow_dispatch: # This allows us to manually trigger a build through the GitHub UI.

 env:
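For reference, a minimal sketch of how the `on:` trigger section of `.github/workflows/push-image.yml` could read after this change. The indentation, and the contents of the `push:` block above the visible hunk, are assumptions reconstructed from the surrounding context lines rather than part of the diff itself.

```yaml
# Hypothetical post-commit trigger section; indentation and the push: block are assumed.
on:
  push:
    paths:
      - 'rewardbench/**'
      - 'scripts/**'
      - 'Dockerfile'
      - 'setup.py'
    branches: [main]
  pull_request: # note, comment this out for running on every push
    # Also run on PRs that update the files in the image's directory (other than README).
    branches: [main]
    paths:
      - 'rewardbench/**'
      - 'scripts/**'
      - 'Dockerfile'
      - 'setup.py'
  # workflow_dispatch: # This allows us to manually trigger a build through the GitHub UI.
```

With the `pull_request` trigger uncommented, the image build also runs on pull requests, but only when they touch paths that affect the image, so README-only PRs are skipped.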
16 changes: 3 additions & 13 deletions README.md
@@ -234,20 +234,10 @@ Notes: Do not use the character - in image names for beaker,

 When updating the `Dockerfile`, make sure to see the instructions at the top to update the base cuda version.

-In development, we have the following docker images (most recent first as it's likely what you need).
+TODO: Update it so one image has VLLM (for generative RM only) and one without. Without will load much faster.
+We recently switched to automatic beaker image building workflows.
+You can use this image, or the last image with the previous Dockerfile
 - `nathanl/rewardbench_auto`: Automatic image.
-- `nathanl/rb_v23`, Jul. 2024: Include support for bfloat16 models from command line
-- `nathanl/rb_v22`, Jul. 2024: Include new [Generalizable Reward Model](https://arxiv.org/abs/2406.10216)
-- `nathanl/rb_v20`: Fixes to DPO handling (minor) + llama 3 not quantized for dpo
-- `nathanl/rb_v18`: Improvements to RewardBench CLI
-- `nathanl/rb_v17` (with VLLM): add support for vllm + llm as a judge, `rb_v16` is similar without prometheus and some OpenAI models
-- `nathanl/rb_v12`: add support for llama3
-- `nathanl/rewardbench_v10`: add support for `mightbe/Better-PairRM` via jinja2
-- `nathanl/rewardbench_v8`: add support for `openbmb/Eurus-RM-7b` and starcoder2
-- `nathanl/rewardbench_v5`: improve saving with DPO script
-- `nathanl/rewardbench_v4`: fix EOS token bug on FastChat models (GH #90)
-- `nathanl/rewardbench_v2`: fix beaver cost model
-- `nathanl/rewardbench_v1`: release version

 ## Citation
 Please cite our work with the following: