Docs model ladder #708

Draft
wants to merge 4 commits into
base: main
41 changes: 41 additions & 0 deletions docs/model_ladder.md
@@ -0,0 +1,41 @@
# Model Ladder

The model ladder is a set of scripts that help you easily run models over a standardized set of parameter sizes and token multipliers.

## setup
You probably only need Beaker Gantry.

## example usage
For example, the following command trains a 150M model on the dolma17 data mix with a token budget of 20 * number of parameters (`1xC`, i.e. one Chinchilla, cuz who doesn't like more obscurity in naming), with a specified run name, and reads all the data from S3:
```
scripts/beaker/ladder-launch.sh 1 --model 150M --data dolma17 --length 1xC --name testing-out-model-ladder --s3
```
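
To make the length arithmetic concrete, here is a minimal sketch of how a length such as `1xC` could translate into a token budget. The helper names `parse_size` and `token_budget` are hypothetical and are not taken from `ladder.py`. The sketch also assumes the nominal size label (`150M`) is used as the parameter count, which may differ from how the ladder actually counts parameters.

```python
import re

# Tokens per parameter for "1xC" (one Chinchilla), as described above.
CHINCHILLA_MULTIPLIER = 20

# Assumed suffixes for nominal model size labels such as "150M" or "1B".
_SIZE_SUFFIXES = {"M": 10**6, "B": 10**9}


def parse_size(size: str) -> int:
    """Turn a nominal size label such as "150M" into a parameter count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([MB])", size)
    if match is None:
        raise ValueError(f"unrecognized model size: {size!r}")
    return round(float(match.group(1)) * _SIZE_SUFFIXES[match.group(2)])


def token_budget(model: str, length: str) -> int:
    """Number of training tokens for a model size and an "NxC" length."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)xC", length)
    if match is None:
        raise ValueError(f"unrecognized length: {length!r}")
    chinchillas = float(match.group(1))
    return round(chinchillas * CHINCHILLA_MULTIPLIER * parse_size(model))


if __name__ == "__main__":
    # 150M parameters * 20 tokens per parameter = 3B tokens
    print(token_budget("150M", "1xC"))
```

Under these assumptions, the example command above would train on roughly 3 billion tokens.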

## data mixes
Data mixes are defined in [named_data_mixes.py](olmo/data/named_data_mixes.py).

## detailed usage

### train command
```
usage: ladder.py train [-h] --model MODEL --data DATA [--length LENGTH] --name
                       NAME [--s3 | --no-s3] [--wandb | --no-wandb]
                       [--read_location READ_LOCATION]
                       [--write_location WRITE_LOCATION] [--save_overwrite]
                       [--load_path LOAD_PATH] [--eval_on_load]

options:
  -h, --help            show this help message and exit
  --model MODEL
  --data DATA
  --length LENGTH
  --name NAME
  --s3, --no-s3         read data from S3, write checkpoints to S3 (default:
                        False)
  --wandb, --no-wandb   create a run in wandb (default: True)
  --read_location READ_LOCATION
  --write_location WRITE_LOCATION
  --save_overwrite
  --load_path LOAD_PATH
  --eval_on_load
```
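
For readers who want to see how the flags above fit together, the following is a rough sketch of an `argparse` parser that accepts the same options. It is reconstructed only from the help text and is not the actual code in `ladder.py`. The function name `build_train_parser` is hypothetical, and the defaults for `--length`, `--read_location`, `--write_location`, and `--load_path` are not shown in the help text, so they are left as `None` here.

```python
import argparse


def build_train_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the `train` options listed in the help text."""
    parser = argparse.ArgumentParser(prog="ladder.py train")

    # Required arguments according to the usage line.
    parser.add_argument("--model", required=True)
    parser.add_argument("--data", required=True)
    parser.add_argument("--name", required=True)

    # Optional; the real default is not shown in the help text.
    parser.add_argument("--length", default=None)

    # Paired on/off flags (BooleanOptionalAction needs Python 3.9+).
    parser.add_argument(
        "--s3",
        action=argparse.BooleanOptionalAction,
        default=False,
        help="read data from S3, write checkpoints to S3",
    )
    parser.add_argument(
        "--wandb",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="create a run in wandb",
    )

    # Remaining options; defaults are assumptions.
    parser.add_argument("--read_location", default=None)
    parser.add_argument("--write_location", default=None)
    parser.add_argument("--save_overwrite", action="store_true")
    parser.add_argument("--load_path", default=None)
    parser.add_argument("--eval_on_load", action="store_true")
    return parser


# Parse the flags used in the example command from this document.
example_args = build_train_parser().parse_args(
    ["--model", "150M", "--data", "dolma17", "--length", "1xC",
     "--name", "testing-out-model-ladder", "--s3"]
)
```

With this sketch, `example_args.s3` is `True` because `--s3` was passed, and `example_args.wandb` stays `True` because that is its documented default. Passing `--no-wandb` would turn it off.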