@finbarrtimbers finbarrtimbers commented Sep 18, 2025

Fixes #677.

Experiments:

  1. Debug run: Beaker
  2. Tool run: Beaker
  3. Multi-node run: Beaker
  4. Finetune run: Beaker

@finbarrtimbers finbarrtimbers marked this pull request as ready for review September 19, 2025 17:13

@hamishivi hamishivi left a comment

Hmm, running this I get:

Traceback (most recent call last):
  File "/weka/oe-adapt-default/hamishi/pr_review/open-instruct/mason.py", line 924, in <module>
    main()
  File "/weka/oe-adapt-default/hamishi/pr_review/open-instruct/mason.py", line 915, in main
    tasks=[make_task_spec(args, full_command, i, beaker_secrets, whoami, args.resumable) for i, full_command in enumerate(full_commands)],
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/weka/oe-adapt-default/hamishi/pr_review/open-instruct/mason.py", line 915, in <listcomp>
    tasks=[make_task_spec(args, full_command, i, beaker_secrets, whoami, args.resumable) for i, full_command in enumerate(full_commands)],
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/weka/oe-adapt-default/hamishi/pr_review/open-instruct/mason.py", line 852, in make_task_spec
    spec = beaker.TaskSpec(
           ^^^^^^^^^^^^^^^
AttributeError: module 'beaker' has no attribute 'TaskSpec'
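
One plausible reading of the error above is that the installed `beaker` package exposes the class under a different name than the one `mason.py` looks up. A minimal sketch of a version-tolerant lookup follows. It assumes the newer release uses a `Beaker`-prefixed name (`BeakerTaskSpec` is an assumption here, since the thread does not confirm that particular rename); `resolve_attr` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the real module.

```python
from types import SimpleNamespace


def resolve_attr(module, *candidates):
    """Return the first attribute named in `candidates` that `module` defines."""
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise AttributeError(f"{module!r} defines none of {candidates}")


# Stand-ins for the real `beaker` module under two hypothetical versions.
old_api = SimpleNamespace(TaskSpec="old TaskSpec class")
new_api = SimpleNamespace(BeakerTaskSpec="new BeakerTaskSpec class")

# Prefer the prefixed name, fall back to the unprefixed one.
task_spec_old = resolve_attr(old_api, "BeakerTaskSpec", "TaskSpec")
task_spec_new = resolve_attr(new_api, "BeakerTaskSpec", "TaskSpec")
```

A lookup like this only hides the naming difference; constructor arguments may also differ between versions, so pinning a single library version is usually the simpler fix.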


finbarrtimbers commented Sep 20, 2025 via email

@hamishivi

@finbarrtimbers running a slightly edited version of scripts/train/tulu3/finetune_8b.sh:

python mason.py \
    --cluster ai2/jupiter-cirrascale-2 \
    --workspace ai2/tulu-3-dev \
    --priority high \
    --image nathanl/open_instruct_auto --pure_docker_mode \
    --preemptible \
    --num_nodes 8 \
    --budget ai2/oe-adapt \
    --gpus 8 -- accelerate launch \
    --mixed_precision bf16 \
    --num_processes 8 \
    --use_deepspeed \
    --deepspeed_config_file configs/ds_configs/stage3_no_offloading_accelerate.conf \
    --deepspeed_multinode_launcher standard \
    open_instruct/finetune.py \
    --exp_name tulu3_8b_sft \
    --model_name_or_path meta-llama/Llama-3.1-8B \
    --model_revision main \
    --tokenizer_name meta-llama/Llama-3.1-8B \
    --tokenizer_revision main \
    --use_slow_tokenizer \
    --dataset_mixer_list allenai/tulu-3-sft-mixture 512 \
    --max_seq_length 4096 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 2 \
    --learning_rate 5e-06 \
    --lr_scheduler_type linear \
    --warmup_ratio 0.03 \
    --weight_decay 0.0 \
    --num_train_epochs 2 \
    --reduce_loss sum \
    --use_flash_attn \
    --gradient_checkpointing \
    --report_to wandb \
    --with_tracking \
    --logging_steps 1 \
    --seed 8

(edited just to reduce the dataset size)

- Changed beaker_client.workspace.secrets() to beaker_client.secret.list()
- Changed beaker_client.account.whoami() to beaker_client.user.get()
- Changed beaker.ExperimentSpec to beaker.BeakerExperimentSpec
- Changed beaker.Constraints to beaker.BeakerConstraints
- Changed beaker.RetrySpec to beaker.BeakerRetrySpec
- Changed all beaker.EnvVar to beaker.BeakerEnvVar
- Changed beaker.DataMount to beaker.BeakerDataMount
- Changed beaker.DataSource to beaker.BeakerDataSource
- Changed beaker.TaskResources to beaker.BeakerTaskResources
- Changed beaker.ImageSource to beaker.BeakerImageSource
- Changed beaker.ResultSpec to beaker.BeakerResultSpec
- Changed beaker.TaskContext to beaker.BeakerTaskContext
- Changed beaker.Priority to beaker.BeakerPriority
- The experiment.create() method now returns a BeakerWorkload object, which has an experiment field containing the ID.
- Fixed exception names: ConfigurationError → BeakerConfigurationError, ExperimentNotFound → BeakerExperimentNotFound
- Updated to use workload.get() and experiment.get_spec() instead of experiment.get()
- Changed description update to use workload.update() instead of experiment.set_description()
- Updated test mocks to match the new API structure
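
The class renames listed above all follow one pattern (`beaker.X` becomes `beaker.BeakerX`), so they can be sketched as a mechanical text rewrite. The snippet below is an illustrative sketch, not the method used in this PR: `migrate_source` and `RENAMED_CLASSES` are hypothetical names, the class list is copied from the bullets above, and the method-level changes (for example `workspace.secrets()` to `secret.list()`) and exception renames are deliberately left out because they are not simple prefix changes.

```python
import re

# Class names taken from the change list; each gains a "Beaker" prefix.
RENAMED_CLASSES = [
    "ExperimentSpec", "Constraints", "RetrySpec", "EnvVar",
    "DataMount", "DataSource", "TaskResources", "ImageSource",
    "ResultSpec", "TaskContext", "Priority",
]

# Match `beaker.<Name>` only as a whole identifier, so already-migrated
# names such as `beaker.BeakerEnvVar` are left untouched.
_PATTERN = re.compile(r"\bbeaker\.(" + "|".join(RENAMED_CLASSES) + r")\b")


def migrate_source(text: str) -> str:
    """Rewrite old-style `beaker.X` references to `beaker.BeakerX`."""
    return _PATTERN.sub(lambda m: f"beaker.Beaker{m.group(1)}", text)


old_line = "spec = beaker.ExperimentSpec(...); p = beaker.Priority.high"
new_line = migrate_source(old_line)
```

Because the pattern does not match names that already carry the prefix, running the rewrite twice gives the same result as running it once.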
Development

Successfully merging this pull request may close these issues.

Update to beaker-py 2