Draft
32 commits
f85f7a3
Added repro-bug script
finbarrtimbers Sep 18, 2025
63566c0
Add uv run to repro script
finbarrtimbers Sep 18, 2025
3755408
Updated uv.lock
finbarrtimbers Sep 18, 2025
af7bb08
Added tokenizer
finbarrtimbers Sep 19, 2025
f03cbdc
Now, we use my olmo3-test
finbarrtimbers Sep 19, 2025
2f354d2
Now, we use my olmo3-test
finbarrtimbers Sep 19, 2025
47f894f
Added logging for vllm version
finbarrtimbers Sep 19, 2025
a9ec674
Updated code to fix bug
finbarrtimbers Sep 19, 2025
5e18e10
removed logging
finbarrtimbers Sep 19, 2025
df843fb
Fixed max concurrency
finbarrtimbers Sep 19, 2025
10f4b54
Added runai dep
finbarrtimbers Sep 22, 2025
df46c75
Add runai-model-streamer-gcs dependency for GCS support
finbarrtimbers Sep 22, 2025
52f9cd1
Updated project memory
finbarrtimbers Sep 22, 2025
e60c8c4
Updated code with warning when blocking put
finbarrtimbers Sep 25, 2025
437e306
Added creds
finbarrtimbers Sep 25, 2025
4a93c3d
Add boto3 dependency for GCS support
finbarrtimbers Sep 22, 2025
ab7cc52
Updated script to use tokenizer
finbarrtimbers Sep 26, 2025
152b2d6
Updated tokenizer
finbarrtimbers Sep 26, 2025
a24fad2
Updated multi-node script
finbarrtimbers Sep 26, 2025
cd98288
Added bos
finbarrtimbers Sep 26, 2025
148ad12
Added tokenizer to multi-node script
finbarrtimbers Sep 26, 2025
bb8c958
use titan for 32b
finbarrtimbers Sep 27, 2025
3b71bf8
Log memory usage.
finbarrtimbers Sep 29, 2025
73d29fb
Updated clusters.
finbarrtimbers Sep 29, 2025
22d5b0d
Set gather_whole_model False
finbarrtimbers Sep 29, 2025
dfb0251
Now, we log GPU memory.
finbarrtimbers Sep 29, 2025
942fdf5
Use stage 3
finbarrtimbers Sep 29, 2025
671f17c
Now use tp 2
finbarrtimbers Sep 29, 2025
e308dc6
Cleaned up PR.
finbarrtimbers Sep 29, 2025
9d40c46
Undid changes
finbarrtimbers Oct 6, 2025
0a0b834
Updated code
finbarrtimbers Oct 6, 2025
4f2003e
Undid changes
finbarrtimbers Oct 6, 2025
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -30,6 +30,9 @@ dependencies = [
     "immutabledict==1.2.0",
     "flash-attn>=2.8.3; platform_system != 'Darwin'",
     "liger-kernel>=0.5.4; platform_system != 'Darwin'",
+    "runai-model-streamer>=0.14.0",
+    "runai-model-streamer-gcs>=0.14.0",
+    "boto3>=1.26.0",
 ]

 [build-system]
9 changes: 7 additions & 2 deletions scripts/train/debug/large_test_script.sh
@@ -34,7 +34,7 @@ uv run python mason.py \
     --max_prompt_token_length 2048 \
     --response_length 4096 \
     --pack_length 20480 \
-    --model_name_or_path Qwen/Qwen2.5-7B \
+    --model_name_or_path /weka/oe-adapt-default/finbarrt/stego32/step17000-hf \
     --chat_template_name tulu_thinker \
     --inflight_updates True \
     --stop_strings "</answer>" \
@@ -61,4 +61,9 @@ uv run python mason.py \
     --oe_eval_max_length 32768 \
     --oe_eval_tasks "codex_humanevalplus:0-shot-chat-v1::tulu-thinker,mbppplus:0-shot-chat::tulu-thinker,livecodebench_codegeneration::tulu-thinker" \
     --dataset_skip_cache True \
-    --push_to_hub False
+    --push_to_hub False \
+    # Requirements for OLMo3 32B
+    --tokenizer_name_or_path "allenai/OLMo-2-1124-7B" \
+    --add_bos \
+    --deepspeed_stage 3 \
+    --gather_whole_model False \
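One thing worth checking in the hunk above: the added `# Requirements for OLMo3 32B` line sits inside a backslash-continued command. In POSIX shells the backslash-newline is removed before comments are recognized, so the comment most likely ends the `mason.py` command at `--push_to_hub False`, and the four flags below it would then be parsed as a separate command. The sketch below illustrates that behaviour with `count_args`, a hypothetical stand-in for the real launcher (it is not part of this PR). It also shows one possible fix, under the assumption that all four flags are meant to reach the same command: move the comment above the command and drop the trailing backslash.

```shell
# count_args is a hypothetical stand-in for the launcher; it prints how many
# arguments it received.
count_args() { echo "$#"; }

# Comment placed inside the continuation, as in the diff. After the
# backslash-newline is removed, the comment swallows the rest of that line,
# so the command ends after "--push_to_hub False".
broken=$(count_args --push_to_hub False \
    # Requirements for OLMo3 32B
)

# Possible fix: keep the comment outside the continued command.
# Requirements for OLMo3 32B
fixed=$(count_args --push_to_hub False \
    --tokenizer_name_or_path "allenai/OLMo-2-1124-7B" \
    --add_bos \
    --deepspeed_stage 3 \
    --gather_whole_model False)

echo "arguments seen with inline comment: $broken"
echo "arguments seen with comment moved:  $fixed"
```

If the comment needs to stay next to the flags it describes, collecting the flags in a Bash array (where comments between elements are allowed) is another option, at the cost of requiring Bash rather than plain `sh`.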