Training fixes #53

benjamin-elder · 2025-10-29T18:11:48Z

Changes to the training loader to use context response pairs.

Changes to tool call env and evaluation.py to enable parallel/multiprocess inference.

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

AnkitaNaik

LGTM!!

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

Co-authored-by: siyuhuo <[email protected]>

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

AnkitaNaik

LGTM!!

AnkitaNaik · 2025-11-13T19:50:11Z

data_utils/create_data_splits/create_samples_for_sft.py

    print(f"Datasets Loaded for {format} turn data.")
    print("------------------------------------------------------------------------------------------------------------------------------------------------------------------------------")
-    assert set(data_no_scenarios_dict.keys()) == set(data_with_scenario_dict.keys())
+    # assert set(data_no_scenarios_dict.keys()) == set(data_with_scenario_dict.keys())


Is this assertion not valid in exploratory trajectories? @benjamin-elder

@AnkitaNaik yeah that seems to be the case. The alternate traces are sort of randomly dispersed I guess, so there's no guarantee that you won't have a case where the non-scenario doesn't have one and the scenario does, or vice-versa.

AnkitaNaik · 2025-11-13T19:54:16Z

envs/apis/rest/call.py

 from transformers import AutoModelForCausalLM, AutoTokenizer
 device = "auto"
-model_path = "ibm-granite/granite-3.0-8b-base"
+model_path = "ibm-granite/granite-4.0-micro"


Good catch!!

benjamin-elder requested review from AnkitaNaik and syhcode October 29, 2025 18:11

benjamin-elder marked this pull request as ready for review October 31, 2025 14:46

Benjamin Elder [email protected] added 9 commits October 31, 2025 13:06

loader takes c-r pairs

9d880ba

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

saving train loader changes

6170241

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

some comments

18b6a4c

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

added scripts and configs

f43d61e

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

val split/loss calculation

d62a3f2

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

still fixing bugs

db87f44

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

more updates

7488702

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

single command script

af6af37

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

stashing some changes

2046bad

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

benjamin-elder force-pushed the training_fixes branch from 76a923a to 2046bad Compare October 31, 2025 17:07

Benjamin Elder [email protected] added 4 commits October 31, 2025 13:29

simplify configs

8f92e86

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

bug fix

6e859b4

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

sleep longer

bf2c1ae

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

custom_loader has thoughts

8a7df5b

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

AnkitaNaik approved these changes Nov 3, 2025

View reviewed changes

Benjamin Elder [email protected] and others added 11 commits November 10, 2025 16:42

separate train/val into different files

3a4f50c

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

Config files for train runs

f1afec5

Update file listing

846cf87

Fix checkpointing

09cfc51

Fix <think> tag and print formatted prompts

9cd6d79

Minor fix for logging based on version numbers

29f7ffd

Temporary updated gradient accumulation values.

239548f

Changes for simplifying logging

6c4bcbc

Merge branch 'training_fixes' into dev/training_fixes_train_runs

d18344c

Remove debugging steps

4deb079

stashing training changes

5b87887

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

benjamin-elder force-pushed the training_fixes branch from 33e0ffd to 5b87887 Compare November 12, 2025 14:42

Benjamin Elder [email protected] and others added 4 commits November 13, 2025 08:20

rebase error in custom_loader

bbd5f09

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

shuffle error

64928ab

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

siyu-fixes (#56)

8bdbc90

Co-authored-by: siyuhuo <[email protected]>

specify the lora modules to target

0bcefcd

Signed-off-by: Benjamin Elder [email protected] <[email protected]>

AnkitaNaik approved these changes Nov 13, 2025

View reviewed changes

benjamin-elder merged commit 1392c1a into main Nov 13, 2025
1 check passed

benjamin-elder deleted the training_fixes branch November 13, 2025 20:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Training fixes #53

Training fixes #53

Uh oh!

benjamin-elder commented Oct 29, 2025

Uh oh!

AnkitaNaik left a comment

Uh oh!

AnkitaNaik left a comment

Uh oh!

AnkitaNaik Nov 13, 2025

Uh oh!

benjamin-elder Nov 13, 2025

Uh oh!

AnkitaNaik Nov 13, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Training fixes #53

Training fixes #53

Uh oh!

Conversation

benjamin-elder commented Oct 29, 2025

Uh oh!

AnkitaNaik left a comment

Choose a reason for hiding this comment

Uh oh!

AnkitaNaik left a comment

Choose a reason for hiding this comment

Uh oh!

AnkitaNaik Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

benjamin-elder Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

AnkitaNaik Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants