[Bugfix] Avoid truncating the outputs based on string lengths #201

anton-l · 2024-06-11T15:52:07Z

This fixes a bug where max_new_tokens was set to 1 because biggest_context was computed as max(len(c) for c in context), which measures string length in characters, instead of biggest_context = <max number of tokens>.

As a result, a 4-shot gsm8k eval with a model context size of 2048 generated single-token predictions, since the character count of the few-shot prompt was being compared against a limit expressed in tokens.
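To make the failure mode concrete, here is a minimal sketch of how measuring the context in characters can collapse the generation budget. It is not the lighteval code: the helper `generation_budget`, the floor of 1, and the whitespace "tokenizer" are assumptions chosen only to illustrate the difference between the two measurements.

```python
def generation_budget(context_size: int, max_length: int) -> int:
    """Tokens left for generation (hypothetical helper, floored at 1 by assumption)."""
    return max(1, max_length - context_size)


def toy_tokenize(text: str) -> list[str]:
    """Stand-in tokenizer: splits on whitespace. A real model tokenizer differs."""
    return text.split()


max_length = 2048  # model context size, in tokens
contexts = ["Question: " + "word " * 600 + "Answer:"]

# Buggy measurement: string length in characters.
chars = max(len(c) for c in contexts)
# Intended measurement: number of tokens.
tokens = max(len(toy_tokenize(c)) for c in contexts)

buggy_budget = generation_budget(chars, max_length)
fixed_budget = generation_budget(tokens, max_length)
print(chars, tokens, buggy_budget, fixed_budget)
```

In this toy case the prompt is over 2048 characters but well under 2048 tokens, so the character-based budget bottoms out while the token-based one leaves room to generate.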

To reproduce:

accelerate launch --num_processes=1 "run_evals_accelerate.py" --model_args="pretrained=HuggingFaceFW/ablation-model-fineweb-edu" \
      --output_dir $OUTPUT_DIR --max_samples 1000 --override_batch_size 32 \
      --tasks "leaderboard|gsm8k|4|1"

And check for single-token predictions in $OUTPUT_DIR/details/HuggingFaceFW/ablation-model-fineweb-edu/2024-06-11T15-34-10.689447/details_leaderboard|gsm8k|4_2024-06-11T15-34-10.689447.parquet
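Once the predictions have been loaded from the details file as plain strings, a rough scan for suspiciously short outputs could look like the sketch below. The function name `single_token_rate` is hypothetical, no column names of the parquet file are assumed, and whitespace splitting is only a proxy for the model's tokenizer.

```python
def single_token_rate(predictions: list[str]) -> float:
    """Fraction of predictions with at most one whitespace-separated piece."""
    if not predictions:
        return 0.0
    short = sum(1 for p in predictions if len(p.split()) <= 1)
    return short / len(predictions)


# Illustrative strings only, not real eval outputs.
sample = ["The", "Janet sells 16 - 3 - 4 = 9 eggs. #### 18", ""]
rate = single_token_rate(sample)
print(f"{rate:.2f}")
```

A rate close to 1.0 on a generative task such as gsm8k would suggest the truncation described above.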

anton-l · 2024-06-11T15:55:31Z

src/lighteval/models/base_model.py

                    hlog_warn(
-                        f"The smallest context of your batch ({smallest_context}) is bigger than the maximum context size allowed by the model ({self.max_length}) for a task in"
+                        f"The context size of your batch ({context_size}) is bigger than the maximum context size allowed by the model ({self.max_length}) for a task in"


Replaced smallest_context here because the real issues are caused by biggest_context (renamed to context_size).
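A small sketch of why the largest context is the relevant one for this warning: within a batch, the shortest context can fit while the longest overflows. The function `overflow_flags` and the example lengths are hypothetical and only illustrate the comparison, not the exact condition used in base_model.py.

```python
def overflow_flags(token_lengths: list[int], max_length: int) -> tuple[bool, bool]:
    """Return (smallest context overflows, biggest context overflows)."""
    return min(token_lengths) > max_length, max(token_lengths) > max_length


# One short and one over-long context in the same batch (lengths in tokens).
flags = overflow_flags([500, 2100], 2048)
print(flags)
```

A check keyed on the smallest context stays silent here, while a check on the biggest one flags the batch.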

clefourrier

Overall LGTM, though I'd welcome a second look by @NathanHB.
Side note: we need to add a long-context eval to the test suite, though gpt2 has such a small context that anything is long context, I guess XD

…gface#201) * Fix context size * - redundant condition --------- Co-authored-by: Clémentine Fourrier <[email protected]>

Fix context size

1b058b5

anton-l requested a review from clefourrier June 11, 2024 15:52

anton-l commented Jun 11, 2024

clefourrier reviewed Jun 12, 2024

clefourrier reviewed Jun 12, 2024

- redundant condition

4ff071c

anton-l requested a review from NathanHB June 12, 2024 10:46

clefourrier approved these changes Jul 3, 2024

clefourrier added 4 commits July 4, 2024 10:27

Merge branch 'main' into context_size_fix

6a81af4

Merge branch 'main' into context_size_fix

8f136c9

Merge branch 'main' into context_size_fix

b486665

Merge branch 'main' into context_size_fix

5c825d5

clefourrier merged commit 6064695 into huggingface:main Jul 8, 2024
2 checks passed

hynky1999 pushed a commit that referenced this pull request May 22, 2025

[Bugfix] Avoid truncating the outputs based on string lengths (#201)

f9a6072

* Fix context size * - redundant condition --------- Co-authored-by: Clémentine Fourrier <[email protected]>
