[Bugfix] Avoid truncating the outputs based on string lengths #201
Conversation
hlog_warn(
-    f"The smallest context of your batch ({smallest_context}) is bigger than the maximum context size allowed by the model ({self.max_length}) for a task in"
+    f"The context size of your batch ({context_size}) is bigger than the maximum context size allowed by the model ({self.max_length}) for a task in"
Replaced smallest_context here, because the real issues are caused by biggest_context (renamed to context_size).
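As a rough sketch of the check this warning sits in after the rename (a sketch only, not lighteval's actual code; hlog_warn, context_size and self.max_length come from the diff above, while tokenized_batch is an assumed name for the batch's tokenized prompts):

# The prompt that risks being truncated is the largest one in the batch,
# so the warning keys on the biggest tokenized context, not the smallest.
context_size = max(len(tokens) for tokens in tokenized_batch)
if context_size > self.max_length:
    hlog_warn(
        f"The context size of your batch ({context_size}) is bigger than the maximum "
        f"context size allowed by the model ({self.max_length}) for a task in"
    )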
Overall LGTM, I'd welcome a second look by @NathanHB
Side note - we need to add a long context eval to the test suite, though gpt2 has so small a context that anything is long context I guess XD
…gface#201)

* Fix context size
* - redundant condition

---------

Co-authored-by: Clémentine Fourrier <[email protected]>
This fixes a bug where max_new_tokens was set to 1 due to biggest_context = max(len(c) for c in context) instead of biggest_context = <max number of tokens>. This meant that a 4-shot gsm8k eval was generating single-token predictions if the model's context size is 2048.

To reproduce:
And check for single-token predictions in
$OUTPUT_DIR/details/HuggingFaceFW/ablation-model-fineweb-edu/2024-06-11T15-34-10.689447/details_leaderboard|gsm8k|4_2024-06-11T15-34-10.689447.parquet
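For illustration, a minimal sketch of the buggy vs. fixed measurement (the names biggest_context and context_size come from the PR; contexts, tokenizer and max_length are assumed stand-ins for the prompt strings, the model's Hugging Face tokenizer and the model's context window):

def max_new_tokens_buggy(contexts, max_length):
    # Counts characters, not tokens: a 4-shot gsm8k prompt of a few thousand
    # characters already looks bigger than a 2048-token window, so the
    # generation budget collapses to a single token.
    biggest_context = max(len(c) for c in contexts)
    return max(max_length - biggest_context, 1)

def max_new_tokens_fixed(contexts, max_length, tokenizer):
    # Counts tokens, which is what the model's max_length actually limits.
    context_size = max(len(tokenizer(c)["input_ids"]) for c in contexts)
    return max(max_length - context_size, 1)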