Inconsistent responses for the same case with different limit parameters #2550

Open
Starry-Liu1 opened this issue Dec 7, 2024 · 0 comments


Starry-Liu1 commented Dec 7, 2024

Hello, I ran into a problem: with otherwise identical input parameters, changing only the limit value changes the responses. This is unexpected, since the limit should only affect how many results are returned, not the response generated for an identical case.

This is the command I ran:
lm_eval --model hf --limit 0.01 --model_args pretrained=../../../llms/Meta-Llama-3.1-8B-Instruct --tasks gsm8k_cot --device cuda:4 --log_samples --batch_size 1 --output_path ./results/gsm8k_cot/hf_test_0.01_v2.json --gen_kwargs max_gen_toks=512 do_sample=true temperature=0.5 --seed 42,42,42,42

However, when I change the limit to 0.015, the response for the same case is different. In fact, nearly all responses differ between the two settings.
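
One possible explanation (an assumption, not confirmed in this thread): with do_sample=true, each generation consumes draws from a seeded random number generator, so the output for a given prompt depends on how many draws were consumed before it. If changing the limit changes the set or order of prompts that are processed, the same prompt can land at a different position in the random stream even though the seed is unchanged. Below is a minimal pure-Python sketch of that effect. `sample_response` and `run` are hypothetical stand-ins and do not call lm_eval or the model.

```python
import random


def sample_response(prompt, rng, n_tokens=5):
    # Hypothetical stand-in for sampled decoding: the "response" is just
    # n_tokens pseudo-random integers drawn from the shared generator.
    return [rng.randrange(1000) for _ in range(n_tokens)]


def run(prompts, seed=42):
    # One seeded generator is shared by all prompts, processed in order.
    rng = random.Random(seed)
    return {p: sample_response(p, rng) for p in prompts}


small = run(["q1", "q2"])        # smaller subset
large = run(["q3", "q1", "q2"])  # larger subset, q1 is now processed second

# Same seed and same order reproduces the same response.
print(small["q1"] == run(["q1", "q2"])["q1"])

# With an extra prompt in front, q1 receives the draws q2 received before.
print(small["q2"] == large["q1"])
```

If this is what happens here, the seed alone cannot make a given case reproducible across different limit values. Greedy decoding (do_sample=false) would be expected to remove this source of variation.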

How can I fix this and get the same response for the same case regardless of the limit? Thanks.
