Inconsistent responses for the same case with different limit parameters #2550

Starry-Liu1 · 2024-12-07T03:01:42Z

Hello, I ran into a problem: when I use the same input parameters but vary the limit value, the responses differ. This behavior is unexpected, since changing the limit should only affect the number of results returned, not the response content for identical cases.

This is the command I ran:
lm_eval --model hf --limit 0.01 --model_args pretrained=../../../llms/Meta-Llama-3.1-8B-Instruct --tasks gsm8k_cot --device cuda:4 --log_samples --batch_size 1 --output_path ./results/gsm8k_cot/hf_test_0.01_v2.json --gen_kwargs max_gen_toks=512 do_sample=true temperature=0.5 --seed 42,42,42,42

However, when the limit is changed to 0.015, the response for the same case is different. Nearly all responses differ between the two settings.

How can I fix this bug and get the same response? Thanks.
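
One possible contributor, stated as an assumption rather than a confirmed diagnosis: with do_sample=true, generation draws from a seeded random number generator, so the output for a given case may depend on how many draws happened before it. If a different limit changes which cases run or in what order, the same case would land at a different point in the random stream. The sketch below shows that effect using only Python's standard random module. It does not call lm_eval or transformers, and fake_generate and run are hypothetical names made up for illustration.

```python
import random


def fake_generate(prompt, rng):
    # Hypothetical stand-in for sampled decoding: the "response" is just
    # five draws from the shared RNG. The prompt itself is not used, which
    # isolates the effect of the RNG position.
    return [rng.randint(0, 999) for _ in range(5)]


def run(prompts, seed=42):
    # One seeded RNG is shared by every prompt in the run, so each prompt
    # consumes the draws that follow those used by earlier prompts.
    rng = random.Random(seed)
    return {p: fake_generate(p, rng) for p in prompts}


small = run(["case_a", "case_b"])
small_again = run(["case_a", "case_b"])
large = run(["case_c", "case_a", "case_b"])  # one extra case runs first

# Repeating the identical run reproduces the same response.
print(small["case_a"] == small_again["case_a"])

# Same seed and same case, but a different position in the random stream.
print(small["case_a"] == large["case_a"])

# The second slot of each run receives the same draws, whichever case is there.
print(small["case_b"] == large["case_a"])
```

If this is what is happening, the seed is being honored and the two runs are simply not comparable case by case. Setting do_sample=false in --gen_kwargs (greedy decoding) should remove this particular source of variation, although it also changes what is being measured.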
