[Performance significantly drops when increasing the batch_size] #2498

Open
yushengsu-thu opened this issue Nov 17, 2024 · 1 comment
yushengsu-thu commented Nov 17, 2024

Hello, I'm using the latest version (v0.4.3) of lm_eval, and I've found a weird phenomenon with llama-3.2-3B.
The following is my script:

BATCH_SIZE=256

torchrun --nproc-per-node=8 --no-python lm_eval \
    --model_args pretrained=meta-llama/Llama-3.2-3B \
    --tasks gsm8k_cot \
    --batch_size $BATCH_SIZE
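
(Aside: the multi-GPU data-parallel invocation documented in the lm-evaluation-harness README goes through accelerate launch rather than torchrun --no-python; a minimal sketch of the equivalent command, assuming the same model, task, and batch size:)

accelerate launch -m lm_eval \
    --model hf \
    --model_args pretrained=meta-llama/Llama-3.2-3B \
    --tasks gsm8k_cot \
    --batch_size $BATCH_SIZE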

Results for llama-3.2-3B (Nodes=1, GPUs=8):

batch_size=1
|  Tasks  |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|---------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k_cot|      3|flexible-extract|     8|exact_match|↑  |0.2987|±  |0.0126|
|         |       |strict-match    |     8|exact_match|↑  |0.2835|±  |0.0124|

batch_size=32
|  Tasks  |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|---------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k_cot|      3|flexible-extract|     8|exact_match|↑  |0.2790|±  |0.0124|
|         |       |strict-match    |     8|exact_match|↑  |0.2616|±  |0.0121|
 
batch_size=128
|  Tasks  |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|---------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k_cot|      3|flexible-extract|     8|exact_match|↑  |0.1304|±  |0.0093|
|         |       |strict-match    |     8|exact_match|↑  |0.1221|±  |0.0090|

batch_size=256
|  Tasks  |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|---------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k_cot|      3|flexible-extract|     8|exact_match|↑  |0.0409|±  |0.0055|
|         |       |strict-match    |     8|exact_match|↑  |0.0364|±  |0.0052|

Environment: accelerate 1.0.1

@baberabb baberabb self-assigned this Nov 18, 2024
@baberabb baberabb added the bug Something isn't working. label Nov 18, 2024
baberabb (Contributor) commented Nov 18, 2024

Hi! I can't reproduce this (1 GPU). Are you using the latest transformers?

hf (pretrained=meta-llama/Llama-3.2-3B), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 256

|  Tasks  |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|---------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k_cot|      3|flexible-extract|     8|exact_match|↑  |0.2980|±  |0.0126|
|         |       |strict-match    |     8|exact_match|↑  |0.2828|±  |0.0124|
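
(A quick way to confirm which versions are actually installed, using plain pip, nothing lm_eval-specific:)

pip show lm_eval transformers accelerate | grep -E '^(Name|Version)'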
