
fix sdxl mlperf time bug #1580

Open · wants to merge 1 commit into main

Conversation

huijuanzh (Contributor)

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Bug found when testing with num_batches = throughput_warmup_steps.

test cmd:
python text_to_image_generation.py --model_name_or_path /host/mnt/ctrl/disk1/personal/cg/models/stable-diffusion-xl-base-1.0 --prompts "Sailing ship painting by Van Gogh" --num_images_per_prompt 10 --batch_size 4 --image_save_dir /tmp/stable_diffusion_xl_images --scheduler euler_discrete --use_habana --use_hpu_graphs --num_inference_steps 30 --height 1024 --width 1024 --gaudi_config Habana/stable-diffusion --bf16 --optimize

Speed metrics: {'generation_runtime': 247.3958, 'generation_samples_per_second': 0.049, 'generation_steps_per_second': 1.31}

python text_to_image_generation.py --model_name_or_path /host/mnt/ctrl/disk1/personal/cg/models/stable-diffusion-xl-base-1.0 --prompts "Sailing ship painting by Van Gogh" --num_images_per_prompt 16 --batch_size 4 --image_save_dir /tmp/stable_diffusion_xl_images --scheduler euler_discrete --use_habana --use_hpu_graphs --num_inference_steps 30 --height 1024 --width 1024 --gaudi_config Habana/stable-diffusion --bf16 --optimize

Speed metrics: {'generation_runtime': 261.2659, 'generation_samples_per_second': 0.258, 'generation_steps_per_second': 7.751}
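
For context, here is a minimal sketch of why only the first command hits the bug. It assumes batches are formed as ceil(total_images / batch_size) and that throughput_warmup_steps defaults to 3; both are assumptions for illustration, not values quoted from the PR.

```python
import math

# Illustrative sketch, not the optimum-habana implementation.
# Assumptions: batches are formed as ceil(total_images / batch_size) and the
# default throughput_warmup_steps is 3.
batch_size = 4
throughput_warmup_steps = 3

for total_images in (10, 16):
    num_batches = math.ceil(total_images / batch_size)
    # With 10 images there are exactly 3 batches, equal to the warmup step
    # count, which is the case the PR description flags as buggy; with 16
    # images there are 4 batches and the per-batch warmup path is used.
    hits_warmup_inference_steps_path = num_batches <= throughput_warmup_steps
    print(total_images, num_batches, hits_warmup_inference_steps_path)
# 10 -> 3 batches, warmup-inference-steps path (reported 0.049 images/s above)
# 16 -> 4 batches, regular path (reported 0.258 images/s above)
```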

G2D:

| resolution | steps | bs | Throughput (image/s) |
|------------|-------|----|----------------------|
| 1024*1024  | 30    | 1  | 0.249                |
| 1024*1024  | 30    | 2  | 0.257                |
| 1024*1024  | 30    | 4  | 0.047                |
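
A rough back-of-the-envelope check of where the bad number comes from: before the fix, the reported throughput effectively divides num_batches * batch_size samples by a window that still contains the HPU graph compilation/warmup time, whereas the intended metric divides batch_size by the post-warmup time of the last batch only. The 16.4 s figure below is a hypothetical steady-state window for illustration, not a value taken from the PR.

```python
# Rough arithmetic only; numbers other than 247.4 s and the batch/step counts
# are assumptions for illustration.
batch_size = 4
num_batches = 3
full_runtime = 247.4          # generation_runtime from the first command above

# Buggy accounting: all generated samples over (almost) the full runtime,
# warmup/compilation included -> ~0.049 images/s, matching the bad metric.
print(round(num_batches * batch_size / full_runtime, 3))

# Intended accounting after the fix: only the last batch is timed, after its
# warmup inference steps, and the sample count matches that timed window.
last_batch_window = 16.4      # hypothetical post-warmup time for one batch
print(round(batch_size / last_batch_window, 3))  # ~0.244, in line with the
                                                 # post-patch table further down
```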

@huijuanzh requested a review from regisss as a code owner on December 9, 2024, 07:05
@huijuanzh (Contributor, Author)

G2D test with the updated patch:

| resolution | steps | num_images_per_prompt | bs | Throughput (image/s) |
|------------|-------|-----------------------|----|----------------------|
| 1024*1024  | 30    | 4                     | 4  | 0.24                 |
| 1024*1024  | 30    | 8                     | 4  | 0.241                |
| 1024*1024  | 30    | 10                    | 4  | 0.244                |
| 1024*1024  | 30    | 12                    | 4  | 0.245                |
| 1024*1024  | 30    | 16                    | 4  | 0.258                |
| 1024*1024  | 30    | 20                    | 4  | 0.258                |
| 1024*1024  | 30    | 30                    | 4  | 0.259                |

Diff under review (throughput accounting in the use_warmup_inference_steps case; the num_steps line is shown as unchanged context):

    if t1 == t0 or use_warmup_inference_steps:
    -   num_samples = num_batches * batch_size
        num_steps = (num_inference_steps - throughput_warmup_steps) * num_batches * batch_size
    +   num_samples = batch_size
Collaborator:

There could still be more than 1 batch, no?

huijuanzh (Contributor, Author):

We only measure the last batch's time in the use_warmup_inference_steps case (the `and j == num_batches - 1` condition added at line 841), so num_samples is batch_size.
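
For readers following along, a minimal sketch of the loop structure this comment describes (hypothetical stand-in code, not the pipeline source; run_denoising_step is a placeholder and the values are illustrative): the timed window only starts inside the last batch, so only batch_size samples fall inside it.

```python
import time

# Illustrative values; not read from the real pipeline configuration.
num_batches = 3
num_inference_steps = 30
throughput_warmup_steps = 3
batch_size = 4
use_warmup_inference_steps = num_batches <= throughput_warmup_steps

def run_denoising_step(batch_idx, step_idx):
    # Placeholder for the actual UNet denoising step on HPU.
    time.sleep(0.001)

t0 = time.time()
t1 = t0
for j in range(num_batches):
    for i in range(num_inference_steps):
        # The `j == num_batches - 1` condition (line 841 in the PR) makes the
        # timed window start inside the LAST batch only.
        if use_warmup_inference_steps and j == num_batches - 1 and i == throughput_warmup_steps:
            t1 = time.time()
        run_denoising_step(j, i)

# Only the last batch (minus its warmup steps) lies inside [t1, now), hence
# num_samples = batch_size rather than num_batches * batch_size.
num_samples = batch_size
print(round(num_samples / (time.time() - t1), 3))
```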
