From 195b4ed54f2805352677790541e6f3a2b91c5905 Mon Sep 17 00:00:00 2001
From: chengzeyi
Date: Fri, 27 Dec 2024 13:33:28 +0800
Subject: [PATCH] correct perf numbers and add details

---
 docs/performance/hunyuanvideo.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/performance/hunyuanvideo.md b/docs/performance/hunyuanvideo.md
index 0e04d27..f9843cf 100644
--- a/docs/performance/hunyuanvideo.md
+++ b/docs/performance/hunyuanvideo.md
@@ -2,6 +2,9 @@
 xDiT is [HunyuanVideo](https://github.com/Tencent/HunyuanVideo?tab=readme-ov-file#-parallel-inference-on-multiple-gpus-by-xdit)'s official parallel inference engine. On H100 and H20 GPUs, xDiT reduces the generation time of 1028x720 videos from 31 minutes to 5 minutes, and 960x960 videos from 28 minutes to 6 minutes.
+The H100 and H20 performance benchmarks are done with the official HunyuanVideo repository. The L20 performance benchmarks are done with the `diffusers` implementation.
+The L20 performance benchmarks are measured using this [script](examples/hunyuan_video_usp_example.py), along with `flash-attn==2.7.2.post1` and CUDA 12.4.
+
 ### 1280x720 Resolution (129 frames, 50 steps) - Ulysses Latency (seconds)
@@ -22,6 +25,6 @@ xDiT is [HunyuanVideo](https://github.com/Tencent/HunyuanVideo?tab=readme-ov-fil
 |----------|--------|---------|---------|---------|
 | H100 | 1,735.01 | 934.09 | 645.45 | 367.02 |
 | H20 | 6,621.46 | 3,400.55 | 2,310.48 | 1,214.67 |
-| L20 | 6,039.08 | 3,260.62 | 2,070.96 | |
+| L20 | 6,039.08 | 3,260.62 | 2,284.74 | |