
[OpenSora-hpcai] OSv1.2 performance optimization #687

Open · wants to merge 4 commits into `master`
Conversation

**@hadipash** (Collaborator) commented on Oct 9, 2024:

TODO:

- Validate accuracy and visual quality on long training.
- Update performance tables in README.

Tests were conducted in dynamic DVM mode, on the MS daily build from 09.04 with CANN 8.0 RC2. Results include the average training step time only (no data loading time):

| Changes | Shape (res × frames × batch) | Time (s) | Change (s) | Comment |
|---|---|---|---|---|
| Original | 720p × 51 × 2 | 30.409 | | |
| | 144p × 204 × 10 | 19.934 | | |
| Switch to `repeat_interleave_ext_v2` | 720p × 51 × 2 | 28.913 | -1.496 (-4.9%) | |
| | 144p × 204 × 10 | 19.872 | -0.062 (-0.3%) | |
| Remove SiLU & GELU FP32 upcast | 720p × 51 × 2 | 30.346 | -0.062 (-0.2%) | No performance improvement; will consult with the MS team. |
| | 144p × 204 × 10 | 20.506 | +0.572 (+2.9%) | |
| Convert parameters to BF16 | 720p × 51 × 2 | 28.957 | -1.452 (-4.8%) | |
| | 144p × 204 × 10 | 18.747 | -1.187 (-3.9%) | |
| Remove redundant `ops.transpose` in VAE | 720p × 51 × 2 | 30.448 | +0.040 (+0.1%) | No change due to kernel fusion; beneficial in KBK & PyNative modes. |
| | 144p × 204 × 10 | 20.103 | +0.168 (+0.8%) | |
| Final improvement | 720p × 51 × 2 | 27.896 | -2.512 (-8.3%) | |
| | 144p × 204 × 10 | 18.804 | -1.130 (-5.7%) | |
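For context on the "Remove SiLU & GELU FP32 upcast" row: a common mixed-precision pattern computes activations in FP32 for numerical safety and casts back afterwards, and the PR removes that extra round-trip. A minimal NumPy sketch of the two variants (FP16 stands in for BF16, which NumPy lacks; the function names are illustrative, not the repo's):

```python
import numpy as np

def silu_with_fp32_upcast(x):
    # Pattern being removed: upcast to FP32, compute, cast back down
    xf = x.astype(np.float32)
    return (xf / (1.0 + np.exp(-xf))).astype(x.dtype)

def silu_native(x):
    # Compute directly in the input dtype, no round-trip casts
    return x / (1.0 + np.exp(-x))

x = np.linspace(-4.0, 4.0, 9, dtype=np.float16)
a = silu_with_fp32_upcast(x)
b = silu_native(x)
```

In this toy range both variants agree to within FP16 precision, which matches the observation in the table that dropping the upcast does not buy a clear speedup here and mainly removes casts.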

**@zhtmike** (Collaborator) left a comment:

Seems there is no code change for "Convert parameters to BF16"?

**@hadipash** (Collaborator, Author) replied:

> Seems there is no code change for "Convert parameters to BF16"?

This refers to network parameters that are explicitly defined with `nn.Parameter()`, such as `self.scale_shift_table`. For some reason, any calculation performed on `self.scale_shift_table` is upcast to the parameter's dtype (i.e., FP32), and that dtype then propagates through the rest of the network even with AMP enabled.
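The dtype-propagation behaviour described above can be sketched with plain NumPy type promotion. This is an analogy only: FP16 stands in for BF16 (NumPy has no bfloat16), and the array names and shapes are illustrative, not the actual tensors in the repo.

```python
import numpy as np

# Stand-in for a parameter created with nn.Parameter() (FP32 by default);
# the shape is illustrative only
scale_shift_table = np.zeros((6, 1152), dtype=np.float32)

# Half-precision activations, as AMP would produce them
hidden = np.ones((2, 1152), dtype=np.float16)

# Mixing dtypes promotes the result to the wider parameter dtype,
# and that FP32 result then propagates downstream through the network
out_upcast = hidden + scale_shift_table[0]

# Casting the parameter itself keeps the computation in half precision,
# which is the effect of converting the parameters to BF16 in the PR
out_half = hidden + scale_shift_table[0].astype(np.float16)
```

This is why casting the explicitly defined parameters, rather than relying on AMP alone, keeps the whole network in the reduced precision.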
