
[benchmark] add option to enable CompiledAutograd #1536

Draft

crcrpar wants to merge 1 commit into main
Conversation

crcrpar (Collaborator) commented Dec 10, 2024

What does this PR do?

CompiledAutograd seems to speed up FSDP2, and I confirmed this with torchtitan.
However, I do not find it beneficial for litgpt models so far.

Setting: pjnl-20241209, 8× H100

torchtitan Llama-3-8B

This run uses activation checkpointing, since the provided config enables it by default -- https://github.com/pytorch/torchtitan/blob/05a8b5e4c1de979c4b49ff36e6b09d6055db29b1/train_configs/llama3_8b.toml#L53-L55

| CompiledAutograd | Performance (tps) | Memory (GB) |
|---|---|---|
| N | 6244 | 51.2 |
| Y | 7200 | 43.0 |

litgpt llama-2-7b-hf

| CompiledAutograd | Performance (tokens/s/GPU) | Memory (GB) |
|---|---|---|
| N | 11722.76 | 39.13 |
| Y | 10702.33 | 52.61 |
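
For reference, a minimal sketch of how CompiledAutograd is typically enabled around the backward pass in PyTorch. The model, optimizer, and `compiler_fn` below are illustrative and not part of this PR; the benchmark option added here may wire things up differently:

```python
import torch

# Illustrative compiler for the autograd graph; any torch.compile backend works.
def compiler_fn(gm):
    return torch.compile(gm, backend="inductor", fullgraph=True)

model = torch.nn.Linear(16, 16)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

x = torch.randn(8, 16)
loss = model(x).sum()

# Trace and compile the backward graph for this call instead of running
# the eager autograd engine.
with torch._dynamo.compiled_autograd.enable(compiler_fn):
    loss.backward()
optimizer.step()
```

When the forward pass is already wrapped in `torch.compile`, the same behavior can also be toggled globally via `torch._dynamo.config.compiled_autograd = True`.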
