Usage with torch.compile in Pytorch 2? #60

dreavjr · 2023-09-28T10:46:10Z

Is mup compatible with torch.compile() in PyTorch 2? If so, what is the correct usage (e.g. should we apply mup before or after compiling)?

edwardjhu · 2023-10-29T00:36:04Z

I don't see any reason it wouldn't be compatible out of the box, but I haven't tested it.

What happens to the coordinate check if you rerun one of our examples after torch.compile()?

tivek · 2024-02-17T23:21:57Z

Recently, torch.compile() started using FakeTensors for both inputs and weights during compilation. That means temporary FakeTensor weights are created from the original Tensor weights, and the infshape attributes are not copied to these FakeTensor weights.

Consequently, during compilation, MuReadout.forward() and MuReadout.width_mult() trip the assert that expects infshape on the weight, and the compilation fails.
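
The failure mode can be illustrated without PyTorch. What follows is a minimal pure-Python sketch, not mup or torch code: `Weight`, `make_fake`, and `width_mult` are hypothetical stand-ins, and it assumes the temporary weight is built from shape and dtype metadata only, so an attribute attached after construction (like `infshape`) is lost.

```python
from dataclasses import dataclass


@dataclass
class Weight:
    """Hypothetical stand-in for a parameter tensor (not a torch class)."""
    shape: tuple
    dtype: str = "float32"


def make_fake(weight):
    """Build a metadata-only copy (assumed to mimic how a fake weight is derived).

    Only the declared fields are copied; ad-hoc attributes set on the
    original object afterwards (such as `infshape`) are not carried over.
    """
    return Weight(shape=weight.shape, dtype=weight.dtype)


def width_mult(weight):
    """Mimics a readout layer that requires `infshape` on its weight."""
    assert hasattr(weight, "infshape"), "infshape missing on weight"
    return weight.infshape["width_mult"]


real = Weight(shape=(10, 512))
real.infshape = {"width_mult": 4.0}  # ad-hoc attribute, attached after construction

fake = make_fake(real)

eager_result = width_mult(real)  # works on the original weight
try:
    width_mult(fake)  # the stand-in has no infshape, so the assert fires
    compile_failed = False
except AssertionError:
    compile_failed = True
```

In this sketch the eager call succeeds while the call on the metadata-only copy raises `AssertionError`, which matches the behaviour described above.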

This unwanted side effect will also affect the ability to, e.g., export mup models to ONNX.

Any advice on how to work around the missing infshapes on FakeTensors going forward?
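
One direction that might be worth trying, offered as an untested assumption rather than a mup feature: read the multiplier from `infshape` once in eager mode and cache it as a plain Python float on the module, so the forward pass never reads attributes from the weight. The sketch below is pure Python with hypothetical names (`Weight`, `CachedReadout`, `cache_width_mult`); whether the same idea survives torch.compile() or ONNX export has not been verified here.

```python
class Weight:
    """Hypothetical stand-in for a parameter tensor (not a torch class)."""

    def __init__(self, shape):
        self.shape = shape


class CachedReadout:
    """Hypothetical readout that stores the width multiplier on itself."""

    def __init__(self, weight):
        self.weight = weight
        self._width_mult = None

    def cache_width_mult(self):
        # Call once in eager mode, after infshape has been attached.
        assert hasattr(self.weight, "infshape"), "infshape missing on weight"
        self._width_mult = float(self.weight.infshape["width_mult"])

    def forward(self, x, output_mult=1.0):
        # Uses only the cached float; never reads self.weight.infshape.
        assert self._width_mult is not None, "call cache_width_mult() first"
        return [output_mult * v / self._width_mult for v in x]


weight = Weight((10, 512))
weight.infshape = {"width_mult": 4.0}

readout = CachedReadout(weight)
readout.cache_width_mult()

# Simulate the weight being swapped for a copy that lacks infshape.
readout.weight = Weight(weight.shape)
scaled = readout.forward([4.0, 8.0])
```

The cached value would need to be refreshed whenever the base shapes are set again, since it no longer tracks `infshape` automatically.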
