Dear team behind mup,

This is some great work! I believe that providing a PyTorch Lightning example would help users adopt this library.

I also wonder whether this technique could be embedded with even less boilerplate. I was thinking of an extension to the PyTorch Lightning Tuner that would automatically apply muP and handle the µTransferable hyperparameters.

Would someone from the mup team be interested in investigating these ideas, to make this work even more widely accessible?

Best,
T.C
Thanks for the pointer to the Lightning Tuner. We are not familiar with its usage, but from the page you linked, it looks like one can pass a model to, for example, lr_find along with a grid, and the Tuner performs the necessary for loop(s) and returns the best hyperparameters. In other words, one should be able to pass the proxy model, parametrized in muP, to the Tuner and take advantage of both right away.
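For instance, something along these lines should already work. This is only a rough sketch: the `LitMLP` module, the toy data, and the exact Tuner import/call are illustrative and depend on the Lightning version, while `MuReadout`, `set_base_shapes`, and `MuAdam` are the standard mup entry points.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl
from pytorch_lightning.tuner.tuning import Tuner  # exact import/call varies by Lightning version
from mup import MuReadout, MuAdam, set_base_shapes

class LitMLP(pl.LightningModule):
    """Toy width-parametrized model, written in muP."""
    def __init__(self, width=256, lr=1e-3):
        super().__init__()
        self.lr = lr
        self.body = torch.nn.Sequential(torch.nn.Linear(784, width), torch.nn.ReLU())
        self.head = MuReadout(width, 10)              # output layer uses MuReadout

    def forward(self, x):
        return self.head(self.body(x))

    def training_step(self, batch, batch_idx):
        x, y = batch
        return F.cross_entropy(self(x), y)

    def configure_optimizers(self):
        return MuAdam(self.parameters(), lr=self.lr)  # muP-aware optimizer

# Tell muP which dimensions scale with width by comparing two instantiations.
model = LitMLP(width=256)
set_base_shapes(model, LitMLP(width=64), delta=LitMLP(width=128))

# The muP-parametrized proxy model then goes into the Tuner unchanged.
toy_data = DataLoader(
    TensorDataset(torch.randn(512, 784), torch.randint(0, 10, (512,))), batch_size=32)
trainer = pl.Trainer(max_epochs=1)
Tuner(trainer).lr_find(model, train_dataloaders=toy_data)
```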
Perhaps you are thinking about adding an option such as lr_find(model, mup=True, ...) to the Tuner API. The main obstacle is that muP still needs to be told which dimensions go to infinity in the limit, which requires instantiating models of different widths. The user also needs to manually switch to the muP optimizers. Both are hard to hide inside a single Tuner call.
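To make the obstacle concrete, here is a purely hypothetical sketch (mup_lr_find is not an existing Lightning or mup API) of what a `mup=True`-style entry point would still have to ask of the user:

```python
# Hypothetical sketch only -- mup_lr_find is NOT an existing Lightning or mup API.
# Even a wrapped lr_find would still need (a) a factory that can build the model
# at arbitrary widths, so muP can infer which dimensions go to infinity, and
# (b) the user's configure_optimizers to return MuAdam/MuSGD, which a Tuner flag
# cannot enforce or rewrite on its own.
from pytorch_lightning.tuner.tuning import Tuner
from mup import set_base_shapes

def mup_lr_find(trainer, model_factory, target_width, **lr_find_kwargs):
    """model_factory(width) must return a fresh LightningModule of the given width."""
    model = model_factory(target_width)
    # Infer the infinite-width dimensions by comparing two narrower instantiations.
    set_base_shapes(model, model_factory(8), delta=model_factory(16))
    return Tuner(trainer).lr_find(model, **lr_find_kwargs)
```

Usage would look something like `mup_lr_find(trainer, lambda w: LitMLP(width=w), 256, train_dataloaders=toy_data)`, which illustrates that a width-parametrized factory, rather than a single model instance, is the real interface requirement.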
Please let us know if you have ideas on how we can make this integration more seamless!
zygi added a commit to flanlabs/mup that referenced this issue on Dec 6, 2022