Dear team behind mup,

This is some great work! I believe that providing a PyTorch Lightning example would help users adopt this library.

I also wonder whether this technique could be embedded with even less boilerplate. I was thinking of an extension to the PyTorch Lightning Tuner that would automatically apply muP and handle the µTransferable hyperparameters.

Would someone from the mup team be interested in investigating these ideas, to make this work even more widely accessible?

Best,
T.C
Thanks for the pointer to the Lightning Tuner. We are not familiar with its usage, but from the page you linked, it looks like one can pass a model to, for example, lr_find along with a grid, and the Tuner performs the necessary for loop(s) and returns the best hyperparameters. In other words, one should be able to pass the proxy model, parametrized in muP, to the Tuner and take advantage of both right away.
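For instance, something along these lines should already work. This is only a rough sketch: the `LitMLP` module, the toy data, and the exact Tuner import/call are illustrative and depend on the Lightning version, while `MuReadout`, `set_base_shapes`, and `MuAdam` are the standard mup entry points.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl
from pytorch_lightning.tuner.tuning import Tuner  # exact import/call varies by Lightning version
from mup import MuReadout, MuAdam, set_base_shapes

class LitMLP(pl.LightningModule):
    """Toy width-parametrized model, written in muP."""
    def __init__(self, width=256, lr=1e-3):
        super().__init__()
        self.lr = lr
        self.body = torch.nn.Sequential(torch.nn.Linear(784, width), torch.nn.ReLU())
        self.head = MuReadout(width, 10)              # output layer uses MuReadout

    def forward(self, x):
        return self.head(self.body(x))

    def training_step(self, batch, batch_idx):
        x, y = batch
        return F.cross_entropy(self(x), y)

    def configure_optimizers(self):
        return MuAdam(self.parameters(), lr=self.lr)  # muP-aware optimizer

# Tell muP which dimensions scale with width by comparing two instantiations.
model = LitMLP(width=256)
set_base_shapes(model, LitMLP(width=64), delta=LitMLP(width=128))

# The muP-parametrized proxy model then goes into the Tuner unchanged.
toy_data = DataLoader(
    TensorDataset(torch.randn(512, 784), torch.randint(0, 10, (512,))), batch_size=32)
trainer = pl.Trainer(max_epochs=1)
Tuner(trainer).lr_find(model, train_dataloaders=toy_data)
```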
Perhaps you are thinking about adding an option such as lr_find(model, mup=True, ...) to the Tuner API. The main obstacle is that muP still needs to be told which dimensions go to infinity in the limit, which requires instantiating models of different widths. The user also needs to manually switch to the muP optimizers. Both are hard to hide inside a single Tuner call.
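To make the obstacle concrete, here is a purely hypothetical sketch (mup_lr_find is not an existing Lightning or mup API) of what a `mup=True`-style entry point would still have to ask of the user:

```python
# Hypothetical sketch only -- mup_lr_find is NOT an existing Lightning or mup API.
# Even a wrapped lr_find would still need (a) a factory that can build the model
# at arbitrary widths, so muP can infer which dimensions go to infinity, and
# (b) the user's configure_optimizers to return MuAdam/MuSGD, which a Tuner flag
# cannot enforce or rewrite on its own.
from pytorch_lightning.tuner.tuning import Tuner
from mup import set_base_shapes

def mup_lr_find(trainer, model_factory, target_width, **lr_find_kwargs):
    """model_factory(width) must return a fresh LightningModule of the given width."""
    model = model_factory(target_width)
    # Infer the infinite-width dimensions by comparing two narrower instantiations.
    set_base_shapes(model, model_factory(8), delta=model_factory(16))
    return Tuner(trainer).lr_find(model, **lr_find_kwargs)
```

Usage would look something like `mup_lr_find(trainer, lambda w: LitMLP(width=w), 256, train_dataloaders=toy_data)`, which illustrates that a width-parametrized factory, rather than a single model instance, is the real interface requirement.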
Please let us know if you have ideas on how we can make this integration more seamless!
zygi added a commit to flanlabs/mup that referenced this issue on Dec 6, 2022