Add new logic to enable stored checkpoint weights to be copied to new history dimensions #36
… history dimensions
@microsoft-github-policy-service agree
Hey @scottcha! Thanks for opening a PR!! I think this is a useful feature. :) A few comments:
I think we should not allow the user to decrease the value of max_history_size, as that would very likely break performance. Could you add an assertion for this?
In addition, currently this only works for the pretrained checkpoint, not any checkpoint saved after that. Would you be able to separate the checkpoint-adaptation logic from the max_history_size-adaptation logic, so this would also work for checkpoints saved after that? We might at some point release additional models to which the checkpoint-adaptation logic does not apply, but for which you might want to change max_history_size.
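For illustration, the requested guard could look roughly like the sketch below. The helper name `check_history_size` and its arguments are hypothetical and not taken from the PR; it raises a `ValueError` rather than using a bare `assert`, which is one possible design choice among several.

```python
def check_history_size(stored_size: int, requested_size: int) -> None:
    """Refuse to shrink the history dimension of stored weights.

    Hypothetical helper: `stored_size` is the history length found in the
    checkpoint, `requested_size` is the max_history_size asked for by the user.
    """
    if requested_size < stored_size:
        raise ValueError(
            f"Cannot decrease max_history_size from {stored_size} "
            f"to {requested_size}."
        )


# Growing (or keeping) the history size passes silently.
check_history_size(stored_size=2, requested_size=4)
```

Keeping this check in its own function, separate from any checkpoint-specific renaming or conversion, is one way to make it reusable for checkpoints saved later.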
…d different fn to adapt history
@wesselb those sound like good suggestions. I'll update the PR to address them. It might take a couple of days based on my availability, but it's no problem.
That would be perfect! Thanks :)
PR is updated with the requested changes.
@scottcha Amazing work. I've got a few very minor comments, and then I think this is ready to go once CI passes!
Thank you!!
match previous weights device and dtype
Ok, pushed changes to address the feedback. Thanks!
@scottcha I think this is ready to be merged. Would you be able to run the formatter over the PR once? I think that should fix the formatting check. The easiest way to do this is by setting up …
Got it, hadn't noticed the pre-commit component. Pushed changes to resolve those issues.
Alright, this looks ready to go! Thanks, @scottcha :)
Here is a possible fix for #35. Currently, the new logic attempts to copy the provided weights into any new history dimensions that are allocated by setting max_history_size > 2.
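As a rough sketch of the idea (not the code from this PR), the snippet below grows the history axis of a stored weight array and copies the existing weights into the leading slots. It uses NumPy as a stand-in for the framework's tensors; the function name `expand_history`, the choice of axis 2 as the history axis, the example shape, and leaving the new slots at zero are all assumptions made for illustration.

```python
import numpy as np


def expand_history(weight: np.ndarray, max_history_size: int, axis: int = 2) -> np.ndarray:
    """Return a copy of `weight` whose history axis has length `max_history_size`.

    Hypothetical helper. Existing weights fill the first slots of the history
    axis; any newly allocated slots are left at zero (an assumption).
    """
    old_size = weight.shape[axis]
    if max_history_size < old_size:
        raise ValueError("max_history_size must not be smaller than the stored history size.")

    new_shape = list(weight.shape)
    new_shape[axis] = max_history_size
    # Allocate with the dtype of the stored weights so nothing is silently upcast.
    expanded = np.zeros(new_shape, dtype=weight.dtype)

    # Build an index that selects the first `old_size` entries along the history axis.
    index = [slice(None)] * weight.ndim
    index[axis] = slice(0, old_size)
    expanded[tuple(index)] = weight
    return expanded


# Example with an assumed 5-D layout whose third axis is the history axis.
stored = np.ones((8, 1, 2, 4, 4), dtype=np.float32)
expanded = expand_history(stored, max_history_size=4)
```

With a tensor library that tracks devices, the new buffer would also need to be allocated on the same device as the stored weights, which NumPy does not model.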