About ft_optim grad from ft_loss #13
Hi, I reproduced your code and found that ft_loss did not produce a gradient in the FiLM layer, so how does your learning-to-learn procedure update it through ft_optim?

Comments
You can refer to the implementation in …
Hi, thank you for sharing your efforts. …

Thanks to your response on OpenReview, I think I understand now. Thank you!
Can you explain why the ft layer can be updated without using it?
The ft layer is not applied directly when computing the loss on pu; however, the ft layer applied on ps shapes the updated model that is then evaluated on pu. (So, in the end, the ft layer does affect pu.)
Hi, thank you for sharing this information; it confused me a lot at first.
After reading your reply, I think it can be understood as "ps and pu share a single ft layer (i.e., the same parameters …
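For anyone hitting the same confusion: below is a minimal sketch of the gradient path described above, assuming a MAML-style differentiable inner update. It is not the repository's actual code; W, gamma, beta, and the 0.1 inner learning rate are all illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy backbone: one linear layer with weight W (stands in for the full network).
W = torch.randn(4, 8, requires_grad=True)

# ft layer parameters (FiLM-style scale/shift). In the actual method these
# parameterize sampled transformations; deterministic here for brevity.
gamma = torch.zeros(4, requires_grad=True)
beta = torch.zeros(4, requires_grad=True)

x_ps, y_ps = torch.randn(16, 8), torch.randint(0, 4, (16,))  # pseudo-seen batch
x_pu, y_pu = torch.randn(16, 8), torch.randint(0, 4, (16,))  # pseudo-unseen batch

# Inner step: the ft layer IS applied on the pseudo-seen batch.
h = (1 + gamma) * (x_ps @ W.t()) + beta   # feature-wise transformation
inner_loss = F.cross_entropy(h, y_ps)

# Differentiable update of the backbone: create_graph=True keeps the
# dependence of W_new on (gamma, beta) in the autograd graph.
(g_W,) = torch.autograd.grad(inner_loss, W, create_graph=True)
W_new = W - 0.1 * g_W

# Outer step: the ft layer is NOT applied on the pseudo-unseen batch.
ft_loss = F.cross_entropy(x_pu @ W_new.t(), y_pu)

# ft_loss still differentiates w.r.t. gamma/beta because W_new was produced
# by a step that used them; these are the gradients ft_optim would apply.
g_gamma, g_beta = torch.autograd.grad(ft_loss, (gamma, beta))
print(g_gamma.norm().item(), g_beta.norm().item())  # non-zero
```

Because the inner update is kept differentiable, the outer gradient reaches the ft parameters through W_new even though the ft layer never touches the pu forward pass, which is what allows ft_optim to update the FiLM layer.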