This repository has been archived by the owner on Sep 2, 2024. It is now read-only.

Does the code correspond to the algorithm in the paper? #40

Closed
aliutkus opened this issue Apr 5, 2022 · 1 comment

aliutkus commented Apr 5, 2022

Hi, thanks for providing the code for your paper.

I was wondering, when you do the actual update there:

It looks to me like you are not doing exactly what is stated in the paper:
https://proceedings.neurips.cc/paper/2018/file/432aca3a1e345e339f35a30c8f65edce-Paper.pdf

On line 2, the "heads" parameters are updated through theta_t = theta_t - eta * grad, but here you do theta_t = theta_t * scale_t * eta * grad.
Is that correct?
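
To make the difference concrete, here is a minimal sketch of the two updates. It assumes the second expression is meant as a gradient step whose gradient is weighted by the task scale (theta_t - scale_t * eta * grad), which is an assumed reading and not something confirmed by the repository. The function names and the numbers are hypothetical, chosen only for illustration.

```python
def paper_step(theta, grad, eta):
    """Plain gradient step as written in the paper: theta - eta * grad."""
    return [p - eta * g for p, g in zip(theta, grad)]


def scaled_step(theta, grad, eta, scale):
    """Gradient step with the gradient weighted by a task scale.

    Assumed reading of the code's update: theta - eta * scale * grad.
    """
    return [p - eta * scale * g for p, g in zip(theta, grad)]


# Hypothetical values, for illustration only.
theta = [1.0, -2.0]
grad = [0.5, 0.25]
eta = 0.1
scale = 0.5

plain = paper_step(theta, grad, eta)
scaled = scaled_step(theta, grad, eta, scale)

print("paper update :", plain)
print("scaled update:", scaled)
```

Under this reading, the scaled update is the paper's update with an effective step size of eta * scale_t for that head, so the two coincide only when scale_t = 1.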

best

antoine

aliutkus commented Apr 5, 2022

Oops, I realize this is the same issue as #12.

Closing this.

aliutkus closed this as completed Apr 5, 2022