
AdamD implementation (or option to skip bias-correction in Adam-derived optimizers)? #385

Open
jstjohn opened this issue Oct 25, 2021 · 1 comment


jstjohn commented Oct 25, 2021

I recently put out a proposal to add an argument to Adam-derived optimizers that skips the bias-correction term on the w update, applying it only to v. See the figure attached in the issue pytorch/pytorch#67105 and the write-up I put together for the theoretical justification, AdamD: Improved bias-correction in Adam. Since the PyTorch maintainers feel it is still too early in the idea's existence to add this to their repo, your repo seems like a reasonable home for it. I am happy to send you a PR, but I would first like to hear which of these you would prefer:

  1. Two new optimizers, AdamD and AdamDW (mirroring Adam/AdamW, but with the bias-correction excluded from the w update step).
  2. An otherwise vanilla fork of Adam/AdamW with a boolean flag that lets the user turn the bias-correction on or off, plus the same option added to the relevant optimizers already included in this repo. I have not read through the codebase carefully, but this would likely include Lamb (there it would be an option to enable bias-correction on v only, since it is currently excluded altogether), AdamP, and possibly others.
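To make the difference concrete, here is a minimal scalar sketch of option 2 (the flag name `bias_correct_m` is illustrative, not from any released API). Standard Adam divides the first moment m by (1 - beta1^t); the proposed AdamD variant skips that division and keeps the correction only on the second moment v, which damps the early steps instead of inflating them:

```python
import math

def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, bias_correct_m=True):
    """One Adam update on a scalar weight w given gradient g at step t.

    bias_correct_m=True  -> standard Adam.
    bias_correct_m=False -> the proposed AdamD variant: bias-correction
    is applied only to the second moment v, not to m.
    """
    m = beta1 * m + (1 - beta1) * g          # first moment (EMA of g)
    v = beta2 * v + (1 - beta2) * g * g      # second moment (EMA of g^2)
    m_hat = m / (1 - beta1 ** t) if bias_correct_m else m
    v_hat = v / (1 - beta2 ** t)             # always bias-corrected
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# First-step comparison from the same state: Adam's corrected step has
# magnitude close to lr, while the AdamD-style step is scaled down by
# roughly (1 - beta1) early in training.
w_adam, _, _ = adam_step(1.0, 2.0, 0.0, 0.0, t=1)
w_adamd, _, _ = adam_step(1.0, 2.0, 0.0, 0.0, t=1, bias_correct_m=False)
```

Either option would reduce to this same toggle internally; option 1 just exposes it as distinct optimizer classes rather than a constructor argument.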

Let me know how you would like to proceed, or if you want any further clarification!

jettify (Owner) commented Oct 26, 2021

I would be happy to accept a PR. I prefer option 1, since it seems like a clearer API. Internally, the implementations should share code where possible.
