
Is it necessary to update log sigma instead of sigma for weight noise #28

Open
lai-agent-m opened this issue Apr 8, 2020 · 2 comments


I saw that you calculate the gradient for log(sigma)*2/2048.0; I guess that's for numerical stability, but I'm not sure. In my implementation I directly calculate the gradient of the variance, since that is what appears in the paper. I haven't tested my code thoroughly, so I'm not sure whether anything will break.
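For context, optimizing log σ instead of σ is a common reparameterization: it keeps σ = exp(log σ) strictly positive after any gradient step, and the chain rule gives ∂L/∂(log σ) = σ · ∂L/∂σ. A minimal numpy sketch of that update, where the quadratic toy loss and the step size are my own assumptions purely for illustration:

```python
import numpy as np

# Toy stand-in loss in sigma: L(sigma) = 0.5 * sigma**2
def dL_dsigma(sigma):
    return sigma  # analytic gradient of the toy loss

log_sigma = np.log(0.1)  # optimize log(sigma), not sigma itself
sigma = np.exp(log_sigma)

# Chain rule: dL/d(log sigma) = dL/dsigma * dsigma/d(log sigma)
#                             = dL/dsigma * sigma
grad_log_sigma = dL_dsigma(sigma) * sigma

# One gradient step in log space; sigma stays positive regardless of step size
log_sigma -= 1.0 * grad_log_sigma
sigma = np.exp(log_sigma)
```

A step taken directly on σ with a large learning rate could push σ negative, which has no meaning as a standard deviation; the log-space update cannot.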

lai-agent-m commented Apr 8, 2020

I see the problem: the sum of squares of a vector does not equal the square of the sum, so either the trick solves this or the problem actually still persists. I hope to hear from you, and I'll do some math first.

OK, after applying the chain rule, I see that this trick doesn't solve the sum-of-squares problem. So did you experience underflow, and this trick solved it?
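The distinction at issue is easy to check numerically; with a toy vector (values chosen only for illustration) the two quantities differ:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])

sum_of_squares = np.sum(v ** 2)  # 1 + 4 + 9 = 14
square_of_sum = np.sum(v) ** 2   # (1 + 2 + 3)**2 = 36

# The two agree only in degenerate cases (e.g. a single-element vector)
```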

@lai-agent-m
Okay, it's hard to get the gradient for each sample in an autograd framework. I saw that many other papers on HME recognition also mention that they use weight noise. Do you think they also use this version of 'weight noise', which is in fact kind of different from the original one? I don't think the square of the sum is a good approximation for the sum of squares.
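To make the per-sample-gradient concern concrete: a standard autograd backward pass returns only the gradient summed over the batch, so the sum of squared per-sample gradients has to be computed some other way (e.g. one backward pass per sample, or a vectorized per-sample-gradient transform). A tiny numpy illustration with a hand-derived gradient, where the scalar linear model and the data are my own toy assumptions:

```python
import numpy as np

# Toy model y_hat = w * x with squared-error loss per sample
w = 0.5
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 1.0, 1.0])

# d l_i / d w for l_i = 0.5 * (w * x_i - y_i)**2, derived by hand
per_sample_grads = (w * x - y) * x

# What a batched backward pass hands you: one summed gradient
summed_grad = per_sample_grads.sum()

# The two second-moment quantities the thread is about
sum_of_squared_grads = np.sum(per_sample_grads ** 2)
square_of_summed_grad = summed_grad ** 2
```

Here `sum_of_squared_grads` and `square_of_summed_grad` come out different, so substituting the square of the summed gradient changes the quantity being estimated, not just its numerics.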
