In the documentation of the Gaussian noise regularization it says:
adds normal (Gaussian) distribution noise into training data in order to decrease overfitting (testing
data are untouched). Sigma gives the standard deviation (spread or "width") of the normal distribution.
The optimal value is commonly between 0.05 and 0.6. The default is to not add noise, but that leads
to significantly suboptimal results.
If addition is the better choice, then the sigma parameter needs to change for regression tasks and should not be limited to 1.
cmougan changed the title from "[DOC] Guassian noise is multiplicative instead of additive" to "[DOC] Guassian noise regularization: multiplicative or additive?" on Jan 7, 2022
cmougan changed the title from "[DOC] Guassian noise regularization: multiplicative or additive?" to "[DOC] Gaussian noise regularization: multiplicative or additive?" on Jan 7, 2022
Hi Carlos,
I'm not exactly sure what your point of concern is here. Is it
1. a wording issue: since it says "add noise", the noise has to be additive? I'm not a native English speaker, but to me "add noise" can also mean adding noise in a multiplicative fashion.
2. a technical issue: it makes more sense for the noise to be additive.
In case of 1, I think it might be good enough to clarify in the documentation.
In case of 2: LOO and target encoding are for regression and binary classification problems. The noise is added to the encoded values (i.e. some variation of category means). For binary classification I fully agree that additive noise makes more sense, precisely because of the issue around 0 that you're pointing out. For regression problems I can imagine scenarios where, depending on the nature of the problem, either multiplicative or additive noise makes sense. The regressor used afterwards might also change whether you want additive or multiplicative noise. I don't think there is a one-size-fits-all solution.
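To make the issue around 0 concrete, here is a minimal sketch that compares the two noise schemes on constant encoded values. It is not the library's implementation: the helper names `multiplicative_noise` and `additive_noise` are hypothetical, and it assumes the multiplicative factor is drawn from N(1, sigma) and the additive offset from N(0, sigma).

```python
import numpy as np


def multiplicative_noise(encoded, sigma, rng):
    """Scale each encoded value by a factor drawn from N(1, sigma) (assumed form)."""
    return encoded * rng.normal(1.0, sigma, size=encoded.shape)


def additive_noise(encoded, sigma, rng):
    """Shift each encoded value by an offset drawn from N(0, sigma) (assumed form)."""
    return encoded + rng.normal(0.0, sigma, size=encoded.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma = 0.1
    # Encoded category means as they might appear in binary classification.
    for mean in (0.0, 0.01, 0.5, 1.0):
        encoded = np.full(10_000, mean)
        mult_std = multiplicative_noise(encoded, sigma, rng).std()
        add_std = additive_noise(encoded, sigma, rng).std()
        print(f"mean={mean:<5} multiplicative std={mult_std:.4f} additive std={add_std:.4f}")
```

Under these assumptions the spread of the multiplicative noise is proportional to the magnitude of the encoded value, so a category whose encoded mean is exactly 0 receives no noise at all, while the additive spread stays close to sigma for every category.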
Apparently, some researchers will expect "+" if you say additive. Overall, though, I believe the current approach is better: there is no one-size-fits-all solution, and it does not make sense to add one more hyperparameter for this. In fact, "recent work" has provided some empirical evidence that smoothing might be better in general.
If there is nicer wording, it might be better to clarify.
Some suggestions might be: includes, incorporates
In the documentation of the Gaussian noise regularization it says
But then the operation is a multiplication:
I am not sure if it should be like this. With multiplication, the noise becomes less relevant as the value gets close to 0.
Also, it creates negative categories very soon.
Would this be best?
If addition is the better choice, then the sigma parameter needs to change for regression tasks and should not be limited to 1.
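The two concerns above can be sketched numerically. This is an illustration under stated assumptions, not the library's code: `sign_flip_rate` and `additive_noise_scaled` are hypothetical helpers. The first assumes the multiplicative factor is drawn from N(1, sigma) and estimates how often it turns negative, which is one reading of "negative categories". The second shows one possible way to keep sigma scale-free for regression, by interpreting it as a fraction of the standard deviation of the encoded values.

```python
import numpy as np


def sign_flip_rate(sigma, n=100_000, seed=0):
    """Estimate the fraction of factors drawn from N(1, sigma) that are negative."""
    rng = np.random.default_rng(seed)
    return float(np.mean(rng.normal(1.0, sigma, size=n) < 0))


def additive_noise_scaled(encoded, sigma, rng):
    """Additive noise whose spread is sigma times the std of the encoded values.

    Hypothetical design: sigma stays a relative quantity, so the same value
    works whether the regression target is in units of 1 or of 1000.
    """
    scale = sigma * np.std(encoded)
    return encoded + rng.normal(0.0, scale, size=encoded.shape)


if __name__ == "__main__":
    for sigma in (0.05, 0.3, 0.6):
        print(f"sigma={sigma}: share of negative factors = {sign_flip_rate(sigma):.4f}")

    rng = np.random.default_rng(0)
    # Encoded values for a regression target on a large scale.
    encoded = rng.uniform(1000.0, 5000.0, size=50_000)
    noisy = additive_noise_scaled(encoded, 0.1, rng)
    print("relative noise spread:", (noisy - encoded).std() / encoded.std())
```

With plain additive noise and an unscaled sigma, a value such as 0.1 would be negligible for targets in the thousands, which is why the sketch rescales by the spread of the encoded values instead of raising the upper bound on sigma.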