However, everything built on Flux.params is headed for extinction; see e.g. #2413. The current idiom for this is Optimisers.trainables, or in most cases, just use WeightDecay instead:
julia> gradient(s_ -> sum(sqnorm, Flux.params(s_)), s)  # as above
((layers = ((weight = Float32[-0.18066745 -0.4179064; 0.3016829 -0.4228169; … ; -0.36133823 -0.23173195; 0.45555136 -0.12170375], bias = Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], σ = nothing), (weight = Float32[-0.031820923 -0.41430357 … 0.33881077 0.35217345; -0.03208663 0.039828066 … -0.3371693 -0.34633902], bias = Float32[0.0, 0.0], σ = nothing)),),)
julia> import Optimisers

julia> gradient(s_ -> sum(sqnorm, Optimisers.trainables(s_)), s)  # new way, same numbers
((layers = ((weight = Float32[-0.18066745 -0.4179064; 0.3016829 -0.4228169; … ; -0.36133823 -0.23173195; 0.45555136 -0.12170375], bias = Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], σ = nothing), (weight = Float32[-0.031820923 -0.41430357 … 0.33881077 0.35217345; -0.03208663 0.039828066 … -0.3371693 -0.34633902], bias = Float32[0.0, 0.0], σ = nothing)),),)
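For completeness, here is a minimal self-contained sketch of the trainables route. The definitions of sqnorm and s are elided above, so the ones below (including the layer sizes) are my own guesses, not the original code:

using Flux
import Optimisers

sqnorm(x) = sum(abs2, x)                            # assumed definition: squared L2 norm of one array
s = Chain(Dense(2 => 30, relu), Dense(30 => 2))     # hypothetical model, sizes are a guess

penalty(m) = sum(sqnorm, Optimisers.trainables(m))  # L2 penalty over every trainable array
g = gradient(penalty, s)                            # g[1] is a NamedTuple matching the model structure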
help?> Optimisers.WeightDecay
WeightDecay(λ = 5e-4)
Implements L_2 regularisation, also known as ridge regression, when composed with other rules as
the first transformation in an OptimiserChain.
It does this by adding λ .* x to the gradient. This is equivalent to adding λ/2 * sum(abs2, x) == λ/2 * norm(x)^2 to the loss.
See also [SignDecay] for L_1 normalisation.
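And a rough sketch of the WeightDecay route, reusing the hypothetical s from the block above plus made-up data x, y. The point is that the L2 penalty lives in the optimiser rule, so the loss function itself stays plain and there is nothing awkward to differentiate:

using Flux
import Optimisers

x, y = rand(Float32, 2, 8), rand(Float32, 2, 8)     # made-up data matching the guessed sizes

rule = Optimisers.OptimiserChain(Optimisers.WeightDecay(5f-4), Optimisers.Adam(1f-3))
opt_state = Flux.setup(rule, s)

for epoch in 1:10
    g = gradient(m -> Flux.mse(m(x), y), s)         # no explicit L2 term in the loss
    Flux.update!(opt_state, s, g[1])                # WeightDecay adds λ .* weight to each gradient
end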
Ideally Optimisers.trainables would be accessible as Flux.trainables, and be included in this package's documentation.