-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inf and NaN in weights #904
Comments
Bump... |
Gentle bump... |
I wasn't involved in the previous discussions in #671 and #814, but my impression was that the errors are intentional and the previous behaviour (no errors) was considered a bug. I think I tend to agree with this view since weights are supposed to sum to 1 (or at least it should be able to normalize them to such weights), which is not possible if a weight is |
The whole point of |
I don't see a clear inconsistency, there's no restriction on the values and julia> using StatsBase
julia> mean([1,2,3,Inf])
Inf
julia> mean([1,2,3,Inf], weights([1,1,1,1]))
Inf
julia> mean([1,2,3,NaN])
NaN
julia> mean([1,2,3,NaN], weights([1,1,1,1]))
NaN The restriction only applies to the weights themselves because they have to sum up to 1. |
And if they don't because there's a Consider two scenarios: Have an array of arrays, and want to compute
and Have an array of arrays of values, and array of arrays of weights, and want to compute weighted
doesn't work as soon as there are any |
For those stumbling across the same issue: Or, even easier – just overload the using StatsBase
@generated Weights{S,T,V}(values, sum) where {S<:Real, T<:Real, V<:AbstractVector{T}} = Expr(:new, Weights{S,T,V}, :values, :sum) That's what I personally do. It doesn't change any existing behavior, just adds support for non-finite values in weight vectors. julia> mean([1,2], weights([1,NaN]))
NaN |
Currently,
weights
throws an error whenever it encounters non-finite values:ArgumentError: weights cannot contain Inf or NaN values
. That's highly surprising IMO, basically all other operations propagate NaNs and allow Infs – both basic arithmetics and aggregations.I looked up as found that this error was introduced after the issue #671. I think throwing this exception denies perfectly sensible behavior, so that weighted aggregations worked exactly as their unweighted counterparts – by propagating NaNs. And the throw in
weights()
should be removed, maybe adding it in specific aggregations where Infs/NaNs really do not make any sense (eg, mean and median should work).Numpy does propagate, of course:
The text was updated successfully, but these errors were encountered: