Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undeprecate no-dot syntax for quantile() and nquantile() #1150

Open
ron-wolf opened this issue Jul 24, 2020 · 7 comments · May be fixed by #1159
Open

Undeprecate no-dot syntax for quantile() and nquantile() #1150

ron-wolf opened this issue Jul 24, 2020 · 7 comments · May be fixed by #1159

Comments

@ron-wolf
Copy link

This issue picks up where JuliaStats/StatsBase.jl#586 left off. At issue is the following snippet in the Distributions package:

@deprecate ($fun)(d::UnivariateDistribution, X::AbstractArray) ($fun).(d, X)

The macro affects the quantile() and nquantile() functions, of which the former is used by StatsBase in this line.

According to @nalimilan, StatsBase relies on the deprecated behavior. This is because when quantile() is broadcasted over the given weights/probabilities (second param, p::Union{AbstractArray, Tuple{Vararg{Real}}}), _quantilesort! is run twice on the same parameter (first param, v::AbstractVector). See this code in the Statistics package where the sorting function would be broadcast over using the new syntax requested by Distributions. In any case, requiring broadcasting would mean requiring the first parameter to be sorted twice.

Would it be best to un-deprecate this functionality, or is there another course? For now, using Statistics as per its own documentation gives the user a deprecation warning, which is… not ideal, especially for first-time users, of which I’m one.

@nalimilan
Copy link
Member

To complete a bit: quantile(x, p::AbstractArray) isn't really needed when x is a Distribution (in Distributions.jl), since it's strictly equivalent to broadcasting. But when x is a vector (in StatsBase), broadcasting involves sorting twice as @ron-wolf said. So it would make sense to undeprecate this method in Distributions for consistency.

Another solution would be to define a custom broadcasting method for quantile in StatsBase which would only sort once. Not sure whether that's a good idea or not.

@ron-wolf
Copy link
Author

@nalimilan Sorry for the delay. I think the best solution is just to undeprecate non-broadcast calls to quantile, since they seem to be needed. (If I am wrong and they should never be used outside of StatsBase, please let me know, as that will probably require a clever solution. In any case…) I would like to submit a pull request, but how would I actually go about undeprecating this functionality?

@nalimilan
Copy link
Member

No problem. You just have to remove the @deprecate definition in Distributions and replace it with a normal function definition (and move it to a better place).

@ron-wolf
Copy link
Author

But won’t that undeprecate all of these functions? Is that what we want?

for fun in [:pdf, :logpdf,
:cdf, :logcdf,
:ccdf, :logccdf,
:invlogcdf, :invlogccdf,
:quantile, :cquantile]

Also, why would we add the definition of quantile here in Distributions when it’s already defined in StatsBase? Maybe I’m a bit turned around about what’s defined or deprecated where.

@nalimilan
Copy link
Member

Right, you should only remove quantile from that list (and define it elsewhere). Also, quantile isn't defined for Distribution objects in StatsBase nor in Statistics AFAIK. Actually, the definition in Statistics will call collect(d::Distribution), which doesn't make sense.

@ron-wolf
Copy link
Author

Okay. Do you think it’s only quantile that should be removed?

@nalimilan
Copy link
Member

Yes.

@ron-wolf ron-wolf linked a pull request Aug 17, 2020 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants