Skip to content

Commit

Permalink
[ENH] native implementation of Johnson QPD family, explicit pdf (#327)
Browse files Browse the repository at this point in the history
This PR reworks the family of QPD family of distributions for efficiency
and to allow removal of the newly introduced dependency `findiff` in
#232.

The dependency `findiff` was introduced for approximation of `pdf`, but
in fact it is unnecessary as the `pdf` can be analytically derived by
applying the chain rule. True, it has to be applied three or four times,
but it's still the chain rule... efficiency and accuracy gains are
significant, and it helps us avoid computing numerical derivatives for
all entries in a large matrix, together with the now unnecessary
`findiff` dependency.

Makes the following changes:

* refactoring of the three QPD distributions tp use `skpro` machinery:
* use of the `skpro` native parameter broadcasting system instead of
ad-hoc broadcasting
* use of the `skpro` native approximation for `mean`, `var`, instead of
three copies of similar (and partially duplicative) approximation inside
the distributions
* refactoring between the three QPD distributions with the end of
simplification
* refactoring QPD parameter computation into a single, fully vectorized
function, `_prep_qpd_vars`
* clean room reimplementation of `cdf`, `ppf` of the three distributions
based on the `cyclic_boosting` reference
* new implementation of `pdf`, as derivative of `cdf`

As side effects of the rework:
* all parameters now broadcast in numpy-like fashion, including `alpha`,
`lower`, `upper`, which previously was not possible
* the distributions can be 2D with more than 1 column, which previously
was not possible
* `version` (the base distribution) can now be an arbitrary continuous
`scipy` distribution
* `pdf` is numerically exact
* the distributions do not have soft dependencies anymore

Regarding the relation to `cyclic_boosting`:

* this is clean room reimplementation and credit is given, so I hope
this is fine license-wise - @FelixWick?
* this is the result of trying to remove the `findiff` dependency for
computing the `pdf` from the `cdf` that was introduced in #232, as well
as cleanup before release. I ended up simplifying a lot, ending up here.
In this sense, the work of @setoguchi-naoki was crucial in arriving at
this point.
* I would have no issue at all with you moving the improved code into
`cyclic_boosting`. We can even restore the dependency and maintain the
distribution logic in `cyclic_boosting` if that were your preference,
e.g., for ownership reasons.
  • Loading branch information
fkiraly authored May 16, 2024
1 parent 5889075 commit dad8793
Show file tree
Hide file tree
Showing 2 changed files with 358 additions and 478 deletions.
Loading

0 comments on commit dad8793

Please sign in to comment.