Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[ENH] native implementation of Johnson QPD family, explicit pdf (#327)
This PR reworks the family of QPD family of distributions for efficiency and to allow removal of the newly introduced dependency `findiff` in #232. The dependency `findiff` was introduced for approximation of `pdf`, but in fact it is unnecessary as the `pdf` can be analytically derived by applying the chain rule. True, it has to be applied three or four times, but it's still the chain rule... efficiency and accuracy gains are significant, and it helps us avoid computing numerical derivatives for all entries in a large matrix, together with the now unnecessary `findiff` dependency. Makes the following changes: * refactoring of the three QPD distributions tp use `skpro` machinery: * use of the `skpro` native parameter broadcasting system instead of ad-hoc broadcasting * use of the `skpro` native approximation for `mean`, `var`, instead of three copies of similar (and partially duplicative) approximation inside the distributions * refactoring between the three QPD distributions with the end of simplification * refactoring QPD parameter computation into a single, fully vectorized function, `_prep_qpd_vars` * clean room reimplementation of `cdf`, `ppf` of the three distributions based on the `cyclic_boosting` reference * new implementation of `pdf`, as derivative of `cdf` As side effects of the rework: * all parameters now broadcast in numpy-like fashion, including `alpha`, `lower`, `upper`, which previously was not possible * the distributions can be 2D with more than 1 column, which previously was not possible * `version` (the base distribution) can now be an arbitrary continuous `scipy` distribution * `pdf` is numerically exact * the distributions do not have soft dependencies anymore Regarding the relation to `cyclic_boosting`: * this is clean room reimplementation and credit is given, so I hope this is fine license-wise - @FelixWick? * this is the result of trying to remove the `findiff` dependency for computing the `pdf` from the `cdf` that was introduced in #232, as well as cleanup before release. I ended up simplifying a lot, ending up here. In this sense, the work of @setoguchi-naoki was crucial in arriving at this point. * I would have no issue at all with you moving the improved code into `cyclic_boosting`. We can even restore the dependency and maintain the distribution logic in `cyclic_boosting` if that were your preference, e.g., for ownership reasons.
- Loading branch information