Samples parameterization #33

Closed
aimalz opened this issue May 20, 2021 · 7 comments
Labels
enhancement New feature or request parameterization new/upgraded PDF parameterization need

Comments

@aimalz
Collaborator

aimalz commented May 20, 2021

Currently we can make PDFs from samples via qp.spline_from_samples but, unless I'm missing something, there isn't a parameterization whose parameters are the sample values themselves rather than the spline parameters derived from a KDE thereof. This would be very helpful for things like the PIT metric used in RAIL, which is a 1D probability distribution defined by samples.

@eacharles
Collaborator

eacharles commented Jul 13, 2021

I'm not actually sure what it means to have a PDF parametrization whose parameters are the sample values. How do you compute the pdf or cdf or ppf from sample values? It seems to me that, given some sample values, you need to convert to some other representation, e.g., qp.spline_from_samples.

Perhaps what you would like is a class that allows users to store sample values, and provides an easy interface to the conversion routines.

@eacharles eacharles added the question Further information is requested label Jul 13, 2021
@aimalz
Collaborator Author

aimalz commented Jul 20, 2021

Functionally, I agree that the methods would have to be implemented using the KDE as a default intermediary, so a class that initializes an ensemble from samples, outputs to samples, and connects to the conversion functions would indeed be useful.

@eacharles
Collaborator

So, the KDE representation was really inefficient for large samples b/c it evaluated the PDF by doing an operation that involved all the samples, and also b/c there wasn't really a smart way to implement _cdf or _ppf. So what I did was convert it to a spline: Spline_Gen.create_from_samples will create a PDF from samples. And of course you can generate samples from any ensemble using ens.rvs(). We could put in an explicit KDE that computes things using the samples, but we are gonna want to tell people not to use it for more than a few samples or a few PDFs, because it is really not performant.
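To illustrate the trade-off described above, here is a minimal sketch that uses SciPy stand-ins rather than qp itself. `gaussian_kde` touches every sample on each evaluation, while a spline fitted once to the KDE on a fixed grid is cheap to evaluate and has an antiderivative that can serve as a CDF. The grid size, padding, and variable names are assumptions for illustration only; this is not how `Spline_Gen.create_from_samples` is implemented.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline
from scipy.stats import gaussian_kde

# Hypothetical sample set standing in for one PDF's samples.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.5, scale=0.1, size=2000)

# Direct KDE: each call to kde(x) sums a kernel over all samples.
kde = gaussian_kde(samples)

# Evaluate the KDE once on a fixed grid (201 points and 0.3 padding are
# arbitrary choices), then keep only the spline.
grid = np.linspace(samples.min() - 0.3, samples.max() + 0.3, 201)
pdf_spline = InterpolatedUnivariateSpline(grid, kde(grid), k=3)

# The spline's antiderivative gives a CDF without revisiting the samples.
cdf_raw = pdf_spline.antiderivative()


def spline_cdf(x):
    """CDF from the spline, anchored to zero at the left edge of the grid."""
    return cdf_raw(x) - cdf_raw(grid[0])


total_mass = float(spline_cdf(grid[-1]))  # should be close to 1
```

After the one-time fit, the cost of evaluating `pdf_spline` or `spline_cdf` depends on the grid size rather than on the number of samples.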

@eacharles
Collaborator

It would probably be a better long-term solution just to make a notebook that shows how to invoke Spline_Gen.create_from_samples, and maybe a function that does ensemble.write_samples() for any PDF.

@eacharles
Collaborator

If for whatever reason you want something ensemblish that ties together the reading and writing of samples, I would actually consider using the newly minted ancillary data to do that. I.e., a spline_pdf that carries around the samples used to generate it.

@aimalz
Collaborator Author

aimalz commented Jul 28, 2021

Re: KDE, I think the dominant use case would do it for lots of samples and lots of PDFs, so your concern about computation is a fair one. Perhaps the most natural thing to do is actually quantiles, where the N sample values {z} naturally define regular quantiles separated by 1/N, which could then be binned down upon conversion.
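As a rough NumPy-only sketch of the quantile idea: the sorted sample values act as quantile locations at levels spaced by 1/N, and a coarser quantile set is obtained by interpolation. The function names and the midpoint convention (i + 0.5)/N for the levels are assumptions for illustration, not part of qp.

```python
import numpy as np


def samples_to_quantiles(samples, n_quantiles=None):
    """Return (levels, locations) treating N samples as N regular quantiles.

    Levels use the midpoint convention (i + 0.5) / N, an assumed choice that
    keeps them spaced by 1/N and strictly inside (0, 1). If n_quantiles is
    given and smaller than N, the quantiles are binned down by interpolation.
    """
    z = np.sort(np.asarray(samples, dtype=float))
    n = z.size
    levels = (np.arange(n) + 0.5) / n
    if n_quantiles is None or n_quantiles >= n:
        return levels, z
    coarse = (np.arange(n_quantiles) + 0.5) / n_quantiles
    return coarse, np.interp(coarse, levels, z)


def quantile_ppf(q, levels, locs):
    """Inverse CDF by linear interpolation between stored quantiles."""
    return np.interp(q, levels, locs)


def quantile_cdf(x, levels, locs):
    """CDF by linear interpolation, clamped to 0 and 1 outside the samples."""
    return np.interp(x, locs, levels, left=0.0, right=1.0)


# Usage: ten evenly spaced samples, binned down to five quantiles.
levels, locs = samples_to_quantiles(np.arange(10.0))
coarse_levels, coarse_locs = samples_to_quantiles(np.arange(10.0), n_quantiles=5)
```

Repeated sample values make the locations non-strictly increasing, so `quantile_cdf` is ambiguous at ties; a real implementation would need to decide how to handle that.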

@aimalz aimalz added parameterization new/upgraded PDF parameterization need and removed question Further information is requested labels Dec 6, 2022
@aimalz aimalz added the enhancement New feature or request label Jul 18, 2023
@aimalz
Collaborator Author

aimalz commented Aug 2, 2023

#170 is a duplicate of this but the fresher conversation makes it the more reasonable issue to keep open.

@aimalz aimalz closed this as completed Aug 2, 2023
2 participants