-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
equal p(z) bins for all sources in mock_data #404
Comments
If I understand well, the incriminated lines are CLMM/clmm/support/mock_data.py Line 380 in b94f88b
galaxy_catalog['z'] itself, and then extend these zmin and zmax by pdf_range , and proceed. The resulting binning will be unique and applicable to all the galaxies in the catalog (in general I think that this for loop should disappear, but that will demand a bit of careful broadcasting).
|
Yes, we can replace that for loop by the following (which I haven't tested): zbins = np.arange(zsrc_min, zsrc_max+zsrc_bin_width, zsrc_bin_width)
pzpdf_grid = np.exp(-(zbins[:,None]-galaxy_catalog['z'])/(2*pzsigma**2) / ((2*np.pi)**0.5*pzsigma) or maybe the transpose of that. (And by the way, |
git blame say Matthew Kirby 2 years ago, for the for loop, and then some refactoring 8 months ago by Céline, Alex, and a couple others. There are many ways to improve the code here, but there is not much point in doing so before it is really used. And I do not think that "really using it" should wait for qp to enter the game. Quite the contrary actually IMHO..... |
I don't recall why we implemented it this way. I guess one question is what will happen in real life? Will photoz codes provide pdf on a single binning for all galaxies or not (it's not obvious to me). If this is the standard then I completely agree that this should be changed to something like @cristobal-sifon proposes. |
I think that the code here should not care. You should be able to mock pdfs, ingest them in some TBD form by something like qp, and make sure that the code downstream does not depend on the way the pdf is represented. This is the whole point of qp. So I do not think that it matters what real life will be : here we need to abstract away from it. The only real question to my mind is how it is provided, which is a different question : e.g. inside an object catalog, or separately with some id to join to an object catalog? |
Given this week's tag up discussion, we decided to have a more abstract approach for storing the PDF in galcat. The idea is to have the PDF column as an object type and add a pdf attribute to the galaxy catalog to describe the use of this column. So we can have something like:
right now we can start implementing this first case with simple modification of what is in branch issue/404/galcat_pdf_improvement |
FWIW, this is really pretty heavily overlapping with functionality in qp and in particular the construction of the qp.Ensemble, which represents a set of pdfs.
It would be good to chat a bit about this. |
This is linked to issue #401 (interfacing ' |
The current implementation with
photoz_sigma_unscaled
generates a pdf that is sampled differently for every galaxy. From_compute_photoz_pdfs
:so the result is a column like:
This is both very inconvenient, because the user then has to interpolate if they want to combine the pdfs, and a big waste of space.
I propose to change this for a single binning scheme that depends on
(zsrc_min,zsrc_max)
and a new optional argumentzsrc_bin_width
or similar, so that the binning can be stored as a separate attribute ofGalaxyCluster
, e.g.,zsrc_pdf_centers
or justzsrc_bins
. See related suggestion in #401.The text was updated successfully, but these errors were encountered: