Commit

Merge pull request #49 from LSSTDESC/user/aimalz/renaming
naming consistency/clarity within src/rail/estimation
aimalz committed Jul 15, 2023
2 parents 63eb5e7 + f679441 commit 45ecb4c
Showing 17 changed files with 170 additions and 159 deletions.
44 changes: 20 additions & 24 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,35 +1,31 @@
# pz-rail-hub
# pz-rail

[![Template](https://img.shields.io/badge/Template-LINCC%20Frameworks%20Python%20Project%20Template-brightgreen)](https://lincc-ppt.readthedocs.io/en/latest/)
[![codecov](https://codecov.io/gh/LSSTDESC/pz-rail-hub/branch/main/graph/badge.svg)](https://codecov.io/gh/LSSTDESC/pz-rail-hub)
[![PyPI](https://img.shields.io/pypi/v/hub?color=blue&logo=pypi&logoColor=white)](https://pypi.org/project/hub/)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7017551.svg)](https://doi.org/10.5281/zenodo.7017551)
[![codecov](https://codecov.io/gh/LSSTDESC/pz-rail/branch/main/graph/badge.svg)](https://codecov.io/gh/LSSTDESC/pz-rail)
[![Template](https://img.shields.io/badge/Template-LINCC%20Frameworks%20Python%20Project%20Template-brightgreen)](https://lincc-ppt.readthedocs.io/en/latest/)

TODO - add more about your project here.

## RAIL: Redshift Assessment Infrastructure Layers
# RAIL: Redshift Assessment Infrastructure Layers

This package is part of the larger ecosystem of Photometric Redshifts
in [RAIL](https://github.com/LSSTDESC/RAIL).
RAIL is a flexible software library providing tools to produce at-scale photometric redshift data products, including uncertainties and summary statistics, and stress-test them under realistically complex systematics.
A detailed description of RAIL's modular structure is available in the [Overview](https://lsstdescrail.readthedocs.io/en/stable/source/overview.html) on ReadTheDocs.

### Citing RAIL
RAIL serves as the infrastructure supporting many extragalactic applications of the Legacy Survey of Space and Time (LSST) on the Vera C. Rubin Observatory, including Rubin-wide commissioning activities.
RAIL was initiated by the Photometric Redshifts (PZ) Working Group (WG) of the LSST Dark Energy Science Collaboration (DESC) as a result of the lessons learned from the [Data Challenge 1 (DC1) experiment](https://academic.oup.com/mnras/article/499/2/1587/5905416) to enable the PZ WG Deliverables in [the LSST-DESC Science Roadmap (see Sec. 5.18)](https://lsstdesc.org/assets/pdf/docs/DESC_SRM_latest.pdf), aiming to guide the selection and implementation of redshift estimators in DESC analysis pipelines.
RAIL is developed and maintained by a diverse team comprising DESC Pipeline Scientists (PSs), international in-kind contributors, LSST Interdisciplinary Collaboration for Computing (LINCC) Frameworks software engineers, and other volunteers, but all are welcome to join the team regardless of LSST data rights.

This code, while public on GitHub, has not yet been released by DESC and is
still under active development. Our release of v1.0 will be accompanied by a
journal paper describing the development and validation of RAIL.
## Installation

If you make use of the ideas or software in RAIL, please cite the repository
<https://github.com/LSSTDESC/RAIL>. You are welcome to re-use the code, which
is open source and available under terms consistent with the MIT license.
Installation instructions are available under [Installation](https://lsstdescrail.readthedocs.io/en/stable/source/installation.html) on ReadTheDocs.

External contributors and DESC members wishing to use RAIL for non-DESC projects
should consult with the Photometric Redshifts (PZ) Working Group conveners,
ideally before the work has started, but definitely before any publication or
posting of the work to the arXiv.
## Contributing

### Citing this package
The greatest strength of RAIL is its extensibility; those interested in contributing to RAIL should start by consulting the [Contributing guidelines](https://lsstdescrail.readthedocs.io/en/stable/source/contributing.html) on ReadTheDocs.

If you use this package, you should also cite the appropriate papers for each
code used. A list of such codes is included in the
[Citing RAIL](https://lsstdescrail.readthedocs.io/en/stable/source/citing.html)
section of the main RAIL Read The Docs page.
## Citing RAIL

RAIL is open source and may be used according to the terms of its [LICENSE](https://github.com/LSSTDESC/RAIL/blob/main/LICENSE) [(BSD 3-Clause)](https://opensource.org/licenses/BSD-3-Clause).
If you make use of the ideas or software here in any publication, you must cite this repository <https://github.com/LSSTDESC/RAIL> as "LSST-DESC PZ WG (in prep)" with the [Zenodo DOI](https://doi.org/10.5281/zenodo.7017551).
Please consider also inviting the developers as co-authors on publications resulting from your use of RAIL by [making an issue](https://github.com/LSSTDESC/rail/issues/new/choose).
Additionally, several of the codes accessible through the RAIL ecosystem must be cited if used in a publication.
A convenient list of what to cite may be found under [Citing RAIL](https://lsstdescrail.readthedocs.io/en/stable/source/citing.html) on ReadTheDocs.
4 changes: 2 additions & 2 deletions docs/source/citing.rst
Expand Up @@ -27,7 +27,7 @@ The following list provides the necessary references for external codes accessib
| GPz:
| PZFlowPDF:
| PZFlowEstimator:
| J. F. Crenshaw et al (in prep)
| `Zenodo link <https://zenodo.org/record/6369625#.Ylcpjy-cYW8>`_
Expand All @@ -38,4 +38,4 @@ The following list provides the necessary references for external codes accessib
| trainZ:
| `Schmidt, Malz et al (2020) <https://ui.adsabs.harvard.edu/abs/2020MNRAS.499.1587S/abstract>`_
| varInference:
| VarInfStackSummarizer:
37 changes: 26 additions & 11 deletions docs/source/contributing.rst
Expand Up @@ -83,6 +83,8 @@ Once you are satisfied with your PR, request that other team members review and
approve it. You could send the request to someone whom you've worked with on the
topic, or one of the core maintainers of rail.

**TODO what to call branches goes here**


Merge
-----
Expand All @@ -93,6 +95,15 @@ Once the changes in your PR have been approved, these are your next steps:
2. enter ``closes #[#]`` in the comment field to close the resolved issue
3. delete your branch using the button on the merged pull request.

If you are making changes that affect multiple repositories, make a branch and PR on each one.
The PRs should be merged and new releases made in the following order without long delays between steps:
1. `rail_base`
2. all per-algorithm repositories in any order
3. `rail`
4. `rail_pipelines`
This will minimize the time when new installations from PyPI could be broken by conflicts.
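
The ordering above can be sketched as a small sort helper. This is only an illustration, assuming that any repository not named in the list is a per-algorithm repository; the names ``rail_bpz`` and ``rail_flexzboost`` in the usage note are hypothetical placeholders, not confirmed repository names.

```python
# Tiers follow the numbered list above: rail_base first, then all
# per-algorithm repositories, then rail, then rail_pipelines.
RELEASE_TIERS = {"rail_base": 0, "rail": 2, "rail_pipelines": 3}
ALGORITHM_TIER = 1  # assumption: anything not named above is per-algorithm


def merge_order(repos):
    """Return the repositories sorted into the recommended merge/release order.

    ``sorted`` is stable, so per-algorithm repositories keep the order in
    which they were passed in ("any order" is acceptable for that tier).
    """
    return sorted(repos, key=lambda name: RELEASE_TIERS.get(name, ALGORITHM_TIER))
```

For example, ``merge_order(["rail_pipelines", "rail_bpz", "rail", "rail_base"])`` puts ``rail_base`` first and ``rail_pipelines`` last.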


Reviewing a PR
--------------

Expand All @@ -118,36 +129,39 @@ Naming conventions
We follow the `pep8 <https://peps.python.org/pep-0008/#descriptive-naming-styles>`_
recommendations for naming new modules and ``RailStage`` classes within them.


Modules
-------

Modules should use all lowercase, with underscores where it aids the readability
of the module name. If the module performs only one of p(z) or n(z) calculations,
it is convenient to include that in the module name.
of the module name.

e.g.
For example:

* ``simple_neurnet`` is a module name for algorithms that use simple neural networks from sklearn to compute p(z) or n(z)
* ``random_pz`` is an algorithm that computes p(z)
* ``skl_neurnet`` is a module name for algorithms that use scikit-learn's simple neural network implementation to estimate p(z)
* ``random_gauss`` is a module name for a p(z) estimation algorithm that assigns each galaxy a random Gaussian distribution

It's good for the module name to specify the source of the implementation of a particularly common algorithm, e.g. ``minisom_som`` and ``somoclu_som`` are distinct.
Note that these names should not be identical to the name of the package the algorithm came from, to avoid introducing namespace collisions for users who have imported the original package as well, i.e. ``pzflow_nf`` is a safer name than ``pzflow``.
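
As a rough illustration of the two module-naming rules above (lowercase with underscores, and no collision with the upstream package name), a check might look like the following. The function name ``check_module_name`` and the ``upstream_packages`` argument are hypothetical and are not part of RAIL.

```python
import re

# Lowercase words (letters and digits) joined by single underscores.
_MODULE_NAME = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_module_name(name, upstream_packages=()):
    """Return a list of naming problems; an empty list means the name passes."""
    problems = []
    if not _MODULE_NAME.match(name):
        problems.append("use lowercase letters, digits, and underscores only")
    if name in upstream_packages:
        # e.g. prefer ``pzflow_nf`` over ``pzflow`` to avoid shadowing the package
        problems.append(f"'{name}' is identical to an upstream package name")
    return problems
```

With this sketch, ``check_module_name("pzflow_nf", ("pzflow",))`` returns an empty list, while ``check_module_name("pzflow", ("pzflow",))`` reports the collision.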


Stages
------

RailStages are python classes and so should use CapWords convention. All rail
stages using the same algorithm should use the same short, descriptive prefix,
and the suffix is the type of stage.
RailStages are python classes and so should use the CapWords convention. All
rail stages using the same algorithm should use the same short, descriptive
prefix, and the suffix is the type of stage.

e.g.

* ``SimpleNNInformer`` is an informer using a simple neural network
* ``SimpleNNEstimator`` is an estimator using a simple neural network
* ``KNearNeighInformer`` is an informer using the k-nearest neighbors algorithm
* ``KNearNeighEstimator`` is an estimator using the k-nearest neighbors algorithm
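
The prefix-plus-suffix rule can be illustrated with a small helper that splits a stage class name into its algorithm prefix and stage type. This is a sketch only: ``split_stage_name`` is a hypothetical helper, and the suffix tuple contains just the stage types named in this section, so it may be incomplete.

```python
# Stage types named in this section (assumed to be a partial list).
STAGE_SUFFIXES = (
    "Summarizer", "Informer", "Estimator", "Classifier", "Creator", "Degrader",
)


def split_stage_name(class_name):
    """Split a CapWords stage name into (algorithm prefix, stage type)."""
    if "_" in class_name or not class_name[:1].isupper():
        raise ValueError(f"{class_name!r} does not follow the CapWords convention")
    for suffix in STAGE_SUFFIXES:
        prefix = class_name[: -len(suffix)]
        # Require a non-empty prefix so a bare "Estimator" is rejected.
        if class_name.endswith(suffix) and prefix:
            return prefix, suffix
    raise ValueError(f"{class_name!r} does not end in a known stage type")
```

Two stages built on the same algorithm should yield the same prefix: ``split_stage_name("KNearNeighInformer")`` and ``split_stage_name("KNearNeighEstimator")`` both return ``"KNearNeigh"`` as the first element.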

Possible suffixes include:

* Summarizer
* Informer
* Estimator
* Summarizer
* Classifier
* Creator
* Degrader
Expand All @@ -164,3 +178,4 @@ for those workflows:
* :ref:`Adding a new Rail Stage` without new dependencies
* :ref:`Adding a new algorithm` (new engine or package)
* :ref:`Sharing a Rail Pipeline`

10 changes: 5 additions & 5 deletions docs/source/installation.rst
Expand Up @@ -187,15 +187,15 @@ For Delight you should be able to just do:
pip install pz-rail-delight
However, the particular estimator `Delight` is built with `Cython` and uses `openmp`. Mac has dropped native support for `openmp`, which will likely cause problems when trying to run the `delightPZ` estimation code in RAIL. See the notes below for instructions on installing Delight if you wish to use this particular estimator.
However, the particular estimator `Delight` is built with `Cython` and uses `openmp`. Mac has dropped native support for `openmp`, which will likely cause problems when trying to run the `DelightEstimator` estimation code in RAIL. See the notes below for instructions on installing Delight if you wish to use this particular estimator.

If you are installing RAIL on a Mac, as noted above the `delightPZ` estimator requires that your machine's `gcc` be set up to work with `openmp`. If you are installing on a Mac and do not plan on using `delightPZ`, then you can simply install RAIL with `pip install .[base]` rather than `pip install .[all]`, which will skip the Delight package. If you are on a Mac and *do* expect to run `delightPZ`, then follow the instructions `here <https://github.com/LSSTDESC/Delight/blob/master/Mac_installation.md>`_ to install Delight before running `pip install .[all]`.
If you are installing RAIL on a Mac, as noted above the `DelightEstimator` estimator requires that your machine's `gcc` be set up to work with `openmp`. If you are installing on a Mac and do not plan on using `DelightEstimator`, then you can simply install RAIL with `pip install .[base]` rather than `pip install .[all]`, which will skip the Delight package. If you are on a Mac and *do* expect to run `DelightEstimator`, then follow the instructions `here <https://github.com/LSSTDESC/Delight/blob/master/Mac_installation.md>`_ to install Delight before running `pip install .[all]`.


Installing FZBoost
Installing FlexZBoost
---------------------

For FZBoost, you should be able to just do
For FlexZBoost, you should be able to just do

.. code-block:: bash
Expand Down Expand Up @@ -229,7 +229,7 @@ Using GPU-optimization for pzflow
Note that the Creation Module depends on pzflow, which has an optional GPU-compatible installation.
For instructions, see the `pzflow Github repo <https://github.com/jfcrenshaw/pzflow/>`_.

On some systems that are slightly out of date, e.g. an older version of python's `setuptools`, there can be some problems installing packages hosted on GitHub rather than PyPi. We recommend that you update your system; however, some users have still reported problems with installation of subpackages necessary for `FZBoost` and `bpz_lite`. If this occurs, try the following procedure:
On some systems that are slightly out of date, e.g. an older version of python's `setuptools`, there can be some problems installing packages hosted on GitHub rather than PyPi. We recommend that you update your system; however, some users have still reported problems with installation of subpackages necessary for `flexzboost` and `bpz_lite`. If this occurs, try the following procedure:

Once you have installed RAIL, you can import the package (via `import rail`) in any of your scripts and notebooks.
For examples demonstrating how to use the different pieces, see the notebooks in the `examples/` directory.
4 changes: 2 additions & 2 deletions docs/source/overview.rst
Expand Up @@ -72,8 +72,8 @@ Methods that estimate per-galaxy PDFs directly from photometry are referred to a
Individual estimation and summarization codes are "wrapped" as RAIL stages so that they can be run in a controlled way.

**base design**:
Estimators for several popular codes `BPZ_lite` (a slimmed down version of the popular template-based BPZ code), `FlexZBoost`, and delight `Delight` are included in rail/estimation, as are an estimator `PZFlowPDF` that uses the same normalizing flow employed in the creation module, and `KNearNeighPDF` for a simple color-based nearest neighbor estimator.
The pathological `trainZ` estimator is also implemented.
Estimators for several popular codes `BPZliteEstimator` (a slimmed down version of the popular template-based BPZ code), `FlexZBoostEstimator`, and `DelightEstimator` are included in rail/estimation, as are an estimator `PZFlowEstimator` that uses the same normalizing flow employed in the creation module, and `KNearNeighEstimator` for a simple color-based nearest neighbor estimator.
The pathological `TrainZEstimator` estimator is also implemented.
Several very basic summarizers such as a histogram of point source estimates, the naive "stacking"/summing of PDFs, and a variational inference-based summarizer are also included in RAIL.

**Usage**:
8 changes: 4 additions & 4 deletions examples/core_examples/FileIO_DataStore.ipynb
Expand Up @@ -221,7 +221,7 @@
"source": [
"# Using the data in a pipeline stage: photo-z estimation example\n",
"\n",
"Now that we have our data in place, we can use it in a RAIL stage. As an example, we'll estimate photo-z's for our data. Let's train the `KNearNeighPDF` algorithm with our train_data, and then estimate photo-z's for the test_data. We need to make the RAIL stages for each of these steps, first we need to train/inform our nearest neighbor algorithm with the train_data:"
"Now that we have our data in place, we can use it in a RAIL stage. As an example, we'll estimate photo-z's for our data. Let's train the `KNearNeighEstimator` algorithm with our train_data, and then estimate photo-z's for the test_data. We need to make the RAIL stages for each of these steps, first we need to train/inform our nearest neighbor algorithm with the train_data:"
]
},
{
Expand All @@ -230,7 +230,7 @@
"metadata": {},
"outputs": [],
"source": [
"from rail.estimation.algos.knnpz import Inform_KNearNeighPDF, KNearNeighPDF"
"from rail.estimation.algos.k_nearneigh import KNearNeighInformer, KNearNeighEstimator"
]
},
{
Expand All @@ -239,7 +239,7 @@
"metadata": {},
"outputs": [],
"source": [
"inform_knn = Inform_KNearNeighPDF.make_stage(name='inform_knn', input='train_data', \n",
"inform_knn = KNearNeighInformer.make_stage(name='inform_knn', input='train_data', \n",
" nondetect_val=99.0, model='knnpz.pkl',\n",
" hdf5_groupname='')\n"
]
Expand Down Expand Up @@ -268,7 +268,7 @@
"metadata": {},
"outputs": [],
"source": [
"estimate_knn = KNearNeighPDF.make_stage(name='estimate_knn', hdf5_groupname='photometry', nondetect_val=99.0,\n",
"estimate_knn = KNearNeighEstimator.make_stage(name='estimate_knn', hdf5_groupname='photometry', nondetect_val=99.0,\n",
" model='knnpz.pkl', output=\"KNNPZ_estimates.hdf5\")"
]
},
12 changes: 6 additions & 6 deletions examples/estimation_examples/NZDir.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@
"metadata": {},
"outputs": [],
"source": [
"from rail.estimation.algos.NZDir import NZDir, Inform_NZDir\n",
"from rail.estimation.algos.nz_dir import NZDirSummarizer, NZDirInformer\n",
"from rail.core.data import TableHandle\n",
"from rail.core.stage import RailStage"
]
Expand Down Expand Up @@ -161,7 +161,7 @@
"id": "f65d4835-2ff6-4206-b017-cdf1d7cad828",
"metadata": {},
"source": [
"Now, let's set up or estimator, first creating a stage for the informer. We define any input variables in a dictionary and then use that with `make_stage` to create an instance of our NZDir summarizer. We'll create a histogram of 25 bins, using 5 nearest neighbors to define our specz neighborhood, and above we defined our bin column as \"bin\":"
"Now, let's set up or estimator, first creating a stage for the informer. We define any input variables in a dictionary and then use that with `make_stage` to create an instance of our NZDirSummarizer. We'll create a histogram of 25 bins, using 5 nearest neighbors to define our specz neighborhood, and above we defined our bin column as \"bin\":"
]
},
{
Expand All @@ -171,7 +171,7 @@
"metadata": {},
"outputs": [],
"source": [
"train_nzdir = Inform_NZDir.make_stage(name='train_nzdir', n_neigh=5,\n",
"train_nzdir = NZDirInformer.make_stage(name='train_nzdir', n_neigh=5,\n",
" szweightcol='weight', model=\"NZDir_model.pkl\")"
]
},
Expand Down Expand Up @@ -225,7 +225,7 @@
"binnames = ['low', 'mid', 'hi']\n",
"bin_datasets = [low_bin, mid_bin, hi_bin]\n",
"for bin, indata in zip(binnames, bin_datasets):\n",
" nzsumm = NZDir.make_stage(name=f'nzsumm_{bin}', **summdict)\n",
" nzsumm = NZDirSummarizer.make_stage(name=f'nzsumm_{bin}', **summdict)\n",
" bin_ens[f'{bin}'] = nzsumm.estimate(indata)"
]
},
Expand Down Expand Up @@ -381,7 +381,7 @@
"source": [
"xinformdict = dict(n_neigh=5, bincol=\"bin\", szweightcol='weight',\n",
" model=\"NZDir_model_incompl.pkl\", hdf5_groupname='')\n",
"newsumm_inform = Inform_NZDir.make_stage(name='newsumm_inform', **xinformdict)"
"newsumm_inform = NZDirInformer.make_stage(name='newsumm_inform', **xinformdict)"
]
},
{
Expand Down Expand Up @@ -416,7 +416,7 @@
"binnames = ['low', 'mid', 'hi']\n",
"bin_datasets = [low_bin, mid_bin, hi_bin]\n",
"for bin, indata in zip(binnames, bin_datasets):\n",
" nzsumm = NZDir.make_stage(name=f'nzsumm_{bin}', **xestimatedict)\n",
" nzsumm = NZDirSummarizer.make_stage(name=f'nzsumm_{bin}', **xestimatedict)\n",
" new_ens[f'{bin}'] = nzsumm.estimate(indata)"
]
},
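
Downstream scripts that still use the old names from the notebook diffs above could be updated with a small text rewrite. The sketch below is not part of this commit; it covers only the module and class renames visible in these two notebooks, and the helper names ``_substitute`` and ``migrate_source`` are hypothetical.

```python
import re

# Renames taken from the notebook diffs above (not an exhaustive list).
MODULE_RENAMES = {
    "rail.estimation.algos.knnpz": "rail.estimation.algos.k_nearneigh",
    "rail.estimation.algos.NZDir": "rail.estimation.algos.nz_dir",
}
CLASS_RENAMES = {
    "Inform_KNearNeighPDF": "KNearNeighInformer",
    "KNearNeighPDF": "KNearNeighEstimator",
    "Inform_NZDir": "NZDirInformer",
    "NZDir": "NZDirSummarizer",
}


def _substitute(text, renames):
    """Replace whole-word occurrences of each key in a single pass."""
    # Longest names first so "Inform_NZDir" is tried before "NZDir".
    names = sorted(renames, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], text)


def migrate_source(text):
    """Rewrite old module paths first, then old class names."""
    # Module paths go first so the old module name "NZDir" is not
    # mistaken for the old class name "NZDir".
    return _substitute(_substitute(text, MODULE_RENAMES), CLASS_RENAMES)
```

Because matching is on whole words, file names such as ``NZDir_model.pkl`` and ``knnpz.pkl`` are left untouched.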
