Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

u/parejkoj/pipeline: improve docs on config overrides #430

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

parejkoj
Copy link
Contributor

No description provided.

We no longer want pipelines to live in obs packages.
@parejkoj parejkoj requested a review from kfindeisen July 29, 2024 23:38
Copy link

codecov bot commented Jul 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.24%. Comparing base (83b1bb7) to head (b414d14).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #430      +/-   ##
==========================================
+ Coverage   80.21%   80.24%   +0.03%     
==========================================
  Files          96       96              
  Lines       10971    10971              
  Branches     2099     2099              
==========================================
+ Hits         8800     8804       +4     
+ Misses       1822     1820       -2     
+ Partials      349      347       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@parejkoj parejkoj force-pushed the u/parejkoj/pipeline-override-docfix branch from 128763f to b414d14 Compare July 30, 2024 00:44
Copy link
Member

@kfindeisen kfindeisen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this documentation is either accurate or accessible to new users. Maybe it would be better to create a Jira issue and take the time to do it properly?

level of a package.
* Instrument packages should provide `Pipeline`\ s that override standard
* High level pipeline packages (e.g. ``ap_pipe``, ``drp_pipe``) should provide `Pipeline`\ s that override standard
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cp_pipe?

@@ -457,6 +457,35 @@ used in instrument-specific ``pipelines`` sub-directories, for example
you to run a `Pipeline` that is configured for the desired camera, or can
serve as a base for further `Pipeline`\ s to import.

"Obs packages" like ``obs_subaru`` also provide a place for instrument-specific ``~lsst.pipe.base.PipelineTask`` configuration overrides, in the ``config`` sub-directory.
These python files are loaded with ``exec`` in the ``~lsst.pipe.base.Task`` context, with the ``config`` variable pointing to that Task's config class.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exec is an implementation detail, and is not something the user needs to know. In any case, explaining config files is best deferred to https://pipelines.lsst.io/v/daily/modules/lsst.pipe.base/creating-a-task.html#configuration (label creating-a-task-configuration), though ATM it doesn't say anything about the files except that they exist.

@@ -457,6 +457,35 @@ used in instrument-specific ``pipelines`` sub-directories, for example
you to run a `Pipeline` that is configured for the desired camera, or can
serve as a base for further `Pipeline`\ s to import.

"Obs packages" like ``obs_subaru`` also provide a place for instrument-specific ``~lsst.pipe.base.PipelineTask`` configuration overrides, in the ``config`` sub-directory.
These python files are loaded with ``exec`` in the ``~lsst.pipe.base.Task`` context, with the ``config`` variable pointing to that Task's config class.
The filename must match the task's :ref:`name <creating-a-task-class-variables>`, e.g. ``calibrateImage.py`` for ``~lsst.pipe.tasks.calibrateImage.CalibrateImage``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tasks don't have a well-defined name. The lookup is actually based on the task's label in a pipeline, and is only done for top-level tasks. While using _DefaultName is a good convention in practice, we have enough pipelines with multiple copies of a task that the reader should understand what happens in that case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double backticks will literally print ~lsst.pipe.tasks.calibrateImage.CalibrateImage. Please check your build.

Suggested change
The filename must match the task's :ref:`name <creating-a-task-class-variables>`, e.g. ``calibrateImage.py`` for ``~lsst.pipe.tasks.calibrateImage.CalibrateImage``.
The filename must match the task's :ref:`name <creating-a-task-class-variables>`, e.g. ``calibrateImage.py`` for `~lsst.pipe.tasks.calibrateImage.CalibrateImage`.

"Obs packages" like ``obs_subaru`` also provide a place for instrument-specific ``~lsst.pipe.base.PipelineTask`` configuration overrides, in the ``config`` sub-directory.
These python files are loaded with ``exec`` in the ``~lsst.pipe.base.Task`` context, with the ``config`` variable pointing to that Task's config class.
The filename must match the task's :ref:`name <creating-a-task-class-variables>`, e.g. ``calibrateImage.py`` for ``~lsst.pipe.tasks.calibrateImage.CalibrateImage``.
The file will be placed in either the obs-package's ``config`` sub-directory (for packages that are only define a single instrument) or a ``config`` sub-directory specific to that instrument (for packages that define multiple instruments), e.g. ``config/LATISS``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obs_lsst has configs directly under config, too:

Suggested change
The file will be placed in either the obs-package's ``config`` sub-directory (for packages that are only define a single instrument) or a ``config`` sub-directory specific to that instrument (for packages that define multiple instruments), e.g. ``config/LATISS``.
The file will be placed in either the obs-package's ``config`` sub-directory (for settings that apply to all a package's instruments) or a ``config`` sub-directory specific to that instrument (for settings specific to one instrument in multi-instrument packages), e.g. ``config/LATISS``.

These python files are loaded with ``exec`` in the ``~lsst.pipe.base.Task`` context, with the ``config`` variable pointing to that Task's config class.
The filename must match the task's :ref:`name <creating-a-task-class-variables>`, e.g. ``calibrateImage.py`` for ``~lsst.pipe.tasks.calibrateImage.CalibrateImage``.
The file will be placed in either the obs-package's ``config`` sub-directory (for packages that are only define a single instrument) or a ``config`` sub-directory specific to that instrument (for packages that define multiple instruments), e.g. ``config/LATISS``.
They should be used to specific configuration overrides that are specific to that instrument for any pipeline using that ``~lsst.pipe.base.PipelineTask``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
They should be used to specific configuration overrides that are specific to that instrument for any pipeline using that ``~lsst.pipe.base.PipelineTask``.
They should be used to specify configuration overrides that are specific to that instrument for any pipeline using that `~lsst.pipe.base.PipelineTask`.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As understanding of our pipelines and algorithms evolve, it is natural to need to change the configuration options that were originally selected--possibly arbitrarily--as the defaults.
There are several places where such a configuration change could be made: in the lowest-level ``~lsst.pipe.base.Task`` config class field; in the ``~lsst.pipe.base.PipelineTask`` config class ``~lsst.pex.config.Config.setDefaults`` method; in the instrument-specific ``obs_*``-package override file for that ``~lsst.pipe.base.PipelineTask``; in the pipeline-specific pipeline definition file; or in the instrument- and pipeline-specific pipeline definition file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix all the broken links.

Comment on lines +492 to +497
Where to make configuration changes?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As understanding of our pipelines and algorithms evolve, it is natural to need to change the configuration options that were originally selected--possibly arbitrarily--as the defaults.
There are several places where such a configuration change could be made: in the lowest-level ``~lsst.pipe.base.Task`` config class field; in the ``~lsst.pipe.base.PipelineTask`` config class ``~lsst.pex.config.Config.setDefaults`` method; in the instrument-specific ``obs_*``-package override file for that ``~lsst.pipe.base.PipelineTask``; in the pipeline-specific pipeline definition file; or in the instrument- and pipeline-specific pipeline definition file.
When considering where to make a configuration change or opt-in algorithmic change, developers should apply these guidelines:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording is confusing -- you keep talking about changing existing configs, but doesn't this also apply when you first create a new config? What is an "opt-in algorithmic change"?

Comment on lines +499 to +504
- Prefer making changes directly in the lowest-level ``~lsst.pipe.base.Task`` defaults whenever possible, and updating tests that break as a result. When the latter is a substantial scope increase or simply seems inappropriate, the old configs can be explicitly specified in the tests.
- When multiple config changes generally need to be made together, consider making a helper method to do this, as part of the public interface in the lowest-level ``~lsst.pipe.base.Task`` file.
- When making changes directly in the lowest-level Task is not possible, next consider putting config overrides for that ``~lsst.pipe.base.Task`` in the ``~lsst.pipe.base.PipelineTask`` config class's ``~lsst.pex.config.Config.setDefaults`` method. This will ensure that all pipelines using that PipelineTask will use the configuration, independent of instrument or pipeline definition.
- If the configuration is needed for only one a production-specific pipeline (e.g. DRP or AP), put the override in an instrument-generic production-specific config override file in one of the ``*_pipe`` packages. Often this will involve calling a new helper method as in earlier previous bullet, and nothing more.
- Put config options in one ore more obs-package only when they are appropriate for essentially all instances of the Task on that instrument (mostly ISR and other tasks that deal with the actual camera structure), and only that instrument. Be wary of putting filter-specific configs here, as the set of all filters being processed in a particular run is not a per-instrument constant, and often data-set-specific overrides are better.
- Put config options in other Pipeline snippets or production+instrument-specific configs only when no other places are possible.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I don't see how anyone who isn't intimately familiar with the stack can understand this. This requires knowledge of subtasks (and the ability to recognize the third bullet is talking about subtasks even though they're never named as such), familiarity with "helper methods" (a new term), and probably more I'm overlooking.

- Prefer making changes directly in the lowest-level ``~lsst.pipe.base.Task`` defaults whenever possible, and updating tests that break as a result. When the latter is a substantial scope increase or simply seems inappropriate, the old configs can be explicitly specified in the tests.
- When multiple config changes generally need to be made together, consider making a helper method to do this, as part of the public interface in the lowest-level ``~lsst.pipe.base.Task`` file.
- When making changes directly in the lowest-level Task is not possible, next consider putting config overrides for that ``~lsst.pipe.base.Task`` in the ``~lsst.pipe.base.PipelineTask`` config class's ``~lsst.pex.config.Config.setDefaults`` method. This will ensure that all pipelines using that PipelineTask will use the configuration, independent of instrument or pipeline definition.
- If the configuration is needed for only one a production-specific pipeline (e.g. DRP or AP), put the override in an instrument-generic production-specific config override file in one of the ``*_pipe`` packages. Often this will involve calling a new helper method as in earlier previous bullet, and nothing more.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody does this, not even https://github.com/lsst/drp_pipe/.

@@ -451,15 +451,57 @@ class name of the python camera object. For instance, for an ``obs_subaru``

instrument: lsst.obs.subaru.HyperSuprimeCam
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any documentation of what order config settings are applied in (field defaults, setDefaults, general obs overrides, instrument obs overrides, top-level pipelines, included pipelines, and maybe other stuff I'm forgetting). It would be hard for the reader to make informed decisions without that kind of big picture.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants