-
Notifications
You must be signed in to change notification settings - Fork 12
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #339 from lsst/tickets/DM-38952
DM-38952: revive pipeline mock system and move it here (from ctrl_mpexec)
- Loading branch information
Showing
14 changed files
with
1,112 additions
and
14 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Revived bitrotted support for "mocked" `PipelineTask` execution and moved it here (from `ctrl_mpexec`). |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
.. py:currentmodule:: lsst.pipe.base.tests.mocks | ||
.. _testing-pipelines-with-mocks: | ||
|
||
############################ | ||
Testing pipelines with mocks | ||
############################ | ||
|
||
The `lsst.pipe.base.tests.mocks` package provides a way to build and execute `.QuantumGraph` objects without actually running any real task code or relying on real data. | ||
This is primarily for testing the middleware responsible for `.QuantumGraph` generation and execution, but it can also be used to check that the connections in a configured pipeline are consistent with each other and with any documented recommendations for how to run those steps (e.g., which dimensions can safely be constrained by user expressions). | ||
|
||
The high-level entry point to this system is `mock_task_defs` function, which takes an iterable of `.TaskDef` objects (typically obtained from `.Pipeline.toExpandedPipeline`) and returns a new sequence of `.TaskDef` objects, in which each original task has been replaced by a configuration of `MockPipelineTask` whose connections are analogous to the original. | ||
Passing the ``--mock`` option to ``pipetask qgraph`` or ``pipetask run`` will run this on the given pipeline when building the graph. | ||
When a pipeline is mocked, all task labels and dataset types are transformed by the `get_mock_name` function (so these can live alongside their real counterparts in a single data repository), and the storage classes of all regular connections are replaced with instances of `MockStorageClass`. | ||
The in-memory Python type for `MockStorageClass` is always `MockDataset`, which is always written to disk in JSON format, but conversions between mock storage classes are always defined analogously to the original storage classes they mock, and the `MockDataset` class records conversions (and component access and parameters) when they occur, allowing test code that runs later to load them and inspect exactly how the object was loaded and provided to the task when it was executed. | ||
|
||
The `MockPipelineTask.runQuantum` method reads all input mocked datasets that correspond to a `MockStorageClass` and simulates reading any input datasets there were not mocked (via the `MockPipelineTaskConfig.unmocked_dataset_types` config option, or the `mock_task_defs` argument of the same name) by constructing a new `MockDataset` instance for them. | ||
It then constructs and writes new `MockDataset` instances for each of its predicted outputs, storing copies of the input `MockDataset`\s within them. | ||
`MockPipelineTaskConfig` and `mock_task_defs` also have options for causing quanta that match a data ID expression to raise an exception instead. | ||
Dataset types produced by the execution framework - configs, logs, metadata, and package version information - are not mocked, but they are given names with the prefix added by `get_mock_name` by virtue of being constructed from a task label that has that prefix. | ||
|
||
Importing the `lsst.pipe.base.tests.mocks` package causes the `~lsst.daf.butler.StorageClassFactory` and `~lsst.daf.butler.FormatterFactory` classes to be monkey-patched with special code that recognizes mock storage class names without being included in any butler configuration files. | ||
This should not affect how any non-mock storage classes are handled, but it is still best to only import `lsst.pipe.base.tests.mocks` in code that is *definitely* using the mock system, even if that means putting the import at function scope instead of module scope. | ||
|
||
The `ci_middleware <https://github.com/lsst/ci_middleware.git>`_ package is the primary place where this mocking library is used, and the home of its unit tests, but it has been designed to be usable in regular "real" data repositories as well. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# This file is part of pipe_base. | ||
# | ||
# Developed for the LSST Data Management System. | ||
# This product includes software developed by the LSST Project | ||
# (http://www.lsst.org). | ||
# See the COPYRIGHT file at the top-level directory of this distribution | ||
# for details of code ownership. | ||
# | ||
# This program is free software: you can redistribute it and/or modify | ||
# it under the terms of the GNU General Public License as published by | ||
# the Free Software Foundation, either version 3 of the License, or | ||
# (at your option) any later version. | ||
# | ||
# This program is distributed in the hope that it will be useful, | ||
# but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
# GNU General Public License for more details. | ||
# | ||
# You should have received a copy of the GNU General Public License | ||
# along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
|
||
"""A system for replacing the tasks in a pipeline with mocks that just read and | ||
write trivial datasets. | ||
See :ref:`testing-pipelines-with-mocks` for details. | ||
""" | ||
|
||
from ._data_id_match import * | ||
from ._pipeline_task import * | ||
from ._storage_class import * |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,173 @@ | ||
# This file is part of pipe_base. | ||
# | ||
# Developed for the LSST Data Management System. | ||
# This product includes software developed by the LSST Project | ||
# (http://www.lsst.org). | ||
# See the COPYRIGHT file at the top-level directory of this distribution | ||
# for details of code ownership. | ||
# | ||
# This program is free software: you can redistribute it and/or modify | ||
# it under the terms of the GNU General Public License as published by | ||
# the Free Software Foundation, either version 3 of the License, or | ||
# (at your option) any later version. | ||
# | ||
# This program is distributed in the hope that it will be useful, | ||
# but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
# GNU General Public License for more details. | ||
# | ||
# You should have received a copy of the GNU General Public License | ||
# along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
|
||
from __future__ import annotations | ||
|
||
__all__ = ["DataIdMatch"] | ||
|
||
import operator | ||
from collections.abc import Callable | ||
from typing import Any | ||
|
||
import astropy.time | ||
from lsst.daf.butler import DataId | ||
from lsst.daf.butler.registry.queries.expressions.parser import Node, ParserYacc, TreeVisitor # type: ignore | ||
|
||
|
||
class _DataIdMatchTreeVisitor(TreeVisitor): | ||
"""Expression tree visitor which evaluates expression using values from | ||
DataId. | ||
""" | ||
|
||
def __init__(self, dataId: DataId): | ||
self.dataId = dataId | ||
|
||
def visitNumericLiteral(self, value: str, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
try: | ||
return int(value) | ||
except ValueError: | ||
return float(value) | ||
|
||
def visitStringLiteral(self, value: str, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
return value | ||
|
||
def visitTimeLiteral(self, value: astropy.time.Time, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
return value | ||
|
||
def visitRangeLiteral(self, start: int, stop: int, stride: int | None, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
if stride is None: | ||
return range(start, stop + 1) | ||
else: | ||
return range(start, stop + 1, stride) | ||
|
||
def visitIdentifier(self, name: str, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
return self.dataId[name] | ||
|
||
def visitUnaryOp(self, operator_name: str, operand: Any, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
operators: dict[str, Callable[[Any], Any]] = { | ||
"NOT": operator.not_, | ||
"+": operator.pos, | ||
"-": operator.neg, | ||
} | ||
return operators[operator_name](operand) | ||
|
||
def visitBinaryOp(self, operator_name: str, lhs: Any, rhs: Any, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
operators = { | ||
"OR": operator.or_, | ||
"AND": operator.and_, | ||
"+": operator.add, | ||
"-": operator.sub, | ||
"*": operator.mul, | ||
"/": operator.truediv, | ||
"%": operator.mod, | ||
"=": operator.eq, | ||
"!=": operator.ne, | ||
"<": operator.lt, | ||
">": operator.gt, | ||
"<=": operator.le, | ||
">=": operator.ge, | ||
} | ||
return operators[operator_name](lhs, rhs) | ||
|
||
def visitIsIn(self, lhs: Any, values: list[Any], not_in: bool, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
is_in = True | ||
for value in values: | ||
if not isinstance(value, range): | ||
value = [value] | ||
if lhs in value: | ||
break | ||
else: | ||
is_in = False | ||
if not_in: | ||
is_in = not is_in | ||
return is_in | ||
|
||
def visitParens(self, expression: Any, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
return expression | ||
|
||
def visitTupleNode(self, items: tuple[Any, ...], node: Node) -> Any: | ||
# docstring is inherited from base class | ||
raise NotImplementedError() | ||
|
||
def visitFunctionCall(self, name: str, args: list[Any], node: Node) -> Any: | ||
# docstring is inherited from base class | ||
raise NotImplementedError() | ||
|
||
def visitPointNode(self, ra: Any, dec: Any, node: Node) -> Any: | ||
# docstring is inherited from base class | ||
raise NotImplementedError() | ||
|
||
|
||
class DataIdMatch: | ||
"""Class that can match DataId against the user-defined string expression. | ||
Parameters | ||
---------- | ||
expression : `str` | ||
User-defined expression, supports syntax defined by daf_butler | ||
expression parser. Maps identifiers in the expression to the values of | ||
DataId. | ||
""" | ||
|
||
def __init__(self, expression: str): | ||
parser = ParserYacc() | ||
self.expression = expression | ||
self.tree = parser.parse(expression) | ||
|
||
def match(self, dataId: DataId) -> bool: | ||
"""Matches DataId contents against the expression. | ||
Parameters | ||
---------- | ||
dataId : `DataId` | ||
DataId that is matched against an expression. | ||
Returns | ||
------- | ||
match : `bool` | ||
Result of expression evaluation. | ||
Raises | ||
------ | ||
KeyError | ||
Raised when identifier in expression is not defined for given | ||
`DataId`. | ||
TypeError | ||
Raised when expression evaluates to a non-boolean type or when | ||
operation in expression cannot be performed on operand types. | ||
NotImplementedError | ||
Raised when expression includes valid but unsupported syntax, e.g. | ||
function call. | ||
""" | ||
visitor = _DataIdMatchTreeVisitor(dataId) | ||
result = self.tree.visit(visitor) | ||
if not isinstance(result, bool): | ||
raise TypeError(f"Expression '{self.expression}' returned non-boolean object {type(result)}") | ||
return result |
Oops, something went wrong.