
Commit

Clean up and document
eigerx committed Oct 15, 2024
1 parent 735294a commit 7a69154
Showing 2 changed files with 26 additions and 4 deletions.
6 changes: 6 additions & 0 deletions doc/changes/DM-41605.feature.md
@@ -0,0 +1,6 @@
+Aggregate multiple `pipetask report` outputs into one holistic `Summary`.
+
+While the `QuantumProvenanceGraph` was designed to resolve processing over
+dataquery-identified groups, `pipetask aggregate-reports` is designed to
+combine multiple group-level reports into one that totals the successes,
+issues, and failures over the same section of the pipeline.
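The changelog describes totaling successes, issues, and failures per task across group-level reports. A minimal sketch of that idea, using plain dictionaries and `collections.Counter` rather than the actual `QuantumProvenanceGraph.Summary` model (the task name and count fields here are illustrative assumptions, not the real schema):

```python
from collections import Counter

# Hypothetical group-level reports: per-task counts of successes,
# issues, and failures (field names are illustrative only).
group1 = {"isr": Counter(successes=10, issues=1, failures=0)}
group2 = {"isr": Counter(successes=8, issues=0, failures=2)}

def aggregate(groups):
    """Total the counts for each task across all group reports."""
    totals = {}
    for report in groups:
        for task, counts in report.items():
            totals.setdefault(task, Counter()).update(counts)
    return totals

combined = aggregate([group1, group2])
print(combined["isr"])  # totals: successes=18, issues=1, failures=2
```

The real command merges much richer structures (error messages, data IDs, cursed/wonky flags), but the core operation is this per-task summation.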
24 changes: 20 additions & 4 deletions python/lsst/ctrl/mpexec/cli/script/report.py
@@ -198,16 +198,32 @@ def report_v2(
def aggregate_reports(
    filenames: Iterable[str], full_output_filename: str | None, brief: bool = False
) -> None:
-    """Docstring.
-    open a bunch of json files, call model_validate_json, call aggregrate,
-    print summary
+    """Aggregate multiple `QuantumProvenanceGraph` summaries on separate
+    dataquery-identified groups into one holistic report. This is intended
+    for reports over the same tasks in the same pipeline, after `pipetask
+    report` has been resolved over all graphs associated with each group.
+
+    Parameters
+    ----------
+    filenames : `Iterable[str]`
+        The paths to the JSON files produced by `pipetask report` (note:
+        this is only compatible with the multi-graph or `--force-v2`
+        option). These files correspond to the
+        `QuantumProvenanceGraph.Summary` objects which are produced for
+        each group.
+    full_output_filename : `str | None`
+        The name of the JSON file in which to store the aggregate report,
+        if passed. This is passed to `print_summary` at the end of this
+        function.
+    brief : `bool`, optional
+        Only display a short (counts-only) summary on stdout. This
+        includes counts but not error messages or data IDs (similar to a
+        BPS report). This option will still report all `cursed` datasets
+        and `wonky` quanta. This is passed to `print_summary` at the end
+        of this function.
    """
    summaries: list[Summary] = []
    for filename in filenames:
        with open(filename) as f:
            model = Summary.model_validate_json(f.read())
-            summaries.append(model)
+            summaries.extend([model])
    result = Summary.aggregate(summaries)
    print_summary(result, full_output_filename, brief)
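The function body follows a read-validate-aggregate pattern. A self-contained sketch of the same shape, substituting `json.loads` for pydantic's `model_validate_json` and a simple dict-merge for `Summary.aggregate` (the file names and report structure below are invented for illustration):

```python
import json
import tempfile
from pathlib import Path

def aggregate_report_files(filenames):
    """Read each JSON report, parse it, and merge the counts."""
    summaries = []
    for filename in filenames:
        with open(filename) as f:
            # Stand-in for Summary.model_validate_json(f.read())
            summaries.append(json.loads(f.read()))
    # Stand-in for Summary.aggregate(summaries): sum matching keys.
    result = {}
    for summary in summaries:
        for key, value in summary.items():
            result[key] = result.get(key, 0) + value
    return result

# Write two tiny group-level reports to a temp dir, then aggregate them.
tmp = Path(tempfile.mkdtemp())
(tmp / "group1.json").write_text(json.dumps({"successes": 5, "failures": 1}))
(tmp / "group2.json").write_text(json.dumps({"successes": 7, "failures": 0}))
combined = aggregate_report_files([tmp / "group1.json", tmp / "group2.json"])
print(combined)  # {'successes': 12, 'failures': 1}
```

Validating with the real pydantic model (rather than raw `json.loads`) is what lets the actual command reject malformed or incompatible report files before aggregation.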
