Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-41962: Use storage classes from QG in PreExecInit. #276

Merged
merged 4 commits into from
Dec 2, 2023

Conversation

TallJimbo
Copy link
Member

@TallJimbo TallJimbo commented Dec 1, 2023

PipelineDatasetTypes does not preserve storage classes (that's why it's being deprecated).

Checklist

  • ran Jenkins (failed due to something unrelated; main is broken in at least two ways right now, and this ticket fixes one of them)
  • added a release note for user-visible changes to doc/changes

PipelineDatasetTypes does preserve storage classes (that's why
it's being deprecated).
First problem was just that the test assumed it could register a
dataset type after the QG was made and not run into trouble when
running that QG.

Second problem was that it was just expecting something other than what
its own code comments suggested, which was also contrary to correct
behavior.
Copy link

codecov bot commented Dec 1, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (643a063) 87.15% compared to head (2e33486) 87.05%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #276      +/-   ##
==========================================
- Coverage   87.15%   87.05%   -0.11%     
==========================================
  Files          49       49              
  Lines        4429     4433       +4     
  Branches      764      766       +2     
==========================================
- Hits         3860     3859       -1     
- Misses        413      421       +8     
+ Partials      156      153       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@timj timj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have questions before I can review ☹️

tests/test_simple_pipeline_executor.py Outdated Show resolved Hide resolved
tests/test_simple_pipeline_executor.py Show resolved Hide resolved
@@ -243,14 +237,21 @@ def test_from_pipeline_output_differ(self):
)
)

executor = self._configure_pipeline(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to move until after the dataset type has been registered? If that is the case shouldn't the other tests have this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. They probably don't matter, but they're at least unsafe. Will do.

# b returns a dict and that is converted to TaskMetadata on put.
self._test_logs(cm.output, "dict", "lsst.pipe.base.TaskMetadata", "dict", "dict")

self.assertEqual(len(quanta), 2)
self.assertEqual(self.butler.get("intermediate"), {"zero": 0, "one": 1})
self.assertEqual(self.butler.get("intermediate").to_dict(), {"zero": 0, "one": 1})
self.assertEqual(self.butler.get("output").to_dict(), {"zero": 0, "one": 1, "two": 2})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So "output" is a TaskMetadata? How is that consistent with "b returned a dict" test above and the default connections being "dict"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the perspective of b, it returned a dict, and declared that it would return a dict, but since the dataset type was registered with the repo as TaskMetadataLike it was converted to TaskMetadataLike on put.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. It should likely use the more explicit TaskMetadata.from_dict() in the test to make it obvious.

This includes:

 - Using TaskMetadata.from_dict instead of to_dict in comparisons to
   make the intent clearer.

 - Documenting what the log-inspection utility method actually tests.

 - Making one test (test_from_pipeline_intermediates_differ)
   registering a dataset type before building a QG because that's
   what's needed in general for correctness (even though it didn't
   matter here).

 - Renaming and re-documenting another test
   (test_from_pipeline_inconsistent_dataset_types) where we were
   actually testing that building the QG and then changing a
   dataset type out from under it can be a problem.
@TallJimbo
Copy link
Member Author

I've added another commit that cleans up the test a bit; ready for another look.

We have a custom QG builders in the wild and they're not as
well-behaved as the main one; guard against them until we can fix them.
@TallJimbo TallJimbo merged commit e072a71 into main Dec 2, 2023
13 of 14 checks passed
@TallJimbo TallJimbo deleted the tickets/DM-41962 branch December 2, 2023 20:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants