-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-40392: Add methods to update run collection name in a quantum #882
Conversation
The ruff problem comes from ruff 0.285 with astral-sh/ruff#6465 |
@@ -706,11 +706,48 @@ def overrideStorageClass(self, storageClass: str | StorageClass) -> DatasetRef: | |||
A new dataset reference that is the same as the current one but | |||
with a different storage class in the `DatasetType`. | |||
""" | |||
return self.update(storage_class=storageClass) | |||
|
|||
def update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe replace()
to match the name used in dataclasses
? update
doesn't immediately suggest that a new version is being returned. I called the equivalent ResourcePath.replace
for that reason.
ccf6e2c
to
763189d
Compare
@@ -190,7 +190,7 @@ def direct( | |||
# This method requires tuples as values of the mapping, but JSON | |||
# readers will read things in as lists. Be kind and transparently | |||
# transform to tuples | |||
_recItems = {k: v if type(v) != list else tuple(v) for k, v in record.items()} # type: ignore | |||
_recItems = {k: v if not isinstance(v, list) else tuple(v) for k, v in record.items()} # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we deliberately used type()
here for speed since we know it can only be a list coming from JSON and we need this to be fast. We don't want to worry about subclasses of list here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest ruff was complaining about it, should I just suppress this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And why don't we always use tuple(v)
? tuple(v)
will return v
if it's already a tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. That might be a better answer. Let the tuple() constructor deal with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, not all items are supposed to be tuples there, sorry for the noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, we only want lists to migrate. I think adding a noqa will work but you'll also have to adjust pyproject.toml to stop ruff complaining about unused noqa -- the rubin-env ruff used in Jenkins will not trigger an error and will instead get upset by the unused noqa.
763189d
to
a99425a
Compare
7be0a4c
to
4cf9cb8
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #882 +/- ##
=======================================
Coverage 87.70% 87.71%
=======================================
Files 274 274
Lines 36195 36219 +24
Branches 7575 7578 +3
=======================================
+ Hits 31746 31770 +24
Misses 3272 3272
Partials 1177 1177
☔ View full report in Codecov by Sentry. |
4cf9cb8
to
6d29504
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add trivial test for update_output_run and update_input_run.
If these two methods have to be called in exactly the right way and solely from QuantumGraph.updateRun
maybe they should be private methods in the sense that middleware sometimes knows it can call private methods in different packages but we don't want them public? Else it begs the question why isn't there a quantum.update_run
that combines these two methods?
Returns | ||
------- | ||
modified : `DatasetRef` | ||
A new dataset reference with updated attributes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all 3 parameters are None do you want to return self
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could but I do not want an additional test for all None
, I don't think it's going to used with no parameters.
refs: Iterable[DatasetRef], run: str, dataset_id_map: Mapping[DatasetId, DatasetId] | ||
) -> Iterator[DatasetRef]: | ||
for ref in refs: | ||
if dataset_id := dataset_id_map.get(ref.id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment that this is for intermediates only and if it's not already in dataset_id_map the ref won't be modified?
I agree that those two new methods in |
89362da
to
a900423
Compare
I moved that private code to |
a900423
to
4424314
Compare
QuantumGraph.updateRun() method needs a better implementation and these new methods in DatasetRef and Quantum classes are for supporting that implementation.
4424314
to
cda19fe
Compare
QuantumGraph.updateRun() method needs a better implementation and these new methods in DatasetRef and Quantum classes are for supporting that implementation.
Checklist
doc/changes