DM-39939: use query-results' grouping when processing iterables of DatasetRefs #863
Codecov Report
Patch coverage:
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #863      +/-   ##
==========================================
+ Coverage   87.90%   87.92%   +0.01%
==========================================
  Files         273      270       -3
  Lines       35764    35697      -67
  Branches     7474     7478       +4
==========================================
- Hits        31440    31388      -52
+ Misses       3166     3150      -16
- Partials     1158     1159       +1
Force-pushed from 45e7f53 to 8213990.
Looks good, one suggestion to make it a bit more type-checkable.
    if hasattr(refs, "_iter_by_dataset_type"):
        return refs._iter_by_dataset_type()
I'm not super-happy with this dynamic approach, particularly because our favorite mypy cannot type-check this. Would it be possible to add an abstraction (or maybe `Protocol` with runtime check) so that `isinstance` can be used instead?
I've switched to a runtime-checkable protocol, but I've given it a leading underscore (and documented it as "package private") since I don't want external code to start using this.
    refs : `~collections.abc.Iterable` [ `DatasetRef` ]
        `DatasetRef` instances to group. If this has a
        ``_iter_by_dataset_type`` method, it will be called with no
        arguments and the result reutrnd.
-         arguments and the result reutrnd.
+         arguments and the result returned.
tests/test_progress.py
Outdated
        self.assertEqual(MockProgressBar.last.total, 2)

    def test_iter_item_chunks_not_sized(self):
        """Test using `Progress.iter_item_chunks` with an unsized iterable
-         """Test using `Progress.iter_item_chunks` with an unsized iterable
+         """Test using `Progress.iter_item_chunks` with an unsized iterable of
Force-pushed from 8213990 to 8a9e8c4.
When processing all dataset types in a collection together, this can represent a huge decrease in memory usage, by querying for and then processing only one dataset type at a time.
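The memory argument in that commit message can be illustrated with a small generator sketch. This is an assumption-laden illustration rather than the PR's implementation: `iter_one_type_at_a_time` and `fake_query` are hypothetical names, and plain strings stand in for dataset types and `DatasetRef` objects.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable, Iterator


def iter_one_type_at_a_time(
    dataset_types: Iterable[str],
    query: Callable[[str], list[str]],
) -> Iterator[tuple[str, list[str]]]:
    """Yield (dataset type, refs) pairs, running one query per dataset type."""
    for dataset_type in dataset_types:
        # The query runs only when the consumer asks for the next group, so
        # this generator never holds more than one dataset type's results.
        yield dataset_type, query(dataset_type)


calls: list[str] = []


def fake_query(dataset_type: str) -> list[str]:
    """Hypothetical query that records which dataset types were requested."""
    calls.append(dataset_type)
    return [f"{dataset_type}-{i}" for i in range(2)]
```

Iterating `iter_one_type_at_a_time(["raw", "calexp"], fake_query)` calls `fake_query` once per step, so a caller that finishes with each group before requesting the next keeps only one dataset type's results alive at a time.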
Force-pushed from 8a9e8c4 to a3ec38a.
Checklist
doc/changes