-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-45431: Update parquet formatter to use can_accept() #1059
Conversation
eb80dad
to
6915fe7
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1059 +/- ##
==========================================
+ Coverage 89.53% 89.55% +0.01%
==========================================
Files 361 357 -4
Lines 46200 46173 -27
Branches 9469 9486 +17
==========================================
- Hits 41366 41348 -18
+ Misses 3499 3488 -11
- Partials 1335 1337 +2 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. I assume the tests demonstrate that the coercion is bypassed correctly and accuracy is retained?
if not delegate or not delegate.can_accept(inMemoryDataset): | ||
inMemoryDataset = ref.datasetType.storageClass.coerce_type(inMemoryDataset) | ||
|
||
if not self.constraints.isAcceptable(ref): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a bit more efficient to do this check before the coerce step.
|
||
import pyarrow as pa | ||
from lsst.daf.butler import StorageClassDelegate | ||
from lsst.utils.introspection import get_full_type_name | ||
from lsst.utils.iteration import ensure_iterable | ||
|
||
__all__ = ["ArrowTableDelegate"] | ||
if TYPE_CHECKING: | ||
import pandas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We disable warnings from pandas bad imports in the mypy config so I think this will not be used in the mypy github action check because pandas isn't listed in types.txt.
@@ -2212,5 +2162,81 @@ def testRowGroupSizeDataFrameWithLists(self): | |||
self.assertGreater(row_group_size, 1_000_000) | |||
|
|||
|
|||
def _checkAstropyTableEquality(table1, table2, skip_units=False, has_bigendian=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I wonder if this should replace assertAstropyTablesEqual
in lsst.daf.butler.tests.utils.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't noticed this! Is it worth looking into merging these now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure your checks are more complete and so we may as well sync up the utility code. Not required now though.
e963406
to
b02c95b
Compare
The primary use case is storing astropy tables with units into existing storage classes that are declared as DataFrames. With this update astropy tables can be fully preserved and the tests focus on this.
b02c95b
to
54a7d55
Compare
This PR additionally adds
can_accept()
support to delegates and the InMemoryDatastore.Checklist
doc/changes
configs/old_dimensions