Fix test data for removing partition_info #166
Merged
Change Description
This is related to work toward deprecating partition_info.csv (astronomy-commons/hipscat#147).

Solution Description
The intermediate file in this test data is quite old and didn't have the Norder/Dir/Npix columns inside the parquet file. The pipeline could successfully resume from the previous work, but it generated a _metadata file that didn't have row group statistics for those three columns. Then, when we try to read the directory as a hipscat catalog, we try to create the PartitionInfo from the row group statistics, but fail to do so.

This PR re-generates the intermediate file, using the current map/reduce methods of the import tool to include the partition info columns as necessary.
And this is kind of impossible to review, since the only change is to a binary file, so that's why there's such a long explanation =]
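
For reviewers who want a picture of the failure mode, here is a rough sketch in plain Python. It is a simplified model, not the hipscat implementation: the dictionaries stand in for per-row-group column statistics as they might be recorded in _metadata, and the function name partitions_from_statistics, the sample values, and the assumption that each row group covers exactly one pixel (min equals max) are all hypothetical.

```python
# Simplified model only: each dict stands in for one row group's column
# statistics, mapping column name -> (min, max). This is NOT the hipscat API.
PARTITION_COLUMNS = ("Norder", "Dir", "Npix")


def partitions_from_statistics(row_groups):
    """Return sorted unique (Norder, Npix) pairs derived from row group statistics.

    Raises ValueError if a row group has no statistics for a partition column,
    or if min and max differ (assumed here to mean the row group spans
    more than one partition).
    """
    partitions = set()
    for index, stats in enumerate(row_groups):
        missing = [col for col in PARTITION_COLUMNS if col not in stats]
        if missing:
            raise ValueError(f"row group {index} has no statistics for {missing}")
        for col in PARTITION_COLUMNS:
            low, high = stats[col]
            if low != high:
                raise ValueError(f"row group {index} spans several values of {col}")
        partitions.add((stats["Norder"][0], stats["Npix"][0]))
    return sorted(partitions)


# Hypothetical statistics resembling the old intermediate file:
# the partition columns were never written, so they have no statistics.
old_row_groups = [{"ra": (10.0, 20.0), "dec": (-5.0, 5.0)}]

# Hypothetical statistics after re-generating the intermediate file.
new_row_groups = [
    {"ra": (10.0, 20.0), "Norder": (0, 0), "Dir": (0, 0), "Npix": (11, 11)},
    {"ra": (30.0, 40.0), "Norder": (1, 1), "Dir": (0, 0), "Npix": (44, 44)},
]

if __name__ == "__main__":
    print(partitions_from_statistics(new_row_groups))
    try:
        partitions_from_statistics(old_row_groups)
    except ValueError as error:
        print(f"old data fails: {error}")
```

With the old-style statistics the lookup raises, which mirrors the read failure described above; with the partition columns present, the pixel list can be recovered without a separate partition_info.csv.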