Macauff Import Pipeline #185
self.resume_plan = ResumePlan(
    input_paths=self.input_paths,
    tmp_path=self.tmp_path,
    progress_bar=True,
Suggested change:
- progress_bar=True,
+ progress_bar=self.progress_bar,
Lets you skip generating a progress bar in unit tests.
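A minimal sketch of what this suggestion amounts to, using hypothetical stand-in classes (`ResumePlanSketch` and `ArgumentsSketch` are not the real classes from this PR, and the real constructor takes more arguments). The only point is that the flag is forwarded from the arguments object rather than hard-coded:

```python
from dataclasses import dataclass


@dataclass
class ResumePlanSketch:
    """Hypothetical stand-in for ResumePlan; only progress_bar matters here."""

    tmp_path: str
    progress_bar: bool = True


@dataclass
class ArgumentsSketch:
    """Hypothetical stand-in for the pipeline arguments class."""

    tmp_path: str
    progress_bar: bool = True

    def make_resume_plan(self) -> ResumePlanSketch:
        # Forward the caller's choice instead of hard-coding True.
        return ResumePlanSketch(
            tmp_path=self.tmp_path,
            progress_bar=self.progress_bar,
        )


# A unit test can then switch the bar off:
plan = ArgumentsSketch(tmp_path="/tmp/macauff", progress_bar=False).make_resume_plan()
```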
"""instance of input reader that specifies arguments necessary for reading
from your input files"""

constant_healpix_order: int = 10
Why do we need any new healpix orders? We should be using the existing partitioning from the left catalog.
We have a pile of rows in a CSV (or set of CSVs) that come out of the macauff package and need to be matched against the partitioning of the left catalog.
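For illustration only, here is one way rows binned at a constant order could be matched to existing partitions. It assumes the nested HEALPix numbering scheme, where the parent of a pixel one order coarser is the pixel index shifted right by two bits. The function name and the `(order, pixel)` tuple layout are hypothetical, not taken from this PR:

```python
def find_left_partition(pixel, order, left_pixels):
    """Return the (order, pixel) partition in left_pixels that contains the
    given nested-scheme pixel, or None if no partition covers it."""
    for left_order in range(order, -1, -1):
        # Each order splits a pixel into 4 children, so the ancestor at a
        # coarser order is a right shift by 2 bits per level.
        ancestor = pixel >> (2 * (order - left_order))
        if (left_order, ancestor) in left_pixels:
            return (left_order, ancestor)
    return None


# Hypothetical left-catalog partitioning: one order-0 pixel, four order-1 pixels.
left_pixels = {(0, 4), (1, 44), (1, 45), (1, 46), (1, 47)}

in_order0 = find_left_partition((4 << 20) + 123, 10, left_pixels)
in_order1 = find_left_partition(45 << 18, 10, left_pixels)
uncovered = find_left_partition(5 << 20, 10, left_pixels)
```

With a lookup like this, the constant order is only an intermediate binning step; the output partitions are still the ones the left catalog already has.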
left_pixels = left_catalog.partition_info.get_healpix_pixels()

# assign a constant healpix order value for the data
_map_pixels(args, client)
Why?
dec_column=args.left_dec_column,
id_column=args.left_id_column,
add_hipscat_index=False,
use_schema_file=False,
If we have the schema parquet file, it would be good to pass it in here. But I'm not sure that the catalog's reduce_pixel_shards method is the best call here.
alignment = align_trees(
    left=left_pixel_tree,
    right=cross_match_pixel_tree,
Where are we adding in the join_Norder / join_Npix columns to associate with the right catalog?
Good point, that needs to be added as well.
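A rough sketch of what adding those columns could look like, not the implementation in this PR. It assumes each match row already carries a constant-order nested-scheme pixel for its left and right positions. The `left_pixel` / `right_pixel` keys, the `Norder` / `Npix` output names, and both helper functions are assumptions made for illustration:

```python
def find_partition(pixel, order, partitions):
    """Return the (order, pixel) partition containing a nested-scheme pixel."""
    for coarse_order in range(order, -1, -1):
        ancestor = pixel >> (2 * (order - coarse_order))
        if (coarse_order, ancestor) in partitions:
            return (coarse_order, ancestor)
    return None


def add_join_columns(rows, order, left_partitions, right_partitions):
    """Return copies of rows with Norder/Npix (left catalog partition) and
    join_Norder/join_Npix (right catalog partition) attached."""
    out = []
    for row in rows:
        left = find_partition(row["left_pixel"], order, left_partitions)
        right = find_partition(row["right_pixel"], order, right_partitions)
        if left is None or right is None:
            continue  # outside one of the catalogs; skipped in this sketch
        out.append(
            {
                **row,
                "Norder": left[0],
                "Npix": left[1],
                "join_Norder": right[0],
                "join_Npix": right[1],
            }
        )
    return out


# Hypothetical data: one match whose right side lands in order-1 pixel 17.
rows = [{"left_id": 1, "right_id": 9,
         "left_pixel": (4 << 20) + 5, "right_pixel": (17 << 18) + 9}]
joined = add_join_columns(rows, 10, {(0, 4)}, {(1, 16), (1, 17)})
```

The left partition drives where the row is written, while the join columns record which right-catalog partition holds the matched object.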
Change Description
#147
Solution Description
Copartitions the macauff output based on the left catalog.
NOTE: I think we could generalize this function a lot and make it so that we just copartition based on a given catalog and make it an association table.
New Feature Checklist