Allow compressed output bundles #19
LGTM!
Pull Request Overview
This PR adds support for compressed output bundles by enabling conversion components to accept ZIP archives as input for spatial data formats (CosMx, Xenium, and Aviti).
Key changes:
- Added utility functions to extract files from ZIP archives with glob pattern support
- Updated conversion components to detect and handle ZIP inputs automatically
- Added comprehensive test coverage for compressed input functionality
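The key changes above can be illustrated with a minimal sketch of ZIP detection plus glob-filtered extraction. This is not the code from `src/utils/unzip_archived_folder.py`; the function name `extract_matching` and its signature are hypothetical, and it assumes `fnmatch`-style patterns, where `*` also matches `/`.

```python
import fnmatch
import tempfile
import zipfile
from pathlib import Path


def extract_matching(archive, patterns):
    """Extract members matching any glob pattern; return the extraction dir.

    Hypothetical helper for illustration, not the PR's implementation.
    """
    archive = Path(archive)
    # Detect ZIP input up front so callers can fall back to plain folders.
    if not zipfile.is_zipfile(archive):
        raise ValueError(f"{archive} is not a ZIP archive")
    target = Path(tempfile.mkdtemp())
    with zipfile.ZipFile(archive) as zf:
        members = [
            name
            for name in zf.namelist()
            if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
        ]
        zf.extractall(target, members=members)
    return target
```

Because `fnmatch` treats `**/name` as "anything, a slash, then name", such a pattern would not match a file stored at the archive root; a real implementation may want to handle that case explicitly.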
Reviewed Changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.
File | Description |
---|---|
src/utils/unzip_archived_folder.py | New Python utility for extracting ZIP archives with pattern matching |
src/utils/unzip_archived_folder.R | New R utility for extracting ZIP archives with glob pattern support |
src/convert/*/script.py | Updated Python scripts to handle ZIP inputs using new utilities |
src/convert/*/script.R | Updated R scripts to handle ZIP inputs using new utilities |
src/convert/*/test.py | Added test cases for compressed input validation |
src/convert/*/test.R | Added test cases for compressed input validation |
src/convert/*/config.vsh.yaml | Added utility dependencies and test setup requirements |
CHANGELOG.md | Documentation of new compressed input functionality |
Extracts a ZIP archive to a temporary directory and returns the path to the extracted folder.
Args:
    zip_path (Union[str, Path]): Path to the ZIP archive.
Copilot AI, Sep 4, 2025
Parameter name in docstring is 'zip_path' but the actual parameter is 'archived_folder'. The docstring should be updated to match the parameter name.
Suggested change:
- zip_path (Union[str, Path]): Path to the ZIP archive.
+ archived_folder (Union[str, Path]): Path to the ZIP archive.
Copilot uses AI. Check for mistakes.
    extracted_path (Union[str, Path]): Path to the extracted folder inside the temporary directory.
"""

temp_dir = Path(tempfile.TemporaryDirectory().name)
Copilot AI, Sep 4, 2025
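Path(tempfile.TemporaryDirectory().name) keeps only the path and discards the TemporaryDirectory object, so the directory is removed again as soon as that object is finalized (immediately in CPython). The resulting path points to a directory that no longer exists, and anything created there later will not be cleaned up automatically. Use tempfile.mkdtemp() instead, or manage the TemporaryDirectory as a context manager.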
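A minimal sketch of the `tempfile.mkdtemp()` option mentioned in this comment. The helper name `make_extraction_dir` and the `unzipped_` prefix are hypothetical; the point is only that the directory exists on return and stays until the caller removes it.

```python
import shutil
import tempfile
from pathlib import Path


def make_extraction_dir():
    """Create a temp dir that persists until the caller removes it."""
    # mkdtemp() creates the directory on disk and never deletes it on its own.
    return Path(tempfile.mkdtemp(prefix="unzipped_"))


extraction_dir = make_extraction_dir()
try:
    (extraction_dir / "marker.txt").write_text("extracted files would go here")
finally:
    shutil.rmtree(extraction_dir)  # cleanup is the caller's responsibility
```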
    Path: Path to the extraction directory.
"""

temp_dir = Path(tempfile.TemporaryDirectory().name)
Copilot AI, Sep 4, 2025
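Path(tempfile.TemporaryDirectory().name) keeps only the path and discards the TemporaryDirectory object, so the directory is removed again as soon as that object is finalized (immediately in CPython). The resulting path points to a directory that no longer exists, and anything created there later will not be cleaned up automatically. Use tempfile.mkdtemp() instead, or manage the TemporaryDirectory as a context manager.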
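The other option named in this comment, managing the `TemporaryDirectory` context, might look like the sketch below. The function `list_extracted_files` is a hypothetical example; it assumes all work on the extracted files can finish before the `with` block exits, since the directory is deleted at that point.

```python
import tempfile
import zipfile
from pathlib import Path


def list_extracted_files(archive):
    """Extract into a managed temp dir and return the relative file paths."""
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(tmp)
        root = Path(tmp)
        # Build the result while the directory still exists; it is removed
        # automatically when the with-block exits.
        return sorted(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )
```

This form suits code that consumes the extracted files in place; a function that must return the extracted path to its caller would need `mkdtemp()` or would have to hand the context manager itself back.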
Args:
    zip_path (Union[str, Path]): Path to the ZIP archive.
    members (list[str]): List of file paths within the archive to extract.
Copilot AI, Sep 4, 2025
Type annotation in docstring shows 'list[str]' but the actual parameter accepts 'list[Union[str, Path]]'. The docstring should reflect the correct type.
Suggested change:
- members (list[str]): List of file paths within the archive to extract.
+ members (list[Union[str, Path]]): List of file paths within the archive to extract.
required_file_patterns = [
    "**/experiment.xenium",
    "**/nucleus_boundaries.parquet",
    "**/cell_boundaries.parquet",
    "**/transcripts.parquet",
    "**/cell_feature_matrix.h5",
    "**/cells.parquet",
    "**/morphology_mip.ome.tif",
    "**/morphology_focus.ome.tif",
]
xenium_output_bundle = unzip_archived_folder(par["input"])
Copilot AI, Sep 4, 2025
The required_file_patterns list is defined but never used. It should either be passed to the unzip_archived_folder function or removed if not needed.
Suggested change:
  required_file_patterns = [
      "**/experiment.xenium",
      "**/nucleus_boundaries.parquet",
      "**/cell_boundaries.parquet",
      "**/transcripts.parquet",
      "**/cell_feature_matrix.h5",
      "**/cells.parquet",
      "**/morphology_mip.ome.tif",
      "**/morphology_focus.ome.tif",
  ]
- xenium_output_bundle = unzip_archived_folder(par["input"])
+ xenium_output_bundle = unzip_archived_folder(par["input"], required_file_patterns)
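One way such a pattern list could be put to use is a pre-extraction check that every required pattern matches at least one archive member. The sketch below is an assumption about intent, not the behavior of `unzip_archived_folder`; the helper `missing_patterns` and the sample member names are hypothetical.

```python
import fnmatch


def missing_patterns(member_names, patterns):
    """Return the patterns that match none of the given archive member names."""
    return [
        pattern
        for pattern in patterns
        if not any(fnmatch.fnmatchcase(name, pattern) for name in member_names)
    ]


# Hypothetical archive listing, e.g. from zipfile.ZipFile(...).namelist().
names = ["run1/experiment.xenium", "run1/cells.parquet"]
required = ["**/experiment.xenium", "**/transcripts.parquet"]
print(missing_patterns(names, required))  # ['**/transcripts.parquet']
```

Reporting all missing patterns at once, rather than failing on the first, gives a clearer error for an incomplete bundle.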
if __name__ == "__main__":
    main()
Copilot AI, Sep 4, 2025
The script now defines a main() function but the original code wasn't wrapped in it. The existing code outside main() should be moved inside the main() function for consistency.
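The layout this comment asks for can be sketched as follows. The `par` dictionary here is a hypothetical stand-in for the component's parameters, and the body of `main()` is a placeholder rather than the script's actual conversion logic.

```python
def main():
    # Hypothetical stand-in for the component's parameter dictionary.
    par = {"input": "bundle.zip", "output": "converted"}
    # All script logic lives here, so importing the module has no side effects.
    print(f"Converting {par['input']} into {par['output']}")
    return 0


if __name__ == "__main__":
    main()
```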
No description provided.