Skip to content

Conversation

dorien-er
Copy link
Contributor

No description provided.

@dorien-er dorien-er requested a review from rcannood September 4, 2025 09:21
@rcannood rcannood requested a review from Copilot September 4, 2025 14:10
Copy link

@rcannood rcannood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for compressed output bundles by enabling conversion components to accept ZIP archives as input for spatial data formats (CosMx, Xenium, and Aviti).

Key changes:

  • Added utility functions to extract files from ZIP archives with glob pattern support
  • Updated conversion components to detect and handle ZIP inputs automatically
  • Added comprehensive test coverage for compressed input functionality

Reviewed Changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/utils/unzip_archived_folder.py New Python utility for extracting ZIP archives with pattern matching
src/utils/unzip_archived_folder.R New R utility for extracting ZIP archives with glob pattern support
src/convert/*/script.py Updated Python scripts to handle ZIP inputs using new utilities
src/convert/*/script.R Updated R scripts to handle ZIP inputs using new utilities
src/convert/*/test.py Added test cases for compressed input validation
src/convert/*/test.R Added test cases for compressed input validation
src/convert/*/config.vsh.yaml Added utility dependencies and test setup requirements
CHANGELOG.md Documentation of new compressed input functionality

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Extracts a ZIP archive to a temporary directory and returns the path to the extracted folder.
Args:
zip_path (Union[str, Path]): Path to the ZIP archive.
Copy link

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter name in docstring is 'zip_path' but the actual parameter is 'archived_folder'. The docstring should be updated to match the parameter name.

Suggested change
zip_path (Union[str, Path]): Path to the ZIP archive.
archived_folder (Union[str, Path]): Path to the ZIP archive.

Copilot uses AI. Check for mistakes.

extracted_path (Union[str, Path]): Path to the extracted folder inside the temporary directory.
"""

temp_dir = Path(tempfile.TemporaryDirectory().name)
Copy link

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using TemporaryDirectory().name creates a path to a directory that doesn't exist and won't be automatically cleaned up. Use tempfile.mkdtemp() instead or properly manage the TemporaryDirectory context.

Copilot uses AI. Check for mistakes.

Path: Path to the extraction directory.
"""

temp_dir = Path(tempfile.TemporaryDirectory().name)
Copy link

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using TemporaryDirectory().name creates a path to a directory that doesn't exist and won't be automatically cleaned up. Use tempfile.mkdtemp() instead or properly manage the TemporaryDirectory context.

Copilot uses AI. Check for mistakes.

Args:
zip_path (Union[str, Path]): Path to the ZIP archive.
members (list[str]): List of file paths within the archive to extract.
Copy link

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type annotation in docstring shows 'list[str]' but the actual parameter accepts 'list[Union[str, Path]]'. The docstring should reflect the correct type.

Suggested change
members (list[str]): List of file paths within the archive to extract.
members (list[Union[str, Path]]): List of file paths within the archive to extract.

Copilot uses AI. Check for mistakes.

Comment on lines +34 to +44
required_file_patterns = [
"**/experiment.xenium",
"**/nucleus_boundaries.parquet",
"**/cell_boundaries.parquet",
"**/transcripts.parquet",
"**/cell_feature_matrix.h5",
"**/cells.parquet",
"**/morphology_mip.ome.tif",
"**/morphology_focus.ome.tif",
]
xenium_output_bundle = unzip_archived_folder(par["input"])
Copy link

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The required_file_patterns list is defined but never used. It should either be passed to the unzip_archived_folder function or removed if not needed.

Suggested change
required_file_patterns = [
"**/experiment.xenium",
"**/nucleus_boundaries.parquet",
"**/cell_boundaries.parquet",
"**/transcripts.parquet",
"**/cell_feature_matrix.h5",
"**/cells.parquet",
"**/morphology_mip.ome.tif",
"**/morphology_focus.ome.tif",
]
xenium_output_bundle = unzip_archived_folder(par["input"])
xenium_output_bundle = unzip_archived_folder(par["input"], required_file_patterns)

Copilot uses AI. Check for mistakes.

Comment on lines +72 to +73
if __name__ == "__main__":
main()
Copy link

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script now defines a main() function but the original code wasn't wrapped in it. The existing code outside main() should be moved inside the main() function for consistency.

Copilot uses AI. Check for mistakes.

@dorien-er dorien-er merged commit 096168e into main Sep 5, 2025
8 checks passed
@dorien-er dorien-er deleted the allow-compressed-output-bundles branch September 5, 2025 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants