Here is a list of tasks that are available within Fractal-compatible packages,
+including both fractal-tasks-core and others.
+
These are the tasks that we are aware of; if you created your own package of
+Fractal tasks, reach out to have it listed here (or, if you want to build your
+own tasks, follow these instructions).
Description: The APx Fractal Task Collection is maintained by Apricot Therapeutics AG, Switzerland. This is a collection of tasks intended to be used in combination with the Fractal Analytics Platform maintained by the BioVisionCenter Zurich (co-founded by the Friedrich Miescher Institute and the University of Zurich). The tasks in this collection are focused on extending Fractal's capabilities of processing 2D image data, with a special focus on multiplexed 2D image data. Most tasks work with 3D image data, but they have not specifically been developed for this scenario.
Update all tasks to use the new Fractal API from Fractal server 2.0 (#671)
+
Provide new dev tooling to create Fractal manifest for new task API (#671)
+
Add Pydantic models for OME-NGFF HCS Plate validation (#671)
+
Breaking changes in core library:
+
In get_acquisition_paths helper function of NgffWellMeta:
+ The dictionary now contains a list of paths as values, not single paths.
+ The NotImplementedError for multiple images with the same acquisition was removed.
+
The utils.get_table_path_dict helper function was made private & changed its input parameters:
+ It's now _get_table_path_dict(zarr_url: str)
(major) Introduce new tasks for registration of multiplexing cycles: calculate_registration_image_based, apply_registration_to_ROI_tables, apply_registration_to_image (#487).
+
(major) Introduce new overwrite argument for tasks create_ome_zarr, create_ome_zarr_multiplex, yokogawa_to_ome_zarr, copy_ome_zarr, maximum_intensity_projection, cellpose_segmentation, napari_workflows_wrapper (#499).
+
(major) Rename illumination_correction parameter from overwrite to overwrite_input (#499).
+
Fix plate-selection bug in copy_ome_zarr task (#513).
+
Fix bug in definition of metadata["plate"] in create_ome_zarr_multiplex task (#513).
+
Introduce new helper functions write_table, prepare_label_group and open_zarr_group_with_overwrite (#499).
Make tasks-related dependencies optional, and installable via fractal-tasks extra (#390).
+
Remove tools package extra (#384), and split the subpackage content into lib_ROI_overlaps and examples (#390).
+
+
+
(major) Modify task arguments
+
Add Pydantic model lib_channels.OmeroChannel (#410, #422);
+
Add Pydantic model tasks._input_models.Channel (#422);
+
Add Pydantic model tasks._input_models.NapariWorkflowsInput (#422);
+
Add Pydantic model tasks._input_models.NapariWorkflowsOutput (#422);
+
Move all Pydantic models to main package (#438).
+
Modify arguments of illumination_correction task (#431);
+
Modify arguments of create_ome_zarr and create_ome_zarr_multiplex (#433).
+
Modify argument default for ROI_table_names, in copy_ome_zarr (#449).
+
Remove the delete option from yokogawa to ome zarr (#443).
+
Reorder task inputs (#451).
+
+
+
JSON Schemas for task arguments:
+
Add JSON Schemas for task arguments in the package manifest (#369, #384).
+
Add JSON Schemas for attributes of custom task-argument Pydantic models (#436).
+
Make schema-generation tools more general, when handling custom Pydantic models (#445).
+
Include titles for custom-model-typed arguments and argument attributes (#447).
+
Remove TaskArguments models and switch to Pydantic V1 validate_arguments (#369).
+
Make coercing&validating task arguments required, rather than optional (#408).
+
Remove default_args from manifest (#379, #393).
+
+
+
Other:
+
Make pydantic dependency required for running tasks, and pin it to V1 (#408).
+
Remove legacy executor definitions from manifest (#361).
+
Add GitHub action for testing pip install with/without fractal-tasks extra (#390).
+
Remove sqlmodel from dev dependencies (#374).
+
Relax constraint on torch version, from ==1.12.1 to <=2.0.0 (#406).
+
Review task docstrings and improve documentation (#413, #416).
+
Update anndata dependency requirements (from ^0.8.0 to >=0.8.0,<=0.9.1), and replace anndata.experimental.write_elem with anndata._io.specs.write_elem (#428).
Disable bugged validation of model_type argument in cellpose_segmentation (#344).
+
Raise an error if the user provides an unexpected argument to a task (#337); this applies to the case of running a task as a script, with a pydantic model for task-argument validation.
(major) Update task interface: remove filename extension from input_paths and output_path for all tasks, and add new arguments (image_extension,image_glob_pattern) to create_ome_zarr task (#323).
+
Implement logic for handling image_glob_patterns argument, both when globbing images and in Yokogawa metadata parsing (#326).
The fractal-tasks-core repository is the reference implementation for Fractal
+tasks and for Fractal task packages, but the Fractal platform can also be
+used to execute custom tasks.
⚠️ OBSOLETE: How to write a Fractal-compatible custom task
+
+
⚠️⚠️ These instructions are here just as a reference, but they refer to legacy
+versions of fractal-server. While the overall structure of the instructions
+is still valid, several details are now obsolete and won't work. ⚠️⚠️
+
+
Each task must be associated to some metadata, so that it can be used in
+Fractal. The full specification is
+here,
+and the required attributes are:
+
+
name: the task name, e.g. "Create OME-Zarr structure";
+
command: a command that can be executed from the command line;
+
input_type: this can be any string (typical examples: "image" or "zarr");
+ the special value "Any" means that Fractal won't perform any check of the
+ input_type when applying the task to a dataset.
+
output_type: same logic as input_type.
+
source: this is meant to be as close as possible to a unique task identifier;
+ for custom tasks, it can be anything (e.g. "my_task"), but for tasks that
+ are collected automatically from a package (see Task package) this
+ attribute will have a very specific form (e.g.
+ "pip_remote:fractal_tasks_core:0.10.0:fractal-tasks::convert_yokogawa_to_ome-zarr").
+
meta: a JSON object (similar to a Python dictionary) with some additional
+ information, see Task meta-parameters.
+
+
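As a rough illustration of the attributes listed above, the sketch below builds one task-metadata entry as a Python dictionary and checks that the required keys are present. The `REQUIRED_ATTRIBUTES` tuple and the `missing_attributes` helper are hypothetical names introduced only for this example (they are not part of Fractal), and the `command` value is a placeholder.

```python
# Hypothetical helper, for illustration only: it is not part of
# fractal-server or fractal-tasks-core.
REQUIRED_ATTRIBUTES = (
    "name",
    "command",
    "input_type",
    "output_type",
    "source",
    "meta",
)


def missing_attributes(task: dict) -> list[str]:
    """Return the required attributes that are absent from `task`."""
    return [key for key in REQUIRED_ATTRIBUTES if key not in task]


my_task = {
    "name": "Create OME-Zarr structure",
    "command": "python /some/path/my_task.py",  # placeholder command
    "input_type": "image",
    "output_type": "zarr",
    "source": "my_task",  # free-form for custom tasks
    "meta": {},
}

print(missing_attributes(my_task))
```

An empty list means that all six required attributes are present.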
There are multiple ways to get the appropriate metadata into the database,
+including a POST request to the fractal-server API (see Tasks section in
+the fractal-server API
+documentation)
+or the automated addition of a whole set of tasks through specific API
+endpoints (see Task package).
Therefore the task command must accept these additional command-line arguments.
+If the task is a Python script, this can be achieved easily by using the
+run_fractal_task function - which is available as part of
+fractal_tasks_core.tasks._utils.
The meta attribute of tasks (see the corresponding item in Task
+metadata) is where we specify some requirements on how the
+task should be run. This notably includes:
+
+
If the task has to be run in parallel (e.g. over multiple wells of an
+ OME-Zarr dataset), then meta should include a key-value pair like
+ {"parallelization_level": "well"}. If the parallelization_level key is
+ missing, the task is considered as non-parallel.
+
If Fractal is configured to run on a SLURM cluster, meta may include
+ additional information on the SLURM requirements (more info on the Fractal
+ SLURM backend
+ here).
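The parallelization rule above can be sketched as follows. The helper name `get_parallelization_level` is hypothetical; only the `parallelization_level` key and the "missing key means non-parallel" rule come from the text.

```python
from typing import Optional


def get_parallelization_level(meta: dict) -> Optional[str]:
    """
    Return the parallelization level declared in `meta`, or `None` when the
    key is missing (i.e. the task is considered non-parallel).
    """
    return meta.get("parallelization_level")


parallel_meta = {"parallelization_level": "well"}
non_parallel_meta = {}

print(get_parallelization_level(parallel_meta))
print(get_parallelization_level(non_parallel_meta))
```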
When a task is run via Fractal, its input parameters (i.e. the ones in the file
+specified via the -j command-line option) will always include a set of keyword
+arguments with specific names:
The only task output which will be visible to Fractal is what goes in the
+output metadata-update file (i.e. the one specified through the
+--metadata-out command-line option). Note that this only holds for
+non-parallel tasks, while (for the moment) Fractal fully ignores the output of
+parallel tasks.
+
+
IMPORTANT: This means that each task must always write any output to
+disk, before ending.
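To make the command-line contract more concrete, here is a minimal sketch of a wrapper that reads the input parameters from the file given via `-j`, calls a task function, and writes the returned metadata update to the file given via `--metadata-out`. This is not the `run_fractal_task` implementation; the name `run_task_from_cli` and its internals are assumptions made for this example, and only the two command-line options come from the text above.

```python
import argparse
import json
from typing import Callable, Optional, Sequence


def run_task_from_cli(
    task_function: Callable[..., Optional[dict]],
    argv: Optional[Sequence[str]] = None,
) -> None:
    """Hypothetical stand-in for a task command-line entry point."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-j",
        dest="args_path",
        required=True,
        help="JSON file with the task input parameters",
    )
    parser.add_argument(
        "--metadata-out",
        dest="metadata_out",
        required=True,
        help="File where the metadata update is written",
    )
    cli_args = parser.parse_args(argv)

    # Load the task input parameters
    with open(cli_args.args_path) as f:
        task_kwargs = json.load(f)

    # Run the task; a task returning None yields an empty update
    metadata_update = task_function(**task_kwargs) or {}

    # Always write the output to disk before ending
    with open(cli_args.metadata_out, "w") as f:
        json.dump(metadata_update, f, indent=2)
```

A script would typically call `run_task_from_cli(my_task_function)` under `if __name__ == "__main__":`, so that the same function remains importable from Python.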
The description of other advanced features is not yet available on this page.
+
+
Other attributes of the Task metadata also exist, and they
+ are recognized by other Fractal components (e.g. fractal-server or
+ fractal-web). These include JSON Schemas for input parameters and additional
+ documentation-related attributes.
+
In fractal-tasks-core, we use pydantic
+ v1 to fully coerce and validate the input
+ parameters into a set of given types.
Given a set of Python scripts corresponding to Fractal tasks, it is useful to
+combine them into a single Python package, using the standard
+tools or
+other options (e.g. for fractal-tasks-core we use
+poetry).
Creating a package is often a good practice, for reasons unrelated to Fractal:
+
+
It makes it simple to assign a global version to the package, and to host it
+ on a public index like PyPI;
+
It may reduce code duplication:
+
The scripts may have a shared set of external dependencies, which are
+ defined in a single place for a package.
+
The scripts may import functions from a shared set of auxiliary Python
+ modules, which can be included in the package.
+
+
+
+
Moreover, having a single package also streamlines some Fractal-related
+operations. Given the package MyTasks (available on PyPI, or locally), the
+Fractal platform offers a feature that automatically:
+
+
Downloads the wheel file of package MyTasks (if it's on a public index,
+ rather than a local file);
+
Creates a Python virtual environment (venv) which is specific for a given
+ version of the MyTasks package, and installs the MyTasks package in that
+ venv;
+
Populates all the corresponding entries in the task database table with
+ the appropriate Task metadata, which are extracted from
+ the package manifest.
+
+
This feature is currently exposed in the /api/v1/task/collect/pip/ endpoint of fractal-server (see API documentation).
To be compatible with Fractal, a task package must satisfy some additional requirements:
+
+
The package is built as a wheel file, and can be installed via pip.
+
The __FRACTAL_MANIFEST__.json file is bundled in the package, in its root
+ folder. If you are using poetry, no special operation is needed. If you
+ are using a setup.cfg file, see
+ this
+ comment.
+
Include JSON Schemas. The tools in fractal_tasks_core.dev are used to
+ generate JSON Schemas for the input parameters of each task in
+ fractal-tasks-core. They are meant to be flexible and re-usable to perform
+ the same operation on an independent package, but they are not thoroughly
+ documented/tested for more general use; feel free to open an issue if something
+ is not clear.
+
Include additional task metadata like docs_info or docs_link, which will
+ be displayed in the Fractal web-client. Note: this feature is not yet
+ implemented.
We use poetry to manage both development environments and package building. A simple way to install it is pipx install poetry==1.8.2, or you can look at the installation section here.
+
From the repository root folder, running any of
+
# Install the core library only
+poetry install
+
+# Install the core library and the tasks
+poetry install -E fractal-tasks
+
+# Install the core library and the development/documentation dependencies
+poetry install --with dev --with docs
+
+will take care of installing all the dependencies in a separate environment (handled by poetry itself), optionally also installing the dependencies for development and for building the documentation.
+
We use pytest for unit and integration testing of Fractal. If you installed the development dependencies, you may run the test suite by invoking commands like:
+
# Run all tests
+poetry run pytest
+
+# Run all tests with a verbose mode, and stop at the first failure
+poetry run pytest -x -v
+
+# Run all tests and also print their output
+poetry run pytest -s
+
+# Ignore some tests folders
+poetry run pytest --ignore tests/tasks
+
+
The test files are in the tests folder of the repository. Its structure reflects the fractal_tasks_core structure, with tests for the core library in the main folder and tests for the tasks and dev subpackages in their own subfolders.
+
Tests are also run through GitHub Actions, with Python 3.9, 3.10 and 3.11. Note that within GitHub actions we run tests for both the poetry-installed and pip-installed versions of the code, which may e.g. have different versions of some dependencies (since pip install does not rely on the poetry.lock lockfile).
The documentation is built with mkdocs.
+To build the documentation locally, set up a development Python environment (e.g. with poetry install --with docs) and then run one of these commands:
+
poetry run mkdocs serve --config-file mkdocs.yml # serves the docs at http://127.0.0.1:8000
+poetry run mkdocs build --config-file mkdocs.yml # creates a build in the `site` folder
+
If appropriate (e.g. if you added some new task arguments, or if you modified some of their descriptions), update the JSON Schemas in the manifest via:
+
From within the main branch, use a command like:
+
# Automatic bump of release number
+poetry run bumpver update --[tag-num|patch|minor] --dry
+
+# Set a specific version
+poetry run bumpver update --set-version 1.2.3 --dry
+
+to test the version bump.
+
If the previous step looks good, remove the --dry and re-run the same command. This will commit both the edited files and the new tag, and push.
+
Approve the new version deployment at Publish package to PyPI (or have it approved); the corresponding GitHub action will take care of running poetry build and poetry publish with the appropriate credentials.
Fractal is a framework to process high content imaging data at scale and prepare it for interactive visualization.
+
+
This project is under active development 🔨. If you need help or found a bug, open an issue here.
+
+
Fractal provides distributed workflows that convert TBs of image data into OME-Zarr files.
+The platform then processes the 3D image data by applying tasks like illumination correction, maximum intensity projection, 3D segmentation using cellpose and measurements using napari workflows.
+The pyramidal OME-Zarr files enable interactive visualization in the napari viewer.
+
+
The fractal-tasks-core package contains the Python tasks that parse Yokogawa CV7000 images into OME-Zarr and process OME-Zarr files. Find more information about Fractal in general and the other repositories at this link. All tasks are written as Python functions and are optimized for usage in Fractal workflows, but they can also be used as standalone functions to parse data or process OME-Zarr files. We heavily use regions of interest (ROIs) in our OME-Zarr files to store the positions of fields of view. ROIs are saved as AnnData tables following this spec proposal. We save wells as large Zarr arrays instead of a collection of arrays for each field of view (see details here).
+
Here is an example of the interactive visualization in napari using the newly-proposed async loading in NAP4 and the napari-ome-zarr plugin:
Create Zarr Structure: Task to generate the zarr structure based on Yokogawa metadata files
+
Yokogawa to Zarr: Parses the Yokogawa CV7000 image data and saves it to the Zarr file
+
Illumination Correction: Applies an illumination correction based on a flatfield image & subtracts a background from the image.
+
Image Labeling (& Image Labeling Whole Well): Applies a cellpose network to the image of a single ROI or the whole well. cellpose parameters can be tuned for optimal performance.
+
Maximum Intensity Projection: Creates a maximum intensity projection of the whole plate.
+
Measurement: Make some standard measurements (intensity & morphology) using napari workflows, saving results to AnnData tables.
+
+
Some additional tasks are currently being worked on and some older tasks are still present in the fractal_tasks_core folder. See the package page for the detailed description of all tasks.
Fractal was conceived in the Liberali Lab at the Friedrich Miescher Institute for Biomedical Research and in the Pelkmans Lab at the University of Zurich by @jluethi and @gusqgm. The Fractal project is now developed at the BioVisionCenter at the University of Zurich and the project lead is with @jluethi. The core development is done under contract by eXact lab S.r.l.
def glob_with_multiple_patterns(
    *,
    folder: str,
    patterns: Sequence[str] = None,
) -> set[str]:
    """
    List all the items (files and folders) in a given folder that
    simultaneously match a series of glob patterns.

    Args:
        folder: Base folder where items will be searched.
        patterns: If specified, the list of patterns (defined as in
            https://docs.python.org/3/library/fnmatch.html) that item
            names will match with.
    """

    # Sanitize base-folder path
    if folder.endswith("/"):
        actual_folder = folder[:-1]
    else:
        actual_folder = folder[:]

    # If no pattern is specified, look for *all* items in the base folder
    if not patterns:
        patterns = ["*"]

    # Combine multiple glob searches (via set intersection)
    logging.info(f"[glob_with_multiple_patterns] {patterns=}")
    items = None
    for pattern in patterns:
        new_matches = glob(f"{actual_folder}/{pattern}")
        if items is None:
            items = set(new_matches)
        else:
            items = items.intersection(new_matches)
    items = items or set()
    logging.info(f"[glob_with_multiple_patterns] Found {len(items)} items")

    return items
+
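The set-intersection idea used above can be tried out on a temporary folder. The sketch below is a simplified re-implementation for illustration, not the library function itself; `glob_intersection` and the file names are made up for this example.

```python
import os
import tempfile
from glob import glob


def glob_intersection(folder: str, patterns: list[str]) -> set[str]:
    """Return the paths in `folder` that match *all* the glob patterns."""
    result = None
    for pattern in patterns:
        matches = set(glob(os.path.join(folder, pattern)))
        # The first pattern initializes the set, the others narrow it down
        result = matches if result is None else result & matches
    return result or set()


with tempfile.TemporaryDirectory() as tmp:
    # Create three empty example files (hypothetical names)
    for name in ("plate_A01_C01.png", "plate_A01_C02.png", "plate_B02_C01.tif"):
        open(os.path.join(tmp, name), "w").close()
    found = glob_intersection(tmp, ["*_C01*", "*.png"])
    found_names = sorted(os.path.basename(path) for path in found)

print(found_names)
```

Only `plate_A01_C01.png` matches both patterns, so it is the only name left after the intersection.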
Handles the conversion of Cellvoyager XML metadata into well identifiers.
+Returns well identifiers like A01, B02 etc. for 96 & 384 well plates.
+Returns well identifiers like A01.a1, A01.b2 etc. for 1536 well plates.
+Defaults to the processing used for 96 & 384 well plates, unless the
+plate_type is 1536. For 1536 well plates, the first 4x4 wells go into
+A01.a1 - A01.d4 and so on.
+
+
+
+
+
+
+
PARAMETER: DESCRIPTION

row_series: Series with index being the index of the image and the value the row position (starting at 1 for top left). TYPE: Series

col_series: Series with index being the index of the image and the value the col position (starting at 1 for top left). TYPE: Series

plate_type: Number of wells in the plate layout. Used to determine whether it's a 1536 well plate or a different layout.
def _create_well_ids(
    row_series: pd.Series,
    col_series: pd.Series,
    plate_type: int,
) -> list[str]:
    """
    Create well_id list from XML metadata

    Handles the conversion of Cellvoyager XML metadata into well identifiers.
    Returns well identifiers like A01, B02 etc. for 96 & 384 well plates.
    Returns well identifiers like A01.a1, A01.b2 etc. for 1536 well plates.
    Defaults to the processing used for 96 & 384 well plates, unless the
    plate_type is 1536. For 1536 well plates, the first 4x4 wells go into
    A01.a1 - A01.d4 and so on.

    Args:
        row_series: Series with index being the index of the image and the
            value the row position (starting at 1 for top left).
        col_series: Series with index being the index of the image and the
            value the col position (starting at 1 for top left).
        plate_type: Number of wells in the plate layout. Used to determine
            whether it's a 1536 well plate or a different layout.

    Returns:
        list of well_ids

    """
    if plate_type == 1536:
        # Rows are built of a base letter (matching to the 96 well plate
        # layout) and a sub letter (position of the 1536 well within the
        # 4x4 grid, can be a-d) of that well
        row_base = [chr(math.floor((x - 1) / 4) + 65) for x in (row_series)]
        row_sub = [chr((x - 1) % 4 + 97) for x in (row_series)]
        # Columns are built of a base number (matching to the 96 well plate
        # layout) and a sub integer (position of the 1536 well within the
        # 4x4 grid, can be 1-4) of that well
        col_base = [math.floor((x - 1) / 4) + 1 for x in col_series]
        col_sub = [(x - 1) % 4 + 1 for x in col_series]
        well_ids = []
        for i in range(len(row_base)):
            well_ids.append(
                f"{row_base[i]}{col_base[i]:02}.{row_sub[i]}{col_sub[i]}"
            )
    else:
        row_str = [chr(x) for x in (row_series + 64)]
        well_ids = [f"{a}{b:02}" for a, b in zip(row_str, col_series)]

    return well_ids
+
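The arithmetic behind the well identifiers is easier to follow for a single well. The following sketch applies the same formulas to one (row, column) pair with plain integers instead of pandas Series; `well_id_from_row_col` is a hypothetical name used only for this illustration.

```python
def well_id_from_row_col(row: int, col: int, plate_type: int) -> str:
    """Build a well ID from 1-based row/column indices (illustrative only)."""
    if plate_type == 1536:
        # Each 96-well position contains a 4x4 grid of 1536-plate wells
        row_base = chr((row - 1) // 4 + 65)  # A, B, C, ...
        row_sub = chr((row - 1) % 4 + 97)  # a-d
        col_base = (col - 1) // 4 + 1  # 1, 2, 3, ...
        col_sub = (col - 1) % 4 + 1  # 1-4
        return f"{row_base}{col_base:02}.{row_sub}{col_sub}"
    # 96 & 384 well plates: letter for the row, two digits for the column
    return f"{chr(row + 64)}{col:02}"


print(well_id_from_row_col(2, 3, 96))  # B03
print(well_id_from_row_col(1, 1, 1536))  # A01.a1
print(well_id_from_row_col(4, 4, 1536))  # A01.d4
print(well_id_from_row_col(5, 5, 1536))  # B02.a1
```

The last two calls show the boundary of the first 4x4 block: row 4, column 4 is still inside `A01`, while row 5, column 5 starts the block `B02`.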
def calculate_steps(site_series: pd.Series):
    """
    TBD

    Args:
        site_series: TBD
    """

    # site_series is the z_micrometer series for a given site of a given
    # channel. This function calculates the step size in Z

    # First diff is always NaN because there is nothing to compare it to
    steps = site_series.diff().dropna().astype(float)
    if not np.allclose(steps.iloc[0], np.array(steps)):
        raise NotImplementedError(
            "When parsing the Yokogawa mlf file, some sites "
            "had varying step size in Z. "
            "That is not supported for the OME-Zarr parsing"
        )
    return steps.mean()
+
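The uniform-step check can also be illustrated without pandas or NumPy. The sketch below is a pure-Python analogue, not the function above: `uniform_step` is a hypothetical name, the tolerances are chosen to mirror the `np.allclose` defaults, and the explicit error for fewer than two Z positions is an assumption of this example.

```python
import math


def uniform_step(z_positions: list[float]) -> float:
    """Return the common Z step, or fail if the spacing is not uniform."""
    # Differences between consecutive Z positions
    steps = [b - a for a, b in zip(z_positions, z_positions[1:])]
    if not steps:
        raise ValueError("At least two Z positions are needed")
    first = steps[0]
    if not all(
        math.isclose(step, first, rel_tol=1e-5, abs_tol=1e-8) for step in steps
    ):
        raise NotImplementedError("Varying step size in Z is not supported")
    return sum(steps) / len(steps)


print(uniform_step([0.0, 2.0, 4.0]))  # 2.0
```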
def get_earliest_time_per_site(mlf_frame: pd.DataFrame) -> pd.DataFrame:
    """
    TBD

    Args:
        mlf_frame: TBD
    """

    # Get the time information per site
    # Because a site will contain time information for each plane
    # of each channel, we just return the earliest time information
    # per site.
    return pd.to_datetime(
        mlf_frame.groupby(["well_id", "FieldIndex"]).min()["Time"], utc=True
    )
+
def get_z_steps(mlf_frame: pd.DataFrame) -> pd.DataFrame:
    """
    TBD

    Args:
        mlf_frame: TBD
    """

    # Process mlf_frame to extract Z information (pixel size & steps).
    # Run checks on consistencies & return site-based z step dataframe
    # Group by well, field & channel
    grouped_sites_z = (
        mlf_frame.loc[
            :,
            ["well_id", "FieldIndex", "ActionIndex", "Ch", "Z"],
        ]
        .set_index(["well_id", "FieldIndex", "ActionIndex", "Ch"])
        .groupby(level=[0, 1, 2, 3])
    )

    # If there is only 1 Z step, set the Z spacing to the count of planes => 1
    if grouped_sites_z.count()["Z"].max() == 1:
        z_data = grouped_sites_z.count().groupby(["well_id", "FieldIndex"])
    else:
        # Group the whole site (combine channels), because Z steps need to be
        # consistent between channels for OME-Zarr.
        z_data = grouped_sites_z.apply(calculate_steps).groupby(
            ["well_id", "FieldIndex"]
        )

    check_group_consistency(
        z_data, message="Comparing Z steps between channels"
    )

    # Ensure that channels have the same number of z planes and
    # reduce it to one value.
    # Only check if there is more than one channel available
    if any(
        grouped_sites_z.count().groupby(["well_id", "FieldIndex"]).count() > 1
    ):
        check_group_consistency(
            grouped_sites_z.count().groupby(["well_id", "FieldIndex"]),
            message="Checking number of Z steps between channels",
        )

    z_steps = (
        grouped_sites_z.count()
        .groupby(["well_id", "FieldIndex"])
        .mean()
        .astype(int)
    )

    # Combine the two dataframes
    z_frame = pd.concat([z_data.mean(), z_steps], axis=1)
    z_frame.columns = ["pixel_size_z", "z_pixel"]
    return z_frame
+
def read_metadata_files(
    mrf_path: str,
    mlf_path: str,
    filename_patterns: Optional[list[str]] = None,
) -> tuple[pd.DataFrame, pd.DataFrame, int]:
    """
    Create tables for mrf & mlf Yokogawa metadata.

    Args:
        mrf_path: Full path to MeasurementDetail.mrf metadata file.
        mlf_path: Full path to MeasurementData.mlf metadata file.
        filename_patterns: List of patterns to filter the image filenames in
            the mlf metadata table. Patterns must be defined as in
            https://docs.python.org/3/library/fnmatch.html.

    Returns:

    """

    # parsing of mrf & mlf files are based on the
    # yokogawa_image_collection_task v0.5 in drogon, written by Dario Vischi.
    # https://github.com/fmi-basel/job-system-workflows/blob/00bbf34448972d27f258a2c28245dd96180e8229/src/gliberal_workflows/tasks/yokogawa_image_collection_task/versions/version_0_5.py # noqa
    # Now modified for Fractal use

    mrf_frame, plate_type = read_mrf_file(mrf_path)

    # filter_position & filter_wheel_position are parsed, but not
    # processed further. Figure out how to save them as relevant metadata for
    # use e.g. during illumination correction

    mlf_frame, error_count = read_mlf_file(
        mlf_path, plate_type, filename_patterns
    )
    # Time points are parsed as part of the mlf_frame, but currently not
    # processed further. Once we tackle time-resolved data, parse from here.

    return mrf_frame, mlf_frame, error_count
+
def read_mlf_file(
    mlf_path: str,
    plate_type: int,
    filename_patterns: Optional[list[str]] = None,
) -> tuple[pd.DataFrame, int]:
    """
    Process the mlf metadata file of a Cellvoyager CV7K/CV8K.

    Args:
        mlf_path: Full path to MeasurementData.mlf metadata file.
        plate_type: Plate layout, integer for the number of potential wells.
        filename_patterns: List of patterns to filter the image filenames in
            the mlf metadata table. Patterns must be defined as in
            https://docs.python.org/3/library/fnmatch.html.

    Returns:
        mlf_frame: pd.DataFrame with relevant metadata per image
        error_count: Count of errors found during metadata processing
    """

    # Load the whole MeasurementData.mlf file
    mlf_frame_raw = pd.read_xml(mlf_path)

    # Remove all rows that do not match the given patterns
    logger.info(
        f"Read {mlf_path}, and apply following patterns to "
        f"image filenames: {filename_patterns}"
    )
    if filename_patterns:
        filenames = mlf_frame_raw.MeasurementRecord
        keep_row = None
        for pattern in filename_patterns:
            actual_pattern = fnmatch.translate(pattern)
            new_matches = filenames.str.fullmatch(actual_pattern)
            if new_matches.sum() == 0:
                raise ValueError(
                    f"In {mlf_path} there is no image filename "
                    f'matching "{actual_pattern}".'
                )
            if keep_row is None:
                keep_row = new_matches.copy()
            else:
                keep_row = keep_row & new_matches
        if keep_row.sum() == 0:
            raise ValueError(
                f"In {mlf_path} there is no image filename "
                f"matching {filename_patterns}."
            )
        mlf_frame_matching = mlf_frame_raw[keep_row.values].copy()
    else:
        mlf_frame_matching = mlf_frame_raw.copy()

    # Create a well ID column
    # Row & column are provided as int from XML metadata
    mlf_frame_matching["well_id"] = _create_well_ids(
        mlf_frame_matching["Row"], mlf_frame_matching["Column"], plate_type
    )

    # Flip Y axis to align to image coordinate system
    mlf_frame_matching["Y"] = -mlf_frame_matching["Y"]

    # Compute number of errors
    error_count = (mlf_frame_matching["Type"] == "ERR").sum()

    # We're only interested in the image metadata
    mlf_frame = mlf_frame_matching[mlf_frame_matching["Type"] == "IMG"]

    return mlf_frame, error_count
+
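The filename filtering relies on `fnmatch.translate`, which turns a glob-style pattern into a regular expression that can then be used for full-string matching. Below is a small sketch of that step on a plain list of strings; the function `filter_filenames` and the example filenames are hypothetical, and the filtering is done progressively rather than through boolean pandas masks.

```python
import fnmatch
import re
from typing import Optional


def filter_filenames(
    filenames: list[str], patterns: Optional[list[str]] = None
) -> list[str]:
    """Keep only the filenames that match *all* the given glob patterns."""
    keep = list(filenames)
    for pattern in patterns or []:
        # Translate the glob pattern into an equivalent regular expression
        regex = re.compile(fnmatch.translate(pattern))
        keep = [name for name in keep if regex.fullmatch(name)]
        if not keep:
            raise ValueError(f"No filename matches {patterns}.")
    return keep


names = ["img_A01_C01.tif", "img_A01_C02.tif", "img_B01_C01.tif"]
print(filter_filenames(names, ["*A01*", "*C01*"]))
```

With no patterns, the input list is returned unchanged, which mirrors the `else` branch of the function above.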
This function handles different patterns of well names: Classical wells in
+their format like B03 (row B, column 03) typically found in 96 & 384 well
+plates from the cellvoyager microscopes. And 1536 well plates with wells
+like A01.a1 (row Aa, column 011).
+
+
+
+
+
+
+
PARAMETER: DESCRIPTION

well_id: Well name. Either formatted like A03 (for 96 well and 384 well plates), or formatted like A01.a1 (for 1536 well plates).
def _extract_row_col_from_well_id(well_id: str) -> tuple[str, str]:
    """
    Split well name into row & column

    This function handles different patterns of well names: Classical wells in
    their format like B03 (row B, column 03) typically found in 96 & 384 well
    plates from the cellvoyager microscopes. And 1536 well plates with wells
    like A01.a1 (row Aa, column 011).

    Args:
        well_id: Well name. Either formatted like `A03` (for 96 well and 384
            well plates), or formatted like `A01.a1` (for 1536 well plates).
    Returns:
        Tuple of row and column names.
    """
    if len(well_id) == 3 and well_id.count(".") == 0:
        return (well_id[0], well_id[1:3])
    elif len(well_id) == 6 and well_id.count(".") == 1:
        core, suffix = well_id.split(".")
        row = f"{core[0]}{suffix[0]}"
        col = f"{core[1:]}{suffix[1]}"
        return (row, col)
    else:
        raise NotImplementedError(
            f"Processing wells like {well_id} has not been implemented. "
            "This converter only handles wells like B03 or B03.a1"
        )
+
Given a list of well names, construct a sorted row&column list
+
This function applies _extract_row_col_from_well_id to each wells
+element and then sorts the result.
+
+
+
+
+
+
+
PARAMETER: DESCRIPTION

wells: list of well names. Either formatted like [A03, B01, C03] for 96 well and 384 well plates. Or formatted like [A01.a1, A03.b2, B04.c4] for 1536 well plates.
def generate_row_col_split(wells: list[str]) -> list[tuple[str, str]]:
    """
    Given a list of well names, construct a sorted row&column list

    This function applies `_extract_row_col_from_well_id` to each `wells`
    element and then sorts the result.

    Args:
        wells: list of well names. Either formatted like [A03, B01, C03] for
            96 well and 384 well plates. Or formatted like [A01.a1, A03.b2,
            B04.c4] for 1536 well plates.
    Returns:
        well_rows_columns: List of tuples of row & col names
    """
    well_rows_columns = [_extract_row_col_from_well_id(well) for well in wells]
    return sorted(well_rows_columns)
+
def get_filename_well_id(row: str, col: str) -> str:
    """
    Generates the well_id as extracted from the filename from row & col.

    Processes the well identifiers generated by `generate_row_col_split` for
    cellvoyager datasets.

    Args:
        row: name of the row. Typically a single letter (A, B, C) for 96 & 384
            well plates. And two letters (Aa, Bb, Cc) for 1536 well plates.
        col: name of the column. Typically 2 digits (01, 02, 03) for 96 & 384
            well plates. And 3 digits (011, 012, 021) for 1536 well plates.
    Returns:
        well_id: name of the well as it would appear in the original image
            file name.
    """
    if len(row) == 1 and len(col) == 2:
        return row + col
    elif len(row) == 2 and len(col) == 3:
        return f"{row[0]}{col[:2]}.{row[1]}{col[2]}"
    else:
        raise NotImplementedError(
            f"Processing wells with {row=} & {col=} has not been implemented. "
            "This converter only handles wells like B03 or B03.a1"
        )
+
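Splitting a well name into row and column and joining them back are inverse operations, which a short round trip makes visible. The sketch below uses simplified stand-ins (`split_well` and `join_well` are hypothetical names, and they skip the length validation done by the functions above).

```python
def split_well(well_id: str) -> tuple[str, str]:
    """Split e.g. 'B03' into ('B', '03') and 'A01.a1' into ('Aa', '011')."""
    if "." not in well_id:
        return well_id[0], well_id[1:]
    core, suffix = well_id.split(".")
    return core[0] + suffix[0], core[1:] + suffix[1]


def join_well(row: str, col: str) -> str:
    """Rebuild the well name as it appears in the image file name."""
    if len(row) == 1:
        return row + col
    return f"{row[0]}{col[:2]}.{row[1]}{col[2]}"


wells = ["B03", "A01.b2", "A01.a1"]
# Sorting the (row, column) tuples orders wells by row first, then by column
row_col_pairs = sorted(split_well(well) for well in wells)
print(row_col_pairs)  # [('Aa', '011'), ('Ab', '012'), ('B', '03')]
print([join_well(row, col) for row, col in row_col_pairs])
```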
class ChannelInputModel(BaseModel):
    """
    A channel which is specified by either `wavelength_id` or `label`.

    This model is similar to `OmeroChannel`, but it is used for
    task-function arguments (and for generating appropriate JSON schemas).

    Attributes:
        wavelength_id: Unique ID for the channel wavelength, e.g. `A01_C01`.
        label: Name of the channel.
    """

    wavelength_id: Optional[str] = None
    label: Optional[str] = None

    @validator("label", always=True)
    def mutually_exclusive_channel_attributes(cls, v, values):
        """
        Check that either `label` or `wavelength_id` is set.
        """
        wavelength_id = values.get("wavelength_id")
        label = v
        if wavelength_id and v:
            raise ValueError(
                "`wavelength_id` and `label` cannot be both set "
                f"(given {wavelength_id=} and {label=})."
            )
        if wavelength_id is None and v is None:
            raise ValueError(
                "`wavelength_id` and `label` cannot be both `None`"
            )
        return v
+
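The "exactly one of two attributes" rule does not depend on Pydantic. As a dependency-free illustration, the sketch below enforces the same rule with a standard-library dataclass; note that this swaps in `dataclasses` for the Pydantic validator, and `ChannelSpec` is a hypothetical name, not a class from fractal-tasks-core.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelSpec:
    """Illustrative stand-in: a channel given by wavelength_id or label."""

    wavelength_id: Optional[str] = None
    label: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject the case where both attributes are set
        if self.wavelength_id and self.label:
            raise ValueError("`wavelength_id` and `label` cannot be both set")
        # Reject the case where neither attribute is set
        if self.wavelength_id is None and self.label is None:
            raise ValueError("`wavelength_id` and `label` cannot be both `None`")


channel = ChannelSpec(label="DAPI")
print(channel)
```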
Custom error for when get_channel_from_list fails,
+that can be captured and handled upstream if needed.
+
+
+ Source code in fractal_tasks_core/channels.py
+
class ChannelNotFoundError(ValueError):
    """
    Custom error for when `get_channel_from_list` fails,
    that can be captured and handled upstream if needed.
    """

    pass
+
class OmeroChannel(BaseModel):
    """
    Custom class for Omero channels, based on OME-NGFF v0.4.

    Attributes:
        wavelength_id: Unique ID for the channel wavelength, e.g. `A01_C01`.
        index: Do not change. For internal use only.
        label: Name of the channel.
        window: Optional `Window` object to set default display settings for
            napari.
        color: Optional hex colormap to display the channel in napari (it
            must be of length 6, e.g. `00FFFF`).
        active: Should this channel be shown in the viewer?
        coefficient: Do not change. Omero-channel attribute.
        inverted: Do not change. Omero-channel attribute.
    """

    # Custom

    wavelength_id: str
    index: Optional[int]

    # From OME-NGFF v0.4 transitional metadata

    label: Optional[str]
    window: Optional[Window]
    color: Optional[str]
    active: bool = True
    coefficient: int = 1
    inverted: bool = False

    @validator("color", always=True)
    def valid_hex_color(cls, v, values):
        """
        Check that `color` is made of exactly six elements which are letters
        (a-f or A-F) or digits (0-9).
        """
        if v is None:
            return v
        if len(v) != 6:
            raise ValueError(f'color must have length 6 (given: "{v}")')
        allowed_characters = "abcdefABCDEF0123456789"
        for character in v:
            if character not in allowed_characters:
                raise ValueError(
                    "color must only include characters from "
                    f'"{allowed_characters}" (given: "{v}")'
                )
        return v
+
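The color check can be tried outside of a Pydantic model. This is a sketch that mirrors the validator logic as a plain function; the module-level constant `ALLOWED_CHARACTERS` is a name introduced only for this example.

```python
from typing import Optional

# Assumption: same character set as in the validator shown above
ALLOWED_CHARACTERS = "abcdefABCDEF0123456789"


def valid_hex_color(color: Optional[str]) -> Optional[str]:
    """Return `color` unchanged if it is None or a six-character hex string."""
    if color is None:
        return color
    if len(color) != 6:
        raise ValueError(f'color must have length 6 (given: "{color}")')
    for character in color:
        if character not in ALLOWED_CHARACTERS:
            raise ValueError(
                "color must only include characters from "
                f'"{ALLOWED_CHARACTERS}" (given: "{color}")'
            )
    return color
```

Note that a leading `#` is not accepted: `"#00FFFF"` has length 7 and fails the length check.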
class Window(BaseModel):
    """
    Custom class for Omero-channel window, based on OME-NGFF v0.4.

    Attributes:
        min: Do not change. It will be set to `0` by default.
        max:
            Do not change. It will be set according to bit-depth of the images
            by default (e.g. 65535 for 16 bit images).
        start: Lower-bound rescaling value for visualization.
        end: Upper-bound rescaling value for visualization.
    """

    min: Optional[int]
    max: Optional[int]
    start: int
    end: int

def _get_new_unique_value(
    value: str,
    existing_values: list[str],
) -> str:
    """
    Produce a string value that is not present in a given list

    Append `-1`, `-2`, ... to a given string, if needed, until finding a value
    which is not already present in `existing_values`.

    Args:
        value: The first guess for the new value
        existing_values: The list of existing values

    Returns:
        A string value which is not present in `existing_values`
    """
    counter = 1
    new_value = value
    while new_value in existing_values:
        new_value = f"{value}-{counter}"
        counter += 1
    return new_value

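A short standalone sketch of the same loop shows how the suffix is chosen. The function is renamed `get_new_unique_value` here only so the example is self-contained; as in the source code above, the counter is joined with a hyphen.

```python
def get_new_unique_value(value: str, existing_values: list[str]) -> str:
    """Return `value`, or `value-N` with the smallest free N >= 1."""
    counter = 1
    new_value = value
    while new_value in existing_values:
        new_value = f"{value}-{counter}"
        counter += 1
    return new_value


print(get_new_unique_value("DAPI", ["GFP"]))             # DAPI
print(get_new_unique_value("DAPI", ["DAPI", "DAPI-1"]))  # DAPI-2
```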
def check_well_channel_labels(*, well_zarr_path: str) -> None:
    """
    Check that the channel labels for a well are unique.

    First identify the channel-labels list for each image in the well, then
    compare lists and verify their intersection is empty.

    Args:
        well_zarr_path: path to an OME-NGFF well zarr group.
    """

    # Iterate over all images (multiplexing acquisitions, multi-FOVs, ...)
    group = zarr.open_group(well_zarr_path, mode="r+")
    image_paths = [image["path"] for image in group.attrs["well"]["images"]]
    list_of_channel_lists = []
    for image_path in image_paths:
        channels = get_omero_channel_list(
            image_zarr_path=f"{well_zarr_path}/{image_path}"
        )
        list_of_channel_lists.append(channels[:])

    # For each pair of channel-labels lists, verify they do not overlap
    for ind_1, channels_1 in enumerate(list_of_channel_lists):
        labels_1 = set([c.label for c in channels_1])
        for ind_2 in range(ind_1):
            channels_2 = list_of_channel_lists[ind_2]
            labels_2 = set([c.label for c in channels_2])
            intersection = labels_1 & labels_2
            if intersection:
                hint = (
                    "Are you parsing fields of view into separate OME-Zarr "
                    "images? This could lead to non-unique channel labels, "
                    "and then could be the reason of the error"
                )
                raise ValueError(
                    "Non-unique channel labels\n"
                    f"{labels_1=}\n{labels_2=}\n{hint}"
                )

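The pairwise comparison can be sketched on plain lists of labels, without any zarr access. `find_label_overlap` is a hypothetical helper written for this illustration; it returns the first overlapping pair instead of raising.

```python
def find_label_overlap(list_of_label_lists):
    """
    Compare each list of labels with all previous ones and return
    (earlier_index, later_index, shared_labels) for the first overlap,
    or None if all lists are pairwise disjoint.
    """
    for ind_1, labels_1 in enumerate(list_of_label_lists):
        for ind_2 in range(ind_1):
            intersection = set(labels_1) & set(list_of_label_lists[ind_2])
            if intersection:
                return ind_2, ind_1, intersection
    return None


# Three images in a well; the third one repeats a label of the first
print(find_label_overlap([["DAPI", "GFP"], ["RFP"], ["GFP"]]))
```

With N images this performs N*(N-1)/2 set intersections, which is negligible for the number of images typically found in a well.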
Update a channel list to use it in the OMERO/channels metadata.
+
Given a list of channel dictionaries, update each one of them by:
+ 1. Adding a label (if missing);
+ 2. Adding a set of OMERO-specific attributes;
+ 3. Discarding all other attributes.
+
The new_channels output can be used in the attrs["omero"]["channels"]
+attribute of an image group.

PARAMETER   DESCRIPTION
channels    A list of channel dictionaries (each one must include the wavelength_id key).

def define_omero_channels(
    *,
    channels: list[OmeroChannel],
    bit_depth: int,
    label_prefix: Optional[str] = None,
) -> list[dict[str, Union[str, int, bool, dict[str, int]]]]:
    """
    Update a channel list to use it in the OMERO/channels metadata.

    Given a list of channel dictionaries, update each one of them by:
        1. Adding a label (if missing);
        2. Adding a set of OMERO-specific attributes;
        3. Discarding all other attributes.

    The `new_channels` output can be used in the `attrs["omero"]["channels"]`
    attribute of an image group.

    Args:
        channels: A list of channel dictionaries (each one must include the
            `wavelength_id` key).
        bit_depth: Bit depth of the images; it is used to set the `max`
            value of each channel window to `2**bit_depth - 1`.
        label_prefix: Optional prefix for default labels; when a channel
            has no label, its label is set to
            `{label_prefix}_{wavelength_id}` (or to `wavelength_id`, if no
            prefix is given).

    Returns:
        `new_channels`, a new list of consistent channel dictionaries that
            can be written to OMERO metadata.
    """

    new_channels = [c.copy(deep=True) for c in channels]
    default_colors = ["00FFFF", "FF00FF", "FFFF00"]

    for channel in new_channels:
        wavelength_id = channel.wavelength_id

        # If channel.label is None, set it to a default value
        if channel.label is None:
            default_label = wavelength_id
            if label_prefix:
                default_label = f"{label_prefix}_{default_label}"
            logging.warning(
                f"Missing label for {channel=}, using {default_label=}"
            )
            channel.label = default_label

        # If channel.color is None, set it to a default value (use the default
        # ones for the first three channels, or gray otherwise)
        if channel.color is None:
            try:
                channel.color = default_colors.pop()
            except IndexError:
                channel.color = "808080"

        # Set channel.window attribute
        if channel.window:
            channel.window.min = 0
            channel.window.max = 2**bit_depth - 1

    # Check that channel labels are unique for this image
    labels = [c.label for c in new_channels]
    if len(set(labels)) < len(labels):
        raise ValueError(f"Non-unique labels in {new_channels=}")

    new_channels_dictionaries = [
        c.dict(exclude={"index"}, exclude_unset=True) for c in new_channels
    ]

    return new_channels_dictionaries

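The default-color fallback in the code above can be isolated in a few lines. This is a sketch on a plain list of optional color strings; `assign_default_colors` is a hypothetical name, and the three defaults and the gray fallback are copied from the source shown above.

```python
from typing import Optional


def assign_default_colors(colors: list[Optional[str]]) -> list[str]:
    """Fill missing colors from a small pool, then fall back to gray."""
    default_colors = ["00FFFF", "FF00FF", "FFFF00"]
    assigned = []
    for color in colors:
        if color is None:
            try:
                # `pop()` takes from the end of the list
                color = default_colors.pop()
            except IndexError:
                color = "808080"
        assigned.append(color)
    return assigned


print(assign_default_colors([None, "123456", None, None, None]))
# ['FFFF00', '123456', 'FF00FF', '00FFFF', '808080']
```

Because `pop()` removes the last element, the defaults are handed out in reverse order of the list, and only channels without an explicit color consume them.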
def get_channel_from_image_zarr(
    *,
    image_zarr_path: str,
    label: Optional[str] = None,
    wavelength_id: Optional[str] = None,
) -> OmeroChannel:
    """
    Extract a channel from OME-NGFF zarr attributes.

    This is a helper function that combines `get_omero_channel_list` with
    `get_channel_from_list`.

    Args:
        image_zarr_path: Path to an OME-NGFF image zarr group.
        label: `label` attribute of the channel to be extracted.
        wavelength_id: `wavelength_id` attribute of the channel to be
            extracted.

    Returns:
        The matching channel, as a single `OmeroChannel` object.
    """
    omero_channels = get_omero_channel_list(image_zarr_path=image_zarr_path)
    channel = get_channel_from_list(
        channels=omero_channels, label=label, wavelength_id=wavelength_id
    )
    return channel

Find the channel that has the required values of label and/or
+wavelength_id, and identify its positional index (which also
+corresponds to its index in the zarr array).

PARAMETER   DESCRIPTION
channels    A list of channel dictionaries, where each channel includes (at least) the label and wavelength_id keys.

def get_channel_from_list(
    *,
    channels: list[OmeroChannel],
    label: Optional[str] = None,
    wavelength_id: Optional[str] = None,
) -> OmeroChannel:
    """
    Find matching channel in a list.

    Find the channel that has the required values of `label` and/or
    `wavelength_id`, and identify its positional index (which also
    corresponds to its index in the zarr array).

    Args:
        channels: A list of channels, where each channel includes (at
            least) the `label` and `wavelength_id` attributes.
        label: The label to look for in the list of channels.
        wavelength_id: The wavelength_id to look for in the list of channels.

    Returns:
        The single matching channel, with its `index` attribute set.
    """

    # Identify matching channels
    if label:
        if wavelength_id:
            # Both label and wavelength_id are specified
            matching_channels = [
                c
                for c in channels
                if (c.label == label and c.wavelength_id == wavelength_id)
            ]
        else:
            # Only label is specified
            matching_channels = [c for c in channels if c.label == label]
    else:
        if wavelength_id:
            # Only wavelength_id is specified
            matching_channels = [
                c for c in channels if c.wavelength_id == wavelength_id
            ]
        else:
            # Neither label nor wavelength_id is specified
            raise ValueError(
                "get_channel requires at least one in {label,wavelength_id} "
                "arguments"
            )

    # Verify that there is one and only one matching channel
    if len(matching_channels) == 0:
        required_match = [f"{label=}", f"{wavelength_id=}"]
        required_match_string = " and ".join(
            [x for x in required_match if "None" not in x]
        )
        raise ChannelNotFoundError(
            f"ChannelNotFoundError: No channel found in {channels}"
            f" for {required_match_string}"
        )
    if len(matching_channels) > 1:
        raise ValueError(f"Inconsistent set of channels: {channels}")

    channel = matching_channels[0]
    channel.index = channels.index(channel)
    return channel

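The lookup logic can be exercised with a lightweight stand-in for the channel model. In this sketch, `Channel` is a hypothetical dataclass replacing `OmeroChannel`, `find_channel` is a condensed rewrite of the branching above, and `ChannelNotFoundError` is redefined locally so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Optional


class ChannelNotFoundError(ValueError):
    """Local stand-in for the error class of the same name."""


@dataclass
class Channel:
    """Hypothetical minimal replacement for `OmeroChannel`."""

    wavelength_id: str
    label: Optional[str] = None
    index: Optional[int] = None


def find_channel(channels, label=None, wavelength_id=None):
    """Return the unique channel matching the given attribute(s)."""
    if not label and not wavelength_id:
        raise ValueError("Provide at least one of label and wavelength_id")
    # An unspecified criterion matches every channel
    matching = [
        c
        for c in channels
        if (not label or c.label == label)
        and (not wavelength_id or c.wavelength_id == wavelength_id)
    ]
    if len(matching) == 0:
        raise ChannelNotFoundError(f"No channel found in {channels}")
    if len(matching) > 1:
        raise ValueError(f"Inconsistent set of channels: {channels}")
    channel = matching[0]
    # The positional index is the channel's position in the list
    channel.index = channels.index(channel)
    return channel


channels = [Channel("A01_C01", "DAPI"), Channel("A02_C02", "GFP")]
print(find_channel(channels, label="GFP").index)  # 1
```

Since `ChannelNotFoundError` subclasses `ValueError`, callers can catch the specific error first and still treat other `ValueError`s (such as an inconsistent channel list) separately.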
def get_omero_channel_list(*, image_zarr_path: str) -> list[OmeroChannel]:
    """
    Extract the list of channels from OME-NGFF zarr attributes.

    Args:
        image_zarr_path: Path to an OME-NGFF image zarr group.

    Returns:
        A list of channel dictionaries.
    """
    group = zarr.open_group(image_zarr_path, mode="r+")
    channels_dicts = group.attrs["omero"]["channels"]
    channels = [OmeroChannel(**c) for c in channels_dicts]
    return channels

This function creates the package manifest based on a task_list.py
+Python module located in the dev subfolder of the package, see an
+example of such list at ...
+
The manifest is then written to __FRACTAL_MANIFEST__.json, in the
+main package directory.
+
Note: a valid example of custom_pydantic_models would be
[("my_task_package", "some_module.py", "SomeModel")].

def create_manifest(
    package: str = "fractal_tasks_core",
    manifest_version: str = "2",
    has_args_schemas: bool = True,
    args_schema_version: str = "pydantic_v1",
    docs_link: Optional[str] = None,
    custom_pydantic_models: Optional[list[tuple[str, str, str]]] = None,
):
    """
    This function creates the package manifest based on a `task_list.py`
    Python module located in the `dev` subfolder of the package, see an
    example of such list at ...

    The manifest is then written to `__FRACTAL_MANIFEST__.json`, in the
    main `package` directory.

    Note: a valid example of `custom_pydantic_models` would be
    ```
    [
        ("my_task_package", "some_module.py", "SomeModel"),
    ]
    ```

    Arguments:
        package: The name of the package (must be importable).
        manifest_version: Only `"2"` is supported.
        has_args_schemas:
            Whether to autogenerate JSON Schemas for task arguments.
        args_schema_version:
            Only `"pydantic_v1"` is currently supported in `fractal-server`
            and `fractal-web`.
        custom_pydantic_models:
            Custom models to be included when building JSON Schemas for task
            arguments.
    """

    # Preliminary check
    if manifest_version != "2":
        raise NotImplementedError(f"{manifest_version=} is not supported")

    logging.info("Start generating a new manifest")

    # Prepare an empty manifest
    manifest = dict(
        manifest_version=manifest_version,
        task_list=[],
        has_args_schemas=has_args_schemas,
    )
    if has_args_schemas:
        manifest["args_schema_version"] = args_schema_version

    # Prepare a default value of docs_link
    if package == "fractal_tasks_core" and docs_link is None:
        docs_link = (
            "https://fractal-analytics-platform.github.io/fractal-tasks-core"
        )

    # Import the task list from `dev/task_list.py`
    task_list_module = import_module(f"{package}.dev.task_list")
    TASK_LIST = getattr(task_list_module, "TASK_LIST")

    # Loop over TASK_LIST, and append the proper task dictionary
    # to manifest["task_list"]
    for task_obj in TASK_LIST:
        # Convert Pydantic object to dictionary
        task_dict = task_obj.dict(
            exclude={"meta_init", "executable_init", "meta", "executable"},
            exclude_unset=True,
        )

        # Copy some properties from `task_obj` to `task_dict`
        if task_obj.executable_non_parallel is not None:
            task_dict[
                "executable_non_parallel"
            ] = task_obj.executable_non_parallel
        if task_obj.executable_parallel is not None:
            task_dict["executable_parallel"] = task_obj.executable_parallel
        if task_obj.meta_non_parallel is not None:
            task_dict["meta_non_parallel"] = task_obj.meta_non_parallel
        if task_obj.meta_parallel is not None:
            task_dict["meta_parallel"] = task_obj.meta_parallel

        # Autogenerate JSON Schemas for non-parallel/parallel task arguments
        if has_args_schemas:
            for kind in ["non_parallel", "parallel"]:
                executable = task_dict.get(f"executable_{kind}")
                if executable is not None:
                    logging.info(f"[{executable}] START")
                    schema = create_schema_for_single_task(
                        executable,
                        package=package,
                        custom_pydantic_models=custom_pydantic_models,
                    )
                    logging.info(f"[{executable}] END (new schema)")
                    task_dict[f"args_schema_{kind}"] = schema

        # Update docs_info, based on task-function description
        docs_info = create_docs_info(
            executable_non_parallel=task_obj.executable_non_parallel,
            executable_parallel=task_obj.executable_parallel,
            package=package,
        )
        if docs_info is not None:
            task_dict["docs_info"] = docs_info
        if docs_link is not None:
            task_dict["docs_link"] = docs_link

        manifest["task_list"].append(task_dict)
        print()

    # Write manifest
    imported_package = import_module(package)
    manifest_path = (
        Path(imported_package.__file__).parent / "__FRACTAL_MANIFEST__.json"
    )
    with manifest_path.open("w") as f:
        json.dump(manifest, f, indent=2)
        f.write("\n")
    logging.info(f"Manifest stored in {manifest_path.as_posix()}")

Keeps only the description part of the docstrings: e.g from
+
'Custom class for Omero-channel window, based on OME-NGFF v0.4.\n'
+'\n'
+'Attributes:\n'
+'min: Do not change. It will be set to `0` by default.\n'
+'max: Do not change. It will be set according to bitdepth of the images\n'
+' by default (e.g. 65535 for 16 bit images).\n'
+'start: Lower-bound rescaling value for visualization.\n'
+'end: Upper-bound rescaling value for visualization.'
+
+to 'Custom class for Omero-channel window, based on OME-NGFF v0.4.\n'.

PARAMETER    DESCRIPTION
old_schema   TBD (TYPE: _Schema)

+ Source code in fractal_tasks_core/dev/lib_args_schemas.py
+
def _remove_attributes_from_descriptions(old_schema: _Schema) -> _Schema:
    """
    Keeps only the description part of the docstrings: e.g from
    ```
    'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\n'
    '\\n'
    'Attributes:\\n'
    'min: Do not change. It will be set to `0` by default.\\n'
    'max: Do not change. It will be set according to bitdepth of the images\\n'
    '    by default (e.g. 65535 for 16 bit images).\\n'
    'start: Lower-bound rescaling value for visualization.\\n'
    'end: Upper-bound rescaling value for visualization.'
    ```
    to `'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\n'`.

    Args:
        old_schema: TBD
    """
    new_schema = old_schema.copy()
    if "definitions" in new_schema:
        for name, definition in new_schema["definitions"].items():
            parsed_docstring = docparse(definition["description"])
            new_schema["definitions"][name][
                "description"
            ] = parsed_docstring.short_description
    logging.info("[_remove_attributes_from_descriptions] END")
    return new_schema

def create_schema_for_single_task(
    executable: str,
    package: Optional[str] = "fractal_tasks_core",
    custom_pydantic_models: Optional[list[tuple[str, str, str]]] = None,
    task_function: Optional[Callable] = None,
    verbose: bool = False,
) -> _Schema:
    """
    Main function to create a JSON Schema of task arguments

    This function can be used in two ways:

    1. `task_function` argument is `None`, `package` is set, and `executable`
        is a path relative to that package.
    2. `task_function` argument is provided, `executable` is an absolute path
        to the function module, and `package` is `None`. This is useful for
        testing.

    """

    logging.info("[create_schema_for_single_task] START")
    if task_function is None:
        usage = "1"
        # Usage 1 (standard)
        if package is None:
            raise ValueError(
                "Cannot call `create_schema_for_single_task` with "
                f"{task_function=} and {package=}. Exit."
            )
        if os.path.isabs(executable):
            raise ValueError(
                "Cannot call `create_schema_for_single_task` with "
                f"{task_function=} and absolute {executable=}. Exit."
            )
    else:
        usage = "2"
        # Usage 2 (testing)
        if package is not None:
            raise ValueError(
                "Cannot call `create_schema_for_single_task` with "
                f"{task_function=} and non-None {package=}. Exit."
            )
        if not os.path.isabs(executable):
            raise ValueError(
                "Cannot call `create_schema_for_single_task` with "
                f"{task_function=} and non-absolute {executable=}. Exit."
            )

    # Extract function from module
    if usage == "1":
        # Extract the function name (for the moment we assume the function has
        # the same name as the module)
        function_name = Path(executable).with_suffix("").name
        # Extract the function object
        task_function = _extract_function(
            package_name=package,
            module_relative_path=executable,
            function_name=function_name,
            verbose=verbose,
        )
    else:
        # The function object is already available, extract its name
        function_name = task_function.__name__

    if verbose:
        logging.info(f"[create_schema_for_single_task] {function_name=}")
        logging.info(f"[create_schema_for_single_task] {task_function=}")

    # Validate function signature against some custom constraints
    _validate_function_signature(task_function)

    # Create and clean up schema
    vf = ValidatedFunction(task_function, config=None)
    schema = vf.model.schema()
    schema = _remove_args_kwargs_properties(schema)
    schema = _remove_pydantic_internals(schema)
    schema = _remove_attributes_from_descriptions(schema)

    # Include titles for custom-model-typed arguments
    schema = _include_titles(schema, verbose=verbose)

    # Include descriptions of function. Note: this function works both
    # for usages 1 or 2 (see docstring).
    function_args_descriptions = _get_function_args_descriptions(
        package_name=package,
        module_path=executable,
        function_name=function_name,
        verbose=verbose,
    )
    schema = _insert_function_args_descriptions(
        schema=schema, descriptions=function_args_descriptions, verbose=verbose
    )

    # Merge lists of fractal-tasks-core and user-provided Pydantic models
    user_provided_models = custom_pydantic_models or []
    pydantic_models = FRACTAL_TASKS_CORE_PYDANTIC_MODELS + user_provided_models

    # Check that model names are unique
    pydantic_models_names = [item[2] for item in pydantic_models]
    duplicate_class_names = [
        name
        for name, count in Counter(pydantic_models_names).items()
        if count > 1
    ]
    if duplicate_class_names:
        pydantic_models_str = "  " + "\n  ".join(map(str, pydantic_models))
        raise ValueError(
            "Cannot parse docstrings for models with non-unique names "
            f"{duplicate_class_names}, in\n{pydantic_models_str}"
        )

    # Extract model-attribute descriptions and insert them into schema
    for package_name, module_relative_path, class_name in pydantic_models:
        attrs_descriptions = _get_class_attrs_descriptions(
            package_name=package_name,
            module_relative_path=module_relative_path,
            class_name=class_name,
        )
        schema = _insert_class_attrs_descriptions(
            schema=schema,
            class_name=class_name,
            descriptions=attrs_descriptions,
        )

    logging.info("[create_schema_for_single_task] END")
    return schema

This is a provisional helper function that replaces newlines with spaces
+and reduces multiple contiguous whitespace characters to a single one.
+Future iterations of the docstrings format/parsing may render this function
+not-needed or obsolete.
def _sanitize_description(string: str) -> str:
    """
    Sanitize a description string.

    This is a provisional helper function that replaces newlines with spaces
    and reduces multiple contiguous whitespace characters to a single one.
    Future iterations of the docstrings format/parsing may render this function
    not-needed or obsolete.

    Args:
        string: The description string to be sanitized.
    """
    # Replace newline with space
    new_string = string.replace("\n", " ")
    # Replace N-whitespace characters with a single one
    while "  " in new_string:
        new_string = new_string.replace("  ", " ")
    return new_string

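A standalone sketch of the behavior described above (newlines become spaces, then runs of spaces collapse to a single space) is given below. The function is renamed `sanitize_description` only to keep the example self-contained.

```python
def sanitize_description(string: str) -> str:
    """Replace newlines with spaces and collapse repeated spaces."""
    new_string = string.replace("\n", " ")
    # Each pass roughly halves any run of spaces, until no pair is left
    while "  " in new_string:
        new_string = new_string.replace("  ", " ")
    return new_string


docstring_fragment = "Lower-bound\n    rescaling  value"
print(sanitize_description(docstring_fragment))
# Lower-bound rescaling value
```

Unlike `" ".join(string.split())`, this approach only touches newlines and spaces, and it leaves a single leading or trailing space in place rather than stripping it.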
def _validate_function_signature(function: Callable):
    """
    Validate the function signature.

    Implement a set of checks for type hints that do not play well with the
    creation of JSON Schema, see
    https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/399.

    Args:
        function: TBD
    """
    sig = signature(function)
    for param in sig.parameters.values():

        # CASE 1: Check that name is not forbidden
        if param.name in FORBIDDEN_PARAM_NAMES:
            raise ValueError(
                f"Function {function} has argument with name {param.name}"
            )

        # CASE 2: Raise an error for unions
        if str(param.annotation).startswith(("typing.Union[", "Union[")):
            raise ValueError("typing.Union is not supported")

        # CASE 3: Raise an error for "|"
        if "|" in str(param.annotation):
            raise ValueError('Use of "|" in type hints is not supported')

        # CASE 4: Raise an error for optional parameter with given (non-None)
        # default, e.g. Optional[str] = "asd"
        is_annotation_optional = str(param.annotation).startswith(
            ("typing.Optional[", "Optional[")
        )
        default_given = (param.default is not None) and (
            param.default != inspect._empty
        )
        if default_given and is_annotation_optional:
            raise ValueError("Optional parameter has non-None default value")

    logging.info("[_validate_function_signature] END")
    return sig

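The four checks can be reproduced with the standard library alone. This is a sketch under stated assumptions: `validate_signature` is a hypothetical rewrite, the content of `FORBIDDEN_PARAM_NAMES` is illustrative (the real tuple is not shown in this page), and the public `inspect.Parameter.empty` sentinel is used in place of `inspect._empty`.

```python
import inspect
from typing import Optional, Union

# Assumption: illustrative values only, not the library's actual list
FORBIDDEN_PARAM_NAMES = ("args", "kwargs")


def validate_signature(function):
    """Raise ValueError for type hints that the schema tooling rejects."""
    sig = inspect.signature(function)
    for param in sig.parameters.values():
        annotation = str(param.annotation)
        if param.name in FORBIDDEN_PARAM_NAMES:
            raise ValueError(f"Forbidden argument name {param.name}")
        if annotation.startswith(("typing.Union[", "Union[")):
            raise ValueError("typing.Union is not supported")
        if "|" in annotation:
            raise ValueError('Use of "|" in type hints is not supported')
        is_optional = annotation.startswith(("typing.Optional[", "Optional["))
        default_given = (
            param.default is not None
            and param.default is not inspect.Parameter.empty
        )
        if default_given and is_optional:
            raise ValueError("Optional parameter has non-None default value")
    return sig


def good_task(zarr_url: str, level: int = 0):
    pass


def bad_union(value: Union[int, str]):
    pass


def bad_default(name: Optional[str] = "asd"):
    pass
```

Calling `validate_signature(good_task)` returns the signature, while `bad_union` and `bad_default` both raise `ValueError`. Because the checks rely on `str(param.annotation)`, which branch fires depends on how the running Python version renders type hints.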
def create_docs_info(
    executable_non_parallel: Optional[str] = None,
    executable_parallel: Optional[str] = None,
    package: str = "fractal_tasks_core",
) -> str:
    """
    Return task description based on function docstring.
    """
    logging.info("[create_docs_info] START")
    docs_info = []
    for executable in [executable_non_parallel, executable_parallel]:
        if executable is None:
            continue
        # Extract the function name.
        # Note: this could be made more general, but for the moment we assume
        # that the function has the same name as the module
        function_name = Path(executable).with_suffix("").name
        logging.info(f"[create_docs_info] {function_name=}")
        # Get function description
        description = _get_function_description(
            package_name=package,
            module_path=executable,
            function_name=function_name,
        )
        docs_info.append(f"## {function_name}\n{description}\n")
    docs_info = "".join(docs_info)
    logging.info("[create_docs_info] END")
    return docs_info

def _include_titles_for_properties(
    properties: dict[str, dict],
    verbose: bool = False,
) -> dict[str, dict]:
    """
    Scan through properties of a JSON Schema, and set their title when it is
    missing.

    The title is set to `name.title()`, where `title` is a standard string
    method - see https://docs.python.org/3/library/stdtypes.html#str.title.

    Args:
        properties: The `properties` mapping of a JSON Schema, from property
            names to property schemas.
    """
    if verbose:
        logging.info(
            f"[_include_titles_for_properties] Original properties:\n"
            f"{properties}"
        )

    new_properties = properties.copy()
    for prop_name, prop in properties.items():
        if "title" not in prop.keys():
            new_prop = prop.copy()
            new_prop["title"] = prop_name.title()
            new_properties[prop_name] = new_prop
    if verbose:
        logging.info(
            f"[_include_titles_for_properties] New properties:\n"
            f"{new_properties}"
        )
    return new_properties

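The effect of `str.title()` on typical argument names is worth seeing once. The sketch below repeats the core loop on a small hand-written `properties` dictionary; the property names and schemas are made up for the example.

```python
def include_titles_for_properties(properties: dict[str, dict]) -> dict[str, dict]:
    """Return a copy of `properties` where every entry has a title."""
    new_properties = properties.copy()
    for prop_name, prop in properties.items():
        if "title" not in prop:
            # Copy the inner dictionary, so that the input is left untouched
            new_prop = prop.copy()
            new_prop["title"] = prop_name.title()
            new_properties[prop_name] = new_prop
    return new_properties


properties = {
    "wavelength_id": {"type": "string"},
    "label": {"type": "string", "title": "Channel label"},
}
result = include_titles_for_properties(properties)
print(result["wavelength_id"]["title"])  # Wavelength_Id
print(result["label"]["title"])          # Channel label
```

`str.title()` capitalizes the first letter after every non-letter character, so underscores are kept and `wavelength_id` becomes `Wavelength_Id`; existing titles are never overwritten.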
These models are used in task_list.py, and they provide a layer that
+simplifies writing the task list of a package in a way that is compliant with
+fractal-server v2.
This helper function is similar to write_table, in that it prepares the
appropriate zarr groups (labels and the new-label one) and performs
overwrite-dependent checks. Unlike write_table, this function does not
actually write the label array to the new zarr group; that write must take
place in the actual task function, since in fractal-tasks-core it is done
sequentially on different regions of the zarr array.

What this function does is:

1. Create the labels group, if needed.
2. If overwrite=False, check that the new label does not exist (either in
   zarr attributes or as a zarr sub-group).
3. Update the labels attribute of the image group.
4. If label_attrs is set, include this set of attributes in the new-label
   zarr group.

overwrite: If False, check that the new label does not exist (either in zarr
attributes or as a zarr sub-group); if True, propagate the parameter to the
create_group method, making it overwrite any existing sub-group with the
given name.

def prepare_label_group(
    image_group: zarr.hierarchy.Group,
    label_name: str,
    label_attrs: dict[str, Any],
    overwrite: bool = False,
    logger: Optional[logging.Logger] = None,
) -> zarr.hierarchy.Group:
    """
    Set the stage for writing labels to a zarr group

    This helper function is similar to `write_table`, in that it prepares the
    appropriate zarr groups (`labels` and the new-label one) and performs
    `overwrite`-dependent checks. Unlike `write_table`, this function does
    not actually write the label array to the new zarr group; that write
    must take place in the actual task function, since in fractal-tasks-core
    it is done sequentially on different `region`s of the zarr array.

    What this function does is:

    1. Create the `labels` group, if needed.
    2. If `overwrite=False`, check that the new label does not exist (either in
       zarr attributes or as a zarr sub-group).
    3. Update the `labels` attribute of the image group.
    4. If `label_attrs` is set, include this set of attributes in the
       new-label zarr group.

    Args:
        image_group:
            The group to write to.
        label_name:
            The name of the new label; this name also overrides the multiscale
            name in NGFF-image Zarr attributes, if needed.
        overwrite:
            If `False`, check that the new label does not exist (either in zarr
            attributes or as a zarr sub-group); if `True` propagate parameter
            to `create_group` method, making it overwrite any existing
            sub-group with the given name.
        label_attrs:
            Zarr attributes of the label-image group.
        logger:
            The logger to use (if unset, use `logging.getLogger(None)`).

    Returns:
        Zarr group of the new label.
    """

    # Set logger
    if logger is None:
        logger = logging.getLogger(None)

    # Create labels group (if needed) and extract current_labels
    if "labels" not in set(image_group.group_keys()):
        labels_group = image_group.create_group("labels", overwrite=False)
    else:
        labels_group = image_group["labels"]
    current_labels = labels_group.attrs.asdict().get("labels", [])

    # If overwrite=False, check that the new label does not exist (either as a
    # zarr sub-group or as part of the zarr-group attributes)
    if not overwrite:
        if label_name in set(labels_group.group_keys()):
            error_msg = (
                f"Sub-group '{label_name}' of group {image_group.store.path} "
                f"already exists, but `{overwrite=}`.\n"
                "Hint: try setting `overwrite=True`."
            )
            logger.error(error_msg)
            raise OverwriteNotAllowedError(error_msg)
        if label_name in current_labels:
            error_msg = (
                f"Item '{label_name}' already exists in `labels` attribute of "
                f"group {image_group.store.path}, but `{overwrite=}`.\n"
                "Hint: try setting `overwrite=True`."
            )
            logger.error(error_msg)
            raise OverwriteNotAllowedError(error_msg)

    # Update the `labels` metadata of the image group, if needed
    if label_name not in current_labels:
        new_labels = current_labels + [label_name]
        labels_group.attrs["labels"] = new_labels

    # Define new-label group
    label_group = labels_group.create_group(label_name, overwrite=overwrite)

    # Validate attrs against NGFF specs 0.4
    try:
        meta = NgffImageMeta(**label_attrs)
    except ValidationError as e:
        error_msg = (
            "Label attributes do not comply with NGFF image "
            "specifications, as encoded in fractal-tasks-core.\n"
            f"Original error:\nValidationError: {str(e)}"
        )
        logger.error(error_msg)
        raise ValueError(error_msg)
    # Replace multiscale name with label_name, if needed
    current_multiscale_name = meta.multiscale.name
    if current_multiscale_name != label_name:
        logger.warning(
            f"Setting multiscale name to '{label_name}' (old value: "
            f"'{current_multiscale_name}') in label-image NGFF "
            "attributes."
        )
        label_attrs["multiscales"][0]["name"] = label_name
    # Overwrite label_group attributes with label_attrs key/value pairs
    label_group.attrs.put(label_attrs)

    return label_group

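The overwrite-dependent bookkeeping can be followed without a zarr store. In this sketch, a set of sub-group names and a list of label names stand in for the zarr `labels` group and its `labels` attribute; `register_label` is a hypothetical helper, and `OverwriteNotAllowedError` is redefined locally for the example.

```python
class OverwriteNotAllowedError(RuntimeError):
    """Local stand-in for the error raised when overwriting is refused."""


def register_label(
    existing_subgroups: set[str],
    current_labels: list[str],
    label_name: str,
    overwrite: bool = False,
) -> list[str]:
    """Return the updated list of label names, enforcing `overwrite`."""
    if not overwrite:
        # The label may exist as a sub-group, in the attributes, or both
        if label_name in existing_subgroups:
            raise OverwriteNotAllowedError(
                f"Sub-group '{label_name}' already exists, but {overwrite=}."
            )
        if label_name in current_labels:
            raise OverwriteNotAllowedError(
                f"Item '{label_name}' already exists, but {overwrite=}."
            )
    # Only append the name if it is not already listed
    if label_name not in current_labels:
        return current_labels + [label_name]
    return current_labels


print(register_label(set(), ["organoids"], "nuclei"))
# ['organoids', 'nuclei']
```

With `overwrite=True` an existing name is accepted and the list of labels is returned unchanged, which matches step 3 above: the attribute is only updated when the name is new.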
def _postprocess_output(
    *,
    modified_array: np.ndarray,
    original_array: np.ndarray,
    background: np.ndarray,
) -> np.ndarray:
    """
    Postprocess cellpose output, mainly to restore its original background.

    **NOTE**: The pre/post-processing functions and the
    masked_loading_wrapper are currently meant to work as part of the
    cellpose_segmentation task, with the plan of then making them more
    flexible; see
    https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.

    Args:
        modified_array: The 3D (ZYX) array with the correct object data and
            wrong background data.
        original_array: The 3D (ZYX) array with the wrong object data and
            correct background data.
        background: The 3D (ZYX) boolean array that defines the background.

    Returns:
        The postprocessed array.
    """
    # Restore background
    modified_array[background] = original_array[background]
    return modified_array

def_preprocess_input(
+ image_array:np.ndarray,
+ *,
+ region:tuple[slice,...],
+ current_label_path:str,
+ ROI_table_path:str,
+ ROI_positional_index:int,
+)->tuple[np.ndarray,np.ndarray,np.ndarray]:
+"""
+ Preprocess a four-dimensional cellpose input.
+
+ This involves :
+
+ - Loading the masking label array for the appropriate ROI;
+ - Extracting the appropriate label value from the `ROI_table.obs`
+ dataframe;
+ - Constructing the background mask, where the masking label matches with a
+ specific label value;
+ - Setting the background of `image_array` to `0`;
+ - Loading the array which will be needed in postprocessing to restore
+ background.
+
+ **NOTE 1**: This function relies on V1 of the Fractal table specifications,
+ see
+ https://fractal-analytics-platform.github.io/fractal-tasks-core/tables/.
+
+ **NOTE 2**: The pre/post-processing functions and the
+ masked_loading_wrapper are currently meant to work as part of the
+ cellpose_segmentation task, with the plan of then making them more
+ flexible; see
+ https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.
+
+ Naming of variables refers to a two-steps labeling, as in "first identify
+ organoids, then look for nuclei inside each organoid") :
+
+ - `"masking"` refers to the labels that are used to identify the object
+ vs background (e.g. the organoid labels); these labels already exist.
+ - `"current"` refers to the labels that are currently being computed in
+ the `cellpose_segmentation` task, e.g. the nuclear labels.
+
+ Args:
+ image_array: The 4D CZYX array with image data for a specific ROI.
+ region: The ZYX indices of the ROI, in a form like
+ `(slice(0, 1), slice(1000, 2000), slice(1000, 2000))`.
+ current_label_path: Path to the image used as current label, in a form
+ like `/somewhere/plate.zarr/A/01/0/labels/nuclei_in_organoids/0`.
+ ROI_table_path: Path of the AnnData table for the masking-label ROIs;
+ this is used (together with `ROI_positional_index`) to extract
+ `label_value`.
+ ROI_positional_index: Index of the current ROI, which is used to
+ extract `label_value` from `ROI_table_obs`.
+ Returns:
+ A tuple with three arrays: the preprocessed image array, the background
+ mask, the current label.
+ """
+
+     logger.info(f"[_preprocess_input] {image_array.shape=}")
+     logger.info(f"[_preprocess_input] {region=}")
+
+     # Check that image data are 4D (CZYX) - FIXME issue 340
+     if not image_array.ndim == 4:
+         raise ValueError(
+             "_preprocess_input requires a 4D "
+             f"image_array argument, but {image_array.shape=}"
+         )
+
+     # Load the ROI table and its metadata attributes
+     ROI_table = ad.read_zarr(ROI_table_path)
+     attrs = zarr.group(ROI_table_path).attrs
+     logger.info(f"[_preprocess_input] {ROI_table_path=}")
+     logger.info(f"[_preprocess_input] {attrs.asdict()=}")
+     MaskingROITableAttrs(**attrs.asdict())
+     label_relative_path = attrs["region"]["path"]
+     column_name = attrs["instance_key"]
+
+     # Check that ROI_table.obs has the right column and extract label_value
+     if column_name not in ROI_table.obs.columns:
+         raise ValueError(
+             f'In _preprocess_input, "{column_name}" '
+             f"missing in {ROI_table.obs.columns=}"
+         )
+     label_value = int(ROI_table.obs[column_name][ROI_positional_index])
+
+     # Load masking-label array (lazily)
+     masking_label_path = str(
+         Path(ROI_table_path).parent / label_relative_path / "0"
+     )
+     logger.info(f"{masking_label_path=}")
+     masking_label_array = da.from_zarr(masking_label_path)
+     logger.info(
+         f"[_preprocess_input] {masking_label_path=}, "
+         f"{masking_label_array.shape=}"
+     )
+
+     # Load current-label array (lazily)
+     current_label_array = da.from_zarr(current_label_path)
+     logger.info(
+         f"[_preprocess_input] {current_label_path=}, "
+         f"{current_label_array.shape=}"
+     )
+
+     # Load ROI data for current label array
+     current_label_region = current_label_array[region].compute()
+
+     # Load ROI data for masking label array, with or without upscaling
+     if masking_label_array.shape != current_label_array.shape:
+         logger.info("Upscaling of masking label is needed")
+         lowres_region = convert_region_to_low_res(
+             highres_region=region,
+             highres_shape=current_label_array.shape,
+             lowres_shape=masking_label_array.shape,
+         )
+         masking_label_region = masking_label_array[lowres_region].compute()
+         masking_label_region = upscale_array(
+             array=masking_label_region,
+             target_shape=current_label_region.shape,
+         )
+     else:
+         masking_label_region = masking_label_array[region].compute()
+
+     # Check that all shapes match
+     shapes = (
+         masking_label_region.shape,
+         current_label_region.shape,
+         image_array.shape[1:],
+     )
+     if len(set(shapes)) > 1:
+         raise ValueError(
+             "Shape mismatch:\n"
+             f"{current_label_region.shape=}\n"
+             f"{masking_label_region.shape=}\n"
+             f"{image_array.shape=}"
+         )
+
+     # Compute background mask
+     background_3D = masking_label_region != label_value
+     if (masking_label_region == label_value).sum() == 0:
+         raise ValueError(
+             f"Label {label_value} is not present in the extracted ROI"
+         )
+
+     # Set image background to zero
+     n_channels = image_array.shape[0]
+     for i in range(n_channels):
+         image_array[i, background_3D] = 0
+
+     return (image_array, background_3D, current_label_region)
+
def masked_loading_wrapper(
+     *,
+     function: Callable,
+     image_array: np.ndarray,
+     kwargs: Optional[dict] = None,
+     use_masks: bool,
+     preprocessing_kwargs: Optional[dict] = None,
+ ):
+     """
+     Wrap a function with some pre/post-processing functions.
+
+     Args:
+         function: The callable function to be wrapped.
+         image_array: The image array to be preprocessed and then used as
+             positional argument for `function`.
+         kwargs: Keyword arguments for `function`.
+         use_masks: If `False`, the wrapper only calls
+             `function(image_array, **kwargs)`.
+         preprocessing_kwargs: Keyword arguments for the preprocessing
+             function (see call signature of `_preprocess_input()`).
+     """
+     # Optional preprocessing
+     if use_masks:
+         preprocessing_kwargs = preprocessing_kwargs or {}
+         (
+             image_array,
+             background_3D,
+             current_label_region,
+         ) = _preprocess_input(image_array, **preprocessing_kwargs)
+     # Run function
+     kwargs = kwargs or {}
+     new_label_img = function(image_array, **kwargs)
+     # Optional postprocessing
+     if use_masks:
+         new_label_img = _postprocess_output(
+             modified_array=new_label_img,
+             original_array=current_label_region,
+             background=background_3D,
+         )
+     return new_label_img
+
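The mask, run, restore pattern used by `masked_loading_wrapper` can be sketched with plain nested lists. This is only an illustration under stated assumptions: the helper names (`mask_background`, `restore_background`, `masked_call`) are hypothetical, and the postprocessing step is assumed to put the previous labels back on background pixels, which is not shown in the source above.

```python
from typing import Callable

Grid = list[list[int]]


def mask_background(image: Grid, masking_labels: Grid, label_value: int):
    """Zero out pixels that do not belong to the object with `label_value`."""
    background = [[v != label_value for v in row] for row in masking_labels]
    masked = [
        [0 if is_bg else px for px, is_bg in zip(img_row, bg_row)]
        for img_row, bg_row in zip(image, background)
    ]
    return masked, background


def restore_background(new_labels: Grid, old_labels: Grid, background) -> Grid:
    """Assumed postprocessing: keep the old labels on background pixels."""
    return [
        [old if is_bg else new for new, old, is_bg in zip(n_row, o_row, b_row)]
        for n_row, o_row, b_row in zip(new_labels, old_labels, background)
    ]


def masked_call(
    function: Callable[[Grid], Grid],
    image: Grid,
    masking_labels: Grid,
    label_value: int,
    old_labels: Grid,
) -> Grid:
    """Preprocess, run `function` on the masked image, then postprocess."""
    masked, background = mask_background(image, masking_labels, label_value)
    new_labels = function(masked)
    return restore_background(new_labels, old_labels, background)


def toy_segmentation(img: Grid) -> Grid:
    """Toy stand-in for a segmentation function: label non-zero pixels as 7."""
    return [[7 if px > 0 else 0 for px in row] for row in img]


image = [[5, 5, 5], [5, 5, 5]]
masking_labels = [[1, 1, 2], [0, 1, 2]]  # object 1 is the one being processed
old_labels = [[0, 0, 3], [0, 0, 3]]      # labels computed for other objects
result = masked_call(toy_segmentation, image, masking_labels, 1, old_labels)
print(result)
```

Pixels outside object 1 never reach the wrapped function, and whatever was stored for them before is left untouched in the result.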
class AcquisitionInPlate(BaseModel):
+     """
+     Model for an element of `Plate.acquisitions`.
+
+     See https://ngff.openmicroscopy.org/0.4/#plate-md.
+     """
+
+     id: int = Field(
+         description="A unique identifier within the context of the plate"
+     )
+     maximumfieldcount: Optional[int] = Field(
+         None,
+         description=(
+             "Int indicating the maximum number of fields of view for the "
+             "acquisition"
+         ),
+     )
+     name: Optional[str] = Field(
+         None, description="a string identifying the name of the acquisition"
+     )
+     description: Optional[str] = Field(
+         None,
+         description="The description of the acquisition",
+     )
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class Axis(BaseModel):
+     """
+     Model for an element of `Multiscale.axes`.
+
+     See https://ngff.openmicroscopy.org/0.4/#axes-md.
+     """
+
+     name: str
+     type: Optional[str] = None
+     unit: Optional[str] = None
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class Channel(BaseModel):
+     """
+     Model for an element of `Omero.channels`.
+
+     See https://ngff.openmicroscopy.org/0.4/#omero-md.
+     """
+
+     window: Optional[Window] = None
+     label: Optional[str] = None
+     family: Optional[str] = None
+     color: str
+     active: Optional[bool] = None
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class ColumnInPlate(BaseModel):
+     """
+     Model for an element of `Plate.columns`.
+
+     See https://ngff.openmicroscopy.org/0.4/#plate-md.
+     """
+
+     name: str
+
Note 1: The NGFF image is defined in a different model
+(NgffImageMeta), while the ImageInWell model only refers to an item of
+Well.images.
+
Note 2: We deviate from NGFF specs, since we allow path to be an
+arbitrary string.
+TODO: include a check like constr(regex=r'^[A-Za-z0-9]+$'), through a
+Pydantic validator.
class ImageInWell(BaseModel):
+     """
+     Model for an element of `Well.images`.
+
+     **Note 1:** The NGFF image is defined in a different model
+     (`NgffImageMeta`), while the `ImageInWell` model only refers to an item
+     of `Well.images`.
+
+     **Note 2:** We deviate from NGFF specs, since we allow `path` to be an
+     arbitrary string.
+     TODO: include a check like `constr(regex=r'^[A-Za-z0-9]+$')`, through a
+     Pydantic validator.
+
+     See https://ngff.openmicroscopy.org/0.4/#well-md.
+     """
+
+     acquisition: Optional[int] = Field(
+         None, description="A unique identifier within the context of the plate"
+     )
+     path: str = Field(
+         ..., description="The path for this field of view subgroup"
+     )
+
class Multiscale(BaseModel):
+     """
+     Model for an element of `NgffImageMeta.multiscales`.
+
+     See https://ngff.openmicroscopy.org/0.4/#multiscale-md.
+     """
+
+     name: Optional[str] = None
+     datasets: list[Dataset] = Field(..., min_items=1)
+     version: Optional[str] = None
+     axes: list[Axis] = Field(..., max_items=5, min_items=2, unique_items=True)
+     coordinateTransformations: Optional[
+         list[
+             Union[
+                 ScaleCoordinateTransformation,
+                 TranslationCoordinateTransformation,
+             ]
+         ]
+     ] = None
+
+     @validator("coordinateTransformations", always=True)
+     def _no_global_coordinateTransformations(cls, v):
+         """
+         Fail if Multiscale has a (global) coordinateTransformations attribute.
+         """
+         if v is not None:
+             raise NotImplementedError(
+                 "Global coordinateTransformations at the multiscales "
+                 "level are not currently supported in the fractal-tasks-core "
+                 "model for the NGFF multiscale."
+             )
+
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class NgffPlateMeta(BaseModel):
+     """
+     Model for the metadata of a NGFF plate.
+
+     See https://ngff.openmicroscopy.org/0.4/#plate-md.
+     """
+
+     plate: Plate
+
class NgffWellMeta(BaseModel):
+     """
+     Model for the metadata of a NGFF well.
+
+     See https://ngff.openmicroscopy.org/0.4/#well-md.
+     """
+
+     well: Optional[Well] = None
+
+     def get_acquisition_paths(self) -> dict[int, list[str]]:
+         """
+         Create mapping from acquisition indices to corresponding paths.
+
+         Runs on the well zarr attributes and loads the relative paths in the
+         well.
+
+         Returns:
+             Dictionary with `(acquisition index: [image_path])` key/value
+             pairs.
+
+         Raises:
+             ValueError:
+                 If an element of `self.well.images` has no `acquisition`
+                 attribute.
+         """
+         acquisition_dict = {}
+         for image in self.well.images:
+             if image.acquisition is None:
+                 raise ValueError(
+                     "Cannot get acquisition paths for Zarr files without "
+                     "'acquisition' metadata at the well level"
+                 )
+             if image.acquisition not in acquisition_dict:
+                 acquisition_dict[image.acquisition] = []
+             acquisition_dict[image.acquisition].append(image.path)
+         return acquisition_dict
+
+
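The grouping done by `get_acquisition_paths` can be illustrated without Pydantic. The sketch below uses plain dictionaries in place of `ImageInWell` objects; the function name `group_paths_by_acquisition` and the sample data are hypothetical.

```python
def group_paths_by_acquisition(images: list[dict]) -> dict[int, list[str]]:
    """Map each acquisition index to the list of image paths that use it."""
    grouped: dict[int, list[str]] = {}
    for image in images:
        acquisition = image.get("acquisition")
        if acquisition is None:
            raise ValueError(
                "Cannot group images without 'acquisition' metadata"
            )
        # Several images may share one acquisition, hence a list per key
        grouped.setdefault(acquisition, []).append(image["path"])
    return grouped


well_images = [
    {"path": "0", "acquisition": 0},
    {"path": "1", "acquisition": 1},
    {"path": "2", "acquisition": 0},
]
paths = group_paths_by_acquisition(well_images)
print(paths)
```

Because the values are lists, a well holding two images of the same acquisition is handled without raising an error.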
class Plate(BaseModel):
+     """
+     Model for `NgffPlateMeta.plate`.
+
+     See https://ngff.openmicroscopy.org/0.4/#plate-md.
+     """
+
+     acquisitions: list[AcquisitionInPlate]
+     columns: list[ColumnInPlate]
+     field_count: Optional[int]
+     name: Optional[str]
+     rows: list[RowInPlate]
+     # version will become required in 0.5
+     version: Optional[str] = Field(
+         None, description="The version of the specification"
+     )
+     wells: list[WellInPlate]
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class ScaleCoordinateTransformation(BaseModel):
+     """
+     Model for a scale transformation.
+
+     This corresponds to scale-type elements of
+     `Dataset.coordinateTransformations` or
+     `Multiscale.coordinateTransformations`.
+     See https://ngff.openmicroscopy.org/0.4/#trafo-md
+     """
+
+     type: Literal["scale"]
+     scale: list[float] = Field(..., min_items=2)
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class TranslationCoordinateTransformation(BaseModel):
+     """
+     Model for a translation transformation.
+
+     This corresponds to translation-type elements of
+     `Dataset.coordinateTransformations` or
+     `Multiscale.coordinateTransformations`.
+     See https://ngff.openmicroscopy.org/0.4/#trafo-md
+     """
+
+     type: Literal["translation"]
+     translation: list[float] = Field(..., min_items=2)
+
class Well(BaseModel):
+     """
+     Model for `NgffWellMeta.well`.
+
+     See https://ngff.openmicroscopy.org/0.4/#well-md.
+     """
+
+     images: list[ImageInWell] = Field(
+         ...,
+         description="The images included in this well",
+         min_items=1,
+         unique_items=True,
+     )
+     version: Optional[str] = Field(
+         None, description="The version of the specification"
+     )
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class WellInPlate(BaseModel):
+     """
+     Model for an element of `Plate.wells`.
+
+     See https://ngff.openmicroscopy.org/0.4/#plate-md.
+     """
+
+     path: str
+     rowIndex: int
+     columnIndex: int
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class Window(BaseModel):
+     """
+     Model for `Channel.window`.
+
+     Note that we deviate from NGFF specs by making `start` and `end` optional.
+     See https://ngff.openmicroscopy.org/0.4/#omero-md.
+     """
+
+     max: float
+     min: float
+     start: Optional[float] = None
+     end: Optional[float] = None
+
This is used to provide a user-friendly error message.
+
+
+ Source code in fractal_tasks_core/ngff/zarr_utils.py
+
class ZarrGroupNotFoundError(ValueError):
+     """
+     Wrap zarr.errors.GroupNotFoundError
+
+     This is used to provide a user-friendly error message.
+     """
+
+     pass
+
def detect_ome_ngff_type(group: zarr.hierarchy.Group) -> str:
+     """
+     Given a Zarr group, find whether it is an OME-NGFF plate, well or image.
+
+     Args:
+         group: Zarr group
+
+     Returns:
+         The detected OME-NGFF type (`plate`, `well` or `image`).
+     """
+     attrs = group.attrs.asdict()
+     if "plate" in attrs.keys():
+         ngff_type = "plate"
+     elif "well" in attrs.keys():
+         ngff_type = "well"
+     elif "multiscales" in attrs.keys():
+         ngff_type = "image"
+     else:
+         error_msg = (
+             "Zarr group cannot be identified as one "
+             "of OME-NGFF plate/well/image groups."
+         )
+         logger.error(error_msg)
+         raise ValueError(error_msg)
+     logger.info(f"Zarr group identified as OME-NGFF {ngff_type}.")
+     return ngff_type
+
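The detection logic only inspects the top-level attribute keys, so it can be tried on a plain dictionary without any Zarr store. The following is a minimal sketch; `detect_type_from_attrs` is a hypothetical stand-in that takes the already-loaded attributes instead of a `zarr` group.

```python
def detect_type_from_attrs(attrs: dict) -> str:
    """Return "plate", "well" or "image" based on top-level attribute keys."""
    # Keys are checked in this order, so "plate" wins if several are present
    for key, ngff_type in (
        ("plate", "plate"),
        ("well", "well"),
        ("multiscales", "image"),
    ):
        if key in attrs:
            return ngff_type
    raise ValueError("Attributes do not match an OME-NGFF plate/well/image.")


plate_type = detect_type_from_attrs({"plate": {"wells": []}})
image_type = detect_type_from_attrs({"multiscales": [], "omero": {}})
print(plate_type, image_type)
```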
def load_NgffImageMeta(zarr_path: str) -> NgffImageMeta:
+     """
+     Load the attributes of a zarr group and cast them to `NgffImageMeta`.
+
+     Args:
+         zarr_path: Path to the zarr group.
+
+     Returns:
+         A new `NgffImageMeta` object.
+     """
+     try:
+         zarr_group = zarr.open_group(zarr_path, mode="r")
+     except GroupNotFoundError:
+         error_msg = (
+             "Could not load attributes for the requested image, "
+             f"because no Zarr group was found at {zarr_path}"
+         )
+         logging.error(error_msg)
+         raise ZarrGroupNotFoundError(error_msg)
+     zarr_attrs = zarr_group.attrs.asdict()
+     try:
+         return NgffImageMeta(**zarr_attrs)
+     except Exception as e:
+         logging.error(
+             f"Contents of {zarr_path} cannot be cast to NgffImageMeta.\n"
+             f"Original error:\n{str(e)}"
+         )
+         raise e
+
def load_NgffPlateMeta(zarr_path: str) -> NgffPlateMeta:
+     """
+     Load the attributes of a zarr group and cast them to `NgffPlateMeta`.
+
+     Args:
+         zarr_path: Path to the zarr group.
+
+     Returns:
+         A new `NgffPlateMeta` object.
+     """
+     try:
+         zarr_group = zarr.open_group(zarr_path, mode="r")
+     except GroupNotFoundError:
+         error_msg = (
+             "Could not load attributes for the requested plate, "
+             f"because no Zarr group was found at {zarr_path}"
+         )
+         logging.error(error_msg)
+         raise ZarrGroupNotFoundError(error_msg)
+     zarr_attrs = zarr_group.attrs.asdict()
+     try:
+         return NgffPlateMeta(**zarr_attrs)
+     except Exception as e:
+         logging.error(
+             f"Contents of {zarr_path} cannot be cast to NgffPlateMeta.\n"
+             f"Original error:\n{str(e)}"
+         )
+         raise e
+
def load_NgffWellMeta(zarr_path: str) -> NgffWellMeta:
+     """
+     Load the attributes of a zarr group and cast them to `NgffWellMeta`.
+
+     Args:
+         zarr_path: Path to the zarr group.
+
+     Returns:
+         A new `NgffWellMeta` object.
+     """
+     try:
+         zarr_group = zarr.open_group(zarr_path, mode="r")
+     except GroupNotFoundError:
+         error_msg = (
+             "Could not load attributes for the requested well, "
+             f"because no Zarr group was found at {zarr_path}"
+         )
+         logging.error(error_msg)
+         raise ZarrGroupNotFoundError(error_msg)
+     zarr_attrs = zarr_group.attrs.asdict()
+     try:
+         return NgffWellMeta(**zarr_attrs)
+     except Exception as e:
+         logging.error(
+             f"Contents of {zarr_path} cannot be cast to NgffWellMeta.\n"
+             f"Original error:\n{str(e)}"
+         )
+         raise e
+
Starting from on-disk highest-resolution data, build and write to disk a
+pyramid with (num_levels - 1) coarsened levels.
+This function works for 2D, 3D or 4D arrays.
+
PARAMETER    DESCRIPTION
zarrurl      Path of the image zarr group, not including the
+             multiscale-level path (e.g. "some/path/plate.zarr/B/03/0").
def _is_overlapping_1D_int(
+     line1: Sequence[int],
+     line2: Sequence[int],
+ ) -> bool:
+     """
+     Given two integer intervals, find whether they overlap.
+
+     This is the same as `is_overlapping_1D` (based on
+     https://stackoverflow.com/a/70023212/19085332), for integer-valued
+     intervals.
+
+     Args:
+         line1: The boundaries of the first interval, written as
+             `[x_min, x_max]`.
+         line2: The boundaries of the second interval, written as
+             `[x_min, x_max]`.
+     """
+     return line1[0] < line2[1] and line2[0] < line1[1]
+
def _is_overlapping_3D_int(box1: list[int], box2: list[int]) -> bool:
+     """
+     Given two three-dimensional integer boxes, find whether they overlap.
+
+     This is the same as `is_overlapping_3D` (based on
+     https://stackoverflow.com/a/70023212/19085332), for integer-valued
+     boxes.
+
+     Args:
+         box1: The boundaries of the first box, written as
+             `[x_min, y_min, z_min, x_max, y_max, z_max]`.
+         box2: The boundaries of the second box, written as
+             `[x_min, y_min, z_min, x_max, y_max, z_max]`.
+     """
+     overlap_x = _is_overlapping_1D_int([box1[0], box1[3]], [box2[0], box2[3]])
+     overlap_y = _is_overlapping_1D_int([box1[1], box1[4]], [box2[1], box2[4]])
+     overlap_z = _is_overlapping_1D_int([box1[2], box1[5]], [box2[2], box2[5]])
+     return overlap_x and overlap_y and overlap_z
+
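A short standalone sketch of the integer overlap test may help to see how the box layout `[x_min, y_min, z_min, x_max, y_max, z_max]` is split into three 1D checks. The names `intervals_overlap` and `boxes_overlap` are illustrative, not part of the library.

```python
from typing import Sequence


def intervals_overlap(a: Sequence[int], b: Sequence[int]) -> bool:
    """Strict comparison: intervals sharing only an endpoint do not overlap."""
    return a[0] < b[1] and b[0] < a[1]


def boxes_overlap(box1: Sequence[int], box2: Sequence[int]) -> bool:
    """Boxes are written as [x_min, y_min, z_min, x_max, y_max, z_max]."""
    # Element i is the lower bound and element i + 3 the upper bound of axis i
    return all(
        intervals_overlap([box1[i], box1[i + 3]], [box2[i], box2[i + 3]])
        for i in range(3)
    )


unit_box = [0, 0, 0, 10, 10, 10]
shifted_box = [5, 5, 5, 15, 15, 15]    # intersects unit_box on all three axes
adjacent_box = [10, 0, 0, 20, 10, 10]  # touches unit_box on the x = 10 face

overlapping = boxes_overlap(unit_box, shifted_box)
touching = boxes_overlap(unit_box, adjacent_box)
print(overlapping, touching)
```

Two boxes overlap only if their intervals overlap along every axis, so boxes that merely share a face are reported as non-overlapping.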
def is_overlapping_1D(
+     line1: Sequence[float], line2: Sequence[float], tol: float = 1e-10
+ ) -> bool:
+     """
+     Given two intervals, finds whether they overlap.
+
+     This is based on https://stackoverflow.com/a/70023212/19085332, and we
+     additionally use a finite tolerance for floating-point comparisons.
+
+     Args:
+         line1: The boundaries of the first interval, written as
+             `[x_min, x_max]`.
+         line2: The boundaries of the second interval, written as
+             `[x_min, x_max]`.
+         tol: Finite tolerance for floating-point comparisons.
+     """
+     return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol
+
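The effect of the tolerance is easiest to see on intervals that overlap by less than `tol`. The sketch below re-implements the same comparison under a hypothetical name (`overlaps_with_tolerance`) so that it runs on its own.

```python
from typing import Sequence


def overlaps_with_tolerance(
    line1: Sequence[float], line2: Sequence[float], tol: float = 1e-10
) -> bool:
    """Same comparison as above: an overlap smaller than `tol` is ignored."""
    return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol


clear_overlap = overlaps_with_tolerance([0.0, 1.0], [0.5, 2.0])
shared_endpoint = overlaps_with_tolerance([0.0, 1.0], [1.0, 2.0])
# The first interval exceeds 1.0 by 1e-12, e.g. due to rounding noise
tiny_overlap = overlaps_with_tolerance([0.0, 1.0 + 1e-12], [1.0, 2.0])
tiny_overlap_no_tol = overlaps_with_tolerance(
    [0.0, 1.0 + 1e-12], [1.0, 2.0], tol=0.0
)
print(clear_overlap, shared_endpoint, tiny_overlap, tiny_overlap_no_tol)
```

With the default tolerance, two adjacent FOVs whose boundaries differ only by floating-point noise are not flagged as overlapping; setting `tol=0.0` in this sketch makes the same pair count as an overlap.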
def is_overlapping_2D(
+     box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10
+ ) -> bool:
+     """
+     Given two rectangular boxes, finds whether they overlap.
+
+     This is based on https://stackoverflow.com/a/70023212/19085332, and we
+     additionally use a finite tolerance for floating-point comparisons.
+
+     Args:
+         box1: The boundaries of the first rectangle, written as
+             `[x_min, y_min, x_max, y_max]`.
+         box2: The boundaries of the second rectangle, written as
+             `[x_min, y_min, x_max, y_max]`.
+         tol: Finite tolerance for floating-point comparisons.
+     """
+     overlap_x = is_overlapping_1D(
+         [box1[0], box1[2]], [box2[0], box2[2]], tol=tol
+     )
+     overlap_y = is_overlapping_1D(
+         [box1[1], box1[3]], [box2[1], box2[3]], tol=tol
+     )
+     return overlap_x and overlap_y
+
def is_overlapping_3D(
+     box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10
+ ) -> bool:
+     """
+     Given two three-dimensional boxes, finds whether they overlap.
+
+     This is based on https://stackoverflow.com/a/70023212/19085332, and we
+     additionally use a finite tolerance for floating-point comparisons.
+
+     Args:
+         box1: The boundaries of the first box, written as
+             `[x_min, y_min, z_min, x_max, y_max, z_max]`.
+         box2: The boundaries of the second box, written as
+             `[x_min, y_min, z_min, x_max, y_max, z_max]`.
+         tol: Finite tolerance for floating-point comparisons.
+     """
+
+     overlap_x = is_overlapping_1D(
+         [box1[0], box1[3]], [box2[0], box2[3]], tol=tol
+     )
+     overlap_y = is_overlapping_1D(
+         [box1[1], box1[4]], [box2[1], box2[4]], tol=tol
+     )
+     overlap_z = is_overlapping_1D(
+         [box1[2], box1[5]], [box2[2], box2[5]], tol=tol
+     )
+     return overlap_x and overlap_y and overlap_z
+
def load_region(
+     data_zyx: da.Array,
+     region: tuple[slice, slice, slice],
+     compute: bool = True,
+     return_as_3D: bool = False,
+ ) -> Union[da.Array, np.ndarray]:
+     """
+     Load a region from a dask array.
+
+     Can handle both 2D and 3D dask arrays as input and return them as is or
+     always as a 3D array.
+
+     Args:
+         data_zyx: Dask array (2D or 3D).
+         region: Region to load, tuple of three slices (ZYX).
+         compute: Whether to compute the result. If `True`, returns a numpy
+             array. If `False`, returns a dask array.
+         return_as_3D: Whether to return a 3D array, even if the input is 2D.
+
+     Returns:
+         The loaded region. This is a 3D array if the input is 3D or if
+         `return_as_3D=True`, and a 2D array otherwise.
+     """
+
+     if len(region) != 3:
+         raise ValueError(
+             f"In `load_region`, `region` must have three elements "
+             f"(given: {len(region)})."
+         )
+
+     if len(data_zyx.shape) == 3:
+         img = data_zyx[region]
+     elif len(data_zyx.shape) == 2:
+         # For 2D input, ignore the Z slice and only use the YX slices
+         img = data_zyx[(region[1], region[2])]
+         if return_as_3D:
+             img = np.expand_dims(img, axis=0)
+     else:
+         raise ValueError(
+             f"Shape {data_zyx.shape} not supported for `load_region`"
+         )
+     if compute:
+         return img.compute()
+     else:
+         return img
+
DataFrame with each line representing the bounding-box ROI that
+corresponds to a unique value of mask_array. ROI properties are
+expressed in physical units (with columns defined as elsewhere in this
+module - see e.g. prepare_well_ROI_table), and positions are
+optionally shifted (if origin_zyx is set). An additional column
+label keeps track of the mask_array value corresponding to each
+ROI.
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/roi/v1.py
+
Nested list of indices. The main list has one item per ROI. Each ROI
+item is a list of six integers as in [start_z, end_z, start_y,
+end_y, start_x, end_x]. The array-index interval for a given ROI
+is start_x:end_x along X, and so on for Y and Z.
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/roi/v1.py
+
def convert_ROI_table_to_indices(
+     ROI: ad.AnnData,
+     full_res_pxl_sizes_zyx: Sequence[float],
+     level: int = 0,
+     coarsening_xy: int = 2,
+     cols_xyz_pos: Sequence[str] = [
+         "x_micrometer",
+         "y_micrometer",
+         "z_micrometer",
+     ],
+     cols_xyz_len: Sequence[str] = [
+         "len_x_micrometer",
+         "len_y_micrometer",
+         "len_z_micrometer",
+     ],
+ ) -> list[list[int]]:
+     """
+     Convert a ROI AnnData table into integer array indices.
+
+     Args:
+         ROI: AnnData table with list of ROIs.
+         full_res_pxl_sizes_zyx:
+             Physical-unit pixel ZYX sizes at the full-resolution pyramid
+             level.
+         level: Pyramid level.
+         coarsening_xy: Linear coarsening factor in the YX plane.
+         cols_xyz_pos: Column names for XYZ ROI positions.
+         cols_xyz_len: Column names for XYZ ROI edges.
+
+     Raises:
+         ValueError:
+             If any of the array indices is negative.
+
+     Returns:
+         Nested list of indices. The main list has one item per ROI. Each ROI
+         item is a list of six integers as in `[start_z, end_z, start_y,
+         end_y, start_x, end_x]`. The array-index interval for a given ROI
+         is `start_x:end_x` along X, and so on for Y and Z.
+     """
+     # Handle empty ROI table
+     if len(ROI) == 0:
+         return []
+
+     # Set pyramid-level pixel sizes
+     pxl_size_z, pxl_size_y, pxl_size_x = full_res_pxl_sizes_zyx
+     prefactor = coarsening_xy**level
+     pxl_size_x *= prefactor
+     pxl_size_y *= prefactor
+
+     x_pos, y_pos, z_pos = cols_xyz_pos[:]
+     x_len, y_len, z_len = cols_xyz_len[:]
+
+     list_indices = []
+     for ROI_name in ROI.obs_names:
+         # Extract data from anndata table
+         x_micrometer = ROI[ROI_name, x_pos].X[0, 0]
+         y_micrometer = ROI[ROI_name, y_pos].X[0, 0]
+         z_micrometer = ROI[ROI_name, z_pos].X[0, 0]
+         len_x_micrometer = ROI[ROI_name, x_len].X[0, 0]
+         len_y_micrometer = ROI[ROI_name, y_len].X[0, 0]
+         len_z_micrometer = ROI[ROI_name, z_len].X[0, 0]
+
+         # Identify indices along the three dimensions
+         start_x = x_micrometer / pxl_size_x
+         end_x = (x_micrometer + len_x_micrometer) / pxl_size_x
+         start_y = y_micrometer / pxl_size_y
+         end_y = (y_micrometer + len_y_micrometer) / pxl_size_y
+         start_z = z_micrometer / pxl_size_z
+         end_z = (z_micrometer + len_z_micrometer) / pxl_size_z
+         indices = [start_z, end_z, start_y, end_y, start_x, end_x]
+
+         # Round indices to the nearest integer
+         indices = list(map(round, indices))
+
+         # Fail for negative indices
+         if min(indices) < 0:
+             raise ValueError(
+                 f"ROI {ROI_name} converted into negative array indices.\n"
+                 f"ZYX position: {z_micrometer}, {y_micrometer}, "
+                 f"{x_micrometer}\n"
+                 f"ZYX pixel sizes: {pxl_size_z}, {pxl_size_y}, "
+                 f"{pxl_size_x} ({level=})\n"
+                 "Hint: As of fractal-tasks-core v0.12, FOV/well ROI "
+                 "tables with non-zero origins (e.g. the ones created with "
+                 "v0.11) are not supported."
+             )
+
+         # Append ROI indices to the list
+         list_indices.append(indices[:])
+
+     return list_indices
+
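The arithmetic behind the conversion can be followed on a single ROI with plain floats. This is a simplified sketch, not the library function: `roi_to_indices` is a hypothetical helper that takes ZYX tuples instead of an AnnData table, and the numbers are made up.

```python
def roi_to_indices(
    pos_zyx: tuple[float, float, float],
    len_zyx: tuple[float, float, float],
    full_res_pxl_sizes_zyx: tuple[float, float, float],
    level: int = 0,
    coarsening_xy: int = 2,
) -> list[int]:
    """Return [start_z, end_z, start_y, end_y, start_x, end_x] for one ROI."""
    pxl_z, pxl_y, pxl_x = full_res_pxl_sizes_zyx
    # Only Y and X pixel sizes grow with the pyramid level; Z is unchanged
    prefactor = coarsening_xy**level
    pxl_sizes = (pxl_z, pxl_y * prefactor, pxl_x * prefactor)
    indices = []
    for pos, length, pxl_size in zip(pos_zyx, len_zyx, pxl_sizes):
        indices.append(round(pos / pxl_size))
        indices.append(round((pos + length) / pxl_size))
    if min(indices) < 0:
        raise ValueError("ROI converted into negative array indices.")
    return indices


# ROI starting at (z, y, x) = (0, 100, 200) micrometers,
# with edges of (10, 50, 50) micrometers
position = (0.0, 100.0, 200.0)
edges = (10.0, 50.0, 50.0)
pixel_sizes = (1.0, 0.5, 0.5)  # micrometers per pixel at full resolution

level_0 = roi_to_indices(position, edges, pixel_sizes, level=0)
level_1 = roi_to_indices(position, edges, pixel_sizes, level=1)
print(level_0)
print(level_1)
```

Going from level 0 to level 1 halves the Y and X indices (with `coarsening_xy=2`) and leaves the Z indices unchanged.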
def convert_ROIs_from_3D_to_2D(
+     adata: ad.AnnData,
+     pixel_size_z: float,
+ ) -> ad.AnnData:
+     """
+     TBD
+
+     Note that this function is only relevant when the ROIs in adata span the
+     whole extent of the Z axis.
+     TODO: check this explicitly.
+
+     Args:
+         adata: TBD
+         pixel_size_z: TBD
+     """
+
+     # Compress a 3D stack of images to a single Z plane,
+     # with thickness equal to pixel_size_z
+     df = adata.to_df()
+     df["len_z_micrometer"] = pixel_size_z
+
+     # Assign dtype explicitly, to avoid
+     # >> UserWarning: X converted to numpy array with dtype float64
+     # when creating AnnData object
+     df = df.astype(np.float32)
+
+     # Create an AnnData object directly from the DataFrame
+     new_adata = ad.AnnData(X=df)
+
+     # Rename rows and columns
+     new_adata.obs_names = adata.obs_names
+     new_adata.var_names = list(map(str, df.columns))
+
+     return new_adata
+
Construct an empty bounding-box ROI table of given shape.
+
This function mirrors the functionality of array_to_bounding_box_table,
+for the specific case where the array includes no label. The advantages of
+this function are that:
+
+
It does not require computing a whole array of zeros;
+
We avoid hardcoding column names in the task functions.
+
+
+
+
+
+
+
+
RETURNS      DESCRIPTION
DataFrame    DataFrame with no rows, and with columns corresponding to the
+             output of array_to_bounding_box_table.
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/roi/v1.py
+
def empty_bounding_box_table() -> pd.DataFrame:
+     """
+     Construct an empty bounding-box ROI table of given shape.
+
+     This function mirrors the functionality of `array_to_bounding_box_table`,
+     for the specific case where the array includes no label. The advantages of
+     this function are that:
+
+     1. It does not require computing a whole array of zeros;
+     2. We avoid hardcoding column names in the task functions.
+
+     Returns:
+         DataFrame with no rows, and with columns corresponding to the output
+         of `array_to_bounding_box_table`.
+     """
+
+     df_columns = [
+         "x_micrometer",
+         "y_micrometer",
+         "z_micrometer",
+         "len_x_micrometer",
+         "len_y_micrometer",
+         "len_z_micrometer",
+     ]
+     df = pd.DataFrame(columns=[x for x in df_columns] + ["label"])
+     return df
+
Produce a table with ROIs placed on a rectangular grid.
+
The main goal of this ROI grid is to allow processing of smaller subsets of
+the whole array.
+
In a specific case (that is, if the image array was obtained by stitching
+together a set of FOVs placed on a regular grid), the ROIs correspond to
+the original FOVs.
+
TODO: make this flexible with respect to the presence/absence of Z.
def get_image_grid_ROIs(
+     array_shape: tuple[int, int, int],
+     pixels_ZYX: list[float],
+     grid_YX_shape: tuple[int, int],
+ ) -> ad.AnnData:
+     """
+     Produce a table with ROIs placed on a rectangular grid.
+
+     The main goal of this ROI grid is to allow processing of smaller subsets
+     of the whole array.
+
+     In a specific case (that is, if the image array was obtained by stitching
+     together a set of FOVs placed on a regular grid), the ROIs correspond to
+     the original FOVs.
+
+     TODO: make this flexible with respect to the presence/absence of Z.
+
+     Args:
+         array_shape: ZYX shape of the image array.
+         pixels_ZYX: ZYX pixel sizes in micrometers.
+         grid_YX_shape: Number of grid ROIs along the Y and X axes.
+
+     Returns:
+         An `AnnData` table with one ROI per grid cell.
+     """
+     shape_z, shape_y, shape_x = array_shape[-3:]
+     grid_size_y, grid_size_x = grid_YX_shape[:]
+     X = []
+     obs_names = []
+     counter = 0
+     start_z = 0
+     len_z = shape_z
+
+     # Find minimal len_y that covers [0,shape_y] with grid_size_y intervals
+     len_y = math.ceil(shape_y / grid_size_y)
+     len_x = math.ceil(shape_x / grid_size_x)
+     for ind_y in range(grid_size_y):
+         start_y = ind_y * len_y
+         tmp_len_y = min(shape_y, start_y + len_y) - start_y
+         for ind_x in range(grid_size_x):
+             start_x = ind_x * len_x
+             tmp_len_x = min(shape_x, start_x + len_x) - start_x
+             X.append(
+                 [
+                     start_x * pixels_ZYX[2],
+                     start_y * pixels_ZYX[1],
+                     start_z * pixels_ZYX[0],
+                     tmp_len_x * pixels_ZYX[2],
+                     tmp_len_y * pixels_ZYX[1],
+                     len_z * pixels_ZYX[0],
+                 ]
+             )
+             counter += 1
+             obs_names.append(f"ROI_{counter}")
+     ROI_table = ad.AnnData(X=np.array(X, dtype=np.float32))
+     ROI_table.obs_names = obs_names
+     ROI_table.var_names = [
+         "x_micrometer",
+         "y_micrometer",
+         "z_micrometer",
+         "len_x_micrometer",
+         "len_y_micrometer",
+         "len_z_micrometer",
+     ]
+     return ROI_table
+
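The way the grid is laid out along one axis can be checked in isolation. The following sketch (with the hypothetical helper `grid_intervals`, working in pixels rather than micrometers) reproduces the start/length computation used for Y and X.

```python
import math


def grid_intervals(size: int, n_cells: int) -> list[tuple[int, int]]:
    """Split the pixel range [0, size) into `n_cells` (start, length) pairs."""
    # Smallest cell length such that n_cells cells cover the whole axis
    step = math.ceil(size / n_cells)
    intervals = []
    for index in range(n_cells):
        start = index * step
        # The last cell is clipped so that it does not exceed the array
        length = min(size, start + step) - start
        intervals.append((start, length))
    return intervals


y_cells = grid_intervals(10, 3)
x_cells = grid_intervals(12, 3)
print(y_cells)
print(x_cells)
```

When the axis size is not a multiple of the number of cells, all cells share the same length except the last one, which is shorter.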
def is_standard_roi_table(table: str) -> bool:
+     """
+     True if the name of the table contains one of the standard Fractal tables
+
+     If a table name is well_ROI_table, FOV_ROI_table or contains either of the
+     two (e.g. registered_FOV_ROI_table), this function returns True.
+
+     Args:
+         table: table name
+
+     Returns:
+         bool of whether it's a standard ROI table
+
+     """
+     if "well_ROI_table" in table:
+         return True
+     elif "FOV_ROI_table" in table:
+         return True
+     else:
+         return False
+
def prepare_FOV_ROI_table(
+     df: pd.DataFrame, metadata: tuple[str, ...] = ("time",)
+ ) -> ad.AnnData:
+     """
+     Prepare an AnnData table for fields-of-view ROIs.
+
+     Args:
+         df:
+             Input dataframe, possibly prepared through
+             `parse_yokogawa_metadata`.
+         metadata:
+             Columns of `df` to be stored (if present) into AnnData table `obs`.
+     """
+
+     # Make a local copy of the dataframe, to avoid SettingWithCopyWarning
+     df = df.copy()
+
+     # Convert DataFrame index to str, to avoid
+     # >> ImplicitModificationWarning: Transforming to str index
+     # when creating AnnData object.
+     # Do this in the beginning to allow concatenation with e.g. time
+     df.index = df.index.astype(str)
+
+     # Obtain box size in physical units
+     df = df.assign(len_x_micrometer=df.x_pixel * df.pixel_size_x)
+     df = df.assign(len_y_micrometer=df.y_pixel * df.pixel_size_y)
+     df = df.assign(len_z_micrometer=df.z_pixel * df.pixel_size_z)
+
+     # Select only the numeric positional columns needed to define ROIs
+     # (to avoid casting things like the data column to float32
+     # or using unnecessary columns like bit_depth)
+     positional_columns = [
+         "x_micrometer",
+         "y_micrometer",
+         "z_micrometer",
+         "len_x_micrometer",
+         "len_y_micrometer",
+         "len_z_micrometer",
+         "x_micrometer_original",
+         "y_micrometer_original",
+     ]
+
+     # Assign dtype explicitly, to avoid
+     # >> UserWarning: X converted to numpy array with dtype float64
+     # when creating AnnData object
+     df_roi = df.loc[:, positional_columns].astype(np.float32)
+
+     # Create an AnnData object directly from the DataFrame
+     adata = ad.AnnData(X=df_roi)
+
+     # Reset origin of the FOV ROI table, so that it matches with the well
+     # origin
+     adata = reset_origin(adata)
+
+     # Save any metadata that is specified to the obs df
+     for col in metadata:
+         if col in df:
+             # Cast all metadata to str.
+             # Reason: AnnData Zarr writers don't support all pandas types.
+             # e.g. pandas.core.arrays.datetimes.DatetimeArray can't be written
+             adata.obs[col] = df[col].astype(str)
+
+     # Rename rows and columns: Maintain FOV indices from the dataframe
+     # (they are already enforced to be unique by Pandas and may contain
+     # information for the user, as they are based on the filenames)
+     adata.obs_names = "FOV_" + adata.obs.index
+     adata.var_names = list(map(str, df_roi.columns))
+
+     return adata
+
defprepare_well_ROI_table(
+ df:pd.DataFrame,metadata:tuple[str,...]=("time",)
+)->ad.AnnData:
+"""
+ Prepare an AnnData table with a single well ROI.
+
+ Args:
+ df:
+ Input dataframe, possibly prepared through
+ `parse_yokogawa_metadata`.
+ metadata:
+ Columns of `df` to be stored (if present) into AnnData table `obs`.
+ """
+
+ # Make a local copy of the dataframe, to avoid SettingWithCopyWarning
+ df=df.copy()
+
+ # Convert DataFrame index to str, to avoid
+ # >> ImplicitModificationWarning: Transforming to str index
+ # when creating AnnData object.
+ # Do this in the beginning to allow concatenation with e.g. time
+ df.index=df.index.astype(str)
+
+ # Calculate bounding box extents in physical units
+ formuin["x","y","z"]:
+ # Obtain per-FOV properties in physical units.
+ # NOTE: a FOV ROI is defined here as the interval [min_micrometer,
+ # max_micrometer], with max_micrometer=min_micrometer+len_micrometer
+ min_micrometer=df[f"{mu}_micrometer"]
+ len_micrometer=df[f"{mu}_pixel"]*df[f"pixel_size_{mu}"]
+ max_micrometer=min_micrometer+len_micrometer
+ # Obtain well bounding box, in physical units
+ min_min_micrometer=min_micrometer.min()
+ max_max_micrometer=max_micrometer.max()
+ df[f"{mu}_micrometer"]=min_min_micrometer
+ df[f"len_{mu}_micrometer"]=max_max_micrometer-min_min_micrometer
+
+ # Select only the numeric positional columns needed to define ROIs
+ # (to avoid) casting things like the data column to float32
+ # or to use unnecessary columns like bit_depth
+ positional_columns=[
+ "x_micrometer",
+ "y_micrometer",
+ "z_micrometer",
+ "len_x_micrometer",
+ "len_y_micrometer",
+ "len_z_micrometer",
+ ]
+
+ # Assign dtype explicitly, to avoid
+ # >> UserWarning: X converted to numpy array with dtype float64
+ # when creating AnnData object
+ df_roi=df.iloc[0:1,:].loc[:,positional_columns].astype(np.float32)
+
+ # Create an AnnData object directly from the DataFrame
+ adata=ad.AnnData(X=df_roi)
+
+ # Reset origin of the single-entry well ROI table
+ adata=reset_origin(adata)
+
+ # Save any metadata that is specified to the obs df
+ forcolinmetadata:
+ ifcolindf:
+ # Cast all metadata to str.
+ # Reason: AnnData Zarr writers don't support all pandas types.
+ # e.g. pandas.core.arrays.datetimes.DatetimeArray can't be written
+ adata.obs[col]=df[col].astype(str)
+
+ # Rename rows and columns: Maintain FOV indices from the dataframe
+ # (they are already enforced to be unique by Pandas and may contain
+ # information for the user, as they are based on the filenames)
+ adata.obs_names="well_"+adata.obs.index
+ adata.var_names=list(map(str,df_roi.columns))
+
+ return adata
+
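The bounding-box step in the loop over `x`, `y` and `z` can be sketched in pure Python on plain dictionaries instead of a dataframe. The helper name `well_extent` and the sample values are hypothetical and are not part of fractal-tasks-core; the sketch only mirrors the arithmetic described in the comments above.

```python
def well_extent(fovs, axis):
    """Return (origin, length) of the well bounding box along one axis."""
    # Start of each FOV, in physical units
    starts = [fov[f"{axis}_micrometer"] for fov in fovs]
    # End of each FOV: start + number of pixels * pixel size
    ends = [
        fov[f"{axis}_micrometer"]
        + fov[f"{axis}_pixel"] * fov[f"pixel_size_{axis}"]
        for fov in fovs
    ]
    origin = min(starts)
    return origin, max(ends) - origin


# Two hypothetical, side-by-side FOVs (100 pixels of 0.5 micrometers each)
fovs = [
    {"x_micrometer": 0.0, "x_pixel": 100, "pixel_size_x": 0.5},
    {"x_micrometer": 50.0, "x_pixel": 100, "pixel_size_x": 0.5},
]
origin_x, len_x = well_extent(fovs, "x")
```

With these inputs the well starts at the smallest FOV start and spans up to the largest FOV end, which is the single ROI that the well table stores.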
def reset_origin(
+ ROI_table:ad.AnnData,
+ x_pos:str="x_micrometer",
+ y_pos:str="y_micrometer",
+ z_pos:str="z_micrometer",
+)->ad.AnnData:
+"""
+ Return a copy of a ROI table, with shifted-to-zero origin for some columns.
+
+ Args:
+ ROI_table: Original ROI table.
+ x_pos: Name of the column with X position of ROIs.
+ y_pos: Name of the column with Y position of ROIs.
+ z_pos: Name of the column with Z position of ROIs.
+
+ Returns:
+ A copy of the `ROI_table` AnnData table, where values of `x_pos`,
+ `y_pos` and `z_pos` columns have been shifted by their minimum
+ values.
+ """
+ new_table=ROI_table.copy()
+
+ origin_x=min(new_table[:,x_pos].X[:,0])
+ origin_y=min(new_table[:,y_pos].X[:,0])
+ origin_z=min(new_table[:,z_pos].X[:,0])
+
+ for FOV in new_table.obs_names:
+ new_table[FOV,x_pos]=new_table[FOV,x_pos].X[0,0]-origin_x
+ new_table[FOV,y_pos]=new_table[FOV,y_pos].X[0,0]-origin_y
+ new_table[FOV,z_pos]=new_table[FOV,z_pos].X[0,0]-origin_z
+
+ return new_table
+
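The origin shift performed by `reset_origin` can be illustrated without AnnData. The following is a minimal sketch on a list of dictionaries; `shift_to_zero_origin` is a hypothetical name, and the real function operates on an AnnData table instead.

```python
def shift_to_zero_origin(
    rois,
    columns=("x_micrometer", "y_micrometer", "z_micrometer"),
):
    """Return new ROI dicts where each position column starts at zero."""
    # Minimum value of each position column, over all ROIs
    origins = {col: min(roi[col] for roi in rois) for col in columns}
    # Build new dicts, so that the input ROIs are left untouched
    return [
        {
            key: (value - origins[key] if key in origins else value)
            for key, value in roi.items()
        }
        for roi in rois
    ]


rois = [
    {"x_micrometer": 10.0, "y_micrometer": 5.0, "z_micrometer": 0.0,
     "len_x_micrometer": 2.0},
    {"x_micrometer": 12.0, "y_micrometer": 5.0, "z_micrometer": 0.0,
     "len_x_micrometer": 2.0},
]
shifted = shift_to_zero_origin(rois)
```

Only the position columns are shifted; length columns such as `len_x_micrometer` are copied unchanged.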
def are_ROI_table_columns_valid(*,table:ad.AnnData)->None:
+"""
+ Verify some validity assumptions on a ROI table.
+
+ This function reflects our current working assumptions (e.g. the presence
+ of some specific columns); this may change in future versions.
+
+ Args:
+ table: AnnData table to be checked
+ """
+
+ # Hard constraint: table columns must include some expected ones
+ columns=[
+ "x_micrometer",
+ "y_micrometer",
+ "z_micrometer",
+ "len_x_micrometer",
+ "len_y_micrometer",
+ "len_z_micrometer",
+ ]
+ for column in columns:
+ if column not in table.var_names:
+ raise ValueError(f"Column {column} is not present in ROI table")
+
Check that list of indices has zero origin on each axis.
+
See fractal-tasks-core issues #530 and #554.
+
This helper function is meant to provide informative error messages when
+ROI tables created with fractal-tasks-core up to v0.11 are used in v0.12.
+This function will be deprecated and removed as soon as the v0.11/v0.12
+transition advances.
+
Note that only FOV_ROI_table and well_ROI_table have to fulfill this
+constraint, while ROI tables obtained through segmentation may have
+arbitrary (non-negative) indices.
+
Parameters:
+
list_indices: Output of convert_ROI_table_to_indices; each item is like
+[start_z, end_z, start_y, end_y, start_x, end_x].
def check_valid_ROI_indices(
+ list_indices:list[list[int]],
+ ROI_table_name:str,
+)->None:
+"""
+ Check that list of indices has zero origin on each axis.
+
+ See fractal-tasks-core issues #530 and #554.
+
+ This helper function is meant to provide informative error messages when
+ ROI tables created with fractal-tasks-core up to v0.11 are used in v0.12.
+ This function will be deprecated and removed as soon as the v0.11/v0.12
+ transition advances.
+
+ Note that only `FOV_ROI_table` and `well_ROI_table` have to fulfill this
+ constraint, while ROI tables obtained through segmentation may have
+ arbitrary (non-negative) indices.
+
+ Args:
+ list_indices:
+ Output of `convert_ROI_table_to_indices`; each item is like
+ `[start_z, end_z, start_y, end_y, start_x, end_x]`.
+ ROI_table_name: Name of the ROI table.
+
+ Raises:
+ ValueError:
+ If the table name is `FOV_ROI_table` or `well_ROI_table` and the
+ minimum value of `start_x`, `start_y` and `start_z` are not all
+ zero.
+ """
+ if ROI_table_name not in ["FOV_ROI_table","well_ROI_table"]:
+ # This validation function only applies to the FOV/well ROI tables
+ # generated with fractal-tasks-core
+ return
+
+ # Find minimum index along ZYX
+ min_start_z=min(item[0] for item in list_indices)
+ min_start_y=min(item[2] for item in list_indices)
+ min_start_x=min(item[4] for item in list_indices)
+
+ # Check that minimum indices are all zero
+ for ind,min_index in enumerate((min_start_z,min_start_y,min_start_x)):
+ if min_index!=0:
+ axis=["Z","Y","X"][ind]
+ raise ValueError(
+ f"{axis} component of ROI indices for table `{ROI_table_name}`"
+ f" does not start with 0, but with {min_index}.\n"
+ "Hint: As of fractal-tasks-core v0.12, FOV/well ROI "
+ "tables with non-zero origins (e.g. the ones created with "
+ "v0.11) are not supported."
+ )
+
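To make the zero-origin condition concrete, the following sketch computes the minimum start index per axis for two hypothetical index lists. `min_start_indices` is an illustrative helper, not a function of the library; it reproduces only the "find minimum index along ZYX" step.

```python
def min_start_indices(list_indices):
    """Return the minimum (start_z, start_y, start_x) over all ROIs."""
    # Items look like [start_z, end_z, start_y, end_y, start_x, end_x],
    # so the start values sit at positions 0, 2 and 4.
    return tuple(min(item[pos] for item in list_indices) for pos in (0, 2, 4))


# Hypothetical FOV indices with zero origin on every axis
zero_origin = [[0, 1, 0, 540, 0, 640], [0, 1, 0, 540, 640, 1280]]
# Same layout, but shifted by 10 pixels along Y
shifted_origin = [[0, 1, 10, 550, 0, 640], [0, 1, 10, 550, 640, 1280]]

min_zero = min_start_indices(zero_origin)
min_shifted = min_start_indices(shifted_origin)
```

A `FOV_ROI_table` or `well_ROI_table` producing the second list would be rejected by the check, since its minimum Y start is not zero.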
This function reflects our current working assumptions (e.g. the presence
+of some specific columns); this may change in future versions.
+
If use_masks=True, we verify that the table is a valid
+masking_roi_table as of table specifications V1; if this check fails,
+use_masks should be set to False upstream in the parent function.
def is_ROI_table_valid(*,table_path:str,use_masks:bool)->Optional[bool]:
+"""
+ Verify some validity assumptions on a ROI table.
+
+ This function reflects our current working assumptions (e.g. the presence
+ of some specific columns); this may change in future versions.
+
+ If `use_masks=True`, we verify that the table is a valid
+ `masking_roi_table` as of table specifications V1; if this check fails,
+ `use_masks` should be set to `False` upstream in the parent function.
+
+ Args:
+ table_path: Path of the AnnData ROI table to be checked.
+ use_masks: If `True`, perform some additional checks related to
+ masked loading.
+
+ Returns:
+ Always `None` if `use_masks=False`, otherwise return whether the table
+ is valid for masked loading.
+ """
+
+ table=ad.read_zarr(table_path)
+ are_ROI_table_columns_valid(table=table)
+ if not use_masks:
+ return None
+
+ # Check whether the table can be used for masked loading
+ attrs=zarr.group(table_path).attrs.asdict()
+ logger.info(f"ROI table at {table_path} has attrs: {attrs}")
+ try:
+ MaskingROITableAttrs(**attrs)
+ logger.info("ROI table can be used for masked loading")
+ return True
+ except ValidationError:
+ logger.info("ROI table cannot be used for masked loading")
+ return False
+
def find_overlaps_in_ROI_indices(
+ list_indices:list[list[int]],
+)->Optional[tuple[int,int]]:
+"""
+ Given a list of integer ROI indices, find whether there are overlaps.
+
+ Args:
+ list_indices: List of ROI indices, where each element in the list
+ should look like
+ `[start_z, end_z, start_y, end_y, start_x, end_x]`.
+
+ Returns:
+ `None` if no overlap was detected, otherwise a tuple with the
+ positional indices of a pair of overlapping ROIs.
+ """
+
+ for ind_1,ROI_1 in enumerate(list_indices):
+ s_z,e_z,s_y,e_y,s_x,e_x=ROI_1[:]
+ box_1=[s_x,s_y,s_z,e_x,e_y,e_z]
+ for ind_2 in range(ind_1):
+ ROI_2=list_indices[ind_2]
+ s_z,e_z,s_y,e_y,s_x,e_x=ROI_2[:]
+ box_2=[s_x,s_y,s_z,e_x,e_y,e_z]
+ if _is_overlapping_3D_int(box_1,box_2):
+ return (ind_1,ind_2)
+ return None
+
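The pairwise search above relies on `_is_overlapping_3D_int`, which is not shown in this section. The sketch below substitutes an assumed rule (two boxes overlap when their half-open integer intervals intersect on every axis); the names `boxes_overlap` and `first_overlap` are hypothetical, and the actual helper may treat touching boundaries differently.

```python
from typing import Optional


def boxes_overlap(roi_1, roi_2) -> bool:
    """Assumed rule: half-open intervals [start, end) intersect on Z, Y, X."""
    for pos in (0, 2, 4):
        start_1, end_1 = roi_1[pos], roi_1[pos + 1]
        start_2, end_2 = roi_2[pos], roi_2[pos + 1]
        # Disjoint along this axis means no 3D overlap at all
        if end_1 <= start_2 or end_2 <= start_1:
            return False
    return True


def first_overlap(list_indices) -> Optional[tuple]:
    """Return positional indices of the first overlapping pair, or None."""
    for ind_1, roi_1 in enumerate(list_indices):
        for ind_2 in range(ind_1):
            if boxes_overlap(roi_1, list_indices[ind_2]):
                return (ind_1, ind_2)
    return None


# Two adjacent tiles, then a third one that straddles their boundary
tiles = [[0, 1, 0, 540, 0, 640], [0, 1, 0, 540, 640, 1280]]
with_overlap = tiles + [[0, 1, 0, 540, 600, 700]]
```

As in the original loop, each ROI is compared only with the ROIs that precede it, so every pair is tested once.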
Run an overlap check over all wells and optionally plot overlaps.
+
This function is currently only used in tests and examples.
+
The plotting_function parameter is exposed so that other tools (see
+examples in this repository) may use it to show the FOV ROIs. Its arguments
+are: [xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well].
def run_overlap_check(
+ site_metadata:pd.DataFrame,
+ tol:float=1e-10,
+ plotting_function:Optional[Callable]=None,
+):
+"""
+ Run an overlap check over all wells and optionally plot overlaps.
+
+ This function is currently only used in tests and examples.
+
+ The `plotting_function` parameter is exposed so that other tools (see
+ examples in this repository) may use it to show the FOV ROIs. Its arguments
+ are: `[xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well]`.
+
+ Args:
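+ site_metadata: Dataframe of site metadata, with `well_id` as one of
+ its index levels.
+ tol: Tolerance forwarded to `check_well_for_FOV_overlap`.
+ plotting_function: Optional callable with the arguments listed above;
+ if `None`, a no-op function is used.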
+ """
+
+ if plotting_function is None:
+
+ def plotting_function(
+ xmin,xmax,ymin,ymax,list_overlapping_FOVs,selected_well
+ ):
+ pass
+
+ wells=site_metadata.index.unique(level="well_id")
+ overlapping_FOVs=[]
+ for selected_well in wells:
+ overlap_curr_well=check_well_for_FOV_overlap(
+ site_metadata,
+ selected_well=selected_well,
+ tol=tol,
+ plotting_function=plotting_function,
+ )
+ if overlap_curr_well:
+ print(selected_well)
+ overlapping_FOVs.append(overlap_curr_well)
+
+ return overlapping_FOVs
+
This is the general interface that should allow for a smooth coexistence of
+tables with different fractal_table_version values. Currently only V1 is
+defined and implemented. The assumption is that V2 should only change:
+
+
1. The lower-level writing function (that is, _write_table_v2).
+
2. The type of the table (which would also be reflected in a more general type
+ hint for table, in the current function).
+
3. A different definition of what values of table_attrs are valid or
+ invalid, to be implemented in _write_table_v2.
+
4. Possibly, additional parameters for _write_table_v2, which will be
+ optional parameters of write_table (so that write_table remains
+ valid for both V1 and V2).
+
Parameters:
+
image_group: The image Zarr group where the table will be written.
+
overwrite: If False, check that the new table does not exist (either as a
+zarr sub-group or as part of the zarr-group attributes). In all
+cases, propagate parameter to low-level functions, to determine the
+behavior in case of an existing sub-group named as in table_name.
+
table_attrs: If set, overwrite table_group attributes with table_attrs key/value
+pairs. If table_type is not provided, then table_attrs must
+include the type key.
def write_table(
+ image_group:zarr.hierarchy.Group,
+ table_name:str,
+ table:ad.AnnData,
+ overwrite:bool=False,
+ table_type:Optional[str]=None,
+ table_attrs:Optional[dict[str,Any]]=None,
+)->zarr.hierarchy.Group:
+"""
+ Write a table to a Zarr group.
+
+ This is the general interface that should allow for a smooth coexistence of
+ tables with different `fractal_table_version` values. Currently only V1 is
+ defined and implemented. The assumption is that V2 should only change:
+
+ 1. The lower-level writing function (that is, `_write_table_v2`).
+ 2. The type of the table (which would also be reflected in a more general type
+ hint for `table`, in the current function);
+ 3. A different definition of what values of `table_attrs` are valid or
+ invalid, to be implemented in `_write_table_v2`.
+ 4. Possibly, additional parameters for `_write_table_v2`, which will be
+ optional parameters of `write_table` (so that `write_table` remains
+ valid for both V1 and V2).
+
+ Args:
+ image_group:
+ The image Zarr group where the table will be written.
+ table_name:
+ The name of the table.
+ table:
+ The table object (currently an AnnData object, for V1).
+ overwrite:
+ If `False`, check that the new table does not exist (either as a
+ zarr sub-group or as part of the zarr-group attributes). In all
+ cases, propagate parameter to low-level functions, to determine the
+ behavior in case of an existing sub-group named as in `table_name`.
+ table_type: `type` attribute for the table; in case `type` is also
+ present in `table_attrs`, this function argument takes priority.
+ table_attrs:
+ If set, overwrite table_group attributes with table_attrs key/value
+ pairs. If `table_type` is not provided, then `table_attrs` must
+ include the `type` key.
+
+ Returns:
+ Zarr group of the table.
+ """
+ # Choose which version to use, giving priority to a value that is present
+ # in table_attrs
+ version=__FRACTAL_TABLE_VERSION__
+ if table_attrs is not None:
+ try:
+ version=table_attrs["fractal_table_version"]
+ except KeyError:
+ pass
+
+ if version=="1":
+ return _write_table_v1(
+ image_group,
+ table_name,
+ table,
+ overwrite,
+ table_type,
+ table_attrs,
+ )
+ else:
+ raise NotImplementedError(
+ f"fractal_table_version='{version}' is not supported"
+ )
+
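The version-selection logic at the start of `write_table` can be summarized with a small standalone sketch. `DEFAULT_TABLE_VERSION` is a hypothetical stand-in for `__FRACTAL_TABLE_VERSION__`, whose actual value is assumed here to be `"1"`, and `resolve_table_version` is an illustrative name.

```python
# Hypothetical stand-in for __FRACTAL_TABLE_VERSION__ (assumed to be "1")
DEFAULT_TABLE_VERSION = "1"


def resolve_table_version(table_attrs=None):
    """Prefer `fractal_table_version` from table_attrs, else the default."""
    if table_attrs is None:
        return DEFAULT_TABLE_VERSION
    # A missing key falls back to the default, like the try/except above
    return table_attrs.get("fractal_table_version", DEFAULT_TABLE_VERSION)


version_default = resolve_table_version()
version_missing_key = resolve_table_version({"type": "roi_table"})
version_explicit = resolve_table_version({"fractal_table_version": "2"})
```

In the function above, any resolved value other than `"1"` leads to a `NotImplementedError`, so the last case would currently be rejected.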
def _write_elem_with_overwrite(
+ group:zarr.hierarchy.Group,
+ key:str,
+ elem:Any,
+ *,
+ overwrite:bool,
+ logger:Optional[logging.Logger]=None,
+)->None:
+"""
+ Wrap `anndata.experimental.write_elem`, to include `overwrite` parameter.
+
+ See docs for the original function
+ [here](https://anndata.readthedocs.io/en/stable/generated/anndata.experimental.write_elem.html).
+
+ This function writes `elem` to the sub-group `key` of `group`. The
+ `overwrite`-related expected behavior is:
+
+ * if the sub-group does not exist, create it (independently of
+ `overwrite`);
+ * if the sub-group already exists and `overwrite=True`, overwrite the
+ sub-group;
+ * if the sub-group already exists and `overwrite=False`, fail.
+
+ Note that this version of the wrapper does not include the original
+ `dataset_kwargs` parameter.
+
+ Args:
+ group:
+ The group to write to.
+ key:
+ The key to write to in the group. Note that absolute paths will be
+ written from the root.
+ elem:
+ The element to write. Typically an in-memory object, e.g. an
+ AnnData, pandas dataframe, scipy sparse matrix, etc.
+ overwrite:
+ If `True`, overwrite the `key` sub-group (if present); if `False`
+ and `key` sub-group exists, raise an error.
+ logger:
+ The logger to use (if unset, use `logging.getLogger(None)`)
+
+ Raises:
+ OverwriteNotAllowedError:
+ If `overwrite=False` and the sub-group already exists.
+ """
+
+ # Set logger
+ if logger is None:
+ logger=logging.getLogger(None)
+
+ if key in set(group.group_keys()):
+ if not overwrite:
+ error_msg=(
+ f"Sub-group '{key}' of group {group.store.path} "
+ f"already exists, but `{overwrite=}`.\n"
+ "Hint: try setting `overwrite=True`."
+ )
+ logger.error(error_msg)
+ raise OverwriteNotAllowedError(error_msg)
+ write_elem(group,key,elem)
+
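The three `overwrite` cases listed in the docstring can be exercised with a plain dictionary standing in for the Zarr group. This is only a sketch of the control flow: `OverwriteNotAllowed` and `write_with_overwrite` are hypothetical names, and no Zarr or AnnData call is involved.

```python
class OverwriteNotAllowed(RuntimeError):
    """Hypothetical stand-in for OverwriteNotAllowedError."""


def write_with_overwrite(store: dict, key: str, elem, *, overwrite: bool):
    """Write `elem` under `key`, refusing to replace it unless allowed."""
    if key in store and not overwrite:
        raise OverwriteNotAllowed(
            f"Sub-group '{key}' already exists, but overwrite=False."
        )
    # Reached when the key is new, or when overwriting is allowed
    store[key] = elem


store = {}
write_with_overwrite(store, "FOV_ROI_table", "v1", overwrite=False)  # create
write_with_overwrite(store, "FOV_ROI_table", "v2", overwrite=True)  # replace
try:
    write_with_overwrite(store, "FOV_ROI_table", "v3", overwrite=False)
    refused = False
except OverwriteNotAllowed:
    refused = True
```

The existence check happens before any write, so a refused call leaves the stored element unchanged.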
Handle multiple options for writing an AnnData table to a zarr group.
+
+
1. Create the tables group, if needed.
+
2. If overwrite=False, check that the new table does not exist (either in
+ zarr attributes or as a zarr sub-group).
+
3. Call the _write_elem_with_overwrite wrapper with the appropriate
+ overwrite parameter.
+
4. Update the tables attribute of the image group.
+
5. Validate table_type and table_attrs according to Fractal table
+ specifications, and raise errors/warnings if needed; then set the
+ appropriate attributes in the new-table Zarr group.
+
Parameters:
+
overwrite: If False, check that the new table does not exist (either as a
+zarr sub-group or as part of the zarr-group attributes). In all
+cases, propagate parameter to _write_elem_with_overwrite, to
+determine the behavior in case of an existing sub-group named as
+table_name.
+
table_attrs: If set, overwrite table_group attributes with table_attrs key/value
+pairs. If table_type is not provided, then table_attrs must
+include the type key.
def _write_table_v1(
+ image_group:zarr.hierarchy.Group,
+ table_name:str,
+ table:ad.AnnData,
+ overwrite:bool=False,
+ table_type:Optional[str]=None,
+ table_attrs:Optional[dict[str,Any]]=None,
+)->zarr.hierarchy.Group:
+"""
+ Handle multiple options for writing an AnnData table to a zarr group.
+
+ 1. Create the `tables` group, if needed.
+ 2. If `overwrite=False`, check that the new table does not exist (either in
+ zarr attributes or as a zarr sub-group).
+ 3. Call the `_write_elem_with_overwrite` wrapper with the appropriate
+ `overwrite` parameter.
+ 4. Update the `tables` attribute of the image group.
+ 5. Validate `table_type` and `table_attrs` according to Fractal table
+ specifications, and raise errors/warnings if needed; then set the
+ appropriate attributes in the new-table Zarr group.
+
+
+ Args:
+ image_group:
+ The group to write to.
+ table_name:
+ The name of the new table.
+ table:
+ The AnnData table to write.
+ overwrite:
+ If `False`, check that the new table does not exist (either as a
+ zarr sub-group or as part of the zarr-group attributes). In all
+ cases, propagate parameter to `_write_elem_with_overwrite`, to
+ determine the behavior in case of an existing sub-group named as
+ `table_name`.
+ table_type: `type` attribute for the table; in case `type` is also
+ present in `table_attrs`, this function argument takes priority.
+ table_attrs:
+ If set, overwrite table_group attributes with table_attrs key/value
+ pairs. If `table_type` is not provided, then `table_attrs` must
+ include the `type` key.
+
+ Returns:
+ Zarr group of the new table.
+ """
+
+ # Create tables group (if needed) and extract current_tables
+ if "tables" not in set(image_group.group_keys()):
+ tables_group=image_group.create_group("tables",overwrite=False)
+ else:
+ tables_group=image_group["tables"]
+ current_tables=tables_group.attrs.asdict().get("tables",[])
+
+ # If overwrite=False, check that the new table does not exist (either as a
+ # zarr sub-group or as part of the zarr-group attributes)
+ if not overwrite:
+ if table_name in set(tables_group.group_keys()):
+ error_msg=(
+ f"Sub-group '{table_name}' of group {image_group.store.path} "
+ f"already exists, but `{overwrite=}`.\n"
+ "Hint: try setting `overwrite=True`."
+ )
+ logger.error(error_msg)
+ raise OverwriteNotAllowedError(error_msg)
+ if table_name in current_tables:
+ error_msg=(
+ f"Item '{table_name}' already exists in `tables` attribute of "
+ f"group {image_group.store.path}, but `{overwrite=}`.\n"
+ "Hint: try setting `overwrite=True`."
+ )
+ logger.error(error_msg)
+ raise OverwriteNotAllowedError(error_msg)
+
+ # Always include fractal-roi-table version in table attributes
+ if table_attrs is None:
+ table_attrs=dict(fractal_table_version="1")
+ elif table_attrs.get("fractal_table_version",None) is None:
+ table_attrs["fractal_table_version"]="1"
+
+ # Set type attribute for the table
+ table_type_from_attrs=table_attrs.get("type",None)
+ if table_type is not None:
+ if table_type_from_attrs is not None:
+ logger.warning(
+ f"Setting table type to '{table_type}' (and overriding "
+ f"'{table_type_from_attrs}' attribute)."
+ )
+ table_attrs["type"]=table_type
+ else:
+ if table_type_from_attrs is None:
+ raise ValueError(
+ "Missing attribute `type` for table; this must be provided"
+ " either via `table_type` or within `table_attrs`."
+ )
+
+ # Prepare/validate attributes for the table
+ table_type=table_attrs.get("type",None)
+ if table_type=="roi_table":
+ pass
+ elif table_type=="masking_roi_table":
+ try:
+ MaskingROITableAttrs(**table_attrs)
+ except ValidationError as e:
+ error_msg=(
+ "Table attributes do not comply with Fractal "
+ "`masking_roi_table` specifications V1.\nOriginal error:\n"
+ f"ValidationError: {str(e)}"
+ )
+ logger.error(error_msg)
+ raise ValueError(error_msg)
+ elif table_type=="feature_table":
+ try:
+ FeatureTableAttrs(**table_attrs)
+ except ValidationError as e:
+ error_msg=(
+ "Table attributes do not comply with Fractal "
+ "`feature_table` specifications V1.\nOriginal error:\n"
+ f"ValidationError: {str(e)}"
+ )
+ logger.error(error_msg)
+ raise ValueError(error_msg)
+ else:
+ logger.warning(f"Unknown table type `{table_type}`.")
+
+ # If it's all OK, proceed and write the table
+ _write_elem_with_overwrite(
+ tables_group,
+ table_name,
+ table,
+ overwrite=overwrite,
+ )
+ table_group=tables_group[table_name]
+
+ # Update the `tables` metadata of the image group, if needed
+ if table_name not in current_tables:
+ new_tables=current_tables+[table_name]
+ tables_group.attrs["tables"]=new_tables
+
+ # Update table_group attributes with table_attrs key/value pairs
+ table_group.attrs.update(**table_attrs)
+
+ return table_group
+
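The precedence between the `table_type` argument and the `type` key of `table_attrs` can be sketched on its own. `resolve_table_attrs` is a hypothetical helper that mirrors only the attribute-preparation steps of `_write_table_v1` (default version, type priority, missing-type error); it performs no validation against the table specifications and writes nothing.

```python
def resolve_table_attrs(table_type=None, table_attrs=None):
    """Return the attributes that would be stored for a new table."""
    attrs = dict(table_attrs or {})  # work on a copy of the input
    # Always include a table version, defaulting to "1"
    if attrs.get("fractal_table_version") is None:
        attrs["fractal_table_version"] = "1"
    if table_type is not None:
        # The explicit argument takes priority over attrs["type"]
        attrs["type"] = table_type
    elif attrs.get("type") is None:
        raise ValueError(
            "Missing attribute `type` for table; this must be provided "
            "either via `table_type` or within `table_attrs`."
        )
    return attrs


from_argument = resolve_table_attrs(
    table_type="roi_table", table_attrs={"type": "feature_table"}
)
from_attrs = resolve_table_attrs(table_attrs={"type": "roi_table"})
```

When both sources provide a type, the function argument wins, which is the case where the original code also logs a warning.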
def get_tables_list_v1(
+ zarr_url:str,table_type:Optional[str]=None,strict:bool=False
+)->list[str]:
+"""
+ Find the list of tables in the Zarr file
+
+ Optionally match a table type and only return the names of those tables.
+
+ Args:
+ zarr_url: Path to the OME-Zarr image
+ table_type: The type of table to look for. Special handling for
+ "ROIs" => matches both "roi_table" & "masking_roi_table".
+ strict: If `True`, only return tables that have a type attribute.
+ If `False`, also include tables without a type attribute.
+
+ Returns:
+ List of the names of available tables
+ """
+ with zarr.open(zarr_url,mode="r") as zarr_group:
+ zarr_subgroups=list(zarr_group.group_keys())
+ if "tables" not in zarr_subgroups:
+ return []
+ with zarr.open(zarr_url,mode="r") as zarr_group:
+ all_tables=list(zarr_group.tables.group_keys())
+
+ if not table_type:
+ return all_tables
+ else:
+ return _filter_tables_by_type_v1(
+ zarr_url,all_tables,table_type,strict
+ )
+
def apply_registration_to_single_ROI_table(
+ roi_table:ad.AnnData,
+ max_df:pd.DataFrame,
+ min_df:pd.DataFrame,
+)->ad.AnnData:
+"""
+ Applies the registration to a ROI table
+
+ Calculates the new position as: p = position + max(shift, 0) - own_shift
+ Calculates the new len as: l = len - max(shift, 0) + min(shift, 0)
+
+ Args:
+ roi_table: AnnData table which contains a Fractal ROI table.
+ Rows are ROIs
+ max_df: Max translation shift in z, y, x for each ROI. Rows are ROIs,
+ columns are translation_z, translation_y, translation_x
+ min_df: Min translation shift in z, y, x for each ROI. Rows are ROIs,
+ columns are translation_z, translation_y, translation_x
+ Returns:
+ ROI table where all ROIs are registered to the smallest common area
+ across all acquisitions.
+ """
+ roi_table=copy.deepcopy(roi_table)
+ rois=roi_table.obs.index
+ if (rois!=max_df.index).any() or (rois!=min_df.index).any():
+ raise ValueError(
+ "ROI table and max & min translation need to contain the same "
+ f"ROIs, but they were {rois=}, {max_df.index=}, {min_df.index=}"
+ )
+
+ for roi in rois:
+ roi_table[[roi],["z_micrometer"]]=(
+ roi_table[[roi],["z_micrometer"]].X
+ +float(max_df.loc[roi,"translation_z"])
+ -roi_table[[roi],["translation_z"]].X
+ )
+ roi_table[[roi],["y_micrometer"]]=(
+ roi_table[[roi],["y_micrometer"]].X
+ +float(max_df.loc[roi,"translation_y"])
+ -roi_table[[roi],["translation_y"]].X
+ )
+ roi_table[[roi],["x_micrometer"]]=(
+ roi_table[[roi],["x_micrometer"]].X
+ +float(max_df.loc[roi,"translation_x"])
+ -roi_table[[roi],["translation_x"]].X
+ )
+ # This calculation only works if all ROIs are the same size initially!
+ roi_table[[roi],["len_z_micrometer"]]=(
+ roi_table[[roi],["len_z_micrometer"]].X
+ -float(max_df.loc[roi,"translation_z"])
+ +float(min_df.loc[roi,"translation_z"])
+ )
+ roi_table[[roi],["len_y_micrometer"]]=(
+ roi_table[[roi],["len_y_micrometer"]].X
+ -float(max_df.loc[roi,"translation_y"])
+ +float(min_df.loc[roi,"translation_y"])
+ )
+ roi_table[[roi],["len_x_micrometer"]]=(
+ roi_table[[roi],["len_x_micrometer"]].X
+ -float(max_df.loc[roi,"translation_x"])
+ +float(min_df.loc[roi,"translation_x"])
+ )
+ return roi_table
+
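A worked example of the two formulas in the docstring, for a single axis and a single ROI, may help to follow the arithmetic. The function name `register_axis` and the numeric values are hypothetical; the sketch applies the same expressions as the loop above, without AnnData or pandas.

```python
def register_axis(position, length, own_shift, max_shift, min_shift):
    """Apply the position/length update for one axis of one ROI."""
    # p = position + max(shift, 0) - own_shift
    new_position = position + max_shift - own_shift
    # l = len - max(shift, 0) + min(shift, 0)
    new_length = length - max_shift + min_shift
    return new_position, new_length


# Hypothetical case: two acquisitions shifted by 0.0 and 2.0 micrometers,
# so the largest shift is 2.0 and the smallest is 0.0.
reference = register_axis(10.0, 100.0, own_shift=0.0, max_shift=2.0, min_shift=0.0)
moving = register_axis(10.0, 100.0, own_shift=2.0, max_shift=2.0, min_shift=0.0)
```

Both acquisitions end up with the same reduced length, while their positions differ by the difference of their own shifts.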
Parses zarr_urls & groups them by HCS wells & acquisition
+
Generates a dict whose keys are a unique description of the acquisition
+(e.g. plate + well for HCS plates). The values are dictionaries whose keys
+are the acquisitions and whose values are the zarr_url for a given
+acquisition.
def create_well_acquisition_dict(
+ zarr_urls:list[str],
+)->dict[str,dict[int,str]]:
+"""
+ Parses zarr_urls & groups them by HCS wells & acquisition
+
+ Generates a dict whose keys are a unique description of the acquisition
+ (e.g. plate + well for HCS plates). The values are dictionaries whose keys
+ are the acquisitions and whose values are the `zarr_url` for a given
+ acquisition.
+
+ Args:
+ zarr_urls: List of zarr_urls
+
+ Returns:
+ image_groups
+ """
+ image_groups=dict()
+
+ # Dict to cache well-level metadata
+ well_metadata=dict()
+ for zarr_url in zarr_urls:
+ well_path,img_sub_path=_split_well_path_image_path(zarr_url)
+ # For the first zarr_url of a well, load the well metadata and
+ # initialize the image_groups dict
+ if well_path not in image_groups:
+ well_meta=load_NgffWellMeta(well_path)
+ well_metadata[well_path]=well_meta.well
+ image_groups[well_path]={}
+
+ # For every zarr_url, add it under the well_path & acquisition keys to
+ # the image_groups dict
+ for image in well_metadata[well_path].images:
+ if image.path==img_sub_path:
+ if image.acquisition in image_groups[well_path]:
+ raise ValueError(
+ "This task has not been built for OME-Zarr HCS plates "
+ "with multiple images of the same acquisition per well"
+ f". {image.acquisition} is the acquisition for "
+ f"multiple images in {well_path=}."
+ )
+
+ image_groups[well_path][image.acquisition]=zarr_url
+ return image_groups
+
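The shape of the returned dictionary can be illustrated with a simplified grouping that does not read any Zarr metadata. In this sketch the mapping `acquisition_of` is a hypothetical replacement for the well metadata, `group_by_well` is an illustrative name, and splitting on the last `/` is an assumption about what `_split_well_path_image_path` does.

```python
def group_by_well(zarr_urls, acquisition_of):
    """Group image URLs as {well_path: {acquisition: zarr_url}}."""
    groups = {}
    for zarr_url in zarr_urls:
        # Assumed split: everything before the last "/" is the well path
        well_path, image_path = zarr_url.rstrip("/").rsplit("/", 1)
        acquisition = acquisition_of[(well_path, image_path)]
        well_group = groups.setdefault(well_path, {})
        if acquisition in well_group:
            raise ValueError(
                f"{acquisition} is the acquisition for multiple images "
                f"in {well_path=}."
            )
        well_group[acquisition] = zarr_url
    return groups


# Hypothetical plate with one well and two acquisitions
acquisition_of = {
    ("/data/plate.zarr/B/03", "0"): 0,
    ("/data/plate.zarr/B/03", "1"): 1,
}
urls = ["/data/plate.zarr/B/03/0", "/data/plate.zarr/B/03/1"]
image_groups = group_by_well(urls, acquisition_of)
```

Each well maps acquisition numbers to exactly one image URL, which is why a repeated acquisition within a well raises an error.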
Updates the necessary metadata for a new copy of an OME-Zarr image
+
Based on an existing OME-Zarr image in the same well, the metadata is
+copied and added to the new zarr well. Additionally, the well-level
+metadata is updated to include this new image.
def _copy_hcs_ome_zarr_metadata(
+ zarr_url_origin:str,
+ zarr_url_new:str,
+)->None:
+"""
+ Updates the necessary metadata for a new copy of an OME-Zarr image
+
+ Based on an existing OME-Zarr image in the same well, the metadata is
+ copied and added to the new zarr well. Additionally, the well-level
+ metadata is updated to include this new image.
+
+ Args:
+ zarr_url_origin: zarr_url of the origin image
+ zarr_url_new: zarr_url of the newly created image. The zarr-group
+ already needs to exist, but metadata is written by this function.
+ """
+ # Copy over OME-Zarr metadata for illumination_corrected image
+ # See #681 for discussion for validation of this zattrs
+ old_image_group=zarr.open_group(zarr_url_origin,mode="r")
+ old_attrs=old_image_group.attrs.asdict()
+ zarr_url_new=zarr_url_new.rstrip("/")
+ new_image_group=zarr.group(zarr_url_new)
+ new_image_group.attrs.put(old_attrs)
+
+ # Update well metadata about adding the new image:
+ new_image_path=zarr_url_new.split("/")[-1]
+ well_url,old_image_path=_split_well_path_image_path(zarr_url_origin)
+ _update_well_metadata(well_url,old_image_path,new_image_path)
+
def _copy_tables_from_zarr_url(
+ origin_zarr_url:str,
+ target_zarr_url:str,
+ table_type:Optional[str]=None,
+ overwrite:bool=True,
+)->None:
+"""
+ Copies all ROI tables from one Zarr into a new Zarr
+
+ Args:
+ origin_zarr_url: url of the OME-Zarr image that contains tables.
+ e.g. /path/to/my_plate.zarr/B/03/0
+ target_zarr_url: url of the new OME-Zarr image where tables are copied
+ to. e.g. /path/to/my_plate.zarr/B/03/0_illum_corr
+ table_type: Filter for specific table types that should be copied.
+ overwrite: Whether existing tables of the same name in the
+ target_zarr_url should be overwritten.
+ """
+ table_list=get_tables_list_v1(
+ zarr_url=origin_zarr_url,table_type=table_type
+ )
+
+ if table_list:
+ logger.info(
+ f"Copying the tables {table_list} from {origin_zarr_url} to "
+ f"{target_zarr_url}."
+ )
+ new_image_group=zarr.group(target_zarr_url)
+
+ for table in table_list:
+ logger.info(f"Copying table: {table}")
+ # Get the relevant metadata of the Zarr table & add it
+ table_url=f"{origin_zarr_url}/tables/{table}"
+ old_table_group=zarr.open_group(table_url,mode="r")
+ # Write the Zarr table
+ curr_table=ad.read_zarr(table_url)
+ write_table(
+ new_image_group,
+ table,
+ curr_table,
+ table_attrs=old_table_group.attrs.asdict(),
+ overwrite=overwrite,
+ )
+
Pick the best match from path_list to a given path
+
This is a workaround to find the reference registration acquisition when
+there are multiple OME-Zarrs with the same acquisition identifier in the
+well metadata and we need to find which one is the reference for a given
+path.
+
Parameters:
+
path_list: List of paths to OME-Zarr images in the well metadata. For
+example: ['0', '0_illum_corr']
def _get_matching_ref_acquisition_path_heuristic(
+ path_list:list[str],path:str
+)->str:
+"""
+ Pick the best match from path_list to a given path
+
+ This is a workaround to find the reference registration acquisition when
+ there are multiple OME-Zarrs with the same acquisition identifier in the
+ well metadata and we need to find which one is the reference for a given
+ path.
+
+ Args:
+ path_list: List of paths to OME-Zarr images in the well metadata. For
+ example: ['0', '0_illum_corr']
+ path: A given path for which we want to find the reference image. For
+ example, '1_illum_corr'
+
+ Returns:
+ The best matching reference path. If no direct match is found, it
+ returns the most similar one based on suffix hierarchy or the base
+ path if applicable. For example, '0_illum_corr' with the example
+ inputs above.
+ """
+
+ # Extract the base number and suffix from the input path
+ base,suffix=_split_base_suffix(path)
+
+ # Sort path_list
+ sorted_path_list=sorted(path_list)
+
+ # Never return the input `path`
+ if path in sorted_path_list:
+ sorted_path_list.remove(path)
+
+ # First matching rule: a path with the same suffix
+ for p in sorted_path_list:
+ # Split the list path into base and suffix
+ p_base,p_suffix=_split_base_suffix(p)
+ # If suffixes match, it's the match.
+ if p_suffix==suffix:
+ return p
+
+ # If no match is found, return the first entry in the list
+ logger.warning(
+ "No heuristic reference acquisition match found, defaulting to first "
+ f"option {sorted_path_list[0]}."
+ )
+ return sorted_path_list[0]
+
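The suffix-matching rule can be tried out with the example paths from the docstring. `_split_base_suffix` is not shown in this section, so the sketch assumes that it splits a path at the first underscore; `split_base_suffix` and `pick_reference_path` are hypothetical names, and the warning log of the original is omitted.

```python
def split_base_suffix(path):
    """Assumed behavior: '1_illum_corr' -> ('1', 'illum_corr')."""
    base, _, suffix = path.partition("_")
    return base, suffix


def pick_reference_path(path_list, path):
    """Pick the candidate with the same suffix, else the first candidate."""
    _, suffix = split_base_suffix(path)
    # Never return the input path itself
    candidates = sorted(p for p in path_list if p != path)
    for candidate in candidates:
        if split_base_suffix(candidate)[1] == suffix:
            return candidate
    # Fallback: first entry of the sorted candidates
    return candidates[0]


same_suffix = pick_reference_path(["0", "0_illum_corr"], "1_illum_corr")
no_suffix = pick_reference_path(["0", "0_illum_corr"], "1")
fallback = pick_reference_path(["0"], "1_registered")
```

The first call reproduces the docstring example, where `'0_illum_corr'` is chosen as the reference for `'1_illum_corr'`.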
Update the well metadata by adding the new_image_path to the image list.
+
The content of new_image_path will be based on old_image_path, the origin
+for the new image that was created.
+This function aims to avoid race conditions with other processes that try
+to update the well metadata file by using FileLock & Timeouts
+
Parameters:
+
well_url: Path to the HCS OME-Zarr well that needs to be updated
def _update_well_metadata(
+ well_url:str,
+ old_image_path:str,
+ new_image_path:str,
+ timeout:int=120,
+)->None:
+"""
+ Update the well metadata by adding the new_image_path to the image list.
+
+ The content of new_image_path will be based on old_image_path, the origin
+ for the new image that was created.
+ This function aims to avoid race conditions with other processes that try
+ to update the well metadata file by using FileLock & Timeouts
+
+ Args:
+ well_url: Path to the HCS OME-Zarr well that needs to be updated
+ old_image_path: path relative to well_url where the original image is
+ found
+ new_image_path: path relative to well_url where the new image is placed
+ timeout: Timeout in seconds for trying to get the file lock
+ """
+ lock=FileLock(f"{well_url}/.zattrs.lock")
+ with lock.acquire(timeout=timeout):
+
+ well_meta=load_NgffWellMeta(well_url)
+ existing_well_images=[image.path for image in well_meta.well.images]
+ if new_image_path in existing_well_images:
+ raise ValueError(
+ f"Could not add the {new_image_path=} image to the well "
+ "metadata because an image with that name "
+ f"already existed in the well metadata: {well_meta}"
+ )
+ try:
+ well_meta_image_old=next(
+ image
+ for image in well_meta.well.images
+ if image.path==old_image_path
+ )
+ except StopIteration:
+ raise ValueError(
+ f"Could not find an image with {old_image_path=} in the "
+ "current well metadata."
+ )
+ well_meta_image=copy.deepcopy(well_meta_image_old)
+ well_meta_image.path=new_image_path
+ well_meta.well.images.append(well_meta_image)
+ well_meta.well.images=sorted(
+ well_meta.well.images,
+ key=lambda _image:_image.path,
+ )
+
+ well_group=zarr.group(well_url)
+ well_group.attrs.put(well_meta.dict(exclude_none=True))
+
Apply registration to images by using a registered ROI table
+
This task consists of 4 parts:
+
+
1. Mask all regions in images that are not available in the
+registered ROI table and store each acquisition aligned to the
+reference_acquisition (by looping over ROIs).
+
2. Do the same for all label images.
+
3. Copy all tables from the non-aligned image to the aligned image
+(currently only works well if the only tables are well & FOV ROI tables
+(registered and original). Not implemented for measurement tables and
+other ROI tables).
+
4. Clean up: Delete the old, non-aligned image and rename the new,
+aligned image to take over its place.
+
+
+
+
+
+
+
+
PARAMETER: DESCRIPTION
+
zarr_url: Path or URL to the individual OME-Zarr image to be processed
+(standard argument for Fractal tasks, managed by Fractal server).
registered_roi_table: Name of the ROI table which has been registered
+and will be applied to mask and shift the images.
+Examples: registered_FOV_ROI_table => loop over the fields of
+view, registered_well_ROI_table => process the whole well as
+one image.
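Step 3 of the task (copying tables) draws on two sources: standard ROI tables are taken from the reference acquisition, and every other table is taken from the acquisition being processed. A sketch of that selection with plain dictionaries follows. `select_tables` is a hypothetical helper, and the `is_standard` predicate is an illustrative assumption rather than the library's own `is_standard_roi_table`.

```python
from typing import Callable, Dict


def select_tables(
    reference_tables: Dict[str, str],
    component_tables: Dict[str, str],
    is_standard: Callable[[str], bool],
) -> Dict[str, str]:
    """Map table name -> path, choosing the source per table kind."""
    selected = {
        name: path
        for name, path in reference_tables.items()
        if is_standard(name)
    }
    selected.update(
        {
            name: path
            for name, path in component_tables.items()
            if not is_standard(name)
        }
    )
    return selected


def is_standard(name: str) -> bool:
    # Assumption for this sketch: only these two names count as standard.
    return name in {"FOV_ROI_table", "well_ROI_table"}


reference_tables = {"FOV_ROI_table": "ref/tables/FOV_ROI_table"}
component_tables = {
    "FOV_ROI_table": "acq/tables/FOV_ROI_table",
    "measurements": "acq/tables/measurements",
}
tables = select_tables(reference_tables, component_tables, is_standard)
```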
@validate_arguments
+def apply_registration_to_image(
+     *,
+     # Fractal parameters
+     zarr_url: str,
+     # Core parameters
+     registered_roi_table: str,
+     reference_acquisition: int = 0,
+     overwrite_input: bool = True,
+ ):
+"""
+ Apply registration to images by using a registered ROI table
+
+ This task consists of 4 parts:
+
+ 1. Mask all regions in images that are not available in the
+ registered ROI table and store each acquisition aligned to the
+ reference_acquisition (by looping over ROIs).
+ 2. Do the same for all label images.
+ 3. Copy all tables from the non-aligned image to the aligned image
+ (currently only works well if the only tables are well & FOV ROI tables
+ (registered and original). Not implemented for measurement tables and
+ other ROI tables).
+ 4. Clean up: Delete the old, non-aligned image and rename the new,
+ aligned image to take over its place.
+
+ Args:
+ zarr_url: Path or url to the individual OME-Zarr image to be processed.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ registered_roi_table: Name of the ROI table which has been registered
+ and will be applied to mask and shift the images.
+ Examples: `registered_FOV_ROI_table` => loop over the fields of
+ view, `registered_well_ROI_table` => process the whole well as
+ one image.
+ reference_acquisition: Which acquisition to register against. Uses the
+ OME-NGFF HCS well metadata acquisition keys to find the reference
+ acquisition.
+ overwrite_input: Whether the old image data should be replaced with the
+ newly registered image data. Currently only implemented for
+ `overwrite_input=True`.
+
+ """
+ logger.info(zarr_url)
+ logger.info(
+ f"Running `apply_registration_to_image` on {zarr_url=}, "
+ f"{registered_roi_table=} and {reference_acquisition=}. "
+ f"Using {overwrite_input=}"
+ )
+
+ well_url, old_img_path = _split_well_path_image_path(zarr_url)
+ new_zarr_url = f"{well_url}/{zarr_url.split('/')[-1]}_registered"
+ # Get the zarr_url for the reference acquisition
+ acq_dict = load_NgffWellMeta(well_url).get_acquisition_paths()
+ if reference_acquisition not in acq_dict:
+     raise ValueError(
+         f"{reference_acquisition=} was not one of the available "
+         f"acquisitions in {acq_dict=} for well {well_url}"
+     )
+ elif len(acq_dict[reference_acquisition]) > 1:
+     ref_path = _get_matching_ref_acquisition_path_heuristic(
+         acq_dict[reference_acquisition], old_img_path
+     )
+     logger.warning(
+         "Running registration when there are multiple images of the same "
+         "acquisition in a well. Using a heuristic to match the reference "
+         f"acquisition. Using {ref_path} as the reference image."
+     )
+ else:
+     ref_path = acq_dict[reference_acquisition][0]
+ reference_zarr_url = f"{well_url}/{ref_path}"
+
+ ROI_table_ref=ad.read_zarr(
+ f"{reference_zarr_url}/tables/{registered_roi_table}"
+ )
+ ROI_table_acq=ad.read_zarr(f"{zarr_url}/tables/{registered_roi_table}")
+
+ ngff_image_meta=load_NgffImageMeta(zarr_url)
+ coarsening_xy=ngff_image_meta.coarsening_xy
+ num_levels=ngff_image_meta.num_levels
+
+ ####################
+ # Process images
+ ####################
+ logger.info("Write the registered Zarr image to disk")
+ write_registered_zarr(
+ zarr_url=zarr_url,
+ new_zarr_url=new_zarr_url,
+ ROI_table=ROI_table_acq,
+ ROI_table_ref=ROI_table_ref,
+ num_levels=num_levels,
+ coarsening_xy=coarsening_xy,
+ aggregation_function=np.mean,
+ )
+
+ ####################
+ # Process labels
+ ####################
+ try:
+     labels_group = zarr.open_group(f"{zarr_url}/labels", "r")
+     label_list = labels_group.attrs["labels"]
+ except (zarr.errors.GroupNotFoundError, KeyError):
+     label_list = []
+
+ if label_list:
+     logger.info(f"Processing the label images: {label_list}")
+     labels_group = zarr.group(f"{new_zarr_url}/labels")
+     labels_group.attrs["labels"] = label_list
+
+     for label in label_list:
+         write_registered_zarr(
+             zarr_url=f"{zarr_url}/labels/{label}",
+             new_zarr_url=f"{new_zarr_url}/labels/{label}",
+             ROI_table=ROI_table_acq,
+             ROI_table_ref=ROI_table_ref,
+             num_levels=num_levels,
+             coarsening_xy=coarsening_xy,
+             aggregation_function=np.max,
+         )
+
+ ####################
+ # Copy tables
+ # 1. Copy all standard ROI tables from the reference acquisition.
+ # 2. Copy all tables that aren't standard ROI tables from the given
+ # acquisition.
+ ####################
+ table_dict_reference = _get_table_path_dict(reference_zarr_url)
+ table_dict_component = _get_table_path_dict(zarr_url)
+
+ table_dict = {}
+ # Define which table should get copied:
+ for table in table_dict_reference:
+     if is_standard_roi_table(table):
+         table_dict[table] = table_dict_reference[table]
+ for table in table_dict_component:
+     if not is_standard_roi_table(table):
+         if reference_zarr_url != zarr_url:
+             logger.warning(
+                 f"{zarr_url} contained a table that is not a standard "
+                 "ROI table. The `Apply Registration To Image task` is "
+                 "best used before additional tables are generated. It "
+                 f"will copy the {table} from this acquisition without "
+                 "applying any transformations. This will work well if "
+                 f"{table} contains measurements. But if {table} is a "
+                 "custom ROI table coming from another task, the "
+                 "transformation is not applied and it will not match "
+                 "with the registered image anymore."
+             )
+         table_dict[table] = table_dict_component[table]
+
+ if table_dict:
+     logger.info(f"Processing the tables: {table_dict}")
+     new_image_group = zarr.group(new_zarr_url)
+
+     for table in table_dict.keys():
+         logger.info(f"Copying table: {table}")
+         # Get the relevant metadata of the Zarr table & add it
+         # See issue #516 for the need for this workaround
+         max_retries = 20
+         sleep_time = 5
+         current_round = 0
+         while current_round < max_retries:
+             try:
+                 old_table_group = zarr.open_group(
+                     table_dict[table], mode="r"
+                 )
+                 current_round = max_retries
+             except zarr.errors.GroupNotFoundError:
+                 logger.debug(
+                     f"Table {table} not found in attempt {current_round}. "
+                     f"Waiting {sleep_time} seconds before trying again."
+                 )
+                 current_round += 1
+                 time.sleep(sleep_time)
+         # Write the Zarr table
+         curr_table = ad.read_zarr(table_dict[table])
+         write_table(
+             new_image_group,
+             table,
+             curr_table,
+             table_attrs=old_table_group.attrs.asdict(),
+             overwrite=True,
+         )
+
+ ####################
+ # Clean up Zarr file
+ ####################
+ if overwrite_input:
+     logger.info(
+         "Replace original zarr image with the newly created Zarr image"
+     )
+     # Potential for race conditions: Every acquisition reads the
+     # reference acquisition, but the reference acquisition also gets
+     # modified
+     # See issue #516 for the details
+     os.rename(zarr_url, f"{zarr_url}_tmp")
+     os.rename(new_zarr_url, zarr_url)
+     shutil.rmtree(f"{zarr_url}_tmp")
+     image_list_updates = dict(image_list_updates=[dict(zarr_url=zarr_url)])
+ else:
+     image_list_updates = dict(
+         image_list_updates=[dict(zarr_url=new_zarr_url, origin=zarr_url)]
+     )
+     # Update the metadata of the well
+     well_url, new_img_path = _split_well_path_image_path(new_zarr_url)
+     _update_well_metadata(
+         well_url=well_url,
+         old_image_path=old_img_path,
+         new_image_path=new_img_path,
+     )
+
+ return image_list_updates
+
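The table-copy loop in the task above retries opening a Zarr group a fixed number of times. A generic version of that bounded-retry pattern might look like the sketch below. `retry_call` and `flaky_open` are hypothetical names, and unlike the loop in the task this sketch re-raises the last error once all attempts are exhausted.

```python
import time


def retry_call(func, exceptions, max_retries=20, sleep_time=5.0):
    """Call `func()` until it succeeds or `max_retries` attempts have failed."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    last_error = None
    for _ in range(max_retries):
        try:
            return func()
        except exceptions as error:
            last_error = error
            time.sleep(sleep_time)
    raise last_error


calls = {"count": 0}


def flaky_open():
    """Hypothetical resource that only becomes available on the third try."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise FileNotFoundError("not there yet")
    return "group"


result = retry_call(flaky_open, FileNotFoundError, max_retries=5, sleep_time=0)
```

Re-raising after the final attempt makes a persistent failure visible at the point where it occurs, instead of surfacing later as an undefined variable.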
This function loads the image or label data from a zarr array based on the
+ROI bounding-box coordinates and stores them into a new zarr array.
+The new Zarr array has the same shape as the original array, but will have
+0s where the ROI tables don't specify loading of the image data.
+The ROIs loaded from list_indices will be written into the
+list_indices_ref position, thus performing translational registration if
+the two lists of ROI indices vary.
+
PARAMETER: DESCRIPTION
+
zarr_url: Path or URL to the individual OME-Zarr image to be used as
+the basis for the new OME-Zarr image.
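The translational registration described above can be illustrated with plain NumPy: each source region is copied into the matching reference region of an all-zero array. This is a simplified sketch and not the task's implementation. `copy_rois_to_reference` is a hypothetical name, regions are passed directly as tuples of slices, and paired regions are assumed to have equal shapes.

```python
import numpy as np


def copy_rois_to_reference(data, regions, regions_ref):
    """Write each `regions[i]` of `data` into `regions_ref[i]` of a zero array."""
    out = np.zeros_like(data)
    for region, region_ref in zip(regions, regions_ref):
        out[region_ref] = data[region]
    return out


# One ZYX image with a single 2x2 ROI that is shifted by one pixel in Y and X
data = np.arange(1, 17).reshape(1, 4, 4)
regions = [(slice(0, 1), slice(0, 2), slice(0, 2))]
regions_ref = [(slice(0, 1), slice(1, 3), slice(1, 3))]
registered = copy_rois_to_reference(data, regions, regions_ref)
```

Everything outside the reference regions stays at 0, which matches the masking behavior described for the task.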
def write_registered_zarr(
+     zarr_url: str,
+     new_zarr_url: str,
+     ROI_table: ad.AnnData,
+     ROI_table_ref: ad.AnnData,
+     num_levels: int,
+     coarsening_xy: int = 2,
+     aggregation_function: Callable = np.mean,
+ ):
+"""
+ Write registered zarr array based on ROI tables
+
+ This function loads the image or label data from a zarr array based on the
+ ROI bounding-box coordinates and stores them into a new zarr array.
+ The new Zarr array has the same shape as the original array, but will have
+ 0s where the ROI tables don't specify loading of the image data.
+ The ROIs loaded from `list_indices` will be written into the
+ `list_indices_ref` position, thus performing translational registration if
+ the two lists of ROI indices vary.
+
+ Args:
+ zarr_url: Path or url to the individual OME-Zarr image to be used as
+ the basis for the new OME-Zarr image.
+ new_zarr_url: Path or url to the new OME-Zarr image to be written
+ ROI_table: Fractal ROI table for the component
+ ROI_table_ref: Fractal ROI table for the reference acquisition
+ num_levels: Number of pyramid layers to be created (argument of
+ `build_pyramid`).
+ coarsening_xy: Coarsening factor between pyramid levels
+ aggregation_function: Function to be used when downsampling (argument
+ of `build_pyramid`).
+
+ """
+ # Read pixel sizes from Zarr attributes
+ ngff_image_meta=load_NgffImageMeta(zarr_url)
+ pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(level=0)
+
+ # Create list of indices for 3D ROIs
+ list_indices=convert_ROI_table_to_indices(
+ ROI_table,
+ level=0,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+ )
+ list_indices_ref=convert_ROI_table_to_indices(
+ ROI_table_ref,
+ level=0,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+ )
+
+ old_image_group=zarr.open_group(zarr_url,mode="r")
+ old_ngff_image_meta=load_NgffImageMeta(zarr_url)
+ new_image_group=zarr.group(new_zarr_url)
+ new_image_group.attrs.put(old_image_group.attrs.asdict())
+
+ # Loop over all channels. For each channel, write full-res image data.
+ data_array=da.from_zarr(old_image_group["0"])
+ # Create dask array with 0s of same shape
+ new_array=da.zeros_like(data_array)
+
+ # TODO: Add sanity checks on the 2 ROI tables:
+ # 1. The number of ROIs need to match
+ # 2. The size of the ROIs need to match
+ # (otherwise, we can't assign them to the reference regions)
+ # ROI_table_ref vs ROI_table_acq
+ for i, roi_indices in enumerate(list_indices):
+     reference_region = convert_indices_to_regions(list_indices_ref[i])
+     region = convert_indices_to_regions(roi_indices)
+
+     axes_list = old_ngff_image_meta.axes_names
+
+     if axes_list == ["c", "z", "y", "x"]:
+         num_channels = data_array.shape[0]
+         # Loop over channels
+         for ind_ch in range(num_channels):
+             idx = tuple(
+                 [slice(ind_ch, ind_ch + 1)] + list(reference_region)
+             )
+             new_array[idx] = load_region(
+                 data_zyx=data_array[ind_ch], region=region, compute=False
+             )
+     elif axes_list == ["z", "y", "x"]:
+         new_array[reference_region] = load_region(
+             data_zyx=data_array, region=region, compute=False
+         )
+     elif axes_list == ["c", "y", "x"]:
+         # TODO: Implement cyx case (based on looping over xy case)
+         raise NotImplementedError(
+             "`write_registered_zarr` has not been implemented for "
+             f"a zarr with {axes_list=}"
+         )
+     elif axes_list == ["y", "x"]:
+         # TODO: Implement yx case
+         raise NotImplementedError(
+             "`write_registered_zarr` has not been implemented for "
+             f"a zarr with {axes_list=}"
+         )
+     else:
+         raise NotImplementedError(
+             "`write_registered_zarr` has not been implemented for "
+             f"a zarr with {axes_list=}"
+         )
+
+ new_array.to_zarr(
+ f"{new_zarr_url}/0",
+ overwrite=True,
+ dimension_separator="/",
+ write_empty_chunks=False,
+ )
+
+ # Starting from on-disk highest-resolution data, build and write to
+ # disk a pyramid of coarser levels
+ build_pyramid(
+ zarrurl=new_zarr_url,
+ overwrite=True,
+ num_levels=num_levels,
+ coarsening_xy=coarsening_xy,
+ chunksize=data_array.chunksize,
+ aggregation_function=aggregation_function,
+ )
+
init_args: Initialization arguments provided by
+image_based_registration_hcs_init. They contain the
+reference_zarr_url that is used for registration
+(standard argument for Fractal tasks, managed by Fractal server).
roi_table: Name of the ROI table over which the task loops to
+calculate the registration. Examples: FOV_ROI_table => loop over
+the fields of view, well_ROI_table => process the whole well as
+one image.
@validate_arguments
+def calculate_registration_image_based(
+     *,
+     # Fractal arguments
+     zarr_url: str,
+     init_args: InitArgsRegistration,
+     # Core parameters
+     wavelength_id: str,
+     roi_table: str = "FOV_ROI_table",
+     level: int = 2,
+ ) -> None:
+"""
+ Calculate registration based on images
+
+ This task consists of 3 parts:
+
+ 1. Loading the images of a given ROI (=> loop over ROIs)
+ 2. Calculating the transformation for that ROI
+ 3. Storing the calculated transformation in the ROI table
+
+ Args:
+ zarr_url: Path or url to the individual OME-Zarr image to be processed.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ init_args: Initialization arguments provided by
+ `image_based_registration_hcs_init`. They contain the
+ reference_zarr_url that is used for registration.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ wavelength_id: Wavelength that will be used for image-based
+ registration; e.g. `A01_C01` for Yokogawa, `C01` for MD.
+ roi_table: Name of the ROI table over which the task loops to
+ calculate the registration. Examples: `FOV_ROI_table` => loop over
+ the fields of view, `well_ROI_table` => process the whole well as
+ one image.
+ level: Pyramid level of the image to be used for registration.
+ Choose `0` to process at full resolution.
+
+ """
+ logger.info(
+ f"Running for {zarr_url=}.\n"
+ f"Calculating translation registration per {roi_table=} for "
+ f"{wavelength_id=}."
+ )
+
+ # Read some parameters from Zarr metadata
+ ngff_image_meta=load_NgffImageMeta(str(init_args.reference_zarr_url))
+ coarsening_xy=ngff_image_meta.coarsening_xy
+
+ # Get channel_index via wavelength_id.
+ # Initially only allow registration of the same wavelength
+ channel_ref:OmeroChannel=get_channel_from_image_zarr(
+ image_zarr_path=init_args.reference_zarr_url,
+ wavelength_id=wavelength_id,
+ )
+ channel_index_ref=channel_ref.index
+
+ channel_align:OmeroChannel=get_channel_from_image_zarr(
+ image_zarr_path=zarr_url,
+ wavelength_id=wavelength_id,
+ )
+ channel_index_align=channel_align.index
+
+ # Lazily load zarr array
+ data_reference_zyx=da.from_zarr(
+ f"{init_args.reference_zarr_url}/{level}"
+ )[channel_index_ref]
+ data_alignment_zyx=da.from_zarr(f"{zarr_url}/{level}")[
+ channel_index_align
+ ]
+
+ # Read ROIs
+ ROI_table_ref=ad.read_zarr(
+ f"{init_args.reference_zarr_url}/tables/{roi_table}"
+ )
+ ROI_table_x=ad.read_zarr(f"{zarr_url}/tables/{roi_table}")
+ logger.info(
+ f"Found {len(ROI_table_x)} ROIs in {roi_table=} to be processed."
+ )
+
+ # Check that table type of ROI_table_ref is valid. Note that
+ # "ngff:region_table" and None are accepted for backwards compatibility
+ valid_table_types=[
+ "roi_table",
+ "masking_roi_table",
+ "ngff:region_table",
+ None,
+ ]
+ ROI_table_ref_group=zarr.open_group(
+ f"{init_args.reference_zarr_url}/tables/{roi_table}",
+ mode="r",
+ )
+ ref_table_attrs=ROI_table_ref_group.attrs.asdict()
+ ref_table_type=ref_table_attrs.get("type")
+ if ref_table_type not in valid_table_types:
+     raise ValueError(
+         (
+             f"Table '{roi_table}' (with type '{ref_table_type}') is "
+             "not a valid ROI table."
+         )
+     )
+
+ # For each acquisition, get the relevant info
+ # TODO: Add additional checks on ROIs?
+ # Raise if any ROI name differs between the two tables
+ if (ROI_table_ref.obs.index != ROI_table_x.obs.index).any():
+     raise ValueError(
+         "Registration is only implemented for ROIs that match between the "
+         "acquisitions (e.g. well, FOV ROIs). Here, the ROIs in the "
+         f"reference acquisitions were {ROI_table_ref.obs.index}, but the "
+         f"ROIs in the alignment acquisition were {ROI_table_x.obs.index}"
+     )
+ # TODO: Make this less restrictive? i.e. could we also run it if different
+ # acquisitions have different FOVs? But then how do we know which FOVs to
+ # match?
+ # If we relax this, downstream assumptions on matching based on order
+ # in the list will break.
+
+ # Read pixel sizes from zarr attributes
+ ngff_image_meta_acq_x=load_NgffImageMeta(zarr_url)
+ pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(level=0)
+ pxl_sizes_zyx_acq_x=ngff_image_meta_acq_x.get_pixel_sizes_zyx(level=0)
+
+ if pxl_sizes_zyx != pxl_sizes_zyx_acq_x:
+     raise ValueError(
+         "Pixel sizes need to be equal between acquisitions for "
+         "registration."
+     )
+
+ # Create list of indices for 3D ROIs spanning the entire Z direction
+ list_indices_ref=convert_ROI_table_to_indices(
+ ROI_table_ref,
+ level=level,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+ )
+ check_valid_ROI_indices(list_indices_ref,roi_table)
+
+ list_indices_acq_x=convert_ROI_table_to_indices(
+ ROI_table_x,
+ level=level,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+ )
+ check_valid_ROI_indices(list_indices_acq_x,roi_table)
+
+ num_ROIs = len(list_indices_ref)
+ compute = True
+ new_shifts = {}
+ for i_ROI in range(num_ROIs):
+     logger.info(
+         f"Now processing ROI {i_ROI+1}/{num_ROIs} "
+         f"for channel {channel_align}."
+     )
+     img_ref = load_region(
+         data_zyx=data_reference_zyx,
+         region=convert_indices_to_regions(list_indices_ref[i_ROI]),
+         compute=compute,
+     )
+     img_acq_x = load_region(
+         data_zyx=data_alignment_zyx,
+         region=convert_indices_to_regions(list_indices_acq_x[i_ROI]),
+         compute=compute,
+     )
+
+     ##############
+     # Calculate the transformation
+     ##############
+     # Basic version (no padding, no internal binning)
+     if img_ref.shape != img_acq_x.shape:
+         raise NotImplementedError(
+             "This registration is not implemented for ROIs with "
+             "different shapes between acquisitions."
+         )
+     shifts = phase_cross_correlation(
+         np.squeeze(img_ref), np.squeeze(img_acq_x)
+     )[0]
+
+     # Registration based on scmultiplex, image-based
+     # shifts, _, _ = calculate_shift(np.squeeze(img_ref),
+     # np.squeeze(img_acq_x), bin=binning, binarize=False)
+
+     # TODO: Make this work on label images
+     # (=> different loading) etc.
+
+     ##############
+     # Storing the calculated transformation
+     ##############
+     # Store the shift in ROI table
+     # TODO: Store in OME-NGFF transformations: Check SpatialData approach,
+     # per ROI storage?
+
+     # Adapt ROIs for the given ROI table:
+     ROI_name = ROI_table_ref.obs.index[i_ROI]
+     new_shifts[ROI_name] = calculate_physical_shifts(
+         shifts,
+         level=level,
+         coarsening_xy=coarsening_xy,
+         full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+     )
+
+ # Write physical shifts to disk (as part of the ROI table)
+ logger.info(f"Updating the {roi_table=} with translation columns")
+ image_group=zarr.group(zarr_url)
+ new_ROI_table=get_ROI_table_with_translation(ROI_table_x,new_shifts)
+ write_table(
+ image_group,
+ roi_table,
+ new_ROI_table,
+ overwrite=True,
+ table_attrs=ref_table_attrs,
+ )
+
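The task above obtains the per-ROI shift from `phase_cross_correlation`. To show the idea behind phase correlation without depending on that function, here is a NumPy-only sketch for integer shifts. `estimate_shift` is a hypothetical name, the sketch assumes a purely circular (wrap-around) translation, and it is not the scikit-image implementation.

```python
import numpy as np


def estimate_shift(reference, moving):
    """Estimate the integer shift that maps `moving` back onto `reference`."""
    # Normalized cross-power spectrum: keeps only the phase difference
    cross_power = np.fft.fftn(reference) * np.conj(np.fft.fftn(moving))
    cross_power /= np.abs(cross_power) + 1e-12
    # For a pure translation, the inverse transform peaks at the shift
    correlation = np.fft.ifftn(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    shift = np.array(peak, dtype=float)
    # Peaks in the upper half of an axis correspond to negative shifts
    for axis, size in enumerate(correlation.shape):
        if shift[axis] > size // 2:
            shift[axis] -= size
    return shift


rng = np.random.default_rng(0)
reference = rng.random((32, 32))
moving = np.roll(reference, shift=(3, -5), axis=(0, 1))
shift = estimate_shift(reference, moving)
```

Rolling `moving` by the estimated shift recovers `reference`. The task then converts such pixel shifts into physical units before storing them in the ROI table.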
channel2: Second channel for segmentation (in the same format as
+channel). If specified, cellpose runs in dual channel mode.
+For dual channel segmentation of cells, the first channel should
+contain the membrane marker, the second channel should contain the
+nuclear marker.
input_ROI_table: Name of the ROI table over which the task loops to
+apply Cellpose segmentation. Examples: FOV_ROI_table => loop over
+the fields of view, organoid_ROI_table => loop over the organoid
+ROI table (generated by another task), well_ROI_table => process
+the whole well as one image.
output_ROI_table: If provided, a ROI table with that name is created,
+which will contain the bounding boxes of the newly segmented
+labels. ROI tables should have ROI in their name.
use_masks: If True, try to use masked loading and fall back to
+use_masks=False if the ROI table is not suitable. Masked
+loading is relevant when only a subset of the bounding box should
+actually be processed (e.g. running within organoid_ROI_table).
diameter_level0: Expected diameter of the objects that should be
+segmented in pixels at level 0. Initial diameter is rescaled using
+the level that was selected. The rescaled value is passed as
+the diameter to the CellposeModel.eval method.
cellprob_threshold: Parameter of CellposeModel.eval method. Valid
+values between -6 and 6. From Cellpose documentation: "Decrease this
+threshold if cellpose is not returning as many ROIs as you’d
+expect. Similarly, increase this threshold if cellpose is returning
+too [many] ROIs particularly from dim areas."
flow_threshold: Parameter of CellposeModel.eval method. Valid
+values between 0.0 and 1.0. From Cellpose documentation: "Increase
+this threshold if cellpose is not returning as many ROIs as you’d
+expect. Similarly, decrease this threshold if cellpose is returning
+too many ill-shaped ROIs."
normalize: By default, data is normalized so 0.0=1st percentile and
+1.0=99th percentile of image intensities in each channel.
+This automatic normalization can lead to issues when the image to
+be segmented is very sparse. You can turn off the default
+rescaling. With the "custom" option, you can either provide your
+own rescaling percentiles or fixed rescaling upper and lower
+bound integers.
net_avg: Parameter of CellposeModel class. Whether to use cellpose
+net averaging to run the 4 built-in networks (useful for nuclei,
+cyto and cyto2, not sure it works for the others).
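As a worked example of the diameter rescaling mentioned for diameter_level0: assuming each pyramid level shrinks X and Y by a factor of coarsening_xy, a diameter given in level-0 pixels shrinks by coarsening_xy ** level at the selected level. The helper name `rescale_diameter` is hypothetical, and the formula is an assumption derived from that description.

```python
def rescale_diameter(
    diameter_level0: float, level: int, coarsening_xy: int = 2
) -> float:
    """Convert an object diameter from level-0 pixels to `level` pixels."""
    return diameter_level0 / coarsening_xy**level


# 30 px at full resolution, segmented at pyramid level 2 with 2x coarsening
diameter = rescale_diameter(30.0, level=2, coarsening_xy=2)
print(diameter)
```

With the default of 30 pixels at level 0, segmenting at level 2 would pass a diameter of 7.5 pixels to the model under this assumption.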
@validate_arguments
+def cellpose_segmentation(
+     *,
+     # Fractal parameters
+     zarr_url: str,
+     # Core parameters
+     level: int,
+     channel: ChannelInputModel,
+     channel2: Optional[ChannelInputModel] = None,
+     input_ROI_table: str = "FOV_ROI_table",
+     output_ROI_table: Optional[str] = None,
+     output_label_name: Optional[str] = None,
+     # Cellpose-related arguments
+     diameter_level0: float = 30.0,
+     # https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/401 # noqa E501
+     model_type: Literal[tuple(models.MODEL_NAMES)] = "cyto2",
+     pretrained_model: Optional[str] = None,
+     # Advanced parameters
+     cellprob_threshold: float = 0.0,
+     flow_threshold: float = 0.4,
+     normalize: CellposeCustomNormalizer = CellposeCustomNormalizer(),
+     anisotropy: Optional[float] = None,
+     min_size: int = 15,
+     augment: bool = False,
+     net_avg: bool = False,
+     use_gpu: bool = True,
+     batch_size: int = 8,
+     invert: bool = False,
+     tile: bool = True,
+     tile_overlap: float = 0.1,
+     resample: bool = True,
+     interp: bool = True,
+     stitch_threshold: float = 0.0,
+     use_masks: bool = True,
+     relabeling: bool = True,
+     overwrite: bool = True,
+ ) -> None:
+"""
+ Run cellpose segmentation on the ROIs of a single OME-Zarr image.
+
+ Args:
+ zarr_url: Path or url to the individual OME-Zarr image to be processed.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ level: Pyramid level of the image to be segmented. Choose `0` to
+ process at full resolution.
+ channel: Primary channel for segmentation; requires either
+ `wavelength_id` (e.g. `A01_C01`) or `label` (e.g. `DAPI`).
+ channel2: Second channel for segmentation (in the same format as
+ `channel`). If specified, cellpose runs in dual channel mode.
+ For dual channel segmentation of cells, the first channel should
+ contain the membrane marker, the second channel should contain the
+ nuclear marker.
+ input_ROI_table: Name of the ROI table over which the task loops to
+ apply Cellpose segmentation. Examples: `FOV_ROI_table` => loop over
+ the fields of view, `organoid_ROI_table` => loop over the organoid
+ ROI table (generated by another task), `well_ROI_table` => process
+ the whole well as one image.
+ output_ROI_table: If provided, a ROI table with that name is created,
+ which will contain the bounding boxes of the newly segmented
+ labels. ROI tables should have `ROI` in their name.
+ use_masks: If `True`, try to use masked loading and fall back to
+ `use_masks=False` if the ROI table is not suitable. Masked
+ loading is relevant when only a subset of the bounding box should
+ actually be processed (e.g. running within `organoid_ROI_table`).
+ output_label_name: Name of the output label image (e.g. `"organoids"`).
+ relabeling: If `True`, apply relabeling so that label values are
+ unique for all objects in the well.
+ diameter_level0: Expected diameter of the objects that should be
+ segmented in pixels at level 0. Initial diameter is rescaled using
+ the `level` that was selected. The rescaled value is passed as
+ the diameter to the `CellposeModel.eval` method.
+ model_type: Parameter of `CellposeModel` class. Defines which model
+ should be used. Typical choices are `nuclei`, `cyto`, `cyto2`, etc.
+ pretrained_model: Parameter of `CellposeModel` class (takes
+ precedence over `model_type`). Allows you to specify the path of
+ a custom trained cellpose model.
+ cellprob_threshold: Parameter of `CellposeModel.eval` method. Valid
+ values between -6 and 6. From Cellpose documentation: "Decrease this
+ threshold if cellpose is not returning as many ROIs as you’d
+ expect. Similarly, increase this threshold if cellpose is returning
+ too [many] ROIs particularly from dim areas."
+ flow_threshold: Parameter of `CellposeModel.eval` method. Valid
+ values between 0.0 and 1.0. From Cellpose documentation: "Increase
+ this threshold if cellpose is not returning as many ROIs as you’d
+ expect. Similarly, decrease this threshold if cellpose is returning
+ too many ill-shaped ROIs."
+ normalize: By default, data is normalized so 0.0=1st percentile and
+ 1.0=99th percentile of image intensities in each channel.
+ This automatic normalization can lead to issues when the image to
+ be segmented is very sparse. You can turn off the default
+ rescaling. With the "custom" option, you can either provide your
+ own rescaling percentiles or fixed rescaling upper and lower
+ bound integers.
+ anisotropy: Ratio of the pixel sizes along Z and XY axis (ignored if
+ the image is not three-dimensional). If `None`, it is inferred from
+ the OME-NGFF metadata.
+ min_size: Parameter of `CellposeModel` class. Minimum size of the
+ segmented objects (in pixels). Use `-1` to turn off the size
+ filter.
+ augment: Parameter of `CellposeModel` class. Whether to use cellpose
+ augmentation to tile images with overlap.
+ net_avg: Parameter of `CellposeModel` class. Whether to use cellpose
+ net averaging to run the 4 built-in networks (useful for `nuclei`,
+ `cyto` and `cyto2`, not sure it works for the others).
+ use_gpu: If `False`, always use the CPU; if `True`, use the GPU if
+ possible (as defined in `cellpose.core.use_gpu()`) and fall-back
+ to the CPU otherwise.
+ batch_size: number of 224x224 patches to run simultaneously on the GPU
+ (can make smaller or bigger depending on GPU memory usage)
+ invert: invert image pixel intensity before running network (if True,
+ image is also normalized)
+ tile: tiles image to ensure GPU/CPU memory usage limited (recommended)
+ tile_overlap: fraction of overlap of tiles when computing flows
+ resample: run dynamics at original image size (will be slower but
+ create more accurate boundaries)
+ interp: interpolate during 2D dynamics (not available in 3D)
+ (in previous versions it was False, now it defaults to True)
+ stitch_threshold: if stitch_threshold>0.0 and not do_3D and equal
+ image sizes, masks are stitched in 3D to return volume segmentation
+ overwrite: If `True`, overwrite the task output.
+ """
+
+ # Set input path
+ logger.info(f"{zarr_url=}")
+
+ # Preliminary checks on Cellpose model
+ if pretrained_model:
+     if not os.path.exists(pretrained_model):
+         raise ValueError(f"{pretrained_model=} does not exist.")
+
+ # Read attributes from NGFF metadata
+ ngff_image_meta=load_NgffImageMeta(zarr_url)
+ num_levels=ngff_image_meta.num_levels
+ coarsening_xy=ngff_image_meta.coarsening_xy
+ full_res_pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(level=0)
+ actual_res_pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(level=level)
+ logger.info(f"NGFF image has {num_levels=}")
+ logger.info(f"NGFF image has {coarsening_xy=}")
+ logger.info(
+ f"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}"
+ )
+ logger.info(
+ f"NGFF image has level-{level} pixel sizes "
+ f"{actual_res_pxl_sizes_zyx}"
+ )
+
+ # Find channel index
+ try:
+     tmp_channel: OmeroChannel = get_channel_from_image_zarr(
+         image_zarr_path=zarr_url,
+         wavelength_id=channel.wavelength_id,
+         label=channel.label,
+     )
+ except ChannelNotFoundError as e:
+     logger.warning(
+         "Channel not found, exit from the task.\n"
+         f"Original error: {str(e)}"
+     )
+     return None
+ ind_channel = tmp_channel.index
+
+ # Find channel index for second channel, if one is provided
+ if channel2:
+     try:
+         tmp_channel_c2: OmeroChannel = get_channel_from_image_zarr(
+             image_zarr_path=zarr_url,
+             wavelength_id=channel2.wavelength_id,
+             label=channel2.label,
+         )
+     except ChannelNotFoundError as e:
+         logger.warning(
+             f"Second channel with wavelength_id: {channel2.wavelength_id} "
+             f"and label: {channel2.label} not found, exit from the task.\n"
+             f"Original error: {str(e)}"
+         )
+         return None
+     ind_channel_c2 = tmp_channel_c2.index
+
+ # Set channel label
+ if output_label_name is None:
+     try:
+         channel_label = tmp_channel.label
+         output_label_name = f"label_{channel_label}"
+     except (KeyError, IndexError):
+         output_label_name = f"label_{ind_channel}"
+
+ # Load ZYX data
+ data_zyx = da.from_zarr(f"{zarr_url}/{level}")[ind_channel]
+ logger.info(f"{data_zyx.shape=}")
+ if channel2:
+     data_zyx_c2 = da.from_zarr(f"{zarr_url}/{level}")[ind_channel_c2]
+     logger.info(f"Second channel: {data_zyx_c2.shape=}")
+
+ # Read ROI table
+ ROI_table_path = f"{zarr_url}/tables/{input_ROI_table}"
+ ROI_table = ad.read_zarr(ROI_table_path)
+
+ # Perform some checks on the ROI table
+ valid_ROI_table = is_ROI_table_valid(
+     table_path=ROI_table_path, use_masks=use_masks
+ )
+ if use_masks and not valid_ROI_table:
+     logger.info(
+         f"ROI table at {ROI_table_path} cannot be used for masked "
+         "loading. Set use_masks=False."
+     )
+     use_masks = False
+ logger.info(f"{use_masks=}")
+
+ # Create list of indices for 3D ROIs spanning the entire Z direction
+ list_indices=convert_ROI_table_to_indices(
+ ROI_table,
+ level=level,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,
+ )
+ check_valid_ROI_indices(list_indices,input_ROI_table)
+
+ # If we are not planning to use masked loading, fail for overlapping ROIs
+ if not use_masks:
+     overlap = find_overlaps_in_ROI_indices(list_indices)
+     if overlap:
+         raise ValueError(
+             f"ROI indices created from {input_ROI_table} table have "
+             "overlaps, but we are not using masked loading."
+         )
+
+ # Select 2D/3D behavior and set some parameters
+ do_3D = data_zyx.shape[0] > 1 and len(data_zyx.shape) == 3
+ if do_3D:
+     if anisotropy is None:
+         # Compute anisotropy as pixel_size_z/pixel_size_x
+         anisotropy = (
+             actual_res_pxl_sizes_zyx[0] / actual_res_pxl_sizes_zyx[2]
+         )
+     logger.info(f"Anisotropy: {anisotropy}")
+
+ # Rescale datasets (only relevant for level>0)
+ if ngff_image_meta.axes_names[0] != "c":
+     raise ValueError(
+         "Cannot set `remove_channel_axis=True` for multiscale "
+         f"metadata with axes={ngff_image_meta.axes_names}. "
+         'First axis should have name "c".'
+     )
+ new_datasets = rescale_datasets(
+     datasets=[ds.dict() for ds in ngff_image_meta.datasets],
+     coarsening_xy=coarsening_xy,
+     reference_level=level,
+     remove_channel_axis=True,
+ )
+
+ label_attrs = {
+     "image-label": {
+         "version": __OME_NGFF_VERSION__,
+         "source": {"image": "../../"},
+     },
+     "multiscales": [
+         {
+             "name": output_label_name,
+             "version": __OME_NGFF_VERSION__,
+             "axes": [
+                 ax.dict()
+                 for ax in ngff_image_meta.multiscale.axes
+                 if ax.type != "channel"
+             ],
+             "datasets": new_datasets,
+         }
+     ],
+ }
+
+ image_group=zarr.group(zarr_url)
+ label_group=prepare_label_group(
+ image_group,
+ output_label_name,
+ overwrite=overwrite,
+ label_attrs=label_attrs,
+ logger=logger,
+ )
+
+ logger.info(
+ f"Helper function `prepare_label_group` returned {label_group=}"
+ )
+ logger.info(f"Output label path: {zarr_url}/labels/{output_label_name}/0")
+ store=zarr.storage.FSStore(f"{zarr_url}/labels/{output_label_name}/0")
+ label_dtype=np.uint32
+
+ # Ensure that all output shapes & chunks are 3D (for 2D data: (1, y, x))
+ # https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/398
+ shape = data_zyx.shape
+ if len(shape) == 2:
+     shape = (1, *shape)
+ chunks = data_zyx.chunksize
+ if len(chunks) == 2:
+     chunks = (1, *chunks)
+ mask_zarr = zarr.create(
+     shape=shape,
+     chunks=chunks,
+     dtype=label_dtype,
+     store=store,
+     overwrite=False,
+     dimension_separator="/",
+ )
+
+ logger.info(
+ f"mask will have shape {data_zyx.shape} "
+ f"and chunks {data_zyx.chunks}"
+ )
+
+ # Initialize cellpose
+ gpu=use_gpuandcellpose.core.use_gpu()
+ ifpretrained_model:
+ model=models.CellposeModel(
+ gpu=gpu,pretrained_model=pretrained_model
+ )
+ else:
+ model=models.CellposeModel(gpu=gpu,model_type=model_type)
+
+ # Initialize other things
+ logger.info(f"Start cellpose_segmentation task for {zarr_url}")
+ logger.info(f"relabeling: {relabeling}")
+ logger.info(f"do_3D: {do_3D}")
+ logger.info(f"use_gpu: {gpu}")
+ logger.info(f"level: {level}")
+ logger.info(f"model_type: {model_type}")
+ logger.info(f"pretrained_model: {pretrained_model}")
+ logger.info(f"anisotropy: {anisotropy}")
+ logger.info("Total well shape/chunks:")
+ logger.info(f"{data_zyx.shape}")
+ logger.info(f"{data_zyx.chunks}")
+ ifchannel2:
+ logger.info("Dual channel input for cellpose model")
+ logger.info(f"{data_zyx_c2.shape}")
+ logger.info(f"{data_zyx_c2.chunks}")
+
+ # Counters for relabeling
+ ifrelabeling:
+ num_labels_tot=0
+
+ # Iterate over ROIs
+ num_ROIs=len(list_indices)
+
+ ifoutput_ROI_table:
+ bbox_dataframe_list=[]
+
+ logger.info(f"Now starting loop over {num_ROIs} ROIs")
+ fori_ROI,indicesinenumerate(list_indices):
+ # Define region
+ s_z,e_z,s_y,e_y,s_x,e_x=indices[:]
+ region=(
+ slice(s_z,e_z),
+ slice(s_y,e_y),
+ slice(s_x,e_x),
+ )
+ logger.info(f"Now processing ROI {i_ROI+1}/{num_ROIs}")
+
+ # Prepare single-channel or dual-channel input for cellpose
+ ifchannel2:
+ # Dual channel mode, first channel is the membrane channel
+ img_1=load_region(
+ data_zyx,
+ region,
+ compute=True,
+ return_as_3D=True,
+ )
+ img_np=np.zeros((2,*img_1.shape))
+ img_np[0,:,:,:]=img_1
+ img_np[1,:,:,:]=load_region(
+ data_zyx_c2,
+ region,
+ compute=True,
+ return_as_3D=True,
+ )
+ channels=[1,2]
+ else:
+ img_np=np.expand_dims(
+ load_region(data_zyx,region,compute=True,return_as_3D=True),
+ axis=0,
+ )
+ channels=[0,0]
+
+ # Prepare keyword arguments for segment_ROI function
+ kwargs_segment_ROI=dict(
+ model=model,
+ channels=channels,
+ do_3D=do_3D,
+ anisotropy=anisotropy,
+ label_dtype=label_dtype,
+ diameter=diameter_level0/coarsening_xy**level,
+ cellprob_threshold=cellprob_threshold,
+ flow_threshold=flow_threshold,
+ normalize=normalize,
+ min_size=min_size,
+ augment=augment,
+ net_avg=net_avg,
+ batch_size=batch_size,
+ invert=invert,
+ tile=tile,
+ tile_overlap=tile_overlap,
+ resample=resample,
+ interp=interp,
+ stitch_threshold=stitch_threshold,
+ )
+
+ # Prepare keyword arguments for preprocessing function
+ preprocessing_kwargs={}
+ ifuse_masks:
+ preprocessing_kwargs=dict(
+ region=region,
+ current_label_path=f"{zarr_url}/labels/{output_label_name}/0",
+ ROI_table_path=ROI_table_path,
+ ROI_positional_index=i_ROI,
+ )
+
+ # Call segment_ROI through the masked-loading wrapper, which includes
+ # pre/post-processing functions if needed
+ new_label_img=masked_loading_wrapper(
+ image_array=img_np,
+ function=segment_ROI,
+ kwargs=kwargs_segment_ROI,
+ use_masks=use_masks,
+ preprocessing_kwargs=preprocessing_kwargs,
+ )
+
+ # Shift labels and update relabeling counters
+ ifrelabeling:
+ num_labels_roi=np.max(new_label_img)
+ new_label_img[new_label_img>0]+=num_labels_tot
+ num_labels_tot+=num_labels_roi
+
+ # Write some logs
+ logger.info(f"ROI {indices}, {num_labels_roi=}, {num_labels_tot=}")
+
+ # Check that total number of labels is under control
+ if num_labels_tot > np.iinfo(label_dtype).max:
+ raise ValueError(
+ "ERROR in re-labeling: "
+ f"Reached {num_labels_tot} labels, "
+ f"but dtype={label_dtype}"
+ )
+
+ ifoutput_ROI_table:
+ bbox_df=array_to_bounding_box_table(
+ new_label_img,
+ actual_res_pxl_sizes_zyx,
+ origin_zyx=(s_z,s_y,s_x),
+ )
+
+ bbox_dataframe_list.append(bbox_df)
+
+ overlap_list=[]
+ fordfinbbox_dataframe_list:
+ overlap_list.extend(
+ get_overlapping_pairs_3D(df,full_res_pxl_sizes_zyx)
+ )
+ iflen(overlap_list)>0:
+ logger.warning(
+ f"{len(overlap_list)} bounding-box pairs overlap"
+ )
+
+ # Compute and store 0-th level to disk
+ da.array(new_label_img).to_zarr(
+ url=mask_zarr,
+ region=region,
+ compute=True,
+ )
+
+ logger.info(
+ f"End cellpose_segmentation task for {zarr_url}, "
+ "now building pyramids."
+ )
+
+ # Starting from on-disk highest-resolution data, build and write to disk a
+ # pyramid of coarser levels
+ build_pyramid(
+ zarrurl=f"{zarr_url}/labels/{output_label_name}",
+ overwrite=overwrite,
+ num_levels=num_levels,
+ coarsening_xy=coarsening_xy,
+ chunksize=chunks,
+ aggregation_function=np.max,
+ )
+
+ logger.info("End building pyramids")
+
+ ifoutput_ROI_table:
+ # Handle the case where `bbox_dataframe_list` is empty (typically
+ # because list_indices is also empty)
+ iflen(bbox_dataframe_list)==0:
+ bbox_dataframe_list=[empty_bounding_box_table()]
+ # Concatenate all ROI dataframes
+ df_well=pd.concat(bbox_dataframe_list,axis=0,ignore_index=True)
+ df_well.index=df_well.index.astype(str)
+ # Extract labels and drop them from df_well
+ labels=pd.DataFrame(df_well["label"].astype(str))
+ df_well.drop(labels=["label"],axis=1,inplace=True)
+ # Convert all to float (warning: some would be int, in principle)
+ bbox_dtype=np.float32
+ df_well=df_well.astype(bbox_dtype)
+ # Convert to anndata
+ bbox_table=ad.AnnData(df_well,dtype=bbox_dtype)
+ bbox_table.obs=labels
+
+ # Write to zarr group
+ image_group=zarr.group(zarr_url)
+ logger.info(
+ "Now writing bounding-box ROI table to "
+ f"{zarr_url}/tables/{output_ROI_table}"
+ )
+ table_attrs={
+ "type":"masking_roi_table",
+ "region":{"path":f"../labels/{output_label_name}"},
+ "instance_key":"label",
+ }
+ write_table(
+ image_group,
+ output_ROI_table,
+ bbox_table,
+ overwrite=overwrite,
+ table_attrs=table_attrs,
+ )
+
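The relabeling counters used in the ROI loop above can be illustrated in isolation. This is a minimal sketch, not the task's code: plain Python lists stand in for NumPy label arrays, and `shift_labels` and `UINT32_MAX` are hypothetical names introduced only for this example.

```python
# Hypothetical stand-alone illustration of the relabeling counters.
UINT32_MAX = 2**32 - 1  # largest value representable by np.uint32


def shift_labels(
    roi_labels: list[int], num_labels_tot: int
) -> tuple[list[int], int]:
    """Shift non-zero labels by the running total; return the new total."""
    num_labels_roi = max(roi_labels, default=0)
    # Background (0) stays 0, every other label is offset
    shifted = [lab + num_labels_tot if lab > 0 else 0 for lab in roi_labels]
    num_labels_tot += num_labels_roi
    if num_labels_tot > UINT32_MAX:
        raise ValueError(f"Reached {num_labels_tot} labels, above uint32 max.")
    return shifted, num_labels_tot


total = 0
roi_a, total = shift_labels([0, 1, 2, 2], total)  # first ROI: no offset yet
roi_b, total = shift_labels([1, 0, 3, 0], total)  # second ROI: offset by 2
```

Because each ROI is offset by the labels counted so far, labels from different ROIs never collide in the final label image.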
defsegment_ROI(
+ x:np.ndarray,
+ model:models.CellposeModel=None,
+ do_3D:bool=True,
+ channels:list[int]=[0,0],
+ anisotropy:Optional[float]=None,
+ diameter:float=30.0,
+ cellprob_threshold:float=0.0,
+ flow_threshold:float=0.4,
+ normalize:CellposeCustomNormalizer=CellposeCustomNormalizer(),
+ label_dtype:Optional[np.dtype]=None,
+ augment:bool=False,
+ net_avg:bool=False,
+ min_size:int=15,
+ batch_size:int=8,
+ invert:bool=False,
+ tile:bool=True,
+ tile_overlap:float=0.1,
+ resample:bool=True,
+ interp:bool=True,
+ stitch_threshold:float=0.0,
+)->np.ndarray:
+"""
+ Internal function that runs Cellpose segmentation for a single ROI.
+
+ Args:
+ x: 4D numpy array.
+ model: An instance of `models.CellposeModel`.
+ do_3D: If `True`, cellpose runs in 3D mode: runs on xy, xz & yz planes,
+ then averages the flows.
+ channels: Which channels to use. If only one channel is provided, `[0,
+ 0]` should be used. If two channels are provided (the first
+ dimension of `x` has length of 2), `[1, 2]` should be used
+ (`x[0, :, :,:]` contains the membrane channel and
+ `x[1, :, :, :]` contains the nuclear channel).
+ anisotropy: Set anisotropy rescaling factor for Z dimension.
+ diameter: Expected object diameter in pixels for cellpose.
+ cellprob_threshold: Cellpose model parameter.
+ flow_threshold: Cellpose model parameter.
+ normalize: normalize data so 0.0=1st percentile and 1.0=99th
+ percentile of image intensities in each channel. This automatic
+ normalization can lead to issues when the image to be segmented
+ is very sparse.
+ label_dtype: Label images are cast into this `np.dtype`.
+ augment: Whether to use cellpose augmentation to tile images with
+ overlap.
+ net_avg: Whether to use cellpose net averaging to run the 4 built-in
+ networks (useful for `nuclei`, `cyto` and `cyto2`, not sure it
+ works for the others).
+ min_size: Minimum size of the segmented objects.
+ batch_size: number of 224x224 patches to run simultaneously on the GPU
+ (can make smaller or bigger depending on GPU memory usage)
+ invert: invert image pixel intensity before running network (if True,
+ image is also normalized)
+ tile: tiles image to ensure GPU/CPU memory usage limited (recommended)
+ tile_overlap: fraction of overlap of tiles when computing flows
+ resample: run dynamics at original image size (will be slower but
+ create more accurate boundaries)
+ interp: interpolate during 2D dynamics (not available in 3D)
+ (in previous versions it was False, now it defaults to True)
+ stitch_threshold: if stitch_threshold>0.0 and not do_3D and equal
+ image sizes, masks are stitched in 3D to return volume segmentation
+ """
+
+ # Write some debugging info
+ logger.info(
+ "[segment_ROI] START |"
+ f" x: {type(x)}, {x.shape} |"
+ f" {do_3D=} |"
+ f" {model.diam_mean=} |"
+ f" {diameter=} |"
+ f" {flow_threshold=} |"
+ f" {normalize.type=}"
+ )
+
+ # Optionally perform custom normalization
+ ifnormalize.type=="custom":
+ x=normalized_img(
+ x,
+ lower_p=normalize.lower_percentile,
+ upper_p=normalize.upper_percentile,
+ lower_bound=normalize.lower_bound,
+ upper_bound=normalize.upper_bound,
+ )
+
+ # Actual labeling
+ t0=time.perf_counter()
+ mask,_,_=model.eval(
+ x,
+ channels=channels,
+ do_3D=do_3D,
+ net_avg=net_avg,
+ augment=augment,
+ diameter=diameter,
+ anisotropy=anisotropy,
+ cellprob_threshold=cellprob_threshold,
+ flow_threshold=flow_threshold,
+ normalize=normalize.cellpose_normalize,
+ min_size=min_size,
+ batch_size=batch_size,
+ invert=invert,
+ tile=tile,
+ tile_overlap=tile_overlap,
+ resample=resample,
+ interp=interp,
+ stitch_threshold=stitch_threshold,
+ )
+
+ ifmask.ndim==2:
+ # If we get a 2D image, we still return it as a 3D array
+ mask=np.expand_dims(mask,axis=0)
+ t1=time.perf_counter()
+
+ # Write some debugging info
+ logger.info(
+ "[segment_ROI] END |"
+ f" Elapsed: {t1-t0:.3f} s |"
+ f" {mask.shape=},"
+ f" {mask.dtype=} (then {label_dtype}),"
+ f" {np.max(mask)=} |"
+ f" {model.diam_mean=} |"
+ f" {diameter=} |"
+ f" {flow_threshold=}"
+ )
+
+ returnmask.astype(label_dtype)
+
Validator to handle different normalization scenarios for Cellpose models.
+
If type="default", then Cellpose default normalization is
+used and no other parameters can be specified.
+If type="no_normalization", then no normalization is used and no
+other parameters can be specified.
+If type="custom", then either percentiles or explicit integer
+bounds can be applied.
+
ATTRIBUTE: DESCRIPTION
+
type: One of default (Cellpose default normalization), custom
+(using the other custom parameters) or no_normalization.
+
lower_percentile: Specify a custom lower-bound percentile for rescaling
+as a float value between 0 and 100. Set to 1 to run the same as
+default. You can only specify percentiles or bounds, not both.
+
upper_percentile: Specify a custom upper-bound percentile for rescaling
+as a float value between 0 and 100. Set to 99 to run the same as
+default, set to e.g. 99.99 if the default rescaling was too harsh.
+You can only specify percentiles or bounds, not both.
classCellposeCustomNormalizer(BaseModel):
+"""
+ Validator to handle different normalization scenarios for Cellpose models
+
+ If `type="default"`, then Cellpose default normalization is
+ used and no other parameters can be specified.
+ If `type="no_normalization"`, then no normalization is used and no
+ other parameters can be specified.
+ If `type="custom"`, then either percentiles or explicit integer
+ bounds can be applied.
+
+ Attributes:
+ type:
+ One of `default` (Cellpose default normalization), `custom`
+ (using the other custom parameters) or `no_normalization`.
+ lower_percentile: Specify a custom lower-bound percentile for rescaling
+ as a float value between 0 and 100. Set to 1 to run the same as
+ default. You can only specify percentiles or bounds, not both.
+ upper_percentile: Specify a custom upper-bound percentile for rescaling
+ as a float value between 0 and 100. Set to 99 to run the same as
+ default, set to e.g. 99.99 if the default rescaling was too harsh.
+ You can only specify percentiles or bounds, not both.
+ lower_bound: Explicit lower bound value to rescale the image at.
+ Needs to be an integer, e.g. 100.
+ You can only specify percentiles or bounds, not both.
+ upper_bound: Explicit upper bound value to rescale the image at.
+ Needs to be an integer, e.g. 2000.
+ You can only specify percentiles or bounds, not both.
+ """
+
+ type:Literal["default","custom","no_normalization"]="default"
+ lower_percentile:Optional[float]=Field(None,ge=0,le=100)
+ upper_percentile:Optional[float]=Field(None,ge=0,le=100)
+ lower_bound:Optional[int]=None
+ upper_bound:Optional[int]=None
+
+ # In the future, add an option to allow using precomputed percentiles
+ # that are stored in OME-Zarr histograms, and use this pydantic model to
+ # check that those histograms actually exist
+
+ @root_validator
+ defvalidate_conditions(cls,values):
+ # Extract values
+ type=values.get("type")
+ lower_percentile=values.get("lower_percentile")
+ upper_percentile=values.get("upper_percentile")
+ lower_bound=values.get("lower_bound")
+ upper_bound=values.get("upper_bound")
+
+ # Verify that custom parameters are only provided when type="custom"
+ iftype!="custom":
+ iflower_percentileisnotNone:
+ raiseValueError(
+ f"Type='{type}' but {lower_percentile=}. "
+ "Hint: set type='custom'."
+ )
+ ifupper_percentileisnotNone:
+ raiseValueError(
+ f"Type='{type}' but {upper_percentile=}. "
+ "Hint: set type='custom'."
+ )
+ iflower_boundisnotNone:
+ raiseValueError(
+ f"Type='{type}' but {lower_bound=}. "
+ "Hint: set type='custom'."
+ )
+ ifupper_boundisnotNone:
+ raiseValueError(
+ f"Type='{type}' but {upper_bound=}. "
+ "Hint: set type='custom'."
+ )
+
+ # The only valid options are:
+ # 1. Both percentiles are set and both bounds are unset
+ # 2. Both bounds are set and both percentiles are unset
+ are_percentiles_set=(
+ lower_percentileisnotNone,
+ upper_percentileisnotNone,
+ )
+ are_bounds_set=(
+ lower_boundisnotNone,
+ upper_boundisnotNone,
+ )
+ iflen(set(are_percentiles_set))!=1:
+ raiseValueError(
+ "Both lower_percentile and upper_percentile must be set "
+ "together."
+ )
+ iflen(set(are_bounds_set))!=1:
+ raiseValueError(
+ "Both lower_bound and upper_bound must be set together"
+ )
+ iflower_percentileisnotNoneandlower_boundisnotNone:
+ raiseValueError(
+ "You cannot set both explicit bounds and percentile bounds "
+ "at the same time. Hint: use only one of the two options."
+ )
+
+ returnvalues
+
+ @property
+ defcellpose_normalize(self)->bool:
+"""
+ Determine whether cellpose should apply its internal normalization.
+
+ If type is set to `custom` or `no_normalization`, don't apply cellpose
+ internal normalization
+ """
+ returnself.type=="default"
+
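The validation rules of the normalizer can be restated without pydantic. The sketch below is an illustration only: `check_normalizer_args` is a hypothetical helper that returns a short message instead of raising, and it mirrors the checks in the order they appear in `validate_conditions`.

```python
from typing import Optional


def check_normalizer_args(
    norm_type: str = "default",
    lower_percentile: Optional[float] = None,
    upper_percentile: Optional[float] = None,
    lower_bound: Optional[int] = None,
    upper_bound: Optional[int] = None,
) -> str:
    """Return "ok" or a short reason why the combination is rejected."""
    custom = [lower_percentile, upper_percentile, lower_bound, upper_bound]
    # Custom parameters are only allowed together with norm_type="custom"
    if norm_type != "custom" and any(v is not None for v in custom):
        return "custom parameters require type='custom'"
    # Percentiles come as a pair, and so do bounds
    if (lower_percentile is None) != (upper_percentile is None):
        return "percentiles must be set together"
    if (lower_bound is None) != (upper_bound is None):
        return "bounds must be set together"
    # Percentiles and bounds are mutually exclusive
    if lower_percentile is not None and lower_bound is not None:
        return "percentiles and bounds are mutually exclusive"
    return "ok"


result_default = check_normalizer_args()
result_custom = check_normalizer_args("custom", 1.0, 99.0)
result_mixed = check_normalizer_args("custom", 1.0, 99.0, 100, 2000)
```

As in the validator shown above, `type="custom"` with no further parameters passes these checks.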
def normalize_bounds(Y: np.ndarray, lower: int = 0, upper: int = 65535):
+    """normalize image so 0.0 is lower value and 1.0 is upper value
+
+    Args:
+        Y: The image to be normalized
+        lower: Lower normalization value
+        upper: Upper normalization value
+
+    """
+    X = Y.copy()
+    X = (X - lower) / (upper - lower)
+    return X
+
def normalize_percentile(Y: np.ndarray, lower: float = 1, upper: float = 99):
+    """normalize image so 0.0 is lower percentile and 1.0 is upper percentile
+    Percentiles are passed as floats (must be between 0 and 100)
+
+    Args:
+        Y: The image to be normalized
+        lower: Lower percentile
+        upper: Upper percentile
+
+    """
+    X = Y.copy()
+    x01 = np.percentile(X, lower)
+    x99 = np.percentile(X, upper)
+    X = (X - x01) / (x99 - x01)
+    return X
+
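Both normalization helpers apply the same linear rescaling, differing only in how the lower and upper reference values are obtained. The plain-Python sketch below (no NumPy; `rescale` is a hypothetical name) shows the arithmetic on a handful of made-up intensity values.

```python
def rescale(values: list[float], lower: float, upper: float) -> list[float]:
    """Map `lower` to 0.0 and `upper` to 1.0, without clipping."""
    return [(v - lower) / (upper - lower) for v in values]


# Illustrative intensities; 2950 lies above the chosen upper bound
pixels = [100, 1050, 2000, 2950]
scaled = rescale(pixels, lower=100, upper=2000)
```

Since the functions above do not clip, values outside the `[lower, upper]` range end up outside `[0, 1]`.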
This task is run after an init task (typically
+cellvoyager_to_ome_zarr_init or
+cellvoyager_to_ome_zarr_init_multiplex), and it populates the empty
+OME-Zarr files that were prepared.
+
Note that the current task always overwrites existing data. To avoid this
+behavior, set the overwrite argument of the init task to False.
+
PARAMETER: DESCRIPTION
+
zarr_url: Path or url to the individual OME-Zarr image to be processed.
+(standard argument for Fractal tasks, managed by Fractal server).
def sort_fun(filename: str) -> list[int]:
+    """
+    Takes a string (filename of a Yokogawa image), extracts site and
+    z-index metadata and returns them as a list of integers.
+
+    Args:
+        filename: Name of the image file.
+    """
+
+    filename_metadata = parse_filename(filename)
+    site = int(filename_metadata["F"])
+    z_index = int(filename_metadata["Z"])
+    return [site, z_index]
+
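A function like `sort_fun` is typically used as a sort key. The sketch below is self-contained and therefore does not call `parse_filename`; instead it uses a hypothetical regular expression that assumes the site follows an `F` and the z-index sits between `Z` and `C` in the filename. The filenames are invented for illustration.

```python
import re

# Assumed filename layout: ...F<site>...Z<z-index>C<channel>...
_FIELD_Z = re.compile(r"F(\d+).*Z(\d+)C")


def sort_key(filename: str) -> list[int]:
    """Return [site, z_index] parsed from a Cellvoyager-style filename."""
    match = _FIELD_Z.search(filename)
    if match is None:
        raise ValueError(f"Cannot parse site/z-index from {filename!r}")
    return [int(match.group(1)), int(match.group(2))]


filenames = [
    "plate_B03_T0001F002L01A01Z01C01.tif",
    "plate_B03_T0001F001L01A01Z02C01.tif",
    "plate_B03_T0001F001L01A01Z01C01.tif",
]
# Lists compare element-wise, so files are ordered by site, then z-index
ordered = sorted(filenames, key=sort_key)
```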
Create an OME-NGFF zarr folder, without reading/writing image data.
+
Find plates (for each folder in input_paths):
+
- glob image files,
- parse metadata from image filename to identify plates,
- identify populated channels.
+
Create a zarr folder (for each plate):
+
- parse mlf metadata,
- identify wells and field of view (FOV),
- create FOV ZARR,
- verify that channels are uniform (i.e., same channels).
+
PARAMETER: DESCRIPTION
+
zarr_urls: List of paths or urls to the individual OME-Zarr image to
+be processed. Not used by the converter task.
+(standard argument for Fractal tasks, managed by Fractal server).
List of paths to the folders that contain the Cellvoyager
+image files. Each entry is a path to a folder that contains the
+image files themselves for a multiwell plate and the
+MeasurementData & MeasurementDetail metadata files.
A list of OmeroChannels, where each channel must
+include the wavelength_id attribute and where the
+wavelength_id values must be unique across the list.
If specified, only parse images with filenames
+that match with all these patterns. Patterns must be defined as in
+https://docs.python.org/3/library/fnmatch.html, Example:
+image_glob_pattern=["*_B03_*"] => only process well B03
+image_glob_pattern=["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"] =>
+only process well C09, field of view 16 and Z planes 0-59.
If None, parse Yokogawa metadata from mrf/mlf
+files in the input_path folder; else, the full path to a csv file
+containing the parsed metadata table.
A metadata dictionary containing important metadata about the OME-Zarr
+plate, the images and some parameters required by downstream tasks
+(like num_levels).
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/tasks/cellvoyager_to_ome_zarr_init.py
+
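The glob-pattern filtering described above ("match with all these patterns") can be tried out with the standard-library `fnmatch` module. This is a minimal sketch: `matches_all` is a hypothetical helper and the filenames are invented, while the patterns are the ones quoted in the parameter description.

```python
from fnmatch import fnmatchcase


def matches_all(filename: str, patterns: list[str]) -> bool:
    """True only if the filename matches every pattern in the list."""
    return all(fnmatchcase(filename, pattern) for pattern in patterns)


patterns = ["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"]
filenames = [
    "plate_C09_T0001F016L01A01Z05C01.tif",  # well, field and Z plane all match
    "plate_C09_T0001F016L01A01Z61C01.tif",  # Z plane 61 is outside 0-59
    "plate_B03_T0001F016L01A01Z05C01.tif",  # wrong well
]
selected = [name for name in filenames if matches_all(name, patterns)]
```

`fnmatchcase` is used here so that matching stays case-sensitive on every platform.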
Create OME-NGFF structure and metadata to host a multiplexing dataset.
+
This task takes a set of image folders (i.e. different multiplexing
+acquisitions) and builds the internal structure and metadata of an OME-NGFF
+zarr group, without actually loading/writing the image data.
+
Each element in input_paths should be treated as a different acquisition.
+
PARAMETER: DESCRIPTION
+
zarr_urls: List of paths or urls to the individual OME-Zarr image to
+be processed. Not used by the converter task.
+(standard argument for Fractal tasks, managed by Fractal server).
dictionary of acquisitions. Each key is the acquisition
+identifier (normally 0, 1, 2, 3 etc.). Each item defines the
+acquisition by providing the image_dir and the allowed_channels.
If specified, only parse images with filenames
+that match with all these patterns. Patterns must be defined as in
+https://docs.python.org/3/library/fnmatch.html, Example:
+image_glob_pattern=["*_B03_*"] => only process well B03
+image_glob_pattern=["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"] =>
+only process well C09, field of view 16 and Z planes 0-59.
If None, parse Yokogawa metadata from mrf/mlf
+files in the input_path folder; else, a dictionary of key-value
+pairs like (acquisition, path) with acquisition a string like
+the key of the acquisitions dict and path pointing to a csv
+file containing the parsed metadata table.
A metadata dictionary containing important metadata about the OME-Zarr
+plate, the images and some parameters required by downstream tasks
+(like num_levels).
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/tasks/cellvoyager_to_ome_zarr_init_multiplex.py
+
def_generate_plate_well_metadata(
+ zarr_urls:list[str],
+)->tuple[dict[str,dict],dict[str,dict[str,dict]],dict[str,dict]]:
+"""
+ Generate metadata for OME-Zarr HCS plates & wells.
+
+ Based on the list of zarr_urls, generate metadata for all plates and all
+ their wells.
+
+ Args:
+ zarr_urls: List of paths or urls to the individual OME-Zarr image to
+ be processed.
+
+ Returns:
+ plate_metadata_dicts: Dictionary of plate metadata. The structure
+ is: {"old_plate_name": NgffPlateMeta (as dict)}.
+ new_well_image_attrs: Dictionary of image lists for the new wells.
+ The structure is: {"old_plate_name": {"old_well_name":
+ [ImageInWell(as dict)]}}
+ well_image_attrs: Dictionary of Image attributes of the existing wells.
+ """
+ # TODO: Simplify this block. Currently complicated, because we need to loop
+ # through all potential plates, all their wells & their images to build up
+ # the metadata for the plate & well.
+ plate_metadata_dicts={}
+ plate_wells={}
+ well_image_attrs={}
+ new_well_image_attrs={}
+ forzarr_urlinzarr_urls:
+ # Extract plate/well/image parts of `zarr_url`
+ old_plate_url=_get_plate_url_from_image_url(zarr_url)
+ well_sub_url=_get_well_sub_url(zarr_url)
+ curr_img_sub_url=_get_image_sub_url(zarr_url)
+
+ # The first time a plate is found, create its metadata
+ ifold_plate_urlnotinplate_metadata_dicts:
+ logger.info(f"Reading plate metadata of {old_plate_url=}")
+ old_plate_meta=load_NgffPlateMeta(old_plate_url)
+ plate_metadata=dict(
+ plate=dict(
+ acquisitions=old_plate_meta.plate.acquisitions,
+ field_count=old_plate_meta.plate.field_count,
+ name=old_plate_meta.plate.name,
+ # The new field count could be different from the old
+ # field count
+ version=old_plate_meta.plate.version,
+ )
+ )
+ plate_metadata_dicts[old_plate_url]=plate_metadata
+ plate_wells[old_plate_url]=[]
+ well_image_attrs[old_plate_url]={}
+ new_well_image_attrs[old_plate_url]={}
+
+ # The first time a plate/well pair is found, create the well metadata
+ ifwell_sub_urlnotinplate_wells[old_plate_url]:
+ plate_wells[old_plate_url].append(well_sub_url)
+ old_well_url=f"{old_plate_url}/{well_sub_url}"
+ logger.info(f"Reading well metadata of {old_well_url}")
+ well_attrs=load_NgffWellMeta(old_well_url)
+ well_image_attrs[old_plate_url][well_sub_url]=well_attrs.well
+ new_well_image_attrs[old_plate_url][well_sub_url]=[]
+
+ # Find images of the current well with name matching the current image
+ # TODO: clarify whether this list must always have length 1
+ curr_well_image_list=[
+ img
+ forimginwell_image_attrs[old_plate_url][well_sub_url].images
+ ifimg.path==curr_img_sub_url
+ ]
+ new_well_image_attrs[old_plate_url][
+ well_sub_url
+ ]+=curr_well_image_list
+
+ # Fill in the plate metadata based on all available wells
+ forold_plate_urlinplate_metadata_dicts:
+ well_list,row_list,column_list=_generate_wells_rows_columns(
+ plate_wells[old_plate_url]
+ )
+ plate_metadata_dicts[old_plate_url]["plate"]["columns"]=[]
+ forcolumnincolumn_list:
+ plate_metadata_dicts[old_plate_url]["plate"]["columns"].append(
+ {"name":column}
+ )
+
+ plate_metadata_dicts[old_plate_url]["plate"]["rows"]=[]
+ forrowinrow_list:
+ plate_metadata_dicts[old_plate_url]["plate"]["rows"].append(
+ {"name":row}
+ )
+ plate_metadata_dicts[old_plate_url]["plate"]["wells"]=well_list
+
+ # Validate with NgffPlateMeta model
+ plate_metadata_dicts[old_plate_url]=NgffPlateMeta(
+ **plate_metadata_dicts[old_plate_url]
+ ).dict(exclude_none=True)
+
+ returnplate_metadata_dicts,new_well_image_attrs,well_image_attrs
+
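The final loop above fills in rows, columns and wells from the collected well sub-paths. The helper `_generate_wells_rows_columns` is not shown in this section, so the following is only a sketch of what such a step could look like. The function name `wells_rows_columns` is hypothetical, and the `rowIndex`/`columnIndex` keys reflect an assumption about the OME-NGFF plate layout rather than the library's confirmed output.

```python
def wells_rows_columns(
    well_sub_urls: list[str],
) -> tuple[list[dict], list[str], list[str]]:
    """Derive sorted rows, columns and well entries from "row/column" paths."""
    rows = sorted({well.split("/")[0] for well in well_sub_urls})
    columns = sorted({well.split("/")[1] for well in well_sub_urls})
    wells = []
    for well in sorted(set(well_sub_urls)):
        row, column = well.split("/")
        wells.append(
            {
                "path": well,
                "rowIndex": rows.index(row),  # assumed key name
                "columnIndex": columns.index(column),  # assumed key name
            }
        )
    return wells, rows, columns


wells, rows, columns = wells_rows_columns(["B/03", "C/05", "B/05"])
# Same shape as the "rows"/"columns" entries built in the loop above
plate_rows = [{"name": row} for row in rows]
plate_columns = [{"name": column} for column in columns]
```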
Given the absolute zarr_url for an OME-Zarr image within an HCS plate,
+return the path to the plate zarr group.
+
+
+ Source code in fractal_tasks_core/tasks/copy_ome_zarr_hcs_plate.py
+
def _get_plate_url_from_image_url(zarr_url: str) -> str:
+    """
+    Given the absolute `zarr_url` for an OME-Zarr image within an HCS plate,
+    return the path to the plate zarr group.
+    """
+    zarr_url = zarr_url.rstrip("/")
+    plate_path = "/".join(zarr_url.split("/")[:-3])
+    return plate_path
+
Given the absolute zarr_url for an OME-Zarr image within an HCS plate,
+return the path to the well zarr group, relative to the plate.
+
+
+ Source code in fractal_tasks_core/tasks/copy_ome_zarr_hcs_plate.py
+
def _get_well_sub_url(zarr_url: str) -> str:
+    """
+    Given the absolute `zarr_url` for an OME-Zarr image within an HCS plate,
+    return the path to the well zarr group, relative to the plate.
+    """
+    zarr_url = zarr_url.rstrip("/")
+    well_url = "/".join(zarr_url.split("/")[-3:-1])
+    return well_url
+
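The two helpers above rely on the fixed `plate/row/column/image` layout of an HCS plate. The sketch below combines the same slicing in one hypothetical function, `split_image_url`; treating the last path component as the image sub-path is an assumption, since `_get_image_sub_url` is not shown in this section.

```python
def split_image_url(zarr_url: str) -> tuple[str, str, str]:
    """Split an image URL into (plate URL, well sub-path, image sub-path)."""
    parts = zarr_url.rstrip("/").split("/")
    plate_url = "/".join(parts[:-3])  # everything before row/column/image
    well_sub_url = "/".join(parts[-3:-1])  # "row/column"
    image_sub_url = parts[-1]  # assumed: last component is the image
    return plate_url, well_sub_url, image_sub_url


# Illustrative path; a trailing slash is stripped before splitting
plate_url, well_sub_url, image_sub_url = split_image_url(
    "/data/plate.zarr/B/03/0/"
)
```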
Duplicate the OME-Zarr HCS structure for a set of zarr_urls.
+
This task only processes the zarr images in the zarr_urls, not all the
+images in the plate. It copies all the plate & well structure, but none
+of the image metadata or the actual image data:
+
+
For each plate, create a new OME-Zarr HCS plate with the attributes for
+ all the images in zarr_urls
+
For each well (in each plate), create a new zarr subgroup with the
+ same attributes as the original one.
@validate_arguments
+defcopy_ome_zarr_hcs_plate(
+ *,
+ # Fractal parameters
+ zarr_urls:list[str],
+ zarr_dir:str,
+ # Advanced parameters
+ suffix:str="mip",
+ overwrite:bool=False,
+)->dict[str,Any]:
+"""
+ Duplicate the OME-Zarr HCS structure for a set of zarr_urls.
+
+ This task only processes the zarr images in the zarr_urls, not all the
+ images in the plate. It copies all the plate & well structure, but none
+ of the image metadata or the actual image data:
+
+ - For each plate, create a new OME-Zarr HCS plate with the attributes for
+ all the images in zarr_urls
+ - For each well (in each plate), create a new zarr subgroup with the
+ same attributes as the original one.
+
+ Note: this task makes use of methods from the `Attributes` class, see
+ https://zarr.readthedocs.io/en/stable/api/attrs.html.
+
+ Args:
+ zarr_urls: List of paths or urls to the individual OME-Zarr image to
+ be processed.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ zarr_dir: path of the directory where the new OME-Zarrs will be
+ created.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ suffix: The suffix that is used to transform `plate.zarr` into
+ `plate_suffix.zarr`. Note that `None` is not currently supported.
+ overwrite: If `True`, overwrite the task output.
+
+ Returns:
+ A parallelization list to be used in a compute task to fill the wells
+ with OME-Zarr images.
+ """
+
+ # Preliminary check
+ if suffix is None or suffix == "":
+ raise ValueError(
+ "Running copy_ome_zarr_hcs_plate without a suffix would lead to "
+ "overwriting of the existing HCS plates."
+ )
+
+ parallelization_list=[]
+
+ # Generate parallelization list
+ forzarr_urlinzarr_urls:
+ old_plate_url=_get_plate_url_from_image_url(zarr_url)
+ well_sub_url=_get_well_sub_url(zarr_url)
+ old_plate_name=old_plate_url.split(".zarr")[-2].split("/")[-1]
+ new_plate_name=f"{old_plate_name}_{suffix}"
+ zarrurl_plate_new=f"{zarr_dir}/{new_plate_name}.zarr"
+ curr_img_sub_url=_get_image_sub_url(zarr_url)
+ new_zarr_url=f"{zarrurl_plate_new}/{well_sub_url}/{curr_img_sub_url}"
+ parallelization_item=dict(
+ zarr_url=new_zarr_url,
+ init_args=dict(origin_url=zarr_url),
+ )
+ InitArgsMIP(**parallelization_item["init_args"])
+ parallelization_list.append(parallelization_item)
+
+ # Generate the plate metadata & parallelization list
+ (
+ plate_attrs_dicts,
+ new_well_image_attrs,
+ well_image_attrs,
+ )=_generate_plate_well_metadata(zarr_urls=zarr_urls)
+
+ # Create the new OME-Zarr HCS plate
+ forold_plate_url,plate_attrsinplate_attrs_dicts.items():
+ old_plate_name=old_plate_url.split(".zarr")[-2].split("/")[-1]
+ new_plate_name=f"{old_plate_name}_{suffix}"
+ zarrurl_new=f"{zarr_dir}/{new_plate_name}.zarr"
+ logger.info(f"{old_plate_url=}")
+ logger.info(f"{zarrurl_new=}")
+ new_plate_group=open_zarr_group_with_overwrite(
+ zarrurl_new,overwrite=overwrite
+ )
+ new_plate_group.attrs.put(plate_attrs)
+
+ # Write well groups:
+ forwell_sub_urlinnew_well_image_attrs[old_plate_url]:
+ new_well_group=zarr.group(f"{zarrurl_new}/{well_sub_url}")
+ well_attrs=dict(
+ well=dict(
+ images=[
+ img.dict(exclude_none=True)
+ forimginnew_well_image_attrs[old_plate_url][
+ well_sub_url
+ ]
+ ],
+ version=well_image_attrs[old_plate_url][
+ well_sub_url
+ ].version,
+ )
+ )
+ new_well_group.attrs.put(well_attrs)
+
+ returndict(parallelization_list=parallelization_list)
+
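The way each entry of the parallelization list is derived from an input `zarr_url` can be followed on a single example. This is a sketch under assumptions: `new_image_url` is a hypothetical helper, the paths are invented, and the last path component is assumed to be the image sub-path.

```python
def new_image_url(zarr_url: str, zarr_dir: str, suffix: str = "mip") -> str:
    """Build the URL of the copied image inside the new, suffixed plate."""
    parts = zarr_url.rstrip("/").split("/")
    old_plate_url = "/".join(parts[:-3])
    # Same name extraction as in the task: strip ".zarr", keep the basename
    old_plate_name = old_plate_url.split(".zarr")[-2].split("/")[-1]
    new_plate_name = f"{old_plate_name}_{suffix}"
    well_and_image = "/".join(parts[-3:])  # "row/column/image"
    return f"{zarr_dir}/{new_plate_name}.zarr/{well_and_image}"


origin_url = "/data/plate.zarr/B/03/0"
parallelization_item = dict(
    zarr_url=new_image_url(origin_url, zarr_dir="/out"),
    init_args=dict(origin_url=origin_url),
)
```

Each item therefore points the downstream compute task at the new location while keeping a reference to the original image.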
Applies pre-calculated registration to ROI tables.
+
Apply pre-calculated registration such that resulting ROIs contain
+the consensus aligned region between all acquisitions.
+
Parallelization level: well
+
PARAMETER: DESCRIPTION
+
zarr_url: Path or url to the individual OME-Zarr image to be processed.
+Refers to the zarr_url of the reference acquisition.
+(standard argument for Fractal tasks, managed by Fractal server).
+
init_args: Initialization arguments provided by
+init_group_by_well_for_multiplexing. It contains the
+zarr_url_list listing all the zarr_urls in the same well as the
+zarr_url of the reference acquisition that are being processed.
+(standard argument for Fractal tasks, managed by Fractal server).
+
roi_table: Name of the ROI table over which the task loops to
+calculate the registration. Examples: FOV_ROI_table => loop over
+the fields of view, well_ROI_table => process the whole well as
+one image.
@validate_arguments
+deffind_registration_consensus(
+ *,
+ # Fractal parameters
+ zarr_url:str,
+ init_args:InitArgsRegistrationConsensus,
+ # Core parameters
+ roi_table:str="FOV_ROI_table",
+ # Advanced parameters
+ new_roi_table:Optional[str]=None,
+):
+"""
+ Applies pre-calculated registration to ROI tables.
+
+ Apply pre-calculated registration such that resulting ROIs contain
+ the consensus aligned region between all acquisitions.
+
+ Parallelization level: well
+
+ Args:
+ zarr_url: Path or url to the individual OME-Zarr image to be processed.
+ Refers to the zarr_url of the reference acquisition.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ init_args: Initialization arguments provided by
+ `init_group_by_well_for_multiplexing`. It contains the
+ zarr_url_list listing all the zarr_urls in the same well as the
+ zarr_url of the reference acquisition that are being processed.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ roi_table: Name of the ROI table over which the task loops to
+ calculate the registration. Examples: `FOV_ROI_table` => loop over
+ the fields of view, `well_ROI_table` => process the whole well as
+ one image.
+ new_roi_table: Optional name for the new, registered ROI table. If no
+ name is given, it will default to "registered_" + `roi_table`
+
+ """
+    if not new_roi_table:
+        new_roi_table = "registered_" + roi_table
+    logger.info(
+        f"Running for {zarr_url=} & the other acquisitions in that well. \n"
+        f"Applying translation registration to {roi_table=} and storing it as "
+        f"{new_roi_table=}."
+    )
+
+    # Collect all the ROI tables
+    roi_tables = {}
+    roi_tables_attrs = {}
+    for acq_zarr_url in init_args.zarr_url_list:
+        curr_ROI_table = ad.read_zarr(f"{acq_zarr_url}/tables/{roi_table}")
+        curr_ROI_table_group = zarr.open_group(
+            f"{acq_zarr_url}/tables/{roi_table}", mode="r"
+        )
+        curr_ROI_table_attrs = curr_ROI_table_group.attrs.asdict()
+
+        # For reference_acquisition, handle the fact that it doesn't
+        # have the shifts
+        if acq_zarr_url == zarr_url:
+            curr_ROI_table = add_zero_translation_columns(curr_ROI_table)
+        # Check for valid ROI tables
+        are_ROI_table_columns_valid(table=curr_ROI_table)
+        translation_columns = [
+            "translation_z",
+            "translation_y",
+            "translation_x",
+        ]
+        if curr_ROI_table.var.index.isin(translation_columns).sum() != 3:
+            raise ValueError(
+                f"{roi_table=} in {acq_zarr_url} does not contain the "
+                f"translation columns {translation_columns} necessary to use "
+                "this task."
+            )
+        roi_tables[acq_zarr_url] = curr_ROI_table
+        roi_tables_attrs[acq_zarr_url] = curr_ROI_table_attrs
+
+    # Check that all acquisitions have the same ROIs
+    rois = roi_tables[list(roi_tables.keys())[0]].obs.index
+    for acq_zarr_url, acq_roi_table in roi_tables.items():
+        if not (acq_roi_table.obs.index == rois).all():
+            raise ValueError(
+                f"Acquisition {acq_zarr_url} does not contain the same ROIs "
+                f"as the reference acquisition {zarr_url}:\n"
+                f"{acq_zarr_url}: {acq_roi_table.obs.index}\n"
+                f"{zarr_url}: {rois}"
+            )
+
+    roi_table_dfs = [
+        roi_table.to_df().loc[:, translation_columns]
+        for roi_table in roi_tables.values()
+    ]
+    logger.info("Calculating min & max translation across acquisitions.")
+    max_df, min_df = calculate_min_max_across_dfs(roi_table_dfs)
+    shifted_rois = {}
+
+    # Loop over acquisitions
+    for acq_zarr_url in init_args.zarr_url_list:
+        shifted_rois[acq_zarr_url] = apply_registration_to_single_ROI_table(
+            roi_tables[acq_zarr_url], max_df, min_df
+        )
+
+        # TODO: Drop translation columns from this table?
+
+        # Both parts must be f-strings, otherwise {acq_zarr_url=} is
+        # logged literally
+        logger.info(
+            f"Write the registered ROI table {new_roi_table} for "
+            f"{acq_zarr_url=}"
+        )
+        # Save the shifted ROI table as a new table
+        image_group = zarr.group(acq_zarr_url)
+        write_table(
+            image_group,
+            new_roi_table,
+            shifted_rois[acq_zarr_url],
+            table_attrs=roi_tables_attrs[acq_zarr_url],
+        )
+
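The min/max step in the source above can be illustrated without AnnData or pandas. The following is a minimal sketch, not the library's `calculate_min_max_across_dfs`: it assumes each ROI table is a plain dictionary mapping a ROI name to its translation columns (a hypothetical stand-in for the real tables), and `min_max_across_tables` is an illustrative name.

```python
TRANSLATION_COLUMNS = ("translation_z", "translation_y", "translation_x")


def min_max_across_tables(tables):
    """Element-wise max and min of the translation columns across tables.

    `tables` is a list of dicts {roi_name: {column: value}}; this layout is
    an assumption made for the sketch, not the AnnData layout of the task.
    """
    reference_rois = list(tables[0])
    for table in tables:
        # Mirror the task's check that all acquisitions share the same ROIs
        if list(table) != reference_rois:
            raise ValueError("Acquisitions do not contain the same ROIs.")
    max_table, min_table = {}, {}
    for roi in reference_rois:
        max_table[roi] = {
            col: max(t[roi][col] for t in tables) for col in TRANSLATION_COLUMNS
        }
        min_table[roi] = {
            col: min(t[roi][col] for t in tables) for col in TRANSLATION_COLUMNS
        }
    return max_table, min_table


# Reference acquisition has zero translation; the second one is shifted
reference = {
    "FOV_1": dict(translation_z=0.0, translation_y=0.0, translation_x=0.0)
}
cycle_1 = {
    "FOV_1": dict(translation_z=0.0, translation_y=2.5, translation_x=-1.0)
}
max_table, min_table = min_max_across_tables([reference, cycle_1])
print(max_table["FOV_1"])
print(min_table["FOV_1"])
```

Because the reference acquisition contributes zero translation, the maximum is never negative and the minimum is never positive.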
def correct(
+    img_stack: np.ndarray,
+    corr_img: np.ndarray,
+    background: int = 110,
+):
+    """
+    Corrects a stack of images, using a given illumination profile (e.g. bright
+    in the center of the image, dim outside).
+
+    Args:
+        img_stack: 4D numpy array (czyx), with dummy size along c.
+        corr_img: 2D numpy array (yx)
+        background: Background value that is subtracted from the image before
+            the illumination correction is applied.
+    """
+
+    logger.info(f"Start correct, {img_stack.shape}")
+
+    # Check shapes
+    if corr_img.shape != img_stack.shape[2:] or img_stack.shape[0] != 1:
+        raise ValueError(
+            "Error in illumination_correction:\n"
+            f"{img_stack.shape=}\n{corr_img.shape=}"
+        )
+
+    # Store info about dtype
+    dtype = img_stack.dtype
+    dtype_max = np.iinfo(dtype).max
+
+    # Background subtraction
+    img_stack[img_stack <= background] = 0
+    img_stack[img_stack > background] -= background
+
+    # Apply the normalized correction matrix (requires a float array)
+    # img_stack = img_stack.astype(np.float64)
+    new_img_stack = img_stack / (corr_img / np.max(corr_img))[None, None, :, :]
+
+    # Handle edge case: corrected image may have values beyond the limit of
+    # the encoding, e.g. beyond 65535 for 16bit images. This clips values
+    # that surpass this limit and triggers a warning
+    if np.sum(new_img_stack > dtype_max) > 0:
+        warnings.warn(
+            "Illumination correction created values beyond the max range of "
+            f"the current image type. These have been clipped to {dtype_max=}."
+        )
+        new_img_stack[new_img_stack > dtype_max] = dtype_max
+
+    logger.info("End correct")
+
+    # Cast back to original dtype and return
+    return new_img_stack.astype(dtype)
+
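The arithmetic of `correct` can be followed on a tiny example. The sketch below is a simplified illustration on nested Python lists rather than numpy arrays; `correct_pixels` is a hypothetical name, and the 16-bit limit (`dtype_max=65535`) is an assumption made for the example.

```python
def correct_pixels(image, profile, background=110, dtype_max=65535):
    """Background-subtract, divide by the normalized profile, clip, cast."""
    profile_max = max(max(row) for row in profile)
    corrected = []
    for img_row, prof_row in zip(image, profile):
        new_row = []
        for value, prof in zip(img_row, prof_row):
            # Background subtraction, floored at zero
            value = max(value - background, 0)
            # Divide by the profile normalized to its maximum
            value = value / (prof / profile_max)
            # Clip to the integer range and cast back to an integer
            new_row.append(int(min(value, dtype_max)))
        corrected.append(new_row)
    return corrected


image = [[1110, 610], [110, 60000]]
profile = [[1.0, 0.5], [0.8, 0.5]]
result = correct_pixels(image, profile)
print(result)  # [[1000, 1000], [0, 65535]]
```

The pixel in the dim region (profile value 0.5) is doubled after background subtraction, a pixel at the background level becomes 0, and the bright pixel exceeds the 16-bit range and is clipped.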
Applies illumination correction to the images in the OME-Zarr.
+
Assumes that the illumination correction profiles were generated
+separately beforehand and that the same background subtraction was used
+when calculating them (otherwise the correction will not work well and
+may only be partial).
+
PARAMETER: DESCRIPTION
+
zarr_url: Path or url to the individual OME-Zarr image to be processed.
+(standard argument for Fractal tasks, managed by Fractal server).
illumination_profiles: Dictionary where keys match the wavelength_id
+attributes of existing channels (e.g. A01_C01) and values are
+the filenames of the corresponding illumination profiles.
background: Background value that is subtracted from the image before
+the illumination correction is applied. Set it to 0 if you don't
+want any background subtraction.
input_ROI_table: Name of the ROI table that contains the information
+about the location of the individual fields of view (FOVs) to
+which the illumination correction shall be applied. Defaults to
+"FOV_ROI_table", the default name Fractal converters give the ROI
+tables that list all FOVs separately. If you generated your
+OME-Zarr with a different converter and used Import OME-Zarr to
+generate the ROI tables, image_ROI_table is the right choice if
+you only have 1 FOV per Zarr image and grid_ROI_table if you
+have multiple FOVs per Zarr image and set the right grid options
+during import.
overwrite_input: If True, the results of this task will overwrite
+the input image data. If False, a new image is generated and the
+illumination-corrected data is saved there.
@validate_arguments
+def illumination_correction(
+    *,
+    # Fractal parameters
+    zarr_url: str,
+    # Core parameters
+    illumination_profiles_folder: str,
+    illumination_profiles: dict[str, str],
+    background: int = 0,
+    input_ROI_table: str = "FOV_ROI_table",
+    overwrite_input: bool = True,
+    # Advanced parameters
+    suffix: str = "_illum_corr",
+) -> dict[str, Any]:
+
+"""
+ Applies illumination correction to the images in the OME-Zarr.
+
+ Assumes that the illumination correction profiles were generated before
+ separately and that the same background subtraction was used during
+ calculation of the illumination correction (otherwise, it will not work
+ well & the correction may only be partial).
+
+ Args:
+ zarr_url: Path or url to the individual OME-Zarr image to be processed.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ illumination_profiles_folder: Path of folder of illumination profiles.
+ illumination_profiles: Dictionary where keys match the `wavelength_id`
+ attributes of existing channels (e.g. `A01_C01` ) and values are
+ the filenames of the corresponding illumination profiles.
+ background: Background value that is subtracted from the image before
+ the illumination correction is applied. Set it to `0` if you don't
+ want any background subtraction.
+ input_ROI_table: Name of the ROI table that contains the information
+ about the location of the individual field of views (FOVs) to
+ which the illumination correction shall be applied. Defaults to
+ "FOV_ROI_table", the default name Fractal converters give the ROI
+ tables that list all FOVs separately. If you generated your
+ OME-Zarr with a different converter and used Import OME-Zarr to
+ generate the ROI tables, `image_ROI_table` is the right choice if
+ you only have 1 FOV per Zarr image and `grid_ROI_table` if you
+ have multiple FOVs per Zarr image and set the right grid options
+ during import.
+ overwrite_input: If `True`, the results of this task will overwrite
+ the input image data. If `False`, a new image is generated and the
+ illumination-corrected data is saved there.
+ suffix: What suffix to append to the illumination corrected images.
+ Only relevant if `overwrite_input=False`.
+ """
+
+    # Define old/new zarr URLs
+    if overwrite_input:
+        zarr_url_new = zarr_url.rstrip("/")
+    else:
+        zarr_url_new = zarr_url.rstrip("/") + suffix
+
+ t_start=time.perf_counter()
+ logger.info("Start illumination_correction")
+ logger.info(f" {overwrite_input=}")
+ logger.info(f" {zarr_url=}")
+ logger.info(f" {zarr_url_new=}")
+
+ # Read attributes from NGFF metadata
+ ngff_image_meta=load_NgffImageMeta(zarr_url)
+ num_levels=ngff_image_meta.num_levels
+ coarsening_xy=ngff_image_meta.coarsening_xy
+ full_res_pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(level=0)
+ logger.info(f"NGFF image has {num_levels=}")
+ logger.info(f"NGFF image has {coarsening_xy=}")
+ logger.info(
+ f"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}"
+ )
+
+ # Read channels from .zattrs
+ channels:list[OmeroChannel]=get_omero_channel_list(
+ image_zarr_path=zarr_url
+ )
+ num_channels=len(channels)
+
+ # Read FOV ROIs
+ FOV_ROI_table=ad.read_zarr(f"{zarr_url}/tables/{input_ROI_table}")
+
+ # Create list of indices for 3D FOVs spanning the entire Z direction
+ list_indices=convert_ROI_table_to_indices(
+ FOV_ROI_table,
+ level=0,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,
+ )
+ check_valid_ROI_indices(list_indices,input_ROI_table)
+
+    # Extract image size from FOV-ROI indices. Note: this works at level=0,
+    # where FOVs should all be of the exact same size (in pixels)
+    ref_img_size = None
+    for indices in list_indices:
+        img_size = (indices[3] - indices[2], indices[5] - indices[4])
+        if ref_img_size is None:
+            ref_img_size = img_size
+        else:
+            if img_size != ref_img_size:
+                raise ValueError(
+                    "ERROR: inconsistent image sizes in list_indices"
+                )
+    img_size_y, img_size_x = img_size[:]
+
+    # Assemble dictionary of matrices and check their shapes
+    corrections = {}
+    for channel in channels:
+        wavelength_id = channel.wavelength_id
+        corrections[wavelength_id] = imread(
+            (
+                Path(illumination_profiles_folder)
+                / illumination_profiles[wavelength_id]
+            ).as_posix()
+        )
+        if corrections[wavelength_id].shape != (img_size_y, img_size_x):
+            raise ValueError(
+                "Error in illumination_correction, "
+                "correction matrix has wrong shape."
+            )
+
+ # Lazily load highest-res level from original zarr array
+ data_czyx=da.from_zarr(f"{zarr_url}/0")
+
+    # Create zarr for output
+    if overwrite_input:
+        new_zarr = zarr.open(f"{zarr_url_new}/0")
+    else:
+        new_zarr = zarr.create(
+            shape=data_czyx.shape,
+            chunks=data_czyx.chunksize,
+            dtype=data_czyx.dtype,
+            store=zarr.storage.FSStore(f"{zarr_url_new}/0"),
+            overwrite=False,
+            dimension_separator="/",
+        )
+        _copy_hcs_ome_zarr_metadata(zarr_url, zarr_url_new)
+        # Copy ROI tables from the old zarr_url to keep ROI tables and other
+        # tables available in the new Zarr
+        _copy_tables_from_zarr_url(zarr_url, zarr_url_new)
+
+    # Iterate over FOV ROIs
+    num_ROIs = len(list_indices)
+    for i_c, channel in enumerate(channels):
+        for i_ROI, indices in enumerate(list_indices):
+            # Define region
+            s_z, e_z, s_y, e_y, s_x, e_x = indices[:]
+            region = (
+                slice(i_c, i_c + 1),
+                slice(s_z, e_z),
+                slice(s_y, e_y),
+                slice(s_x, e_x),
+            )
+            logger.info(
+                f"Now processing ROI {i_ROI+1}/{num_ROIs} "
+                f"for channel {i_c+1}/{num_channels}"
+            )
+            # Execute illumination correction
+            corrected_fov = correct(
+                data_czyx[region].compute(),
+                corrections[channel.wavelength_id],
+                background=background,
+            )
+            # Write to disk
+            da.array(corrected_fov).to_zarr(
+                url=new_zarr,
+                region=region,
+                compute=True,
+            )
+
+ # Starting from on-disk highest-resolution data, build and write to disk a
+ # pyramid of coarser levels
+ build_pyramid(
+ zarrurl=zarr_url_new,
+ overwrite=True,
+ num_levels=num_levels,
+ coarsening_xy=coarsening_xy,
+ chunksize=data_czyx.chunksize,
+ )
+
+ t_end=time.perf_counter()
+ logger.info(f"End illumination_correction, elapsed: {t_end-t_start}")
+
+    if overwrite_input:
+        image_list_updates = dict(image_list_updates=[dict(zarr_url=zarr_url)])
+    else:
+        image_list_updates = dict(
+            image_list_updates=[dict(zarr_url=zarr_url_new, origin=zarr_url)]
+        )
+    return image_list_updates
+
This task prepares a parallelization list of all zarr_urls that need to be
+used to calculate the registration between acquisitions (all zarr_urls
+except the reference acquisition vs. the reference acquisition).
+This task only works for HCS OME-Zarrs, for two reasons: only HCS OME-Zarrs
+currently have defined acquisition metadata to determine reference
+acquisitions, and the grouping of images is only implemented for
+HCS OME-Zarrs by well (with the assumption that every well has just one
+image per acquisition).
+
PARAMETER: DESCRIPTION
+
zarr_urls: List of paths or urls to the individual OME-Zarr image to
+be processed.
+(standard argument for Fractal tasks, managed by Fractal server).
zarr_dir: Path of the directory where the new OME-Zarrs will be
+created. Not used by this task.
+(standard argument for Fractal tasks, managed by Fractal server).
@validate_arguments
+def image_based_registration_hcs_init(
+    *,
+    # Fractal parameters
+    zarr_urls: list[str],
+    zarr_dir: str,
+    # Core parameters
+    reference_acquisition: int = 0,
+) -> dict[str, list[dict[str, Any]]]:
+    """
+    Initializes the calculate registration task.
+
+    This task prepares a parallelization list of all zarr_urls that need to be
+    used to calculate the registration between acquisitions (all zarr_urls
+    except the reference acquisition vs. the reference acquisition).
+    This task only works for HCS OME-Zarrs, for two reasons: only HCS OME-Zarrs
+    currently have defined acquisition metadata to determine reference
+    acquisitions, and the grouping of images is only implemented for
+    HCS OME-Zarrs by well (with the assumption that every well has just one
+    image per acquisition).
+
+    Args:
+        zarr_urls: List of paths or urls to the individual OME-Zarr image to
+            be processed.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        zarr_dir: path of the directory where the new OME-Zarrs will be
+            created. Not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        reference_acquisition: Which acquisition to register against. Needs to
+            match the acquisition metadata in the OME-Zarr image.
+
+    Returns:
+        task_output: Dictionary for Fractal server that contains a
+            parallelization list.
+    """
+    logger.info(
+        f"Running `image_based_registration_hcs_init` for {zarr_urls=}"
+    )
+    image_groups = create_well_acquisition_dict(zarr_urls)
+
+    # Create the parallelization list
+    parallelization_list = []
+    for key, image_group in image_groups.items():
+        # Assert that all image groups have the reference acquisition present
+        if reference_acquisition not in image_group.keys():
+            raise ValueError(
+                f"Registration with {reference_acquisition=} can only work if "
+                "all wells have the reference acquisition present. It was not "
+                f"found for well {key}."
+            )
+        # Add all zarr_urls except the reference acquisition to the
+        # parallelization list
+        for acquisition, zarr_url in image_group.items():
+            if acquisition != reference_acquisition:
+                reference_zarr_url = image_group[reference_acquisition]
+                parallelization_list.append(
+                    dict(
+                        zarr_url=zarr_url,
+                        init_args=dict(reference_zarr_url=reference_zarr_url),
+                    )
+                )
+
+    return dict(parallelization_list=parallelization_list)
+
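To make the shape of the parallelization list concrete, here is a minimal sketch of the pairing logic on plain dictionaries. It assumes that the grouping helper returns a mapping from a well identifier to a `{acquisition: zarr_url}` dictionary, as the loop in the source suggests; the well keys and paths below are hypothetical, and `build_parallelization_list` is an illustrative name rather than a function of the package.

```python
def build_parallelization_list(image_groups, reference_acquisition=0):
    """Pair every non-reference image with the reference image of its well."""
    parallelization_list = []
    for well, image_group in image_groups.items():
        if reference_acquisition not in image_group:
            raise ValueError(
                f"{reference_acquisition=} was not found for well {well}."
            )
        reference_zarr_url = image_group[reference_acquisition]
        for acquisition, zarr_url in image_group.items():
            if acquisition != reference_acquisition:
                parallelization_list.append(
                    dict(
                        zarr_url=zarr_url,
                        init_args=dict(reference_zarr_url=reference_zarr_url),
                    )
                )
    return parallelization_list


# Hypothetical grouping: well B03 has three acquisitions, well B05 has two
image_groups = {
    "plate.zarr/B/03": {
        0: "/data/plate.zarr/B/03/0",
        1: "/data/plate.zarr/B/03/1",
        2: "/data/plate.zarr/B/03/2",
    },
    "plate.zarr/B/05": {
        0: "/data/plate.zarr/B/05/0",
        1: "/data/plate.zarr/B/05/1",
    },
}
plist = build_parallelization_list(image_groups)
for item in plist:
    print(item["zarr_url"], "->", item["init_args"]["reference_zarr_url"])
```

Each well with N acquisitions contributes N - 1 entries, so the example yields three entries, none of which has a reference image as its `zarr_url`.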
The single OME-Zarr can be a full OME-Zarr HCS plate or an individual
+OME-Zarr image. The image needs to be in the zarr_dir as specified by the
+dataset. The current version of this task:
+
+
1. Creates the appropriate components-related metadata, needed for
+ processing an existing OME-Zarr through Fractal.
+
2. Optionally adds new ROI tables to the existing OME-Zarr.
+
PARAMETER: DESCRIPTION
+
zarr_urls: List of paths or urls to the individual OME-Zarr image to
+be processed. Not used.
+(standard argument for Fractal tasks, managed by Fractal server).
zarr_name: The OME-Zarr name, without its parent folder. The parent
+folder is provided by zarr_dir; e.g. zarr_name="array.zarr",
+if the OME-Zarr path is in /zarr_dir/array.zarr.
@validate_arguments
+def import_ome_zarr(
+    *,
+    # Fractal parameters
+    zarr_urls: list[str],
+    zarr_dir: str,
+    # Core parameters
+    zarr_name: str,
+    update_omero_metadata: bool = True,
+    add_image_ROI_table: bool = True,
+    add_grid_ROI_table: bool = True,
+    # Advanced parameters
+    grid_y_shape: int = 2,
+    grid_x_shape: int = 2,
+    overwrite: bool = False,
+) -> dict[str, Any]:
+"""
+ Import a single OME-Zarr into Fractal.
+
+ The single OME-Zarr can be a full OME-Zarr HCS plate or an individual
+ OME-Zarr image. The image needs to be in the zarr_dir as specified by the
+ dataset. The current version of this task:
+
+ 1. Creates the appropriate components-related metadata, needed for
+ processing an existing OME-Zarr through Fractal.
+ 2. Optionally adds new ROI tables to the existing OME-Zarr.
+
+ Args:
+ zarr_urls: List of paths or urls to the individual OME-Zarr image to
+ be processed. Not used.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ zarr_dir: path of the directory where the new OME-Zarrs will be
+ created.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ zarr_name: The OME-Zarr name, without its parent folder. The parent
+ folder is provided by zarr_dir; e.g. `zarr_name="array.zarr"`,
+ if the OME-Zarr path is in `/zarr_dir/array.zarr`.
+ add_image_ROI_table: Whether to add a `image_ROI_table` table to each
+ image, with a single ROI covering the whole image.
+ add_grid_ROI_table: Whether to add a `grid_ROI_table` table to each
+ image, with the image split into a rectangular grid of ROIs.
+ grid_y_shape: Y shape of the ROI grid in `grid_ROI_table`.
+ grid_x_shape: X shape of the ROI grid in `grid_ROI_table`.
+ update_omero_metadata: Whether to update Omero-channels metadata, to
+ make them Fractal-compatible.
+ overwrite: Whether new ROI tables (added when `add_image_ROI_table`
+ and/or `add_grid_ROI_table` are `True`) can overwrite existing ones.
+ """
+
+    # Is this based on the Zarr_dir or the zarr_urls?
+    if len(zarr_urls) > 0:
+        logger.warning(
+            "Running import while there are already items from the image list "
+            "provided to the task. The following inputs were provided: "
+            f"{zarr_urls=}. "
+            "This task will not process the existing images, but look for "
+            f"zarr files named {zarr_name=} in the {zarr_dir=} instead."
+        )
+
+    zarr_path = f"{zarr_dir.rstrip('/')}/{zarr_name}"
+    logger.info(f"Zarr path: {zarr_path}")
+
+    root_group = zarr.open_group(zarr_path, mode="r")
+    ngff_type = detect_ome_ngff_type(root_group)
+    grid_YX_shape = (grid_y_shape, grid_x_shape)
+
+    image_list_updates = []
+    if ngff_type == "plate":
+        for well in root_group.attrs["plate"]["wells"]:
+            well_path = well["path"]
+
+            well_group = zarr.open_group(zarr_path, path=well_path, mode="r")
+            for image in well_group.attrs["well"]["images"]:
+                image_path = image["path"]
+                zarr_url = f"{zarr_path}/{well_path}/{image_path}"
+                types = _process_single_image(
+                    zarr_url,
+                    add_image_ROI_table,
+                    add_grid_ROI_table,
+                    update_omero_metadata,
+                    grid_YX_shape=grid_YX_shape,
+                    overwrite=overwrite,
+                )
+                image_list_updates.append(
+                    dict(
+                        zarr_url=zarr_url,
+                        attributes=dict(
+                            plate=zarr_name,
+                            well=well_path.replace("/", ""),
+                        ),
+                        types=types,
+                    )
+                )
+    elif ngff_type == "well":
+        logger.warning(
+            "Only OME-Zarr for plates are fully supported in Fractal; "
+            f"e.g. the current one ({ngff_type=}) cannot be "
+            "processed via the `maximum_intensity_projection` task."
+        )
+        for image in root_group.attrs["well"]["images"]:
+            image_path = image["path"]
+            zarr_url = f"{zarr_path}/{image_path}"
+            well_name = "".join(zarr_path.split("/")[-2:])
+            types = _process_single_image(
+                zarr_url,
+                add_image_ROI_table,
+                add_grid_ROI_table,
+                update_omero_metadata,
+                grid_YX_shape=grid_YX_shape,
+                overwrite=overwrite,
+            )
+            image_list_updates.append(
+                dict(
+                    zarr_url=zarr_url,
+                    attributes=dict(
+                        well=well_name,
+                    ),
+                    types=types,
+                )
+            )
+    elif ngff_type == "image":
+        logger.warning(
+            "Only OME-Zarr for plates are fully supported in Fractal; "
+            f"e.g. the current one ({ngff_type=}) cannot be "
+            "processed via the `maximum_intensity_projection` task."
+        )
+        zarr_url = zarr_path
+        types = _process_single_image(
+            zarr_url,
+            add_image_ROI_table,
+            add_grid_ROI_table,
+            update_omero_metadata,
+            grid_YX_shape=grid_YX_shape,
+            overwrite=overwrite,
+        )
+        image_list_updates.append(
+            dict(
+                zarr_url=zarr_url,
+                types=types,
+            )
+        )
+
+    image_list_changes = dict(image_list_updates=image_list_updates)
+    return image_list_changes
+
zarr_dir: Path of the directory where the new OME-Zarrs will be
+created. Not used by this task.
+(standard argument for Fractal tasks, managed by Fractal server).
@validate_arguments
+def init_group_by_well_for_multiplexing(
+    *,
+    # Fractal parameters
+    zarr_urls: list[str],
+    zarr_dir: str,
+    # Core parameters
+    reference_acquisition: int = 0,
+) -> dict[str, list[dict[str, Any]]]:
+    """
+    Finds images for all acquisitions per well.
+
+    Returns the parallelization_list to run `find_registration_consensus`.
+
+    Args:
+        zarr_urls: List of paths or urls to the individual OME-Zarr image to
+            be processed.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        zarr_dir: path of the directory where the new OME-Zarrs will be
+            created. Not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        reference_acquisition: Which acquisition to register against. Uses the
+            OME-NGFF HCS well metadata acquisition keys to find the reference
+            acquisition.
+    """
+    logger.info(
+        f"Running `init_group_by_well_for_multiplexing` for {zarr_urls=}"
+    )
+    image_groups = create_well_acquisition_dict(zarr_urls)
+
+    # Create the parallelization list
+    parallelization_list = []
+    for key, image_group in image_groups.items():
+        # Assert that all image groups have the reference acquisition present
+        if reference_acquisition not in image_group.keys():
+            raise ValueError(
+                f"Registration with {reference_acquisition=} can only work if "
+                "all wells have the reference acquisition present. It was not "
+                f"found for well {key}."
+            )
+
+        # Create a parallelization list entry for each image group
+        zarr_url_list = []
+        for acquisition, zarr_url in image_group.items():
+            if acquisition == reference_acquisition:
+                reference_zarr_url = zarr_url
+
+            zarr_url_list.append(zarr_url)
+
+        parallelization_list.append(
+            dict(
+                zarr_url=reference_zarr_url,
+                init_args=dict(zarr_url_list=zarr_url_list),
+            )
+        )
+
+    return dict(parallelization_list=parallelization_list)
+
class InitArgsCellVoyager(BaseModel):
+    """
+    Arguments to be passed from cellvoyager converter init to compute
+
+    Attributes:
+        image_dir: Directory where the raw images are found
+        plate_prefix: part of the image filename needed for finding the
+            right subset of image files
+        well_ID: part of the image filename needed for finding the
+            right subset of image files
+        image_extension: part of the image filename needed for finding the
+            right subset of image files
+        image_glob_patterns: Additional glob patterns to filter the available
+            images with
+        acquisition: Acquisition metadata needed for multiplexing
+    """
+
+    image_dir: str
+    plate_prefix: str
+    well_ID: str
+    image_extension: str
+    image_glob_patterns: Optional[list[str]]
+    acquisition: Optional[int]
+
+
+
+ Source code in fractal_tasks_core/tasks/io_models.py
+
class InitArgsMIP(BaseModel):
+    """
+    Init Args for MIP task.
+
+    Attributes:
+        origin_url: Path to the zarr_url with the 3D data
+    """
+
+    origin_url: str
+
+
+
+ Source code in fractal_tasks_core/tasks/io_models.py
+
class InitArgsRegistrationConsensus(BaseModel):
+    """
+    Registration consensus init args.
+
+    Provides the list of zarr_urls for all acquisitions for a given well
+
+    Attributes:
+        zarr_url_list: List of zarr_urls for all the OME-Zarr images in the
+            well.
+    """
+
+    zarr_url_list: list[str]
+
allowed_channels: A list of OmeroChannel objects, where each channel
+must include the wavelength_id attribute and where the
+wavelength_id values must be unique across the list.
class MultiplexingAcquisition(BaseModel):
+    """
+    Input class for Multiplexing Cellvoyager converter
+
+    Attributes:
+        image_dir: Path to the folder that contains the Cellvoyager image
+            files for that acquisition and the MeasurementData &
+            MeasurementDetail metadata files.
+        allowed_channels: A list of `OmeroChannel` objects, where each channel
+            must include the `wavelength_id` attribute and where the
+            `wavelength_id` values must be unique across the list.
+    """
+
+    image_dir: str
+    allowed_channels: list[OmeroChannel]
+
class NapariWorkflowsOutput(BaseModel):
+    """
+    A value of the `output_specs` argument in `napari_workflows_wrapper`.
+
+    Attributes:
+        type: Output type (either `label` or `dataframe`).
+        label_name: Label name (for label outputs, it is used as the name of
+            the label; for dataframe outputs, it is used to fill the
+            `region["path"]` field).
+        table_name: Table name (for dataframe outputs only).
+    """
+
+    type: Literal["label", "dataframe"]
+    label_name: str
+    table_name: Optional[str] = None
+
+    @validator("table_name", always=True)
+    def table_name_only_for_dataframe_type(cls, v, values):
+        """
+        Check that table_name is set only for dataframe outputs.
+        """
+        _type = values.get("type")
+        if (_type == "dataframe" and (not v)) or (_type != "dataframe" and v):
+            raise ValueError(
+                f"Output item has type={_type} but table_name={v}."
+            )
+        return v
+
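The rule enforced by the validator (a `table_name` is required for `dataframe` outputs and forbidden otherwise) can be tried out without Pydantic. The following is a minimal sketch using a standard-library dataclass; `OutputSpec` is a hypothetical stand-in for illustration, not the `NapariWorkflowsOutput` model itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutputSpec:
    """Hypothetical stand-in for one output specification."""

    type: str
    label_name: str
    table_name: Optional[str] = None

    def __post_init__(self):
        if self.type not in ("label", "dataframe"):
            raise ValueError(f"Unknown output type={self.type}.")
        # table_name must be set if and only if type is "dataframe"
        if (self.type == "dataframe") != bool(self.table_name):
            raise ValueError(
                f"Output item has type={self.type} "
                f"but table_name={self.table_name}."
            )


label_spec = OutputSpec(type="label", label_name="nuclei")
table_spec = OutputSpec(
    type="dataframe", label_name="nuclei", table_name="nuclei_features"
)

try:
    OutputSpec(type="label", label_name="nuclei", table_name="unexpected")
except ValueError as err:
    print(err)
```

The single comparison `(type == "dataframe") != bool(table_name)` is equivalent to the two-branch condition in the validator above.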
+
Name of the ROI table over which the task loops to
+apply napari workflows.
+Examples:
+FOV_ROI_table
+=> loop over the field of views;
+organoid_ROI_table
+=> loop over the organoid ROI table (generated by another task);
+well_ROI_table
+=> process the whole well as one image.
Pyramid level of the image to be used as input for
+napari-workflows. Choose 0 to process at full resolution.
+Levels > 0 are currently only supported for workflows that only
+have intensity images as input and only produce label images as
+output.
Expected dimensions (either 2 or 3). Useful
+when loading 2D images that are stored in a 3D array with shape
+(1, size_x, size_y) [which is the default way Fractal stores 2D
+images], but you want to make sure the napari workflow gets a 2D
+array to process. Also useful to set to 2 when loading a 2D
+OME-Zarr that is saved as (size_x, size_y).
def upscale_array(
+    *,
+    array: np.ndarray,
+    target_shape: tuple[int, ...],
+    axis: Optional[Sequence[int]] = None,
+    pad_with_zeros: bool = False,
+    warn_if_inhomogeneous: bool = False,
+) -> np.ndarray:
+    """
+    Upscale an array along a given list of axes (through repeated application
+    of `np.repeat`), to match a target shape.
+
+    Args:
+        array: The array to be upscaled.
+        target_shape: The shape of the rescaled array.
+        axis: The axes along which to upscale the array (if `None`, then all
+            axes are used).
+        pad_with_zeros: If `True`, pad the upscaled array with zeros to match
+            `target_shape`.
+        warn_if_inhomogeneous: If `True`, raise a warning when the conversion
+            factors are not identical across all dimensions.
+
+    Returns:
+        The upscaled array, with shape `target_shape`.
+    """
+
+    # Default behavior: use all axes
+    if axis is None:
+        axis = list(range(len(target_shape)))
+
+    array_shape = array.shape
+    info = (
+        f"Trying to upscale from {array_shape=} to {target_shape=}, "
+        f"acting on {axis=}."
+    )
+
+    if len(array_shape) != len(target_shape):
+        raise ValueError(f"{info} Dimensions-number mismatch.")
+    if axis == []:
+        raise ValueError(f"{info} Empty axis list")
+    if min(axis) < 0:
+        raise ValueError(f"{info} Negative axis specification not allowed.")
+
+    # Check that upscale is doable
+    for ind, dim in enumerate(array_shape):
+        # Check that array is not larger than target (downscaling)
+        if dim > target_shape[ind]:
+            raise ValueError(
+                f"{info} {ind}-th array dimension is larger than target."
+            )
+        # Check that all relevant axes are included in axis
+        if dim != target_shape[ind] and ind not in axis:
+            raise ValueError(
+                f"{info} {ind}-th array dimension differs from "
+                f"target, but {ind} is not included in "
+                f"{axis=}."
+            )
+
+    # Compute upscaling factors
+    upscale_factors = {}
+    for ax in axis:
+        if (target_shape[ax] % array_shape[ax]) > 0 and not pad_with_zeros:
+            raise ValueError(
+                "Incommensurable upscale attempt, "
+                f"from {array_shape=} to {target_shape=}."
+            )
+        upscale_factors[ax] = target_shape[ax] // array_shape[ax]
+        # Check that this is not downscaling
+        if upscale_factors[ax] < 1:
+            raise ValueError(info)
+    info = f"{info} Upscale factors: {upscale_factors}"
+
+    # Raise a warning if upscaling is non-homogeneous across all axes
+    if warn_if_inhomogeneous:
+        if len(set(upscale_factors.values())) > 1:
+            warnings.warn(f"{info} (inhomogeneous)")
+
+    # Upscale array, via np.repeat
+    upscaled_array = array
+    for ax in axis:
+        upscaled_array = np.repeat(
+            upscaled_array, upscale_factors[ax], axis=ax
+        )
+
+    # Check that final shape is correct
+    if not upscaled_array.shape == target_shape:
+        if pad_with_zeros:
+            pad_width = []
+            for ax in list(range(len(target_shape))):
+                missing = target_shape[ax] - upscaled_array.shape[ax]
+                if missing < 0 or (missing > 0 and ax not in axis):
+                    raise ValueError(
+                        f"{info} " "Something wrong during zero-padding"
+                    )
+                pad_width.append([0, missing])
+            upscaled_array = np.pad(
+                upscaled_array,
+                pad_width=pad_width,
+                mode="constant",
+                constant_values=0,
+            )
+            logging.warning(f"{info} {upscaled_array.shape=}.")
+            logging.warning(
+                f"Padding upscaled_array with zeros with {pad_width=}"
+            )
+        else:
+            raise ValueError(f"{info} {upscaled_array.shape=}.")
+
+    return upscaled_array
+
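The factor computation in `upscale_array` only depends on the two shapes, so it can be explored without numpy. The sketch below is a simplified illustration: `compute_upscale_factors` is a hypothetical helper that mirrors the "Compute upscaling factors" step above and additionally reports how many elements per axis would remain to be zero-padded.

```python
def compute_upscale_factors(
    array_shape, target_shape, axis=None, pad_with_zeros=False
):
    """Return ({axis: integer factor}, {axis: elements left to pad})."""
    if len(array_shape) != len(target_shape):
        raise ValueError("Dimensions-number mismatch.")
    if axis is None:
        axis = list(range(len(target_shape)))
    factors, missing = {}, {}
    for ax in axis:
        remainder = target_shape[ax] % array_shape[ax]
        if remainder > 0 and not pad_with_zeros:
            raise ValueError(
                "Incommensurable upscale attempt, "
                f"from {array_shape=} to {target_shape=}."
            )
        factors[ax] = target_shape[ax] // array_shape[ax]
        if factors[ax] < 1:
            raise ValueError("Downscaling is not supported.")
        # Elements that repetition alone cannot fill along this axis
        missing[ax] = remainder
    return factors, missing


# A level-2 label image (coarsening factor 2) upscaled to full resolution
factors, missing = compute_upscale_factors(
    (1, 540, 640), (1, 2160, 2560), axis=[1, 2]
)
print(factors, missing)  # {1: 4, 2: 4} {1: 0, 2: 0}

# A target that is one row too large is only accepted with zero-padding
padded_factors, padded_missing = compute_upscale_factors(
    (1, 540, 640), (1, 2161, 2560), axis=[1, 2], pad_with_zeros=True
)
print(padded_factors, padded_missing)  # {1: 4, 2: 4} {1: 1, 2: 0}
```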
Discover the acquisition index based on OME-NGFF metadata.
+
Given the path to a zarr image folder (e.g. /path/plate.zarr/B/03/0),
+extract the acquisition index from the .zattrs file of the parent
+folder (i.e. at the well level), or return None if acquisition is not
+specified.
+
Notes:
+
+
1. For non-multiplexing datasets, acquisition is not required
+ information in the metadata. If it is not there, this function
+ returns None.
+
2. This function fails if we use an image that does not belong to
+ an OME-NGFF well.
def _find_omengff_acquisition(image_zarr_path: Path) -> Union[int, None]:
+    """
+    Discover the acquisition index based on OME-NGFF metadata.
+
+    Given the path to a zarr image folder (e.g. `/path/plate.zarr/B/03/0`),
+    extract the acquisition index from the `.zattrs` file of the parent
+    folder (i.e. at the well level), or return `None` if acquisition is not
+    specified.
+
+    Notes:
+
+    1. For non-multiplexing datasets, acquisition is not required
+       information in the metadata. If it is not there, this function
+       returns `None`.
+    2. This function fails if we use an image that does not belong to
+       an OME-NGFF well.
+
+    Args:
+        image_zarr_path: Full path to an OME-NGFF image folder.
+    """
+
+    # Identify well path and attrs
+    well_zarr_path = image_zarr_path.parent
+    if not (well_zarr_path / ".zattrs").exists():
+        raise ValueError(
+            f"{str(well_zarr_path)} must be an OME-NGFF well "
+            "folder, but it does not include a .zattrs file."
+        )
+    well_group = zarr.open_group(str(well_zarr_path))
+    attrs_images = well_group.attrs["well"]["images"]
+
+    # Look for the acquisition of the current image (if any)
+    acquisition = None
+    for img_dict in attrs_images:
+        if (
+            img_dict["path"] == image_zarr_path.name
+            and "acquisition" in img_dict.keys()
+        ):
+            acquisition = img_dict["acquisition"]
+            break
+
+    return acquisition
+
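The lookup itself can be illustrated on an in-memory copy of the well attributes, without touching the file system. This is a minimal sketch: `find_acquisition` and the `well_attrs` dictionary are hypothetical, with the dictionary shaped like the `well`/`images` metadata that the function above reads.

```python
def find_acquisition(well_attrs, image_name):
    """Return the acquisition of the image named `image_name`, or None."""
    for img_dict in well_attrs["well"]["images"]:
        if img_dict["path"] == image_name and "acquisition" in img_dict:
            return img_dict["acquisition"]
    return None


# Hypothetical well-level attributes: image "2" has no acquisition entry
well_attrs = {
    "well": {
        "images": [
            {"path": "0", "acquisition": 0},
            {"path": "1", "acquisition": 1},
            {"path": "2"},
        ]
    }
}

print(find_acquisition(well_attrs, "1"))  # 1
print(find_acquisition(well_attrs, "2"))  # None
```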
Dictionary with table names as keys and table paths as values. If
+tables Zarr group is missing, or if it does not have a tables
+key, then return an empty dictionary.
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/utils.py
+
def _get_table_path_dict(zarr_url: str) -> dict[str, str]:
+    """
+    Compile dictionary of (table name, table path) key/value pairs.
+
+
+    Args:
+        zarr_url:
+            Path or url to the individual OME-Zarr image to be processed.
+
+    Returns:
+        Dictionary with table names as keys and table paths as values. If
+            `tables` Zarr group is missing, or if it does not have a `tables`
+            key, then return an empty dictionary.
+    """
+
+    try:
+        tables_group = zarr.open_group(f"{zarr_url}/tables", "r")
+        table_list = tables_group.attrs["tables"]
+    except (zarr.errors.GroupNotFoundError, KeyError):
+        table_list = []
+
+    table_path_dict = {}
+    for table in table_list:
+        table_path_dict[table] = f"{zarr_url}/tables/{table}"
+
+    return table_path_dict
+
Flexibly extract parameters from metadata dictionary
+
This covers both parameters which are acquisition-specific (if the image
+belongs to an OME-NGFF array and its acquisition is specified) or simply
+available in the dictionary.
+The two cases are handled as:
+
metadata["some_parameter"][str(acquisition)] # acquisition available
+metadata["some_parameter"] # acquisition not available
+
def get_parameters_from_metadata(
+    *,
+    keys: Sequence[str],
+    metadata: dict[str, Any],
+    image_zarr_path: Path,
+) -> dict[str, Any]:
+    """
+    Flexibly extract parameters from metadata dictionary
+
+    This covers both parameters which are acquisition-specific (if the image
+    belongs to an OME-NGFF array and its acquisition is specified) or simply
+    available in the dictionary.
+    The two cases are handled as:
+    ```
+    metadata["some_parameter"][str(acquisition)]  # acquisition available
+    metadata["some_parameter"]                    # acquisition not available
+    ```
+
+    Args:
+        keys: list of required parameters.
+        metadata: metadata dictionary.
+        image_zarr_path: full path to image, e.g. `/path/plate.zarr/B/03/0`.
+    """
+
+    parameters = {}
+    acquisition = _find_omengff_acquisition(image_zarr_path)
+    if acquisition is not None:
+        parameters["acquisition"] = acquisition
+
+    for key in keys:
+        if acquisition is None:
+            parameter = metadata[key]
+        else:
+            try:
+                parameter = metadata[key][str(acquisition)]
+            except (TypeError, KeyError):
+                # Value is not acquisition-specific: use it as it is
+                parameter = metadata[key]
+        parameters[key] = parameter
+    return parameters
+
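The fallback logic of the loop above can be tried in isolation. This is a minimal sketch in which the acquisition is passed in directly instead of being read from the OME-Zarr metadata; `lookup_parameters` is a hypothetical name, and the metadata keys and values are made up for illustration. As in the code above, the acquisition-specific value is read as `metadata[key][str(acquisition)]`.

```python
from typing import Any, Optional, Sequence


def lookup_parameters(
    keys: Sequence[str],
    metadata: dict[str, Any],
    acquisition: Optional[int],
) -> dict[str, Any]:
    """Hypothetical stand-in: same lookup, with an explicit acquisition."""
    parameters: dict[str, Any] = {}
    if acquisition is not None:
        parameters["acquisition"] = acquisition
    for key in keys:
        # Default: the plain (acquisition-independent) value
        value = metadata[key]
        if acquisition is not None:
            try:
                # Acquisition-specific value, if the entry is a per-acquisition dict
                value = metadata[key][str(acquisition)]
            except (TypeError, KeyError):
                # Entry is not subscriptable by string, or lacks this acquisition
                pass
        parameters[key] = value
    return parameters


# Illustrative metadata: one plain value, one per-acquisition dictionary
metadata = {"coarsening_xy": 2, "num_levels": {"0": 5, "1": 4}}
with_acquisition = lookup_parameters(["coarsening_xy", "num_levels"], metadata, 1)
without_acquisition = lookup_parameters(["coarsening_xy"], metadata, None)
```

With `acquisition=1`, the plain integer falls back through the `TypeError` branch, while the dictionary entry is resolved through its `"1"` key.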
Given a set of datasets (as per OME-NGFF specs), update their "scale"
+transformations in the YX directions by including a prefactor
+(coarsening_xy**reference_level).
def open_zarr_group_with_overwrite(
+    path: Union[str, MutableMapping],
+    *,
+    overwrite: bool,
+    logger: Optional[logging.Logger] = None,
+    **open_group_kwargs: Any,
+) -> zarr.hierarchy.Group:
+    """
+    Wrap `zarr.open_group` and add `overwrite` argument.
+
+    This wrapper sets `mode="w"` for `overwrite=True` and `mode="w-"` for
+    `overwrite=False`.
+
+    The expected behavior is
+
+    * if the group does not exist, create it (independently on `overwrite`);
+    * if the group already exists and `overwrite=True`, replace the group with
+      an empty one;
+    * if the group already exists and `overwrite=False`, fail.
+
+    From the [`zarr.open_group`
+    docs](https://zarr.readthedocs.io/en/stable/api/hierarchy.html#zarr.hierarchy.open_group):
+
+    * `mode="r"` means read only (must exist);
+    * `mode="r+"` means read/write (must exist);
+    * `mode="a"` means read/write (create if doesn’t exist);
+    * `mode="w"` means create (overwrite if exists);
+    * `mode="w-"` means create (fail if exists).
+
+    Args:
+        path:
+            Store or path to directory in file system or name of zip file
+            (`zarr.open_group` parameter).
+        overwrite:
+            Determines the `mode` parameter of `zarr.open_group`, which is
+            `"w"` (if `overwrite=True`) or `"w-"` (if `overwrite=False`).
+        logger:
+            The logger to use (if unset, use `logging.getLogger(None)`)
+        open_group_kwargs:
+            Keyword arguments of `zarr.open_group`.
+
+    Returns:
+        The zarr group.
+
+    Raises:
+        OverwriteNotAllowedError:
+            If `overwrite=False` and the group already exists.
+    """
+
+    # Set logger
+    if logger is None:
+        logger = logging.getLogger(None)
+
+    # Set mode for zarr.open_group
+    if overwrite:
+        new_mode = "w"
+    else:
+        new_mode = "w-"
+
+    # Write log about current status
+    logger.info(f"Start open_zarr_group_with_overwrite ({overwrite=}).")
+    try:
+        # Call `zarr.open_group` with `mode="r"`, which fails for missing group
+        current_group = zarr.open_group(path, mode="r")
+        keys = list(current_group.group_keys())
+        logger.info(f"Zarr group {path} already exists, with {keys=}")
+    except GroupNotFoundError:
+        logger.info(f"Zarr group {path} does not exist yet.")
+
+    # Raise warning if we are overriding an existing value of `mode`
+    if "mode" in open_group_kwargs.keys():
+        mode = open_group_kwargs.pop("mode")
+        logger.warning(
+            f"Overriding {mode=} with {new_mode=}, "
+            "in open_zarr_group_with_overwrite"
+        )
+
+    # Call zarr.open_group
+    try:
+        return zarr.open_group(path, mode=new_mode, **open_group_kwargs)
+    except ContainsGroupError:
+        # Re-raise error with custom message and type
+        error_msg = (
+            f"Cannot create zarr group at {path=} with `{overwrite=}` "
+            "(original error: `zarr.errors.ContainsGroupError`).\n"
+            "Hint: try setting `overwrite=True`."
+        )
+        logger.error(error_msg)
+        raise OverwriteNotAllowedError(error_msg)
+
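The three expected behaviors listed in the docstring can be illustrated with plain directories standing in for Zarr groups. This is only an analogy, not the library code: `mode_for_overwrite` and `make_dir_with_overwrite` are hypothetical helpers, and `FileExistsError` plays the role of `OverwriteNotAllowedError`.

```python
import shutil
import tempfile
from pathlib import Path


def mode_for_overwrite(overwrite: bool) -> str:
    """Map the `overwrite` flag to the `zarr.open_group` mode, as described above."""
    return "w" if overwrite else "w-"


def make_dir_with_overwrite(path: Path, *, overwrite: bool) -> Path:
    """Directory-based analogue of the create/replace/fail behavior."""
    if path.exists():
        if not overwrite:
            # Analogue of `mode="w-"`: fail if the target already exists
            raise FileExistsError(f"{path} already exists and overwrite=False")
        # Analogue of `mode="w"`: replace the target with an empty one
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


def demo_overwrite() -> list[str]:
    """Walk through the three cases and record what happened."""
    outcomes = []
    with tempfile.TemporaryDirectory() as tmp:
        group = Path(tmp) / "plate.zarr"
        # Case 1: target is missing, so it is created regardless of `overwrite`
        make_dir_with_overwrite(group, overwrite=False)
        outcomes.append("created")
        # Case 2: target exists and overwrite=True, so it is emptied
        (group / "old_content").mkdir()
        make_dir_with_overwrite(group, overwrite=True)
        outcomes.append("replaced" if not any(group.iterdir()) else "kept")
        # Case 3: target exists and overwrite=False, so the call fails
        try:
            make_dir_with_overwrite(group, overwrite=False)
        except FileExistsError:
            outcomes.append("failed")
    return outcomes
```

Raising a dedicated error type in the third case, as the wrapper above does, lets callers report a hint such as "try setting `overwrite=True`" instead of surfacing a low-level storage error.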
Thanks to the package manifest and to their structure, the tasks in
+fractal_tasks_core.tasks can be run within the Fractal platform. The
+platform consists of a backend server, which can be accessed through either
+of the two available clients (a command-line client and a web client).
+
The fractal-demos repository lists a set of relevant examples, including:
How to use the command-line client to submit a series of typical workflows (based on fractal-tasks-core tasks) to Fractal; see folders from 01 to 10 in the examples folder.
The fractal-tasks-core GitHub repository includes an examples folder, listing a few examples of how to run fractal-tasks-core tasks from a standard Python script (instead of using the Fractal platform).
Enter one of the example folders, remove the tmp_out temporary output
+ folder (if present), and run one of the run_workflow Python scripts.
+
+
+
View the output OME-Zarr in the tmp_out folder with
+ napari, which can be installed via pip install
+ napari[pyqt5] napari-ome-zarr.
+
\ No newline at end of file
diff --git a/search/search_index.json b/search/search_index.json
new file mode 100644
index 000000000..1f3b8319e
--- /dev/null
+++ b/search/search_index.json
@@ -0,0 +1 @@
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to Fractal Tasks Core's documentation!","text":"
Fractal is a framework to process high content imaging data at scale and prepare it for interactive visualization.
This project is under active development \ud83d\udd28. If you need help or found a bug, open an issue here.
Fractal provides distributed workflows that convert TBs of image data into OME-Zarr files. The platform then processes the 3D image data by applying tasks like illumination correction, maximum intensity projection, 3D segmentation using cellpose and measurements using napari workflows. The pyramidal OME-Zarr files enable interactive visualization in the napari viewer.
The fractal-tasks-core package contains the python tasks that parse Yokogawa CV7000 images into OME-Zarr and process OME-Zarr files. Find more information about Fractal in general and the other repositories at this link. All tasks are written as Python functions and are optimized for usage in Fractal workflows, but they can also be used as standalone functions to parse data or process OME-Zarr files. We heavily use regions of interest (ROIs) in our OME-Zarr files to store the positions of fields of view. ROIs are saved as AnnData tables following this spec proposal. We save wells as large Zarr arrays instead of a collection of arrays for each field of view (see details here).
Here is an example of the interactive visualization in napari using the newly-proposed async loading in NAP4 and the napari-ome-zarr plugin:
Create Zarr Structure: Task to generate the zarr structure based on Yokogawa metadata files
Yokogawa to Zarr: Parses the Yokogawa CV7000 image data and saves it to the Zarr file
Illumination Correction: Applies an illumination correction based on a flatfield image & subtracts a background from the image.
Image Labeling (& Image Labeling Whole Well): Applies a cellpose network to the image of a single ROI or the whole well. cellpose parameters can be tuned for optimal performance.
Maximum Intensity Projection: Creates a maximum intensity projection of the whole plate.
Measurement: Make some standard measurements (intensity & morphology) using napari workflows, saving results to AnnData tables.
Some additional tasks are currently being worked on and some older tasks are still present in the fractal_tasks_core folder. See the package page for the detailed description of all tasks.
Fractal was conceived in the Liberali Lab at the Friedrich Miescher Institute for Biomedical Research and in the Pelkmans Lab at the University of Zurich by @jluethi and @gusqgm. The Fractal project is now developed at the BioVisionCenter at the University of Zurich and the project lead is with @jluethi. The core development is done under contract by eXact lab S.r.l..
Here is a list of tasks that are available within Fractal-compatible packages, including both fractal-tasks-core and others.
These are the tasks that we are aware of; if you created your own package of Fractal tasks, reach out to have it listed here (or, if you want to build your own tasks, follow these instructions).
Home page: https://github.com/Apricot-Therapeutics/APx_fractal_task_collection
Description: The APx Fractal Task Collection is maintained by Apricot Therapeutics AG, Switzerland. This is a collection of tasks intended to be used in combination with the Fractal Analytics Platform maintained by the BioVisionCenter Zurich (co-founded by the Friedrich Miescher Institute and the University of Zurich). The tasks in this collection are focused on extending Fractal's capabilities of processing 2D image data, with a special focus on multiplexed 2D image data. Most tasks work with 3D image data, but they have not specifically been developed for this scenario.
Update all tasks to use the new Fractal API from Fractal server 2.0 (#671)
Provide new dev tooling to create Fractal manifest for new task API (#671)
Add Pydantic models for OME-NGFF HCS Plate validation (#671)
Breaking changes in core library:
In get_acquisition_paths helper function of NgffWellMeta: The dictionary now contains a list of paths as values, not single paths. The NotImplementedError for multiple images with the same acquisition was removed.
The utils.get_table_path_dict helper function was made private & changed its input parameters: It's now _get_table_path_dict(zarr_url: str)
(major) Introduce new tasks for registration of multiplexing cycles: calculate_registration_image_based, apply_registration_to_ROI_tables, apply_registration_to_image (#487).
(major) Introduce new overwrite argument for tasks create_ome_zarr, create_ome_zarr_multiplex, yokogawa_to_ome_zarr, copy_ome_zarr, maximum_intensity_projection, cellpose_segmentation, napari_workflows_wrapper (#499).
(major) Rename illumination_correction parameter from overwrite to overwrite_input (#499).
Fix plate-selection bug in copy_ome_zarr task (#513).
Fix bug in definition of metadata[\"plate\"] in create_ome_zarr_multiplex task (#513).
Introduce new helper functions write_table, prepare_label_group and open_zarr_group_with_overwrite (#499).
Make tasks-related dependencies optional, and installable via fractal-tasks extra (#390).
Remove tools package extra (#384), and split the subpackage content into lib_ROI_overlaps and examples (#390).
(major) Modify task arguments
Add Pydantic model lib_channels.OmeroChannel (#410, #422);
Add Pydantic model tasks._input_models.Channel (#422);
Add Pydantic model tasks._input_models.NapariWorkflowsInput (#422);
Add Pydantic model tasks._input_models.NapariWorkflowsOutput (#422);
Move all Pydantic models to main package (#438).
Modify arguments of illumination_correction task (#431);
Modify arguments of create_ome_zarr and create_ome_zarr_multiplex (#433).
Modify argument default for ROI_table_names, in copy_ome_zarr (#449).
Remove the delete option from yokogawa to ome zarr (#443).
Reorder task inputs (#451).
JSON Schemas for task arguments:
Add JSON Schemas for task arguments in the package manifest (#369, #384).
Add JSON Schemas for attributes of custom task-argument Pydantic models (#436).
Make schema-generation tools more general, when handling custom Pydantic models (#445).
Include titles for custom-model-typed arguments and argument attributes (#447).
Remove TaskArguments models and switch to Pydantic V1 validate_arguments (#369).
Make coercing&validating task arguments required, rather than optional (#408).
Remove default_args from manifest (#379, #393).
Other:
Make pydantic dependency required for running tasks, and pin it to V1 (#408).
Remove legacy executor definitions from manifest (#361).
Add GitHub action for testing pip install with/without fractal-tasks extra (#390).
Remove sqlmodel from dev dependencies (#374).
Relax constraint on torch version, from ==1.12.1 to <=2.0.0 (#406).
Review task docstrings and improve documentation (#413, #416).
Update anndata dependency requirements (from ^0.8.0 to >=0.8.0,<=0.9.1), and replace anndata.experimental.write_elem with anndata._io.specs.write_elem (#428).
Disable bugged validation of model_type argument in cellpose_segmentation (#344).
Raise an error if the user provides an unexpected argument to a task (#337); this applies to the case of running a task as a script, with a pydantic model for task-argument validation.
(major) Update task interface: remove filename extension from input_paths and output_path for all tasks, and add new arguments (image_extension,image_glob_pattern) to create_ome_zarr task (#323).
Implement logic for handling image_glob_patterns argument, both when globbing images and in Yokogawa metadata parsing (#326).
"},{"location":"custom_task/","title":"How to write a Fractal-compatible custom task","text":"
The fractal-tasks-core repository is the reference implementation for Fractal tasks and for Fractal task packages, but the Fractal platform can also be used to execute custom tasks.
For the most recent versions of Fractal (namely fractal-server v2 and fractal-tasks-core v1), the instructions for building your own tasks are available at https://fractal-analytics-platform.github.io/build_your_own_fractal_task.
As a reference, here is a copy of the legacy instructions for older Fractal versions, which are currently obsolete.
"},{"location":"custom_tasks_old/","title":"\u26a0\ufe0f OBSOLETE: How to write a Fractal-compatible custom task","text":"
\u26a0\ufe0f\u26a0\ufe0f These instructions are here just as a reference, but they refer to legacy versions of fractal-server. While the overall structure of the instructions is still valid, several details are now obsolete and won't work. \u26a0\ufe0f\u26a0\ufe0f
The fractal-tasks-core repository is the reference implementation for Fractal tasks and for Fractal task packages, but the Fractal platform can also be used to execute custom tasks.
This page lists the Fractal-compatibility requirements, for a single custom task or for a task package.
Note that these specifications evolve frequently, see e.g. discussion at https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/151.
Note: While the contents of this page remain valid, the recommended procedure to get up to speed and build a Python package of Fractal-compatible tasks is to use the template available at https://github.com/fractal-analytics-platform/fractal-tasks-template.
A Fractal task is mainly formed by two components:
A set of metadata, which are stored in the task table of the database of a fractal-server instance, see Task metadata.
An executable command, which can take some specific command-line arguments (see Command-line interface); the standard example is a Python script.
In the following we explain what are the Fractal-compatibility requirements for a single task, and then for a task package.
Each task must be associated to some metadata, so that it can be used in Fractal. The full specification is here, and the required attributes are:
name: the task name, e.g. \"Create OME-Zarr structure\";
command: a command that can be executed from the command line;
input_type: this can be any string (typical examples: \"image\" or \"zarr\"); the special value \"Any\" means that Fractal won't perform any check of the input_type when applying the task to a dataset.
output_type: same logic as input_type.
source: this is meant to be as close as possible to a unique task identifier; for custom tasks, it can be anything (e.g. \"my_task\"), but for tasks that are collected automatically from a package (see Task package) this attribute will have a very specific form (e.g. \"pip_remote:fractal_tasks_core:0.10.0:fractal-tasks::convert_yokogawa_to_ome-zarr\").
meta: a JSON object (similar to a Python dictionary) with some additional information, see Task meta-parameters.
There are multiple ways to get the appropriate metadata into the database, including a POST request to the fractal-server API (see Tasks section in the fractal-server API documentation) or the automated addition of a whole set of tasks through specific API endpoints (see Task package).
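As a rough illustration of the required attributes listed above, here is a minimal sketch that checks a candidate metadata dictionary before it is sent to the server. The helper `missing_task_attributes` and all values in `example_task` are hypothetical; in particular, the `command` string is an assumption about how a script-based task might be invoked.

```python
# Required attributes, as listed in the task-metadata description above
REQUIRED_TASK_ATTRIBUTES = (
    "name",
    "command",
    "input_type",
    "output_type",
    "source",
    "meta",
)


def missing_task_attributes(task: dict) -> list[str]:
    """Return the required attributes that are absent from `task`."""
    return [key for key in REQUIRED_TASK_ATTRIBUTES if key not in task]


# Hypothetical metadata for a custom task (values are illustrative only)
example_task = {
    "name": "My Task",
    "command": "python /some/path/my_task.py",  # assumed invocation
    "input_type": "Any",
    "output_type": "Any",
    "source": "my_task",
    "meta": {"parallelization_level": "well"},
}

incomplete_task = {"name": "My Task", "command": "python /some/path/my_task.py"}
```

A check like this only covers the presence of keys; the server-side specification remains the authority on types and allowed values.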
Therefore the task command must accept these additional command-line arguments. If the task is a Python script, this can be achieved easily by using the run_fractal_task function - which is available as part of fractal_tasks_core.tasks._utils.
The meta attribute of tasks (see the corresponding item in Task metadata) is where we specify some requirements on how the task should be run. This notably includes:
If the task has to be run in parallel (e.g. over multiple wells of an OME-Zarr dataset), then meta should include a key-value pair like {\"parallelization_level\": \"well\"}. If the parallelization_level key is missing, the task is considered as non-parallel.
If Fractal is configured to run on a SLURM cluster, meta may include additional information on the SLURM requirements (more info on the Fractal SLURM backend here).
When a task is run via Fractal, its input parameters (i.e. the ones in the file specified via the -j command-line option) will always include a set of keyword arguments with specific names:
The only task output which will be visible to Fractal is what goes in the output metadata-update file (i.e. the one specified through the --metadata-out command-line option). Note that this only holds for non-parallel tasks, while (for the moment) Fractal fully ignores the output of parallel tasks.
IMPORTANT: This means that each task must always write any output to disk, before ending.
The description of other advanced features is not yet available in this page.
Also other attributes of the Task metadata exist, and they would be recognized by other Fractal components (e.g. fractal-server or fractal-web). These include JSON Schemas for input parameters and additional documentation-related attributes.
In fractal-tasks-core, we use pydantic v1 to fully coerce and validate the input parameters into a set of given types.
Here we describe a simplified example of a Fractal-compatible Python task (for more realistic examples see the fractal-tasks-core tasks folder).
The script /some/path/my_task.py may look like
# Import a helper function from fractal_tasks_core\nfrom fractal_tasks_core.tasks._utils import run_fractal_task\n\ndef my_task_function(\n    # Reserved Fractal arguments\n    input_paths,\n    output_path,\n    metadata,\n    # Task-specific arguments\n    argument_A,\n    argument_B = \"default_B_value\",\n):\n    # Do something, based on the task parameters\n    print(\"Here we go, we are in `my_task_function`\")\n    with open(f\"{output_path}/output.txt\", \"w\") as f:\n        f.write(f\"argument_A={argument_A}\\n\")\n        f.write(f\"argument_B={argument_B}\\n\")\n    # Compile the output metadata update and return\n    output_metadata_update = {\"nothing\": \"to add\"}\n    return output_metadata_update\n\n# This block is executed when running the Python script directly\nif __name__ == \"__main__\":\n    run_fractal_task(task_function=my_task_function)\n
where we use run_fractal_task so that we don't have to take care of the command-line arguments.
Some valid metadata attributes for this task would be:
Given a set of Python scripts corresponding to Fractal tasks, it is useful to combine them into a single Python package, using the standard tools or other options (e.g. for fractal-tasks-core we use poetry).
Creating a package is often a good practice, for reasons unrelated to Fractal:
It makes it simple to assign a global version to the package, and to host it on a public index like PyPI;
It may reduce code duplication:
The scripts may have a shared set of external dependencies, which are defined in a single place for a package.
The scripts may import functions from a shared set of auxiliary Python modules, which can be included in the package.
Moreover, having a single package also streamlines some Fractal-related operations. Given the package MyTasks (available on PyPI, or locally), the Fractal platform offers a feature that automatically:
Downloads the wheel file of package MyTasks (if it's on a public index, rather than a local file);
Creates a Python virtual environment (venv) which is specific for a given version of the MyTasks package, and installs the MyTasks package in that venv;
Populates all the corresponding entries in the task database table with the appropriate Task metadata, which are extracted from the package manifest.
This feature is currently exposed in the /api/v1/task/collect/pip/ endpoint of fractal-server (see API documentation).
To be compatible with Fractal, a task package must satisfy some additional requirements:
The package is built as a wheel file, and can be installed via pip.
The __FRACTAL_MANIFEST__.json file is bundled in the package, in its root folder. If you are using poetry, no special operation is needed. If you are using a setup.cfg file, see this comment.
Include JSON Schemas. The tools in fractal_tasks_core.dev are used to generate JSON Schemas for the input parameters of each task in fractal-tasks-core. They are meant to be flexible and re-usable to perform the same operation on an independent package, but they are not thoroughly documented/tested for more general use; feel free to open an issue if something is not clear.
Include additional task metadata like docs_info or docs_link, which will be displayed in the Fractal web-client. Note: this feature is not yet implemented.
The ones in the list are the main requirements; if you hit unexpected behaviors, also have a look at https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/151 or open a new issue.
"},{"location":"development/","title":"Development","text":""},{"location":"development/#setting-up-environment","title":"Setting up environment","text":"
We use poetry to manage both development environments and package building. A simple way to install it is pipx install poetry==1.8.2, or you can look at the installation section here.
From the repository root folder, running any of
# Install the core library only\npoetry install\n\n# Install the core library and the tasks\npoetry install -E fractal-tasks\n\n# Install the core library and the development/documentation dependencies\npoetry install --with dev --with docs\n
will take care of installing all the dependencies in a separate environment (handled by poetry itself), optionally installing also the dependencies for development and to build the documentation."},{"location":"development/#testing","title":"Testing","text":"
We use pytest for unit and integration testing of Fractal. If you installed the development dependencies, you may run the test suite by invoking commands like:
# Run all tests\npoetry run pytest\n\n# Run all tests with a verbose mode, and stop at the first failure\npoetry run pytest -x -v\n\n# Run all tests and also print their output\npoetry run pytest -s\n\n# Ignore some tests folders\npoetry run pytest --ignore tests/tasks\n
The test files are in the tests folder of the repository. Its structure reflects the fractal_tasks_core structure, with tests for the core library in the main folder and tests for tasks and dev subpackages in their own subfolders.
Tests are also run through GitHub Actions, with Python 3.9, 3.10 and 3.11. Note that within GitHub actions we run tests for both the poetry-installed and pip-installed versions of the code, which may e.g. have different versions of some dependencies (since pip install does not rely on the poetry.lock lockfile).
The documentation is built with mkdocs. To build the documentation locally, set up a development python environment (e.g. with poetry install --with docs) and then run one of these commands:
poetry run mkdocs serve --config-file mkdocs.yml # serves the docs at http://127.0.0.1:8000\npoetry run mkdocs build --config-file mkdocs.yml # creates a build in the `site` folder\n
A dedicated GitHub action takes care of building the documentation and pushing it to https://fractal-analytics-platform.github.io/fractal-tasks-core, when commits are pushed to the main branch.
"},{"location":"development/#release-to-pypi","title":"Release to PyPI","text":""},{"location":"development/#preliminary-check-list","title":"Preliminary check-list","text":"
The main branch is checked out.
All tests are passing, for the main branch.
CHANGELOG.md is up to date.
If appropriate (e.g. if you added some new task arguments, or if you modified some of their descriptions), update the JSON Schemas in the manifest via:
poetry run python fractal_tasks_core/dev/create_manifest.py\n
(note that the CI will fail if you forgot to update the manifest, but it is good to be aware of it)
# Automatic bump of release number\npoetry run bumpver update --[tag-num|patch|minor] --dry\n\n# Set a specific version\npoetry run bumpver update --set-version 1.2.3 --dry\n
to test updating the version bump
If the previous step looks good, remove the --dry and re-run the same command. This will commit both the edited files and the new tag, and push.
Approve the new version deployment at Publish package to PyPI (or have it approved); the corresponding GitHub action will take care of running poetry build and poetry publish with the appropriate credentials.
"},{"location":"development/#static-type-checker","title":"Static type checker","text":"
We do not enforce strict mypy compliance, but we do run it as part of a specific GitHub Action. You can run mypy locally for instance as:
poetry run mypy --package fractal_tasks_core --ignore-missing-imports --warn-redundant-casts --warn-unused-ignores --warn-unreachable --pretty\n
"},{"location":"install/","title":"How to install","text":"
The fractal_tasks_core Python package is hosted on PyPI (https://pypi.org/project/fractal-tasks-core), and can be installed via pip. It includes three (sub)packages:
The main fractal_tasks_core package: a set of helper functions to be used in the Fractal tasks (and possibly in other independent packages).
The fractal_tasks_core.tasks subpackage: a set of standard Fractal tasks.
The fractal_tasks_core.dev subpackage: a set of development tools (mostly related to creation of JSON Schemas for task arguments).
which only installs the dependencies necessary for the main package and for the dev subpackage."},{"location":"install/#full-installation","title":"Full installation","text":"
In order to also use the tasks subpackage, the additional extra fractal-tasks must be included, as in
pip install fractal-tasks-core[fractal-tasks]\n
Warning: This command installs heavier dependencies (e.g. torch)."},{"location":"tables/","title":"Table specifications","text":"
Within fractal-tasks-core, we make use of tables which are AnnData objects stored within OME-Zarr image groups. This page describes the different kinds of tables we use, and it includes:
A core table specification, valid for all tables;
The definition of tables for regions of interests (ROIs);
The definition of masking ROI tables, namely ROI tables that are linked e.g. to labels;
A feature-table specification, to store measurements.
Note: The specifications below are largely inspired by a proposed update to OME-NGFF specs. This update is currently on hold, and fractal-tasks-core will evolve as soon as an official NGFF table specs is adopted - see also the Outlook section.
In this section we describe version 1 (V1) of the Fractal table specifications; for the moment, only V1 exists. Note that V1 specifications are only implemented as of version 0.14.0 of fractal-tasks-core.
The core-table specification consists in the definition of the required Zarr structure and attributes, and of the AnnData table format.
AnnData table format
We store tabular data into Zarr groups as AnnData (\"Annotated Data\") objects; the anndata Python library provides the definition of this format and the relevant tools. Quoting from the anndata documentation:
AnnData is specifically designed for matrix-like data. By this we mean that we have \\(n\\) observations, each of which can be represented as \\(d\\)-dimensional vectors, where each dimension corresponds to a variable or feature. Both the rows and columns of this \\(n \\times d\\) matrix are special in the sense that they are indexed.
Note that AnnData tables are easily transformed from/into pandas.DataFrame objects - see e.g. the AnnData.to_df method.
Zarr structure and attributes
The structure of Zarr groups is based on the image specification in NGFF 0.4, with an additional tables group and the corresponding subgroups (similar to labels):
image.zarr # Zarr group for a NGFF image\n|\n\u251c\u2500\u2500 0 # Zarr array for multiscale level 0\n\u251c\u2500\u2500 ...\n\u251c\u2500\u2500 N # Zarr array for multiscale level N\n|\n\u251c\u2500\u2500 labels # Zarr subgroup with a list of labels associated to this image\n| \u251c\u2500\u2500 label_A # Zarr subgroup for a given label\n| \u251c\u2500\u2500 label_B # Zarr subgroup for a given label\n| \u2514\u2500\u2500 ...\n|\n\u2514\u2500\u2500 tables # Zarr subgroup with a list of tables associated to this image\n \u251c\u2500\u2500 table_1 # Zarr subgroup for a given table\n \u251c\u2500\u2500 table_2 # Zarr subgroup for a given table\n \u2514\u2500\u2500 ...\n
The Zarr attributes of the tables group must include the key tables, pointing to the list of all tables (this simplifies discovery of tables associated to the current NGFF image), as in image.zarr/tables/.zattrs
{\n\"tables\": [\"table_1\", \"table_2\"]\n}\n
The Zarr attributes of each specific-table group must include the version of the table specification (currently version 1), through the fractal_table_version attribute. Also note that the anndata function to write an AnnData object into a Zarr group automatically sets additional attributes. Here is an example of the resulting Zarr attributes: image.zarr/tables/table_1/.zattrs
{\n\"fractal_table_version\": \"1\",\n\"encoding-type\": \"anndata\", // Automatically added by anndata 0.11\n\"encoding-version\": \"0.1.0\", // Automatically added by anndata 0.11\n}\n
In fractal-tasks-core, a ROI table defines regions of space which are three-dimensional (see also the Outlook section about dimensionality flexibility) and box-shaped. Typical use cases are described here.
Zarr attributes
The specification of a ROI table is a subset of the core table one. Moreover, the table-group Zarr attributes must include the type attribute with value roi_table, as in image.zarr/tables/table_1/.zattrs
The var attribute of a given AnnData object indexes the columns of the table. A fractal-tasks-core ROI table must include the following six columns:
x_micrometer, y_micrometer, z_micrometer: the lower bounds of the XYZ intervals defining the ROI, in micrometers;
len_x_micrometer, len_y_micrometer, len_z_micrometer: the XYZ edge lengths, in micrometers.
Notes:
The axes origin for the ROI positions (e.g. for x_micrometer) corresponds to the top-left corner of the image (for the YX axes) and to the lowest Z plane.
ROIs are defined in physical coordinates, and they do not store information on the number or size of pixels.
ROI tables may also include other columns, beyond the required ones. Here are the ones that are typically used in fractal-tasks-core (see also the Use cases section):
x_micrometer_original and y_micrometer_original, which are a copy of x_micrometer and y_micrometer taken before applying some transformation;
translation_x, translation_y and translation_z, which are used during registration of multiplexing acquisitions;
label, which is used to link a ROI to a label (either for masking ROI tables or for feature tables).
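Since ROIs are stored in physical units, turning one into array indices requires pixel sizes from elsewhere (e.g. the image metadata). The following is a minimal sketch of that conversion, assuming the six required columns are available as a plain dictionary; `roi_to_pixel_slices` and the pixel sizes are hypothetical, and this is not the conversion routine used by fractal-tasks-core.

```python
def roi_to_pixel_slices(
    roi: dict[str, float],
    pixel_sizes_zyx: tuple[float, float, float],
) -> tuple[slice, ...]:
    """Convert a box-shaped ROI (in micrometers) into ZYX index slices."""
    slices = []
    for axis, pixel_size in zip("zyx", pixel_sizes_zyx):
        start = roi[f"{axis}_micrometer"]  # lower bound of the interval
        length = roi[f"len_{axis}_micrometer"]  # edge length
        slices.append(
            slice(round(start / pixel_size), round((start + length) / pixel_size))
        )
    return tuple(slices)


# Illustrative ROI: origin at the top-left corner (YX) and lowest Z plane
example_roi = {
    "x_micrometer": 10.0,
    "y_micrometer": 0.0,
    "z_micrometer": 0.0,
    "len_x_micrometer": 20.0,
    "len_y_micrometer": 5.0,
    "len_z_micrometer": 2.0,
}
# Assumed pixel sizes, in micrometers, for the Z, Y and X axes
example_slices = roi_to_pixel_slices(example_roi, (1.0, 0.5, 0.5))
```

Keeping ROIs in micrometers means the same table can be applied to any pyramid level, by passing the pixel sizes of that level.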
"},{"location":"tables/#masking-roi-tables","title":"Masking ROI tables","text":"
Masking ROI tables are a specific instance of the basic ROI tables described above, where each ROI must also be associated to a specific label of a label image.
Motivation
The motivation for this association is based on the following use case:
By performing segmentation of a NGFF image, we identify N objects and we store them as a label image (where the value at each pixel corresponds to the label index);
We also compute the three-dimensional bounding box of each segmented object, and store these bounding boxes into a masking ROI table;
For each one of these ROIs, we also include information that links it to both the label image and a specific label index;
During further processing we can load/modify specific sub-regions of the ROI, based on information contained in the label image. These kinds of operations are masked, as they only act on the array elements that match a certain condition on the label value.
Zarr attributes
For this kind of table, fractal-tasks-core closely follows the proposed NGFF update mentioned above. The requirements on the Zarr attributes of a given table are:
Attributes must contain a type key, with value masking_roi_table.
Attributes must contain a region key; the corresponding value must be an object with a path key and a string value (i.e. the path to the data the table is annotating).
Attributes must include a key instance_key, which is the key in obs that denotes which instance in region the row corresponds to.
Here is an example of valid Zarr attributes image.zarr/tables/table_1/.zattrs
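A set of attributes satisfying the three requirements above can be sketched as a Python dictionary and serialized to JSON. The region path and the instance key are illustrative values, and `check_table_attrs` is a hypothetical helper that only encodes the requirements listed in this section (actual tables may carry additional keys).

```python
import json

attrs = {
    "type": "masking_roi_table",
    "region": {"path": "../labels/label_DAPI"},
    "instance_key": "label",
}


def check_table_attrs(attrs, expected_type):
    """Raise ValueError if `attrs` violates one of the listed requirements."""
    if attrs.get("type") != expected_type:
        raise ValueError(f"type must be {expected_type!r}")
    region = attrs.get("region")
    if not isinstance(region, dict) or not isinstance(region.get("path"), str):
        raise ValueError("region must be an object with a string path")
    if not isinstance(attrs.get("instance_key"), str):
        raise ValueError("instance_key must be a string")


check_table_attrs(attrs, "masking_roi_table")
print(json.dumps(attrs, indent=4))
```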
On top of the required ROI-table columns, the masking-ROI-table AnnData object must have an attribute obs with a key matching the instance_key zarr attribute. For instance if instance_key=\"label\" then table.obs[\"label\"] must exist, with its items matching the labels in the image in \"../labels/label_DAPI\".
The typical use case for feature tables is to store measurements related to segmented objects, while maintaining a link to the original instances (e.g. labels). Note that the current specification is aligned to the one of masking ROI tables, since they both need to relate a table to a label image, but the two may diverge in the future.
As part of the current fractal-tasks-core tasks, measurements can be performed e.g. via regionprops from scikit-image (as wrapped in napari-skimage-regionprops).
Zarr attributes
For this kind of table, fractal-tasks-core closely follows the proposed NGFF update mentioned above. The requirements on the Zarr attributes of a given table are:
Attributes must contain a type key, with value feature_table.
Attributes must contain a region key; the corresponding value must be an object with a path key and a string value (i.e. the path to the data the table is annotating).
Attributes must include a key instance_key, which is the key in obs that denotes which instance in region the row corresponds to.
Here is an example of valid Zarr attributes image.zarr/tables/table_1/.zattrs
The feature-table AnnData object must have an attribute obs with a key matching the instance_key zarr attribute. For instance if instance_key=\"label\" then table.obs[\"label\"] must exist, with its items matching the labels in the image in \"../labels/label_DAPI\".
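The link between a feature table and its label image can be checked along these lines. This sketch uses a plain dictionary as a stand-in for the obs attribute of an AnnData object, and it interprets "matching" as "every label in obs exists in the label image", which is an assumption; `check_feature_table` is a hypothetical name.

```python
def check_feature_table(attrs, obs, image_labels):
    """Check that `obs` has the column named by `instance_key` and that its
    items are labels of the label image."""
    key = attrs["instance_key"]
    if key not in obs:
        raise KeyError(f"obs has no column {key!r}")
    unknown = set(obs[key]) - set(image_labels)
    if unknown:
        raise ValueError(f"labels not found in label image: {sorted(unknown)}")


attrs = {
    "type": "feature_table",
    "region": {"path": "../labels/label_DAPI"},
    "instance_key": "label",
}
obs = {"label": [1, 2, 3]}  # stand-in for table.obs

check_feature_table(attrs, obs, image_labels=[1, 2, 3])
```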
"},{"location":"tables/#examples","title":"Examples","text":""},{"location":"tables/#use-cases-for-roi-tables","title":"Use cases for ROI tables","text":""},{"location":"tables/#ome-zarr-creation","title":"OME-Zarr creation","text":"
OME-Zarrs created via fractal-tasks-core (e.g. by parsing Yokogawa images via the create_ome_zarr or create_ome_zarr_multiplex tasks) always include two specific ROI tables:
The table named well_ROI_table, which covers the NGFF image corresponding to the whole well;
The table named FOV_ROI_table, which lists all original fields of view (FOVs).
Each one of these two tables includes ROIs that span the whole image size along the Z axis. Note that this differs, e.g., from ROIs which are the bounding boxes of three-dimensional segmented objects, and which may cover only a part of the image Z size.
When working with an externally-generated OME-Zarr, one may use the import_ome_zarr task to make it compatible with fractal-tasks-core. This task optionally adds two ROI tables to the NGFF images:
The table named image_ROI_table, which covers the whole image;
A table named grid_ROI_table, which splits the whole-image ROI into a YX rectangular grid of smaller ROIs. This may correspond to original FOVs (in case the image is a tiled well), or it may simply be useful for applying downstream processing to smaller arrays and avoiding large memory requirements.
As with the well_ROI_table and FOV_ROI_table described above, these two tables also include ROIs spanning the whole image extent along the Z axis.
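The grid splitting described for grid_ROI_table can be sketched as follows. This is not the import_ome_zarr implementation: `make_grid_rois` and its tile-size arguments (given here in micrometers) are hypothetical, and the sketch only illustrates that each tile spans the full Z extent.

```python
import math


def make_grid_rois(len_x, len_y, len_z, tile_x, tile_y):
    """Split a whole-image ROI into a YX grid of ROIs (all in micrometers).

    Tiles on the right and bottom edges are clipped to the image size.
    """
    rois = []
    n_x = math.ceil(len_x / tile_x)
    n_y = math.ceil(len_y / tile_y)
    for i_y in range(n_y):
        for i_x in range(n_x):
            x0 = i_x * tile_x
            y0 = i_y * tile_y
            rois.append(
                {
                    "x_micrometer": x0,
                    "y_micrometer": y0,
                    "z_micrometer": 0.0,
                    "len_x_micrometer": min(tile_x, len_x - x0),
                    "len_y_micrometer": min(tile_y, len_y - y0),
                    # Every grid ROI covers the whole Z extent
                    "len_z_micrometer": len_z,
                }
            )
    return rois


grid = make_grid_rois(832.0, 351.0, 5.0, tile_x=416.0, tile_y=351.0)
print(len(grid), grid[1]["x_micrometer"])
```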
ROI tables are also used and updated during image processing, e.g. as in:
The FOV ROI table may undergo transformations during processing, e.g. FOV ROIs may be shifted to avoid overlaps; in this case, we use the optional columns x_micrometer_original and y_micrometer_original to store the values before the transformation.
The FOV ROI table is also used to store information on the registration of multiplexing acquisitions, via the translation_x, translation_y and translation_z optional columns.
Several tasks in fractal-tasks-core take an existing ROI table as an input and then loop over the ROIs defined in the table. This makes the task more flexible, as it can be used to process e.g. a whole well, a set of FOVs, or a set of custom regions of the array.
The anndata library offers a set of functions for input/output of AnnData tables, including functions specifically targeting the Zarr format.
"},{"location":"tables/#reading-a-table","title":"Reading a table","text":"
To read an AnnData table from a Zarr group, one may use the read_zarr function. In the following example, an NGFF image was created by stitching together two fields of view, where each one is made of a stack of five Z planes with 1 um spacing between the planes. The FOV_ROI_table has information on the XY position and size of the two original FOVs (named FOV_1 and FOV_2):
In this case, the second FOV (labeled FOV_2) is defined as the three-dimensional region such that
X is between 416 and 832 micrometers;
Y is between 0 and 351 micrometers;
Z is between 0 and 5 - which means that all the five available Z planes are included.
"},{"location":"tables/#writing-a-table","title":"Writing a table","text":"
The anndata.experimental.write_elem function provides the required functionality to write an AnnData object to a Zarr group. In fractal-tasks-core, the write_table helper function wraps the anndata function and includes additional functionalities -- see its documentation.
With respect to the wrapped anndata function, the main additional features of write_table are:
The boolean parameter overwrite (defaulting to False), which determines the behavior in case of an already-existing table at the given path.
The table_attrs parameter, as a shorthand for updating the Zarr attributes of the table group after its creation.
These specifications may evolve (especially based on the future NGFF updates), eventually leading to breaking changes in future versions. fractal-tasks-core will aim at maintaining backwards-compatibility with V1 for a reasonable amount of time.
Here is an in-progress list of aspects that may be reviewed:
We aim at removing the use of hard-coded units from the column names (e.g. x_micrometer), in favor of a more general definition of units.
The z_micrometer and len_z_micrometer columns are currently required in all ROI tables, even when the ROIs actually define a two-dimensional XY region; in that case, we set z_micrometer=0 and len_z_micrometer is such that the whole Z size is covered (that is, len_z_micrometer is the product of the spacing between Z planes and the number of planes). In a future version, we may introduce more flexibility and also accept ROI tables which only include X and Y axes, and adapt the relevant tools so that they automatically expand these ROIs into three dimensions when appropriate.
Concerning the use of AnnData tables or other formats for tabular data, our plan is to follow whatever serialised table specification becomes part of the NGFF standard. For the record, Zarr does not natively support storage of dataframes (see e.g. https://github.com/zarr-developers/numcodecs/issues/452), which is one aspect in favor of sticking with the anndata library.
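The current convention for the Z columns of a two-dimensional ROI, as described in the list above, can be sketched as follows. `expand_roi_to_3d` is a hypothetical helper and does not exist in fractal-tasks-core under this name; the input values are illustrative.

```python
def expand_roi_to_3d(roi_xy, z_spacing_micrometer, num_z_planes):
    """Add Z columns to an XY-only ROI so that it covers the whole Z size."""
    roi = dict(roi_xy)  # do not modify the input
    roi["z_micrometer"] = 0.0
    # Product of the spacing between Z planes and the number of planes
    roi["len_z_micrometer"] = z_spacing_micrometer * num_z_planes
    return roi


roi_xy = {
    "x_micrometer": 416.0,
    "y_micrometer": 0.0,
    "len_x_micrometer": 416.0,
    "len_y_micrometer": 351.0,
}
roi_3d = expand_roi_to_3d(roi_xy, z_spacing_micrometer=1.0, num_z_planes=5)
print(roi_3d["z_micrometer"], roi_3d["len_z_micrometer"])
```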
Within fractal-tasks-core, NGFF images represent whole wells; this still complies with the NGFF specifications, following an approved clarification in the specs. This is why the regions corresponding to the original FOVs are stored in a specific ROI table, since one NGFF image includes a collection of FOVs. Note that this approach does not rely on the assumption that the FOVs constitute a regular tiling of the well, but it also covers the case of irregularly placed FOVs.
Note that the table types masking_roi_table and feature_table closely resemble the type=\"ngff:region_table\" specification in the previous proposed NGFF table specs.
Home page: https://github.com/Apricot-Therapeutics/APx_fractal_task_collection
Description: The APx Fractal Task Collection is maintained by Apricot Therapeutics AG, Switzerland. This is a collection of tasks intended to be used in combination with the Fractal Analytics Platform maintained by the BioVisionCenter Zurich (co-founded by the Friedrich Miescher Institute and the University of Zurich). The tasks in this collection are focused on extending Fractal's capabilities of processing 2D image data, with a special focus on multiplexed 2D image data. Most tasks work with 3D image data, but they have not specifically been developed for this scenario.
A channel which is specified by either wavelength_id or label.
This model is similar to OmeroChannel, but it is used for task-function arguments (and for generating appropriate JSON schemas).
ATTRIBUTE DESCRIPTION wavelength_id
Unique ID for the channel wavelength, e.g. A01_C01.
TYPE: Optional[str]
label
Name of the channel.
TYPE: Optional[str]
Source code in fractal_tasks_core/channels.py
class ChannelInputModel(BaseModel):\n\"\"\"\n A channel which is specified by either `wavelength_id` or `label`.\n\n This model is similar to `OmeroChannel`, but it is used for\n task-function arguments (and for generating appropriate JSON schemas).\n\n Attributes:\n wavelength_id: Unique ID for the channel wavelength, e.g. `A01_C01`.\n label: Name of the channel.\n \"\"\"\n\n wavelength_id: Optional[str] = None\n label: Optional[str] = None\n\n @validator(\"label\", always=True)\n def mutually_exclusive_channel_attributes(cls, v, values):\n\"\"\"\n Check that either `label` or `wavelength_id` is set.\n \"\"\"\n wavelength_id = values.get(\"wavelength_id\")\n label = v\n if wavelength_id and v:\n raise ValueError(\n \"`wavelength_id` and `label` cannot be both set \"\n f\"(given {wavelength_id=} and {label=}).\"\n )\n if wavelength_id is None and v is None:\n raise ValueError(\n \"`wavelength_id` and `label` cannot be both `None`\"\n )\n return v\n
@validator(\"label\", always=True)\ndef mutually_exclusive_channel_attributes(cls, v, values):\n\"\"\"\n Check that either `label` or `wavelength_id` is set.\n \"\"\"\n wavelength_id = values.get(\"wavelength_id\")\n label = v\n if wavelength_id and v:\n raise ValueError(\n \"`wavelength_id` and `label` cannot be both set \"\n f\"(given {wavelength_id=} and {label=}).\"\n )\n if wavelength_id is None and v is None:\n raise ValueError(\n \"`wavelength_id` and `label` cannot be both `None`\"\n )\n return v\n
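The mutual-exclusivity rule enforced by this validator can be illustrated without Pydantic. The sketch below uses a standard-library dataclass as a stand-in; `ChannelSelector` is a hypothetical name, and the class mimics only the two checks shown in the source above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelSelector:
    """Stand-in for a channel specified by either wavelength_id or label."""

    wavelength_id: Optional[str] = None
    label: Optional[str] = None

    def __post_init__(self):
        # Same two conditions as in the validator
        if self.wavelength_id and self.label:
            raise ValueError("wavelength_id and label cannot be both set")
        if self.wavelength_id is None and self.label is None:
            raise ValueError("wavelength_id and label cannot be both None")


by_label = ChannelSelector(label="DAPI")
by_wavelength = ChannelSelector(wavelength_id="A01_C01")
print(by_label, by_wavelength)
```

Setting exactly one of the two attributes is accepted; setting both or neither raises a ValueError.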
Custom error for when get_channel_from_list fails, that can be captured and handled upstream if needed.
Source code in fractal_tasks_core/channels.py
class ChannelNotFoundError(ValueError):\n\"\"\"\n Custom error for when `get_channel_from_list` fails,\n that can be captured and handled upstream if needed.\n \"\"\"\n\n pass\n
Custom class for Omero channels, based on OME-NGFF v0.4.
ATTRIBUTE DESCRIPTION wavelength_id
Unique ID for the channel wavelength, e.g. A01_C01.
TYPE: str
index
Do not change. For internal use only.
TYPE: Optional[int]
label
Name of the channel.
TYPE: Optional[str]
window
Optional Window object to set default display settings for napari.
TYPE: Optional[Window]
color
Optional hex colormap to display the channel in napari (it must be of length 6, e.g. 00FFFF).
TYPE: Optional[str]
active
Should this channel be shown in the viewer?
TYPE: bool
coefficient
Do not change. Omero-channel attribute.
TYPE: int
inverted
Do not change. Omero-channel attribute.
TYPE: bool
Source code in fractal_tasks_core/channels.py
class OmeroChannel(BaseModel):\n\"\"\"\n Custom class for Omero channels, based on OME-NGFF v0.4.\n\n Attributes:\n wavelength_id: Unique ID for the channel wavelength, e.g. `A01_C01`.\n index: Do not change. For internal use only.\n label: Name of the channel.\n window: Optional `Window` object to set default display settings for\n napari.\n color: Optional hex colormap to display the channel in napari (it\n must be of length 6, e.g. `00FFFF`).\n active: Should this channel be shown in the viewer?\n coefficient: Do not change. Omero-channel attribute.\n inverted: Do not change. Omero-channel attribute.\n \"\"\"\n\n # Custom\n\n wavelength_id: str\n index: Optional[int]\n\n # From OME-NGFF v0.4 transitional metadata\n\n label: Optional[str]\n window: Optional[Window]\n color: Optional[str]\n active: bool = True\n coefficient: int = 1\n inverted: bool = False\n\n @validator(\"color\", always=True)\n def valid_hex_color(cls, v, values):\n\"\"\"\n Check that `color` is made of exactly six elements which are letters\n (a-f or A-F) or digits (0-9).\n \"\"\"\n if v is None:\n return v\n if len(v) != 6:\n raise ValueError(f'color must have length 6 (given: \"{v}\")')\n allowed_characters = \"abcdefABCDEF0123456789\"\n for character in v:\n if character not in allowed_characters:\n raise ValueError(\n \"color must only include characters from \"\n f'\"{allowed_characters}\" (given: \"{v}\")'\n )\n return v\n
Check that color is made of exactly six elements which are letters (a-f or A-F) or digits (0-9).
Source code in fractal_tasks_core/channels.py
@validator(\"color\", always=True)\ndef valid_hex_color(cls, v, values):\n\"\"\"\n Check that `color` is made of exactly six elements which are letters\n (a-f or A-F) or digits (0-9).\n \"\"\"\n if v is None:\n return v\n if len(v) != 6:\n raise ValueError(f'color must have length 6 (given: \"{v}\")')\n allowed_characters = \"abcdefABCDEF0123456789\"\n for character in v:\n if character not in allowed_characters:\n raise ValueError(\n \"color must only include characters from \"\n f'\"{allowed_characters}\" (given: \"{v}\")'\n )\n return v\n
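The same check can be written as a standalone predicate, which may be convenient for validating colors before building a channel. This is a sketch with a hypothetical function name, `is_valid_hex_color`; it relies on `string.hexdigits`, which contains the same 22 characters as the allowed set in the source above.

```python
import string


def is_valid_hex_color(color):
    """Return True if `color` is None or a six-character hex string."""
    if color is None:
        return True
    return len(color) == 6 and all(c in string.hexdigits for c in color)


for candidate in ["00FFFF", "00ffff", "0FF", "GGGGGG", None]:
    print(candidate, is_valid_hex_color(candidate))
```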
Custom class for Omero-channel window, based on OME-NGFF v0.4.
ATTRIBUTE DESCRIPTION min
Do not change. It will be set to 0 by default.
TYPE: Optional[int]
max
Do not change. It will be set according to bit-depth of the images by default (e.g. 65535 for 16 bit images).
TYPE: Optional[int]
start
Lower-bound rescaling value for visualization.
TYPE: int
end
Upper-bound rescaling value for visualization.
TYPE: int
Source code in fractal_tasks_core/channels.py
class Window(BaseModel):\n\"\"\"\n Custom class for Omero-channel window, based on OME-NGFF v0.4.\n\n Attributes:\n min: Do not change. It will be set to `0` by default.\n max:\n Do not change. It will be set according to bit-depth of the images\n by default (e.g. 65535 for 16 bit images).\n start: Lower-bound rescaling value for visualization.\n end: Upper-bound rescaling value for visualization.\n \"\"\"\n\n min: Optional[int]\n max: Optional[int]\n start: int\n end: int\n
Produce a string value that is not present in a given list
Append -1, -2, ... to a given string, if needed, until finding a value which is not already present in existing_values.
PARAMETER DESCRIPTION value
The first guess for the new value
TYPE: str
existing_values
The list of existing values
TYPE: list[str]
RETURNS DESCRIPTION str
A string value which is not present in existing_values
Source code in fractal_tasks_core/channels.py
def _get_new_unique_value(\n value: str,\n existing_values: list[str],\n) -> str:\n\"\"\"\n Produce a string value that is not present in a given list\n\n Append `-1`, `-2`, ... to a given string, if needed, until finding a value\n which is not already present in `existing_values`.\n\n Args:\n value: The first guess for the new value\n existing_values: The list of existing values\n\n Returns:\n A string value which is not present in `existing_values`\n \"\"\"\n counter = 1\n new_value = value\n while new_value in existing_values:\n new_value = f\"{value}-{counter}\"\n counter += 1\n return new_value\n
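A short usage illustration may help here. The function body below is copied from the source above under a public, hypothetical name (`get_new_unique_value`) so that the block is self-contained; note that the code joins the counter with a hyphen.

```python
def get_new_unique_value(value, existing_values):
    """Same logic as the function above, repeated to keep this example
    self-contained."""
    counter = 1
    new_value = value
    while new_value in existing_values:
        new_value = f"{value}-{counter}"
        counter += 1
    return new_value


print(get_new_unique_value("DAPI", []))
print(get_new_unique_value("DAPI", ["DAPI"]))
print(get_new_unique_value("DAPI", ["DAPI", "DAPI-1"]))
```

The first call returns the value unchanged, and the following ones return the first free suffixed candidate.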
Check that the wavelength_id attributes of a channel list are unique.
PARAMETER DESCRIPTION channels
TBD
TYPE: list[OmeroChannel]
Source code in fractal_tasks_core/channels.py
def check_unique_wavelength_ids(channels: list[OmeroChannel]):\n\"\"\"\n Check that the `wavelength_id` attributes of a channel list are unique.\n\n Args:\n channels: TBD\n \"\"\"\n wavelength_ids = [c.wavelength_id for c in channels]\n if len(set(wavelength_ids)) < len(wavelength_ids):\n raise ValueError(\n f\"Non-unique wavelength_id's in {wavelength_ids}\\n\" f\"{channels=}\"\n )\n
Check that the channel labels for a well are unique.
First identify the channel-labels list for each image in the well, then compare lists and verify their intersection is empty.
PARAMETER DESCRIPTION well_zarr_path
path to an OME-NGFF well zarr group.
TYPE: str
Source code in fractal_tasks_core/channels.py
def check_well_channel_labels(*, well_zarr_path: str) -> None:\n\"\"\"\n Check that the channel labels for a well are unique.\n\n First identify the channel-labels list for each image in the well, then\n compare lists and verify their intersection is empty.\n\n Args:\n well_zarr_path: path to an OME-NGFF well zarr group.\n \"\"\"\n\n # Iterate over all images (multiplexing acquisitions, multi-FOVs, ...)\n group = zarr.open_group(well_zarr_path, mode=\"r+\")\n image_paths = [image[\"path\"] for image in group.attrs[\"well\"][\"images\"]]\n list_of_channel_lists = []\n for image_path in image_paths:\n channels = get_omero_channel_list(\n image_zarr_path=f\"{well_zarr_path}/{image_path}\"\n )\n list_of_channel_lists.append(channels[:])\n\n # For each pair of channel-labels lists, verify they do not overlap\n for ind_1, channels_1 in enumerate(list_of_channel_lists):\n labels_1 = set([c.label for c in channels_1])\n for ind_2 in range(ind_1):\n channels_2 = list_of_channel_lists[ind_2]\n labels_2 = set([c.label for c in channels_2])\n intersection = labels_1 & labels_2\n if intersection:\n hint = (\n \"Are you parsing fields of view into separate OME-Zarr \"\n \"images? This could lead to non-unique channel labels, \"\n \"and then could be the reason of the error\"\n )\n raise ValueError(\n \"Non-unique channel labels\\n\"\n f\"{labels_1=}\\n{labels_2=}\\n{hint}\"\n )\n
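The pairwise comparison at the core of this check can be sketched on plain lists of labels, without opening any zarr group. `find_label_overlaps` is a hypothetical helper, and unlike the function above it reports the overlaps instead of raising on the first one.

```python
from itertools import combinations


def find_label_overlaps(labels_per_image):
    """Return (index_1, index_2, shared_labels) for each pair of images
    whose channel labels overlap."""
    overlaps = []
    for (ind_1, labels_1), (ind_2, labels_2) in combinations(
        enumerate(labels_per_image), 2
    ):
        shared = set(labels_1) & set(labels_2)
        if shared:
            overlaps.append((ind_1, ind_2, shared))
    return overlaps


# One list of channel labels per image of the well
labels_per_image = [["DAPI", "GFP"], ["Cy5"], ["GFP"]]
overlaps = find_label_overlaps(labels_per_image)
print(overlaps)
```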
Update a channel list to use it in the OMERO/channels metadata.
Given a list of channel dictionaries, update each one of them by: 1. Adding a label (if missing); 2. Adding a set of OMERO-specific attributes; 3. Discarding all other attributes.
The new_channels output can be used in the attrs[\"omero\"][\"channels\"] attribute of an image group.
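PARAMETER DESCRIPTION channels
A list of channel dictionaries (each one must include the wavelength_id key).
TYPE: list[OmeroChannel]
bit_depth
bit depth.
TYPE: int
label_prefix
TBD
TYPE: Optional[str] DEFAULT: None
RETURNS DESCRIPTION list[dict[str, Union[str, int, bool, dict[str, int]]]]
new_channels, a new list of consistent channel dictionaries that can be written to OMERO metadata.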
Source code in fractal_tasks_core/channels.py
def define_omero_channels(\n *,\n channels: list[OmeroChannel],\n bit_depth: int,\n label_prefix: Optional[str] = None,\n) -> list[dict[str, Union[str, int, bool, dict[str, int]]]]:\n\"\"\"\n Update a channel list to use it in the OMERO/channels metadata.\n\n Given a list of channel dictionaries, update each one of them by:\n 1. Adding a label (if missing);\n 2. Adding a set of OMERO-specific attributes;\n 3. Discarding all other attributes.\n\n The `new_channels` output can be used in the `attrs[\"omero\"][\"channels\"]`\n attribute of an image group.\n\n Args:\n channels: A list of channel dictionaries (each one must include the\n `wavelength_id` key).\n bit_depth: bit depth.\n label_prefix: TBD\n\n Returns:\n `new_channels`, a new list of consistent channel dictionaries that\n can be written to OMERO metadata.\n \"\"\"\n\n new_channels = [c.copy(deep=True) for c in channels]\n default_colors = [\"00FFFF\", \"FF00FF\", \"FFFF00\"]\n\n for channel in new_channels:\n wavelength_id = channel.wavelength_id\n\n # If channel.label is None, set it to a default value\n if channel.label is None:\n default_label = wavelength_id\n if label_prefix:\n default_label = f\"{label_prefix}_{default_label}\"\n logging.warning(\n f\"Missing label for {channel=}, using {default_label=}\"\n )\n channel.label = default_label\n\n # If channel.color is None, set it to a default value (use the default\n # ones for the first three channels, or gray otherwise)\n if channel.color is None:\n try:\n channel.color = default_colors.pop()\n except IndexError:\n channel.color = \"808080\"\n\n # Set channel.window attribute\n if channel.window:\n channel.window.min = 0\n channel.window.max = 2**bit_depth - 1\n\n # Check that channel labels are unique for this image\n labels = [c.label for c in new_channels]\n if len(set(labels)) < len(labels):\n raise ValueError(f\"Non-unique labels in {new_channels=}\")\n\n new_channels_dictionaries = [\n c.dict(exclude={\"index\"}, exclude_unset=True) for c in 
new_channels\n ]\n\n return new_channels_dictionaries\n
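The defaulting rules applied by this function (label, color, window range) can be illustrated on plain dictionaries. This is a simplified sketch, not the library function: `fill_channel_defaults` is a hypothetical name, the label prefix and the uniqueness check are omitted, and, as in the source above, colors are taken with `list.pop()`, i.e. from the end of the default list.

```python
def fill_channel_defaults(channels, bit_depth):
    """Return copies of `channels` with default label, color and window range."""
    default_colors = ["00FFFF", "FF00FF", "FFFF00"]
    new_channels = []
    for channel in channels:
        channel = dict(channel)
        if channel.get("label") is None:
            channel["label"] = channel["wavelength_id"]
        if channel.get("color") is None:
            # pop() takes the last remaining default; gray once exhausted
            channel["color"] = default_colors.pop() if default_colors else "808080"
        if channel.get("window"):
            channel["window"] = {
                **channel["window"],
                "min": 0,
                "max": 2**bit_depth - 1,
            }
        new_channels.append(channel)
    return new_channels


channels = [
    {"wavelength_id": "A01_C01"},
    {"wavelength_id": "A02_C02", "color": "FFFFFF", "window": {"start": 0, "end": 100}},
]
filled = fill_channel_defaults(channels, bit_depth=16)
print(filled)
```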
This is a helper function that combines get_omero_channel_list with get_channel_from_list.
PARAMETER DESCRIPTION image_zarr_path
Path to an OME-NGFF image zarr group.
TYPE: str
label
label attribute of the channel to be extracted.
TYPE: Optional[str] DEFAULT: None
wavelength_id
wavelength_id attribute of the channel to be extracted.
TYPE: Optional[str] DEFAULT: None
RETURNS DESCRIPTION OmeroChannel
A single channel dictionary.
Source code in fractal_tasks_core/channels.py
def get_channel_from_image_zarr(\n *,\n image_zarr_path: str,\n label: Optional[str] = None,\n wavelength_id: Optional[str] = None,\n) -> OmeroChannel:\n\"\"\"\n Extract a channel from OME-NGFF zarr attributes.\n\n This is a helper function that combines `get_omero_channel_list` with\n `get_channel_from_list`.\n\n Args:\n image_zarr_path: Path to an OME-NGFF image zarr group.\n label: `label` attribute of the channel to be extracted.\n wavelength_id: `wavelength_id` attribute of the channel to be\n extracted.\n\n Returns:\n A single channel dictionary.\n \"\"\"\n omero_channels = get_omero_channel_list(image_zarr_path=image_zarr_path)\n channel = get_channel_from_list(\n channels=omero_channels, label=label, wavelength_id=wavelength_id\n )\n return channel\n
Find the channel that has the required values of label and/or wavelength_id, and identify its positional index (which also corresponds to its index in the zarr array).
PARAMETER DESCRIPTION channels
A list of channel dictionaries, where each channel includes (at least) the label and wavelength_id keys.
TYPE: list[OmeroChannel]
label
The label to look for in the list of channels.
TYPE: Optional[str] DEFAULT: None
wavelength_id
The wavelength_id to look for in the list of channels.
TYPE: Optional[str] DEFAULT: None
RETURNS DESCRIPTION OmeroChannel
A single channel dictionary.
Source code in fractal_tasks_core/channels.py
def get_channel_from_list(\n *,\n channels: list[OmeroChannel],\n label: Optional[str] = None,\n wavelength_id: Optional[str] = None,\n) -> OmeroChannel:\n\"\"\"\n Find matching channel in a list.\n\n Find the channel that has the required values of `label` and/or\n `wavelength_id`, and identify its positional index (which also\n corresponds to its index in the zarr array).\n\n Args:\n channels: A list of channel dictionary, where each channel includes (at\n least) the `label` and `wavelength_id` keys.\n label: The label to look for in the list of channels.\n wavelength_id: The wavelength_id to look for in the list of channels.\n\n Returns:\n A single channel dictionary.\n \"\"\"\n\n # Identify matching channels\n if label:\n if wavelength_id:\n # Both label and wavelength_id are specified\n matching_channels = [\n c\n for c in channels\n if (c.label == label and c.wavelength_id == wavelength_id)\n ]\n else:\n # Only label is specified\n matching_channels = [c for c in channels if c.label == label]\n else:\n if wavelength_id:\n # Only wavelength_id is specified\n matching_channels = [\n c for c in channels if c.wavelength_id == wavelength_id\n ]\n else:\n # Neither label or wavelength_id are specified\n raise ValueError(\n \"get_channel requires at least one in {label,wavelength_id} \"\n \"arguments\"\n )\n\n # Verify that there is one and only one matching channel\n if len(matching_channels) == 0:\n required_match = [f\"{label=}\", f\"{wavelength_id=}\"]\n required_match_string = \" and \".join(\n [x for x in required_match if \"None\" not in x]\n )\n raise ChannelNotFoundError(\n f\"ChannelNotFoundError: No channel found in {channels}\"\n f\" for {required_match_string}\"\n )\n if len(matching_channels) > 1:\n raise ValueError(f\"Inconsistent set of channels: {channels}\")\n\n channel = matching_channels[0]\n channel.index = channels.index(channel)\n return channel\n
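The matching logic can be condensed into a single filter, as in the sketch below. It uses a dataclass stand-in for OmeroChannel and hypothetical names (`Channel`, `ChannelNotFound`, `find_channel`); the error messages are shortened, but the branches (no criterion, no match, several matches) follow the source above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Channel:
    """Minimal stand-in for OmeroChannel."""

    wavelength_id: str
    label: Optional[str] = None
    index: Optional[int] = None


class ChannelNotFound(ValueError):
    """Stand-in for ChannelNotFoundError."""


def find_channel(channels, label=None, wavelength_id=None):
    if not label and not wavelength_id:
        raise ValueError("at least one of label, wavelength_id is required")
    matching = [
        c
        for c in channels
        if (not label or c.label == label)
        and (not wavelength_id or c.wavelength_id == wavelength_id)
    ]
    if len(matching) == 0:
        raise ChannelNotFound(f"no channel for {label=}, {wavelength_id=}")
    if len(matching) > 1:
        raise ValueError(f"Inconsistent set of channels: {channels}")
    channel = matching[0]
    # Positional index, which also corresponds to the index in the zarr array
    channel.index = channels.index(channel)
    return channel


channels = [Channel("A01_C01", "DAPI"), Channel("A02_C02", "GFP")]
found = find_channel(channels, label="GFP")
print(found.index, found.wavelength_id)
```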
Extract the list of channels from OME-NGFF zarr attributes.
PARAMETER DESCRIPTION image_zarr_path
Path to an OME-NGFF image zarr group.
TYPE: str
RETURNS DESCRIPTION list[OmeroChannel]
A list of channel dictionaries.
Source code in fractal_tasks_core/channels.py
def get_omero_channel_list(*, image_zarr_path: str) -> list[OmeroChannel]:\n\"\"\"\n Extract the list of channels from OME-NGFF zarr attributes.\n\n Args:\n image_zarr_path: Path to an OME-NGFF image zarr group.\n\n Returns:\n A list of channel dictionaries.\n \"\"\"\n group = zarr.open_group(image_zarr_path, mode=\"r+\")\n channels_dicts = group.attrs[\"omero\"][\"channels\"]\n channels = [OmeroChannel(**c) for c in channels_dicts]\n return channels\n
Make an existing list of Omero channels Fractal-compatible
The output channels all have keys label, wavelength_id and color; the wavelength_id values are unique across the channel list.
See https://ngff.openmicroscopy.org/0.4/index.html#omero-md for the definition of NGFF Omero metadata.
PARAMETER DESCRIPTION old_channels
Existing list of Omero-channel dictionaries
TYPE: list[dict[str, Any]]
RETURNS DESCRIPTION list[dict[str, Any]]
New list of Fractal-compatible Omero-channel dictionaries
Source code in fractal_tasks_core/channels.py
def update_omero_channels(\n old_channels: list[dict[str, Any]]\n) -> list[dict[str, Any]]:\n\"\"\"\n Make an existing list of Omero channels Fractal-compatible\n\n The output channels all have keys `label`, `wavelength_id` and `color`;\n the `wavelength_id` values are unique across the channel list.\n\n See https://ngff.openmicroscopy.org/0.4/index.html#omero-md for the\n definition of NGFF Omero metadata.\n\n Args:\n old_channels: Existing list of Omero-channel dictionaries\n\n Returns:\n New list of Fractal-compatible Omero-channel dictionaries\n \"\"\"\n new_channels = deepcopy(old_channels)\n existing_wavelength_ids: list[str] = []\n handled_channels = []\n\n default_colors = [\"00FFFF\", \"FF00FF\", \"FFFF00\"]\n\n def _get_next_color() -> str:\n try:\n return default_colors.pop(0)\n except IndexError:\n return \"808080\"\n\n # Channels that contain the key \"wavelength_id\"\n for ind, old_channel in enumerate(old_channels):\n if \"wavelength_id\" in old_channel.keys():\n handled_channels.append(ind)\n existing_wavelength_ids.append(old_channel[\"wavelength_id\"])\n new_channel = old_channel.copy()\n try:\n label = old_channel[\"label\"]\n except KeyError:\n label = str(ind + 1)\n new_channel[\"label\"] = label\n if \"color\" not in old_channel:\n new_channel[\"color\"] = _get_next_color()\n new_channels[ind] = new_channel\n\n # Channels that contain the key \"label\" but do not contain the key\n # \"wavelength_id\"\n for ind, old_channel in enumerate(old_channels):\n if ind in handled_channels:\n continue\n if \"label\" not in old_channel.keys():\n continue\n handled_channels.append(ind)\n label = old_channel[\"label\"]\n wavelength_id = _get_new_unique_value(\n label,\n existing_wavelength_ids,\n )\n existing_wavelength_ids.append(wavelength_id)\n new_channel = old_channel.copy()\n new_channel[\"wavelength_id\"] = wavelength_id\n if \"color\" not in old_channel:\n new_channel[\"color\"] = _get_next_color()\n new_channels[ind] = new_channel\n\n # Channels 
that do not contain the key \"label\" nor the key \"wavelength_id\"\n # NOTE: these channels must be treated last, as they have lower priority\n # w.r.t. existing \"wavelength_id\" or \"label\" values\n for ind, old_channel in enumerate(old_channels):\n if ind in handled_channels:\n continue\n label = str(ind + 1)\n wavelength_id = _get_new_unique_value(\n label,\n existing_wavelength_ids,\n )\n existing_wavelength_ids.append(wavelength_id)\n new_channel = old_channel.copy()\n new_channel[\"label\"] = label\n new_channel[\"wavelength_id\"] = wavelength_id\n if \"color\" not in old_channel:\n new_channel[\"color\"] = _get_next_color()\n new_channels[ind] = new_channel\n\n # Log old/new values of label, wavelength_id and color\n for ind, old_channel in enumerate(old_channels):\n label = old_channel.get(\"label\")\n color = old_channel.get(\"color\")\n wavelength_id = old_channel.get(\"wavelength_id\")\n old_attributes = (\n f\"Old attributes: {label=}, {wavelength_id=}, {color=}\"\n )\n label = new_channels[ind][\"label\"]\n wavelength_id = new_channels[ind][\"wavelength_id\"]\n color = new_channels[ind][\"color\"]\n new_attributes = (\n f\"New attributes: {label=}, {wavelength_id=}, {color=}\"\n )\n logging.info(\n \"Omero channel update:\\n\"\n f\" {old_attributes}\\n\"\n f\" {new_attributes}\"\n )\n\n return new_channels\n
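The three passes above amount to processing channels in priority order: first those with a wavelength_id, then those with only a label, then the rest. The following condensed sketch expresses that ordering with a sort key; `make_channels_compatible` is a hypothetical name, logging is omitted, and the block is meant as a reading aid for the source above, not as a replacement for it.

```python
def make_channels_compatible(old_channels):
    """Return copies of the channels with label, wavelength_id and color set."""
    default_colors = ["00FFFF", "FF00FF", "FFFF00"]
    new_channels = [dict(c) for c in old_channels]
    used_ids = []

    def unique(value):
        counter, candidate = 1, value
        while candidate in used_ids:
            candidate = f"{value}-{counter}"
            counter += 1
        return candidate

    def priority(ind):
        channel = old_channels[ind]
        if "wavelength_id" in channel:
            return 0
        return 1 if "label" in channel else 2

    # Stable order: by priority first, then by position in the list
    for ind in sorted(range(len(old_channels)), key=lambda i: (priority(i), i)):
        channel = new_channels[ind]
        channel.setdefault("label", str(ind + 1))
        if "wavelength_id" not in channel:
            channel["wavelength_id"] = unique(channel["label"])
        used_ids.append(channel["wavelength_id"])
        if "color" not in channel:
            channel["color"] = default_colors.pop(0) if default_colors else "808080"
    return new_channels


old = [{"label": "DAPI"}, {"wavelength_id": "DAPI", "color": "0000FF"}, {}]
new = make_channels_compatible(old)
print(new)
```

In this example the second channel keeps its existing wavelength_id, so the first channel (which only has a label) receives a suffixed one.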
This helper function is similar to write_table, in that it prepares the appropriate zarr groups (labels and the new-label one) and performs overwrite-dependent checks. Unlike write_table, this function does not actually write the label array to the new zarr group; that writing operation must take place in the actual task function, since in fractal-tasks-core it is done sequentially on different regions of the zarr array.
What this function does is:
Create the labels group, if needed.
If overwrite=False, check that the new label does not exist (either in zarr attributes or as a zarr sub-group).
Update the labels attribute of the image group.
If label_attrs is set, include this set of attributes in the new-label zarr group.
PARAMETER DESCRIPTION image_group
The group to write to.
TYPE: Group
label_name
The name of the new label; this name also overrides the multiscale name in NGFF-image Zarr attributes, if needed.
TYPE: str
overwrite
If False, check that the new label does not exist (either in zarr attributes or as a zarr sub-group); if True propagate parameter to create_group method, making it overwrite any existing sub-group with the given name.
TYPE: bool DEFAULT: False
label_attrs
Zarr attributes of the label-image group.
TYPE: dict[str, Any]
logger
The logger to use (if unset, use logging.getLogger(None)).
TYPE: Optional[Logger] DEFAULT: None
RETURNS DESCRIPTION group
Zarr group of the new label.
Source code in fractal_tasks_core/labels.py
def prepare_label_group(\n image_group: zarr.hierarchy.Group,\n label_name: str,\n label_attrs: dict[str, Any],\n overwrite: bool = False,\n logger: Optional[logging.Logger] = None,\n) -> zarr.group:\n\"\"\"\n Set the stage for writing labels to a zarr group\n\n This helper function is similar to `write_table`, in that it prepares the\n appropriate zarr groups (`labels` and the new-label one) and performs\n `overwrite`-dependent checks. At a difference with `write_table`, this\n function does not actually write the label array to the new zarr group;\n such writing operation must take place in the actual task function, since\n in fractal-tasks-core it is done sequentially on different `region`s of the\n zarr array.\n\n What this function does is:\n\n 1. Create the `labels` group, if needed.\n 2. If `overwrite=False`, check that the new label does not exist (either in\n zarr attributes or as a zarr sub-group).\n 3. Update the `labels` attribute of the image group.\n 4. If `label_attrs` is set, include this set of attributes in the\n new-label zarr group.\n\n Args:\n image_group:\n The group to write to.\n label_name:\n The name of the new label; this name also overrides the multiscale\n name in NGFF-image Zarr attributes, if needed.\n overwrite:\n If `False`, check that the new label does not exist (either in zarr\n attributes or as a zarr sub-group); if `True` propagate parameter\n to `create_group` method, making it overwrite any existing\n sub-group with the given name.\n label_attrs:\n Zarr attributes of the label-image group.\n logger:\n The logger to use (if unset, use `logging.getLogger(None)`).\n\n Returns:\n Zarr group of the new label.\n \"\"\"\n\n # Set logger\n if logger is None:\n logger = logging.getLogger(None)\n\n # Create labels group (if needed) and extract current_labels\n if \"labels\" not in set(image_group.group_keys()):\n labels_group = image_group.create_group(\"labels\", overwrite=False)\n else:\n labels_group = image_group[\"labels\"]\n 
current_labels = labels_group.attrs.asdict().get(\"labels\", [])\n\n # If overwrite=False, check that the new label does not exist (either as a\n # zarr sub-group or as part of the zarr-group attributes)\n if not overwrite:\n if label_name in set(labels_group.group_keys()):\n error_msg = (\n f\"Sub-group '{label_name}' of group {image_group.store.path} \"\n f\"already exists, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n if label_name in current_labels:\n error_msg = (\n f\"Item '{label_name}' already exists in `labels` attribute of \"\n f\"group {image_group.store.path}, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n\n # Update the `labels` metadata of the image group, if needed\n if label_name not in current_labels:\n new_labels = current_labels + [label_name]\n labels_group.attrs[\"labels\"] = new_labels\n\n # Define new-label group\n label_group = labels_group.create_group(label_name, overwrite=overwrite)\n\n # Validate attrs against NGFF specs 0.4\n try:\n meta = NgffImageMeta(**label_attrs)\n except ValidationError as e:\n error_msg = (\n \"Label attributes do not comply with NGFF image \"\n \"specifications, as encoded in fractal-tasks-core.\\n\"\n f\"Original error:\\nValidationError: {str(e)}\"\n )\n logger.error(error_msg)\n raise ValueError(error_msg)\n # Replace multiscale name with label_name, if needed\n current_multiscale_name = meta.multiscale.name\n if current_multiscale_name != label_name:\n logger.warning(\n f\"Setting multiscale name to '{label_name}' (old value: \"\n f\"'{current_multiscale_name}') in label-image NGFF \"\n \"attributes.\"\n )\n label_attrs[\"multiscales\"][0][\"name\"] = label_name\n # Overwrite label_group attributes with label_attrs key/value pairs\n label_group.attrs.put(label_attrs)\n\n return label_group\n
Postprocess cellpose output, mainly to restore its original background.
NOTE: The pre/post-processing functions and the masked_loading_wrapper are currently meant to work as part of the cellpose_segmentation task, with the plan of then making them more flexible; see https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.
PARAMETER DESCRIPTION modified_array
The 3D (ZYX) array with the correct object data and wrong background data.
TYPE: ndarray
original_array
The 3D (ZYX) array with the wrong object data and correct background data.
TYPE: ndarray
background
The 3D (ZYX) boolean array that defines the background.
TYPE: ndarray
RETURNS DESCRIPTION ndarray
The postprocessed array.
Source code in fractal_tasks_core/masked_loading.py
def _postprocess_output(\n *,\n modified_array: np.ndarray,\n original_array: np.ndarray,\n background: np.ndarray,\n) -> np.ndarray:\n\"\"\"\n Postprocess cellpose output, mainly to restore its original background.\n\n **NOTE**: The pre/post-processing functions and the\n masked_loading_wrapper are currently meant to work as part of the\n cellpose_segmentation task, with the plan of then making them more\n flexible; see\n https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.\n\n Args:\n modified_array: The 3D (ZYX) array with the correct object data and\n wrong background data.\n original_array: The 3D (ZYX) array with the wrong object data and\n correct background data.\n background: The 3D (ZYX) boolean array that defines the background.\n\n Returns:\n The postprocessed array.\n \"\"\"\n # Restore background\n modified_array[background] = original_array[background]\n return modified_array\n
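The background-restoring step above is plain NumPy boolean-mask assignment. A minimal sketch with toy arrays (the values and shapes are made up for illustration):

```python
import numpy as np

# Toy ZYX arrays, shape (1, 1, 4); values are illustrative only
original = np.array([[[1, 1, 0, 0]]])   # existing labels: correct background
modified = np.array([[[0, 0, 7, 7]]])   # new labels: correct objects
background = np.array([[[True, True, False, False]]])

# Same assignment as in _postprocess_output: wherever the mask marks
# background, copy the original values back. This edits `modified` in place.
modified[background] = original[background]
```

Because the assignment happens in place, a caller that still needs the unrestored array would have to copy it first.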
Loading the masking label array for the appropriate ROI;
Extracting the appropriate label value from the ROI_table.obs dataframe;
Constructing the background mask, where the masking label matches with a specific label value;
Setting the background of image_array to 0;
Loading the array which will be needed in postprocessing to restore background.
NOTE 1: This function relies on V1 of the Fractal table specifications, see https://fractal-analytics-platform.github.io/fractal-tasks-core/tables/.
NOTE 2: The pre/post-processing functions and the masked_loading_wrapper are currently meant to work as part of the cellpose_segmentation task, with the plan of then making them more flexible; see https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.
Naming of variables refers to a two-step labeling, as in \"first identify organoids, then look for nuclei inside each organoid\":
\"masking\" refers to the labels that are used to identify the object vs background (e.g. the organoid labels); these labels already exist.
\"current\" refers to the labels that are currently being computed in the cellpose_segmentation task, e.g. the nuclear labels.
PARAMETER DESCRIPTION image_array
The 4D CZYX array with image data for a specific ROI.
TYPE: ndarray
region
The ZYX indices of the ROI, in a form like (slice(0, 1), slice(1000, 2000), slice(1000, 2000)).
TYPE: tuple[slice, ...]
current_label_path
Path to the image used as current label, in a form like /somewhere/plate.zarr/A/01/0/labels/nuclei_in_organoids/0.
TYPE: str
ROI_table_path
Path of the AnnData table for the masking-label ROIs; this is used (together with ROI_positional_index) to extract label_value.
TYPE: str
ROI_positional_index
Index of the current ROI, which is used to extract label_value from ROI_table.obs.
TYPE: int
Returns: A tuple with three arrays: the preprocessed image array, the background mask, the current label.
Source code in fractal_tasks_core/masked_loading.py
def _preprocess_input(\n image_array: np.ndarray,\n *,\n region: tuple[slice, ...],\n current_label_path: str,\n ROI_table_path: str,\n ROI_positional_index: int,\n) -> tuple[np.ndarray, np.ndarray, np.ndarray]:\n\"\"\"\n Preprocess a four-dimensional cellpose input.\n\n This involves :\n\n - Loading the masking label array for the appropriate ROI;\n - Extracting the appropriate label value from the `ROI_table.obs`\n dataframe;\n - Constructing the background mask, where the masking label matches with a\n specific label value;\n - Setting the background of `image_array` to `0`;\n - Loading the array which will be needed in postprocessing to restore\n background.\n\n **NOTE 1**: This function relies on V1 of the Fractal table specifications,\n see\n https://fractal-analytics-platform.github.io/fractal-tasks-core/tables/.\n\n **NOTE 2**: The pre/post-processing functions and the\n masked_loading_wrapper are currently meant to work as part of the\n cellpose_segmentation task, with the plan of then making them more\n flexible; see\n https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.\n\n Naming of variables refers to a two-steps labeling, as in \"first identify\n organoids, then look for nuclei inside each organoid\") :\n\n - `\"masking\"` refers to the labels that are used to identify the object\n vs background (e.g. the organoid labels); these labels already exist.\n - `\"current\"` refers to the labels that are currently being computed in\n the `cellpose_segmentation` task, e.g. 
the nuclear labels.\n\n Args:\n image_array: The 4D CZYX array with image data for a specific ROI.\n region: The ZYX indices of the ROI, in a form like\n `(slice(0, 1), slice(1000, 2000), slice(1000, 2000))`.\n current_label_path: Path to the image used as current label, in a form\n like `/somewhere/plate.zarr/A/01/0/labels/nuclei_in_organoids/0`.\n ROI_table_path: Path of the AnnData table for the masking-label ROIs;\n this is used (together with `ROI_positional_index`) to extract\n `label_value`.\n ROI_positional_index: Index of the current ROI, which is used to\n extract `label_value` from `ROI_table.obs`.\n Returns:\n A tuple with three arrays: the preprocessed image array, the background\n mask, the current label.\n \"\"\"\n\n logger.info(f\"[_preprocess_input] {image_array.shape=}\")\n logger.info(f\"[_preprocess_input] {region=}\")\n\n # Check that image data are 4D (CZYX) - FIXME issue 340\n if not image_array.ndim == 4:\n raise ValueError(\n \"_preprocess_input requires a 4D \"\n f\"image_array argument, but {image_array.shape=}\"\n )\n\n # Load the ROI table and its metadata attributes\n ROI_table = ad.read_zarr(ROI_table_path)\n attrs = zarr.group(ROI_table_path).attrs\n logger.info(f\"[_preprocess_input] {ROI_table_path=}\")\n logger.info(f\"[_preprocess_input] {attrs.asdict()=}\")\n MaskingROITableAttrs(**attrs.asdict())\n label_relative_path = attrs[\"region\"][\"path\"]\n column_name = attrs[\"instance_key\"]\n\n # Check that ROI_table.obs has the right column and extract label_value\n if column_name not in ROI_table.obs.columns:\n raise ValueError(\n f'In _preprocess_input, \"{column_name}\" '\n f\" missing in {ROI_table.obs.columns=}\"\n )\n label_value = int(ROI_table.obs[column_name][ROI_positional_index])\n\n # Load masking-label array (lazily)\n masking_label_path = str(\n Path(ROI_table_path).parent / label_relative_path / \"0\"\n )\n logger.info(f\"{masking_label_path=}\")\n masking_label_array = da.from_zarr(masking_label_path)\n 
logger.info(\n f\"[_preprocess_input] {masking_label_path=}, \"\n f\"{masking_label_array.shape=}\"\n )\n\n # Load current-label array (lazily)\n current_label_array = da.from_zarr(current_label_path)\n logger.info(\n f\"[_preprocess_input] {current_label_path=}, \"\n f\"{current_label_array.shape=}\"\n )\n\n # Load ROI data for current label array\n current_label_region = current_label_array[region].compute()\n\n # Load ROI data for masking label array, with or without upscaling\n if masking_label_array.shape != current_label_array.shape:\n logger.info(\"Upscaling of masking label is needed\")\n lowres_region = convert_region_to_low_res(\n highres_region=region,\n highres_shape=current_label_array.shape,\n lowres_shape=masking_label_array.shape,\n )\n masking_label_region = masking_label_array[lowres_region].compute()\n masking_label_region = upscale_array(\n array=masking_label_region,\n target_shape=current_label_region.shape,\n )\n else:\n masking_label_region = masking_label_array[region].compute()\n\n # Check that all shapes match\n shapes = (\n masking_label_region.shape,\n current_label_region.shape,\n image_array.shape[1:],\n )\n if len(set(shapes)) > 1:\n raise ValueError(\n \"Shape mismatch:\\n\"\n f\"{current_label_region.shape=}\\n\"\n f\"{masking_label_region.shape=}\\n\"\n f\"{image_array.shape=}\"\n )\n\n # Compute background mask\n background_3D = masking_label_region != label_value\n if (masking_label_region == label_value).sum() == 0:\n raise ValueError(\n f\"Label {label_value} is not present in the extracted ROI\"\n )\n\n # Set image background to zero\n n_channels = image_array.shape[0]\n for i in range(n_channels):\n image_array[i, background_3D] = 0\n\n return (image_array, background_3D, current_label_region)\n
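The last two steps of the listing above (building the background mask and zeroing the image background) can be tried in isolation. A minimal sketch with in-memory toy arrays; `label_value` and the array contents are assumptions made for illustration, whereas the real function reads them from the ROI table and from on-disk label arrays:

```python
import numpy as np

label_value = 2  # hypothetical label of the masking object for this ROI
masking_label_region = np.array([[[1, 2], [2, 0]]])   # ZYX, shape (1, 2, 2)
image_array = np.arange(1, 9).reshape(2, 1, 2, 2)     # CZYX, shape (2, 1, 2, 2)

# Background is every voxel that does not carry the masking label
background_3D = masking_label_region != label_value

# Zero out the background in each channel (in place)
for i in range(image_array.shape[0]):
    image_array[i, background_3D] = 0
```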
Wrap a function with some pre/post-processing functions
PARAMETER DESCRIPTION function
The callable function to be wrapped.
TYPE: Callable
image_array
The image array to be preprocessed and then used as positional argument for function.
TYPE: ndarray
kwargs
Keyword arguments for function.
TYPE: Optional[dict] DEFAULT: None
use_masks
If False, the wrapper only calls function(*args, **kwargs).
TYPE: bool
preprocessing_kwargs
Keyword arguments for the preprocessing function (see call signature of _preprocess_input()).
TYPE: Optional[dict] DEFAULT: None
Source code in fractal_tasks_core/masked_loading.py
def masked_loading_wrapper(\n *,\n function: Callable,\n image_array: np.ndarray,\n kwargs: Optional[dict] = None,\n use_masks: bool,\n preprocessing_kwargs: Optional[dict] = None,\n):\n\"\"\"\n Wrap a function with some pre/post-processing functions\n\n Args:\n function: The callable function to be wrapped.\n image_array: The image array to be preprocessed and then used as\n positional argument for `function`.\n kwargs: Keyword arguments for `function`.\n use_masks: If `False`, the wrapper only calls\n `function(*args, **kwargs)`.\n preprocessing_kwargs: Keyword arguments for the preprocessing function\n (see call signature of `_preprocess_input()`).\n \"\"\"\n # Optional preprocessing\n if use_masks:\n preprocessing_kwargs = preprocessing_kwargs or {}\n (\n image_array,\n background_3D,\n current_label_region,\n ) = _preprocess_input(image_array, **preprocessing_kwargs)\n # Run function\n kwargs = kwargs or {}\n new_label_img = function(image_array, **kwargs)\n # Optional postprocessing\n if use_masks:\n new_label_img = _postprocess_output(\n modified_array=new_label_img,\n original_array=current_label_region,\n background=background_3D,\n )\n return new_label_img\n
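The control flow of the wrapper (optional preprocessing, function call, optional postprocessing) can be sketched without any on-disk data. Everything below is a simplified stand-in: `toy_segment` and `toy_masked_call` are hypothetical names, and the background mask and current labels are passed in directly instead of being loaded by `_preprocess_input`.

```python
import numpy as np


def toy_segment(image: np.ndarray) -> np.ndarray:
    """Hypothetical segmentation stand-in: label nonzero pixels of channel 0 as 5."""
    return np.where(image[0] > 0, 5, 0)


def toy_masked_call(function, image_array, *, background, current_labels, use_masks):
    """Simplified analogue of masked_loading_wrapper, with in-memory inputs."""
    if use_masks:
        image_array = image_array.copy()
        image_array[:, background] = 0  # preprocessing: hide the background
    new_labels = function(image_array)
    if use_masks:
        # postprocessing: restore pre-existing labels in the background
        new_labels[background] = current_labels[background]
    return new_labels


image = np.array([[[[3, 4, 5, 6]]]])                    # CZYX
background = np.array([[[True, True, False, False]]])   # ZYX
current = np.array([[[9, 9, 0, 0]]])                    # existing labels (ZYX)

masked = toy_masked_call(
    toy_segment, image, background=background, current_labels=current, use_masks=True
)
unmasked = toy_masked_call(
    toy_segment, image, background=background, current_labels=current, use_masks=False
)
```

With `use_masks=True` the new labels only appear inside the masked object, and the background keeps the labels that were already there.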
Starting from on-disk highest-resolution data, build and write to disk a pyramid with (num_levels - 1) coarsened levels. This function works for 2D, 3D or 4D arrays.
PARAMETER DESCRIPTION zarrurl
Path of the image zarr group, not including the multiscale-level path (e.g. \"some/path/plate.zarr/B/03/0\").
TYPE: Union[str, Path]
overwrite
Whether to overwrite existing pyramid levels.
TYPE: bool DEFAULT: False
num_levels
Total number of pyramid levels (including 0).
TYPE: int DEFAULT: 2
coarsening_xy
Linear coarsening factor between subsequent levels.
TYPE: int DEFAULT: 2
chunksize
Shape of a single chunk.
TYPE: Optional[Sequence[int]] DEFAULT: None
aggregation_function
Function to be used when downsampling.
TYPE: Optional[Callable] DEFAULT: None
Source code in fractal_tasks_core/pyramids.py
def build_pyramid(\n *,\n zarrurl: Union[str, pathlib.Path],\n overwrite: bool = False,\n num_levels: int = 2,\n coarsening_xy: int = 2,\n chunksize: Optional[Sequence[int]] = None,\n aggregation_function: Optional[Callable] = None,\n) -> None:\n\n\"\"\"\n Starting from on-disk highest-resolution data, build and write to disk a\n pyramid with `(num_levels - 1)` coarsened levels.\n This function works for 2D, 3D or 4D arrays.\n\n Args:\n zarrurl: Path of the image zarr group, not including the\n multiscale-level path (e.g. `\"some/path/plate.zarr/B/03/0\"`).\n overwrite: Whether to overwrite existing pyramid levels.\n num_levels: Total number of pyramid levels (including 0).\n coarsening_xy: Linear coarsening factor between subsequent levels.\n chunksize: Shape of a single chunk.\n aggregation_function: Function to be used when downsampling.\n \"\"\"\n\n # Clean up zarrurl\n zarrurl = str(pathlib.Path(zarrurl)) # FIXME\n\n # Select full-resolution multiscale level\n zarrurl_highres = f\"{zarrurl}/0\"\n logger.info(f\"[build_pyramid] High-resolution path: {zarrurl_highres}\")\n\n # Lazily load highest-resolution data\n data_highres = da.from_zarr(zarrurl_highres)\n logger.info(f\"[build_pyramid] High-resolution data: {str(data_highres)}\")\n\n # Check the number of axes and identify YX dimensions\n ndims = len(data_highres.shape)\n if ndims not in [2, 3, 4]:\n raise ValueError(f\"{data_highres.shape=}, ndims not in [2,3,4]\")\n y_axis = ndims - 2\n x_axis = ndims - 1\n\n # Set aggregation_function\n if aggregation_function is None:\n aggregation_function = np.mean\n\n # Compute and write lower-resolution levels\n previous_level = data_highres\n for ind_level in range(1, num_levels):\n # Verify that coarsening is doable\n if min(previous_level.shape[-2:]) < coarsening_xy:\n raise ValueError(\n f\"ERROR: at {ind_level}-th level, \"\n f\"coarsening_xy={coarsening_xy} \"\n f\"but previous level has shape {previous_level.shape}\"\n )\n # Apply coarsening\n newlevel = 
da.coarsen(\n aggregation_function,\n previous_level,\n {y_axis: coarsening_xy, x_axis: coarsening_xy},\n trim_excess=True,\n ).astype(data_highres.dtype)\n\n # Apply rechunking\n if chunksize is None:\n newlevel_rechunked = newlevel\n else:\n newlevel_rechunked = newlevel.rechunk(chunksize)\n logger.info(\n f\"[build_pyramid] Level {ind_level} data: \"\n f\"{str(newlevel_rechunked)}\"\n )\n\n # Write zarr and store output (useful to construct next level)\n previous_level = newlevel_rechunked.to_zarr(\n zarrurl,\n component=f\"{ind_level}\",\n overwrite=overwrite,\n compute=True,\n return_stored=True,\n write_empty_chunks=False,\n dimension_separator=\"/\",\n )\n
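To see what one coarsening step does to shapes and values, here is a NumPy-only sketch. It is a stand-in for the `da.coarsen(..., trim_excess=True)` call in the listing, not the library code; `coarsen_yx` is a hypothetical helper and the mean is used as the aggregation function, matching the default above.

```python
import numpy as np


def coarsen_yx(data: np.ndarray, factor: int) -> np.ndarray:
    """Average non-overlapping factor x factor blocks over the last two axes."""
    ny = (data.shape[-2] // factor) * factor  # trim excess rows
    nx = (data.shape[-1] // factor) * factor  # trim excess columns
    trimmed = data[..., :ny, :nx]
    blocks = trimmed.reshape(
        *data.shape[:-2], ny // factor, factor, nx // factor, factor
    )
    # Aggregate within each block, then cast back to the original dtype
    return blocks.mean(axis=(-3, -1)).astype(data.dtype)


level0 = np.arange(20, dtype=np.uint16).reshape(1, 4, 5)  # ZYX
levels = [level0]
for _ in range(1, 3):  # num_levels=3: two coarsened levels
    levels.append(coarsen_yx(levels[-1], 2))
shapes = [lvl.shape for lvl in levels]
```

Only the Y and X axes shrink; the odd-sized X axis (5) loses its last column before the first coarsening step.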
Upscale an array along a given list of axes (through repeated application of np.repeat), to match a target shape.
PARAMETER DESCRIPTION array
The array to be upscaled.
TYPE: ndarray
target_shape
The shape of the rescaled array.
TYPE: tuple[int, ...]
axis
The axes along which to upscale the array (if None, then all axes are used).
TYPE: Optional[Sequence[int]] DEFAULT: None
pad_with_zeros
If True, pad the upscaled array with zeros to match target_shape.
TYPE: bool DEFAULT: False
warn_if_inhomogeneous
If True, raise a warning when the conversion factors are not identical across all dimensions.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION ndarray
The upscaled array, with shape target_shape.
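The core of the upscaling is one `np.repeat` call per axis, with an integer factor given by the ratio of target to source size. A minimal sketch with a toy array (shapes chosen so that the factors are exact, and differ between axes):

```python
import numpy as np

array = np.array([[1, 2], [3, 4]])
target_shape = (4, 6)

upscaled = array
for ax in range(array.ndim):
    # Integer upscale factor along this axis: 2 for axis 0, 3 for axis 1
    factor = target_shape[ax] // array.shape[ax]
    upscaled = np.repeat(upscaled, factor, axis=ax)
```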
Source code in fractal_tasks_core/upscale_array.py
def upscale_array(\n *,\n array: np.ndarray,\n target_shape: tuple[int, ...],\n axis: Optional[Sequence[int]] = None,\n pad_with_zeros: bool = False,\n warn_if_inhomogeneous: bool = False,\n) -> np.ndarray:\n\"\"\"\n Upscale an array along a given list of axis (through repeated application\n of `np.repeat`), to match a target shape.\n\n Args:\n array: The array to be upscaled.\n target_shape: The shape of the rescaled array.\n axis: The axis along which to upscale the array (if `None`, then all\n axis are used).\n pad_with_zeros: If `True`, pad the upscaled array with zeros to match\n `target_shape`.\n warn_if_inhomogeneous: If `True`, raise a warning when the conversion\n factors are not identical across all dimensions.\n\n Returns:\n The upscaled array, with shape `target_shape`.\n \"\"\"\n\n # Default behavior: use all axis\n if axis is None:\n axis = list(range(len(target_shape)))\n\n array_shape = array.shape\n info = (\n f\"Trying to upscale from {array_shape=} to {target_shape=}, \"\n f\"acting on {axis=}.\"\n )\n\n if len(array_shape) != len(target_shape):\n raise ValueError(f\"{info} Dimensions-number mismatch.\")\n if axis == []:\n raise ValueError(f\"{info} Empty axis list\")\n if min(axis) < 0:\n raise ValueError(f\"{info} Negative axis specification not allowed.\")\n\n # Check that upscale is doable\n for ind, dim in enumerate(array_shape):\n # Check that array is not larger than target (downscaling)\n if dim > target_shape[ind]:\n raise ValueError(\n f\"{info} {ind}-th array dimension is larger than target.\"\n )\n # Check that all relevant axis are included in axis\n if dim != target_shape[ind] and ind not in axis:\n raise ValueError(\n f\"{info} {ind}-th array dimension differs from \"\n f\"target, but {ind} is not included in \"\n f\"{axis=}.\"\n )\n\n # Compute upscaling factors\n upscale_factors = {}\n for ax in axis:\n if (target_shape[ax] % array_shape[ax]) > 0 and not pad_with_zeros:\n raise ValueError(\n \"Incommensurable upscale attempt, 
\"\n f\"from {array_shape=} to {target_shape=}.\"\n )\n upscale_factors[ax] = target_shape[ax] // array_shape[ax]\n # Check that this is not downscaling\n if upscale_factors[ax] < 1:\n raise ValueError(info)\n info = f\"{info} Upscale factors: {upscale_factors}\"\n\n # Raise a warning if upscaling is non-homogeneous across all axis\n if warn_if_inhomogeneous:\n if len(set(upscale_factors.values())) > 1:\n warnings.warn(f\"{info} (inhomogeneous)\")\n\n # Upscale array, via np.repeat\n upscaled_array = array\n for ax in axis:\n upscaled_array = np.repeat(\n upscaled_array, upscale_factors[ax], axis=ax\n )\n\n # Check that final shape is correct\n if not upscaled_array.shape == target_shape:\n if pad_with_zeros:\n pad_width = []\n for ax in list(range(len(target_shape))):\n missing = target_shape[ax] - upscaled_array.shape[ax]\n if missing < 0 or (missing > 0 and ax not in axis):\n raise ValueError(\n f\"{info} \" \"Something wrong during zero-padding\"\n )\n pad_width.append([0, missing])\n upscaled_array = np.pad(\n upscaled_array,\n pad_width=pad_width,\n mode=\"constant\",\n constant_values=0,\n )\n logging.warning(f\"{info} {upscaled_array.shape=}.\")\n logging.warning(\n f\"Padding upscaled_array with zeros with {pad_width=}\"\n )\n else:\n raise ValueError(f\"{info} {upscaled_array.shape=}.\")\n\n return upscaled_array\n
Discover the acquisition index based on OME-NGFF metadata.
Given the path to a zarr image folder (e.g. /path/plate.zarr/B/03/0), extract the acquisition index from the .zattrs file of the parent folder (i.e. at the well level), or return None if acquisition is not specified.
Notes:
For non-multiplexing datasets, acquisition is not required in the metadata. If it is missing, this function returns None.
This function fails if the image does not belong to an OME-NGFF well.
PARAMETER DESCRIPTION image_zarr_path
Full path to an OME-NGFF image folder.
TYPE: Path
Source code in fractal_tasks_core/utils.py
def _find_omengff_acquisition(image_zarr_path: Path) -> Union[int, None]:\n\"\"\"\n Discover the acquisition index based on OME-NGFF metadata.\n\n Given the path to a zarr image folder (e.g. `/path/plate.zarr/B/03/0`),\n extract the acquisition index from the `.zattrs` file of the parent\n folder (i.e. at the well level), or return `None` if acquisition is not\n specified.\n\n Notes:\n\n 1. For non-multiplexing datasets, acquisition is not a required\n information in the metadata. If it is not there, this function\n returns `None`.\n 2. This function fails if we use an image that does not belong to\n an OME-NGFF well.\n\n Args:\n image_zarr_path: Full path to an OME-NGFF image folder.\n \"\"\"\n\n # Identify well path and attrs\n well_zarr_path = image_zarr_path.parent\n if not (well_zarr_path / \".zattrs\").exists():\n raise ValueError(\n f\"{str(well_zarr_path)} must be an OME-NGFF well \"\n \"folder, but it does not include a .zattrs file.\"\n )\n well_group = zarr.open_group(str(well_zarr_path))\n attrs_images = well_group.attrs[\"well\"][\"images\"]\n\n # Loook for the acquisition of the current image (if any)\n acquisition = None\n for img_dict in attrs_images:\n if (\n img_dict[\"path\"] == image_zarr_path.name\n and \"acquisition\" in img_dict.keys()\n ):\n acquisition = img_dict[\"acquisition\"]\n break\n\n return acquisition\n
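The lookup itself only needs the well-level attributes. A minimal sketch on a plain dictionary, assuming attributes shaped like the `well` entry of an OME-NGFF `.zattrs` file; `find_acquisition` and the sample values are hypothetical and skip the on-disk checks of the real function:

```python
from typing import Optional

# Hypothetical well-level attributes; the third image has no acquisition
well_attrs = {
    "well": {
        "images": [
            {"path": "0", "acquisition": 1},
            {"path": "1", "acquisition": 2},
            {"path": "2"},
        ]
    }
}


def find_acquisition(attrs: dict, image_name: str) -> Optional[int]:
    """Return the acquisition of the image with the given path, if any."""
    for img in attrs["well"]["images"]:
        if img["path"] == image_name and "acquisition" in img:
            return img["acquisition"]
    return None
```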
Compile dictionary of (table name, table path) key/value pairs.
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed.
TYPE: str
RETURNS DESCRIPTION dict[str, str]
Dictionary with table names as keys and table paths as values. If tables Zarr group is missing, or if it does not have a tables key, then return an empty dictionary.
Source code in fractal_tasks_core/utils.py
def _get_table_path_dict(zarr_url: str) -> dict[str, str]:\n\"\"\"\n Compile dictionary of (table name, table path) key/value pairs.\n\n\n Args:\n zarr_url:\n Path or url to the individual OME-Zarr image to be processed.\n\n Returns:\n Dictionary with table names as keys and table paths as values. If\n `tables` Zarr group is missing, or if it does not have a `tables`\n key, then return an empty dictionary.\n \"\"\"\n\n try:\n tables_group = zarr.open_group(f\"{zarr_url}/tables\", \"r\")\n table_list = tables_group.attrs[\"tables\"]\n except (zarr.errors.GroupNotFoundError, KeyError):\n table_list = []\n\n table_path_dict = {}\n for table in table_list:\n table_path_dict[table] = f\"{zarr_url}/tables/{table}\"\n\n return table_path_dict\n
Flexibly extract parameters from metadata dictionary
This covers parameters that are either acquisition-specific (if the image belongs to an OME-NGFF array and its acquisition is specified) or simply available in the dictionary. The two cases are handled as:
metadata[acquisition][\"some_parameter\"] # acquisition available\nmetadata[\"some_parameter\"] # acquisition not available\n
PARAMETER DESCRIPTION keys
list of required parameters.
TYPE: Sequence[str]
metadata
metadata dictionary.
TYPE: dict[str, Any]
image_zarr_path
full path to image, e.g. /path/plate.zarr/B/03/0.
TYPE: Path
Source code in fractal_tasks_core/utils.py
def get_parameters_from_metadata(\n *,\n keys: Sequence[str],\n metadata: dict[str, Any],\n image_zarr_path: Path,\n) -> dict[str, Any]:\n\"\"\"\n Flexibly extract parameters from metadata dictionary\n\n This covers both parameters which are acquisition-specific (if the image\n belongs to an OME-NGFF array and its acquisition is specified) or simply\n available in the dictionary.\n The two cases are handled as:\n ```\n metadata[acquisition][\"some_parameter\"] # acquisition available\n metadata[\"some_parameter\"] # acquisition not available\n ```\n\n Args:\n keys: list of required parameters.\n metadata: metadata dictionary.\n image_zarr_path: full path to image, e.g. `/path/plate.zarr/B/03/0`.\n \"\"\"\n\n parameters = {}\n acquisition = _find_omengff_acquisition(image_zarr_path)\n if acquisition is not None:\n parameters[\"acquisition\"] = acquisition\n\n for key in keys:\n if acquisition is None:\n parameter = metadata[key]\n else:\n try:\n parameter = metadata[key][str(acquisition)]\n except TypeError:\n parameter = metadata[key]\n except KeyError:\n parameter = metadata[key]\n parameters[key] = parameter\n return parameters\n
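A simplified sketch of the fallback logic, following the lookup order used in the source listing above (`metadata[key][str(acquisition)]` first, then `metadata[key]`). The helper name `pick_parameters` is hypothetical, the acquisition is passed in directly instead of being read from the well metadata, and an explicit type check replaces the `try`/`except` of the original:

```python
from typing import Any, Optional, Sequence


def pick_parameters(
    keys: Sequence[str],
    metadata: dict[str, Any],
    acquisition: Optional[int],
) -> dict[str, Any]:
    parameters: dict[str, Any] = {}
    if acquisition is not None:
        parameters["acquisition"] = acquisition
    for key in keys:
        value = metadata[key]
        # Use the acquisition-specific entry only when there is one
        if (
            acquisition is not None
            and isinstance(value, dict)
            and str(acquisition) in value
        ):
            value = value[str(acquisition)]
        parameters[key] = value
    return parameters


# Illustrative metadata: one global parameter, one per-acquisition parameter
metadata = {"num_levels": 5, "coarsening_xy": {"0": 2, "1": 3}}
with_acq = pick_parameters(["num_levels", "coarsening_xy"], metadata, 1)
without_acq = pick_parameters(["num_levels", "coarsening_xy"], metadata, None)
```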
Given a set of datasets (as per OME-NGFF specs), update their \"scale\" transformations in the YX directions by including a prefactor (coarsening_xy**reference_level).
PARAMETER DESCRIPTION datasets
list of datasets (as per OME-NGFF specs).
TYPE: list[dict]
coarsening_xy
linear coarsening factor between subsequent levels.
TYPE: int
reference_level
TBD
TYPE: int
remove_channel_axis
If True, remove the first item of all scale transformations.
TYPE: bool DEFAULT: False
Source code in fractal_tasks_core/utils.py
def rescale_datasets(\n *,\n datasets: list[dict],\n coarsening_xy: int,\n reference_level: int,\n remove_channel_axis: bool = False,\n) -> list[dict]:\n\"\"\"\n Given a set of datasets (as per OME-NGFF specs), update their \"scale\"\n transformations in the YX directions by including a prefactor\n (coarsening_xy**reference_level).\n\n Args:\n datasets: list of datasets (as per OME-NGFF specs).\n coarsening_xy: linear coarsening factor between subsequent levels.\n reference_level: TBD\n remove_channel_axis: If `True`, remove the first item of all `scale`\n transformations.\n \"\"\"\n\n # Construct rescaled datasets\n new_datasets = []\n for ds in datasets:\n new_ds = {}\n\n # Copy all keys that are not coordinateTransformations (e.g. path)\n for key in ds.keys():\n if key != \"coordinateTransformations\":\n new_ds[key] = ds[key]\n\n # Update coordinateTransformations\n old_transformations = ds[\"coordinateTransformations\"]\n new_transformations = []\n for t in old_transformations:\n if t[\"type\"] == \"scale\":\n new_t: dict[str, Any] = t.copy()\n # Rescale last two dimensions (that is, Y and X)\n prefactor = coarsening_xy**reference_level\n new_t[\"scale\"][-2] = new_t[\"scale\"][-2] * prefactor\n new_t[\"scale\"][-1] = new_t[\"scale\"][-1] * prefactor\n if remove_channel_axis:\n new_t[\"scale\"].pop(0)\n new_transformations.append(new_t)\n else:\n new_transformations.append(t)\n new_ds[\"coordinateTransformations\"] = new_transformations\n new_datasets.append(new_ds)\n\n return new_datasets\n
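The effect of the prefactor is easiest to see on a single dataset. A minimal sketch with made-up pixel sizes; it works on a deep copy of the input, which is a choice made here (`dict.copy()` is shallow, so nested `scale` lists would otherwise be shared with the input):

```python
import copy

# Illustrative dataset with a CZYX scale transformation
datasets = [
    {
        "path": "0",
        "coordinateTransformations": [
            {"type": "scale", "scale": [1.0, 1.0, 0.5, 0.5]}
        ],
    }
]
coarsening_xy, reference_level = 2, 2
prefactor = coarsening_xy**reference_level  # 2**2 = 4

rescaled = copy.deepcopy(datasets)  # leave the input untouched
for ds in rescaled:
    for t in ds["coordinateTransformations"]:
        if t["type"] == "scale":
            # Only the last two entries (Y and X) are rescaled
            t["scale"][-2] *= prefactor
            t["scale"][-1] *= prefactor
```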
This wrapper sets mode=\"w\" for overwrite=True and mode=\"w-\" for overwrite=False.
The expected behavior is
if the group does not exist, create it (regardless of overwrite);
if the group already exists and overwrite=True, replace the group with an empty one;
if the group already exists and overwrite=False, fail.
From the zarr.open_group docs:
mode=\"r\" means read only (must exist);
mode=\"r+\" means read/write (must exist);
mode=\"a\" means read/write (create if doesn't exist);
mode=\"w\" means create (overwrite if exists);
mode=\"w-\" means create (fail if exists).
PARAMETER DESCRIPTION path
Store or path to directory in file system or name of zip file (zarr.open_group parameter).
TYPE: Union[str, MutableMapping]
overwrite
Determines the mode parameter of zarr.open_group, which is \"w\" (if overwrite=True) or \"w-\" (if overwrite=False).
TYPE: bool
logger
The logger to use (if unset, use logging.getLogger(None))
TYPE: Optional[Logger] DEFAULT: None
open_group_kwargs
Keyword arguments of zarr.open_group.
TYPE: Any DEFAULT: {}
RETURNS DESCRIPTION Group
The zarr group.
RAISES DESCRIPTION OverwriteNotAllowedError
If overwrite=False and the group already exists.
Source code in fractal_tasks_core/zarr_utils.py
def open_zarr_group_with_overwrite(\n path: Union[str, MutableMapping],\n *,\n overwrite: bool,\n logger: Optional[logging.Logger] = None,\n **open_group_kwargs: Any,\n) -> zarr.hierarchy.Group:\n\"\"\"\n Wrap `zarr.open_group` and add `overwrite` argument.\n\n This wrapper sets `mode=\"w\"` for `overwrite=True` and `mode=\"w-\"` for\n `overwrite=False`.\n\n The expected behavior is\n\n\n * if the group does not exist, create it (independently on `overwrite`);\n * if the group already exists and `overwrite=True`, replace the group with\n an empty one;\n * if the group already exists and `overwrite=False`, fail.\n\n From the [`zarr.open_group`\n docs](https://zarr.readthedocs.io/en/stable/api/hierarchy.html#zarr.hierarchy.open_group):\n\n * `mode=\"r\"` means read only (must exist);\n * `mode=\"r+\"` means read/write (must exist);\n * `mode=\"a\"` means read/write (create if doesn\u2019t exist);\n * `mode=\"w\"` means create (overwrite if exists);\n * `mode=\"w-\"` means create (fail if exists).\n\n\n Args:\n path:\n Store or path to directory in file system or name of zip file\n (`zarr.open_group` parameter).\n overwrite:\n Determines the `mode` parameter of `zarr.open_group`, which is\n `\"w\"` (if `overwrite=True`) or `\"w-\"` (if `overwrite=False`).\n logger:\n The logger to use (if unset, use `logging.getLogger(None)`)\n open_group_kwargs:\n Keyword arguments of `zarr.open_group`.\n\n Returns:\n The zarr group.\n\n Raises:\n OverwriteNotAllowedError:\n If `overwrite=False` and the group already exists.\n \"\"\"\n\n # Set logger\n if logger is None:\n logger = logging.getLogger(None)\n\n # Set mode for zarr.open_group\n if overwrite:\n new_mode = \"w\"\n else:\n new_mode = \"w-\"\n\n # Write log about current status\n logger.info(f\"Start open_zarr_group_with_overwrite ({overwrite=}).\")\n try:\n # Call `zarr.open_group` with `mode=\"r\"`, which fails for missing group\n current_group = zarr.open_group(path, mode=\"r\")\n keys = 
list(current_group.group_keys())\n logger.info(f\"Zarr group {path} already exists, with {keys=}\")\n except GroupNotFoundError:\n logger.info(f\"Zarr group {path} does not exist yet.\")\n\n # Raise warning if we are overriding an existing value of `mode`\n if \"mode\" in open_group_kwargs.keys():\n mode = open_group_kwargs.pop(\"mode\")\n logger.warning(\n f\"Overriding {mode=} with {new_mode=}, \"\n \"in open_zarr_group_with_overwrite\"\n )\n\n # Call zarr.open_group\n try:\n return zarr.open_group(path, mode=new_mode, **open_group_kwargs)\n except ContainsGroupError:\n # Re-raise error with custom message and type\n error_msg = (\n f\"Cannot create zarr group at {path=} with `{overwrite=}` \"\n \"(original error: `zarr.errors.ContainsGroupError`).\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n
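The three expected behaviors can be mimicked with plain directories. The sketch below is only an analogy that uses the standard library instead of zarr; `make_dir_with_overwrite` is a hypothetical helper, and `FileExistsError` stands in for `OverwriteNotAllowedError`.

```python
import shutil
import tempfile
from pathlib import Path


def make_dir_with_overwrite(path: Path, *, overwrite: bool) -> Path:
    """Directory analogue of the "w" / "w-" modes (not the zarr API)."""
    if path.exists():
        if not overwrite:
            # Like mode="w-": fail if the target already exists
            raise FileExistsError(f"{path} already exists, but overwrite=False")
        # Like mode="w": replace the target with an empty one
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


base = Path(tempfile.mkdtemp())
group = make_dir_with_overwrite(base / "plate.zarr", overwrite=False)  # created
(group / "child").mkdir()
group = make_dir_with_overwrite(base / "plate.zarr", overwrite=True)   # replaced
is_empty = not any(group.iterdir())
try:
    make_dir_with_overwrite(base / "plate.zarr", overwrite=False)
    failed = False
except FileExistsError:
    failed = True
```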
Two kinds of plate_prefix values are handled in a special way:
Filenames from FMI, with successful barcode reading: 210305NAR005AAN_210416_164828 with plate name 210305NAR005AAN;
Filenames from FMI, with failed barcode reading: yymmdd_hhmmss_210416_164828 with plate name RS{yymmddhhmmss}.
For all non-matching filenames, plate name is plate_prefix.
PARAMETER DESCRIPTION plate_prefix
TBD
TYPE: str
Source code in fractal_tasks_core/cellvoyager/filenames.py
def _get_plate_name(plate_prefix: str) -> str:\n\"\"\"\n Two kinds of plate_prefix values are handled in a special way:\n\n 1. Filenames from FMI, with successful barcode reading:\n `210305NAR005AAN_210416_164828` with plate name `210305NAR005AAN`;\n 2. Filenames from FMI, with failed barcode reading:\n `yymmdd_hhmmss_210416_164828` with plate name `RS{yymmddhhmmss}`.\n\n For all non-matching filenames, plate name is `plate_prefix`.\n\n Args:\n plate_prefix: TBD\n \"\"\"\n\n fields = plate_prefix.split(\"_\")\n\n # FMI (successful barcode reading)\n if (\n len(fields) == 3\n and len(fields[1]) == 6\n and len(fields[2]) == 6\n and fields[1].isdigit()\n and fields[2].isdigit()\n ):\n barcode, img_date, img_time = fields[:]\n plate = barcode\n # FMI (failed barcode reading)\n elif (\n len(fields) == 4\n and len(fields[0]) == 6\n and len(fields[1]) == 6\n and len(fields[2]) == 6\n and len(fields[3]) == 6\n and fields[0].isdigit()\n and fields[1].isdigit()\n and fields[2].isdigit()\n and fields[3].isdigit()\n ):\n scan_date, scan_time, img_date, img_time = fields[:]\n plate = f\"RS{scan_date + scan_time}\"\n # All non-matching cases\n else:\n plate = plate_prefix\n\n return plate\n
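The same three cases can be expressed with regular expressions, which makes the expected filename shapes explicit. This is an alternative sketch, not the library implementation; `plate_name` is a hypothetical helper, and the second sample prefix is made up to fit the failed-barcode pattern.

```python
import re


def plate_name(plate_prefix: str) -> str:
    # Failed barcode reading: scan date, scan time, imaging date, imaging time
    failed = re.fullmatch(r"(\d{6})_(\d{6})_\d{6}_\d{6}", plate_prefix)
    if failed:
        return f"RS{failed.group(1)}{failed.group(2)}"
    # Successful barcode reading: barcode, imaging date, imaging time
    ok = re.fullmatch(r"([^_]*)_\d{6}_\d{6}", plate_prefix)
    if ok:
        return ok.group(1)
    # All non-matching cases
    return plate_prefix


names = [
    plate_name("210305NAR005AAN_210416_164828"),
    plate_name("210416_164828_210417_101010"),
    plate_name("my_plate"),
]
```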
List all the items (files and folders) in a given folder that simultaneously match a series of glob patterns.
PARAMETER DESCRIPTION folder
Base folder where items will be searched.
TYPE: str
patterns
If specified, the list of patterns (defined as in https://docs.python.org/3/library/fnmatch.html) that item names will match with.
TYPE: Sequence[str] DEFAULT: None
Source code in fractal_tasks_core/cellvoyager/filenames.py
def glob_with_multiple_patterns(\n *,\n folder: str,\n patterns: Sequence[str] = None,\n) -> set[str]:\n\"\"\"\n List all the items (files and folders) in a given folder that\n simultaneously match a series of glob patterns.\n\n Args:\n folder: Base folder where items will be searched.\n patterns: If specified, the list of patterns (defined as in\n https://docs.python.org/3/library/fnmatch.html) that item\n names will match with.\n \"\"\"\n\n # Sanitize base-folder path\n if folder.endswith(\"/\"):\n actual_folder = folder[:-1]\n else:\n actual_folder = folder[:]\n\n # If not pattern is specified, look for *all* items in the base folder\n if not patterns:\n patterns = [\"*\"]\n\n # Combine multiple glob searches (via set intersection)\n logging.info(f\"[glob_with_multiple_patterns] {patterns=}\")\n items = None\n for pattern in patterns:\n new_matches = glob(f\"{actual_folder}/{pattern}\")\n if items is None:\n items = set(new_matches)\n else:\n items = items.intersection(new_matches)\n items = items or set()\n logging.info(f\"[glob_with_multiple_patterns] Found {len(items)} items\")\n\n return items\n
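As a rough illustration of the set-intersection approach, the sketch below runs one glob per pattern in a temporary folder and keeps only the items matched by every pattern. The helper `items_matching_all` and the file names are made up for this example.

```python
import tempfile
from glob import glob
from pathlib import Path


def items_matching_all(folder: str, patterns: list[str]) -> set[str]:
    """Hypothetical helper: intersect the results of one glob per pattern."""
    # Drop a single trailing slash, as the function above does
    base = folder[:-1] if folder.endswith("/") else folder
    items = None
    # With no patterns, fall back to matching everything
    for pattern in patterns or ["*"]:
        matches = set(glob(f"{base}/{pattern}"))
        items = matches if items is None else items & matches
    return items or set()


with tempfile.TemporaryDirectory() as tmp:
    for name in [
        "plate_A01_C01.png",
        "plate_A01_C02.png",
        "plate_B02_C01.png",
        "notes.txt",
    ]:
        (Path(tmp) / name).touch()
    found = items_matching_all(tmp, ["*_A01_*", "*C01.png"])
    found_names = sorted(Path(item).name for item in found)
    all_items_count = len(items_matching_all(tmp, []))
```

Only `plate_A01_C01.png` satisfies both patterns, so it is the single surviving item; with an empty pattern list all four items are returned.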
Handles the conversion of Cellvoyager XML metadata into well identifiers. Returns well identifiers like A01, B02 etc. for 96 & 384 well plates. Returns well identifiers like A01.a1, A01.b2 etc. for 1536 well plates. Defaults to the processing used for 96 & 384 well plates, unless the plate_type is 1536. For 1536 well plates, the first 4x4 wells go into A01.a1 - A01.d4 and so on.
PARAMETER DESCRIPTION row_series
Series with index being the index of the image and the value the row position (starting at 1 for top left).
TYPE: Series
col_series
Series with index being the index of the image and the value the col position (starting at 1 for top left).
TYPE: Series
plate_type
Number of wells in the plate layout. Used to determine whether it's a 1536 well plate or a different layout.
TYPE: int
RETURNS DESCRIPTION list[str]
list of well_ids
Source code in fractal_tasks_core/cellvoyager/metadata.py
def _create_well_ids(\n row_series: pd.Series,\n col_series: pd.Series,\n plate_type: int,\n) -> list[str]:\n\"\"\"\n Create well_id list from XML metadata\n\n Handles the conversion of Cellvoyager XML metadata into well indentifiers.\n Returns well identifiers like A01, B02 etc. for 96 & 384 well plates.\n Returns well identifiers like A01.a1, A01.b2 etc. for 1536 well plates.\n Defaults to the processing used for 96 & 384 well plates, unless the\n plate_type is 1536. For 1536 well plates, the first 4x4 wells go into\n A01.a1 - A01.d4 and so on.\n\n Args:\n row_series: Series with index being the index of the image and the\n value the row position (starting at 1 for top left).\n col_series: Series with index being the index of the image and the\n value the col position (starting at 1 for top left).\n plate_type: Number of wells in the plate layout. Used to determine\n whether it's a 1536 well plate or a different layout.\n\n Returns:\n list of well_ids\n\n \"\"\"\n if plate_type == 1536:\n # Row are built of a base letter (matching to the 96 well plate layout)\n # and a sub letter (position of the 1536 well within the 4x4 grid,\n # can be a-d) of that well\n row_base = [chr(math.floor((x - 1) / 4) + 65) for x in (row_series)]\n row_sub = [chr((x - 1) % 4 + 97) for x in (row_series)]\n # Columns are built of a base number (matching to the 96 well plate\n # layout) and a sub integer (position of the 1536 well within the\n # 4x4 grid, can be 1-4) of that well\n col_base = [math.floor((x - 1) / 4) + 1 for x in col_series]\n col_sub = [(x - 1) % 4 + 1 for x in col_series]\n well_ids = []\n for i in range(len(row_base)):\n well_ids.append(\n f\"{row_base[i]}{col_base[i]:02}.{row_sub[i]}{col_sub[i]}\"\n )\n else:\n row_str = [chr(x) for x in (row_series + 64)]\n well_ids = [f\"{a}{b:02}\" for a, b in zip(row_str, col_series)]\n\n return well_ids\n
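The index arithmetic for 1536 well plates is easier to follow on single values. The following is a scalar sketch of the same computation, without pandas; the function name `well_id` is hypothetical.

```python
def well_id(row: int, col: int, plate_type: int) -> str:
    """Hypothetical scalar version of the arithmetic in `_create_well_ids`."""
    if plate_type == 1536:
        # Base letter/number: which 96-well-like position the 4x4 block maps to
        row_base = chr((row - 1) // 4 + 65)  # 65 is ord("A")
        col_base = (col - 1) // 4 + 1
        # Sub letter/number: position inside the 4x4 block (a-d, 1-4)
        row_sub = chr((row - 1) % 4 + 97)  # 97 is ord("a")
        col_sub = (col - 1) % 4 + 1
        return f"{row_base}{col_base:02}.{row_sub}{col_sub}"
    # 96 & 384 well plates: row letter followed by zero-padded column
    return f"{chr(row + 64)}{col:02}"


first_block_corner = well_id(1, 1, 1536)
first_block_end = well_id(4, 4, 1536)
second_block_start = well_id(5, 5, 1536)
last_well = well_id(32, 48, 1536)
classic_well = well_id(2, 3, 96)
```

Rows 1-4 and columns 1-4 all land in the `A01` block, while row 5, column 5 starts block `B02`.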
Source code in fractal_tasks_core/cellvoyager/metadata.py
def calculate_steps(site_series: pd.Series):\n\"\"\"\n TBD\n\n Args:\n site_series: TBD\n \"\"\"\n\n # site_series is the z_micrometer series for a given site of a given\n # channel. This function calculates the step size in Z\n\n # First diff is always NaN because there is nothing to compare it to\n steps = site_series.diff().dropna().astype(float)\n if not np.allclose(steps.iloc[0], np.array(steps)):\n raise NotImplementedError(\n \"When parsing the Yokogawa mlf file, some sites \"\n \"had varying step size in Z. \"\n \"That is not supported for the OME-Zarr parsing\"\n )\n return steps.mean()\n
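The check performed here can be sketched without pandas or NumPy. This is a simplified illustration on plain lists: `z_step_size` is a hypothetical name, `math.isclose` only approximates the tolerance rule of `np.allclose`, and the guard for fewer than two planes is an addition of this sketch.

```python
import math


def z_step_size(z_positions: list[float]) -> float:
    """Hypothetical list-based analogue of `calculate_steps`."""
    if len(z_positions) < 2:
        # Added guard: a single plane has no step to compute
        raise ValueError("At least two Z positions are needed")
    # Differences between consecutive Z positions
    steps = [b - a for a, b in zip(z_positions, z_positions[1:])]
    # All steps must agree with the first one, up to a small tolerance
    if not all(
        math.isclose(step, steps[0], rel_tol=1e-5, abs_tol=1e-8)
        for step in steps
    ):
        raise NotImplementedError("Varying step size in Z is not supported")
    return sum(steps) / len(steps)


regular_step = z_step_size([0.0, 1.5, 3.0, 4.5])
```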
Source code in fractal_tasks_core/cellvoyager/metadata.py
def get_earliest_time_per_site(mlf_frame: pd.DataFrame) -> pd.DataFrame:\n\"\"\"\n TBD\n\n Args:\n mlf_frame: TBD\n \"\"\"\n\n # Get the time information per site\n # Because a site will contain time information for each plane\n # of each channel, we just return the earliest time information\n # per site.\n return pd.to_datetime(\n mlf_frame.groupby([\"well_id\", \"FieldIndex\"]).min()[\"Time\"], utc=True\n )\n
Source code in fractal_tasks_core/cellvoyager/metadata.py
def get_z_steps(mlf_frame: pd.DataFrame) -> pd.DataFrame:\n\"\"\"\n TBD\n\n Args:\n mlf_frame: TBD\n \"\"\"\n\n # Process mlf_frame to extract Z information (pixel size & steps).\n # Run checks on consistencies & return site-based z step dataframe\n # Group by well, field & channel\n grouped_sites_z = (\n mlf_frame.loc[\n :,\n [\"well_id\", \"FieldIndex\", \"ActionIndex\", \"Ch\", \"Z\"],\n ]\n .set_index([\"well_id\", \"FieldIndex\", \"ActionIndex\", \"Ch\"])\n .groupby(level=[0, 1, 2, 3])\n )\n\n # If there is only 1 Z step, set the Z spacing to the count of planes => 1\n if grouped_sites_z.count()[\"Z\"].max() == 1:\n z_data = grouped_sites_z.count().groupby([\"well_id\", \"FieldIndex\"])\n else:\n # Group the whole site (combine channels), because Z steps need to be\n # consistent between channels for OME-Zarr.\n z_data = grouped_sites_z.apply(calculate_steps).groupby(\n [\"well_id\", \"FieldIndex\"]\n )\n\n check_group_consistency(\n z_data, message=\"Comparing Z steps between channels\"\n )\n\n # Ensure that channels have the same number of z planes and\n # reduce it to one value.\n # Only check if there is more than one channel available\n if any(\n grouped_sites_z.count().groupby([\"well_id\", \"FieldIndex\"]).count() > 1\n ):\n check_group_consistency(\n grouped_sites_z.count().groupby([\"well_id\", \"FieldIndex\"]),\n message=\"Checking number of Z steps between channels\",\n )\n\n z_steps = (\n grouped_sites_z.count()\n .groupby([\"well_id\", \"FieldIndex\"])\n .mean()\n .astype(int)\n )\n\n # Combine the two dataframes\n z_frame = pd.concat([z_data.mean(), z_steps], axis=1)\n z_frame.columns = [\"pixel_size_z\", \"z_pixel\"]\n return z_frame\n
Parse Yokogawa CV7000 metadata files and prepare site-level metadata.
PARAMETER DESCRIPTION mrf_path
Full path to MeasurementDetail.mrf metadata file.
TYPE: Union[str, Path]
mlf_path
Full path to MeasurementData.mlf metadata file.
TYPE: Union[str, Path]
filename_patterns
List of patterns to filter the image filenames in the mlf metadata table. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html
TYPE: Optional[list[str]] DEFAULT: None
Source code in fractal_tasks_core/cellvoyager/metadata.py
def parse_yokogawa_metadata(\n mrf_path: Union[str, Path],\n mlf_path: Union[str, Path],\n *,\n filename_patterns: Optional[list[str]] = None,\n) -> tuple[pd.DataFrame, dict[str, int]]:\n\"\"\"\n Parse Yokogawa CV7000 metadata files and prepare site-level metadata.\n\n Args:\n mrf_path: Full path to MeasurementDetail.mrf metadata file.\n mlf_path: Full path to MeasurementData.mlf metadata file.\n filename_patterns:\n List of patterns to filter the image filenames in the mlf metadata\n table. Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html\n \"\"\"\n\n # Convert paths to strings\n mrf_str = Path(mrf_path).as_posix()\n mlf_str = Path(mlf_path).as_posix()\n\n mrf_frame, mlf_frame, error_count = read_metadata_files(\n mrf_str, mlf_str, filename_patterns\n )\n\n # Aggregate information from the mlf file\n per_site_parameters = [\"X\", \"Y\"]\n\n grouping_params = [\"well_id\", \"FieldIndex\"]\n grouped_sites = mlf_frame.loc[\n :, grouping_params + per_site_parameters\n ].groupby(by=grouping_params)\n\n check_group_consistency(grouped_sites, message=\"X & Y stage positions\")\n site_metadata = grouped_sites.mean()\n site_metadata.columns = [\"x_micrometer\", \"y_micrometer\"]\n site_metadata[\"z_micrometer\"] = 0\n\n site_metadata = pd.concat(\n [\n site_metadata,\n get_z_steps(mlf_frame),\n get_earliest_time_per_site(mlf_frame),\n ],\n axis=1,\n )\n\n # Aggregate information from the mrf file\n mrf_columns = [\n \"horiz_pixel_dim\",\n \"vert_pixel_dim\",\n \"horiz_pixels\",\n \"vert_pixels\",\n \"bit_depth\",\n ]\n check_group_consistency(\n mrf_frame.loc[:, mrf_columns], message=\"Image dimensions\"\n )\n site_metadata[\"pixel_size_x\"] = mrf_frame.loc[:, \"horiz_pixel_dim\"].max()\n site_metadata[\"pixel_size_y\"] = mrf_frame.loc[:, \"vert_pixel_dim\"].max()\n site_metadata[\"x_pixel\"] = int(mrf_frame.loc[:, \"horiz_pixels\"].max())\n site_metadata[\"y_pixel\"] = int(mrf_frame.loc[:, \"vert_pixels\"].max())\n 
site_metadata[\"bit_depth\"] = int(mrf_frame.loc[:, \"bit_depth\"].max())\n\n if error_count > 0:\n logger.info(\n f\"There were {error_count} ERR entries in the metadatafile. \"\n f\"Still succesfully parsed {len(site_metadata)} sites. \"\n )\n\n # Compute expected number of image files for each well\n list_of_wells = set(site_metadata.index.get_level_values(\"well_id\"))\n number_of_files = {}\n for this_well_id in list_of_wells:\n num_images = (mlf_frame.well_id == this_well_id).sum()\n logger.info(\n f\"Expected number of images for well {this_well_id}: {num_images}\"\n )\n number_of_files[this_well_id] = num_images\n # Check that the sum of per-well file numbers correspond to the total\n # file number\n if not sum(number_of_files.values()) == len(mlf_frame):\n raise ValueError(\n \"Error while counting the number of image files per well.\\n\"\n f\"{len(mlf_frame)=}\\n\"\n f\"{number_of_files=}\"\n )\n\n return site_metadata, number_of_files\n
List of patterns to filter the image filenames in the mlf metadata table. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html.
TYPE: Optional[list[str]] DEFAULT: None
Returns:
Source code in fractal_tasks_core/cellvoyager/metadata.py
def read_metadata_files(\n mrf_path: str,\n mlf_path: str,\n filename_patterns: Optional[list[str]] = None,\n) -> tuple[pd.DataFrame, pd.DataFrame, int]:\n\"\"\"\n Create tables for mrf & mlf Yokogawa metadata.\n\n Args:\n mrf_path: Full path to MeasurementDetail.mrf metadata file.\n mlf_path: Full path to MeasurementData.mlf metadata file.\n filename_patterns: List of patterns to filter the image filenames in\n the mlf metadata table. Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html.\n\n Returns:\n\n \"\"\"\n\n # parsing of mrf & mlf files are based on the\n # yokogawa_image_collection_task v0.5 in drogon, written by Dario Vischi.\n # https://github.com/fmi-basel/job-system-workflows/blob/00bbf34448972d27f258a2c28245dd96180e8229/src/gliberal_workflows/tasks/yokogawa_image_collection_task/versions/version_0_5.py # noqa\n # Now modified for Fractal use\n\n mrf_frame, plate_type = read_mrf_file(mrf_path)\n\n # filter_position & filter_wheel_position are parsed, but not\n # processed further. Figure out how to save them as relevant metadata for\n # use e.g. during illumination correction\n\n mlf_frame, error_count = read_mlf_file(\n mlf_path, plate_type, filename_patterns\n )\n # Time points are parsed as part of the mlf_frame, but currently not\n # processed further. Once we tackle time-resolved data, parse from here.\n\n return mrf_frame, mlf_frame, error_count\n
Process the mlf metadata file of a Cellvoyager CV7K/CV8K.
PARAMETER DESCRIPTION mlf_path
Full path to MeasurementData.mlf metadata file.
TYPE: str
plate_type
Plate layout, integer for the number of potential wells.
TYPE: int
filename_patterns
List of patterns to filter the image filenames in the mlf metadata table. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html.
TYPE: Optional[list[str]] DEFAULT: None
RETURNS DESCRIPTION mlf_frame
pd.DataFrame with relevant metadata per image
TYPE: DataFrame
error_count
Count of errors found during metadata processing
TYPE: int
Source code in fractal_tasks_core/cellvoyager/metadata.py
def read_mlf_file(\n mlf_path: str,\n plate_type: int,\n filename_patterns: Optional[list[str]] = None,\n) -> tuple[pd.DataFrame, int]:\n\"\"\"\n Process the mlf metadata file of a Cellvoyager CV7K/CV8K.\n\n Args:\n mlf_path: Full path to MeasurementData.mlf metadata file.\n plate_type: Plate layout, integer for the number of potential wells.\n filename_patterns: List of patterns to filter the image filenames in\n the mlf metadata table. Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html.\n\n Returns:\n mlf_frame: pd.DataFrame with relevant metadata per image\n error_count: Count of errors found during metadata processing\n \"\"\"\n\n # Load the whole MeasurementData.mlf file\n mlf_frame_raw = pd.read_xml(mlf_path)\n\n # Remove all rows that do not match the given patterns\n logger.info(\n f\"Read {mlf_path}, and apply following patterns to \"\n f\"image filenames: {filename_patterns}\"\n )\n if filename_patterns:\n filenames = mlf_frame_raw.MeasurementRecord\n keep_row = None\n for pattern in filename_patterns:\n actual_pattern = fnmatch.translate(pattern)\n new_matches = filenames.str.fullmatch(actual_pattern)\n if new_matches.sum() == 0:\n raise ValueError(\n f\"In {mlf_path} there is no image filename \"\n f'matching \"{actual_pattern}\".'\n )\n if keep_row is None:\n keep_row = new_matches.copy()\n else:\n keep_row = keep_row & new_matches\n if keep_row.sum() == 0:\n raise ValueError(\n f\"In {mlf_path} there is no image filename \"\n f\"matching {filename_patterns}.\"\n )\n mlf_frame_matching = mlf_frame_raw[keep_row.values].copy()\n else:\n mlf_frame_matching = mlf_frame_raw.copy()\n\n # Create a well ID column\n # Row & column are provided as int from XML metadata\n mlf_frame_matching[\"well_id\"] = _create_well_ids(\n mlf_frame_matching[\"Row\"], mlf_frame_matching[\"Column\"], plate_type\n )\n\n # Flip Y axis to align to image coordinate system\n mlf_frame_matching[\"Y\"] = -mlf_frame_matching[\"Y\"]\n\n # Compute number or 
errors\n error_count = (mlf_frame_matching[\"Type\"] == \"ERR\").sum()\n\n # We're only interested in the image metadata\n mlf_frame = mlf_frame_matching[mlf_frame_matching[\"Type\"] == \"IMG\"]\n\n return mlf_frame, error_count\n
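The filename filtering step can be tried out in isolation. The sketch below applies the same idea (each pattern is translated with `fnmatch.translate`, must match at least one filename, and the matches are then combined with a logical AND) to a plain list instead of a DataFrame column. The helper `filter_filenames` and the example filenames are hypothetical.

```python
import fnmatch
import re


def filter_filenames(filenames: list[str], patterns: list[str]) -> list[str]:
    """Hypothetical list-based analogue of the pattern filter."""
    kept = list(filenames)
    for pattern in patterns:
        regex = re.compile(fnmatch.translate(pattern))
        # Every single pattern must match at least one filename
        if not any(regex.fullmatch(name) for name in filenames):
            raise ValueError(f'No image filename matching "{pattern}".')
        # Keep only filenames that also match this pattern
        kept = [name for name in kept if regex.fullmatch(name)]
    if patterns and not kept:
        raise ValueError(f"No image filename matching {patterns}.")
    return kept


filenames = [
    "p_A01_T0001F001L01A01Z01C01.tif",
    "p_A01_T0001F001L01A01Z01C02.tif",
    "p_B02_T0001F001L01A01Z01C01.tif",
]
selected = filter_filenames(filenames, ["*_A01_*", "*C01.tif"])
```

Two patterns that each match some file but never the same file (for instance `*_A01_*` and `*_B02_*`) lead to an error, since the combined selection is empty.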
This function handles different patterns of well names: classical wells formatted like B03 (row B, column 03), typically found in 96 & 384 well plates from the Cellvoyager microscopes, and wells of 1536 well plates formatted like A01.a1 (row Aa, column 011).
PARAMETER DESCRIPTION well_id
Well name. Either formatted like A03 (for 96 well and 384 well plates), or formatted like A01.a1 (for 1536 well plates).
TYPE: str
Returns: Tuple of row and column names.
Source code in fractal_tasks_core/cellvoyager/wells.py
def _extract_row_col_from_well_id(well_id: str) -> tuple[str, str]:\n\"\"\"\n Split well name into row & column\n\n This function handles different patterns of well names: Classical wells in\n their format like B03 (row B, column 03) typically found in 96 & 384 well\n plates from the cellvoyager microscopes. And 1536 well plates with wells\n like A01.a1 (row Aa, column 011).\n\n Args:\n well_id: Well name. Either formatted like `A03` (for 96 well and 384\n well plates), or formatted like `A01.a1 (for 1536 well plates).\n Returns:\n Tuple of row and column names.\n \"\"\"\n if len(well_id) == 3 and well_id.count(\".\") == 0:\n return (well_id[0], well_id[1:3])\n elif len(well_id) == 6 and well_id.count(\".\") == 1:\n core, suffix = well_id.split(\".\")\n row = f\"{core[0]}{suffix[0]}\"\n col = f\"{core[1:]}{suffix[1]}\"\n return (row, col)\n else:\n raise NotImplementedError(\n f\"Processing wells like {well_id} has not been implemented. \"\n \"This converter only handles wells like B03 or B03.a1\"\n )\n
Given a list of well names, construct a sorted row&column list
This function applies _extract_row_col_from_well_id to each wells element and then sorts the result.
PARAMETER DESCRIPTION wells
List of well names, either formatted like [A03, B01, C03] for 96 well and 384 well plates, or formatted like [A01.a1, A03.b2, B04.c4] for 1536 well plates.
TYPE: list[str]
Returns: well_rows_columns: List of tuples of row & col names
Source code in fractal_tasks_core/cellvoyager/wells.py
def generate_row_col_split(wells: list[str]) -> list[tuple[str, str]]:\n\"\"\"\n Given a list of well names, construct a sorted row&column list\n\n This function applies `_extract_row_col_from_well_id` to each `wells`\n element and then sorts the result.\n\n Args:\n wells: list of well names. Either formatted like [A03, B01, C03] for\n 96 well and 384 well plates. Or formatted like [A01.a1, A03.b2,\n B04.c4] for 1536 well plates.\n Returns:\n well_rows_columns: List of tuples of row & col names\n \"\"\"\n well_rows_columns = [_extract_row_col_from_well_id(well) for well in wells]\n return sorted(well_rows_columns)\n
Generates the well_id as extracted from the filename from row & col.
Processes the well identifiers generated by generate_row_col_split for cellvoyager datasets.
PARAMETER DESCRIPTION row
Name of the row. Typically a single letter (A, B, C) for 96 & 384 well plates, and two letters (Aa, Bb, Cc) for 1536 well plates.
TYPE: str
col
Name of the column. Typically 2 digits (01, 02, 03) for 96 & 384 well plates, and 3 digits (011, 012, 021) for 1536 well plates.
TYPE: str
Returns: well_id: name of the well as it would appear in the original image file name.
Source code in fractal_tasks_core/cellvoyager/wells.py
def get_filename_well_id(row: str, col: str) -> str:\n\"\"\"\n Generates the well_id as extracted from the filename from row & col.\n\n Processes the well identifiers generated by `generate_row_col_split` for\n cellvoyager datasets.\n\n Args:\n row: name of the row. Typically a single letter (A, B, C) for 96 & 384\n well plates. And two letters (Aa, Bb, Cc) for 1536 well plates.\n col: name of the column. Typically 2 digits (01, 02, 03) for 96 & 384\n well plates. And 3 digits (011, 012, 021) for 1536 well plates.\n Returns:\n well_id: name of the well as it would appear in the original image\n file name.\n \"\"\"\n if len(row) == 1 and len(col) == 2:\n return row + col\n elif len(row) == 2 and len(col) == 3:\n return f\"{row[0]}{col[:2]}.{row[1]}{col[2]}\"\n else:\n raise NotImplementedError(\n f\"Processing wells with {row=} & {col=} has not been implemented. \"\n \"This converter only handles wells like B03 or B03.a1\"\n )\n
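The row/column split and the reverse mapping back to a filename-style well ID form a round trip. Here is a standalone sketch of both directions; `split_well_id` and `join_well_id` are hypothetical names that mirror the logic of the functions in this module.

```python
def split_well_id(well_id: str) -> tuple[str, str]:
    """Hypothetical mirror of `_extract_row_col_from_well_id`."""
    if len(well_id) == 3 and "." not in well_id:
        return well_id[0], well_id[1:3]
    if len(well_id) == 6 and well_id.count(".") == 1:
        core, suffix = well_id.split(".")
        # Row: base letter + sub letter; column: base digits + sub digit
        return f"{core[0]}{suffix[0]}", f"{core[1:]}{suffix[1]}"
    raise NotImplementedError(f"Unsupported well name {well_id}")


def join_well_id(row: str, col: str) -> str:
    """Hypothetical mirror of `get_filename_well_id`."""
    if len(row) == 1 and len(col) == 2:
        return row + col
    if len(row) == 2 and len(col) == 3:
        return f"{row[0]}{col[:2]}.{row[1]}{col[2]}"
    raise NotImplementedError(f"Unsupported {row=} & {col=}")


row, col = split_well_id("A01.a1")
round_trip = join_well_id(row, col)
sorted_wells = sorted(split_well_id(w) for w in ["B03", "A12", "A02"])
```

For `A01.a1` the split gives row `Aa` and column `011`, and joining them again restores the original name.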
This function creates the package manifest based on a task_list.py Python module located in the dev subfolder of the package, see an example of such list at ...
The manifest is then written to __FRACTAL_MANIFEST__.json, in the main package directory.
Note: a valid example of custom_pydantic_models would be [("my_task_package", "some_module.py", "SomeModel")].
Source code in fractal_tasks_core/dev/create_manifest.py
def create_manifest(\n package: str = \"fractal_tasks_core\",\n manifest_version: str = \"2\",\n has_args_schemas: bool = True,\n args_schema_version: str = \"pydantic_v1\",\n docs_link: Optional[str] = None,\n custom_pydantic_models: Optional[list[tuple[str, str, str]]] = None,\n):\n\"\"\"\n This function creates the package manifest based on a `task_list.py`\n Python module located in the `dev` subfolder of the package, see an\n example of such list at ...\n\n The manifest is then written to `__FRACTAL_MANIFEST__.json`, in the\n main `package` directory.\n\n Note: a valid example of `custom_pydantic_models` would be\n ```\n [\n (\"my_task_package\", \"some_module.py\", \"SomeModel\"),\n ]\n ```\n\n Arguments:\n package: The name of the package (must be importable).\n manifest_version: Only `\"2\"` is supported.\n has_args_schemas:\n Whether to autogenerate JSON Schemas for task arguments.\n args_schema_version:\n Only `\"pydantic_v1\"` is currently supported in `fractal-server`\n and `fractal-web`.\n custom_pydantic_models:\n Custom models to be included when building JSON Schemas for task\n arguments.\n \"\"\"\n\n # Preliminary check\n if manifest_version != \"2\":\n raise NotImplementedError(f\"{manifest_version=} is not supported\")\n\n logging.info(\"Start generating a new manifest\")\n\n # Prepare an empty manifest\n manifest = dict(\n manifest_version=manifest_version,\n task_list=[],\n has_args_schemas=has_args_schemas,\n )\n if has_args_schemas:\n manifest[\"args_schema_version\"] = args_schema_version\n\n # Prepare a default value of docs_link\n if package == \"fractal_tasks_core\" and docs_link is None:\n docs_link = (\n \"https://fractal-analytics-platform.github.io/fractal-tasks-core\"\n )\n\n # Import the task list from `dev/task_list.py`\n task_list_module = import_module(f\"{package}.dev.task_list\")\n TASK_LIST = getattr(task_list_module, \"TASK_LIST\")\n\n # Loop over TASK_LIST, and append the proper task dictionary\n # to 
manifest[\"task_list\"]\n for task_obj in TASK_LIST:\n # Convert Pydantic object to dictionary\n task_dict = task_obj.dict(\n exclude={\"meta_init\", \"executable_init\", \"meta\", \"executable\"},\n exclude_unset=True,\n )\n\n # Copy some properties from `task_obj` to `task_dict`\n if task_obj.executable_non_parallel is not None:\n task_dict[\n \"executable_non_parallel\"\n ] = task_obj.executable_non_parallel\n if task_obj.executable_parallel is not None:\n task_dict[\"executable_parallel\"] = task_obj.executable_parallel\n if task_obj.meta_non_parallel is not None:\n task_dict[\"meta_non_parallel\"] = task_obj.meta_non_parallel\n if task_obj.meta_parallel is not None:\n task_dict[\"meta_parallel\"] = task_obj.meta_parallel\n\n # Autogenerate JSON Schemas for non-parallel/parallel task arguments\n if has_args_schemas:\n for kind in [\"non_parallel\", \"parallel\"]:\n executable = task_dict.get(f\"executable_{kind}\")\n if executable is not None:\n logging.info(f\"[{executable}] START\")\n schema = create_schema_for_single_task(\n executable,\n package=package,\n custom_pydantic_models=custom_pydantic_models,\n )\n logging.info(f\"[{executable}] END (new schema)\")\n task_dict[f\"args_schema_{kind}\"] = schema\n\n # Update docs_info, based on task-function description\n docs_info = create_docs_info(\n executable_non_parallel=task_obj.executable_non_parallel,\n executable_parallel=task_obj.executable_parallel,\n package=package,\n )\n if docs_info is not None:\n task_dict[\"docs_info\"] = docs_info\n if docs_link is not None:\n task_dict[\"docs_link\"] = docs_link\n\n manifest[\"task_list\"].append(task_dict)\n print()\n\n # Write manifest\n imported_package = import_module(package)\n manifest_path = (\n Path(imported_package.__file__).parent / \"__FRACTAL_MANIFEST__.json\"\n )\n with manifest_path.open(\"w\") as f:\n json.dump(manifest, f, indent=2)\n f.write(\"\\n\")\n logging.info(f\"Manifest stored in {manifest_path.as_posix()}\")\n
Pydantic v1 automatically includes args and kwargs properties in JSON Schemas generated via ValidatedFunction(task_function, config=None).model.schema(), with some default (empty) values -- see https://github.com/pydantic/pydantic/blob/1.10.X-fixes/pydantic/decorator.py.
Verify that these properties match with their expected default values, and then remove them from the schema.
PARAMETER DESCRIPTION old_schema
TBD
TYPE: _Schema
Source code in fractal_tasks_core/dev/lib_args_schemas.py
def _remove_args_kwargs_properties(old_schema: _Schema) -> _Schema:\n\"\"\"\n Remove `args` and `kwargs` schema properties.\n\n Pydantic v1 automatically includes `args` and `kwargs` properties in\n JSON Schemas generated via `ValidatedFunction(task_function,\n config=None).model.schema()`, with some default (empty) values -- see see\n https://github.com/pydantic/pydantic/blob/1.10.X-fixes/pydantic/decorator.py.\n\n Verify that these properties match with their expected default values, and\n then remove them from the schema.\n\n Args:\n old_schema: TBD\n \"\"\"\n new_schema = old_schema.copy()\n args_property = new_schema[\"properties\"].pop(\"args\")\n kwargs_property = new_schema[\"properties\"].pop(\"kwargs\")\n expected_args_property = {\"title\": \"Args\", \"type\": \"array\", \"items\": {}}\n expected_kwargs_property = {\"title\": \"Kwargs\", \"type\": \"object\"}\n if args_property != expected_args_property:\n raise ValueError(\n f\"{args_property=}\\ndiffers from\\n{expected_args_property=}\"\n )\n if kwargs_property != expected_kwargs_property:\n raise ValueError(\n f\"{kwargs_property=}\\ndiffers from\\n\"\n f\"{expected_kwargs_property=}\"\n )\n logging.info(\"[_remove_args_kwargs_properties] END\")\n return new_schema\n
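To see the effect on a concrete input, the sketch below applies the same verify-then-remove step to a toy schema. It assumes that `_Schema` is a plain dictionary; the helper name `drop_args_kwargs` and the `zarr_url` property are made up for illustration. One deliberate difference: the sketch uses `copy.deepcopy`, whereas the function above uses the shallow `dict.copy()`, so there the nested `properties` dictionary is shared with the input.

```python
import copy

EXPECTED_ARGS = {"title": "Args", "type": "array", "items": {}}
EXPECTED_KWARGS = {"title": "Kwargs", "type": "object"}


def drop_args_kwargs(schema: dict) -> dict:
    """Hypothetical variant that leaves the input schema untouched."""
    new_schema = copy.deepcopy(schema)
    args_property = new_schema["properties"].pop("args")
    kwargs_property = new_schema["properties"].pop("kwargs")
    # Only remove the properties if they carry the expected default values
    if args_property != EXPECTED_ARGS:
        raise ValueError(f"{args_property=} differs from {EXPECTED_ARGS=}")
    if kwargs_property != EXPECTED_KWARGS:
        raise ValueError(f"{kwargs_property=} differs from {EXPECTED_KWARGS=}")
    return new_schema


toy_schema = {
    "title": "ToyTask",
    "type": "object",
    "properties": {
        "zarr_url": {"title": "Zarr Url", "type": "string"},
        "args": {"title": "Args", "type": "array", "items": {}},
        "kwargs": {"title": "Kwargs", "type": "object"},
    },
}
cleaned = drop_args_kwargs(toy_schema)
```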
Keeps only the description part of the docstrings, e.g. from
'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\n'\n'\\n'\n'Attributes:\\n'\n'min: Do not change. It will be set to `0` by default.\\n'\n'max: Do not change. It will be set according to bitdepth of the images\\n'\n' by default (e.g. 65535 for 16 bit images).\\n'\n'start: Lower-bound rescaling value for visualization.\\n'\n'end: Upper-bound rescaling value for visualization.'\n
to 'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\n'.
PARAMETER DESCRIPTION old_schema
TBD
TYPE: _Schema
Source code in fractal_tasks_core/dev/lib_args_schemas.py
def _remove_attributes_from_descriptions(old_schema: _Schema) -> _Schema:\n\"\"\"\n Keeps only the description part of the docstrings: e.g from\n ```\n 'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\\\n'\n '\\\\n'\n 'Attributes:\\\\n'\n 'min: Do not change. It will be set to `0` by default.\\\\n'\n 'max: Do not change. It will be set according to bitdepth of the images\\\\n'\n ' by default (e.g. 65535 for 16 bit images).\\\\n'\n 'start: Lower-bound rescaling value for visualization.\\\\n'\n 'end: Upper-bound rescaling value for visualization.'\n ```\n to `'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\\\n'`.\n\n Args:\n old_schema: TBD\n \"\"\"\n new_schema = old_schema.copy()\n if \"definitions\" in new_schema:\n for name, definition in new_schema[\"definitions\"].items():\n parsed_docstring = docparse(definition[\"description\"])\n new_schema[\"definitions\"][name][\n \"description\"\n ] = parsed_docstring.short_description\n logging.info(\"[_remove_attributes_from_descriptions] END\")\n return new_schema\n
Main function to create a JSON Schema of task arguments
This function can be used in two ways:
1. task_function argument is None, package is set, and executable is a path relative to that package.
2. task_function argument is provided, executable is an absolute path to the function module, and package is None. This is useful for testing.
Source code in fractal_tasks_core/dev/lib_args_schemas.py
def create_schema_for_single_task(\n executable: str,\n package: Optional[str] = \"fractal_tasks_core\",\n custom_pydantic_models: Optional[list[tuple[str, str, str]]] = None,\n task_function: Optional[Callable] = None,\n verbose: bool = False,\n) -> _Schema:\n\"\"\"\n Main function to create a JSON Schema of task arguments\n\n This function can be used in two ways:\n\n 1. `task_function` argument is `None`, `package` is set, and `executable`\n is a path relative to that package.\n 2. `task_function` argument is provided, `executable` is an absolute path\n to the function module, and `package` is `None. This is useful for\n testing.\n\n \"\"\"\n\n logging.info(\"[create_schema_for_single_task] START\")\n if task_function is None:\n usage = \"1\"\n # Usage 1 (standard)\n if package is None:\n raise ValueError(\n \"Cannot call `create_schema_for_single_task with \"\n f\"{task_function=} and {package=}. Exit.\"\n )\n if os.path.isabs(executable):\n raise ValueError(\n \"Cannot call `create_schema_for_single_task with \"\n f\"{task_function=} and absolute {executable=}. Exit.\"\n )\n else:\n usage = \"2\"\n # Usage 2 (testing)\n if package is not None:\n raise ValueError(\n \"Cannot call `create_schema_for_single_task with \"\n f\"{task_function=} and non-None {package=}. Exit.\"\n )\n if not os.path.isabs(executable):\n raise ValueError(\n \"Cannot call `create_schema_for_single_task with \"\n f\"{task_function=} and non-absolute {executable=}. 
Exit.\"\n )\n\n # Extract function from module\n if usage == \"1\":\n # Extract the function name (for the moment we assume the function has\n # the same name as the module)\n function_name = Path(executable).with_suffix(\"\").name\n # Extract the function object\n task_function = _extract_function(\n package_name=package,\n module_relative_path=executable,\n function_name=function_name,\n verbose=verbose,\n )\n else:\n # The function object is already available, extract its name\n function_name = task_function.__name__\n\n if verbose:\n logging.info(f\"[create_schema_for_single_task] {function_name=}\")\n logging.info(f\"[create_schema_for_single_task] {task_function=}\")\n\n # Validate function signature against some custom constraints\n _validate_function_signature(task_function)\n\n # Create and clean up schema\n vf = ValidatedFunction(task_function, config=None)\n schema = vf.model.schema()\n schema = _remove_args_kwargs_properties(schema)\n schema = _remove_pydantic_internals(schema)\n schema = _remove_attributes_from_descriptions(schema)\n\n # Include titles for custom-model-typed arguments\n schema = _include_titles(schema, verbose=verbose)\n\n # Include descriptions of function. 
Note: this function works both\n # for usages 1 or 2 (see docstring).\n function_args_descriptions = _get_function_args_descriptions(\n package_name=package,\n module_path=executable,\n function_name=function_name,\n verbose=verbose,\n )\n schema = _insert_function_args_descriptions(\n schema=schema, descriptions=function_args_descriptions, verbose=verbose\n )\n\n # Merge lists of fractal-tasks-core and user-provided Pydantic models\n user_provided_models = custom_pydantic_models or []\n pydantic_models = FRACTAL_TASKS_CORE_PYDANTIC_MODELS + user_provided_models\n\n # Check that model names are unique\n pydantic_models_names = [item[2] for item in pydantic_models]\n duplicate_class_names = [\n name\n for name, count in Counter(pydantic_models_names).items()\n if count > 1\n ]\n if duplicate_class_names:\n pydantic_models_str = \" \" + \"\\n \".join(map(str, pydantic_models))\n raise ValueError(\n \"Cannot parse docstrings for models with non-unique names \"\n f\"{duplicate_class_names}, in\\n{pydantic_models_str}\"\n )\n\n # Extract model-attribute descriptions and insert them into schema\n for package_name, module_relative_path, class_name in pydantic_models:\n attrs_descriptions = _get_class_attrs_descriptions(\n package_name=package_name,\n module_relative_path=module_relative_path,\n class_name=class_name,\n )\n schema = _insert_class_attrs_descriptions(\n schema=schema,\n class_name=class_name,\n descriptions=attrs_descriptions,\n )\n\n logging.info(\"[create_schema_for_single_task] END\")\n return schema\n
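The uniqueness check on model names can be reproduced on its own. In this sketch the model tuples are hypothetical `(package, module, class)` entries, and `find_duplicate_class_names` is an illustrative helper rather than a function of the package.

```python
from collections import Counter

# Hypothetical (package_name, module_relative_path, class_name) entries
core_models = [
    ("fractal_tasks_core", "lib_channels.py", "OmeroChannel"),
]
user_models = [
    ("my_task_package", "some_module.py", "SomeModel"),
    ("my_task_package", "other_module.py", "OmeroChannel"),
]


def find_duplicate_class_names(
    models: list[tuple[str, str, str]]
) -> list[str]:
    """Return class names that appear more than once in `models`."""
    class_names = [item[2] for item in models]
    return [name for name, count in Counter(class_names).items() if count > 1]


duplicates = find_duplicate_class_names(core_models + user_models)
no_duplicates = find_duplicate_class_names(core_models + user_models[:1])
```

Because descriptions are inserted into the schema by class name, two models with the same class name cannot be told apart, even if they live in different packages; this is presumably why the function raises in that case.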
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _get_class_attrs_descriptions(\n package_name: str, module_relative_path: str, class_name: str\n) -> dict[str, str]:\n\"\"\"\n Extract attribute descriptions from a class.\n\n Args:\n package_name: Example `fractal_tasks_core`.\n module_relative_path: Example `lib_channels.py`.\n class_name: Example `OmeroChannel`.\n \"\"\"\n\n if not module_relative_path.endswith(\".py\"):\n raise ValueError(f\"Module {module_relative_path} must end with '.py'\")\n\n # Get the class ast.ClassDef object\n package_path = Path(import_module(package_name).__file__).parent\n module_path = package_path / module_relative_path\n tree = ast.parse(module_path.read_text())\n try:\n _class = next(\n c\n for c in ast.walk(tree)\n if (isinstance(c, ast.ClassDef) and c.name == class_name)\n )\n except StopIteration:\n raise RuntimeError(\n f\"Cannot find {class_name=} for {package_name=} \"\n f\"and {module_relative_path=}\"\n )\n docstring = ast.get_docstring(_class)\n parsed_docstring = docparse(docstring)\n descriptions = {\n x.arg_name: _sanitize_description(x.description)\n if x.description\n else \"Missing description\"\n for x in parsed_docstring.params\n }\n logging.info(f\"[_get_class_attrs_descriptions] END ({class_name=})\")\n return descriptions\n
This must be an absolute path like /some/module.py (if package_name is None) or a relative path like something.py (if package_name is not None).
TYPE: str
function_name
Example create_ome_zarr.
TYPE: str
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _get_function_args_descriptions(\n *,\n package_name: Optional[str],\n module_path: str,\n function_name: str,\n verbose: bool = False,\n) -> dict[str, str]:\n\"\"\"\n Extract argument descriptions from a function.\n\n Args:\n package_name: Example `fractal_tasks_core`.\n module_path:\n This must be an absolute path like `/some/module.py` (if\n `package_name` is `None`) or a relative path like `something.py`\n (if `package_name` is not `None`).\n function_name: Example `create_ome_zarr`.\n \"\"\"\n\n # Extract docstring from ast.FunctionDef\n docstring = _get_function_docstring(\n package_name=package_name,\n module_path=module_path,\n function_name=function_name,\n verbose=verbose,\n )\n if verbose:\n logging.info(f\"[_get_function_args_descriptions] {docstring}\")\n\n # Parse docstring (via docstring_parser) and prepare output\n parsed_docstring = docparse(docstring)\n descriptions = {\n param.arg_name: _sanitize_description(param.description)\n for param in parsed_docstring.params\n }\n logging.info(f\"[_get_function_args_descriptions] END ({function_name=})\")\n return descriptions\n
This must be an absolute path like /some/module.py (if package_name is None) or a relative path like something.py (if package_name is not None).
TYPE: str
function_name
Example create_ome_zarr.
TYPE: str
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _get_function_docstring(\n *,\n package_name: Optional[str],\n module_path: str,\n function_name: str,\n verbose: bool = False,\n) -> str:\n\"\"\"\n Extract docstring from a function.\n\n\n Args:\n package_name: Example `fractal_tasks_core`.\n module_path:\n This must be an absolute path like `/some/module.py` (if\n `package_name` is `None`) or a relative path like `something.py`\n (if `package_name` is not `None`).\n function_name: Example `create_ome_zarr`.\n \"\"\"\n\n if not module_path.endswith(\".py\"):\n raise ValueError(f\"Module {module_path} must end with '.py'\")\n\n # Get the function ast.FunctionDef object\n if package_name is not None:\n if os.path.isabs(module_path):\n raise ValueError(\n \"Error in _get_function_docstring: `package_name` is not \"\n \"None but `module_path` is absolute.\"\n )\n package_path = Path(import_module(package_name).__file__).parent\n module_path = package_path / module_path\n else:\n if not os.path.isabs(module_path):\n raise ValueError(\n \"Error in _get_function_docstring: `package_name` is None \"\n \"but `module_path` is not absolute.\"\n )\n module_path = Path(module_path)\n\n if verbose:\n logging.info(f\"[_get_function_docstring] {function_name=}\")\n logging.info(f\"[_get_function_docstring] {module_path=}\")\n\n tree = ast.parse(module_path.read_text())\n _function = next(\n f\n for f in ast.walk(tree)\n if (isinstance(f, ast.FunctionDef) and f.name == function_name)\n )\n\n # Extract docstring from ast.FunctionDef\n return ast.get_docstring(_function)\n
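The core of this helper is a lookup of an `ast.FunctionDef` node followed by `ast.get_docstring`. The sketch below shows that step on an in-memory source string rather than on a module file; the function name `get_docstring_from_source` and the sample source are hypothetical. Unlike the listing above, which lets `StopIteration` propagate when the function is missing, the sketch raises a `RuntimeError`.

```python
import ast
from typing import Optional

# Hypothetical module content, used instead of reading a file from disk
SOURCE = '''
def create_ome_zarr(zarr_url: str):
    """
    Create an OME-Zarr.

    Args:
        zarr_url: Example argument.
    """
    pass
'''


def get_docstring_from_source(
    source: str, function_name: str
) -> Optional[str]:
    """Find a function definition by name and return its cleaned docstring."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == function_name:
            # Returns None when the function has no docstring
            return ast.get_docstring(node)
    raise RuntimeError(f"Cannot find {function_name=}")


docstring = get_docstring_from_source(SOURCE, "create_ome_zarr")
```

Because the source is only parsed and never imported, this approach does not execute any code of the module being inspected.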
Merge the descriptions obtained via _get_attributes_models_descriptions into the class_name definition, within an existing JSON Schema.
PARAMETER DESCRIPTION schema
TBD
TYPE: dict
class_name
TBD
TYPE: str
descriptions
TBD
TYPE: dict
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _insert_class_attrs_descriptions(\n *, schema: dict, class_name: str, descriptions: dict\n):\n\"\"\"\n Merge the descriptions obtained via `_get_attributes_models_descriptions`\n into the `class_name` definition, within an existing JSON Schema\n\n Args:\n schema: TBD\n class_name: TBD\n descriptions: TBD\n \"\"\"\n new_schema = schema.copy()\n if \"definitions\" not in schema:\n return new_schema\n else:\n new_definitions = schema[\"definitions\"].copy()\n # Loop over existing definitions\n for name, definition in schema[\"definitions\"].items():\n if name == class_name:\n for prop in definition[\"properties\"]:\n if \"description\" in new_definitions[name][\"properties\"][prop]:\n raise ValueError(\n f\"Property {name}.{prop} already has description\"\n )\n else:\n new_definitions[name][\"properties\"][prop][\n \"description\"\n ] = descriptions[prop]\n new_schema[\"definitions\"] = new_definitions\n logging.info(\"[_insert_class_attrs_descriptions] END\")\n return new_schema\n
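The effect of this merge is easiest to see on a toy schema. The following sketch is an illustration under stated assumptions, not the library function: the name `insert_descriptions` and the example schema are hypothetical, and it uses `deepcopy`, whereas the listing above uses shallow `.copy()` calls, so nested dictionaries there are shared with the input.

```python
from copy import deepcopy


def insert_descriptions(
    schema: dict, class_name: str, descriptions: dict
) -> dict:
    """Return a copy of `schema` with descriptions added to one definition."""
    new_schema = deepcopy(schema)
    definition = new_schema.get("definitions", {}).get(class_name)
    if definition is None:
        # Nothing to do if the schema has no such definition
        return new_schema
    for prop_name, prop in definition["properties"].items():
        if "description" in prop:
            raise ValueError(
                f"Property {class_name}.{prop_name} already has description"
            )
        prop["description"] = descriptions[prop_name]
    return new_schema


# Hypothetical JSON Schema fragment, for illustration only
schema = {
    "definitions": {
        "Window": {
            "properties": {
                "min": {"type": "number"},
                "max": {"type": "number"},
            }
        }
    }
}
new_schema = insert_descriptions(
    schema, "Window", {"min": "Lower bound", "max": "Upper bound"}
)
```

The deep copy keeps the input schema unchanged, which makes the helper safe to call repeatedly on the same object.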
This is a provisional helper function that replaces newlines with spaces and reduces multiple contiguous whitespace characters to a single one. Future iterations of the docstrings format/parsing may render this function not-needed or obsolete.
PARAMETER DESCRIPTION string
TBD
TYPE: str
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _sanitize_description(string: str) -> str:\n\"\"\"\n Sanitize a description string.\n\n This is a provisional helper function that replaces newlines with spaces\n and reduces multiple contiguous whitespace characters to a single one.\n Future iterations of the docstrings format/parsing may render this function\n not-needed or obsolete.\n\n Args:\n string: TBD\n \"\"\"\n # Replace newline with space\n new_string = string.replace(\"\\n\", \" \")\n # Replace N-whitespace characters with a single one\n while \"  \" in new_string:\n new_string = new_string.replace(\"  \", \" \")\n return new_string\n
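As a quick illustration of the behavior described above, the sketch below applies the same two steps (newlines to spaces, then collapsing runs of spaces) to a multi-line description. The function name `sanitize` and the sample string are hypothetical.

```python
def sanitize(text: str) -> str:
    """Replace newlines with spaces, then collapse repeated spaces."""
    new_text = text.replace("\n", " ")
    # Keep halving runs of spaces until only single spaces remain
    while "  " in new_text:
        new_text = new_text.replace("  ", " ")
    return new_text


# A description wrapped over two lines, as it could appear in a docstring
result = sanitize("First line,\n    continued   here.")
```

Here `result` is a single-line string with words separated by exactly one space, which is the form inserted into the JSON Schema descriptions.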
Extract function from a module with the same name.
PARAMETER DESCRIPTION package_name
Example fractal_tasks_core.
TYPE: str DEFAULT: 'fractal_tasks_core'
module_relative_path
Example tasks/create_ome_zarr.py.
TYPE: str
function_name
Example create_ome_zarr.
TYPE: str
verbose
TYPE: bool DEFAULT: False
Source code in fractal_tasks_core/dev/lib_signature_constraints.py
def _extract_function(\n module_relative_path: str,\n function_name: str,\n package_name: str = \"fractal_tasks_core\",\n verbose: bool = False,\n) -> Callable:\n\"\"\"\n Extract function from a module with the same name.\n\n Args:\n package_name: Example `fractal_tasks_core`.\n module_relative_path: Example `tasks/create_ome_zarr.py`.\n function_name: Example `create_ome_zarr`.\n verbose:\n \"\"\"\n if not module_relative_path.endswith(\".py\"):\n raise ValueError(f\"{module_relative_path=} must end with '.py'\")\n module_relative_path_no_py = str(\n Path(module_relative_path).with_suffix(\"\")\n )\n module_relative_path_dots = module_relative_path_no_py.replace(\"/\", \".\")\n if verbose:\n logging.info(\n f\"Now calling `import_module` for \"\n f\"{package_name}.{module_relative_path_dots}\"\n )\n imported_module = import_module(\n f\"{package_name}.{module_relative_path_dots}\"\n )\n if verbose:\n logging.info(\n f\"Now getting attribute {function_name} from \"\n f\"imported module {imported_module}.\"\n )\n task_function = getattr(imported_module, function_name)\n return task_function\n
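The path-to-module conversion can be exercised with a standard-library package. This is a minimal sketch of the same steps, assuming POSIX-style `/` separators in the relative path; the name `extract_function` is hypothetical, and `email`/`utils.py`/`parseaddr` are used only because they are available in every Python installation.

```python
from importlib import import_module
from pathlib import Path
from typing import Callable


def extract_function(
    package_name: str, module_relative_path: str, function_name: str
) -> Callable:
    """Import `package.module` from a relative `.py` path and get a function."""
    if not module_relative_path.endswith(".py"):
        raise ValueError(f"{module_relative_path=} must end with '.py'")
    # "tasks/create_ome_zarr.py" -> "tasks/create_ome_zarr" -> "tasks.create_ome_zarr"
    no_suffix = str(Path(module_relative_path).with_suffix(""))
    dotted = no_suffix.replace("/", ".")
    module = import_module(f"{package_name}.{dotted}")
    return getattr(module, function_name)


# Standard-library stand-in for a task package
func = extract_function("email", "utils.py", "parseaddr")
```

Note that, unlike the AST-based docstring helpers, this approach imports the module and therefore runs its top-level code.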
Implement a set of checks for type hints that do not play well with the creation of JSON Schema, see https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/399.
PARAMETER DESCRIPTION function
TBD
TYPE: Callable
Source code in fractal_tasks_core/dev/lib_signature_constraints.py
def _validate_function_signature(function: Callable):\n\"\"\"\n Validate the function signature.\n\n Implement a set of checks for type hints that do not play well with the\n creation of JSON Schema, see\n https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/399.\n\n Args:\n function: TBD\n \"\"\"\n sig = signature(function)\n for param in sig.parameters.values():\n\n # CASE 1: Check that name is not forbidden\n if param.name in FORBIDDEN_PARAM_NAMES:\n raise ValueError(\n f\"Function {function} has argument with name {param.name}\"\n )\n\n # CASE 2: Raise an error for unions\n if str(param.annotation).startswith((\"typing.Union[\", \"Union[\")):\n raise ValueError(\"typing.Union is not supported\")\n\n # CASE 3: Raise an error for \"|\"\n if \"|\" in str(param.annotation):\n raise ValueError('Use of \"|\" in type hints is not supported')\n\n # CASE 4: Raise an error for optional parameter with given (non-None)\n # default, e.g. Optional[str] = \"asd\"\n is_annotation_optional = str(param.annotation).startswith(\n (\"typing.Optional[\", \"Optional[\")\n )\n default_given = (param.default is not None) and (\n param.default != inspect._empty\n )\n if default_given and is_annotation_optional:\n raise ValueError(\"Optional parameter has non-None default value\")\n\n logging.info(\"[_validate_function_signature] END\")\n return sig\n
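To make the checks concrete, the sketch below reproduces two of the four cases (forbidden names and `|` in type hints) on small example functions. It is an illustration only: the value of `FORBIDDEN_PARAM_NAMES` is an assumption, the helper names are hypothetical, and the `|` hint is written as a string annotation so that the example does not depend on the Python version.

```python
import inspect

# Assumption: the real constant is defined elsewhere in the library
FORBIDDEN_PARAM_NAMES = ("args", "kwargs")


def check_signature(function) -> None:
    """Raise ValueError for a forbidden argument name or a "|" type hint."""
    for param in inspect.signature(function).parameters.values():
        if param.name in FORBIDDEN_PARAM_NAMES:
            raise ValueError(f"Forbidden argument name: {param.name}")
        if "|" in str(param.annotation):
            raise ValueError('Use of "|" in type hints is not supported')


def is_accepted(function) -> bool:
    """Return True if `check_signature` raises no error."""
    try:
        check_signature(function)
    except ValueError:
        return False
    return True


def good_task(zarr_url: str, level: int = 0):
    pass


def bad_name(*args):
    pass


def bad_hint(x: "int | str"):
    pass


results = {f.__name__: is_accepted(f) for f in (good_task, bad_name, bad_hint)}
```

Only `good_task` passes; the other two functions are rejected for the reasons named in the corresponding cases of the listing above.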
Return task description based on function docstring.
Source code in fractal_tasks_core/dev/lib_task_docs.py
def create_docs_info(\n executable_non_parallel: Optional[str] = None,\n executable_parallel: Optional[str] = None,\n package: str = \"fractal_tasks_core\",\n) -> list[str]:\n\"\"\"\n Return task description based on function docstring.\n \"\"\"\n logging.info(\"[create_docs_info] START\")\n docs_info = []\n for executable in [executable_non_parallel, executable_parallel]:\n if executable is None:\n continue\n # Extract the function name.\n # Note: this could be made more general, but for the moment we assume\n # that the function has the same name as the module)\n function_name = Path(executable).with_suffix(\"\").name\n logging.info(f\"[create_docs_info] {function_name=}\")\n # Get function description\n description = _get_function_description(\n package_name=package,\n module_path=executable,\n function_name=function_name,\n )\n docs_info.append(f\"## {function_name}\\n{description}\\n\")\n docs_info = \"\".join(docs_info)\n logging.info(\"[create_docs_info] END\")\n return docs_info\n
Scan through properties of a JSON Schema, and set their title when it is missing.
The title is set to name.title(), where title is a standard string method - see https://docs.python.org/3/library/stdtypes.html#str.title.
PARAMETER DESCRIPTION properties
TBD
TYPE: dict[str, dict]
Source code in fractal_tasks_core/dev/lib_titles.py
def _include_titles_for_properties(\n properties: dict[str, dict],\n verbose: bool = False,\n) -> dict[str, dict]:\n\"\"\"\n Scan through properties of a JSON Schema, and set their title when it is\n missing.\n\n The title is set to `name.title()`, where `title` is a standard string\n method - see https://docs.python.org/3/library/stdtypes.html#str.title.\n\n Args:\n properties: TBD\n \"\"\"\n if verbose:\n logging.info(\n f\"[_include_titles_for_properties] Original properties:\\n\"\n f\"{properties}\"\n )\n\n new_properties = properties.copy()\n for prop_name, prop in properties.items():\n if \"title\" not in prop.keys():\n new_prop = prop.copy()\n new_prop[\"title\"] = prop_name.title()\n new_properties[prop_name] = new_prop\n if verbose:\n logging.info(\n f\"[_include_titles_for_properties] New properties:\\n\"\n f\"{new_properties}\"\n )\n return new_properties\n
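Since `str.title()` capitalizes the first letter after every non-letter character, underscores are kept and each word is capitalized. The sketch below shows the resulting titles on a small properties dictionary; the function name `include_titles` and the example properties are hypothetical.

```python
def include_titles(properties: dict[str, dict]) -> dict[str, dict]:
    """Return a copy of `properties` where each entry has a title."""
    new_properties = {}
    for name, prop in properties.items():
        new_prop = dict(prop)
        # Existing titles are preserved; missing ones default to name.title()
        new_prop.setdefault("title", name.title())
        new_properties[name] = new_prop
    return new_properties


# Hypothetical JSON Schema properties, for illustration only
properties = {
    "zarr_url": {"type": "string"},
    "level": {"type": "integer", "title": "Pyramid level"},
}
titled = include_titles(properties)
```

For example, the property `zarr_url` receives the title `Zarr_Url`, while the explicit title of `level` is left untouched.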
These models are used in task_list.py, and they provide a layer that simplifies writing the task list of a package in a way that is compliant with fractal-server v2.
See https://ngff.openmicroscopy.org/0.4/#plate-md.
Source code in fractal_tasks_core/ngff/specs.py
class AcquisitionInPlate(BaseModel):\n\"\"\"\n Model for an element of `Plate.acquisitions`.\n\n See https://ngff.openmicroscopy.org/0.4/#plate-md.\n \"\"\"\n\n id: int = Field(\n description=\"A unique identifier within the context of the plate\"\n )\n maximumfieldcount: Optional[int] = Field(\n None,\n description=(\n \"Int indicating the maximum number of fields of view for the \"\n \"acquisition\"\n ),\n )\n name: Optional[str] = Field(\n None, description=\"a string identifying the name of the acquisition\"\n )\n description: Optional[str] = Field(\n None,\n description=\"The description of the acquisition\",\n )\n
class Axis(BaseModel):\n\"\"\"\n Model for an element of `Multiscale.axes`.\n\n See https://ngff.openmicroscopy.org/0.4/#axes-md.\n \"\"\"\n\n name: str\n type: Optional[str] = None\n unit: Optional[str] = None\n
See https://ngff.openmicroscopy.org/0.4/#omero-md.
Source code in fractal_tasks_core/ngff/specs.py
class Channel(BaseModel):\n\"\"\"\n Model for an element of `Omero.channels`.\n\n See https://ngff.openmicroscopy.org/0.4/#omero-md.\n \"\"\"\n\n window: Optional[Window] = None\n label: Optional[str] = None\n family: Optional[str] = None\n color: str\n active: Optional[bool] = None\n
See https://ngff.openmicroscopy.org/0.4/#plate-md.
Source code in fractal_tasks_core/ngff/specs.py
class ColumnInPlate(BaseModel):\n\"\"\"\n Model for an element of `Plate.columns`.\n\n See https://ngff.openmicroscopy.org/0.4/#plate-md.\n \"\"\"\n\n name: str\n
See https://ngff.openmicroscopy.org/0.4/#multiscale-md
Source code in fractal_tasks_core/ngff/specs.py
class Dataset(BaseModel):\n\"\"\"\n Model for an element of `Multiscale.datasets`.\n\n See https://ngff.openmicroscopy.org/0.4/#multiscale-md\n \"\"\"\n\n path: str\n coordinateTransformations: list[\n Union[\n ScaleCoordinateTransformation, TranslationCoordinateTransformation\n ]\n ] = Field(..., min_items=1)\n\n @property\n def scale_transformation(self) -> ScaleCoordinateTransformation:\n\"\"\"\n Extract the unique scale transformation, or fail otherwise.\n \"\"\"\n _transformations = [\n t for t in self.coordinateTransformations if t.type == \"scale\"\n ]\n if len(_transformations) == 0:\n raise ValueError(\n \"Missing scale transformation in dataset.\\n\"\n \"Current coordinateTransformations:\\n\"\n f\"{self.coordinateTransformations}\"\n )\n elif len(_transformations) > 1:\n raise ValueError(\n \"More than one scale transformation in dataset.\\n\"\n \"Current coordinateTransformations:\\n\"\n f\"{self.coordinateTransformations}\"\n )\n else:\n return _transformations[0]\n
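The `scale_transformation` property selects the single `scale` entry among the coordinate transformations of a dataset. A minimal sketch of that selection on plain dictionaries (rather than Pydantic models) follows; the name `unique_scale` and the example values are hypothetical.

```python
def unique_scale(transformations: list[dict]) -> dict:
    """Return the only transformation of type "scale", or fail."""
    scales = [t for t in transformations if t["type"] == "scale"]
    if len(scales) != 1:
        raise ValueError(
            f"Expected exactly one scale transformation, found {len(scales)}"
        )
    return scales[0]


# Hypothetical coordinateTransformations of one pyramid level
transformations = [
    {"type": "scale", "scale": [1.0, 0.325, 0.325]},
    {"type": "translation", "translation": [0.0, 10.0, 10.0]},
]
scale = unique_scale(transformations)["scale"]
```

The same failure modes apply as in the listing above: a dataset with no scale transformation, or with more than one, is rejected.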
Note 1: The NGFF image is defined in a different model (NgffImageMeta), while the Image model only refers to an item of Well.images.
Note 2: We deviate from NGFF specs, since we allow path to be an arbitrary string. TODO: include a check like constr(regex=r'^[A-Za-z0-9]+$'), through a Pydantic validator.
See https://ngff.openmicroscopy.org/0.4/#well-md.
Source code in fractal_tasks_core/ngff/specs.py
class ImageInWell(BaseModel):\n\"\"\"\n Model for an element of `Well.images`.\n\n **Note 1:** The NGFF image is defined in a different model\n (`NgffImageMeta`), while the `Image` model only refers to an item of\n `Well.images`.\n\n **Note 2:** We deviate from NGFF specs, since we allow `path` to be an\n arbitrary string.\n TODO: include a check like `constr(regex=r'^[A-Za-z0-9]+$')`, through a\n Pydantic validator.\n\n See https://ngff.openmicroscopy.org/0.4/#well-md.\n \"\"\"\n\n acquisition: Optional[int] = Field(\n None, description=\"A unique identifier within the context of the plate\"\n )\n path: str = Field(\n ..., description=\"The path for this field of view subgroup\"\n )\n
Model for an element of NgffImageMeta.multiscales.
See https://ngff.openmicroscopy.org/0.4/#multiscale-md.
Source code in fractal_tasks_core/ngff/specs.py
class Multiscale(BaseModel):\n\"\"\"\n Model for an element of `NgffImageMeta.multiscales`.\n\n See https://ngff.openmicroscopy.org/0.4/#multiscale-md.\n \"\"\"\n\n name: Optional[str] = None\n datasets: list[Dataset] = Field(..., min_items=1)\n version: Optional[str] = None\n axes: list[Axis] = Field(..., max_items=5, min_items=2, unique_items=True)\n coordinateTransformations: Optional[\n list[\n Union[\n ScaleCoordinateTransformation,\n TranslationCoordinateTransformation,\n ]\n ]\n ] = None\n\n @validator(\"coordinateTransformations\", always=True)\n def _no_global_coordinateTransformations(cls, v):\n\"\"\"\n Fail if Multiscale has a (global) coordinateTransformations attribute.\n \"\"\"\n if v is not None:\n raise NotImplementedError(\n \"Global coordinateTransformations at the multiscales \"\n \"level are not currently supported in the fractal-tasks-core \"\n \"model for the NGFF multiscale.\"\n )\n
Fail if Multiscale has a (global) coordinateTransformations attribute.
Source code in fractal_tasks_core/ngff/specs.py
@validator(\"coordinateTransformations\", always=True)\ndef _no_global_coordinateTransformations(cls, v):\n\"\"\"\n Fail if Multiscale has a (global) coordinateTransformations attribute.\n \"\"\"\n if v is not None:\n raise NotImplementedError(\n \"Global coordinateTransformations at the multiscales \"\n \"level are not currently supported in the fractal-tasks-core \"\n \"model for the NGFF multiscale.\"\n )\n
See https://ngff.openmicroscopy.org/0.4/#image-layout.
Source code in fractal_tasks_core/ngff/specs.py
class NgffImageMeta(BaseModel):\n\"\"\"\n Model for the metadata of a NGFF image.\n\n See https://ngff.openmicroscopy.org/0.4/#image-layout.\n \"\"\"\n\n multiscales: list[Multiscale] = Field(\n ...,\n description=\"The multiscale datasets for this image\",\n min_items=1,\n unique_items=True,\n )\n omero: Optional[Omero] = None\n\n @property\n def multiscale(self) -> Multiscale:\n\"\"\"\n The single element of `self.multiscales`.\n\n Raises:\n NotImplementedError:\n If there are no multiscales or more than one.\n \"\"\"\n if len(self.multiscales) > 1:\n raise NotImplementedError(\n \"Only images with one multiscale are supported \"\n f\"(given: {len(self.multiscales)}\"\n )\n return self.multiscales[0]\n\n @property\n def datasets(self) -> list[Dataset]:\n\"\"\"\n The `datasets` attribute of `self.multiscale`.\n \"\"\"\n return self.multiscale.datasets\n\n @property\n def num_levels(self) -> int:\n return len(self.datasets)\n\n @property\n def axes_names(self) -> list[str]:\n\"\"\"\n List of axes names.\n \"\"\"\n return [ax.name for ax in self.multiscale.axes]\n\n @property\n def pixel_sizes_zyx(self) -> list[list[float]]:\n\"\"\"\n Pixel sizes extracted from scale transformations of datasets.\n\n Raises:\n ValueError:\n If pixel sizes are below a given threshold (1e-9).\n \"\"\"\n x_index = self.axes_names.index(\"x\")\n y_index = self.axes_names.index(\"y\")\n try:\n z_index = self.axes_names.index(\"z\")\n except ValueError:\n z_index = None\n logging.warning(\n f\"Z axis is not present (axes: {self.axes_names}), and Z pixel\"\n \" size is set to 1. 
This may work, by accident, but it is \"\n \"not fully supported.\"\n )\n _pixel_sizes_zyx = []\n for level in range(self.num_levels):\n scale = self.datasets[level].scale_transformation.scale\n pixel_size_x = scale[x_index]\n pixel_size_y = scale[y_index]\n if z_index is not None:\n pixel_size_z = scale[z_index]\n else:\n pixel_size_z = 1.0\n _pixel_sizes_zyx.append([pixel_size_z, pixel_size_y, pixel_size_x])\n if min(_pixel_sizes_zyx[-1]) < 1e-9:\n raise ValueError(\n f\"Pixel sizes at level {level} are too small: \"\n f\"{_pixel_sizes_zyx[-1]}\"\n )\n\n return _pixel_sizes_zyx\n\n def get_pixel_sizes_zyx(self, *, level: int = 0) -> list[float]:\n return self.pixel_sizes_zyx[level]\n\n @property\n def coarsening_xy(self) -> int:\n\"\"\"\n Linear coarsening factor in the YX plane.\n\n We only support coarsening factors that are homogeneous (both in the\n X/Y directions and across pyramid levels).\n\n Raises:\n NotImplementedError:\n If coarsening ratios are not homogeneous.\n \"\"\"\n current_ratio = None\n for ind in range(1, self.num_levels):\n ratio_x = round(\n self.pixel_sizes_zyx[ind][2] / self.pixel_sizes_zyx[ind - 1][2]\n )\n ratio_y = round(\n self.pixel_sizes_zyx[ind][1] / self.pixel_sizes_zyx[ind - 1][1]\n )\n if ratio_x != ratio_y:\n raise NotImplementedError(\n \"Inhomogeneous coarsening in X/Y directions \"\n \"is not supported.\\n\"\n f\"ZYX pixel sizes:\\n {self.pixel_sizes_zyx}\"\n )\n if current_ratio is None:\n current_ratio = ratio_x\n else:\n if current_ratio != ratio_x:\n raise NotImplementedError(\n \"Inhomogeneous coarsening across levels \"\n \"is not supported.\\n\"\n f\"ZYX pixel sizes:\\n {self.pixel_sizes_zyx}\"\n )\n\n return current_ratio\n
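The `coarsening_xy` logic can be followed on a plain list of ZYX pixel sizes, one entry per pyramid level. The sketch below is a standalone illustration with a hypothetical function signature and example values; it mirrors the ratio checks of the property above without using the Pydantic models.

```python
from typing import Optional


def coarsening_xy(pixel_sizes_zyx: list[list[float]]) -> Optional[int]:
    """Return the homogeneous XY coarsening factor between pyramid levels."""
    current_ratio = None
    for ind in range(1, len(pixel_sizes_zyx)):
        # Index 2 is X and index 1 is Y in a [z, y, x] triplet
        ratio_x = round(pixel_sizes_zyx[ind][2] / pixel_sizes_zyx[ind - 1][2])
        ratio_y = round(pixel_sizes_zyx[ind][1] / pixel_sizes_zyx[ind - 1][1])
        if ratio_x != ratio_y:
            raise NotImplementedError("Inhomogeneous coarsening in X/Y")
        if current_ratio is not None and current_ratio != ratio_x:
            raise NotImplementedError("Inhomogeneous coarsening across levels")
        current_ratio = ratio_x
    # None when there is a single level, since no ratio can be computed
    return current_ratio


# Hypothetical three-level pyramid with a factor of 2 between levels
pixel_sizes = [[1.0, 0.1625, 0.1625], [1.0, 0.325, 0.325], [1.0, 0.65, 0.65]]
factor = coarsening_xy(pixel_sizes)
```

Note that the Z pixel size does not enter the computation, so a pyramid that is only coarsened in the XY plane is the expected input.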
See https://ngff.openmicroscopy.org/0.4/#plate-md.
Source code in fractal_tasks_core/ngff/specs.py
class NgffPlateMeta(BaseModel):\n\"\"\"\n Model for the metadata of a NGFF plate.\n\n See https://ngff.openmicroscopy.org/0.4/#plate-md.\n \"\"\"\n\n plate: Plate\n
class NgffWellMeta(BaseModel):\n\"\"\"\n Model for the metadata of a NGFF well.\n\n See https://ngff.openmicroscopy.org/0.4/#well-md.\n \"\"\"\n\n well: Optional[Well] = None\n\n def get_acquisition_paths(self) -> dict[int, list[str]]:\n\"\"\"\n Create mapping from acquisition indices to corresponding paths.\n\n Runs on the well zarr attributes and loads the relative paths in the\n well.\n\n Returns:\n Dictionary with `(acquisition index: [image_path])` key/value\n pairs.\n\n Raises:\n ValueError:\n If an element of `self.well.images` has no `acquisition`\n attribute.\n \"\"\"\n acquisition_dict = {}\n for image in self.well.images:\n if image.acquisition is None:\n raise ValueError(\n \"Cannot get acquisition paths for Zarr files without \"\n \"'acquisition' metadata at the well level\"\n )\n if image.acquisition not in acquisition_dict:\n acquisition_dict[image.acquisition] = []\n acquisition_dict[image.acquisition].append(image.path)\n return acquisition_dict\n
Create mapping from acquisition indices to corresponding paths.
Runs on the well zarr attributes and loads the relative paths in the well.
RETURNS DESCRIPTION dict[int, list[str]]
Dictionary with (acquisition index: [image_path]) key/value pairs.
RAISES DESCRIPTION ValueError
If an element of self.well.images has no acquisition attribute.
Source code in fractal_tasks_core/ngff/specs.py
def get_acquisition_paths(self) -> dict[int, list[str]]:\n\"\"\"\n Create mapping from acquisition indices to corresponding paths.\n\n Runs on the well zarr attributes and loads the relative paths in the\n well.\n\n Returns:\n Dictionary with `(acquisition index: [image_path])` key/value\n pairs.\n\n Raises:\n ValueError:\n If an element of `self.well.images` has no `acquisition`\n attribute.\n \"\"\"\n acquisition_dict = {}\n for image in self.well.images:\n if image.acquisition is None:\n raise ValueError(\n \"Cannot get acquisition paths for Zarr files without \"\n \"'acquisition' metadata at the well level\"\n )\n if image.acquisition not in acquisition_dict:\n acquisition_dict[image.acquisition] = []\n acquisition_dict[image.acquisition].append(image.path)\n return acquisition_dict\n
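The mapping returned by this method groups image paths by acquisition index. A minimal sketch on plain dictionaries (instead of `ImageInWell` objects) is shown below; the standalone function and the example image list are hypothetical.

```python
def get_acquisition_paths(images: list[dict]) -> dict[int, list[str]]:
    """Group image paths by their acquisition index."""
    acquisition_dict: dict[int, list[str]] = {}
    for image in images:
        acquisition = image.get("acquisition")
        if acquisition is None:
            raise ValueError(
                "Cannot get acquisition paths for images without "
                "'acquisition' metadata"
            )
        acquisition_dict.setdefault(acquisition, []).append(image["path"])
    return acquisition_dict


# Hypothetical well with two images for acquisition 0 and one for acquisition 1
images = [
    {"path": "0", "acquisition": 0},
    {"path": "0_illum_corr", "acquisition": 0},
    {"path": "1", "acquisition": 1},
]
paths = get_acquisition_paths(images)
```

Each value is a list, so several images sharing the same acquisition are all retained.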
See https://ngff.openmicroscopy.org/0.4/#omero-md.
Source code in fractal_tasks_core/ngff/specs.py
class Omero(BaseModel):\n\"\"\"\n Model for `NgffImageMeta.omero`.\n\n See https://ngff.openmicroscopy.org/0.4/#omero-md.\n \"\"\"\n\n channels: list[Channel]\n
See https://ngff.openmicroscopy.org/0.4/#plate-md.
Source code in fractal_tasks_core/ngff/specs.py
class Plate(BaseModel):\n\"\"\"\n Model for `NgffPlateMeta.plate`.\n\n See https://ngff.openmicroscopy.org/0.4/#plate-md.\n \"\"\"\n\n acquisitions: list[AcquisitionInPlate]\n columns: list[ColumnInPlate]\n field_count: Optional[int]\n name: Optional[str]\n rows: list[RowInPlate]\n # version will become required in 0.5\n version: Optional[str] = Field(\n None, description=\"The version of the specification\"\n )\n wells: list[WellInPlate]\n
See https://ngff.openmicroscopy.org/0.4/#plate-md.
Source code in fractal_tasks_core/ngff/specs.py
class RowInPlate(BaseModel):\n\"\"\"\n Model for an element of `Plate.rows`.\n\n See https://ngff.openmicroscopy.org/0.4/#plate-md.\n \"\"\"\n\n name: str\n
This corresponds to scale-type elements of Dataset.coordinateTransformations or Multiscale.coordinateTransformations. See https://ngff.openmicroscopy.org/0.4/#trafo-md
Source code in fractal_tasks_core/ngff/specs.py
class ScaleCoordinateTransformation(BaseModel):\n\"\"\"\n Model for a scale transformation.\n\n This corresponds to scale-type elements of\n `Dataset.coordinateTransformations` or\n `Multiscale.coordinateTransformations`.\n See https://ngff.openmicroscopy.org/0.4/#trafo-md\n \"\"\"\n\n type: Literal[\"scale\"]\n scale: list[float] = Field(..., min_items=2)\n
This corresponds to translation-type elements of Dataset.coordinateTransformations or Multiscale.coordinateTransformations. See https://ngff.openmicroscopy.org/0.4/#trafo-md
Source code in fractal_tasks_core/ngff/specs.py
class TranslationCoordinateTransformation(BaseModel):\n\"\"\"\n Model for a translation transformation.\n\n This corresponds to translation-type elements of\n `Dataset.coordinateTransformations` or\n `Multiscale.coordinateTransformations`.\n See https://ngff.openmicroscopy.org/0.4/#trafo-md\n \"\"\"\n\n type: Literal[\"translation\"]\n translation: list[float] = Field(..., min_items=2)\n
class Well(BaseModel):\n\"\"\"\n Model for `NgffWellMeta.well`.\n\n See https://ngff.openmicroscopy.org/0.4/#well-md.\n \"\"\"\n\n images: list[ImageInWell] = Field(\n ...,\n description=\"The images included in this well\",\n min_items=1,\n unique_items=True,\n )\n version: Optional[str] = Field(\n None, description=\"The version of the specification\"\n )\n
See https://ngff.openmicroscopy.org/0.4/#plate-md.
Source code in fractal_tasks_core/ngff/specs.py
class WellInPlate(BaseModel):\n\"\"\"\n Model for an element of `Plate.wells`.\n\n See https://ngff.openmicroscopy.org/0.4/#plate-md.\n \"\"\"\n\n path: str\n rowIndex: int\n columnIndex: int\n
Note that we deviate from NGFF specs by making start and end optional. See https://ngff.openmicroscopy.org/0.4/#omero-md.
Source code in fractal_tasks_core/ngff/specs.py
class Window(BaseModel):\n\"\"\"\n Model for `Channel.window`.\n\n Note that we deviate from NGFF specs by making `start` and `end` optional.\n See https://ngff.openmicroscopy.org/0.4/#omero-md.\n \"\"\"\n\n max: float\n min: float\n start: Optional[float] = None\n end: Optional[float] = None\n
This is used to provide a user-friendly error message.
Source code in fractal_tasks_core/ngff/zarr_utils.py
class ZarrGroupNotFoundError(ValueError):\n\"\"\"\n Wrap zarr.errors.GroupNotFoundError\n\n This is used to provide a user-friendly error message.\n \"\"\"\n\n pass\n
Given a Zarr group, find whether it is an OME-NGFF plate, well or image.
PARAMETER DESCRIPTION group
Zarr group
TYPE: Group
RETURNS DESCRIPTION str
The detected OME-NGFF type (plate, well or image).
Source code in fractal_tasks_core/ngff/zarr_utils.py
def detect_ome_ngff_type(group: zarr.hierarchy.Group) -> str:\n\"\"\"\n Given a Zarr group, find whether it is an OME-NGFF plate, well or image.\n\n Args:\n group: Zarr group\n\n Returns:\n The detected OME-NGFF type (`plate`, `well` or `image`).\n \"\"\"\n attrs = group.attrs.asdict()\n if \"plate\" in attrs.keys():\n ngff_type = \"plate\"\n elif \"well\" in attrs.keys():\n ngff_type = \"well\"\n elif \"multiscales\" in attrs.keys():\n ngff_type = \"image\"\n else:\n error_msg = (\n \"Zarr group at cannot be identified as one \"\n \"of OME-NGFF plate/well/image groups.\"\n )\n logger.error(error_msg)\n raise ValueError(error_msg)\n logger.info(f\"Zarr group identified as OME-NGFF {ngff_type}.\")\n return ngff_type\n
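The detection only depends on which top-level keys are present in the group attributes, checked in the order plate, well, multiscales. The sketch below applies the same rule to plain dictionaries, so it can be run without a Zarr store; the name `detect_type` and the example attributes are hypothetical.

```python
def detect_type(attrs: dict) -> str:
    """Classify group attributes as OME-NGFF plate, well or image."""
    # Order matters: "plate" is checked first, then "well", then "multiscales"
    for key, ngff_type in (
        ("plate", "plate"),
        ("well", "well"),
        ("multiscales", "image"),
    ):
        if key in attrs:
            return ngff_type
    raise ValueError(
        "Attributes cannot be identified as one of OME-NGFF "
        "plate/well/image groups."
    )


# Hypothetical attribute dictionaries, for illustration only
detected = [
    detect_type({"plate": {}}),
    detect_type({"well": {}}),
    detect_type({"multiscales": []}),
]
```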
Load the attributes of a zarr group and cast them to NgffImageMeta.
PARAMETER DESCRIPTION zarr_path
Path to the zarr group.
TYPE: str
RETURNS DESCRIPTION NgffImageMeta
A new NgffImageMeta object.
Source code in fractal_tasks_core/ngff/zarr_utils.py
def load_NgffImageMeta(zarr_path: str) -> NgffImageMeta:\n\"\"\"\n Load the attributes of a zarr group and cast them to `NgffImageMeta`.\n\n Args:\n zarr_path: Path to the zarr group.\n\n Returns:\n A new `NgffImageMeta` object.\n \"\"\"\n try:\n zarr_group = zarr.open_group(zarr_path, mode=\"r\")\n except GroupNotFoundError:\n error_msg = (\n \"Could not load attributes for the requested image, \"\n f\"because no Zarr group was found at {zarr_path}\"\n )\n logging.error(error_msg)\n raise ZarrGroupNotFoundError(error_msg)\n zarr_attrs = zarr_group.attrs.asdict()\n try:\n return NgffImageMeta(**zarr_attrs)\n except Exception as e:\n logging.error(\n f\"Contents of {zarr_path} cannot be cast to NgffImageMeta.\\n\"\n f\"Original error:\\n{str(e)}\"\n )\n raise e\n
Load the attributes of a zarr group and cast them to NgffPlateMeta.
PARAMETER DESCRIPTION zarr_path
Path to the zarr group.
TYPE: str
RETURNS DESCRIPTION NgffPlateMeta
A new NgffPlateMeta object.
Source code in fractal_tasks_core/ngff/zarr_utils.py
def load_NgffPlateMeta(zarr_path: str) -> NgffPlateMeta:\n\"\"\"\n Load the attributes of a zarr group and cast them to `NgffPlateMeta`.\n\n Args:\n zarr_path: Path to the zarr group.\n\n Returns:\n A new `NgffPlateMeta` object.\n \"\"\"\n try:\n zarr_group = zarr.open_group(zarr_path, mode=\"r\")\n except GroupNotFoundError:\n error_msg = (\n \"Could not load attributes for the requested plate, \"\n f\"because no Zarr group was found at {zarr_path}\"\n )\n logging.error(error_msg)\n raise ZarrGroupNotFoundError(error_msg)\n zarr_attrs = zarr_group.attrs.asdict()\n try:\n return NgffPlateMeta(**zarr_attrs)\n except Exception as e:\n logging.error(\n f\"Contents of {zarr_path} cannot be cast to NgffPlateMeta.\\n\"\n f\"Original error:\\n{str(e)}\"\n )\n raise e\n
Load the attributes of a zarr group and cast them to NgffWellMeta.
PARAMETER DESCRIPTION zarr_path
Path to the zarr group.
TYPE: str
RETURNS DESCRIPTION NgffWellMeta
A new NgffWellMeta object.
Source code in fractal_tasks_core/ngff/zarr_utils.py
def load_NgffWellMeta(zarr_path: str) -> NgffWellMeta:\n\"\"\"\n Load the attributes of a zarr group and cast them to `NgffWellMeta`.\n\n Args:\n zarr_path: Path to the zarr group.\n\n Returns:\n A new `NgffWellMeta` object.\n \"\"\"\n try:\n zarr_group = zarr.open_group(zarr_path, mode=\"r\")\n except GroupNotFoundError:\n error_msg = (\n \"Could not load attributes for the requested well, \"\n f\"because no Zarr group was found at {zarr_path}\"\n )\n logging.error(error_msg)\n raise ZarrGroupNotFoundError(error_msg)\n zarr_attrs = zarr_group.attrs.asdict()\n try:\n return NgffWellMeta(**zarr_attrs)\n except Exception as e:\n logging.error(\n f\"Contents of {zarr_path} cannot be cast to NgffWellMeta.\\n\"\n f\"Original error:\\n{str(e)}\"\n )\n raise e\n
Given two integer intervals, find whether they overlap.
This is the same as is_overlapping_1D (based on https://stackoverflow.com/a/70023212/19085332), for integer-valued intervals.
PARAMETER DESCRIPTION line1
The boundaries of the first interval, written as [x_min, x_max].
TYPE: Sequence[int]
line2
The boundaries of the second interval, written as [x_min, x_max].
TYPE: Sequence[int]
Source code in fractal_tasks_core/roi/_overlaps_common.py
def _is_overlapping_1D_int(\n line1: Sequence[int],\n line2: Sequence[int],\n) -> bool:\n\"\"\"\n Given two integer intervals, find whether they overlap.\n\n This is the same as `is_overlapping_1D` (based on\n https://stackoverflow.com/a/70023212/19085332), for integer-valued\n intervals.\n\n Args:\n line1: The boundaries of the first interval, written as\n `[x_min, x_max]`.\n line2: The boundaries of the second interval, written as\n `[x_min, x_max]`.\n \"\"\"\n return line1[0] < line2[1] and line2[0] < line1[1]\n
Given two three-dimensional integer boxes, find whether they overlap.
This is the same as is_overlapping_3D (based on https://stackoverflow.com/a/70023212/19085332), for integer-valued boxes.
PARAMETER DESCRIPTION box1
The boundaries of the first box, written as [x_min, y_min, z_min, x_max, y_max, z_max].
TYPE: list[int]
box2
The boundaries of the second box, written as [x_min, y_min, z_min, x_max, y_max, z_max].
TYPE: list[int]
Source code in fractal_tasks_core/roi/_overlaps_common.py
def _is_overlapping_3D_int(box1: list[int], box2: list[int]) -> bool:\n\"\"\"\n Given two three-dimensional integer boxes, find whether they overlap.\n\n This is the same as is_overlapping_3D (based on\n https://stackoverflow.com/a/70023212/19085332), for integer-valued\n boxes.\n\n Args:\n box1: The boundaries of the first box, written as\n `[x_min, y_min, z_min, x_max, y_max, z_max]`.\n box2: The boundaries of the second box, written as\n `[x_min, y_min, z_min, x_max, y_max, z_max]`.\n \"\"\"\n overlap_x = _is_overlapping_1D_int([box1[0], box1[3]], [box2[0], box2[3]])\n overlap_y = _is_overlapping_1D_int([box1[1], box1[4]], [box2[1], box2[4]])\n overlap_z = _is_overlapping_1D_int([box1[2], box1[5]], [box2[2], box2[5]])\n return overlap_x and overlap_y and overlap_z\n
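Because the integer comparisons are strict, two boxes that merely share a face are not considered overlapping. The following sketch re-implements the two integer checks under hypothetical names and evaluates them on example boxes written as `[x_min, y_min, z_min, x_max, y_max, z_max]`.

```python
from typing import Sequence


def overlap_1d_int(line1: Sequence[int], line2: Sequence[int]) -> bool:
    """Strict overlap test for two intervals [x_min, x_max]."""
    return line1[0] < line2[1] and line2[0] < line1[1]


def overlap_3d_int(box1: Sequence[int], box2: Sequence[int]) -> bool:
    """Boxes overlap only if their intervals overlap along x, y and z."""
    return all(
        overlap_1d_int([box1[i], box1[i + 3]], [box2[i], box2[i + 3]])
        for i in range(3)
    )


# Two boxes sharing a corner region, and two boxes that only touch at x=10
intersecting = overlap_3d_int([0, 0, 0, 10, 10, 10], [5, 5, 5, 15, 15, 15])
touching = overlap_3d_int([0, 0, 0, 10, 10, 10], [10, 0, 0, 20, 10, 10])
```

In this example `intersecting` is true and `touching` is false, which is the behavior one typically wants for adjacent fields of view.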
Given two intervals, finds whether they overlap.
This is based on https://stackoverflow.com/a/70023212/19085332, and we additionally use a finite tolerance for floating-point comparisons.
PARAMETER DESCRIPTION line1
The boundaries of the first interval, written as [x_min, x_max].
TYPE: Sequence[float]
line2
The boundaries of the second interval, written as [x_min, x_max].
TYPE: Sequence[float]
tol
Finite tolerance for floating-point comparisons.
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/_overlaps_common.py
def is_overlapping_1D(\n line1: Sequence[float], line2: Sequence[float], tol: float = 1e-10\n) -> bool:\n\"\"\"\n Given two intervals, finds whether they overlap.\n\n This is based on https://stackoverflow.com/a/70023212/19085332, and we\n additionally use a finite tolerance for floating-point comparisons.\n\n Args:\n line1: The boundaries of the first interval, written as\n `[x_min, x_max]`.\n line2: The boundaries of the second interval, written as\n `[x_min, x_max]`.\n tol: Finite tolerance for floating-point comparisons.\n \"\"\"\n return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol\n
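The effect of the tolerance can be illustrated with a small standalone sketch. The name `overlap_1d_with_tol` is hypothetical and simply copies the comparison from the listing above. With the default tolerance, intervals that touch at an endpoint, or that intrude by less than `tol`, are not counted as overlapping; with `tol=0` the non-strict inequalities would report touching intervals as overlapping.

```python
from typing import Sequence


def overlap_1d_with_tol(
    line1: Sequence[float], line2: Sequence[float], tol: float = 1e-10
) -> bool:
    # Same comparison as in the source listing
    return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol


touching = overlap_1d_with_tol([0.0, 1.0], [1.0, 2.0])
# An intrusion of 1e-12 is smaller than the default tolerance of 1e-10
tiny_intrusion = overlap_1d_with_tol([0.0, 1.0 + 1e-12], [1.0, 2.0])
clear_overlap = overlap_1d_with_tol([0.0, 1.5], [1.0, 2.0])
# Without a tolerance, touching intervals satisfy both `<=` comparisons
touching_no_tol = overlap_1d_with_tol([0.0, 1.0], [1.0, 2.0], tol=0.0)
```

In this sketch `touching` and `tiny_intrusion` are `False`, while `clear_overlap` and `touching_no_tol` are `True`.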
Given two rectangular boxes, finds whether they overlap.
This is based on https://stackoverflow.com/a/70023212/19085332, and we additionally use a finite tolerance for floating-point comparisons.
PARAMETER DESCRIPTION box1
The boundaries of the first rectangle, written as [x_min, y_min, x_max, y_max].
TYPE: Sequence[float]
box2
The boundaries of the second rectangle, written as [x_min, y_min, x_max, y_max].
TYPE: Sequence[float]
tol
Finite tolerance for floating-point comparisons.
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/_overlaps_common.py
def is_overlapping_2D(\n box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10\n) -> bool:\n\"\"\"\n Given two rectangular boxes, finds whether they overlap.\n\n This is based on https://stackoverflow.com/a/70023212/19085332, and we\n additionally use a finite tolerance for floating-point comparisons.\n\n Args:\n box1: The boundaries of the first rectangle, written as\n `[x_min, y_min, x_max, y_max]`.\n box2: The boundaries of the second rectangle, written as\n `[x_min, y_min, x_max, y_max]`.\n tol: Finite tolerance for floating-point comparisons.\n \"\"\"\n overlap_x = is_overlapping_1D(\n [box1[0], box1[2]], [box2[0], box2[2]], tol=tol\n )\n overlap_y = is_overlapping_1D(\n [box1[1], box1[3]], [box2[1], box2[3]], tol=tol\n )\n return overlap_x and overlap_y\n
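As a rough illustration of how the 2D check combines the per-axis results, the sketch below re-implements both helpers under the hypothetical names `overlap_1d` and `overlap_2d` (so that it runs on its own) and applies them to a few rectangles written as `[x_min, y_min, x_max, y_max]`. Two rectangles overlap only if their projections overlap on both X and Y.

```python
from typing import Sequence


def overlap_1d(a: Sequence[float], b: Sequence[float], tol: float = 1e-10) -> bool:
    return a[0] <= b[1] - tol and b[0] <= a[1] - tol


def overlap_2d(
    box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10
) -> bool:
    # Boxes are [x_min, y_min, x_max, y_max]; both projections must overlap
    overlap_x = overlap_1d([box1[0], box1[2]], [box2[0], box2[2]], tol=tol)
    overlap_y = overlap_1d([box1[1], box1[3]], [box2[1], box2[3]], tol=tol)
    return overlap_x and overlap_y


fov_a = [0.0, 0.0, 100.0, 100.0]
side_by_side = overlap_2d(fov_a, [100.0, 0.0, 200.0, 100.0])  # shared edge only
shifted = overlap_2d(fov_a, [90.0, 50.0, 190.0, 150.0])  # overlaps on X and Y
x_only = overlap_2d(fov_a, [50.0, 200.0, 150.0, 300.0])  # overlaps on X, not Y
```

Only `shifted` is `True`; FOVs that share an edge, or that overlap along a single axis, are not flagged.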
Given two three-dimensional boxes, finds whether they overlap.
This is based on https://stackoverflow.com/a/70023212/19085332, and we additionally use a finite tolerance for floating-point comparisons.
PARAMETER DESCRIPTION box1
The boundaries of the first box, written as [x_min, y_min, z_min, x_max, y_max, z_max].
TYPE: Sequence[float]
box2
The boundaries of the second box, written as [x_min, y_min, z_min, x_max, y_max, z_max].
TYPE: Sequence[float]
tol
Finite tolerance for floating-point comparisons.
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/_overlaps_common.py
def is_overlapping_3D(\n box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10\n) -> bool:\n\"\"\"\n Given two three-dimensional boxes, finds whether they overlap.\n\n This is based on https://stackoverflow.com/a/70023212/19085332, and we\n additionally use a finite tolerance for floating-point comparisons.\n\n Args:\n box1: The boundaries of the first box, written as\n `[x_min, y_min, z_min, x_max, y_max, z_max]`.\n box2: The boundaries of the second box, written as\n `[x_min, y_min, z_min, x_max, y_max, z_max]`.\n tol: Finite tolerance for floating-point comparisons.\n \"\"\"\n\n overlap_x = is_overlapping_1D(\n [box1[0], box1[3]], [box2[0], box2[3]], tol=tol\n )\n overlap_y = is_overlapping_1D(\n [box1[1], box1[4]], [box2[1], box2[4]], tol=tol\n )\n overlap_z = is_overlapping_1D(\n [box1[2], box1[5]], [box2[2], box2[5]], tol=tol\n )\n return overlap_x and overlap_y and overlap_z\n
Load a region from a dask array.
Handles both 2D and 3D dask arrays as input, returning the region either with the input dimensionality or, if requested, always as a 3D array.
PARAMETER DESCRIPTION data_zyx
Dask array (2D or 3D).
TYPE: Array
region
Region to load, tuple of three slices (ZYX).
TYPE: tuple[slice, slice, slice]
compute
Whether to compute the result. If True, returns a numpy array. If False, returns a dask array.
TYPE: bool DEFAULT: True
return_as_3D
Whether to return a 3D array, even if the input is 2D.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION Union[Array, ndarray]
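The loaded region: a numpy array if compute is True, otherwise a dask array. The result is 3D if the input is 3D or if return_as_3D is True; otherwise it is 2D.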
Source code in fractal_tasks_core/roi/load_region.py
def load_region(\n data_zyx: da.Array,\n region: tuple[slice, slice, slice],\n compute: bool = True,\n return_as_3D: bool = False,\n) -> Union[da.Array, np.ndarray]:\n\"\"\"\n Load a region from a dask array.\n\n Can handle both 2D and 3D dask arrays as input and return them as is or\n always as a 3D array.\n\n Args:\n data_zyx: Dask array (2D or 3D).\n region: Region to load, tuple of three slices (ZYX).\n compute: Whether to compute the result. If `True`, returns a numpy\n array. If `False`, returns a dask array.\n return_as_3D: Whether to return a 3D array, even if the input is 2D.\n\n Returns:\n 3D array.\n \"\"\"\n\n if len(region) != 3:\n raise ValueError(\n f\"In `load_region`, `region` must have three elements \"\n f\"(given: {len(region)}).\"\n )\n\n if len(data_zyx.shape) == 3:\n img = data_zyx[region]\n elif len(data_zyx.shape) == 2:\n img = data_zyx[(region[1], region[2])]\n if return_as_3D:\n img = np.expand_dims(img, axis=0)\n else:\n raise ValueError(\n f\"Shape {data_zyx.shape} not supported for `load_region`\"\n )\n if compute:\n return img.compute()\n else:\n return img\n
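The way `region` is applied depends on the dimensionality of the input. The following dependency-free sketch isolates that step; `region_to_index` is a hypothetical helper that is not part of the library, and it only builds the index tuple rather than loading any data. For a 2D array the Z slice is dropped and only the YX slices are used.

```python
def region_to_index(ndim: int, region: tuple) -> tuple:
    """Build the index tuple that would be applied to a 2D or 3D array."""
    if len(region) != 3:
        raise ValueError(f"`region` must have three elements (given: {len(region)})")
    if ndim == 3:
        return tuple(region)
    if ndim == 2:
        # The Z slice is ignored for 2D input
        return (region[1], region[2])
    raise ValueError(f"{ndim}-dimensional arrays are not supported")


region = (slice(0, 1), slice(10, 20), slice(30, 40))
index_3d = region_to_index(3, region)
index_2d = region_to_index(2, region)
```

Here `index_3d` is the full ZYX tuple, while `index_2d` contains only `slice(10, 20)` and `slice(30, 40)`.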
Construct bounding-box ROI table for a mask array.
PARAMETER DESCRIPTION mask_array
Original (label) array from which to construct bounding boxes; only positive values are treated as labels.
TYPE: ndarray
pxl_sizes_zyx
Physical-unit pixel ZYX sizes.
TYPE: list[float]
origin_zyx
Shift ROI origin by this amount of ZYX pixels.
TYPE: tuple[int, int, int] DEFAULT: (0, 0, 0)
RETURNS DESCRIPTION DataFrame
DataFrame with each row representing the bounding-box ROI that corresponds to a unique value of mask_array. ROI properties are expressed in physical units (with columns defined as elsewhere in this module - see e.g. prepare_well_ROI_table), and positions are optionally shifted (if origin_zyx is set). An additional column label keeps track of the mask_array value corresponding to each ROI.
Source code in fractal_tasks_core/roi/v1.py
def array_to_bounding_box_table(\n mask_array: np.ndarray,\n pxl_sizes_zyx: list[float],\n origin_zyx: tuple[int, int, int] = (0, 0, 0),\n) -> pd.DataFrame:\n\"\"\"\n Construct bounding-box ROI table for a mask array.\n\n Args:\n mask_array: Original array to construct bounding boxes.\n pxl_sizes_zyx: Physical-unit pixel ZYX sizes.\n origin_zyx: Shift ROI origin by this amount of ZYX pixels.\n\n Returns:\n DataFrame with each line representing the bounding-box ROI that\n corresponds to a unique value of `mask_array`. ROI properties are\n expressed in physical units (with columns defined as elsewhere this\n module - see e.g. `prepare_well_ROI_table`), and positions are\n optionally shifted (if `origin_zyx` is set). An additional column\n `label` keeps track of the `mask_array` value corresponding to each\n ROI.\n \"\"\"\n\n pxl_sizes_zyx_array = np.array(pxl_sizes_zyx)\n z_origin, y_origin, x_origin = origin_zyx[:]\n\n labels = np.unique(mask_array)\n labels = labels[labels > 0]\n elem_list = []\n for label in labels:\n # Compute bounding box\n label_match = np.where(mask_array == label)\n zmin, ymin, xmin = np.min(label_match, axis=1) * pxl_sizes_zyx_array\n zmax, ymax, xmax = (\n np.max(label_match, axis=1) + 1\n ) * pxl_sizes_zyx_array\n\n # Compute bounding-box edges\n length_x = xmax - xmin\n length_y = ymax - ymin\n length_z = zmax - zmin\n\n # Shift origin\n zmin += z_origin * pxl_sizes_zyx[0]\n ymin += y_origin * pxl_sizes_zyx[1]\n xmin += x_origin * pxl_sizes_zyx[2]\n\n elem_list.append((xmin, ymin, zmin, length_x, length_y, length_z))\n\n df_columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n\n if len(elem_list) == 0:\n df = pd.DataFrame(columns=[x for x in df_columns] + [\"label\"])\n else:\n df = pd.DataFrame(np.array(elem_list), columns=df_columns)\n df[\"label\"] = labels\n\n return df\n
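To make the bounding-box arithmetic concrete, here is a simplified sketch that works on a nested list instead of a NumPy array and returns a plain dictionary instead of a DataFrame. The name `bounding_boxes` and the output layout are assumptions made for illustration, and the `origin_zyx` shift is left out. As in the listing above, the upper edge of each box is `max index + 1`, so a label covering a single pixel has an edge length of one pixel.

```python
def bounding_boxes(mask, pxl_sizes_zyx):
    """Return {label: (x, y, z, len_x, len_y, len_z)} in physical units."""
    extents = {}  # label -> [zmin, ymin, xmin, zmax, ymax, xmax], in pixels
    for z, plane in enumerate(mask):
        for y, row in enumerate(plane):
            for x, label in enumerate(row):
                if label <= 0:
                    continue  # background is skipped, as with `labels > 0`
                idx = (z, y, x)
                box = extents.setdefault(label, [z, y, x, z + 1, y + 1, x + 1])
                for axis in range(3):
                    box[axis] = min(box[axis], idx[axis])
                    box[axis + 3] = max(box[axis + 3], idx[axis] + 1)

    size_z, size_y, size_x = pxl_sizes_zyx
    table = {}
    for label, (z0, y0, x0, z1, y1, x1) in sorted(extents.items()):
        table[label] = (
            x0 * size_x,
            y0 * size_y,
            z0 * size_z,
            (x1 - x0) * size_x,
            (y1 - y0) * size_y,
            (z1 - z0) * size_z,
        )
    return table


# One Z plane, three rows, five columns
mask = [
    [
        [0, 1, 1, 0, 0],
        [0, 1, 0, 0, 2],
        [0, 0, 0, 0, 2],
    ]
]
boxes = bounding_boxes(mask, pxl_sizes_zyx=[1.0, 0.5, 0.5])
```

With these inputs, label 1 spans columns 1 to 2 and rows 0 to 1, giving `(0.5, 0.0, 0.0, 1.0, 1.0, 1.0)`, and label 2 gives `(2.0, 0.5, 0.0, 0.5, 1.0, 1.0)`.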
Nested list of indices. The main list has one item per ROI. Each ROI item is a list of six integers as in [start_z, end_z, start_y, end_y, start_x, end_x]. The array-index interval for a given ROI is start_x:end_x along X, and so on for Y and Z.
Source code in fractal_tasks_core/roi/v1.py
def convert_ROI_table_to_indices(\n ROI: ad.AnnData,\n full_res_pxl_sizes_zyx: Sequence[float],\n level: int = 0,\n coarsening_xy: int = 2,\n cols_xyz_pos: Sequence[str] = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n ],\n cols_xyz_len: Sequence[str] = [\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ],\n) -> list[list[int]]:\n\"\"\"\n Convert a ROI AnnData table into integer array indices.\n\n Args:\n ROI: AnnData table with list of ROIs.\n full_res_pxl_sizes_zyx:\n Physical-unit pixel ZYX sizes at the full-resolution pyramid level.\n level: Pyramid level.\n coarsening_xy: Linear coarsening factor in the YX plane.\n cols_xyz_pos: Column names for XYZ ROI positions.\n cols_xyz_len: Column names for XYZ ROI edges.\n\n Raises:\n ValueError:\n If any of the array indices is negative.\n\n Returns:\n Nested list of indices. The main list has one item per ROI. Each ROI\n item is a list of six integers as in `[start_z, end_z, start_y,\n end_y, start_x, end_x]`. The array-index interval for a given ROI\n is `start_x:end_x` along X, and so on for Y and Z.\n \"\"\"\n # Handle empty ROI table\n if len(ROI) == 0:\n return []\n\n # Set pyramid-level pixel sizes\n pxl_size_z, pxl_size_y, pxl_size_x = full_res_pxl_sizes_zyx\n prefactor = coarsening_xy**level\n pxl_size_x *= prefactor\n pxl_size_y *= prefactor\n\n x_pos, y_pos, z_pos = cols_xyz_pos[:]\n x_len, y_len, z_len = cols_xyz_len[:]\n\n list_indices = []\n for ROI_name in ROI.obs_names:\n # Extract data from anndata table\n x_micrometer = ROI[ROI_name, x_pos].X[0, 0]\n y_micrometer = ROI[ROI_name, y_pos].X[0, 0]\n z_micrometer = ROI[ROI_name, z_pos].X[0, 0]\n len_x_micrometer = ROI[ROI_name, x_len].X[0, 0]\n len_y_micrometer = ROI[ROI_name, y_len].X[0, 0]\n len_z_micrometer = ROI[ROI_name, z_len].X[0, 0]\n\n # Identify indices along the three dimensions\n start_x = x_micrometer / pxl_size_x\n end_x = (x_micrometer + len_x_micrometer) / pxl_size_x\n start_y = y_micrometer / pxl_size_y\n end_y = (y_micrometer + len_y_micrometer) / pxl_size_y\n start_z = z_micrometer / pxl_size_z\n end_z = (z_micrometer + len_z_micrometer) / pxl_size_z\n indices = [start_z, end_z, start_y, end_y, start_x, end_x]\n\n # Round indices to the nearest integer\n indices = list(map(round, indices))\n\n # Fail for negative indices\n if min(indices) < 0:\n raise ValueError(\n f\"ROI {ROI_name} converted into negative array indices.\\n\"\n f\"ZYX position: {z_micrometer}, {y_micrometer}, \"\n f\"{x_micrometer}\\n\"\n f\"ZYX pixel sizes: {pxl_size_z}, {pxl_size_y}, \"\n f\"{pxl_size_x} ({level=})\\n\"\n \"Hint: As of fractal-tasks-core v0.12, FOV/well ROI \"\n \"tables with non-zero origins (e.g. the ones created with \"\n \"v0.11) are not supported.\"\n )\n\n # Append ROI indices to list\n list_indices.append(indices[:])\n\n return list_indices\n
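The conversion from physical units to array indices can be followed with a worked example. The sketch below handles a single ROI given as plain numbers rather than an AnnData table; `roi_to_indices` is a hypothetical name, and the arithmetic mirrors the listing above. The XY pixel sizes are multiplied by `coarsening_xy**level`, while the Z pixel size is left unchanged, and the results are rounded with Python's built-in `round`.

```python
def roi_to_indices(
    x, y, z, len_x, len_y, len_z,
    full_res_pxl_sizes_zyx,
    level=0,
    coarsening_xy=2,
):
    """Return [start_z, end_z, start_y, end_y, start_x, end_x] for one ROI."""
    pxl_size_z, pxl_size_y, pxl_size_x = full_res_pxl_sizes_zyx
    # Pixels get larger along X and Y at coarser pyramid levels
    prefactor = coarsening_xy**level
    pxl_size_x *= prefactor
    pxl_size_y *= prefactor

    indices = [
        z / pxl_size_z,
        (z + len_z) / pxl_size_z,
        y / pxl_size_y,
        (y + len_y) / pxl_size_y,
        x / pxl_size_x,
        (x + len_x) / pxl_size_x,
    ]
    indices = [round(value) for value in indices]
    if min(indices) < 0:
        raise ValueError("ROI converted into negative array indices.")
    return indices


roi = dict(x=128.0, y=0.0, z=0.0, len_x=128.0, len_y=64.0, len_z=10.0)
pixel_sizes = (1.0, 0.5, 0.5)
indices_level_0 = roi_to_indices(**roi, full_res_pxl_sizes_zyx=pixel_sizes)
indices_level_2 = roi_to_indices(**roi, full_res_pxl_sizes_zyx=pixel_sizes, level=2)
```

At level 0 the ROI maps to `[0, 10, 0, 128, 256, 512]`; at level 2 the XY pixel size is 2.0, so the same ROI maps to `[0, 10, 0, 32, 64, 128]` and the Z indices are unchanged.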
Note that this function is only relevant when the ROIs in adata span the whole extent of the Z axis. TODO: check this explicitly.
PARAMETER DESCRIPTION adata
AnnData table of ROIs to be converted (marked TBD in the source docstring).
TYPE: AnnData
pixel_size_z
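Z pixel size in physical units; in the code below it becomes the len_z_micrometer value of every ROI (marked TBD in the source docstring).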
TYPE: float
Source code in fractal_tasks_core/roi/v1.py
def convert_ROIs_from_3D_to_2D(\n adata: ad.AnnData,\n pixel_size_z: float,\n) -> ad.AnnData:\n\"\"\"\n TBD\n\n Note that this function is only relevant when the ROIs in adata span the\n whole extent of the Z axis.\n TODO: check this explicitly.\n\n Args:\n adata: TBD\n pixel_size_z: TBD\n \"\"\"\n\n # Compress a 3D stack of images to a single Z plane,\n # with thickness equal to pixel_size_z\n df = adata.to_df()\n df[\"len_z_micrometer\"] = pixel_size_z\n\n # Assign dtype explicitly, to avoid\n # >> UserWarning: X converted to numpy array with dtype float64\n # when creating AnnData object\n df = df.astype(np.float32)\n\n # Create an AnnData object directly from the DataFrame\n new_adata = ad.AnnData(X=df)\n\n # Rename rows and columns\n new_adata.obs_names = adata.obs_names\n new_adata.var_names = list(map(str, df.columns))\n\n return new_adata\n
Construct an empty bounding-box ROI table of given shape.
This function mirrors the functionality of array_to_bounding_box_table, for the specific case where the array includes no label. The advantages of this function are that:
It does not require computing a whole array of zeros;
We avoid hardcoding column names in the task functions.
RETURNS DESCRIPTION DataFrame
DataFrame with no rows, and with columns corresponding to the output of array_to_bounding_box_table.
Source code in fractal_tasks_core/roi/v1.py
def empty_bounding_box_table() -> pd.DataFrame:\n\"\"\"\n Construct an empty bounding-box ROI table of given shape.\n\n This function mirrors the functionality of `array_to_bounding_box_table`,\n for the specific case where the array includes no label. The advantages of\n this function are that:\n\n 1. It does not require computing a whole array of zeros;\n 2. We avoid hardcoding column names in the task functions.\n\n Returns:\n DataFrame with no rows, and with columns corresponding to the output of\n `array_to_bounding_box_table`.\n \"\"\"\n\n df_columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n df = pd.DataFrame(columns=[x for x in df_columns] + [\"label\"])\n return df\n
Produce a table with ROIs placed on a rectangular grid.
The main goal of this ROI grid is to allow processing of smaller subsets of the whole array.
In a specific case (that is, if the image array was obtained by stitching together a set of FOVs placed on a regular grid), the ROIs correspond to the original FOVs.
TODO: make this flexible with respect to the presence/absence of Z.
PARAMETER DESCRIPTION array_shape
ZYX shape of the image array.
TYPE: tuple[int, int, int]
pixels_ZYX
ZYX pixel sizes in micrometers.
TYPE: list[float]
grid_YX_shape
Number of grid cells (ROIs) along Y and X.
TYPE: tuple[int, int]
RETURNS DESCRIPTION AnnData
An AnnData table with one ROI per grid cell (grid_YX_shape[0] * grid_YX_shape[1] ROIs in total).
Source code in fractal_tasks_core/roi/v1.py
def get_image_grid_ROIs(\n array_shape: tuple[int, int, int],\n pixels_ZYX: list[float],\n grid_YX_shape: tuple[int, int],\n) -> ad.AnnData:\n\"\"\"\n Produce a table with ROIS placed on a rectangular grid.\n\n The main goal of this ROI grid is to allow processing of smaller subset of\n the whole array.\n\n In a specific case (that is, if the image array was obtained by stitching\n together a set of FOVs placed on a regular grid), the ROIs correspond to\n the original FOVs.\n\n TODO: make this flexible with respect to the presence/absence of Z.\n\n Args:\n array_shape: ZYX shape of the image array.\n pixels_ZYX: ZYX pixel sizes in micrometers.\n grid_YX_shape:\n\n Returns:\n An `AnnData` table with a single ROI.\n \"\"\"\n shape_z, shape_y, shape_x = array_shape[-3:]\n grid_size_y, grid_size_x = grid_YX_shape[:]\n X = []\n obs_names = []\n counter = 0\n start_z = 0\n len_z = shape_z\n\n # Find minimal len_y that covers [0,shape_y] with grid_size_y intervals\n len_y = math.ceil(shape_y / grid_size_y)\n len_x = math.ceil(shape_x / grid_size_x)\n for ind_y in range(grid_size_y):\n start_y = ind_y * len_y\n tmp_len_y = min(shape_y, start_y + len_y) - start_y\n for ind_x in range(grid_size_x):\n start_x = ind_x * len_x\n tmp_len_x = min(shape_x, start_x + len_x) - start_x\n X.append(\n [\n start_x * pixels_ZYX[2],\n start_y * pixels_ZYX[1],\n start_z * pixels_ZYX[0],\n tmp_len_x * pixels_ZYX[2],\n tmp_len_y * pixels_ZYX[1],\n len_z * pixels_ZYX[0],\n ]\n )\n counter += 1\n obs_names.append(f\"ROI_{counter}\")\n ROI_table = ad.AnnData(X=np.array(X, dtype=np.float32))\n ROI_table.obs_names = obs_names\n ROI_table.var_names = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n return ROI_table\n
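The grid construction is easiest to follow on an array whose shape is not a multiple of the grid shape. The sketch below reproduces the arithmetic of the listing above with plain Python containers; `grid_rois` is a hypothetical name and the dictionary output stands in for the AnnData table. Each cell has the nominal size `ceil(shape / grid_size)`, and the last cell along each axis is truncated at the array boundary.

```python
import math


def grid_rois(array_shape, pixels_zyx, grid_yx_shape):
    """Return {ROI name: (x, y, z, len_x, len_y, len_z)} in physical units."""
    shape_z, shape_y, shape_x = array_shape[-3:]
    grid_size_y, grid_size_x = grid_yx_shape
    # Minimal cell size that covers the whole axis with the requested cells
    len_y = math.ceil(shape_y / grid_size_y)
    len_x = math.ceil(shape_x / grid_size_x)

    rois = {}
    counter = 0
    for ind_y in range(grid_size_y):
        start_y = ind_y * len_y
        tmp_len_y = min(shape_y, start_y + len_y) - start_y
        for ind_x in range(grid_size_x):
            start_x = ind_x * len_x
            tmp_len_x = min(shape_x, start_x + len_x) - start_x
            counter += 1
            rois[f"ROI_{counter}"] = (
                start_x * pixels_zyx[2],
                start_y * pixels_zyx[1],
                0.0,
                tmp_len_x * pixels_zyx[2],
                tmp_len_y * pixels_zyx[1],
                shape_z * pixels_zyx[0],
            )
    return rois


rois = grid_rois(
    array_shape=(3, 10, 7), pixels_zyx=[1.0, 0.5, 0.5], grid_yx_shape=(2, 2)
)
```

With 7 pixels along X and two columns, the nominal cell width is 4 pixels, so the second column is only 3 pixels wide: `ROI_1` is `(0.0, 0.0, 0.0, 2.0, 2.5, 3.0)` and `ROI_2` is `(2.0, 0.0, 0.0, 1.5, 2.5, 3.0)`.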
Produce a table with a single ROI that covers the whole array.
TODO: make this flexible with respect to the presence/absence of Z.
PARAMETER DESCRIPTION array_shape
ZYX shape of the image array.
TYPE: tuple[int, int, int]
pixels_ZYX
ZYX pixel sizes in micrometers.
TYPE: list[float]
RETURNS DESCRIPTION AnnData
An AnnData table with a single ROI.
Source code in fractal_tasks_core/roi/v1.py
def get_single_image_ROI(\n array_shape: tuple[int, int, int],\n pixels_ZYX: list[float],\n) -> ad.AnnData:\n\"\"\"\n Produce a table with a single ROI that covers the whole array\n\n TODO: make this flexible with respect to the presence/absence of Z.\n\n Args:\n array_shape: ZYX shape of the image array.\n pixels_ZYX: ZYX pixel sizes in micrometers.\n\n Returns:\n An `AnnData` table with a single ROI.\n \"\"\"\n shape_z, shape_y, shape_x = array_shape[-3:]\n ROI_table = ad.AnnData(\n X=np.array(\n [\n [\n 0.0,\n 0.0,\n 0.0,\n shape_x * pixels_ZYX[2],\n shape_y * pixels_ZYX[1],\n shape_z * pixels_ZYX[0],\n ],\n ],\n dtype=np.float32,\n )\n )\n ROI_table.obs_names = [\"image_1\"]\n ROI_table.var_names = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n return ROI_table\n
Return True if the table name contains the name of one of the standard Fractal ROI tables.
If a table name is well_ROI_table, FOV_ROI_table or contains either of the two (e.g. registered_FOV_ROI_table), this function returns True.
PARAMETER DESCRIPTION table
table name
TYPE: str
RETURNS DESCRIPTION bool
Whether the table is a standard ROI table.
Source code in fractal_tasks_core/roi/v1.py
def is_standard_roi_table(table: str) -> bool:\n\"\"\"\n True if the name of the table contains one of the standard Fractal tables\n\n If a table name is well_ROI_table, FOV_ROI_table or contains either of the\n two (e.g. registered_FOV_ROI_table), this function returns True.\n\n Args:\n table: table name\n\n Returns:\n bool of whether it's a standard ROI table\n\n \"\"\"\n if \"well_ROI_table\" in table:\n return True\n elif \"FOV_ROI_table\" in table:\n return True\n else:\n return False\n
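Since the check is a plain substring test, a few example names show what it accepts. The sketch below uses the hypothetical name `looks_like_standard_roi_table` and condenses the same logic into one expression; note that the match is case-sensitive.

```python
def looks_like_standard_roi_table(table: str) -> bool:
    # Equivalent to the if/elif/else chain in the source listing
    return "well_ROI_table" in table or "FOV_ROI_table" in table


results = {
    name: looks_like_standard_roi_table(name)
    for name in [
        "FOV_ROI_table",
        "well_ROI_table",
        "registered_FOV_ROI_table",
        "nuclei_ROI_table",
        "fov_roi_table",
    ]
}
```

The first three names are recognized as standard ROI tables; `nuclei_ROI_table` and the lowercase `fov_roi_table` are not.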
Prepare an AnnData table for fields-of-view ROIs.
PARAMETER DESCRIPTION df
Input dataframe, possibly prepared through parse_yokogawa_metadata.
TYPE: DataFrame
metadata
Columns of df to be stored (if present) into AnnData table obs.
TYPE: tuple[str, ...] DEFAULT: ('time',)
Source code in fractal_tasks_core/roi/v1.py
def prepare_FOV_ROI_table(\n df: pd.DataFrame, metadata: tuple[str, ...] = (\"time\",)\n) -> ad.AnnData:\n\"\"\"\n Prepare an AnnData table for fields-of-view ROIs.\n\n Args:\n df:\n Input dataframe, possibly prepared through\n `parse_yokogawa_metadata`.\n metadata:\n Columns of `df` to be stored (if present) into AnnData table `obs`.\n \"\"\"\n\n # Make a local copy of the dataframe, to avoid SettingWithCopyWarning\n df = df.copy()\n\n # Convert DataFrame index to str, to avoid\n # >> ImplicitModificationWarning: Transforming to str index\n # when creating AnnData object.\n # Do this in the beginning to allow concatenation with e.g. time\n df.index = df.index.astype(str)\n\n # Obtain box size in physical units\n df = df.assign(len_x_micrometer=df.x_pixel * df.pixel_size_x)\n df = df.assign(len_y_micrometer=df.y_pixel * df.pixel_size_y)\n df = df.assign(len_z_micrometer=df.z_pixel * df.pixel_size_z)\n\n # Select only the numeric positional columns needed to define ROIs\n # (to avoid) casting things like the data column to float32\n # or to use unnecessary columns like bit_depth\n positional_columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n \"x_micrometer_original\",\n \"y_micrometer_original\",\n ]\n\n # Assign dtype explicitly, to avoid\n # >> UserWarning: X converted to numpy array with dtype float64\n # when creating AnnData object\n df_roi = df.loc[:, positional_columns].astype(np.float32)\n\n # Create an AnnData object directly from the DataFrame\n adata = ad.AnnData(X=df_roi)\n\n # Reset origin of the FOV ROI table, so that it matches with the well\n # origin\n adata = reset_origin(adata)\n\n # Save any metadata that is specified to the obs df\n for col in metadata:\n if col in df:\n # Cast all metadata to str.\n # Reason: AnnData Zarr writers don't support all pandas types.\n # e.g. 
pandas.core.arrays.datetimes.DatetimeArray can't be written\n adata.obs[col] = df[col].astype(str)\n\n # Rename rows and columns: Maintain FOV indices from the dataframe\n # (they are already enforced to be unique by Pandas and may contain\n # information for the user, as they are based on the filenames)\n adata.obs_names = \"FOV_\" + adata.obs.index\n adata.var_names = list(map(str, df_roi.columns))\n\n return adata\n
Prepare an AnnData table with a single well ROI.
PARAMETER DESCRIPTION df
Input dataframe, possibly prepared through parse_yokogawa_metadata.
TYPE: DataFrame
metadata
Columns of df to be stored (if present) into AnnData table obs.
TYPE: tuple[str, ...] DEFAULT: ('time',)
Source code in fractal_tasks_core/roi/v1.py
def prepare_well_ROI_table(\n df: pd.DataFrame, metadata: tuple[str, ...] = (\"time\",)\n) -> ad.AnnData:\n\"\"\"\n Prepare an AnnData table with a single well ROI.\n\n Args:\n df:\n Input dataframe, possibly prepared through\n `parse_yokogawa_metadata`.\n metadata:\n Columns of `df` to be stored (if present) into AnnData table `obs`.\n \"\"\"\n\n # Make a local copy of the dataframe, to avoid SettingWithCopyWarning\n df = df.copy()\n\n # Convert DataFrame index to str, to avoid\n # >> ImplicitModificationWarning: Transforming to str index\n # when creating AnnData object.\n # Do this in the beginning to allow concatenation with e.g. time\n df.index = df.index.astype(str)\n\n # Calculate bounding box extents in physical units\n for mu in [\"x\", \"y\", \"z\"]:\n # Obtain per-FOV properties in physical units.\n # NOTE: a FOV ROI is defined here as the interval [min_micrometer,\n # max_micrometer], with max_micrometer=min_micrometer+len_micrometer\n min_micrometer = df[f\"{mu}_micrometer\"]\n len_micrometer = df[f\"{mu}_pixel\"] * df[f\"pixel_size_{mu}\"]\n max_micrometer = min_micrometer + len_micrometer\n # Obtain well bounding box, in physical units\n min_min_micrometer = min_micrometer.min()\n max_max_micrometer = max_micrometer.max()\n df[f\"{mu}_micrometer\"] = min_min_micrometer\n df[f\"len_{mu}_micrometer\"] = max_max_micrometer - min_min_micrometer\n\n # Select only the numeric positional columns needed to define ROIs\n # (to avoid) casting things like the data column to float32\n # or to use unnecessary columns like bit_depth\n positional_columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n\n # Assign dtype explicitly, to avoid\n # >> UserWarning: X converted to numpy array with dtype float64\n # when creating AnnData object\n df_roi = df.iloc[0:1, :].loc[:, positional_columns].astype(np.float32)\n\n # Create an AnnData object directly from the DataFrame\n adata 
= ad.AnnData(X=df_roi)\n\n # Reset origin of the single-entry well ROI table\n adata = reset_origin(adata)\n\n # Save any metadata that is specified to the obs df\n for col in metadata:\n if col in df:\n # Cast all metadata to str.\n # Reason: AnnData Zarr writers don't support all pandas types.\n # e.g. pandas.core.arrays.datetimes.DatetimeArray can't be written\n adata.obs[col] = df[col].astype(str)\n\n # Rename rows and columns: Maintain FOV indices from the dataframe\n # (they are already enforced to be unique by Pandas and may contain\n # information for the user, as they are based on the filenames)\n adata.obs_names = \"well_\" + adata.obs.index\n adata.var_names = list(map(str, df_roi.columns))\n\n return adata\n
Return a copy of a ROI table, with shifted-to-zero origin for some columns.
PARAMETER DESCRIPTION ROI_table
Original ROI table.
TYPE: AnnData
x_pos
Name of the column with X position of ROIs.
TYPE: str DEFAULT: 'x_micrometer'
y_pos
Name of the column with Y position of ROIs.
TYPE: str DEFAULT: 'y_micrometer'
z_pos
Name of the column with Z position of ROIs.
TYPE: str DEFAULT: 'z_micrometer'
RETURNS DESCRIPTION AnnData
A copy of the ROI_table AnnData table, where values of x_pos, y_pos and z_pos columns have been shifted by their minimum values.
Source code in fractal_tasks_core/roi/v1.py
def reset_origin(\n ROI_table: ad.AnnData,\n x_pos: str = \"x_micrometer\",\n y_pos: str = \"y_micrometer\",\n z_pos: str = \"z_micrometer\",\n) -> ad.AnnData:\n\"\"\"\n Return a copy of a ROI table, with shifted-to-zero origin for some columns.\n\n Args:\n ROI_table: Original ROI table.\n x_pos: Name of the column with X position of ROIs.\n y_pos: Name of the column with Y position of ROIs.\n z_pos: Name of the column with Z position of ROIs.\n\n Returns:\n A copy of the `ROI_table` AnnData table, where values of `x_pos`,\n `y_pos` and `z_pos` columns have been shifted by their minimum\n values.\n \"\"\"\n new_table = ROI_table.copy()\n\n origin_x = min(new_table[:, x_pos].X[:, 0])\n origin_y = min(new_table[:, y_pos].X[:, 0])\n origin_z = min(new_table[:, z_pos].X[:, 0])\n\n for FOV in new_table.obs_names:\n new_table[FOV, x_pos] = new_table[FOV, x_pos].X[0, 0] - origin_x\n new_table[FOV, y_pos] = new_table[FOV, y_pos].X[0, 0] - origin_y\n new_table[FOV, z_pos] = new_table[FOV, z_pos].X[0, 0] - origin_z\n\n return new_table\n
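The origin shift can be sketched without AnnData by treating each ROI as a dictionary. The helper name `reset_origin_rows` and the list-of-dicts layout are assumptions made for illustration only. As in the listing above, only the position columns are shifted by their minimum value; length columns are left untouched, and the input is not modified.

```python
def reset_origin_rows(
    rows, pos_keys=("x_micrometer", "y_micrometer", "z_micrometer")
):
    """Return shifted copies of `rows`, so that each position column starts at 0."""
    origins = {key: min(row[key] for row in rows) for key in pos_keys}
    shifted = []
    for row in rows:
        new_row = dict(row)  # work on a copy, the input stays unchanged
        for key in pos_keys:
            new_row[key] = row[key] - origins[key]
        shifted.append(new_row)
    return shifted


fovs = [
    {"x_micrometer": 1000.0, "y_micrometer": 500.0, "z_micrometer": 0.0,
     "len_x_micrometer": 400.0},
    {"x_micrometer": 1400.0, "y_micrometer": 500.0, "z_micrometer": 0.0,
     "len_x_micrometer": 400.0},
]
shifted = reset_origin_rows(fovs)
```

After the shift, the two FOVs sit at X positions 0.0 and 400.0 and at Y position 0.0, with their 400.0 micrometer lengths unchanged.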
Verify some validity assumptions on a ROI table.
This function reflects our current working assumptions (e.g. the presence of some specific columns); this may change in future versions.
PARAMETER DESCRIPTION table
AnnData table to be checked
TYPE: AnnData
Source code in fractal_tasks_core/roi/v1_checks.py
def are_ROI_table_columns_valid(*, table: ad.AnnData) -> None:\n\"\"\"\n Verify some validity assumptions on a ROI table.\n\n This function reflects our current working assumptions (e.g. the presence\n of some specific columns); this may change in future versions.\n\n Args:\n table: AnnData table to be checked\n \"\"\"\n\n # Hard constraint: table columns must include some expected ones\n columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n for column in columns:\n if column not in table.var_names:\n raise ValueError(f\"Column {column} is not present in ROI table\")\n
Check that list of indices has zero origin on each axis.
See fractal-tasks-core issues #530 and #554.
This helper function is meant to provide informative error messages when ROI tables created with fractal-tasks-core up to v0.11 are used in v0.12. This function will be deprecated and removed as soon as the v0.11/v0.12 transition advances.
Note that only FOV_ROI_table and well_ROI_table have to fulfill this constraint, while ROI tables obtained through segmentation may have arbitrary (non-negative) indices.
PARAMETER DESCRIPTION list_indices
Output of convert_ROI_table_to_indices; each item is like [start_z, end_z, start_y, end_y, start_x, end_x].
TYPE: list[list[int]]
ROI_table_name
Name of the ROI table.
TYPE: str
RAISES DESCRIPTION ValueError
If the table name is FOV_ROI_table or well_ROI_table and the minimum values of start_x, start_y and start_z are not all zero.
Source code in fractal_tasks_core/roi/v1_checks.py
def check_valid_ROI_indices(\n list_indices: list[list[int]],\n ROI_table_name: str,\n) -> None:\n\"\"\"\n Check that list of indices has zero origin on each axis.\n\n See fractal-tasks-core issues #530 and #554.\n\n This helper function is meant to provide informative error messages when\n ROI tables created with fractal-tasks-core up to v0.11 are used in v0.12.\n This function will be deprecated and removed as soon as the v0.11/v0.12\n transition advances.\n\n Note that only `FOV_ROI_table` and `well_ROI_table` have to fulfill this\n constraint, while ROI tables obtained through segmentation may have\n arbitrary (non-negative) indices.\n\n Args:\n list_indices:\n Output of `convert_ROI_table_to_indices`; each item is like\n `[start_z, end_z, start_y, end_y, start_x, end_x]`.\n ROI_table_name: Name of the ROI table.\n\n Raises:\n ValueError:\n If the table name is `FOV_ROI_table` or `well_ROI_table` and the\n minimum value of `start_x`, `start_y` and `start_z` are not all\n zero.\n \"\"\"\n if ROI_table_name not in [\"FOV_ROI_table\", \"well_ROI_table\"]:\n # This validation function only applies to the FOV/well ROI tables\n # generated with fractal-tasks-core\n return\n\n # Find minimum index along ZYX\n min_start_z = min(item[0] for item in list_indices)\n min_start_y = min(item[2] for item in list_indices)\n min_start_x = min(item[4] for item in list_indices)\n\n # Check that minimum indices are all zero\n for ind, min_index in enumerate((min_start_z, min_start_y, min_start_x)):\n if min_index != 0:\n axis = [\"Z\", \"Y\", \"X\"][ind]\n raise ValueError(\n f\"{axis} component of ROI indices for table `{ROI_table_name}`\"\n f\" do not start with 0, but with {min_index}.\\n\"\n \"Hint: As of fractal-tasks-core v0.12, FOV/well ROI \"\n \"tables with non-zero origins (e.g. the ones created with \"\n \"v0.11) are not supported.\"\n )\n
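The behavior of this check can be summarized with three cases. The sketch below is a condensed re-implementation under the hypothetical name `check_zero_origin`, with a shortened error message; it is meant only to show which inputs pass and which raise.

```python
def check_zero_origin(list_indices, roi_table_name):
    """Raise ValueError if a FOV/well ROI table does not start at index 0."""
    if roi_table_name not in ["FOV_ROI_table", "well_ROI_table"]:
        return  # other tables (e.g. from segmentation) are not checked
    # Items are [start_z, end_z, start_y, end_y, start_x, end_x]
    minima = [min(item[pos] for item in list_indices) for pos in (0, 2, 4)]
    for axis, min_index in zip(["Z", "Y", "X"], minima):
        if min_index != 0:
            raise ValueError(
                f"{axis} component of ROI indices for table "
                f"`{roi_table_name}` starts with {min_index}, not 0."
            )


# Two adjacent FOVs whose minimum start index is 0 on every axis: passes
check_zero_origin(
    [[0, 1, 0, 100, 0, 100], [0, 1, 0, 100, 100, 200]], "FOV_ROI_table"
)

# Same kind of table with a non-zero X origin: raises
shifted_indices = [[0, 1, 0, 100, 50, 150]]
try:
    check_zero_origin(shifted_indices, "FOV_ROI_table")
    error_message = None
except ValueError as err:
    error_message = str(err)

# The same indices are accepted for a table with any other name
check_zero_origin(shifted_indices, "nuclei_ROI_table")
```

Only the second call fails, and its message names the X axis, because the smallest `start_x` is 50.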
Verify some validity assumptions on a ROI table.
This function reflects our current working assumptions (e.g. the presence of some specific columns); this may change in future versions.
If use_masks=True, we verify that the table is a valid masking_roi_table as of table specifications V1; if this check fails, use_masks should be set to False upstream in the parent function.
PARAMETER DESCRIPTION table_path
Path of the AnnData ROI table to be checked.
TYPE: str
use_masks
If True, perform some additional checks related to masked loading.
TYPE: bool
RETURNS DESCRIPTION Optional[bool]
Always None if use_masks=False, otherwise return whether the table is valid for masked loading.
Source code in fractal_tasks_core/roi/v1_checks.py
def is_ROI_table_valid(*, table_path: str, use_masks: bool) -> Optional[bool]:\n\"\"\"\n Verify some validity assumptions on a ROI table.\n\n This function reflects our current working assumptions (e.g. the presence\n of some specific columns); this may change in future versions.\n\n If `use_masks=True`, we verify that the table is a valid\n `masking_roi_table` as of table specifications V1; if this check fails,\n `use_masks` should be set to `False` upstream in the parent function.\n\n Args:\n table_path: Path of the AnnData ROI table to be checked.\n use_masks: If `True`, perform some additional checks related to\n masked loading.\n\n Returns:\n Always `None` if `use_masks=False`, otherwise return whether the table\n is valid for masked loading.\n \"\"\"\n\n table = ad.read_zarr(table_path)\n are_ROI_table_columns_valid(table=table)\n if not use_masks:\n return None\n\n # Check whether the table can be used for masked loading\n attrs = zarr.group(table_path).attrs.asdict()\n logger.info(f\"ROI table at {table_path} has attrs: {attrs}\")\n try:\n MaskingROITableAttrs(**attrs)\n logging.info(\"ROI table can be used for masked loading\")\n return True\n except ValidationError:\n logging.info(\"ROI table cannot be used for masked loading\")\n return False\n
This function is currently only used in tests and examples.
The plotting_function parameter is exposed so that other tools (see examples in this repository) may use it to show the FOV ROIs.
PARAMETER DESCRIPTION site_metadata
TBD
TYPE: DataFrame
selected_well
TBD
TYPE: str
plotting_function
TBD
TYPE: Callable
tol
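Finite tolerance for floating-point comparisons, passed on to is_overlapping_2D (marked TBD in the source docstring).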
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/v1_overlaps.py
def check_well_for_FOV_overlap(\n site_metadata: pd.DataFrame,\n selected_well: str,\n plotting_function: Callable,\n tol: float = 1e-10,\n):\n\"\"\"\n This function is currently only used in tests and examples.\n\n The `plotting_function` parameter is exposed so that other tools (see\n examples in this repository) may use it to show the FOV ROIs.\n\n Args:\n site_metadata: TBD\n selected_well: TBD\n plotting_function: TBD\n tol: TBD\n \"\"\"\n\n df = site_metadata.loc[selected_well].copy()\n df[\"xmin\"] = df[\"x_micrometer\"]\n df[\"ymin\"] = df[\"y_micrometer\"]\n df[\"xmax\"] = df[\"x_micrometer\"] + df[\"pixel_size_x\"] * df[\"x_pixel\"]\n df[\"ymax\"] = df[\"y_micrometer\"] + df[\"pixel_size_y\"] * df[\"y_pixel\"]\n\n xmin = list(df.loc[:, \"xmin\"])\n ymin = list(df.loc[:, \"ymin\"])\n xmax = list(df.loc[:, \"xmax\"])\n ymax = list(df.loc[:, \"ymax\"])\n num_lines = len(xmin)\n\n list_overlapping_FOVs = []\n for line_1 in range(num_lines):\n min_x_1, max_x_1 = [a[line_1] for a in [xmin, xmax]]\n min_y_1, max_y_1 = [a[line_1] for a in [ymin, ymax]]\n for line_2 in range(line_1):\n min_x_2, max_x_2 = [a[line_2] for a in [xmin, xmax]]\n min_y_2, max_y_2 = [a[line_2] for a in [ymin, ymax]]\n overlap = is_overlapping_2D(\n (min_x_1, min_y_1, max_x_1, max_y_1),\n (min_x_2, min_y_2, max_x_2, max_y_2),\n tol=tol,\n )\n if overlap:\n list_overlapping_FOVs.append(line_1)\n list_overlapping_FOVs.append(line_2)\n\n # Call plotting_function\n plotting_function(\n xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well\n )\n\n if len(list_overlapping_FOVs) > 0:\n # Increase values by one to switch from index to the label plotted\n return {selected_well: [x + 1 for x in list_overlapping_FOVs]}\n
Given a list of integer ROI indices, find whether there are overlaps.
PARAMETER DESCRIPTION list_indices
List of ROI indices, where each element in the list should look like [start_z, end_z, start_y, end_y, start_x, end_x].
TYPE: list[list[int]]
RETURNS DESCRIPTION Optional[tuple[int, int]]
None if no overlap was detected, otherwise a tuple with the positional indices of a pair of overlapping ROIs.
Source code in fractal_tasks_core/roi/v1_overlaps.py
def find_overlaps_in_ROI_indices(\n list_indices: list[list[int]],\n) -> Optional[tuple[int, int]]:\n\"\"\"\n Given a list of integer ROI indices, find whether there are overlaps.\n\n Args:\n list_indices: List of ROI indices, where each element in the list\n should look like\n `[start_z, end_z, start_y, end_y, start_x, end_x]`.\n\n Returns:\n `None` if no overlap was detected, otherwise a tuple with the\n positional indices of a pair of overlapping ROIs.\n \"\"\"\n\n for ind_1, ROI_1 in enumerate(list_indices):\n s_z, e_z, s_y, e_y, s_x, e_x = ROI_1[:]\n box_1 = [s_x, s_y, s_z, e_x, e_y, e_z]\n for ind_2 in range(ind_1):\n ROI_2 = list_indices[ind_2]\n s_z, e_z, s_y, e_y, s_x, e_x = ROI_2[:]\n box_2 = [s_x, s_y, s_z, e_x, e_y, e_z]\n if _is_overlapping_3D_int(box_1, box_2):\n return (ind_1, ind_2)\n return None\n
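A minimal sketch of the same search using only the standard library. It assumes that ROI indices describe half-open integer intervals `[start, end)`, as is usual for array slicing; the helper `intervals_overlap_int` is hypothetical and stands in for `_is_overlapping_3D_int`, which is not shown here.

```python
def intervals_overlap_int(start_1, end_1, start_2, end_2):
    """Overlap test for half-open integer intervals (assumed convention)."""
    return start_1 < end_2 and start_2 < end_1


def first_overlapping_pair(list_indices):
    """Return positional indices of the first overlapping ROI pair, or None."""
    for ind_1, roi_1 in enumerate(list_indices):
        for ind_2 in range(ind_1):
            roi_2 = list_indices[ind_2]
            # Offsets 0, 2, 4 are the start indices along z, y, x.
            if all(
                intervals_overlap_int(
                    roi_1[k], roi_1[k + 1], roi_2[k], roi_2[k + 1]
                )
                for k in (0, 2, 4)
            ):
                return (ind_1, ind_2)
    return None


rois = [
    [0, 1, 0, 100, 0, 100],
    [0, 1, 0, 100, 100, 200],   # adjacent to the first ROI along x
    [0, 1, 50, 150, 150, 250],  # overlaps the second ROI
]
pair = first_overlapping_pair(rois)
```

Two boxes overlap only if their intervals overlap along all three axes, which is why adjacent ROIs (sharing a boundary index) are not reported.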
Finds the indices of the next pair of overlapping FOVs.
Note: the returned indices are positional indices, starting from 0.
PARAMETER DESCRIPTION tmp_df
Dataframe with columns ["xmin", "ymin", "xmax", "ymax"].
TYPE: DataFrame
tol
Finite tolerance for floating-point comparisons.
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/v1_overlaps.py
def get_overlapping_pair(\n tmp_df: pd.DataFrame, tol: float = 1e-10\n) -> Union[tuple[int, int], bool]:\n\"\"\"\n Finds the indices for the next overlapping FOVs pair.\n\n Note: the returned indices are positional indices, starting from 0.\n\n Args:\n tmp_df: Dataframe with columns `[\"xmin\", \"ymin\", \"xmax\", \"ymax\"]`.\n tol: Finite tolerance for floating-point comparisons.\n \"\"\"\n\n num_lines = len(tmp_df.index)\n for pos_ind_1 in range(num_lines):\n for pos_ind_2 in range(pos_ind_1):\n if is_overlapping_2D(\n tmp_df.iloc[pos_ind_1], tmp_df.iloc[pos_ind_2], tol=tol\n ):\n return (pos_ind_1, pos_ind_2)\n return False\n
Finds the indices of all pairs of overlapping FOVs, in three dimensions.
Note: the returned indices are positional indices, starting from 0.
PARAMETER DESCRIPTION tmp_df
Dataframe with columns {x,y,z}_micrometer and len_{x,y,z}_micrometer.
TYPE: DataFrame
full_res_pxl_sizes_zyx
TBD
TYPE: Sequence[float]
Source code in fractal_tasks_core/roi/v1_overlaps.py
def get_overlapping_pairs_3D(\n tmp_df: pd.DataFrame,\n full_res_pxl_sizes_zyx: Sequence[float],\n):\n\"\"\"\n Finds the indices for the all overlapping FOVs pair, in three dimensions.\n\n Note: the returned indices are positional indices, starting from 0.\n\n Args:\n tmp_df: Dataframe with columns `{x,y,z}_micrometer` and\n `len_{x,y,z}_micrometer`.\n full_res_pxl_sizes_zyx: TBD\n \"\"\"\n\n tol = 1e-10\n if tol > min(full_res_pxl_sizes_zyx) / 1e3:\n raise ValueError(f\"{tol=} but {full_res_pxl_sizes_zyx=}\")\n\n new_tmp_df = tmp_df.copy()\n\n new_tmp_df[\"x_micrometer_max\"] = (\n new_tmp_df[\"x_micrometer\"] + new_tmp_df[\"len_x_micrometer\"]\n )\n new_tmp_df[\"y_micrometer_max\"] = (\n new_tmp_df[\"y_micrometer\"] + new_tmp_df[\"len_y_micrometer\"]\n )\n new_tmp_df[\"z_micrometer_max\"] = (\n new_tmp_df[\"z_micrometer\"] + new_tmp_df[\"len_z_micrometer\"]\n )\n # Remove columns which are not necessary for overlap checks\n list_columns = [\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n \"label\",\n ]\n new_tmp_df.drop(labels=list_columns, axis=1, inplace=True)\n\n # Loop over all pairs, and construct list of overlapping ones\n num_lines = len(new_tmp_df.index)\n overlapping_list = []\n for pos_ind_1 in range(num_lines):\n for pos_ind_2 in range(pos_ind_1):\n overlap = is_overlapping_3D(\n new_tmp_df.iloc[pos_ind_1], new_tmp_df.iloc[pos_ind_2], tol=tol\n )\n if overlap:\n overlapping_list.append((pos_ind_1, pos_ind_2))\n return overlapping_list\n
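The guard at the top of the listing requires the comparison tolerance to be at least a thousand times smaller than the smallest pixel size. A small sketch of that check in isolation; the function name `check_tolerance` and the `safety_factor` parameter are illustrative only.

```python
def check_tolerance(tol, pixel_sizes_zyx, safety_factor=1e3):
    """Raise if tol is not much smaller than the smallest pixel size."""
    if tol > min(pixel_sizes_zyx) / safety_factor:
        raise ValueError(f"{tol=} but {pixel_sizes_zyx=}")


pixel_sizes = [1.0, 0.1625, 0.1625]

# The default tolerance is far below 0.1625 / 1000, so this passes.
check_tolerance(1e-10, pixel_sizes)

# A tolerance of 1e-3 exceeds 0.1625 / 1000 and is rejected.
try:
    check_tolerance(1e-3, pixel_sizes)
    rejected = False
except ValueError:
    rejected = True
```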
Given a metadata dataframe, shift its columns to remove FOV overlaps.
PARAMETER DESCRIPTION df
Metadata dataframe.
TYPE: DataFrame
Source code in fractal_tasks_core/roi/v1_overlaps.py
def remove_FOV_overlaps(df: pd.DataFrame):\n\"\"\"\n Given a metadata dataframe, shift its columns to remove FOV overlaps.\n\n Args:\n df: Metadata dataframe.\n \"\"\"\n\n # Set tolerance (this should be much smaller than pixel size or expected\n # round-offs), and maximum number of iterations in constraint solver\n tol = 1e-10\n max_iterations = 200\n\n # Create a local copy of the dataframe\n df = df.copy()\n\n # Create temporary columns (to streamline overlap removals), which are\n # then removed at the end of the remove_FOV_overlaps function\n df[\"xmin\"] = df[\"x_micrometer\"]\n df[\"ymin\"] = df[\"y_micrometer\"]\n df[\"xmax\"] = df[\"x_micrometer\"] + df[\"pixel_size_x\"] * df[\"x_pixel\"]\n df[\"ymax\"] = df[\"y_micrometer\"] + df[\"pixel_size_y\"] * df[\"y_pixel\"]\n list_columns = [\"xmin\", \"ymin\", \"xmax\", \"ymax\"]\n\n # Create columns with the original positions (not to be removed)\n df[\"x_micrometer_original\"] = df[\"x_micrometer\"]\n df[\"y_micrometer_original\"] = df[\"y_micrometer\"]\n\n # Check that tolerance is much smaller than pixel sizes\n min_pixel_size = df[[\"pixel_size_x\", \"pixel_size_y\"]].min().min()\n if tol > min_pixel_size / 1e3:\n raise ValueError(\n f\"In remove_FOV_overlaps, {tol=} but {min_pixel_size=}\"\n )\n\n # Loop over wells\n wells = sorted(list(set([ind[0] for ind in df.index])))\n for well in wells:\n\n logger.info(f\"removing FOV overlaps for {well=}\")\n df_well = df.loc[well].copy()\n\n # NOTE: these are positional indices (i.e. 
starting from 0)\n pair_pos_indices = get_overlapping_pair(df_well[list_columns], tol=tol)\n\n # Keep going until there are no overlaps, or until iteration reaches\n # max_iterations\n iteration = 0\n while pair_pos_indices:\n iteration += 1\n\n # Identify overlapping FOVs\n pos_ind_1, pos_ind_2 = pair_pos_indices\n fov_id_1 = df_well.index[pos_ind_1]\n fov_id_2 = df_well.index[pos_ind_2]\n xmin_1, ymin_1, xmax_1, ymax_1 = df_well[list_columns].iloc[\n pos_ind_1\n ]\n xmin_2, ymin_2, xmax_2, ymax_2 = df_well[list_columns].iloc[\n pos_ind_2\n ]\n logger.debug(\n f\"{well=}, {iteration=}, removing overlap between\"\n f\" {fov_id_1=} and {fov_id_2=}\"\n )\n\n # Check what kind of overlap is there (X, Y, or XY)\n is_x_equal = abs(xmin_1 - xmin_2) < tol and (xmax_1 - xmax_2) < tol\n is_y_equal = abs(ymin_1 - ymin_2) < tol and (ymax_1 - ymax_2) < tol\n is_x_overlap = is_overlapping_1D(\n [xmin_1, xmax_1], [xmin_2, xmax_2], tol=tol\n )\n is_y_overlap = is_overlapping_1D(\n [ymin_1, ymax_1], [ymin_2, ymax_2], tol=tol\n )\n\n if is_x_equal and is_y_overlap:\n # Y overlap\n df_well = apply_shift_in_one_direction(\n df_well,\n [ymin_1, ymax_1],\n [ymin_2, ymax_2],\n mu=\"y\",\n tol=tol,\n )\n elif is_y_equal and is_x_overlap:\n # X overlap\n df_well = apply_shift_in_one_direction(\n df_well,\n [xmin_1, xmax_1],\n [xmin_2, xmax_2],\n mu=\"x\",\n tol=tol,\n )\n elif not (is_x_equal or is_y_equal) and (\n is_x_overlap and is_y_overlap\n ):\n # XY overlap\n df_well = apply_shift_in_one_direction(\n df_well,\n [xmin_1, xmax_1],\n [xmin_2, xmax_2],\n mu=\"x\",\n tol=tol,\n )\n df_well = apply_shift_in_one_direction(\n df_well,\n [ymin_1, ymax_1],\n [ymin_2, ymax_2],\n mu=\"y\",\n tol=tol,\n )\n else:\n raise ValueError(\n \"Trying to remove overlap which is not there.\"\n )\n\n # Look for next overlapping FOV pair\n pair_pos_indices = get_overlapping_pair(\n df_well[list_columns], tol=tol\n )\n\n # Enforce maximum number of iterations\n if iteration >= max_iterations:\n raise 
ValueError(f\"Reached {max_iterations=} for {well=}\")\n\n # Note: using df.loc[well] = df_well leads to a NaN dataframe, see\n # for instance https://stackoverflow.com/a/28432733/19085332\n df.loc[well, :] = df_well.values\n\n # Remove temporary columns that were added only as part of this function\n df.drop(list_columns, axis=1, inplace=True)\n\n return df\n
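The branch selection inside the loop (Y-only, X-only, or XY overlap) can be isolated as a small classifier. This is a sketch under stated assumptions: `overlaps_1d` is a hypothetical stand-in for `is_overlapping_1D`, and the equality tests here apply `abs()` to both the min and the max differences, which is stricter than the expression in the listing above.

```python
def overlaps_1d(interval_1, interval_2, tol=1e-10):
    """Hypothetical stand-in for is_overlapping_1D (assumed semantics)."""
    min_1, max_1 = interval_1
    min_2, max_2 = interval_2
    return min_1 < max_2 - tol and min_2 < max_1 - tol


def classify_overlap(box_1, box_2, tol=1e-10):
    """Return "x", "y" or "xy": the direction(s) in which to shift."""
    xmin_1, ymin_1, xmax_1, ymax_1 = box_1
    xmin_2, ymin_2, xmax_2, ymax_2 = box_2
    is_x_equal = abs(xmin_1 - xmin_2) < tol and abs(xmax_1 - xmax_2) < tol
    is_y_equal = abs(ymin_1 - ymin_2) < tol and abs(ymax_1 - ymax_2) < tol
    is_x_overlap = overlaps_1d((xmin_1, xmax_1), (xmin_2, xmax_2), tol=tol)
    is_y_overlap = overlaps_1d((ymin_1, ymax_1), (ymin_2, ymax_2), tol=tol)

    if is_x_equal and is_y_overlap:
        return "y"  # same column of FOVs, shift along y only
    if is_y_equal and is_x_overlap:
        return "x"  # same row of FOVs, shift along x only
    if not (is_x_equal or is_y_equal) and is_x_overlap and is_y_overlap:
        return "xy"  # diagonal neighbours, shift along both axes
    raise ValueError("Trying to remove overlap which is not there.")


reference = (0.0, 0.0, 10.0, 10.0)
kind_y = classify_overlap(reference, (0.0, 9.0, 10.0, 19.0))
kind_x = classify_overlap(reference, (9.0, 0.0, 19.0, 10.0))
kind_xy = classify_overlap(reference, (9.0, 9.0, 19.0, 19.0))
```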
Run an overlap check over all wells and optionally plot overlaps.
This function is currently only used in tests and examples.
The plotting_function parameter is exposed so that other tools (see examples in this repository) may use it to show the FOV ROIs. Its arguments are: [xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well].
PARAMETER DESCRIPTION site_metadata
TBD
TYPE: DataFrame
tol
TBD
TYPE: float DEFAULT: 1e-10
plotting_function
TBD
TYPE: Optional[Callable] DEFAULT: None
Source code in fractal_tasks_core/roi/v1_overlaps.py
def run_overlap_check(\n site_metadata: pd.DataFrame,\n tol: float = 1e-10,\n plotting_function: Optional[Callable] = None,\n):\n\"\"\"\n Run an overlap check over all wells and optionally plots overlaps.\n\n This function is currently only used in tests and examples.\n\n The `plotting_function` parameter is exposed so that other tools (see\n examples in this repository) may use it to show the FOV ROIs. Its arguments\n are: `[xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well]`.\n\n Args:\n site_metadata: TBD\n tol: TBD\n plotting_function: TBD\n \"\"\"\n\n if plotting_function is None:\n\n def plotting_function(\n xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well\n ):\n pass\n\n wells = site_metadata.index.unique(level=\"well_id\")\n overlapping_FOVs = []\n for selected_well in wells:\n overlap_curr_well = check_well_for_FOV_overlap(\n site_metadata,\n selected_well=selected_well,\n tol=tol,\n plotting_function=plotting_function,\n )\n if overlap_curr_well:\n print(selected_well)\n overlapping_FOVs.append(overlap_curr_well)\n\n return overlapping_FOVs\n
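Any callable that accepts the six arguments listed above can be passed as `plotting_function`. The sketch below uses a hypothetical recording callback instead of an actual plot; the final call only mimics a single invocation and does not run the real overlap check.

```python
recorded_calls = []


def record_fovs(xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well):
    """Callback with the six positional arguments described above."""
    recorded_calls.append(
        {
            "well": selected_well,
            "num_fovs": len(xmin),
            # Positional indices are 0-based; plotted labels are 1-based.
            "overlapping_labels": sorted(
                {i + 1 for i in list_overlapping_FOVs}
            ),
        }
    )


# Mimic one invocation for a well with two overlapping FOVs.
record_fovs(
    [0.0, 9.0], [10.0, 19.0], [0.0, 0.0], [10.0, 10.0], [1, 0], "B03"
)
```

A real plotting callback would draw one rectangle per FOV from the four coordinate lists and highlight the entries in `list_overlapping_FOVs`.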
This is the general interface that should allow for a smooth coexistence of tables with different fractal_table_version values. Currently only V1 is defined and implemented. The assumption is that V2 should only change:
1. The lower-level writing function (that is, _write_table_v2).
2. The type of the table (which would also be reflected in a more general type hint for table, in the current function).
3. The definition of which values of table_attrs are valid or invalid, to be implemented in _write_table_v2.
4. Possibly, additional parameters for _write_table_v2, which will be optional parameters of write_table (so that write_table remains valid for both V1 and V2).
PARAMETER DESCRIPTION image_group
The image Zarr group where the table will be written.
TYPE: Group
table_name
The name of the table.
TYPE: str
table
The table object (currently an AnnData object, for V1).
TYPE: AnnData
overwrite
If False, check that the new table does not exist (either as a zarr sub-group or as part of the zarr-group attributes). In all cases, propagate parameter to low-level functions, to determine the behavior in case of an existing sub-group named as in table_name.
TYPE: bool DEFAULT: False
table_type
type attribute for the table; in case type is also present in table_attrs, this function argument takes priority.
TYPE: Optional[str] DEFAULT: None
table_attrs
If set, overwrite table_group attributes with table_attrs key/value pairs. If table_type is not provided, then table_attrs must include the type key.
TYPE: Optional[dict[str, Any]] DEFAULT: None
RETURNS DESCRIPTION group
Zarr group of the table.
Source code in fractal_tasks_core/tables/__init__.py
def write_table(\n image_group: zarr.hierarchy.Group,\n table_name: str,\n table: ad.AnnData,\n overwrite: bool = False,\n table_type: Optional[str] = None,\n table_attrs: Optional[dict[str, Any]] = None,\n) -> zarr.group:\n\"\"\"\n Write a table to a Zarr group.\n\n This is the general interface that should allow for a smooth coexistence of\n tables with different `fractal_table_version` values. Currently only V1 is\n defined and implemented. The assumption is that V2 should only change:\n\n 1. The lower-level writing function (that is, `_write_table_v2`).\n 2. The type of the table (which would also reflect into a more general type\n hint for `table`, in the current funciton);\n 3. A different definition of what values of `table_attrs` are valid or\n invalid, to be implemented in `_write_table_v2`.\n 4. Possibly, additional parameters for `_write_table_v2`, which will be\n optional parameters of `write_table` (so that `write_table` remains\n valid for both V1 and V2).\n\n Args:\n image_group:\n The image Zarr group where the table will be written.\n table_name:\n The name of the table.\n table:\n The table object (currently an AnnData object, for V1).\n overwrite:\n If `False`, check that the new table does not exist (either as a\n zarr sub-group or as part of the zarr-group attributes). In all\n cases, propagate parameter to low-level functions, to determine the\n behavior in case of an existing sub-group named as in `table_name`.\n table_type: `type` attribute for the table; in case `type` is also\n present in `table_attrs`, this function argument takes priority.\n table_attrs:\n If set, overwrite table_group attributes with table_attrs key/value\n pairs. 
If `table_type` is not provided, then `table_attrs` must\n include the `type` key.\n\n Returns:\n Zarr group of the table.\n \"\"\"\n # Choose which version to use, giving priority to a value that is present\n # in table_attrs\n version = __FRACTAL_TABLE_VERSION__\n if table_attrs is not None:\n try:\n version = table_attrs[\"fractal_table_version\"]\n except KeyError:\n pass\n\n if version == \"1\":\n return _write_table_v1(\n image_group,\n table_name,\n table,\n overwrite,\n table_type,\n table_attrs,\n )\n else:\n raise NotImplementedError(\n f\"fractal_table_version='{version}' is not supported\"\n )\n
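The version dispatch at the end of the listing can be sketched without Zarr or AnnData. The default value `"1"` is an assumption that mirrors `__FRACTAL_TABLE_VERSION__`, and `select_writer_name` is a hypothetical helper that returns a writer name rather than calling it.

```python
DEFAULT_TABLE_VERSION = "1"  # assumption: mirrors __FRACTAL_TABLE_VERSION__


def resolve_table_version(table_attrs=None):
    """Prefer the version stored in table_attrs, else use the default."""
    if table_attrs is not None:
        return table_attrs.get("fractal_table_version", DEFAULT_TABLE_VERSION)
    return DEFAULT_TABLE_VERSION


def select_writer_name(version):
    """Map a version string to the (hypothetical) name of its writer."""
    writers = {"1": "_write_table_v1"}
    if version not in writers:
        raise NotImplementedError(
            f"fractal_table_version='{version}' is not supported"
        )
    return writers[version]


default_writer = select_writer_name(resolve_table_version())
explicit_version = resolve_table_version({"fractal_table_version": "2"})
```

Adding a V2 writer would then amount to one more entry in the `writers` mapping, which is the kind of extension the interface description anticipates.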
Wrap anndata.experimental.write_elem, to include overwrite parameter.
See the docs for the original function at https://anndata.readthedocs.io/en/stable/generated/anndata.experimental.write_elem.html.
This function writes elem to the sub-group key of group. The overwrite-related expected behavior is:
* if the sub-group does not exist, create it (independently of overwrite);
* if the sub-group already exists and overwrite=True, overwrite the sub-group;
* if the sub-group already exists and overwrite=False, fail.
Note that this version of the wrapper does not include the original dataset_kwargs parameter.
PARAMETER DESCRIPTION group
The group to write to.
TYPE: Group
key
The key to write to in the group. Note that absolute paths will be written from the root.
TYPE: str
elem
The element to write. Typically an in-memory object, e.g. an AnnData, pandas dataframe, scipy sparse matrix, etc.
TYPE: Any
overwrite
If True, overwrite the key sub-group (if present); if False and key sub-group exists, raise an error.
TYPE: bool
logger
The logger to use (if unset, use logging.getLogger(None))
TYPE: Optional[Logger] DEFAULT: None
RAISES DESCRIPTION OverwriteNotAllowedError
If overwrite=False and the sub-group already exists.
Source code in fractal_tasks_core/tables/v1.py
def _write_elem_with_overwrite(\n group: zarr.hierarchy.Group,\n key: str,\n elem: Any,\n *,\n overwrite: bool,\n logger: Optional[logging.Logger] = None,\n) -> None:\n\"\"\"\n Wrap `anndata.experimental.write_elem`, to include `overwrite` parameter.\n\n See docs for the original function\n [here](https://anndata.readthedocs.io/en/stable/generated/anndata.experimental.write_elem.html).\n\n This function writes `elem` to the sub-group `key` of `group`. The\n `overwrite`-related expected behavior is:\n\n * if the sub-group does not exist, create it (independently on\n `overwrite`);\n * if the sub-group already exists and `overwrite=True`, overwrite the\n sub-group;\n * if the sub-group already exists and `overwrite=False`, fail.\n\n Note that this version of the wrapper does not include the original\n `dataset_kwargs` parameter.\n\n Args:\n group:\n The group to write to.\n key:\n The key to write to in the group. Note that absolute paths will be\n written from the root.\n elem:\n The element to write. Typically an in-memory object, e.g. an\n AnnData, pandas dataframe, scipy sparse matrix, etc.\n overwrite:\n If `True`, overwrite the `key` sub-group (if present); if `False`\n and `key` sub-group exists, raise an error.\n logger:\n The logger to use (if unset, use `logging.getLogger(None)`)\n\n Raises:\n OverwriteNotAllowedError:\n If `overwrite=False` and the sub-group already exists.\n \"\"\"\n\n # Set logger\n if logger is None:\n logger = logging.getLogger(None)\n\n if key in set(group.group_keys()):\n if not overwrite:\n error_msg = (\n f\"Sub-group '{key}' of group {group.store.path} \"\n f\"already exists, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n write_elem(group, key, elem)\n
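The three overwrite cases can be illustrated with a plain dictionary in place of a Zarr group. This is only a sketch: the `OverwriteNotAllowedError` class defined here is a local stand-in, and `write_with_overwrite` is a hypothetical name.

```python
class OverwriteNotAllowedError(RuntimeError):
    """Local stand-in for the library exception of the same name."""


def write_with_overwrite(store, key, elem, *, overwrite):
    """Write elem under key, refusing to replace it unless overwrite=True."""
    if key in store and not overwrite:
        raise OverwriteNotAllowedError(
            f"Sub-group '{key}' already exists, but overwrite={overwrite}."
        )
    store[key] = elem


store = {}
# Case 1: the key does not exist, so it is created for any overwrite value.
write_with_overwrite(store, "FOV_ROI_table", "first", overwrite=False)
# Case 2: the key exists and overwrite=True, so it is replaced.
write_with_overwrite(store, "FOV_ROI_table", "second", overwrite=True)
# Case 3: the key exists and overwrite=False, so the write fails.
try:
    write_with_overwrite(store, "FOV_ROI_table", "third", overwrite=False)
    failed = False
except OverwriteNotAllowedError:
    failed = True
```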
Handle multiple options for writing an AnnData table to a zarr group.
1. Create the tables group, if needed.
2. If overwrite=False, check that the new table does not exist (either in zarr attributes or as a zarr sub-group).
3. Call the _write_elem_with_overwrite wrapper with the appropriate overwrite parameter.
4. Update the tables attribute of the image group.
5. Validate table_type and table_attrs according to Fractal table specifications, and raise errors/warnings if needed; then set the appropriate attributes in the new-table Zarr group.
PARAMETER DESCRIPTION image_group
The group to write to.
TYPE: Group
table_name
The name of the new table.
TYPE: str
table
The AnnData table to write.
TYPE: AnnData
overwrite
If False, check that the new table does not exist (either as a zarr sub-group or as part of the zarr-group attributes). In all cases, propagate parameter to _write_elem_with_overwrite, to determine the behavior in case of an existing sub-group named as table_name.
TYPE: bool DEFAULT: False
table_type
type attribute for the table; in case type is also present in table_attrs, this function argument takes priority.
TYPE: Optional[str] DEFAULT: None
table_attrs
If set, overwrite table_group attributes with table_attrs key/value pairs. If table_type is not provided, then table_attrs must include the type key.
TYPE: Optional[dict[str, Any]] DEFAULT: None
RETURNS DESCRIPTION group
Zarr group of the new table.
Source code in fractal_tasks_core/tables/v1.py
def _write_table_v1(\n image_group: zarr.hierarchy.Group,\n table_name: str,\n table: ad.AnnData,\n overwrite: bool = False,\n table_type: Optional[str] = None,\n table_attrs: Optional[dict[str, Any]] = None,\n) -> zarr.group:\n\"\"\"\n Handle multiple options for writing an AnnData table to a zarr group.\n\n 1. Create the `tables` group, if needed.\n 2. If `overwrite=False`, check that the new table does not exist (either in\n zarr attributes or as a zarr sub-group).\n 3. Call the `_write_elem_with_overwrite` wrapper with the appropriate\n `overwrite` parameter.\n 4. Update the `tables` attribute of the image group.\n 5. Validate `table_type` and `table_attrs` according to Fractal table\n specifications, and raise errors/warnings if needed; then set the\n appropriate attributes in the new-table Zarr group.\n\n\n Args:\n image_group:\n The group to write to.\n table_name:\n The name of the new table.\n table:\n The AnnData table to write.\n overwrite:\n If `False`, check that the new table does not exist (either as a\n zarr sub-group or as part of the zarr-group attributes). In all\n cases, propagate parameter to `_write_elem_with_overwrite`, to\n determine the behavior in case of an existing sub-group named as\n `table_name`.\n table_type: `type` attribute for the table; in case `type` is also\n present in `table_attrs`, this function argument takes priority.\n table_attrs:\n If set, overwrite table_group attributes with table_attrs key/value\n pairs. 
If `table_type` is not provided, then `table_attrs` must\n include the `type` key.\n\n Returns:\n Zarr group of the new table.\n \"\"\"\n\n # Create tables group (if needed) and extract current_tables\n if \"tables\" not in set(image_group.group_keys()):\n tables_group = image_group.create_group(\"tables\", overwrite=False)\n else:\n tables_group = image_group[\"tables\"]\n current_tables = tables_group.attrs.asdict().get(\"tables\", [])\n\n # If overwrite=False, check that the new table does not exist (either as a\n # zarr sub-group or as part of the zarr-group attributes)\n if not overwrite:\n if table_name in set(tables_group.group_keys()):\n error_msg = (\n f\"Sub-group '{table_name}' of group {image_group.store.path} \"\n f\"already exists, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n if table_name in current_tables:\n error_msg = (\n f\"Item '{table_name}' already exists in `tables` attribute of \"\n f\"group {image_group.store.path}, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n\n # Always include fractal-roi-table version in table attributes\n if table_attrs is None:\n table_attrs = dict(fractal_table_version=\"1\")\n elif table_attrs.get(\"fractal_table_version\", None) is None:\n table_attrs[\"fractal_table_version\"] = \"1\"\n\n # Set type attribute for the table\n table_type_from_attrs = table_attrs.get(\"type\", None)\n if table_type is not None:\n if table_type_from_attrs is not None:\n logger.warning(\n f\"Setting table type to '{table_type}' (and overriding \"\n f\"'{table_type_from_attrs}' attribute).\"\n )\n table_attrs[\"type\"] = table_type\n else:\n if table_type_from_attrs is None:\n raise ValueError(\n \"Missing attribute `type` for table; this must be provided\"\n \" either via `table_type` or within `table_attrs`.\"\n )\n\n # Prepare/validate 
attributes for the table\n table_type = table_attrs.get(\"type\", None)\n if table_type == \"roi_table\":\n pass\n elif table_type == \"masking_roi_table\":\n try:\n MaskingROITableAttrs(**table_attrs)\n except ValidationError as e:\n error_msg = (\n \"Table attributes do not comply with Fractal \"\n \"`masking_roi_table` specifications V1.\\nOriginal error:\\n\"\n f\"ValidationError: {str(e)}\"\n )\n logger.error(error_msg)\n raise ValueError(error_msg)\n elif table_type == \"feature_table\":\n try:\n FeatureTableAttrs(**table_attrs)\n except ValidationError as e:\n error_msg = (\n \"Table attributes do not comply with Fractal \"\n \"`feature_table` specifications V1.\\nOriginal error:\\n\"\n f\"ValidationError: {str(e)}\"\n )\n logger.error(error_msg)\n raise ValueError(error_msg)\n else:\n logger.warning(f\"Unknown table type `{table_type}`.\")\n\n # If it's all OK, proceed and write the table\n _write_elem_with_overwrite(\n tables_group,\n table_name,\n table,\n overwrite=overwrite,\n )\n table_group = tables_group[table_name]\n\n # Update the `tables` metadata of the image group, if needed\n if table_name not in current_tables:\n new_tables = current_tables + [table_name]\n tables_group.attrs[\"tables\"] = new_tables\n\n # Update table_group attributes with table_attrs key/value pairs\n table_group.attrs.update(**table_attrs)\n\n return table_group\n
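The priority rules for the table type and version attributes can be shown on plain dictionaries. A minimal sketch, assuming nothing beyond the rules stated above; `resolve_table_attrs` is a hypothetical helper, and unlike the listing it works on a copy instead of modifying `table_attrs` in place.

```python
def resolve_table_attrs(table_type=None, table_attrs=None):
    """Return the attributes that would be stored on the new table group."""
    attrs = dict(table_attrs) if table_attrs is not None else {}

    # Always include a table version.
    if attrs.get("fractal_table_version") is None:
        attrs["fractal_table_version"] = "1"

    # The explicit table_type argument takes priority over attrs["type"].
    if table_type is not None:
        attrs["type"] = table_type
    elif attrs.get("type") is None:
        raise ValueError(
            "Missing attribute `type` for table; this must be provided "
            "either via `table_type` or within `table_attrs`."
        )
    return attrs


from_argument = resolve_table_attrs(
    table_type="roi_table", table_attrs={"type": "feature_table"}
)
from_attrs = resolve_table_attrs(table_attrs={"type": "feature_table"})
```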
Optionally match a table type and only return the names of those tables.
PARAMETER DESCRIPTION zarr_url
Path to the OME-Zarr image
TYPE: str
table_type
The type of table to look for. Special handling for "ROIs" => matches both "roi_table" & "masking_roi_table".
TYPE: str DEFAULT: None
strict
If True, only return tables that have a type attribute. If False, also include tables without a type attribute.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION list[str]
List of the names of available tables
Source code in fractal_tasks_core/tables/v1.py
def get_tables_list_v1(\n zarr_url: str, table_type: str = None, strict: bool = False\n) -> list[str]:\n\"\"\"\n Find the list of tables in the Zarr file\n\n Optionally match a table type and only return the names of those tables.\n\n Args:\n zarr_url: Path to the OME-Zarr image\n table_type: The type of table to look for. Special handling for\n \"ROIs\" => matches both \"roi_table\" & \"masking_roi_table\".\n strict: If `True`, only return tables that have a type attribute.\n If `False`, also include tables without a type attribute.\n\n Returns:\n List of the names of available tables\n \"\"\"\n with zarr.open(zarr_url, mode=\"r\") as zarr_group:\n zarr_subgroups = list(zarr_group.group_keys())\n if \"tables\" not in zarr_subgroups:\n return []\n with zarr.open(zarr_url, mode=\"r\") as zarr_group:\n all_tables = list(zarr_group.tables.group_keys())\n\n if not table_type:\n return all_tables\n else:\n return _filter_tables_by_type_v1(\n zarr_url, all_tables, table_type, strict\n )\n
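The filtering step is delegated to `_filter_tables_by_type_v1`, which is not shown here. The sketch below is an assumption about its behavior, derived only from the parameter descriptions above; it takes a mapping from table name to `type` attribute (or `None` when the attribute is missing) instead of reading a Zarr file.

```python
def filter_tables_by_type(table_types, table_type=None, strict=False):
    """Select table names by type (hypothetical, dictionary-based sketch)."""
    if not table_type:
        return list(table_types)
    if table_type == "ROIs":
        wanted = {"roi_table", "masking_roi_table"}
    else:
        wanted = {table_type}

    selected = []
    for name, current_type in table_types.items():
        if current_type is None:
            # Tables without a type attribute are kept only if strict=False.
            if not strict:
                selected.append(name)
        elif current_type in wanted:
            selected.append(name)
    return selected


tables = {
    "FOV_ROI_table": "roi_table",
    "nuclei_ROI_table": "masking_roi_table",
    "nuclei_features": "feature_table",
    "legacy_table": None,
}
loose = filter_tables_by_type(tables, table_type="ROIs")
strict_only = filter_tables_by_type(tables, table_type="ROIs", strict=True)
```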
Calculates the new position as: p = position + max(shift, 0) - own_shift
Calculates the new len as: l = len - max(shift, 0) + min(shift, 0)
PARAMETER DESCRIPTION roi_table
AnnData table which contains a Fractal ROI table. Rows are ROIs
TYPE: AnnData
max_df
Max translation shift in z, y, x for each ROI. Rows are ROIs, columns are translation_z, translation_y, translation_x
TYPE: DataFrame
min_df
Min translation shift in z, y, x for each ROI. Rows are ROIs, columns are translation_z, translation_y, translation_x
TYPE: DataFrame
Returns: ROI table where all ROIs are registered to the smallest common area across all acquisitions.
Source code in fractal_tasks_core/tasks/_registration_utils.py
def apply_registration_to_single_ROI_table(\n roi_table: ad.AnnData,\n max_df: pd.DataFrame,\n min_df: pd.DataFrame,\n) -> ad.AnnData:\n\"\"\"\n Applies the registration to a ROI table\n\n Calculates the new position as: p = position + max(shift, 0) - own_shift\n Calculates the new len as: l = len - max(shift, 0) + min(shift, 0)\n\n Args:\n roi_table: AnnData table which contains a Fractal ROI table.\n Rows are ROIs\n max_df: Max translation shift in z, y, x for each ROI. Rows are ROIs,\n columns are translation_z, translation_y, translation_x\n min_df: Min translation shift in z, y, x for each ROI. Rows are ROIs,\n columns are translation_z, translation_y, translation_x\n Returns:\n ROI table where all ROIs are registered to the smallest common area\n across all acquisitions.\n \"\"\"\n roi_table = copy.deepcopy(roi_table)\n rois = roi_table.obs.index\n if (rois != max_df.index).all() or (rois != min_df.index).all():\n raise ValueError(\n \"ROI table and max & min translation need to contain the same \"\n f\"ROIS, but they were {rois=}, {max_df.index=}, {min_df.index=}\"\n )\n\n for roi in rois:\n roi_table[[roi], [\"z_micrometer\"]] = (\n roi_table[[roi], [\"z_micrometer\"]].X\n + float(max_df.loc[roi, \"translation_z\"])\n - roi_table[[roi], [\"translation_z\"]].X\n )\n roi_table[[roi], [\"y_micrometer\"]] = (\n roi_table[[roi], [\"y_micrometer\"]].X\n + float(max_df.loc[roi, \"translation_y\"])\n - roi_table[[roi], [\"translation_y\"]].X\n )\n roi_table[[roi], [\"x_micrometer\"]] = (\n roi_table[[roi], [\"x_micrometer\"]].X\n + float(max_df.loc[roi, \"translation_x\"])\n - roi_table[[roi], [\"translation_x\"]].X\n )\n # This calculation only works if all ROIs are the same size initially!\n roi_table[[roi], [\"len_z_micrometer\"]] = (\n roi_table[[roi], [\"len_z_micrometer\"]].X\n - float(max_df.loc[roi, \"translation_z\"])\n + float(min_df.loc[roi, \"translation_z\"])\n )\n roi_table[[roi], [\"len_y_micrometer\"]] = (\n roi_table[[roi], 
[\"len_y_micrometer\"]].X\n - float(max_df.loc[roi, \"translation_y\"])\n + float(min_df.loc[roi, \"translation_y\"])\n )\n roi_table[[roi], [\"len_x_micrometer\"]] = (\n roi_table[[roi], [\"len_x_micrometer\"]].X\n - float(max_df.loc[roi, \"translation_x\"])\n + float(min_df.loc[roi, \"translation_x\"])\n )\n return roi_table\n
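The two formulas can be checked with a worked example along a single axis. This sketch assumes that the max and min are taken over the shifts of all acquisitions for a given ROI, with 0 included for the reference acquisition; `register_axis` is a hypothetical helper, not part of the library.

```python
def register_axis(position, length, own_shift, all_shifts):
    """Apply the position and length formulas along one axis."""
    max_shift = max(max(all_shifts), 0.0)
    min_shift = min(min(all_shifts), 0.0)
    new_position = position + max_shift - own_shift
    new_length = length - max_shift + min_shift
    return new_position, new_length


# Shifts (in micrometers) of three acquisitions; the first is the reference.
shifts = [0.0, 2.0, -3.0]
registered = [
    register_axis(0.0, 100.0, own_shift, shifts) for own_shift in shifts
]
```

All three acquisitions end up with the same length (95.0), which is the size of the region common to all of them, while each start position is offset by its own shift.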
Parses zarr_urls & groups them by HCS wells & acquisition
Generates a dict whose keys are a unique description of the acquisition (e.g. plate + well for HCS plates) and whose values are dictionaries. The keys of each inner dictionary are the acquisitions, and its values are the zarr_url for a given acquisition.
PARAMETER DESCRIPTION zarr_urls
List of zarr_urls
TYPE: list[str]
RETURNS DESCRIPTION dict[str, dict[int, str]]
image_groups
Source code in fractal_tasks_core/tasks/_registration_utils.py
def create_well_acquisition_dict(\n zarr_urls: list[str],\n) -> dict[str, dict[int, str]]:\n\"\"\"\n Parses zarr_urls & groups them by HCS wells & acquisition\n\n Generates a dict with keys a unique description of the acquisition\n (e.g. plate + well for HCS plates). The values are dictionaries. The keys\n of the secondary dictionary are the acqusitions, its values the `zarr_url`\n for a given acquisition.\n\n Args:\n zarr_urls: List of zarr_urls\n\n Returns:\n image_groups\n \"\"\"\n image_groups = dict()\n\n # Dict to cache well-level metadata\n well_metadata = dict()\n for zarr_url in zarr_urls:\n well_path, img_sub_path = _split_well_path_image_path(zarr_url)\n # For the first zarr_url of a well, load the well metadata and\n # initialize the image_groups dict\n if well_path not in image_groups:\n well_meta = load_NgffWellMeta(well_path)\n well_metadata[well_path] = well_meta.well\n image_groups[well_path] = {}\n\n # For every zarr_url, add it under the well_path & acquisition keys to\n # the image_groups dict\n for image in well_metadata[well_path].images:\n if image.path == img_sub_path:\n if image.acquisition in image_groups[well_path]:\n raise ValueError(\n \"This task has not been built for OME-Zarr HCS plates\"\n \"with multiple images of the same acquisition per well\"\n f\". {image.acquisition} is the acquisition for \"\n f\"multiple images in {well_path=}.\"\n )\n\n image_groups[well_path][image.acquisition] = zarr_url\n return image_groups\n
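The shape of the returned dictionary is easiest to see on a small example. The sketch below skips the metadata loading and takes `(well_path, acquisition, zarr_url)` triples directly; that input format and the function name are assumptions made for illustration.

```python
def group_by_well_and_acquisition(images):
    """Group zarr_urls as {well_path: {acquisition: zarr_url}}."""
    image_groups = {}
    for well_path, acquisition, zarr_url in images:
        well_group = image_groups.setdefault(well_path, {})
        if acquisition in well_group:
            raise ValueError(
                f"{acquisition} is the acquisition for multiple images "
                f"in {well_path=}."
            )
        well_group[acquisition] = zarr_url
    return image_groups


images = [
    ("/path/to/my_plate.zarr/B/03", 0, "/path/to/my_plate.zarr/B/03/0"),
    ("/path/to/my_plate.zarr/B/03", 1, "/path/to/my_plate.zarr/B/03/1"),
    ("/path/to/my_plate.zarr/B/05", 0, "/path/to/my_plate.zarr/B/05/0"),
]
image_groups = group_by_well_and_acquisition(images)
```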
Updates the necessary metadata for a new copy of an OME-Zarr image
Based on an existing OME-Zarr image in the same well, the metadata is copied and added to the new zarr well. Additionally, the well-level metadata is updated to include this new image.
PARAMETER DESCRIPTION zarr_url_origin
zarr_url of the origin image
TYPE: str
zarr_url_new
zarr_url of the newly created image. The zarr-group already needs to exist, but metadata is written by this function.
TYPE: str
Source code in fractal_tasks_core/tasks/_zarr_utils.py
def _copy_hcs_ome_zarr_metadata(\n zarr_url_origin: str,\n zarr_url_new: str,\n) -> None:\n\"\"\"\n Updates the necessary metadata for a new copy of an OME-Zarr image\n\n Based on an existing OME-Zarr image in the same well, the metadata is\n copied and added to the new zarr well. Additionally, the well-level\n metadata is updated to include this new image.\n\n Args:\n zarr_url_origin: zarr_url of the origin image\n zarr_url_new: zarr_url of the newly created image. The zarr-group\n already needs to exist, but metadata is written by this function.\n \"\"\"\n # Copy over OME-Zarr metadata for illumination_corrected image\n # See #681 for discussion for validation of this zattrs\n old_image_group = zarr.open_group(zarr_url_origin, mode=\"r\")\n old_attrs = old_image_group.attrs.asdict()\n zarr_url_new = zarr_url_new.rstrip(\"/\")\n new_image_group = zarr.group(zarr_url_new)\n new_image_group.attrs.put(old_attrs)\n\n # Update well metadata about adding the new image:\n new_image_path = zarr_url_new.split(\"/\")[-1]\n well_url, old_image_path = _split_well_path_image_path(zarr_url_origin)\n _update_well_metadata(well_url, old_image_path, new_image_path)\n
Copies all ROI tables from one Zarr into a new Zarr
PARAMETER DESCRIPTION origin_zarr_url
URL of the OME-Zarr image that contains tables, e.g. /path/to/my_plate.zarr/B/03/0
TYPE: str
target_zarr_url
URL of the new OME-Zarr image to which the tables are copied, e.g. /path/to/my_plate.zarr/B/03/0_illum_corr
TYPE: str
table_type
Filter for specific table types that should be copied.
TYPE: str DEFAULT: None
overwrite
Whether existing tables of the same name in the target_zarr_url should be overwritten.
TYPE: bool DEFAULT: True
Source code in fractal_tasks_core/tasks/_zarr_utils.py
def _copy_tables_from_zarr_url(\n origin_zarr_url: str,\n target_zarr_url: str,\n table_type: str = None,\n overwrite: bool = True,\n) -> None:\n\"\"\"\n Copies all ROI tables from one Zarr into a new Zarr\n\n Args:\n origin_zarr_url: url of the OME-Zarr image that contains tables.\n e.g. /path/to/my_plate.zarr/B/03/0\n target_zarr_url: url of the new OME-Zarr image where tables are copied\n to. e.g. /path/to/my_plate.zarr/B/03/0_illum_corr\n table_type: Filter for specific table types that should be copied.\n overwrite: Whether existing tables of the same name in the\n target_zarr_url should be overwritten.\n \"\"\"\n table_list = get_tables_list_v1(\n zarr_url=origin_zarr_url, table_type=table_type\n )\n\n if table_list:\n logger.info(\n f\"Copying the tables {table_list} from {origin_zarr_url} to \"\n f\"{target_zarr_url}.\"\n )\n new_image_group = zarr.group(target_zarr_url)\n\n for table in table_list:\n logger.info(f\"Copying table: {table}\")\n # Get the relevant metadata of the Zarr table & add it\n table_url = f\"{origin_zarr_url}/tables/{table}\"\n old_table_group = zarr.open_group(table_url, mode=\"r\")\n # Write the Zarr table\n curr_table = ad.read_zarr(table_url)\n write_table(\n new_image_group,\n table,\n curr_table,\n table_attrs=old_table_group.attrs.asdict(),\n overwrite=overwrite,\n )\n
Pick the best match from path_list to a given path
This is a workaround to find the reference registration acquisition when there are multiple OME-Zarrs with the same acquisition identifier in the well metadata and we need to find which one is the reference for a given path.
PARAMETER DESCRIPTION path_list
List of paths to OME-Zarr images in the well metadata. For example: ['0', '0_illum_corr']
TYPE: list[str]
path
A given path for which we want to find the reference image. For example, '1_illum_corr'
TYPE: str
RETURNS DESCRIPTION str
The best matching reference path. If no direct match is found, it returns the most similar one based on suffix hierarchy or the base path if applicable. For example, '0_illum_corr' with the example inputs above.
Source code in fractal_tasks_core/tasks/_zarr_utils.py
def _get_matching_ref_acquisition_path_heuristic(\n path_list: list[str], path: str\n) -> str:\n\"\"\"\n Pick the best match from path_list to a given path\n\n This is a workaround to find the reference registration acquisition when\n there are multiple OME-Zarrs with the same acquisition identifier in the\n well metadata and we need to find which one is the reference for a given\n path.\n\n Args:\n path_list: List of paths to OME-Zarr images in the well metadata. For\n example: ['0', '0_illum_corr']\n path: A given path for which we want to find the reference image. For\n example, '1_illum_corr'\n\n Returns:\n The best matching reference path. If no direct match is found, it\n returns the most similar one based on suffix hierarchy or the base\n path if applicable. For example, '0_illum_corr' with the example\n inputs above.\n \"\"\"\n\n # Extract the base number and suffix from the input path\n base, suffix = _split_base_suffix(path)\n\n # Sort path_list\n sorted_path_list = sorted(path_list)\n\n # Never return the input `path`\n if path in sorted_path_list:\n sorted_path_list.remove(path)\n\n # First matching rule: a path with the same suffix\n for p in sorted_path_list:\n # Split the list path into base and suffix\n p_base, p_suffix = _split_base_suffix(p)\n # If suffices match, it's the match.\n if p_suffix == suffix:\n return p\n\n # If no match is found, return the first entry in the list\n logger.warning(\n \"No heuristic reference acquisition match found, defaulting to first \"\n f\"option {sorted_path_list[0]}.\"\n )\n return sorted_path_list[0]\n
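To illustrate the suffix-matching rule, here is a self-contained sketch of the same heuristic without logging. The helper `split_base_suffix` is an assumption about how `_split_base_suffix` behaves (base is everything before the first underscore, suffix is the rest); that helper is not listed in this section.

```python
def split_base_suffix(path: str) -> tuple[str, str]:
    """Assumed behaviour: '1_illum_corr' -> ('1', 'illum_corr'), '0' -> ('0', '')."""
    base, _, suffix = path.partition("_")
    return base, suffix


def pick_reference_path(path_list: list[str], path: str) -> str:
    """Pick the entry of path_list whose suffix matches the suffix of path."""
    _, suffix = split_base_suffix(path)
    # Never return the input path itself
    candidates = sorted(p for p in path_list if p != path)
    for candidate in candidates:
        if split_base_suffix(candidate)[1] == suffix:
            return candidate
    # No suffix match: fall back to the first sorted candidate
    return candidates[0]


print(pick_reference_path(["0", "0_illum_corr"], "1_illum_corr"))  # 0_illum_corr
print(pick_reference_path(["0", "0_illum_corr"], "1"))             # 0
print(pick_reference_path(["0", "0_registered"], "1_illum_corr"))  # 0 (fallback)
```

Like the function above, the sketch assumes that at least one candidate remains after removing `path`; an empty candidate list would raise an `IndexError` in the fallback branch.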
Update the well metadata by adding the new_image_path to the image list.
The content of new_image_path will be based on old_image_path, the origin for the new image that was created. This function aims to avoid race conditions with other processes that try to update the well metadata file by using a FileLock with a timeout.
PARAMETER DESCRIPTION well_url
Path to the HCS OME-Zarr well that needs to be updated
TYPE: str
old_image_path
path relative to well_url where the original image is found
TYPE: str
new_image_path
path relative to well_url where the new image is placed
TYPE: str
timeout
Timeout in seconds for trying to get the file lock
TYPE: int DEFAULT: 120
Source code in fractal_tasks_core/tasks/_zarr_utils.py
def _update_well_metadata(\n well_url: str,\n old_image_path: str,\n new_image_path: str,\n timeout: int = 120,\n) -> None:\n\"\"\"\n Update the well metadata by adding the new_image_path to the image list.\n\n The content of new_image_path will be based on old_image_path, the origin\n for the new image that was created.\n This function aims to avoid race conditions with other processes that try\n to update the well metadata file by using FileLock & Timeouts\n\n Args:\n well_url: Path to the HCS OME-Zarr well that needs to be updated\n old_image_path: path relative to well_url where the original image is\n found\n new_image_path: path relative to well_url where the new image is placed\n timeout: Timeout in seconds for trying to get the file lock\n \"\"\"\n lock = FileLock(f\"{well_url}/.zattrs.lock\")\n with lock.acquire(timeout=timeout):\n\n well_meta = load_NgffWellMeta(well_url)\n existing_well_images = [image.path for image in well_meta.well.images]\n if new_image_path in existing_well_images:\n raise ValueError(\n f\"Could not add the {new_image_path=} image to the well \"\n \"metadata because and image with that name \"\n f\"already existed in the well metadata: {well_meta}\"\n )\n try:\n well_meta_image_old = next(\n image\n for image in well_meta.well.images\n if image.path == old_image_path\n )\n except StopIteration:\n raise ValueError(\n f\"Could not find an image with {old_image_path=} in the \"\n \"current well metadata.\"\n )\n well_meta_image = copy.deepcopy(well_meta_image_old)\n well_meta_image.path = new_image_path\n well_meta.well.images.append(well_meta_image)\n well_meta.well.images = sorted(\n well_meta.well.images,\n key=lambda _image: _image.path,\n )\n\n well_group = zarr.group(well_url)\n well_group.attrs.put(well_meta.dict(exclude_none=True))\n
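The list manipulation inside the lock can be looked at in isolation. The sketch below works on plain dictionaries instead of the `NgffWellMeta` model and leaves out the file lock and the Zarr I/O entirely; the function name `add_image_to_well` and the dictionary layout are assumptions made for illustration only.

```python
import copy


def add_image_to_well(
    images: list[dict], old_image_path: str, new_image_path: str
) -> list[dict]:
    """Return a sorted copy of images with an entry for new_image_path added."""
    if new_image_path in [image["path"] for image in images]:
        raise ValueError(f"{new_image_path=} already exists in the well.")
    try:
        old_entry = next(
            image for image in images if image["path"] == old_image_path
        )
    except StopIteration:
        raise ValueError(f"{old_image_path=} not found in the well.") from None
    # The new entry inherits everything (e.g. the acquisition) from the old one
    new_entry = copy.deepcopy(old_entry)
    new_entry["path"] = new_image_path
    return sorted(images + [new_entry], key=lambda image: image["path"])


images = [{"path": "0", "acquisition": 0}, {"path": "1", "acquisition": 1}]
updated = add_image_to_well(images, "0", "0_illum_corr")
print([image["path"] for image in updated])  # ['0', '0_illum_corr', '1']
```

In the real function this read-modify-write cycle is what the lock protects: without it, two tasks adding images to the same well at the same time could each overwrite the other's entry.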
Apply registration to images by using a registered ROI table
This task consists of 4 parts:
1. Mask all regions in images that are not available in the registered ROI table and store each acquisition aligned to the reference_acquisition (by looping over ROIs).
2. Do the same for all label images.
3. Copy all tables from the non-aligned image to the aligned image. This currently only works well if the only tables are well & FOV ROI tables (registered and original); it is not implemented for measurement tables and other ROI tables.
4. Clean up: Delete the old, non-aligned image and rename the new, aligned image to take over its place.
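The table-copy step follows a simple selection rule that is spelled out in the comments of the source code: standard ROI tables are taken from the reference acquisition, all other tables from the acquisition being processed. A minimal sketch of that rule, assuming that "standard" means the table name contains `well_ROI_table` or `FOV_ROI_table` (the helper `looks_like_standard_roi_table` is hypothetical and stands in for `is_standard_roi_table`, which is not listed here):

```python
def looks_like_standard_roi_table(name: str) -> bool:
    """Assumption: standard ROI tables are recognised by their name."""
    return "well_ROI_table" in name or "FOV_ROI_table" in name


def select_tables(reference_tables: dict, acquisition_tables: dict) -> dict:
    """Map table name -> URL of the table that should be copied."""
    selected = {}
    # Standard ROI tables come from the reference acquisition
    for name, url in reference_tables.items():
        if looks_like_standard_roi_table(name):
            selected[name] = url
    # Everything else comes from the acquisition being processed
    for name, url in acquisition_tables.items():
        if not looks_like_standard_roi_table(name):
            selected[name] = url
    return selected


reference_tables = {
    "FOV_ROI_table": "ref/tables/FOV_ROI_table",
    "registered_FOV_ROI_table": "ref/tables/registered_FOV_ROI_table",
}
acquisition_tables = {
    "FOV_ROI_table": "acq/tables/FOV_ROI_table",
    "nuclei": "acq/tables/nuclei",
}
selected = select_tables(reference_tables, acquisition_tables)
print(selected["FOV_ROI_table"])  # ref/tables/FOV_ROI_table
print(selected["nuclei"])         # acq/tables/nuclei
```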
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
registered_roi_table
Name of the ROI table which has been registered and will be applied to mask and shift the images. Examples: registered_FOV_ROI_table => loop over the fields of view, registered_well_ROI_table => process the whole well as one image.
TYPE: str
reference_acquisition
Which acquisition to register against. Uses the OME-NGFF HCS well metadata acquisition keys to find the reference acquisition.
TYPE: int DEFAULT: 0
overwrite_input
Whether the old image data should be replaced with the newly registered image data. Currently only implemented for overwrite_input=True.
TYPE: bool DEFAULT: True
Source code in fractal_tasks_core/tasks/apply_registration_to_image.py
@validate_arguments\ndef apply_registration_to_image(\n *,\n # Fractal parameters\n zarr_url: str,\n # Core parameters\n registered_roi_table: str,\n reference_acquisition: int = 0,\n overwrite_input: bool = True,\n):\n\"\"\"\n Apply registration to images by using a registered ROI table\n\n This task consists of 4 parts:\n\n 1. Mask all regions in images that are not available in the\n registered ROI table and store each acquisition aligned to the\n reference_acquisition (by looping over ROIs).\n 2. Do the same for all label images.\n 3. Copy all tables from the non-aligned image to the aligned image\n (currently only works well if the only tables are well & FOV ROI tables\n (registered and original). Not implemented for measurement tables and\n other ROI tables).\n 4. Clean up: Delete the old, non-aligned image and rename the new,\n aligned image to take over its place.\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n registered_roi_table: Name of the ROI table which has been registered\n and will be applied to mask and shift the images.\n Examples: `registered_FOV_ROI_table` => loop over the field of\n views, `registered_well_ROI_table` => process the whole well as\n one image.\n reference_acquisition: Which acquisition to register against. Uses the\n OME-NGFF HCS well metadata acquisition keys to find the reference\n acquisition.\n overwrite_input: Whether the old image data should be replaced with the\n newly registered image data. Currently only implemented for\n `overwrite_input=True`.\n\n \"\"\"\n logger.info(zarr_url)\n logger.info(\n f\"Running `apply_registration_to_image` on {zarr_url=}, \"\n f\"{registered_roi_table=} and {reference_acquisition=}. 
\"\n f\"Using {overwrite_input=}\"\n )\n\n well_url, old_img_path = _split_well_path_image_path(zarr_url)\n new_zarr_url = f\"{well_url}/{zarr_url.split('/')[-1]}_registered\"\n # Get the zarr_url for the reference acquisition\n acq_dict = load_NgffWellMeta(well_url).get_acquisition_paths()\n if reference_acquisition not in acq_dict:\n raise ValueError(\n f\"{reference_acquisition=} was not one of the available \"\n f\"acquisitions in {acq_dict=} for well {well_url}\"\n )\n elif len(acq_dict[reference_acquisition]) > 1:\n ref_path = _get_matching_ref_acquisition_path_heuristic(\n acq_dict[reference_acquisition], old_img_path\n )\n logger.warning(\n \"Running registration when there are multiple images of the same \"\n \"acquisition in a well. Using a heuristic to match the reference \"\n f\"acquisition. Using {ref_path} as the reference image.\"\n )\n else:\n ref_path = acq_dict[reference_acquisition][0]\n reference_zarr_url = f\"{well_url}/{ref_path}\"\n\n ROI_table_ref = ad.read_zarr(\n f\"{reference_zarr_url}/tables/{registered_roi_table}\"\n )\n ROI_table_acq = ad.read_zarr(f\"{zarr_url}/tables/{registered_roi_table}\")\n\n ngff_image_meta = load_NgffImageMeta(zarr_url)\n coarsening_xy = ngff_image_meta.coarsening_xy\n num_levels = ngff_image_meta.num_levels\n\n ####################\n # Process images\n ####################\n logger.info(\"Write the registered Zarr image to disk\")\n write_registered_zarr(\n zarr_url=zarr_url,\n new_zarr_url=new_zarr_url,\n ROI_table=ROI_table_acq,\n ROI_table_ref=ROI_table_ref,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n aggregation_function=np.mean,\n )\n\n ####################\n # Process labels\n ####################\n try:\n labels_group = zarr.open_group(f\"{zarr_url}/labels\", \"r\")\n label_list = labels_group.attrs[\"labels\"]\n except (zarr.errors.GroupNotFoundError, KeyError):\n label_list = []\n\n if label_list:\n logger.info(f\"Processing the label images: {label_list}\")\n labels_group = 
zarr.group(f\"{new_zarr_url}/labels\")\n labels_group.attrs[\"labels\"] = label_list\n\n for label in label_list:\n write_registered_zarr(\n zarr_url=f\"{zarr_url}/labels/{label}\",\n new_zarr_url=f\"{new_zarr_url}/labels/{label}\",\n ROI_table=ROI_table_acq,\n ROI_table_ref=ROI_table_ref,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n aggregation_function=np.max,\n )\n\n ####################\n # Copy tables\n # 1. Copy all standard ROI tables from the reference acquisition.\n # 2. Copy all tables that aren't standard ROI tables from the given\n # acquisition.\n ####################\n table_dict_reference = _get_table_path_dict(reference_zarr_url)\n table_dict_component = _get_table_path_dict(zarr_url)\n\n table_dict = {}\n # Define which table should get copied:\n for table in table_dict_reference:\n if is_standard_roi_table(table):\n table_dict[table] = table_dict_reference[table]\n for table in table_dict_component:\n if not is_standard_roi_table(table):\n if reference_zarr_url != zarr_url:\n logger.warning(\n f\"{zarr_url} contained a table that is not a standard \"\n \"ROI table. The `Apply Registration To Image task` is \"\n \"best used before additional tables are generated. It \"\n f\"will copy the {table} from this acquisition without \"\n \"applying any transformations. This will work well if \"\n f\"{table} contains measurements. 
But if {table} is a \"\n \"custom ROI table coming from another task, the \"\n \"transformation is not applied and it will not match \"\n \"with the registered image anymore.\"\n )\n table_dict[table] = table_dict_component[table]\n\n if table_dict:\n logger.info(f\"Processing the tables: {table_dict}\")\n new_image_group = zarr.group(new_zarr_url)\n\n for table in table_dict.keys():\n logger.info(f\"Copying table: {table}\")\n # Get the relevant metadata of the Zarr table & add it\n # See issue #516 for the need for this workaround\n max_retries = 20\n sleep_time = 5\n current_round = 0\n while current_round < max_retries:\n try:\n old_table_group = zarr.open_group(\n table_dict[table], mode=\"r\"\n )\n current_round = max_retries\n except zarr.errors.GroupNotFoundError:\n logger.debug(\n f\"Table {table} not found in attempt {current_round}. \"\n f\"Waiting {sleep_time} seconds before trying again.\"\n )\n current_round += 1\n time.sleep(sleep_time)\n # Write the Zarr table\n curr_table = ad.read_zarr(table_dict[table])\n write_table(\n new_image_group,\n table,\n curr_table,\n table_attrs=old_table_group.attrs.asdict(),\n overwrite=True,\n )\n\n ####################\n # Clean up Zarr file\n ####################\n if overwrite_input:\n logger.info(\n \"Replace original zarr image with the newly created Zarr image\"\n )\n # Potential for race conditions: Every acquisition reads the\n # reference acquisition, but the reference acquisition also gets\n # modified\n # See issue #516 for the details\n os.rename(zarr_url, f\"{zarr_url}_tmp\")\n os.rename(new_zarr_url, zarr_url)\n shutil.rmtree(f\"{zarr_url}_tmp\")\n image_list_updates = dict(image_list_updates=[dict(zarr_url=zarr_url)])\n else:\n image_list_updates = dict(\n image_list_updates=[dict(zarr_url=new_zarr_url, origin=zarr_url)]\n )\n # Update the metadata of the the well\n well_url, new_img_path = _split_well_path_image_path(new_zarr_url)\n _update_well_metadata(\n well_url=well_url,\n 
old_image_path=old_img_path,\n new_image_path=new_img_path,\n )\n\n return image_list_updates\n
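The clean-up branch for `overwrite_input=True` is a three-step directory swap. The following sketch reproduces only that swap on throwaway directories, using the standard library; the function name `replace_directory` and the file `marker.txt` are made up for the example.

```python
import os
import shutil
import tempfile


def replace_directory(original: str, replacement: str) -> None:
    """Let replacement take over the path of original."""
    backup = f"{original}_tmp"
    os.rename(original, backup)        # 1. move the old data out of the way
    os.rename(replacement, original)   # 2. the new data takes over the old path
    shutil.rmtree(backup)              # 3. delete the old data


# Stand-ins for a well folder with an image "1" and its registered copy
well = tempfile.mkdtemp()
old_image = os.path.join(well, "1")
new_image = os.path.join(well, "1_registered")
os.mkdir(old_image)
os.mkdir(new_image)
with open(os.path.join(new_image, "marker.txt"), "w") as f:
    f.write("registered")

replace_directory(old_image, new_image)
remaining = sorted(os.listdir(well))
has_marker = os.path.exists(os.path.join(well, "1", "marker.txt"))
shutil.rmtree(well)  # remove the throwaway well again

print(remaining)   # ['1']
print(has_marker)  # True
```

Between steps 1 and 2 the original path does not exist. That short window is one reason why other acquisitions reading the reference image at the same time can run into the race condition mentioned in the source comments.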
Write registered zarr array based on ROI tables
This function loads the image or label data from a zarr array based on the ROI bounding-box coordinates and stores them into a new zarr array. The new Zarr array has the same shape as the original array, but will have 0s where the ROI tables don't specify loading of the image data. The ROIs loaded from list_indices will be written into the list_indices_ref position, thus performing translational registration if the two lists of ROI indices vary.
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be used as the basis for the new OME-Zarr image.
TYPE: str
new_zarr_url
Path or url to the new OME-Zarr image to be written
TYPE: str
ROI_table
Fractal ROI table for the component
TYPE: AnnData
ROI_table_ref
Fractal ROI table for the reference acquisition
TYPE: AnnData
num_levels
Number of pyramid layers to be created (argument of build_pyramid).
TYPE: int
coarsening_xy
Coarsening factor between pyramid levels
TYPE: int DEFAULT: 2
aggregation_function
Function to be used when downsampling (argument of build_pyramid).
TYPE: Callable DEFAULT: mean
Source code in fractal_tasks_core/tasks/apply_registration_to_image.py
def write_registered_zarr(\n zarr_url: str,\n new_zarr_url: str,\n ROI_table: ad.AnnData,\n ROI_table_ref: ad.AnnData,\n num_levels: int,\n coarsening_xy: int = 2,\n aggregation_function: Callable = np.mean,\n):\n\"\"\"\n Write registered zarr array based on ROI tables\n\n This function loads the image or label data from a zarr array based on the\n ROI bounding-box coordinates and stores them into a new zarr array.\n The new Zarr array has the same shape as the original array, but will have\n 0s where the ROI tables don't specify loading of the image data.\n The ROIs loaded from `list_indices` will be written into the\n `list_indices_ref` position, thus performing translational registration if\n the two lists of ROI indices vary.\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be used as\n the basis for the new OME-Zarr image.\n new_zarr_url: Path or url to the new OME-Zarr image to be written\n ROI_table: Fractal ROI table for the component\n ROI_table_ref: Fractal ROI table for the reference acquisition\n num_levels: Number of pyramid layers to be created (argument of\n `build_pyramid`).\n coarsening_xy: Coarsening factor between pyramid levels\n aggregation_function: Function to be used when downsampling (argument\n of `build_pyramid`).\n\n \"\"\"\n # Read pixel sizes from Zarr attributes\n ngff_image_meta = load_NgffImageMeta(zarr_url)\n pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n\n # Create list of indices for 3D ROIs\n list_indices = convert_ROI_table_to_indices(\n ROI_table,\n level=0,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n list_indices_ref = convert_ROI_table_to_indices(\n ROI_table_ref,\n level=0,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n\n old_image_group = zarr.open_group(zarr_url, mode=\"r\")\n old_ngff_image_meta = load_NgffImageMeta(zarr_url)\n new_image_group = zarr.group(new_zarr_url)\n 
new_image_group.attrs.put(old_image_group.attrs.asdict())\n\n # Loop over all channels. For each channel, write full-res image data.\n data_array = da.from_zarr(old_image_group[\"0\"])\n # Create dask array with 0s of same shape\n new_array = da.zeros_like(data_array)\n\n # TODO: Add sanity checks on the 2 ROI tables:\n # 1. The number of ROIs need to match\n # 2. The size of the ROIs need to match\n # (otherwise, we can't assign them to the reference regions)\n # ROI_table_ref vs ROI_table_acq\n for i, roi_indices in enumerate(list_indices):\n reference_region = convert_indices_to_regions(list_indices_ref[i])\n region = convert_indices_to_regions(roi_indices)\n\n axes_list = old_ngff_image_meta.axes_names\n\n if axes_list == [\"c\", \"z\", \"y\", \"x\"]:\n num_channels = data_array.shape[0]\n # Loop over channels\n for ind_ch in range(num_channels):\n idx = tuple(\n [slice(ind_ch, ind_ch + 1)] + list(reference_region)\n )\n new_array[idx] = load_region(\n data_zyx=data_array[ind_ch], region=region, compute=False\n )\n elif axes_list == [\"z\", \"y\", \"x\"]:\n new_array[reference_region] = load_region(\n data_zyx=data_array, region=region, compute=False\n )\n elif axes_list == [\"c\", \"y\", \"x\"]:\n # TODO: Implement cyx case (based on looping over xy case)\n raise NotImplementedError(\n \"`write_registered_zarr` has not been implemented for \"\n f\"a zarr with {axes_list=}\"\n )\n elif axes_list == [\"y\", \"x\"]:\n # TODO: Implement yx case\n raise NotImplementedError(\n \"`write_registered_zarr` has not been implemented for \"\n f\"a zarr with {axes_list=}\"\n )\n else:\n raise NotImplementedError(\n \"`write_registered_zarr` has not been implemented for \"\n f\"a zarr with {axes_list=}\"\n )\n\n new_array.to_zarr(\n f\"{new_zarr_url}/0\",\n overwrite=True,\n dimension_separator=\"/\",\n write_empty_chunks=False,\n )\n\n # Starting from on-disk highest-resolution data, build and write to\n # disk a pyramid of coarser levels\n build_pyramid(\n 
zarrurl=new_zarr_url,\n overwrite=True,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=data_array.chunksize,\n aggregation_function=aggregation_function,\n )\n
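The core idea of the function, reading each ROI at its own position and writing it at the position of the matching reference ROI, can be shown on a small 2D NumPy array. This is a simplified sketch under stated assumptions: regions are given directly as tuples of slices, there is no channel axis, no dask and no pyramid, and the function name `place_rois` is hypothetical.

```python
import numpy as np


def place_rois(data, regions, reference_regions):
    """Copy each ROI of data into the slot given by the matching reference ROI."""
    if len(regions) != len(reference_regions):
        raise ValueError("Both ROI lists need to have the same length.")
    # Everything that is not covered by a reference ROI stays 0
    registered = np.zeros_like(data)
    for region, reference_region in zip(regions, reference_regions):
        registered[reference_region] = data[region]
    return registered


image = np.arange(1, 17).reshape(4, 4)
# The ROI sits at rows 1-2, columns 1-2 in this acquisition ...
regions = [(slice(1, 3), slice(1, 3))]
# ... and at rows 0-1, columns 0-1 in the reference acquisition
reference_regions = [(slice(0, 2), slice(0, 2))]

registered = place_rois(image, regions, reference_regions)
print(registered)
```

The length check corresponds to the first sanity check listed in the TODO of the source code; ROIs of different sizes would still fail at assignment time with a NumPy shape error.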
Calculate registration based on images
This task consists of 3 parts:
1. Loading the images of a given ROI (=> loop over ROIs)
2. Calculating the transformation for that ROI
3. Storing the calculated transformation in the ROI table
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
init_args
Initialization arguments provided by image_based_registration_hcs_init. They contain the reference_zarr_url that is used for registration. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: InitArgsRegistration
wavelength_id
Wavelength that will be used for image-based registration; e.g. A01_C01 for Yokogawa, C01 for MD.
TYPE: str
roi_table
Name of the ROI table over which the task loops to calculate the registration. Examples: FOV_ROI_table => loop over the fields of view, well_ROI_table => process the whole well as one image.
TYPE: str DEFAULT: 'FOV_ROI_table'
level
Pyramid level of the image to be used for registration. Choose 0 to process at full resolution.
TYPE: int DEFAULT: 2
Source code in fractal_tasks_core/tasks/calculate_registration_image_based.py
@validate_arguments\ndef calculate_registration_image_based(\n *,\n # Fractal arguments\n zarr_url: str,\n init_args: InitArgsRegistration,\n # Core parameters\n wavelength_id: str,\n roi_table: str = \"FOV_ROI_table\",\n level: int = 2,\n) -> None:\n\"\"\"\n Calculate registration based on images\n\n This task consists of 3 parts:\n\n 1. Loading the images of a given ROI (=> loop over ROIs)\n 2. Calculating the transformation for that ROI\n 3. Storing the calculated transformation in the ROI table\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n init_args: Intialization arguments provided by\n `image_based_registration_hcs_init`. They contain the\n reference_zarr_url that is used for registration.\n (standard argument for Fractal tasks, managed by Fractal server).\n wavelength_id: Wavelength that will be used for image-based\n registration; e.g. `A01_C01` for Yokogawa, `C01` for MD.\n roi_table: Name of the ROI table over which the task loops to\n calculate the registration. 
Examples: `FOV_ROI_table` => loop over\n the field of views, `well_ROI_table` => process the whole well as\n one image.\n level: Pyramid level of the image to be used for registration.\n Choose `0` to process at full resolution.\n\n \"\"\"\n logger.info(\n f\"Running for {zarr_url=}.\\n\"\n f\"Calculating translation registration per {roi_table=} for \"\n f\"{wavelength_id=}.\"\n )\n\n init_args.reference_zarr_url = init_args.reference_zarr_url\n\n # Read some parameters from Zarr metadata\n ngff_image_meta = load_NgffImageMeta(str(init_args.reference_zarr_url))\n coarsening_xy = ngff_image_meta.coarsening_xy\n\n # Get channel_index via wavelength_id.\n # Intially only allow registration of the same wavelength\n channel_ref: OmeroChannel = get_channel_from_image_zarr(\n image_zarr_path=init_args.reference_zarr_url,\n wavelength_id=wavelength_id,\n )\n channel_index_ref = channel_ref.index\n\n channel_align: OmeroChannel = get_channel_from_image_zarr(\n image_zarr_path=zarr_url,\n wavelength_id=wavelength_id,\n )\n channel_index_align = channel_align.index\n\n # Lazily load zarr array\n data_reference_zyx = da.from_zarr(\n f\"{init_args.reference_zarr_url}/{level}\"\n )[channel_index_ref]\n data_alignment_zyx = da.from_zarr(f\"{zarr_url}/{level}\")[\n channel_index_align\n ]\n\n # Read ROIs\n ROI_table_ref = ad.read_zarr(\n f\"{init_args.reference_zarr_url}/tables/{roi_table}\"\n )\n ROI_table_x = ad.read_zarr(f\"{zarr_url}/tables/{roi_table}\")\n logger.info(\n f\"Found {len(ROI_table_x)} ROIs in {roi_table=} to be processed.\"\n )\n\n # Check that table type of ROI_table_ref is valid. 
Note that\n # \"ngff:region_table\" and None are accepted for backwards compatibility\n valid_table_types = [\n \"roi_table\",\n \"masking_roi_table\",\n \"ngff:region_table\",\n None,\n ]\n ROI_table_ref_group = zarr.open_group(\n f\"{init_args.reference_zarr_url}/tables/{roi_table}\",\n mode=\"r\",\n )\n ref_table_attrs = ROI_table_ref_group.attrs.asdict()\n ref_table_type = ref_table_attrs.get(\"type\")\n if ref_table_type not in valid_table_types:\n raise ValueError(\n (\n f\"Table '{roi_table}' (with type '{ref_table_type}') is \"\n \"not a valid ROI table.\"\n )\n )\n\n # For each acquisition, get the relevant info\n # TODO: Add additional checks on ROIs?\n if (ROI_table_ref.obs.index != ROI_table_x.obs.index).all():\n raise ValueError(\n \"Registration is only implemented for ROIs that match between the \"\n \"acquisitions (e.g. well, FOV ROIs). Here, the ROIs in the \"\n f\"reference acquisitions were {ROI_table_ref.obs.index}, but the \"\n f\"ROIs in the alignment acquisition were {ROI_table_x.obs.index}\"\n )\n # TODO: Make this less restrictive? i.e. could we also run it if different\n # acquisitions have different FOVs? 
But then how do we know which FOVs to\n # match?\n # If we relax this, downstream assumptions on matching based on order\n # in the list will break.\n\n # Read pixel sizes from zarr attributes\n ngff_image_meta_acq_x = load_NgffImageMeta(zarr_url)\n pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n pxl_sizes_zyx_acq_x = ngff_image_meta_acq_x.get_pixel_sizes_zyx(level=0)\n\n if pxl_sizes_zyx != pxl_sizes_zyx_acq_x:\n raise ValueError(\n \"Pixel sizes need to be equal between acquisitions for \"\n \"registration.\"\n )\n\n # Create list of indices for 3D ROIs spanning the entire Z direction\n list_indices_ref = convert_ROI_table_to_indices(\n ROI_table_ref,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices_ref, roi_table)\n\n list_indices_acq_x = convert_ROI_table_to_indices(\n ROI_table_x,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices_acq_x, roi_table)\n\n num_ROIs = len(list_indices_ref)\n compute = True\n new_shifts = {}\n for i_ROI in range(num_ROIs):\n logger.info(\n f\"Now processing ROI {i_ROI+1}/{num_ROIs} \"\n f\"for channel {channel_align}.\"\n )\n img_ref = load_region(\n data_zyx=data_reference_zyx,\n region=convert_indices_to_regions(list_indices_ref[i_ROI]),\n compute=compute,\n )\n img_acq_x = load_region(\n data_zyx=data_alignment_zyx,\n region=convert_indices_to_regions(list_indices_acq_x[i_ROI]),\n compute=compute,\n )\n\n ##############\n # Calculate the transformation\n ##############\n # Basic version (no padding, no internal binning)\n if img_ref.shape != img_acq_x.shape:\n raise NotImplementedError(\n \"This registration is not implemented for ROIs with \"\n \"different shapes between acquisitions.\"\n )\n shifts = phase_cross_correlation(\n np.squeeze(img_ref), np.squeeze(img_acq_x)\n )[0]\n\n # Registration based on scmultiplex, image-based\n # shifts, _, _ = 
calculate_shift(np.squeeze(img_ref),\n # np.squeeze(img_acq_x), bin=binning, binarize=False)\n\n # TODO: Make this work on label images\n # (=> different loading) etc.\n\n ##############\n # Storing the calculated transformation ###\n ##############\n # Store the shift in ROI table\n # TODO: Store in OME-NGFF transformations: Check SpatialData approach,\n # per ROI storage?\n\n # Adapt ROIs for the given ROI table:\n ROI_name = ROI_table_ref.obs.index[i_ROI]\n new_shifts[ROI_name] = calculate_physical_shifts(\n shifts,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n\n # Write physical shifts to disk (as part of the ROI table)\n logger.info(f\"Updating the {roi_table=} with translation columns\")\n image_group = zarr.group(zarr_url)\n new_ROI_table = get_ROI_table_with_translation(ROI_table_x, new_shifts)\n write_table(\n image_group,\n roi_table,\n new_ROI_table,\n overwrite=True,\n table_attrs=ref_table_attrs,\n )\n
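The shifts returned by `phase_cross_correlation` are in pixels of the pyramid level that was loaded, while the ROI table stores physical units. `calculate_physical_shifts` is not listed in this section, so the sketch below only illustrates the kind of conversion involved for a 2D (y, x) shift, assuming that the pixel size at a given level is the full-resolution pixel size multiplied by `coarsening_xy**level`. The function name `physical_shift_yx` is hypothetical.

```python
def physical_shift_yx(
    shift_yx: tuple[float, float],
    level: int,
    coarsening_xy: int,
    full_res_pxl_sizes_zyx: tuple[float, float, float],
) -> tuple[float, float]:
    """Convert a (y, x) pixel shift measured at `level` into physical units."""
    # Assumption: each pyramid level coarsens y and x by coarsening_xy
    scale = coarsening_xy**level
    _, pxl_size_y, pxl_size_x = full_res_pxl_sizes_zyx
    return (
        shift_yx[0] * pxl_size_y * scale,
        shift_yx[1] * pxl_size_x * scale,
    )


# A shift of (3, -2) pixels at level 2 with 0.5 units per full-resolution pixel
print(physical_shift_yx((3, -2), 2, 2, (1.0, 0.5, 0.5)))  # (6.0, -4.0)
```

Running the registration at a coarser level is faster, but one pixel at level 2 already corresponds to 4 full-resolution pixels with the default coarsening factor, which limits the precision of the estimated shift.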
Run cellpose segmentation on the ROIs of a single OME-Zarr image.
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
level
Pyramid level of the image to be segmented. Choose 0 to process at full resolution.
TYPE: int
channel
Primary channel for segmentation; requires either wavelength_id (e.g. A01_C01) or label (e.g. DAPI).
TYPE: ChannelInputModel
channel2
Second channel for segmentation (in the same format as channel). If specified, cellpose runs in dual channel mode. For dual channel segmentation of cells, the first channel should contain the membrane marker, the second channel should contain the nuclear marker.
TYPE: Optional[ChannelInputModel] DEFAULT: None
input_ROI_table
Name of the ROI table over which the task loops to apply Cellpose segmentation. Examples: FOV_ROI_table => loop over the fields of view, organoid_ROI_table => loop over the organoid ROI table (generated by another task), well_ROI_table => process the whole well as one image.
TYPE: str DEFAULT: 'FOV_ROI_table'
output_ROI_table
If provided, a ROI table with that name is created, which will contain the bounding boxes of the newly segmented labels. ROI tables should have ROI in their name.
TYPE: Optional[str] DEFAULT: None
use_masks
If True, try to use masked loading and fall back to use_masks=False if the ROI table is not suitable. Masked loading is relevant when only a subset of the bounding box should actually be processed (e.g. running within organoid_ROI_table).
TYPE: bool DEFAULT: True
output_label_name
Name of the output label image (e.g. "organoids").
TYPE: Optional[str] DEFAULT: None
relabeling
If True, apply relabeling so that label values are unique for all objects in the well.
TYPE: bool DEFAULT: True
diameter_level0
Expected diameter of the objects that should be segmented in pixels at level 0. Initial diameter is rescaled using the level that was selected. The rescaled value is passed as the diameter to the CellposeModel.eval method.
TYPE: float DEFAULT: 30.0
model_type
Parameter of CellposeModel class. Defines which model should be used. Typical choices are nuclei, cyto, cyto2, etc.
TYPE: Literal[tuple(models.MODEL_NAMES)] DEFAULT: 'cyto2'
pretrained_model
Parameter of CellposeModel class (takes precedence over model_type). Allows you to specify the path of a custom trained cellpose model.
TYPE: Optional[str] DEFAULT: None
cellprob_threshold
Parameter of CellposeModel.eval method. Valid values are between -6 and 6. From Cellpose documentation: "Decrease this threshold if cellpose is not returning as many ROIs as you’d expect. Similarly, increase this threshold if cellpose is returning too many ROIs, particularly from dim areas."
TYPE: float DEFAULT: 0.0
flow_threshold
Parameter of CellposeModel.eval method. Valid values between 0.0 and 1.0. From Cellpose documentation: "Increase this threshold if cellpose is not returning as many ROIs as you’d expect. Similarly, decrease this threshold if cellpose is returning too many ill-shaped ROIs."
TYPE: float DEFAULT: 0.4
normalize
By default, data is normalized so 0.0=1st percentile and 1.0=99th percentile of image intensities in each channel. This automatic normalization can lead to issues when the image to be segmented is very sparse. You can turn off the default rescaling. With the "custom" option, you can either provide your own rescaling percentiles or fixed rescaling upper and lower bound integers.
TYPE: CellposeCustomNormalizer DEFAULT: CellposeCustomNormalizer()
anisotropy
Ratio of the pixel sizes along Z and XY axis (ignored if the image is not three-dimensional). If None, it is inferred from the OME-NGFF metadata.
TYPE: Optional[float] DEFAULT: None
min_size
Parameter of CellposeModel class. Minimum size of the segmented objects (in pixels). Use -1 to turn off the size filter.
TYPE: int DEFAULT: 15
augment
Parameter of CellposeModel class. Whether to use cellpose augmentation to tile images with overlap.
TYPE: bool DEFAULT: False
net_avg
Parameter of CellposeModel class. Whether to use cellpose net averaging to run the 4 built-in networks (useful for nuclei, cyto and cyto2, not sure it works for the others).
TYPE: bool DEFAULT: False
use_gpu
If False, always use the CPU; if True, use the GPU if possible (as defined in cellpose.core.use_gpu()) and fall back to the CPU otherwise.
TYPE: bool DEFAULT: True
batch_size
Number of 224x224 patches to run simultaneously on the GPU (can be made smaller or bigger depending on GPU memory usage).
TYPE: int DEFAULT: 8
invert
Invert image pixel intensity before running the network (if True, the image is also normalized).
TYPE: bool DEFAULT: False
tile
Tile the image to ensure that GPU/CPU memory usage stays limited (recommended).
TYPE: bool DEFAULT: True
tile_overlap
Fraction of overlap of tiles when computing flows.
TYPE: float DEFAULT: 0.1
resample
Run dynamics at original image size (slower, but creates more accurate boundaries).
TYPE: bool DEFAULT: True
interp
Interpolate during 2D dynamics (not available in 3D). In previous versions it was False; now it defaults to True.
TYPE: bool DEFAULT: True
stitch_threshold
If stitch_threshold>0.0 and not do_3D and equal image sizes, masks are stitched in 3D to return volume segmentation.
TYPE: float DEFAULT: 0.0
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: True
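As a rough illustration of how two of the parameters above depend on the pyramid level, the sketch below mirrors the arithmetic used in the task source: the diameter passed to Cellpose is `diameter_level0 / coarsening_xy**level`, and when `anisotropy` is None (and the data is 3D) it is computed as the Z pixel size divided by the X pixel size at the selected level. The helper names and the example pixel sizes are hypothetical and only serve the illustration.

```python
from typing import Optional, Sequence


def rescaled_diameter(diameter_level0: float, coarsening_xy: int, level: int) -> float:
    """Diameter in pixels at `level`, given the diameter at level 0."""
    return diameter_level0 / coarsening_xy**level


def resolve_anisotropy(
    pxl_sizes_zyx: Sequence[float], anisotropy: Optional[float] = None
) -> float:
    """Keep a user-provided anisotropy, otherwise use pixel_size_z / pixel_size_x."""
    if anisotropy is not None:
        return anisotropy
    return pxl_sizes_zyx[0] / pxl_sizes_zyx[2]


# Hypothetical example values (not taken from any real dataset)
diameter = rescaled_diameter(30.0, coarsening_xy=2, level=2)
anisotropy = resolve_anisotropy([1.0, 0.65, 0.65])
explicit = resolve_anisotropy([1.0, 0.65, 0.65], anisotropy=2.0)
```

With a coarsening factor of 2, each additional level halves the expected diameter, so a 30-pixel object at level 0 is treated as a 7.5-pixel object at level 2.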
Source code in fractal_tasks_core/tasks/cellpose_segmentation.py
@validate_arguments\ndef cellpose_segmentation(\n *,\n # Fractal parameters\n zarr_url: str,\n # Core parameters\n level: int,\n channel: ChannelInputModel,\n channel2: Optional[ChannelInputModel] = None,\n input_ROI_table: str = \"FOV_ROI_table\",\n output_ROI_table: Optional[str] = None,\n output_label_name: Optional[str] = None,\n # Cellpose-related arguments\n diameter_level0: float = 30.0,\n # https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/401 # noqa E501\n model_type: Literal[tuple(models.MODEL_NAMES)] = \"cyto2\",\n pretrained_model: Optional[str] = None,\n # Advanced parameters\n cellprob_threshold: float = 0.0,\n flow_threshold: float = 0.4,\n normalize: CellposeCustomNormalizer = CellposeCustomNormalizer(),\n anisotropy: Optional[float] = None,\n min_size: int = 15,\n augment: bool = False,\n net_avg: bool = False,\n use_gpu: bool = True,\n batch_size: int = 8,\n invert: bool = False,\n tile: bool = True,\n tile_overlap: float = 0.1,\n resample: bool = True,\n interp: bool = True,\n stitch_threshold: float = 0.0,\n use_masks: bool = True,\n relabeling: bool = True,\n overwrite: bool = True,\n) -> None:\n\"\"\"\n Run cellpose segmentation on the ROIs of a single OME-Zarr image.\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n level: Pyramid level of the image to be segmented. Choose `0` to\n process at full resolution.\n channel: Primary channel for segmentation; requires either\n `wavelength_id` (e.g. `A01_C01`) or `label` (e.g. `DAPI`).\n channel2: Second channel for segmentation (in the same format as\n `channel`). If specified, cellpose runs in dual channel mode.\n For dual channel segmentation of cells, the first channel should\n contain the membrane marker, the second channel should contain the\n nuclear marker.\n input_ROI_table: Name of the ROI table over which the task loops to\n apply Cellpose segmentation. 
Examples: `FOV_ROI_table` => loop over\n the field of views, `organoid_ROI_table` => loop over the organoid\n ROI table (generated by another task), `well_ROI_table` => process\n the whole well as one image.\n output_ROI_table: If provided, a ROI table with that name is created,\n which will contain the bounding boxes of the newly segmented\n labels. ROI tables should have `ROI` in their name.\n use_masks: If `True`, try to use masked loading and fall back to\n `use_masks=False` if the ROI table is not suitable. Masked\n loading is relevant when only a subset of the bounding box should\n actually be processed (e.g. running within `organoid_ROI_table`).\n output_label_name: Name of the output label image (e.g. `\"organoids\"`).\n relabeling: If `True`, apply relabeling so that label values are\n unique for all objects in the well.\n diameter_level0: Expected diameter of the objects that should be\n segmented in pixels at level 0. Initial diameter is rescaled using\n the `level` that was selected. The rescaled value is passed as\n the diameter to the `CellposeModel.eval` method.\n model_type: Parameter of `CellposeModel` class. Defines which model\n should be used. Typical choices are `nuclei`, `cyto`, `cyto2`, etc.\n pretrained_model: Parameter of `CellposeModel` class (takes\n precedence over `model_type`). Allows you to specify the path of\n a custom trained cellpose model.\n cellprob_threshold: Parameter of `CellposeModel.eval` method. Valid\n values between -6 to 6. From Cellpose documentation: \"Decrease this\n threshold if cellpose is not returning as many ROIs as you\u2019d\n expect. Similarly, increase this threshold if cellpose is returning\n too ROIs particularly from dim areas.\"\n flow_threshold: Parameter of `CellposeModel.eval` method. Valid\n values between 0.0 and 1.0. From Cellpose documentation: \"Increase\n this threshold if cellpose is not returning as many ROIs as you\u2019d\n expect. 
Similarly, decrease this threshold if cellpose is returning\n too many ill-shaped ROIs.\"\n normalize: By default, data is normalized so 0.0=1st percentile and\n 1.0=99th percentile of image intensities in each channel.\n This automatic normalization can lead to issues when the image to\n be segmented is very sparse. You can turn off the default\n rescaling. With the \"custom\" option, you can either provide your\n own rescaling percentiles or fixed rescaling upper and lower\n bound integers.\n anisotropy: Ratio of the pixel sizes along Z and XY axis (ignored if\n the image is not three-dimensional). If `None`, it is inferred from\n the OME-NGFF metadata.\n min_size: Parameter of `CellposeModel` class. Minimum size of the\n segmented objects (in pixels). Use `-1` to turn off the size\n filter.\n augment: Parameter of `CellposeModel` class. Whether to use cellpose\n augmentation to tile images with overlap.\n net_avg: Parameter of `CellposeModel` class. Whether to use cellpose\n net averaging to run the 4 built-in networks (useful for `nuclei`,\n `cyto` and `cyto2`, not sure it works for the others).\n use_gpu: If `False`, always use the CPU; if `True`, use the GPU if\n possible (as defined in `cellpose.core.use_gpu()`) and fall-back\n to the CPU otherwise.\n batch_size: number of 224x224 patches to run simultaneously on the GPU\n (can make smaller or bigger depending on GPU memory usage)\n invert: invert image pixel intensity before running network (if True,\n image is also normalized)\n tile: tiles image to ensure GPU/CPU memory usage limited (recommended)\n tile_overlap: fraction of overlap of tiles when computing flows\n resample: run dynamics at original image size (will be slower but\n create more accurate boundaries)\n interp: interpolate during 2D dynamics (not available in 3D)\n (in previous versions it was False, now it defaults to True)\n stitch_threshold: if stitch_threshold>0.0 and not do_3D and equal\n image sizes, masks are stitched in 3D to return 
volume segmentation\n overwrite: If `True`, overwrite the task output.\n \"\"\"\n\n # Set input path\n logger.info(f\"{zarr_url=}\")\n\n # Preliminary checks on Cellpose model\n if pretrained_model:\n if not os.path.exists(pretrained_model):\n raise ValueError(f\"{pretrained_model=} does not exist.\")\n\n # Read attributes from NGFF metadata\n ngff_image_meta = load_NgffImageMeta(zarr_url)\n num_levels = ngff_image_meta.num_levels\n coarsening_xy = ngff_image_meta.coarsening_xy\n full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n actual_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=level)\n logger.info(f\"NGFF image has {num_levels=}\")\n logger.info(f\"NGFF image has {coarsening_xy=}\")\n logger.info(\n f\"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}\"\n )\n logger.info(\n f\"NGFF image has level-{level} pixel sizes \"\n f\"{actual_res_pxl_sizes_zyx}\"\n )\n\n # Find channel index\n try:\n tmp_channel: OmeroChannel = get_channel_from_image_zarr(\n image_zarr_path=zarr_url,\n wavelength_id=channel.wavelength_id,\n label=channel.label,\n )\n except ChannelNotFoundError as e:\n logger.warning(\n \"Channel not found, exit from the task.\\n\"\n f\"Original error: {str(e)}\"\n )\n return None\n ind_channel = tmp_channel.index\n\n # Find channel index for second channel, if one is provided\n if channel2:\n try:\n tmp_channel_c2: OmeroChannel = get_channel_from_image_zarr(\n image_zarr_path=zarr_url,\n wavelength_id=channel2.wavelength_id,\n label=channel2.label,\n )\n except ChannelNotFoundError as e:\n logger.warning(\n f\"Second channel with wavelength_id: {channel2.wavelength_id} \"\n f\"and label: {channel2.label} not found, exit from the task.\\n\"\n f\"Original error: {str(e)}\"\n )\n return None\n ind_channel_c2 = tmp_channel_c2.index\n\n # Set channel label\n if output_label_name is None:\n try:\n channel_label = tmp_channel.label\n output_label_name = f\"label_{channel_label}\"\n except (KeyError, 
IndexError):\n output_label_name = f\"label_{ind_channel}\"\n\n # Load ZYX data\n data_zyx = da.from_zarr(f\"{zarr_url}/{level}\")[ind_channel]\n logger.info(f\"{data_zyx.shape=}\")\n if channel2:\n data_zyx_c2 = da.from_zarr(f\"{zarr_url}/{level}\")[ind_channel_c2]\n logger.info(f\"Second channel: {data_zyx_c2.shape=}\")\n\n # Read ROI table\n ROI_table_path = f\"{zarr_url}/tables/{input_ROI_table}\"\n ROI_table = ad.read_zarr(ROI_table_path)\n\n # Perform some checks on the ROI table\n valid_ROI_table = is_ROI_table_valid(\n table_path=ROI_table_path, use_masks=use_masks\n )\n if use_masks and not valid_ROI_table:\n logger.info(\n f\"ROI table at {ROI_table_path} cannot be used for masked \"\n \"loading. Set use_masks=False.\"\n )\n use_masks = False\n logger.info(f\"{use_masks=}\")\n\n # Create list of indices for 3D ROIs spanning the entire Z direction\n list_indices = convert_ROI_table_to_indices(\n ROI_table,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices, input_ROI_table)\n\n # If we are not planning to use masked loading, fail for overlapping ROIs\n if not use_masks:\n overlap = find_overlaps_in_ROI_indices(list_indices)\n if overlap:\n raise ValueError(\n f\"ROI indices created from {input_ROI_table} table have \"\n \"overlaps, but we are not using masked loading.\"\n )\n\n # Select 2D/3D behavior and set some parameters\n do_3D = data_zyx.shape[0] > 1 and len(data_zyx.shape) == 3\n if do_3D:\n if anisotropy is None:\n # Compute anisotropy as pixel_size_z/pixel_size_x\n anisotropy = (\n actual_res_pxl_sizes_zyx[0] / actual_res_pxl_sizes_zyx[2]\n )\n logger.info(f\"Anisotropy: {anisotropy}\")\n\n # Rescale datasets (only relevant for level>0)\n if ngff_image_meta.axes_names[0] != \"c\":\n raise ValueError(\n \"Cannot set `remove_channel_axis=True` for multiscale \"\n f\"metadata with axes={ngff_image_meta.axes_names}. 
\"\n 'First axis should have name \"c\".'\n )\n new_datasets = rescale_datasets(\n datasets=[ds.dict() for ds in ngff_image_meta.datasets],\n coarsening_xy=coarsening_xy,\n reference_level=level,\n remove_channel_axis=True,\n )\n\n label_attrs = {\n \"image-label\": {\n \"version\": __OME_NGFF_VERSION__,\n \"source\": {\"image\": \"../../\"},\n },\n \"multiscales\": [\n {\n \"name\": output_label_name,\n \"version\": __OME_NGFF_VERSION__,\n \"axes\": [\n ax.dict()\n for ax in ngff_image_meta.multiscale.axes\n if ax.type != \"channel\"\n ],\n \"datasets\": new_datasets,\n }\n ],\n }\n\n image_group = zarr.group(zarr_url)\n label_group = prepare_label_group(\n image_group,\n output_label_name,\n overwrite=overwrite,\n label_attrs=label_attrs,\n logger=logger,\n )\n\n logger.info(\n f\"Helper function `prepare_label_group` returned {label_group=}\"\n )\n logger.info(f\"Output label path: {zarr_url}/labels/{output_label_name}/0\")\n store = zarr.storage.FSStore(f\"{zarr_url}/labels/{output_label_name}/0\")\n label_dtype = np.uint32\n\n # Ensure that all output shapes & chunks are 3D (for 2D data: (1, y, x))\n # https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/398\n shape = data_zyx.shape\n if len(shape) == 2:\n shape = (1, *shape)\n chunks = data_zyx.chunksize\n if len(chunks) == 2:\n chunks = (1, *chunks)\n mask_zarr = zarr.create(\n shape=shape,\n chunks=chunks,\n dtype=label_dtype,\n store=store,\n overwrite=False,\n dimension_separator=\"/\",\n )\n\n logger.info(\n f\"mask will have shape {data_zyx.shape} \"\n f\"and chunks {data_zyx.chunks}\"\n )\n\n # Initialize cellpose\n gpu = use_gpu and cellpose.core.use_gpu()\n if pretrained_model:\n model = models.CellposeModel(\n gpu=gpu, pretrained_model=pretrained_model\n )\n else:\n model = models.CellposeModel(gpu=gpu, model_type=model_type)\n\n # Initialize other things\n logger.info(f\"Start cellpose_segmentation task for {zarr_url}\")\n logger.info(f\"relabeling: {relabeling}\")\n 
logger.info(f\"do_3D: {do_3D}\")\n logger.info(f\"use_gpu: {gpu}\")\n logger.info(f\"level: {level}\")\n logger.info(f\"model_type: {model_type}\")\n logger.info(f\"pretrained_model: {pretrained_model}\")\n logger.info(f\"anisotropy: {anisotropy}\")\n logger.info(\"Total well shape/chunks:\")\n logger.info(f\"{data_zyx.shape}\")\n logger.info(f\"{data_zyx.chunks}\")\n if channel2:\n logger.info(\"Dual channel input for cellpose model\")\n logger.info(f\"{data_zyx_c2.shape}\")\n logger.info(f\"{data_zyx_c2.chunks}\")\n\n # Counters for relabeling\n if relabeling:\n num_labels_tot = 0\n\n # Iterate over ROIs\n num_ROIs = len(list_indices)\n\n if output_ROI_table:\n bbox_dataframe_list = []\n\n logger.info(f\"Now starting loop over {num_ROIs} ROIs\")\n for i_ROI, indices in enumerate(list_indices):\n # Define region\n s_z, e_z, s_y, e_y, s_x, e_x = indices[:]\n region = (\n slice(s_z, e_z),\n slice(s_y, e_y),\n slice(s_x, e_x),\n )\n logger.info(f\"Now processing ROI {i_ROI+1}/{num_ROIs}\")\n\n # Prepare single-channel or dual-channel input for cellpose\n if channel2:\n # Dual channel mode, first channel is the membrane channel\n img_1 = load_region(\n data_zyx,\n region,\n compute=True,\n return_as_3D=True,\n )\n img_np = np.zeros((2, *img_1.shape))\n img_np[0, :, :, :] = img_1\n img_np[1, :, :, :] = load_region(\n data_zyx_c2,\n region,\n compute=True,\n return_as_3D=True,\n )\n channels = [1, 2]\n else:\n img_np = np.expand_dims(\n load_region(data_zyx, region, compute=True, return_as_3D=True),\n axis=0,\n )\n channels = [0, 0]\n\n # Prepare keyword arguments for segment_ROI function\n kwargs_segment_ROI = dict(\n model=model,\n channels=channels,\n do_3D=do_3D,\n anisotropy=anisotropy,\n label_dtype=label_dtype,\n diameter=diameter_level0 / coarsening_xy**level,\n cellprob_threshold=cellprob_threshold,\n flow_threshold=flow_threshold,\n normalize=normalize,\n min_size=min_size,\n augment=augment,\n net_avg=net_avg,\n batch_size=batch_size,\n invert=invert,\n 
tile=tile,\n tile_overlap=tile_overlap,\n resample=resample,\n interp=interp,\n stitch_threshold=stitch_threshold,\n )\n\n # Prepare keyword arguments for preprocessing function\n preprocessing_kwargs = {}\n if use_masks:\n preprocessing_kwargs = dict(\n region=region,\n current_label_path=f\"{zarr_url}/labels/{output_label_name}/0\",\n ROI_table_path=ROI_table_path,\n ROI_positional_index=i_ROI,\n )\n\n # Call segment_ROI through the masked-loading wrapper, which includes\n # pre/post-processing functions if needed\n new_label_img = masked_loading_wrapper(\n image_array=img_np,\n function=segment_ROI,\n kwargs=kwargs_segment_ROI,\n use_masks=use_masks,\n preprocessing_kwargs=preprocessing_kwargs,\n )\n\n # Shift labels and update relabeling counters\n if relabeling:\n num_labels_roi = np.max(new_label_img)\n new_label_img[new_label_img > 0] += num_labels_tot\n num_labels_tot += num_labels_roi\n\n # Write some logs\n logger.info(f\"ROI {indices}, {num_labels_roi=}, {num_labels_tot=}\")\n\n # Check that total number of labels is under control\n if num_labels_tot > np.iinfo(label_dtype).max:\n raise ValueError(\n \"ERROR in re-labeling:\"\n f\"Reached {num_labels_tot} labels, \"\n f\"but dtype={label_dtype}\"\n )\n\n if output_ROI_table:\n bbox_df = array_to_bounding_box_table(\n new_label_img,\n actual_res_pxl_sizes_zyx,\n origin_zyx=(s_z, s_y, s_x),\n )\n\n bbox_dataframe_list.append(bbox_df)\n\n overlap_list = []\n for df in bbox_dataframe_list:\n overlap_list.extend(\n get_overlapping_pairs_3D(df, full_res_pxl_sizes_zyx)\n )\n if len(overlap_list) > 0:\n logger.warning(\n f\"{len(overlap_list)} bounding-box pairs overlap\"\n )\n\n # Compute and store 0-th level to disk\n da.array(new_label_img).to_zarr(\n url=mask_zarr,\n region=region,\n compute=True,\n )\n\n logger.info(\n f\"End cellpose_segmentation task for {zarr_url}, \"\n \"now building pyramids.\"\n )\n\n # Starting from on-disk highest-resolution data, build and write to disk a\n # pyramid of coarser 
levels\n build_pyramid(\n zarrurl=f\"{zarr_url}/labels/{output_label_name}\",\n overwrite=overwrite,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=chunks,\n aggregation_function=np.max,\n )\n\n logger.info(\"End building pyramids\")\n\n if output_ROI_table:\n # Handle the case where `bbox_dataframe_list` is empty (typically\n # because list_indices is also empty)\n if len(bbox_dataframe_list) == 0:\n bbox_dataframe_list = [empty_bounding_box_table()]\n # Concatenate all ROI dataframes\n df_well = pd.concat(bbox_dataframe_list, axis=0, ignore_index=True)\n df_well.index = df_well.index.astype(str)\n # Extract labels and drop them from df_well\n labels = pd.DataFrame(df_well[\"label\"].astype(str))\n df_well.drop(labels=[\"label\"], axis=1, inplace=True)\n # Convert all to float (warning: some would be int, in principle)\n bbox_dtype = np.float32\n df_well = df_well.astype(bbox_dtype)\n # Convert to anndata\n bbox_table = ad.AnnData(df_well, dtype=bbox_dtype)\n bbox_table.obs = labels\n\n # Write to zarr group\n image_group = zarr.group(zarr_url)\n logger.info(\n \"Now writing bounding-box ROI table to \"\n f\"{zarr_url}/tables/{output_ROI_table}\"\n )\n table_attrs = {\n \"type\": \"masking_roi_table\",\n \"region\": {\"path\": f\"../labels/{output_label_name}\"},\n \"instance_key\": \"label\",\n }\n write_table(\n image_group,\n output_ROI_table,\n bbox_table,\n overwrite=overwrite,\n table_attrs=table_attrs,\n )\n
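The relabeling step of the ROI loop in the listing above can be looked at in isolation. The following is a simplified, pure-Python stand-in (nested lists instead of NumPy arrays, and a hypothetical helper named `shift_labels`) for the idea that non-zero labels of each ROI are shifted by the running total, so that label values stay unique across the well.

```python
def shift_labels(label_img, offset):
    """Return (shifted copy, max label of the ROI); background (0) is untouched."""
    max_label = max((v for row in label_img for v in row), default=0)
    shifted = [[v + offset if v > 0 else 0 for v in row] for row in label_img]
    return shifted, max_label


# Two hypothetical 2x2 label images, each labeled independently starting from 1
rois = [
    [[0, 1], [2, 2]],
    [[1, 0], [0, 3]],
]

num_labels_tot = 0
relabeled = []
for roi in rois:
    shifted, num_labels_roi = shift_labels(roi, num_labels_tot)
    relabeled.append(shifted)
    # The running total uses the ROI maximum *before* shifting
    num_labels_tot += num_labels_roi
```

In this toy input the second ROI uses labels 1 and 3 only, so after shifting by 2 the well-level labels are not contiguous; the scheme guarantees uniqueness, not contiguity.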
Internal function that runs Cellpose segmentation for a single ROI.
PARAMETER DESCRIPTION x
4D numpy array.
TYPE: ndarray
model
An instance of models.CellposeModel.
TYPE: CellposeModel DEFAULT: None
do_3D
If True, cellpose runs in 3D mode: runs on xy, xz & yz planes, then averages the flows.
TYPE: bool DEFAULT: True
channels
Which channels to use. If only one channel is provided, [0, 0] should be used. If two channels are provided (the first dimension of x has length of 2), [1, 2] should be used (x[0, :, :,:] contains the membrane channel and x[1, :, :, :] contains the nuclear channel).
TYPE: list[int] DEFAULT: [0, 0]
anisotropy
Set anisotropy rescaling factor for Z dimension.
TYPE: Optional[float] DEFAULT: None
diameter
Expected object diameter in pixels for cellpose.
TYPE: float DEFAULT: 30.0
cellprob_threshold
Cellpose model parameter.
TYPE: float DEFAULT: 0.0
flow_threshold
Cellpose model parameter.
TYPE: float DEFAULT: 0.4
normalize
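Normalize data so 0.0=1st percentile and 1.0=99th percentile of image intensities in each channel. This automatic normalization can lead to issues when the image to be segmented is very sparse.
TYPE: CellposeCustomNormalizer DEFAULT: CellposeCustomNormalizer()
label_dtype
Label images are cast into this np.dtype.
TYPE: Optional[dtype] DEFAULT: None
augment
Whether to use cellpose augmentation to tile images with overlap.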
TYPE: bool DEFAULT: False
net_avg
Whether to use cellpose net averaging to run the 4 built-in networks (useful for nuclei, cyto and cyto2, not sure it works for the others).
TYPE: bool DEFAULT: False
min_size
Minimum size of the segmented objects.
TYPE: int DEFAULT: 15
batch_size
Number of 224x224 patches to run simultaneously on the GPU (can be made smaller or larger depending on GPU memory usage).
TYPE: int DEFAULT: 8
invert
Invert image pixel intensity before running the network (if True, the image is also normalized).
TYPE: bool DEFAULT: False
tile
Tile the image to keep GPU/CPU memory usage limited (recommended).
TYPE: bool DEFAULT: True
tile_overlap
Fraction of overlap between tiles when computing flows.
TYPE: float DEFAULT: 0.1
resample
Run dynamics at the original image size (slower, but creates more accurate boundaries).
TYPE: bool DEFAULT: True
interp
Interpolate during 2D dynamics (not available in 3D). In previous versions the default was False; it now defaults to True.
TYPE: bool DEFAULT: True
stitch_threshold
If stitch_threshold > 0.0, do_3D is False, and image sizes are equal, masks are stitched in 3D to return a volume segmentation.
TYPE: float DEFAULT: 0.0
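The relation between the first dimension of `x` and the `channels` argument described above can be written down as a small lookup. The helper name `cellpose_channels` is hypothetical; it only encodes the two cases stated in the parameter description.

```python
def cellpose_channels(x_first_dim: int) -> list[int]:
    """Map the length of the first dimension of `x` to the `channels` argument."""
    if x_first_dim == 1:
        # Single-channel input
        return [0, 0]
    if x_first_dim == 2:
        # Dual-channel input: x[0] is the membrane channel, x[1] the nuclear one
        return [1, 2]
    raise ValueError(f"Expected 1 or 2 channels, got {x_first_dim}.")


single = cellpose_channels(1)
dual = cellpose_channels(2)
```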
Source code in fractal_tasks_core/tasks/cellpose_segmentation.py
def segment_ROI(\n x: np.ndarray,\n model: models.CellposeModel = None,\n do_3D: bool = True,\n channels: list[int] = [0, 0],\n anisotropy: Optional[float] = None,\n diameter: float = 30.0,\n cellprob_threshold: float = 0.0,\n flow_threshold: float = 0.4,\n normalize: CellposeCustomNormalizer = CellposeCustomNormalizer(),\n label_dtype: Optional[np.dtype] = None,\n augment: bool = False,\n net_avg: bool = False,\n min_size: int = 15,\n batch_size: int = 8,\n invert: bool = False,\n tile: bool = True,\n tile_overlap: float = 0.1,\n resample: bool = True,\n interp: bool = True,\n stitch_threshold: float = 0.0,\n) -> np.ndarray:\n\"\"\"\n Internal function that runs Cellpose segmentation for a single ROI.\n\n Args:\n x: 4D numpy array.\n model: An instance of `models.CellposeModel`.\n do_3D: If `True`, cellpose runs in 3D mode: runs on xy, xz & yz planes,\n then averages the flows.\n channels: Which channels to use. If only one channel is provided, `[0,\n 0]` should be used. If two channels are provided (the first\n dimension of `x` has length of 2), `[1, 2]` should be used\n (`x[0, :, :,:]` contains the membrane channel and\n `x[1, :, :, :]` contains the nuclear channel).\n anisotropy: Set anisotropy rescaling factor for Z dimension.\n diameter: Expected object diameter in pixels for cellpose.\n cellprob_threshold: Cellpose model parameter.\n flow_threshold: Cellpose model parameter.\n normalize: normalize data so 0.0=1st percentile and 1.0=99th\n percentile of image intensities in each channel. 
This automatic\n normalization can lead to issues when the image to be segmented\n is very sparse.\n label_dtype: Label images are cast into this `np.dtype`.\n augment: Whether to use cellpose augmentation to tile images with\n overlap.\n net_avg: Whether to use cellpose net averaging to run the 4 built-in\n networks (useful for `nuclei`, `cyto` and `cyto2`, not sure it\n works for the others).\n min_size: Minimum size of the segmented objects.\n batch_size: number of 224x224 patches to run simultaneously on the GPU\n (can make smaller or bigger depending on GPU memory usage)\n invert: invert image pixel intensity before running network (if True,\n image is also normalized)\n tile: tiles image to ensure GPU/CPU memory usage limited (recommended)\n tile_overlap: fraction of overlap of tiles when computing flows\n resample: run dynamics at original image size (will be slower but\n create more accurate boundaries)\n interp: interpolate during 2D dynamics (not available in 3D)\n (in previous versions it was False, now it defaults to True)\n stitch_threshold: if stitch_threshold>0.0 and not do_3D and equal\n image sizes, masks are stitched in 3D to return volume segmentation\n \"\"\"\n\n # Write some debugging info\n logger.info(\n \"[segment_ROI] START |\"\n f\" x: {type(x)}, {x.shape} |\"\n f\" {do_3D=} |\"\n f\" {model.diam_mean=} |\"\n f\" {diameter=} |\"\n f\" {flow_threshold=} |\"\n f\" {normalize.type=}\"\n )\n\n # Optionally perform custom normalization\n if normalize.type == \"custom\":\n x = normalized_img(\n x,\n lower_p=normalize.lower_percentile,\n upper_p=normalize.upper_percentile,\n lower_bound=normalize.lower_bound,\n upper_bound=normalize.upper_bound,\n )\n\n # Actual labeling\n t0 = time.perf_counter()\n mask, _, _ = model.eval(\n x,\n channels=channels,\n do_3D=do_3D,\n net_avg=net_avg,\n augment=augment,\n diameter=diameter,\n anisotropy=anisotropy,\n cellprob_threshold=cellprob_threshold,\n flow_threshold=flow_threshold,\n 
normalize=normalize.cellpose_normalize,\n min_size=min_size,\n batch_size=batch_size,\n invert=invert,\n tile=tile,\n tile_overlap=tile_overlap,\n resample=resample,\n interp=interp,\n stitch_threshold=stitch_threshold,\n )\n\n if mask.ndim == 2:\n # If we get a 2D image, we still return it as a 3D array\n mask = np.expand_dims(mask, axis=0)\n t1 = time.perf_counter()\n\n # Write some debugging info\n logger.info(\n \"[segment_ROI] END |\"\n f\" Elapsed: {t1-t0:.3f} s |\"\n f\" {mask.shape=},\"\n f\" {mask.dtype=} (then {label_dtype}),\"\n f\" {np.max(mask)=} |\"\n f\" {model.diam_mean=} |\"\n f\" {diameter=} |\"\n f\" {flow_threshold=}\"\n )\n\n return mask.astype(label_dtype)\n
Validator to handle different normalization scenarios for Cellpose models
If type="default", then Cellpose default normalization is used and no other parameters can be specified. If type="no_normalization", then no normalization is used and no other parameters can be specified. If type="custom", then either percentiles or explicit integer bounds can be applied.
ATTRIBUTE DESCRIPTION type
One of default (Cellpose default normalization), custom (using the other custom parameters) or no_normalization.
TYPE: Literal['default', 'custom', 'no_normalization']
lower_percentile
Specify a custom lower-bound percentile for rescaling as a float value between 0 and 100. Set to 1 to run the same as the default. You can only specify percentiles or bounds, not both.
TYPE: Optional[float]
upper_percentile
Specify a custom upper-bound percentile for rescaling as a float value between 0 and 100. Set to 99 to run the same as default, set to e.g. 99.99 if the default rescaling was too harsh. You can only specify percentiles or bounds, not both.
TYPE: Optional[float]
lower_bound
Explicit lower bound value to rescale the image at. Needs to be an integer, e.g. 100. You can only specify percentiles or bounds, not both.
TYPE: Optional[int]
upper_bound
Explicit upper bound value to rescale the image at. Needs to be an integer, e.g. 2000. You can only specify percentiles or bounds, not both.
TYPE: Optional[int]
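To make the allowed combinations of these attributes concrete, here is a dependency-free sketch of the rules. It is not the pydantic model itself: `check_normalizer_args` and `is_allowed` are hypothetical stand-ins that follow the same conditions as the validator shown in the source listing (custom parameters only with type="custom", percentiles set as a pair, bounds set as a pair, never both pairs).

```python
from typing import Optional


def check_normalizer_args(
    type: str = "default",
    lower_percentile: Optional[float] = None,
    upper_percentile: Optional[float] = None,
    lower_bound: Optional[int] = None,
    upper_bound: Optional[int] = None,
) -> None:
    """Raise ValueError if the combination of arguments is not allowed."""
    if type not in ("default", "custom", "no_normalization"):
        raise ValueError(f"Unknown {type=}.")
    for value in (lower_percentile, upper_percentile):
        if value is not None and not 0 <= value <= 100:
            raise ValueError("Percentiles must be between 0 and 100.")
    custom_args = (lower_percentile, upper_percentile, lower_bound, upper_bound)
    # Custom parameters are only accepted together with type="custom"
    if type != "custom" and any(v is not None for v in custom_args):
        raise ValueError(f"Type='{type}' but custom parameters were set.")
    # Percentiles and bounds must each be given as a complete pair
    if (lower_percentile is None) != (upper_percentile is None):
        raise ValueError("Set both percentiles together.")
    if (lower_bound is None) != (upper_bound is None):
        raise ValueError("Set both bounds together.")
    # The two pairs are mutually exclusive
    if lower_percentile is not None and lower_bound is not None:
        raise ValueError("Use either percentiles or bounds, not both.")


def is_allowed(**kwargs) -> bool:
    """Convenience wrapper returning True/False instead of raising."""
    try:
        check_normalizer_args(**kwargs)
    except ValueError:
        return False
    return True
```

For example, `is_allowed(type="custom", lower_percentile=1.0, upper_percentile=99.99)` is accepted, while `is_allowed(type="default", lower_bound=100, upper_bound=2000)` is rejected.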
Source code in fractal_tasks_core/tasks/cellpose_transforms.py
class CellposeCustomNormalizer(BaseModel):\n\"\"\"\n Validator to handle different normalization scenarios for Cellpose models\n\n If `type=\"default\"`, then Cellpose default normalization is\n used and no other parameters can be specified.\n If `type=\"no_normalization\"`, then no normalization is used and no\n other parameters can be specified.\n If `type=\"custom\"`, then either percentiles or explicit integer\n bounds can be applied.\n\n Attributes:\n type:\n One of `default` (Cellpose default normalization), `custom`\n (using the other custom parameters) or `no_normalization`.\n lower_percentile: Specify a custom lower-bound percentile for rescaling\n as a float value between 0 and 100. Set to 1 to run the same as\n default). You can only specify percentiles or bounds, not both.\n upper_percentile: Specify a custom upper-bound percentile for rescaling\n as a float value between 0 and 100. Set to 99 to run the same as\n default, set to e.g. 99.99 if the default rescaling was too harsh.\n You can only specify percentiles or bounds, not both.\n lower_bound: Explicit lower bound value to rescale the image at.\n Needs to be an integer, e.g. 100.\n You can only specify percentiles or bounds, not both.\n upper_bound: Explicit upper bound value to rescale the image at.\n Needs to be an integer, e.g. 
2000.\n You can only specify percentiles or bounds, not both.\n \"\"\"\n\n type: Literal[\"default\", \"custom\", \"no_normalization\"] = \"default\"\n lower_percentile: Optional[float] = Field(None, ge=0, le=100)\n upper_percentile: Optional[float] = Field(None, ge=0, le=100)\n lower_bound: Optional[int] = None\n upper_bound: Optional[int] = None\n\n # In the future, add an option to allow using precomputed percentiles\n # that are stored in OME-Zarr histograms and use this pydantic model that\n # those histograms actually exist\n\n @root_validator\n def validate_conditions(cls, values):\n # Extract values\n type = values.get(\"type\")\n lower_percentile = values.get(\"lower_percentile\")\n upper_percentile = values.get(\"upper_percentile\")\n lower_bound = values.get(\"lower_bound\")\n upper_bound = values.get(\"upper_bound\")\n\n # Verify that custom parameters are only provided when type=\"custom\"\n if type != \"custom\":\n if lower_percentile is not None:\n raise ValueError(\n f\"Type='{type}' but {lower_percentile=}. \"\n \"Hint: set type='custom'.\"\n )\n if upper_percentile is not None:\n raise ValueError(\n f\"Type='{type}' but {upper_percentile=}. \"\n \"Hint: set type='custom'.\"\n )\n if lower_bound is not None:\n raise ValueError(\n f\"Type='{type}' but {lower_bound=}. \"\n \"Hint: set type='custom'.\"\n )\n if upper_bound is not None:\n raise ValueError(\n f\"Type='{type}' but {upper_bound=}. \"\n \"Hint: set type='custom'.\"\n )\n\n # The only valid options are:\n # 1. Both percentiles are set and both bounds are unset\n # 2. 
Both bounds are set and both percentiles are unset\n are_percentiles_set = (\n lower_percentile is not None,\n upper_percentile is not None,\n )\n are_bounds_set = (\n lower_bound is not None,\n upper_bound is not None,\n )\n if len(set(are_percentiles_set)) != 1:\n raise ValueError(\n \"Both lower_percentile and upper_percentile must be set \"\n \"together.\"\n )\n if len(set(are_bounds_set)) != 1:\n raise ValueError(\n \"Both lower_bound and upper_bound must be set together\"\n )\n if lower_percentile is not None and lower_bound is not None:\n raise ValueError(\n \"You cannot set both explicit bounds and percentile bounds \"\n \"at the same time. Hint: use only one of the two options.\"\n )\n\n return values\n\n @property\n def cellpose_normalize(self) -> bool:\n\"\"\"\n Determine whether cellpose should apply its internal normalization.\n\n If type is set to `custom` or `no_normalization`, don't apply cellpose\n internal normalization\n \"\"\"\n return self.type == \"default\"\n
normalize image so 0.0 is lower value and 1.0 is upper value
PARAMETER DESCRIPTION Y
The image to be normalized
TYPE: ndarray
lower
Lower normalization value
TYPE: int DEFAULT: 0
upper
Upper normalization value
TYPE: int DEFAULT: 65535
Source code in fractal_tasks_core/tasks/cellpose_transforms.py
def normalize_bounds(Y: np.ndarray, lower: int = 0, upper: int = 65535):\n\"\"\"normalize image so 0.0 is lower value and 1.0 is upper value\n\n Args:\n Y: The image to be normalized\n lower: Lower normalization value\n upper: Upper normalization value\n\n \"\"\"\n X = Y.copy()\n X = (X - lower) / (upper - lower)\n return X\n
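A small worked example may help to see what the rescaling `(X - lower) / (upper - lower)` does. The sketch below applies the same formula to a plain Python list (the function name `rescale_to_bounds` and the sample values are made up for illustration; the library function operates on NumPy arrays).

```python
def rescale_to_bounds(values, lower=0, upper=65535):
    """Apply (v - lower) / (upper - lower) to each value, without clipping."""
    return [(v - lower) / (upper - lower) for v in values]


# Hypothetical intensities, rescaled with lower=100 and upper=2000
rescaled = rescale_to_bounds([100, 1050, 2000, 3900], lower=100, upper=2000)
```

The last value lies above the upper bound and maps to 2.0, since values outside the bounds are not clipped to the [0, 1] range.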
Normalize image so 0.0 is lower percentile and 1.0 is upper percentile. Percentiles are passed as floats (must be between 0 and 100).
PARAMETER DESCRIPTION Y
The image to be normalized
TYPE: ndarray
lower
Lower percentile
TYPE: float DEFAULT: 1
upper
Upper percentile
TYPE: float DEFAULT: 99
Source code in fractal_tasks_core/tasks/cellpose_transforms.py
def normalize_percentile(Y: np.ndarray, lower: float = 1, upper: float = 99):\n\"\"\"normalize image so 0.0 is lower percentile and 1.0 is upper percentile\n Percentiles are passed as floats (must be between 0 and 100)\n\n Args:\n Y: The image to be normalized\n lower: Lower percentile\n upper: Upper percentile\n\n \"\"\"\n X = Y.copy()\n x01 = np.percentile(X, lower)\n x99 = np.percentile(X, upper)\n X = (X - x01) / (x99 - x01)\n return X\n
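The percentile-based rescaling can be traced by hand on a small input. The sketch below is a pure-Python approximation: `percentile` is a hypothetical helper that interpolates linearly between neighbouring ranks, which is assumed here to match the default behaviour of `np.percentile`, and the sample data is invented.

```python
def percentile(values, p):
    """Percentile with linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    rank = p * (len(ordered) - 1) / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def rescale_to_percentiles(values, lower=1, upper=99):
    """Map the lower percentile to 0.0 and the upper percentile to 1.0."""
    x_low = percentile(values, lower)
    x_high = percentile(values, upper)
    return [(v - x_low) / (x_high - x_low) for v in values]


# Hypothetical intensities 0, 1, ..., 100
data = list(range(101))
normalized = rescale_to_percentiles(data, lower=1, upper=99)
```

Here the 1st and 99th percentiles are 1 and 99, so the value 50 maps to 0.5, while the extremes 0 and 100 fall slightly outside the [0, 1] range.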
Source code in fractal_tasks_core/tasks/cellpose_transforms.py
def normalized_img(\n img: np.ndarray,\n axis: int = -1,\n invert: bool = False,\n lower_p: float = 1.0,\n upper_p: float = 99.0,\n lower_bound: Optional[int] = None,\n upper_bound: Optional[int] = None,\n):\n\"\"\"normalize each channel of the image so that so that 0.0=lower percentile\n or lower bound and 1.0=upper percentile or upper bound of image intensities.\n\n The normalization can result in values < 0 or > 1 (no clipping).\n\n Based on https://github.com/MouseLand/cellpose/blob/4f5661983c3787efa443bbbd3f60256f4fd8bf53/cellpose/transforms.py#L375 # noqa E501\n\n optional inversion\n\n Parameters\n ------------\n\n img: ND-array (at least 3 dimensions)\n\n axis: channel axis to loop over for normalization\n\n invert: invert image (useful if cells are dark instead of bright)\n\n lower_p: Lower percentile for rescaling\n\n upper_p: Upper percentile for rescaling\n\n lower_bound: Lower fixed-value used for rescaling\n\n upper_bound: Upper fixed-value used for rescaling\n\n Returns\n ---------------\n\n img: ND-array, float32\n normalized image of same size\n\n \"\"\"\n if img.ndim < 3:\n error_message = \"Image needs to have at least 3 dimensions\"\n logger.critical(error_message)\n raise ValueError(error_message)\n\n img = img.astype(np.float32)\n img = np.moveaxis(img, axis, 0)\n for k in range(img.shape[0]):\n if lower_p is not None:\n # ptp can still give nan's with weird images\n i99 = np.percentile(img[k], upper_p)\n i1 = np.percentile(img[k], lower_p)\n if i99 - i1 > +1e-3: # np.ptp(img[k]) > 1e-3:\n img[k] = normalize_percentile(\n img[k], lower=lower_p, upper=upper_p\n )\n if invert:\n img[k] = -1 * img[k] + 1\n else:\n img[k] = 0\n elif lower_bound is not None:\n if upper_bound - lower_bound > +1e-3:\n img[k] = normalize_bounds(\n img[k], lower=lower_bound, upper=upper_bound\n )\n if invert:\n img[k] = -1 * img[k] + 1\n else:\n img[k] = 0\n else:\n raise ValueError(\"No normalization method specified\")\n img = np.moveaxis(img, 0, axis)\n return img\n
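Two details of the per-channel loop above are easy to miss: a channel whose rescaling range is narrower than 1e-3 is set to zero instead of being divided by a tiny number, and the optional inversion is applied after rescaling as `1 - value`. The following single-channel sketch with fixed bounds shows both; `normalize_channel` is a hypothetical name and works on plain lists rather than NumPy arrays.

```python
def normalize_channel(values, lower, upper, invert=False):
    """Rescale one channel to [lower, upper]; degenerate ranges give all zeros."""
    if upper - lower > 1e-3:
        scaled = [(v - lower) / (upper - lower) for v in values]
        if invert:
            # Same as -1 * value + 1 in the listing above
            scaled = [-1 * v + 1 for v in scaled]
        return scaled
    # Range too small to rescale safely
    return [0 for _ in values]


regular = normalize_channel([0, 5, 10], lower=0, upper=10)
inverted = normalize_channel([0, 5, 10], lower=0, upper=10, invert=True)
flat = normalize_channel([7, 7, 7], lower=7, upper=7)
```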
This task is run after an init task (typically cellvoyager_to_ome_zarr_init or cellvoyager_to_ome_zarr_init_multiplex), and it populates the empty OME-Zarr files that were prepared.
Note that the current task always overwrites existing data. To avoid this behavior, set the overwrite argument of the init task to False.
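As a rough illustration of how the prepared array gets populated, the sketch below mirrors the way the source listing consumes a sorted file list, one field of view (FOV) at a time. The helper name `split_into_fov_stacks` and the short file names are hypothetical; only the `[s_z, e_z, s_y, e_y, s_x, e_x]` index layout and the "take the first `e_z` remaining files" step are taken from the listing.

```python
def split_into_fov_stacks(filenames, fov_indices):
    """Assign consecutive runs of sorted filenames to FOV regions.

    Each entry of `fov_indices` is [s_z, e_z, s_y, e_y, s_x, e_x]. Every FOV
    takes the first `e_z` files that are still unassigned, which assumes the
    list is already sorted by site and then by Z index.
    """
    remaining = list(filenames)
    stacks = []
    for s_z, e_z, s_y, e_y, s_x, e_x in fov_indices:
        stack, remaining = remaining[:e_z], remaining[e_z:]
        region = (slice(s_z, e_z), slice(s_y, e_y), slice(s_x, e_x))
        stacks.append((region, stack))
    return stacks


# Two FOVs side by side, two Z planes each (hypothetical file names).
files = ["F001_Z01.tif", "F001_Z02.tif", "F002_Z01.tif", "F002_Z02.tif"]
fovs = [[0, 2, 0, 2160, 0, 2560], [0, 2, 0, 2160, 2560, 5120]]
for region, stack in split_into_fov_stacks(files, fovs):
    print(region[2], stack)
```

In the actual task each stack is read from disk and written into the matching region of the channel's array; the sketch only shows the bookkeeping.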
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
init_args
Initialization arguments provided by create_cellvoyager_ome_zarr_init.
TYPE: InitArgsCellVoyager
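To make the two arguments more concrete, here is a minimal sketch of the kind of item an init task might hand to this compute task. `InitArgsSketch` is a hypothetical stand-in dataclass, not the real `InitArgsCellVoyager` model; its field names and the `zarr_url` layout are copied from the init-task source listing, and all example values are invented.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class InitArgsSketch:
    """Stand-in for InitArgsCellVoyager (field names follow the listing)."""

    image_dir: str
    plate_prefix: str
    well_ID: str
    image_extension: str
    image_glob_patterns: Optional[list[str]] = None
    acquisition: Optional[str] = None


def parallelization_item(zarr_dir, plate, row, column, init_args):
    """Build one work item: the image URL plus its per-well arguments."""
    return {
        "zarr_url": f"{zarr_dir}/{plate}.zarr/{row}/{column}/0/",
        "init_args": asdict(init_args),
    }


item = parallelization_item(
    zarr_dir="/data/out",  # hypothetical output directory
    plate="plate",
    row="B",
    column="03",
    init_args=InitArgsSketch(
        image_dir="/data/raw",  # hypothetical input directory
        plate_prefix="plate",
        well_ID="B03",
        image_extension="tif",
    ),
)
print(item["zarr_url"])
```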
Source code in fractal_tasks_core/tasks/cellvoyager_to_ome_zarr_compute.py
@validate_arguments\ndef cellvoyager_to_ome_zarr_compute(\n *,\n # Fractal parameters\n zarr_url: str,\n init_args: InitArgsCellVoyager,\n):\n\"\"\"\n Convert Yokogawa output (png, tif) to zarr file.\n\n This task is run after an init task (typically\n `cellvoyager_to_ome_zarr_init` or\n `cellvoyager_to_ome_zarr_init_multiplex`), and it populates the empty\n OME-Zarr files that were prepared.\n\n Note that the current task always overwrites existing data. To avoid this\n behavior, set the `overwrite` argument of the init task to `False`.\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n init_args: Intialization arguments provided by\n `create_cellvoyager_ome_zarr_init`.\n \"\"\"\n\n # Read attributes from NGFF metadata\n ngff_image_meta = load_NgffImageMeta(zarr_url)\n num_levels = ngff_image_meta.num_levels\n coarsening_xy = ngff_image_meta.coarsening_xy\n full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n logger.info(f\"NGFF image has {num_levels=}\")\n logger.info(f\"NGFF image has {coarsening_xy=}\")\n logger.info(\n f\"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}\"\n )\n\n channels: list[OmeroChannel] = get_omero_channel_list(\n image_zarr_path=zarr_url\n )\n wavelength_ids = [c.wavelength_id for c in channels]\n\n # Read useful information from ROI table\n adata = read_zarr(f\"{zarr_url}/tables/FOV_ROI_table\")\n fov_indices = convert_ROI_table_to_indices(\n adata,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(fov_indices, \"FOV_ROI_table\")\n adata_well = read_zarr(f\"{zarr_url}/tables/well_ROI_table\")\n well_indices = convert_ROI_table_to_indices(\n adata_well,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(well_indices, \"well_ROI_table\")\n if len(well_indices) > 1:\n raise ValueError(f\"Something wrong with {well_indices=}\")\n\n max_z = 
well_indices[0][1]\n max_y = well_indices[0][3]\n max_x = well_indices[0][5]\n\n # Load a single image, to retrieve useful information\n patterns = [\n f\"{init_args.plate_prefix}_{init_args.well_ID}_*.\"\n f\"{init_args.image_extension}\"\n ]\n if init_args.image_glob_patterns:\n patterns.extend(init_args.image_glob_patterns)\n\n tmp_images = glob_with_multiple_patterns(\n folder=init_args.image_dir,\n patterns=patterns,\n )\n sample = imread(tmp_images.pop())\n\n # Initialize zarr\n chunksize = (1, 1, sample.shape[1], sample.shape[2])\n canvas_zarr = zarr.create(\n shape=(len(wavelength_ids), max_z, max_y, max_x),\n chunks=chunksize,\n dtype=sample.dtype,\n store=zarr.storage.FSStore(zarr_url + \"/0\"),\n overwrite=True,\n dimension_separator=\"/\",\n )\n\n # Loop over channels\n for i_c, wavelength_id in enumerate(wavelength_ids):\n A, C = wavelength_id.split(\"_\")\n\n patterns = [\n f\"{init_args.plate_prefix}_{init_args.well_ID}_*{A}*{C}*.\"\n f\"{init_args.image_extension}\"\n ]\n if init_args.image_glob_patterns:\n patterns.extend(init_args.image_glob_patterns)\n filenames_set = glob_with_multiple_patterns(\n folder=init_args.image_dir,\n patterns=patterns,\n )\n filenames = sorted(list(filenames_set), key=sort_fun)\n if len(filenames) == 0:\n raise ValueError(\n \"Error in yokogawa_to_ome_zarr: len(filenames)=0.\\n\"\n f\" image_dir: {init_args.image_dir}\\n\"\n f\" wavelength_id: {wavelength_id},\\n\"\n f\" patterns: {patterns}\"\n )\n # Loop over 3D FOV ROIs\n for indices in fov_indices:\n s_z, e_z, s_y, e_y, s_x, e_x = indices[:]\n region = (\n slice(i_c, i_c + 1),\n slice(s_z, e_z),\n slice(s_y, e_y),\n slice(s_x, e_x),\n )\n FOV_3D = da.concatenate(\n [imread(img) for img in filenames[:e_z]],\n )\n FOV_4D = da.expand_dims(FOV_3D, axis=0)\n filenames = filenames[e_z:]\n da.array(FOV_4D).to_zarr(\n url=canvas_zarr,\n region=region,\n compute=True,\n )\n\n # Starting from on-disk highest-resolution data, build and write to disk a\n # pyramid of coarser 
levels\n build_pyramid(\n zarrurl=zarr_url,\n overwrite=True,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=chunksize,\n )\n\n # Generate image list updates\n # TODO: Can we check for dimensionality more robustly? Just checks for the\n # last FOV of the last wavelength now\n if FOV_4D.shape[-3] > 1:\n is_3D = True\n else:\n is_3D = False\n attributes = {\n \"plate\": f\"{init_args.plate_prefix}.zarr\",\n \"well\": init_args.well_ID,\n }\n if init_args.acquisition is not None:\n attributes[\"acquisition\"] = init_args.acquisition\n\n image_list_updates = dict(\n image_list_updates=[\n dict(\n zarr_url=zarr_url,\n attributes=attributes,\n types={\"is_3D\": is_3D},\n )\n ]\n )\n\n return image_list_updates\n
Takes a string (filename of a Yokogawa image), extracts site and z-index metadata and returns them as a list of integers.
PARAMETER DESCRIPTION filename
Name of the image file.
TYPE: str
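The ordering this key produces can be sketched without the library. The regular expression below is an assumption about the filename layout (a `...F<site>L..A..Z<z>C...` block) and stands in for `parse_filename`, which is not reproduced here; the file names are invented.

```python
import re

# Assumed filename layout, e.g. "plate_B03_T0001F002L01A01Z03C01.tif".
_SITE_Z = re.compile(r"F(?P<F>\d+)L\d+A\d+Z(?P<Z>\d+)C\d+")


def sort_key(filename: str) -> list[int]:
    """Return [site, z_index] so files sort by site first, then by Z plane."""
    match = _SITE_Z.search(filename)
    if match is None:
        raise ValueError(f"Cannot parse site/z-index from {filename!r}")
    return [int(match["F"]), int(match["Z"])]


names = [
    "plate_B03_T0001F002L01A01Z01C01.tif",
    "plate_B03_T0001F001L01A01Z02C01.tif",
    "plate_B03_T0001F001L01A01Z01C01.tif",
]
ordered = sorted(names, key=sort_key)
for name in ordered:
    print(name)
```

Sorting by site and then Z index is what lets a caller slice the list into one contiguous Z-stack per field of view.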
Source code in fractal_tasks_core/tasks/cellvoyager_to_ome_zarr_compute.py
def sort_fun(filename: str) -> list[int]:
    """
    Takes a string (filename of a Yokogawa image), extracts site and
    z-index metadata and returns them as a list of integers.

    Args:
        filename: Name of the image file.
    """

    filename_metadata = parse_filename(filename)
    site = int(filename_metadata["F"])
    z_index = int(filename_metadata["Z"])
    return [site, z_index]
Create an OME-NGFF zarr folder, without reading/writing image data.
Find plates (for each folder in image_dirs):
glob image files,
parse metadata from image filename to identify plates,
identify populated channels.
Create a zarr folder (for each plate):
parse mlf metadata,
identify wells and fields of view (FOVs),
create FOV ZARR,
verify that channels are uniform (i.e., same channels).
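The channel-related steps above can be sketched as follows. The `A{A}_C{C}` identifier format and the subset check come from the source listing; the helper names and the already-parsed metadata dictionaries are hypothetical.

```python
def collect_wavelength_ids(parsed_files):
    """Build the sorted, de-duplicated list of 'A.._C..' identifiers."""
    return sorted({f"A{meta['A']}_C{meta['C']}" for meta in parsed_files})


def check_allowed(actual_ids, allowed_ids):
    """Raise if any channel found on disk is missing from the allowed list."""
    if not set(actual_ids).issubset(set(allowed_ids)):
        raise ValueError(
            f"actual_wavelength_ids: {actual_ids}\n"
            f"allowed_wavelength_ids: {allowed_ids}"
        )


# Metadata as it might look after parsing three filenames (invented values).
parsed = [
    {"A": "01", "C": "01"},
    {"A": "02", "C": "03"},
    {"A": "01", "C": "01"},
]
actual = collect_wavelength_ids(parsed)
check_allowed(actual, ["A01_C01", "A02_C03", "A03_C04"])  # passes silently
print(actual)
```

Comparing each well's sorted list against the plate-wide one is then enough to detect a well with missing channels.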
PARAMETER DESCRIPTION zarr_urls
List of paths or urls to the individual OME-Zarr images to be processed. Not used by the converter task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: list[str]
zarr_dir
Path of the directory where the new OME-Zarrs will be created. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
image_dirs
List of paths to the folders that contain the Cellvoyager image files. Each entry is a path to a folder that contains the image files themselves for a multiwell plate and the MeasurementData & MeasurementDetail metadata files.
TYPE: list[str]
allowed_channels
A list of OmeroChannel objects, where each channel must include the wavelength_id attribute and where the wavelength_id values must be unique across the list.
TYPE: list[OmeroChannel]
image_glob_patterns
If specified, only parse images whose filenames match all of these patterns. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html. Examples: image_glob_patterns=["*_B03_*"] => only process well B03; image_glob_patterns=["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"] => only process well C09, field of view 16 and Z planes 0-59.
TYPE: Optional[list[str]] DEFAULT: None
num_levels
Number of resolution-pyramid levels. If set to 5, there will be the full-resolution level and 4 levels of downsampled images.
TYPE: int DEFAULT: 5
coarsening_xy
Linear coarsening factor between subsequent levels. If set to 2, level 1 is 2x downsampled, level 2 is 4x downsampled etc.
TYPE: int DEFAULT: 2
image_extension
Filename extension of images (e.g. "tif" or "png").
TYPE: str DEFAULT: 'tif'
metadata_table_file
If None, parse Yokogawa metadata from the mrf/mlf files in each image_dirs folder; otherwise, the full path to a csv file containing the parsed metadata table.
TYPE: Optional[str] DEFAULT: None
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION dict[str, Any]
A metadata dictionary containing important metadata about the OME-Zarr plate, the images and some parameters required by downstream tasks (like num_levels).
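The effect of `num_levels` and `coarsening_xy` on the stored metadata can be worked through in a few lines. The scale formula below follows the `coordinateTransformations` entries in the source listing (channel axis fixed at 1, Z unchanged, Y and X multiplied by `coarsening_xy**level`); the function name and pixel sizes are illustrative assumptions.

```python
def pyramid_scales(
    pixel_size_z, pixel_size_y, pixel_size_x, num_levels=5, coarsening_xy=2
):
    """Return the per-level [c, z, y, x] scale vectors of the pyramid."""
    return [
        [
            1,
            pixel_size_z,
            pixel_size_y * coarsening_xy**level,
            pixel_size_x * coarsening_xy**level,
        ]
        for level in range(num_levels)
    ]


# Invented pixel sizes in micrometers: 1.0 in Z, 0.5 in Y and X.
scales = pyramid_scales(1.0, 0.5, 0.5)
for level, scale in enumerate(scales):
    print(level, scale)
```

With the defaults, level 0 keeps the acquired pixel size and level 4 is 16 times coarser in Y and X, while the Z spacing is never changed.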
Source code in fractal_tasks_core/tasks/cellvoyager_to_ome_zarr_init.py
@validate_arguments\ndef cellvoyager_to_ome_zarr_init(\n *,\n # Fractal parameters\n zarr_urls: list[str],\n zarr_dir: str,\n # Core parameters\n image_dirs: list[str],\n allowed_channels: list[OmeroChannel],\n # Advanced parameters\n image_glob_patterns: Optional[list[str]] = None,\n num_levels: int = 5,\n coarsening_xy: int = 2,\n image_extension: str = \"tif\",\n metadata_table_file: Optional[str] = None,\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Create a OME-NGFF zarr folder, without reading/writing image data.\n\n Find plates (for each folder in input_paths):\n\n - glob image files,\n - parse metadata from image filename to identify plates,\n - identify populated channels.\n\n Create a zarr folder (for each plate):\n\n - parse mlf metadata,\n - identify wells and field of view (FOV),\n - create FOV ZARR,\n - verify that channels are uniform (i.e., same channels).\n\n Args:\n zarr_urls: List of paths or urls to the individual OME-Zarr image to\n be processed. Not used by the converter task.\n (standard argument for Fractal tasks, managed by Fractal server).\n zarr_dir: path of the directory where the new OME-Zarrs will be\n created.\n (standard argument for Fractal tasks, managed by Fractal server).\n image_dirs: list of paths to the folders that contains the Cellvoyager\n image files. Each entry is a path to a folder that contains the\n image files themselves for a multiwell plate and the\n MeasurementData & MeasurementDetail metadata files.\n allowed_channels: A list of `OmeroChannel` s, where each channel must\n include the `wavelength_id` attribute and where the\n `wavelength_id` values must be unique across the list.\n image_glob_patterns: If specified, only parse images with filenames\n that match with all these patterns. 
Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html, Example:\n `image_glob_pattern=[\"*_B03_*\"]` => only process well B03\n `image_glob_pattern=[\"*_C09_*\", \"*F016*\", \"*Z[0-5][0-9]C*\"]` =>\n only process well C09, field of view 16 and Z planes 0-59.\n num_levels: Number of resolution-pyramid levels. If set to `5`, there\n will be the full-resolution level and 4 levels of\n downsampled images.\n coarsening_xy: Linear coarsening factor between subsequent levels.\n If set to `2`, level 1 is 2x downsampled, level 2 is\n 4x downsampled etc.\n image_extension: Filename extension of images (e.g. `\"tif\"` or `\"png\"`)\n metadata_table_file: If `None`, parse Yokogawa metadata from mrf/mlf\n files in the input_path folder; else, the full path to a csv file\n containing the parsed metadata table.\n overwrite: If `True`, overwrite the task output.\n\n Returns:\n A metadata dictionary containing important metadata about the OME-Zarr\n plate, the images and some parameters required by downstream tasks\n (like `num_levels`).\n \"\"\"\n\n # Preliminary checks on metadata_table_file\n if metadata_table_file:\n if not metadata_table_file.endswith(\".csv\"):\n raise ValueError(f\"{metadata_table_file=} is not a csv file\")\n if not os.path.isfile(metadata_table_file):\n raise FileNotFoundError(f\"{metadata_table_file=} does not exist\")\n\n # Identify all plates and all channels, across all input folders\n plates = []\n actual_wavelength_ids = None\n dict_plate_paths = {}\n dict_plate_prefixes: dict[str, Any] = {}\n\n # Preliminary checks on allowed_channels argument\n check_unique_wavelength_ids(allowed_channels)\n\n for image_dir in image_dirs:\n # Glob image filenames\n patterns = [f\"*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n input_filenames = glob_with_multiple_patterns(\n folder=image_dir,\n patterns=patterns,\n )\n\n tmp_wavelength_ids = []\n tmp_plates = []\n for fn in input_filenames:\n 
try:\n filename_metadata = parse_filename(Path(fn).name)\n plate_prefix = filename_metadata[\"plate_prefix\"]\n plate = filename_metadata[\"plate\"]\n if plate not in dict_plate_prefixes.keys():\n dict_plate_prefixes[plate] = plate_prefix\n tmp_plates.append(plate)\n A = filename_metadata[\"A\"]\n C = filename_metadata[\"C\"]\n tmp_wavelength_ids.append(f\"A{A}_C{C}\")\n except ValueError as e:\n logger.warning(\n f'Skipping \"{Path(fn).name}\". Original error: ' + str(e)\n )\n tmp_plates = sorted(list(set(tmp_plates)))\n tmp_wavelength_ids = sorted(list(set(tmp_wavelength_ids)))\n\n info = (\n \"Listing plates/channels:\\n\"\n f\"Folder: {image_dir}\\n\"\n f\"Patterns: {patterns}\\n\"\n f\"Plates: {tmp_plates}\\n\"\n f\"Channels: {tmp_wavelength_ids}\\n\"\n )\n\n # Check that only one plate is found\n if len(tmp_plates) > 1:\n raise ValueError(f\"{info}ERROR: {len(tmp_plates)} plates detected\")\n elif len(tmp_plates) == 0:\n raise ValueError(f\"{info}ERROR: No plates detected\")\n plate = tmp_plates[0]\n\n # If plate already exists in other folder, add suffix\n if plate in plates:\n ind = 1\n new_plate = f\"{plate}_{ind}\"\n while new_plate in plates:\n new_plate = f\"{plate}_{ind}\"\n ind += 1\n logger.info(\n f\"WARNING: {plate} already exists, renaming it as {new_plate}\"\n )\n plates.append(new_plate)\n dict_plate_prefixes[new_plate] = dict_plate_prefixes[plate]\n plate = new_plate\n else:\n plates.append(plate)\n\n # Check that channels are the same as in previous plates\n if actual_wavelength_ids is None:\n actual_wavelength_ids = tmp_wavelength_ids[:]\n else:\n if actual_wavelength_ids != tmp_wavelength_ids:\n raise ValueError(\n f\"ERROR\\n{info}\\nERROR:\"\n f\" expected channels {actual_wavelength_ids}\"\n )\n\n # Update dict_plate_paths\n dict_plate_paths[plate] = image_dir\n\n # Check that all channels are in the allowed_channels\n allowed_wavelength_ids = [\n channel.wavelength_id for channel in allowed_channels\n ]\n if not 
set(actual_wavelength_ids).issubset(set(allowed_wavelength_ids)):\n msg = \"ERROR in create_ome_zarr\\n\"\n msg += f\"actual_wavelength_ids: {actual_wavelength_ids}\\n\"\n msg += f\"allowed_wavelength_ids: {allowed_wavelength_ids}\\n\"\n raise ValueError(msg)\n\n # Create actual_channels, i.e. a list of the channel dictionaries which are\n # present\n actual_channels = [\n channel\n for channel in allowed_channels\n if channel.wavelength_id in actual_wavelength_ids\n ]\n\n ################################################################\n # Create well/image OME-Zarr folders on disk, and prepare output\n # metadata\n parallelization_list = []\n\n for plate in plates:\n # Define plate zarr\n relative_zarrurl = f\"{plate}.zarr\"\n in_path = dict_plate_paths[plate]\n logger.info(f\"Creating {relative_zarrurl}\")\n # Call zarr.open_group wrapper, which handles overwrite=True/False\n group_plate = open_zarr_group_with_overwrite(\n str(Path(zarr_dir) / relative_zarrurl),\n overwrite=overwrite,\n )\n\n # Obtain FOV-metadata dataframe\n if metadata_table_file is None:\n mrf_path = f\"{in_path}/MeasurementDetail.mrf\"\n mlf_path = f\"{in_path}/MeasurementData.mlf\"\n\n site_metadata, number_images_mlf = parse_yokogawa_metadata(\n mrf_path,\n mlf_path,\n filename_patterns=image_glob_patterns,\n )\n site_metadata = remove_FOV_overlaps(site_metadata)\n\n # If a metadata table was passed, load it and use it directly\n else:\n logger.warning(\n \"Since a custom metadata table was provided, there will \"\n \"be no additional check on the number of image files.\"\n )\n site_metadata = pd.read_csv(metadata_table_file)\n site_metadata.set_index([\"well_id\", \"FieldIndex\"], inplace=True)\n\n # Extract pixel sizes and bit_depth\n pixel_size_z = site_metadata[\"pixel_size_z\"][0]\n pixel_size_y = site_metadata[\"pixel_size_y\"][0]\n pixel_size_x = site_metadata[\"pixel_size_x\"][0]\n bit_depth = site_metadata[\"bit_depth\"][0]\n\n if min(pixel_size_z, pixel_size_y, pixel_size_x) < 
1e-9:\n raise ValueError(pixel_size_z, pixel_size_y, pixel_size_x)\n\n # Identify all wells\n plate_prefix = dict_plate_prefixes[plate]\n\n patterns = [f\"{plate_prefix}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n plate_images = glob_with_multiple_patterns(\n folder=str(in_path), patterns=patterns\n )\n\n wells = [\n parse_filename(os.path.basename(fn))[\"well\"] for fn in plate_images\n ]\n wells = sorted(list(set(wells)))\n\n # Verify that all wells have all channels\n for well in wells:\n patterns = [f\"{plate_prefix}_{well}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n well_images = glob_with_multiple_patterns(\n folder=str(in_path), patterns=patterns\n )\n\n # Check number of images matches with expected one\n if metadata_table_file is None:\n num_images_glob = len(well_images)\n num_images_expected = number_images_mlf[well]\n if num_images_glob != num_images_expected:\n raise ValueError(\n f\"Wrong number of images for {well=}\\n\"\n f\"Expected {num_images_expected} (from mlf file)\\n\"\n f\"Found {num_images_glob} files\\n\"\n \"Other parameters:\\n\"\n f\" {image_extension=}\\n\"\n f\" {image_glob_patterns=}\"\n )\n\n well_wavelength_ids = []\n for fpath in well_images:\n try:\n filename_metadata = parse_filename(os.path.basename(fpath))\n well_wavelength_ids.append(\n f\"A{filename_metadata['A']}_C{filename_metadata['C']}\"\n )\n except IndexError:\n logger.info(f\"Skipping {fpath}\")\n well_wavelength_ids = sorted(list(set(well_wavelength_ids)))\n if well_wavelength_ids != actual_wavelength_ids:\n raise ValueError(\n f\"ERROR: well {well} in plate {plate} (prefix: \"\n f\"{plate_prefix}) has missing channels.\\n\"\n f\"Expected: {actual_channels}\\n\"\n f\"Found: {well_wavelength_ids}.\\n\"\n )\n\n well_rows_columns = generate_row_col_split(wells)\n\n row_list = [\n well_row_column[0] for well_row_column in well_rows_columns\n ]\n col_list = [\n well_row_column[1] 
for well_row_column in well_rows_columns\n ]\n row_list = sorted(list(set(row_list)))\n col_list = sorted(list(set(col_list)))\n\n plate_attrs = {\n \"acquisitions\": [{\"id\": 0, \"name\": plate}],\n \"columns\": [{\"name\": col} for col in col_list],\n \"rows\": [{\"name\": row} for row in row_list],\n \"version\": __OME_NGFF_VERSION__,\n \"wells\": [\n {\n \"path\": well_row_column[0] + \"/\" + well_row_column[1],\n \"rowIndex\": row_list.index(well_row_column[0]),\n \"columnIndex\": col_list.index(well_row_column[1]),\n }\n for well_row_column in well_rows_columns\n ],\n }\n\n # Validate plate attrs:\n Plate(**plate_attrs)\n\n group_plate.attrs[\"plate\"] = plate_attrs\n\n for row, column in well_rows_columns:\n parallelization_list.append(\n {\n \"zarr_url\": f\"{zarr_dir}/{plate}.zarr/{row}/{column}/0/\",\n \"init_args\": InitArgsCellVoyager(\n image_dir=in_path,\n plate_prefix=plate_prefix,\n well_ID=get_filename_well_id(row, column),\n image_extension=image_extension,\n image_glob_patterns=image_glob_patterns,\n ).dict(),\n }\n )\n group_well = group_plate.create_group(f\"{row}/{column}/\")\n\n well_attrs = {\n \"images\": [{\"path\": \"0\"}],\n \"version\": __OME_NGFF_VERSION__,\n }\n\n # Validate well attrs:\n Well(**well_attrs)\n group_well.attrs[\"well\"] = well_attrs\n\n group_image = group_well.create_group(\"0/\") # noqa: F841\n group_image.attrs[\"multiscales\"] = [\n {\n \"version\": __OME_NGFF_VERSION__,\n \"axes\": [\n {\"name\": \"c\", \"type\": \"channel\"},\n {\n \"name\": \"z\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n {\n \"name\": \"y\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n {\n \"name\": \"x\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n ],\n \"datasets\": [\n {\n \"path\": f\"{ind_level}\",\n \"coordinateTransformations\": [\n {\n \"type\": \"scale\",\n \"scale\": [\n 1,\n pixel_size_z,\n pixel_size_y\n * coarsening_xy**ind_level,\n pixel_size_x\n * coarsening_xy**ind_level,\n ],\n }\n 
],\n }\n for ind_level in range(num_levels)\n ],\n }\n ]\n\n group_image.attrs[\"omero\"] = {\n \"id\": 1, # TODO does this depend on the plate number?\n \"name\": \"TBD\",\n \"version\": __OME_NGFF_VERSION__,\n \"channels\": define_omero_channels(\n channels=actual_channels, bit_depth=bit_depth\n ),\n }\n\n # Validate Image attrs\n NgffImageMeta(**group_image.attrs)\n\n # Prepare AnnData tables for FOV/well ROIs\n well_id = get_filename_well_id(row, column)\n FOV_ROIs_table = prepare_FOV_ROI_table(site_metadata.loc[well_id])\n well_ROIs_table = prepare_well_ROI_table(\n site_metadata.loc[well_id]\n )\n\n # Write AnnData tables into the `tables` zarr group\n write_table(\n group_image,\n \"FOV_ROI_table\",\n FOV_ROIs_table,\n overwrite=overwrite,\n table_attrs={\"type\": \"roi_table\"},\n )\n write_table(\n group_image,\n \"well_ROI_table\",\n well_ROIs_table,\n overwrite=overwrite,\n table_attrs={\"type\": \"roi_table\"},\n )\n\n return dict(parallelization_list=parallelization_list)\n
Create OME-NGFF structure and metadata to host a multiplexing dataset.
This task takes a set of image folders (i.e. different multiplexing acquisitions) and builds the internal structure and metadata of an OME-NGFF zarr group, without actually loading/writing the image data.
Each entry in acquisitions is treated as a different acquisition.
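One consequence of treating every acquisition separately is that a well accumulates one image entry per acquisition. A minimal sketch of that bookkeeping, assuming string acquisition keys as in the source listing; the helper name is hypothetical and the version string is only a placeholder.

```python
def add_acquisition_image(well_attrs, acquisition):
    """Return well attributes with one more image entry for `acquisition`."""
    images = list(well_attrs.get("images", []))
    images.append({"path": acquisition, "acquisition": int(acquisition)})
    return {**well_attrs, "images": images}


well = {"images": [], "version": "0.4"}  # placeholder version string
for acquisition in sorted(["1", "0"]):
    well = add_acquisition_image(well, acquisition)
print(well["images"])
```

Because the keys are strings, `sorted` orders them lexicographically, so a key such as "10" would come before "2".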
PARAMETER DESCRIPTION zarr_urls
List of paths or urls to the individual OME-Zarr images to be processed. Not used by the converter task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: list[str]
zarr_dir
Path of the directory where the new OME-Zarrs will be created. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
acquisitions
Dictionary of acquisitions. Each key is the acquisition identifier, given as a string (normally "0", "1", "2", "3" etc.). Each item defines the acquisition by providing the image_dir and the allowed_channels.
TYPE: dict[str, MultiplexingAcquisition]
image_glob_patterns
If specified, only parse images whose filenames match all of these patterns. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html. Examples: image_glob_patterns=["*_B03_*"] => only process well B03; image_glob_patterns=["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"] => only process well C09, field of view 16 and Z planes 0-59.
TYPE: Optional[list[str]] DEFAULT: None
num_levels
Number of resolution-pyramid levels. If set to 5, there will be the full-resolution level and 4 levels of downsampled images.
TYPE: int DEFAULT: 5
coarsening_xy
Linear coarsening factor between subsequent levels. If set to 2, level 1 is 2x downsampled, level 2 is 4x downsampled etc.
TYPE: int DEFAULT: 2
image_extension
Filename extension of images (e.g. "tif" or "png").
TYPE: str DEFAULT: 'tif'
metadata_table_files
If None, parse Yokogawa metadata from the mrf/mlf files in each acquisition's image_dir folder; otherwise, a dictionary of (acquisition, path) key-value pairs, where acquisition is a string matching a key of the acquisitions dict and path points to a csv file containing the parsed metadata table.
TYPE: Optional[dict[str, str]] DEFAULT: None
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION dict[str, Any]
A metadata dictionary containing important metadata about the OME-Zarr plate, the images and some parameters required by downstream tasks (like num_levels).
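The "match all patterns" behaviour described for `image_glob_patterns` can be tried out with the standard-library `fnmatch` module that the parameter description points to. This is only a sketch: `matches_all` is a hypothetical helper, the file names are invented, and whether the library's own globbing helper works exactly this way is an assumption.

```python
from fnmatch import fnmatch


def matches_all(filename, patterns):
    """True only if the filename satisfies every pattern in the list."""
    return all(fnmatch(filename, pattern) for pattern in patterns)


patterns = ["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"]
names = [
    "plate_C09_T0001F016L01A01Z05C01.tif",  # well, FOV and Z plane all match
    "plate_C09_T0001F016L01A01Z75C01.tif",  # Z plane 75 is outside 0-59
    "plate_B03_T0001F016L01A01Z05C01.tif",  # wrong well
]
selected = [name for name in names if matches_all(name, patterns)]
print(selected)
```

Patterns are combined with a logical AND, so adding a pattern can only narrow the selection.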
Source code in fractal_tasks_core/tasks/cellvoyager_to_ome_zarr_init_multiplex.py
@validate_arguments\ndef cellvoyager_to_ome_zarr_init_multiplex(\n *,\n # Fractal parameters\n zarr_urls: list[str],\n zarr_dir: str,\n # Core parameters\n acquisitions: dict[str, MultiplexingAcquisition],\n # Advanced parameters\n image_glob_patterns: Optional[list[str]] = None,\n num_levels: int = 5,\n coarsening_xy: int = 2,\n image_extension: str = \"tif\",\n metadata_table_files: Optional[dict[str, str]] = None,\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Create OME-NGFF structure and metadata to host a multiplexing dataset.\n\n This task takes a set of image folders (i.e. different multiplexing\n acquisitions) and build the internal structure and metadata of a OME-NGFF\n zarr group, without actually loading/writing the image data.\n\n Each element in input_paths should be treated as a different acquisition.\n\n Args:\n zarr_urls: List of paths or urls to the individual OME-Zarr image to\n be processed. Not used by the converter task.\n (standard argument for Fractal tasks, managed by Fractal server).\n zarr_dir: path of the directory where the new OME-Zarrs will be\n created.\n (standard argument for Fractal tasks, managed by Fractal server).\n acquisitions: dictionary of acquisitions. Each key is the acquisition\n identifier (normally 0, 1, 2, 3 etc.). Each item defines the\n acquisition by providing the image_dir and the allowed_channels.\n image_glob_patterns: If specified, only parse images with filenames\n that match with all these patterns. Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html, Example:\n `image_glob_pattern=[\"*_B03_*\"]` => only process well B03\n `image_glob_pattern=[\"*_C09_*\", \"*F016*\", \"*Z[0-5][0-9]C*\"]` =>\n only process well C09, field of view 16 and Z planes 0-59.\n num_levels: Number of resolution-pyramid levels. 
If set to `5`, there\n will be the full-resolution level and 4 levels of downsampled\n images.\n coarsening_xy: Linear coarsening factor between subsequent levels.\n If set to `2`, level 1 is 2x downsampled, level 2 is 4x downsampled\n etc.\n image_extension: Filename extension of images\n (e.g. `\"tif\"` or `\"png\"`).\n metadata_table_files: If `None`, parse Yokogawa metadata from mrf/mlf\n files in the input_path folder; else, a dictionary of key-value\n pairs like `(acquisition, path)` with `acquisition` a string like\n the key of the `acquisitions` dict and `path` pointing to a csv\n file containing the parsed metadata table.\n overwrite: If `True`, overwrite the task output.\n\n Returns:\n A metadata dictionary containing important metadata about the OME-Zarr\n plate, the images and some parameters required by downstream tasks\n (like `num_levels`).\n \"\"\"\n\n if metadata_table_files:\n\n # Checks on the dict:\n # 1. Acquisitions in acquisitions dict and metadata_table_files match\n # 2. Files end with \".csv\"\n # 3. 
Files exist.\n if set(acquisitions.keys()) != set(metadata_table_files.keys()):\n raise ValueError(\n \"Mismatch in acquisition keys between \"\n f\"{acquisitions.keys()=} and \"\n f\"{metadata_table_files.keys()=}\"\n )\n for f in metadata_table_files.values():\n if not f.endswith(\".csv\"):\n raise ValueError(\n f\"{f} (in metadata_table_file) is not a csv file.\"\n )\n if not os.path.isfile(f):\n raise ValueError(\n f\"{f} (in metadata_table_file) does not exist.\"\n )\n\n # Preliminary checks on acquisitions\n # Note that in metadata the keys of dictionary arguments should be\n # strings (and not integers), so that they can be read from a JSON file\n for key, values in acquisitions.items():\n if not isinstance(key, str):\n raise ValueError(f\"{acquisitions=} has non-string keys\")\n check_unique_wavelength_ids(values.allowed_channels)\n\n # Identify all plates and all channels, per input folders\n dict_acquisitions: dict = {}\n for acquisition, acq_input in acquisitions.items():\n dict_acquisitions[acquisition] = {}\n\n actual_wavelength_ids = []\n plates = []\n plate_prefixes = []\n\n # Loop over all images\n patterns = [f\"*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n input_filenames = glob_with_multiple_patterns(\n folder=acq_input.image_dir,\n patterns=patterns,\n )\n for fn in input_filenames:\n try:\n filename_metadata = parse_filename(Path(fn).name)\n plate = filename_metadata[\"plate\"]\n plates.append(plate)\n plate_prefix = filename_metadata[\"plate_prefix\"]\n plate_prefixes.append(plate_prefix)\n A = filename_metadata[\"A\"]\n C = filename_metadata[\"C\"]\n actual_wavelength_ids.append(f\"A{A}_C{C}\")\n except ValueError as e:\n logger.warning(\n f'Skipping \"{Path(fn).name}\". 
Original error: ' + str(e)\n )\n plates = sorted(list(set(plates)))\n actual_wavelength_ids = sorted(list(set(actual_wavelength_ids)))\n\n info = (\n \"Listing all plates/channels:\\n\"\n f\"Patterns: {patterns}\\n\"\n f\"Plates: {plates}\\n\"\n f\"Actual wavelength IDs: {actual_wavelength_ids}\\n\"\n )\n\n # Check that a folder includes a single plate\n if len(plates) > 1:\n raise ValueError(f\"{info}ERROR: {len(plates)} plates detected\")\n elif len(plates) == 0:\n raise ValueError(f\"{info}ERROR: No plates detected\")\n original_plate = plates[0]\n plate_prefix = plate_prefixes[0]\n\n # Replace plate with the one of acquisition 0, if needed\n if int(acquisition) > 0:\n plate = dict_acquisitions[\"0\"][\"plate\"]\n logger.warning(\n f\"For {acquisition=}, we replace {original_plate=} with \"\n f\"{plate=} (the one for acquisition 0)\"\n )\n\n # Check that all channels are in the allowed_channels\n allowed_wavelength_ids = [\n c.wavelength_id for c in acq_input.allowed_channels\n ]\n if not set(actual_wavelength_ids).issubset(\n set(allowed_wavelength_ids)\n ):\n msg = \"ERROR in create_ome_zarr\\n\"\n msg += f\"actual_wavelength_ids: {actual_wavelength_ids}\\n\"\n msg += f\"allowed_wavelength_ids: {allowed_wavelength_ids}\\n\"\n raise ValueError(msg)\n\n # Create actual_channels, i.e. 
a list of the channel dictionaries which\n # are present\n actual_channels = [\n channel\n for channel in acq_input.allowed_channels\n if channel.wavelength_id in actual_wavelength_ids\n ]\n\n logger.info(f\"plate: {plate}\")\n logger.info(f\"actual_channels: {actual_channels}\")\n\n dict_acquisitions[acquisition] = {}\n dict_acquisitions[acquisition][\"plate\"] = plate\n dict_acquisitions[acquisition][\"original_plate\"] = original_plate\n dict_acquisitions[acquisition][\"plate_prefix\"] = plate_prefix\n dict_acquisitions[acquisition][\"image_folder\"] = acq_input.image_dir\n dict_acquisitions[acquisition][\"original_paths\"] = [\n acq_input.image_dir\n ]\n dict_acquisitions[acquisition][\"actual_channels\"] = actual_channels\n dict_acquisitions[acquisition][\n \"actual_wavelength_ids\"\n ] = actual_wavelength_ids\n\n parallelization_list = []\n acquisitions_sorted = sorted(list(acquisitions.keys()))\n current_plates = [item[\"plate\"] for item in dict_acquisitions.values()]\n if len(set(current_plates)) > 1:\n raise ValueError(f\"{current_plates=}\")\n plate = current_plates[0]\n\n zarrurl = dict_acquisitions[acquisitions_sorted[0]][\"plate\"] + \".zarr\"\n full_zarrurl = str(Path(zarr_dir) / zarrurl)\n logger.info(f\"Creating {full_zarrurl=}\")\n # Call zarr.open_group wrapper, which handles overwrite=True/False\n group_plate = open_zarr_group_with_overwrite(\n full_zarrurl, overwrite=overwrite\n )\n group_plate.attrs[\"plate\"] = {\n \"acquisitions\": [\n {\n \"id\": int(acquisition),\n \"name\": dict_acquisitions[acquisition][\"original_plate\"],\n }\n for acquisition in acquisitions_sorted\n ]\n }\n\n zarrurls: dict[str, list[str]] = {\"well\": [], \"image\": []}\n zarrurls[\"plate\"] = [f\"{plate}.zarr\"]\n\n ################################################################\n logging.info(f\"{acquisitions_sorted=}\")\n\n for acquisition in acquisitions_sorted:\n\n # Define plate zarr\n image_folder = dict_acquisitions[acquisition][\"image_folder\"]\n 
logger.info(f\"Looking at {image_folder=}\")\n\n # Obtain FOV-metadata dataframe\n if metadata_table_files is None:\n mrf_path = f\"{image_folder}/MeasurementDetail.mrf\"\n mlf_path = f\"{image_folder}/MeasurementData.mlf\"\n site_metadata, total_files = parse_yokogawa_metadata(\n mrf_path, mlf_path, filename_patterns=image_glob_patterns\n )\n site_metadata = remove_FOV_overlaps(site_metadata)\n else:\n site_metadata = pd.read_csv(metadata_table_files[acquisition])\n site_metadata.set_index([\"well_id\", \"FieldIndex\"], inplace=True)\n\n # Extract pixel sizes and bit_depth\n pixel_size_z = site_metadata[\"pixel_size_z\"][0]\n pixel_size_y = site_metadata[\"pixel_size_y\"][0]\n pixel_size_x = site_metadata[\"pixel_size_x\"][0]\n bit_depth = site_metadata[\"bit_depth\"][0]\n\n if min(pixel_size_z, pixel_size_y, pixel_size_x) < 1e-9:\n raise ValueError(pixel_size_z, pixel_size_y, pixel_size_x)\n\n # Identify all wells\n plate_prefix = dict_acquisitions[acquisition][\"plate_prefix\"]\n patterns = [f\"{plate_prefix}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n plate_images = glob_with_multiple_patterns(\n folder=str(image_folder),\n patterns=patterns,\n )\n\n wells = [\n parse_filename(os.path.basename(fn))[\"well\"] for fn in plate_images\n ]\n wells = sorted(list(set(wells)))\n logger.info(f\"{wells=}\")\n\n # Verify that all wells have all channels\n actual_channels = dict_acquisitions[acquisition][\"actual_channels\"]\n for well in wells:\n patterns = [f\"{plate_prefix}_{well}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n well_images = glob_with_multiple_patterns(\n folder=str(image_folder),\n patterns=patterns,\n )\n\n well_wavelength_ids = []\n for fpath in well_images:\n try:\n filename_metadata = parse_filename(os.path.basename(fpath))\n A = filename_metadata[\"A\"]\n C = filename_metadata[\"C\"]\n well_wavelength_ids.append(f\"A{A}_C{C}\")\n except IndexError:\n 
logger.info(f\"Skipping {fpath}\")\n well_wavelength_ids = sorted(list(set(well_wavelength_ids)))\n actual_wavelength_ids = dict_acquisitions[acquisition][\n \"actual_wavelength_ids\"\n ]\n if well_wavelength_ids != actual_wavelength_ids:\n raise ValueError(\n f\"ERROR: well {well} in plate {plate} (prefix: \"\n f\"{plate_prefix}) has missing channels.\\n\"\n f\"Expected: {actual_wavelength_ids}\\n\"\n f\"Found: {well_wavelength_ids}.\\n\"\n )\n\n well_rows_columns = generate_row_col_split(wells)\n row_list = [\n well_row_column[0] for well_row_column in well_rows_columns\n ]\n col_list = [\n well_row_column[1] for well_row_column in well_rows_columns\n ]\n row_list = sorted(list(set(row_list)))\n col_list = sorted(list(set(col_list)))\n\n plate_attrs = group_plate.attrs[\"plate\"]\n plate_attrs[\"columns\"] = [{\"name\": col} for col in col_list]\n plate_attrs[\"rows\"] = [{\"name\": row} for row in row_list]\n plate_attrs[\"wells\"] = [\n {\n \"path\": well_row_column[0] + \"/\" + well_row_column[1],\n \"rowIndex\": row_list.index(well_row_column[0]),\n \"columnIndex\": col_list.index(well_row_column[1]),\n }\n for well_row_column in well_rows_columns\n ]\n plate_attrs[\"version\"] = __OME_NGFF_VERSION__\n # Validate plate attrs\n Plate(**plate_attrs)\n group_plate.attrs[\"plate\"] = plate_attrs\n\n for row, column in well_rows_columns:\n parallelization_list.append(\n {\n \"zarr_url\": (\n f\"{zarr_dir}/{plate}.zarr/{row}/{column}/\"\n f\"{acquisition}/\"\n ),\n \"init_args\": InitArgsCellVoyager(\n image_dir=acquisitions[acquisition].image_dir,\n plate_prefix=plate_prefix,\n well_ID=get_filename_well_id(row, column),\n image_extension=image_extension,\n image_glob_patterns=image_glob_patterns,\n acquisition=acquisition,\n ).dict(),\n }\n )\n try:\n group_well = group_plate.create_group(f\"{row}/{column}/\")\n logging.info(f\"Created new group_well at {row}/{column}/\")\n well_attrs = {\n \"images\": [\n {\n \"path\": f\"{acquisition}\",\n \"acquisition\": 
int(acquisition),\n }\n ],\n \"version\": __OME_NGFF_VERSION__,\n }\n # Validate well attrs:\n Well(**well_attrs)\n group_well.attrs[\"well\"] = well_attrs\n zarrurls[\"well\"].append(f\"{plate}.zarr/{row}/{column}\")\n except ContainsGroupError:\n group_well = zarr.open_group(\n f\"{full_zarrurl}/{row}/{column}/\", mode=\"r+\"\n )\n logging.info(\n f\"Loaded group_well from {full_zarrurl}/{row}/{column}\"\n )\n current_images = group_well.attrs[\"well\"][\"images\"] + [\n {\"path\": f\"{acquisition}\", \"acquisition\": int(acquisition)}\n ]\n well_attrs = dict(\n images=current_images,\n version=group_well.attrs[\"well\"][\"version\"],\n )\n # Validate well attrs:\n Well(**well_attrs)\n group_well.attrs[\"well\"] = well_attrs\n\n group_image = group_well.create_group(\n f\"{acquisition}/\"\n ) # noqa: F841\n logging.info(f\"Created image group {row}/{column}/{acquisition}\")\n image = f\"{plate}.zarr/{row}/{column}/{acquisition}\"\n zarrurls[\"image\"].append(image)\n\n group_image.attrs[\"multiscales\"] = [\n {\n \"version\": __OME_NGFF_VERSION__,\n \"axes\": [\n {\"name\": \"c\", \"type\": \"channel\"},\n {\n \"name\": \"z\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n {\n \"name\": \"y\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n {\n \"name\": \"x\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n ],\n \"datasets\": [\n {\n \"path\": f\"{ind_level}\",\n \"coordinateTransformations\": [\n {\n \"type\": \"scale\",\n \"scale\": [\n 1,\n pixel_size_z,\n pixel_size_y\n * coarsening_xy**ind_level,\n pixel_size_x\n * coarsening_xy**ind_level,\n ],\n }\n ],\n }\n for ind_level in range(num_levels)\n ],\n }\n ]\n\n group_image.attrs[\"omero\"] = {\n \"id\": 1, # FIXME does this depend on the plate number?\n \"name\": \"TBD\",\n \"version\": __OME_NGFF_VERSION__,\n \"channels\": define_omero_channels(\n channels=actual_channels,\n bit_depth=bit_depth,\n label_prefix=acquisition,\n ),\n }\n # Validate Image attrs\n 
NgffImageMeta(**group_image.attrs)\n\n # Prepare AnnData tables for FOV/well ROIs\n well_id = get_filename_well_id(row, column)\n FOV_ROIs_table = prepare_FOV_ROI_table(site_metadata.loc[well_id])\n well_ROIs_table = prepare_well_ROI_table(\n site_metadata.loc[well_id]\n )\n\n # Write AnnData tables into the `tables` zarr group\n write_table(\n group_image,\n \"FOV_ROI_table\",\n FOV_ROIs_table,\n overwrite=overwrite,\n table_attrs={\"type\": \"roi_table\"},\n )\n write_table(\n group_image,\n \"well_ROI_table\",\n well_ROIs_table,\n overwrite=overwrite,\n table_attrs={\"type\": \"roi_table\"},\n )\n\n # Check that the different images (e.g. different acquisitions) in the each\n # well have unique labels\n for well_path in zarrurls[\"well\"]:\n check_well_channel_labels(\n well_zarr_path=str(Path(zarr_dir) / well_path)\n )\n\n return dict(parallelization_list=parallelization_list)\n
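The `datasets` entry of the `multiscales` metadata above multiplies the y and x pixel sizes by `coarsening_xy**ind_level` and leaves the channel and z entries unchanged. A minimal sketch of that arithmetic follows; the helper name `pyramid_scales` and the pixel sizes are illustrative assumptions, not part of fractal-tasks-core.

```python
def pyramid_scales(
    pixel_size_z: float,
    pixel_size_y: float,
    pixel_size_x: float,
    coarsening_xy: int,
    num_levels: int,
) -> list[list[float]]:
    """Return one [c, z, y, x] scale per pyramid level (hypothetical helper)."""
    return [
        [
            1,  # channel axis is never rescaled
            pixel_size_z,  # z is not coarsened
            pixel_size_y * coarsening_xy**level,
            pixel_size_x * coarsening_xy**level,
        ]
        for level in range(num_levels)
    ]


# Made-up pixel sizes in micrometers
scales = pyramid_scales(1.0, 0.25, 0.25, coarsening_xy=2, num_levels=3)
for level, scale in enumerate(scales):
    print(level, scale)
```

With `coarsening_xy=2`, each level doubles the y/x pixel size of the previous one, while the z pixel size stays the same at every level.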
Generate metadata for OME-Zarr HCS plates & wells.
Based on the list of zarr_urls, generate metadata for all plates and all their wells.
PARAMETER DESCRIPTION zarr_urls
List of paths or urls to the individual OME-Zarr images to be processed.
TYPE: list[str]
RETURNS DESCRIPTION plate_metadata_dicts
Dictionary of plate metadata. The structure is: {\"old_plate_name\": NgffPlateMeta (as dict)}.
TYPE: dict[str, dict]
new_well_image_attrs
Dictionary of image lists for the new wells. The structure is: {\"old_plate_name\": {\"old_well_name\": [ImageInWell(as dict)]}}
TYPE: dict[str, dict[str, dict]]
well_image_attrs
Dictionary of Image attributes of the existing wells.
TYPE: dict[str, dict]
Source code in fractal_tasks_core/tasks/copy_ome_zarr_hcs_plate.py
def _generate_plate_well_metadata(\n zarr_urls: list[str],\n) -> tuple[dict[str, dict], dict[str, dict[str, dict]], dict[str, dict]]:\n\"\"\"\n Generate metadata for OME-Zarr HCS plates & wells.\n\n Based on the list of zarr_urls, generate metadata for all plates and all\n their wells.\n\n Args:\n zarr_urls: List of paths or urls to the individual OME-Zarr image to\n be processed.\n\n Returns:\n plate_metadata_dicts: Dictionary of plate plate metadata. The structure\n is: {\"old_plate_name\": NgffPlateMeta (as dict)}.\n new_well_image_attrs: Dictionary of image lists for the new wells.\n The structure is: {\"old_plate_name\": {\"old_well_name\":\n [ImageInWell(as dict)]}}\n well_image_attrs: Dictionary of Image attributes of the existing wells.\n \"\"\"\n # TODO: Simplify this block. Currently complicated, because we need to loop\n # through all potential plates, all their wells & their images to build up\n # the metadata for the plate & well.\n plate_metadata_dicts = {}\n plate_wells = {}\n well_image_attrs = {}\n new_well_image_attrs = {}\n for zarr_url in zarr_urls:\n # Extract plate/well/image parts of `zarr_url`\n old_plate_url = _get_plate_url_from_image_url(zarr_url)\n well_sub_url = _get_well_sub_url(zarr_url)\n curr_img_sub_url = _get_image_sub_url(zarr_url)\n\n # The first time a plate is found, create its metadata\n if old_plate_url not in plate_metadata_dicts:\n logger.info(f\"Reading plate metadata of {old_plate_url=}\")\n old_plate_meta = load_NgffPlateMeta(old_plate_url)\n plate_metadata = dict(\n plate=dict(\n acquisitions=old_plate_meta.plate.acquisitions,\n field_count=old_plate_meta.plate.field_count,\n name=old_plate_meta.plate.name,\n # The new field count could be different from the old\n # field count\n version=old_plate_meta.plate.version,\n )\n )\n plate_metadata_dicts[old_plate_url] = plate_metadata\n plate_wells[old_plate_url] = []\n well_image_attrs[old_plate_url] = {}\n new_well_image_attrs[old_plate_url] = {}\n\n # The first time a 
plate/well pair is found, create the well metadata\n if well_sub_url not in plate_wells[old_plate_url]:\n plate_wells[old_plate_url].append(well_sub_url)\n old_well_url = f\"{old_plate_url}/{well_sub_url}\"\n logger.info(f\"Reading well metadata of {old_well_url}\")\n well_attrs = load_NgffWellMeta(old_well_url)\n well_image_attrs[old_plate_url][well_sub_url] = well_attrs.well\n new_well_image_attrs[old_plate_url][well_sub_url] = []\n\n # Find images of the current well with name matching the current image\n # TODO: clarify whether this list must always have length 1\n curr_well_image_list = [\n img\n for img in well_image_attrs[old_plate_url][well_sub_url].images\n if img.path == curr_img_sub_url\n ]\n new_well_image_attrs[old_plate_url][\n well_sub_url\n ] += curr_well_image_list\n\n # Fill in the plate metadata based on all available wells\n for old_plate_url in plate_metadata_dicts:\n well_list, row_list, column_list = _generate_wells_rows_columns(\n plate_wells[old_plate_url]\n )\n plate_metadata_dicts[old_plate_url][\"plate\"][\"columns\"] = []\n for column in column_list:\n plate_metadata_dicts[old_plate_url][\"plate\"][\"columns\"].append(\n {\"name\": column}\n )\n\n plate_metadata_dicts[old_plate_url][\"plate\"][\"rows\"] = []\n for row in row_list:\n plate_metadata_dicts[old_plate_url][\"plate\"][\"rows\"].append(\n {\"name\": row}\n )\n plate_metadata_dicts[old_plate_url][\"plate\"][\"wells\"] = well_list\n\n # Validate with NgffPlateMeta model\n plate_metadata_dicts[old_plate_url] = NgffPlateMeta(\n **plate_metadata_dicts[old_plate_url]\n ).dict(exclude_none=True)\n\n return plate_metadata_dicts, new_well_image_attrs, well_image_attrs\n
Generate the plate well metadata based on the list of wells.
Source code in fractal_tasks_core/tasks/copy_ome_zarr_hcs_plate.py
def _generate_wells_rows_columns(\n well_list: list[str],\n) -> tuple[list[WellInPlate], list[str], list[str]]:\n\"\"\"\n Generate the plate well metadata based on the list of wells.\n \"\"\"\n rows = []\n columns = []\n wells = []\n for well in well_list:\n rows.append(well.split(\"/\")[0])\n columns.append(well.split(\"/\")[1])\n rows = sorted(list(set(rows)))\n columns = sorted(list(set(columns)))\n for well in well_list:\n wells.append(\n WellInPlate(\n path=well,\n rowIndex=rows.index(well.split(\"/\")[0]),\n columnIndex=columns.index(well.split(\"/\")[1]),\n )\n )\n\n return wells, rows, columns\n
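To make the row and column indexing concrete, here is a small sketch that mirrors the logic of `_generate_wells_rows_columns` using plain dictionaries instead of the `WellInPlate` model. The function name `wells_rows_columns` and the sample well paths are assumptions made for illustration.

```python
def wells_rows_columns(
    well_list: list[str],
) -> tuple[list[dict], list[str], list[str]]:
    """Plain-dict variant of the helper above (illustrative only)."""
    # Unique, sorted row and column names taken from "row/column" paths
    rows = sorted({well.split("/")[0] for well in well_list})
    columns = sorted({well.split("/")[1] for well in well_list})
    # Each well stores its position within the sorted row/column lists
    wells = [
        {
            "path": well,
            "rowIndex": rows.index(well.split("/")[0]),
            "columnIndex": columns.index(well.split("/")[1]),
        }
        for well in well_list
    ]
    return wells, rows, columns


wells, rows, columns = wells_rows_columns(["B/03", "B/05", "C/03"])
print(rows, columns)
print(wells)
```

Note that the indices refer to positions among the wells that are actually present, not to absolute positions on a physical plate.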
Given the absolute zarr_url for an OME-Zarr image within an HCS plate, return the path to the plate zarr group.
Source code in fractal_tasks_core/tasks/copy_ome_zarr_hcs_plate.py
def _get_plate_url_from_image_url(zarr_url: str) -> str:\n\"\"\"\n Given the absolute `zarr_url` for an OME-Zarr image within an HCS plate,\n return the path to the plate zarr group.\n \"\"\"\n zarr_url = zarr_url.rstrip(\"/\")\n plate_path = \"/\".join(zarr_url.split(\"/\")[:-3])\n return plate_path\n
Given the absolute zarr_url for an OME-Zarr image within an HCS plate, return the sub-path of the well (row/column) relative to the plate zarr group.
Source code in fractal_tasks_core/tasks/copy_ome_zarr_hcs_plate.py
def _get_well_sub_url(zarr_url: str) -> str:\n\"\"\"\n Given the absolute `zarr_url` for an OME-Zarr image within an HCS plate,\n return the path to the image zarr group.\n \"\"\"\n zarr_url = zarr_url.rstrip(\"/\")\n well_url = \"/\".join(zarr_url.split(\"/\")[-3:-1])\n return well_url\n
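The two helpers above split an image URL purely by position, assuming the layout `<plate>.zarr/<row>/<column>/<image>`. The sketch below applies the same string operations to a made-up URL. The source of `_get_image_sub_url` is not shown here, so treating it as "the last path component" is an assumption.

```python
# Hypothetical image URL following the plate/row/column/image layout
zarr_url = "/data/plate.zarr/B/03/0/"

parts = zarr_url.rstrip("/").split("/")
plate_url = "/".join(parts[:-3])  # everything before row/column/image
well_sub_url = "/".join(parts[-3:-1])  # row/column
image_sub_url = parts[-1]  # assumed behaviour of _get_image_sub_url

print(plate_url)
print(well_sub_url)
print(image_sub_url)
```

Because the split is positional, a URL with extra nesting below the plate group would yield the wrong plate path.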
Duplicate the OME-Zarr HCS structure for a set of zarr_urls.
This task only processes the zarr images in the zarr_urls, not all the images in the plate. It copies all the plate & well structure, but none of the image metadata or the actual image data:
For each plate, create a new OME-Zarr HCS plate with the attributes for all the images in zarr_urls
For each well (in each plate), create a new zarr subgroup with the same attributes as the original one.
Note: this task makes use of methods from the Attributes class, see https://zarr.readthedocs.io/en/stable/api/attrs.html.
PARAMETER DESCRIPTION zarr_urls
List of paths or urls to the individual OME-Zarr images to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: list[str]
zarr_dir
Path of the directory where the new OME-Zarrs will be created. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
suffix
The suffix that is used to transform plate.zarr into plate_suffix.zarr. Note that None is not currently supported.
TYPE: str DEFAULT: 'mip'
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION dict[str, Any]
A parallelization list to be used in a compute task to fill the wells with OME-Zarr images.
Source code in fractal_tasks_core/tasks/copy_ome_zarr_hcs_plate.py
@validate_arguments\ndef copy_ome_zarr_hcs_plate(\n *,\n # Fractal parameters\n zarr_urls: list[str],\n zarr_dir: str,\n # Advanced parameters\n suffix: str = \"mip\",\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Duplicate the OME-Zarr HCS structure for a set of zarr_urls.\n\n This task only processes the zarr images in the zarr_urls, not all the\n images in the plate. It copies all the plate & well structure, but none\n of the image metadata or the actual image data:\n\n - For each plate, create a new OME-Zarr HCS plate with the attributes for\n all the images in zarr_urls\n - For each well (in each plate), create a new zarr subgroup with the\n same attributes as the original one.\n\n Note: this task makes use of methods from the `Attributes` class, see\n https://zarr.readthedocs.io/en/stable/api/attrs.html.\n\n Args:\n zarr_urls: List of paths or urls to the individual OME-Zarr image to\n be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n zarr_dir: path of the directory where the new OME-Zarrs will be\n created.\n (standard argument for Fractal tasks, managed by Fractal server).\n suffix: The suffix that is used to transform `plate.zarr` into\n `plate_suffix.zarr`. 
Note that `None` is not currently supported.\n overwrite: If `True`, overwrite the task output.\n\n Returns:\n A parallelization list to be used in a compute task to fill the wells\n with OME-Zarr images.\n \"\"\"\n\n # Preliminary check\n if suffix is None or suffix == \"\":\n raise ValueError(\n \"Running copy_ome_zarr_hcs_plate without a suffix would lead to\"\n \"overwriting of the existing HCS plates.\"\n )\n\n parallelization_list = []\n\n # Generate parallelization list\n for zarr_url in zarr_urls:\n old_plate_url = _get_plate_url_from_image_url(zarr_url)\n well_sub_url = _get_well_sub_url(zarr_url)\n old_plate_name = old_plate_url.split(\".zarr\")[-2].split(\"/\")[-1]\n new_plate_name = f\"{old_plate_name}_{suffix}\"\n zarrurl_plate_new = f\"{zarr_dir}/{new_plate_name}.zarr\"\n curr_img_sub_url = _get_image_sub_url(zarr_url)\n new_zarr_url = f\"{zarrurl_plate_new}/{well_sub_url}/{curr_img_sub_url}\"\n parallelization_item = dict(\n zarr_url=new_zarr_url,\n init_args=dict(origin_url=zarr_url),\n )\n InitArgsMIP(**parallelization_item[\"init_args\"])\n parallelization_list.append(parallelization_item)\n\n # Generate the plate metadata & parallelization list\n (\n plate_attrs_dicts,\n new_well_image_attrs,\n well_image_attrs,\n ) = _generate_plate_well_metadata(zarr_urls=zarr_urls)\n\n # Create the new OME-Zarr HCS plate\n for old_plate_url, plate_attrs in plate_attrs_dicts.items():\n old_plate_name = old_plate_url.split(\".zarr\")[-2].split(\"/\")[-1]\n new_plate_name = f\"{old_plate_name}_{suffix}\"\n zarrurl_new = f\"{zarr_dir}/{new_plate_name}.zarr\"\n logger.info(f\"{old_plate_url=}\")\n logger.info(f\"{zarrurl_new=}\")\n new_plate_group = open_zarr_group_with_overwrite(\n zarrurl_new, overwrite=overwrite\n )\n new_plate_group.attrs.put(plate_attrs)\n\n # Write well groups:\n for well_sub_url in new_well_image_attrs[old_plate_url]:\n new_well_group = zarr.group(f\"{zarrurl_new}/{well_sub_url}\")\n well_attrs = dict(\n well=dict(\n images=[\n 
img.dict(exclude_none=True)\n for img in new_well_image_attrs[old_plate_url][\n well_sub_url\n ]\n ],\n version=well_image_attrs[old_plate_url][\n well_sub_url\n ].version,\n )\n )\n new_well_group.attrs.put(well_attrs)\n\n return dict(parallelization_list=parallelization_list)\n
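As a rough illustration of how the task derives the new plate name and builds each parallelization item, the sketch below reproduces the string handling from the loop above without touching any Zarr data. The function name `build_parallelization_item` and the example paths are assumptions, and the last path component is assumed to be the image name.

```python
def build_parallelization_item(
    zarr_url: str, zarr_dir: str, suffix: str = "mip"
) -> dict:
    """Map an input image URL to its URL in the new, suffixed plate."""
    parts = zarr_url.rstrip("/").split("/")
    old_plate_url = "/".join(parts[:-3])
    well_sub_url = "/".join(parts[-3:-1])
    image_sub_url = parts[-1]  # assumption: last component is the image
    # "plate.zarr" becomes "plate_<suffix>.zarr"
    old_plate_name = old_plate_url.split(".zarr")[-2].split("/")[-1]
    new_plate_name = f"{old_plate_name}_{suffix}"
    new_zarr_url = (
        f"{zarr_dir}/{new_plate_name}.zarr/{well_sub_url}/{image_sub_url}"
    )
    return dict(zarr_url=new_zarr_url, init_args=dict(origin_url=zarr_url))


item = build_parallelization_item("/data/plate.zarr/B/03/0", "/out")
print(item)
```

Each item points the downstream compute task at the new image location and keeps the original image URL in `init_args`.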
Applies pre-calculated registration to ROI tables.
Apply pre-calculated registration such that the resulting ROIs contain the consensus aligned region between all acquisitions.
Parallelization level: well
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed. Refers to the zarr_url of the reference acquisition. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
init_args
Initialization arguments provided by init_group_by_well_for_multiplexing. It contains the zarr_url_list listing all the zarr_urls in the same well as the zarr_url of the reference acquisition that are being processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: InitArgsRegistrationConsensus
roi_table
Name of the ROI table over which the task loops to calculate the registration. Examples: FOV_ROI_table => loop over the fields of view, well_ROI_table => process the whole well as one image.
TYPE: str DEFAULT: 'FOV_ROI_table'
new_roi_table
Optional name for the new, registered ROI table. If no name is given, it will default to \"registered_\" + roi_table
TYPE: Optional[str] DEFAULT: None
Source code in fractal_tasks_core/tasks/find_registration_consensus.py
@validate_arguments\ndef find_registration_consensus(\n *,\n # Fractal parameters\n zarr_url: str,\n init_args: InitArgsRegistrationConsensus,\n # Core parameters\n roi_table: str = \"FOV_ROI_table\",\n # Advanced parameters\n new_roi_table: Optional[str] = None,\n):\n\"\"\"\n Applies pre-calculated registration to ROI tables.\n\n Apply pre-calculated registration such that resulting ROIs contain\n the consensus align region between all acquisitions.\n\n Parallelization level: well\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be processed.\n Refers to the zarr_url of the reference acquisition.\n (standard argument for Fractal tasks, managed by Fractal server).\n init_args: Intialization arguments provided by\n `init_group_by_well_for_multiplexing`. It contains the\n zarr_url_list listing all the zarr_urls in the same well as the\n zarr_url of the reference acquisition that are being processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n roi_table: Name of the ROI table over which the task loops to\n calculate the registration. Examples: `FOV_ROI_table` => loop over\n the field of views, `well_ROI_table` => process the whole well as\n one image.\n new_roi_table: Optional name for the new, registered ROI table. If no\n name is given, it will default to \"registered_\" + `roi_table`\n\n \"\"\"\n if not new_roi_table:\n new_roi_table = \"registered_\" + roi_table\n logger.info(\n f\"Running for {zarr_url=} & the other acquisitions in that well. 
\\n\"\n f\"Applying translation registration to {roi_table=} and storing it as \"\n f\"{new_roi_table=}.\"\n )\n\n # Collect all the ROI tables\n roi_tables = {}\n roi_tables_attrs = {}\n for acq_zarr_url in init_args.zarr_url_list:\n curr_ROI_table = ad.read_zarr(f\"{acq_zarr_url}/tables/{roi_table}\")\n curr_ROI_table_group = zarr.open_group(\n f\"{acq_zarr_url}/tables/{roi_table}\", mode=\"r\"\n )\n curr_ROI_table_attrs = curr_ROI_table_group.attrs.asdict()\n\n # For reference_acquisition, handle the fact that it doesn't\n # have the shifts\n if acq_zarr_url == zarr_url:\n curr_ROI_table = add_zero_translation_columns(curr_ROI_table)\n # Check for valid ROI tables\n are_ROI_table_columns_valid(table=curr_ROI_table)\n translation_columns = [\n \"translation_z\",\n \"translation_y\",\n \"translation_x\",\n ]\n if curr_ROI_table.var.index.isin(translation_columns).sum() != 3:\n raise ValueError(\n f\"{roi_table=} in {acq_zarr_url} does not contain the \"\n f\"translation columns {translation_columns} necessary to use \"\n \"this task.\"\n )\n roi_tables[acq_zarr_url] = curr_ROI_table\n roi_tables_attrs[acq_zarr_url] = curr_ROI_table_attrs\n\n # Check that all acquisitions have the same ROIs\n rois = roi_tables[list(roi_tables.keys())[0]].obs.index\n for acq_zarr_url, acq_roi_table in roi_tables.items():\n if not (acq_roi_table.obs.index == rois).all():\n raise ValueError(\n f\"Acquisition {acq_zarr_url} does not contain the same ROIs \"\n f\"as the reference acquisition {zarr_url}:\\n\"\n f\"{acq_zarr_url}: {acq_roi_table.obs.index}\\n\"\n f\"{zarr_url}: {rois}\"\n )\n\n roi_table_dfs = [\n roi_table.to_df().loc[:, translation_columns]\n for roi_table in roi_tables.values()\n ]\n logger.info(\"Calculating min & max translation across acquisitions.\")\n max_df, min_df = calculate_min_max_across_dfs(roi_table_dfs)\n shifted_rois = {}\n\n # Loop over acquisitions\n for acq_zarr_url in init_args.zarr_url_list:\n shifted_rois[acq_zarr_url] = 
apply_registration_to_single_ROI_table(\n roi_tables[acq_zarr_url], max_df, min_df\n )\n\n # TODO: Drop translation columns from this table?\n\n logger.info(\n f\"Write the registered ROI table {new_roi_table} for \"\n f\"{acq_zarr_url=}\"\n )\n # Save the shifted ROI table as a new table\n image_group = zarr.group(acq_zarr_url)\n write_table(\n image_group,\n new_roi_table,\n shifted_rois[acq_zarr_url],\n table_attrs=roi_tables_attrs[acq_zarr_url],\n )\n
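The helpers `calculate_min_max_across_dfs` and `apply_registration_to_single_ROI_table` are not shown on this page, so the following is only a sketch of one plausible way to obtain a consensus region along a single axis. It assumes that each acquisition's ROI start is shifted by the difference between the maximum translation and its own translation, and that the ROI length shrinks by the spread between maximum and minimum translation. The function name `consensus_roi_1d` and all numbers are hypothetical.

```python
def consensus_roi_1d(
    start: float, length: float, translations: dict[str, float]
) -> dict[str, tuple[float, float]]:
    """Return (new_start, new_length) per acquisition along one axis."""
    t_max = max(translations.values())
    t_min = min(translations.values())
    # The region shared by all acquisitions is shorter by the shift spread
    new_length = length - (t_max - t_min)
    return {
        acq: (start + t_max - t, new_length)
        for acq, t in translations.items()
    }


# Reference acquisition "0" has zero translation by construction
rois = consensus_roi_1d(0.0, 100.0, {"0": 0.0, "1": 2.0, "2": -1.0})
print(rois)
```

Under these assumptions every acquisition ends up with a ROI of the same size, which is what allows the registered ROIs to be compared across multiplexing cycles.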
Corrects a stack of images, using a given illumination profile (e.g. bright in the center of the image, dim outside).
PARAMETER DESCRIPTION img_stack
4D numpy array (czyx), with dummy size along c.
TYPE: ndarray
corr_img
2D numpy array (yx)
TYPE: ndarray
background
Background value that is subtracted from the image before the illumination correction is applied.
TYPE: int DEFAULT: 110
Source code in fractal_tasks_core/tasks/illumination_correction.py
def correct(\n img_stack: np.ndarray,\n corr_img: np.ndarray,\n background: int = 110,\n):\n\"\"\"\n Corrects a stack of images, using a given illumination profile (e.g. bright\n in the center of the image, dim outside).\n\n Args:\n img_stack: 4D numpy array (czyx), with dummy size along c.\n corr_img: 2D numpy array (yx)\n background: Background value that is subtracted from the image before\n the illumination correction is applied.\n \"\"\"\n\n logger.info(f\"Start correct, {img_stack.shape}\")\n\n # Check shapes\n if corr_img.shape != img_stack.shape[2:] or img_stack.shape[0] != 1:\n raise ValueError(\n \"Error in illumination_correction:\\n\"\n f\"{img_stack.shape=}\\n{corr_img.shape=}\"\n )\n\n # Store info about dtype\n dtype = img_stack.dtype\n dtype_max = np.iinfo(dtype).max\n\n # Background subtraction\n img_stack[img_stack <= background] = 0\n img_stack[img_stack > background] -= background\n\n # Apply the normalized correction matrix (requires a float array)\n # img_stack = img_stack.astype(np.float64)\n new_img_stack = img_stack / (corr_img / np.max(corr_img))[None, None, :, :]\n\n # Handle edge case: corrected image may have values beyond the limit of\n # the encoding, e.g. beyond 65535 for 16bit images. This clips values\n # that surpass this limit and triggers a warning\n if np.sum(new_img_stack > dtype_max) > 0:\n warnings.warn(\n \"Illumination correction created values beyond the max range of \"\n f\"the current image type. These have been clipped to {dtype_max=}.\"\n )\n new_img_stack[new_img_stack > dtype_max] = dtype_max\n\n logger.info(\"End correct\")\n\n # Cast back to original dtype and return\n return new_img_stack.astype(dtype)\n
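The per-pixel arithmetic of `correct` can be followed by hand. The sketch below applies the same three steps (background subtraction, division by the normalized profile, clipping to the dtype maximum) to single pixel values in plain Python. The real function operates on whole numpy arrays and also emits a warning when clipping occurs; the function name `correct_pixel` and the sample values are assumptions for illustration.

```python
def correct_pixel(
    value: int,
    profile: float,
    profile_max: float,
    background: int = 110,
    dtype_max: int = 65535,
) -> int:
    """Scalar version of the correction steps (illustrative only)."""
    # Background subtraction, clamped at zero
    value = value - background if value > background else 0
    # Divide by the illumination profile normalized to its maximum
    corrected = value / (profile / profile_max)
    # Clip to the dtype range, then truncate back to an integer
    return int(min(corrected, dtype_max))


dim_corner = correct_pixel(1110, profile=0.5, profile_max=1.0)
below_background = correct_pixel(100, profile=0.5, profile_max=1.0)
saturated = correct_pixel(40110, profile=0.5, profile_max=1.0)
print(dim_corner, below_background, saturated)
```

A pixel in a region where the profile is half of its maximum is doubled after background subtraction, and values that would exceed the 16-bit range are clipped.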
Applies illumination correction to the images in the OME-Zarr.
Assumes that the illumination correction profiles were generated separately beforehand and that the same background subtraction was used when they were calculated (otherwise the correction will not work well and may only be partial).
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
illumination_profiles_folder
Path of folder of illumination profiles.
TYPE: str
illumination_profiles
Dictionary where keys match the wavelength_id attributes of existing channels (e.g. A01_C01) and values are the filenames of the corresponding illumination profiles.
TYPE: dict[str, str]
background
Background value that is subtracted from the image before the illumination correction is applied. Set it to 0 if you don't want any background subtraction.
TYPE: int DEFAULT: 0
input_ROI_table
Name of the ROI table that contains the information about the location of the individual fields of view (FOVs) to which the illumination correction shall be applied. Defaults to \"FOV_ROI_table\", the default name Fractal converters give the ROI tables that list all FOVs separately. If you generated your OME-Zarr with a different converter and used Import OME-Zarr to generate the ROI tables, image_ROI_table is the right choice if you only have 1 FOV per Zarr image, and grid_ROI_table if you have multiple FOVs per Zarr image and set the right grid options during import.
TYPE: str DEFAULT: 'FOV_ROI_table'
overwrite_input
If True, the results of this task will overwrite the input image data. If False, a new image is generated and the illumination-corrected data is saved there.
TYPE: bool DEFAULT: True
suffix
What suffix to append to the illumination corrected images. Only relevant if overwrite_input=False.
TYPE: str DEFAULT: '_illum_corr'
Source code in fractal_tasks_core/tasks/illumination_correction.py
@validate_arguments\ndef illumination_correction(\n *,\n # Fractal parameters\n zarr_url: str,\n # Core parameters\n illumination_profiles_folder: str,\n illumination_profiles: dict[str, str],\n background: int = 0,\n input_ROI_table: str = \"FOV_ROI_table\",\n overwrite_input: bool = True,\n # Advanced parameters\n suffix: str = \"_illum_corr\",\n) -> dict[str, Any]:\n\n\"\"\"\n Applies illumination correction to the images in the OME-Zarr.\n\n Assumes that the illumination correction profiles were generated before\n separately and that the same background subtraction was used during\n calculation of the illumination correction (otherwise, it will not work\n well & the correction may only be partial).\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n illumination_profiles_folder: Path of folder of illumination profiles.\n illumination_profiles: Dictionary where keys match the `wavelength_id`\n attributes of existing channels (e.g. `A01_C01` ) and values are\n the filenames of the corresponding illumination profiles.\n background: Background value that is subtracted from the image before\n the illumination correction is applied. Set it to `0` if you don't\n want any background subtraction.\n input_ROI_table: Name of the ROI table that contains the information\n about the location of the individual field of views (FOVs) to\n which the illumination correction shall be applied. Defaults to\n \"FOV_ROI_table\", the default name Fractal converters give the ROI\n tables that list all FOVs separately. 
If you generated your\n OME-Zarr with a different converter and used Import OME-Zarr to\n generate the ROI tables, `image_ROI_table` is the right choice if\n you only have 1 FOV per Zarr image and `grid_ROI_table` if you\n have multiple FOVs per Zarr image and set the right grid options\n during import.\n overwrite_input: If `True`, the results of this task will overwrite\n the input image data. If false, a new image is generated and the\n illumination corrected data is saved there.\n suffix: What suffix to append to the illumination corrected images.\n Only relevant if `overwrite_input=False`.\n \"\"\"\n\n # Defione old/new zarrurls\n if overwrite_input:\n zarr_url_new = zarr_url.rstrip(\"/\")\n else:\n zarr_url_new = zarr_url.rstrip(\"/\") + suffix\n\n t_start = time.perf_counter()\n logger.info(\"Start illumination_correction\")\n logger.info(f\" {overwrite_input=}\")\n logger.info(f\" {zarr_url=}\")\n logger.info(f\" {zarr_url_new=}\")\n\n # Read attributes from NGFF metadata\n ngff_image_meta = load_NgffImageMeta(zarr_url)\n num_levels = ngff_image_meta.num_levels\n coarsening_xy = ngff_image_meta.coarsening_xy\n full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n logger.info(f\"NGFF image has {num_levels=}\")\n logger.info(f\"NGFF image has {coarsening_xy=}\")\n logger.info(\n f\"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}\"\n )\n\n # Read channels from .zattrs\n channels: list[OmeroChannel] = get_omero_channel_list(\n image_zarr_path=zarr_url\n )\n num_channels = len(channels)\n\n # Read FOV ROIs\n FOV_ROI_table = ad.read_zarr(f\"{zarr_url}/tables/{input_ROI_table}\")\n\n # Create list of indices for 3D FOVs spanning the entire Z direction\n list_indices = convert_ROI_table_to_indices(\n FOV_ROI_table,\n level=0,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices, input_ROI_table)\n\n # Extract image size from FOV-ROI indices. 
Note: this works at level=0,\n # where FOVs should all be of the exact same size (in pixels)\n ref_img_size = None\n for indices in list_indices:\n img_size = (indices[3] - indices[2], indices[5] - indices[4])\n if ref_img_size is None:\n ref_img_size = img_size\n else:\n if img_size != ref_img_size:\n raise ValueError(\n \"ERROR: inconsistent image sizes in list_indices\"\n )\n img_size_y, img_size_x = img_size[:]\n\n # Assemble dictionary of matrices and check their shapes\n corrections = {}\n for channel in channels:\n wavelength_id = channel.wavelength_id\n corrections[wavelength_id] = imread(\n (\n Path(illumination_profiles_folder)\n / illumination_profiles[wavelength_id]\n ).as_posix()\n )\n if corrections[wavelength_id].shape != (img_size_y, img_size_x):\n raise ValueError(\n \"Error in illumination_correction, \"\n \"correction matrix has wrong shape.\"\n )\n\n # Lazily load highest-res level from original zarr array\n data_czyx = da.from_zarr(f\"{zarr_url}/0\")\n\n # Create zarr for output\n if overwrite_input:\n new_zarr = zarr.open(f\"{zarr_url_new}/0\")\n else:\n new_zarr = zarr.create(\n shape=data_czyx.shape,\n chunks=data_czyx.chunksize,\n dtype=data_czyx.dtype,\n store=zarr.storage.FSStore(f\"{zarr_url_new}/0\"),\n overwrite=False,\n dimension_separator=\"/\",\n )\n _copy_hcs_ome_zarr_metadata(zarr_url, zarr_url_new)\n # Copy ROI tables from the old zarr_url to keep ROI tables and other\n # tables available in the new Zarr\n _copy_tables_from_zarr_url(zarr_url, zarr_url_new)\n\n # Iterate over FOV ROIs\n num_ROIs = len(list_indices)\n for i_c, channel in enumerate(channels):\n for i_ROI, indices in enumerate(list_indices):\n # Define region\n s_z, e_z, s_y, e_y, s_x, e_x = indices[:]\n region = (\n slice(i_c, i_c + 1),\n slice(s_z, e_z),\n slice(s_y, e_y),\n slice(s_x, e_x),\n )\n logger.info(\n f\"Now processing ROI {i_ROI+1}/{num_ROIs} \"\n f\"for channel {i_c+1}/{num_channels}\"\n )\n # Execute illumination correction\n corrected_fov = 
correct(\n data_czyx[region].compute(),\n corrections[channel.wavelength_id],\n background=background,\n )\n # Write to disk\n da.array(corrected_fov).to_zarr(\n url=new_zarr,\n region=region,\n compute=True,\n )\n\n # Starting from on-disk highest-resolution data, build and write to disk a\n # pyramid of coarser levels\n build_pyramid(\n zarrurl=zarr_url_new,\n overwrite=True,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=data_czyx.chunksize,\n )\n\n t_end = time.perf_counter()\n logger.info(f\"End illumination_correction, elapsed: {t_end-t_start}\")\n\n if overwrite_input:\n image_list_updates = dict(image_list_updates=[dict(zarr_url=zarr_url)])\n else:\n image_list_updates = dict(\n image_list_updates=[dict(zarr_url=zarr_url_new, origin=zarr_url)]\n )\n return image_list_updates\n
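The effect of `overwrite_input` on the task output can be summarized in a few lines. This sketch only reproduces the URL handling and the returned `image_list_updates` dictionary from the code above; the function name `build_image_list_updates` and the example URL are assumptions.

```python
def build_image_list_updates(
    zarr_url: str, overwrite_input: bool, suffix: str = "_illum_corr"
) -> dict:
    """Return the update dictionary the task reports to Fractal server."""
    if overwrite_input:
        # The corrected data replaces the input image in place
        return dict(image_list_updates=[dict(zarr_url=zarr_url)])
    # Otherwise a sibling image is created and linked to its origin
    zarr_url_new = zarr_url.rstrip("/") + suffix
    return dict(
        image_list_updates=[dict(zarr_url=zarr_url_new, origin=zarr_url)]
    )


in_place = build_image_list_updates("/data/plate.zarr/B/03/0", True)
as_copy = build_image_list_updates("/data/plate.zarr/B/03/0", False)
print(in_place)
print(as_copy)
```

With `overwrite_input=False`, the new image sits next to the original one in the same well, and the `origin` key records which image it was derived from.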
This task prepares a parallelization list of all zarr_urls that need to be used to calculate the registration between acquisitions (all zarr_urls except the reference acquisition vs. the reference acquisition). This task only works for HCS OME-Zarrs, for two reasons: only HCS OME-Zarrs currently have defined acquisition metadata to determine reference acquisitions, and the grouping of images by well has only been implemented for HCS OME-Zarrs (with the assumption that every well has just 1 image per acquisition).
PARAMETER DESCRIPTION zarr_urls
List of paths or urls to the individual OME-Zarr images to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: list[str]
zarr_dir
Path of the directory where the new OME-Zarrs will be created. Not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
reference_acquisition
Which acquisition to register against. Needs to match the acquisition metadata in the OME-Zarr image.
TYPE: int DEFAULT: 0
RETURNS DESCRIPTION task_output
Dictionary for Fractal server that contains a parallelization list.
TYPE: dict[str, list[dict[str, Any]]]
Source code in fractal_tasks_core/tasks/image_based_registration_hcs_init.py
@validate_arguments\ndef image_based_registration_hcs_init(\n *,\n # Fractal parameters\n zarr_urls: list[str],\n zarr_dir: str,\n # Core parameters\n reference_acquisition: int = 0,\n) -> dict[str, list[dict[str, Any]]]:\n\"\"\"\n Initialized calculate registration task\n\n This task prepares a parallelization list of all zarr_urls that need to be\n used to calculate the registration between acquisitions (all zarr_urls\n except the reference acquisition vs. the reference acquisition).\n This task only works for HCS OME-Zarrs for 2 reasons: Only HCS OME-Zarrs\n currently have defined acquisition metadata to determine reference\n acquisitions. And we have only implemented the grouping of images for\n HCS OME-Zarrs by well (with the assumption that every well just has 1\n image per acqusition).\n\n Args:\n zarr_urls: List of paths or urls to the individual OME-Zarr image to\n be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n zarr_dir: path of the directory where the new OME-Zarrs will be\n created. Not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n reference_acquisition: Which acquisition to register against. Needs to\n match the acquisition metadata in the OME-Zarr image.\n\n Returns:\n task_output: Dictionary for Fractal server that contains a\n parallelization list.\n \"\"\"\n logger.info(\n f\"Running `image_based_registration_hcs_init` for {zarr_urls=}\"\n )\n image_groups = create_well_acquisition_dict(zarr_urls)\n\n # Create the parallelization list\n parallelization_list = []\n for key, image_group in image_groups.items():\n # Assert that all image groups have the reference acquisition present\n if reference_acquisition not in image_group.keys():\n raise ValueError(\n f\"Registration with {reference_acquisition=} can only work if \"\n \"all wells have the reference acquisition present. 
It was not \"\n f\"found for well {key}.\"\n )\n # Add all zarr_urls except the reference acquisition to the\n # parallelization list\n for acquisition, zarr_url in image_group.items():\n if acquisition != reference_acquisition:\n reference_zarr_url = image_group[reference_acquisition]\n parallelization_list.append(\n dict(\n zarr_url=zarr_url,\n init_args=dict(reference_zarr_url=reference_zarr_url),\n )\n )\n\n return dict(parallelization_list=parallelization_list)\n
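The pairing logic of this init task can be tried out without any Zarr data. The sketch below is an illustration under stated assumptions: `build_registration_list` is a hypothetical name, and `image_groups` is assumed to be a plain dictionary of the form `{well: {acquisition: zarr_url}}`, which is how the listing above iterates over the result of `create_well_acquisition_dict`.

```python
def build_registration_list(image_groups, reference_acquisition=0):
    """Pair every non-reference image with the reference image of its well."""
    parallelization_list = []
    for well, image_group in image_groups.items():
        if reference_acquisition not in image_group:
            raise ValueError(
                f"Reference acquisition {reference_acquisition} "
                f"not found for well {well}."
            )
        reference_zarr_url = image_group[reference_acquisition]
        for acquisition, zarr_url in image_group.items():
            # The reference image is never registered against itself
            if acquisition != reference_acquisition:
                parallelization_list.append(
                    dict(
                        zarr_url=zarr_url,
                        init_args=dict(reference_zarr_url=reference_zarr_url),
                    )
                )
    return dict(parallelization_list=parallelization_list)


# Made-up paths: one well with three acquisitions
groups = {
    "plate.zarr/B/03": {
        0: "/data/plate.zarr/B/03/0",
        1: "/data/plate.zarr/B/03/1",
        2: "/data/plate.zarr/B/03/2",
    }
}
result = build_registration_list(groups, reference_acquisition=0)
```

With three acquisitions in a well, the list contains two entries, each pointing back to the reference image through `init_args`.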
The single OME-Zarr can be a full OME-Zarr HCS plate or an individual OME-Zarr image. The image needs to be in the zarr_dir as specified by the dataset. The current version of this task:
Creates the appropriate components-related metadata, needed for processing an existing OME-Zarr through Fractal.
Optionally adds new ROI tables to the existing OME-Zarr.
PARAMETER DESCRIPTION zarr_urls
List of paths or urls to the individual OME-Zarr images to be processed. Not used. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: list[str]
zarr_dir
Path of the directory where the new OME-Zarrs will be created. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
zarr_name
The OME-Zarr name, without its parent folder. The parent folder is provided by zarr_dir; e.g. zarr_name="array.zarr", if the OME-Zarr path is in /zarr_dir/array.zarr.
TYPE: str
add_image_ROI_table
Whether to add an image_ROI_table table to each image, with a single ROI covering the whole image.
TYPE: bool DEFAULT: True
add_grid_ROI_table
Whether to add a grid_ROI_table table to each image, with the image split into a rectangular grid of ROIs.
TYPE: bool DEFAULT: True
grid_y_shape
Y shape of the ROI grid in grid_ROI_table.
TYPE: int DEFAULT: 2
grid_x_shape
X shape of the ROI grid in grid_ROI_table.
TYPE: int DEFAULT: 2
update_omero_metadata
Whether to update Omero-channels metadata, to make them Fractal-compatible.
TYPE: bool DEFAULT: True
overwrite
Whether new ROI tables (added when add_image_ROI_table and/or add_grid_ROI_table are True) can overwrite existing ones.
TYPE: bool DEFAULT: False
Source code in fractal_tasks_core/tasks/import_ome_zarr.py
@validate_arguments\ndef import_ome_zarr(\n *,\n # Fractal parameters\n zarr_urls: list[str],\n zarr_dir: str,\n # Core parameters\n zarr_name: str,\n update_omero_metadata: bool = True,\n add_image_ROI_table: bool = True,\n add_grid_ROI_table: bool = True,\n # Advanced parameters\n grid_y_shape: int = 2,\n grid_x_shape: int = 2,\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Import a single OME-Zarr into Fractal.\n\n The single OME-Zarr can be a full OME-Zarr HCS plate or an individual\n OME-Zarr image. The image needs to be in the zarr_dir as specified by the\n dataset. The current version of this task:\n\n 1. Creates the appropriate components-related metadata, needed for\n processing an existing OME-Zarr through Fractal.\n 2. Optionally adds new ROI tables to the existing OME-Zarr.\n\n Args:\n zarr_urls: List of paths or urls to the individual OME-Zarr image to\n be processed. Not used.\n (standard argument for Fractal tasks, managed by Fractal server).\n zarr_dir: path of the directory where the new OME-Zarrs will be\n created.\n (standard argument for Fractal tasks, managed by Fractal server).\n zarr_name: The OME-Zarr name, without its parent folder. The parent\n folder is provided by zarr_dir; e.g. 
`zarr_name=\"array.zarr\"`,\n if the OME-Zarr path is in `/zarr_dir/array.zarr`.\n add_image_ROI_table: Whether to add a `image_ROI_table` table to each\n image, with a single ROI covering the whole image.\n add_grid_ROI_table: Whether to add a `grid_ROI_table` table to each\n image, with the image split into a rectangular grid of ROIs.\n grid_y_shape: Y shape of the ROI grid in `grid_ROI_table`.\n grid_x_shape: X shape of the ROI grid in `grid_ROI_table`.\n update_omero_metadata: Whether to update Omero-channels metadata, to\n make them Fractal-compatible.\n overwrite: Whether new ROI tables (added when `add_image_ROI_table`\n and/or `add_grid_ROI_table` are `True`) can overwite existing ones.\n \"\"\"\n\n # Is this based on the Zarr_dir or the zarr_urls?\n if len(zarr_urls) > 0:\n logger.warning(\n \"Running import while there are already items from the image list \"\n \"provided to the task. The following inputs were provided: \"\n f\"{zarr_urls=}\"\n \"This task will not process the existing images, but look for \"\n f\"zarr files named {zarr_name=} in the {zarr_dir=} instead.\"\n )\n\n zarr_path = f\"{zarr_dir.rstrip('/')}/{zarr_name}\"\n logger.info(f\"Zarr path: {zarr_path}\")\n\n root_group = zarr.open_group(zarr_path, mode=\"r\")\n ngff_type = detect_ome_ngff_type(root_group)\n grid_YX_shape = (grid_y_shape, grid_x_shape)\n\n image_list_updates = []\n if ngff_type == \"plate\":\n for well in root_group.attrs[\"plate\"][\"wells\"]:\n well_path = well[\"path\"]\n\n well_group = zarr.open_group(zarr_path, path=well_path, mode=\"r\")\n for image in well_group.attrs[\"well\"][\"images\"]:\n image_path = image[\"path\"]\n zarr_url = f\"{zarr_path}/{well_path}/{image_path}\"\n types = _process_single_image(\n zarr_url,\n add_image_ROI_table,\n add_grid_ROI_table,\n update_omero_metadata,\n grid_YX_shape=grid_YX_shape,\n overwrite=overwrite,\n )\n image_list_updates.append(\n dict(\n zarr_url=zarr_url,\n attributes=dict(\n plate=zarr_name,\n 
well=well_path.replace(\"/\", \"\"),\n ),\n types=types,\n )\n )\n elif ngff_type == \"well\":\n logger.warning(\n \"Only OME-Zarr for plates are fully supported in Fractal; \"\n f\"e.g. the current one ({ngff_type=}) cannot be \"\n \"processed via the `maximum_intensity_projection` task.\"\n )\n for image in root_group.attrs[\"well\"][\"images\"]:\n image_path = image[\"path\"]\n zarr_url = f\"{zarr_path}/{image_path}\"\n well_name = \"\".join(zarr_path.split(\"/\")[-2:])\n types = _process_single_image(\n zarr_url,\n add_image_ROI_table,\n add_grid_ROI_table,\n update_omero_metadata,\n grid_YX_shape=grid_YX_shape,\n overwrite=overwrite,\n )\n image_list_updates.append(\n dict(\n zarr_url=zarr_url,\n attributes=dict(\n well=well_name,\n ),\n types=types,\n )\n )\n elif ngff_type == \"image\":\n logger.warning(\n \"Only OME-Zarr for plates are fully supported in Fractal; \"\n f\"e.g. the current one ({ngff_type=}) cannot be \"\n \"processed via the `maximum_intensity_projection` task.\"\n )\n zarr_url = zarr_path\n types = _process_single_image(\n zarr_url,\n add_image_ROI_table,\n add_grid_ROI_table,\n update_omero_metadata,\n grid_YX_shape=grid_YX_shape,\n overwrite=overwrite,\n )\n image_list_updates.append(\n dict(\n zarr_url=zarr_url,\n types=types,\n )\n )\n\n image_list_changes = dict(image_list_updates=image_list_updates)\n return image_list_changes\n
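For the plate case, the way image URLs and attributes are derived can be sketched with plain strings. This is a simplified illustration, not the task itself: `plate_image_updates` is a hypothetical helper, the `wells` argument stands in for the plate and well metadata that the task reads from the Zarr groups, and the `types` entry (which the task obtains from `_process_single_image`) is left out.

```python
def plate_image_updates(zarr_dir, zarr_name, wells):
    """Build image-list updates for a plate from well and image paths.

    `wells` is an assumed stand-in for the plate metadata:
    {well_path: [image_path, ...]}.
    """
    # Same join as in the task: tolerate a trailing slash in zarr_dir
    zarr_path = f"{zarr_dir.rstrip('/')}/{zarr_name}"
    updates = []
    for well_path, image_paths in wells.items():
        for image_path in image_paths:
            updates.append(
                dict(
                    zarr_url=f"{zarr_path}/{well_path}/{image_path}",
                    attributes=dict(
                        plate=zarr_name,
                        # "B/03" becomes "B03"
                        well=well_path.replace("/", ""),
                    ),
                )
            )
    return dict(image_list_updates=updates)


updates = plate_image_updates("/data/", "plate.zarr", {"B/03": ["0", "1"]})
```

Each image of each well yields one entry, so a well with two images contributes two `zarr_url` values that share the same `plate` and `well` attributes.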
Returns the parallelization_list to run find_registration_consensus.
PARAMETER DESCRIPTION zarr_urls
List of paths or urls to the individual OME-Zarr images to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: list[str]
zarr_dir
Path of the directory where the new OME-Zarrs will be created. Not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
reference_acquisition
Which acquisition to register against. Uses the OME-NGFF HCS well metadata acquisition keys to find the reference acquisition.
TYPE: int DEFAULT: 0
Source code in fractal_tasks_core/tasks/init_group_by_well_for_multiplexing.py
@validate_arguments\ndef init_group_by_well_for_multiplexing(\n *,\n # Fractal parameters\n zarr_urls: list[str],\n zarr_dir: str,\n # Core parameters\n reference_acquisition: int = 0,\n) -> dict[str, list[str]]:\n\"\"\"\n Finds images for all acquisitions per well.\n\n Returns the parallelization_list to run `find_registration_consensus`.\n\n Args:\n zarr_urls: List of paths or urls to the individual OME-Zarr image to\n be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n zarr_dir: path of the directory where the new OME-Zarrs will be\n created. Not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n reference_acquisition: Which acquisition to register against. Uses the\n OME-NGFF HCS well metadata acquisition keys to find the reference\n acquisition.\n \"\"\"\n logger.info(\n f\"Running `init_group_by_well_for_multiplexing` for {zarr_urls=}\"\n )\n image_groups = create_well_acquisition_dict(zarr_urls)\n\n # Create the parallelization list\n parallelization_list = []\n for key, image_group in image_groups.items():\n # Assert that all image groups have the reference acquisition present\n if reference_acquisition not in image_group.keys():\n raise ValueError(\n f\"Registration with {reference_acquisition=} can only work if \"\n \"all wells have the reference acquisition present. It was not \"\n f\"found for well {key}.\"\n )\n\n # Create a parallelization list entry for each image group\n zarr_url_list = []\n for acquisition, zarr_url in image_group.items():\n if acquisition == reference_acquisition:\n reference_zarr_url = zarr_url\n\n zarr_url_list.append(zarr_url)\n\n parallelization_list.append(\n dict(\n zarr_url=reference_zarr_url,\n init_args=dict(zarr_url_list=zarr_url_list),\n )\n )\n\n return dict(parallelization_list=parallelization_list)\n
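Unlike the image-based registration init task, this task emits one entry per well rather than one per image pair. A minimal sketch follows, assuming that `image_groups` has the form `{well: {acquisition: zarr_url}}`; the function name `group_by_well` is hypothetical.

```python
def group_by_well(image_groups, reference_acquisition=0):
    """Emit one parallelization entry per well, keyed on its reference image."""
    parallelization_list = []
    for well, image_group in image_groups.items():
        if reference_acquisition not in image_group:
            raise ValueError(
                f"Reference acquisition {reference_acquisition} "
                f"not found for well {well}."
            )
        parallelization_list.append(
            dict(
                zarr_url=image_group[reference_acquisition],
                # All images of the well, reference included
                init_args=dict(zarr_url_list=list(image_group.values())),
            )
        )
    return dict(parallelization_list=parallelization_list)


# Made-up paths: two wells with two acquisitions each
groups = {
    "B03": {0: "/data/p.zarr/B/03/0", 1: "/data/p.zarr/B/03/1"},
    "B04": {0: "/data/p.zarr/B/04/0", 1: "/data/p.zarr/B/04/1"},
}
per_well = group_by_well(groups)
```

The downstream task then receives the reference image as `zarr_url` and the full list of images in the well through `init_args`.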
Arguments to be passed from cellvoyager converter init to compute
ATTRIBUTE DESCRIPTION image_dir
Directory where the raw images are found
TYPE: str
plate_prefix
part of the image filename needed for finding the right subset of image files
TYPE: str
well_ID
part of the image filename needed for finding the right subset of image files
TYPE: str
image_extension
part of the image filename needed for finding the right subset of image files
TYPE: str
image_glob_patterns
Additional glob patterns to filter the available images with
TYPE: Optional[list[str]]
acquisition
Acquisition metadata needed for multiplexing
TYPE: Optional[int]
Source code in fractal_tasks_core/tasks/io_models.py
class InitArgsCellVoyager(BaseModel):\n\"\"\"\n Arguments to be passed from cellvoyager converter init to compute\n\n Attributes:\n image_dir: Directory where the raw images are found\n plate_prefix: part of the image filename needed for finding the\n right subset of image files\n well_ID: part of the image filename needed for finding the\n right subset of image files\n image_extension: part of the image filename needed for finding the\n right subset of image files\n image_glob_patterns: Additional glob patterns to filter the available\n images with\n acquisition: Acquisition metadata needed for multiplexing\n \"\"\"\n\n image_dir: str\n plate_prefix: str\n well_ID: str\n image_extension: str\n image_glob_patterns: Optional[list[str]]\n acquisition: Optional[int]\n
Source code in fractal_tasks_core/tasks/io_models.py
class InitArgsMIP(BaseModel):\n\"\"\"\n Init Args for MIP task.\n\n Attributes:\n origin_url: Path to the zarr_url with the 3D data\n \"\"\"\n\n origin_url: str\n
Passed from image_based_registration_hcs_init to calculate_registration_image_based.
ATTRIBUTE DESCRIPTION reference_zarr_url
zarr_url for the reference image
TYPE: str
Source code in fractal_tasks_core/tasks/io_models.py
class InitArgsRegistration(BaseModel):\n\"\"\"\n Registration init args.\n\n Passed from `image_based_registration_hcs_init` to\n `calculate_registration_image_based`.\n\n Attributes:\n reference_zarr_url: zarr_url for the reference image\n \"\"\"\n\n reference_zarr_url: str\n
Provides the list of zarr_urls for all acquisitions for a given well
ATTRIBUTE DESCRIPTION zarr_url_list
List of zarr_urls for all the OME-Zarr images in the well.
TYPE: list[str]
Source code in fractal_tasks_core/tasks/io_models.py
class InitArgsRegistrationConsensus(BaseModel):\n\"\"\"\n Registration consensus init args.\n\n Provides the list of zarr_urls for all acquisitions for a given well\n\n Attributes:\n zarr_url_list: List of zarr_urls for all the OME-Zarr images in the\n well.\n \"\"\"\n\n zarr_url_list: list[str]\n
Input class for Multiplexing Cellvoyager converter
ATTRIBUTE DESCRIPTION image_dir
Path to the folder that contains the Cellvoyager image files for that acquisition and the MeasurementData & MeasurementDetail metadata files.
TYPE: str
allowed_channels
A list of OmeroChannel objects, where each channel must include the wavelength_id attribute and where the wavelength_id values must be unique across the list.
TYPE: list[OmeroChannel]
Source code in fractal_tasks_core/tasks/io_models.py
class MultiplexingAcquisition(BaseModel):\n\"\"\"\n Input class for Multiplexing Cellvoyager converter\n\n Attributes:\n image_dir: Path to the folder that contains the Cellvoyager image\n files for that acquisition and the MeasurementData &\n MeasurementDetail metadata files.\n allowed_channels: A list of `OmeroChannel` objects, where each channel\n must include the `wavelength_id` attribute and where the\n `wavelength_id` values must be unique across the list.\n \"\"\"\n\n image_dir: str\n allowed_channels: list[OmeroChannel]\n
A value of the input_specs argument in napari_workflows_wrapper.
ATTRIBUTE DESCRIPTION type
Input type (either image or label).
TYPE: Literal['image', 'label']
label_name
Label name (for label inputs only).
TYPE: Optional[str]
channel
ChannelInputModel object (for image inputs only).
TYPE: Optional[ChannelInputModel]
Source code in fractal_tasks_core/tasks/io_models.py
class NapariWorkflowsInput(BaseModel):\n\"\"\"\n A value of the `input_specs` argument in `napari_workflows_wrapper`.\n\n Attributes:\n type: Input type (either `image` or `label`).\n label_name: Label name (for label inputs only).\n channel: `ChannelInputModel` object (for image inputs only).\n \"\"\"\n\n type: Literal[\"image\", \"label\"]\n label_name: Optional[str]\n channel: Optional[ChannelInputModel]\n\n @validator(\"label_name\", always=True)\n def label_name_is_present(cls, v, values):\n\"\"\"\n Check that label inputs have `label_name` set.\n \"\"\"\n _type = values.get(\"type\")\n if _type == \"label\" and not v:\n raise ValueError(\n f\"Input item has type={_type} but label_name={v}.\"\n )\n return v\n\n @validator(\"channel\", always=True)\n def channel_is_present(cls, v, values):\n\"\"\"\n Check that image inputs have `channel` set.\n \"\"\"\n _type = values.get(\"type\")\n if _type == \"image\" and not v:\n raise ValueError(f\"Input item has type={_type} but channel={v}.\")\n return v\n
Source code in fractal_tasks_core/tasks/io_models.py
@validator(\"channel\", always=True)\ndef channel_is_present(cls, v, values):\n\"\"\"\n Check that image inputs have `channel` set.\n \"\"\"\n _type = values.get(\"type\")\n if _type == \"image\" and not v:\n raise ValueError(f\"Input item has type={_type} but channel={v}.\")\n return v\n
Source code in fractal_tasks_core/tasks/io_models.py
@validator(\"label_name\", always=True)\ndef label_name_is_present(cls, v, values):\n\"\"\"\n Check that label inputs have `label_name` set.\n \"\"\"\n _type = values.get(\"type\")\n if _type == \"label\" and not v:\n raise ValueError(\n f\"Input item has type={_type} but label_name={v}.\"\n )\n return v\n
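The two validators of `NapariWorkflowsInput` encode a simple rule: label inputs need `label_name`, image inputs need `channel`. The sketch below restates that rule in plain Python without Pydantic, so it can be read independently of the model; `check_input_spec` is a hypothetical function, not part of the package.

```python
def check_input_spec(spec_type, label_name=None, channel=None):
    """Apply the same presence rules as the model validators."""
    if spec_type not in ("image", "label"):
        raise ValueError(f"Unknown input type: {spec_type}")
    if spec_type == "label" and not label_name:
        raise ValueError(
            f"Input item has type={spec_type} but label_name={label_name}."
        )
    if spec_type == "image" and not channel:
        raise ValueError(
            f"Input item has type={spec_type} but channel={channel}."
        )
    return dict(type=spec_type, label_name=label_name, channel=channel)


# Valid: an image input that selects a channel by wavelength_id
ok_image = check_input_spec("image", channel={"wavelength_id": "A01_C02"})
# Valid: a label input that names the label
ok_label = check_input_spec("label", label_name="label_DAPI")
```

An image input without `channel`, or a label input without `label_name`, raises `ValueError` in this sketch.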
A value of the output_specs argument in napari_workflows_wrapper.
ATTRIBUTE DESCRIPTION type
Output type (either label or dataframe).
TYPE: Literal['label', 'dataframe']
label_name
Label name (for label outputs, it is used as the name of the label; for dataframe outputs, it is used to fill the region["path"] field).
TYPE: str
table_name
Table name (for dataframe outputs only).
TYPE: Optional[str]
Source code in fractal_tasks_core/tasks/io_models.py
class NapariWorkflowsOutput(BaseModel):\n\"\"\"\n A value of the `output_specs` argument in `napari_workflows_wrapper`.\n\n Attributes:\n type: Output type (either `label` or `dataframe`).\n label_name: Label name (for label outputs, it is used as the name of\n the label; for dataframe outputs, it is used to fill the\n `region[\"path\"]` field).\n table_name: Table name (for dataframe outputs only).\n \"\"\"\n\n type: Literal[\"label\", \"dataframe\"]\n label_name: str\n table_name: Optional[str] = None\n\n @validator(\"table_name\", always=True)\n def table_name_only_for_dataframe_type(cls, v, values):\n\"\"\"\n Check that table_name is set only for dataframe outputs.\n \"\"\"\n _type = values.get(\"type\")\n if (_type == \"dataframe\" and (not v)) or (_type != \"dataframe\" and v):\n raise ValueError(\n f\"Output item has type={_type} but table_name={v}.\"\n )\n return v\n
Check that table_name is set only for dataframe outputs.
Source code in fractal_tasks_core/tasks/io_models.py
@validator(\"table_name\", always=True)\ndef table_name_only_for_dataframe_type(cls, v, values):\n\"\"\"\n Check that table_name is set only for dataframe outputs.\n \"\"\"\n _type = values.get(\"type\")\n if (_type == \"dataframe\" and (not v)) or (_type != \"dataframe\" and v):\n raise ValueError(\n f\"Output item has type={_type} but table_name={v}.\"\n )\n return v\n
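The `table_name` rule is an exclusive condition: the name must be present for dataframe outputs and absent for everything else. A compact restatement in plain Python, where `check_output_spec` is a hypothetical helper rather than package code:

```python
def check_output_spec(spec_type, label_name, table_name=None):
    """Require table_name exactly when the output is a dataframe."""
    if spec_type not in ("label", "dataframe"):
        raise ValueError(f"Unknown output type: {spec_type}")
    is_dataframe = spec_type == "dataframe"
    # Fails when a dataframe lacks a table name, or a label carries one
    if is_dataframe != bool(table_name):
        raise ValueError(
            f"Output item has type={spec_type} but table_name={table_name}."
        )
    return dict(type=spec_type, label_name=label_name, table_name=table_name)


ok_label = check_output_spec("label", "label_DAPI_new")
ok_table = check_output_spec("dataframe", "label_DAPI", table_name="measurements")
```

Comparing the two booleans covers both failure cases of the original condition in a single test.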
Perform maximum-intensity projection along Z axis.
Note: this task stores the output in a new zarr file.
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
init_args
Initialization arguments provided by create_cellvoyager_ome_zarr_init.
TYPE: InitArgsMIP
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
Source code in fractal_tasks_core/tasks/maximum_intensity_projection.py
@validate_arguments\ndef maximum_intensity_projection(\n *,\n # Fractal parameters\n zarr_url: str,\n init_args: InitArgsMIP,\n # Advanced parameters\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Perform maximum-intensity projection along Z axis.\n\n Note: this task stores the output in a new zarr file.\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n init_args: Intialization arguments provided by\n `create_cellvoyager_ome_zarr_init`.\n overwrite: If `True`, overwrite the task output.\n \"\"\"\n logger.info(f\"{init_args.origin_url=}\")\n logger.info(f\"{zarr_url=}\")\n\n # Read image metadata\n ngff_image = load_NgffImageMeta(init_args.origin_url)\n # Currently not using the validation models due to wavelength_id issue\n # See #681 for discussion\n # new_attrs = ngff_image.dict(exclude_none=True)\n # Current way to get the necessary metadata for MIP\n group = zarr.open_group(init_args.origin_url, mode=\"r\")\n new_attrs = group.attrs.asdict()\n\n # Create the zarr image with correct\n new_image_group = zarr.group(zarr_url)\n new_image_group.attrs.put(new_attrs)\n\n # Load 0-th level\n data_czyx = da.from_zarr(init_args.origin_url + \"/0\")\n num_channels = data_czyx.shape[0]\n chunksize_y = data_czyx.chunksize[-2]\n chunksize_x = data_czyx.chunksize[-1]\n logger.info(f\"{num_channels=}\")\n logger.info(f\"{chunksize_y=}\")\n logger.info(f\"{chunksize_x=}\")\n\n # Loop over channels\n accumulate_chl = []\n for ind_ch in range(num_channels):\n # Perform MIP for each channel of level 0\n mip_yx = da.stack([da.max(data_czyx[ind_ch], axis=0)], axis=0)\n accumulate_chl.append(mip_yx)\n accumulated_array = da.stack(accumulate_chl, axis=0)\n\n # Write to disk (triggering execution)\n try:\n accumulated_array.to_zarr(\n f\"{zarr_url}/0\",\n overwrite=overwrite,\n dimension_separator=\"/\",\n write_empty_chunks=False,\n )\n except ContainsArrayError as e:\n 
error_msg = (\n f\"Cannot write array to zarr group at '{zarr_url}/0', \"\n f\"with {overwrite=} (original error: {str(e)}).\\n\"\n \"Hint: try setting overwrite=True.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n\n # Starting from on-disk highest-resolution data, build and write to disk a\n # pyramid of coarser levels\n build_pyramid(\n zarrurl=zarr_url,\n overwrite=overwrite,\n num_levels=ngff_image.num_levels,\n coarsening_xy=ngff_image.coarsening_xy,\n chunksize=(1, 1, chunksize_y, chunksize_x),\n )\n\n # Copy over any tables from the original zarr\n # Generate the list of tables:\n tables = get_tables_list_v1(init_args.origin_url)\n roi_tables = get_tables_list_v1(init_args.origin_url, table_type=\"ROIs\")\n non_roi_tables = [table for table in tables if table not in roi_tables]\n\n for table in roi_tables:\n logger.info(\n f\"Reading {table} from \"\n f\"{init_args.origin_url=}, convert it to 2D, and \"\n \"write it back to the new zarr file.\"\n )\n new_ROI_table = ad.read_zarr(f\"{init_args.origin_url}/tables/{table}\")\n old_ROI_table_attrs = zarr.open_group(\n f\"{init_args.origin_url}/tables/{table}\"\n ).attrs.asdict()\n\n # Convert 3D ROIs to 2D\n pxl_sizes_zyx = ngff_image.get_pixel_sizes_zyx(level=0)\n new_ROI_table = convert_ROIs_from_3D_to_2D(\n new_ROI_table, pixel_size_z=pxl_sizes_zyx[0]\n )\n # Write new table\n write_table(\n new_image_group,\n table,\n new_ROI_table,\n table_attrs=old_ROI_table_attrs,\n overwrite=overwrite,\n )\n\n for table in non_roi_tables:\n logger.info(\n f\"Reading {table} from \"\n f\"{init_args.origin_url=}, and \"\n \"write it back to the new zarr file.\"\n )\n new_non_ROI_table = ad.read_zarr(\n f\"{init_args.origin_url}/tables/{table}\"\n )\n old_non_ROI_table_attrs = zarr.open_group(\n f\"{init_args.origin_url}/tables/{table}\"\n ).attrs.asdict()\n\n # Write new table\n write_table(\n new_image_group,\n table,\n new_non_ROI_table,\n table_attrs=old_non_ROI_table_attrs,\n 
overwrite=overwrite,\n )\n\n # Generate image_list_updates\n image_list_update_dict = dict(\n image_list_updates=[\n dict(\n zarr_url=zarr_url,\n origin=init_args.origin_url,\n types=dict(is_3D=False),\n )\n ]\n )\n return image_list_update_dict\n
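The projection step itself is small: for every channel, the Z axis is collapsed by taking the per-pixel maximum, and a singleton Z axis is kept so that the output is still a CZYX array. The sketch below uses nested Python lists as a stand-in for the dask arrays used by the task; the function names are hypothetical.

```python
def max_projection_z(stack_zyx):
    """Collapse a Z-stack (list of 2D planes) into one plane of maxima."""
    if not stack_zyx:
        raise ValueError("empty Z-stack")
    return [
        # values holds the pixel at one (y, x) position across all Z planes
        [max(values) for values in zip(*rows)]
        # rows holds the row at one y position across all Z planes
        for rows in zip(*stack_zyx)
    ]


def mip_czyx(data_czyx):
    """Project each channel and keep a singleton Z axis in the result."""
    return [[max_projection_z(channel)] for channel in data_czyx]


# One channel, two Z planes of 2 x 2 pixels
data = [[[[1, 5], [3, 0]], [[4, 2], [1, 9]]]]
projected = mip_czyx(data)
```

The result has shape (1, 1, 2, 2): one channel, one Z plane, and the original YX extent.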
Encapsulates features that are out-of-scope for the current wrapper task.
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper.py
class OutOfTaskScopeError(NotImplementedError):\n\"\"\"\n Encapsulates features that are out-of-scope for the current wrapper task.\n \"\"\"\n\n pass\n
PARAMETER DESCRIPTION zarr_url
Path or url to the individual OME-Zarr image to be processed. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
workflow_file
Absolute path to the napari-workflows YAML file.
TYPE: str
input_specs
A dictionary of NapariWorkflowsInput values.
TYPE: dict[str, NapariWorkflowsInput]
output_specs
A dictionary of NapariWorkflowsOutput values.
TYPE: dict[str, NapariWorkflowsOutput]
input_ROI_table
Name of the ROI table over which the task loops to apply napari workflows. Examples: FOV_ROI_table => loop over the fields of view; organoid_ROI_table => loop over the organoid ROI table (generated by another task); well_ROI_table => process the whole well as one image.
TYPE: str DEFAULT: 'FOV_ROI_table'
level
Pyramid level of the image to be used as input for napari-workflows. Choose 0 to process at full resolution. Levels > 0 are currently only supported for workflows that have only intensity images as input and produce only label images as output.
TYPE: int DEFAULT: 0
relabeling
If True, apply relabeling so that label values are unique across all ROIs in the well.
TYPE: bool DEFAULT: True
expected_dimensions
Expected dimensions (either 2 or 3). Useful when loading 2D images that are stored in a 3D array with shape (1, size_x, size_y) [which is the default way Fractal stores 2D images], but you want to make sure the napari workflow gets a 2D array to process. Also useful to set to 2 when loading a 2D OME-Zarr that is saved as (size_x, size_y).
TYPE: int DEFAULT: 3
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: True
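The effect of `expected_dimensions` on an input array can be summarized by looking at shapes alone. The following sketch is an illustration with a hypothetical helper, `resolve_shape`; it takes the shape of a single-channel array (after channel selection) and returns the shape that would be handed to the workflow.

```python
def resolve_shape(shape, expected_dimensions=3):
    """Return the array shape after applying the expected_dimensions rule."""
    if expected_dimensions == 2:
        if len(shape) == 2:
            # Already 2D, nothing to do
            return tuple(shape)
        if shape[0] == 1:
            # Drop the singleton leading axis
            return tuple(shape[1:])
        raise ValueError(
            f"Input has shape {shape} but expected_dimensions=2"
        )
    # expected_dimensions == 3: the array is passed through unchanged
    # (the task only logs a warning when the leading axis has size 1)
    return tuple(shape)


flat = resolve_shape((1, 2160, 2560), expected_dimensions=2)
volume = resolve_shape((10, 2160, 2560), expected_dimensions=3)
```

A true 3D stack combined with `expected_dimensions=2` is rejected, since there is no unambiguous way to reduce it to a single plane.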
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper.py
@validate_arguments\ndef napari_workflows_wrapper(\n *,\n # Fractal parameters\n zarr_url: str,\n # Core parameters\n workflow_file: str,\n input_specs: dict[str, NapariWorkflowsInput],\n output_specs: dict[str, NapariWorkflowsOutput],\n input_ROI_table: str = \"FOV_ROI_table\",\n level: int = 0,\n # Advanced parameters\n relabeling: bool = True,\n expected_dimensions: int = 3,\n overwrite: bool = True,\n):\n\"\"\"\n Run a napari-workflow on the ROIs of a single OME-NGFF image.\n\n This task takes images and labels and runs a napari-workflow on them that\n can produce a label and tables as output.\n\n Examples of allowed entries for `input_specs` and `output_specs`:\n\n ```\n input_specs = {\n \"in_1\": {\"type\": \"image\", \"channel\": {\"wavelength_id\": \"A01_C02\"}},\n \"in_2\": {\"type\": \"image\", \"channel\": {\"label\": \"DAPI\"}},\n \"in_3\": {\"type\": \"label\", \"label_name\": \"label_DAPI\"},\n }\n\n output_specs = {\n \"out_1\": {\"type\": \"label\", \"label_name\": \"label_DAPI_new\"},\n \"out_2\": {\"type\": \"dataframe\", \"table_name\": \"measurements\"},\n }\n ```\n\n Args:\n zarr_url: Path or url to the individual OME-Zarr image to be processed.\n (standard argument for Fractal tasks, managed by Fractal server).\n workflow_file: Absolute path to napari-workflows YAML file\n input_specs: A dictionary of `NapariWorkflowsInput` values.\n output_specs: A dictionary of `NapariWorkflowsOutput` values.\n input_ROI_table: Name of the ROI table over which the task loops to\n apply napari workflows.\n Examples:\n `FOV_ROI_table`\n => loop over the field of views;\n `organoid_ROI_table`\n => loop over the organoid ROI table (generated by another task);\n `well_ROI_table`\n => process the whole well as one image.\n level: Pyramid level of the image to be used as input for\n napari-workflows. 
Choose `0` to process at full resolution.\n Levels > 0 are currently only supported for workflows that only\n have intensity images as input and only produce a label images as\n output.\n relabeling: If `True`, apply relabeling so that label values are\n unique across all ROIs in the well.\n expected_dimensions: Expected dimensions (either `2` or `3`). Useful\n when loading 2D images that are stored in a 3D array with shape\n `(1, size_x, size_y)` [which is the default way Fractal stores 2D\n images], but you want to make sure the napari workflow gets a 2D\n array to process. Also useful to set to `2` when loading a 2D\n OME-Zarr that is saved as `(size_x, size_y)`.\n overwrite: If `True`, overwrite the task output.\n \"\"\"\n wf: napari_workflows.Worfklow = load_workflow(workflow_file)\n logger.info(f\"Loaded workflow from {workflow_file}\")\n\n # Validation of input/output specs\n if not (set(wf.leafs()) <= set(output_specs.keys())):\n msg = f\"Some item of {wf.leafs()=} is not part of {output_specs=}.\"\n logger.warning(msg)\n if not (set(wf.roots()) <= set(input_specs.keys())):\n msg = f\"Some item of {wf.roots()=} is not part of {input_specs=}.\"\n logger.error(msg)\n raise ValueError(msg)\n list_outputs = sorted(output_specs.keys())\n\n # Characterization of workflow and scope restriction\n input_types = [in_params.type for (name, in_params) in input_specs.items()]\n output_types = [\n out_params.type for (name, out_params) in output_specs.items()\n ]\n are_inputs_all_images = set(input_types) == {\"image\"}\n are_outputs_all_labels = set(output_types) == {\"label\"}\n are_outputs_all_dataframes = set(output_types) == {\"dataframe\"}\n is_labeling_workflow = are_inputs_all_images and are_outputs_all_labels\n is_measurement_only_workflow = are_outputs_all_dataframes\n # Level-related constraint\n logger.info(f\"This workflow acts at {level=}\")\n logger.info(\n f\"Is the current workflow a labeling one? 
{is_labeling_workflow}\"\n )\n if level > 0 and not is_labeling_workflow:\n msg = (\n f\"{level=}>0 is currently only accepted for labeling workflows, \"\n \"i.e. those going from image(s) to label(s)\"\n )\n logger.error(msg)\n raise OutOfTaskScopeError(msg)\n # Relabeling-related (soft) constraint\n if is_measurement_only_workflow and relabeling:\n logger.warning(\n \"This is a measurement-output-only workflow, setting \"\n \"relabeling=False.\"\n )\n relabeling = False\n if relabeling:\n max_label_for_relabeling = 0\n\n label_dtype = np.uint32\n\n # Read ROI table\n ROI_table = ad.read_zarr(f\"{zarr_url}/tables/{input_ROI_table}\")\n\n # Load image metadata\n ngff_image_meta = load_NgffImageMeta(zarr_url)\n num_levels = ngff_image_meta.num_levels\n coarsening_xy = ngff_image_meta.coarsening_xy\n\n # Read pixel sizes from zattrs file\n full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n\n # Create list of indices for 3D FOVs spanning the entire Z direction\n list_indices = convert_ROI_table_to_indices(\n ROI_table,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices, input_ROI_table)\n num_ROIs = len(list_indices)\n logger.info(\n f\"Completed reading ROI table {input_ROI_table},\"\n f\" found {num_ROIs} ROIs.\"\n )\n\n # Input preparation: \"image\" type\n image_inputs = [\n (name, in_params)\n for (name, in_params) in input_specs.items()\n if in_params.type == \"image\"\n ]\n input_image_arrays = {}\n if image_inputs:\n img_array = da.from_zarr(f\"{zarr_url}/{level}\")\n # Loop over image inputs and assign corresponding channel of the image\n for (name, params) in image_inputs:\n channel = get_channel_from_image_zarr(\n image_zarr_path=zarr_url,\n wavelength_id=params.channel.wavelength_id,\n label=params.channel.label,\n )\n channel_index = channel.index\n input_image_arrays[name] = img_array[channel_index]\n\n # Handle dimensions\n shape = 
input_image_arrays[name].shape\n if expected_dimensions == 3 and shape[0] == 1:\n logger.warning(\n f\"Input {name} has shape {shape} \"\n f\"but {expected_dimensions=}\"\n )\n if expected_dimensions == 2:\n if len(shape) == 2:\n # We already load the data as a 2D array\n pass\n elif shape[0] == 1:\n input_image_arrays[name] = input_image_arrays[name][\n 0, :, :\n ]\n else:\n msg = (\n f\"Input {name} has shape {shape} \"\n f\"but {expected_dimensions=}\"\n )\n logger.error(msg)\n raise ValueError(msg)\n logger.info(f\"Prepared input with {name=} and {params=}\")\n logger.info(f\"{input_image_arrays=}\")\n\n # Input preparation: \"label\" type\n label_inputs = [\n (name, in_params)\n for (name, in_params) in input_specs.items()\n if in_params.type == \"label\"\n ]\n if label_inputs:\n # Set target_shape for upscaling labels\n if not image_inputs:\n logger.warning(\n f\"{len(label_inputs)=} but num_image_inputs=0. \"\n \"Label array(s) will not be upscaled.\"\n )\n upscale_labels = False\n else:\n target_shape = list(input_image_arrays.values())[0].shape\n upscale_labels = True\n # Loop over label inputs and load corresponding (upscaled) image\n input_label_arrays = {}\n for (name, params) in label_inputs:\n label_name = params.label_name\n label_array_raw = da.from_zarr(\n f\"{zarr_url}/labels/{label_name}/{level}\"\n )\n input_label_arrays[name] = label_array_raw\n\n # Handle dimensions\n shape = input_label_arrays[name].shape\n if expected_dimensions == 3 and shape[0] == 1:\n logger.warning(\n f\"Input {name} has shape {shape} \"\n f\"but {expected_dimensions=}\"\n )\n if expected_dimensions == 2:\n if len(shape) == 2:\n # We already load the data as a 2D array\n pass\n elif shape[0] == 1:\n input_label_arrays[name] = input_label_arrays[name][\n 0, :, :\n ]\n else:\n msg = (\n f\"Input {name} has shape {shape} \"\n f\"but {expected_dimensions=}\"\n )\n logger.error(msg)\n raise ValueError(msg)\n\n if upscale_labels:\n # Check that dimensionality matches the 
image\n if len(input_label_arrays[name].shape) != len(target_shape):\n raise ValueError(\n f\"Label {name} has shape \"\n f\"{input_label_arrays[name].shape}. \"\n \"But the corresponding image has shape \"\n f\"{target_shape}. Those dimensionalities do not \"\n f\"match. Is {expected_dimensions=} the correct \"\n \"setting?\"\n )\n if expected_dimensions == 3:\n upscaling_axes = [1, 2]\n else:\n upscaling_axes = [0, 1]\n input_label_arrays[name] = upscale_array(\n array=input_label_arrays[name],\n target_shape=target_shape,\n axis=upscaling_axes,\n pad_with_zeros=True,\n )\n\n logger.info(f\"Prepared input with {name=} and {params=}\")\n logger.info(f\"{input_label_arrays=}\")\n\n # Output preparation: \"label\" type\n label_outputs = [\n (name, out_params)\n for (name, out_params) in output_specs.items()\n if out_params.type == \"label\"\n ]\n if label_outputs:\n # Preliminary scope checks\n if len(label_outputs) > 1:\n raise OutOfTaskScopeError(\n \"Multiple label outputs would break label-inputs-only \"\n f\"workflows (found {len(label_outputs)=}).\"\n )\n if len(label_outputs) > 1 and relabeling:\n raise OutOfTaskScopeError(\n \"Multiple label outputs would break relabeling in labeling+\"\n f\"measurement workflows (found {len(label_outputs)=}).\"\n )\n\n # We only support two cases:\n # 1. If there exist some input images, then use the first one to\n # determine output-label array properties\n # 2. 
If there are no input images, but there are input labels, then (A)\n # re-load the pixel sizes and re-build ROI indices, and (B) use the\n # first input label to determine output-label array properties\n if image_inputs:\n reference_array = list(input_image_arrays.values())[0]\n elif label_inputs:\n reference_array = list(input_label_arrays.values())[0]\n # Re-load pixel size, matching to the correct level\n input_label_name = label_inputs[0][1].label_name\n ngff_label_image_meta = load_NgffImageMeta(\n f\"{zarr_url}/labels/{input_label_name}\"\n )\n full_res_pxl_sizes_zyx = ngff_label_image_meta.get_pixel_sizes_zyx(\n level=0\n )\n # Create list of indices for 3D FOVs spanning the whole Z direction\n list_indices = convert_ROI_table_to_indices(\n ROI_table,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices, input_ROI_table)\n num_ROIs = len(list_indices)\n logger.info(\n f\"Re-create ROI indices from ROI table {input_ROI_table}, \"\n f\"using {full_res_pxl_sizes_zyx=}. 
\"\n \"This is necessary because label-input-only workflows may \"\n \"have label inputs that are at a different resolution and \"\n \"are not upscaled.\"\n )\n else:\n msg = (\n \"Missing image_inputs and label_inputs, we cannot assign\"\n \" label output properties\"\n )\n raise OutOfTaskScopeError(msg)\n\n # Extract label properties from reference_array, and make sure they are\n # for three dimensions\n label_shape = reference_array.shape\n label_chunksize = reference_array.chunksize\n if len(label_shape) == 2 and len(label_chunksize) == 2:\n if expected_dimensions == 3:\n raise ValueError(\n f\"Something wrong: {label_shape=} but \"\n f\"{expected_dimensions=}\"\n )\n label_shape = (1, label_shape[0], label_shape[1])\n label_chunksize = (1, label_chunksize[0], label_chunksize[1])\n logger.info(f\"{label_shape=}\")\n logger.info(f\"{label_chunksize=}\")\n\n # Loop over label outputs and (1) set zattrs, (2) create zarr group\n output_label_zarr_groups: dict[str, Any] = {}\n for (name, out_params) in label_outputs:\n\n # (1a) Rescale OME-NGFF datasets (relevant for level>0)\n if not ngff_image_meta.multiscale.axes[0].name == \"c\":\n raise ValueError(\n \"Cannot set `remove_channel_axis=True` for multiscale \"\n f\"metadata with axes={ngff_image_meta.multiscale.axes}. 
\"\n 'First axis should have name \"c\".'\n )\n new_datasets = rescale_datasets(\n datasets=[\n ds.dict() for ds in ngff_image_meta.multiscale.datasets\n ],\n coarsening_xy=coarsening_xy,\n reference_level=level,\n remove_channel_axis=True,\n )\n\n # (1b) Prepare attrs for label group\n label_name = out_params.label_name\n label_attrs = {\n \"image-label\": {\n \"version\": __OME_NGFF_VERSION__,\n \"source\": {\"image\": \"../../\"},\n },\n \"multiscales\": [\n {\n \"name\": label_name,\n \"version\": __OME_NGFF_VERSION__,\n \"axes\": [\n ax.dict()\n for ax in ngff_image_meta.multiscale.axes\n if ax.type != \"channel\"\n ],\n \"datasets\": new_datasets,\n }\n ],\n }\n\n # (2) Prepare label group\n image_group = zarr.group(zarr_url)\n label_group = prepare_label_group(\n image_group,\n label_name,\n overwrite=overwrite,\n label_attrs=label_attrs,\n logger=logger,\n )\n logger.info(\n \"Helper function `prepare_label_group` returned \"\n f\"{label_group=}\"\n )\n\n # (3) Create zarr group at level=0\n store = zarr.storage.FSStore(f\"{zarr_url}/labels/{label_name}/0\")\n mask_zarr = zarr.create(\n shape=label_shape,\n chunks=label_chunksize,\n dtype=label_dtype,\n store=store,\n overwrite=overwrite,\n dimension_separator=\"/\",\n )\n output_label_zarr_groups[name] = mask_zarr\n logger.info(f\"Prepared output with {name=} and {out_params=}\")\n logger.info(f\"{output_label_zarr_groups=}\")\n\n # Output preparation: \"dataframe\" type\n dataframe_outputs = [\n (name, out_params)\n for (name, out_params) in output_specs.items()\n if out_params.type == \"dataframe\"\n ]\n output_dataframe_lists: dict[str, list] = {}\n for (name, out_params) in dataframe_outputs:\n output_dataframe_lists[name] = []\n logger.info(f\"Prepared output with {name=} and {out_params=}\")\n logger.info(f\"{output_dataframe_lists=}\")\n\n #####\n\n for i_ROI, indices in enumerate(list_indices):\n s_z, e_z, s_y, e_y, s_x, e_x = indices[:]\n region = (slice(s_z, e_z), slice(s_y, e_y), slice(s_x, 
e_x))\n\n logger.info(f\"ROI {i_ROI+1}/{num_ROIs}: {region=}\")\n\n # Always re-load napari worfklow\n wf = load_workflow(workflow_file)\n\n # Set inputs\n for input_name in input_specs.keys():\n input_type = input_specs[input_name].type\n\n if input_type == \"image\":\n wf.set(\n input_name,\n load_region(\n input_image_arrays[input_name],\n region,\n compute=True,\n return_as_3D=False,\n ),\n )\n elif input_type == \"label\":\n wf.set(\n input_name,\n load_region(\n input_label_arrays[input_name],\n region,\n compute=True,\n return_as_3D=False,\n ),\n )\n\n # Get outputs\n outputs = wf.get(list_outputs)\n\n # Iterate first over dataframe outputs (to use the correct\n # max_label_for_relabeling, if needed)\n for ind_output, output_name in enumerate(list_outputs):\n if output_specs[output_name].type != \"dataframe\":\n continue\n df = outputs[ind_output]\n if relabeling:\n df[\"label\"] += max_label_for_relabeling\n logger.info(\n f'ROI {i_ROI+1}/{num_ROIs}: Relabeling \"{name}\" dataframe'\n \"output, with {max_label_for_relabeling=}\"\n )\n\n # Append the new-ROI dataframe to the all-ROIs list\n output_dataframe_lists[output_name].append(df)\n\n # After all dataframe outputs, iterate over label outputs (which\n # actually can be only 0 or 1)\n for ind_output, output_name in enumerate(list_outputs):\n if output_specs[output_name].type != \"label\":\n continue\n mask = outputs[ind_output]\n\n # Check dimensions\n if len(mask.shape) != expected_dimensions:\n msg = (\n f\"Output {output_name} has shape {mask.shape} \"\n f\"but {expected_dimensions=}\"\n )\n logger.error(msg)\n raise ValueError(msg)\n elif expected_dimensions == 2:\n mask = np.expand_dims(mask, axis=0)\n\n # Sanity check: issue warning for non-consecutive labels\n unique_labels = np.unique(mask)\n num_unique_labels_in_this_ROI = len(unique_labels)\n if np.min(unique_labels) == 0:\n num_unique_labels_in_this_ROI -= 1\n num_labels_in_this_ROI = int(np.max(mask))\n if num_labels_in_this_ROI != 
num_unique_labels_in_this_ROI:\n logger.warning(\n f'ROI {i_ROI+1}/{num_ROIs}: \"{name}\" label output has'\n f\"non-consecutive labels: {num_labels_in_this_ROI=} but\"\n f\"{num_unique_labels_in_this_ROI=}\"\n )\n\n if relabeling:\n mask[mask > 0] += max_label_for_relabeling\n logger.info(\n f'ROI {i_ROI+1}/{num_ROIs}: Relabeling \"{name}\" label '\n f\"output, with {max_label_for_relabeling=}\"\n )\n max_label_for_relabeling += num_labels_in_this_ROI\n logger.info(\n f\"ROI {i_ROI+1}/{num_ROIs}: label-number update with \"\n f\"{num_labels_in_this_ROI=}; \"\n f\"new {max_label_for_relabeling=}\"\n )\n\n da.array(mask).to_zarr(\n url=output_label_zarr_groups[output_name],\n region=region,\n compute=True,\n overwrite=overwrite,\n )\n logger.info(f\"ROI {i_ROI+1}/{num_ROIs}: output handling complete\")\n\n # Output handling: \"dataframe\" type (for each output, concatenate ROI\n # dataframes, clean up, and store in a AnnData table on-disk)\n for (name, out_params) in dataframe_outputs:\n table_name = out_params.table_name\n # Concatenate all FOV dataframes\n list_dfs = output_dataframe_lists[name]\n if len(list_dfs) == 0:\n measurement_table = ad.AnnData()\n else:\n df_well = pd.concat(list_dfs, axis=0, ignore_index=True)\n # Extract labels and drop them from df_well\n labels = pd.DataFrame(df_well[\"label\"].astype(str))\n df_well.drop(labels=[\"label\"], axis=1, inplace=True)\n # Convert all to float (warning: some would be int, in principle)\n measurement_dtype = np.float32\n df_well = df_well.astype(measurement_dtype)\n df_well.index = df_well.index.map(str)\n # Convert to anndata\n measurement_table = ad.AnnData(df_well, dtype=measurement_dtype)\n measurement_table.obs = labels\n\n # Write to zarr group\n image_group = zarr.group(zarr_url)\n table_attrs = dict(\n type=\"feature_table\",\n region=dict(path=f\"../labels/{out_params.label_name}\"),\n instance_key=\"label\",\n )\n write_table(\n image_group,\n table_name,\n measurement_table,\n 
overwrite=overwrite,\n table_attrs=table_attrs,\n )\n\n # Output handling: \"label\" type (for each output, build and write to disk\n # pyramid of coarser levels)\n for (name, out_params) in label_outputs:\n label_name = out_params.label_name\n build_pyramid(\n zarrurl=f\"{zarr_url}/labels/{label_name}\",\n overwrite=overwrite,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=label_chunksize,\n aggregation_function=np.max,\n )\n
Thanks to the package manifest and to their structure, the tasks in fractal_tasks_core.tasks can be run within the Fractal platform, which consists of a backend server that can be accessed through either of the two available clients (a command-line client and a web client).
The fractal-demos repository lists a set of relevant examples, including:
How to set up a fractal-server instance;
How to set up a fractal-client command-line client;
How to use the command-line client to submit a series of typical workflows (based on fractal-tasks-core tasks) to Fractal; see folders from 01 to 10 in the examples folder.
The fractal-tasks-core GitHub repository includes an examples folder, listing a few examples of how to run fractal-tasks-core tasks from a standard Python script (instead of using the Fractal platform).
What follows is the content of examples/README.md:
This folder is not always kept up-to-date. If you encounter any unexpected problem, please open a new issue on the fractal-tasks-core GitHub repository.
Examples from 01 to 09 are currently aligned with fractal-tasks-core 0.10.0.
"},{"location":"version_updates/v0_14_0/","title":"From version 0.13.1 to 0.14.0","text":""},{"location":"version_updates/v0_14_0/#package-structure","title":"Package structure","text":"
Version 0.14.0 includes a large refactor of the fractal_tasks_core package, leading to this new structure:
Within fractal-tasks-core, we make use of tables which are AnnData objects
+stored within OME-Zarr image groups. This page describes the different kinds of
+tables we use, and it includes:
Note: The specifications below are largely inspired by a proposed update
+to OME-NGFF specs. This update is currently
+on hold, and fractal-tasks-core will evolve once an official NGFF
+table specification is adopted - see also the Outlook section.
In this section we describe version 1 (V1) of the Fractal table specifications;
+for the moment, only V1 exists.
+Note that V1 specifications are only implemented as of version 0.14.0 of
+fractal-tasks-core.
The core-table specification consists of the definition of the required Zarr
+structure and attributes, and of the AnnData table format.
+
AnnData table format
+
We store tabular data into Zarr groups as AnnData ("Annotated Data") objects;
+the anndata Python library provides the
+definition of this format and the relevant tools. Quoting from the anndata
+documentation:
+
+
AnnData is specifically designed for matrix-like data. By this we mean that
+we have \(n\) observations, each of which can be represented as \(d\)-dimensional
+vectors, where each dimension corresponds to a variable or feature. Both the
+rows and columns of this \(n \times d\) matrix are special in the sense that
+they are indexed.
Note that AnnData tables are easily transformed from/into pandas.DataFrame
+objects - see e.g. the AnnData.to_df
+method.
+
Zarr structure and attributes
+
The structure of Zarr groups is based on the image specification in NGFF
+0.4, with an
+additional tables group and the corresponding subgroups (similar to
+labels):
+
image.zarr # Zarr group for a NGFF image
+|
+├── 0 # Zarr array for multiscale level 0
+├── ...
+├── N # Zarr array for multiscale level N
+|
+├── labels # Zarr subgroup with a list of labels associated to this image
+| ├── label_A # Zarr subgroup for a given label
+| ├── label_B # Zarr subgroup for a given label
+| └── ...
+|
+└── tables # Zarr subgroup with a list of tables associated to this image
+ ├── table_1 # Zarr subgroup for a given table
+ ├── table_2 # Zarr subgroup for a given table
+ └── ...
+
+
The Zarr attributes of the tables group must include the key tables,
+pointing to the list of all tables (this simplifies discovery of tables
+associated to the current NGFF image), as in
+
image.zarr/tables/.zattrs
{
+"tables":["table_1","table_2"]
+}
+
+
The Zarr attributes of each specific-table group must include the version of
+the table specification (currently version 1), through the
+fractal_table_version attribute. Also note that the anndata function to
+write an AnnData object into a Zarr group automatically sets additional
+attributes. Here is an example of the resulting Zarr attributes:
+
image.zarr/tables/table_1/.zattrs
{
+"fractal_table_version": "1",
+"encoding-type": "anndata",  // Automatically added by anndata 0.11
+"encoding-version": "0.1.0",  // Automatically added by anndata 0.11
+}
+
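The two attribute requirements above (the tables key on the tables group, and fractal_table_version on each table group) can be checked with a few lines of standard-library Python. What follows is only a sketch: it assumes a Zarr v2 directory store, where attributes live in plain .zattrs JSON files, and the helper names list_tables and check_core_table_attrs are hypothetical (they are not part of fractal-tasks-core).

```python
import json
import tempfile
from pathlib import Path


def list_tables(image_zarr: Path) -> list[str]:
    """Return the table names listed under the `tables` key of tables/.zattrs."""
    attrs = json.loads((image_zarr / "tables" / ".zattrs").read_text())
    return attrs["tables"]


def check_core_table_attrs(attrs: dict) -> None:
    """Raise ValueError unless the attributes declare table version 1."""
    version = attrs.get("fractal_table_version")
    if version != "1":
        raise ValueError(f"Unsupported fractal_table_version: {version!r}")


# Build a throw-away directory layout that mimics the structure shown above
# (assumption: Zarr v2 directory store, attributes stored in .zattrs files).
with tempfile.TemporaryDirectory() as tmp:
    image_zarr = Path(tmp) / "image.zarr"
    table_attrs = {
        "fractal_table_version": "1",
        "encoding-type": "anndata",
        "encoding-version": "0.1.0",
    }
    for name in ("table_1", "table_2"):
        table_dir = image_zarr / "tables" / name
        table_dir.mkdir(parents=True)
        (table_dir / ".zattrs").write_text(json.dumps(table_attrs))
    (image_zarr / "tables" / ".zattrs").write_text(
        json.dumps({"tables": ["table_1", "table_2"]})
    )

    # Discover the tables, then validate the attributes of each one
    found_tables = list_tables(image_zarr)
    for name in found_tables:
        attrs = json.loads(
            (image_zarr / "tables" / name / ".zattrs").read_text()
        )
        check_core_table_attrs(attrs)

print(found_tables)
```

Note that the JSON written here has no comments or trailing commas, unlike the annotated example above, since standard JSON parsers reject both.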
In fractal-tasks-core, a ROI table defines regions of space which are
+three-dimensional (see also the Outlook section about
+dimensionality flexibility) and box-shaped.
+Typical use cases are described here.
+
Zarr attributes
+
A ROI table must satisfy the core-table specification. In addition, the
+table-group Zarr attributes must include the
+type attribute with value roi_table.
+
The var attribute of a given AnnData object indexes the columns of the table. A
+fractal-tasks-core ROI table must include the following six columns:
+
+
x_micrometer, y_micrometer, z_micrometer:
+ the lower bounds of the XYZ intervals defining the ROI, in micrometers;
+
len_x_micrometer, len_y_micrometer, len_z_micrometer:
+ the XYZ edge lengths, in micrometers.
+
+
+
Notes:
+
+
The axes origin for the ROI positions (e.g. for x_micrometer)
+ corresponds to the top-left corner of the image (for the YX axes) and to
+ the lowest Z plane.
+
ROIs are defined in physical coordinates, and they do not store
+ information on the number or size of pixels.
+
+
+
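Since ROIs only carry physical coordinates, array indices have to be derived by combining a ROI with the pixel sizes of the image. The sketch below illustrates the idea at full resolution (level 0); it is a simplified stand-in, not the convert_ROI_table_to_indices function of fractal-tasks-core, and the function name roi_to_indices and all numerical values are hypothetical.

```python
def roi_to_indices(
    roi: dict[str, float],
    pixel_sizes_zyx: tuple[float, float, float],
) -> tuple[int, ...]:
    """
    Convert one ROI (in micrometers) to (s_z, e_z, s_y, e_y, s_x, e_x).

    Start/end positions are divided by the pixel size along each axis and
    rounded to the nearest integer index.
    """
    indices = []
    for axis, pixel_size in zip("zyx", pixel_sizes_zyx):
        start = roi[f"{axis}_micrometer"]
        length = roi[f"len_{axis}_micrometer"]
        indices.append(round(start / pixel_size))
        indices.append(round((start + length) / pixel_size))
    return tuple(indices)


# Hypothetical ROI with the six required columns, and hypothetical pixel sizes
roi = {
    "x_micrometer": 416.0,
    "y_micrometer": 0.0,
    "z_micrometer": 0.0,
    "len_x_micrometer": 416.0,
    "len_y_micrometer": 351.0,
    "len_z_micrometer": 5.0,
}
pixel_sizes_zyx = (1.0, 0.1625, 0.1625)

indices = roi_to_indices(roi, pixel_sizes_zyx)
s_z, e_z, s_y, e_y, s_x, e_x = indices
region = (slice(s_z, e_z), slice(s_y, e_y), slice(s_x, e_x))
print(indices)
```

The resulting region tuple can be used to slice a ZYX array. The same ROI maps to different indices on an image with different pixel sizes, which is the practical consequence of storing ROIs in physical units.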
ROI tables may also include other columns, beyond the required ones. Here are
+the ones that are typically used in fractal-tasks-core (see also the Use
+cases section):
+
+
x_micrometer_original and y_micrometer_original, which are a copy of
+ x_micrometer and y_micrometer taken before applying some transformation;
+
translation_x, translation_y and translation_z, which are used during
+ registration of multiplexing acquisitions;
Masking ROI tables are a specific instance of the basic ROI tables described
+above, where each ROI must also be associated to a specific label of a label
+image.
+
Motivation
+
The motivation for this association is based on the following use case:
+
+
By performing segmentation of a NGFF image, we identify N objects and we
+ store them as a label image (where the value at each pixel corresponds to the
+ label index);
+
We also compute the three-dimensional bounding box of each segmented object,
+ and store these bounding boxes into a masking ROI table;
+
For each one of these ROIs, we also include information that links it to both
+ the label image and a specific label index;
+
During further processing we can load/modify specific sub-regions of the ROI,
+ based on information contained in the label image. These operations
+ are masked, as they only act on the array elements that match a certain
+ condition on the label value.
+
+
Zarr attributes
+
For this kind of tables, fractal-tasks-core closely follows the proposed
+NGFF update mentioned above. The
+requirements on the Zarr attributes of a given table are:
+
+
Attributes must contain a type key, with value masking_roi_table (see the footnote on table types).
+
Attributes must contain a region key; the corresponding value must be an
+ object with a path key and a string value (i.e. the path to the data the
+ table is annotating).
+
Attributes must include a key instance_key, which is the key in obs that
+ denotes which instance in region the row corresponds to.
On top of the required ROI-table columns, the masking-ROI-table AnnData object
+must have an attribute obs with a key matching the instance_key zarr
+attribute. For instance if instance_key="label" then table.obs["label"]
+must exist, with its items matching the labels in the image in
+"../labels/label_DAPI".
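The requirements on masking ROI tables can be summarized as a small validation routine. This is only a sketch using plain Python containers: the function names are hypothetical, and obs is modeled as a dictionary of columns rather than as the obs attribute of an actual AnnData object.

```python
def check_masking_roi_table_attrs(attrs: dict) -> None:
    """Check the three Zarr-attribute requirements of a masking ROI table."""
    if attrs.get("type") != "masking_roi_table":
        raise ValueError(f"Unexpected type: {attrs.get('type')!r}")
    region = attrs.get("region")
    if not isinstance(region, dict) or not isinstance(region.get("path"), str):
        raise ValueError("`region` must be an object with a string `path`")
    if not isinstance(attrs.get("instance_key"), str):
        raise ValueError("`instance_key` must be a string")


def check_instance_column(
    obs: dict[str, list], instance_key: str, image_labels: set[int]
) -> None:
    """Check that obs[instance_key] exists and only refers to known labels."""
    if instance_key not in obs:
        raise ValueError(f"obs has no column named {instance_key!r}")
    unknown = {int(value) for value in obs[instance_key]} - image_labels
    if unknown:
        raise ValueError(f"Labels missing from label image: {sorted(unknown)}")


# Hypothetical attributes and obs column for a table with three ROIs
attrs = {
    "type": "masking_roi_table",
    "region": {"path": "../labels/label_DAPI"},
    "instance_key": "label",
}
obs = {"label": [1, 2, 3]}
labels_in_image = {1, 2, 3}  # e.g. the non-zero values of the label array

check_masking_roi_table_attrs(attrs)
check_instance_column(obs, attrs["instance_key"], labels_in_image)
```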
The typical use case for feature tables is to store measurements related to
+segmented objects, while maintaining a link to the original instances (e.g.
+labels). Note that the current specification is aligned to the one of masking
+ROI tables, since they both need to relate a table to a
+label image, but the two may diverge in the future.
+
As part of the current fractal-tasks-core tasks, measurements can be
+performed e.g. via regionprops from scikit-image (as wrapped in
+napari-skimage-regionprops).
+
Zarr attributes
+
For this kind of tables, fractal-tasks-core closely follows the proposed
+NGFF update mentioned above. The
+requirements on the Zarr attributes of a given table are:
+
+
Attributes must contain a type key, with value feature_table (see the footnote on table types).
+
Attributes must contain a region key; the corresponding value must be an
+ object with a path key and a string value (i.e. the path to the data the
+ table is annotating).
+
Attributes must include a key instance_key, which is the key in obs that
+ denotes which instance in region the row corresponds to.
The feature-table AnnData object must have an attribute obs with a key
+matching the instance_key zarr attribute. For instance if
+instance_key="label" then table.obs["label"] must exist, with its items
+matching the labels in the image in "../labels/label_DAPI".
OME-Zarrs created via fractal-tasks-core (e.g. by parsing Yokogawa images via
+the create_ome_zarr or create_ome_zarr_multiplex
+tasks) always include two specific ROI tables:
+
+
The table named well_ROI_table, which covers the NGFF image corresponding to the whole well (see the footnote on whole-well NGFF images);
+
The table named FOV_ROI_table, which lists all original fields of view (FOVs).
+
+
Each one of these two tables includes ROIs that span the whole image size along
+the Z axis. Note that this differs, e.g., from ROIs which are the bounding
+boxes of three-dimensional segmented objects, and which may cover only a part
+of the image Z size.
When working with an externally-generated OME-Zarr, one may use the
+import_ome_zarr task
+to make it compatible with fractal-tasks-core. This task optionally adds two
+ROI tables to the NGFF images:
+
+
The table named image_ROI_table, which covers the whole image;
+
A table named grid_ROI_table, which splits the whole-image ROI into a YX
+ rectangular grid of smaller ROIs. This may correspond to original FOVs (in
+ case the image is a tiled well, see the footnote on whole-well NGFF images),
+ or it may simply be useful for applying
+ downstream processing to smaller arrays and avoid large memory requirements.
+
+
As with the well_ROI_table and FOV_ROI_table described
+above, these two tables also include ROIs spanning the
+whole image extent along the Z axis.
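To make the grid construction concrete, here is a sketch of how a whole-image ROI could be split into a regular YX grid while keeping the full Z extent. This is not the import_ome_zarr implementation: the function name make_grid_rois, the ROI naming scheme and the numerical values are all hypothetical.

```python
def make_grid_rois(
    image_roi: dict[str, float], grid_y: int, grid_x: int
) -> dict[str, dict[str, float]]:
    """Split a whole-image ROI into grid_y * grid_x ROIs of equal YX size."""
    len_y = image_roi["len_y_micrometer"] / grid_y
    len_x = image_roi["len_x_micrometer"] / grid_x
    rois = {}
    for ind_y in range(grid_y):
        for ind_x in range(grid_x):
            rois[f"ROI_{ind_y}_{ind_x}"] = {
                "x_micrometer": image_roi["x_micrometer"] + ind_x * len_x,
                "y_micrometer": image_roi["y_micrometer"] + ind_y * len_y,
                # Z bounds are copied, so each ROI spans the whole Z extent
                "z_micrometer": image_roi["z_micrometer"],
                "len_x_micrometer": len_x,
                "len_y_micrometer": len_y,
                "len_z_micrometer": image_roi["len_z_micrometer"],
            }
    return rois


# Hypothetical whole-image ROI: 832 x 351 micrometers in XY, 5 micrometers in Z
image_roi = {
    "x_micrometer": 0.0,
    "y_micrometer": 0.0,
    "z_micrometer": 0.0,
    "len_x_micrometer": 832.0,
    "len_y_micrometer": 351.0,
    "len_z_micrometer": 5.0,
}

grid_rois = make_grid_rois(image_roi, grid_y=1, grid_x=2)
for roi_name, roi in grid_rois.items():
    print(roi_name, roi["x_micrometer"], roi["len_x_micrometer"])
```

With a 1 x 2 grid, the two resulting ROIs each cover half of the image along X and the whole image along Y and Z.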
ROI tables are also used and updated during image processing, e.g. as in:
+
+
The FOV ROI table may undergo transformations during processing, e.g. FOV
+ ROIs may be shifted to avoid overlaps; in this case, we use the optional
+ columns x_micrometer_original and y_micrometer_original to store the values
+ before the transformation.
+
The FOV ROI table is also used to store information on the registration of
+ multiplexing acquisitions, via the translation_x, translation_y and
+ translation_z optional columns.
+
Several tasks in fractal-tasks-core take an existing ROI table as an input
+ and then loop over the ROIs defined in the table. This makes the task more
+ flexible, as it can be used to process e.g. a whole well, a set of FOVs, or a
+ set of custom regions of the array.
To read an AnnData table from a Zarr group, one may use the read_zarr function.
+In the following example a NGFF image was created by stitching together two
+fields of view, where each one is made of a stack of five Z planes with 1 µm
+spacing between the planes.
+The FOV_ROI_table has information on the XY position and size of the two
+original FOVs (named FOV_1 and FOV_2):
+
The anndata.experimental.write_elem function provides the required
+functionality to write an AnnData object to a Zarr group. In
+fractal-tasks-core, the write_table helper function wraps the anndata
+function and includes additional functionalities -- see its
+documentation.
+
With respect to the wrapped anndata function, the main additional features of write_table are:
+
+
The boolean parameter overwrite (defaulting to False), that determines the behavior in case of an already-existing table at the given path.
+
The table_attrs parameter, as a shorthand for updating the Zarr attributes of the table group after its creation.
These specifications may evolve (especially based on future NGFF updates),
+possibly leading to breaking changes in future versions.
+fractal-tasks-core will aim at maintaining backwards-compatibility with V1 for
+a reasonable amount of time.
+
Here is an in-progress list of aspects that may be reviewed:
+
+
We aim at removing the use of hard-coded units from the column names (e.g.
+ x_micrometer), in favor of a more general definition of units.
+
The z_micrometer and len_z_micrometer columns are currently required in
+ all ROI tables, even when the ROIs actually define a two-dimensional XY
+ region; in that case, we set z_micrometer=0 and len_z_micrometer is such
+ that the whole Z size is covered (that is, len_z_micrometer is the product
+ of the spacing between Z planes and the number of planes). In a future
+ version, we may introduce more flexibility and also accept ROI tables which
+ only include X and Y axes, and adapt the relevant tools so that they
+ automatically expand these ROIs into three-dimensions when appropriate.
+
Concerning the use of AnnData tables or other formats for tabular data, our
+ plan is to follow whatever serialised table specification becomes part of the
+ NGFF standard. For the record, Zarr does not natively support storage of
+ dataframes (see e.g.
+ https://github.com/zarr-developers/numcodecs/issues/452), which is one aspect
+ in favor of sticking with the anndata library.
+
+
+
+
+
+
Within fractal-tasks-core, NGFF images represent whole wells; this still
+complies with the NGFF specifications, following an approved clarification in the
+specs. This is why we store
+the regions corresponding to the original FOVs in a specific ROI table,
+since one NGFF image includes a collection of FOVs. Note that this approach
+does not rely on the assumption that the FOVs constitute a regular tiling of
+the well; it also covers the case of irregularly placed FOVs.
+
+
+
Note that the table types masking_roi_table and feature_table closely
+resemble the type="ngff:region_table" specification in the previous proposed
+NGFF table specs.