Here is a list of tasks that are available within Fractal-compatible packages,
+including both fractal-tasks-core and others.
+
These are the tasks that we are aware of (on December 5th, 2023); if you created
+your own package of Fractal tasks, reach out to have it listed here (or, if you
+want to build your own tasks, follow these instructions).
(major) Introduce new tasks for registration of multiplexing cycles: calculate_registration_image_based, apply_registration_to_ROI_tables, apply_registration_to_image (#487).
+
(major) Introduce new overwrite argument for tasks create_ome_zarr, create_ome_zarr_multiplex, yokogawa_to_ome_zarr, copy_ome_zarr, maximum_intensity_projection, cellpose_segmentation, napari_workflows_wrapper (#499).
+
(major) Rename illumination_correction parameter from overwrite to overwrite_input (#499).
+
Fix plate-selection bug in copy_ome_zarr task (#513).
+
Fix bug in definition of metadata["plate"] in create_ome_zarr_multiplex task (#513).
+
Introduce new helper functions write_table, prepare_label_group and open_zarr_group_with_overwrite (#499).
Make tasks-related dependencies optional, and installable via fractal-tasks extra (#390).
+
Remove tools package extra (#384), and split the subpackage content into lib_ROI_overlaps and examples (#390).
+
+
+
(major) Modify task arguments
+
Add Pydantic model lib_channels.OmeroChannel (#410, #422);
+
Add Pydantic model tasks._input_models.Channel (#422);
+
Add Pydantic model tasks._input_models.NapariWorkflowsInput (#422);
+
Add Pydantic model tasks._input_models.NapariWorkflowsOutput (#422);
+
Move all Pydantic models to main package (#438).
+
Modify arguments of illumination_correction task (#431);
+
Modify arguments of create_ome_zarr and create_ome_zarr_multiplex (#433).
+
Modify argument default for ROI_table_names, in copy_ome_zarr (#449).
+
Remove the delete option from yokogawa_to_ome_zarr (#443).
+
Reorder task inputs (#451).
+
+
+
JSON Schemas for task arguments:
+
Add JSON Schemas for task arguments in the package manifest (#369, #384).
+
Add JSON Schemas for attributes of custom task-argument Pydantic models (#436).
+
Make schema-generation tools more general, when handling custom Pydantic models (#445).
+
Include titles for custom-model-typed arguments and argument attributes (#447).
+
Remove TaskArguments models and switch to Pydantic V1 validate_arguments (#369).
+
Make coercing and validating task arguments required, rather than optional (#408).
+
Remove default_args from manifest (#379, #393).
+
+
+
Other:
+
Make pydantic dependency required for running tasks, and pin it to V1 (#408).
+
Remove legacy executor definitions from manifest (#361).
+
Add GitHub action for testing pip install with/without fractal-tasks extra (#390).
+
Remove sqlmodel from dev dependencies (#374).
+
Relax constraint on torch version, from ==1.12.1 to <=2.0.0 (#406).
+
Review task docstrings and improve documentation (#413, #416).
+
Update anndata dependency requirements (from ^0.8.0 to >=0.8.0,<=0.9.1), and replace anndata.experimental.write_elem with anndata._io.specs.write_elem (#428).
Disable bugged validation of model_type argument in cellpose_segmentation (#344).
+
Raise an error if the user provides an unexpected argument to a task (#337); this applies to the case of running a task as a script, with a pydantic model for task-argument validation.
(major) Update task interface: remove filename extension from input_paths and output_path for all tasks, and add new arguments (image_extension,image_glob_pattern) to create_ome_zarr task (#323).
+
Implement logic for handling image_glob_patterns argument, both when globbing images and in Yokogawa metadata parsing (#326).
The fractal-tasks-core repository is the reference implementation for Fractal
+tasks and for Fractal task packages, but the Fractal platform can also be
+used to execute custom tasks. This page lists the Fractal-compatibility
+requirements, for a single custom task or for a task
+package.
Each task must be associated with some metadata, so that it can be used in
+Fractal. The full specification is
+here,
+and the required attributes are:
+
+
name: the task name, e.g. "Create OME-Zarr structure";
+
command: a command that can be executed from the command line;
+
input_type: this can be any string (typical examples: "image" or "zarr");
+ the special value "Any" means that Fractal won't perform any check of the
+ input_type when applying the task to a dataset.
+
output_type: same logic as input_type.
+
source: this is meant to be as close as possible to a unique task identifier;
+ for custom tasks, it can be anything (e.g. "my_task"), but for tasks that
+ are collected automatically from a package (see Task package) this
+ attribute will have a very specific form (e.g.
+ "pip_remote:fractal_tasks_core:0.10.0:fractal-tasks::convert_yokogawa_to_ome-zarr").
+
meta: a JSON object (similar to a Python dictionary) with some additional
+ information, see Task meta-parameters.
+
+
There are multiple ways to get the appropriate metadata into the database,
+including a POST request to the fractal-server API (see Tasks section in
+the fractal-server API
+documentation)
+or the automated addition of a whole set of tasks through specific API
+endpoints (see Task package).
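
As a rough illustration of the attributes listed above, the following sketch builds the metadata for a custom task as a Python dictionary and checks that no required attribute is missing. The helper `missing_task_attributes`, the command string and all example values are hypothetical; they are not part of fractal-server or fractal-tasks-core.

```python
# Required attributes, as listed in the task-metadata description above
REQUIRED_ATTRIBUTES = (
    "name",
    "command",
    "input_type",
    "output_type",
    "source",
    "meta",
)


def missing_task_attributes(task: dict) -> list[str]:
    """Return the required attributes that are absent from `task`."""
    return [key for key in REQUIRED_ATTRIBUTES if key not in task]


# Hypothetical custom task (all values are illustrative assumptions)
my_task = {
    "name": "My custom task",
    "command": "python /path/to/my_task.py",
    "input_type": "zarr",
    "output_type": "zarr",
    "source": "my_task",
    "meta": {"parallelization_level": "well"},
}

print(missing_task_attributes(my_task))
```

A check of this kind only verifies that the keys exist; the server-side specification remains the reference for what each value may contain.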
Therefore the task command must accept these additional command-line arguments.
+If the task is a Python script, this can be achieved easily by using the
+run_fractal_task function - which is available as part of
+fractal_tasks_core.tasks._utils.
The meta attribute of tasks (see the corresponding item in Task
+metadata) is where we specify some requirements on how the
+task should be run. This notably includes:
+
+
If the task has to be run in parallel (e.g. over multiple wells of an
+ OME-Zarr dataset), then meta should include a key-value pair like
+ {"parallelization_level": "well"}. If the parallelization_level key is
+ missing, the task is considered as non-parallel.
+
If Fractal is configured to run on a SLURM cluster, meta may include
+ additional information on the SLURM requirements (more info on the Fractal
+ SLURM backend
+ here).
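
The parallelization rule above (a missing `parallelization_level` key means a non-parallel task) could be expressed as in the sketch below. The function name `get_parallelization_level` is an assumption made for illustration and does not refer to an existing Fractal function.

```python
from typing import Optional


def get_parallelization_level(meta: Optional[dict]) -> Optional[str]:
    """
    Return the parallelization level declared in `meta`, or None when the
    task should be treated as non-parallel (key missing or no meta at all).
    """
    if not meta:
        return None
    return meta.get("parallelization_level")


print(get_parallelization_level({"parallelization_level": "well"}))
print(get_parallelization_level({}))
```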
When a task is run via Fractal, its input parameters (i.e. the ones in the file
+specified via the -j command-line option) will always include a set of keyword
+arguments with specific names:
The only task output which will be visible to Fractal is what goes in the
+output metadata-update file (i.e. the one specified through the
+--metadata-out command-line option). Note that this only holds for
+non-parallel tasks, while (for the moment) Fractal fully ignores the output of
+parallel tasks.
+
+
IMPORTANT: This means that each task must always write any output to
+disk, before ending.
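
The command-line contract described above (input parameters read from the JSON file passed via `-j`, output written to the file passed via `--metadata-out`) can be sketched with the standard library alone. This is not the implementation of `run_fractal_task`: the names `run_task_from_cli` and `my_task` are hypothetical, the long option `--json` is an assumption, and the sketch skips the argument validation that fractal-tasks-core performs.

```python
import argparse
import json
from typing import Callable, Optional, Sequence


def my_task(**kwargs) -> dict:
    """Hypothetical task: returns a metadata update as a dictionary."""
    return {"received_arguments": sorted(kwargs)}


def run_task_from_cli(
    task_function: Callable[..., dict],
    argv: Optional[Sequence[str]] = None,
) -> None:
    """Read task arguments from a JSON file, run the task, write its output."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-j", "--json", required=True, help="Input JSON file")
    parser.add_argument(
        "--metadata-out", required=True, help="Output JSON file"
    )
    args = parser.parse_args(argv)

    # Load the keyword arguments for the task
    with open(args.json) as f:
        task_arguments = json.load(f)

    # Run the task; a task returning None yields an empty metadata update
    metadata_update = task_function(**task_arguments) or {}

    # Write the metadata update to disk before ending
    with open(args.metadata_out, "w") as f:
        json.dump(metadata_update, f, indent=2)
```

In a standalone script, `run_task_from_cli(my_task)` would typically be called under an `if __name__ == "__main__":` guard, so that the arguments are taken from the actual command line.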
The description of other advanced features is not yet available on this page.
+
+
Other attributes of the Task metadata also exist, and they
+ are recognized by other Fractal components (e.g. fractal-server or
+ fractal-web). These include JSON Schemas for input parameters and additional
+ documentation-related attributes.
+
In fractal-tasks-core, we use pydantic
+ v1 to fully coerce and validate the input
+ parameters into a set of given types.
Given a set of Python scripts corresponding to Fractal tasks, it is useful to
+combine them into a single Python package, using the standard
+tools or
+other options (e.g. for fractal-tasks-core we use
+poetry).
Creating a package is often a good practice, for reasons unrelated to Fractal:
+
+
It makes it simple to assign a global version to the package, and to host it
+ on a public index like PyPI;
+
It may reduce code duplication:
+
The scripts may have a shared set of external dependencies, which are
+ defined in a single place for a package.
+
The scripts may import functions from a shared set of auxiliary Python
+ modules, which can be included in the package.
+
+
+
+
Moreover, having a single package also streamlines some Fractal-related
+operations. Given the package MyTasks (available on PyPI, or locally), the
+Fractal platform offers a feature that automatically:
+
+
Downloads the wheel file of package MyTasks (if it's on a public index,
+ rather than a local file);
+
Creates a Python virtual environment (venv) which is specific for a given
+ version of the MyTasks package, and installs the MyTasks package in that
+ venv;
+
Populates all the corresponding entries in the task database table with
+ the appropriate Task metadata, which are extracted from
+ the package manifest.
+
+
This feature is currently exposed in the /api/v1/task/collect/pip/ endpoint of fractal-server (see API documentation).
To be compatible with Fractal, a task package must satisfy some additional requirements:
+
+
The package is built as a wheel file, and can be installed via pip.
+
The __FRACTAL_MANIFEST__.json file is bundled in the package, in its root
+ folder. If you are using poetry, no special operation is needed. If you
+ are using a setup.cfg file, see
+ this
+ comment.
+
Include JSON Schemas. The tools in fractal_tasks_core.dev are used to
+ generate JSON Schemas for the input parameters of each task in
+ fractal-tasks-core. They are meant to be flexible and re-usable to perform
+ the same operation on an independent package, but they are not thoroughly
+ documented/tested for more general use; feel free to open an issue if something
+ is not clear.
+
Include additional task metadata like docs_info or docs_link, which will
+ be displayed in the Fractal web-client. Note: this feature is not yet
+ implemented.
We use poetry to manage the development environment and the dependencies. A simple way to install it is pipx install poetry==1.7.1, or you can look at the installation section here.
+will take care of installing all the dependencies in a separate environment, optionally also installing the dependencies for development and for building the documentation.
+
We use pytest for unit and integration testing of Fractal. If you installed the development dependencies, you may run the test suite by invoking:
+
poetry run pytest
+
+
The test files are in the tests folder of the repository, and they are also
+run through GitHub Actions; both the main fractal_tasks_core tests (in
+tests/) and the fractal_tasks_core.tasks tests (in tests/tasks/) are run
+with Python 3.9, 3.10 and 3.11.
The documentation is built with mkdocs.
+To build the documentation locally, set up a development Python environment (e.g. with poetry install --with docs) and then run one of these commands:
+
poetry run mkdocs serve --config-file mkdocs.yml # serves the docs at http://127.0.0.1:8000
+poetry run mkdocs build --config-file mkdocs.yml # creates a build in the `site` folder
+
You reviewed dependencies and dev dependencies and the lock file is up to date with pyproject.toml (it is useful to have a look at the output of deptry . -v, where deptry is already installed as part of the dev dependencies - NOTE: deptry should be installed independently, e.g. via pipx install deptry).
+
The current HEAD of the main branch passes all the tests (note: make sure that you are using the poetry-installed local package).
+
Update changelog. First look at the list of commits since the last tag, via:
+
+then add the upcoming release to docs/source/changelog.rst with the main information about it, using standard categories like "New features", "Fixes" and "Other changes", and including PR numbers when relevant. Commit docs/source/changelog.rst and push.
+
If appropriate (e.g. if you added some new task arguments, or if you modified some of their descriptions), update the JSON Schemas in the manifest via:
+
poetry run python fractal_tasks_core/dev/update_manifest.py
+
+
+
Actual release
+
+
Use:
+
poetry run bumpver update --[tag-num|patch|minor] --tag-commit --commit --dry
+
+to test the version bump without applying it.
+
If the previous step looks good, use:
+
poetry run bumpver update --[tag-num|patch|minor] --tag-commit --commit
+
+to actually bump the version. This will trigger a dedicated GitHub
+action to build the new package and publish it to PyPI.
+
\ No newline at end of file
diff --git a/doc-requirements.txt b/doc-requirements.txt
new file mode 100644
index 000000000..1fc3cce45
--- /dev/null
+++ b/doc-requirements.txt
@@ -0,0 +1,8 @@
+mkdocs==1.5.2
+mkdocs-material==9.1.21
+mkdocs-autorefs==0.5.0
+mkdocs-gen-files==0.4.0
+mkdocs-literate-nav==0.5.0
+mkdocs-section-index==0.3.5
+mkdocstrings[python]==0.22.0
+mkdocs-include-markdown-plugin==4.0.4
diff --git a/extra.css b/extra.css
new file mode 100644
index 000000000..be13614b9
--- /dev/null
+++ b/extra.css
@@ -0,0 +1,17 @@
+/* Custom style for blockquotes */
+blockquote {
+ background-color: #e7e3e3d8; /* Light gray background */
+ border: 1px solid #9c27b0; /* Purple border */
+ padding: 10px;
+ margin: 20px 0;
+ border-radius: 4px;
+ font-size: 16px;
+ line-height: 1.6;
+ box-shadow: 2px 2px 5px rgba(0, 0, 0, 0.1); /* Optional: Add a subtle shadow */
+}
+
+/* Style the text inside blockquotes */
+blockquote p {
+ margin: 0;
+ color: #333; /* Dark text color */
+}
diff --git a/gen_ref_pages.py b/gen_ref_pages.py
new file mode 100644
index 000000000..a5b3a6ea6
--- /dev/null
+++ b/gen_ref_pages.py
@@ -0,0 +1,81 @@
+from pathlib import Path
+from typing import Iterable
+from typing import Mapping
+
+import mkdocs_gen_files
+from mkdocs_gen_files import Nav
+
+
+class CustomNav(Nav):
+ """
+ The original Nav class is part of mkdocs_gen_files
+ (https://github.com/oprypin/mkdocs-gen-files)
+ Original Copyright 2020 Oleh Prypin
+ License: MIT
+ """
+
+ def __init__(self, *args, **kwargs):
+ super().__init__(*args, **kwargs)
+
+ @classmethod
+ def _items(cls, data: Mapping, level: int) -> Iterable[Nav.Item]:
+ """
+ Custom modification: rather than looping over data.items(), we loop
+ over keys/values in a custom order (that is, we first include "tasks",
+ then "dev", then all the rest)
+ """
+ sorted_keys = list(data.keys())
+ if None in sorted_keys:
+ sorted_keys.remove(None)
+ sorted_keys = sorted(sorted_keys, key=str.casefold)
+ if "dev" in sorted_keys:
+ sorted_keys.remove("dev")
+ sorted_keys = ["dev"] + sorted_keys
+ if "tasks" in sorted_keys:
+ sorted_keys.remove("tasks")
+ sorted_keys = ["tasks"] + sorted_keys
+
+ for key in sorted_keys:
+ value = data[key]
+ if key is not None:
+ yield cls.Item(
+ level=level, title=key, filename=value.get(None)
+ )
+ yield from cls._items(value, level + 1)
+
+
+nav = CustomNav()
+
+for path in sorted(Path("fractal_tasks_core").rglob("*.py")):
+ module_path = path.relative_to(".").with_suffix("")
+ doc_path = path.relative_to(".").with_suffix(".md")
+ full_doc_path = Path("reference", doc_path)
+
+ parts = list(module_path.parts)
+
+ if parts[-1] == "__init__":
+ parts = parts[:-1]
+ doc_path = doc_path.with_name("index.md")
+ full_doc_path = full_doc_path.with_name("index.md")
+ elif parts[-1] == "__main__":
+ continue
+
+ # Remove fractal_tasks_core from doc_path
+ doc_path = Path("/".join(doc_path.as_posix().split("/")[1:]))
+
+ # Remove fractal_tasks_core from parts, and skip the case where
+ # parts=["fractal_tasks_core"]
+ if parts[1:]:
+ nav[parts[1:]] = doc_path.as_posix()
+
+ with mkdocs_gen_files.open(full_doc_path, "w") as fd:
+ identifier = ".".join(parts)
+ fd.write(f"::: {identifier}")
+
+ mkdocs_gen_files.set_edit_path(full_doc_path, path)
+
+
+with mkdocs_gen_files.open(
+ "reference/fractal_tasks_core/SUMMARY.md", "w"
+) as nav_file:
+ nav_file.writelines(nav.build_literate_nav())
diff --git a/index.html b/index.html
new file mode 100644
index 000000000..8673f746e
--- /dev/null
+++ b/index.html
@@ -0,0 +1,1454 @@
+
+ Fractal Tasks Core
+
Fractal is a framework to process high-content imaging data at scale and prepare it for interactive visualization.
+
+
This project is under active development 🔨. If you need help or found a bug, open an issue here.
+
+
Fractal provides distributed workflows that convert TBs of image data into OME-Zarr files.
+The platform then processes the 3D image data by applying tasks like illumination correction, maximum intensity projection, 3D segmentation using cellpose and measurements using napari workflows.
+The pyramidal OME-Zarr files enable interactive visualization in the napari viewer.
+
+
The fractal-tasks-core package contains the Python tasks that parse Yokogawa CV7000 images into OME-Zarr and process OME-Zarr files. Find more information about Fractal in general and the other repositories at this link. All tasks are written as Python functions and are optimized for usage in Fractal workflows, but they can also be used as standalone functions to parse data or process OME-Zarr files. We heavily use regions of interest (ROIs) in our OME-Zarr files to store the positions of fields of view. ROIs are saved as AnnData tables following this spec proposal. We save wells as large Zarr arrays instead of a collection of arrays for each field of view (see details here).
+
Here is an example of the interactive visualization in napari using the newly-proposed async loading in NAP4 and the napari-ome-zarr plugin:
Create Zarr Structure: Task to generate the zarr structure based on Yokogawa metadata files
+
Yokogawa to Zarr: Parses the Yokogawa CV7000 image data and saves it to the Zarr file
+
Illumination Correction: Applies an illumination correction based on a flatfield image & subtracts a background from the image.
+
Image Labeling (& Image Labeling Whole Well): Applies a cellpose network to the image of a single ROI or the whole well. cellpose parameters can be tuned for optimal performance.
+
Maximum Intensity Projection: Creates a maximum intensity projection of the whole plate.
+
Measurement: Make some standard measurements (intensity & morphology) using napari workflows, saving results to AnnData tables.
+
+
Some additional tasks are currently being worked on and some older tasks are still present in the fractal_tasks_core folder. See the package page for the detailed description of all tasks.
Fractal was conceived in the Liberali Lab at the Friedrich Miescher Institute for Biomedical Research and in the Pelkmans Lab at the University of Zurich by @jluethi and @gusqgm. The Fractal project is now developed at the BioVisionCenter at the University of Zurich and is led by @jluethi. The core development is done under contract by eXact lab S.r.l.
def glob_with_multiple_patterns(
+    *,
+    folder: str,
+    patterns: Sequence[str] = None,
+) -> set[str]:
+    """
+    List all the items (files and folders) in a given folder that
+    simultaneously match a series of glob patterns.
+
+    Args:
+        folder: Base folder where items will be searched.
+        patterns: If specified, the list of patterns (defined as in
+            https://docs.python.org/3/library/fnmatch.html) that item
+            names will match with.
+    """
+
+    # Sanitize base-folder path
+    if folder.endswith("/"):
+        actual_folder = folder[:-1]
+    else:
+        actual_folder = folder[:]
+
+    # If no pattern is specified, look for *all* items in the base folder
+    if not patterns:
+        patterns = ["*"]
+
+    # Combine multiple glob searches (via set intersection)
+    logging.info(f"[glob_with_multiple_patterns] {patterns=}")
+    items = None
+    for pattern in patterns:
+        new_matches = glob(f"{actual_folder}/{pattern}")
+        if items is None:
+            items = set(new_matches)
+        else:
+            items = items.intersection(new_matches)
+    items = items or set()
+    logging.info(f"[glob_with_multiple_patterns] Found {len(items)} items")
+
+    return items
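
To make the set-intersection behaviour concrete, here is a small self-contained sketch that applies the same idea to a temporary folder. The function `glob_intersection` is a condensed stand-in written for this example (not the library function itself), and the filenames are invented.

```python
import tempfile
from glob import glob
from pathlib import Path


def glob_intersection(folder: str, patterns: list[str]) -> set[str]:
    """Return the items in `folder` that match all of `patterns`."""
    items = None
    # With no patterns, fall back to matching every item in the folder
    for pattern in patterns or ["*"]:
        matches = set(glob(f"{folder.rstrip('/')}/{pattern}"))
        items = matches if items is None else items & matches
    return items or set()


with tempfile.TemporaryDirectory() as tmp:
    for name in ["A01_C01.png", "A01_C02.png", "B02_C01.png", "notes.txt"]:
        Path(tmp, name).touch()
    found = glob_intersection(tmp, ["A01_*", "*_C01.png"])
    names = sorted(Path(item).name for item in found)

print(names)  # ['A01_C01.png']
```

Only `A01_C01.png` satisfies both patterns, so it is the single element of the intersection.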
+
def calculate_steps(site_series: pd.Series):
+    """
+    TBD
+
+    Args:
+        site_series: TBD
+    """
+
+    # site_series is the z_micrometer series for a given site of a given
+    # channel. This function calculates the step size in Z
+
+    # First diff is always NaN because there is nothing to compare it to
+    steps = site_series.diff().dropna().astype(float)
+    if not np.allclose(steps.iloc[0], np.array(steps)):
+        raise NotImplementedError(
+            "When parsing the Yokogawa mlf file, some sites "
+            "had varying step size in Z. "
+            "That is not supported for the OME-Zarr parsing"
+        )
+    return steps.mean()
+
def get_earliest_time_per_site(mlf_frame: pd.DataFrame) -> pd.DataFrame:
+    """
+    TBD
+
+    Args:
+        mlf_frame: TBD
+    """
+
+    # Get the time information per site
+    # Because a site will contain time information for each plane
+    # of each channel, we just return the earliest time information
+    # per site.
+    return pd.to_datetime(
+        mlf_frame.groupby(["well_id", "FieldIndex"]).min()["Time"], utc=True
+    )
+
def get_z_steps(mlf_frame: pd.DataFrame) -> pd.DataFrame:
+    """
+    TBD
+
+    Args:
+        mlf_frame: TBD
+    """
+
+    # Process mlf_frame to extract Z information (pixel size & steps).
+    # Run checks on consistencies & return site-based z step dataframe
+    # Group by well, field & channel
+    grouped_sites_z = (
+        mlf_frame.loc[
+            :,
+            ["well_id", "FieldIndex", "ActionIndex", "Ch", "Z"],
+        ]
+        .set_index(["well_id", "FieldIndex", "ActionIndex", "Ch"])
+        .groupby(level=[0, 1, 2, 3])
+    )
+
+    # If there is only 1 Z step, set the Z spacing to the count of planes => 1
+    if grouped_sites_z.count()["Z"].max() == 1:
+        z_data = grouped_sites_z.count().groupby(["well_id", "FieldIndex"])
+    else:
+        # Group the whole site (combine channels), because Z steps need to be
+        # consistent between channels for OME-Zarr.
+        z_data = grouped_sites_z.apply(calculate_steps).groupby(
+            ["well_id", "FieldIndex"]
+        )
+
+    check_group_consistency(
+        z_data, message="Comparing Z steps between channels"
+    )
+
+    # Ensure that channels have the same number of z planes and
+    # reduce it to one value.
+    # Only check if there is more than one channel available
+    if any(
+        grouped_sites_z.count().groupby(["well_id", "FieldIndex"]).count() > 1
+    ):
+        check_group_consistency(
+            grouped_sites_z.count().groupby(["well_id", "FieldIndex"]),
+            message="Checking number of Z steps between channels",
+        )
+
+    z_steps = (
+        grouped_sites_z.count()
+        .groupby(["well_id", "FieldIndex"])
+        .mean()
+        .astype(int)
+    )
+
+    # Combine the two dataframes
+    z_frame = pd.concat([z_data.mean(), z_steps], axis=1)
+    z_frame.columns = ["pixel_size_z", "z_pixel"]
+    return z_frame
+
def read_metadata_files(
+    mrf_path: str,
+    mlf_path: str,
+    filename_patterns: Optional[list[str]] = None,
+) -> tuple[pd.DataFrame, pd.DataFrame, int]:
+    """
+    TBD
+
+    Args:
+        mrf_path: Full path to MeasurementDetail.mrf metadata file.
+        mlf_path: Full path to MeasurementData.mlf metadata file.
+        filename_patterns: List of patterns to filter the image filenames in
+            the mlf metadata table. Patterns must be defined as in
+            https://docs.python.org/3/library/fnmatch.html.
+    """
+
+    # parsing of mrf & mlf files are based on the
+    # yokogawa_image_collection_task v0.5 in drogon, written by Dario Vischi.
+    # https://github.com/fmi-basel/job-system-workflows/blob/00bbf34448972d27f258a2c28245dd96180e8229/src/gliberal_workflows/tasks/yokogawa_image_collection_task/versions/version_0_5.py # noqa
+    # Now modified for Fractal use
+
+    mrf_frame = read_mrf_file(mrf_path)
+    # TODO: filter_position & filter_wheel_position are parsed, but not
+    # processed further. Figure out how to save them as relevant metadata for
+    # use e.g. during illumination correction
+
+    mlf_frame, error_count = read_mlf_file(mlf_path, filename_patterns)
+    # TODO: Time points are parsed as part of the mlf_frame, but currently not
+    # processed further. Once we tackle time-resolved data, parse from here.
+
+    return mrf_frame, mlf_frame, error_count
+
def read_mlf_file(
+    mlf_path: str,
+    filename_patterns: Optional[list[str]] = None,
+) -> tuple[pd.DataFrame, int]:
+    """
+    TBD
+
+    Args:
+        mlf_path: Full path to MeasurementData.mlf metadata file.
+        filename_patterns: List of patterns to filter the image filenames in
+            the mlf metadata table. Patterns must be defined as in
+            https://docs.python.org/3/library/fnmatch.html.
+    """
+
+    # Load the whole MeasurementData.mlf file
+    mlf_frame_raw = pd.read_xml(mlf_path)
+
+    # Remove all rows that do not match the given patterns
+    logger.info(
+        f"Read {mlf_path}, and apply following patterns to "
+        f"image filenames: {filename_patterns}"
+    )
+    if filename_patterns:
+        filenames = mlf_frame_raw.MeasurementRecord
+        keep_row = None
+        for pattern in filename_patterns:
+            actual_pattern = fnmatch.translate(pattern)
+            new_matches = filenames.str.fullmatch(actual_pattern)
+            if new_matches.sum() == 0:
+                raise ValueError(
+                    f"In {mlf_path} there is no image filename "
+                    f'matching "{actual_pattern}".'
+                )
+            if keep_row is None:
+                keep_row = new_matches.copy()
+            else:
+                keep_row = keep_row & new_matches
+        if keep_row.sum() == 0:
+            raise ValueError(
+                f"In {mlf_path} there is no image filename "
+                f"matching {filename_patterns}."
+            )
+        mlf_frame_matching = mlf_frame_raw[keep_row.values].copy()
+    else:
+        mlf_frame_matching = mlf_frame_raw.copy()
+
+    # Create a well ID column
+    row_str = [chr(x) for x in (mlf_frame_matching["Row"] + 64)]
+    mlf_frame_matching["well_id"] = [
+        f"{a}{b:02}" for a, b in zip(row_str, mlf_frame_matching["Column"])
+    ]
+
+    # Flip Y axis to align to image coordinate system
+    mlf_frame_matching["Y"] = -mlf_frame_matching["Y"]
+
+    # Compute number of errors
+    error_count = (mlf_frame_matching["Type"] == "ERR").sum()
+
+    # We're only interested in the image metadata
+    mlf_frame = mlf_frame_matching[mlf_frame_matching["Type"] == "IMG"]
+
+    return mlf_frame, error_count
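
Two steps of the function above lend themselves to a small standalone illustration: filtering filenames so that each one matches every pattern, and building a well ID from numeric row and column indices. The sketch below works on plain Python lists with the standard library rather than on a pandas DataFrame; `filter_filenames`, `well_id` and the example filenames are hypothetical names introduced here.

```python
import fnmatch
import re


def filter_filenames(filenames: list[str], patterns: list[str]) -> list[str]:
    """Keep the filenames that fully match every fnmatch-style pattern."""
    # fnmatch.translate converts a shell-style pattern into a regex string
    regexes = [re.compile(fnmatch.translate(p)) for p in patterns]
    return [
        name
        for name in filenames
        if all(regex.fullmatch(name) for regex in regexes)
    ]


def well_id(row: int, column: int) -> str:
    """Build a well ID such as `B03` from 1-based row and column indices."""
    # chr(65) is "A", so row 1 maps to "A", row 2 to "B", and so on
    return f"{chr(row + 64)}{column:02}"


filenames = ["x_B03_C01.tif", "x_B03_C02.tif", "x_C04_C01.tif"]
print(filter_filenames(filenames, ["*_B03_*", "*_C01.tif"]))
print(well_id(2, 3))
```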
+
class ChannelInputModel(BaseModel):
+    """
+    A channel which is specified by either `wavelength_id` or `label`.
+
+    This model is similar to `OmeroChannel`, but it is used for
+    task-function arguments (and for generating appropriate JSON schemas).
+
+    Attributes:
+        wavelength_id: Unique ID for the channel wavelength, e.g. `A01_C01`.
+        label: Name of the channel.
+    """
+
+    wavelength_id: Optional[str] = None
+    label: Optional[str] = None
+
+    @validator("label", always=True)
+    def mutually_exclusive_channel_attributes(cls, v, values):
+        """
+        Check that either `label` or `wavelength_id` is set.
+        """
+        wavelength_id = values.get("wavelength_id")
+        label = v
+        if wavelength_id and v:
+            raise ValueError(
+                "`wavelength_id` and `label` cannot be both set "
+                f"(given {wavelength_id=} and {label=})."
+            )
+        if wavelength_id is None and v is None:
+            raise ValueError(
+                "`wavelength_id` and `label` cannot be both `None`"
+            )
+        return v
+
Custom error for when get_channel_from_list fails,
+that can be captured and handled upstream if needed.
+
+
+ Source code in fractal_tasks_core/channels.py
+
class ChannelNotFoundError(ValueError):
+    """
+    Custom error for when `get_channel_from_list` fails,
+    that can be captured and handled upstream if needed.
+    """
+
+    pass
+
class OmeroChannel(BaseModel):
+    """
+    Custom class for Omero channels, based on OME-NGFF v0.4.
+
+    Attributes:
+        wavelength_id: Unique ID for the channel wavelength, e.g. `A01_C01`.
+        index: Do not change. For internal use only.
+        label: Name of the channel.
+        window: Optional `Window` object to set default display settings for
+            napari.
+        color: Optional hex colormap to display the channel in napari (it
+            must be of length 6, e.g. `00FFFF`).
+        active: Should this channel be shown in the viewer?
+        coefficient: Do not change. Omero-channel attribute.
+        inverted: Do not change. Omero-channel attribute.
+    """
+
+    # Custom
+
+    wavelength_id: str
+    index: Optional[int]
+
+    # From OME-NGFF v0.4 transitional metadata
+
+    label: Optional[str]
+    window: Optional[Window]
+    color: Optional[str]
+    active: bool = True
+    coefficient: int = 1
+    inverted: bool = False
+
+    @validator("color", always=True)
+    def valid_hex_color(cls, v, values):
+        """
+        Check that `color` is made of exactly six elements which are letters
+        (a-f or A-F) or digits (0-9).
+        """
+        if v is None:
+            return v
+        if len(v) != 6:
+            raise ValueError(f'color must have length 6 (given: "{v}")')
+        allowed_characters = "abcdefABCDEF0123456789"
+        for character in v:
+            if character not in allowed_characters:
+                raise ValueError(
+                    "color must only include characters from "
+                    f'"{allowed_characters}" (given: "{v}")'
+                )
+        return v
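
The rule enforced by `valid_hex_color` (exactly six characters, each a hexadecimal digit) can also be written as a single regular expression. The sketch below uses that alternative technique outside of Pydantic; `is_valid_hex_color` is a hypothetical helper and returns a boolean instead of raising an error.

```python
import re

# Exactly six characters, each one in 0-9, a-f or A-F
HEX_COLOR = re.compile(r"[0-9a-fA-F]{6}")


def is_valid_hex_color(color: str) -> bool:
    """Return True if `color` is exactly six hexadecimal characters."""
    return HEX_COLOR.fullmatch(color) is not None


print(is_valid_hex_color("00FFFF"))
print(is_valid_hex_color("#00FFFF"))
```

Note that a leading `#` makes the value invalid, since the expected format is the bare six-character string.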
+
class Window(BaseModel):
+    """
+    Custom class for Omero-channel window, based on OME-NGFF v0.4.
+
+    Attributes:
+        min: Do not change. It will be set to `0` by default.
+        max:
+            Do not change. It will be set according to bit-depth of the images
+            by default (e.g. 65535 for 16 bit images).
+        start: Lower-bound rescaling value for visualization.
+        end: Upper-bound rescaling value for visualization.
+    """
+
+    min: Optional[int]
+    max: Optional[int]
+    start: int
+    end: int
+
def _get_new_unique_value(
+    value: str,
+    existing_values: list[str],
+) -> str:
+    """
+    Produce a string value that is not present in a given list
+
+    Append `-1`, `-2`, ... to a given string, if needed, until finding a value
+    which is not already present in `existing_values`.
+
+    Args:
+        value: The first guess for the new value
+        existing_values: The list of existing values
+
+    Returns:
+        A string value which is not present in `existing_values`
+    """
+    counter = 1
+    new_value = value
+    while new_value in existing_values:
+        new_value = f"{value}-{counter}"
+        counter += 1
+    return new_value
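
A short usage example may help to see which suffix is produced. The block below repeats the same loop in a standalone function (named `get_new_unique_value` here, without the leading underscore, only so that the example is self-contained) and applies it to invented channel labels.

```python
def get_new_unique_value(value: str, existing_values: list[str]) -> str:
    """Append -1, -2, ... to `value` until it is not in `existing_values`."""
    counter = 1
    new_value = value
    while new_value in existing_values:
        new_value = f"{value}-{counter}"
        counter += 1
    return new_value


print(get_new_unique_value("DAPI", ["GFP"]))             # DAPI
print(get_new_unique_value("DAPI", ["DAPI", "DAPI-1"]))  # DAPI-2
```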
+
def check_well_channel_labels(*, well_zarr_path: str) -> None:
+    """
+    Check that the channel labels for a well are unique.
+
+    First identify the channel-labels list for each image in the well, then
+    compare lists and verify their intersection is empty.
+
+    Args:
+        well_zarr_path: path to an OME-NGFF well zarr group.
+    """
+
+    # Iterate over all images (multiplexing cycles, multi-FOVs, ...)
+    group = zarr.open_group(well_zarr_path, mode="r+")
+    image_paths = [image["path"] for image in group.attrs["well"]["images"]]
+    list_of_channel_lists = []
+    for image_path in image_paths:
+        channels = get_omero_channel_list(
+            image_zarr_path=f"{well_zarr_path}/{image_path}"
+        )
+        list_of_channel_lists.append(channels[:])
+
+    # For each pair of channel-labels lists, verify they do not overlap
+    for ind_1, channels_1 in enumerate(list_of_channel_lists):
+        labels_1 = set([c.label for c in channels_1])
+        for ind_2 in range(ind_1):
+            channels_2 = list_of_channel_lists[ind_2]
+            labels_2 = set([c.label for c in channels_2])
+            intersection = labels_1 & labels_2
+            if intersection:
+                hint = (
+                    "Are you parsing fields of view into separate OME-Zarr "
+                    "images? This could lead to non-unique channel labels, "
+                    "and then could be the reason of the error"
+                )
+                raise ValueError(
+                    "Non-unique channel labels\n"
+                    f"{labels_1=}\n{labels_2=}\n{hint}"
+                )
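
The pairwise comparison at the end of this function can be tried out without any Zarr file. The following sketch applies the same idea to plain lists of label strings, using `itertools.combinations` to enumerate the pairs; `find_shared_labels` and the example labels are assumptions made for illustration, and the sketch returns the overlapping labels instead of raising an error.

```python
from itertools import combinations


def find_shared_labels(label_lists: list[list[str]]) -> set[str]:
    """Return the labels that appear in more than one of the given lists."""
    shared = set()
    # Compare every unordered pair of lists exactly once
    for labels_1, labels_2 in combinations(label_lists, 2):
        shared |= set(labels_1) & set(labels_2)
    return shared


# One list of channel labels per image of a (hypothetical) well
per_image_labels = [["DAPI", "GFP"], ["DAPI", "RFP"], ["Cy5"]]
print(find_shared_labels(per_image_labels))  # {'DAPI'}
```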
+
Update a channel list to use it in the OMERO/channels metadata.
+
Given a list of channel dictionaries, update each one of them by:
+ 1. Adding a label (if missing);
+ 2. Adding a set of OMERO-specific attributes;
+ 3. Discarding all other attributes.
+
The new_channels output can be used in the attrs["omero"]["channels"]
+attribute of an image group.
+
Parameters:
+
channels: A list of channel dictionaries (each one must include the
+wavelength_id key).
def define_omero_channels(
+    *,
+    channels: list[OmeroChannel],
+    bit_depth: int,
+    label_prefix: Optional[str] = None,
+) -> list[dict[str, Union[str, int, bool, dict[str, int]]]]:
+    """
+    Update a channel list to use it in the OMERO/channels metadata.
+
+    Given a list of channel dictionaries, update each one of them by:
+        1. Adding a label (if missing);
+        2. Adding a set of OMERO-specific attributes;
+        3. Discarding all other attributes.
+
+    The `new_channels` output can be used in the `attrs["omero"]["channels"]`
+    attribute of an image group.
+
+    Args:
+        channels: A list of channel dictionaries (each one must include the
+            `wavelength_id` key).
+        bit_depth: bit depth.
+        label_prefix: TBD
+
+    Returns:
+        `new_channels`, a new list of consistent channel dictionaries that
+            can be written to OMERO metadata.
+    """
+
+    new_channels = [c.copy(deep=True) for c in channels]
+    default_colors = ["00FFFF", "FF00FF", "FFFF00"]
+
+    for channel in new_channels:
+        wavelength_id = channel.wavelength_id
+
+        # If channel.label is None, set it to a default value
+        if channel.label is None:
+            default_label = wavelength_id
+            if label_prefix:
+                default_label = f"{label_prefix}_{default_label}"
+            logging.warning(
+                f"Missing label for {channel=}, using {default_label=}"
+            )
+            channel.label = default_label
+
+        # If channel.color is None, set it to a default value (use the default
+        # ones for the first three channels, or gray otherwise)
+        if channel.color is None:
+            try:
+                channel.color = default_colors.pop()
+            except IndexError:
+                channel.color = "808080"
+
+        # Set channel.window attribute
+        if channel.window:
+            channel.window.min = 0
+            channel.window.max = 2**bit_depth - 1
+
+    # Check that channel labels are unique for this image
+    labels = [c.label for c in new_channels]
+    if len(set(labels)) < len(labels):
+        raise ValueError(f"Non-unique labels in {new_channels=}")
+
+    new_channels_dictionaries = [
+        c.dict(exclude={"index"}, exclude_unset=True) for c in new_channels
+    ]
+
+    return new_channels_dictionaries
+
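Two details of `define_omero_channels` are easy to miss: default colors are taken with `list.pop()`, so they are handed out from the end of the list, and the window maximum depends only on the bit depth. The sketch below reproduces just these two rules; `assign_default_colors` and `window_max` are hypothetical names introduced for illustration, and colors are plain strings rather than attributes of `OmeroChannel` objects.

```python
from typing import Optional


def assign_default_colors(colors: list[Optional[str]]) -> list[str]:
    """Fill missing colors using the same fallback rule as the listing above.

    Hypothetical helper, for illustration only.
    """
    palette = ["00FFFF", "FF00FF", "FFFF00"]
    assigned = []
    for color in colors:
        if color is None:
            try:
                # pop() removes the LAST element, so "FFFF00" is used first
                color = palette.pop()
            except IndexError:
                # Palette exhausted: fall back to gray
                color = "808080"
        assigned.append(color)
    return assigned


def window_max(bit_depth: int) -> int:
    """Largest representable intensity for a given bit depth."""
    return 2**bit_depth - 1


filled = assign_default_colors([None, "FF0000", None, None, None])
max_16_bit = window_max(16)
```

Only channels without a color consume an entry of the palette, which is why the explicit `"FF0000"` above does not shift the defaults.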
def get_channel_from_image_zarr(
    *,
    image_zarr_path: str,
    label: Optional[str] = None,
    wavelength_id: Optional[str] = None,
) -> OmeroChannel:
    """
    Extract a channel from OME-NGFF zarr attributes.

    This is a helper function that combines `get_omero_channel_list` with
    `get_channel_from_list`.

    Args:
        image_zarr_path: Path to an OME-NGFF image zarr group.
        label: `label` attribute of the channel to be extracted.
        wavelength_id: `wavelength_id` attribute of the channel to be
            extracted.

    Returns:
        A single channel dictionary.
    """
    omero_channels = get_omero_channel_list(image_zarr_path=image_zarr_path)
    channel = get_channel_from_list(
        channels=omero_channels, label=label, wavelength_id=wavelength_id
    )
    return channel
+
Find the channel that has the required values of label and/or
+wavelength_id, and identify its positional index (which also
+corresponds to its index in the zarr array).
+
PARAMETER    DESCRIPTION
channels     A list of channel dictionaries, where each channel includes (at least) the label and wavelength_id keys.
def get_channel_from_list(
    *,
    channels: list[OmeroChannel],
    label: Optional[str] = None,
    wavelength_id: Optional[str] = None,
) -> OmeroChannel:
    """
    Find matching channel in a list.

    Find the channel that has the required values of `label` and/or
    `wavelength_id`, and identify its positional index (which also
    corresponds to its index in the zarr array).

    Args:
        channels: A list of channel dictionaries, where each channel includes
            (at least) the `label` and `wavelength_id` keys.
        label: The label to look for in the list of channels.
        wavelength_id: The wavelength_id to look for in the list of channels.

    Returns:
        A single channel dictionary.
    """

    # Identify matching channels
    if label:
        if wavelength_id:
            # Both label and wavelength_id are specified
            matching_channels = [
                c
                for c in channels
                if (c.label == label and c.wavelength_id == wavelength_id)
            ]
        else:
            # Only label is specified
            matching_channels = [c for c in channels if c.label == label]
    else:
        if wavelength_id:
            # Only wavelength_id is specified
            matching_channels = [
                c for c in channels if c.wavelength_id == wavelength_id
            ]
        else:
            # Neither label nor wavelength_id is specified
            raise ValueError(
                "get_channel requires at least one in {label,wavelength_id} "
                "arguments"
            )

    # Verify that there is one and only one matching channel
    if len(matching_channels) == 0:
        required_match = [f"{label=}", f"{wavelength_id=}"]
        required_match_string = " and ".join(
            [x for x in required_match if "None" not in x]
        )
        raise ChannelNotFoundError(
            f"ChannelNotFoundError: No channel found in {channels}"
            f" for {required_match_string}"
        )
    if len(matching_channels) > 1:
        raise ValueError(f"Inconsistent set of channels: {channels}")

    channel = matching_channels[0]
    channel.index = channels.index(channel)
    return channel
+
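The matching rule of `get_channel_from_list` (filter on whichever of `label` and `wavelength_id` is given, then require exactly one hit) can be sketched with a small stand-in class. `DemoChannel` and `find_channel_index` are hypothetical names used only for this illustration. Unlike the listing above, the sketch tests arguments with `is None` rather than truthiness, and it raises a plain `ValueError` where the library raises `ChannelNotFoundError`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DemoChannel:
    """Hypothetical stand-in for OmeroChannel (only the fields used here)."""

    wavelength_id: str
    label: Optional[str] = None


def find_channel_index(
    channels: list[DemoChannel],
    label: Optional[str] = None,
    wavelength_id: Optional[str] = None,
) -> int:
    """Return the positional index of the single matching channel."""
    if label is None and wavelength_id is None:
        raise ValueError("Provide at least one of label and wavelength_id")
    matches = [
        index
        for index, channel in enumerate(channels)
        if (label is None or channel.label == label)
        and (wavelength_id is None or channel.wavelength_id == wavelength_id)
    ]
    if len(matches) != 1:
        raise ValueError(f"Expected exactly one match, found {len(matches)}")
    return matches[0]


demo_channels = [
    DemoChannel(wavelength_id="A01_C01", label="DAPI"),
    DemoChannel(wavelength_id="A01_C02", label="GFP"),
]
gfp_index = find_channel_index(demo_channels, label="GFP")
dapi_index = find_channel_index(demo_channels, wavelength_id="A01_C01")
```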
def get_omero_channel_list(*, image_zarr_path: str) -> list[OmeroChannel]:
    """
    Extract the list of channels from OME-NGFF zarr attributes.

    Args:
        image_zarr_path: Path to an OME-NGFF image zarr group.

    Returns:
        A list of channel dictionaries.
    """
    group = zarr.open_group(image_zarr_path, mode="r+")
    channels_dicts = group.attrs["omero"]["channels"]
    channels = [OmeroChannel(**c) for c in channels_dicts]
    return channels
+
Keeps only the description part of the docstrings, e.g. from
+
'Custom class for Omero-channel window, based on OME-NGFF v0.4.\n'
+'\n'
+'Attributes:\n'
+'min: Do not change. It will be set to `0` by default.\n'
+'max: Do not change. It will be set according to bitdepth of the images\n'
+' by default (e.g. 65535 for 16 bit images).\n'
+'start: Lower-bound rescaling value for visualization.\n'
+'end: Upper-bound rescaling value for visualization.'
+
+to 'Custom class for Omero-channel window, based on OME-NGFF v0.4.\n'.
+
PARAMETER     DESCRIPTION
old_schema    TBD (TYPE: _Schema)
+ Source code in fractal_tasks_core/dev/lib_args_schemas.py
+
def _remove_attributes_from_descriptions(old_schema: _Schema) -> _Schema:
    """
    Keeps only the description part of the docstrings, e.g. from
    ```
    'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\n'
    '\\n'
    'Attributes:\\n'
    'min: Do not change. It will be set to `0` by default.\\n'
    'max: Do not change. It will be set according to bitdepth of the images\\n'
    '    by default (e.g. 65535 for 16 bit images).\\n'
    'start: Lower-bound rescaling value for visualization.\\n'
    'end: Upper-bound rescaling value for visualization.'
    ```
    to `'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\n'`.

    Args:
        old_schema: The JSON Schema whose `definitions` descriptions are to
            be reduced to their short description.
    """
    new_schema = old_schema.copy()
    if "definitions" in new_schema:
        for name, definition in new_schema["definitions"].items():
            parsed_docstring = docparse(definition["description"])
            new_schema["definitions"][name][
                "description"
            ] = parsed_docstring.short_description
    logging.info("[_remove_attributes_from_descriptions] END")
    return new_schema
+
This is a provisional helper function that replaces newlines with spaces
+and reduces multiple contiguous whitespace characters to a single one.
+Future iterations of the docstrings format/parsing may render this function
+not-needed or obsolete.
def _sanitize_description(string: str) -> str:
    """
    Sanitize a description string.

    This is a provisional helper function that replaces newlines with spaces
    and reduces multiple contiguous whitespace characters to a single one.
    Future iterations of the docstrings format/parsing may render this function
    not-needed or obsolete.

    Args:
        string: The description string to sanitize.
    """
    # Replace newline with space
    new_string = string.replace("\n", " ")
    # Replace N-whitespace characters with a single one
    while "  " in new_string:
        new_string = new_string.replace("  ", " ")
    return new_string
+
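As a quick check of what this sanitization does to a multi-line, indented description, here is a minimal standalone sketch. `sanitize_description` mirrors the logic described above but is a hypothetical public copy written for illustration, not an import from fractal-tasks-core.

```python
def sanitize_description(string: str) -> str:
    """Replace newlines with spaces, then collapse runs of spaces."""
    new_string = string.replace("\n", " ")
    # Repeatedly halve double spaces until only single spaces remain
    while "  " in new_string:
        new_string = new_string.replace("  ", " ")
    return new_string


raw = "Path to the zarr group,\n    not including   the level."
clean = sanitize_description(raw)
```

Note that only newlines and spaces are handled; other whitespace characters such as tabs pass through unchanged.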
def _extract_function(
    module_relative_path: str,
    function_name: str,
    package_name: str = "fractal_tasks_core",
) -> Callable:
    """
    Extract function from a module with the same name.

    Args:
        package_name: Example `fractal_tasks_core`.
        module_relative_path: Example `tasks/create_ome_zarr.py`.
        function_name: Example `create_ome_zarr`.
    """
    if not module_relative_path.endswith(".py"):
        raise ValueError(f"{module_relative_path=} must end with '.py'")
    module_relative_path_no_py = str(
        Path(module_relative_path).with_suffix("")
    )
    module_relative_path_dots = module_relative_path_no_py.replace("/", ".")
    module = import_module(f"{package_name}.{module_relative_path_dots}")
    task_function = getattr(module, function_name)
    return task_function
+
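The path-to-module conversion in `_extract_function` can be tried out on the standard library, without installing any task package. In the sketch below, `module_path_to_dotted` is a hypothetical helper introduced for illustration; it uses `PurePosixPath` (an assumption made here so that the `/` separator is handled the same way on every platform), and the `json` package stands in for a real task package.

```python
from importlib import import_module
from pathlib import PurePosixPath


def module_path_to_dotted(module_relative_path: str) -> str:
    """Turn a path like 'tasks/create_ome_zarr.py' into a dotted module name."""
    if not module_relative_path.endswith(".py"):
        raise ValueError(f"{module_relative_path=} must end with '.py'")
    no_suffix = str(PurePosixPath(module_relative_path).with_suffix(""))
    return no_suffix.replace("/", ".")


dotted = module_path_to_dotted("tasks/create_ome_zarr.py")

# Demonstration with a standard-library package instead of a task package
module = import_module(f"json.{module_path_to_dotted('decoder.py')}")
decoder_class = getattr(module, "JSONDecoder")
```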
def _validate_function_signature(function: Callable):
    """
    Validate the function signature.

    Implement a set of checks for type hints that do not play well with the
    creation of JSON Schema, see
    https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/399.

    Args:
        function: The function whose signature is validated.
    """
    sig = signature(function)
    for param in sig.parameters.values():

        # CASE 1: Check that name is not forbidden
        if param.name in FORBIDDEN_PARAM_NAMES:
            raise ValueError(
                f"Function {function} has argument with name {param.name}"
            )

        # CASE 2: Raise an error for unions
        if str(param.annotation).startswith(("typing.Union[", "Union[")):
            raise ValueError("typing.Union is not supported")

        # CASE 3: Raise an error for "|"
        if "|" in str(param.annotation):
            raise ValueError('Use of "|" in type hints is not supported')

        # CASE 4: Raise an error for optional parameter with given (non-None)
        # default, e.g. Optional[str] = "asd"
        is_annotation_optional = str(param.annotation).startswith(
            ("typing.Optional[", "Optional[")
        )
        default_given = (param.default is not None) and (
            param.default != inspect._empty
        )
        if default_given and is_annotation_optional:
            raise ValueError("Optional parameter has non-None default value")

    logging.info("[_validate_function_signature] END")
    return sig
+
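The annotation checks above operate on the string form of each type hint. The sketch below applies the same string tests to three small example functions and collects messages instead of raising. `find_signature_problems` and the example functions are hypothetical, the forbidden-name check is left out, and the exact string form of `Union`/`Optional` hints depends on the Python version, so the sketch only relies on each problematic hint being caught by at least one check.

```python
import inspect
from typing import Optional, Union


def find_signature_problems(function) -> list[str]:
    """Collect the type-hint problems that the checks above would reject."""
    problems = []
    for param in inspect.signature(function).parameters.values():
        annotation = str(param.annotation)
        if annotation.startswith(("typing.Union[", "Union[")):
            problems.append(f"{param.name}: typing.Union is not supported")
        if "|" in annotation:
            problems.append(f'{param.name}: "|" in type hints is not supported')
        is_optional = annotation.startswith(("typing.Optional[", "Optional["))
        has_default = (
            param.default is not None
            and param.default is not inspect.Parameter.empty
        )
        if is_optional and has_default:
            problems.append(f"{param.name}: Optional with non-None default")
    return problems


def good(a: int, b: str = "x"):
    pass


def bad_union(a: Union[int, str]):
    pass


def bad_optional(a: Optional[str] = "asd"):
    pass


good_problems = find_signature_problems(good)
union_problems = find_signature_problems(bad_union)
optional_problems = find_signature_problems(bad_optional)
```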
def create_docs_info(
    executable: str,
    package: str = "fractal_tasks_core",
) -> str:
    """
    Return task description based on function docstring.
    """
    logging.info("[create_docs_info] START")
    # Extract the function name. Note: this could be made more general, but
    # for the moment we assume the function has the same name as the module.
    function_name = Path(executable).with_suffix("").name
    logging.info(f"[create_docs_info] {function_name=}")
    # Get function description
    docs_info = _get_function_description(
        package_name=package,
        module_relative_path=executable,
        function_name=function_name,
    )
    logging.info("[create_docs_info] END")
    return docs_info
+
def create_docs_link(executable: str) -> str:
    """
    Return link to docs page for a fractal_tasks_core task.
    """
    logging.info("[create_docs_link] START")

    # Extract the function name. Note: this could be made more general, but
    # for the moment we assume the function has the same name as the module.
    function_name = Path(executable).with_suffix("").name
    logging.info(f"[create_docs_link] {function_name=}")
    # Define docs_link
    docs_link = (
        "https://fractal-analytics-platform.github.io/fractal-tasks-core/"
        f"reference/fractal_tasks_core/tasks/{function_name}/"
        f"#fractal_tasks_core.tasks.{function_name}.{function_name}"
    )
    logging.info("[create_docs_link] END")
    return docs_link
+
def _include_titles_for_properties(
    properties: dict[str, dict]
) -> dict[str, dict]:
    """
    Scan through properties of a JSON Schema, and set their title when it is
    missing.

    The title is set to `name.title()`, where `title` is a standard string
    method - see https://docs.python.org/3/library/stdtypes.html#str.title.

    Args:
        properties: The `properties` mapping of a JSON Schema (from property
            names to property schemas).
    """
    new_properties = properties.copy()
    for prop_name, prop in properties.items():
        if "title" not in prop.keys():
            new_prop = prop.copy()
            new_prop["title"] = prop_name.title()
            new_properties[prop_name] = new_prop
    return new_properties
+
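The effect of `str.title()` on typical argument names is worth seeing once, since it lowercases everything except the first letter of each word. The sketch below is a standalone copy of the title-filling logic (`include_titles` is a hypothetical name, and the example schema is made up for illustration).

```python
def include_titles(properties: dict[str, dict]) -> dict[str, dict]:
    """Return a copy of `properties` where missing titles are filled in."""
    new_properties = properties.copy()
    for prop_name, prop in properties.items():
        if "title" not in prop:
            new_prop = prop.copy()
            new_prop["title"] = prop_name.title()
            new_properties[prop_name] = new_prop
    return new_properties


example = {
    "ROI_table_names": {"type": "array"},
    "overwrite": {"type": "boolean", "title": "Overwrite existing data"},
}
with_titles = include_titles(example)
```

Here `"ROI_table_names"` gets the title `"Roi_Table_Names"`, while the existing title of `"overwrite"` is kept and the input dictionary is left unmodified.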
This helper function is similar to write_table, in that it prepares the
appropriate zarr groups (labels and the new-label one) and performs
overwrite-dependent checks. Unlike write_table, this function does not
actually write the label array to the new zarr group; that writing operation
must take place in the actual task function, since in fractal-tasks-core it
is done sequentially on different regions of the zarr array.

What this function does is:

1. Create the labels group, if needed.
2. If overwrite=False, check that the new label does not exist (either in zarr attributes or as a zarr sub-group).
3. Update the labels attribute of the image group.
4. If label_attrs is set, include this set of attributes in the new-label zarr group.

PARAMETER    DESCRIPTION
overwrite    If False, check that the new label does not exist (either in zarr attributes or as a zarr sub-group); if True, propagate the parameter to the create_group method, making it overwrite any existing sub-group with the given name.
def prepare_label_group(
    image_group: zarr.hierarchy.Group,
    label_name: str,
    label_attrs: dict[str, Any],
    overwrite: bool = False,
    logger: Optional[logging.Logger] = None,
) -> zarr.group:
    """
    Set the stage for writing labels to a zarr group.

    This helper function is similar to `write_table`, in that it prepares the
    appropriate zarr groups (`labels` and the new-label one) and performs
    `overwrite`-dependent checks. Unlike `write_table`, this function does not
    actually write the label array to the new zarr group; that writing
    operation must take place in the actual task function, since in
    fractal-tasks-core it is done sequentially on different `region`s of the
    zarr array.

    What this function does is:

    1. Create the `labels` group, if needed.
    2. If `overwrite=False`, check that the new label does not exist (either in
       zarr attributes or as a zarr sub-group).
    3. Update the `labels` attribute of the image group.
    4. If `label_attrs` is set, include this set of attributes in the
       new-label zarr group.

    Args:
        image_group:
            The group to write to.
        label_name:
            The name of the new label; this name also overrides the multiscale
            name in NGFF-image Zarr attributes, if needed.
        overwrite:
            If `False`, check that the new label does not exist (either in zarr
            attributes or as a zarr sub-group); if `True`, propagate the
            parameter to the `create_group` method, making it overwrite any
            existing sub-group with the given name.
        label_attrs:
            Zarr attributes of the label-image group.
        logger:
            The logger to use (if unset, use `logging.getLogger(None)`).

    Returns:
        Zarr group of the new label.
    """

    # Set logger
    if logger is None:
        logger = logging.getLogger(None)

    # Create labels group (if needed) and extract current_labels
    if "labels" not in set(image_group.group_keys()):
        labels_group = image_group.create_group("labels", overwrite=False)
    else:
        labels_group = image_group["labels"]
    current_labels = labels_group.attrs.asdict().get("labels", [])

    # If overwrite=False, check that the new label does not exist (either as a
    # zarr sub-group or as part of the zarr-group attributes)
    if not overwrite:
        if label_name in set(labels_group.group_keys()):
            error_msg = (
                f"Sub-group '{label_name}' of group {image_group.store.path} "
                f"already exists, but `{overwrite=}`.\n"
                "Hint: try setting `overwrite=True`."
            )
            logger.error(error_msg)
            raise OverwriteNotAllowedError(error_msg)
        if label_name in current_labels:
            error_msg = (
                f"Item '{label_name}' already exists in `labels` attribute of "
                f"group {image_group.store.path}, but `{overwrite=}`.\n"
                "Hint: try setting `overwrite=True`."
            )
            logger.error(error_msg)
            raise OverwriteNotAllowedError(error_msg)

    # Update the `labels` metadata of the image group, if needed
    if label_name not in current_labels:
        new_labels = current_labels + [label_name]
        labels_group.attrs["labels"] = new_labels

    # Define new-label group
    label_group = labels_group.create_group(label_name, overwrite=overwrite)

    # Validate attrs against NGFF specs 0.4
    try:
        meta = NgffImageMeta(**label_attrs)
    except ValidationError as e:
        error_msg = (
            "Label attributes do not comply with NGFF image "
            "specifications, as encoded in fractal-tasks-core.\n"
            f"Original error:\nValidationError: {str(e)}"
        )
        logger.error(error_msg)
        raise ValueError(error_msg)
    # Replace multiscale name with label_name, if needed
    current_multiscale_name = meta.multiscale.name
    if current_multiscale_name != label_name:
        logger.warning(
            f"Setting multiscale name to '{label_name}' (old value: "
            f"'{current_multiscale_name}') in label-image NGFF "
            "attributes."
        )
        label_attrs["multiscales"][0]["name"] = label_name
    # Overwrite label_group attributes with label_attrs key/value pairs
    label_group.attrs.put(label_attrs)

    return label_group
+
def _postprocess_output(
    *,
    modified_array: np.ndarray,
    original_array: np.ndarray,
    background: np.ndarray,
) -> np.ndarray:
    """
    Postprocess cellpose output, mainly to restore its original background.

    **NOTE**: The pre/post-processing functions and the
    masked_loading_wrapper are currently meant to work as part of the
    cellpose_segmentation task, with the plan of then making them more
    flexible; see
    https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.

    Args:
        modified_array: The 3D (ZYX) array with the correct object data and
            wrong background data.
        original_array: The 3D (ZYX) array with the wrong object data and
            correct background data.
        background: The 3D (ZYX) boolean array that defines the background.

    Returns:
        The postprocessed array.
    """
    # Restore background
    modified_array[background] = original_array[background]
    return modified_array
+
def _preprocess_input(
    image_array: np.ndarray,
    *,
    region: tuple[slice, ...],
    current_label_path: str,
    ROI_table_path: str,
    ROI_positional_index: int,
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Preprocess a four-dimensional cellpose input.

    This involves:

    - Loading the masking label array for the appropriate ROI;
    - Extracting the appropriate label value from the `ROI_table.obs`
      dataframe;
    - Constructing the background mask, i.e. the region where the masking
      label differs from that specific label value;
    - Setting the background of `image_array` to `0`;
    - Loading the array which will be needed in postprocessing to restore
      background.

    **NOTE 1**: This function relies on V1 of the Fractal table specifications,
    see
    https://fractal-analytics-platform.github.io/fractal-tasks-core/tables/.

    **NOTE 2**: The pre/post-processing functions and the
    masked_loading_wrapper are currently meant to work as part of the
    cellpose_segmentation task, with the plan of then making them more
    flexible; see
    https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.

    Naming of variables refers to a two-step labeling (as in "first identify
    organoids, then look for nuclei inside each organoid"):

    - `"masking"` refers to the labels that are used to identify the object
      vs background (e.g. the organoid labels); these labels already exist.
    - `"current"` refers to the labels that are currently being computed in
      the `cellpose_segmentation` task, e.g. the nuclear labels.

    Args:
        image_array: The 4D CZYX array with image data for a specific ROI.
        region: The ZYX indices of the ROI, in a form like
            `(slice(0, 1), slice(1000, 2000), slice(1000, 2000))`.
        current_label_path: Path to the image used as current label, in a form
            like `/somewhere/plate.zarr/A/01/0/labels/nuclei_in_organoids/0`.
        ROI_table_path: Path of the AnnData table for the masking-label ROIs;
            this is used (together with `ROI_positional_index`) to extract
            `label_value`.
        ROI_positional_index: Index of the current ROI, which is used to
            extract `label_value` from `ROI_table.obs`.

    Returns:
        A tuple with three arrays: the preprocessed image array, the background
            mask, the current label.
    """

    logger.info(f"[_preprocess_input] {image_array.shape=}")
    logger.info(f"[_preprocess_input] {region=}")

    # Check that image data are 4D (CZYX) - FIXME issue 340
    if not image_array.ndim == 4:
        raise ValueError(
            "_preprocess_input requires a 4D "
            f"image_array argument, but {image_array.shape=}"
        )

    # Load the ROI table and its metadata attributes
    ROI_table = ad.read_zarr(ROI_table_path)
    attrs = zarr.group(ROI_table_path).attrs
    logger.info(f"[_preprocess_input] {ROI_table_path=}")
    logger.info(f"[_preprocess_input] {attrs.asdict()=}")
    MaskingROITableAttrs(**attrs.asdict())
    label_relative_path = attrs["region"]["path"]
    column_name = attrs["instance_key"]

    # Check that ROI_table.obs has the right column and extract label_value
    if column_name not in ROI_table.obs.columns:
        # Both fragments are f-strings, so that column_name is interpolated
        raise ValueError(
            f'In _preprocess_input, "{column_name}" '
            f" missing in {ROI_table.obs.columns=}"
        )
    label_value = int(ROI_table.obs[column_name][ROI_positional_index])

    # Load masking-label array (lazily)
    masking_label_path = str(
        Path(ROI_table_path).parent / label_relative_path / "0"
    )
    logger.info(f"{masking_label_path=}")
    masking_label_array = da.from_zarr(masking_label_path)
    logger.info(
        f"[_preprocess_input] {masking_label_path=}, "
        f"{masking_label_array.shape=}"
    )

    # Load current-label array (lazily)
    current_label_array = da.from_zarr(current_label_path)
    logger.info(
        f"[_preprocess_input] {current_label_path=}, "
        f"{current_label_array.shape=}"
    )

    # Load ROI data for current label array
    current_label_region = current_label_array[region].compute()

    # Load ROI data for masking label array, with or without upscaling
    if masking_label_array.shape != current_label_array.shape:
        logger.info("Upscaling of masking label is needed")
        lowres_region = convert_region_to_low_res(
            highres_region=region,
            highres_shape=current_label_array.shape,
            lowres_shape=masking_label_array.shape,
        )
        masking_label_region = masking_label_array[lowres_region].compute()
        masking_label_region = upscale_array(
            array=masking_label_region,
            target_shape=current_label_region.shape,
        )
    else:
        masking_label_region = masking_label_array[region].compute()

    # Check that all shapes match
    shapes = (
        masking_label_region.shape,
        current_label_region.shape,
        image_array.shape[1:],
    )
    if len(set(shapes)) > 1:
        raise ValueError(
            "Shape mismatch:\n"
            f"{current_label_region.shape=}\n"
            f"{masking_label_region.shape=}\n"
            f"{image_array.shape=}"
        )

    # Compute background mask
    background_3D = masking_label_region != label_value
    if (masking_label_region == label_value).sum() == 0:
        raise ValueError(
            f"Label {label_value} is not present in the extracted ROI"
        )

    # Set image background to zero
    n_channels = image_array.shape[0]
    for i in range(n_channels):
        image_array[i, background_3D] = 0

    return (image_array, background_3D, current_label_region)
+
def masked_loading_wrapper(
    *,
    function: Callable,
    image_array: np.ndarray,
    kwargs: Optional[dict] = None,
    use_masks: bool,
    preprocessing_kwargs: Optional[dict] = None,
):
    """
    Wrap a function with some pre/post-processing functions

    Args:
        function: The callable function to be wrapped.
        image_array: The image array to be preprocessed and then used as
            positional argument for `function`.
        kwargs: Keyword arguments for `function`.
        use_masks: If `False`, the wrapper only calls
            `function(*args, **kwargs)`.
        preprocessing_kwargs: Keyword arguments for the preprocessing function
            (see call signature of `_preprocess_input()`).
    """
    # Optional preprocessing
    if use_masks:
        preprocessing_kwargs = preprocessing_kwargs or {}
        (
            image_array,
            background_3D,
            current_label_region,
        ) = _preprocess_input(image_array, **preprocessing_kwargs)
    # Run function
    kwargs = kwargs or {}
    new_label_img = function(image_array, **kwargs)
    # Optional postprocessing
    if use_masks:
        new_label_img = _postprocess_output(
            modified_array=new_label_img,
            original_array=current_label_region,
            background=background_3D,
        )
    return new_label_img
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class Axis(BaseModel):
    """
    Model for an element of `Multiscale.axes`.

    See https://ngff.openmicroscopy.org/0.4/#axes-md.
    """

    name: str
    type: Optional[str] = None
    unit: Optional[str] = None
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class Channel(BaseModel):
    """
    Model for an element of `Omero.channels`.

    See https://ngff.openmicroscopy.org/0.4/#omero-md.
    """

    window: Optional[Window] = None
    label: Optional[str] = None
    family: Optional[str] = None
    color: str
    active: Optional[bool] = None
+
Note 1: The NGFF image is defined in a different model
+(NgffImageMeta), while the Image model only refers to an item of
+Well.images.
+
Note 2: We deviate from NGFF specs, since we allow path to be an
+arbitrary string.
+TODO: include a check like constr(regex=r'^[A-Za-z0-9]+$'), through a
+Pydantic validator.
class ImageInWell(BaseModel):
    """
    Model for an element of `Well.images`.

    **Note 1:** The NGFF image is defined in a different model
    (`NgffImageMeta`), while the `Image` model only refers to an item of
    `Well.images`.

    **Note 2:** We deviate from NGFF specs, since we allow `path` to be an
    arbitrary string.
    TODO: include a check like `constr(regex=r'^[A-Za-z0-9]+$')`, through a
    Pydantic validator.

    See https://ngff.openmicroscopy.org/0.4/#well-md.
    """

    acquisition: Optional[int] = Field(
        None, description="A unique identifier within the context of the plate"
    )
    path: str = Field(
        ..., description="The path for this field of view subgroup"
    )
+
class Multiscale(BaseModel):
    """
    Model for an element of `NgffImageMeta.multiscales`.

    See https://ngff.openmicroscopy.org/0.4/#multiscale-md.
    """

    name: Optional[str] = None
    datasets: list[Dataset] = Field(..., min_items=1)
    version: Optional[str] = None
    axes: list[Axis] = Field(..., max_items=5, min_items=2, unique_items=True)
    coordinateTransformations: Optional[
        list[
            Union[
                ScaleCoordinateTransformation,
                TranslationCoordinateTransformation,
            ]
        ]
    ] = None

    @validator("coordinateTransformations", always=True)
    def _no_global_coordinateTransformations(cls, v):
        """
        Fail if Multiscale has a (global) coordinateTransformations attribute.
        """
        if v is not None:
            raise NotImplementedError(
                "Global coordinateTransformations at the multiscales "
                "level are not currently supported in the fractal-tasks-core "
                "model for the NGFF multiscale."
            )
+
class NgffWellMeta(BaseModel):
    """
    Model for the metadata of a NGFF well.

    See https://ngff.openmicroscopy.org/0.4/#well-md.
    """

    well: Optional[Well] = None

    def get_acquisition_paths(self) -> dict[int, str]:
        """
        Create mapping from acquisition indices to corresponding paths.

        Runs on the well zarr attributes and loads the relative paths in the
        well.

        Returns:
            Dictionary with `(acquisition index: image path)` key/value pairs.

        Raises:
            ValueError:
                If an element of `self.well.images` has no `acquisition`
                    attribute.
            NotImplementedError:
                If acquisitions are not unique.
        """
        acquisition_dict = {}
        for image in self.well.images:
            if image.acquisition is None:
                raise ValueError(
                    "Cannot get acquisition paths for Zarr files without "
                    "'acquisition' metadata at the well level"
                )
            if image.acquisition in acquisition_dict:
                raise NotImplementedError(
                    "The `NgffWellMeta.get_acquisition_paths` method (in "
                    "fractal-tasks-core) does not support wells with "
                    "multiple images of the same acquisition."
                )
            acquisition_dict[image.acquisition] = image.path
        return acquisition_dict
+
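The mapping built by `get_acquisition_paths` can be sketched on plain dictionaries shaped like the items of the `well.images` attribute. `acquisition_paths` below is a hypothetical standalone function written for illustration; it follows the same two failure rules (missing `acquisition`, repeated `acquisition`) but does not use the Pydantic models.

```python
def acquisition_paths(images: list[dict]) -> dict[int, str]:
    """Map each acquisition index to the path of its image."""
    paths: dict[int, str] = {}
    for image in images:
        acquisition = image.get("acquisition")
        if acquisition is None:
            raise ValueError("Image without 'acquisition' metadata")
        if acquisition in paths:
            raise NotImplementedError(
                "Multiple images of the same acquisition are not supported"
            )
        paths[acquisition] = image["path"]
    return paths


# Example well with two multiplexing cycles
well_images = [
    {"path": "0", "acquisition": 0},
    {"path": "1", "acquisition": 1},
]
mapping = acquisition_paths(well_images)
```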
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class ScaleCoordinateTransformation(BaseModel):
    """
    Model for a scale transformation.

    This corresponds to scale-type elements of
    `Dataset.coordinateTransformations` or
    `Multiscale.coordinateTransformations`.
    See https://ngff.openmicroscopy.org/0.4/#trafo-md
    """

    type: Literal["scale"]
    scale: list[float] = Field(..., min_items=2)
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class TranslationCoordinateTransformation(BaseModel):
    """
    Model for a translation transformation.

    This corresponds to translation-type elements of
    `Dataset.coordinateTransformations` or
    `Multiscale.coordinateTransformations`.
    See https://ngff.openmicroscopy.org/0.4/#trafo-md
    """

    type: Literal["translation"]
    translation: list[float] = Field(..., min_items=2)
+
class Well(BaseModel):
    """
    Model for `NgffWellMeta.well`.

    See https://ngff.openmicroscopy.org/0.4/#well-md.
    """

    images: list[ImageInWell] = Field(
        ...,
        description="The images included in this well",
        min_items=1,
        unique_items=True,
    )
    version: Optional[str] = Field(
        None, description="The version of the specification"
    )
+
+
+
+ Source code in fractal_tasks_core/ngff/specs.py
+
class Window(BaseModel):
    """
    Model for `Channel.window`.

    Note that we deviate from NGFF specs by making `start` and `end` optional.
    See https://ngff.openmicroscopy.org/0.4/#omero-md.
    """

    max: float
    min: float
    start: Optional[float] = None
    end: Optional[float] = None
+
This is used to provide a user-friendly error message.
+
+
+ Source code in fractal_tasks_core/ngff/zarr_utils.py
+
class ZarrGroupNotFoundError(ValueError):
    """
    Wrap zarr.errors.GroupNotFoundError

    This is used to provide a user-friendly error message.
    """

    pass
+
def detect_ome_ngff_type(group: zarr.hierarchy.Group) -> str:
    """
    Given a Zarr group, find whether it is an OME-NGFF plate, well or image.

    Args:
        group: Zarr group

    Returns:
        The detected OME-NGFF type (`plate`, `well` or `image`).
    """
    attrs = group.attrs.asdict()
    if "plate" in attrs.keys():
        ngff_type = "plate"
    elif "well" in attrs.keys():
        ngff_type = "well"
    elif "multiscales" in attrs.keys():
        ngff_type = "image"
    else:
        error_msg = (
            "Zarr group cannot be identified as one "
            "of OME-NGFF plate/well/image groups."
        )
        logger.error(error_msg)
        raise ValueError(error_msg)
    logger.info(f"Zarr group identified as OME-NGFF {ngff_type}.")
    return ngff_type
+
def load_NgffImageMeta(zarr_path: str) -> NgffImageMeta:
    """
    Load the attributes of a zarr group and cast them to `NgffImageMeta`.

    Args:
        zarr_path: Path to the zarr group.

    Returns:
        A new `NgffImageMeta` object.
    """
    try:
        zarr_group = zarr.open_group(zarr_path, mode="r")
    except GroupNotFoundError:
        error_msg = (
            "Could not load attributes for the requested image, "
            f"because no Zarr image was found at {zarr_path}"
        )
        logging.error(error_msg)
        raise ZarrGroupNotFoundError(error_msg)
    zarr_attrs = zarr_group.attrs.asdict()
    try:
        return NgffImageMeta(**zarr_attrs)
    except Exception as e:
        logging.error(
            f"Contents of {zarr_path} cannot be cast to NgffImageMeta.\n"
            f"Original error:\n{str(e)}"
        )
        raise e
+
def load_NgffWellMeta(zarr_path: str) -> NgffWellMeta:
    """
    Load the attributes of a zarr group and cast them to `NgffWellMeta`.

    Args:
        zarr_path: Path to the zarr group.

    Returns:
        A new `NgffWellMeta` object.
    """
    try:
        zarr_group = zarr.open_group(zarr_path, mode="r")
    except GroupNotFoundError:
        error_msg = (
            "Could not load attributes for the requested well, "
            f"because no Zarr image was found at {zarr_path}"
        )
        logging.error(error_msg)
        raise ZarrGroupNotFoundError(error_msg)
    zarr_attrs = zarr_group.attrs.asdict()
    try:
        return NgffWellMeta(**zarr_attrs)
    except Exception as e:
        logging.error(
            f"Contents of {zarr_path} cannot be cast to NgffWellMeta.\n"
            f"Original error:\n{str(e)}"
        )
        raise e
+
Starting from on-disk highest-resolution data, build and write to disk a
+pyramid with (num_levels - 1) coarsened levels.
+This function works for 2D, 3D or 4D arrays.
+
PARAMETER: zarrurl
+
DESCRIPTION: Path of the image zarr group, not including the
+multiscale-level path (e.g. "some/path/plate.zarr/B/03/0").
def _is_overlapping_1D_int(
+    line1: Sequence[int],
+    line2: Sequence[int],
+) -> bool:
+    """
+    Given two integer intervals, find whether they overlap.
+
+    This is the same as `is_overlapping_1D` (based on
+    https://stackoverflow.com/a/70023212/19085332), for integer-valued
+    intervals.
+
+    Args:
+        line1: The boundaries of the first interval, written as
+            `[x_min, x_max]`.
+        line2: The boundaries of the second interval, written as
+            `[x_min, x_max]`.
+    """
+    return line1[0] < line2[1] and line2[0] < line1[1]
+
def _is_overlapping_3D_int(box1: list[int], box2: list[int]) -> bool:
+    """
+    Given two three-dimensional integer boxes, find whether they overlap.
+
+    This is the same as `is_overlapping_3D` (based on
+    https://stackoverflow.com/a/70023212/19085332), for integer-valued
+    boxes.
+
+    Args:
+        box1: The boundaries of the first box, written as
+            `[x_min, y_min, z_min, x_max, y_max, z_max]`.
+        box2: The boundaries of the second box, written as
+            `[x_min, y_min, z_min, x_max, y_max, z_max]`.
+    """
+    overlap_x = _is_overlapping_1D_int([box1[0], box1[3]], [box2[0], box2[3]])
+    overlap_y = _is_overlapping_1D_int([box1[1], box1[4]], [box2[1], box2[4]])
+    overlap_z = _is_overlapping_1D_int([box1[2], box1[5]], [box2[2], box2[5]])
+    return overlap_x and overlap_y and overlap_z
+
def is_overlapping_1D(
+    line1: Sequence[float], line2: Sequence[float], tol: float = 1e-10
+) -> bool:
+    """
+    Given two intervals, find whether they overlap.
+
+    This is based on https://stackoverflow.com/a/70023212/19085332, and we
+    additionally use a finite tolerance for floating-point comparisons.
+
+    Args:
+        line1: The boundaries of the first interval, written as
+            `[x_min, x_max]`.
+        line2: The boundaries of the second interval, written as
+            `[x_min, x_max]`.
+        tol: Finite tolerance for floating-point comparisons.
+    """
+    return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol
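To see why the tolerance matters, consider two intervals that are meant to touch at a shared boundary but whose boundary carries floating-point noise. The sketch below repeats the comparison from `is_overlapping_1D` and contrasts it with a tolerance-free variant; `overlaps_strict` is a hypothetical helper written only for this comparison and is not part of the module.

```python
def overlaps_with_tol(line1, line2, tol=1e-10):
    """Same comparison as `is_overlapping_1D` above."""
    return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol


def overlaps_strict(line1, line2):
    """Illustrative tolerance-free variant (not part of the module)."""
    return line1[0] < line2[1] and line2[0] < line1[1]


# Two intervals meant to touch at 0.3; in double precision, 0.1 + 0.2
# evaluates to a value slightly larger than 0.3.
left = [0.0, 0.1 + 0.2]
right = [0.3, 1.0]

print(overlaps_strict(left, right))  # spurious overlap from rounding noise
print(overlaps_with_tol(left, right))  # touching intervals do not overlap
print(overlaps_with_tol([0.0, 1.0], [0.5, 2.0]))  # genuine overlap
```

With the default `tol`, intervals that merely touch (or overlap by less than the tolerance) are reported as non-overlapping.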
+
def is_overlapping_2D(
+    box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10
+) -> bool:
+    """
+    Given two rectangular boxes, find whether they overlap.
+
+    This is based on https://stackoverflow.com/a/70023212/19085332, and we
+    additionally use a finite tolerance for floating-point comparisons.
+
+    Args:
+        box1: The boundaries of the first rectangle, written as
+            `[x_min, y_min, x_max, y_max]`.
+        box2: The boundaries of the second rectangle, written as
+            `[x_min, y_min, x_max, y_max]`.
+        tol: Finite tolerance for floating-point comparisons.
+    """
+    overlap_x = is_overlapping_1D(
+        [box1[0], box1[2]], [box2[0], box2[2]], tol=tol
+    )
+    overlap_y = is_overlapping_1D(
+        [box1[1], box1[3]], [box2[1], box2[3]], tol=tol
+    )
+    return overlap_x and overlap_y
+
def is_overlapping_3D(
+    box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10
+) -> bool:
+    """
+    Given two three-dimensional boxes, find whether they overlap.
+
+    This is based on https://stackoverflow.com/a/70023212/19085332, and we
+    additionally use a finite tolerance for floating-point comparisons.
+
+    Args:
+        box1: The boundaries of the first box, written as
+            `[x_min, y_min, z_min, x_max, y_max, z_max]`.
+        box2: The boundaries of the second box, written as
+            `[x_min, y_min, z_min, x_max, y_max, z_max]`.
+        tol: Finite tolerance for floating-point comparisons.
+    """
+
+    overlap_x = is_overlapping_1D(
+        [box1[0], box1[3]], [box2[0], box2[3]], tol=tol
+    )
+    overlap_y = is_overlapping_1D(
+        [box1[1], box1[4]], [box2[1], box2[4]], tol=tol
+    )
+    overlap_z = is_overlapping_1D(
+        [box1[2], box1[5]], [box2[2], box2[5]], tol=tol
+    )
+    return overlap_x and overlap_y and overlap_z
+
def load_region(
+    data_zyx: da.Array,
+    region: tuple[slice, slice, slice],
+    compute: bool = True,
+    return_as_3D: bool = False,
+) -> Union[da.Array, np.ndarray]:
+    """
+    Load a region from a dask array.
+
+    Can handle both 2D and 3D dask arrays as input and return them as is or
+    always as a 3D array.
+
+    Args:
+        data_zyx: Dask array (2D or 3D).
+        region: Region to load, tuple of three slices (ZYX).
+        compute: Whether to compute the result. If `True`, returns a numpy
+            array. If `False`, returns a dask array.
+        return_as_3D: Whether to return a 3D array, even if the input is 2D.
+
+    Returns:
+        The loaded region. It is a 3D array, unless the input is 2D and
+        `return_as_3D=False` (in which case it is 2D).
+    """
+
+    if len(region) != 3:
+        raise ValueError(
+            f"In `load_region`, `region` must have three elements "
+            f"(given: {len(region)})."
+        )
+
+    if len(data_zyx.shape) == 3:
+        img = data_zyx[region]
+    elif len(data_zyx.shape) == 2:
+        # For 2D input, drop the Z slice and index only along Y and X
+        img = data_zyx[(region[1], region[2])]
+        if return_as_3D:
+            img = np.expand_dims(img, axis=0)
+    else:
+        raise ValueError(
+            f"Shape {data_zyx.shape} not supported for `load_region`"
+        )
+    if compute:
+        return img.compute()
+    else:
+        return img
+
DataFrame with each line representing the bounding-box ROI that
+corresponds to a unique value of mask_array. ROI properties are
+expressed in physical units (with columns defined as elsewhere in this
+module - see e.g. prepare_well_ROI_table), and positions are
+optionally shifted (if origin_zyx is set). An additional column
+label keeps track of the mask_array value corresponding to each
+ROI.
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/roi/v1.py
+
Nested list of indices. The main list has one item per ROI. Each ROI
+item is a list of six integers as in [start_z, end_z, start_y,
+end_y, start_x, end_x]. The array-index interval for a given ROI
+is start_x:end_x along X, and so on for Y and Z.
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/roi/v1.py
+
def convert_ROI_table_to_indices(
+    ROI: ad.AnnData,
+    full_res_pxl_sizes_zyx: Sequence[float],
+    level: int = 0,
+    coarsening_xy: int = 2,
+    cols_xyz_pos: Sequence[str] = [
+        "x_micrometer",
+        "y_micrometer",
+        "z_micrometer",
+    ],
+    cols_xyz_len: Sequence[str] = [
+        "len_x_micrometer",
+        "len_y_micrometer",
+        "len_z_micrometer",
+    ],
+) -> list[list[int]]:
+    """
+    Convert a ROI AnnData table into integer array indices.
+
+    Args:
+        ROI: AnnData table with list of ROIs.
+        full_res_pxl_sizes_zyx:
+            Physical-unit pixel ZYX sizes at the full-resolution pyramid level.
+        level: Pyramid level.
+        coarsening_xy: Linear coarsening factor in the YX plane.
+        cols_xyz_pos: Column names for XYZ ROI positions.
+        cols_xyz_len: Column names for XYZ ROI edges.
+
+    Raises:
+        ValueError:
+            If any of the array indices is negative.
+
+    Returns:
+        Nested list of indices. The main list has one item per ROI. Each ROI
+            item is a list of six integers as in `[start_z, end_z, start_y,
+            end_y, start_x, end_x]`. The array-index interval for a given ROI
+            is `start_x:end_x` along X, and so on for Y and Z.
+    """
+    # Handle empty ROI table
+    if len(ROI) == 0:
+        return []
+
+    # Set pyramid-level pixel sizes
+    pxl_size_z, pxl_size_y, pxl_size_x = full_res_pxl_sizes_zyx
+    prefactor = coarsening_xy**level
+    pxl_size_x *= prefactor
+    pxl_size_y *= prefactor
+
+    x_pos, y_pos, z_pos = cols_xyz_pos[:]
+    x_len, y_len, z_len = cols_xyz_len[:]
+
+    list_indices = []
+    for ROI_name in ROI.obs_names:
+        # Extract data from anndata table
+        x_micrometer = ROI[ROI_name, x_pos].X[0, 0]
+        y_micrometer = ROI[ROI_name, y_pos].X[0, 0]
+        z_micrometer = ROI[ROI_name, z_pos].X[0, 0]
+        len_x_micrometer = ROI[ROI_name, x_len].X[0, 0]
+        len_y_micrometer = ROI[ROI_name, y_len].X[0, 0]
+        len_z_micrometer = ROI[ROI_name, z_len].X[0, 0]
+
+        # Identify indices along the three dimensions
+        start_x = x_micrometer / pxl_size_x
+        end_x = (x_micrometer + len_x_micrometer) / pxl_size_x
+        start_y = y_micrometer / pxl_size_y
+        end_y = (y_micrometer + len_y_micrometer) / pxl_size_y
+        start_z = z_micrometer / pxl_size_z
+        end_z = (z_micrometer + len_z_micrometer) / pxl_size_z
+        indices = [start_z, end_z, start_y, end_y, start_x, end_x]
+
+        # Round indices to the nearest integer
+        indices = list(map(round, indices))
+
+        # Fail for negative indices
+        if min(indices) < 0:
+            raise ValueError(
+                f"ROI {ROI_name} converted into negative array indices.\n"
+                f"ZYX position: {z_micrometer}, {y_micrometer}, "
+                f"{x_micrometer}\n"
+                f"ZYX pixel sizes: {pxl_size_z}, {pxl_size_y}, "
+                f"{pxl_size_x} ({level=})\n"
+                "Hint: As of fractal-tasks-core v0.12, FOV/well ROI "
+                "tables with non-zero origins (e.g. the ones created with "
+                "v0.11) are not supported."
+            )
+
+        # Append ROI indices to the list
+        list_indices.append(indices[:])
+
+    return list_indices
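The arithmetic inside `convert_ROI_table_to_indices` can be followed on a single ROI without building an AnnData table. The sketch below is a simplified illustration under stated assumptions: `roi_to_indices` is a hypothetical helper (not part of fractal-tasks-core) that takes ZYX positions and lengths as plain tuples, and the pixel sizes are made-up values chosen so that the divisions are exact.

```python
def roi_to_indices(
    pos_zyx, len_zyx, full_res_pxl_sizes_zyx, level=0, coarsening_xy=2
):
    """Hypothetical single-ROI version of the conversion shown above."""
    pxl_z, pxl_y, pxl_x = full_res_pxl_sizes_zyx
    # Only Y and X pixel sizes grow with the pyramid level; Z is unchanged.
    prefactor = coarsening_xy**level
    pxl_sizes = (pxl_z, pxl_y * prefactor, pxl_x * prefactor)

    indices = []
    for pos, length, pxl in zip(pos_zyx, len_zyx, pxl_sizes):
        indices.append(round(pos / pxl))  # start index
        indices.append(round((pos + length) / pxl))  # end index
    return indices  # [start_z, end_z, start_y, end_y, start_x, end_x]


# ROI at (z, y, x) = (0, 0, 100) micrometers, with size (10, 100, 200).
full_res = roi_to_indices((0, 0, 100), (10, 100, 200), (1.0, 0.5, 0.5))
level_2 = roi_to_indices((0, 0, 100), (10, 100, 200), (1.0, 0.5, 0.5), level=2)
print(full_res)
print(level_2)
```

At level 2 with `coarsening_xy=2`, the YX indices shrink by a factor of four while the Z indices stay the same. Note that Python's built-in `round` rounds to the nearest integer (with ties going to the even integer), rather than always rounding down.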
+
def convert_ROIs_from_3D_to_2D(
+    adata: ad.AnnData,
+    pixel_size_z: float,
+) -> ad.AnnData:
+    """
+    TBD
+
+    Note that this function is only relevant when the ROIs in adata span the
+    whole extent of the Z axis.
+    TODO: check this explicitly.
+
+    Args:
+        adata: TBD
+        pixel_size_z: TBD
+    """
+
+    # Compress a 3D stack of images to a single Z plane,
+    # with thickness equal to pixel_size_z
+    df = adata.to_df()
+    df["len_z_micrometer"] = pixel_size_z
+
+    # Assign dtype explicitly, to avoid
+    # >> UserWarning: X converted to numpy array with dtype float64
+    # when creating AnnData object
+    df = df.astype(np.float32)
+
+    # Create an AnnData object directly from the DataFrame
+    new_adata = ad.AnnData(X=df)
+
+    # Rename rows and columns
+    new_adata.obs_names = adata.obs_names
+    new_adata.var_names = list(map(str, df.columns))
+
+    return new_adata
+
Construct an empty bounding-box ROI table of given shape.
+
This function mirrors the functionality of array_to_bounding_box_table,
+for the specific case where the array includes no label. The advantages of
+this function are that:
+
1. It does not require computing a whole array of zeros;
+
2. We avoid hardcoding column names in the task functions.
+
RETURNS: DataFrame
+
DESCRIPTION: DataFrame with no rows, and with columns corresponding to the output of
+array_to_bounding_box_table.
+
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/roi/v1.py
+
def empty_bounding_box_table() -> pd.DataFrame:
+    """
+    Construct an empty bounding-box ROI table of given shape.
+
+    This function mirrors the functionality of `array_to_bounding_box_table`,
+    for the specific case where the array includes no label. The advantages of
+    this function are that:
+
+    1. It does not require computing a whole array of zeros;
+    2. We avoid hardcoding column names in the task functions.
+
+    Returns:
+        DataFrame with no rows, and with columns corresponding to the output of
+            `array_to_bounding_box_table`.
+    """
+
+    df_columns = [
+        "x_micrometer",
+        "y_micrometer",
+        "z_micrometer",
+        "len_x_micrometer",
+        "len_y_micrometer",
+        "len_z_micrometer",
+    ]
+    df = pd.DataFrame(columns=[x for x in df_columns] + ["label"])
+    return df
+
Produce a table with ROIs placed on a rectangular grid.
+
The main goal of this ROI grid is to allow processing of smaller subsets of
+the whole array.
+
In a specific case (that is, if the image array was obtained by stitching
+together a set of FOVs placed on a regular grid), the ROIs correspond to
+the original FOVs.
+
TODO: make this flexible with respect to the presence/absence of Z.
def get_image_grid_ROIs(
+    array_shape: tuple[int, int, int],
+    pixels_ZYX: list[float],
+    grid_YX_shape: tuple[int, int],
+) -> ad.AnnData:
+    """
+    Produce a table with ROIs placed on a rectangular grid.
+
+    The main goal of this ROI grid is to allow processing of smaller subsets of
+    the whole array.
+
+    In a specific case (that is, if the image array was obtained by stitching
+    together a set of FOVs placed on a regular grid), the ROIs correspond to
+    the original FOVs.
+
+    TODO: make this flexible with respect to the presence/absence of Z.
+
+    Args:
+        array_shape: ZYX shape of the image array.
+        pixels_ZYX: ZYX pixel sizes in micrometers.
+        grid_YX_shape: Number of grid cells along Y and X.
+
+    Returns:
+        An `AnnData` table with one ROI per grid cell.
+    """
+    shape_z, shape_y, shape_x = array_shape[-3:]
+    grid_size_y, grid_size_x = grid_YX_shape[:]
+    X = []
+    obs_names = []
+    counter = 0
+    start_z = 0
+    len_z = shape_z
+
+    # Find minimal len_y that covers [0,shape_y] with grid_size_y intervals
+    len_y = math.ceil(shape_y / grid_size_y)
+    len_x = math.ceil(shape_x / grid_size_x)
+    for ind_y in range(grid_size_y):
+        start_y = ind_y * len_y
+        tmp_len_y = min(shape_y, start_y + len_y) - start_y
+        for ind_x in range(grid_size_x):
+            start_x = ind_x * len_x
+            tmp_len_x = min(shape_x, start_x + len_x) - start_x
+            X.append(
+                [
+                    start_x * pixels_ZYX[2],
+                    start_y * pixels_ZYX[1],
+                    start_z * pixels_ZYX[0],
+                    tmp_len_x * pixels_ZYX[2],
+                    tmp_len_y * pixels_ZYX[1],
+                    len_z * pixels_ZYX[0],
+                ]
+            )
+            counter += 1
+            obs_names.append(f"ROI_{counter}")
+    ROI_table = ad.AnnData(X=np.array(X, dtype=np.float32))
+    ROI_table.obs_names = obs_names
+    ROI_table.var_names = [
+        "x_micrometer",
+        "y_micrometer",
+        "z_micrometer",
+        "len_x_micrometer",
+        "len_y_micrometer",
+        "len_z_micrometer",
+    ]
+    return ROI_table
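The tiling rule in `get_image_grid_ROIs` is applied independently along Y and X, so it can be illustrated on a single axis. The following sketch uses a hypothetical helper, `grid_intervals`, which is not part of the package; it returns `(start, length)` pairs in pixels and leaves out the conversion to micrometers.

```python
import math


def grid_intervals(shape: int, grid_size: int) -> list[tuple[int, int]]:
    """Hypothetical one-axis version of the grid construction above."""
    # Smallest cell length such that grid_size cells cover [0, shape]
    length = math.ceil(shape / grid_size)
    intervals = []
    for ind in range(grid_size):
        start = ind * length
        # The last cell is clipped so that it does not exceed the array
        clipped_length = min(shape, start + length) - start
        intervals.append((start, clipped_length))
    return intervals


print(grid_intervals(10, 3))
print(grid_intervals(5, 2))
```

When the axis length is not a multiple of the grid size, all cells but the last have the same length and the last one is shorter.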
+
def is_standard_roi_table(table: str) -> bool:
+    """
+    True if the name of the table contains one of the standard Fractal tables
+
+    If a table name is well_ROI_table, FOV_ROI_table or contains either of the
+    two (e.g. registered_FOV_ROI_table), this function returns True.
+
+    Args:
+        table: table name
+
+    Returns:
+        bool of whether it's a standard ROI table
+
+    """
+    if "well_ROI_table" in table:
+        return True
+    elif "FOV_ROI_table" in table:
+        return True
+    else:
+        return False
+
def prepare_FOV_ROI_table(
+    df: pd.DataFrame, metadata: tuple[str, ...] = ("time",)
+) -> ad.AnnData:
+    """
+    Prepare an AnnData table for fields-of-view ROIs.
+
+    Args:
+        df:
+            Input dataframe, possibly prepared through
+            `parse_yokogawa_metadata`.
+        metadata:
+            Columns of `df` to be stored (if present) into AnnData table `obs`.
+    """
+
+    # Make a local copy of the dataframe, to avoid SettingWithCopyWarning
+    df = df.copy()
+
+    # Convert DataFrame index to str, to avoid
+    # >> ImplicitModificationWarning: Transforming to str index
+    # when creating AnnData object.
+    # Do this in the beginning to allow concatenation with e.g. time
+    df.index = df.index.astype(str)
+
+    # Obtain box size in physical units
+    df = df.assign(len_x_micrometer=df.x_pixel * df.pixel_size_x)
+    df = df.assign(len_y_micrometer=df.y_pixel * df.pixel_size_y)
+    df = df.assign(len_z_micrometer=df.z_pixel * df.pixel_size_z)
+
+    # Select only the numeric positional columns needed to define ROIs
+    # (to avoid casting things like the data column to float32,
+    # or using unnecessary columns like bit_depth)
+    positional_columns = [
+        "x_micrometer",
+        "y_micrometer",
+        "z_micrometer",
+        "len_x_micrometer",
+        "len_y_micrometer",
+        "len_z_micrometer",
+        "x_micrometer_original",
+        "y_micrometer_original",
+    ]
+
+    # Assign dtype explicitly, to avoid
+    # >> UserWarning: X converted to numpy array with dtype float64
+    # when creating AnnData object
+    df_roi = df.loc[:, positional_columns].astype(np.float32)
+
+    # Create an AnnData object directly from the DataFrame
+    adata = ad.AnnData(X=df_roi)
+
+    # Reset origin of the FOV ROI table, so that it matches with the well
+    # origin
+    adata = reset_origin(adata)
+
+    # Save any metadata that is specified to the obs df
+    for col in metadata:
+        if col in df:
+            # Cast all metadata to str.
+            # Reason: AnnData Zarr writers don't support all pandas types.
+            # e.g. pandas.core.arrays.datetimes.DatetimeArray can't be written
+            adata.obs[col] = df[col].astype(str)
+
+    # Rename rows and columns: Maintain FOV indices from the dataframe
+    # (they are already enforced to be unique by Pandas and may contain
+    # information for the user, as they are based on the filenames)
+    adata.obs_names = "FOV_" + adata.obs.index
+    adata.var_names = list(map(str, df_roi.columns))
+
+    return adata
+
def prepare_well_ROI_table(
+    df: pd.DataFrame, metadata: tuple[str, ...] = ("time",)
+) -> ad.AnnData:
+    """
+    Prepare an AnnData table with a single well ROI.
+
+    Args:
+        df:
+            Input dataframe, possibly prepared through
+            `parse_yokogawa_metadata`.
+        metadata:
+            Columns of `df` to be stored (if present) into AnnData table `obs`.
+    """
+
+    # Make a local copy of the dataframe, to avoid SettingWithCopyWarning
+    df = df.copy()
+
+    # Convert DataFrame index to str, to avoid
+    # >> ImplicitModificationWarning: Transforming to str index
+    # when creating AnnData object.
+    # Do this in the beginning to allow concatenation with e.g. time
+    df.index = df.index.astype(str)
+
+    # Calculate bounding box extents in physical units
+    for mu in ["x", "y", "z"]:
+        # Obtain per-FOV properties in physical units.
+        # NOTE: a FOV ROI is defined here as the interval [min_micrometer,
+        # max_micrometer], with max_micrometer=min_micrometer+len_micrometer
+        min_micrometer = df[f"{mu}_micrometer"]
+        len_micrometer = df[f"{mu}_pixel"] * df[f"pixel_size_{mu}"]
+        max_micrometer = min_micrometer + len_micrometer
+        # Obtain well bounding box, in physical units
+        min_min_micrometer = min_micrometer.min()
+        max_max_micrometer = max_micrometer.max()
+        df[f"{mu}_micrometer"] = min_min_micrometer
+        df[f"len_{mu}_micrometer"] = max_max_micrometer - min_min_micrometer
+
+    # Select only the numeric positional columns needed to define ROIs
+    # (to avoid casting things like the data column to float32,
+    # or using unnecessary columns like bit_depth)
+    positional_columns = [
+        "x_micrometer",
+        "y_micrometer",
+        "z_micrometer",
+        "len_x_micrometer",
+        "len_y_micrometer",
+        "len_z_micrometer",
+    ]
+
+    # Assign dtype explicitly, to avoid
+    # >> UserWarning: X converted to numpy array with dtype float64
+    # when creating AnnData object
+    df_roi = df.iloc[0:1, :].loc[:, positional_columns].astype(np.float32)
+
+    # Create an AnnData object directly from the DataFrame
+    adata = ad.AnnData(X=df_roi)
+
+    # Reset origin of the single-entry well ROI table
+    adata = reset_origin(adata)
+
+    # Save any metadata that is specified to the obs df
+    for col in metadata:
+        if col in df:
+            # Cast all metadata to str.
+            # Reason: AnnData Zarr writers don't support all pandas types.
+            # e.g. pandas.core.arrays.datetimes.DatetimeArray can't be written
+            adata.obs[col] = df[col].astype(str)
+
+    # Rename rows and columns: Maintain FOV indices from the dataframe
+    # (they are already enforced to be unique by Pandas and may contain
+    # information for the user, as they are based on the filenames)
+    adata.obs_names = "well_" + adata.obs.index
+    adata.var_names = list(map(str, df_roi.columns))
+
+    return adata
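The bounding-box step in `prepare_well_ROI_table` reduces, for each axis, to taking the smallest FOV start and the largest FOV end. The sketch below shows this on one axis with plain lists; `well_extent` is a hypothetical helper and the FOV positions are made-up numbers, not values from any real dataset.

```python
def well_extent(fov_starts, fov_lengths):
    """Hypothetical one-axis version of the well bounding-box computation."""
    # Each FOV spans the interval [start, start + length]
    fov_ends = [start + length for start, length in zip(fov_starts, fov_lengths)]
    origin = min(fov_starts)
    # The well ROI starts at the smallest start and ends at the largest end
    return origin, max(fov_ends) - origin


# Two FOVs of length 120 along X, starting at -50 and at 70 (micrometers)
origin, length = well_extent([-50.0, 70.0], [120.0, 120.0])
print(origin, length)
```

In the function above, the resulting origin is afterwards shifted to zero through `reset_origin`, so only the length carries over unchanged into the final table.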
+
def reset_origin(
+    ROI_table: ad.AnnData,
+    x_pos: str = "x_micrometer",
+    y_pos: str = "y_micrometer",
+    z_pos: str = "z_micrometer",
+) -> ad.AnnData:
+    """
+    Return a copy of a ROI table, with shifted-to-zero origin for some columns.
+
+    Args:
+        ROI_table: Original ROI table.
+        x_pos: Name of the column with X position of ROIs.
+        y_pos: Name of the column with Y position of ROIs.
+        z_pos: Name of the column with Z position of ROIs.
+
+    Returns:
+        A copy of the `ROI_table` AnnData table, where values of `x_pos`,
+            `y_pos` and `z_pos` columns have been shifted by their minimum
+            values.
+    """
+    new_table = ROI_table.copy()
+
+    origin_x = min(new_table[:, x_pos].X[:, 0])
+    origin_y = min(new_table[:, y_pos].X[:, 0])
+    origin_z = min(new_table[:, z_pos].X[:, 0])
+
+    for FOV in new_table.obs_names:
+        new_table[FOV, x_pos] = new_table[FOV, x_pos].X[0, 0] - origin_x
+        new_table[FOV, y_pos] = new_table[FOV, y_pos].X[0, 0] - origin_y
+        new_table[FOV, z_pos] = new_table[FOV, z_pos].X[0, 0] - origin_z
+
+    return new_table
+
def are_ROI_table_columns_valid(*, table: ad.AnnData) -> None:
+    """
+    Verify some validity assumptions on a ROI table.
+
+    This function reflects our current working assumptions (e.g. the presence
+    of some specific columns); this may change in future versions.
+
+    Args:
+        table: AnnData table to be checked
+    """
+
+    # Hard constraint: table columns must include some expected ones
+    columns = [
+        "x_micrometer",
+        "y_micrometer",
+        "z_micrometer",
+        "len_x_micrometer",
+        "len_y_micrometer",
+        "len_z_micrometer",
+    ]
+    for column in columns:
+        if column not in table.var_names:
+            raise ValueError(f"Column {column} is not present in ROI table")
+
Check that list of indices has zero origin on each axis.
+
See fractal-tasks-core issues #530 and #554.
+
This helper function is meant to provide informative error messages when
+ROI tables created with fractal-tasks-core up to v0.11 are used in v0.12.
+This function will be deprecated and removed as soon as the v0.11/v0.12
+transition advances.
+
Note that only FOV_ROI_table and well_ROI_table have to fulfill this
+constraint, while ROI tables obtained through segmentation may have
+arbitrary (non-negative) indices.
+
PARAMETER: list_indices
+
DESCRIPTION: Output of convert_ROI_table_to_indices; each item is like
+[start_z, end_z, start_y, end_y, start_x, end_x].
def check_valid_ROI_indices(
+    list_indices: list[list[int]],
+    ROI_table_name: str,
+) -> None:
+    """
+    Check that list of indices has zero origin on each axis.
+
+    See fractal-tasks-core issues #530 and #554.
+
+    This helper function is meant to provide informative error messages when
+    ROI tables created with fractal-tasks-core up to v0.11 are used in v0.12.
+    This function will be deprecated and removed as soon as the v0.11/v0.12
+    transition advances.
+
+    Note that only `FOV_ROI_table` and `well_ROI_table` have to fulfill this
+    constraint, while ROI tables obtained through segmentation may have
+    arbitrary (non-negative) indices.
+
+    Args:
+        list_indices:
+            Output of `convert_ROI_table_to_indices`; each item is like
+            `[start_z, end_z, start_y, end_y, start_x, end_x]`.
+        ROI_table_name: Name of the ROI table.
+
+    Raises:
+        ValueError:
+            If the table name is `FOV_ROI_table` or `well_ROI_table` and the
+                minimum values of `start_x`, `start_y` and `start_z` are not
+                all zero.
+    """
+    if ROI_table_name not in ["FOV_ROI_table", "well_ROI_table"]:
+        # This validation function only applies to the FOV/well ROI tables
+        # generated with fractal-tasks-core
+        return
+
+    # Find minimum index along ZYX
+    min_start_z = min(item[0] for item in list_indices)
+    min_start_y = min(item[2] for item in list_indices)
+    min_start_x = min(item[4] for item in list_indices)
+
+    # Check that minimum indices are all zero
+    for ind, min_index in enumerate((min_start_z, min_start_y, min_start_x)):
+        if min_index != 0:
+            axis = ["Z", "Y", "X"][ind]
+            raise ValueError(
+                f"{axis} component of ROI indices for table `{ROI_table_name}`"
+                f" does not start with 0, but with {min_index}.\n"
+                "Hint: As of fractal-tasks-core v0.12, FOV/well ROI "
+                "tables with non-zero origins (e.g. the ones created with "
+                "v0.11) are not supported."
+            )
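The zero-origin check can be reproduced on plain index lists. The sketch below is a simplified illustration, not the library function: `first_nonzero_origin_axis` is a hypothetical helper that returns the offending axis and its minimum start index instead of raising, and the index lists are made-up examples.

```python
def first_nonzero_origin_axis(list_indices):
    """Hypothetical helper: report the first axis whose origin is not zero."""
    # Start indices sit at positions 0 (Z), 2 (Y) and 4 (X) of each item
    for axis, position in (("Z", 0), ("Y", 2), ("X", 4)):
        min_start = min(item[position] for item in list_indices)
        if min_start != 0:
            return axis, min_start
    return None


# Two adjacent ROIs that together start at the array origin
zero_origin = [[0, 1, 0, 100, 0, 100], [0, 1, 0, 100, 100, 200]]
# A single ROI shifted by 20 pixels along Y and 30 pixels along X
shifted = [[0, 1, 20, 120, 30, 130]]

print(first_nonzero_origin_axis(zero_origin))
print(first_nonzero_origin_axis(shifted))
```

Only the minimum over all ROIs matters: individual ROIs may start anywhere, as long as at least one of them starts at index zero along each axis.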
+
This function reflects our current working assumptions (e.g. the presence
+of some specific columns); this may change in future versions.
+
If use_masks=True, we verify that the table is a valid
+masking_roi_table as of table specifications V1; if this check fails,
+use_masks should be set to False upstream in the parent function.
def is_ROI_table_valid(*, table_path: str, use_masks: bool) -> Optional[bool]:
+    """
+    Verify some validity assumptions on a ROI table.
+
+    This function reflects our current working assumptions (e.g. the presence
+    of some specific columns); this may change in future versions.
+
+    If `use_masks=True`, we verify that the table is a valid
+    `masking_roi_table` as of table specifications V1; if this check fails,
+    `use_masks` should be set to `False` upstream in the parent function.
+
+    Args:
+        table_path: Path of the AnnData ROI table to be checked.
+        use_masks: If `True`, perform some additional checks related to
+            masked loading.
+
+    Returns:
+        Always `None` if `use_masks=False`, otherwise return whether the table
+            is valid for masked loading.
+    """
+
+    table = ad.read_zarr(table_path)
+    are_ROI_table_columns_valid(table=table)
+    if not use_masks:
+        return None
+
+    # Check whether the table can be used for masked loading
+    attrs = zarr.group(table_path).attrs.asdict()
+    logger.info(f"ROI table at {table_path} has attrs: {attrs}")
+    try:
+        MaskingROITableAttrs(**attrs)
+        logging.info("ROI table can be used for masked loading")
+        return True
+    except ValidationError:
+        logging.info("ROI table cannot be used for masked loading")
+        return False
+
def find_overlaps_in_ROI_indices(
+    list_indices: list[list[int]],
+) -> Optional[tuple[int, int]]:
+    """
+    Given a list of integer ROI indices, find whether there are overlaps.
+
+    Args:
+        list_indices: List of ROI indices, where each element in the list
+            should look like
+            `[start_z, end_z, start_y, end_y, start_x, end_x]`.
+
+    Returns:
+        `None` if no overlap was detected, otherwise a tuple with the
+            positional indices of a pair of overlapping ROIs.
+    """
+
+    for ind_1, ROI_1 in enumerate(list_indices):
+        s_z, e_z, s_y, e_y, s_x, e_x = ROI_1[:]
+        box_1 = [s_x, s_y, s_z, e_x, e_y, e_z]
+        for ind_2 in range(ind_1):
+            ROI_2 = list_indices[ind_2]
+            s_z, e_z, s_y, e_y, s_x, e_x = ROI_2[:]
+            box_2 = [s_x, s_y, s_z, e_x, e_y, e_z]
+            if _is_overlapping_3D_int(box_1, box_2):
+                return (ind_1, ind_2)
+    return None
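The pairwise search above can be exercised without the rest of the module. The following self-contained sketch restates the integer overlap test directly on the `[start_z, end_z, start_y, end_y, start_x, end_x]` layout; `rois_overlap` and `find_first_overlap` are hypothetical names chosen for this illustration, and the ROI indices are made-up.

```python
from typing import Optional


def rois_overlap(roi_1: list[int], roi_2: list[int]) -> bool:
    """Hypothetical helper: strict integer overlap along Z, Y and X."""
    # Items (0, 1), (2, 3), (4, 5) are the (start, end) pairs for Z, Y, X
    return all(
        roi_1[i] < roi_2[i + 1] and roi_2[i] < roi_1[i + 1] for i in (0, 2, 4)
    )


def find_first_overlap(list_indices: list[list[int]]) -> Optional[tuple[int, int]]:
    """Compare each ROI with all previous ones, as in the function above."""
    for ind_1, roi_1 in enumerate(list_indices):
        for ind_2 in range(ind_1):
            if rois_overlap(roi_1, list_indices[ind_2]):
                return (ind_1, ind_2)
    return None


rois = [
    [0, 1, 0, 10, 0, 10],
    [0, 1, 0, 10, 10, 20],  # touches the first ROI along X, no overlap
    [0, 1, 5, 15, 5, 15],  # overlaps the first ROI
]
print(find_first_overlap(rois))
print(find_first_overlap(rois[:2]))
```

Since each ROI is only compared with the ones that precede it, the returned pair always has the larger positional index first, and the search stops at the first overlap found.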
+
Run an overlap check over all wells and optionally plot overlaps.
+
This function is currently only used in tests and examples.
+
The plotting_function parameter is exposed so that other tools (see
+examples in this repository) may use it to show the FOV ROIs. Its arguments
+are: [xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well].
def run_overlap_check(
+    site_metadata: pd.DataFrame,
+    tol: float = 1e-10,
+    plotting_function: Optional[Callable] = None,
+):
+    """
+    Run an overlap check over all wells and optionally plot overlaps.
+
+    This function is currently only used in tests and examples.
+
+    The `plotting_function` parameter is exposed so that other tools (see
+    examples in this repository) may use it to show the FOV ROIs. Its arguments
+    are: `[xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well]`.
+
+    Args:
+        site_metadata: TBD
+        tol: TBD
+        plotting_function: TBD
+    """
+
+    if plotting_function is None:
+
+        def plotting_function(
+            xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well
+        ):
+            pass
+
+    wells = site_metadata.index.unique(level="well_id")
+    overlapping_FOVs = []
+    for selected_well in wells:
+        overlap_curr_well = check_well_for_FOV_overlap(
+            site_metadata,
+            selected_well=selected_well,
+            tol=tol,
+            plotting_function=plotting_function,
+        )
+        if overlap_curr_well:
+            print(selected_well)
+            overlapping_FOVs.append(overlap_curr_well)
+
+    return overlapping_FOVs
+
This is the general interface that should allow for a smooth coexistence of
+tables with different fractal_table_version values. Currently only V1 is
+defined and implemented. The assumption is that V2 should only change:
+
1. The lower-level writing function (that is, _write_table_v2).
+
2. The type of the table (which would also reflect into a more general type
+ hint for table, in the current function);
+
3. A different definition of what values of table_attrs are valid or
+ invalid, to be implemented in _write_table_v2.
+
4. Possibly, additional parameters for _write_table_v2, which will be
+ optional parameters of write_table (so that write_table remains
+ valid for both V1 and V2).
+
+
PARAMETER: image_group
+
DESCRIPTION: The image Zarr group where the table will be written.
+
PARAMETER: overwrite
+
DESCRIPTION: If False, check that the new table does not exist (either as a
+zarr sub-group or as part of the zarr-group attributes). In all
+cases, propagate parameter to low-level functions, to determine the
+behavior in case of an existing sub-group named as in table_name.
+
PARAMETER: table_attrs
+
DESCRIPTION: If set, overwrite table_group attributes with table_attrs key/value
+pairs. If table_type is not provided, then table_attrs must
+include the type key.
def write_table(
+    image_group: zarr.hierarchy.Group,
+    table_name: str,
+    table: ad.AnnData,
+    overwrite: bool = False,
+    table_type: Optional[str] = None,
+    table_attrs: Optional[dict[str, Any]] = None,
+) -> zarr.group:
+    """
+    Write a table to a Zarr group.
+
+    This is the general interface that should allow for a smooth coexistence of
+    tables with different `fractal_table_version` values. Currently only V1 is
+    defined and implemented. The assumption is that V2 should only change:
+
+    1. The lower-level writing function (that is, `_write_table_v2`).
+    2. The type of the table (which would also reflect into a more general type
+        hint for `table`, in the current function);
+    3. A different definition of what values of `table_attrs` are valid or
+        invalid, to be implemented in `_write_table_v2`.
+    4. Possibly, additional parameters for `_write_table_v2`, which will be
+        optional parameters of `write_table` (so that `write_table` remains
+        valid for both V1 and V2).
+
+    Args:
+        image_group:
+            The image Zarr group where the table will be written.
+        table_name:
+            The name of the table.
+        table:
+            The table object (currently an AnnData object, for V1).
+        overwrite:
+            If `False`, check that the new table does not exist (either as a
+            zarr sub-group or as part of the zarr-group attributes). In all
+            cases, propagate parameter to low-level functions, to determine the
+            behavior in case of an existing sub-group named as in `table_name`.
+        table_type: `type` attribute for the table; in case `type` is also
+            present in `table_attrs`, this function argument takes priority.
+        table_attrs:
+            If set, overwrite table_group attributes with table_attrs key/value
+            pairs. If `table_type` is not provided, then `table_attrs` must
+            include the `type` key.
+
+    Returns:
+        Zarr group of the table.
+    """
+    # Choose which version to use, giving priority to a value that is present
+    # in table_attrs
+    version = __FRACTAL_TABLE_VERSION__
+    if table_attrs is not None:
+        try:
+            version = table_attrs["fractal_table_version"]
+        except KeyError:
+            pass
+
+    if version == "1":
+        return _write_table_v1(
+            image_group,
+            table_name,
+            table,
+            overwrite,
+            table_type,
+            table_attrs,
+        )
+    else:
+        raise NotImplementedError(
+            f"fractal_table_version='{version}' is not supported"
+        )
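The version-selection step at the top of `write_table` can be isolated as follows. This is only a sketch: `DEFAULT_TABLE_VERSION` is a hypothetical stand-in for `__FRACTAL_TABLE_VERSION__` (its value `"1"` is an assumption made here, not something stated above), and `select_table_version` is a hypothetical helper.

```python
from typing import Any, Optional

# Assumption: stand-in for the package-level default version
DEFAULT_TABLE_VERSION = "1"


def select_table_version(table_attrs: Optional[dict[str, Any]] = None) -> str:
    """Hypothetical helper: pick the table version, as done in write_table."""
    version = DEFAULT_TABLE_VERSION
    if table_attrs is not None:
        # A version set in table_attrs takes priority over the default
        version = table_attrs.get("fractal_table_version", version)
    if version != "1":
        raise NotImplementedError(
            f"fractal_table_version='{version}' is not supported"
        )
    return version


print(select_table_version())
print(select_table_version({"type": "roi_table"}))
```

Note that the comparison is against the string `"1"`, so under this logic an integer `1` in `table_attrs` would not match and would be reported as unsupported.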
+
def _write_elem_with_overwrite(
+    group: zarr.hierarchy.Group,
+    key: str,
+    elem: Any,
+    *,
+    overwrite: bool,
+    logger: Optional[logging.Logger] = None,
+) -> None:
+    """
+    Wrap `anndata.experimental.write_elem`, to include `overwrite` parameter.
+
+    See docs for the original function
+    [here](https://anndata.readthedocs.io/en/stable/generated/anndata.experimental.write_elem.html).
+
+    This function writes `elem` to the sub-group `key` of `group`. The
+    `overwrite`-related expected behavior is:
+
+    * if the sub-group does not exist, create it (independently of
+      `overwrite`);
+    * if the sub-group already exists and `overwrite=True`, overwrite the
+      sub-group;
+    * if the sub-group already exists and `overwrite=False`, fail.
+
+    Note that this version of the wrapper does not include the original
+    `dataset_kwargs` parameter.
+
+    Args:
+        group:
+            The group to write to.
+        key:
+            The key to write to in the group. Note that absolute paths will be
+            written from the root.
+        elem:
+            The element to write. Typically an in-memory object, e.g. an
+            AnnData, pandas dataframe, scipy sparse matrix, etc.
+        overwrite:
+            If `True`, overwrite the `key` sub-group (if present); if `False`
+            and `key` sub-group exists, raise an error.
+        logger:
+            The logger to use (if unset, use `logging.getLogger(None)`)
+
+    Raises:
+        OverwriteNotAllowedError:
+            If `overwrite=False` and the sub-group already exists.
+    """
+
+    # Set logger
+    if logger is None:
+        logger = logging.getLogger(None)
+
+    if key in set(group.group_keys()):
+        if not overwrite:
+            error_msg = (
+                f"Sub-group '{key}' of group {group.store.path} "
+                f"already exists, but `{overwrite=}`.\n"
+                "Hint: try setting `overwrite=True`."
+            )
+            logger.error(error_msg)
+            raise OverwriteNotAllowedError(error_msg)
+    write_elem(group, key, elem)
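The three `overwrite` cases listed in the docstring can be checked with a small in-memory model. In the sketch below a plain `dict` plays the role of the Zarr group, `write_with_overwrite` is a hypothetical stand-in for the wrapper, and the exception class is redefined locally only so that the example is self-contained.

```python
class OverwriteNotAllowedError(RuntimeError):
    """Local stand-in for the exception raised by the wrapper above."""


def write_with_overwrite(store: dict, key: str, elem, *, overwrite: bool) -> None:
    """Hypothetical dict-based model of the overwrite behavior."""
    if key in store and not overwrite:
        raise OverwriteNotAllowedError(
            f"Sub-group '{key}' already exists, but {overwrite=}."
        )
    # Reached if the key is new, or if overwriting is allowed
    store[key] = elem


store = {}
write_with_overwrite(store, "FOV_ROI_table", "v1", overwrite=False)  # created
write_with_overwrite(store, "FOV_ROI_table", "v2", overwrite=True)  # replaced
try:
    write_with_overwrite(store, "FOV_ROI_table", "v3", overwrite=False)
except OverwriteNotAllowedError as error:
    print(f"Refused: {error}")
print(store)
```

As in the wrapper, the existence check happens before any write, so a refused call leaves the existing content untouched.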
+
Handle multiple options for writing an AnnData table to a zarr group.
+
1. Create the tables group, if needed.
+
2. If overwrite=False, check that the new table does not exist (either in
+ zarr attributes or as a zarr sub-group).
+
3. Call the _write_elem_with_overwrite wrapper with the appropriate
+ overwrite parameter.
+
4. Update the tables attribute of the image group.
+
5. Validate table_type and table_attrs according to Fractal table
+ specifications, and raise errors/warnings if needed; then set the
+ appropriate attributes in the new-table Zarr group.
+
PARAMETER: overwrite
+
DESCRIPTION: If False, check that the new table does not exist (either as a
+zarr sub-group or as part of the zarr-group attributes). In all
+cases, propagate parameter to _write_elem_with_overwrite, to
+determine the behavior in case of an existing sub-group named as
+table_name.
+
PARAMETER: table_attrs
+
DESCRIPTION: If set, overwrite table_group attributes with table_attrs key/value
+pairs. If table_type is not provided, then table_attrs must
+include the type key.
def _write_table_v1(
+    image_group: zarr.hierarchy.Group,
+    table_name: str,
+    table: ad.AnnData,
+    overwrite: bool = False,
+    table_type: Optional[str] = None,
+    table_attrs: Optional[dict[str, Any]] = None,
+) -> zarr.hierarchy.Group:
+    """
+    Handle multiple options for writing an AnnData table to a zarr group.
+
+    1. Create the `tables` group, if needed.
+    2. If `overwrite=False`, check that the new table does not exist (either in
+       zarr attributes or as a zarr sub-group).
+    3. Call the `_write_elem_with_overwrite` wrapper with the appropriate
+       `overwrite` parameter.
+    4. Update the `tables` attribute of the image group.
+    5. Validate `table_type` and `table_attrs` according to Fractal table
+       specifications, and raise errors/warnings if needed; then set the
+       appropriate attributes in the new-table Zarr group.
+
+    Args:
+        image_group:
+            The group to write to.
+        table_name:
+            The name of the new table.
+        table:
+            The AnnData table to write.
+        overwrite:
+            If `False`, check that the new table does not exist (either as a
+            zarr sub-group or as part of the zarr-group attributes). In all
+            cases, propagate the parameter to `_write_elem_with_overwrite`, to
+            determine the behavior in case of an existing sub-group named
+            `table_name`.
+        table_type:
+            `type` attribute for the table; in case `type` is also
+            present in `table_attrs`, this function argument takes priority.
+        table_attrs:
+            If set, overwrite table_group attributes with table_attrs key/value
+            pairs. If `table_type` is not provided, then `table_attrs` must
+            include the `type` key.
+
+    Returns:
+        Zarr group of the new table.
+    """
+
+    # Create tables group (if needed) and extract current_tables
+    if "tables" not in set(image_group.group_keys()):
+        tables_group = image_group.create_group("tables", overwrite=False)
+    else:
+        tables_group = image_group["tables"]
+    current_tables = tables_group.attrs.asdict().get("tables", [])
+
+    # If overwrite=False, check that the new table does not exist (either as a
+    # zarr sub-group or as part of the zarr-group attributes)
+    if not overwrite:
+        if table_name in set(tables_group.group_keys()):
+            error_msg = (
+                f"Sub-group '{table_name}' of group {image_group.store.path} "
+                f"already exists, but `{overwrite=}`.\n"
+                "Hint: try setting `overwrite=True`."
+            )
+            logger.error(error_msg)
+            raise OverwriteNotAllowedError(error_msg)
+        if table_name in current_tables:
+            error_msg = (
+                f"Item '{table_name}' already exists in `tables` attribute of "
+                f"group {image_group.store.path}, but `{overwrite=}`.\n"
+                "Hint: try setting `overwrite=True`."
+            )
+            logger.error(error_msg)
+            raise OverwriteNotAllowedError(error_msg)
+
+    # Always include fractal-roi-table version in table attributes
+    if table_attrs is None:
+        table_attrs = dict(fractal_table_version="1")
+    elif table_attrs.get("fractal_table_version", None) is None:
+        table_attrs["fractal_table_version"] = "1"
+
+    # Set type attribute for the table
+    table_type_from_attrs = table_attrs.get("type", None)
+    if table_type is not None:
+        if table_type_from_attrs is not None:
+            logger.warning(
+                f"Setting table type to '{table_type}' (and overriding "
+                f"'{table_type_from_attrs}' attribute)."
+            )
+        table_attrs["type"] = table_type
+    else:
+        if table_type_from_attrs is None:
+            raise ValueError(
+                "Missing attribute `type` for table; this must be provided"
+                " either via `table_type` or within `table_attrs`."
+            )
+
+    # Prepare/validate attributes for the table
+    table_type = table_attrs.get("type", None)
+    if table_type == "roi_table":
+        pass
+    elif table_type == "masking_roi_table":
+        try:
+            MaskingROITableAttrs(**table_attrs)
+        except ValidationError as e:
+            error_msg = (
+                "Table attributes do not comply with Fractal "
+                "`masking_roi_table` specifications V1.\nOriginal error:\n"
+                f"ValidationError: {str(e)}"
+            )
+            logger.error(error_msg)
+            raise ValueError(error_msg)
+    elif table_type == "feature_table":
+        try:
+            FeatureTableAttrs(**table_attrs)
+        except ValidationError as e:
+            error_msg = (
+                "Table attributes do not comply with Fractal "
+                "`feature_table` specifications V1.\nOriginal error:\n"
+                f"ValidationError: {str(e)}"
+            )
+            logger.error(error_msg)
+            raise ValueError(error_msg)
+    else:
+        logger.warning(f"Unknown table type `{table_type}`.")
+
+    # If it's all OK, proceed and write the table
+    _write_elem_with_overwrite(
+        tables_group,
+        table_name,
+        table,
+        overwrite=overwrite,
+    )
+    table_group = tables_group[table_name]
+
+    # Update the `tables` metadata of the image group, if needed
+    if table_name not in current_tables:
+        new_tables = current_tables + [table_name]
+        tables_group.attrs["tables"] = new_tables
+
+    # Update table_group attributes with table_attrs key/value pairs
+    table_group.attrs.update(**table_attrs)
+
+    return table_group
+
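The precedence between `table_type` and the `type` key of `table_attrs` can be sketched in isolation. This is a simplified illustration under stated assumptions: `resolve_table_attrs` is a hypothetical helper name, it works on a copy of the attributes instead of the caller's dict, and it leaves out the logging and Pydantic validation steps of the function above.

```python
from typing import Any, Optional


def resolve_table_attrs(
    table_type: Optional[str],
    table_attrs: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Return a copy of the attributes with version and `type` filled in."""
    attrs = dict(table_attrs or {})
    # Always include the table-specification version
    if attrs.get("fractal_table_version") is None:
        attrs["fractal_table_version"] = "1"
    if table_type is not None:
        # The explicit argument takes priority over a `type` in the attributes
        attrs["type"] = table_type
    elif attrs.get("type") is None:
        raise ValueError(
            "Missing attribute `type`; provide it via `table_type` "
            "or within `table_attrs`."
        )
    return attrs


from_argument = resolve_table_attrs("roi_table", None)
overridden = resolve_table_attrs("roi_table", {"type": "feature_table"})
from_attrs = resolve_table_attrs(None, {"type": "feature_table"})
try:
    resolve_table_attrs(None, {})
    missing_type_fails = False
except ValueError:
    missing_type_fails = True
```

Working on a copy is a design choice of this sketch only; it avoids modifying a dict that the caller may reuse for several tables.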
Applies pre-calculated registration to ROI tables.
+
Apply pre-calculated registration such that resulting ROIs contain
+the consensus aligned region between all cycles.
+
Parallelization level: well
+
PARAMETER: DESCRIPTION
+
input_paths: List of input paths where the image data is stored as
+OME-Zarrs. Should point to the parent folder containing one or many
+OME-Zarr files, not the actual OME-Zarr file. Example:
+["/some/path/"]. This task only supports a single input path.
+(standard argument for Fractal tasks, managed by Fractal server).
component: Path to the OME-Zarr image in the OME-Zarr plate that is
+processed. Example: "some_plate.zarr/B/03/0".
+(standard argument for Fractal tasks, managed by Fractal server).
roi_table: Name of the ROI table over which the task loops to
+calculate the registration. Examples: FOV_ROI_table => loop over
+the fields of view, well_ROI_table => process the whole well as
+one image.
@validate_arguments
+def apply_registration_to_ROI_tables(
+    *,
+    # Fractal arguments
+    input_paths: Sequence[str],
+    output_path: str,
+    component: str,
+    metadata: dict[str, Any],
+    # Task-specific arguments
+    roi_table: str = "FOV_ROI_table",
+    reference_cycle: int = 0,
+    new_roi_table: Optional[str] = None,
+) -> dict[str, Any]:
+    """
+    Applies pre-calculated registration to ROI tables.
+
+    Apply pre-calculated registration such that resulting ROIs contain
+    the consensus aligned region between all cycles.
+
+    Parallelization level: well
+
+    Args:
+        input_paths: List of input paths where the image data is stored as
+            OME-Zarrs. Should point to the parent folder containing one or
+            many OME-Zarr files, not the actual OME-Zarr file. Example:
+            `["/some/path/"]`. This task only supports a single input path.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        output_path: This parameter is not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        component: Path to the OME-Zarr image in the OME-Zarr plate that is
+            processed. Example: `"some_plate.zarr/B/03/0"`.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        metadata: This parameter is not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        roi_table: Name of the ROI table over which the task loops to
+            calculate the registration. Examples: `FOV_ROI_table` => loop over
+            the fields of view, `well_ROI_table` => process the whole well as
+            one image.
+        reference_cycle: Which cycle to register against. Defaults to 0,
+            which is the first OME-Zarr image in the well, usually the first
+            cycle that was provided.
+        new_roi_table: Optional name for the new, registered ROI table. If no
+            name is given, it will default to "registered_" + `roi_table`.
+
+    """
+    if not new_roi_table:
+        new_roi_table = "registered_" + roi_table
+    logger.info(
+        f"Running for {input_paths=}, {component=}. \n"
+        f"Applying translation registration to {roi_table=} and storing it "
+        f"as {new_roi_table=}."
+    )
+
+    well_zarr = f"{input_paths[0]}/{component}"
+    ngff_well_meta = load_NgffWellMeta(well_zarr)
+    acquisition_dict = ngff_well_meta.get_acquisition_paths()
+    logger.info(
+        "Calculating common registration for the following cycles: "
+        f"{acquisition_dict}"
+    )
+
+    # TODO: Allow a filter on which acquisitions should get processed?
+
+    # Collect all the ROI tables
+    roi_tables = {}
+    roi_tables_attrs = {}
+    for acq in acquisition_dict.keys():
+        acq_path = acquisition_dict[acq]
+        curr_ROI_table = ad.read_zarr(
+            f"{well_zarr}/{acq_path}/tables/{roi_table}"
+        )
+        curr_ROI_table_group = zarr.open_group(
+            f"{well_zarr}/{acq_path}/tables/{roi_table}", mode="r"
+        )
+        curr_ROI_table_attrs = curr_ROI_table_group.attrs.asdict()
+
+        # For reference_cycle acquisition, handle the fact that it doesn't
+        # have the shifts
+        if acq == reference_cycle:
+            curr_ROI_table = add_zero_translation_columns(curr_ROI_table)
+        # Check for valid ROI tables
+        are_ROI_table_columns_valid(table=curr_ROI_table)
+        translation_columns = [
+            "translation_z",
+            "translation_y",
+            "translation_x",
+        ]
+        if curr_ROI_table.var.index.isin(translation_columns).sum() != 3:
+            raise ValueError(
+                f"Cycle {acq}'s {roi_table} does not contain the "
+                f"translation columns {translation_columns} necessary to use "
+                "this task."
+            )
+        roi_tables[acq] = curr_ROI_table
+        roi_tables_attrs[acq] = curr_ROI_table_attrs
+
+    # Check that all acquisitions have the same ROIs
+    rois = roi_tables[reference_cycle].obs.index
+    for acq, acq_roi_table in roi_tables.items():
+        if not (acq_roi_table.obs.index == rois).all():
+            raise ValueError(
+                f"Acquisition {acq} does not contain the same ROIs as the "
+                f"reference acquisition {reference_cycle}:\n"
+                f"{acq}: {acq_roi_table.obs.index}\n"
+                f"{reference_cycle}: {rois}"
+            )
+
+    roi_table_dfs = [
+        roi_table.to_df().loc[:, translation_columns]
+        for roi_table in roi_tables.values()
+    ]
+    logger.info("Calculating min & max translation across cycles.")
+    max_df, min_df = calculate_min_max_across_dfs(roi_table_dfs)
+    shifted_rois = {}
+    # Loop over acquisitions
+    for acq in acquisition_dict.keys():
+        shifted_rois[acq] = apply_registration_to_single_ROI_table(
+            roi_tables[acq], max_df, min_df
+        )
+
+        # TODO: Drop translation columns from this table?
+
+        logger.info(
+            f"Write the registered ROI table {new_roi_table} for {acq=}"
+        )
+        # Save the shifted ROI table as a new table, in the same image
+        # (acquisition path) that the original table was read from
+        image_group = zarr.group(f"{well_zarr}/{acquisition_dict[acq]}")
+        write_table(
+            image_group,
+            new_roi_table,
+            shifted_rois[acq],
+            table_attrs=roi_tables_attrs[acq],
+        )
+
+    # TODO: Optionally apply registration to other tables as well?
+    # e.g. to well_ROI_table based on FOV_ROI_table
+    # => out of scope for the initial task, apply registration separately
+    # to each table
+    # Easiest implementation: Apply average shift calculated here to other
+    # ROIs. From many to 1 (e.g. FOV => well) => average shift, but crop len
+    # From well to many (e.g. well to FOVs) => average shift, crop len by that
+    # amount
+    # Many to many (FOVs to organoids) => tricky because of matching
+
+    return {}
+
def apply_registration_to_single_ROI_table(
+    roi_table: ad.AnnData,
+    max_df: pd.DataFrame,
+    min_df: pd.DataFrame,
+) -> ad.AnnData:
+    """
+    Applies the registration to a ROI table.
+
+    Calculates the new position as: p = position + max(shift, 0) - own_shift
+    Calculates the new len as: l = len - max(shift, 0) + min(shift, 0)
+
+    Args:
+        roi_table: AnnData table which contains a Fractal ROI table.
+            Rows are ROIs.
+        max_df: Max translation shift in z, y, x for each ROI. Rows are ROIs,
+            columns are translation_z, translation_y, translation_x.
+        min_df: Min translation shift in z, y, x for each ROI. Rows are ROIs,
+            columns are translation_z, translation_y, translation_x.
+    Returns:
+        ROI table where all ROIs are registered to the smallest common area
+        across all cycles.
+    """
+    roi_table = copy.deepcopy(roi_table)
+    rois = roi_table.obs.index
+    # Fail as soon as any ROI label differs (not only when all of them differ)
+    if (rois != max_df.index).any() or (rois != min_df.index).any():
+        raise ValueError(
+            "ROI table and max & min translation need to contain the same "
+            f"ROIs, but they were {rois=}, {max_df.index=}, {min_df.index=}"
+        )
+
+    for roi in rois:
+        roi_table[[roi], ["z_micrometer"]] = (
+            roi_table[[roi], ["z_micrometer"]].X
+            + float(max_df.loc[roi, "translation_z"])
+            - roi_table[[roi], ["translation_z"]].X
+        )
+        roi_table[[roi], ["y_micrometer"]] = (
+            roi_table[[roi], ["y_micrometer"]].X
+            + float(max_df.loc[roi, "translation_y"])
+            - roi_table[[roi], ["translation_y"]].X
+        )
+        roi_table[[roi], ["x_micrometer"]] = (
+            roi_table[[roi], ["x_micrometer"]].X
+            + float(max_df.loc[roi, "translation_x"])
+            - roi_table[[roi], ["translation_x"]].X
+        )
+        # This calculation only works if all ROIs are the same size initially!
+        roi_table[[roi], ["len_z_micrometer"]] = (
+            roi_table[[roi], ["len_z_micrometer"]].X
+            - float(max_df.loc[roi, "translation_z"])
+            + float(min_df.loc[roi, "translation_z"])
+        )
+        roi_table[[roi], ["len_y_micrometer"]] = (
+            roi_table[[roi], ["len_y_micrometer"]].X
+            - float(max_df.loc[roi, "translation_y"])
+            + float(min_df.loc[roi, "translation_y"])
+        )
+        roi_table[[roi], ["len_x_micrometer"]] = (
+            roi_table[[roi], ["len_x_micrometer"]].X
+            - float(max_df.loc[roi, "translation_x"])
+            + float(min_df.loc[roi, "translation_x"])
+        )
+    return roi_table
+
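The position and length formulas from the docstring above can be checked on a single axis with plain numbers. The sketch below is an illustration only: `register_interval` is a hypothetical helper, and it assumes that the list of shifts includes the reference cycle with shift 0, so that the maximum and minimum over all cycles play the role of `max(shift, 0)` and `min(shift, 0)`.

```python
def register_interval(
    position: float,
    length: float,
    own_shift: float,
    all_shifts: list[float],
) -> tuple[float, float]:
    """Return the registered (position, length) of one ROI along one axis."""
    max_shift = max(all_shifts)
    min_shift = min(all_shifts)
    new_position = position + max_shift - own_shift
    new_length = length - max_shift + min_shift
    return new_position, new_length


# One ROI starting at 10 um with length 100 um, seen in three cycles.
# The reference cycle has shift 0; the other two are shifted by +3 and -2 um.
shifts = [0.0, 3.0, -2.0]
results = [register_interval(10.0, 100.0, shift, shifts) for shift in shifts]
# results == [(13.0, 95.0), (10.0, 95.0), (15.0, 95.0)]
```

Every cycle ends up with the same length (95 um), and each registered interval still lies inside the original 10 to 110 um range, which is what makes the region common to all cycles.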
Apply registration to images by using a registered ROI table
+
This task consists of 4 parts:
+
1. Mask all regions in images that are not available in the
+   registered ROI table and store each cycle aligned to the
+   reference_cycle (by looping over ROIs).
2. Do the same for all label images.
3. Copy all tables from the non-aligned image to the aligned image
+   (currently only works well if the only tables are well & FOV ROI tables
+   (registered and original). Not implemented for measurement tables and
+   other ROI tables).
4. Clean up: Delete the old, non-aligned image and rename the new,
+   aligned image to take over its place.
+
Parallelization level: image
+
PARAMETER: DESCRIPTION
+
input_paths: List of input paths where the image data is stored as
+OME-Zarrs. Should point to the parent folder containing one or many
+OME-Zarr files, not the actual OME-Zarr file. Example:
+["/some/path/"]. This task only supports a single input path.
+(standard argument for Fractal tasks, managed by Fractal server).
component: Path to the OME-Zarr image in the OME-Zarr plate that is
+processed. Example: "some_plate.zarr/B/03/0".
+(standard argument for Fractal tasks, managed by Fractal server).
registered_roi_table: Name of the ROI table which has been registered
+and will be applied to mask and shift the images.
+Examples: registered_FOV_ROI_table => loop over the fields of
+view, registered_well_ROI_table => process the whole well as
+one image.
@validate_arguments
+def apply_registration_to_image(
+    *,
+    # Fractal arguments
+    input_paths: Sequence[str],
+    output_path: str,
+    component: str,
+    metadata: dict[str, Any],
+    # Task-specific arguments
+    registered_roi_table: str,
+    reference_cycle: str = "0",
+    overwrite_input: bool = True,
+):
+    """
+    Apply registration to images by using a registered ROI table
+
+    This task consists of 4 parts:
+
+    1. Mask all regions in images that are not available in the
+       registered ROI table and store each cycle aligned to the
+       reference_cycle (by looping over ROIs).
+    2. Do the same for all label images.
+    3. Copy all tables from the non-aligned image to the aligned image
+       (currently only works well if the only tables are well & FOV ROI
+       tables (registered and original). Not implemented for measurement
+       tables and other ROI tables).
+    4. Clean up: Delete the old, non-aligned image and rename the new,
+       aligned image to take over its place.
+
+    Parallelization level: image
+
+    Args:
+        input_paths: List of input paths where the image data is stored as
+            OME-Zarrs. Should point to the parent folder containing one or
+            many OME-Zarr files, not the actual OME-Zarr file. Example:
+            `["/some/path/"]`. This task only supports a single input path.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        output_path: This parameter is not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        component: Path to the OME-Zarr image in the OME-Zarr plate that is
+            processed. Example: `"some_plate.zarr/B/03/0"`.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        metadata: This parameter is not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        registered_roi_table: Name of the ROI table which has been registered
+            and will be applied to mask and shift the images.
+            Examples: `registered_FOV_ROI_table` => loop over the fields of
+            view, `registered_well_ROI_table` => process the whole well as
+            one image.
+        reference_cycle: Which cycle to register against. Defaults to 0,
+            which is the first OME-Zarr image in the well, usually the first
+            cycle that was provided.
+        overwrite_input: Whether the old image data should be replaced with the
+            newly registered image data. Currently only implemented for
+            `overwrite_input=True`.
+
+    """
+    logger.info(component)
+    if not overwrite_input:
+        raise NotImplementedError(
+            "This task is only implemented for the overwrite_input version"
+        )
+    logger.info(
+        f"Running `apply_registration_to_image` on {input_paths=}, "
+        f"{component=}, {registered_roi_table=} and {reference_cycle=}. "
+        f"Using {overwrite_input=}"
+    )
+
+    input_path = Path(input_paths[0])
+    new_component = "/".join(
+        component.split("/")[:-1] + [component.split("/")[-1] + "_registered"]
+    )
+    reference_component = "/".join(
+        component.split("/")[:-1] + [reference_cycle]
+    )
+
+    ROI_table_ref = ad.read_zarr(
+        f"{input_path / reference_component}/tables/{registered_roi_table}"
+    )
+    ROI_table_cycle = ad.read_zarr(
+        f"{input_path / component}/tables/{registered_roi_table}"
+    )
+
+    ngff_image_meta = load_NgffImageMeta(str(input_path / component))
+    coarsening_xy = ngff_image_meta.coarsening_xy
+    num_levels = ngff_image_meta.num_levels
+
+    ####################
+    # Process images
+    ####################
+    logger.info("Write the registered Zarr image to disk")
+    write_registered_zarr(
+        input_path=input_path,
+        component=component,
+        new_component=new_component,
+        ROI_table=ROI_table_cycle,
+        ROI_table_ref=ROI_table_ref,
+        num_levels=num_levels,
+        coarsening_xy=coarsening_xy,
+        aggregation_function=np.mean,
+    )
+
+    ####################
+    # Process labels
+    ####################
+    try:
+        labels_group = zarr.open_group(f"{input_path / component}/labels", "r")
+        label_list = labels_group.attrs["labels"]
+    except (zarr.errors.GroupNotFoundError, KeyError):
+        label_list = []
+
+    if label_list:
+        logger.info(f"Processing the label images: {label_list}")
+        labels_group = zarr.group(f"{input_path / new_component}/labels")
+        labels_group.attrs["labels"] = label_list
+
+        for label in label_list:
+            label_component = f"{component}/labels/{label}"
+            label_component_new = f"{new_component}/labels/{label}"
+            write_registered_zarr(
+                input_path=input_path,
+                component=label_component,
+                new_component=label_component_new,
+                ROI_table=ROI_table_cycle,
+                ROI_table_ref=ROI_table_ref,
+                num_levels=num_levels,
+                coarsening_xy=coarsening_xy,
+                aggregation_function=np.max,
+            )
+
+    ####################
+    # Copy tables
+    # 1. Copy all standard ROI tables from cycle 0.
+    # 2. Copy all tables that aren't standard ROI tables from the given cycle
+    ####################
+    table_dict_reference = get_table_path_dict(input_path, reference_component)
+    table_dict_component = get_table_path_dict(input_path, component)
+
+    table_dict = {}
+    # Define which table should get copied:
+    for table in table_dict_reference:
+        if is_standard_roi_table(table):
+            table_dict[table] = table_dict_reference[table]
+    for table in table_dict_component:
+        if not is_standard_roi_table(table):
+            if reference_component != component:
+                logger.warning(
+                    f"{component} contained a table that is not a standard "
+                    "ROI table. The `Apply Registration To Image` task is "
+                    "best used before additional tables are generated. It "
+                    f"will copy the {table} from this cycle without applying "
+                    f"any transformations. This will work well if {table} "
+                    f"contains measurements. But if {table} is a custom ROI "
+                    "table coming from another task, the transformation is "
+                    "not applied and it will not match with the registered "
+                    "image anymore."
+                )
+            table_dict[table] = table_dict_component[table]
+
+    if table_dict:
+        logger.info(f"Processing the tables: {table_dict}")
+        new_image_group = zarr.group(f"{input_path / new_component}")
+
+        for table in table_dict.keys():
+            logger.info(f"Copying table: {table}")
+            # Get the relevant metadata of the Zarr table & add it
+            # See issue #516 for the need for this workaround
+            max_retries = 20
+            sleep_time = 5
+            current_round = 0
+            while current_round < max_retries:
+                try:
+                    old_table_group = zarr.open_group(
+                        table_dict[table], mode="r"
+                    )
+                    current_round = max_retries
+                except zarr.errors.GroupNotFoundError:
+                    logger.debug(
+                        f"Table {table} not found in attempt {current_round}. "
+                        f"Waiting {sleep_time} seconds before trying again."
+                    )
+                    current_round += 1
+                    time.sleep(sleep_time)
+            # Write the Zarr table
+            curr_table = ad.read_zarr(table_dict[table])
+            write_table(
+                new_image_group,
+                table,
+                curr_table,
+                table_attrs=old_table_group.attrs.asdict(),
+            )
+
+    ####################
+    # Clean up Zarr file
+    ####################
+    if overwrite_input:
+        logger.info(
+            "Replace original zarr image with the newly created Zarr image"
+        )
+        # Potential for race conditions: Every cycle reads the
+        # reference cycle, but the reference cycle also gets modified
+        # See issue #516 for the details
+        os.rename(f"{input_path / component}", f"{input_path / component}_tmp")
+        os.rename(f"{input_path / new_component}", f"{input_path / component}")
+        shutil.rmtree(f"{input_path / component}_tmp")
+    else:
+        raise NotImplementedError
+
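Two small pieces of the task above can be tried out on their own: how the `_registered` and reference component paths are derived from `component`, and how the clean-up step swaps the new directory into the place of the old one. The sketch below uses only the standard library and empty placeholder directories instead of OME-Zarr images; `sibling_component` and `swap_in_place` are hypothetical helper names, not functions of the package.

```python
import os
import shutil
import tempfile
from pathlib import Path


def sibling_component(component: str, new_last: str) -> str:
    """Replace the last path element of `component` with `new_last`."""
    parts = component.split("/")
    return "/".join(parts[:-1] + [new_last])


def swap_in_place(old: Path, new: Path) -> None:
    """Move `new` to the location of `old`, then delete the old content."""
    tmp = old.with_name(old.name + "_tmp")
    os.rename(old, tmp)
    os.rename(new, old)
    shutil.rmtree(tmp)


component = "some_plate.zarr/B/03/1"
last = component.split("/")[-1]
new_component = sibling_component(component, last + "_registered")
reference_component = sibling_component(component, "0")

with tempfile.TemporaryDirectory() as base:
    old_dir = Path(base) / "1"
    new_dir = Path(base) / "1_registered"
    old_dir.mkdir()
    new_dir.mkdir()
    (old_dir / "marker").write_text("old")
    (new_dir / "marker").write_text("new")
    swap_in_place(old_dir, new_dir)
    remaining = sorted(p.name for p in Path(base).iterdir())
    content = (old_dir / "marker").read_text()
```

Renaming the old directory to a temporary name before deleting it keeps the window in which no image exists at the original path short, although, as the comment in the task notes, concurrent readers of the reference cycle can still be affected.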
This function loads the image or label data from a zarr array based on the
+ROI bounding-box coordinates and stores them into a new zarr array.
+The new Zarr array has the same shape as the original array, but will have
+0s where the ROI tables don't specify loading of the image data.
+The ROIs loaded from list_indices will be written into the
+list_indices_ref position, thus performing translational registration if
+the two lists of ROI indices vary.
+
+
PARAMETER: DESCRIPTION
+
input_path: Base folder where the Zarr is stored
+(does not contain the Zarr file itself)
new_component: Path to the new Zarr image that will be written
+(also in the input_path folder). For example:
+"20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/1_registered"
def write_registered_zarr(
+    input_path: Path,
+    component: str,
+    new_component: str,
+    ROI_table: ad.AnnData,
+    ROI_table_ref: ad.AnnData,
+    num_levels: int,
+    coarsening_xy: int = 2,
+    aggregation_function: Callable = np.mean,
+):
+    """
+    Write registered zarr array based on ROI tables
+
+    This function loads the image or label data from a zarr array based on the
+    ROI bounding-box coordinates and stores them into a new zarr array.
+    The new Zarr array has the same shape as the original array, but will have
+    0s where the ROI tables don't specify loading of the image data.
+    The ROIs loaded from `list_indices` will be written into the
+    `list_indices_ref` position, thus performing translational registration if
+    the two lists of ROI indices vary.
+
+    Args:
+        input_path: Base folder where the Zarr is stored
+            (does not contain the Zarr file itself)
+        component: Path to the OME-Zarr image that is processed. For example:
+            `"20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/1"`
+        new_component: Path to the new Zarr image that will be written
+            (also in the input_path folder). For example:
+            `"20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/1_registered"`
+        ROI_table: Fractal ROI table for the component
+        ROI_table_ref: Fractal ROI table for the reference cycle
+        num_levels: Number of pyramid layers to be created (argument of
+            `build_pyramid`).
+        coarsening_xy: Coarsening factor between pyramid levels
+        aggregation_function: Function to be used when downsampling (argument
+            of `build_pyramid`).
+
+    """
+    # Read pixel sizes from Zarr attributes
+    ngff_image_meta = load_NgffImageMeta(str(input_path / component))
+    pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)
+
+    # Create list of indices for 3D ROIs
+    list_indices = convert_ROI_table_to_indices(
+        ROI_table,
+        level=0,
+        coarsening_xy=coarsening_xy,
+        full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+    )
+    list_indices_ref = convert_ROI_table_to_indices(
+        ROI_table_ref,
+        level=0,
+        coarsening_xy=coarsening_xy,
+        full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+    )
+
+    old_image_group = zarr.open_group(f"{input_path / component}", mode="r")
+    old_ngff_image_meta = load_NgffImageMeta(str(input_path / component))
+    new_image_group = zarr.group(f"{input_path / new_component}")
+    new_image_group.attrs.put(old_image_group.attrs.asdict())
+
+    # Loop over all channels. For each channel, write full-res image data.
+    data_array = da.from_zarr(old_image_group["0"])
+    # Create dask array with 0s of same shape
+    new_array = da.zeros_like(data_array)
+
+    # TODO: Add sanity checks on the 2 ROI tables:
+    # 1. The number of ROIs need to match
+    # 2. The size of the ROIs need to match
+    # (otherwise, we can't assign them to the reference regions)
+    # ROI_table_ref vs ROI_table_cycle
+    for i, roi_indices in enumerate(list_indices):
+        reference_region = convert_indices_to_regions(list_indices_ref[i])
+        region = convert_indices_to_regions(roi_indices)
+
+        axes_list = old_ngff_image_meta.axes_names
+
+        if axes_list == ["c", "z", "y", "x"]:
+            num_channels = data_array.shape[0]
+            # Loop over channels
+            for ind_ch in range(num_channels):
+                idx = tuple(
+                    [slice(ind_ch, ind_ch + 1)] + list(reference_region)
+                )
+                new_array[idx] = load_region(
+                    data_zyx=data_array[ind_ch], region=region, compute=False
+                )
+        elif axes_list == ["z", "y", "x"]:
+            new_array[reference_region] = load_region(
+                data_zyx=data_array, region=region, compute=False
+            )
+        elif axes_list == ["c", "y", "x"]:
+            # TODO: Implement cyx case (based on looping over xy case)
+            raise NotImplementedError(
+                "`write_registered_zarr` has not been implemented for "
+                f"a zarr with {axes_list=}"
+            )
+        elif axes_list == ["y", "x"]:
+            # TODO: Implement yx case
+            raise NotImplementedError(
+                "`write_registered_zarr` has not been implemented for "
+                f"a zarr with {axes_list=}"
+            )
+        else:
+            raise NotImplementedError(
+                "`write_registered_zarr` has not been implemented for "
+                f"a zarr with {axes_list=}"
+            )
+
+    new_array.to_zarr(
+        f"{input_path / new_component}/0",
+        overwrite=True,
+        dimension_separator="/",
+        write_empty_chunks=False,
+    )
+
+    # Starting from on-disk highest-resolution data, build and write to
+    # disk a pyramid of coarser levels
+    build_pyramid(
+        zarrurl=f"{input_path / new_component}",
+        overwrite=True,
+        num_levels=num_levels,
+        coarsening_xy=coarsening_xy,
+        chunksize=data_array.chunksize,
+        aggregation_function=aggregation_function,
+    )
+
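The core idea of `write_registered_zarr`, reading a ROI from one region and writing it into the reference region of a zero-filled array of the same shape, can be sketched in two dimensions with nested lists. This is a simplified stand-in for the dask/zarr arrays used above; `place_region` is a hypothetical helper, and regions are given as half-open `(start, stop)` index pairs per axis.

```python
def place_region(source, region, reference_region):
    """Copy `region` of `source` into `reference_region` of a zero-filled copy."""
    (y0, y1), (x0, x1) = region
    (ry0, ry1), (rx0, rx1) = reference_region
    if (y1 - y0, x1 - x0) != (ry1 - ry0, rx1 - rx0):
        raise ValueError("Source and reference regions must have the same shape")
    # Output has the same shape as the input; untouched pixels stay 0
    output = [[0] * len(row) for row in source]
    for dy in range(y1 - y0):
        for dx in range(x1 - x0):
            output[ry0 + dy][rx0 + dx] = source[y0 + dy][x0 + dx]
    return output


source = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
# Move the 2x2 block at rows 0-1, columns 1-2 to rows 1-2, columns 0-1
shifted = place_region(
    source,
    region=((0, 2), (1, 3)),
    reference_region=((1, 3), (0, 2)),
)
# shifted == [[0, 0, 0, 0], [2, 3, 0, 0], [6, 7, 0, 0]]
```

The shape check corresponds to the sanity check listed as a TODO in the function above: a ROI can only be assigned to its reference region if both have the same size.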
1. Loading the images of a given ROI (=> loop over ROIs)
2. Calculating the transformation for that ROI
3. Storing the calculated transformation in the ROI table
+
Parallelization level: image
+
PARAMETER: DESCRIPTION
+
input_paths: List of input paths where the image data is stored as
+OME-Zarrs. Should point to the parent folder containing one or many
+OME-Zarr files, not the actual OME-Zarr file. Example:
+["/some/path/"]. This task only supports a single input path.
+(standard argument for Fractal tasks, managed by Fractal server).
component: Path to the OME-Zarr image in the OME-Zarr plate that is
+processed. Example: "some_plate.zarr/B/03/0".
+(standard argument for Fractal tasks, managed by Fractal server).
roi_table: Name of the ROI table over which the task loops to
+calculate the registration. Examples: FOV_ROI_table => loop over
+the fields of view, well_ROI_table => process the whole well as
+one image.
@validate_arguments
+def calculate_registration_image_based(
+    *,
+    # Fractal arguments
+    input_paths: Sequence[str],
+    output_path: str,
+    component: str,
+    metadata: dict[str, Any],
+    # Task-specific arguments
+    wavelength_id: str,
+    roi_table: str = "FOV_ROI_table",
+    reference_cycle: int = 0,
+    level: int = 2,
+) -> dict[str, Any]:
+    """
+    Calculate registration based on images
+
+    This task consists of 3 parts:
+
+    1. Loading the images of a given ROI (=> loop over ROIs)
+    2. Calculating the transformation for that ROI
+    3. Storing the calculated transformation in the ROI table
+
+    Parallelization level: image
+
+    Args:
+        input_paths: List of input paths where the image data is stored as
+            OME-Zarrs. Should point to the parent folder containing one or
+            many OME-Zarr files, not the actual OME-Zarr file. Example:
+            `["/some/path/"]`. This task only supports a single input path.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        output_path: This parameter is not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        component: Path to the OME-Zarr image in the OME-Zarr plate that is
+            processed. Example: `"some_plate.zarr/B/03/0"`.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        metadata: This parameter is not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        wavelength_id: Wavelength that will be used for image-based
+            registration; e.g. `A01_C01` for Yokogawa, `C01` for MD.
+        roi_table: Name of the ROI table over which the task loops to
+            calculate the registration. Examples: `FOV_ROI_table` => loop over
+            the fields of view, `well_ROI_table` => process the whole well as
+            one image.
+        reference_cycle: Which cycle to register against. Defaults to 0,
+            which is the first OME-Zarr image in the well (usually the first
+            cycle that was provided).
+        level: Pyramid level of the image to be used for registration.
+            Choose `0` to process at full resolution.
+
+    """
+    logger.info(
+        f"Running for {input_paths=}, {component=}. \n"
+        f"Calculating translation registration per {roi_table=} for "
+        f"{wavelength_id=}."
+    )
+    # Set OME-Zarr paths
+    zarr_img_cycle_x = Path(input_paths[0]) / component
+
+    # If the task is run for the reference cycle, exit
+    # TODO: Improve the input for this: Can we filter components to not
+    # run for itself?
+    alignment_cycle = zarr_img_cycle_x.name
+    if alignment_cycle == str(reference_cycle):
+        logger.info(
+            "Calculate registration image-based is running for "
+            f"cycle {alignment_cycle}, which is the reference_cycle. "
+            "Thus, exiting the task."
+        )
+        return {}
+    else:
+        logger.info(
+            "Calculate registration image-based is running for "
+            f"cycle {alignment_cycle}"
+        )
+
+    zarr_img_ref_cycle = zarr_img_cycle_x.parent / str(reference_cycle)
+
+    # Read some parameters from Zarr metadata
+    ngff_image_meta = load_NgffImageMeta(str(zarr_img_ref_cycle))
+    coarsening_xy = ngff_image_meta.coarsening_xy
+
+    # Get channel_index via wavelength_id.
+    # Initially only allow registration of the same wavelength
+    channel_ref: OmeroChannel = get_channel_from_image_zarr(
+        image_zarr_path=str(zarr_img_ref_cycle),
+        wavelength_id=wavelength_id,
+    )
+    channel_index_ref = channel_ref.index
+
+    channel_align: OmeroChannel = get_channel_from_image_zarr(
+        image_zarr_path=str(zarr_img_cycle_x),
+        wavelength_id=wavelength_id,
+    )
+    channel_index_align = channel_align.index
+
+    # Lazily load zarr array
+    data_reference_zyx = da.from_zarr(f"{zarr_img_ref_cycle}/{level}")[
+        channel_index_ref
+    ]
+    data_alignment_zyx = da.from_zarr(f"{zarr_img_cycle_x}/{level}")[
+        channel_index_align
+    ]
+
+ # Read ROIs
+ ROI_table_ref=ad.read_zarr(f"{zarr_img_ref_cycle}/tables/{roi_table}")
+ ROI_table_x=ad.read_zarr(f"{zarr_img_cycle_x}/tables/{roi_table}")
+ logger.info(
+ f"Found {len(ROI_table_x)} ROIs in {roi_table=} to be processed."
+ )
+
+ # Check that table type of ROI_table_ref is valid. Note that
+ # "ngff:region_table" and None are accepted for backwards compatibility
+ valid_table_types=[
+ "roi_table",
+ "masking_roi_table",
+ "ngff:region_table",
+ None,
+ ]
+ ROI_table_ref_group=zarr.open_group(
+ f"{zarr_img_ref_cycle}/tables/{roi_table}",
+ mode="r",
+ )
+ ref_table_attrs=ROI_table_ref_group.attrs.asdict()
+ ref_table_type=ref_table_attrs.get("type")
+ if ref_table_type not in valid_table_types:
+ raise ValueError(
+ (
+ f"Table '{roi_table}' (with type '{ref_table_type}') is "
+ "not a valid ROI table."
+ )
+ )
+
+ # For each cycle, get the relevant info
+ # TODO: Add additional checks on ROIs?
+ # Raise if any ROI name differs between the two tables
+ if (ROI_table_ref.obs.index != ROI_table_x.obs.index).any():
+ raise ValueError(
+ "Registration is only implemented for ROIs that match between the "
+ "cycles (e.g. well, FOV ROIs). Here, the ROIs in the reference "
+ f"cycle were {ROI_table_ref.obs.index}, but the ROIs in the "
+ f"alignment cycle were {ROI_table_x.obs.index}"
+ )
+ # TODO: Make this less restrictive? i.e. could we also run it if different
+ # cycles have different FOVs? But then how do we know which FOVs to match?
+ # If we relax this, downstream assumptions on matching based on order
+ # in the list will break.
+
+ # Read pixel sizes from zarr attributes
+ ngff_image_meta_cycle_x=load_NgffImageMeta(str(zarr_img_cycle_x))
+ pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(level=0)
+ pxl_sizes_zyx_cycle_x=ngff_image_meta_cycle_x.get_pixel_sizes_zyx(
+ level=0
+ )
+
+ if pxl_sizes_zyx != pxl_sizes_zyx_cycle_x:
+ raise ValueError(
+ "Pixel sizes need to be equal between cycles for registration"
+ )
+
+ # Create list of indices for 3D ROIs spanning the entire Z direction
+ list_indices_ref=convert_ROI_table_to_indices(
+ ROI_table_ref,
+ level=level,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+ )
+ check_valid_ROI_indices(list_indices_ref,roi_table)
+
+ list_indices_cycle_x=convert_ROI_table_to_indices(
+ ROI_table_x,
+ level=level,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+ )
+ check_valid_ROI_indices(list_indices_cycle_x,roi_table)
+
+ num_ROIs=len(list_indices_ref)
+ compute=True
+ new_shifts={}
+ for i_ROI in range(num_ROIs):
+ logger.info(
+ f"Now processing ROI {i_ROI+1}/{num_ROIs} "
+ f"for channel {channel_align}."
+ )
+ img_ref=load_region(
+ data_zyx=data_reference_zyx,
+ region=convert_indices_to_regions(list_indices_ref[i_ROI]),
+ compute=compute,
+ )
+ img_cycle_x=load_region(
+ data_zyx=data_alignment_zyx,
+ region=convert_indices_to_regions(list_indices_cycle_x[i_ROI]),
+ compute=compute,
+ )
+
+ ##############
+ # Calculate the transformation
+ ##############
+ # Basic version (no padding, no internal binning)
+ if img_ref.shape != img_cycle_x.shape:
+ raise NotImplementedError(
+ "This registration is not implemented for ROIs with "
+ "different shapes between cycles"
+ )
+ shifts=phase_cross_correlation(
+ np.squeeze(img_ref),np.squeeze(img_cycle_x)
+ )[0]
+
+ # Registration based on scmultiplex, image-based
+ # shifts, _, _ = calculate_shift(np.squeeze(img_ref),
+ # np.squeeze(img_cycle_x), bin=binning, binarize=False)
+
+ # TODO: Make this work on label images
+ # (=> different loading) etc.
+
+ ##############
+ # Storing the calculated transformation ###
+ ##############
+ # Store the shift in ROI table
+ # TODO: Store in OME-NGFF transformations: Check SpatialData approach,
+ # per ROI storage?
+
+ # Adapt ROIs for the given ROI table:
+ ROI_name=ROI_table_ref.obs.index[i_ROI]
+ new_shifts[ROI_name]=calculate_physical_shifts(
+ shifts,
+ level=level,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=pxl_sizes_zyx,
+ )
+
+ # Write physical shifts to disk (as part of the ROI table)
+ logger.info(f"Updating the {roi_table=} with translation columns")
+ image_group=zarr.group(zarr_img_cycle_x)
+ new_ROI_table=get_ROI_table_with_translation(ROI_table_x,new_shifts)
+ write_table(
+ image_group,
+ roi_table,
+ new_ROI_table,
+ overwrite=True,
+ table_attrs=ref_table_attrs,
+ )
+
+ return {}
+
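The listing above boils down to two steps: estimate a translation between the reference and alignment images, then convert that pixel shift into physical units. The following NumPy-only sketch illustrates the same idea under simplifying assumptions (integer-pixel shifts, periodic boundaries). `estimate_shift` and `to_physical_shift` are hypothetical names for illustration, not the helpers the task imports (`phase_cross_correlation`, `calculate_physical_shifts`), and the rule that only Y and X pixel sizes scale by `coarsening_xy**level` is an assumption.

```python
import numpy as np


def estimate_shift(ref: np.ndarray, moving: np.ndarray) -> np.ndarray:
    """Return the integer shift that re-aligns `moving` with `ref`."""
    # Normalized cross-power spectrum: only phase information is kept
    cross_power = np.fft.fftn(ref) * np.conj(np.fft.fftn(moving))
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    correlation = np.fft.ifftn(cross_power).real
    # The correlation peak encodes the shift (modulo the array shape)
    peak = np.array(np.unravel_index(np.argmax(correlation), ref.shape))
    shape = np.array(ref.shape)
    wrapped = peak > shape // 2
    peak[wrapped] -= shape[wrapped]  # offsets past half the axis are negative
    return peak


def to_physical_shift(shift_zyx, level, coarsening_xy, full_res_pxl_sizes_zyx):
    """Scale a pixel shift at pyramid `level` to physical units (assumed rule)."""
    factor = coarsening_xy**level
    pz, py, px = full_res_pxl_sizes_zyx
    level_sizes = np.array([pz, py * factor, px * factor])
    return np.asarray(shift_zyx) * level_sizes


# Synthetic data: the "alignment" image is the reference rolled by (3, -5)
rng = np.random.default_rng(0)
ref = rng.random((1, 64, 64))
moving = np.roll(ref, shift=(0, 3, -5), axis=(0, 1, 2))

shift = estimate_shift(np.squeeze(ref), np.squeeze(moving))
restored = np.roll(np.squeeze(moving), shift=tuple(shift), axis=(0, 1))
physical = to_physical_shift(
    (0, *shift),
    level=2,
    coarsening_xy=2,
    full_res_pxl_sizes_zyx=(1.0, 0.1625, 0.1625),
)
```

Applying the estimated shift to the alignment image reproduces the reference image, which is the convention the task relies on when it stores translations in the ROI table.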
Run cellpose segmentation on the ROIs of a single OME-Zarr image.
+
+
+
+
+
+
+
PARAMETER: DESCRIPTION
+
input_paths: List of input paths where the image data is stored as
+OME-Zarrs. Should point to the parent folder containing one or many
+OME-Zarr files, not the actual OME-Zarr file. Example:
+["/some/path/"]. This task only supports a single input path.
+(standard argument for Fractal tasks, managed by Fractal server).
component: Path to the OME-Zarr image in the OME-Zarr plate that is
+processed. Example: "some_plate.zarr/B/03/0".
+(standard argument for Fractal tasks, managed by Fractal server).
channel2: Second channel for segmentation (in the same format as
+channel). If specified, cellpose runs in dual channel mode.
+For dual channel segmentation of cells, the first channel should
+contain the membrane marker, the second channel should contain the
+nuclear marker.
input_ROI_table: Name of the ROI table over which the task loops to
+apply Cellpose segmentation. Examples: FOV_ROI_table => loop over
+the fields of view, organoid_ROI_table => loop over the organoid
+ROI table (generated by another task), well_ROI_table => process
+the whole well as one image.
output_ROI_table: If provided, a ROI table with that name is created,
+which will contain the bounding boxes of the newly segmented
+labels. ROI tables should have ROI in their name.
use_masks: If True, try to use masked loading and fall back to
+use_masks=False if the ROI table is not suitable. Masked
+loading is relevant when only a subset of the bounding box should
+actually be processed (e.g. running within organoid_ROI_table).
diameter_level0: Expected diameter of the objects that should be
+segmented in pixels at level 0. Initial diameter is rescaled using
+the level that was selected. The rescaled value is passed as
+the diameter to the CellposeModel.eval method.
cellprob_threshold: Parameter of CellposeModel.eval method. Valid
+values between -6 and 6. From Cellpose documentation: "Decrease this
+threshold if cellpose is not returning as many ROIs as you’d
+expect. Similarly, increase this threshold if cellpose is returning
+too [many] ROIs particularly from dim areas."
flow_threshold: Parameter of CellposeModel.eval method. Valid
+values between 0.0 and 1.0. From Cellpose documentation: "Increase
+this threshold if cellpose is not returning as many ROIs as you’d
+expect. Similarly, decrease this threshold if cellpose is returning
+too many ill-shaped ROIs."
net_avg: Parameter of CellposeModel class. Whether to use cellpose
+net averaging to run the 4 built-in networks (useful for nuclei,
+cyto and cyto2, not sure it works for the others).
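To make the relation between `diameter_level0`, `level` and the value that reaches Cellpose concrete, here is a small sketch of the arithmetic used in the source listed below (`diameter_level0 / coarsening_xy**level`, and anisotropy computed as Z pixel size over X pixel size at the selected level). The helper names and the example pixel sizes are illustrative assumptions, not part of the package.

```python
def rescaled_diameter(diameter_level0: float, coarsening_xy: int, level: int) -> float:
    """Diameter in pixels at pyramid `level`, given the level-0 diameter."""
    return diameter_level0 / coarsening_xy**level


def inferred_anisotropy(pxl_sizes_zyx: tuple[float, float, float]) -> float:
    """Ratio of Z pixel size to X pixel size (hypothetical helper)."""
    return pxl_sizes_zyx[0] / pxl_sizes_zyx[2]


# A 60-pixel object at full resolution spans 15 pixels at level 2
# when each level coarsens XY by a factor of 2.
diameter = rescaled_diameter(60.0, coarsening_xy=2, level=2)

# Example pixel sizes (Z, Y, X) at the selected level, in micrometers
anisotropy = inferred_anisotropy((1.0, 0.65, 0.65))
```

Choosing a coarser `level` therefore shrinks the diameter passed to Cellpose by the same factor as the image, so `diameter_level0` can stay fixed when the level changes.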
@validate_arguments
+def cellpose_segmentation(
+ *,
+ # Fractal arguments
+ input_paths:Sequence[str],
+ output_path:str,
+ component:str,
+ metadata:dict[str,Any],
+ # Task-specific arguments
+ level:int,
+ channel:ChannelInputModel,
+ channel2:Optional[ChannelInputModel]=None,
+ input_ROI_table:str="FOV_ROI_table",
+ output_ROI_table:Optional[str]=None,
+ output_label_name:Optional[str]=None,
+ use_masks:bool=True,
+ relabeling:bool=True,
+ # Cellpose-related arguments
+ diameter_level0:float=30.0,
+ model_type:str="cyto2",
+ pretrained_model:Optional[str]=None,
+ cellprob_threshold:float=0.0,
+ flow_threshold:float=0.4,
+ anisotropy:Optional[float]=None,
+ min_size:int=15,
+ augment:bool=False,
+ net_avg:bool=False,
+ use_gpu:bool=True,
+ overwrite:bool=True,
+)->dict[str,Any]:
+"""
+ Run cellpose segmentation on the ROIs of a single OME-Zarr image.
+
+ Args:
+ input_paths: List of input paths where the image data is stored as
+ OME-Zarrs. Should point to the parent folder containing one or many
+ OME-Zarr files, not the actual OME-Zarr file. Example:
+ `["/some/path/"]`. This task only supports a single input path.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ output_path: This parameter is not used by this task.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ component: Path to the OME-Zarr image in the OME-Zarr plate that is
+ processed. Example: `"some_plate.zarr/B/03/0"`.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ metadata: This parameter is not used by this task.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ level: Pyramid level of the image to be segmented. Choose `0` to
+ process at full resolution.
+ channel: Primary channel for segmentation; requires either
+ `wavelength_id` (e.g. `A01_C01`) or `label` (e.g. `DAPI`).
+ channel2: Second channel for segmentation (in the same format as
+ `channel`). If specified, cellpose runs in dual channel mode.
+ For dual channel segmentation of cells, the first channel should
+ contain the membrane marker, the second channel should contain the
+ nuclear marker.
+ input_ROI_table: Name of the ROI table over which the task loops to
+ apply Cellpose segmentation. Examples: `FOV_ROI_table` => loop over
+ the fields of view, `organoid_ROI_table` => loop over the organoid
+ ROI table (generated by another task), `well_ROI_table` => process
+ the whole well as one image.
+ output_ROI_table: If provided, a ROI table with that name is created,
+ which will contain the bounding boxes of the newly segmented
+ labels. ROI tables should have `ROI` in their name.
+ use_masks: If `True`, try to use masked loading and fall back to
+ `use_masks=False` if the ROI table is not suitable. Masked
+ loading is relevant when only a subset of the bounding box should
+ actually be processed (e.g. running within `organoid_ROI_table`).
+ output_label_name: Name of the output label image (e.g. `"organoids"`).
+ relabeling: If `True`, apply relabeling so that label values are
+ unique for all objects in the well.
+ diameter_level0: Expected diameter of the objects that should be
+ segmented in pixels at level 0. Initial diameter is rescaled using
+ the `level` that was selected. The rescaled value is passed as
+ the diameter to the `CellposeModel.eval` method.
+ model_type: Parameter of `CellposeModel` class. Defines which model
+ should be used. Typical choices are `nuclei`, `cyto`, `cyto2`, etc.
+ pretrained_model: Parameter of `CellposeModel` class (takes
+ precedence over `model_type`). Allows you to specify the path of
+ a custom trained cellpose model.
+ cellprob_threshold: Parameter of `CellposeModel.eval` method. Valid
+ values between -6 and 6. From Cellpose documentation: "Decrease this
+ threshold if cellpose is not returning as many ROIs as you’d
+ expect. Similarly, increase this threshold if cellpose is returning
+ too [many] ROIs particularly from dim areas."
+ flow_threshold: Parameter of `CellposeModel.eval` method. Valid
+ values between 0.0 and 1.0. From Cellpose documentation: "Increase
+ this threshold if cellpose is not returning as many ROIs as you’d
+ expect. Similarly, decrease this threshold if cellpose is returning
+ too many ill-shaped ROIs."
+ anisotropy: Ratio of the pixel sizes along Z and XY axis (ignored if
+ the image is not three-dimensional). If `None`, it is inferred from
+ the OME-NGFF metadata.
+ min_size: Parameter of `CellposeModel` class. Minimum size of the
+ segmented objects (in pixels). Use `-1` to turn off the size
+ filter.
+ augment: Parameter of `CellposeModel` class. Whether to use cellpose
+ augmentation to tile images with overlap.
+ net_avg: Parameter of `CellposeModel` class. Whether to use cellpose
+ net averaging to run the 4 built-in networks (useful for `nuclei`,
+ `cyto` and `cyto2`, not sure it works for the others).
+ use_gpu: If `False`, always use the CPU; if `True`, use the GPU if
+ possible (as defined in `cellpose.core.use_gpu()`) and fall back
+ to the CPU otherwise.
+ overwrite: If `True`, overwrite the task output.
+ """
+
+ # Set input path
+ if len(input_paths) > 1:
+ raise NotImplementedError
+ in_path=Path(input_paths[0])
+ zarrurl=(in_path.resolve()/component).as_posix()
+ logger.info(f"{zarrurl=}")
+
+ # Preliminary checks on Cellpose model
+ if pretrained_model is None:
+ if model_type not in models.MODEL_NAMES:
+ raise ValueError(f"ERROR model_type={model_type} is not allowed.")
+ else:
+ if not os.path.exists(pretrained_model):
+ raise ValueError(f"{pretrained_model=} does not exist.")
+
+ # Read attributes from NGFF metadata
+ ngff_image_meta=load_NgffImageMeta(zarrurl)
+ num_levels=ngff_image_meta.num_levels
+ coarsening_xy=ngff_image_meta.coarsening_xy
+ full_res_pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(level=0)
+ actual_res_pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(level=level)
+ logger.info(f"NGFF image has {num_levels=}")
+ logger.info(f"NGFF image has {coarsening_xy=}")
+ logger.info(
+ f"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}"
+ )
+ logger.info(
+ f"NGFF image has level-{level} pixel sizes "
+ f"{actual_res_pxl_sizes_zyx}"
+ )
+
+ plate,well=component.split(".zarr/")
+
+ # Find channel index
+ try:
+ tmp_channel:OmeroChannel=get_channel_from_image_zarr(
+ image_zarr_path=zarrurl,
+ wavelength_id=channel.wavelength_id,
+ label=channel.label,
+ )
+ except ChannelNotFoundError as e:
+ logger.warning(
+ "Channel not found, exit from the task.\n"
+ f"Original error: {str(e)}"
+ )
+ return{}
+ ind_channel=tmp_channel.index
+
+ # Find channel index for second channel, if one is provided
+ if channel2:
+ try:
+ tmp_channel_c2:OmeroChannel=get_channel_from_image_zarr(
+ image_zarr_path=zarrurl,
+ wavelength_id=channel2.wavelength_id,
+ label=channel2.label,
+ )
+ except ChannelNotFoundError as e:
+ logger.warning(
+ f"Second channel with wavelength_id: {channel2.wavelength_id} "
+ f"and label: {channel2.label} not found, exit from the task.\n"
+ f"Original error: {str(e)}"
+ )
+ return{}
+ ind_channel_c2=tmp_channel_c2.index
+
+ # Set channel label
+ if output_label_name is None:
+ try:
+ channel_label=tmp_channel.label
+ output_label_name=f"label_{channel_label}"
+ except(KeyError,IndexError):
+ output_label_name=f"label_{ind_channel}"
+
+ # Load ZYX data
+ data_zyx=da.from_zarr(f"{zarrurl}/{level}")[ind_channel]
+ logger.info(f"{data_zyx.shape=}")
+ if channel2:
+ data_zyx_c2=da.from_zarr(f"{zarrurl}/{level}")[ind_channel_c2]
+ logger.info(f"Second channel: {data_zyx_c2.shape=}")
+
+ # Read ROI table
+ ROI_table_path=f"{zarrurl}/tables/{input_ROI_table}"
+ ROI_table=ad.read_zarr(ROI_table_path)
+
+ # Perform some checks on the ROI table
+ valid_ROI_table=is_ROI_table_valid(
+ table_path=ROI_table_path,use_masks=use_masks
+ )
+ if use_masks and not valid_ROI_table:
+ logger.info(
+ f"ROI table at {ROI_table_path} cannot be used for masked "
+ "loading. Set use_masks=False."
+ )
+ use_masks=False
+ logger.info(f"{use_masks=}")
+
+ # Create list of indices for 3D ROIs spanning the entire Z direction
+ list_indices=convert_ROI_table_to_indices(
+ ROI_table,
+ level=level,
+ coarsening_xy=coarsening_xy,
+ full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,
+ )
+ check_valid_ROI_indices(list_indices,input_ROI_table)
+
+ # If we are not planning to use masked loading, fail for overlapping ROIs
+ if not use_masks:
+ overlap = find_overlaps_in_ROI_indices(list_indices)
+ if overlap:
+ raise ValueError(
+ f"ROI indices created from {input_ROI_table} table have "
+ "overlaps, but we are not using masked loading."
+ )
+
+ # Select 2D/3D behavior and set some parameters
+ do_3D = data_zyx.shape[0] > 1 and len(data_zyx.shape) == 3
+ if do_3D:
+ if anisotropy is None:
+ # Compute anisotropy as pixel_size_z/pixel_size_x
+ anisotropy=(
+ actual_res_pxl_sizes_zyx[0]/actual_res_pxl_sizes_zyx[2]
+ )
+ logger.info(f"Anisotropy: {anisotropy}")
+
+ # Rescale datasets (only relevant for level>0)
+ if ngff_image_meta.axes_names[0] != "c":
+ raise ValueError(
+ "Cannot set `remove_channel_axis=True` for multiscale "
+ f"metadata with axes={ngff_image_meta.axes_names}. "
+ 'First axis should have name "c".'
+ )
+ new_datasets=rescale_datasets(
+ datasets=[ds.dict() for ds in ngff_image_meta.datasets],
+ coarsening_xy=coarsening_xy,
+ reference_level=level,
+ remove_channel_axis=True,
+ )
+
+ label_attrs={
+ "image-label":{
+ "version":__OME_NGFF_VERSION__,
+ "source":{"image":"../../"},
+ },
+ "multiscales":[
+ {
+ "name":output_label_name,
+ "version":__OME_NGFF_VERSION__,
+ "axes":[
+ ax.dict()
+ for ax in ngff_image_meta.multiscale.axes
+ if ax.type != "channel"
+ ],
+ "datasets":new_datasets,
+ }
+ ],
+ }
+
+ image_group=zarr.group(zarrurl)
+ label_group=prepare_label_group(
+ image_group,
+ output_label_name,
+ overwrite=overwrite,
+ label_attrs=label_attrs,
+ logger=logger,
+ )
+
+ logger.info(
+ f"Helper function `prepare_label_group` returned {label_group=}"
+ )
+ logger.info(f"Output label path: {zarrurl}/labels/{output_label_name}/0")
+ store=zarr.storage.FSStore(f"{zarrurl}/labels/{output_label_name}/0")
+ label_dtype=np.uint32
+
+ # Ensure that all output shapes & chunks are 3D (for 2D data: (1, y, x))
+ # https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/398
+ shape=data_zyx.shape
+ if len(shape) == 2:
+ shape = (1, *shape)
+ chunks = data_zyx.chunksize
+ if len(chunks) == 2:
+ chunks = (1, *chunks)
+ mask_zarr=zarr.create(
+ shape=shape,
+ chunks=chunks,
+ dtype=label_dtype,
+ store=store,
+ overwrite=False,
+ dimension_separator="/",
+ )
+
+ logger.info(
+ f"mask will have shape {data_zyx.shape} "
+ f"and chunks {data_zyx.chunks}"
+ )
+
+ # Initialize cellpose
+ gpu = use_gpu and cellpose.core.use_gpu()
+ if pretrained_model:
+ model=models.CellposeModel(
+ gpu=gpu,pretrained_model=pretrained_model
+ )
+ else:
+ model=models.CellposeModel(gpu=gpu,model_type=model_type)
+
+ # Initialize other things
+ logger.info(f"Start cellpose_segmentation task for {zarrurl}")
+ logger.info(f"relabeling: {relabeling}")
+ logger.info(f"do_3D: {do_3D}")
+ logger.info(f"use_gpu: {gpu}")
+ logger.info(f"level: {level}")
+ logger.info(f"model_type: {model_type}")
+ logger.info(f"pretrained_model: {pretrained_model}")
+ logger.info(f"anisotropy: {anisotropy}")
+ logger.info("Total well shape/chunks:")
+ logger.info(f"{data_zyx.shape}")
+ logger.info(f"{data_zyx.chunks}")
+ if channel2:
+ logger.info("Dual channel input for cellpose model")
+ logger.info(f"{data_zyx_c2.shape}")
+ logger.info(f"{data_zyx_c2.chunks}")
+
+ # Counters for relabeling
+ if relabeling:
+ num_labels_tot=0
+
+ # Iterate over ROIs
+ num_ROIs=len(list_indices)
+
+ if output_ROI_table:
+ bbox_dataframe_list=[]
+
+ logger.info(f"Now starting loop over {num_ROIs} ROIs")
+ for i_ROI, indices in enumerate(list_indices):
+ # Define region
+ s_z,e_z,s_y,e_y,s_x,e_x=indices[:]
+ region=(
+ slice(s_z,e_z),
+ slice(s_y,e_y),
+ slice(s_x,e_x),
+ )
+ logger.info(f"Now processing ROI {i_ROI+1}/{num_ROIs}")
+
+ # Prepare single-channel or dual-channel input for cellpose
+ if channel2:
+ # Dual channel mode, first channel is the membrane channel
+ img_1=load_region(
+ data_zyx,
+ region,
+ compute=True,
+ return_as_3D=True,
+ )
+ img_np=np.zeros((2,*img_1.shape))
+ img_np[0,:,:,:]=img_1
+ img_np[1,:,:,:]=load_region(
+ data_zyx_c2,
+ region,
+ compute=True,
+ return_as_3D=True,
+ )
+ channels=[1,2]
+ else:
+ img_np=np.expand_dims(
+ load_region(data_zyx,region,compute=True,return_as_3D=True),
+ axis=0,
+ )
+ channels=[0,0]
+
+ # Prepare keyword arguments for segment_ROI function
+ kwargs_segment_ROI=dict(
+ model=model,
+ channels=channels,
+ do_3D=do_3D,
+ anisotropy=anisotropy,
+ label_dtype=label_dtype,
+ diameter=diameter_level0/coarsening_xy**level,
+ cellprob_threshold=cellprob_threshold,
+ flow_threshold=flow_threshold,
+ min_size=min_size,
+ augment=augment,
+ net_avg=net_avg,
+ )
+
+ # Prepare keyword arguments for preprocessing function
+ preprocessing_kwargs={}
+ if use_masks:
+ preprocessing_kwargs=dict(
+ region=region,
+ current_label_path=f"{zarrurl}/labels/{output_label_name}/0",
+ ROI_table_path=ROI_table_path,
+ ROI_positional_index=i_ROI,
+ )
+
+ # Call segment_ROI through the masked-loading wrapper, which includes
+ # pre/post-processing functions if needed
+ new_label_img=masked_loading_wrapper(
+ image_array=img_np,
+ function=segment_ROI,
+ kwargs=kwargs_segment_ROI,
+ use_masks=use_masks,
+ preprocessing_kwargs=preprocessing_kwargs,
+ )
+
+ # Shift labels and update relabeling counters
+ if relabeling:
+ num_labels_roi=np.max(new_label_img)
+ new_label_img[new_label_img>0]+=num_labels_tot
+ num_labels_tot+=num_labels_roi
+
+ # Write some logs
+ logger.info(f"ROI {indices}, {num_labels_roi=}, {num_labels_tot=}")
+
+ # Check that total number of labels is under control
+ if num_labels_tot > np.iinfo(label_dtype).max:
+ raise ValueError(
+ "ERROR in re-labeling: "
+ f"Reached {num_labels_tot} labels, "
+ f"but dtype={label_dtype}"
+ )
+
+ if output_ROI_table:
+ bbox_df=array_to_bounding_box_table(
+ new_label_img,
+ actual_res_pxl_sizes_zyx,
+ origin_zyx=(s_z,s_y,s_x),
+ )
+
+ bbox_dataframe_list.append(bbox_df)
+
+ overlap_list=[]
+ for df in bbox_dataframe_list:
+ overlap_list.extend(
+ get_overlapping_pairs_3D(df, full_res_pxl_sizes_zyx)
+ )
+ if len(overlap_list) > 0:
+ logger.warning(
+ f"{len(overlap_list)} bounding-box pairs overlap"
+ )
+
+ # Compute and store 0-th level to disk
+ da.array(new_label_img).to_zarr(
+ url=mask_zarr,
+ region=region,
+ compute=True,
+ )
+
+ logger.info(
+ f"End cellpose_segmentation task for {zarrurl}, "
+ "now building pyramids."
+ )
+
+ # Starting from on-disk highest-resolution data, build and write to disk a
+ # pyramid of coarser levels
+ build_pyramid(
+ zarrurl=f"{zarrurl}/labels/{output_label_name}",
+ overwrite=overwrite,
+ num_levels=num_levels,
+ coarsening_xy=coarsening_xy,
+ chunksize=chunks,
+ aggregation_function=np.max,
+ )
+
+ logger.info("End building pyramids")
+
+ if output_ROI_table:
+ # Handle the case where `bbox_dataframe_list` is empty (typically
+ # because list_indices is also empty)
+ if len(bbox_dataframe_list) == 0:
+ bbox_dataframe_list=[empty_bounding_box_table()]
+ # Concatenate all ROI dataframes
+ df_well=pd.concat(bbox_dataframe_list,axis=0,ignore_index=True)
+ df_well.index=df_well.index.astype(str)
+ # Extract labels and drop them from df_well
+ labels=pd.DataFrame(df_well["label"].astype(str))
+ df_well.drop(labels=["label"],axis=1,inplace=True)
+ # Convert all to float (warning: some would be int, in principle)
+ bbox_dtype=np.float32
+ df_well=df_well.astype(bbox_dtype)
+ # Convert to anndata
+ bbox_table=ad.AnnData(df_well,dtype=bbox_dtype)
+ bbox_table.obs=labels
+
+ # Write to zarr group
+ image_group=zarr.group(f"{in_path}/{component}")
+ logger.info(
+ "Now writing bounding-box ROI table to "
+ f"{in_path}/{component}/tables/{output_ROI_table}"
+ )
+ table_attrs={
+ "type":"masking_roi_table",
+ "region":{"path":f"../labels/{output_label_name}"},
+ "instance_key":"label",
+ }
+ write_table(
+ image_group,
+ output_ROI_table,
+ bbox_table,
+ overwrite=overwrite,
+ table_attrs=table_attrs,
+ )
+
+ return{}
+
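The relabeling step in the loop above keeps label values unique across ROIs by offsetting each ROI's labels by the number of labels already assigned. A minimal NumPy sketch of that bookkeeping follows, with two tiny hand-made label arrays standing in for Cellpose output; `relabel_rois` is a hypothetical helper written for illustration and is not part of the package.

```python
import numpy as np


def relabel_rois(label_images, label_dtype=np.uint32):
    """Offset the labels of each ROI so that they are unique overall."""
    num_labels_tot = 0
    relabeled = []
    for labels in label_images:
        labels = labels.astype(label_dtype)  # astype returns a copy
        num_labels_roi = int(labels.max())
        # Background (0) stays 0; foreground labels are shifted
        labels[labels > 0] += num_labels_tot
        num_labels_tot += num_labels_roi
        # Guard against exceeding what the label dtype can represent
        if num_labels_tot > np.iinfo(label_dtype).max:
            raise ValueError(
                f"Reached {num_labels_tot} labels, but dtype={label_dtype}"
            )
        relabeled.append(labels)
    return relabeled, num_labels_tot


roi_a = np.array([[0, 1], [2, 2]])
roi_b = np.array([[1, 0], [0, 3]])
relabeled, total = relabel_rois([roi_a, roi_b])
```

Note that the counter advances by the maximum label of each ROI rather than by the number of distinct labels, so gaps in the per-ROI labels are preserved.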
channels: Which channels to use. If only one channel is provided, [0,
+0] should be used. If two channels are provided (the first
+dimension of x has length of 2), [1, 2] should be used
+(x[0, :, :, :] contains the membrane channel and
+x[1, :, :, :] contains the nuclear channel).
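To make the two `channels` conventions concrete, here is a NumPy-only sketch of how the 4D input `x` could be assembled for each mode, mirroring the array handling in `cellpose_segmentation`. `build_cellpose_input` is a hypothetical helper and the arrays are random placeholders, so this only illustrates shapes and the matching `channels` value.

```python
from typing import Optional

import numpy as np


def build_cellpose_input(
    img_zyx: np.ndarray, img_zyx_c2: Optional[np.ndarray] = None
) -> tuple[np.ndarray, list[int]]:
    """Return the 4D (c, z, y, x) array and the matching `channels` list."""
    if img_zyx_c2 is None:
        # Single-channel mode: add a leading channel axis of length 1
        return np.expand_dims(img_zyx, axis=0), [0, 0]
    # Dual-channel mode: first entry is membrane, second is nuclear
    x = np.zeros((2, *img_zyx.shape))
    x[0] = img_zyx
    x[1] = img_zyx_c2
    return x, [1, 2]


rng = np.random.default_rng(0)
membrane = rng.random((3, 16, 16))
nuclei = rng.random((3, 16, 16))

x_single, channels_single = build_cellpose_input(membrane)
x_dual, channels_dual = build_cellpose_input(membrane, nuclei)
```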
def segment_ROI(
+ x:np.ndarray,
+ model:models.CellposeModel=None,
+ do_3D:bool=True,
+ channels:list[int]=[0,0],
+ anisotropy:Optional[float]=None,
+ diameter:float=30.0,
+ cellprob_threshold:float=0.0,
+ flow_threshold:float=0.4,
+ label_dtype:Optional[np.dtype]=None,
+ augment:bool=False,
+ net_avg:bool=False,
+ min_size:int=15,
+)->np.ndarray:
+"""
+ Internal function that runs Cellpose segmentation for a single ROI.
+
+ Args:
+ x: 4D numpy array.
+ model: An instance of `models.CellposeModel`.
+ do_3D: If `True`, cellpose runs in 3D mode: runs on xy, xz & yz planes,
+ then averages the flows.
+ channels: Which channels to use. If only one channel is provided, `[0,
+ 0]` should be used. If two channels are provided (the first
+ dimension of `x` has length of 2), `[1, 2]` should be used
+ (`x[0, :, :, :]` contains the membrane channel and
+ `x[1, :, :, :]` contains the nuclear channel).
+ anisotropy: Set anisotropy rescaling factor for Z dimension.
+ diameter: Expected object diameter in pixels for cellpose.
+ cellprob_threshold: Cellpose model parameter.
+ flow_threshold: Cellpose model parameter.
+ label_dtype: Label images are cast into this `np.dtype`.
+ augment: Whether to use cellpose augmentation to tile images with
+ overlap.
+ net_avg: Whether to use cellpose net averaging to run the 4 built-in
+ networks (useful for `nuclei`, `cyto` and `cyto2`, not sure it
+ works for the others).
+ min_size: Minimum size of the segmented objects.
+ """
+
+ # Write some debugging info
+ logger.info(
+ "[segment_ROI] START |"
+ f" x: {type(x)}, {x.shape} |"
+ f" {do_3D=} |"
+ f" {model.diam_mean=} |"
+ f" {diameter=} |"
+ f" {flow_threshold=}"
+ )
+
+ # Actual labeling
+ t0=time.perf_counter()
+ mask,_,_=model.eval(
+ x,
+ channels=channels,
+ do_3D=do_3D,
+ net_avg=net_avg,
+ augment=augment,
+ diameter=diameter,
+ anisotropy=anisotropy,
+ cellprob_threshold=cellprob_threshold,
+ flow_threshold=flow_threshold,
+ min_size=min_size,
+ )
+
+ if mask.ndim == 2:
+ # If we get a 2D image, we still return it as a 3D array
+ mask=np.expand_dims(mask,axis=0)
+ t1=time.perf_counter()
+
+ # Write some debugging info
+ logger.info(
+ "[segment_ROI] END |"
+ f" Elapsed: {t1-t0:.3f} s |"
+ f" {mask.shape=},"
+ f" {mask.dtype=} (then {label_dtype}),"
+ f" {np.max(mask)=} |"
+ f" {model.diam_mean=} |"
+ f" {diameter=} |"
+ f" {flow_threshold=}"
+ )
+
+ return mask.astype(label_dtype)
+
input_paths: List of input paths where the image data is stored as
+OME-Zarrs. Should point to the parent folder containing one or many
+OME-Zarr files, not the actual OME-Zarr file. Example:
+["/some/path/"]. This task only supports a single input path.
+(standard argument for Fractal tasks, managed by Fractal server).
output_path: Path where the output of this task is stored. Examples:
+"/some/path/" => puts the new OME-Zarr file in the same folder as
+the input OME-Zarr file; "/some/new_path" => puts the new OME-Zarr
+file into a new folder at /some/new_path. (standard argument for
+Fractal tasks, managed by Fractal server).
metadata: Dictionary containing metadata about the OME-Zarr. This task
+requires the following elements to be present in the metadata:
+plate: List of plates
+(e.g. ["MyPlate.zarr"]);
+well: List of wells in the OME-Zarr plate
+(e.g. ["MyPlate.zarr/B/03", "MyPlate.zarr/B/05"]);
+image: List of images in the OME-Zarr plate
+(e.g. ["MyPlate.zarr/B/03/0", "MyPlate.zarr/B/05/0"]).
+(standard argument for Fractal tasks, managed by Fractal server).
@validate_arguments
+def copy_ome_zarr(
+ *,
+ input_paths:Sequence[str],
+ output_path:str,
+ metadata:dict[str,Any],
+ project_to_2D:bool=True,
+ suffix:str="mip",
+ ROI_table_names:tuple[str,...]=("FOV_ROI_table","well_ROI_table"),
+ overwrite:bool=False,
+)->dict[str,Any]:
+
+"""
+ Duplicate an input zarr structure to a new path.
+
+ This task copies all the structure, but none of the image data:
+
+ - For each plate, create a new zarr group with the same attributes as
+ the original one.
+ - For each well (in each plate), create a new zarr subgroup with the
+ same attributes as the original one.
+ - For each image (in each well), create a new zarr subgroup with the
+ same attributes as the original one.
+ - For each image (in each well), copy the relevant AnnData tables from
+ the original source.
+
+ Note: this task makes use of methods from the `Attributes` class, see
+ https://zarr.readthedocs.io/en/stable/api/attrs.html.
+
+ Args:
+ input_paths: List of input paths where the image data is stored as
+ OME-Zarrs. Should point to the parent folder containing one or many
+ OME-Zarr files, not the actual OME-Zarr file. Example:
+ `["/some/path/"]`. This task only supports a single input path.
+ (standard argument for Fractal tasks, managed by Fractal server).
+ output_path: Path where the output of this task is stored. Examples:
+ `"/some/path/"` => puts the new OME-Zarr file in the same folder as
+ the input OME-Zarr file; `"/some/new_path"` => puts the new OME-Zarr
+ file into a new folder at `/some/new_path`. (standard argument for
+ Fractal tasks, managed by Fractal server).
+ metadata: Dictionary containing metadata about the OME-Zarr. This task
+ requires the following elements to be present in the metadata:
+ `plate`: List of plates
+ (e.g. `["MyPlate.zarr"]`);
+ `well`: List of wells in the OME-Zarr plate
+ (e.g. `["MyPlate.zarr/B/03", "MyPlate.zarr/B/05"]`);
+ `image`: List of images in the OME-Zarr plate
+ (e.g. `["MyPlate.zarr/B/03/0", "MyPlate.zarr/B/05/0"]`).
+ (standard argument for Fractal tasks, managed by Fractal server).
+ project_to_2D: If `True`, apply a 3D->2D projection to the ROI tables
+ that are copied to the new OME-Zarr.
+ suffix: The suffix that is used to transform `plate.zarr` into
+ `plate_suffix.zarr`. Note that `None` is not currently supported.
+ ROI_table_names: List of Anndata table names to be copied. Note:
+ copying non-ROI tables may fail if `project_to_2D=True`.
+ overwrite: If `True`, overwrite the task output.
+
+ Returns:
+ An update to the metadata table with new `plate`, `well`, `image`
+ entries (now with the suffix in the plate name).
+ """
+
+ # Preliminary check
+ if len(input_paths) > 1:
+ raise NotImplementedError
+ if suffix is None:
+ # FIXME create a standard suffix (with timestamp)
+ raise NotImplementedError
+
+ # List all plates
+ in_path=Path(input_paths[0])
+ list_plates=[
+ p.as_posix()
+ for p in Path(in_path).glob("*.zarr")
+ if p.name in metadata["plate"]
+ ]
+ logger.info(f"{list_plates=}")
+
+ meta_update:dict[str,Any]={"copy_ome_zarr":{}}
+ meta_update["copy_ome_zarr"]["suffix"]=suffix
+ meta_update["copy_ome_zarr"]["sources"]={}
+
+ # Loop over all plates
+ for zarrurl_old in list_plates:
+ zarrfile=zarrurl_old.split("/")[-1]
+ old_plate_name=zarrfile.split(".zarr")[0]
+ new_plate_name=f"{old_plate_name}_{suffix}"
+ new_plate_dir=Path(output_path).resolve()
+ zarrurl_new=f"{(new_plate_dir/new_plate_name).as_posix()}.zarr"
+ meta_update["copy_ome_zarr"]["sources"][new_plate_name]=zarrurl_old
+
+ logger.info(f"{zarrurl_old=}")
+ logger.info(f"{zarrurl_new=}")
+ logger.info(f"{meta_update=}")
+
+ # Replicate plate attrs
+ old_plate_group=zarr.open_group(zarrurl_old,mode="r")
+ new_plate_group=open_zarr_group_with_overwrite(
+ zarrurl_new,overwrite=overwrite
+ )
+ new_plate_group.attrs.put(old_plate_group.attrs.asdict())
+
+ well_paths=[
+ well["path"] for well in new_plate_group.attrs["plate"]["wells"]
+ ]
+ logger.info(f"{well_paths=}")
+ for well_path in well_paths:
+
+ # Replicate well attrs
+ old_well_group=zarr.open_group(
+ f"{zarrurl_old}/{well_path}",mode="r"
+ )
+ new_well_group=zarr.group(f"{zarrurl_new}/{well_path}")
+ new_well_group.attrs.put(old_well_group.attrs.asdict())
+
+ image_paths=[
+ image["path"]
+ for image in new_well_group.attrs["well"]["images"]
+ ]
+ logger.info(f"{image_paths=}")
+
+ for image_path in image_paths:
+
+ # Replicate image attrs
+ old_image_group=zarr.open_group(
+ f"{zarrurl_old}/{well_path}/{image_path}",mode="r"
+ )
+ new_image_group=zarr.group(
+ f"{zarrurl_new}/{well_path}/{image_path}"
+ )
+ new_image_group.attrs.put(old_image_group.attrs.asdict())
+
+ # Extract pixel sizes, if needed
+ if ROI_table_names:
+
+ if project_to_2D:
+ path_image=f"{zarrurl_old}/{well_path}/{image_path}"
+ ngff_image_meta=load_NgffImageMeta(path_image)
+ pxl_sizes_zyx=ngff_image_meta.get_pixel_sizes_zyx(
+ level=0
+ )
+ pxl_size_z=pxl_sizes_zyx[0]
+
+ # Copy the tables in ROI_table_names
+ for ROI_table_name in ROI_table_names:
+
+ logger.info(
+ f"I will now read {ROI_table_name} from "
+ f"{zarrurl_old=}, convert it to 2D, and "
+ "write it back to the new zarr file."
+ )
+ new_ROI_table=ad.read_zarr(
+ f"{zarrurl_old}/{well_path}/{image_path}/"
+ f"tables/{ROI_table_name}"
+ )
+ old_ROI_table_attrs=zarr.open_group(
+ f"{zarrurl_old}/{well_path}/{image_path}/"
+ f"tables/{ROI_table_name}"
+ ).attrs.asdict()
+ # Convert 3D ROIs to 2D
+ if project_to_2D:
+ new_ROI_table=convert_ROIs_from_3D_to_2D(
+ new_ROI_table,pxl_size_z
+ )
+ # Write new table
+ write_table(
+ new_image_group,
+ ROI_table_name,
+ new_ROI_table,
+ table_attrs=old_ROI_table_attrs,
+ )
+
+ for key in ["plate", "well", "image"]:
+ meta_update[key] = [
+ component.replace(".zarr", f"_{suffix}.zarr")
+ for component in metadata[key]
+ ]
+
+ return meta_update
+
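The final loop of the listing rewrites every plate, well and image component so that it points to the suffixed zarr file. A minimal sketch of that renaming step follows; the function name `rename_components`, the example metadata and the `"mip"` suffix are illustrative assumptions, not values taken from the task.

```python
def rename_components(metadata: dict, suffix: str) -> dict:
    """Return plate/well/image components pointing at the suffixed zarr."""
    meta_update = {}
    for key in ["plate", "well", "image"]:
        meta_update[key] = [
            component.replace(".zarr", f"_{suffix}.zarr")
            for component in metadata[key]
        ]
    return meta_update


# Hypothetical metadata, for illustration only
example_metadata = {
    "plate": ["my_plate.zarr"],
    "well": ["my_plate.zarr/B/03"],
    "image": ["my_plate.zarr/B/03/0"],
}
renamed = rename_components(example_metadata, suffix="mip")
print(renamed["image"])  # ['my_plate_mip.zarr/B/03/0']
```

Because the replacement acts on the `.zarr` substring, the well and image parts of each component are left unchanged.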
Create an OME-NGFF zarr folder, without reading/writing image data.
+
1. Find plates (for each folder in input_paths):
+    - glob image files,
+    - parse metadata from image filename to identify plates,
+    - identify populated channels.
+
2. Create a zarr folder (for each plate):
+    - parse mlf metadata,
+    - identify wells and fields of view (FOV),
+    - create FOV ZARR,
+    - verify that channels are uniform (i.e., same channels).
+
PARAMETER
+
DESCRIPTION
+
+
+
+
+
input_paths
+
+
+
List of input paths where the image data from
+the microscope is stored (as TIF or PNG). Should point to the
+parent folder containing the images and the metadata files
+MeasurementData.mlf and MeasurementDetail.mrf (if present).
+Example: ["/some/path/"].
+(standard argument for Fractal tasks, managed by Fractal server).
Path where the output of this task is stored.
+Example: "/some/path/" => puts the new OME-Zarr file in
+"/some/path/".
+(standard argument for Fractal tasks, managed by Fractal server).
A list of OmeroChannels, where each channel must
+include the wavelength_id attribute and where the
+wavelength_id values must be unique across the list.
If specified, only parse images with filenames
+that match all of these patterns. Patterns must be defined as in
+https://docs.python.org/3/library/fnmatch.html. Examples:
+image_glob_pattern=["*_B03_*"] => only process well B03;
+image_glob_pattern=["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"] =>
+only process well C09, field of view 16 and Z planes 0-59.
If None, parse Yokogawa metadata from mrf/mlf
+files in the input_path folder; else, the full path to a csv file
+containing the parsed metadata table.
A metadata dictionary containing important metadata about the OME-Zarr
+plate, the images and some parameters required by downstream tasks
+(like num_levels).
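The glob-pattern rule described above (a filename is kept only if it matches every pattern) can be sketched with the standard-library `fnmatch` module that the description links to. The helper name `matches_all_patterns` and the filenames are hypothetical; the task's own filtering code is not shown here and may differ.

```python
import fnmatch


def matches_all_patterns(filename: str, patterns: list[str]) -> bool:
    """Return True only if `filename` matches every glob pattern."""
    return all(fnmatch.fnmatch(filename, pattern) for pattern in patterns)


# Hypothetical Yokogawa-style filenames, for illustration only
filenames = [
    "plate_C09_T0001F016L01A01Z05C01.tif",
    "plate_C09_T0001F016L01A01Z75C01.tif",  # Z plane outside 0-59
    "plate_B03_T0001F016L01A01Z05C01.tif",  # wrong well
]
patterns = ["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"]
selected = [
    name for name in filenames if matches_all_patterns(name, patterns)
]
print(selected)
```

Only the first filename satisfies all three patterns at once, which is what "match all of these patterns" implies.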
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/tasks/create_ome_zarr.py
+
Create OME-NGFF structure and metadata to host a multiplexing dataset.
+
This task takes a set of image folders (i.e. different acquisition cycles)
+and builds the internal structure and metadata of an OME-NGFF zarr group,
+without actually loading/writing the image data.
+
Each element in input_paths should be treated as a different acquisition.
+
+
+
+
+
+
+
PARAMETER
+
DESCRIPTION
+
+
+
+
+
input_paths
+
+
+
List of input paths where the image data from the
+microscope is stored (as TIF or PNG). Each element of the list is
+treated as another cycle of the multiplexing data; the cycles are
+ordered by their position in this list. Each path should point to the
+parent folder containing the images and the metadata files
+MeasurementData.mlf and MeasurementDetail.mrf (if present).
+Example: ["/path/cycle1/", "/path/cycle2/"]. (standard argument
+for Fractal tasks, managed by Fractal server).
Path where the output of this task is stored.
+Example: "/some/path/" => puts the new OME-Zarr file in
+"/some/path/".
+(standard argument for Fractal tasks, managed by Fractal server).
A dictionary of lists of OmeroChannels, where
+each channel must include the wavelength_id attribute and where
+the wavelength_id values must be unique across each list.
+Dictionary keys represent channel indices ("0","1",..).
If specified, only parse images with filenames
+that match all of these patterns. Patterns must be defined as in
+https://docs.python.org/3/library/fnmatch.html. Examples:
+image_glob_pattern=["*_B03_*"] => only process well B03;
+image_glob_pattern=["*_C09_*", "*F016*", "*Z[0-5][0-9]C*"] =>
+only process well C09, field of view 16 and Z planes 0-59.
If None, parse Yokogawa metadata from mrf/mlf
+files in the input_path folder; else, a dictionary of key-value
+pairs like (acquisition, path) with acquisition a string
+and path pointing to a csv file containing the parsed metadata
+table.
A metadata dictionary containing important metadata about the OME-Zarr
+plate, the images and some parameters required by downstream tasks
+(like num_levels).
+
+
+
+
+
+
+
+ Source code in fractal_tasks_core/tasks/create_ome_zarr_multiplex.py
+
def correct(
+    img_stack: np.ndarray,
+    corr_img: np.ndarray,
+    background: int = 110,
+):
+    """
+    Corrects a stack of images, using a given illumination profile (e.g. bright
+    in the center of the image, dim outside).
+
+    Args:
+        img_stack: 4D numpy array (czyx), with dummy size along c.
+        corr_img: 2D numpy array (yx)
+        background: Background value that is subtracted from the image before
+            the illumination correction is applied.
+    """
+
+    logger.info(f"Start correct, {img_stack.shape}")
+
+    # Check shapes
+    if corr_img.shape != img_stack.shape[2:] or img_stack.shape[0] != 1:
+        raise ValueError(
+            "Error in illumination_correction:\n"
+            f"{img_stack.shape=}\n{corr_img.shape=}"
+        )
+
+    # Store info about dtype
+    dtype = img_stack.dtype
+    dtype_max = np.iinfo(dtype).max
+
+    # Background subtraction
+    img_stack[img_stack <= background] = 0
+    img_stack[img_stack > background] -= background
+
+    # Apply the normalized correction matrix (requires a float array)
+    # img_stack = img_stack.astype(np.float64)
+    new_img_stack = img_stack / (corr_img / np.max(corr_img))[None, None, :, :]
+
+    # Handle edge case: corrected image may have values beyond the limit of
+    # the encoding, e.g. beyond 65535 for 16bit images. This clips values
+    # that surpass this limit and triggers a warning
+    if np.sum(new_img_stack > dtype_max) > 0:
+        warnings.warn(
+            "Illumination correction created values beyond the max range of "
+            f"the current image type. These have been clipped to {dtype_max=}."
+        )
+        new_img_stack[new_img_stack > dtype_max] = dtype_max
+
+    logger.info("End correct")
+
+    # Cast back to original dtype and return
+    return new_img_stack.astype(dtype)
+
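To make the arithmetic of `correct` concrete, here is a small worked sketch on a 2x2 image. It follows the same three steps (background subtraction, division by the normalized profile, clipping to the dtype range) but is not the library function: `correct_sketch` is a hypothetical name, it works on a float copy instead of modifying the input in place, and it omits the shape checks, logging and warning. It assumes NumPy is installed.

```python
import numpy as np


def correct_sketch(
    img_stack: np.ndarray, corr_img: np.ndarray, background: int = 110
) -> np.ndarray:
    """Illustrative re-implementation of the correction steps."""
    dtype = img_stack.dtype
    dtype_max = np.iinfo(dtype).max
    # Background subtraction on a float copy, flooring at zero
    img = np.clip(img_stack.astype(np.float64) - background, 0, None)
    # Divide by the profile, normalized so that its maximum is 1
    img = img / (corr_img / np.max(corr_img))[None, None, :, :]
    # Clip values that would overflow the original integer type
    img = np.clip(img, 0, dtype_max)
    return img.astype(dtype)


# One channel, one Z plane, 2x2 pixels (czyx)
img_stack = np.array([[[[110, 210], [310, 120]]]], dtype=np.uint16)
# Profile that is twice as bright on the diagonal as off the diagonal
corr_img = np.array([[1.0, 0.5], [0.5, 1.0]])

corrected = correct_sketch(img_stack, corr_img, background=110)
print(corrected.tolist())  # [[[[0, 200], [400, 10]]]]
```

Pixels under the dim part of the profile (value 0.5) are doubled after background subtraction, while pixels under the brightest part are only background-subtracted.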
Applies illumination correction to the images in the OME-Zarr.
+
+
+
+
+
+
+
PARAMETER
+
DESCRIPTION
+
input_paths
+
List of input paths where the image data is stored as
+OME-Zarrs. Should point to the parent folder containing one or many
+OME-Zarr files, not the actual OME-Zarr file. Example:
+["/some/path/"]. This task only supports a single input path.
+(standard argument for Fractal tasks, managed by Fractal server).
+
output_path
+
Path where the output of this task is stored. Examples:
+"/some/path/" => puts the new OME-Zarr file in the same folder as
+the input OME-Zarr file; "/some/new_path" => puts the new
+OME-Zarr file into a new folder at /some/new_path.
+(standard argument for Fractal tasks, managed by Fractal server).
+
component
+
Path to the OME-Zarr image in the OME-Zarr plate that is
+processed. Example: "some_plate.zarr/B/03/0".
+(standard argument for Fractal tasks, managed by Fractal server).
+
metadata
+
This parameter is not used by this task.
+(standard argument for Fractal tasks, managed by Fractal server).
+
illumination_profiles_folder
+
Path of folder of illumination profiles.
+
dict_corr
+
Dictionary where keys match the wavelength_id attributes
+of existing channels (e.g. A01_C01) and values are the
+filenames of the corresponding illumination profiles.
+
background
+
Background value that is subtracted from the image before
+the illumination correction is applied. Set it to 0 if you don't
+want any background subtraction.
+
overwrite_input
+
If True, the results of this task will overwrite the input image
+data. In the current version, overwrite_input=False is not
+implemented.
+
new_component
+
Not implemented yet. This is not implemented well in
+Fractal server at the moment, it's unclear how a user would specify
+fitting new components. If the results shall not overwrite the
+input data and the output path is the same as the input path, a new
+component needs to be provided.
+Example: myplate_new_name.zarr/B/03/0/.
+
@validate_arguments
+def illumination_correction(
+    *,
+    # Standard arguments
+    input_paths: Sequence[str],
+    output_path: str,
+    component: str,
+    metadata: dict[str, Any],
+    # Task-specific arguments
+    illumination_profiles_folder: str,
+    dict_corr: dict[str, str],
+    background: int = 110,
+    overwrite_input: bool = True,
+    new_component: Optional[str] = None,
+) -> dict[str, Any]:
+
+    """
+    Applies illumination correction to the images in the OME-Zarr.
+
+    Args:
+        input_paths: List of input paths where the image data is stored as
+            OME-Zarrs. Should point to the parent folder containing one or many
+            OME-Zarr files, not the actual OME-Zarr file. Example:
+            `["/some/path/"]`. This task only supports a single input path.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        output_path: Path where the output of this task is stored. Examples:
+            `"/some/path/"` => puts the new OME-Zarr file in the same folder as
+            the input OME-Zarr file; `"/some/new_path"` => puts the new
+            OME-Zarr file into a new folder at `/some/new_path`.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        component: Path to the OME-Zarr image in the OME-Zarr plate that is
+            processed. Example: `"some_plate.zarr/B/03/0"`.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        metadata: This parameter is not used by this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        illumination_profiles_folder: Path of folder of illumination profiles.
+        dict_corr: Dictionary where keys match the `wavelength_id` attributes
+            of existing channels (e.g. `A01_C01` ) and values are the
+            filenames of the corresponding illumination profiles.
+        background: Background value that is subtracted from the image before
+            the illumination correction is applied. Set it to `0` if you don't
+            want any background subtraction.
+        overwrite_input:
+            If `True`, the results of this task will overwrite the input image
+            data. In the current version, `overwrite_input=False` is not
+            implemented.
+        new_component: Not implemented yet. This is not implemented well in
+            Fractal server at the moment, it's unclear how a user would specify
+            fitting new components. If the results shall not overwrite the
+            input data and the output path is the same as the input path, a new
+            component needs to be provided.
+            Example: `myplate_new_name.zarr/B/03/0/`.
+    """
+
+    # Preliminary checks
+    if len(input_paths) > 1:
+        raise NotImplementedError
+    if (overwrite_input and new_component is not None) or (
+        new_component is None and not overwrite_input
+    ):
+        raise ValueError(f"{overwrite_input=}, but {new_component=}")
+
+    if not overwrite_input:
+        msg = (
+            "We still have to harmonize illumination_correction("
+            "overwrite_input=False) with replicate_zarr_structure(..., "
+            "suffix=..)"
+        )
+        raise NotImplementedError(msg)
+
+    # Define old/new zarrurls
+    plate, well = component.split(".zarr/")
+    in_path = Path(input_paths[0])
+    zarrurl_old = (in_path / component).as_posix()
+    if overwrite_input:
+        zarrurl_new = zarrurl_old
+    else:
+        new_plate, new_well = new_component.split(".zarr/")
+        if new_well != well:
+            raise ValueError(f"{well=}, {new_well=}")
+        zarrurl_new = (Path(output_path) / new_component).as_posix()
+
+    t_start = time.perf_counter()
+    logger.info("Start illumination_correction")
+    logger.info(f"  {overwrite_input=}")
+    logger.info(f"  {zarrurl_old=}")
+    logger.info(f"  {zarrurl_new=}")
+
+    # Read attributes from NGFF metadata
+    ngff_image_meta = load_NgffImageMeta(zarrurl_old)
+    num_levels = ngff_image_meta.num_levels
+    coarsening_xy = ngff_image_meta.coarsening_xy
+    full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)
+    logger.info(f"NGFF image has {num_levels=}")
+    logger.info(f"NGFF image has {coarsening_xy=}")
+    logger.info(
+        f"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}"
+    )
+
+    # Read channels from .zattrs
+    channels: list[OmeroChannel] = get_omero_channel_list(
+        image_zarr_path=zarrurl_old
+    )
+    num_channels = len(channels)
+
+    # Read FOV ROIs
+    FOV_ROI_table = ad.read_zarr(f"{zarrurl_old}/tables/FOV_ROI_table")
+
+    # Create list of indices for 3D FOVs spanning the entire Z direction
+    list_indices = convert_ROI_table_to_indices(
+        FOV_ROI_table,
+        level=0,
+        coarsening_xy=coarsening_xy,
+        full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,
+    )
+    check_valid_ROI_indices(list_indices, "FOV_ROI_table")
+
+    # Extract image size from FOV-ROI indices. Note: this works at level=0,
+    # where FOVs should all be of the exact same size (in pixels)
+    ref_img_size = None
+    for indices in list_indices:
+        img_size = (indices[3] - indices[2], indices[5] - indices[4])
+        if ref_img_size is None:
+            ref_img_size = img_size
+        else:
+            if img_size != ref_img_size:
+                raise ValueError(
+                    "ERROR: inconsistent image sizes in list_indices"
+                )
+    img_size_y, img_size_x = img_size[:]
+
+    # Assemble dictionary of matrices and check their shapes
+    corrections = {}
+    for channel in channels:
+        wavelength_id = channel.wavelength_id
+        corrections[wavelength_id] = imread(
+            (
+                Path(illumination_profiles_folder) / dict_corr[wavelength_id]
+            ).as_posix()
+        )
+        if corrections[wavelength_id].shape != (img_size_y, img_size_x):
+            raise ValueError(
+                "Error in illumination_correction, "
+                "correction matrix has wrong shape."
+            )
+    # Lazily load highest-res level from original zarr array
+    data_czyx = da.from_zarr(f"{zarrurl_old}/0")
+
+    # Create zarr for output
+    if overwrite_input:
+        fov_path = zarrurl_old
+        new_zarr = zarr.open(f"{zarrurl_old}/0")
+    else:
+        fov_path = zarrurl_new
+        new_zarr = zarr.create(
+            shape=data_czyx.shape,
+            chunks=data_czyx.chunksize,
+            dtype=data_czyx.dtype,
+            store=zarr.storage.FSStore(f"{zarrurl_new}/0"),
+            overwrite=False,
+            dimension_separator="/",
+        )
+
+    # Iterate over FOV ROIs
+    num_ROIs = len(list_indices)
+    for i_c, channel in enumerate(channels):
+        for i_ROI, indices in enumerate(list_indices):
+            # Define region
+            s_z, e_z, s_y, e_y, s_x, e_x = indices[:]
+            region = (
+                slice(i_c, i_c + 1),
+                slice(s_z, e_z),
+                slice(s_y, e_y),
+                slice(s_x, e_x),
+            )
+            logger.info(
+                f"Now processing ROI {i_ROI+1}/{num_ROIs} "
+                f"for channel {i_c+1}/{num_channels}"
+            )
+            # Execute illumination correction
+            corrected_fov = correct(
+                data_czyx[region].compute(),
+                corrections[channel.wavelength_id],
+                background=background,
+            )
+            # Write to disk
+            da.array(corrected_fov).to_zarr(
+                url=new_zarr,
+                region=region,
+                compute=True,
+            )
+
+    # Starting from on-disk highest-resolution data, build and write to disk a
+    # pyramid of coarser levels
+    build_pyramid(
+        zarrurl=fov_path,
+        overwrite=True,
+        num_levels=num_levels,
+        coarsening_xy=coarsening_xy,
+        chunksize=data_czyx.chunksize,
+    )
+
+    t_end = time.perf_counter()
+    logger.info(f"End illumination_correction, elapsed: {t_end-t_start}")
+
+    return {}
+
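Two steps of the task body lend themselves to a standalone illustration: checking that all FOV ROIs have the same size in pixels, and turning one set of ROI indices into the four-dimensional (czyx) region that is read and written. The sketch below uses plain Python lists; the helper names and the index values are made up for illustration, and the index layout `[s_z, e_z, s_y, e_y, s_x, e_x]` is taken from the listing above.

```python
def common_fov_size(list_indices: list[list[int]]) -> tuple[int, int]:
    """Return the (y, x) size shared by all ROIs, or raise if they differ."""
    sizes = {(ind[3] - ind[2], ind[5] - ind[4]) for ind in list_indices}
    if len(sizes) != 1:
        raise ValueError(f"Inconsistent or missing image sizes: {sizes}")
    return sizes.pop()


def region_for(indices: list[int], i_c: int) -> tuple[slice, ...]:
    """Build the czyx region for channel `i_c` and one ROI."""
    s_z, e_z, s_y, e_y, s_x, e_x = indices
    return (
        slice(i_c, i_c + 1),
        slice(s_z, e_z),
        slice(s_y, e_y),
        slice(s_x, e_x),
    )


# Two hypothetical side-by-side FOVs, 10 Z planes each
list_indices = [
    [0, 10, 0, 2160, 0, 2560],
    [0, 10, 0, 2160, 2560, 5120],
]
fov_size = common_fov_size(list_indices)
region = region_for(list_indices[1], i_c=1)
print(fov_size)
print(region)
```

Selecting a single channel with `slice(i_c, i_c + 1)` rather than a bare index keeps the channel axis, which matches the "dummy size along c" that `correct` expects.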
Import an OME-Zarr into Fractal.
+
The current version of this task:
+
1. Creates the appropriate components-related metadata, needed for
+processing an existing OME-Zarr through Fractal.
+
2. Optionally adds new ROI tables to the existing OME-Zarr.
+
+
+
+
+
+
+
+
PARAMETER
+
DESCRIPTION
+
+
+
+
+
input_paths
+
+
+
A length-one list with the parent folder of the OME-Zarr
+to be imported; e.g. input_paths=["/somewhere"], if the OME-Zarr
+path is /somewhere/array.zarr.
+(standard argument for Fractal tasks, managed by Fractal server).
@validate_arguments
+def import_ome_zarr(
+    *,
+    input_paths: Sequence[str],
+    output_path: str,
+    metadata: dict[str, Any],
+    zarr_name: str,
+    add_image_ROI_table: bool = True,
+    add_grid_ROI_table: bool = True,
+    grid_y_shape: int = 2,
+    grid_x_shape: int = 2,
+    update_omero_metadata: bool = True,
+    overwrite: bool = False,
+) -> dict[str, Any]:
+    """
+    Import an OME-Zarr into Fractal.
+
+    The current version of this task:
+
+    1. Creates the appropriate components-related metadata, needed for
+       processing an existing OME-Zarr through Fractal.
+    2. Optionally adds new ROI tables to the existing OME-Zarr.
+
+    Args:
+        input_paths: A length-one list with the parent folder of the OME-Zarr
+            to be imported; e.g. `input_paths=["/somewhere"]`, if the OME-Zarr
+            path is `/somewhere/array.zarr`.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        output_path: Not used in this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        metadata: Not used in this task.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        zarr_name: The OME-Zarr name, without its parent folder; e.g.
+            `zarr_name="array.zarr"`, if the OME-Zarr path is
+            `/somewhere/array.zarr`.
+        add_image_ROI_table: Whether to add a `image_ROI_table` table to each
+            image, with a single ROI covering the whole image.
+        add_grid_ROI_table: Whether to add a `grid_ROI_table` table to each
+            image, with the image split into a rectangular grid of ROIs.
+        grid_y_shape: Y shape of the ROI grid in `grid_ROI_table`.
+        grid_x_shape: X shape of the ROI grid in `grid_ROI_table`.
+        update_omero_metadata: Whether to update Omero-channels metadata, to
+            make them Fractal-compatible.
+        overwrite: Whether new ROI tables (added when `add_image_ROI_table`
+            and/or `add_grid_ROI_table` are `True`) can overwrite existing ones.
+    """
+
+    # Preliminary checks
+    if len(input_paths) > 1:
+        raise NotImplementedError
+
+    zarr_path = str(Path(input_paths[0]) / zarr_name)
+    logger.info(f"Zarr path: {zarr_path}")
+
+    zarrurls: dict = dict(plate=[], well=[], image=[])
+
+    root_group = zarr.open_group(zarr_path, mode="r")
+    ngff_type = detect_ome_ngff_type(root_group)
+    grid_YX_shape = (grid_y_shape, grid_x_shape)
+
+    if ngff_type == "plate":
+        zarrurls["plate"].append(zarr_name)
+        for well in root_group.attrs["plate"]["wells"]:
+            well_path = well["path"]
+            zarrurls["well"].append(f"{zarr_name}/{well_path}")
+
+            well_group = zarr.open_group(zarr_path, path=well_path, mode="r")
+            for image in well_group.attrs["well"]["images"]:
+                image_path = image["path"]
+                zarrurls["image"].append(
+                    f"{zarr_name}/{well_path}/{image_path}"
+                )
+                _process_single_image(
+                    f"{zarr_path}/{well_path}/{image_path}",
+                    add_image_ROI_table,
+                    add_grid_ROI_table,
+                    update_omero_metadata,
+                    grid_YX_shape=grid_YX_shape,
+                    overwrite=overwrite,
+                )
+    elif ngff_type == "well":
+        zarrurls["well"].append(zarr_name)
+        logger.warning(
+            "Only OME-Zarr for plates are fully supported in Fractal; "
+            f"e.g. the current one ({ngff_type=}) cannot be "
+            "processed via the `maximum_intensity_projection` task."
+        )
+        for image in root_group.attrs["well"]["images"]:
+            image_path = image["path"]
+            zarrurls["image"].append(f"{zarr_name}/{image_path}")
+            _process_single_image(
+                f"{zarr_path}/{image_path}",
+                add_image_ROI_table,
+                add_grid_ROI_table,
+                update_omero_metadata,
+                grid_YX_shape=grid_YX_shape,
+                overwrite=overwrite,
+            )
+    elif ngff_type == "image":
+        zarrurls["image"].append(zarr_name)
+        logger.warning(
+            "Only OME-Zarr for plates are fully supported in Fractal; "
+            f"e.g. the current one ({ngff_type=}) cannot be "
+            "processed via the `maximum_intensity_projection` task."
+        )
+        _process_single_image(
+            zarr_path,
+            add_image_ROI_table,
+            add_grid_ROI_table,
+            update_omero_metadata,
+            grid_YX_shape=grid_YX_shape,
+            overwrite=overwrite,
+        )
+
+    # Remove zarrurls keys pointing to empty lists
+    clean_zarrurls = {
+        key: value for key, value in zarrurls.items() if len(value) > 0
+    }
+
+    return clean_zarrurls
+
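The plate branch of `import_ome_zarr` essentially walks the plate and well attributes and collects one component string per plate, well and image. The sketch below reproduces only that bookkeeping on plain dictionaries, without opening any zarr group or calling `_process_single_image`; the helper name `list_plate_components` and the attribute contents are illustrative assumptions.

```python
def list_plate_components(
    zarr_name: str, plate_attrs: dict, well_attrs_by_path: dict
) -> dict:
    """Collect plate/well/image components from in-memory attributes."""
    zarrurls = dict(plate=[zarr_name], well=[], image=[])
    for well in plate_attrs["plate"]["wells"]:
        well_path = well["path"]
        zarrurls["well"].append(f"{zarr_name}/{well_path}")
        for image in well_attrs_by_path[well_path]["well"]["images"]:
            image_path = image["path"]
            zarrurls["image"].append(f"{zarr_name}/{well_path}/{image_path}")
    # Drop keys pointing to empty lists, as the task does
    return {key: value for key, value in zarrurls.items() if len(value) > 0}


# Hypothetical attributes of a plate with one well and two images
plate_attrs = {"plate": {"wells": [{"path": "B/03"}]}}
well_attrs_by_path = {
    "B/03": {"well": {"images": [{"path": "0"}, {"path": "1"}]}}
}
components = list_plate_components(
    "array.zarr", plate_attrs, well_attrs_by_path
)
print(components)
```

For a well-type or image-type OME-Zarr the returned dictionary would simply lack the `plate` (and possibly `well`) key, because empty lists are removed at the end.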
Perform maximum-intensity projection along Z axis.
+
Note: this task stores the output in a new zarr file.
+
+
+
+
+
+
+
PARAMETER
+
DESCRIPTION
+
input_paths
+
This parameter is not used by this task.
+This task only supports a single input path.
+(standard argument for Fractal tasks, managed by Fractal server).
+
output_path
+
Path where the output of this task is stored.
+Example: "/some/path/" => puts the new OME-Zarr file in that
+folder.
+(standard argument for Fractal tasks, managed by Fractal server).
+
component
+
Path to the OME-Zarr image in the OME-Zarr plate that
+is processed. Component is typically changed by the copy_ome_zarr
+task before, to point to a new mip Zarr file.
+Example: "some_plate_mip.zarr/B/03/0".
+(standard argument for Fractal tasks, managed by Fractal server).
+
metadata
+
Dictionary containing metadata about the OME-Zarr.
+This task requires the key copy_ome_zarr to be present in the
+metadata (as defined in copy_ome_zarr task).
+(standard argument for Fractal tasks, managed by Fractal server).
+
overwrite
+
If True, overwrite the task output.
+
@validate_arguments
+def maximum_intensity_projection(
+    *,
+    input_paths: Sequence[str],
+    output_path: str,
+    component: str,
+    metadata: dict[str, Any],
+    overwrite: bool = False,
+) -> dict[str, Any]:
+    """
+    Perform maximum-intensity projection along Z axis.
+
+    Note: this task stores the output in a new zarr file.
+
+    Args:
+        input_paths: This parameter is not used by this task.
+            This task only supports a single input path.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        output_path: Path where the output of this task is stored.
+            Example: `"/some/path/"` => puts the new OME-Zarr file in that
+            folder.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        component: Path to the OME-Zarr image in the OME-Zarr plate that
+            is processed. Component is typically changed by the `copy_ome_zarr`
+            task before, to point to a new mip Zarr file.
+            Example: `"some_plate_mip.zarr/B/03/0"`.
+            (standard argument for Fractal tasks, managed by Fractal server).
+        metadata: Dictionary containing metadata about the OME-Zarr.
+            This task requires the key `copy_ome_zarr` to be present in the
+            metadata (as defined in `copy_ome_zarr` task).
+            (standard argument for Fractal tasks, managed by Fractal server).
+        overwrite: If `True`, overwrite the task output.
+    """
+
+    # Preliminary checks
+    if len(input_paths) > 1:
+        raise NotImplementedError
+
+    plate, well = component.split(".zarr/")
+    zarrurl_old = metadata["copy_ome_zarr"]["sources"][plate] + "/" + well
+    clean_output_path = Path(output_path).resolve()
+    zarrurl_new = (clean_output_path / component).as_posix()
+    logger.info(f"{zarrurl_old=}")
+    logger.info(f"{zarrurl_new=}")
+
+    # Read some parameters from metadata
+    ngff_image = load_NgffImageMeta(zarrurl_old)
+    num_levels = ngff_image.num_levels
+    coarsening_xy = ngff_image.coarsening_xy
+
+    # Load 0-th level
+    data_czyx = da.from_zarr(zarrurl_old + "/0")
+    num_channels = data_czyx.shape[0]
+    chunksize_y = data_czyx.chunksize[-2]
+    chunksize_x = data_czyx.chunksize[-1]
+    logger.info(f"{num_channels=}")
+    logger.info(f"{chunksize_y=}")
+    logger.info(f"{chunksize_x=}")
+    # Loop over channels
+    accumulate_chl = []
+    for ind_ch in range(num_channels):
+        # Perform MIP for each channel of level 0
+        mip_yx = da.stack([da.max(data_czyx[ind_ch], axis=0)], axis=0)
+        accumulate_chl.append(mip_yx)
+    accumulated_array = da.stack(accumulate_chl, axis=0)
+
+    # Write to disk (triggering execution)
+    try:
+        accumulated_array.to_zarr(
+            f"{zarrurl_new}/0",
+            overwrite=overwrite,
+            dimension_separator="/",
+            write_empty_chunks=False,
+        )
+    except ContainsArrayError as e:
+        error_msg = (
+            f"Cannot write array to zarr group at '{zarrurl_new}/0', "
+            f"with {overwrite=} (original error: {str(e)}).\n"
+            "Hint: try setting overwrite=True."
+        )
+        logger.error(error_msg)
+        raise OverwriteNotAllowedError(error_msg)
+
+    # Starting from on-disk highest-resolution data, build and write to disk a
+    # pyramid of coarser levels
+    build_pyramid(
+        zarrurl=zarrurl_new,
+        overwrite=overwrite,
+        num_levels=num_levels,
+        coarsening_xy=coarsening_xy,
+        chunksize=(1, 1, chunksize_y, chunksize_x),
+    )
+
+    return {}
+
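The first lines of the task body show how the source and destination images are located: the `component` names the new (projected) image, while the original image is looked up through `metadata["copy_ome_zarr"]["sources"]`. A hedged sketch of that path resolution follows. The helper name, the folder names and the metadata content are assumptions for illustration; the sketch uses `PurePosixPath` and skips the `resolve()` call of the task, so that it does not depend on the local filesystem.

```python
from pathlib import PurePosixPath


def resolve_mip_paths(
    component: str, metadata: dict, output_path: str
) -> tuple[str, str]:
    """Return (zarrurl_old, zarrurl_new) for one image component."""
    plate, well = component.split(".zarr/")
    zarrurl_old = metadata["copy_ome_zarr"]["sources"][plate] + "/" + well
    zarrurl_new = (PurePosixPath(output_path) / component).as_posix()
    return zarrurl_old, zarrurl_new


# Hypothetical metadata, shaped like what the listing above reads
metadata = {
    "copy_ome_zarr": {
        "suffix": "mip",
        "sources": {"some_plate_mip": "/data/input/some_plate.zarr"},
    }
}
zarrurl_old, zarrurl_new = resolve_mip_paths(
    "some_plate_mip.zarr/B/03/0", metadata, "/data/output"
)
print(zarrurl_old)
print(zarrurl_new)
```

If the `copy_ome_zarr` key is missing from the metadata, the lookup raises a `KeyError`, which is why the parameter description lists it as required.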
List of input paths where the image data is stored as
+OME-Zarrs. Should point to the parent folder containing one or many
+OME-Zarr files, not the actual OME-Zarr file.
+Example: ["/some/path/"].
+This task only supports a single input path.
+(standard argument for Fractal tasks, managed by Fractal server).
Path to the OME-Zarr image in the OME-Zarr plate that is
+processed.
+Example: "some_plate.zarr/B/03/0".
+(standard argument for Fractal tasks, managed by Fractal server).
Name of the ROI table over which the task loops to
+apply napari workflows.
+Examples:
+FOV_ROI_table
+=> loop over the field of views;
+organoid_ROI_table
+=> loop over the organoid ROI table (generated by another task);
+well_ROI_table
+=> process the whole well as one image.
Pyramid level of the image to be used as input for
+napari-workflows. Choose 0 to process at full resolution.
+Levels > 0 are currently only supported for workflows that only
+have intensity images as input and only produce label images as
+output.
Expected dimensions (either 2 or 3). Useful
+when loading 2D images that are stored in a 3D array with shape
+(1, size_x, size_y) [which is the default way Fractal stores 2D
+images], but you want to make sure the napari workflow gets a 2D
+array to process. Also useful to set to 2 when loading a 2D
+OME-Zarr that is saved as (size_x, size_y).
class NapariWorkflowsOutput(BaseModel):
+    """
+    A value of the `output_specs` argument in `napari_workflows_wrapper`.
+
+    Attributes:
+        type: Output type (either `label` or `dataframe`).
+        label_name: Label name (for label outputs, it is used as the name of
+            the label; for dataframe outputs, it is used to fill the
+            `region["path"]` field).
+        table_name: Table name (for dataframe outputs only).
+    """
+
+    type: Literal["label", "dataframe"]
+    label_name: str
+    table_name: Optional[str] = None
+
+    @validator("table_name", always=True)
+    def table_name_only_for_dataframe_type(cls, v, values):
+        """
+        Check that table_name is set only for dataframe outputs.
+        """
+        _type = values.get("type")
+        if (_type == "dataframe" and (not v)) or (_type != "dataframe" and v):
+            raise ValueError(
+                f"Output item has type={_type} but table_name={v}."
+            )
+        return v
+
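The validator enforces a two-way rule: a `dataframe` output must have a table name, and any other output must not. The same rule can be written as a plain function, which makes the four possible cases easy to try out. This is a sketch without Pydantic; `check_table_name` is a hypothetical helper, not part of the package.

```python
from typing import Optional


def check_table_name(
    output_type: str, table_name: Optional[str]
) -> Optional[str]:
    """Raise unless table_name is set exactly for dataframe outputs."""
    missing = output_type == "dataframe" and not table_name
    unexpected = output_type != "dataframe" and bool(table_name)
    if missing or unexpected:
        raise ValueError(
            f"Output item has type={output_type} but table_name={table_name}."
        )
    return table_name


def is_valid(output_type: str, table_name: Optional[str]) -> bool:
    """Convenience wrapper returning True/False instead of raising."""
    try:
        check_table_name(output_type, table_name)
        return True
    except ValueError:
        return False


print(is_valid("dataframe", "measurements"))  # True
print(is_valid("label", None))  # True
print(is_valid("dataframe", None))  # False
print(is_valid("label", "measurements"))  # False
```

Note that an empty string counts as "not set", since the check relies on truthiness rather than on a comparison with `None`.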
Check that table_name is set only for dataframe outputs.
+
+
+ Source code in fractal_tasks_core/tasks/napari_workflows_wrapper_models.py
+
@validator("table_name", always=True)
+def table_name_only_for_dataframe_type(cls, v, values):
+    """
+    Check that table_name is set only for dataframe outputs.
+    """
+    _type = values.get("type")
+    if (_type == "dataframe" and (not v)) or (_type != "dataframe" and v):
+        raise ValueError(
+            f"Output item has type={_type} but table_name={v}."
+        )
+    return v
+
def sort_fun(filename: str) -> list[int]:
+    """
+    Takes a string (filename of a Yokogawa image), extracts site and
+    z-index metadata and returns them as a list of integers.
+
+    Args:
+        filename: Name of the image file.
+    """
+
+    filename_metadata = parse_filename(filename)
+    site = int(filename_metadata["F"])
+    z_index = int(filename_metadata["Z"])
+    return [site, z_index]
+
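`sort_fun` is meant to be used as a sort key, so that image files are ordered first by site (field of view) and then by Z index. The sketch below shows that usage with a stand-in parser: `parse_site_and_z` and its regular expression are assumptions about the filename layout (a `...F001L01A01Z01C01.tif` suffix), written only for this example, whereas the real task relies on the package's `parse_filename`.

```python
import re


def parse_site_and_z(filename: str) -> list[int]:
    """Extract [site, z_index] from a Yokogawa-style filename (assumed layout)."""
    match = re.search(r"F(\d+)L\d+A\d+Z(\d+)C\d+", filename)
    if match is None:
        raise ValueError(f"Cannot parse site and Z index from {filename!r}")
    return [int(match.group(1)), int(match.group(2))]


# Hypothetical filenames, deliberately out of order
filenames = [
    "plate_B03_T0001F002L01A01Z01C01.tif",
    "plate_B03_T0001F001L01A01Z02C01.tif",
    "plate_B03_T0001F001L01A01Z01C01.tif",
]
sorted_filenames = sorted(filenames, key=parse_site_and_z)
for name in sorted_filenames:
    print(name)
```

Returning a list of integers rather than strings matters: integer comparison orders site 10 after site 2, which plain string sorting would not do without zero padding.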
This task is typically run after Create OME-Zarr or
+Create OME-Zarr Multiplexing and populates the empty OME-Zarr files that
+were prepared.
+
+
+
+
+
+
+
PARAMETER
+
DESCRIPTION
+
+
+
+
+
input_paths
+
+
+
List of input paths where the OME-Zarrs are stored. Should point to
+the parent folder containing one or many OME-Zarr files, not the
+actual OME-Zarr file. Example: ["/some/path/"].
+This task only supports a single input path.
+(standard argument for Fractal tasks,
+managed by Fractal server).
Path to the OME-Zarr image in the OME-Zarr plate that is
+processed. Example: "some_plate.zarr/B/03/0"
+(standard argument for Fractal tasks, managed by Fractal server).
Dictionary containing metadata about the OME-Zarr. This task
+requires the following elements to be present in the metadata:
+- original_paths: list of paths that correspond to the input_paths of
+  the create_ome_zarr task (=> where the microscopy images are stored);
+- num_levels (int): number of pyramid levels in the image (this
+  determines how many pyramid levels are built for the segmentation);
+- coarsening_xy (int): coarsening factor in XY of the downsampling when
+  building the pyramid;
+- image_extension: filename extension of images (e.g. "tif" or "png");
+- image_glob_patterns: parameter of create_ome_zarr task (if specified,
+  only parse images with filenames that match all of these patterns).
+
+(standard argument for Fractal tasks, managed by Fractal server).
def upscale_array(
+    *,
+    array: np.ndarray,
+    target_shape: tuple[int, ...],
+    axis: Optional[Sequence[int]] = None,
+    pad_with_zeros: bool = False,
+    warn_if_inhomogeneous: bool = False,
+) -> np.ndarray:
+    """
+    Upscale an array along a given list of axis (through repeated application
+    of `np.repeat`), to match a target shape.
+
+    Args:
+        array: The array to be upscaled.
+        target_shape: The shape of the rescaled array.
+        axis: The axis along which to upscale the array (if `None`, then all
+            axis are used).
+        pad_with_zeros: If `True`, pad the upscaled array with zeros to match
+            `target_shape`.
+        warn_if_inhomogeneous: If `True`, raise a warning when the conversion
+            factors are not identical across all dimensions.
+
+    Returns:
+        The upscaled array, with shape `target_shape`.
+    """
+
+    # Default behavior: use all axis
+    if axis is None:
+        axis = list(range(len(target_shape)))
+
+    array_shape = array.shape
+    info = (
+        f"Trying to upscale from {array_shape=} to {target_shape=}, "
+        f"acting on {axis=}."
+    )
+
+    if len(array_shape) != len(target_shape):
+        raise ValueError(f"{info} Dimensions-number mismatch.")
+    if axis == []:
+        raise ValueError(f"{info} Empty axis list")
+    if min(axis) < 0:
+        raise ValueError(f"{info} Negative axis specification not allowed.")
+
+    # Check that upscale is doable
+    for ind, dim in enumerate(array_shape):
+        # Check that array is not larger than target (downscaling)
+        if dim > target_shape[ind]:
+            raise ValueError(
+                f"{info} {ind}-th array dimension is larger than target."
+            )
+        # Check that all relevant axis are included in axis
+        if dim != target_shape[ind] and ind not in axis:
+            raise ValueError(
+                f"{info} {ind}-th array dimension differs from "
+                f"target, but {ind} is not included in "
+                f"{axis=}."
+            )
+
+    # Compute upscaling factors
+    upscale_factors = {}
+    for ax in axis:
+        if (target_shape[ax] % array_shape[ax]) > 0 and not pad_with_zeros:
+            raise ValueError(
+                "Incommensurable upscale attempt, "
+                f"from {array_shape=} to {target_shape=}."
+            )
+        upscale_factors[ax] = target_shape[ax] // array_shape[ax]
+        # Check that this is not downscaling
+        if upscale_factors[ax] < 1:
+            raise ValueError(info)
+    info = f"{info} Upscale factors: {upscale_factors}"
+
+    # Raise a warning if upscaling is non-homogeneous across all axis
+    if warn_if_inhomogeneous:
+        if len(set(upscale_factors.values())) > 1:
+            warnings.warn(f"{info} (inhomogeneous)")
+
+    # Upscale array, via np.repeat
+    upscaled_array = array
+    for ax in axis:
+        upscaled_array = np.repeat(
+            upscaled_array, upscale_factors[ax], axis=ax
+        )
+
+    # Check that final shape is correct
+    if not upscaled_array.shape == target_shape:
+        if pad_with_zeros:
+            pad_width = []
+            for ax in list(range(len(target_shape))):
+                missing = target_shape[ax] - upscaled_array.shape[ax]
+                if missing < 0 or (missing > 0 and ax not in axis):
+                    raise ValueError(
+                        f"{info} " "Something wrong during zero-padding"
+                    )
+                pad_width.append([0, missing])
+            upscaled_array = np.pad(
+                upscaled_array,
+                pad_width=pad_width,
+                mode="constant",
+                constant_values=0,
+            )
+            logging.warning(f"{info} {upscaled_array.shape=}.")
+            logging.warning(
+                f"Padding upscaled_array with zeros with {pad_width=}"
+            )
+        else:
+            raise ValueError(f"{info} {upscaled_array.shape=}.")
+
+    return upscaled_array
+
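The core of `upscale_array` is the computation of one integer factor per axis, together with the check that the target size is a whole multiple of the current size. The sketch below isolates that step in pure Python, without NumPy and without the zero-padding branch; `compute_upscale_factors` is a hypothetical helper and the shapes are made-up examples.

```python
def compute_upscale_factors(
    array_shape: tuple[int, ...],
    target_shape: tuple[int, ...],
    axis: list[int],
) -> dict[int, int]:
    """Return {axis: integer upscale factor}, or raise if not commensurable."""
    if len(array_shape) != len(target_shape):
        raise ValueError("Dimensions-number mismatch.")
    factors = {}
    for ax in axis:
        if target_shape[ax] % array_shape[ax] > 0:
            raise ValueError(
                "Incommensurable upscale attempt, "
                f"from {array_shape=} to {target_shape=}."
            )
        factors[ax] = target_shape[ax] // array_shape[ax]
    return factors


# Upscale a level-2 label image (coarsened 4x in Y and X) to full resolution
factors = compute_upscale_factors(
    (1, 540, 640), (1, 2160, 2560), axis=[1, 2]
)
print(factors)  # {1: 4, 2: 4}
```

Each factor would then be passed to `np.repeat` along its axis. A target such as `(1, 2161, 2560)` fails the commensurability check in this sketch, which is the situation the `pad_with_zeros` option of the real function is meant to handle.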
Discover the acquisition index based on OME-NGFF metadata.
+
Given the path to a zarr image folder (e.g. /path/plate.zarr/B/03/0),
+extract the acquisition index from the .zattrs file of the parent
+folder (i.e. at the well level), or return None if acquisition is not
+specified.
+
Notes:
+
1. For non-multiplexing datasets, acquisition is not a required
+   information in the metadata. If it is not there, this function
+   returns None.
+
2. This function fails if we use an image that does not belong to
+   an OME-NGFF well.
def _find_omengff_acquisition(image_zarr_path: Path) -> Union[int, None]:
+    """
+    Discover the acquisition index based on OME-NGFF metadata.
+
+    Given the path to a zarr image folder (e.g. `/path/plate.zarr/B/03/0`),
+    extract the acquisition index from the `.zattrs` file of the parent
+    folder (i.e. at the well level), or return `None` if acquisition is not
+    specified.
+
+    Notes:
+
+    1. For non-multiplexing datasets, acquisition is not a required
+       information in the metadata. If it is not there, this function
+       returns `None`.
+    2. This function fails if we use an image that does not belong to
+       an OME-NGFF well.
+
+    Args:
+        image_zarr_path: Full path to an OME-NGFF image folder.
+    """
+
+    # Identify well path and attrs
+    well_zarr_path = image_zarr_path.parent
+    if not (well_zarr_path / ".zattrs").exists():
+        raise ValueError(
+            f"{str(well_zarr_path)} must be an OME-NGFF well "
+            "folder, but it does not include a .zattrs file."
+        )
+    well_group = zarr.open_group(str(well_zarr_path))
+    attrs_images = well_group.attrs["well"]["images"]
+
+    # Look for the acquisition of the current image (if any)
+    acquisition = None
+    for img_dict in attrs_images:
+        if (
+            img_dict["path"] == image_zarr_path.name
+            and "acquisition" in img_dict.keys()
+        ):
+            acquisition = img_dict["acquisition"]
+            break
+
+    return acquisition
+
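The lookup performed by `_find_omengff_acquisition` can be tried out on an in-memory dictionary shaped like the well-level `.zattrs` content. The sketch below skips the filesystem check and the zarr call; `find_acquisition` and the attribute values are illustrative assumptions.

```python
from typing import Optional


def find_acquisition(well_attrs: dict, image_name: str) -> Optional[int]:
    """Return the acquisition of `image_name`, or None if not specified."""
    for img_dict in well_attrs["well"]["images"]:
        if img_dict["path"] == image_name and "acquisition" in img_dict:
            return img_dict["acquisition"]
    return None


# Hypothetical well metadata of a multiplexing dataset with two cycles
multiplex_well = {
    "well": {
        "images": [
            {"path": "0", "acquisition": 0},
            {"path": "1", "acquisition": 1},
        ]
    }
}
# Hypothetical well metadata without acquisition information
simple_well = {"well": {"images": [{"path": "0"}]}}

print(find_acquisition(multiplex_well, "1"))  # 1
print(find_acquisition(simple_well, "0"))  # None
```

The second call corresponds to note 1 above: for non-multiplexing datasets the `acquisition` key may be absent, and the result is `None` rather than an error.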
Flexibly extract parameters from metadata dictionary
+
This covers both parameters which are acquisition-specific (if the image
+belongs to an OME-NGFF array and its acquisition is specified) or simply
+available in the dictionary.
+The two cases are handled as:
+
metadata["some_parameter"][str(acquisition)] # acquisition available
+metadata["some_parameter"] # acquisition not available
+
def get_parameters_from_metadata(
+    *,
+    keys: Sequence[str],
+    metadata: dict[str, Any],
+    image_zarr_path: Path,
+) -> dict[str, Any]:
+    """
+    Flexibly extract parameters from metadata dictionary
+
+    This covers both parameters which are acquisition-specific (if the image
+    belongs to an OME-NGFF array and its acquisition is specified) or simply
+    available in the dictionary.
+    The two cases are handled as:
+    ```
+    metadata["some_parameter"][str(acquisition)]  # acquisition available
+    metadata["some_parameter"]  # acquisition not available
+    ```
+
+    Args:
+        keys: list of required parameters.
+        metadata: metadata dictionary.
+        image_zarr_path: full path to image, e.g. `/path/plate.zarr/B/03/0`.
+    """
+
+    parameters = {}
+    acquisition = _find_omengff_acquisition(image_zarr_path)
+    if acquisition is not None:
+        parameters["acquisition"] = acquisition
+
+    for key in keys:
+        if acquisition is None:
+            parameter = metadata[key]
+        else:
+            try:
+                parameter = metadata[key][str(acquisition)]
+            except TypeError:
+                parameter = metadata[key]
+            except KeyError:
+                parameter = metadata[key]
+        parameters[key] = parameter
+    return parameters
+
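The fallback logic of `get_parameters_from_metadata` can be sketched in isolation. This is a simplified illustration, not the library function: `select_parameters` is a hypothetical helper that takes the acquisition index directly instead of discovering it from a zarr path, and the metadata values are made up.

```python
from typing import Any, Optional, Sequence


def select_parameters(
    keys: Sequence[str],
    metadata: dict[str, Any],
    acquisition: Optional[int],
) -> dict[str, Any]:
    """Pick acquisition-specific values when available, else global ones."""
    parameters: dict[str, Any] = {}
    if acquisition is not None:
        parameters["acquisition"] = acquisition
    for key in keys:
        value = metadata[key]  # global value (raises KeyError if missing)
        if acquisition is not None:
            try:
                # Acquisition-specific value, keyed by the index as a string
                value = metadata[key][str(acquisition)]
            except (TypeError, KeyError):
                # Not subscriptable, or no entry for this acquisition
                pass
        parameters[key] = value
    return parameters


# Made-up metadata: one global parameter, one acquisition-specific parameter
metadata = {"coarsening_xy": 2, "num_levels": {"0": 5, "1": 4}}
keys = ["coarsening_xy", "num_levels"]

with_acquisition = select_parameters(keys, metadata, acquisition=1)
without_acquisition = select_parameters(keys, metadata, acquisition=None)
print(with_acquisition)
print(without_acquisition)
```

With `acquisition=1`, `num_levels` resolves to the per-acquisition entry while `coarsening_xy` falls back to the global value; with `acquisition=None`, both values are returned as stored.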
Dictionary with table names as keys and table paths as values. If
+tables Zarr group is missing, or if it does not have a tables
+key, then return an empty dictionary.
+
+ Source code in fractal_tasks_core/utils.py
+
def get_table_path_dict(input_path: Path, component: str) -> dict[str, str]:
+    """
+    Compile dictionary of (table name, table path) key/value pairs.
+
+
+    Args:
+        input_path:
+            Path to the parent folder of a plate zarr group (e.g.
+            `/some/path/`).
+        component:
+            Path (relative to `input_path`) to an image zarr group (e.g.
+            `plate.zarr/B/03/0`).
+
+    Returns:
+        Dictionary with table names as keys and table paths as values. If
+            `tables` Zarr group is missing, or if it does not have a `tables`
+            key, then return an empty dictionary.
+    """
+
+    try:
+        tables_group = zarr.open_group(f"{input_path / component}/tables", "r")
+        table_list = tables_group.attrs["tables"]
+    except (zarr.errors.GroupNotFoundError, KeyError):
+        table_list = []
+
+    table_path_dict = {}
+    for table in table_list:
+        table_path_dict[table] = f"{input_path / component}/tables/{table}"
+
+    return table_path_dict
+
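To make the return value concrete, here is a minimal stdlib-only sketch of the same idea. It assumes a Zarr v2 directory store (attributes in a `.zattrs` JSON file) and reads that file with `json` rather than `zarr.open_group`; `table_paths` and the table names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def table_paths(image_dir: Path) -> dict[str, str]:
    """Map table names to table paths, or return {} if none are listed."""
    attrs_file = image_dir / "tables" / ".zattrs"
    try:
        table_list = json.loads(attrs_file.read_text())["tables"]
    except (FileNotFoundError, KeyError):
        # Missing `tables` group, or attributes without a `tables` key
        table_list = []
    return {name: f"{image_dir}/tables/{name}" for name in table_list}


# Toy image folder with two listed tables (hypothetical names)
root = Path(tempfile.mkdtemp())
image_dir = root / "plate.zarr" / "B" / "03" / "0"
(image_dir / "tables").mkdir(parents=True)
(image_dir / "tables" / ".zattrs").write_text(
    json.dumps({"tables": ["FOV_ROI_table", "well_ROI_table"]})
)

paths = table_paths(image_dir)
missing = table_paths(root / "plate.zarr" / "B" / "03" / "1")  # no tables group
print(paths)
print(missing)
```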
Given a set of datasets (as per OME-NGFF specs), update their "scale"
+transformations in the YX directions by including a prefactor
+(coarsening_xy**reference_level).
def open_zarr_group_with_overwrite(
+    path: Union[str, MutableMapping],
+    *,
+    overwrite: bool,
+    logger: Optional[logging.Logger] = None,
+    **open_group_kwargs: Any,
+) -> zarr.hierarchy.Group:
+    """
+    Wrap `zarr.open_group` and add `overwrite` argument.
+
+    This wrapper sets `mode="w"` for `overwrite=True` and `mode="w-"` for
+    `overwrite=False`.
+
+    The expected behavior is
+
+    * if the group does not exist, create it (independently of `overwrite`);
+    * if the group already exists and `overwrite=True`, replace the group with
+      an empty one;
+    * if the group already exists and `overwrite=False`, fail.
+
+    From the [`zarr.open_group`
+    docs](https://zarr.readthedocs.io/en/stable/api/hierarchy.html#zarr.hierarchy.open_group):
+
+    * `mode="r"` means read only (must exist);
+    * `mode="r+"` means read/write (must exist);
+    * `mode="a"` means read/write (create if doesn’t exist);
+    * `mode="w"` means create (overwrite if exists);
+    * `mode="w-"` means create (fail if exists).
+
+
+    Args:
+        path:
+            Store or path to directory in file system or name of zip file
+            (`zarr.open_group` parameter).
+        overwrite:
+            Determines the `mode` parameter of `zarr.open_group`, which is
+            `"w"` (if `overwrite=True`) or `"w-"` (if `overwrite=False`).
+        logger:
+            The logger to use (if unset, use `logging.getLogger(None)`)
+        open_group_kwargs:
+            Keyword arguments of `zarr.open_group`.
+
+    Returns:
+        The zarr group.
+
+    Raises:
+        OverwriteNotAllowedError:
+            If `overwrite=False` and the group already exists.
+    """
+
+    # Set logger
+    if logger is None:
+        logger = logging.getLogger(None)
+
+    # Set mode for zarr.open_group
+    if overwrite:
+        new_mode = "w"
+    else:
+        new_mode = "w-"
+
+    # Write log about current status
+    logger.info(f"Start open_zarr_group_with_overwrite ({overwrite=}).")
+    try:
+        # Call `zarr.open_group` with `mode="r"`, which fails for missing group
+        current_group = zarr.open_group(path, mode="r")
+        keys = list(current_group.group_keys())
+        logger.info(f"Zarr group {path} already exists, with {keys=}")
+    except GroupNotFoundError:
+        logger.info(f"Zarr group {path} does not exist yet.")
+
+    # Raise warning if we are overriding an existing value of `mode`
+    if "mode" in open_group_kwargs.keys():
+        mode = open_group_kwargs.pop("mode")
+        logger.warning(
+            f"Overriding {mode=} with {new_mode=}, "
+            "in open_zarr_group_with_overwrite"
+        )
+
+    # Call zarr.open_group
+    try:
+        return zarr.open_group(path, mode=new_mode, **open_group_kwargs)
+    except ContainsGroupError:
+        # Re-raise error with custom message and type
+        error_msg = (
+            f"Cannot create zarr group at {path=} with `{overwrite=}` "
+            "(original error: `zarr.errors.ContainsGroupError`).\n"
+            "Hint: try setting `overwrite=True`."
+        )
+        logger.error(error_msg)
+        raise OverwriteNotAllowedError(error_msg)
+
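The part of the wrapper that turns `overwrite` into a `mode` value, and discards any user-provided `mode`, can be sketched on its own. This is an illustrative fragment only: `resolve_open_mode` is a hypothetical helper, it does not call `zarr`, and the extra keyword argument in the usage lines is just a placeholder.

```python
import logging
from typing import Any


def resolve_open_mode(
    overwrite: bool, **open_group_kwargs: Any
) -> tuple[str, dict[str, Any]]:
    """Return the mode implied by `overwrite`, plus the cleaned-up kwargs."""
    # "w" replaces an existing group, "w-" fails if the group exists
    new_mode = "w" if overwrite else "w-"
    if "mode" in open_group_kwargs:
        # A user-provided mode is dropped, with a warning
        mode = open_group_kwargs.pop("mode")
        logging.getLogger(None).warning(f"Overriding {mode=} with {new_mode=}")
    return new_mode, open_group_kwargs


mode_overwrite, kwargs_overwrite = resolve_open_mode(True)
mode_keep, kwargs_keep = resolve_open_mode(False, mode="a", cache_attrs=True)
print(mode_overwrite, kwargs_overwrite)
print(mode_keep, kwargs_keep)
```

Keeping `mode` out of the forwarded keyword arguments avoids passing it twice to the underlying call, which is why the wrapper pops it before opening the group.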
Thanks to the package manifest and to their structure, the tasks in
+fractal_tasks_core.tasks can be run within the Fractal platform, which
+consists of a backend server that can be accessed through either of the two
+available clients (a command-line client and a web client).
+
The fractal-demos repository lists a set of relevant examples, including:
How to use the command-line client to submit a series of typical workflows (based on fractal-tasks-core tasks) to Fractal; see folders from 01 to 10 in the examples folder.
The fractal-tasks-core GitHub repository includes an examples folder, listing a few examples of how to run fractal-tasks-core tasks from a standard Python script (instead of using the Fractal platform).
Enter one of the example folders, remove the tmp_out temporary output
+ folder (if present), and run one of the run_workflow Python scripts.
+
+
+
View the output OME-Zarr in the tmp_out folder with
+ napari, which can be installed via pip install
+ napari[pyqt5] napari-ome-zarr.
+
\ No newline at end of file
diff --git a/search/search_index.json b/search/search_index.json
new file mode 100644
index 000000000..ce7627753
--- /dev/null
+++ b/search/search_index.json
@@ -0,0 +1 @@
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to Fractal Tasks Core's documentation!","text":"
Fractal is a framework to process high content imaging data at scale and prepare it for interactive visualization.
This project is under active development \ud83d\udd28. If you need help or found a bug, open an issue here.
Fractal provides distributed workflows that convert TBs of image data into OME-Zarr files. The platform then processes the 3D image data by applying tasks like illumination correction, maximum intensity projection, 3D segmentation using cellpose and measurements using napari workflows. The pyramidal OME-Zarr files enable interactive visualization in the napari viewer.
The fractal-tasks-core package contains the python tasks that parse Yokogawa CV7000 images into OME-Zarr and process OME-Zarr files. Find more information about Fractal in general and the other repositories at this link. All tasks are written as Python functions and are optimized for usage in Fractal workflows, but they can also be used as standalone functions to parse data or process OME-Zarr files. We heavily use regions of interest (ROIs) in our OME-Zarr files to store the positions of fields of view. ROIs are saved as AnnData tables following this spec proposal. We save wells as large Zarr arrays instead of a collection of arrays for each field of view (see details here).
Here is an example of the interactive visualization in napari using the newly-proposed async loading in NAP4 and the napari-ome-zarr plugin:
Create Zarr Structure: Task to generate the zarr structure based on Yokogawa metadata files
Yokogawa to Zarr: Parses the Yokogawa CV7000 image data and saves it to the Zarr file
Illumination Correction: Applies an illumination correction based on a flatfield image & subtracts a background from the image.
Image Labeling (& Image Labeling Whole Well): Applies a cellpose network to the image of a single ROI or the whole well. cellpose parameters can be tuned for optimal performance.
Maximum Intensity Projection: Creates a maximum intensity projection of the whole plate.
Measurement: Make some standard measurements (intensity & morphology) using napari workflows, saving results to AnnData tables.
Some additional tasks are currently being worked on and some older tasks are still present in the fractal_tasks_core folder. See the package page for the detailed description of all tasks.
Fractal was conceived in the Liberali Lab at the Friedrich Miescher Institute for Biomedical Research and in the Pelkmans Lab at the University of Zurich by @jluethi and @gusqgm. The Fractal project is now developed at the BioVisionCenter at the University of Zurich and the project lead is with @jluethi. The core development is done under contract by eXact lab S.r.l..
Here is a list of tasks that are available within Fractal-compatible packages, including both fractal-tasks-core and others.
These are the tasks that we are aware of (on December 5th, 2023); if you created your own package of Fractal tasks, reach out to have it listed here (or, if you want to build your own tasks, follow these instructions).
(major) Introduce new tasks for registration of multiplexing cycles: calculate_registration_image_based, apply_registration_to_ROI_tables, apply_registration_to_image (#487).
(major) Introduce new overwrite argument for tasks create_ome_zarr, create_ome_zarr_multiplex, yokogawa_to_ome_zarr, copy_ome_zarr, maximum_intensity_projection, cellpose_segmentation, napari_workflows_wrapper (#499).
(major) Rename illumination_correction parameter from overwrite to overwrite_input (#499).
Fix plate-selection bug in copy_ome_zarr task (#513).
Fix bug in definition of metadata[\"plate\"] in create_ome_zarr_multiplex task (#513).
Introduce new helper functions write_table, prepare_label_group and open_zarr_group_with_overwrite (#499).
Make tasks-related dependencies optional, and installable via fractal-tasks extra (#390).
Remove tools package extra (#384), and split the subpackage content into lib_ROI_overlaps and examples (#390).
(major) Modify task arguments
Add Pydantic model lib_channels.OmeroChannel (#410, #422);
Add Pydantic model tasks._input_models.Channel (#422);
Add Pydantic model tasks._input_models.NapariWorkflowsInput (#422);
Add Pydantic model tasks._input_models.NapariWorkflowsOutput (#422);
Move all Pydantic models to main package (#438).
Modify arguments of illumination_correction task (#431);
Modify arguments of create_ome_zarr and create_ome_zarr_multiplex (#433).
Modify argument default for ROI_table_names, in copy_ome_zarr (#449).
Remove the delete option from yokogawa to ome zarr (#443).
Reorder task inputs (#451).
JSON Schemas for task arguments:
Add JSON Schemas for task arguments in the package manifest (#369, #384).
Add JSON Schemas for attributes of custom task-argument Pydantic models (#436).
Make schema-generation tools more general, when handling custom Pydantic models (#445).
Include titles for custom-model-typed arguments and argument attributes (#447).
Remove TaskArguments models and switch to Pydantic V1 validate_arguments (#369).
Make coercing&validating task arguments required, rather than optional (#408).
Remove default_args from manifest (#379, #393).
Other:
Make pydantic dependency required for running tasks, and pin it to V1 (#408).
Remove legacy executor definitions from manifest (#361).
Add GitHub action for testing pip install with/without fractal-tasks extra (#390).
Remove sqlmodel from dev dependencies (#374).
Relax constraint on torch version, from ==1.12.1 to <=2.0.0 (#406).
Review task docstrings and improve documentation (#413, #416).
Update anndata dependency requirements (from ^0.8.0 to >=0.8.0,<=0.9.1), and replace anndata.experimental.write_elem with anndata._io.specs.write_elem (#428).
Disable bugged validation of model_type argument in cellpose_segmentation (#344).
Raise an error if the user provides an unexpected argument to a task (#337); this applies to the case of running a task as a script, with a pydantic model for task-argument validation.
(major) Update task interface: remove filename extension from input_paths and output_path for all tasks, and add new arguments (image_extension,image_glob_pattern) to create_ome_zarr task (#323).
Implement logic for handling image_glob_patterns argument, both when globbing images and in Yokogawa metadata parsing (#326).
"},{"location":"custom_task/","title":"How to write a Fractal-compatible custom task","text":"
The fractal-tasks-core repository is the reference implementation for Fractal tasks and for Fractal task packages, but the Fractal platform can also be used to execute custom tasks. This page lists the Fractal-compatibility requirements, for a single custom task or for a task package.
Note that these specifications evolve frequently, see e.g. discussion at https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/151.
Note: While the contents of this page remain valid, the recommended procedure to get up to speed and build a Python package of Fractal-compatible tasks is to use the template available at https://github.com/fractal-analytics-platform/fractal-tasks-template.
A Fractal task is mainly formed by two components:
A set of metadata, which are stored in the task table of the database of a fractal-server instance, see Task metadata.
An executable command, which can take some specific command-line arguments (see Command-line interface); the standard example is a Python script.
In the following we explain what are the Fractal-compatibility requirements for a single task, and then for a task package.
Each task must be associated to some metadata, so that it can be used in Fractal. The full specification is here, and the required attributes are:
name: the task name, e.g. \"Create OME-Zarr structure\";
command: a command that can be executed from the command line;
input_type: this can be any string (typical examples: \"image\" or \"zarr\"); the special value \"Any\" means that Fractal won't perform any check of the input_type when applying the task to a dataset.
output_type: same logic as input_type.
source: this is meant to be as close as possible to a unique task identifier; for custom tasks, it can be anything (e.g. \"my_task\"), but for tasks that are collected automatically from a package (see Task package) this attribute will have a very specific form (e.g. \"pip_remote:fractal_tasks_core:0.10.0:fractal-tasks::convert_yokogawa_to_ome-zarr\").
meta: a JSON object (similar to a Python dictionary) with some additional information, see Task meta-parameters.
There are multiple ways to get the appropriate metadata into the database, including a POST request to the fractal-server API (see Tasks section in the fractal-server API documentation) or the automated addition of a whole set of tasks through specific API endpoints (see Task package).
Therefore the task command must accept these additional command-line arguments. If the task is a Python script, this can be achieved easily by using the run_fractal_task function - which is available as part of fractal_tasks_core.tasks._utils.
The meta attribute of tasks (see the corresponding item in Task metadata) is where we specify some requirements on how the task should be run. This notably includes:
If the task has to be run in parallel (e.g. over multiple wells of an OME-Zarr dataset), then meta should include a key-value pair like {\"parallelization_level\": \"well\"}. If the parallelization_level key is missing, the task is considered as non-parallel.
If Fractal is configured to run on a SLURM cluster, meta may include additional information on the SLURM requirements (more info on the Fractal SLURM backend here).
When a task is run via Fractal, its input parameters (i.e. the ones in the file specified via the -j command-line option) will always include a set of keyword arguments with specific names:
The only task output which will be visible to Fractal is what goes in the output metadata-update file (i.e. the one specified through the --metadata-out command-line option). Note that this only holds for non-parallel tasks, while (for the moment) Fractal fully ignores the output of parallel tasks.
IMPORTANT: This means that each task must always write any output to disk, before ending.
The description of other advanced features is not yet available in this page.
Also other attributes of the Task metadata exist, and they would be recognized by other Fractal components (e.g. fractal-server or fractal-web). These include JSON Schemas for input parameters and additional documentation-related attributes.
In fractal-tasks-core, we use pydantic v1 to fully coerce and validate the input parameters into a set of given types.
Here we describe a simplified example of a Fractal-compatible Python task (for more realistic examples see the fractal-tasks-core tasks folder).
The script /some/path/my_task.py may look like
# Import a helper function from fractal_tasks_core\nfrom fractal_tasks_core.tasks._utils import run_fractal_task\n\ndef my_task_function(\n    # Reserved Fractal arguments\n    input_paths,\n    output_path,\n    metadata,\n    # Task-specific arguments\n    argument_A,\n    argument_B = \"default_B_value\",\n):\n    # Do something, based on the task parameters\n    print(\"Here we go, we are in `my_task_function`\")\n    with open(f\"{output_path}/output.txt\", \"w\") as f:\n        f.write(f\"argument_A={argument_A}\\n\")\n        f.write(f\"argument_B={argument_B}\\n\")\n    # Compile the output metadata update and return\n    output_metadata_update = {\"nothing\": \"to add\"}\n    return output_metadata_update\n\n# This block is executed when running the Python script directly\nif __name__ == \"__main__\":\n    run_fractal_task(task_function=my_task_function)\n
where we use run_fractal_task so that we don't have to take care of the command-line arguments.
Some valid metadata attributes for this task would be:
Given a set of Python scripts corresponding to Fractal tasks, it is useful to combine them into a single Python package, using the standard tools or other options (e.g. for fractal-tasks-core we use poetry).
Creating a package is often a good practice, for reasons unrelated to Fractal:
It makes it simple to assign a global version to the package, and to host it on a public index like PyPI;
It may reduce code duplication:
The scripts may have a shared set of external dependencies, which are defined in a single place for a package.
The scripts may import functions from a shared set of auxiliary Python modules, which can be included in the package.
Moreover, having a single package also streamlines some Fractal-related operations. Given the package MyTasks (available on PyPI, or locally), the Fractal platform offers a feature that automatically:
Downloads the wheel file of package MyTasks (if it's on a public index, rather than a local file);
Creates a Python virtual environment (venv) which is specific for a given version of the MyTasks package, and installs the MyTasks package in that venv;
Populates all the corresponding entries in the task database table with the appropriate Task metadata, which are extracted from the package manifest.
This feature is currently exposed in the /api/v1/task/collect/pip/ endpoint of fractal-server (see API documentation).
To be compatible with Fractal, a task package must satisfy some additional requirements:
The package is built as a wheel file, and can be installed via pip.
The __FRACTAL_MANIFEST__.json file is bundled in the package, in its root folder. If you are using poetry, no special operation is needed. If you are using a setup.cfg file, see this comment.
Include JSON Schemas. The tools in fractal_tasks_core.dev are used to generate JSON Schemas for the input parameters of each task in fractal-tasks-core. They are meant to be flexible and re-usable to perform the same operation on an independent package, but they are not thoroughly documented/tested for more general use; feel free to open an issue if something is not clear.
Include additional task metadata like docs_info or docs_link, which will be displayed in the Fractal web-client. Note: this feature is not yet implemented.
The ones in the list are the main requirements; if you hit unexpected behaviors, also have a look at https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/151 or open a new issue.
"},{"location":"development/","title":"Development","text":""},{"location":"development/#setting-up-environment","title":"Setting up environment","text":"
We use poetry to manage the development environment and the dependencies. A simple way to install it is pipx install poetry==1.7.1, or you can look at the installation section here.
Running any of
# Install the core library only\npoetry install\n\n# Install the core library and the tasks\npoetry install -E fractal-tasks\n\n# Install the core library and the development/documentation dependencies\npoetry install --with dev --with docs\n
will take care of installing all the dependencies in a separate environment, optionally installing also the dependencies for development and to build the documentation."},{"location":"development/#testing","title":"Testing","text":"
We use pytest for unit and integration testing of Fractal. If you installed the development dependencies, you may run the test suite by invoking:
poetry run pytest\n
The test files are in the tests folder of the repository, and they are also run through GitHub Actions; both the main fractal_tasks_core tests (in tests/) and the fractal_tasks_core.tasks tests (in tests/tasks/) are run with Python 3.9, 3.10 and 3.11.
The documentation is built with mkdocs. To build the documentation locally, set up a development python environment (e.g. with poetry install --with docs) and then run one of these commands:
poetry run mkdocs serve --config-file mkdocs.yml # serves the docs at http://127.0.0.1:8000\npoetry run mkdocs build --config-file mkdocs.yml # creates a build in the `site` folder\n
We do not enforce strict mypy compliance, but we do run it as part of a specific GitHub Action. You can run mypy locally for instance as:
poetry run mypy --package fractal_tasks_core --ignore-missing-imports --warn-redundant-casts --warn-unused-ignores --warn-unreachable --pretty\n
"},{"location":"development/#how-to-release","title":"How to release","text":"
Preliminary check-list:
The main branch is checked out.
You reviewed dependencies and dev dependencies and the lock file is up to date with pyproject.toml (it is useful to have a look at the output of deptry . -v, where deptry is already installed as part of the dev dependencies - NOTE: deptry should be installed independently, e.g. via pipx install deptry).
The current HEAD of the main branch passes all the tests (note: make sure that you are using the poetry-installed local package).
Update changelog. First look at the list of commits since the last tag, via:
then add the upcoming release to docs/source/changelog.rst with the main information about it, using standard categories like \"New features\", \"Fixes\" and \"Other changes\", and including PR numbers when relevant. Commit docs/source/changelog.rst and push.
If appropriate (e.g. if you added some new task arguments, or if you modified some of their descriptions), update the JSON Schemas in the manifest via:
poetry run python fractal_tasks_core/dev/update_manifest.py\n
Actual release
Use:
poetry run bumpver update --[tag-num|patch|minor] --tag-commit --commit --dry\n
to test updating the version bump.
If the previous step looks good, use:
poetry run bumpver update --[tag-num|patch|minor] --tag-commit --commit\n
to actually bump the version. This will trigger a dedicated GitHub action to build the new package and publish it to PyPI.
"},{"location":"install/","title":"How to install","text":"
The fractal_tasks_core Python package is hosted on PyPI (https://pypi.org/project/fractal-tasks-core), and can be installed via pip. It includes three (sub)packages:
The main fractal_tasks_core package: a set of helper functions to be used in the Fractal tasks (and possibly in other independent packages).
The fractal_tasks_core.tasks subpackage: a set of standard Fractal tasks.
The fractal_tasks_core.dev subpackage: a set of development tools (mostly related to creation of JSON Schemas for task arguments).
which only installs the dependencies necessary for the main package and for the dev subpackage."},{"location":"install/#full-installation","title":"Full installation","text":"
In order to also use the tasks subpackage, the additional extra fractal-tasks must be included, as in
pip install fractal-tasks-core[fractal-tasks]\n
Warning: This command installs heavier dependencies (e.g. torch)."},{"location":"tables/","title":"Table specifications","text":"
Within fractal-tasks-core, we make use of tables which are AnnData objects stored within OME-Zarr image groups. This page describes the different kinds of tables we use, and it includes:
A core table specification, valid for all tables;
The definition of tables for regions of interests (ROIs);
The definition of masking ROI tables, namely ROI tables that are linked e.g. to labels;
A feature-table specification, to store measurements.
Note: The specifications below are largely inspired by a proposed update to OME-NGFF specs. This update is currently on hold, and fractal-tasks-core will evolve as soon as an official NGFF table specs is adopted - see also the Outlook section.
In this section we describe version 1 (V1) of the Fractal table specifications; for the moment, only V1 exists. Note that V1 specifications are only implemented as of version 0.14.0 of fractal-tasks-core.
The core-table specification consists in the definition of the required Zarr structure and attributes, and of the AnnData table format.
AnnData table format
We store tabular data into Zarr groups as AnnData (\"Annotated Data\") objects; the anndata Python library provides the definition of this format and the relevant tools. Quoting from the anndata documentation:
AnnData is specifically designed for matrix-like data. By this we mean that we have \\(n\\) observations, each of which can be represented as \\(d\\)-dimensional vectors, where each dimension corresponds to a variable or feature. Both the rows and columns of this \\(n \\times d\\) matrix are special in the sense that they are indexed.
Note that AnnData tables are easily transformed from/into pandas.DataFrame objects - see e.g. the AnnData.to_df method.
Zarr structure and attributes
The structure of Zarr groups is based on the image specification in NGFF 0.4, with an additional tables group and the corresponding subgroups (similar to labels):
image.zarr # Zarr group for a NGFF image\n|\n\u251c\u2500\u2500 0 # Zarr array for multiscale level 0\n\u251c\u2500\u2500 ...\n\u251c\u2500\u2500 N # Zarr array for multiscale level N\n|\n\u251c\u2500\u2500 labels # Zarr subgroup with a list of labels associated to this image\n| \u251c\u2500\u2500 label_A # Zarr subgroup for a given label\n| \u251c\u2500\u2500 label_B # Zarr subgroup for a given label\n| \u2514\u2500\u2500 ...\n|\n\u2514\u2500\u2500 tables # Zarr subgroup with a list of tables associated to this image\n \u251c\u2500\u2500 table_1 # Zarr subgroup for a given table\n \u251c\u2500\u2500 table_2 # Zarr subgroup for a given table\n \u2514\u2500\u2500 ...\n
The Zarr attributes of the tables group must include the key tables, pointing to the list of all tables (this simplifies discovery of tables associated to the current NGFF image), as in image.zarr/tables/.zattrs
{\n\"tables\": [\"table_1\", \"table_2\"]\n}\n
The Zarr attributes of each specific-table group must include the version of the table specification (currently version 1), through the fractal_table_version attribute. Also note that the anndata function to write an AnnData object into a Zarr group automatically sets additional attributes. Here is an example of the resulting Zarr attributes: image.zarr/tables/table_1/.zattrs
{\n\"fractal_table_version\": \"1\",\n\"encoding-type\": \"anndata\", // Automatically added by anndata 0.11\n\"encoding-version\": \"0.1.0\", // Automatically added by anndata 0.11\n}\n
In fractal-tasks-core, a ROI table defines regions of space which are three-dimensional (see also the Outlook section about dimensionality flexibility) and box-shaped. Typical use cases are described here.
Zarr attributes
The specification of a ROI table is a subset of the core table one. Moreover, the table-group Zarr attributes must include the type attribute with value roi_table, as in image.zarr/tables/table_1/.zattrs
The var attribute of a given AnnData object indexes the columns of the table. A fractal-tasks-core ROI table must include the following six columns:
x_micrometer, y_micrometer, z_micrometer: the lower bounds of the XYZ intervals defining the ROI, in micrometers;
len_x_micrometer, len_y_micrometer, len_z_micrometer: the XYZ edge lengths, in micrometers.
Notes:
The axes origin for the ROI positions (e.g. for x_micrometer) corresponds to the top-left corner of the image (for the YX axes) and to the lowest Z plane.
ROIs are defined in physical coordinates, and they do not store information on the number or size of pixels.
ROI tables may also include other columns, beyond the required ones. Here are the ones that are typically used in fractal-tasks-core (see also the Use cases section):
x_micrometer_original and y_micrometer_original, which are a copy of x_micrometer and y_micrometer taken before applying some transformation;
translation_x, translation_y and translation_z, which are used during registration of multiplexing cycles;
label, which is used to link a ROI to a label (either for masking ROI tables or for feature tables).
"},{"location":"tables/#masking-roi-tables","title":"Masking ROI tables","text":"
Masking ROI tables are a specific instance of the basic ROI tables described above, where each ROI must also be associated to a specific label of a label image.
Motivation
The motivation for this association is based on the following use case:
By performing segmentation of a NGFF image, we identify N objects and we store them as a label image (where the value at each pixel corresponds to the label index);
We also compute the three-dimensional bounding box of each segmented object, and store these bounding boxes into a masking ROI table;
For each one of these ROIs, we also include information that links it to both the label image and a specific label index;
During further processing we can load/modify specific sub-regions of the ROI, based on information contained in the label image. These kinds of operations are masked, as they only act on the array elements that match a certain condition on the label value.
Zarr attributes
For this kind of tables, fractal-tasks-core closely follows the proposed NGFF update mentioned above. The requirements on the Zarr attributes of a given table are:
Attributes must contain a type key, with value masking_roi_table.
Attributes must contain a region key; the corresponding value must be an object with a path key and a string value (i.e. the path to the data the table is annotating).
Attributes must include a key instance_key, which is the key in obs that denotes which instance in region the row corresponds to.
These attributes are stored in the .zattrs file of the table group, e.g. image.zarr/tables/table_1/.zattrs.
On top of the required ROI-table columns, the masking-ROI-table AnnData object must have an attribute obs with a key matching the instance_key zarr attribute. For instance, if instance_key=\"label\" then table.obs[\"label\"] must exist, with its items matching the labels in the image in \"../labels/label_DAPI\".
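The three attribute requirements listed above can be checked with a few lines of plain Python. The following is only a minimal sketch: the helper name check_table_attrs is hypothetical (it is not part of fractal-tasks-core), and the example attributes simply mirror the requirements and the label path quoted in this section.

```python
# Hypothetical helper (not part of fractal-tasks-core): check the Zarr
# attributes of a table against the three requirements listed above.
def check_table_attrs(attrs: dict, expected_type: str) -> list:
    # Collect human-readable problems instead of raising at the first one
    problems = []
    if attrs.get('type') != expected_type:
        problems.append(f'type must be {expected_type!r}')
    region = attrs.get('region')
    if not isinstance(region, dict) or not isinstance(region.get('path'), str):
        problems.append('region must be an object with a string path')
    if not isinstance(attrs.get('instance_key'), str):
        problems.append('instance_key must be a string')
    return problems


# Example attributes, e.g. as loaded from image.zarr/tables/table_1/.zattrs
attrs = {
    'type': 'masking_roi_table',
    'region': {'path': '../labels/label_DAPI'},
    'instance_key': 'label',
}
problems = check_table_attrs(attrs, 'masking_roi_table')
```

An empty list means that all three requirements are met; the same helper can be reused for feature tables by passing a different expected type.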
The typical use case for feature tables is to store measurements related to segmented objects, while maintaining a link to the original instances (e.g. labels). Note that the current specification is aligned with that of masking ROI tables, since they both need to relate a table to a label image, but the two may diverge in the future.
As part of the current fractal-tasks-core tasks, measurements can be performed e.g. via regionprops from scikit-image (as wrapped in napari-skimage-regionprops).
Zarr attributes
For this kind of table, fractal-tasks-core closely follows the proposed NGFF update mentioned above. The requirements on the Zarr attributes of a given table are:
Attributes must contain a type key, with value feature_table.
Attributes must contain a region key; the corresponding value must be an object with a path key and a string value (i.e. the path to the data the table is annotating).
Attributes must include a key instance_key, which is the key in obs that denotes which instance in region the row corresponds to.
These attributes are stored in the .zattrs file of the table group, e.g. image.zarr/tables/table_1/.zattrs.
The feature-table AnnData object must have an attribute obs with a key matching the instance_key zarr attribute. For instance, if instance_key=\"label\" then table.obs[\"label\"] must exist, with its items matching the labels in the image in \"../labels/label_DAPI\".
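The link between a feature table and a label image can be illustrated without any Zarr or AnnData dependency. This is a minimal sketch under stated assumptions: obs is modelled as a plain dictionary of columns, the function name check_instance_key is hypothetical, and the check interprets the matching requirement as every table label being present in the label image.

```python
# Hypothetical sketch: verify that a feature table can be linked back to
# the labels of a label image. obs is modelled as a plain dict of columns.
def check_instance_key(attrs: dict, obs: dict, image_labels: set) -> None:
    key = attrs['instance_key']
    if key not in obs:
        raise KeyError(f'obs has no column named {key!r}')
    # Assumption: every label in the table must exist in the label image
    unknown = set(obs[key]) - image_labels
    if unknown:
        raise ValueError(f'labels missing from the image: {sorted(unknown)}')


attrs = {'instance_key': 'label'}
obs = {'label': [1, 2, 3]}       # one row per measured object
image_labels = {1, 2, 3, 4}      # label values found in the label image
check_instance_key(attrs, obs, image_labels)  # passes silently
```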
"},{"location":"tables/#examples","title":"Examples","text":""},{"location":"tables/#use-cases-for-roi-tables","title":"Use cases for ROI tables","text":""},{"location":"tables/#ome-zarr-creation","title":"OME-Zarr creation","text":"
OME-Zarrs created via fractal-tasks-core (e.g. by parsing Yokogawa images via the create_ome_zarr or create_ome_zarr_multiplex tasks) always include two specific ROI tables:
The table named well_ROI_table, which covers the NGFF image corresponding to the whole well;
The table named FOV_ROI_table, which lists all original fields of view (FOVs).
Each one of these two tables includes ROIs that span the whole image size along the Z axis. Note that this differs, e.g., from ROIs which are the bounding boxes of three-dimensional segmented objects, and which may cover only a part of the image Z size.
When working with an externally-generated OME-Zarr, one may use the import_ome_zarr task to make it compatible with fractal-tasks-core. This task optionally adds two ROI tables to the NGFF images:
The table named image_ROI_table, which covers the whole image;
A table named grid_ROI_table, which splits the whole-image ROI into a YX rectangular grid of smaller ROIs. This may correspond to original FOVs (in case the image is a tiled well), or it may simply be useful for applying downstream processing to smaller arrays and avoiding large memory requirements.
As with well_ROI_table and FOV_ROI_table described above, these two tables also include ROIs spanning the whole image extent along the Z axis.
ROI tables are also used and updated during image processing, for example:
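The idea behind a grid of ROIs can be sketched in a few lines. The function name make_grid_rois, the ROI names and the example sizes are hypothetical; the column names follow the x_micrometer and len_z_micrometer pattern used in this page, with the X and Y length columns assumed by analogy.

```python
# Hypothetical sketch: split a whole-image ROI into a YX grid of smaller
# ROIs, each spanning the full Z extent. All sizes are in micrometers.
def make_grid_rois(len_x: float, len_y: float, len_z: float,
                   nx: int, ny: int) -> dict:
    rois = {}
    for iy in range(ny):
        for ix in range(nx):
            rois[f'ROI_{iy * nx + ix}'] = {
                'x_micrometer': ix * len_x / nx,
                'y_micrometer': iy * len_y / ny,
                'z_micrometer': 0.0,
                'len_x_micrometer': len_x / nx,
                'len_y_micrometer': len_y / ny,
                'len_z_micrometer': len_z,  # whole Z extent for every ROI
            }
    return rois


# A 832 x 351 x 5 micrometer image split into two side-by-side ROIs
grid = make_grid_rois(len_x=832.0, len_y=351.0, len_z=5.0, nx=2, ny=1)
```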
The FOV ROI table may undergo transformations during processing, e.g. FOV ROIs may be shifted to avoid overlaps; in this case, we use the optional columns x_micrometer_original and y_micrometer_original to store the values before the transformation.
The FOV ROI table is also used to store information on the registration of multiplexing cycles, via the translation_x, translation_y and translation_z optional columns.
Several tasks in fractal-tasks-core take an existing ROI table as an input and then loop over the ROIs defined in the table. This makes the task more flexible, as it can be used to process e.g. a whole well, a set of FOVs, or a set of custom regions of the array.
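Looping over the ROIs of a table typically means converting each ROI from physical units to array indices. The snippet below is a simplified sketch of that conversion, not the helper used by fractal-tasks-core: the function name roi_to_slices, the plain-dictionary table and the pixel sizes are all assumptions made for illustration.

```python
# Hypothetical sketch: convert ROIs from micrometers to ZYX index slices.
def roi_to_slices(roi: dict, pixel_sizes_zyx: tuple) -> tuple:
    slices = []
    for axis, pixel_size in zip('zyx', pixel_sizes_zyx):
        start = roi[f'{axis}_micrometer']
        stop = start + roi[f'len_{axis}_micrometer']
        slices.append(slice(round(start / pixel_size),
                            round(stop / pixel_size)))
    return tuple(slices)


# Toy ROI table with two fields of view (values are illustrative)
roi_table = {
    'FOV_1': {'x_micrometer': 0.0, 'y_micrometer': 0.0, 'z_micrometer': 0.0,
              'len_x_micrometer': 416.0, 'len_y_micrometer': 351.0,
              'len_z_micrometer': 5.0},
    'FOV_2': {'x_micrometer': 416.0, 'y_micrometer': 0.0, 'z_micrometer': 0.0,
              'len_x_micrometer': 416.0, 'len_y_micrometer': 351.0,
              'len_z_micrometer': 5.0},
}
pixel_sizes_zyx = (1.0, 0.5, 0.5)  # assumed pixel sizes, in micrometers

# One tuple of slices per ROI; a task would process array[region] in turn
regions = {name: roi_to_slices(roi, pixel_sizes_zyx)
           for name, roi in roi_table.items()}
```

Because the loop only depends on the table content, the same task logic applies to a whole well, a set of FOVs or any custom set of regions.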
The anndata library offers a set of functions for input/output of AnnData tables, including functions specifically targeting the Zarr format.
"},{"location":"tables/#reading-a-table","title":"Reading a table","text":"
To read an AnnData table from a Zarr group, one may use the read_zarr function. In the following example an NGFF image was created by stitching together two fields of view, where each one is made of a stack of five Z planes with 1 µm spacing between the planes. The FOV_ROI_table has information on the XY position and size of the two original FOVs (named FOV_1 and FOV_2):
In this case, the second FOV (labeled FOV_2) is defined as the three-dimensional region such that
X is between 416 and 832 micrometers;
Y is between 0 and 351 micrometers;
Z is between 0 and 5 - which means that all the five available Z planes are included.
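The intervals above follow directly from the start and length columns of the table row. As a minimal sketch, the row is written here as a plain dictionary whose values are inferred from the intervals just listed (the X and Y length column names are assumed by analogy with len_z_micrometer), and roi_bounds is a hypothetical helper.

```python
# Column values for FOV_2, inferred from the intervals listed above
fov_2 = {
    'x_micrometer': 416.0, 'y_micrometer': 0.0, 'z_micrometer': 0.0,
    'len_x_micrometer': 416.0, 'len_y_micrometer': 351.0,
    'len_z_micrometer': 5.0,
}


def roi_bounds(roi: dict) -> dict:
    # Each axis spans from its start to start + length, in micrometers
    return {
        axis: (roi[f'{axis}_micrometer'],
               roi[f'{axis}_micrometer'] + roi[f'len_{axis}_micrometer'])
        for axis in 'xyz'
    }


bounds = roi_bounds(fov_2)
# With 1 micrometer spacing, the Z length gives the number of planes
n_z_planes = round(fov_2['len_z_micrometer'] / 1.0)
```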
"},{"location":"tables/#writing-a-table","title":"Writing a table","text":"
The anndata.experimental.write_elem function provides the required functionality to write an AnnData object to a Zarr group. In fractal-tasks-core, the write_table helper function wraps the anndata function and adds some functionality (see its documentation).
With respect to the wrapped anndata function, the main additional features of write_table are:
The boolean parameter overwrite (defaulting to False), which determines the behavior in case of an already-existing table at the given path.
The table_attrs parameter, as a shorthand for updating the Zarr attributes of the table group after its creation.
These specifications may evolve (especially based on future NGFF updates), possibly leading to breaking changes in future versions. fractal-tasks-core will aim at maintaining backwards-compatibility with V1 for a reasonable amount of time.
Here is an in-progress list of aspects that may be reviewed:
We aim at removing the use of hard-coded units from the column names (e.g. x_micrometer), in favor of a more general definition of units.
The z_micrometer and len_z_micrometer columns are currently required in all ROI tables, even when the ROIs actually define a two-dimensional XY region; in that case, we set z_micrometer=0 and len_z_micrometer is such that the whole Z size is covered (that is, len_z_micrometer is the product of the spacing between Z planes and the number of planes). In a future version, we may introduce more flexibility and also accept ROI tables which only include X and Y axes, and adapt the relevant tools so that they automatically expand these ROIs into three dimensions when appropriate.
Concerning the use of AnnData tables or other formats for tabular data, our plan is to follow whatever serialised table specification becomes part of the NGFF standard. For the record, Zarr does not natively support storage of dataframes (see e.g. https://github.com/zarr-developers/numcodecs/issues/452), which is one aspect in favor of sticking with the anndata library.
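The convention for two-dimensional ROIs described in the list above (z_micrometer=0 and len_z_micrometer equal to the Z spacing times the number of planes) can be written down explicitly. This is only a sketch of what an automatic expansion might look like; the function name expand_roi_to_3d and the example values are hypothetical.

```python
# Hypothetical sketch: expand an XY-only ROI so that it covers the whole
# Z extent, following the convention described above.
def expand_roi_to_3d(roi_2d: dict, z_spacing: float,
                     num_z_planes: int) -> dict:
    roi_3d = dict(roi_2d)  # leave the input untouched
    roi_3d['z_micrometer'] = 0.0
    roi_3d['len_z_micrometer'] = z_spacing * num_z_planes
    return roi_3d


roi_2d = {
    'x_micrometer': 0.0, 'y_micrometer': 0.0,
    'len_x_micrometer': 416.0, 'len_y_micrometer': 351.0,
}
roi_3d = expand_roi_to_3d(roi_2d, z_spacing=1.0, num_z_planes=5)
```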
Within fractal-tasks-core, NGFF images represent whole wells; this still complies with the NGFF specifications, following an approved clarification in the specs. This explains why the regions corresponding to the original FOVs are stored in a specific ROI table, since one NGFF image includes a collection of FOVs. Note that this approach does not rely on the assumption that the FOVs constitute a regular tiling of the well; it also covers the case of irregularly placed FOVs.
Note that the table types masking_roi_table and feature_table closely resemble the type=\"ngff:region_table\" specification in the previous proposed NGFF table specs.
A channel which is specified by either wavelength_id or label.
This model is similar to OmeroChannel, but it is used for task-function arguments (and for generating appropriate JSON schemas).
ATTRIBUTE DESCRIPTION wavelength_id
Unique ID for the channel wavelength, e.g. A01_C01.
TYPE: Optional[str]
label
Name of the channel.
TYPE: Optional[str]
Source code in fractal_tasks_core/channels.py
class ChannelInputModel(BaseModel):\n\"\"\"\n A channel which is specified by either `wavelength_id` or `label`.\n\n This model is similar to `OmeroChannel`, but it is used for\n task-function arguments (and for generating appropriate JSON schemas).\n\n Attributes:\n wavelength_id: Unique ID for the channel wavelength, e.g. `A01_C01`.\n label: Name of the channel.\n \"\"\"\n\n wavelength_id: Optional[str] = None\n label: Optional[str] = None\n\n @validator(\"label\", always=True)\n def mutually_exclusive_channel_attributes(cls, v, values):\n\"\"\"\n Check that either `label` or `wavelength_id` is set.\n \"\"\"\n wavelength_id = values.get(\"wavelength_id\")\n label = v\n if wavelength_id and v:\n raise ValueError(\n \"`wavelength_id` and `label` cannot be both set \"\n f\"(given {wavelength_id=} and {label=}).\"\n )\n if wavelength_id is None and v is None:\n raise ValueError(\n \"`wavelength_id` and `label` cannot be both `None`\"\n )\n return v\n
@validator(\"label\", always=True)\ndef mutually_exclusive_channel_attributes(cls, v, values):\n\"\"\"\n Check that either `label` or `wavelength_id` is set.\n \"\"\"\n wavelength_id = values.get(\"wavelength_id\")\n label = v\n if wavelength_id and v:\n raise ValueError(\n \"`wavelength_id` and `label` cannot be both set \"\n f\"(given {wavelength_id=} and {label=}).\"\n )\n if wavelength_id is None and v is None:\n raise ValueError(\n \"`wavelength_id` and `label` cannot be both `None`\"\n )\n return v\n
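The effect of the validator can be reproduced without pydantic. The class below is a simplified stand-in written for illustration only (the name ChannelSketch is hypothetical); it applies the same two checks as the listing above at construction time.

```python
from typing import Optional


class ChannelSketch:
    # Simplified stand-in for ChannelInputModel, without pydantic
    def __init__(self, wavelength_id: Optional[str] = None,
                 label: Optional[str] = None):
        # Same two checks as in the validator shown above
        if wavelength_id and label:
            raise ValueError('wavelength_id and label cannot be both set')
        if wavelength_id is None and label is None:
            raise ValueError('wavelength_id and label cannot be both None')
        self.wavelength_id = wavelength_id
        self.label = label


by_label = ChannelSketch(label='DAPI')
by_wavelength = ChannelSketch(wavelength_id='A01_C01')
```

Passing both arguments, or neither, raises a ValueError, so a channel is always identified by exactly one of the two attributes.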
Custom error for when get_channel_from_list fails, that can be captured and handled upstream if needed.
Source code in fractal_tasks_core/channels.py
class ChannelNotFoundError(ValueError):\n    \"\"\"\n    Custom error for when `get_channel_from_list` fails,\n    that can be captured and handled upstream if needed.\n    \"\"\"\n\n    pass\n
Custom class for Omero channels, based on OME-NGFF v0.4.
ATTRIBUTE DESCRIPTION wavelength_id
Unique ID for the channel wavelength, e.g. A01_C01.
TYPE: str
index
Do not change. For internal use only.
TYPE: Optional[int]
label
Name of the channel.
TYPE: Optional[str]
window
Optional Window object to set default display settings for napari.
TYPE: Optional[Window]
color
Optional hex colormap to display the channel in napari (it must be of length 6, e.g. 00FFFF).
TYPE: Optional[str]
active
Should this channel be shown in the viewer?
TYPE: bool
coefficient
Do not change. Omero-channel attribute.
TYPE: int
inverted
Do not change. Omero-channel attribute.
TYPE: bool
Source code in fractal_tasks_core/channels.py
class OmeroChannel(BaseModel):\n\"\"\"\n Custom class for Omero channels, based on OME-NGFF v0.4.\n\n Attributes:\n wavelength_id: Unique ID for the channel wavelength, e.g. `A01_C01`.\n index: Do not change. For internal use only.\n label: Name of the channel.\n window: Optional `Window` object to set default display settings for\n napari.\n color: Optional hex colormap to display the channel in napari (it\n must be of length 6, e.g. `00FFFF`).\n active: Should this channel be shown in the viewer?\n coefficient: Do not change. Omero-channel attribute.\n inverted: Do not change. Omero-channel attribute.\n \"\"\"\n\n # Custom\n\n wavelength_id: str\n index: Optional[int]\n\n # From OME-NGFF v0.4 transitional metadata\n\n label: Optional[str]\n window: Optional[Window]\n color: Optional[str]\n active: bool = True\n coefficient: int = 1\n inverted: bool = False\n\n @validator(\"color\", always=True)\n def valid_hex_color(cls, v, values):\n\"\"\"\n Check that `color` is made of exactly six elements which are letters\n (a-f or A-F) or digits (0-9).\n \"\"\"\n if v is None:\n return v\n if len(v) != 6:\n raise ValueError(f'color must have length 6 (given: \"{v}\")')\n allowed_characters = \"abcdefABCDEF0123456789\"\n for character in v:\n if character not in allowed_characters:\n raise ValueError(\n \"color must only include characters from \"\n f'\"{allowed_characters}\" (given: \"{v}\")'\n )\n return v\n
Check that color is made of exactly six elements which are letters (a-f or A-F) or digits (0-9).
Source code in fractal_tasks_core/channels.py
@validator(\"color\", always=True)\ndef valid_hex_color(cls, v, values):\n\"\"\"\n Check that `color` is made of exactly six elements which are letters\n (a-f or A-F) or digits (0-9).\n \"\"\"\n if v is None:\n return v\n if len(v) != 6:\n raise ValueError(f'color must have length 6 (given: \"{v}\")')\n allowed_characters = \"abcdefABCDEF0123456789\"\n for character in v:\n if character not in allowed_characters:\n raise ValueError(\n \"color must only include characters from \"\n f'\"{allowed_characters}\" (given: \"{v}\")'\n )\n return v\n
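The same rule (exactly six characters, each a hexadecimal digit) can also be expressed as a regular expression. This is an alternative formulation shown for illustration, not the implementation used in the listing above; the function name is_valid_hex_color is hypothetical.

```python
import re


def is_valid_hex_color(color: str) -> bool:
    # Exactly six characters, each one in 0-9, a-f or A-F
    return re.fullmatch('[0-9a-fA-F]{6}', color) is not None


checks = {
    '00FFFF': is_valid_hex_color('00FFFF'),  # valid
    '00FFF': is_valid_hex_color('00FFF'),    # too short
    '00FFFG': is_valid_hex_color('00FFFG'),  # G is not a hex digit
}
```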
Custom class for Omero-channel window, based on OME-NGFF v0.4.
ATTRIBUTE DESCRIPTION min
Do not change. It will be set to 0 by default.
TYPE: Optional[int]
max
Do not change. It will be set according to bit-depth of the images by default (e.g. 65535 for 16 bit images).
TYPE: Optional[int]
start
Lower-bound rescaling value for visualization.
TYPE: int
end
Upper-bound rescaling value for visualization.
TYPE: int
Source code in fractal_tasks_core/channels.py
class Window(BaseModel):\n    \"\"\"\n    Custom class for Omero-channel window, based on OME-NGFF v0.4.\n\n    Attributes:\n        min: Do not change. It will be set to `0` by default.\n        max:\n            Do not change. It will be set according to bit-depth of the images\n            by default (e.g. 65535 for 16 bit images).\n        start: Lower-bound rescaling value for visualization.\n        end: Upper-bound rescaling value for visualization.\n    \"\"\"\n\n    min: Optional[int]\n    max: Optional[int]\n    start: int\n    end: int\n
Produce a string value that is not present in a given list
Append -1, -2, ... to a given string, if needed, until finding a value which is not already present in existing_values.
PARAMETER DESCRIPTION value
The first guess for the new value
TYPE: str
existing_values
The list of existing values
TYPE: list[str]
RETURNS DESCRIPTION str
A string value which is not present in existing_values
Source code in fractal_tasks_core/channels.py
def _get_new_unique_value(\n    value: str,\n    existing_values: list[str],\n) -> str:\n    \"\"\"\n    Produce a string value that is not present in a given list\n\n    Append `-1`, `-2`, ... to a given string, if needed, until finding a value\n    which is not already present in `existing_values`.\n\n    Args:\n        value: The first guess for the new value\n        existing_values: The list of existing values\n\n    Returns:\n        A string value which is not present in `existing_values`\n    \"\"\"\n    counter = 1\n    new_value = value\n    while new_value in existing_values:\n        new_value = f\"{value}-{counter}\"\n        counter += 1\n    return new_value\n
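A short usage example makes the suffix behavior concrete. The function body below is copied from the listing above (with type hints and docstring dropped for brevity); note that the code joins the counter with a hyphen. The channel names are made up for illustration.

```python
def _get_new_unique_value(value, existing_values):
    # Same logic as in the listing above
    counter = 1
    new_value = value
    while new_value in existing_values:
        new_value = f'{value}-{counter}'
        counter += 1
    return new_value


unused = _get_new_unique_value('DAPI', ['GFP'])
taken_once = _get_new_unique_value('DAPI', ['DAPI'])
taken_twice = _get_new_unique_value('DAPI', ['DAPI', 'DAPI-1'])
# unused is 'DAPI', taken_once is 'DAPI-1', taken_twice is 'DAPI-2'
```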
Check that the wavelength_id attributes of a channel list are unique.
PARAMETER DESCRIPTION channels
TBD
TYPE: list[OmeroChannel]
Source code in fractal_tasks_core/channels.py
def check_unique_wavelength_ids(channels: list[OmeroChannel]):\n    \"\"\"\n    Check that the `wavelength_id` attributes of a channel list are unique.\n\n    Args:\n        channels: TBD\n    \"\"\"\n    wavelength_ids = [c.wavelength_id for c in channels]\n    if len(set(wavelength_ids)) < len(wavelength_ids):\n        raise ValueError(\n            f\"Non-unique wavelength_id's in {wavelength_ids}\\n\" f\"{channels=}\"\n        )\n
Check that the channel labels for a well are unique.
First identify the channel-labels list for each image in the well, then compare lists and verify their intersection is empty.
PARAMETER DESCRIPTION well_zarr_path
path to an OME-NGFF well zarr group.
TYPE: str
Source code in fractal_tasks_core/channels.py
def check_well_channel_labels(*, well_zarr_path: str) -> None:\n\"\"\"\n Check that the channel labels for a well are unique.\n\n First identify the channel-labels list for each image in the well, then\n compare lists and verify their intersection is empty.\n\n Args:\n well_zarr_path: path to an OME-NGFF well zarr group.\n \"\"\"\n\n # Iterate over all images (multiplexing cycles, multi-FOVs, ...)\n group = zarr.open_group(well_zarr_path, mode=\"r+\")\n image_paths = [image[\"path\"] for image in group.attrs[\"well\"][\"images\"]]\n list_of_channel_lists = []\n for image_path in image_paths:\n channels = get_omero_channel_list(\n image_zarr_path=f\"{well_zarr_path}/{image_path}\"\n )\n list_of_channel_lists.append(channels[:])\n\n # For each pair of channel-labels lists, verify they do not overlap\n for ind_1, channels_1 in enumerate(list_of_channel_lists):\n labels_1 = set([c.label for c in channels_1])\n for ind_2 in range(ind_1):\n channels_2 = list_of_channel_lists[ind_2]\n labels_2 = set([c.label for c in channels_2])\n intersection = labels_1 & labels_2\n if intersection:\n hint = (\n \"Are you parsing fields of view into separate OME-Zarr \"\n \"images? This could lead to non-unique channel labels, \"\n \"and then could be the reason of the error\"\n )\n raise ValueError(\n \"Non-unique channel labels\\n\"\n f\"{labels_1=}\\n{labels_2=}\\n{hint}\"\n )\n
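The pairwise comparison at the core of this check can be isolated from the Zarr I/O. The sketch below works on plain lists of labels keyed by image path; the function name find_label_overlaps and the example labels are hypothetical, and it reports overlaps rather than raising, unlike the listing above.

```python
from itertools import combinations


def find_label_overlaps(labels_per_image: dict) -> list:
    # Compare every pair of images and record shared channel labels
    overlaps = []
    for (path_1, labels_1), (path_2, labels_2) in combinations(
        labels_per_image.items(), 2
    ):
        shared = set(labels_1) & set(labels_2)
        if shared:
            overlaps.append((path_1, path_2, sorted(shared)))
    return overlaps


# Two multiplexing cycles with distinct labels: no overlap
ok = find_label_overlaps({'0': ['DAPI_0', 'GFP_0'], '1': ['DAPI_1', 'GFP_1']})
# Two images sharing the DAPI label: one overlap is reported
bad = find_label_overlaps({'0': ['DAPI', 'GFP'], '1': ['DAPI', 'RFP']})
```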
Update a channel list to use it in the OMERO/channels metadata.
Given a list of channel dictionaries, update each one of them by: 1. Adding a label (if missing); 2. Adding a set of OMERO-specific attributes; 3. Discarding all other attributes.
The new_channels output can be used in the attrs[\"omero\"][\"channels\"] attribute of an image group.
PARAMETER DESCRIPTION channels
A list of channel dictionaries (each one must include the wavelength_id key).
new_channels, a new list of consistent channel dictionaries that can be written to OMERO metadata.
Source code in fractal_tasks_core/channels.py
def define_omero_channels(\n *,\n channels: list[OmeroChannel],\n bit_depth: int,\n label_prefix: Optional[str] = None,\n) -> list[dict[str, Union[str, int, bool, dict[str, int]]]]:\n\"\"\"\n Update a channel list to use it in the OMERO/channels metadata.\n\n Given a list of channel dictionaries, update each one of them by:\n 1. Adding a label (if missing);\n 2. Adding a set of OMERO-specific attributes;\n 3. Discarding all other attributes.\n\n The `new_channels` output can be used in the `attrs[\"omero\"][\"channels\"]`\n attribute of an image group.\n\n Args:\n channels: A list of channel dictionaries (each one must include the\n `wavelength_id` key).\n bit_depth: bit depth.\n label_prefix: TBD\n\n Returns:\n `new_channels`, a new list of consistent channel dictionaries that\n can be written to OMERO metadata.\n \"\"\"\n\n new_channels = [c.copy(deep=True) for c in channels]\n default_colors = [\"00FFFF\", \"FF00FF\", \"FFFF00\"]\n\n for channel in new_channels:\n wavelength_id = channel.wavelength_id\n\n # If channel.label is None, set it to a default value\n if channel.label is None:\n default_label = wavelength_id\n if label_prefix:\n default_label = f\"{label_prefix}_{default_label}\"\n logging.warning(\n f\"Missing label for {channel=}, using {default_label=}\"\n )\n channel.label = default_label\n\n # If channel.color is None, set it to a default value (use the default\n # ones for the first three channels, or gray otherwise)\n if channel.color is None:\n try:\n channel.color = default_colors.pop()\n except IndexError:\n channel.color = \"808080\"\n\n # Set channel.window attribute\n if channel.window:\n channel.window.min = 0\n channel.window.max = 2**bit_depth - 1\n\n # Check that channel labels are unique for this image\n labels = [c.label for c in new_channels]\n if len(set(labels)) < len(labels):\n raise ValueError(f\"Non-unique labels in {new_channels=}\")\n\n new_channels_dictionaries = [\n c.dict(exclude={\"index\"}, exclude_unset=True) for c in 
new_channels\n ]\n\n return new_channels_dictionaries\n
This is a helper function that combines get_omero_channel_list with get_channel_from_list.
PARAMETER DESCRIPTION image_zarr_path
Path to an OME-NGFF image zarr group.
TYPE: str
label
label attribute of the channel to be extracted.
TYPE: Optional[str] DEFAULT: None
wavelength_id
wavelength_id attribute of the channel to be extracted.
TYPE: Optional[str] DEFAULT: None
RETURNS DESCRIPTION OmeroChannel
A single channel dictionary.
Source code in fractal_tasks_core/channels.py
def get_channel_from_image_zarr(\n *,\n image_zarr_path: str,\n label: Optional[str] = None,\n wavelength_id: Optional[str] = None,\n) -> OmeroChannel:\n\"\"\"\n Extract a channel from OME-NGFF zarr attributes.\n\n This is a helper function that combines `get_omero_channel_list` with\n `get_channel_from_list`.\n\n Args:\n image_zarr_path: Path to an OME-NGFF image zarr group.\n label: `label` attribute of the channel to be extracted.\n wavelength_id: `wavelength_id` attribute of the channel to be\n extracted.\n\n Returns:\n A single channel dictionary.\n \"\"\"\n omero_channels = get_omero_channel_list(image_zarr_path=image_zarr_path)\n channel = get_channel_from_list(\n channels=omero_channels, label=label, wavelength_id=wavelength_id\n )\n return channel\n
Find the channel that has the required values of label and/or wavelength_id, and identify its positional index (which also corresponds to its index in the zarr array).
PARAMETER DESCRIPTION channels
A list of channel dictionaries, where each channel includes (at least) the label and wavelength_id keys.
TYPE: list[OmeroChannel]
label
The label to look for in the list of channels.
TYPE: Optional[str] DEFAULT: None
wavelength_id
The wavelength_id to look for in the list of channels.
TYPE: Optional[str] DEFAULT: None
RETURNS DESCRIPTION OmeroChannel
A single channel dictionary.
Source code in fractal_tasks_core/channels.py
def get_channel_from_list(\n *,\n channels: list[OmeroChannel],\n label: Optional[str] = None,\n wavelength_id: Optional[str] = None,\n) -> OmeroChannel:\n\"\"\"\n Find matching channel in a list.\n\n Find the channel that has the required values of `label` and/or\n `wavelength_id`, and identify its positional index (which also\n corresponds to its index in the zarr array).\n\n Args:\n channels: A list of channel dictionary, where each channel includes (at\n least) the `label` and `wavelength_id` keys.\n label: The label to look for in the list of channels.\n wavelength_id: The wavelength_id to look for in the list of channels.\n\n Returns:\n A single channel dictionary.\n \"\"\"\n\n # Identify matching channels\n if label:\n if wavelength_id:\n # Both label and wavelength_id are specified\n matching_channels = [\n c\n for c in channels\n if (c.label == label and c.wavelength_id == wavelength_id)\n ]\n else:\n # Only label is specified\n matching_channels = [c for c in channels if c.label == label]\n else:\n if wavelength_id:\n # Only wavelength_id is specified\n matching_channels = [\n c for c in channels if c.wavelength_id == wavelength_id\n ]\n else:\n # Neither label or wavelength_id are specified\n raise ValueError(\n \"get_channel requires at least one in {label,wavelength_id} \"\n \"arguments\"\n )\n\n # Verify that there is one and only one matching channel\n if len(matching_channels) == 0:\n required_match = [f\"{label=}\", f\"{wavelength_id=}\"]\n required_match_string = \" and \".join(\n [x for x in required_match if \"None\" not in x]\n )\n raise ChannelNotFoundError(\n f\"ChannelNotFoundError: No channel found in {channels}\"\n f\" for {required_match_string}\"\n )\n if len(matching_channels) > 1:\n raise ValueError(f\"Inconsistent set of channels: {channels}\")\n\n channel = matching_channels[0]\n channel.index = channels.index(channel)\n return channel\n
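The matching logic can be tried out on plain dictionaries. This is a simplified sketch, not the function listed above: find_channel is a hypothetical name, it tests arguments against None rather than for truthiness, it returns the positional index alongside the channel instead of storing it on the channel, and it raises LookupError for both the no-match and the multiple-match case.

```python
def find_channel(channels, label=None, wavelength_id=None):
    if label is None and wavelength_id is None:
        raise ValueError('provide label and/or wavelength_id')
    # Keep the channels that satisfy every criterion that was given
    matches = [
        (index, channel)
        for index, channel in enumerate(channels)
        if (label is None or channel['label'] == label)
        and (wavelength_id is None
             or channel['wavelength_id'] == wavelength_id)
    ]
    if len(matches) != 1:
        raise LookupError(f'expected one match, found {len(matches)}')
    return matches[0]


channels = [
    {'label': 'DAPI', 'wavelength_id': 'A01_C01'},
    {'label': 'GFP', 'wavelength_id': 'A02_C02'},
]
index, channel = find_channel(channels, label='GFP')
```

The returned index is the position of the channel in the list, which is also its position along the channel axis of the array.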
Extract the list of channels from OME-NGFF zarr attributes.
PARAMETER DESCRIPTION image_zarr_path
Path to an OME-NGFF image zarr group.
TYPE: str
RETURNS DESCRIPTION list[OmeroChannel]
A list of channel dictionaries.
Source code in fractal_tasks_core/channels.py
def get_omero_channel_list(*, image_zarr_path: str) -> list[OmeroChannel]:\n    \"\"\"\n    Extract the list of channels from OME-NGFF zarr attributes.\n\n    Args:\n        image_zarr_path: Path to an OME-NGFF image zarr group.\n\n    Returns:\n        A list of channel dictionaries.\n    \"\"\"\n    group = zarr.open_group(image_zarr_path, mode=\"r+\")\n    channels_dicts = group.attrs[\"omero\"][\"channels\"]\n    channels = [OmeroChannel(**c) for c in channels_dicts]\n    return channels\n
Make an existing list of Omero channels Fractal-compatible
The output channels all have keys label, wavelength_id and color; the wavelength_id values are unique across the channel list.
See https://ngff.openmicroscopy.org/0.4/index.html#omero-md for the definition of NGFF Omero metadata.
PARAMETER DESCRIPTION old_channels
Existing list of Omero-channel dictionaries
TYPE: list[dict[str, Any]]
RETURNS DESCRIPTION list[dict[str, Any]]
New list of Fractal-compatible Omero-channel dictionaries
Source code in fractal_tasks_core/channels.py
def update_omero_channels(\n old_channels: list[dict[str, Any]]\n) -> list[dict[str, Any]]:\n\"\"\"\n Make an existing list of Omero channels Fractal-compatible\n\n The output channels all have keys `label`, `wavelength_id` and `color`;\n the `wavelength_id` values are unique across the channel list.\n\n See https://ngff.openmicroscopy.org/0.4/index.html#omero-md for the\n definition of NGFF Omero metadata.\n\n Args:\n old_channels: Existing list of Omero-channel dictionaries\n\n Returns:\n New list of Fractal-compatible Omero-channel dictionaries\n \"\"\"\n new_channels = deepcopy(old_channels)\n existing_wavelength_ids: list[str] = []\n handled_channels = []\n\n default_colors = [\"00FFFF\", \"FF00FF\", \"FFFF00\"]\n\n def _get_next_color() -> str:\n try:\n return default_colors.pop(0)\n except IndexError:\n return \"808080\"\n\n # Channels that contain the key \"wavelength_id\"\n for ind, old_channel in enumerate(old_channels):\n if \"wavelength_id\" in old_channel.keys():\n handled_channels.append(ind)\n existing_wavelength_ids.append(old_channel[\"wavelength_id\"])\n new_channel = old_channel.copy()\n try:\n label = old_channel[\"label\"]\n except KeyError:\n label = str(ind + 1)\n new_channel[\"label\"] = label\n if \"color\" not in old_channel:\n new_channel[\"color\"] = _get_next_color()\n new_channels[ind] = new_channel\n\n # Channels that contain the key \"label\" but do not contain the key\n # \"wavelength_id\"\n for ind, old_channel in enumerate(old_channels):\n if ind in handled_channels:\n continue\n if \"label\" not in old_channel.keys():\n continue\n handled_channels.append(ind)\n label = old_channel[\"label\"]\n wavelength_id = _get_new_unique_value(\n label,\n existing_wavelength_ids,\n )\n existing_wavelength_ids.append(wavelength_id)\n new_channel = old_channel.copy()\n new_channel[\"wavelength_id\"] = wavelength_id\n if \"color\" not in old_channel:\n new_channel[\"color\"] = _get_next_color()\n new_channels[ind] = new_channel\n\n # Channels 
that do not contain the key \"label\" nor the key \"wavelength_id\"\n # NOTE: these channels must be treated last, as they have lower priority\n # w.r.t. existing \"wavelength_id\" or \"label\" values\n for ind, old_channel in enumerate(old_channels):\n if ind in handled_channels:\n continue\n label = str(ind + 1)\n wavelength_id = _get_new_unique_value(\n label,\n existing_wavelength_ids,\n )\n existing_wavelength_ids.append(wavelength_id)\n new_channel = old_channel.copy()\n new_channel[\"label\"] = label\n new_channel[\"wavelength_id\"] = wavelength_id\n if \"color\" not in old_channel:\n new_channel[\"color\"] = _get_next_color()\n new_channels[ind] = new_channel\n\n # Log old/new values of label, wavelength_id and color\n for ind, old_channel in enumerate(old_channels):\n label = old_channel.get(\"label\")\n color = old_channel.get(\"color\")\n wavelength_id = old_channel.get(\"wavelength_id\")\n old_attributes = (\n f\"Old attributes: {label=}, {wavelength_id=}, {color=}\"\n )\n label = new_channels[ind][\"label\"]\n wavelength_id = new_channels[ind][\"wavelength_id\"]\n color = new_channels[ind][\"color\"]\n new_attributes = (\n f\"New attributes: {label=}, {wavelength_id=}, {color=}\"\n )\n logging.info(\n \"Omero channel update:\\n\"\n f\" {old_attributes}\\n\"\n f\" {new_attributes}\"\n )\n\n return new_channels\n
This helper function is similar to write_table, in that it prepares the appropriate zarr groups (labels and the new-label one) and performs overwrite-dependent checks. Unlike write_table, this function does not actually write the label array to the new zarr group; this writing operation must take place in the actual task function, since in fractal-tasks-core it is done sequentially on different regions of the zarr array.
What this function does is:
Create the labels group, if needed.
If overwrite=False, check that the new label does not exist (either in zarr attributes or as a zarr sub-group).
Update the labels attribute of the image group.
If label_attrs is set, include this set of attributes in the new-label zarr group.
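The four steps above can be mimicked with nested dictionaries in place of zarr groups. This is a toy model for illustration only: prepare_label_sketch is a hypothetical name, the dictionary layout is an assumption, FileExistsError stands in for the dedicated overwrite error raised by the real function, and the NGFF validation of the attributes is left out.

```python
# Toy in-memory stand-in for an image group (hypothetical layout):
# {'labels': {'attrs': {'labels': [...]}, 'groups': {name: {'attrs': {}}}}}
def prepare_label_sketch(image, label_name, label_attrs, overwrite=False):
    # 1. Create the labels group, if needed
    labels = image.setdefault('labels', {'attrs': {}, 'groups': {}})
    current_labels = labels['attrs'].get('labels', [])
    # 2. With overwrite=False, an existing label is an error
    if not overwrite:
        if label_name in labels['groups'] or label_name in current_labels:
            raise FileExistsError(f'label {label_name!r} already exists')
    # 3. Update the labels attribute
    if label_name not in current_labels:
        labels['attrs']['labels'] = current_labels + [label_name]
    # 4. Create (or replace) the new-label group and set its attributes
    labels['groups'][label_name] = {'attrs': dict(label_attrs)}
    return labels['groups'][label_name]


image = {}
group = prepare_label_sketch(image, 'nuclei', {'multiscales': []})
```

Calling the function a second time with the same name fails unless overwrite=True is passed, in which case the existing group is replaced and the labels list is left unchanged.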
PARAMETER DESCRIPTION image_group
The group to write to.
TYPE: Group
label_name
The name of the new label; this name also overrides the multiscale name in NGFF-image Zarr attributes, if needed.
TYPE: str
overwrite
If False, check that the new label does not exist (either in zarr attributes or as a zarr sub-group); if True, propagate the parameter to the create_group method, making it overwrite any existing sub-group with the given name.
TYPE: bool DEFAULT: False
label_attrs
Zarr attributes of the label-image group.
TYPE: dict[str, Any]
logger
The logger to use (if unset, use logging.getLogger(None)).
TYPE: Optional[Logger] DEFAULT: None
RETURNS DESCRIPTION group
Zarr group of the new label.
Source code in fractal_tasks_core/labels.py
def prepare_label_group(\n image_group: zarr.hierarchy.Group,\n label_name: str,\n label_attrs: dict[str, Any],\n overwrite: bool = False,\n logger: Optional[logging.Logger] = None,\n) -> zarr.group:\n\"\"\"\n Set the stage for writing labels to a zarr group\n\n This helper function is similar to `write_table`, in that it prepares the\n appropriate zarr groups (`labels` and the new-label one) and performs\n `overwrite`-dependent checks. At a difference with `write_table`, this\n function does not actually write the label array to the new zarr group;\n such writing operation must take place in the actual task function, since\n in fractal-tasks-core it is done sequentially on different `region`s of the\n zarr array.\n\n What this function does is:\n\n 1. Create the `labels` group, if needed.\n 2. If `overwrite=False`, check that the new label does not exist (either in\n zarr attributes or as a zarr sub-group).\n 3. Update the `labels` attribute of the image group.\n 4. If `label_attrs` is set, include this set of attributes in the\n new-label zarr group.\n\n Args:\n image_group:\n The group to write to.\n label_name:\n The name of the new label; this name also overrides the multiscale\n name in NGFF-image Zarr attributes, if needed.\n overwrite:\n If `False`, check that the new label does not exist (either in zarr\n attributes or as a zarr sub-group); if `True` propagate parameter\n to `create_group` method, making it overwrite any existing\n sub-group with the given name.\n label_attrs:\n Zarr attributes of the label-image group.\n logger:\n The logger to use (if unset, use `logging.getLogger(None)`).\n\n Returns:\n Zarr group of the new label.\n \"\"\"\n\n # Set logger\n if logger is None:\n logger = logging.getLogger(None)\n\n # Create labels group (if needed) and extract current_labels\n if \"labels\" not in set(image_group.group_keys()):\n labels_group = image_group.create_group(\"labels\", overwrite=False)\n else:\n labels_group = image_group[\"labels\"]\n 
current_labels = labels_group.attrs.asdict().get(\"labels\", [])\n\n # If overwrite=False, check that the new label does not exist (either as a\n # zarr sub-group or as part of the zarr-group attributes)\n if not overwrite:\n if label_name in set(labels_group.group_keys()):\n error_msg = (\n f\"Sub-group '{label_name}' of group {image_group.store.path} \"\n f\"already exists, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n if label_name in current_labels:\n error_msg = (\n f\"Item '{label_name}' already exists in `labels` attribute of \"\n f\"group {image_group.store.path}, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n\n # Update the `labels` metadata of the image group, if needed\n if label_name not in current_labels:\n new_labels = current_labels + [label_name]\n labels_group.attrs[\"labels\"] = new_labels\n\n # Define new-label group\n label_group = labels_group.create_group(label_name, overwrite=overwrite)\n\n # Validate attrs against NGFF specs 0.4\n try:\n meta = NgffImageMeta(**label_attrs)\n except ValidationError as e:\n error_msg = (\n \"Label attributes do not comply with NGFF image \"\n \"specifications, as encoded in fractal-tasks-core.\\n\"\n f\"Original error:\\nValidationError: {str(e)}\"\n )\n logger.error(error_msg)\n raise ValueError(error_msg)\n # Replace multiscale name with label_name, if needed\n current_multiscale_name = meta.multiscale.name\n if current_multiscale_name != label_name:\n logger.warning(\n f\"Setting multiscale name to '{label_name}' (old value: \"\n f\"'{current_multiscale_name}') in label-image NGFF \"\n \"attributes.\"\n )\n label_attrs[\"multiscales\"][0][\"name\"] = label_name\n # Overwrite label_group attributes with label_attrs key/value pairs\n label_group.attrs.put(label_attrs)\n\n return label_group\n
Postprocess cellpose output, mainly to restore its original background.
NOTE: The pre/post-processing functions and the masked_loading_wrapper are currently meant to work as part of the cellpose_segmentation task, with the plan of then making them more flexible; see https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.
PARAMETER DESCRIPTION modified_array
The 3D (ZYX) array with the correct object data and wrong background data.
TYPE: ndarray
original_array
The 3D (ZYX) array with the wrong object data and correct background data.
TYPE: ndarray
background
The 3D (ZYX) boolean array that defines the background.
TYPE: ndarray
RETURNS DESCRIPTION ndarray
The postprocessed array.
Source code in fractal_tasks_core/masked_loading.py
def _postprocess_output(\n *,\n modified_array: np.ndarray,\n original_array: np.ndarray,\n background: np.ndarray,\n) -> np.ndarray:\n\"\"\"\n Postprocess cellpose output, mainly to restore its original background.\n\n **NOTE**: The pre/post-processing functions and the\n masked_loading_wrapper are currently meant to work as part of the\n cellpose_segmentation task, with the plan of then making them more\n flexible; see\n https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.\n\n Args:\n modified_array: The 3D (ZYX) array with the correct object data and\n wrong background data.\n original_array: The 3D (ZYX) array with the wrong object data and\n correct background data.\n background: The 3D (ZYX) boolean array that defines the background.\n\n Returns:\n The postprocessed array.\n \"\"\"\n # Restore background\n modified_array[background] = original_array[background]\n return modified_array\n
Preprocess a four-dimensional cellpose input. This involves:
Loading the masking label array for the appropriate ROI;
Extracting the appropriate label value from the ROI_table.obs dataframe;
Constructing the background mask, i.e. the region where the masking label differs from the selected label value;
Setting the background of image_array to 0;
Loading the array which will be needed in postprocessing to restore background.
NOTE 1: This function relies on V1 of the Fractal table specifications, see https://fractal-analytics-platform.github.io/fractal-tasks-core/tables/.
NOTE 2: The pre/post-processing functions and the masked_loading_wrapper are currently meant to work as part of the cellpose_segmentation task, with the plan of then making them more flexible; see https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.
Naming of variables refers to a two-step labeling (as in \"first identify organoids, then look for nuclei inside each organoid\"):
\"masking\" refers to the labels that are used to identify the object vs background (e.g. the organoid labels); these labels already exist.
\"current\" refers to the labels that are currently being computed in the cellpose_segmentation task, e.g. the nuclear labels.
PARAMETER DESCRIPTION image_array
The 4D CZYX array with image data for a specific ROI.
TYPE: ndarray
region
The ZYX indices of the ROI, in a form like (slice(0, 1), slice(1000, 2000), slice(1000, 2000)).
TYPE: tuple[slice, ...]
current_label_path
Path to the image used as current label, in a form like /somewhere/plate.zarr/A/01/0/labels/nuclei_in_organoids/0.
TYPE: str
ROI_table_path
Path of the AnnData table for the masking-label ROIs; this is used (together with ROI_positional_index) to extract label_value.
TYPE: str
ROI_positional_index
Index of the current ROI, which is used to extract label_value from ROI_table.obs.
TYPE: int
Returns: A tuple with three arrays: the preprocessed image array, the background mask, the current label.
Source code in fractal_tasks_core/masked_loading.py
def _preprocess_input(\n image_array: np.ndarray,\n *,\n region: tuple[slice, ...],\n current_label_path: str,\n ROI_table_path: str,\n ROI_positional_index: int,\n) -> tuple[np.ndarray, np.ndarray, np.ndarray]:\n\"\"\"\n Preprocess a four-dimensional cellpose input.\n\n This involves :\n\n - Loading the masking label array for the appropriate ROI;\n - Extracting the appropriate label value from the `ROI_table.obs`\n dataframe;\n - Constructing the background mask, where the masking label matches with a\n specific label value;\n - Setting the background of `image_array` to `0`;\n - Loading the array which will be needed in postprocessing to restore\n background.\n\n **NOTE 1**: This function relies on V1 of the Fractal table specifications,\n see\n https://fractal-analytics-platform.github.io/fractal-tasks-core/tables/.\n\n **NOTE 2**: The pre/post-processing functions and the\n masked_loading_wrapper are currently meant to work as part of the\n cellpose_segmentation task, with the plan of then making them more\n flexible; see\n https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/340.\n\n Naming of variables refers to a two-steps labeling, as in \"first identify\n organoids, then look for nuclei inside each organoid\") :\n\n - `\"masking\"` refers to the labels that are used to identify the object\n vs background (e.g. the organoid labels); these labels already exist.\n - `\"current\"` refers to the labels that are currently being computed in\n the `cellpose_segmentation` task, e.g. 
the nuclear labels.\n\n Args:\n image_array: The 4D CZYX array with image data for a specific ROI.\n region: The ZYX indices of the ROI, in a form like\n `(slice(0, 1), slice(1000, 2000), slice(1000, 2000))`.\n current_label_path: Path to the image used as current label, in a form\n like `/somewhere/plate.zarr/A/01/0/labels/nuclei_in_organoids/0`.\n ROI_table_path: Path of the AnnData table for the masking-label ROIs;\n this is used (together with `ROI_positional_index`) to extract\n `label_value`.\n ROI_positional_index: Index of the current ROI, which is used to\n extract `label_value` from `ROI_table.obs`.\n Returns:\n A tuple with three arrays: the preprocessed image array, the background\n mask, the current label.\n \"\"\"\n\n logger.info(f\"[_preprocess_input] {image_array.shape=}\")\n logger.info(f\"[_preprocess_input] {region=}\")\n\n # Check that image data are 4D (CZYX) - FIXME issue 340\n if not image_array.ndim == 4:\n raise ValueError(\n \"_preprocess_input requires a 4D \"\n f\"image_array argument, but {image_array.shape=}\"\n )\n\n # Load the ROI table and its metadata attributes\n ROI_table = ad.read_zarr(ROI_table_path)\n attrs = zarr.group(ROI_table_path).attrs\n logger.info(f\"[_preprocess_input] {ROI_table_path=}\")\n logger.info(f\"[_preprocess_input] {attrs.asdict()=}\")\n MaskingROITableAttrs(**attrs.asdict())\n label_relative_path = attrs[\"region\"][\"path\"]\n column_name = attrs[\"instance_key\"]\n\n # Check that ROI_table.obs has the right column and extract label_value\n if column_name not in ROI_table.obs.columns:\n raise ValueError(\n f'In _preprocess_input, \"{column_name}\" '\n f\" missing in {ROI_table.obs.columns=}\"\n )\n label_value = int(ROI_table.obs[column_name][ROI_positional_index])\n\n # Load masking-label array (lazily)\n masking_label_path = str(\n Path(ROI_table_path).parent / label_relative_path / \"0\"\n )\n logger.info(f\"{masking_label_path=}\")\n masking_label_array = da.from_zarr(masking_label_path)\n 
logger.info(\n f\"[_preprocess_input] {masking_label_path=}, \"\n f\"{masking_label_array.shape=}\"\n )\n\n # Load current-label array (lazily)\n current_label_array = da.from_zarr(current_label_path)\n logger.info(\n f\"[_preprocess_input] {current_label_path=}, \"\n f\"{current_label_array.shape=}\"\n )\n\n # Load ROI data for current label array\n current_label_region = current_label_array[region].compute()\n\n # Load ROI data for masking label array, with or without upscaling\n if masking_label_array.shape != current_label_array.shape:\n logger.info(\"Upscaling of masking label is needed\")\n lowres_region = convert_region_to_low_res(\n highres_region=region,\n highres_shape=current_label_array.shape,\n lowres_shape=masking_label_array.shape,\n )\n masking_label_region = masking_label_array[lowres_region].compute()\n masking_label_region = upscale_array(\n array=masking_label_region,\n target_shape=current_label_region.shape,\n )\n else:\n masking_label_region = masking_label_array[region].compute()\n\n # Check that all shapes match\n shapes = (\n masking_label_region.shape,\n current_label_region.shape,\n image_array.shape[1:],\n )\n if len(set(shapes)) > 1:\n raise ValueError(\n \"Shape mismatch:\\n\"\n f\"{current_label_region.shape=}\\n\"\n f\"{masking_label_region.shape=}\\n\"\n f\"{image_array.shape=}\"\n )\n\n # Compute background mask\n background_3D = masking_label_region != label_value\n if (masking_label_region == label_value).sum() == 0:\n raise ValueError(\n f\"Label {label_value} is not present in the extracted ROI\"\n )\n\n # Set image background to zero\n n_channels = image_array.shape[0]\n for i in range(n_channels):\n image_array[i, background_3D] = 0\n\n return (image_array, background_3D, current_label_region)\n
Wrap a function with some pre/post-processing functions
PARAMETER DESCRIPTION function
The callable function to be wrapped.
TYPE: Callable
image_array
The image array to be preprocessed and then used as positional argument for function.
TYPE: ndarray
kwargs
Keyword arguments for function.
TYPE: Optional[dict] DEFAULT: None
use_masks
If False, the wrapper skips pre/post-processing and only calls function(image_array, **kwargs).
TYPE: bool
preprocessing_kwargs
Keyword arguments for the preprocessing function (see call signature of _preprocess_input()).
TYPE: Optional[dict] DEFAULT: None
Source code in fractal_tasks_core/masked_loading.py
def masked_loading_wrapper(\n *,\n function: Callable,\n image_array: np.ndarray,\n kwargs: Optional[dict] = None,\n use_masks: bool,\n preprocessing_kwargs: Optional[dict] = None,\n):\n\"\"\"\n Wrap a function with some pre/post-processing functions\n\n Args:\n function: The callable function to be wrapped.\n image_array: The image array to be preprocessed and then used as\n positional argument for `function`.\n kwargs: Keyword arguments for `function`.\n use_masks: If `False`, the wrapper only calls\n `function(*args, **kwargs)`.\n preprocessing_kwargs: Keyword arguments for the preprocessing function\n (see call signature of `_preprocess_input()`).\n \"\"\"\n # Optional preprocessing\n if use_masks:\n preprocessing_kwargs = preprocessing_kwargs or {}\n (\n image_array,\n background_3D,\n current_label_region,\n ) = _preprocess_input(image_array, **preprocessing_kwargs)\n # Run function\n kwargs = kwargs or {}\n new_label_img = function(image_array, **kwargs)\n # Optional postprocessing\n if use_masks:\n new_label_img = _postprocess_output(\n modified_array=new_label_img,\n original_array=current_label_region,\n background=background_3D,\n )\n return new_label_img\n
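The pattern used by the wrapper (zero the background, run the function, restore the background) can be illustrated with a minimal NumPy-only sketch. This is not the library code: `run_with_mask` and `toy_segmentation` are hypothetical names, the arrays are built in memory rather than loaded from Zarr, and the sketch works on a copy of the image instead of modifying it in place.

```python
import numpy as np


def run_with_mask(function, image_array, masking_label, label_value, current_label):
    """Hypothetical, in-memory analogue of the masked-loading pattern."""
    # Background is everything that does not belong to the selected object
    background = masking_label != label_value
    if not (masking_label == label_value).any():
        raise ValueError(f"Label {label_value} is not present in the ROI")
    # Preprocessing: set the background of every channel to zero (on a copy)
    masked_image = image_array.copy()
    masked_image[:, background] = 0
    # Run the wrapped function on the masked image
    new_label = function(masked_image)
    # Postprocessing: restore the pre-existing labels in the background
    new_label[background] = current_label[background]
    return new_label


def toy_segmentation(image_czyx):
    # Toy stand-in for a segmentation function: label non-zero pixels as 1
    return (image_czyx[0] > 0).astype(np.uint32)


image = np.full((1, 1, 2, 4), 10, dtype=np.uint16)  # CZYX
masking = np.array([[[1, 1, 2, 2], [1, 1, 2, 2]]])  # ZYX, two masking labels
current = np.array([[[0, 0, 7, 7], [0, 0, 7, 7]]], dtype=np.uint32)  # ZYX

result = run_with_mask(toy_segmentation, image, masking, 1, current)
print(result[0])
```

Inside the object with masking label 1 the new labels are kept, while the pixels outside it retain the values already present in `current`.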
Starting from on-disk highest-resolution data, build and write to disk a pyramid with (num_levels - 1) coarsened levels. This function works for 2D, 3D or 4D arrays.
PARAMETER DESCRIPTION zarrurl
Path of the image zarr group, not including the multiscale-level path (e.g. \"some/path/plate.zarr/B/03/0\").
TYPE: Union[str, Path]
overwrite
Whether to overwrite existing pyramid levels.
TYPE: bool DEFAULT: False
num_levels
Total number of pyramid levels (including 0).
TYPE: int DEFAULT: 2
coarsening_xy
Linear coarsening factor between subsequent levels.
TYPE: int DEFAULT: 2
chunksize
Shape of a single chunk.
TYPE: Optional[Sequence[int]] DEFAULT: None
aggregation_function
Function to be used when downsampling.
TYPE: Optional[Callable] DEFAULT: None
Source code in fractal_tasks_core/pyramids.py
def build_pyramid(\n *,\n zarrurl: Union[str, pathlib.Path],\n overwrite: bool = False,\n num_levels: int = 2,\n coarsening_xy: int = 2,\n chunksize: Optional[Sequence[int]] = None,\n aggregation_function: Optional[Callable] = None,\n) -> None:\n\n\"\"\"\n Starting from on-disk highest-resolution data, build and write to disk a\n pyramid with `(num_levels - 1)` coarsened levels.\n This function works for 2D, 3D or 4D arrays.\n\n Args:\n zarrurl: Path of the image zarr group, not including the\n multiscale-level path (e.g. `\"some/path/plate.zarr/B/03/0\"`).\n overwrite: Whether to overwrite existing pyramid levels.\n num_levels: Total number of pyramid levels (including 0).\n coarsening_xy: Linear coarsening factor between subsequent levels.\n chunksize: Shape of a single chunk.\n aggregation_function: Function to be used when downsampling.\n \"\"\"\n\n # Clean up zarrurl\n zarrurl = str(pathlib.Path(zarrurl)) # FIXME\n\n # Select full-resolution multiscale level\n zarrurl_highres = f\"{zarrurl}/0\"\n logger.info(f\"[build_pyramid] High-resolution path: {zarrurl_highres}\")\n\n # Lazily load highest-resolution data\n data_highres = da.from_zarr(zarrurl_highres)\n logger.info(f\"[build_pyramid] High-resolution data: {str(data_highres)}\")\n\n # Check the number of axes and identify YX dimensions\n ndims = len(data_highres.shape)\n if ndims not in [2, 3, 4]:\n raise ValueError(f\"{data_highres.shape=}, ndims not in [2,3,4]\")\n y_axis = ndims - 2\n x_axis = ndims - 1\n\n # Set aggregation_function\n if aggregation_function is None:\n aggregation_function = np.mean\n\n # Compute and write lower-resolution levels\n previous_level = data_highres\n for ind_level in range(1, num_levels):\n # Verify that coarsening is doable\n if min(previous_level.shape[-2:]) < coarsening_xy:\n raise ValueError(\n f\"ERROR: at {ind_level}-th level, \"\n f\"coarsening_xy={coarsening_xy} \"\n f\"but previous level has shape {previous_level.shape}\"\n )\n # Apply coarsening\n newlevel = 
da.coarsen(\n aggregation_function,\n previous_level,\n {y_axis: coarsening_xy, x_axis: coarsening_xy},\n trim_excess=True,\n ).astype(data_highres.dtype)\n\n # Apply rechunking\n if chunksize is None:\n newlevel_rechunked = newlevel\n else:\n newlevel_rechunked = newlevel.rechunk(chunksize)\n logger.info(\n f\"[build_pyramid] Level {ind_level} data: \"\n f\"{str(newlevel_rechunked)}\"\n )\n\n # Write zarr and store output (useful to construct next level)\n previous_level = newlevel_rechunked.to_zarr(\n zarrurl,\n component=f\"{ind_level}\",\n overwrite=overwrite,\n compute=True,\n return_stored=True,\n write_empty_chunks=False,\n dimension_separator=\"/\",\n )\n
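To get a feeling for what the pyramid looks like, the expected shape of each level can be computed without touching any data. The sketch below is a plain-Python approximation under the assumption that only the last two axes (Y and X) are coarsened and that excess pixels are trimmed, as suggested by `trim_excess=True`; the helper name `pyramid_shapes` is hypothetical.

```python
def pyramid_shapes(shape, num_levels=2, coarsening_xy=2):
    """Return the expected array shape at each pyramid level (level 0 first)."""
    if len(shape) not in (2, 3, 4):
        raise ValueError(f"{shape=}, number of dimensions not in [2, 3, 4]")
    shapes = [tuple(shape)]
    for ind_level in range(1, num_levels):
        previous = shapes[-1]
        # Coarsening is only possible if Y and X are large enough
        if min(previous[-2:]) < coarsening_xy:
            raise ValueError(
                f"Level {ind_level}: coarsening_xy={coarsening_xy} "
                f"but previous level has shape {previous}"
            )
        # Only Y and X (the last two axes) are coarsened; excess is trimmed
        new_shape = previous[:-2] + (
            previous[-2] // coarsening_xy,
            previous[-1] // coarsening_xy,
        )
        shapes.append(new_shape)
    return shapes


for level, shape in enumerate(pyramid_shapes((3, 10, 2160, 2560), num_levels=4)):
    print(level, shape)
```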
Upscale an array along a given list of axes (through repeated application of np.repeat), to match a target shape.
PARAMETER DESCRIPTION array
The array to be upscaled.
TYPE: ndarray
target_shape
The shape of the rescaled array.
TYPE: tuple[int, ...]
axis
The axes along which to upscale the array (if None, then all axes are used).
TYPE: Optional[Sequence[int]] DEFAULT: None
pad_with_zeros
If True, pad the upscaled array with zeros to match target_shape.
TYPE: bool DEFAULT: False
warn_if_inhomogeneous
If True, raise a warning when the conversion factors are not identical across all dimensions.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION ndarray
The upscaled array, with shape target_shape.
Source code in fractal_tasks_core/upscale_array.py
def upscale_array(\n *,\n array: np.ndarray,\n target_shape: tuple[int, ...],\n axis: Optional[Sequence[int]] = None,\n pad_with_zeros: bool = False,\n warn_if_inhomogeneous: bool = False,\n) -> np.ndarray:\n\"\"\"\n Upscale an array along a given list of axis (through repeated application\n of `np.repeat`), to match a target shape.\n\n Args:\n array: The array to be upscaled.\n target_shape: The shape of the rescaled array.\n axis: The axis along which to upscale the array (if `None`, then all\n axis are used).\n pad_with_zeros: If `True`, pad the upscaled array with zeros to match\n `target_shape`.\n warn_if_inhomogeneous: If `True`, raise a warning when the conversion\n factors are not identical across all dimensions.\n\n Returns:\n The upscaled array, with shape `target_shape`.\n \"\"\"\n\n # Default behavior: use all axis\n if axis is None:\n axis = list(range(len(target_shape)))\n\n array_shape = array.shape\n info = (\n f\"Trying to upscale from {array_shape=} to {target_shape=}, \"\n f\"acting on {axis=}.\"\n )\n\n if len(array_shape) != len(target_shape):\n raise ValueError(f\"{info} Dimensions-number mismatch.\")\n if axis == []:\n raise ValueError(f\"{info} Empty axis list\")\n if min(axis) < 0:\n raise ValueError(f\"{info} Negative axis specification not allowed.\")\n\n # Check that upscale is doable\n for ind, dim in enumerate(array_shape):\n # Check that array is not larger than target (downscaling)\n if dim > target_shape[ind]:\n raise ValueError(\n f\"{info} {ind}-th array dimension is larger than target.\"\n )\n # Check that all relevant axis are included in axis\n if dim != target_shape[ind] and ind not in axis:\n raise ValueError(\n f\"{info} {ind}-th array dimension differs from \"\n f\"target, but {ind} is not included in \"\n f\"{axis=}.\"\n )\n\n # Compute upscaling factors\n upscale_factors = {}\n for ax in axis:\n if (target_shape[ax] % array_shape[ax]) > 0 and not pad_with_zeros:\n raise ValueError(\n \"Incommensurable upscale attempt, 
\"\n f\"from {array_shape=} to {target_shape=}.\"\n )\n upscale_factors[ax] = target_shape[ax] // array_shape[ax]\n # Check that this is not downscaling\n if upscale_factors[ax] < 1:\n raise ValueError(info)\n info = f\"{info} Upscale factors: {upscale_factors}\"\n\n # Raise a warning if upscaling is non-homogeneous across all axis\n if warn_if_inhomogeneous:\n if len(set(upscale_factors.values())) > 1:\n warnings.warn(f\"{info} (inhomogeneous)\")\n\n # Upscale array, via np.repeat\n upscaled_array = array\n for ax in axis:\n upscaled_array = np.repeat(\n upscaled_array, upscale_factors[ax], axis=ax\n )\n\n # Check that final shape is correct\n if not upscaled_array.shape == target_shape:\n if pad_with_zeros:\n pad_width = []\n for ax in list(range(len(target_shape))):\n missing = target_shape[ax] - upscaled_array.shape[ax]\n if missing < 0 or (missing > 0 and ax not in axis):\n raise ValueError(\n f\"{info} \" \"Something wrong during zero-padding\"\n )\n pad_width.append([0, missing])\n upscaled_array = np.pad(\n upscaled_array,\n pad_width=pad_width,\n mode=\"constant\",\n constant_values=0,\n )\n logging.warning(f\"{info} {upscaled_array.shape=}.\")\n logging.warning(\n f\"Padding upscaled_array with zeros with {pad_width=}\"\n )\n else:\n raise ValueError(f\"{info} {upscaled_array.shape=}.\")\n\n return upscaled_array\n
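The core of the upscaling step is a repeated application of `np.repeat`, one axis at a time. A minimal sketch for the commensurable case (target shape is an integer multiple of the array shape, no zero-padding) could look as follows; the variable names are illustrative only.

```python
import numpy as np

low_res = np.array([[1, 2],
                    [3, 4]])
target_shape = (4, 6)

# Integer upscale factor per axis (here 2 along Y and 3 along X)
factors = [t // s for t, s in zip(target_shape, low_res.shape)]

# Repeat the elements along each axis in turn
upscaled = low_res
for ax, factor in enumerate(factors):
    upscaled = np.repeat(upscaled, factor, axis=ax)

print(upscaled.shape)
print(upscaled)
```

Since the factors differ across axes in this example, this is the situation in which `warn_if_inhomogeneous=True` would trigger a warning.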
Discover the acquisition index based on OME-NGFF metadata.
Given the path to a zarr image folder (e.g. /path/plate.zarr/B/03/0), extract the acquisition index from the .zattrs file of the parent folder (i.e. at the well level), or return None if acquisition is not specified.
Notes:
For non-multiplexing datasets, acquisition is not required in the metadata. If it is not there, this function returns None.
This function fails if we use an image that does not belong to an OME-NGFF well.
PARAMETER DESCRIPTION image_zarr_path
Full path to an OME-NGFF image folder.
TYPE: Path
Source code in fractal_tasks_core/utils.py
def _find_omengff_acquisition(image_zarr_path: Path) -> Union[int, None]:\n\"\"\"\n Discover the acquisition index based on OME-NGFF metadata.\n\n Given the path to a zarr image folder (e.g. `/path/plate.zarr/B/03/0`),\n extract the acquisition index from the `.zattrs` file of the parent\n folder (i.e. at the well level), or return `None` if acquisition is not\n specified.\n\n Notes:\n\n 1. For non-multiplexing datasets, acquisition is not required\n in the metadata. If it is not there, this function\n returns `None`.\n 2. This function fails if we use an image that does not belong to\n an OME-NGFF well.\n\n Args:\n image_zarr_path: Full path to an OME-NGFF image folder.\n \"\"\"\n\n # Identify well path and attrs\n well_zarr_path = image_zarr_path.parent\n if not (well_zarr_path / \".zattrs\").exists():\n raise ValueError(\n f\"{str(well_zarr_path)} must be an OME-NGFF well \"\n \"folder, but it does not include a .zattrs file.\"\n )\n well_group = zarr.open_group(str(well_zarr_path))\n attrs_images = well_group.attrs[\"well\"][\"images\"]\n\n # Look for the acquisition of the current image (if any)\n acquisition = None\n for img_dict in attrs_images:\n if (\n img_dict[\"path\"] == image_zarr_path.name\n and \"acquisition\" in img_dict.keys()\n ):\n acquisition = img_dict[\"acquisition\"]\n break\n\n return acquisition\n
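The lookup itself only depends on the `well` attributes, so it can be tried out on a plain dictionary. In the sketch below, `find_acquisition` is a hypothetical helper and `well_attrs` is a made-up example of well-level metadata with three images, one of which has no acquisition.

```python
def find_acquisition(well_attrs, image_name):
    """Return the acquisition index of an image, or None if unspecified."""
    for img_dict in well_attrs["well"]["images"]:
        if img_dict["path"] == image_name and "acquisition" in img_dict:
            return img_dict["acquisition"]
    return None


# Made-up well-level attributes (what the well `.zattrs` might contain)
well_attrs = {
    "well": {
        "images": [
            {"path": "0", "acquisition": 0},
            {"path": "1", "acquisition": 1},
            {"path": "2"},
        ]
    }
}

print(find_acquisition(well_attrs, "1"))
print(find_acquisition(well_attrs, "2"))
```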
Flexibly extract parameters from metadata dictionary
This covers parameters that are either acquisition-specific (if the image belongs to an OME-NGFF array and its acquisition is specified) or simply available in the dictionary. The two cases are handled as:
metadata[\"some_parameter\"][str(acquisition)] # acquisition-specific value available\nmetadata[\"some_parameter\"] # otherwise\n
PARAMETER DESCRIPTION keys
list of required parameters.
TYPE: Sequence[str]
metadata
metadata dictionary.
TYPE: dict[str, Any]
image_zarr_path
full path to image, e.g. /path/plate.zarr/B/03/0.
TYPE: Path
Source code in fractal_tasks_core/utils.py
def get_parameters_from_metadata(\n *,\n keys: Sequence[str],\n metadata: dict[str, Any],\n image_zarr_path: Path,\n) -> dict[str, Any]:\n\"\"\"\n Flexibly extract parameters from metadata dictionary\n\n This covers both parameters which are acquisition-specific (if the image\n belongs to an OME-NGFF array and its acquisition is specified) or simply\n available in the dictionary.\n The two cases are handled as:\n ```\n metadata[acquisition][\"some_parameter\"] # acquisition available\n metadata[\"some_parameter\"] # acquisition not available\n ```\n\n Args:\n keys: list of required parameters.\n metadata: metadata dictionary.\n image_zarr_path: full path to image, e.g. `/path/plate.zarr/B/03/0`.\n \"\"\"\n\n parameters = {}\n acquisition = _find_omengff_acquisition(image_zarr_path)\n if acquisition is not None:\n parameters[\"acquisition\"] = acquisition\n\n for key in keys:\n if acquisition is None:\n parameter = metadata[key]\n else:\n try:\n parameter = metadata[key][str(acquisition)]\n except TypeError:\n parameter = metadata[key]\n except KeyError:\n parameter = metadata[key]\n parameters[key] = parameter\n return parameters\n
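The fallback logic of the source code above can be reproduced on a toy dictionary. This is a simplified sketch: `lookup` is a hypothetical helper that handles a single key, and the metadata values are invented for illustration.

```python
def lookup(metadata, key, acquisition=None):
    """Return the acquisition-specific value if available, else the plain one."""
    value = metadata[key]
    if acquisition is None:
        return value
    try:
        # Acquisition-specific values are keyed by the acquisition as a string
        return value[str(acquisition)]
    except (TypeError, KeyError):
        # Value is not a mapping, or has no entry for this acquisition
        return value


metadata = {
    "coarsening_xy": 2,
    "original_paths": {"0": "/data/cycle0", "1": "/data/cycle1"},
}

print(lookup(metadata, "coarsening_xy", acquisition=1))
print(lookup(metadata, "original_paths", acquisition=1))
```

Note that a key which is missing from `metadata` altogether still raises a `KeyError`, in the sketch as in the function above.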
Compile dictionary of (table name, table path) key/value pairs.
PARAMETER DESCRIPTION input_path
Path to the parent folder of a plate zarr group (e.g. /some/path/).
TYPE: Path
component
Path (relative to input_path) to an image zarr group (e.g. plate.zarr/B/03/0).
TYPE: str
RETURNS DESCRIPTION dict[str, str]
Dictionary with table names as keys and table paths as values. If the tables Zarr group is missing, or if it does not have a tables attribute, an empty dictionary is returned.
Source code in fractal_tasks_core/utils.py
def get_table_path_dict(input_path: Path, component: str) -> dict[str, str]:\n\"\"\"\n Compile dictionary of (table name, table path) key/value pairs.\n\n\n Args:\n input_path:\n Path to the parent folder of a plate zarr group (e.g.\n `/some/path/`).\n component:\n Path (relative to `input_path`) to an image zarr group (e.g.\n `plate.zarr/B/03/0`).\n\n Returns:\n Dictionary with table names as keys and table paths as values. If\n `tables` Zarr group is missing, or if it does not have a `tables`\n key, then return an empty dictionary.\n \"\"\"\n\n try:\n tables_group = zarr.open_group(f\"{input_path / component}/tables\", \"r\")\n table_list = tables_group.attrs[\"tables\"]\n except (zarr.errors.GroupNotFoundError, KeyError):\n table_list = []\n\n table_path_dict = {}\n for table in table_list:\n table_path_dict[table] = f\"{input_path / component}/tables/{table}\"\n\n return table_path_dict\n
Given a set of datasets (as per OME-NGFF specs), update their \"scale\" transformations in the YX directions by including a prefactor (coarsening_xy**reference_level).
PARAMETER DESCRIPTION datasets
list of datasets (as per OME-NGFF specs).
TYPE: list[dict]
coarsening_xy
linear coarsening factor between subsequent levels.
TYPE: int
reference_level
TBD
TYPE: int
remove_channel_axis
If True, remove the first item of all scale transformations.
TYPE: bool DEFAULT: False
Source code in fractal_tasks_core/utils.py
def rescale_datasets(\n *,\n datasets: list[dict],\n coarsening_xy: int,\n reference_level: int,\n remove_channel_axis: bool = False,\n) -> list[dict]:\n\"\"\"\n Given a set of datasets (as per OME-NGFF specs), update their \"scale\"\n transformations in the YX directions by including a prefactor\n (coarsening_xy**reference_level).\n\n Args:\n datasets: list of datasets (as per OME-NGFF specs).\n coarsening_xy: linear coarsening factor between subsequent levels.\n reference_level: TBD\n remove_channel_axis: If `True`, remove the first item of all `scale`\n transformations.\n \"\"\"\n\n # Construct rescaled datasets\n new_datasets = []\n for ds in datasets:\n new_ds = {}\n\n # Copy all keys that are not coordinateTransformations (e.g. path)\n for key in ds.keys():\n if key != \"coordinateTransformations\":\n new_ds[key] = ds[key]\n\n # Update coordinateTransformations\n old_transformations = ds[\"coordinateTransformations\"]\n new_transformations = []\n for t in old_transformations:\n if t[\"type\"] == \"scale\":\n new_t: dict[str, Any] = t.copy()\n # Rescale last two dimensions (that is, Y and X)\n prefactor = coarsening_xy**reference_level\n new_t[\"scale\"][-2] = new_t[\"scale\"][-2] * prefactor\n new_t[\"scale\"][-1] = new_t[\"scale\"][-1] * prefactor\n if remove_channel_axis:\n new_t[\"scale\"].pop(0)\n new_transformations.append(new_t)\n else:\n new_transformations.append(t)\n new_ds[\"coordinateTransformations\"] = new_transformations\n new_datasets.append(new_ds)\n\n return new_datasets\n
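The effect of the prefactor `coarsening_xy**reference_level` on the `scale` transformations can be checked on a small, made-up list of datasets. The sketch below is a simplified stand-in (the name `rescale_yx` is hypothetical, and `remove_channel_axis` is not covered); it uses `copy.deepcopy` so that the input list is left untouched, which is a choice of this example.

```python
import copy


def rescale_yx(datasets, coarsening_xy, reference_level):
    """Multiply the Y and X scale factors of each dataset by a prefactor."""
    prefactor = coarsening_xy**reference_level
    new_datasets = copy.deepcopy(datasets)  # do not modify the input
    for ds in new_datasets:
        for t in ds["coordinateTransformations"]:
            if t["type"] == "scale":
                # Rescale the last two dimensions (that is, Y and X)
                t["scale"][-2] *= prefactor
                t["scale"][-1] *= prefactor
    return new_datasets


datasets = [
    {
        "path": "0",
        "coordinateTransformations": [
            {"type": "scale", "scale": [1.0, 0.1625, 0.1625]}
        ],
    },
    {
        "path": "1",
        "coordinateTransformations": [
            {"type": "scale", "scale": [1.0, 0.325, 0.325]}
        ],
    },
]

rescaled = rescale_yx(datasets, coarsening_xy=2, reference_level=1)
for ds in rescaled:
    print(ds["path"], ds["coordinateTransformations"][0]["scale"])
```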
Wrap zarr.open_group and add overwrite argument.
This wrapper sets mode=\"w\" for overwrite=True and mode=\"w-\" for overwrite=False.
The expected behavior is:
if the group does not exist, create it (independently of overwrite);
if the group already exists and overwrite=True, replace the group with an empty one;
if the group already exists and overwrite=False, fail.
From the zarr.open_group docs:
mode=\"r\" means read only (must exist);
mode=\"r+\" means read/write (must exist);
mode=\"a\" means read/write (create if doesn't exist);
mode=\"w\" means create (overwrite if exists);
mode=\"w-\" means create (fail if exists).
PARAMETER DESCRIPTION path
Store or path to directory in file system or name of zip file (zarr.open_group parameter).
TYPE: Union[str, MutableMapping]
overwrite
Determines the mode parameter of zarr.open_group, which is \"w\" (if overwrite=True) or \"w-\" (if overwrite=False).
TYPE: bool
logger
The logger to use (if unset, use logging.getLogger(None))
TYPE: Optional[Logger] DEFAULT: None
open_group_kwargs
Keyword arguments of zarr.open_group.
TYPE: Any DEFAULT: {}
RETURNS DESCRIPTION Group
The zarr group.
RAISES DESCRIPTION OverwriteNotAllowedError
If overwrite=False and the group already exists.
Source code in fractal_tasks_core/zarr_utils.py
def open_zarr_group_with_overwrite(\n path: Union[str, MutableMapping],\n *,\n overwrite: bool,\n logger: Optional[logging.Logger] = None,\n **open_group_kwargs: Any,\n) -> zarr.hierarchy.Group:\n\"\"\"\n Wrap `zarr.open_group` and add `overwrite` argument.\n\n This wrapper sets `mode=\"w\"` for `overwrite=True` and `mode=\"w-\"` for\n `overwrite=False`.\n\n The expected behavior is\n\n\n * if the group does not exist, create it (independently on `overwrite`);\n * if the group already exists and `overwrite=True`, replace the group with\n an empty one;\n * if the group already exists and `overwrite=False`, fail.\n\n From the [`zarr.open_group`\n docs](https://zarr.readthedocs.io/en/stable/api/hierarchy.html#zarr.hierarchy.open_group):\n\n * `mode=\"r\"` means read only (must exist);\n * `mode=\"r+\"` means read/write (must exist);\n * `mode=\"a\"` means read/write (create if doesn\u2019t exist);\n * `mode=\"w\"` means create (overwrite if exists);\n * `mode=\"w-\"` means create (fail if exists).\n\n\n Args:\n path:\n Store or path to directory in file system or name of zip file\n (`zarr.open_group` parameter).\n overwrite:\n Determines the `mode` parameter of `zarr.open_group`, which is\n `\"w\"` (if `overwrite=True`) or `\"w-\"` (if `overwrite=False`).\n logger:\n The logger to use (if unset, use `logging.getLogger(None)`)\n open_group_kwargs:\n Keyword arguments of `zarr.open_group`.\n\n Returns:\n The zarr group.\n\n Raises:\n OverwriteNotAllowedError:\n If `overwrite=False` and the group already exists.\n \"\"\"\n\n # Set logger\n if logger is None:\n logger = logging.getLogger(None)\n\n # Set mode for zarr.open_group\n if overwrite:\n new_mode = \"w\"\n else:\n new_mode = \"w-\"\n\n # Write log about current status\n logger.info(f\"Start open_zarr_group_with_overwrite ({overwrite=}).\")\n try:\n # Call `zarr.open_group` with `mode=\"r\"`, which fails for missing group\n current_group = zarr.open_group(path, mode=\"r\")\n keys = 
list(current_group.group_keys())\n logger.info(f\"Zarr group {path} already exists, with {keys=}\")\n except GroupNotFoundError:\n logger.info(f\"Zarr group {path} does not exist yet.\")\n\n # Raise warning if we are overriding an existing value of `mode`\n if \"mode\" in open_group_kwargs.keys():\n mode = open_group_kwargs.pop(\"mode\")\n logger.warning(\n f\"Overriding {mode=} with {new_mode=}, \"\n \"in open_zarr_group_with_overwrite\"\n )\n\n # Call zarr.open_group\n try:\n return zarr.open_group(path, mode=new_mode, **open_group_kwargs)\n except ContainsGroupError:\n # Re-raise error with custom message and type\n error_msg = (\n f\"Cannot create zarr group at {path=} with `{overwrite=}` \"\n \"(original error: `zarr.errors.ContainsGroupError`).\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n
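The three expected behaviors can be mimicked with a toy model in which a dictionary plays the role of the Zarr store. This is only an analogy for the `overwrite` semantics: `create_group` is a hypothetical function and `OverwriteNotAllowedError` is redefined locally as a stand-in for the error type named above.

```python
class OverwriteNotAllowedError(RuntimeError):
    """Local stand-in for the error type named in the text."""


def create_group(store, name, overwrite):
    """Toy model: `store` is a dict mapping group names to their content."""
    if name in store and not overwrite:
        # Corresponds to mode="w-": creating over an existing group fails
        raise OverwriteNotAllowedError(
            f"Group '{name}' already exists, but overwrite=False."
        )
    # Corresponds to mode="w": create, replacing any existing content
    store[name] = {}
    return store[name]


store = {}
# Group does not exist: it is created, whatever the value of overwrite
create_group(store, "labels", overwrite=False)["nuclei"] = "data"
# Group exists and overwrite=True: it is replaced with an empty one
create_group(store, "labels", overwrite=True)
# Group exists and overwrite=False: failure
try:
    create_group(store, "labels", overwrite=False)
    failed = False
except OverwriteNotAllowedError:
    failed = True

print(store, failed)
```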
Two kinds of plate_prefix values are handled in a special way:
Filenames from FMI, with successful barcode reading: 210305NAR005AAN_210416_164828 with plate name 210305NAR005AAN;
Filenames from FMI, with failed barcode reading: yymmdd_hhmmss_210416_164828 with plate name RS{yymmddhhmmss}.
For all non-matching filenames, plate name is plate_prefix.
PARAMETER DESCRIPTION plate_prefix
TBD
TYPE: str
Source code in fractal_tasks_core/cellvoyager/filenames.py
def _get_plate_name(plate_prefix: str) -> str:\n\"\"\"\n Two kinds of plate_prefix values are handled in a special way:\n\n 1. Filenames from FMI, with successful barcode reading:\n `210305NAR005AAN_210416_164828` with plate name `210305NAR005AAN`;\n 2. Filenames from FMI, with failed barcode reading:\n `yymmdd_hhmmss_210416_164828` with plate name `RS{yymmddhhmmss}`.\n\n For all non-matching filenames, plate name is `plate_prefix`.\n\n Args:\n plate_prefix: TBD\n \"\"\"\n\n fields = plate_prefix.split(\"_\")\n\n # FMI (successful barcode reading)\n if (\n len(fields) == 3\n and len(fields[1]) == 6\n and len(fields[2]) == 6\n and fields[1].isdigit()\n and fields[2].isdigit()\n ):\n barcode, img_date, img_time = fields[:]\n plate = barcode\n # FMI (failed barcode reading)\n elif (\n len(fields) == 4\n and len(fields[0]) == 6\n and len(fields[1]) == 6\n and len(fields[2]) == 6\n and len(fields[3]) == 6\n and fields[0].isdigit()\n and fields[1].isdigit()\n and fields[2].isdigit()\n and fields[3].isdigit()\n ):\n scan_date, scan_time, img_date, img_time = fields[:]\n plate = f\"RS{scan_date + scan_time}\"\n # All non-matching cases\n else:\n plate = plate_prefix\n\n return plate\n
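The same three cases can also be expressed with regular expressions, which makes it easy to try the rules on a few prefixes. The sketch below is a re-expression for illustration, not the library implementation; `plate_name` and the two pattern names are hypothetical, and the second and third prefixes are made-up examples.

```python
import re

# <barcode>_<6 digits>_<6 digits>
BARCODE_OK = re.compile(r"([^_]*)_(\d{6})_(\d{6})")
# <6 digits>_<6 digits>_<6 digits>_<6 digits>
BARCODE_FAILED = re.compile(r"(\d{6})_(\d{6})_(\d{6})_(\d{6})")


def plate_name(plate_prefix):
    """Derive the plate name from a plate prefix."""
    match = BARCODE_FAILED.fullmatch(plate_prefix)
    if match:
        scan_date, scan_time = match.group(1), match.group(2)
        return f"RS{scan_date + scan_time}"
    match = BARCODE_OK.fullmatch(plate_prefix)
    if match:
        return match.group(1)
    # All non-matching cases
    return plate_prefix


for prefix in [
    "210305NAR005AAN_210416_164828",
    "210416_164828_210417_101010",
    "my_plate",
]:
    print(prefix, "->", plate_name(prefix))
```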
List all the items (files and folders) in a given folder that simultaneously match a series of glob patterns.
PARAMETER DESCRIPTION folder
Base folder where items will be searched.
TYPE: str
patterns
If specified, the list of patterns (defined as in https://docs.python.org/3/library/fnmatch.html) that item names must match.
TYPE: Sequence[str] DEFAULT: None
Source code in fractal_tasks_core/cellvoyager/filenames.py
def glob_with_multiple_patterns(\n *,\n folder: str,\n patterns: Sequence[str] = None,\n) -> set[str]:\n\"\"\"\n List all the items (files and folders) in a given folder that\n simultaneously match a series of glob patterns.\n\n Args:\n folder: Base folder where items will be searched.\n patterns: If specified, the list of patterns (defined as in\n https://docs.python.org/3/library/fnmatch.html) that item\n names must match.\n \"\"\"\n\n # Sanitize base-folder path\n if folder.endswith(\"/\"):\n actual_folder = folder[:-1]\n else:\n actual_folder = folder[:]\n\n # If no pattern is specified, look for *all* items in the base folder\n if not patterns:\n patterns = [\"*\"]\n\n # Combine multiple glob searches (via set intersection)\n logging.info(f\"[glob_with_multiple_patterns] {patterns=}\")\n items = None\n for pattern in patterns:\n new_matches = glob(f\"{actual_folder}/{pattern}\")\n if items is None:\n items = set(new_matches)\n else:\n items = items.intersection(new_matches)\n items = items or set()\n logging.info(f\"[glob_with_multiple_patterns] Found {len(items)} items\")\n\n return items\n
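Requiring that all patterns match at once amounts to a set intersection, which can be tried on an in-memory list of names with the standard-library `fnmatch` module. This is a sketch under simplifying assumptions: `match_all_patterns` is a hypothetical helper, the file names are invented, and no file system is involved (so details of `glob`, such as its handling of hidden files, are not reproduced).

```python
from fnmatch import fnmatchcase


def match_all_patterns(names, patterns=None):
    """Return the names that match every pattern (all names if no pattern)."""
    patterns = patterns or ["*"]
    return {
        name
        for name in names
        if all(fnmatchcase(name, pattern) for pattern in patterns)
    }


names = [
    "plate_A01_T0001F001L01A01Z01C01.png",
    "plate_A01_T0001F001L01A01Z01C02.png",
    "plate_B03_T0001F001L01A01Z01C01.png",
    "notes.txt",
]

print(match_all_patterns(names, ["*_A01_*", "*C01.png"]))
```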
Source code in fractal_tasks_core/cellvoyager/metadata.py
def calculate_steps(site_series: pd.Series):\n\"\"\"\n TBD\n\n Args:\n site_series: TBD\n \"\"\"\n\n # site_series is the z_micrometer series for a given site of a given\n # channel. This function calculates the step size in Z\n\n # First diff is always NaN because there is nothing to compare it to\n steps = site_series.diff().dropna().astype(float)\n if not np.allclose(steps.iloc[0], np.array(steps)):\n raise NotImplementedError(\n \"When parsing the Yokogawa mlf file, some sites \"\n \"had varying step size in Z. \"\n \"That is not supported for the OME-Zarr parsing\"\n )\n return steps.mean()\n
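The check above (all consecutive Z differences must be equal, and their mean is the step size) can be sketched without pandas. This is a simplified illustration: `constant_step` is a hypothetical name, and `math.isclose` with these tolerances is an assumption that only approximates what `np.allclose` does in the original.

```python
import math


def constant_step(z_positions: list[float]) -> float:
    """Return the common spacing of z_positions, or fail if it varies."""
    steps = [b - a for a, b in zip(z_positions, z_positions[1:])]
    if not steps:
        raise ValueError("At least two Z positions are needed")
    # Compare every step with the first one, as the original does
    if not all(
        math.isclose(step, steps[0], rel_tol=1e-5, abs_tol=1e-8)
        for step in steps
    ):
        raise NotImplementedError("Varying step size in Z is not supported")
    return sum(steps) / len(steps)


uniform = constant_step([0.0, 1.5, 3.0, 4.5])

try:
    constant_step([0.0, 1.0, 3.0])
    varying_rejected = False
except NotImplementedError:
    varying_rejected = True
```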
Source code in fractal_tasks_core/cellvoyager/metadata.py
def get_earliest_time_per_site(mlf_frame: pd.DataFrame) -> pd.DataFrame:\n\"\"\"\n TBD\n\n Args:\n mlf_frame: TBD\n \"\"\"\n\n # Get the time information per site\n # Because a site will contain time information for each plane\n # of each channel, we just return the earliest time information\n # per site.\n return pd.to_datetime(\n mlf_frame.groupby([\"well_id\", \"FieldIndex\"]).min()[\"Time\"], utc=True\n )\n
Source code in fractal_tasks_core/cellvoyager/metadata.py
def get_z_steps(mlf_frame: pd.DataFrame) -> pd.DataFrame:\n\"\"\"\n TBD\n\n Args:\n mlf_frame: TBD\n \"\"\"\n\n # Process mlf_frame to extract Z information (pixel size & steps).\n # Run checks on consistencies & return site-based z step dataframe\n # Group by well, field & channel\n grouped_sites_z = (\n mlf_frame.loc[\n :,\n [\"well_id\", \"FieldIndex\", \"ActionIndex\", \"Ch\", \"Z\"],\n ]\n .set_index([\"well_id\", \"FieldIndex\", \"ActionIndex\", \"Ch\"])\n .groupby(level=[0, 1, 2, 3])\n )\n\n # If there is only 1 Z step, set the Z spacing to the count of planes => 1\n if grouped_sites_z.count()[\"Z\"].max() == 1:\n z_data = grouped_sites_z.count().groupby([\"well_id\", \"FieldIndex\"])\n else:\n # Group the whole site (combine channels), because Z steps need to be\n # consistent between channels for OME-Zarr.\n z_data = grouped_sites_z.apply(calculate_steps).groupby(\n [\"well_id\", \"FieldIndex\"]\n )\n\n check_group_consistency(\n z_data, message=\"Comparing Z steps between channels\"\n )\n\n # Ensure that channels have the same number of z planes and\n # reduce it to one value.\n # Only check if there is more than one channel available\n if any(\n grouped_sites_z.count().groupby([\"well_id\", \"FieldIndex\"]).count() > 1\n ):\n check_group_consistency(\n grouped_sites_z.count().groupby([\"well_id\", \"FieldIndex\"]),\n message=\"Checking number of Z steps between channels\",\n )\n\n z_steps = (\n grouped_sites_z.count()\n .groupby([\"well_id\", \"FieldIndex\"])\n .mean()\n .astype(int)\n )\n\n # Combine the two dataframes\n z_frame = pd.concat([z_data.mean(), z_steps], axis=1)\n z_frame.columns = [\"pixel_size_z\", \"z_pixel\"]\n return z_frame\n
Parse Yokogawa CV7000 metadata files and prepare site-level metadata.
PARAMETER DESCRIPTION mrf_path
Full path to MeasurementDetail.mrf metadata file.
TYPE: Union[str, Path]
mlf_path
Full path to MeasurementData.mlf metadata file.
TYPE: Union[str, Path]
filename_patterns
List of patterns to filter the image filenames in the mlf metadata table. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html
TYPE: Optional[list[str]] DEFAULT: None
Source code in fractal_tasks_core/cellvoyager/metadata.py
def parse_yokogawa_metadata(\n mrf_path: Union[str, Path],\n mlf_path: Union[str, Path],\n *,\n filename_patterns: Optional[list[str]] = None,\n) -> tuple[pd.DataFrame, dict[str, int]]:\n\"\"\"\n Parse Yokogawa CV7000 metadata files and prepare site-level metadata.\n\n Args:\n mrf_path: Full path to MeasurementDetail.mrf metadata file.\n mlf_path: Full path to MeasurementData.mlf metadata file.\n filename_patterns:\n List of patterns to filter the image filenames in the mlf metadata\n table. Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html\n \"\"\"\n\n # Convert paths to strings\n mrf_str = Path(mrf_path).as_posix()\n mlf_str = Path(mlf_path).as_posix()\n\n mrf_frame, mlf_frame, error_count = read_metadata_files(\n mrf_str, mlf_str, filename_patterns\n )\n\n # Aggregate information from the mlf file\n per_site_parameters = [\"X\", \"Y\"]\n\n grouping_params = [\"well_id\", \"FieldIndex\"]\n grouped_sites = mlf_frame.loc[\n :, grouping_params + per_site_parameters\n ].groupby(by=grouping_params)\n\n check_group_consistency(grouped_sites, message=\"X & Y stage positions\")\n site_metadata = grouped_sites.mean()\n site_metadata.columns = [\"x_micrometer\", \"y_micrometer\"]\n site_metadata[\"z_micrometer\"] = 0\n\n site_metadata = pd.concat(\n [\n site_metadata,\n get_z_steps(mlf_frame),\n get_earliest_time_per_site(mlf_frame),\n ],\n axis=1,\n )\n\n # Aggregate information from the mrf file\n mrf_columns = [\n \"horiz_pixel_dim\",\n \"vert_pixel_dim\",\n \"horiz_pixels\",\n \"vert_pixels\",\n \"bit_depth\",\n ]\n check_group_consistency(\n mrf_frame.loc[:, mrf_columns], message=\"Image dimensions\"\n )\n site_metadata[\"pixel_size_x\"] = mrf_frame.loc[:, \"horiz_pixel_dim\"].max()\n site_metadata[\"pixel_size_y\"] = mrf_frame.loc[:, \"vert_pixel_dim\"].max()\n site_metadata[\"x_pixel\"] = int(mrf_frame.loc[:, \"horiz_pixels\"].max())\n site_metadata[\"y_pixel\"] = int(mrf_frame.loc[:, \"vert_pixels\"].max())\n 
site_metadata[\"bit_depth\"] = int(mrf_frame.loc[:, \"bit_depth\"].max())\n\n if error_count > 0:\n logger.info(\n f\"There were {error_count} ERR entries in the metadatafile. \"\n f\"Still succesfully parsed {len(site_metadata)} sites. \"\n )\n\n # Compute expected number of image files for each well\n list_of_wells = set(site_metadata.index.get_level_values(\"well_id\"))\n number_of_files = {}\n for this_well_id in list_of_wells:\n num_images = (mlf_frame.well_id == this_well_id).sum()\n logger.info(\n f\"Expected number of images for well {this_well_id}: {num_images}\"\n )\n number_of_files[this_well_id] = num_images\n # Check that the sum of per-well file numbers correspond to the total\n # file number\n if not sum(number_of_files.values()) == len(mlf_frame):\n raise ValueError(\n \"Error while counting the number of image files per well.\\n\"\n f\"{len(mlf_frame)=}\\n\"\n f\"{number_of_files=}\"\n )\n\n return site_metadata, number_of_files\n
List of patterns to filter the image filenames in the mlf metadata table. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html.
TYPE: Optional[list[str]] DEFAULT: None
Source code in fractal_tasks_core/cellvoyager/metadata.py
def read_metadata_files(\n mrf_path: str,\n mlf_path: str,\n filename_patterns: Optional[list[str]] = None,\n) -> tuple[pd.DataFrame, pd.DataFrame, int]:\n\"\"\"\n TBD\n\n Args:\n mrf_path: Full path to MeasurementDetail.mrf metadata file.\n mlf_path: Full path to MeasurementData.mlf metadata file.\n filename_patterns: List of patterns to filter the image filenames in\n the mlf metadata table. Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html.\n \"\"\"\n\n # parsing of mrf & mlf files are based on the\n # yokogawa_image_collection_task v0.5 in drogon, written by Dario Vischi.\n # https://github.com/fmi-basel/job-system-workflows/blob/00bbf34448972d27f258a2c28245dd96180e8229/src/gliberal_workflows/tasks/yokogawa_image_collection_task/versions/version_0_5.py # noqa\n # Now modified for Fractal use\n\n mrf_frame = read_mrf_file(mrf_path)\n # TODO: filter_position & filter_wheel_position are parsed, but not\n # processed further. Figure out how to save them as relevant metadata for\n # use e.g. during illumination correction\n\n mlf_frame, error_count = read_mlf_file(mlf_path, filename_patterns)\n # TODO: Time points are parsed as part of the mlf_frame, but currently not\n # processed further. Once we tackle time-resolved data, parse from here.\n\n return mrf_frame, mlf_frame, error_count\n
List of patterns to filter the image filenames in the mlf metadata table. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html.
TYPE: Optional[list[str]] DEFAULT: None
Source code in fractal_tasks_core/cellvoyager/metadata.py
def read_mlf_file(\n mlf_path: str,\n filename_patterns: Optional[list[str]] = None,\n) -> tuple[pd.DataFrame, int]:\n\"\"\"\n TBD\n\n Args:\n mlf_path: Full path to MeasurementData.mlf metadata file.\n filename_patterns: List of patterns to filter the image filenames in\n the mlf metadata table. Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html.\n \"\"\"\n\n # Load the whole MeasurementData.mlf file\n mlf_frame_raw = pd.read_xml(mlf_path)\n\n # Remove all rows that do not match the given patterns\n logger.info(\n f\"Read {mlf_path}, and apply following patterns to \"\n f\"image filenames: {filename_patterns}\"\n )\n if filename_patterns:\n filenames = mlf_frame_raw.MeasurementRecord\n keep_row = None\n for pattern in filename_patterns:\n actual_pattern = fnmatch.translate(pattern)\n new_matches = filenames.str.fullmatch(actual_pattern)\n if new_matches.sum() == 0:\n raise ValueError(\n f\"In {mlf_path} there is no image filename \"\n f'matching \"{actual_pattern}\".'\n )\n if keep_row is None:\n keep_row = new_matches.copy()\n else:\n keep_row = keep_row & new_matches\n if keep_row.sum() == 0:\n raise ValueError(\n f\"In {mlf_path} there is no image filename \"\n f\"matching {filename_patterns}.\"\n )\n mlf_frame_matching = mlf_frame_raw[keep_row.values].copy()\n else:\n mlf_frame_matching = mlf_frame_raw.copy()\n\n # Create a well ID column\n row_str = [chr(x) for x in (mlf_frame_matching[\"Row\"] + 64)]\n mlf_frame_matching[\"well_id\"] = [\n f\"{a}{b:02}\" for a, b in zip(row_str, mlf_frame_matching[\"Column\"])\n ]\n\n # Flip Y axis to align to image coordinate system\n mlf_frame_matching[\"Y\"] = -mlf_frame_matching[\"Y\"]\n\n # Compute number or errors\n error_count = (mlf_frame_matching[\"Type\"] == \"ERR\").sum()\n\n # We're only interested in the image metadata\n mlf_frame = mlf_frame_matching[mlf_frame_matching[\"Type\"] == \"IMG\"]\n\n return mlf_frame, error_count\n
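Two steps of the function above lend themselves to a small standalone illustration: combining several `fnmatch` patterns with a logical AND, and building a well ID from numeric row and column indices. The sketch below uses plain lists instead of a DataFrame; the file names and the helper `well_id` are hypothetical.

```python
import fnmatch
import re

filenames = [
    "plate_A01_T0001F001L01A01Z01C01.tif",
    "plate_A01_T0001F001L01A01Z01C02.tif",
    "plate_B03_T0001F001L01A01Z01C01.tif",
]
patterns = ["*_A01_*", "*C01.tif"]

keep = [True] * len(filenames)
for pattern in patterns:
    # fnmatch.translate converts a shell-style pattern into a regex string
    regex = re.compile(fnmatch.translate(pattern))
    matches = [regex.fullmatch(name) is not None for name in filenames]
    if not any(matches):
        raise ValueError(f"No filename matches {pattern!r}")
    # A filename is kept only if it matches every pattern
    keep = [k and m for k, m in zip(keep, matches)]
selected = [name for name, k in zip(filenames, keep) if k]


def well_id(row: int, column: int) -> str:
    """Map 1-based row/column indices to an ID such as 'B03'."""
    return f"{chr(row + 64)}{column:02}"
```

Here only the first filename satisfies both patterns, and `well_id(2, 3)` gives `B03` because `chr(66)` is `B` and the column is zero-padded to two digits.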
Pydantic v1 automatically includes args and kwargs properties in JSON Schemas generated via ValidatedFunction(task_function, config=None).model.schema(), with some default (empty) values; see https://github.com/pydantic/pydantic/blob/1.10.X-fixes/pydantic/decorator.py.
Verify that these properties match with their expected default values, and then remove them from the schema.
PARAMETER DESCRIPTION old_schema
TBD
TYPE: _Schema
Source code in fractal_tasks_core/dev/lib_args_schemas.py
def _remove_args_kwargs_properties(old_schema: _Schema) -> _Schema:\n\"\"\"\n Remove `args` and `kwargs` schema properties.\n\n Pydantic v1 automatically includes `args` and `kwargs` properties in\n JSON Schemas generated via `ValidatedFunction(task_function,\n config=None).model.schema()`, with some default (empty) values -- see see\n https://github.com/pydantic/pydantic/blob/1.10.X-fixes/pydantic/decorator.py.\n\n Verify that these properties match with their expected default values, and\n then remove them from the schema.\n\n Args:\n old_schema: TBD\n \"\"\"\n new_schema = old_schema.copy()\n args_property = new_schema[\"properties\"].pop(\"args\")\n kwargs_property = new_schema[\"properties\"].pop(\"kwargs\")\n expected_args_property = {\"title\": \"Args\", \"type\": \"array\", \"items\": {}}\n expected_kwargs_property = {\"title\": \"Kwargs\", \"type\": \"object\"}\n if args_property != expected_args_property:\n raise ValueError(\n f\"{args_property=}\\ndiffers from\\n{expected_args_property=}\"\n )\n if kwargs_property != expected_kwargs_property:\n raise ValueError(\n f\"{kwargs_property=}\\ndiffers from\\n\"\n f\"{expected_kwargs_property=}\"\n )\n logging.info(\"[_remove_args_kwargs_properties] END\")\n return new_schema\n
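As a rough illustration of the verify-then-remove step, the sketch below applies the same logic to a hand-written schema dictionary. `strip_args_kwargs` and the example schema are hypothetical. One deliberate difference from the code above: the sketch uses `copy.deepcopy`, whereas `dict.copy()` is shallow, so popping from the nested `properties` dictionary of a shallow copy also changes the input schema.

```python
from copy import deepcopy

# Default values that Pydantic v1 is expected to emit, as listed above
EXPECTED = {
    "args": {"title": "Args", "type": "array", "items": {}},
    "kwargs": {"title": "Kwargs", "type": "object"},
}


def strip_args_kwargs(schema: dict) -> dict:
    """Return a copy of schema without the args/kwargs properties."""
    new_schema = deepcopy(schema)  # leave the caller's dictionary untouched
    for key, expected in EXPECTED.items():
        found = new_schema["properties"].pop(key)
        if found != expected:
            raise ValueError(f"{key}: {found} differs from {expected}")
    return new_schema


schema = {
    "title": "MyTask",
    "type": "object",
    "properties": {
        "input_path": {"title": "Input Path", "type": "string"},
        "args": {"title": "Args", "type": "array", "items": {}},
        "kwargs": {"title": "Kwargs", "type": "object"},
    },
}
cleaned = strip_args_kwargs(schema)
```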
Keeps only the description part of the docstrings: e.g. from
'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\n'\n'\\n'\n'Attributes:\\n'\n'min: Do not change. It will be set to `0` by default.\\n'\n'max: Do not change. It will be set according to bitdepth of the images\\n'\n' by default (e.g. 65535 for 16 bit images).\\n'\n'start: Lower-bound rescaling value for visualization.\\n'\n'end: Upper-bound rescaling value for visualization.'\n
to 'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\n'.
PARAMETER DESCRIPTION old_schema
TBD
TYPE: _Schema
Source code in fractal_tasks_core/dev/lib_args_schemas.py
def _remove_attributes_from_descriptions(old_schema: _Schema) -> _Schema:\n\"\"\"\n Keeps only the description part of the docstrings: e.g from\n ```\n 'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\\\n'\n '\\\\n'\n 'Attributes:\\\\n'\n 'min: Do not change. It will be set to `0` by default.\\\\n'\n 'max: Do not change. It will be set according to bitdepth of the images\\\\n'\n ' by default (e.g. 65535 for 16 bit images).\\\\n'\n 'start: Lower-bound rescaling value for visualization.\\\\n'\n 'end: Upper-bound rescaling value for visualization.'\n ```\n to `'Custom class for Omero-channel window, based on OME-NGFF v0.4.\\\\n'`.\n\n Args:\n old_schema: TBD\n \"\"\"\n new_schema = old_schema.copy()\n if \"definitions\" in new_schema:\n for name, definition in new_schema[\"definitions\"].items():\n parsed_docstring = docparse(definition[\"description\"])\n new_schema[\"definitions\"][name][\n \"description\"\n ] = parsed_docstring.short_description\n logging.info(\"[_remove_attributes_from_descriptions] END\")\n return new_schema\n
Main function to create a JSON Schema of task arguments
Source code in fractal_tasks_core/dev/lib_args_schemas.py
def create_schema_for_single_task(\n executable: str,\n package: str = \"fractal_tasks_core\",\n custom_pydantic_models: Optional[list[tuple[str, str, str]]] = None,\n) -> _Schema:\n\"\"\"\n Main function to create a JSON Schema of task arguments\n \"\"\"\n\n logging.info(\"[create_schema_for_single_task] START\")\n\n # Extract the function name. Note: this could be made more general, but for\n # the moment we assume the function has the same name as the module)\n function_name = Path(executable).with_suffix(\"\").name\n logging.info(f\"[create_schema_for_single_task] {function_name=}\")\n\n # Extract function from module\n task_function = _extract_function(\n package_name=package,\n module_relative_path=executable,\n function_name=function_name,\n )\n\n logging.info(f\"[create_schema_for_single_task] {task_function=}\")\n\n # Validate function signature against some custom constraints\n _validate_function_signature(task_function)\n\n # Create and clean up schema\n vf = ValidatedFunction(task_function, config=None)\n schema = vf.model.schema()\n schema = _remove_args_kwargs_properties(schema)\n schema = _remove_pydantic_internals(schema)\n schema = _remove_attributes_from_descriptions(schema)\n\n # Include titles for custom-model-typed arguments\n schema = _include_titles(schema)\n\n # Include descriptions of function arguments\n function_args_descriptions = _get_function_args_descriptions(\n package_name=package,\n module_relative_path=executable,\n function_name=function_name,\n )\n schema = _insert_function_args_descriptions(\n schema=schema, descriptions=function_args_descriptions\n )\n\n # Merge lists of fractal-tasks-core and user-provided Pydantic models\n user_provided_models = custom_pydantic_models or []\n pydantic_models = FRACTAL_TASKS_CORE_PYDANTIC_MODELS + user_provided_models\n\n # Check that model names are unique\n pydantic_models_names = [item[2] for item in pydantic_models]\n duplicate_class_names = [\n name\n for name, count in 
Counter(pydantic_models_names).items()\n if count > 1\n ]\n if duplicate_class_names:\n pydantic_models_str = \" \" + \"\\n \".join(map(str, pydantic_models))\n raise ValueError(\n \"Cannot parse docstrings for models with non-unique names \"\n f\"{duplicate_class_names}, in\\n{pydantic_models_str}\"\n )\n\n # Extract model-attribute descriptions and insert them into schema\n for package_name, module_relative_path, class_name in pydantic_models:\n attrs_descriptions = _get_class_attrs_descriptions(\n package_name=package_name,\n module_relative_path=module_relative_path,\n class_name=class_name,\n )\n schema = _insert_class_attrs_descriptions(\n schema=schema,\n class_name=class_name,\n descriptions=attrs_descriptions,\n )\n\n logging.info(\"[create_schema_for_single_task] END\")\n return schema\n
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _get_class_attrs_descriptions(\n package_name: str, module_relative_path: str, class_name: str\n) -> dict[str, str]:\n\"\"\"\n Extract attribute descriptions from a class.\n\n Args:\n package_name: Example `fractal_tasks_core`.\n module_relative_path: Example `lib_channels.py`.\n class_name: Example `OmeroChannel`.\n \"\"\"\n\n if not module_relative_path.endswith(\".py\"):\n raise ValueError(f\"Module {module_relative_path} must end with '.py'\")\n\n # Get the class ast.ClassDef object\n package_path = Path(import_module(package_name).__file__).parent\n module_path = package_path / module_relative_path\n tree = ast.parse(module_path.read_text())\n try:\n _class = next(\n c\n for c in ast.walk(tree)\n if (isinstance(c, ast.ClassDef) and c.name == class_name)\n )\n except StopIteration:\n raise RuntimeError(\n f\"Cannot find {class_name=} for {package_name=} \"\n f\"and {module_relative_path=}\"\n )\n docstring = ast.get_docstring(_class)\n parsed_docstring = docparse(docstring)\n descriptions = {\n x.arg_name: _sanitize_description(x.description)\n if x.description\n else \"Missing description\"\n for x in parsed_docstring.params\n }\n logging.info(f\"[_get_class_attrs_descriptions] END ({class_name=})\")\n return descriptions\n
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _get_function_docstring(\n package_name: str, module_relative_path: str, function_name: str\n) -> str:\n\"\"\"\n Extract docstring from a function.\n\n Args:\n package_name: Example `fractal_tasks_core`.\n module_relative_path: Example `tasks/create_ome_zarr.py`.\n function_name: Example `create_ome_zarr`.\n \"\"\"\n\n if not module_relative_path.endswith(\".py\"):\n raise ValueError(f\"Module {module_relative_path} must end with '.py'\")\n\n # Get the function ast.FunctionDef object\n package_path = Path(import_module(package_name).__file__).parent\n module_path = package_path / module_relative_path\n tree = ast.parse(module_path.read_text())\n _function = next(\n f\n for f in ast.walk(tree)\n if (isinstance(f, ast.FunctionDef) and f.name == function_name)\n )\n\n # Extract docstring from ast.FunctionDef\n return ast.get_docstring(_function)\n
Merge the descriptions obtained via _get_class_attrs_descriptions into the class_name definition, within an existing JSON Schema.
PARAMETER DESCRIPTION schema
TBD
TYPE: dict
class_name
TBD
TYPE: str
descriptions
TBD
TYPE: dict
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _insert_class_attrs_descriptions(\n *, schema: dict, class_name: str, descriptions: dict\n):\n\"\"\"\n Merge the descriptions obtained via `_get_attributes_models_descriptions`\n into the `class_name` definition, within an existing JSON Schema\n\n Args:\n schema: TBD\n class_name: TBD\n descriptions: TBD\n \"\"\"\n new_schema = schema.copy()\n if \"definitions\" not in schema:\n return new_schema\n else:\n new_definitions = schema[\"definitions\"].copy()\n # Loop over existing definitions\n for name, definition in schema[\"definitions\"].items():\n if name == class_name:\n for prop in definition[\"properties\"]:\n if \"description\" in new_definitions[name][\"properties\"][prop]:\n raise ValueError(\n f\"Property {name}.{prop} already has description\"\n )\n else:\n new_definitions[name][\"properties\"][prop][\n \"description\"\n ] = descriptions[prop]\n new_schema[\"definitions\"] = new_definitions\n logging.info(\"[_insert_class_attrs_descriptions] END\")\n return new_schema\n
This is a provisional helper function that replaces newlines with spaces and reduces multiple contiguous whitespace characters to a single one. Future iterations of the docstring format or parsing may make this function unnecessary or obsolete.
PARAMETER DESCRIPTION string
TBD
TYPE: str
Source code in fractal_tasks_core/dev/lib_descriptions.py
def _sanitize_description(string: str) -> str:\n\"\"\"\n Sanitize a description string.\n\n This is a provisional helper function that replaces newlines with spaces\n and reduces multiple contiguous whitespace characters to a single one.\n Future iterations of the docstrings format/parsing may render this function\n not-needed or obsolete.\n\n Args:\n string: TBD\n \"\"\"\n # Replace newline with space\n new_string = string.replace(\"\\n\", \" \")\n # Replace N-whitespace characters with a single one\n while \"  \" in new_string:\n new_string = new_string.replace(\"  \", \" \")\n return new_string\n
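For comparison, a similar normalization can be written in one line with `str.split`. This is an alternative sketch, not the package's implementation; `sanitize` is a hypothetical name. Unlike the function above, it also strips leading and trailing whitespace and treats tabs as whitespace.

```python
def sanitize(text: str) -> str:
    """Collapse any run of whitespace (including newlines) into one space."""
    # str.split() with no arguments splits on arbitrary whitespace runs
    return " ".join(text.split())


raw = "Lower-bound rescaling value\n    for   visualization."
clean = sanitize(raw)
```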
Extract function from a module with the same name.
PARAMETER DESCRIPTION package_name
Example fractal_tasks_core.
TYPE: str DEFAULT: 'fractal_tasks_core'
module_relative_path
Example tasks/create_ome_zarr.py.
TYPE: str
function_name
Example create_ome_zarr.
TYPE: str
Source code in fractal_tasks_core/dev/lib_signature_constraints.py
def _extract_function(\n module_relative_path: str,\n function_name: str,\n package_name: str = \"fractal_tasks_core\",\n) -> Callable:\n\"\"\"\n Extract function from a module with the same name.\n\n Args:\n package_name: Example `fractal_tasks_core`.\n module_relative_path: Example `tasks/create_ome_zarr.py`.\n function_name: Example `create_ome_zarr`.\n \"\"\"\n if not module_relative_path.endswith(\".py\"):\n raise ValueError(f\"{module_relative_path=} must end with '.py'\")\n module_relative_path_no_py = str(\n Path(module_relative_path).with_suffix(\"\")\n )\n module_relative_path_dots = module_relative_path_no_py.replace(\"/\", \".\")\n module = import_module(f\"{package_name}.{module_relative_path_dots}\")\n task_function = getattr(module, function_name)\n return task_function\n
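The path-to-module conversion used above can be tried out on a standard-library package. In this sketch, `dotted_module_name` is a hypothetical helper, and `xml` / `etree/ElementTree.py` / `fromstring` stand in for a task package, module path and function name.

```python
from importlib import import_module
from pathlib import Path


def dotted_module_name(package_name: str, module_relative_path: str) -> str:
    """Turn ('xml', 'etree/ElementTree.py') into 'xml.etree.ElementTree'."""
    if not module_relative_path.endswith(".py"):
        raise ValueError(f"{module_relative_path=} must end with '.py'")
    # Drop the '.py' suffix, then replace path separators with dots
    relative = Path(module_relative_path).with_suffix("").as_posix()
    return f"{package_name}.{relative.replace('/', '.')}"


module_name = dotted_module_name("xml", "etree/ElementTree.py")
function = getattr(import_module(module_name), "fromstring")
root = function("<plate name='demo'/>")
```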
Implement a set of checks for type hints that do not play well with the creation of JSON Schema, see https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/399.
PARAMETER DESCRIPTION function
TBD
TYPE: Callable
Source code in fractal_tasks_core/dev/lib_signature_constraints.py
def _validate_function_signature(function: Callable):\n\"\"\"\n Validate the function signature.\n\n Implement a set of checks for type hints that do not play well with the\n creation of JSON Schema, see\n https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/399.\n\n Args:\n function: TBD\n \"\"\"\n sig = signature(function)\n for param in sig.parameters.values():\n\n # CASE 1: Check that name is not forbidden\n if param.name in FORBIDDEN_PARAM_NAMES:\n raise ValueError(\n f\"Function {function} has argument with name {param.name}\"\n )\n\n # CASE 2: Raise an error for unions\n if str(param.annotation).startswith((\"typing.Union[\", \"Union[\")):\n raise ValueError(\"typing.Union is not supported\")\n\n # CASE 3: Raise an error for \"|\"\n if \"|\" in str(param.annotation):\n raise ValueError('Use of \"|\" in type hints is not supported')\n\n # CASE 4: Raise an error for optional parameter with given (non-None)\n # default, e.g. Optional[str] = \"asd\"\n is_annotation_optional = str(param.annotation).startswith(\n (\"typing.Optional[\", \"Optional[\")\n )\n default_given = (param.default is not None) and (\n param.default != inspect._empty\n )\n if default_given and is_annotation_optional:\n raise ValueError(\"Optional parameter has non-None default value\")\n\n logging.info(\"[_validate_function_signature] END\")\n return sig\n
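The fourth check (an `Optional` parameter with a non-`None` default) can be illustrated in isolation. The sketch below is an assumption-laden variant: it inspects annotations with `typing.get_origin` and `typing.get_args` instead of the string-prefix comparison used above, and `has_optional_with_default`, `good_task` and `bad_task` are hypothetical names.

```python
import inspect
from typing import Optional, Union, get_args, get_origin


def has_optional_with_default(function) -> bool:
    """Return True if any Optional[...] parameter has a non-None default."""
    for param in inspect.signature(function).parameters.values():
        annotation = param.annotation
        # Optional[X] is Union[X, None], so look for a Union containing None
        is_optional = (
            get_origin(annotation) is Union
            and type(None) in get_args(annotation)
        )
        has_default = (
            param.default is not None
            and param.default is not inspect.Parameter.empty
        )
        if is_optional and has_default:
            return True
    return False


def good_task(input_path: str, suffix: Optional[str] = None):
    ...


def bad_task(input_path: str, suffix: Optional[str] = "asd"):
    ...
```

With these definitions, `bad_task` would be flagged and `good_task` would pass.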
Return task description based on function docstring.
Source code in fractal_tasks_core/dev/lib_task_docs.py
def create_docs_info(\n executable: str,\n package: str = \"fractal_tasks_core\",\n) -> str:\n\"\"\"\n Return task description based on function docstring.\n \"\"\"\n logging.info(\"[create_docs_info] START\")\n # Extract the function name. Note: this could be made more general, but for\n # the moment we assume the function has the same name as the module)\n function_name = Path(executable).with_suffix(\"\").name\n logging.info(f\"[create_docs_info] {function_name=}\")\n # Get function description\n docs_info = _get_function_description(\n package_name=package,\n module_relative_path=executable,\n function_name=function_name,\n )\n logging.info(\"[create_docs_info] END\")\n return docs_info\n
Return link to docs page for a fractal_tasks_core task.
Source code in fractal_tasks_core/dev/lib_task_docs.py
def create_docs_link(executable: str) -> str:\n\"\"\"\n Return link to docs page for a fractal_tasks_core task.\n \"\"\"\n logging.info(\"[create_docs_link] START\")\n\n # Extract the function name. Note: this could be made more general, but for\n # the moment we assume the function has the same name as the module)\n function_name = Path(executable).with_suffix(\"\").name\n logging.info(f\"[create_docs_link] {function_name=}\")\n # Define docs_link\n docs_link = (\n \"https://fractal-analytics-platform.github.io/fractal-tasks-core/\"\n f\"reference/fractal_tasks_core/tasks/{function_name}/\"\n f\"#fractal_tasks_core.tasks.{function_name}.{function_name}\"\n )\n logging.info(\"[create_docs_link] END\")\n return docs_link\n
Scan through properties of a JSON Schema, and set their title when it is missing.
The title is set to name.title(), where title is a standard string method - see https://docs.python.org/3/library/stdtypes.html#str.title.
PARAMETER DESCRIPTION properties
TBD
TYPE: dict[str, dict]
Source code in fractal_tasks_core/dev/lib_titles.py
def _include_titles_for_properties(\n properties: dict[str, dict]\n) -> dict[str, dict]:\n\"\"\"\n Scan through properties of a JSON Schema, and set their title when it is\n missing.\n\n The title is set to `name.title()`, where `title` is a standard string\n method - see https://docs.python.org/3/library/stdtypes.html#str.title.\n\n Args:\n properties: TBD\n \"\"\"\n new_properties = properties.copy()\n for prop_name, prop in properties.items():\n if \"title\" not in prop.keys():\n new_prop = prop.copy()\n new_prop[\"title\"] = prop_name.title()\n new_properties[prop_name] = new_prop\n return new_properties\n
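The effect of `str.title()` on typical argument names is worth seeing once, since it capitalizes the letter after each underscore but keeps the underscore itself. The property names below are made up for illustration.

```python
properties = {
    "input_paths": {"type": "array"},
    "level": {"type": "integer", "title": "Pyramid level"},
}

# Add a title only where it is missing; existing titles are left alone
with_titles = {
    name: prop if "title" in prop else {**prop, "title": name.title()}
    for name, prop in properties.items()
}
```

After this, `input_paths` gets the title `Input_Paths`, while `level` keeps its explicit title.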
class Axis(BaseModel):\n\"\"\"\n Model for an element of `Multiscale.axes`.\n\n See https://ngff.openmicroscopy.org/0.4/#axes-md.\n \"\"\"\n\n name: str\n type: Optional[str] = None\n unit: Optional[str] = None\n
See https://ngff.openmicroscopy.org/0.4/#omero-md.
Source code in fractal_tasks_core/ngff/specs.py
class Channel(BaseModel):\n\"\"\"\n Model for an element of `Omero.channels`.\n\n See https://ngff.openmicroscopy.org/0.4/#omero-md.\n \"\"\"\n\n window: Optional[Window] = None\n label: Optional[str] = None\n family: Optional[str] = None\n color: str\n active: Optional[bool] = None\n
See https://ngff.openmicroscopy.org/0.4/#multiscale-md
Source code in fractal_tasks_core/ngff/specs.py
class Dataset(BaseModel):\n\"\"\"\n Model for an element of `Multiscale.datasets`.\n\n See https://ngff.openmicroscopy.org/0.4/#multiscale-md\n \"\"\"\n\n path: str\n coordinateTransformations: list[\n Union[\n ScaleCoordinateTransformation, TranslationCoordinateTransformation\n ]\n ] = Field(..., min_items=1)\n\n @property\n def scale_transformation(self) -> ScaleCoordinateTransformation:\n\"\"\"\n Extract the unique scale transformation, or fail otherwise.\n \"\"\"\n _transformations = [\n t for t in self.coordinateTransformations if t.type == \"scale\"\n ]\n if len(_transformations) == 0:\n raise ValueError(\n \"Missing scale transformation in dataset.\\n\"\n \"Current coordinateTransformations:\\n\"\n f\"{self.coordinateTransformations}\"\n )\n elif len(_transformations) > 1:\n raise ValueError(\n \"More than one scale transformation in dataset.\\n\"\n \"Current coordinateTransformations:\\n\"\n f\"{self.coordinateTransformations}\"\n )\n else:\n return _transformations[0]\n
Note 1: The NGFF image is defined in a different model (NgffImageMeta), while the Image model only refers to an item of Well.images.
Note 2: We deviate from NGFF specs, since we allow path to be an arbitrary string. TODO: include a check like constr(regex=r'^[A-Za-z0-9]+$'), through a Pydantic validator.
See https://ngff.openmicroscopy.org/0.4/#well-md.
Source code in fractal_tasks_core/ngff/specs.py
class ImageInWell(BaseModel):\n\"\"\"\n Model for an element of `Well.images`.\n\n **Note 1:** The NGFF image is defined in a different model\n (`NgffImageMeta`), while the `Image` model only refere to an item of\n `Well.images`.\n\n **Note 2:** We deviate from NGFF specs, since we allow `path` to be an\n arbitrary string.\n TODO: include a check like `constr(regex=r'^[A-Za-z0-9]+$')`, through a\n Pydantic validator.\n\n See https://ngff.openmicroscopy.org/0.4/#well-md.\n \"\"\"\n\n acquisition: Optional[int] = Field(\n None, description=\"A unique identifier within the context of the plate\"\n )\n path: str = Field(\n ..., description=\"The path for this field of view subgroup\"\n )\n
Model for an element of NgffImageMeta.multiscales.
See https://ngff.openmicroscopy.org/0.4/#multiscale-md.
Source code in fractal_tasks_core/ngff/specs.py
class Multiscale(BaseModel):\n\"\"\"\n Model for an element of `NgffImageMeta.multiscales`.\n\n See https://ngff.openmicroscopy.org/0.4/#multiscale-md.\n \"\"\"\n\n name: Optional[str] = None\n datasets: list[Dataset] = Field(..., min_items=1)\n version: Optional[str] = None\n axes: list[Axis] = Field(..., max_items=5, min_items=2, unique_items=True)\n coordinateTransformations: Optional[\n list[\n Union[\n ScaleCoordinateTransformation,\n TranslationCoordinateTransformation,\n ]\n ]\n ] = None\n\n @validator(\"coordinateTransformations\", always=True)\n def _no_global_coordinateTransformations(cls, v):\n\"\"\"\n Fail if Multiscale has a (global) coordinateTransformations attribute.\n \"\"\"\n if v is not None:\n raise NotImplementedError(\n \"Global coordinateTransformations at the multiscales \"\n \"level are not currently supported in the fractal-tasks-core \"\n \"model for the NGFF multiscale.\"\n )\n
Fail if Multiscale has a (global) coordinateTransformations attribute.
Source code in fractal_tasks_core/ngff/specs.py
@validator(\"coordinateTransformations\", always=True)\ndef _no_global_coordinateTransformations(cls, v):\n\"\"\"\n Fail if Multiscale has a (global) coordinateTransformations attribute.\n \"\"\"\n if v is not None:\n raise NotImplementedError(\n \"Global coordinateTransformations at the multiscales \"\n \"level are not currently supported in the fractal-tasks-core \"\n \"model for the NGFF multiscale.\"\n )\n
See https://ngff.openmicroscopy.org/0.4/#image-layout.
Source code in fractal_tasks_core/ngff/specs.py
class NgffImageMeta(BaseModel):\n\"\"\"\n Model for the metadata of a NGFF image.\n\n See https://ngff.openmicroscopy.org/0.4/#image-layout.\n \"\"\"\n\n multiscales: list[Multiscale] = Field(\n ...,\n description=\"The multiscale datasets for this image\",\n min_items=1,\n unique_items=True,\n )\n omero: Optional[Omero] = None\n\n @property\n def multiscale(self) -> Multiscale:\n\"\"\"\n The single element of `self.multiscales`.\n\n Raises:\n NotImplementedError:\n If there are no multiscales or more than one.\n \"\"\"\n if len(self.multiscales) > 1:\n raise NotImplementedError(\n \"Only images with one multiscale are supported \"\n f\"(given: {len(self.multiscales)}\"\n )\n return self.multiscales[0]\n\n @property\n def datasets(self) -> list[Dataset]:\n\"\"\"\n The `datasets` attribute of `self.multiscale`.\n \"\"\"\n return self.multiscale.datasets\n\n @property\n def num_levels(self) -> int:\n return len(self.datasets)\n\n @property\n def axes_names(self) -> list[str]:\n\"\"\"\n List of axes names.\n \"\"\"\n return [ax.name for ax in self.multiscale.axes]\n\n @property\n def pixel_sizes_zyx(self) -> list[list[float]]:\n\"\"\"\n Pixel sizes extracted from scale transformations of datasets.\n\n Raises:\n ValueError:\n If pixel sizes are below a given threshold (1e-9).\n \"\"\"\n x_index = self.axes_names.index(\"x\")\n y_index = self.axes_names.index(\"y\")\n try:\n z_index = self.axes_names.index(\"z\")\n except ValueError:\n z_index = None\n logging.warning(\n f\"Z axis is not present (axes: {self.axes_names}), and Z pixel\"\n \" size is set to 1. This may work, by accident, but it is \"\n \"not fully supported.\"\n )\n _pixel_sizes_zyx = []\n for level in range(self.num_levels):\n scale = self.datasets[level].scale_transformation.scale\n pixel_size_x = scale[x_index]\n pixel_size_y = scale[y_index]\n if z_index is not None:\n pixel_size_z = scale[z_index]\n else:\n pixel_size_z = 1.0\n _pixel_sizes_zyx.append([pixel_size_z, pixel_size_y, pixel_size_x])\n if min(_pixel_sizes_zyx[-1]) < 1e-9:\n raise ValueError(\n f\"Pixel sizes at level {level} are too small: \"\n f\"{_pixel_sizes_zyx[-1]}\"\n )\n\n return _pixel_sizes_zyx\n\n def get_pixel_sizes_zyx(self, *, level: int = 0) -> list[float]:\n return self.pixel_sizes_zyx[level]\n\n @property\n def coarsening_xy(self) -> int:\n\"\"\"\n Linear coarsening factor in the YX plane.\n\n We only support coarsening factors that are homogeneous (both in the\n X/Y directions and across pyramid levels).\n\n Raises:\n NotImplementedError:\n If coarsening ratios are not homogeneous.\n \"\"\"\n current_ratio = None\n for ind in range(1, self.num_levels):\n ratio_x = round(\n self.pixel_sizes_zyx[ind][2] / self.pixel_sizes_zyx[ind - 1][2]\n )\n ratio_y = round(\n self.pixel_sizes_zyx[ind][1] / self.pixel_sizes_zyx[ind - 1][1]\n )\n if ratio_x != ratio_y:\n raise NotImplementedError(\n \"Inhomogeneous coarsening in X/Y directions \"\n \"is not supported.\\n\"\n f\"ZYX pixel sizes:\\n {self.pixel_sizes_zyx}\"\n )\n if current_ratio is None:\n current_ratio = ratio_x\n else:\n if current_ratio != ratio_x:\n raise NotImplementedError(\n \"Inhomogeneous coarsening across levels \"\n \"is not supported.\\n\"\n f\"ZYX pixel sizes:\\n {self.pixel_sizes_zyx}\"\n )\n\n return current_ratio\n
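As a rough illustration of the coarsening_xy checks in the listing above, the following standalone sketch recomputes the coarsening factor from a plain list of per-level ZYX pixel sizes. The function name coarsening_xy_from_pixel_sizes and the sample pixel sizes are hypothetical and are not part of fractal-tasks-core; only the ratio and homogeneity checks follow the property shown above.

```python
from typing import Optional


def coarsening_xy_from_pixel_sizes(
    pixel_sizes_zyx: list[list[float]],
) -> Optional[int]:
    """Hypothetical helper mirroring the checks of `coarsening_xy`."""
    current_ratio = None
    for ind in range(1, len(pixel_sizes_zyx)):
        # Ratio between consecutive pyramid levels, along X and along Y
        ratio_x = round(pixel_sizes_zyx[ind][2] / pixel_sizes_zyx[ind - 1][2])
        ratio_y = round(pixel_sizes_zyx[ind][1] / pixel_sizes_zyx[ind - 1][1])
        if ratio_x != ratio_y:
            raise NotImplementedError("Inhomogeneous coarsening in X/Y")
        if current_ratio is None:
            current_ratio = ratio_x
        elif current_ratio != ratio_x:
            raise NotImplementedError("Inhomogeneous coarsening across levels")
    return current_ratio


# Made-up ZYX pixel sizes (micrometers) for a three-level pyramid
levels = [[1.0, 0.1625, 0.1625], [1.0, 0.325, 0.325], [1.0, 0.65, 0.65]]
factor = coarsening_xy_from_pixel_sizes(levels)
```

With these sample values the XY pixel size doubles at every level, so the sketch yields a factor of 2. For a single-level pyramid the loop never runs and the sketch returns None.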
class NgffWellMeta(BaseModel):\n\"\"\"\n Model for the metadata of a NGFF well.\n\n See https://ngff.openmicroscopy.org/0.4/#well-md.\n \"\"\"\n\n well: Optional[Well] = None\n\n def get_acquisition_paths(self) -> dict[int, str]:\n\"\"\"\n Create mapping from acquisition indices to corresponding paths.\n\n Runs on the well zarr attributes and loads the relative paths in the\n well.\n\n Returns:\n Dictionary with `(acquisition index: image path)` key/value pairs.\n\n Raises:\n ValueError:\n If an element of `self.well.images` has no `acquisition`\n attribute.\n NotImplementedError:\n If acquisitions are not unique.\n \"\"\"\n acquisition_dict = {}\n for image in self.well.images:\n if image.acquisition is None:\n raise ValueError(\n \"Cannot get acquisition paths for Zarr files without \"\n \"'acquisition' metadata at the well level\"\n )\n if image.acquisition in acquisition_dict:\n raise NotImplementedError(\n \"The `NgffWellMeta.get_acquisition_paths` method (in \"\n \"fractal-tasks-core) does not support wells with \"\n \"multiple images of the same acquisition.\"\n )\n acquisition_dict[image.acquisition] = image.path\n return acquisition_dict\n
Create mapping from acquisition indices to corresponding paths.
Runs on the well zarr attributes and loads the relative paths in the well.
RETURNS DESCRIPTION dict[int, str]
Dictionary with (acquisition index: image path) key/value pairs.
RAISES DESCRIPTION ValueError
If an element of self.well.images has no acquisition attribute.
NotImplementedError
If acquisitions are not unique.
Source code in fractal_tasks_core/ngff/specs.py
def get_acquisition_paths(self) -> dict[int, str]:\n\"\"\"\n Create mapping from acquisition indices to corresponding paths.\n\n Runs on the well zarr attributes and loads the relative paths in the\n well.\n\n Returns:\n Dictionary with `(acquisition index: image path)` key/value pairs.\n\n Raises:\n ValueError:\n If an element of `self.well.images` has no `acquisition`\n attribute.\n NotImplementedError:\n If acquisitions are not unique.\n \"\"\"\n acquisition_dict = {}\n for image in self.well.images:\n if image.acquisition is None:\n raise ValueError(\n \"Cannot get acquisition paths for Zarr files without \"\n \"'acquisition' metadata at the well level\"\n )\n if image.acquisition in acquisition_dict:\n raise NotImplementedError(\n \"The `NgffWellMeta.get_acquisition_paths` method (in \"\n \"fractal-tasks-core) does not support wells with \"\n \"multiple images of the same acquisition.\"\n )\n acquisition_dict[image.acquisition] = image.path\n return acquisition_dict\n
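To make the two failure modes concrete, here is a minimal sketch of the same mapping logic applied to plain dictionaries rather than to the Pydantic models. The function acquisition_paths and the sample well content are hypothetical stand-ins, not library code.

```python
def acquisition_paths(images: list[dict]) -> dict[int, str]:
    """Hypothetical stand-in for the mapping logic, on plain dicts."""
    mapping: dict[int, str] = {}
    for image in images:
        acquisition = image.get("acquisition")
        if acquisition is None:
            # Corresponds to the ValueError case described above
            raise ValueError("Image without 'acquisition' metadata")
        if acquisition in mapping:
            # Corresponds to the NotImplementedError case described above
            raise NotImplementedError("Acquisitions are not unique")
        mapping[acquisition] = image["path"]
    return mapping


# Made-up well content: two images, one per acquisition
well_images = [
    {"path": "0", "acquisition": 0},
    {"path": "1", "acquisition": 1},
]
paths = acquisition_paths(well_images)
```

For the sample input, paths maps acquisition 0 to path "0" and acquisition 1 to path "1".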
See https://ngff.openmicroscopy.org/0.4/#omero-md.
Source code in fractal_tasks_core/ngff/specs.py
class Omero(BaseModel):\n\"\"\"\n Model for `NgffImageMeta.omero`.\n\n See https://ngff.openmicroscopy.org/0.4/#omero-md.\n \"\"\"\n\n channels: list[Channel]\n
This corresponds to scale-type elements of Dataset.coordinateTransformations or Multiscale.coordinateTransformations. See https://ngff.openmicroscopy.org/0.4/#trafo-md
Source code in fractal_tasks_core/ngff/specs.py
class ScaleCoordinateTransformation(BaseModel):\n\"\"\"\n Model for a scale transformation.\n\n This corresponds to scale-type elements of\n `Dataset.coordinateTransformations` or\n `Multiscale.coordinateTransformations`.\n See https://ngff.openmicroscopy.org/0.4/#trafo-md\n \"\"\"\n\n type: Literal[\"scale\"]\n scale: list[float] = Field(..., min_items=2)\n
This corresponds to translation-type elements of Dataset.coordinateTransformations or Multiscale.coordinateTransformations. See https://ngff.openmicroscopy.org/0.4/#trafo-md
Source code in fractal_tasks_core/ngff/specs.py
class TranslationCoordinateTransformation(BaseModel):\n\"\"\"\n Model for a translation transformation.\n\n This corresponds to translation-type elements of\n `Dataset.coordinateTransformations` or\n `Multiscale.coordinateTransformations`.\n See https://ngff.openmicroscopy.org/0.4/#trafo-md\n \"\"\"\n\n type: Literal[\"translation\"]\n translation: list[float] = Field(..., min_items=2)\n
class Well(BaseModel):\n\"\"\"\n Model for `NgffWellMeta.well`.\n\n See https://ngff.openmicroscopy.org/0.4/#well-md.\n \"\"\"\n\n images: list[ImageInWell] = Field(\n ...,\n description=\"The images included in this well\",\n min_items=1,\n unique_items=True,\n )\n version: Optional[str] = Field(\n None, description=\"The version of the specification\"\n )\n
Note that we deviate from the NGFF specs by making start and end optional. See https://ngff.openmicroscopy.org/0.4/#omero-md.
Source code in fractal_tasks_core/ngff/specs.py
class Window(BaseModel):\n\"\"\"\n Model for `Channel.window`.\n\n Note that we deviate by NGFF specs by making `start` and `end` optional.\n See https://ngff.openmicroscopy.org/0.4/#omero-md.\n \"\"\"\n\n max: float\n min: float\n start: Optional[float] = None\n end: Optional[float] = None\n
This is used to provide a user-friendly error message.
Source code in fractal_tasks_core/ngff/zarr_utils.py
class ZarrGroupNotFoundError(ValueError):\n\"\"\"\n Wrap zarr.errors.GroupNotFoundError\n\n This is used to provide a user-friendly error message.\n \"\"\"\n\n pass\n
Given a Zarr group, find whether it is an OME-NGFF plate, well or image.
PARAMETER DESCRIPTION group
Zarr group
TYPE: Group
RETURNS DESCRIPTION str
The detected OME-NGFF type (plate, well or image).
Source code in fractal_tasks_core/ngff/zarr_utils.py
def detect_ome_ngff_type(group: zarr.hierarchy.Group) -> str:\n\"\"\"\n Given a Zarr group, find whether it is an OME-NGFF plate, well or image.\n\n Args:\n group: Zarr group\n\n Returns:\n The detected OME-NGFF type (`plate`, `well` or `image`).\n \"\"\"\n attrs = group.attrs.asdict()\n if \"plate\" in attrs.keys():\n ngff_type = \"plate\"\n elif \"well\" in attrs.keys():\n ngff_type = \"well\"\n elif \"multiscales\" in attrs.keys():\n ngff_type = \"image\"\n else:\n error_msg = (\n \"Zarr group at cannot be identified as one \"\n \"of OME-NGFF plate/well/image groups.\"\n )\n logger.error(error_msg)\n raise ValueError(error_msg)\n logger.info(f\"Zarr group identified as OME-NGFF {ngff_type}.\")\n return ngff_type\n
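The detection only inspects the top-level attribute keys, checked in the order plate, well, multiscales. A minimal sketch of that precedence on a plain dictionary (the helper name detect_type_from_attrs is hypothetical, and no Zarr group is opened here):

```python
def detect_type_from_attrs(attrs: dict) -> str:
    """Hypothetical helper: same key precedence, applied to a plain dict."""
    # Keys are checked in this order; the first match wins
    for key, ngff_type in (
        ("plate", "plate"),
        ("well", "well"),
        ("multiscales", "image"),
    ):
        if key in attrs:
            return ngff_type
    raise ValueError("Attributes do not match an OME-NGFF plate/well/image.")


image_type = detect_type_from_attrs({"multiscales": [], "omero": {}})
plate_type = detect_type_from_attrs({"plate": {}, "multiscales": []})
```

In the second call both keys are present, and the plate key takes precedence because it is checked first.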
Load the attributes of a zarr group and cast them to NgffImageMeta.
PARAMETER DESCRIPTION zarr_path
Path to the zarr group.
TYPE: str
RETURNS DESCRIPTION NgffImageMeta
A new NgffImageMeta object.
Source code in fractal_tasks_core/ngff/zarr_utils.py
def load_NgffImageMeta(zarr_path: str) -> NgffImageMeta:\n\"\"\"\n Load the attributes of a zarr group and cast them to `NgffImageMeta`.\n\n Args:\n zarr_path: Path to the zarr group.\n\n Returns:\n A new `NgffImageMeta` object.\n \"\"\"\n try:\n zarr_group = zarr.open_group(zarr_path, mode=\"r\")\n except GroupNotFoundError:\n error_msg = (\n \"Could not load attributes for the requested image, \"\n f\"because no Zarr image was found at {zarr_path}\"\n )\n logging.error(error_msg)\n raise ZarrGroupNotFoundError(error_msg)\n zarr_attrs = zarr_group.attrs.asdict()\n try:\n return NgffImageMeta(**zarr_attrs)\n except Exception as e:\n logging.error(\n f\"Contents of {zarr_path} cannot be cast to NgffImageMeta.\\n\"\n f\"Original error:\\n{str(e)}\"\n )\n raise e\n
Load the attributes of a zarr group and cast them to NgffWellMeta.
PARAMETER DESCRIPTION zarr_path
Path to the zarr group.
TYPE: str
RETURNS DESCRIPTION NgffWellMeta
A new NgffWellMeta object.
Source code in fractal_tasks_core/ngff/zarr_utils.py
def load_NgffWellMeta(zarr_path: str) -> NgffWellMeta:\n\"\"\"\n Load the attributes of a zarr group and cast them to `NgffWellMeta`.\n\n Args:\n zarr_path: Path to the zarr group.\n\n Returns:\n A new `NgffWellMeta` object.\n \"\"\"\n try:\n zarr_group = zarr.open_group(zarr_path, mode=\"r\")\n except GroupNotFoundError:\n error_msg = (\n \"Could not load attributes for the requested well, \"\n f\"because no Zarr image was found at {zarr_path}\"\n )\n logging.error(error_msg)\n raise ZarrGroupNotFoundError(error_msg)\n zarr_attrs = zarr_group.attrs.asdict()\n try:\n return NgffWellMeta(**zarr_attrs)\n except Exception as e:\n logging.error(\n f\"Contents of {zarr_path} cannot be cast to NgffWellMeta.\\n\"\n f\"Original error:\\n{str(e)}\"\n )\n raise e\n
Given two integer intervals, find whether they overlap.
This is the same as is_overlapping_1D (based on https://stackoverflow.com/a/70023212/19085332), for integer-valued intervals.
PARAMETER DESCRIPTION line1
The boundaries of the first interval, written as [x_min, x_max].
TYPE: Sequence[int]
line2
The boundaries of the second interval, written as [x_min, x_max].
TYPE: Sequence[int]
Source code in fractal_tasks_core/roi/_overlaps_common.py
def _is_overlapping_1D_int(\n line1: Sequence[int],\n line2: Sequence[int],\n) -> bool:\n\"\"\"\n Given two integer intervals, find whether they overlap\n\n This is the same as `is_overlapping_1D` (based on\n https://stackoverflow.com/a/70023212/19085332), for integer-valued\n intervals.\n\n Args:\n line1: The boundaries of the first interval , written as\n `[x_min, x_max]`.\n line2: The boundaries of the second interval , written as\n `[x_min, x_max]`.\n \"\"\"\n return line1[0] < line2[1] and line2[0] < line1[1]\n
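Because both comparisons are strict, two intervals that merely touch (for example [0, 5] and [5, 10]) are reported as non-overlapping. One way to read this, offered here as an interpretation rather than a statement from the library, is that the intervals behave like half-open array ranges x_min:x_max. The sketch below restates the expression in a standalone function (overlaps_int is a hypothetical name) and compares it with an explicit set intersection of Python ranges, for non-empty intervals only.

```python
from typing import Sequence


def overlaps_int(line1: Sequence[int], line2: Sequence[int]) -> bool:
    """Restatement of the strict-inequality test shown above."""
    return line1[0] < line2[1] and line2[0] < line1[1]


def overlaps_by_enumeration(line1: Sequence[int], line2: Sequence[int]) -> bool:
    """Brute-force check: do the half-open ranges share any integer?"""
    return bool(set(range(*line1)) & set(range(*line2)))


touching = overlaps_int([0, 5], [5, 10])     # intervals share only a boundary
intersecting = overlaps_int([0, 5], [4, 10])  # index 4 belongs to both ranges
```

For non-empty intervals the two functions agree, which is why touching intervals count as disjoint.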
Given two three-dimensional integer boxes, find whether they overlap.
This is the same as is_overlapping_3D (based on https://stackoverflow.com/a/70023212/19085332), for integer-valued boxes.
PARAMETER DESCRIPTION box1
The boundaries of the first box, written as [x_min, y_min, z_min, x_max, y_max, z_max].
TYPE: list[int]
box2
The boundaries of the second box, written as [x_min, y_min, z_min, x_max, y_max, z_max].
TYPE: list[int]
Source code in fractal_tasks_core/roi/_overlaps_common.py
def _is_overlapping_3D_int(box1: list[int], box2: list[int]) -> bool:\n\"\"\"\n Given two three-dimensional integer boxes, find whether they overlap.\n\n This is the same as is_overlapping_3D (based on\n https://stackoverflow.com/a/70023212/19085332), for integer-valued\n boxes.\n\n Args:\n box1: The boundaries of the first box, written as\n `[x_min, y_min, z_min, x_max, y_max, z_max]`.\n box2: The boundaries of the second box, written as\n `[x_min, y_min, z_min, x_max, y_max, z_max]`.\n \"\"\"\n overlap_x = _is_overlapping_1D_int([box1[0], box1[3]], [box2[0], box2[3]])\n overlap_y = _is_overlapping_1D_int([box1[1], box1[4]], [box2[1], box2[4]])\n overlap_z = _is_overlapping_1D_int([box1[2], box1[5]], [box2[2], box2[5]])\n return overlap_x and overlap_y and overlap_z\n
This is based on https://stackoverflow.com/a/70023212/19085332, and we additionally use a finite tolerance for floating-point comparisons.
PARAMETER DESCRIPTION line1
The boundaries of the first interval, written as [x_min, x_max].
TYPE: Sequence[float]
line2
The boundaries of the second interval, written as [x_min, x_max].
TYPE: Sequence[float]
tol
Finite tolerance for floating-point comparisons.
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/_overlaps_common.py
def is_overlapping_1D(\n line1: Sequence[float], line2: Sequence[float], tol: float = 1e-10\n) -> bool:\n\"\"\"\n Given two intervals, finds whether they overlap.\n\n This is based on https://stackoverflow.com/a/70023212/19085332, and we\n additionally use a finite tolerance for floating-point comparisons.\n\n Args:\n line1: The boundaries of the first interval, written as\n `[x_min, x_max]`.\n line2: The boundaries of the second interval, written as\n `[x_min, x_max]`.\n tol: Finite tolerance for floating-point comparisons.\n \"\"\"\n return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol\n
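The effect of tol is easiest to see on intervals that share a boundary or overlap by less than the tolerance. The sketch below restates the expression above in a standalone function (overlaps_1d is a hypothetical name) and evaluates it on made-up values.

```python
from typing import Sequence


def overlaps_1d(
    line1: Sequence[float], line2: Sequence[float], tol: float = 1e-10
) -> bool:
    """Restatement of the tolerance-based test shown above."""
    return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol


clear_overlap = overlaps_1d([0.0, 1.0], [0.5, 2.0])           # overlap of 0.5
shared_boundary = overlaps_1d([0.0, 1.0], [1.0, 2.0])         # overlap of 0
tiny_overlap = overlaps_1d([0.0, 1.0], [1.0 - 1e-12, 2.0])    # overlap < tol
```

Only the first pair is reported as overlapping: an overlap must exceed roughly tol to count, so rounding noise at shared boundaries does not produce spurious overlaps.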
Given two rectangular boxes, finds whether they overlap.
This is based on https://stackoverflow.com/a/70023212/19085332, and we additionally use a finite tolerance for floating-point comparisons.
PARAMETER DESCRIPTION box1
The boundaries of the first rectangle, written as [x_min, y_min, x_max, y_max].
TYPE: Sequence[float]
box2
The boundaries of the second rectangle, written as [x_min, y_min, x_max, y_max].
TYPE: Sequence[float]
tol
Finite tolerance for floating-point comparisons.
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/_overlaps_common.py
def is_overlapping_2D(\n box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10\n) -> bool:\n\"\"\"\n Given two rectangular boxes, finds whether they overlap.\n\n This is based on https://stackoverflow.com/a/70023212/19085332, and we\n additionally use a finite tolerance for floating-point comparisons.\n\n Args:\n box1: The boundaries of the first rectangle, written as\n `[x_min, y_min, x_max, y_max]`.\n box2: The boundaries of the second rectangle, written as\n `[x_min, y_min, x_max, y_max]`.\n tol: Finite tolerance for floating-point comparisons.\n \"\"\"\n overlap_x = is_overlapping_1D(\n [box1[0], box1[2]], [box2[0], box2[2]], tol=tol\n )\n overlap_y = is_overlapping_1D(\n [box1[1], box1[3]], [box2[1], box2[3]], tol=tol\n )\n return overlap_x and overlap_y\n
Given two three-dimensional boxes, finds whether they overlap.
This is based on https://stackoverflow.com/a/70023212/19085332, and we additionally use a finite tolerance for floating-point comparisons.
PARAMETER DESCRIPTION box1
The boundaries of the first box, written as [x_min, y_min, z_min, x_max, y_max, z_max].
TYPE: Sequence[float]
box2
The boundaries of the second box, written as [x_min, y_min, z_min, x_max, y_max, z_max].
TYPE: Sequence[float]
tol
Finite tolerance for floating-point comparisons.
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/_overlaps_common.py
def is_overlapping_3D(\n box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10\n) -> bool:\n\"\"\"\n Given two three-dimensional boxes, finds whether they overlap.\n\n This is based on https://stackoverflow.com/a/70023212/19085332, and we\n additionally use a finite tolerance for floating-point comparisons.\n\n Args:\n box1: The boundaries of the first box, written as\n `[x_min, y_min, z_min, x_max, y_max, z_max]`.\n box2: The boundaries of the second box, written as\n `[x_min, y_min, z_min, x_max, y_max, z_max]`.\n tol: Finite tolerance for floating-point comparisons.\n \"\"\"\n\n overlap_x = is_overlapping_1D(\n [box1[0], box1[3]], [box2[0], box2[3]], tol=tol\n )\n overlap_y = is_overlapping_1D(\n [box1[1], box1[4]], [box2[1], box2[4]], tol=tol\n )\n overlap_z = is_overlapping_1D(\n [box1[2], box1[5]], [box2[2], box2[5]], tol=tol\n )\n return overlap_x and overlap_y and overlap_z\n
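The 2D and 3D variants follow the same pattern: split each box into per-axis intervals and require an overlap along every axis. A generic sketch of that composition, assuming the min-first layout described above ([mins..., maxes...]), is given below; boxes_overlap and overlaps_1d are hypothetical names, not library functions.

```python
from typing import Sequence


def overlaps_1d(
    line1: Sequence[float], line2: Sequence[float], tol: float = 1e-10
) -> bool:
    """Same tolerance-based 1D test as in the listings above."""
    return line1[0] <= line2[1] - tol and line2[0] <= line1[1] - tol


def boxes_overlap(
    box1: Sequence[float], box2: Sequence[float], tol: float = 1e-10
) -> bool:
    """Axis-wise overlap for boxes laid out as [mins..., maxes...]."""
    ndim = len(box1) // 2
    return all(
        overlaps_1d(
            [box1[axis], box1[axis + ndim]],
            [box2[axis], box2[axis + ndim]],
            tol,
        )
        for axis in range(ndim)
    )


box_a = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
box_b = [5.0, 5.0, 5.0, 15.0, 15.0, 15.0]    # overlaps box_a along X, Y and Z
box_c = [5.0, 5.0, 10.0, 15.0, 15.0, 20.0]   # only touches box_a along Z
overlap_ab = boxes_overlap(box_a, box_b)
overlap_ac = boxes_overlap(box_a, box_c)
```

A single non-overlapping axis is enough to make the boxes disjoint, as for box_a and box_c.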
Handles both 2D and 3D dask arrays as input. The loaded region keeps the dimensionality of the input, unless return_as_3D is set, in which case a 2D region is returned as a 3D array.
PARAMETER DESCRIPTION data_zyx
Dask array (2D or 3D).
TYPE: Array
region
Region to load, tuple of three slices (ZYX).
TYPE: tuple[slice, slice, slice]
compute
Whether to compute the result. If True, returns a numpy array. If False, returns a dask array.
TYPE: bool DEFAULT: True
return_as_3D
Whether to return a 3D array, even if the input is 2D.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION Union[Array, ndarray]
The loaded region: a 3D array if the input is 3D or if return_as_3D is True, otherwise a 2D array.
Source code in fractal_tasks_core/roi/load_region.py
def load_region(\n data_zyx: da.Array,\n region: tuple[slice, slice, slice],\n compute: bool = True,\n return_as_3D: bool = False,\n) -> Union[da.Array, np.ndarray]:\n\"\"\"\n Load a region from a dask array.\n\n Can handle both 2D and 3D dask arrays as input and return them as is or\n always as a 3D array.\n\n Args:\n data_zyx: Dask array (2D or 3D).\n region: Region to load, tuple of three slices (ZYX).\n compute: Whether to compute the result. If `True`, returns a numpy\n array. If `False`, returns a dask array.\n return_as_3D: Whether to return a 3D array, even if the input is 2D.\n\n Returns:\n 3D array.\n \"\"\"\n\n if len(region) != 3:\n raise ValueError(\n f\"In `load_region`, `region` must have three elements \"\n f\"(given: {len(region)}).\"\n )\n\n if len(data_zyx.shape) == 3:\n img = data_zyx[region]\n elif len(data_zyx.shape) == 2:\n img = data_zyx[(region[1], region[2])]\n if return_as_3D:\n img = np.expand_dims(img, axis=0)\n else:\n raise ValueError(\n f\"Shape {data_zyx.shape} not supported for `load_region`\"\n )\n if compute:\n return img.compute()\n else:\n return img\n
Construct bounding-box ROI table for a mask array.
PARAMETER DESCRIPTION mask_array
Original array to construct bounding boxes.
TYPE: ndarray
pxl_sizes_zyx
Physical-unit pixel ZYX sizes.
TYPE: list[float]
origin_zyx
Shift ROI origin by this amount of ZYX pixels.
TYPE: tuple[int, int, int] DEFAULT: (0, 0, 0)
RETURNS DESCRIPTION DataFrame
DataFrame with each line representing the bounding-box ROI that corresponds to a unique value of mask_array. ROI properties are expressed in physical units (with columns defined as elsewhere in this module; see e.g. prepare_well_ROI_table), and positions are optionally shifted (if origin_zyx is set). An additional column label keeps track of the mask_array value corresponding to each ROI.
Source code in fractal_tasks_core/roi/v1.py
def array_to_bounding_box_table(\n mask_array: np.ndarray,\n pxl_sizes_zyx: list[float],\n origin_zyx: tuple[int, int, int] = (0, 0, 0),\n) -> pd.DataFrame:\n\"\"\"\n Construct bounding-box ROI table for a mask array.\n\n Args:\n mask_array: Original array to construct bounding boxes.\n pxl_sizes_zyx: Physical-unit pixel ZYX sizes.\n origin_zyx: Shift ROI origin by this amount of ZYX pixels.\n\n Returns:\n DataFrame with each line representing the bounding-box ROI that\n corresponds to a unique value of `mask_array`. ROI properties are\n expressed in physical units (with columns defined as elsewhere this\n module - see e.g. `prepare_well_ROI_table`), and positions are\n optionally shifted (if `origin_zyx` is set). An additional column\n `label` keeps track of the `mask_array` value corresponding to each\n ROI.\n \"\"\"\n\n pxl_sizes_zyx_array = np.array(pxl_sizes_zyx)\n z_origin, y_origin, x_origin = origin_zyx[:]\n\n labels = np.unique(mask_array)\n labels = labels[labels > 0]\n elem_list = []\n for label in labels:\n # Compute bounding box\n label_match = np.where(mask_array == label)\n zmin, ymin, xmin = np.min(label_match, axis=1) * pxl_sizes_zyx_array\n zmax, ymax, xmax = (\n np.max(label_match, axis=1) + 1\n ) * pxl_sizes_zyx_array\n\n # Compute bounding-box edges\n length_x = xmax - xmin\n length_y = ymax - ymin\n length_z = zmax - zmin\n\n # Shift origin\n zmin += z_origin * pxl_sizes_zyx[0]\n ymin += y_origin * pxl_sizes_zyx[1]\n xmin += x_origin * pxl_sizes_zyx[2]\n\n elem_list.append((xmin, ymin, zmin, length_x, length_y, length_z))\n\n df_columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n\n if len(elem_list) == 0:\n df = pd.DataFrame(columns=[x for x in df_columns] + [\"label\"])\n else:\n df = pd.DataFrame(np.array(elem_list), columns=df_columns)\n df[\"label\"] = labels\n\n return df\n
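The bounding-box arithmetic can be followed on a tiny mask without NumPy or pandas. The sketch below is a simplified pure-Python illustration, not the library implementation: the name bounding_boxes_zyx is hypothetical, the mask is a nested list, results are kept in ZYX order (the function above emits XYZ columns), and the origin shift is omitted.

```python
def bounding_boxes_zyx(mask_zyx, pxl_sizes_zyx):
    """Hypothetical helper: {label: (start_zyx, length_zyx)} in physical units."""
    extents = {}
    for z, plane in enumerate(mask_zyx):
        for y, row in enumerate(plane):
            for x, label in enumerate(row):
                if label <= 0:
                    continue  # background is skipped, as in `labels > 0`
                lo, hi = extents.setdefault(label, ([z, y, x], [z, y, x]))
                for axis, value in enumerate((z, y, x)):
                    lo[axis] = min(lo[axis], value)
                    hi[axis] = max(hi[axis], value)
    boxes = {}
    for label, (lo, hi) in sorted(extents.items()):
        start = [lo[a] * pxl_sizes_zyx[a] for a in range(3)]
        # The upper edge is (max index + 1) * pixel size, as in the listing
        length = [
            (hi[a] + 1) * pxl_sizes_zyx[a] - start[a] for a in range(3)
        ]
        boxes[label] = (start, length)
    return boxes


# Made-up mask with a single Z plane, 3 rows and 4 columns
mask = [[[0, 1, 1, 0], [0, 1, 0, 0], [2, 0, 0, 0]]]
boxes = bounding_boxes_zyx(mask, pxl_sizes_zyx=[1.0, 0.5, 0.5])
```

Label 1 spans rows 0-1 and columns 1-2, so its box starts at (z, y, x) = (0.0, 0.0, 0.5) with edge lengths (1.0, 1.0, 1.0); label 2 is a single pixel with lengths (1.0, 0.5, 0.5).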
Nested list of indices. The main list has one item per ROI. Each ROI item is a list of six integers as in [start_z, end_z, start_y, end_y, start_x, end_x]. The array-index interval for a given ROI is start_x:end_x along X, and so on for Y and Z.
Source code in fractal_tasks_core/roi/v1.py
def convert_ROI_table_to_indices(\n ROI: ad.AnnData,\n full_res_pxl_sizes_zyx: Sequence[float],\n level: int = 0,\n coarsening_xy: int = 2,\n cols_xyz_pos: Sequence[str] = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n ],\n cols_xyz_len: Sequence[str] = [\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ],\n) -> list[list[int]]:\n\"\"\"\n Convert a ROI AnnData table into integer array indices.\n\n Args:\n ROI: AnnData table with list of ROIs.\n full_res_pxl_sizes_zyx:\n Physical-unit pixel ZYX sizes at the full-resolution pyramid level.\n level: Pyramid level.\n coarsening_xy: Linear coarsening factor in the YX plane.\n cols_xyz_pos: Column names for XYZ ROI positions.\n cols_xyz_len: Column names for XYZ ROI edges.\n\n Raises:\n ValueError:\n If any of the array indices is negative.\n\n Returns:\n Nested list of indices. The main list has one item per ROI. Each ROI\n item is a list of six integers as in `[start_z, end_z, start_y,\n end_y, start_x, end_x]`. The array-index interval for a given ROI\n is `start_x:end_x` along X, and so on for Y and Z.\n \"\"\"\n # Handle empty ROI table\n if len(ROI) == 0:\n return []\n\n # Set pyramid-level pixel sizes\n pxl_size_z, pxl_size_y, pxl_size_x = full_res_pxl_sizes_zyx\n prefactor = coarsening_xy**level\n pxl_size_x *= prefactor\n pxl_size_y *= prefactor\n\n x_pos, y_pos, z_pos = cols_xyz_pos[:]\n x_len, y_len, z_len = cols_xyz_len[:]\n\n list_indices = []\n for ROI_name in ROI.obs_names:\n # Extract data from anndata table\n x_micrometer = ROI[ROI_name, x_pos].X[0, 0]\n y_micrometer = ROI[ROI_name, y_pos].X[0, 0]\n z_micrometer = ROI[ROI_name, z_pos].X[0, 0]\n len_x_micrometer = ROI[ROI_name, x_len].X[0, 0]\n len_y_micrometer = ROI[ROI_name, y_len].X[0, 0]\n len_z_micrometer = ROI[ROI_name, z_len].X[0, 0]\n\n # Identify indices along the three dimensions\n start_x = x_micrometer / pxl_size_x\n end_x = (x_micrometer + len_x_micrometer) / pxl_size_x\n start_y = y_micrometer / pxl_size_y\n end_y = (y_micrometer + len_y_micrometer) / pxl_size_y\n start_z = z_micrometer / pxl_size_z\n end_z = (z_micrometer + len_z_micrometer) / pxl_size_z\n indices = [start_z, end_z, start_y, end_y, start_x, end_x]\n\n # Round indices to lower integer\n indices = list(map(round, indices))\n\n # Fail for negative indices\n if min(indices) < 0:\n raise ValueError(\n f\"ROI {ROI_name} converted into negative array indices.\\n\"\n f\"ZYX position: {z_micrometer}, {y_micrometer}, \"\n f\"{x_micrometer}\\n\"\n f\"ZYX pixel sizes: {pxl_size_z}, {pxl_size_y}, \"\n f\"{pxl_size_x} ({level=})\\n\"\n \"Hint: As of fractal-tasks-core v0.12, FOV/well ROI \"\n \"tables with non-zero origins (e.g. the ones created with \"\n \"v0.11) are not supported.\"\n )\n\n # Append ROI indices to to list\n list_indices.append(indices[:])\n\n return list_indices\n
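The conversion from physical units to array indices can be checked by hand for a single ROI. The sketch below repeats the arithmetic of the listing above on plain tuples instead of an AnnData table; the function name roi_to_indices and the sample ROI are hypothetical. Note that Python's built-in round returns the nearest integer, which is what the listing applies to each index.

```python
def roi_to_indices(pos_xyz, len_xyz, full_res_pxl_sizes_zyx, level=0, coarsening_xy=2):
    """Hypothetical helper: one ROI (physical units) -> six array indices."""
    pxl_size_z, pxl_size_y, pxl_size_x = full_res_pxl_sizes_zyx
    # Only the XY pixel sizes grow with the pyramid level
    prefactor = coarsening_xy**level
    pxl_size_x *= prefactor
    pxl_size_y *= prefactor

    x, y, z = pos_xyz
    len_x, len_y, len_z = len_xyz
    indices = [
        z / pxl_size_z,
        (z + len_z) / pxl_size_z,
        y / pxl_size_y,
        (y + len_y) / pxl_size_y,
        x / pxl_size_x,
        (x + len_x) / pxl_size_x,
    ]
    indices = [round(value) for value in indices]
    if min(indices) < 0:
        raise ValueError("ROI converted into negative array indices.")
    return indices


# Made-up ROI: second field of view in a row, 2560 x 2160 pixels of 0.1625 um
pixel_sizes_zyx = (1.0, 0.1625, 0.1625)
level_0 = roi_to_indices((416.0, 0.0, 0.0), (416.0, 351.0, 2.0), pixel_sizes_zyx)
level_1 = roi_to_indices(
    (416.0, 0.0, 0.0), (416.0, 351.0, 2.0), pixel_sizes_zyx, level=1
)
```

At level 0 the ROI maps to [0, 2, 0, 2160, 2560, 5120]; at level 1 the Y and X indices are halved while the Z indices are unchanged, giving [0, 2, 0, 1080, 1280, 2560].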
Note that this function is only relevant when the ROIs in adata span the whole extent of the Z axis. TODO: check this explicitly.
PARAMETER DESCRIPTION adata
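AnnData table of 3D ROIs.
TYPE: AnnData
pixel_size_z
Z pixel size in physical units; it is assigned as the Z edge length (len_z_micrometer) of every ROI.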
TYPE: float
Source code in fractal_tasks_core/roi/v1.py
def convert_ROIs_from_3D_to_2D(\n adata: ad.AnnData,\n pixel_size_z: float,\n) -> ad.AnnData:\n\"\"\"\n TBD\n\n Note that this function is only relevant when the ROIs in adata span the\n whole extent of the Z axis.\n TODO: check this explicitly.\n\n Args:\n adata: TBD\n pixel_size_z: TBD\n \"\"\"\n\n # Compress a 3D stack of images to a single Z plane,\n # with thickness equal to pixel_size_z\n df = adata.to_df()\n df[\"len_z_micrometer\"] = pixel_size_z\n\n # Assign dtype explicitly, to avoid\n # >> UserWarning: X converted to numpy array with dtype float64\n # when creating AnnData object\n df = df.astype(np.float32)\n\n # Create an AnnData object directly from the DataFrame\n new_adata = ad.AnnData(X=df)\n\n # Rename rows and columns\n new_adata.obs_names = adata.obs_names\n new_adata.var_names = list(map(str, df.columns))\n\n return new_adata\n
Construct an empty bounding-box ROI table of given shape.
This function mirrors the functionality of array_to_bounding_box_table, for the specific case where the array includes no label. The advantages of this function are that:
1. It does not require computing a whole array of zeros;
2. We avoid hardcoding column names in the task functions.
RETURNS DESCRIPTION DataFrame
DataFrame with no rows, and with columns corresponding to the output of array_to_bounding_box_table.
Source code in fractal_tasks_core/roi/v1.py
def empty_bounding_box_table() -> pd.DataFrame:\n\"\"\"\n Construct an empty bounding-box ROI table of given shape.\n\n This function mirrors the functionality of `array_to_bounding_box_table`,\n for the specific case where the array includes no label. The advantages of\n this function are that:\n\n 1. It does not require computing a whole array of zeros;\n 2. We avoid hardcoding column names in the task functions.\n\n Returns:\n DataFrame with no rows, and with columns corresponding to the output of\n `array_to_bounding_box_table`.\n \"\"\"\n\n df_columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n df = pd.DataFrame(columns=[x for x in df_columns] + [\"label\"])\n return df\n
Produce a table with ROIs placed on a rectangular grid.
The main goal of this ROI grid is to allow processing of smaller subsets of the whole array.
In a specific case (that is, if the image array was obtained by stitching together a set of FOVs placed on a regular grid), the ROIs correspond to the original FOVs.
TODO: make this flexible with respect to the presence/absence of Z.
PARAMETER DESCRIPTION array_shape
ZYX shape of the image array.
TYPE: tuple[int, int, int]
pixels_ZYX
ZYX pixel sizes in micrometers.
TYPE: list[float]
grid_YX_shape
Number of grid ROIs along the Y and X directions.
TYPE: tuple[int, int]
RETURNS DESCRIPTION AnnData
An AnnData table with one ROI per grid cell (grid_YX_shape[0] * grid_YX_shape[1] ROIs in total).
Source code in fractal_tasks_core/roi/v1.py
def get_image_grid_ROIs(\n array_shape: tuple[int, int, int],\n pixels_ZYX: list[float],\n grid_YX_shape: tuple[int, int],\n) -> ad.AnnData:\n\"\"\"\n Produce a table with ROIS placed on a rectangular grid.\n\n The main goal of this ROI grid is to allow processing of smaller subset of\n the whole array.\n\n In a specific case (that is, if the image array was obtained by stitching\n together a set of FOVs placed on a regular grid), the ROIs correspond to\n the original FOVs.\n\n TODO: make this flexible with respect to the presence/absence of Z.\n\n Args:\n array_shape: ZYX shape of the image array.\n pixels_ZYX: ZYX pixel sizes in micrometers.\n grid_YX_shape:\n\n Returns:\n An `AnnData` table with a single ROI.\n \"\"\"\n shape_z, shape_y, shape_x = array_shape[-3:]\n grid_size_y, grid_size_x = grid_YX_shape[:]\n X = []\n obs_names = []\n counter = 0\n start_z = 0\n len_z = shape_z\n\n # Find minimal len_y that covers [0,shape_y] with grid_size_y intervals\n len_y = math.ceil(shape_y / grid_size_y)\n len_x = math.ceil(shape_x / grid_size_x)\n for ind_y in range(grid_size_y):\n start_y = ind_y * len_y\n tmp_len_y = min(shape_y, start_y + len_y) - start_y\n for ind_x in range(grid_size_x):\n start_x = ind_x * len_x\n tmp_len_x = min(shape_x, start_x + len_x) - start_x\n X.append(\n [\n start_x * pixels_ZYX[2],\n start_y * pixels_ZYX[1],\n start_z * pixels_ZYX[0],\n tmp_len_x * pixels_ZYX[2],\n tmp_len_y * pixels_ZYX[1],\n len_z * pixels_ZYX[0],\n ]\n )\n counter += 1\n obs_names.append(f\"ROI_{counter}\")\n ROI_table = ad.AnnData(X=np.array(X, dtype=np.float32))\n ROI_table.obs_names = obs_names\n ROI_table.var_names = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n return ROI_table\n
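When the array shape is not a multiple of the grid shape, the last row and column of ROIs are smaller than the others. The sketch below reproduces only the pixel-unit part of the grid computation from the listing above (no pixel sizes, no AnnData); grid_rois_in_pixels is a hypothetical name and the sample shape is made up.

```python
import math


def grid_rois_in_pixels(shape_y, shape_x, grid_size_y, grid_size_x):
    """Hypothetical helper: list of (start_y, len_y, start_x, len_x) tuples."""
    # Minimal edge lengths that cover the array with the requested grid
    len_y = math.ceil(shape_y / grid_size_y)
    len_x = math.ceil(shape_x / grid_size_x)
    rois = []
    for ind_y in range(grid_size_y):
        start_y = ind_y * len_y
        # Clip the ROI so that it does not extend past the array edge
        tmp_len_y = min(shape_y, start_y + len_y) - start_y
        for ind_x in range(grid_size_x):
            start_x = ind_x * len_x
            tmp_len_x = min(shape_x, start_x + len_x) - start_x
            rois.append((start_y, tmp_len_y, start_x, tmp_len_x))
    return rois


# A 5 x 7 (YX) array split into a 2 x 2 grid
rois = grid_rois_in_pixels(shape_y=5, shape_x=7, grid_size_y=2, grid_size_x=2)
```

Here the nominal ROI size is 3 x 4 pixels; the ROIs in the second row are only 2 pixels high, and those in the second column are only 3 pixels wide.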
Produce a table with a single ROI that covers the whole array.
TODO: make this flexible with respect to the presence/absence of Z.
PARAMETER DESCRIPTION array_shape
ZYX shape of the image array.
TYPE: tuple[int, int, int]
pixels_ZYX
ZYX pixel sizes in micrometers.
TYPE: list[float]
RETURNS DESCRIPTION AnnData
An AnnData table with a single ROI.
Source code in fractal_tasks_core/roi/v1.py
def get_single_image_ROI(\n array_shape: tuple[int, int, int],\n pixels_ZYX: list[float],\n) -> ad.AnnData:\n\"\"\"\n Produce a table with a single ROI that covers the whole array\n\n TODO: make this flexible with respect to the presence/absence of Z.\n\n Args:\n array_shape: ZYX shape of the image array.\n pixels_ZYX: ZYX pixel sizes in micrometers.\n\n Returns:\n An `AnnData` table with a single ROI.\n \"\"\"\n shape_z, shape_y, shape_x = array_shape[-3:]\n ROI_table = ad.AnnData(\n X=np.array(\n [\n [\n 0.0,\n 0.0,\n 0.0,\n shape_x * pixels_ZYX[2],\n shape_y * pixels_ZYX[1],\n shape_z * pixels_ZYX[0],\n ],\n ],\n dtype=np.float32,\n )\n )\n ROI_table.obs_names = [\"image_1\"]\n ROI_table.var_names = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n return ROI_table\n
Return True if the table name contains the name of one of the standard Fractal ROI tables.
If a table name is well_ROI_table, FOV_ROI_table or contains either of the two (e.g. registered_FOV_ROI_table), this function returns True.
PARAMETER DESCRIPTION table
table name
TYPE: str
RETURNS DESCRIPTION bool
Whether the table is a standard ROI table.
Source code in fractal_tasks_core/roi/v1.py
def is_standard_roi_table(table: str) -> bool:\n\"\"\"\n True if the name of the table contains one of the standard Fractal tables\n\n If a table name is well_ROI_table, FOV_ROI_table or contains either of the\n two (e.g. registered_FOV_ROI_table), this function returns True.\n\n Args:\n table: table name\n\n Returns:\n bool of whether it's a standard ROI table\n\n \"\"\"\n if \"well_ROI_table\" in table:\n return True\n elif \"FOV_ROI_table\" in table:\n return True\n else:\n return False\n
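Since the check is a plain substring match, any table name that embeds one of the two standard names is treated as standard. A compact restatement with a few made-up table names (name_is_standard and the sample names are hypothetical):

```python
STANDARD_NAMES = ("well_ROI_table", "FOV_ROI_table")


def name_is_standard(table: str) -> bool:
    """Restatement of the substring check shown above."""
    return any(name in table for name in STANDARD_NAMES)


results = {
    name: name_is_standard(name)
    for name in (
        "FOV_ROI_table",
        "registered_FOV_ROI_table",
        "well_ROI_table_backup",
        "nuclei_ROI_table",
    )
}
```

Only nuclei_ROI_table is rejected; a name with a suffix, such as well_ROI_table_backup, still matches.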
Prepare an AnnData table for fields-of-view ROIs.
PARAMETER DESCRIPTION df
Input dataframe, possibly prepared through parse_yokogawa_metadata.
TYPE: DataFrame
metadata
Columns of df to be stored (if present) into AnnData table obs.
TYPE: tuple[str, ...] DEFAULT: ('time',)
Source code in fractal_tasks_core/roi/v1.py
def prepare_FOV_ROI_table(\n df: pd.DataFrame, metadata: tuple[str, ...] = (\"time\",)\n) -> ad.AnnData:\n\"\"\"\n Prepare an AnnData table for fields-of-view ROIs.\n\n Args:\n df:\n Input dataframe, possibly prepared through\n `parse_yokogawa_metadata`.\n metadata:\n Columns of `df` to be stored (if present) into AnnData table `obs`.\n \"\"\"\n\n # Make a local copy of the dataframe, to avoid SettingWithCopyWarning\n df = df.copy()\n\n # Convert DataFrame index to str, to avoid\n # >> ImplicitModificationWarning: Transforming to str index\n # when creating AnnData object.\n # Do this in the beginning to allow concatenation with e.g. time\n df.index = df.index.astype(str)\n\n # Obtain box size in physical units\n df = df.assign(len_x_micrometer=df.x_pixel * df.pixel_size_x)\n df = df.assign(len_y_micrometer=df.y_pixel * df.pixel_size_y)\n df = df.assign(len_z_micrometer=df.z_pixel * df.pixel_size_z)\n\n # Select only the numeric positional columns needed to define ROIs\n # (to avoid) casting things like the data column to float32\n # or to use unnecessary columns like bit_depth\n positional_columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n \"x_micrometer_original\",\n \"y_micrometer_original\",\n ]\n\n # Assign dtype explicitly, to avoid\n # >> UserWarning: X converted to numpy array with dtype float64\n # when creating AnnData object\n df_roi = df.loc[:, positional_columns].astype(np.float32)\n\n # Create an AnnData object directly from the DataFrame\n adata = ad.AnnData(X=df_roi)\n\n # Reset origin of the FOV ROI table, so that it matches with the well\n # origin\n adata = reset_origin(adata)\n\n # Save any metadata that is specified to the obs df\n for col in metadata:\n if col in df:\n # Cast all metadata to str.\n # Reason: AnnData Zarr writers don't support all pandas types.\n # e.g. pandas.core.arrays.datetimes.DatetimeArray can't be written\n adata.obs[col] = df[col].astype(str)\n\n # Rename rows and columns: Maintain FOV indices from the dataframe\n # (they are already enforced to be unique by Pandas and may contain\n # information for the user, as they are based on the filenames)\n adata.obs_names = \"FOV_\" + adata.obs.index\n adata.var_names = list(map(str, df_roi.columns))\n\n return adata\n
Input dataframe, possibly prepared through parse_yokogawa_metadata.
TYPE: DataFrame
metadata
Columns of df to be stored (if present) into AnnData table obs.
TYPE: tuple[str, ...] DEFAULT: ('time',)
Source code in fractal_tasks_core/roi/v1.py
def prepare_well_ROI_table(\n df: pd.DataFrame, metadata: tuple[str, ...] = (\"time\",)\n) -> ad.AnnData:\n\"\"\"\n Prepare an AnnData table with a single well ROI.\n\n Args:\n df:\n Input dataframe, possibly prepared through\n `parse_yokogawa_metadata`.\n metadata:\n Columns of `df` to be stored (if present) into AnnData table `obs`.\n \"\"\"\n\n # Make a local copy of the dataframe, to avoid SettingWithCopyWarning\n df = df.copy()\n\n # Convert DataFrame index to str, to avoid\n # >> ImplicitModificationWarning: Transforming to str index\n # when creating AnnData object.\n # Do this in the beginning to allow concatenation with e.g. time\n df.index = df.index.astype(str)\n\n # Calculate bounding box extents in physical units\n for mu in [\"x\", \"y\", \"z\"]:\n # Obtain per-FOV properties in physical units.\n # NOTE: a FOV ROI is defined here as the interval [min_micrometer,\n # max_micrometer], with max_micrometer=min_micrometer+len_micrometer\n min_micrometer = df[f\"{mu}_micrometer\"]\n len_micrometer = df[f\"{mu}_pixel\"] * df[f\"pixel_size_{mu}\"]\n max_micrometer = min_micrometer + len_micrometer\n # Obtain well bounding box, in physical units\n min_min_micrometer = min_micrometer.min()\n max_max_micrometer = max_micrometer.max()\n df[f\"{mu}_micrometer\"] = min_min_micrometer\n df[f\"len_{mu}_micrometer\"] = max_max_micrometer - min_min_micrometer\n\n # Select only the numeric positional columns needed to define ROIs\n # (to avoid) casting things like the data column to float32\n # or to use unnecessary columns like bit_depth\n positional_columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n\n # Assign dtype explicitly, to avoid\n # >> UserWarning: X converted to numpy array with dtype float64\n # when creating AnnData object\n df_roi = df.iloc[0:1, :].loc[:, positional_columns].astype(np.float32)\n\n # Create an AnnData object directly from the DataFrame\n adata 
= ad.AnnData(X=df_roi)\n\n # Reset origin of the single-entry well ROI table\n adata = reset_origin(adata)\n\n # Save any metadata that is specified to the obs df\n for col in metadata:\n if col in df:\n # Cast all metadata to str.\n # Reason: AnnData Zarr writers don't support all pandas types.\n # e.g. pandas.core.arrays.datetimes.DatetimeArray can't be written\n adata.obs[col] = df[col].astype(str)\n\n # Rename rows and columns: Maintain FOV indices from the dataframe\n # (they are already enforced to be unique by Pandas and may contain\n # information for the user, as they are based on the filenames)\n adata.obs_names = \"well_\" + adata.obs.index\n adata.var_names = list(map(str, df_roi.columns))\n\n return adata\n
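The bounding-box loop above can be illustrated without pandas. The following is a minimal sketch, assuming each FOV is a plain dictionary that uses the same column names as the source; the helpers `well_bounding_box` and `make_fov` and all numeric values are hypothetical and not part of fractal-tasks-core.

```python
def well_bounding_box(fovs: list[dict[str, float]]) -> dict[str, float]:
    """Hypothetical helper: origin and extent of the box enclosing all FOVs."""
    box: dict[str, float] = {}
    for mu in ("x", "y", "z"):
        # Each FOV spans [min, min + n_pixels * pixel_size] along axis `mu`
        mins = [fov[f"{mu}_micrometer"] for fov in fovs]
        maxs = [
            fov[f"{mu}_micrometer"] + fov[f"{mu}_pixel"] * fov[f"pixel_size_{mu}"]
            for fov in fovs
        ]
        box[f"{mu}_micrometer"] = min(mins)
        box[f"len_{mu}_micrometer"] = max(maxs) - min(mins)
    return box


def make_fov(x: float, y: float) -> dict[str, float]:
    """Build an illustrative FOV of 100 x 100 x 2 pixels (made-up values)."""
    return dict(
        x_micrometer=x, y_micrometer=y, z_micrometer=0.0,
        x_pixel=100, y_pixel=100, z_pixel=2,
        pixel_size_x=0.5, pixel_size_y=0.5, pixel_size_z=1.0,
    )


# Two FOVs placed side by side along X
box = well_bounding_box([make_fov(10.0, 20.0), make_fov(60.0, 20.0)])
print(box)
```

In the library function, the origin of this single-row table is subsequently shifted to zero by `reset_origin`; the sketch stops before that step.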
Return a copy of a ROI table, with shifted-to-zero origin for some columns.
PARAMETER DESCRIPTION ROI_table
Original ROI table.
TYPE: AnnData
x_pos
Name of the column with X position of ROIs.
TYPE: str DEFAULT: 'x_micrometer'
y_pos
Name of the column with Y position of ROIs.
TYPE: str DEFAULT: 'y_micrometer'
z_pos
Name of the column with Z position of ROIs.
TYPE: str DEFAULT: 'z_micrometer'
RETURNS DESCRIPTION AnnData
A copy of the ROI_table AnnData table, where values of x_pos, y_pos and z_pos columns have been shifted by their minimum values.
Source code in fractal_tasks_core/roi/v1.py
def reset_origin(\n ROI_table: ad.AnnData,\n x_pos: str = \"x_micrometer\",\n y_pos: str = \"y_micrometer\",\n z_pos: str = \"z_micrometer\",\n) -> ad.AnnData:\n\"\"\"\n Return a copy of a ROI table, with shifted-to-zero origin for some columns.\n\n Args:\n ROI_table: Original ROI table.\n x_pos: Name of the column with X position of ROIs.\n y_pos: Name of the column with Y position of ROIs.\n z_pos: Name of the column with Z position of ROIs.\n\n Returns:\n A copy of the `ROI_table` AnnData table, where values of `x_pos`,\n `y_pos` and `z_pos` columns have been shifted by their minimum\n values.\n \"\"\"\n new_table = ROI_table.copy()\n\n origin_x = min(new_table[:, x_pos].X[:, 0])\n origin_y = min(new_table[:, y_pos].X[:, 0])\n origin_z = min(new_table[:, z_pos].X[:, 0])\n\n for FOV in new_table.obs_names:\n new_table[FOV, x_pos] = new_table[FOV, x_pos].X[0, 0] - origin_x\n new_table[FOV, y_pos] = new_table[FOV, y_pos].X[0, 0] - origin_y\n new_table[FOV, z_pos] = new_table[FOV, z_pos].X[0, 0] - origin_z\n\n return new_table\n
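As an illustration of the shift performed by `reset_origin`, here is a minimal sketch that works on plain dictionaries instead of an AnnData table. The helper name `shift_to_zero_origin` and the sample values are assumptions made for this example.

```python
def shift_to_zero_origin(
    rows: list[dict[str, float]],
    columns: tuple[str, ...] = ("x_micrometer", "y_micrometer", "z_micrometer"),
) -> list[dict[str, float]]:
    """Hypothetical helper: return shifted copies of `rows`, leaving the input untouched."""
    # The origin of each position column is its minimum over all ROIs
    origins = {col: min(row[col] for row in rows) for col in columns}
    return [
        {
            key: (value - origins[key] if key in origins else value)
            for key, value in row.items()
        }
        for row in rows
    ]


rows = [
    {"x_micrometer": 10.0, "y_micrometer": 5.0, "z_micrometer": 1.0, "len_x_micrometer": 3.0},
    {"x_micrometer": 13.0, "y_micrometer": 5.0, "z_micrometer": 1.0, "len_x_micrometer": 3.0},
]
shifted = shift_to_zero_origin(rows)
print(shifted)
```

Only the position columns are shifted; extent columns such as `len_x_micrometer` keep their values.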
This function reflects our current working assumptions (e.g. the presence of some specific columns); this may change in future versions.
PARAMETER DESCRIPTION table
AnnData table to be checked
TYPE: AnnData
Source code in fractal_tasks_core/roi/v1_checks.py
def are_ROI_table_columns_valid(*, table: ad.AnnData) -> None:\n\"\"\"\n Verify some validity assumptions on a ROI table.\n\n This function reflects our current working assumptions (e.g. the presence\n of some specific columns); this may change in future versions.\n\n Args:\n table: AnnData table to be checked\n \"\"\"\n\n # Hard constraint: table columns must include some expected ones\n columns = [\n \"x_micrometer\",\n \"y_micrometer\",\n \"z_micrometer\",\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n ]\n for column in columns:\n if column not in table.var_names:\n raise ValueError(f\"Column {column} is not present in ROI table\")\n
Check that list of indices has zero origin on each axis.
See fractal-tasks-core issues #530 and #554.
This helper function is meant to provide informative error messages when ROI tables created with fractal-tasks-core up to v0.11 are used in v0.12. This function will be deprecated and removed as soon as the v0.11/v0.12 transition advances.
Note that only FOV_ROI_table and well_ROI_table have to fulfill this constraint, while ROI tables obtained through segmentation may have arbitrary (non-negative) indices.
PARAMETER DESCRIPTION list_indices
Output of convert_ROI_table_to_indices; each item is like [start_z, end_z, start_y, end_y, start_x, end_x].
TYPE: list[list[int]]
ROI_table_name
Name of the ROI table.
TYPE: str
RAISES DESCRIPTION ValueError
If the table name is FOV_ROI_table or well_ROI_table and the minimum values of start_x, start_y and start_z are not all zero.
Source code in fractal_tasks_core/roi/v1_checks.py
def check_valid_ROI_indices(\n list_indices: list[list[int]],\n ROI_table_name: str,\n) -> None:\n\"\"\"\n Check that list of indices has zero origin on each axis.\n\n See fractal-tasks-core issues #530 and #554.\n\n This helper function is meant to provide informative error messages when\n ROI tables created with fractal-tasks-core up to v0.11 are used in v0.12.\n This function will be deprecated and removed as soon as the v0.11/v0.12\n transition advances.\n\n Note that only `FOV_ROI_table` and `well_ROI_table` have to fulfill this\n constraint, while ROI tables obtained through segmentation may have\n arbitrary (non-negative) indices.\n\n Args:\n list_indices:\n Output of `convert_ROI_table_to_indices`; each item is like\n `[start_z, end_z, start_y, end_y, start_x, end_x]`.\n ROI_table_name: Name of the ROI table.\n\n Raises:\n ValueError:\n If the table name is `FOV_ROI_table` or `well_ROI_table` and the\n minimum value of `start_x`, `start_y` and `start_z` are not all\n zero.\n \"\"\"\n if ROI_table_name not in [\"FOV_ROI_table\", \"well_ROI_table\"]:\n # This validation function only applies to the FOV/well ROI tables\n # generated with fractal-tasks-core\n return\n\n # Find minimum index along ZYX\n min_start_z = min(item[0] for item in list_indices)\n min_start_y = min(item[2] for item in list_indices)\n min_start_x = min(item[4] for item in list_indices)\n\n # Check that minimum indices are all zero\n for ind, min_index in enumerate((min_start_z, min_start_y, min_start_x)):\n if min_index != 0:\n axis = [\"Z\", \"Y\", \"X\"][ind]\n raise ValueError(\n f\"{axis} component of ROI indices for table `{ROI_table_name}`\"\n f\" do not start with 0, but with {min_index}.\\n\"\n \"Hint: As of fractal-tasks-core v0.12, FOV/well ROI \"\n \"tables with non-zero origins (e.g. the ones created with \"\n \"v0.11) are not supported.\"\n )\n
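To make the index layout concrete, the sketch below computes the three minima that the check compares with zero. The helper `min_start_zyx` and the sample index lists are hypothetical and only mirror the layout `[start_z, end_z, start_y, end_y, start_x, end_x]` described above.

```python
def min_start_zyx(list_indices: list[list[int]]) -> tuple[int, int, int]:
    """Hypothetical helper: minimum start index along Z, Y and X."""
    return (
        min(item[0] for item in list_indices),  # start_z
        min(item[2] for item in list_indices),  # start_y
        min(item[4] for item in list_indices),  # start_x
    )


# Two FOVs side by side along X, starting at the origin
zero_origin = [[0, 1, 0, 100, 0, 100], [0, 1, 0, 100, 100, 200]]
# Same layout, but with every FOV shifted by 10 pixels along Y
shifted_origin = [[0, 1, 10, 110, 0, 100], [0, 1, 10, 110, 100, 200]]

print(min_start_zyx(zero_origin))
print(min_start_zyx(shifted_origin))
```

For a table named `FOV_ROI_table` or `well_ROI_table`, the second list would trigger the `ValueError` for the Y axis, since its minimum start index is not zero.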
This function reflects our current working assumptions (e.g. the presence of some specific columns); this may change in future versions.
If use_masks=True, we verify that the table is a valid masking_roi_table as of table specifications V1; if this check fails, use_masks should be set to False upstream in the parent function.
PARAMETER DESCRIPTION table_path
Path of the AnnData ROI table to be checked.
TYPE: str
use_masks
If True, perform some additional checks related to masked loading.
TYPE: bool
RETURNS DESCRIPTION Optional[bool]
Always None if use_masks=False, otherwise return whether the table is valid for masked loading.
Source code in fractal_tasks_core/roi/v1_checks.py
def is_ROI_table_valid(*, table_path: str, use_masks: bool) -> Optional[bool]:\n\"\"\"\n Verify some validity assumptions on a ROI table.\n\n This function reflects our current working assumptions (e.g. the presence\n of some specific columns); this may change in future versions.\n\n If `use_masks=True`, we verify that the table is a valid\n `masking_roi_table` as of table specifications V1; if this check fails,\n `use_masks` should be set to `False` upstream in the parent function.\n\n Args:\n table_path: Path of the AnnData ROI table to be checked.\n use_masks: If `True`, perform some additional checks related to\n masked loading.\n\n Returns:\n Always `None` if `use_masks=False`, otherwise return whether the table\n is valid for masked loading.\n \"\"\"\n\n table = ad.read_zarr(table_path)\n are_ROI_table_columns_valid(table=table)\n if not use_masks:\n return None\n\n # Check whether the table can be used for masked loading\n attrs = zarr.group(table_path).attrs.asdict()\n logger.info(f\"ROI table at {table_path} has attrs: {attrs}\")\n try:\n MaskingROITableAttrs(**attrs)\n logging.info(\"ROI table can be used for masked loading\")\n return True\n except ValidationError:\n logging.info(\"ROI table cannot be used for masked loading\")\n return False\n
This function is currently only used in tests and examples.
The plotting_function parameter is exposed so that other tools (see examples in this repository) may use it to show the FOV ROIs.
PARAMETER DESCRIPTION site_metadata
TBD
TYPE: DataFrame
selected_well
TBD
TYPE: str
plotting_function
TBD
TYPE: Callable
tol
TBD
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/v1_overlaps.py
def check_well_for_FOV_overlap(\n site_metadata: pd.DataFrame,\n selected_well: str,\n plotting_function: Callable,\n tol: float = 1e-10,\n):\n\"\"\"\n This function is currently only used in tests and examples.\n\n The `plotting_function` parameter is exposed so that other tools (see\n examples in this repository) may use it to show the FOV ROIs.\n\n Args:\n site_metadata: TBD\n selected_well: TBD\n plotting_function: TBD\n tol: TBD\n \"\"\"\n\n df = site_metadata.loc[selected_well].copy()\n df[\"xmin\"] = df[\"x_micrometer\"]\n df[\"ymin\"] = df[\"y_micrometer\"]\n df[\"xmax\"] = df[\"x_micrometer\"] + df[\"pixel_size_x\"] * df[\"x_pixel\"]\n df[\"ymax\"] = df[\"y_micrometer\"] + df[\"pixel_size_y\"] * df[\"y_pixel\"]\n\n xmin = list(df.loc[:, \"xmin\"])\n ymin = list(df.loc[:, \"ymin\"])\n xmax = list(df.loc[:, \"xmax\"])\n ymax = list(df.loc[:, \"ymax\"])\n num_lines = len(xmin)\n\n list_overlapping_FOVs = []\n for line_1 in range(num_lines):\n min_x_1, max_x_1 = [a[line_1] for a in [xmin, xmax]]\n min_y_1, max_y_1 = [a[line_1] for a in [ymin, ymax]]\n for line_2 in range(line_1):\n min_x_2, max_x_2 = [a[line_2] for a in [xmin, xmax]]\n min_y_2, max_y_2 = [a[line_2] for a in [ymin, ymax]]\n overlap = is_overlapping_2D(\n (min_x_1, min_y_1, max_x_1, max_y_1),\n (min_x_2, min_y_2, max_x_2, max_y_2),\n tol=tol,\n )\n if overlap:\n list_overlapping_FOVs.append(line_1)\n list_overlapping_FOVs.append(line_2)\n\n # Call plotting_function\n plotting_function(\n xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well\n )\n\n if len(list_overlapping_FOVs) > 0:\n # Increase values by one to switch from index to the label plotted\n return {selected_well: [x + 1 for x in list_overlapping_FOVs]}\n
Given a list of integer ROI indices, find whether there are overlaps.
PARAMETER DESCRIPTION list_indices
List of ROI indices, where each element in the list should look like [start_z, end_z, start_y, end_y, start_x, end_x].
TYPE: list[list[int]]
RETURNS DESCRIPTION Optional[tuple[int, int]]
None if no overlap was detected, otherwise a tuple with the positional indices of a pair of overlapping ROIs.
Source code in fractal_tasks_core/roi/v1_overlaps.py
def find_overlaps_in_ROI_indices(\n list_indices: list[list[int]],\n) -> Optional[tuple[int, int]]:\n\"\"\"\n Given a list of integer ROI indices, find whether there are overlaps.\n\n Args:\n list_indices: List of ROI indices, where each element in the list\n should look like\n `[start_z, end_z, start_y, end_y, start_x, end_x]`.\n\n Returns:\n `None` if no overlap was detected, otherwise a tuple with the\n positional indices of a pair of overlapping ROIs.\n \"\"\"\n\n for ind_1, ROI_1 in enumerate(list_indices):\n s_z, e_z, s_y, e_y, s_x, e_x = ROI_1[:]\n box_1 = [s_x, s_y, s_z, e_x, e_y, e_z]\n for ind_2 in range(ind_1):\n ROI_2 = list_indices[ind_2]\n s_z, e_z, s_y, e_y, s_x, e_x = ROI_2[:]\n box_2 = [s_x, s_y, s_z, e_x, e_y, e_z]\n if _is_overlapping_3D_int(box_1, box_2):\n return (ind_1, ind_2)\n return None\n
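The body of `_is_overlapping_3D_int` is not shown in this section. The sketch below is one plausible way to test integer boxes for overlap, assuming that each `[start, end]` pair is a half-open interval (as in Python slicing), so that ROIs which merely touch do not count as overlapping. The names `overlaps_1d_int`, `overlaps_3d_int` and `find_first_overlap` are hypothetical.

```python
from typing import Optional, Sequence


def overlaps_1d_int(start_1: int, end_1: int, start_2: int, end_2: int) -> bool:
    """Half-open intervals [start, end) overlap if each starts before the other ends."""
    return start_1 < end_2 and start_2 < end_1


def overlaps_3d_int(box_1: Sequence[int], box_2: Sequence[int]) -> bool:
    """Boxes are [s_x, s_y, s_z, e_x, e_y, e_z]; they overlap only if all axes overlap."""
    return all(
        overlaps_1d_int(box_1[axis], box_1[axis + 3], box_2[axis], box_2[axis + 3])
        for axis in range(3)
    )


def find_first_overlap(list_indices: list[list[int]]) -> Optional[tuple[int, int]]:
    """Return positional indices of the first overlapping pair, or None."""
    for ind_1, roi_1 in enumerate(list_indices):
        s_z, e_z, s_y, e_y, s_x, e_x = roi_1
        box_1 = [s_x, s_y, s_z, e_x, e_y, e_z]
        for ind_2 in range(ind_1):
            s_z, e_z, s_y, e_y, s_x, e_x = list_indices[ind_2]
            box_2 = [s_x, s_y, s_z, e_x, e_y, e_z]
            if overlaps_3d_int(box_1, box_2):
                return (ind_1, ind_2)
    return None


adjacent = [[0, 1, 0, 100, 0, 100], [0, 1, 0, 100, 100, 200]]
overlapping = [[0, 1, 0, 100, 0, 100], [0, 1, 0, 100, 50, 150]]
print(find_first_overlap(adjacent))
print(find_first_overlap(overlapping))
```

As in the library function, the search stops at the first overlapping pair, so the result identifies one conflict rather than all of them.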
Find the indices of the next pair of overlapping FOVs.
Note: the returned indices are positional indices, starting from 0.
PARAMETER DESCRIPTION tmp_df
Dataframe with columns [\"xmin\", \"ymin\", \"xmax\", \"ymax\"].
TYPE: DataFrame
tol
Finite tolerance for floating-point comparisons.
TYPE: float DEFAULT: 1e-10
Source code in fractal_tasks_core/roi/v1_overlaps.py
def get_overlapping_pair(\n tmp_df: pd.DataFrame, tol: float = 1e-10\n) -> Union[tuple[int, int], bool]:\n\"\"\"\n Finds the indices for the next overlapping FOVs pair.\n\n Note: the returned indices are positional indices, starting from 0.\n\n Args:\n tmp_df: Dataframe with columns `[\"xmin\", \"ymin\", \"xmax\", \"ymax\"]`.\n tol: Finite tolerance for floating-point comparisons.\n \"\"\"\n\n num_lines = len(tmp_df.index)\n for pos_ind_1 in range(num_lines):\n for pos_ind_2 in range(pos_ind_1):\n if is_overlapping_2D(\n tmp_df.iloc[pos_ind_1], tmp_df.iloc[pos_ind_2], tol=tol\n ):\n return (pos_ind_1, pos_ind_2)\n return False\n
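The helper `is_overlapping_2D` is used above but not shown in this section. As a rough sketch of how a tolerance can enter such a test, the code below treats two boxes as overlapping only if their shared extent exceeds `tol` along both axes. This is an assumption about the intended semantics, not the library's implementation, and the names `overlaps_1d` and `overlaps_2d` are hypothetical.

```python
def overlaps_1d(
    interval_1: tuple[float, float],
    interval_2: tuple[float, float],
    tol: float = 1e-10,
) -> bool:
    """Intervals overlap if their shared length is larger than `tol`."""
    shared = min(interval_1[1], interval_2[1]) - max(interval_1[0], interval_2[0])
    return shared > tol


def overlaps_2d(
    box_1: tuple[float, float, float, float],
    box_2: tuple[float, float, float, float],
    tol: float = 1e-10,
) -> bool:
    """Boxes are (xmin, ymin, xmax, ymax); both axes must overlap."""
    xmin_1, ymin_1, xmax_1, ymax_1 = box_1
    xmin_2, ymin_2, xmax_2, ymax_2 = box_2
    return overlaps_1d((xmin_1, xmax_1), (xmin_2, xmax_2), tol=tol) and overlaps_1d(
        (ymin_1, ymax_1), (ymin_2, ymax_2), tol=tol
    )


touching = overlaps_2d((0.0, 0.0, 1.0, 1.0), (1.0, 0.0, 2.0, 1.0))
round_off = overlaps_2d((0.0, 0.0, 1.0, 1.0), (1.0 - 1e-12, 0.0, 2.0, 1.0))
real_overlap = overlaps_2d((0.0, 0.0, 1.0, 1.0), (0.5, 0.5, 1.5, 1.5))
print(touching, round_off, real_overlap)
```

With this convention, FOVs that share an edge, or that intersect only by a floating-point round-off smaller than `tol`, are not reported as overlapping.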
Find the indices of all pairs of overlapping FOVs, in three dimensions.
Note: the returned indices are positional indices, starting from 0.
PARAMETER DESCRIPTION tmp_df
Dataframe with columns {x,y,z}_micrometer and len_{x,y,z}_micrometer.
TYPE: DataFrame
full_res_pxl_sizes_zyx
TBD
TYPE: Sequence[float]
Source code in fractal_tasks_core/roi/v1_overlaps.py
def get_overlapping_pairs_3D(\n tmp_df: pd.DataFrame,\n full_res_pxl_sizes_zyx: Sequence[float],\n):\n\"\"\"\n Finds the indices for the all overlapping FOVs pair, in three dimensions.\n\n Note: the returned indices are positional indices, starting from 0.\n\n Args:\n tmp_df: Dataframe with columns `{x,y,z}_micrometer` and\n `len_{x,y,z}_micrometer`.\n full_res_pxl_sizes_zyx: TBD\n \"\"\"\n\n tol = 1e-10\n if tol > min(full_res_pxl_sizes_zyx) / 1e3:\n raise ValueError(f\"{tol=} but {full_res_pxl_sizes_zyx=}\")\n\n new_tmp_df = tmp_df.copy()\n\n new_tmp_df[\"x_micrometer_max\"] = (\n new_tmp_df[\"x_micrometer\"] + new_tmp_df[\"len_x_micrometer\"]\n )\n new_tmp_df[\"y_micrometer_max\"] = (\n new_tmp_df[\"y_micrometer\"] + new_tmp_df[\"len_y_micrometer\"]\n )\n new_tmp_df[\"z_micrometer_max\"] = (\n new_tmp_df[\"z_micrometer\"] + new_tmp_df[\"len_z_micrometer\"]\n )\n # Remove columns which are not necessary for overlap checks\n list_columns = [\n \"len_x_micrometer\",\n \"len_y_micrometer\",\n \"len_z_micrometer\",\n \"label\",\n ]\n new_tmp_df.drop(labels=list_columns, axis=1, inplace=True)\n\n # Loop over all pairs, and construct list of overlapping ones\n num_lines = len(new_tmp_df.index)\n overlapping_list = []\n for pos_ind_1 in range(num_lines):\n for pos_ind_2 in range(pos_ind_1):\n overlap = is_overlapping_3D(\n new_tmp_df.iloc[pos_ind_1], new_tmp_df.iloc[pos_ind_2], tol=tol\n )\n if overlap:\n overlapping_list.append((pos_ind_1, pos_ind_2))\n return overlapping_list\n
Given a metadata dataframe, shift its columns to remove FOV overlaps.
PARAMETER DESCRIPTION df
Metadata dataframe.
TYPE: DataFrame
Source code in fractal_tasks_core/roi/v1_overlaps.py
def remove_FOV_overlaps(df: pd.DataFrame):\n\"\"\"\n Given a metadata dataframe, shift its columns to remove FOV overlaps.\n\n Args:\n df: Metadata dataframe.\n \"\"\"\n\n # Set tolerance (this should be much smaller than pixel size or expected\n # round-offs), and maximum number of iterations in constraint solver\n tol = 1e-10\n max_iterations = 200\n\n # Create a local copy of the dataframe\n df = df.copy()\n\n # Create temporary columns (to streamline overlap removals), which are\n # then removed at the end of the remove_FOV_overlaps function\n df[\"xmin\"] = df[\"x_micrometer\"]\n df[\"ymin\"] = df[\"y_micrometer\"]\n df[\"xmax\"] = df[\"x_micrometer\"] + df[\"pixel_size_x\"] * df[\"x_pixel\"]\n df[\"ymax\"] = df[\"y_micrometer\"] + df[\"pixel_size_y\"] * df[\"y_pixel\"]\n list_columns = [\"xmin\", \"ymin\", \"xmax\", \"ymax\"]\n\n # Create columns with the original positions (not to be removed)\n df[\"x_micrometer_original\"] = df[\"x_micrometer\"]\n df[\"y_micrometer_original\"] = df[\"y_micrometer\"]\n\n # Check that tolerance is much smaller than pixel sizes\n min_pixel_size = df[[\"pixel_size_x\", \"pixel_size_y\"]].min().min()\n if tol > min_pixel_size / 1e3:\n raise ValueError(\n f\"In remove_FOV_overlaps, {tol=} but {min_pixel_size=}\"\n )\n\n # Loop over wells\n wells = sorted(list(set([ind[0] for ind in df.index])))\n for well in wells:\n\n logger.info(f\"removing FOV overlaps for {well=}\")\n df_well = df.loc[well].copy()\n\n # NOTE: these are positional indices (i.e. 
starting from 0)\n pair_pos_indices = get_overlapping_pair(df_well[list_columns], tol=tol)\n\n # Keep going until there are no overlaps, or until iteration reaches\n # max_iterations\n iteration = 0\n while pair_pos_indices:\n iteration += 1\n\n # Identify overlapping FOVs\n pos_ind_1, pos_ind_2 = pair_pos_indices\n fov_id_1 = df_well.index[pos_ind_1]\n fov_id_2 = df_well.index[pos_ind_2]\n xmin_1, ymin_1, xmax_1, ymax_1 = df_well[list_columns].iloc[\n pos_ind_1\n ]\n xmin_2, ymin_2, xmax_2, ymax_2 = df_well[list_columns].iloc[\n pos_ind_2\n ]\n logger.debug(\n f\"{well=}, {iteration=}, removing overlap between\"\n f\" {fov_id_1=} and {fov_id_2=}\"\n )\n\n # Check what kind of overlap is there (X, Y, or XY)\n is_x_equal = abs(xmin_1 - xmin_2) < tol and (xmax_1 - xmax_2) < tol\n is_y_equal = abs(ymin_1 - ymin_2) < tol and (ymax_1 - ymax_2) < tol\n is_x_overlap = is_overlapping_1D(\n [xmin_1, xmax_1], [xmin_2, xmax_2], tol=tol\n )\n is_y_overlap = is_overlapping_1D(\n [ymin_1, ymax_1], [ymin_2, ymax_2], tol=tol\n )\n\n if is_x_equal and is_y_overlap:\n # Y overlap\n df_well = apply_shift_in_one_direction(\n df_well,\n [ymin_1, ymax_1],\n [ymin_2, ymax_2],\n mu=\"y\",\n tol=tol,\n )\n elif is_y_equal and is_x_overlap:\n # X overlap\n df_well = apply_shift_in_one_direction(\n df_well,\n [xmin_1, xmax_1],\n [xmin_2, xmax_2],\n mu=\"x\",\n tol=tol,\n )\n elif not (is_x_equal or is_y_equal) and (\n is_x_overlap and is_y_overlap\n ):\n # XY overlap\n df_well = apply_shift_in_one_direction(\n df_well,\n [xmin_1, xmax_1],\n [xmin_2, xmax_2],\n mu=\"x\",\n tol=tol,\n )\n df_well = apply_shift_in_one_direction(\n df_well,\n [ymin_1, ymax_1],\n [ymin_2, ymax_2],\n mu=\"y\",\n tol=tol,\n )\n else:\n raise ValueError(\n \"Trying to remove overlap which is not there.\"\n )\n\n # Look for next overlapping FOV pair\n pair_pos_indices = get_overlapping_pair(\n df_well[list_columns], tol=tol\n )\n\n # Enforce maximum number of iterations\n if iteration >= max_iterations:\n raise 
ValueError(f\"Reached {max_iterations=} for {well=}\")\n\n # Note: using df.loc[well] = df_well leads to a NaN dataframe, see\n # for instance https://stackoverflow.com/a/28432733/19085332\n df.loc[well, :] = df_well.values\n\n # Remove temporary columns that were added only as part of this function\n df.drop(list_columns, axis=1, inplace=True)\n\n return df\n
Run an overlap check over all wells and optionally plot overlaps.
This function is currently only used in tests and examples.
The plotting_function parameter is exposed so that other tools (see examples in this repository) may use it to show the FOV ROIs. Its arguments are: [xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well].
PARAMETER DESCRIPTION site_metadata
TBD
TYPE: DataFrame
tol
TBD
TYPE: float DEFAULT: 1e-10
plotting_function
TBD
TYPE: Optional[Callable] DEFAULT: None
Source code in fractal_tasks_core/roi/v1_overlaps.py
def run_overlap_check(\n site_metadata: pd.DataFrame,\n tol: float = 1e-10,\n plotting_function: Optional[Callable] = None,\n):\n\"\"\"\n Run an overlap check over all wells and optionally plots overlaps.\n\n This function is currently only used in tests and examples.\n\n The `plotting_function` parameter is exposed so that other tools (see\n examples in this repository) may use it to show the FOV ROIs. Its arguments\n are: `[xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well]`.\n\n Args:\n site_metadata: TBD\n tol: TBD\n plotting_function: TBD\n \"\"\"\n\n if plotting_function is None:\n\n def plotting_function(\n xmin, xmax, ymin, ymax, list_overlapping_FOVs, selected_well\n ):\n pass\n\n wells = site_metadata.index.unique(level=\"well_id\")\n overlapping_FOVs = []\n for selected_well in wells:\n overlap_curr_well = check_well_for_FOV_overlap(\n site_metadata,\n selected_well=selected_well,\n tol=tol,\n plotting_function=plotting_function,\n )\n if overlap_curr_well:\n print(selected_well)\n overlapping_FOVs.append(overlap_curr_well)\n\n return overlapping_FOVs\n
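A `plotting_function` only needs to accept the six arguments listed above, in that order. The following text-only callback is a hypothetical stand-in for a real plotting routine; its name, the report format and the sample values are assumptions made for this example.

```python
def print_fov_boxes(
    xmin: list[float],
    xmax: list[float],
    ymin: list[float],
    ymax: list[float],
    list_overlapping_FOVs: list[int],
    selected_well: str,
) -> str:
    """Hypothetical callback: describe the FOV boxes of one well as text."""
    lines = [f"Well {selected_well}"]
    for ind, (x0, x1, y0, y1) in enumerate(zip(xmin, xmax, ymin, ymax)):
        # The callback receives 0-based positions; labels are shown 1-based
        flag = " (overlapping)" if ind in list_overlapping_FOVs else ""
        lines.append(f"  FOV {ind + 1}: x=[{x0}, {x1}], y=[{y0}, {y1}]{flag}")
    report = "\n".join(lines)
    print(report)
    return report


# Direct call with made-up values: two FOVs that overlap along X
report = print_fov_boxes(
    [0.0, 5.0], [10.0, 15.0], [0.0, 0.0], [10.0, 10.0], [1, 0], "B03"
)
```

Such a callback could then be passed as `plotting_function=print_fov_boxes` to `run_overlap_check`; its return value is ignored by `check_well_for_FOV_overlap`.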
This is the general interface that should allow for a smooth coexistence of tables with different fractal_table_version values. Currently only V1 is defined and implemented. The assumption is that V2 should only change:
1. The lower-level writing function (that is, _write_table_v2);
2. The type of the table (which would also be reflected in a more general type hint for table, in the current function);
3. A different definition of what values of table_attrs are valid or invalid, to be implemented in _write_table_v2;
4. Possibly, additional parameters for _write_table_v2, which will be optional parameters of write_table (so that write_table remains valid for both V1 and V2).
PARAMETER DESCRIPTION image_group
The image Zarr group where the table will be written.
TYPE: Group
table_name
The name of the table.
TYPE: str
table
The table object (currently an AnnData object, for V1).
TYPE: AnnData
overwrite
If False, check that the new table does not exist (either as a zarr sub-group or as part of the zarr-group attributes). In all cases, propagate parameter to low-level functions, to determine the behavior in case of an existing sub-group named as in table_name.
TYPE: bool DEFAULT: False
table_type
type attribute for the table; in case type is also present in table_attrs, this function argument takes priority.
TYPE: Optional[str] DEFAULT: None
table_attrs
If set, overwrite table_group attributes with table_attrs key/value pairs. If table_type is not provided, then table_attrs must include the type key.
TYPE: Optional[dict[str, Any]] DEFAULT: None
RETURNS DESCRIPTION group
Zarr group of the table.
Source code in fractal_tasks_core/tables/__init__.py
def write_table(\n image_group: zarr.hierarchy.Group,\n table_name: str,\n table: ad.AnnData,\n overwrite: bool = False,\n table_type: Optional[str] = None,\n table_attrs: Optional[dict[str, Any]] = None,\n) -> zarr.group:\n\"\"\"\n Write a table to a Zarr group.\n\n This is the general interface that should allow for a smooth coexistence of\n tables with different `fractal_table_version` values. Currently only V1 is\n defined and implemented. The assumption is that V2 should only change:\n\n 1. The lower-level writing function (that is, `_write_table_v2`).\n 2. The type of the table (which would also reflect into a more general type\n hint for `table`, in the current funciton);\n 3. A different definition of what values of `table_attrs` are valid or\n invalid, to be implemented in `_write_table_v2`.\n 4. Possibly, additional parameters for `_write_table_v2`, which will be\n optional parameters of `write_table` (so that `write_table` remains\n valid for both V1 and V2).\n\n Args:\n image_group:\n The image Zarr group where the table will be written.\n table_name:\n The name of the table.\n table:\n The table object (currently an AnnData object, for V1).\n overwrite:\n If `False`, check that the new table does not exist (either as a\n zarr sub-group or as part of the zarr-group attributes). In all\n cases, propagate parameter to low-level functions, to determine the\n behavior in case of an existing sub-group named as in `table_name`.\n table_type: `type` attribute for the table; in case `type` is also\n present in `table_attrs`, this function argument takes priority.\n table_attrs:\n If set, overwrite table_group attributes with table_attrs key/value\n pairs. 
If `table_type` is not provided, then `table_attrs` must\n include the `type` key.\n\n Returns:\n Zarr group of the table.\n \"\"\"\n # Choose which version to use, giving priority to a value that is present\n # in table_attrs\n version = __FRACTAL_TABLE_VERSION__\n if table_attrs is not None:\n try:\n version = table_attrs[\"fractal_table_version\"]\n except KeyError:\n pass\n\n if version == \"1\":\n return _write_table_v1(\n image_group,\n table_name,\n table,\n overwrite,\n table_type,\n table_attrs,\n )\n else:\n raise NotImplementedError(\n f\"fractal_table_version='{version}' is not supported\"\n )\n
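The version selection at the top of `write_table` can be summarized in a few lines. This is a minimal sketch under the assumption that `__FRACTAL_TABLE_VERSION__` equals `"1"` (the only version that the function dispatches on); the constant `DEFAULT_TABLE_VERSION` and the helper `resolve_table_version` are hypothetical names.

```python
from typing import Any, Optional

# Assumption: stands in for the package-level __FRACTAL_TABLE_VERSION__
DEFAULT_TABLE_VERSION = "1"


def resolve_table_version(table_attrs: Optional[dict[str, Any]] = None) -> Any:
    """Hypothetical helper: a version found in `table_attrs` takes priority."""
    if table_attrs is None:
        return DEFAULT_TABLE_VERSION
    return table_attrs.get("fractal_table_version", DEFAULT_TABLE_VERSION)


print(resolve_table_version())
print(resolve_table_version({"type": "roi_table"}))
print(resolve_table_version({"fractal_table_version": "2"}))
```

In `write_table`, any resolved value other than `"1"` leads to a `NotImplementedError`.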
Wrap anndata.experimental.write_elem, to include overwrite parameter.
See the docs for the original function at https://anndata.readthedocs.io/en/stable/generated/anndata.experimental.write_elem.html.
This function writes elem to the sub-group key of group. The overwrite-related expected behavior is:
if the sub-group does not exist, create it (independently of overwrite);
if the sub-group already exists and overwrite=True, overwrite the sub-group;
if the sub-group already exists and overwrite=False, fail.
Note that this version of the wrapper does not include the original dataset_kwargs parameter.
PARAMETER DESCRIPTION group
The group to write to.
TYPE: Group
key
The key to write to in the group. Note that absolute paths will be written from the root.
TYPE: str
elem
The element to write. Typically an in-memory object, e.g. an AnnData, pandas dataframe, scipy sparse matrix, etc.
TYPE: Any
overwrite
If True, overwrite the key sub-group (if present); if False and key sub-group exists, raise an error.
TYPE: bool
logger
The logger to use (if unset, use logging.getLogger(None))
TYPE: Optional[Logger] DEFAULT: None
RAISES DESCRIPTION OverwriteNotAllowedError
If overwrite=False and the sub-group already exists.
Source code in fractal_tasks_core/tables/v1.py
def _write_elem_with_overwrite(\n group: zarr.hierarchy.Group,\n key: str,\n elem: Any,\n *,\n overwrite: bool,\n logger: Optional[logging.Logger] = None,\n) -> None:\n\"\"\"\n Wrap `anndata.experimental.write_elem`, to include `overwrite` parameter.\n\n See docs for the original function\n [here](https://anndata.readthedocs.io/en/stable/generated/anndata.experimental.write_elem.html).\n\n This function writes `elem` to the sub-group `key` of `group`. The\n `overwrite`-related expected behavior is:\n\n * if the sub-group does not exist, create it (independently on\n `overwrite`);\n * if the sub-group already exists and `overwrite=True`, overwrite the\n sub-group;\n * if the sub-group already exists and `overwrite=False`, fail.\n\n Note that this version of the wrapper does not include the original\n `dataset_kwargs` parameter.\n\n Args:\n group:\n The group to write to.\n key:\n The key to write to in the group. Note that absolute paths will be\n written from the root.\n elem:\n The element to write. Typically an in-memory object, e.g. an\n AnnData, pandas dataframe, scipy sparse matrix, etc.\n overwrite:\n If `True`, overwrite the `key` sub-group (if present); if `False`\n and `key` sub-group exists, raise an error.\n logger:\n The logger to use (if unset, use `logging.getLogger(None)`)\n\n Raises:\n OverwriteNotAllowedError:\n If `overwrite=False` and the sub-group already exists.\n \"\"\"\n\n # Set logger\n if logger is None:\n logger = logging.getLogger(None)\n\n if key in set(group.group_keys()):\n if not overwrite:\n error_msg = (\n f\"Sub-group '{key}' of group {group.store.path} \"\n f\"already exists, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n write_elem(group, key, elem)\n
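The three `overwrite` cases can be reproduced with a plain dictionary standing in for the Zarr group. This is only a sketch of the control flow: the names `OverwriteNotAllowed` and `write_with_overwrite` are hypothetical, and no Zarr or AnnData call is involved.

```python
from typing import Any


class OverwriteNotAllowed(Exception):
    """Hypothetical stand-in for OverwriteNotAllowedError."""


def write_with_overwrite(
    group: dict[str, Any], key: str, elem: Any, *, overwrite: bool
) -> None:
    """Write `elem` under `key`, refusing to replace it unless `overwrite=True`."""
    if key in group and not overwrite:
        raise OverwriteNotAllowed(
            f"Sub-group '{key}' already exists, but overwrite={overwrite}."
        )
    group[key] = elem


group: dict[str, Any] = {}
write_with_overwrite(group, "FOV_ROI_table", "first", overwrite=False)  # created
write_with_overwrite(group, "FOV_ROI_table", "second", overwrite=True)  # replaced
try:
    write_with_overwrite(group, "FOV_ROI_table", "third", overwrite=False)
    failed = False
except OverwriteNotAllowed:
    failed = True  # an existing key with overwrite=False is rejected
print(group, failed)
```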
Handle multiple options for writing an AnnData table to a zarr group.
1. Create the tables group, if needed.
2. If overwrite=False, check that the new table does not exist (either in zarr attributes or as a zarr sub-group).
3. Call the _write_elem_with_overwrite wrapper with the appropriate overwrite parameter.
4. Update the tables attribute of the image group.
5. Validate table_type and table_attrs according to Fractal table specifications, and raise errors/warnings if needed; then set the appropriate attributes in the new-table Zarr group.
PARAMETER DESCRIPTION image_group
The group to write to.
TYPE: Group
table_name
The name of the new table.
TYPE: str
table
The AnnData table to write.
TYPE: AnnData
overwrite
If False, check that the new table does not exist (either as a zarr sub-group or as part of the zarr-group attributes). In all cases, propagate parameter to _write_elem_with_overwrite, to determine the behavior in case of an existing sub-group named as table_name.
TYPE: bool DEFAULT: False
table_type
type attribute for the table; in case type is also present in table_attrs, this function argument takes priority.
TYPE: Optional[str] DEFAULT: None
table_attrs
If set, overwrite table_group attributes with table_attrs key/value pairs. If table_type is not provided, then table_attrs must include the type key.
TYPE: Optional[dict[str, Any]] DEFAULT: None
RETURNS DESCRIPTION group
Zarr group of the new table.
Source code in fractal_tasks_core/tables/v1.py
def _write_table_v1(\n image_group: zarr.hierarchy.Group,\n table_name: str,\n table: ad.AnnData,\n overwrite: bool = False,\n table_type: Optional[str] = None,\n table_attrs: Optional[dict[str, Any]] = None,\n) -> zarr.group:\n\"\"\"\n Handle multiple options for writing an AnnData table to a zarr group.\n\n 1. Create the `tables` group, if needed.\n 2. If `overwrite=False`, check that the new table does not exist (either in\n zarr attributes or as a zarr sub-group).\n 3. Call the `_write_elem_with_overwrite` wrapper with the appropriate\n `overwrite` parameter.\n 4. Update the `tables` attribute of the image group.\n 5. Validate `table_type` and `table_attrs` according to Fractal table\n specifications, and raise errors/warnings if needed; then set the\n appropriate attributes in the new-table Zarr group.\n\n\n Args:\n image_group:\n The group to write to.\n table_name:\n The name of the new table.\n table:\n The AnnData table to write.\n overwrite:\n If `False`, check that the new table does not exist (either as a\n zarr sub-group or as part of the zarr-group attributes). In all\n cases, propagate parameter to `_write_elem_with_overwrite`, to\n determine the behavior in case of an existing sub-group named as\n `table_name`.\n table_type: `type` attribute for the table; in case `type` is also\n present in `table_attrs`, this function argument takes priority.\n table_attrs:\n If set, overwrite table_group attributes with table_attrs key/value\n pairs. 
If `table_type` is not provided, then `table_attrs` must\n include the `type` key.\n\n Returns:\n Zarr group of the new table.\n \"\"\"\n\n # Create tables group (if needed) and extract current_tables\n if \"tables\" not in set(image_group.group_keys()):\n tables_group = image_group.create_group(\"tables\", overwrite=False)\n else:\n tables_group = image_group[\"tables\"]\n current_tables = tables_group.attrs.asdict().get(\"tables\", [])\n\n # If overwrite=False, check that the new table does not exist (either as a\n # zarr sub-group or as part of the zarr-group attributes)\n if not overwrite:\n if table_name in set(tables_group.group_keys()):\n error_msg = (\n f\"Sub-group '{table_name}' of group {image_group.store.path} \"\n f\"already exists, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n if table_name in current_tables:\n error_msg = (\n f\"Item '{table_name}' already exists in `tables` attribute of \"\n f\"group {image_group.store.path}, but `{overwrite=}`.\\n\"\n \"Hint: try setting `overwrite=True`.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n\n # Always include fractal-roi-table version in table attributes\n if table_attrs is None:\n table_attrs = dict(fractal_table_version=\"1\")\n elif table_attrs.get(\"fractal_table_version\", None) is None:\n table_attrs[\"fractal_table_version\"] = \"1\"\n\n # Set type attribute for the table\n table_type_from_attrs = table_attrs.get(\"type\", None)\n if table_type is not None:\n if table_type_from_attrs is not None:\n logger.warning(\n f\"Setting table type to '{table_type}' (and overriding \"\n f\"'{table_type_from_attrs}' attribute).\"\n )\n table_attrs[\"type\"] = table_type\n else:\n if table_type_from_attrs is None:\n raise ValueError(\n \"Missing attribute `type` for table; this must be provided\"\n \" either via `table_type` or within `table_attrs`.\"\n )\n\n # Prepare/validate 
attributes for the table\n table_type = table_attrs.get(\"type\", None)\n if table_type == \"roi_table\":\n pass\n elif table_type == \"masking_roi_table\":\n try:\n MaskingROITableAttrs(**table_attrs)\n except ValidationError as e:\n error_msg = (\n \"Table attributes do not comply with Fractal \"\n \"`masking_roi_table` specifications V1.\\nOriginal error:\\n\"\n f\"ValidationError: {str(e)}\"\n )\n logger.error(error_msg)\n raise ValueError(error_msg)\n elif table_type == \"feature_table\":\n try:\n FeatureTableAttrs(**table_attrs)\n except ValidationError as e:\n error_msg = (\n \"Table attributes do not comply with Fractal \"\n \"`feature_table` specifications V1.\\nOriginal error:\\n\"\n f\"ValidationError: {str(e)}\"\n )\n logger.error(error_msg)\n raise ValueError(error_msg)\n else:\n logger.warning(f\"Unknown table type `{table_type}`.\")\n\n # If it's all OK, proceed and write the table\n _write_elem_with_overwrite(\n tables_group,\n table_name,\n table,\n overwrite=overwrite,\n )\n table_group = tables_group[table_name]\n\n # Update the `tables` metadata of the image group, if needed\n if table_name not in current_tables:\n new_tables = current_tables + [table_name]\n tables_group.attrs[\"tables\"] = new_tables\n\n # Update table_group attributes with table_attrs key/value pairs\n table_group.attrs.update(**table_attrs)\n\n return table_group\n
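The precedence between `table_type` and `table_attrs` described above can be sketched in isolation. The helper below is hypothetical (`resolve_table_attrs` is not part of fractal-tasks-core); it only mirrors the attribute handling in the listing, without touching Zarr, logging, or the Pydantic validation step.

```python
from typing import Any, Optional


def resolve_table_attrs(
    table_type: Optional[str] = None,
    table_attrs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper mirroring the attribute handling shown above."""
    # Work on a copy so the caller's dict is left untouched
    # (the listing above updates `table_attrs` in place).
    attrs = dict(table_attrs or {})
    # Always record the table-specification version.
    if attrs.get("fractal_table_version") is None:
        attrs["fractal_table_version"] = "1"
    if table_type is not None:
        # The explicit argument takes priority over a "type" key in table_attrs.
        attrs["type"] = table_type
    elif attrs.get("type") is None:
        raise ValueError(
            "Missing `type`: provide it via `table_type` or within `table_attrs`."
        )
    return attrs


only_type = resolve_table_attrs("roi_table")
overridden = resolve_table_attrs("roi_table", {"type": "feature_table"})
```

Calling the helper with neither a `table_type` nor a `type` key raises `ValueError`, which matches the behavior documented for `table_attrs`.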
Applies pre-calculated registration to ROI tables.
Apply pre-calculated registration such that the resulting ROIs contain the consensus aligned region across all cycles.
Parallelization level: well
PARAMETER DESCRIPTION input_paths
List of input paths where the image data is stored as OME-Zarrs. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: [\"/some/path/\"]. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
component
Path to the OME-Zarr image in the OME-Zarr plate that is processed. Example: \"some_plate.zarr/B/03/0\". (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
roi_table
Name of the ROI table over which the task loops to calculate the registration. Examples: FOV_ROI_table => loop over the fields of view, well_ROI_table => process the whole well as one image.
TYPE: str DEFAULT: 'FOV_ROI_table'
reference_cycle
Which cycle to register against. Defaults to 0, which is the first OME-Zarr image in the well (usually the first cycle that was provided).
TYPE: int DEFAULT: 0
new_roi_table
Optional name for the new, registered ROI table. If no name is given, it defaults to \"registered_\" + roi_table.
TYPE: Optional[str] DEFAULT: None
Source code in fractal_tasks_core/tasks/apply_registration_to_ROI_tables.py
@validate_arguments\ndef apply_registration_to_ROI_tables(\n *,\n # Fractal arguments\n input_paths: Sequence[str],\n output_path: str,\n component: str,\n metadata: dict[str, Any],\n # Task-specific arguments\n roi_table: str = \"FOV_ROI_table\",\n reference_cycle: int = 0,\n new_roi_table: Optional[str] = None,\n) -> dict[str, Any]:\n\"\"\"\n Applies pre-calculated registration to ROI tables.\n\n Apply pre-calculated registration such that resulting ROIs contain\n the consensus align region between all cycles.\n\n Parallelization level: well\n\n Args:\n input_paths: List of input paths where the image data is stored as\n OME-Zarrs. Should point to the parent folder containing one or many\n OME-Zarr files, not the actual OME-Zarr file. Example:\n `[\"/some/path/\"]`. This task only supports a single input path.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n component: Path to the OME-Zarr image in the OME-Zarr plate that is\n processed. Example: `\"some_plate.zarr/B/03/0\"`.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n roi_table: Name of the ROI table over which the task loops to\n calculate the registration. Examples: `FOV_ROI_table` => loop over\n the field of views, `well_ROI_table` => process the whole well as\n one image.\n reference_cycle: Which cycle to register against. Defaults to 0,\n which is the first OME-Zarr image in the well, usually the first\n cycle that was provided\n new_roi_table: Optional name for the new, registered ROI table. If no\n name is given, it will default to \"registered_\" + `roi_table`\n\n \"\"\"\n if not new_roi_table:\n new_roi_table = \"registered_\" + roi_table\n logger.info(\n f\"Running for {input_paths=}, {component=}. 
\\n\"\n f\"Applying translation registration to {roi_table=} and storing it as \"\n f\"{new_roi_table=}.\"\n )\n\n well_zarr = f\"{input_paths[0]}/{component}\"\n ngff_well_meta = load_NgffWellMeta(well_zarr)\n acquisition_dict = ngff_well_meta.get_acquisition_paths()\n logger.info(\n \"Calculating common registration for the following cycles: \"\n f\"{acquisition_dict}\"\n )\n\n # TODO: Allow a filter on which acquisitions should get processed?\n\n # Collect all the ROI tables\n roi_tables = {}\n roi_tables_attrs = {}\n for acq in acquisition_dict.keys():\n acq_path = acquisition_dict[acq]\n curr_ROI_table = ad.read_zarr(\n f\"{well_zarr}/{acq_path}/tables/{roi_table}\"\n )\n curr_ROI_table_group = zarr.open_group(\n f\"{well_zarr}/{acq_path}/tables/{roi_table}\", mode=\"r\"\n )\n curr_ROI_table_attrs = curr_ROI_table_group.attrs.asdict()\n\n # For reference_cycle acquisition, handle the fact that it doesn't\n # have the shifts\n if acq == reference_cycle:\n curr_ROI_table = add_zero_translation_columns(curr_ROI_table)\n # Check for valid ROI tables\n are_ROI_table_columns_valid(table=curr_ROI_table)\n translation_columns = [\n \"translation_z\",\n \"translation_y\",\n \"translation_x\",\n ]\n if curr_ROI_table.var.index.isin(translation_columns).sum() != 3:\n raise ValueError(\n f\"Cycle {acq}'s {roi_table} does not contain the \"\n f\"translation columns {translation_columns} necessary to use \"\n \"this task.\"\n )\n roi_tables[acq] = curr_ROI_table\n roi_tables_attrs[acq] = curr_ROI_table_attrs\n\n # Check that all acquisitions have the same ROIs\n rois = roi_tables[reference_cycle].obs.index\n for acq, acq_roi_table in roi_tables.items():\n if not (acq_roi_table.obs.index == rois).all():\n raise ValueError(\n f\"Acquisition {acq} does not contain the same ROIs as the \"\n f\"reference acquisition {reference_cycle}:\\n\"\n f\"{acq}: {acq_roi_table.obs.index}\\n\"\n f\"{reference_cycle}: {rois}\"\n )\n\n roi_table_dfs = [\n roi_table.to_df().loc[:, 
translation_columns]\n for roi_table in roi_tables.values()\n ]\n logger.info(\"Calculating min & max translation across cycles.\")\n max_df, min_df = calculate_min_max_across_dfs(roi_table_dfs)\n shifted_rois = {}\n # Loop over acquisitions\n for acq in acquisition_dict.keys():\n shifted_rois[acq] = apply_registration_to_single_ROI_table(\n roi_tables[acq], max_df, min_df\n )\n\n # TODO: Drop translation columns from this table?\n\n logger.info(\n f\"Write the registered ROI table {new_roi_table} for {acq=}\"\n )\n # Save the shifted ROI table as a new table\n image_group = zarr.group(f\"{well_zarr}/{acq}\")\n write_table(\n image_group,\n new_roi_table,\n shifted_rois[acq],\n table_attrs=roi_tables_attrs[acq],\n )\n\n # TODO: Optionally apply registration to other tables as well?\n # e.g. to well_ROI_table based on FOV_ROI_table\n # => out of scope for the initial task, apply registration separately\n # to each table\n # Easiest implementation: Apply average shift calculated here to other\n # ROIs. From many to 1 (e.g. FOV => well) => average shift, but crop len\n # From well to many (e.g. well to FOVs) => average shift, crop len by that\n # amount\n # Many to many (FOVs to organoids) => tricky because of matching\n\n return {}\n
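The "min & max translation across cycles" step can be pictured with plain dictionaries. This is a minimal sketch under stated assumptions: `min_max_across_cycles` is a hypothetical stand-in, not the library's `calculate_min_max_across_dfs` (whose implementation is not shown here), and each cycle's table is reduced to a `{roi_name: {column: shift}}` mapping instead of a DataFrame.

```python
# Column names taken from the listing above.
AXES = ("translation_z", "translation_y", "translation_x")


def min_max_across_cycles(tables):
    """Hypothetical stand-in: per-ROI, per-axis max and min over all cycles.

    `tables` is a list with one {roi_name: {column: shift}} dict per cycle;
    all cycles are assumed to contain the same ROIs.
    """
    rois = list(tables[0])
    max_shift = {
        roi: {ax: max(t[roi][ax] for t in tables) for ax in AXES}
        for roi in rois
    }
    min_shift = {
        roi: {ax: min(t[roi][ax] for t in tables) for ax in AXES}
        for roi in rois
    }
    return max_shift, min_shift


# The reference cycle carries zero shifts; cycle 1 carries measured shifts.
cycle0 = {"FOV_1": dict.fromkeys(AXES, 0.0), "FOV_2": dict.fromkeys(AXES, 0.0)}
cycle1 = {
    "FOV_1": {"translation_z": 0.0, "translation_y": 1.5, "translation_x": -2.0},
    "FOV_2": {"translation_z": 0.0, "translation_y": -0.5, "translation_x": 3.0},
}
max_shift, min_shift = min_max_across_cycles([cycle0, cycle1])
```

Because the reference cycle contributes zeros, the per-ROI maximum is never negative and the minimum is never positive.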
Applies the registration to a ROI table.
Calculates the new position as: p = position + max(shift, 0) - own_shift
Calculates the new len as: l = len - max(shift, 0) + min(shift, 0)
PARAMETER DESCRIPTION roi_table
AnnData table which contains a Fractal ROI table. Rows are ROIs
TYPE: AnnData
max_df
Max translation shift in z, y, x for each ROI. Rows are ROIs, columns are translation_z, translation_y, translation_x
TYPE: DataFrame
min_df
Min translation shift in z, y, x for each ROI. Rows are ROIs, columns are translation_z, translation_y, translation_x
TYPE: DataFrame
Returns: ROI table where all ROIs are registered to the smallest common area across all cycles.
Source code in fractal_tasks_core/tasks/apply_registration_to_ROI_tables.py
def apply_registration_to_single_ROI_table(\n roi_table: ad.AnnData,\n max_df: pd.DataFrame,\n min_df: pd.DataFrame,\n) -> ad.AnnData:\n\"\"\"\n Applies the registration to a ROI table\n\n Calculates the new position as: p = position + max(shift, 0) - own_shift\n Calculates the new len as: l = len - max(shift, 0) + min(shift, 0)\n\n Args:\n roi_table: AnnData table which contains a Fractal ROI table.\n Rows are ROIs\n max_df: Max translation shift in z, y, x for each ROI. Rows are ROIs,\n columns are translation_z, translation_y, translation_x\n min_df: Min translation shift in z, y, x for each ROI. Rows are ROIs,\n columns are translation_z, translation_y, translation_x\n Returns:\n ROI table where all ROIs are registered to the smallest common area\n across all cycles.\n \"\"\"\n roi_table = copy.deepcopy(roi_table)\n rois = roi_table.obs.index\n # Reject the inputs as soon as any ROI label differs (not only if all do)\n if (rois != max_df.index).any() or (rois != min_df.index).any():\n raise ValueError(\n \"ROI table and max & min translation need to contain the same \"\n f\"ROIS, but they were {rois=}, {max_df.index=}, {min_df.index=}\"\n )\n\n for roi in rois:\n roi_table[[roi], [\"z_micrometer\"]] = (\n roi_table[[roi], [\"z_micrometer\"]].X\n + float(max_df.loc[roi, \"translation_z\"])\n - roi_table[[roi], [\"translation_z\"]].X\n )\n roi_table[[roi], [\"y_micrometer\"]] = (\n roi_table[[roi], [\"y_micrometer\"]].X\n + float(max_df.loc[roi, \"translation_y\"])\n - roi_table[[roi], [\"translation_y\"]].X\n )\n roi_table[[roi], [\"x_micrometer\"]] = (\n roi_table[[roi], [\"x_micrometer\"]].X\n + float(max_df.loc[roi, \"translation_x\"])\n - roi_table[[roi], [\"translation_x\"]].X\n )\n # This calculation only works if all ROIs are the same size initially!\n roi_table[[roi], [\"len_z_micrometer\"]] = (\n roi_table[[roi], [\"len_z_micrometer\"]].X\n - float(max_df.loc[roi, \"translation_z\"])\n + float(min_df.loc[roi, \"translation_z\"])\n )\n roi_table[[roi], [\"len_y_micrometer\"]] = (\n roi_table[[roi], [\"len_y_micrometer\"]].X\n - 
float(max_df.loc[roi, \"translation_y\"])\n + float(min_df.loc[roi, \"translation_y\"])\n )\n roi_table[[roi], [\"len_x_micrometer\"]] = (\n roi_table[[roi], [\"len_x_micrometer\"]].X\n - float(max_df.loc[roi, \"translation_x\"])\n + float(min_df.loc[roi, \"translation_x\"])\n )\n return roi_table\n
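A worked one-axis example of the position and length formulas may help. The function name and the numbers below are illustrative assumptions; only the two formulas come from the docstring. Three cycles are assumed, with shifts of 0 (reference), +2 and -3 micrometers along one axis, for a ROI that starts at 100 and is 50 long.

```python
def register_axis(position, length, own_shift, shifts_all_cycles):
    """Hypothetical one-axis version of the position/len update above."""
    # The reference cycle has shift 0, so max >= 0 and min <= 0.
    max_shift = max(shifts_all_cycles)
    min_shift = min(shifts_all_cycles)
    new_position = position + max_shift - own_shift
    new_length = length - max_shift + min_shift
    return new_position, new_length


# Shift per cycle along one axis, in micrometers (illustrative values).
shifts = {0: 0.0, 1: 2.0, 2: -3.0}
registered = {
    cycle: register_axis(100.0, 50.0, own_shift, list(shifts.values()))
    for cycle, own_shift in shifts.items()
}
# registered == {0: (102.0, 45.0), 1: (100.0, 45.0), 2: (105.0, 45.0)}
```

Every cycle ends up with the same length (50 - 2 - 3 = 45), while the start position differs per cycle by that cycle's own shift. This is also why the listing notes that the length calculation only works if all ROIs have the same size initially.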
Apply registration to images by using a registered ROI table.
This task consists of 4 parts:
1. Mask all regions in images that are not available in the registered ROI table and store each cycle aligned to the reference_cycle (by looping over ROIs).
2. Do the same for all label images.
3. Copy all tables from the non-aligned image to the aligned image. This currently only works well if the only tables are well & FOV ROI tables (registered and original); it is not implemented for measurement tables and other ROI tables.
4. Clean up: Delete the old, non-aligned image and rename the new, aligned image to take over its place.
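The clean-up step (part 4) amounts to a directory swap. The sketch below reproduces the rename/rename/remove sequence on throwaway folders; `replace_directory` and the marker files are hypothetical and only stand in for the Zarr image folders. Note that the sequence is not atomic, so other processes reading the original path during the swap can briefly fail to find it.

```python
import os
import shutil
import tempfile
from pathlib import Path


def replace_directory(old: Path, new: Path) -> None:
    """Hypothetical helper: let `new` take over the path of `old`."""
    backup = old.with_name(old.name + "_tmp")
    os.rename(old, backup)  # 1. move the original out of the way
    os.rename(new, old)     # 2. the new folder takes over the original path
    shutil.rmtree(backup)   # 3. delete the original data


with tempfile.TemporaryDirectory() as root:
    old = Path(root) / "1"
    new = Path(root) / "1_registered"
    for folder, text in ((old, "original"), (new, "registered")):
        folder.mkdir()
        (folder / "marker.txt").write_text(text)

    replace_directory(old, new)

    content = (old / "marker.txt").read_text()
    leftovers = sorted(p.name for p in Path(root).iterdir())
```

After the swap, only the folder named `1` remains and it holds the registered content.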
Parallelization level: image
PARAMETER DESCRIPTION input_paths
List of input paths where the image data is stored as OME-Zarrs. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: [\"/some/path/\"]. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
component
Path to the OME-Zarr image in the OME-Zarr plate that is processed. Example: \"some_plate.zarr/B/03/0\". (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
registered_roi_table
Name of the ROI table which has been registered and will be applied to mask and shift the images. Examples: registered_FOV_ROI_table => loop over the fields of view, registered_well_ROI_table => process the whole well as one image.
TYPE: str
reference_cycle
Which cycle to register against. Defaults to 0, which is the first OME-Zarr image in the well (usually the first cycle that was provided).
TYPE: str DEFAULT: '0'
overwrite_input
Whether the old image data should be replaced with the newly registered image data. Currently only implemented for overwrite_input=True.
TYPE: bool DEFAULT: True
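The task derives two sibling paths from `component`: the temporary output image (suffix `_registered`) and the reference-cycle image. A small sketch of that string manipulation follows; the function name `derive_components` is hypothetical, while the logic follows the source listing of this task.

```python
def derive_components(component: str, reference_cycle: str = "0") -> tuple[str, str]:
    """Hypothetical helper returning (new_component, reference_component)."""
    parts = component.split("/")
    # Same well, image folder renamed with a "_registered" suffix.
    new_component = "/".join(parts[:-1] + [parts[-1] + "_registered"])
    # Same well, image folder replaced by the reference cycle.
    reference_component = "/".join(parts[:-1] + [reference_cycle])
    return new_component, reference_component


new_component, reference_component = derive_components("some_plate.zarr/B/03/1")
# new_component == "some_plate.zarr/B/03/1_registered"
# reference_component == "some_plate.zarr/B/03/0"
```

Since `reference_cycle` is joined into a path, it is a string here (default `'0'`), unlike the integer `reference_cycle` of the other two registration tasks.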
Source code in fractal_tasks_core/tasks/apply_registration_to_image.py
@validate_arguments\ndef apply_registration_to_image(\n *,\n # Fractal arguments\n input_paths: Sequence[str],\n output_path: str,\n component: str,\n metadata: dict[str, Any],\n # Task-specific arguments\n registered_roi_table: str,\n reference_cycle: str = \"0\",\n overwrite_input: bool = True,\n):\n\"\"\"\n Apply registration to images by using a registered ROI table\n\n This task consists of 4 parts:\n\n 1. Mask all regions in images that are not available in the\n registered ROI table and store each cycle aligned to the\n reference_cycle (by looping over ROIs).\n 2. Do the same for all label images.\n 3. Copy all tables from the non-aligned image to the aligned image\n (currently only works well if the only tables are well & FOV ROI tables\n (registered and original). Not implemented for measurement tables and\n other ROI tables).\n 4. Clean up: Delete the old, non-aligned image and rename the new,\n aligned image to take over its place.\n\n Parallelization level: image\n\n Args:\n input_paths: List of input paths where the image data is stored as\n OME-Zarrs. Should point to the parent folder containing one or many\n OME-Zarr files, not the actual OME-Zarr file. Example:\n `[\"/some/path/\"]`. This task only supports a single input path.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n component: Path to the OME-Zarr image in the OME-Zarr plate that is\n processed. 
Example: `\"some_plate.zarr/B/03/0\"`.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n registered_roi_table: Name of the ROI table which has been registered\n and will be applied to mask and shift the images.\n Examples: `registered_FOV_ROI_table` => loop over the field of\n views, `registered_well_ROI_table` => process the whole well as\n one image.\n reference_cycle: Which cycle to register against. Defaults to 0,\n which is the first OME-Zarr image in the well, usually the first\n cycle that was provided\n overwrite_input: Whether the old image data should be replaced with the\n newly registered image data. Currently only implemented for\n `overwrite_input=True`.\n\n \"\"\"\n logger.info(component)\n if not overwrite_input:\n raise NotImplementedError(\n \"This task is only implemented for the overwrite_input version\"\n )\n logger.info(\n f\"Running `apply_registration_to_image` on {input_paths=}, \"\n f\"{component=}, {registered_roi_table=} and {reference_cycle=}. 
\"\n f\"Using {overwrite_input=}\"\n )\n\n input_path = Path(input_paths[0])\n new_component = \"/\".join(\n component.split(\"/\")[:-1] + [component.split(\"/\")[-1] + \"_registered\"]\n )\n reference_component = \"/\".join(\n component.split(\"/\")[:-1] + [reference_cycle]\n )\n\n ROI_table_ref = ad.read_zarr(\n f\"{input_path / reference_component}/tables/{registered_roi_table}\"\n )\n ROI_table_cycle = ad.read_zarr(\n f\"{input_path / component}/tables/{registered_roi_table}\"\n )\n\n ngff_image_meta = load_NgffImageMeta(str(input_path / component))\n coarsening_xy = ngff_image_meta.coarsening_xy\n num_levels = ngff_image_meta.num_levels\n\n ####################\n # Process images\n ####################\n logger.info(\"Write the registered Zarr image to disk\")\n write_registered_zarr(\n input_path=input_path,\n component=component,\n new_component=new_component,\n ROI_table=ROI_table_cycle,\n ROI_table_ref=ROI_table_ref,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n aggregation_function=np.mean,\n )\n\n ####################\n # Process labels\n ####################\n try:\n labels_group = zarr.open_group(f\"{input_path / component}/labels\", \"r\")\n label_list = labels_group.attrs[\"labels\"]\n except (zarr.errors.GroupNotFoundError, KeyError):\n label_list = []\n\n if label_list:\n logger.info(f\"Processing the label images: {label_list}\")\n labels_group = zarr.group(f\"{input_path / new_component}/labels\")\n labels_group.attrs[\"labels\"] = label_list\n\n for label in label_list:\n label_component = f\"{component}/labels/{label}\"\n label_component_new = f\"{new_component}/labels/{label}\"\n write_registered_zarr(\n input_path=input_path,\n component=label_component,\n new_component=label_component_new,\n ROI_table=ROI_table_cycle,\n ROI_table_ref=ROI_table_ref,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n aggregation_function=np.max,\n )\n\n ####################\n # Copy tables\n # 1. 
Copy all standard ROI tables from cycle 0.\n # 2. Copy all tables that aren't standard ROI tables from the given cycle\n ####################\n table_dict_reference = get_table_path_dict(input_path, reference_component)\n table_dict_component = get_table_path_dict(input_path, component)\n\n table_dict = {}\n # Define which table should get copied:\n for table in table_dict_reference:\n if is_standard_roi_table(table):\n table_dict[table] = table_dict_reference[table]\n for table in table_dict_component:\n if not is_standard_roi_table(table):\n if reference_component != component:\n logger.warning(\n f\"{component} contained a table that is not a standard \"\n \"ROI table. The `Apply Registration To Image task` is \"\n \"best used before additional tables are generated. It \"\n f\"will copy the {table} from this cycle without applying \"\n f\"any transformations. This will work well if {table} \"\n f\"contains measurements. But if {table} is a custom ROI \"\n \"table coming from another task, the transformation is \"\n \"not applied and it will not match with the registered \"\n \"image anymore\"\n )\n table_dict[table] = table_dict_component[table]\n\n if table_dict:\n logger.info(f\"Processing the tables: {table_dict}\")\n new_image_group = zarr.group(f\"{input_path / new_component}\")\n\n for table in table_dict.keys():\n logger.info(f\"Copying table: {table}\")\n # Get the relevant metadata of the Zarr table & add it\n # See issue #516 for the need for this workaround\n max_retries = 20\n sleep_time = 5\n current_round = 0\n while current_round < max_retries:\n try:\n old_table_group = zarr.open_group(\n table_dict[table], mode=\"r\"\n )\n current_round = max_retries\n except zarr.errors.GroupNotFoundError:\n logger.debug(\n f\"Table {table} not found in attempt {current_round}. 
\"\n f\"Waiting {sleep_time} seconds before trying again.\"\n )\n current_round += 1\n time.sleep(sleep_time)\n # Write the Zarr table\n curr_table = ad.read_zarr(table_dict[table])\n write_table(\n new_image_group,\n table,\n curr_table,\n table_attrs=old_table_group.attrs.asdict(),\n )\n\n ####################\n # Clean up Zarr file\n ####################\n if overwrite_input:\n logger.info(\n \"Replace original zarr image with the newly created Zarr image\"\n )\n # Potential for race conditions: Every cycle reads the\n # reference cycle, but the reference cycle also gets modified\n # See issue #516 for the details\n os.rename(f\"{input_path / component}\", f\"{input_path / component}_tmp\")\n os.rename(f\"{input_path / new_component}\", f\"{input_path / component}\")\n shutil.rmtree(f\"{input_path / component}_tmp\")\n else:\n raise NotImplementedError\n
This function loads the image or label data from a zarr array based on the ROI bounding-box coordinates and stores them into a new zarr array. The new Zarr array has the same shape as the original array, but will have 0s where the ROI tables don't specify loading of the image data. The ROIs loaded from list_indices will be written into the list_indices_ref position, thus performing translational registration if the two lists of ROI indices vary.
PARAMETER DESCRIPTION input_path
Base folder where the Zarr is stored (does not contain the Zarr file itself)
TYPE: Path
component
Path to the OME-Zarr image that is processed. For example: \"20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/1\"
TYPE: str
new_component
Path to the new Zarr image that will be written (also in the input_path folder). For example: \"20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/1_registered\"
TYPE: str
ROI_table
Fractal ROI table for the component
TYPE: AnnData
ROI_table_ref
Fractal ROI table for the reference cycle
TYPE: AnnData
num_levels
Number of pyramid layers to be created (argument of build_pyramid).
TYPE: int
coarsening_xy
Coarsening factor between pyramid levels
TYPE: int DEFAULT: 2
aggregation_function
Function to be used when downsampling (argument of build_pyramid).
TYPE: Callable DEFAULT: mean
Source code in fractal_tasks_core/tasks/apply_registration_to_image.py
def write_registered_zarr(\n input_path: Path,\n component: str,\n new_component: str,\n ROI_table: ad.AnnData,\n ROI_table_ref: ad.AnnData,\n num_levels: int,\n coarsening_xy: int = 2,\n aggregation_function: Callable = np.mean,\n):\n\"\"\"\n Write registered zarr array based on ROI tables\n\n This function loads the image or label data from a zarr array based on the\n ROI bounding-box coordinates and stores them into a new zarr array.\n The new Zarr array has the same shape as the original array, but will have\n 0s where the ROI tables don't specify loading of the image data.\n The ROIs loaded from `list_indices` will be written into the\n `list_indices_ref` position, thus performing translational registration if\n the two lists of ROI indices vary.\n\n Args:\n input_path: Base folder where the Zarr is stored\n (does not contain the Zarr file itself)\n component: Path to the OME-Zarr image that is processed. For example:\n `\"20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/1\"`\n new_component: Path to the new Zarr image that will be written\n (also in the input_path folder). 
For example:\n `\"20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/1_registered\"`\n ROI_table: Fractal ROI table for the component\n ROI_table_ref: Fractal ROI table for the reference cycle\n num_levels: Number of pyramid layers to be created (argument of\n `build_pyramid`).\n coarsening_xy: Coarsening factor between pyramid levels\n aggregation_function: Function to be used when downsampling (argument\n of `build_pyramid`).\n\n \"\"\"\n # Read pixel sizes from Zarr attributes\n ngff_image_meta = load_NgffImageMeta(str(input_path / component))\n pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n\n # Create list of indices for 3D ROIs\n list_indices = convert_ROI_table_to_indices(\n ROI_table,\n level=0,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n list_indices_ref = convert_ROI_table_to_indices(\n ROI_table_ref,\n level=0,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n\n old_image_group = zarr.open_group(f\"{input_path / component}\", mode=\"r\")\n old_ngff_image_meta = load_NgffImageMeta(str(input_path / component))\n new_image_group = zarr.group(f\"{input_path / new_component}\")\n new_image_group.attrs.put(old_image_group.attrs.asdict())\n\n # Loop over all channels. For each channel, write full-res image data.\n data_array = da.from_zarr(old_image_group[\"0\"])\n # Create dask array with 0s of same shape\n new_array = da.zeros_like(data_array)\n\n # TODO: Add sanity checks on the 2 ROI tables:\n # 1. The number of ROIs need to match\n # 2. 
The size of the ROIs need to match\n # (otherwise, we can't assign them to the reference regions)\n # ROI_table_ref vs ROI_table_cycle\n for i, roi_indices in enumerate(list_indices):\n reference_region = convert_indices_to_regions(list_indices_ref[i])\n region = convert_indices_to_regions(roi_indices)\n\n axes_list = old_ngff_image_meta.axes_names\n\n if axes_list == [\"c\", \"z\", \"y\", \"x\"]:\n num_channels = data_array.shape[0]\n # Loop over channels\n for ind_ch in range(num_channels):\n idx = tuple(\n [slice(ind_ch, ind_ch + 1)] + list(reference_region)\n )\n new_array[idx] = load_region(\n data_zyx=data_array[ind_ch], region=region, compute=False\n )\n elif axes_list == [\"z\", \"y\", \"x\"]:\n new_array[reference_region] = load_region(\n data_zyx=data_array, region=region, compute=False\n )\n elif axes_list == [\"c\", \"y\", \"x\"]:\n # TODO: Implement cyx case (based on looping over xy case)\n raise NotImplementedError(\n \"`write_registered_zarr` has not been implemented for \"\n f\"a zarr with {axes_list=}\"\n )\n elif axes_list == [\"y\", \"x\"]:\n # TODO: Implement yx case\n raise NotImplementedError(\n \"`write_registered_zarr` has not been implemented for \"\n f\"a zarr with {axes_list=}\"\n )\n else:\n raise NotImplementedError(\n \"`write_registered_zarr` has not been implemented for \"\n f\"a zarr with {axes_list=}\"\n )\n\n new_array.to_zarr(\n f\"{input_path / new_component}/0\",\n overwrite=True,\n dimension_separator=\"/\",\n write_empty_chunks=False,\n )\n\n # Starting from on-disk highest-resolution data, build and write to\n # disk a pyramid of coarser levels\n build_pyramid(\n zarrurl=f\"{input_path / new_component}\",\n overwrite=True,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=data_array.chunksize,\n aggregation_function=aggregation_function,\n )\n
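The core idea, reading a region at the cycle's ROI position and writing it at the reference ROI position of a zero-filled array, can be shown on a tiny 2D example. This is a sketch only: plain nested lists stand in for the dask/Zarr arrays, and `copy_region` is a hypothetical helper, not a function of the library.

```python
def copy_region(src, dst, region, reference_region):
    """Hypothetical helper: copy `region` of src into `reference_region` of dst.

    Regions are ((y_start, y_end), (x_start, x_end)) with exclusive ends.
    """
    (y0, y1), (x0, x1) = region
    (ry0, ry1), (rx0, rx1) = reference_region
    # Both regions must have the same size, otherwise they cannot be matched.
    if (y1 - y0, x1 - x0) != (ry1 - ry0, rx1 - rx0):
        raise ValueError("Region sizes differ.")
    for dy in range(y1 - y0):
        dst[ry0 + dy][rx0:rx1] = src[y0 + dy][x0:x1]


# 4x4 source image with values 1..16, and a zero-filled target of equal shape.
src = [[4 * row + col + 1 for col in range(4)] for row in range(4)]
dst = [[0] * 4 for _ in range(4)]

# The ROI sits at rows/cols 1..2 in this cycle, but at rows/cols 0..1 in the
# reference cycle, so the data moves up and left by one pixel.
copy_region(src, dst, region=((1, 3), (1, 3)), reference_region=((0, 2), (0, 2)))
# dst == [[6, 7, 0, 0], [10, 11, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```

Pixels outside the written region stay 0, which corresponds to the masking described in the docstring.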
Calculate registration based on images.
This task consists of 3 parts:
1. Loading the images of a given ROI (=> loop over ROIs)
2. Calculating the transformation for that ROI
3. Storing the calculated transformation in the ROI table
Parallelization level: image
PARAMETER DESCRIPTION input_paths
List of input paths where the image data is stored as OME-Zarrs. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: [\"/some/path/\"]. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
component
Path to the OME-Zarr image in the OME-Zarr plate that is processed. Example: \"some_plate.zarr/B/03/0\". (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
wavelength_id
Wavelength that will be used for image-based registration; e.g. A01_C01 for Yokogawa, C01 for MD.
TYPE: str
roi_table
Name of the ROI table over which the task loops to calculate the registration. Examples: FOV_ROI_table => loop over the fields of view, well_ROI_table => process the whole well as one image.
TYPE: str DEFAULT: 'FOV_ROI_table'
reference_cycle
Which cycle to register against. Defaults to 0, which is the first OME-Zarr image in the well (usually the first cycle that was provided).
TYPE: int DEFAULT: 0
level
Pyramid level of the image to be used for registration. Choose 0 to process at full resolution.
TYPE: int DEFAULT: 2
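Choosing a `level` above 0 means that pixel shifts are measured on a coarser grid. The sketch below shows how a pixel shift at a given level could be converted to physical units. It is not taken from the task's source: the function name is hypothetical, and it assumes that each pyramid level coarsens only y and x by a factor of `coarsening_xy` while z keeps its full-resolution spacing.

```python
def shift_pixels_to_micrometers(
    shift_zyx_px, full_res_pxl_sizes_zyx, level, coarsening_xy=2
):
    """Hypothetical conversion of a (z, y, x) pixel shift to micrometers."""
    # Assumption: only y and x are downsampled between pyramid levels.
    scale_xy = coarsening_xy**level
    size_z, size_y, size_x = full_res_pxl_sizes_zyx
    level_sizes = (size_z, size_y * scale_xy, size_x * scale_xy)
    return tuple(s * p for s, p in zip(shift_zyx_px, level_sizes))


# A shift of (0, 3, -2) pixels measured at level 2, for full-resolution
# pixel sizes of 1.0 x 0.5 x 0.5 micrometers (illustrative values).
shift_um = shift_pixels_to_micrometers((0, 3, -2), (1.0, 0.5, 0.5), level=2)
# shift_um == (0.0, 6.0, -4.0)
```

Under these assumptions, one pixel at level 2 covers 4 full-resolution pixels in y and x, so higher levels trade registration precision for speed.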
Source code in fractal_tasks_core/tasks/calculate_registration_image_based.py
@validate_arguments\ndef calculate_registration_image_based(\n *,\n # Fractal arguments\n input_paths: Sequence[str],\n output_path: str,\n component: str,\n metadata: dict[str, Any],\n # Task-specific arguments\n wavelength_id: str,\n roi_table: str = \"FOV_ROI_table\",\n reference_cycle: int = 0,\n level: int = 2,\n) -> dict[str, Any]:\n\"\"\"\n Calculate registration based on images\n\n This task consists of 3 parts:\n\n 1. Loading the images of a given ROI (=> loop over ROIs)\n 2. Calculating the transformation for that ROI\n 3. Storing the calculated transformation in the ROI table\n\n Parallelization level: image\n\n Args:\n input_paths: List of input paths where the image data is stored as\n OME-Zarrs. Should point to the parent folder containing one or many\n OME-Zarr files, not the actual OME-Zarr file. Example:\n `[\"/some/path/\"]`. This task only supports a single input path.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n component: Path to the OME-Zarr image in the OME-Zarr plate that is\n processed. Example: `\"some_plate.zarr/B/03/0\"`.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n wavelength_id: Wavelength that will be used for image-based\n registration; e.g. `A01_C01` for Yokogawa, `C01` for MD.\n roi_table: Name of the ROI table over which the task loops to\n calculate the registration. Examples: `FOV_ROI_table` => loop over\n the field of views, `well_ROI_table` => process the whole well as\n one image.\n reference_cycle: Which cycle to register against. Defaults to 0,\n which is the first OME-Zarr image in the well (usually the first\n cycle that was provided).\n level: Pyramid level of the image to be segmented. 
Choose `0` to\n process at full resolution.\n\n \"\"\"\n logger.info(\n f\"Running for {input_paths=}, {component=}. \\n\"\n f\"Calculating translation registration per {roi_table=} for \"\n f\"{wavelength_id=}.\"\n )\n # Set OME-Zarr paths\n zarr_img_cycle_x = Path(input_paths[0]) / component\n\n # If the task is run for the reference cycle, exit\n # TODO: Improve the input for this: Can we filter components to not\n # run for itself?\n alignment_cycle = zarr_img_cycle_x.name\n if alignment_cycle == str(reference_cycle):\n logger.info(\n \"Calculate registration image-based is running for \"\n f\"cycle {alignment_cycle}, which is the reference_cycle.\"\n \"Thus, exiting the task.\"\n )\n return {}\n else:\n logger.info(\n \"Calculate registration image-based is running for \"\n f\"cycle {alignment_cycle}\"\n )\n\n zarr_img_ref_cycle = zarr_img_cycle_x.parent / str(reference_cycle)\n\n # Read some parameters from Zarr metadata\n ngff_image_meta = load_NgffImageMeta(str(zarr_img_ref_cycle))\n coarsening_xy = ngff_image_meta.coarsening_xy\n\n # Get channel_index via wavelength_id.\n # Intially only allow registration of the same wavelength\n channel_ref: OmeroChannel = get_channel_from_image_zarr(\n image_zarr_path=str(zarr_img_ref_cycle),\n wavelength_id=wavelength_id,\n )\n channel_index_ref = channel_ref.index\n\n channel_align: OmeroChannel = get_channel_from_image_zarr(\n image_zarr_path=str(zarr_img_cycle_x),\n wavelength_id=wavelength_id,\n )\n channel_index_align = channel_align.index\n\n # Lazily load zarr array\n data_reference_zyx = da.from_zarr(f\"{zarr_img_ref_cycle}/{level}\")[\n channel_index_ref\n ]\n data_alignment_zyx = da.from_zarr(f\"{zarr_img_cycle_x}/{level}\")[\n channel_index_align\n ]\n\n # Read ROIs\n ROI_table_ref = ad.read_zarr(f\"{zarr_img_ref_cycle}/tables/{roi_table}\")\n ROI_table_x = ad.read_zarr(f\"{zarr_img_cycle_x}/tables/{roi_table}\")\n logger.info(\n f\"Found {len(ROI_table_x)} ROIs in {roi_table=} to be processed.\"\n )\n\n # 
Check that table type of ROI_table_ref is valid. Note that\n # \"ngff:region_table\" and None are accepted for backwards compatibility\n valid_table_types = [\n \"roi_table\",\n \"masking_roi_table\",\n \"ngff:region_table\",\n None,\n ]\n ROI_table_ref_group = zarr.open_group(\n f\"{zarr_img_ref_cycle}/tables/{roi_table}\",\n mode=\"r\",\n )\n ref_table_attrs = ROI_table_ref_group.attrs.asdict()\n ref_table_type = ref_table_attrs.get(\"type\")\n if ref_table_type not in valid_table_types:\n raise ValueError(\n (\n f\"Table '{roi_table}' (with type '{ref_table_type}') is \"\n \"not a valid ROI table.\"\n )\n )\n\n # For each cycle, get the relevant info\n # TODO: Add additional checks on ROIs?\n if (ROI_table_ref.obs.index != ROI_table_x.obs.index).all():\n raise ValueError(\n \"Registration is only implemented for ROIs that match between the \"\n \"cycles (e.g. well, FOV ROIs). Here, the ROIs in the reference \"\n \"cycles were {ROI_table_ref.obs.index}, but the ROIs in the \"\n \"alignment cycle were {ROI_table_x.obs.index}\"\n )\n # TODO: Make this less restrictive? i.e. could we also run it if different\n # cycles have different FOVs? 
But then how do we know which FOVs to match?\n # If we relax this, downstream assumptions on matching based on order\n # in the list will break.\n\n # Read pixel sizes from zarr attributes\n ngff_image_meta_cycle_x = load_NgffImageMeta(str(zarr_img_cycle_x))\n pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n pxl_sizes_zyx_cycle_x = ngff_image_meta_cycle_x.get_pixel_sizes_zyx(\n level=0\n )\n\n if pxl_sizes_zyx != pxl_sizes_zyx_cycle_x:\n raise ValueError(\n \"Pixel sizes need to be equal between cycles for registration\"\n )\n\n # Create list of indices for 3D ROIs spanning the entire Z direction\n list_indices_ref = convert_ROI_table_to_indices(\n ROI_table_ref,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices_ref, roi_table)\n\n list_indices_cycle_x = convert_ROI_table_to_indices(\n ROI_table_x,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices_cycle_x, roi_table)\n\n num_ROIs = len(list_indices_ref)\n compute = True\n new_shifts = {}\n for i_ROI in range(num_ROIs):\n logger.info(\n f\"Now processing ROI {i_ROI+1}/{num_ROIs} \"\n f\"for channel {channel_align}.\"\n )\n img_ref = load_region(\n data_zyx=data_reference_zyx,\n region=convert_indices_to_regions(list_indices_ref[i_ROI]),\n compute=compute,\n )\n img_cycle_x = load_region(\n data_zyx=data_alignment_zyx,\n region=convert_indices_to_regions(list_indices_cycle_x[i_ROI]),\n compute=compute,\n )\n\n ##############\n # Calculate the transformation\n ##############\n # Basic version (no padding, no internal binning)\n if img_ref.shape != img_cycle_x.shape:\n raise NotImplementedError(\n \"This registration is not implemented for ROIs with \"\n \"different shapes between cycles\"\n )\n shifts = phase_cross_correlation(\n np.squeeze(img_ref), np.squeeze(img_cycle_x)\n )[0]\n\n # Registration based on scmultiplex, image-based\n # shifts, _, _ 
= calculate_shift(np.squeeze(img_ref),\n # np.squeeze(img_cycle_x), bin=binning, binarize=False)\n\n # TODO: Make this work on label images\n # (=> different loading) etc.\n\n ##############\n # Storing the calculated transformation ###\n ##############\n # Store the shift in ROI table\n # TODO: Store in OME-NGFF transformations: Check SpatialData approach,\n # per ROI storage?\n\n # Adapt ROIs for the given ROI table:\n ROI_name = ROI_table_ref.obs.index[i_ROI]\n new_shifts[ROI_name] = calculate_physical_shifts(\n shifts,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=pxl_sizes_zyx,\n )\n\n # Write physical shifts to disk (as part of the ROI table)\n logger.info(f\"Updating the {roi_table=} with translation columns\")\n image_group = zarr.group(zarr_img_cycle_x)\n new_ROI_table = get_ROI_table_with_translation(ROI_table_x, new_shifts)\n write_table(\n image_group,\n roi_table,\n new_ROI_table,\n overwrite=True,\n table_attrs=ref_table_attrs,\n )\n\n return {}\n
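The two core steps of the registration task above (estimating a translation per ROI, then converting it to physical units) can be illustrated with a small, self-contained sketch. This is not the library's implementation: `phase_correlation_shift` is a hypothetical NumPy-only stand-in for `phase_cross_correlation` (integer-pixel precision only), and `pixel_shift_to_physical` is an assumed approximation of what `calculate_physical_shifts` does, based only on the arguments passed to it in the source.

```python
import numpy as np


def phase_correlation_shift(reference: np.ndarray, moving: np.ndarray) -> np.ndarray:
    """Integer-pixel shift that registers `moving` onto `reference`.

    Hypothetical stand-in for an image-registration routine; it assumes
    both arrays have the same shape and differ by a pure translation.
    """
    cross_power = np.fft.fftn(reference) * np.conj(np.fft.fftn(moving))
    cross_power /= np.abs(cross_power) + 1e-12  # keep only the phase
    correlation = np.fft.ifftn(cross_power).real
    peak = np.array(np.unravel_index(np.argmax(correlation), correlation.shape))
    shape = np.array(correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around)
    wrapped = peak > shape // 2
    peak[wrapped] -= shape[wrapped]
    return peak


def pixel_shift_to_physical(shift_yx, level, coarsening_xy, full_res_pxl_sizes_yx):
    """Assumed conversion: pixel size in XY grows by coarsening_xy**level."""
    scale = coarsening_xy**level
    return [s * p * scale for s, p in zip(shift_yx, full_res_pxl_sizes_yx)]


rng = np.random.default_rng(0)
reference = rng.random((32, 32))
# Simulate a second cycle that is displaced by (3, -2) pixels
moving = np.roll(reference, shift=(3, -2), axis=(0, 1))

shift_px = phase_correlation_shift(reference, moving)
shift_physical = pixel_shift_to_physical(
    shift_px, level=2, coarsening_xy=2, full_res_pxl_sizes_yx=(0.1625, 0.1625)
)
```

Computing the shift at a coarser pyramid level is cheaper, which is why the conversion back to physical units has to account for `level` and `coarsening_xy`.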
Run cellpose segmentation on the ROIs of a single OME-Zarr image.
PARAMETER DESCRIPTION
input_paths
List of input paths where the image data is stored as OME-Zarrs. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: ["/some/path/"]. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
component
Path to the OME-Zarr image in the OME-Zarr plate that is processed. Example: "some_plate.zarr/B/03/0". (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
level
Pyramid level of the image to be segmented. Choose 0 to process at full resolution.
TYPE: int
channel
Primary channel for segmentation; requires either wavelength_id (e.g. A01_C01) or label (e.g. DAPI).
TYPE: ChannelInputModel
channel2
Second channel for segmentation (in the same format as channel). If specified, cellpose runs in dual channel mode. For dual channel segmentation of cells, the first channel should contain the membrane marker, the second channel should contain the nuclear marker.
TYPE: Optional[ChannelInputModel] DEFAULT: None
input_ROI_table
Name of the ROI table over which the task loops to apply Cellpose segmentation. Examples: FOV_ROI_table => loop over the fields of view, organoid_ROI_table => loop over the organoid ROI table (generated by another task), well_ROI_table => process the whole well as one image.
TYPE: str DEFAULT: 'FOV_ROI_table'
output_ROI_table
If provided, a ROI table with that name is created, which will contain the bounding boxes of the newly segmented labels. ROI tables should have ROI in their name.
TYPE: Optional[str] DEFAULT: None
use_masks
If True, try to use masked loading and fall back to use_masks=False if the ROI table is not suitable. Masked loading is relevant when only a subset of the bounding box should actually be processed (e.g. running within organoid_ROI_table).
TYPE: bool DEFAULT: True
output_label_name
Name of the output label image (e.g. "organoids").
TYPE: Optional[str] DEFAULT: None
relabeling
If True, apply relabeling so that label values are unique for all objects in the well.
TYPE: bool DEFAULT: True
diameter_level0
Expected diameter of the objects that should be segmented in pixels at level 0. Initial diameter is rescaled using the level that was selected. The rescaled value is passed as the diameter to the CellposeModel.eval method.
TYPE: float DEFAULT: 30.0
model_type
Parameter of CellposeModel class. Defines which model should be used. Typical choices are nuclei, cyto, cyto2, etc.
TYPE: str DEFAULT: 'cyto2'
pretrained_model
Parameter of CellposeModel class (takes precedence over model_type). Allows you to specify the path of a custom trained cellpose model.
TYPE: Optional[str] DEFAULT: None
cellprob_threshold
Parameter of CellposeModel.eval method. Valid values are between -6 and 6. From Cellpose documentation: "Decrease this threshold if cellpose is not returning as many ROIs as you'd expect. Similarly, increase this threshold if cellpose is returning too many ROIs, particularly from dim areas."
TYPE: float DEFAULT: 0.0
flow_threshold
Parameter of CellposeModel.eval method. Valid values are between 0.0 and 1.0. From Cellpose documentation: "Increase this threshold if cellpose is not returning as many ROIs as you'd expect. Similarly, decrease this threshold if cellpose is returning too many ill-shaped ROIs."
TYPE: float DEFAULT: 0.4
anisotropy
Ratio of the pixel sizes along the Z and XY axes (ignored if the image is not three-dimensional). If None, it is inferred from the OME-NGFF metadata.
TYPE: Optional[float] DEFAULT: None
min_size
Parameter of CellposeModel class. Minimum size of the segmented objects (in pixels). Use -1 to turn off the size filter.
TYPE: int DEFAULT: 15
augment
Parameter of CellposeModel class. Whether to use cellpose augmentation to tile images with overlap.
TYPE: bool DEFAULT: False
net_avg
Parameter of CellposeModel class. Whether to use cellpose net averaging to run the 4 built-in networks (useful for nuclei, cyto and cyto2, not sure it works for the others).
TYPE: bool DEFAULT: False
use_gpu
If False, always use the CPU; if True, use the GPU if possible (as defined in cellpose.core.use_gpu()) and fall back to the CPU otherwise.
TYPE: bool DEFAULT: True
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: True
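Two of the parameters above are derived rather than passed through unchanged: the diameter is rescaled to the selected pyramid level, and a missing anisotropy is inferred from the pixel sizes. The sketch below mirrors the two expressions that appear in the task source (`diameter_level0 / coarsening_xy**level` and the Z/X pixel-size ratio); the function names are hypothetical and exist only for illustration.

```python
from typing import Sequence


def rescale_diameter(diameter_level0: float, level: int, coarsening_xy: int) -> float:
    """Expected object diameter (in pixels) at the selected pyramid level."""
    return diameter_level0 / coarsening_xy**level


def infer_anisotropy(pxl_sizes_zyx: Sequence[float]) -> float:
    """Ratio of the Z pixel size to the X pixel size, as in the task source."""
    return pxl_sizes_zyx[0] / pxl_sizes_zyx[2]


# A 30-pixel object at level 0 spans 7.5 pixels at level 2 (coarsening 2)
diameter = rescale_diameter(30.0, level=2, coarsening_xy=2)

# Example pixel sizes (Z, Y, X) in micrometers; the values are made up
anisotropy = infer_anisotropy((1.0, 0.65, 0.65))
```

Because the rescaling divides by `coarsening_xy**level`, `diameter_level0` can stay fixed while `level` is varied to trade resolution for speed.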
Source code in fractal_tasks_core/tasks/cellpose_segmentation.py
@validate_arguments\ndef cellpose_segmentation(\n *,\n # Fractal arguments\n input_paths: Sequence[str],\n output_path: str,\n component: str,\n metadata: dict[str, Any],\n # Task-specific arguments\n level: int,\n channel: ChannelInputModel,\n channel2: Optional[ChannelInputModel] = None,\n input_ROI_table: str = \"FOV_ROI_table\",\n output_ROI_table: Optional[str] = None,\n output_label_name: Optional[str] = None,\n use_masks: bool = True,\n relabeling: bool = True,\n # Cellpose-related arguments\n diameter_level0: float = 30.0,\n model_type: str = \"cyto2\",\n pretrained_model: Optional[str] = None,\n cellprob_threshold: float = 0.0,\n flow_threshold: float = 0.4,\n anisotropy: Optional[float] = None,\n min_size: int = 15,\n augment: bool = False,\n net_avg: bool = False,\n use_gpu: bool = True,\n overwrite: bool = True,\n) -> dict[str, Any]:\n\"\"\"\n Run cellpose segmentation on the ROIs of a single OME-Zarr image.\n\n Args:\n input_paths: List of input paths where the image data is stored as\n OME-Zarrs. Should point to the parent folder containing one or many\n OME-Zarr files, not the actual OME-Zarr file. Example:\n `[\"/some/path/\"]`. This task only supports a single input path.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n component: Path to the OME-Zarr image in the OME-Zarr plate that is\n processed. Example: `\"some_plate.zarr/B/03/0\"`.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n level: Pyramid level of the image to be segmented. Choose `0` to\n process at full resolution.\n channel: Primary channel for segmentation; requires either\n `wavelength_id` (e.g. `A01_C01`) or `label` (e.g. 
`DAPI`).\n channel2: Second channel for segmentation (in the same format as\n `channel`). If specified, cellpose runs in dual channel mode.\n For dual channel segmentation of cells, the first channel should\n contain the membrane marker, the second channel should contain the\n nuclear marker.\n input_ROI_table: Name of the ROI table over which the task loops to\n apply Cellpose segmentation. Examples: `FOV_ROI_table` => loop over\n the field of views, `organoid_ROI_table` => loop over the organoid\n ROI table (generated by another task), `well_ROI_table` => process\n the whole well as one image.\n output_ROI_table: If provided, a ROI table with that name is created,\n which will contain the bounding boxes of the newly segmented\n labels. ROI tables should have `ROI` in their name.\n use_masks: If `True`, try to use masked loading and fall back to\n `use_masks=False` if the ROI table is not suitable. Masked\n loading is relevant when only a subset of the bounding box should\n actually be processed (e.g. running within `organoid_ROI_table`).\n output_label_name: Name of the output label image (e.g. `\"organoids\"`).\n relabeling: If `True`, apply relabeling so that label values are\n unique for all objects in the well.\n diameter_level0: Expected diameter of the objects that should be\n segmented in pixels at level 0. Initial diameter is rescaled using\n the `level` that was selected. The rescaled value is passed as\n the diameter to the `CellposeModel.eval` method.\n model_type: Parameter of `CellposeModel` class. Defines which model\n should be used. Typical choices are `nuclei`, `cyto`, `cyto2`, etc.\n pretrained_model: Parameter of `CellposeModel` class (takes\n precedence over `model_type`). Allows you to specify the path of\n a custom trained cellpose model.\n cellprob_threshold: Parameter of `CellposeModel.eval` method. Valid\n values between -6 to 6. 
From Cellpose documentation: \"Decrease this\n threshold if cellpose is not returning as many ROIs as you\u2019d\n expect. Similarly, increase this threshold if cellpose is returning\n too ROIs particularly from dim areas.\"\n flow_threshold: Parameter of `CellposeModel.eval` method. Valid\n values between 0.0 and 1.0. From Cellpose documentation: \"Increase\n this threshold if cellpose is not returning as many ROIs as you\u2019d\n expect. Similarly, decrease this threshold if cellpose is returning\n too many ill-shaped ROIs.\"\n anisotropy: Ratio of the pixel sizes along Z and XY axis (ignored if\n the image is not three-dimensional). If `None`, it is inferred from\n the OME-NGFF metadata.\n min_size: Parameter of `CellposeModel` class. Minimum size of the\n segmented objects (in pixels). Use `-1` to turn off the size\n filter.\n augment: Parameter of `CellposeModel` class. Whether to use cellpose\n augmentation to tile images with overlap.\n net_avg: Parameter of `CellposeModel` class. 
Whether to use cellpose\n net averaging to run the 4 built-in networks (useful for `nuclei`,\n `cyto` and `cyto2`, not sure it works for the others).\n use_gpu: If `False`, always use the CPU; if `True`, use the GPU if\n possible (as defined in `cellpose.core.use_gpu()`) and fall-back\n to the CPU otherwise.\n overwrite: If `True`, overwrite the task output.\n \"\"\"\n\n # Set input path\n if len(input_paths) > 1:\n raise NotImplementedError\n in_path = Path(input_paths[0])\n zarrurl = (in_path.resolve() / component).as_posix()\n logger.info(f\"{zarrurl=}\")\n\n # Preliminary checks on Cellpose model\n if pretrained_model is None:\n if model_type not in models.MODEL_NAMES:\n raise ValueError(f\"ERROR model_type={model_type} is not allowed.\")\n else:\n if not os.path.exists(pretrained_model):\n raise ValueError(f\"{pretrained_model=} does not exist.\")\n\n # Read attributes from NGFF metadata\n ngff_image_meta = load_NgffImageMeta(zarrurl)\n num_levels = ngff_image_meta.num_levels\n coarsening_xy = ngff_image_meta.coarsening_xy\n full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n actual_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=level)\n logger.info(f\"NGFF image has {num_levels=}\")\n logger.info(f\"NGFF image has {coarsening_xy=}\")\n logger.info(\n f\"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}\"\n )\n logger.info(\n f\"NGFF image has level-{level} pixel sizes \"\n f\"{actual_res_pxl_sizes_zyx}\"\n )\n\n plate, well = component.split(\".zarr/\")\n\n # Find channel index\n try:\n tmp_channel: OmeroChannel = get_channel_from_image_zarr(\n image_zarr_path=zarrurl,\n wavelength_id=channel.wavelength_id,\n label=channel.label,\n )\n except ChannelNotFoundError as e:\n logger.warning(\n \"Channel not found, exit from the task.\\n\"\n f\"Original error: {str(e)}\"\n )\n return {}\n ind_channel = tmp_channel.index\n\n # Find channel index for second channel, if one is provided\n if channel2:\n try:\n 
tmp_channel_c2: OmeroChannel = get_channel_from_image_zarr(\n image_zarr_path=zarrurl,\n wavelength_id=channel2.wavelength_id,\n label=channel2.label,\n )\n except ChannelNotFoundError as e:\n logger.warning(\n f\"Second channel with wavelength_id: {channel2.wavelength_id} \"\n f\"and label: {channel2.label} not found, exit from the task.\\n\"\n f\"Original error: {str(e)}\"\n )\n return {}\n ind_channel_c2 = tmp_channel_c2.index\n\n # Set channel label\n if output_label_name is None:\n try:\n channel_label = tmp_channel.label\n output_label_name = f\"label_{channel_label}\"\n except (KeyError, IndexError):\n output_label_name = f\"label_{ind_channel}\"\n\n # Load ZYX data\n data_zyx = da.from_zarr(f\"{zarrurl}/{level}\")[ind_channel]\n logger.info(f\"{data_zyx.shape=}\")\n if channel2:\n data_zyx_c2 = da.from_zarr(f\"{zarrurl}/{level}\")[ind_channel_c2]\n logger.info(f\"Second channel: {data_zyx_c2.shape=}\")\n\n # Read ROI table\n ROI_table_path = f\"{zarrurl}/tables/{input_ROI_table}\"\n ROI_table = ad.read_zarr(ROI_table_path)\n\n # Perform some checks on the ROI table\n valid_ROI_table = is_ROI_table_valid(\n table_path=ROI_table_path, use_masks=use_masks\n )\n if use_masks and not valid_ROI_table:\n logger.info(\n f\"ROI table at {ROI_table_path} cannot be used for masked \"\n \"loading. 
Set use_masks=False.\"\n )\n use_masks = False\n logger.info(f\"{use_masks=}\")\n\n # Create list of indices for 3D ROIs spanning the entire Z direction\n list_indices = convert_ROI_table_to_indices(\n ROI_table,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices, input_ROI_table)\n\n # If we are not planning to use masked loading, fail for overlapping ROIs\n if not use_masks:\n overlap = find_overlaps_in_ROI_indices(list_indices)\n if overlap:\n raise ValueError(\n f\"ROI indices created from {input_ROI_table} table have \"\n \"overlaps, but we are not using masked loading.\"\n )\n\n # Select 2D/3D behavior and set some parameters\n do_3D = data_zyx.shape[0] > 1 and len(data_zyx.shape) == 3\n if do_3D:\n if anisotropy is None:\n # Compute anisotropy as pixel_size_z/pixel_size_x\n anisotropy = (\n actual_res_pxl_sizes_zyx[0] / actual_res_pxl_sizes_zyx[2]\n )\n logger.info(f\"Anisotropy: {anisotropy}\")\n\n # Rescale datasets (only relevant for level>0)\n if ngff_image_meta.axes_names[0] != \"c\":\n raise ValueError(\n \"Cannot set `remove_channel_axis=True` for multiscale \"\n f\"metadata with axes={ngff_image_meta.axes_names}. 
\"\n 'First axis should have name \"c\".'\n )\n new_datasets = rescale_datasets(\n datasets=[ds.dict() for ds in ngff_image_meta.datasets],\n coarsening_xy=coarsening_xy,\n reference_level=level,\n remove_channel_axis=True,\n )\n\n label_attrs = {\n \"image-label\": {\n \"version\": __OME_NGFF_VERSION__,\n \"source\": {\"image\": \"../../\"},\n },\n \"multiscales\": [\n {\n \"name\": output_label_name,\n \"version\": __OME_NGFF_VERSION__,\n \"axes\": [\n ax.dict()\n for ax in ngff_image_meta.multiscale.axes\n if ax.type != \"channel\"\n ],\n \"datasets\": new_datasets,\n }\n ],\n }\n\n image_group = zarr.group(zarrurl)\n label_group = prepare_label_group(\n image_group,\n output_label_name,\n overwrite=overwrite,\n label_attrs=label_attrs,\n logger=logger,\n )\n\n logger.info(\n f\"Helper function `prepare_label_group` returned {label_group=}\"\n )\n logger.info(f\"Output label path: {zarrurl}/labels/{output_label_name}/0\")\n store = zarr.storage.FSStore(f\"{zarrurl}/labels/{output_label_name}/0\")\n label_dtype = np.uint32\n\n # Ensure that all output shapes & chunks are 3D (for 2D data: (1, y, x))\n # https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/398\n shape = data_zyx.shape\n if len(shape) == 2:\n shape = (1, *shape)\n chunks = data_zyx.chunksize\n if len(chunks) == 2:\n chunks = (1, *chunks)\n mask_zarr = zarr.create(\n shape=shape,\n chunks=chunks,\n dtype=label_dtype,\n store=store,\n overwrite=False,\n dimension_separator=\"/\",\n )\n\n logger.info(\n f\"mask will have shape {data_zyx.shape} \"\n f\"and chunks {data_zyx.chunks}\"\n )\n\n # Initialize cellpose\n gpu = use_gpu and cellpose.core.use_gpu()\n if pretrained_model:\n model = models.CellposeModel(\n gpu=gpu, pretrained_model=pretrained_model\n )\n else:\n model = models.CellposeModel(gpu=gpu, model_type=model_type)\n\n # Initialize other things\n logger.info(f\"Start cellpose_segmentation task for {zarrurl}\")\n logger.info(f\"relabeling: {relabeling}\")\n 
logger.info(f\"do_3D: {do_3D}\")\n logger.info(f\"use_gpu: {gpu}\")\n logger.info(f\"level: {level}\")\n logger.info(f\"model_type: {model_type}\")\n logger.info(f\"pretrained_model: {pretrained_model}\")\n logger.info(f\"anisotropy: {anisotropy}\")\n logger.info(\"Total well shape/chunks:\")\n logger.info(f\"{data_zyx.shape}\")\n logger.info(f\"{data_zyx.chunks}\")\n if channel2:\n logger.info(\"Dual channel input for cellpose model\")\n logger.info(f\"{data_zyx_c2.shape}\")\n logger.info(f\"{data_zyx_c2.chunks}\")\n\n # Counters for relabeling\n if relabeling:\n num_labels_tot = 0\n\n # Iterate over ROIs\n num_ROIs = len(list_indices)\n\n if output_ROI_table:\n bbox_dataframe_list = []\n\n logger.info(f\"Now starting loop over {num_ROIs} ROIs\")\n for i_ROI, indices in enumerate(list_indices):\n # Define region\n s_z, e_z, s_y, e_y, s_x, e_x = indices[:]\n region = (\n slice(s_z, e_z),\n slice(s_y, e_y),\n slice(s_x, e_x),\n )\n logger.info(f\"Now processing ROI {i_ROI+1}/{num_ROIs}\")\n\n # Prepare single-channel or dual-channel input for cellpose\n if channel2:\n # Dual channel mode, first channel is the membrane channel\n img_1 = load_region(\n data_zyx,\n region,\n compute=True,\n return_as_3D=True,\n )\n img_np = np.zeros((2, *img_1.shape))\n img_np[0, :, :, :] = img_1\n img_np[1, :, :, :] = load_region(\n data_zyx_c2,\n region,\n compute=True,\n return_as_3D=True,\n )\n channels = [1, 2]\n else:\n img_np = np.expand_dims(\n load_region(data_zyx, region, compute=True, return_as_3D=True),\n axis=0,\n )\n channels = [0, 0]\n\n # Prepare keyword arguments for segment_ROI function\n kwargs_segment_ROI = dict(\n model=model,\n channels=channels,\n do_3D=do_3D,\n anisotropy=anisotropy,\n label_dtype=label_dtype,\n diameter=diameter_level0 / coarsening_xy**level,\n cellprob_threshold=cellprob_threshold,\n flow_threshold=flow_threshold,\n min_size=min_size,\n augment=augment,\n net_avg=net_avg,\n )\n\n # Prepare keyword arguments for preprocessing function\n 
preprocessing_kwargs = {}\n if use_masks:\n preprocessing_kwargs = dict(\n region=region,\n current_label_path=f\"{zarrurl}/labels/{output_label_name}/0\",\n ROI_table_path=ROI_table_path,\n ROI_positional_index=i_ROI,\n )\n\n # Call segment_ROI through the masked-loading wrapper, which includes\n # pre/post-processing functions if needed\n new_label_img = masked_loading_wrapper(\n image_array=img_np,\n function=segment_ROI,\n kwargs=kwargs_segment_ROI,\n use_masks=use_masks,\n preprocessing_kwargs=preprocessing_kwargs,\n )\n\n # Shift labels and update relabeling counters\n if relabeling:\n num_labels_roi = np.max(new_label_img)\n new_label_img[new_label_img > 0] += num_labels_tot\n num_labels_tot += num_labels_roi\n\n # Write some logs\n logger.info(f\"ROI {indices}, {num_labels_roi=}, {num_labels_tot=}\")\n\n # Check that total number of labels is under control\n if num_labels_tot > np.iinfo(label_dtype).max:\n raise ValueError(\n \"ERROR in re-labeling:\"\n f\"Reached {num_labels_tot} labels, \"\n f\"but dtype={label_dtype}\"\n )\n\n if output_ROI_table:\n bbox_df = array_to_bounding_box_table(\n new_label_img,\n actual_res_pxl_sizes_zyx,\n origin_zyx=(s_z, s_y, s_x),\n )\n\n bbox_dataframe_list.append(bbox_df)\n\n overlap_list = []\n for df in bbox_dataframe_list:\n overlap_list.extend(\n get_overlapping_pairs_3D(df, full_res_pxl_sizes_zyx)\n )\n if len(overlap_list) > 0:\n logger.warning(\n f\"{len(overlap_list)} bounding-box pairs overlap\"\n )\n\n # Compute and store 0-th level to disk\n da.array(new_label_img).to_zarr(\n url=mask_zarr,\n region=region,\n compute=True,\n )\n\n logger.info(\n f\"End cellpose_segmentation task for {zarrurl}, \"\n \"now building pyramids.\"\n )\n\n # Starting from on-disk highest-resolution data, build and write to disk a\n # pyramid of coarser levels\n build_pyramid(\n zarrurl=f\"{zarrurl}/labels/{output_label_name}\",\n overwrite=overwrite,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=chunks,\n 
aggregation_function=np.max,\n )\n\n logger.info(\"End building pyramids\")\n\n if output_ROI_table:\n # Handle the case where `bbox_dataframe_list` is empty (typically\n # because list_indices is also empty)\n if len(bbox_dataframe_list) == 0:\n bbox_dataframe_list = [empty_bounding_box_table()]\n # Concatenate all ROI dataframes\n df_well = pd.concat(bbox_dataframe_list, axis=0, ignore_index=True)\n df_well.index = df_well.index.astype(str)\n # Extract labels and drop them from df_well\n labels = pd.DataFrame(df_well[\"label\"].astype(str))\n df_well.drop(labels=[\"label\"], axis=1, inplace=True)\n # Convert all to float (warning: some would be int, in principle)\n bbox_dtype = np.float32\n df_well = df_well.astype(bbox_dtype)\n # Convert to anndata\n bbox_table = ad.AnnData(df_well, dtype=bbox_dtype)\n bbox_table.obs = labels\n\n # Write to zarr group\n image_group = zarr.group(f\"{in_path}/{component}\")\n logger.info(\n \"Now writing bounding-box ROI table to \"\n f\"{in_path}/{component}/tables/{output_ROI_table}\"\n )\n table_attrs = {\n \"type\": \"masking_roi_table\",\n \"region\": {\"path\": f\"../labels/{output_label_name}\"},\n \"instance_key\": \"label\",\n }\n write_table(\n image_group,\n output_ROI_table,\n bbox_table,\n overwrite=overwrite,\n table_attrs=table_attrs,\n )\n\n return {}\n
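The relabeling step in the loop above keeps label values unique across ROIs by offsetting each new label image with a running counter. A minimal sketch of that idea, assuming each ROI's labels are numbered from 1 upward with 0 as background; `relabel_rois` is a hypothetical helper and not part of the package.

```python
import numpy as np


def relabel_rois(label_images):
    """Offset the labels of each ROI so that they are unique across ROIs."""
    num_labels_tot = 0
    relabeled = []
    for labels in label_images:
        labels = labels.copy()  # leave the caller's arrays untouched
        num_labels_roi = int(labels.max())
        # Shift only foreground pixels; background stays 0
        labels[labels > 0] += num_labels_tot
        num_labels_tot += num_labels_roi
        relabeled.append(labels)
    return relabeled, num_labels_tot


roi_a = np.array([[0, 1], [2, 2]], dtype=np.uint32)
roi_b = np.array([[1, 0], [0, 1]], dtype=np.uint32)
relabeled, total = relabel_rois([roi_a, roi_b])
```

In this example the single object in `roi_b` receives label 3, because `roi_a` already used labels 1 and 2.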
Internal function that runs Cellpose segmentation for a single ROI.
PARAMETER DESCRIPTION
x
4D numpy array.
TYPE: ndarray
model
An instance of models.CellposeModel.
TYPE: CellposeModel DEFAULT: None
do_3D
If True, cellpose runs in 3D mode: runs on xy, xz & yz planes, then averages the flows.
TYPE: bool DEFAULT: True
channels
Which channels to use. If only one channel is provided, [0, 0] should be used. If two channels are provided (the first dimension of x has length of 2), [1, 2] should be used (x[0, :, :, :] contains the membrane channel and x[1, :, :, :] contains the nuclear channel).
TYPE: list[int] DEFAULT: [0, 0]
anisotropy
Set anisotropy rescaling factor for Z dimension.
TYPE: Optional[float] DEFAULT: None
diameter
Expected object diameter in pixels for cellpose.
TYPE: float DEFAULT: 30.0
cellprob_threshold
Cellpose model parameter.
TYPE: float DEFAULT: 0.0
flow_threshold
Cellpose model parameter.
TYPE: float DEFAULT: 0.4
label_dtype
Label images are cast into this np.dtype.
TYPE: Optional[dtype] DEFAULT: None
augment
Whether to use cellpose augmentation to tile images with overlap.
TYPE: bool DEFAULT: False
net_avg
Whether to use cellpose net averaging to run the 4 built-in networks (useful for nuclei, cyto and cyto2, not sure it works for the others).
TYPE: bool DEFAULT: False
min_size
Minimum size of the segmented objects.
TYPE: int DEFAULT: 15
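The relationship between the 4D input `x` and the `channels` argument can be sketched as follows. The helper name `build_cellpose_input` is hypothetical; it only reproduces the stacking logic visible in the `cellpose_segmentation` source, where a single channel is expanded to shape `(1, z, y, x)` and two channels are stacked to shape `(2, z, y, x)`.

```python
import numpy as np


def build_cellpose_input(img_zyx, img_zyx_c2=None):
    """Return the 4D input array and the matching `channels` list."""
    if img_zyx_c2 is None:
        # Single-channel mode: add a leading channel axis of length 1
        return np.expand_dims(img_zyx, axis=0), [0, 0]
    # Dual-channel mode: first entry is the membrane channel,
    # second entry is the nuclear channel
    stacked = np.zeros((2, *img_zyx.shape))
    stacked[0] = img_zyx
    stacked[1] = img_zyx_c2
    return stacked, [1, 2]


membrane = np.ones((3, 8, 8))
nuclei = np.full((3, 8, 8), 2.0)

x_single, channels_single = build_cellpose_input(membrane)
x_dual, channels_dual = build_cellpose_input(membrane, nuclei)
```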
Source code in fractal_tasks_core/tasks/cellpose_segmentation.py
def segment_ROI(\n x: np.ndarray,\n model: models.CellposeModel = None,\n do_3D: bool = True,\n channels: list[int] = [0, 0],\n anisotropy: Optional[float] = None,\n diameter: float = 30.0,\n cellprob_threshold: float = 0.0,\n flow_threshold: float = 0.4,\n label_dtype: Optional[np.dtype] = None,\n augment: bool = False,\n net_avg: bool = False,\n min_size: int = 15,\n) -> np.ndarray:\n\"\"\"\n Internal function that runs Cellpose segmentation for a single ROI.\n\n Args:\n x: 4D numpy array.\n model: An instance of `models.CellposeModel`.\n do_3D: If `True`, cellpose runs in 3D mode: runs on xy, xz & yz planes,\n then averages the flows.\n channels: Which channels to use. If only one channel is provided, `[0,\n 0]` should be used. If two channels are provided (the first\n dimension of `x` has length of 2), `[1, 2]` should be used\n (`x[0, :, :,:]` contains the membrane channel and\n `x[1, :, :, :]` contains the nuclear channel).\n anisotropy: Set anisotropy rescaling factor for Z dimension.\n diameter: Expected object diameter in pixels for cellpose.\n cellprob_threshold: Cellpose model parameter.\n flow_threshold: Cellpose model parameter.\n label_dtype: Label images are cast into this `np.dtype`.\n augment: Whether to use cellpose augmentation to tile images with\n overlap.\n net_avg: Whether to use cellpose net averaging to run the 4 built-in\n networks (useful for `nuclei`, `cyto` and `cyto2`, not sure it\n works for the others).\n min_size: Minimum size of the segmented objects.\n \"\"\"\n\n # Write some debugging info\n logger.info(\n \"[segment_ROI] START |\"\n f\" x: {type(x)}, {x.shape} |\"\n f\" {do_3D=} |\"\n f\" {model.diam_mean=} |\"\n f\" {diameter=} |\"\n f\" {flow_threshold=}\"\n )\n\n # Actual labeling\n t0 = time.perf_counter()\n mask, _, _ = model.eval(\n x,\n channels=channels,\n do_3D=do_3D,\n net_avg=net_avg,\n augment=augment,\n diameter=diameter,\n anisotropy=anisotropy,\n cellprob_threshold=cellprob_threshold,\n 
flow_threshold=flow_threshold,\n min_size=min_size,\n )\n\n if mask.ndim == 2:\n # If we get a 2D image, we still return it as a 3D array\n mask = np.expand_dims(mask, axis=0)\n t1 = time.perf_counter()\n\n # Write some debugging info\n logger.info(\n \"[segment_ROI] END |\"\n f\" Elapsed: {t1-t0:.3f} s |\"\n f\" {mask.shape=},\"\n f\" {mask.dtype=} (then {label_dtype}),\"\n f\" {np.max(mask)=} |\"\n f\" {model.diam_mean=} |\"\n f\" {diameter=} |\"\n f\" {flow_threshold=}\"\n )\n\n return mask.astype(label_dtype)\n
Duplicate an input zarr structure to a new path.
This task copies all the structure, but none of the image data:
For each plate, create a new zarr group with the same attributes as the original one.
For each well (in each plate), create a new zarr subgroup with the same attributes as the original one.
For each image (in each well), create a new zarr subgroup with the same attributes as the original one.
For each image (in each well), copy the relevant AnnData tables from the original source.
Note: this task makes use of methods from the Attributes class, see https://zarr.readthedocs.io/en/stable/api/attrs.html.
PARAMETER DESCRIPTION input_paths
List of input paths where the image data is stored as OME-Zarrs. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: [\"/some/path/\"]. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
Path where the output of this task is stored. Examples: \"/some/path/\" => puts the new OME-Zarr file in the same folder as the input OME-Zarr file; \"/some/new_path\" => puts the new OME-Zarr file into a new folder at /some/new_path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
Dictionary containing metadata about the OME-Zarr. This task requires the following elements to be present in the metadata: plate: List of plates (e.g. [\"MyPlate.zarr\"]); well: List of wells in the OME-Zarr plate (e.g. [\"MyPlate.zarr/B/03\", \"MyPlate.zarr/B/05\"]); image: List of images in the OME-Zarr plate (e.g. [\"MyPlate.zarr/B/03/0\", \"MyPlate.zarr/B/05/0\"]). (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
project_to_2D
If True, apply a 3D->2D projection to the ROI tables that are copied to the new OME-Zarr.
TYPE: bool DEFAULT: True
suffix
The suffix that is used to transform plate.zarr into plate_suffix.zarr. Note that None is not currently supported.
TYPE: str DEFAULT: 'mip'
ROI_table_names
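List of AnnData table names to be copied. Note: copying non-ROI tables may fail if project_to_2D=True.
TYPE: tuple[str, ...] DEFAULT: ('FOV_ROI_table', 'well_ROI_table')
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION dict[str, Any]
An update to the metadata table with new plate, well, image entries (now with the suffix in the plate name).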
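As a minimal sketch of how the updated `plate`, `well` and `image` entries relate to the input metadata, the snippet below mirrors the string replacement used at the end of the task's source code. The helper name `add_suffix_to_components` and the example metadata are hypothetical and not part of fractal-tasks-core.

```python
from typing import Any


def add_suffix_to_components(
    metadata: dict[str, Any], suffix: str = "mip"
) -> dict[str, list[str]]:
    """Hypothetical helper: insert `_{suffix}` before `.zarr` in each entry."""
    return {
        key: [
            component.replace(".zarr", f"_{suffix}.zarr")
            for component in metadata[key]
        ]
        for key in ("plate", "well", "image")
    }


# Illustrative input metadata (assumed values, following the examples above)
metadata = {
    "plate": ["MyPlate.zarr"],
    "well": ["MyPlate.zarr/B/03"],
    "image": ["MyPlate.zarr/B/03/0"],
}
update = add_suffix_to_components(metadata, suffix="mip")
print(update["plate"])
print(update["image"])
```

Because the replacement acts on the `.zarr` substring, only the plate-name part of each component changes, while the well and image sub-paths are preserved.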
Source code in fractal_tasks_core/tasks/copy_ome_zarr.py
@validate_arguments\ndef copy_ome_zarr(\n *,\n input_paths: Sequence[str],\n output_path: str,\n metadata: dict[str, Any],\n project_to_2D: bool = True,\n suffix: str = \"mip\",\n ROI_table_names: tuple[str, ...] = (\"FOV_ROI_table\", \"well_ROI_table\"),\n overwrite: bool = False,\n) -> dict[str, Any]:\n\n\"\"\"\n Duplicate an input zarr structure to a new path.\n\n This task copies all the structure, but none of the image data:\n\n - For each plate, create a new zarr group with the same attributes as\n the original one.\n - For each well (in each plate), create a new zarr subgroup with the\n same attributes as the original one.\n - For each image (in each well), create a new zarr subgroup with the\n same attributes as the original one.\n - For each image (in each well), copy the relevant AnnData tables from\n the original source.\n\n Note: this task makes use of methods from the `Attributes` class, see\n https://zarr.readthedocs.io/en/stable/api/attrs.html.\n\n Args:\n input_paths: List of input paths where the image data is stored as\n OME-Zarrs. Should point to the parent folder containing one or many\n OME-Zarr files, not the actual OME-Zarr file. Example:\n `[\"/some/path/\"]`. This task only supports a single input path.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: Path were the output of this task is stored. Example:\n `\"/some/path/\"` => puts the new OME-Zarr file in the same folder as\n the input OME-Zarr file `\"/some/new_path\"` => puts the new OME-Zarr\n file into a new folder at `/some/new_path`. (standard argument for\n Fractal tasks, managed by Fractal server).\n metadata: Dictionary containing metadata about the OME-Zarr. This task\n requires the following elements to be present in the metadata:\n `plate`: List of plates\n (e.g. `[\"MyPlate.zarr\"]`);\n `well`: List of wells in the OME-Zarr plate\n (e.g. `[\"MyPlate.zarr/B/03/MyPlate.zarr/B/05\"]`);\n \"image\": List of images in the OME-Zarr plate\n (e.g. 
`[\"MyPlate.zarr/B/03/0\", \"MyPlate.zarr/B/05/0\"]`).\n standard argument for Fractal tasks, managed by Fractal server).\n project_to_2D: If `True`, apply a 3D->2D projection to the ROI tables\n that are copied to the new OME-Zarr.\n suffix: The suffix that is used to transform `plate.zarr` into\n `plate_suffix.zarr`. Note that `None` is not currently supported.\n ROI_table_names: List of Anndata table names to be copied. Note:\n copying non-ROI tables may fail if `project_to_2D=True`.\n overwrite: If `True`, overwrite the task output.\n\n Returns:\n An update to the metadata table with new `plate`, `well`, `image`\n entries (now with the suffix in the plate name).\n \"\"\"\n\n # Preliminary check\n if len(input_paths) > 1:\n raise NotImplementedError\n if suffix is None:\n # FIXME create a standard suffix (with timestamp)\n raise NotImplementedError\n\n # List all plates\n in_path = Path(input_paths[0])\n list_plates = [\n p.as_posix()\n for p in Path(in_path).glob(\"*.zarr\")\n if p.name in metadata[\"plate\"]\n ]\n logger.info(f\"{list_plates=}\")\n\n meta_update: dict[str, Any] = {\"copy_ome_zarr\": {}}\n meta_update[\"copy_ome_zarr\"][\"suffix\"] = suffix\n meta_update[\"copy_ome_zarr\"][\"sources\"] = {}\n\n # Loop over all plates\n for zarrurl_old in list_plates:\n zarrfile = zarrurl_old.split(\"/\")[-1]\n old_plate_name = zarrfile.split(\".zarr\")[0]\n new_plate_name = f\"{old_plate_name}_{suffix}\"\n new_plate_dir = Path(output_path).resolve()\n zarrurl_new = f\"{(new_plate_dir / new_plate_name).as_posix()}.zarr\"\n meta_update[\"copy_ome_zarr\"][\"sources\"][new_plate_name] = zarrurl_old\n\n logger.info(f\"{zarrurl_old=}\")\n logger.info(f\"{zarrurl_new=}\")\n logger.info(f\"{meta_update=}\")\n\n # Replicate plate attrs\n old_plate_group = zarr.open_group(zarrurl_old, mode=\"r\")\n new_plate_group = open_zarr_group_with_overwrite(\n zarrurl_new, overwrite=overwrite\n )\n new_plate_group.attrs.put(old_plate_group.attrs.asdict())\n\n well_paths = [\n 
well[\"path\"] for well in new_plate_group.attrs[\"plate\"][\"wells\"]\n ]\n logger.info(f\"{well_paths=}\")\n for well_path in well_paths:\n\n # Replicate well attrs\n old_well_group = zarr.open_group(\n f\"{zarrurl_old}/{well_path}\", mode=\"r\"\n )\n new_well_group = zarr.group(f\"{zarrurl_new}/{well_path}\")\n new_well_group.attrs.put(old_well_group.attrs.asdict())\n\n image_paths = [\n image[\"path\"]\n for image in new_well_group.attrs[\"well\"][\"images\"]\n ]\n logger.info(f\"{image_paths=}\")\n\n for image_path in image_paths:\n\n # Replicate image attrs\n old_image_group = zarr.open_group(\n f\"{zarrurl_old}/{well_path}/{image_path}\", mode=\"r\"\n )\n new_image_group = zarr.group(\n f\"{zarrurl_new}/{well_path}/{image_path}\"\n )\n new_image_group.attrs.put(old_image_group.attrs.asdict())\n\n # Extract pixel sizes, if needed\n if ROI_table_names:\n\n if project_to_2D:\n path_image = f\"{zarrurl_old}/{well_path}/{image_path}\"\n ngff_image_meta = load_NgffImageMeta(path_image)\n pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(\n level=0\n )\n pxl_size_z = pxl_sizes_zyx[0]\n\n # Copy the tables in ROI_table_names\n for ROI_table_name in ROI_table_names:\n\n logger.info(\n f\"I will now read {ROI_table_name} from \"\n f\"{zarrurl_old=}, convert it to 2D, and \"\n \"write it back to the new zarr file.\"\n )\n new_ROI_table = ad.read_zarr(\n f\"{zarrurl_old}/{well_path}/{image_path}/\"\n f\"tables/{ROI_table_name}\"\n )\n old_ROI_table_attrs = zarr.open_group(\n f\"{zarrurl_old}/{well_path}/{image_path}/\"\n f\"tables/{ROI_table_name}\"\n ).attrs.asdict()\n # Convert 3D ROIs to 2D\n if project_to_2D:\n new_ROI_table = convert_ROIs_from_3D_to_2D(\n new_ROI_table, pxl_size_z\n )\n # Write new table\n write_table(\n new_image_group,\n ROI_table_name,\n new_ROI_table,\n table_attrs=old_ROI_table_attrs,\n )\n\n for key in [\"plate\", \"well\", \"image\"]:\n meta_update[key] = [\n component.replace(\".zarr\", f\"_{suffix}.zarr\")\n for component in 
metadata[key]\n ]\n\n return meta_update\n
Create an OME-NGFF zarr folder, without reading/writing image data.
Find plates (for each folder in input_paths):
- glob image files,
- parse metadata from image filename to identify plates,
- identify populated channels.
Create a zarr folder (for each plate):
- parse mlf metadata,
- identify wells and fields of view (FOVs),
- create FOV ZARR,
- verify that channels are uniform (i.e., same channels).
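The "identify wells" step can be sketched as follows. This is a simplified illustration, assuming (as the source code below does) that a well name such as `B03` consists of a single row letter followed by the column label; the helper name `build_plate_layout` is hypothetical.

```python
def build_plate_layout(wells: list[str]) -> dict:
    """Hypothetical helper: turn well names like "B03" into plate metadata."""
    # The first character is the row letter, the remainder is the column label
    rows_columns = sorted((well[0], well[1:]) for well in set(wells))
    row_list = sorted({row for row, _ in rows_columns})
    col_list = sorted({col for _, col in rows_columns})
    return {
        "rows": [{"name": row} for row in row_list],
        "columns": [{"name": col} for col in col_list],
        "wells": [
            {
                "path": f"{row}/{col}",
                "rowIndex": row_list.index(row),
                "columnIndex": col_list.index(col),
            }
            for row, col in rows_columns
        ],
    }


layout = build_plate_layout(["B03", "B05", "C03"])
for well in layout["wells"]:
    print(well)
```

The resulting dictionary has the same `rows`, `columns` and `wells` keys that the task stores in the `plate` attribute of the plate zarr group.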
PARAMETER DESCRIPTION input_paths
List of input paths where the image data from the microscope is stored (as TIF or PNG). Should point to the parent folder containing the images and the metadata files MeasurementData.mlf and MeasurementDetail.mrf (if present). Example: [\"/some/path/\"]. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
Path where the output of this task is stored. Example: \"/some/path/\" => puts the new OME-Zarr file in \"/some/path/\". (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
allowed_channels
A list of OmeroChannels, where each channel must include the wavelength_id attribute and where the wavelength_id values must be unique across the list.
TYPE: list[OmeroChannel]
image_glob_patterns
If specified, only parse images with filenames that match all of these patterns. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html. Examples: image_glob_patterns=[\"*_B03_*\"] => only process well B03; image_glob_patterns=[\"*_C09_*\", \"*F016*\", \"*Z[0-5][0-9]C*\"] => only process well C09, field of view 16 and Z planes 0-59.
TYPE: Optional[list[str]] DEFAULT: None
num_levels
Number of resolution-pyramid levels. If set to 5, there will be the full-resolution level and 4 levels of downsampled images.
TYPE: int DEFAULT: 5
coarsening_xy
Linear coarsening factor between subsequent levels. If set to 2, level 1 is 2x downsampled, level 2 is 4x downsampled etc.
TYPE: int DEFAULT: 2
image_extension
Filename extension of images (e.g. \"tif\" or \"png\").
TYPE: str DEFAULT: 'tif'
metadata_table_file
If None, parse Yokogawa metadata from the mrf/mlf files in each input_paths folder; else, the full path to a CSV file containing the parsed metadata table.
TYPE: Optional[str] DEFAULT: None
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION dict[str, Any]
A metadata dictionary containing important metadata about the OME-Zarr plate, the images and some parameters required by downstream tasks (like num_levels).
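To illustrate the `image_glob_patterns` argument described above, here is a small sketch of "match all patterns" filtering with the standard-library `fnmatch` module. The helper `filter_filenames` and the Yokogawa-style filenames are assumptions made for this example; the task itself relies on its own `glob_with_multiple_patterns` helper, whose implementation is not shown here.

```python
from fnmatch import fnmatchcase


def filter_filenames(filenames: list[str], patterns: list[str]) -> list[str]:
    """Hypothetical helper: keep only filenames that match every pattern."""
    # fnmatchcase is used so that matching is case-sensitive on all platforms
    return [
        name
        for name in filenames
        if all(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Made-up filenames, loosely following a well/field/Z-plane/channel layout
filenames = [
    "plate_B03_T0001F001L01A01Z01C01.tif",
    "plate_C09_T0001F016L01A01Z01C01.tif",
    "plate_C09_T0001F016L01A01Z61C01.tif",
    "plate_C09_T0001F002L01A01Z01C01.tif",
]
patterns = ["*.tif", "*_C09_*", "*F016*", "*Z[0-5][0-9]C*"]
selected = filter_filenames(filenames, patterns)
print(selected)
```

Only the second filename survives: the first is in a different well, the third has a Z plane outside 0-59, and the fourth belongs to a different field of view.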
Source code in fractal_tasks_core/tasks/create_ome_zarr.py
@validate_arguments\ndef create_ome_zarr(\n *,\n input_paths: Sequence[str],\n output_path: str,\n metadata: dict[str, Any],\n allowed_channels: list[OmeroChannel],\n image_glob_patterns: Optional[list[str]] = None,\n num_levels: int = 5,\n coarsening_xy: int = 2,\n image_extension: str = \"tif\",\n metadata_table_file: Optional[str] = None,\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Create a OME-NGFF zarr folder, without reading/writing image data.\n\n Find plates (for each folder in input_paths):\n\n - glob image files,\n - parse metadata from image filename to identify plates,\n - identify populated channels.\n\n Create a zarr folder (for each plate):\n\n - parse mlf metadata,\n - identify wells and field of view (FOV),\n - create FOV ZARR,\n - verify that channels are uniform (i.e., same channels).\n\n Args:\n input_paths: List of input paths where the image data from\n the microscope is stored (as TIF or PNG). Should point to the\n parent folder containing the images and the metadata files\n `MeasurementData.mlf` and `MeasurementDetail.mrf` (if present).\n Example: `[\"/some/path/\"]`.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: Path were the output of this task is stored.\n Example: \"/some/path/\" => puts the new OME-Zarr file in the\n \"/some/path/\".\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n allowed_channels: A list of `OmeroChannel` s, where each channel must\n include the `wavelength_id` attribute and where the\n `wavelength_id` values must be unique across the list.\n image_glob_patterns: If specified, only parse images with filenames\n that match with all these patterns. 
Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html, Example:\n `image_glob_pattern=[\"*_B03_*\"]` => only process well B03\n `image_glob_pattern=[\"*_C09_*\", \"*F016*\", \"*Z[0-5][0-9]C*\"]` =>\n only process well C09, field of view 16 and Z planes 0-59.\n num_levels: Number of resolution-pyramid levels. If set to `5`, there\n will be the full-resolution level and 4 levels of\n downsampled images.\n coarsening_xy: Linear coarsening factor between subsequent levels.\n If set to `2`, level 1 is 2x downsampled, level 2 is\n 4x downsampled etc.\n image_extension: Filename extension of images (e.g. `\"tif\"` or `\"png\"`)\n metadata_table_file: If `None`, parse Yokogawa metadata from mrf/mlf\n files in the input_path folder; else, the full path to a csv file\n containing the parsed metadata table.\n overwrite: If `True`, overwrite the task output.\n\n Returns:\n A metadata dictionary containing important metadata about the OME-Zarr\n plate, the images and some parameters required by downstream tasks\n (like `num_levels`).\n \"\"\"\n\n # Preliminary checks on metadata_table_file\n if metadata_table_file:\n if not metadata_table_file.endswith(\".csv\"):\n raise ValueError(f\"{metadata_table_file=} is not a csv file\")\n if not os.path.isfile(metadata_table_file):\n raise FileNotFoundError(f\"{metadata_table_file=} does not exist\")\n\n # Identify all plates and all channels, across all input folders\n plates = []\n actual_wavelength_ids = None\n dict_plate_paths = {}\n dict_plate_prefixes: dict[str, Any] = {}\n\n # Preliminary checks on allowed_channels argument\n check_unique_wavelength_ids(allowed_channels)\n\n for in_path_str in input_paths:\n in_path = Path(in_path_str)\n\n # Glob image filenames\n patterns = [f\"*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n input_filenames = glob_with_multiple_patterns(\n folder=in_path_str,\n patterns=patterns,\n )\n\n tmp_wavelength_ids = []\n tmp_plates = 
[]\n for fn in input_filenames:\n try:\n filename_metadata = parse_filename(Path(fn).name)\n plate_prefix = filename_metadata[\"plate_prefix\"]\n plate = filename_metadata[\"plate\"]\n if plate not in dict_plate_prefixes.keys():\n dict_plate_prefixes[plate] = plate_prefix\n tmp_plates.append(plate)\n A = filename_metadata[\"A\"]\n C = filename_metadata[\"C\"]\n tmp_wavelength_ids.append(f\"A{A}_C{C}\")\n except ValueError as e:\n logger.warning(\n f'Skipping \"{Path(fn).name}\". Original error: ' + str(e)\n )\n tmp_plates = sorted(list(set(tmp_plates)))\n tmp_wavelength_ids = sorted(list(set(tmp_wavelength_ids)))\n\n info = (\n \"Listing plates/channels:\\n\"\n f\"Folder: {in_path_str}\\n\"\n f\"Patterns: {patterns}\\n\"\n f\"Plates: {tmp_plates}\\n\"\n f\"Channels: {tmp_wavelength_ids}\\n\"\n )\n\n # Check that only one plate is found\n if len(tmp_plates) > 1:\n raise ValueError(f\"{info}ERROR: {len(tmp_plates)} plates detected\")\n elif len(tmp_plates) == 0:\n raise ValueError(f\"{info}ERROR: No plates detected\")\n plate = tmp_plates[0]\n\n # If plate already exists in other folder, add suffix\n if plate in plates:\n ind = 1\n new_plate = f\"{plate}_{ind}\"\n while new_plate in plates:\n new_plate = f\"{plate}_{ind}\"\n ind += 1\n logger.info(\n f\"WARNING: {plate} already exists, renaming it as {new_plate}\"\n )\n plates.append(new_plate)\n dict_plate_prefixes[new_plate] = dict_plate_prefixes[plate]\n plate = new_plate\n else:\n plates.append(plate)\n\n # Check that channels are the same as in previous plates\n if actual_wavelength_ids is None:\n actual_wavelength_ids = tmp_wavelength_ids[:]\n else:\n if actual_wavelength_ids != tmp_wavelength_ids:\n raise ValueError(\n f\"ERROR\\n{info}\\nERROR:\"\n f\" expected channels {actual_wavelength_ids}\"\n )\n\n # Update dict_plate_paths\n dict_plate_paths[plate] = in_path\n\n # Check that all channels are in the allowed_channels\n allowed_wavelength_ids = [\n channel.wavelength_id for channel in allowed_channels\n 
]\n if not set(actual_wavelength_ids).issubset(set(allowed_wavelength_ids)):\n msg = \"ERROR in create_ome_zarr\\n\"\n msg += f\"actual_wavelength_ids: {actual_wavelength_ids}\\n\"\n msg += f\"allowed_wavelength_ids: {allowed_wavelength_ids}\\n\"\n raise ValueError(msg)\n\n # Create actual_channels, i.e. a list of the channel dictionaries which are\n # present\n actual_channels = [\n channel\n for channel in allowed_channels\n if channel.wavelength_id in actual_wavelength_ids\n ]\n\n zarrurls: dict[str, list[str]] = {\"plate\": [], \"well\": [], \"image\": []}\n\n ################################################################\n for plate in plates:\n # Define plate zarr\n zarrurl = f\"{plate}.zarr\"\n in_path = dict_plate_paths[plate]\n logger.info(f\"Creating {zarrurl}\")\n # Call zarr.open_group wrapper, which handles overwrite=True/False\n group_plate = open_zarr_group_with_overwrite(\n str(Path(output_path) / zarrurl),\n overwrite=overwrite,\n )\n zarrurls[\"plate\"].append(zarrurl)\n\n # Obtain FOV-metadata dataframe\n\n if metadata_table_file is None:\n mrf_path = f\"{in_path}/MeasurementDetail.mrf\"\n mlf_path = f\"{in_path}/MeasurementData.mlf\"\n\n site_metadata, number_images_mlf = parse_yokogawa_metadata(\n mrf_path,\n mlf_path,\n filename_patterns=image_glob_patterns,\n )\n site_metadata = remove_FOV_overlaps(site_metadata)\n\n # If a metadata table was passed, load it and use it directly\n else:\n logger.warning(\n \"Since a custom metadata table was provided, there will \"\n \"be no additional check on the number of image files.\"\n )\n site_metadata = pd.read_csv(metadata_table_file)\n site_metadata.set_index([\"well_id\", \"FieldIndex\"], inplace=True)\n\n # Extract pixel sizes and bit_depth\n pixel_size_z = site_metadata[\"pixel_size_z\"][0]\n pixel_size_y = site_metadata[\"pixel_size_y\"][0]\n pixel_size_x = site_metadata[\"pixel_size_x\"][0]\n bit_depth = site_metadata[\"bit_depth\"][0]\n\n if min(pixel_size_z, pixel_size_y, pixel_size_x) < 
1e-9:\n raise ValueError(pixel_size_z, pixel_size_y, pixel_size_x)\n\n # Identify all wells\n plate_prefix = dict_plate_prefixes[plate]\n\n patterns = [f\"{plate_prefix}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n plate_images = glob_with_multiple_patterns(\n folder=str(in_path), patterns=patterns\n )\n\n wells = [\n parse_filename(os.path.basename(fn))[\"well\"] for fn in plate_images\n ]\n wells = sorted(list(set(wells)))\n\n # Verify that all wells have all channels\n for well in wells:\n patterns = [f\"{plate_prefix}_{well}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n well_images = glob_with_multiple_patterns(\n folder=str(in_path), patterns=patterns\n )\n\n # Check number of images matches with expected one\n if metadata_table_file is None:\n num_images_glob = len(well_images)\n num_images_expected = number_images_mlf[well]\n if num_images_glob != num_images_expected:\n raise ValueError(\n f\"Wrong number of images for {well=}\\n\"\n f\"Expected {num_images_expected} (from mlf file)\\n\"\n f\"Found {num_images_glob} files\\n\"\n \"Other parameters:\\n\"\n f\" {image_extension=}\\n\"\n f\" {image_glob_patterns=}\"\n )\n\n well_wavelength_ids = []\n for fpath in well_images:\n try:\n filename_metadata = parse_filename(os.path.basename(fpath))\n well_wavelength_ids.append(\n f\"A{filename_metadata['A']}_C{filename_metadata['C']}\"\n )\n except IndexError:\n logger.info(f\"Skipping {fpath}\")\n well_wavelength_ids = sorted(list(set(well_wavelength_ids)))\n if well_wavelength_ids != actual_wavelength_ids:\n raise ValueError(\n f\"ERROR: well {well} in plate {plate} (prefix: \"\n f\"{plate_prefix}) has missing channels.\\n\"\n f\"Expected: {actual_channels}\\n\"\n f\"Found: {well_wavelength_ids}.\\n\"\n )\n\n well_rows_columns = [\n ind for ind in sorted([(n[0], n[1:]) for n in wells])\n ]\n row_list = [\n well_row_column[0] for well_row_column in well_rows_columns\n ]\n 
col_list = [\n well_row_column[1] for well_row_column in well_rows_columns\n ]\n row_list = sorted(list(set(row_list)))\n col_list = sorted(list(set(col_list)))\n\n group_plate.attrs[\"plate\"] = {\n \"acquisitions\": [{\"id\": 0, \"name\": plate}],\n \"columns\": [{\"name\": col} for col in col_list],\n \"rows\": [{\"name\": row} for row in row_list],\n \"wells\": [\n {\n \"path\": well_row_column[0] + \"/\" + well_row_column[1],\n \"rowIndex\": row_list.index(well_row_column[0]),\n \"columnIndex\": col_list.index(well_row_column[1]),\n }\n for well_row_column in well_rows_columns\n ],\n }\n\n for row, column in well_rows_columns:\n\n group_well = group_plate.create_group(f\"{row}/{column}/\")\n\n group_well.attrs[\"well\"] = {\n \"images\": [{\"path\": \"0\"}],\n \"version\": __OME_NGFF_VERSION__,\n }\n\n group_image = group_well.create_group(\"0/\") # noqa: F841\n zarrurls[\"well\"].append(f\"{plate}.zarr/{row}/{column}/\")\n zarrurls[\"image\"].append(f\"{plate}.zarr/{row}/{column}/0/\")\n\n group_image.attrs[\"multiscales\"] = [\n {\n \"version\": __OME_NGFF_VERSION__,\n \"axes\": [\n {\"name\": \"c\", \"type\": \"channel\"},\n {\n \"name\": \"z\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n {\n \"name\": \"y\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n {\n \"name\": \"x\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n ],\n \"datasets\": [\n {\n \"path\": f\"{ind_level}\",\n \"coordinateTransformations\": [\n {\n \"type\": \"scale\",\n \"scale\": [\n 1,\n pixel_size_z,\n pixel_size_y\n * coarsening_xy**ind_level,\n pixel_size_x\n * coarsening_xy**ind_level,\n ],\n }\n ],\n }\n for ind_level in range(num_levels)\n ],\n }\n ]\n\n group_image.attrs[\"omero\"] = {\n \"id\": 1, # FIXME does this depend on the plate number?\n \"name\": \"TBD\",\n \"version\": __OME_NGFF_VERSION__,\n \"channels\": define_omero_channels(\n channels=actual_channels, bit_depth=bit_depth\n ),\n }\n\n # Prepare AnnData tables for FOV/well ROIs\n 
well_id = row + column\n FOV_ROIs_table = prepare_FOV_ROI_table(site_metadata.loc[well_id])\n well_ROIs_table = prepare_well_ROI_table(\n site_metadata.loc[well_id]\n )\n\n # Write AnnData tables into the `tables` zarr group\n write_table(\n group_image,\n \"FOV_ROI_table\",\n FOV_ROIs_table,\n overwrite=overwrite,\n table_attrs={\"type\": \"roi_table\"},\n )\n write_table(\n group_image,\n \"well_ROI_table\",\n well_ROIs_table,\n overwrite=overwrite,\n table_attrs={\"type\": \"roi_table\"},\n )\n\n # Check that the different images in each well have unique channel labels.\n # Since we currently merge all fields of view in the same image, this check\n # is useless. It should remain there to catch an error in case we switch\n # back to one-image-per-field-of-view mode\n for well_path in zarrurls[\"well\"]:\n check_well_channel_labels(\n well_zarr_path=str(Path(output_path) / well_path)\n )\n\n metadata_update = dict(\n plate=zarrurls[\"plate\"],\n well=zarrurls[\"well\"],\n image=zarrurls[\"image\"],\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n image_extension=image_extension,\n image_glob_patterns=image_glob_patterns,\n original_paths=input_paths[:],\n )\n return metadata_update\n
Create OME-NGFF structure and metadata to host a multiplexing dataset.
This task takes a set of image folders (i.e. different acquisition cycles) and builds the internal structure and metadata of an OME-NGFF zarr group, without actually loading/writing the image data.
Each element in input_paths should be treated as a different acquisition.
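A minimal sketch of this convention, assuming that the acquisition key is simply the position of a folder in `input_paths` converted to a string (as in the source code of this task). The `Channel` class is a hypothetical stand-in for `OmeroChannel` that keeps only `wavelength_id`, and `map_acquisitions` is not a fractal-tasks-core function.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Hypothetical stand-in for `OmeroChannel`, keeping only `wavelength_id`."""

    wavelength_id: str


def map_acquisitions(
    input_paths: list[str], allowed_channels: dict[str, list[Channel]]
) -> dict[str, str]:
    """Hypothetical helper: map acquisition keys ("0", "1", ...) to folders."""
    for key, channels in allowed_channels.items():
        # Keys must be strings, so that they can be read from a JSON file
        if not isinstance(key, str):
            raise ValueError(f"{allowed_channels=} has non-string keys")
        ids = [channel.wavelength_id for channel in channels]
        if len(set(ids)) != len(ids):
            raise ValueError(f"Non-unique wavelength_id values in {ids}")
    # The position in input_paths defines the acquisition (cycle) index
    return {str(index): path for index, path in enumerate(input_paths)}


acquisitions = map_acquisitions(
    ["/path/cycle1/", "/path/cycle2/"],
    {
        "0": [Channel("A01_C01"), Channel("A02_C02")],
        "1": [Channel("A01_C01")],
    },
)
print(acquisitions)
```

The same `wavelength_id` may appear in different acquisitions; uniqueness is only checked within each acquisition's list.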
PARAMETER DESCRIPTION input_paths
List of input paths where the image data from the microscope is stored (as TIF or PNG). Each element of the list is treated as another cycle of the multiplexing data; the cycles are ordered by their position in this list. Should point to the parent folder containing the images and the metadata files MeasurementData.mlf and MeasurementDetail.mrf (if present). Example: [\"/path/cycle1/\", \"/path/cycle2/\"]. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
Path where the output of this task is stored. Example: \"/some/path/\" => puts the new OME-Zarr file in /some/path/. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
allowed_channels
A dictionary of lists of OmeroChannels, where each channel must include the wavelength_id attribute and where the wavelength_id values must be unique within each list. Dictionary keys represent acquisition indices (\"0\",\"1\",..).
TYPE: dict[str, list[OmeroChannel]]
image_glob_patterns
If specified, only parse images with filenames that match all of these patterns. Patterns must be defined as in https://docs.python.org/3/library/fnmatch.html. Examples: image_glob_patterns=[\"*_B03_*\"] => only process well B03; image_glob_patterns=[\"*_C09_*\", \"*F016*\", \"*Z[0-5][0-9]C*\"] => only process well C09, field of view 16 and Z planes 0-59.
TYPE: Optional[list[str]] DEFAULT: None
num_levels
Number of resolution-pyramid levels. If set to 5, there will be the full-resolution level and 4 levels of downsampled images.
TYPE: int DEFAULT: 5
coarsening_xy
Linear coarsening factor between subsequent levels. If set to 2, level 1 is 2x downsampled, level 2 is 4x downsampled etc.
TYPE: int DEFAULT: 2
image_extension
Filename extension of images (e.g. \"tif\" or \"png\").
TYPE: str DEFAULT: 'tif'
metadata_table_files
If None, parse Yokogawa metadata from the mrf/mlf files in each input_paths folder; else, a dictionary of (acquisition, path) key-value pairs, where acquisition is a string and path points to a CSV file containing the parsed metadata table.
TYPE: Optional[dict[str, str]] DEFAULT: None
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
RETURNS DESCRIPTION dict[str, Any]
A metadata dictionary containing important metadata about the OME-Zarr plate, the images and some parameters required by downstream tasks (like num_levels).
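The role of `num_levels` and `coarsening_xy` can be made concrete with a short sketch of the per-level scale factors, following the `coordinateTransformations` pattern that appears in the `create_ome_zarr` source code above (Z is left untouched, while Y and X are multiplied by `coarsening_xy**level`). The helper name `pyramid_scales` and the pixel sizes are assumptions for illustration only.

```python
def pyramid_scales(
    pixel_sizes_zyx: tuple[float, float, float],
    num_levels: int = 5,
    coarsening_xy: int = 2,
) -> list[list[float]]:
    """Hypothetical helper: (c, z, y, x) scale factors for each pyramid level."""
    pixel_size_z, pixel_size_y, pixel_size_x = pixel_sizes_zyx
    return [
        [
            1.0,  # channel axis is never rescaled
            pixel_size_z,  # Z is not coarsened
            pixel_size_y * coarsening_xy**level,
            pixel_size_x * coarsening_xy**level,
        ]
        for level in range(num_levels)
    ]


# Assumed pixel sizes in micrometers (z, y, x)
scales = pyramid_scales((1.0, 0.1625, 0.1625), num_levels=3, coarsening_xy=2)
for level, scale in enumerate(scales):
    print(level, scale)
```

With `num_levels=3`, level 0 is the full-resolution level and levels 1 and 2 are the downsampled ones, which is why downstream tasks need `num_levels` and `coarsening_xy` from the metadata.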
Source code in fractal_tasks_core/tasks/create_ome_zarr_multiplex.py
@validate_arguments\ndef create_ome_zarr_multiplex(\n *,\n input_paths: Sequence[str],\n output_path: str,\n metadata: dict[str, Any],\n allowed_channels: dict[str, list[OmeroChannel]],\n image_glob_patterns: Optional[list[str]] = None,\n num_levels: int = 5,\n coarsening_xy: int = 2,\n image_extension: str = \"tif\",\n metadata_table_files: Optional[dict[str, str]] = None,\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Create OME-NGFF structure and metadata to host a multiplexing dataset.\n\n This task takes a set of image folders (i.e. different acquisition cycles)\n and build the internal structure and metadata of a OME-NGFF zarr group,\n without actually loading/writing the image data.\n\n Each element in input_paths should be treated as a different acquisition.\n\n Args:\n input_paths: List of input paths where the image data from the\n microscope is stored (as TIF or PNG). Each element of the list is\n treated as another cycle of the multiplexing data, the cycles are\n ordered by their order in this list. Should point to the parent\n folder containing the images and the metadata files\n `MeasurementData.mlf` and `MeasurementDetail.mrf` (if present).\n Example: `[\"/path/cycle1/\", \"/path/cycle2/\"]`. 
(standard argument\n for Fractal tasks, managed by Fractal server).\n output_path: Path were the output of this task is stored.\n Example: `\"/some/path/\"` => puts the new OME-Zarr file in the\n `/some/path/`.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n allowed_channels: A dictionary of lists of `OmeroChannel`s, where\n each channel must include the `wavelength_id` attribute and where\n the `wavelength_id` values must be unique across each list.\n Dictionary keys represent channel indices (`\"0\",\"1\",..`).\n image_glob_patterns: If specified, only parse images with filenames\n that match with all these patterns. Patterns must be defined as in\n https://docs.python.org/3/library/fnmatch.html, Example:\n `image_glob_pattern=[\"*_B03_*\"]` => only process well B03\n `image_glob_pattern=[\"*_C09_*\", \"*F016*\", \"*Z[0-5][0-9]C*\"]` =>\n only process well C09, field of view 16 and Z planes 0-59.\n num_levels: Number of resolution-pyramid levels. If set to `5`, there\n will be the full-resolution level and 4 levels of downsampled\n images.\n coarsening_xy: Linear coarsening factor between subsequent levels.\n If set to `2`, level 1 is 2x downsampled, level 2 is 4x downsampled\n etc.\n image_extension: Filename extension of images\n (e.g. 
`\"tif\"` or `\"png\"`).\n metadata_table_files: If `None`, parse Yokogawa metadata from mrf/mlf\n files in the input_path folder; else, a dictionary of key-value\n pairs like `(acquisition, path)` with `acquisition` a string\n and `path` pointing to a csv file containing the parsed metadata\n table.\n overwrite: If `True`, overwrite the task output.\n\n Returns:\n A metadata dictionary containing important metadata about the OME-Zarr\n plate, the images and some parameters required by downstream tasks\n (like `num_levels`).\n \"\"\"\n\n if metadata_table_files:\n\n # Checks on the dict:\n # 1. Acquisitions as keys (same as keys of allowed_channels)\n # 2. Files end with \".csv\"\n # 3. Files exist.\n if set(allowed_channels.keys()) != set(metadata_table_files.keys()):\n raise ValueError(\n \"Mismatch in acquisition keys between \"\n f\"{allowed_channels.keys()=} and \"\n f\"{metadata_table_files.keys()=}\"\n )\n for f in metadata_table_files.values():\n if not f.endswith(\".csv\"):\n raise ValueError(\n f\"{f} (in metadata_table_file) is not a csv file.\"\n )\n if not os.path.isfile(f):\n raise ValueError(\n f\"{f} (in metadata_table_file) does not exist.\"\n )\n\n # Preliminary checks on allowed_channels\n # Note that in metadata the keys of dictionary arguments should be\n # strings (and not integers), so that they can be read from a JSON file\n for key, _channels in allowed_channels.items():\n if not isinstance(key, str):\n raise ValueError(f\"{allowed_channels=} has non-string keys\")\n check_unique_wavelength_ids(_channels)\n\n # Identify all plates and all channels, per input folders\n dict_acquisitions: dict = {}\n\n for ind_in_path, in_path_str in enumerate(input_paths):\n acquisition = str(ind_in_path)\n in_path = Path(in_path_str)\n dict_acquisitions[acquisition] = {}\n\n actual_wavelength_ids = []\n plates = []\n plate_prefixes = []\n\n # Loop over all images\n patterns = [f\"*.{image_extension}\"]\n if image_glob_patterns:\n 
patterns.extend(image_glob_patterns)\n input_filenames = glob_with_multiple_patterns(\n folder=in_path_str,\n patterns=patterns,\n )\n for fn in input_filenames:\n try:\n filename_metadata = parse_filename(Path(fn).name)\n plate = filename_metadata[\"plate\"]\n plates.append(plate)\n plate_prefix = filename_metadata[\"plate_prefix\"]\n plate_prefixes.append(plate_prefix)\n A = filename_metadata[\"A\"]\n C = filename_metadata[\"C\"]\n actual_wavelength_ids.append(f\"A{A}_C{C}\")\n except ValueError as e:\n logger.warning(\n f'Skipping \"{Path(fn).name}\". Original error: ' + str(e)\n )\n plates = sorted(list(set(plates)))\n actual_wavelength_ids = sorted(list(set(actual_wavelength_ids)))\n\n info = (\n \"Listing all plates/channels:\\n\"\n f\"Patterns: {patterns}\\n\"\n f\"Plates: {plates}\\n\"\n f\"Actual wavelength IDs: {actual_wavelength_ids}\\n\"\n )\n\n # Check that a folder includes a single plate\n if len(plates) > 1:\n raise ValueError(f\"{info}ERROR: {len(plates)} plates detected\")\n elif len(plates) == 0:\n raise ValueError(f\"{info}ERROR: No plates detected\")\n original_plate = plates[0]\n plate_prefix = plate_prefixes[0]\n\n # Replace plate with the one of acquisition 0, if needed\n if int(acquisition) > 0:\n plate = dict_acquisitions[\"0\"][\"plate\"]\n logger.warning(\n f\"For {acquisition=}, we replace {original_plate=} with \"\n f\"{plate=} (the one for acquisition 0)\"\n )\n\n # Check that all channels are in the allowed_channels\n allowed_wavelength_ids = [\n c.wavelength_id for c in allowed_channels[acquisition]\n ]\n if not set(actual_wavelength_ids).issubset(\n set(allowed_wavelength_ids)\n ):\n msg = \"ERROR in create_ome_zarr\\n\"\n msg += f\"actual_wavelength_ids: {actual_wavelength_ids}\\n\"\n msg += f\"allowed_wavelength_ids: {allowed_wavelength_ids}\\n\"\n raise ValueError(msg)\n\n # Create actual_channels, i.e. 
a list of the channel dictionaries which\n # are present\n actual_channels = [\n channel\n for channel in allowed_channels[acquisition]\n if channel.wavelength_id in actual_wavelength_ids\n ]\n\n logger.info(f\"plate: {plate}\")\n logger.info(f\"actual_channels: {actual_channels}\")\n\n dict_acquisitions[acquisition] = {}\n dict_acquisitions[acquisition][\"plate\"] = plate\n dict_acquisitions[acquisition][\"original_plate\"] = original_plate\n dict_acquisitions[acquisition][\"plate_prefix\"] = plate_prefix\n dict_acquisitions[acquisition][\"image_folder\"] = in_path\n dict_acquisitions[acquisition][\"original_paths\"] = [in_path]\n dict_acquisitions[acquisition][\"actual_channels\"] = actual_channels\n dict_acquisitions[acquisition][\n \"actual_wavelength_ids\"\n ] = actual_wavelength_ids\n\n acquisitions = sorted(list(dict_acquisitions.keys()))\n current_plates = [item[\"plate\"] for item in dict_acquisitions.values()]\n if len(set(current_plates)) > 1:\n raise ValueError(f\"{current_plates=}\")\n plate = current_plates[0]\n\n zarrurl = dict_acquisitions[acquisitions[0]][\"plate\"] + \".zarr\"\n full_zarrurl = str(Path(output_path) / zarrurl)\n logger.info(f\"Creating {full_zarrurl=}\")\n # Call zarr.open_group wrapper, which handles overwrite=True/False\n group_plate = open_zarr_group_with_overwrite(\n full_zarrurl, overwrite=overwrite\n )\n group_plate.attrs[\"plate\"] = {\n \"acquisitions\": [\n {\n \"id\": int(acquisition),\n \"name\": dict_acquisitions[acquisition][\"original_plate\"],\n }\n for acquisition in acquisitions\n ]\n }\n\n zarrurls: dict[str, list[str]] = {\"well\": [], \"image\": []}\n zarrurls[\"plate\"] = [f\"{plate}.zarr\"]\n\n ################################################################\n logging.info(f\"{acquisitions=}\")\n\n for acquisition in acquisitions:\n\n # Define plate zarr\n image_folder = dict_acquisitions[acquisition][\"image_folder\"]\n logger.info(f\"Looking at {image_folder=}\")\n\n # Obtain FOV-metadata dataframe\n if 
metadata_table_files is None:\n mrf_path = f\"{image_folder}/MeasurementDetail.mrf\"\n mlf_path = f\"{image_folder}/MeasurementData.mlf\"\n site_metadata, total_files = parse_yokogawa_metadata(\n mrf_path, mlf_path, filename_patterns=image_glob_patterns\n )\n site_metadata = remove_FOV_overlaps(site_metadata)\n else:\n site_metadata = pd.read_csv(metadata_table_files[acquisition])\n site_metadata.set_index([\"well_id\", \"FieldIndex\"], inplace=True)\n\n # Extract pixel sizes and bit_depth\n pixel_size_z = site_metadata[\"pixel_size_z\"][0]\n pixel_size_y = site_metadata[\"pixel_size_y\"][0]\n pixel_size_x = site_metadata[\"pixel_size_x\"][0]\n bit_depth = site_metadata[\"bit_depth\"][0]\n\n if min(pixel_size_z, pixel_size_y, pixel_size_x) < 1e-9:\n raise ValueError(pixel_size_z, pixel_size_y, pixel_size_x)\n\n # Identify all wells\n plate_prefix = dict_acquisitions[acquisition][\"plate_prefix\"]\n patterns = [f\"{plate_prefix}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n plate_images = glob_with_multiple_patterns(\n folder=str(image_folder),\n patterns=patterns,\n )\n\n wells = [\n parse_filename(os.path.basename(fn))[\"well\"] for fn in plate_images\n ]\n wells = sorted(list(set(wells)))\n logger.info(f\"{wells=}\")\n\n # Verify that all wells have all channels\n actual_channels = dict_acquisitions[acquisition][\"actual_channels\"]\n for well in wells:\n patterns = [f\"{plate_prefix}_{well}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n well_images = glob_with_multiple_patterns(\n folder=str(image_folder),\n patterns=patterns,\n )\n\n well_wavelength_ids = []\n for fpath in well_images:\n try:\n filename_metadata = parse_filename(os.path.basename(fpath))\n A = filename_metadata[\"A\"]\n C = filename_metadata[\"C\"]\n well_wavelength_ids.append(f\"A{A}_C{C}\")\n except IndexError:\n logger.info(f\"Skipping {fpath}\")\n well_wavelength_ids = 
sorted(list(set(well_wavelength_ids)))\n actual_wavelength_ids = dict_acquisitions[acquisition][\n \"actual_wavelength_ids\"\n ]\n if well_wavelength_ids != actual_wavelength_ids:\n raise ValueError(\n f\"ERROR: well {well} in plate {plate} (prefix: \"\n f\"{plate_prefix}) has missing channels.\\n\"\n f\"Expected: {actual_wavelength_ids}\\n\"\n f\"Found: {well_wavelength_ids}.\\n\"\n )\n\n well_rows_columns = [\n ind for ind in sorted([(n[0], n[1:]) for n in wells])\n ]\n row_list = [\n well_row_column[0] for well_row_column in well_rows_columns\n ]\n col_list = [\n well_row_column[1] for well_row_column in well_rows_columns\n ]\n row_list = sorted(list(set(row_list)))\n col_list = sorted(list(set(col_list)))\n\n plate_attrs = group_plate.attrs[\"plate\"]\n plate_attrs[\"columns\"] = [{\"name\": col} for col in col_list]\n plate_attrs[\"rows\"] = [{\"name\": row} for row in row_list]\n plate_attrs[\"wells\"] = [\n {\n \"path\": well_row_column[0] + \"/\" + well_row_column[1],\n \"rowIndex\": row_list.index(well_row_column[0]),\n \"columnIndex\": col_list.index(well_row_column[1]),\n }\n for well_row_column in well_rows_columns\n ]\n group_plate.attrs[\"plate\"] = plate_attrs\n\n for row, column in well_rows_columns:\n\n try:\n group_well = group_plate.create_group(f\"{row}/{column}/\")\n logging.info(f\"Created new group_well at {row}/{column}/\")\n group_well.attrs[\"well\"] = {\n \"images\": [\n {\n \"path\": f\"{acquisition}\",\n \"acquisition\": int(acquisition),\n }\n ],\n \"version\": __OME_NGFF_VERSION__,\n }\n zarrurls[\"well\"].append(f\"{plate}.zarr/{row}/{column}\")\n except ContainsGroupError:\n group_well = zarr.open_group(\n f\"{full_zarrurl}/{row}/{column}/\", mode=\"r+\"\n )\n logging.info(\n f\"Loaded group_well from {full_zarrurl}/{row}/{column}\"\n )\n current_images = group_well.attrs[\"well\"][\"images\"] + [\n {\"path\": f\"{acquisition}\", \"acquisition\": int(acquisition)}\n ]\n group_well.attrs[\"well\"] = dict(\n images=current_images,\n 
version=group_well.attrs[\"well\"][\"version\"],\n )\n\n group_image = group_well.create_group(\n f\"{acquisition}/\"\n ) # noqa: F841\n logging.info(f\"Created image group {row}/{column}/{acquisition}\")\n image = f\"{plate}.zarr/{row}/{column}/{acquisition}\"\n zarrurls[\"image\"].append(image)\n\n group_image.attrs[\"multiscales\"] = [\n {\n \"version\": __OME_NGFF_VERSION__,\n \"axes\": [\n {\"name\": \"c\", \"type\": \"channel\"},\n {\n \"name\": \"z\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n {\n \"name\": \"y\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n {\n \"name\": \"x\",\n \"type\": \"space\",\n \"unit\": \"micrometer\",\n },\n ],\n \"datasets\": [\n {\n \"path\": f\"{ind_level}\",\n \"coordinateTransformations\": [\n {\n \"type\": \"scale\",\n \"scale\": [\n 1,\n pixel_size_z,\n pixel_size_y\n * coarsening_xy**ind_level,\n pixel_size_x\n * coarsening_xy**ind_level,\n ],\n }\n ],\n }\n for ind_level in range(num_levels)\n ],\n }\n ]\n\n group_image.attrs[\"omero\"] = {\n \"id\": 1, # FIXME does this depend on the plate number?\n \"name\": \"TBD\",\n \"version\": __OME_NGFF_VERSION__,\n \"channels\": define_omero_channels(\n channels=actual_channels,\n bit_depth=bit_depth,\n label_prefix=acquisition,\n ),\n }\n\n # Prepare AnnData tables for FOV/well ROIs\n well_id = row + column\n FOV_ROIs_table = prepare_FOV_ROI_table(site_metadata.loc[well_id])\n well_ROIs_table = prepare_well_ROI_table(\n site_metadata.loc[well_id]\n )\n\n # Write AnnData tables into the `tables` zarr group\n write_table(\n group_image,\n \"FOV_ROI_table\",\n FOV_ROIs_table,\n overwrite=overwrite,\n table_attrs={\"type\": \"roi_table\"},\n )\n write_table(\n group_image,\n \"well_ROI_table\",\n well_ROIs_table,\n overwrite=overwrite,\n table_attrs={\"type\": \"roi_table\"},\n )\n\n # Check that the different images (e.g. 
different cycles) in each well\n # have unique labels\n for well_path in zarrurls[\"well\"]:\n check_well_channel_labels(\n well_zarr_path=str(Path(output_path) / well_path)\n )\n\n original_paths = {\n acquisition: dict_acquisitions[acquisition][\"original_paths\"]\n for acquisition in acquisitions\n }\n\n metadata_update = dict(\n plate=zarrurls[\"plate\"],\n well=zarrurls[\"well\"],\n image=zarrurls[\"image\"],\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n original_paths=original_paths,\n image_extension=image_extension,\n image_glob_patterns=image_glob_patterns,\n )\n return metadata_update\n
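The `multiscales` metadata written by the task above scales the Y and X pixel sizes by `coarsening_xy**ind_level` at each pyramid level, while the channel and Z entries stay fixed. The following is a minimal sketch of that rule in plain Python; `multiscale_datasets` is a hypothetical helper name used only for illustration, and the pixel sizes are made-up values.

```python
def multiscale_datasets(
    pixel_size_z, pixel_size_y, pixel_size_x, num_levels, coarsening_xy
):
    """Hypothetical helper: build the `datasets` list of a multiscales entry."""
    datasets = []
    for ind_level in range(num_levels):
        # Only Y and X get coarser from one level to the next
        factor = coarsening_xy**ind_level
        datasets.append(
            {
                "path": str(ind_level),
                "coordinateTransformations": [
                    {
                        "type": "scale",
                        "scale": [
                            1,  # channel axis
                            pixel_size_z,
                            pixel_size_y * factor,
                            pixel_size_x * factor,
                        ],
                    }
                ],
            }
        )
    return datasets


# Made-up pixel sizes (micrometers), three levels, 2x coarsening in YX
datasets = multiscale_datasets(1.0, 0.5, 0.5, num_levels=3, coarsening_xy=2)
for ds in datasets:
    print(ds["path"], ds["coordinateTransformations"][0]["scale"])
```

With `coarsening_xy=2`, each level doubles the YX pixel size of the previous one and leaves the Z spacing unchanged.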
Corrects a stack of images, using a given illumination profile (e.g. bright in the center of the image, dim outside).
PARAMETER DESCRIPTION img_stack
4D numpy array (czyx), with dummy size along c.
TYPE: ndarray
corr_img
2D numpy array (yx)
TYPE: ndarray
background
Background value that is subtracted from the image before the illumination correction is applied.
TYPE: int DEFAULT: 110
Source code in fractal_tasks_core/tasks/illumination_correction.py
def correct(\n img_stack: np.ndarray,\n corr_img: np.ndarray,\n background: int = 110,\n):\n\"\"\"\n Corrects a stack of images, using a given illumination profile (e.g. bright\n in the center of the image, dim outside).\n\n Args:\n img_stack: 4D numpy array (czyx), with dummy size along c.\n corr_img: 2D numpy array (yx)\n background: Background value that is subtracted from the image before\n the illumination correction is applied.\n \"\"\"\n\n logger.info(f\"Start correct, {img_stack.shape}\")\n\n # Check shapes\n if corr_img.shape != img_stack.shape[2:] or img_stack.shape[0] != 1:\n raise ValueError(\n \"Error in illumination_correction:\\n\"\n f\"{img_stack.shape=}\\n{corr_img.shape=}\"\n )\n\n # Store info about dtype\n dtype = img_stack.dtype\n dtype_max = np.iinfo(dtype).max\n\n # Background subtraction\n img_stack[img_stack <= background] = 0\n img_stack[img_stack > background] -= background\n\n # Apply the normalized correction matrix (requires a float array)\n # img_stack = img_stack.astype(np.float64)\n new_img_stack = img_stack / (corr_img / np.max(corr_img))[None, None, :, :]\n\n # Handle edge case: corrected image may have values beyond the limit of\n # the encoding, e.g. beyond 65535 for 16bit images. This clips values\n # that surpass this limit and triggers a warning\n if np.sum(new_img_stack > dtype_max) > 0:\n warnings.warn(\n \"Illumination correction created values beyond the max range of \"\n f\"the current image type. These have been clipped to {dtype_max=}.\"\n )\n new_img_stack[new_img_stack > dtype_max] = dtype_max\n\n logger.info(\"End correct\")\n\n # Cast back to original dtype and return\n return new_img_stack.astype(dtype)\n
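The three steps of `correct` (background subtraction, division by the normalized profile, clipping to the dtype range) can be tried on a toy array. This is a simplified sketch, not the task's implementation: `correct_sketch` is a hypothetical name, it works on a float copy of the input, and it omits the shape checks, logging and clipping warning.

```python
import numpy as np


def correct_sketch(img_stack, corr_img, background=110):
    """Hypothetical, simplified variant of the correction shown above."""
    dtype = img_stack.dtype
    dtype_max = np.iinfo(dtype).max

    # Background subtraction, with values at or below the background set to 0
    img = img_stack.astype(np.float64) - background
    img = np.clip(img, 0, None)

    # Divide by the profile, normalized so that its maximum is 1
    img = img / (corr_img / np.max(corr_img))[None, None, :, :]

    # Clip to the range of the original dtype, then cast back
    return np.clip(img, 0, dtype_max).astype(dtype)


# Toy data: a (c=1, z=2, y=2, x=2) stack and a profile that is dimmer
# away from the top-left pixel
stack = np.full((1, 2, 2, 2), 210, dtype=np.uint16)
profile = np.array([[1.0, 0.5], [0.5, 0.25]])
out = correct_sketch(stack, profile)
print(out[0, 0])
```

Pixels where the profile is dimmer are scaled up more strongly, which is why the result may exceed the dtype range and need clipping.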
Applies illumination correction to the images in the OME-Zarr.
PARAMETER DESCRIPTION input_paths
List of input paths where the image data is stored as OME-Zarrs. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: [\"/some/path/\"]. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
Path where the output of this task is stored. Examples: \"/some/path/\" => puts the new OME-Zarr file in the same folder as the input OME-Zarr file; \"/some/new_path\" => puts the new OME-Zarr file into a new folder at /some/new_path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
component
Path to the OME-Zarr image in the OME-Zarr plate that is processed. Example: \"some_plate.zarr/B/03/0\". (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
illumination_profiles_folder
Path of folder of illumination profiles.
TYPE: str
dict_corr
Dictionary where keys match the wavelength_id attributes of existing channels (e.g. A01_C01) and values are the filenames of the corresponding illumination profiles.
TYPE: dict[str, str]
background
Background value that is subtracted from the image before the illumination correction is applied. Set it to 0 if you don't want any background subtraction.
TYPE: int DEFAULT: 110
overwrite_input
If True, the results of this task will overwrite the input image data. In the current version, overwrite_input=False is not implemented.
TYPE: bool DEFAULT: True
new_component
Not implemented yet. Fractal server does not currently support this well, and it is unclear how a user would specify suitable new components. If the results should not overwrite the input data and the output path is the same as the input path, a new component must be provided. Example: myplate_new_name.zarr/B/03/0/.
TYPE: Optional[str] DEFAULT: None
Source code in fractal_tasks_core/tasks/illumination_correction.py
@validate_arguments\ndef illumination_correction(\n *,\n # Standard arguments\n input_paths: Sequence[str],\n output_path: str,\n component: str,\n metadata: dict[str, Any],\n # Task-specific arguments\n illumination_profiles_folder: str,\n dict_corr: dict[str, str],\n background: int = 110,\n overwrite_input: bool = True,\n new_component: Optional[str] = None,\n) -> dict[str, Any]:\n\n\"\"\"\n Applies illumination correction to the images in the OME-Zarr.\n\n Args:\n input_paths: List of input paths where the image data is stored as\n OME-Zarrs. Should point to the parent folder containing one or many\n OME-Zarr files, not the actual OME-Zarr file. Example:\n `[\"/some/path/\"]`. This task only supports a single input path.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: Path were the output of this task is stored. Examples:\n `\"/some/path/\"` => puts the new OME-Zarr file in the same folder as\n the input OME-Zarr file; `\"/some/new_path\"` => puts the new\n OME-Zarr file into a new folder at `/some/new_path`.\n (standard argument for Fractal tasks, managed by Fractal server).\n component: Path to the OME-Zarr image in the OME-Zarr plate that is\n processed. Example: `\"some_plate.zarr/B/03/0\"`.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n illumination_profiles_folder: Path of folder of illumination profiles.\n dict_corr: Dictionary where keys match the `wavelength_id` attributes\n of existing channels (e.g. `A01_C01` ) and values are the\n filenames of the corresponding illumination profiles.\n background: Background value that is subtracted from the image before\n the illumination correction is applied. Set it to `0` if you don't\n want any background subtraction.\n overwrite_input:\n If `True`, the results of this task will overwrite the input image\n data. 
In the current version, `overwrite_input=False` is not\n implemented.\n new_component: Not implemented yet. This is not implemented well in\n Fractal server at the moment, it's unclear how a user would specify\n fitting new components. If the results shall not overwrite the\n input data and the output path is the same as the input path, a new\n component needs to be provided.\n Example: `myplate_new_name.zarr/B/03/0/`.\n \"\"\"\n\n # Preliminary checks\n if len(input_paths) > 1:\n raise NotImplementedError\n if (overwrite_input and new_component is not None) or (\n new_component is None and not overwrite_input\n ):\n raise ValueError(f\"{overwrite_input=}, but {new_component=}\")\n\n if not overwrite_input:\n msg = (\n \"We still have to harmonize illumination_correction(\"\n \"overwrite_input=False) with replicate_zarr_structure(..., \"\n \"suffix=..)\"\n )\n raise NotImplementedError(msg)\n\n # Defione old/new zarrurls\n plate, well = component.split(\".zarr/\")\n in_path = Path(input_paths[0])\n zarrurl_old = (in_path / component).as_posix()\n if overwrite_input:\n zarrurl_new = zarrurl_old\n else:\n new_plate, new_well = new_component.split(\".zarr/\")\n if new_well != well:\n raise ValueError(f\"{well=}, {new_well=}\")\n zarrurl_new = (Path(output_path) / new_component).as_posix()\n\n t_start = time.perf_counter()\n logger.info(\"Start illumination_correction\")\n logger.info(f\" {overwrite_input=}\")\n logger.info(f\" {zarrurl_old=}\")\n logger.info(f\" {zarrurl_new=}\")\n\n # Read attributes from NGFF metadata\n ngff_image_meta = load_NgffImageMeta(zarrurl_old)\n num_levels = ngff_image_meta.num_levels\n coarsening_xy = ngff_image_meta.coarsening_xy\n full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n logger.info(f\"NGFF image has {num_levels=}\")\n logger.info(f\"NGFF image has {coarsening_xy=}\")\n logger.info(\n f\"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}\"\n )\n\n # Read channels from .zattrs\n channels: 
list[OmeroChannel] = get_omero_channel_list(\n image_zarr_path=zarrurl_old\n )\n num_channels = len(channels)\n\n # Read FOV ROIs\n FOV_ROI_table = ad.read_zarr(f\"{zarrurl_old}/tables/FOV_ROI_table\")\n\n # Create list of indices for 3D FOVs spanning the entire Z direction\n list_indices = convert_ROI_table_to_indices(\n FOV_ROI_table,\n level=0,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices, \"FOV_ROI_table\")\n\n # Extract image size from FOV-ROI indices. Note: this works at level=0,\n # where FOVs should all be of the exact same size (in pixels)\n ref_img_size = None\n for indices in list_indices:\n img_size = (indices[3] - indices[2], indices[5] - indices[4])\n if ref_img_size is None:\n ref_img_size = img_size\n else:\n if img_size != ref_img_size:\n raise ValueError(\n \"ERROR: inconsistent image sizes in list_indices\"\n )\n img_size_y, img_size_x = img_size[:]\n\n # Assemble dictionary of matrices and check their shapes\n corrections = {}\n for channel in channels:\n wavelength_id = channel.wavelength_id\n corrections[wavelength_id] = imread(\n (\n Path(illumination_profiles_folder) / dict_corr[wavelength_id]\n ).as_posix()\n )\n if corrections[wavelength_id].shape != (img_size_y, img_size_x):\n raise ValueError(\n \"Error in illumination_correction, \"\n \"correction matrix has wrong shape.\"\n )\n\n # Lazily load highest-res level from original zarr array\n data_czyx = da.from_zarr(f\"{zarrurl_old}/0\")\n\n # Create zarr for output\n if overwrite_input:\n fov_path = zarrurl_old\n new_zarr = zarr.open(f\"{zarrurl_old}/0\")\n else:\n fov_path = zarrurl_new\n new_zarr = zarr.create(\n shape=data_czyx.shape,\n chunks=data_czyx.chunksize,\n dtype=data_czyx.dtype,\n store=zarr.storage.FSStore(f\"{zarrurl_new}/0\"),\n overwrite=False,\n dimension_separator=\"/\",\n )\n\n # Iterate over FOV ROIs\n num_ROIs = len(list_indices)\n for i_c, channel in enumerate(channels):\n for i_ROI, 
indices in enumerate(list_indices):\n # Define region\n s_z, e_z, s_y, e_y, s_x, e_x = indices[:]\n region = (\n slice(i_c, i_c + 1),\n slice(s_z, e_z),\n slice(s_y, e_y),\n slice(s_x, e_x),\n )\n logger.info(\n f\"Now processing ROI {i_ROI+1}/{num_ROIs} \"\n f\"for channel {i_c+1}/{num_channels}\"\n )\n # Execute illumination correction\n corrected_fov = correct(\n data_czyx[region].compute(),\n corrections[channel.wavelength_id],\n background=background,\n )\n # Write to disk\n da.array(corrected_fov).to_zarr(\n url=new_zarr,\n region=region,\n compute=True,\n )\n\n # Starting from on-disk highest-resolution data, build and write to disk a\n # pyramid of coarser levels\n build_pyramid(\n zarrurl=fov_path,\n overwrite=True,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=data_czyx.chunksize,\n )\n\n t_end = time.perf_counter()\n logger.info(f\"End illumination_correction, elapsed: {t_end-t_start}\")\n\n return {}\n
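The interplay of `overwrite_input`, `new_component` and the input/output paths can be sketched in isolation. The function below is a hypothetical helper (`resolve_zarr_urls` is not part of the package) that mirrors only the consistency check and the path logic. Note that the task itself currently raises `NotImplementedError` when `overwrite_input=False`, so the second call illustrates the intended behavior rather than something the task supports today.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_zarr_urls(
    input_path: str,
    output_path: str,
    component: str,
    overwrite_input: bool = True,
    new_component: Optional[str] = None,
):
    """Hypothetical helper: return (zarrurl_old, zarrurl_new)."""
    # Exactly one of "overwrite the input" / "write to a new component"
    if overwrite_input == (new_component is not None):
        raise ValueError(f"{overwrite_input=}, but {new_component=}")

    zarrurl_old = (PurePosixPath(input_path) / component).as_posix()
    if overwrite_input:
        return zarrurl_old, zarrurl_old

    # The well/image part must be identical in old and new components
    _, well = component.split(".zarr/")
    _, new_well = new_component.split(".zarr/")
    if new_well != well:
        raise ValueError(f"{well=}, {new_well=}")
    zarrurl_new = (PurePosixPath(output_path) / new_component).as_posix()
    return zarrurl_old, zarrurl_new


in_place = resolve_zarr_urls(
    "/some/path/", "/some/path/", "some_plate.zarr/B/03/0"
)
to_new = resolve_zarr_urls(
    "/some/path/",
    "/some/new_path",
    "some_plate.zarr/B/03/0",
    overwrite_input=False,
    new_component="myplate_new_name.zarr/B/03/0",
)
print(in_place)
print(to_new)
```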
Creates the appropriate components-related metadata, needed for processing an existing OME-Zarr through Fractal.
Optionally adds new ROI tables to the existing OME-Zarr.
PARAMETER DESCRIPTION input_paths
A length-one list with the parent folder of the OME-Zarr to be imported; e.g. input_paths=[\"/somewhere\"], if the OME-Zarr path is /somewhere/array.zarr. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
Not used in this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
Not used in this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
zarr_name
The OME-Zarr name, without its parent folder; e.g. zarr_name=\"array.zarr\", if the OME-Zarr path is /somewhere/array.zarr.
TYPE: str
add_image_ROI_table
Whether to add an image_ROI_table table to each image, with a single ROI covering the whole image.
TYPE: bool DEFAULT: True
add_grid_ROI_table
Whether to add a grid_ROI_table table to each image, with the image split into a rectangular grid of ROIs.
TYPE: bool DEFAULT: True
grid_y_shape
Y shape of the ROI grid in grid_ROI_table.
TYPE: int DEFAULT: 2
grid_x_shape
X shape of the ROI grid in grid_ROI_table.
TYPE: int DEFAULT: 2
update_omero_metadata
Whether to update Omero-channels metadata, to make them Fractal-compatible.
TYPE: bool DEFAULT: True
overwrite
Whether new ROI tables (added when add_image_ROI_table and/or add_grid_ROI_table are True) can overwrite existing ones.
TYPE: bool DEFAULT: False
Source code in fractal_tasks_core/tasks/import_ome_zarr.py
@validate_arguments\ndef import_ome_zarr(\n *,\n input_paths: Sequence[str],\n output_path: str,\n metadata: dict[str, Any],\n zarr_name: str,\n add_image_ROI_table: bool = True,\n add_grid_ROI_table: bool = True,\n grid_y_shape: int = 2,\n grid_x_shape: int = 2,\n update_omero_metadata: bool = True,\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Import an OME-Zarr into Fractal.\n\n The current version of this task:\n\n 1. Creates the appropriate components-related metadata, needed for\n processing an existing OME-Zarr through Fractal.\n 2. Optionally adds new ROI tables to the existing OME-Zarr.\n\n Args:\n input_paths: A length-one list with the parent folder of the OME-Zarr\n to be imported; e.g. `input_paths=[\"/somewhere\"]`, if the OME-Zarr\n path is `/somewhere/array.zarr`.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: Not used in this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: Not used in this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n zarr_name: The OME-Zarr name, without its parent folder; e.g.\n `zarr_name=\"array.zarr\"`, if the OME-Zarr path is\n `/somewhere/array.zarr`.\n add_image_ROI_table: Whether to add a `image_ROI_table` table to each\n image, with a single ROI covering the whole image.\n add_grid_ROI_table: Whether to add a `grid_ROI_table` table to each\n image, with the image split into a rectangular grid of ROIs.\n grid_y_shape: Y shape of the ROI grid in `grid_ROI_table`.\n grid_x_shape: X shape of the ROI grid in `grid_ROI_table`.\n update_omero_metadata: Whether to update Omero-channels metadata, to\n make them Fractal-compatible.\n overwrite: Whether new ROI tables (added when `add_image_ROI_table`\n and/or `add_grid_ROI_table` are `True`) can overwite existing ones.\n \"\"\"\n\n # Preliminary checks\n if len(input_paths) > 1:\n raise NotImplementedError\n\n zarr_path = str(Path(input_paths[0]) / zarr_name)\n 
logger.info(f\"Zarr path: {zarr_path}\")\n\n zarrurls: dict = dict(plate=[], well=[], image=[])\n\n root_group = zarr.open_group(zarr_path, mode=\"r\")\n ngff_type = detect_ome_ngff_type(root_group)\n grid_YX_shape = (grid_y_shape, grid_x_shape)\n\n if ngff_type == \"plate\":\n zarrurls[\"plate\"].append(zarr_name)\n for well in root_group.attrs[\"plate\"][\"wells\"]:\n well_path = well[\"path\"]\n zarrurls[\"well\"].append(f\"{zarr_name}/{well_path}\")\n\n well_group = zarr.open_group(zarr_path, path=well_path, mode=\"r\")\n for image in well_group.attrs[\"well\"][\"images\"]:\n image_path = image[\"path\"]\n zarrurls[\"image\"].append(\n f\"{zarr_name}/{well_path}/{image_path}\"\n )\n _process_single_image(\n f\"{zarr_path}/{well_path}/{image_path}\",\n add_image_ROI_table,\n add_grid_ROI_table,\n update_omero_metadata,\n grid_YX_shape=grid_YX_shape,\n overwrite=overwrite,\n )\n elif ngff_type == \"well\":\n zarrurls[\"well\"].append(zarr_name)\n logger.warning(\n \"Only OME-Zarr for plates are fully supported in Fractal; \"\n f\"e.g. the current one ({ngff_type=}) cannot be \"\n \"processed via the `maximum_intensity_projection` task.\"\n )\n for image in root_group.attrs[\"well\"][\"images\"]:\n image_path = image[\"path\"]\n zarrurls[\"image\"].append(f\"{zarr_name}/{image_path}\")\n _process_single_image(\n f\"{zarr_path}/{image_path}\",\n add_image_ROI_table,\n add_grid_ROI_table,\n update_omero_metadata,\n grid_YX_shape=grid_YX_shape,\n overwrite=overwrite,\n )\n elif ngff_type == \"image\":\n zarrurls[\"image\"].append(zarr_name)\n logger.warning(\n \"Only OME-Zarr for plates are fully supported in Fractal; \"\n f\"e.g. 
the current one ({ngff_type=}) cannot be \"\n \"processed via the `maximum_intensity_projection` task.\"\n )\n _process_single_image(\n zarr_path,\n add_image_ROI_table,\n add_grid_ROI_table,\n update_omero_metadata,\n grid_YX_shape=grid_YX_shape,\n overwrite=overwrite,\n )\n\n # Remove zarrurls keys pointing to empty lists\n clean_zarrurls = {\n key: value for key, value in zarrurls.items() if len(value) > 0\n }\n\n return clean_zarrurls\n
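To make `grid_y_shape` and `grid_x_shape` concrete, the sketch below splits an image of a given pixel size into a rectangular grid of ROIs. It is an illustration only: `grid_rois` is a hypothetical function, it works in pixel units, and the integer-division handling of sizes that do not divide evenly is an assumption, not necessarily what the task does when it builds `grid_ROI_table`.

```python
def grid_rois(size_y, size_x, grid_y_shape=2, grid_x_shape=2):
    """Hypothetical helper: (start_y, end_y, start_x, end_x) bounds per ROI."""
    rois = []
    for iy in range(grid_y_shape):
        # Integer bounds; the last row/column always ends at the image edge
        start_y = iy * size_y // grid_y_shape
        end_y = (iy + 1) * size_y // grid_y_shape
        for ix in range(grid_x_shape):
            start_x = ix * size_x // grid_x_shape
            end_x = (ix + 1) * size_x // grid_x_shape
            rois.append((start_y, end_y, start_x, end_x))
    return rois


# Default 2x2 grid on a 100 (Y) by 200 (X) pixel image
rois = grid_rois(100, 200)
for roi in rois:
    print(roi)
```

With the default shape of 2 by 2, each image gets four ROIs that tile it without overlap.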
Perform maximum-intensity projection along Z axis.
Note: this task stores the output in a new zarr file.
PARAMETER DESCRIPTION input_paths
This parameter is not used by this task. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
Path where the output of this task is stored. Example: \"/some/path/\" => puts the new OME-Zarr file in that folder. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
component
Path to the OME-Zarr image in the OME-Zarr plate that is processed. The component is typically updated beforehand by the copy_ome_zarr task, to point to a new MIP Zarr file. Example: \"some_plate_mip.zarr/B/03/0\". (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
Dictionary containing metadata about the OME-Zarr. This task requires the key copy_ome_zarr to be present in the metadata (as defined in copy_ome_zarr task). (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
Source code in fractal_tasks_core/tasks/maximum_intensity_projection.py
@validate_arguments\ndef maximum_intensity_projection(\n *,\n input_paths: Sequence[str],\n output_path: str,\n component: str,\n metadata: dict[str, Any],\n overwrite: bool = False,\n) -> dict[str, Any]:\n\"\"\"\n Perform maximum-intensity projection along Z axis.\n\n Note: this task stores the output in a new zarr file.\n\n Args:\n input_paths: This parameter is not used by this task.\n This task only supports a single input path.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: Path were the output of this task is stored.\n Example: `\"/some/path/\"` => puts the new OME-Zarr file in that\n folder.\n (standard argument for Fractal tasks, managed by Fractal server).\n component: Path to the OME-Zarr image in the OME-Zarr plate that\n is processed. Component is typically changed by the `copy_ome_zarr`\n task before, to point to a new mip Zarr file.\n Example: `\"some_plate_mip.zarr/B/03/0\"`.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: Dictionary containing metadata about the OME-Zarr.\n This task requires the key `copy_ome_zarr` to be present in the\n metadata (as defined in `copy_ome_zarr` task).\n (standard argument for Fractal tasks, managed by Fractal server).\n overwrite: If `True`, overwrite the task output.\n \"\"\"\n\n # Preliminary checks\n if len(input_paths) > 1:\n raise NotImplementedError\n\n plate, well = component.split(\".zarr/\")\n zarrurl_old = metadata[\"copy_ome_zarr\"][\"sources\"][plate] + \"/\" + well\n clean_output_path = Path(output_path).resolve()\n zarrurl_new = (clean_output_path / component).as_posix()\n logger.info(f\"{zarrurl_old=}\")\n logger.info(f\"{zarrurl_new=}\")\n\n # Read some parameters from metadata\n ngff_image = load_NgffImageMeta(zarrurl_old)\n num_levels = ngff_image.num_levels\n coarsening_xy = ngff_image.coarsening_xy\n\n # Load 0-th level\n data_czyx = da.from_zarr(zarrurl_old + \"/0\")\n num_channels = data_czyx.shape[0]\n chunksize_y = 
data_czyx.chunksize[-2]\n chunksize_x = data_czyx.chunksize[-1]\n logger.info(f\"{num_channels=}\")\n logger.info(f\"{chunksize_y=}\")\n logger.info(f\"{chunksize_x=}\")\n # Loop over channels\n accumulate_chl = []\n for ind_ch in range(num_channels):\n # Perform MIP for each channel of level 0\n mip_yx = da.stack([da.max(data_czyx[ind_ch], axis=0)], axis=0)\n accumulate_chl.append(mip_yx)\n accumulated_array = da.stack(accumulate_chl, axis=0)\n\n # Write to disk (triggering execution)\n try:\n accumulated_array.to_zarr(\n f\"{zarrurl_new}/0\",\n overwrite=overwrite,\n dimension_separator=\"/\",\n write_empty_chunks=False,\n )\n except ContainsArrayError as e:\n error_msg = (\n f\"Cannot write array to zarr group at '{zarrurl_new}/0', \"\n f\"with {overwrite=} (original error: {str(e)}).\\n\"\n \"Hint: try setting overwrite=True.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n\n # Starting from on-disk highest-resolution data, build and write to disk a\n # pyramid of coarser levels\n build_pyramid(\n zarrurl=zarrurl_new,\n overwrite=overwrite,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=(1, 1, chunksize_y, chunksize_x),\n )\n\n return {}\n
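The projection itself is a maximum over the Z axis of a `czyx` array, keeping a dummy Z axis of size one so that the output stays four-dimensional. The task above does this lazily with dask, channel by channel; the following NumPy sketch shows the same operation on a small in-memory array with made-up values.

```python
import numpy as np

# Toy czyx array: 2 channels, 3 Z planes, 2x2 pixels
data_czyx = np.arange(2 * 3 * 2 * 2).reshape(2, 3, 2, 2)

# Maximum along Z (axis 1); keepdims=True preserves a Z axis of size 1
mip_czyx = data_czyx.max(axis=1, keepdims=True)

print(mip_czyx.shape)
print(mip_czyx[0, 0])
```

Because the values increase with Z in this toy array, each channel's projection equals its last Z plane.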
Encapsulates features that are out-of-scope for the current wrapper task.
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper.py
class OutOfTaskScopeError(NotImplementedError):\n\"\"\"\n Encapsulates features that are out-of-scope for the current wrapper task.\n \"\"\"\n\n pass\n
List of input paths where the image data is stored as OME-Zarrs. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: [\"/some/path/\"]. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
component
Path to the OME-Zarr image in the OME-Zarr plate that is processed. Example: \"some_plate.zarr/B/03/0\". (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
workflow_file
Absolute path to the napari-workflows YAML file.
TYPE: str
input_specs
A dictionary of NapariWorkflowsInput values.
TYPE: dict[str, NapariWorkflowsInput]
output_specs
A dictionary of NapariWorkflowsOutput values.
TYPE: dict[str, NapariWorkflowsOutput]
input_ROI_table
Name of the ROI table over which the task loops to apply napari workflows. Examples: FOV_ROI_table => loop over the fields of view; organoid_ROI_table => loop over the organoid ROI table (generated by another task); well_ROI_table => process the whole well as one image.
TYPE: str DEFAULT: 'FOV_ROI_table'
level
Pyramid level of the image to be used as input for napari-workflows. Choose 0 to process at full resolution. Levels > 0 are currently only supported for workflows that only have intensity images as input and only produce label images as output.
TYPE: int DEFAULT: 0
relabeling
If True, apply relabeling so that label values are unique across all ROIs in the well.
TYPE: bool DEFAULT: True
expected_dimensions
Expected dimensions (either 2 or 3). Useful when loading 2D images that are stored in a 3D array with shape (1, size_x, size_y) [which is the default way Fractal stores 2D images], but you want to make sure the napari workflow gets a 2D array to process. Also useful to set to 2 when loading a 2D OME-Zarr that is saved as (size_x, size_y).
TYPE: int DEFAULT: 3
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: True
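As a rough illustration of how the arguments above fit together, the following sketch assembles a keyword-argument dictionary for the task. All paths, the channel label, and the label and table names are made-up placeholders. The keys of input_specs and output_specs are assumed to match the roots and leaves of the napari workflow, and label_name is included in the dataframe output because the NapariWorkflowsOutput model lists it as a required str. The actual call is only shown in a comment, since it needs fractal-tasks-core and an existing OME-Zarr.

```python
# Hypothetical values throughout: adapt paths and names to your own data.
input_specs = {
    # Image input, selected by channel label (a wavelength_id would also work)
    "in_1": {"type": "image", "channel": {"label": "DAPI"}},
}
output_specs = {
    # Label output produced by the workflow
    "out_1": {"type": "label", "label_name": "label_DAPI_new"},
    # Measurement table that refers to the label above
    "out_2": {
        "type": "dataframe",
        "label_name": "label_DAPI_new",
        "table_name": "measurements",
    },
}

task_kwargs = dict(
    input_paths=["/some/path/"],  # a single parent folder of the OME-Zarr
    output_path="/some/path/",  # not used by this task
    component="some_plate.zarr/B/03/0",
    metadata={},  # not used by this task
    workflow_file="/some/path/workflow.yaml",
    input_specs=input_specs,
    output_specs=output_specs,
    input_ROI_table="FOV_ROI_table",
    level=0,  # dataframe outputs are only accepted at full resolution
    relabeling=True,
    expected_dimensions=3,
    overwrite=True,
)

# With fractal-tasks-core installed, the call would then look like:
# napari_workflows_wrapper(**task_kwargs)
```

Keeping level=0 here matters because the task rejects level > 0 for anything other than an image-to-label workflow.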
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper.py
@validate_arguments\ndef napari_workflows_wrapper(\n *,\n # Default arguments for fractal tasks:\n input_paths: Sequence[str],\n output_path: str,\n component: str,\n metadata: dict[str, Any],\n # Task-specific arguments:\n workflow_file: str,\n input_specs: dict[str, NapariWorkflowsInput],\n output_specs: dict[str, NapariWorkflowsOutput],\n input_ROI_table: str = \"FOV_ROI_table\",\n level: int = 0,\n relabeling: bool = True,\n expected_dimensions: int = 3,\n overwrite: bool = True,\n):\n\"\"\"\n Run a napari-workflow on the ROIs of a single OME-NGFF image.\n\n This task takes images and labels and runs a napari-workflow on them that\n can produce a label and tables as output.\n\n Examples of allowed entries for `input_specs` and `output_specs`:\n\n ```\n input_specs = {\n \"in_1\": {\"type\": \"image\", \"channel\": {\"wavelength_id\": \"A01_C02\"}},\n \"in_2\": {\"type\": \"image\", \"channel\": {\"label\": \"DAPI\"}},\n \"in_3\": {\"type\": \"label\", \"label_name\": \"label_DAPI\"},\n }\n\n output_specs = {\n \"out_1\": {\"type\": \"label\", \"label_name\": \"label_DAPI_new\"},\n \"out_2\": {\"type\": \"dataframe\", \"table_name\": \"measurements\"},\n }\n ```\n\n Args:\n input_paths: List of input paths where the image data is stored as\n OME-Zarrs. 
Should point to the parent folder containing one or many\n OME-Zarr files, not the actual OME-Zarr file.\n Example: `[\"/some/path/\"]`.\n his task only supports a single input path.\n (standard argument for Fractal tasks, managed by Fractal server).\n output_path: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n component: Path to the OME-Zarr image in the OME-Zarr plate that is\n processed.\n Example: `\"some_plate.zarr/B/03/0\"`.\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: This parameter is not used by this task.\n (standard argument for Fractal tasks, managed by Fractal server).\n workflow_file: Absolute path to napari-workflows YAML file\n input_specs: A dictionary of `NapariWorkflowsInput` values.\n output_specs: A dictionary of `NapariWorkflowsOutput` values.\n input_ROI_table: Name of the ROI table over which the task loops to\n apply napari workflows.\n Examples:\n `FOV_ROI_table`\n => loop over the field of views;\n `organoid_ROI_table`\n => loop over the organoid ROI table (generated by another task);\n `well_ROI_table`\n => process the whole well as one image.\n level: Pyramid level of the image to be used as input for\n napari-workflows. Choose `0` to process at full resolution.\n Levels > 0 are currently only supported for workflows that only\n have intensity images as input and only produce a label images as\n output.\n relabeling: If `True`, apply relabeling so that label values are\n unique across all ROIs in the well.\n expected_dimensions: Expected dimensions (either `2` or `3`). Useful\n when loading 2D images that are stored in a 3D array with shape\n `(1, size_x, size_y)` [which is the default way Fractal stores 2D\n images], but you want to make sure the napari workflow gets a 2D\n array to process. 
Also useful to set to `2` when loading a 2D\n OME-Zarr that is saved as `(size_x, size_y)`.\n overwrite: If `True`, overwrite the task output.\n \"\"\"\n wf: napari_workflows.Worfklow = load_workflow(workflow_file)\n logger.info(f\"Loaded workflow from {workflow_file}\")\n\n # Validation of input/output specs\n if not (set(wf.leafs()) <= set(output_specs.keys())):\n msg = f\"Some item of {wf.leafs()=} is not part of {output_specs=}.\"\n logger.warning(msg)\n if not (set(wf.roots()) <= set(input_specs.keys())):\n msg = f\"Some item of {wf.roots()=} is not part of {input_specs=}.\"\n logger.error(msg)\n raise ValueError(msg)\n list_outputs = sorted(output_specs.keys())\n\n # Characterization of workflow and scope restriction\n input_types = [in_params.type for (name, in_params) in input_specs.items()]\n output_types = [\n out_params.type for (name, out_params) in output_specs.items()\n ]\n are_inputs_all_images = set(input_types) == {\"image\"}\n are_outputs_all_labels = set(output_types) == {\"label\"}\n are_outputs_all_dataframes = set(output_types) == {\"dataframe\"}\n is_labeling_workflow = are_inputs_all_images and are_outputs_all_labels\n is_measurement_only_workflow = are_outputs_all_dataframes\n # Level-related constraint\n logger.info(f\"This workflow acts at {level=}\")\n logger.info(\n f\"Is the current workflow a labeling one? {is_labeling_workflow}\"\n )\n if level > 0 and not is_labeling_workflow:\n msg = (\n f\"{level=}>0 is currently only accepted for labeling workflows, \"\n \"i.e. 
those going from image(s) to label(s)\"\n )\n logger.error(msg)\n raise OutOfTaskScopeError(msg)\n # Relabeling-related (soft) constraint\n if is_measurement_only_workflow and relabeling:\n logger.warning(\n \"This is a measurement-output-only workflow, setting \"\n \"relabeling=False.\"\n )\n relabeling = False\n if relabeling:\n max_label_for_relabeling = 0\n\n # Pre-processing of task inputs\n if len(input_paths) > 1:\n raise NotImplementedError(\n \"We currently only support a single input path\"\n )\n in_path = Path(input_paths[0]).as_posix()\n label_dtype = np.uint32\n\n # Read ROI table\n zarrurl = f\"{in_path}/{component}\"\n ROI_table = ad.read_zarr(f\"{in_path}/{component}/tables/{input_ROI_table}\")\n\n # Load image metadata\n ngff_image_meta = load_NgffImageMeta(zarrurl)\n num_levels = ngff_image_meta.num_levels\n coarsening_xy = ngff_image_meta.coarsening_xy\n\n # Read pixel sizes from zattrs file\n full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n\n # Create list of indices for 3D FOVs spanning the entire Z direction\n list_indices = convert_ROI_table_to_indices(\n ROI_table,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices, input_ROI_table)\n num_ROIs = len(list_indices)\n logger.info(\n f\"Completed reading ROI table {input_ROI_table},\"\n f\" found {num_ROIs} ROIs.\"\n )\n\n # Input preparation: \"image\" type\n image_inputs = [\n (name, in_params)\n for (name, in_params) in input_specs.items()\n if in_params.type == \"image\"\n ]\n input_image_arrays = {}\n if image_inputs:\n img_array = da.from_zarr(f\"{in_path}/{component}/{level}\")\n # Loop over image inputs and assign corresponding channel of the image\n for (name, params) in image_inputs:\n channel = get_channel_from_image_zarr(\n image_zarr_path=f\"{in_path}/{component}\",\n wavelength_id=params.channel.wavelength_id,\n label=params.channel.label,\n )\n channel_index = 
channel.index\n input_image_arrays[name] = img_array[channel_index]\n\n # Handle dimensions\n shape = input_image_arrays[name].shape\n if expected_dimensions == 3 and shape[0] == 1:\n logger.warning(\n f\"Input {name} has shape {shape} \"\n f\"but {expected_dimensions=}\"\n )\n if expected_dimensions == 2:\n if len(shape) == 2:\n # We already load the data as a 2D array\n pass\n elif shape[0] == 1:\n input_image_arrays[name] = input_image_arrays[name][\n 0, :, :\n ]\n else:\n msg = (\n f\"Input {name} has shape {shape} \"\n f\"but {expected_dimensions=}\"\n )\n logger.error(msg)\n raise ValueError(msg)\n logger.info(f\"Prepared input with {name=} and {params=}\")\n logger.info(f\"{input_image_arrays=}\")\n\n # Input preparation: \"label\" type\n label_inputs = [\n (name, in_params)\n for (name, in_params) in input_specs.items()\n if in_params.type == \"label\"\n ]\n if label_inputs:\n # Set target_shape for upscaling labels\n if not image_inputs:\n logger.warning(\n f\"{len(label_inputs)=} but num_image_inputs=0. 
\"\n \"Label array(s) will not be upscaled.\"\n )\n upscale_labels = False\n else:\n target_shape = list(input_image_arrays.values())[0].shape\n upscale_labels = True\n # Loop over label inputs and load corresponding (upscaled) image\n input_label_arrays = {}\n for (name, params) in label_inputs:\n label_name = params.label_name\n label_array_raw = da.from_zarr(\n f\"{in_path}/{component}/labels/{label_name}/{level}\"\n )\n input_label_arrays[name] = label_array_raw\n\n # Handle dimensions\n shape = input_label_arrays[name].shape\n if expected_dimensions == 3 and shape[0] == 1:\n logger.warning(\n f\"Input {name} has shape {shape} \"\n f\"but {expected_dimensions=}\"\n )\n if expected_dimensions == 2:\n if len(shape) == 2:\n # We already load the data as a 2D array\n pass\n elif shape[0] == 1:\n input_label_arrays[name] = input_label_arrays[name][\n 0, :, :\n ]\n else:\n msg = (\n f\"Input {name} has shape {shape} \"\n f\"but {expected_dimensions=}\"\n )\n logger.error(msg)\n raise ValueError(msg)\n\n if upscale_labels:\n # Check that dimensionality matches the image\n if len(input_label_arrays[name].shape) != len(target_shape):\n raise ValueError(\n f\"Label {name} has shape \"\n f\"{input_label_arrays[name].shape}. \"\n \"But the corresponding image has shape \"\n f\"{target_shape}. Those dimensionalities do not \"\n f\"match. 
Is {expected_dimensions=} the correct \"\n \"setting?\"\n )\n if expected_dimensions == 3:\n upscaling_axes = [1, 2]\n else:\n upscaling_axes = [0, 1]\n input_label_arrays[name] = upscale_array(\n array=input_label_arrays[name],\n target_shape=target_shape,\n axis=upscaling_axes,\n pad_with_zeros=True,\n )\n\n logger.info(f\"Prepared input with {name=} and {params=}\")\n logger.info(f\"{input_label_arrays=}\")\n\n # Output preparation: \"label\" type\n label_outputs = [\n (name, out_params)\n for (name, out_params) in output_specs.items()\n if out_params.type == \"label\"\n ]\n if label_outputs:\n # Preliminary scope checks\n if len(label_outputs) > 1:\n raise OutOfTaskScopeError(\n \"Multiple label outputs would break label-inputs-only \"\n f\"workflows (found {len(label_outputs)=}).\"\n )\n if len(label_outputs) > 1 and relabeling:\n raise OutOfTaskScopeError(\n \"Multiple label outputs would break relabeling in labeling+\"\n f\"measurement workflows (found {len(label_outputs)=}).\"\n )\n\n # We only support two cases:\n # 1. If there exist some input images, then use the first one to\n # determine output-label array properties\n # 2. 
If there are no input images, but there are input labels, then (A)\n # re-load the pixel sizes and re-build ROI indices, and (B) use the\n # first input label to determine output-label array properties\n if image_inputs:\n reference_array = list(input_image_arrays.values())[0]\n elif label_inputs:\n reference_array = list(input_label_arrays.values())[0]\n # Re-load pixel size, matching to the correct level\n input_label_name = label_inputs[0][1].label_name\n ngff_label_image_meta = load_NgffImageMeta(\n f\"{in_path}/{component}/labels/{input_label_name}\"\n )\n full_res_pxl_sizes_zyx = ngff_label_image_meta.get_pixel_sizes_zyx(\n level=0\n )\n # Create list of indices for 3D FOVs spanning the whole Z direction\n list_indices = convert_ROI_table_to_indices(\n ROI_table,\n level=level,\n coarsening_xy=coarsening_xy,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(list_indices, input_ROI_table)\n num_ROIs = len(list_indices)\n logger.info(\n f\"Re-create ROI indices from ROI table {input_ROI_table}, \"\n f\"using {full_res_pxl_sizes_zyx=}. 
\"\n \"This is necessary because label-input-only workflows may \"\n \"have label inputs that are at a different resolution and \"\n \"are not upscaled.\"\n )\n else:\n msg = (\n \"Missing image_inputs and label_inputs, we cannot assign\"\n \" label output properties\"\n )\n raise OutOfTaskScopeError(msg)\n\n # Extract label properties from reference_array, and make sure they are\n # for three dimensions\n label_shape = reference_array.shape\n label_chunksize = reference_array.chunksize\n if len(label_shape) == 2 and len(label_chunksize) == 2:\n if expected_dimensions == 3:\n raise ValueError(\n f\"Something wrong: {label_shape=} but \"\n f\"{expected_dimensions=}\"\n )\n label_shape = (1, label_shape[0], label_shape[1])\n label_chunksize = (1, label_chunksize[0], label_chunksize[1])\n logger.info(f\"{label_shape=}\")\n logger.info(f\"{label_chunksize=}\")\n\n # Loop over label outputs and (1) set zattrs, (2) create zarr group\n output_label_zarr_groups: dict[str, Any] = {}\n for (name, out_params) in label_outputs:\n\n # (1a) Rescale OME-NGFF datasets (relevant for level>0)\n if not ngff_image_meta.multiscale.axes[0].name == \"c\":\n raise ValueError(\n \"Cannot set `remove_channel_axis=True` for multiscale \"\n f\"metadata with axes={ngff_image_meta.multiscale.axes}. 
\"\n 'First axis should have name \"c\".'\n )\n new_datasets = rescale_datasets(\n datasets=[\n ds.dict() for ds in ngff_image_meta.multiscale.datasets\n ],\n coarsening_xy=coarsening_xy,\n reference_level=level,\n remove_channel_axis=True,\n )\n\n # (1b) Prepare attrs for label group\n label_name = out_params.label_name\n label_attrs = {\n \"image-label\": {\n \"version\": __OME_NGFF_VERSION__,\n \"source\": {\"image\": \"../../\"},\n },\n \"multiscales\": [\n {\n \"name\": label_name,\n \"version\": __OME_NGFF_VERSION__,\n \"axes\": [\n ax.dict()\n for ax in ngff_image_meta.multiscale.axes\n if ax.type != \"channel\"\n ],\n \"datasets\": new_datasets,\n }\n ],\n }\n\n # (2) Prepare label group\n zarrurl = f\"{in_path}/{component}\"\n image_group = zarr.group(zarrurl)\n label_group = prepare_label_group(\n image_group,\n label_name,\n overwrite=overwrite,\n label_attrs=label_attrs,\n logger=logger,\n )\n logger.info(\n \"Helper function `prepare_label_group` returned \"\n f\"{label_group=}\"\n )\n\n # (3) Create zarr group at level=0\n store = zarr.storage.FSStore(\n f\"{in_path}/{component}/labels/{label_name}/0\"\n )\n mask_zarr = zarr.create(\n shape=label_shape,\n chunks=label_chunksize,\n dtype=label_dtype,\n store=store,\n overwrite=overwrite,\n dimension_separator=\"/\",\n )\n output_label_zarr_groups[name] = mask_zarr\n logger.info(f\"Prepared output with {name=} and {out_params=}\")\n logger.info(f\"{output_label_zarr_groups=}\")\n\n # Output preparation: \"dataframe\" type\n dataframe_outputs = [\n (name, out_params)\n for (name, out_params) in output_specs.items()\n if out_params.type == \"dataframe\"\n ]\n output_dataframe_lists: dict[str, list] = {}\n for (name, out_params) in dataframe_outputs:\n output_dataframe_lists[name] = []\n logger.info(f\"Prepared output with {name=} and {out_params=}\")\n logger.info(f\"{output_dataframe_lists=}\")\n\n #####\n\n for i_ROI, indices in enumerate(list_indices):\n s_z, e_z, s_y, e_y, s_x, e_x = indices[:]\n 
region = (slice(s_z, e_z), slice(s_y, e_y), slice(s_x, e_x))\n\n logger.info(f\"ROI {i_ROI+1}/{num_ROIs}: {region=}\")\n\n # Always re-load napari worfklow\n wf = load_workflow(workflow_file)\n\n # Set inputs\n for input_name in input_specs.keys():\n input_type = input_specs[input_name].type\n\n if input_type == \"image\":\n wf.set(\n input_name,\n load_region(\n input_image_arrays[input_name],\n region,\n compute=True,\n return_as_3D=False,\n ),\n )\n elif input_type == \"label\":\n wf.set(\n input_name,\n load_region(\n input_label_arrays[input_name],\n region,\n compute=True,\n return_as_3D=False,\n ),\n )\n\n # Get outputs\n outputs = wf.get(list_outputs)\n\n # Iterate first over dataframe outputs (to use the correct\n # max_label_for_relabeling, if needed)\n for ind_output, output_name in enumerate(list_outputs):\n if output_specs[output_name].type != \"dataframe\":\n continue\n df = outputs[ind_output]\n if relabeling:\n df[\"label\"] += max_label_for_relabeling\n logger.info(\n f'ROI {i_ROI+1}/{num_ROIs}: Relabeling \"{name}\" dataframe'\n \"output, with {max_label_for_relabeling=}\"\n )\n\n # Append the new-ROI dataframe to the all-ROIs list\n output_dataframe_lists[output_name].append(df)\n\n # After all dataframe outputs, iterate over label outputs (which\n # actually can be only 0 or 1)\n for ind_output, output_name in enumerate(list_outputs):\n if output_specs[output_name].type != \"label\":\n continue\n mask = outputs[ind_output]\n\n # Check dimensions\n if len(mask.shape) != expected_dimensions:\n msg = (\n f\"Output {output_name} has shape {mask.shape} \"\n f\"but {expected_dimensions=}\"\n )\n logger.error(msg)\n raise ValueError(msg)\n elif expected_dimensions == 2:\n mask = np.expand_dims(mask, axis=0)\n\n # Sanity check: issue warning for non-consecutive labels\n unique_labels = np.unique(mask)\n num_unique_labels_in_this_ROI = len(unique_labels)\n if np.min(unique_labels) == 0:\n num_unique_labels_in_this_ROI -= 1\n num_labels_in_this_ROI = 
int(np.max(mask))\n if num_labels_in_this_ROI != num_unique_labels_in_this_ROI:\n logger.warning(\n f'ROI {i_ROI+1}/{num_ROIs}: \"{name}\" label output has'\n f\"non-consecutive labels: {num_labels_in_this_ROI=} but\"\n f\"{num_unique_labels_in_this_ROI=}\"\n )\n\n if relabeling:\n mask[mask > 0] += max_label_for_relabeling\n logger.info(\n f'ROI {i_ROI+1}/{num_ROIs}: Relabeling \"{name}\" label '\n f\"output, with {max_label_for_relabeling=}\"\n )\n max_label_for_relabeling += num_labels_in_this_ROI\n logger.info(\n f\"ROI {i_ROI+1}/{num_ROIs}: label-number update with \"\n f\"{num_labels_in_this_ROI=}; \"\n f\"new {max_label_for_relabeling=}\"\n )\n\n da.array(mask).to_zarr(\n url=output_label_zarr_groups[output_name],\n region=region,\n compute=True,\n overwrite=overwrite,\n )\n logger.info(f\"ROI {i_ROI+1}/{num_ROIs}: output handling complete\")\n\n # Output handling: \"dataframe\" type (for each output, concatenate ROI\n # dataframes, clean up, and store in a AnnData table on-disk)\n for (name, out_params) in dataframe_outputs:\n table_name = out_params.table_name\n # Concatenate all FOV dataframes\n list_dfs = output_dataframe_lists[name]\n if len(list_dfs) == 0:\n measurement_table = ad.AnnData()\n else:\n df_well = pd.concat(list_dfs, axis=0, ignore_index=True)\n # Extract labels and drop them from df_well\n labels = pd.DataFrame(df_well[\"label\"].astype(str))\n df_well.drop(labels=[\"label\"], axis=1, inplace=True)\n # Convert all to float (warning: some would be int, in principle)\n measurement_dtype = np.float32\n df_well = df_well.astype(measurement_dtype)\n df_well.index = df_well.index.map(str)\n # Convert to anndata\n measurement_table = ad.AnnData(df_well, dtype=measurement_dtype)\n measurement_table.obs = labels\n\n # Write to zarr group\n image_group = zarr.group(f\"{in_path}/{component}\")\n table_attrs = dict(\n type=\"feature_table\",\n region=dict(path=f\"../labels/{out_params.label_name}\"),\n instance_key=\"label\",\n )\n write_table(\n 
image_group,\n table_name,\n measurement_table,\n overwrite=overwrite,\n table_attrs=table_attrs,\n )\n\n # Output handling: \"label\" type (for each output, build and write to disk\n # pyramid of coarser levels)\n for (name, out_params) in label_outputs:\n label_name = out_params.label_name\n build_pyramid(\n zarrurl=f\"{zarrurl}/labels/{label_name}\",\n overwrite=overwrite,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=label_chunksize,\n aggregation_function=np.max,\n )\n\n return {}\n
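The relabeling step in the source above can be hard to follow inside the ROI loop. The sketch below isolates the idea using plain nested lists instead of NumPy arrays: positive labels in each ROI are shifted by a running offset, and the offset grows by the largest label found in that ROI before shifting. The function name relabel_rois is hypothetical and is not part of fractal-tasks-core.

```python
def relabel_rois(roi_masks):
    """Shift positive labels so that they are unique across all ROIs.

    Each mask is a 2D nested list of non-negative integers, where 0 is
    background. This is a simplified stand-in for the array-based logic.
    """
    offset = 0  # plays the role of max_label_for_relabeling
    relabeled = []
    for mask in roi_masks:
        # Largest label in this ROI, measured before any shifting
        max_label = max((value for row in mask for value in row), default=0)
        # Background stays 0, every other label is shifted by the offset
        relabeled.append(
            [[value + offset if value > 0 else 0 for value in row] for row in mask]
        )
        offset += max_label
    return relabeled


rois = [
    [[0, 1], [2, 2]],  # first ROI uses labels 1 and 2
    [[1, 0], [0, 1]],  # second ROI reuses label 1
]
result = relabel_rois(rois)
```

In this toy input the second ROI's label 1 becomes 3, because the first ROI already used labels up to 2.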
A value of the input_specs argument in napari_workflows_wrapper.
ATTRIBUTE DESCRIPTION type
Input type (either image or label).
TYPE: Literal['image', 'label']
label_name
Label name (for label inputs only).
TYPE: Optional[str]
channel
ChannelInputModel object (for image inputs only).
TYPE: Optional[ChannelInputModel]
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper_models.py
class NapariWorkflowsInput(BaseModel):\n\"\"\"\n A value of the `input_specs` argument in `napari_workflows_wrapper`.\n\n Attributes:\n type: Input type (either `image` or `label`).\n label_name: Label name (for label inputs only).\n channel: `ChannelInputModel` object (for image inputs only).\n \"\"\"\n\n type: Literal[\"image\", \"label\"]\n label_name: Optional[str]\n channel: Optional[ChannelInputModel]\n\n @validator(\"label_name\", always=True)\n def label_name_is_present(cls, v, values):\n\"\"\"\n Check that label inputs have `label_name` set.\n \"\"\"\n _type = values.get(\"type\")\n if _type == \"label\" and not v:\n raise ValueError(\n f\"Input item has type={_type} but label_name={v}.\"\n )\n return v\n\n @validator(\"channel\", always=True)\n def channel_is_present(cls, v, values):\n\"\"\"\n Check that image inputs have `channel` set.\n \"\"\"\n _type = values.get(\"type\")\n if _type == \"image\" and not v:\n raise ValueError(f\"Input item has type={_type} but channel={v}.\")\n return v\n
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper_models.py
@validator(\"channel\", always=True)\ndef channel_is_present(cls, v, values):\n\"\"\"\n Check that image inputs have `channel` set.\n \"\"\"\n _type = values.get(\"type\")\n if _type == \"image\" and not v:\n raise ValueError(f\"Input item has type={_type} but channel={v}.\")\n return v\n
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper_models.py
@validator(\"label_name\", always=True)\ndef label_name_is_present(cls, v, values):\n\"\"\"\n Check that label inputs have `label_name` set.\n \"\"\"\n _type = values.get(\"type\")\n if _type == \"label\" and not v:\n raise ValueError(\n f\"Input item has type={_type} but label_name={v}.\"\n )\n return v\n
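To make the two validation rules above concrete, here is a minimal sketch that applies the same checks to plain dictionaries, without Pydantic. The helper check_input_spec is a hypothetical name used only for illustration. The real model performs these checks through its validators.

```python
def check_input_spec(spec: dict) -> None:
    """Raise ValueError if a spec breaks the rules described above."""
    kind = spec.get("type")
    if kind not in ("image", "label"):
        raise ValueError(f"Unsupported input type: {kind!r}")
    # Label inputs need a label_name
    if kind == "label" and not spec.get("label_name"):
        raise ValueError(f"Input item has type={kind} but no label_name.")
    # Image inputs need a channel
    if kind == "image" and not spec.get("channel"):
        raise ValueError(f"Input item has type={kind} but no channel.")


# These two entries satisfy the rules
check_input_spec({"type": "image", "channel": {"wavelength_id": "A01_C02"}})
check_input_spec({"type": "label", "label_name": "label_DAPI"})

# A label input without label_name is rejected
try:
    check_input_spec({"type": "label"})
except ValueError as error:
    rejected_message = str(error)
```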
A value of the output_specs argument in napari_workflows_wrapper.
ATTRIBUTE DESCRIPTION type
Output type (either label or dataframe).
TYPE: Literal['label', 'dataframe']
label_name
Label name (for label outputs, it is used as the name of the label; for dataframe outputs, it is used to fill the region[\"path\"] field).
TYPE: str
table_name
Table name (for dataframe outputs only).
TYPE: Optional[str]
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper_models.py
class NapariWorkflowsOutput(BaseModel):\n\"\"\"\n A value of the `output_specs` argument in `napari_workflows_wrapper`.\n\n Attributes:\n type: Output type (either `label` or `dataframe`).\n label_name: Label name (for label outputs, it is used as the name of\n the label; for dataframe outputs, it is used to fill the\n `region[\"path\"]` field).\n table_name: Table name (for dataframe outputs only).\n \"\"\"\n\n type: Literal[\"label\", \"dataframe\"]\n label_name: str\n table_name: Optional[str] = None\n\n @validator(\"table_name\", always=True)\n def table_name_only_for_dataframe_type(cls, v, values):\n\"\"\"\n Check that table_name is set only for dataframe outputs.\n \"\"\"\n _type = values.get(\"type\")\n if (_type == \"dataframe\" and (not v)) or (_type != \"dataframe\" and v):\n raise ValueError(\n f\"Output item has type={_type} but table_name={v}.\"\n )\n return v\n
Check that table_name is set only for dataframe outputs.
Source code in fractal_tasks_core/tasks/napari_workflows_wrapper_models.py
@validator(\"table_name\", always=True)\ndef table_name_only_for_dataframe_type(cls, v, values):\n\"\"\"\n Check that table_name is set only for dataframe outputs.\n \"\"\"\n _type = values.get(\"type\")\n if (_type == \"dataframe\" and (not v)) or (_type != \"dataframe\" and v):\n raise ValueError(\n f\"Output item has type={_type} but table_name={v}.\"\n )\n return v\n
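The condition in this validator is compact, so a small truth-table style sketch may help. The function table_name_is_consistent below is a hypothetical helper that returns a boolean instead of raising. It restates the same condition on plain values.

```python
def table_name_is_consistent(output_type: str, table_name) -> bool:
    """Return True when table_name is set exactly for dataframe outputs."""
    is_dataframe = output_type == "dataframe"
    has_table_name = bool(table_name)
    # Valid only when both are true or both are false
    return is_dataframe == has_table_name


cases = {
    ("dataframe", "measurements"): table_name_is_consistent("dataframe", "measurements"),
    ("dataframe", None): table_name_is_consistent("dataframe", None),
    ("label", None): table_name_is_consistent("label", None),
    ("label", "measurements"): table_name_is_consistent("label", "measurements"),
}
```

Only the first and third cases are consistent. The other two correspond to the situations in which the validator raises a ValueError.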
Takes a string (filename of a Yokogawa image), extracts site and z-index metadata, and returns them as a list of integers.
PARAMETER DESCRIPTION filename
Name of the image file.
TYPE: str
Source code in fractal_tasks_core/tasks/yokogawa_to_ome_zarr.py
def sort_fun(filename: str) -> list[int]:\n\"\"\"\n Takes a string (filename of a Yokogawa image), extract site and\n z-index metadata and returns them as a list of integers.\n\n Args:\n filename: Name of the image file.\n \"\"\"\n\n filename_metadata = parse_filename(filename)\n site = int(filename_metadata[\"F\"])\n z_index = int(filename_metadata[\"Z\"])\n return [site, z_index]\n
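The effect of using such a key function with sorted can be sketched as follows. The regular expression stands in for parse_filename and assumes that filenames contain a block such as F001L01A01Z01C01, with F followed by the site and Z followed by the z-index. Both the helper name site_and_z and the example filenames are made up for illustration.

```python
import re

# Assumed layout: ...F<site>L<..>A<..>Z<z-index>C<..>.<extension>
_SITE_AND_Z = re.compile(r"F(\d+)L\d+A\d+Z(\d+)C\d+")


def site_and_z(filename: str) -> list[int]:
    """Return [site, z_index] parsed from a Yokogawa-style filename."""
    match = _SITE_AND_Z.search(filename)
    if match is None:
        raise ValueError(f"Cannot parse site and z-index from {filename!r}")
    return [int(match.group(1)), int(match.group(2))]


names = [
    "plate_B03_T0001F002L01A01Z01C01.png",
    "plate_B03_T0001F001L01A01Z02C01.png",
    "plate_B03_T0001F001L01A01Z01C01.png",
]
# Sort by site first, then by z-index within each site
ordered = sorted(names, key=site_and_z)
```

Sorting by site and then by z-index groups all Z planes of one field of view together, which is the order in which the conversion task consumes the files.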
This task is typically run after Create OME-Zarr or Create OME-Zarr Multiplexing and populates the empty OME-Zarr files that were prepared.
PARAMETER DESCRIPTION input_paths
List of input paths where the OME-Zarrs are stored. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: [\"/some/path/\"]. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: Sequence[str]
output_path
Unclear. Should be the same as input_path. (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
component
Path to the OME-Zarr image in the OME-Zarr plate that is processed. Example: \"some_plate.zarr/B/03/0\" (standard argument for Fractal tasks, managed by Fractal server).
TYPE: str
metadata
Dictionary containing metadata about the OME-Zarr. This task requires the following elements to be present in the metadata: original_paths: list of paths that correspond to the input_paths of the create_ome_zarr task (=> where the microscopy images are stored); num_levels (int): number of pyramid levels in the image (this determines how many pyramid levels are built for the segmentation); coarsening_xy (int): coarsening factor in XY of the downsampling when building the pyramid; image_extension: filename extension of images (e.g. \"tif\" or \"png\"); image_glob_patterns: parameter of create_ome_zarr task (if specified, only parse images with filenames that match all these patterns). (standard argument for Fractal tasks, managed by Fractal server).
TYPE: dict[str, Any]
overwrite
If True, overwrite the task output.
TYPE: bool DEFAULT: False
Source code in fractal_tasks_core/tasks/yokogawa_to_ome_zarr.py
@validate_arguments\ndef yokogawa_to_ome_zarr(\n *,\n input_paths: Sequence[str],\n output_path: str,\n component: str,\n metadata: dict[str, Any],\n overwrite: bool = False,\n):\n\"\"\"\n Convert Yokogawa output (png, tif) to zarr file.\n\n This task is typically run after Create OME-Zarr or\n Create OME-Zarr Multiplexing and populates the empty OME-Zarr files that\n were prepared.\n\n Args:\n input_paths: List of input paths where the OME-Zarrs. Should point to\n the parent folder containing one or many OME-Zarr files, not the\n actual OME-Zarr file. Example: `[\"/some/path/\"]`.\n This task only supports a single input path.\n (standard argument for Fractal tasks,\n managed by Fractal server).\n output_path: Unclear. Should be the same as `input_path`.\n (standard argument for Fractal tasks, managed by Fractal server).\n component: Path to the OME-Zarr image in the OME-Zarr plate that is\n processed. Example: `\"some_plate.zarr/B/03/0\"`\n (standard argument for Fractal tasks, managed by Fractal server).\n metadata: Dictionary containing metadata about the OME-Zarr. This task\n requires the following elements to be present in the metadata.\n `original_paths`:\n list of paths that correspond to the `input_paths` of the\n `create_ome_zarr` task (=> where the microscopy image are stored);\n `num_levels (int)`:\n number of pyramid levels in the image (this determines how many\n pyramid levels are built for the segmentation);\n `coarsening_xy (int)`:\n coarsening factor in XY of the downsampling when building the\n pyramid;\n `image_extension`:\n filename extension of images (e.g. 
`\"tif\"` or `\"png\"`);\n `image_glob_patterns`:\n parameter of `create_ome_zarr` task (if specified, only parse\n images with filenames that match with all these patterns).\n (standard argument for Fractal tasks, managed by Fractal server).\n overwrite: If `True`, overwrite the task output.\n \"\"\"\n\n # Preliminary checks\n if len(input_paths) > 1:\n raise NotImplementedError\n zarrurl = Path(input_paths[0]).as_posix() + f\"/{component}\"\n\n # Read attributes from NGFF metadata\n ngff_image_meta = load_NgffImageMeta(zarrurl)\n num_levels = ngff_image_meta.num_levels\n coarsening_xy = ngff_image_meta.coarsening_xy\n full_res_pxl_sizes_zyx = ngff_image_meta.get_pixel_sizes_zyx(level=0)\n logger.info(f\"NGFF image has {num_levels=}\")\n logger.info(f\"NGFF image has {coarsening_xy=}\")\n logger.info(\n f\"NGFF image has full-res pixel sizes {full_res_pxl_sizes_zyx}\"\n )\n\n parameters = get_parameters_from_metadata(\n keys=[\n \"original_paths\",\n \"image_extension\",\n \"image_glob_patterns\",\n ],\n metadata=metadata,\n # FIXME: Why rely on output_path here, when we use the input path for\n # the zarr_url? 
That just means that different input & output paths\n # don't work, no?\n image_zarr_path=(Path(output_path) / component),\n )\n original_path_list = parameters[\"original_paths\"]\n image_extension = parameters[\"image_extension\"]\n image_glob_patterns = parameters[\"image_glob_patterns\"]\n\n channels: list[OmeroChannel] = get_omero_channel_list(\n image_zarr_path=zarrurl\n )\n wavelength_ids = [c.wavelength_id for c in channels]\n\n in_path = Path(original_path_list[0])\n\n # Define well\n component_split = component.split(\"/\")\n well_row = component_split[1]\n well_column = component_split[2]\n well_ID = well_row + well_column\n\n # Read useful information from ROI table\n adata = read_zarr(f\"{zarrurl}/tables/FOV_ROI_table\")\n fov_indices = convert_ROI_table_to_indices(\n adata,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(fov_indices, \"FOV_ROI_table\")\n adata_well = read_zarr(f\"{zarrurl}/tables/well_ROI_table\")\n well_indices = convert_ROI_table_to_indices(\n adata_well,\n full_res_pxl_sizes_zyx=full_res_pxl_sizes_zyx,\n )\n check_valid_ROI_indices(well_indices, \"well_ROI_table\")\n if len(well_indices) > 1:\n raise ValueError(f\"Something wrong with {well_indices=}\")\n\n # FIXME: Put back the choice of columns by name? 
Not here..\n\n max_z = well_indices[0][1]\n max_y = well_indices[0][3]\n max_x = well_indices[0][5]\n\n # Load a single image, to retrieve useful information\n patterns = [f\"*_{well_ID}_*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n tmp_images = glob_with_multiple_patterns(\n folder=str(in_path),\n patterns=patterns,\n )\n sample = imread(tmp_images.pop())\n\n # Initialize zarr\n chunksize = (1, 1, sample.shape[1], sample.shape[2])\n try:\n canvas_zarr = zarr.create(\n shape=(len(wavelength_ids), max_z, max_y, max_x),\n chunks=chunksize,\n dtype=sample.dtype,\n store=zarr.storage.FSStore(zarrurl + \"/0\"),\n overwrite=overwrite,\n dimension_separator=\"/\",\n )\n except ContainsArrayError as e:\n error_msg = (\n f\"Cannot create a zarr group at '{zarrurl}/0', \"\n f\"with {overwrite=} (original error: {str(e)}).\\n\"\n \"Hint: try setting overwrite=True.\"\n )\n logger.error(error_msg)\n raise OverwriteNotAllowedError(error_msg)\n\n # Loop over channels\n for i_c, wavelength_id in enumerate(wavelength_ids):\n A, C = wavelength_id.split(\"_\")\n\n patterns = [f\"*_{well_ID}_*{A}*{C}*.{image_extension}\"]\n if image_glob_patterns:\n patterns.extend(image_glob_patterns)\n filenames_set = glob_with_multiple_patterns(\n folder=str(in_path),\n patterns=patterns,\n )\n filenames = sorted(list(filenames_set), key=sort_fun)\n if len(filenames) == 0:\n raise ValueError(\n \"Error in yokogawa_to_ome_zarr: len(filenames)=0.\\n\"\n f\" in_path: {in_path}\\n\"\n f\" image_extension: {image_extension}\\n\"\n f\" well_ID: {well_ID}\\n\"\n f\" wavelength_id: {wavelength_id},\\n\"\n f\" patterns: {patterns}\"\n )\n # Loop over 3D FOV ROIs\n for indices in fov_indices:\n s_z, e_z, s_y, e_y, s_x, e_x = indices[:]\n region = (\n slice(i_c, i_c + 1),\n slice(s_z, e_z),\n slice(s_y, e_y),\n slice(s_x, e_x),\n )\n FOV_3D = da.concatenate(\n [imread(img) for img in filenames[:e_z]],\n )\n FOV_4D = da.expand_dims(FOV_3D, axis=0)\n filenames = 
filenames[e_z:]\n da.array(FOV_4D).to_zarr(\n url=canvas_zarr,\n region=region,\n compute=True,\n )\n\n # Starting from on-disk highest-resolution data, build and write to disk a\n # pyramid of coarser levels\n build_pyramid(\n zarrurl=zarrurl,\n overwrite=overwrite,\n num_levels=num_levels,\n coarsening_xy=coarsening_xy,\n chunksize=chunksize,\n )\n\n # Deprecated: Delete images (optional)\n # if delete_input:\n # for f in filenames:\n # try:\n # os.remove(f)\n # except OSError as e:\n # logging.info(\"Error: %s : %s\" % (f, e.strerror))\n\n return {}\n
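Two small string manipulations in the source above decide which image files are read: the well ID is built from the component path, and a glob pattern is built from the well ID and the wavelength ID. The sketch below reproduces them with the standard-library fnmatch module instead of the library's glob_with_multiple_patterns helper. The filenames are invented, and the helper name well_id_from_component is hypothetical.

```python
from fnmatch import fnmatchcase


def well_id_from_component(component: str) -> str:
    """Turn 'some_plate.zarr/B/03/0' into the well ID 'B03'."""
    parts = component.split("/")
    return parts[1] + parts[2]  # row + column


well_id = well_id_from_component("some_plate.zarr/B/03/0")
image_extension = "png"

# Pattern for any image of this well
well_pattern = f"*_{well_id}_*.{image_extension}"

# Pattern for one channel of this well, from a wavelength_id such as "A01_C02"
acquisition, channel = "A01_C02".split("_")
channel_pattern = f"*_{well_id}_*{acquisition}*{channel}*.{image_extension}"

candidates = [
    "plate_B03_T0001F001L01A01Z01C01.png",
    "plate_B03_T0001F001L01A01Z01C02.png",
    "plate_B04_T0001F001L01A01Z01C02.png",
]
well_matches = [f for f in candidates if fnmatchcase(f, well_pattern)]
channel_matches = [f for f in candidates if fnmatchcase(f, channel_pattern)]
```

Because the well ID comes from the second and third path elements, this logic assumes that component always has the plate/row/column/image layout shown in the parameter description.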
Thanks to the package manifest and to their structure, the tasks in fractal_tasks_core.tasks can be run within the Fractal platform, which consists of a backend server that can be accessed through either of the two available clients (a command-line client and a web client).
The fractal-demos repository lists a set of relevant examples, including:
How to set up a fractal-server instance;
How to set up a fractal-client command-line client;
How to use the command-line client to submit a series of typical workflows (based on fractal-tasks-core tasks) to Fractal; see folders from 01 to 10 in the examples folder.
The fractal-tasks-core GitHub repository includes an examples folder, listing a few examples of how to run fractal-tasks-core tasks from a standard Python script (instead of using the Fractal platform).
What follows is the content of examples/README.md:
This folder is not always kept up-to-date. If you encounter any unexpected problem, please open a new issue on the fractal-tasks-core GitHub repository.
Examples from 01 to 09 are currently aligned with fractal-tasks-core 0.10.0.
"},{"location":"version_updates/v0_14_0/","title":"From version 0.13.1 to 0.14.0","text":""},{"location":"version_updates/v0_14_0/#package-structure","title":"Package structure","text":"
Version 0.14.0 includes a large refactor of the fractal_tasks_core package, leading to this new structure:
Within fractal-tasks-core, we make use of tables which are AnnData objects
+stored within OME-Zarr image groups. This page describes the different kinds of
+tables we use, and it includes:
Note: The specifications below are largely inspired by a proposed update
+to OME-NGFF specs. This update is currently
+on hold, and fractal-tasks-core will evolve as soon as an official NGFF
+table specification is adopted (see also the Outlook section).
In this section we describe version 1 (V1) of the Fractal table specifications;
+for the moment, only V1 exists.
+Note that V1 specifications are only implemented as of version 0.14.0 of
+fractal-tasks-core.
The core-table specification consists in the definition of the required Zarr
+structure and attributes, and of the AnnData table format.
+
AnnData table format
+
We store tabular data into Zarr groups as AnnData ("Annotated Data") objects;
+the anndata Python library provides the
+definition of this format and the relevant tools. Quoting from the anndata
+documentation:
+
+
AnnData is specifically designed for matrix-like data. By this we mean that
+we have \(n\) observations, each of which can be represented as \(d\)-dimensional
+vectors, where each dimension corresponds to a variable or feature. Both the
+rows and columns of this \(n \times d\) matrix are special in the sense that
+they are indexed.
Note that AnnData tables are easily transformed from/into pandas.DataFrame
+objects - see e.g. the AnnData.to_df
+method.
+
Zarr structure and attributes
+
The structure of Zarr groups is based on the image specification in NGFF
+0.4, with an
+additional tables group and the corresponding subgroups (similar to
+labels):
+
image.zarr # Zarr group for a NGFF image
+|
+├── 0 # Zarr array for multiscale level 0
+├── ...
+├── N # Zarr array for multiscale level N
+|
+├── labels # Zarr subgroup with a list of labels associated to this image
+| ├── label_A # Zarr subgroup for a given label
+| ├── label_B # Zarr subgroup for a given label
+| └── ...
+|
+└── tables # Zarr subgroup with a list of tables associated to this image
+ ├── table_1 # Zarr subgroup for a given table
+ ├── table_2 # Zarr subgroup for a given table
+ └── ...
+
+
The Zarr attributes of the tables group must include the key tables,
+pointing to the list of all tables (this simplifies discovery of tables
+associated to the current NGFF image), as in
+
image.zarr/tables/.zattrs
{
+"tables":["table_1","table_2"]
+}
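
As a rough illustration of how the tables key supports discovery, the sketch below reads it straight from the .zattrs JSON file of a directory-based Zarr store, using only the Python standard library. The helper name list_tables is hypothetical (it is not part of fractal-tasks-core), and the example assumes the on-disk layout shown above, with attributes stored in a .zattrs file.

```python
import json
import tempfile
from pathlib import Path


def list_tables(image_zarr: Path) -> list[str]:
    """Return the table names listed in <image_zarr>/tables/.zattrs."""
    attrs_file = image_zarr / "tables" / ".zattrs"
    with attrs_file.open() as f:
        attrs = json.load(f)
    return list(attrs["tables"])


# Build a throwaway directory that mimics the layout described above
tmp_dir = Path(tempfile.mkdtemp())
image_zarr = tmp_dir / "image.zarr"
(image_zarr / "tables").mkdir(parents=True)
(image_zarr / "tables" / ".zattrs").write_text(
    json.dumps({"tables": ["table_1", "table_2"]})
)

table_names = list_tables(image_zarr)
print(table_names)
```

In practice one would more likely open the group with the zarr library and read its attributes from there; the file-based version is only meant to show where the information lives.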
+
+
The Zarr attributes of each specific-table group must include the version of
+the table specification (currently version 1), through the
+fractal_table_version attribute. Also note that the anndata function to
+write an AnnData object into a Zarr group automatically sets additional
+attributes. Here is an example of the resulting Zarr attributes:
+
image.zarr/tables/table_1/.zattrs
{
+"fractal_table_version":"1",
+"encoding-type":"anndata",// Automatically added by anndata 0.11
+"encoding-version":"0.1.0",// Automatically added by anndata 0.11
+}
+
In fractal-tasks-core, a ROI table defines regions of space which are
+three-dimensional (see also the Outlook section about
+dimensionality flexibility) and box-shaped.
+Typical use cases are described here.
+
Zarr attributes
+
The specification of a ROI table is a subset of the core table
+one. Moreover, the table-group Zarr attributes must include the
+type attribute with value roi_table, as in
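
A minimal sketch of such attributes, written as a Python dictionary and serialized to JSON. Only the two keys named in this page are set explicitly; the encoding-type and encoding-version keys that anndata adds on its own are left out, and the variable names are purely illustrative.

```python
import json

# Attribute required by the core-table specification (V1)
core_attrs = {"fractal_table_version": "1"}

# A ROI table additionally declares its type
roi_table_attrs = {**core_attrs, "type": "roi_table"}

serialized = json.dumps(roi_table_attrs, indent=4)
print(serialized)
```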
+
The var
+attribute
+of a given AnnData object indexes the columns of the table. A
+fractal-tasks-core ROI table must include the following six columns:
+
+
x_micrometer, y_micrometer, z_micrometer:
+ the lower bounds of the XYZ intervals defining the ROI, in micrometers;
+
len_x_micrometer, len_y_micrometer, len_z_micrometer:
+ the XYZ edge lengths, in micrometers.
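
The requirement above can be checked with a few lines of Python. This is a sketch rather than the validation performed by fractal-tasks-core: the helper name missing_roi_columns is hypothetical, and it works on a plain list of column names.

```python
REQUIRED_ROI_COLUMNS = (
    "x_micrometer",
    "y_micrometer",
    "z_micrometer",
    "len_x_micrometer",
    "len_y_micrometer",
    "len_z_micrometer",
)


def missing_roi_columns(column_names):
    """Return the required ROI columns that are absent from column_names."""
    present = set(column_names)
    return [col for col in REQUIRED_ROI_COLUMNS if col not in present]


# A table with all required columns plus an optional one
complete = list(REQUIRED_ROI_COLUMNS) + ["x_micrometer_original"]
# A table where the Z position and two edge lengths are missing
incomplete = ["x_micrometer", "y_micrometer", "len_x_micrometer"]

missing_complete = missing_roi_columns(complete)
missing_incomplete = missing_roi_columns(incomplete)
print(missing_complete)
print(missing_incomplete)
```

For an AnnData table, the list of column names would typically come from its var_names attribute.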
+
+
+
Notes:
+
+
The axes origin for the ROI positions (e.g. for x_micrometer)
+ corresponds to the top-left corner of the image (for the YX axes) and to
+ the lowest Z plane.
+
ROIs are defined in physical coordinates, and they do not store
+ information on the number or size of pixels.
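
Since ROIs carry no pixel information, turning a ROI into array indices requires the physical pixel sizes of the target array. The sketch below shows the arithmetic under the assumption that pixel sizes along Z, Y and X are known (for instance from the image metadata); the function name roi_to_pixel_slices is hypothetical and this is not the fractal-tasks-core implementation.

```python
def roi_to_pixel_slices(roi, pixel_sizes_zyx):
    """Convert a ROI in micrometers to (z, y, x) slices of array indices.

    roi: mapping with the six required ROI-table columns.
    pixel_sizes_zyx: physical pixel sizes (micrometers) along Z, Y, X.
    """
    slices = []
    for axis, pixel_size in zip("zyx", pixel_sizes_zyx):
        start = roi[f"{axis}_micrometer"]
        length = roi[f"len_{axis}_micrometer"]
        # Rounding guards against floating-point noise in the division
        start_px = round(start / pixel_size)
        end_px = round((start + length) / pixel_size)
        slices.append(slice(start_px, end_px))
    return tuple(slices)


roi = {
    "x_micrometer": 100.0,
    "y_micrometer": 0.0,
    "z_micrometer": 0.0,
    "len_x_micrometer": 200.0,
    "len_y_micrometer": 150.0,
    "len_z_micrometer": 5.0,
}
# Assumed pixel sizes: 1 um along Z, 0.5 um along Y and X
slices = roi_to_pixel_slices(roi, pixel_sizes_zyx=(1.0, 0.5, 0.5))
print(slices)
```

The same ROI maps to different index ranges at different pyramid levels, because only the pixel sizes change while the physical coordinates stay fixed.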
+
+
+
ROI tables may also include other columns, beyond the required ones. Here are
+the ones that are typically used in fractal-tasks-core (see also the Use
+cases section):
+
+
x_micrometer_original and y_micrometer_original, which are a copy of
+ x_micrometer and y_micrometer taken before applying some transformation;
+
translation_x, translation_y and translation_z, which are used during
+ registration of multiplexing cycles;
Masking ROI tables are a specific instance of the basic ROI tables described
+above, where each ROI must also be associated to a specific label of a label
+image.
+
Motivation
+
The motivation for this association is based on the following use case:
+
+
By performing segmentation of a NGFF image, we identify N objects and we
 store them as a label image (where the value at each pixel corresponds to the
 label index);
+
We also compute the three-dimensional bounding box of each segmented object,
+ and store these bounding boxes into a masking ROI table;
+
For each one of these ROIs, we also include information that links it to both
 the label image and a specific label index;
+
During further processing we can load/modify specific sub-regions of the ROI,
 based on information contained in the label image. Operations of this kind
 are masked, as they only act on the array elements that match a certain
 condition on the label value.
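
The idea of a masked operation can be sketched with a toy example. For brevity it is two-dimensional and uses nested lists instead of arrays; the function masked_update and the bounding-box layout are hypothetical and only illustrate the principle.

```python
# 2D toy example: image intensities and a label image of the same shape
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
labels = [
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 0],
]

# Bounding box of label 1, as (start, end) index pairs along Y and X
bbox_label_1 = {"y": (0, 2), "x": (1, 3)}


def masked_update(image, labels, bbox, label_value, func):
    """Apply func in place to the bounding-box pixels that carry label_value."""
    (s_y, e_y), (s_x, e_x) = bbox["y"], bbox["x"]
    for y in range(s_y, e_y):
        for x in range(s_x, e_x):
            if labels[y][x] == label_value:
                image[y][x] = func(image[y][x])


# Double the intensity of the pixels that belong to label 1
masked_update(image, labels, bbox_label_1, label_value=1, func=lambda v: v * 2)
print(image)
```

The pixel at row 1, column 2 lies inside the bounding box but belongs to label 2, so it is left untouched; this is the difference between processing a ROI and processing a masked ROI.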
+
+
Zarr attributes
+
For this kind of table, fractal-tasks-core closely follows the proposed
+NGFF update mentioned above. The
+requirements on the Zarr attributes of a given table are:
+
+
Attributes must contain a type key, with value masking_roi_table (see the closing note on table types).
+
Attributes must contain a region key; the corresponding value must be an
+ object with a path key and a string value (i.e. the path to the data the
+ table is annotating).
+
Attributes must include a key instance_key, which is the key in obs that
+ denotes which instance in region the row corresponds to.
On top of the required ROI-table columns, the masking-ROI-table AnnData object
+must have an attribute obs with a key matching the instance_key Zarr
+attribute. For instance, if instance_key="label" then table.obs["label"]
+must exist, with its items matching the labels in the image in
+"../labels/label_DAPI".
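
The requirements listed above can be expressed as a small check on plain Python objects. This is a sketch under stated assumptions: the function check_masking_roi_table is hypothetical, attrs stands for the Zarr attributes of the table group, and obs_columns stands for the column names of the obs attribute of the AnnData table.

```python
def check_masking_roi_table(attrs, obs_columns):
    """Return a list of problems found for a masking ROI table.

    attrs: the Zarr attributes of the table group, as a dict.
    obs_columns: names of the columns available in the obs of the table.
    """
    problems = []
    if attrs.get("type") != "masking_roi_table":
        problems.append("type must be 'masking_roi_table'")
    region = attrs.get("region")
    if not isinstance(region, dict) or not isinstance(region.get("path"), str):
        problems.append("region must be an object with a string 'path'")
    instance_key = attrs.get("instance_key")
    if instance_key is None:
        problems.append("instance_key is missing")
    elif instance_key not in obs_columns:
        problems.append(f"obs has no '{instance_key}' column")
    return problems


attrs = {
    "fractal_table_version": "1",
    "type": "masking_roi_table",
    "region": {"path": "../labels/label_DAPI"},
    "instance_key": "label",
}
ok = check_masking_roi_table(attrs, obs_columns=["label"])
bad = check_masking_roi_table(attrs, obs_columns=[])
print(ok)
print(bad)
```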
The typical use case for feature tables is to store measurements related to
+segmented objects, while maintaining a link to the original instances (e.g.
+labels). Note that the current specification is aligned to the one of masking
+ROI tables, since they both need to relate a table to a
+label image, but the two may diverge in the future.
+
As part of the current fractal-tasks-core tasks, measurements can be
+performed e.g. via regionprops from scikit-image (as wrapped in
+napari-skimage-regionprops).
+
Zarr attributes
+
For this kind of table, fractal-tasks-core closely follows the proposed
+NGFF update mentioned above. The
+requirements on the Zarr attributes of a given table are:
+
+
Attributes must contain a type key, with value feature_table (see the closing note on table types).
+
Attributes must contain a region key; the corresponding value must be an
+ object with a path key and a string value (i.e. the path to the data the
+ table is annotating).
+
Attributes must include a key instance_key, which is the key in obs that
+ denotes which instance in region the row corresponds to.
The feature-table AnnData object must have an attribute obs with a key
+matching the instance_key Zarr attribute. For instance, if
+instance_key="label" then table.obs["label"] must exist, with its items
+matching the labels in the image in "../labels/label_DAPI".
OME-Zarrs created via fractal-tasks-core (e.g. by parsing Yokogawa images via
+the
+create_ome_zarr
+or
+create_ome_zarr_multiplex
+tasks) always include two specific ROI tables:
+
+
The table named well_ROI_table, which covers the NGFF image corresponding to the whole well (see the closing note on whole-well images);
+
The table named FOV_ROI_table, which lists all original fields of view (FOVs).
+
+
Each one of these two tables includes ROIs that span the whole image size along
+the Z axis. Note that this differs, e.g., from ROIs which are the bounding
+boxes of three-dimensional segmented objects, and which may cover only a part
+of the image Z size.
When working with an externally-generated OME-Zarr, one may use the
+import_ome_zarr
+task
+to make it compatible with fractal-tasks-core. This task optionally adds two
+ROI tables to the NGFF images:
+
+
The table named image_ROI_table, which covers the whole image;
+
A table named grid_ROI_table, which splits the whole-image ROI into a YX
 rectangular grid of smaller ROIs. This may correspond to original FOVs (in
 case the image is a tiled well, see the closing note on whole-well images),
 or it may simply be useful for applying downstream processing to smaller
 arrays and avoid large memory requirements.
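
Splitting a whole-image ROI into a rectangular YX grid comes down to simple arithmetic, sketched below. The function make_grid_rois, its parameters and the ROI naming scheme are assumptions made for illustration; the import_ome_zarr task may define its grid differently (for example in terms of pixels rather than physical sizes).

```python
def make_grid_rois(len_x, len_y, len_z, grid_x, grid_y):
    """Split a whole-image ROI (origin at 0, sizes in micrometers) into a
    grid_y-by-grid_x grid of ROIs that all span the full Z extent."""
    step_x = len_x / grid_x
    step_y = len_y / grid_y
    rois = {}
    for i_y in range(grid_y):
        for i_x in range(grid_x):
            rois[f"ROI_{i_y}_{i_x}"] = {  # hypothetical naming scheme
                "x_micrometer": i_x * step_x,
                "y_micrometer": i_y * step_y,
                "z_micrometer": 0.0,
                "len_x_micrometer": step_x,
                "len_y_micrometer": step_y,
                "len_z_micrometer": len_z,
            }
    return rois


grid = make_grid_rois(len_x=800.0, len_y=600.0, len_z=5.0, grid_x=2, grid_y=2)
for name, roi in grid.items():
    print(name, roi["x_micrometer"], roi["y_micrometer"])
```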
+
+
As with the well_ROI_table and FOV_ROI_table described
+above, these two tables include ROIs spanning the
+whole image extent along the Z axis.
ROI tables are also used and updated during image processing, for example:
+
+
The FOV ROI table may undergo transformations during processing, e.g. FOV
+ ROIs may be shifted to avoid overlaps; in this case, we use the optional
+ columns x_micrometer_original and y_micrometer_original to store the values
+ before the transformation.
+
The FOV ROI table is also used to store information on the registration of
+ multiplexing cycles, via the translation_x, translation_y and
+ translation_z optional columns.
+
Several tasks in fractal-tasks-core take an existing ROI table as an input
+ and then loop over the ROIs defined in the table. This makes the task more
+ flexible, as it can be used to process e.g. a whole well, a set of FOVs, or a
+ set of custom regions of the array.
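
The first use case above (shifting FOV ROIs while keeping track of their initial position) can be sketched as follows. The function shift_roi is hypothetical, ROIs are represented as plain dictionaries, and storing the original values only on the first shift is a design choice of this sketch, not a documented behavior of fractal-tasks-core.

```python
def shift_roi(roi, shift_x=0.0, shift_y=0.0):
    """Return a shifted copy of a ROI, keeping the pre-shift XY position in
    the optional *_original columns."""
    shifted = dict(roi)
    # Store originals only once, so repeated shifts keep the first values
    shifted.setdefault("x_micrometer_original", roi["x_micrometer"])
    shifted.setdefault("y_micrometer_original", roi["y_micrometer"])
    shifted["x_micrometer"] = roi["x_micrometer"] + shift_x
    shifted["y_micrometer"] = roi["y_micrometer"] + shift_y
    return shifted


# FOV_2 starts at x=400 and overlaps a first FOV that ends at x=416
fov_2 = {
    "x_micrometer": 400.0,
    "y_micrometer": 0.0,
    "z_micrometer": 0.0,
    "len_x_micrometer": 416.0,
    "len_y_micrometer": 351.0,
    "len_z_micrometer": 5.0,
}
fov_2_shifted = shift_roi(fov_2, shift_x=16.0)
fov_2_shifted_again = shift_roi(fov_2_shifted, shift_x=4.0)
print(fov_2_shifted["x_micrometer"], fov_2_shifted["x_micrometer_original"])
```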
To read an AnnData table from a Zarr group, one may use the read_zarr
+function.
+In the following example a NGFF image was created by stitching together two
+fields of view, where each one is made of a stack of five Z planes with 1 um
+spacing between the planes.
+The FOV_ROI_table has information on the XY position and size of the two
+original FOVs (named FOV_1 and FOV_2):
+
The anndata.experimental.write_elem function provides the required
+functionality to write an AnnData object to a Zarr group. In
+fractal-tasks-core, the write_table helper function wraps the anndata
+function and includes additional functionalities -- see its
+documentation.
+
With respect to the wrapped anndata function, the main additional features of write_table are:
+
+
The boolean parameter overwrite (defaulting to False), that determines the behavior in case of an already-existing table at the given path.
+
The table_attrs parameter, as a shorthand for updating the Zarr attributes of the table group after its creation.
These specifications may evolve (especially based on future NGFF updates),
+possibly leading to breaking changes in future versions.
+fractal-tasks-core will aim to maintain backwards compatibility with V1 for
+a reasonable amount of time.
+
Here is an in-progress list of aspects that may be reviewed:
+
+
We aim at removing the use of hard-coded units from the column names (e.g.
+ x_micrometer), in favor of a more general definition of units.
+
The z_micrometer and len_z_micrometer columns are currently required in
+ all ROI tables, even when the ROIs actually define a two-dimensional XY
+ region; in that case, we set z_micrometer=0 and len_z_micrometer is such
+ that the whole Z size is covered (that is, len_z_micrometer is the product
+ of the spacing between Z planes and the number of planes). In a future
+ version, we may introduce more flexibility and also accept ROI tables which
+ only include X and Y axes, and adapt the relevant tools so that they
+ automatically expand these ROIs into three-dimensions when appropriate.
+
Concerning the use of AnnData tables or other formats for tabular data, our
+ plan is to follow whatever serialised table specification becomes part of the
+ NGFF standard. For the record, Zarr does not natively support storage of
+ dataframes (see e.g.
+ https://github.com/zarr-developers/numcodecs/issues/452), which is one aspect
+ in favor of sticking with the anndata library.
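
The Z-axis convention mentioned in the list above (z_micrometer=0, with len_z_micrometer equal to the Z spacing times the number of planes) can be sketched as a helper that expands an XY-only ROI into a three-dimensional one. The function expand_roi_to_3d is hypothetical and only illustrates what such a future tool might do.

```python
def expand_roi_to_3d(roi_2d, num_z_planes, z_spacing_micrometer):
    """Return a copy of an XY-only ROI that covers the whole Z extent."""
    roi_3d = dict(roi_2d)
    roi_3d["z_micrometer"] = 0.0
    # Whole Z size = spacing between planes times number of planes
    roi_3d["len_z_micrometer"] = num_z_planes * z_spacing_micrometer
    return roi_3d


roi_2d = {
    "x_micrometer": 0.0,
    "y_micrometer": 0.0,
    "len_x_micrometer": 416.0,
    "len_y_micrometer": 351.0,
}
# Five Z planes with 1 um spacing
roi_3d = expand_roi_to_3d(roi_2d, num_z_planes=5, z_spacing_micrometer=1.0)
print(roi_3d["z_micrometer"], roi_3d["len_z_micrometer"])
```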
+
+
+
+
+
+
Within fractal-tasks-core, NGFF images represent whole wells; this still
+complies with the NGFF specifications, following an approved clarification in the
+specs. This is why we store
+the regions corresponding to the original FOVs in a specific ROI table,
+since one NGFF image includes a collection of FOVs. Note that this approach
+does not rely on the assumption that the FOVs constitute a regular tiling of
+the well, and it also covers the case of irregularly placed FOVs.
+
+
+
Note that the table types masking_roi_table and feature_table closely
+resemble the type="ngff:region_table" specification in the previous proposed
+NGFF table specs.