bioformats2raw.layout #112

Merged
merged 32 commits into from
Sep 28, 2022
Changes from 27 commits
d7c0556
First draft of bioformats2raw.layout
joshmoore Apr 7, 2022
8ee02d1
Add layout example
joshmoore Apr 8, 2022
ee15455
Improve wording of "SHOULD parse multiple images", thanks to Ilan
joshmoore Apr 20, 2022
c8e98e4
Clarify contents of METADATA.ome.xml, thanks to Melissa
joshmoore Apr 20, 2022
6403d88
Add schema & test for bf2raw metadata
joshmoore Apr 21, 2022
890e8a2
Add missing schema file
joshmoore Apr 21, 2022
b099281
Merge 'origin/main' into bf2raw
joshmoore May 3, 2022
2b67a36
Add applicable versions statement
joshmoore May 11, 2022
31d34c3
Update text with suggestions
joshmoore May 30, 2022
3173303
Merge 'origin/main' into bf2raw
joshmoore May 30, 2022
d44a066
Add bf2raw examples config
joshmoore May 30, 2022
7574d22
Add test to find missing configs
joshmoore May 30, 2022
ff5d7ec
Split subitems to pass linting
joshmoore May 30, 2022
e2119e5
Add graphical layout representation
joshmoore May 30, 2022
4fcc045
Fix missing whitespace
joshmoore May 30, 2022
57acc23
Add "series" attribute in "OME" group under bioformats2raw.layout
melissalinkert Jul 5, 2022
5eca169
Update the MUST/SHOULD semantics
joshmoore Jul 18, 2022
cf4e04c
Fix doubly indented bullets
joshmoore Sep 14, 2022
89c322d
Make minimum spec a link
joshmoore Sep 14, 2022
3691437
Add no-toc sections
joshmoore Sep 14, 2022
499caee
Re-arrange and add more text
joshmoore Sep 14, 2022
c7582e5
Make changes based on feedback from Will
joshmoore Sep 15, 2022
399b70b
Add bf2raw plate example
joshmoore Sep 15, 2022
740a11b
Add schema for ome series
joshmoore Sep 15, 2022
1f91482
Apply suggestions from code review
joshmoore Sep 19, 2022
861227b
Apply to v0.4 only
joshmoore Sep 22, 2022
8e612ef
Re-iterate plate precedence without OME/METADATA.ome.xml
joshmoore Sep 22, 2022
a0919be
Backport latest/bf2raw to 0.4
joshmoore Sep 22, 2022
5971927
Re-word the 'series' section
joshmoore Sep 22, 2022
88fd042
Make plate/series link a "SHOULD"
joshmoore Sep 22, 2022
7b7c43f
Add 'transitional' to 'omero' spec
joshmoore Sep 26, 2022
106c301
Add 0.4.1 changelog
joshmoore Sep 26, 2022
3 changes: 3 additions & 0 deletions latest/examples/bf2raw/.config.json
@@ -0,0 +1,3 @@
{
  "schema": "schemas/bf2raw.schema"
}
3 changes: 3 additions & 0 deletions latest/examples/bf2raw/image.json
@@ -0,0 +1,3 @@
{
  "bioformats2raw.layout" : 3
}
22 changes: 22 additions & 0 deletions latest/examples/bf2raw/plate.json
@@ -0,0 +1,22 @@
{
  "bioformats2raw.layout" : 3,
  "plate" : {
    "columns" : [ {
      "name" : "1"
    } ],
    "name" : "Plate Name 0",
    "wells" : [ {
      "path" : "A/1",
      "rowIndex" : 0,
      "columnIndex" : 0
    } ],
    "field_count" : 1,
    "rows" : [ {
      "name" : "A"
    } ],
    "acquisitions" : [ {
      "id" : 0
    } ],
    "version" : "0.4"
  }
}
3 changes: 3 additions & 0 deletions latest/examples/ome/.config.json
@@ -0,0 +1,3 @@
{
  "schema": "schemas/ome.schema"
}
3 changes: 3 additions & 0 deletions latest/examples/ome/series-2.json
@@ -0,0 +1,3 @@
{
  "series" : [ "0", "1" ]
}
89 changes: 88 additions & 1 deletion latest/index.bs
@@ -101,6 +101,14 @@ The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL
“RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in
[RFC 2119](https://tools.ietf.org/html/rfc2119).

<p>
<dfn>Transitional</dfn> metadata is added to the specification with the
intention of removing it in the future. Implementations may be expected (MUST) or
encouraged (SHOULD) to support the reading of the data, but writing will usually
be optional (MAY). Examples of transitional metadata include custom additions by
implementations that are later submitted as a formal specification. (See [[#bf2raw]])
</p>

Some of the JSON examples in this document include comments. However, these are only for
clarity, and comments MUST NOT be included in JSON objects.

@@ -242,9 +250,88 @@ keys as specified below for discovering certain types of data, especially images

If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data.

"bioformats2raw.layout" (transitional) {#bf2raw}
------------------------------------------------

[=Transitional=] "bioformats2raw.layout" metadata identifies a group which implicitly describes a series of images.
The need for the collection stems from the common "multi-image file" scenario in microscopy. Parsers like Bio-Formats
define a strict, stable ordering of the images in a single container, which other tools can use to refer to them.

In order to capture that information within an OME-NGFF dataset, `bioformats2raw` internally introduced a wrapping layer.
The bioformats2raw layout has been added to v0.4 as a transitional specification to specify filesets that already exist
in the wild. An upcoming NGFF specification will replace this layout with explicit metadata.

<h4 id="bf2raw-layout" class="no-toc">Layout</h4>

Typical Zarr layout produced by running `bioformats2raw` on a fileset that contains more than one image (series > 1):

<pre>
series.ome.zarr # One converted fileset from bioformats2raw
├── .zgroup
├── .zattrs # Contains "bioformats2raw.layout" metadata
├── OME # Special group for containing OME metadata
│ ├── .zgroup
│ ├── .zattrs # Contains "series" metadata
│ └── METADATA.ome.xml # OME-XML file stored within the Zarr fileset
├── 0 # First image in the collection
├── 1 # Second image in the collection
└── ...
</pre>

<h4 id="bf2raw-attributes" class="no-toc">Attributes</h4>

The top-level `.zattrs` file must contain the `bioformats2raw.layout` key:
<pre class=include-code>
path: examples/bf2raw/image.json
highlight: json
</pre>

If the top-level group represents a plate, the `bioformats2raw.layout` metadata will be present, but
the "plate" key MUST also be present and takes precedence; parsing of such datasets should follow [[#plate-md]]. It is not
currently possible to mix collections of images with plates.

<pre class=include-code>
path: examples/bf2raw/plate.json
highlight: json
</pre>

The `.zattrs` file within the OME group may contain the "series" key:

<pre class=include-code>
path: examples/ome/series-2.json
highlight: json
</pre>

<h4 id="bf2raw-details" class="no-toc">Details</h4>

Conforming groups:

- MUST have the value "3" for the "bioformats2raw.layout" key in their `.zattrs` metadata at the top of the hierarchy;
- SHOULD have OME metadata representing the entire collection of images in a file named "OME/METADATA.ome.xml" which:
  - MUST adhere to the OME-XML specification but
  - MUST use `<MetadataOnly/>` elements as opposed to `<BinData/>`, `<BinaryOnly/>` or `<TiffData/>`;
  - MAY make use of the [minimum specification](https://docs.openmicroscopy.org/ome-model/6.2.2/specifications/minimum.html).
Additionally:

- If "OME/METADATA.ome.xml" is present, "OME" MUST be a Zarr group which:
  - MAY contain a "series" attribute. If so:
    - "series" MUST be a list of string objects, each of which is a path to an image group.
    - The order of the paths MUST match the order of the "Image" elements in "OME/METADATA.ome.xml".
- If "OME/METADATA.ome.xml" or the "series" attribute do not exist:
  - existing "plate" metadata will take precedence if it exists, or

Member:

Does will here actually mean MUST?

Member Author:

Hmmm.... I think that's right, but what I was trying to get across is that "the next statement only holds when "plate" isn't defined". Is that the same thing as the MUST?

Member Author:

or more simply:

- If "OME/METADATA.ome.xml" or the "series" attribute do not exist and "plate" metadata is not defined

?

Member Author:

I think what bothers me in the logic is that the "plate MUST take precedence" applies much higher, e.g. even if the XML exists. Let me try to break all the clauses apart.

Member Author:

Pushed this proposal for feedback, @melissalinkert:

[Image: Screen Shot 2022-09-22 at 19 40 56]

Member Author:

Primary question from my side is if it's actually:

Matching "series" metadata (as described next) SHOULD be provided ...

Member:

Latest version is definitely clearer, thank you. I think Matching "series" metadata (as described next) SHOULD be provided makes sense, but I don't have strong objections to MAY.

Member Author:

If we're unsure, I'll err on the side of the stricter SHOULD and then we can downgrade it later.

  - separate "multiscales" images MUST be stored in consecutively numbered groups starting from 0 (i.e. "0/", "1/", "2/", "3/", ...).
- Every "multiscales" group MUST represent exactly one OME-XML "Image" in the same order as either the series index or the group numbers.
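
The "series" rules in this list can be checked mechanically. The following is a minimal sketch and not part of the specification: it assumes the contents of `OME/.zattrs` and `OME/METADATA.ome.xml` have already been read into memory, and the helper names (`local_name`, `count_images`, `check_series`) are hypothetical. It only verifies the type and length of "series"; the ordering requirement cannot be confirmed from the paths alone.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace, e.g. '{http://...}Image' becomes 'Image'."""
    return tag.rsplit("}", 1)[-1]


def count_images(ome_xml: str) -> int:
    """Count the Image elements that are direct children of the OME root."""
    root = ET.fromstring(ome_xml)
    return sum(1 for child in root if local_name(child.tag) == "Image")


def check_series(series, ome_xml: str) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    if not isinstance(series, list) or not all(isinstance(s, str) for s in series):
        return ['"series" must be a list of strings']
    problems = []
    n_images = count_images(ome_xml)
    if len(series) != n_images:
        problems.append(
            f"{len(series)} series paths but {n_images} Image elements"
        )
    return problems


# Hypothetical two-image OME-XML document, reduced to the parts used here.
SAMPLE_XML = (
    '<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">'
    '<Image ID="Image:0"/><Image ID="Image:1"/>'
    "</OME>"
)

if __name__ == "__main__":
    print(check_series(["0", "1"], SAMPLE_XML))
    print(check_series(["0"], SAMPLE_XML))
```

A writer could run a check like this before publishing a fileset; a reader might use it to decide whether to trust "series" or fall back to the numbered groups.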

Conforming readers:
- SHOULD make users aware of the presence of more than one image (i.e. SHOULD NOT default to only opening the first image);
- MAY use the "series" attribute in the "OME" group to determine a list of valid groups to display;
- MAY choose to show all images within the collection or offer the user a choice of images, as with <dfn export="true"><abbr title="High-content screening">HCS</abbr></dfn> plates;
- MAY ignore other groups or arrays under the root of the hierarchy.
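
One way a reader might discover the image groups is sketched below. This is an illustration under stated assumptions, not a reference implementation: it assumes a Zarr v2 hierarchy stored as plain directories on a local filesystem, and the names `read_attrs` and `find_image_groups` are hypothetical. The lookup order (plate first, then "series", then consecutively numbered groups) follows the precedence described in the text.

```python
import json
from pathlib import Path


def read_attrs(group: Path) -> dict:
    """Return the parsed .zattrs of a group, or {} if the file is absent."""
    attrs_file = group / ".zattrs"
    return json.loads(attrs_file.read_text()) if attrs_file.is_file() else {}


def find_image_groups(root: Path) -> tuple:
    """Return (kind, paths) describing where the images of a fileset live.

    kind is "plate", "series" or "numbered". For "plate" the path list is
    empty: the caller is expected to follow the plate metadata instead.
    """
    attrs = read_attrs(root)
    if attrs.get("bioformats2raw.layout") != 3:
        raise ValueError(f"{root} is not a bioformats2raw.layout 3 hierarchy")

    # "plate" metadata takes precedence over everything else.
    if "plate" in attrs:
        return ("plate", [])

    # Next, prefer the explicit "series" list in the OME group.
    series = read_attrs(root / "OME").get("series")
    if isinstance(series, list):
        return ("series", series)

    # Fall back to consecutively numbered groups: "0", "1", "2", ...
    paths = []
    while (root / str(len(paths)) / ".zgroup").is_file():
        paths.append(str(len(paths)))
    return ("numbered", paths)
```

Returning every path, rather than only the first, leaves it to the caller to show all images or to offer the user a choice.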


"coordinateTransformations" metadata {#trafo-md}
-------------------------------------
------------------------------------------------

"coordinateTransformations" describe a series of transformations that map between two coordinate spaces (defined by "axes").
For example, to map a discrete data space of an array to the corresponding physical space.
14 changes: 14 additions & 0 deletions latest/schemas/bf2raw.schema
@@ -0,0 +1,14 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://ngff.openmicroscopy.org/latest/schemas/bf2raw.schema",
  "title": "NGFF container produced by bioformats2raw",
  "description": "JSON from OME-NGFF .zattrs",
  "type": "object",
  "properties": {
    "bioformats2raw.layout": {
      "description": "The top-level identifier metadata added by bioformats2raw",
      "type": "number",
      "enum": [3]
    }
  }
}
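
To illustrate what this schema constrains, here is a small standard-library sketch of an equivalent check. The function name `is_bf2raw_layout` is hypothetical. Note one deliberate difference: the schema as shown has no "required" list, so an object without the key would still validate, whereas this helper follows the prose of the specification and insists that the key is present.

```python
def is_bf2raw_layout(attrs: dict) -> bool:
    """True when attrs carries "bioformats2raw.layout" with the numeric value 3."""
    value = attrs.get("bioformats2raw.layout")
    # bool is a subclass of int in Python, so exclude it explicitly;
    # the schema asks for a JSON number, not a boolean or a string.
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value == 3
    )
```

A full validator such as the one used in `tests/test_validation.py` remains the better choice when the schema files are available; this sketch only mirrors the single property defined here.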
17 changes: 17 additions & 0 deletions latest/schemas/ome.schema
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://ngff.openmicroscopy.org/latest/schemas/ome.schema",
"title": "NGFF group produced by bioformats2raw to contain OME metadata",
"description": "JSON from OME-NGFF OME/.zattrs linked to an OME-XML file",
"type": "object",
"properties": {
"series": {
"description": "An array of the same length and the same order as the images defined in the OME-XML",
"type": "array",
"items": {
"type": "string"
},
"minContains": 1
}
}
}
14 changes: 14 additions & 0 deletions latest/tests/test_validation.py
@@ -89,3 +89,17 @@ def test_run(suite):
    resolver = RefResolver.from_schema(suite.schema, store=schema_store)
    validator = Validator(suite.schema, resolver=resolver)
    suite.validate(validator)


def test_example_configs():
    """
    Test that all example folders have a config file
    """
    missing = []
    for subdir in os.walk("examples"):
        has_examples = glob.glob(f"{subdir[0]}/*.json")
        has_config = glob.glob(f"{subdir[0]}/.config.json")
        if has_examples and not has_config:
            missing.append(subdir[0])
    if missing:
        raise Exception(f"Directories missing configs: {missing}")