Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Dask/pyramid suport. Split tiff/imagej/ome reader at metadata level instead. #22

Closed
wants to merge 9 commits into from
Closed
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -77,3 +77,4 @@ target/

# written by setuptools_scm
*/_version.py
.idea/
28 changes: 20 additions & 8 deletions napari_tiff/_tests/test_tiff_reader.py
Original file line number Diff line number Diff line change
@@ -1,14 +1,13 @@
import numpy as np
import os
import pytest
import tifffile
import zipfile

import numpy as np
from napari_tiff import napari_get_reader
from napari_tiff.napari_tiff_reader import (imagecodecs_reader,
imagej_reader,
tifffile_reader,
zip_reader)
import pytest
import tifffile
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove the imagej_reader from tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, you're trying to just unify everything under one reader that automagically does the right thing. That's fine, but I think it's worth keeping and testing the other readers as independent functions. This makes it easier to understand when one particular functionality is broken.

Copy link
Collaborator

@GenevieveBuckley GenevieveBuckley Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR removes imagej_reader entirely, and instead has separate functions for finding imagej/ome/tiff metadata (instead of having separate functions for reading each of those types of tiff files).

EDIT: sorry Juan, for some reason your second comment didn't show up for me. I didn't mean to post redundant information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok it also saves imagej metadata - I didn't know that was supported. I'll put back the test.



def example_data_filepath(tmp_path, original_data):
Expand All @@ -18,9 +17,21 @@ def example_data_filepath(tmp_path, original_data):


def example_data_tiff(tmp_path, original_data):
example_data_filepath = str(tmp_path / "myfile.tif")
tifffile.imwrite(example_data_filepath, original_data, imagej=True)
return tifffile.TiffFile(example_data_filepath)
filepath = str(tmp_path / "myfile.tif")
tifffile.imwrite(filepath, original_data)
return tifffile.TiffFile(filepath)


def example_data_imagej(tmp_path, original_data):
filepath = str(tmp_path / "myfile.tif")
tifffile.imwrite(filepath, original_data, imagej=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is WIP? Or some metadata should be added? I know that it will contain some imagej tag but, will not contain any metadata to check if reading is proper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. The original test code read plain tif as imagej, so trying to expand with some simple imagej (and ome-tiff for ome-tiff reading) metadata that tifffile supports.
I'm happy to leave if/what metadata the imagej part of the code would be tested against, as I'm focussing on the ome-tiff reading.
In general a concise, non-generated dataset is needed for more thorough testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark is to ome function. It is also without providing any metadata.

return tifffile.TiffFile(filepath)


def example_data_ometiff(tmp_path, original_data):
filepath = str(tmp_path / "myfile.ome.tif")
tifffile.imwrite(filepath, original_data, ome=True)
return tifffile.TiffFile(filepath)


def example_data_zipped(tmp_path, original_data):
Expand Down Expand Up @@ -67,8 +78,9 @@ def test_reader(tmp_path, data_fixture, original_data):

@pytest.mark.parametrize("reader, data_fixture, original_data", [
(imagecodecs_reader, example_data_filepath, np.random.random((20, 20))),
(imagej_reader, example_data_tiff, np.random.randint(0, 255, size=(20, 20)).astype(np.uint8)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's important that we don't lose functionality, so we'll need to keep this test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...ah, I think I see what's happening with this restructure now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above

(tifffile_reader, example_data_imagej, np.random.randint(0, 255, size=(20, 20)).astype(np.uint8)),
(tifffile_reader, example_data_tiff, np.random.randint(0, 255, size=(20, 20)).astype(np.uint8)),
(tifffile_reader, example_data_ometiff, np.random.randint(0, 255, size=(20, 20, 3)).astype(np.uint8)),
(zip_reader, example_data_zipped, np.random.random((20, 20))),
])
def test_all_readers(reader, data_fixture, original_data, tmp_path):
Expand Down
157 changes: 142 additions & 15 deletions napari_tiff/napari_tiff_reader.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
from typing import List, Optional, Union, Any, Tuple, Dict, Callable

import numpy
from tifffile import TiffFile, TiffSequence, TIFF
from tifffile import TiffFile, TiffSequence, TIFF, xml2dict, PHOTOMETRIC
from vispy.color import Colormap

LayerData = Union[Tuple[Any], Tuple[Any, Dict], Tuple[Any, Dict, str]]
Expand Down Expand Up @@ -50,13 +50,10 @@ def napari_get_reader(path: PathLike) -> Optional[ReaderFunction]:

def reader_function(path: PathLike) -> List[LayerData]:
"""Return a list of LayerData tuples from path or list of paths."""
# TODO: Pyramids, OME, LSM
# TODO: LSM
with TiffFile(path) as tif:
try:
if tif.is_imagej:
layerdata = imagej_reader(tif)
else:
layerdata = tifffile_reader(tif)
layerdata = tifffile_reader(tif)
except Exception as exc:
# fallback to imagecodecs
log_warning(f'tifffile: {exc}')
Expand All @@ -72,14 +69,32 @@ def zip_reader(path: PathLike) -> List[LayerData]:


def tifffile_reader(tif):
"""Return napari LayerData from largest image series in TIFF file."""
"""Return napari LayerData from image series in TIFF file."""
nlevels = len(tif.series[0])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still a little confused about the difference between tiff series and tiff pages. Is anyone else able to explain it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although @cgohlke is the expert as creator of tifffile, as I understand it pages is the internal structure of tiff, and the series/levels provide higher level access (which point to the underlying pages).

if nlevels > 1:
import dask.array as da
data = [da.from_zarr(tif.aszarr(level=level)) for level in range(nlevels)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that the tiff series are arranged in a pyramid format. Is that something that is absolutely guaranteed by the tiff specification?
If not, we may need to check we actually have a multi-resolution file with all the resolution levels in the right order before passing it to napari.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not seen this written otherwise in any WSI file format. I imagine it's spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doctring from series Series of pages with compatible shape and data type. So if the length of series is different from 1 then different shapes are present.

However, I'm not sure if it always should be read as pyramidal. As I know that tiff allows storing thumbnail for example. But I do not have such data available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Czaki this is testing the length / number of levels inside the first serie. Would a thumbnail be the same or a separate series? What would the best way be to test for pyramid - ome metadata?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know. I work with writing my own tiff parser a long time ago and cannot find the data that I used then. This is rather point that it will be nice to check this (for example using some public datasets). I will also try to digg it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised this should be: len(tif.series[0].levels)

else:
data = tif.asarray()
if tif.is_ome:
kwargs = get_ome_tiff_metadata(tif)
# TODO: combine interpretation of imagej and tags metadata?:
elif tif.is_imagej:
kwargs = get_imagej_metadata(tif)
else:
kwargs = get_tiff_metadata(tif)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected that tiff files might include some combination of these types of metadata, not only one. Would it be better to combine the dictionaries instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's also possible. This is exactly what I'm doing in our image conversion pipeline e.g.:

https://github.com/FrancisCrickInstitute/OmeSliCC/blob/f6e2aecd5713bbbff7e10e04eebe5ede4bace1cc/OmeSliCC/TiffSource.py#L113

return [(data, kwargs, 'image')]


def get_tiff_metadata(tif):
"""Return napari metadata from largest image series in TIFF file."""
# TODO: fix (u)int32/64
# TODO: handle complex
series = tif.series[0]
for s in tif.series:
if s.size > series.size:
series = s
data = series.asarray()
dtype = series.dtype
axes = series.axes
shape = series.shape
page = next(p for p in series.pages if p is not None)
Expand Down Expand Up @@ -108,7 +123,7 @@ def tifffile_reader(tif):
elif (
page.photometric in (2, 6) and (
page.planarconfig == 2 or
(page.bitspersample > 8 and data.dtype.kind in 'iu') or
(page.bitspersample > 8 and dtype.kind in 'iu') or
(extrasamples and len(extrasamples) > 1)
)
):
Expand Down Expand Up @@ -191,7 +206,7 @@ def tifffile_reader(tif):

if (
contrast_limits is None and
data.dtype.kind == 'u' and
dtype.kind == 'u' and
page.photometric != 3 and
page.bitspersample not in (8, 16, 32, 64)
):
Expand All @@ -209,16 +224,16 @@ def tifffile_reader(tif):
blending=blending,
visible=visible,
)
return [(data, kwargs, 'image')]
return kwargs


def imagej_reader(tif):
def get_imagej_metadata(tif):
"""Return napari LayerData from ImageJ hyperstack."""
# TODO: ROI overlays
ijmeta = tif.imagej_metadata
series = tif.series[0]

data = series.asarray()
dtype = series.dtype
axes = series.axes
shape = series.shape
page = series.pages[0]
Expand Down Expand Up @@ -263,7 +278,7 @@ def imagej_reader(tif):
if mode in ('color', 'grayscale'):
blending = 'opaque'

elif axes[-1] == 'S' and data.dtype == 'uint16':
elif axes[-1] == 'S' and dtype == 'uint16':
# RGB >8-bit
channel_axis = axes.find('S')
if channel_axis >= 0 and shape[channel_axis] in (3, 4):
Expand Down Expand Up @@ -299,7 +314,96 @@ def imagej_reader(tif):
blending=blending,
visible=visible,
)
return [(data, kwargs, 'image')]
return kwargs


def get_ome_tiff_metadata(tif):
metadata = xml2dict(tif.ome_metadata)
if 'OME' in metadata:
metadata = metadata['OME']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metadata variable doesn't seem to get used in this function after this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP


series = tif.series[0]
shape = series.shape
dtype = series.dtype
axes = series.axes.lower().replace('s', 'c')
if 'c' in axes:
channel_axis = axes.index('c')
nchannels = shape[channel_axis]
else:
channel_axis = None
nchannels = 1

image = ensure_list(metadata.get('Image', {}))[0]
pixels = image.get('Pixels', {})

pixel_size = []
size = float(pixels.get('PhysicalSizeX', 0))
if size > 0:
pixel_size.append(get_value_units_micrometer(size, pixels.get('PhysicalSizeXUnit')))
size = float(pixels.get('PhysicalSizeY', 0))
if size > 0:
pixel_size.append(get_value_units_micrometer(size, pixels.get('PhysicalSizeYUnit')))

channels = ensure_list(pixels.get('Channel', []))
if len(channels) > nchannels:
nchannels = len(channels)

is_rgb = (series.keyframe.photometric == PHOTOMETRIC.RGB and nchannels in (3, 4))

names = []
contrast_limits = []
colormaps = []
blendings = []
visibles = []

scale = None
if pixel_size:
scale = pixel_size

for channeli, channel in enumerate(channels):
name = channel.get('Name')
color = channel.get('Color')
colormap = None
if color:
colormap = int_to_rgba(int(color))
elif is_rgb and len(channels) > 1:
# separate RGB channels
colormap = ['red', 'green', 'blue', 'alpha'][channeli]
if not name:
name = colormap

contrast_limit = None
if dtype.kind != 'f':
info = numpy.iinfo(dtype)
contrast_limit = (info.min, info.max)

blending = 'additive'
visible = True

if len(channels) > 1:
names.append(name)
blendings.append(blending)
contrast_limits.append(contrast_limit)
colormaps.append(colormap)
visibles.append(visible)
else:
names = name
blendings = blending
contrast_limits = contrast_limit
colormaps = colormap
visibles = visible

meta = dict(
rgb=is_rgb,
channel_axis=channel_axis,
name=names,
scale=scale,
colormap=colormaps,
contrast_limits=contrast_limits,
blending=blendings,
visible=visibles,
)
return meta


def imagecodecs_reader(path):
Expand All @@ -308,6 +412,12 @@ def imagecodecs_reader(path):
return [(imread(path), {}, 'image')]


def ensure_list(x):
if not isinstance(x, (list, tuple)):
x = [x]
return x


def alpha_colormap(bitspersample=8, samples=4):
"""Return Alpha colormap."""
n = 2**bitspersample
Expand Down Expand Up @@ -359,6 +469,23 @@ def cmyk_colormaps(bitspersample=8, samples=3):
return [Colormap(c), Colormap(m), Colormap(y), Colormap(k)]


def int_to_rgba(intrgba: int) -> tuple:
signed = (intrgba < 0)
rgba = [x / 255 for x in intrgba.to_bytes(4, signed=signed, byteorder="big")]
if rgba[-1] == 0:
rgba[-1] = 1
return tuple(rgba)


def get_value_units_micrometer(value: float, unit: str = None) -> float:
conversions = {'nm': 1e-3, 'µm': 1, 'um': 1, 'micrometer': 1, 'mm': 1e3, 'cm': 1e4, 'm': 1e6}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this should be global variable

if unit:
value_um = value * conversions.get(unit, 1)
else:
value_um = value
return value_um


def log_warning(msg, *args, **kwargs):
"""Log message with level WARNING."""
import logging
Expand Down
4 changes: 3 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -29,8 +29,10 @@ requires-python = '>=3.10'
dependencies = [
'imagecodecs',
'numpy',
'tifffile>=2020.5.7',
'tifffile>=2023.9.26',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific feature/requirement causing the bump in version number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, loading (and lazy writing) levels / zarr / dask

'dask',
'vispy',
'zarr',
]

[project.optional-dependencies]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just below this line there are testing dependencies. Please add napari there

Expand Down
Loading