-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Dask/pyramid suport. Split tiff/imagej/ome reader at metadata level instead. #22
Changes from 4 commits
d1290fd
fecc64c
da56519
ab4f521
ee9c216
ccbc28f
967d3e4
17af7eb
616a7ab
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -77,3 +77,4 @@ target/ | |
|
||
# written by setuptools_scm | ||
*/_version.py | ||
.idea/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,13 @@ | ||
import numpy as np | ||
import os | ||
import pytest | ||
import tifffile | ||
import zipfile | ||
|
||
import numpy as np | ||
from napari_tiff import napari_get_reader | ||
from napari_tiff.napari_tiff_reader import (imagecodecs_reader, | ||
imagej_reader, | ||
tifffile_reader, | ||
zip_reader) | ||
import pytest | ||
import tifffile | ||
|
||
|
||
def example_data_filepath(tmp_path, original_data): | ||
|
@@ -18,9 +17,21 @@ def example_data_filepath(tmp_path, original_data): | |
|
||
|
||
def example_data_tiff(tmp_path, original_data): | ||
example_data_filepath = str(tmp_path / "myfile.tif") | ||
tifffile.imwrite(example_data_filepath, original_data, imagej=True) | ||
return tifffile.TiffFile(example_data_filepath) | ||
filepath = str(tmp_path / "myfile.tif") | ||
tifffile.imwrite(filepath, original_data) | ||
return tifffile.TiffFile(filepath) | ||
|
||
|
||
def example_data_imagej(tmp_path, original_data): | ||
filepath = str(tmp_path / "myfile.tif") | ||
tifffile.imwrite(filepath, original_data, imagej=True) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is WIP? Or some metadata should be added? I know that it will contain some imagej tag but, will not contain any metadata to check if reading is proper. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a good point. The original test code read plain tif as imagej, so trying to expand with some simple imagej (and ome-tiff for ome-tiff reading) metadata that tifffile supports. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same remark is to ome function. It is also without providing any metadata. |
||
return tifffile.TiffFile(filepath) | ||
|
||
|
||
def example_data_ometiff(tmp_path, original_data): | ||
filepath = str(tmp_path / "myfile.ome.tif") | ||
tifffile.imwrite(filepath, original_data, ome=True) | ||
return tifffile.TiffFile(filepath) | ||
|
||
|
||
def example_data_zipped(tmp_path, original_data): | ||
|
@@ -67,8 +78,9 @@ def test_reader(tmp_path, data_fixture, original_data): | |
|
||
@pytest.mark.parametrize("reader, data_fixture, original_data", [ | ||
(imagecodecs_reader, example_data_filepath, np.random.random((20, 20))), | ||
(imagej_reader, example_data_tiff, np.random.randint(0, 255, size=(20, 20)).astype(np.uint8)), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's important that we don't lose functionality, so we'll need to keep this test. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ...ah, I think I see what's happening with this restructure now There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See above |
||
(tifffile_reader, example_data_imagej, np.random.randint(0, 255, size=(20, 20)).astype(np.uint8)), | ||
(tifffile_reader, example_data_tiff, np.random.randint(0, 255, size=(20, 20)).astype(np.uint8)), | ||
(tifffile_reader, example_data_ometiff, np.random.randint(0, 255, size=(20, 20, 3)).astype(np.uint8)), | ||
(zip_reader, example_data_zipped, np.random.random((20, 20))), | ||
]) | ||
def test_all_readers(reader, data_fixture, original_data, tmp_path): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,7 +12,7 @@ | |
from typing import List, Optional, Union, Any, Tuple, Dict, Callable | ||
|
||
import numpy | ||
from tifffile import TiffFile, TiffSequence, TIFF | ||
from tifffile import TiffFile, TiffSequence, TIFF, xml2dict, PHOTOMETRIC | ||
from vispy.color import Colormap | ||
|
||
LayerData = Union[Tuple[Any], Tuple[Any, Dict], Tuple[Any, Dict, str]] | ||
|
@@ -50,13 +50,10 @@ def napari_get_reader(path: PathLike) -> Optional[ReaderFunction]: | |
|
||
def reader_function(path: PathLike) -> List[LayerData]: | ||
"""Return a list of LayerData tuples from path or list of paths.""" | ||
# TODO: Pyramids, OME, LSM | ||
# TODO: LSM | ||
with TiffFile(path) as tif: | ||
try: | ||
if tif.is_imagej: | ||
layerdata = imagej_reader(tif) | ||
else: | ||
layerdata = tifffile_reader(tif) | ||
layerdata = tifffile_reader(tif) | ||
except Exception as exc: | ||
# fallback to imagecodecs | ||
log_warning(f'tifffile: {exc}') | ||
|
@@ -72,14 +69,32 @@ def zip_reader(path: PathLike) -> List[LayerData]: | |
|
||
|
||
def tifffile_reader(tif): | ||
"""Return napari LayerData from largest image series in TIFF file.""" | ||
"""Return napari LayerData from image series in TIFF file.""" | ||
nlevels = len(tif.series[0]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm still a little confused about the difference between tiff series and tiff pages. Is anyone else able to explain it? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Although @cgohlke is the expert as creator of tifffile, as I understand it pages is the internal structure of tiff, and the series/levels provide higher level access (which point to the underlying pages). |
||
if nlevels > 1: | ||
import dask.array as da | ||
data = [da.from_zarr(tif.aszarr(level=level)) for level in range(nlevels)] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This assumes that the tiff series are arranged in a pyramid format. Is that something that is absolutely guaranteed by the tiff specification? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've not seen this written otherwise in any WSI file format. I imagine it's spec. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Doctring from series However, I'm not sure if it always should be read as pyramidal. As I know that tiff allows storing thumbnail for example. But I do not have such data available. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Czaki this is testing the length / number of levels inside the first serie. Would a thumbnail be the same or a separate series? What would the best way be to test for pyramid - ome metadata? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do not know. I work with writing my own tiff parser a long time ago and cannot find the data that I used then. This is rather point that it will be nice to check this (for example using some public datasets). I will also try to digg it later. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just realised this should be: |
||
else: | ||
data = tif.asarray() | ||
if tif.is_ome: | ||
kwargs = get_ome_tiff_metadata(tif) | ||
# TODO: combine interpretation of imagej and tags metadata?: | ||
elif tif.is_imagej: | ||
kwargs = get_imagej_metadata(tif) | ||
else: | ||
kwargs = get_tiff_metadata(tif) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I expected that tiff files might include some combination of these types of metadata, not only one. Would it be better to combine the dictionaries instead? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes that's also possible. This is exactly what I'm doing in our image conversion pipeline e.g.: |
||
return [(data, kwargs, 'image')] | ||
|
||
|
||
def get_tiff_metadata(tif): | ||
"""Return napari metadata from largest image series in TIFF file.""" | ||
# TODO: fix (u)int32/64 | ||
# TODO: handle complex | ||
series = tif.series[0] | ||
for s in tif.series: | ||
if s.size > series.size: | ||
series = s | ||
data = series.asarray() | ||
dtype = series.dtype | ||
axes = series.axes | ||
shape = series.shape | ||
page = next(p for p in series.pages if p is not None) | ||
|
@@ -108,7 +123,7 @@ def tifffile_reader(tif): | |
elif ( | ||
page.photometric in (2, 6) and ( | ||
page.planarconfig == 2 or | ||
(page.bitspersample > 8 and data.dtype.kind in 'iu') or | ||
(page.bitspersample > 8 and dtype.kind in 'iu') or | ||
(extrasamples and len(extrasamples) > 1) | ||
) | ||
): | ||
|
@@ -191,7 +206,7 @@ def tifffile_reader(tif): | |
|
||
if ( | ||
contrast_limits is None and | ||
data.dtype.kind == 'u' and | ||
dtype.kind == 'u' and | ||
page.photometric != 3 and | ||
page.bitspersample not in (8, 16, 32, 64) | ||
): | ||
|
@@ -209,16 +224,16 @@ def tifffile_reader(tif): | |
blending=blending, | ||
visible=visible, | ||
) | ||
return [(data, kwargs, 'image')] | ||
return kwargs | ||
|
||
|
||
def imagej_reader(tif): | ||
def get_imagej_metadata(tif): | ||
"""Return napari LayerData from ImageJ hyperstack.""" | ||
# TODO: ROI overlays | ||
ijmeta = tif.imagej_metadata | ||
series = tif.series[0] | ||
|
||
data = series.asarray() | ||
dtype = series.dtype | ||
axes = series.axes | ||
shape = series.shape | ||
page = series.pages[0] | ||
|
@@ -263,7 +278,7 @@ def imagej_reader(tif): | |
if mode in ('color', 'grayscale'): | ||
blending = 'opaque' | ||
|
||
elif axes[-1] == 'S' and data.dtype == 'uint16': | ||
elif axes[-1] == 'S' and dtype == 'uint16': | ||
# RGB >8-bit | ||
channel_axis = axes.find('S') | ||
if channel_axis >= 0 and shape[channel_axis] in (3, 4): | ||
|
@@ -299,7 +314,96 @@ def imagej_reader(tif): | |
blending=blending, | ||
visible=visible, | ||
) | ||
return [(data, kwargs, 'image')] | ||
return kwargs | ||
|
||
|
||
def get_ome_tiff_metadata(tif): | ||
metadata = xml2dict(tif.ome_metadata) | ||
if 'OME' in metadata: | ||
metadata = metadata['OME'] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. WIP |
||
|
||
series = tif.series[0] | ||
shape = series.shape | ||
dtype = series.dtype | ||
axes = series.axes.lower().replace('s', 'c') | ||
if 'c' in axes: | ||
channel_axis = axes.index('c') | ||
nchannels = shape[channel_axis] | ||
else: | ||
channel_axis = None | ||
nchannels = 1 | ||
|
||
image = ensure_list(metadata.get('Image', {}))[0] | ||
pixels = image.get('Pixels', {}) | ||
|
||
pixel_size = [] | ||
size = float(pixels.get('PhysicalSizeX', 0)) | ||
if size > 0: | ||
pixel_size.append(get_value_units_micrometer(size, pixels.get('PhysicalSizeXUnit'))) | ||
size = float(pixels.get('PhysicalSizeY', 0)) | ||
if size > 0: | ||
pixel_size.append(get_value_units_micrometer(size, pixels.get('PhysicalSizeYUnit'))) | ||
|
||
channels = ensure_list(pixels.get('Channel', [])) | ||
if len(channels) > nchannels: | ||
nchannels = len(channels) | ||
|
||
is_rgb = (series.keyframe.photometric == PHOTOMETRIC.RGB and nchannels in (3, 4)) | ||
|
||
names = [] | ||
contrast_limits = [] | ||
colormaps = [] | ||
blendings = [] | ||
visibles = [] | ||
|
||
scale = None | ||
if pixel_size: | ||
scale = pixel_size | ||
|
||
for channeli, channel in enumerate(channels): | ||
name = channel.get('Name') | ||
color = channel.get('Color') | ||
colormap = None | ||
if color: | ||
colormap = int_to_rgba(int(color)) | ||
elif is_rgb and len(channels) > 1: | ||
# separate RGB channels | ||
colormap = ['red', 'green', 'blue', 'alpha'][channeli] | ||
if not name: | ||
name = colormap | ||
|
||
contrast_limit = None | ||
if dtype.kind != 'f': | ||
info = numpy.iinfo(dtype) | ||
contrast_limit = (info.min, info.max) | ||
|
||
blending = 'additive' | ||
visible = True | ||
|
||
if len(channels) > 1: | ||
names.append(name) | ||
blendings.append(blending) | ||
contrast_limits.append(contrast_limit) | ||
colormaps.append(colormap) | ||
visibles.append(visible) | ||
else: | ||
names = name | ||
blendings = blending | ||
contrast_limits = contrast_limit | ||
colormaps = colormap | ||
visibles = visible | ||
|
||
meta = dict( | ||
rgb=is_rgb, | ||
channel_axis=channel_axis, | ||
name=names, | ||
scale=scale, | ||
colormap=colormaps, | ||
contrast_limits=contrast_limits, | ||
blending=blendings, | ||
visible=visibles, | ||
) | ||
return meta | ||
|
||
|
||
def imagecodecs_reader(path): | ||
|
@@ -308,6 +412,12 @@ def imagecodecs_reader(path): | |
return [(imread(path), {}, 'image')] | ||
|
||
|
||
def ensure_list(x): | ||
if not isinstance(x, (list, tuple)): | ||
x = [x] | ||
return x | ||
|
||
|
||
def alpha_colormap(bitspersample=8, samples=4): | ||
"""Return Alpha colormap.""" | ||
n = 2**bitspersample | ||
|
@@ -359,6 +469,23 @@ def cmyk_colormaps(bitspersample=8, samples=3): | |
return [Colormap(c), Colormap(m), Colormap(y), Colormap(k)] | ||
|
||
|
||
def int_to_rgba(intrgba: int) -> tuple: | ||
signed = (intrgba < 0) | ||
rgba = [x / 255 for x in intrgba.to_bytes(4, signed=signed, byteorder="big")] | ||
if rgba[-1] == 0: | ||
rgba[-1] = 1 | ||
return tuple(rgba) | ||
|
||
|
||
def get_value_units_micrometer(value: float, unit: str = None) -> float: | ||
conversions = {'nm': 1e-3, 'µm': 1, 'um': 1, 'micrometer': 1, 'mm': 1e3, 'cm': 1e4, 'm': 1e6} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that this should be global variable |
||
if unit: | ||
value_um = value * conversions.get(unit, 1) | ||
else: | ||
value_um = value | ||
return value_um | ||
|
||
|
||
def log_warning(msg, *args, **kwargs): | ||
"""Log message with level WARNING.""" | ||
import logging | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,8 +29,10 @@ requires-python = '>=3.10' | |
dependencies = [ | ||
'imagecodecs', | ||
'numpy', | ||
'tifffile>=2020.5.7', | ||
'tifffile>=2023.9.26', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a specific feature/requirement causing the bump in version number? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, loading (and lazy writing) levels / zarr / dask |
||
'dask', | ||
'vispy', | ||
'zarr', | ||
] | ||
|
||
[project.optional-dependencies] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just below this line there are testing dependencies. Please add |
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove the imagej_reader from tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, you're trying to just unify everything under one reader that automagically does the right thing. That's fine, but I think it's worth keeping and testing the other readers as independent functions. This makes it easier to understand when one particular functionality is broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR removes imagej_reader entirely, and instead has separate functions for finding imagej/ome/tiff metadata (instead of having separate functions for reading each of those types of tiff files).
EDIT: sorry Juan, for some reason your second comment didn't show up for me. I didn't mean to post redundant information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok it also saves imagej metadata - I didn't know that was supported. I'll put back the test.