Extend image metadata #18951
base: dev
The problem with black will be solved in #18955.
This is more consistent with the terminology used in https://github.com/cgohlke/tifffile/blob/8a25a0d4738390af0a1f693705f29875d88fc320/tifffile/tifffile.py#L4676
One thing that I was thinking about is resource usage (time and memory) for setting the metadata. Is this limited in the current implementation? For many other data types we restrict the amount of data that we process (often we read only a 1 MB prefix), but I guess this is not useful here.
Another thing to consider is what happens if a tool requires certain metadata, but it is missing because the data was uploaded into Galaxy before that metadata existed. This issue probably arises whenever new metadata is added in Galaxy. Are there any established procedures to cope with that? If not, two possible solutions come to mind. Either the tool should also accept a dataset for which the metadata is missing, or Galaxy should automatically recognize that the metadata of a dataset is outdated and rerun the respective metadata extraction methods. The former seems more easily feasible. It essentially means that a validator of the form
I think re-running metadata extraction is the way to go (users can trigger that, if I'm not wrong). With the Adding an
Definitely something to consider. Given these considerations, I have made several changes in 1e6701f:
Something strange is going on with the tests: they hang seemingly at random. Most of the time, both running locally using However, after removing the first two tests, Test 3 passes (it is then the first test), and the run hangs on Test 4 instead (which is then the second test).
Does it work locally with
Nope, same behavior using with tifffile.TiffFile(dataset.get_file_name()) as tif: (note that this line was already there before this PR). Haven't tried
The issue with the last test is fixed in f3e20d9. This should be totally unrelated to the other tests, but somehow it also fixes their hanging (both locally and in CI). Maybe a bug in the test execution?
This PR adds a series of basic metadata elements for image data, including:
- width, height, depth of the image, as well as the number of channels and frames (the terminology is consistent with Add support for arbitrarily ordered image axes in image content assertions #18891)
- axes of the image (e.g., YXC or ZCYX)
- dtype: The data type of the image pixels or voxels (e.g., uint8 or float64)
- num_unique_values: The number of unique values in the image

This is useful to define validators for input data when working with images. Some examples of when this will be useful:
- num_unique_values is 1 or 2.
- dtype: Some tools might not support float or int image data.
- channels is 0 or 1: Restrict input data to single-channel images.
- axes, depth, channels, frames: Require that an image has one or more z-slices / channels / time steps.

TIFF files are read using the tifffile library; for other image types, reading is attempted using Pillow. The new metadata is defined as optional, because Pillow might not be installed, or it might not be possible to read an image using Pillow (e.g., due to an image format that Pillow does not support).

For multi-page TIFF files, the metadata is determined for each page individually and then joined into a ,-separated string (with the order corresponding to the order of the pages in the series).
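To make the description above more concrete, here is a minimal sketch of how the per-page fields could be derived from an axes string and a shape, and then joined into a ,-separated string. It is not the code from this PR. The helper names (describe_page, count_unique, join_pages) are hypothetical, the use of T as the axis letter for frames is an assumption, and reporting an absent axis as 0 is only inferred from the "channels is 0 or 1" example.

```python
from typing import Dict, Iterable, List, Sequence

# Assumed single-letter axis codes; "T" for frames is an assumption.
AXIS_TO_FIELD = {
    "X": "width",
    "Y": "height",
    "Z": "depth",
    "C": "channels",
    "T": "frames",
}


def describe_page(axes: str, shape: Sequence[int]) -> Dict[str, object]:
    """Map an axes string such as 'ZCYX' and a matching shape to metadata fields."""
    if len(axes) != len(shape):
        raise ValueError(f"axes {axes!r} does not match shape {tuple(shape)}")
    # Assumption: an axis that is absent from the image is reported as 0.
    meta: Dict[str, object] = {field: 0 for field in AXIS_TO_FIELD.values()}
    for axis, size in zip(axes.upper(), shape):
        field = AXIS_TO_FIELD.get(axis)
        if field is not None:
            meta[field] = size
    meta["axes"] = axes.upper()
    return meta


def count_unique(values: Iterable[int]) -> int:
    """Count distinct pixel values in a flattened sequence of pixels."""
    return len(set(values))


def join_pages(pages: List[Dict[str, object]], field: str) -> str:
    """Join one field across pages, keeping the page order."""
    return ",".join(str(page[field]) for page in pages)


pages = [
    describe_page("YXC", (480, 640, 3)),
    describe_page("ZCYX", (5, 2, 64, 32)),
]
print(join_pages(pages, "width"))      # 640,32
print(join_pages(pages, "channels"))   # 3,2
print(join_pages(pages, "axes"))       # YXC,ZCYX
print(count_unique([0, 255, 255, 0]))  # 2
```

A tool validator could then split such a string on , and check every page, for example requiring that each entry of num_unique_values is 1 or 2.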