Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update progress listener API to provide tile counts at each start notification #221

Merged
merged 3 commits into from
Nov 15, 2023

Conversation

melissalinkert
Copy link
Member

This should make it easier for a listener implementation to use a single progress bar for the whole conversion, instead of one progress bar per resolution (as discussed with @DavidStirling).

Updating the notifications in Converter requires pre-calculating tile and resolution counts before starting to write tiles, but that also means failing faster if the pixel type and downsampling (or other options) are incompatible.

raw2ometiff will need similar updates once we're happy with this, as it consumes the progress listener API and implementations here. Might be a case for a 0.8.0-rc1 that includes this, so we can more easily test with all of the downstream applications.

…ification

This should make it easier for a listener implementation to use a single progress bar
for the whole conversion, instead of one progress bar per resolution.

Updating the notifications in Converter requires pre-calculating tile and resolution counts
before starting to write tiles, but that also means failing faster if the pixel type and
downsampling (or other options) are incompatible.
Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, 👍 for having additional metrics sent to the listening API for improved reporting/debugging.

Before digging into functional review, possibly the most impactful question I have is about vocabulary. We use both the tile and chunk terms in the converter. This is particularly true in the listener API where the series/resolution start method take tileCount as an input but the low-level notification are for chunk processing.
Using the default options, I think the two concepts are equivalent as the converter transfer input tiles into 2D chunks. Thinking of possible extensions like writing 3D chunks, should the listener API report the number of tiles read from the original data or the number of chunks written to the output?

*
* @param tileCount total number of tiles
*/
void notifyTotalTileCount(long tileCount);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing the various APIs:

  • notifySeriesStart takes the number of resolutions and total number of tiles for a given series
  • notifyResolutionStart takes the number of tiles for a given resolution,

I wonder if this new API should pre-emptively take the number of series and total number of tiles for a given dataset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree that sending an "I'm starting this job" message with the total series and tile counts might be the most useful transmission here. For bioformats2raw you can cheese the series count as the series list is public, but this is not the case in raw2ometiff. Adding it to this notification makes a lot of sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

db85137 switches this to notifyTotalCounts(int series, long chunk)

@melissalinkert
Copy link
Member Author

db85137 does switch the whole progress API to using chunk instead of tile. I think in this context chunk makes more sense and is more consistent with Zarr and OME-NGFF spec terminology. Happy to hear other thoughts though. I intentionally did not try to switch all use of tile in Converter to chunk, as that would be a large change outside the intended scope of this PR.

@sbesson sbesson self-requested a review October 30, 2023 21:51
Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reading the last changes, the only naming question that came to mind is whether notifyTotalCount could be simply renamed as notifyStart.

Otherwise, the more systematic usage of chunk in the writing API is consistent with the Zarr core concepts as well as the the rest of the methods in the progress API.

@sbesson sbesson merged commit 5fcddf1 into glencoesoftware:master Nov 15, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants