Worse performance using datashader? #296

LucaMarconato · 2024-07-13T15:06:22Z

I wrote some benchmarks, available in #295 (they can simply be run as tests), and I noticed that the datashader performance is worse than the matplotlib-based one.

I think this may be due to the size of the canvas used by datashader, since in the MERFISH example in #243 the performance was (as expected) better.

Therefore, using a smaller default canvas size may fix the issue. @Sonja-Stockhaus could you please have a look into this?

LucaMarconato · 2024-07-13T15:09:31Z

Here are the results of a (single) run of the tests (the timings are consistent across multiple manual runs).

LucaMarconato · 2024-07-13T15:45:53Z

With the fix I proposed for the performance bug in #297, the performance gap is much bigger.

timtreis · 2024-07-14T16:19:27Z

@Sonja-Stockhaus my "didn't-look-at-the-code" theory is that datashader generates too large of an image which then bypasses the rasterisation-downsampling logic. Wdyt?

Sonja-Stockhaus · 2024-07-15T19:14:04Z

Yep, datashader generates an image that is exactly the size of the extent (large extent = large image = long runtime). I'll think of something so that we can use a smaller canvas size and then maybe rasterize (or similar) to bring it back to the original scale.
Do we want a heuristic again to decide on the "smaller canvas size"?

I also noticed that for datashader, e.g. the radius of the points is relative to the axes which is not the case for matplotlib. So for a large extent you need extremely large point sizes to even make them visible at all with datashader. That should be consistent with matplotlib.
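To make the point-size mismatch concrete, here is a minimal sketch (not the spatialdata-plot code). It relies on the fact that the matplotlib scatter size `s` is a marker area in points squared, with 72 points per inch. The helper names `marker_radius_px` and `radius_on_canvas` are hypothetical, and the assumption is that the rasterized canvas is later shown scaled down to the figure width.

```python
import math


def marker_radius_px(s, dpi):
    """Radius in display pixels of a matplotlib scatter marker of size `s`.

    `s` is the marker area in points**2, so sqrt(s) is the diameter in points.
    One point is 1/72 inch, hence the dpi / 72 conversion to pixels.
    """
    return math.sqrt(s) / 2 * dpi / 72


def radius_on_canvas(s, dpi, canvas_width_px, fig_width_inches):
    """Radius (in canvas pixels) needed to match the matplotlib marker.

    Hypothetical helper: assumes the canvas is displayed scaled to the
    figure width, so radii must grow by canvas_width / figure_width_px.
    """
    figure_width_px = fig_width_inches * dpi
    return marker_radius_px(s, dpi) * canvas_width_px / figure_width_px
```

For example, with `s=36`, `dpi=100`, an 8 inch wide figure and a canvas 20000 pixels wide (one pixel per unit of extent), the marker needs a radius of about 104 canvas pixels to look like a matplotlib marker of roughly 4 display pixels, which is why small point sizes become invisible on a large extent.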

LucaMarconato · 2024-07-16T14:58:10Z

Thanks for the explanation. I would reuse the logic of _rasterize_if_necessary() or _multiscale_to_spatial_image() to take the dpi of the figure and the fig_size into consideration, since the extent could be extremely large, but in the end we are limited by the pixels available on screen/paper for plotting.
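A minimal sketch of what I mean, assuming we only cap the canvas by the pixels the figure can show. This is not the actual logic of `_rasterize_if_necessary()`; `capped_canvas_size` is a hypothetical helper name, and the extent is assumed to be given in data units where one unit currently maps to one canvas pixel.

```python
def capped_canvas_size(extent_width, extent_height, fig_size_inches, dpi):
    """Return (plot_width, plot_height, scale) for a rasterization canvas.

    The canvas never exceeds the pixels available in the figure
    (fig_size * dpi) and keeps the aspect ratio of the extent.
    """
    # Pixels available for plotting along each axis.
    max_width_px = fig_size_inches[0] * dpi
    max_height_px = fig_size_inches[1] * dpi

    # Shrink to fit inside the figure; never upscale a small extent.
    scale = min(1.0, max_width_px / extent_width, max_height_px / extent_height)

    plot_width = max(1, round(extent_width * scale))
    plot_height = max(1, round(extent_height * scale))
    return plot_width, plot_height, scale
```

The resulting width and height could then be passed as `plot_width` and `plot_height` to `datashader.Canvas`, together with `x_range` and `y_range` set to the full extent, so that the aggregation covers the same region with fewer pixels.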

LucaMarconato · 2024-07-16T14:59:45Z

Btw, off-topic comment: when plotting Visium HD data as points/circles I noticed a moiré pattern due to the presence of a small rotation in the raw data. With datashader rasterization the moiré pattern disappears, which is great! So using datashader could also have this nice use case beyond improved performance.

LucaMarconato mentioned this issue Jul 13, 2024

Drastic decrease in performance for matplotlib plotting #297

Closed

timtreis added the bug, priority: medium, images 🖼️, labels 🏷️ and points 🧮 labels Jul 14, 2024

Sonja-Stockhaus linked a pull request Jul 17, 2024 that will close this issue

datashader speedup and bugfixes #309

Open

Sonja-Stockhaus mentioned this issue Jul 17, 2024

Error in Squidpy notebook due to rasterization not handling transformations #291

Open

timtreis assigned Sonja-Stockhaus Oct 2, 2024
