Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with indices vs positional indices in shapes #169

Open
LucaMarconato opened this issue Oct 9, 2023 · 6 comments
Open

Problem with indices vs positional indices in shapes #169

LucaMarconato opened this issue Oct 9, 2023 · 6 comments
Labels
bug Something isn't working priority: high

Comments

@LucaMarconato
Copy link
Member

LucaMarconato commented Oct 9, 2023

I have just noticed that the squidpy notebook can't make the last plot; it should be a quick fix. The latest version of the notebook (gonna be merge in main soon) is this one: https://github.com/scverse/spatialdata-notebooks/blob/fix/remove_napari/notebooks/examples/squidpy_integration.ipynb

The dataset used in the notebook is the xenium rep 1 data, it should be unchanged from the one we have downloaded and the S3 version should be up-to-date (I will re-enable the upload functions today/tomorrow, so for sure it will be up-to-date soon).

The problem seem to be due to the use of positional indices where ".loc" indices should have been used. Please see the screenshot: the cell_id and gdf.index columns match, but it seems that at some point either a positional index for the table, either the table.obs.index has been used.

image

Faulty line:

sdata.pl.render_shapes(element="cell_circles", color="leiden").pl.show()

Traceback

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[7], line 1
----> 1 sdata.pl.render_shapes(element="cell_circles", color="leiden").pl.show()

File ~/embl/projects/basel/spatialdata-plot/src/spatialdata_plot/pl/basic.py:673, in PlotAccessor.show(self, coordinate_systems, legend_fontsize, legend_fontweight, legend_loc, legend_fontoutline, na_in_legend, colorbar, wspace, hspace, ncols, frameon, figsize, dpi, fig, title, share_extent, pad_extent, ax, return_ax, save)
    662     _render_images(
    663         sdata=sdata,
    664         render_params=params,
   (...)
    670         # extent=extent[cs],
    671     )
    672 elif cmd == "render_shapes" and cs_contents.query(f"cs == '{cs}'")["has_shapes"][0]:
--> 673     _render_shapes(
    674         sdata=sdata,
    675         render_params=params,
    676         coordinate_system=cs,
    677         ax=ax,
    678         fig_params=fig_params,
    679         scalebar_params=scalebar_params,
    680         legend_params=legend_params,
    681     )
    683 elif cmd == "render_points" and cs_contents.query(f"cs == '{cs}'")["has_points"][0]:
    684     _render_points(
    685         sdata=sdata,
    686         render_params=params,
   (...)
    691         legend_params=legend_params,
    692     )

File ~/embl/projects/basel/spatialdata-plot/src/spatialdata_plot/pl/render.py:121, in _render_shapes(sdata, render_params, coordinate_system, ax, fig_params, scalebar_params, legend_params)
    118     color_vector = color_vector[mask]
    119 shapes = gpd.GeoDataFrame(shapes, geometry="geometry")
--> 121 _cax = _get_collection_shape(
    122     shapes=shapes,
    123     s=render_params.scale,
    124     c=color_vector,
    125     render_params=render_params,
    126     rasterized=sc_settings._vector_friendly,
    127     cmap=render_params.cmap_params.cmap,
    128     norm=norm,
    129     fill_alpha=render_params.fill_alpha,
    130     outline_alpha=render_params.outline_alpha
    131     # **kwargs,
    132 )
    134 # Sets the limits of the colorbar to the values instead of [0, 1]
    135 if not norm and not values_are_categorical:

File ~/embl/projects/basel/spatialdata-plot/src/spatialdata_plot/pl/utils.py:248, in _get_collection_shape(shapes, c, s, norm, render_params, fill_alpha, outline_alpha, **kwargs)
    246     scaled_coords = [(centroid + (np.array(coord) - centroid) * s).tolist() for coord in geom.exterior.coords]
    247     row["geometry"] = mplp.Polygon(scaled_coords, closed=True)
--> 248     assign_fill_and_outline_to_row(shapes, fill_c, outline_c, row, idx)
    249     rows.append(row)
    251 elif geom.geom_type == "MultiPolygon":
    252     # mp = _make_patch_from_multipolygon(geom)

File ~/embl/projects/basel/spatialdata-plot/src/spatialdata_plot/pl/utils.py:235, in _get_collection_shape.<locals>.assign_fill_and_outline_to_row(shapes, fill_c, outline_c, row, idx)
    233     row["outline_c"] = outline_c
    234 else:
--> 235     row["fill_c"] = fill_c[idx]
    236     row["outline_c"] = outline_c[idx]

IndexError: index 167780 is out of bounds for axis 0 with size 167780
@LucaMarconato LucaMarconato added the bug Something isn't working label Oct 9, 2023
@grst
Copy link
Contributor

grst commented Apr 9, 2024

I'm running into the same issue

@LucaMarconato
Copy link
Member Author

Hi @timtreis I just noticed that this bug is still open. Do you have any timeline for a fix? Thanks 😊

@LucaMarconato
Copy link
Member Author

Here is how to reproduce the problem. When the table has less rows than the shapes, the problem appears.

import matplotlib.pyplot as plt
from spatialdata.datasets import blobs_annotating_element
import spatialdata_plot

sdata = blobs_annotating_element("blobs_circles")
sdata['table'] = sdata['table'][:3]

sdata.pl.render_shapes("blobs_circles", color="instance_id").pl.show()
plt.show()

The behavior in napari, that would be expected from the user also here, is that the plot should be possible independently of which rows are present in the table and the shapes (it could be that the table has some rows in common, some missing and some extra; and no guarantee is expected from the order of the rows).

In napari shapes that are not present in the table are plotted in gray, while rows that are present in the table but not in the shapes are simply ignored. The order is matched via the join_spatialelement_table() API, which made the implementation easy.

@LucaMarconato
Copy link
Member Author

The bug is related to this #178 which if I remember well Sonja had fixed (and tests were added), I am looking for her PR. I think that plot can be made with any order so maybe the only uncovered cases are when there is a different number of rows between shapes and table. Using join_spatialelement_table() would fi all of this.

@LucaMarconato
Copy link
Member Author

Here are the PRs from Sonja https://github.com/scverse/spatialdata-plot/pull/163/files and https://github.com/scverse/spatialdata-plot/pull/177/files.

So it seems that:

Practically, to address the issue I suggest to proceed as follows:

  • mimic the use of the join APIs as done in napari-spatialdata
  • revisit the tests above to cover also the case in which the table has less rows than the shapes. Probably the best is too simultaneously do all the following, which is the most general case:
    • remove some rows from the table
    • remove some different rows from the shapes
    • shuffle both the table and shapes

@LucaMarconato
Copy link
Member Author

Related bug, reported in #357. Using groups leads to a filtering of the table and triggers this bug due to a table/shapes mismatch (that is expected).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority: high
Projects
None yet
Development

No branches or pull requests

3 participants