Can't read re-segmented Xenium data #195

Open
liz-is opened this issue Aug 11, 2024 · 4 comments
@liz-is

liz-is commented Aug 11, 2024

Hello,

I have re-run cell segmentation for my Xenium data using Xenium Ranger version 2.0.0.12 to benefit from its improved segmentation algorithm. I can't read the re-segmented data with spatialdata_io.xenium(), even though the original output can be read. I have spatialdata-io version 0.1.4.

Error message
sample1_v2 = spatialdata_io.xenium(sample1_loc)

INFO     reading /mnt/storage/vaquerizas/xenium/outputs/XR_0014511/0014511/outs/cell_feature_matrix.h5

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[6], line 1
----> 1 sample1_v2 = spatialdata_io.xenium(sample1_loc)

File /mnt/storage/vaquerizas/liz/spatial/xenium/python_analysis/spatialdata_env/lib/python3.10/site-packages/spatialdata_io/_utils.py:46, in deprecation_alias.<locals>.deprecation_decorator.<locals>.wrapper(*args, **kwargs)
     44 class_name = f.__qualname__
     45 rename_kwargs(f.__name__, kwargs, aliases, class_name)
---> 46 return f(*args, **kwargs)

File /mnt/storage/vaquerizas/liz/spatial/xenium/python_analysis/spatialdata_env/lib/python3.10/site-packages/spatialdata_io/readers/xenium.py:227, in xenium(path, cells_boundaries, nucleus_boundaries, cells_as_circles, cells_labels, nucleus_labels, transcripts, morphology_mip, morphology_focus, aligned_images, cells_table, n_jobs, imread_kwargs, image_models_kwargs, labels_models_kwargs)
    218     labels["nucleus_labels"], _ = _get_labels_and_indices_mapping(
    219         path,
    220         XeniumKeys.CELLS_ZARR,
   (...)
    224         labels_models_kwargs=labels_models_kwargs,
    225     )
    226 if cells_labels:
--> 227     labels["cell_labels"], cell_labels_indices_mapping = _get_labels_and_indices_mapping(
    228         path,
    229         XeniumKeys.CELLS_ZARR,
    230         specs,
    231         mask_index=1,
    232         labels_name="cell_labels",
    233         labels_models_kwargs=labels_models_kwargs,
    234     )
    235     if cell_labels_indices_mapping is not None and table is not None:
    236         if not pd.DataFrame.equals(cell_labels_indices_mapping["cell_id"], table.obs[str(XeniumKeys.CELL_ID)]):

File /mnt/storage/vaquerizas/liz/spatial/xenium/python_analysis/spatialdata_env/lib/python3.10/site-packages/spatialdata_io/readers/xenium.py:446, in _get_labels_and_indices_mapping(path, file, specs, mask_index, labels_name, labels_models_kwargs)
    443     real_label_index = real_label_index[1:]
    445 if version < packaging.version.parse("2.0.0"):
--> 446     expected_label_index = z["seg_mask_value"][...]
    448     if not np.array_equal(expected_label_index, real_label_index):
    449         raise ValueError(
    450             "The label indices from the labels differ from the ones from the input data. Please report "
    451             f"this issue. Real label indices: {real_label_index}, expected label indices: "
    452             f"{expected_label_index}."
    453         )

File /mnt/storage/vaquerizas/liz/spatial/xenium/python_analysis/spatialdata_env/lib/python3.10/site-packages/zarr/hierarchy.py:511, in Group.__getitem__(self, item)
    509         raise KeyError(item)
    510 else:
--> 511     raise KeyError(item)

KeyError: 'seg_mask_value'

Is this related to #150, i.e. is Xenium Ranger reanalysis not supported?

Thanks in advance for your help with this.

@LucaMarconato
Member

Hi, thanks for reporting. The code branch being executed is the one for data versioned below 2.0.0. It could be that re-running segmentation with Xenium Ranger 2.0.0.12 updates some of the files to the newest format but leaves the global data version unchanged.

I would suggest manually parsing the data in this case; you can find information and tutorials on how to proceed in the linked issue #150, and you could use the xenium.py code as a starting point.

@liz-is
Author

liz-is commented Aug 12, 2024

Hi, thanks for the reply. I'll try adapting the code from spatialdata_io/readers/xenium.py to parse this format. I'll let you know if I figure out what's actually been changed in the resegmentation.

FWIW, I had a look at how the version is detected by spatialdata_io.readers.xenium, and it's technically being detected correctly, as this is what the relevant parts of my experiment.xenium file look like:

{
    "major_version": 4,
    [...snipped...]
    "instrument_sw_version": "1.9.2.0",
    "analysis_sw_version": "xenium-1.9.0.0",
     [...snipped...]
    },
    "xenium_explorer_files": {
    [...snipped...]
    },
    "xenium_ranger": {
        "run_id": "0014511",
        "version": "xenium-2.0.0.12",
        "command_line": "xeniumranger resegment --id=0014511 --xenium-bundle=/mnt/scratch/egi12/xenium/output-XETG00207__0014511__Region_1__20240315__115210/ --jobmode=slurm --disable-ui=true"
    },
    "segmentation_stain": ""
}
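
For anyone wanting to check their own bundle, a minimal sketch for pulling out the same fields is below. It assumes experiment.xenium is plain JSON with the keys shown above and sits at the top level of the output directory; `read_version_fields` is a hypothetical helper name, not part of spatialdata-io.

```python
import json
from pathlib import Path


def read_version_fields(bundle_dir):
    """Return the version-related fields of a Xenium bundle's metadata.

    Hypothetical helper: assumes `experiment.xenium` is a JSON file located
    directly inside `bundle_dir`, with the keys shown in the excerpt above.
    """
    specs = json.loads((Path(bundle_dir) / "experiment.xenium").read_text())
    # The "xenium_ranger" block is only expected after a Xenium Ranger re-run.
    ranger = specs.get("xenium_ranger") or {}
    return {
        "instrument_sw_version": specs.get("instrument_sw_version"),
        "analysis_sw_version": specs.get("analysis_sw_version"),
        "xenium_ranger_version": ranger.get("version"),
    }
```

If `xenium_ranger_version` is set and newer than `analysis_sw_version`, the bundle is probably in the same situation as the one described here.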

@LucaMarconato
Member

Thanks for sharing the details. Maybe a fix could involve looking for xenium_ranger in the metadata and, when the field is available, using its version to determine the code branch used by xenium() in spatialdata-io.
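
A rough sketch of that idea, to make the proposal concrete. This is not the code in spatialdata-io: `parse_version`, `effective_version` and `is_at_least` are hypothetical names, the metadata keys are the ones from the excerpt above, and versions are compared as plain integer tuples (standard library only) rather than with packaging.version as the reader does.

```python
from __future__ import annotations

import re


def parse_version(raw: str) -> tuple[int, ...]:
    """Turn a string such as 'xenium-2.0.0.12' into (2, 0, 0, 12)."""
    match = re.search(r"\d+(?:\.\d+)*", raw)
    if match is None:
        raise ValueError(f"No version number found in {raw!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def effective_version(specs: dict) -> tuple[int, ...]:
    """Prefer the Xenium Ranger version when the bundle was re-analysed.

    Assumption: `specs` is the parsed experiment.xenium metadata, with an
    optional 'xenium_ranger' block and a mandatory 'analysis_sw_version'.
    """
    ranger = specs.get("xenium_ranger") or {}
    raw = ranger.get("version") or specs["analysis_sw_version"]
    return parse_version(raw)


def is_at_least(version: tuple[int, ...], minimum: tuple[int, ...]) -> bool:
    """Compare versions after zero-padding, so (2, 0) counts as >= (2, 0, 0)."""
    width = max(len(version), len(minimum))

    def pad(v: tuple[int, ...]) -> tuple[int, ...]:
        return tuple(v) + (0,) * (width - len(v))

    return pad(version) >= pad(minimum)


# Metadata shaped like the excerpt in this thread.
specs = {
    "analysis_sw_version": "xenium-1.9.0.0",
    "xenium_ranger": {"version": "xenium-2.0.0.12"},
}
use_v2_branch = is_at_least(effective_version(specs), (2, 0, 0))
```

With the metadata from this thread, `effective_version(specs)` gives `(2, 0, 0, 12)`, so the reader would take the >= 2.0.0 branch and never look up `seg_mask_value`. Whether every file in a re-segmented bundle really follows the 2.0.0 layout would still need checking.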

@liz-is
Author

liz-is commented Aug 18, 2024

I managed to read in the data by adapting the code from spatialdata_io/readers/xenium.py to use the branches appropriate for version >= 2.0.0. So I think your idea to use the version in the xenium_ranger field in the metadata would work.

Caveats: 1) I didn't read in the transcripts, as I don't need them for my current analysis; 2) although I didn't get any errors or warnings, I don't know for sure that all the output is correct, as I'm not very familiar with the data structures.
