AttributeError: 'DataDict' object has no attribute 'copy' #895

Closed
carmensandoval opened this issue Feb 2, 2023 · 5 comments

Comments

@carmensandoval

carmensandoval commented Feb 2, 2023

Trying to work with an anndata object that I can use with pegasus just fine returns the following error:

import scanpy as sc

adata = sc.read_h5ad('/path/to/my.h5ad')

adata

AnnData object with n_obs × n_vars = 57646 × 33554
    obs: 'n_genes', 'n_counts', 'percent_mito', 'leiden_labels', 'doublet_score', 'pred_dbl', 'dbl_kmeans_', 'timepoint'
    var: 'featureid', 'n_cells', 'percent_cells', 'robust', 'highly_variable_features', 'mean', 'var', 'hvf_loess', 'hvf_rank'
    uns: 'Channel_colors', 'PCs', 'W', '_attr2type', 'genome', 'leiden_labels_colors', 'leiden_resolution', 'modality', 'nmf_err', 'nmf_features', 'norm_count', 'pca', 'pca_features', 'pca_ncomps', 'stdzn_max_value', 'stdzn_mean', 'stdzn_std', 'timepoint_colors', 'uid', 'uns_dict', 'var_dict'
    obsm: 'H', 'X_nmf', 'X_pca', 'X_pca_harmony', 'X_umap', '_tmp_fmat_highly_variable_features', 'pca_harmony_knn_distances', 'pca_harmony_knn_indices', 'pca_knn_distances', 'pca_knn_indices'
    varm: 'de_res'
    layers: 'raw.X.log_norm'
    obsp: 'W_pca', 'W_pca_harmony'

sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In [67], line 1
----> 1 sc.pp.filter_cells(adata_2, min_genes=200)
      2 sc.pp.filter_genes(adata_2, min_cells=3)

File ~/mambaforge/envs/pegasus/lib/python3.9/site-packages/scanpy/preprocessing/_simple.py:141, in filter_cells(data, min_counts, min_genes, max_counts, max_genes, inplace, copy)
    139     else:
    140         adata.obs['n_genes'] = number
--> 141     adata._inplace_subset_obs(cell_subset)
    142     return adata if copy else None
    143 X = data  # proceed with processing the data matrix

File ~/mambaforge/envs/pegasus/lib/python3.9/site-packages/anndata/_core/anndata.py:1264, in AnnData._inplace_subset_obs(self, index)
   1262 else:
   1263     dtype = None
-> 1264 self._init_as_actual(adata_subset, dtype=dtype)

File ~/mambaforge/envs/pegasus/lib/python3.9/site-packages/anndata/_core/anndata.py:509, in AnnData._init_as_actual(self, X, obs, var, uns, obsm, varm, varp, obsp, raw, layers, dtype, shape, filename, filemode)
    506         raise ValueError(f"Index of {attr_name} must match {x_name} of X.")
    508 # unstructured annotations
--> 509 self.uns = uns or OrderedDict()
    511 # TODO: Think about consequences of making obsm a group in hdf
    512 self._obsm = AxisArrays(self, 0, vals=convert_to_dict(obsm))
...
File ~/mambaforge/envs/pegasus/lib/python3.9/site-packages/anndata/compat/_overloaded_dict.py:130, in OverloadedDict.copy(self)
    129 def copy(self) -> dict:
--> 130     return self.data.copy()

AttributeError: 'DataDict' object has no attribute 'copy'

Any ideas what could be going on and how to fix this?

@ivirshup
Member

ivirshup commented Feb 2, 2023

Thanks for the report. That does seem strange.

I think the issue is that we're assuming the dict-like values are actually dicts.
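
A minimal sketch of that failure mode, using only the standard library. `DataDictLike` is a hypothetical stand-in, not the actual `DataDict` class; it assumes the real class implements the mapping protocol but defines no `copy` method, which is what the traceback suggests. The key and value are made up.

```python
from collections.abc import MutableMapping


class DataDictLike(MutableMapping):
    """Hypothetical stand-in for a dict-like class that lacks .copy()."""

    def __init__(self, *args, **kwargs):
        self._store = dict(*args, **kwargs)

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


uns = DataDictLike(genome="example")

# The abstract base class supplies keys(), items(), get(), update(), ...
# but not copy(), so code that assumes a real dict breaks here.
try:
    uns.copy()
    copy_error = None
except AttributeError as err:
    copy_error = err

# Converting to a plain dict restores the missing method.
plain = dict(uns)
plain_copy = plain.copy()
```

Anything that passes `isinstance(x, Mapping)` can still be missing dict-only methods such as `copy`, so the check alone does not guarantee dict behaviour.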

Did you literally run:

import scanpy as sc

adata = sc.read_h5ad('/path/to/my.h5ad')
sc.pp.filter_cells(adata, min_genes=200)

Because I don't think that should be able to give you anything in uns that isn't a dict.

Could you please also report some info on your environment? E.g. the output of import session_info; session_info.show(dependencies=True, html=False) from the python session where you get the error.

@carmensandoval
Author

carmensandoval commented Feb 3, 2023

My apologies -- this happens when converting a pegasus object to anndata.

adata = pg.read_input('../cellbender/SAM24425932/SAM24425932_cellbender_out_filtered.h5')

adata = adata.to_anndata()

sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)

AttributeError: 'DataDict' object has no attribute 'copy'

Saving it first, then reading it with scanpy.read_h5ad works fine.

adata = pg.read_input('../cellbender/SAM24425932/SAM24425932_cellbender_out_filtered.h5')
pg.write_output(adata, 'test.h5ad')

adata = sc.read_h5ad('test.h5ad')
sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)

I guess this is more a question for the developers of pegasus, but it does seem like something is missing when converting.
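
One way to narrow this down might be to list which entries of `uns` are mapping-like but not real dicts. A rough sketch, where `find_non_dict_mappings` is a hypothetical helper name and not part of scanpy, anndata, or pegasus; the example data is made up and uses `MappingProxyType` as a stand-in for any non-dict mapping:

```python
from collections.abc import Mapping
from types import MappingProxyType


def find_non_dict_mappings(d, prefix=""):
    """Return (path, type name) pairs for mappings that are not dict instances."""
    found = []
    if not isinstance(d, dict):
        found.append((prefix or "<root>", type(d).__name__))
    for key, value in d.items():
        # Only descend into values that behave like mappings.
        if isinstance(value, Mapping):
            found.extend(find_non_dict_mappings(value, f"{prefix}/{key}"))
    return found


# Stand-in data: one ordinary nested dict and one non-dict mapping.
example = {"pca": {"variance": [0.5]}, "odd": MappingProxyType({"a": 1})}
report = find_non_dict_mappings(example)
```

Calling it as `find_non_dict_mappings(adata.uns)` on the converted object should point at the offending keys. The first entry reports the type of the object passed in, which may itself be a wrapper rather than a plain dict.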

Context:
I found myself in this hole because of the inability to read cellbender output with scanpy 1.9.1. Pegasus can load and save these h5 files without issue, so I was thinking of using it as an importer to be able to use scanpy on those objects.

(I can now load these h5 files from cellbender using the function provided here, but still have issues saving - hence why I'm trying to find a way to convert between the two 'formats'.)

@ivirshup
Member

ivirshup commented Feb 6, 2023

Yes, I think this would be an issue for pegasus. It shouldn't be putting DataDicts into AnnData.

You could maybe do something like:

from collections.abc import Mapping

def sanitize_uns(d):
    # Recursively rebuild every mapping (including DataDict) as a plain dict.
    return {k: sanitize_uns(v) if isinstance(v, Mapping) else v for k, v in d.items()}

adata.uns = sanitize_uns(adata.uns)

Potentially we should be more aggressive with converting on the anndata side. However, Mapping subtypes are quite common, so it would be easy to convert something that shouldn't be converted.
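
To illustrate that trade-off with standard-library types only: the recursive conversion suggested above also flattens mappings whose subclass behaviour matters, such as `defaultdict`. The sketch repeats `sanitize_uns` so that it is self-contained; the example data is made up.

```python
from collections import defaultdict
from collections.abc import Mapping


def sanitize_uns(d):
    # Same recursive conversion as suggested above.
    return {k: sanitize_uns(v) if isinstance(v, Mapping) else v for k, v in d.items()}


counts = defaultdict(int)
counts["a"] += 1
uns = {"counts": counts}

clean = sanitize_uns(uns)

# The stored values survive, but the defaultdict behaviour is gone:
# looking up a missing key in clean["counts"] raises KeyError
# instead of returning the default of 0.
converted_type = type(clean["counts"])
```

Here the conversion is harmless for the stored values, but any code relying on the default factory would break afterwards, which is the kind of silent change a blanket conversion risks.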

@github-actions

This issue has been automatically marked as stale because it has not had recent activity.
Please add a comment if you want to keep the issue open. Thank you for your contributions!

@github-actions github-actions bot added the stale label Jun 10, 2023
@flying-sheep
Member

I’m closing this because there was no follow-up. Please feel free to respond and we’ll re-open it.

@flying-sheep flying-sheep closed this as not planned Jun 13, 2023