Count File Format Issue #211

Open
npamb opened this issue Jan 9, 2025 · 1 comment

npamb commented Jan 9, 2025

Hello,

I am having some issues with my count file in the method 1 notebook. I generated the count file by combining the normalized counts from two separate RNA-seq datasets and writing the resulting matrix to an h5 file with the rhdf5 package.

When I run

adata = anndata.read_h5ad(counts_file_path)

I get the following error

TypeError                                 Traceback (most recent call last)
File ~/mambaforge/envs/cpdb_2/lib/python3.8/site-packages/anndata/_io/utils.py:202, in report_read_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    201 try:
--> 202     return func(*args, **kwargs)
    203 except Exception as e:

File ~/mambaforge/envs/cpdb_2/lib/python3.8/site-packages/anndata/_io/specs/registry.py:235, in Reader.read_elem(self, elem, modifiers)
    234 if self.callback is not None:
--> 235     return self.callback(read_func, elem.name, elem, iospec=get_spec(elem))
    236 else:

File ~/mambaforge/envs/cpdb_2/lib/python3.8/site-packages/anndata/_io/h5ad.py:223, in read_h5ad.<locals>.callback(func, elem_name, elem, iospec)
    222 if iospec.encoding_type == "anndata" or elem_name.endswith("/"):
--> 223     return AnnData(
    224         **{
    225             # This is covering up backwards compat in the anndata initializer
    226             # In most cases we should be able to call `func(elen[k])` instead
    227             k: read_dispatched(elem[k], callback)
    228             for k in elem.keys()
    229             if not k.startswith("raw.")
    230         }
    231     )
    232 elif elem_name.startswith("/raw."):

TypeError: __init__() got an unexpected keyword argument 'data'

The above exception was the direct cause of the following exception:

AnnDataReadError                          Traceback (most recent call last)
Cell In[83], line 3
      1 import anndata
----> 3 adata = anndata.read_h5ad(counts_file_path)
      4 adata.shape

File ~/mambaforge/envs/cpdb_2/lib/python3.8/site-packages/anndata/_io/h5ad.py:243, in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    240         return read_dataframe(elem)
    241     return func(elem)
--> 243 adata = read_dispatched(f, callback=callback)
    245 # Backwards compat (should figure out which version)
    246 if "raw.X" in f:

File ~/mambaforge/envs/cpdb_2/lib/python3.8/site-packages/anndata/experimental/__init__.py:58, in read_dispatched(elem, callback)
     54 from anndata._io.specs import Reader, _REGISTRY
     56 reader = Reader(_REGISTRY, callback=callback)
---> 58 return reader.read_elem(elem)

File ~/mambaforge/envs/cpdb_2/lib/python3.8/site-packages/anndata/_io/utils.py:204, in report_read_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    202     return func(*args, **kwargs)
    203 except Exception as e:
--> 204     re_raise_error(e, elem)

File ~/mambaforge/envs/cpdb_2/lib/python3.8/site-packages/anndata/_io/utils.py:188, in report_read_key_on_error.<locals>.re_raise_error(e, elem)
    186 else:
    187     parent = _get_parent(elem)
--> 188     raise AnnDataReadError(
    189         f"Above error raised while reading key {elem.name!r} of "
    190         f"type {type(elem)} from {parent}."
    191     ) from e

AnnDataReadError: Above error raised while reading key '/' of type <class 'h5py._hl.files.File'> from /.

I can read my count file with the h5py package. When I check the shape of the data, I get the expected dimensions

f = h5py.File(counts_file_path, 'r')
dset = f['data']
dset.shape
(29966, 14605)
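
The traceback above shows read_h5ad passing every top-level key of the file to AnnData(...) as a keyword argument, so a top-level key named 'data' would explain the "unexpected keyword argument 'data'" message. A minimal sketch for checking this with h5py follows. The helper name unexpected_top_level_keys and the H5AD_KEYS set are hypothetical, and the set is only an assumption about the usual top-level names in an .h5ad file, not something taken from the anndata source.

```python
import h5py

# Assumed list of top-level names found in a typical .h5ad file
# (these mirror the AnnData attributes; treat it as illustrative).
H5AD_KEYS = {"X", "obs", "var", "uns", "obsm", "varm", "obsp", "varp", "layers", "raw"}


def unexpected_top_level_keys(path):
    """Return the sorted top-level keys of an HDF5 file that are not in H5AD_KEYS."""
    with h5py.File(path, "r") as f:
        return sorted(k for k in f.keys() if k not in H5AD_KEYS)
```

Calling unexpected_top_level_keys(counts_file_path) on a file that holds a single 'data' dataset should list 'data' among the results, which would point at the file layout rather than at the count values themselves.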

Any suggestions?

Thanks!

Collaborator

cakirb commented Jan 13, 2025

Hello @npamb,

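It appears that the rhdf5 package created a plain .h5 file rather than an .h5ad file, which is likely why read_h5ad failed: it expects the AnnData on-disk layout, whereas your file holds the matrix in a single dataset called 'data'. I suggest reading the file with the anndata.read_hdf function instead, passing the dataset name as the key, i.e. anndata.read_hdf(counts_file_path, key='data').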

Best,
Batu
