I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of scanpy.
(optional) I have confirmed this bug exists on the main branch of scanpy.
What happened?
I'm encountering an issue while running Scrublet for doublet analysis on an .h5ad file loaded with backed="r+" in Scanpy. The operation throws an error, likely due to the limitations of Scrublet working with backed mode, which restricts in-memory data manipulation.
Has anyone faced this issue before? If so, do you know of any workarounds or alternative approaches to run Scrublet on such data without having to fully load it into memory? Any suggestions would be greatly appreciated!
Minimal code sample
import scanpy as sc

# path to file
output_file = '/home/test_folder/project1_matrix.h5ad'
# read the h5ad file, which contains around 1 million cells
# backed="r+" does not allow the adata.obs data to be modified
adata = sc.read_h5ad(output_file, backed="r+")
sc.pp.scrublet(adata)
Error output
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/test_env/lib/python3.12/site-packages/legacy_api_wrap/__init__.py", line 80, in fn_compatible
    return fn(*args_all, **kw)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/test_env/lib/python3.12/site-packages/scanpy/preprocessing/_scrublet/__init__.py", line 180, in scrublet
    adata = adata.copy()
            ^^^^^^^^^^^^
  File "/home/test_env/python3.12/site-packages/anndata/_core/anndata.py", line 1447, in copy
    raise ValueError(
ValueError: To copy an AnnData object in backed mode, pass a filename: `.copy(filename='myfilename.h5ad')`. To load the object into memory, use `.to_memory()`.
Hello! We do not support backed mode for scrublet! However, if you wish to contribute this, we would be more than happy. Alternatively, and probably more sustainably, you could add dask support (which you can begin to use via https://anndata.readthedocs.io/en/stable/generated/anndata.experimental.read_elem_as_dask.html). I'm going to close this because we already have an issue for it: #2578.
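For anyone landing here with the same traceback: until backed or dask support exists, the only route the error message itself offers is `.to_memory()`. Below is a minimal sketch of that fallback. The helper names `ensure_in_memory` and `run_scrublet_in_memory` are made up for illustration and are not part of scanpy or anndata. It assumes the full matrix fits in RAM, so it does not solve the original request to avoid loading the data.

```python
def ensure_in_memory(adata):
    """Return an in-memory AnnData, loading it first if `adata` is backed.

    Hypothetical helper: relies only on AnnData's `isbacked` attribute
    and `to_memory()` method, the latter being what the ValueError suggests.
    """
    if getattr(adata, "isbacked", False):
        # Reads the whole matrix from disk; needs enough RAM for all cells.
        return adata.to_memory()
    return adata


def run_scrublet_in_memory(adata, **scrublet_kwargs):
    """Run sc.pp.scrublet on an in-memory version of `adata` and return it.

    Hypothetical wrapper: scanpy is imported lazily so that
    `ensure_in_memory` stays usable without scanpy installed.
    """
    import scanpy as sc

    adata = ensure_in_memory(adata)
    # Scrublet results are written to the in-memory object,
    # not back to the original backed file.
    sc.pp.scrublet(adata, **scrublet_kwargs)
    return adata


# Example call (commented out because it needs the real file):
# import scanpy as sc
# backed = sc.read_h5ad('/home/test_folder/project1_matrix.h5ad', backed="r")
# adata = run_scrublet_in_memory(backed)
# print(adata.obs[["doublet_score", "predicted_doublet"]].head())
```

Because the results end up on the returned in-memory object, you would need to write them out yourself (for example with `adata.write_h5ad(...)`) if you want them persisted.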