Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accessor Functionality for AnnData #1870

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

srivarra
Copy link

@srivarra srivarra commented Feb 21, 2025

Added Accessor / Extension functionality based off the work in Polars. Added tests as well.

Added _accessors to anndata._core.anndata.AnnData to house registered accessor names to avoid name conflicts.

You can create an accessor by decorating a class with methods with the register_anndata_namespace function.

Here's an example of how to use it:

import anndata as ad

@ad.register_anndata_namespace("my_accessor")
class Greetings:
	def __init__(self, adata: ad.AnnData):
		self._adata = adata
	
	def greet():
		return "hi"

# and to use

adata.my_accessor.greet()

Not sure if this needs a dedicated tutorial notebook, or specific documentation page beyond the docstring.

Copy link

codecov bot commented Feb 21, 2025

Codecov Report

Attention: Patch coverage is 95.29412% with 4 lines in your changes missing coverage. Please review.

Project coverage is 84.13%. Comparing base (45c1548) to head (b38c282).

Files with missing lines Patch % Lines
src/anndata/_core/extensions.py 96.15% 3 Missing ⚠️
src/anndata/_types.py 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1870      +/-   ##
==========================================
- Coverage   86.22%   84.13%   -2.10%     
==========================================
  Files          40       41       +1     
  Lines        6347     6431      +84     
==========================================
- Hits         5473     5411      -62     
- Misses        874     1020     +146     
Files with missing lines Coverage Δ
src/anndata/__init__.py 100.00% <100.00%> (ø)
src/anndata/_core/anndata.py 82.44% <100.00%> (+0.02%) ⬆️
src/anndata/_types.py 84.21% <80.00%> (-1.09%) ⬇️
src/anndata/_core/extensions.py 96.15% <96.15%> (ø)

... and 9 files with indirect coverage changes

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ilan-gold ilan-gold added this to the 0.12.0 milestone Feb 24, 2025
Copy link
Contributor

@ilan-gold ilan-gold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool! Some preliminary comments/changes

Taken from:
https://github.com/pandas-dev/pandas/blob/ab89c53f48df67709a533b6a95ce3d911871a0a8/pandas/util/_exceptions.py#L30-L51
"""
import anndata as ad
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why an internal-to-the-function import?

Copy link
Author

@srivarra srivarra Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason in particular, it was just what Polars used. As long as the function gets the location of the anndata/__init__.py file it'll work. Moving it outside the function should be fine since there are not any circular imports.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why resolved without change? There was a circular import?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is changed, I moved it outside the function in one of the later pushes.

@srivarra
Copy link
Author

@ilan-gold

Made the requested changes:

  • Added ExtensionNamespace Protocol:
    • Provides a standardized interface for static type checking with mypy (can't really enforce implementation details when a user builds an accessor with a Protocol)
    • Defines the expected method signatures for custom accessors
  • Implemented Runtime Validation via _check_namespace_signature:
    • Verifies at registration time that custom accessors have the correct method signatures
    • Ensures __init__ properly accepts self and an adata parameter of type anndata.AnnData
    • Prevents namespace collisions with existing AnnData attributes (e.g., X, obs, var_names)
    • Raises error messages when validation fails

Added tests for (probably) all validation scenarios (missing parameters, wrong parameter names, incorrect type annotations) and tests for namespace collision detection.

A couple of issues I'm having:

I'm not too sure what's going on with the readthedocs build, since I'm unable to replicate it locally, instead I have a different error:

Here's my error log for `hatch run docs:build`
Running Sphinx v8.2.1
loading translations [en]... done
matplotlib is not installed, social cards will not be generated
myst v4.0.1: MdParserConfig(commonmark_only=False, gfm_only=False, enable_extensions={'html_image'}, disable_syntax=[], all_links_external=False, links_external_new_tab=False, url_schemes=('http', 'https', 'mailto', 'ftp'), ref_domains=None, fence_as_directive=set(), number_code_blocks=[], title_to_header=False, heading_anchors=3, heading_slug_func=None, html_meta={}, footnote_sort=True, footnote_transition=True, words_per_minute=200, substitutions={}, linkify_fuzzy_links=True, dmath_allow_labels=True, dmath_allow_space=True, dmath_allow_digits=True, dmath_double_inline=False, update_mathjax=True, mathjax_classes='tex2jax_process|mathjax_process|math|output_area', enable_checkboxes=False, suppress_warnings=[], highlight_code_blocks=True)
loading intersphinx inventory 'awkward' from https://awkward-array.org/doc/stable/objects.inv ...
loading intersphinx inventory 'cupy' from https://docs.cupy.dev/en/stable/objects.inv ...
loading intersphinx inventory 'dask' from https://docs.dask.org/en/stable/objects.inv ...
loading intersphinx inventory 'h5py' from https://docs.h5py.org/en/latest/objects.inv ...
loading intersphinx inventory 'hdf5plugin' from https://hdf5plugin.readthedocs.io/en/latest/objects.inv ...
loading intersphinx inventory 'loompy' from https://linnarssonlab.org/loompy/objects.inv ...
loading intersphinx inventory 'numpy' from https://numpy.org/doc/stable/objects.inv ...
loading intersphinx inventory 'pandas' from https://pandas.pydata.org/pandas-docs/stable/objects.inv ...
loading intersphinx inventory 'python' from https://docs.python.org/3/objects.inv ...
loading intersphinx inventory 'scipy' from https://docs.scipy.org/doc/scipy/objects.inv ...
loading intersphinx inventory 'sklearn' from https://scikit-learn.org/stable/objects.inv ...
loading intersphinx inventory 'xarray' from https://docs.xarray.dev/en/stable/objects.inv ...
loading intersphinx inventory 'zarr' from https://zarr.readthedocs.io/en/v2.18.4/objects.inv ...
[autosummary] generating autosummary for: _key_contributors.rst, api.md, benchmark-read-write.ipynb, benchmarks.md, concatenation.rst, contributing.md, fileformat-prose.md, generated/anndata.AnnData.T.rst, generated/anndata.AnnData.X.rst, generated/anndata.AnnData.chunk_X.rst, ..., generated/anndata.settings.override.rst, generated/anndata.settings.rst, generated/anndata.typing.AxisStorable.rst, generated/anndata.typing.Index.rst, generated/anndata.typing.RWAble.rst, index.md, interoperability.md, references.rst, release-notes/index.md, tutorials/index.md
Writing evaluated template result to /Users/srivarra/Dev/Projects/Open Source Contributions/scverse/anndata/docs/_build/html/_static/nbsphinx-code-cells.css
building [mo]: targets for 0 po files that are out of date
writing output... 
building [html]: targets for 114 source files that are out of date
updating environment: [new config] 114 added, 0 changed, 0 removed
reading sources... [  1%] _key_contributors
reading sources... [  2%] api
reading sources... [  3%] benchmark-read-write
reading sources... [  4%] benchmarks
reading sources... [  4%] concatenation
reading sources... [  5%] contributing
reading sources... [  6%] fileformat-prose
reading sources... [  7%] generated/anndata.AnnData
reading sources... [  8%] generated/anndata.AnnData.T
reading sources... [  9%] generated/anndata.AnnData.X
reading sources... [ 10%] generated/anndata.AnnData.chunk_X
reading sources... [ 11%] generated/anndata.AnnData.chunked_X
reading sources... [ 11%] generated/anndata.AnnData.concatenate
reading sources... [ 12%] generated/anndata.AnnData.copy
reading sources... [ 13%] generated/anndata.AnnData.filename
reading sources... [ 14%] generated/anndata.AnnData.is_view
reading sources... [ 15%] generated/anndata.AnnData.isbacked
reading sources... [ 16%] generated/anndata.AnnData.layers
reading sources... [ 17%] generated/anndata.AnnData.n_obs
reading sources... [ 18%] generated/anndata.AnnData.n_vars
reading sources... [ 18%] generated/anndata.AnnData.obs
reading sources... [ 19%] generated/anndata.AnnData.obs_keys
reading sources... [ 20%] generated/anndata.AnnData.obs_names
reading sources... [ 21%] generated/anndata.AnnData.obs_names_make_unique
reading sources... [ 22%] generated/anndata.AnnData.obs_vector
reading sources... [ 23%] generated/anndata.AnnData.obsm
reading sources... [ 24%] generated/anndata.AnnData.obsm_keys
reading sources... [ 25%] generated/anndata.AnnData.obsp
reading sources... [ 25%] generated/anndata.AnnData.raw
reading sources... [ 26%] generated/anndata.AnnData.rename_categories
reading sources... [ 27%] generated/anndata.AnnData.shape
reading sources... [ 28%] generated/anndata.AnnData.strings_to_categoricals
reading sources... [ 29%] generated/anndata.AnnData.to_df
reading sources... [ 30%] generated/anndata.AnnData.to_memory
reading sources... [ 31%] generated/anndata.AnnData.transpose
reading sources... [ 32%] generated/anndata.AnnData.uns
reading sources... [ 32%] generated/anndata.AnnData.uns_keys
reading sources... [ 33%] generated/anndata.AnnData.var
reading sources... [ 34%] generated/anndata.AnnData.var_keys
reading sources... [ 35%] generated/anndata.AnnData.var_names
reading sources... [ 36%] generated/anndata.AnnData.var_names_make_unique
reading sources... [ 37%] generated/anndata.AnnData.var_vector
reading sources... [ 38%] generated/anndata.AnnData.varm
reading sources... [ 39%] generated/anndata.AnnData.varm_keys
reading sources... [ 39%] generated/anndata.AnnData.varp
reading sources... [ 40%] generated/anndata.AnnData.write
reading sources... [ 41%] generated/anndata.AnnData.write_csvs
reading sources... [ 42%] generated/anndata.AnnData.write_h5ad
reading sources... [ 43%] generated/anndata.AnnData.write_loom
reading sources... [ 44%] generated/anndata.AnnData.write_zarr
reading sources... [ 45%] generated/anndata.ImplicitModificationWarning
reading sources... [ 46%] generated/anndata.abc.CSCDataset
reading sources... [ 46%] generated/anndata.abc.CSCDataset.__getitem__
reading sources... [ 47%] generated/anndata.abc.CSCDataset.backend


Versions
========

* Platform:         darwin; (macOS-15.3.1-arm64-arm-64bit)
* Python version:   3.12.5 (CPython)
* Sphinx version:   8.2.1
* Docutils version: 0.21.2
* Jinja2 version:   3.1.5
* Pygments version: 2.19.1

Last Messages
=============

    generated/anndata.abc.CSCDataset


    reading sources... [ 46%]
    generated/anndata.abc.CSCDataset.__getitem__


    reading sources... [ 47%]
    generated/anndata.abc.CSCDataset.backend

Loaded Extensions
=================

* sphinx.ext.mathjax (8.2.1)
* alabaster (1.0.0)
* sphinxcontrib.applehelp (2.0.0)
* sphinxcontrib.devhelp (2.0.0)
* sphinxcontrib.htmlhelp (2.1.0)
* sphinxcontrib.serializinghtml (2.0.0)
* sphinxcontrib.qthelp (2.0.0)
* myst_parser (4.0.1)
* sphinx_copybutton (0.5.2)
* sphinx.ext.autodoc.preserve_defaults (8.2.1)
* sphinx.ext.autodoc.type_comment (8.2.1)
* sphinx.ext.autodoc.typehints (8.2.1)
* sphinx.ext.autodoc (8.2.1)
* sphinx.ext.intersphinx (8.2.1)
* sphinx.ext.doctest (8.2.1)
* sphinx.ext.coverage (8.2.1)
* sphinx.ext.napoleon (8.2.1)
* sphinx.ext.autosummary (8.2.1)
* sphinx_autodoc_typehints (unknown version)
* sphinx_issues (5.0.0)
* sphinx_design (0.6.1)
* sphinx_search.extension (0.3.2)
* sphinxext.opengraph (unknown version)
* scanpydoc.definition_list_typed_field (0.15.2)
* scanpydoc.elegant_typehints (0.15.2)
* scanpydoc.rtd_github_links (0.15.2)
* scanpydoc.theme (unknown version)
* scanpydoc.release_notes (0.15.2)
* scanpydoc (0.15.2)
* sphinx.ext.linkcode (8.2.1)
* nbsphinx (0.9.6)
* IPython.sphinxext.ipython_console_highlighting (unknown version)
* sphinx_toolbox.more_autodoc.autoprotocol (3.8.3)
* no_skip_abc_members (unknown version)
* patch_myst_cite (unknown version)
* sphinx_book_theme (unknown version)
* pydata_sphinx_theme (unknown version)

Traceback
=========

      File "/Users/srivarra/Library/Application Support/hatch/env/virtual/anndata/_5ujvw4a/docs/lib/python3.12/site-packages/sphinx/ext/autodoc/__init__.py", line 2980, in add_directive_header
        objrepr = stringify_annotation(
                  ^^^^^^^^^^^^^^^^^^^^^
    TypeError: _stringify_annotation() got an unexpected keyword argument 'short_literals'


The full traceback has been saved in:
/var/folders/fy/q2szypn9325d_0g0nq049k300000gq/T/sphinx-err-k20ijzbv.log

To report this error to the developers, please open an issue at <https://github.com/sphinx-doc/sphinx/issues/>. Thanks!
Please also report this if it was a user error, so that a better error message can be provided next time.

I made sure to merge main into this branch, but idk.

In addition I can't quite comprehend this single test which didn't pass MEMORY PROBLEMS tests/test_dask_view_mem.py::test_size_of_view[layers-True] . It did pass locally on my mac however.

Thank you!

@ilan-gold
Copy link
Contributor

ilan-gold commented Feb 26, 2025

I'm not too sure what's going on with the readthedocs build, since I'm unable to replicate it locally, instead I have a different error:

On CI, the issue is that you're exporting a new function whose typing depends on anndata._core.extensions.ExtensionNamespace so you'll need to export ExtensionNamespace as well in the docs.

If you're running the docs using hatch you can prune your environment - if not, would highly recommend using hatch! Check out our hatch.toml

In addition I can't quite comprehend this single test which didn't pass MEMORY PROBLEMS tests/test_dask_view_mem.py::test_size_of_view[layers-True] . It did pass locally on my mac however.

That one is just a hair flaky, we see this every so often, but generally passes (as it does now)

@srivarra srivarra requested a review from ilan-gold February 26, 2025 22:13
@@ -68,11 +69,13 @@ def __getattr__(attr_name: str) -> Any:
# Classes
"AnnData",
"Raw",
"ExtensionNamespace",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a typing.py file so you don't have to reexport

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this resolved?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to remove it from there. Should the protocol go in src/anndata/_types.py or src/anndata/typing.py?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flying-sheep can answer this best, but I think _types.py actually

raise TypeError(msg)


def _create_namespace(name: str, cls: type[AnnData]) -> Callable[[type], type]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more type usage that shoudl be ExtensionNamespace (there are others as well)



@runtime_checkable
class ExtensionNamespace(Protocol[NS]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the NS generic?

Copy link
Author

@srivarra srivarra Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thought it was necessary with the protocol, but looks like it doesn't really do anything. Will be removed.

try:
type_hints = get_type_hints(ns_class.__init__)
resolved_type = type_hints.get(param.name, param.annotation)
except Exception:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the blanket except here with something more precise

@ilan-gold ilan-gold requested a review from flying-sheep March 3, 2025 15:10
@srivarra srivarra marked this pull request as ready for review March 5, 2025 19:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Accessors for AnnData
2 participants