Support training from AnnCollection
#3018
Comments
Hi, thanks for the suggestion. We are currently looking into supporting MappedCollection from lamindb. However, AnnCollection works with setup_anndata, while MappedCollection requires custom dataloaders. Do you use AnnCollection in disk-backed mode, or are the datasets loaded into memory?
scvi-tools/src/scvi/data/_preprocessing.py Line 283 in c53efe0
I set it up to use [...]. Here's a sample snippet of how I created the objects.
import gdown
import scanpy as sc
import scvi
from anndata.experimental import AnnCollection

# get some data
gdown.download(
    url="https://drive.google.com/uc?id=1X5N9rOaIqiGxZRyr1fyZ6NpDPeATXoaC",
    output="pbmc_seurat_v4.h5ad",
    quiet=False,
)
gdown.download(
    url="https://drive.google.com/uc?id=1JgaXNwNeoEqX7zJL-jJD3cfXDGurMrq9",
    output="covid_cite.h5ad",
    quiet=False,
)
# load in backed (read-only, on-disk) mode
covid = sc.read("covid_cite.h5ad", backed="r")
pbmc = sc.read("pbmc_seurat_v4.h5ad", backed="r")
# make a collection; label="dataset" adds an obs column identifying the source dataset
collection = AnnCollection([covid, pbmc], join_vars="inner", join_obs="inner", label="dataset")
# use the wrapper (AnnFaker is our custom wrapper class around AnnCollection)
wrapped_collection = AnnFaker(collection)
# train a model
scvi.model.SCVI.setup_anndata(
    wrapped_collection,
    layer="test",
    batch_key="dataset",
)
# instantiate the same model class that setup_anndata was called on
model = scvi.model.SCVI(wrapped_collection, n_latent=10)
model.train(max_epochs=20)
# training completes, latent matches expectations
Sure, here's a minimal implementation in Colab: https://colab.research.google.com/drive/1v9B62IfLM8qBfgmvDYnCs3GZaaUvnG26?usp=sharing
Is your feature request related to a problem? Please describe.
scvi-tools does not support training from the new anndata.experimental.AnnCollection API.
AnnCollection is great! For teams training models on large datasets, it's a game changer.
Describe the solution you'd like
[...] AnnCollection objects through the existing API would be great.
[...] scvi-tools workflow
Question for maintainers
[...] scvi-tools code.
[...] AnnCollection that mimic the anndata.AnnData API in all the ways scvi-tools expects. We've successfully trained simple models with this solution.
[...] (wrapped_collection = Wrapper(collection)) then proceed with the scvi-tools workflow as normal (setup_anndata(wrapped_collection, ...), etc.).
[...] scvi-tools? If so, I can send in a PR. I'd imagine it living as a separate module under .data.
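To make the wrapper idea above a bit more concrete, here is a rough sketch of the delegation pattern it relies on. It is not the implementation from the Colab and does not claim to cover everything scvi-tools expects from an AnnData object. StandInCollection is a hypothetical placeholder so the example runs without anndata installed, and the choice of uns as the attribute supplied by the wrapper is an assumption made for illustration.

```python
from typing import Any


class StandInCollection:
    """Hypothetical stand-in for an AnnCollection (illustration only)."""

    def __init__(self, n_obs: int, n_vars: int) -> None:
        self.n_obs = n_obs
        self.n_vars = n_vars
        self.shape = (n_obs, n_vars)


class Wrapper:
    """Forwards attribute access to the wrapped collection and adds
    attributes the collection itself does not provide."""

    def __init__(self, collection: Any) -> None:
        self._collection = collection
        # Assumption: downstream code wants a writable `uns` mapping
        # that the wrapped object does not expose.
        self.uns = {}

    def __getattr__(self, name: str) -> Any:
        # __getattr__ runs only when normal lookup fails, so attributes
        # defined on the wrapper (such as `uns`) take precedence.
        if name == "_collection":
            # Avoid infinite recursion if accessed before __init__ finishes.
            raise AttributeError(name)
        return getattr(self._collection, name)

    def __len__(self) -> int:
        return self._collection.n_obs


collection = StandInCollection(n_obs=5, n_vars=3)
wrapped_collection = Wrapper(collection)
wrapped_collection.uns["note"] = "stored on the wrapper"
print(wrapped_collection.shape, len(wrapped_collection))
```

The wrapped object is left untouched: anything written to uns lives on the wrapper, while reads such as shape or n_vars fall through to the collection. A real wrapper would likely need more than this (for example layer access and indexing), depending on which parts of the AnnData API a given model touches.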