lucas-diedrich/sccoral

scCoral

Getting started

[Figure: sccoral model]

Motivation

Increasing throughput in single-cell technologies enables researchers to create population-scale single-cell RNA-seq datasets. An ongoing challenge in analyzing these data is linking molecular features (i.e., single-cell gene expression) to patient-level covariates (e.g., disease severity, age, sex). sccoral aims to find an interpretable link between subject- or sample-level features and gene expression by embedding cellular metadata and gene expression in the same latent space of a variational autoencoder [^lopez2018]. By building on and improving the network architecture of linear scVI [^svensson2020], we aim to make the link between embedded covariates and gene expression direct and interpretable.

Installation

This repository is still under active development, so the package must be installed directly from GitHub.

Create a suitable conda environment with Python 3.10 or newer:

conda create -n scvi-env python=3.11
conda activate scvi-env

Install the latest development version:

pip install git+https://github.com/lucas-diedrich/sccoral.git@main

Usage

import sccoral

# Load data
adata = sccoral.data.splatter_simulation()

# Setup + train model with scvi-tools syntax
sccoral.model.setup_anndata(
    adata,
    categorical_covariates='categorical_covariate',
    continuous_covariates='continuous_covariates',
)
model = sccoral.model.SCCORAL(adata, n_latent=7)
model.train()

# Get latent representation of cells/factor usages
z = model.get_latent_representation()

# Get interpretable gene programs (factor loadings)
loadings = model.get_loadings()
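
Once the loadings are available, a common next step is to list the genes that drive each factor. The sketch below is illustrative only: it assumes the loadings are a pandas DataFrame with genes as rows and latent factors as columns (the layout used by linear scVI in scvi-tools), which may differ from what `model.get_loadings()` returns here. The helper `top_genes_per_factor` and the synthetic `toy_loadings` table are hypothetical and not part of sccoral.

```python
import numpy as np
import pandas as pd


def top_genes_per_factor(loadings: pd.DataFrame, n_top: int = 3) -> dict[str, list[str]]:
    """Return the n_top genes with the largest absolute loading for each factor.

    Hypothetical helper, not part of sccoral. Assumes genes are rows and
    factors are columns.
    """
    top = {}
    for factor in loadings.columns:
        # Rank by absolute value so that strong negative weights count too
        ranked = loadings[factor].abs().sort_values(ascending=False)
        top[factor] = ranked.index[:n_top].tolist()
    return top


# Synthetic stand-in for the output of model.get_loadings()
# (assumed layout: 20 genes x 7 latent factors)
rng = np.random.default_rng(0)
genes = [f"gene_{i}" for i in range(20)]
factors = [f"factor_{j}" for j in range(7)]
toy_loadings = pd.DataFrame(rng.normal(size=(20, 7)), index=genes, columns=factors)

top_genes = top_genes_per_factor(toy_loadings, n_top=3)
for factor, names in top_genes.items():
    print(factor, names)
```

With a trained model, `toy_loadings` would be replaced by the actual loadings, provided they follow the assumed layout. Ranking by absolute value keeps genes that are strongly down-weighted by a factor as well as those that are up-weighted.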

Release notes

This repository is still under active development. See the changelog.

Contact

For feedback, questions and help requests, you can reach out via the issue tracker. Feel free to contact us via [email protected]

If you found a bug, please also use the issue tracker.

References

[^svensson2020]: Valentine Svensson, Adam Gayoso, Nir Yosef, Lior Pachter. Interpretable factor models of single-cell RNA-seq via variational autoencoders. Bioinformatics, Volume 36, Issue 11, June 2020, Pages 3418–3421. https://doi.org/10.1093/bioinformatics/btaa169

[^lopez2019]: Adam Gayoso, Romain Lopez, Galen Xing, Pierre Boyeau, Valeh Valiollah Pour Amiri, Justin Hong, Katherine Wu, Michael Jayasuriya, Edouard Mehlman, Maxime Langevin, Yining Liu, Jules Samaran, Gabriel Misrachi, Achille Nazaret, Oscar Clivio, Chenling Xu, Tal Ashuach, Mariano Gabitto, Mohammad Lotfollahi, Valentine Svensson, Eduardo da Veiga Beltrame, Vitalii Kleshchevnikov, Carlos Talavera-López, Lior Pachter, Fabian J. Theis, Aaron Streets, Michael I. Jordan, Jeffrey Regier, Nir Yosef. A Python library for probabilistic analysis of single-cell omics data. Nature Biotechnology, 2022 Feb 07. https://doi.org/10.1038/s41587-021-01206-w

[^virshup2022]: Isaac Virshup, Danila Bredikhin, Lukas Heumos, Giovanni Palla, Gregor Sturm, Adam Gayoso, Ilia Kats, Mikaela Koutrouli, Scverse Community, Bonnie Berger, Dana Pe’er, Aviv Regev, Sarah A. Teichmann, Francesca Finotello, F. Alexander Wolf, Nir Yosef, Oliver Stegle, Fabian J. Theis. The scverse project provides a computational ecosystem for single-cell omics data analysis. Nature Biotechnology, 2023 Apr 10. https://doi.org/10.1038/s41587-023-01733-8

Citation

None

About

Linear scVI with explicit covariate/factor embedding
