Authors: Ted Verhey, Sorana Morrissy
Contributors: Hyojin Song, Aaron Gillmor, Gurveer Gill, Courtney Hall
mosaicMPI is a Python package for enabling mosaic integration of bulk, single-cell, and spatial expression data through program-level integration. Programs are first discovered using unsupervised deconvolution (consensus non-negative matrix factorization, cNMF) across multiple ranks separately for each dataset. A flexible network-based approach groups similar programs together across resolutions and datasets. Program communities are then interpreted using sample/cell metadata and gene set analyses. Integrative program communities enable metadata transfer across datasets.
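The workflow described above can be illustrated with a small, self-contained sketch. It is not mosaicMPI's API or implementation: it uses a plain multiplicative-update NMF in place of cNMF, links programs only across datasets (not across ranks within a dataset), and finds communities as simple connected components. All function names, the toy data, and the 0.8 correlation threshold are assumptions chosen for illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=300, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF (a stand-in for cNMF): X ~ W @ H."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def discover_programs(X, ranks, name):
    """Factorize one dataset at several ranks; label programs (dataset, rank, index)."""
    labels, loadings = [], []
    for k in ranks:
        _, H = nmf(X, k)
        for i in range(k):
            labels.append((name, k, i))
            loadings.append(H[i])
    return labels, np.array(loadings)

def cross_dataset_correlation(load_a, genes_a, load_b, genes_b):
    """Pearson correlation of program loadings, using shared genes only
    for the comparison (each dataset was factorized on its full gene set)."""
    shared = sorted(set(genes_a) & set(genes_b))
    ia = [genes_a.index(g) for g in shared]
    ib = [genes_b.index(g) for g in shared]
    A, B = load_a[:, ia], load_b[:, ib]
    return np.corrcoef(A, B)[: len(A), len(A):]

def communities(labels_a, labels_b, corr, threshold=0.8):
    """Connected components of the graph linking programs with corr > threshold."""
    nodes = labels_a + labels_b
    parent = list(range(len(nodes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in zip(*np.where(corr > threshold)):
        parent[find(int(i))] = find(len(labels_a) + int(j))
    groups = {}
    for idx, node in enumerate(nodes):
        groups.setdefault(find(idx), []).append(node)
    return list(groups.values())

# Toy data: two datasets driven by the same two programs, with
# different samples and only partially overlapping gene sets.
rng = np.random.default_rng(1)
all_genes = [f"g{i}" for i in range(60)]
P = np.full((2, 60), 0.1)
P[0, :30] = 1.0
P[1, 30:] = 1.0

def simulate(n_samples, gene_idx):
    usage = rng.random((n_samples, 2))
    noise = 0.01 * rng.random((n_samples, len(gene_idx)))
    return usage @ P[:, gene_idx] + noise, [all_genes[i] for i in gene_idx]

X_a, genes_a = simulate(40, np.arange(0, 50))
X_b, genes_b = simulate(25, np.arange(10, 60))

labels_a, load_a = discover_programs(X_a, ranks=[2, 3], name="A")
labels_b, load_b = discover_programs(X_b, ranks=[2, 3], name="B")
corr = cross_dataset_correlation(load_a, genes_a, load_b, genes_b)
groups = communities(labels_a, labels_b, corr)
for group in groups:
    print(group)
```

Because each dataset is factorized on its own, adding a third dataset in this sketch would only require factorizing that dataset and correlating its programs against the existing ones, which mirrors the incremental integration idea.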
Here are just a few of the things that mosaicMPI does well:
- Identifies interpretable, non-negative programs at multiple resolutions
- Mosaic integration without subsetting features/genes to the intersection or to a highly variable subset
- Multi-omics integration without shared sample IDs
- Incremental integration (adding datasets one at a time) since deconvolution is performed independently on each dataset
- High-performance integration of datasets with mismatched features (e.g. microarray, RNA-Seq, proteomics) or sparsity (e.g. single-cell vs. bulk)
- Metadata transfer across datasets
mosaicMPI has two interfaces:
- command-line interface (CLI) with a standardized workflow for rapid data exploration and integration
- Python API for greatest flexibility and extensibility
System requirements:
- Compatible with macOS, Windows, and Linux systems
- Memory usage depends on the size and number of datasets
Install the package with conda:
# if using a fresh conda install
conda init
# create an environment called 'mosaicenv' and install
conda create -n mosaicenv -c conda-forge mosaicmpi
conda activate mosaicenv
Some analyses require packages from other channels to be installed in the same environment:
conda install -c bioconda gprofiler-official
# if you have conda (macOS x86-64 and Linux only)
conda install -c bioconda gseapy
# Windows and macOS (Apple Silicon)
pip install gseapy
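After installing, a quick way to confirm that the active environment can see the packages is to look up their import specs. This is a minimal sketch, not part of mosaicMPI: the helper name `check_installed` is hypothetical, and the import names (`mosaicmpi`, `gprofiler`, `gseapy`) are assumptions, since a distribution name such as `gprofiler-official` can differ from the module name it provides.

```python
from importlib.util import find_spec

# Map of distribution name -> assumed top-level import name.
PACKAGES = {
    "mosaicmpi": "mosaicmpi",
    "gprofiler-official": "gprofiler",
    "gseapy": "gseapy",
}

def check_installed(modules):
    """Return {distribution name: True/False} without importing anything."""
    return {name: find_spec(mod) is not None for name, mod in modules.items()}

status = check_installed(PACKAGES)
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

Run it with the Python interpreter from the activated `mosaicenv` environment so that it reports on the right installation.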
Read the documentation.
For questions arising during use of mosaicMPI, create and browse issues in the GitHub "issues" tab.