Improve documentation #152

Merged
merged 9 commits into from
Feb 9, 2024
165 changes: 161 additions & 4 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,8 +1,165 @@
dist/
panpipes/__pycache__
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
panpipes.egg-info
*.pyc
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# panpipes specific
.history
panpipes/.DS_Store
.DS_Store
19 changes: 0 additions & 19 deletions .readthedocs.yaml
Original file line number Diff line number Diff line change
@@ -1,32 +1,13 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.11"
    # You can also specify other tool versions:
    # nodejs: "19"
    # rust: "1.64"
    # golang: "1.19"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
# formats:
# - pdf
# - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
43 changes: 22 additions & 21 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Releases
===========

# Changelog

## [latest]

@@ -17,30 +17,30 @@ Releases
## v0.4.1

### added

- added example multiome submission file 10X_h5
- added example multiome submission file cellranger
- workflows & tutorials for `qc_spatial`, `preprocess_spatial`, and `deconvolution_spatial` to readthedocs
- tutorial for `vis`
- added PCA parameters in pipeline_preprocess.py for PROT modality to fix issue #120
- added full control of dimred params for all modalities in pipeline_preprocess.py
- more info on custom genes format files added to documentation
- parsing summary files for cellranger multi version < 7
- parsing summary files for cellranger multi version < 7
- added checks for n_pcs in run_neighbors_method_choice
- added filtering by HVF for atac


### fixed

- changed typo in tutorial paths for clustering and deconvolution
- fix io to read cellranger outs folder for atac.
- fixes to refmap workflow
- fix io to read cellranger outs folder for atac.
- fixes to refmap workflow
- typos & capitalization in the pipeline.yml files of `qc_spatial`, `preprocess_spatial`, and `deconvolution_spatial`, `vis`
- remove `assay`, `sample_prefix`, and `modalities` parameters from the `qc_spatial` pipeline.yml
- remove `assay`, `sample_prefix`, and `modalities` parameters from the `qc_spatial` pipeline.yml
- remove `sample_prefix` and `modalities` parameters from the `preprocess_spatial` pipeline.yml
- fixed error in `preprocess_spatial` when `filtering: run: False`
  -> filtering can now be skipped without first saving the MuData in `filtered.data` before running the pipeline
- fixed error in `vis`
- change PARAMS['custom_markers_minimal'] -> PARAMS['custom_markers']['files']['minimal']
- change PARAMS['custom_markers_minimal'] -> PARAMS['custom_markers']['files']['minimal']
- fix to avoid rerunning HVF and explicitly check X layer before normalization in pipeline_preprocess.py
- fix plotting of umaps after batch correction
- fix fetching string scvi if present in mudata for wnn
@@ -49,54 +49,55 @@ Releases
- fixed filtering HVG for rna
- moved pynndescent to PyPI dependencies
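
The `PARAMS` key change listed above (`PARAMS['custom_markers_minimal']` -> `PARAMS['custom_markers']['files']['minimal']`) can be illustrated with a minimal sketch. The helper name `minimal_markers_file`, the fallback to the old flat key, and the example file name are assumptions made for illustration; they are not taken from the panpipes code base.

```python
def minimal_markers_file(params):
    """Return the minimal custom-markers file from a PARAMS-like dict.

    Prefers the nested key named in the changelog and falls back to the
    old flat key. The fallback is a hypothetical convenience, not
    something panpipes is known to do.
    """
    # `or {}` guards against a key that is present but empty (None).
    custom = params.get("custom_markers") or {}
    files = custom.get("files") or {}
    nested = files.get("minimal")
    if nested is not None:
        return nested
    return params.get("custom_markers_minimal")


# Hypothetical configurations in the new and the old layout.
new_params = {"custom_markers": {"files": {"minimal": "markers_minimal.csv"}}}
old_params = {"custom_markers_minimal": "markers_minimal.csv"}

print(minimal_markers_file(new_params))
print(minimal_markers_file(old_params))
```

Both calls return the same file name, which is the point of the sketch: only the location of the value in the configuration changed, not its meaning.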


### dependencies

## v0.4.0

Big change! The submission files for the `ingest` workflow have changed: the paths to the gene expression (RNA/GEX) and protein (ADT) data must now be supplied under the following headers.


| sample_id | rna_path | rna_filetype | prot_path | prot_filetype |
| --------- | ----------- | ------------ | ------------ | ------------- |
| sampleX | path/to/rna | 10X_h5 | path/to/prot | 10X_h5 |
| | | | | |


See tutorials for examples of submission files.
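
As a rough illustration of the header change, the sketch below checks whether a submission file carries the five column names from the table above. The function name `missing_columns`, the tab delimiter, and the old-style `gex_`/`adt_` column names in the second example are assumptions for illustration only; consult the tutorials for the actual file format.

```python
import csv
import io

# Column names taken from the v0.4.0 table above.
REQUIRED_COLUMNS = [
    "sample_id",
    "rna_path",
    "rna_filetype",
    "prot_path",
    "prot_filetype",
]


def missing_columns(handle, delimiter="\t"):
    """Return the required columns absent from the header row.

    The tab delimiter is an assumption; pass another one if needed.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical file contents: one with the new headers, one with
# assumed old-style GEX/ADT headers.
new_style = (
    "sample_id\trna_path\trna_filetype\tprot_path\tprot_filetype\n"
    "sampleX\tpath/to/rna\t10X_h5\tpath/to/prot\t10X_h5\n"
)
old_style = "sample_id\tgex_path\tgex_filetype\tadt_path\tadt_filetype\n"

print(missing_columns(io.StringIO(new_style)))
print(missing_columns(io.StringIO(old_style)))
```

An empty list means all required headers are present; any names returned are the columns that still need renaming.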


### added

- merged PR #111:
  - LSI in panpipes_preprocess is run on the highly variable features
  - n_comp for LSI

### fixed

- changed all instances of ADT into PROT
- changed all instances of GEX to RNA
- changed the params to fix plotting as mentioned in issue #41
- typo in readme
- set default seaborn <=0.12.2 to avoid issue #104, #126

### dependencies

## v0.3.1
- set default matplotlib<=3.7.3 to avoid issue #104.

- set default matplotlib<=3.7.3 to avoid issue #104.

## v0.3.0

### added

- Spatial data analysis is now included in panpipes
- panpipes qc_spatial
- panpipes preprocess_spatial
- panpipes_deconvolution_spatial
- panpipes qc_spatial
- panpipes preprocess_spatial
- panpipes_deconvolution_spatial

### fixed

- make sure columns from individual modalities that are not in the multimodal outer obs can be used to

### dependencies

- additional dependencies: squidpy, cell2location, openpyxl

## v0.2.0
- First public version of panpipes
- contains qc_mm, preprocess, integration, clustering


- First public version of panpipes
- contains qc_mm, preprocess, integration, clustering
56 changes: 31 additions & 25 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,42 +1,48 @@
[![PyPI version](https://badge.fury.io/py/panpipes.svg)](https://badge.fury.io/py/panpipes)
# Panpipes - multimodal single cell pipelines

Created and Maintained by Charlotte Rich-Griffin and Fabiola Curion
Additional contributors: Sarah Ouologuem, Devika Agarwal, and Tom Thomas
# Panpipes - multimodal single cell pipelines

## Overview

**See our [documentation](https://panpipes-pipelines.readthedocs.io/en/latest/) and our [preprint](https://www.biorxiv.org/content/10.1101/2023.03.11.532085v1)**:
Panpipes: a pipeline for multiomic single-cell data analysis
Charlotte Rich-Griffin*, Fabiola Curion*, Tom Thomas, Devika Agarwal, Fabian J. Theis, Calliope A. Dendrou.
bioRxiv 2023.03.11.532085;
doi: https://doi.org/10.1101/2023.03.11.532085
Panpipes is a set of computational workflows designed to automate multimodal single-cell and spatial transcriptomic analyses by incorporating widely-used Python-based tools to perform quality control, preprocessing, integration, clustering, and reference mapping at scale.
Panpipes allows reliable and customisable analysis and evaluation of individual and integrated modalities, thereby empowering decision-making before downstream investigations.

**See our [documentation](https://panpipes-pipelines.readthedocs.io/en/latest/) and our [preprint](https://www.biorxiv.org/content/10.1101/2023.03.11.532085v2)**:

# Introduction
These workflows use cgat-core pipeline software
These workflows make use of [cgat-core](https://github.com/cgat-developers/cgat-core):

Available workflows:
1. "ingest" : for the ingestion of data and computation of QC metrics
2. "preprocess" : for filtering and normalising of each modality
3. "integration" : integrate and batch correction using single and multimodal methods
4. "clustering" : cell clustering on single modalities
5. "refmap" : transfer scvi-tools models from published data to your data
6. "vis" : visualize metrics from other pipelines in the context of experiment metadata
7. "qc_spatial" : for the ingestion of spatial transcriptomics (ST) data (Vizgen, Visium) and computation of QC metrics
8. "preprocess_spatial" : for filtering and normalizing ST data
9. "deconvolution_spatial" : for the cell type deconvolution of ST slides


# Installation and configuration
See [installation instructions here](https://panpipes-pipelines.readthedocs.io/en/latest/install.html)

1. "ingest" : Ingest data and compute quality control metrics
2. "preprocess" : Filter and normalize per modality
3. "integration" : Integrate and batch correct using single and multimodal methods
4. "clustering" : Cluster cells per modality
5. "refmap" : Map queries against reference datasets
6. "vis" : Visualize metrics from other pipelines in the context of experiment metadata
7. "qc_spatial" : Ingest spatial transcriptomics data (Vizgen, Visium) and compute quality control metrics
8. "preprocess_spatial" : Filter and normalize spatial transcriptomics data
9. "deconvolution_spatial" : Deconvolve cell types of spatial transcriptomics slides

## Installation and configuration

See [installation instructions here](https://panpipes-pipelines.readthedocs.io/en/latest/install.html)

Oxford BMRC Rescomp users can find additional advice in [docs/installation_rescomp](https://github.com/DendrouLab/panpipes/blob/main/docs/installation_rescomp.md)

# Releases
## Releases

`panpipes v0.4.0` is out [now](./CHANGELOG.md)!
`panpipes v0.4.0` is out [now](./CHANGELOG.md)!

The `ingest` workflow now expects different headers for the RNA and Protein modalities.
Check the example [submission file](https://github.com/DendrouLab/panpipes/blob/main/docs/usage/sample_file_qc_mm.md) and the [documentation](https://panpipes-pipelines.readthedocs.io/en/latest/usage/setup_for_qc_mm.html) for more detailed instructions.

## Citation

[Panpipes: a pipeline for multiomic single-cell and spatial transcriptomic data analysis
Fabiola Curion, Charlotte Rich-Griffin, Devika Agarwal, Sarah Ouologuem, Tom Thomas, Fabian J. Theis, Calliope A. Dendrou
bioRxiv 2023.03.11.532085; doi: https://doi.org/10.1101/2023.03.11.532085](https://www.biorxiv.org/content/10.1101/2023.03.11.532085v2)

## Contributors

Created and Maintained by Charlotte Rich-Griffin and Fabiola Curion.
Additional contributors: Sarah Ouologuem, Devika Agarwal, Lilly May, Kevin Rue-Albrecht.