Adding cellrangermulti subworkflow #276

Merged
merged 160 commits into from
May 22, 2024
Changes from 71 commits
94397e5
Add cellranger multi testing assets
fmalmeida Nov 7, 2023
e6a54bb
allow cellrangermulti option
fmalmeida Nov 7, 2023
ae20b9e
include cellrangermulti testing conf/profile
fmalmeida Nov 7, 2023
5863d2c
allow cellrangermulti option
fmalmeida Nov 7, 2023
631a980
fix example samplesheet
fmalmeida Nov 8, 2023
70fff60
fixed samplesheet for cellranger multi
fmalmeida Nov 8, 2023
f97e430
don't get cellrangemulti metadata if not needed
fmalmeida Nov 8, 2023
ff66e97
fix check_samplesheet script to be more generic
fmalmeida Nov 9, 2023
0af2761
update input_check for cellranger multi
fmalmeida Nov 9, 2023
8e7d436
avoid renaming sample ids in input check
fmalmeida Nov 9, 2023
97281aa
generate a parsed input channel for cellrangermulti sub-workflow
fmalmeida Nov 9, 2023
8b19f48
defined cellrangemulti sub-workflow and parsed input channel for exec…
fmalmeida Nov 9, 2023
6f494fe
included gex (normal) reference building and updated cellranger modules
fmalmeida Nov 10, 2023
215226b
include mkvdjref
fmalmeida Nov 13, 2023
7b86f80
refactored sample mapping
fmalmeida Nov 14, 2023
e6dc3b2
finally cellranger multi running, with errors, but now can be debugged
fmalmeida Nov 14, 2023
7c66115
not finding samples in data directory
fmalmeida Nov 14, 2023
abb0e3c
saving quick changes for shifting development workspace
fmalmeida Nov 15, 2023
5e27a4c
include option for unzipping reference files
Nov 16, 2023
e00a78d
First successfull run of cellranger multi with renaming module
fmalmeida Nov 17, 2023
ead6462
add whiteline
fmalmeida Nov 20, 2023
ab4425e
Testing github traffic
Nov 21, 2023
2cfb148
Remove file used for testing
Nov 21, 2023
4c275b4
input dataset parsing refactored and fixed
Nov 22, 2023
a8d2702
include cellrangermulti outputs in mqc channel
Nov 22, 2023
0d8be69
include option for cellrangermulti in mtx conversion modules
Nov 23, 2023
4d75c83
add files filter for cellranger multi outputs
Nov 23, 2023
e4e37b5
include cellranger multi outputs to mtx conversion subworkflow
Nov 23, 2023
132e247
update changelog
Nov 23, 2023
0ec7019
remove unused file
Nov 23, 2023
6c7e550
update comment
Nov 23, 2023
a0d666b
remove unused params
fmalmeida Nov 23, 2023
d5c24c3
update nextflow schema
fmalmeida Nov 23, 2023
9206c59
update version of cellranger multi module
Feb 23, 2024
9764f3a
delete unnecessary module
Feb 23, 2024
df89a9b
update modules.config
Feb 23, 2024
85136a2
include newly required parameters
Feb 23, 2024
2f77b4d
remove unnecessary module
Feb 23, 2024
32e480b
remove unwanted args
Feb 23, 2024
3aeaeb8
Merge branch 'dev' of https://github.com/nf-core/scrnaseq into 247-su…
Feb 26, 2024
c85c6dd
add new params
Feb 26, 2024
b38ef0f
update modules due linting
Feb 26, 2024
38060f0
include new columns in samplesheet checker
Feb 26, 2024
aa5e77a
add docker image workaround
Feb 26, 2024
c8b86e4
fix linting
Feb 26, 2024
cc4f0c0
update nextflow schema
fmalmeida Feb 27, 2024
56123e2
applied prettier changes
fmalmeida Feb 27, 2024
e98f444
make param.fasta and params.gtf optional again
fmalmeida Feb 27, 2024
98e92bb
Merge remote-tracking branch 'origin/dev' into 247-support-for-10x-ff…
grst Mar 7, 2024
0e7efe2
update publishDir path
Mar 11, 2024
d432d42
Merge branch '247-support-for-10x-ffpe-scrna' of https://github.com/n…
Mar 11, 2024
b41d89b
Merge branch 'dev' of https://github.com/nf-core/scrnaseq into 247-su…
fmalmeida Mar 19, 2024
912fc73
add sample headers to schema
fmalmeida Mar 19, 2024
f8c65ba
add missing modules
fmalmeida Mar 19, 2024
a5009f8
remove debugging .view()
fmalmeida Mar 19, 2024
53bd304
update cellrager/count
fmalmeida Mar 19, 2024
176060b
small dataset cannot run emptydrops
Mar 19, 2024
04009ff
also run for cellranger/multi
Mar 19, 2024
43f1374
fix white space
Mar 19, 2024
9ff70c6
do not run emptydrops for cellranger arc, and update ch_matrices filt…
Mar 19, 2024
87abd86
add cellrangermulti in aligners options for conversion and add * to m…
Mar 20, 2024
614537e
parse cellrangermulti matrix outputs to filter between raw / filtered…
Mar 20, 2024
ceafcdf
fixing filtering option and using correct cellranger-multi mtxs
Mar 20, 2024
31ff8f4
add nf-test for cellranger multi
fmalmeida Mar 20, 2024
4ed087b
also test cellrangermulti
fmalmeida Mar 20, 2024
5807fd2
revert cellranger modules to latest, without the multi-out-channels a…
Mar 21, 2024
8d396d5
update modules to latest version
Mar 25, 2024
bb8d909
add a parser for raw/filtered results
Mar 25, 2024
84f781d
update comment line
Mar 25, 2024
f86a8d2
add lint fix
fmalmeida Apr 2, 2024
8d30a6f
fixed with new from template using lint
fmalmeida Apr 2, 2024
3b33ded
fix changelog.
fmalmeida Apr 8, 2024
db13a48
update assets for subworkflow
Apr 8, 2024
f9e5017
start removing channels related to "deprecated" additional csvs
Apr 8, 2024
d044f51
fix projectDir
Apr 10, 2024
c8ac4cf
add a parser for frna/cmo data from customised, unified, barcodes sam…
Apr 10, 2024
d81062d
add comment line
Apr 10, 2024
c6e7bfc
fix namings
Apr 10, 2024
319092b
fix if-else and .join() operations
Apr 10, 2024
ac9e5b6
fix selected data
Apr 10, 2024
a92e697
fix testing size
Apr 10, 2024
03e9e82
change to workDir
Apr 10, 2024
51c8577
avoid always writing to allow caching
Apr 10, 2024
62a396e
starting conversion as module
Apr 10, 2024
58466c9
converted parsing and split to a module
Apr 11, 2024
4634a4a
fix code for ensuring FIFO
Apr 11, 2024
07c399a
fixed cellranger-multi input channel logic
Apr 11, 2024
4750ebc
solved cellranger multi parsing and pipeline execution
Apr 12, 2024
aeaea61
add first cellranger-multi try-out bugfixes
Apr 15, 2024
0973363
Fix renaming logic
grst Apr 15, 2024
fa05a9b
fix variable
Apr 16, 2024
a6f46e6
added frna probeset subset reference and include parsing in module
Apr 17, 2024
7ef943c
frna runs also generate raw data per sample
Apr 17, 2024
3b90ddd
use shared nf-core test-datasets
Apr 18, 2024
53f5dc2
update cellranger multi module
Apr 18, 2024
b2f434c
add options-gex meta parsing
Apr 18, 2024
524eedb
Merge branch 'dev' of https://github.com/nf-core/scrnaseq into 247-su…
Apr 18, 2024
555ede3
update last cellranger module
Apr 18, 2024
a0f4ebf
adjust mkvdjref inputs
Apr 18, 2024
92e9d22
fix double comma
Apr 18, 2024
39a953c
update 'channel checkings'
Apr 18, 2024
1841682
add new parameter
Apr 18, 2024
3069563
fix schema
Apr 18, 2024
b4d8d45
also save config per sample
Apr 19, 2024
fc16327
add todo
Apr 19, 2024
2881f09
add fb reference example
Apr 19, 2024
e849278
fix file saving and remove outdated workaround on mqc
Apr 22, 2024
2b68ac8
add an universal key in cellranger-multi data options map so that par…
Apr 22, 2024
acd15cf
update comment
Apr 22, 2024
0d8bb2f
add new inputs to nf-test
Apr 22, 2024
54b1fa6
commit latest editions required in nf-core/module ( must be added in …
Apr 22, 2024
9f0d8db
try test update
Apr 29, 2024
118d754
change global variable used
Apr 30, 2024
b64fa2d
update test to use chr14
fmalmeida Apr 30, 2024
e97e1c0
change reference and resources used
fmalmeida Apr 30, 2024
37a9bb6
update number of tasks
fmalmeida Apr 30, 2024
619d921
Make it work without specifying GTF file
grst Apr 30, 2024
c51d0f6
fix pre-commit
grst Apr 30, 2024
81650b1
set working nf-test for cellranger multi
fmalmeida May 1, 2024
b4204ff
add .clone() method
fmalmeida May 1, 2024
ae6c561
decrease asked memory
fmalmeida May 1, 2024
f6b0f92
make sure ArrayBag is cloned to avoid input modification
fmalmeida May 1, 2024
f8865e6
change name for a better explanation
fmalmeida May 1, 2024
5b68077
modified by prettier
fmalmeida May 1, 2024
32a070b
update resources in test profile
fmalmeida May 1, 2024
97f22ae
make sure the workflow can work with new version of module that does …
fmalmeida May 1, 2024
02dc61d
fix mkgtf module "lint"
fmalmeida May 1, 2024
0361375
updated cellranger multi via nf-core tools
fmalmeida May 1, 2024
04492b1
force deletion for lint
fmalmeida May 1, 2024
2194739
manually download correct images
fmalmeida May 1, 2024
b660d19
Merge branch '247-support-for-10x-ffpe-scrna' into fix-without-gtf
fmalmeida May 2, 2024
5cdb691
Merge pull request #322 from nf-core/fix-without-gtf
fmalmeida May 2, 2024
2b79067
adjust parse to be the same for raw/filtered matrices
fmalmeida May 6, 2024
e8e14de
fix fastqc channel naming
fmalmeida May 6, 2024
714db1b
update nf-tests
fmalmeida May 6, 2024
87f38d0
update name
fmalmeida May 6, 2024
273ddd9
use pre-made fastqc_multiqc channel
fmalmeida May 6, 2024
aa1733f
nf-core lint fix
fmalmeida May 6, 2024
7da5644
flatten channel
fmalmeida May 6, 2024
889353c
update subworkflow
fmalmeida May 6, 2024
3aa7278
add missing .mix() operator
fmalmeida May 7, 2024
844fbc3
include cellrangermulti raw matrices for custom emptydrops filtering
May 10, 2024
9a3e529
correct indentation
May 10, 2024
f842cba
starting documentation on cellranger multi
fmalmeida May 10, 2024
adfda0f
continue documentation
fmalmeida May 10, 2024
0fce1c8
update documentation
fmalmeida May 14, 2024
dc63ae3
add section in outputs
fmalmeida May 14, 2024
9bff0b5
Merge remote-tracking branch 'origin/dev' into 247-support-for-10x-ff…
grst May 15, 2024
92502c9
Update usage
grst May 15, 2024
c22ad6f
Fixed 'file-path' in nextflow schema
grst May 15, 2024
c95b11c
Update output documentation
grst May 15, 2024
f4304ad
Update nextflow_schema.json
grst May 15, 2024
03a38cd
remove gex_barcode_sample_assignment parameter
fmalmeida May 16, 2024
e22f986
add note
fmalmeida May 16, 2024
0d0275e
Update nextflow schema documentation
grst May 17, 2024
882812c
Revert "remove gex_barcode_sample_assignment parameter"
grst May 17, 2024
708c903
Revert "add note"
grst May 17, 2024
b3afdb7
Merge branch 'dev' into 247-support-for-10x-ffpe-scrna
maxulysse May 22, 2024
d497ca8
update file
maxulysse May 22, 2024
4d9f17e
update file better
maxulysse May 22, 2024
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -33,7 +33,7 @@ jobs:
NXF_VER:
- "23.04.0"
- "latest-everything"
profile: ["alevin", "cellranger", "kallisto", "star"]
profile: ["alevin", "cellranger", "cellrangermulti", "kallisto", "star"]

steps:
- name: Disk space cleanup
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -26,6 +26,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- The universc protocol is now specified via the `--protocol` flag
- Any protocol specified is now passed to the respective aligner
- Added a section to the documentation
- Add cellranger multi subworkflow ([#247](https://github.com/nf-core/scrnaseq/issues/247))

## v2.4.1 - 2023-09-28

1 change: 1 addition & 0 deletions assets/EMPTY
@@ -0,0 +1 @@

9 changes: 9 additions & 0 deletions assets/cellrangermulti_samplesheet.csv
@@ -0,0 +1,9 @@
sample,fastq_1,fastq_2,feature_type,expected_cells
PBMC_10K,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/5gex/5gex/subsampled_sc5p_v2_hs_PBMC_10k_5gex_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/5gex/5gex/subsampled_sc5p_v2_hs_PBMC_10k_5gex_S1_L001_R2_001.fastq.gz,gex,1000
PBMC_10K,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/bcell/subsampled_sc5p_v2_hs_PBMC_10k_b_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/bcell/subsampled_sc5p_v2_hs_PBMC_10k_b_S1_L001_R2_001.fastq.gz,vdj,1000
PBMC_10K,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/5gex/5fb/subsampled_sc5p_v2_hs_PBMC_10k_5fb_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/5gex/5fb/subsampled_sc5p_v2_hs_PBMC_10k_5fb_S1_L001_R2_001.fastq.gz,ab,1000
PBMC_10K_CMO,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc_cmo/fastqs/gex_1/subsampled_SC3_v3_NextGem_DI_CellPlex_Human_PBMC_10K_1_gex_S2_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc_cmo/fastqs/gex_1/subsampled_SC3_v3_NextGem_DI_CellPlex_Human_PBMC_10K_1_gex_S2_L001_R2_001.fastq.gz,gex,1000
PBMC_10K_CMO,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc_cmo/fastqs/cmo/subsampled_SC3_v3_NextGem_DI_CellPlex_Human_PBMC_10K_1_multiplexing_capture_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc_cmo/fastqs/cmo/subsampled_SC3_v3_NextGem_DI_CellPlex_Human_PBMC_10K_1_multiplexing_capture_S1_L001_R2_001.fastq.gz,cmo,1000
PBMC_10K_CMV,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/gex_1/subsampled_5k_human_antiCMV_T_TBNK_connect_GEX_1_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/gex_1/subsampled_5k_human_antiCMV_T_TBNK_connect_GEX_1_S1_L001_R2_001.fastq.gz,gex,1000
PBMC_10K_CMV,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/ab/subsampled_5k_human_antiCMV_T_TBNK_connect_AB_S2_L004_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/ab/subsampled_5k_human_antiCMV_T_TBNK_connect_AB_S2_L004_R2_001.fastq.gz,ab,1000
PBMC_10K_CMV,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/vdj/subsampled_5k_human_antiCMV_T_TBNK_connect_VDJ_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/vdj/subsampled_5k_human_antiCMV_T_TBNK_connect_VDJ_S1_L001_R2_001.fastq.gz,vdj,1000
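In this samplesheet one sample name spans several rows, one per feature type. As a rough illustration of how such rows could be collected per sample before being handed to cellranger multi, here is a minimal Python sketch. It is not the pipeline's own parsing code (that lives in the Nextflow subworkflow); the function name `group_by_sample` and the shortened file names are placeholders chosen for this example.

```python
import csv
import io
from collections import defaultdict

# Shortened stand-in for the samplesheet above; the FASTQ names are placeholders.
SAMPLESHEET = """\
sample,fastq_1,fastq_2,feature_type,expected_cells
PBMC_10K,gex_R1.fastq.gz,gex_R2.fastq.gz,gex,1000
PBMC_10K,vdj_R1.fastq.gz,vdj_R2.fastq.gz,vdj,1000
PBMC_10K,ab_R1.fastq.gz,ab_R2.fastq.gz,ab,1000
PBMC_10K_CMO,gex_R1.fastq.gz,gex_R2.fastq.gz,gex,1000
PBMC_10K_CMO,cmo_R1.fastq.gz,cmo_R2.fastq.gz,cmo,1000
"""


def group_by_sample(text):
    """Return {sample: {feature_type: [(fastq_1, fastq_2), ...]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(text)):
        pair = (row["fastq_1"], row["fastq_2"])
        grouped[row["sample"]][row["feature_type"]].append(pair)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(by_type) for sample, by_type in grouped.items()}


groups = group_by_sample(SAMPLESHEET)
for sample, by_type in groups.items():
    print(sample, sorted(by_type))
```

Keeping the feature type as the inner key mirrors the idea that each feature type of a sample ends up in its own input slot.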
3 changes: 3 additions & 0 deletions assets/cmo_barcodes.csv
@@ -0,0 +1,3 @@
sample_id,cmo_ids,description
Member

Should this go to the test-datasets repo eventually (together with the sample sheet)?

Contributor Author

@fmalmeida fmalmeida Dec 4, 2023

I think it would make sense to, yes.

PBMCs_human_1,CMO301,PBMCs_human_1
Member

How does this correspond to the samplesheet above? Don't the sample_ids have to match?

Contributor Author

Hi @grst ,
I cannot say much about this. I took this file from the cellranger/multi module tests, where I think the IDs already differed, so I did not change it.

Hi @klkeys, Could you shed some light on this?

Should we make sample_id match the one in the samplesheet (which, I agree with @grst, is the logical choice), or is this indeed something different?

Member

actually, I think I answered it below

@klkeys klkeys Apr 5, 2024

I don't think that physical sample IDs need to match the CMO IDs, see the CMO test data for the cellranger multi module

matching physical and CMO IDs does make sense if you have exactly one CMO sample per physical sample, as in @grst's explanation in #276 (comment)

note that the CMO Feature Reference CSV only needs controls, and it requires all controls from the same CMO sample as one line of the cellranger multi config (doc reference)

@klkeys klkeys Apr 5, 2024

also relevant to #276 (comment)

I haven't run CMO myself, but I don't think that you tag multiple samples with the same CMO

your CMO files should be this:

proposed CMO samplesheet

sample,multiplexed_sample_id,probe_barcode_ids,description
SAMP001,SAMP001_ctrl,BC301,Control
SAMP001,SAMP001_trt,BC302,Treated
SAMP002,SAMP002_ctrl,BC303,Control
SAMP002,SAMP002_trt,BC304,Treated

sample 1 CMO config

sample_id,probe_barcode_ids,description
SAMP001_ctrl,BC301,Control

sample 2 CMO config

sample_id,probe_barcode_ids,description
SAMP002_ctrl,BC303,Control

Member

thanks, from reading the code the logic seems to make sense to me.
I only have a multiplexed FFPE dataset I could test on, I'll try to do so this week.

Contributor Author

I am fixing some stuff (which I will update when done) then you can test it afterwards.

Contributor Author

Hi @grst ,
I have added the proposed changes and other fixes that I found were around:

Code as a module
First, as suggested here, I moved the parsing code from Groovy into a module so that we face no problems with AWS or with the caching feature.

The module is a Python script that checks the consistency of the additional samplesheet and, based on the cmo / frna related columns, splits it into one samplesheet per sample.
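To make the check-and-split idea concrete, here is a minimal sketch in Python. It is not the module's actual script: the column names are taken from the samplesheet proposed earlier in this thread, while the function name `split_barcode_sheet`, the specific consistency check (no barcode reused within a sample), and the choice to keep every row of a sample are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Columns from the samplesheet proposed earlier in this thread.
REQUIRED = ["sample", "multiplexed_sample_id", "probe_barcode_ids", "description"]

UNIFIED = """\
sample,multiplexed_sample_id,probe_barcode_ids,description
SAMP001,SAMP001_ctrl,BC301,Control
SAMP001,SAMP001_trt,BC302,Treated
SAMP002,SAMP002_ctrl,BC303,Control
SAMP002,SAMP002_trt,BC304,Treated
"""


def split_barcode_sheet(text):
    """Validate a unified barcode samplesheet; return {sample: per-sample CSV text}."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    rows_by_sample = defaultdict(list)
    seen = set()
    for row in reader:
        # Assumed consistency rule: a barcode may appear only once per sample.
        key = (row["sample"], row["probe_barcode_ids"])
        if key in seen:
            raise ValueError(f"barcode {key[1]} used twice in sample {key[0]}")
        seen.add(key)
        rows_by_sample[row["sample"]].append(row)

    sheets = {}
    for sample, rows in rows_by_sample.items():
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(["sample_id", "probe_barcode_ids", "description"])
        for r in rows:
            writer.writerow([r["multiplexed_sample_id"], r["probe_barcode_ids"], r["description"]])
        sheets[sample] = out.getvalue()
    return sheets


sheets = split_barcode_sheet(UNIFIED)
print(sheets["SAMP001"])
```

Running the split as a separate process rather than inside a Groovy closure means its outputs are ordinary files, which is what lets the workflow cache them.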

Adding the module's split samplesheets to the workflow
Overall, the code above is simple. The tricky part was wiring it back into the workflow. Here, I took advantage of the "FIFO" rules, using the "GEX" files channel as a base to re-connect and order the split cmo / frna samplesheets so that they are used in the same order (with the correct samples) in the CELLRANGER_MULTI module.

This is done in this chunk:
https://github.com/nf-core/scrnaseq/blob/247-support-for-10x-ffpe-scrna/subworkflows/local/align_cellrangermulti.nf#L64-L105
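The ordering idea can be pictured outside Nextflow as well. The Python sketch below is only an analogy for the channel logic linked above, not a translation of it: the GEX list fixes the order, each split samplesheet is looked up by sample name, and samples without a sheet receive a placeholder. The function name `align_to_gex`, the file names, and the use of `"EMPTY"` as the placeholder value are assumptions for this example.

```python
def align_to_gex(gex_samples, split_sheets, placeholder="EMPTY"):
    """Return one (sample, sheet) pair per GEX sample, preserving GEX order."""
    return [(sample, split_sheets.get(sample, placeholder)) for sample in gex_samples]


# Split sheets may arrive in any order; the GEX list decides the final order.
gex_order = ["SAMP002", "SAMP001", "SAMP003"]
sheets = {"SAMP001": "SAMP001.csv", "SAMP002": "SAMP002.csv"}
aligned = align_to_gex(gex_order, sheets)
print(aligned)
```

Emitting exactly one entry per GEX sample is what keeps positional pairing of inputs safe in this sketch.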

Parsing the generated results for MTX_CONVERSION
Then, I had to add parsing for the generated results so that they can be converted to .h5ad / .Rds. Cellranger multi writes the filtered results to a special folder called per_sample_outs, so that when multiple samples are demultiplexed by the given barcodes, each one gets its own subdirectory there.

As such, when running cellranger/multi, we convert the raw_matrices (not per sample) and the filtered_matrices (per sample).

The code related to it is here:
https://github.com/nf-core/scrnaseq/blob/247-support-for-10x-ffpe-scrna/subworkflows/local/align_cellrangermulti.nf#L181-L228

The final part of it is just the "standard" splitting raw / filtered as we do for normal cellranger as well.
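The raw versus filtered, run-level versus per-sample distinction described here can be sketched as a small path classifier. This is a Python illustration under stated assumptions, not the subworkflow's Groovy code: the example paths (`outs/multi/count/...` and `outs/per_sample_outs/<sample>/count/...`) are assumed layouts, and `classify_matrix` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def classify_matrix(path):
    """Return (kind, sample); sample is None for run-level (not per-sample) outputs."""
    p = PurePosixPath(path)
    kind = "raw" if "raw_feature_bc_matrix" in p.name else "filtered"
    parts = p.parts
    if "per_sample_outs" in parts:
        # The directory right after per_sample_outs is taken as the sample name.
        sample = parts[parts.index("per_sample_outs") + 1]
        return kind, sample
    return kind, None


# Assumed example paths, for illustration only.
run_level = classify_matrix("outs/multi/count/raw_feature_bc_matrix.h5")
per_sample = classify_matrix(
    "outs/per_sample_outs/SAMP001/count/sample_filtered_feature_bc_matrix.h5"
)
print(run_level, per_sample)
```

Matching on the file name rather than the whole path avoids misclassifying a sample whose directory name happens to contain "raw".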

Custom Emptydrops
Currently, emptydrops will not be performed for cellranger/multi as I am not sure if it is relevant.
https://github.com/nf-core/scrnaseq/blob/247-support-for-10x-ffpe-scrna/workflows/scrnaseq.nf#L278

Samplesheet parsing
The cellranger multi subworkflow can receive data from multiple feature types for the same sample, but we must preserve the per-sample feature type information in different channels so that they can be properly parsed to create the channels the modules expect.

So, I had to adapt the samplesheet parsing here: https://github.com/nf-core/scrnaseq/blob/247-support-for-10x-ffpe-scrna/subworkflows/local/utils_nfcore_scrnaseq_pipeline/main.nf#L83-L124

nf-tests and documentation
The nf-tests and documentation still need work, but I will only start on them once we have settled the code.

😄

Contributor Author

Now it can be tested with your data, so you can also see what the outputs look like.

Member

Fantastic! I'll try it on Monday, have a nice weekend!

PBMCs_human_2,CMO302,PBMCs_human_2
5 changes: 5 additions & 0 deletions assets/schema_input.json
@@ -40,6 +40,11 @@
"type": "string",
"enum": ["atac", "gex"],
"meta": ["sample_type"]
},
"feature_type": {
"type": "string",
"enum": ["gex", "vdj", "ab", "beam", "crispr", "cmo"],
"meta": ["feature_type"]
}
},
"required": ["sample", "fastq_1", "fastq_2"]
53 changes: 53 additions & 0 deletions conf/modules.config
@@ -229,3 +229,56 @@ if (params.aligner == 'kallisto') {
}
}
}

if (params.aligner == 'cellrangermulti') {
process {
withName: FASTQC { ext.prefix = { "${meta.id}_${meta.feature_type}" } } // allow distinguishment of data types after renaming
withName: 'NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN:CELLRANGER_MULTI' {
ext.prefix = null // force it null, for some reason it was being wrongly read in the module
publishDir = [
path: "${params.outdir}/${params.aligner}/count",
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}
withName: 'GUNZIP*' {
publishDir = [
enabled: false
]
}
withName: CELLRANGER_MKGTF {
publishDir = [
path: "${params.outdir}/${params.aligner}/mkgtf",
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}
withName: CELLRANGER_MKREF {
publishDir = [
path: "${params.outdir}/${params.aligner}/mkref",
mode: params.publish_dir_mode
]
}
withName: CELLRANGER_MKVDJREF {
publishDir = [
path: "${params.outdir}/${params.aligner}/mkvdjref",
mode: params.publish_dir_mode
]
}
}
}


//
// QUICK FIX FOR PROBLEM WITH MQC IMAGE
// TODO: TO REMOVE WHEN FIXED IN NF-CORE MODULE
//
process {
withName: 'MULTIQC|CUSTOM_DUMPSOFTWAREVERSIONS' {
container = {
"${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/multiqc:1.20--pyhdfd78af_2' :
'biocontainers/multiqc:1.20--pyhdfd78af_2' }"
}
}
}
38 changes: 38 additions & 0 deletions conf/test_cellranger_multi.config
@@ -0,0 +1,38 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/scrnaseq -profile test,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

// shared across profiles
params {
config_profile_name = 'Test profile (Cellranger Multi)'
config_profile_description = 'Minimal test dataset to check pipeline function using cellranger multi'

// Resources on test case
max_cpus = 10
max_memory = '50.GB'
max_time = '6.h'

// Input data
input = "${projectDir}/assets/cellrangermulti_samplesheet.csv"
cmo_barcode_csv = 'https://github.com/nf-core/scrnaseq/raw/247-support-for-10x-ffpe-scrna/assets/cmo_barcodes.csv'
skip_emptydrops = true // not enough data in small test

// Genome references
fasta = 'https://ftp.ensembl.org/pub/release-110/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz'
gtf = 'https://ftp.ensembl.org/pub/release-110/gtf/homo_sapiens/Homo_sapiens.GRCh38.110.gtf.gz'

// aligner
aligner = 'cellrangermulti'
protocol = 'auto'

// other
validationSchemaIgnoreParams = 'genomes'
}
Binary file modified docs/images/nf-core-scrnaseq_logo_dark.png
Binary file modified docs/images/nf-core-scrnaseq_logo_light.png
1 change: 1 addition & 0 deletions lib/Utils.groovy
@@ -9,6 +9,7 @@ class WorkflowScrnaseq {
def jsonSlurper = new JsonSlurper()
def json = new File("${workflow.projectDir}/assets/protocols.json").text
def protocols = jsonSlurper.parseText(json)
aligner = (aligner == 'cellrangermulti') ? 'cellranger' : aligner
def aligner_map = protocols[aligner]
if(aligner_map.containsKey(protocol)) {
return aligner_map[protocol]
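The one-line change in this hunk makes `cellrangermulti` reuse the protocol table of `cellranger`. A minimal Python sketch of that lookup follows. The JSON excerpt is hypothetical (the real `assets/protocols.json` is not shown in this diff), `resolve_protocol` is a name invented for this example, and returning `None` for an unknown protocol is an assumption, since the Groovy fallback branch is cut off above.

```python
import json

# Hypothetical excerpt; the real assets/protocols.json is not part of this diff.
PROTOCOLS_JSON = '{"cellranger": {"auto": "auto", "10XV3": "SC3Pv3"}}'


def resolve_protocol(aligner, protocol):
    """Look up a protocol, treating cellrangermulti as an alias of cellranger."""
    protocols = json.loads(PROTOCOLS_JSON)
    aligner = "cellranger" if aligner == "cellrangermulti" else aligner
    aligner_map = protocols[aligner]
    # Assumption: unknown protocols yield None; the original fallback is not shown.
    return aligner_map.get(protocol)


print(resolve_protocol("cellrangermulti", "auto"))
```

Aliasing at lookup time avoids duplicating the cellranger entries under a second key in the protocols file.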
16 changes: 13 additions & 3 deletions modules.json
@@ -7,19 +7,29 @@
"nf-core": {
"cellranger/count": {
"branch": "master",
"git_sha": "92ca535c5a8c0fe89eb71e649ee536bd355ce4fc",
"git_sha": "1774f7876ee03f65ccf49ca2e6bdef7c2356ebca",
"installed_by": ["modules"]
},
"cellranger/mkgtf": {
"branch": "master",
"git_sha": "575e1bc54b083fb15e7dd8b5fcc40bea60e8ce83",
"git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5",
"installed_by": ["modules"]
},
"cellranger/mkref": {
"branch": "master",
"git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5",
"installed_by": ["modules"]
},
"cellranger/mkvdjref": {
"branch": "master",
"git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5",
"installed_by": ["modules"]
},
"cellranger/multi": {
"branch": "master",
"git_sha": "a03357ba56317686b6f65102415211616cd38672",
"installed_by": ["modules"]
},
"cellrangerarc/count": {
"branch": "master",
"git_sha": "18e53e27cfeca5dbbfbeee675c05438dec68245f",
@@ -52,7 +62,7 @@
},
"kallistobustools/count": {
"branch": "master",
"git_sha": "9d3e489286eead7dfe1010fd324904d8b698eca7",
"git_sha": "53c2b466994f07def210b7f4cc866bb5a8a2cb92",
"installed_by": ["modules"]
},
"kallistobustools/ref": {
2 changes: 1 addition & 1 deletion modules/local/emptydrops.nf
@@ -19,7 +19,7 @@ process EMPTYDROPS_CELL_CALLING {
task.ext.when == null || task.ext.when

script:
if (params.aligner == "cellranger") {
if (params.aligner in ["cellranger", "cellrangermulti"]) {

matrix = "matrix.mtx.gz"
barcodes = "barcodes.tsv.gz"
8 changes: 4 additions & 4 deletions modules/local/mtx_to_h5ad.nf
@@ -27,12 +27,12 @@ process MTX_TO_H5AD {

// check input type of inputs
input_type = (input_to_check.toUriString().contains('unfiltered') || input_to_check.toUriString().contains('raw')) ? 'raw' : 'filtered'
if ( params.aligner == 'alevin' ) { input_type = 'raw' } // alevin has its own filtering methods and mostly output a single mtx, raw here means, the base tool output
if ( params.aligner == 'alevin' ) { input_type = 'raw' } // alevin has its own filtering methods and mostly output a single mtx, 'raw' here means, the base tool output
if (input_to_check.toUriString().contains('emptydrops')) { input_type = 'custom_emptydrops_filter' }

// def file paths for aligners. Cellranger is normally converted with the .h5 files
// However, the emptydrops call, always generate .mtx files, thus, cellranger 'emptydrops' required a parsing
if (params.aligner in [ 'cellranger', 'cellrangerarc' ] && input_type == 'custom_emptydrops_filter') {
if (params.aligner in [ 'cellranger', 'cellrangerarc', 'cellrangermulti' ] && input_type == 'custom_emptydrops_filter') {

aligner = 'cellranger'
txp2gene = ''
@@ -89,12 +89,12 @@ process MTX_TO_H5AD {
//
// run script
//
if (params.aligner in [ 'cellranger', 'cellrangerarc' ] && input_type != 'custom_emptydrops_filter')
if (params.aligner in [ "cellranger", "cellrangerarc", "cellrangermulti"] && input_type != 'custom_emptydrops_filter')
"""
# convert file types
mtx_to_h5ad.py \\
--aligner cellranger \\
--input ${input_type}_feature_bc_matrix.h5 \\
--input *${input_type}_feature_bc_matrix.h5 \\
--sample ${meta.id} \\
--out ${meta.id}/${meta.id}_${input_type}_matrix.h5ad
"""
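The branching in this module can be hard to follow in diff form, so here is a Python paraphrase of the visible logic. It mirrors only the lines shown in the two hunks above; the function names `detect_input_type` and `uses_h5_branch` and the example file names are invented for this sketch. The last part illustrates why a leading `*` in the `--input` pattern helps: a glob with `*` also matches a prefixed name such as `sample_filtered_feature_bc_matrix.h5` (an assumed per-sample file name).

```python
from fnmatch import fnmatchcase

H5_ALIGNERS = ("cellranger", "cellrangerarc", "cellrangermulti")


def detect_input_type(path, aligner):
    """Paraphrase of the input_type checks shown in the hunk above."""
    input_type = "raw" if ("unfiltered" in path or "raw" in path) else "filtered"
    if aligner == "alevin":
        input_type = "raw"  # alevin output is treated as the base tool output
    if "emptydrops" in path:
        input_type = "custom_emptydrops_filter"
    return input_type


def uses_h5_branch(aligner, input_type):
    """True when conversion reads the .h5 file instead of .mtx files."""
    return aligner in H5_ALIGNERS and input_type != "custom_emptydrops_filter"


kind = detect_input_type("count/sample_filtered_feature_bc_matrix.h5", "cellrangermulti")
old_pattern = f"{kind}_feature_bc_matrix.h5"
new_pattern = f"*{kind}_feature_bc_matrix.h5"
name = "sample_filtered_feature_bc_matrix.h5"
print(kind, fnmatchcase(name, old_pattern), fnmatchcase(name, new_pattern))
```

Because `*` can also match an empty prefix, the new pattern still matches an unprefixed `filtered_feature_bc_matrix.h5`, so it covers both naming schemes.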
2 changes: 1 addition & 1 deletion modules/local/mtx_to_seurat.nf
@@ -35,7 +35,7 @@ process MTX_TO_SEURAT {

// def file paths for aligners. Cellranger is normally converted with the .h5 files
// However, the emptydrops call, always generate .mtx files, thus, cellranger 'emptydrops' required a parsing
if (params.aligner in [ 'cellranger', 'cellrangerarc' ]) {
if (params.aligner in [ "cellranger", "cellrangerarc", "cellrangermulti"]) {

mtx_dir = (input_type == 'custom_emptydrops_filter') ? 'emptydrops_filtered/' : ''
matrix = "${mtx_dir}matrix.mtx*"
10 changes: 2 additions & 8 deletions modules/nf-core/cellranger/count/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 0 additions & 8 deletions modules/nf-core/cellranger/count/meta.yml

52 changes: 2 additions & 50 deletions modules/nf-core/cellranger/count/tests/main.nf.test.snap

4 changes: 2 additions & 2 deletions modules/nf-core/cellranger/mkgtf/meta.yml

5 changes: 5 additions & 0 deletions modules/nf-core/cellranger/mkvdjref/environment.yml