Upgrade simpleaf to 0.19.0 (#7405)
* update simpleaf index

* done updating simpleaf modules. testing

* tests passed

* remove sshash from index test

* remove sshash from index test

* update snapshot and clean meta.yml

* rewrite version.json

* rewrite version.json

* update version md5

* update version md5

* update version md5

* remove salmon from versions.json

* remove md5sum checks

* restructure simpleaf quant output

* add quant_dir in stub

* fix link

* last try of md5sum test

* remove md5 related tests

* remove md5 related tests

* rewrote index and quant to be more generalized

* set meta[count_type] as raw or filter

* linting

* update simpleaf modules tags

* update simpleaf modules tags

* add filtered to simpleaf quant meta

* make read id the default in simpleaf quant

* update simpleaf tests

* update simpleaf tests

* update simpleaf tests and passed

* update simpleaf tests and passed

* test --update

* upgrade to simpleaf 0.19.0

* make lint happy

* write h5ad by default

* update to 0.19.1

---------

Co-authored-by: dongzehe <[email protected]>
Co-authored-by: Gregor Sturm <[email protected]>
Co-authored-by: Dongze He <[email protected]>
4 people authored Feb 4, 2025
1 parent 83fb0b4 commit 094299f
Showing 9 changed files with 187 additions and 64 deletions.
4 changes: 2 additions & 2 deletions modules/nf-core/simpleaf/index/environment.yml
@@ -6,6 +6,6 @@ channels:

dependencies:
- bioconda::alevin-fry=0.11.1
- bioconda::piscem=0.11.0
- bioconda::piscem=0.12.2
- bioconda::salmon=1.10.3
- bioconda::simpleaf=0.18.4
- bioconda::simpleaf=0.19.1
38 changes: 21 additions & 17 deletions modules/nf-core/simpleaf/index/main.nf
@@ -5,12 +5,14 @@ process SIMPLEAF_INDEX {

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/simpleaf:0.18.4--ha6fb395_1':
'biocontainers/simpleaf:0.18.4--ha6fb395_1' }"
'https://depot.galaxyproject.org/singularity/simpleaf:0.19.1--ha6fb395_0':
'biocontainers/simpleaf:0.19.1--ha6fb395_0' }"

input:
tuple val(meta), path(genome_fasta), path(genome_gtf)
tuple val(meta2), path(transcript_fasta)
tuple val(meta3), path(probe_csv)
tuple val(meta4), path(feature_csv)

output:
tuple val(meta), path("${prefix}/index") , emit: index
@@ -23,10 +25,9 @@ process SIMPLEAF_INDEX {

script:
def args = task.ext.args ?: ''
def seq_inputs = input_args(genome_fasta, genome_gtf, transcript_fasta)//, probes_csv, features_csv)
(meta, seq_inputs) = input_args(genome_fasta, genome_gtf, transcript_fasta, probe_csv, feature_csv, meta, meta2, meta3, meta4)

// Output meta needs to correspond to the input used
meta = (transcript_fasta) ? meta2 : meta
prefix = task.ext.prefix ?: "${meta.id}"
"""
# export required var
@@ -56,8 +57,8 @@ process SIMPLEAF_INDEX {
"""

stub:
def args = task.ext.args ?: ''
prefix = task.ext.prefix ?: (meta.id ? "${meta.id}" : "${meta2.id}")
meta = meta ? meta : [id: 'stub']
prefix = task.ext.prefix ?: "${meta.id}"

"""
mkdir -p ${prefix}/index
@@ -78,19 +79,22 @@ process SIMPLEAF_INDEX {
"""
}

def input_args(genome_fasta, genome_gtf, transcript_fasta) { //, probes_csv, features_csv) {
// if (probe_csv) {
// args = "--probe_csv ${probe_csv}"
// } else if (feature_csv) {
// args = "--feature_csv ${feature_csv}"
// } else
if (transcript_fasta) {
return "--ref-seq ${transcript_fasta}"
def input_args(genome_fasta, genome_gtf, transcript_fasta, probe_csv, feature_csv, meta, meta2, meta3, meta4) {
// check if all null
if (!genome_fasta && !genome_gtf && !transcript_fasta && !probe_csv && !feature_csv) {
error "No valid input provided; please provide either a genome fasta + gtf set or a transcript fasta file."
}

if (feature_csv) {
return [meta4, "--feature-csv ${feature_csv}"]
} else if (probe_csv) {
return [meta3, "--probe-csv ${probe_csv}"]
} else if (transcript_fasta) {
return [meta2, "--ref-seq ${transcript_fasta}"]
} else if (genome_fasta && genome_gtf) {
return "--fasta ${genome_fasta} --gtf ${genome_gtf}"
return [meta, "--fasta ${genome_fasta} --gtf ${genome_gtf}"]
} else {
error "No valid input provided; please provide either a genome fasta + gtf set or a transcript fasta file. ${genome_fasta} ${genome_gtf} ${transcript_fasta}"
// error "No valid input provided; please provide one of the followings: (i) a genome fasta + gtf set, (ii) a transcript fasta file, (iii) a probes csv file (iv) a features csv file."
error "No valid input provided; please provide one of the followings: (i) a genome fasta + gtf set, (ii) a transcript fasta file, (iii) a probes csv file (iv) a features csv file."
}

}
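
The rewritten input_args helper picks exactly one input set, in a fixed priority order, and returns the matching meta map together with the simpleaf argument string. The following is a minimal standalone Groovy sketch of that priority logic, not the module code itself: the function name selectIndexInput, the map-based argument, and the file names are hypothetical and used only for illustration. It relies on Groovy treating empty lists ([]) and null as false, which is why unused inputs are passed as [] in the tests.

```groovy
// Illustrative sketch only: `selectIndexInput` and its Map argument are
// hypothetical names, not part of the nf-core module.
def selectIndexInput(Map inputs) {
    // Priority mirrors the diff: feature_csv > probe_csv > transcript_fasta > genome fasta + gtf.
    // Empty lists and nulls are falsy in Groovy, so unused inputs can be passed as [].
    if (inputs.feature_csv) {
        return [inputs.meta4, "--feature-csv ${inputs.feature_csv}".toString()]
    } else if (inputs.probe_csv) {
        return [inputs.meta3, "--probe-csv ${inputs.probe_csv}".toString()]
    } else if (inputs.transcript_fasta) {
        return [inputs.meta2, "--ref-seq ${inputs.transcript_fasta}".toString()]
    } else if (inputs.genome_fasta && inputs.genome_gtf) {
        return [inputs.meta, "--fasta ${inputs.genome_fasta} --gtf ${inputs.genome_gtf}".toString()]
    }
    // The module calls Nextflow's `error`; a plain exception stands in for it here.
    throw new IllegalArgumentException("No valid input provided")
}

// Only the probe CSV is set, so its meta map (meta3) and flag are selected.
def (selectedMeta, seqInputs) = selectIndexInput(
    meta: [], genome_fasta: [], genome_gtf: [],
    meta2: [], transcript_fasta: [],
    meta3: [id: 'probes'], probe_csv: 'probe_set.csv',
    meta4: [], feature_csv: []
)
assert selectedMeta.id == 'probes'
assert seqInputs == '--probe-csv probe_set.csv'
```

Returning the meta map alongside the arguments means the output prefix always follows whichever input was actually used, which is what the removed line `meta = (transcript_fasta) ? meta2 : meta` did for only two of the four cases.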
33 changes: 23 additions & 10 deletions modules/nf-core/simpleaf/index/meta.yml
@@ -20,17 +20,13 @@ input:
- genome_fasta:
type: file
description: |
FASTA file containing the genome sequence.
It conflicts with transcript_fasta.
When transcript_fasta is provided, it must be empty (provided as []).
When transcript_fasta is empty, it must be provided together with its corresponding genome_gtf file.
FASTA file containing the genome sequence. It must be provided together with the corresponding genome_gtf file.
When another input set is provided, it must be empty (provided as []).
- genome_gtf:
type: file
description: |
GTF file containing gene annotations.
It conflicts with transcript_fasta.
When transcript_fasta is provided, it must be empty (provided as []).
When transcript_fasta is empty, it must be provided together with its corresponding genome_fasta file.
GTF file containing gene annotations. It must be provided together with its corresponding genome_fasta file.
When another input set rather than genome_fasta + genome_gtf is provided, it must be empty (provided as []).
- - meta2:
type: map
description: |
@@ -39,8 +35,25 @@ input:
type: file
description: |
FASTA file containing the transcript sequences to build index directly on.
It conflicts with genome_gtf and genome_fasta.
When genome_gtf and genome_fasta are provided, it must be empty (provided as []).
When another input set is provided, it must be empty (provided as []).
- - meta3:
type: map
description: |
Groovy Map containing information on probe_csv.
- probe_csv:
type: file
description: |
CSV file containing the reference probe sequences to build index directly on.
When another input set is provided, it must be empty (provided as []).
- - meta4:
type: map
description: |
Groovy Map containing information on feature_csv.
- feature_csv:
type: file
description: |
CSV file containing the reference feature barcodes to build index directly on.
When another input set is provided, it must be empty (provided as []).
output:
- index:
- meta:
80 changes: 79 additions & 1 deletion modules/nf-core/simpleaf/index/tests/main.nf.test
@@ -21,6 +21,8 @@ nextflow_process {
input[0] = Channel.of([ meta, genome_fasta, gtf ])
input[1] = Channel.of([[],[]])
input[2] = Channel.of([[],[]])
input[3] = Channel.of([[],[]])
"""
}
}
@@ -56,6 +58,80 @@ nextflow_process {
input[0] = Channel.of([[],[],[]])
input[1] = Channel.of([ meta, transcriptome_fasta ])
input[2] = Channel.of([[],[]])
input[3] = Channel.of([[],[]])
"""
}
}

then {
assertAll(
{ assert process.success },
{ assert snapshot(process.out.versions).match() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx_cfish.json").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.ctab").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.ectab").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.json").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.refinfo").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.sshash").exists() },
{ assert file("${process.out.index.get(0).get(1)}/simpleaf_index.json").exists() }
// { assert snapshot(
// path("${process.out.index.get(0).get(1)}/piscem_idx.ctab"),
// path("${process.out.index.get(0).get(1)}/piscem_idx.json"),
// path("${process.out.index.get(0).get(1)}/piscem_idx_cfish.json"),
// process.out.versions)
// .match() }
)
}
}

test("Homo sapiens - probe index - direct - probe csv") {
when {
process {
"""
probe_csv = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/10xgenomics/spaceranger/human-brain-cancer-11-mm-capture-area-ffpe-2-standard_v2_ffpe_cytassist/CytAssist_11mm_FFPE_Human_Glioblastoma_probe_set.csv', checkIfExists: true)
meta = [ 'id': 'CytAssist_11mm_FFPE_Human_Glioblastoma_probe_set']
input[0] = Channel.of([[],[],[]])
input[1] = Channel.of([[],[]])
input[2] = Channel.of([ meta, probe_csv ])
input[3] = Channel.of([[],[]])
"""
}
}

then {
assertAll(
{ assert process.success },
{ assert snapshot(process.out.versions).match() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx_cfish.json").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.ctab").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.ectab").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.json").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.refinfo").exists() },
{ assert file("${process.out.index.get(0).get(1)}/piscem_idx.sshash").exists() },
{ assert file("${process.out.index.get(0).get(1)}/simpleaf_index.json").exists() }
// { assert snapshot(
// path("${process.out.index.get(0).get(1)}/piscem_idx.ctab"),
// path("${process.out.index.get(0).get(1)}/piscem_idx.json"),
// path("${process.out.index.get(0).get(1)}/piscem_idx_cfish.json"),
// process.out.versions)
// .match() }
)
}
}

test("Homo sapiens - feature index - direct - feature csv") {
when {
process {
"""
feature_csv = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/sc5p_v2_hs_PBMC_10k_multi_5gex_5fb_b_t_feature_ref.csv', checkIfExists: true)
meta = [ 'id': 'sc5p_v2_hs_PBMC_10k_multi_5gex_5fb_b_t_feature_ref']
input[0] = Channel.of([[],[],[]])
input[1] = Channel.of([[],[]])
input[2] = Channel.of([[],[]])
input[3] = Channel.of([ meta, feature_csv ])
"""
}
}
@@ -90,7 +166,9 @@ nextflow_process {
meta = [ 'id': 'human_transcriptome']
input[0] = Channel.of([[],[],[]])
input[1] = Channel.of([ meta, transcriptome_fasta ])
input[1] = Channel.of([ [], [] ])
input[2] = Channel.of([ [], [] ])
input[3] = Channel.of([ [], [] ])
"""
}
}
74 changes: 49 additions & 25 deletions modules/nf-core/simpleaf/index/tests/main.nf.test.snap
@@ -1,24 +1,48 @@
{
"Homo sapiens - probe index - direct - probe csv": {
"content": [
[
"versions.yml:md5,d2fa3ef3c792f5bd01cf4e05866caceb"
]
],
"meta": {
"nf-test": "0.9.2",
"nextflow": "24.10.3"
},
"timestamp": "2025-02-04T20:19:57.286321286"
},
"Homo sapiens - transcriptome index - direct - transcriptome fasta": {
"content": [
[
"versions.yml:md5,bd96efe900339c637533c40b37fa5cfc"
"versions.yml:md5,d2fa3ef3c792f5bd01cf4e05866caceb"
]
],
"meta": {
"nf-test": "0.9.2",
"nextflow": "24.10.3"
},
"timestamp": "2025-02-04T20:15:47.68048376"
},
"Homo sapiens - feature index - direct - feature csv": {
"content": [
[
"versions.yml:md5,d2fa3ef3c792f5bd01cf4e05866caceb"
]
],
"meta": {
"nf-test": "0.9.2",
"nextflow": "24.10.3"
},
"timestamp": "2025-01-23T00:40:55.088252924"
"timestamp": "2025-02-04T20:20:04.705227496"
},
"Homo sapiens - transcriptome index - direct - transcriptome fasta - stub": {
"content": [
{
"0": [
[
[

],
{
"id": "stub"
},
[
"piscem_idx.ectab:md5,d41d8cd98f00b204e9800998ecf8427e",
"piscem_idx.sshash:md5,d41d8cd98f00b204e9800998ecf8427e",
@@ -28,9 +52,9 @@
],
"1": [
[
[

],
{
"id": "stub"
},
[
"roers_ref.fa:md5,d41d8cd98f00b204e9800998ecf8427e",
"t2g_3col.tsv:md5,d41d8cd98f00b204e9800998ecf8427e"
@@ -39,20 +63,20 @@
],
"2": [
[
[

],
{
"id": "stub"
},
"t2g_3col.tsv:md5,d41d8cd98f00b204e9800998ecf8427e"
]
],
"3": [
"versions.yml:md5,78f7da1109cf98d7b9107222704848e1"
"versions.yml:md5,70c02c3b43e257fa1abb47cb6879f6b7"
],
"index": [
[
[

],
{
"id": "stub"
},
[
"piscem_idx.ectab:md5,d41d8cd98f00b204e9800998ecf8427e",
"piscem_idx.sshash:md5,d41d8cd98f00b204e9800998ecf8427e",
@@ -62,9 +86,9 @@
],
"ref": [
[
[

],
{
"id": "stub"
},
[
"roers_ref.fa:md5,d41d8cd98f00b204e9800998ecf8427e",
"t2g_3col.tsv:md5,d41d8cd98f00b204e9800998ecf8427e"
@@ -73,33 +97,33 @@
],
"t2g": [
[
[

],
{
"id": "stub"
},
"t2g_3col.tsv:md5,d41d8cd98f00b204e9800998ecf8427e"
]
],
"versions": [
"versions.yml:md5,78f7da1109cf98d7b9107222704848e1"
"versions.yml:md5,70c02c3b43e257fa1abb47cb6879f6b7"
]
}
],
"meta": {
"nf-test": "0.9.2",
"nextflow": "24.10.3"
},
"timestamp": "2025-01-23T02:08:51.588975264"
"timestamp": "2025-02-04T20:20:12.01549861"
},
"Homo sapiens - genome index - expanded - fasta + gtf": {
"content": [
[
"versions.yml:md5,bd96efe900339c637533c40b37fa5cfc"
"versions.yml:md5,d2fa3ef3c792f5bd01cf4e05866caceb"
]
],
"meta": {
"nf-test": "0.9.2",
"nextflow": "24.10.3"
},
"timestamp": "2025-01-23T00:40:41.692166586"
"timestamp": "2025-02-04T20:15:30.566784151"
}
}
4 changes: 2 additions & 2 deletions modules/nf-core/simpleaf/quant/environment.yml
@@ -6,6 +6,6 @@ channels:

dependencies:
- bioconda::alevin-fry=0.11.1
- bioconda::piscem=0.11.0
- bioconda::piscem=0.12.2
- bioconda::salmon=1.10.3
- bioconda::simpleaf=0.18.4
- bioconda::simpleaf=0.19.1