scrnaseq will not recognize my genome file path #339

Open
jjovelc opened this issue Jun 20, 2024 · 2 comments

jjovelc commented Jun 20, 2024

Description of the bug

I am trying to run the scrnaseq Nextflow workflow, but it fails with the following error message:

ERROR ~ Error executing process > 'NFCORE_SCRNASEQ:SCRNASEQ:GTF_GENE_FILTER ([])'

Caused by:
Process NFCORE_SCRNASEQ:SCRNASEQ:GTF_GENE_FILTER ([]) terminated with an error exit status (2)

Command executed:

filter_gtf_for_genes_in_genome.py
--gtf gencode.v32.primary_assembly.annotation.gtf
--fasta
-o []_genes.gtf
cat <<-END_VERSIONS > versions.yml
"NFCORE_SCRNASEQ:SCRNASEQ:GTF_GENE_FILTER":
python: $(python --version | sed 's/Python //g')
END_VERSIONS

Command exit status:
2

Command output:
(empty)

Command error:
WARNING: DEPRECATED USAGE: Environment variable SINGULARITYENV_TMPDIR will not be supported in the future, use APPTAINERENV_TMPDIR instead
WARNING: DEPRECATED USAGE: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR will not be supported in the future, use APPTAINERENV_NXF_TASK_WORKDIR instead
WARNING: DEPRECATED USAGE: Environment variable SINGULARITYENV_NXF_DEBUG will not be supported in the future, use APPTAINERENV_NXF_DEBUG instead
usage: filter_gtf_for_genes_in_genome.py [-h] [--gtf GTF] [--fasta FASTA]
[-o OUTPUT]
filter_gtf_for_genes_in_genome.py: error: argument --fasta: expected one argument

Command used and terminal output

No response

Relevant files

#!/usr/bin/bash

eval "$(conda shell.bash hook)"
conda activate nextflow

export DIR=$(pwd)
export JAVA_HOME="/home/juan.jovel/mambaforge/envs/nextflow"
export PATH="$JAVA_HOME/bin:$PATH"

REFS="/work/vetmed_data/jj/db/ensembl/GRCh38/reference_sources"

# Run the Nextflow pipeline with the specified configuration and input
nextflow run /work/vetmed_data/jj/projects/juanJovel/pipelines/nextflow/scrnaseq/scrnaseq/main.nf \
    -profile singularity \
    -c "${DIR}/jj_arc_scrnaseq.config" \
    --input "${DIR}/samplesheet.csv" \
    --genome_fasta "${REFS}/Homo_sapiens.GRCh38.dna.primary_assembly.fa" \
    --gtf "${REFS}/gencode.v32.primary_assembly.annotation.gtf" \
    --outdir "${DIR}/results" \
    --aligner alevin \
    --protocol 10XV2 \
    -resume

I traced the problem to the .command.sh file in the task's working directory. The path to my genome FASTA file is missing after --fasta:

#!/bin/bash -euo pipefail
filter_gtf_for_genes_in_genome.py \
    --gtf gencode.v32.primary_assembly.annotation.gtf \
    --fasta  \
    -o []_genes.gtf
cat <<-END_VERSIONS > versions.yml
"NFCORE_SCRNASEQ:SCRNASEQ:GTF_GENE_FILTER":
    python: $(python --version | sed 's/Python //g')
END_VERSIONS
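
The same symptom can be looked for mechanically. The following is a minimal sketch, not part of the pipeline or of Nextflow: `find_empty_flags` is a hypothetical helper name, and it simply greps a script for long options that are followed by nothing except whitespace and an optional line-continuation backslash. Boolean switches that legitimately take no value would match as well, so hits are only candidates to inspect.

```shell
# Print (with line numbers) every line of the given script where a long
# option such as --fasta has no value after it.
# find_empty_flags is a hypothetical helper name used for illustration.
find_empty_flags() {
    grep -nE '^[[:space:]]*--[[:alnum:]_-]+[[:space:]]*\\?[[:space:]]*$' "$1"
}

# Example, run from inside the failed task's work directory:
#   find_empty_flags .command.sh
```

Run against the .command.sh shown above, this should flag only the `--fasta` line, since `--gtf` is followed by a file name and `-o` is a short option.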

System information

My Nextflow version:

N E X T F L O W
  version 23.04.2 build 5870
  created 08-06-2023 08:29 UTC (02:29 MDT)
  cite doi:10.1038/nbt.3820
  http://nextflow.io

My config file:

params {
    outdir = "${baseDir}/results"
    input = "${baseDir}/samplesheet.csv"
    genome_fasta = "/work/vetmed_data/jj/db/ensembl/GRCh38/reference_sources/Homo_sapiens.GRCh38.dna.primary_assembly.fa"
    gtf = "/work/vetmed_data/jj/db/ensembl/GRCh38/reference_sources/gencode.v32.primary_assembly.annotation.gtf"
    aligner = "alevin"
    protocol = "10XV2"
}

singularity {
    enabled = true
}

process {
    executor = 'slurm'
    memory = '128 GB'
    cpus = 24
    time = '48h'
}

singularity {
    enabled = true
    autoMounts = true
}

docker {
    enabled = false
}

timeline {
    enabled = true
    file = "${params.outdir}/pipeline_timeline.html"
    overwrite = true
}

report {
    enabled = true
    file = "${params.outdir}/pipeline_report.html"
    overwrite = true
}

trace {
    enabled = true
    file = "${params.outdir}/pipeline_trace.txt"
    overwrite = true
}

params {
    max_cpus = 24
    max_memory = '128 GB'
}

executor {
    queueSize = 100
    maxForks = 4
}

workDir = '/work/vetmed_data/jj/projects/juanJovel/pipelines/nextflow/scrnaseq'
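
Note that this config sets the same `genome_fasta` key that the resolution reported later in this thread identifies as the wrong parameter name. Because a command-line option `--name` in Nextflow sets `params.name`, the config key presumably needs the same rename to `fasta`. A minimal sketch of that rename follows; `rename_fasta_param` is a hypothetical helper name and the output file name in the usage comment is only an example.

```shell
# Read a Nextflow config on stdin and write it to stdout with the
# genome_fasta key renamed to fasta. All other lines pass through unchanged.
# rename_fasta_param is a hypothetical helper name used for illustration.
rename_fasta_param() {
    sed -E 's/^([[:space:]]*)genome_fasta([[:space:]]*=)/\1fasta\2/'
}

# Example (the output file name is an assumption):
#   rename_fasta_param < jj_arc_scrnaseq.config > jj_arc_scrnaseq.fixed.config
```

Writing to a new file rather than editing in place keeps the original config available for comparison.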

jjovelc added the bug label Jun 20, 2024

jjovelc commented Aug 15, 2024

We solved the issue. The GitHub page shows the flag for passing the genome file to the pipeline as --genome_fasta GRCm38.p6.genome.chr19.fa. This is wrong; the correct way to pass the genome file is --fasta GRCm38.p6.genome.chr19.fa.
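
Applied to the run script from the original report, the fix might look like the sketch below. Only the `--genome_fasta` flag is changed to `--fasta`; every path and option is copied from the script above. The `run_scrnaseq` function name and the `NEXTFLOW` override variable are assumptions added for this sketch so that the command can be printed without launching anything.

```shell
DIR=$(pwd)
REFS="/work/vetmed_data/jj/db/ensembl/GRCh38/reference_sources"

# NEXTFLOW is a hypothetical override, not a Nextflow feature: set it to
# "echo" to print the command instead of running the pipeline.
run_scrnaseq() {
    "${NEXTFLOW:-nextflow}" run /work/vetmed_data/jj/projects/juanJovel/pipelines/nextflow/scrnaseq/scrnaseq/main.nf \
        -profile singularity \
        -c "${DIR}/jj_arc_scrnaseq.config" \
        --input "${DIR}/samplesheet.csv" \
        --fasta "${REFS}/Homo_sapiens.GRCh38.dna.primary_assembly.fa" \
        --gtf "${REFS}/gencode.v32.primary_assembly.annotation.gtf" \
        --outdir "${DIR}/results" \
        --aligner alevin \
        --protocol 10XV2 \
        -resume
}

# Dry run:  NEXTFLOW=echo; run_scrnaseq
# Launch:   run_scrnaseq
```

Parameter names can differ between pipeline releases, so it is worth checking the parameter documentation of the release actually in use before relying on this.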

jjovelc closed this as completed Aug 15, 2024

grst (Member) commented Aug 16, 2024

Thanks for reporting back! Let's keep this issue open then as a reminder to fix the documentation :)

grst reopened this Aug 16, 2024
grst added the documentation label and removed the bug label Aug 16, 2024