endorspy erroring in nf-core/ender workflow #1083

Closed
5 tasks done
VerbalCant opened this issue Aug 19, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@VerbalCant

Check Documentation

I have checked the following places for your error:

Description of the bug

The endors.py step is failing in a way that seems to be related to program arguments. The nf-core/eager version in bin/endors.py, which is what is being called, is quite a bit different from the current release version of [endors.py](https://github.com/aidaanva/endorS.py), but I don't think that matters. Judging by the extra space between the -n and the _flagstat.stats in the command shown in the error output, my guess is that it's passing an empty value to --name/-n. If I look at the code here, it looks like it might have an empty/null library_id?
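To illustrate the guess above, here is a minimal sketch of how an empty value for -n can end in this kind of failure. The parser is a hypothetical stand-in reconstructed from the usage line in the error output, not the actual endorS.py code; the argument names `samplesfile`, `--output` and `--name` and the sample name `Momia3` are assumptions. Because the shell collapses the double space, `-n` consumes `_flagstat.stats` as its value and no positional stats file is left.

```python
import argparse
import shlex


def build_parser():
    # Hypothetical approximation of the endorS.py command line,
    # based only on the usage line printed in the error output.
    parser = argparse.ArgumentParser(prog="endorS.py")
    parser.add_argument("samplesfile", nargs="+", metavar="<samplesfile>.stats")
    parser.add_argument("-o", "--output", default="none")
    parser.add_argument("-n", "--name", default=None)
    return parser


def try_parse(command_line):
    # shlex.split mimics shell word splitting: an empty, unquoted
    # interpolated value simply disappears, leaving only extra spaces.
    argv = shlex.split(command_line)[1:]
    try:
        return build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse exits with status 2 on a usage error.
        return exc.code


# The command from the error report: -n swallows _flagstat.stats.
failed = try_parse("endorS.py -o json -n  _flagstat.stats")

# The same command with a non-empty (assumed) sample name parses cleanly.
parsed = try_parse("endorS.py -o json -n Momia3 _flagstat.stats")
```

In this sketch `failed` holds the exit status of the usage error, while `parsed` is a namespace with the name and the stats file in separate fields.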

I also note the earlier warnings about the _rmdup files, where the publish destination is given relative to root (/_rmdup.bam) instead of what I presume should be more like ./_rmdup.bam. I don't have any evidence that the two are related.

Steps to reproduce

Steps to reproduce the behaviour:

  1. Command line: NXF_VER=22.10.6 nextflow run nf-core/eager -profile docker --input '*Momia3_S2_L001_R1_001_merged.fastq.gz' --fasta /references/reference_genomes/hg38.analysisSet.fa --max_memory 60GB --max_cpus 16 --max_time 288.h -config nextflow_custom.config --run_genotyping true --save_reference --genotyping_tool 'freebayes' --run_mtnucratio true --run_sexdeterrmine true --run_nuclear_contamination true --single_end true --mapper bowtie2 -resume

It fails with this error the first time it hits endors.py, which (I can confirm) is the one called from the version in nf-core/eager/bin. It also fails on subsequent runs with -resume, and when I clean the directory, remove all subdirectories, and start again from scratch.

  2. See error:
-[nf-core/eager] Pipeline completed with errors-
WARN: Failed to publish file: /working/momia3_2024-08-19/work/7d/667071f6d11007a3023ad1269d2c06/_rmdup.bam; to: /_rmdup.bam [copy] -- See log file for details
WARN: Failed to publish file: /working/momia3_2024-08-19/work/7d/667071f6d11007a3023ad1269d2c06/_rmdup.metrics; to: /_rmdup.metrics [copy] -- See log file for details
WARN: Failed to publish file: /working/momia3_2024-08-19/work/7d/667071f6d11007a3023ad1269d2c06/_rmdup.bam.bai; to: /_rmdup.bam.bai [copy] -- See log file for details
WARN: Graphviz is required to render the execution DAG in the given format -- See http://www.graphviz.org for more info.
Error executing process > 'endorSpy ()'

Caused by:
  Process `endorSpy ()` terminated with an error exit status (2)

Command executed:

  endorS.py -o json -n  _flagstat.stats

Command exit status:
  2

Command output:
  (empty)

Command error:
  usage: python endorS.py [-h] [--version] [-o json,none] [-n] <samplesfile>.stats [<samplesfile>.stats]
  endorS.py: error: the following arguments are required: <samplefile>.stats

Work dir:
  /working/momia3_2024-08-19/work/12/1bfe7d3c613136f7be70fffcd8377d

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

Expected behaviour

I expected it to proceed through the endorS.py step and move on to the next step of the workflow.

Log files

Have you provided the following extra information/files:

  • The command used to run the pipeline

  • The .nextflow.log file: nextflow.log

  • The exact error:

Command error:
  usage: python endorS.py [-h] [--version] [-o json,none] [-n] <samplesfile>.stats [<samplesfile>.stats]
  endorS.py: error: the following arguments are required: <samplefile>.stats

System

  • Hardware: 16-core i7 desktop, 64GB
  • Executor: local
  • OS: Ubuntu
  • Version 22.04 LTS

Nextflow Installation

  • Version: 22.10.6

Container engine

  • Engine: docker
  • version: Docker version 27.0.3, build 7d4bcd8
  • Image tag:

Additional context

The reads are from a paired-end (PE) aDNA run that was merged into a single fastq using NGmerge.

I can confirm that the working directory contains the symlink to the _flagstat.stats:

╭─    /working/momia3_2024-08-19/work/12/1bfe7d3c613136f7be70fffcd8377d 
╰─ ls -l
total 8
lrwxrwxrwx 1 a a 81 Aug 19 17:05 _flagstat.stats -> /working/momia3_2024-08-19/work/3c/cc266703cd45f183eb1870a02677db/_flagstat.stats
lrwxrwxrwx 1 a a 69 Aug 19 17:05 nf-core_eager_dummy.txt -> /home/a/.nextflow/assets/nf-core/eager/assets/nf-core_eager_dummy.txt

The .nextflow.log file: nextflow.log

@VerbalCant VerbalCant added the bug Something isn't working label Aug 19, 2024
@VerbalCant VerbalCant changed the title from "endorspy erroring in pipeline" to "endorspy erroring in nf-core/ender workflow" Aug 19, 2024
@jfy133
Member

jfy133 commented Aug 20, 2024

Pinging @aidaanva

@aidaanva
Contributor

Hi @VerbalCant

Looking into the nextflow.log that you provided, it seems that nf-core/eager does not correctly parse your input files into a tsv file. The names get shortened to only "_L0" for the initial steps (fastqc, adapter removal), and then, when the bam files are generated, the file name is further shortened to "". That's why you are seeing this error:

WARN: Failed to publish file: /working/momia3_2024-08-19/work/7d/667071f6d11007a3023ad1269d2c06/_rmdup.bam; to: /_rmdup.bam [copy] -- See log file for details
WARN: Failed to publish file: /working/momia3_2024-08-19/work/7d/667071f6d11007a3023ad1269d2c06/_rmdup.metrics; to: /_rmdup.metrics [copy] -- See log file for details
WARN: Failed to publish file: /working/momia3_2024-08-19/work/7d/667071f6d11007a3023ad1269d2c06/_rmdup.bam.bai; to: /_rmdup.bam.bai [copy] -- See log file for details

Since no files are produced, endorspy cannot be run, and I think this is why the workflow showed you this error.

My recommendation is that you provide your input as a tsv, since this is a safer way that ensures the pipeline extracts the correct sample and library names. You can find documentation on how to set up the tsv for nf-core/eager here: https://nf-co.re/eager/2.5.2/docs/usage/#tsv-input-method
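As a rough sketch of what such a tsv could look like for a single-end library, the snippet below writes one row. The column order follows the nf-core/eager 2.x TSV input format as described in the linked documentation, so please check it against that page. Every value in the row (sample name, library ID, colour chemistry, strandedness, UDG treatment and the FASTQ path) is a placeholder assumption, not taken from the original run.

```python
import csv
import io

# Column order as described in the nf-core/eager 2.x TSV input docs;
# verify against the linked documentation before use.
COLUMNS = [
    "Sample_Name", "Library_ID", "Lane", "Colour_Chemistry", "SeqType",
    "Organism", "Strandedness", "UDG_Treatment", "R1", "R2", "BAM",
]


def make_row(sample, library, lane, r1_path):
    # All metadata values below are placeholders (assumptions) and must be
    # replaced with the real properties of the library.
    return {
        "Sample_Name": sample,
        "Library_ID": library,
        "Lane": lane,
        "Colour_Chemistry": 2,       # assumed sequencer chemistry
        "SeqType": "SE",             # merged reads supplied as single-end
        "Organism": "Homo sapiens",
        "Strandedness": "double",    # assumed library type
        "UDG_Treatment": "none",     # assumed treatment
        "R1": r1_path,
        "R2": "NA",                  # no second read file for SE input
        "BAM": "NA",                 # FASTQ input, so no BAM
    }


def render_tsv(rows):
    # Write a header plus one tab-separated line per library.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample, library and path names, for illustration only.
tsv_text = render_tsv([
    make_row("Momia3", "Momia3_S2", 1, "/data/Momia3_merged.fastq.gz"),
])
```

Writing `tsv_text` to a file and passing that file to `--input` in place of the wildcard pattern gives the pipeline explicit sample and library names instead of names derived from the file name.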

Let me know whether using the TSV input method solves the issue and if not I will take another look.

@VerbalCant
Author

Hey @aidaanva thanks for the quick reply! I can confirm that using the tsv input method, as described in the docs you linked, resolves this. Closing the issue, and hoping it helps somebody else in the future!
