Too many input files for MultiQC #100

Open
orzechoj opened this issue Oct 5, 2018 · 12 comments
Labels
bug Something isn't working

Comments


orzechoj commented Oct 5, 2018

I ran the RNA-seq pipeline on 360 samples, and the Slurm submission of the MultiQC process failed with "Pathname of a file, directory or other parameter too long":

ERROR ~ Error executing process > 'multiqc'
Caused by:
 Failed to submit process to grid scheduler for execution
Command executed:
 sbatch .command.run
Command exit status:
 1
Command output:
sbatch: error: Batch job submission failed: Pathname of a file, directory or other parameter too long

The files .command.stub and .command.sh look normal, but .command.run is 11 MB, with many commands for ln etc. So it might be related to this bug: https://bugs.schedmd.com/show_bug.cgi?id=2198
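For anyone wanting to check their own work directory, here is a minimal shell sketch that reports the wrapper size and the number of symlink commands in it. The function names are made up for illustration, and the pattern assumes each staged input appears on its own line starting with `ln`, which I have not verified across Nextflow versions.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, and the "ln ..." line
# format inside .command.run is an assumption.

script_size_bytes() {
    # Read via stdin so wc prints only the byte count, then strip padding
    wc -c < "$1" | tr -d ' '
}

count_stage_links() {
    # Count lines whose first word is "ln"; grep -c prints 0 when nothing
    # matches but exits non-zero, hence the "|| true"
    grep -c -E '^[[:space:]]*ln[[:space:]]' "$1" || true
}
```

Run from inside the failing task's work directory as `script_size_bytes .command.run` and `count_stage_links .command.run`.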

Member

ewels commented Oct 5, 2018

@pditommaso - have you come across problems like this before? I guess this is because the MultiQC process is soft-linking in a lot of files, which makes .command.run so large that Slurm rejects it.

ewels added the bug label Oct 5, 2018
Contributor

pditommaso commented Oct 5, 2018

Ouch, 11 MB of input files! You can mitigate this problem by using a directory as the output instead of individual files. I mean, instead of having

   output:
    file "*_fastqc.{zip,html}" into fastqc_results

let the process that feeds multiqc save the files into a directory, e.g. reports, then

   output:
    file "reports" into fastqc_results
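As a rough sketch of the shell side of this, the process script could gather its outputs into one directory before it finishes, so that only a single `reports` entry is declared as output and staged downstream. The helper name `collect_reports` is hypothetical, and the glob patterns simply mirror the `*_fastqc.{zip,html}` pattern above.

```shell
#!/usr/bin/env bash
# Sketch only: collect_reports is a made-up helper, not part of any pipeline.

collect_reports() {
    # Move FastQC-style outputs from the current directory into DIR ($1)
    mkdir -p "$1"
    for f in *_fastqc.zip *_fastqc.html; do
        # An unmatched glob stays literal, so skip names that do not exist
        [ -e "$f" ] && mv "$f" "$1"/
    done
    return 0
}
```

Calling `collect_reports reports` at the end of the task script would leave one directory for Nextflow to pick up. This only helps when many files come from the same task.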

Member

ewels commented Oct 5, 2018

Yes, maybe we should profile how many files each channel going into MultiQC has. I suspect that quite a few aren't needed. For example, MultiQC only needs the zip file here, not the html. So we could make new MultiQC-specific channels that carry just these files to cut down on the number.
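One quick way to get a first profile without touching the pipeline is to tally the staged inputs of a failed MultiQC task by file extension. This is a sketch under assumptions: the function name is invented, each staged file is assumed to be an `ln` line whose last field is the link name, and file names are assumed to contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch only: staged_by_extension is a hypothetical helper, and the
# "ln -s SOURCE LINKNAME" line format is an assumption.

staged_by_extension() {
    # For every "ln" line, take the link name (last field), keep the text
    # after its final dot, and print "COUNT EXTENSION" sorted by count
    awk '
        $1 == "ln" {
            n = split($NF, parts, /\./)
            ext = (n > 1) ? parts[n] : "(none)"
            count[ext]++
        }
        END { for (ext in count) print count[ext], ext }
    ' "$1" | sort -rn
}
```

Running `staged_by_extension .command.run` in the task's work directory would show, for instance, how many html files are being linked in alongside the zip files.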

ewels changed the title from "Error in MultiQC slurm submission" to "Too many input files for MultiQC" Oct 5, 2018
apeltzer added this to the 1.4 milestone Jul 11, 2019
@apeltzer
Member

I'm wondering whether @olgabot had issues with this when doing her large-scale nf-core/rnaseq experiments on AWS - any ideas?

apeltzer removed this from the 1.4 milestone Oct 2, 2019

ojziff commented May 12, 2020

I ran the RNA-seq pipeline on 576 FASTQ files, and the Slurm submission also failed on the multiqc process with the same error:
sbatch: error: Batch job submission failed: Pathname of a file, directory or other parameter too long
There is no .command.out in work/.
Is there any update on a workaround for this? Thank you.

Member

jfy133 commented Oct 16, 2020

FYI: a user just hit the same error in nf-core/eager when trying to run a 1000-sample job. If I understand the solution proposed above correctly, I don't think the directory output would necessarily work here, as most of the log files are standalone outputs from separate processes (rather than many logs from a single process).

@apeltzer
Member

Had that some days ago and opened nextflow-io/nextflow#2118 for some points

@ggabernet
Member

Just for the record, we've also had this issue now with nf-core/airrflow

@ssnn-airr

Re the nf-core/airrflow issue @ggabernet just mentioned: I can confirm the .command.run file size exceeds the SLURM max_script_size reported by scontrol show config. There are many rm and ln lines in the nxf_stage() section.
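The comparison can be scripted roughly as follows. This is a sketch, not a tested tool: the function names are hypothetical, the parsing assumes `max_script_size=<bytes>` appears somewhere in the `scontrol show config` output when it is set, and the 4 MiB fallback is my understanding of the Slurm default, so please check your own site's configuration.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are made up; the fallback value is an assumed
# Slurm default for when max_script_size is not set explicitly.
DEFAULT_MAX_SCRIPT_SIZE=4194304

slurm_max_script_size() {
    # Read "scontrol show config" text on stdin and print the limit in bytes
    limit=$(sed -n 's/.*max_script_size=\([0-9][0-9]*\).*/\1/p' | head -n 1)
    echo "${limit:-$DEFAULT_MAX_SCRIPT_SIZE}"
}

script_fits() {
    # Succeeds when FILE ($1) is no larger than LIMIT bytes ($2)
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -le "$2" ]
}
```

On a cluster this would be used as `limit=$(scontrol show config | slurm_max_script_size)` followed by `script_fits .command.run "$limit" || echo "too large for sbatch"`.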

@apeltzer
Member

The issue at Nextflow is still open, and the small-scale mitigation attempts did not help us permanently either. Maybe comment here too, to make sure this gets addressed soon 👉🏻 nextflow-io/nextflow#2852


m3hdad commented Oct 5, 2023

Same issue on nf-core/proteinfold when soft-linking mmcif_files: about 210342 lines of soft-linking?

Member

apeltzer commented Oct 6, 2023

Should be better when using nextflow-io/nextflow#2852
