java.nio.file.FileSystemException No space left on device #1400

Closed
fjosefdz opened this issue Oct 8, 2024 · 3 comments
Labels
bug Something isn't working

Comments


fjosefdz commented Oct 8, 2024

Description of the bug

Hi, I am using the pipeline on my university's HPC with Singularity and SLURM as the queue manager. I have been running the pipeline on a large cohort of bulk RNA-seq samples, using the following code in a script that I submit through SLURM:

It has succeeded for some of the samples, but recently I have been having problems related to space in the temporary directory. The errors occur with FastQC and with Picard MarkDuplicates, and they are as follows:

Why, after specifying the temporary directory and checking that there is more than enough space in that directory, do I get this error related to temporary file space?

I would appreciate any help, thank you very much.


Command used and terminal output

#!/bin/bash
#SBATCH --job-name=nfcore_rnaseq
#SBATCH --partition cpu # partition (queue) where the job will be submitted
#SBATCH --nodes 1
#SBATCH --tasks-per-node 2
#SBATCH --time 240:00:00 # maximum time the job is allowed to run
#SBATCH --mem 12G # amount of memory requested for the job
#SBATCH --output /home/user/rnaseq_processing/bulk_rnaseq_project_%j.out # file where the standard output (stdout) of the job will be written
#SBATCH --error /home/user/rnaseq_processing/bulk_rnaseq_project_%j.err # file where the standard error (stderr) of the job will be written

cd /home/user/rnaseq_processing/

source /home/conda/miniconda3/bin/activate base
conda activate /home/gpfs/user1/.conda/envs/nextflow_env

export TMPDIR='/home/user/rnaseq_processing/temporary_dir/'
export NXF_SINGULARITY_CACHEDIR='/home/user/rnaseq_processing/nf-core_rnaseq_singularity/'
export NXF_TMPDIR='/home/user/rnaseq_processing/temporary_dir/'
export SINGULARITY_TMPDIR='/home/user/rnaseq_processing/temporary_dir/'

ulimit -n 8192

nextflow run nf-core/rnaseq \
    --input /home/user/rnaseq_processing/samplesheet_bulk_rnaseq_project.csv \
    --outdir /home/user/rnaseq_processing/output_bulk_rnaseq_project/ \
    --aligner 'star_salmon' \
    -profile singularity \
    --max_memory '100.GB' \
    --max_cpus 16 \
    --max_time '240.h' \
    -r 3.16.0 \
    --fasta /home/user/rnaseq_processing/star/GRCh38.primary_assembly.genome.fa \
    --gtf /home/user/rnaseq_processing/star/gencode.v46.annotation.gtf \
    --star_index /home/user/rnaseq_processing/saved_genome/index/star/ \
    --salmon_index /home/user/rnaseq_processing/saved_genome/index/salmon/ \
    --gene_bed /home/user/rnaseq_processing/saved_genome/GRCh38.primary_assembly.genome.filtered.bed \
    --skip_deseq2_qc \
    -work-dir /home/user/rnaseq_processing/work_dir_bulk_rnaseq_project/ \
    -c /home/user/rnaseq_processing/custom.config \
    --trimmer fastp

ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES (sample1)'

Caused by:
Process NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES (sample1) terminated with an error exit status (1)

Command executed:

picard \
    -Xmx81920M \
    MarkDuplicates \
    --ASSUME_SORTED true --REMOVE_DUPLICATES false --VALIDATION_STRINGENCY LENIENT --TMP_DIR tmp \
    --INPUT sample1.sorted.bam \
    --OUTPUT sample1.markdup.sorted.bam \
    --REFERENCE_SEQUENCE GRCh38.primary_assembly.genome.fa \
    --METRICS_FILE sample1.markdup.sorted.MarkDuplicates.metrics.txt

cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES":
picard: $(echo $(picard MarkDuplicates --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d:)
END_VERSIONS

Command exit status:
1

Command output:
(empty)

Command error:
/usr/local/bin/picard: line 5: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory
Oct 08, 2024 8:13:15 AM com.intel.gkl.NativeLibraryLoader load
INFO: Loading libgkl_compression.so from jar:file:/usr/local/share/picard-3.1.1-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 08, 2024 8:13:15 AM com.intel.gkl.NativeLibraryLoader load
WARNING: Unable to load libgkl_compression.so from native/libgkl_compression.so (No space left on device)
Oct 08, 2024 8:13:15 AM com.intel.gkl.NativeLibraryLoader load
INFO: Loading libgkl_compression.so from jar:file:/usr/local/share/picard-3.1.1-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 08, 2024 8:13:15 AM com.intel.gkl.NativeLibraryLoader load
WARNING: Unable to load libgkl_compression.so from native/libgkl_compression.so (No space left on device)
[Tue Oct 08 08:13:15 GMT 2024] MarkDuplicates --INPUT sample1.sorted.bam --OUTPUT sample1.markdup.sorted.bam --METRICS_FILE sample1.markdup.sorted.MarkDuplicates.metrics.txt --REMOVE_DUPLICATES false --ASSUME_SORTED true --TMP_DIR tmp --VALIDATION_STRINGENCY LENIENT --REFERENCE_SEQUENCE GRCh38.primary_assembly.genome.fa --MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP 50000 --MAX_FILE_HANDLES_FOR_READ_ENDS_MAP 8000 --SORTING_COLLECTION_SIZE_RATIO 0.25 --TAG_DUPLICATE_SET_MEMBERS false --REMOVE_SEQUENCING_DUPLICATES false --TAGGING_POLICY DontTag --CLEAR_DT true --DUPLEX_UMI false --FLOW_MODE false --FLOW_QUALITY_SUM_STRATEGY false --USE_END_IN_UNPAIRED_READS false --USE_UNPAIRED_CLIPPED_END false --UNPAIRED_END_UNCERTAINTY 0 --FLOW_SKIP_FIRST_N_FLOWS 0 --FLOW_Q_IS_KNOWN_END false --FLOW_EFFECTIVE_QUALITY_THRESHOLD 15 --ADD_PG_TAG_TO_READS true --DUPLICATE_SCORING_STRATEGY SUM_OF_BASE_QUALITIES --PROGRAM_RECORD_ID MarkDuplicates --PROGRAM_GROUP_NAME MarkDuplicates --READ_NAME_REGEX <optimized capture of last three ':' separated fields as numeric values> --OPTICAL_DUPLICATE_PIXEL_DISTANCE 100 --MAX_OPTICAL_DUPLICATE_SET_SIZE 300000 --VERBOSITY INFO --QUIET false --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
[Tue Oct 08 08:13:15 GMT 2024] Executing as user1@sy096 on Linux 6.7.12+bpo-amd64 amd64; OpenJDK 64-Bit Server VM 21.0.1-internal-adhoc.conda.src; Deflater: Jdk; Inflater: Jdk; Provider GCS is available; Picard version: Version:3.1.1
INFO 2024-10-08 08:13:15 MarkDuplicates Start of doWork freeMemory: 456914008; totalMemory: 469762048; maxMemory: 85899345920
INFO 2024-10-08 08:13:15 MarkDuplicates Reading input file and constructing read end information.
INFO 2024-10-08 08:13:15 MarkDuplicates Will retain up to 311229514 data points before spilling to disk.
Oct 08, 2024 8:13:17 AM com.intel.gkl.compression.IntelInflaterFactory makeInflater
WARNING: IntelInflater is not supported, using Java.util.zip.Inflater
Oct 08, 2024 8:13:17 AM com.intel.gkl.compression.IntelInflaterFactory makeInflater
WARNING: IntelInflater is not supported, using Java.util.zip.Inflater
[Tue Oct 08 08:13:17 GMT 2024] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=8690597888
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: Exception creating temporary directory.
at htsjdk.samtools.util.IOUtil.createTempDir(IOUtil.java:1018)
at htsjdk.samtools.CoordinateSortedPairInfoMap.<init>(CoordinateSortedPairInfoMap.java:59)
at picard.sam.markduplicates.util.DiskBasedReadEndsForMarkDuplicatesMap.<init>(DiskBasedReadEndsForMarkDuplicatesMap.java:57)
at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:513)
at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:270)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:280)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:105)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:115)
Caused by: java.nio.file.FileSystemException: /tmp/CSPI.tmp16480539071667820001: No space left on device
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:438)
at java.base/java.nio.file.Files.createDirectory(Files.java:699)
at java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:134)
at java.base/java.nio.file.TempFileHelper.createTempDirectory(TempFileHelper.java:171)
at java.base/java.nio.file.Files.createTempDirectory(Files.java:1017)
at htsjdk.samtools.util.IOUtil.createTempDir(IOUtil.java:1016)
... 7 more

Work dir:
/home/user/rnaseq_processing/work_dir/f5/70ee649e43f3e43cc96e70c33b528e

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

-- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

-- Check '.nextflow.log' file for details

WARN: Killing running tasks (100)

Relevant files

No response

System information

No response

fjosefdz added the bug label on Oct 8, 2024
@pinin4fjords
Member

Can I ask if you checked the temp space on the compute node itself, rather than just from the head node, for example?
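
(For reference, a minimal way to check this from the login node, assuming your site allows short interactive srun commands; the node name sy096 and the cpu partition are taken from the log above and are only illustrative:)

# Check free space on a specific compute node's local /tmp
srun --partition=cpu --nodelist=sy096 df -h /tmp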

@fjosefdz
Author

Hi, I found out what the problem was. My university's HPC automatically uses a local temporary folder on every node for the jobs running on that node, and it sometimes fills up depending on the traffic. I redirected the temp folder of the nf-core/rnaseq processes that were failing via the custom config file:

process {
    withName: 'PICARD_MARKDUPLICATES' {
        ext.args = "--TMP_DIR /home/user/temporary_dir/"
    }

    withName: 'FASTQC_TRIM' {
        ext.args = "--dir /home/user/temporary_dir/"
    }

    withName: 'FASTQC_RAW' {
        ext.args = "--dir /home/user/temporary_dir/"
    }
}

And so far I have not had the problem again.
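
As a quick sanity check (a minimal sketch; the work directory path is the one from the error above, and the grep pattern is only illustrative), you can inspect the rendered .command.sh of a task to confirm the extra arguments were actually picked up:

# Show the command Nextflow ran for a given task and check the temp-dir flags
cd /home/user/rnaseq_processing/work_dir/f5/70ee649e43f3e43cc96e70c33b528e
grep -E 'TMP_DIR|--dir' .command.sh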

@pinin4fjords
Member

Most (not all) processes will respect the value of TMPDIR if you reset it globally; that might help you head off future issues.
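
For example, a minimal custom.config sketch (the path is a placeholder taken from the script above; the Nextflow env scope exports the variable into each task's environment, and the extra bind mount is only needed if Singularity does not already mount that path inside the container):

// Export TMPDIR into every task's environment so tools that honour it
// write scratch files to the shared directory instead of node-local /tmp
env {
    TMPDIR = '/home/user/rnaseq_processing/temporary_dir'
}

// Make the directory visible inside the Singularity containers
// (only needed if it is not auto-mounted on your system)
singularity {
    runOptions = '-B /home/user/rnaseq_processing/temporary_dir'
}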
