Comparison with samtools

pjotrp edited this page Aug 7, 2012 · 9 revisions

Functionality

Viewing SAM/BAM

| Feature | sambamba view | samtools view | Notes |
|---|---|---|---|
| BAM support | Full | Full | |
| SAM support | Full | Full | sambamba skips syntactically invalid tags and sets invalid fields to default values |
| Error messages | Descriptive | Incomplete | Where samtools reports only 'truncated file', sambamba prints a detailed error message describing what is wrong with the BAM file |
| Multithreaded BAM decompression | Yes | No | |
| Non-seekable file stream support | No | Yes | |
| Skipping invalid reads | Optional | No | The sambamba library also includes a module for creating custom validation tools |
| Filtering | Powerful | Limited | sambamba view comes with a simple query language for filtering alignments |
| JSON output | Yes | No | Useful for interacting with scripting languages |
| Progress bar | Optional | No | |
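
To illustrate how JSON output might be consumed from a scripting language, here is a minimal Python sketch. It assumes that sambamba view can emit one JSON object per alignment per line, and that each object carries SAM-like keys such as "qname" and "mapq"; both the layout and the key names are assumptions, not something this page specifies. The sample records are hypothetical and are not real sambamba output.

```python
import json
from typing import Iterable, Iterator


def parse_json_alignments(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line (assumes one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_high_quality(lines: Iterable[str], min_mapq: int = 50) -> int:
    """Count records whose (assumed) "mapq" key is at least min_mapq."""
    return sum(
        1
        for record in parse_json_alignments(lines)
        if record.get("mapq", 0) >= min_mapq
    )


# Hypothetical sample records for illustration only.
SAMPLE = [
    '{"qname": "read1", "flag": 99, "rname": "20", "pos": 10000001, "mapq": 60}',
    '{"qname": "read2", "flag": 147, "rname": "20", "pos": 10000250, "mapq": 37}',
    "",
]

if __name__ == "__main__":
    print(count_high_quality(SAMPLE))
```

In practice the lines would come from the standard output of a sambamba view process rather than from a hard-coded list; the parsing loop stays the same.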

Other tools

| Feature | sambamba | samtools |
|---|---|---|
| Indexing | Yes, multithreaded | Yes, single-threaded |
| Merging BAM files | Yes, multithreaded decompression and compression | Yes, compression is multithreaded |
| Automatic SAM header merging | Yes | No |
| Multithreaded BAM file external sort | Yes | Yes |
| Flag statistics | Yes, multithreaded | Yes, single-threaded |

(Other utilities available in samtools are not implemented in sambamba.)

Performance

Here are some benchmarks on two configurations:

  • Intel Atom N450 @ 1.66GHz (1 core with hyperthreading), 1GB of RAM
  • 2x Intel Xeon E5310 @ 1.60GHz (8 cores without hyperthreading), 8GB of RAM

On both machines, sambamba was built with the GDC compiler (which is used for building the Debian packages), and samtools was built with its default makefile using gcc -O2.

The file being fully read from disk (empty file cache):

| Command line | Time (1 core) | Max. memory used (1 core) | CPU load (1 core) | Time (8 cores) | Max. memory used (8 cores) | CPU load (8 cores) |
|---|---|---|---|---|---|---|
| `sambamba index HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam` | 12.29s | 32MB | 147% | 6.96s | 32MB | 139% |
| `samtools index HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam` | 13.60s | 1.4MB | 92% | 8.73s | 1.4MB | 93% |
| `sambamba view -f bam HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam 20:10,000,000-20,000,000 -F "mapping_quality >= 50" -o test.bam` | 22.96s | 90MB | 98% | 5.24s | 90MB | 250% |
| `samtools view -b HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam 20:10,000,000-20,000,000 -q50 -o test.bam` | 23.12s | 1.8MB | 96% | 10.83s | 1.8MB | 98% |

The file fully cached in RAM:

| Command line | Time (1 core) | Max. memory used (1 core) | CPU load (1 core) | Time (8 cores) | Max. memory used (8 cores) | CPU load (8 cores) |
|---|---|---|---|---|---|---|
| `sambamba index HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam` | 9.43s | 32MB | 188% | 2.21s | 32MB | 433% |
| `samtools index HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam` | 12.08s | 1.4MB | 99% | 7.98s | 1.4MB | 100% |
| `sambamba view HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam -c -F "[RG] == 'ERR016156' and proper_pair and first_of_pair and not duplicate" 20:1000000-3000000` | 0.53s | 50MB | 144% | 0.20s | 50MB | 208% |
| `samtools view HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam -c -r 'ERR016156' -f66 -F1024 20:1000000-3000000` | 0.42s | 1.3MB | 99% | 0.27s | 1.5MB | 100% |
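
The last pair of commands expresses the same flag filter in two ways: sambamba spells it out as `proper_pair and first_of_pair and not duplicate`, while samtools uses the numeric masks `-f66 -F1024`. The following Python sketch shows why these agree, using the flag bit values from the SAM format specification (0x2 + 0x40 = 66, 0x400 = 1024). The function names are hypothetical and the code mirrors only the flag logic, not the read group condition or either tool's actual implementation.

```python
# Flag bits as defined in the SAM format specification.
PROPER_PAIR = 0x2     # each segment properly aligned
FIRST_OF_PAIR = 0x40  # first segment in the template
DUPLICATE = 0x400     # PCR or optical duplicate


def matches_numeric_masks(flag: int, required: int = 66, excluded: int = 1024) -> bool:
    """Mimic `-f REQUIRED -F EXCLUDED`: all required bits set, no excluded bit set."""
    return (flag & required) == required and (flag & excluded) == 0


def matches_named_filter(flag: int) -> bool:
    """Mimic `proper_pair and first_of_pair and not duplicate`."""
    return (
        bool(flag & PROPER_PAIR)
        and bool(flag & FIRST_OF_PAIR)
        and not (flag & DUPLICATE)
    )


if __name__ == "__main__":
    # The two formulations agree on every possible 12-bit flag value.
    assert all(
        matches_numeric_masks(f) == matches_named_filter(f) for f in range(1 << 12)
    )
    print(matches_named_filter(99), matches_named_filter(99 | DUPLICATE))
```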