Comparison with samtools
pjotrp edited this page Aug 7, 2012 · 9 revisions
Feature | sambamba view | samtools view | Notes |
---|---|---|---|
BAM support | Full | Full | |
SAM support | Full | Full | sambamba skips syntactically invalid tags and sets invalid fields to default values |
Error messages | Descriptive | Incomplete | Where samtools reports only 'truncated file', sambamba prints a detailed error message describing what is wrong with the BAM file |
Multithreaded BAM decompression | Yes | No | |
Non-seekable file stream support | No | Yes | |
Skipping invalid reads | Optional | No | Sambamba library also includes a module for creating custom validation tools |
Filtering | Powerful | Limited | sambamba view comes with a simple query language for filtering alignments |
JSON output | Yes | No | useful for interacting with scripting languages |
Progress bar | Optional | No | |
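
As an illustration of why JSON output is convenient for scripting languages, here is a minimal Python sketch. It assumes the output contains one JSON object per alignment, one per line. The sample records and the field names (`qname`, `rname`, `pos`, `mapq`) are hypothetical and chosen only for this example, so check them against the actual output of your sambamba version.

```python
import json

# Hypothetical sample of JSON output: one alignment record per line.
# The field names ("qname", "rname", "pos", "mapq") are assumptions made
# for illustration, not a documented schema.
SAMPLE_OUTPUT = """\
{"qname": "read1", "rname": "20", "pos": 1000100, "mapq": 60}
{"qname": "read2", "rname": "20", "pos": 1000250, "mapq": 12}
{"qname": "read3", "rname": "20", "pos": 1000400, "mapq": 55}
"""


def parse_records(text):
    """Yield one dict per non-empty line of line-delimited JSON text."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            yield json.loads(line)


def high_quality_names(records, min_mapq=50):
    """Return the names of reads whose mapping quality is at least min_mapq."""
    return [r["qname"] for r in records if r["mapq"] >= min_mapq]


print(high_quality_names(parse_records(SAMPLE_OUTPUT)))
```

In practice the text would come from a pipe or a file rather than an embedded string; the parsing loop stays the same.
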
Feature | sambamba | samtools |
---|---|---|
Indexing | Yes, multithreaded | Yes, single-threaded |
Merging BAM files | Yes, multithreaded decompression and compression | Yes, compression is multithreaded |
Automatic SAM header merging | Yes | No |
Multithreaded BAM file external sort | Yes | Yes |
Flag statistics | Yes, multithreaded | Yes, single-threaded |
(Other utilities available in samtools are not implemented in sambamba.) | | |
Here are some benchmarks on two configurations:
- Intel Atom N450 @ 1.66GHz (1 core with hyperthreading), 1GB of RAM
- 2x Intel Xeon E5310 @ 1.60GHz (8 cores without hyperthreading), 8GB of RAM
On both machines, sambamba was built with the GDC compiler (which is used for building the Debian packages), and samtools was built with its default makefile using gcc -O2.
Command line | 1 core: time | 1 core: max. memory used | 1 core: CPU load | 8 cores: time | 8 cores: max. memory used | 8 cores: CPU load |
---|---|---|---|---|---|---|
File fully read from disk (empty file cache) | | | | | | |
sambamba index HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam | 12.29s | 32MB | 147% | 6.96s | 32MB | 139% |
samtools index HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam | 13.60s | 1.4MB | 92% | 8.73s | 1.4MB | 93% |
sambamba view -f bam HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam 20:10,000,000-20,000,000 -F "mapping_quality >= 50" -o test.bam | 22.96s | 90MB | 98% | 5.24s | 90MB | 250% |
samtools view -b HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam 20:10,000,000-20,000,000 -q50 -o test.bam | 23.12s | 1.8MB | 96% | 10.83s | 1.8MB | 98% |
File fully cached in RAM | | | | | | |
sambamba index HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam | 9.43s | 32MB | 188% | 2.21s | 32MB | 433% |
samtools index HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam | 12.08s | 1.4MB | 99% | 7.98s | 1.4MB | 100% |
sambamba view HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam -c -F "[RG] == 'ERR016156' and proper_pair and first_of_pair and not duplicate" 20:1000000-3000000 | 0.53s | 50MB | 144% | 0.20s | 50MB | 208% |
samtools view HG00125.chrom20.ILLUMINA.bwa.GBR.low_coverage.20111114.bam -c -r 'ERR016156' -f66 -F1024 20:1000000-3000000 | 0.42s | 1.3MB | 99% | 0.27s | 1.5MB | 100% |
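
The last two commands are meant to select the same reads: the sambamba filter `proper_pair and first_of_pair and not duplicate` corresponds to the numeric samtools options `-f66 -F1024`, and `[RG] == 'ERR016156'` corresponds to `-r 'ERR016156'`. The Python sketch below decodes the numeric masks using the flag bits from the SAM format specification. Only `proper_pair`, `first_of_pair` and `duplicate` are names taken from the command above; the other labels are descriptive names chosen for this example and are not necessarily sambamba keywords.

```python
# SAM FLAG bits as defined in the SAM format specification.
# Labels other than proper_pair, first_of_pair and duplicate are
# descriptive names chosen for this sketch.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_of_pair",
    0x80: "second_of_pair",
    0x100: "secondary",
    0x200: "failed_qc",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(mask):
    """Return the labels of all bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(FLAG_BITS.items()) if mask & bit]


def passes(flag, required, excluded):
    """Mimic 'samtools view -f REQUIRED -F EXCLUDED' for one read's FLAG."""
    return (flag & required) == required and (flag & excluded) == 0


print(decode_flag(66))    # bits that -f66 requires
print(decode_flag(1024))  # bits that -F1024 excludes
print(passes(99, 66, 1024))    # paired, proper pair, mate reversed, first of pair
print(passes(1123, 66, 1024))  # same read, but marked as duplicate
```

A read passes `-f66` only if both the proper-pair and first-of-pair bits are set, and passes `-F1024` only if the duplicate bit is clear, which is what the named conditions in the sambamba expression state directly.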