Releases: muellan/metacache
MetaCache 2.1.1
- More refactoring and code cleanup
- Bug fixes
MetaCache 2.1.0
Improvements
- Enabled longer reads for the GPU version (previous limit was 127 bp)
- Significantly reduced number of temporary allocations on host
- Some more refactoring and code cleanup
Fixes
- Skip empty lines in input files
- Fixed conversion warning
MetaCache 2.0.1
- Fixed wrong load factor in query mode resulting in high load times (#22)
MetaCache 2.0.0
New: MetaCache-GPU
- building and querying on CUDA-capable accelerators
- multi-GPU support for distributing and/or replicating databases across multiple GPUs
- ultra-fast database building: ~10 seconds for the complete RefSeq 202 (bacteria+viruses+archaea) on 4 NVIDIA(R) Tesla(R) V100 GPUs
- faster querying: ~300 million reads per minute for a RefSeq 202 database on 4 NVIDIA(R) Tesla(R) V100 GPUs
- GPU-built databases can be used with the CPU version and vice versa (a database used with the GPU version needs to be partitioned so that each part fits in GPU memory)
- MetaCache-GPU will be presented at ICPP '21
Further Improvements
- support for reading gzipped FASTA/FASTQ files (requires zlib to be installed, see installation instructions)
- mode "build+query" for building and directly querying a database (great for the GPU version)
- improved CPU querying speed
- new database format (incompatible with previous versions of MetaCache)
- improved database reading/writing performance
- more useful progress indicators
MetaCache v1.1.1
- fixed critical bug that led to increased memory consumption during database build
- control characters like \t are now properly handled in the -separator option
- fixed g++5 compilation bug
- fixed some smaller bugs
- improved the documentation explaining the output analysis and display options
- more consistent progress bar display across different operations
- some code reorganization
MetaCache v1.0.0
- re-written command line interface with proper diagnostics; there shouldn’t be any breaking changes (regarding option names, etc.)
- updated and improved documentation
- removed the rarely used and brittle "annotate" mode that was out of the scope of MetaCache anyway
- small performance improvements
- some minor bug fixes
MetaCache v0.9.0
- improved query speed, especially better thread scaling beyond 32 threads (ca. 50% faster with 88 threads)
- improved database loading speed
- database loading indicator
- small bug fixes
- improved code structure
Attention:
You need to rebuild your databases for this version, because it uses a new database binary format. Disk and RAM consumption are not affected.
MetaCache v0.8.0
New feature "Coverage Filter"
Option -cov-percentile <p> removes the p-th percentile of hit targets (reference genomes) with the lowest coverage. A first pass does the normal mapping of queries (reads) to targets (reference genomes). The actual classification is then done in a second pass using only the remaining hit targets.
This will lead to a very small increase in runtime and memory consumption but can improve accuracy by detecting and removing stray false positive hits.
The coverage filter is deactivated by default.
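The two-pass idea can be illustrated with a minimal Python sketch. This is not MetaCache's implementation: the function name coverage_filter, the input layout, the definition of coverage as the fraction of a target's windows hit by at least one read, and the majority-vote classification are all assumptions made for illustration.

```python
from collections import Counter

def coverage_filter(read_hits, target_windows, percentile):
    """Hypothetical two-pass coverage filter (illustration only).

    read_hits:      dict read_id -> list of (target_id, window_id) hits
    target_windows: dict target_id -> total number of windows in that target
    percentile:     percentage (0-100) of lowest-coverage hit targets to drop
    """
    # Pass 1: normal mapping; record which windows of each target were hit.
    covered = {}
    for hits in read_hits.values():
        for target, window in hits:
            covered.setdefault(target, set()).add(window)

    # Assumed coverage measure: fraction of a target's windows that were hit.
    coverage = {t: len(w) / target_windows[t] for t, w in covered.items()}

    # Rank hit targets by coverage and drop the lowest `percentile` percent.
    ranked = sorted(coverage, key=lambda t: (coverage[t], t))
    n_drop = int(len(ranked) * percentile / 100)
    kept = set(ranked[n_drop:])

    # Pass 2: classify each read using only the remaining targets.
    # A simple majority vote stands in for the real classification step.
    result = {}
    for read, hits in read_hits.items():
        votes = Counter(t for t, _ in hits if t in kept)
        result[read] = votes.most_common(1)[0][0] if votes else None
    return result, kept


# Toy data: target "C" is large but hit in a single window only.
read_hits = {
    "r1": [("A", 0), ("A", 1), ("C", 0)],
    "r2": [("A", 2), ("B", 0), ("B", 1)],
    "r3": [("B", 2)],
    "r4": [("C", 0)],
}
target_windows = {"A": 4, "B": 4, "C": 100}

classification, kept = coverage_filter(read_hits, target_windows, 34)
```

In this toy input, target "C" has by far the lowest coverage, so it is removed before the second pass and read "r4", whose only hit was on "C", ends up unclassified.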
Other Changes
- improved multi-threading in query mode
- improved database format (layout better suited for future loading on GPUs)
- code cleanup
MetaCache v0.6.2
- improved accession number / sequence id parsing
- file reading improvements
- code cleanup
MetaCache v0.6.1
- improved database building performance (~30-50% speedup)
- improved taxonomic id assignment during build: now one can also use global assembly_summary files (default: "assembly_summary_refseq.txt", "assembly_summary_refseq_historical.txt", "assembly_summary_genbank.txt", "assembly_summary_genbank_historical.txt" in the taxonomy folder)
- the download-ncbi-taxonomy script now downloads "assembly_summary_refseq.txt" and "assembly_summary_refseq_historical.txt" by default
- code cleanup