Releases · muellan/metacache

11 Mar 13:40

muellan

v2.4.2

7368290

MetaCache 2.4.2 (Latest)

Improved sequence id extraction from filenames and sequence headers.

The default setting is now smarter: it first tries to find NCBI-style accession or accession.version identifiers, then GenBank identifiers, and finally falls back to the filename (without path and extension).

The new command line option -sequence-id-format <type> allows the user to select a preferred method for sequence id extraction.
Available values for <type> are:

smart: (default) works as described above
ncbi: only use NCBI-style accession or accession.version identifiers
genbank: only use genbank identifiers
filename: only use filename (without path and extension)
leadingword: only use first contiguous stretch of non-whitespace characters
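
The fallback order of the modes can be illustrated with a minimal Python sketch. This is not MetaCache's implementation: the function name extract_sequence_id, both regular expressions, searching only the header for identifiers, and returning None when a non-smart mode finds nothing are all assumptions made for illustration.

```python
import os
import re

# Assumed pattern for NCBI-style accessions such as "NC_000913.3" or "AB123456.1":
# an optional two-letter prefix plus underscore, some uppercase letters,
# 5 to 10 digits, and an optional ".version" suffix.
NCBI_RE = re.compile(r"\b((?:[A-Z]{2}_[A-Z]{0,6}|[A-Z]{1,6})\d{5,10}(?:\.\d+)?)\b")

# Assumed pattern for GenBank identifiers written as "gi|<number>".
GI_RE = re.compile(r"gi\|(\d+)")


def extract_sequence_id(header, filename, fmt="smart"):
    """Return a sequence id according to the chosen format, or None if nothing fits."""
    header = header.lstrip(">").strip()
    if fmt in ("smart", "ncbi"):
        match = NCBI_RE.search(header)
        if match:
            return match.group(1)
    if fmt in ("smart", "genbank"):
        match = GI_RE.search(header)
        if match:
            return match.group(1)
    if fmt in ("smart", "filename"):
        # Filename without directory and without the last extension.
        return os.path.splitext(os.path.basename(filename))[0]
    if fmt == "leadingword":
        # First contiguous stretch of non-whitespace characters.
        words = header.split()
        return words[0] if words else None
    return None


print(extract_sequence_id(">NC_000913.3 Escherichia coli", "/data/ecoli.fna"))
print(extract_sequence_id(">contig one", "/data/sample1.fa"))
print(extract_sequence_id(">contig one", "/data/sample1.fa", fmt="leadingword"))
```

In this sketch the smart mode is simply the ncbi, genbank and filename checks tried in sequence, which is why one function can cover all five values of <type>.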


11 Mar 10:00

muellan

v2.4.1

d35f957

MetaCache 2.4.1

fixed abundance table formatting

prevent scientific notation from being used for read counts
the row showing unclassified reads was missing the taxon column; it is now shown with taxon "--"


10 Mar 15:43

muellan

v2.4.0

bfbdda5

MetaCache 2.4.0

Changed handling of non-unique sequence IDs during database build

If a reference sequence is inserted whose ID (e.g. NCBI accession) is already present in the database, the newer sequence will now be inserted with a modified ID (an exclamation mark and a duplication counter are appended) and a warning will be printed to stderr.
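
The renaming scheme can be sketched in Python as follows. The helper name make_unique_id, the counter starting at 1, the wording of the warning, and the extra check that the modified ID is itself unused are assumptions for illustration and not taken from the MetaCache sources.

```python
import sys


def make_unique_id(seen, seq_id):
    """Return seq_id, or seq_id + '!' + counter if seq_id was inserted before.

    `seen` maps every ID already in the (hypothetical) database to the number
    of duplicates encountered for it so far.
    """
    if seq_id not in seen:
        seen[seq_id] = 0
        return seq_id
    # Increase the duplication counter until the modified ID is unused.
    while True:
        seen[seq_id] += 1
        candidate = f"{seq_id}!{seen[seq_id]}"
        if candidate not in seen:
            break
    seen[candidate] = 0
    print(f"warning: duplicate sequence id '{seq_id}', inserted as '{candidate}'",
          file=sys.stderr)
    return candidate


seen = {}
ids = [make_unique_id(seen, s) for s in ["A.1", "B.1", "A.1", "A.1"]]
print(ids)
```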

Added min/max length filter

A minimum and maximum length for reads can now be set with -min-readlen <#> and -max-readlen <#>. Reads with lengths outside of this range will not be processed, i.e., treated as if they were not present in the input file. How many reads were discarded and how many were processed is printed to stderr. The default behavior, that all reads will be processed, remains unchanged.
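
A minimal Python sketch of such a filter is shown below. The function name filter_by_length, the inclusive bounds, and the wording of the summary line are assumptions; only the option names -min-readlen and -max-readlen come from the release notes.

```python
import sys


def filter_by_length(reads, min_len=0, max_len=None):
    """Keep reads whose length lies within [min_len, max_len] (bounds assumed inclusive).

    With the defaults (no minimum, no maximum) every read is kept.
    Returns the kept reads and the number of discarded reads.
    """
    kept = []
    discarded = 0
    for read in reads:
        n = len(read)
        if n < min_len or (max_len is not None and n > max_len):
            discarded += 1  # treated as if the read were not in the input
        else:
            kept.append(read)
    # Summary goes to stderr, as described in the release notes.
    print(f"{len(kept)} reads processed, {discarded} reads discarded", file=sys.stderr)
    return kept, discarded


kept, discarded = filter_by_length(["ACGT", "AC", "ACGTACGTAC"], min_len=3, max_len=8)
print(kept, discarded)
```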

Other changes

cleaned up some includes
updated dates
changed some aspects of default code formatting


29 Feb 12:27

muellan

v2.3.2

c563174

MetaCache 2.3.2

improved parsing of assembly_summary files with inconsistent headers


09 Mar 12:07

muellan

v2.3.1

72cce7f

MetaCache 2.3.1

fixed type mismatch bug that could prevent compilation with uint64_t for MC_TARGET_ID_TYPE / MC_WINDOW_ID_TYPE / DMC_KMER_TYPE
allow up to 10 alphanumeric characters in NCBI-style accession ids
GPU version: removed outdated CUDA 10.2 and CUB from documentation


03 Jan 14:13

Funatiq

v2.3.0

bbd8ba0

MetaCache 2.3.0

Removed compaction step from GPU version and sped up GPU queries. This also removes the dependency on CUB.
Set CUDA arch=native by default to automatically detect GPU architecture.
Fixed make with multiple MACROS (#34).


08 Jul 17:00

Funatiq

v2.2.3

532a737

MetaCache 2.2.3

Improved merge mode:

Added -out option
Recover from malformed input files (#33)
Show more output on verbose info level


08 Jul 16:58

Funatiq

v2.2.2

0364cc8

MetaCache 2.2.2

Fixed kmers on GPU for k != 16 (default was working correctly)
Fixed shown query parameters when running abundance estimation


12 Jan 12:18

Funatiq

v2.2.1

2b12702

MetaCache 2.2.1

Fixed canonical kmer on GPU for k != 16 (default was working correctly)
Fixed merge mode


09 Dec 15:20

muellan

v2.2.0

1746bc3

MetaCache 2.2.0

Fixed the NCBI genome download script (the ftp path can be empty for some genomes).
Changed the default data type for storing reference sequence ids from 16 to 32 bits in order to fit all complete bacterial, viral and archaeal genomes of the latest NCBI RefSeq releases.
Fixed the error message during the build process that should have reported when the number of sequences exceeds the supported maximum.
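
To put the change from 16 to 32 bits into numbers, the short Python calculation below shows how many distinct values each width can represent. It assumes an unsigned integer type in which every value is usable as an ID; MetaCache may reserve some values, so the real limits could be slightly lower.

```python
def distinct_values(bits):
    """Number of distinct values an unsigned integer of the given width can hold."""
    return 2 ** bits


for bits in (16, 32):
    print(f"{bits}-bit ids: {distinct_values(bits):,} distinct values")
```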

