Kmer-db Versions

Kmer-db is a fast and memory-efficient tool for large-scale k-mer analyses (indexing, querying, estimating evolutionary relationships, etc.).

1 year ago

Changes from the previous release:

Removed a deadlock in -multisample-fasta mode.
Added support for sparse inputs in distance mode.
Added support for sparse outputs in all2all, new2all, and distance modes (-sparse switch) with optional filtering (-above/-below).
Extended the help information.
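
The idea behind sparse output with threshold filtering can be sketched as follows. This is only an illustration, not Kmer-db's code: the helper name `sparsify_row`, the (column, value) pair representation, and the reading of -above/-below as strict lower and upper bounds on retained values are all assumptions. It also assumes that zero cells are the ones omitted from a sparse row.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// One retained cell of a matrix row: (column index, value).
using SparseRow = std::vector<std::pair<std::size_t, double>>;

// Hypothetical helper, not part of Kmer-db: converts a dense row into a
// sparse one, keeping only non-zero values v with above < v < below.
// The strict comparisons are an assumption about the filter semantics.
SparseRow sparsify_row(
    const std::vector<double>& dense,
    double above = -std::numeric_limits<double>::infinity(),
    double below = std::numeric_limits<double>::infinity()) {
    assert(above < below);  // an empty interval would drop every value
    SparseRow out;
    for (std::size_t col = 0; col < dense.size(); ++col) {
        const double v = dense[col];
        if (v != 0.0 && v > above && v < below) {
            out.emplace_back(col, v);
        }
    }
    return out;
}
```

For a row `{0.0, 0.2, 0.9, 0.5}` with bounds 0.3 and 0.95, only the cells at columns 2 and 3 would be kept; without bounds, only the zero cell is dropped.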

1 year ago

2 years ago

Changes from the last release:

Output matrices can be stored in sparse format (-sparse switch).
Better workload balancing.
Improved parallelization scheme in new2all mode (few-fold speed improvement).
Reduced memory footprint of -multisample-fasta mode.
Multiple input FASTA files are supported in -multisample-fasta mode.
Added the -extend switch, which allows extending an existing k-mer database.
Serialization/deserialization works much faster now.
Fixed a serious bug in -multisample-fasta mode that caused incorrect k-mer counting.

3 years ago

Sources compile under macOS.
Basic tests have been added.
Fixed a bug in distance mode that occurred when a sequence ID contained spaces.
Makefile update (automatic detection of AVX2 support, different handling of CFLAGS and LDFLAGS).

4 years ago

4 years ago

Added:

For performance reasons, the upper triangle (with the diagonal) of the distance matrix in all2all mode is no longer saved.
Option to specify the lower threshold of the k-mer minhash filter (-f-start parameter).
When loading genome files, exact filenames are examined first. If this fails, an attempt to add predefined extensions is made.
Added a new distance measure, -mash-query, which is the Mash distance calculated with respect to the query length (use it when the query is much shorter than the database sequences).
C++11 compatibility (compiles with G++ 4.8.5).
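
To illustrate why a query-relative Mash distance helps with short queries, here is a minimal sketch. The formula D = -(1/k) ln(2j / (1 + j)) is the standard Mash distance for a similarity index j. Treating "with respect to the query length" as replacing the Jaccard index by the fraction of query k-mers found in the database sequence is an assumption, and the function names are hypothetical rather than taken from Kmer-db.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Standard Mash distance for a similarity index j in [0, 1]:
//   D = -(1/k) * ln(2j / (1 + j))
double mash_from_index(double j, int k) {
    assert(k > 0 && j >= 0.0 && j <= 1.0);
    if (j == 0.0) return std::numeric_limits<double>::infinity();  // nothing shared
    if (j == 1.0) return 0.0;  // identical sets; also avoids returning -0.0
    return -std::log(2.0 * j / (1.0 + j)) / k;
}

// Classic variant: j is the Jaccard index, common / (size_a + size_b - common).
double mash_distance(std::size_t common, std::size_t size_a,
                     std::size_t size_b, int k) {
    assert(common <= size_a && common <= size_b);
    const std::size_t uni = size_a + size_b - common;
    const double j = (uni == 0) ? 0.0 : static_cast<double>(common) / uni;
    return mash_from_index(j, k);
}

// Query-relative variant (assumed interpretation): j is the fraction of the
// query's k-mers that also occur in the database sequence.
double mash_query_distance(std::size_t common, std::size_t query_size, int k) {
    assert(query_size > 0 && common <= query_size);
    return mash_from_index(static_cast<double>(common) / query_size, k);
}
```

Under these assumptions, a 100-k-mer query fully contained in a 10000-k-mer sequence gets a classic distance well above zero, because the union is dominated by the long sequence, while the query-relative distance is 0.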

Fixed:

Rare bug in the hash table in which a k-mer consisting only of T bases was treated as an empty entry. An empty item is now indicated by a special value instead of a special key.
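
One plausible way such a collision arises, and how a sentinel value avoids it, is sketched below. The 2-bit encoding (A=0, C=1, G=2, T=3), the 64-bit key, the linear-probing layout, and all names are assumptions made for illustration and do not reproduce Kmer-db's actual hash table. With this encoding a 32-mer of T bases packs to an all-ones key, which is exactly the bit pattern often reserved as an "empty key" marker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Packs a DNA string (length <= 32) into 64 bits, 2 bits per base.
// Assumed encoding: A=0, C=1, G=2, T=3.
std::uint64_t encode_kmer(const std::string& s) {
    assert(s.size() <= 32);
    std::uint64_t code = 0;
    for (char c : s) {
        std::uint64_t b = 0;
        switch (c) {
            case 'A': b = 0; break;
            case 'C': b = 1; break;
            case 'G': b = 2; break;
            case 'T': b = 3; break;
            default: assert(false && "unexpected base");
        }
        code = (code << 2) | b;
    }
    return code;
}

// Reserved value marking a free slot; every 64-bit key stays usable.
constexpr std::uint32_t kEmptyValue = 0xFFFFFFFFu;

// Minimal linear-probing table from k-mer codes to 32-bit ids.
class KmerTable {
public:
    explicit KmerTable(std::size_t capacity)
        : slots_(capacity, Slot{0, kEmptyValue}) {
        assert(capacity >= 2);
    }

    void insert(std::uint64_t key, std::uint32_t value) {
        assert(value != kEmptyValue);       // the sentinel cannot be stored
        assert(size_ + 1 < slots_.size());  // keep one free slot so probing stops
        const std::size_t i = probe(key);
        if (slots_[i].value == kEmptyValue) ++size_;
        slots_[i] = Slot{key, value};
    }

    bool find(std::uint64_t key, std::uint32_t& value) const {
        const std::size_t i = probe(key);
        if (slots_[i].value == kEmptyValue) return false;
        value = slots_[i].value;
        return true;
    }

private:
    struct Slot {
        std::uint64_t key;
        std::uint32_t value;
    };

    // Returns the slot holding `key`, or the first free slot on its probe path.
    std::size_t probe(std::uint64_t key) const {
        std::size_t i = static_cast<std::size_t>(key % slots_.size());
        while (slots_[i].value != kEmptyValue && slots_[i].key != key) {
            i = (i + 1) % slots_.size();
        }
        return i;
    }

    std::vector<Slot> slots_;
    std::size_t size_ = 0;
};
```

Because emptiness is decided by the value field, the all-T key can be inserted and found like any other key. The cost is that one value (here `0xFFFFFFFF`) can no longer be stored.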

4 years ago

Note: Starting from this release version numbering conforms to major.minor.patch scheme.

Added:

The -phylip-out switch in distance mode, which allows storing distance/similarity matrices in Phylip format.
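
For readers unfamiliar with the Phylip layout, the sketch below writes a square distance matrix with the taxon count on the first line, followed by one row per taxon. Whether Kmer-db emits a full square or a lower-triangular matrix, and how it pads names or formats numbers, is not stated here, so the function names, the 10-character name padding, and the 6-digit precision are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes a square matrix in the classic Phylip layout:
//   line 1: number of taxa
//   then:   name (left-aligned, padded to 10 characters) and the row values.
// Names longer than 10 characters are written in full, not truncated.
void write_phylip(std::ostream& out,
                  const std::vector<std::string>& names,
                  const std::vector<std::vector<double>>& dist) {
    assert(names.size() == dist.size());
    out << names.size() << '\n';
    out << std::fixed << std::setprecision(6);
    for (std::size_t i = 0; i < names.size(); ++i) {
        assert(dist[i].size() == names.size());  // matrix must be square
        out << std::left << std::setw(10) << names[i];
        for (double d : dist[i]) {
            out << ' ' << d;
        }
        out << '\n';
    }
}

// Convenience wrapper returning the formatted matrix as a string.
std::string to_phylip_string(const std::vector<std::string>& names,
                             const std::vector<std::vector<double>>& dist) {
    std::ostringstream oss;
    write_phylip(oss, names, dist);
    return oss.str();
}
```

Calling `to_phylip_string({"a", "b"}, {{0.0, 0.5}, {0.5, 0.0}})` yields a three-line text: the count `2`, then one padded row per taxon.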

Fixed several bugs from 1.51 release:

5 years ago

Substantial reduction in the time and memory requirements of build mode, resulting from changes to the data structures. For example, when tested on the full k-mer spectrum of 40715 pathogen genomes, both the running time and the memory footprint decreased by one third (from 1h30 to 1h and from 60 to 40 GB).
Several new parameters added.
Lots of bugs fixed.

5 years ago

Changes:

Bugs fixed:

5 years ago

Support for builds without AVX2.