Kmer-db Versions

Kmer-db is a fast and memory-efficient tool for large-scale k-mer analyses (indexing, querying, estimating evolutionary relationships, etc.).

v1.11.1

1 year ago

Changes from the previous release:

  • Fixed a deadlock in -multisample-fasta mode,
  • Added support for sparse inputs in distance mode,
  • Added support for sparse outputs in all2all, new2all, and distance modes (-sparse switch) with optional filtering (-above/-below),
  • Extended help information.
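
As a rough illustration of what sparse output with threshold filtering means, the sketch below keeps only the non-zero cells of a small similarity matrix that fall between two bounds. It is not Kmer-db's implementation: the function name `to_sparse`, the triple layout, and the choice of strict inequalities for `above` and `below` are all assumptions made for this example, and the real on-disk format is not described in these notes.

```python
def to_sparse(names, matrix, above=None, below=None):
    """Return (row_name, col_name, value) triples for the retained cells.

    Assumption for this sketch: a cell is kept when it is non-zero and,
    if the bounds are given, strictly greater than `above` and strictly
    less than `below`.
    """
    triples = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value == 0:
                continue  # zero cells are never stored in sparse form
            if above is not None and value <= above:
                continue
            if below is not None and value >= below:
                continue
            triples.append((names[i], names[j], value))
    return triples


names = ["s1", "s2", "s3"]
similarity = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]

# Keep only cells with 0.5 < value < 1.0 (drops zeros, weak hits, and the diagonal).
kept = to_sparse(names, similarity, above=0.5, below=1.0)
for row_name, col_name, value in kept:
    print(row_name, col_name, value)
```

With many samples and few related pairs, a listing like this is much smaller than the dense matrix, which is the usual motivation for a sparse format.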

v1.9.4

1 year ago

v1.9.2

2 years ago

Changes from the last release:

  • Output matrices can be stored in sparse format (-sparse switch).
  • Better workload balancing.
  • Improved parallelization scheme in new2all mode (few-fold speed improvement).
  • Reduced memory footprint of -multisample-fasta mode.
  • More than one input FASTA file is supported in -multisample-fasta mode.
  • Added the -extend switch, which allows extending an existing k-mer database.
  • Serialization/deserialization is now much faster.
  • Fixed a serious bug in -multisample-fasta mode which caused incorrect k-mer counting.

v1.7.6

3 years ago

  • Sources compile under macOS.
  • Basic tests have been added.
  • Fixed a bug in distance mode when a sequence id contained spaces.
  • Makefile update (automatic detection of AVX2 support, different handling of CFLAGS and LDFLAGS).

v1.7.5

4 years ago

  • Some compilation warnings removed.
  • Fixed a crash on samples with small k-mer counts or very small filter values.

v1.7.3

4 years ago

Added:

  • For performance reasons, the upper triangle (with the diagonal) of the distance matrix in all2all mode is no longer saved.
  • Option to specify the lower threshold of the k-mer minhash filter (-f-start parameter).
  • When loading genome files, exact filenames are examined first. If this fails, an attempt to add predefined extensions is made.
  • New distance measure -mash-query, which is the Mash distance calculated with respect to the query length (use it if the query is much shorter than the database sequences).
  • C++11 compatibility (compiles with G++ 4.8.5).
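
The filename lookup order described in the list above (exact name first, then predefined extensions) can be sketched as follows. This is only an illustration: the extension list `CANDIDATE_EXTENSIONS` and the function `resolve_genome_path` are hypothetical, since these notes do not say which extensions Kmer-db tries or in what order.

```python
import os
import tempfile

# Hypothetical extension list, for illustration only.
CANDIDATE_EXTENSIONS = (".fa", ".fasta", ".fa.gz")


def resolve_genome_path(name, extensions=CANDIDATE_EXTENSIONS):
    """Return the first existing path for `name`, or None if nothing matches."""
    # Step 1: the exact filename takes priority.
    if os.path.isfile(name):
        return name
    # Step 2: fall back to appending each candidate extension in turn.
    for ext in extensions:
        candidate = name + ext
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as workdir:
    # g1 exists only with an extension; g2 exists both with and without one.
    for filename in ("g1.fasta", "g2", "g2.fa"):
        open(os.path.join(workdir, filename), "w").close()

    found_g1 = resolve_genome_path(os.path.join(workdir, "g1"))
    found_g2 = resolve_genome_path(os.path.join(workdir, "g2"))
    found_g3 = resolve_genome_path(os.path.join(workdir, "g3"))

    g1_name = os.path.basename(found_g1)
    g2_name = os.path.basename(found_g2)
    print(g1_name, g2_name, found_g3)
```

Checking the exact name first means a sample list can mix bare identifiers and full filenames without ambiguity.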

Fixed:

  • Rare bug in the hash table where a k-mer consisting only of T bases was treated as an empty entry. An empty item is now indicated by a special value instead of a special key.
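
One plausible way such a collision can arise is shown below. The sketch assumes a 2-bit base encoding with T mapped to 3 and an "all bits set" key used as the empty-slot marker. Both are assumptions for illustration (the notes do not state Kmer-db's actual encoding or sentinel), as are the names `encode_kmer`, `EMPTY_KEY`, and `EMPTY_VALUE`.

```python
# Assumed 2-bit encoding; the real encoding used by the tool may differ.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

K = 4  # toy k-mer length to keep the numbers small


def encode_kmer(kmer):
    """Pack a k-mer into an integer, two bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_CODE[base]
    return value


# Hypothetical sentinels for an empty hash-table slot.
EMPTY_KEY = (1 << (2 * K)) - 1  # all 2*K key bits set
EMPTY_VALUE = -1                # a value no real entry ever holds


def is_empty_by_key(key, value):
    """Old-style check: a slot is empty if its key equals the sentinel."""
    return key == EMPTY_KEY


def is_empty_by_value(key, value):
    """New-style check: a slot is empty if its value equals the sentinel."""
    return value == EMPTY_VALUE


# A real entry: the k-mer TTTT stored with some payload value 7.
slot = (encode_kmer("T" * K), 7)

print(is_empty_by_key(*slot))    # the occupied slot is misread as empty
print(is_empty_by_value(*slot))  # the occupied slot is recognised correctly
```

Under these assumptions the all-T k-mer encodes to exactly the key sentinel, so marking emptiness in the value leaves the whole key space available for real k-mers.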

v1.6.2

4 years ago

Note: Starting from this release, version numbering conforms to the major.minor.patch scheme.

Added:

  1. Switch -phylip-out in distance mode, which allows storing distance/similarity matrices in Phylip format.
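
For readers unfamiliar with the Phylip matrix layout, the sketch below writes a small square distance matrix in the classic strict form: a first line with the number of taxa, then one line per taxon with a name padded to 10 characters followed by its distances. The function `format_phylip` is hypothetical, and whether Kmer-db emits a square or triangular matrix, or strict or relaxed names, is not stated in these notes.

```python
def format_phylip(names, matrix):
    """Format a square distance matrix as strict Phylip text."""
    lines = [f"{len(names):5d}"]  # header: number of taxa
    for name, row in zip(names, matrix):
        # Strict Phylip names occupy exactly 10 characters.
        label = name[:10].ljust(10)
        values = " ".join(f"{v:.6f}" for v in row)
        lines.append(f"{label} {values}")
    return "\n".join(lines) + "\n"


names = ["s1", "s2", "s3"]
distances = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.7],
    [0.9, 0.7, 0.0],
]

phylip_text = format_phylip(names, distances)
print(phylip_text, end="")
```

Output in this layout can typically be read by tree-building tools that accept Phylip distance matrices.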

Fixed several bugs from the 1.51 release:

  1. Incorrect handling of k-mer lengths < 16.
  2. Very slow processing of long k-mers (k >= 26).
  3. Segmentation fault when storing minhashed k-mers on disk (minhash mode).

v1.51

5 years ago

  1. Substantial reduction in the time and memory requirements of build mode, due to changes in the data structures. E.g., when tested on the full k-mer spectrum of 40715 pathogen genomes, time and memory footprint decreased by 1/3 (1h30 to 1h, 60 to 40 GB).
  2. Several new parameters added.
  3. Many bugs fixed.

v.1.20

5 years ago

Changes:

  • new2all mode added,
  • uniform output table format for all2all, new2all, and one2all modes.

Bugs fixed:

  • proper support for samples with no k-mers of the given length,
  • problems with building a database from minhashed k-mers.

v1.12

5 years ago

Support for builds without AVX2.