In-memory nucleotide sequence k-mer counting, filtering, graph traversal and more
89 changed files with 11,223 additions and 6,171 deletions.
Sparse graph labeling @camillescott (sweep-reads-by-partition-buffered.py)
Initial support for Galaxy integration (for normalize-by-median and abund-filter) @mr-c
Normalization of arguments across the scripts @camillescott
Note: The default branch on GitHub is now the 'master' branch. Our unmaintained tutorials that install khmer with a plain git clone now carry a warning, and potential users are directed to khmer-protocols: http://khmer-protocols.readthedocs.org
load-graph gives erroneous memory estimate #279 @camillescott
If loadhash is specified, do not complain about hashsize #278 @RamRS @humberto-ortiz
test_Hashbits case sensitivity #265 @luizirber
Installation fails: cannot find argparse >= 1.2.1 #258 @mr-c & @standage
Bugs found by coverity #256 @camillescott
abundance-dist-single.py, abundance-dist.py, do-partition.py, interleave-reads.py, load-graph.py, load-into-counting.py, and normalize-by-median.py now exit with return code 1 instead of 255, as is standard.
Program arguments that have default values are disclosed @mr-c
Developer documentation updates: contribution guidelines, coding standards, code review hints (with checklist).
Release instructions completely rewritten.
Installation instruction tweaks, including specific commands for Debian derivatives and RHEL6.
Update to the latest versioneer.py & ez_setup.py.
The latest version of setuptools is no longer required: version 0.6c11 appears to be just fine.
Many code cleanups. Python namespace usage was tidied. Type safety was strengthened in the C++/Python integration.
Testing coverage measures the scripts properly.
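For wrappers that call these scripts, the return code change suggests testing for any non-zero status rather than a specific value. The following is a minimal sketch, assuming the change applies to failure exits; `script_failed` is a hypothetical helper, not part of khmer.

```python
import subprocess
import sys


def script_failed(argv):
    """Run a command and report whether it exited with a non-zero status.

    Hypothetical helper for illustration; not part of khmer.
    """
    completed = subprocess.run(argv)
    # Compare against 0 rather than a specific failure code such as 1 or 255,
    # so the check behaves the same before and after this release.
    return completed.returncode != 0
```

A caller would pass the full command line as a list, for example the script name followed by its usual arguments.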
All of the issues listed below are pre-existing.
Some users have reported that normalize-by-median.py will utilize more memory than it was configured for. This is being investigated in https://github.com/ged-lab/khmer/issues/266
Some FASTQ files confuse our parser when running with more than one thread, for example while using load-into-counting.py. If you experience this, add "--threads=1" to your command line. This issue is being tracked in https://github.com/ged-lab/khmer/issues/249
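If the scripts are driven from Python, the single-thread workaround might be applied as in the sketch below. Only the "--threads=1" option comes from the note above; `single_threaded_command`, `run_single_threaded`, and the file names in the usage note are hypothetical.

```python
import subprocess


def single_threaded_command(script, args):
    """Build an argv list that runs a khmer script with one thread.

    Hypothetical helper for illustration; not part of khmer.
    """
    # "--threads=1" is the workaround named in the known issue above;
    # all other arguments are passed through unchanged.
    return [script, "--threads=1", *args]


def run_single_threaded(script, args):
    """Run the script single-threaded and return its exit status."""
    return subprocess.run(single_threaded_command(script, args)).returncode
```

For instance, `single_threaded_command("load-into-counting.py", ["counts.kh", "reads.fq"])` yields the command with the option placed before the remaining arguments. The two file names are placeholders; consult the script's own help output for its actual arguments.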
If your hashfile gets truncated, perhaps because of a full filesystem, our tools will currently get stuck. This is being tracked in https://github.com/ged-lab/khmer/issues/247
Paired-end reads from Casava 1.8 currently require renaming for use in normalize-by-median and abund-filter when used in paired mode. The integration of a fix for this is being tracked in https://github.com/ged-lab/khmer/issues/23
A user has reported a floating point exception when running count-overlap.py. There is no workaround at this time. This is being tracked in https://github.com/ged-lab/khmer/issues/282
@camillescott, @mr-c, @ctb, @luizirber, @RamRS, @humberto-ortiz, and @standage
Special thanks to the new contributors!
This is the last release using the legacy development system.