sensitive and precise assembly of short sequencing reads
First release of Penguin, a metagenomic assembler that assembles DNA/RNA through a novel greedy AA/DNA-hybrid Bayesian overlap extension strategy.
This release includes `plass` and `penguin`. Plass assembles protein sequences from DNA, while Penguin assembles DNA contigs. Penguin comes in two variants: `penguin guided_nuclassemble`, which first assembles using six-frame-translated amino acid (AA) overlaps and then further assembles the contigs using nucleotide information, and `penguin nuclassemble`, a pure nucleotide assembler.
Changes since Release 3-764a3:
At a glance: Significant further development of the nucleotide/hybrid assembler. Updated MMseqs2 submodule and adjusted Plass to multiple MMseqs2 changes.
Changes since Release 2-c7e35:
At a glance: Significant further development of the nucleotide assembler. Reduced hard disk requirements for protein assembler and many bug fixes.
Updated MMseqs2 submodule and adjusted Plass to multiple MMseqs2 changes.
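The contig header format listed among the changes of this release, `<uniq ID> len:<len> cycle:<0|1>`, can be read back with a few lines of Python. This is a minimal sketch, not part of Plass itself: the names `ContigHeader` and `parse_header` are hypothetical, and it assumes that fields are separated by whitespace, that a leading `>` may be present, and that the `cycle` field may be missing.

```python
import re
from typing import NamedTuple, Optional


class ContigHeader(NamedTuple):
    """Hypothetical container for one parsed header line."""
    uid: str
    length: int
    cycle: Optional[bool]  # None when the optional cycle field is absent


# Assumed layout: "<uniq ID> len:<len> cycle:<0|1>", whitespace-separated,
# with an optional leading ">" as in a FASTA header line.
_HEADER_RE = re.compile(r"^>?(\S+)\s+len:(\d+)(?:\s+cycle:([01]))?\s*$")


def parse_header(line: str) -> ContigHeader:
    """Parse one header line; raise ValueError if it does not match."""
    match = _HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized header: {line!r}")
    uid, length, cycle = match.groups()
    return ContigHeader(uid, int(length), None if cycle is None else cycle == "1")
```

For example, `parse_header(">c42 len:1500 cycle:1")` yields a record with `uid="c42"`, `length=1500` and `cycle=True`, while a header without the `cycle` field yields `cycle=None`.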
- `--kmer-per-seq-scale` parameter to make sure not to miss good hits of long sequences. The number of extracted kmers can now be scaled with a user defined factor multiplied by the length of the sequence.
- (`--rescore-mode 3`)
- `cat reads.fas | plass assemble stdin asm tmp`
- (`--delete-tmp-inc`)
- `<uniq ID> len:<len> cycle:<0|1>` The cycle field is optional (for the nucleotide case)
- `kmermatcher` phase. Replaced `--skip-n-repeat` parameter by `--ignore-multi-kmer`
- `--min-contig-len` parameter to set minimum length of assembled contig to output (for nucleotide assembly)
- (`--clust-thr`, default 0.97)
Changes since release 1-2e0ef
- `--protein-filter-threshold`
Plass Release 1-2e0ef
Plass (Protein-Level ASSembler) is a software to assemble short read sequencing data on a protein level. The main purpose of Plass is the assembly of complex metagenomic datasets.
- `--min-length` flag to adjust codon extraction length
First Plass release