Bayesian haplotype-based mutation calling
This is a minor release with some bug fixes and improvements to installation scripts.
* cmake3 and cmake. Resolves #37. [30a3ffb11ad216e108efefcd70549130939655bb]

This is a minor release that resolves a few issues in the first beta release (v0.5.0-beta).
* Moves the RFQUAL random forest score to the FORMAT field, so there is now one score for each sample. [81b75ea3af852e809c789a8a924b3aa0f9791264]
* RTB, REB, BMC, BMF. [f1000d4410fe62e8b0b8bd0080d4720b81024710, 1a937f2f48aad986ea76b799f666155da4fccc08]
* Adds forest to the --training-annotations option (renamed from --csr-train). [519be06afbc7ddc3c70b4a5da899a22d18391b5c, 106e3443c4ed5319d84f621b5b0eaf50c46db179]
* Renames the --csr-train option to --training-annotations. [a1f8c45878ac1ca05c496f4b6b6c344c21a1ab10]
* Renames the RPB measure to RSB. [c52c6e8cb220e5db1171aa857141617b8aedf7c4]
* Fixes a bug where doubles are not parsed properly, causing errors when using random forest filtering. [dc137542403b7c9af73257151472936ccd5a0844]
* MQD measure. [45b9b742d09cb037ffa605c719695ae22d94a066]
* INFO and FORMAT fields with multiple values. [0078c5a1e2fbd1abd43a65315f4de216fbf4fa9b]

This is the first beta release, as most of the core features are reasonably mature. There have been various stability and runtime improvements, in addition to improvements to the core algorithm, including a completely new indel mutation model. Once again, the cancer calling model has received the most attention, particularly for high depth ultra-low VAF tumour-only calling (e.g. UMI).
* --repeat-candidate-generator command line option. [2856c2e6b8a5683f07c19d3f40e1c2f3b467bacd]
* How QUAL is calculated in the cancer and trio models has been improved. Previously, QUAL was the posterior probability that the called alt allele segregated and was classified correctly. This could lead to low QUAL scores if the classification was uncertain (e.g. in tumour-only samples). QUAL is now simply the posterior probability that the allele segregates. There is also a new annotation, PP, for all cancer caller calls and DENOVO trio calls, that is equivalent to the old QUAL. [905c96b7362ba2513c920e33d896751490cc32f0, 3b28e9fe85af4aef4408cb3b31c959408a0ba129, 0d1537b9012326d4e8e3d98d718e0f81ff73219e]
* Calls marked SOMATIC have a new annotation, MAP_VAF, which reports the maximum a posteriori VAF estimate. VAF_CR. [e361f5065da83a9d1febabf4dcac9c7578dc3e8e]
* --max-vb-seeds, which controls the maximum number of seeds the Variational Bayes based genotype model algorithms can use. [95c66a2ec89fe37adb8a4707d15b69bf17f25563]
* --split-bamout for split realigned BAMs. Split BAMs are no longer requested by specifying a prefix to --bamout. [34d8a89748cd363e967cea89774531efa73a9dbb]
* SC has been renamed to NC (Normal Contamination). [23497c3aaf0c93c9ca633f96778f8f74c4a5a4b3]
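The effect of the QUAL change described above can be illustrated with a small numeric sketch. It assumes Phred scaling of the posterior, as the VCF specification defines for QUAL; whether PP uses the same scale is an assumption here, and the two probabilities are made-up values for a hypothetical tumour-only call.

```python
import math


def phred(p_true: float, cap: float = 10_000.0) -> float:
    """Phred-scale the probability that a statement is true: -10 * log10(1 - p)."""
    p_error = 1.0 - p_true
    if p_error <= 0.0:
        return cap  # avoid log10(0) for a probability of exactly 1
    return min(cap, -10.0 * math.log10(p_error))


# Hypothetical posteriors for one call (illustrative values only).
p_segregates = 0.999           # the alt allele is present in the sample
p_class_given_segregates = 0.6  # the classification (e.g. SOMATIC) is correct

# QUAL as now defined: depends only on whether the allele segregates.
new_qual = phred(p_segregates)

# Old QUAL (now reported as PP): segregates AND classified correctly.
pp = phred(p_segregates * p_class_given_segregates)

print(f"QUAL={new_qual:.1f} PP={pp:.1f}")
```

With an uncertain classification the joint probability is much lower than the segregation probability alone, which is why the old definition could produce low QUAL scores for well supported alleles.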
* Adds --mask-tails for unconditionally masking bases of all read tails. [acfddaf1b5e910496b737f3dd6cab2667dadae4b]
* --tumour-germline-concentration, which may be used to control the shape of the prior distribution on the haplotype mixture frequency of tumour samples. Only really relevant to high depth tumour-only calling. [9f83ca6fce24ced6ea901845f3c474ecfc6a1867]
* Renames --snv-denovo-mutation-rate to --denovo-snv-mutation-rate and --indel-denovo-mutation-rate to --denovo-indel-mutation-rate. [4b9d95f448ef1f8d2375947a58d664850a868c18]
* --repeat-candidate-generator to control the new repeat candidate generator. [2856c2e6b8a5683f07c19d3f40e1c2f3b467bacd]
* configs directory in the main project directory that contains pre-written configs for calling certain types of data. [9da036416ff2bd7a36f5f734aebbd391df7c48f4]

This is a major release with important new features, enhancements, and performance improvements.
* --bamout option. See the wiki page for details.
* DENOVO and SOMATIC calls now get different filtering treatment to regular germline variants using threshold filters.
* --forest-file and --somatic-forest-file for random forest filtering.
* --somatics-only to report only SOMATIC variants.
* --denovos-only to report only DENOVO variants.
* --max-somatic-haplotypes, which limits the number of somatic haplotypes that may be used by the cancer calling model.
* --consider-reads-with-unmapped-segments --> --no-reads-with-unmapped-segments and --consider-reads-with-distant-segments --> --no-reads-with-distant-segments. These filters are now off by default.
* --max-cancer-genotypes removed and replaced with --max-genotypes, which is also used by the polyclone calling model.
* --max-clones option for specifying the maximum number of clones for the polyclone calling model.
* --somatic-filter-expression, --denovo-filter-expression, and --refcall-filter-expression, which may be used for hard filtering DENOVO and SOMATIC calls.

This version brings new features, in addition to significant calling and runtime improvements.
* --filter-vcf command line option.
* --threads command. This resolves #13.
* install.py is now supplied with both a C++ and C compiler with the cxx_compiler and c_compiler commands respectively.
* --no-supplementary-alignments changes to --allow-supplementary-alignments.
* --no-secondary-alignments changes to --allow-secondary-alignments.
* --clean with Python install script.
* .vcf.gz index files are now in the .tbi format, rather than .csi.

This version brings bug fixes and some minor performance improvements.
This release contains some runtime performance improvements, particularly for the tumour calling model. It also updates the requirements for GCC, CMake, and Boost.
This is a major release that contains significant new features and improvements.
This release includes a new de novo mutation model that improves trio calling.
* snv-denovo-mutation-rate and indel-denovo-mutation-rate. Gap open and extension penalties are weighted based on context.
* max-joint-genotypes to 1,000,00.
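Several options named in these notes were renamed in later releases, including the two de novo mutation rate options listed here. As a rough aid for updating old command lines or scripts, the pure renames can be collected into a lookup table. This is a minimal sketch: `RENAMED_OPTIONS` and `update_option_names` are hypothetical names, not part of the tool, and options whose meaning was inverted (for example the `--no-...` to `--allow-...` changes) are deliberately left out because they cannot be migrated by renaming alone.

```python
# Pure renames mentioned in the release notes (old name -> new name).
RENAMED_OPTIONS = {
    "--csr-train": "--training-annotations",
    "--snv-denovo-mutation-rate": "--denovo-snv-mutation-rate",
    "--indel-denovo-mutation-rate": "--denovo-indel-mutation-rate",
}


def update_option_names(args):
    """Return a copy of args with renamed options replaced.

    Handles both "--option value" and "--option=value" spellings;
    anything not in RENAMED_OPTIONS is passed through unchanged.
    """
    updated = []
    for arg in args:
        name, sep, value = arg.partition("=")
        updated.append(RENAMED_OPTIONS.get(name, name) + sep + value)
    return updated


# Example with an illustrative value for the rate.
old_args = ["--snv-denovo-mutation-rate", "1e-8", "--threads"]
print(update_option_names(old_args))
```

A plain dictionary keeps the mapping easy to audit against the release notes; a real migration would also need to handle removed options and changed defaults by hand.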