Official code repository for GATK versions 4 and up
Download release: gatk-4.5.0.0.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
HaplotypeCaller now supports custom ploidy regions that can be specified via a new --ploidy-regions argument, overriding the global -ploidy setting
The default SmithWaterman implementation for HaplotypeCaller and Mutect2 is now the hardware-accelerated version, resulting in a significant speedup
Funcotator has a new datasource release that brings in the latest version of Gencode and several other key data sources
We've updated our dependencies and our docker environment to greatly cut down on known security vulnerabilities
We've greatly improved support for http/https inputs in GATK-native tools (though most Picard tools bundled with GATK do not yet support it)
We've ported some additional DRAGEN features to HaplotypeCaller that bring us closer to functional equivalence with DRAGEN v3.7.8
GenomicsDBImport now has support for Azure storage az:// URIs
GnarlyGenotyper now has haploid support
Lots of important bug fixes, including a fix for a bug in the Intel GKL that could cause output files to intermittently fail to be compressed properly
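As a rough illustration of how a per-region ploidy override such as --ploidy-regions can be thought of, the sketch below resolves the ploidy for a position from BED-style records whose "name" column holds a positive integer, and falls back to a global default elsewhere. The function names, the first-match handling of overlapping regions, and the default of 2 are assumptions made for this example; it is not GATK's implementation.

```python
from typing import List, NamedTuple


class PloidyRegion(NamedTuple):
    contig: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive (BED convention)
    ploidy: int


def parse_ploidy_bed(lines: List[str]) -> List[PloidyRegion]:
    """Parse BED lines whose 4th ("name") column is a positive integer ploidy."""
    regions = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        contig, start, end, name = line.rstrip("\n").split("\t")[:4]
        ploidy = int(name)
        if ploidy <= 0:
            raise ValueError(f"ploidy must be a positive integer, got {name!r}")
        regions.append(PloidyRegion(contig, int(start), int(end), ploidy))
    return regions


def ploidy_at(regions, contig, pos0, default_ploidy=2):
    """Return the ploidy of the first region covering a 0-based position,
    or the global default when no region overlaps it (hypothetical helper)."""
    for region in regions:
        if region.contig == contig and region.start <= pos0 < region.end:
            return region.ploidy
    return default_ploidy


regions = parse_ploidy_bed(["chrX\t0\t1000\t1", "chrY\t0\t500\t1"])
print(ploidy_at(regions, "chrX", 10))   # position inside a user-supplied region
print(ploidy_at(regions, "chr1", 10))   # no region here, so the default applies
```

The lookup is a linear scan, which is fine for a handful of regions; an interval tree would be the usual choice for large region lists.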
HaplotypeCaller
Added a new argument to HaplotypeCaller called --ploidy-regions, which allows the user to input a .bed or .interval_list with the "name" column equal to a positive integer for the ploidy to use when calling variants in that region
The -ploidy flag will still provide the background default (or the built-in ploidy of 2 for humans), but the user-supplied values will supersede these in overlapping regions
SmithWaterman implementation to default to FASTEST_AVAILABLE (#8485)
--dont-use-softclipped-bases argument (#8271)
Mutect2
--base-qual-correction-factor to allow a scale factor to be provided to modify the base qualities reported by the sequencer and used in the Mutect2 substitution error model (#8447)
FilterMutectCalls for GVCFs (#8458)
Mutect2 (for example with the Mitochondria mode), in the filtering step ADs for symbolic alleles are set to 0 so they don't contribute to the overall AD. There was an off-by-one error that removed the alt allele AD rather than the <NON_REF> allele AD. This led to NaNs and errors when a site had no ref reads (for example, a GT of [ref,alt,<NON_REF>] and AD of [0,300,0] would accidentally be changed to an AD of [0,0,0] if the alt index was removed instead of the <NON_REF> index).
DRAGEN-GATK
PartiallyDeterminedHaplotypeComputationEngine (#8367)
PartiallyDeterminedHaplotypeComputationEngine and preparing for joint detection (#8492)
EventGroup subclass (#8400)
Joint Calling
GnarlyGenotyper (#7750)
GenotypeGVCFs to properly handle events not in minimal representation (#8567)
ReblockGVCF: added a --keep-site-filters argument to keep site-level filters (#8304) (#8308)
ReblockGVCF: added a --add-site-filters-to-genotype argument to move site-level filters to genotype-level filters (#8484)
ReblockGVCF: added a --format-annotations-to-remove argument to specify format-level annotations to remove from all genotypes in final GVCF (#8411)
ReblockGVCF: added a check to make sure the input VCF is a GVCF rather than a single sample VCF (#8411)
GnarlyGenotyper (#8270)
mergeWithRemapping() method in ReferenceConfidenceVariantContextMerger to perform allele remapping prior to genotyping (#8318)
GenomicsDB
GenomicsDBImport to accept Azure az:// URIs as input (#8438)
GenomicsDB release with Java 17 support, improved error messages/logging, and generally improved performance (#8358)
Funcotator
Gencode to version 43, and also updated COSMIC, Clinvar, and several other datasources to their latest versions
Gencode GTF versions by making the GencodeGTFField parsing more permissive (#8351)
Funcotator VCF output renderer to correctly preserve B37 contig names on output for B37 aligned files (#8539)
Funcotator to crash with certain datasources (#8445)
LocatableXsvFuncotationFactory to read gzipped files (#8363)
CNV Calling
GermlineCNVCaller java doc (#8064)
SV Calling
SVCluster (#8408)
SVConcordance and SVCluster tools. This is particularly useful for accurately matching smaller SVs that have a high degree of breakpoint uncertainty, in which case reciprocal overlap does not work well. PESR/mixed variant types must have size similarity, reciprocal overlap, and breakend window criteria met. Depth-only variants may have either size similarity + reciprocal overlap OR breakend window criteria met (or both).
SVConcordance (#8211)
Mitochondrial pipeline
Flow-based Calling
GroundTruthScorer tool to score reads against a reference/ground truth
FlowFeatureMapper
AddFlowBaseQuality tool that writes reads from flow-based SAM/BAM/CRAM files that pass criteria to a new file while adding a base-quality attribute (BQ) (#8235)
FlowPairHMMAlignReadsToHaplotypes that aligns flow-based reads to a set of haplotypes / templates (#8305)
FlowBasedAnnotation that contained a bug and thus was meaningless (#8421)
GroundTruthScorer doc update (#8597)
Notable Enhancements
samtools and bcftools (#8610)
build.gradle (#8607)
http-nio library and made tweaks to HTSJDK to make it available in more places. The new version of http-nio should provide much more reliable access to http(s) file paths. This is supported by all methods accessing Paths, and includes SAM/BAM/CRAM and VCF/Feature files. It includes a new retry mechanism which retries after transient errors. It also includes bug fixes and various other minor improvements, such as making encoded Path handling more consistent.
PrintFileDiagnostics tool that can output the internal metadata of CRAM, CRAI and BAI files for diagnostic purposes (#8577)
TransmittedSingleton annotation and added quality threshold arguments to the PossibleDenovo annotation (#8329)
ReadNameReadFilter (#8405)
2bit references, and removed the dependency on the ADAM library (#8606)
Bug Fixes
Miscellaneous Changes
CNNVariantTrain: exposed more CNN training parameters as arguments (#8483)
dockstore.yaml (#8323)
AnalyzeSaturationMutagenesis to keep disjoint mates (#8557)
OutOfMemoryError (#8277)
Documentation
Dependencies
Picard to 3.1.1 (#8585)
HTSJDK 4.1.0 (#8620)
Intel GKL to 0.8.11 (#8409)
Apache Spark to 3.5.0 (#8607)
Hadoop to 3.3.6 (#8607)
google-cloud-nio to 0.127.8
http-nio to 1.1.0 (#8626)
Download release: gatk-4.4.0.0.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
We've moved to Java 17, the latest long-term support (LTS) Java release, for building and running GATK! Previously we required Java 8, which is now end-of-life.
Significant enhancements to SelectVariants, including arguments to enable GVCF filtering support and to work with genotype fields more easily.
A new tool, SVConcordance, that calculates SV genotype concordance between an "evaluation" VCF and a "truth" VCF
Bug fixes and enhancements to the support for the Ultima Genomics flow-based sequencing platform introduced in GATK 4.3.0.0
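To make the GVCF filtering point concrete: these notes explain that a NON_REF allele causes every GVCF variant to be typed as MIXED, which is what --ignore-non-ref-in-types addresses. The sketch below is a deliberately simplified type classifier, not the htsjdk or GATK logic; the function names and the reduced set of types are assumptions for illustration.

```python
NON_REF = "<NON_REF>"


def allele_type(ref: str, alt: str) -> str:
    """Classify a single ALT allele relative to REF (simplified)."""
    if alt.startswith("<"):
        return "SYMBOLIC"
    if len(alt) == len(ref):
        return "SNP" if len(ref) == 1 else "MNP"
    return "INDEL"


def variant_type(ref, alts, ignore_non_ref=False):
    """Return one type for the site, or MIXED if the ALT alleles disagree.

    ignore_non_ref mimics the idea behind --ignore-non-ref-in-types:
    the <NON_REF> allele is dropped before the type is determined.
    """
    if ignore_non_ref:
        alts = [a for a in alts if a != NON_REF]
    types = {allele_type(ref, a) for a in alts}
    if not types:
        return "NO_VARIATION"
    return types.pop() if len(types) == 1 else "MIXED"


# A typical GVCF SNP record carries both a real ALT and <NON_REF>.
print(variant_type("A", ["G", NON_REF]))                       # MIXED
print(variant_type("A", ["G", NON_REF], ignore_non_ref=True))  # SNP
```

With the NON_REF allele ignored, a filter for SNPs can match the record; without it, the SNP and symbolic alleles together make the site MIXED.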
Flow-based Variant Calling
FlowFeatureMapper: added surrounding-median-quality-size feature (#8222)
Mutect2 parameters to make them consistent with the HaplotypeCaller parameters (#8186)
SelectVariants
SelectVariants (#7193)
--ignore-non-ref-in-types to support correct handling of VariantContexts that contain a NON_REF allele. This is necessary because every variant in a GVCF file would otherwise be assigned the type MIXED, which makes it impossible to filter for e.g. SNPs.
SelectVariants: added new arguments for controlling genotype JEXL filtering (#8092)
-select-genotype: with this new genotype-specific JEXL argument, we support easily filtering by genotype fields with expressions like 'GQ > 0', where the behavior in the multi-sample case is 'GQ > 0' in at least one sample. It's still possible to manually access genotype fields using the old -select argument and expressions such as vc.getGenotype('NA12878').getGQ() > 0.
--apply-jexl-filters-first: This flag is provided to allow the user to do JEXL filtering before subsetting the format fields, in particular the case where the filtering is done on INFO fields only, which may improve speed when working with a large cohort VCF that contains genotypes for thousands of samples.
SV Calling
SVConcordance, that calculates SV genotype concordance between an "evaluation" VCF and a "truth" VCF (#7977)
SVAnnotate (#8125)
AnalyzeSaturationMutagenesis (#8053)
Notable Enhancements
GenotypeGVCFs: added a --keep-specific-combined-raw-annotation argument to keep specified raw annotations (#7996)
VariantAnnotator now warns instead of failing when the variant contains too many alleles (#8075)
GenomicsDB arguments to the CreateSomaticPanelOfNormals tool (#6746)
DeprecatedFeature annotation and a process for officially marking GATK tools as deprecated (#8100)
close() methods from hiding underlying errors (#7764)
Bug Fixes
VariantRecalibrator to sometimes fail if user provided duplicate -an options (#8227)
ReblockGVCF: remove A, R, and G length attributes when ReblockGVCF subsets an allele (#8209)
ReblockGVCF would not remove all of them at sites where an allele was dropped. This makes the output gVCF invalid since the annotation length no longer matches the length described in the header at those sites. Now we fix up F1R2, F2R1, and AF annotations and remove any other annotations that are not already handled that are defined as A, R, or G length in the header.
gCNV bug that breaks the inference when only 2 intervals are provided (#8180)
GenotypingEngine (#8159)
StreamingPythonExecutor/CNNScoreVariants (#7402)
ShiftFasta where the interval list output was never written (#8070)
MergeAnnotatedRegions now requires a reference as asserted in its documentation (#8067)
Miscellaneous Changes
VariantRecalibrator argument and an old ReblockGVCF argument that produced invalid GVCFs (#8140)
GnarlyGenotyper code with a diploid assumption to prepare for adding haploid support to GnarlyGenotyper (#8140)
ReblockGVCF: add error message for when tree-score-threshold is set but the TREE_SCORE annotation is not present (#8218)
TransferReadTags: allow empty unaligned bams as input (#8198)
JointVcfFiltering WDL and expanded tests. (#8074)
#carrot_pr to trigger branch vs master comparison runs (#8084)
File.createTempFile() with IOUtils.createTempFile() to ensure that temp files are deleted on shutdown (#6780)
CNNScoreVariants tool classes. (#8128)
Funcotator methods and fields protected so it is easier to extend the tool (#8124) (#8166)
ProcessControllerAckResult API (#7816)
DirichletAlleleDepthAndFractionIntegrationTest (#7963)
HaplotypeCaller test files that are no longer needed (#7634)
Documentation
OMP_NUM_THREADS and MKL_NUM_THREADS to GermlineCNVCaller and DetermineGermlineContigPloidy (#8223)
PileupDetectionArgumentCollection documentation (#8050)
VariantAnnotator (#8145)
Dependencies
Java 17, the latest LTS Java release, for building/running GATK (#8035)
Gradle to 7.5.1 (#8098)
HTSJDK to 3.0.5 (#8035)
Picard to 3.0.0 (#8035)
Barclay to 5.0.0 (#8035)
GenomicsDB to 1.4.4 (#7978)
Spark to 3.3.1 (#8035)
Hadoop to 3.3.1 (#8102)
commons-text 1.10.0 to fix a security vulnerability (#8071)
Download release: gatk-4.3.0.0.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
Support for the Ultima Genomics flow-based sequencing platform
A next-generation suite of tools for variant filtration based on site-level annotation, intended to eventually supersede the older VariantRecalibrator workflow
CompareReferences and CheckReferenceCompatibility: new tools for comparing and checking compatibility with genomic references
Support in HaplotypeCaller/Mutect2 for supplementing the variants discovered in local assembly with variants discovered via a pileup-based approach
Support for the Ultima Genomics flow-based sequencing platform (#7876)
--flow-mode argument to HaplotypeCaller which better supports flow-based calling
FlowBasedHMM and the FlowBasedAlignmentLikelihoodEngine
--flow-mode argument to Mutect2 which better supports flow-based calling
MarkDuplicatesSpark
FlowFeatureMapper for quick heuristic calling of bams for diagnostics
GroundTruthReadsBuilder to generate ground truth files for Basecalling
HaplotypeBasedVariantRecaller for recalling VCF files using the HaplotypeCallerEngine
SplitCram
FlowBasedRead that manages the new features for FlowBased data
PartialReadsWalker that supports terminating before traversal is finished
Next-generation suite of tools for variant filtration based on site-level annotations (#7954) (#8049)
VariantRecalibrator workflow
ExtractVariantAnnotations: extracts site-level variant annotations, labels, and other metadata from a VCF file to HDF5 files
TrainVariantAnnotationsModel: trains a model for scoring variant calls based on site-level annotations
ScoreVariantAnnotations: scores variant calls in a VCF file based on site-level annotations using a previously trained model
New Reference Comparison Tools
CompareReferences: a new tool for analyzing the differences between references at both the dictionary and the base level (#7930) (#7987) (#7973)
-R argument. Subsequent references to be compared may be specified using the --references-to-compare argument.
--display-sequences-by-name argument; to display only sequence names for which the references are not consistent, run with the --display-only-differing-sequences argument as well.
--base-comparison FULL_ALIGNMENT, the tool performs full-sequence alignment on the differing reference sequences to produce a VCF with SNPs and Indels. However, this mode ignores IUPAC / N bases.
--base-comparison FIND_SNPS_ONLY finds single-base differences between differing reference sequences of the same length. This mode can handle IUPAC / N bases correctly, but not indels.
MUMmer for x86_64 Mac and Linux, which can be invoked from within the GATK using the new MummerExecutor class.
CheckReferenceCompatibility: a new tool to check a BAM/CRAM/VCF for compatibility against a set of references (#7959) (#7973)
--references-to-compare argument.
HaplotypeCaller/Mutect2
Mutect2 and HaplotypeCaller before assembly that supplements the variants from local assembly with variants that show up in the pileups (#7432)
Mutect2 IndexOutOfBoundException with germline resource (#7979)
Mutect3 dataset enhancements: optional truth VCF for labels, seq error likelihood annotation (#7975)
Mutect3 dataset generation to the Mutect2 WDL (#7992)
GetPileupSummaries now streams its output rather than storing it in memory (#7664)
AdaptiveChainPruner where the Java PriorityQueue is undefined for tied elements (#7851)
SV Calling
CondenseDepthEvidence: a new tool that combines adjacent intervals in DepthEvidence files (#7926)
LocusDepthtoBAF: a new tool that merges locus-sorted LocusDepth evidence files, calculates the bi-allelic frequency (baf) for each sample and site, and writes these values as a BafEvidence output file (#7776)
PrintReadCounts: a new tool that prints (and optionally subsets) a read depth (DepthEvidence) file or a counts file as one or more (for multi-sample DepthEvidence files) counts files for CNV determination (#8015)
CollectSVEvidence: fixed a bug where trailing SNP sites and depth intervals without read coverage were being omitted from the output (#8045)
CollectSVEvidence: added read depth generation and raw-counts output (#8015)
PrintSVEvidence performance by tweaking the MultiFeatureWalker traversal (#7869)
BafEvidence (biallelic-frequency of a sample at some locus) (#7861)
SVClusterEngine (#7779)
CNV Calling
JointGermlineCNVSegmentation (#7779)
ModelSegments single-sample and multiple-sample modes (#7652)
GenomicsDB
GenomicsDBImport: added the ability to specify explicit index locations via the sample name map file (#7967)
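A small sketch of what reading such a sample name map might look like. It assumes a tab-separated layout of sample name, VCF path, and an optional third column holding the index path; that layout, the file paths, and the helper names are assumptions for illustration, so check the GenomicsDBImport documentation for the authoritative format.

```python
from typing import Dict, NamedTuple, Optional


class SampleEntry(NamedTuple):
    vcf: str
    index: Optional[str]  # None means no explicit index location was given


def parse_sample_name_map(text: str) -> Dict[str, SampleEntry]:
    """Parse an (assumed) tab-separated map: sample, VCF path, optional index path."""
    entries: Dict[str, SampleEntry] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) not in (2, 3):
            raise ValueError(f"line {lineno}: expected 2 or 3 tab-separated fields")
        sample, vcf = fields[0], fields[1]
        if sample in entries:
            raise ValueError(f"line {lineno}: duplicate sample {sample!r}")
        index = fields[2] if len(fields) == 3 else None
        entries[sample] = SampleEntry(vcf, index)
    return entries


# Hypothetical paths: one sample with an explicit index, one without.
example = (
    "sampleA\tgs://bucket/a.g.vcf.gz\tgs://other/a.g.vcf.gz.tbi\n"
    "sampleB\tgs://bucket/b.g.vcf.gz\n"
)
for name, entry in parse_sample_name_map(example).items():
    print(name, entry.vcf, entry.index)
```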
Bug Fixes
ReblockGVCF that could cause the first position on a contig to be dropped (#8028)
VariantRecalibrator: type change int -> long to prevent tranche novel variant count overflow (#7864)
SiteDepthCodec (#7910)
Miscellaneous Changes
VariantsToTable now includes all fields when none are specified (#7911)
SelectVariants now warns the user about poor performance when the sample names in the VCF header are unsorted (#7887)
VariantRecalibrator now has a --dont-run-rscript argument to disable execution of its R script but still output the actual R script file (#7900)
build_docker_remote.sh script for building the docker image remotely with Google Cloud Build (#7951)
HaplotypeCaller --dragen-mode (#7745)
Utils.concat() methods (#7918)
use_allele_specific_annotation arg and fixed task with empty input in the JointVcfFiltering WDL (#8027)
utils.solver package (#7922)
CodeCov builds and delaying the posting of coverage information to complete test (#7817)
Documentation
JointGermlineCNVSegmentation as a DocumentedFeature (#7871)
SVAnnotate as a DocumentedFeature (#7833)
CollectSVEvidence as a DocumentedFeature (#8041)
GenotypeGVCFs for some reblocking-related funkiness (#7846)
Dependencies
HTSJDK to 3.0.1 (#8025)
Picard to 2.27.5 (#8025)
protobuf to 3.21.6 (#8036)
gsalib to 2.2.1 (#8048)
typing_extensions Python package to 4.1.1 in the GATK conda environment (#7802)
Download release: gatk-4.2.6.1.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
This release contains a single bug fix for GenotypeGVCFs to fix an erroneous IllegalStateException ("No likelihood sum exceeded zero -- method was called for variant data with no variant information.") in the edge case where unnormalized PLs are present at monomorphic sites.
Download release: gatk-4.2.6.0.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
Important bug fixes for the joint calling tools (GenotypeGVCFs / GenomicsDB)
GenotypeGVCFs can throw NullPointerExceptions in some cases with many alternate alleles.
Fixed a "Bucket is a requester pays bucket but no user project provided" error that occurred when accessing requester pays buckets in Google Cloud Storage even when the --gcs-project-for-requester-pays argument was specified
Two new tools for the Structural Variation calling pipeline: SVAnnotate and PrintSVEvidence
Some fixes to genotype-given-alleles mode in HaplotypeCaller and Mutect2
Joint Calling (GenotypeGVCFs / GenomicsDB)
GenotypeGVCFs can throw NullPointerExceptions in some cases with many alternate alleles.
NullPointerException when GenomicsDB has more ALT alleles than specified maximum and many GQ0 hom-ref genotypes allow variants to pass the QUAL filter (#7738)
ReblockGVCFs (#7670)
GenomicsDBImport error message (#7692)
SV Calling
SVAnnotate (#7431)
SVAnnotate adds functional annotations for SVs called by GATK-SV (#7431)
PrintSVEvidence (#7695)
PrintSVEvidence is a tool that can merge any number of files containing one of five types of evidence of structural variation. It's also capable of subsetting regions or samples. It's used to merge evidence from a cohort in the GATK-SV pipeline.
SVCallRecord (#7714)
HaplotypeCaller / Mutect2
HaplotypeCaller where filtered alleles in the vicinity of forced-calling alleles could result in empty calls (#7740)
HaplotypeCaller and Mutect2 where force-calling alleles were lost upon trimming by placing allele injection after trimming (#7679)
Mutect2 to support the future Mutect3 (#7663)
RNA Tools
TransferReadTags: a new tool that transfers a read tag from an unaligned bam to the matching aligned bam (#7739).
PostProcessReadsForRSEM: a new tool that re-orders and filters reads before running RSEM, which has stringent requirements on the input SAM (https://github.com/deweylab/RSEM) (#7752).
Funcotator
VariantClassification severity ordering. (#7673)
VariantClassifications using the new --custom-variant-classification-order argument
VariantRecalibrator
VariantRecalibrator (#7709)
Bug Fixes
--gcs-project-for-requester-pays was specified (#7700) (#7730)
PossibleDeNovo annotation to work without Genotype Likelihoods (#7662)
PossibleDeNovo checks each trio's genotype (including parent hom ref genotypes) for likelihoods even though it doesn't actually use the PLs. The PLs can get dropped if GVCFs are reblocked which means this annotation no longer works as expected. This changes the check to look for GQs instead of PLs as the GQs are used as part of the annotation.
--mate-too-distant-length in MateDistantReadFilter not being configurable (#7701)
GATK Engine
MultiFeatureWalker traversal to the GATK engine (#7695)
LocusIteratorByState (#6410)
Miscellaneous Changes
jcenter repository resolver to our gradle build, fixing a "Could not find biz.k11i:xgboost-predictor:0.3.0" error when building GATK from source (#7665)
latest tag in the broadinstitute/gatk-nightly Dockerhub repo (#7703)
git lfs pull on src/main/resources/large (#7727)
Dockerfile (#7682)
MultiVariantWalkers by adding a companion index to the MultiVariantWalker input variant arg (#7689)
JointCallExomeCNVs to .dockstore.yml and included a note in the WDL (#7719)
Documentation
--heterozygosity argument in the GenotypeCalculationArgumentCollection (#7661)
Dependencies
Picard to 2.27.1 (#7766)
google-cloud-nio to 0.123.25 (#7730)
Download release: gatk-4.2.5.0.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
Fixed a GenotypeGVCFs IllegalStateException error reported by multiple users in https://github.com/broadinstitute/gatk/issues/7639
Added a new tool SVCluster that clusters structural variants based on coordinates, event type, and supporting algorithms.
Joint Calling (GenotypeGVCFs / GenomicsDB)
IllegalStateException in GenotypeGVCFs arising from GenomicsDB output with too many alts and no likelihoods, and also added a --genomicsdb-max-alternate-alleles argument that is separate from the --max-alternate-alleles argument used by GenotypeGVCFs (#7655)
GenotypeGVCFs error reported in https://github.com/broadinstitute/gatk/issues/7639
--genomicsdb-max-alternate-alleles argument is required to be at least one greater than the --max-alternate-alleles argument, to account for the NON_REF allele.
ReblockGVCF: fixed an edge case where hom-ref "variant" records with no data had wrong-sized PLs and didn't merge with adjacent blocks (#7644)
SV Calling
SVCluster that clusters structural variants based on coordinates, event type, and supporting algorithms. (#7541)
Mutect2
Mutect2 reported in #6851
GATK Engine
ExcessiveEndClippedReadFilter (#7638)
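The constraint stated under Joint Calling for this release, that --genomicsdb-max-alternate-alleles must be at least one greater than --max-alternate-alleles to leave room for the NON_REF allele, can be expressed as a simple pre-flight check. This is a hypothetical helper for a wrapper script, not GATK code, and it makes no assumption about either argument's default value.

```python
def check_alt_allele_limits(max_alternate_alleles: int,
                            genomicsdb_max_alternate_alleles: int) -> None:
    """Raise if the GenomicsDB limit leaves no room for the NON_REF allele."""
    required = max_alternate_alleles + 1  # +1 accounts for NON_REF
    if genomicsdb_max_alternate_alleles < required:
        raise ValueError(
            "--genomicsdb-max-alternate-alleles "
            f"({genomicsdb_max_alternate_alleles}) must be at least {required} "
            f"when --max-alternate-alleles is {max_alternate_alleles}"
        )


check_alt_allele_limits(6, 7)      # accepted: exactly one greater
try:
    check_alt_allele_limits(6, 6)  # rejected: no room for NON_REF
except ValueError as err:
    print(err)
```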
Download release: gatk-4.2.4.1.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
Build System
GenomicsDB
Miscellaneous Changes
Dependencies
Download release: gatk-4.2.4.0.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
Funcotator
GenotypeGVCFs / ExcessHet
Miscellaneous Changes
Documentation
Dependencies
Download release: gatk-4.2.3.0.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
Notable bug fixes for Mutect2 and Funcotator
Support in CombineGVCFs and GenotypeGVCFs for "reblocked" GVCFs as produced by the ReblockGVCF tool. Reblocked GVCFs have a significantly reduced storage footprint.
More control over the Smith-Waterman parameters in HaplotypeCaller and Mutect2
A new Fragment Allele Depth (FAD) variant annotation similar to the AD annotation except that allele support is considered per read pair, not per individual read
GenomicsDB bug fixes and enhancements
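The difference between per-read and per-read-pair allele support mentioned for the FAD annotation above can be illustrated with a toy count. The input format (one tuple of read-pair name and supported allele per informative read) and the rule that a fragment counts once for each allele it supports are assumptions of this sketch; GATK's actual computation works from likelihoods and is not reproduced here.

```python
from collections import Counter
from typing import Iterable, Tuple


def allele_depths(observations: Iterable[Tuple[str, str]]):
    """Return (AD-like, FAD-like) counts from (read_pair_name, allele) tuples.

    AD-like counts every read; FAD-like counts each read pair (fragment)
    at most once per allele, so two mates supporting the same allele
    contribute a single unit of support.
    """
    observations = list(observations)
    per_read = Counter(allele for _, allele in observations)
    per_fragment = Counter(allele for _, allele in set(observations))
    return per_read, per_fragment


reads = [
    ("frag1", "A"), ("frag1", "A"),   # both mates of frag1 overlap the site
    ("frag2", "A"),                   # only one mate of frag2 overlaps
    ("frag3", "G"), ("frag3", "G"),   # both mates of frag3 overlap the site
]
ad, fad = allele_depths(reads)
print(dict(ad))    # per-read support
print(dict(fad))   # per-fragment support
```

Overlapping mates are not independent observations of the fragment, which is the motivation for counting them once.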
HaplotypeCaller/Mutect2
Mutect2 failed to filter germline variants with alternate representations (#7103)
HaplotypeCaller, Mutect2, and FilterAlignmentArtifacts. (#6885)
FilterAlignmentArtifacts (#7105)
--debug-assembly-variants-out diagnostic option to output a side VCF with variants detected by assembly for HaplotypeCaller and Mutect2 (#7384)
Mutect2: the --genotype-germline-sites argument is no longer marked as experimental (#7533)
GenotypeGVCFs / CombineGVCFs
CombineGVCFs and GenotypeGVCFs to handle "reblocked" GVCFs with diploid data that are potentially missing hom-ref genotype PLs (#7223)
GenotypeGVCFs (#7471)
GenotypeGVCFs/GnarlyGenotyper when allele-specific annotations have empty values due to lack of informative reads or no depth (#7491) (#7186)
GenomicsDB
--call-genotypes GenomicsDB argument, enabling output of called genotypes (i.e. not ./.) when tools like CombineGVCFs and SelectVariants read from a GenomicsDB workspace (#7223)
--bypass-feature-reader argument to GenomicsDBImport to allow the C-based htslib VCF reader implementation to be used instead of the Java implementation (#7393)
Funcotator
StringIndexOutOfBoundsException in the protein change prediction code that could be triggered by certain indels. The fix avoids the crash by adding additional bounds checking. (#7513)
FilterFuncotations to process multi-transcript genes (#7506)
CNV Calling
--num-samples-copy-ratio-approx argument (#7450)
SV Calling
JointGermlineCNVSegmentation: bug fixes and refactoring (#7243)
JointGermlineCNVSegmentation
JointGermlineCNVSegmentation for SV clustering and defragmentation. The design of SVClusterEngine has been overhauled to enable the implementation of CNVDefragmenter and BinnedCNVDefragmenter subclasses. Logic for producing representative records from a collection of clustered SVs has been separated into an SVCollapser class, which provides enhanced functionality for handling genotypes for SVs more generally.
Notable Enhancements
New Fragment Allele Depth (FAD) variant annotation (#7511)
Similar to the AD annotation except that allele support is considered per read pair, not per individual read
Miscellaneous Changes
SplitIntervals: added new tool arguments to control output file naming (#7488)
Documentation
StrandBiasBySample documentation (#7283)
MarkDuplicatesSpark documentation (#7191) (#7535)
Dependencies
GenomicsDB 1.4.2 (#7520)
sqlite-jdbc library to a newer version to support M1 Macs (#7519)
Download release: gatk-4.2.2.0.zip Docker image: https://hub.docker.com/r/broadinstitute/gatk/
The ReblockGVCF tool is now out of beta with several important improvements. This tool can be used to postprocess HaplotypeCaller GVCFs to decrease filesize.
FilterMutectCalls now has a --microbial-mode argument that sets filters to defaults appropriate for microbial calling
Important bug fixes to CalibrateDragstrModel and Funcotator
New Tools
ShiftFasta: create a fasta with the bases shifted by an offset (#6694)
ReblockGVCF
ReblockGVCF is now out of beta (#7419)
ReblockGVCF output to eliminate overlapping reference blocks and reference gaps following trimmed deletions (#7122)
--floor-blocks arg is not provided); fixed rare cases where spanning deletion (*) allele is incorrectly modified (#7400)
Mutect2
FilterMutectCalls: added a --microbial-mode argument that sets filters to defaults appropriate for microbial calling (#6694)
ValidateVariants
DRAGEN-GATK
CalibrateDragstrModel that could cause intermittent ArrayIndexOutOfBoundsExceptions (#7417)
ComposeSTRTableFile (#7409)
Funcotator
Match_Norm_Seq_Allele1 and Match_Norm_Seq_Allele2 fields were not being populated in MAF output (#7422)
Mitochondrial pipeline
FilterNuMTs and FilterLowHetSites, which are no longer being used (#7325)
CNV Calling
GermlineCNVCaller and improved documentation of corresponding utility methods. (#7411)
Documentation
CombineGVCFs docs (#7413)
MultiVariantDataSource (#7388)