The latest GATK release is out, with changes corresponding to the period of April 24, 2020 - June 19, 2020. The full GATK release notes are available on the GATK GitHub, but here is just a taste of what's new in GATK 4.1.8.0:
The GenomicsDB datastore format has a major new release (v1.3.0). Developed to store variant call data, GenomicsDB now includes enhanced support for shared filesystems such as NFS and Lustre, as well as support for multinucleotide variants (MNVs). It also supports better compression, which reduces workspace size by approximately 50%.
This release also includes a fix for the frequently reported "NullPointerException" in GenotypeGVCFs when reading from GenomicsDB.
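For readers who have not used GenomicsDB with GATK before, the following is a minimal sketch of the workflow these changes affect: importing per-sample GVCFs into a workspace, then genotyping from that workspace through a `gendb://` path. The helper function names, file names, and the interval are hypothetical; only the standard `-V`, `-L`, `-R`, `-O`, and `--genomicsdb-workspace-path` arguments are assumed. The sketch only builds and prints the command lines; it does not run GATK.

```python
import shlex
from typing import List, Sequence


def genomicsdb_import_cmd(gvcfs: Sequence[str], workspace: str, interval: str) -> List[str]:
    """Build an argument list for `gatk GenomicsDBImport` (hypothetical helper)."""
    cmd = ["gatk", "GenomicsDBImport"]
    for gvcf in gvcfs:
        # One -V argument per input GVCF.
        cmd += ["-V", gvcf]
    cmd += ["--genomicsdb-workspace-path", workspace, "-L", interval]
    return cmd


def genotype_gvcfs_cmd(reference: str, workspace: str, output: str) -> List[str]:
    """Build an argument list for `gatk GenotypeGVCFs` reading from a workspace."""
    # The gendb:// prefix tells GATK to read variants from a GenomicsDB workspace.
    return ["gatk", "GenotypeGVCFs", "-R", reference,
            "-V", f"gendb://{workspace}", "-O", output]


def render(cmd: Sequence[str]) -> str:
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # File names and the interval below are placeholders, not real data.
    print(render(genomicsdb_import_cmd(["a.g.vcf.gz", "b.g.vcf.gz"], "my_database", "chr20")))
    print(render(genotype_gvcfs_cmd("reference.fasta", "my_database", "joint.vcf.gz")))
```

Building the commands as argument lists keeps each path a single argument, so they can be passed straight to `subprocess.run` without shell quoting issues.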
The PathSeq microbial detection pipeline contains many improvements, including a WDL redesign that significantly improves performance on the cloud. Downsampling can now be applied to BAMs with high microbial content (i.e., >10M reads) that would normally cause performance issues.
There is now prototype support for reading from HTSGET services in GATK, which will play a more important role in upcoming releases.
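As background, htsget is a GA4GH protocol in which a client requests a genomic region over HTTP and receives a JSON "ticket" listing the URLs of the data blocks to download. The sketch below illustrates that request/ticket shape only; it is not GATK's implementation. The server address, sample ID, and helper names are hypothetical, and the sample ticket is made up for illustration.

```python
from typing import List, Optional
from urllib.parse import urlencode


def htsget_reads_url(base_url: str, read_id: str,
                     reference_name: Optional[str] = None,
                     start: Optional[int] = None,
                     end: Optional[int] = None,
                     fmt: str = "BAM") -> str:
    """Build an htsget reads request URL for an optional genomic region."""
    params = {"format": fmt}
    if reference_name is not None:
        params["referenceName"] = reference_name
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    return f"{base_url.rstrip('/')}/reads/{read_id}?{urlencode(params)}"


def ticket_urls(ticket: dict) -> List[str]:
    """Extract the data-block URLs, in order, from an htsget ticket."""
    return [block["url"] for block in ticket["htsget"]["urls"]]


if __name__ == "__main__":
    # Hypothetical server and sample ID.
    print(htsget_reads_url("https://htsget.example.org", "NA12878", "chr20", 0, 1000))

    # Made-up ticket in the shape the protocol describes.
    sample_ticket = {
        "htsget": {
            "format": "BAM",
            "urls": [
                {"url": "https://data.example.org/NA12878/header"},
                {"url": "https://data.example.org/NA12878/chr20-block"},
            ],
        }
    }
    print(ticket_urls(sample_ticket))
```

A client concatenates the downloaded blocks in ticket order to obtain a valid file covering the requested region.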
Bug fixes to Mutect2 address frequently reported errors such as "evidence provided is not in sample" and "String index out of range".
The GATK Docker image is now built off of Ubuntu 18.04 instead of 16.04, which brings in newer versions of several important packages such as samtools. This also updates many of the Python libraries installed via the Conda environment. R dependencies are now installed via Conda in our Docker build instead of the now-removed