GATK 22.214.171.124 and GATK 126.96.36.199 cover the changes made to GATK between February 4, 2022 and April 13, 2022. While we always recommend using the newest version of GATK, in this particular case it is critical that you upgrade.
Some especially critical bugs were discovered in GATK 188.8.131.52, and these have been addressed in the newest version.
Spanner in the machine
When we released GATK 184.108.40.206, some changes were made to GenomicsDB that resulted in a NullPointerException in some cases with many alternate alleles. Additionally, a feature of the QUAL calculation was disabled, which could lead to false positives and low-quality alleles at some multi-allelic sites.
As a result, for any workspaces or pipelines that have used GenotypeGVCFs under GATK 220.127.116.11 (such as the WARP Joint Genotyping pipeline), we strongly recommend that you re-run the analyses under GATK 18.104.22.168.
If your analysis has already been completed, it may be possible to filter out multi-allelic sites instead, but the safest bet is to re-run the analysis to ensure that the data was not affected by these bugs at all.
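For readers who want to try the filtering route first, here is a minimal sketch of what dropping multi-allelic records from an uncompressed VCF could look like. It assumes that "multi-allelic" means more than one comma-separated entry in the ALT column; the function name `drop_multiallelic` and the sample records are made up for illustration and are not part of GATK. It is not a substitute for re-running the analysis.

```python
from typing import Iterable, Iterator


def drop_multiallelic(vcf_lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines and records that have at most one ALT allele.

    Assumption: a site is treated as multi-allelic when the ALT column
    (the fifth tab-separated field) contains a comma.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            # Keep meta-information and the column header untouched.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 4 and "," in fields[4]:
            # More than one ALT allele: skip this record.
            continue
        yield line


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\t.\t.\n",
    "chr1\t200\t.\tC\tT,G\t12\t.\t.\n",
]
kept = list(drop_multiallelic(example))
```

In this example, `kept` holds the two header lines and the biallelic record at position 100; the record at position 200 is dropped because its ALT column lists two alleles.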
In addition, we fixed the "Bucket is a Requester Pays bucket but no user project provided" error that would occur when accessing Requester Pays buckets in Google Cloud Storage, even though the --gcs-project-for-requester-pays argument was being specified. If you experienced this error in the past, it should now be resolved. If you continue to have difficulties accessing Requester Pays Google Cloud Storage buckets, please let us know by filing an issue on GitHub.
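As a rough illustration of where the argument goes when re-running joint genotyping from a script, the sketch below assembles a GenotypeGVCFs command line and appends --gcs-project-for-requester-pays only when a billing project is supplied. The helper name `genotype_gvcfs_command`, the bucket paths, and the project ID are placeholders, and the sketch assumes the `gatk` wrapper script is on your PATH.

```python
import shlex
from typing import List, Optional


def genotype_gvcfs_command(
    reference: str,
    variants: str,
    output: str,
    requester_pays_project: Optional[str] = None,
) -> List[str]:
    """Build an argument list for a GenotypeGVCFs run.

    All paths and the project ID are supplied by the caller; nothing
    here is validated against GATK or Google Cloud Storage.
    """
    cmd = ["gatk", "GenotypeGVCFs", "-R", reference, "-V", variants, "-O", output]
    if requester_pays_project:
        # Project to bill when reading from Requester Pays buckets.
        cmd += ["--gcs-project-for-requester-pays", requester_pays_project]
    return cmd


# Placeholder inputs, for illustration only.
cmd = genotype_gvcfs_command(
    reference="gs://my-bucket/ref.fasta",
    variants="gendb://my_workspace",
    output="joint.vcf.gz",
    requester_pays_project="my-billing-project",
)
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run(cmd, check=True)`; building it as a list rather than a single string avoids shell-quoting problems with paths.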
What else is new in GATK 22.214.171.124
In addition to bug fixes, a slew of new tools and features have been added to GATK. The full release notes are available on the GATK GitHub, but here is a rundown of the main highlights in this version.
- New RNA tools: PostProcessReadsForRSEM is for re-ordering and filtering reads before running RSEM (for estimating gene and isoform expression levels from RNA-seq data). Also, the TransferReadTags tool can transfer the read tags from an unaligned BAM to its matching aligned BAM counterpart. These are tags that would normally get lost when converting a SAM file to FASTQ and then back to SAM (i.e., when clipping adapter bases before alignment).
- Two new tools for the Structural Variation calling pipeline: the SVAnnotate tool adds functional annotations to SVs called by the GATK-SV pipeline, while the new PrintSVEvidence tool can merge evidence from a cohort in the
- Fix for a bug in Mutect2 where force-calling alleles were lost upon trimming, as well as changes to support future Mutect2 releases.
- Changes to Funcotator: added the ability to customize the severity ratings of VariantClassifications using the new
- Fixes to convergence issues in VariantRecalibrator to make the tool more robust with highly correlated annotations.
- Changes to the GATK Engine: a new MultiFeatureWalker traversal is available.
- Picard dependencies have been updated to version 2.27.1. Picard documentation migration into GATK technical documentation is being finalized.