A year ago this month, we announced the start of our collaboration with the DRAGEN team at Illumina, which aims to combine the respective strengths of GATK and DRAGEN as well as promote the standardization of secondary analysis pipelines used in genomics (see original blog post).
TL;DR: Tune in to the DRAGEN-GATK webinar hosted by GenomeWeb on September 29 to learn more about the status of the project and the technical wizardry involved.
The first jointly developed DRAGEN-GATK pipeline (for germline short variants) is already available from Illumina in its proprietary hardware-accelerated form -- specifically, in version 3.4 of the DRAGEN Bio-IT Platform, as I mentioned in our last update. Meanwhile, we've been working hard on the open-source software implementation of this pipeline, which involved reimplementing the algorithms responsible for key accuracy improvements in Illumina's DRAGEN pipeline in GATK and associated tools.
After weathering some delays due to the COVID-19 pandemic, we now expect to release the full open-source software version of this first DRAGEN-GATK pipeline in early November of this year (just in time for my birthday, woohoo). This new pipeline implementation will replace the current GATK Best Practices for germline short variant calling, and will produce results that are *functionally equivalent* to those of the proprietary accelerated DRAGEN pipeline.
We care deeply about making this pipeline as accessible, portable and reproducible as possible, so in addition to releasing all the relevant software on GitHub, we'll provide a set of WDL workflows and Docker container images containing the precompiled executables with all dependencies correctly installed. We'll also make the workflows available for import into popular analysis platforms through the Dockstore tool repository, and we'll publish a Terra workspace containing the workflows in a fully configured state along with example genomic data for testing.
Terra is a secure open platform for data access and analysis developed and operated by the Broad Institute and Verily. We make all GATK Best Practices workflows available in the Terra showcase.
We realize that the prospect of a major pipeline update raises a lot of questions, so we plan to roll out a set of blog posts and documentation articles that will provide all the necessary technical details about what's new in the pipeline -- and what you need to know to apply it to your data. Most excitingly, we're currently finalizing the content for a webinar that will be co-presented by Séverine Catreux from the DRAGEN team and Eric Banks from the GATK team. Séverine and Eric will provide an in-depth look at the key methodological improvements in DRAGEN-GATK, and will be available for Q&A after their presentation, so don't miss this opportunity to get the lowdown straight from the experts. The webinar will be hosted by GenomeWeb on September 29; registration is already open, so be sure to register today.
Don't want to miss out on any updates? Subscribe to this blog by clicking the "Follow" button in the top right corner, and don't hesitate to leave a comment below -- or ask any burning questions that you feel can't wait until September 29.
6 comments
Exciting news! Are there plans to provide this new DRAGEN output-equivalent GATK4 tool in Docker images built for ARM systems? There are significant performance gains (in terms of both compute time and cost-efficiency) that could be achieved simply by compiling for this architecture.
Hi Matthew Porter, we don't have any plans to make anything specific for ARM at this time. It's all open source though so you're welcome to do so yourself.
Any more specific updates on "early November" release of this pipeline? Thanks!
Hi,
Since early November has passed into early December, are there any updates on a possible release?
Thanks
Hi,
Are there any updates on the timetable for pipeline release?
I can't find any links, if it's out. If it has been released, could the article above be updated with a link to the pipeline?
Thanks
Hi Marsha Wallace, Matthias De Smet, Karynne Patterson,
Apologies for the delay -- we posted an update on the blog last week, see https://gatk.broadinstitute.org/hc/en-us/articles/360055062832-New-Year-update-GATK-takes-on-2021
In a nutshell, the open-source DRAGEN-GATK release has been delayed due to ongoing work on DRAGMAP, the open-source mapper from the Illumina team. We're now hoping to release by the end of this quarter, but can't give any firm predictions at this time.