Question on using GATK-DRAGEN (dragmap-os aligner) with Mutect2
Hi Genevieve Brandt (she/her) Geraldine Van der Auwera and Derek Caetano-Anolles
I happened to see this post from Jan 2022 - https://gatk.broadinstitute.org/hc/en-us/community/posts/4416632101275-DRAGMAP-and-Mutect-
In that post, it is mentioned that:
- DRAGMAP output should still be compatible with Mutect2.
- Unfortunately you would still need to use BQSR. The reason BQSR isn't needed anymore is because of changes in HaplotypeCaller that are not implemented in Mutect2
I wanted to check if the above still holds true (i.e. need to do BQSR). If so, would the workflow look like below? I am lost after Step 5 below
- Use the dragen masked reference from here: gs://gcp-public-data--broad-references/hg38/v0/dragen_reference
- Create reference table using dragmap-os
- For every Paired End Fastq sample, take the fastqs and perform QC (say use fastp), use the resultant fastqs and convert to ubam
- use reference table from step 2 with dragmap-os and ubam to create aligned bam followed by MergeBamAlignment as shown here - https://github.com/broadinstitute/warp/blob/master/tasks/broad/DragmapAlignment.wdl#L58-L90
- Take the aligned.bam, perform MarkDuplicates to get MarkDuplicates_output_BAM
I am kind of lost after this as to how to go about it.
Should I do something like shown in https://gatk.broadinstitute.org/hc/en-us/articles/4407897446939--How-to-Run-germline-single-sample-short-variant-discovery-in-DRAGEN-mode
- ComposeSTRTableFile for the masked reference file (do it once)
- Use the MarkDuplicates_output_BAM with CalibrateDragstrModel
- Should I then do BQSR followed by Mutect2 and then do something like what's being done with HaplotypeCaller in Dragen mode (VariantFiltration using hard filtering)?
Derek Caetano-Anolles - the above link is missing valuable info/steps like MarkDuplicates
Also, in this youtube video https://www.youtube.com/watch?v=_bhsVHvz_Yk&t=3892s Derek Caetano-Anolles mentions that dragmap can be used for somatic variant calling. Would the workflow steps be the same as above?
Thanks in advance.
-
Hello @Anand. Here are some comments that should help answer your questions:
Broadly speaking, DRAGMAP should be compatible with Mutect2: it outputs alignments similar to those BWA produces on the same reference, and there should be no inherent incompatibility between it and BQSR. BQSR is still part of our Mutect2 best practices. It was dropped from the DRAGEN-GATK best practices because the altered base quality scores were incompatible with the BQD genotyping model that was introduced into HaplotypeCaller. That genotyping model is not in Mutect2, so BQSR should still work.
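To make the BQSR-then-Mutect2 ordering concrete, here is a minimal sketch that only assembles the GATK command lines and prints them; it does not run anything. The function name, file names, and output prefix are hypothetical. It assumes a tumor-only call and omits optional Mutect2 inputs such as a matched normal, a germline resource, or a panel of normals. FilterMutectCalls is included because it is the usual companion step to Mutect2; filtering is not discussed in the reply above.

```python
import shlex


def bqsr_mutect2_commands(reference, dedup_bam, known_sites, out_prefix):
    """Build GATK command lines: BaseRecalibrator -> ApplyBQSR -> Mutect2 -> FilterMutectCalls.

    All paths are placeholders supplied by the caller; nothing is executed here.
    """
    recal_table = f"{out_prefix}.recal.table"
    recal_bam = f"{out_prefix}.recal.bam"
    unfiltered_vcf = f"{out_prefix}.unfiltered.vcf.gz"
    filtered_vcf = f"{out_prefix}.filtered.vcf.gz"

    # BaseRecalibrator takes one --known-sites argument per known-variants file.
    base_recalibrator = ["gatk", "BaseRecalibrator", "-R", reference, "-I", dedup_bam]
    for vcf in known_sites:
        base_recalibrator += ["--known-sites", vcf]
    base_recalibrator += ["-O", recal_table]

    return [
        base_recalibrator,
        ["gatk", "ApplyBQSR", "-R", reference, "-I", dedup_bam,
         "--bqsr-recal-file", recal_table, "-O", recal_bam],
        # Mutect2 reads the recalibrated BAM, not the duplicate-marked one.
        ["gatk", "Mutect2", "-R", reference, "-I", recal_bam, "-O", unfiltered_vcf],
        ["gatk", "FilterMutectCalls", "-R", reference, "-V", unfiltered_vcf,
         "-O", filtered_vcf],
    ]


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for cmd in bqsr_mutect2_commands(
        "reference.fasta", "sample.dedup.bam", ["known_sites.vcf.gz"], "sample"
    ):
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps quoting unambiguous and makes it easy to pass each one to `subprocess.run` once the paths have been checked.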
As for your question about the ComposeSTRTableFile tool: while it is not part of our best practices for Mutect2, it was an improvement to the code shared between HaplotypeCaller (HC) and Mutect2 (M2), and you should be able to use an STR table in Mutect2. Since the priors learned in the table involve sampling STR sites across the entire genome, I would only recommend using it on WGS data, but this is an off-label use case that you use at your own risk. Should you want to proceed, you should run CalibrateDragstrModel after running MarkDuplicates on the input.
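For the off-label STR route, the order of operations could be sketched as below. This only builds and prints command lines. The function names and file names are hypothetical. The ComposeSTRTableFile and CalibrateDragstrModel arguments follow the DRAGEN-mode germline how-to article; passing `--dragstr-params-path` to Mutect2 is an assumption based on the shared code mentioned above, so please confirm it with `gatk Mutect2 --help` for your GATK version before relying on it.

```python
import shlex


def compose_str_table_command(reference, str_table):
    """STR table for the reference; this only needs to be built once per reference."""
    return ["gatk", "ComposeSTRTableFile", "-R", reference, "-O", str_table]


def dragstr_mutect2_commands(reference, str_table, dedup_bam, out_prefix):
    """Build CalibrateDragstrModel -> Mutect2 command lines for one sample.

    dedup_bam is the MarkDuplicates output, since calibration runs after MarkDuplicates.
    """
    dragstr_params = f"{out_prefix}.dragstr_params.txt"
    unfiltered_vcf = f"{out_prefix}.unfiltered.vcf.gz"
    return [
        ["gatk", "CalibrateDragstrModel", "-R", reference, "-I", dedup_bam,
         "-str", str_table, "-O", dragstr_params],
        # ASSUMPTION: Mutect2 accepts --dragstr-params-path like HaplotypeCaller does.
        ["gatk", "Mutect2", "-R", reference, "-I", dedup_bam,
         "--dragstr-params-path", dragstr_params, "-O", unfiltered_vcf],
    ]


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(shlex.join(compose_str_table_command("reference.fasta", "reference.str.zip")))
    for cmd in dragstr_mutect2_commands(
        "reference.fasta", "reference.str.zip", "sample.dedup.bam", "sample"
    ):
        print(shlex.join(cmd))
```

The sketch feeds the duplicate-marked BAM straight to Mutect2 and deliberately leaves BQSR out of this path.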
I will add that we have never tested the impact of combining the DRAGstr model with BQSR, and I would be worried about mixing those two models: it seems possible that you would be double-penalizing events at STR sites and low-complexity regions, in places where both models are learning to penalize errors.