How can the difference between Mutect2 FORMAT fields FAD, F1R2, F2R1, SB and AD be explained?
I would like to understand the Mutect2 FORMAT fields better. Here is an example of a somatic variant from running the Mutect2 pipeline with GATK 18.104.22.168:
17 29665040 . C A . PASS AS_FilterStatus=SITE;AS_SB_TABLE=19,17|1,2;DP=39;ECNT=1;GERMQ=75;MBQ=20,20;MFRL=174,205;MMQ=60,60;MPOS=63;POPAF=7.3;TLOD=4.51 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:36,3:0.103:39:11,0:14,2:25,2:19,17,1,2
What is the true number of informative forward and reverse reads overlapping the variant position? AD and SB suggest it's 19 and 17 for REF and 1 and 2 for ALT, yet F1R2, F2R1 and FAD suggest it's 11 and 14 for REF and 0 and 2 for ALT on the forward and reverse strand.
I have read the article https://gatk.broadinstitute.org/hc/en-us/articles/360035532252-Allele-Depth-AD-is-lower-than-expected
Can the difference between these fields also be explained by uninformative reads? If I want to filter for variants where the ALT allele is supported by at least one read on both the forward and the reverse strand, do I need to remove this variant? I.e., do I need to consider the data from SB or from F1R2 / F2R1 to answer this question?
Hi Eva König,
Thank you for your question!
AD and SB are read-based annotations, while FAD, F1R2, and F2R1 are fragment-based annotations. Fragment counting does not double count paired reads that overlap. So if your paired reads never overlap, the numbers will be equal, and if your paired reads always overlap, the fragment counts will be half the read counts. In most situations, some read pairs overlap but not all.
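To make the arithmetic concrete, here is a minimal Python sketch that checks the example record from the question against this explanation. It uses only the FORMAT keys and sample values copied from that record. The variable names are made up for illustration, and the SB order (REF forward, REF reverse, ALT forward, ALT reverse) is assumed, following the reading used in the question.

```python
# FORMAT keys and sample values copied from the example record in the question.
format_keys = "GT:AD:AF:DP:F1R2:F2R1:FAD:SB".split(":")
sample_values = "0/1:36,3:0.103:39:11,0:14,2:25,2:19,17,1,2".split(":")
fields = dict(zip(format_keys, sample_values))


def counts(key):
    """Return the comma-separated integer counts stored under a FORMAT key."""
    return [int(x) for x in fields[key].split(",")]


ad, sb, fad = counts("AD"), counts("SB"), counts("FAD")
f1r2, f2r1 = counts("F1R2"), counts("F2R1")

# Read-based view: SB splits the AD read counts by strand.
# Assumed order: REF forward, REF reverse, ALT forward, ALT reverse.
reads_per_allele = [sb[0] + sb[1], sb[2] + sb[3]]

# Fragment-based view: F1R2 and F2R1 split the FAD fragment counts.
fragments_per_allele = [a + b for a, b in zip(f1r2, f2r1)]

# If overlapping mates are the only source of the gap, each surplus read
# is the second mate of a fragment that was counted twice at read level.
surplus_reads = sum(ad) - sum(fad)

# Read-level ALT support on each strand, under the assumed SB order.
alt_forward, alt_reverse = sb[2], sb[3]

print("reads per allele (REF, ALT):    ", reads_per_allele)
print("fragments per allele (REF, ALT):", fragments_per_allele)
print("reads minus fragments:          ", surplus_reads)
print("ALT reads forward / reverse:    ", alt_forward, alt_reverse)
```

For this record, the SB counts add up to AD and the F1R2 and F2R1 counts add up to FAD, which is consistent with the two groups counting the same evidence at read level and at fragment level respectively.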
Let me know if you have any further questions.
Thank you Genevieve Brandt (she/her) your answer was very helpful.
In my info fields I have F1R2 and F2R1.
Is this expected, or is only one of them supposed to be there?
The samples were sequenced in two lanes, so I merged them before running Mutect2.