STRANDQ missing from mutect2 vcf records
Where do I find the strand artifact score calculated by Mutect2? I see how it is calculated in section II.D of the Mutect2 white paper, but I'm wondering whether the score can be emitted in the VCF. In my VCF header I see "STRANDQ: Phred-scaled quality of strand bias artifact", which sounds like the strand artifact score described in the white paper, but none of the records in my VCF have "STRANDQ" in the INFO field. I am seeing this behavior with both GATK 4.1.8 and 4.2.0. I tried running Mutect2 with "--enable-all-annotations true" and now I see other INFO fields related to strand bias populated, including "FS: Phred-scaled p-value using Fisher's exact test to detect strand bias" and "SOR: Symmetric Odds Ratio of 2x2 contingency table to detect strand bias". However, from the white paper I believe that STRANDQ (not FS or SOR) is used for adding the "strand_bias" filter. Is that correct?
On a broader note, it would be really helpful if the white paper described more directly how the calculations correspond to the INFO/FORMAT fields in the Mutect2 VCFs (see this post for a more detailed request along these lines).
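For anyone who wants to reproduce the observation, one way is to compare the INFO IDs declared in the header against the keys that actually occur in the records. The following is only a minimal Python sketch, assuming an uncompressed, tab-delimited VCF read as text lines. The function name `unused_info_ids` is hypothetical, and the example lines (including the `Number`/`Type` values in the header) are made up for illustration; they are not real Mutect2 output.

```python
import re

def unused_info_ids(vcf_lines):
    """Return INFO IDs declared in the header that never appear in any record."""
    declared = set()
    used = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO="):
            # Header definition, e.g. ##INFO=<ID=STRANDQ,Number=...,...>
            match = re.search(r"ID=([^,>]+)", line)
            if match:
                declared.add(match.group(1))
        elif line and not line.startswith("#"):
            fields = line.split("\t")
            if len(fields) < 8 or fields[7] == ".":
                continue  # malformed line or empty INFO column
            for entry in fields[7].split(";"):
                # Flag-type entries have no "=", so keep the bare key
                used.add(entry.split("=", 1)[0])
    return declared - used

# Made-up example lines, for illustration only
example = [
    '##fileformat=VCFv4.2',
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth">',
    '##INFO=<ID=STRANDQ,Number=1,Type=Integer,Description="Phred-scaled quality of strand bias artifact">',
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO',
    'chr1\t100\t.\tA\tT\t.\tPASS\tDP=50',
    'chr1\t200\t.\tG\tC\t.\tstrand_bias\tDP=42',
]
print(sorted(unused_info_ids(example)))
```

On a real file, the same function can be given an open file handle (or `gzip.open(path, "rt")` for a `.vcf.gz`), since it only iterates over lines.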
I am going to move your post into our Community Discussions -> Documentation Questions topic, as the Somatic topic is for reporting bugs and issues with GATK.
You can read more about our forum guidelines and the topics here: Forum Guidelines.
Hi Rebecca Halperin,
The forum request you linked to has been implemented by the GATK team. You can see the pull request for those changes here: https://github.com/broadinstitute/gatk/issues/6965
Please also see this forum post regarding strand bias annotations: https://gatk.broadinstitute.org/hc/en-us/community/posts/360057705151-Mutect2-annotation-group-fields-that-can-check-strand-bias
Please let us know if you have other questions!
Thanks for your response. The link to the forum that you shared does confirm that "STRANDQ" is the strand bias score that is used for setting the "strand_bias" filter. I do want to follow up on the point that none of the records in my VCF have "STRANDQ" in the INFO field. Is there an input setting that I need to modify to get "STRANDQ" to appear in my VCF records, or is it a bug that "STRANDQ" is not showing up?
STRANDQ is applied by FilterMutectCalls. Do you see it in your file after running FilterMutectCalls?
No, I don't see STRANDQ in the INFO field in any records in the output of FilterMutectCalls. I do see records with "strand_bias" in the FILTER column, so it must be calculated internally but not written to the VCF.
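This check can be sketched in a few lines of Python: for each record, note whether the FILTER column contains "strand_bias" and whether the INFO column contains a "STRANDQ" key, then count the combinations. This is only an illustrative sketch, assuming an uncompressed, tab-delimited VCF read as text lines. The function name `tabulate_strand_bias` is hypothetical, and the example records (including the extra filter name) are made up rather than taken from a real FilterMutectCalls output.

```python
from collections import Counter

def tabulate_strand_bias(vcf_lines):
    """Count records by (has strand_bias filter, has STRANDQ in INFO)."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        filters = set(fields[6].split(";"))
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        counts[("strand_bias" in filters, "STRANDQ" in info_keys)] += 1
    return counts

# Made-up records, for illustration only
example = [
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO',
    'chr1\t100\t.\tA\tT\t.\tPASS\tDP=50',
    'chr1\t200\t.\tG\tC\t.\tstrand_bias\tDP=42',
    'chr1\t300\t.\tC\tA\t.\tstrand_bias;weak_evidence\tDP=18',
]
counts = tabulate_strand_bias(example)
print("strand_bias filtered, no STRANDQ:", counts[(True, False)])
print("strand_bias filtered, with STRANDQ:", counts[(True, True)])
```

A non-zero first count together with a zero second count matches the situation described above: the filter is being applied, but the score behind it is not written to the records.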
You can check if it is possible using -A STRANDQ with FilterMutectCalls. If you have already tried this though, then unfortunately it is probably not possible.
Is there a particular reason you are interested in this annotation? If so, we can look into whether something else would meet your needs. I don't see anything about it in our annotation documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360057438571--Tool-Documentation-Index#VariantAnnotations
Here are some other helpful posts regarding Mutect2 and strand bias filtering: