How does GATK HaplotypeCaller behave with respect to deletions that span across a supplied interval in a bed file using -L?
I can't find anything in the documentation that explains whether a deletion is included in the output VCF if it spans the start or end point of an interval in the BED file supplied with the -L argument (i.e. part of the deletion is within the supplied region and part is not).
From the testing I have done, it seems that such a deletion is not included in the output, but I wanted to verify whether that is the case and find out specifically where in the code or documentation this behaviour is described.
Thanks!
-
Hi Rachel Duffin,
The deletion will be included in the final output VCF if it starts within the interval. You may still be missing a deletion you expect, though, if some of its read support is lost: reads are only kept if they start within the interval, so reads that support the end of the deletion may not be kept.
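If it helps with checking your own test cases, here is a minimal Python sketch of the "starts within the interval" rule described above. It is not GATK code and does not model read filtering or local assembly; the `Interval` type and the helper names are hypothetical, chosen only for illustration. It assumes BED coordinates are 0-based half-open and VCF POS is 1-based, with a deletion record covering POS through POS + len(REF) - 1.

```python
from typing import NamedTuple, Tuple


class Interval(NamedTuple):
    """A BED-style interval (hypothetical helper type, not part of GATK)."""
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def record_span(pos: int, ref: str) -> Tuple[int, int]:
    """Return the 1-based inclusive span covered by a VCF record's REF allele."""
    return pos, pos + len(ref) - 1


def starts_within(iv: Interval, chrom: str, pos: int) -> bool:
    """True if the 1-based VCF POS lies inside the BED interval."""
    # BED [start, end) corresponds to 1-based positions start+1 .. end.
    return chrom == iv.chrom and iv.start < pos <= iv.end


def crosses_boundary(iv: Interval, chrom: str, pos: int, ref: str) -> bool:
    """True if the record overlaps the interval but is not fully inside it."""
    if chrom != iv.chrom:
        return False
    first, last = record_span(pos, ref)
    overlaps = first <= iv.end and last > iv.start
    contained = first > iv.start and last <= iv.end
    return overlaps and not contained


target = Interval("chr1", 100, 200)  # 1-based positions 101..200

# Deletion anchored at POS 198 with a 5-base REF covers 198..202:
# it starts inside the interval and runs past its end.
print(starts_within(target, "chr1", 198), crosses_boundary(target, "chr1", 198, "ACGTA"))

# Deletion anchored at POS 99 covers 99..103:
# it overlaps the interval but starts before it.
print(starts_within(target, "chr1", 99), crosses_boundary(target, "chr1", 99, "ACGTA"))
```

Under the rule above, the first deletion would be a candidate for output and the second would not, even though both overlap the target.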
You can check out this article to help with your investigation of what might be occurring: https://gatk.broadinstitute.org/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant
To get around this issue, you can add interval padding with the -ip (--interval-padding) argument, which extends each interval by the given number of bases on both sides. This is especially important if you have a lot of intervals.
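To see what padding does to the coordinates, here is a rough Python sketch that widens a BED interval by a fixed number of bases on each side. It only illustrates the arithmetic and is not how GATK implements -ip; `pad_interval` is a hypothetical helper, the padding value of 10 is arbitrary, and clamping at position 0 (with no clamping at the contig end) is an assumption of the sketch.

```python
from typing import Tuple

BedInterval = Tuple[str, int, int]  # (chrom, 0-based start, exclusive end)


def pad_interval(interval: BedInterval, padding: int) -> BedInterval:
    """Extend a BED interval by `padding` bases on both sides.

    The start is clamped at 0; the end is not clamped because this
    sketch has no contig length available (an assumption of the sketch).
    """
    chrom, start, end = interval
    return chrom, max(0, start - padding), end + padding


def contains_pos(interval: BedInterval, chrom: str, pos: int) -> bool:
    """True if a 1-based position lies inside the 0-based half-open interval."""
    i_chrom, start, end = interval
    return chrom == i_chrom and start < pos <= end


target = ("chr1", 100, 200)
padded = pad_interval(target, 10)

# A deletion anchored at 1-based POS 99 lies just outside the original
# target, but inside the padded one.
print(padded)
print(contains_pos(target, "chr1", 99), contains_pos(padded, "chr1", 99))
```

Comparing your deletion's position against the padded coordinates is a quick way to judge how much padding a given target set needs.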
Hope this helps!
Genevieve
-
Hi Genevieve,
Thanks so much for the information, it is much appreciated!
Best wishes,
Rachel
-
Glad we could help!