Confirmation that Mutect2 / HaplotypeCaller only use reads mapped within the active region
Hi, firstly thank you for a great set of tools!
I am looking for clarification on how each active region is evaluated in Mutect2 (4.1.X+) / HaplotypeCaller.
It is not explicitly stated in the documentation, so I have highlighted in bold the sections I need a bit of clarity on:
For each active region, are only the reads mapped to the active region used to create the de Bruijn graph?
Once the haplotypes are identified and aligned to the reference, only the reads mapped to the active region are realigned to the haplotypes to generate a probability per read.
In effect, each active region only takes information from reads that are mapped to that region, i.e. it does not use unmapped reads or anything else.
b) What does (......) mean?
HaplotypeCaller determines active regions based on the evidence for variation and then uses only those active regions to identify possible haplotypes in the data. You are correct that only the reads mapped to the determined active regions are used to generate the de Bruijn graph. I hope this clarifies your question. Please see the following articles on HaplotypeCaller and how active regions are determined:
Hi Pamela, thank you for your confirmation. So this also applies to all steps after creating the de Bruijn graph? For instance, no unmapped reads are used at any point after creating the de Bruijn graph.
That's correct. Once the active regions are determined, only those sections are used for the remaining steps. I'm glad I could help clarify.
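For anyone reading this later, here is a toy sketch of the idea confirmed above: a read only contributes to an active region if it is mapped and overlaps that region. This is not GATK's code. The `Read` and `Region` classes, the `reads_for_region` function, and the coordinates are all made up for illustration, and it assumes 1-based inclusive intervals with no padding around the region.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Read:
    name: str
    contig: Optional[str]  # None stands in for an unmapped read (assumption)
    start: int             # 1-based, inclusive
    end: int               # 1-based, inclusive


@dataclass
class Region:
    contig: str
    start: int
    end: int


def reads_for_region(reads: List[Read], region: Region) -> List[Read]:
    """Keep only reads mapped to the region's contig that overlap the region."""
    return [
        r for r in reads
        if r.contig == region.contig   # unmapped reads (contig None) never match
        and r.start <= region.end      # read starts before the region ends
        and r.end >= region.start      # read ends after the region starts
    ]


# Hypothetical data: one overlapping read, one distant read,
# one unmapped read, and one read on another contig.
reads = [
    Read("r1", "chr1", 100, 250),
    Read("r2", "chr1", 400, 550),
    Read("r3", None, 0, 0),
    Read("r4", "chr2", 120, 270),
]
region = Region("chr1", 200, 300)

selected = reads_for_region(reads, region)
print([r.name for r in selected])  # ['r1']
```

Only `r1` is kept in this example; the unmapped read and the reads outside the region are never seen by the later steps for that region.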