Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

(GATK v4.1.6.0) Mutect2 bamout reports fewer reads than VCF AD


1 comment

  • Bhanu Gandham
    1. So the VCF reports one more supporting read than the bamout. Looking at the code, one of the transformations in the AD annotation code subsets the likelihoods matrix to just the emitted alleles. It's possible that in the bamout the 5th read is aligned to a different allele that didn't pass the emission threshold, while in the AD it's assigned to the insertion because that other allele was removed from consideration.
    2. 5 out of 847 reads is enough to call a big indel like that. The likelihoods model (as opposed to the filters) only considers sequencing error, and the chance of 5 reads independently carrying the same big insertion in Illumina sequencing is negligible.
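
To make point 1 concrete, here is a toy sketch of how subsetting a likelihoods matrix to the emitted alleles can move a read from a dropped allele to the insertion. This is not GATK code: the allele names, the log-likelihood values, and the simple "best allele wins" counting rule are all assumptions made up for illustration, and the real AD annotation is more involved.

```python
# Toy illustration only, not GATK code. All names and numbers are hypothetical.

def best_allele_counts(likelihoods, alleles):
    """Count reads by whichever of `alleles` has the highest log-likelihood."""
    counts = {allele: 0 for allele in alleles}
    for read_lks in likelihoods:
        best = max(alleles, key=lambda allele: read_lks[allele])
        counts[best] += 1
    return counts

# Made-up per-read log10 likelihoods for three candidate alleles.
reads = [
    {"REF": -9.0, "INS": -1.0, "ALT2": -6.0},
    {"REF": -9.0, "INS": -1.0, "ALT2": -6.0},
    {"REF": -9.0, "INS": -1.0, "ALT2": -6.0},
    {"REF": -9.0, "INS": -1.0, "ALT2": -6.0},
    {"REF": -8.0, "INS": -3.0, "ALT2": -1.0},  # 5th read: ALT2 first, INS second
    {"REF": -1.0, "INS": -9.0, "ALT2": -7.0},
]

all_alleles = ["REF", "INS", "ALT2"]  # every allele considered during assembly
emitted = ["REF", "INS"]              # ALT2 assumed to fail the emission threshold

before = best_allele_counts(reads, all_alleles)  # analogous to the bamout view
after = best_allele_counts(reads, emitted)       # analogous to AD after subsetting

print("all alleles:", before)
print("emitted only:", after)
```

With all three alleles in play, the 5th read is counted toward the hypothetical ALT2; once ALT2 is dropped, its next-best allele is the insertion, so the insertion count goes from 4 to 5.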