Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

GATK process banner

Need Help?

Search our documentation

Community Forum

Hi, How can we help?

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size. Learn more

Inconsistant QD values

0

1 comment

  • Avatar
    Laura Gauthier

    Hi Allan Daly,

    This is not unexpected.  There's some (perhaps undocumented) random behavior of the QD annotation that applies here: https://github.com/broadinstitute/gatk/blob/cfd4d87ec29ac45a68f13a37f30101f326546b7d/src/main/java/org/broadinstitute/hellbender/tools/walkers/annotator/QualByDepth.java#L93. Specifically, we expect the quality of each read to be on the order of 30, which is a single mismatch with good base quality according to Illumina's quality estimates.  So het variants will get QD values centered around 15 (half qual 30 read, half qual zero for ref, divide by the full depth) and hom-var variants will get QD values around 30.  When there is more than one variant within the span of a read, then there are two mismatches that support the alternate haplotype and look different from the reference. The pairHMM algorithm determines that the quality for reads like that is much higher, i.e. we're much more confident they didn't come from the reference haplotype. However, these cases are relatively rare and the resulting QD values are obvious outliers because they're much higher. In order to prevent variants in phase with other variants from being filtered out, we "correct" their QD values by assigning them a value drawn from a Gaussian distribution centered around 30.  Unfortunately because we're replacing these annotations with random values any additional call to the (pseudo) random number generator will change the values.  It's hard to say what that additional call was in your case, but we've definitely encountered this previously in similar scenarios.

    0
    Comment actions Permalink

Please sign in to leave a comment.

Powered by Zendesk