Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Why doesn't HaplotypeCaller use a PON?

1 comment

  • Genevieve Brandt (she/her)

    Hi sinc,

    Thanks for trying out GATK and HaplotypeCaller! We have an article that talks about the differences between Mutect2 and HaplotypeCaller that you might find very helpful: https://gatk.broadinstitute.org/hc/en-us/articles/360035890491-Somatic-calling-is-NOT-simply-a-difference-between-two-callsets

    To answer your more specific questions:

    1. HaplotypeCaller doesn't use a PON because it isn't responsible for filtering. Technical artifacts can be identified from the variant annotations, and suspicious sites are filtered out by whichever filtering method you choose. Many of our users use VQSR, while in some cases CNN might be better at finding the technical artifacts. Mutect2 cannot take this approach and needs a PON because there are not enough observations to give it the statistical power to find the artifacts.
    2. For somatic calling we recommend running FilterMutectCalls after Mutect2. BQSR is recommended for both somatic and germline.
    3. gnomAD contains everything that looks like a germline variant, and Mutect2 does not want germline variants. Even bad variants in gnomAD are filtered out because they are most likely artifacts. 1000G and HapMap are stable resources that we have used for germline calling for a long time.
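
To make the contrast in the points above concrete, here is a minimal sketch in Python that assembles (but does not run) the GATK command lines for each workflow. Only the somatic path passes a PON and a gnomAD germline resource; the germline path calls first and filters afterwards with VQSR. All file names, the annotation list, the resource priors, and the sensitivity level are illustrative assumptions, not settings recommended in this thread, and VQSR in practice also expects a reasonably large callset.

```python
import shlex

# Hypothetical placeholder paths; substitute your own files.
REF = "reference.fasta"


def germline_commands(bam, out_prefix):
    """HaplotypeCaller followed by VQSR in SNP mode. No PON is involved."""
    raw = f"{out_prefix}.raw.vcf.gz"
    recal = f"{out_prefix}.snps.recal"
    tranches = f"{out_prefix}.snps.tranches"
    return [
        ["gatk", "HaplotypeCaller", "-R", REF, "-I", bam, "-O", raw],
        # Build the recalibration model from stable germline resources.
        # Priors and annotations here are illustrative assumptions.
        ["gatk", "VariantRecalibrator", "-R", REF, "-V", raw,
         "--resource:hapmap,known=false,training=true,truth=true,prior=15.0",
         "hapmap.vcf.gz",
         "--resource:1000G,known=false,training=true,truth=false,prior=10.0",
         "1000G.vcf.gz",
         "-an", "QD", "-an", "FS", "-an", "MQ",
         "-mode", "SNP", "-O", recal, "--tranches-file", tranches],
        # Apply the model; filtering happens here, not in the caller.
        ["gatk", "ApplyVQSR", "-R", REF, "-V", raw,
         "--recal-file", recal, "--tranches-file", tranches,
         "--truth-sensitivity-filter-level", "99.0",
         "-mode", "SNP", "-O", f"{out_prefix}.vqsr.vcf.gz"],
    ]


def somatic_commands(tumor_bam, out_prefix,
                     pon="pon.vcf.gz",
                     germline_resource="af-only-gnomad.vcf.gz"):
    """Mutect2 (tumor-only here) with a PON, then FilterMutectCalls."""
    unfiltered = f"{out_prefix}.unfiltered.vcf.gz"
    return [
        ["gatk", "Mutect2", "-R", REF, "-I", tumor_bam,
         "--panel-of-normals", pon,
         "--germline-resource", germline_resource,
         "-O", unfiltered],
        ["gatk", "FilterMutectCalls", "-R", REF, "-V", unfiltered,
         "-O", f"{out_prefix}.filtered.vcf.gz"],
    ]


if __name__ == "__main__":
    # Print the commands for inspection instead of executing them.
    for cmd in germline_commands("sample.bam", "sample"):
        print(shlex.join(cmd))
    for cmd in somatic_commands("tumor.bam", "tumor"):
        print(shlex.join(cmd))
```

Each function returns plain argument lists, so they can be reviewed first and then handed to `subprocess.run(cmd, check=True)` once the paths point at real files.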

    Hope this helps!

    Genevieve
