Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Somatic calling is NOT simply a difference between two callsets


  • Just a note that the first image near the start of the article is not appearing for me; I see only a missing-image placeholder.

  • Yee Mey Seah

    In the section "Historical perspective explains quirks of somatic calling", I think you mean that these variants identify individuals, not deidentify them:

    Germline variants, in particular those in untranslated regions or noncoding regions of the genome, deidentify individuals.


    Somatic mutations in coding regions do not deidentify individuals and are publicly sharable according to TCGA policies.

  • Nicola Cosgrove

    Hi there,

    We have a small-sample WXS study in which a panel of normals (PON) was generated from 9 normal blood samples.

    Regarding panels of normals in general: when calling somatic variants, would it be worthwhile to include the public GATK 1000g_pon.hg38.vcf.gz in addition to the custom PON generated from the normal samples sequenced for the user-specific study?



  • Arye Harel

    Thank you.

    For plant data from diploid samples, what is the advantage of using Mutect2 over HaplotypeCaller?



