Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Panel of Normals (PON)


  • Stephanie Hoyt


    The listed PONs are in VCF format, but the GATK copy number pipelines require an HDF5 file. Is there another processing step we need to perform before we can use these files?



  • Dear GATK Team,

    I am using Mutect2 to identify somatic mutations in MMRF data, which contains multiple myeloma (MM) samples from 5 different ethnicities. I have only the tumor and the corresponding matched normal for around 1004 MM WES samples. Regarding the PON documentation, I have two concerns:

    1. PON should be created from healthy normals with an undiagnosed tumor, which I don't have.

    2. MM is a very heterogeneous disease with a unique mutational signature and clonal evolution history for each ethnicity.

    So is it preferable to use a PON for somatic mutation identification, or to rely on tumors and matched normals only? Please advise.


  • Rahul Nahar

    Hi GATK team,

    Does the PoN also help to remove additional germline variants that might be missed in a sample because of low coverage in the matched normal, or because there is no matched normal?

    Or do you think the germline resource provided in the form of a 1000 Genomes or dbSNP VCF is sufficient for germline variant removal?

    I ask because I am trying to figure out whether we should create a PoN from normal samples of different ethnicities to remove germline variants effectively.

  • OrielResearch Eila Arich-Landkof

    Hi GATK team,

    Thank you so much for the explanation. I am working with a tool to call mutations from RNA-seq data and would like to use the PoN to filter out any sequencing artifacts.

    Would it be possible to add the PoN that was used for the RNA-MuTect development to the gatk-best-practices bucket?

    Many thanks,


  • Dear GATK team,

    Thank you for this thread. I want to ask:

    I am applying GATK Mutect2 to targeted sequencing data covering a set of genes and some genomic regions of interest, without a PON for the samples. The reference genome used is hg19. Would the best option for the PON in this case be the exome or the WGS version? What would be the best practice?

    Best Regards, Manuel
