Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Panel of Normals (PON)


  • Stephanie Hoyt


    The listed PONs are in VCF format, but the GATK copy number pipelines require an HDF5 file. Is there another processing step we need to do before we can use these files?



  • Currently, I am working with a multiple myeloma (MM) dataset that contains around 1,100 samples. I want to generate a panel of normals (PON) for the MM data, but I am not sure how many normal samples I should use. As recommended, there should be at least 40 samples to generate a PON. Will any 40 of the 1,100 samples work, or should I use all of the samples to generate the PON (which would take an enormous amount of time and is not preferable)? Kindly advise.
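
If a subset of normals is used rather than the full cohort, one way to keep the choice unbiased and repeatable is to draw it at random with a fixed seed. The sketch below is only an illustration of that idea: the function name `pick_pon_subset`, the `MM_0001`-style sample IDs, and the seed value are all hypothetical, and it does not settle whether 40 samples are sufficient for this dataset.

```python
import random


def pick_pon_subset(sample_ids, k=40, seed=0):
    """Draw k distinct sample IDs at random, reproducibly for a given seed."""
    # De-duplicate and sort so the result does not depend on input order.
    unique_ids = sorted(set(sample_ids))
    if k > len(unique_ids):
        raise ValueError(f"requested {k} samples but only {len(unique_ids)} are available")
    rng = random.Random(seed)  # local generator; leaves global random state untouched
    return sorted(rng.sample(unique_ids, k))


# Hypothetical IDs standing in for the ~1,100 samples mentioned above.
all_normals = [f"MM_{i:04d}" for i in range(1, 1101)]
subset = pick_pon_subset(all_normals, k=40, seed=42)
print(len(subset), "samples selected for the PON")
```

Recording the seed alongside the resulting list makes it possible to regenerate the same subset later, or to draw a second, independent subset to check how sensitive downstream results are to the choice.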
