Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

HaplotypeCaller and GenotypeGVCFs


  • Genevieve Brandt (she/her)

    If you have more than one sample, we recommend running HaplotypeCaller in GVCF mode and then GenotypeGVCFs. This is our joint genotyping method; we have a couple of resources about what that means here and here. In brief, HaplotypeCaller in GVCF mode outputs a GVCF, which contains information about all sites, not just sites with variation. GenotypeGVCFs then uses the information at all sites and across all samples to call variants that could not be called from one sample's information alone. This can make a big difference depending on how many samples you have.

    With just one sample, running HaplotypeCaller in its normal mode is sufficient. It should give the same results as running that sample in GVCF mode and then through GenotypeGVCFs.

    Let me know if you have further questions!
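
The two workflows described in the comment above can be sketched as command lines. This is a minimal illustration, not an official recipe: it only builds the `gatk` invocations as argument lists and prints them, without running anything. The file names, sample names, and the helper `plan_commands` are hypothetical. The comment does not mention how per-sample GVCFs are consolidated before GenotypeGVCFs; this sketch assumes `CombineGVCFs` for that step, and larger cohorts may call for a different consolidation tool.

```python
import shlex
from typing import Dict, List


def haplotypecaller_cmd(reference: str, bam: str, output: str, gvcf_mode: bool) -> List[str]:
    """Build a HaplotypeCaller command; '-ERC GVCF' switches on GVCF mode."""
    cmd = ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam, "-O", output]
    if gvcf_mode:
        cmd += ["-ERC", "GVCF"]
    return cmd


def combine_gvcfs_cmd(reference: str, gvcfs: List[str], output: str) -> List[str]:
    """Build a CombineGVCFs command (assumed consolidation step, not from the comment)."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd + ["-O", output]


def genotype_gvcfs_cmd(reference: str, gvcf: str, output: str) -> List[str]:
    """Build the joint genotyping command that reads the combined GVCF."""
    return ["gatk", "GenotypeGVCFs", "-R", reference, "-V", gvcf, "-O", output]


def plan_commands(reference: str, bams: Dict[str, str], final_vcf: str) -> List[List[str]]:
    """Hypothetical helper: choose the workflow based on the number of samples."""
    if len(bams) == 1:
        # One sample: HaplotypeCaller in its normal mode writes the final VCF directly.
        (bam,) = bams.values()
        return [haplotypecaller_cmd(reference, bam, final_vcf, gvcf_mode=False)]

    # Several samples: one GVCF per sample, then consolidate, then joint genotyping.
    commands = []
    gvcfs = []
    for sample, bam in bams.items():
        gvcf = f"{sample}.g.vcf.gz"
        gvcfs.append(gvcf)
        commands.append(haplotypecaller_cmd(reference, bam, gvcf, gvcf_mode=True))
    commands.append(combine_gvcfs_cmd(reference, gvcfs, "cohort.g.vcf.gz"))
    commands.append(genotype_gvcfs_cmd(reference, "cohort.g.vcf.gz", final_vcf))
    return commands


if __name__ == "__main__":
    # Placeholder inputs; substitute real paths before running any printed command.
    samples = {"sampleA": "sampleA.bam", "sampleB": "sampleB.bam"}
    for command in plan_commands("reference.fasta", samples, "cohort.vcf.gz"):
        print(shlex.join(command))
```

Printing the commands rather than executing them makes it easy to review each step before committing compute time, and the single-sample branch mirrors the advice that GVCF mode adds nothing when there is only one sample.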
