Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

HaplotypeCaller produces VCFs from 1KG data that have low SNP overlap


1 comment

  • Louis Bergelson


    I don't think we have enough information to understand your problem. Are you checking the SNP overlap between the two unrelated parents? I'm not sure what overlap to expect there. Or are you comparing a parent to their child, where you would expect more overlap?

    Could you also explain how you're comparing the files? The GVCF (g.vcf) output includes a large number of reference blocks, which you would not expect to overlap much. If you want to compare SNPs, you really need to run genotyping on the GVCFs to get a final VCF.
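
    To illustrate why reference blocks distort a naive comparison, here is a minimal Python sketch of a SNP-only overlap check. It is not a GATK tool and not a substitute for the genotyping step (in GATK that would typically be GenotypeGVCFs). The helper names `snp_sites` and `jaccard_overlap` and the toy records are hypothetical, and the sketch assumes plain tab-separated VCF text and ignores genotypes entirely.

```python
def snp_sites(vcf_lines):
    """Collect (chrom, pos, ref, alt) for single-base substitutions only.

    Hypothetical helper: symbolic alleles such as <NON_REF> (which make up
    GVCF reference blocks), missing/spanning alleles, and indels are skipped.
    """
    sites = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a well-formed VCF record
        chrom, pos, _id, ref, alt_field = fields[:5]
        for alt in alt_field.split(","):
            if alt.startswith("<") or alt in (".", "*"):
                continue  # symbolic, missing, or spanning-deletion allele
            if len(ref) == 1 and len(alt) == 1:
                sites.add((chrom, int(pos), ref.upper(), alt.upper()))
    return sites


def jaccard_overlap(a, b):
    """Shared sites divided by all sites seen in either set (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy GVCF-style records (made up for illustration, not real 1KG data).
gvcf_a = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "20\t100\t.\tA\t<NON_REF>\t.\t.\tEND=199",   # reference block
    "20\t200\t.\tG\tT,<NON_REF>\t50\t.\t.",      # SNP
    "20\t300\t.\tC\tA,<NON_REF>\t60\t.\t.",      # SNP
]
gvcf_b = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "20\t100\t.\tA\t<NON_REF>\t.\t.\tEND=149",   # reference block, different boundary
    "20\t150\t.\tT\t<NON_REF>\t.\t.\tEND=199",   # reference block
    "20\t200\t.\tG\tT,<NON_REF>\t55\t.\t.",      # SNP shared with sample A
    "20\t400\t.\tAT\tA,<NON_REF>\t40\t.\t.",     # deletion, not a SNP
]

overlap = jaccard_overlap(snp_sites(gvcf_a), snp_sites(gvcf_b))
print(f"SNP overlap (Jaccard): {overlap:.2f}")
```

    In this toy data the reference blocks break at different positions in the two samples, so a line-by-line comparison would find little in common even though the samples share one of the two SNP sites. Filtering to SNP records first avoids that, but a comparison of final genotyped VCFs is still the more reliable check.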
