Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

GATK process banner

Need Help?

Search our documentation

Community Forum

Hi, How can we help?

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size. Learn more

Artificial haplotype generates a new variant for a homozygous site



  • 0
    Comment actions Permalink
  • Avatar
    Yiguan Wang

    Hi Genevieve, thanks a lot for your reply.

    I've read the post you mentioned, but I don't think it answers my confusion. I did some post-HaplotypeCaller filtering, but I am afraid it may not remove these variants. As I showed in the example, "3R:10209297 GT:AD:DP:GQ:PL     0/1:17,39:56:99:1586,0,568" is a high quality variant. 

    I also followed the Refinement workflow to do "CalculateGenotypePosteriors" and "VariantAnnotator" providing a pedigree file which includes Father+Mother + 12 Children. I find many "hiConfDeNovo" and "loConfDeNovo" in these samples, but for the "sample62_3R:10209297", nothing changed - neither identified as de novo (1/1 1/1 -> 0/1) nor refined as non-variant (1/1). 

    3R 10209297 . G T 31555.30 . AC=27;AF=0.964;AN=28;BaseQRankSum=1.76;DP=867;ExcessHet=3.0103;FS=0;InbreedingCoeff=-0.037;MLEAC=27;MLEAF=0.964;MQ=59.74;MQRankSum=0;PG=26,11,0;QD=27.37;ReadPosRankSum=-1.314;SOR=0.871 GT:AD:DP:GQ:JL:JP:PGT:PID:PL:PP:PS 1/1:0,24:24:23:71:11:.:.:1080,72,0:1046,23,0 1/1:0,27:27:32:71:11:.:.:1215,81,0:1181,32,0 0|1:16,39:55:99:71:11:0|1:10209297_G_T:1589,0,554:1663,0,483:10209297 1/1:0,73:73:99:.:.:.:.:3285,220,0:3311,231,0 1/1:0,39:39:99:.:.:.:.:1749,117,0:1775,128,0 1/1:0,54:54:99:.:.:.:.:2424,163,0:2450,174,0 1|1:0,33:33:99:.:.:1|1:10209297_G_T:1485,99,0:1511,110,0:10209297 1/1:1,30:33:66:.:.:.:.:1306,55,0:1332,66,0 1|1:0,31:34:99:.:.:1|1:10209297_G_T:1395,93,0:1421,104,0:10209297 1/1:0,49:49:99:.:.:.:.:2199,148,0:2225,159,0 1/1:0,85:85:99:.:.:.:.:3820,256,0:3846,267,0 1/1:0,77:77:99:.:.:.:.:3465,232,0:3491,243,0 1/1:0,75:75:99:.:.:.:.:3375,226,0:3401,237,0 1/1:0,68:68:99:.:.:.:.:3060,205,0:3086,216,0

    [sample order: father - mother - sample62 ... other 11 samples]

    My major concern is the artificial haplotypes and their strange behaviours, how could "HaplotypeCaller" create these haplotypes with a new allele which looks like a de novo mutation. 

    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Hello Yiguan Wang,

    Here are a few options to run that can potentially help this issue or give insight into why this issue is coming up:

    We want to determine if this is caused by phasing or the realignment by HaplotypeCaller, and those steps above can help narrow things down.

    A couple questions: What version of HaplotypeCaller are you running? And could you send another screenshot, with the bamout organized by read groups and the base context at that site?

    Thank you,


    Comment actions Permalink

Please sign in to leave a comment.

Powered by Zendesk