    Genevieve Brandt (she/her)

    I have kind of the same issue. Here is what I experienced lately and I really can't figure out a starting point.

    I have a sample (Sample1) that was aligned using BWA v0.7.6a-r433 and analyzed with GATK. I called the variants with HaplotypeCaller in ERC mode (GVCF) and then joint-genotyped the samples (8 samples) using GenotypeGVCFs.

    In one region of this sample, there is a variant (chr1:169510380) that is TRUE, based on third-party quality control, but that we did NOT call.

    As you can see below, the variant exists in Sample1 when we look at the BAM file generated by BWA. But when I look at the BAM file generated by HaplotypeCaller (containing the active regions, written with the -bamout option), it is clear that HaplotypeCaller didn't consider this region in Sample1 an active region, so there was no way for the variant to be called.

    This variant was called in another sample from the same run (Sample2). As you can also see below, there is no big difference in read quality between the two samples (the coverage of both is around 500, and neither sample shows a bias toward one strand).

    My questions are:

    • Where should I start searching for the reason? What can I do to check why this is not called in Sample1?



    • How can I force HaplotypeCaller to call this variant?


    I hope I was able to explain the case clearly. Please let me know if I missed any required details and/or you need any more info.

    Thanks for your help.

    Hi, you are describing the same issue that I (and others) reported before. It's not due to the BAM; GATK HaplotypeCaller simply considers that region too messy, with so many gaps, and removes the whole chunk.

    The solution that resolved the issue for me was to set the --kmer-size option for "gatk HaplotypeCaller". I forget what the default values are, but you can pass multiple values (the program will run using these different values and choose the best result).

    In my case the problem was resolved when I tried some values smaller than the default, as follows:

    --kmer-size 18 --kmer-size 22
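
For context, here is a minimal sketch of how those options might sit in a full GATK4 command restricted to the site from the question. The file names and the 200 bp window are assumptions for illustration and do not come from the thread. The script prints the command and runs it only if gatk and the input files are present.

```shell
#!/bin/sh
# Hypothetical inputs: replace with your own reference and BAM.
REF="reference.fasta"
BAM="Sample1.bam"
# Window around chr1:169510380; the +/-100 bp size is an arbitrary choice.
REGION="chr1:169510280-169510480"

# --kmer-size is given twice, once per value, as in the post above.
# -bamout writes the reassembled reads so the result can be inspected in IGV.
CMD="gatk HaplotypeCaller -R $REF -I $BAM -L $REGION -ERC GVCF --kmer-size 18 --kmer-size 22 -bamout Sample1.kmer18_22.bamout.bam -O Sample1.kmer18_22.g.vcf.gz"

# Always show the command (dry run).
echo "$CMD"

# Execute only when gatk is installed and the inputs exist.
# $CMD is left unquoted on purpose so it splits into arguments (paths must not contain spaces).
if command -v gatk >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$BAM" ]; then
    $CMD
fi
```

Restricting the run with -L keeps each attempt short, which makes it practical to compare several k-mer settings against the same region.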


    Genevieve Brandt (she/her)

    NawarDalila A few other helpful tips:

    • You can force call an allele with the --alleles argument and it might give more insight into why it was not called.
    • Use the --debug argument and look at the stderr or stdout files to see what is happening with the assembly.
    • I would recommend a newer version of GATK; I think our adaptive pruning improvements have been added since GATK, and they could potentially help with this case.
    • How many reads are supporting the alt allele in Sample1?
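
The tips above could be combined roughly as in the sketch below, assuming a recent GATK4 release and samtools. All file names are hypothetical, and force_calls.vcf.gz stands for an indexed VCF that you would create yourself with the expected allele at chr1:169510380. The short flag -debug is used because the long name of the debug argument has differed between GATK4 releases; check the documentation for your version. In some older GATK4 releases, --alleles also needs a separate genotyping-mode argument.

```shell
#!/bin/sh
# Hypothetical inputs: replace with your own files.
REF="reference.fasta"
BAM="Sample1.bam"
ALLELES="force_calls.vcf.gz"   # indexed VCF holding the allele to force-call (you must create it)
SITE="chr1:169510380"

# Force-call the site, print assembly debug output, and write the reassembled reads.
GATK_CMD="gatk HaplotypeCaller -R $REF -I $BAM -L $SITE --interval-padding 100 --alleles $ALLELES -debug -bamout Sample1.forced.bamout.bam -O Sample1.forced.vcf.gz"

# Pileup column at the site, to count the reads supporting the alt allele by eye.
PILEUP_CMD="samtools mpileup -f $REF -r chr1:169510380-169510380 $BAM"

# Always show the commands (dry run).
echo "$GATK_CMD"
echo "$PILEUP_CMD"

# Execute only when the tools and inputs are available.
# Debug messages are captured in a log file for later inspection.
if command -v gatk >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$BAM" ] && [ -f "$ALLELES" ]; then
    $GATK_CMD > Sample1.forced.debug.log 2>&1
fi
if command -v samtools >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$BAM" ]; then
    $PILEUP_CMD
fi
```

The fifth column of the mpileup output lists the observed bases at the position, so the number of alt-supporting reads in Sample1 can be read off directly and compared with Sample2.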

    Hope this helps!


