GATK HaplotypeCaller issues with Genotyping (GT)
Hello,
I am using GATK v4.2.6.1 on a haploid organism. I am facing an issue similar to the one previously described in "HaplotypeCaller calls 0/0 when all reads in raw BAM support 1/1 – GATK (broadinstitute.org)".
I have already tried two of the options described in https://gatk.broadinstitute.org/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant:
- Enabling the --linked-de-bruijn-graph option
- Enabling the --recover-all-dangling-branches option
However, some SNPs with QUAL > 1000 that are genotyped as 0/0 in my ann.vcf file appear as 1/1 in my raw BAM file (checked via IGV). Here is an example from each of the two options I tried.
1) Enabling the --linked-de-bruijn-graph option
For the position Pf3D7_06_v3 730390, my VCF file gives the following record for sample1:
Pf3D7_06_v3 730390 . A C 7722.11 PASS GT:AD:DP:GQ:PGT:PID:PL:PS 0/0:58,0:58:99:.:.:0,120,1800
This is what my raw BAM file (sample1_paired_sorted_addedRG_recal.bam) shows:
However, my bamout file shows this:
This was then fixed by enabling the --recover-all-dangling-branches option, which gives the following bamout file:
However, with this option I run into another problem at a different position: Pf3D7_06_v3 340645
raw BAM file
2) Enabling the --recover-all-dangling-branches option
The VCF file shows something odd in the FORMAT fields:
Pf3D7_06_v3 340645 . A G 7777.98 PASS GT:AD:DP:GQ:PGT:PID:PL:PS 1|1:0,58:58:99:1|1:340624_A_G:2249,170,0:340624
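For readers comparing the two records above, here is a minimal Python sketch that pairs each FORMAT key with its sample value. It relies on two points from the VCF specification: a `|` separator in GT marks a phased genotype, and trailing sample fields may be dropped (which is why the first record has seven values for eight keys). PGT, PID and PS are the phasing annotations that HaplotypeCaller writes for physically phased variants, so a PID of `340624_A_G` would indicate that this site was phased with a variant at position 340624. The helper names `parse_sample` and `describe` are hypothetical, and the records are copied from this post, which omits the INFO column.

```python
# Records copied from the post: CHROM POS ID REF ALT QUAL FILTER FORMAT SAMPLE
# (the INFO column is not shown in the post, so it is absent here as well).
RECORDS = [
    "Pf3D7_06_v3 730390 . A C 7722.11 PASS GT:AD:DP:GQ:PGT:PID:PL:PS "
    "0/0:58,0:58:99:.:.:0,120,1800",
    "Pf3D7_06_v3 340645 . A G 7777.98 PASS GT:AD:DP:GQ:PGT:PID:PL:PS "
    "1|1:0,58:58:99:1|1:340624_A_G:2249,170,0:340624",
]


def parse_sample(record):
    """Pair FORMAT keys with sample values (hypothetical helper)."""
    fields = record.split()
    keys = fields[-2].split(":")
    values = fields[-1].split(":")
    # The VCF spec allows trailing sample fields to be dropped; pad with ".".
    values += ["."] * (len(keys) - len(values))
    return dict(zip(keys, values))


def describe(record):
    """Summarise genotype, phasing and allele depths for one record."""
    sample = parse_sample(record)
    gt = sample["GT"]
    alleles = gt.replace("|", "/").split("/")
    ref_depth, alt_depth = (int(x) for x in sample["AD"].split(","))
    return {
        "pos": int(record.split()[1]),
        "GT": gt,
        "phased": "|" in gt,  # "|" = phased, "/" = unphased
        "hom_alt": all(a == "1" for a in alleles),
        "ref_depth": ref_depth,
        "alt_depth": alt_depth,
        "phase_set": sample["PS"],  # "." when the field is missing
    }


for rec in RECORDS:
    print(describe(rec))
```

Read this way, `1|1` with PGT, PID and PS filled in is a phased homozygous-alternate call rather than a malformed record, while the first record reports no alternate-allele reads in AD.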
bamout file
How should I fix this in my annotated VCF file? Does this come from an error in HaplotypeCaller's realignment?
Here are the command lines I used for the two options
1) Enable --linked-de-bruijn-graph
gatk HaplotypeCaller -R PlasmoDB-57_Pfalciparum3D7_Genome.fasta -I sample1_paired_sorted_addedRG_md_recal.bam -ERC GVCF -O sample1_paired.g.vcf --minimum-mapping-quality 10 --linked-de-bruijn-graph true --bam-output sample1_bamout.bam
2) Enable --recover-all-dangling-branches
gatk HaplotypeCaller -R PlasmoDB-57_Pfalciparum3D7_Genome.fasta -I sample1_paired_sorted_addedRG_md_recal.bam -ERC GVCF -O sample1_paired.g.vcf --minimum-mapping-quality 10 --recover-all-dangling-branches true --bam-output sample1_bamout.bam
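Note that both commands above write to the same sample1_paired.g.vcf and sample1_bamout.bam, so the second run would overwrite the first unless the files are moved in between. As a rough sketch, the two runs could be generated from one shared argument list with a per-run tag in the output names. The tagged file names and the `build_command` helper are assumptions for illustration, not part of the original commands; every flag is taken from the commands above.

```python
import shlex

# Arguments shared by both runs, copied from the commands in the post.
BASE = [
    "gatk", "HaplotypeCaller",
    "-R", "PlasmoDB-57_Pfalciparum3D7_Genome.fasta",
    "-I", "sample1_paired_sorted_addedRG_md_recal.bam",
    "-ERC", "GVCF",
    "--minimum-mapping-quality", "10",
]


def build_command(flag, tag):
    """Return the argument list for one run.

    `tag` is a hypothetical suffix that keeps each run's outputs separate.
    """
    return BASE + [
        flag, "true",
        "-O", f"sample1_paired.{tag}.g.vcf",
        "--bam-output", f"sample1_bamout.{tag}.bam",
    ]


runs = {
    "linked": build_command("--linked-de-bruijn-graph", "linked"),
    "dangling": build_command("--recover-all-dangling-branches", "dangling"),
}

for args in runs.values():
    # Print the command; to execute it, pass `args` to
    # subprocess.run(args, check=True) instead.
    print(shlex.join(args))
```

Keeping both sets of outputs makes it possible to load the two bamout files side by side in IGV.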
Thank you very much in advance for the help!
-
Thank you for your post, pb! I want to let you know we have received your question and will be moving it to the Community Discussions -> Special GATK Use Cases topic, as the Other topic is for reporting bugs and issues with GATK.
We'll get back to you if we have any updates or follow-up questions. Please see our Support Policy for more details about how we prioritize responding to questions.
-
Hi pb,
Thank you for writing to the GATK forum! I hope that we can help you sort this out.
I brought your inquiry to our development team and received feedback and clarifying questions that I'd like to share.
First, please clarify what you believe the true genotype to be in your second run, with the "--recover-all-dangling-branches" option enabled, and why.
Are you convinced the second site you referenced has a verifiable, easily-identifiable variant that HaplotypeCaller is missing? If yes, please provide a snippet of your bam file so that the developers can attempt to debug it. Unfortunately, they can't dig deeper into this issue without more information.
I hope to hear back from you soon! Thank you in advance for any further information and clarity you can provide.
Best,
Anthony
-
Hi pb,
We haven't heard from you in a while so we're going to close out this ticket. If you still require assistance, simply respond to this email and we'll be happy to pick up where we left off!
Kind regards,
Anthony