Clear missing variant from HaplotypeCaller
Dear all, I'm using gatk-4.1.7.0 to compare variants between two WGS samples from MZ twins. In summary, 97% of all variants are shared by both samples, as expected. But there is a large number of discordant variants between the two samples that are annotated in dbSNP, which is hard to believe, as both samples come from the same zygote.
Among the discordant variants I have several examples of clear variants called in one sample but not in the other, despite similar values (scores, DP, GQ, etc.). I've tried to find an explanation but I can't.
For example, at chr1 position 125178240 in one sample I see a variant with a score of 37014, a DP of 4199 and a GQ of 99. According to IGV, at this position ~76% of the reads carry the reference allele (G) and ~24% the alt allele (A). If I check this position in the sample where the variant is missing, I see similar values:
I tried to use varscan2 to see how this position is detected. According to the result, this is a germline variant with a frequency of 23.58% in one sample and 24.44% in the other.
I have more examples quite similar to this one.
I used BWA as the aligner, then gatk MarkDuplicatesSpark, and finally HaplotypeCaller with default parameters except: --native-pair-hmm-threads 20
So... does anyone have an explanation for this behaviour?
-
Hello Edu Andrés León,
Check out some of our resources that can help figure out why you are seeing this:
- When HaplotypeCaller and Mutect2 do not call an expected variant
- How HaplotypeCaller works: Algorithms
- Check out the HaplotypeCaller option -bamout to view realigned reads
- Make sure that you are following our best practices including data pre-processing and variant calling workflows
-
Dear Genevieve
I'm rerunning HaplotypeCaller with the options suggested in your first post. The other resources did not explain what I'm seeing in this analysis.
Thanks for your help. Once it has finished, I'll let you know.
Cheers
-
Hi again,
I have finally been able to finish reviewing some variants. It is true that many of those I selected no longer appear, but others now appear and essentially the same thing is happening: I see some variants that look clear to me, but they are not called in both patients.
Here is one example:
-
Hi Edu Andrés León, thank you for posting an update. Could you answer these follow-up questions?
- Are these genomes or exomes?
- What method did you use to get this data? Was this from PCR amplified data or PCR-free?
- Could you look at the pileup and see if this occurs there? Or, is this a result of GATK re-assembly?
-
Hi Genevieve, thanks again for your help. These are the answers:
1) Whole genome
2) PCR amplified
3) The picture was made using the BAM files written by HaplotypeCaller's --bam-output parameter
The mpileup from this position in this bam file is:
[mpileup] 2 samples in 1 input files
chr1 125178240 N 3946 GAGGGGAGGGGAGAAAAGGGGGGAAGGAAAAGGGGAAGAGAGAGGAAAGGAAAAAAAGAGGAAGGAAGAGAGGGGGAGGGGGGGGGGGGGGGGGGGGAAG$GAGAGGAAAAAGGAGGAAGGAGGGGGGAGAGGAGGGAGGAAGGAGGGGAGACGGA$GGGAGGGAGG$GAGAGGGAGGGGAGAAAGAGACAGGGAGGAGAAGAGGGAAA$AGGGGGAGGAAAGGAGAAGGAAGG$GAGGGGAGGAGAAGGGGGGGGGGAAGAAGAGAGAGGAAAGAAGAAGAGGGAGGAGGGGAGGGGAAAAGG$AAGAGAAGAAAAGGGGAAAAGAAAAGAGGGGGGGGGAAGAGAAGGGGAGGGGGGAGGAGAGAGAGGAGGAGAGAGAAGGGGGGGGGGGAGGGGGAAGGGGGGGAAGAAGGGGAAGGGGAGAGGCAAGGGAAGGAAGAGAAGGGGGAGGAAGGGGAAG$GGGGGGGAAAGGGGGGAAGGGGGGGAGAG$AAGGGGGGAAAAGGGGAGGAGGGGAGGAGGAAAGGGAAAGGGGGGAAGGGGGAGGAGAGAGGGGGAGGAGGGGGGAAGAGGGGGGGGGGGAGGGGGGGAAAAGAGGAGAGAAGGGAAGGGGGGAGGGGGGGGAGGGGAGGGGAGGAGGAAGAGGGAGGGAGGAAGGGAAGGAGAAGGAAAGAAAGGGGGGGGGGAGGGG$GAGAGAAGGAGAGGGAGGGGGAGGAGGAGGAAGGGGG$GAGAGAGGAGAAGGAAAAAAAAGAAGGGAGAGGAAAAAGGGGGGAGGGAAAGAAGAGG$GAAGGGGAAGGGGAGAAGAGGGAGGAAGGGGAGGGGGAAGGAGGGGGAGGAAGGG$GAAAAAAGGGGGGAAGAAAGAAGGGG$GGGGGGGGGGGAGGAGGGGGGAGGA$GGGGGGGGAGGAAAGAAG$GAGAAGAGGGAGGGAGGAGGGAGGAGGAGGGGAGGGGGAGGGAAGAGGA$GGAGGGGGAGGAGGGGGGGAGAGGAAAAGAGGGGGGGAGAAGAAAAGGAA$GGGAGGGGGGGGGAGAGGGGAGGGAAAGGAGAGAGGGAGAGGGGGAGAGGAAGAAGAGGAGAGAA$GGGGGAAAAGGGGGAAGAGGGGAAAGAGGAAGAGGGGAAGGGGGAGGGAGGAAAGGGGGAGGGGGAGGgggggaaagaagaaaaggggggaagggaaagagggagaggagggaggaaagaggggagg$ggaa$ggagagaagaggggggaggggggagggaggaagggaggaggggaagggggggggaggggggaagggggaaagggaaaggaaggagagggggagaaaagggaggggggagaaaagagagggagaaaggggagagagggagaaaggagagaggggggaaggaggggagaggggagggaggaagggagggaggggggggagggagggagaga$gggggggggggggaaaaggagaagaagagggggaaaaggagagagggggaaagaagggggggaaagggagggggaaggaggaaggggagaagggagggagagggagaaggagggggaggaagagggaagggagggggggagggggggggggggagggagggaggaggagaggggggggggggaaaggagagagggggaggaagggaggggagggagagagagagagggaaagggggggg$ggg$ggaaaggagggaggaaggggggaggagggaaaggagagaaggggggggggaggaaggaggagg$ggaaagggagaggggggagg$aagggggggagaacgagagagaggggggggaaaggaaaaaggggaaggggggaaggagaggaggagaagaggggggggggagaagggaagggggggaaaaggaaaggagggggagagagggaggggggaaggaggggggggggggggggaaaggggaaaagaagggaaggggaggagggagaggagaaaag$aggggggaggggagagagaggggggaagg
aaaggggaggaggggaggagggggaggagggaaggagaaaagggggaaggagggggggagagaagagggaaggaggaggggagagaaggagagagagggggaggagaggaaagaaaaaagggagaaggggcagaagggagag$ggggagaaggggagaggggagggaggggagggaggagggggaagagaggaaggaagaggggaaggg$ggggggggggggggggagagaggagaaggagggggggagaggggaggggaagagaggggggggagagaaggggggaaggagagaaggaaagaggggggggaaggagggagaaaggagagggggaggaggaggaggaaaggaagggaaaagaggggggggggggggggggggaggggggagaAAGGGGGGAGgaaGGGAAGGGAGggggggaaaggaggaAGAGGAGAGAGagaagaaggagggaaaaggGGAAGAAAGGGGAGGGggagaagggagagaGGAGGGAAGAAAGgaaggGAGGGAgaaaggggggAAAAGGAAGGAGAaggggagagaggaGGAGGAAGAGGAaaaggggggggggagaagaggggaAAGGGAGGGGgggaggagggagaggaggagggagggaagaagagGGAGGGGGGAAGGAGAAAAGGGaagaggggaggAAAAAGGGgggagagggGGGGAGGggaaaaagaggaGGGGGGAGGGGAagagggaagAGGGAGAGAAGAAAAggggggaggggGGGAAAAGAAGAGGAGGAGGAgaagggagaggaaAAGGgaaaagGGAgggggggagaggggaggaGGGGGAAAAGAAAGGGGaggggagaggggggagggGGGGGAAGGGGAAGggggaggggggGAGGGGAGGAgggaaggaggaGGGGGGGAAgggaaggggggaaaGGGGGGGGGAGGAAGAAGGagggagggagAGGGAGAGGGAGGGGAGAGGGAaggaggggaaaaaggAGGAGGGGAggaggggggaggggaAAGGAAGAGGgagggagaggagggggagGAGGAGAGAGGGgaggggagggggGGAAGGAGAAGGGCGAGGGGGAGAaggagggggaggGAGGGGAGGAGAGGGAAGGAAGGAGgaggagagaggGAAaagcGGGGGGGAGAggagagagggGGGGGAAGAGGAaagggaagggGGGGAAAAGGGAGGGAGGGGGGGAGGGGAGGGggggagaGACGGGAAGAGGGGGAGGGagagagAGGAAGAAGAAAgaggaggaGAAGGGAAGGagggggaggagggagagaGGGAGaagggGGGAGAGaggAGGGGGGgagggaggggaggagGaaaaggagGGggggaaggagGGGaagagggggaaggggggaGGaaaggagaGGGGGagggGGGGGggggggagggGGGGGGGGGGagagaagggaggaggGGGGGGGGgggggggggggggGGGGGGGGGGaggagaggggggggggGGGGGGGGGGGGGGGGggagaggggggggagggggGGGGGGGGGGGgaggggaagggaaaggagggggggGGGGGGGGGGggggggaggaaagggggggggggGGGGGGGGagggggggggggaaggggaGGGGGGAGGGGGgagggggaagagagggggGGGGGGGGGGGaggaggggagggggggggagggaGGGGGGGGGGGGGGggagaaagagagggggagaaggGGGGGaagagggaaagaggggggggaGGGGaaaagGGGGGGGgaggaggggggggGGGGGGGGGGGGGGGGGGGGGgggagggaggGGGGGgggggggggaaagggGGGGGGGGgaagaagaggggggagaagagggGGGGGagaggaggaaggggGGGGGagagggggggggggaGGgaggaggggaaaaggaagagGGagagagaggggaggggggggggaggagGGgaaggaagcgaggggGGggggggggGGgagggggaggggaaa^<G^:G^:G^Ng^\a^Qg^?
g^Na^Zg^<g^Da^Na^Bg^>a^Qg^:a FFFFFF5F:FFFFFF5FFFFFFFFFFFFFFF5FFFFF:FFFFFFFF55FFFFFFFFFFFF:FFFFFFFFFFF5FFFFFF5FFF:F5F:FFFFFFF:FFF:FFFFFFFFFFFFFFFFFFFFF5:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFF:FFFFFF:FFFFF5FFFFF5F:F:FFFFFFFFFFFF5FFFF:FF5FFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFFF5FFFFFFFFF5FFFFFF:FFF5FFFFFFFFFFFFFFFFFFF:F:FF:FFFFFFFFFFFF5FF:FFFFFFFFFFFFFFFFFFFFF:FF:F5FFF:F:FFFFFF5FFF:FFFFFFFFFFFFFFFFFFF:FFFF:FFFFFFFF5FFFFFF::FFFFFFFFF:F5F5FFFFFF:F5FFFFF:FFFFFFFFFF5FFFFFFF5FFFFF5F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFIFFFF5F:FFFFFFFFFFFFFFFF:F:FFFFFFFF:FFFFFF5FFFFFFFFFFFFFFFFFFF:5FFFFFFFFF:FF5FFFFFFFF5FFFFFFFFFFFFFFFFFFFFFFFFF:FFFF5FFFFF5F:FFFFFFFFFFFFFFFF5FF5FFFFFFFFFFFF5FFFFF5F:FFFFFFFF5:FFFFFFFF5FFFFF::FFF5FFFFFF:FFFF5FFFF:FF:FFFFFFFFFFFFFFFFFFFFF55FFFF:F:FFFFFFFF5:FFF5FFFF:FF:FFFFFF:FFF5FFFFFFFFFFFFFFFF55FFFF:5FFFFFFFFFFFFFFFFFFFFIFFFF5FFFFFFFFI5F5FFFFFFFFFFFF:FFFFFF5FFFFFFFFFFIF5FFFFFFFFFF:FFFF:FF:FF:FFF5FF5FFFFFFFFFFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFF:FF:FFFIFF5F:FFFFFFF5FFFFFFIFFFFFFFFFFFFF5FFFFFFFFFFFFFFF5:FFFFFFFFFFFFF5FFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFF5FIFFFFFFFFFFFFFFFFFFFFFFFFFFF5FF:F:FF5FFFFFFFFFFFFFFFF:FFFFFFFFFFFF:FFIFFFFFFFFFFF:5FFFFFFFFFF5FFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFF5:FFFFFFFFFFF:FFFFFFFFFFFFFFF5FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF5FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFF5F5FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF5FFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF
FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFF5FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFFFFFFFF:BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFF:FFFFFF:5FFFF5FFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF5FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFIFFFFFFFFFFFFFFFF:FFFFFFFFFFF5FFFFFFFFFFF:FFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFF:FFFFFFFFFFFFFFFFF:FFFFFF5:FFFFFF5FFFFFFFFFFFFFFFF:FFFFFFFFFFF5FF:FFFFFF:FFFFFFFF:FFFFFFFFFFFF:FFFFFFFFFFFF5IFFFFFFFFFFFFFFFFFFFFF5FF5FFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFF5:FFFFFFFFFFFFFFFFFFF::FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:5FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFF:FFFFFFFFFFFFF5FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFF5FFF:FFFFFFF:FFF5FFFFFFF55FFFFFFFF5FFFFFFFFF:FFFFFFFFFFFFFFFFF5F:5FFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFF5FFFF5FFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFF5FFFFF5FF:FFF5FFF:FFFFFFFFFFFFFFFFFFFFF5:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF55FFFFFFFFFFFFFFFFFFFFFF5FFFF:5F:FF:FFFFFFFFFF5FF5FFFFFF:FF:FF:FFFFFFFFFF:FFFFFFFFFFFF5F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFFFFF:F:F5FFFFFFFFFFFFFFFFF:5FFFFFF:FFFFFFF5FFFFFFFFF:FFF5FFFFF:FFFFFFFFF5FFF5F:FFFF:FFF5FFFF5FF::F5FF:FFFFFFFF:FFF5FFFF:F5FF55FFFFFFFFFFFF:F5FFFF:FFFFF5FFFFFF:FFFFFFFFFF:FFF5F5FF5FFFFFFFFFFFFFF5F
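A base column like the one above is hard to read by eye, so it can help to tally it. The following is a minimal sketch, not part of the original analysis: the function name `count_pileup_bases` and the short example string are made up for illustration, and the parsing assumes the standard samtools mpileup conventions (`^` plus one mapping-quality character marks a read start, `$` marks a read end, `+N`/`-N` introduce indels of length N).

```python
from collections import Counter


def count_pileup_bases(bases: str) -> Counter:
    """Tally base calls in the base column of one samtools mpileup line.

    Forward and reverse strand calls (upper/lower case) are merged.
    '.' and ',' (reference matches) are counted under the key 'REF'.
    """
    counts = Counter()
    i = 0
    n = len(bases)
    while i < n:
        ch = bases[i]
        if ch == "^":
            # Read start: skip the caret and the mapping-quality character.
            i += 2
            continue
        if ch == "$":
            # Read end marker carries no base call.
            i += 1
            continue
        if ch in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            j = i + 1
            while j < n and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j]) if j > i + 1 else 0
            i = j + length
            continue
        if ch in ".,":
            counts["REF"] += 1
        elif ch.upper() in ("A", "C", "G", "T", "N"):
            counts[ch.upper()] += 1
        i += 1
    return counts


if __name__ == "__main__":
    # Hypothetical short base column in the same style as the output above.
    example = "GAGg$a^<G^:aC"
    counts = count_pileup_bases(example)
    total = sum(counts.values())
    for base, count in counts.most_common():
        print(f"{base}\t{count}\t{count / total:.2%}")
```

To use it on real data, pass the fifth column of the mpileup line (or the corresponding column for each sample when several samples are piled up together).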
And what I see in the gvcf file is:
Sample1:
chr1 125178240 . G A,<NON_REF> 38033.64 . BaseQRankSum=-4.969;DP=4076;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=-2.250;RAW_MQandDP=6413694,4076;ReadPosRankSum=4.257 GT:AD:DP:GQ:PL:SB 0/1:2683,1317,2:4002:99:38041,0,101158,46110,105099,151208:1262,1421,619,700
Sample2:
chr1 125178240 . G <NON_REF> . . END=125178240 GT:DP:GQ:MIN_DP:PL 0/0:3024:0:3024:0,0,59823
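To compare the two records side by side, the per-sample fields can be split into key/value pairs. This is only an illustrative sketch: the helper names `parse_sample_field` and `alt_allele_fraction` are invented here, and the FORMAT and sample strings are copied from the two gVCF lines above.

```python
def parse_sample_field(format_keys: str, sample_values: str) -> dict:
    """Pair the colon-separated FORMAT keys with the sample's values."""
    return dict(zip(format_keys.split(":"), sample_values.split(":")))


def alt_allele_fraction(ad: str, alt_index: int = 1) -> float:
    """Fraction of reads supporting the allele at alt_index in an AD string."""
    depths = [int(x) for x in ad.split(",")]
    total = sum(depths)
    return depths[alt_index] / total if total else 0.0


# Values copied from the Sample1 and Sample2 records above.
sample1 = parse_sample_field(
    "GT:AD:DP:GQ:PL:SB",
    "0/1:2683,1317,2:4002:99:38041,0,101158,46110,105099,151208:1262,1421,619,700",
)
sample2 = parse_sample_field("GT:DP:GQ:MIN_DP:PL", "0/0:3024:0:3024:0,0,59823")

if __name__ == "__main__":
    print("Sample1 GT:", sample1["GT"], "GQ:", sample1["GQ"])
    print("Sample1 alt fraction from AD:", round(alt_allele_fraction(sample1["AD"]), 3))
    print("Sample2 GT:", sample2["GT"], "GQ:", sample2["GQ"], "DP:", sample2["DP"])
```

The Sample2 record is a reference block and carries no AD field, so only GT, DP and GQ can be compared directly there.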
I hope this can help
-
Edu Andrés León Thank you, I will let you know when I have more information.
-
Thanks in advance. Please let me know if you need more examples like this one; I have a bunch.
-
Edu Andrés León, could you submit several of these examples in a bug report following these instructions?
-
Hi Edu Andrés León, there is another option for debugging these sites. Please check out --debug-assembly-region-state. You will want to target just this site because it will output a dot file for the assembly graph at each step of the process. You can use these files to determine what is going wrong.
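A single-site debug run along these lines could be assembled as follows. This is a hedged sketch rather than a tested recipe: the file names are placeholders, the helper `build_debug_command` is hypothetical, and `--debug-assembly-region-state` is assumed to be a plain on/off switch; whether it is available in a given GATK release should be checked with `gatk HaplotypeCaller --help`. The script only builds and prints the command and does not run GATK.

```python
import shlex


def build_debug_command(reference: str, bam: str, site: str,
                        out_vcf: str, bamout: str) -> list:
    """Build a HaplotypeCaller command restricted to one site for debugging."""
    return [
        "gatk", "HaplotypeCaller",
        "-R", reference,          # reference FASTA (placeholder path)
        "-I", bam,                # input BAM for the sample with the missing call
        "-L", site,               # restrict calling to the site of interest
        "-O", out_vcf,            # output VCF
        "--bam-output", bamout,   # realigned reads, as used earlier in this thread
        # Flag named in the reply above; assumed here to take no value.
        "--debug-assembly-region-state",
    ]


if __name__ == "__main__":
    cmd = build_debug_command(
        reference="reference.fasta",   # placeholder
        bam="sample2.bam",             # placeholder
        site="chr1:125178240",
        out_vcf="sample2.debug.vcf.gz",
        bamout="sample2.debug.bamout.bam",
    )
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(cmd))
```

Restricting the run with `-L` keeps the number of debug files manageable, as the reply suggests.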