In Mutect2 force-calling allele via --alleles does not force-call the allele
AnsweredIf not an error, choose a category for your question(REQUIRED):
c) Why do I see (......)?
GATK version: 4.2.4.1
Hi all,
I'm running Mutect2 over samples for which I have provided a VCF of alleles that I would like force-called. In 99% of cases this works great, but in a few instances a couple of alleles that I have included for force-calling do not actually make it into the output calls VCF.
My command is:
gatk --java-options -Xmx3000m Mutect2 \
-R /data_in/MY1776/MY1776.self.ref.shifted_by_8000_bases.fasta \
-I /data_in/MY1776/self_realigned_shifted.bam \
--read-filter MateOnSameContigOrNoMappedMateReadFilter \
--read-filter MateUnmappedAndUnmappedReadFilter \
-O /out_default.vcf \
--alleles /data_in/MY1776/MY1776.self.ref.reversed.selfRef.shifted.homoplasmies.vcf.bgz \
--annotation StrandBiasBySample \
--mitochondria-mode \
--max-reads-per-alignment-start 75 \
--max-mnp-distance 0 \
-L chrM:8023-9140 \
--genotype-filtered-alleles \
--debug-assembly-variants-out /rej.vcf \
--bam-output bamout.bam
In this instance the variant in question is listed in the rej.vcf file obtained via `--debug-assembly-variants-out`. I have examined `bamout.bam` as well as the input bam and there appears to be ample coverage at the site of interest (the T at position 8316 is the position of interest, highlighted):
I have tried running this with some of the additional parameters in https://gatk.broadinstitute.org/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant (namely `--linked-de-bruijn-graph` and `--recover-all-dangling-branches`) to no avail. Coverage is very deep at this position (>2000x). Notably if I edit the input to `--alleles` and change the allele of interest (8316:T>A) to anything else (8316:T>C or T>G) it appropriately shows up in the output VCF.
What am I missing here? Let me know if you have any solutions or if you need any additional files.
UPDATE: Adding `--disable-adaptive-pruning` now produces the variant of interest specified in --alleles, but also adds several other new calls, in case that is helpful in isolating where this force-call variant is being lost.
-
Hi Rahul Gupta,
Thanks for writing into the forum! Let's see if we can get this figured out. Could you share the VCF line for this site when it is called? Could you also show the VCF lines around this site when it is not called?
Best,
Genevieve
-
Hi Genevieve,
Thanks for your reply! Sure. The variant of interest is 8316:T>C. Here are the next few variants if I run using the above command (which doesn't force-call it; though 8517:C>T and 8640:G>A are force-called):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT MY1776
chrM 8372 . A C . . AS_SB_TABLE=1168,1107|6,26;DP=2407;ECNT=1;MBQ=30,10;MFRL=332,355;MMQ=60,60;MPOS=32;OCM=0;POPAF=2.40;TLOD=1.76 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2275,32:4.600e-03:2307:878,1:952,4:2186,28:1168,1107,6,26
chrM 8517 . C T . . AS_SB_TABLE=1020,1098|1,2;DP=2184;ECNT=1;MBQ=30,30;MFRL=330,522;MMQ=60,60;MPOS=31;OCM=0;POPAF=2.40;TLOD=-1.877e+00 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2118,3:1.692e-03:2121:980,1:981,2:2046,3:1020,1098,1,2
chrM 8640 . G A . . AS_SB_TABLE=1122,1126|1,4;DP=2303;ECNT=3;MBQ=30,20;MFRL=322,353;MMQ=60,60;MPOS=30;OCM=0;POPAF=2.40;TLOD=-3.756e-01 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2248,5:1.915e-03:2253:1013,4:1001,0:2100,4:1122,1126,1,4If I enable --disable-adaptive-pruning we now see the desired allele (the variant at position 8309 is new also):
chrM 8309 . T C . . AS_SB_TABLE=1183,1347|0,3;DP=2533;ECNT=7;MBQ=30,30;MFRL=329,256;MMQ=60,60;MPOS=33;OCM=0;POPAF=2.40;TLOD=0.684 GT:AD:AF:DP:F1R2:F2R1:FAD:PGT:PID:PS:SB 0|1:2530,3:1.560e-03:2533:1008,2:1150,1:2520,3:0|1:8282_A_G:8282:1183,1347,0,3
chrM 8316 . T A . . AS_SB_TABLE=1168,1296|0,2;DP=2496;ECNT=7;MBQ=30,30;MFRL=328,259;MMQ=60,60;MPOS=29;OCM=0;POPAF=2.40;TLOD=-5.371e-01 GT:AD:AF:DP:F1R2:F2R1:FAD:PGT:PID:PS:SB 0|1:2464,2:1.186e-03:2466:997,2:1098,0:2476,2:0|1:8282_A_G:8282:1168,1296,0,2
chrM 8517 . C T . . AS_SB_TABLE=1070,1134|1,2;DP=2268;ECNT=1;MBQ=30,30;MFRL=331,522;MMQ=60,60;MPOS=31;OCM=0;POPAF=2.40;TLOD=-2.008e+00 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2204,3:1.618e-03:2207:965,1:968,2:2138,3:1070,1134,1,2
chrM 8640 . G A . . AS_SB_TABLE=1120,1104|1,4;DP=2276;ECNT=3;MBQ=30,20;MFRL=320,353;MMQ=60,60;MPOS=30;OCM=0;POPAF=2.40;TLOD=-7.294e-02 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2224,5:1.907e-03:2229:857,4:854,0:2104,4:1120,1104,1,4And for good measure, if I change 8316:T>A to 8316:T>C in the force-call alleles VCF manually, the variant is called and here is what I get:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT MY1776
chrM 8316 . T C . . AS_SB_TABLE=1136,1248|4,6;DP=2494;ECNT=2;MBQ=30,13;MFRL=327,362;MMQ=60,60;MPOS=27;OCM=0;POPAF=2.40;TLOD=-8.726e-01 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2384,10:1.668e-03:2394:1002,0:1108,3:2252,8:1136,1248,4,6
chrM 8372 . A C . . AS_SB_TABLE=1168,1108|6,26;DP=2408;ECNT=2;MBQ=30,10;MFRL=332,355;MMQ=60,60;MPOS=32;OCM=0;POPAF=2.40;TLOD=1.76 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2276,32:4.600e-03:2308:878,1:952,4:2186,28:1168,1108,6,26
chrM 8517 . C T . . AS_SB_TABLE=1020,1098|1,2;DP=2184;ECNT=1;MBQ=30,30;MFRL=330,522;MMQ=60,60;MPOS=31;OCM=0;POPAF=2.40;TLOD=-1.877e+00 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2118,3:1.692e-03:2121:980,1:981,2:2046,3:1020,1098,1,2
chrM 8640 . G A . . AS_SB_TABLE=1122,1126|1,4;DP=2303;ECNT=3;MBQ=30,20;MFRL=322,353;MMQ=60,60;MPOS=30;OCM=0;POPAF=2.40;TLOD=-3.756e-01 GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:2248,5:1.915e-03:2253:1013,4:1001,0:2100,4:1122,1126,1,4Let me know what else you need!
-
Hi Rahul Gupta,
It looks like we will need to take a closer look at what is happening here to determine if it is a bug. Would it be possible for you to submit your files as a bug report? We would need the reference fasta, the input bam, and the force calling VCF.
We have instructions for how to upload a bug report here: https://gatk.broadinstitute.org/hc/en-us/articles/360035889671
Put all your files in a zipped folder and let me know once you have been able to upload it.
Best,
Genevieve
-
Thanks for your response! The upload is complete -- the file is `220211_RG_bug_report_mutect2.zip`.
In it are two folders; each of which is a sample which has exhibited this issue. In each folder `command.txt` contains the full command, and `mutect_output.log` is the output of running mutect2 displayed on the terminal. I also included an `output` folder in which I have included the outputs I get from running the command.
For sample 1, the force-call variant I'm missing is `8316:T>A` as above; sample 2 is another example with a missing force-call variant (`8317:A>G`) in the output VCF.
Let me know if you need anything else -- looking forward to seeing what y'all find.
A broader question -- does the `--genotype-filtered-alleles` flag only apply to variants with non-PASS filters in the VCF provided in `--alleles`?
-
Thanks Rahul Gupta, I received your files and recreated the issue on my end. We'll start looking into the issue that is happening.
Yes, the --genotype-filtered-alleles flag isn't needed in your case, it would be necessary if some of the variants in your --alleles file were filtered (non-PASS).
-
Rahul Gupta thank you for reporting this to us, we really appreciate it.
We did find that this is a bug that affects mitochondria mode and so we have already started working on a fix.
Here is the issue ticket on github where we will post more updates regarding the fix: https://github.com/broadinstitute/gatk/issues/7672.
Please let me know if you have any questions.
-
Nice, thank you Genevieve! Looking forward to getting my hands on the new version when it's ready.
-
Rahul Gupta The fix has been merged and will be available in the next GATK release, which should happen this week.
-
Fantastic! Thank you for the notification, and thanks to the team for taking a look at this!
-
Hi -- just wanted to check in regarding the next GATK release. Do you have a sense for when this is planned to happen? Thanks!
-
It was supposed to happen last week but they ran into an issue. It should come out by the end of this week.
-
Hi -- checking back in! We're hoping to use this in a production application so curious if you have any updates on the next GATK release.
-
Everything looks set up for the release to happen tomorrow. We're sorry for the delay - we had to fix a few issues in the process.
-
The release is out: https://github.com/broadinstitute/gatk/releases/tag/4.2.6.0
Let us know how it goes for you.
Please sign in to leave a comment.
14 comments