Increased number of false positives in Mutect2 GATK 4.1.7.0
We are running Mutect2 on WGS prostate cancer samples. We analyzed our tumor-normal pairs previously with GATK 4.1.3.0 and validated the results against deep sequencing panel results (500-700x).
We are now running the samples again with GATK 4.1.7.0 and are getting significantly more false positives. The exact same command line, gnomAD resource, PoN, filtering, ... everything. Only the Docker container is different: broadinstitute/gatk:4.1.7.0 instead of broadinstitute/gatk:4.1.3.0.
Are we missing some new parameters that we should be launching Mutect2 with? Should we change something in the filtering of the results, or why has performance dropped so dramatically between versions?
-
I found out that using
--ob-priors read-orientation-model.tar.gz
with FilterMutectCalls makes a big difference for the better. I think this is a new feature that was introduced recently. It is starting to feel like the documentation is all over the place and finding these critical changes is becoming difficult. What else am I missing?
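For reference, the filtering step we are testing now looks roughly like this (file names are placeholders for our own):
gatk FilterMutectCalls -R reference.fasta -V unfiltered.vcf.gz \
    --ob-priors read-orientation-model.tar.gz -O filtered.vcf.gz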
-
Public service announcement: the Mutect2 documentation is here: https://github.com/broadinstitute/gatk/blob/master/docs/mutect/mutect.pdf
The read orientation model has been around for about 2 years, the recent change being that Mutect2 can collect the F1R2 counts file on the fly to save runtime rather than requiring the separate tool CollectF1R2Counts. I'm surprised that running read orientation filtering wasn't necessary with GATK 4.1.3 because it has always been essential for FFPE data and not very important otherwise. Are your samples FFPE, or is there any other esoteric quality in their preparation that might cause substitution errors?
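For concreteness, a minimal sketch of how the counts are produced nowadays (file names are placeholders, and the other Mutect2 inputs are whatever you already use): add --f1r2-tar-gz to your Mutect2 call, then learn the model; the model output is what goes into --ob-priors in FilterMutectCalls.
gatk Mutect2 -R reference.fasta -I tumor.bam -I normal.bam -normal NORMAL_SAMPLE_NAME \
    --germline-resource af-only-gnomad.vcf.gz --panel-of-normals pon.vcf.gz \
    --f1r2-tar-gz f1r2.tar.gz -O unfiltered.vcf.gz
gatk LearnReadOrientationModel -I f1r2.tar.gz -O read-orientation-model.tar.gz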
Off the top of my head I can't think of any change between 4.1.3 and 4.1.7 that would affect false positives very much. To diagnose this it would help us a lot if you could share a VCF of new false positives in 4.1.7 and an accompanying VCF of the same sites in the output of FilterMutectCalls with 4.1.3, if they appear at all.
It would also help if you could share your commands for Mutect2, FilterMutectCalls, and any other tools such as CalculateContamination in your Mutect2 pipeline.
-
We are working with PAXgene-fixed material.
Maybe I misunderstood: I thought that using the Mutect2 --f1r2-tar-gz flag would insert a tag into the raw results that would cause them to be filtered out by FilterMutectCalls without any extra parameters. I'm a bit confused now, because I think we saw a change in our 4.1.3.0 calls after enabling it (and filtering without --ob-priors).
I am looking further into the issue and doing a more robust validation of the results. I'll let you know if we discover any clear issues with the calls. We did some tests with the contamination filtering and didn't see any significant changes; maybe we should try it one more time.
Do you recommend using FilterAlignmentArtifacts or any other additional steps?
-
Went back to check and we did use --ob-priors in the 4.1.3.0 filtering, I had just forgotten about it.
-
If the read orientation artifact filter makes a big difference, it must mean that PAXgene introduces substitution errors that occur sometime between extracting the DNA and ligating adapters. When you include --ob-priors, are the 4.1.7 results comparable to 4.1.3, or still inferior?
FilterAlignmentArtifacts can be a useful tool for removing mapping errors when you care more about removing false positives than about maximizing sensitivity.
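If you decide to try it, a rough sketch of how it is invoked (it needs a BWA-MEM index image of the reference, created once with BwaMemIndexImageCreator; file names here are placeholders):
gatk BwaMemIndexImageCreator -I reference.fasta -O reference.fasta.img
gatk FilterAlignmentArtifacts -R reference.fasta -V filtered.vcf.gz -I tumor.bam \
    --bwa-mem-index-image reference.fasta.img -O realignment-filtered.vcf.gz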
-
Thanks for the reply David,
The issue has reversed after applying --ob-priors. I think with 4.1.7.0 we are actually missing more calls now (false negatives, FN). In this recent sample I was looking at, we got 14 FNs with 4.1.3.0 and now we're getting 28 with 4.1.7.0. They are all detected, but filtered out by FilterMutectCalls. It may not seem like a lot, but this comparison only covers the regions in our targeted deep sequencing panel; doubling the rate of FNs over the whole genome is more problematic. Here is a list of the 28 calls we are classifying as FNs (missed calls that should not be filtered out) and the reasons for filtering:
chr1 4153237 . C T . strand_bias
chr1 175148066 . T TA . slippage
chr2 154188707 . CT C . slippage
chr3 34053364 . C A . strand_bias
chr3 77563487 . A AT . slippage
chr3 117343832 . C CA . slippage
chr5 2985099 . TAATTA T . slippage
chr5 9057846 . C G . clustered_events;haplotype
chr5 9057853 . C T . clustered_events;haplotype
chr5 9057991 . C T . clustered_events
chr5 141042984 . CT C . slippage
chr6 142332231 . G A . strand_bias
chr8 55063415 . A T . strand_bias
chr8 55546358 . G T . strand_bias
chr8 85085906 . AT A . slippage
chr8 87582133 . G C . strand_bias
chr8 112287186 . C A . strand_bias
chr9 122480098 . CT C . slippage
chr11 31525425 . A T . strand_bias;weak_evidence
chr11 78666986 . G GT . slippage
chr12 23546637 . G A . strand_bias
chr15 54266045 . CT C . slippage
chr15 54420688 . AT A . slippage
chr15 57251736 . A T . weak_evidence
chr18 35918470 . G A . strand_bias
chr20 20897790 . A G . normal_artifact
chrX 109271370 . TTC T . clustered_events
"strand_bias" as the sole reason for filtering out a SNV occurs 420 times total, so it is mostly doing a good job. I think we are seeing more "slippage" now in 4.1.7.0.
For example, SNV chr12 23546637 is detected by our deepseq panel with 18 reads out of 331 supporting the call:
chr12 23546637 . G A 0 PASS DEPTH=331;RATIO=0.05;TCGA=0:0:313:18;
There is really no reason to doubt it. In the WGS data that we are analyzing we see the variant in 6/64 reads. 5 of the 6 supporting reads are on the same strand, so I can understand why the filtering happens.
I have also been thinking about the possibility that 4.1.7.0 actually performs better here and our ground truth data is less accurate, but I can't think of a way to prove it.
Is there any way we can relax the filtering on calling slippages and strand bias?
-
I am experimenting with
--pcr-slippage-rate
to see how it affects the results.
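Concretely, something like this added to our FilterMutectCalls command from above (the value 0.05 is just a guess to see the effect of lowering it from the default):
gatk FilterMutectCalls -R reference.fasta -V unfiltered.vcf.gz \
    --ob-priors read-orientation-model.tar.gz --pcr-slippage-rate 0.05 -O filtered.vcf.gz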
About the orientation bias filtering... I would like to see how the results look if I could specify that a single supporting read on the other strand would be enough to keep the variant from getting filtered with "strand_bias". At least for low-AF variants. Is this possible?
-
I'm noticing the AS_SB_TABLE contains the information I'm interested in.
AS_FilterStatus=strand_bias;AS_SB_TABLE=67,35|1,4;
where "1,4" tells me the split between the strands for the reads supporting the alt allele. I think I'm going to apply my own post-filtering step that sets everything with up to six or seven supporting reads, as long as at least one of them is on the other strand, back to PASS.
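Here is roughly what I have in mind as a quick standalone sketch (my own post-processing script, not a GATK tool; it assumes the AS_SB_TABLE layout shown above, ref_fwd,ref_rev|alt_fwd,alt_rev, and only touches records whose sole filter is strand_bias):
#!/usr/bin/env python3
# Sketch: rescue low-depth calls filtered only as "strand_bias" when at least one
# alt-supporting read is on the minor strand. Reads a VCF on stdin, writes to stdout.
import re
import sys

MAX_ALT_DEPTH = 7  # only rescue calls with up to this many alt-supporting reads

def rescue(line):
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8 or fields[6] != "strand_bias":
        return line  # only touch records whose sole filter reason is strand_bias
    match = re.search(r"AS_SB_TABLE=([^;]+)", fields[7])
    if not match:
        return line
    strata = match.group(1).split("|")  # ref counts first, then one group per alt allele
    if len(strata) < 2:
        return line
    alt_fwd, alt_rev = (int(x) for x in strata[1].split(",")[:2])
    if 0 < alt_fwd + alt_rev <= MAX_ALT_DEPTH and min(alt_fwd, alt_rev) >= 1:
        fields[6] = "PASS"
        return "\t".join(fields) + "\n"
    return line

for vcf_line in sys.stdin:
    sys.stdout.write(vcf_line if vcf_line.startswith("#") else rescue(vcf_line))
Usage would be something like zcat filtered.vcf.gz | python3 rescue_strand_bias.py > rescued.vcf, then bgzip and index again.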
Update: Lowering the pcr-slippage-rate and introducing the new post-filtering step gets us back to matching the 4.1.3.0 performance. Maybe even a very slight improvement.
-
I think our strand bias filter is seriously wrong here, because in our validations we have found that strand bias artifacts (which are not the same as orientation bias artifacts) are very rare in modern sequencing. I think your post-filtering is a fine band-aid for the problem and in fact you could go a step further and just ignore the strand bias filter entirely. It will incur at most a few false positives. We need to fix this.
The slippage filter is more problematic because polymerase slippage occurs in all Illumina sequencing, in fact in all sequencing involving a polymerase. Your ground truth is not an orthogonal validation in this regard. We set up a nifty validation (https://www.biorxiv.org/content/10.1101/825042v1.full) and found that our slippage filter is frequently right and frequently wrong. Relaxing it is risky, but it definitely causes a good number of false negatives. We have figured out how to refine the slippage filter to improve things dramatically (for example, insertion slippage errors are much rarer in Illumina sequencing than deletion errors) but have not yet implemented this analysis.
If you are able to share your unfiltered Mutect2 VCF with us for debugging I'm sure I could find out why the strand bias filter is going off the rails and fix it pretty quickly.
-
@David_Benjamin,
I sent you an email regarding our data, we would be happy for you to take a look at it.
-
[We're discussing this off-forum, but I will post here once I have an answer on the strand bias issue.]
-
David Benjamin and @registered_user,
Are there any updates regarding this? I've been running Mutect2 and want to better understand each of the commonly seen filters (e.g. slippage, orientation). FYI, I'm using the latest version of GATK, 4.2.0.
-
I spoke with David Benjamin and I believe he was unable to find anything obviously wrong with the user's results. More information on the Mutect2 filters and parameters can be found in the Tool Documentation, the Mutect2 FAQ, and this paper.
Kind regards,
Pamela
-
I'm using GATK 4.2.4.0 and GRCh38 and am getting plenty of false calls per exome.
Is there a problem in HaplotypeCaller? Please let me know.
-
Rajesh Kumar Maurya, thank you for your question! The advice in this thread generally applies to somatic calls. In your case, it sounds like you are doing germline analysis.
There are no general HaplotypeCaller bugs that would cause an excess of false positives. HaplotypeCaller is written to find possible signs of variation, and we expect users to perform variant filtering on its results. Here is an article regarding the variant filtering step: https://gatk.broadinstitute.org/hc/en-us/articles/360035531112--How-to-Filter-variants-either-with-VQSR-or-by-hard-filtering
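As a concrete illustration of the hard-filtering approach from that article (the thresholds below are just the commonly suggested starting points for SNPs and should be tuned for your data; file names are placeholders):
gatk VariantFiltration -R reference.fasta -V germline.vcf.gz \
    --filter-expression "QD < 2.0" --filter-name "QD2" \
    --filter-expression "FS > 60.0" --filter-name "FS60" \
    -O germline.filtered.vcf.gz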
If you have any more questions regarding HaplotypeCaller, feel free to open up a new thread and we can help you from there!