Expected variants not detected by Mutect2
Dear GATK,
We are currently trying to implement a pipeline to detect tumor mutations.
We followed the "Somatic short variant discovery (SNVs + Indels)" Best Practices Workflow, using our own validation samples.
Some of the expected variants, present in the BAM file given to Mutect2, are not detected by the tool. I have tried what is advised in this blog page, but none of those parameters enabled Mutect2 to detect those variants.
I have also tried the parameter "--disable-adaptive-pruning", which allowed Mutect2 to detect a few more of the expected variants, but some mutations are still not detected.
The reads in the variant area seem to be hard-clipped by Mutect2 (see Figure 1), while the reads containing the variant simply seem to be filtered out.
The position of the variant is present in the intervals BED file.
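A claim like this can be double-checked with a small script. The sketch below assumes a standard BED file, whose intervals are 0-based and half-open, so a 1-based position P (as shown in a VCF or in IGV) is covered when start < P <= end. The function name `bed_covers` and the example coordinates are hypothetical.

```shell
# bed_covers BED CHROM POS
# Prints every BED interval that contains the 1-based position POS on CHROM.
# Exit status is 0 if at least one interval matches, 1 otherwise.
bed_covers() {
    awk -v chrom="$2" -v pos="$3" '
        # BED start is 0-based and end is exclusive: covered when start < pos <= end.
        $1 == chrom && $2 < pos && pos <= $3 { print; found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example usage (hypothetical coordinates):
# bed_covers intervals.bed chr7 55181378 || echo "position not in intervals"
```

An off-by-one at an interval boundary is easy to miss by eye, which is why the comparison is spelled out explicitly here.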
The base qualities of the variant are all very good, and the mapping qualities of the reads supporting the variant are also always very good (see Figure 2).
The variants aren't low-frequency ones (AF: 9% in the example).
Those variants are not special in any way: they are neither at the ends of the reads nor at the ends of the regions.
In this example, it is a probe-based sample, but the results are similar with an amplicon-based sample. The samples used are from targeted sequencing.
An earlier pipeline of ours, which uses GATK 3.6 + Mutect1, correctly detects those variants.
I am using Mutect2 4.1.9.0.
Mutect2 command line used:
gatk Mutect2 -R reference.fa -L intervals.bed -I input.bam --tumor-sample sampleNA --disable-adaptive-pruning --germline-resource dbsnp.vcf --f1r2-tar-gz f1r2.tar.gz -O output.vcf --bam-output output.bam 2>output.log
The reference genome and the DBSNP VCF come from the GATK Resource Bundle (hg38).
What I have already tried:
- --debug doesn't show anything in the region of the variant
- --min-pruning 0
- --max-reads-per-alignment-start 0
- --linked-de-bruijn-graph
- --recover-all-dangling-branches
- --padding-around-snps 75
- --alleles with a VCF containing the variant
- --assembly-region-out -> The file contains the region where the variant is.
- --force-active
- --disable-read-filters MappingQualityNotZeroReadFilter
- --ignore-itr-artifacts
- --mitochondria-mode
- --independent-mates
- --pruning-lod-threshold 1.3/0.5/0.1
Figure 1: Reads from the BAM given to Mutect2 (top) and rendered by Mutect2 (bottom). The reads containing the variant are not in the BAMout from Mutect2. The reads kept in that region are hard-clipped, making a depth of 0 at the variant location.
Figure 2: Example of a read hard-clipped by Mutect2 in that region.
Is there some parameter I should consider in order to detect those variants? What could cause GATK to miss them?
Thanks a lot!
Nicolas Loucheu
-
Hi Nicolas Loucheu,
Here is an additional resource that may be helpful: it covers troubleshooting when Mutect2 does not call a variant you expect, as well as common questions about how Mutect2 works. It is generally recommended to stick to the default parameters for the tool.
https://gatk.broadinstitute.org/hc/en-us/articles/360050722212-FAQ-for-Mutect2
Kind regards,
Pamela
-
Dear Nicolas Loucheu,
Please may I ask if you managed to resolve the issue described in your initial post above? If so, what was the cause, and please could you advise on how it was resolved?
Thank you for your time and help.
-
Dear David Benjamin and the GATK Team,
Reading the post above written by Nicolas Loucheu, multiple steps were taken to try and resolve the issue described. This included steps mentioned in the following troubleshooting guidelines:
- https://gatk.broadinstitute.org/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant
- https://gatk.broadinstitute.org/hc/en-us/articles/360035891111-Expected-variant-at-a-specific-site-was-not-called
- Would you have any specific advice on what the next step would be to investigate and resolve the issue described in the post above?
- What could lead to reads being hard-clipped by Mutect2 in a seemingly random way, but at the same position for each read?
Thank you for your time and help.
-
Hi ISmolicz,
We are still using Mutect1 for that pipeline, and we haven't yet tried Mutect2 versions newer than 4.1.9.0. Maybe an update has corrected the issue or added a parameter that solves it.
If you have any news on how to correct that issue, that would still be very useful for us as well.
Thanks a lot!
Nicolas Loucheu
-
Nicolas Loucheu ISmolicz It's possible Mutect2 is discarding this as a possible germline event before it gets to assembly and genotyping. First, you should use the af-only-gnomad VCF from the GATK best practices bucket as the germline resource, not dbSNP. Second, you could run with --genotype-germline-sites to see if the variant appears in the output.
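As a rough sketch of that suggestion, a trimmed version of the command from the original post could be rerun with the gnomAD resource swapped in and germline sites forced into the output. The file names are the placeholders from the original post; `af-only-gnomad.hg38.vcf.gz` is assumed to be a local copy of the gnomAD resource, and the function name and the `GATK` override variable are hypothetical conveniences of this sketch, not part of GATK.

```shell
# Sketch only: file names are placeholders from the original post.
# Set GATK=echo to print the command line instead of running it.
mutect2_germline_sites() {
    ${GATK:-gatk} Mutect2 \
        -R reference.fa \
        -L intervals.bed \
        -I input.bam \
        --tumor-sample sampleNA \
        --germline-resource af-only-gnomad.hg38.vcf.gz \
        --genotype-germline-sites true \
        --f1r2-tar-gz f1r2.tar.gz \
        -O output.vcf \
        --bam-output output.bam
}

# Dry run: show what would be executed without calling GATK.
GATK=echo mutect2_germline_sites
```

The other parameters are left at their defaults here, in line with the earlier advice in this thread. If the variant then shows up in the output, that would suggest it was being skipped as a likely germline site rather than lost during assembly.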
-
Hello GATK team,
I am currently experiencing similar issues when calling variants in some genes of possible complexity, such as FLG: Mutect2 (versions 4.1.2.0, 4.2.2.0 and 4.3.0.0) does not call seemingly legitimate variants. I followed the troubleshooting tutorial here (https://gatk.broadinstitute.org/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant) with little success. I am running Mutect2 in tumor-only mode. All the resources (PoN, germline resource, reference genome, etc.) were downloaded from the GATK bucket.
As a side note, I also ran VarDict to compare which variants the two callers report. I noticed that VarDict calls variants that Mutect2 does not, and vice versa. Below is an example (the reads are not shown, but I can send data if necessary).
When a variant is called by VarDict but not by Mutect2, or a variant is evident in the BAM file relative to the reference but called by neither caller, the region is missing from the bamout file. In these cases:
- If quality of reads, alignment, etc. are not an issue, how and why is Mutect2 completely ignoring the variant?
- Some variants are called by an older version of Mutect2 and not by a newer version, and vice versa - what is the difference between versions?
- Enabling debug, linked de Bruijn graph, etc. options does not result in called variants that seem 'evident' in the bam file. Some of the regions that include these variants may or may not appear in the debug output
- Some of these variants are called in other samples of the same cohort of patients and do not seem to be qualitatively much different from those in samples where these are not called
How do I know that a variant has been excluded for a good reason and was not just skipped by Mutect2? It would be great to have some guidance as to why some variants are excluded or not called.
Command used:
gatk Mutect2 \
-R resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta \
-I my.bam \
-L <chr>:<start>-<end> \
--germline-resource somatic-hg38_af-only-gnomad.hg38.vcf.gz \
--panel-of-normals somatic-hg38_1000g_pon.hg38.vcf.gz \
-bamout my.bam.bamout.bam \
--recover-all-dangling-branches true \
--linked-de-bruijn-graph true \
--min-pruning 0 \
--debug-graph-transformations true \
-genotype-filtered-alleles true \
-debug true \
-O my.bam.var.vcf.gz 2>&1 | tee my.bam.debug.err
-
Matteo Costacurta It's possible that Mutect2 rejects the variant as an artifact in the panel of normals or as a possible germline variant before assembly. You can use the --genotype-pon-sites and --genotype-germline-sites options to force it to output these.
-
Hi Nicolas Loucheu,
Thank you for your reply to my post. I do not have an update yet, but will post if I do.
Kind regards.
-
Hi David Benjamin,
Thank you for your reply to my post.
What does setting the arguments --genotype-germline-sites and --genotype-pon-sites to true change in the Mutect2 output? Wouldn't variants that are likely germline and/or in the panel of normals be in the Mutect2 output anyway, and then be labelled 'germline' and/or 'panel_of_normals', respectively, by FilterMutectCalls? (This relates to my query in this post). Alternatively, are likely germline variants or variants present in the panel of normals assessed at more than one stage by Mutect2?
Thank you for your time and help.
Kind regards.
-
Hi ISmolicz,
Those two options successfully forced the calling of all the variants in my bam files, so that worked fine for me. I am using GATK 4.3.0.0.
Without those options, Mutect2 would just ignore some of the variants. Not sure how to answer your other questions.
Good luck!
Matteo
-
These arguments turn off speed optimizations. They ensure that germline and panel sites are in the output VCF (they will still be marked with the appropriate filters by FilterMutectCalls), which can improve interpretability at the cost of significant runtime for local assembly and realignment at non-somatic sites.
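To see which filters a forced site ends up with, the FILTER column of the FilterMutectCalls output can be inspected at the position of interest. The helper below is a minimal sketch that assumes a standard tab-separated VCF; the name `vcf_filter_at`, the file names, and the coordinates in the usage lines are hypothetical.

```shell
# vcf_filter_at CHROM POS [VCF]
# Prints "CHROM:POS REF>ALT FILTER" for every record at the given site.
# Reads an uncompressed VCF from the file argument, or from stdin if omitted.
vcf_filter_at() {
    chrom=$1; pos=$2; shift 2
    awk -F'\t' -v chrom="$chrom" -v pos="$pos" '
        /^#/ { next }   # skip header lines
        $1 == chrom && $2 == pos { print $1 ":" $2, $4 ">" $5, $7 }
    ' "$@"
}

# Example usage (hypothetical file names and coordinates):
# gatk FilterMutectCalls -R reference.fa -V output.vcf -O filtered.vcf
# vcf_filter_at chr1 152302000 filtered.vcf
# gzip -dc filtered.vcf.gz | vcf_filter_at chr1 152302000
```

A site reported with `germline` or `panel_of_normals` in the last column (the filter names mentioned earlier in this thread) was assessed and filtered, whereas no output at all means the site is absent from the VCF.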