Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Expected variants not detected by Mutect2

Answered

11 comments

  • Pamela Bretscher

    Hi Nicolas Loucheu,

    Here is an additional troubleshooting resource for cases where Mutect2 does not call a variant you expect, along with answers to common questions about how Mutect2 works. It is generally recommended to stick to the default parameters for the tool.

    https://gatk.broadinstitute.org/hc/en-us/articles/360035891111-Expected-variant-at-a-specific-site-was-not-called

    https://gatk.broadinstitute.org/hc/en-us/articles/360050722212-FAQ-for-Mutect2

    Kind regards,

    Pamela

  • ISmolicz

    Dear Nicolas Loucheu,

    Please may I ask if you managed to resolve the issue described in your initial post above? If so, what was the cause, and please could you advise on how it was resolved?

    Thank you for your time and help.

  • ISmolicz

    Dear David Benjamin and the GATK Team,

    Reading the post above written by Nicolas Loucheu, multiple steps were taken to try and resolve the issue described. This included steps mentioned in the following troubleshooting guidelines:

    1. Would you have any specific advice on what the next step would be to investigate and resolve the issue described in the post above?
    2. What could be leading to reads being hard clipped by Mutect2 in a seemingly random way but at the same position for each read?

    Thank you for your time and help.

  • Nicolas Loucheu

    Hi ISmolicz,

    We are still using Mutect1 for that pipeline, and we have not yet tried Mutect2 versions newer than 4.1.9.0. Perhaps an update has corrected the issue or added a parameter to solve it.

    If you have any news on how to correct that issue, that would still be very useful for us as well.

    Thanks a lot!

    Nicolas Loucheu

  • David Benjamin

    Nicolas Loucheu ISmolicz It's possible Mutect2 is discarding this as a possible germline event before it gets to assembly and genotyping. First, you should use the af-only-gnomad VCF from the GATK best practices bucket as the germline resource, not dbSNP. Second, you could run with --genotype-germline-sites to see if the variant appears in the output.
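
A minimal sketch of the two suggestions above, assuming a tumor-only run. The file names are hypothetical placeholders, and the wrapper function `mutect2_keep_germline_sites` is introduced only for illustration. By default the script only prints the command (a dry run); setting `GATK=gatk` in the environment would execute it instead.

```shell
#!/bin/sh
set -eu

# Dry run by default: "echo gatk" prints the command instead of running it.
# Set GATK=gatk in the environment to execute Mutect2 for real.
GATK="${GATK:-echo gatk}"

# Hypothetical file names; substitute your own paths.
REF="Homo_sapiens_assembly38.fasta"
BAM="tumor.bam"
GNOMAD="af-only-gnomad.hg38.vcf.gz"   # germline resource with allele frequencies
OUT="tumor.germline-sites.vcf.gz"

mutect2_keep_germline_sites() {
    # $GATK is intentionally unquoted so "echo gatk" splits into two words.
    $GATK Mutect2 \
        -R "$REF" \
        -I "$BAM" \
        --germline-resource "$GNOMAD" \
        --genotype-germline-sites true \
        -O "$OUT"
}

mutect2_keep_germline_sites
```

If the expected variant appears in this output but not in a default run, that would suggest it was being skipped as a likely germline site before assembly.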

  • Matteo Costacurta

    Hello GATK team,

    I am experiencing similar issues when calling variants in some potentially complex genes, such as FLG: Mutect2 (versions 4.1.2.0, 4.2.2.0 and 4.3.0.0) does not call seemingly legitimate variants. I followed the troubleshooting tutorial here (https://gatk.broadinstitute.org/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant) with little success. I am running Mutect2 in tumor-only mode. All the resources (PoN, germline, genome, etc.) were downloaded from the GATK bucket.

    Note that I also ran VarDict on the side to see which variants each of the two callers reports. I noticed that VarDict calls variants that are not called by Mutect2, and vice versa. Below is an example (the reads are not shown, but I can send data if necessary).

    When a variant is called by VarDict but not by Mutect2, or a variant is evident when comparing the BAM with the reference but is not called by either caller, the region is missing from the bamout file. In these cases:

    • If quality of reads, alignment, etc. are not an issue, how and why is Mutect2 completely ignoring the variant?
    • Some variants are called by an older version of Mutect2 and not by a newer version, and vice versa - what is the difference between versions? 
    • Enabling debug, linked de Bruijn graph, etc. options does not result in called variants that seem 'evident' in the bam file. Some of the regions that include these variants may or may not appear in the debug output
    • Some of these variants are called in other samples of the same cohort of patients and do not seem to be qualitatively much different from those in samples where these are not called

    How do I know that a variant has been excluded for a good reason and was not just skipped by Mutect2? I would appreciate some guidance on why some variants are excluded or not called.

    Command used:

    gatk Mutect2 \
    -R resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta \
    -I my.bam \
    -L <chr>:<start>-<end> \
    --germline-resource somatic-hg38_af-only-gnomad.hg38.vcf.gz \
    --panel-of-normals somatic-hg38_1000g_pon.hg38.vcf.gz \
    -bamout my.bam.bamout.bam \
    --recover-all-dangling-branches true \
    --linked-de-bruijn-graph true \
    --min-pruning 0 \
    --debug-graph-transformations true \
    -genotype-filtered-alleles true \
    -debug true \
    -O my.bam.var.vcf.gz 2>&1 | tee my.bam.debug.err
  • David Benjamin

    Matteo Costacurta It's possible that Mutect2 rejects the variant as an artifact in the panel of normals or as a possible germline variant before assembly. You can use the --genotype-pon-sites and --genotype-germline-sites options to force it to output these.
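
One way to see which sites these options bring back is to compare the site lists of a default run and a forced run. The sketch below assumes both VCFs have been decompressed to plain text; the file names and the helper functions `vcf_sites` and `sites_only_in_forced` are hypothetical and not part of GATK.

```shell
#!/bin/sh
set -eu

# Print one "CHROM:POS:REF:ALT" key per record of an uncompressed VCF on stdin.
vcf_sites() {
    awk -F '\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' | LC_ALL=C sort -u
}

# List sites present in the forced run ($2) but absent from the default run ($1).
sites_only_in_forced() {
    default_sites=$(mktemp)
    forced_sites=$(mktemp)
    vcf_sites < "$1" > "$default_sites"
    vcf_sites < "$2" > "$forced_sites"
    # comm -13 keeps only lines unique to the second (forced) list.
    LC_ALL=C comm -13 "$default_sites" "$forced_sites"
    rm -f "$default_sites" "$forced_sites"
}

# Example usage (hypothetical file names):
#   gunzip -c default.vcf.gz > default.vcf
#   gunzip -c forced.vcf.gz  > forced.vcf
#   sites_only_in_forced default.vcf forced.vcf
```

Sites printed by `sites_only_in_forced` are candidates for having been skipped before assembly in the default run, which is the behaviour described above.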

  • ISmolicz

    Hi Nicolas Loucheu,

    Thank you for your reply to my post. I do not have an update yet, but will post if I do.

    Kind regards.

  • ISmolicz

    Hi David Benjamin,

    Thank you for your reply to my post.

    What does setting the arguments --genotype-germline-sites and --genotype-pon-sites to true change in the Mutect2 output? Would variants that are likely germline and/or in the panel of normals not be in the Mutect2 output anyway, and then be labelled 'germline' and/or 'panel_of_normals', respectively, by FilterMutectCalls? (This relates to my query in this post). Alternatively, are likely germline variants and variants present in the panel of normals assessed at more than one stage by Mutect2?

    Thank you for your time and help.

    Kind regards.

  • Matteo Costacurta

    Hi ISmolicz,

    Those two options successfully forced the calling of all the variants in my bam files, so that worked fine for me. I am using GATK 4.3.0.0.

    Without those options, Mutect2 would just ignore some of the variants. Not sure how to answer your other questions.

    Good luck!

    Matteo

  • David Benjamin

    These arguments turn off speed optimizations.  They ensure that germline and panel sites are in the output VCF (they will still be marked with the appropriate filters by FilterMutectCalls), which can improve interpretability at the cost of significant runtime for local assembly and realignment at non-somatic sites.
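
To check how the forced sites end up labelled, one can tally the FILTER column of the VCF produced by FilterMutectCalls. This is a sketch assuming an uncompressed VCF on standard input; `count_filters` is a hypothetical helper, and the filter names it prints depend entirely on the data.

```shell
#!/bin/sh
set -eu

# Count records per FILTER value in an uncompressed VCF read from stdin.
# FILTER is column 7; multiple filters on one record are separated by ';'.
count_filters() {
    awk -F '\t' '
        !/^#/ {
            n = split($7, names, ";")
            for (i = 1; i <= n; i++) count[names[i]]++
        }
        END { for (name in count) print name "\t" count[name] }
    ' | LC_ALL=C sort
}

# Example usage (hypothetical file name):
#   gunzip -c filtered.vcf.gz | count_filters
```

A record carrying several filters is counted once under each of them, so the totals can exceed the number of records.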

