Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Missing mutations

11 comments

  • vctrymao

    I read that using `--force-active` in Mutect2 may help. What exactly does it do, and why is it turned off by default?

  • Genevieve Brandt (she/her)

    Hi vctrymao, please work through the steps in this troubleshooting document: "When HaplotypeCaller and Mutect2 do not call an expected variant".

    As for the --force-active argument, it marks all regions as active regions. Please see this article about active regions for more information.

    Also, is the region you are viewing in IGV near either end of any interval in your intervals list?

  • vctrymao

    I read this, but I did not see an exact description of how an "active region" is defined mathematically. Regardless, using --force-active seems to have fixed the problem.

  • Genevieve Brandt (she/her)

    Thanks for posting the update, vctrymao!

  • vctrymao

    Could you explain exactly how GATK defines active regions? It isn't intuitive to me that Mutect2 would omit a region with strong read-level support.

  • Genevieve Brandt (she/her)

    vctrymao, please see this paper: https://github.com/broadinstitute/gatk/blob/master/docs/mutect/mutect.pdf. It has a section on active regions and how they are determined mathematically.

  • David Benjamin

    vctrymao, --force-active makes Mutect2 run very slowly. It is possible that this variant is not output because it overlaps the germline resource or the panel of normals.

  • ISmolicz

    Hi David Benjamin,

    Following your post above: if a variant were in the germline resource and/or the panel of normals, would it not still be output by Mutect2 and then labelled 'germline' and/or 'panel_of_normals' by FilterMutectCalls?

    Other than Mutect2 running very slowly, are there any caveats to be aware of when setting --force-active to true with Mutect2? When would setting it to true be advisable?

    Thank you for your time and help.

    Kind regards.

  • David Benjamin

    To be clear, the germline resource is not a hard filter. It only supplies the population allele frequency for the probabilistic germline model.

    By default, variants in the panel of normals do not trigger local assembly (this is overridden with --genotype-pon-sites). However, they can still end up in the output VCF when a different site nearby triggers assembly and genotyping.

    I would only advise --force-active for debugging. I have personally never used it otherwise, even when performing validations where I want Mutect2 to look as good as possible.

  • ISmolicz

    Hi David Benjamin,

    Thank you for your reply.

    The arguments --genotype-germline-sites and --genotype-pon-sites are described with similar terminology in the Mutect2 documentation. However, of the two, only the panel of normals acts as a hard filter. If the germline resource is not a hard filter, what defines an "apparent germline site" as described in the Mutect2 documentation for --genotype-germline-sites? There is information in this document, but further clarity would be appreciated.

    You described that variants in the panel of normals "can end up in the output VCF when a different site nearby triggers assembly and genotyping". Is this also what happens with apparent germline sites when --genotype-germline-sites is set to false?

    Thank you for your advice regarding --force-active.

    Kind regards.

  • David Benjamin

    The Mutect2 paper on bioRxiv goes into quite a bit more detail than the documentation on the forum. In particular, it has details on the initial guesses of apparent somatic variation that trigger assembly and genotyping.
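
As a rough illustration of the debugging advice in this thread, the Python sketch below does two things. It assembles a Mutect2 command line restricted to a small interval with --force-active enabled, and it checks whether a site of interest appears in a plain-text VCF such as a panel of normals or germline resource. It is a sketch under stated assumptions, not an official workflow: the function names, file names, and interval are hypothetical, the input VCF is assumed to be uncompressed, and the flag spellings should be checked against the documentation for the installed GATK version.

```python
from typing import Iterable, List, Optional


def build_mutect2_debug_command(
    reference: str,
    tumor_bam: str,
    interval: str,
    output_vcf: str,
    panel_of_normals: Optional[str] = None,
    germline_resource: Optional[str] = None,
    force_active: bool = True,
    genotype_pon_sites: bool = False,
) -> List[str]:
    """Assemble an argument list for a small, debugging-only Mutect2 run.

    Hypothetical helper: it only builds the command and does not run GATK.
    """
    cmd = [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", tumor_bam,
        "-L", interval,  # keep the interval small: --force-active is slow
        "-O", output_vcf,
    ]
    if panel_of_normals:
        cmd += ["--panel-of-normals", panel_of_normals]
    if germline_resource:
        cmd += ["--germline-resource", germline_resource]
    if force_active:
        # Marks every region as active; advised in the thread for debugging only.
        cmd.append("--force-active")
    if genotype_pon_sites:
        # Lets sites in the panel of normals trigger assembly and genotyping.
        cmd.append("--genotype-pon-sites")
    return cmd


def site_in_vcf_lines(vcf_lines: Iterable[str], chrom: str, pos: int) -> bool:
    """Return True if any record in plain-text VCF lines starts at chrom:pos.

    Simplification: only the POS column is compared, so a record that starts
    upstream and spans the site (for example a deletion) is not detected.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.split("\t")
        if len(fields) >= 2 and fields[0] == chrom and fields[1] == str(pos):
            return True
    return False


if __name__ == "__main__":
    # All paths and coordinates below are placeholders.
    command = build_mutect2_debug_command(
        reference="ref.fasta",
        tumor_bam="tumor.bam",
        interval="chr1:100-200",
        output_vcf="debug.vcf.gz",
        panel_of_normals="pon.vcf.gz",
    )
    print(" ".join(command))
```

If the expected variant appears only in the forced run, and its position is also found in the panel of normals, that would be consistent with the explanation above that panel-of-normals sites do not trigger local assembly by default.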
