Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Mutect2 calls germline variants not present in germline resource

4 comments

  • Gökalp Çelik

    Hi Jaime Alvarez Benayas

    It is all about the allele frequency of the variant in the germline resource, which feeds into the probability calculation that decides whether a particular site is counted as germline or not.

    TL;DR: The higher the allele frequency, the higher the probability that a site is considered germline.

    A longer description of how the germline filter works can be found in the Mutect2 documentation (pages 7 and 8).

    https://github.com/broadinstitute/gatk/blob/master/docs/mutect/mutect.pdf 

    I hope this helps. 
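
To make the TL;DR concrete, here is a minimal sketch of a two-hypothesis Bayes calculation in Python. It is a toy illustration, not Mutect2's actual model (see the linked PDF for that). The function name `germline_posterior`, the `somatic_prior` value of 1e-6, and the two likelihood parameters are all assumptions made up for this example.

```python
def germline_posterior(pop_af, somatic_prior=1e-6,
                       lik_germline=1.0, lik_somatic=1.0):
    """Toy posterior probability that a variant is germline, not somatic.

    pop_af        population allele frequency of the allele
    somatic_prior hypothetical prior probability of a somatic event
    lik_*         hypothetical likelihoods of the read data under each
                  hypothesis (1.0 means the reads do not favour either)
    """
    if not 0.0 <= pop_af <= 1.0:
        raise ValueError("pop_af must be between 0 and 1")
    # Hardy-Weinberg chance of carrying at least one copy of the allele.
    germline_prior = 1.0 - (1.0 - pop_af) ** 2
    germline_weight = germline_prior * lik_germline
    somatic_weight = somatic_prior * lik_somatic
    total = germline_weight + somatic_weight
    return germline_weight / total if total > 0 else 0.0


if __name__ == "__main__":
    # The posterior rises steadily with the population allele frequency.
    for af in (1e-7, 1e-5, 1e-3, 1e-1):
        print(f"AF={af:g}  P(germline)={germline_posterior(af):.4f}")
```

With equal likelihoods, the only thing that moves the result is the allele frequency, which is the behaviour described in the comment above.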

  • Jaime Alvarez Benayas

    Thank you, Gökalp Çelik, it does help a lot.

    What about the opposite case, that is, variants filtered as "germline" but not found in the germline population frequency resource? I am assuming this is because the matched normal provides "strong" evidence that the variant is germline. However, if the matched normal is a tumor and not healthy tissue, would it make sense to consider these variants as "non-germline"?

    Thank you.

     

  • Gökalp Çelik

    Hi again. 

    If a variant is not found in the germline resource, the algorithm still uses an estimated allele frequency for it. Being absent from the germline resource therefore does not guarantee that a variant will not be marked as germline.
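
The fallback described here can be sketched as a lookup with a default. This is an illustration only: the dictionary layout, the function name `population_af`, and the value 1e-6 are assumptions for the example, not GATK internals or GATK defaults. In Mutect2 itself the fallback frequency is, as far as I know, controlled by the `--af-of-alleles-not-in-resource` argument.

```python
def population_af(resource, chrom, pos, ref, alt, af_not_in_resource=1e-6):
    """Return the resource allele frequency, or a small fallback value.

    resource is a hypothetical dict keyed by (chrom, pos, ref, alt).
    af_not_in_resource is a placeholder, not a GATK default.
    """
    return resource.get((chrom, pos, ref, alt), af_not_in_resource)


# Hypothetical resource with a single common variant.
resource = {("chr1", 12345, "A", "G"): 0.12}

print(population_af(resource, "chr1", 12345, "A", "G"))  # present in resource
print(population_af(resource, "chr2", 999, "C", "T"))    # absent: fallback
```

Because the fallback is small but not zero, an allele missing from the resource keeps a non-zero germline prior, so strong evidence from the matched normal can still get it filtered as germline.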

    A matched normal always provides the best evidence for a possible germline event, but Mutect2 does not check whether a normal is actually a pseudonormal. You need to make that distinction yourself and adjust how those variants are filtered through other post-processing steps.

    I am sure David Benjamin, the author of this tool, has mentioned in the past that a high number of variants detected by Mutect2 and not found in the germline resource are actual germline variants. The estimates and assumptions about the germline resource are therefore already coded with this prior probability in mind.
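
One possible form of the post-processing mentioned above is to list the records whose only failing filter is "germline" and review them by hand when the normal is a pseudonormal. The sketch below reads plain VCF text lines. The function name `germline_only_records` and the `rescue_filters` parameter are made up for this example, and whether such records should be rescued is a judgment call for your own data, not a GATK recommendation.

```python
def germline_only_records(vcf_lines, rescue_filters=frozenset({"germline"})):
    """Yield (chrom, pos, ref, alt, filter) for records that failed only
    the filters named in rescue_filters."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        filters = set(filt.split(";"))
        # "PASS" and "." are never subsets of rescue_filters, so passing
        # and unfiltered records are skipped here.
        if filters <= rescue_filters:
            yield chrom, int(pos), ref, alt, filt


# Hypothetical records for illustration.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t200\t.\tC\tT\t.\tgermline\t.",
    "chr1\t300\t.\tG\tA\t.\tgermline;panel_of_normals\t.",
]

for record in germline_only_records(example):
    print(record)
```

Records that also failed another filter are left out on purpose, since those would remain filtered even if the germline call were set aside.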

  • Jaime Alvarez Benayas

    Thank you very much for the help Gökalp Çelik

