Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Why are some variants not phased in Mutect2?

1 comment

  • Laura Gauthier

    Hi xin cui,

    GATK doesn't do any sort of Bayesian phasing, so it's actually quite strict. In order to phase two variants, they need to occur together on either 100% or 0% of the reads that overlap both positions. If an error creates a mismatch at either position, the variants will fail to be phased. A long time ago someone did a sensitivity analysis, and it turned out that (compared with GATK3 ReadBackedPhasing) we only had 90% sensitivity for variants at adjacent positions. As I said, very strict. The updated assembly in GATK 4.2 or so makes some improvements, but with deep sequencing there will still be errors that prevent phasing. If you're really concerned about phasing, you will probably want to run an additional tool that uses the read-level information. You can find the official Docker image for the old GATK3 (which should have the ReadBackedPhasing tool) here: https://hub.docker.com/r/broadinstitute/gatk3/tags

    -Laura
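
To make the all-or-nothing rule described above concrete, here is a toy Python sketch. It is not GATK's code: the function name `strict_phase` and the representation of each read as a pair of booleans (carries the alt allele at site A, carries the alt allele at site B) are assumptions made purely for illustration.

```python
from typing import Iterable, Tuple


def strict_phase(reads: Iterable[Tuple[bool, bool]]) -> str:
    """Toy all-or-nothing phasing call for two sites (hypothetical helper).

    Each read overlapping both sites is given as (has_alt_a, has_alt_b).
    Returns "cis" if the two alt alleles always appear together,
    "trans" if they never appear together, and "unphased" otherwise.
    """
    # Reads carrying the reference allele at both sites say nothing
    # about how the two alt alleles relate, so they are ignored.
    informative = [(a, b) for a, b in reads if a or b]

    # Both alt alleles must be observed at least once.
    if not any(a for a, _ in informative) or not any(b for _, b in informative):
        return "unphased"

    both = sum(1 for a, b in informative if a and b)
    if both == len(informative):
        return "cis"    # alts co-occur on 100% of informative reads
    if both == 0:
        return "trans"  # alts co-occur on 0% of informative reads
    return "unphased"   # any mixture, e.g. one error read, breaks phasing


if __name__ == "__main__":
    clean = [(True, True)] * 99
    print(strict_phase(clean))
    # A single read with a mismatch at site B is enough to lose the phase.
    print(strict_phase(clean + [(True, False)]))
```

Under this strict rule, the chance that at least one overlapping read carries an error at one of the two sites grows with depth, which is consistent with the point above that deep sequencing will still produce unphased variants.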
