Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

GATK process banner

Need Help?

Search our documentation

Community Forum

Hi, How can we help?

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size. Learn more

How to decrease the likelihood threshold for a candidate haplotype to be called as a genotype in Mutect2

Answered
0

9 comments

  • Avatar
    Genevieve Brandt (she/her)

    Hi Yussanne Ma,

    Thank you for your thorough post! I am starting to look into this. Unfortunately, there is a holiday in the US and so I won't be able to get back to you until Monday. Could you run Mutect2 around this site and force call it with the --alleles option? I want to see what the evidence for the site would be in the VCF.

    Thank you,

    Genevieve

    0
    Comment actions Permalink
  • Avatar
    Yussanne Ma

    Hi Genevieve,

    Thank you so much for getting back to me!  I ran the command on the same file with all the same params, forcing the call at that position, and here is the relevant line from the resulting vcf:

    chr5 112840254 . G GA . . AS_SB_TABLE=8004,8324|14,15;

    DP=17495;ECNT=1;MBQ=20,20;MFRL=166,167;MMQ=60,60;

    MPOS=30;POPAF=7.30;RPA=6,7;RU=A;STR;TLOD=-4.740e+00 GT:AD:AF:DP:F1R2:F2R1:SB

    0/1:16328,29:5.914e-04:16357:7254,13:1195,3:8004,8324,14,15

    0
    Comment actions Permalink
  • Avatar
    Yussanne Ma

    Hi Genevieve,

    After looking more closely at what I sent you yesterday, I can see that the TLOD is -4.7, so I am able to now get this variant by setting my init-lod and emit-lod to -10.  I am still a little confused as to why the odds ratio is so low, given that the alternate depth and F/R balance is not bad, and certainly the tool has called other variants with less confidence, so I'd appreciate any insight you have, but I understand that there is a lot involved in the models to get to these likelihood calculations so this could just be an edge case!

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    The --alleles option does change some of the likelihood calculations. I'm wondering what the site looks like when you are able to call it without the --alleles option?

    0
    Comment actions Permalink
  • Avatar
    Yussanne Ma

    Here it is in the two samples where I wasn't able to call it previously:

    chr5 112840254 . G GA . . AS_SB_TABLE=8004,8324|14,15;DP=17495;ECNT=1;MBQ=20,20;MFRL=166,167;MMQ=60,60;MPOS=30;POPAF=7.30;RPA=6,7;RU=A;STR;TLOD=-4.740e+00 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:16328,29:5.914e-04:16357:7254,13:1195,3:8004,8324,14,15

    chr5 112840254 . G A,GA . . AS_SB_TABLE=6731,7049|1,1|15,16;DP=14829;ECNT=2;MBQ=20,20,20;MFRL=168,164,162;MMQ=60,60,60;MPOS=27,48;POPAF=7.30,7.30;TLOD=-5.369e+00,-5.211e+00 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2:13780,2,31:2.153e-04,1.052e-03:13813:6443,1,12:709,0,3:6731,7049,16,17

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Ok, thank you, we'll take a closer look and get back to you.

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Hi Yussanne Ma,

    The reason this site is not called without decreasing the lod arguments is that the frequency of the reads supporting the alternate allele is much lower than the normal error rate with illumina short reads. Since this is an insertion in a homopolymer, the error rate is around 1/1000, so we cannot be confident that these reads supporting the alternate allele are not just errors. Different types of variants could be called with lower support if the normal error rate for those variant types is lower.

    I hope this helps you to understand what is going on at this site. Please let me know if you have further questions.

    Best,

    Genevieve

    0
    Comment actions Permalink
  • Avatar
    Yussanne Ma

    That definitely makes sense, Genevieve, thank you so much for looking into this for me.  If you develop a version of mutect2 with priors adjusted for high-depth low-AF data, we would be really excited to use it :)

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Happy to help!

    0
    Comment actions Permalink

Please sign in to leave a comment.

Powered by Zendesk