Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Possible wrong call in Orientation bias filter with F2R1 null read counts (REF/ALT)

6 comments

  • Genevieve Brandt (she/her)

    Hi Maria Maqueda,

    We are a bit behind with addressing Mutect2 questions, so it may take us some time to figure out exactly what is going on here.

    Tumor-only analysis is challenging because many of the candidate variants will be false positives when there is no matched normal sample to help with filtering. Do you have examples of variants that you think were wrongly filtered out by the orientation filter? Seeing those variants in the VCF and in IGV may help us figure out why they were filtered.

    Best,

    Genevieve

  • Maria Maqueda

    Hi Genevieve Brandt (she/her),

    Thanks for your response. I understand that you need time (there are a lot of us Mutect2 users!). I will be patient, no worries.

    I agree that tumor-only analysis is challenging. However, I was surprised by the huge share of variants filtered out by the orientation filter (9,965 of the 10,062 raw variants), either on its own or in combination with other filters. For instance, among the filtered-out variants, we have identified four in TP53 that we expect to be true positives.

    I see the relevance of orientation filtering given that these are FFPE samples. However, I do not understand how the model can detect an orientation bias if all REF and ALT reads are in the same orientation. Conceptually, for an orientation bias to be detectable, I would expect:

    a. F1R2 and F2R1 reads (more or less) balanced for REF reads.

    b. for ALT reads, all F1R2 (or F2R1)

    But this is not the case in these results. Am I missing or misunderstanding something?
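
The expectation in (a) and (b) can be sketched as a simple check on per-sample read counts. This is only an illustration, assuming counts are given in (REF, ALT) order as in the F1R2 and F2R1 FORMAT fields. The function name, the `min_fraction` threshold, and the example counts are hypothetical, and this is not how the GATK read orientation model works internally.

```python
def orientation_is_informative(f1r2, f2r1, min_fraction=0.1):
    """Return True if REF reads are observed in both orientations.

    f1r2, f2r1: (ref_count, alt_count) tuples for one sample.
    min_fraction: hypothetical minimum share of the minority
    orientation among REF reads; not a GATK parameter.
    """
    ref_total = f1r2[0] + f2r1[0]
    if ref_total == 0:
        return False
    minority = min(f1r2[0], f2r1[0])
    return minority / ref_total >= min_fraction


# Illustrative counts only: every read is F1R2, so REF is not balanced.
print(orientation_is_informative((100, 5), (0, 0)))   # False
# Illustrative counts only: REF balanced, ALT seen in one orientation.
print(orientation_is_informative((50, 12), (48, 0)))  # True
```

With counts like the first example, condition (a) can never hold, so there is nothing to compare the ALT orientation against.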

    Many thanks in advance!!

    Maria

  • Genevieve Brandt (she/her)

    Yes, I see. Could you share the VCF lines for the filtered TP53 variants?

  • Maria Maqueda

    Hi again Genevieve Brandt (she/her),

    Below are the four TP53 variants filtered out by the orientation filter, alone or in combination with other filters. This VCF subset is a merge across different samples.

    #CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	S1	S2	S3	S4	S5	S6	S7	S8	S9	S10	S11
    chr17 7577124 . C A,T . strand_bias;orientation;clustered_events AC=1,1;AN=4;AS_FilterStatus=SITE;AS_SB_TABLE=2458,2375|45,49;DP=15840;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=92,92;MMQ=60,60;MPOS=43;POPAF=5.46;ROQ=1;SF=17f,26f;TLOD=119.25 GT:AD:AF:F2R1:DP:SB:F1R2 . . . . 0/1:4833,94:0.018,.:0,0:4927:2458,2375,45,49:4745,90 0/2:10858,32:.,3.093e-03:0,0:10890:5440,5418,16,16:10689,32 . . . . .
    chr17 7577539 . G A . clustered_events;orientation AC=3;AN=6;AS_FilterStatus=SITE;AS_SB_TABLE=1825,1818|41,42;DP=13533;ECNT=3;GERMQ=93;MBQ=20,20;MFRL=100,100;MMQ=60,60;MPOS=49;POPAF=5.46;ROQ=1;SF=3f,29f,50f;TLOD=114.14 GT:F2R1:DP:SB:F1R2:AD:AF 0/1:0,0:3726:1825,1818,41,42:3625,82:3643,83:0.022 . . . . . 0/1:0,0:7572:2181,2173,1615,1603:4316,3216:4354,3218:0.425 . . . 0/1:0,0:2208:1079,1030,49,50:2041,97:2109,99:0.044
    chr17 7578212 . G A . orientation AC=4;AN=8;AS_FilterStatus=SITE;AS_SB_TABLE=2051,2047|25,27;DP=24436;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=75,74;MMQ=60,60;MPOS=30;POPAF=5.46;ROQ=1;SF=12f,14f,34f,48f;TLOD=60.12 GT:AF:AD:F2R1:F1R2:SB:DP . 0/1:0.012:4098,52:0,0:4051,50:2051,2047,25,27:4150 0/1:0.139:3852,621:0,0:3845,619:1929,1923,311,310:4473 . . . . . 0/1:0.345:4297,2256:0,0:2440,2254:2152,2145,1131,1125:6553 0/1:0.139:7955,1286:0,0:7871,1277:3983,3972,645,641:9241 .
    chr17 7578406 . C T . orientation;clustered_events AC=2;AN=4;AS_FilterStatus=SITE;AS_SB_TABLE=859,866|34,35;DP=2178;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=89,89;MMQ=60,60;MPOS=50;POPAF=5.46;ROQ=1;SF=16f,31f;TLOD=104.41 GT:F2R1:F1R2:SB:DP:AF:AD . . . 0/1:0,0:1648,69:859,866,34,35:1794:0.039:1725,69 . . . 0/1:0,0:66,251:66,66,126,126:384:0.655:132,252 . . .
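
To read the orientation counts out of records like these, the FORMAT keys can be paired with each sample's values. The sketch below uses the FORMAT string and the first called sample of the chr17:7577539 record above. The helper names are made up for illustration, and a proper VCF library would be preferable for real files.

```python
def parse_sample(format_field, sample_field):
    """Map FORMAT keys to values for one sample column of a VCF record."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    # A missing sample (".") yields only {"GT": "."} because zip stops early.
    return dict(zip(keys, values))


def counts(value):
    """Convert a comma-separated count field such as '3625,82' to ints."""
    return [int(x) for x in value.split(",")]


# FORMAT and first called sample of the chr17:7577539 record above
fmt = "GT:F2R1:DP:SB:F1R2:AD:AF"
sample = "0/1:0,0:3726:1825,1818,41,42:3625,82:3643,83:0.022"

fields = parse_sample(fmt, sample)
f1r2 = counts(fields["F1R2"])
f2r1 = counts(fields["F2R1"])
print("F1R2 (REF, ALT):", f1r2)        # [3625, 82]
print("F2R1 (REF, ALT):", f2r1)        # [0, 0]
print("any F2R1 reads:", sum(f2r1) > 0)  # False
```

Note that the FORMAT key order differs between the four records, so the keys have to be looked up by name rather than by position.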

    A note about the 'clustered_events' filter that also appears in those lines: we are already reviewing the ECNT value and the regions where it is flagged. We are fine with the output of this filter and have no questions about it.

    Many thanks again!!

    Maria

  • Takuto Sato

    Hi Maria,

    Thanks for reaching out. If all ref and alt reads are either F1R2 or F2R1 at every variant site, we have no way to separate real variants from read orientation artifacts. In fact, the model assumes that the F1R2/F2R1 ratio in ref reads is always balanced, so it is best to turn off the read orientation filter in your use case. This is exactly the same situation as the RNA issue you referenced.
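
If rerunning the filtering step is not convenient, one possible workaround is to drop the `orientation` label from the FILTER column of an already filtered VCF. The sketch below is a post-processing assumption, not a GATK feature, and the function names are hypothetical. It may not give results identical to rerunning the filtering step without the read orientation model.

```python
def drop_filter(filter_field, name="orientation"):
    """Remove one filter name from a VCF FILTER value.

    Returns 'PASS' when no other filter remains.
    """
    if filter_field in (".", "PASS"):
        return filter_field
    kept = [f for f in filter_field.split(";") if f != name]
    return ";".join(kept) if kept else "PASS"


def strip_orientation(vcf_line):
    """Rewrite the FILTER column (7th) of a tab-delimited VCF data line."""
    if vcf_line.startswith("#"):
        return vcf_line  # header lines are passed through unchanged
    cols = vcf_line.rstrip("\n").split("\t")
    cols[6] = drop_filter(cols[6])
    return "\t".join(cols)


# FILTER values taken from the TP53 records earlier in this thread
print(drop_filter("strand_bias;orientation;clustered_events"))
# strand_bias;clustered_events
print(drop_filter("orientation"))
# PASS
```

Variants that carried other filters, such as `strand_bias` or `clustered_events`, stay filtered, so only the sites flagged by orientation alone become PASS.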

    After turning off the read orientation filter, if you find that you have too many false positives due to FFPE (deamination/oxoG) artifacts, you might want to filter variants based on the reference bases around the variant (i.e. reference context). This should get rid of the majority of FFPE artifacts, although it might also filter real variants.
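
A rough sketch of such a reference-context filter is shown below. It assumes the usual substitution classes for the two artifact types (C>T/G>A for deamination, G>T/C>A for oxoG). The function names, the `suspect_contexts` set, and the toy reference string are all hypothetical, and which contexts to treat as suspect would have to come from your own data.

```python
# Substitution classes associated with the two artifact types named above.
# Using them as a filter is an assumption of this sketch.
DEAMINATION = {("C", "T"), ("G", "A")}
OXOG = {("G", "T"), ("C", "A")}


def artifact_class(ref, alt):
    """Label a single-base substitution by the artifact type it resembles."""
    if (ref, alt) in DEAMINATION:
        return "deamination-like"
    if (ref, alt) in OXOG:
        return "oxoG-like"
    return "other"


def context(reference, index, flank=1):
    """Return the reference bases around a 0-based index (trinucleotide by default)."""
    return reference[max(0, index - flank): index + flank + 1]


def flag_variant(ref, alt, ctx, suspect_contexts):
    """Flag a variant only if its class is artifact-like and its context is listed."""
    return artifact_class(ref, alt) != "other" and ctx in suspect_contexts


# Toy reference and a user-chosen (hypothetical) set of suspect contexts
ctx = context("ACGTT", 2)
print(ctx)                                      # CGT
print(flag_variant("G", "A", ctx, {"CGT"}))     # True
# Substitutions from the TP53 records earlier in this thread
for ref, alt in [("C", "A"), ("C", "T"), ("G", "A")]:
    print(ref, alt, artifact_class(ref, alt))
```

All of the TP53 substitutions quoted earlier in this thread fall into one of these two classes, which illustrates why a filter of this kind can also remove real variants.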

    Why are the reads at a given variant always either all F1R2 or all F2R1 in amplicon sequencing?

  • Maria Maqueda

    Hi Takuto Sato,

    I really appreciate your advice.

    I am not entirely sure why this happens (all ref and alt reads being F1R2 or F2R1) in amplicon sequencing. However, I guess it is related to what was mentioned in the RNA issue: the strands are amplified separately prior to adapter ligation. I should check with the lab technician who ran the experiment to confirm this.

    Thanks again for your input.

    Maria
