Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

VariantFiltration undefined variable

Answered

5 comments

  • Genevieve Brandt (she/her)

    Hi wbsimey, could you provide more information to determine if the AB and MQ0 annotations exist in your VCF? Please share the ##INFO lines in the header of your VCF. Also, the lines of your VCF that you shared above are from non-ref blocks (scaffold_6 1 . T <NON_REF>) because you are selecting variants from a GVCF. It would be more helpful to see lines where there are variants (instead of <NON_REF> you will see the variant allele). Please share an example of those as well, since the AB score and MQ0 should be calculated at those locations.

  • wbsimey

    Hello Genevieve,

    It looks like the AB and MQ0 annotations are not here. When are these annotations created? I do not think I am doing anything differently from previous GATK4 versions: I am using the same data, and these two annotations are included in my previous VCF files.

    I think I figured out the <NON_REF> issue: I had slightly different versions of my reference file and used them interchangeably through my pipeline (HaplotypeCaller > GenomicsDBImport > GenotypeGVCFs > VariantFiltration).

    ##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
    ##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
    ##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
    ##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
    ##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
    ##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
    ##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
    ##INFO=<ID=FS,Number=1,Type=Float,Description="Phred-scaled p-value using Fisher's exact test to detect strand bias">
    ##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
    ##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
    ##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
    ##INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth">
    ##INFO=<ID=RAW_MQandDP,Number=2,Type=Integer,Description="Raw data (sum of squared MQ and total depth) for improved RMS Mapping Quality calculation. Incompatible with deprecated RAW_MQ formulation.">
    ##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
    ##INFO=<ID=SOR,Number=1,Type=Float,Description="Symmetric Odds Ratio of 2x2 contingency table to detect strand bias">
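
One way to check whether an annotation is declared is to scan the `##INFO` header lines for its ID. The following is a minimal Python sketch, not part of GATK; `info_ids` and `missing_annotations` are hypothetical helper names, and the sample `header` contains four of the lines quoted above.

```python
import re

# Captures the ID of a VCF header line such as ##INFO=<ID=MQ,Number=1,...>
INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")


def info_ids(lines):
    """Return the set of INFO IDs declared in the header part of a VCF."""
    ids = set()
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            break  # first record reached: the header is over
        match = INFO_ID.match(stripped)
        if match:
            ids.add(match.group(1))
    return ids


def missing_annotations(lines, wanted):
    """Return the wanted IDs that the header does not declare, sorted."""
    return sorted(set(wanted) - info_ids(lines))


# Four of the header lines quoted above, copied verbatim.
header = [
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">',
    '##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">',
    '##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">',
    '##INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth">',
]

print(missing_annotations(header, ["AB", "MQ0", "MQ", "QD"]))  # ['AB', 'MQ0']
```

Any iterable of lines works as input, so an open file handle for an uncompressed VCF can be passed in place of the `header` list.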

     

    scaffold_12 686 . G A 465.24 . AC=7;AF=0.085;AN=82;BaseQRankSum=-2.220e-01;DP=309;ExcessHet=4.2875;FS=0.000;InbreedingCoeff=-0.0094;MLEAC=8;MLEAF=0.098;MQ=55.26;MQRankSum=-9.670e-01;QD=11.63;ReadPosRankSum=0.00;SOR=0.556 GT:AD:DP:GQ:PGT:PID:PL:PS 0/0:1,0:1:3:.:.:0,3,32 0/1:4,4:8:99:.:.:137,0,149 0/0:2,0:2:6:.:.:0,6,78 0/0:6,0:6:18:.:.:0,18,239 0|1:2,2:6:75:0|1:643_C_G:75,0,78:643 0/0:2,0:2:6:.:.:0,6,78 0/0:10,0:10:30:.:.:0,30,391 0/1:4,2:6:34:.:.:34,0,137 0/1:10,3:14:84:.:.:84,0,394 0/0:4,0:4:12:.:.:0,12,155 0/0:2,0:2:6:.:.:0,6,55 0/0:2,0:2:6:.:.:0,6,84 0/0:7,0:7:21:.:.:0,21,266 0/0:7,0:7:21:.:.:0,21,292 0/0:7,0:7:18:.:.:0,18,270 0/0:3,0:3:9:.:.:0,9,99 0/0:2,0:2:0:.:.:0,0,3 0/0:14,0:14:42:.:.:0,42,573 0/0:4,0:4:12:.:.:0,12,141 0/0:5,0:5:15:.:.:0,15,194 ./.:2,0:2:.:.:.:0,0,0 0/0:29,0:29:81:.:.:0,81,1215 0/0:7,0:7:21:.:.:0,21,224 0/0:10,0:10:30:.:.:0,30,372 ./.:0,0:0:.:.:.:0,0,0 0/0:21,0:21:60:.:.:0,60,696 0/0:1,0:1:3:.:.:0,3,29 0/0:6,0:6:18:.:.:0,18,224 0/0:4,0:4:12:.:.:0,12,141 0/0:35,0:35:99:.:.:0,102,1280 0/0:3,0:3:9:.:.:0,9,81 0/0:8,0:8:21:.:.:0,21,315 0/0:9,0:9:24:.:.:0,24,360 0/1:1,2:3:36:.:.:71,0,36 0/0:5,0:5:15:.:.:0,15,175 0|1:4,2:6:72:0|1:643_C_G:72,0,161:643 0/0:5,0:5:15:.:.:0,15,182 0/0:14,0:14:36:.:.:0,36,540 0/0:3,0:3:9:.:.:0,9,85 0/0:17,0:17:51:.:.:0,51,646 0/1:1,1:2:34:.:.:38,0,34 0/0:3,0:3:9:.:.:0,9,104 0/0:3,0:3:0:.:.:0,0,44
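
The INFO column of a record like the one above can be inspected in the same spirit. This Python sketch splits the INFO field into key/value pairs and lists which names used in a filter expression are absent from it, assuming the expression refers to INFO annotations by name. The expression string is a made-up example (the original VariantFiltration command is not shown in this thread), `parse_info` and `expression_variables` are hypothetical helpers, and the identifier extraction is a rough regular expression, not a JEXL parser.

```python
import re


def parse_info(info_field):
    """Split a VCF INFO column such as 'AC=7;AF=0.085' into a dict.

    Flag-style entries without '=' are stored with the value True.
    """
    info = {}
    for entry in info_field.split(";"):
        if not entry or entry == ".":
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def expression_variables(expression):
    """Rough guess at the variable names used in a filter expression.

    Assumption: a variable is a word that starts with a letter or underscore.
    """
    return set(re.findall(r"\b[A-Za-z_]\w*", expression))


# INFO column of the scaffold_12:686 record, shortened to a few keys.
info = parse_info("AC=7;AF=0.085;AN=82;DP=309;FS=0.000;MQ=55.26;QD=11.63;SOR=0.556")

# Hypothetical filter expression, for illustration only.
expression = "AB < 0.2 || MQ0 > 50 || QD < 2.0"

undefined = sorted(expression_variables(expression) - set(info))
print(undefined)  # ['AB', 'MQ0']
```

Names reported here are the ones a filter could not resolve for this record, which matches the situation described in this thread: QD is present, AB and MQ0 are not.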

  • Genevieve Brandt (she/her)

    Hi wbsimey, you mentioned that you did this same workflow before with an older version of GATK. What version was that? Was it with the same data? 

    I also noticed that you are using SelectVariants to select from your GenomicsDB database and not from the VCF produced by GenotypeGVCFs. Are the MQ0 and AB annotations in the file after GenotypeGVCFs is run?

    Also can you check if these annotations exist in the unfiltered VCF following HaplotypeCaller? Please check in the output from the older version of GATK and now. 
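
To make that comparison concrete, the declared INFO IDs of two VCFs (for example the outputs of the older and the current GATK version, or the HaplotypeCaller and GenotypeGVCFs outputs) can be diffed. This is a minimal Python sketch under stated assumptions: `read_info_ids` and `compare_headers` are hypothetical helpers, not GATK functionality, and compressed files are assumed to carry a `.gz` suffix.

```python
import gzip
import re

INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")


def read_info_ids(path):
    """Collect INFO IDs from the header of a plain or gzip-compressed VCF."""
    opener = gzip.open if str(path).endswith(".gz") else open
    ids = set()
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # header ends at the first record
            match = INFO_ID.match(line)
            if match:
                ids.add(match.group(1))
    return ids


def compare_headers(old_path, new_path):
    """Return (only_in_old, only_in_new) as sorted lists of INFO IDs."""
    old_ids = read_info_ids(old_path)
    new_ids = read_info_ids(new_path)
    return sorted(old_ids - new_ids), sorted(new_ids - old_ids)
```

A call such as `compare_headers("old_version.vcf.gz", "new_version.vcf.gz")` (placeholder file names) returns the IDs declared only in the first file and those declared only in the second; if AB and MQ0 appear in the first list, the older output declared them and the newer one does not.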

     

  • astrinaki_maria

    Hello Wbsimey and Genevieve,

    I am facing the same problem. Did you find a solution?

    Best regards,

    Astrinaki Maria

  • Genevieve Brandt (she/her)

    astrinaki_maria, if you still have not solved your issue, please create a new post and we can help you there. Here is an article with our forum guidelines: https://gatk.broadinstitute.org/hc/en-us/articles/360053845952-Forum-Guidelines
