ApplyVQSR AS_FilterStatus=VQSRTrancheSNPXXXXtoXXXX but variant still passes FILTER
I am using GATK 4.2 to perform germline calling. To reduce the number of false positives, I use the VariantRecalibration workflow with the recommended resources.
After using `ApplyVQSR` in the SNP mode, I notice that many SNPs have "PASS" in the VCF FILTER column, although they should have been filtered, according to the `AS_FilterStatus`.
Here are two such variants:
1 2424417 . T C 2265.06 PASS AC=2;AF=1.00;AN=2;AS_BaseQRankSum=.;AS_FS=0.000;AS_FilterStatus=VQSRTrancheSNP97.00to98.00;AS_MQ=60.00;AS_MQRankSum=.;AS_QD=33.31;AS_ReadPosRankSum=.;AS_SOR=1.352;AS_VQSLOD=6.0834;AS_culprit=MQ;DP=77;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=59.72;POSITIVE_TRAIN_SITE;QD=33.31;SOR=1.385 GT:AD:DP:GQ:PL 1/1:0,68:69:99:2279,204,0
1 4246829 . T G 965.64 PASS AC=1;AF=0.500;AN=2;AS_BaseQRankSum=-5.000;AS_FS=19.511;AS_FilterStatus=VQSRTrancheSNP99.00to99.30;AS_MQ=59.74;AS_MQRankSum=-1.700;AS_QD=10.50;AS_ReadPosRankSum=1.700;AS_SOR=1.277;AS_VQSLOD=0.0526;AS_culprit=FS;BaseQRankSum=-4.925e+00;DP=96;ExcessHet=3.0103;FS=19.511;MLEAC=1;MLEAF=0.500;MQ=59.89;MQRankSum=-1.606e+00;NEGATIVE_TRAIN_SITE;POSITIVE_TRAIN_SITE;QD=10.50;ReadPosRankSum=1.80;SOR=1.277 GT:AD:DP:GQ:PL 0/1:52,40:92:99:973,0,1575
For all steps I am using the "Allele specific" calling pipeline.
I read in the `ApplyVQSR` documentation that if one allele passes, the whole site will be PASS. However, as my examples show, each site has only a single alternate allele, and that allele fails the quality control.
The specific command I am using:
gatk ApplyVQSR -V cohort.indel.recalibrated.vcf.gz --recal-file cohort_snp.recal --tranches-file cohort_snp.tranches --truth-sensitivity-filter-level 97 -mode SNP -AS -O cohort.recalibrated.vcf.gz
If somebody could point out why I am seeing this or what I am doing wrong, I would be very grateful!
Thanks for writing in. I think this touches on a common confusion about ApplyVQSR and what to expect in the FILTER and INFO fields. A previous conversation on the forum discussed this topic: https://gatk.broadinstitute.org/hc/en-us/community/posts/360076810671-ApplyVQSR-tranche-sensitivity-filtering
Please check that out and see if it explains the results you are seeing. I believe ApplyVQSR is working correctly for you, though please let me know if I have missed something.
Thank you very much for getting back to me so quickly.
Honestly, I am still a bit puzzled by the results I was seeing. In the meantime I figured out what my problem was, and now everything works as I assume is intended. Specifically, if the AS_FilterStatus is VQSRTrancheSNPxxx, then the site doesn't pass.
I suspect that the problem was that I ran ApplyVQSR twice on my dataset: the first time with a sensitivity threshold of 99%, and the second time with 97% (on the data which resulted from the 99% threshold run). When I used 97% right away, everything worked.
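For anyone who wants to check whether their own output is affected, here is a minimal Python sketch (not part of GATK) that lists records where FILTER is PASS although every allele in AS_FilterStatus names a VQSR tranche. The function names `open_vcf` and `find_suspect_records` are made up for illustration, and the script assumes a tab-separated VCF in which per-allele AS_FilterStatus values are separated by "," or "|"; please check that against your own file.

```python
import gzip
import re


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_suspect_records(lines):
    """Yield (chrom, pos, status) for records that are PASS in FILTER
    although every allele listed in AS_FilterStatus names a VQSR tranche."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        # Column 7 (index 6) is FILTER, column 8 (index 7) is INFO.
        if len(fields) < 8 or fields[6] != "PASS":
            continue
        status = None
        for entry in fields[7].split(";"):
            if entry.startswith("AS_FilterStatus="):
                status = entry.split("=", 1)[1]
                break
        if status is None:
            continue
        # Assumption: per-allele values are separated by "," or "|".
        per_allele = [s for s in re.split(r"[,|]", status) if s]
        if per_allele and all(s.startswith("VQSRTranche") for s in per_allele):
            yield fields[0], int(fields[1]), status
```

Calling `find_suspect_records(open_vcf("cohort.recalibrated.vcf.gz"))` and printing the results should give an empty list for a file that was filtered in a single ApplyVQSR pass, if my understanding of the intended behaviour is right.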
I hope that this helps anybody who runs into a similar problem!
Thanks for following up with the solution! That makes sense, and I'm glad you figured it out.
Let us know if you have other questions!