ApplyVQSR AS_FilterStatus=VQSRTrancheSNPXXXXtoXXXX but variant still passes FILTER
Hello,
I am using GATK 4.2 to perform germline calling. To reduce the number of false positives, I use the VariantRecalibration workflow with the recommended resources.
After running `ApplyVQSR` in SNP mode, I notice that many SNPs have "PASS" in the VCF FILTER column although, according to their `AS_FilterStatus`, they should have been filtered.
Here are two such variants:
1 2424417 . T C 2265.06 PASS AC=2;AF=1.00;AN=2;AS_BaseQRankSum=.;AS_FS=0.000;AS_FilterStatus=VQSRTrancheSNP97.00to98.00;AS_MQ=60.00;AS_MQRankSum=.;AS_QD=33.31;AS_ReadPosRankSum=.;AS_SOR=1.352;AS_VQSLOD=6.0834;AS_culprit=MQ;DP=77;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=59.72;POSITIVE_TRAIN_SITE;QD=33.31;SOR=1.385 GT:AD:DP:GQ:PL 1/1:0,68:69:99:2279,204,0
1 4246829 . T G 965.64 PASS AC=1;AF=0.500;AN=2;AS_BaseQRankSum=-5.000;AS_FS=19.511;AS_FilterStatus=VQSRTrancheSNP99.00to99.30;AS_MQ=59.74;AS_MQRankSum=-1.700;AS_QD=10.50;AS_ReadPosRankSum=1.700;AS_SOR=1.277;AS_VQSLOD=0.0526;AS_culprit=FS;BaseQRankSum=-4.925e+00;DP=96;ExcessHet=3.0103;FS=19.511;MLEAC=1;MLEAF=0.500;MQ=59.89;MQRankSum=-1.606e+00;NEGATIVE_TRAIN_SITE;POSITIVE_TRAIN_SITE;QD=10.50;ReadPosRankSum=1.80;SOR=1.277 GT:AD:DP:GQ:PL 0/1:52,40:92:99:973,0,1575
For all steps I am using the "Allele specific" calling pipeline.
I read in the `ApplyVQSR` documentation that if one allele passes, the whole site will be PASS. However, as you can see in my examples, each site has only a single alternate allele, and that allele fails the quality control.
The specific command I am using:
gatk ApplyVQSR -V cohort.indel.recalibrated.vcf.gz --recal-file cohort_snp.recal --tranches-file cohort_snp.tranches --truth-sensitivity-filter-level 97 -mode SNP -AS -O cohort.recalibrated.vcf.gz
If somebody could point out why I am seeing this or what I am doing wrong, I would be very grateful!
Cheers!
-
Hi nhaus,
Thanks for writing in. I think this touches on a common confusion about ApplyVQSR and what to expect in the FILTER and INFO fields. There was a previous conversation on the forum where this topic was discussed: https://gatk.broadinstitute.org/hc/en-us/community/posts/360076810671-ApplyVQSR-tranche-sensitivity-filtering
Please check that out and see if it gives some insight into why you are seeing the results you are seeing. I believe it still looks like ApplyVQSR is working correctly for you, though please let me know if I have missed something.
Best,
Genevieve
-
Hi Genevieve,
Thank you very much for getting back to me so quickly.
Honestly, I am still a bit puzzled by the results I was seeing. In the meantime I figured out what my problem was, and now everything works as I suppose is intended. Specifically, if the AS_FilterStatus is VQSRTrancheSNPxxx, then the site doesn't pass.
I suspect that the problem was that I ran ApplyVQSR twice on my dataset: the first time with a sensitivity threshold of 99%, and the second time with 97% on the output of the 99% run. When I used 97% right away, everything worked.
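For anyone who wants to check their own output for the same symptom, here is a minimal Python sketch that lists records whose FILTER column is PASS although every `AS_FilterStatus` entry is a `VQSRTranche...` label. The function name `suspicious_pass_records` is made up for this example, the second record in the demo is hypothetical, and the assumption that multi-allelic sites separate their per-allele statuses with `,` (or `|`) should be checked against your own VCF header.

```python
import re


def suspicious_pass_records(vcf_lines):
    """Yield (chrom, pos, statuses) for PASS records whose AS_FilterStatus
    entries are all VQSRTranche labels (i.e. no allele is marked PASS)."""
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, filt, info = fields[0], fields[1], fields[6], fields[7]
        if filt != "PASS":
            continue
        match = re.search(r"(?:^|;)AS_FilterStatus=([^;]+)", info)
        if match is None:
            continue
        # Assumption: per-allele statuses are separated by "," or "|".
        statuses = re.split(r"[,|]", match.group(1))
        if all(s.startswith("VQSRTranche") for s in statuses):
            yield chrom, int(pos), statuses


# Demo: the first record is abbreviated from the post above,
# the second one is a hypothetical record that is correctly PASS.
example = [
    "##fileformat=VCFv4.2",
    "\t".join(["1", "2424417", ".", "T", "C", "2265.06", "PASS",
               "AC=2;AS_FilterStatus=VQSRTrancheSNP97.00to98.00;AS_VQSLOD=6.0834"]),
    "\t".join(["1", "100", ".", "A", "G", "50", "PASS",
               "AC=1;AS_FilterStatus=PASS"]),
]
hits = list(suspicious_pass_records(example))
print(hits)
```

On a real file, the same function can be fed an open handle, for example `gzip.open("cohort.recalibrated.vcf.gz", "rt")`. An empty result after a single `ApplyVQSR` run at the final threshold would be consistent with the explanation above.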
I hope that this helps anybody who runs into a similar problem!
Cheers.
-
Thanks for the follow-up with the solution! That makes sense, and I'm glad you figured it out.
Let us know if you have other questions!