VQSR vs Hard Filtering
Which filters to use depends on your data and on the sensitivity and precision you want to achieve. You're correct that VQSR is generally recommended, and the filtering you have already done should be sufficient. If you wish, you could also hard filter your variants with VariantFiltration, using filters such as QD<2, QUAL<30, MQRankSum, etc., and compare the results with what VQSR gives you. However, I think VQSR should give you high enough accuracy. I hope this helps answer your question.
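To make the hard-filter idea concrete, here is a minimal Python sketch of the kind of threshold logic described above. It is not how VariantFiltration is implemented and not a replacement for it. The QD<2 and QUAL<30 cutoffs come from the reply; the MQRankSum cutoff of -12.5, the filter names, the function names, and the example records are assumptions made up for illustration.

```python
# Illustrative hard-filter check on one tab-separated VCF data line.
# QD < 2.0 and QUAL < 30 are the thresholds mentioned in the thread.
# MQRankSum < -12.5 is an ASSUMED cutoff; adjust it to your own data.

def parse_info(info_field):
    """Turn 'QD=1.5;MQRankSum=-0.3;DB' into a dict (flags map to True)."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key] = value
        elif item and item != ".":
            info[item] = True
    return info


def failed_filters(vcf_line):
    """Return the names of the (hypothetical) hard filters this record fails."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]               # column 6 of a VCF line is QUAL
    info = parse_info(fields[7])   # column 8 of a VCF line is INFO
    failed = []
    if qual != "." and float(qual) < 30.0:
        failed.append("QUAL30")
    # In this sketch a missing annotation is skipped, not treated as a failure.
    if "QD" in info and float(info["QD"]) < 2.0:
        failed.append("QD2")
    if "MQRankSum" in info and float(info["MQRankSum"]) < -12.5:
        failed.append("MQRankSum-12.5")
    return failed


# Made-up records, not taken from any real call set.
low_quality = "chr1\t100\t.\tA\tG\t25.0\t.\tQD=1.5;MQRankSum=-0.3"
good = "chr1\t200\t.\tC\tT\t500.0\t.\tQD=12.0;MQRankSum=0.1"

print(failed_filters(low_quality))  # ['QUAL30', 'QD2']
print(failed_filters(good))         # []
```

Running the same thresholds over both the VQSR-passing and VQSR-failing sites is one way to compare the two approaches, as suggested above.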
Thank you so much for your detailed answer. Below are two variants that illustrate my doubt; I hope that's OK. (I copied just 3 representative samples, since I have >200, all with DP<10.)
So, should I consider them as true variants, even though both have really low DP across all samples?
Or should I keep only variants that have at least two samples with DP>10, for example?
I understand your concern about these variants. However, filtering on the DP annotation is not recommended for exome samples, because depth varies so much across exome data. With whole-genome data, the filtering tools treat that kind of variation as a sign of error, but with exome data, which you mentioned you are working with, it isn't necessarily indicative of error. Given that you have a large number of samples, VQSR should still be suitable for filtering, but the DP filter shouldn't be used with exomes. I hope this is helpful.
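If you still want to look at per-sample depth for a site, for inspection only and not as an exome filter, a small Python sketch like the following counts how many samples exceed a DP threshold. The function name and the example record are hypothetical; the sketch only assumes the standard VCF layout, with FORMAT in column 9 and one column per sample after it.

```python
# Count samples whose per-sample DP exceeds a threshold on one VCF data line.
# Intended for inspecting depth, not for filtering exome calls on DP.

def count_samples_above_dp(vcf_line, threshold=10):
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")   # e.g. ['GT', 'AD', 'DP']
    if "DP" not in format_keys:
        return 0
    dp_index = format_keys.index("DP")
    count = 0
    for sample in fields[9:]:
        values = sample.split(":")
        # Samples may drop trailing fields (e.g. './.') or report '.' for DP.
        if dp_index < len(values) and values[dp_index].isdigit():
            if int(values[dp_index]) > threshold:
                count += 1
    return count


# Made-up record with four samples; DP values are 5, 12, missing, and 11.
record = "chr1\t100\t.\tA\tG\t50.0\t.\tQD=8.0\tGT:AD:DP\t0/1:3,2:5\t0/0:12,0:12\t./.\t0/1:6,5:11"
print(count_samples_above_dp(record))  # 2
```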
OK, I understand it. Again, thanks a lot for explaining this. It is truly helpful for me.