I am using mutational data (SNVs) downloaded from TCGA in 2021.
When examining the FILTER column in the MAF/VCF files, I see the following filters:
I have several questions:
1. I found technical documentation on Mutect2 and FilterMutectCalls on GitHub, but it seems to describe a newer version. Is there any technical documentation about these filters, and more generally about the Mutect2 and FilterMutectCalls versions that were used for TCGA variants in 2021?
2. I also recently downloaded a VCF file from the TCGA website, and it had the same filters as my data from 2021. Does this mean that the variant calling pipeline for TCGA variants is still not using the latest version of FilterMutectCalls?
3. Most importantly, it seems that the vast majority of variants (70-99%, depending on cancer type) are tagged as errors or germline variants. How should I account for this in my analysis? Why are these variants still kept in TCGA if this is the case?
Thank you very much,