Hi, I have a few technical questions about possible germline sites and 'normal mosaic' sites in Mutect2 (4.2.0).
1) What exactly does the 'germline' filter in the final Mutect2 VCF mean? It isn't clear from the documentation whether these are only SNVs that were found in the --germline-resource, or whether they also include variants found in the matched normal.
2) Are all variants that Mutect2 calls in both the tumor and the normal samples present in the final VCF? Or is there an initial internal filtering step that uses information from the tumor and normal samples to call only likely somatic variants, so that the final VCF does not contain all germline sites? In other words, do true germline sites, found in both the normal sample and the tumor, appear in the final Mutect2 VCF? I'm assuming there is such an internal step, because the number of variants in the final Mutect2 VCF seems too small to include all germline variants. But if that is the case, what exactly are the sites labeled 'germline' in the final VCF?
3) Mutect2's documentation has an error. The --genotype-germline-sites and --genotype-pon-sites options have the same description: "Usually we exclude sites in the panel of normals from active region determination, which saves time. Setting this to true causes Mutect to produce a variant call at these sites. This call will still be filtered, but it shows up in the vcf."
What is the difference between the two options, and what does --genotype-germline-sites do?
4) Do you have a suggestion for how to detect variants that are present in both the tumor sample and the normal sample, but at such a low level in the normal sample that they might be true somatic variants in the normal tissue? This is similar to tumor-in-normal contamination but not quite the same: only a very small subset of tumor variants will have this property, whereas with tumor-in-normal contamination many tumor variants will be found in the normal sample.
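To make question 4 concrete, here is a minimal sketch of the kind of post-hoc check I have in mind. It is plain Python over VCF text lines, not anything Mutect2 provides. It assumes the VCF carries a per-sample AD (allelic depths) FORMAT field, and it can only see sites that Mutect2 actually writes to the VCF, which is part of what question 2 is asking. The function names and all three thresholds are hypothetical placeholders, not Mutect2 defaults.

```python
def allele_fraction(ref_count, alt_count):
    """Alt allele fraction from allelic depths; 0.0 when there is no coverage."""
    depth = ref_count + alt_count
    return alt_count / depth if depth else 0.0


def read_ad(format_keys, sample_field):
    """Return (ref_count, alt_count) from one biallelic sample column."""
    values = dict(zip(format_keys.split(":"), sample_field.split(":")))
    ref_count, alt_count = (int(x) for x in values["AD"].split(","))
    return ref_count, alt_count


def low_level_in_normal(vcf_lines, tumor, normal,
                        min_normal_alt=2,     # hypothetical threshold
                        max_normal_af=0.05,   # hypothetical threshold
                        min_tumor_af=0.10):   # hypothetical threshold
    """Collect sites with clear tumor support and low but nonzero normal support.

    Each hit is (chrom, pos, ref, alt, filter, tumor_af, normal_af).
    """
    hits = []
    samples = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if samples is None:
            raise ValueError("no #CHROM header line before the first record")
        if "," in fields[4]:
            continue  # multiallelic sites are skipped to keep the sketch simple
        t_ref, t_alt = read_ad(fields[8], fields[9 + samples.index(tumor)])
        n_ref, n_alt = read_ad(fields[8], fields[9 + samples.index(normal)])
        t_af = allele_fraction(t_ref, t_alt)
        n_af = allele_fraction(n_ref, n_alt)
        if n_alt >= min_normal_alt and n_af <= max_normal_af and t_af >= min_tumor_af:
            hits.append((fields[0], int(fields[1]), fields[3], fields[4],
                         fields[6], t_af, n_af))
    return hits
```

I would call it as low_level_in_normal(open("filtered.vcf"), tumor="TUMOR", normal="NORMAL"), where the file name and sample names are placeholders for my own data. The FILTER value is kept in each hit so I can see how Mutect2 labeled these sites, and my worry is that this kind of threshold cannot distinguish a real low-level variant in the normal from sequencing noise or contamination.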