Hi, I have a few technical questions about changes in which variants are filtered when running Mutect2 with --genotype-germline-sites.
I ran Mutect2 on matched tumor-normal data with and without --genotype-germline-sites. Everything else about the two runs was identical.
When I compared the output VCFs, I noticed differences in which variants pass all filters. Each run had variants that passed only in that run: some variants were marked PASS when Mutect2 was run with --genotype-germline-sites but failed with standard settings, and vice versa.
When I looked through these variants, I noticed two distinct patterns:
PASS variants unique to the --genotype-germline-sites run: The variants that passed with --genotype-germline-sites but were rejected in the standard run failed the standard run because of the "strand_bias" filter. The "strand_bias" filter flags more variants in the standard run than in the --genotype-germline-sites run. In IGV these variants look like false positives, so for some reason running Mutect2 with --genotype-germline-sites seems to prevent this filter from working properly.
PASS variants unique to the standard run: These variants all passed in the standard run but were rejected in the --genotype-germline-sites run because of the "haplotype" or "clustered_events" filters. I believe this is a potential problem with --genotype-germline-sites: bona fide somatic variants that happen to lie close to germline sites get filtered. When you genotype germline sites, an active region is created around each germline variant as well as around each somatic variant, so a germline variant is more likely to end up in the same active region as a somatic variant. It seems that running with --genotype-germline-sites will therefore produce false negatives, because these somatic variants get filtered out.
This is not an insubstantial number of variants: the --genotype-germline-sites run returned 3910 PASS variants, and 123 variants passed in the standard run but failed in the --genotype-germline-sites run solely because of the "haplotype" or "clustered_events" filters (likely false negatives).
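In case it helps to reproduce the comparison, here is a minimal Python sketch of how the run-specific PASS variants can be tallied by the filter that rejected them in the other run. It is not my exact script: the function names (read_filters, unique_pass) are made up for illustration, the records are toy data rather than real calls, and it assumes site-level FILTER values in column 7 of a plain-text VCF, keyed by CHROM, POS, REF and ALT.

```python
from collections import Counter

def read_filters(vcf_lines):
    """Map (chrom, pos, ref, alt) -> set of FILTER values for each VCF record."""
    records = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        records[(chrom, pos, ref, alt)] = set(filt.split(";"))
    return records

def is_pass(filters):
    return filters == {"PASS"}

def unique_pass(run_a, run_b):
    """Tally the filters run_b applied to variants that pass only in run_a."""
    tally = Counter()
    for key, filters in run_a.items():
        if not is_pass(filters):
            continue
        other = run_b.get(key)
        if other is None:
            tally["not_called"] += 1  # site absent from run_b entirely
        elif not is_pass(other):
            tally.update(other)
    return tally

# Toy records (hypothetical, not taken from my data).
standard_lines = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\tT\t.\tPASS\t.",
    "chr1\t200\t.\tG\tC\t.\tstrand_bias\t.",
]
germline_lines = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\tT\t.\thaplotype;clustered_events\t.",
    "chr1\t200\t.\tG\tC\t.\tPASS\t.",
]

standard = read_filters(standard_lines)
germline = read_filters(germline_lines)

only_standard = unique_pass(standard, germline)
only_germline = unique_pass(germline, standard)
print("PASS only in standard run, rejected by:", dict(only_standard))
print("PASS only in germline run, rejected by:", dict(only_germline))
```

For real files, the same functions can be fed an open file handle, for example open(path) for an uncompressed VCF or gzip.open(path, "rt") for a .vcf.gz file.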
Do you have any suggestions for how to get around these two issues? One way I can think of to address the second issue is to ignore the "haplotype" and "clustered_events" filters when running with --genotype-germline-sites, but this would introduce false positives into the variant calls. Is there a way to increase the number of nearby events required to trigger the haplotype/clustered_events filters? Changing this could also recover the false negatives. I am not sure how to solve the first issue, in which the "strand_bias" filter stops working as well when running with --genotype-germline-sites.
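To make the first workaround concrete, this is roughly what I mean by ignoring the two filters after the fact. It is only a sketch: passes_ignoring and IGNORED are hypothetical names, it works on the FILTER column as a string, and it treats an unfiltered record (".") as not passing.

```python
# Filters to disregard when deciding whether a record should be kept.
IGNORED = frozenset({"haplotype", "clustered_events"})

def passes_ignoring(filter_field, ignored=IGNORED):
    """True if the record is PASS or fails only filters listed in `ignored`."""
    filters = set(filter_field.split(";"))
    if filters == {"PASS"}:
        return True
    # Keep the record only if every filter it failed is one we chose to ignore.
    return filters <= ignored

for value in ["PASS", "haplotype", "haplotype;clustered_events",
              "strand_bias;haplotype", "."]:
    print(value, "->", passes_ignoring(value))
```

The obvious drawback is the one mentioned above: a record that fails only "haplotype" or "clustered_events" for a good reason would be kept as well.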