Low-AF variant with contamination tag
Hi all,
I am using GATK v4.5.0.0 Mutect2 to call somatic variants in tumor-only samples.
I followed the Best Practices tutorial exactly (mapping, deduplication, calling, and filtering) to generate the final file.
Recently, I found that one ctDNA sample had an estimated contamination of 0.006714243642768898, which resulted in many variants carrying the contamination tag in the final VCF.
For example, a common EGFR 19DEL was flagged as "contaminated" because its AF is 0.00483, which is below the sample's contamination value. The AD of the mutant allele is 11, and the DP is 2278 (1189,1078,4,7).
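To make the numbers above easy to check, here is a minimal Python sketch of the arithmetic. The variable names are hypothetical, and the plain "AF below contamination" comparison is only an illustration of why the call looks suspicious; it is not the probabilistic model that the filter itself uses.

```python
# Values quoted in the post; variable names are illustrative only.
contamination = 0.006714243642768898
depth_components = (1189, 1078, 4, 7)  # the four counts listed in parentheses after DP
alt_reads = 11                          # AD of the mutant allele

depth = sum(depth_components)           # total depth reported as DP
allele_fraction = alt_reads / depth     # AF implied by AD and DP

# Naive comparison only; the real filter uses a probabilistic model.
below_contamination = allele_fraction < contamination

print(f"DP={depth} AF={allele_fraction:.5f} below contamination: {below_contamination}")
```

The four counts sum to the reported DP of 2278, and 11/2278 reproduces the reported AF of about 0.00483.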
After carefully reviewing the variant in IGV, we confirmed that it is a true positive, and it was later validated by ddPCR.
Do you have any suggestions for handling variants that carry the contamination tag?
Best,
Junfeng
-
We recommend checking our document on how we estimate whether a low-frequency variant is a true variant or a contamination artifact:
https://github.com/broadinstitute/gatk/blob/master/docs/mutect/mutect.pdf
The higher the population allele frequency of a variant, the more likely it is to be filtered out as a contaminant from other samples. If this mutation was confirmed in the tumor biopsy, or in an alternate ctDNA sample from the same patient that was not used in the initial sequencing, you may want to reconsider how you apply the contamination filter to your ctDNA samples. ctDNA samples contain true variants at even lower allele fractions, so the contamination filter may not ultimately be useful for selecting variants when screening ctDNA sequencing results. I have consulted our team about this question and they may post more insights later.
-
Yes, I have read the section on the contamination filter in the document and understand the algorithm used to calculate the contamination probability.
Our statistics show that the contamination estimate is consistently around 0.005, which substantially affects variant selection, especially for drug-response variants.
This may not be the right place to ask, but do you have any suggestions for reducing contamination at the experimental stage?
-
Hi again.
Contamination here could be due partly to index jumps and partly to droplet contamination during the experiment, neither of which can easily be eliminated, especially at levels around 0.005.
Our team suggests that for your purpose it may be better not to use this filter at all.
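One way to act on this is to rerun FilterMutectCalls without passing `--contamination-table`. If rerunning is not convenient, the minimal Python sketch below removes one filter name from the FILTER column of an existing VCF. It assumes the filter ID is `contamination` (check the `##FILTER` header lines of your own file), the function name `drop_filter` and the example records are made up for illustration, and editing FILTER after the fact is not guaranteed to give the same result as refiltering without the contamination table.

```python
def drop_filter(vcf_lines, filter_name="contamination"):
    """Yield VCF lines with filter_name removed from the FILTER column.

    Records left with no other filter are set to PASS.
    The default filter_name is an assumption; verify it in your VCF header.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Header lines and blank lines pass through unchanged.
        if not line or line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        # FILTER is the 7th tab-separated column of a VCF record.
        kept = [f for f in fields[6].split(";") if f != filter_name]
        fields[6] = ";".join(kept) if kept else "PASS"
        yield "\t".join(fields)


# Made-up records for illustration only; they are not from the sample discussed here.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tT\t.\tcontamination\t.",
    "chr1\t200\t.\tG\tC\t.\tcontamination;weak_evidence\t.",
    "chr1\t300\t.\tC\tG\t.\tPASS\t.",
]

for record in drop_filter(example):
    print(record)
```

Records that fail other filters stay filtered, so only calls flagged solely for contamination become PASS.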
Regards.