CalculateContamination inputs for paired tumor-normal data
Answered
REQUIRED for all errors and issues:
a) GATK version used: 4.2.2.0
I am running Mutect2 in tumor-normal mode, which gives ONE output file. I am puzzled about which inputs I should use for CalculateContamination. My input for GetPileupSummaries is the ONE file that is output by Mutect2. So which file should be provided for the --matched-normal parameter, since I have not run GetPileupSummaries on the normal samples?
-
Thank you for your post, Robin Mjelle ! I want to let you know we have received your question and will be moving it to the Community Discussions -> General Discussion topic, as the Somatic topic is for reporting bugs and issues with GATK.
We'll get back to you if we have any updates or follow up questions. Please see our Support Policy for more details about how we prioritize responding to questions.
-
Dear Robin,
based on the current best practices and tool versions, the CalculateContamination part of the variant calling workflow is independent of the Mutect2 calling step. The contamination model output is only needed for the FilterMutectCalls step, which identifies false positives in the variant calls.
You run GetPileupSummaries on the input BAMs of the tumor sample and the normal sample (the same BAMs you give to Mutect2, not the Mutect2 output). This gives you tumor_pileups and normal_pileups. Then you call roughly:
gatk CalculateContamination \
--input ~{tumor_pileups} \
~{"--matched-normal " + normal_pileups} \
--output ~{output_contamination} \
--tumor-segmentation ~{output_segments}
If you can read WDL, you can also check this link for a recent implementation of the multi-sample variant calling workflow. Feel free to adapt it as you see fit.
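For readers who prefer plain shell over WDL (the ~{...} parts above are WDL placeholders), the steps described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the official workflow: all file names (tumor.bam, normal.bam, common_biallelic.vcf.gz, ref.fasta, unfiltered.vcf.gz and the output tables) are hypothetical placeholders, and the run helper with its DRY_RUN switch is added here purely so the commands can be printed and checked before anything is executed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical input paths; replace with your own files.
TUMOR_BAM="${TUMOR_BAM:-tumor.bam}"
NORMAL_BAM="${NORMAL_BAM:-normal.bam}"
COMMON_SITES="${COMMON_SITES:-common_biallelic.vcf.gz}"   # common germline variant sites
REFERENCE="${REFERENCE:-ref.fasta}"
UNFILTERED_VCF="${UNFILTERED_VCF:-unfiltered.vcf.gz}"     # the single Mutect2 output

# Illustrative helper: with DRY_RUN=1 (the default here) each command is
# only printed; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

contamination_workflow() {
  # 1. Pileup summaries from the tumor BAM and from the normal BAM.
  run gatk GetPileupSummaries -I "$TUMOR_BAM" \
    -V "$COMMON_SITES" -L "$COMMON_SITES" -O tumor_pileups.table
  run gatk GetPileupSummaries -I "$NORMAL_BAM" \
    -V "$COMMON_SITES" -L "$COMMON_SITES" -O normal_pileups.table

  # 2. Contamination estimate; the normal pileups go to --matched-normal.
  run gatk CalculateContamination -I tumor_pileups.table \
    --matched-normal normal_pileups.table \
    -O contamination.table --tumor-segmentation segments.table

  # 3. The contamination outputs are consumed only when filtering the calls.
  run gatk FilterMutectCalls -R "$REFERENCE" -V "$UNFILTERED_VCF" \
    --contamination-table contamination.table \
    --tumor-segmentation segments.table -O filtered.vcf.gz
}

contamination_workflow
```

Run as written, the script only prints the four gatk commands. So the answer to the original question is that --matched-normal takes the GetPileupSummaries table produced from the normal BAM, and the Mutect2 output itself only enters the picture at the FilterMutectCalls step.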
Best,
Philipp