Mutect2 output - resources?
Hello,
I recently ran the mutect2:4.1.6.0 workflow in Terra. What is the best resource for reading about the various output files it produces, as well as the filtering strategies used to generate the resulting VCF files?
Many thanks,
Mia
-
Hi Mia,
Thanks for writing in. Is there anything specific you are curious about that isn't answered by the Mutect2 GATK article, or the articles that link out from that page regarding specific tasks?
Kind regards,
Jason
-
Hi Jason,
Thanks. I have come across that page, but I have not found answers to my questions.
1) I am looking to understand the structure of the various output files - e.g. the basic folders generated are 'call-Filter', 'call-Funcotate', 'call-LearnReadRation', 'call-M2', 'call-MergeStats', 'call-MergeVCFs', 'call-SplitIntervals'
1a) What sorts of outputs do these folders contain? Each has multiple files, but I cannot find guidelines as to what these are.
1b) What are the differences between the vcf/maf files produced across these folders, given that they contain different numbers of mutations (e.g. filtered.annotated.maf and filtered.vcf)?
1c) Some files have column names that are abbreviations (e.g. the filtering_stats file in the call-Filter folder) - how can I find out what these columns mean?
2) How does flagging and filtering of the VCF files work, in terms of marking a mutation as 'PASS' or anything else (e.g. germline etc.)? Which mutations are also Funcotated (this might relate to the differences outlined in question 1b)?
Thanks,
Mia
-
Hi Mia,
This paper may help in answering some of your questions: https://www.biorxiv.org/content/10.1101/861054v1.full.pdf
Here is some information about somatic short variant discovery more generally: https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-
A member of the GATK team will review your questions and fill in any gaps that these two documents leave, but feel free to take a look at them and come back with any lingering questions.
Kind regards,
Jason
-
Hi MPetlj, Many of the folders correspond to different steps of the workflow. This article goes over all of the steps performed: https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-
We also have a Funcotator tutorial which may help you understand these outputs: https://gatk.broadinstitute.org/hc/en-us/articles/360035889931-Funcotator-Information-and-Tutorial
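For the question about how many mutations are marked 'PASS' versus flagged with something else, one quick check is to tally the FILTER column of the VCF (the seventh tab-separated column in the VCF format). Below is a minimal Python sketch of that idea, not part of the workflow itself. The function name count_filters and the file name in the usage note are made up for illustration, and it assumes a plain-text or gzip-compressed VCF.

```python
import gzip
from collections import Counter


def count_filters(vcf_path):
    """Tally FILTER labels in a VCF (hypothetical helper, for illustration only)."""
    # Assumption: files ending in .gz are gzip/bgzip compressed, others are plain text.
    opener = gzip.open if vcf_path.endswith(".gz") else open
    counts = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip meta-information lines (##...) and the column header (#CHROM...).
            if line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 7:
                continue  # not a complete variant record
            # FILTER is the 7th column; several labels are joined with ';'.
            for label in fields[6].split(";"):
                counts[label] += 1
    return counts
```

Calling count_filters("filtered.vcf") (file name assumed) returns a Counter keyed by filter label. A record counted under PASS passed every filter, while any other label names a filter that the record failed, and a record with several labels is counted once under each of them.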