Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

How to combine Mutect2 and HaplotypeCaller VCFs from the same sample

Answered

7 comments

  • Genevieve Brandt (she/her)

    Hi Yanick Paco Hagemeijer,

    I am going to move your post into our Community Discussions -> Special GATK Use Cases topic, as the Somatic topic is for reporting bugs and issues with GATK.

    You can read more about our forum guidelines and the topics here: Forum Guidelines.

    Best,

    Genevieve

  • Yanick Paco Hagemeijer

    Hi Genevieve,

    Sorry, and thanks for correcting that.

    Greetings.

    Yanick

  • Genevieve Brandt (she/her)

    Yanick Paco Hagemeijer, what do you expect from the merge if there are events from Mutect2 and HaplotypeCaller at the same site? Or do you want a two-sample VCF where one sample is the output from Mutect2 and the other sample is the output from HaplotypeCaller?

  • Yanick Paco Hagemeijer

    Thanks for the quick reply!

    I would expect the latter (a multi-sample VCF). I cannot even begin to imagine the complexity of having to fold those two files into a chimera sample.

    I would like to filter the Mutect2 calls with FilterMutectCalls before merging the 'samples'. To clarify, this is what I encountered:

    • HaplotypeCaller only outputs haplotype information when writing a GVCF or bpvcf (not clearly documented)
    • Mutect2 can also write GVCF output (completes without error)
    • FilterMutectCalls crashes on Mutect2's GVCF output (unexpected)
    • GenotypeGVCFs only accepts GVCF files (expected, a non-issue; obviously I would need to rename one of the samples)

  • Genevieve Brandt (she/her)

    Hi Yanick Paco Hagemeijer,

    The GVCF output from Mutect2 is still in BETA, which is probably why there are remaining issues with FilterMutectCalls. I wouldn't recommend skipping the FilterMutectCalls step because there would be too many false positives in your output.

    Even if we were somehow able to patch FilterMutectCalls to accept the GVCF, its output would be a VCF, which CombineGVCFs and GenotypeGVCFs would not accept.
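
To make the VCF versus GVCF distinction concrete, here is a minimal sketch of a check for whether a file looks like a GVCF. It assumes the file follows the HaplotypeCaller convention of listing the symbolic `<NON_REF>` allele in the ALT column; the function name `looks_like_gvcf` is hypothetical and is not part of GATK.

```python
def looks_like_gvcf(vcf_lines):
    """Return True if any record lists the symbolic <NON_REF> allele in ALT.

    Assumption: GVCF records carry <NON_REF> in the ALT column, as in
    HaplotypeCaller GVCF output. A plain VCF is not expected to contain it.
    """
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        # ALT is the fifth column; multiple alleles are comma-separated.
        if "<NON_REF>" in fields[4].split(","):
            return True
    return False
```

For a real file, the lines could come from `gzip.open(path, "rt")` for a `.vcf.gz` or from a plain `open(path)`.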

    I think you're going to run into too many issues if you try the method you are suggesting here. I would recommend that you follow the germline and somatic Best Practices workflows separately, then compare the VCF outputs once all filtering has been completed.
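
As a rough illustration of that comparison step, the sketch below splits the calls from two filtered VCFs into shared, germline-only, and somatic-only sets. It is not a GATK tool: the function names are hypothetical, records are matched on exact (CHROM, POS, REF, ALT) with no normalization of indel representation, and treating a FILTER value of `.` as passing is an assumption.

```python
def variant_keys(vcf_lines, pass_only=True):
    """Collect (chrom, pos, ref, alt) keys from VCF record lines."""
    keys = set()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alts, _qual, flt = line.rstrip("\n").split("\t")[:7]
        # Assumption: "." (no filtering applied) counts as passing.
        if pass_only and flt not in ("PASS", "."):
            continue
        # Multi-allelic records contribute one key per ALT allele.
        for alt in alts.split(","):
            keys.add((chrom, int(pos), ref, alt))
    return keys


def compare_calls(germline_lines, somatic_lines):
    """Split calls into shared, germline-only, and somatic-only sets."""
    germline = variant_keys(germline_lines)
    somatic = variant_keys(somatic_lines)
    return {
        "shared": germline & somatic,
        "germline_only": germline - somatic,
        "somatic_only": somatic - germline,
    }
```

Because the match is exact, the same indel written two different ways would land in both the germline-only and somatic-only sets, so left-aligning and trimming both files first would likely make the comparison more reliable.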

    Alternatively, some users have had great results running Mutect2 with the option --genotype-germline-sites. It is no longer experimental; we just have not updated the label.
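
A minimal sketch of what that run might look like, written as Python helpers that build the command lines. The file names are placeholders, the helper names are hypothetical, and the tumor-only layout (no matched normal, no panel of normals or germline resource) is an assumption rather than a recommended configuration.

```python
import shlex


def mutect2_command(reference, bam, output_vcf):
    """Build a Mutect2 command line with --genotype-germline-sites enabled."""
    return [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", bam,
        "-O", output_vcf,
        "--genotype-germline-sites", "true",
    ]


def filter_command(reference, unfiltered_vcf, filtered_vcf):
    """Build the FilterMutectCalls command line for the Mutect2 output."""
    return [
        "gatk", "FilterMutectCalls",
        "-R", reference,
        "-V", unfiltered_vcf,
        "-O", filtered_vcf,
    ]


if __name__ == "__main__":
    # Placeholder paths; replace with real files before running anything.
    call = mutect2_command("ref.fasta", "sample.bam", "unfiltered.vcf.gz")
    flt = filter_command("ref.fasta", "unfiltered.vcf.gz", "filtered.vcf.gz")
    print(shlex.join(call))
    print(shlex.join(flt))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once GATK is on the PATH. Keeping the commands as lists avoids shell quoting problems with unusual file names.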

    Best,

    Genevieve

  • Yanick Paco Hagemeijer

    Hi Genevieve,

    I am going to give that a try, thanks for the insight.

    Have a nice weekend,

    Yanick

  • Genevieve Brandt (she/her)

    You too!
