Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

mutect2 multi-sample

Answered

  • David Benjamin

    Hi Alexandre Mondaini,

    You should run Mutect2 in multi-sample mode, separately for each patient.  That is:

    gatk Mutect2 -I normal1.bam -I tumor1.bam -I tumor2.bam -I tumor3.bam -normal normal1 . . . -O patient1.vcf
    gatk Mutect2 -I normal2.bam -I tumor4.bam -I tumor5.bam -I tumor6.bam -normal normal2 . . . -O patient2.vcf
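
When there are many patients, the two commands above can be generated rather than typed by hand. The following is only a minimal sketch, not an official GATK script: the function name `mutect2_command`, the `patients` dictionary and the `extra_args` parameter are hypothetical, and `extra_args` stands in for the arguments elided as `. . .` in the commands above, which you still have to supply yourself. Only the `-I`, `-normal` and `-O` arguments shown in the original commands are used.

```python
import shlex


def mutect2_command(normal_bam, normal_sample, tumor_bams, output_vcf, extra_args=()):
    """Build the argument list for one multi-sample Mutect2 run (one patient).

    normal_sample is the sample name given to -normal, as in the commands
    above, not a file path. extra_args is a placeholder for the arguments
    elided as ". . ." in the original commands.
    """
    if not tumor_bams:
        raise ValueError("at least one tumor BAM is required")
    args = ["gatk", "Mutect2", "-I", normal_bam]
    for bam in tumor_bams:
        args += ["-I", bam]
    args += ["-normal", normal_sample, *extra_args, "-O", output_vcf]
    return args


# Hypothetical layout mirroring the two commands above: one entry per patient.
patients = {
    "patient1": ("normal1.bam", "normal1", ["tumor1.bam", "tumor2.bam", "tumor3.bam"]),
    "patient2": ("normal2.bam", "normal2", ["tumor4.bam", "tumor5.bam", "tumor6.bam"]),
}

# One command per patient; samples from different patients never share a run.
commands = {
    name: mutect2_command(normal_bam, normal_sample, tumors, f"{name}.vcf")
    for name, (normal_bam, normal_sample, tumors) in patients.items()
}

for name, args in commands.items():
    print(shlex.join(args))
```

The sketch only prints the command lines. Passing each argument list to `subprocess.run(args, check=True)` would execute them, assuming `gatk` is on the `PATH` and the elided arguments have been filled in.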
  • Alexandre Mondaini

    Hi David,

    Thanks for the quick answer. Is this the mode in which samples are run by default when using this WDL file?

    https://github.com/broadinstitute/gatk/blob/master/scripts/mutect2_wdl/mutect2_multi_sample.wdl

    I see details about a pair list:

    #  pair_list: a tab-separated table with no header in the following format:
    #   TUMOR_1_BAM</TAB>TUMOR_1_bai</TAB>NORMAL_1_BAM</TAB>NORMAL_1_bai
    #   TUMOR_2_BAM</TAB>TUMOR_2_bai</TAB>NORMAL_2_BAM</TAB>NORMAL_2_bai

    With a subsequent scatter over each row of this tsv file:

    scatter( row in pairs )

    So should I repeat the normal on every row, with one row per tumor, like this?

    TUMOR_1_BAM</TAB>TUMOR_1_bai</TAB>NORMAL_1_BAM</TAB>NORMAL_1_bai
    TUMOR_2_BAM</TAB>TUMOR_2_bai</TAB>NORMAL_1_BAM</TAB>NORMAL_1_bai
    TUMOR_3_BAM</TAB>TUMOR_3_bai</TAB>NORMAL_1_BAM</TAB>NORMAL_1_bai

    Thank you for the support.

  • David Benjamin

    Alexandre Mondaini, that WDL file is misleading. It invokes multiple independent runs of Mutect2 on tumor-normal pairs, each consisting of a single tumor and a single normal. Mutect2 should be run using the featured workflow on Terra, where scattering over multiple pairs is built into the UI rather than the WDL itself, so this workflow (which predates Terra) is obsolete (see https://github.com/broadinstitute/gatk/pull/7992).

    We do not yet have a WDL for genuine multi-sample mode with multiple tumor samples from a single patient.
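
Since no WDL exists for this case, a pair list in the format quoted earlier in the thread has to be regrouped by hand before running multi-sample mode. The following is a rough sketch of that regrouping, not part of GATK: the function name `group_by_normal` and the file names in the example are hypothetical. It assumes that each patient has exactly one normal BAM and that no normal BAM is shared between patients, so rows with the same normal belong to the same patient.

```python
import csv
import io


def group_by_normal(pair_list_text):
    """Collapse pair-list rows that share a normal BAM into one group per normal.

    Each row is expected to hold four tab-separated columns with no header:
    tumor BAM, tumor index, normal BAM, normal index.
    Returns {normal_bam: [tumor_bam, ...]} in first-seen order.
    """
    groups = {}
    for row in csv.reader(io.StringIO(pair_list_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != 4:
            raise ValueError(f"expected 4 tab-separated columns, got {len(row)}: {row}")
        tumor_bam, _tumor_bai, normal_bam, _normal_bai = row
        tumors = groups.setdefault(normal_bam, [])
        if tumor_bam not in tumors:  # ignore exact duplicate rows
            tumors.append(tumor_bam)
    return groups


# Hypothetical pair list: two tumors for one normal, one tumor for another.
pair_list = (
    "tumor1.bam\ttumor1.bai\tnormal1.bam\tnormal1.bai\n"
    "tumor2.bam\ttumor2.bai\tnormal1.bam\tnormal1.bai\n"
    "tumor4.bam\ttumor4.bai\tnormal2.bam\tnormal2.bai\n"
)

groups = group_by_normal(pair_list)
for normal_bam, tumor_bams in groups.items():
    print(normal_bam, tumor_bams)
```

Each resulting group corresponds to one Mutect2 invocation: the normal BAM and all of its tumor BAMs are passed with `-I` in the same run, instead of one run per row.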

  • Anthony Dias-Ciarla

    Hi Alexandre Mondaini,

    Thank you for writing to the GATK forum! I hope that we can help you sort this out.

    I brought your inquiry to our developers and received some feedback to share with you.

    Firstly, you should run Mutect2 separately for each person. When you have a matched normal, you should use it, but you shouldn't use one patient’s normal with another patient’s tumor. Likewise, you shouldn't combine two different patients’ tumors in the same run of Mutect2.
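
These rules can be checked mechanically before a run is launched. The sketch below is one possible way to do that and is not a GATK feature: the function `check_single_patient` and the `patient_of` mapping from sample name to patient ID are hypothetical, and you would need to build that mapping from your own sample sheet.

```python
def check_single_patient(run_samples, patient_of):
    """Return the one patient that every sample in a planned run belongs to.

    run_samples: sample names going into a single Mutect2 run (normal and tumors).
    patient_of: hypothetical mapping from sample name to patient ID.
    Raises ValueError if the run is empty, contains a sample with no known
    patient, or mixes samples from more than one patient.
    """
    if not run_samples:
        raise ValueError("run contains no samples")
    unknown = [s for s in run_samples if s not in patient_of]
    if unknown:
        raise ValueError(f"samples with no patient assigned: {unknown}")
    patients = sorted({patient_of[s] for s in run_samples})
    if len(patients) != 1:
        raise ValueError(f"run mixes samples from several patients: {patients}")
    return patients[0]


# Hypothetical sample sheet.
patient_of = {
    "normal1": "patient1",
    "tumor1": "patient1",
    "tumor2": "patient1",
    "normal2": "patient2",
    "tumor4": "patient2",
}

print(check_single_patient(["normal1", "tumor1", "tumor2"], patient_of))  # prints patient1
```

A run such as `["normal1", "tumor4"]` raises `ValueError`, because it would pair one patient's normal with another patient's tumor.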

    I hope this helps clarify your questions! Please let me know if any other questions arise in the future. Thank you for being a valued contributor to the GATK community.

    Best,
    Anthony

  • Anthony Dias-Ciarla

    Hi Alexandre Mondaini,

    We haven't heard from you in a while so we're going to close out this ticket. If you still require assistance, simply respond to this email and we'll be happy to pick up where we left off!

    Kind regards,

    Anthony
