Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GATK Mutect2 tumor vs. normal sample

Answered

4 comments

  • Genevieve Brandt (she/her)

    Hello Enrico Cocchi,

    The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.

    Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.

    We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.

    For context, check out our support policy.

  • David Benjamin

    Enrico Cocchi, if I understand the question correctly, you are wondering how to compare tumor-only calls with tumor-normal calls. The short answer is that the comparison is pointless, because calls made with a matched normal are vastly superior. The reason is simple: even with the strictest germline filtering possible, namely blacklisting every germline variant in gnomAD no matter how rare, any given genome will still have 30,000 or so rare germline variants that go unfiltered without a matched normal. There is basically nothing you can do about this unless your tumor has low purity, in which case you can distinguish the low allele fractions of somatic variants from germline allele fractions closer to 1/2.
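
The purity argument can be made concrete with a back-of-the-envelope estimate. This is only a sketch under simplifying assumptions that the reply does not state: a clonal, heterozygous somatic variant in a copy-neutral diploid region, and a tumor purity $p$ defined as the fraction of sequenced cells that are tumor cells. Each cell contributes two copies of the locus, and only tumor cells carry one mutated copy.

```latex
\[
\mathrm{AF}_{\text{somatic}} \approx \frac{p \cdot 1}{2p + 2(1 - p)} = \frac{p}{2},
\qquad
\mathrm{AF}_{\text{germline het}} \approx \frac{1}{2}.
\]
```

Under these assumptions, a tumor with $p = 0.3$ would show somatic variants near an allele fraction of 0.15, well separated from 0.5, whereas at $p = 0.9$ the expected value of 0.45 is hard to tell apart from a germline heterozygote.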

    While we're here, let's also clear up a few things:

    • The PoN is important for both tumor-only and tumor-normal calling because its job is to filter out mapping and technical artifacts.
    • The germline resource (usually the AF-only gnomAD file we provide in our public Google bucket) is responsible for germline filtering.
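
As an illustration of the two points above, here is a minimal dry-run sketch that only prints Mutect2 command lines instead of executing them. It uses the flags that appear in the commands posted in this thread; the function name `mutect2_cmd`, the BAM and VCF file names, and the sample name `normal_sample` are placeholders made up for this example. The point it tries to show is that the PoN and the germline resource are passed in both modes, and that the matched normal is the only difference.

```shell
# Dry-run sketch: builds and prints a Mutect2 command, does not run GATK.
# All file names and the sample name are hypothetical placeholders.
mutect2_cmd() {
  out=$1      # output VCF
  normal=$2   # normal sample name, or "" for tumor-only mode
  shift 2     # remaining arguments are the input BAMs
  cmd="gatk Mutect2 -R reference.fa"
  for bam in "$@"; do
    cmd="$cmd -I $bam"
  done
  # A matched normal is declared by its sample name.
  if [ -n "$normal" ]; then
    cmd="$cmd -normal $normal"
  fi
  # Both modes get the same two resources:
  # the germline resource for germline filtering,
  # the panel of normals for mapping and technical artifacts.
  cmd="$cmd --germline-resource af-only-gnomad.vcf.gz"
  cmd="$cmd --panel-of-normals pon.vcf.gz"
  cmd="$cmd -O $out"
  printf '%s\n' "$cmd"
}

# Tumor-only call
mutect2_cmd tumor_only.vcf.gz "" tumor.bam
# Tumor-normal call
mutect2_cmd tumor_normal.vcf.gz normal_sample tumor.bam normal.bam
```

The two printed commands differ only in the extra `-I normal.bam` and `-normal normal_sample` arguments.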
  • Jeffcgen2000

    We have one normal sample and two tumor samples from the same patient. How do I set up tumor-normal sample matching? Is the command below correct?

     

    gatk Mutect2 \
         -R reference.fa \
         -I tumor1_patient1.bam \
         -I tumor2_patient1.bam \
         -I tumor1_patient2.bam \
         -I tumor2_patient2.bam \
         -I normal_patient1.bam \
         -I normal_patient1.bam \
         -I normal_patient2.bam \
         -I normal_patient2.bam \
         -normal normal_patient1 \
         -normal normal_patient1 \
         -normal normal_patient2 \
         -normal normal_patient2 \
         --germline-resource af-only-gnomad.vcf.gz \
         --panel-of-normals pon.vcf.gz \
         -O somatic.vcf.gz

     

  • David Benjamin

    Jeffcgen2000, different patients should not be combined in a single Mutect2 command. The correct command for patient1 would be:

    gatk Mutect2 \
         -R reference.fa \
         -I tumor1_patient1.bam \
         -I tumor2_patient1.bam \
         -I normal_patient1.bam \
         -normal normal_patient1 \
         --germline-resource af-only-gnomad.vcf.gz \
         --panel-of-normals pon.vcf.gz \
         -O somatic.vcf.gz
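
For more than one patient, the same pattern can be repeated once per patient. The following is a minimal dry-run sketch that prints one command per patient rather than running GATK. It assumes the file naming used in this thread (`tumor1_<patient>.bam`, `tumor2_<patient>.bam`, `normal_<patient>.bam`) and that the normal sample name recorded in each BAM header is `normal_<patient>`. The function name `run_patient` and the per-patient output name `somatic_<patient>.vcf.gz` are made up for this example so that the two runs do not overwrite each other.

```shell
# Dry-run sketch: prints one Mutect2 command per patient, does not run GATK.
# Assumes the thread's file naming; run_patient and the output name are hypothetical.
run_patient() {
  patient=$1
  cmd="gatk Mutect2 -R reference.fa"
  # Both tumor samples and the matched normal of this patient only.
  cmd="$cmd -I tumor1_${patient}.bam -I tumor2_${patient}.bam"
  # -normal takes the sample name, assumed here to be normal_<patient>.
  cmd="$cmd -I normal_${patient}.bam -normal normal_${patient}"
  cmd="$cmd --germline-resource af-only-gnomad.vcf.gz"
  cmd="$cmd --panel-of-normals pon.vcf.gz"
  cmd="$cmd -O somatic_${patient}.vcf.gz"
  printf '%s\n' "$cmd"
}

# One separate command per patient; patients are never mixed.
for patient in patient1 patient2; do
  run_patient "$patient"
done
```

Each printed line can be inspected first and then executed separately once the paths and sample names have been checked against the real data.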