Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

What variant calling method should I use?


  • David Benjamin

    Chris, it's a lot simpler than that. Before suggesting an answer, I have a few unsolicited (sorry!) public service announcements.

    PSA #1: Only make your own panel of normals if you have both hundreds of samples and a very good reason to do so. Otherwise, use one from our best-practices Google bucket, gs://gatk-best-practices/somatic-hg38 (there's also one for the hg19 reference).

    PSA #2: The panel of normals captures technical artifacts, not germline variants.

    PSA #3: Tumor-only calling is always much less reliable than tumor-normal.

    Anyway, you have two options, depending on whether you want to resolve differences between the daughter cells. Both involve Mutect2.

    Option #1: If you want to discover all mutations that exist in any daughter cell (i.e., you want to pool the data from all daughter cells) and do not exist in the zero-day cells, run Mutect2 as follows:

    gatk Mutect2 -R ref.fasta \
        -I zero-day1.bam -I zero-day2.bam \
        -I daughter1.bam -I daughter2.bam \
        --normal zero-day-sample1 --normal zero-day-sample2 \
        --germline-resource gs://gatk-best-practices/somatic-hg38/af-only-gnomad.hg38.vcf.gz \
        -pon gs://gatk-best-practices/somatic-hg38/1000g_pon.hg38.vcf.gz \
        -O pooled-daughters.vcf

    Option #2: If you want to treat the daughters as distinct populations, run Mutect2 separately for each daughter. The only difference from the command above is that you input one daughter BAM at a time.
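
A minimal sketch of Option #2 as a shell loop, assuming the same file and sample names as in the Option #1 command. The `run` helper, the `DRY_RUN` switch, and the per-daughter output names (`daughter1.vcf`, `daughter2.vcf`) are illustrative additions, not part of GATK. By default the script only prints each command so it can be checked before anything is launched.

```shell
# Sketch: one Mutect2 call per daughter, each against both zero-day normals.
# DRY_RUN, run() and mutect2_per_daughter() are hypothetical helpers for this
# example, not GATK features.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them
BUCKET="gs://gatk-best-practices/somatic-hg38"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Takes daughter names without the .bam extension.
mutect2_per_daughter() {
    for daughter in "$@"; do
        run gatk Mutect2 -R ref.fasta \
            -I zero-day1.bam -I zero-day2.bam \
            -I "${daughter}.bam" \
            --normal zero-day-sample1 --normal zero-day-sample2 \
            --germline-resource "${BUCKET}/af-only-gnomad.hg38.vcf.gz" \
            -pon "${BUCKET}/1000g_pon.hg38.vcf.gz" \
            -O "${daughter}.vcf"
    done
}

mutect2_per_daughter daughter1 daughter2
```

Setting `DRY_RUN=0` in the environment executes the commands instead of printing them.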

    Note that in both cases it is essential to run FilterMutectCalls afterwards. Also, replace "zero-day-sample1" and "zero-day-sample2" with the actual sample names from the BAM headers (the SM field of the @RG lines).
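
Both follow-up steps can be sketched in shell. The `sample_names` helper below is a hypothetical function, not a GATK or samtools command; it extracts the SM field from the @RG lines of a header passed on standard input. The filtering step assumes the Option #1 output name, and `pooled-daughters.filtered.vcf` is an illustrative output name.

```shell
# Sketch only: sample_names() is a hypothetical helper.
# Reads a SAM/BAM header on stdin and prints each distinct SM (sample name)
# value found on the @RG (read group) lines.
sample_names() {
    grep '^@RG' | tr '\t' '\n' | sed -n 's/^SM://p' | sort -u
}

# Filter the Option #1 calls. Guarded so the sketch does nothing unless gatk
# and the unfiltered VCF are actually present.
if command -v gatk >/dev/null 2>&1 && [ -f pooled-daughters.vcf ]; then
    gatk FilterMutectCalls -R ref.fasta \
        -V pooled-daughters.vcf \
        -O pooled-daughters.filtered.vcf
fi
```

Assuming samtools is installed, `samtools view -H zero-day1.bam | sample_names` prints the name to pass to `--normal` for that BAM.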
