Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Inconsistent variant call in some pool-seq samples

Answered

3 comments

  • Genevieve Brandt (she/her)

    Hi Lorena,

    I am a bit confused regarding your post. Do you think we could start with one question and example at a time?

    Generally for pool seq we recommend Mutect2 because it can handle variable ploidies much better than HaplotypeCaller. Have you tried out Mutect2?

    Best,

    Genevieve

  • Lorena

    Hi Genevieve, yes! Apologies if it's confusing. I guess the question boils down to this: why do some samples get called as variable at a site while others don't, even though the samples are so similar?

    I have not tried Mutect2; I became aware of it only recently. I wanted to see if I could rescue the variants that we already have rather than spending a lot of time learning and preparing a whole new pipeline for Mutect2. Also, a few aspects of it made me feel that restarting the variant calling would be a significant amount of work (please correct me if I am wrong!):

    1) It seems to me that if I input the parents, for example, as "normal" samples, and the pool-seq samples as "tumors", I would only get the de novo mutations, to the exclusion of the sites in the parents ("germline"). If instead I choose to run everything in tumor-only mode to get all sites, do I then have to run each of the 159 samples independently and join them afterwards? Or can Mutect2 now do joint calling of only "tumors"?

    2) My impression was also that the Mutect2 output has different metrics for filtering, so that is something I would need time to get familiar with. Is it a normal VCF format?

    Thanks!

    Lorena

     

  • Genevieve Brandt (she/her)

    Hi Lorena,

    OK, thanks for summarizing your question. Could we go over one variant site of interest and see why it was called in some samples and not in others? I would recommend first taking a look at this article: When HaplotypeCaller and Mutect2 do not call an expected variant. The article contains troubleshooting recommendations and common debug options. If the article does not help you understand the discrepancy, please provide details for the site of interest, including the read depth supporting the variant and the output VCF line you get when you call the variant with the --alleles option.
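
    As a rough illustration of the force-calling step mentioned above, the sketch below assembles a GATK command line that passes a VCF containing the site of interest to --alleles and restricts calling to a small interval with -L. All file names, the interval, and the helper name force_call_command are hypothetical placeholders. Any other arguments used in the original run (for example, ploidy settings) are not shown and would need to be kept the same for the comparison to be meaningful.

```python
import shlex


def force_call_command(tool, reference, bam, alleles_vcf, interval, output_vcf):
    """Build (but do not run) a GATK command that force-calls given alleles.

    `tool` is either "HaplotypeCaller" or "Mutect2"; both accept --alleles.
    All paths are placeholders supplied by the caller.
    """
    if tool not in ("HaplotypeCaller", "Mutect2"):
        raise ValueError(f"unsupported tool: {tool}")
    return [
        "gatk", tool,
        "-R", reference,           # reference FASTA
        "-I", bam,                 # one sample's aligned reads
        "-L", interval,            # restrict calling to the region of interest
        "--alleles", alleles_vcf,  # VCF listing the allele(s) to force-call
        "-O", output_vcf,          # output VCF to inspect for the site
    ]


# Hypothetical example: one pool-seq sample and one site of interest.
cmd = force_call_command(
    "HaplotypeCaller",
    "ref.fasta",
    "pool_01.bam",
    "site_of_interest.vcf.gz",
    "chr1:1000-1200",
    "pool_01.forced.vcf.gz",
)
print(shlex.join(cmd))
```

    Running the same command for a sample where the variant was called and one where it was not gives two VCF lines for the same site that can be compared side by side.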

    I am interested to see whether you get better results with Mutect2 for pool-seq, but again, it's not a recommended use case for Mutect2. We don't have many resources on how to use it this way, and you would have to implement your own filtering rules because FilterMutectCalls is meant for tumor analysis.

    1) You would have to run each sample individually, so it would be time-intensive.

    2) The output VCF from Mutect2 is a normal VCF with a few changes that make sense for tumor analysis.
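
    To make point 1 concrete, here is a minimal sketch of what running every sample individually could look like: one tumor-only Mutect2 command per BAM file, built as an argument list. The helper name, file names, and output naming scheme are assumptions for illustration only. Combining the per-sample VCFs and filtering them are deliberately left out, since those steps would need custom rules.

```python
from pathlib import PurePosixPath


def mutect2_tumor_only_commands(reference, bam_paths, out_dir):
    """Return one tumor-only Mutect2 command (as an argument list) per BAM.

    Tumor-only mode here simply means no matched normal sample is supplied.
    The commands are only constructed, not executed.
    """
    commands = []
    for bam in bam_paths:
        sample = PurePosixPath(bam).stem  # "pools/pool_01.bam" -> "pool_01"
        out_vcf = str(PurePosixPath(out_dir) / f"{sample}.mutect2.vcf.gz")
        commands.append([
            "gatk", "Mutect2",
            "-R", reference,  # reference FASTA
            "-I", str(bam),   # a single pool-seq sample
            "-O", out_vcf,    # per-sample output VCF
        ])
    return commands


# Hypothetical example with two of the pool-seq samples.
cmds = mutect2_tumor_only_commands(
    "ref.fasta",
    ["pools/pool_01.bam", "pools/pool_02.bam"],
    "calls",
)
for c in cmds:
    print(" ".join(c))
```

    Each argument list could be handed to a job scheduler or to subprocess.run, which is where the time cost of 159 independent runs would show up.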

    Best,

    Genevieve
