Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

SNP calling with multi-sample plant data

5 comments

  • Bhanu Gandham

    Hi,

    The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.

    Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.

    We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.

    For context, check out our support policy.

     

  • David Benjamin

    Mutect2 is the appropriate tool for calls with low allele fractions, while HaplotypeCaller is preferred for higher allele fractions that conform to a specific ploidy, e.g., AF = 0.5 or 1 in humans.

    In plants with high ploidy there is some gray area. HaplotypeCaller can genotype germline variants at arbitrary ploidy, e.g., A/A/A/B in a tetraploid organism. However, at high ploidies the computational demands of modeling an exponential number of possible genotypes cause problems with multiallelic variants. If I had to pick a cutoff, I would say that germline calls at ploidy 6 or lower should be handled with HaplotypeCaller.

    The relevant questions for you, therefore, are

    1) What is the ploidy?

    2) Are you interested only in germline variants or also somatic variants?
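
To give a feel for why high ploidy combined with multiallelic sites gets expensive, here is a minimal sketch that counts the unordered genotypes a caller would have to consider. It assumes a genotype is simply a multiset of `ploidy` alleles drawn from `n_alleles` candidates, which is a standard combinatorial count and not a description of HaplotypeCaller's internals. The function name `genotype_count` is made up for this illustration.

```python
from math import comb


def genotype_count(ploidy: int, n_alleles: int) -> int:
    """Count unordered genotypes: multisets of size `ploidy` over `n_alleles` alleles.

    Illustrative helper only (hypothetical name); uses the stars-and-bars
    formula C(ploidy + n_alleles - 1, ploidy).
    """
    if ploidy < 1 or n_alleles < 1:
        raise ValueError("ploidy and n_alleles must both be at least 1")
    return comb(ploidy + n_alleles - 1, ploidy)


if __name__ == "__main__":
    # Compare a biallelic site with multiallelic sites at several ploidies.
    for ploidy in (2, 4, 6, 10):
        counts = {a: genotype_count(ploidy, a) for a in (2, 3, 6)}
        print(f"ploidy={ploidy}: {counts}")
```

Under this counting, a diploid biallelic site has 3 genotypes, while a site with 6 alleles has 462 genotypes at ploidy 6 and 3003 at ploidy 10.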

  • Sin Lee

    @David Benjamin 

    Thanks for your reply. Actually, I'm dealing with a diploid genome and am only interested in germline variants.

    It seems that HaplotypeCaller might be better for my situation.

    Also, is there any way to build a reasonably credible SNP database myself for BQSR?

     

    Best regards,

    Sin 

  • David Benjamin

    You can bootstrap BQSR by generating calls without BQSR, using the most confident calls as a database for BQSR, then calling again with the recalibrated data.
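
The bootstrap described above could be laid out roughly as follows. This is a sketch under several assumptions, not an official recipe: it handles a single BAM rather than a multi-sample cohort, the file names and the helper `bootstrap_bqsr_commands` are hypothetical, and the filter thresholds in `STRICT_FILTER` are placeholders that would need tuning for a given plant dataset. It only builds the GATK4 command lines and prints them; nothing is executed.

```python
import shlex
from typing import List

# Placeholder hard-filter expression (assumption): records matching it are
# flagged as low confidence and dropped from the bootstrap known-sites set.
STRICT_FILTER = "QD < 2.0 || FS > 60.0 || MQ < 40.0 || SOR > 3.0"


def bootstrap_bqsr_commands(ref: str, bam: str, prefix: str) -> List[List[str]]:
    """Build one round of bootstrap-BQSR commands (hypothetical helper)."""
    raw_vcf = f"{prefix}.raw.vcf.gz"
    flagged_vcf = f"{prefix}.flagged.vcf.gz"
    known_vcf = f"{prefix}.confident_snps.vcf.gz"
    table = f"{prefix}.recal.table"
    recal_bam = f"{prefix}.recal.bam"
    final_vcf = f"{prefix}.recal.vcf.gz"
    return [
        # 1. Call variants on the unrecalibrated reads.
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam, "-O", raw_vcf],
        # 2. Flag low-confidence calls.
        ["gatk", "VariantFiltration", "-R", ref, "-V", raw_vcf,
         "--filter-name", "low_confidence",
         "--filter-expression", STRICT_FILTER, "-O", flagged_vcf],
        # 3. Keep only unflagged SNPs as the bootstrap known-sites set.
        ["gatk", "SelectVariants", "-R", ref, "-V", flagged_vcf,
         "--select-type-to-include", "SNP", "--exclude-filtered",
         "-O", known_vcf],
        # 4. Build the recalibration table against those sites.
        ["gatk", "BaseRecalibrator", "-R", ref, "-I", bam,
         "--known-sites", known_vcf, "-O", table],
        # 5. Apply the recalibration to the original reads.
        ["gatk", "ApplyBQSR", "-R", ref, "-I", bam,
         "--bqsr-recal-file", table, "-O", recal_bam],
        # 6. Call again on the recalibrated reads.
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", recal_bam, "-O", final_vcf],
    ]


if __name__ == "__main__":
    for cmd in bootstrap_bqsr_commands("ref.fasta", "sample.bam", "sample"):
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps the JEXL filter expression intact without manual quoting; `shlex.join` adds shell quoting only when the commands are printed.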

  • Sin Lee

    Thanks a lot for the suggestion; I'll try it and share the results later.

