Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Using GATK as a genotyper to genotype existing variants

Answered
6 comments

  • Genevieve Brandt (she/her)

    Hi Ivar Grytten,

    Thanks for writing into the forum! I'll do my best to try to find a resolution that works for you. What would be the expected behavior you are suggesting in this instance? To me, it would make sense that GATK would present the call that seems to exist at this location based on the evidence.

    Best,

    Genevieve

  • Ivar Grytten

    Thanks for the reply!

    I would expect GATK to genotype the G/T SNP as either 0/0, 0/1 or 1/1, not to suggest another SNP (G/A) that is not in the input VCF.

    As another example, if I ask GATK to genotype this SNP:
    1 969377 . A G

    then in my case GATK outputs:

    1 969377 . A G,ATG 

    with genotype 0/2.

    Here it has come up with an insertion that is not in the input VCF.

    I guess this may just come down to the definition of what a genotyper should do. I thought a genotyper should only determine the genotypes of a given set of input variants, not call new variants (which is the job of a variant caller). For me it's not a big deal, but it would be nice to clarify whether this is the intended behaviour of GATK HaplotypeCaller with the --alleles flag :)

  • Genevieve Brandt (she/her)

    I see, thanks for clarifying.

    Yes, the --alleles flag calls variants at those positions; it does not just genotype the given alleles. I will follow up with my colleagues to see whether there is a better way to do what you are trying to do, but it might take some time to get a full resolution.

  • Genevieve Brandt (she/her)

    Hi Ivar Grytten,

    I had my colleagues take a second look, and they agreed that the results here are as expected. For the example in your comment above, marking the site as 0/0, 0/1 or 1/1 would be an incorrect result because the evidence at this site is not explained by A and A>G alone.

    It looks like here we are providing information on the allele you input and also reporting what is actually present at the site. Is there an alternate expected behavior you are looking for that would still provide accurate results?

    Best,

    Genevieve

  • Ivar Grytten

    Thanks a lot for clarifying!

    It does indeed make sense that GATK suggests (calls) alleles that it finds support for at the given sites. I just assumed that GATK with --alleles did not behave this way, since I've seen people using GATK as a pure genotyper to genotype a given set of input variants. I guess an alternative behaviour that could make sense would be to call the genotype as ./. (no support for the variant).

    Anyway, this is not a big deal; I mostly wanted to clarify what the intended behaviour was. Thanks again for helping with this! :)

  • Genevieve Brandt (she/her)

    Glad we could help out!
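
For readers who do want the "pure genotyper" view discussed in this thread, one option is to post-process the HaplotypeCaller output. The sketch below is not a GATK feature and not something the GATK team recommended here; it only illustrates the alternative Ivar describes (reporting ./. when the call involves an allele that was not requested). The function names are hypothetical, it works on plain allele lists rather than a real VCF parser, it drops phasing, and it does not adjust other fields such as AD or PL.

```python
from typing import List, Optional


def parse_gt(gt: str) -> List[Optional[int]]:
    """Split a VCF GT string such as '0/2' or '1|0' into allele indices (None for '.')."""
    return [None if a == "." else int(a) for a in gt.replace("|", "/").split("/")]


def novel_alts(requested_alts: List[str], called_alts: List[str]) -> List[str]:
    """Return ALT alleles in the caller's output that were not in the input VCF."""
    return [a for a in called_alts if a not in requested_alts]


def restrict_to_requested(ref: str, requested_alts: List[str],
                          called_alts: List[str], gt: str) -> str:
    """Re-express a genotype using only REF plus the requested ALT alleles.

    If the called genotype uses any allele outside that set, return a
    no-call ('./.' for a diploid genotype). Phasing is not preserved.
    """
    called = [ref] + list(called_alts)      # index 0 is always REF
    allowed = [ref] + list(requested_alts)
    indices = parse_gt(gt)
    no_call = "/".join("." for _ in indices)
    remapped = []
    for idx in indices:
        if idx is None or called[idx] not in allowed:
            return no_call
        # Map the allele back to its index in the requested allele list.
        remapped.append(str(allowed.index(called[idx])))
    return "/".join(remapped)


# The example from the thread: requested A>G at 1:969377, output "A G,ATG" with GT 0/2.
print(novel_alts(["G"], ["G", "ATG"]))                          # ['ATG']
print(restrict_to_requested("A", ["G"], ["G", "ATG"], "0/2"))   # ./.
print(restrict_to_requested("A", ["G"], ["G", "ATG"], "0/1"))   # 0/1
```

Note that masking 0/2 as ./. discards real evidence about the sample, which is the accuracy concern raised in the replies above, so whether this is appropriate depends on the downstream analysis.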
