Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

How to evaluate the result: GenotypeConcordance or Concordance

Answered
6 comments

  • Genevieve Brandt (she/her)

    Hi lizhichao,

    The tools are very similar, but GenotypeConcordance calculates concordance at the sample genotype level, while Concordance calculates it at the site level.

    The one you choose depends on the information you want for further analysis.

    Best,

    Genevieve

  • lizhichao

    Sorry, I can't see the difference between the genotype level and the site level. By the way, I noticed that some papers calculate precision at both the variant-call level and the genotype level. But I think the genotype is determined at every site/variant, so I can't figure out how the two differ. Can you explain it to me? Thanks!

  • Genevieve Brandt (she/her)

    Yes! I can explain this. 

    In a multi-sample VCF, there is some information in the VCF that applies to all samples and some information that applies to each specific sample. The site level calculation refers to a calculation from values that apply to all samples at that site, so data in the INFO field or REF/ALT fields. 

    A genotype-level calculation looks at the FORMAT field for each individual sample and calculates the concordance using those GT, AD, AF, etc. values.

    There is an in-depth explanation in our VCF document, which you can see here: https://gatk.broadinstitute.org/hc/en-us/articles/360035531692-VCF-Variant-Call-Format

    Hope this helps!

  • lizhichao

    If I compare two samples from two individual VCFs, does the above explanation still apply? Does the site level calculate the REF/ALT differences at the overlapping sites of the two VCFs, and the genotype level the GT differences at those overlapping sites?

  • Genevieve Brandt (she/her)

    Yes, that's correct!

  • lizhichao

    Thanks for your answer!

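To make the distinction discussed in this thread concrete, here is a minimal Python sketch of the two kinds of comparison for two single-sample call sets. It is only an illustration and not how GenotypeConcordance or Concordance are implemented. The `Call` type, the `compare` function, and the toy records are all hypothetical, and the sketch assumes biallelic sites and unphased genotypes. Site-level agreement is taken to mean that REF and ALT match at a position present in both call sets. Genotype-level agreement additionally requires that the GT values match.

```python
from typing import Dict, NamedTuple, Tuple


class Call(NamedTuple):
    """Hypothetical, simplified view of one VCF record for one sample."""
    ref: str
    alt: str
    gt: Tuple[int, int]  # allele indices, e.g. (0, 1) for a 0/1 genotype


Site = Tuple[str, int]  # (chromosome, 1-based position)


def compare(truth: Dict[Site, Call], query: Dict[Site, Call]) -> Dict[str, int]:
    """Count agreement at overlapping sites, first by alleles, then by genotype."""
    # Only positions present in both call sets can be compared.
    shared = truth.keys() & query.keys()

    # Site level: the REF and ALT alleles agree.
    same_alleles = [
        s for s in shared
        if (truth[s].ref, truth[s].alt) == (query[s].ref, query[s].alt)
    ]

    # Genotype level: the alleles agree AND the sample's GT agrees.
    # Sorting treats 0/1 and 1/0 as the same unphased genotype.
    same_gt = [
        s for s in same_alleles
        if sorted(truth[s].gt) == sorted(query[s].gt)
    ]

    return {
        "shared_sites": len(shared),
        "site_matches": len(same_alleles),
        "genotype_matches": len(same_gt),
    }


# Toy data, invented purely for illustration.
truth = {
    ("chr1", 100): Call("A", "G", (0, 1)),
    ("chr1", 200): Call("C", "T", (1, 1)),
    ("chr1", 300): Call("G", "A", (0, 1)),
    ("chr1", 400): Call("T", "C", (0, 1)),  # absent from query
}
query = {
    ("chr1", 100): Call("A", "G", (0, 1)),  # same alleles, same genotype
    ("chr1", 200): Call("C", "T", (0, 1)),  # same alleles, different genotype
    ("chr1", 300): Call("G", "C", (0, 1)),  # different ALT allele
    ("chr1", 500): Call("A", "T", (1, 1)),  # absent from truth
}

result = compare(truth, query)
print(result)
```

In this toy data, the record at position 200 agrees at the site level (same REF and ALT) but disagrees at the genotype level (1/1 versus 0/1). That is the kind of case where the two levels of concordance give different answers.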
