How to evaluate the result: GenotypeConcordance or Concordance?
REQUIRED for all errors and issues:
a) GATK version used: gatk-4.1.7
b) Exact command used:
/zfssz2/ST_MCHRI/BIGDATA/USER/liupanhong/Src/gatk-4.1.7.0/gatk --java-options -Xmx3G GenotypeConcordance \
--CALL_VCF $testvcf \
--OUTPUT ${sample}_zbolt_microarray \
--IGNORE_FILTER_STATUS \
--TRUTH_VCF $baseline \
--TRUTH_SAMPLE $sample && echo "${sample} done"
c) Entire program log:
When I analyze samples with different analysis pipelines, I want to know which pipeline is more precise, so I run rtg vcfeval or hap.py.
In addition, when I want to measure the consistency (concordance) between call sets, which tool is appropriate: GenotypeConcordance or Concordance?
See forum topic details at forum guidelines page: https://gatk.broadinstitute.org/hc/en-us/articles/360053845952-Forum-Guidelines
-
Hi lizhichao,
The tools are very similar but GenotypeConcordance calculates the concordance at the sample genotype level while Concordance calculates the concordance at the site level.
The one you choose depends on the information you want for further analysis.
Best,
Genevieve
-
Sorry, I don't understand the difference between the genotype level and the site level. By the way, I noticed that some papers calculate precision at both the variant-call level and the genotype level. But I think a genotype is defined at every site/variant, so I can't see how the two differ. Can you explain it to me? Thanks!
-
Yes! I can explain this.
In a multi-sample VCF, there is some information in the VCF that applies to all samples and some information that applies to each specific sample. The site level calculation refers to a calculation from values that apply to all samples at that site, so data in the INFO field or REF/ALT fields.
A genotype-level calculation looks at the FORMAT fields for each sample individually and calculates the concordance from those GT, AD, AF, etc. values.
There is an in-depth explanation in our VCF document, which you can see here: https://gatk.broadinstitute.org/hc/en-us/articles/360035531692-VCF-Variant-Call-Format
Hope this helps!
-
If I compare two samples from two single-sample VCFs, does the explanation above still apply? Does the site-level calculation compare the REF/ALT alleles at the sites shared by the two VCFs, and the genotype-level calculation compare the GT values at those shared sites?
-
Yes, that's correct!
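
To make the distinction concrete, here is a minimal Python sketch of the two comparisons as described in this thread. It is only a toy under stated assumptions: the records are hand-written tuples rather than parsed VCF files, and the names site_key, normalize_gt, and compare are hypothetical helpers, not part of GATK. It does not reproduce how GenotypeConcordance or Concordance are implemented or what they report.

```python
# Toy illustration only: records are hand-written tuples, not parsed from real VCFs.
# Each record is (CHROM, POS, REF, ALT, GT) for one sample.

def site_key(rec):
    """Site-level identity: position plus REF/ALT alleles, ignoring the genotype."""
    chrom, pos, ref, alt, _gt = rec
    return (chrom, pos, ref, alt)

def normalize_gt(gt):
    """Treat phased and unphased genotypes alike and ignore allele order (1|0 == 0/1)."""
    return tuple(sorted(gt.replace("|", "/").split("/")))

def compare(truth, calls):
    """Count site-level overlap, then genotype agreement on the shared sites."""
    truth_gt = {site_key(r): r[4] for r in truth}
    calls_gt = {site_key(r): r[4] for r in calls}
    shared = truth_gt.keys() & calls_gt.keys()
    matching = sum(
        1 for k in shared if normalize_gt(truth_gt[k]) == normalize_gt(calls_gt[k])
    )
    return {
        "sites_truth_only": len(truth_gt.keys() - calls_gt.keys()),
        "sites_calls_only": len(calls_gt.keys() - truth_gt.keys()),
        "sites_shared": len(shared),
        "genotypes_matching": matching,
    }

truth = [
    ("chr1", 100, "A", "G", "0/1"),
    ("chr1", 200, "C", "T", "1/1"),
    ("chr1", 300, "G", "A", "0/1"),
]
calls = [
    ("chr1", 100, "A", "G", "1|0"),  # same site, same genotype (phased)
    ("chr1", 200, "C", "T", "0/1"),  # same site, different genotype
    ("chr1", 400, "T", "C", "0/1"),  # site absent from the truth set
]

result = compare(truth, calls)
print(result)
```

In this toy data the two call sets share two sites (site level), but only one of those shared sites has the same genotype (genotype level), which is why the two kinds of concordance can give different numbers for the same pair of VCFs.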
-
Thanks for your answer!