I want to use GATK only as a genotyper to genotype a set of input variants (e.g. variants from 1000 Genomes). From what I have understood from the documentation (and other papers that benchmark genotypers), this can be done by using HaplotypeCaller with the --alleles flag.
However, it seems that when the --alleles flag is used, GATK not only genotypes the input variants but also often proposes additional alleles at those sites.
For instance, if I provide this input variant:
1 879801 . G T . PASS
.. then GATK outputs the following:
1 879801 . G T,A [...] 0/2
So GATK has not called the G/T SNP; instead it has added a second alternate allele, A, and genotyped the sample as 0/2 (i.e. G/A). This only seems to happen at sites that are present in the VCF passed to --alleles.
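To quantify how often this happens, something like the following could be used to compare the GATK output against the input VCF. This is only a minimal sketch, not anything provided by GATK: the function names are made up for illustration, and it assumes plain tab-separated VCF lines, a single sample, and GT as the first FORMAT field.

```python
def load_input_alleles(lines):
    """Map (chrom, pos, ref) -> set of ALT alleles given in the input VCF."""
    alleles = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        f = line.rstrip("\n").split("\t")
        key = (f[0], int(f[1]), f[3])
        alleles.setdefault(key, set()).update(f[4].split(","))
    return alleles


def called_novel_alleles(input_lines, output_lines):
    """Return (chrom, pos, allele, gt) for every called ALT allele
    that was not among the input alleles at that site."""
    given = load_input_alleles(input_lines)
    hits = []
    for line in output_lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        key = (f[0], int(f[1]), f[3])
        if key not in given or len(f) < 10:
            continue  # site not in the input, or no sample column
        alts = f[4].split(",")
        # Assumes GT is the first FORMAT field of the first sample.
        gt = f[9].split(":")[0].replace("|", "/")
        called = {i for i in gt.split("/") if i not in (".", "0")}
        for idx in sorted(called, key=int):
            allele = alts[int(idx) - 1]  # GT indices are 1-based for ALT
            if allele not in given[key]:
                hits.append((f[0], int(f[1]), allele, gt))
    return hits
```

Run on the example above, this would report the A allele at 1:879801 with genotype 0/2, so the number of returned records gives a rough count of sites where GATK went beyond the input alleles.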
So my question is: is it possible to use GATK to genotype only the specific input variants? I am benchmarking genotypers, and it seems a bit "unfair" that GATK can propose alleles other than those provided as input, since this improves its genotyping accuracy when compared against a truth dataset.
Thanks for any help and advice!