Genotyping precalled sites/variants
I am looking for a way to genotype a specific set of variants/sites in GATK 4.1.4.1. In previous versions it was possible to provide a sites-only VCF (back then to UnifiedGenotyper) to have all of these sites genotyped. However, I cannot find this option in GenotypeGVCFs.
Background: I have populations/species with highly stratified allele frequencies, and I want to genotype these separately to not bias the genotypes by the allele frequencies in the entire set of individuals. To still get the same sites called, however, I have to provide a list of sites/variants.
Thanks a lot for any insights!
Reto
-
Hi,
The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.
Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.
We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.
For context, check out our support policy.
-
Hi Reto,
You can actually give GenotypeGVCFs a list of intervals to operate on with the -L option. The list can contain single sites. I haven't tried it, but otherwise you could take a region of 100 bp before and after each variant you definitely want to call.
For example, you can format it like this and save it as a list file:
For single sites:
chr_5:100
chr_6:15890
chr_7:4389
For regions:
chr_5:50-250
chr_6:15790-15990
I haven't tried this myself, but I think it should work. Also, if a site isn't variant in your new VCF it will not appear, unless you add --include-non-variant-sites, which makes sure that all callable sites are written out.
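If you already have a sites-only VCF, something like the sketch below could build that list for you and put the GenotypeGVCFs call together. I have not run this against GATK itself, so treat it as a starting point: the helper names and all file names (ref.fasta, pop1.g.vcf.gz, pop1.vcf.gz, sites.list) are made up for illustration, and it assumes the VCF is tab-delimited with CHROM and POS in the first two columns. Only -R, -V, -O, -L and --include-non-variant-sites are actual GenotypeGVCFs arguments here.

```python
import shlex


def vcf_to_intervals(vcf_lines, pad=0):
    """Turn sites-only VCF records into interval strings for -L.

    pad=0 gives single sites (chr_5:100); pad=100 gives a window of
    100 bp on each side (chr_6:15790-15990). Header lines are skipped.
    """
    intervals = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t")[:2]
        pos = int(pos)
        if pad == 0:
            intervals.append(f"{chrom}:{pos}")
        else:
            start = max(1, pos - pad)  # coordinates are 1-based
            intervals.append(f"{chrom}:{start}-{pos + pad}")
    return intervals


def genotype_command(reference, gvcf, output, interval_file, all_sites=True):
    """Assemble (but do not run) a GenotypeGVCFs command line."""
    cmd = [
        "gatk", "GenotypeGVCFs",
        "-R", reference,
        "-V", gvcf,
        "-O", output,
        "-L", interval_file,
    ]
    if all_sites:
        # Keep requested sites even when they are not variant in this population.
        cmd.append("--include-non-variant-sites")
    return cmd


if __name__ == "__main__":
    # Hypothetical records standing in for a real sites-only VCF.
    records = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT",
        "chr_5\t100\t.\tA\tG",
        "chr_6\t15890\t.\tC\tT",
        "chr_7\t4389\t.\tG\tA",
    ]
    # These lines are what you would save as sites.list.
    print("\n".join(vcf_to_intervals(records)))
    print(shlex.join(genotype_command(
        "ref.fasta", "pop1.g.vcf.gz", "pop1.vcf.gz", "sites.list")))
```

You would run the same command once per population with the same list file, so every population gets genotyped at the same sites.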
I hope this can help you :)
Cheers,
-