Why do I see different ploidy in the GT field within the same sample after GenotypeGVCFs?
Hi, as the title says, I am writing about the GT field in my genotyped VCF showing different ploidy when the allele call is missing. I scanned the genotyped VCF to check for patterns and their distribution: 97.5% of my samples have at least one site where the GT field is '.' instead of './.' or '.|.'.
My guess is that, at the genotyping step, sites not found in a sample's gVCF are filled with dots/zeros, though I cannot confirm this from any documentation. Is it expected that the ploidy of these calls does not match the ploidy seen elsewhere in the same sample? If these are calls without enough support for either ref or alt, would you recommend imputing them as wild type?
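For anyone who wants to repeat this kind of scan, here is a minimal sketch in plain Python (standard library only). It is not the script used for the numbers above. The function names `open_vcf` and `tally_missing_gt` and the tiny `example` VCF are made up for illustration, and the sketch assumes a tab-delimited VCF with GT listed in the FORMAT column.

```python
import gzip
from collections import Counter, defaultdict


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def tally_missing_gt(lines):
    """Count, per sample, how each fully missing GT is spelled ('.', './.', '.|.', ...)."""
    samples = []
    counts = defaultdict(Counter)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if len(fields) < 10:
            continue  # sites-only record, no genotypes to inspect
        fmt_keys = fields[8].split(":")
        if "GT" not in fmt_keys:
            continue
        gt_idx = fmt_keys.index("GT")
        for sample, call in zip(samples, fields[9:]):
            parts = call.split(":")
            gt = parts[gt_idx] if gt_idx < len(parts) else "."
            alleles = gt.replace("|", "/").split("/")
            if all(a == "." for a in alleles):
                counts[sample][gt] += 1  # key is the GT string exactly as written
    return counts


# Made-up records, only to show the three spellings of a missing call.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "20\t100\t.\tA\tG\t50\t.\t.\tGT:DP\t0/1:12\t.:.",
    "20\t200\t.\tC\tT\t50\t.\t.\tGT:DP\t./.:0\t.|.:0",
]

for sample, forms in tally_missing_gt(example).items():
    print(sample, dict(forms))
```

On a real file this could be called as `tally_missing_gt(open_vcf("my.genotyped.vcf.gz"))`, where the file name stands in for the output placeholder used in the commands below. A sample whose counter has more than one key is one where missing calls are written with more than one ploidy.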
I will be happy to get any feedback regarding this - thanks in advance!
Emilia
GATK version used:
The Genome Analysis Toolkit (GATK) v4.1.4.1
HTSJDK Version: 2.21.0
Picard Version: 2.21.2
Commands:
# Single call per sample
gatk HaplotypeCaller -I <input_bam> -R <ref.fa> -O <output.g.vcf.gz> -ERC GVCF
# Create DB
gatk GenomicsDBImport --genomicsdb-update-workspace-path <prefix_genomics_db> -L 20 --sample-name-map <map_file> --reader-threads <threads> --batch-size 500
# Genotype VCFs
gatk --java-options -Xmx40g GenotypeGVCFs -R <ref.fa> -V <prefix_genomics_db> -O <my.genotyped.vcf.gz>
-
Hi Emilia Mańko,
What ploidy are you working with? Are you studying a sample that would have some differences in ploidy?
You can get more information about the realigned reads, and why you might see certain results in your VCF, by running HaplotypeCaller with the -bamout option and viewing the reads in IGV.
Hope this helps,
Genevieve
-
Hi,
I came across this as well; my study organism is diploid, and no changes in ploidy are expected.
However, I didn't use GenomicsDBImport, just standard CombineGVCFs.
Do you have any insights into this?
Cheers,
-
Hi ABours,
There is some information on Biostars regarding this: https://www.biostars.org/p/43538/
You can look into the realigned reads with the recommendation I gave above, using -bamout. If you find that your data does not follow any of these explanations and you want to put in a report of an abnormal result, please see this post for what information we require: https://gatk.broadinstitute.org/hc/en-us/articles/360053424571-How-to-Write-a-Post
Best,
Genevieve