The same number of SNPS and INDELS identified for all samples after joint genotyping
Answered
Dear GATK team, I performed joint variant calling on 46 whole-exome samples. After running CombineGVCFs and GenotypeGVCFs, I noticed that the number of variants was the same for every sample. Is this an expected result, or is it due to something else?
I used GATK 4.1.8.1
-
But do you see different variant types between the samples?
-
Priyadarshini Thirunavukkarasu, the same number of SNPs and indels was identified for each sample.
I hope this answers your question.
-
Could you give an example?
-
Here is an example.
I used bcftools to count the number of variants identified:
samples=(100N 100T 10T)
for sample in "${samples[@]}"; do echo "$sample" "$(bcftools view -s "$sample" "$vcf" | grep -v -c '^#')"; done
Result
100N 476154
100T 476154
10T 476154
-
Thank you. Do these samples have the same type of variant at a given position? For example, is it a missense mutation in all the samples at a given position, or does the type of mutation differ?
-
Priyadarshini Thirunavukkarasu
I am showing results for just three samples, but it is the same for the others.
I also checked one position; here is the output.
COUNTS OF SNPS AND INDELS
sample  indels  snps    Total
100N    47757   429534  477291
100T    47757   429534  477291
10T     47757   429534  477291
VARIANTS AT A POSITION
sample  ID          POS    REF  ALT
100N    rs41304577  93425  A    G
100T    rs41304577  93425  A    G
10T     rs41304577  93425  A    G
-
Did you check the genotype, for example whether it is homozygous for the variant allele at a given position? I ran a different pipeline on my samples, and it shows the same variant allele at a given position, but the genotype for that allele was not the same across samples.
-
OK. Thanks
But it's a bit confusing. When I run GenotypeGVCFs on the individual VCFs generated using HaplotypeCaller, the number of variants is different for each sample.
Sample  No. of Variants
100N    108353
100T    104014
10T     106707
I am going to try looking into the genotypes in the multi-sample VCF file.
-
When you do joint genotyping, variant calling is done per site across the samples. So the genotypes can differ between these samples for the same SNP at a given position.
-
Thanks for the clarification Priyadarshini Thirunavukkarasu.
What I would like to know is: when doing joint variant calling, does GATK look only for variants that are common to all the samples?
-
Genevieve Brandt (she/her), can you help clarify this question?
Thanks
-
Hi Vincent Appiah,
Yes, Priyadarshini Thirunavukkarasu is correct here. When you do joint genotyping, you first get a GVCF, which has information about all the sites in the genome. So when you call variants with GenotypeGVCFs, there will be a variant line whenever any of the samples has a variant at that site. You can determine which samples have the variant and which do not in the genotype fields.
GATK does not look only for variants that are common to all the samples; it can also call variants that are present in only one sample.
You can read more about joint calling here: https://gatk.broadinstitute.org/hc/en-us/articles/360035890431-The-logic-of-joint-calling-for-germline-short-variants
Best,
Genevieve
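As a rough illustration of the point above (a variant line is written whenever any sample has a variant at the site, and the genotype fields show which samples carry it), the sketch below counts, for each sample, only the records where that sample's GT contains a non-reference allele. It is a minimal sketch and not a GATK or bcftools feature: the function name count_nonref_per_sample is made up for this example, and it assumes an uncompressed, tab-delimited VCF in which GT is the first FORMAT subfield.

```shell
# Print "<sample><TAB><count>" for every sample in a multi-sample VCF, where
# count is the number of records at which that sample's genotype (GT) contains
# at least one non-reference allele.
# Assumptions: $1 is an uncompressed VCF; GT is the first FORMAT subfield.
count_nonref_per_sample() {
  awk -F'\t' '
    /^##/ { next }                       # skip meta-information lines
    /^#CHROM/ {                          # header line: samples start at column 10
      for (i = 10; i <= NF; i++) name[i] = $i
      n = NF
      next
    }
    {
      for (i = 10; i <= NF; i++) {
        split($i, f, ":")                # f[1] is the GT subfield, e.g. 0/1
        # Any allele index of 1 or more contains a non-zero digit;
        # 0/0 (hom-ref) and ./. (no call) do not.
        if (f[1] ~ /[1-9]/) count[i]++
      }
    }
    END {
      for (i = 10; i <= n; i++) printf "%s\t%d\n", name[i], count[i]
    }
  ' "$1"
}
```

For example, `count_nonref_per_sample joint.vcf` (a hypothetical file name) prints one line per sample. Those counts would be expected to differ between samples even though the number of records in the joint VCF is the same for all of them.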