GenotypeGVCFs: How can I know which sample failed in joint genotyping?
Answered
Hello,
GATK version: v4.1.9.0
I am using GenotypeGVCFs for joint genotyping, and it ran successfully. I have a big data set that was imported with GenomicsDBImport. I did the genotyping chromosome-wise and didn't get any error from GenomicsDBImport or GenotypeGVCFs.
But at later stages of my analysis I found that a few of my samples didn't have any SNPs for 1 or 2 chromosomes. These same samples, however, have a good genotyping rate on the other chromosomes. What could be the reason? It happened in 16 of 972 samples and in 2 out of 8 chromosomes. On one chromosome it happened for 2 samples, while on the other it happened for 16 samples.
I tried a lot of things to track down the issue but could not find anything. Is there any way to know when a sample fails in GenotypeGVCFs?
After the entire pipeline has run, it is hard to come back and repeat GenotypeGVCFs for some samples, since it takes a long time.
Thanks,
Vinod,
-
Hi Vinod Kumar,
You should get some warning or error if certain chromosomes fail the genotyping step. You can check the GenomicsDB to see if those chromosomes are present. If not, search for the error in the stack trace output of GenotypeGVCFs.
Best,
Genevieve
-
Hi Genevieve Brandt (she/her),
Thanks for the reply. Actually, I am talking about a few samples failing for one particular chromosome, while for the rest of the chromosomes all these samples are okay. When a whole chromosome fails or gives errors, that is easy to see, but what should I do when only a few samples produce no results for one chromosome? Can I find this information somewhere in the GenomicsDB workspace or in the GenotypeGVCFs results?
Thanks,
Vinod,
-
Yes, this information should be available in the stack trace output. If you can find more information about why this happened (from an error or warning) then we can make sure it doesn't happen next time.
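As a complementary check on the GenotypeGVCFs output itself, the sketch below counts, per chromosome and per sample, how many genotypes are called and how many carry a non-reference allele, then lists the samples with zero. It is a minimal illustration, not a GATK tool: the function names `genotype_counts` and `flag_empty` are hypothetical, and it assumes an uncompressed, tab-delimited multi-sample VCF with a `GT` field in FORMAT.

```python
def genotype_counts(lines):
    """Return {chrom: {sample: {"called": n, "non_ref": m}}} from VCF text lines.

    Assumes tab-delimited VCF records with sample columns starting at column 10.
    """
    samples = []
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom = fields[0]
        if chrom not in counts:
            counts[chrom] = {s: {"called": 0, "non_ref": 0} for s in samples}
        fmt_keys = fields[8].split(":")
        if "GT" not in fmt_keys:
            continue
        gt_idx = fmt_keys.index("GT")
        for sample, entry in zip(samples, fields[9:]):
            parts = entry.split(":")
            gt = parts[gt_idx] if gt_idx < len(parts) else "."
            alleles = gt.replace("|", "/").split("/")
            if any(a != "." for a in alleles):
                counts[chrom][sample]["called"] += 1
            if any(a not in (".", "0") for a in alleles):
                counts[chrom][sample]["non_ref"] += 1
    return counts


def flag_empty(counts, key="called"):
    """Return {chrom: [samples whose count for `key` is zero]}."""
    flagged = {}
    for chrom, per_sample in counts.items():
        empty = [s for s, c in per_sample.items() if c[key] == 0]
        if empty:
            flagged[chrom] = empty
    return flagged


# Example usage (the file name is a placeholder):
# with open("chr1.genotyped.vcf") as handle:
#     print(flag_empty(genotype_counts(handle)))
```

Running this on each per-chromosome VCF right after GenotypeGVCFs would surface empty samples before the rest of the pipeline runs. For bgzipped output, the file could be opened with `gzip.open(path, "rt")` instead.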
-
Hi Genevieve Brandt (she/her),
I couldn't find the reason why only 16 out of 973 samples failed on one particular chromosome. One weird thing is that the files in the temp directory for this particular chromosome were not deleted during GenomicsDBImport. However, everything looks okay in the error file. I am now updating the GenomicsDB again with these 16 samples, after first deleting them from the callset.json file. I will see whether that solves the problem.
In GenotypeGVCFs, I can see that the samples are not in the stack trace output, but I have no idea whether these samples were successfully imported into GenomicsDB or not. I am updating the old DB and will see the results.
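One way to check whether samples are at least registered in a workspace is to compare the expected sample list against callset.json. The sketch below assumes that callset.json has a top-level "callsets" key holding either a list of entries with a "sample_name" field or a mapping keyed by sample name. That layout is an assumption and should be verified against the actual file. The function name `missing_from_callset` is hypothetical. Being listed in callset.json does not by itself prove that data was written for every interval.

```python
import json


def missing_from_callset(callset, expected_samples):
    """Return the expected sample names that are absent from a parsed callset.json.

    `callset` is the dict obtained from json.load(); the "callsets" layout
    handled here (list of {"sample_name": ...} or dict keyed by name) is assumed.
    """
    entries = callset.get("callsets", [])
    if isinstance(entries, dict):
        present = set(entries)  # assumed layout: keys are sample names
    else:
        present = {
            e["sample_name"]
            for e in entries
            if isinstance(e, dict) and "sample_name" in e
        }
    return sorted(set(expected_samples) - present)


# Example usage (workspace path and sample names are placeholders):
# with open("chr1_workspace/callset.json") as handle:
#     print(missing_from_callset(json.load(handle), ["sampleA", "sampleB"]))
```

With one workspace per chromosome, the same check could be repeated for each workspace to see whether the affected samples are missing only from the problematic ones.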
Thanks,
-
Hi Vinod Kumar,
Sounds good, please let me know if it solves the problem. In the meantime, please share the stack trace from GenomicsDBImport from one of the samples that failed if you want me to take a look.
Best,
Genevieve