Should joint-calling be performed for the control and the disease group separately?
Dear GATK Team,
I have about 50 disease samples and 30 control samples. Should I run joint calling on the two groups separately, or treat them as one group?
Thanks,
Jiayi
-
Hi Jiayi,
if by joint calling you mean multi-sample variant calling with Mutect2, that is done per patient, not per cohort. Multi-sample calling pools evidence for a variant across a patient's samples and therefore has more power to detect variants in that patient.
Please read the best practices tutorial.
Best,
Philipp
-
Hi Philipp,
Thanks for your reply.
Actually, I am using HaplotypeCaller, and I am going to try GenotypeGVCFs. Is this a good choice? And should I run joint calling on the disease and control groups separately?
Best,
Jiayi
-
Hi Jiayi,
are you interested in germline variants or somatic variants? For the former, use HaplotypeCaller; for the latter, Mutect2.
Are the disease and control samples patient-matched? If yes, you can use them as tumor-normal pairs in Mutect2 to filter out germline variants that are also present in the controls.
Joint calling should only ever be done for multiple samples coming from the same patient. EDIT: this is certainly true for somatic calling. After reading the germline calling documentation again, I see that you can run it in cohort mode on multiple patients. If you are interested in which germline variants may be responsible for the disease, then to maximize power I'd run it in two batches: the case batch and the control batch. Maybe someone from the GATK team who is more familiar with germline calling could elaborate on that?
Best,
Philipp
-
Thanks so much for posting your insight here, Philipp Hähnel! I would recommend that Jiayi Zhao run the 50 disease and 30 control samples together, because joint calling across all samples gives the workflow more statistical power to make better calls. You will get a single joint-called VCF. If you want the calls separated by group, you can subset the VCF by sample with SelectVariants.
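As a rough sketch of the steps described above (not an official GATK recipe), the commands could be assembled like this in Python. It assumes each sample already has a GVCF produced by HaplotypeCaller with `-ERC GVCF`. The helper function, file names, and sample names are hypothetical, and the script only builds the command lines rather than running them. CombineGVCFs is used here to merge the per-sample GVCFs; GenomicsDBImport is an alternative for that step.

```python
from typing import Dict, List


def joint_calling_commands(
    reference: str,
    gvcfs: Dict[str, str],
    groups: Dict[str, List[str]],
) -> List[List[str]]:
    """Build (but do not run) GATK4 command lines for joint calling.

    reference: path to the reference FASTA (hypothetical path).
    gvcfs:     sample name -> per-sample GVCF from HaplotypeCaller -ERC GVCF.
    groups:    group name -> sample names to extract after joint calling.
    """
    commands: List[List[str]] = []

    # 1. Merge all per-sample GVCFs (disease and control together).
    combine = ["gatk", "CombineGVCFs", "-R", reference]
    for sample in sorted(gvcfs):
        combine += ["--variant", gvcfs[sample]]
    combine += ["-O", "cohort.g.vcf.gz"]
    commands.append(combine)

    # 2. Joint genotyping over the whole cohort.
    commands.append(
        ["gatk", "GenotypeGVCFs", "-R", reference,
         "-V", "cohort.g.vcf.gz", "-O", "cohort.vcf.gz"]
    )

    # 3. Split the joint-called VCF by group with SelectVariants.
    for group, samples in groups.items():
        select = ["gatk", "SelectVariants", "-R", reference,
                  "-V", "cohort.vcf.gz"]
        for sample in samples:
            select += ["-sn", sample]
        select += ["-O", f"{group}.vcf.gz"]
        commands.append(select)

    return commands


if __name__ == "__main__":
    # Tiny illustrative cohort; real input would list all 80 samples.
    example = joint_calling_commands(
        "ref.fasta",
        {"D01": "D01.g.vcf.gz", "D02": "D02.g.vcf.gz", "C01": "C01.g.vcf.gz"},
        {"disease": ["D01", "D02"], "control": ["C01"]},
    )
    for cmd in example:
        print(" ".join(cmd))
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files. Note that a per-group VCF made this way still lists every site called in the cohort, including sites where that group's samples are all homozygous reference.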