Hello! I am new to GATK and am wondering which of the Best Practices Workflows on this page https://gatk.broadinstitute.org/hc/en-us/articles/360035535932-Germline-short-variant-discovery-SNPs-Indels- I should use.
I have BAM files for 12 trios and want to call their variants. However, I read that the Main steps for Germline Cohort Data workflow is for cohort data and the Main steps for Germline Single-Sample Data workflow is for small samples. I am leaning toward the latter, since a trio has only 3 subjects and I would consider that a small sample. But I only think this because I do not know what "cohort data" means. The definition of cohort that I learned is "collection of people who share a characteristic over time." However, not everyone in each trio is affected. So I am unsure whether a GATK cohort means just a large sample or a large sample in which everyone is affected. Also, what is the minimum sample size for using Main steps for Germline Cohort Data?
I cannot find a definition of what GATK means by cohort data, so I am unsure whether I should use Main steps for Germline Cohort Data or Main steps for Germline Single-Sample Data. Which one should I use?