I'm facing a puzzling issue with my cohort model, and I was hoping to get some insights from the community.
I've built a cohort model for Hg38 from 30 samples that were sequenced with the same kit. However, the resulting calls do not match the metadata as expected. When I ran the same analysis for Hg19, following the identical procedure from the GATK tutorial "(How to) Call common and rare germline copy number variants" (broadinstitute.org), the results matched the metadata.
For more context, all samples are WES data, 60 in total: 30 for Hg19 (sequenced in a single run) and 30 for Hg38 (sequenced across multiple runs but with the same capture kit). The samples were collected from different locations, and I have metadata for the case samples that lets me compare results across callers, specifically against DRAGEN.
While GATK gCNV for Hg19 yielded results matching the metadata, the same cannot be said for Hg38. I'm aware that there might be algorithmic differences between the two CNV callers, but I'm struggling to pinpoint the exact issue.
GATK version: 184.108.40.206
BED file: Twist_Comprehensive_Exome_Covered_Targets_hg38/hg19.bed
Has anyone encountered a similar situation or can offer some guidance on how to troubleshoot this? I appreciate any assistance or insights you can provide!