Clarification: SNP calling best practices for multiple samples (DRAGEN?)
I have WGS data for a group of unrelated individuals from the same population, and I want to make a VCF file with SNPs. When I look at the GATK Best Practices workflows, I see that the workflow for Germline short variant discovery (SNPs and Indels) makes no mention of DRAGEN-GATK. I also see that the Broad Institute now considers DRAGEN mode to be best practice for "single-sample short variant discovery". There is also an article saying that DRAGMAP outperforms BWA. From all of this, my guess is that DRAGEN mode is now considered best practice for a GATK workflow for germline short variant discovery using multiple individuals (are samples different from individuals in this context?). If using GATK, should I be using DRAGEN mode to make my VCF with multiple individuals?
Also, the Best Practices workflow used to have a nice graphical, step-by-step layout for SNP calling with the kind of data I'm using. I remember there being multiple rounds of things (e.g. running BaseRecalibrator twice) and the graphic being helpful. Is there an up-to-date equivalent of something like that which the Broad Institute provides?
Thanks,
Tim
-
Hi Tim DeLory,
It is true that the GATK-DRAGEN workflows share functionality with DRAGEN, and their functional equivalence is already described in our documentation.
Certain parts of the DRAGEN technology have been ported into GATK, while certain GATK improvements have been added to DRAGEN 3.4.12 and onwards. You can use GATK-DRAGEN if you prefer it. If, for example, you want to combine old data with new data from DRAGEN, it may be preferable to run the GATK-DRAGEN workflows so that all data are compatible and comparable with each other. Keep in mind that our GATK-DRAGEN implementation is only updated to version 3.7.8 of hardware DRAGEN, whereas the current version of DRAGEN is well ahead of that.
If you want to use GATK-DRAGEN workflows for multi-sample genotyping, there are certain parameters you need to pay attention to, such as not using PDHMM in HaplotypeCaller. The current PDHMM implementation is not compatible with GVCF output, so if you are interested in using the dragen378concordance mode of HaplotypeCaller, you need to disable PDHMM. However, this makes it slightly less sensitive when calling variants in hard-to-call regions.
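As a rough illustration of the multi-sample route described above, here is a minimal Python sketch that only builds (and prints) the command lines for per-sample GVCF calling in DRAGEN mode followed by joint genotyping. It is a sketch under assumptions, not an official workflow: the file names are hypothetical, the helper functions are made up for this example, and the flags used (`--dragen-mode`, `-ERC GVCF`, `--dragstr-params-path`) should be checked against `gatk HaplotypeCaller --help` for your installed version. No PDHMM-enabling option is passed, on the assumption that PDHMM stays off unless explicitly requested; that is worth confirming for your GATK release.

```python
import shlex
from typing import Dict, List, Optional


def haplotypecaller_gvcf_cmd(reference: str, bam: str, out_gvcf: str,
                             dragstr_params: Optional[str] = None) -> List[str]:
    """Per-sample HaplotypeCaller call in DRAGEN mode with GVCF output."""
    cmd = ["gatk", "HaplotypeCaller",
           "-R", reference, "-I", bam, "-O", out_gvcf,
           "-ERC", "GVCF",      # GVCF output is needed for joint genotyping
           "--dragen-mode"]     # no PDHMM option is added (assumed off by default)
    if dragstr_params:
        # Optional DRAGstr model file, if one was calibrated for this sample
        cmd += ["--dragstr-params-path", dragstr_params]
    return cmd


def combine_gvcfs_cmd(reference: str, gvcfs: List[str], out_gvcf: str) -> List[str]:
    """Merge the per-sample GVCFs into one cohort GVCF."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd + ["-O", out_gvcf]


def genotype_gvcfs_cmd(reference: str, combined: str, out_vcf: str) -> List[str]:
    """Joint-genotype the cohort GVCF into a multi-sample VCF."""
    return ["gatk", "GenotypeGVCFs", "-R", reference, "-V", combined, "-O", out_vcf]


def joint_calling_plan(reference: str, bams: Dict[str, str],
                       out_vcf: str = "cohort.vcf.gz") -> List[List[str]]:
    """Ordered commands: one HaplotypeCaller per sample, then combine, then genotype."""
    plan: List[List[str]] = []
    gvcfs: List[str] = []
    for sample in sorted(bams):
        gvcf = f"{sample}.g.vcf.gz"
        gvcfs.append(gvcf)
        plan.append(haplotypecaller_gvcf_cmd(reference, bams[sample], gvcf))
    plan.append(combine_gvcfs_cmd(reference, gvcfs, "cohort.g.vcf.gz"))
    plan.append(genotype_gvcfs_cmd(reference, "cohort.g.vcf.gz", out_vcf))
    return plan


# Hypothetical inputs, for illustration only
plan = joint_calling_plan("ref.fasta", {"sampleA": "sampleA.bam", "sampleB": "sampleB.bam"})
for command in plan:
    print(shlex.join(command))
```

The sketch deliberately contains no BQSR step, in line with GATK-DRAGEN workflows not using it. For larger cohorts, GenomicsDBImport is commonly used in place of CombineGVCFs.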
Our former documentation is not readily available from our current portal. We do keep a git repository for the old documentation; however, it may not be 100% applicable to the current recommendations for DRAGEN (BQSR is not used in GATK-DRAGEN workflows).
I hope this helps.
Regards.
-
Thank you for the clarification! This was helpful.