I need to do some variant calling on 50 diploid fish genomes sequenced to ~10X. The reference is from the same species but a different population. In terms of genetic distance, the MASH distance between the samples and the reference ranges from 0.006 to 0.007, whereas the distance between samples is around 0.002. For comparison, the distance between any two humans is supposed to be around 0.001, so my samples diverge quite a bit more from the reference than humans do from theirs. But they aren't as divergent as what you might get from bacterial isolates. For the latter, I know that GATK does not work very well and the community uses other tools like SNIPPY, which is based on a bwa mem/freebayes pipeline. I was wondering whether GATK would perform well in my case or whether there are better alternatives. How divergent does the reference need to be from the samples before GATK starts underperforming other callers?

Thanks,
Robert