I'm currently creating my own dataset of known variants, since no WGS-derived one exists for my study species (bumblebee). However, the joint genotyping step with GenotypeGVCFs is taking a very long time (16 days!), and I would like to speed it up. I have read on this forum about multithreading, or parallelising the job by running one chromosome at a time, but I don't know how to write that code.
My current code is:
gatk --java-options "-Xmx4g" GenotypeGVCFs \
-R /proj/snic2020-16-43/ref/Bombus_terrestris.Bter_1.0.dna.toplevel.fa \
-V gendb://WorkspaceDBImport \
I would appreciate any help I can get with running this job faster.
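For reference, the "one chromosome at a time" idea mentioned above could be sketched roughly as follows. This is only a sketch under assumptions, not a tested pipeline: it assumes a FASTA index (.fai, as produced by samtools faidx) exists for the reference, that the GenomicsDB workspace covers every contig listed in it, and that per_contig_vcfs is a hypothetical output directory. The script only prints one GenotypeGVCFs command per contig, each restricted with -L and writing to its own -O file; it does not run GATK itself.

```shell
#!/usr/bin/env bash
# Sketch: build one GenotypeGVCFs command per contig, each restricted with -L.
set -euo pipefail

REF=/proj/snic2020-16-43/ref/Bombus_terrestris.Bter_1.0.dna.toplevel.fa
WORKSPACE=gendb://WorkspaceDBImport
OUTDIR=per_contig_vcfs   # hypothetical output directory (assumption)

# Print the contig names, i.e. column 1 of a FASTA index (.fai) file.
list_contigs() {
    cut -f1 "$1"
}

# Print the GenotypeGVCFs command for a single contig.
genotype_cmd() {
    local contig=$1
    printf 'gatk --java-options "-Xmx4g" GenotypeGVCFs -R %s -V %s -L %s -O %s/%s.vcf.gz\n' \
        "$REF" "$WORKSPACE" "$contig" "$OUTDIR" "$contig"
}

# Print one command per contig listed in the given .fai file.
write_jobs() {
    local fai=$1
    list_contigs "$fai" | while read -r contig; do
        genotype_cmd "$contig"
    done
}

# If a .fai path is given as the first argument, print the commands.
if [ "$#" -ge 1 ]; then
    write_jobs "$1"
fi
```

Saved under a hypothetical name such as genotype_per_contig.sh, it could be called with the .fai path as its only argument and the output redirected to a file. Each printed line could then be submitted as a separate cluster job or run a few at a time, and the per-contig VCFs combined afterwards with a tool such as MergeVcfs. For an assembly with many small unplaced scaffolds, grouping several contigs per job is probably more practical than one job each.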