Is it okay to run multiple instances of BQSR with the -L argument in order to increase the speed of the workflow?
Hi!
I would like to run multiple instances of a BQSR workflow in parallel to speed it up. I already do this for the Mutect2 pipeline, for example, by passing a different chromosome to each instance with the -L argument, and I was wondering whether the same approach is okay for BQSR. Wouldn't this affect the statistics?
Thanks!
-
Hi Jenifer,
Yes, this is possible, and it is how we run BQSR in our production pipelines.
You can run BaseRecalibrator separately for each chromosome (with -L), and then you must run GatherBQSRReports to merge the separate recalibration tables into one. Then run ApplyBQSR over your entire dataset with the merged table, so that the output is not affected.
You can read more about this method here under the BQSR section: https://gatk.broadinstitute.org/hc/en-us/articles/360035535912-Data-pre-processing-for-variant-discovery
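As a rough sketch of the three steps described above (scatter BaseRecalibrator with -L, merge with GatherBQSRReports, then run ApplyBQSR on the whole dataset), something like the script below could work. All file names (reference.fasta, sample.bam, known_sites.vcf.gz, the recal.*.table outputs) and the chromosome list are placeholders, and the DRY_RUN switch with its run helper is a hypothetical convenience that only prints the commands unless DRY_RUN=0 is set. It assumes gatk (GATK4) is on your PATH and that you supply your own known-sites resource.

```shell
#!/bin/sh
set -eu

# Placeholder inputs (assumptions): replace with your own files.
REF=reference.fasta
BAM=sample.bam
KNOWN_SITES=known_sites.vcf.gz
CHROMS="chr1 chr2 chr3"   # extend to every contig you want to scatter over

# Hypothetical helper: print commands by default, execute when DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1. Scatter: one BaseRecalibrator job per chromosome, started in the background.
gather_inputs=""
pids=""
for chr in $CHROMS; do
  run gatk BaseRecalibrator \
    -R "$REF" -I "$BAM" --known-sites "$KNOWN_SITES" \
    -L "$chr" -O "recal.$chr.table" &
  pids="$pids $!"
  gather_inputs="$gather_inputs -I recal.$chr.table"
done

# Wait for every job; with set -e the script stops if any of them failed.
for pid in $pids; do
  wait "$pid"
done

# 2. Gather: merge the per-chromosome tables into a single report.
#    $gather_inputs is left unquoted on purpose so it splits into separate arguments.
run gatk GatherBQSRReports $gather_inputs -O recal.merged.table

# 3. Apply: recalibrate the whole BAM using the merged table.
run gatk ApplyBQSR \
  -R "$REF" -I "$BAM" \
  --bqsr-recal-file recal.merged.table \
  -O sample.recal.bam
```

Running it as-is only prints the commands, which lets you check them first; DRY_RUN=0 executes them. On a cluster you would more likely submit each BaseRecalibrator call as its own job rather than backgrounding them on one machine.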
Hope this helps!
Genevieve