Error when trying to generate the second recalibration table for AnalyzeCovariates
Dear GATK Team,
In previous GATK4 versions, I understand there was a -bqsr option to use with BaseRecalibrator so that a second recalibration table could be generated and then submitted to AnalyzeCovariates. However, it appears this option is no longer available, as I receive the following error with GATK version 4.1.9.0:
A USER ERROR has occurred: -bqsr is not a recognized option.
The command used:
gatk BaseRecalibrator \
--input input.bam \
-R ucsc.hg19.fasta \
--known-sites dbsnp_138.hg19.vcf \
--known-sites Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
--known-sites 1000G_phase1.indels.hg19.sites.vcf \
-bqsr sample.recal_data.grp \
--output sample.recal_data_2.grp \
--tmp-dir $TMPDIR
Please could you confirm the current method used to generate the second recalibration table with GATK 4.1.9.0?
Unfortunately I could not identify the current method, including in the Base Quality Score Recalibration (BQSR) documentation.
Thank you for your time and help.
-
Hi ISmolicz,
I looked through all of our GATK4 tool docs and could not find any mention of this option. The only information I could find was on our legacy site with much older versions of GATK that we do not support anymore.
However, I don't think you need that option to run AnalyzeCovariates. You should be able to input any recalibration table to get the plots.
Best,
Genevieve
-
Thank you for your reply Genevieve Brandt.
I would like to submit both a first and second pass recalibration table to AnalyzeCovariates. If the option mentioned above is unavailable, how can the second pass recalibration table be generated? Is it by running BaseRecalibrator on a BAM that has already been recalibrated once with BaseRecalibrator and ApplyBQSR?
Thank you again.
-
Hi ISmolicz,
I was able to find information about the first and second pass recalibration on our legacy site. I'll rephrase it here so people can more easily find it:
Base Recalibration has two steps in GATK4:
1) First pass of the base quality score recalibration. Generates a recalibration table based on various covariates. The default covariates are read group, reported quality score, machine cycle, and nucleotide context. For more info: https://gatk.broadinstitute.org/hc/en-us/articles/360050815072-BaseRecalibrator
2) Second pass: apply the base quality score recalibration with ApplyBQSR. This tool performs the second pass in a two-stage process called Base Quality Score Recalibration (BQSR). Specifically, it recalibrates the base qualities of the input reads based on the recalibration table produced by the BaseRecalibrator tool, and outputs a recalibrated BAM or CRAM file. So, the second pass is the ApplyBQSR step.
For more info please take a look at this detailed document: https://software.broadinstitute.org/gatk/documentation/article?id=11081
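As a minimal sketch of the two steps above, the commands could look like the following. The input, reference, and known-sites file names are reused from the command at the top of this thread; sample.recal.bam is a hypothetical output name, and the run helper is only an illustration that prints each command unless RUN=1 is set.

```shell
#!/bin/sh
set -eu

# File names reused from the command earlier in this thread.
INPUT_BAM="input.bam"
REFERENCE="ucsc.hg19.fasta"
KNOWN_SITES="dbsnp_138.hg19.vcf"
RECAL_TABLE="sample.recal_data.grp"
# Hypothetical name for the recalibrated BAM (not from the thread).
RECAL_BAM="sample.recal.bam"

# Illustrative helper: print the command by default, execute it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Step 1: BaseRecalibrator builds the recalibration table from the input reads.
run gatk BaseRecalibrator \
  --input "$INPUT_BAM" \
  -R "$REFERENCE" \
  --known-sites "$KNOWN_SITES" \
  --output "$RECAL_TABLE"

# Step 2: ApplyBQSR uses that table to write a recalibrated BAM.
run gatk ApplyBQSR \
  --input "$INPUT_BAM" \
  -R "$REFERENCE" \
  --bqsr-recal-file "$RECAL_TABLE" \
  --output "$RECAL_BAM"
```

Note that -bqsr is the short form of --bqsr-recal-file on ApplyBQSR, which is where the table is consumed; BaseRecalibrator itself only writes a table.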
Best,
Genevieve
-
Hi Genevieve Brandt,
Thank you for your reply and for the additional information.
AnalyzeCovariates allows one to submit more than one recalibration table and states:
Second pass recalibration tables results from the application of org.broadinstitute.hellbender.transformers.BQSRReadTransformer on the alignment recalibrated using the first pass tables.
ApplyBQSR does not generate a second recalibration table and therefore, how can the second pass recalibration table be generated using a tool from the Tool Index? What would be the equivalent tool for BQSRReadTransformer?
I would like to compare BQSR tables pre- and post-recalibration, similar to what is described in the following video, although I understand this describes GATK4-beta: https://www.youtube.com/watch?v=Zd58XBlBFk4
Thank you again for your time and help.
-
Hi ISmolicz,
We no longer support GATK3 or GATK4 Beta, so unfortunately I do not have any other information regarding this question. You can search our legacy forum site for information about older versions of GATK.
Thank you,
Genevieve
-
Thank you for your reply.
I am a little confused, as AnalyzeCovariates in GATK v4.1.9.0 says that it can accept a -before and an -after recalibration table and discusses a second pass recalibration table. Is this not the case, and can a post-recalibration table not be generated to compare with the pre-recalibration one?
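For illustration, the comparison in question would look roughly like the sketch below. It assumes the post-recalibration table comes from rerunning BaseRecalibrator on the BAM written by ApplyBQSR, which is the approach proposed earlier in this thread and is not confirmed here. sample.recal.bam and sample.AnalyzeCovariates.pdf are hypothetical file names, and the run helper only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
set -eu

# Table names taken from the command at the top of this thread.
FIRST_TABLE="sample.recal_data.grp"
SECOND_TABLE="sample.recal_data_2.grp"
# Hypothetical names (not from the thread).
RECAL_BAM="sample.recal.bam"
PLOTS_PDF="sample.AnalyzeCovariates.pdf"

# Illustrative helper: print the command by default, execute it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Assumed step: build a second table from the already recalibrated BAM.
run gatk BaseRecalibrator \
  --input "$RECAL_BAM" \
  -R ucsc.hg19.fasta \
  --known-sites dbsnp_138.hg19.vcf \
  --output "$SECOND_TABLE"

# Compare the pre- and post-recalibration tables in one set of plots.
run gatk AnalyzeCovariates \
  -before "$FIRST_TABLE" \
  -after "$SECOND_TABLE" \
  -plots "$PLOTS_PDF"
```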
Thank you again.
-
I will put in a documentation request for my team to update the AnalyzeCovariates documentation. I agree, it does not make sense with the current BQSR method. I will try to get clarity to determine what the recommended method is for GATK4 and get back to you.
-
Thank you Genevieve Brandt - I am grateful for your help.
I look forward to your reply.
-
Hi ISmolicz,
We did not have an easy solution to this request and so I have created a ticket where our team will discuss the GATK4 method. You can follow along here: https://github.com/broadinstitute/gatk/issues/7096
-
Thank you for letting me know and for setting up this ticket. I will ensure I check the relevant page for updates.
Kind regards.