Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

options used during base recalibration


6 comments

  • Bhanu Gandham

    Hi janick mathys

    --use-original-qualities is for a sample that has already been processed by BQSR: when we re-run BQSR on it, we want the tool to use the original qualities rather than the recalibrated ones.

    --static-quantized-quals determines which values the quals should be rounded off to. For example, if you set these values to 10, 20 and 30, it will round off all the quals to one of these three values.
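The rounding described above can be illustrated with a minimal Python sketch. It assumes a plain "nearest level" rule on the Phred scale; the function name `quantize_quals` is hypothetical, and this is not GATK's actual implementation, which may treat very low qualities and ties differently.

```python
def quantize_quals(quals, levels):
    """Round each base quality to the nearest of the given levels.

    Simplified illustration only (an assumption, not GATK's code):
    nearest level on the Phred scale, ties going to the lower level.
    """
    levels = sorted(levels)
    # min() returns the first minimum, so with sorted levels a tie
    # (e.g. 15 between 10 and 20) resolves to the lower level.
    return [min(levels, key=lambda lv: abs(lv - q)) for q in quals]


if __name__ == "__main__":
    original = [3, 12, 17, 26, 38]
    print(quantize_quals(original, [10, 20, 30]))  # [10, 10, 20, 30, 30]
```

With levels 10, 20 and 30, every quality in the output is one of those three values, which matches the behaviour described in the reply.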

     

     

  • mk

    Hello,


    Thank you for your reply. What is the purpose of rounding off the quals? Why are we doing this? Thank you.

  • Bhanu Gandham

    Hi mk

     

    You can read more about this argument and other arguments in the tool docs here: https://gatk.broadinstitute.org/hc/en-us/articles/360040507871-ApplyBQSR#--static-quantized-quals

  • mk

    Hi, thank you. I read the docs before posting. They explain what the parameter does. My question was why we should do this (as is done in the implementation of the best practices). Thank you.

  • Shinichi Namba

    Hello, 

    I found a page in the legacy GATK forum that may be related to this option.

    https://sites.google.com/a/broadinstitute.org/legacy-gatk-forum-discussions/announcements/6495-Version-highlights-for-GATK-version-35 

    I have quoted a paragraph from this page below:

    ----

    Static binning of base quality scores. In a nutshell, binning (or quantizing) the base qualities in a BAM file means that instead of recording all possible quality values separately, we group them into bins represented by a single value (by default, 10, 20, 30 or 40). By doing this we end up having to record fewer separate numbers, which through the magic of BAM compression yields substantially smaller files. The idea is that we don’t actually need to be able to differentiate between quality scores at a very high resolution -- if the binning scheme is set up appropriately, it doesn’t make any difference to the variant discovery process downstream. This is not a new concept, but now the GATK engine has an argument to enable binning quality scores during the base recalibration (BQSR) process using a static binning scheme that we have determined produces optimal results in our hands. The level of compression is of course adjustable if you’d like to set your own tradeoff between compression and base quality resolution. We have validated that this type of binning (with our chosen default parameters) does not have any noticeable adverse effect on germline variant discovery. However we are still looking into some possible effects on somatic variant discovery, so we can’t yet recommend binning for that application.
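The compression argument in the quoted paragraph can be tried out with a small Python sketch. It rests on several assumptions: the qualities are synthetic and uniformly random rather than taken from real reads, the binning rule is a simple nearest-level rounding (not GATK's own algorithm), and `zlib` stands in for the deflate-based compression used inside BAM files. The helper names `bin_quals` and `compressed_size` are hypothetical.

```python
import random
import zlib


def bin_quals(quals, levels=(10, 20, 30, 40)):
    """Map each quality to the nearest level (illustrative rule, not GATK's)."""
    return [min(levels, key=lambda lv: abs(lv - q)) for q in quals]


def compressed_size(quals):
    """Size in bytes of the zlib-compressed qualities, stored one byte each."""
    return len(zlib.compress(bytes(quals)))


rng = random.Random(0)  # fixed seed so the run is repeatable
raw = [rng.randint(2, 41) for _ in range(100_000)]  # synthetic qualities
binned = bin_quals(raw)

print("distinct values:", len(set(raw)), "->", len(set(binned)))
print("compressed bytes:", compressed_size(raw), "->", compressed_size(binned))
```

Because the binned stream holds only four distinct values, the compressor has far less information to encode, so the second size should come out noticeably smaller. The exact ratio on real data depends on how the qualities are actually distributed.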

  • Genevieve Brandt (she/her)

    Thanks for posting the info you found, Shinichi Namba, and for helping out other GATK users!

