Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

VQSR positive training model failed to converge

4 comments

  • Bhanu Gandham

    Yangyxt,

    The error message suggests reducing --max-gaussians. Have you tried that? If not, I would recommend doing so.

  • Kshama Aswath

    Hi,

    I was getting the same error during indel recalibration and ended up having to set --max-gaussians to 1!

    Everything ran fine, but I am nervous and uncertain about using such a low number. What is the disadvantage, if any, of going so low on the number of Gaussians?

    Also, is it being forced that low because I am working with just 2 chromosomes across my files, and hence fewer variants? Would it be better to hard-filter the variants instead of using VQSR, even though I have 168 samples, given that I have so few chromosomes?

    I used just two chromosomes in this entire process because I am interested in a couple of genes on those 2 chromosomes. I thought it was better to feed in the entire chromosomes than the exact gene locations.

    I am a little lost here, and any advice would help me better understand this step.

    Thank you!

  • Yangyxt

    Dear Kshama Aswath,

    In my personal experience, I'm afraid this error occurs when the amount of read data is just not large enough. Try bringing in more samples or a larger genomic region for VQSR.

  • Genevieve Brandt (she/her)

    Hi Kshama Aswath and Yangyxt,

    I spoke with the developers to get more clarity on this issue. With only 2 chromosomes there is not much variance, which could be the issue. If VQSR does not work, we recommend hard filtering or CNN.

    However, --max-gaussians 1 will not necessarily cause problems. Look at the plots from VariantRecalibrator (--rscript-file), and if everything makes sense, your results could be fine. ApplyVQSR also has plots to view, looking at the Ti/Tv ratio. If the Ti/Tv ratio is bad, you should consider hard filtering. A respectable range is 1.9-2.1 for genomes and the high 2s for exomes.

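For readers who want a quick sanity check of the Ti/Tv ratio mentioned in the last reply, here is a minimal Python sketch. It is not a GATK tool and not the developers' method: the function names (titv_ratio, snps_from_vcf_lines, in_expected_range) are hypothetical, it assumes a plain-text, tab-delimited VCF, and it counts every biallelic or multiallelic SNP regardless of the FILTER column. The default 1.9-2.1 window is the genome range quoted in the reply above; for exomes you would pass a different window.

```python
# Transitions are A<->G (purines) and C<->T (pyrimidines); every other
# single-base substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(snps):
    """Return the Ti/Tv ratio for an iterable of (ref, alt) allele pairs."""
    ti = tv = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        # Skip indels, MNPs, symbolic alleles, and non-ACGT bases.
        if len(ref) != 1 or len(alt) != 1 or ref == alt:
            continue
        if ref not in "ACGT" or alt not in "ACGT":
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions found; Ti/Tv is undefined")
    return ti / tv


def snps_from_vcf_lines(lines):
    """Yield (ref, alt) pairs from VCF text lines, one pair per ALT allele."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]  # REF and ALT columns
        for alt in alts.split(","):
            yield ref, alt


def in_expected_range(ratio, low=1.9, high=2.1):
    """Check a ratio against a window (default: the genome range quoted above)."""
    return low <= ratio <= high


# Tiny made-up example: two transitions, one transversion, one indel (skipped).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.",
    "chr1\t300\t.\tA\tC\t50\tPASS\t.",
    "chr1\t400\t.\tAT\tA\t50\tPASS\t.",
]
ratio = titv_ratio(snps_from_vcf_lines(example))
print(ratio, in_expected_range(ratio))  # prints: 2.0 True
```

In practice you would probably feed this the lines of the recalibrated VCF and restrict it to records that passed filtering, then compare the result before and after VQSR or hard filtering.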
