Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

gatk 4.2.6.1 BaseRecalibrator cycle covariate

4 comments

  • Gökalp Çelik

    Hi Yanis Chrys

    HiFi data would definitely have reads longer than this value. Did you check the read length distribution of your data using a tool such as FastQC?

  • Yanis Chrys

    Hi Gökalp Çelik 

    Thank you very much for your reply!
    I think this speaks to my complete misunderstanding of what the cycle covariate is. I was under the impression that it had to do with model iterations, as it was said to affect memory and runtime. I have checked the read lengths, so I have an idea of each sample's read length distribution.
    Should this value be set to the maximum read length in the dataset?

  • Gökalp Çelik

    Hi again. 

    Yes, that would be appropriate. The parameter defaults to 500, which covers most short-read sequencing data types unless one chooses to use 2x300 cycles.

    I hope this helps.

  • Yanis Chrys

    Thank you, this worked and it's running fine now.

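For readers who land here with the same problem, the advice in this thread (check the read length distribution, then raise the limit to the longest read) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are made up for this example, the input is assumed to be a plain four-line FASTQ, and the flag name `--maximum-cycle-value` is not quoted anywhere in the thread, so please confirm it against the BaseRecalibrator documentation for your GATK version. The default of 500 is the figure given in the reply above.

```python
import gzip
import io
from collections import Counter
from typing import Iterable, List, TextIO


def open_fastq(path: str) -> TextIO:
    """Open a plain or gzip-compressed FASTQ file as text (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_lengths(fastq: TextIO) -> Iterable[int]:
    """Yield the sequence length of each record in a four-line FASTQ stream."""
    for line_number, line in enumerate(fastq):
        # In four-line FASTQ, the sequence is the second line of every record.
        if line_number % 4 == 1:
            yield len(line.strip())


def max_cycle_argument(lengths: Iterable[int], default: int = 500) -> List[str]:
    """Return extra BaseRecalibrator arguments, or [] if the default is enough."""
    longest = max(lengths, default=0)
    if longest <= default:
        return []
    # Assumed flag name; verify it in the documentation for your GATK version.
    return ["--maximum-cycle-value", str(longest)]


# Tiny in-memory FASTQ standing in for a real file opened with open_fastq().
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTACGTAC\n+\nIIIIIIIIII\n")
lengths = list(read_lengths(example))

# Length histogram, similar in spirit to the length plot a QC tool reports.
histogram = Counter(lengths)
print(sorted(histogram.items()))

# A small default is used here only so the toy reads exceed it.
print(max_cycle_argument(lengths, default=5))
```

With real data you would pass `open_fastq("sample.fastq.gz")` (a placeholder path) to `read_lengths` and append the returned arguments to your BaseRecalibrator command. Note that this measures the raw reads; if the aligned BAM was trimmed or hard-clipped, the lengths seen by the tool may differ.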
