Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

CollectReadCounts tutorial incorrectly states default read quality threshold

Answered
1 comment

  • Genevieve Brandt (she/her)

    Hi Simon,

    We took a closer look at this tutorial, and you are correct: there is a typo in the following sentence from the tutorial:

    This means the tool excludes reads marked as duplicate and excludes reads with mapping quality less than 10.

    The default minimum mapping quality for the MappingQualityReadFilter in CollectReadCounts is 30, not 10. I'll have our documentation team fix this issue, thank you for pointing it out! I'm sorry that it caused you such a headache looking into this problem. Sometimes our tutorial documentation becomes out of date, so feel free to reach out to us if you think something is strange.

    In terms of why the value is set at 30, my colleague has a note about that:

    This is specifically set for both the Germline and Somatic CNV pipelines, which are optimized for WES samples, where higher-quality data translates to more accurate calls. If users need to increase sensitivity for regions of low mappability, they can simply turn that parameter down in their runs.

    Please let me know if you have any further questions.

    Best,

    Genevieve
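
For anyone who wants to turn the parameter down as suggested above, here is a minimal sketch in Python that assembles a CollectReadCounts command line with a lower threshold. It assumes the threshold is exposed as `--minimum-mapping-quality` (the argument of MappingQualityReadFilter, as far as I know; please check the tool documentation for your GATK version). The file names and the helper function are hypothetical placeholders, not part of GATK.

```python
import shlex

# Default minimum mapping quality for CollectReadCounts, per the reply above.
DEFAULT_MIN_MAPQ = 30


def build_collect_read_counts_cmd(bam, intervals, output, min_mapq=DEFAULT_MIN_MAPQ):
    """Return the argument list for a CollectReadCounts run.

    Hypothetical helper for illustration; all paths are supplied by the caller.
    """
    return [
        "gatk", "CollectReadCounts",
        "-I", bam,
        "-L", intervals,
        "--interval-merging-rule", "OVERLAPPING_ONLY",
        "-O", output,
        # Assumed MappingQualityReadFilter argument: reads with a mapping
        # quality below this value are excluded from the counts.
        "--minimum-mapping-quality", str(min_mapq),
    ]


# Example: lower the threshold from 30 to 10 for regions of low mappability.
cmd = build_collect_read_counts_cmd(
    "sample.bam", "targets.interval_list", "sample.counts.hdf5", min_mapq=10
)
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where `gatk` is on the PATH. Leaving `min_mapq` unset keeps the default of 30.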
