Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Where do I find guidance on when to mark duplicates a second time?



  • Official comment
    Genevieve Brandt (she/her)

    Hi ISmolicz,

    Based on your forum posts, it looks like the tutorial "(How to) Map and clean up short read sequence data efficiently" is out of date. In our pipelines, we run MarkDuplicates only once, and that single run covers both optical and PCR duplicates. We can't think of a reason why it would need to be run twice.

    I have requested that this tutorial be changed, noting that it is out of date. When we have the capacity, we will review the tutorial and try to bring it up to date. Thank you for writing in about this issue, and I apologize for how long it took us to get you an answer.
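
A minimal sketch of the single-pass approach described in the reply above, assuming GATK4 is installed and available on the PATH as `gatk`. The helper name `build_markduplicates_cmd` and the file names are hypothetical, chosen only for illustration; the sketch builds the command without running it, so it can be inspected first.

```python
import shlex
from typing import List


def build_markduplicates_cmd(input_bam: str, output_bam: str, metrics_file: str) -> List[str]:
    """Build the argument list for one MarkDuplicates run.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "gatk", "MarkDuplicates",
        "-I", input_bam,      # aligned reads to scan for duplicates
        "-O", output_bam,     # output BAM with duplicates marked
        "-M", metrics_file,   # text file that receives duplication metrics
    ]


# Placeholder file names (assumptions, not taken from the thread).
cmd = build_markduplicates_cmd(
    "sample.bam",
    "sample.markdup.bam",
    "sample.markdup_metrics.txt",
)

# Print the command as a shell-safe string for review.
print(shlex.join(cmd))
```

To actually execute the command, the list could be passed to `subprocess.run(cmd, check=True)`. The tool is invoked once, with no second pass over the output.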



  • Genevieve Brandt (she/her)

    Hi ISmolicz,

    The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.

    Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.

    We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.

    For context, check out our support policy.
