Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

MarkDuplicates not checking sequences

Answered

3 comments

  • Genevieve Brandt (she/her)

    Hi Naomie Abecassis,

    MarkDuplicates does not check whether the read sequences match; it only checks other information about the reads. Thanks for pointing out that ambiguous documentation. I'll put in a request for our team to get it changed.

    Do you have standard Illumina data or some other type of sequencing? You should not use MarkDuplicates with amplicon data, because all of the sequences will be marked as duplicates.

    Hope this helps!

    Genevieve

  • Naomie Abecassis

    Thank you, Genevieve Brandt, for this clarification!

  • Genevieve Brandt (she/her)

    No problem!

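
To make the point in the answer concrete, here is a minimal Python sketch of position-based duplicate marking. It is not the MarkDuplicates implementation. The `Read` class, its fields, and the `mark_duplicates` function are hypothetical names invented for illustration, and the sketch handles single-end reads only. It assumes, as I understand the tool's general approach, that reads are grouped by alignment information (contig, 5' position, strand) and that the read with the highest summed base quality in each group is kept. The real tool also takes mate position and library into account for paired reads.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Read:
    # Hypothetical, simplified read record (not a real SAM/BAM API).
    name: str
    contig: str
    five_prime_pos: int    # assumed: unclipped 5' alignment coordinate
    reverse_strand: bool
    sequence: str
    base_quality_sum: int


def mark_duplicates(reads):
    """Return the names of reads flagged as duplicates."""
    groups = defaultdict(list)
    for read in reads:
        # The grouping key uses alignment information only;
        # read.sequence is never inspected.
        key = (read.contig, read.five_prime_pos, read.reverse_strand)
        groups[key].append(read)

    duplicates = set()
    for group in groups.values():
        # Keep the read with the highest summed base quality, flag the rest.
        best = max(group, key=lambda r: r.base_quality_sum)
        duplicates.update(r.name for r in group if r is not best)
    return duplicates


reads = [
    Read("r1", "chr1", 100, False, "ACGTACGT", 300),
    Read("r2", "chr1", 100, False, "ACGTTCGT", 280),  # one base differs from r1
    Read("r3", "chr1", 250, False, "ACGTACGT", 290),  # same sequence as r1, other position
]
print(sorted(mark_duplicates(reads)))  # ['r2']
```

In this sketch `r2` is flagged even though its sequence differs from `r1`, while `r3` is not flagged even though its sequence is identical to `r1`. The same logic shows why amplicon data is a poor fit: reads from one amplicon start at the same position, so all but one per group would be flagged.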
