Should we run MarkDuplicates on libraries made with PCR-free kits to handle optical duplicates?
I found a post here stating that we shouldn't run MarkDuplicates on data produced with PCR-free kits. Should we still run it to mark optical duplicates, given that the Illumina platform generates a lot of PCR duplicates as described here, or should it be skipped? An example of the duplicate statistics reported by Picard for my data is attached below.
-
It looks like you linked to our old documentation, which is no longer up to date. There is a discussion about this issue in the documentation at our new site: https://gatk.broadinstitute.org/hc/en-us/articles/360046222751-MarkDuplicates-Picard- Please see that link for more info.
Here is a discussion from our old site: https://sites.google.com/a/broadinstitute.org/legacy-gatk-forum-discussions/2014-03-21-2013-10-14/3372-PCR-Duplicate-detection-on-PCRFree-Libraries
As for your sequencing data itself, we do not provide support for those issues, but hopefully the resources above will help.
-
Hi, the updated MarkDuplicates documentation does not address this issue. The question is: is there a way to mark only optical duplicates, and not PCR duplicates? This situation is relevant for PCR-free libraries.
MarkDuplicates supports this distinction through the TAGGING_POLICY option, but that option only affects the DT tag. The DT tag is not the same thing as the SAM duplicate flag, which is what practically all programs use for duplicate filtering.
So the question is: can MarkDuplicates set the SAM duplicate flag only for optical duplicates, and not for PCR duplicates?
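To make the DT tag versus duplicate flag distinction concrete, here is a minimal sketch of a post-processing workaround rather than a MarkDuplicates feature. It assumes MarkDuplicates was run with a TAGGING_POLICY that writes the DT tag (as I understand the Picard documentation, DT:Z:SQ marks sequencing/optical duplicates and DT:Z:LB marks library/PCR duplicates), and that the records are available as SAM text lines. The function name `keep_only_optical_duplicates` is hypothetical.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: "PCR or optical duplicate"
OPTICAL_TAG = "DT:Z:SQ"  # assumed DT value for optical (sequencing) duplicates


def keep_only_optical_duplicates(sam_line: str) -> str:
    """Clear the duplicate bit on a SAM record unless it is tagged DT:Z:SQ.

    Header lines (starting with '@') are returned unchanged.
    Assumption: duplicates without a DT:Z:SQ tag are not optical duplicates.
    """
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line

    fields = line.split("\t")
    flag = int(fields[1])
    # Optional tags start at the 12th column of a SAM record.
    is_optical = OPTICAL_TAG in fields[11:]

    if flag & DUPLICATE_FLAG and not is_optical:
        fields[1] = str(flag & ~DUPLICATE_FLAG)
    return "\t".join(fields)
```

Applied to every record (for example, on the output of `samtools view -h`), this would leave the duplicate flag set only on reads tagged as optical duplicates, so that downstream tools filtering on the flag ignore only those reads.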
-
Hello, just following up on the question above, which hasn't been answered yet.
Is there an option for MarkDuplicates to remove or tag only optical duplicates, and not PCR duplicates?