UmiAwareMarkDuplicatesWithMateCigar - UMI_METRICS
When checking the UMI_METRICS output file from `UmiAwareMarkDuplicatesWithMateCigar`, I found that DUPLICATE_SETS_WITH_UMI > DUPLICATE_SETS_IGNORING_UMI.
What does this mean? I expected that using UMIs would lower the number of real duplicates, since UMIs make it possible to distinguish PCR duplicates from distinct molecules.
| Field | Description |
| --- | --- |
| DUPLICATE_SETS_IGNORING_UMI | Number of duplicate sets found before taking UMIs into account |
| DUPLICATE_SETS_WITH_UMI | Number of duplicate sets found after taking UMIs into account |
I found this explanation but I don't understand it completely.
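To make my question concrete, here is my tentative reading of the two definitions as a toy sketch. This is not Picard's code. I am assuming that every group of reads sharing a position counts as one set (even a group of one read), that a set is split by exact UMI match (the real tool can also join UMIs that differ by a few bases), and the `count_duplicate_sets` helper and the read tuples are made up for illustration.

```python
from collections import defaultdict

def count_duplicate_sets(reads):
    """Toy model. reads is an iterable of (position_key, umi) tuples.

    Assumption: one set per distinct position when UMIs are ignored,
    and one set per distinct (position, UMI) pair when UMIs are used.
    """
    umis_by_position = defaultdict(list)
    for position_key, umi in reads:
        umis_by_position[position_key].append(umi)

    # Ignoring UMIs: reads at the same position form a single set.
    sets_ignoring_umi = len(umis_by_position)
    # With UMIs: each position-based set is split by distinct UMI.
    sets_with_umi = sum(len(set(umis)) for umis in umis_by_position.values())
    return sets_ignoring_umi, sets_with_umi

# Hypothetical reads: three at one position (two UMIs), one at another.
reads = [
    ("chr1:1000:+", "ACGT"),
    ("chr1:1000:+", "ACGT"),
    ("chr1:1000:+", "TTAG"),
    ("chr1:2500:-", "GGCA"),
]

ignoring, with_umi = count_duplicate_sets(reads)
print(ignoring, with_umi)
```

If this reading is right, splitting can only keep or increase the number of sets, so DUPLICATE_SETS_WITH_UMI >= DUPLICATE_SETS_IGNORING_UMI would be expected, and it is the number of reads marked as duplicates that should go down, not the number of sets. Is that correct?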
Thanks!
-
Thank you for your post, Miguel Grau! I want to let you know we have received your question. We'll get back to you if we have any updates or follow-up questions.
Please see our Support Policy for more details about how we prioritize responding to questions.
-
Thanks, Genevieve. Any update?
Besides the main question: when using `UmiAwareMarkDuplicatesWithMateCigar`, is the rest of the pipeline the same as usual? I.e.:
BAM -> UmiAwareMarkDuplicatesWithMateCigar -> BaseRecalibrator -> ApplyBQSR -> Mutect2 -> LearnReadOrientationModel/GetPileupSummaries/CalculateContamination/FilterMutectCalls.
Thanks!