In previous versions of the GATK4 SCNA-calling pipeline, overlapping reads from the same fragment were double counted at the positions where they overlap. With paired-end sequencing, when a fragment is shorter than the combined length of its two reads, the mates overlap and cover the same stretch of the genome. Earlier GATK4 versions, however, did not account for this when determining coverage: in the region of overlap, both reads were counted even though they come from the same fragment, so that fragment's coverage was counted twice.
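To make the concern concrete, here is a toy sketch of the arithmetic as I understand it. It is not GATK code and does not reflect how any GATK tool actually computes coverage; the function names, the half-open coordinates, and the numbers (a 150 bp fragment sequenced with 2 x 100 bp reads) are all made up for illustration.

```python
# Toy model only: intervals are half-open [start, end) on one contig.
# Nothing here is taken from GATK; names and numbers are hypothetical.

def mate_overlap(start1, end1, start2, end2):
    """Number of bases covered by both mates of a pair."""
    return max(0, min(end1, end2) - max(start1, start2))


def depth_at(pos, mates, per_fragment):
    """Depth that one read pair contributes at a single position.

    per_fragment=False counts every read covering pos (read-based).
    per_fragment=True counts the pair at most once (fragment-based).
    """
    hits = sum(1 for start, end in mates if start <= pos < end)
    return min(hits, 1) if per_fragment else hits


# Assumed example: fragment spans [1000, 1150), reads are 100 bp each.
fragment_start, fragment_len, read_len = 1000, 150, 100
read1 = (fragment_start, fragment_start + read_len)
read2 = (fragment_start + fragment_len - read_len,
         fragment_start + fragment_len)

overlap = mate_overlap(*read1, *read2)
read_based = depth_at(1075, [read1, read2], per_fragment=False)
fragment_based = depth_at(1075, [read1, read2], per_fragment=True)

print("overlapping bases:", overlap)
print("read-based depth at 1075:", read_based)
print("fragment-based depth at 1075:", fragment_based)
```

In this toy case the mates share the middle of the fragment, so a read-based count at a position inside the overlap is twice the fragment-based count, while positions outside the overlap get the same depth either way.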
I wanted to ask if this issue in the SCNA-calling pipeline has been fixed or accounted for in any way in the more recent versions of GATK4.