Could you help me understand how the "--pcr-snv-qual" parameter works?
I have a dataset of 150 nt paired-end reads in which most base quality scores equal 37. However, according to the BAM file QC, almost all insert sizes are less than 300 nt, so the majority of pairs overlap significantly. As a result, Mutect2 22.214.171.124, which I'm using, drops the base quality of the overlapping nucleotides to BQ=20, and the majority of variants have MBQ=20.
I was experimenting with the "--pcr-snv-qual" parameter and found some quite strange behavior:
1) Values from 36 to 74 (although I did not test every integer) give a majority of variants with MBQ equal to half of pcr-snv-qual.
2) Values <= 34 and >= 80 give a majority of variants with MBQ=37.
I looked through the code, GitHub issues and forum pages and found that Mutect2 sets the base quality to half of pcr-snv-qual for bases that agree between the two reads of an overlapping pair, and to 0 for bases that differ.
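To make the rule concrete, here is a minimal sketch of it as I understand it. This is not GATK's actual code: the function name is hypothetical, and capping the result at the read's original quality is an assumption on my part. That cap would be consistent with the MBQ=37 I see for values >= 80, but the sketch does not account for the behavior at values <= 34.

```python
def adjust_overlap_quals(base1, qual1, base2, qual2, pcr_snv_qual):
    """Return the adjusted (qual1, qual2) for one position covered by both mates.

    Hypothetical helper illustrating the rule described above; not GATK code.
    """
    half = pcr_snv_qual // 2
    if base1 == base2:
        # Agreeing bases: quality becomes half of pcr-snv-qual.
        # ASSUMPTION: never raised above the read's original quality.
        return min(qual1, half), min(qual2, half)
    # Disagreeing bases: both qualities are set to 0.
    return 0, 0


# Two agreeing A bases with BQ=37 and pcr-snv-qual=40
print(adjust_overlap_quals("A", 37, "A", 37, 40))
```

With pcr-snv-qual=40 this gives 20 for both bases, which matches the BQ=20 I observe without changing the parameter.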
1) However, why does using pcr-snv-qual <= 34 result in MBQ=37?
2) Is it safe to override the base quality adjustment for overlapping reads by setting pcr-snv-qual very high? I don't see why it is reasonable to drop the quality of a base that is supported by two high-quality reads (e.g. two A bases with BQ=37).