I have an allopolyploid genome that is mostly phased. I've mapped 150bp PE reads to this genome, and there's of course -- a ton of multimappings. I used BWA-MEM for this with the -M flag, and my understanding is that BWA-MEM decides a primary alignment then puts all the other secondary alignments in the XA tag. I want to keep the best (equiv. to primary in this case) alignment for variant calling but I realized there may be an issue. When a read (R1) maps equally well to, say, chromosome 1A and 1A', but its mate (R2) maps uniquely to only 1A, what happens when the primary alignment for that R1 is randomly assigned to be 1A'? Now I have non-linear alignments / a chimeric read, right?
I found MergeBamAlignment and loved the idea of setting the Primary_Alignment_Strategy = Most Distant, since it would fix the above issue. But I just have a regular .bam file, with paired reads mapped to the genome. I'm not sure where the unmapped .bam comes from in this example: https://gatk.broadinstitute.org/hc/en-us/articles/360039568932--How-to-Map-and-clean-up-short-read-sequence-data-efficiently#step3B
So my question is... is there another tool that does the equivalent of Primary_Alignment_Strategy = Most Distant that is compatible with just a regular old mapped .bam? I found a different discussion where a user wanted to use this tool with PE mapped reads -- and someone suggested they modify the pipeline with SetNMMDAndUQTags instead. I am unsure if that is really an equivalent... seems like a nope. Is my only hope to take my raw fastq files and convert them into .ubam files..?
If there's any advice someone could offer or some schooling to tell me that I'm very confused, both would be much appreciated!
Thanks for your time,
Please sign in to leave a comment.