UnsupportedOperationException: getFragment called for paired read
I am a new user trying to merge my aligned and unmapped bam files. I am using MergeBamAlignment, gatk version 4.2.5.0, and encounter the following error - 'UnsupportedOperationException: getFragment called for paired read' (see log below).
Any guidance as to how to resolve this issue would be greatly appreciated. Thank you.
Log:
-
Hi Laura Steel
Can you share the details of how you generated your aligned and unmapped files?
-
Hi Gökalp Çelik,
Thank you for your reply. I first ran FastQC/MultiQC and was happy with the results, so I concatenated 6 FASTQ files (paired-end reads from 3 lanes) and created a bam file with the following read group information using FastqToSam:
-SM LATPX3 \
-LB CRPN2300014 \
-PU 22GJM7LT3 \
-PM illumina \
I then used MarkIlluminaAdapters to add the XT tag and used the Burrows-Wheeler Aligner to index my genome (bwa index), producing five index files (amb, ann, bwt, pac and sa).
I used SamToFastq to convert the bam file
-CLIP_ACT 2 \
-CLIP_ATTR XT \
-INTERLEAVE=true \
then ran the Burrows-Wheeler Aligner for maximal exact matches (bwa mem). I tried to run MergeBamAlignment and was prompted to provide a sequence dictionary, which I did (CreateSequenceDictionary). I then ran MergeBamAlignment again but received the aforementioned error.
Thank you for any advice or assistance you can offer.
-
Since you converted your reads into interleaved format, did you also include the -p parameter to perform smart pairing of the reads? The error looks like it cannot find the mate of a paired read.
-
Hi Gökalp Çelik,
Yes, I used the following;
bwa mem -M -t 7 -p path_to_genome.fna path_to_merged_file.fq > output_file.sam
-
Parameters all seem to be in order. Is it possible for you to test this command with the latest GATK 4.5.0.0 and Java 17 to see if the issue still persists?
-
I do not have access to GATK 4.5.0.0 through my institution but I will try to organise an alternative. Thank you.
-
Hi Laura Steel
Is it possible for you to send us a sample file that we can recreate the problem on our end? You may refer to the document in the link below to submit a file for us to check.
https://gatk.broadinstitute.org/hc/en-us/articles/360035889671-How-do-I-submit-a-detailed-bug-report
-
By the way, is it possible for you to check whether your bam file contains a mix of paired and unpaired reads?
It would also help to compare the number of reads in the aligned bam with the number in the unaligned bam. Can you do that as well?
-
Thank you, I will check the bam files and confirm. I will also get approval to share the files - which should be fine, as the data is anonymised.
-
Hi Gökalp Çelik,
Thank you for your patience. I used Samtools to check the number of paired (flag 0x1), unpaired (no flag set), and properly paired reads (flag 0x2) in my aligned and unmapped bam files and the results were as follows;
Aligned bam
Paired - 659451248
Unpaired - 0
Properly paired - 518631979
Unmapped bam
Paired - 0
Unpaired - 639910370
Properly paired - 0
-
In the aligned bam file, 646322620 reads were aligned and 13128628 were unaligned (out of a total of 659451248).
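For anyone repeating this check, here is a minimal sketch of how such counts can be produced. The samtools commands are shown as comments, with aligned.bam as a placeholder file name. The awk function below them applies the same flag tests to plain SAM text, using a few made-up records, so the meaning of 0x1 and 0x2 is explicit. The function name and the toy reads are illustrative only.

```shell
# With samtools installed, the three counts would be (aligned.bam is a placeholder):
#   samtools view -c -f 1 aligned.bam   # paired: flag 0x1 set
#   samtools view -c -F 1 aligned.bam   # unpaired: flag 0x1 not set
#   samtools view -c -f 2 aligned.bam   # properly paired: flag 0x2 set

# Same logic on SAM text: FLAG is column 2 of every alignment line.
count_pair_flags() {
  awk -F'\t' '
    /^@/ { next }                                   # skip header lines
    {
      flag = $2
      if (int(flag / 1) % 2 == 1) paired++; else unpaired++   # bit 0x1
      if (int(flag / 2) % 2 == 1) proper++                    # bit 0x2
    }
    END { printf "paired=%d unpaired=%d proper=%d\n", paired, unpaired, proper }
  ' "$1"
}

# Toy SAM records, made up for illustration only
tmp_sam=$(mktemp)
printf '@HD\tVN:1.6\n' > "$tmp_sam"
printf 'r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII\n' >> "$tmp_sam"
printf 'r1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tACGT\tIIII\n' >> "$tmp_sam"
printf 'r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n' >> "$tmp_sam"
printf 'r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n' >> "$tmp_sam"

count_pair_flags "$tmp_sam"
```

On the toy records this prints paired=3 unpaired=1 proper=2: flags 99 and 147 are paired and properly paired, 77 is paired but not properly paired, and 4 has no pairing bit set.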
-
Hi Laura Steel
Your unmapped bam file only contains unpaired reads. How was it generated? Normally ubam should also contain paired reads.
-
I did think that was strange. I wonder if it's because I used concatenate to merge R1.fq and R2.fq files for all 3 lanes (1 sample) before using FastqToSam. Would it be better to merge the R1s and R2s into their own fastq files and then supply them separately for FastqToSam?
Aside from that the only other changes I made were running MarkIlluminaAdapters and then, more recently, AddOrReplaceReadGroups when I realised while troubleshooting that I had not specified -RG when using FastqToSam, so I ran the following:
gatk AddOrReplaceReadGroups \
I=LATPX3_merged_mia.bam \
O=LATPX3_merged_mia_rg.bam \
RGID=L1-3 \
RGLB=CRPN2300014 \
RGPL=illumina \
RGPU=22GJM7LT3 \
RGSM=LATPX3
Thanks again.
-
Concatenating the R1 and R2 files together is definitely the reason, since it separates the two reads of every pair so that mates can never be matched to each other.
Alternatively, you can perform the alignment and merge steps separately per read group and then combine the results at the mark duplicates step, which is common practice.
You may also combine the R1 files into a single R1 file and the R2 files into a single R2 file, keeping the same file order in both sets, and then map and merge your bam files. That way your issue will most likely be solved.
Regards.
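As a rough sketch of the second suggestion, the snippet below concatenates the R1 files and the R2 files separately, in the same lane order, and then checks that read names line up record by record. All file names and reads are made up for illustration. The name check assumes mates differ only by a trailing /1 or /2; real Illumina headers may need a different rule.

```shell
# Toy FASTQ files for three lanes (hypothetical names, one read per lane)
work=$(mktemp -d)
for lane in 1 2 3; do
  printf '@read%s/1\nACGT\n+\nIIII\n' "$lane" > "$work/L${lane}_R1.fq"
  printf '@read%s/2\nTGCA\n+\nIIII\n' "$lane" > "$work/L${lane}_R2.fq"
done

# Concatenate R1s and R2s separately, using the same lane order for both
cat "$work/L1_R1.fq" "$work/L2_R1.fq" "$work/L3_R1.fq" > "$work/merged_R1.fq"
cat "$work/L1_R2.fq" "$work/L2_R2.fq" "$work/L3_R2.fq" > "$work/merged_R2.fq"

# Print the read name of each record (every 4th line), without the /1 or /2 suffix
names() { awk 'NR % 4 == 1 { sub(/\/[12]$/, ""); print }' "$1"; }

# The two name lists must be identical, in the same order
if [ "$(names "$work/merged_R1.fq")" = "$(names "$work/merged_R2.fq")" ]; then
  pairing=ok
else
  pairing=mismatch
fi
echo "pairing: $pairing"

# The merged files would then be given to FastqToSam as two separate inputs, e.g.
# (output name is a placeholder; read group options as in the posts above):
#   gatk FastqToSam -F1 merged_R1.fq -F2 merged_R2.fq -O unmapped.bam -SM LATPX3
```

Supplying the mates through -F1 and -F2, rather than as one concatenated file, is what lets FastqToSam write the reads as pairs in the unmapped bam.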