Error: ValidateSamFile Value was put into PairInfoMap more than once.
I tried running ValidateSamFile on a bam file (to troubleshoot a different problem), but got an error message stating "ValidateSamFile Value was put into PairInfoMap more than once".
This bam file was prepared from a PCR-free library, so there should not be duplicated reads. However, I did merge two separate sorted bam files in order to create this final bam file, called Ancestors.final.bam, that I ran ValidateSamFile on.
Is the issue that I have merged two technical replicates, causing some reads to have the same values? How can I overcome this problem and still merge the technical replicate bam files?
Thanks for your help! My GATK version, code, and output messages are pasted below.
GATK version used: 4.1.2.0
Picard version used: 2.14.0
Exact command used: java -jar ~/libraries/picard/picard.jar ValidateSamFile I=Ancestors.final.bam MODE=SUMMARY
Entire error log:
ERROR 2021-02-12 15:12:09 ValidateSamFile Value was put into PairInfoMap more than once. 1: A00975:57:HHGM7DRXX:2:1131:23936:21214
[Fri Feb 12 15:12:09 EST 2021] picard.sam.ValidateSamFile done. Elapsed time: 2.88 minutes.
Runtime.totalMemory()=1648361472
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
-
Yeah looks like combining the bam files is the reason why you are seeing this error. Other users have faced similar errors. Here are a couple of proposed solutions: https://www.biostars.org/p/60263/
Let me know if any of these solutions work for you.
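One way to check whether the merge itself introduces the clash is to look for read names that occur in both input files. Below is a minimal sketch, assuming each bam file has first been dumped to SAM text (for example with samtools view). The function names are hypothetical and not part of Picard or GATK, and holding every read name in memory may not be practical for very large files.

```python
def read_names(sam_lines):
    """Collect the QNAME (first column) of every alignment line.

    Header lines (starting with '@') and blank lines are skipped.
    """
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def shared_read_names(sam_lines_a, sam_lines_b):
    """Return the read names that appear in both inputs."""
    return read_names(sam_lines_a) & read_names(sam_lines_b)
```

Each argument can be any iterable of SAM text lines, such as an open file handle. If the returned set is not empty, the two files contain reads with identical names, which would be consistent with the clash appearing only after merging.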
-
I tried aligning one of the affected libraries (sequenced over two lanes) again with bwa-mem to generate two sam files, but this time used the -M option to make sure secondary hits are tagged appropriately for Picard, as suggested by both of the links you provided. I then converted the resulting sam file for each lane to a bam file and sorted it.
Before merging the bam files together, I decided to check each one and ran ValidateSamFile on both. It ran successfully for one of the bam files, but for the other it failed with the same "ValidateSamFile Value was put into PairInfoMap more than once" error message as before. I then also ran ValidateSamFile on the intermediate sam file for that sample (before it was converted to bam and sorted), and the sam file did not give me that error message. So, during the step where the sam file was converted into a bam file and sorted with samtools sort, something about read group information was disrupted.
Do you have any idea what would cause this, or whether it is common to have compatibility issues with sorted bam files? I haven't been able to find much information about this error, other than suggestions to run Picard AddOrReplaceReadGroups to rename read group information. Would this be the recommended next step?
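To narrow this down, it may help to list reads that have more than one primary record for the same end of a pair. This is a minimal sketch, assuming (not confirmed in this thread) that the PairInfoMap error is triggered when the same read name and end is seen more than once among primary alignments. The function name is hypothetical, and the input is SAM text, for example the output of samtools view on the sorted bam file.

```python
from collections import Counter

# Bits of the SAM FLAG field, as defined in the SAM specification
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def duplicated_primary_ends(sam_lines):
    """Return {(read_name, end): count} for primary records seen more than once.

    end is 1 for first-in-pair, 2 for second-in-pair, 0 if neither bit is set.
    Secondary and supplementary records are ignored.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if flag & FIRST_IN_PAIR:
            end = 1
        elif flag & SECOND_IN_PAIR:
            end = 2
        else:
            end = 0
        counts[(qname, end)] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

Looking up the read name from the error log (A00975:57:HHGM7DRXX:2:1131:23936:21214) in the result would show whether that read really has repeated primary records in the sorted bam file.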
-
Sounds like you are having issues with samtools and not GATK. I have not seen this error before, so unfortunately I can't help with this. I suggest reaching out to the samtools support team or posting this question on Seqanswers, Biostars, or Bioinformatics Stack Exchange.