Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Error: ValidateSamFile Value was put into PairInfoMap more than once.

3 comments

  • Bhanu Gandham

    Stephen Johnson

    Yes, it looks like combining the BAM files is the reason you are seeing this error. Other users have faced similar errors. Here are a couple of proposed solutions: https://www.biostars.org/p/60263/

    https://sites.google.com/a/broadinstitute.org/legacy-gatk-forum-discussions/2016-08-11-2016-04-07/7431-MarkDuplicates-error-Value-was-put-into-PairInfoMap-more-than-once

    Let me know if any of these solutions work for you.

  • Stephen Johnson

    I tried aligning one of the affected libraries (sequenced over two lanes) again with bwa-mem to generate two SAM files, but this time used the -M option to make sure secondary hits are flagged appropriately for Picard, as suggested by both of the links you provided. I then converted the resulting SAM file for each lane to a BAM file and sorted it.

    Before merging the BAM files, I ran ValidateSamFile on each of the two BAM files as a check. It completed for one of them, but failed for the other with the same "ValidateSamFile Value was put into PairInfoMap more than once" error message as before. I then also ran ValidateSamFile on the intermediate SAM file for that sample (before it was converted to BAM and sorted), and the SAM file did not give me that error message. So, during the step where the SAM file was converted into a BAM file and sorted with samtools sort, something about the read group information was disrupted.

    Do you have any idea what would cause this, or whether it is common to have compatibility issues with sorted BAM files? I haven't been able to find much information about this error, other than suggestions to run Picard AddOrReplaceReadGroups to rename the read group information. Would this be the recommended next step?
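
For reference, the AddOrReplaceReadGroups step mentioned above could be assembled roughly as follows. This is only a sketch: the file names, the read group values, and the `picard.jar` path are hypothetical placeholders, and nothing in this thread establishes that rewriting read groups resolves the PairInfoMap error. The helper only builds and prints the command line; it does not run Picard.

```python
import shlex


def add_or_replace_read_groups_cmd(in_bam, out_bam, rgid, rglb, rgpl, rgpu, rgsm,
                                   picard_jar="picard.jar"):
    """Build (but do not run) a Picard AddOrReplaceReadGroups command line.

    All argument values are supplied by the caller; the jar path is an assumption.
    """
    return [
        "java", "-jar", picard_jar, "AddOrReplaceReadGroups",
        f"I={in_bam}",    # input BAM
        f"O={out_bam}",   # output BAM with rewritten read group
        f"RGID={rgid}",   # read group ID
        f"RGLB={rglb}",   # library
        f"RGPL={rgpl}",   # platform
        f"RGPU={rgpu}",   # platform unit
        f"RGSM={rgsm}",   # sample name
    ]


# Hypothetical values for one lane of one library.
cmd = add_or_replace_read_groups_cmd(
    "lane2.sorted.bam", "lane2.rg.bam",
    rgid="lib1.lane2", rglb="lib1", rgpl="ILLUMINA",
    rgpu="flowcell1.lane2", rgsm="sample1",
)
print(shlex.join(cmd))
```

Giving each lane its own RGID while sharing RGLB and RGSM is a common convention for a library sequenced over two lanes, but the right values depend on the actual sequencing run.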

  • Bhanu Gandham

    It sounds like you are having issues with samtools and not GATK. I have not seen this error before, so unfortunately I can't help with this. I suggest reaching out to the samtools support team or posting this question on SEQanswers, Biostars, or Bioinformatics Stack Exchange.
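
Before taking the question to another forum, it may help to narrow down which records trigger the validator. The sketch below rests on an assumption that is not confirmed in this thread: that the error appears when the same primary, paired record (same read name and same mate number) occurs more than once in a file. It scans SAM text lines, skips secondary (0x100) and supplementary (0x800) records, and reports any read name and mate combination seen more than once. The function name and the example records are made up for illustration.

```python
from collections import Counter

# SAM FLAG bits used below (from the SAM format specification).
PAIRED = 0x1
READ1 = 0x40
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def duplicated_primary_reads(sam_lines):
    """Return {(qname, mate): count} for primary paired records seen more than once."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only primary alignments are counted
        if not flag & PAIRED:
            continue  # unpaired reads are not part of any pair
        mate = 1 if flag & READ1 else 2 if flag & READ2 else 0
        counts[(qname, mate)] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Made-up records: r2 has its first mate listed twice, r3 only has secondary hits.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "r2\t99\tchr1\t150\t60\t50M\t=\t350\t250\t*\t*",
    "r2\t99\tchr1\t150\t60\t50M\t=\t350\t250\t*\t*",
    "r3\t355\tchr1\t200\t0\t50M\t=\t400\t250\t*\t*",
    "r3\t355\tchr2\t200\t0\t50M\t=\t400\t250\t*\t*",
    "r1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "r2\t147\tchr1\t350\t60\t50M\t=\t150\t-250\t*\t*",
]
print(duplicated_primary_reads(example))
```

Running the same check on the intermediate SAM file and on the text output of `samtools view -h` for the sorted BAM would show whether duplicated records are present only after conversion and sorting, which is useful detail to include when asking elsewhere.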
