Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

MergeBamAlignment error

Answered
5 comments

  • Genevieve Brandt (she/her)

    Thanks for writing in, Quentin Chartreux.

    It looks like this error has been seen before. Could you check out this post and try some of the troubleshooting methods referenced there? https://gatk.broadinstitute.org/hc/en-us/community/posts/360067295232-mergeBam-picard-issue

    If those do not work, please include the full stack trace from MergeBamAlignment in your follow-up comment.

    Best,

    Genevieve

  • Quentin Chartreux

    Thank you for your reply.

    I had indeed seen this post. I tried sorting both files (the uBAM and the aligned BAM) with samtools sort -n, but even then MergeBamAlignment gives this error.

    The problem seems to come from the fact that I used samtools to generate the uBAM. When I retried with FastqToSam, it worked fine.

    The only "problem" is that, since samtools is multithreaded, FastqToSam is slow compared to samtools (more than 2 h vs. 20 min).

     

    INFO 2021-11-23 13:48:43 SamAlignmentMerger Read 796000000 records from alignment SAM/BAM.
    INFO 2021-11-23 13:48:46 SamAlignmentMerger Read 797000000 records from alignment SAM/BAM.
    INFO 2021-11-23 13:48:50 SamAlignmentMerger Read 798000000 records from alignment SAM/BAM.
    INFO 2021-11-23 13:48:54 SamAlignmentMerger Read 799000000 records from alignment SAM/BAM.
    INFO 2021-11-23 13:48:58 SamAlignmentMerger Read 800000000 records from alignment SAM/BAM.
    INFO 2021-11-23 13:49:01 SamAlignmentMerger Read 801000000 records from alignment SAM/BAM.
    INFO 2021-11-23 13:49:05 SamAlignmentMerger Read 802000000 records from alignment SAM/BAM.
    INFO 2021-11-23 13:49:09 SamAlignmentMerger Finished reading 802992576 total records from alignment SAM/BAM.
    [Tue Nov 23 13:49:27 CET 2021] picard.sam.MergeBamAlignment done. Elapsed time: 53.90 minutes.
    Runtime.totalMemory()=2702180352
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    java.lang.IllegalStateException: Aligned record iterator (A00957:55:HGJHFDSXY:4:1101:10004:11240) is behind the unmapped reads (A00957:55:HGJHFDSXY:4:1101:1018:3223)
    at picard.sam.AbstractAlignmentMerger.mergeAlignment(AbstractAlignmentMerger.java:557)
    at picard.sam.SamAlignmentMerger.mergeAlignment(SamAlignmentMerger.java:186)
    at picard.sam.MergeBamAlignment.doWork(MergeBamAlignment.java:368)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:308)
    at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:37)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
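
The two read names in the exception above point to a sort-order mismatch. As a rough illustration (a sketch under assumptions, not the tools' actual code), byte-wise `sort` stands in here for the lexicographic read-name order that htsjdk-based tools such as Picard expect, and GNU `sort -V` approximates the natural, number-aware order produced by `samtools sort -n`. GNU coreutils is assumed.

```shell
# The two read names reported in the IllegalStateException.
names='A00957:55:HGJHFDSXY:4:1101:1018:3223
A00957:55:HGJHFDSXY:4:1101:10004:11240'

# Byte-wise lexicographic order (stand-in for htsjdk queryname order):
# "10004" sorts before "1018" because '0' < '1' at the third character.
lexical_first=$(printf '%s\n' "$names" | LC_ALL=C sort | head -n 1)

# Natural / version order (approximation of samtools sort -n):
# the numeric field 1018 sorts before 10004.
natural_first=$(printf '%s\n' "$names" | LC_ALL=C sort -V | head -n 1)

echo "lexicographic first: $lexical_first"
echo "natural first:       $natural_first"
```

The two orders disagree on which read comes first, which is consistent with the merger finding the aligned iterator "behind" the unmapped reads.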

  • Genevieve Brandt (she/her)

    Thanks for the update, Quentin Chartreux. It's strange that the output of samtools would have this issue. Could you check the samtools unmapped BAM with ValidateSamFile to verify that there are no other issues with the file format?

    I will follow up with my colleagues to get more information about the MergeBamAlignment error. Unfortunately, since there is a holiday in the US for the rest of this week, I don't think I will be able to get back to you until next Monday. 

  • Quentin Chartreux

    Genevieve Brandt (she/her) 

    I used GATK 4.2.3.0 with the following command:

    gatk --java-options "-Xmx${MEMORYxmx}M" ValidateSamFile \
    -I ${BAM_DIR}/${BAM} \
    -M SUMMARY \
    -O ${BAM_DIR}/${BAM/.bam/.txt}

     

    The result is the same for all 4 uBAMs tested: No errors found

    Another strange thing is the size of the uBAM: the one from samtools is 59G, while the one from FastqToSam is 88G.

    The order of the reads is not the same between the output of samtools and that of FastqToSam:

    First reads of the uBAM from samtools:

    A00957:55:HGJHFDSXY:4:1101:1217:1031 77 * 0 0 * * 0 0 ANATATACGTCATACTGAAGTCAATCTAGTCTACAACATGGTAAGGATTCATACTAATCACTTAACTCACTGACTCAAATAGACCAAATGGTTGAATTTTACATATGCACAGTCTAAGTTGAACTTAACTTTTATTTTGTGTCAATTTCCA ,#FF,,F,:F,F:FF:,,FFF:FFFF:FF:FFFF,FFF,,FF,F,,FF::,:F::,,:,F:,,F,F,FFFF:F,F:FF:F:,,,,,:F,F,FFF:FFFFFF:FF:,,F,FFFFF:FF,FF,F,:FFF:F:FFF:,,:F:F,F,FF:F:,FF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:1217:1031 141 * 0 0 * * 0 0 ATTCTCTCTACAACCTGAAGAAAATCACACTGTGACTCTTGTTCACCTTTAAGTTTTCCAGGAAAGAATCAGACATAAGTCTAATTTTATTGTCAAATGTTCAGTTATACTACCAAGCTATTTAACTAGTCTTATAGAAGCCTCCACAGTT F,F:,FFFFF:,F,FF,F,F,::,F,,:FFF:,F,,:F,FF::,:::,FFF:,FFFFF,F,F,F:,F,:FFFF:::,:FFF,,,,,,,,,::F,FFF::FF,F,,,F,F:,F,F:,F:F,FFFFF,F,,:,F,F:,:FFF,,FFFFFFFFF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:1561:1031 77 * 0 0 * * 0 0 TNAAATATTTCAAAAGAGGTAGGATTTTATACAAAAACAATGTTGTGATGAAGAATGTTTGAATTGCATGCAAAATCTTTACAAATGAGCATCTGGCTATTTAAGCCAATCAATTTTGAAGTTAAATCATAAAATATTTATGACATGCTGA F#F,FFFFFFF:F:,FFFFFFF::FFFFFFFFFFFFFFFFFF,:FFFFFF:FFFFFFFFFFFFFFFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFFFFFF:FFFFFF:FFFF:F:FFFFF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:1561:1031 141 * 0 0 * * 0 0 CTTAGAATGCATCAAGTTTTGACAGATTTTGGGTGTGTATGCAATTTTTTAACAAAGTTTTATAATAGCATAACATGATATTTTTCAATTCCTGTTCGACAATATTATTATAAAGAAAATCGTTCTTGAGATAGTAGTACACTCCATCAAT FFFF,FFFFFFFFF,FFF,FF,FFFF,:FFF:FFFFFFFFFFFF:FFFF:FFFF,FFFF::FFFFFFFFFF::FFFFFF:FFFFFF,FFFFFFFFFFFFFFFFFF:FFF:F:FFF,FFFFFF,FF:FFFFFFFFFFFFFFFFFF,FFFF:F RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:1597:1031 77 * 0 0 * * 0 0 ANATCTCCTTTCCCTTTTTCCATGATAATTTCATTTCTGTCAACTTCTGGGAAAATAACCCTCATCCTTATGTTACTTTATGCTTGTGTTGGCCTGCTTGTGTATCTGGTGTTTTTACAAAATTTAGTTTTTACGTTTTCTTAGCTCATAC F#FFFFFFFFFFF,FFFFFF:FFFFFFFFFF:FFFFFFFFFFF:FFFFF:FFFFFFFFFFFFF,FFFF:FFF:F,,FFFFF,,,FFFFFFFFF,FFFFFFFFFFFFFFFFFF,FFFFFFFFFF:FFFFFFFFFFFFFFF,FFFFFF:FFF: RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:1597:1031 141 * 0 0 * * 0 0 CGTAAAAGAGAACATAGTAACAATAGTCAATGTAGCATTGCTCTAGCAAAGCCCGGTTTTTTTTTTAATATATAATTTCTGGCCATCTTAGCACACACCCTGCCTGTTCTTCCTTACTAGCTCTATTTTGCAGAATCATACTCATCTTTCA FFFF:FFFFFFF:F:FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:,FFFFFFF:FFFFF:FFF,FFFFFFFFF:FFFFFFFFF,FF:FFF,FFF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:1633:1031 77 * 0 0 * * 0 0 TNATTAGCTTTTTGTCCTTATTGTGATCACTTCACAATGTATATCAAAACATCAAATTGTACAATATAAACAATTTTTACTTGTCAATTATACCTCAATAAAGCTGAAAAAGGAATCAAAGAGATAGGTTTAATCAGTTAAATTGAAAATG F#FFFFFFFFFFFFFF:FFF:FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFF:FFFFFFFFF:F,FFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFF:FFFFFFFFFFFFF:FFFFFFF:FFF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:1633:1031 141 * 0 0 * * 0 0 TTTGTTGTCTAATATTTTATAATGATAATGTATATTTATGTAACTTGTATAATTATACATAAAATTTAAAATTTTATGTAGTTTTGATATATTCAAAATAGTTTGTAATTAAGCAGCTTGCATGTAACTAGAATGCTGTCATTGTAAATGA FFFF:FF:F:FF:FFFFF:FFFFFF,:FFFF,FFFF,F:FFFFF:FFFFF,F:FFFFFFFFF::FFFFFFFFFFF:FFFFF:FFFF:F,FFFFFFFF,F:FFFFFFFFFF:FFF,FFFFFF,FFF,F:FFFF:FFFFFFF,FFFFFF,FFF RG:Z:MNM00254

    First reads of the uBAM from FastqToSam:

    A00957:55:HGJHFDSXY:4:1101:10004:10113 77 * 0 0 * * 0 0 ATGAATGTGTAAACAAACTGTGGTATATCTATACAATGGAATATTATACAGTGATAAAAATAAATGGGCTATTCAGCCATCCATAGAGATGAATCTCAAATGGAGGTTTGAATTGTAATCCCAAAAGATGATCCCAAATTCCATGTGCCAA FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:10004:10113 141 * 0 0 * * 0 0 AGAGTCAAATTTCTGATGTTCAATCTGACATCCAAGGAAGCAGATGAAAAGTTTGCCCCAGCTTTCAGAGAGAGAGGAGACAGAGAAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGACAGAGAGA FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFFFFFFF:F:FFFFF:FFFFFFFFFFFFF:FFF,FFFFFF::FF::F,FFFFFFFFFFFF:FFF,FFFFFFF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:10004:1094 77 * 0 0 * * 0 0 GGTACAGGACATGCTTCGTTTCTCTAACTGCCCTGTGTTAAGAAGTCTACTTCTCAGATGCGGTTAAAATCTAGCCTCAAAATCAGAAAGCAGAAGCTCTCAACCAGAAAGTGAGAGCAAAGATATGCTCATCAAATCGTGCATTTGATAC ::FF::F,F:FFF,FFF:FFF:F,FFF:F:FF,:FFFFFFF,FF,:F:,,,,FFFFFF:,:,,FFFFFFFF,FFFF:,FF,FFFFFFFF,FFF:F,:,:F:FF,FFFF:F:FFFFFF,F:F,F::F,:::FF:FFFF,F,::FF,FF,FFF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:10004:1094 141 * 0 0 * * 0 0 CAGATGATTGCTGGAAATCACTAACTCTTGTTGGTTTATTCTGTAGCAAATCTAACTGGGCTGTTTTCCCATTCAGAGGAGCAGGACAAAGCCAGGTCACGGCTTGTCAATCTATACCACGTAGGTTTCCAGTAAGTGTGAGTATCCTCAA ,FFFFFF:FFFF:FF:FF:FFFFFFF:F,:F,::FFFFFF:FFFFFF:FFFF,F,,:FFF,FF::FFF,FFFFFFF:F,FFFFFFFFFFFF,FFFF,FFFF:FFF,FFFFFFFF:F,,F:,,FFF:,F::F,FF:F:F:F,FFFFF:FF,F RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:10004:11522 77 * 0 0 * * 0 0 CATGCCTAAAAATTTTTAAATATAGATTGTCTTCAAAGTTGCTGGTGGTAACTTTAGTACAGACTTACCTGAATTCAAAAGTTAGTCCATGATTTGATGGCTGTCTTCACCAGCGATTTGCATTTTGAAGGAAATTAAATGATGGCTATAA FFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF::FFFFFFFFFFFFFFFFFFFF,F RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:10004:11522 141 * 0 0 * * 0 0 ATAGCAACGATTTGATAAATGAAGAGTAGGTACTCAATAAAGCAGCACACATGGACTGGTTAGAAACACTATGAGGTATTAACTTACAGAAGTCATTGTCACTTAAACAGGAGTACTCCCCAGTTATGATTAAGTGTTAATTTGAATAAAA FFF,FF,FFF:::FFFF::FF:F:FFFFF:F:::FFFFFFFFFFFFFFFFFFF,:F,FF:FFF:FFF:FFFF:FFFF,FFFFF::,,:FFF,F,FFFFFFFFFFFF,FFFFF,:FFFF,:FFFFF:FFFFF,FF:FFFFF,FFFFFFFF,F RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:10004:1188 77 * 0 0 * * 0 0 GGACTCACAGACAGACACACACAGAGGAAGACGGTGTGAAGAGACACAGGGAAAAGGTGGCCATCTACAAGCCAAGGAGAGAAGTCTGCAATAGACCTGCCACTGATAGCCCTCAGAAGGAACCAACCCTGCTGGCACCTTGATTTTGGAT FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFF RG:Z:MNM00254
    A00957:55:HGJHFDSXY:4:1101:10004:1188 141 * 0 0 * * 0 0 CTCCACACCTGGCTGACTTTTTATATTTTCAAACCATAGGGTTTGATTATCATTAGGTAAATTGAACCAAAGAGACGGGGTTTCACCATGTTGGCCAGGCTGGTCTTGAACTCCTGATCTCAAGAGATCCACCTCCCTTGGCCTCCCAAAG FFFFFFFFFFFFF:FFFF,FFFFFFFFFFFFFFFFF:FFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F RG:Z:MNM00254

  • Genevieve Brandt (she/her)

    Hi Quentin Chartreux,

    It looks like samtools and FastqToSam are sorting your uBAM differently, which is causing this problem. If you create the uBAM with samtools, you can run SortSam in GATK to sort the uBAM.

    This is a long-standing issue between samtools and htsjdk and how they sort, so the best way forward is to work around it in its current state.

    I'm sorry we don't have a solution that will speed things up as well, but you'll need the GATK sort order to run MergeBamAlignment at this point.

    Best,

    Genevieve
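
The SortSam workaround suggested above could look roughly like the following sketch. The file names are hypothetical placeholders, and the command only runs if `gatk` is on the PATH and the input file exists; otherwise the script just reports what it would write.

```shell
# Hypothetical file names; replace with your own uBAM paths.
ubam="sample.samtools.unmapped.bam"
sorted_ubam="${ubam%.bam}.queryname.bam"

# Re-sort a samtools-generated uBAM by queryname with Picard SortSam
# (invoked through the gatk wrapper), so the read order matches what
# MergeBamAlignment expects.
sort_ubam() {
    gatk SortSam -I "$1" -O "$2" -SO queryname
}

if command -v gatk >/dev/null 2>&1 && [ -f "$ubam" ]; then
    sort_ubam "$ubam" "$sorted_ubam"
else
    echo "Skipping: gatk or $ubam not found; would write $sorted_ubam"
fi
```

The re-sorted file would then be given to MergeBamAlignment as the unmapped input in place of the original samtools uBAM.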

