Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

ValidateSamFile Error: Mate not found

Answered

13 comments

  • Genevieve Brandt (she/her)

    Hi Stephen Johnson,

    You could be losing mates unknowingly in one of the steps of your pipeline. If this is your first time running these methods and you are trying to validate them, you may want to isolate which step causes this error. Try running ValidateSamFile on your BAM file after each step to see where the error first appears; that should point to an explanation.

    Best,

    Genevieve
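
One way to carry out the step-by-step check suggested above is to loop over the intermediate files in pipeline order and run ValidateSamFile in summary mode on each. This is a minimal sketch rather than an official GATK recipe: the function name `validate_each` and the file names in the usage comment are hypothetical, and it assumes the `gatk` wrapper is on your PATH.

```shell
# Run ValidateSamFile in SUMMARY mode on each file, in the order given.
# Assumption: `gatk` is on PATH; pass your intermediate SAM/BAM files in pipeline order.
validate_each() {
    for f in "$@"; do
        echo "== $f =="
        # SUMMARY mode reports error types and counts instead of one line per read.
        # The tool may exit non-zero when it finds problems, so keep looping regardless.
        gatk ValidateSamFile -I "$f" --MODE SUMMARY || true
    done
}

# Hypothetical usage:
# validate_each aligned.sam with_rg.bam fixed_mates.bam
```

The first file in the list whose summary includes MATE_NOT_FOUND marks the step worth inspecting.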

  • Stephen Johnson

    Hi Genevieve Brandt (she/her),

    Thanks for your reply. I already get the following error on my SAM file immediately downstream of reference genome alignment:

    ERROR::MISSING_READ_GROUP:Read groups is empty

    Because I have been unable to figure out why the read group info is lost when I align, I'm trying to add this information back with the tools described in my post. A collaborator told me that this method worked for them. I added the read groups back with AddOrReplaceReadGroups and also tried to fix any incorrect mate information with FixMateInformation. After this, I no longer get the "Read groups is empty" error, but I now get "Error: mate not found" messages for a large number of reads.

    Do you have any resources or suggestions for troubleshooting "Error: mate not found" after using Picard AddOrReplaceReadGroups?

  • Genevieve Brandt (she/her)

    Stephen Johnson, can you try adding read groups to your files during the mapping step? Does the "Error: mate not found" come up before or after FixMateInformation?

  • Stephen Johnson

    Hi,

    I think it should also be possible to add the read group information during the mapping step, but I've read some posts on that and am having trouble understanding how. I've read that it's easier to add the read group information back to the resulting BAM file with Picard, as I'm trying to do. The "Error: mate not found" comes up after AddOrReplaceReadGroups and before FixMateInformation.

  • Genevieve Brandt (she/her)

    Hi Stephen Johnson,

    I see, thanks for clarifying where you see this error. I noticed that your AddOrReplaceReadGroups command is missing a read group ID. Please make sure to add that field, and share the stack trace from the command if there are still issues after running AddOrReplaceReadGroups.

    Best,

    Genevieve
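
For reference, here is a sketch of an AddOrReplaceReadGroups call that sets the read group ID alongside the other fields. The read group values are copied from the bwa mem command shared later in this thread; the function name `add_read_groups` and the input/output file names are hypothetical, and the `gatk` wrapper is assumed to be on your PATH.

```shell
# Add or replace read group fields on an existing SAM/BAM file.
# Assumption: `gatk` is on PATH. Arguments: input file, output file.
add_read_groups() {
    gatk AddOrReplaceReadGroups \
        -I "$1" \
        -O "$2" \
        --RGID D1 \
        --RGLB library1_L1 \
        --RGPL ILLUMINA \
        --RGPU NCGCGGTT+NGCGCTAG \
        --RGSM D1
}

# Hypothetical usage:
# add_read_groups aligned.bam aligned_withRG.bam
```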

  • Stephen Johnson

    Okay, I added the ID tag to the AddOrReplaceReadGroups command but still got the same "ERROR::MATE_NOT_FOUND" errors. I also figured out how to add the read groups during alignment and tried that, but I still get the same errors when running ValidateSamFile, even after I try adding the read groups with AddOrReplaceReadGroups.

    I've looked around but am not sure how to read the stack trace. Are you aware of any posts demonstrating how to do this for Java/GATK commands? Also, would I want the stack trace for AddOrReplaceReadGroups or ValidateSamFile?
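
On capturing a stack trace: Java programs print exception stack traces to standard error, so one simple approach is to redirect that stream to a file and share the file. The following is a minimal sketch; the helper name `run_with_log` and the file names in the usage comment are hypothetical.

```shell
# Run a command, saving standard error (where Java prints stack traces) to a log file.
# Arguments: log file path, then the command and its arguments.
run_with_log() {
    log="$1"
    shift
    "$@" 2> "$log"
    rc=$?
    # Report on stderr so the note does not mix with the command's own stdout.
    echo "exit status: $rc (stderr saved to $log)" >&2
    return "$rc"
}

# Hypothetical usage:
# run_with_log validate.log gatk ValidateSamFile -I aligned_withRG.bam --MODE VERBOSE
```

The same wrapper works for any step, so it can be applied to whichever command support staff ask about.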

  • Genevieve Brandt (she/her)

    I don't think we have any posts about how to read a stack trace, but that's a really great idea. I'll put in a request with our documentation team!

    Do you have any more specific questions about it I could answer?

    Does FixMateInformation get rid of the MATE_NOT_FOUND errors? Now that you have successfully fixed your read group issue, could you find exactly when the MATE_NOT_FOUND error comes up now? Then please provide the command that seems to cause the issue.
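
To narrow down which reads are involved, it can help to list read names that are flagged as paired but do not have exactly two primary records in the file. The sketch below is a rough diagnostic and not a reimplementation of Picard's check. It reads SAM records on standard input (for example from `samtools view`), and the function name `list_unmated` and the file name in the usage comment are hypothetical.

```shell
# Print names of paired reads that do not have exactly two primary records.
# Reads SAM text on stdin; header lines (starting with @) are skipped.
list_unmated() {
    awk -F '\t' '
        /^@/ { next }
        {
            flag = $2
            paired        = flag % 2               # 0x1: read is paired
            secondary     = int(flag / 256) % 2    # 0x100: secondary alignment
            supplementary = int(flag / 2048) % 2   # 0x800: supplementary alignment
            if (paired && !secondary && !supplementary) count[$1]++
        }
        END { for (name in count) if (count[name] != 2) print name }
    ' | sort
}

# Hypothetical usage:
# samtools view aligned_withRG.bam | list_unmated | head
```

Running this on the output of each pipeline step shows whether mates are already missing after alignment or disappear later.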

  • Stephen Johnson

    Thanks for your response.

    To clarify my last post, I do still get MATE_NOT_FOUND errors when I add read group information with AddOrReplaceReadGroups, and FixMateInformation does not solve the issue.

    That being said, I was just able to add read group information during alignment, so I will align everything that way and try to continue my pipeline from there.

  • Genevieve Brandt (she/her)

    OK, I see. Please let us know if that doesn't work and you have further questions!

  • Sinem Selvi

    Hi,

    Is there any update about the "MATE_NOT_FOUND" problem?

    Thanks

  • Genevieve Brandt (she/her)

    Stephen Johnson, do you have any recommendations to share with Sinem Selvi on how you fixed your "MATE_NOT_FOUND" issue?

  • Stephen Johnson

    Yes, I was able to fix the read group errors I was getting. Sorry for not replying with the solution earlier. The solution was to specify the read group information (the ID, LB, PL, PU, and SM fields of the @RG line, which Picard calls RGID, RGLB, RGPL, RGPU, and RGSM) during bwa mem alignment, rather than adding it after the alignment step. I've included an example below.

    bwa mem -t 14 -M -R '@RG\tID:D1\tLB:library1_L1\tPL:ILLUMINA\tPU:NCGCGGTT+NGCGCTAG\tSM:D1' ../../ref_genome/Brapa_v3.0.fasta Library-1_S1_L001_R1_001.paired.fastq.gz Library-1_S1_L001_R2_001.paired.fastq.gz > ../sam/Library1_L1_pa_withRG.sam
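
To confirm that the read group actually made it into the alignment output, one option is to check the SAM header for an @RG line before moving on. This is a minimal sketch that assumes the header is available as text on standard input (for example from `samtools view -H`); the function name `check_read_group` is hypothetical.

```shell
# Succeed if the SAM header on stdin has an @RG line carrying the given ID.
# Argument: expected read group ID (e.g. D1).
check_read_group() {
    awk -F '\t' -v id="$1" '
        $1 == "@RG" {
            for (i = 2; i <= NF; i++) if ($i == "ID:" id) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Hypothetical usage:
# samtools view -H ../sam/Library1_L1_pa_withRG.sam | check_read_group D1 && echo "read group present"
```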

  • Genevieve Brandt (she/her)

    Thank you Stephen Johnson!

