ValidateSamFile Error: Mate not found
Hello,
I'm a relatively new user trying to generate bam files for SNP calling. I'm using the latest version of Picard (2.25.6). My basic pipeline is to:
1) trim low quality sequences and adapters with trimmomatic
2) align reads to reference genome with bwa-mem
3) sort resulting sam files, and merge technical replicates which were sequenced on different cells
4) filter low quality bases with samtools view
5) sort and convert to bam with samtools sort
After running this pipeline, I was getting errors about mates not getting found for many paired end reads.
After looking at other users' experiences and posts online, I attempted to fix this problem using both the AddOrReplaceReadGroups and FixMateInformation tools of Picard (code below).
java -jar ~/libraries/picard2.25.jar AddOrReplaceReadGroups I=Library1_pasmfs_withhead.bam O=Library1_withRG.bam RGPL=illumina RGPU=NCGCGGTT+NGCGCTAG RGSM=D1 RGLB=Library1_S1
java -jar ~/libraries/picard2.25.jar FixMateInformation I=Library1_withRG.bam O=Library1_withRG_fixedMates.bam
Both seemed to run fine based on the messages printed to the command-line interface. However, I'm still getting errors about mates not being found when I run ValidateSamFile (below).
java -jar ~/libraries/picard2.25.jar ValidateSamFile I=Library1_withRG_fixedMates.bam IGNORE_WARNINGS=true MODE=VERBOSE
INFO 2021-06-16 16:38:08 ValidateSamFile
********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
********** ValidateSamFile -I Library1_withRG_fixedMates.bam -IGNORE_WARNINGS true -MODE VERBOSE
**********
16:38:09.015 INFO NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/jmunshisouth/libraries/picard2.25.jar!/com/intel/gkl/native/libgkl_compression.dylib
[Wed Jun 16 16:38:09 EDT 2021] ValidateSamFile INPUT=Library1_withRG_fixedMates.bam MODE=VERBOSE IGNORE_WARNINGS=true MAX_OUTPUT=100 VALIDATE_INDEX=true INDEX_VALIDATION_STRINGENCY=EXHAUSTIVE IS_BISULFITE_SEQUENCED=false MAX_OPEN_TEMP_FILES=8000 SKIP_MATE_VALIDATION=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Wed Jun 16 16:38:09 EDT 2021] Executing as jmunshisouth@Peromyscus.local on Mac OS X 10.14.3 x86_64; OpenJDK 64-Bit Server VM 11.0.8+10-LTS; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.25.6
WARNING 2021-06-16 16:38:09 ValidateSamFile NM validation cannot be performed without the reference. All other validations will still occur.
INFO 2021-06-16 16:39:06 SamFileValidator Validated Read 10,000,000 records. Elapsed time: 00:00:57s. Time for last 10,000,000: 57s. Last read position: A01:26,341,123
INFO 2021-06-16 16:40:05 SamFileValidator Validated Read 20,000,000 records. Elapsed time: 00:01:56s. Time for last 10,000,000: 59s. Last read position: A02:21,515,498
INFO 2021-06-16 16:41:09 SamFileValidator Validated Read 30,000,000 records. Elapsed time: 00:03:00s. Time for last 10,000,000: 63s. Last read position: A03:16,073,129
INFO 2021-06-16 16:42:18 SamFileValidator Validated Read 40,000,000 records. Elapsed time: 00:04:09s. Time for last 10,000,000: 69s. Last read position: A04:2,760,120
INFO 2021-06-16 16:43:27 SamFileValidator Validated Read 50,000,000 records. Elapsed time: 00:05:18s. Time for last 10,000,000: 68s. Last read position: A05:5,632,307
INFO 2021-06-16 16:44:42 SamFileValidator Validated Read 60,000,000 records. Elapsed time: 00:06:33s. Time for last 10,000,000: 75s. Last read position: A06:3,189,770
INFO 2021-06-16 16:45:38 SamFileValidator Validated Read 70,000,000 records. Elapsed time: 00:07:28s. Time for last 10,000,000: 55s. Last read position: A06:22,435,402
INFO 2021-06-16 16:46:52 SamFileValidator Validated Read 80,000,000 records. Elapsed time: 00:08:43s. Time for last 10,000,000: 74s. Last read position: A07:20,162,605
INFO 2021-06-16 16:48:05 SamFileValidator Validated Read 90,000,000 records. Elapsed time: 00:09:55s. Time for last 10,000,000: 72s. Last read position: A08:17,582,933
INFO 2021-06-16 16:49:40 SamFileValidator Validated Read 100,000,000 records. Elapsed time: 00:11:31s. Time for last 10,000,000: 95s. Last read position: A09:21,123,849
INFO 2021-06-16 16:50:57 SamFileValidator Validated Read 110,000,000 records. Elapsed time: 00:12:48s. Time for last 10,000,000: 77s. Last read position: A10:813,721
INFO 2021-06-16 16:52:10 SamFileValidator Validated Read 120,000,000 records. Elapsed time: 00:14:01s. Time for last 10,000,000: 73s. Last read position: Scaffold0156:35,018
INFO 2021-06-16 16:54:04 SamFileValidator Validated Read 130,000,000 records. Elapsed time: 00:15:55s. Time for last 10,000,000: 113s. Last read position: Scaffold0689:8,860
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1114:10095:16188, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2277:9905:10942, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2216:16658:2754, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1232:13973:6214, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2111:2727:4178, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2202:15700:15248, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2247:11650:29747, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2138:20880:11475, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1119:32081:26835, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1150:9570:18881, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1139:24135:34334, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2237:8386:12289, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2248:28935:35415, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2203:21667:19914, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2177:15926:20525, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2118:7346:12555, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2209:27932:21402, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2230:23384:36041, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2263:20283:17550, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2140:18873:17644, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1138:2727:25254, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1268:26223:15593, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2169:23556:14293, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1107:4381:28682, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1102:24261:23124, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2273:15275:33614, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2215:12780:1172, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1237:8350:22122, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1248:16342:14669, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2176:15628:28338, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2104:25301:35039, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2141:7916:2832, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1235:7862:20024, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1170:19831:2268, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1217:10366:20228, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2229:27516:21778, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2275:8422:32894, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1124:15926:22529, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2138:14073:29246, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1211:1235:36855, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2159:16984:26240, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1150:29423:12743, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1264:26521:20870, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2269:31358:27273, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2133:22535:4225, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1273:18954:31720, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1150:31168:36933, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2212:23854:23046, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1227:5023:16673, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1271:22417:36714, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1159:28845:21762, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1137:20365:32565, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2259:5493:13510, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2159:3884:19774, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1219:12870:26913, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2259:26042:2534, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2170:8757:2942, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2101:13422:6918, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1167:16649:33614, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1144:3830:6339, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2121:8558:18724, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1125:3034:22999, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1233:23077:28275, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2177:27787:17550, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1203:13458:9267, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1242:10041:12587, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1255:26612:36088, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1151:5602:23876, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2109:29297:1251, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1118:24288:27179, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1128:15881:10488, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1212:28727:2957, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1266:26440:13808, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2210:25988:15969, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1273:12364:3458, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2137:22218:14920, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1271:13856:30749, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1248:31693:21934, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2213:12608:25457, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2277:16080:12305, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1142:3730:13745, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2170:18358:17409, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1227:27444:15295, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2167:11442:7560, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2262:14398:21762, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2205:11514:6339, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1233:21160:36229, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1144:14009:16454, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1127:11098:36620, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2236:14045:13823, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1202:23529:17221, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2230:31033:8328, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2138:10583:10864, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2112:24243:3677, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:1213:23484:35837, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2167:32931:31125, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2205:9914:2096, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:1149:14913:24721, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:2:2146:18078:17895, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name A00975:57:HHGM7DRXX:1:2245:5511:3646, Mate not found for paired read
Maximum output of [100] errors reached.
[Wed Jun 16 17:01:15 EDT 2021] picard.sam.ValidateSamFile done. Elapsed time: 23.10 minutes.
Runtime.totalMemory()=2734686208
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
I'm aware I'm using the older syntax, but I don't think that would be an issue. Since I'm an inexperienced user, I'm not sure whether all of these "Mate not found" errors are typical or indicate a problem with my file, or what the next step would be for fixing or excluding reads whose mates are not found.
Any help or suggestions for next steps would be appreciated, thanks!
-
Hi Stephen Johnson,
You could be losing the mates unknowingly in one of the steps in your pipeline. If this is your first time running these methods and you are trying to validate your methods, you may want to isolate which step is causing this error to come up. You can try running ValidateSamFile for your bam file after each step and figure out when it comes up to find an explanation.
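A minimal sketch of that step-by-step check, assuming the intermediate files from each pipeline step were kept. The helper name `validate_steps` and the file names in the usage line are hypothetical; `MODE=SUMMARY` and `IGNORE_WARNINGS=true` are existing ValidateSamFile arguments, and the jar path is the one used in the original post. Any step that filters records one at a time (for example step 4) could plausibly drop one read of a pair, so that is a reasonable place to look first.

```shell
# Path to the Picard jar; the default is the path used earlier in this thread.
PICARD_JAR="${PICARD_JAR:-$HOME/libraries/picard2.25.jar}"

# Hypothetical helper: run ValidateSamFile in SUMMARY mode on each file given,
# in pipeline order, so the first file reporting MATE_NOT_FOUND identifies
# the step that introduced the problem.
validate_steps() {
    for f in "$@"; do
        echo "== $f =="
        java -jar "$PICARD_JAR" ValidateSamFile I="$f" MODE=SUMMARY IGNORE_WARNINGS=true
    done
}

# Usage (file names are placeholders for your own intermediate files):
#   validate_steps aligned.sam sorted_merged.sam filtered.sam final.bam
```

SUMMARY mode prints a count per error type instead of one line per read, which makes it easier to compare steps.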
Best,
Genevieve
-
Hi Genevieve Brandt (she/her),
Thanks for your reply. I already get the following error on my sam file immediately downstream of reference genome alignment:
ERROR::MISSING_READ_GROUP:Read groups is empty
Because I have been unable to figure out why the read group info is lost when I align, I'm trying to add this information back with the tools described in my post. A collaborator told me that this method worked for them. I tried adding the read groups back with AddOrReplaceReadGroups and then fixing any incorrect mate information with FixMateInformation. After this, I no longer get the "Read groups is empty" error, but I now get "Error: mate not found" messages for a large number of reads.
Do you have any resources or suggestions for troubleshooting "Error: mate not found" after using Picard AddOrReplaceReadGroups?
-
Stephen Johnson can you try to add read groups to your files during the mapping step? Does the "Error: mate not found" come up after or before FixMateInformation?
-
Hi,
I think it should also be possible to add the read group information during the mapping step, but I've read some posts on that and am having trouble understanding how. I've read that it's easier to add the read group information back to the resulting bam file with Picard, as I'm trying to do. The "Error: mate not found" comes up after AddOrReplaceReadGroups and before FixMateInformation.
-
Hi Stephen Johnson,
I see, thanks for clarifying where you see this error. I noticed that your AddOrReplaceReadGroups command does not set a read group ID. Please make sure to add that field, and also share the stack trace from the command if there are still issues following AddOrReplaceReadGroups.
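For illustration, the command from the original post with the ID field added might look like the sketch below. `RGID` is the AddOrReplaceReadGroups argument for the read group ID; the helper name `add_read_groups` is hypothetical and the ID value passed in is a placeholder to replace with your own. The other values are copied from the original command.

```shell
# Path to the Picard jar; the default is the path used earlier in this thread.
PICARD_JAR="${PICARD_JAR:-$HOME/libraries/picard2.25.jar}"

# Hypothetical helper: $1 = input BAM, $2 = output BAM, $3 = read group ID.
# All other read group values are taken from the original post.
add_read_groups() {
    java -jar "$PICARD_JAR" AddOrReplaceReadGroups \
        I="$1" O="$2" \
        RGID="$3" RGLB=Library1_S1 RGPL=illumina \
        RGPU=NCGCGGTT+NGCGCTAG RGSM=D1
}

# Usage (the ID value D1 is an assumed placeholder):
#   add_read_groups Library1_pasmfs_withhead.bam Library1_withRG.bam D1
```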
Best,
Genevieve
-
Okay, I added the ID tag to the AddOrReplaceReadGroups command but still got the same "ERROR::MATE_NOT_FOUND" errors. I also figured out how to add the read groups during alignment and tried that, but I still get the same errors when running ValidateSamFile, even after I try adding the read groups with AddOrReplaceReadGroups.
I've looked around but am not sure how to read the stack trace. Are you aware of any posts demonstrating how to do this for java/GATK commands? Also, would I want the stack trace for AddOrReplaceReadGroups or ValidateSamFile?
-
I don't think we have any posts about how to read the stack trace but that's a really great idea, I'll put in a request with our documentation team!
Do you have any more specific questions about it I could answer?
Does FixMateInformation get rid of the MATE_NOT_FOUND errors? Now that you have successfully fixed your read group issue, could you find exactly when the MATE_NOT_FOUND error comes up now? Then please provide the command that seems to cause the issue.
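One way to see which reads are affected at a given step is to list read names that are flagged as paired but do not appear exactly twice among primary records. This is a rough sketch, not Picard's own check: it assumes the validation only considers primary (non-secondary, non-supplementary) paired records, and the function name `orphan_read_names` is hypothetical. It uses only the SAM FLAG bits 0x1 (paired), 0x100 (secondary) and 0x800 (supplementary).

```shell
# Reads SAM text on stdin (header lines are skipped) and prints the names of
# paired, primary reads that do not occur exactly twice, i.e. reads whose
# mate is missing from the file (or duplicated).
orphan_read_names() {
    awk -F'\t' '
        /^@/ { next }                               # skip header lines
        {
            flag = $2
            paired        = flag % 2                # bit 0x1
            secondary     = int(flag / 256) % 2     # bit 0x100
            supplementary = int(flag / 2048) % 2    # bit 0x800
            if (paired && !secondary && !supplementary) print $1
        }' | sort | uniq -c | awk '$1 != 2 { print $2 }'
}

# Usage (assumes samtools is on PATH; sorting all read names of a large file
# can take a while and needs temporary disk space):
#   samtools view Library1_withRG_fixedMates.bam | orphan_read_names | head
```

Running this on the output of each pipeline step shows where the list first becomes non-empty.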
-
Thanks for your response.
To clarify my last post, I do still get MATE_NOT_FOUND errors when I add read group information with AddOrReplaceReadGroups, and FixMateInformation does not solve the issue.
That being said, I was just able to add read group information during alignment, so I will align everything with read group info added during alignment and try to continue my pipeline from there.
-
Ok I see, please let us know if that doesn't work and you have further questions!
-
Hi,
Is there any update about the "MATE_NOT_FOUND" problem?
Thanks
-
Stephen Johnson do you have any recommendations to share with Sinem Selvi for how you fixed your "MATE_NOT_FOUND" issue?
-
Yes, I was able to fix the read group errors I was getting. Sorry for not replying with the solution earlier. The solution was to specify the read group information (the ID, LB, PL, PU and SM fields) during bwa-mem alignment, rather than adding this information after the alignment step. I've included an example below.
bwa mem -t 14 -M -R '@RG\tID:D1\tLB:library1_L1\tPL:ILLUMINA\tPU:NCGCGGTT+NGCGCTAG\tSM:D1' ../../ref_genome/Brapa_v3.0.fasta Library-1_S1_L001_R1_001.paired.fastq.gz Library-1_S1_L001_R2_001.paired.fastq.gz > ../sam/Library1_L1_pa_withRG.sam
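To confirm that the read group actually reached the output, the header can be inspected for `@RG` lines. The sketch below is one hedged way to do that; the function name `list_read_groups` is hypothetical, and it only reads the `ID` and `SM` fields of each `@RG` header line.

```shell
# Reads SAM header text on stdin and prints "ID<TAB>SM" for each @RG line.
# Prints nothing if the header has no read groups.
list_read_groups() {
    awk -F'\t' '$1 == "@RG" {
        id = ""; sm = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^ID:/) id = substr($i, 4)   # strip the "ID:" prefix
            if ($i ~ /^SM:/) sm = substr($i, 4)   # strip the "SM:" prefix
        }
        print id "\t" sm
    }'
}

# Usage (assumes samtools is on PATH):
#   samtools view -H ../sam/Library1_L1_pa_withRG.sam | list_read_groups
```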
-
Thank you Stephen Johnson!