Error in running SplitNCigarReads - Attempt to add record to closed writer
Hi All,
I'm having an issue trying to run SplitNCigarReads, and I've tried all the potential fixes I've seen online but haven't been able to resolve it.
I'm using GATK 4.2.5.0 with java/temurin-8/8u322-b06 (Java 8 on my cluster)
I'm running the following command:
gatk SplitNCigarReads \
-R /home/regmvcr/Scratch/reference/sarek/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta \
-I "/home/regmvcr/Scratch/workspace/JSBF/star_salmon/"$SAMPLE".markdup.sorted.bam" \
-O "/home/regmvcr/Scratch/workspace/JSBF/SplitNCigarReads/"$SAMPLE"_split.bam"
(I'm using a parameter file on a cluster to run the command on multiple samples at once)
And I get the following error message:
GATK: Some GATK tools require conda and associated libraries.
To use them run:
module load python/miniconda3/4.10.3
source $UCL_CONDA_PATH/etc/profile.d/conda.sh
conda activate $GATK_CONDA
Using GATK jar /shared/ucl/apps/gatk-bsd/4.2.5.0/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar defined in environment variable GATK_LOCAL_JAR
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /shared/ucl/apps/gatk-bsd/4.2.5.0/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar SplitNCigarReads -R /home/regmvcr/Scratch/reference/sarek/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta -I /home/regmvcr/Scratch/workspace/JSBF/star_salmon/I3O-MC-JSBF-100-1003.markdup.sorted.bam -O /home/regmvcr/Scratch/workspace/JSBF/SplitNCigarReads/I3O-MC-JSBF-100-1003_split.bam
19:40:24.551 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/lustre/shared/ucl/apps/gatk-bsd/4.2.5.0/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 14, 2023 7:40:24 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
19:40:24.716 INFO SplitNCigarReads - ------------------------------------------------------------
19:40:24.716 INFO SplitNCigarReads - The Genome Analysis Toolkit (GATK) v4.2.5.0
19:40:24.716 INFO SplitNCigarReads - For support and documentation go to https://software.broadinstitute.org/gatk/
19:40:24.717 INFO SplitNCigarReads - Executing as regmvcr@node-h00a-012.myriad.ucl.ac.uk on Linux v3.10.0-1160.53.1.el7.x86_64 amd64
19:40:24.717 INFO SplitNCigarReads - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_322-b06
19:40:24.717 INFO SplitNCigarReads - Start Date/Time: September 14, 2023 7:40:24 PM BST
19:40:24.717 INFO SplitNCigarReads - ------------------------------------------------------------
19:40:24.717 INFO SplitNCigarReads - ------------------------------------------------------------
19:40:24.718 INFO SplitNCigarReads - HTSJDK Version: 2.24.1
19:40:24.718 INFO SplitNCigarReads - Picard Version: 2.25.4
19:40:24.718 INFO SplitNCigarReads - Built for Spark Version: 2.4.5
19:40:24.718 INFO SplitNCigarReads - HTSJDK Defaults.COMPRESSION_LEVEL : 2
19:40:24.718 INFO SplitNCigarReads - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
19:40:24.718 INFO SplitNCigarReads - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
19:40:24.719 INFO SplitNCigarReads - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
19:40:24.719 INFO SplitNCigarReads - Deflater: IntelDeflater
19:40:24.719 INFO SplitNCigarReads - Inflater: IntelInflater
19:40:24.719 INFO SplitNCigarReads - GCS max retries/reopens: 20
19:40:24.719 INFO SplitNCigarReads - Requester pays: disabled
19:40:24.719 INFO SplitNCigarReads - Initializing engine
19:40:25.434 INFO SplitNCigarReads - Done initializing engine
19:40:25.543 INFO ProgressMeter - Starting traversal
19:40:25.543 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
19:40:35.584 INFO ProgressMeter - chr1:1755459 0.2 295000 1762948.2
19:40:45.611 INFO ProgressMeter - chr1:11863327 0.3 564000 1686350.7
19:40:55.620 INFO ProgressMeter - chr1:16582638 0.5 1006000 2006849.1
19:41:05.622 INFO ProgressMeter - chr1:21258696 0.7 1275000 1908730.3
...
20:44:06.519 INFO ProgressMeter - chr22:41867223 63.7 158095000 2482533.3
20:44:16.525 INFO ProgressMeter - chrX:2719670 63.8 158464000 2481828.4
20:44:26.531 INFO ProgressMeter - chrX:20161751 64.0 158966000 2483204.8
20:44:36.543 INFO ProgressMeter - chrX:41343166 64.2 159351000 2482747.3
20:44:46.545 INFO ProgressMeter - chrX:53216185 64.4 159757000 2482625.0
20:44:56.546 INFO ProgressMeter - chrX:72273898 64.5 160125000 2481914.9
20:45:06.573 INFO ProgressMeter - chrX:79200408 64.7 160415000 2479986.0
20:45:16.611 INFO ProgressMeter - chrX:108571833 64.9 160804000 2479586.6
20:45:26.622 INFO ProgressMeter - chrX:130138642 65.0 161157000 2478652.7
20:45:36.719 INFO ProgressMeter - chrX:153782051 65.2 161530000 2477975.9
20:45:47.140 INFO ProgressMeter - chrY:10825619 65.4 162041000 2479210.0
20:45:57.139 INFO ProgressMeter - chrM:2699 65.5 162766000 2483968.3
20:46:07.139 INFO ProgressMeter - chr22_KI270733v1_random:134787 65.7 164454000 2503361.6
20:46:17.167 INFO ProgressMeter - chrUn_GL000220v1:113440 65.9 166153000 2522805.8
20:46:27.276 INFO ProgressMeter - chrUn_GL000220v1:157564 66.0 167306000 2533830.5
20:46:30.851 WARN IntelInflater - Zero Bytes Written : 0
20:46:30.860 INFO SplitNCigarReads - 0 read(s) filtered by: AllowAllReadsReadFilter
20:46:30.863 INFO OverhangFixingManager - Overhang Fixing Manager saved 694454 reads in the first pass
20:46:30.892 INFO SplitNCigarReads - Starting traversal pass 2
20:46:37.284 INFO ProgressMeter - chr1:3404725 66.2 168146000 2540135.4
20:46:48.282 INFO ProgressMeter - chr1:10291091 66.4 168321000 2535757.4
20:46:58.713 INFO ProgressMeter - chr1:16564851 66.6 168634000 2533836.5
20:47:08.746 INFO ProgressMeter - chr1:19113706 66.7 168965000 2532447.8
20:47:18.763 INFO ProgressMeter - chr1:21890055 66.9 169199000 2529625.2
20:47:29.315 INFO ProgressMeter - chr1:25813920 67.1 169452000 2526764.1
20:47:39.314 INFO ProgressMeter - chr1:28508573 67.2 169734000 2524694.6
20:47:49.443 INFO ProgressMeter - chr1:31692007 67.4 170037000 2522866.5
20:47:59.453 INFO ProgressMeter - chr1:36344170 67.6 170275000 2520159.5
20:48:09.458 INFO ProgressMeter - chr1:39447509 67.7 170537000 2517823.3
20:48:19.478 INFO ProgressMeter - chr1:45013600 67.9 170770000 2515062.2
20:48:29.490 INFO ProgressMeter - chr1:52525685 68.1 170995000 2512202.7
20:48:39.501 INFO ProgressMeter - chr1:63549257 68.2 171193000 2508960.8
20:48:49.565 INFO ProgressMeter - chr1:77632876 68.4 171475000 2506931.0
20:48:59.572 INFO ProgressMeter - chr1:89010984 68.6 171781000 2505295.9
20:49:09.869 INFO ProgressMeter - chr1:92837514 68.7 172031000 2502678.0
20:49:20.131 INFO ProgressMeter - chr1:100891510 68.9 172297000 2500326.5
20:49:30.138 INFO ProgressMeter - chr1:110345420 69.1 172598000 2498647.0
20:49:40.160 INFO ProgressMeter - chr1:117442273 69.2 172913000 2497168.8
20:49:50.174 INFO ProgressMeter - chr1:120463528 69.4 173200000 2495299.1
20:50:00.184 INFO ProgressMeter - chr1:120826424 69.6 173559000 2494475.6
20:50:10.196 INFO ProgressMeter - chr1:120838884 69.7 173996000 2494773.2
20:50:20.326 INFO ProgressMeter - chr1:144429847 69.9 174352000 2493840.6
20:50:30.333 INFO ProgressMeter - chr1:145584694 70.1 174825000 2494654.9
20:50:40.338 INFO ProgressMeter - chr1:146077591 70.2 175095000 2492576.7
...
21:17:56.047 INFO ProgressMeter - chr6:31828873 97.5 222011000 2276839.7
21:18:06.108 INFO ProgressMeter - chr6:31829365 97.7 222402000 2276934.0
21:18:16.447 INFO ProgressMeter - chr6:31829459 97.8 222828000 2277277.9
21:18:26.537 INFO ProgressMeter - chr6:31871533 98.0 223264000 2277819.0
21:18:36.548 INFO ProgressMeter - chr6:32031575 98.2 223609000 2277462.0
21:18:46.550 INFO ProgressMeter - chr6:33694905 98.4 223890000 2276459.3
21:18:56.882 INFO ProgressMeter - chr6:42652593 98.5 224124000 2274855.2
21:19:06.922 INFO ProgressMeter - chr6:46150821 98.7 224356000 2273348.8
21:19:16.925 INFO ProgressMeter - chr6:56607518 98.9 224691000 2272903.7
21:19:26.951 INFO ProgressMeter - chr6:72182472 99.0 224990000 2272087.7
21:19:36.956 INFO ProgressMeter - chr6:73519441 99.2 225334000 2271736.1
21:19:44.136 INFO SplitNCigarReads - Shutting down engine
[September 14, 2023 9:19:44 PM BST] org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads done. Elapsed time: 99.33 minutes.
Runtime.totalMemory()=5453119488
htsjdk.samtools.util.RuntimeIOException: Attempt to add record to closed writer.
at htsjdk.samtools.util.AbstractAsyncWriter.write(AbstractAsyncWriter.java:57)
at htsjdk.samtools.AsyncSAMFileWriter.addAlignment(AsyncSAMFileWriter.java:53)
at org.broadinstitute.hellbender.utils.read.SAMFileGATKReadWriter.addRead(SAMFileGATKReadWriter.java:21)
at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.writeReads(OverhangFixingManager.java:358)
at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.flush(OverhangFixingManager.java:338)
at org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads.closeTool(SplitNCigarReads.java:192)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1091)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
It's the same issue as the following topic:
But like the person in the comments there, specifying a larger --tmp-dir location does not work for me (and I have plenty of space). It appears to be an issue with SplitNCigarReads in general, as a similar issue occurs when I run a pre-configured rnaseq pipeline (it also crashes at the SplitNCigarReads step).
Any ideas how I can fix this? The input files are BAMs created with STAR and Salmon via nf-core rnaseq.
-
Does this happen with a single sample only or do all your samples have the same problem?
Does this problem occur about the same spot for this sample?
Do you have a chance to try with the latest GATK 4.4.0.0 or maybe with 4.3.0.0?
Can you run ValidateSamFile on your input file to see if there is a problematic read inside?
These are a few questions that may need answers before we can escalate the issue.
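For readers following along, a minimal sketch of the ValidateSamFile check could look like the one below. The wrapper name `validate_summary` and the paths in the usage comment are illustrative assumptions, not part of the original post. `--MODE SUMMARY` prints a histogram of error types instead of one line per read.

```shell
# Hypothetical helper: summarise validation errors for one BAM.
# Arguments: $1 = input BAM, $2 = reference FASTA (placeholder paths).
validate_summary() {
    gatk ValidateSamFile \
        --INPUT "$1" \
        --REFERENCE_SEQUENCE "$2" \
        --MODE SUMMARY
}

# Example call (paths are assumptions):
# validate_summary sample.markdup.sorted.bam Homo_sapiens_assembly38.fasta
```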
-
Hi Gökalp Çelik,
Thank you for getting back to me.
This happens with all of my samples (I have 128 in total).
It happens at a different point in each sample, and when they run simultaneously they all crash within a few minutes of each other.
Unfortunately, I am running this on a university HPC system, and GATK 4.2.5.0 is the most recent version available. Docker does not run on the HPC that I use. I can try to get a more recent version of GATK installed, but this will take some time (if the IT services are able to do so).
When I ran ValidateSamFile on one file I got the following output:
## HISTOGRAM java.lang.String
Error Type Count
ERROR:MATE_NOT_FOUND 26369
ERROR:MISSING_PLATFORM_VALUE 1
And on another sample using Verbose mode:
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:22304:15340:4254, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:23508:16716:16483, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:11210:12467:2937, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22309:26730:6264, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13508:24253:6148, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:23610:20394:2708, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21212:25018:8014, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13508:3006:9024, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:22410:19644:13259, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21109:25641:19234, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13505:15927:8285, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:11111:25353:6668, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22311:20392:3796, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23206:7131:12353, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12402:22874:9992, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:21410:6086:1282, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:21501:22893:17937, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13206:24314:1518, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:21406:22260:12625, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23305:13288:6784, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13603:11736:19940, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:23308:23824:5000, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12609:6100:4886, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12308:18108:6055, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13301:14800:8602, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23101:4012:5222, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12408:15048:7213, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12411:19103:16046, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13510:6968:9048, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:11304:26045:8278, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13512:26775:17793, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:12307:16138:16166, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21209:4945:14453, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13601:7408:8053, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13405:10458:9518, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21212:5240:18564, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21203:10002:4080, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13207:4299:10834, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21101:12810:11895, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13510:16820:18514, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12506:20194:13401, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12104:21067:19294, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23204:24637:18783, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:12109:11231:12255, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21306:1966:14153, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:22302:11626:19926, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:22505:1750:11158, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12210:20133:3566, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22206:8583:7130, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:21609:15323:15825, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:13306:1502:18576, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12509:2972:17204, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12506:17189:15002, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:22306:11294:4080, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12406:10850:8804, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12305:19524:2378, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12105:3869:12584, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21209:23880:16858, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:22502:21591:16790, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21107:9977:6671, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:11206:14649:15731, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23311:21725:8779, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:11203:14475:11878, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21201:1640:1460, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13411:19873:2097, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:13302:14343:9955, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:22505:10788:1842, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:23609:20026:19439, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13608:16087:14760, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:23607:10808:19178, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12506:24699:11315, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13611:5883:18276, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:11105:7442:10844, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13111:2500:7982, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23308:19684:9329, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21311:6407:4283, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:11102:7227:9642, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12608:1961:8305, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13209:15109:8345, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12405:10690:5275, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:12103:15722:3511, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23308:11818:8806, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:11511:18039:11013, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:22407:22620:7006, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:23307:19659:12035, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12202:16058:7069, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22310:20898:7140, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:12206:8303:14993, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12503:13268:18013, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22107:2326:2492, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:21401:14038:15575, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12410:11717:19455, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:22602:18184:11953, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22307:17486:6385, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21205:3881:14008, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12508:9904:20337, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23311:4296:5929, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:23408:11895:4853, Mate not found for paired read
ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:23201:15291:7308, Mate not found for paired read
Maximum output of [100] errors reached.
Any thoughts?
-
I have checked my own BAM files and I also see MATE_NOT_FOUND messages, so this is probably not the actual problem. Looking at the commits and merges in the GitHub repository, there is a commit from February 14th of this year that fixes unseen exceptions and unsafe closure of runners, and it seems to have made it into version 4.4.0.0. It may be necessary to test with this version to see whether the problem persists.
I also have a few more suggestions for you to check. You mentioned that your cluster has plenty of storage and that setting --tmp-dir did not help. However, since setting --tmp-dir to a larger volume solved this problem for other users in the past, it is possible that your cluster enforces user quotas that prevent users from using more than a certain amount of storage. If you are running multiple instances at the same time, try running them one by one to see whether the problem persists. When you try this, please set --tmp-dir to the largest volume you can use, preferably the location of your BAM files if you have read, write and execute permissions there. For any potential user quota issues, you may need to contact your local IT support.
I hope these will help solve your problem.
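As a rough sketch of the --tmp-dir suggestion above, the helpers below report the free space on a candidate temp directory and then pass it to SplitNCigarReads. The function names and argument order are assumptions made for illustration. Note that `df` reports filesystem free space only, not per-user quotas, so a quota could still apply even when this number looks large.

```shell
# Print the free space (in KB) on the filesystem holding directory $1.
tmp_free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical wrapper: run SplitNCigarReads with an explicit temp directory.
# Arguments: $1 = reference, $2 = input BAM, $3 = output BAM, $4 = temp dir.
split_with_tmp() {
    mkdir -p "$4" || return 1
    echo "Free space in $4: $(tmp_free_kb "$4") KB" >&2
    gatk SplitNCigarReads -R "$1" -I "$2" -O "$3" --tmp-dir "$4"
}
```

Running one sample at a time through `split_with_tmp` is one way to separate a storage or quota problem from a problem with concurrent jobs.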
-
Thank you for getting back to me. I agree; I don't think the input files are the issue.
I've tried running a single file with a specified large --tmp-dir but I'm having the same issue. I'm now running a single file with no specific space requirements and I'll let you know how it goes.
I've also asked for GATK v4.4.0.0 to be added to the HPC, and I'll give it a try with that when it goes live.
I'll keep you posted.
-
Our team also has a suggestion for you to try. Since your error message is thrown from the AsyncSAMFileWriter class, it may be worth disabling asynchronous writing. To do that, add these parameters to your GATK command line via the --java-options argument.
gatk --java-options "-Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_read_samtools=false"
Let us know how it goes if you have a chance to try this option.
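Put together with the tool arguments from the original post, the suggestion might be sketched as follows. The wrapper name `split_sync_io` and its positional arguments are hypothetical. The two `-Dsamjdk...` properties are the ones quoted above and must be passed as a single quoted string to `--java-options`.

```shell
# Hypothetical wrapper: SplitNCigarReads with htsjdk asynchronous I/O disabled.
# Arguments: $1 = reference, $2 = input BAM, $3 = output BAM.
split_sync_io() {
    gatk --java-options "-Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_read_samtools=false" \
        SplitNCigarReads -R "$1" -I "$2" -O "$3"
}
```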
-
Hi Gökalp Çelik
I'm still waiting to hear about the installation of GATK 4.4.0.0, but I've tried adding your options. It still isn't working, but I get a different error message:
19:14:21.776 INFO ProgressMeter - chr7:64992413 147.3 255623000 1735866.8
19:14:31.807 INFO ProgressMeter - chr7:67306017 147.4 255878000 1735628.0
19:14:42.455 INFO ProgressMeter - chr7:73041333 147.6 256225000 1735892.1
19:14:50.344 INFO SplitNCigarReads - Shutting down engine
[September 28, 2023 7:14:50 PM BST] org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads done. Elapsed time: 147.77 minutes.
Runtime.totalMemory()=5701632000
java.lang.NullPointerException
at htsjdk.samtools.SAMRecordCoordinateComparator.fileOrderCompare(SAMRecordCoordinateComparator.java:90)
at htsjdk.samtools.SAMRecordCoordinateComparator.compare(SAMRecordCoordinateComparator.java:48)
at htsjdk.samtools.SAMRecordCoordinateComparator.compare(SAMRecordCoordinateComparator.java:43)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
at java.util.TimSort.sort(TimSort.java:234)
at java.util.Arrays.parallelSort(Arrays.java:1174)
at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:247)
at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:182)
at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:187)
at org.broadinstitute.hellbender.utils.read.SAMFileGATKReadWriter.addRead(SAMFileGATKReadWriter.java:21)
at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.writeReads(OverhangFixingManager.java:358)
at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.flush(OverhangFixingManager.java:338)
at org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads.closeTool(SplitNCigarReads.java:192)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1091)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
-
A few things before we escalate this into a gatk issue.
1- What is the version of STAR that you are using?
2- Can you run ValidateSamFile with the -O option to write all the results to a file and send it our way? We may want to dig into the details.
3- Does this exception get thrown at the same spot for the same sample? Is it deterministic or does that happen at random spots for the same sample?
Thank you for bringing this to our attention. We are looking forward to your inputs on this matter.
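A sketch of request 2 is shown below, assuming a hypothetical wrapper named `validate_to_file`. `--OUTPUT` writes the report to a file. `--MAX_OUTPUT` raises the cap on listed errors, which otherwise stops at 100 as seen in the earlier verbose run.

```shell
# Hypothetical helper: write a verbose validation report to a file.
# Arguments: $1 = input BAM, $2 = report path, $3 = max errors to list (optional).
validate_to_file() {
    gatk ValidateSamFile \
        --INPUT "$1" \
        --OUTPUT "$2" \
        --MODE VERBOSE \
        --MAX_OUTPUT "${3:-100}"
}

# Example call (file names are assumptions):
# validate_to_file sample.markdup.sorted.bam sample.validate.txt 100000
```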
-
Hi Gökalp Çelik,
Thank you for your continued help with this. I'm still waiting to see if I can get access to a newer version of GATK, but in the meantime:
1. I'm using nfcore rnaseq version 3.12.0, which uses STAR version 2.6.1
2. I can generate that file for you today. What's the best way to send it to you?
3. As far as I can tell, the exception is thrown at the same region for the same sample, but I don't think it's the exact same spot:
One run:
18:00:07.091 INFO ProgressMeter - chr7:102354789 175.4 262234000 1495179.9
Second run (slightly altered parameters):
19:14:42.455 INFO ProgressMeter - chr7:73041333 147.6 256225000 1735892.1
I'm running the same sample with the exact same parameters as the second run, and I'll let you know if the output is identical.
-
I just checked, and it's not stopping at the exact same place:
Third run (same parameters as second run):
18:59:35.971 INFO ProgressMeter - chr7:72982428 97.4 256028000 2627549.7
18:59:45.925 INFO SplitNCigarReads - Shutting down engine
Also, I have the ValidateSamFile output ready to send whenever you're ready.
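To compare where separate runs stop, one option is to pull the last reported locus out of each log. This is only a sketch: `last_locus` is a hypothetical helper, and it assumes the log lines have the `time INFO ProgressMeter - locus ...` layout shown in this thread.

```shell
# Print the locus from the final ProgressMeter progress line of a GATK log ($1).
# Header lines such as "Starting traversal" are skipped because their fifth
# field does not look like contig:position.
last_locus() {
    awk '$3 == "ProgressMeter" && $5 ~ /:[0-9]+$/ { locus = $5 }
         END { if (locus != "") print locus }' "$1"
}

# Example (log file names are assumptions):
# for f in run2.log run3.log; do echo "$f: $(last_locus "$f")"; done
```

Because ProgressMeter only reports roughly every ten seconds, two runs that fail on the same read can still show different last loci.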
-
Hi again.
Can you upload your output file and, if possible, a BAM snippet covering 100-200 kb upstream and downstream of the problematic region to the FTP server mentioned in the article below?
https://gatk.broadinstitute.org/hc/en-us/articles/360035889671
You may perform any sort of anonymization such as removing sample names and other tags in the header.
The team wishes to replicate the error so that we can fix this issue quickly. Let us know when you are done with uploading and we will take a look at it at once.
Thank you very much for your inputs in this matter.
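One possible way to cut such a snippet is with samtools, sketched below. The helper names are hypothetical, and the approach assumes the input BAM is coordinate-sorted and indexed so that a region query works.

```shell
# Print a samtools-style region of +/- $3 bases around position $2 on contig $1,
# clamped so the start never drops below 1.
region_around() {
    start=$(( $2 - $3 ))
    if [ "$start" -lt 1 ]; then start=1; fi
    end=$(( $2 + $3 ))
    printf '%s:%s-%s\n' "$1" "$start" "$end"
}

# Hypothetical helper: extract the region into a new BAM and index it.
# Arguments: $1 = input BAM, $2 = region string, $3 = output BAM.
extract_snippet() {
    samtools view -b -o "$3" "$1" "$2" && samtools index "$3"
}

# Example (file names are assumptions):
# extract_snippet sample.bam "$(region_around chr7 72982428 200000)" snippet.bam
```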
-
Hi Gökalp Çelik,
I've uploaded the script, output file and one of the BAMs to the FTP server. Apologies, I couldn't get the snippet to work properly, so I've uploaded the whole BAM.
The file is GATK_error_info2.tar.gz (please ignore GATK_error_info.tar.gz if it is present on the ftp server - I don't think it uploaded properly).
-
Great, thank you for these inputs. The team will check them, and we will let you know when we find the root cause and a solution or a fix.
-
As we try to replicate the issue there is one more thing we would like to ask you to try.
Is it possible for you to try running the tool without the Intel-accelerated libraries by adding the parameters below?
--use-jdk-deflater true --use-jdk-inflater true
These will disable the Intel GKL library, so if the issue lies with those libraries we may be able to avoid it.
-
Hi again.
We have tried several ways to replicate the problem, but we are unable to: both versions of the tool complete without any issues and produce exactly the same output file. This suggests that the issue you are observing is related to your setup and configuration rather than to GATK itself. Some combination of those appears to limit the resources available to GATK, which results in prematurely closed writers and streams and, in turn, the highly unexpected NullPointerException.
In this regard, can you provide us with more details about the setup you use to execute your commands, such as
- the execution environment (shell, slurm, sge etc.)
- user limits such as the output of command ulimit -a
- amount of memory and disk space for analysis
- whether there are quotas for users to execute their tasks such as amount of disk space to be used by a single task, amount of memory to be mapped
- Operating system version and kernel versions
- whether GATK is installed from the official channel or from unofficial channels such as conda or built by the IT management.
If you can provide this information, we may be able to check whether anything limits the resources for GATK.
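Most of the items in the list above can be gathered in one pass. The function below is a sketch with a hypothetical name, `job_env_report`. It is meant to be called from inside a submitted job, since limits there may differ from those on a login node.

```shell
# Collect shell, kernel, limits, memory, disk and Java details in one report.
# Argument: $1 = directory used for analysis output or temp files.
job_env_report() {
    echo "== shell ==";     echo "${SHELL:-unknown}"
    echo "== kernel ==";    uname -sr
    echo "== limits ==";    ulimit -a
    echo "== memory ==";    free -m 2>/dev/null || echo "free not available"
    echo "== disk ($1) =="; df -Pk "$1"
    echo "== java =="
    if command -v java >/dev/null 2>&1; then
        java -version 2>&1
    else
        echo "java not found"
    fi
}

# Example (output path is an assumption):
# job_env_report "$PWD" > job_env.txt
```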
Also, as final suggestions, you may set the system property
--java-options "-DGATK_STACKTRACE_ON_USER_EXCEPTION=true"
to get a detailed exception in case the real problem is hidden by the NullPointerException in the resulting log. Lastly, can you also increase the -Xmx parameter to around 1 to 2 GB less than the memory allocated for the task, to see if it makes any difference?
Regards.
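The two suggestions above can be combined in one `--java-options` string. The sketch below uses hypothetical helper names and assumes the job's memory request is known in whole gigabytes. It leaves 2 GB of headroom by default, in line with the "1 to 2 GB less" guidance.

```shell
# Print a JVM heap size in GB that leaves $2 GB (default 2) of headroom
# below the $1 GB requested from the scheduler; never prints less than 1.
heap_gb() {
    heap=$(( $1 - ${2:-2} ))
    if [ "$heap" -lt 1 ]; then heap=1; fi
    echo "$heap"
}

# Hypothetical wrapper: SplitNCigarReads with an explicit heap and full stack traces.
# Arguments: $1 = job memory in GB, $2 = reference, $3 = input BAM, $4 = output BAM.
split_with_heap() {
    gatk --java-options "-Xmx$(heap_gb "$1")g -DGATK_STACKTRACE_ON_USER_EXCEPTION=true" \
        SplitNCigarReads -R "$2" -I "$3" -O "$4"
}
```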
-
Hi,
Sorry for the late reply, I was on call this weekend. I am trying your new commands, and I will let you know if they fix the issue.
I'll need to ask to find the answers about the HPC, but I know I'm using Myriad at UCL, which is an SGE-based system. GATK is provided as an installable module, which I think has been built by our IT department. I have just under 30TB of space available on my allocation, and the output of ulimit -a is:
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 770531
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 100000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
However, I'm not sure whether these are the limits for the login nodes or the compute nodes. I'll find out more details for you.
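One way to settle the login-node versus compute-node question is to record the limits from inside the job itself. The lines below are a sketch that could be placed at the top of a job script; the report file name is hypothetical, and TMPDIR is assumed to point at the job's temporary directory when it is set.

```shell
# Hypothetical report file, written to the job's working directory.
report="limits_report.txt"
{
  # Which machine the job actually ran on.
  echo "host: $(uname -n)"
  # Open-file limit as seen inside the job.
  echo "open files: $(ulimit -n)"
  # Temporary directory in use, falling back to /tmp if TMPDIR is unset.
  echo "tmp dir: ${TMPDIR:-/tmp}"
  # Free space on the file system that holds the temporary directory.
  df -P -k "${TMPDIR:-/tmp}"
} > "$report"
cat "$report"
```

Comparing this report with the same commands run on a login node shows whether the two environments differ.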
-
Our team has worked with your files and could not replicate your problem on our end. The team has a few more suggestions for you to try.
If you are allowed to download files to your space on the cluster, you can try downloading the latest GATK 4.4.0.0 from
https://github.com/broadinstitute/gatk/releases/download/4.4.0.0/gatk-4.4.0.0.zip
then extract the contents and run the gatk script directly from there to see if you still get errors like this one.
If not, you may need to contact your IT support to check your executable permissions and other possible cluster quota restrictions to get you going with your analyses.
I hope this helps.
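The download-and-run steps above could look roughly like this. It is a sketch that assumes wget and unzip are available on the cluster and that the archive unpacks into a directory named after the zip file; by default it only prints the commands (set DRY_RUN=0 to execute them).

```shell
GATK_URL="https://github.com/broadinstitute/gatk/releases/download/4.4.0.0/gatk-4.4.0.0.zip"
# Derive the archive name and the (assumed) extracted directory name from the URL.
GATK_ZIP="${GATK_URL##*/}"
GATK_DIR="${GATK_ZIP%.zip}"

# Print commands by default; execute them only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

run wget "$GATK_URL"
run unzip "$GATK_ZIP"
# Call the gatk launcher script directly from the extracted directory.
run "./$GATK_DIR/gatk" --version
```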
-
Hi,
Apologies for the late reply, I've just come back from an overseas conference and it turns out that the replies from the RC Support at my institution were going missing before I could receive them. GATK 4.4.0.0 has now been installed on my system, so I will try to run my script with the newer version and let you know if that solves the issue.
As for the answers to your questions (as per the lovely RC Support folk):
- execution environment: SGE
- The output from "ulimit -a" in the job environment shows:
  core file size (blocks, -c) unlimited
  data seg size (kbytes, -d) unlimited
  scheduling priority (-e) 0
  file size (blocks, -f) unlimited
  pending signals (-i) 770597
  max locked memory (kbytes, -l) unlimited
  max memory size (kbytes, -m) unlimited
  open files (-n) 1024
  pipe size (512 bytes, -p) 8
  POSIX message queues (bytes, -q) 819200
  real-time priority (-r) 0
  stack size (kbytes, -s) unlimited
  cpu time (seconds, -t) unlimited
  max user processes (-u) 770597
  virtual memory (kbytes, -v) unlimited
  file locks (-x) unlimited
- memory and disk space: up to 1.5TB of each on the node your job runs on, depending on what you request for the job
- same as above? Except also you'll have a limit on the amount of networked storage, which is usually 150GB for your home directory and 1TB for the Scratch area unless you've asked for more in the past
- Red Hat Linux 7.8, kernel 3.10.0-1127.el7.x86_64
- installed from GitHub official releases, extra tools installed from Conda; you can see the exact scripts we used here:
  4.4.0.0: https://github.com/UCL-RITS/rcps-buildscripts/blob/master/gatk-4.4.0.0_install
  4.2.5.0: https://github.com/UCL-RITS/rcps-buildscripts/blob/master/gatk-4.2.5.0_install
-
Hi,
Thank you for your help. I've tried running SplitNCigarReads with GATK 4.4.0.0 but I'm still having issues. I get the error:
htsjdk.samtools.util.RuntimeIOException: Problem writing temporary file file:///tmp/sortingcollection.8583155797493982206.tmp. Try setting TMP_DIR to a file system with lots of space.
  at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:260)
  at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:182)
  at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:202)
  at htsjdk.samtools.AsyncSAMFileWriter.synchronouslyWrite(AsyncSAMFileWriter.java:36)
  at htsjdk.samtools.AsyncSAMFileWriter.synchronouslyWrite(AsyncSAMFileWriter.java:16)
  at htsjdk.samtools.util.AbstractAsyncWriter$WriterRunnable.run(AbstractAsyncWriter.java:123)
  at java.base/java.lang.Thread.run(Thread.java:833)
  Suppressed: htsjdk.samtools.util.RuntimeIOException: Attempt to add record to closed writer.
    at htsjdk.samtools.util.AbstractAsyncWriter.write(AbstractAsyncWriter.java:57)
    at htsjdk.samtools.AsyncSAMFileWriter.addAlignment(AsyncSAMFileWriter.java:58)
    at org.broadinstitute.hellbender.utils.read.SAMFileGATKReadWriter.addRead(SAMFileGATKReadWriter.java:21)
    at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.writeReads(OverhangFixingManager.java:358)
    at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.flush(OverhangFixingManager.java:338)
    at org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads.closeTool(SplitNCigarReads.java:192)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1095)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:149)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:198)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:217)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: htsjdk.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
  at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:222)
  at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:212)
  at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:184)
  at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:40)
  at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:254)
  ... 6 more
  Suppressed: java.io.IOException: No space left on device
    at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
    at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:132)
    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:97)
    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:67)
    at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:288)
    at java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)
    at java.base/java.nio.channels.Channels.writeFully(Channels.java:96)
    at java.base/java.nio.channels.Channels$1.write(Channels.java:171)
    at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
    at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127)
    at org.xerial.snappy.SnappyOutputStream.dumpOutput(SnappyOutputStream.java:360)
    at org.xerial.snappy.SnappyOutputStream.compressInput(SnappyOutputStream.java:377)
    at org.xerial.snappy.SnappyOutputStream.flush(SnappyOutputStream.java:334)
    at org.xerial.snappy.SnappyOutputStream.close(SnappyOutputStream.java:419)
    at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:259)
    ... 6 more
Caused by: java.io.IOException: No space left on device
  at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
  at java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
  at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:132)
  at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:97)
  at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:67)
  at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:288)
  at java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)
  at java.base/java.nio.channels.Channels.writeFully(Channels.java:96)
  at java.base/java.nio.channels.Channels$1.write(Channels.java:171)
  at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
  at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127)
  at org.xerial.snappy.SnappyOutputStream.dumpOutput(SnappyOutputStream.java:360)
  at org.xerial.snappy.SnappyOutputStream.compressInput(SnappyOutputStream.java:377)
  at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:130)
  at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
  ... 10 more
Unfortunately, this issue persists even when setting tempfs to 300GB in size, or setting TMP_DIR to part of my Scratch (which has nearly 30TB of free space). I'll pass it onto my local team to see if they have any ideas, but do you have any thoughts as to anything else I could try?
Warm regards,
Valerie
-
This looks like a compute system issue rather than a problem with GATK itself, so our obvious suggestion would be to use a different system to get your results.
I hope this helps.
-
Hi,
Sorry it's been so long, but my local HPC has been down for some time.
For future reference, I managed to solve the issue: it turns out that I had to manually set --tmp-dir as part of the SplitNCigarReads command for it to work properly.
Thank you for all of your help!
Valerie
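For anyone hitting the same "No space left on device" error, a sketch of what passing --tmp-dir might look like. The temporary directory location, sample name and file names are placeholders, not the values used in this thread; the command is printed rather than executed.

```shell
# Hypothetical temporary directory; point this at a file system with plenty of free space.
SCRATCH_TMP="${SCRATCH_TMP:-$PWD/gatk_tmp}"
# Placeholder sample name, for illustration only.
SAMPLE="${SAMPLE:-sample1}"
mkdir -p "$SCRATCH_TMP"

# Free space (in KB) on the file system holding the temporary directory.
free_kb=$(df -P -k "$SCRATCH_TMP" | awk 'NR==2 {print $4}')
echo "Free space for temporary files: ${free_kb} KB"

# Build the command with an explicit --tmp-dir so spill files do not go to /tmp.
set -- gatk SplitNCigarReads \
  -R reference.fasta \
  -I "${SAMPLE}.markdup.sorted.bam" \
  -O "${SAMPLE}_split.bam" \
  --tmp-dir "$SCRATCH_TMP"

# Print the command for inspection; replace this line with "$@" to run it.
printf '%s\n' "$*"
```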