Error in running SplitNCigarReads - Attempt to add record to closed writer

20 comments

  • Gökalp Çelik

    Hi Valerie Crolley

    Does this happen with a single sample only or do all your samples have the same problem? 

    Does this problem occur at about the same spot for this sample?

    Do you have a chance to try with the latest GATK 4.4.0.0 or maybe with 4.3.0.0?

    Can you run ValidateSamFile on your input file to see if there is a problematic read inside? 

    These are a few questions that may need answers before we can escalate the issue.
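
    For the last question, a minimal ValidateSamFile run could look something like this (file names are placeholders, so please adjust them to your data):

    gatk ValidateSamFile \
        -I your_sample.bam \
        -MODE SUMMARY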

     

     

  • Valerie Crolley

    Hi Gökalp Çelik,

    Thank you for getting back to me.

    This happens with all of my samples (I have 128 in total). 

    It happens at a different point in each sample, and when I run them all simultaneously they crash a few minutes apart.

    Unfortunately, I am running this on a university HPC system, and GATK 4.2.5.0 is the most recent version available. Docker does not run on the HPC that I use. I can try to get a more recent version of GATK installed, but this will take some time (if the IT services are able to do so).

    When I ran ValidateSamFile on one file I got the following output:

    ## HISTOGRAM    java.lang.String
    Error Type      Count
    ERROR:MATE_NOT_FOUND    26369
    ERROR:MISSING_PLATFORM_VALUE    1

    And on another sample using Verbose mode:

    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:22304:15340:4254, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:23508:16716:16483, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:11210:12467:2937, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22309:26730:6264, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13508:24253:6148, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:23610:20394:2708, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21212:25018:8014, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13508:3006:9024, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:22410:19644:13259, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21109:25641:19234, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13505:15927:8285, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:11111:25353:6668, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22311:20392:3796, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23206:7131:12353, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12402:22874:9992, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:21410:6086:1282, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:21501:22893:17937, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13206:24314:1518, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:21406:22260:12625, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23305:13288:6784, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13603:11736:19940, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:23308:23824:5000, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12609:6100:4886, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12308:18108:6055, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13301:14800:8602, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23101:4012:5222, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12408:15048:7213, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12411:19103:16046, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13510:6968:9048, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:11304:26045:8278, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13512:26775:17793, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:12307:16138:16166, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21209:4945:14453, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13601:7408:8053, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13405:10458:9518, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21212:5240:18564, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21203:10002:4080, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13207:4299:10834, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21101:12810:11895, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13510:16820:18514, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12506:20194:13401, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12104:21067:19294, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23204:24637:18783, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:12109:11231:12255, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21306:1966:14153, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:22302:11626:19926, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:22505:1750:11158, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12210:20133:3566, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22206:8583:7130, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:21609:15323:15825, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:13306:1502:18576, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12509:2972:17204, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:12506:17189:15002, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:22306:11294:4080, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12406:10850:8804, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12305:19524:2378, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12105:3869:12584, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21209:23880:16858, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:22502:21591:16790, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21107:9977:6671, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:11206:14649:15731, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23311:21725:8779, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:11203:14475:11878, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:21201:1640:1460, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:13411:19873:2097, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:13302:14343:9955, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:22505:10788:1842, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:23609:20026:19439, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13608:16087:14760, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:23607:10808:19178, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12506:24699:11315, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:13611:5883:18276, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:11105:7442:10844, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13111:2500:7982, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23308:19684:9329, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21311:6407:4283, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:11102:7227:9642, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12608:1961:8305, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:13209:15109:8345, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12405:10690:5275, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:12103:15722:3511, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23308:11818:8806, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:11511:18039:11013, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:22407:22620:7006, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:23307:19659:12035, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:12202:16058:7069, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22310:20898:7140, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:12206:8303:14993, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12503:13268:18013, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22107:2326:2492, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:21401:14038:15575, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12410:11717:19455, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:22602:18184:11953, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:22307:17486:6385, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:21205:3881:14008, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:4:12508:9904:20337, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:1:23311:4296:5929, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:3:23408:11895:4853, Mate not found for paired read
    ERROR::MATE_NOT_FOUND:Read name NB551256:42:HFJ3GBGX9:2:23201:15291:7308, Mate not found for paired read
    Maximum output of [100] errors reached.

     

    Any thoughts?

  • Gökalp Çelik

    Hi Valerie Crolley

    I have checked my own BAM files and they also produce mate-not-found messages, so this is probably not the actual problem. Looking at the commits and merges in the GitHub repository, there is a commit from February 14th of this year that fixes unseen exceptions and unsafe closure of runners, and it appears to have made it into version 4.4.0.0. It may be necessary to test with that version to see if the problem persists.

    I also have a few more suggestions for you to check. You mentioned that your cluster has plenty of storage and that setting --tmp-dir did not help, but since this problem has been solved for other users in the past by pointing --tmp-dir to a larger volume, it is possible that your cluster has per-user quota restrictions that prevent you from using more than a certain amount of storage. If you are running multiple instances at the same time, it may be worth running them one by one to see if the problem still persists. When you try this, please set --tmp-dir to the largest volume you can use, preferably the location where your BAM files are stored, if you have read/write/execute permissions there. For any potential user quota issues you may need to contact your local IT support.
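
    For example, a sketch of a SplitNCigarReads command with --tmp-dir pointed at a large volume might look like this (reference, input and output paths are placeholders):

    gatk SplitNCigarReads \
        -R reference.fasta \
        -I sample.bam \
        -O sample.split.bam \
        --tmp-dir /path/to/large/volume/tmp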

    I hope these will help solve your problem.  

  • Valerie Crolley

    Hi Gökalp Çelik,

    Thank you for getting back to me. I agree; I don't think the input files are the issue.

    I've tried running a single file with a specified large --tmp-dir but I'm having the same issue. I'm now running a single file with no specific space requirements and I'll let you know how it goes.

    I've also asked for GATK v4.4.0.0 to be added to the HPC, and I'll give it a try with that when it goes live.

    I'll keep you posted.

  • Gökalp Çelik

    Hi Valerie Crolley

    Our team also has a suggestion for you to try. Since your error message is thrown from our AsyncSAMFileWriter class, it may be worth trying to disable asynchronous writing. To do that, add these properties to your GATK command line via the --java-options flag:

    gatk --java-options "-Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_read_samtools=false"

    Let us know how it goes if you have a chance to try this option. 

  • Valerie Crolley

    Hi Gökalp Çelik

    I'm still waiting to hear about the installation of GATK 4.4.0.0, but I've tried adding your options. It still isn't working, but I now get a different error message:

    19:14:21.776 INFO  ProgressMeter -        chr7:64992413            147.3             255623000        1735866.8
    19:14:31.807 INFO  ProgressMeter -        chr7:67306017            147.4             255878000        1735628.0
    19:14:42.455 INFO  ProgressMeter -        chr7:73041333            147.6             256225000        1735892.1
    19:14:50.344 INFO  SplitNCigarReads - Shutting down engine
    [September 28, 2023 7:14:50 PM BST] org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads done. Elapsed time: 147.77 minutes.
    Runtime.totalMemory()=5701632000
    java.lang.NullPointerException
            at htsjdk.samtools.SAMRecordCoordinateComparator.fileOrderCompare(SAMRecordCoordinateComparator.java:90)
            at htsjdk.samtools.SAMRecordCoordinateComparator.compare(SAMRecordCoordinateComparator.java:48)
            at htsjdk.samtools.SAMRecordCoordinateComparator.compare(SAMRecordCoordinateComparator.java:43)
            at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
            at java.util.TimSort.sort(TimSort.java:234)
            at java.util.Arrays.parallelSort(Arrays.java:1174)
            at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:247)
            at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:182)
            at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:187)
            at org.broadinstitute.hellbender.utils.read.SAMFileGATKReadWriter.addRead(SAMFileGATKReadWriter.java:21)
            at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.writeReads(OverhangFixingManager.java:358)
            at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.flush(OverhangFixingManager.java:338)
            at org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads.closeTool(SplitNCigarReads.java:192)
            at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1091)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
            at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
            at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
            at org.broadinstitute.hellbender.Main.main(Main.java:289)

  • Gökalp Çelik

    Hi Valerie Crolley

    A few things before we escalate this as a GATK issue.

    1- What is the version of STAR that you are using? 

    2- Can you run the ValidateSamFile tool with the -O option to write the full report to a file and send it our way (an example command follows below)? We may want to dig into the details.

    3- Does this exception get thrown at the same spot for the same sample? Is it deterministic, or does it happen at random spots for the same sample?
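
    For item 2, a command along these lines should write the full report to a file (file names are placeholders):

    gatk ValidateSamFile \
        -I your_sample.bam \
        -MODE VERBOSE \
        -O your_sample.validation.txt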

    Thank you for bringing this to our attention. We look forward to your input on this matter.

  • Valerie Crolley

    Hi Gökalp Çelik,

    Thank you for your continued help with this. I'm still waiting to see if I can get access to a newer version of GATK, but in the meantime:

    1. I'm using nfcore rnaseq version 3.12.0, which uses STAR version 2.6.1

    2. I can generate that file for you today. What's the best way to send it to you?

    3. As far as I can tell, the exception is thrown at the same region for the same sample, but I don't think it's the exact same spot:

    One run:

    18:00:07.091 INFO  ProgressMeter -       chr7:102354789            175.4             262234000        1495179.9

    Second run (slightly altered parameters):

    19:14:42.455 INFO  ProgressMeter -        chr7:73041333            147.6             256225000        1735892.1

    I'm running the same sample with the exact same parameters as the second run, and I'll let you know if the output is identical.

  • Valerie Crolley

    I just checked, and it's not stopping at the exact same place:

    Third run (same parameters as the second run):


    18:59:35.971 INFO  ProgressMeter -        chr7:72982428             97.4             256028000        2627549.7
    18:59:45.925 INFO  SplitNCigarReads - Shutting down engine

     

    Also, I have the ValidateSamFile output ready to send whenever you're ready.

  • Gökalp Çelik

    Hi again. 

    Can you upload your output file, and also a BAM snippet of roughly 100K-200K upstream and downstream of the problematic region, to the FTP server mentioned in the article below?

    https://gatk.broadinstitute.org/hc/en-us/articles/360035889671 

    You may perform any sort of anonymization such as removing sample names and other tags in the header. 
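
    If it helps, one way to cut such a snippet is with samtools, something like the following (the region is only illustrative, based on roughly where your runs stopped; adjust it and the file names to your data):

    # requires an index: run "samtools index sample.bam" first if sample.bam.bai is missing
    samtools view -b sample.bam "chr7:72900000-73250000" > sample.chr7_snippet.bam
    samtools index sample.chr7_snippet.bam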

    The team wishes to replicate the error so that we can fix this issue quickly. Let us know when you are done uploading and we will take a look right away.

    Thank you very much for your input on this matter.

  • Valerie Crolley

    Hi Gökalp Çelik,

    I've uploaded the script, output file and one of the BAMs to the FTP server. Apologies, I couldn't get the snippet to work properly, so I've uploaded the whole BAM.

    The file is GATK_error_info2.tar.gz (please ignore GATK_error_info.tar.gz if it is present on the ftp server - I don't think it uploaded properly).

  • Gökalp Çelik

    Great, thank you for these. The team will check them and we will let you know when we find the root cause and a solution or a fix.

  • Gökalp Çelik

    Hi Valerie Crolley

    While we try to replicate the issue, there is one more thing we would like to ask you to try.

    Is it possible for you to try running the tool without the Intel-accelerated libraries by adding the parameters below?

    --use-jdk-deflater true --use-jdk-inflater true

    These will disable the Intel GKL library, so if there is an issue with those libraries we may be able to avoid it.
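
    As a sketch, with placeholder paths, the full command would look like this:

    gatk SplitNCigarReads \
        -R reference.fasta \
        -I sample.bam \
        -O sample.split.bam \
        --use-jdk-deflater true \
        --use-jdk-inflater true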

  • Gökalp Çelik

    Hi again.

    We have tried several ways to replicate the problem, but we are unable to reproduce it: both versions of the tool complete without any issues and produce exactly the same output file. This brings us to the conclusion that the issue you are observing is likely related to your setup and configuration rather than to GATK itself. Some combination of settings seems to be limiting resource management for GATK, which results in prematurely closed writers and streams and, in turn, a NullPointerException of a highly unexpected nature.

    In this regard, can you provide us with more details about the setup you use to execute your commands, such as:

    - the execution environment (shell, slurm, sge etc.)

    - user limits such as the output of command ulimit -a

    - amount of memory and disk space for analysis

    - whether there are per-user quotas on tasks, such as the amount of disk space a single task may use or the amount of memory that may be mapped

    - Operating system version and kernel versions

    - whether GATK was installed from the official channel, from unofficial channels such as conda, or built by your IT staff

    If you can provide this information, we may be able to check whether anything is limiting the resources available to GATK.

    Also, as a final suggestion, you may set the system property

    --java-options "-DGATK_STACKTRACE_ON_USER_EXCEPTION=true"

    to get a detailed exception in the resulting log, in case the real problem is somewhat hidden by the NullPointerException. And lastly, can you also increase the -Xmx parameter to around 1 to 2 GB less than the memory allocated for the task, to see if it makes any difference?
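
    Putting both together, a sketch of the command (the -Xmx value and the paths are placeholders; pick -Xmx about 1-2 GB below the memory you request for the job):

    gatk --java-options "-Xmx30g -DGATK_STACKTRACE_ON_USER_EXCEPTION=true" \
        SplitNCigarReads \
        -R reference.fasta \
        -I sample.bam \
        -O sample.split.bam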

    Regards. 

  • Valerie Crolley

    Hi,

    Sorry for the late reply, I was on call this weekend. I am trying your new commands, and I will let you know if they fix the issue. 

    I'll need to ask to find the answers about the HPC, but I know I'm using Myriad at UCL, which is an SGE-based system. GATK is provided as an installable module, which I think was built by our IT department. I have just under 30TB of space available on my allocation, and the output of ulimit -a is:

    ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 770531
    max locked memory       (kbytes, -l) unlimited
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 100000
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 4096
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

    Although I'm not sure if these are the limits for the login nodes or the compute nodes. I'll find out more details for you.

  • Gökalp Çelik

    Hi Valerie Crolley

    Our team has worked with your files and we are certain that we cannot replicate your problem on our end. The team has a few more suggestions for you to try.

    If you are allowed to download files to your space on the cluster, you may try downloading the latest GATK 4.4.0.0 from

    https://github.com/broadinstitute/gatk/releases/download/4.4.0.0/gatk-4.4.0.0.zip 

    extract the contents, and try running the gatk script directly from there to see if you still get errors like this one (see the sketch below).
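
    Roughly along these lines (run from a directory in your own space):

    wget https://github.com/broadinstitute/gatk/releases/download/4.4.0.0/gatk-4.4.0.0.zip
    unzip gatk-4.4.0.0.zip
    ./gatk-4.4.0.0/gatk --version
    # then invoke SplitNCigarReads through ./gatk-4.4.0.0/gatk instead of the module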

    If not, you may need to contact your IT support to check your execution permissions and other possible cluster quota restrictions to get your analyses going.

    I hope this helps. 

  • Valerie Crolley

    Hi, 

    Apologies for the late reply, I've just come back from an overseas conference and it turns out that the replies from the RC Support at my institution were going missing before I could receive them. GATK 4.4.0.0 has now been installed on my system, so I will try to run my script with the newer version and let you know if that solves the issue.

    As for the answers to your questions (as per the lovely RC Support folk):

    - execution environment: SGE
    - memory and disk space: up to 1.5TB of each on the node your job runs on, depending on what you request for the job
    - same as above? Except you'll also have a limit on the amount of networked storage, which is usually 150GB for your home directory and 1TB for the Scratch area unless you've asked for more in the past
    - Red Hat Linux 7.8, kernel 3.10.0-1127.el7.x86_64
    - installed from GitHub official releases, extra tools installed from Conda -- you can see the exact scripts we used here:
      4.4.0.0: https://github.com/UCL-RITS/rcps-buildscripts/blob/master/gatk-4.4.0.0_install
      4.2.5.0: https://github.com/UCL-RITS/rcps-buildscripts/blob/master/gatk-4.2.5.0_install

    The output from "ulimit -a" in the job environment shows:

    core file size          (blocks, -c) unlimited
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 770597
    max locked memory       (kbytes, -l) unlimited
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) unlimited
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 770597
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

     

  • Valerie Crolley

    Hi,

    Thank you for your help. I've tried running SplitNCigarReads with GATK 4.4.0.0 but I'm still having issues. I get the error:
     
    htsjdk.samtools.util.RuntimeIOException: Problem writing temporary file file:///tmp/sortingcollection.8583155797493982206.tmp.  Try setting TMP_DIR to a file system with lots of space.
            at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:260)
            at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:182)
            at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:202)
            at htsjdk.samtools.AsyncSAMFileWriter.synchronouslyWrite(AsyncSAMFileWriter.java:36)
            at htsjdk.samtools.AsyncSAMFileWriter.synchronouslyWrite(AsyncSAMFileWriter.java:16)
            at htsjdk.samtools.util.AbstractAsyncWriter$WriterRunnable.run(AbstractAsyncWriter.java:123)
            at java.base/java.lang.Thread.run(Thread.java:833)
            Suppressed: htsjdk.samtools.util.RuntimeIOException: Attempt to add record to closed writer.
                    at htsjdk.samtools.util.AbstractAsyncWriter.write(AbstractAsyncWriter.java:57)
                    at htsjdk.samtools.AsyncSAMFileWriter.addAlignment(AsyncSAMFileWriter.java:58)
                    at org.broadinstitute.hellbender.utils.read.SAMFileGATKReadWriter.addRead(SAMFileGATKReadWriter.java:21)
                    at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.writeReads(OverhangFixingManager.java:358)
                    at org.broadinstitute.hellbender.tools.walkers.rnaseq.OverhangFixingManager.flush(OverhangFixingManager.java:338)
                    at org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads.closeTool(SplitNCigarReads.java:192)
                    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1095)
                    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:149)
                    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:198)
                    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:217)
                    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
                    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
                    at org.broadinstitute.hellbender.Main.main(Main.java:289)
    Caused by: htsjdk.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
            at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:222)
            at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:212)
            at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:184)
            at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:40)
            at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:254)
            ... 6 more
            Suppressed: java.io.IOException: No space left on device
                    at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
                    at java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
                    at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:132)
                    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:97)
                    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:67)
                    at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:288)
                    at java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)
                    at java.base/java.nio.channels.Channels.writeFully(Channels.java:96)
                    at java.base/java.nio.channels.Channels$1.write(Channels.java:171)
                    at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
                    at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127)
                    at org.xerial.snappy.SnappyOutputStream.dumpOutput(SnappyOutputStream.java:360)
                    at org.xerial.snappy.SnappyOutputStream.compressInput(SnappyOutputStream.java:377)
                    at org.xerial.snappy.SnappyOutputStream.flush(SnappyOutputStream.java:334)
                    at org.xerial.snappy.SnappyOutputStream.close(SnappyOutputStream.java:419)
                    at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:259)
                    ... 6 more
    Caused by: java.io.IOException: No space left on device
            at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
            at java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
            at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:132)
            at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:97)
            at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:67)
            at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:288)
            at java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)
            at java.base/java.nio.channels.Channels.writeFully(Channels.java:96)
            at java.base/java.nio.channels.Channels$1.write(Channels.java:171)
            at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
            at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127)
            at org.xerial.snappy.SnappyOutputStream.dumpOutput(SnappyOutputStream.java:360)
            at org.xerial.snappy.SnappyOutputStream.compressInput(SnappyOutputStream.java:377)
            at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:130)
            at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
            ... 10 more
     
    Unfortunately, this issue persists even when setting tmpfs to 300GB in size, or setting TMP_DIR to part of my Scratch (which has nearly 30TB of free space). I'll pass it on to my local team to see if they have any ideas, but do you have any thoughts on anything else I could try?
     
    Warm regards,
     
    Valerie
     
  • Gökalp Çelik

    Hi Valerie Crolley

    This seems like a compute system issue rather than a GATK issue, so our obvious suggestion would be to use a different system to get your results.

    I hope this helps. 

  • Valerie Crolley

    Hi,

    Sorry it's been so long, but my local HPC has been down for some time.

    For future reference, I managed to solve the issue - it turns out that I had to manually set --tmp-dir as part of the SplitNCigarReads command for it to work properly (roughly as in the sketch below).
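
    For anyone finding this later, the working command looked roughly like this (paths are placeholders; the tmp directory must already exist on a volume with plenty of free space):

    mkdir -p /scratch/myuser/gatk_tmp
    gatk SplitNCigarReads \
        -R reference.fasta \
        -I sample.bam \
        -O sample.split.bam \
        --tmp-dir /scratch/myuser/gatk_tmp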

    Thank you for all of your help!

    Valerie
