PlotModeledSegments: java.lang.IllegalArgumentException: Records were not strictly sorted in dictionary order.
I had split the interval file, and in the last step I combined all the split files to run this command. I have checked that all my files are sorted according to the .dict file. Can someone please help? Also, exactly which file is this error message pointing to, and how can I solve this error?
Thanks a lot.
REQUIRED for all errors and issues:
a) GATK version used:
gatk --version
Using GATK jar /miniforge3/envs/gatk4.6.1.0/share/gatk4-4.6.1.0-0/gatk-package-4.6.1.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /miniforge3/envs/gatk4.6.1.0/share/gatk4-4.6.1.0-0/gatk-package-4.6.1.0-local.jar --version
The Genome Analysis Toolkit (GATK) v4.6.1.0
HTSJDK Version: 4.1.3
Picard Version: 3.3.0
b) Exact command used:
gatk PlotModeledSegments --segments /CopyRatio/SRRXXXXXXX.modelFinal.seg --denoised-copy-ratios /hdf5/SRRXXXXXXX_denoised_CR.tsv --allelic-counts /CopyRatio/SRRXXXXXXX.hets.tsv --sequence-dictionary /GATK4/Homo_sapiens/Homo_sapiens_assembly38.dict --output PlotDenoizedCR/ --output-prefix SRRXXXXXXX
c) Entire program log:
Using GATK jar /miniforge3/envs/gatk4.6.1.0/share/gatk4-4.6.1.0-0/gatk-package-4.6.1.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /miniforge3/envs/gatk4.6.1.0/share/gatk4-4.6.1.0-0/gatk-package-4.6.1.0-local.jar PlotModeledSegments --segments /CopyRatio/SRRXXXXXXX.modelFinal.seg --denoised-copy-ratios /hdf5/SRRXXXXXXX_denoised_CR.tsv --allelic-counts /CopyRatio/SRRXXXXXXX.hets.tsv --sequence-dictionary /GATK4/Homo_sapiens/Homo_sapiens_assembly38.dict --output PlotDenoizedCR/ --output-prefix SRRXXXXXXX
06:19:38.326 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/miniforge3/envs/gatk4.6.1.0/share/gatk4-4.6.1.0-0/gatk-package-4.6.1.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
SLF4J(W): Class path contains multiple SLF4J providers.
SLF4J(W): Found provider [org.apache.logging.slf4j.SLF4JServiceProvider@5dbab232]
SLF4J(W): Found provider [ch.qos.logback.classic.spi.LogbackServiceProvider@5939e24]
SLF4J(W): See https://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J(I): Actual provider is of type [org.apache.logging.slf4j.SLF4JServiceProvider@5dbab232]
06:19:38.547 INFO PlotModeledSegments - ------------------------------------------------------------
06:19:38.550 INFO PlotModeledSegments - The Genome Analysis Toolkit (GATK) v4.6.1.0
06:19:38.550 INFO PlotModeledSegments - For support and documentation go to https://software.broadinstitute.org/gatk/
06:19:38.550 INFO PlotModeledSegments - Executing as user@terminal2 on Linux v4.18.0-425.3.1.el8.x86_64 amd64
06:19:38.551 INFO PlotModeledSegments - Java runtime: OpenJDK 64-Bit Server VM v17.0.10-internal+0-adhoc..src
06:19:38.551 INFO PlotModeledSegments - Start Date/Time: January 14, 2025 at 6:19:38 AM CST
06:19:38.551 INFO PlotModeledSegments - ------------------------------------------------------------
06:19:38.551 INFO PlotModeledSegments - ------------------------------------------------------------
06:19:38.554 INFO PlotModeledSegments - HTSJDK Version: 4.1.3
06:19:38.554 INFO PlotModeledSegments - Picard Version: 3.3.0
06:19:38.555 INFO PlotModeledSegments - Built for Spark Version: 3.5.0
06:19:38.558 INFO PlotModeledSegments - HTSJDK Defaults.COMPRESSION_LEVEL : 2
06:19:38.559 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
06:19:38.559 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
06:19:38.559 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
06:19:38.560 INFO PlotModeledSegments - Deflater: IntelDeflater
06:19:38.560 INFO PlotModeledSegments - Inflater: IntelInflater
06:19:38.560 INFO PlotModeledSegments - GCS max retries/reopens: 20
06:19:38.560 INFO PlotModeledSegments - Requester pays: disabled
06:19:38.561 INFO PlotModeledSegments - Initializing engine
06:19:38.561 INFO PlotModeledSegments - Done initializing engine
06:19:38.566 INFO PlotModeledSegments - Reading and validating input files...
06:19:45.999 INFO PlotModeledSegments - Shutting down engine
[January 14, 2025 at 6:19:45 AM CST] org.broadinstitute.hellbender.tools.copynumber.plotting.PlotModeledSegments done. Elapsed time: 0.13 minutes.
Runtime.totalMemory()=2076049408
java.lang.IllegalArgumentException: Records were not strictly sorted in dictionary order.
at org.broadinstitute.hellbender.tools.copynumber.arguments.CopyNumberArgumentValidationUtils.validateIntervals(CopyNumberArgumentValidationUtils.java:74)
at org.broadinstitute.hellbender.tools.copynumber.formats.collections.AbstractLocatableCollection.<init>(AbstractLocatableCollection.java:59)
at org.broadinstitute.hellbender.tools.copynumber.formats.collections.AbstractSampleLocatableCollection.<init>(AbstractSampleLocatableCollection.java:44)
at org.broadinstitute.hellbender.tools.copynumber.formats.collections.ModeledSegmentCollection.<init>(ModeledSegmentCollection.java:72)
at org.broadinstitute.hellbender.tools.copynumber.plotting.PlotModeledSegments.doWork(PlotModeledSegments.java:202)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:150)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:203)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:222)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:166)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:209)
at org.broadinstitute.hellbender.Main.main(Main.java:306)
-
Hi Beetle
Any of those input files may be out of order. During the combine stage, some contigs may have been sorted lexicographically (for example, chr10 before chr2). The contig order must instead match the order in the sequence dictionary exactly.
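To see which input breaks the ordering, each file can be checked against the dictionary before rerunning the tool. The following is a minimal sketch, not GATK's own validation: the function names are hypothetical, and it assumes each input is a tab-separated file with optional SAM-style `@` header lines, then a column header containing `CONTIG` and either `START` or `POSITION`. It compares only the contig rank and the start coordinate, which is a simplification.

```python
def read_dict_order(dict_path):
    """Map contig name -> rank, taken from the @SQ lines of a .dict file."""
    order = {}
    with open(dict_path) as handle:
        for line in handle:
            if not line.startswith("@SQ"):
                continue
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    order[field[3:]] = len(order)
    return order


def first_unsorted_record(tsv_path, order):
    """Return (line_number, reason) for the first offending record, or None.

    Assumption: the first non-@ line is the column header and it names a
    CONTIG column plus a START (or POSITION) column.
    """
    previous = None
    contig_col = start_col = None
    with open(tsv_path) as handle:
        for line_number, line in enumerate(handle, start=1):
            if line.startswith("@") or not line.strip():
                continue  # skip SAM-style header lines and blank lines
            fields = line.rstrip("\n").split("\t")
            if contig_col is None:
                contig_col = fields.index("CONTIG")
                start_name = "START" if "START" in fields else "POSITION"
                start_col = fields.index(start_name)
                continue
            contig = fields[contig_col]
            if contig not in order:
                return line_number, f"contig {contig!r} is not in the dictionary"
            key = (order[contig], int(fields[start_col]))
            # "Strictly sorted" means each record must come after the previous one.
            if previous is not None and key <= previous:
                return line_number, f"record {contig}:{fields[start_col]} is out of order"
            previous = key
    return None
```

Calling `first_unsorted_record(path, read_dict_order(dict_path))` on the segments, denoised copy ratio, and allelic counts files in turn reports the first line in each file that this check would flag, or `None` if nothing is found.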
Since we don't have access to your files we cannot recreate this problem in our hands.
If you think this is a bug, we recommend following the steps in the article below to submit a detailed bug report, along with data that we can use to check and try to recreate the bug.
https://gatk.broadinstitute.org/hc/en-us/articles/360035889671-How-do-I-submit-a-detailed-bug-report
I hope this helps.
Regards.
-
Hi Gökalp Çelik
Thanks a lot for your reply. I have uploaded all the input files. I think they are in dictionary order. Can you please take a look? Thanks a lot for your help. Download files here:
-
Hi Beetle
Can you upload these files to the FTP server that we provide in the article? For security reasons, we are unable to access user-defined download locations.
-
Hi Gökalp Çelik,
I was able to upload the files following the instructions here: https://gatk.broadinstitute.org/hc/en-us/articles/360035889671-How-do-I-submit-a-detailed-bug-report#article-comments
Please check the file name: input_files.tar.gz
-
Hi again.
Your denoised copy ratio file contains entries from contigs outside the main chromosomes. All the other files end at chrY, but this file continues past it into additional contigs. Can you remove those entries and try again to see if the issue persists?
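One way to remove those entries is sketched below. It is an illustration only, not an official GATK utility: the function name and the `MAIN_CONTIGS` list are hypothetical, and it assumes hg38-style names (chr1 to chr22, chrX, chrY) and a tab-separated file whose first non-`@` line is a column header containing `CONTIG`. The `@` header lines are copied through unchanged.

```python
# Assumption: hg38-style primary contig names, matching a file that ends at chrY.
MAIN_CONTIGS = [f"chr{n}" for n in range(1, 23)] + ["chrX", "chrY"]


def filter_to_main_contigs(in_path, out_path, keep=frozenset(MAIN_CONTIGS)):
    """Copy in_path to out_path, dropping records whose contig is not in keep.

    Returns (kept, dropped) record counts. Header lines starting with '@'
    and the column header line are written through as-is.
    """
    kept = dropped = 0
    contig_col = None
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith("@"):
                dst.write(line)  # SAM-style header line
                continue
            fields = line.rstrip("\n").split("\t")
            if contig_col is None:
                contig_col = fields.index("CONTIG")  # column header line
                dst.write(line)
                continue
            if fields[contig_col] in keep:
                dst.write(line)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Writing to a new output path rather than overwriting the original keeps the unfiltered file available for comparison. The returned counts give a quick check of how many records fell outside the main chromosomes.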
Regards.
-
Thanks, you were right. There was a problem with my denoised file, and I was able to fix it. Thanks.