GATK 4.5 MarkDuplicatesSpark Spark error
REQUIRED for all errors and issues:
a) GATK version used: 4.5
b) Exact command used: gatk MarkDuplicatesSpark -I /data/BI_data/nbs/aligned_reads/SRR062634.paired.sam -O /data/BI_data/nbs/aligned_reads/SRR062634_sorted_dedup_reads.bam --java-options "-Djava.io.tmpdir=/data/BI_data/nbs/tmp"
c) Entire program log: 16:01:24.392 WARN ShutdownHookManager - ShutdownHook '' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException: null
at java.util.concurrent.FutureTask.get(FutureTask.java:204) ~[?:?]
at org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124) [gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95) [gatk-package-4.5.0.0-local.jar:4.5.0.0]
16:01:24.392 WARN JavaUtils - Attempt to delete using native Unix OS command failed for path = /data/BI_data/nbs/tmp/blockmgr-64d5f045-db58-4697-9bfa-6183da2e9cae. Falling back to Java IO way
java.io.IOException: Failed to delete: /data/BI_data/nbs/tmp/blockmgr-64d5f045-db58-4697-9bfa-6183da2e9cae
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingUnixNative(JavaUtils.java:165) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:109) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:90) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.SparkFileUtils.deleteRecursively(SparkFileUtils.scala:121) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.SparkFileUtils.deleteRecursively$(SparkFileUtils.scala:120) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1126) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.storage.DiskBlockManager.$anonfun$doStop$1(DiskBlockManager.scala:368) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.storage.DiskBlockManager.$anonfun$doStop$1$adapted(DiskBlockManager.scala:364) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.storage.DiskBlockManager.doStop(DiskBlockManager.scala:364) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.storage.DiskBlockManager.stop(DiskBlockManager.scala:359) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:2124) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:95) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.SparkContext.$anonfun$stop$25(SparkContext.scala:2310) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1375) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.SparkContext.stop(SparkContext.scala:2310) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.SparkContext.stop(SparkContext.scala:2216) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.SparkContext.$anonfun$new$34(SparkContext.scala:686) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1928) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) [gatk-package-4.5.0.0-local.jar:4.5.0.0]
at scala.util.Try$.apply(Try.scala:213) [gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188) [gatk-package-4.5.0.0-local.jar:4.5.0.0]
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178) [gatk-package-4.5.0.0-local.jar:4.5.0.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: java.lang.InterruptedException
at java.lang.Object.wait(Native Method) ~[?:?]
at java.lang.Object.wait(Object.java:338) ~[?:?]
at java.lang.ProcessImpl.waitFor(ProcessImpl.java:434) ~[?:?]
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingUnixNative(JavaUtils.java:163) ~[gatk-package-4.5.0.0-local.jar:4.5.0.0]
... 33 more
16:01:27.393 INFO MemoryStore - MemoryStore cleared
16:01:27.393 INFO BlockManager - BlockManager stopped
16:01:27.394 INFO BlockManagerMaster - BlockManagerMaster stopped
16:01:27.396 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint - OutputCommitCoordinator stopped!
16:01:27.412 INFO SparkContext - Successfully stopped SparkContext
16:01:27.412 INFO ShutdownHookManager - Shutdown hook called
16:01:27.412 INFO ShutdownHookManager - Deleting directory /data/BI_data/nbs/tmp/spark-11e3f5ab-904a-45c1-b7ae-33d460a1eaf6
-
This is the command:
gatk MarkDuplicatesSpark -I /data/BI_data/nbs/aligned_reads/SRR062634.paired.sam -O /data/BI_data/nbs/aligned_reads/SRR062634_sorted_dedup_reads.bam --java-options "-Djava.io.tmpdir=/data/BI_data/nbs/tmp"
Java specs:
openjdk 17.0.9-internal 2023-10-17
OpenJDK Runtime Environment (build 17.0.9-internal+0-adhoc..src)
OpenJDK 64-Bit Server VM (build 17.0.9-internal+0-adhoc..src, mixed mode, sharing)
-
Hi Ryan Welch,
Can you try running the tool with the parameter below?
--java-options "-Dsamjdk.use_async_io_write_samtools=false"
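For example, both Java properties can be passed in a single --java-options string; just a sketch reusing the paths and tmpdir setting from your original command:
gatk MarkDuplicatesSpark -I /data/BI_data/nbs/aligned_reads/SRR062634.paired.sam -O /data/BI_data/nbs/aligned_reads/SRR062634_sorted_dedup_reads.bam --java-options "-Djava.io.tmpdir=/data/BI_data/nbs/tmp -Dsamjdk.use_async_io_write_samtools=false"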
-
Hello Gökalp Çelik, I ran the following command:
gatk MarkDuplicatesSpark -I /data/BI_data/nbs/aligned_reads/SRR062634.paired.sam -O /data/BI_data/nbs/aligned_reads/SRR062634_sorted_dedup_reads1.bam --java-options "-Dsamjdk.use_async_io_write_samtools=false"
The error I get now is:
15:11:51.897 INFO SparkContext - Created broadcast 14 from broadcast at BamSink.java:76
15:11:51.918 INFO PathOutputCommitterFactory - No output committer factory defined, defaulting to FileOutputCommitterFactory
15:11:51.921 INFO FileOutputCommitter - File Output Committer Algorithm version is 1
15:11:51.921 INFO FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
15:11:51.938 INFO SparkContext - SparkContext is stopping with exitCode 0.
15:11:51.943 INFO AbstractConnector - Stopped Spark@1c3e6bf3{HTTP/1.1, (http/1.1)}{0.0.0.0:4040}
15:11:51.948 INFO SparkUI - Stopped Spark web UI at http://10.199.221.155:4040
15:11:51.957 INFO MapOutputTrackerMasterEndpoint - MapOutputTrackerMasterEndpoint stopped!
15:11:52.946 INFO MemoryStore - MemoryStore cleared
15:11:52.947 INFO BlockManager - BlockManager stopped
15:11:52.948 INFO BlockManagerMaster - BlockManagerMaster stopped
15:11:52.949 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint - OutputCommitCoordinator stopped!
15:11:52.956 INFO SparkContext - Successfully stopped SparkContext
15:11:52.956 INFO MarkDuplicatesSpark - Shutting down engine
[June 25, 2024 at 3:11:52 PM CDT] org.broadinstitute.hellbender.tools.spark.transforms.markduplicates.MarkDuplicatesSpark done. Elapsed time: 7.88 minutes.
Runtime.totalMemory()=9412018176
***********************************************************************
A USER ERROR has occurred: Couldn't write file /data/BI_data/nbs/aligned_reads/SRR062634_sorted_dedup_reads1.bam because writing failed with exception chmod: changing permissions of '/data/BI_data/nbs/aligned_reads/SRR062634_sorted_dedup_reads1.bam.parts': Operation not permitted
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
15:11:52.960 INFO ShutdownHookManager - Shutdown hook called
15:11:52.960 INFO ShutdownHookManager - Deleting directory /tmp/spark-362bdcd6-9251-453e-bb2e-084872bdff0b
-
Hi again.
The /data folder appears to be owned by root, so it is possible that read/write permissions are not set properly for that folder. Can you move your files to a local user folder and try again?
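For example (just a sketch; the home-directory paths below are hypothetical, adjust them to your setup):
ls -ld /data/BI_data/nbs/aligned_reads
mkdir -p ~/aligned_reads
cp /data/BI_data/nbs/aligned_reads/SRR062634.paired.sam ~/aligned_reads/
gatk MarkDuplicatesSpark -I ~/aligned_reads/SRR062634.paired.sam -O ~/aligned_reads/SRR062634_sorted_dedup_reads.bam --java-options "-Dsamjdk.use_async_io_write_samtools=false"
If ls -ld shows the directory owned by root, or on a mount that does not allow chmod, that would explain the chmod failure on the .bam.parts output.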