something wrong with MarkDuplicatesSpark
hi,The GATK throws an NoClassDefFoundError: scala/Product$class while running MarkDuplicatesSpark on both version 4.1.7 and version 4.0.0. And my environment is: spark-3.0.0, scala-2.11.12.
Thank you.
GATK commands:
gatk MarkDuplicatesSpark \
-I hdfs://master2:9000/Drosophila/output/Drosophila.sorted.bam \
-O hdfs://master2:9000/Drosophila/output/Drosophila.sorted.markdup.bam \
-M hdfs://master2:9000/Drosophila/output/Drosophila.sorted.markdup_metrics.txt \
-- \
--spark-runner SPARK --spark-master spark://master2:7077
The error log:
Exception in thread "main" java.lang.NoClassDefFoundError: scala/Product$class
at org.bdgenomics.adam.serialization.InputStreamWithDecoder.<init>(ADAMKryoRegistrator.scala:35)
at org.bdgenomics.adam.serialization.AvroSerializer.<init>(ADAMKryoRegistrator.scala:45)
at org.bdgenomics.adam.models.VariantContextSerializer.<init>(VariantContext.scala:94)
at org.bdgenomics.adam.serialization.ADAMKryoRegistrator.registerClasses(ADAMKryoRegistrator.scala:179)
at org.broadinstitute.hellbender.engine.spark.GATKRegistrator.registerClasses(GATKRegistrator.java:78)
at org.apache.spark.serializer.KryoSerializer.$anonfun$newKryo$8(KryoSerializer.scala:170)
at org.apache.spark.serializer.KryoSerializer.$anonfun$newKryo$8$adapted(KryoSerializer.scala:170)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.serializer.KryoSerializer.$anonfun$newKryo$5(KryoSerializer.scala:170)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:221)
at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:161)
at org.apache.spark.serializer.KryoSerializer$$anon$1.create(KryoSerializer.scala:102)
at com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
at org.apache.spark.serializer.KryoSerializer$PoolWrapper.borrow(KryoSerializer.scala:109)
at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:336)
at org.apache.spark.serializer.KryoSerializationStream.<init>(KryoSerializer.scala:256)
at org.apache.spark.serializer.KryoSerializerInstance.serializeStream(KryoSerializer.scala:422)
at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:309)
at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:137)
at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:91)
at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:35)
at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:77)
at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1494)
at org.apache.spark.rdd.NewHadoopRDD.<init>(NewHadoopRDD.scala:80)
at org.apache.spark.SparkContext.$anonfun$newAPIHadoopFile$2(SparkContext.scala:1235)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:771)
at org.apache.spark.SparkContext.newAPIHadoopFile(SparkContext.scala:1221)
at org.apache.spark.api.java.JavaSparkContext.newAPIHadoopFile(JavaSparkContext.scala:484)
at org.broadinstitute.hellbender.engine.spark.datasources.ReadsSparkSource.getParallelReads(ReadsSparkSource
.java:112)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.getUnfilteredReads(GATKSparkTool.java:254)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.getReads(GATKSparkTool.java:220)
at org.broadinstitute.hellbender.tools.spark.transforms.markduplicates.MarkDuplicatesSpark.runTool(MarkDupli
catesSpark.java:72)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:387)
at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:30
)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:136)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.jav
a:179)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:152)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:195)
at org.broadinstitute.hellbender.Main.main(Main.java:275)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: scala.Product$class
at java.lang.ClassLoader.findClass(ClassLoader.java:523)
at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.java:35)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.java:40)
at org.apache.spark.util.ChildFirstURLClassLoader.loadClass(ChildFirstURLClassLoader.java:48)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 55 more
-
Hi,
We don't support spark 3 yet. We will soon (maybe in the next few months). But for now please switch to spark 2 and that should resolve this issue.
Please sign in to leave a comment.
1 comment