Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data



Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

PathSeqPipelineSpark

3 comments

  • sy zhang

    Excuse me, when I run PathSeqPipelineSpark using the pre-built microbe reference files from the GATK Resource Bundle to detect microbes in a mouse scRNA-seq BAM file, I get the following errors:

    ......

    23/02/20 02:26:31 INFO TaskSetManager: Finished task 534.0 in stage 4.0 (TID 2702) in 7371 ms on localhost (executor driver) (542/542)
    23/02/20 02:26:31 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose tasks have all completed, from pool 
    23/02/20 02:26:31 INFO DAGScheduler: ResultStage 4 (count at PSFilterFileLogger.java:47) finished in 80.017 s
    23/02/20 02:26:31 INFO DAGScheduler: Job 3 finished: count at PSFilterFileLogger.java:47, took 1635.386275 s
    23/02/20 02:26:32 INFO SparkUI: Stopped Spark web UI at http://10.10.10.152:4040
    23/02/20 02:26:33 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    23/02/20 02:26:33 INFO MemoryStore: MemoryStore cleared
    23/02/20 02:26:33 INFO BlockManager: BlockManager stopped
    23/02/20 02:26:33 INFO BlockManagerMaster: BlockManagerMaster stopped
    23/02/20 02:26:33 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    23/02/20 02:26:34 INFO SparkContext: Successfully stopped SparkContext
    02:26:34.177 INFO  PathSeqPipelineSpark - Shutting down engine
    [February 20, 2023 at 2:26:34 AM CST] org.broadinstitute.hellbender.tools.spark.pathseq.PathSeqPipelineSpark done. Elapsed time: 44.24 minutes.
    Runtime.totalMemory()=211493584896
    java.lang.IllegalArgumentException: Unsupported class file major version 55
        at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:166)
        at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:148)
        at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:136)
        at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:237)
        at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:49)
        at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:517)
        at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:500)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
        at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
        at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:500)
        at org.apache.xbean.asm6.ClassReader.readCode(ClassReader.java:2175)
        at org.apache.xbean.asm6.ClassReader.readMethod(ClassReader.java:1238)
        at org.apache.xbean.asm6.ClassReader.accept(ClassReader.java:631)
        at org.apache.xbean.asm6.ClassReader.accept(ClassReader.java:355)
        at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:307)
        at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:306)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:306)
        at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
        at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKeyWithClassTag$1.apply(PairRDDFunctions.scala:88)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKeyWithClassTag$1.apply(PairRDDFunctions.scala:77)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
        at org.apache.spark.rdd.PairRDDFunctions.combineByKeyWithClassTag(PairRDDFunctions.scala:77)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$groupByKey$1.apply(PairRDDFunctions.scala:505)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$groupByKey$1.apply(PairRDDFunctions.scala:498)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
        at org.apache.spark.rdd.PairRDDFunctions.groupByKey(PairRDDFunctions.scala:498)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$groupByKey$3.apply(PairRDDFunctions.scala:641)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$groupByKey$3.apply(PairRDDFunctions.scala:641)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
        at org.apache.spark.rdd.PairRDDFunctions.groupByKey(PairRDDFunctions.scala:640)
        at org.apache.spark.api.java.JavaPairRDD.groupByKey(JavaPairRDD.scala:559)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSFilter.filterDuplicateSequences(PSFilter.java:166)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSFilter.doFilter(PSFilter.java:289)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PathSeqPipelineSpark.runTool(PathSeqPipelineSpark.java:238)
        at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:546)
        at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:31)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)
        Suppressed: java.lang.IllegalStateException: Cannot compute metrics if primary, pre-aligned host, quality, host, duplicate, or final paired read counts are not initialized
            at org.broadinstitute.hellbender.tools.spark.pathseq.loggers.PSFilterMetrics.computeDerivedMetrics(PSFilterMetrics.java:72)
            at org.broadinstitute.hellbender.tools.spark.pathseq.loggers.PSFilterFileLogger.close(PSFilterFileLogger.java:64)
            at org.broadinstitute.hellbender.tools.spark.pathseq.PathSeqPipelineSpark.runTool(PathSeqPipelineSpark.java:239)
            ... 8 more
    23/02/20 02:26:34 INFO ShutdownHookManager: Shutdown hook called
    23/02/20 02:26:34 INFO ShutdownHookManager: Deleting directory /tmp/spark-84326ac0-95d8-4bbb-8cf8-b7df3b1d7a3f

    ------------------------------------------------------------------------------------------------------------------------------------

    Can you help me, please?

  • Cameron Griffiths

    sy zhang

    Have you made sure that you are running the correct version of Java for GATK? "Unsupported class file major version 55" means some classes were compiled for Java 11, which the Spark machinery bundled with this GATK release cannot read; GATK's Spark tools expect Java 8. I ran into the same error, and it was fixed by changing my Java version.

    https://gatk.broadinstitute.org/hc/en-us/articles/360035532332-Java-version-issues

    https://gatk.broadinstitute.org/hc/en-us/articles/360035889531-What-are-the-requirements-for-running-GATK
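    A quick way to check which Java version is on your PATH before launching GATK (a sketch; the `java_major` helper below is not part of GATK, and the "class file major version" mapping is 52 = Java 8, 55 = Java 11):

    ```shell
    # Extract the Java major release from a version string.
    # Handles both the old "1.8.0_292" style (Java 8) and the new "11.0.2" style (Java 9+).
    java_major() {
      v="$1"
      case "$v" in
        1.*) echo "${v#1.}" | cut -d. -f1 ;;
        *)   echo "$v" | cut -d. -f1 ;;
      esac
    }

    # Pull the quoted version string out of `java -version` output and warn if it is not Java 8.
    ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2}')
    major=$(java_major "$ver")
    if [ "$major" != "8" ]; then
      echo "Warning: GATK Spark tools expect Java 8, found Java version '$ver'"
    fi
    ```

    On many Linux systems the active Java can then be switched with `update-alternatives --config java`, or by pointing `JAVA_HOME` at a Java 8 installation before running `gatk`.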

  • Zhengnuan Li

    Hi, I want to know more about the --is-host-aligned setting. To save some time, I aligned the .fq files to the reference genome first (using HISAT2). If I set --is-host-aligned to true, which of the following four processing steps will be carried out on my BAM file?

    The following article shows that the PathSeq process begins with a subtractive phase in which input reads are subtracted by alignment to reference sequences. This phase comprises four steps: (1) filtering; (2) MAQ alignments; (3) MegaBLAST alignments; (4) BLAST alignments.

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3523678/#SD1

    By the way, if I set --is-host-aligned to false, how do I convert the fq1 and fq2 files to a BAM file?

    Could you help me? Thanks!
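    For the second question, paired-end FASTQs can be converted to an unmapped BAM (uBAM) with Picard's FastqToSam, which ships inside GATK (a sketch; the file and sample names below are placeholders):

    ```shell
    # Convert paired-end FASTQs to an unmapped BAM suitable as PathSeq input.
    # Replace the input/output paths and sample name with your own.
    gatk FastqToSam \
        -F1 reads_1.fq \
        -F2 reads_2.fq \
        -O unaligned.bam \
        -SM sample1
    ```

    With an unmapped BAM produced this way, --is-host-aligned should be left false, since the reads carry no host alignments yet.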

