Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

FindBadGenomicKmersSpark running indefinitely

Answered

5 comments

  • Genevieve Brandt (she/her)

    Hi Cher Wei Yuan,

    Thank you for writing into the GATK forum about this tool! I hope that we can help you solve this issue. 

    If you are running a Spark tool locally (without a spark cluster), we recommend that you follow these guidelines: https://github.com/broadinstitute/gatk#running-gatk4-spark-tools-locally. I believe that by using the options outlined in that README, the Spark tool will run much better on your machine.

    Please let us know how that goes. If it is still not working, we can troubleshoot from there.

    Best,

    Genevieve

  • Cher Wei Yuan

    Dear Genevieve

    I tried using 1, 4, and 12 threads, but the program still runs indefinitely. May I seek your help in troubleshooting?

    Here is the printout from using 4 threads:

    gatk FindBadGenomicKmersSpark \
        --spark-runner LOCAL \
        --spark-master 'local[4]' \
        -R $dir/sv_resources/genome/GRCH37-lite.fa \
        -O $dir/sv_resources/misc/kmers_to_ignore.txt
    Using GATK jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar FindBadGenomicKmersSpark --spark-master local[4] -R /mnt/e/variant_calling/sv_pipeline/sv_resources/genome/GRCH37-lite.fa -O /mnt/e/variant_calling/sv_pipeline/sv_resources/misc/kmers_to_ignore.txt
    17:05:20.424 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    17:05:20.566 INFO  FindBadGenomicKmersSpark - ------------------------------------------------------------
    17:05:20.567 INFO  FindBadGenomicKmersSpark - The Genome Analysis Toolkit (GATK) v4.2.6.0
    17:05:20.567 INFO  FindBadGenomicKmersSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
    17:05:20.567 INFO  FindBadGenomicKmersSpark - Executing as weiyuan@IMC-SPD-EOM120 on Linux v5.10.102.1-microsoft-standard-WSL2 amd64
    17:05:20.567 INFO  FindBadGenomicKmersSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
    17:05:20.567 INFO  FindBadGenomicKmersSpark - Start Date/Time: August 3, 2022 5:05:20 PM SGT
    17:05:20.568 INFO  FindBadGenomicKmersSpark - ------------------------------------------------------------  
    17:05:20.568 INFO  FindBadGenomicKmersSpark - ------------------------------------------------------------  
    17:05:20.568 INFO  FindBadGenomicKmersSpark - HTSJDK Version: 2.24.1
    17:05:20.568 INFO  FindBadGenomicKmersSpark - Picard Version: 2.27.1
    17:05:20.568 INFO  FindBadGenomicKmersSpark - Built for Spark Version: 2.4.5
    17:05:20.569 INFO  FindBadGenomicKmersSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    17:05:20.569 INFO  FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false        
    17:05:20.569 INFO  FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true        
    17:05:20.569 INFO  FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false        
    17:05:20.569 INFO  FindBadGenomicKmersSpark - Deflater: IntelDeflater
    17:05:20.569 INFO  FindBadGenomicKmersSpark - Inflater: IntelInflater
    17:05:20.569 INFO  FindBadGenomicKmersSpark - GCS max retries/reopens: 20
    17:05:20.569 INFO  FindBadGenomicKmersSpark - Requester pays: disabled
    17:05:20.569 WARN  FindBadGenomicKmersSpark -

       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

       Warning: FindBadGenomicKmersSpark is a BETA tool and is not yet ready for use in production

       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


    17:05:20.569 INFO  FindBadGenomicKmersSpark - Initializing engine
    17:05:20.570 INFO  FindBadGenomicKmersSpark - Done initializing engine
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    22/08/03 17:05:20 WARN Utils: Your hostname, IMC-SPD-EOM120 resolves to a loopback address: 127.0.1.1; using 172.27.18.91 instead (on interface eth0)
    22/08/03 17:05:20 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    22/08/03 17:05:21 INFO SparkContext: Running Spark version 2.4.5
    22/08/03 17:05:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    22/08/03 17:05:22 INFO SparkContext: Submitted application: FindBadGenomicKmersSpark
    22/08/03 17:05:22 INFO SecurityManager: Changing view acls to: weiyuan
    22/08/03 17:05:22 INFO SecurityManager: Changing modify acls to: weiyuan
    22/08/03 17:05:22 INFO SecurityManager: Changing view acls groups to:
    22/08/03 17:05:22 INFO SecurityManager: Changing modify acls groups to:
    22/08/03 17:05:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(weiyuan); groups with view permissions: Set(); users  with modify permissions: Set(weiyuan); groups with modify permissions: Set()
    22/08/03 17:05:22 INFO Utils: Successfully started service 'sparkDriver' on port 41925.
    22/08/03 17:05:22 INFO SparkEnv: Registering MapOutputTracker
    22/08/03 17:05:22 INFO SparkEnv: Registering BlockManagerMaster
    22/08/03 17:05:22 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
    22/08/03 17:05:22 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
    22/08/03 17:05:22 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-db8bfc05-ae9e-4d0e-a266-2d923c3c5074
    22/08/03 17:05:22 INFO MemoryStore: MemoryStore started with capacity 15.8 GB
    22/08/03 17:05:22 INFO SparkEnv: Registering OutputCommitCoordinator
    22/08/03 17:05:22 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    22/08/03 17:05:22 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.27.18.91:4040
    22/08/03 17:05:22 INFO Executor: Starting executor ID driver on host localhost
    22/08/03 17:05:22 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40331.
    22/08/03 17:05:22 INFO NettyBlockTransferService: Server created on 172.27.18.91:40331
    22/08/03 17:05:22 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
    22/08/03 17:05:22 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.27.18.91, 40331, None)
    22/08/03 17:05:22 INFO BlockManagerMasterEndpoint: Registering block manager 172.27.18.91:40331 with 15.8 GB RAM, BlockManagerId(driver, 172.27.18.91, 40331, None)
    22/08/03 17:05:22 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.27.18.91, 40331, None)
    22/08/03 17:05:22 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 172.27.18.91, 40331, None)
    17:05:22.993 INFO  FindBadGenomicKmersSpark - Spark verbosity set to INFO (see --spark-verbosity argument)
    22/08/03 17:05:54 INFO SparkContext: Starting job: collect at FindBadGenomicKmersSpark.java:178
    22/08/03 17:05:55 INFO DAGScheduler: Registering RDD 2 (mapToPair at FindBadGenomicKmersSpark.java:162) as input to shuffle 0
    22/08/03 17:05:55 INFO DAGScheduler: Got job 0 (collect at FindBadGenomicKmersSpark.java:178) with 2998 output partitions
    22/08/03 17:05:55 INFO DAGScheduler: Final stage: ResultStage 1 (collect at FindBadGenomicKmersSpark.java:178)
    22/08/03 17:05:55 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
    22/08/03 17:05:55 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
    22/08/03 17:05:55 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162), which has no missing parents
    22/08/03 17:05:55 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 15.8 GB)
    22/08/03 17:05:55 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.7 KB, free 15.8 GB)
    22/08/03 17:05:55 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.27.18.91:40331 (size: 3.7 KB, free: 15.8 GB)
    22/08/03 17:05:55 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1163
    22/08/03 17:05:55 INFO DAGScheduler: Submitting 2998 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
    22/08/03 17:05:55 INFO TaskSchedulerImpl: Adding task set 0.0 with 2998 tasks
    22/08/03 17:05:55 WARN TaskSetManager: Stage 0 contains a task of very large size (1018 KB). The maximum recommended task size is 100 KB.
    22/08/03 17:05:55 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 1043155 bytes)
    22/08/03 17:05:55 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 1053206 bytes)
    22/08/03 17:05:55 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 1053206 bytes)
    22/08/03 17:05:55 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 1053206 bytes)
    22/08/03 17:05:55 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    22/08/03 17:05:55 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
    22/08/03 17:05:55 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
    22/08/03 17:05:55 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
    22/08/03 17:05:57 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 924 bytes result sent to driver
    22/08/03 17:05:57 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 1053206 bytes)
    22/08/03 17:05:57 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
    22/08/03 17:05:58 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 2486 ms on localhost (executor driver) (1/2998)

    Best Regards

    Wei Yuan

  • Genevieve Brandt (she/her)

    Thanks Wei Yuan for trying that so quickly and getting back to us! 

    Could you also try adding the Java -Xmx option to your command line to limit the amount of memory Java uses?

    Explanation about -Xmx:

    How to use it in the GATK command line:
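
    As a minimal sketch of the command-line usage: -Xmx sets the maximum Java heap size, and the gatk wrapper script passes it to the JVM through --java-options. The heap size, thread count, and file names below are placeholder assumptions, not recommended values, and the run helper is a hypothetical dry-run wrapper added only so the command can be inspected before it is launched.

```shell
# Sketch: cap the JVM heap for a GATK Spark tool run in local mode.
# HEAP, THREADS, REF, and OUT are placeholder values (assumptions);
# pick a heap size that fits within the memory available on your machine.
HEAP="16G"
THREADS=4
REF="reference.fa"
OUT="kmers_to_ignore.txt"

# Hypothetical helper: with DRY_RUN=1 (the default here) it only prints the
# command; set DRY_RUN=0 to execute it. The printed line omits shell quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# --java-options carries JVM flags; -Xmx limits the maximum heap size.
run gatk --java-options "-Xmx${HEAP}" FindBadGenomicKmersSpark \
    --spark-runner LOCAL \
    --spark-master "local[${THREADS}]" \
    -R "$REF" \
    -O "$OUT"
```

    When GATK starts, the "Running:" line of its output echoes the full java command, which is a quick way to confirm that the -Xmx value was actually applied.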

  • Cher Wei Yuan

    Dear Genevieve

    I tried several parameter combinations (Xmx4/12/20G + 48 threads, Xmx100G + 1/4/24 threads, Xmx200G + 48 threads), but the furthest I got was still task 53.0 in stage 0.0 (TID 53), where it runs indefinitely (shown below).

    Perhaps it would be easier if the reference genomes (GRCh37 and GRCh38) and their FindBadGenomicKmersSpark output could be uploaded to the GATK Google Cloud resource. This would let most of us start using StructuralVariationDiscoveryPipelineSpark immediately. Let me know if this suggestion works. Thank you!

    (gatk4) weiyuan@IMC-SPD-EOM120:/mnt/e/variant_calling/sv_pipeline$ gatk FindBadGenomicKmersSpark \
    > --java-options "-Xmx200G" \
    > --spark-runner LOCAL \
    > --spark-master 'local[*]' \
    > -R $dir/sv_resources/genome/GRCH37-lite.fa \
    > -O $dir/sv_resources/misc/kmers_to_ignore.txt
    Using GATK jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx200G -jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar FindBadGenomicKmersSpark --spark-master local[*] -R /mnt/e/variant_calling/sv_pipeline/sv_resources/genome/GRCH37-lite.fa -O /mnt/e/variant_calling/sv_pipeline/sv_resources/misc/kmers_to_ignore.txt
    12:05:05.554 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    12:05:05.673 INFO  FindBadGenomicKmersSpark - ------------------------------------------------------------
    12:05:05.673 INFO  FindBadGenomicKmersSpark - The Genome Analysis Toolkit (GATK) v4.2.6.0
    12:05:05.673 INFO  FindBadGenomicKmersSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
    12:05:05.673 INFO  FindBadGenomicKmersSpark - Executing as weiyuan@IMC-SPD-EOM120 on Linux v5.10.102.1-microsoft-standard-WSL2 amd64
    12:05:05.673 INFO  FindBadGenomicKmersSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
    12:05:05.674 INFO  FindBadGenomicKmersSpark - Start Date/Time: August 4, 2022 12:05:05 PM SGT
    12:05:05.674 INFO  FindBadGenomicKmersSpark - ------------------------------------------------------------  
    12:05:05.674 INFO  FindBadGenomicKmersSpark - ------------------------------------------------------------  
    12:05:05.674 INFO  FindBadGenomicKmersSpark - HTSJDK Version: 2.24.1
    12:05:05.674 INFO  FindBadGenomicKmersSpark - Picard Version: 2.27.1
    12:05:05.674 INFO  FindBadGenomicKmersSpark - Built for Spark Version: 2.4.5
    12:05:05.675 INFO  FindBadGenomicKmersSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    12:05:05.675 INFO  FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false        
    12:05:05.675 INFO  FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true        
    12:05:05.675 INFO  FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false        
    12:05:05.675 INFO  FindBadGenomicKmersSpark - Deflater: IntelDeflater
    12:05:05.675 INFO  FindBadGenomicKmersSpark - Inflater: IntelInflater
    12:05:05.675 INFO  FindBadGenomicKmersSpark - GCS max retries/reopens: 20
    12:05:05.675 INFO  FindBadGenomicKmersSpark - Requester pays: disabled
    12:05:05.676 WARN  FindBadGenomicKmersSpark -

       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

       Warning: FindBadGenomicKmersSpark is a BETA tool and is not yet ready for use in production

       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


    12:05:05.676 INFO  FindBadGenomicKmersSpark - Initializing engine
    12:05:05.676 INFO  FindBadGenomicKmersSpark - Done initializing engine
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    22/08/04 12:05:05 WARN Utils: Your hostname, IMC-SPD-EOM120 resolves to a loopback address: 127.0.1.1; using 172.27.18.91 instead (on interface eth0)
    22/08/04 12:05:05 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    22/08/04 12:05:06 INFO SparkContext: Running Spark version 2.4.5
    22/08/04 12:05:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    22/08/04 12:05:07 INFO SparkContext: Submitted application: FindBadGenomicKmersSpark
    22/08/04 12:05:07 INFO SecurityManager: Changing view acls to: weiyuan
    22/08/04 12:05:07 INFO SecurityManager: Changing modify acls to: weiyuan
    22/08/04 12:05:07 INFO SecurityManager: Changing view acls groups to:
    22/08/04 12:05:07 INFO SecurityManager: Changing modify acls groups to:
    22/08/04 12:05:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(weiyuan); groups with view permissions: Set(); users  with modify permissions: Set(weiyuan); groups with modify permissions: Set()
    22/08/04 12:05:07 INFO Utils: Successfully started service 'sparkDriver' on port 34615.
    22/08/04 12:05:07 INFO SparkEnv: Registering MapOutputTracker
    22/08/04 12:05:07 INFO SparkEnv: Registering BlockManagerMaster
    22/08/04 12:05:07 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
    22/08/04 12:05:07 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
    22/08/04 12:05:07 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-8b29a948-0bc1-4dd5-acba-4dea04a4b3b8
    22/08/04 12:05:07 INFO MemoryStore: MemoryStore started with capacity 106.5 GB
    22/08/04 12:05:07 INFO SparkEnv: Registering OutputCommitCoordinator
    22/08/04 12:05:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    22/08/04 12:05:07 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.27.18.91:4040
    22/08/04 12:05:07 INFO Executor: Starting executor ID driver on host localhost
    22/08/04 12:05:07 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41571.
    22/08/04 12:05:07 INFO NettyBlockTransferService: Server created on 172.27.18.91:41571
    22/08/04 12:05:07 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
    22/08/04 12:05:07 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.27.18.91, 41571, None)
    22/08/04 12:05:07 INFO BlockManagerMasterEndpoint: Registering block manager 172.27.18.91:41571 with 106.5 GB RAM, BlockManagerId(driver, 172.27.18.91, 41571, None)
    22/08/04 12:05:07 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.27.18.91, 41571, None)
    22/08/04 12:05:07 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 172.27.18.91, 41571, None)
    12:05:08.052 INFO  FindBadGenomicKmersSpark - Spark verbosity set to INFO (see --spark-verbosity argument)
    22/08/04 12:05:42 INFO SparkContext: Starting job: collect at FindBadGenomicKmersSpark.java:178
    22/08/04 12:05:43 INFO DAGScheduler: Registering RDD 2 (mapToPair at FindBadGenomicKmersSpark.java:162) as input to shuffle 0
    22/08/04 12:05:43 INFO DAGScheduler: Got job 0 (collect at FindBadGenomicKmersSpark.java:178) with 2998 output partitions
    22/08/04 12:05:43 INFO DAGScheduler: Final stage: ResultStage 1 (collect at FindBadGenomicKmersSpark.java:178)
    22/08/04 12:05:43 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
    22/08/04 12:05:43 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
    22/08/04 12:05:43 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162), which has no missing parents
    22/08/04 12:05:43 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 106.5 GB)
    22/08/04 12:05:43 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.7 KB, free 106.5 GB)
    22/08/04 12:05:43 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.27.18.91:41571 (size: 3.7 KB, free: 106.5 GB)
    22/08/04 12:05:43 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1163
    22/08/04 12:05:43 INFO DAGScheduler: Submitting 2998 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
    22/08/04 12:05:43 INFO TaskSchedulerImpl: Adding task set 0.0 with 2998 tasks
    22/08/04 12:05:43 WARN TaskSetManager: Stage 0 contains a task of very large size (1018 KB). The maximum recommended task size is 100 KB.
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 1043155 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, localhost, executor driver, partition 8, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, localhost, executor driver, partition 9, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, localhost, executor driver, partition 10, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, localhost, executor driver, partition 11, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, localhost, executor driver, partition 12, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, localhost, executor driver, partition 13, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 14, localhost, executor driver, partition 14, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 15, localhost, executor driver, partition 15, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 16, localhost, executor driver, partition 16, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 17.0 in stage 0.0 (TID 17, localhost, executor driver, partition 17, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 18.0 in stage 0.0 (TID 18, localhost, executor driver, partition 18, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 19.0 in stage 0.0 (TID 19, localhost, executor driver, partition 19, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 20.0 in stage 0.0 (TID 20, localhost, executor driver, partition 20, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 21.0 in stage 0.0 (TID 21, localhost, executor driver, partition 21, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 22.0 in stage 0.0 (TID 22, localhost, executor driver, partition 22, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 23.0 in stage 0.0 (TID 23, localhost, executor driver, partition 23, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 24.0 in stage 0.0 (TID 24, localhost, executor driver, partition 24, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 25.0 in stage 0.0 (TID 25, localhost, executor driver, partition 25, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 26.0 in stage 0.0 (TID 26, localhost, executor driver, partition 26, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 27.0 in stage 0.0 (TID 27, localhost, executor driver, partition 27, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 28.0 in stage 0.0 (TID 28, localhost, executor driver, partition 28, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 29.0 in stage 0.0 (TID 29, localhost, executor driver, partition 29, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 30.0 in stage 0.0 (TID 30, localhost, executor driver, partition 30, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 31.0 in stage 0.0 (TID 31, localhost, executor driver, partition 31, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 32.0 in stage 0.0 (TID 32, localhost, executor driver, partition 32, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 33.0 in stage 0.0 (TID 33, localhost, executor driver, partition 33, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 34.0 in stage 0.0 (TID 34, localhost, executor driver, partition 34, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 35.0 in stage 0.0 (TID 35, localhost, executor driver, partition 35, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 36.0 in stage 0.0 (TID 36, localhost, executor driver, partition 36, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 37.0 in stage 0.0 (TID 37, localhost, executor driver, partition 37, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 38.0 in stage 0.0 (TID 38, localhost, executor driver, partition 38, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, localhost, executor driver, partition 39, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 40.0 in stage 0.0 (TID 40, localhost, executor driver, partition 40, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 41.0 in stage 0.0 (TID 41, localhost, executor driver, partition 41, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 42.0 in stage 0.0 (TID 42, localhost, executor driver, partition 42, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 43.0 in stage 0.0 (TID 43, localhost, executor driver, partition 43, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:43 INFO TaskSetManager: Starting task 44.0 in stage 0.0 (TID 44, localhost, executor driver, partition 44, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:44 INFO TaskSetManager: Starting task 45.0 in stage 0.0 (TID 45, localhost, executor driver, partition 45, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:44 INFO TaskSetManager: Starting task 46.0 in stage 0.0 (TID 46, localhost, executor driver, partition 46, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:44 INFO TaskSetManager: Starting task 47.0 in stage 0.0 (TID 47, localhost, executor driver, partition 47, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:44 INFO Executor: Running task 14.0 in stage 0.0 (TID 14)
    22/08/04 12:05:44 INFO Executor: Running task 28.0 in stage 0.0 (TID 28)
    22/08/04 12:05:44 INFO Executor: Running task 26.0 in stage 0.0 (TID 26)
    22/08/04 12:05:44 INFO Executor: Running task 27.0 in stage 0.0 (TID 27)
    22/08/04 12:05:44 INFO Executor: Running task 25.0 in stage 0.0 (TID 25)
    22/08/04 12:05:44 INFO Executor: Running task 24.0 in stage 0.0 (TID 24)
    22/08/04 12:05:44 INFO Executor: Running task 23.0 in stage 0.0 (TID 23)
    22/08/04 12:05:44 INFO Executor: Running task 22.0 in stage 0.0 (TID 22)
    22/08/04 12:05:44 INFO Executor: Running task 21.0 in stage 0.0 (TID 21)
    22/08/04 12:05:44 INFO Executor: Running task 7.0 in stage 0.0 (TID 7)
    22/08/04 12:05:44 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
    22/08/04 12:05:44 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
    22/08/04 12:05:44 INFO Executor: Running task 6.0 in stage 0.0 (TID 6)
    22/08/04 12:05:44 INFO Executor: Running task 42.0 in stage 0.0 (TID 42)
    22/08/04 12:05:44 INFO Executor: Running task 43.0 in stage 0.0 (TID 43)
    22/08/04 12:05:44 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
    22/08/04 12:05:44 INFO Executor: Running task 45.0 in stage 0.0 (TID 45)
    22/08/04 12:05:44 INFO Executor: Running task 11.0 in stage 0.0 (TID 11)
    22/08/04 12:05:44 INFO Executor: Running task 5.0 in stage 0.0 (TID 5)
    22/08/04 12:05:44 INFO Executor: Running task 9.0 in stage 0.0 (TID 9)
    22/08/04 12:05:44 INFO Executor: Running task 19.0 in stage 0.0 (TID 19)
    22/08/04 12:05:44 INFO Executor: Running task 8.0 in stage 0.0 (TID 8)
    22/08/04 12:05:44 INFO Executor: Running task 15.0 in stage 0.0 (TID 15)
    22/08/04 12:05:44 INFO Executor: Running task 20.0 in stage 0.0 (TID 20)
    22/08/04 12:05:44 INFO Executor: Running task 16.0 in stage 0.0 (TID 16)
    22/08/04 12:05:44 INFO Executor: Running task 10.0 in stage 0.0 (TID 10)
    22/08/04 12:05:44 INFO Executor: Running task 17.0 in stage 0.0 (TID 17)
    22/08/04 12:05:44 INFO Executor: Running task 13.0 in stage 0.0 (TID 13)
    22/08/04 12:05:44 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
    22/08/04 12:05:44 INFO Executor: Running task 18.0 in stage 0.0 (TID 18)
    22/08/04 12:05:44 INFO Executor: Running task 12.0 in stage 0.0 (TID 12)
    22/08/04 12:05:44 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    22/08/04 12:05:44 INFO Executor: Running task 46.0 in stage 0.0 (TID 46)
    22/08/04 12:05:44 INFO Executor: Running task 47.0 in stage 0.0 (TID 47)
    22/08/04 12:05:44 INFO Executor: Running task 40.0 in stage 0.0 (TID 40)
    22/08/04 12:05:44 INFO Executor: Running task 39.0 in stage 0.0 (TID 39)
    22/08/04 12:05:44 INFO Executor: Running task 36.0 in stage 0.0 (TID 36)
    22/08/04 12:05:44 INFO Executor: Running task 38.0 in stage 0.0 (TID 38)
    22/08/04 12:05:44 INFO Executor: Running task 44.0 in stage 0.0 (TID 44)
    22/08/04 12:05:44 INFO Executor: Running task 41.0 in stage 0.0 (TID 41)
    22/08/04 12:05:44 INFO Executor: Running task 30.0 in stage 0.0 (TID 30)
    22/08/04 12:05:44 INFO Executor: Running task 37.0 in stage 0.0 (TID 37)
    22/08/04 12:05:44 INFO Executor: Running task 35.0 in stage 0.0 (TID 35)
    22/08/04 12:05:44 INFO Executor: Running task 34.0 in stage 0.0 (TID 34)
    22/08/04 12:05:44 INFO Executor: Running task 32.0 in stage 0.0 (TID 32)
    22/08/04 12:05:44 INFO Executor: Running task 33.0 in stage 0.0 (TID 33)
    22/08/04 12:05:44 INFO Executor: Running task 31.0 in stage 0.0 (TID 31)
    22/08/04 12:05:44 INFO Executor: Running task 29.0 in stage 0.0 (TID 29)
    22/08/04 12:05:48 INFO Executor: Finished task 38.0 in stage 0.0 (TID 38). 924 bytes result sent to driver
    22/08/04 12:05:48 INFO Executor: Finished task 40.0 in stage 0.0 (TID 40). 924 bytes result sent to driver
    22/08/04 12:05:48 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 924 bytes result sent to driver    
    22/08/04 12:05:48 INFO TaskSetManager: Starting task 48.0 in stage 0.0 (TID 48, localhost, executor driver, partition 48, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:48 INFO Executor: Running task 48.0 in stage 0.0 (TID 48)
    22/08/04 12:05:48 INFO Executor: Finished task 21.0 in stage 0.0 (TID 21). 924 bytes result sent to driver
    22/08/04 12:05:48 INFO TaskSetManager: Starting task 49.0 in stage 0.0 (TID 49, localhost, executor driver, partition 49, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:48 INFO Executor: Running task 49.0 in stage 0.0 (TID 49)
    22/08/04 12:05:48 INFO TaskSetManager: Starting task 50.0 in stage 0.0 (TID 50, localhost, executor driver, partition 50, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:48 INFO Executor: Running task 50.0 in stage 0.0 (TID 50)
    22/08/04 12:05:48 INFO TaskSetManager: Finished task 38.0 in stage 0.0 (TID 38) in 4645 ms on localhost (executor driver) (1/2998)
    22/08/04 12:05:48 INFO TaskSetManager: Finished task 40.0 in stage 0.0 (TID 40) in 4639 ms on localhost (executor driver) (2/2998)
    22/08/04 12:05:48 INFO TaskSetManager: Starting task 51.0 in stage 0.0 (TID 51, localhost, executor driver, partition 51, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:48 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 4917 ms on localhost (executor driver) (3/2998)
    22/08/04 12:05:48 INFO Executor: Running task 51.0 in stage 0.0 (TID 51)
    22/08/04 12:05:48 INFO TaskSetManager: Finished task 21.0 in stage 0.0 (TID 21) in 4774 ms on localhost (executor driver) (4/2998)
    22/08/04 12:05:51 INFO Executor: Finished task 49.0 in stage 0.0 (TID 49). 924 bytes result sent to driver
    22/08/04 12:05:51 INFO TaskSetManager: Starting task 52.0 in stage 0.0 (TID 52, localhost, executor driver, partition 52, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:51 INFO TaskSetManager: Finished task 49.0 in stage 0.0 (TID 49) in 3140 ms on localhost (executor driver) (5/2998)
    22/08/04 12:05:51 INFO Executor: Running task 52.0 in stage 0.0 (TID 52)
    22/08/04 12:05:54 INFO Executor: Finished task 52.0 in stage 0.0 (TID 52). 881 bytes result sent to driver
    22/08/04 12:05:54 INFO TaskSetManager: Starting task 53.0 in stage 0.0 (TID 53, localhost, executor driver, partition 53, PROCESS_LOCAL, 1053206 bytes)
    22/08/04 12:05:54 INFO TaskSetManager: Finished task 52.0 in stage 0.0 (TID 52) in 2963 ms on localhost (executor driver) (6/2998)
    22/08/04 12:05:54 INFO Executor: Running task 53.0 in stage 0.0 (TID 53)
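
    To see which tasks in a log like the one above started but never reported completion, one option is to compare the Executor "Running task" and "Finished task" messages. The following is only a sketch: the function name list_unfinished_tasks and the log file name are hypothetical, and it assumes the console output was saved to a file and follows the message format shown in this log.

```shell
# Sketch: list Spark task IDs (TIDs) that logged "Running task" but never
# logged "Finished task". Assumes the console output was saved to a file
# and uses the Executor message format seen in the log above.
# The function name and the log file name are hypothetical.
list_unfinished_tasks() {
    awk '
        # Return the number that follows the "(TID" token on the current line.
        function tid_of(    i, t) {
            for (i = 1; i < NF; i++) {
                if ($i == "(TID") {
                    t = $(i + 1)
                    gsub(/[^0-9]/, "", t)
                    return t
                }
            }
            return ""
        }
        # Remember every task an executor thread picked up ...
        /Executor: Running task/  { t = tid_of(); if (t != "") running[t] = 1 }
        # ... and forget it again once the executor reports it finished.
        /Executor: Finished task/ { t = tid_of(); if (t != "") delete running[t] }
        END { for (t in running) print t }
    ' "$1" | sort -n
}

# Example call (the file name is a placeholder):
# list_unfinished_tasks gatk_spark.log
```

    If the same TIDs stay in this list for a long time while no new "Finished task" lines appear, those are the tasks to look at first.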

     

    Thank you again, Genevieve!

    Best Regards

    Wei Yuan

  • Genevieve Brandt (she/her)

    Hi Wei Yuan,

    Thank you for the update and for giving more insight into your use case. Unfortunately, the GATK Spark SV pipeline is no longer maintained; we now recommend the much-improved GATK-SV pipeline instead. The Spark SV pipeline tools are all in BETA, and we are not able to provide much support for using them.

    I highly recommend that you switch over to the GATK-SV pipeline for an improved experience and better results. Here is more information: https://github.com/broadinstitute/gatk-sv.

    If you really need to use the Spark SV tools, our team might be able to look more closely, but it will take some time to come to an answer about this. Let me know!

    Best,

    Genevieve

