FindBadGenomicKmersSpark running indefinitely
Dear GATK staff
May I seek your help in troubleshooting FindBadGenomicKmersSpark? I wish to use it to generate a list of ubiquitous kmers to ignore for StructuralVariationDiscoveryPipelineSpark, but the former runs indefinitely.
REQUIRED for all errors and issues:
a) GATK version used:
v4.2.6.0
b) Exact command used:
gatk FindBadGenomicKmersSpark -R $dir/sv_resources/genome/GRCH37-lite.fa -O $dir/sv_resources/misc/kmers_to_ignore.txt
GRCH37-lite.fa was downloaded from: https://storage.googleapis.com/genomics-public-data/references/GRCh37lite/GRCh37-lite.fa.gz
c) Entire program log:
I ran the program twice, and both runs were stuck at "Running task 53.0 in stage 0.0 (TID 53)", for 2 and 5 days respectively.
Using GATK jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar FindBadGenomicKmersSpark -R /mnt/e/variant_calling/sv_pipeline/sv_resources/genome/GRCH37-lite.fa -O /mnt/e/variant_calling/sv_pipeline/sv_resources/misc/kmers_to_ignore.txt
18:41:45.124 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
18:41:45.245 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
18:41:45.245 INFO FindBadGenomicKmersSpark - The Genome Analysis Toolkit (GATK) v4.2.6.0
18:41:45.245 INFO FindBadGenomicKmersSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
18:41:45.245 INFO FindBadGenomicKmersSpark - Executing as weiyuan@IMC-SPD-EOM120 on Linux v5.10.102.1-microsoft-standard-WSL2 amd64
18:41:45.245 INFO FindBadGenomicKmersSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
18:41:45.246 INFO FindBadGenomicKmersSpark - Start Date/Time: July 29, 2022 6:41:45 PM SGT
18:41:45.246 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
18:41:45.246 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
18:41:45.246 INFO FindBadGenomicKmersSpark - HTSJDK Version: 2.24.1
18:41:45.246 INFO FindBadGenomicKmersSpark - Picard Version: 2.27.1
18:41:45.246 INFO FindBadGenomicKmersSpark - Built for Spark Version: 2.4.5
18:41:45.246 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
18:41:45.246 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
18:41:45.247 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
18:41:45.247 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
18:41:45.247 INFO FindBadGenomicKmersSpark - Deflater: IntelDeflater
18:41:45.247 INFO FindBadGenomicKmersSpark - Inflater: IntelInflater
18:41:45.247 INFO FindBadGenomicKmersSpark - GCS max retries/reopens: 20
18:41:45.247 INFO FindBadGenomicKmersSpark - Requester pays: disabled
18:41:45.247 WARN FindBadGenomicKmersSpark -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Warning: FindBadGenomicKmersSpark is a BETA tool and is not yet ready for use in production
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
18:41:45.247 INFO FindBadGenomicKmersSpark - Initializing engine
18:41:45.247 INFO FindBadGenomicKmersSpark - Done initializing engine
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/07/29 18:41:45 WARN Utils: Your hostname, IMC-SPD-EOM120 resolves to a loopback address: 127.0.1.1; using 172.27.31.79 instead (on interface eth0)
22/07/29 18:41:45 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
22/07/29 18:41:46 INFO SparkContext: Running Spark version 2.4.5
22/07/29 18:41:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/07/29 18:41:46 INFO SparkContext: Submitted application: FindBadGenomicKmersSpark
22/07/29 18:41:46 INFO SecurityManager: Changing view acls to: weiyuan
22/07/29 18:41:46 INFO SecurityManager: Changing modify acls to: weiyuan
22/07/29 18:41:46 INFO SecurityManager: Changing view acls groups to:
22/07/29 18:41:46 INFO SecurityManager: Changing modify acls groups to:
22/07/29 18:41:46 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(weiyuan); groups with view permissions: Set(); users with modify permissions: Set(weiyuan); groups with modify permissions: Set()
22/07/29 18:41:47 INFO Utils: Successfully started service 'sparkDriver' on port 37735.
22/07/29 18:41:47 INFO SparkEnv: Registering MapOutputTracker
22/07/29 18:41:47 INFO SparkEnv: Registering BlockManagerMaster
22/07/29 18:41:47 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/07/29 18:41:47 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/07/29 18:41:47 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-64a099bb-6814-4d91-98fa-40962cf319a8
22/07/29 18:41:47 INFO MemoryStore: MemoryStore started with capacity 15.8 GB
22/07/29 18:41:47 INFO SparkEnv: Registering OutputCommitCoordinator
22/07/29 18:41:47 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/07/29 18:41:47 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.27.31.79:4040
22/07/29 18:41:47 INFO Executor: Starting executor ID driver on host localhost
22/07/29 18:41:47 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 33163.
22/07/29 18:41:47 INFO NettyBlockTransferService: Server created on 172.27.31.79:33163
22/07/29 18:41:47 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/07/29 18:41:47 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.27.31.79, 33163, None)
22/07/29 18:41:47 INFO BlockManagerMasterEndpoint: Registering block manager 172.27.31.79:33163 with 15.8 GB RAM, BlockManagerId(driver, 172.27.31.79, 33163, None)
22/07/29 18:41:47 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.27.31.79, 33163, None)
22/07/29 18:41:47 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 172.27.31.79, 33163, None)
18:41:47.664 INFO FindBadGenomicKmersSpark - Spark verbosity set to INFO (see --spark-verbosity argument)
22/07/29 18:42:21 INFO SparkContext: Starting job: collect at FindBadGenomicKmersSpark.java:178
22/07/29 18:42:21 INFO DAGScheduler: Registering RDD 2 (mapToPair at FindBadGenomicKmersSpark.java:162) as input to shuffle 0
22/07/29 18:42:21 INFO DAGScheduler: Got job 0 (collect at FindBadGenomicKmersSpark.java:178) with 2998 output partitions
22/07/29 18:42:21 INFO DAGScheduler: Final stage: ResultStage 1 (collect at FindBadGenomicKmersSpark.java:178)
22/07/29 18:42:21 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
22/07/29 18:42:21 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
22/07/29 18:42:21 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162), which has no missing parents
22/07/29 18:42:22 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 15.8 GB)
22/07/29 18:42:22 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.7 KB, free 15.8 GB)
22/07/29 18:42:22 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.27.31.79:33163 (size: 3.7 KB, free: 15.8 GB)
22/07/29 18:42:22 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1163
22/07/29 18:42:22 INFO DAGScheduler: Submitting 2998 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
22/07/29 18:42:22 INFO TaskSchedulerImpl: Adding task set 0.0 with 2998 tasks
22/07/29 18:42:22 WARN TaskSetManager: Stage 0 contains a task of very large size (1018 KB). The maximum recommended task size is 100 KB.
22/07/29 18:42:22 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 1043155 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, localhost, executor driver, partition 8, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, localhost, executor driver, partition 9, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, localhost, executor driver, partition 10, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, localhost, executor driver, partition 11, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, localhost, executor driver, partition 12, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, localhost, executor driver, partition 13, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 14, localhost, executor driver, partition 14, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 15, localhost, executor driver, partition 15, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 16, localhost, executor driver, partition 16, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 17.0 in stage 0.0 (TID 17, localhost, executor driver, partition 17, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 18.0 in stage 0.0 (TID 18, localhost, executor driver, partition 18, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 19.0 in stage 0.0 (TID 19, localhost, executor driver, partition 19, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 20.0 in stage 0.0 (TID 20, localhost, executor driver, partition 20, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 21.0 in stage 0.0 (TID 21, localhost, executor driver, partition 21, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 22.0 in stage 0.0 (TID 22, localhost, executor driver, partition 22, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 23.0 in stage 0.0 (TID 23, localhost, executor driver, partition 23, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 24.0 in stage 0.0 (TID 24, localhost, executor driver, partition 24, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 25.0 in stage 0.0 (TID 25, localhost, executor driver, partition 25, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 26.0 in stage 0.0 (TID 26, localhost, executor driver, partition 26, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 27.0 in stage 0.0 (TID 27, localhost, executor driver, partition 27, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 28.0 in stage 0.0 (TID 28, localhost, executor driver, partition 28, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 29.0 in stage 0.0 (TID 29, localhost, executor driver, partition 29, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 30.0 in stage 0.0 (TID 30, localhost, executor driver, partition 30, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 31.0 in stage 0.0 (TID 31, localhost, executor driver, partition 31, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 32.0 in stage 0.0 (TID 32, localhost, executor driver, partition 32, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 33.0 in stage 0.0 (TID 33, localhost, executor driver, partition 33, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 34.0 in stage 0.0 (TID 34, localhost, executor driver, partition 34, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 35.0 in stage 0.0 (TID 35, localhost, executor driver, partition 35, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 36.0 in stage 0.0 (TID 36, localhost, executor driver, partition 36, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 37.0 in stage 0.0 (TID 37, localhost, executor driver, partition 37, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 38.0 in stage 0.0 (TID 38, localhost, executor driver, partition 38, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, localhost, executor driver, partition 39, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 40.0 in stage 0.0 (TID 40, localhost, executor driver, partition 40, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 41.0 in stage 0.0 (TID 41, localhost, executor driver, partition 41, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 42.0 in stage 0.0 (TID 42, localhost, executor driver, partition 42, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 43.0 in stage 0.0 (TID 43, localhost, executor driver, partition 43, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 44.0 in stage 0.0 (TID 44, localhost, executor driver, partition 44, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 45.0 in stage 0.0 (TID 45, localhost, executor driver, partition 45, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 46.0 in stage 0.0 (TID 46, localhost, executor driver, partition 46, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO TaskSetManager: Starting task 47.0 in stage 0.0 (TID 47, localhost, executor driver, partition 47, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:22 INFO Executor: Running task 5.0 in stage 0.0 (TID 5)
22/07/29 18:42:22 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
22/07/29 18:42:22 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
22/07/29 18:42:22 INFO Executor: Running task 8.0 in stage 0.0 (TID 8)
22/07/29 18:42:22 INFO Executor: Running task 15.0 in stage 0.0 (TID 15)
22/07/29 18:42:22 INFO Executor: Running task 14.0 in stage 0.0 (TID 14)
22/07/29 18:42:22 INFO Executor: Running task 6.0 in stage 0.0 (TID 6)
22/07/29 18:42:22 INFO Executor: Running task 13.0 in stage 0.0 (TID 13)
22/07/29 18:42:22 INFO Executor: Running task 7.0 in stage 0.0 (TID 7)
22/07/29 18:42:22 INFO Executor: Running task 10.0 in stage 0.0 (TID 10)
22/07/29 18:42:22 INFO Executor: Running task 21.0 in stage 0.0 (TID 21)
22/07/29 18:42:22 INFO Executor: Running task 20.0 in stage 0.0 (TID 20)
22/07/29 18:42:22 INFO Executor: Running task 17.0 in stage 0.0 (TID 17)
22/07/29 18:42:22 INFO Executor: Running task 19.0 in stage 0.0 (TID 19)
22/07/29 18:42:22 INFO Executor: Running task 18.0 in stage 0.0 (TID 18)
22/07/29 18:42:22 INFO Executor: Running task 16.0 in stage 0.0 (TID 16)
22/07/29 18:42:22 INFO Executor: Running task 12.0 in stage 0.0 (TID 12)
22/07/29 18:42:22 INFO Executor: Running task 9.0 in stage 0.0 (TID 9)
22/07/29 18:42:22 INFO Executor: Running task 28.0 in stage 0.0 (TID 28)
22/07/29 18:42:22 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
22/07/29 18:42:22 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
22/07/29 18:42:22 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
22/07/29 18:42:22 INFO Executor: Running task 11.0 in stage 0.0 (TID 11)
22/07/29 18:42:22 INFO Executor: Running task 27.0 in stage 0.0 (TID 27)
22/07/29 18:42:22 INFO Executor: Running task 25.0 in stage 0.0 (TID 25)
22/07/29 18:42:22 INFO Executor: Running task 26.0 in stage 0.0 (TID 26)
22/07/29 18:42:22 INFO Executor: Running task 24.0 in stage 0.0 (TID 24)
22/07/29 18:42:22 INFO Executor: Running task 23.0 in stage 0.0 (TID 23)
22/07/29 18:42:22 INFO Executor: Running task 22.0 in stage 0.0 (TID 22)
22/07/29 18:42:22 INFO Executor: Running task 29.0 in stage 0.0 (TID 29)
22/07/29 18:42:22 INFO Executor: Running task 44.0 in stage 0.0 (TID 44)
22/07/29 18:42:22 INFO Executor: Running task 45.0 in stage 0.0 (TID 45)
22/07/29 18:42:22 INFO Executor: Running task 40.0 in stage 0.0 (TID 40)
22/07/29 18:42:22 INFO Executor: Running task 43.0 in stage 0.0 (TID 43)
22/07/29 18:42:22 INFO Executor: Running task 36.0 in stage 0.0 (TID 36)
22/07/29 18:42:22 INFO Executor: Running task 37.0 in stage 0.0 (TID 37)
22/07/29 18:42:22 INFO Executor: Running task 46.0 in stage 0.0 (TID 46)
22/07/29 18:42:22 INFO Executor: Running task 41.0 in stage 0.0 (TID 41)
22/07/29 18:42:22 INFO Executor: Running task 38.0 in stage 0.0 (TID 38)
22/07/29 18:42:22 INFO Executor: Running task 31.0 in stage 0.0 (TID 31)
22/07/29 18:42:22 INFO Executor: Running task 39.0 in stage 0.0 (TID 39)
22/07/29 18:42:22 INFO Executor: Running task 33.0 in stage 0.0 (TID 33)
22/07/29 18:42:22 INFO Executor: Running task 32.0 in stage 0.0 (TID 32)
22/07/29 18:42:22 INFO Executor: Running task 42.0 in stage 0.0 (TID 42)
22/07/29 18:42:22 INFO Executor: Running task 30.0 in stage 0.0 (TID 30)
22/07/29 18:42:22 INFO Executor: Running task 34.0 in stage 0.0 (TID 34)
22/07/29 18:42:22 INFO Executor: Running task 35.0 in stage 0.0 (TID 35)
22/07/29 18:42:22 INFO Executor: Running task 47.0 in stage 0.0 (TID 47)
22/07/29 18:42:26 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 924 bytes result sent to driver
22/07/29 18:42:26 INFO Executor: Finished task 38.0 in stage 0.0 (TID 38). 967 bytes result sent to driver
22/07/29 18:42:26 INFO Executor: Finished task 40.0 in stage 0.0 (TID 40). 924 bytes result sent to driver
22/07/29 18:42:26 INFO Executor: Finished task 21.0 in stage 0.0 (TID 21). 924 bytes result sent to driver
22/07/29 18:42:26 INFO TaskSetManager: Starting task 48.0 in stage 0.0 (TID 48, localhost, executor driver, partition 48, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:26 INFO Executor: Running task 48.0 in stage 0.0 (TID 48)
22/07/29 18:42:26 INFO TaskSetManager: Starting task 49.0 in stage 0.0 (TID 49, localhost, executor driver, partition 49, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:26 INFO Executor: Running task 49.0 in stage 0.0 (TID 49)
22/07/29 18:42:26 INFO TaskSetManager: Starting task 50.0 in stage 0.0 (TID 50, localhost, executor driver, partition 50, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:26 INFO Executor: Running task 50.0 in stage 0.0 (TID 50)
22/07/29 18:42:26 INFO TaskSetManager: Finished task 38.0 in stage 0.0 (TID 38) in 4287 ms on localhost (executor driver) (1/2998)
22/07/29 18:42:26 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 4549 ms on localhost (executor driver) (2/2998)
22/07/29 18:42:26 INFO TaskSetManager: Finished task 40.0 in stage 0.0 (TID 40) in 4313 ms on localhost (executor driver) (3/2998)
22/07/29 18:42:26 INFO TaskSetManager: Starting task 51.0 in stage 0.0 (TID 51, localhost, executor driver, partition 51, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:26 INFO Executor: Running task 51.0 in stage 0.0 (TID 51)
22/07/29 18:42:26 INFO TaskSetManager: Finished task 21.0 in stage 0.0 (TID 21) in 4429 ms on localhost (executor driver) (4/2998)
22/07/29 18:42:29 INFO Executor: Finished task 49.0 in stage 0.0 (TID 49). 924 bytes result sent to driver
22/07/29 18:42:29 INFO TaskSetManager: Starting task 52.0 in stage 0.0 (TID 52, localhost, executor driver, partition 52, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:29 INFO TaskSetManager: Finished task 49.0 in stage 0.0 (TID 49) in 3188 ms on localhost (executor driver) (5/2998)
22/07/29 18:42:29 INFO Executor: Running task 52.0 in stage 0.0 (TID 52)
22/07/29 18:42:32 INFO Executor: Finished task 52.0 in stage 0.0 (TID 52). 924 bytes result sent to driver
22/07/29 18:42:32 INFO TaskSetManager: Starting task 53.0 in stage 0.0 (TID 53, localhost, executor driver, partition 53, PROCESS_LOCAL, 1053206 bytes)
22/07/29 18:42:32 INFO TaskSetManager: Finished task 52.0 in stage 0.0 (TID 52) in 3073 ms on localhost (executor driver) (6/2998)
22/07/29 18:42:32 INFO Executor: Running task 53.0 in stage 0.0 (TID 53)
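Because the log simply stops after the last "Running task" line, it may be worth checking which tasks were started but never reported as finished; the final line is not necessarily the only task still in flight. The following is a minimal sketch and not part of GATK: it assumes the console output was saved to a file, and the function name `unfinished_tasks` and the file name `gatk.log` are made up for illustration.

```shell
# Hypothetical helper (not a GATK or Spark command): print the IDs of Spark
# tasks that have a "Running task" line but no later "Finished task" line.
# Assumes each log entry keeps its "(TID n)" part on the same line as the
# "Running task" / "Finished task" text.
unfinished_tasks() {
    log_file="$1"
    awk '
        /Running task/ {
            # Extract the digits that follow "TID " and mark the task as running.
            if (match($0, /TID [0-9]+/))
                running[substr($0, RSTART + 4, RLENGTH - 4)] = 1
        }
        /Finished task/ {
            # A finished task is no longer of interest.
            if (match($0, /TID [0-9]+/))
                delete running[substr($0, RSTART + 4, RLENGTH - 4)]
        }
        END {
            for (tid in running)
                print tid
        }
    ' "$log_file" | sort -n
}

# Example call, assuming the console output was saved as gatk.log:
#   unfinished_tasks gatk.log
```

The function prints one task ID per line in numeric order. A long list would suggest that many tasks are still running concurrently rather than a single task being stuck, which may help narrow down where the time is going.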
Thank you for your time.
Best Regards
Wei Yuan
-
Hi Cher Wei Yuan,
Thank you for writing into the GATK forum about this tool! I hope that we can help you solve this issue.
If you are running a Spark tool locally (without a spark cluster), we recommend that you follow these guidelines: https://github.com/broadinstitute/gatk#running-gatk4-spark-tools-locally. I believe that by using the options outlined in that README, the Spark tool will run much better on your machine.
Please let us know how that goes. If it is still not working, we can troubleshoot from there.
Best,
Genevieve
-
Dear Genevieve
I tried using 1, 4 and 12 threads, but the program still runs indefinitely. May I seek your help in troubleshooting?
Here is the printout from using 4 threads:
gatk FindBadGenomicKmersSpark \
> --spark-runner LOCAL \
> --spark-master 'local[4]' \
> -R $dir/sv_resources/genome/GRCH37-lite.fa \
> -O $dir/sv_resources/misc/kmers_to_ignore.txt
Using GATK jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar FindBadGenomicKmersSpark --spark-master local[4] -R /mnt/e/variant_calling/sv_pipeline/sv_resources/genome/GRCH37-lite.fa -O /mnt/e/variant_calling/sv_pipeline/sv_resources/misc/kmers_to_ignore.txt
17:05:20.424 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
17:05:20.566 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
17:05:20.567 INFO FindBadGenomicKmersSpark - The Genome Analysis Toolkit (GATK) v4.2.6.0
17:05:20.567 INFO FindBadGenomicKmersSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
17:05:20.567 INFO FindBadGenomicKmersSpark - Executing as weiyuan@IMC-SPD-EOM120 on Linux v5.10.102.1-microsoft-standard-WSL2 amd64
17:05:20.567 INFO FindBadGenomicKmersSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
17:05:20.567 INFO FindBadGenomicKmersSpark - Start Date/Time: August 3, 2022 5:05:20 PM SGT
17:05:20.568 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
17:05:20.568 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
17:05:20.568 INFO FindBadGenomicKmersSpark - HTSJDK Version: 2.24.1
17:05:20.568 INFO FindBadGenomicKmersSpark - Picard Version: 2.27.1
17:05:20.568 INFO FindBadGenomicKmersSpark - Built for Spark Version: 2.4.5
17:05:20.569 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
17:05:20.569 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
17:05:20.569 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
17:05:20.569 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
17:05:20.569 INFO FindBadGenomicKmersSpark - Deflater: IntelDeflater
17:05:20.569 INFO FindBadGenomicKmersSpark - Inflater: IntelInflater
17:05:20.569 INFO FindBadGenomicKmersSpark - GCS max retries/reopens: 20
17:05:20.569 INFO FindBadGenomicKmersSpark - Requester pays: disabled
17:05:20.569 WARN FindBadGenomicKmersSpark -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Warning: FindBadGenomicKmersSpark is a BETA tool and is not yet ready for use in production
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
17:05:20.569 INFO FindBadGenomicKmersSpark - Initializing engine
17:05:20.570 INFO FindBadGenomicKmersSpark - Done initializing engine
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/08/03 17:05:20 WARN Utils: Your hostname, IMC-SPD-EOM120 resolves to a loopback address: 127.0.1.1; using 172.27.18.91 instead (on interface eth0)
22/08/03 17:05:20 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
22/08/03 17:05:21 INFO SparkContext: Running Spark version 2.4.5
22/08/03 17:05:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/08/03 17:05:22 INFO SparkContext: Submitted application: FindBadGenomicKmersSpark
22/08/03 17:05:22 INFO SecurityManager: Changing view acls to: weiyuan
22/08/03 17:05:22 INFO SecurityManager: Changing modify acls to: weiyuan
22/08/03 17:05:22 INFO SecurityManager: Changing view acls groups to:
22/08/03 17:05:22 INFO SecurityManager: Changing modify acls groups to:
22/08/03 17:05:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(weiyuan); groups with view permissions: Set(); users with modify permissions: Set(weiyuan); groups with modify permissions: Set()
22/08/03 17:05:22 INFO Utils: Successfully started service 'sparkDriver' on port 41925.
22/08/03 17:05:22 INFO SparkEnv: Registering MapOutputTracker
22/08/03 17:05:22 INFO SparkEnv: Registering BlockManagerMaster
22/08/03 17:05:22 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/08/03 17:05:22 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/08/03 17:05:22 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-db8bfc05-ae9e-4d0e-a266-2d923c3c5074
22/08/03 17:05:22 INFO MemoryStore: MemoryStore started with capacity 15.8 GB
22/08/03 17:05:22 INFO SparkEnv: Registering OutputCommitCoordinator
22/08/03 17:05:22 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/08/03 17:05:22 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.27.18.91:4040
22/08/03 17:05:22 INFO Executor: Starting executor ID driver on host localhost
22/08/03 17:05:22 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40331.
22/08/03 17:05:22 INFO NettyBlockTransferService: Server created on 172.27.18.91:40331
22/08/03 17:05:22 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/08/03 17:05:22 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.27.18.91, 40331, None)
22/08/03 17:05:22 INFO BlockManagerMasterEndpoint: Registering block manager 172.27.18.91:40331 with 15.8 GB RAM, BlockManagerId(driver, 172.27.18.91, 40331, None)
22/08/03 17:05:22 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.27.18.91, 40331, None)
22/08/03 17:05:22 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 172.27.18.91, 40331, None)
17:05:22.993 INFO FindBadGenomicKmersSpark - Spark verbosity set to INFO (see --spark-verbosity argument)
22/08/03 17:05:54 INFO SparkContext: Starting job: collect at FindBadGenomicKmersSpark.java:178
22/08/03 17:05:55 INFO DAGScheduler: Registering RDD 2 (mapToPair at FindBadGenomicKmersSpark.java:162) as input to shuffle 0
22/08/03 17:05:55 INFO DAGScheduler: Got job 0 (collect at FindBadGenomicKmersSpark.java:178) with 2998 output partitions
22/08/03 17:05:55 INFO DAGScheduler: Final stage: ResultStage 1 (collect at FindBadGenomicKmersSpark.java:178)
22/08/03 17:05:55 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
22/08/03 17:05:55 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
22/08/03 17:05:55 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162), which has no missing parents
22/08/03 17:05:55 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 15.8 GB)
22/08/03 17:05:55 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.7 KB, free 15.8 GB)
22/08/03 17:05:55 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.27.18.91:40331 (size: 3.7 KB, free: 15.8 GB)
22/08/03 17:05:55 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1163
22/08/03 17:05:55 INFO DAGScheduler: Submitting 2998 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
22/08/03 17:05:55 INFO TaskSchedulerImpl: Adding task set 0.0 with 2998 tasks
22/08/03 17:05:55 WARN TaskSetManager: Stage 0 contains a task of very large size (1018 KB). The maximum recommended task size is 100 KB.
22/08/03 17:05:55 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 1043155 bytes)
22/08/03 17:05:55 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 1053206 bytes)
22/08/03 17:05:55 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 1053206 bytes)
22/08/03 17:05:55 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 1053206 bytes)
22/08/03 17:05:55 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
22/08/03 17:05:55 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
22/08/03 17:05:55 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
22/08/03 17:05:55 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
22/08/03 17:05:57 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 924 bytes result sent to driver
22/08/03 17:05:57 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 1053206 bytes)
22/08/03 17:05:57 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
22/08/03 17:05:58 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 2486 ms on localhost (executor driver) (1/2998)
Best Regards
Wei Yuan
-
Thanks Wei Yuan for trying that so quickly and getting back to us!
Could you also try adding the java -Xmx option to your command line to limit the amount of memory java uses?
Explanation about -Xmx:
How to use it in the GATK command line:
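As a minimal sketch (not the content of the links above): `-Xmx` sets the maximum JVM heap size, and the `gatk` wrapper script passes JVM options through `--java-options`. The heap size and file names below are placeholders rather than values from this thread, and the command is printed instead of executed so it can be checked first.

```shell
# Sketch: cap the JVM heap for a GATK Spark tool via --java-options.
# HEAP, REF and OUT are placeholder values (assumptions), not from the thread.
HEAP="16G"
REF="reference.fa"
OUT="kmers_to_ignore.txt"

gatk_cmd() {
  # Print the command instead of running it, so it can be inspected first.
  printf 'gatk FindBadGenomicKmersSpark --java-options "-Xmx%s" -R %s -O %s\n' \
    "$HEAP" "$REF" "$OUT"
}

gatk_cmd
```

Once the printed command looks right, it can be pasted into the shell and run as-is.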
-
Dear Genevieve
I tried several settings (-Xmx4G/12G/20G with 48 threads, -Xmx100G with 1/4/24 threads, and -Xmx200G with 48 threads), but the run never got past task 53.0 in stage 0.0 (TID 53), where it hangs indefinitely (log shown below).
Perhaps it would be easier if the reference genomes (GRCh37 and GRCh38) and their FindBadGenomicKmersSpark outputs could be uploaded to the GATK Google Cloud resource. This would let most of us start using StructuralVariationDiscoveryPipelineSpark immediately. Let me know if this suggestion works. Thank you!
(gatk4) weiyuan@IMC-SPD-EOM120:/mnt/e/variant_calling/sv_pipeline$ gatk FindBadGenomicKmersSpark \
> --java-options "-Xmx200G" \
> --spark-runner LOCAL \
> --spark-master 'local[*]' \
> -R $dir/sv_resources/genome/GRCH37-lite.fa \
> -O $dir/sv_resources/misc/kmers_to_ignore.txt
Using GATK jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx200G -jar /home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar FindBadGenomicKmersSpark --spark-master local[*] -R /mnt/e/variant_calling/sv_pipeline/sv_resources/genome/GRCH37-lite.fa -O /mnt/e/variant_calling/sv_pipeline/sv_resources/misc/kmers_to_ignore.txt
12:05:05.554 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/weiyuan/miniconda3/envs/gatk4/share/gatk4-4.2.6.0-0/gatk-package-4.2.6.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
12:05:05.673 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
12:05:05.673 INFO FindBadGenomicKmersSpark - The Genome Analysis Toolkit (GATK) v4.2.6.0
12:05:05.673 INFO FindBadGenomicKmersSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
12:05:05.673 INFO FindBadGenomicKmersSpark - Executing as weiyuan@IMC-SPD-EOM120 on Linux v5.10.102.1-microsoft-standard-WSL2 amd64
12:05:05.673 INFO FindBadGenomicKmersSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
12:05:05.674 INFO FindBadGenomicKmersSpark - Start Date/Time: August 4, 2022 12:05:05 PM SGT
12:05:05.674 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
12:05:05.674 INFO FindBadGenomicKmersSpark - ------------------------------------------------------------
12:05:05.674 INFO FindBadGenomicKmersSpark - HTSJDK Version: 2.24.1
12:05:05.674 INFO FindBadGenomicKmersSpark - Picard Version: 2.27.1
12:05:05.674 INFO FindBadGenomicKmersSpark - Built for Spark Version: 2.4.5
12:05:05.675 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:05:05.675 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:05:05.675 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:05:05.675 INFO FindBadGenomicKmersSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:05:05.675 INFO FindBadGenomicKmersSpark - Deflater: IntelDeflater
12:05:05.675 INFO FindBadGenomicKmersSpark - Inflater: IntelInflater
12:05:05.675 INFO FindBadGenomicKmersSpark - GCS max retries/reopens: 20
12:05:05.675 INFO FindBadGenomicKmersSpark - Requester pays: disabled
12:05:05.676 WARN FindBadGenomicKmersSpark -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Warning: FindBadGenomicKmersSpark is a BETA tool and is not yet ready for use in production
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
12:05:05.676 INFO FindBadGenomicKmersSpark - Initializing engine
12:05:05.676 INFO FindBadGenomicKmersSpark - Done initializing engine
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/08/04 12:05:05 WARN Utils: Your hostname, IMC-SPD-EOM120 resolves to a loopback address: 127.0.1.1; using 172.27.18.91 instead (on interface eth0)
22/08/04 12:05:05 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
22/08/04 12:05:06 INFO SparkContext: Running Spark version 2.4.5
22/08/04 12:05:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/08/04 12:05:07 INFO SparkContext: Submitted application: FindBadGenomicKmersSpark
22/08/04 12:05:07 INFO SecurityManager: Changing view acls to: weiyuan
22/08/04 12:05:07 INFO SecurityManager: Changing modify acls to: weiyuan
22/08/04 12:05:07 INFO SecurityManager: Changing view acls groups to:
22/08/04 12:05:07 INFO SecurityManager: Changing modify acls groups to:
22/08/04 12:05:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(weiyuan); groups with view permissions: Set(); users with modify permissions: Set(weiyuan); groups with modify permissions: Set()
22/08/04 12:05:07 INFO Utils: Successfully started service 'sparkDriver' on port 34615.
22/08/04 12:05:07 INFO SparkEnv: Registering MapOutputTracker
22/08/04 12:05:07 INFO SparkEnv: Registering BlockManagerMaster
22/08/04 12:05:07 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/08/04 12:05:07 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/08/04 12:05:07 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-8b29a948-0bc1-4dd5-acba-4dea04a4b3b8
22/08/04 12:05:07 INFO MemoryStore: MemoryStore started with capacity 106.5 GB
22/08/04 12:05:07 INFO SparkEnv: Registering OutputCommitCoordinator
22/08/04 12:05:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/08/04 12:05:07 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.27.18.91:4040
22/08/04 12:05:07 INFO Executor: Starting executor ID driver on host localhost
22/08/04 12:05:07 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41571.
22/08/04 12:05:07 INFO NettyBlockTransferService: Server created on 172.27.18.91:41571
22/08/04 12:05:07 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/08/04 12:05:07 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.27.18.91, 41571, None)
22/08/04 12:05:07 INFO BlockManagerMasterEndpoint: Registering block manager 172.27.18.91:41571 with 106.5 GB RAM, BlockManagerId(driver, 172.27.18.91, 41571, None)
22/08/04 12:05:07 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.27.18.91, 41571, None)
22/08/04 12:05:07 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 172.27.18.91, 41571, None)
12:05:08.052 INFO FindBadGenomicKmersSpark - Spark verbosity set to INFO (see --spark-verbosity argument)
22/08/04 12:05:42 INFO SparkContext: Starting job: collect at FindBadGenomicKmersSpark.java:178
22/08/04 12:05:43 INFO DAGScheduler: Registering RDD 2 (mapToPair at FindBadGenomicKmersSpark.java:162) as input to shuffle 0
22/08/04 12:05:43 INFO DAGScheduler: Got job 0 (collect at FindBadGenomicKmersSpark.java:178) with 2998 output partitions
22/08/04 12:05:43 INFO DAGScheduler: Final stage: ResultStage 1 (collect at FindBadGenomicKmersSpark.java:178)
22/08/04 12:05:43 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
22/08/04 12:05:43 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
22/08/04 12:05:43 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162), which has no missing parents
22/08/04 12:05:43 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 106.5 GB)
22/08/04 12:05:43 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.7 KB, free 106.5 GB)
22/08/04 12:05:43 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.27.18.91:41571 (size: 3.7 KB, free: 106.5 GB)
22/08/04 12:05:43 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1163
22/08/04 12:05:43 INFO DAGScheduler: Submitting 2998 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at mapToPair at FindBadGenomicKmersSpark.java:162) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
22/08/04 12:05:43 INFO TaskSchedulerImpl: Adding task set 0.0 with 2998 tasks
22/08/04 12:05:43 WARN TaskSetManager: Stage 0 contains a task of very large size (1018 KB). The maximum recommended task size is 100 KB.
22/08/04 12:05:43 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 1043155 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, localhost, executor driver, partition 8, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, localhost, executor driver, partition 9, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, localhost, executor driver, partition 10, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, localhost, executor driver, partition 11, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, localhost, executor driver, partition 12, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, localhost, executor driver, partition 13, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 14, localhost, executor driver, partition 14, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 15, localhost, executor driver, partition 15, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 16, localhost, executor driver, partition 16, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 17.0 in stage 0.0 (TID 17, localhost, executor driver, partition 17, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 18.0 in stage 0.0 (TID 18, localhost, executor driver, partition 18, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 19.0 in stage 0.0 (TID 19, localhost, executor driver, partition 19, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 20.0 in stage 0.0 (TID 20, localhost, executor driver, partition 20, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 21.0 in stage 0.0 (TID 21, localhost, executor driver, partition 21, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 22.0 in stage 0.0 (TID 22, localhost, executor driver, partition 22, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 23.0 in stage 0.0 (TID 23, localhost, executor driver, partition 23, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 24.0 in stage 0.0 (TID 24, localhost, executor driver, partition 24, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 25.0 in stage 0.0 (TID 25, localhost, executor driver, partition 25, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 26.0 in stage 0.0 (TID 26, localhost, executor driver, partition 26, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 27.0 in stage 0.0 (TID 27, localhost, executor driver, partition 27, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 28.0 in stage 0.0 (TID 28, localhost, executor driver, partition 28, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 29.0 in stage 0.0 (TID 29, localhost, executor driver, partition 29, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 30.0 in stage 0.0 (TID 30, localhost, executor driver, partition 30, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 31.0 in stage 0.0 (TID 31, localhost, executor driver, partition 31, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 32.0 in stage 0.0 (TID 32, localhost, executor driver, partition 32, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 33.0 in stage 0.0 (TID 33, localhost, executor driver, partition 33, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 34.0 in stage 0.0 (TID 34, localhost, executor driver, partition 34, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 35.0 in stage 0.0 (TID 35, localhost, executor driver, partition 35, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 36.0 in stage 0.0 (TID 36, localhost, executor driver, partition 36, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 37.0 in stage 0.0 (TID 37, localhost, executor driver, partition 37, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 38.0 in stage 0.0 (TID 38, localhost, executor driver, partition 38, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, localhost, executor driver, partition 39, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 40.0 in stage 0.0 (TID 40, localhost, executor driver, partition 40, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 41.0 in stage 0.0 (TID 41, localhost, executor driver, partition 41, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 42.0 in stage 0.0 (TID 42, localhost, executor driver, partition 42, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 43.0 in stage 0.0 (TID 43, localhost, executor driver, partition 43, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:43 INFO TaskSetManager: Starting task 44.0 in stage 0.0 (TID 44, localhost, executor driver, partition 44, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:44 INFO TaskSetManager: Starting task 45.0 in stage 0.0 (TID 45, localhost, executor driver, partition 45, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:44 INFO TaskSetManager: Starting task 46.0 in stage 0.0 (TID 46, localhost, executor driver, partition 46, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:44 INFO TaskSetManager: Starting task 47.0 in stage 0.0 (TID 47, localhost, executor driver, partition 47, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:44 INFO Executor: Running task 14.0 in stage 0.0 (TID 14)
22/08/04 12:05:44 INFO Executor: Running task 28.0 in stage 0.0 (TID 28)
22/08/04 12:05:44 INFO Executor: Running task 26.0 in stage 0.0 (TID 26)
22/08/04 12:05:44 INFO Executor: Running task 27.0 in stage 0.0 (TID 27)
22/08/04 12:05:44 INFO Executor: Running task 25.0 in stage 0.0 (TID 25)
22/08/04 12:05:44 INFO Executor: Running task 24.0 in stage 0.0 (TID 24)
22/08/04 12:05:44 INFO Executor: Running task 23.0 in stage 0.0 (TID 23)
22/08/04 12:05:44 INFO Executor: Running task 22.0 in stage 0.0 (TID 22)
22/08/04 12:05:44 INFO Executor: Running task 21.0 in stage 0.0 (TID 21)
22/08/04 12:05:44 INFO Executor: Running task 7.0 in stage 0.0 (TID 7)
22/08/04 12:05:44 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
22/08/04 12:05:44 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
22/08/04 12:05:44 INFO Executor: Running task 6.0 in stage 0.0 (TID 6)
22/08/04 12:05:44 INFO Executor: Running task 42.0 in stage 0.0 (TID 42)
22/08/04 12:05:44 INFO Executor: Running task 43.0 in stage 0.0 (TID 43)
22/08/04 12:05:44 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
22/08/04 12:05:44 INFO Executor: Running task 45.0 in stage 0.0 (TID 45)
22/08/04 12:05:44 INFO Executor: Running task 11.0 in stage 0.0 (TID 11)
22/08/04 12:05:44 INFO Executor: Running task 5.0 in stage 0.0 (TID 5)
22/08/04 12:05:44 INFO Executor: Running task 9.0 in stage 0.0 (TID 9)
22/08/04 12:05:44 INFO Executor: Running task 19.0 in stage 0.0 (TID 19)
22/08/04 12:05:44 INFO Executor: Running task 8.0 in stage 0.0 (TID 8)
22/08/04 12:05:44 INFO Executor: Running task 15.0 in stage 0.0 (TID 15)
22/08/04 12:05:44 INFO Executor: Running task 20.0 in stage 0.0 (TID 20)
22/08/04 12:05:44 INFO Executor: Running task 16.0 in stage 0.0 (TID 16)
22/08/04 12:05:44 INFO Executor: Running task 10.0 in stage 0.0 (TID 10)
22/08/04 12:05:44 INFO Executor: Running task 17.0 in stage 0.0 (TID 17)
22/08/04 12:05:44 INFO Executor: Running task 13.0 in stage 0.0 (TID 13)
22/08/04 12:05:44 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
22/08/04 12:05:44 INFO Executor: Running task 18.0 in stage 0.0 (TID 18)
22/08/04 12:05:44 INFO Executor: Running task 12.0 in stage 0.0 (TID 12)
22/08/04 12:05:44 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
22/08/04 12:05:44 INFO Executor: Running task 46.0 in stage 0.0 (TID 46)
22/08/04 12:05:44 INFO Executor: Running task 47.0 in stage 0.0 (TID 47)
22/08/04 12:05:44 INFO Executor: Running task 40.0 in stage 0.0 (TID 40)
22/08/04 12:05:44 INFO Executor: Running task 39.0 in stage 0.0 (TID 39)
22/08/04 12:05:44 INFO Executor: Running task 36.0 in stage 0.0 (TID 36)
22/08/04 12:05:44 INFO Executor: Running task 38.0 in stage 0.0 (TID 38)
22/08/04 12:05:44 INFO Executor: Running task 44.0 in stage 0.0 (TID 44)
22/08/04 12:05:44 INFO Executor: Running task 41.0 in stage 0.0 (TID 41)
22/08/04 12:05:44 INFO Executor: Running task 30.0 in stage 0.0 (TID 30)
22/08/04 12:05:44 INFO Executor: Running task 37.0 in stage 0.0 (TID 37)
22/08/04 12:05:44 INFO Executor: Running task 35.0 in stage 0.0 (TID 35)
22/08/04 12:05:44 INFO Executor: Running task 34.0 in stage 0.0 (TID 34)
22/08/04 12:05:44 INFO Executor: Running task 32.0 in stage 0.0 (TID 32)
22/08/04 12:05:44 INFO Executor: Running task 33.0 in stage 0.0 (TID 33)
22/08/04 12:05:44 INFO Executor: Running task 31.0 in stage 0.0 (TID 31)
22/08/04 12:05:44 INFO Executor: Running task 29.0 in stage 0.0 (TID 29)
22/08/04 12:05:48 INFO Executor: Finished task 38.0 in stage 0.0 (TID 38). 924 bytes result sent to driver
22/08/04 12:05:48 INFO Executor: Finished task 40.0 in stage 0.0 (TID 40). 924 bytes result sent to driver
22/08/04 12:05:48 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 924 bytes result sent to driver
22/08/04 12:05:48 INFO TaskSetManager: Starting task 48.0 in stage 0.0 (TID 48, localhost, executor driver, partition 48, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:48 INFO Executor: Running task 48.0 in stage 0.0 (TID 48)
22/08/04 12:05:48 INFO Executor: Finished task 21.0 in stage 0.0 (TID 21). 924 bytes result sent to driver
22/08/04 12:05:48 INFO TaskSetManager: Starting task 49.0 in stage 0.0 (TID 49, localhost, executor driver, partition 49, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:48 INFO Executor: Running task 49.0 in stage 0.0 (TID 49)
22/08/04 12:05:48 INFO TaskSetManager: Starting task 50.0 in stage 0.0 (TID 50, localhost, executor driver, partition 50, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:48 INFO Executor: Running task 50.0 in stage 0.0 (TID 50)
22/08/04 12:05:48 INFO TaskSetManager: Finished task 38.0 in stage 0.0 (TID 38) in 4645 ms on localhost (executor driver) (1/2998)
22/08/04 12:05:48 INFO TaskSetManager: Finished task 40.0 in stage 0.0 (TID 40) in 4639 ms on localhost (executor driver) (2/2998)
22/08/04 12:05:48 INFO TaskSetManager: Starting task 51.0 in stage 0.0 (TID 51, localhost, executor driver, partition 51, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:48 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 4917 ms on localhost (executor driver) (3/2998)
22/08/04 12:05:48 INFO Executor: Running task 51.0 in stage 0.0 (TID 51)
22/08/04 12:05:48 INFO TaskSetManager: Finished task 21.0 in stage 0.0 (TID 21) in 4774 ms on localhost (executor driver) (4/2998)
22/08/04 12:05:51 INFO Executor: Finished task 49.0 in stage 0.0 (TID 49). 924 bytes result sent to driver
22/08/04 12:05:51 INFO TaskSetManager: Starting task 52.0 in stage 0.0 (TID 52, localhost, executor driver, partition 52, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:51 INFO TaskSetManager: Finished task 49.0 in stage 0.0 (TID 49) in 3140 ms on localhost (executor driver) (5/2998)
22/08/04 12:05:51 INFO Executor: Running task 52.0 in stage 0.0 (TID 52)
22/08/04 12:05:54 INFO Executor: Finished task 52.0 in stage 0.0 (TID 52). 881 bytes result sent to driver
22/08/04 12:05:54 INFO TaskSetManager: Starting task 53.0 in stage 0.0 (TID 53, localhost, executor driver, partition 53, PROCESS_LOCAL, 1053206 bytes)
22/08/04 12:05:54 INFO TaskSetManager: Finished task 52.0 in stage 0.0 (TID 52) in 2963 ms on localhost (executor driver) (6/2998)
22/08/04 12:05:54 INFO Executor: Running task 53.0 in stage 0.0 (TID 53)
Thank you again, Genevieve!
Best Regards
Wei Yuan
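
For anyone triaging a similar hang: in the log above only six tasks are reported as finished, so task 53.0 is simply the last one to start, and the tasks from the first batch that never report "Finished" appear to be stuck as well. Below is a minimal sketch for listing started-but-unfinished tasks from a saved log. It assumes the Spark 2.4.5 line layout shown above, unwrapped lines, and a single stage; `unfinished_tasks` and `spark.log` are hypothetical names.

```shell
# Sketch: list task IDs that have an "Executor: Running task" line but no
# matching "Executor: Finished task" line in a saved Spark log.
# Assumes whitespace-separated fields as in the log above:
#   $4 = "Executor:", $5 = "Running"/"Finished", $6 = "task", $7 = task id
# Task ids are not qualified by stage, so this only suits single-stage logs.
unfinished_tasks() {
  awk '
    $4 == "Executor:" && $5 == "Running"  && $6 == "task" { running[$7] = 1 }
    $4 == "Executor:" && $5 == "Finished" && $6 == "task" { delete running[$7] }
    END { for (t in running) print t }
  ' "$1" | sort -n
}
```

Usage: `unfinished_tasks spark.log` prints one task ID per line in numeric order. If that list stays the same size as the number of worker threads, the whole pool is likely blocked rather than one particular partition.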
-
Hi Wei Yuan,
Thank you for the update and for giving more insight into your use case. Unfortunately, the GATK Spark SV pipeline is no longer maintained; it has been superseded by the much improved GATK-SV pipeline. The Spark SV pipeline tools are all in BETA, and we are not able to provide much support for using them.
I highly recommend that you switch over to the GATK-SV pipeline for an improved experience and better results. Here is more information: https://github.com/broadinstitute/gatk-sv.
If you really need to use the Spark SV tools, our team might be able to look more closely, but it will take some time to come to an answer about this. Let me know!
Best,
Genevieve