BaseRecalibratorSpark error: half of the files worked without any error
If you are seeing an error, please provide (REQUIRED):
a) GATK version used: 4.2.3.0
b) Exact command used: gatk BaseRecalibratorSpark -R /DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa -I PRADO_data/7B_dedupraw/S2697Nr1_dedup.bam --known-sites /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf -O PRADO_data/8C_bqsr/S2697Nr1_recal.table --spark-master local[4]
c) Entire error log:
gatk BaseRecalibratorSpark -R /DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa -I PRADO_data/7B_dedupraw/S2697Nr1_dedup.bam --known-sites /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf -O PRADO_data/8C_bqsr/S2697Nr1_recal.table --spark-master local[4]
Using GATK jar /DATA/General_Resources/Tools/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /DATA/General_Resources/Tools/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar BaseRecalibratorSpark -R /DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa -I PRADO_data/7B_dedupraw/S2697Nr1_dedup.bam --known-sites /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf -O PRADO_data/8C_bqsr/S2697Nr1_recal.table --spark-master local[4]
13:10:06.861 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/DATA/General_Resources/Tools/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jan 17, 2022 1:10:06 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
13:10:06.984 INFO BaseRecalibratorSpark - ------------------------------------------------------------
13:10:06.984 INFO BaseRecalibratorSpark - The Genome Analysis Toolkit (GATK) v4.2.3.0
13:10:06.984 INFO BaseRecalibratorSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
13:10:06.984 INFO BaseRecalibratorSpark - Executing as h.shehwana@wall-e on Linux v5.4.0-91-generic amd64
13:10:06.984 INFO BaseRecalibratorSpark - Java runtime: OpenJDK 64-Bit Server VM v11.0.13+8-Ubuntu-0ubuntu1.20.04
13:10:06.985 INFO BaseRecalibratorSpark - Start Date/Time: January 17, 2022 at 1:10:06 PM CET
13:10:06.985 INFO BaseRecalibratorSpark - ------------------------------------------------------------
13:10:06.985 INFO BaseRecalibratorSpark - ------------------------------------------------------------
13:10:06.985 INFO BaseRecalibratorSpark - HTSJDK Version: 2.24.1
13:10:06.985 INFO BaseRecalibratorSpark - Picard Version: 2.25.4
13:10:06.985 INFO BaseRecalibratorSpark - Built for Spark Version: 2.4.5
13:10:06.985 INFO BaseRecalibratorSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:10:06.986 INFO BaseRecalibratorSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:10:06.986 INFO BaseRecalibratorSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:10:06.986 INFO BaseRecalibratorSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:10:06.986 INFO BaseRecalibratorSpark - Deflater: IntelDeflater
13:10:06.986 INFO BaseRecalibratorSpark - Inflater: IntelInflater
13:10:06.986 INFO BaseRecalibratorSpark - GCS max retries/reopens: 20
13:10:06.986 INFO BaseRecalibratorSpark - Requester pays: disabled
13:10:06.986 WARN BaseRecalibratorSpark -
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Warning: BaseRecalibratorSpark is a BETA tool and is not yet ready for use in production
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
13:10:06.986 INFO BaseRecalibratorSpark - Initializing engine
13:10:06.986 INFO BaseRecalibratorSpark - Done initializing engine
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/01/17 13:10:07 WARN Utils: Your hostname, wall-e resolves to a loopback address: 127.0.1.1; using 192.168.200.233 instead (on interface eno1)
22/01/17 13:10:07 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/DATA/General_Resources/Tools/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar) to method java.nio.Bits.unaligned()
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
22/01/17 13:10:07 INFO SparkContext: Running Spark version 2.4.5
22/01/17 13:10:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/01/17 13:10:07 INFO SparkContext: Submitted application: BaseRecalibratorSpark
22/01/17 13:10:07 INFO SecurityManager: Changing view acls to: h.shehwana
22/01/17 13:10:07 INFO SecurityManager: Changing modify acls to: h.shehwana
22/01/17 13:10:07 INFO SecurityManager: Changing view acls groups to:
22/01/17 13:10:07 INFO SecurityManager: Changing modify acls groups to:
22/01/17 13:10:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(h.shehwana); groups with view permissions: Set(); users with modify permissions: Set(h.shehwana); groups with modify permissions: Set()
22/01/17 13:10:07 INFO Utils: Successfully started service 'sparkDriver' on port 34895.
22/01/17 13:10:07 INFO SparkEnv: Registering MapOutputTracker
22/01/17 13:10:07 INFO SparkEnv: Registering BlockManagerMaster
22/01/17 13:10:07 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/01/17 13:10:07 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/01/17 13:10:07 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-b0f62696-025b-4f6a-adb5-5098beb5505f
22/01/17 13:10:07 INFO MemoryStore: MemoryStore started with capacity 17.8 GB
22/01/17 13:10:07 INFO SparkEnv: Registering OutputCommitCoordinator
22/01/17 13:10:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/01/17 13:10:07 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.200.233:4040
22/01/17 13:10:07 INFO Executor: Starting executor ID driver on host localhost
22/01/17 13:10:07 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 33293.
22/01/17 13:10:07 INFO NettyBlockTransferService: Server created on 192.168.200.233:33293
22/01/17 13:10:07 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/01/17 13:10:07 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.200.233, 33293, None)
22/01/17 13:10:07 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.200.233:33293 with 17.8 GB RAM, BlockManagerId(driver, 192.168.200.233, 33293, None)
22/01/17 13:10:07 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.200.233, 33293, None)
22/01/17 13:10:07 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.200.233, 33293, None)
13:10:07.959 INFO BaseRecalibratorSpark - Spark verbosity set to INFO (see --spark-verbosity argument)
22/01/17 13:10:08 INFO GoogleHadoopFileSystemBase: GHFS version: 1.9.4-hadoop3
22/01/17 13:10:08 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 172.9 KB, free 17.8 GB)
22/01/17 13:10:08 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 35.4 KB, free 17.8 GB)
22/01/17 13:10:08 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.200.233:33293 (size: 35.4 KB, free: 17.8 GB)
22/01/17 13:10:08 INFO SparkContext: Created broadcast 0 from newAPIHadoopFile at PathSplitSource.java:96
22/01/17 13:10:08 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.200.233:33293 in memory (size: 35.4 KB, free: 17.8 GB)
22/01/17 13:10:08 INFO SparkContext: Added file file:///DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa at file:///DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa with timestamp 1642421408951
22/01/17 13:10:08 INFO Utils: Copying /DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa to /tmp/spark-78d11971-b432-4342-8481-a2ac5daa8312/userFiles-3183fa15-760f-414c-a479-92f45293a294/GRCh38.d1.vd1.fa
22/01/17 13:10:10 INFO SparkContext: Added file file:///DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa.fai at file:///DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa.fai with timestamp 1642421410926
22/01/17 13:10:10 INFO Utils: Copying /DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa.fai to /tmp/spark-78d11971-b432-4342-8481-a2ac5daa8312/userFiles-3183fa15-760f-414c-a479-92f45293a294/GRCh38.d1.vd1.fa.fai
22/01/17 13:10:10 INFO SparkContext: Added file file:///DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.dict at file:///DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.dict with timestamp 1642421410932
22/01/17 13:10:10 INFO Utils: Copying /DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.dict to /tmp/spark-78d11971-b432-4342-8481-a2ac5daa8312/userFiles-3183fa15-760f-414c-a479-92f45293a294/GRCh38.d1.vd1.dict
22/01/17 13:10:10 INFO SparkContext: Added file /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf at file:/DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf with timestamp 1642421410939
22/01/17 13:10:10 INFO Utils: Copying /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf to /tmp/spark-78d11971-b432-4342-8481-a2ac5daa8312/userFiles-3183fa15-760f-414c-a479-92f45293a294/Homo_sapiens_assembly38.dbsnp138.vcf
22/01/17 13:10:18 INFO SparkContext: Added file /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf.idx at file:/DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf.idx with timestamp 1642421418617
22/01/17 13:10:18 INFO Utils: Copying /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf.idx to /tmp/spark-78d11971-b432-4342-8481-a2ac5daa8312/userFiles-3183fa15-760f-414c-a479-92f45293a294/Homo_sapiens_assembly38.dbsnp138.vcf.idx
22/01/17 13:10:18 INFO SparkUI: Stopped Spark web UI at http://192.168.200.233:4040
22/01/17 13:10:18 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/01/17 13:10:18 INFO MemoryStore: MemoryStore cleared
22/01/17 13:10:18 INFO BlockManager: BlockManager stopped
22/01/17 13:10:18 INFO BlockManagerMaster: BlockManagerMaster stopped
22/01/17 13:10:18 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/01/17 13:10:20 INFO SparkContext: Successfully stopped SparkContext
13:10:20.363 INFO BaseRecalibratorSpark - Shutting down engine
[January 17, 2022 at 1:10:20 PM CET] org.broadinstitute.hellbender.tools.spark.BaseRecalibratorSpark done. Elapsed time: 0.23 minutes.
Runtime.totalMemory()=2910846976
java.nio.file.FileSystemException: /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf.idx -> /tmp/spark-78d11971-b432-4342-8481-a2ac5daa8312/userFiles-3183fa15-760f-414c-a479-92f45293a294/Homo_sapiens_assembly38.dbsnp138.vcf.idx: No space left on device
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixCopyFile.copyFile(UnixCopyFile.java:258)
at java.base/sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:603)
at java.base/sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:258)
at java.base/java.nio.file.Files.copy(Files.java:1295)
at org.apache.spark.util.Utils$.org$apache$spark$util$Utils$$copyRecursive(Utils.scala:664)
at org.apache.spark.util.Utils$.copyFile(Utils.scala:635)
at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1568)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1508)
at org.apache.spark.api.java.JavaSparkContext.addFile(JavaSparkContext.scala:675)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.addVCFsForSpark(GATKSparkTool.java:724)
at org.broadinstitute.hellbender.tools.spark.BaseRecalibratorSpark.runTool(BaseRecalibratorSpark.java:128)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:546)
at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:31)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
22/01/17 13:10:20 INFO ShutdownHookManager: Shutdown hook called
22/01/17 13:10:20 INFO ShutdownHookManager: Deleting directory /tmp/spark-78d11971-b432-4342-8481-a2ac5daa8312
If not an error, choose a category for your question (REQUIRED):
a) How do I resolve this error?
Details
I have 160 files, and my command worked perfectly for 83 of them, but it started producing this error for the remaining files. Do you know what could be the reason for this error? I am really stuck and would really appreciate your help.
-
Hi Huma Shehwana,
The error message for this run can be found in the program log here:
java.nio.file.FileSystemException: /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf.idx -> /tmp/spark-78d11971-b432-4342-8481-a2ac5daa8312/userFiles-3183fa15-760f-414c-a479-92f45293a294/Homo_sapiens_assembly38.dbsnp138.vcf.idx: No space left on device
It looks like you are running out of disk space in your temporary directory. The Spark tools stage a copy of the reference, the known-sites VCF, and its index into /tmp for every run (the "Utils: Copying ..." lines in your log), so the partition holding /tmp can fill up over many successive jobs; that would explain why the first 83 files worked and the later ones failed. Freeing up space on that partition, or pointing Spark at a larger scratch directory, should let the remaining jobs run; see the sketch below. Hopefully this helps you to run the jobs successfully! Please let me know if you have further questions.
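As a rough sketch of one way to do this (assuming your /DATA partition has more free space than the one holding /tmp; the /DATA/scratch path below is just a placeholder, and spark.local.dir is the standard Spark property that controls where these staging directories are created):

# Check free space on the partition holding /tmp
df -h /tmp

# Clean up staging directories left behind by earlier failed runs
# (only remove ones you own that are not in use by a running job)
rm -rf /tmp/spark-*

# Re-run with Spark's staging directory on a larger partition, passed
# through GATK's --conf argument (format: <property>=<value>)
gatk BaseRecalibratorSpark \
    -R /DATA/peeper_lab/reference_fasta_files/GRCh38.d1.vd1.fa \
    -I PRADO_data/7B_dedupraw/S2697Nr1_dedup.bam \
    --known-sites /DATA/General_Resources/DNAseq_databases_knownresources/Homo_sapiens_assembly38.dbsnp138.vcf \
    -O PRADO_data/8C_bqsr/S2697Nr1_recal.table \
    --spark-master local[4] \
    --conf 'spark.local.dir=/DATA/scratch'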
Best,
Genevieve