Haplotype caller error file not found
Dear GATK team,
When I use HaplotypeCaller, the same script sometimes works correctly and produces the gVCF, and sometimes does not. This happens for the same file: I run the script and it fails, then I restart it without changing anything and it works.
Based on the log file (below), it seems that it cannot find a file that it is supposed to create itself.
I use GATK 4.2.2.0.
The command:
for i in $(seq -w 0001 00${SCATTER_COUNT})
do
    srun --ntasks=1 gatk --java-options "-Xmx${SLURM_MEM_PER_CPU}M" HaplotypeCaller \
        -R ${REF_Genome} \
        -L ${Scattered_DIR}/temp_${i}_of_${SCATTER_COUNT}/scattered.interval_list \
        -I ${BAM_INPUT_DIR}/${BAM_INPUT} \
        -O ${temp_gVCF_OUTPUT_DIR}/${i}.${GVCF_OUTPUT} \
        -G StandardAnnotation -G AS_StandardAnnotation -G StandardHCAnnotation \
        -GQB 10 -GQB 20 -GQB 30 -GQB 40 -GQB 50 -GQB 60 -GQB 70 -GQB 80 -GQB 90 \
        -ERC GVCF \
        --pcr-indel-model NONE \
        2> ${logs_HC}/${i}.${GVCF_OUTPUT}.log &
done
The log:
Using GATK jar /shared/ifbstor1/projects/gentaumix/conda/envs/gatk_4.2.2.0/share/gatk4-4.2.2.0-0/gatk-package-4.2.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx5000M -jar /shared/ifbstor1/projects/gentaumix/conda/envs/gatk_4.2.2.0/share/gatk4-4.2.2.0-0/gatk-package-4.2.2.0-local.jar HaplotypeCaller -R /shared/projects/gentaumix/Ressources/grch38_BWA_2_saliva/grch38_with_saliva.fa -L /shared/projects/gentaumix/processed_data_set5/05_VCF/A_gVCF/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam/interval/temp_0018_of_50/scattered.interval_list -I /shared/projects/gentaumix/processed_data_set5/03_Alignment/D_recal/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam -O /shared/projects/gentaumix/processed_data_set5/05_VCF/A_gVCF/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam/temp_gVCF/0018.G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.g.vcf.gz -G StandardAnnotation -G AS_StandardAnnotation -G StandardHCAnnotation -GQB 10 -GQB 20 -GQB 30 -GQB 40 -GQB 50 -GQB 60 -GQB 70 -GQB 80 -GQB 90 -ERC GVCF --pcr-indel-model NONE
00:08:08.048 WARN GATKAnnotationPluginDescriptor - Redundant enabled annotation group (StandardAnnotation) is enabled for this tool by default
00:08:08.097 WARN GATKAnnotationPluginDescriptor - Redundant enabled annotation group (StandardHCAnnotation) is enabled for this tool by default
00:08:08.428 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/shared/ifbstor1/projects/gentaumix/conda/envs/gatk_4.2.2.0/share/gatk4-4.2.2.0-0/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 22, 2021 12:08:09 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
00:08:09.460 INFO HaplotypeCaller - ------------------------------------------------------------
00:08:09.471 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.2.2.0
00:08:09.472 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
00:08:09.483 INFO HaplotypeCaller - Executing as quentin67100@cpu-node-61.ifb.local on Linux v3.10.0-1160.6.1.el7.x86_64 amd64
00:08:09.484 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_282-b08
00:08:09.484 INFO HaplotypeCaller - Start Date/Time: 22 octobre 2021 00:08:08 CEST
00:08:09.485 INFO HaplotypeCaller - ------------------------------------------------------------
00:08:09.485 INFO HaplotypeCaller - ------------------------------------------------------------
00:08:09.486 INFO HaplotypeCaller - HTSJDK Version: 2.24.1
00:08:09.486 INFO HaplotypeCaller - Picard Version: 2.25.4
00:08:09.486 INFO HaplotypeCaller - Built for Spark Version: 2.4.5
00:08:09.486 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
00:08:09.487 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
00:08:09.487 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
00:08:09.487 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
00:08:09.487 INFO HaplotypeCaller - Deflater: IntelDeflater
00:08:09.487 INFO HaplotypeCaller - Inflater: IntelInflater
00:08:09.488 INFO HaplotypeCaller - GCS max retries/reopens: 20
00:08:09.488 INFO HaplotypeCaller - Requester pays: disabled
00:08:09.488 INFO HaplotypeCaller - Initializing engine
00:08:12.451 INFO FeatureManager - Using codec IntervalListCodec to read file file:///shared/projects/gentaumix/processed_data_set5/05_VCF/A_gVCF/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam/interval/temp_0018_of_50/scattered.interval_list
00:08:12.704 INFO IntervalArgumentCollection - Processing 58416259 bp from intervals
00:08:12.805 INFO HaplotypeCaller - Done initializing engine
00:08:12.817 INFO HaplotypeCallerEngine - Tool is in reference confidence mode and the annotation, the following changes will be made to any specified annotations: 'StrandBiasBySample' will be enabled. 'ChromosomeCounts', 'FisherStrand', 'StrandOddsRatio' and 'QualByDepth' annotations have been disabled
00:08:13.343 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
00:08:13.344 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
00:08:13.383 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/shared/ifbstor1/projects/gentaumix/conda/envs/gatk_4.2.2.0/share/gatk4-4.2.2.0-0/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
00:08:13.395 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/shared/ifbstor1/projects/gentaumix/conda/envs/gatk_4.2.2.0/share/gatk4-4.2.2.0-0/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
00:08:13.481 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
00:08:13.483 INFO IntelPairHmm - Available threads: 14
00:08:13.483 INFO IntelPairHmm - Requested threads: 4
00:08:13.483 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
00:08:20.526 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
00:08:20.538 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
00:08:20.539 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec
00:08:20.539 INFO HaplotypeCaller - Shutting down engine
[22 octobre 2021 00:08:20 CEST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.20 minutes.
Runtime.totalMemory()=1286078464
htsjdk.samtools.util.RuntimeIOException: File not found: /shared/projects/gentaumix/processed_data_set5/05_VCF/A_gVCF/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam/temp_gVCF/0018.G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.g.vcf.gz
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:451)
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:415)
at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.createVCFWriter(GATKVariantContextUtils.java:123)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.makeVCFWriter(HaplotypeCallerEngine.java:374)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.onTraversalStart(HaplotypeCaller.java:263)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1083)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.nio.file.FileSystemException: /shared/projects/gentaumix/processed_data_set5/05_VCF/A_gVCF/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam/temp_gVCF/0018.G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.g.vcf.gz: Erreur de protocole
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
at java.nio.file.Files.newOutputStream(Files.java:216)
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:447)
... 11 more
srun: error: cpu-node-61: task 0: Exited with exit code 3
-
Could you give the specific filenames in your HaplotypeCaller command that are associated with this error message?
Best,
Genevieve
-
The log shows:
-I /shared/projects/gentaumix/processed_data_set5/03_Alignment/D_recal/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam -O /shared/projects/gentaumix/processed_data_set5/05_VCF/A_gVCF/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam/temp_gVCF/0018.G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.g.vcf.gz
-
It's not a critical error, since it works in the end, but it's a bit annoying to have the job run for 2 hours and ultimately fail without really knowing why, with no option other than restarting it and hoping that it passes this time.
-
Oh yeah, you're right! It looks like the problem is from the output file. There is also another error message at the bottom:
Caused by: java.nio.file.FileSystemException:
/shared/projects/gentaumix/processed_data_set5/05_VCF/A_gVCF/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam/temp_gVCF/0018.G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.g.vcf.gz:
Erreur de protocole
…
srun: error: cpu-node-61: task 0: Exited with exit code 3
This looks to me like an I/O problem which is most likely being caused by your cluster or filesystem. You can look into the exit code 3 more closely to figure out why your job is not able to create the output file.
Best,
Genevieve
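
One way to narrow down an I/O problem like this is to test, on the compute node and just before launching HaplotypeCaller, whether a file can actually be created in the output directory. The sketch below is only an illustration: `check_writable` is a hypothetical helper name, and it assumes that a small probe file goes through the same filesystem path as the real output.

```shell
# Hypothetical helper (not part of GATK or Slurm): returns 0 if a file can be
# created, written and removed in the directory given as $1, non-zero otherwise.
check_writable() {
    local dir="$1"
    local probe
    # mktemp fails if the directory is missing or a file cannot be created in it.
    probe=$(mktemp "${dir}/.write_probe.XXXXXX" 2>/dev/null) || return 1
    # Writing a few bytes exercises the create/open/write path on the filesystem.
    if ! printf 'probe\n' > "${probe}" 2>/dev/null; then
        rm -f "${probe}"
        return 1
    fi
    rm -f "${probe}"
}

# Example use inside the loop, before the srun line (assumed placement):
#   check_writable "${temp_gVCF_OUTPUT_DIR}" || echo "cannot write to output dir" >&2
```

If the probe fails intermittently on the same node and directory, that points at the filesystem rather than at HaplotypeCaller.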
-
I asked the administrators of our cluster, and we cannot find out where the problem comes from. The file in question is indeed created by HaplotypeCaller, even though it is completely empty, so I do not understand why it says it cannot find it.
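
Since the failed shards leave an empty output file behind, they can be spotted after the run and only those need to be restarted. A minimal sketch, assuming the zero-padded four-digit naming from the loop in the original post; `list_failed_shards` is a hypothetical helper name:

```shell
# Hypothetical helper: print the scatter indices whose expected output file is
# missing or zero bytes long.
#   $1 = output directory, $2 = scatter count, $3 = output file name suffix
list_failed_shards() {
    local out_dir="$1" count="$2" suffix="$3"
    local n=1 i
    while [ "$n" -le "$count" ]; do
        # Same zero-padded index format as the scatter loop (0001, 0002, ...).
        i=$(printf '%04d' "$n")
        # -s is true only if the file exists and is larger than zero bytes.
        [ -s "${out_dir}/${i}.${suffix}" ] || printf '%s\n' "$i"
        n=$((n + 1))
    done
}

# Example use (variable names taken from the original script):
#   list_failed_shards "${temp_gVCF_OUTPUT_DIR}" "${SCATTER_COUNT}" "${GVCF_OUTPUT}"
```

This only catches the empty-file case described here. A non-empty file can still be truncated, so the per-shard logs remain the authoritative check.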
-
The error message saying that HaplotypeCaller cannot find the file doesn't usually mean the file doesn't exist; usually there is another problem. There could be too many jobs writing to files in that directory, or potentially a permissions issue.
This part of the stack trace is most likely much more descriptive of what is going on in your file system ("Erreur de protocole" is the French-locale message for "Protocol error"):
Caused by: java.nio.file.FileSystemException: /shared/projects/gentaumix/processed_data_set5/05_VCF/A_gVCF/G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.bam/temp_gVCF/0018.G632_DA_C002A5Q_1_H5N3JDSX2.DUAL2.trimmed.align.filtered.dedup.fixed.recal.g.vcf.gz: Erreur de protocole
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
at java.nio.file.Files.newOutputStream(Files.java:216)
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:447)
... 11 more
srun: error: cpu-node-61: task 0: Exited with exit code 3
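
If the failure is transient, as the successful reruns suggest, a retry wrapper can avoid restarting the job by hand. This is a workaround sketch, not a fix for the underlying filesystem issue; `retry` is a hypothetical helper, and the attempt count and delay are arbitrary assumptions.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 on the first success, 1 if all attempts fail.
retry() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    while :; do
        # Run the wrapped command; stop as soon as it succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "retry: giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "retry: attempt ${attempt} failed, sleeping ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

In the scatter loop from the original post, the `srun` command line would be prefixed with something like `retry 3 60`, keeping the existing `2>` redirection and the trailing `&`. Staggering the job starts may also help if many jobs opening files in the same directory at once is the trigger.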