Issue when running BaseRecalibrator
REQUIRED for all errors and issues:
a) GATK version used: v4.2.6.1
b) Exact command used: see below
c) Entire program log: see below
How can I assign a temp directory without triggering this error?
I always get an error when I assign the temp directory via -Djava.io.tmpdir:
/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx8G -Djava.io.tmpdir=/data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/shell/temp" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.test.table
Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx8G -Djava.io.tmpdir=/data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/shell/temp -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.test.table
00:09:41.541 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
00:09:41.554 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)
00:09:41.557 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
00:09:41.558 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)
00:09:41.678 INFO BaseRecalibrator - ------------------------------------------------------------
00:09:41.679 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1
00:09:41.679 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
00:09:41.679 INFO BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64
00:09:41.679 INFO BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087
00:09:41.680 INFO BaseRecalibrator - Start Date/Time: August 21, 2022 at 12:09:41 AM CST
00:09:41.680 INFO BaseRecalibrator - ------------------------------------------------------------
00:09:41.680 INFO BaseRecalibrator - ------------------------------------------------------------
00:09:41.681 INFO BaseRecalibrator - HTSJDK Version: 2.24.1
00:09:41.681 INFO BaseRecalibrator - Picard Version: 2.27.1
00:09:41.681 INFO BaseRecalibrator - Built for Spark Version: 2.4.5
00:09:41.681 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
00:09:41.681 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
00:09:41.681 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
00:09:41.681 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
00:09:41.682 INFO BaseRecalibrator - Deflater: JdkDeflater
00:09:41.682 INFO BaseRecalibrator - Inflater: JdkInflater
00:09:41.682 INFO BaseRecalibrator - GCS max retries/reopens: 20
00:09:41.682 INFO BaseRecalibrator - Requester pays: disabled
00:09:41.682 INFO BaseRecalibrator - Initializing engine
00:09:41.884 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
00:09:41.888 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
00:09:42.030 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
00:09:42.036 INFO BaseRecalibrator - Shutting down engine
[August 21, 2022 at 12:09:42 AM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=1140850688
org.broadinstitute.hellbender.exceptions.GATKException: Unable to automatically instantiate codec org.broadinstitute.hellbender.utils.codecs.AnnotatedIntervalCodec
at org.broadinstitute.hellbender.engine.FeatureManager.getCandidateCodecsForFile(FeatureManager.java:535)
at org.broadinstitute.hellbender.engine.FeatureManager.getCodecForFile(FeatureManager.java:482)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getCodecForFeatureInput(FeatureDataSource.java:397)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:373)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:319)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:291)
at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:245)
at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:208)
at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:155)
at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:72)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:726)
at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:51)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
I get the same error when I assign the temp directory another way, with --tmp-dir:
/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx30G" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.table --tmp-dir /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam
Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx30G -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.table --tmp-dir /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam
00:11:11.683 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
00:11:11.697 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)
00:11:11.700 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
00:11:11.700 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)
00:11:11.812 INFO BaseRecalibrator - ------------------------------------------------------------
00:11:11.813 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1
00:11:11.813 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
00:11:11.813 INFO BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64
00:11:11.813 INFO BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087
00:11:11.813 INFO BaseRecalibrator - Start Date/Time: August 21, 2022 at 12:11:11 AM CST
00:11:11.813 INFO BaseRecalibrator - ------------------------------------------------------------
00:11:11.813 INFO BaseRecalibrator - ------------------------------------------------------------
00:11:11.814 INFO BaseRecalibrator - HTSJDK Version: 2.24.1
00:11:11.814 INFO BaseRecalibrator - Picard Version: 2.27.1
00:11:11.814 INFO BaseRecalibrator - Built for Spark Version: 2.4.5
00:11:11.814 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
00:11:11.814 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
00:11:11.814 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
00:11:11.814 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
00:11:11.814 INFO BaseRecalibrator - Deflater: JdkDeflater
00:11:11.815 INFO BaseRecalibrator - Inflater: JdkInflater
00:11:11.815 INFO BaseRecalibrator - GCS max retries/reopens: 20
00:11:11.815 INFO BaseRecalibrator - Requester pays: disabled
00:11:11.815 INFO BaseRecalibrator - Initializing engine
00:11:12.005 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
00:11:12.009 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
00:11:12.127 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
00:11:12.134 INFO BaseRecalibrator - Shutting down engine
[August 21, 2022 at 12:11:12 AM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=285212672
org.broadinstitute.hellbender.exceptions.GATKException: Unable to automatically instantiate codec org.broadinstitute.hellbender.utils.codecs.AnnotatedIntervalCodec
at org.broadinstitute.hellbender.engine.FeatureManager.getCandidateCodecsForFile(FeatureManager.java:535)
at org.broadinstitute.hellbender.engine.FeatureManager.getCodecForFile(FeatureManager.java:482)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getCodecForFeatureInput(FeatureDataSource.java:397)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:373)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:319)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:291)
at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:245)
at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:208)
at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:155)
at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:72)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:726)
at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:51)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
However, the error does not occur when I do not assign a temp directory:
/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx30G" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.test.table
Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx30G -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.test.table
00:12:20.992 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
00:12:21.140 INFO BaseRecalibrator - ------------------------------------------------------------
00:12:21.141 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1
00:12:21.141 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
00:12:21.141 INFO BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64
00:12:21.141 INFO BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087
00:12:21.142 INFO BaseRecalibrator - Start Date/Time: August 21, 2022 at 12:12:20 AM CST
00:12:21.142 INFO BaseRecalibrator - ------------------------------------------------------------
00:12:21.142 INFO BaseRecalibrator - ------------------------------------------------------------
00:12:21.142 INFO BaseRecalibrator - HTSJDK Version: 2.24.1
00:12:21.143 INFO BaseRecalibrator - Picard Version: 2.27.1
00:12:21.143 INFO BaseRecalibrator - Built for Spark Version: 2.4.5
00:12:21.143 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
00:12:21.143 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
00:12:21.143 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
00:12:21.143 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
00:12:21.143 INFO BaseRecalibrator - Deflater: IntelDeflater
00:12:21.144 INFO BaseRecalibrator - Inflater: IntelInflater
00:12:21.144 INFO BaseRecalibrator - GCS max retries/reopens: 20
00:12:21.144 INFO BaseRecalibrator - Requester pays: disabled
00:12:21.144 INFO BaseRecalibrator - Initializing engine
00:12:21.485 INFO FeatureManager - Using codec VCFCodec to read file file:///data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz
00:12:21.565 INFO FeatureManager - Using codec VCFCodec to read file file:///data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz
00:12:21.688 INFO FeatureManager - Using codec VCFCodec to read file file:///data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
00:12:21.797 WARN IndexUtils - Feature file "file:///data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
00:12:21.895 WARN IntelInflater - Zero Bytes Written : 0
00:12:21.966 INFO BaseRecalibrator - Done initializing engine
00:12:21.969 INFO BaseRecalibrationEngine - The covariates being used here:
00:12:21.969 INFO BaseRecalibrationEngine - ReadGroupCovariate
00:12:21.969 INFO BaseRecalibrationEngine - QualityScoreCovariate
00:12:21.969 INFO BaseRecalibrationEngine - ContextCovariate
00:12:21.969 INFO BaseRecalibrationEngine - CycleCovariate
00:12:22.016 INFO ProgressMeter - Starting traversal
00:12:22.017 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
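One visible difference between the logs above: both failing runs warn "Unable to load libgkl_compression.so" and report JdkDeflater/JdkInflater, while this run reports IntelDeflater/IntelInflater. Below is a minimal sketch for checking a saved log for that warning. It assumes the console output was redirected to a file; the function name and the demo file are hypothetical.

```shell
#!/bin/sh
# Sketch only: report whether a saved GATK log contains the native-library warning.
gkl_load_status() {
    # $1: path to a log file captured from a GATK run
    if grep -qF 'Unable to load libgkl_compression.so' "$1"; then
        echo "native library failed to load"
    else
        echo "no native library warning"
    fi
}

# Demo on a throwaway file holding one warning line copied from the failing run.
demo_log=$(mktemp)
printf '%s\n' '00:09:41.554 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)' > "$demo_log"
gkl_load_status "$demo_log"
rm -f "$demo_log"
```

The check only confirms that the warning is present; it does not by itself explain the later codec exception.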
How can I assign a temp directory without triggering this error?
I set up the GATK environment using conda:
/data/xieduo/WES_pipe/pipeline/bin/Miniconda3/bin/conda env create -n gatk_4.2.6.1 -f gatkcondaenv.yml
Thank you!
Best,
Duo
-
Hi Duo Xie,
I wanted to follow up with you on your inquiry with some updates.
The GATK developer that is best able to help solve your GitHub ticket is currently out of the office. They aren't due to be back until early next week. I wanted to keep you updated and let you know that we haven't forgotten about you.
We truly appreciate your patience! If you have any additional questions/comments in the meantime, please do not hesitate to reach out anytime.
Best,
Anthony
-
Thank you for your work!
I have given it a try and responded in the GitHub page: https://github.com/broadinstitute/gatk/issues/8005#issuecomment-1254561081
Best,
Duo
-
Hi Duo Xie,
Thank you for writing to the GATK forum! I hope that we can help you sort this out.
After consulting with our developers, it appears that the issue that you are encountering may be a bug. I’ve created a GitHub ticket where you can follow along with the progress as we try to fix this issue. Please find the link here.
We appreciate you bringing this issue to our attention! Please let us know if you have any other questions in the meantime.
Best,
Anthony
-
Hi Duo Xie,
Thank you for your much-appreciated patience! Our developer has returned from vacation and has responded to the GitHub thread. I wanted to write to you to let you know, just in case you didn’t see it.
The developers think the issue may be related to the non-ASCII character (Ł) in the temp file path:
Łuksza_2022_Nature
They would like you to try a temp folder and an output folder whose paths contain only English (ASCII) characters. Java seems to have recurring problems with non-ASCII characters.
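As a rough sketch of that suggestion (not an official GATK recipe), the shell snippet below counts the non-ASCII bytes in a path and prints a --tmp-dir argument only when the path is ASCII-only. The directory /data/xieduo/gatk_tmp and the helper names are hypothetical; --tmp-dir is the option already used earlier in this thread.

```shell
#!/bin/sh
# Sketch only: helper names and the example temp directory are assumptions.

# Print the number of bytes in $1 that fall outside 7-bit ASCII.
non_ascii_bytes() {
    printf '%s' "$1" | LC_ALL=C tr -d '\001-\177' | wc -c | tr -d ' '
}

# Exit status 0 when $1 contains only ASCII bytes.
is_ascii_path() {
    [ "$(non_ascii_bytes "$1")" -eq 0 ]
}

# Print (do not run) the --tmp-dir argument for BaseRecalibrator,
# refusing paths that contain non-ASCII bytes.
tmp_dir_arg() {
    if is_ascii_path "$1"; then
        printf '%s\n' "--tmp-dir $1"
    else
        printf '%s\n' "refusing non-ASCII temp path: $1" >&2
        return 1
    fi
}

# Hypothetical ASCII-only location; replace with a directory you can write to.
tmp_dir_arg "/data/xieduo/gatk_tmp"
```

Create the directory first (for example with mkdir -p) and use the printed argument in place of the original --tmp-dir value. Since the suggestion also covers the output folder, the same check can be applied to the -O path.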
I hope this helps! Please let me know whether this fix leads you to success. If not, please reach back out; we will happily look further into what else might be causing this issue. Thank you again for your incredible patience.
Stay well,
Anthony
-
Hi Duo Xie,
I'm going to go ahead and mark this ticket as solved. You should be able to continue following along with your GitHub ticket.
Thank you again for your contribution to the GATK forum. If you need anything else from me, please do not hesitate to reach out!
Best,
Anthony