VariantRecalibrator - Exception thrown at
Hello,
I have a multi-sample VCF file (10 WGS human samples, chr1) generated using HaplotypeCaller in GVCF mode, followed by GenomicsDBImport and GenotypeGVCFs. I am trying to run VariantRecalibrator, but I am getting an error that I have not seen before. I followed the recommendations from other posts about similar problems, but none of them covers this error with VariantRecalibrator. I ran "gzcat vcf | head -1", PrintBGZFBlockInformation and ValidateVariants on my VCF file and on the following resource files:
- hapmap_3.3.hg38.vcf.gz
- 1000G_omni2.5.hg38.vcf.gz
- 1000G_phase1.snps.high_confidence.hg38.vcf.gz
- Homo_sapiens_assembly38.dbsnp138.vcf.gz
Everything looks good. The command and the program log are as follows:
REQUIRED for all errors and issues:
a) GATK version used: gatk-4.3.0.0
b) Exact command used:
c) Entire program log:
11:54:56.228 INFO VariantRecalibrator - The Genome Analysis Toolkit (GATK) v4.3.0.0
11:54:56.229 INFO VariantRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
11:54:56.231 INFO VariantRecalibrator - Executing as lsanz@scn02.svi.edu.au on Linux v4.18.0-425.19.2.el8_7.x86_64 amd64
11:54:56.232 INFO VariantRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_372-b07
11:54:56.232 INFO VariantRecalibrator - Start Date/Time: 26 May 2023 11:54:55 AM
11:54:56.233 INFO VariantRecalibrator - ------------------------------------------------------------
11:54:56.234 INFO VariantRecalibrator - ------------------------------------------------------------
11:54:56.235 INFO VariantRecalibrator - HTSJDK Version: 3.0.1
11:54:56.236 INFO VariantRecalibrator - Picard Version: 2.27.5
11:54:56.236 INFO VariantRecalibrator - Built for Spark Version: 2.4.5
11:54:56.236 INFO VariantRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
11:54:56.237 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
11:54:56.237 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
11:54:56.238 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
11:54:56.238 INFO VariantRecalibrator - Deflater: IntelDeflater
11:54:56.239 INFO VariantRecalibrator - Inflater: IntelInflater
11:54:56.239 INFO VariantRecalibrator - GCS max retries/reopens: 20
11:54:56.239 INFO VariantRecalibrator - Requester pays: disabled
11:54:56.240 INFO VariantRecalibrator - Initializing engine
11:54:56.815 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/hapmap_3.3.hg38.vcf.gz
11:54:57.132 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/1000G_omni2.5.hg38.vcf.gz
11:54:57.301 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/1000G_phase1.snps.high_confidence.hg38.vcf.gz
11:54:57.450 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/Homo_sapiens_assembly38.dbsnp138.vcf.gz
11:54:57.581 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/fullpilot_chr1.vcf.gz
11:54:57.718 INFO VariantRecalibrator - Done initializing engine
11:54:57.722 INFO TrainingSet - Found hapmap track: Known = false Training = true Truth = true Prior = Q15.0
11:54:57.723 INFO TrainingSet - Found omni track: Known = false Training = true Truth = true Prior = Q12.0
11:54:57.723 INFO TrainingSet - Found 1000G track: Known = false Training = true Truth = false Prior = Q10.0
11:54:57.724 INFO TrainingSet - Found dbsnp track: Known = true Training = false Truth = false Prior = Q7.0
11:54:57.743 WARN GATKVariantContextUtils - Can't determine output variant file format from output file extension "recal". Defaulting to VCF.
11:54:57.768 INFO ProgressMeter - Starting traversal
11:54:57.769 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
11:54:58.258 INFO VariantRecalibrator - Shutting down engine
[26 May 2023 11:54:58 AM] org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=3087007744
org.broadinstitute.hellbender.exceptions.GATKException: Exception thrown at chr1:1325353 [VC /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/fullpilot_chr1.vcf.gz @ chr1:1325353 Q1725.35 of type=SNP alleles=[A*, C] attr={AC=19, AF=0.950, AN=20, BaseQRankSum=0.00, DP=51, ExcessHet=0.0000, FS=0.000, InbreedingCoeff=-0.0850, MLEAC=18, MLEAF=0.900, MQ=60.00, MQRankSum=0.00, QD=37.92, ReadPosRankSum=-5.240e-01, SOR=0.823} GT=GT:AD:DP:GQ:PL 1/1:0,5:5:15:180,15,0 0/1:2,3:5:56:92,0,56 1/1:0,7:7:21:254,21,0 1/1:0,5:5:15:180,15,0 1/1:0,2:2:6:72,6,0 1/1:0,5:5:15:181,15,0 1/1:0,2:2:6:73,6,0 1/1:0,8:8:24:291,24,0 1/1:0,5:5:15:180,15,0 1/1:0,5:5:15:179,15,0 filters=
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:145)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:136)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1095)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: htsjdk.samtools.util.RuntimeIOException: java.io.IOException: Bad address
at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:48)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:170)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.<init>(TabixFeatureReader.java:159)
at htsjdk.tribble.TabixFeatureReader.query(TabixFeatureReader.java:133)
at org.broadinstitute.hellbender.engine.FeatureDataSource.refillQueryCache(FeatureDataSource.java:622)
at org.broadinstitute.hellbender.engine.FeatureDataSource.queryAndPrefetch(FeatureDataSource.java:591)
at org.broadinstitute.hellbender.engine.FeatureManager.getFeatures(FeatureManager.java:363)
at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:173)
at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:125)
at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:240)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantDataManager.parseTrainingSets(VariantDataManager.java:397)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.addDatum(VariantRecalibrator.java:621)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.addVariantDatum(VariantRecalibrator.java:578)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.lambda$consumeQueuedVariants$0(VariantRecalibrator.java:549)
at java.util.ArrayList.forEach(ArrayList.java:1259)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.consumeQueuedVariants(VariantRecalibrator.java:549)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.apply(VariantRecalibrator.java:528)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:139)
... 20 more
Caused by: java.io.IOException: Bad address
at java.io.RandomAccessFile.readBytes(Native Method)
at java.io.RandomAccessFile.read(RandomAccessFile.java:377)
at htsjdk.samtools.seekablestream.SeekableFileStream.read(SeekableFileStream.java:85)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at htsjdk.samtools.seekablestream.SeekableBufferedStream.read(SeekableBufferedStream.java:133)
at htsjdk.samtools.util.BlockCompressedInputStream.readBytes(BlockCompressedInputStream.java:571)
at htsjdk.samtools.util.BlockCompressedInputStream.readBytes(BlockCompressedInputStream.java:560)
at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:510)
at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
at htsjdk.samtools.util.BlockCompressedInputStream.seek(BlockCompressedInputStream.java:382)
at htsjdk.tribble.readers.TabixReader$IteratorImpl.next(TabixReader.java:427)
at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:46)
... 37 more
Using GATK jar /mnt/beegfs/users/lsanz/Software/gatk.tool/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx3g -Xms3g -jar /mnt/beegfs/users/lsanz/Software/gatk.tool/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar VariantRecalibrator -V /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/fullpilot_chr1.vcf.gz --trust-all-polymorphic -tranche 100.0 -tranche 99.95 -tranche 99.9 -tranche 99.8 -tranche 99.6 -tranche 99.5 -tranche 99.4 -tranche 99.3 -tranche 99.0 -tranche 98.0 -tranche 97.0 -tranche 90.0 -an QD -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an SOR -an DP -mode SNP --max-gaussians 6 --resource:hapmap,known=false,training=true,truth=true,prior=15 /mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/hapmap_3.3.hg38.vcf.gz --resource:omni,known=false,training=true,truth=true,prior=12 /mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/1000G_omni2.5.hg38.vcf.gz --resource:1000G,known=false,training=true,truth=false,prior=10 /mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/1000G_phase1.snps.high_confidence.hg38.vcf.gz --resource:dbsnp,known=true,training=false,truth=false,prior=7 /mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/Homo_sapiens_assembly38.dbsnp138.vcf.gz -O /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/cohort_snps.recal --tranches-file /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/cohort_snps.tranches
I then gunzipped and re-bgzipped every file and ran the command again. I got the same error, but at different positions:
org.broadinstitute.hellbender.exceptions.GATKException: Exception thrown at chr1:22576030 [VC /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/fullpilot_chr1.vcf @ chr1:22576030 Q55.78 of type=SNP alleles=[C*, T] attr={AC=1, AF=0.050, AN=20, BaseQRankSum=0.967, DP=44, ExcessHet=0.0000, FS=4.771, InbreedingCoeff=-0.0766, MLEAC=1, MLEAF=0.050, MQ=60.00, MQRankSum=0.00, QD=18.59, ReadPosRankSum=0.967, SOR=2.225} GT=GT:AD:DP:GQ:PL 0/0:4,0:4:12:0,12,110 0/0:3,0:3:9:0,9,89 0/0:4,0:4:12:0,12,123 0/0:5,0:5:15:0,15,149 0/0:3,0:3:9:0,9,89 0/0:8,0:8:24:0,24,269 0/1:1,2:3:25:67,0,25 0/0:5,0:5:15:0,15,135 0/0:4,0:4:12:0,12,119 0/0:5,0:5:15:0,15,138 filters=
org.broadinstitute.hellbender.exceptions.GATKException: Exception thrown at chr1:90538650 [VC /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/fullpilot_chr1.vcf @ chr1:90538650 Q365.63 of type=SNP alleles=[T*, C] attr={AC=6, AF=0.300, AN=20, DP=44, ExcessHet=0.0000, FS=0.000, InbreedingCoeff=0.8086, MLEAC=5, MLEAF=0.250, MQ=60.00, QD=30.47, SOR=1.022} GT=GT:AD:DP:GQ:PL 0/0:8,0:8:24:0,24,289 0/0:6,0:6:15:0,15,225 1/1:0,2:2:6:70,6,0 0/0:5,0:5:15:0,15,169 0/0:4,0:4:12:0,12,140 0/0:2,0:2:6:0,6,65 1/1:0,7:7:21:225,21,0 0/0:3,0:3:9:0,9,90 0/0:4,0:4:12:0,12,132 1/1:0,3:3:9:103,9,0 filters=
All of these positions are present in my VCF file. Has anyone experienced this error, and if so, how did you fix it? I would really appreciate some help. In the meantime, I'll try hard-filtering.
Many thanks,
Laura
-
A "Bad address" exception indicates a problem accessing files, possibly the temporary files used by GATK. Could you point --tmp-dir to a directory where you have read-write access?
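To check a candidate temporary directory before handing it to GATK, something like the following Python sketch might help. It is not part of GATK, and the function name `check_tmp_dir` is made up for illustration. It only verifies that a file can be created, written, synced and read back in the given directory; it says nothing about quotas or free space.

```python
import os
import tempfile


def check_tmp_dir(path):
    """Return True if a file can be created, written, synced and read back in `path`.

    Hypothetical helper for illustration only; not a GATK API.
    """
    payload = b"gatk-tmp-dir-check"
    try:
        # Create a scratch file inside the candidate directory; it is deleted on close.
        with tempfile.NamedTemporaryFile(dir=path) as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the write through to the filesystem
            handle.seek(0)
            return handle.read() == payload
    except OSError as err:
        # Covers a missing directory, permission problems and low-level I/O errors.
        print(f"{path}: {err}")
        return False
```

If `check_tmp_dir("/path/to/tmp")` returns True (the path here is a placeholder), the same directory can then be passed to GATK with --tmp-dir.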
-
Thanks for your comment SkyWarrior
I have read-write access in the output directory. I also tried moving --tmp-dir to another directory and it didn't work.
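Since the stack trace ends in `RandomAccessFile.readBytes` after `BlockCompressedInputStream.seek`, reached from `VariantDataManager.parseTrainingSets` through a tabix query, the failing operation appears to be a seek-and-read on one of the indexed input files rather than a write to the temporary directory. A rough way to see whether such reads also fail outside Java is the Python sketch below. The function name, the number of reads and the chunk size are arbitrary choices for illustration, and a clean result does not rule out an intermittent filesystem problem.

```python
import os
import random


def random_read_check(path, n_reads=1000, chunk_size=65536, seed=0):
    """Seek to random offsets in `path` and read a chunk at each one.

    Returns a list of (offset, error message) tuples for reads that raised OSError.
    Hypothetical helper for illustration only; not a GATK or htsjdk API.
    """
    rng = random.Random(seed)
    size = os.path.getsize(path)
    failures = []
    # buffering=0 disables Python's own buffer, so each read goes to the OS.
    with open(path, "rb", buffering=0) as handle:
        for _ in range(n_reads):
            offset = rng.randrange(size) if size > 0 else 0
            try:
                handle.seek(offset)
                handle.read(chunk_size)
            except OSError as err:
                failures.append((offset, str(err)))
    return failures
```

Running it on each resource file and on the input VCF (for example `random_read_check("hapmap_3.3.hg38.vcf.gz")`) and getting a non-empty list would point towards the storage layer rather than GATK.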
-
Interesting. Can you run a ValidateVariants on your vcf file to see if there is anything wrong?
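As an independent cross-check next to ValidateVariants and PrintBGZFBlockInformation, a small Python sketch like the one below can read a bgzipped file from start to end and confirm that it finishes with the 28-byte empty block that the BGZF format uses as an end-of-file marker. The function names are invented for illustration. The sketch relies on the standard `gzip` module being able to read BGZF files as multi-member gzip, and it checks compression integrity only, not VCF content.

```python
import gzip
import zlib

# 28-byte empty BGZF block appended by bgzip as an end-of-file marker (SAM/BGZF spec).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200" "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker block."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to the end to learn the file size
        if handle.tell() < len(BGZF_EOF):
            return False
        handle.seek(-len(BGZF_EOF), 2)
        return handle.read() == BGZF_EOF


def full_read_check(path, chunk_size=1 << 20):
    """Decompress the whole file; return (ok, uncompressed_bytes, error message)."""
    total = 0
    try:
        with gzip.open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except (OSError, EOFError, zlib.error) as err:
        # OSError covers both gzip format errors and low-level read failures.
        return False, total, str(err)
    return True, total, ""
```

A file that passes both checks here but still fails inside VariantRecalibrator would suggest that the file contents are fine and that the problem lies in how the files are being accessed.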