VariantRecalibrator error
Hello,
I have a multi-sample VCF file (10 WGS human samples, chr1) generated with HaplotypeCaller in GVCF mode, followed by GenomicsDBImport and GenotypeGVCFs. I am trying to run VariantRecalibrator, but I am getting an error that I have not seen before. I followed the recommendations from other posts about similar problems, but none of them covered this error with VariantRecalibrator. I ran "gzcat vcf | head -1", PrintBGZFBlockInformation and ValidateVariants on my VCF file and on the following resource files:
- hapmap_3.3.hg38.vcf.gz
- 1000G_omni2.5.hg38.vcf.gz
- 1000G_phase1.snps.high_confidence.hg38.vcf.gz
- Homo_sapiens_assembly38.dbsnp138.vcf.gz
Everything looks good. The command and the program log are as follows:
REQUIRED for all errors and issues:
a) GATK version used: gatk-4.3.0.0
b) Exact command used:
15:04:03.251 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/beegfs/users/lsanz/Software/gatk.tool/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
15:04:03.515 INFO VariantRecalibrator - ------------------------------------------------------------
15:04:03.515 INFO VariantRecalibrator - The Genome Analysis Toolkit (GATK) v4.3.0.0
15:04:03.516 INFO VariantRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
15:04:03.522 INFO VariantRecalibrator - Executing as x on Linux v4.18.0-425.19.2.el8_7.x86_64 amd64
15:04:03.523 INFO VariantRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_372-b07
15:04:03.524 INFO VariantRecalibrator - Start Date/Time: 7 July 2023 3:04:03 PM
15:04:03.524 INFO VariantRecalibrator - ------------------------------------------------------------
15:04:03.524 INFO VariantRecalibrator - ------------------------------------------------------------
15:04:03.525 INFO VariantRecalibrator - HTSJDK Version: 3.0.1
15:04:03.526 INFO VariantRecalibrator - Picard Version: 2.27.5
15:04:03.526 INFO VariantRecalibrator - Built for Spark Version: 2.4.5
15:04:03.526 INFO VariantRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:04:03.527 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:04:03.527 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:04:03.527 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:04:03.527 INFO VariantRecalibrator - Deflater: IntelDeflater
15:04:03.528 INFO VariantRecalibrator - Inflater: IntelInflater
15:04:03.528 INFO VariantRecalibrator - GCS max retries/reopens: 20
15:04:03.528 INFO VariantRecalibrator - Requester pays: disabled
15:04:03.529 INFO VariantRecalibrator - Initializing engine
15:04:04.030 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/hapmap_3.3.hg38.vcf.gz
15:04:04.198 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/1000G_omni2.5.hg38.vcf.gz
15:04:04.307 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/1000G_phase1.snps.high_confidence.hg38.vcf.gz
15:04:04.418 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/Homo_sapiens_assembly38.dbsnp138.vcf.gz
15:04:04.529 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/n20pilot_chr1.vcf
15:04:04.640 INFO VariantRecalibrator - Done initializing engine
15:04:04.644 INFO TrainingSet - Found hapmap track: Known = false Training = true Truth = true Prior = Q15.0
15:04:04.645 INFO TrainingSet - Found omni track: Known = false Training = true Truth = true Prior = Q12.0
15:04:04.645 INFO TrainingSet - Found 1000G track: Known = false Training = true Truth = false Prior = Q10.0
15:04:04.645 INFO TrainingSet - Found dbsnp track: Known = true Training = false Truth = false Prior = Q7.0
15:04:04.678 INFO ProgressMeter - Starting traversal
15:04:04.678 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
15:04:05.058 INFO VariantRecalibrator - Shutting down engine
[7 July 2023 3:04:05 PM] org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=3087007744
org.broadinstitute.hellbender.exceptions.GATKException: Exception thrown at chr1:1321228 [VC /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/n20pilot_chr1.vcf @ chr1:1321228 Q397.23 of type=SNP alleles=[G*, A] attr={AC=3, AF=0.075, AN=40, BaseQRankSum=0.431, DP=97, ExcessHet=0.3672, FS=0.000, InbreedingCoeff=-0.1138, MLEAC=3, MLEAF=0.075, MQ=60.00, MQRankSum=0.00, QD=22.07, ReadPosRankSum=-2.100e-01, SOR=0.551} GT=GT:AD:DP:GQ:PL 0/0:4,0:4:12:0,12,122 0/0:1,0:1:3:0,3,14 0/0:5,0:5:15:0,15,155 0/1:1,5:6:18:160,0,18 0/0:2,0:2:6:0,6,65 0/0:3,0:3:9:0,9,80 0/0:9,0:9:27:0,27,303 0/1:1,4:5:22:132,0,22 0/0:5,0:5:15:0,15,146 0/0:7,0:7:21:0,21,233 0/0:5,0:5:15:0,15,152 0/0:4,0:4:9:0,9,135 0/0:5,0:5:15:0,15,148 0/0:3,0:3:9:0,9,111 0/0:3,0:3:9:0,9,101 0/0:5,0:5:15:0,15,177 0/1:3,4:7:60:129,0,60 0/0:6,0:6:18:0,18,185 0/0:4,0:4:0:0,0,73 0/0:6,0:6:18:0,18,191 filters=
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:145)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:136)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1095)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: htsjdk.samtools.util.RuntimeIOException: java.io.IOException: Bad address
at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:48)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:170)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.<init>(TabixFeatureReader.java:159)
at htsjdk.tribble.TabixFeatureReader.query(TabixFeatureReader.java:133)
at org.broadinstitute.hellbender.engine.FeatureDataSource.refillQueryCache(FeatureDataSource.java:622)
at org.broadinstitute.hellbender.engine.FeatureDataSource.queryAndPrefetch(FeatureDataSource.java:591)
at org.broadinstitute.hellbender.engine.FeatureManager.getFeatures(FeatureManager.java:363)
at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:173)
at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:125)
at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:240)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantDataManager.parseTrainingSets(VariantDataManager.java:397)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.addDatum(VariantRecalibrator.java:621)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.addVariantDatum(VariantRecalibrator.java:578)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.lambda$consumeQueuedVariants$0(VariantRecalibrator.java:549)
at java.util.ArrayList.forEach(ArrayList.java:1259)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.consumeQueuedVariants(VariantRecalibrator.java:549)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.apply(VariantRecalibrator.java:528)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:139)
... 20 more
Caused by: java.io.IOException: Bad address
at java.io.RandomAccessFile.readBytes(Native Method)
at java.io.RandomAccessFile.read(RandomAccessFile.java:377)
at htsjdk.samtools.seekablestream.SeekableFileStream.read(SeekableFileStream.java:85)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at htsjdk.samtools.seekablestream.SeekableBufferedStream.read(SeekableBufferedStream.java:133)
at htsjdk.samtools.util.BlockCompressedInputStream.readBytes(BlockCompressedInputStream.java:571)
at htsjdk.samtools.util.BlockCompressedInputStream.readBytes(BlockCompressedInputStream.java:560)
at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:510)
at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
at htsjdk.samtools.util.BlockCompressedInputStream.seek(BlockCompressedInputStream.java:382)
at htsjdk.tribble.readers.TabixReader$IteratorImpl.next(TabixReader.java:427)
at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:46)
... 37 more
Using GATK jar /mnt/beegfs/users/lsanz/Software/gatk.tool/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx3g -Xms3g -jar /mnt/beegfs/users/lsanz/Software/gatk.tool/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar VariantRecalibrator -V /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/n20pilot_chr1.vcf --trust-all-polymorphic -tranche 100.0 -tranche 99.95 -tranche 99.9 -tranche 99.8 -tranche 99.6 -tranche 99.5 -tranche 99.4 -tranche 99.3 -tranche 99.0 -tranche 98.0 -tranche 97.0 -tranche 90.0 -an QD -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an SOR -an DP -mode SNP --max-gaussians 6 --resource:hapmap,known=false,training=true,truth=true,prior=15 /mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/hapmap_3.3.hg38.vcf.gz --resource:omni,known=false,training=true,truth=true,prior=12 /mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/1000G_omni2.5.hg38.vcf.gz --resource:1000G,known=false,training=true,truth=false,prior=10 /mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/1000G_phase1.snps.high_confidence.hg38.vcf.gz --resource:dbsnp,known=true,training=false,truth=false,prior=7 /mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26/Homo_sapiens_assembly38.dbsnp138.vcf.gz --tmp-dir /mnt/beegfs/users/lsanz/Projects/tmpdir/ -O /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/cohort20_snps.recal.vcf --tranches-file /mnt/beegfs/bandit/scratch/laura/from_home/bwa/GVCF/cohort20_snps.tranches
I checked the position in the vcf file:
-
Hi Laura Sanz,
Since this exception is coming from TabixIteratorLineReader, I wonder if there's something wrong with one of the indexes. Can you try recreating the index with GATK's IndexFeatureFile? It should be pretty quick.
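For reference, here is a rough sketch of what that re-indexing could look like. It assumes the `gatk` wrapper script is on your PATH and that the four bgzipped resource files sit in the directory shown in your log. The input VCF is uncompressed, so a Tabix read most likely points at one of the .vcf.gz resources, though that is only my reading of the stack trace. `RESOURCE_DIR`, `DRY_RUN` and `reindex` are illustrative names I made up for this example, not anything GATK defines.

```shell
#!/usr/bin/env bash
# Sketch only: rebuild the index of each bgzipped resource VCF with IndexFeatureFile.
# RESOURCE_DIR, DRY_RUN and reindex are hypothetical names, not from the original post.
set -eu

# Directory taken from the paths in the log above; adjust as needed.
RESOURCE_DIR="${RESOURCE_DIR:-/mnt/beegfs/users/lsanz/Projects/mapping/ncbi_dataset/data/GCF_000001405.26}"
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 actually runs them.
DRY_RUN="${DRY_RUN:-1}"

reindex() {
  local vcf="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "gatk IndexFeatureFile -I $vcf"
  else
    gatk IndexFeatureFile -I "$vcf"
  fi
}

for name in \
  hapmap_3.3.hg38.vcf.gz \
  1000G_omni2.5.hg38.vcf.gz \
  1000G_phase1.snps.high_confidence.hg38.vcf.gz \
  Homo_sapiens_assembly38.dbsnp138.vcf.gz
do
  reindex "$RESOURCE_DIR/$name"
done
```

If the printed commands look right, run it again with `DRY_RUN=0` and then retry VariantRecalibrator with the same arguments as before.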