Error initializing feature reader for path somatic-hg38_1000g_pon.hg38.vcf.gz
If you are seeing an error, please provide (REQUIRED):
a) GATK version used:
gatk/4.0.11.0
b) Exact command used:
module load gatk/4.0.11.0
gatk Mutect2 -R /PathToFolder/WholeGenomeFasta/genome.fa \
-I /PathToFolder/MOLM13_2R_pe.sorted.bam \
-I /PathToFolder/MOLM13_WT_pe.sorted.bam \
-intervals /PathToFolder/20200805_MOLM13_ER_WES/chr_int/chr1_int.bed \
-tumor MOLM13_2R \
-normal MOLM13_WT \
-O /PathToFolder/MOLM13_2R_sommut_chr1.vcf.gz \
-bamout /PathToFolder/MOLM13_2R_sommut_chr1.bam
c) Entire error log:
Using GATK jar /opt/applications/gatk/4.0.11.0/gatk-package-4.0.11.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /opt/applications/gatk/4.0.11.0/gatk-package-4.0.11.0-local.jar Mutect2 -R /gpfs/home/michaelerb/genomes/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -I /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/bamFolder/MOLM13_2R_pe.sorted.bam -I /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/bamFolder/MOLM13_WT_pe.sorted.bam -intervals /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/chr_int/chr1_int.bed -tumor MOLM13_2R -normal MOLM13_WT --panel-of-normals /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz -O /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/MOLM13_2R_sommut_chr1.vcf.gz -bamout /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/MOLM13_2R_sommut_chr1.bam
13:33:12.474 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/applications/gatk/4.0.11.0/gatk-package-4.0.11.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
13:33:14.611 INFO Mutect2 - ------------------------------------------------------------
13:33:14.611 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.0.11.0
13:33:14.611 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
13:33:14.612 INFO Mutect2 - Executing as tbishop@emb0703.cluster.net on Linux v3.10.0-1127.10.1.el7.x86_64 amd64
13:33:14.612 INFO Mutect2 - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_261-b12
13:33:14.612 INFO Mutect2 - Start Date/Time: February 11, 2021 1:33:12 PM PST
13:33:14.612 INFO Mutect2 - ------------------------------------------------------------
13:33:14.612 INFO Mutect2 - ------------------------------------------------------------
13:33:14.613 INFO Mutect2 - HTSJDK Version: 2.16.1
13:33:14.613 INFO Mutect2 - Picard Version: 2.18.13
13:33:14.613 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:33:14.613 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:33:14.613 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:33:14.613 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:33:14.613 INFO Mutect2 - Deflater: IntelDeflater
13:33:14.613 INFO Mutect2 - Inflater: IntelInflater
13:33:14.614 INFO Mutect2 - GCS max retries/reopens: 20
13:33:14.614 INFO Mutect2 - Requester pays: disabled
13:33:14.614 INFO Mutect2 - Initializing engine
13:33:15.163 INFO FeatureManager - Using codec VCFCodec to read file file:///gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz
13:33:15.455 INFO Mutect2 - Shutting down engine
[February 11, 2021 1:33:15 PM PST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.05 minutes.
Runtime.totalMemory()=2041511936
org.broadinstitute.hellbender.exceptions.GATKException: Error initializing feature reader for path /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:348)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:301)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:255)
at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:228)
at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:208)
at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:155)
at org.broadinstitute.hellbender.engine.GATKTool.initializeFeatures(GATKTool.java:417)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:638)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:156)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz has invalid uncompressedLength: -2141253336, for input source: file:///gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz
at htsjdk.tribble.TabixFeatureReader.readHeader(TabixFeatureReader.java:97)
at htsjdk.tribble.TabixFeatureReader.<init>(TabixFeatureReader.java:82)
at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:109)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:345)
... 14 more
Caused by: htsjdk.samtools.util.RuntimeIOException: /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz has invalid uncompressedLength: -2141253336
at htsjdk.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:543)
at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:532)
at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:331)
at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:257)
at htsjdk.tribble.readers.PositionalBufferedStream.fill(PositionalBufferedStream.java:132)
at htsjdk.tribble.readers.PositionalBufferedStream.read(PositionalBufferedStream.java:84)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at htsjdk.tribble.readers.LongLineBufferedReader.fill(LongLineBufferedReader.java:140)
at htsjdk.tribble.readers.LongLineBufferedReader.readLine(LongLineBufferedReader.java:300)
at htsjdk.tribble.readers.LongLineBufferedReader.readLine(LongLineBufferedReader.java:356)
at htsjdk.tribble.readers.SynchronousLineReader.readLine(SynchronousLineReader.java:51)
at htsjdk.tribble.readers.LineIteratorImpl.advance(LineIteratorImpl.java:24)
at htsjdk.tribble.readers.LineIteratorImpl.advance(LineIteratorImpl.java:11)
at htsjdk.samtools.util.AbstractIterator.hasNext(AbstractIterator.java:44)
at htsjdk.variant.vcf.VCFCodec.readActualHeader(VCFCodec.java:89)
at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:79)
at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:37)
at htsjdk.tribble.TabixFeatureReader.readHeader(TabixFeatureReader.java:95)
... 17 more
As you can see, I'm trying to run Mutect2 with paired tumor and normal samples. I downloaded the 1000g panel of normals file from https://console.cloud.google.com/storage/browser/gatk-best-practices/somatic-hg38. The command runs fine without the panel of normals, but I would prefer to use it. I have confirmed that all the file paths are correct, so I'm not sure what is happening. Any thoughts?
Best,
Tim
-
Hi Tim, you are using a very old version of GATK, which we no longer support. Could you try running with the latest version, 4.1.9.0?
-
Hi Bhanu,
I ran the same code with version 4.1.9.0 and received the same error message.
-
Can you please post the new error log?
-
Using GATK jar /opt/applications/gatk/4.1.9.0/gatk-package-4.1.9.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /opt/applications/gatk/4.1.9.0/gatk-package-4.1.9.0-local.jar Mutect2 -R /gpfs/home/michaelerb/genomes/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -I /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/bamFolder/MOLM13_2R_pe.sorted.bam -I /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/bamFolder/MOLM13_WT_pe.sorted.bam -intervals /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/chr_int/chr2_int.bed -tumor MOLM13_2R -normal MOLM13_WT --panel-of-normals /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz -O /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/MOLM13_2R_sommut_chr2.vcf.gz -bamout /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/MOLM13_2R_sommut_chr2.bam
10:02:09.935 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/applications/gatk/4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Feb 12, 2021 10:02:10 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
10:02:10.542 INFO Mutect2 - ------------------------------------------------------------
10:02:10.542 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.1.9.0
10:02:10.542 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
10:02:10.543 INFO Mutect2 - Executing as tbishop@emb0725.cluster.net on Linux v3.10.0-1127.10.1.el7.x86_64 amd64
10:02:10.543 INFO Mutect2 - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_261-b12
10:02:10.543 INFO Mutect2 - Start Date/Time: February 12, 2021 10:02:09 AM PST
10:02:10.543 INFO Mutect2 - ------------------------------------------------------------
10:02:10.543 INFO Mutect2 - ------------------------------------------------------------
10:02:10.544 INFO Mutect2 - HTSJDK Version: 2.23.0
10:02:10.544 INFO Mutect2 - Picard Version: 2.23.3
10:02:10.544 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
10:02:10.544 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:02:10.544 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:02:10.544 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:02:10.544 INFO Mutect2 - Deflater: IntelDeflater
10:02:10.544 INFO Mutect2 - Inflater: IntelInflater
10:02:10.544 INFO Mutect2 - GCS max retries/reopens: 20
10:02:10.545 INFO Mutect2 - Requester pays: disabled
10:02:10.545 INFO Mutect2 - Initializing engine
10:02:11.356 INFO FeatureManager - Using codec VCFCodec to read file file:///gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz
10:02:11.635 INFO Mutect2 - Shutting down engine
[February 12, 2021 10:02:11 AM PST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=2041511936
org.broadinstitute.hellbender.exceptions.GATKException: Error initializing feature reader for path /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:383)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:335)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:282)
at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:246)
at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:209)
at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:156)
at org.broadinstitute.hellbender.engine.GATKTool.initializeFeatures(GATKTool.java:488)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:709)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:79)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz has invalid uncompressedLength: -2141253336, for input source: /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz
at htsjdk.tribble.TabixFeatureReader.readHeader(TabixFeatureReader.java:97)
at htsjdk.tribble.TabixFeatureReader.<init>(TabixFeatureReader.java:82)
at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:117)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:380)
... 14 more
Caused by: htsjdk.samtools.util.RuntimeIOException: /gpfs/home/tbishop/fastq/20200805_MOLM13_ER_WES/somatic-hg38_1000g_pon.hg38.vcf.gz has invalid uncompressedLength: -2141253336
at htsjdk.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:543)
at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:532)
at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:331)
at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:257)
at htsjdk.tribble.readers.PositionalBufferedStream.fill(PositionalBufferedStream.java:132)
at htsjdk.tribble.readers.PositionalBufferedStream.read(PositionalBufferedStream.java:84)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at htsjdk.tribble.readers.LongLineBufferedReader.fill(LongLineBufferedReader.java:140)
at htsjdk.tribble.readers.LongLineBufferedReader.readLine(LongLineBufferedReader.java:300)
at htsjdk.tribble.readers.LongLineBufferedReader.readLine(LongLineBufferedReader.java:356)
at htsjdk.tribble.readers.SynchronousLineReader.readLine(SynchronousLineReader.java:51)
at htsjdk.tribble.readers.LineIteratorImpl.advance(LineIteratorImpl.java:24)
at htsjdk.tribble.readers.LineIteratorImpl.advance(LineIteratorImpl.java:11)
at htsjdk.samtools.util.AbstractIterator.hasNext(AbstractIterator.java:44)
at htsjdk.variant.vcf.VCFCodec.readActualHeader(VCFCodec.java:89)
at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:79)
at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:37)
at htsjdk.tribble.TabixFeatureReader.readHeader(TabixFeatureReader.java:95)
... 17 more
-
It looks like the somatic-hg38_1000g_pon.hg38.vcf.gz file is malformed. How was this file generated? Can you run ValidateVariants, following the directions provided here, to help identify the cause of the error? Can you also post the header of the VCF?
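A quick way to test the "malformed" hypothesis is to look at the first bytes of the file. The sketch below is not from the thread: it assumes a POSIX-style shell with head, od and tr available, and the function name compression_kind is made up for illustration. It relies on the BGZF block layout (gzip magic bytes 1f 8b 08 04, with a "BC" extra subfield at bytes 12-13) and only recognizes the common case where that subfield comes first, as bgzip writes it.

```shell
# Classify a file by its first 16 bytes: prints "bgzf", "gzip", or "other".
# Assumption: the BGZF "BC" extra subfield is the first extra subfield.
compression_kind() {
    # Hex-dump the first 16 bytes as one string with no whitespace.
    head_hex=$(head -c 16 "$1" | od -An -v -tx1 | tr -d '[:space:]')
    case "$head_hex" in
        # 1f 8b 08 04, then 8 bytes (16 hex digits) we skip, then 'B' 'C'.
        1f8b0804????????????????4243*) echo bgzf ;;
        # Any other gzip stream (magic bytes only).
        1f8b*) echo gzip ;;
        *) echo other ;;
    esac
}

# Example (path is the file from this thread):
#   compression_kind somatic-hg38_1000g_pon.hg38.vcf.gz
# To look at the VCF header of a gzip-readable file:
#   gzip -dc somatic-hg38_1000g_pon.hg38.vcf.gz | head -n 20
```

If this prints gzip or other for the panel of normals, the file is not block-compressed, which might explain why the reader fails on it. Other kinds of corruption, such as a truncated download, are also possible and would not be caught by this check.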
-
Running into the same issue. Does anyone have a solution yet?
-
I believe I have found the solution.
First I did what Field did: I renamed the file to vcf.gz and then extracted the VCF file. Next, I installed 'tabix' in order to recompress the VCF file using bgzip instead of gzip:
$ sudo apt install tabix
$ bgzip somatic-hg38_1000g_pon.hg38.vcf
Finally, I used bcftools to re-index the compressed VCF file using the -t argument:
$ bcftools index -t somatic-hg38_1000g_pon.hg38.vcf.gz
Currently running Mutect2. Hopefully this will work.
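For anyone who wants the steps above in one place, here is a minimal sketch that wraps them in a shell function. The function name rebgzip_vcf and the two-argument interface are made up for illustration. It assumes gzip, bgzip (htslib) and bcftools are on the PATH, and that the downloaded file can be read by gzip. It writes to a new file rather than overwriting the download.

```shell
# Rebuild a VCF as BGZF and create a tabix-style (.tbi) index for it.
# Requires gzip, bgzip (htslib) and bcftools on PATH.
rebgzip_vcf() {
    in=$1     # downloaded file, e.g. somatic-hg38_1000g_pon.hg38.vcf.gz
    out=$2    # new file; must end in .vcf.gz and differ from the input

    [ -f "$in" ] || { echo "no such file: $in" >&2; return 1; }
    [ "$in" != "$out" ] || { echo "output must differ from input" >&2; return 1; }
    case "$out" in
        *.vcf.gz) ;;
        *) echo "output name must end in .vcf.gz" >&2; return 1 ;;
    esac

    vcf=${out%.gz}                       # intermediate uncompressed VCF
    gzip -dc "$in" > "$vcf" || return 1  # extract (gzip or BGZF input)
    bgzip -f "$vcf" || return 1          # writes "$out" and removes "$vcf"
    bcftools index -t -f "$out"          # writes "$out.tbi"
}

# Example (output name is hypothetical):
#   rebgzip_vcf somatic-hg38_1000g_pon.hg38.vcf.gz pon.rebgzip.vcf.gz
```

The resulting file can then be passed to Mutect2 with --panel-of-normals. The intermediate uncompressed VCF needs enough free disk space while the function runs.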
-
Thanks for posting your solution, elhadi iich!