Caused by: java.util.zip.ZipException: Not in GZIP format
Hi, I am trying to merge a set of scattered GVCF files generated by HaplotypeCaller using Picard MergeVcfs, but the run fails with Caused by: java.util.zip.ZipException: Not in GZIP format
When I run the command below, the header looks fine, but the file is evidently not gzipped, since I can cat it directly and read the text (even though it has the .gz extension):
cat file.g.vcf.gz
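A more direct way to confirm this than reading the cat output is to look at the first two bytes of the file: gzip data (including bgzip/BGZF output) starts with the bytes 1f 8b. The snippet below is only a minimal sketch of that check; the function name is_gzip and the demo file names are made up for illustration and are not part of GATK or Picard.

```python
import gzip
import os
import tempfile

# Every gzip stream (bgzip/BGZF included) begins with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


if __name__ == "__main__":
    # Demo on two throwaway files (hypothetical names): one plain-text VCF
    # that merely carries a .gz extension, and one that is really compressed.
    workdir = tempfile.mkdtemp()
    plain = os.path.join(workdir, "plain.g.vcf.gz")
    packed = os.path.join(workdir, "packed.g.vcf.gz")
    with open(plain, "wb") as handle:
        handle.write(b"##fileformat=VCFv4.2\n")
    with open(packed, "wb") as handle:
        handle.write(gzip.compress(b"##fileformat=VCFv4.2\n"))
    for path in (plain, packed):
        print(os.path.basename(path), is_gzip(path))
```

A file for which this returns False is plain text regardless of its extension.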
Here's the full stack trace
04:12:01.153 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/gitc/picard.jar!/com/intel/gkl/native/libgkl_compression.so [Wed Jun 16 04:12:01 UTC 2021] MergeVcfs INPUT=[/io/batch/1d0a29/inputs/zWl8g/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_10scattered.g.vcf.gz, /io/batch/1d0a29/inputs/eZmhB/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_11scattered.g.vcf.gz, /io/batch/1d0a29/inputs/mixos/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_12scattered.g.vcf.gz, /io/batch/1d0a29/inputs/XyL3V/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_13scattered.g.vcf.gz, /io/batch/1d0a29/inputs/WKN0l/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_14scattered.g.vcf.gz, /io/batch/1d0a29/inputs/O8dCZ/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_15scattered.g.vcf.gz, /io/batch/1d0a29/inputs/tsqqu/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_16scattered.g.vcf.gz, /io/batch/1d0a29/inputs/P2fu4/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_17scattered.g.vcf.gz, /io/batch/1d0a29/inputs/4cgYv/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_18scattered.g.vcf.gz, /io/batch/1d0a29/inputs/CjEZA/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_19scattered.g.vcf.gz, /io/batch/1d0a29/inputs/eZxWU/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_1scattered.g.vcf.gz, /io/batch/1d0a29/inputs/UO18W/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_20scattered.g.vcf.gz, /io/batch/1d0a29/inputs/IsGAN/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_21scattered.g.vcf.gz, /io/batch/1d0a29/inputs/6I4Jo/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_22scattered.g.vcf.gz, /io/batch/1d0a29/inputs/0MX9c/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_23scattered.g.vcf.gz, 
/io/batch/1d0a29/inputs/rZXJM/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_24scattered.g.vcf.gz, /io/batch/1d0a29/inputs/YOzTA/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_25scattered.g.vcf.gz, /io/batch/1d0a29/inputs/6I41q/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_26scattered.g.vcf.gz, /io/batch/1d0a29/inputs/pH9XX/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_27scattered.g.vcf.gz, /io/batch/1d0a29/inputs/7GT0m/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_28scattered.g.vcf.gz, /io/batch/1d0a29/inputs/dYObh/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_29scattered.g.vcf.gz, /io/batch/1d0a29/inputs/VSHWZ/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_2scattered.g.vcf.gz, /io/batch/1d0a29/inputs/0SZZP/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_30scattered.g.vcf.gz, /io/batch/1d0a29/inputs/sEyUP/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_31scattered.g.vcf.gz, /io/batch/1d0a29/inputs/eeCDa/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_32scattered.g.vcf.gz, /io/batch/1d0a29/inputs/dgvUR/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_33scattered.g.vcf.gz, /io/batch/1d0a29/inputs/Z9qhc/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_34scattered.g.vcf.gz, /io/batch/1d0a29/inputs/ZtZ4r/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_35scattered.g.vcf.gz, /io/batch/1d0a29/inputs/Hn2AR/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_36scattered.g.vcf.gz, /io/batch/1d0a29/inputs/ydwjH/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_37scattered.g.vcf.gz, /io/batch/1d0a29/inputs/qrQfZ/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_38scattered.g.vcf.gz, /io/batch/1d0a29/inputs/rGVqz/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_39scattered.g.vcf.gz, 
/io/batch/1d0a29/inputs/XaUHE/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_3scattered.g.vcf.gz, /io/batch/1d0a29/inputs/ef7sM/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_40scattered.g.vcf.gz, /io/batch/1d0a29/inputs/H4LOi/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_41scattered.g.vcf.gz, /io/batch/1d0a29/inputs/knAVw/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_42scattered.g.vcf.gz, /io/batch/1d0a29/inputs/BdJT3/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_43scattered.g.vcf.gz, /io/batch/1d0a29/inputs/4U4HT/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_44scattered.g.vcf.gz, /io/batch/1d0a29/inputs/vdMUw/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_45scattered.g.vcf.gz, /io/batch/1d0a29/inputs/aSRUq/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_46scattered.g.vcf.gz, /io/batch/1d0a29/inputs/QBAeT/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_47scattered.g.vcf.gz, /io/batch/1d0a29/inputs/wprQc/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_48scattered.g.vcf.gz, /io/batch/1d0a29/inputs/6ryKW/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_49scattered.g.vcf.gz, /io/batch/1d0a29/inputs/UR6zm/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_4scattered.g.vcf.gz, /io/batch/1d0a29/inputs/vcWDz/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_50scattered.g.vcf.gz, /io/batch/1d0a29/inputs/5Iwwt/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_5scattered.g.vcf.gz, /io/batch/1d0a29/inputs/sMTDW/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_6scattered.g.vcf.gz, /io/batch/1d0a29/inputs/MCh9G/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_7scattered.g.vcf.gz, /io/batch/1d0a29/inputs/fUtSw/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_8scattered.g.vcf.gz, 
/io/batch/1d0a29/inputs/CV4Vb/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_9scattered.g.vcf.gz] OUTPUT=SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov.g.vcf.gz VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=true CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Wed Jun 16 04:12:01 UTC 2021] Executing as root@a43602a413f7 on Linux 5.4.0-1042-gcp amd64; OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1~deb9u1-b10; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.20.4-SNAPSHOT
[Wed Jun 16 04:12:01 UTC 2021] picard.vcf.MergeVcfs done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=2027290624
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: Not in GZIP format, for input source: file:///io/batch/1d0a29/inputs/zWl8g/SC_GMFUL5306375.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov_10scattered.g.vcf.gz
    at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
    at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
    at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
    at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:120)
    at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:80)
    at htsjdk.variant.vcf.VCFFileReader.<init>(VCFFileReader.java:140)
    at htsjdk.variant.vcf.VCFFileReader.<init>(VCFFileReader.java:92)
    at picard.vcf.MergeVcfs.doWork(MergeVcfs.java:174)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
Caused by: java.util.zip.ZipException: Not in GZIP format
    at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:165)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91)
    at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:257)
    ... 10 more
Hi Lindo Nkambule,
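If the file is not actually gzipped, rename it so that it no longer has the .gz extension. The stack trace shows the reader opening the file as a gzip stream, which is where the "Not in GZIP format" error comes from. Once it is named as a plain .g.vcf, GATK should read it without it being zipped!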
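With many scattered inputs, the rename can be scripted. The following is a rough sketch only, assuming the files are plain-text VCFs that merely carry a .gz suffix; the helper name fix_misnamed_gz and the demo file name are hypothetical and not part of GATK or Picard. It leaves genuinely compressed files alone and refuses to overwrite an existing file.

```python
import gzip
import os
import tempfile


def fix_misnamed_gz(path):
    """Drop a trailing .gz from `path` if the content is not gzip data.

    Returns the path the file has after the call.
    """
    if not path.endswith(".gz"):
        return path
    with open(path, "rb") as handle:
        if handle.read(2) == b"\x1f\x8b":
            return path  # really compressed: keep the name as it is
    target = path[: -len(".gz")]
    if os.path.exists(target):
        # Do not silently clobber another file with the same base name.
        raise FileExistsError(target)
    os.rename(path, target)
    return target


if __name__ == "__main__":
    # Demo on a throwaway plain-text file with a misleading extension.
    workdir = tempfile.mkdtemp()
    misnamed = os.path.join(workdir, "shard_1.g.vcf.gz")
    with open(misnamed, "wb") as handle:
        handle.write(b"##fileformat=VCFv4.2\n")
    print(os.path.basename(fix_misnamed_gz(misnamed)))
```

The returned paths are the ones to pass to MergeVcfs afterwards.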
Best,
Genevieve