FixVcfHeader error: A reference dictionary is required for creating Tribble indices on the fly
Hello GATK team,
I'm getting an error when using FixVcfHeader. The error complains about a reference dictionary which is required for creating Tribble indices on the fly.
Could you please tell me how I can create the missing reference dictionary?
I have also run the command below with a tabix-indexed input vcf.gz file, and the same error occurs.
Thanks
Jorge
Can you please provide
a) GATK version used:
4.1.7.0
b) Exact GATK commands used
gatk FixVcfHeader --INPUT SNVs.vcf --OUTPUT SNVs.fixed.vcf
c) The entire error log if applicable.
INFO: Failed to detect whether we are running on Google Compute Engine.
[Mon Jul 20 20:41:26 CEST 2020] Executing as uscmgjzb@c6601 on Linux 3.10.0-862.14.4.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.7.0
INFO 2020-07-20 20:41:26 FixVcfHeader Reading in records to re-build the header.
INFO 2020-07-20 20:41:26 FixVcfHeader Will add an INFO line with id: set
INFO 2020-07-20 20:41:26 FixVcfHeader VCF header re-built.
[Mon Jul 20 20:41:26 CEST 2020] picard.vcf.FixVcfHeader done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2039545856
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
java.lang.IllegalArgumentException: A reference dictionary is required for creating Tribble indices on the fly
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:467)
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:415)
at picard.vcf.FixVcfHeader.doWork(FixVcfHeader.java:197)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:25)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
-
What command did you run in this instance? "I have also run the command below with a tabix indexed input vcf.gz file and the same error occurs."
You can use IndexFeatureFile to create an index for your SNVs.vcf. If the index is located in the same directory as the VCF, that may fix the issue.
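A quick way to confirm that an index sits next to the VCF is sketched below. The function name has_index is hypothetical, and the check assumes the usual naming: a .idx file next to a plain-text VCF (as IndexFeatureFile reports later in this thread) or a .tbi file next to a bgzipped one.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK): succeed if an index file with the
# conventional suffix exists alongside the given VCF.
has_index() {
    local vcf="$1"
    # .idx is the Tribble index for plain VCF; .tbi is the tabix index for .vcf.gz
    [ -f "${vcf}.idx" ] || [ -f "${vcf}.tbi" ]
}
```

For example, `has_index SNVs.vcf && echo "index found"` prints the message only when SNVs.vcf.idx or SNVs.vcf.tbi is present.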
-
Hi,
Both (A) the usual sort, bgzip and tabix sequence followed by FixVcfHeader and (B) IndexFeatureFile followed by FixVcfHeader fail with the error 'A reference dictionary is required for creating Tribble indices on the fly'.
A:
(grep ^"#" SNVs.vcf; grep -v ^"#" SNVs.vcf | sort -k1,1 -k2,2n) > SNVs.sorted.vcf
bgzip SNVs.sorted.vcf
tabix -p vcf SNVs.sorted.vcf.gz
gatk FixVcfHeader --INPUT SNVs.sorted.vcf.gz --OUTPUT SNVs.sorted.fixed.vcf
B:
gatk IndexFeatureFile -I SNVs.vcf
gatk FixVcfHeader --INPUT SNVs.vcf --OUTPUT SNVs.fixed.vcf
Any help will be appreciated.
Jorge
-
For option B, does gatk IndexFeatureFile -I SNVs.vcf run without any errors and create the index file?
-
Hi,
Yes, the index was created successfully with IndexFeatureFile.
-
Is this the entire error log?
-
Hi,
This is the log for IndexFeatureFile (A) and FixVcfHeader (B)
A: IndexFeatureFile
Using GATK jar /mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar IndexFeatureFile -I SNVs.vcf
12:17:41.941 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 21, 2020 12:17:42 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:17:42.333 INFO IndexFeatureFile - ------------------------------------------------------------
12:17:42.333 INFO IndexFeatureFile - The Genome Analysis Toolkit (GATK) v4.1.7.0
12:17:42.333 INFO IndexFeatureFile - For support and documentation go to https://software.broadinstitute.org/gatk/
12:17:42.334 INFO IndexFeatureFile - Executing as uscmgjzb@c6601 on Linux v3.10.0-862.14.4.el7.x86_64 amd64
12:17:42.334 INFO IndexFeatureFile - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13
12:17:42.334 INFO IndexFeatureFile - Start Date/Time: July 21, 2020 12:17:41 PM CEST
12:17:42.334 INFO IndexFeatureFile - ------------------------------------------------------------
12:17:42.334 INFO IndexFeatureFile - ------------------------------------------------------------
12:17:42.334 INFO IndexFeatureFile - HTSJDK Version: 2.21.2
12:17:42.334 INFO IndexFeatureFile - Picard Version: 2.21.9
12:17:42.334 INFO IndexFeatureFile - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:17:42.334 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:17:42.335 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:17:42.335 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:17:42.335 INFO IndexFeatureFile - Deflater: IntelDeflater
12:17:42.335 INFO IndexFeatureFile - Inflater: IntelInflater
12:17:42.335 INFO IndexFeatureFile - GCS max retries/reopens: 20
12:17:42.335 INFO IndexFeatureFile - Requester pays: disabled
12:17:42.335 INFO IndexFeatureFile - Initializing engine
12:17:42.335 INFO IndexFeatureFile - Done initializing engine
12:17:42.624 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/lustre/scratch/home/usc/mg/jzb/GERP_LIFTOVER/HG38/GRD96/2_VariantCalling/SNVs.vcf
12:17:42.640 INFO ProgressMeter - Starting traversal
12:17:42.653 INFO ProgressMeter - Current Locus Elapsed Minutes Records Processed Records/Minute
12:17:42.968 INFO ProgressMeter - chr7:151781516 0.0 2118 404713.4
12:17:42.968 INFO ProgressMeter - Traversal complete. Processed 2118 total records in 0.0 minutes.
12:17:43.144 INFO IndexFeatureFile - Successfully wrote index to /mnt/lustre/scratch/home/usc/mg/jzb/GERP_LIFTOVER/HG38/GRD96/2_VariantCalling/SNVs.vcf.idx
12:17:43.144 INFO IndexFeatureFile - Shutting down engine
[July 21, 2020 12:17:43 PM CEST] org.broadinstitute.hellbender.tools.IndexFeatureFile done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=2039545856
Tool returned:
/mnt/lustre/scratch/home/usc/mg/jzb/GERP_LIFTOVER/HG38/GRD96/2_VariantCalling/SNVs.vcf.idx
B: FixVcfHeader
Using GATK jar /mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar FixVcfHeader --INPUT SNVs.vcf --OUTPUT SNVs.fixed.vcf
12:16:19.539 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Tue Jul 21 12:16:19 CEST 2020] FixVcfHeader --INPUT SNVs.vcf --OUTPUT SNVs.fixed.vcf --CHECK_FIRST_N_RECORDS -1 --ENFORCE_SAME_SAMPLES true --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
Jul 21, 2020 12:16:19 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
[Tue Jul 21 12:16:19 CEST 2020] Executing as uscmgjzb@c6601 on Linux 3.10.0-862.14.4.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.7.0
INFO 2020-07-21 12:16:20 FixVcfHeader Reading in records to re-build the header.
INFO 2020-07-21 12:16:20 FixVcfHeader Will add an INFO line with id: set
INFO 2020-07-21 12:16:20 FixVcfHeader VCF header re-built.
[Tue Jul 21 12:16:20 CEST 2020] picard.vcf.FixVcfHeader done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2039545856
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
java.lang.IllegalArgumentException: A reference dictionary is required for creating Tribble indices on the fly
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:467)
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:415)
at picard.vcf.FixVcfHeader.doWork(FixVcfHeader.java:197)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:25)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
-
Hi again,
I did a bit of testing and confirmed that FixVcfHeader works on VCF files produced by GATK and SAMtools, but it fails on VCF files produced by VarScan or Platypus. This is the behaviour I see, at least with my VCF files.
Jorge
-
Hi jorgez, were you able to get it to work?
-
Hi,
Not on vcf files produced by either Platypus or VarScan.
Jorge
-
There is most likely a problem with how the VCFs are formatted with those different tools. Could you run ValidateVariants on your other VCF files to see if there are issues with the format?
Could you also send the headers of these 3 examples, and some example variants?
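The checks requested above could be gathered roughly as follows. This is only a sketch: the function names and the reference path are placeholders, and the input is assumed to be a plain-text VCF. The ##contig count is included because htsjdk derives a VCF's sequence dictionary from those header lines; whether a missing dictionary explains the error here is not confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for collecting the requested information.

# Run GATK's ValidateVariants on a VCF against a reference FASTA.
# Both paths are placeholders supplied by the caller.
validate_vcf() {
    gatk ValidateVariants -V "$1" -R "$2"
}

# Print the full header plus the first N records (default 5) of a plain-text VCF.
vcf_excerpt() {
    local vcf="$1" n="${2:-5}"
    grep '^#' "$vcf"
    grep -v '^#' "$vcf" | head -n "$n"
}

# Count the ##contig lines in the header; prints 0 when there are none.
count_contig_lines() {
    grep -c '^##contig' "$1" || true
}
```

For instance, `vcf_excerpt SNVs.vcf 5 > SNVs.excerpt.txt` writes the header and five variants to a file that can be attached to a reply, and `count_contig_lines SNVs.vcf` can be compared between a VCF that works and one that fails.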