Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

FixVcfHeader error: A reference dictionary is required for creating Tribble indices on the fly

10 comments

  • Genevieve Brandt (she/her)

    What command did you run in this instance? You wrote: "I have also run the command below with a tabix indexed input vcf.gz file and the same error occurs."

    You can use IndexFeatureFile to create an index for your SNVs.vcf. If the index is located in the same directory as the VCF, that may fix this issue.

  • jorgez

    Hi,

    Both (A) the usual sort, bgzip and tabix sequence followed by FixVcfHeader and (B) IndexFeatureFile followed by FixVcfHeader fail with the error 'A reference dictionary is required for creating Tribble indices on the fly'.

    A:

    (grep ^"#" SNVs.vcf; grep -v ^"#" SNVs.vcf | sort -k1,1 -k2,2n) > SNVs.sorted.vcf
    bgzip SNVs.sorted.vcf
    tabix -p vcf SNVs.sorted.vcf.gz

    gatk FixVcfHeader --INPUT SNVs.sorted.vcf.gz --OUTPUT SNVs.sorted.fixed.vcf

    B:

    gatk IndexFeatureFile -I SNVs.vcf
    gatk FixVcfHeader --INPUT SNVs.vcf --OUTPUT SNVs.fixed.vcf

    Any help will be appreciated

    Jorge

  • Genevieve Brandt (she/her)

    For version (B), does gatk IndexFeatureFile -I SNVs.vcf run without any errors and create the index file?

  • jorgez

    Hi,

    Yes, the index was created successfully with IndexFeatureFile.

  • Genevieve Brandt (she/her)

    Is this the entire error log?

  • jorgez

    Hi,

    These are the logs for IndexFeatureFile (A) and FixVcfHeader (B):

    A: IndexFeatureFile

    Using GATK jar /mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar IndexFeatureFile -I SNVs.vcf
    12:17:41.941 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Jul 21, 2020 12:17:42 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    12:17:42.333 INFO  IndexFeatureFile - ------------------------------------------------------------
    12:17:42.333 INFO  IndexFeatureFile - The Genome Analysis Toolkit (GATK) v4.1.7.0
    12:17:42.333 INFO  IndexFeatureFile - For support and documentation go to https://software.broadinstitute.org/gatk/
    12:17:42.334 INFO  IndexFeatureFile - Executing as uscmgjzb@c6601 on Linux v3.10.0-862.14.4.el7.x86_64 amd64
    12:17:42.334 INFO  IndexFeatureFile - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13
    12:17:42.334 INFO  IndexFeatureFile - Start Date/Time: July 21, 2020 12:17:41 PM CEST
    12:17:42.334 INFO  IndexFeatureFile - ------------------------------------------------------------
    12:17:42.334 INFO  IndexFeatureFile - ------------------------------------------------------------
    12:17:42.334 INFO  IndexFeatureFile - HTSJDK Version: 2.21.2
    12:17:42.334 INFO  IndexFeatureFile - Picard Version: 2.21.9
    12:17:42.334 INFO  IndexFeatureFile - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    12:17:42.334 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    12:17:42.335 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    12:17:42.335 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    12:17:42.335 INFO  IndexFeatureFile - Deflater: IntelDeflater
    12:17:42.335 INFO  IndexFeatureFile - Inflater: IntelInflater
    12:17:42.335 INFO  IndexFeatureFile - GCS max retries/reopens: 20
    12:17:42.335 INFO  IndexFeatureFile - Requester pays: disabled
    12:17:42.335 INFO  IndexFeatureFile - Initializing engine
    12:17:42.335 INFO  IndexFeatureFile - Done initializing engine
    12:17:42.624 INFO  FeatureManager - Using codec VCFCodec to read file file:///mnt/lustre/scratch/home/usc/mg/jzb/GERP_LIFTOVER/HG38/GRD96/2_VariantCalling/SNVs.vcf
    12:17:42.640 INFO  ProgressMeter - Starting traversal
    12:17:42.653 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Records Processed   Records/Minute
    12:17:42.968 INFO  ProgressMeter -       chr7:151781516              0.0                  2118         404713.4
    12:17:42.968 INFO  ProgressMeter - Traversal complete. Processed 2118 total records in 0.0 minutes.
    12:17:43.144 INFO  IndexFeatureFile - Successfully wrote index to /mnt/lustre/scratch/home/usc/mg/jzb/GERP_LIFTOVER/HG38/GRD96/2_VariantCalling/SNVs.vcf.idx
    12:17:43.144 INFO  IndexFeatureFile - Shutting down engine
    [July 21, 2020 12:17:43 PM CEST] org.broadinstitute.hellbender.tools.IndexFeatureFile done. Elapsed time: 0.02 minutes.
    Runtime.totalMemory()=2039545856
    Tool returned:
    /mnt/lustre/scratch/home/usc/mg/jzb/GERP_LIFTOVER/HG38/GRD96/2_VariantCalling/SNVs.vcf.idx

    B: FixVcfHeader

    Using GATK jar /mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar FixVcfHeader --INPUT SNVs.vcf --OUTPUT SNVs.fixed.vcf
    12:16:19.539 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/netapp1/Optcesga_FT2_RHEL7/easybuild-cesga/software/Core/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    [Tue Jul 21 12:16:19 CEST 2020] FixVcfHeader  --INPUT SNVs.vcf --OUTPUT SNVs.fixed.vcf  --CHECK_FIRST_N_RECORDS -1 --ENFORCE_SAME_SAMPLES true --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
    Jul 21, 2020 12:16:19 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    [Tue Jul 21 12:16:19 CEST 2020] Executing as uscmgjzb@c6601 on Linux 3.10.0-862.14.4.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.7.0
    INFO 2020-07-21 12:16:20 FixVcfHeader Reading in records to re-build the header.
    INFO 2020-07-21 12:16:20 FixVcfHeader Will add an INFO line with id: set
    INFO 2020-07-21 12:16:20 FixVcfHeader VCF header re-built.
    [Tue Jul 21 12:16:20 CEST 2020] picard.vcf.FixVcfHeader done. Elapsed time: 0.01 minutes.
    Runtime.totalMemory()=2039545856
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    java.lang.IllegalArgumentException: A reference dictionary is required for creating Tribble indices on the fly
        at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:467)
        at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:415)
        at picard.vcf.FixVcfHeader.doWork(FixVcfHeader.java:197)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
        at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:25)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
        at org.broadinstitute.hellbender.Main.main(Main.java:292)
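
The trace shows the exception coming from VariantContextWriterBuilder.build, after FixVcfHeader had already logged "VCF header re-built.", so the failure appears to happen while the output writer is being set up rather than while the input is read. The message says the writer needed a reference (sequence) dictionary in order to index the output as it is written. In a VCF, contig information is normally carried by the ##contig header lines, so one thing worth checking is whether the failing file has any. This is an assumption and is not confirmed anywhere in this thread. A minimal sketch, where count_contig_lines is a hypothetical helper name:

```shell
# Count the ##contig header lines in a plain or gzip/bgzip-compressed VCF.
# Assumption (not confirmed in the thread): a count of 0 means the header
# carries no contig information for a sequence dictionary to be built from.
count_contig_lines() {
    # $1: path to a .vcf or .vcf.gz file
    case "$1" in
        # bgzip output is gzip-compatible, so gzip -dc can read it.
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk '
        /^##contig=/ { n++ }      # one header line per contig
        !/^#/        { exit }     # first record reached: header is over
        END          { print n + 0 }
    '
}
```

Running it on a file that FixVcfHeader accepts and on one that it rejects, for example count_contig_lines SNVs.vcf, would show whether the two differ in this respect.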

  • jorgez

    Hi again,

    I did a bit of testing and confirmed that FixVcfHeader works on VCF files produced by GATK and SAMtools, but it fails on VCF files produced by VarScan or Platypus. At least, this is the behaviour I see on my VCF files.

    Jorge

  • Genevieve Brandt (she/her)

    Hi jorgez, were you able to get it to work?

  • jorgez

    Hi,

    Not on vcf files produced by either Platypus or VarScan.

    Jorge

  • Genevieve Brandt (she/her)

    There is most likely a problem with how the VCFs are formatted by those different tools. Could you run ValidateVariants on your other VCF files to see if there are issues with the format?

    Could you also send the headers of these 3 examples, and some example variants?
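
One way to collect what is being asked for here, the complete header plus a handful of example variants from each file, is sketched below. The helper name vcf_excerpt and the default of five records are assumptions made for illustration, not part of GATK or any other tool mentioned in the thread.

```shell
# Print the full header and the first N records of a plain or
# gzip/bgzip-compressed VCF, so the excerpt can be pasted into a reply.
vcf_excerpt() {
    # $1: path to a .vcf or .vcf.gz file
    # $2: number of records to include (hypothetical default: 5)
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk -v n="${2:-5}" '
        /^#/     { print; next }          # keep every header line
        seen < n { print; seen++; next }  # keep the first n records
                 { exit }                 # stop once n records are printed
    '
}
```

For example, vcf_excerpt SNVs.vcf 5 > SNVs.excerpt.vcf would write the header and five records to a small file that is easy to share. The arguments for ValidateVariants are not shown in this thread, so check the tool's documentation before running it.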

     

