CombineGVCFs - KEY END found in VariantContext field INFO but this key isn't defined in the VCFHeader
I am running CombineGVCFs to combine two raw_variants.vcf files produced by HaplotypeCaller, and it throws the error below. The GATK version is v4.1.8.1. Any suggestions?
10:31:33.609 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/ubda/home/19044464r/biosoft/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
Mar 21, 2022 10:31:33 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
10:31:33.977 INFO CombineGVCFs - ------------------------------------------------------------
10:31:33.977 INFO CombineGVCFs - The Genome Analysis Toolkit (GATK) v4.1.8.1
10:31:33.977 INFO CombineGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
10:31:33.978 INFO CombineGVCFs - Executing as 19044464r@ubda-d049 on Linux v3.10.0-693.21.1.el7.x86_64 amd64
10:31:33.978 INFO CombineGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_252-b09
10:31:33.978 INFO CombineGVCFs - Start Date/Time: March 21, 2022 10:31:33 AM HKT
10:31:33.978 INFO CombineGVCFs - ------------------------------------------------------------
10:31:33.978 INFO CombineGVCFs - ------------------------------------------------------------
10:31:33.979 INFO CombineGVCFs - HTSJDK Version: 2.23.0
10:31:33.979 INFO CombineGVCFs - Picard Version: 2.22.8
10:31:33.979 INFO CombineGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
10:31:33.979 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:31:33.980 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:31:33.980 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:31:33.980 INFO CombineGVCFs - Deflater: IntelDeflater
10:31:33.980 INFO CombineGVCFs - Inflater: IntelInflater
10:31:33.980 INFO CombineGVCFs - GCS max retries/reopens: 20
10:31:33.980 INFO CombineGVCFs - Requester pays: disabled
10:31:33.981 INFO CombineGVCFs - Initializing engine
10:31:34.825 INFO FeatureManager - Using codec VCFCodec to read file file:///ubda/home/19044464r/project/CNV-calling/analysis/test2/1-29406_raw_variants.vcf
10:31:35.333 INFO FeatureManager - Using codec VCFCodec to read file file:///ubda/home/19044464r/project/CNV-calling/analysis/test2/1-29683_raw_variants.vcf
10:31:35.971 INFO CombineGVCFs - Done initializing engine
10:31:36.044 INFO ProgressMeter - Starting traversal
10:31:36.045 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
10:31:36.276 INFO CombineGVCFs - Shutting down engine
[March 21, 2022 10:31:36 AM HKT] org.broadinstitute.hellbender.tools.walkers.CombineGVCFs done. Elapsed time: 0.05 minutes.
Runtime.totalMemory()=2060451840
java.lang.IllegalStateException: Key END found in VariantContext field INFO at chr1:20094 but this key isn't defined in the VCFHeader. We require all VCFs to have complete VCF headers by default.
at htsjdk.variant.vcf.VCFEncoder.fieldIsMissingFromHeaderError(VCFEncoder.java:213)
at htsjdk.variant.vcf.VCFEncoder.write(VCFEncoder.java:146)
at htsjdk.variant.variantcontext.writer.VCFWriter.add(VCFWriter.java:250)
at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.endPreviousStates(CombineGVCFs.java:408)
at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.createIntermediateVariants(CombineGVCFs.java:217)
at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.apply(CombineGVCFs.java:162)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:131)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:106)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:120)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:118)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.traverse(MultiVariantWalkerGroupedOnStart.java:163)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1049)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Using GATK jar /ubda/home/19044464r/biosoft/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx30G -jar /ubda/home/19044464r/biosoft/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar CombineGVCFs -R /ubda/home/19044464r/refs/hg38/ucsc-fasta/hg38.analysisSet.fa -V 1-29406_raw_variants.vcf -V 1-29683_raw_variants.vcf -O cohort.g.vcf.gz
-
Hi Ryan Sun,
It looks like your VCF header may be missing a declaration of the "END" INFO field, which is what causes the error. Can you take a look at this similar post, which shows a potential solution: https://www.biostars.org/p/371307/. Please let me know if this is helpful.
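A quick way to confirm whether the declaration is missing is to scan the header lines of each input file. The following is only a minimal sketch: the function name `has_end_header` is made up for illustration, and it assumes the declaration starts with `##INFO=<ID=END,`, which is how GATK-written gVCF headers spell it.

```shell
# Hypothetical helper (not part of GATK): succeeds if the VCF header
# declares the END INFO field. Accepts plain or gzip/bgzip-compressed VCFs.
has_end_header() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk '
        /^##INFO=<ID=END,/ { found = 1 }   # header declaration seen
        !/^#/              { exit }        # first data record: stop reading
        END                { exit !found } # status 0 only if declared
    '
}
```

For example, `has_end_header 1-29406_raw_variants.vcf || echo "END is not declared in the header"` would report the problem before CombineGVCFs is started.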
Kind regards,
Pamela
-
Hi Pamela,
Thanks for your response. I followed the suggestions in that post and manually added the END tag to the VCF header. That worked, but I ran into other problems later.
Someone on GitHub reminded me about this issue, so I carefully re-read the Best Practices and found that CombineGVCFs must take gVCF (genomic VCF) files as input. Unfortunately, I had used normal VCF files (I forgot to add -ERC GVCF in the HaplotypeCaller step). I tried again with that option and it worked well this time.
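For anyone who hits the same error, the fix can be sketched roughly as below. The function names `call_gvcf` and `is_gvcf` and the file arguments are placeholders for illustration, not my exact commands. The check assumes that HaplotypeCaller declares the `<NON_REF>` symbolic allele in the header only when run with -ERC GVCF, which matches what I saw but should be treated as an assumption.

```shell
# Sketch only: re-run HaplotypeCaller in GVCF mode so the output carries
# reference blocks. Arguments: reference FASTA, input BAM, output path.
call_gvcf() {
    gatk HaplotypeCaller -R "$1" -I "$2" -O "$3" -ERC GVCF
}

# Hypothetical pre-flight check: succeeds if the header declares the
# <NON_REF> allele, which suggests the file was written in GVCF mode.
is_gvcf() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk '
        /^##ALT=<ID=NON_REF,/ { found = 1 }   # gVCF marker in the header
        !/^#/                 { exit }        # stop at the first record
        END                   { exit !found } # status 0 only if marker seen
    '
}
```

Running `is_gvcf` on each -V input before CombineGVCFs would have caught my mistake early, since neither raw_variants.vcf file was a gVCF.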
Anyway, I really appreciate your time and effort!
Ryan
-
Hi Ryan Sun,
Okay, I'm glad to hear that was successful and that you were able to get HaplotypeCaller to work as well. Are there any issues that you are still experiencing that I can help with?
Kind regards,
Pamela