Combine GVCFs
I am using GATK 4.2.2.0 (4.2.2.0-foss-2018b-Java-1.8) to combine GVCF files, but I get the warning below about an invalid annotation on chromosome 2, and an exception is thrown on chromosome 5.
09:07:30.141 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/scicore/soft/apps/GATK/4.2.2.0-foss-2018b-Java-1.8/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Nov 16, 2021 9:07:30 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
09:07:30.295 INFO CombineGVCFs - ------------------------------------------------------------
09:07:30.296 INFO CombineGVCFs - The Genome Analysis Toolkit (GATK) v4.2.2.0
09:07:30.296 INFO CombineGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
09:07:30.296 INFO CombineGVCFs - Executing as thirun0000@shi18.cluster.bc2.ch on Linux v3.10.0-1062.18.1.el7.x86_64 amd64
09:07:30.296 INFO CombineGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_212-b03
09:07:30.297 INFO CombineGVCFs - Start Date/Time: November 16, 2021 9:07:30 AM CET
09:07:30.297 INFO CombineGVCFs - ------------------------------------------------------------
09:07:30.297 INFO CombineGVCFs - ------------------------------------------------------------
09:07:30.297 INFO CombineGVCFs - HTSJDK Version: 2.24.1
09:07:30.297 INFO CombineGVCFs - Picard Version: 2.25.4
09:07:30.297 INFO CombineGVCFs - Built for Spark Version: 2.4.5
09:07:30.297 INFO CombineGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
09:07:30.297 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
09:07:30.298 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
09:07:30.298 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
09:07:30.298 INFO CombineGVCFs - Deflater: IntelDeflater
09:07:30.298 INFO CombineGVCFs - Inflater: IntelInflater
09:07:30.298 INFO CombineGVCFs - GCS max retries/reopens: 20
09:07:30.298 INFO CombineGVCFs - Requester pays: disabled
09:07:30.298 INFO CombineGVCFs - Initializing engine
09:07:30.813 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-8_S5.genome.vcf.gz
09:07:30.910 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-7_S4.genome.vcf.gz
09:07:30.979 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-6_S3.genome.vcf.gz
09:07:31.053 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-5_S2.genome.vcf.gz
09:07:31.119 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-52_S24.genome.vcf.gz
09:07:31.174 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-51_S23.genome.vcf.gz
09:07:31.229 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-48_S22.genome.vcf.gz
09:07:31.307 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-47_S21.genome.vcf.gz
09:07:31.360 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-45_S20.genome.vcf.gz
09:07:31.442 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-43_S19.genome.vcf.gz
09:07:31.496 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-41_S18.genome.vcf.gz
09:07:31.556 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-40_S17.genome.vcf.gz
09:07:31.635 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-39_S16.genome.vcf.gz
09:07:31.687 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-38_S15.genome.vcf.gz
09:07:31.756 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-26_S14.genome.vcf.gz
09:07:31.824 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-22_S13.genome.vcf.gz
09:07:31.874 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-21_S12.genome.vcf.gz
09:07:31.953 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-1_S1.genome.vcf.gz
09:07:32.004 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-18_S11.genome.vcf.gz
09:07:32.061 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-17_S10.genome.vcf.gz
09:07:32.130 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-15_S9.genome.vcf.gz
09:07:32.184 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-14_S8.genome.vcf.gz
09:07:32.255 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-11_S7.genome.vcf.gz
09:07:32.317 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-10_S6.genome.vcf.gz
09:07:32.672 INFO CombineGVCFs - Done initializing engine
09:07:32.699 INFO ProgressMeter - Starting traversal
09:07:32.700 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
09:07:41.026 WARN ReferenceConfidenceVariantContextMerger - Detected invalid annotations: When trying to merge variant contexts at location chr2:162890196 the annotation EVS=0.067295|33|6256 was not a numerical value and was ignored
09:07:42.702 INFO ProgressMeter - chr3:52256385 0.2 779000 4673065.4
09:07:51.318 INFO CombineGVCFs - Shutting down engine
[November 16, 2021 9:07:51 AM CET] org.broadinstitute.hellbender.tools.walkers.CombineGVCFs done. Elapsed time: 0.35 minutes.
Runtime.totalMemory()=1401421824
org.broadinstitute.hellbender.exceptions.GATKException: Exception thrown at chr5:180055863 [VC /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-6_S3.genome.vcf.gz @ chr5:180055863 Q2.00 of type=NO_VARIATION alleles=[G*] attr={DP=86} GT=GT:GQ:AD:DP:VF:NL:SB:NC 0/.:0:1:86:0.988:20:-100.0000:0.0000 filters=q30
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:145)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:136)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.traverse(MultiVariantWalkerGroupedOnStart.java:165)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: org.broadinstitute.hellbender.exceptions.UserException$BadInput: Bad input: Combining gVCFs containing MNPs is not supported. /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-51_S23.genome.vcf.gz contained a MNP at chr5:180055862
at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.apply(CombineGVCFs.java:152)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:133)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:108)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:139)
... 21 more
Using GATK jar /scicore/soft/apps/GATK/4.2.2.0-foss-2018b-Java-1.8/gatk-package-4.2.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /scicore/soft/apps/GATK/4.2.2.0-foss-2018b-Java-1.8/gatk-package-4.2.2.0-local.jar CombineGVCFs -R /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/reference/hg19.fa --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-8_S5.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-7_S4.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-6_S3.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-5_S2.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-52_S24.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-51_S23.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-48_S22.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-47_S21.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-45_S20.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-43_S19.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-41_S18.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-40_S17.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-39_S16.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-38_S15.genome.vcf.gz --variant 
/scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-26_S14.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-22_S13.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-21_S12.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-1_S1.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-18_S11.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-17_S10.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-15_S9.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-14_S8.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-11_S7.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-10_S6.genome.vcf.gz -O /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/cohort.genome.vcf.gz
I tried validating each of the GVCF files, and I get the error below in each of them (the position differs from file to file):
Input /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/BRA_54-10_S6.genome.vcf.gz fails strict validation of type ALL: one or more of the ALT allele(s) for the record at position chr5:180039606 are not observed at all in the sample genotypes
How can I fix this error in the GVCFs?
Thanks
-
The issue you found with ValidateVariants should not cause issues in CombineGVCFs. The warning in CombineGVCFs is also not a problem.
I'm still looking into the GATKException in CombineGVCFs at position chr5:180055863. Is there a reason you are using CombineGVCFs instead of GenomicsDBImport? How did you create these GVCFs?
-
Priyadarshini Thirunavukkarasu, we have identified that the issue comes from an MNP at chr5:180055862 in your BRA_54-51_S23.genome.vcf.gz file. CombineGVCFs does not support MNPs. To run CombineGVCFs on these files, you can first remove the MNPs with the following command:
bcftools view --exclude-types mnps in.vcf -o out.vcf
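Depending on the bcftools version, `bcftools view` may write uncompressed VCF unless an output type is requested, even when the output file name ends in `.vcf.gz`. The following is a minimal sketch of the same MNP removal with explicitly bgzip-compressed, indexed output. The helper name `strip_mnps` and the example paths are assumptions for illustration, not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): drop MNP records from one GVCF
# and write bgzip-compressed output plus a tabix index next to it.
strip_mnps() {
    local in_vcf="$1" out_vcf="$2"
    # -O z requests bgzip-compressed VCF instead of plain text
    bcftools view --exclude-types mnps -O z -o "$out_vcf" "$in_vcf" || return 1
    # -t writes a .tbi (tabix) index for the compressed output
    bcftools index -t "$out_vcf"
}

# Example usage (placeholder paths):
# strip_mnps BRA_54-51_S23.genome.vcf.gz exclude_mnps/BRA_54-51_S23.genome.vcf.gz
```

Calling the helper once per sample in a loop would produce a directory of compressed, indexed files that can be passed to `--variant`.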
-
Thank you. I removed the MNPs from all the GVCF files. Now, when I try to combine the GVCFs, I get a different error saying that the GVCF files are not in GZIP format. Please find the command and error message below.
gatk CombineGVCFs \
-R /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/reference/hg19.fa \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-8_S5.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-7_S4.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-6_S3.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-5_S2.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-52_S24.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-51_S23.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-48_S22.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-47_S21.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-45_S20.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-43_S19.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-41_S18.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-40_S17.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-39_S16.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-38_S15.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-26_S14.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-22_S13.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-21_S12.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-1_S1.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-18_S11.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-17_S10.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-15_S9.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-14_S8.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-11_S7.genome.vcf.gz \
--variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-10_S6.genome.vcf.gz \
-O /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/cohort.genome.vcf.gz
12:04:04.583 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/scicore/soft/apps/GATK/4.1.2.0-foss-2018b-Java-1.8/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Nov 19, 2021 12:04:04 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:04:04.729 INFO CombineGVCFs - ------------------------------------------------------------
12:04:04.729 INFO CombineGVCFs - The Genome Analysis Toolkit (GATK) v4.1.2.0
12:04:04.729 INFO CombineGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
12:04:04.729 INFO CombineGVCFs - Executing as thirun0000@shi101.cluster.bc2.ch on Linux v3.10.0-1160.el7.x86_64 amd64
12:04:04.730 INFO CombineGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_212-b03
12:04:04.730 INFO CombineGVCFs - Start Date/Time: November 19, 2021 12:04:04 PM CET
12:04:04.730 INFO CombineGVCFs - ------------------------------------------------------------
12:04:04.730 INFO CombineGVCFs - ------------------------------------------------------------
12:04:04.730 INFO CombineGVCFs - HTSJDK Version: 2.19.0
12:04:04.730 INFO CombineGVCFs - Picard Version: 2.19.0
12:04:04.730 INFO CombineGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:04:04.730 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:04:04.730 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:04:04.730 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:04:04.730 INFO CombineGVCFs - Deflater: IntelDeflater
12:04:04.730 INFO CombineGVCFs - Inflater: IntelInflater
12:04:04.730 INFO CombineGVCFs - GCS max retries/reopens: 20
12:04:04.730 INFO CombineGVCFs - Requester pays: disabled
12:04:04.730 INFO CombineGVCFs - Initializing engine
12:04:05.096 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-8_S5.genome.vcf.gz
12:04:05.101 INFO CombineGVCFs - Shutting down engine
[November 19, 2021 12:04:05 PM CET] org.broadinstitute.hellbender.tools.walkers.CombineGVCFs done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=491257856
org.broadinstitute.hellbender.exceptions.GATKException: Error initializing feature reader for path /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-8_S5.genome.vcf.gz
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:353)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:305)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:256)
at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:234)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$initializeDrivingVariants$0(MultiVariantWalker.java:73)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.initializeDrivingVariants(MultiVariantWalker.java:63)
at org.broadinstitute.hellbender.engine.VariantWalkerBase.initializeFeatures(VariantWalkerBase.java:55)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:697)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.onStartup(MultiVariantWalker.java:46)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)
Caused by: htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: Not in GZIP format, for input source: /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-8_S5.genome.vcf.gz
at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:120)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:350)
... 16 more
Caused by: java.util.zip.ZipException: Not in GZIP format
at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:165)
at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79)
at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91)
at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:257)
... 20 more
Using GATK jar /scicore/soft/apps/GATK/4.1.2.0-foss-2018b-Java-1.8/gatk-package-4.1.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /scicore/soft/apps/GATK/4.1.2.0-foss-2018b-Java-1.8/gatk-package-4.1.2.0-local.jar CombineGVCFs -R /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/reference/hg19.fa --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-8_S5.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-7_S4.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-6_S3.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-5_S2.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-52_S24.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-51_S23.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-48_S22.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-47_S21.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-45_S20.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-43_S19.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-41_S18.genome.vcf.gz --variant 
/scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-40_S17.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-39_S16.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-38_S15.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-26_S14.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-22_S13.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-21_S12.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-1_S1.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-18_S11.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-17_S10.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-15_S9.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-14_S8.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-11_S7.genome.vcf.gz --variant /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/BRA_54-10_S6.genome.vcf.gz -O /scicore/home/cichon/thirun0000/HAE_panel/Illumina-panel-HAE/family_54/gvcf/gvcf_files/exclude_mnps/cohort.genome.vcf.gz
-
Are your files truly gzipped? Or are they not zipped and named with the extension .vcf.gz?
-
Hello,
These GVCF files were generated by the Illumina software (MiniSeq), so I am not sure whether they are actually gzipped or just named with the .vcf.gz extension.
-
You'll need to figure that out so that GATK can read the file correctly.
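One way to check is to look at the first two bytes of each file: gzip and bgzip files both start with the magic bytes `1f 8b`, whereas a plain-text VCF starts with `##`. The sketch below assumes a POSIX-like shell with `head`, `od`, and `tr` available. The helper name `is_gzipped` and the example directory are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file starts with the gzip magic bytes (1f 8b).
is_gzipped() {
    local magic
    # Read the first two bytes and print them as lowercase hex without separators
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example usage (placeholder directory):
# for f in exclude_mnps/*.genome.vcf.gz; do
#     if is_gzipped "$f"; then
#         echo "$f: gzip/bgzip compressed"
#     else
#         echo "$f: NOT compressed despite the .gz name"
#     fi
# done
```

If a file turns out to be plain text, renaming it to `.vcf`, compressing it with `bgzip`, and indexing it with `tabix -p vcf` is a common way to produce a block-gzipped, indexed VCF. Note that this check cannot tell bgzip apart from ordinary gzip, since both share the same magic bytes.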