CombineGVCFs doesn't produce output
I'm running CombineGVCFs on ~600 GVCF files, and it finishes without generating a complete GVCF file. I can't figure out why the process stops after a short while. It stops shortly after the position reported in the MLEAC warning, but that message only carries a [WARN] flag, so it should not stop the rest of the process, right?
Using GATK jar GATK_PATH/4.0.2.1/gatk-package-4.0.2.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /research/rgs01/applications/hpcf/apps/gatk/vendor/4.0.2.1/gatk-package-4.0.2.1-local.jar CombineGVCFs -R GRCh38_no_alt.fa --variant SAMPLE1.g.vcf.gz --variant SAMPLE2.g.vcf.gz ... --variant SAMPLE600.g.vcf.gz -O Combined.proband.coding.g.vcf.gz
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/research/rgs01/scratch_lsf/java -XX:ParallelGCThreads=1
16:24:23.240 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/research/rgs01/applications/hpcf/apps/gatk/vendor/4.0.2.1/gatk-package-4.0.2.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
16:24:23.415 INFO CombineGVCFs - ------------------------------------------------------------
16:24:23.415 INFO CombineGVCFs - The Genome Analysis Toolkit (GATK) v4.0.2.1
16:24:23.415 INFO CombineGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
16:24:23.415 INFO CombineGVCFs - Executing as noak@nodelmr11 on Linux v3.10.0-1160.15.2.el7.x86_64 amd64
16:24:23.415 INFO CombineGVCFs - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_60-b27
16:24:23.415 INFO CombineGVCFs - Start Date/Time: July 12, 2022 4:24:23 PM CDT
16:24:23.415 INFO CombineGVCFs - ------------------------------------------------------------
16:24:23.415 INFO CombineGVCFs - ------------------------------------------------------------
16:24:23.416 INFO CombineGVCFs - HTSJDK Version: 2.14.3
16:24:23.416 INFO CombineGVCFs - Picard Version: 2.17.2
16:24:23.416 INFO CombineGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 1
16:24:23.416 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:24:23.416 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:24:23.416 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:24:23.416 INFO CombineGVCFs - Deflater: IntelDeflater
16:24:23.416 INFO CombineGVCFs - Inflater: IntelInflater
16:24:23.416 INFO CombineGVCFs - GCS max retries/reopens: 20
16:24:23.416 INFO CombineGVCFs - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
16:24:23.416 INFO CombineGVCFs - Initializing engine
16:24:24.079 INFO FeatureManager - Using codec VCFCodec to read file FILEPATH/common/joint_calling/proband_coding/./vcfs../SAMPLE1.coding.g.vcf.gz..
..
..
FILEPATH/common/joint_calling/proband_coding/./vcfs/SAMPLE600.coding.g.vcf.gz
17:24:22.435 INFO CombineGVCFs - Done initializing engine
17:24:23.010 INFO ProgressMeter - Starting traversal
17:24:23.010 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
17:24:31.999 WARN ReferenceConfidenceVariantContextMerger - Detected invalid annotations: When trying to merge variant contexts at location chr1:12659 the annotation MLEAC=[1, 0] was not a numerical value and was ignored
17:24:32.416 INFO CombineGVCFs - Shutting down engine
[July 12, 2022 5:24:32 PM CDT] org.broadinstitute.hellbender.tools.walkers.CombineGVCFs done. Elapsed time: 60.16 minutes.
Runtime.totalMemory()=28262793216
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
at java.lang.Double.compareTo(Double.java:49)
at java.util.Comparators$NaturalOrderComparator.compare(Comparators.java:52)
at java.util.Comparators$NaturalOrderComparator.compare(Comparators.java:47)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
at java.util.TimSort.sort(TimSort.java:220)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:348)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.broadinstitute.hellbender.utils.Utils.getMedianValue(Utils.java:1137)
at org.broadinstitute.hellbender.tools.walkers.ReferenceConfidenceVariantContextMerger.mergeAttributes(ReferenceConfidenceVariantContextMerger.java:277)
at org.broadinstitute.hellbender.tools.walkers.ReferenceConfidenceVariantContextMerger.merge(ReferenceConfidenceVariantContextMerger.java:101)
at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.endPreviousStates(CombineGVCFs.java:340)
at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.createIntermediateVariants(CombineGVCFs.java:189)
at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.apply(CombineGVCFs.java:134)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:73)
at org.broadinstitute.hellbender.engine.VariantWalkerBase.lambda$traverse$0(VariantWalkerBase.java:110)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.broadinstitute.hellbender.engine.VariantWalkerBase.traverse(VariantWalkerBase.java:108)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.traverse(MultiVariantWalkerGroupedOnStart.java:118)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:893)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:135)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:159)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:202)
at org.broadinstitute.hellbender.Main.main(Main.java:288)
Here's the resource usage; I'm not running out of the requested memory either.
Exited with exit code 3.
Resource usage summary:
CPU time : 3554.91 sec.
Max Memory : 22478 MB
Average Memory : 19418.80 MB
Total Requested Memory : 96000.00 MB
Delta Memory : 73522.00 MB
Max Swap : -
Max Processes : 5
Max Threads : 32
Run time : 3619 sec.
Turnaround time : 3630 sec.
-
Hi nroak,
Thank you for writing to the GATK forum! We hope that we can help you sort this out.
Firstly, could you please clarify exactly how you produced these GVCFs? Did you use GATK or another tool? Where did these files originate?
Secondly, it would be helpful if you could provide us with your VCF header. Specifically, we want the part of the header that declares the type of each attribute. Please include all of the INFO and FORMAT lines from your VCF header.
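If it helps to gather that information across all ~600 inputs, here is a minimal Python sketch that reads the `##INFO` and `##FORMAT` header lines from each file and reports any attribute whose declared `Type` differs between files. The function names (`declared_types`, `conflicting_types`) are hypothetical, and the sketch assumes that each header line lists `ID` before `Type`, as GATK-written headers usually do. Whether a type disagreement is actually behind the `Integer cannot be cast to Double` exception in the stack trace is an assumption to check, not a confirmed diagnosis.

```python
import gzip
import re
import sys
from collections import defaultdict

# Matches structured header lines such as
#   ##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="...">
# Assumption: ID appears before Type on the line.
HEADER_RE = re.compile(r'^##(INFO|FORMAT)=<ID=([^,>]+),.*?Type=([^,>]+)')


def open_text(path):
    """Open a plain or gzip/bgzip-compressed VCF in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def declared_types(path):
    """Return {(kind, id): type} for the INFO and FORMAT header lines of one file."""
    types = {}
    with open_text(path) as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # the meta-information header ends at the #CHROM line
            match = HEADER_RE.match(line)
            if match:
                kind, key, value_type = match.groups()
                types[(kind, key)] = value_type
    return types


def conflicting_types(paths):
    """Return only the attributes declared with more than one Type across the files."""
    seen = defaultdict(lambda: defaultdict(list))
    for path in paths:
        for field, value_type in declared_types(path).items():
            seen[field][value_type].append(str(path))
    return {field: dict(by_type) for field, by_type in seen.items() if len(by_type) > 1}


if __name__ == "__main__":
    # Pass the GVCF paths as command-line arguments.
    for (kind, key), by_type in sorted(conflicting_types(sys.argv[1:]).items()):
        for value_type, files in sorted(by_type.items()):
            print(f"{kind} {key}: Type={value_type} in {len(files)} file(s), e.g. {files[0]}")
```

Run it with the GVCF paths as arguments. No output means every INFO and FORMAT attribute is declared with the same type in all of the headers that were read; in that case the header lines from any single file are enough to post.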
I look forward to hearing back from you!
Best,
Anthony