Combine GVCF Error Caused by: java.io.IOException: Communication error on send
Dear GATK team:
I am running GATK CombineGVCFs to merge the per-sample GVCF files of one chromosome for 337 samples into a single GVCF. However, the job exits on its own after running for several days, and I have not yet found the cause of the error.
REQUIRED for all errors and issues:
a) GATK version used: gatk-4.4.0.0
b) Exact command used: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx240g -Djava.io.tmpdir=/storage/public/home/wgs_wheat/3_bam/tmp -jar /storage/public/apps/software/gatk/gatk-4.4.0.0/gatk-package-4.4.0.0-local.jar CombineGVCFs -R /storage/public/home/public_database/wheat_03G_part/iwgsc_refseqv2.1_part_300MB_rename.fa --variant SW001_2B.g.vcf.gz --variant SW002_2B.g.vcf.gz ...(Too many samples, omitted here) --variant SW337_2B.g.vcf.gz -O /storage/public/home/wgs_wheat/4_snp_indel_parallel/Chr2_ALL_GVCF/wgs_chr2_all_sample.g.vcf.gz
c) Entire program log:
... (The previous part is omitted here)
03:56:09.261 INFO ProgressMeter - Chr2_300000001_600000000:36474997 6685.6 16742435000 2504234.6
03:56:19.282 INFO ProgressMeter - Chr2_300000001_600000000:36482053 6685.8 16742957000 2504250.1
03:56:29.326 INFO ProgressMeter - Chr2_300000001_600000000:36491202 6686.0 16743495000 2504267.9
03:56:39.333 INFO ProgressMeter - Chr2_300000001_600000000:36501617 6686.2 16744105000 2504296.6
03:56:49.338 INFO ProgressMeter - Chr2_300000001_600000000:36510378 6686.3 16744622000 2504311.5
03:56:58.401 INFO CombineGVCFs - Shutting down engine
[February 10, 2025, 3:56:58 AM CST] org.broadinstitute.hellbender.tools.walkers.CombineGVCFs done. Elapsed time: 6,701.91 minutes.
Runtime.totalMemory()=13321109504
htsjdk.samtools.util.RuntimeIOException: java.io.IOException: Communication error on send
at htsjdk.tribble.readers.SynchronousLineReader.readLine(SynchronousLineReader.java:53)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:170)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:205)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:149)
at org.broadinstitute.hellbender.engine.MultiVariantDataSource$1.next(MultiVariantDataSource.java:408)
at org.broadinstitute.hellbender.engine.MultiVariantDataSource$1.next(MultiVariantDataSource.java:393)
at htsjdk.samtools.util.PeekableIterator.advance(PeekableIterator.java:71)
at htsjdk.samtools.util.PeekableIterator.next(PeekableIterator.java:57)
at htsjdk.samtools.util.MergingIterator.next(MergingIterator.java:101)
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1921)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:136)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.traverse(MultiVariantWalkerGroupedOnStart.java:165)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1098)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:149)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:217)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.io.IOException: Communication error on send
at java.base/java.io.RandomAccessFile.readBytes0(Native Method)
at java.base/java.io.RandomAccessFile.readBytes(RandomAccessFile.java:390)
at java.base/java.io.RandomAccessFile.read(RandomAccessFile.java:424)
at htsjdk.samtools.seekablestream.SeekableFileStream.read(SeekableFileStream.java:85)
at htsjdk.samtools.util.BlockCompressedInputStream.readBytes(BlockCompressedInputStream.java:571)
at htsjdk.samtools.util.BlockCompressedInputStream.readBytes(BlockCompressedInputStream.java:560)
at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:510)
at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:331)
at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:257)
at htsjdk.tribble.readers.PositionalBufferedStream.fill(PositionalBufferedStream.java:132)
at htsjdk.tribble.readers.PositionalBufferedStream.read(PositionalBufferedStream.java:84)
at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:333)
at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:376)
at java.base/sun.nio.cs.StreamDecoder.lockedRead(StreamDecoder.java:219)
at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:173)
at java.base/java.io.InputStreamReader.read(InputStreamReader.java:189)
at htsjdk.tribble.readers.LongLineBufferedReader.fill(LongLineBufferedReader.java:140)
at htsjdk.tribble.readers.LongLineBufferedReader.readLine(LongLineBufferedReader.java:300)
at htsjdk.tribble.readers.LongLineBufferedReader.readLine(LongLineBufferedReader.java:356)
at htsjdk.tribble.readers.SynchronousLineReader.readLine(SynchronousLineReader.java:51)
... 25 more
The strange thing is that the same command has been run to completion on other chromosomes without errors.
Thank you for your help!
-
Hi Yihan Men
Did you split all your chromosomes into parts shorter than 512 Mb? A tabix index cannot handle contigs longer than 512 megabases (2^29 bp), so you may need to split your genome and plan for a liftover back to the original coordinates after genotyping the split pieces.
Unfortunately, because of this limitation, our tools are currently not directly compatible with genomes such as wheat.
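One way to check this is to compare the contig lengths in the reference's .fai index against the tabix limit of 2^29 bp. The following is only a minimal Python sketch and is not part of GATK. The function name and the example lengths are made up for illustration; only the first contig name is taken from the log above.

```python
# Sketch: flag contigs that are too long for a tabix (.tbi) index.
TABIX_MAX_CONTIG_LENGTH = 2 ** 29  # 536,870,912 bp


def contigs_over_limit(fai_lines, limit=TABIX_MAX_CONTIG_LENGTH):
    """Return (name, length) for every contig longer than `limit`.

    `fai_lines` is an iterable of lines from a samtools .fai index, whose
    first two tab-separated columns are contig name and contig length.
    """
    too_long = []
    for line in fai_lines:
        if not line.strip():
            continue  # skip blank lines
        name, length = line.rstrip("\n").split("\t")[:2]
        if int(length) > limit:
            too_long.append((name, int(length)))
    return too_long


# Hypothetical .fai lines; real input would be read from the reference's .fai file.
example = [
    "Chr2_300000001_600000000\t300000000\t18\t60\t61\n",
    "Chr2_unsplit_hypothetical\t825000000\t305000040\t60\t61\n",
]
print(contigs_over_limit(example))
```

An empty list means every contig fits within the limit, in which case the index size limit is probably not the cause.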
Regards.
-
Thanks for your advice!
The wheat genome is really big, with chromosomes ranging from 495 to 825 Mb in length.
But in fact, I have split each chromosome into multiple fragments of no more than 300 Mb in size and created indexes. All chromosomes except Chr2 have been run to completion, and only Chr2 repeatedly exits with this error.
I checked the GVCF file of each sample and its permissions, confirming that the files exist, are not in use by other processes, and that there is enough storage space. The job still does not run to completion. I cannot identify the cause of this error, so I have no solution yet.
I hope you can give me some guidance.
Thanks!
-
Hi again.
It is possible that some of the Chr2 GVCF files or their indexes are corrupt. Can you reindex those GVCF files and see if that works?
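As a rough sketch of that check, the Python below first decompresses each GVCF from start to end, since the stack trace ends in a native file read and a test of existence or permissions does not touch every byte. It then rebuilds the index with `gatk IndexFeatureFile`. The helper names are hypothetical, and the sketch assumes that the `gatk` wrapper script is on the PATH. A transient storage fault may not reproduce on a second read, so a clean pass does not prove the files were never the problem.

```python
import gzip
import subprocess
import zlib


def read_through(path, chunk_size=1 << 20):
    """Decompress `path` end to end; return None on success or an error string."""
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error) as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def reindex_command(path):
    """Build the `gatk IndexFeatureFile` command for one GVCF."""
    return ["gatk", "IndexFeatureFile", "-I", path]


def check_and_reindex(paths, run=subprocess.run):
    """Reindex every readable file; return {path: error} for unreadable ones."""
    unreadable = {}
    for path in paths:
        error = read_through(path)
        if error is not None:
            unreadable[path] = error  # do not index a file that cannot be read
            continue
        run(reindex_command(path), check=True)
    return unreadable
```

For example, `check_and_reindex(["SW001_2B.g.vcf.gz", "SW002_2B.g.vcf.gz"])` would reindex both files and return an empty dictionary if both decompress cleanly. Any file reported as unreadable would need to be regenerated or copied again before reindexing.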
Also, CombineGVCFs is not the best choice for this many samples. We recommend using GenomicsDBImport to combine a large number of samples.
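A GenomicsDBImport run could be set up roughly as follows. This is a sketch under assumptions, not a tested recipe. The sample names are guessed from the file names in the command above, the interval is the contig name shown in the log, and the workspace path, map file name, and batch size are placeholders.

```python
def sample_map_lines(samples):
    """Format (sample_name, gvcf_path) pairs as tab-separated sample-map lines."""
    return [f"{name}\t{path}" for name, path in samples]


def genomicsdb_import_command(workspace, sample_map, interval, batch_size=50):
    """Build a `gatk GenomicsDBImport` command as an argument list."""
    return [
        "gatk", "GenomicsDBImport",
        "--genomicsdb-workspace-path", workspace,
        "--sample-name-map", sample_map,
        "-L", interval,
        "--batch-size", str(batch_size),
    ]


# Assumed naming pattern: SW001_2B.g.vcf.gz ... SW337_2B.g.vcf.gz
samples = [(f"SW{i:03d}", f"SW{i:03d}_2B.g.vcf.gz") for i in range(1, 338)]
map_lines = sample_map_lines(samples)

command = genomicsdb_import_command(
    workspace="chr2_genomicsdb",          # placeholder workspace directory
    sample_map="chr2_samples.map",        # placeholder; write map_lines to this file
    interval="Chr2_300000001_600000000",  # contig name as it appears in the log
)
print(" ".join(command))
```

The sample map is a plain text file with one line per sample, sample name first and GVCF path second. After the import, GenotypeGVCFs can read the workspace directly with `-V gendb://chr2_genomicsdb`.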
Regards.