Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Combine GVCF Error Caused by: java.io.IOException: Communication error on send

3 comments

  • Gökalp Çelik

    Hi Yihan Men

    Did you split all of your chromosomes into parts shorter than 512 Mb? A tabix (.tbi) index cannot handle contigs longer than 512 megabases (2^29 bp), so you may need to split your genome and plan for a liftover after genotyping the split pieces.

    Unfortunately, because of this limitation, our tools are currently not directly compatible with genomes such as wheat.

    Regards. 
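
The splitting step described above can be sketched in Python. This is only an illustration, not a GATK utility: the function names, the contig name `chr2`, and the 300 Mb chunk size are assumptions chosen for the example. It computes 1-based inclusive chunk coordinates that stay under the 2^29 bp limit, and shows the offset arithmetic a later liftover back to whole-chromosome coordinates would rely on.

```python
# Hypothetical helpers for illustration; not part of GATK or htslib.
TABIX_MAX_CONTIG_BP = 2 ** 29  # 536,870,912 bp, i.e. 512 Mb


def split_contig(name, length, max_chunk_bp=300_000_000):
    """Return 1-based inclusive (name, start, end) chunks covering the contig."""
    if not 0 < max_chunk_bp <= TABIX_MAX_CONTIG_BP:
        raise ValueError("chunk size must be between 1 and 2**29 bp")
    chunks = []
    start = 1
    while start <= length:
        end = min(start + max_chunk_bp - 1, length)
        chunks.append((name, start, end))
        start = end + 1
    return chunks


def to_original_position(chunk_start, pos_in_chunk):
    """Map a 1-based position inside a chunk back to the original contig."""
    return chunk_start + pos_in_chunk - 1


if __name__ == "__main__":
    # 825 Mb is used here only as an example length.
    for name, start, end in split_contig("chr2", 825_000_000):
        print(f"{name}:{start}-{end}")
```

Each chunk then has to be written out as its own contig (with its own name and header entry) before indexing; the sketch only covers the coordinate bookkeeping.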

  • Yihan Men

    Thanks for your advice!
    The wheat genome is really big, with chromosomes ranging from 495 to 825 Mb in length.
    In fact, I have already split each chromosome into fragments of no more than 300 Mb and created indexes for them. All chromosomes except Chr2 ran to completion; only Chr2 repeatedly exits with this error.
    I checked the GVCF files of each sample and their permissions, confirmed that they exist and are not in use by other processes, and verified that there is enough storage space. The run still does not complete. I have not been able to determine the cause of this error, so I have no solution yet.
    I hope you can give me some guidance.
    Thanks!

  • Gökalp Çelik

    Hi again.

    It is possible that the Chr2 files or their indexes are corrupt. Can you reindex those GVCF files and see if that works?

    Also, CombineGVCFs may not be the best choice for so many samples. We recommend using GenomicsDBImport to combine a large number of samples.

    Regards. 
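
The two suggestions above (reindexing, then importing with GenomicsDBImport) might be scripted roughly as follows. This is a sketch under assumptions: the file names, workspace path, and interval name are made up, and the helper functions are hypothetical. The `-I` input flag for `IndexFeatureFile` applies to recent GATK4 releases; older releases may use a different flag, so check `gatk IndexFeatureFile --help` for your version. The code only builds the command lines and the sample map text; nothing is executed.

```python
import shlex


def index_command(gvcf_path, gatk="gatk"):
    """Command to rebuild the index of one GVCF (assumes a recent GATK4)."""
    return [gatk, "IndexFeatureFile", "-I", gvcf_path]


def sample_map_text(samples):
    """Tab-separated 'sample<TAB>path' lines, one per sample."""
    return "".join(f"{name}\t{path}\n" for name, path in samples)


def genomicsdb_import_command(workspace, sample_map, interval, gatk="gatk"):
    """Command to import all samples in the map for a single interval."""
    return [
        gatk, "GenomicsDBImport",
        "--genomicsdb-workspace-path", workspace,
        "--sample-name-map", sample_map,
        "-L", interval,
    ]


if __name__ == "__main__":
    # All names below are placeholders for illustration.
    samples = [("s1", "s1.chr2_part1.g.vcf.gz"), ("s2", "s2.chr2_part1.g.vcf.gz")]
    for _, path in samples:
        print(shlex.join(index_command(path)))
    print(sample_map_text(samples), end="")
    print(shlex.join(genomicsdb_import_command("chr2_part1_db", "chr2.sample_map", "chr2_part1")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the paths are adjusted, so that a failing step stops the script instead of silently continuing.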
