Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GenomicsDBImport only going through the first chromosome when given a list


3 comments

  • Gökalp Çelik

    Hi Charity Z Goeckeritz

    The version you are using is quite old and no longer supported. Could you try GATK 4.6.0.0? It is the most recent release and includes many fixes and enhancements.

    A couple of things to note: We recommend importing a single interval per GenomicsDBImport instance, which keeps things simpler and faster. You may be able to import each contig into a separate instance and genotype them separately to save time. Also, if the default Python on your system is Python 2, issues like this one can occur; we recommend Python 3.6 or later.
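
    As a rough illustration of the one-interval-per-import advice, the sketch below builds one GenomicsDBImport command line per contig, each writing to its own workspace. The helper name, contig names, sample map file, and workspace directory are hypothetical placeholders; the script only prints the commands and does not run GATK.

```python
import shlex
from typing import Dict, List


def import_commands(contigs: List[str], sample_map: str,
                    workspace_root: str) -> Dict[str, List[str]]:
    """Build one GenomicsDBImport command per contig.

    Each contig gets its own workspace directory under workspace_root
    (a hypothetical layout), and the interval is restricted with -L.
    """
    commands = {}
    for contig in contigs:
        commands[contig] = [
            "gatk", "GenomicsDBImport",
            "--sample-name-map", sample_map,
            "--genomicsdb-workspace-path", f"{workspace_root}/{contig}",
            "-L", contig,
        ]
    return commands


if __name__ == "__main__":
    # Placeholder contig names and paths; replace with your own.
    cmds = import_commands(["chr1", "chr2"], "cohort.sample_map", "genomicsdb")
    for contig, cmd in cmds.items():
        # Quote each argument so the printed line is safe to paste into a shell.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

    Because every contig has its own workspace, the printed commands are independent of each other and could be submitted as separate jobs.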

    I hope this helps. 

  • Charity Z Goeckeritz

    Hi Gökalp Çelik,

    Thanks so much for your reply. I can ask our admin to install the latest version, but I've already run HaplotypeCaller with the present version, which took me several days, so I hesitate to switch. I was hoping to get GenomicsDBImport to work with the current version first. If I'm still beating my head against a wall after a few more tries, I guess I'll have to upgrade!

    I don't have an issue with running each chromosome (interval) separately, but I'd have to create a new database path each time, no? Could I combine all of my chromosome databases afterward?

    Thanks again for your help!

    Kindly,

    Charity 

     

  • Gökalp Çelik

    Hi again.

    You do need to create a separate database for each contig. You also need to genotype each one separately, producing one VCF file per contig. Once you have those VCF files, you can combine them with the GatherVcfs tool into a single call file covering all contigs and samples.
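
    The genotype-then-gather steps could be sketched roughly as follows. This assumes each contig already has its own GenomicsDB workspace under a common directory. The helper names, file paths, and contig names are hypothetical, and the script only assembles command lines rather than running them. Note that GatherVcfs expects its inputs in genomic order, so the contig list should follow the order of the reference sequence dictionary.

```python
import shlex
from typing import List


def genotype_command(contig: str, reference: str,
                     workspace_root: str, out_dir: str) -> List[str]:
    """GenotypeGVCFs command reading one per-contig GenomicsDB workspace."""
    return [
        "gatk", "GenotypeGVCFs",
        "-R", reference,
        "-V", f"gendb://{workspace_root}/{contig}",
        "-O", f"{out_dir}/{contig}.vcf.gz",
    ]


def gather_command(contigs: List[str], out_dir: str, merged: str) -> List[str]:
    """GatherVcfs command concatenating the per-contig VCFs in the given order."""
    cmd = ["gatk", "GatherVcfs"]
    for contig in contigs:
        cmd += ["-I", f"{out_dir}/{contig}.vcf.gz"]
    cmd += ["-O", merged]
    return cmd


if __name__ == "__main__":
    # Placeholder names; list contigs in reference order.
    contigs = ["chr1", "chr2", "chr3"]
    for contig in contigs:
        cmd = genotype_command(contig, "reference.fasta", "genomicsdb", "vcfs")
        print(" ".join(shlex.quote(arg) for arg in cmd))
    gather = gather_command(contigs, "vcfs", "cohort.vcf.gz")
    print(" ".join(shlex.quote(arg) for arg in gather))
```

    Keeping the per-contig VCF paths derived from the same contig list in both helpers makes it harder for the gather step to miss a contig or pick one up out of order.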

    I hope this helps. 
