Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GenomicsDBImport: importing both WES and WGS data


1 comment

  • Gökalp Çelik

    Hi HQ Zhao

    GenomicsDBImport was designed to combine large sets of VCFs into a single datastore, either per chromosome or for the whole genome. How you use it depends on your resources and on the time and pace of your project: you can import all intervals and all samples into one database, or you can scatter your variants across multiple databases. If you scatter, every database must still contain all of the samples to give a coherent result. Combining GVCFs with other VCF types may cause problems in the long run, so be careful when adding outside data unless it is compatible with yours in terms of headers and reference genome.
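
    As a rough illustration of the header and reference check mentioned above, here is a minimal Python sketch that compares the ##contig lines of two VCF headers. It is not part of GATK; the function names (parse_contigs, contig_mismatches) and the sample header text are made up for this example, and it assumes plain-text headers whose contig fields contain no quoted commas.

```python
import re

# Matches a VCF header line such as: ##contig=<ID=chr1,length=248956422>
CONTIG_RE = re.compile(r"^##contig=<(.*)>\s*$")


def parse_contigs(header_text):
    """Return {contig_id: length or None} from the ##contig lines of a header.

    Assumption: fields are simple key=value pairs separated by commas,
    with no commas inside quoted values.
    """
    contigs = {}
    for line in header_text.splitlines():
        match = CONTIG_RE.match(line)
        if not match:
            continue
        fields = dict(
            part.split("=", 1) for part in match.group(1).split(",") if "=" in part
        )
        if "ID" in fields:
            length = fields.get("length")
            contigs[fields["ID"]] = (
                int(length) if length and length.isdigit() else None
            )
    return contigs


def contig_mismatches(header_a, header_b):
    """List contigs that are missing from one header or differ in length."""
    a, b = parse_contigs(header_a), parse_contigs(header_b)
    problems = []
    for name in sorted(set(a) | set(b)):
        if name not in a:
            problems.append(f"{name}: missing from first header")
        elif name not in b:
            problems.append(f"{name}: missing from second header")
        elif a[name] != b[name]:
            problems.append(f"{name}: length {a[name]} vs {b[name]}")
    return problems


# Hypothetical headers: the same contig name with two different lengths,
# as would happen if the files were called against different reference builds.
wes_header = "##fileformat=VCFv4.2\n##contig=<ID=chr1,length=248956422>\n"
wgs_header = "##fileformat=VCFv4.2\n##contig=<ID=chr1,length=249250621>\n"

for problem in contig_mismatches(wes_header, wgs_header):
    print(problem)
```

    An empty result only means the contig names and lengths agree. It does not prove the files were called in the same mode or against an identical reference, so treat it as a first sanity check before importing outside data.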

    I hope this helps. 
