Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

CombineGVCFs or GenomicsDBImport for consolidating haploid bacterial GVCFs


  • Gökalp Çelik

    Hi Conor Sexton

    Welcome to the Wonderland! We wish you a successful journey for your endeavors. 

    Both CombineGVCFs and GenomicsDBImport can handle different ploidies, so both are perfectly usable for your ploidy 1 case.

    We are currently on version 4.4.0.0; GATK 3.x is deprecated and no longer supported. If you wish to use GATK4, we recommend staying with version 4.2 or later, but be aware that version 4.4 requires Java 17, whereas older versions work with Java 8.

    To run GATK 4 and above you don't need to call the jar files directly. Instead, you can run the toolset through the supplied Python script named "gatk", and run any tool by typing gatk ToolName ...
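
    As a rough illustration of the gatk ToolName ... pattern, the sketch below assembles such a command line in Python instead of calling the jar directly. The helper name gatk_command is hypothetical and only builds the argument list; it assumes the "gatk" script is on your PATH if you later choose to run the result. --java-options is the wrapper's option for passing JVM settings.

```python
import shlex


def gatk_command(tool, *tool_args, java_options=None):
    """Build an argument list for the `gatk` wrapper script (hypothetical helper)."""
    cmd = ["gatk"]
    if java_options:
        # JVM settings are given to the wrapper, before the tool name.
        cmd += ["--java-options", java_options]
    cmd.append(tool)
    cmd += list(tool_args)
    return cmd


# Example: the command that asks a tool for its built-in help text.
help_cmd = gatk_command("HaplotypeCaller", "--help")
print(shlex.join(help_cmd))
```

    The resulting list can be passed to subprocess.run(cmd, check=True) once you are happy with it.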

    Once you have produced your GVCF files using HaplotypeCaller with the -ERC GVCF and -ploidy 1 parameters, you can import them into a GenomicsDB workspace using the GenomicsDBImport tool. Since you have a large number of samples, GenomicsDBImport is the recommended choice because it lets you limit the number of samples imported concurrently, which avoids "too many open files" errors from your compute environment. Importing in batches also reduces the amount of memory needed compared with reading all files at once. Once the import is complete, the GenotypeGVCFs tool can be used to generate multisample VCF files from the GenomicsDB workspace.
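
    The three steps above could be sketched roughly as follows. This is only a dry-run outline that builds the command lines; all file names, the sample names, the interval name "chromosome" and the batch size of 50 are placeholders chosen for illustration, not values from your project. Note that GenomicsDBImport needs at least one interval (-L), and that the sample map is a tab-separated file with one "sample name, GVCF path" pair per line.

```python
import shlex


def haplotype_caller_cmd(reference, bam, gvcf):
    # Per-sample GVCF in haploid mode (-ERC GVCF, -ploidy 1).
    return ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam,
            "-O", gvcf, "-ERC", "GVCF", "-ploidy", "1"]


def sample_map_text(gvcfs_by_sample):
    # Contents of a sample map: one "sample<TAB>path" line per GVCF.
    return "".join(f"{name}\t{path}\n"
                   for name, path in sorted(gvcfs_by_sample.items()))


def genomicsdb_import_cmd(workspace, sample_map, interval, batch_size=50):
    # --batch-size caps how many GVCFs are read concurrently.
    return ["gatk", "GenomicsDBImport",
            "--genomicsdb-workspace-path", workspace,
            "--sample-name-map", sample_map,
            "-L", interval,
            "--batch-size", str(batch_size)]


def genotype_gvcfs_cmd(reference, workspace, vcf):
    # The gendb:// prefix tells GenotypeGVCFs to read from a GenomicsDB workspace.
    return ["gatk", "GenotypeGVCFs", "-R", reference,
            "-V", f"gendb://{workspace}", "-O", vcf]


# Placeholder inputs, for illustration only.
samples = {"isolate_A": "isolate_A.g.vcf.gz", "isolate_B": "isolate_B.g.vcf.gz"}
commands = [haplotype_caller_cmd("ref.fasta", f"{name}.bam", gvcf)
            for name, gvcf in sorted(samples.items())]
commands.append(genomicsdb_import_cmd("bacteria_db", "samples.map", "chromosome"))
commands.append(genotype_gvcfs_cmd("ref.fasta", "bacteria_db", "cohort.vcf.gz"))

for cmd in commands:
    print(shlex.join(cmd))
```

    Writing sample_map_text(samples) to samples.map and running each printed command in order would reproduce the workflow; lowering batch_size is the knob to try first if you hit open-file limits.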

    Below is a document you may wish to consult if you have any doubts about how to use the parameters.

    https://gatk.broadinstitute.org/hc/en-us/articles/360035889971--How-to-Consolidate-GVCFs-for-joint-calling-with-GenotypeGVCFs 

    Feel free to send us any questions about GATK and tools.

    Regards. 

  • Conor Sexton

    Hi Gökalp Çelik,

    Thank you very much for your in-depth reply and further advice. I look forward to learning more.

    Regards,

    Conor.

