Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

PathSeq Pre-built DB Breakdown Per Microbial Group


1 comment

  • Mark Walker

    Hello Renald James Legaspi,

    You could infer that from the list of microbes here:

    gs://gatk-best-practices/pathseq/resources/pathseq_microbe_list.txt

    You can use this along with the taxdump file to reconstruct the tree and count up the number of organisms at each taxonomic level:

    gs://gatk-best-practices/pathseq/resources/taxdump.tar.gz

    More on the taxdump file format can be found here:

    https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_readme.txt
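A minimal sketch of the approach described above, offered as an illustration rather than as the method used to build the database. It assumes the archive contains `nodes.dmp` and `names.dmp` at its top level, laid out as in the NCBI readme (fields separated by tab-pipe-tab). All function names are hypothetical, and the rank label to group by (for example "superkingdom") depends on the taxonomy release, so it is left as a parameter. The column layout of pathseq_microbe_list.txt is not described in this thread, so extracting taxonomy IDs from it is left out.

```python
import io
import tarfile
from collections import Counter


def parse_dmp_line(line):
    # taxdump rows separate fields with "\t|\t" and terminate with "\t|".
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def load_nodes(lines):
    # Map tax_id -> (parent_tax_id, rank) from nodes.dmp rows.
    nodes = {}
    for line in lines:
        if not line.strip():
            continue
        fields = parse_dmp_line(line)
        nodes[int(fields[0])] = (int(fields[1]), fields[2])
    return nodes


def load_scientific_names(lines):
    # Map tax_id -> scientific name from names.dmp rows.
    names = {}
    for line in lines:
        if not line.strip():
            continue
        fields = parse_dmp_line(line)
        if fields[3] == "scientific name":
            names[int(fields[0])] = fields[1]
    return names


def ancestor_at_rank(tax_id, nodes, rank):
    # Walk up the tree until a node of the requested rank is found.
    # The root is its own parent, so the seen-set stops the walk there.
    seen = set()
    while tax_id in nodes and tax_id not in seen:
        seen.add(tax_id)
        parent, node_rank = nodes[tax_id]
        if node_rank == rank:
            return tax_id
        tax_id = parent
    return None  # unknown ID, or no ancestor with that rank


def count_by_group(tax_ids, nodes, rank):
    # Count listed organisms under each ancestor of the given rank.
    return Counter(ancestor_at_rank(t, nodes, rank) for t in tax_ids)


def read_taxdump(path):
    # Assumes nodes.dmp and names.dmp sit at the top level of the archive.
    with tarfile.open(path) as tar:
        with io.TextIOWrapper(tar.extractfile("nodes.dmp"), encoding="utf-8") as fh:
            nodes = load_nodes(fh)
        with io.TextIOWrapper(tar.extractfile("names.dmp"), encoding="utf-8") as fh:
            names = load_scientific_names(fh)
    return nodes, names
```

With a local copy of the archive, `nodes, names = read_taxdump("taxdump.tar.gz")` followed by `count_by_group(tax_ids, nodes, "superkingdom")` would give one count per top-level group, keyed by taxonomy ID; `names` turns those keys into readable labels. IDs that cannot be placed are counted under `None`.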

