Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Where can I find known variants, training and truth sets, and other resource files?

3 comments

  • Monete Rajão Gomes

    Hi,

    I have a few questions and hope someone can help.

    I'm trying to run the GATK pipeline again (I ran it successfully before), and this time I decided to use the latest dbSNP release (v153) as a known-sites resource.

    So I downloaded GCF_000001405.25.gz (and its .tbi index) from the dbSNP FTP site. This is the dbSNP v153 VCF file that uses GRCh37 as its reference.

    For the other known sites, I will use the Mills and 1000G VCFs available in the Broad Institute's FTP bundle.

    I noticed that the dbSNP VCF has chromosome names like "NC_...", "NT_...", whereas the VCFs from the bundle use "chr1", "chr2", ..., following the UCSC hg19 reference.

    So, my questions are:

    1) Is it advisable to simply change the chromosome names to their equivalents? Would this be correct, without causing major problems? Example:

    Change "NC_000001.10" to "chr1"
    Change "NC_000002.11" to "chr2"
    Change "NC_000003.11" to "chr3"
    .....
    Change "NC_012920.1" to "chrM"

    2) What about downloading the GRCh38 version of dbSNP v153 and lifting it over to hg19 (using the chain file from UCSC, which uses UCSC chromosome names)? Is this OK?

    3) If neither of the above, do you have any advice on how to deal with this?
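
    For illustration only, the renaming described in question 1 could be sketched as below. This is a minimal sketch, not a GATK-endorsed procedure: the names REFSEQ_TO_UCSC, rename_vcf_line and rename_vcf are hypothetical, the mapping holds only the four pairs quoted in the post (not verified here), and dropping records on unmapped contigs is an assumption of this sketch.

```python
# Hypothetical mapping from RefSeq accessions to UCSC-style names.
# Only the four pairs quoted in the post are listed; a real table needs one
# entry per contig and should be checked against the assembly report.
REFSEQ_TO_UCSC = {
    "NC_000001.10": "chr1",
    "NC_000002.11": "chr2",
    "NC_000003.11": "chr3",
    "NC_012920.1": "chrM",
}

CONTIG_PREFIX = "##contig=<ID="


def rename_vcf_line(line, mapping):
    """Return the line with its contig renamed, or None if the contig is unmapped."""
    if line.startswith(CONTIG_PREFIX):
        rest = line[len(CONTIG_PREFIX):]
        # The contig ID ends at the first ',' or '>' of the header entry.
        cuts = [i for i in (rest.find(","), rest.find(">")) if i != -1]
        end = min(cuts) if cuts else len(rest)
        old = rest[:end]
        if old not in mapping:
            return None
        return CONTIG_PREFIX + mapping[old] + rest[end:]
    if line.startswith("#"):
        # Other header lines carry no contig name and pass through unchanged.
        return line
    # Data line: CHROM is the first tab-separated column.
    chrom, sep, tail = line.partition("\t")
    if chrom not in mapping:
        return None
    return mapping[chrom] + sep + tail


def rename_vcf(lines, mapping):
    """Yield renamed lines, skipping those whose contig has no mapping (an assumption)."""
    for line in lines:
        renamed = rename_vcf_line(line, mapping)
        if renamed is not None:
            yield renamed
```

    The input could be read with gzip.open(path, "rt") and the output re-compressed and re-indexed afterwards. Existing tools also cover this step (for example, bcftools annotate has a --rename-chrs option), and renaming only changes the labels: it does not confirm that the sequences behind each pair of names are identical.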

    Further, I couldn't find any other website or FTP site where I could download dbSNP v153 with UCSC hg19 chromosome names.

    I'm trying to find a solution and hope you can help.

    Thank you for your time and patience.

  • Christopher Bottoms

    The "Resource Bundle" link is broken...

  • Fredrik Trulsson

    I assume the Resource Bundle link should point here: https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle
