Resource bundle error download

9 comments

  • Genevieve Brandt (she/her)

    Hi Linda Do, I am not able to reproduce this issue on my end so I cannot determine where the problem is coming from. The bucket you linked to is public and all files should be available.

  • Brian Wiley

    Hi Linda Do,

    I installed the Google Cloud Storage utility (gsutil) by following https://cloud.google.com/storage/docs/gsutil_install. You can then download the hg19 file to the current directory with:

    gsutil cp gs://gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.dbsnp138.vcf .

    and for hg38:

    gsutil cp gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf .

    Genevieve Brandt (she/her), is there any way to have the Homo_sapiens_assembly19.dbsnp138.vcf file compressed with bgzip so that the download is not 10 GB? Compressing it myself takes a long time.
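
Until a compressed copy is published, one workaround is to compress the file while it downloads. The following is a minimal sketch, not an official recipe: the function names `dbsnp_url` and `fetch_compressed_dbsnp` are hypothetical, and it assumes `gsutil`, `bgzip`, and `tabix` are installed and on the PATH. It does not reduce the amount of data transferred; it only avoids keeping the uncompressed 10 GB copy on disk.

```shell
# Map a reference build to the dbSNP file named in this thread.
dbsnp_url() {
  base="gs://gcp-public-data--broad-references"
  case "$1" in
    hg19) echo "$base/hg19/v0/Homo_sapiens_assembly19.dbsnp138.vcf" ;;
    hg38) echo "$base/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf" ;;
    *)    echo "unknown build: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: stream the VCF through bgzip, then index it with tabix.
# Assumes gsutil, bgzip, and tabix are available on PATH.
fetch_compressed_dbsnp() {
  url=$(dbsnp_url "$1") || return 1
  out="$(basename "$url").gz"
  gsutil cat "$url" | bgzip -c > "$out" || return 1
  tabix -p vcf "$out"
}
```

Calling `fetch_compressed_dbsnp hg19` would write Homo_sapiens_assembly19.dbsnp138.vcf.gz and its index to the current directory. The exit status of the pipeline is that of `bgzip`, so a failed `gsutil cat` may go unnoticed; in bash, `set -o pipefail` before the call makes such a failure visible.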

     

     

  • Genevieve Brandt (she/her)

    Hi Brian Wiley,

    I can put in a request for this change and bring it up with my team. Our GATK Support Team is not the group that maintains these data resources, so I cannot guarantee any timeline for this. Here is our support policy for more details: https://gatk.broadinstitute.org/hc/en-us/articles/360038469272-What-types-of-questions-will-the-GATK-frontline-team-answer-

    Genevieve

     

  • Genevieve Brandt (she/her)

    Hi Brian Wiley, I wanted to give you a quick update. After bringing this up with the team, I found that they agreed and would prefer a zipped version of the file. Thank you for bringing this issue to our attention! We are looking into changing it, but we still cannot guarantee a timeline.

    Genevieve

  • Linda Do

    Thank you Genevieve Brandt and Brian Wiley.

    You both were very helpful. I was able to successfully download the files.

  • Genevieve Brandt (she/her)

    Great! Thanks for the update Linda Do.

  • Genevieve Brandt (she/her)

    Brian Wiley, we were able to update Homo_sapiens_assembly38.dbsnp138.vcf with Homo_sapiens_assembly38.dbsnp138.vcf.gz; it's in the bucket now.

    Thanks for bringing this to our attention. Hopefully it helps the download speed!

  • Aravind Sundar

    Hi, I was wondering whether, instead of the "Homo_sapiens_assembly38.dbsnp138.vcf.gz" file, we can use the "00-All.vcf.gz" file from the dbSNP database at this location: https://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/

    For context, I am working with whole exome sequencing data rather than whole genome sequencing data. If the file above can be used at all, I am curious whether I can also use the updated version of the common SNPs from the database for the base recalibration step of the variant calling pipeline.

    Any and all help will be appreciated. Thank you

  • Kevin Lydon

    Hi Aravind Sundar,

    It might work, but I can't say with full certainty. Base recalibration could benefit from the more up-to-date SNPs, but something may have changed between the two files that could cause issues. The most likely problem is contig names, so I would recommend comparing the contig names in the two files before running, to see whether they differ.

    Hope this helps!
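
The contig-name check suggested above can be sketched with standard tools. This is an illustrative sketch only: `vcf_contigs` and `contig_diff` are hypothetical helper names, and the functions read contig names from the first column of the data records, which means scanning the whole file.

```shell
# Print each distinct contig name found in the records of a VCF.
# Handles plain-text files and gzip/bgzip-compressed files (by .gz suffix).
vcf_contigs() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | awk -F '\t' '!/^#/ && !seen[$1]++ { print $1 }'
}

# Show contig names that appear in only one of two VCFs.
# Names unique to the first file are unindented; names unique to the
# second file are prefixed with a tab (standard comm -3 output).
contig_diff() {
  tmp_a=$(mktemp)
  tmp_b=$(mktemp)
  vcf_contigs "$1" | sort > "$tmp_a"
  vcf_contigs "$2" | sort > "$tmp_b"
  comm -3 "$tmp_a" "$tmp_b"
  rm -f "$tmp_a" "$tmp_b"
}
```

For example, `contig_diff Homo_sapiens_assembly38.dbsnp138.vcf.gz 00-All.vcf.gz` prints nothing if both files use the same set of contig names. If a file already has a tabix index, `tabix -l file.vcf.gz` lists its contig names without reading every record.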

