Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

(How to) Filter variants either with VQSR or by hard-filtering

3 comments

  • Andrew Zhang

    Very useful! I am wondering: if the data come from whole-genome sequencing, is it necessary to add a depth filter such as DP < min || DP > 2.5 times the average depth in the hard-filtering step?

    I look forward to your reply.
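To make the depth filter in the question concrete, here is a minimal Python sketch that builds the expression string from an average depth. The function name, the default minimum depth of 5, and the example average depth of 30 are illustrative assumptions and not values recommended by the article; only the 2.5 multiplier comes from the question. Whether such a filter is advisable for whole-genome data is not settled in this thread.

```python
def depth_filter_expression(mean_depth, min_depth=5, max_multiplier=2.5):
    """Build a 'DP < min || DP > max' expression from an average depth.

    min_depth is an assumed placeholder; max_multiplier follows the
    2.5x average depth mentioned in the question.
    """
    if mean_depth <= 0:
        raise ValueError("mean_depth must be positive")
    max_depth = mean_depth * max_multiplier
    # ':g' drops a trailing '.0' so 75.0 is written as 75
    return f"DP < {min_depth} || DP > {max_depth:g}"


# Example with an assumed average depth of 30x
print(depth_filter_expression(30.0))
```

A string of this form could then be supplied as the filter expression to VariantFiltration, alongside a filter name of your choosing.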

     

  • Min Ou

    I cannot view the files in gs://gcp-public-data--broad-references/hg38/v0

    It seems we need the Storage Object Viewer permission.

  • Mareike Wendorff

    Hi

    I had the same problem as Min Ou when downloading the data, but I was able to find the files here: https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0;tab=objects?prefix=

    Unfortunately, I still cannot run VariantRecalibrator with the suggested parameters, because the relevant INFO fields are missing. Because the individual-level information is not included in the files either, the fields cannot be added by hand. I therefore get the error:

    A USER ERROR has occurred: Bad input: Values for FS annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations.

     

    Is there a way to get the missing INFO fields for the resource datasets?
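One way to see which file lacks the annotation named in the error is to count how many records carry a given INFO key. The sketch below is a minimal, standard-library-only check that assumes tab-separated VCF text lines; the function name and the sample records are hypothetical and are not taken from the resource files.

```python
def count_info_key(vcf_lines, key):
    """Return (records_with_key, total_records) for one INFO key."""
    with_key = 0
    total = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header lines (## metadata and #CHROM)
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        total += 1
        # INFO is the 8th column: 'KEY=value' or bare flags, ';'-separated
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        if key in info_keys:
            with_key += 1
    return with_key, total


# Hypothetical records: the first has FS, the second does not
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=30;FS=1.2\n",
    "chr1\t200\t.\tC\tT\t60\tPASS\tDP=28\n",
]
print(count_info_key(example, "FS"))
```

For a real file, the lines can come from an open file handle, or from `gzip.open(path, "rt")` for a compressed VCF. A count of zero for FS in the callset being recalibrated would be consistent with the error above.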

     

