Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Funcotator missing ClinVar annotations

16 comments

  • Gökalp Çelik

    Hi Ong Zhi Xuan

    We do not recognize the source of this resource file, and we have no affiliation with that website. Can you try downloading our resource files using the tool named below?

    gatk FuncotatorDataSourceDownloader

    Regards. 

  • Ong Zhi Xuan

    Hi Gökalp Çelik, thank you for your reply. I would need to download the resources locally before using them on my institution's HPC server, as the HPC server does not have Internet access. 

    Could you suggest an alternative to using gatk FuncotatorDataSourceDownloader? 

  • Gökalp Çelik

    Hi again.

    You can still run the tool on a computer with an internet connection and then move the downloaded resource files to the HPC of your choice.

    Alternatively, you can use the Google Cloud bucket for the resource files.

    https://console.cloud.google.com/storage/browser/broad-public-datasets/funcotator 

    I hope this helps. 
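
One way to confirm that the data sources arrived on the HPC intact is to compare checksums before and after the copy. The sketch below is an illustration under stated assumptions, not part of GATK: the function names `make_manifest` and `verify_manifest` are hypothetical, and it assumes GNU coreutils (`sha256sum`) is available on both machines.

```shell
# Write a SHA-256 manifest for every file under a directory.
# Run this on the machine that downloaded the data sources, and redirect
# the output to a file that lives OUTSIDE the directory being listed.
make_manifest() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Verify a copied directory against a manifest (run on the HPC side).
# $1 = copied directory, $2 = absolute path to the manifest file.
# Exits non-zero if any listed file is missing or differs.
verify_manifest() {
    (cd "$1" && sha256sum --quiet -c "$2")
}

# Hypothetical usage (directory and file names are placeholders):
#   make_manifest funcotator_dataSources > /tmp/funcotator.sha256
#   ...copy the directory and the manifest to the HPC...
#   verify_manifest /path/on/hpc/funcotator_dataSources /tmp/funcotator.sha256
```

The manifest uses paths relative to the data source folder, so it stays valid when the folder is copied to a different location.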

  • sernyan lim

    Hi!

    My ClinVar annotation fields are empty. I downloaded the data sources using:

    gatk FuncotatorDataSourceDownloader --somatic --validate-integrity --extract-after-download -hg38

    Is there any possible solution for this case?

    Besides that, there are a lot of UNKNOWN fields in my output. Is that common?
    E.g., Entrez ID, validation status, etc.

    Thanks!

  • Gökalp Çelik

    Hi sernyan lim

    Are you using the latest GATK together with the latest Funcotator resources? My tests show that ClinVar annotations work fine with version 4.6.1.0.

  • sernyan lim

    Hi Gökalp Çelik

    Yes, I'm using the latest version of GATK (4.6.1.0), and my Funcotator data sources are v1.8.hg38.20230908s. Are those the latest Funcotator resources?

  • Gökalp Çelik

    Hi again. 

    ClinVar annotations appear only for variants that are present in the ClinVar source, and the version currently in the resource is from 20230717. You may be able to update the source; however, certain ClinVar fields might have been modified since then, so pay attention to those annotation tags.

    Do you not observe any ClinVar entries in the output for sites that are known to be in ClinVar?

    I can see those sites when I search for lines containing ClinVar text such as likely_benign.
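
The check described above can be sketched as a small shell helper. It skips header lines, because the header of an annotated VCF may itself mention the ClinVar field names and would otherwise inflate the count. The function name `count_matching_records` and the file name `annotated.vcf` are hypothetical.

```shell
# Count data lines (non-header) in an annotated VCF that contain a given text.
# Usage: count_matching_records <vcf> <pattern>
count_matching_records() {
    vcf=$1
    pattern=$2
    # grep -c prints the number of matching lines. It exits non-zero when the
    # count is 0, so "|| true" keeps the function from failing under "set -e".
    grep -v '^#' "$vcf" | grep -c -i -- "$pattern" || true
}

# Hypothetical usage:
#   count_matching_records annotated.vcf likely_benign
```

A count of 0 for every ClinVar significance term you try suggests that no record received a ClinVar annotation at all.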

  • sernyan lim

    Hi,

    It seems I have the latest version of the Funcotator resources.

    To answer your question: none of my variants have a ClinVar annotation. I think the fields really are empty, as none of the lines contain ClinVar text!

    Thanks for replying fast!

  • sernyan lim

    Another quick question: why do my Funcotator resources seem newer than the one you mentioned?

  • Gökalp Çelik

    What is your clinvar version? 

  • sernyan lim

    I'm not sure if this is what you want, but clinvar/hg38/clinvar_20230717_hg38.vcf is the ClinVar VCF I found in my Funcotator log.

  • Gökalp Çelik

    This is what I meant. We are using the same source files. 

  • sernyan lim

    Hmm, it seems we are using the same version of everything. I wonder why I have empty ClinVar annotations and a lot of UNKNOWN fields. Is there any possible solution I can try?

  • Gökalp Çelik

    Hi sernyan lim

    It looks like I was able to track down the problem with ClinVar. We use the default ClinVar VCF files for annotation, but the ClinVar hg38 resource files are posted with contig names that lack the "chr" prefix. As a result, Funcotator cannot look up variant information in the ClinVar VCF file if your hg38 contig names start with "chr".

    To fix this issue temporarily on your end, you can use

    bcftools annotate

    to add the chr prefix to the ClinVar VCF file in the Funcotator data sources and rerun your annotation. We will post a fix to our data sources on our end.

    UNKNOWN fields are normal behavior for Funcotator: those entries were not found in the resource or in the input VCF itself.

    I hope this helps. 
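
Before renaming anything, it can help to confirm the mismatch. The sketch below first reports the contig naming style of a VCF, then writes the "old new" mapping file that `bcftools annotate --rename-chrs` reads. The function names `contig_style` and `write_chr_map` and the output file names are hypothetical, and mapping `MT` to `chrM` is an assumption about how the two naming schemes correspond; check it against your own reference.

```shell
# Report whether the first data record of an uncompressed VCF uses
# "chr"-prefixed contig names. Prints "chr", "no_chr", or "empty".
contig_style() {
    awk '!/^#/ { print ($1 ~ /^chr/ ? "chr" : "no_chr"); found = 1; exit }
         END   { if (!found) print "empty" }' "$1"
}

# Write a two-column mapping (old name, new name) for the primary contigs:
# 1..22, X, Y, and MT. Other contigs are not listed here.
write_chr_map() {
    i=1
    while [ "$i" -le 22 ]; do
        printf '%s chr%s\n' "$i" "$i"
        i=$((i + 1))
    done
    printf 'X chrX\nY chrY\nMT chrM\n'
}

# Hypothetical usage (requires bcftools; file names are placeholders):
#   contig_style my_input.vcf                    # e.g. prints "chr"
#   contig_style clinvar_20230717_hg38.vcf       # e.g. prints "no_chr"
#   write_chr_map > chr_map.txt
#   bcftools annotate --rename-chrs chr_map.txt -o renamed.vcf clinvar_20230717_hg38.vcf
```

If both files report the same style, the empty ClinVar fields are probably caused by something other than contig naming.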

  • Kaina Millan

    Hello, I am wondering if a fix ended up being posted to your DataSources? I am having this same issue and was wondering how to work around it. I attempted using bcftools annotate as a temporary fix but, since the clinvar vcf files do not define the contigs and their lengths in the header, bcftools is unable to read the files and edit them. 

    For reference, I am trying to use clinvar_20230717_hg38.vcf within funcotator data sources. 

    [W::vcf_parse] Contig '1' is not defined in the header. (Quick workaround: index the file with tabix.)
    Warning: Encountered an error, proceeding only because --force was given.
             Note that this can result in a segfault or a silent corruption of the output file!
    [E::vcf_format] Invalid BCF, CONTIG id=0 not present in the header

  • Gökalp Çelik

    Hi Kaina Millan

    You can block-compress your ClinVar VCF file with bgzip and index it with tabix. Once you have done this, you need to modify the ClinVar resource config file in the same folder, named clinvar_vcf.config, as follows:

    name = ClinVar_VCF
    version = 20230717_hg38
    src_file = clinvar_20230717_hg38.vcf.gz

    We are working on a new data source and until then you can use this workaround to fix your issue.

    I hope this helps.

    Regards. 
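
As an alternative sketch for the case where bcftools refuses to parse the file, the contig names can be rewritten as plain text, which does not depend on contig definitions in the header. This is not the method described in the replies above. The function names `add_chr_prefix` and `set_src_file` and the `.chr` file names are hypothetical, and the `MT` to `chrM` mapping is an assumption to verify against your reference.

```shell
# Add a "chr" prefix to the CHROM column of an uncompressed VCF read from
# stdin. MT becomes chrM, names that already start with "chr" are left
# alone, and header lines pass through unchanged.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { print; next }
         {
             if ($1 == "MT")            $1 = "chrM"
             else if ($1 !~ /^chr/)     $1 = "chr" $1
             print
         }'
}

# Point src_file in a data source config at a new file name.
# Reads the config on stdin and writes the edited config to stdout.
# Assumes the file name contains no "|" or "&" characters.
set_src_file() {
    sed "s|^src_file = .*|src_file = $1|"
}

# Hypothetical usage (requires bgzip and tabix; file names are placeholders):
#   add_chr_prefix < clinvar_20230717_hg38.vcf > clinvar_20230717_hg38.chr.vcf
#   bgzip clinvar_20230717_hg38.chr.vcf
#   tabix -p vcf clinvar_20230717_hg38.chr.vcf.gz
#   set_src_file clinvar_20230717_hg38.chr.vcf.gz < clinvar_vcf.config > clinvar_vcf.config.new
```

Keeping the original VCF and config untouched until the rerun shows ClinVar annotations makes it easy to roll back.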