Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

HaplotypeCaller with additional annotation?

Answered

5 comments

  • Pamela Bretscher

    Hi Sarah Bonnin,

    As far as I can tell, you are running the command correctly. Could you try the most recent version of GATK to see whether that resolves it?

    Kind regards,

    Pamela

  • Sarah Bonnin

    Hi Pamela,

    Thank you for your reply. Today I have tried to run HaplotypeCaller with the newest version of GATK (4.2.0.0).

    While I do get a VCF file, I unfortunately don't get the fields "QD" and "FS" in it, which I need later on for filtering the variants.

    I ran HaplotypeCaller like this:

    gatk HaplotypeCaller -R $FASTA -I $bam -O test.vcf.gz -ERC GVCF -stand-call-conf 20 -A QualByDepth -A FisherStrand

    And here are the first few rows of the VCF file (skipping the header):

    #CHROM  POS      ID  REF  ALT        QUAL  FILTER  INFO         FORMAT              CCW6_01543AAC_GCCAAT
    chr1    1        .   N    <NON_REF>  .     .       END=3058518  GT:DP:GQ:MIN_DP:PL  0/0:0:0:0:0,0,0
    chr1    3058519  .   T    <NON_REF>  .     .       END=3058558  GT:DP:GQ:MIN_DP:PL  0/0:1:3:1:0,3,34
    chr1    3058559  .   T    <NON_REF>  .     .       END=3096935  GT:DP:GQ:MIN_DP:PL  0/0:0:0:0:0,0,0
    chr1    3096936  .   A    <NON_REF>  .     .       END=3096959  GT:DP:GQ:MIN_DP:PL  0/0:1:3:1:0,3,31
    chr1    3096960  .   A    <NON_REF>  .     .       END=3121908  GT:DP:GQ:MIN_DP:PL  0/0:0:0:0:0,0,0
    chr1    3121909  .   A    <NON_REF>  .     .       END=3121948  GT:DP:GQ:MIN_DP:PL  0/0:1:3:1:0,3,20
    chr1    3121949  .   G    <NON_REF>  .     .       END=3153289
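
One quick way to confirm which INFO annotations a VCF actually carries is to tally the INFO keys across its records. The sketch below is a hypothetical helper (`list_info_keys` is not part of GATK) and assumes a tab-delimited VCF, plain or gzip/bgzip-compressed, plus standard `gzip`, `awk`, and `sort`. Run on a GVCF made up only of reference blocks like the rows in the excerpt, it would list `END` and nothing else, consistent with `QD` and `FS` being absent.

```shell
# Hypothetical helper (not a GATK command): list every INFO key that occurs
# in the records of a VCF, with the number of records carrying it.
list_info_keys() {
  local vcf=$1
  local reader=cat
  # Use gzip -dc for compressed input; bgzip output is gzip-compatible.
  case "$vcf" in
    *.gz) reader="gzip -dc" ;;
  esac
  $reader "$vcf" | awk -F'\t' '
    /^#/ { next }                   # skip header lines
    $8 == "." || $8 == "" { next }  # record has no INFO entries
    {
      n = split($8, fields, ";")
      for (i = 1; i <= n; i++) {
        split(fields[i], kv, "=")   # flag entries have no "="; kv[1] is still the key
        count[kv[1]]++
      }
    }
    END { for (k in count) print k "\t" count[k] }
  ' | sort
}

# Example: list_info_keys test.vcf.gz
```

Each output line is an INFO key followed by a tab and the number of records that carry it, so a missing `QD` or `FS` shows up as a missing line.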
  • Pamela Bretscher

    Hi Sarah Bonnin,

    Thank you for providing this information. Since HaplotypeCaller produces an output but without your annotations, you might find this document helpful; it lists reasons why annotations can be missing from a VCF file:

    https://gatk.broadinstitute.org/hc/en-us/articles/360035532272-Missing-annotations-in-the-output-callset-VCF

    Please let me know if this does not answer your question.

    Kind regards,

    Pamela

  • Sarah Bonnin

    Hi Pamela Bretscher,

    Thank you. I've tried several things and, apparently, the `-ERC GVCF` option was the issue.

    When I remove it, I get the annotation fields. Do you know why this is happening?

    From the documentation: "the GVCF has records for all sites, whether there is a variant call there or not."

    Is it because all sites are reported that the annotations can't be added?

    Thank you!

  • Pamela Bretscher

    Hi Sarah Bonnin,

    I'm glad you were able to get it to work, and thank you for sharing the solution! This happens because additional annotations are meant to apply to variant calls. Most likely, the annotations did not appear because the non-variant sites in the GVCF are incompatible with them.

    Kind regards,

    Pamela

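
As a follow-up to the explanation in the thread: in the usual GATK4 workflow a GVCF is an intermediate file, and site-level annotations such as `QD` and `FS` are typically added when the GVCF is genotyped with GenotypeGVCFs. The sketch below assumes that workflow, a single sample, and `gatk` on the `PATH`. The wrapper name `call_then_genotype` and the output prefix are hypothetical; the tool names and the `-R`, `-I`, `-V`, `-O`, and `-ERC` flags are GATK's own.

```shell
# Sketch only: two-step calling for one sample, assuming GATK4's `gatk` is on PATH.
# call_then_genotype is a hypothetical wrapper name, not a GATK command.
call_then_genotype() {
  local fasta=$1 bam=$2 prefix=$3

  # Step 1: per-sample GVCF (reference blocks plus candidate variant sites).
  gatk HaplotypeCaller -R "$fasta" -I "$bam" -O "$prefix.g.vcf.gz" -ERC GVCF || return 1

  # Step 2: genotype the GVCF into a regular VCF of variant calls.
  gatk GenotypeGVCFs -R "$fasta" -V "$prefix.g.vcf.gz" -O "$prefix.vcf.gz"
}

# Example: call_then_genotype "$FASTA" "$bam" test
```

If only one sample is being called and no joint genotyping is planned, dropping `-ERC GVCF`, as was done in the thread, produces a regular VCF in a single step.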
