Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Variant filtered from GVCF

Answered

9 comments

  • Genevieve Brandt (she/her)

    Hi LP,

    Thank you for writing into the GATK forum! To troubleshoot the issue you are seeing, I would first recommend working through this troubleshooting doc: When HaplotypeCaller and Mutect2 do not call an expected variant.

    See what you can find out from that article and then let me know what further questions you have.

    Best,

    Genevieve

  • LP

    Hi Genevieve,

    Thank you for your help! I have looked at the troubleshooting doc.

    The expected variant was called at the site of interest in the GVCF (see above), but excluded from the VCF by GenotypeGVCFs.

    I didn't find any evidence of repeats in this region.

    The quality and depth look OK. The only thing that looks unusual to me is the BaseQRankSum. I believe that indicates that the base quality of the ALT reads is lower? Also, the ratio of REF to ALT alleles deviates from the expected 1:1.

    Many thanks

     

  • Genevieve Brandt (she/her)

    LP, could you share what this site looks like in the HaplotypeCaller bamout? Make sure to highlight this region so we can see the read counts.

    I would recommend upgrading to a newer GATK version to see if that helps, and also using the parameter --linked-de-bruijn-graph.

    I would also like to see what the VCF line looks like if you force-call the allele with GenotypeGVCFs, and whether this variant gets filtered during variant filtering.
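
    A minimal sketch of the rerun suggested here, assuming a newer GATK 4 release is on the PATH. The BAM and output file names (Sample_A.bam, Sample_A.linked.g.vcf.gz, Sample_A.bamout.bam) are hypothetical placeholders; the reference and interval are the ones LP uses later in this thread. The script only assembles and prints the command, so nothing runs until the names are adjusted:

    ```shell
    # Placeholder input/output names: replace with real paths before running.
    cmd="gatk HaplotypeCaller \
    --reference Homo_sapiens_assembly38.fasta \
    --input Sample_A.bam \
    --output Sample_A.linked.g.vcf.gz \
    --emit-ref-confidence GVCF \
    --intervals chr20:4690000-4700000 \
    --linked-de-bruijn-graph \
    --bam-output Sample_A.bamout.bam"

    # Dry run: show the command that would be launched.
    printf '%s\n' "$cmd"

    # To actually launch it once the paths are real:
    # eval "$cmd"
    ```

    The file written by --bam-output holds the reads as HaplotypeCaller realigned them, which is the track to load in IGV when counting REF and ALT reads at the site.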

  • LP

    Hi Genevieve,

    I have attached a screenshot from IGV for the HaplotypeCaller bam, with the variant highlighted (from Sample A).

    How do you force the variant to be called by GenotypeGVCFs?

    Many thanks again

  • Genevieve Brandt (she/her)

    LP, you can try using the argument -all-sites! Let me know what that looks like.

  • LP

    Hi Genevieve,

    I ran this:

    gatk GenotypeGVCFs --output allsites.vcf --reference Homo_sapiens_assembly38.fasta --variant Sample_A.g.vcf.gz --include-non-variant-sites true --intervals chr20:4690000-4700000

    on GATK 4.1.7.0 (to keep consistent with the Sarek pipeline) and seem to have the same result for our variant of interest:

    chr20 4699605 . A G 177.64 . AC=1;AF=0.500;AN=2;BaseQRankSum=-1.817e+00;DP=30;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.00;QD=6.34;ReadPosRankSum=1.01;SOR=0.818 GT:AD:DP:GQ:PL 0/1:19,9:28:99:185,0,568

    I did restrict the interval for time's sake - would that make any difference?
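
    The allele ratio in the record above can be checked with a few lines of awk. This is a rough sketch that assumes the columns are tab-separated (the forum rendering shows spaces) and that the FORMAT order is GT:AD:DP:GQ:PL, as in the posted line:

    ```shell
    # The posted record, with tabs restored between the VCF columns (an assumption).
    record=$(printf 'chr20\t4699605\t.\tA\tG\t177.64\t.\tAC=1;AF=0.500;AN=2;BaseQRankSum=-1.817e+00;DP=30;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.00;QD=6.34;ReadPosRankSum=1.01;SOR=0.818\tGT:AD:DP:GQ:PL\t0/1:19,9:28:99:185,0,568')

    summary=$(printf '%s\n' "$record" | awk -F'\t' '{
        split($10, sample, ":")      # sample column: GT:AD:DP:GQ:PL
        split(sample[2], ad, ",")    # AD holds the REF,ALT read counts
        total = ad[1] + ad[2]
        printf "ref=%d alt=%d alt_fraction=%.2f qual_per_read=%.2f", ad[1], ad[2], ad[2] / total, $6 / total
    }')

    printf '%s\n' "$summary"
    # prints: ref=19 alt=9 alt_fraction=0.32 qual_per_read=6.34
    ```

    An ALT fraction of 0.32 is the deviation from 1:1 mentioned earlier in the thread, and QUAL divided by the 28 informative reads matches the QD=6.34 reported in the INFO field.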

  • Genevieve Brandt (she/her)

    LP, thanks for this information! Could you also provide the output with a newer version of GATK and --linked-de-bruijn-graph enabled?

  • LP

    Hi Genevieve,

    We repeated the analysis with GATK 4.2.1 with --linked-de-bruijn-graph enabled and the variant now survives to the VCF!

    We are not yet sure whether this is because of the new version or because of --linked-de-bruijn-graph, so we will repeat the analysis without this option enabled.

    Is there any reason why --linked-de-bruijn-graph is not enabled by default? Are there any risks or drawbacks to using it?

    Thanks again for your help with this
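
    For the with/without comparison described here, one way to check each output is a small helper that reports whether a VCF contains a record at the site. This is a sketch, not a GATK feature: site_in_vcf is a hypothetical helper name, and the two files are toy stand-ins created on the spot, not real pipeline results:

    ```shell
    # site_in_vcf FILE CHROM POS: succeed if an uncompressed VCF has a record there.
    site_in_vcf() {
        awk -F'\t' -v chrom="$2" -v pos="$3" '
            /^#/ { next }                                  # skip header lines
            $1 == chrom && $2 == pos { found = 1; exit }   # stop at the first match
            END { exit !found }                            # status 0 only if matched
        ' "$1"
    }

    # Toy stand-ins for the two GenotypeGVCFs outputs being compared.
    tmpdir=$(mktemp -d)
    printf '#CHROM\tPOS\tID\tREF\tALT\nchr20\t4699605\t.\tA\tG\n' > "$tmpdir/run_a.vcf"
    printf '#CHROM\tPOS\tID\tREF\tALT\n' > "$tmpdir/run_b.vcf"

    for vcf in "$tmpdir/run_a.vcf" "$tmpdir/run_b.vcf"; do
        if site_in_vcf "$vcf" chr20 4699605; then
            echo "$(basename "$vcf"): site present"
        else
            echo "$(basename "$vcf"): site absent"
        fi
    done

    rm -rf "$tmpdir"
    ```

    For a compressed .vcf.gz, decompress it first (for example with gzip -dc into a temporary file) before passing it to the helper.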

  • Genevieve Brandt (she/her)

    Hi LP,

    I'm glad you are seeing improvement when using the --linked-de-bruijn-graph option! The --linked-de-bruijn-graph option is relatively new, so our developers are still determining whether they want to make it the default. It has been helping many users recover sites of interest, so I would recommend continuing to use this argument!

    Here are some related forum posts:

    Let me know if you have any other questions.

    Best,

    Genevieve

