Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

The provided reference alleles do not appear to represent the same position, TG* vs. TT*


1 comment

  • Official comment
    Gökalp Çelik

    Hi Ben Oppenheimer

    This question was answered in the Slack channel, so I am attaching the official response from there and closing the topic as resolved.

    Response from @slee from the slack channel:

    OK, looks like it's actually the dbsnp resource that is problematic (specifically, at rs9278466); running ValidateVariants gives:
    A USER ERROR has occurred: Input gs://gcp-public-data--broad-references/hg19/v0/dbsnp_138.b37.vcf.gz fails strict validation of type ALL: the REF allele is incorrect for the record at position 6:29912413, fasta says TG vs. VCF says TT
    Depending on your use case, you can probably safely drop the dbsnp resource, as it is only used to mark known sites and doesn't affect the filtering. 
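
For readers who want to see what a REF check like the one reported above amounts to, here is a minimal sketch in Python. It is not GATK's implementation: the function name `find_ref_mismatches`, the in-memory `reference` dictionary standing in for a FASTA file, and the toy coordinates are all assumptions made for illustration. It compares each record's REF allele with the reference bases at the same 1-based position and reports disagreements.

```python
def find_ref_mismatches(vcf_lines, reference):
    """Yield (chrom, pos, fasta_ref, vcf_ref) for records whose REF allele
    disagrees with the reference sequence.

    `reference` is a hypothetical in-memory stand-in for a FASTA file:
    a dict mapping contig name -> sequence string.
    """
    for line in vcf_lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, vcf_ref = fields[0], int(fields[1]), fields[3]
        if chrom not in reference:
            continue  # contig missing from the toy reference; nothing to compare
        start = pos - 1  # VCF POS is 1-based; Python slicing is 0-based
        fasta_ref = reference[chrom][start:start + len(vcf_ref)]
        if fasta_ref.upper() != vcf_ref.upper():
            yield chrom, pos, fasta_ref, vcf_ref


# Toy data only: a 5-base "contig 6" and two records at the same position.
reference = {"6": "ACTGA"}
vcf_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "6\t3\trs_ok\tTG\tT\t.\t.\t.\n",   # REF agrees with the reference
    "6\t3\trs_bad\tTT\tT\t.\t.\t.\n",  # REF disagrees (TG vs. TT)
]

for chrom, pos, fasta_ref, vcf_ref in find_ref_mismatches(vcf_lines, reference):
    print(f"{chrom}:{pos} fasta says {fasta_ref} vs. VCF says {vcf_ref}")
```

On real data the reference would be read from an indexed FASTA rather than held in a dictionary, and ValidateVariants remains the authoritative check.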

Post is closed for comments.