Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GATK 4.0.3.0 - CombineGVCFs - Unexpected base in allele bases

4 comments

  • Genevieve Brandt (she/her)

    Hi HT,

    I would first recommend upgrading GATK: version 4.0.3.0 is quite old, and this error could come from a bug that has since been fixed. The current release is 4.2.2.0. GATK does support '*' as the allele for a spanning deletion. However, I'm not sure it is accepted when combined with CG in a single allele (*CG).

    How did you create these VCF files?

    Best,

    Genevieve

  • HT

    Hi Genevieve,

    Thank you for your speedy reply!

    Noted that this may be a bug in older versions. I followed the GATK Best Practices workflow and used HaplotypeCaller to create per-sample gVCF files. I then combined the individual gVCF files 200 samples at a time, and that step did not produce the "Unexpected base in allele bases" error. The error occurred only when I tried to combine the resulting multi-sample gVCF files.

    If I would still like to keep using version 4.0.3.0, is there a way to work around this?

    Thank you!!

    Best,

    HT

     

  • Genevieve Brandt (she/her)

    Hi HT,

    Our Best Practices pipelines combine samples with GenomicsDBImport. Samples can be added incrementally with this tool, which would probably work a lot better for you.

    I'm not sure why this error message came up in CombineGVCFs, but unfortunately we can't fix the tool for you unless you are able to upgrade. As a workaround, you could look for the *CG allele in your file after 1:762047 and then skip that site.

    Best,

    Genevieve

  • HT

    Hi Genevieve,

    Understood. Thank you for your kind help!

    Best,

    HT
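
The workaround suggested in the thread (find the *CG allele after 1:762047 and skip that site) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the input is an uncompressed VCF read as text lines, and the allele check is a simplified approximation (plain bases, a lone '*', symbolic alleles such as <NON_REF>, and breakends are treated as fine), not the validation GATK itself performs.

```python
import re

# Simplified assumption: a "normal" allele is a run of A/C/G/T/N bases.
_BASES = re.compile(r"^[ACGTNacgtn]+$")


def is_suspect_allele(allele):
    """Return True for an allele that mixes '*' (or other symbols) with bases, e.g. '*CG'."""
    if allele in ("*", "."):  # spanning deletion or missing value
        return False
    if allele.startswith("<") and allele.endswith(">"):  # symbolic, e.g. <NON_REF>
        return False
    if "[" in allele or "]" in allele:  # breakend notation
        return False
    return _BASES.match(allele) is None


def find_suspect_sites(vcf_lines, chrom, start):
    """Yield (chrom, pos, allele) for suspect REF/ALT alleles on chrom at or after start."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete record
        rec_chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        if rec_chrom != chrom or pos < start:
            continue
        for allele in [ref] + alts.split(","):
            if is_suspect_allele(allele):
                yield rec_chrom, pos, allele


def drop_sites(vcf_lines, bad_sites):
    """Return the lines with every record at a (chrom, pos) in bad_sites removed."""
    kept = []
    for line in vcf_lines:
        if not line.startswith("#"):
            fields = line.split("\t")
            if len(fields) >= 2 and (fields[0], int(fields[1])) in bad_sites:
                continue  # skip the offending site
        kept.append(line)
    return kept
```

A possible use would be to read the multi-sample gVCF into a list of lines, call `find_suspect_sites(lines, "1", 762047)` to list candidate sites, and pass the resulting `(chrom, pos)` pairs to `drop_sites` before writing the file back out. Keeping the search and the removal separate lets you inspect what would be dropped first.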

     

