Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

GenomicsDBImport error - non-standard non-IUPAC base

2 comments

  • Kevin Lydon

    The non-IUPAC characters are probably in the alleles of the VCFs coming from the Parabricks pipeline. I think you'll have to delete those lines before running GenomicsDBImport.

  • Tristan Dennis

    Hi there - sorry for the delay. I found the source of the error. The genome FASTA was truncated around halfway through the first chromosome. I had used a different file to the one I have on the GPU and at some point the copy had broken. GenotypeGVCFs crashed and reported the error once it reached that point.

    Thanks for your help!

    Tristan

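
For anyone who wants to test the first suggestion in this thread (non-IUPAC characters in the VCF alleles), here is a minimal sketch in Python. It is not part of GATK or Parabricks; the function names and the choice to skip symbolic alleles such as `<NON_REF>`, `*`, `.` and breakend notation are assumptions made for illustration. It only reports suspicious records and does not delete anything.

```python
import gzip
import sys

# IUPAC nucleotide codes (assumption: compared case-insensitively).
IUPAC_BASES = set("ACGTURYSWKMBDHVN")


def is_plain_sequence(allele):
    """Return False for allele forms that are not plain base strings."""
    if allele in {".", "*"}:
        return False  # missing value or spanning deletion
    if allele.startswith("<") and allele.endswith(">"):
        return False  # symbolic allele such as <NON_REF>
    if "[" in allele or "]" in allele:
        return False  # breakend notation
    return True


def find_non_iupac_alleles(lines):
    """Yield (line_number, chrom, pos, allele) for alleles with non-IUPAC characters."""
    for line_number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        for allele in [ref] + alt.split(","):
            if is_plain_sequence(allele) and not set(allele.upper()) <= IUPAC_BASES:
                yield line_number, chrom, pos, allele


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open_vcf(sys.argv[1]) as handle:
        for line_number, chrom, pos, allele in find_non_iupac_alleles(handle):
            print(f"line {line_number}: {chrom}:{pos} allele {allele!r}")
```

Run it with the path to one GVCF as the only argument. If it prints nothing, the alleles are probably not the source of the error and the reference FASTA is worth checking next.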

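
Since the cause in this thread turned out to be a truncated reference FASTA, a quick sanity check on the reference copy can save time. The sketch below is an illustration only, with hypothetical function names. It assumes you have a `.fai` index (name and length in the first two tab-separated columns) that was built from an intact copy of the reference. If the index was built from the broken copy, the lengths will agree and this check will not catch the problem.

```python
import sys


def fasta_contig_lengths(lines):
    """Count the bases actually present for each contig in a FASTA file."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""  # contig name is the first token
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def fai_contig_lengths(lines):
    """Read expected contig lengths from a .fai index (columns: name, length, ...)."""
    lengths = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def find_length_mismatches(expected, observed):
    """Return (name, expected_length, observed_length) for contigs that disagree.

    observed_length is None when the contig is missing from the FASTA entirely.
    """
    mismatches = []
    for name, expected_length in expected.items():
        observed_length = observed.get(name)
        if observed_length != expected_length:
            mismatches.append((name, expected_length, observed_length))
    return mismatches


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage (hypothetical paths): python check_reference.py genome.fa intact_genome.fa.fai
    with open(sys.argv[1]) as fasta, open(sys.argv[2]) as fai:
        observed = fasta_contig_lengths(fasta)
        expected = fai_contig_lengths(fai)
    for name, expected_length, observed_length in find_length_mismatches(expected, observed):
        print(f"{name}: expected {expected_length} bases, found {observed_length}")
```

A FASTA cut off partway through the first chromosome would show up as a short first contig followed by missing contigs. Comparing checksums of the two copies of the file is a simpler alternative when both are still available.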