Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GATK4 HaplotypeCaller ERROR : expected haplotypes.size() >= eventsAtThisLoc.size() + 1

9 comments

  • Bhanu Gandham

    Hi,

    You are using a very old version of GATK. Could you upgrade to the latest one and try again? Here are the latest versions: https://github.com/broadinstitute/gatk/releases

    I suspect this is solved in the more recent releases.

  • Şevval Aktürk

    Thanks a lot, but I've already used GATK 4.1 with the same arguments (except minimum base quality and mapping quality) for another BAM file. Here is my command:

    java "-Xmx25g" -jar ..../gatk HaplotypeCaller -I xxx.bam -R hs37d5.fa -L ..../1000G_chr21.bed --output-mode EMIT_ALL_SITES --genotyping-mode GENOTYPE_GIVEN_ALLELES --alleles ALL.chr21.phase3.vcf .vcf --output kkk.vcf

    Again, there was another error:

    badly formed variant context at location 21:10945870; getEnd() was 10945870 but this VariantContext contains an END key with value 10990763

    What could be the reason for that?

  • David Benjamin

    4.1 is pretty old by now, too.  There have been several improvements to HaplotypeCaller's force-calling mode since then.  Note that if you use the latest release you need only specify "-alleles ____.vcf" and not "-genotyping-mode GENOTYPE_GIVEN_ALLELES".

    However, it seems that there might just be something wrong with the VCF, if the error message is to be believed.  Could you paste the VCF record at 21:10945870 from ALL.chr21.phase3.vcf?
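For anyone wanting to locate the record requested above, here is a minimal Python sketch. It is not part of GATK. It assumes the mismatch in the error message is between the INFO `END` value and the end implied by `POS + len(REF) - 1`, and it also reports records that start earlier but span the queried position, which a plain search for the coordinate would miss. The function names and the script name in the usage note are hypothetical.

```python
import gzip
import sys


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def info_end(info):
    """Return the integer END value from an INFO column, or None if absent."""
    for field in info.split(";"):
        if field.startswith("END="):
            return int(field[4:])
    return None


def records_overlapping(lines, chrom, pos):
    """Yield (record, implied_end, declared_end) for records covering chrom:pos.

    implied_end is POS + len(REF) - 1; declared_end is the INFO END value.
    Assumption: a record "covers" pos if pos lies between POS and the larger
    of those two ends.
    """
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8 or cols[0] != chrom:
            continue
        start = int(cols[1])
        implied_end = start + len(cols[3]) - 1
        declared_end = info_end(cols[7])
        end = implied_end if declared_end is None else max(implied_end, declared_end)
        if start <= pos <= end:
            yield line.rstrip("\n"), implied_end, declared_end


def main(argv):
    path, chrom, pos = argv[1], argv[2], int(argv[3])
    with open_vcf(path) as handle:
        for record, implied, declared in records_overlapping(handle, chrom, pos):
            mismatch = declared is not None and declared != implied
            note = "  <-- END differs from POS + len(REF) - 1" if mismatch else ""
            # 1000 Genomes records carry thousands of genotype columns,
            # so only the leading part of each record is printed.
            print(record[:200] + note)


if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv)
```

With the coordinates from the error message, it could be run as `python find_vcf_record.py ALL.chr21.phase3.vcf 21 10945870`.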

  • Şevval Aktürk

    Thank you. I searched for it in the ALL.chr21.phase3.vcf file, but I couldn't find that position on chr21.

    Actually, I tried the same command and the same input BAM file with the GATK3 UnifiedGenotyper tool, and it worked well without any error like this.

    That's why I don't understand what is going on when I use HaplotypeCaller.

  • David Benjamin

    I would still try GATK 4.1.7, but what's the stack trace in 4.1?

  • Şevval Aktürk

    I'm sorry, I don't have the stack trace for 4.1 anymore, but I tried the latest version as you suggested.

    I tried SNP calling on the same set of BAM files (low-coverage human DNA samples covering all chromosomes) with GATK 4.1.7 HaplotypeCaller. I used the emit-all-active-sites output mode along with the alleles, minimum mapping quality, and minimum base quality arguments.

    Here is my command:

    java -Xmx25g -jar  .../gatk-4.1.7.0/gatk-package-4.1.7.0-local.jar HaplotypeCaller -I Bam.list -R .../hs37d5.fa -L ../1000G.bed --output-mode EMIT_ALL_ACTIVE_SITES --alleles .../1000genome.phase3.vcf.gz --output xxx.vcf --minimum-mapping-quality 30 --min-base-quality-score 30

    Here is the error I got:

    java.lang.StringIndexOutOfBoundsException: String index out of range: -1
            at java.lang.String.substring(String.java:1927)
            at org.broadinstitute.hellbender.tools.walkers.annotator.TandemRepeat.getNumTandemRepeatUnits(TandemRepeat.java:54)
            at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyRegionTrimmer.trim(AssemblyRegionTrimmer.java:175)
            at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:552)
            at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.apply(HaplotypeCaller.java:210)
            at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:200)
            at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:173)
            at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
            at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
            at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
            at org.broadinstitute.hellbender.Main.main(Main.java:292)

    Could you please suggest what to do about this as well?

  • David Benjamin

    Şevval Aktürk That looks like a bug in the TandemRepeat code.  I'm surprised we have never seen it before.  We will have to fix it.

  • Şevval Aktürk

    Thank you so much for the reply. 

  • David Benjamin

    Şevval Aktürk This PR: https://github.com/broadinstitute/gatk/pull/6583 should fix the bug.  Thank you for finding and reporting it!

    I have put a patched version of 4.1.7 in a public google bucket:

    gs://broad-dsde-methods-davidben/gatk-builds/gatk-4.1.7-tandem-repeat-patch.jar

    Do not hesitate to let me know if this fails.
