Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Evaluating the quality of a germline short variant callset

2 comments

  • Kshama Aswath

    Hi,

    I am running VariantEval and encountered the error shown under ERROR below.

    I got the same error on my first pass. I then found another post reporting the same error (the run got only as far as chr14), where the poster had NOT sorted their VCF file, which was exactly my situation: I had not sorted my VCF either. So I went back, sorted my input VCF with Picard, and fed the sorted file into the VariantEval command shown below. This time it ran further, up to chrX, but then failed with the error below (ERROR).

    One thing to point out: my sorted VCF contains only the few chromosomes I wanted to look at (chr2, 3, 5, 7 and 14). Is that an issue? I don't see how, but it seemed worth asking. Also, could this be a bug, since I am running the BETA?

    Any help would be appreciated.

    MY COMMAND:

    gatk VariantEval -R Homo_sapiens_assembly38.fasta -eval fgeno_output_sorted.vcf -O fgeno_variant_eval.tbl -D dbsnp_146.hg38.vcf.gz -no-ev -EV CompOverlap -EV CountVariants -EV IndelSummary -EV MultiallelicSummary -EV TiTvVariantEvaluator

    Why am I getting this error even though my file is sorted?

    GATK/4.1.8.0

    ERROR:

    22:42:07.179 INFO  ProgressMeter -       chrX:124684150             39.6             147787000        3736018.7
    22:42:17.192 INFO  ProgressMeter -       chrX:146492652             39.7             148520000        3738775.7
    22:42:26.416 INFO  VariantEval - Shutting down engine
    [August 15, 2020 10:42:26 PM EDT] org.broadinstitute.hellbender.tools.walkers.varianteval.VariantEval done. Elapsed time: 39.91 minutes.
    Runtime.totalMemory()=3758620672

    java.lang.IllegalStateException: The elements of the input Iterators are not sorted according to the comparator htsjdk.variant.variantcontext.VariantContextComparator
            at htsjdk.samtools.util.MergingIterator.next(MergingIterator.java:107)
            at java.util.Iterator.forEachRemaining(Iterator.java:116)
            at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
            at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
            at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
            at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
            at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
            at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
            at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
            at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:118)
            at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1049)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
            at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
            at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
            at org.broadinstitute.hellbender.Main.main(Main.java:292)
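
The trace shows the exception coming from a merging iterator over the input variant files, so it may be worth checking each input (the -eval VCF and the -D dbSNP file) separately rather than assuming the eval file is at fault. The Python below is a minimal sketch of such a check, not GATK's own logic. It assumes that "sorted" means non-decreasing by contig rank and then by position, with contig rank taken from the `##contig` lines of the VCF header unless an explicit order is passed in. The names `first_unsorted_record` and `CONTIG_ID` are hypothetical helpers introduced for illustration.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Captures the ID from a header line such as "##contig=<ID=chr2,length=...>".
CONTIG_ID = re.compile(r"^##contig=<.*?\bID=([^,>]+)")


def first_unsorted_record(
    lines: Iterable[str],
    contig_order: Optional[List[str]] = None,
) -> Optional[Tuple[int, str, int]]:
    """Return (line_number, contig, pos) of the first out-of-order record.

    Returns None when every record is in non-decreasing (contig rank, pos)
    order. Assumption: contig rank comes from `contig_order` if given,
    otherwise from the order of the ##contig lines in the header. A record
    on a contig missing from that order is also reported.
    """
    header_contigs: List[str] = []
    rank = None
    previous = None
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##"):
            match = CONTIG_ID.match(line)
            if match:
                header_contigs.append(match.group(1))
            continue
        if not line or line.startswith("#"):
            continue  # skip the #CHROM column header and blank lines
        if rank is None:
            order = contig_order if contig_order is not None else header_contigs
            rank = {name: index for index, name in enumerate(order)}
        contig, pos_text = line.split("\t", 2)[:2]
        pos = int(pos_text)
        if contig not in rank:
            return (line_number, contig, pos)
        key = (rank[contig], pos)
        if previous is not None and key < previous:
            return (line_number, contig, pos)
        previous = key
    return None


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        "##contig=<ID=chr2,length=1000>",
        "##contig=<ID=chr14,length=1000>",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr14\t5\t.\tA\tG\t.\t.\t.",
        "chr2\t10\t.\tA\tG\t.\t.\t.",  # lexicographic order, not header order
    ]
    print(first_unsorted_record(demo))
```

For a real file, the lines could come from `open(path, "rt")`, or from `gzip.open(path, "rt")` for a `.vcf.gz`. Passing the contig names from the reference's sequence dictionary as `contig_order` would test against the reference ordering rather than the file's own header, which matters if the two disagree.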

     

  • Eva (Evander)

    Thank you for the great explanation!

    I have one comment and one question.

    The comment is that the link to the figure in "The relationship between variant-level concordance and genotype concordance is illustrated in this figure." is not working.

    The question is about this passage: "Another popular method is to evaluate concordance against results obtained from a genotyping chip run on the same samples. [...] This is something we do systematically for all samples in the Broad’s production pipelines." Are there any published workflows for this (by GATK)?
