Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

CombineGVCFs 4.1.2.0 throws "java.lang.IllegalArgumentException: Features added out of order:"

Answered

9 comments

  • Bhanu Gandham

    Hi Yokofakun

     

    A few things to try:

    1. Please ensure that you are using the same interval file that you used to generate the GVCFs.
    2. Try running CombineGVCFs without the -L file. If you used -L in the previous steps, you don't need to provide it again in the CombineGVCFs step.
    3. Try upgrading to the latest version of GATK, 4.1.5.0, and let me know if the issue persists.
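
The second suggestion could be sketched as follows. This is a minimal illustration rather than the poster's actual pipeline: the function name `combine_gvcfs_no_intervals` and its argument order are hypothetical, and `gatk` is assumed to be on the PATH. The `--java-options`, `-R`, `-V` and `-O` arguments are the ones that appear in the command posted in this thread.

```shell
# Hypothetical helper (name and argument order are assumptions).
combine_gvcfs_no_intervals() {
    local ref=$1 gvcf_list=$2 out=$3
    # Same invocation style as in the thread, but with no -L argument,
    # so the tool is not restricted to any interval file.
    gatk --java-options "-Xmx5g" CombineGVCFs \
        -R "$ref" \
        -V "$gvcf_list" \
        -O "$out"
}
```

It would be called as `combine_gvcfs_no_intervals reference.fasta gvcfs.list combined.g.vcf.gz`, where all three file names are placeholders.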
  • Yokofakun

    Hi Bhanu,

    thank you for your answer

     

    > Try upgrading to the latest version of GATK, 4.1.5.0, and let me know if the issue persists.

    I upgraded and now get a different error (note that I'm not testing exactly the same interval as above in my workflow):

    ```

    14:17:35.396 INFO IntervalArgumentCollection - Processing 3613248 bp from intervals
    14:17:35.403 INFO CombineGVCFs - Done initializing engine
    14:17:35.421 INFO ProgressMeter - Starting traversal
    14:17:35.421 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    14:17:35.619 WARN ReferenceConfidenceVariantContextMerger - Detected invalid annotations: When trying to merge variant contexts at location chr1:1474167 the annotation MLEAC=[1, 0] was not a numerical value and was ignored
    14:17:45.426 INFO ProgressMeter - chr1:150267061 0.2 459000 2752898.8
    14:17:55.448 INFO ProgressMeter - chr2:33717365 0.3 944000 2828323.2
    14:18:05.450 INFO ProgressMeter - chr2:171673672 0.5 1416000 2829359.3
    14:18:15.452 INFO ProgressMeter - chr3:33441658 0.7 1756000 2632026.0
    14:18:25.657 INFO ProgressMeter - chr3:124214286 0.8 1989000 2375681.8
    14:18:35.950 INFO ProgressMeter - chr3:186569880 1.0 2162000 2143175.8
    14:18:39.714 INFO CombineGVCFs - Shutting down engine
    [March 19, 2020 2:18:39 PM CET] org.broadinstitute.hellbender.tools.walkers.CombineGVCFs done. Elapsed time: 1.12 minutes.
    Runtime.totalMemory()=1190658048
    java.lang.IllegalArgumentException: Invalid interval. Contig:chr4 start:419635 end:419578
    at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:733)
    at org.broadinstitute.hellbender.utils.SimpleInterval.validatePositions(SimpleInterval.java:59)
    at org.broadinstitute.hellbender.utils.SimpleInterval.<init>(SimpleInterval.java:35)
    at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.apply(CombineGVCFs.java:162)
    at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:131)
    at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:106)
    at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:120)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)

    ```

    with this command:

    ```

    #!/bin/bash -ue
    /ccc/work/cont007/fg0019/lindenbp/packages/gatk/gatk/gatk --java-options " -Xmx5g -Djava.io.tmpdir=." CombineGVCFs \
    -R "/ccc/work/cont007/fg/fg/biobank/by-taxonid/9606/hs37d5/hs37d5_all_chr.fasta" \
    --dbsnp "/ccc/work/cont007/fg/fg/biobank/by-taxonid/9606/hs37d5/variants/hs37d5_all_chr_dbsnp-142.vcf" \
    -L /ccc/scratch/cont007/fg0156/lindenbp/20200305/work/f5/3ab29685dff61bdd8f1bef1f3d3245/TMP/cluster.000000092.bed \
    -V "/ccc/scratch/cont007/fg0156/lindenbp/20200305/work/6a/a94b20b19c148af67a2ff6338746db/chunck.aaaaaaaah.list" \
    -O "combine0.g.vcf.gz"

    ```

     

     

     

    > Please ensure that you are using the same interval file that you used to generate the GVCFs.

    The original GVCFs were called genome-wide, without any intervals.

    > Try running CombineGVCFs without the -L file.

    When I keep only the chr4 records from my BED file (`grep chr4 /ccc/scratch/cont007/fg0156/lindenbp/20200305/work/f5/3ab29685dff61bdd8f1bef1f3d3245/TMP/cluster.000000092.bed`), it works! (??)

     

  • Bhanu Gandham

    Hi Yokofakun

     

    Ah, OK, so the problem is that your intervals are not sorted correctly. As you can see in the error,

    Invalid interval. Contig:chr4 start:419635 end:419578

    the interval end is smaller than the start.

    Typically we use the same intervals from the HaplotypeCaller step in the CombineGVCFs step. Since you did not use any intervals with HaplotypeCaller, your CombineGVCFs run should also use either no intervals or only contig-level intervals to avoid this error.
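
One way to sanity-check an interval file before passing it to `-L` is sketched below. It assumes a tab-separated BED file (0-based, half-open, so a valid record has start < end). The helper names `check_bed` and `bed_to_contigs` are hypothetical, and the checks are an illustration of the advice above, not a reproduction of GATK's own validation.

```shell
# Hypothetical helper: print BED records that look problematic and
# return a non-zero status if any were found.
check_bed() {
    awk -F'\t' '
        # Skip header-style lines that some BED files carry.
        /^(#|track|browser)/ { next }
        # An end at or before the start is an empty or inverted interval.
        ($3 + 0) <= ($2 + 0) { printf "inverted\t%s\n", $0; bad = 1 }
        # Within one contig, start positions should not decrease.
        $1 == prev_contig && ($2 + 0) < prev_start {
            printf "unsorted\t%s\n", $0; bad = 1
        }
        { prev_contig = $1; prev_start = $2 + 0 }
        END { exit bad + 0 }
    ' "$1"
}

# Hypothetical helper: collapse a BED file to one contig name per line,
# in order of first appearance, as a contig-level interval list.
bed_to_contigs() {
    awk -F'\t' '!/^(#|track|browser)/ && !seen[$1]++ { print $1 }' "$1"
}
```

For example, `check_bed cluster.bed` prints any offending records, and `bed_to_contigs cluster.bed > contigs.list` writes a file that could be tried with `-L` in place of the BED file, on the assumption that GATK reads a `.list` file as one interval per line. Both file names are placeholders.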

  • Yokofakun

    ok, I see, thanks.

  • Brian Wiley

    I am getting this error even after sorting the VCF file before `LeftAlignAndTrimVariants`. I get the message below, but there is NOT even a variant that starts at 1017460; it doesn't exist. This is also a VarDict VCF file. Sometimes this step fails for a sample, and other times I get this type of error:

    java.lang.IllegalArgumentException: Features added out of order: previous (TabixFeature{referenceIndex=9, start=1017460, end=1017632, featureStartFilePosition=439472982240, featureEndFilePosition=-1}) > next (TabixFeature{referenceIndex=9, start=1017458, end=1017474, featureStartFilePosition=439472982865, featureEndFilePosition=-1})
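
To confirm whether the input itself is ordered, a quick check along these lines may help. It is a sketch only: `check_vcf_order` is a hypothetical helper, it expects an uncompressed VCF on standard input, and it looks only at the CHROM and POS columns as written in the file, so it cannot detect positions that a tool changes later.

```shell
# Hypothetical helper: report VCF records that break coordinate order.
check_vcf_order() {
    awk -F'\t' '
        /^#/ { next }                       # skip header lines
        # A contig that comes back after another contig was seen.
        $1 != prev_contig {
            if (closed[$1]) { printf "contig reappears: %s:%s\n", $1, $2; bad = 1 }
            closed[prev_contig] = 1
        }
        # A position lower than the previous one on the same contig.
        $1 == prev_contig && ($2 + 0) < prev_pos {
            printf "out of order: %s:%s after %s:%s\n", $1, $2, prev_contig, prev_pos
            bad = 1
        }
        { prev_contig = $1; prev_pos = $2 + 0 }
        END { exit bad + 0 }
    '
}
```

A compressed file can be piped in, for example `gzip -dc merged.sanitized.vcf.gz | check_vcf_order`. The function prints nothing and returns 0 when the records are in order.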

     

  • Brian Wiley

    FYI, I tried this with version 3.6-0-g89b7209 and it worked, so this is something that was introduced after that version.

  • Genevieve Brandt (she/her)

    Brian Wiley, we only support GATK4 at this point, so if you want to compare versions, please stick with GATK4 releases. Which GATK4 version produced the above error? Could you also please provide your command line?

     

  • Brian Wiley

    Thanks Genevieve,

    This is with the broadinstitute/gatk:4.1.8.1 image, but it also happens in version 4.2.0.0. Here is my command line:

    java \
    -Dsamjdk.use_async_io_read_samtools=false \
    -Dsamjdk.use_async_io_write_samtools=true \
    -Dsamjdk.use_async_io_write_tribble=false \
    -Dsamjdk.compression_level=2 \
    -Xmx8g -jar /gatk/gatk-package-4.1.8.1-local.jar LeftAlignAndTrimVariants \
    -O /cromwell-executions/CH_exome_Final.cwl/31b68957-8aca-4310-98b8-1e5673d25193/call-vardict/vardict.cwl/4ba30c6b-0d8a-4ce3-b0f8-491aba74e8c3/call-filter/fp_filter.cwl/e2fc7715-80cd-43a6-808e-2fdb859c79d8/call-normalize_variants/execution/normalized.vcf.gz \
    -R GRCh38_full_analysis_set_plus_decoy_hla.fa \
    -V /cromwell-executions/CH_exome_Final.cwl/31b68957-8aca-4310-98b8-1e5673d25193/call-vardict/vardict.cwl/4ba30c6b-0d8a-4ce3-b0f8-491aba74e8c3/call-filter/fp_filter.cwl/e2fc7715-80cd-43a6-808e-2fdb859c79d8/call-normalize_variants/inputs/1971875813/merged.sanitized.vcf.gz

    This also happens on an unsanitized VCF, i.e. I get this error with a plain concatenation of the per-interval VCFs from VarDict. What I meant to say earlier is that this happens only in GATK4 and not in GATK3, so something introduced in version 4 must be changing the start positions so that they no longer match the VCF file exactly.

  • Genevieve Brandt (she/her)

    Brian Wiley, could you try using SortVcf to reorder the variants before LeftAlignAndTrimVariants, in case there are issues with their order?

    This is most likely unrelated to this issue, but we do recommend that you run GATK using the gatk wrapper script rather than invoking java -jar directly. Strange errors can come up when you do not use the gatk wrapper script.
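
Both recommendations could be combined roughly as follows. This is a hedged sketch: the function name `sort_then_left_align` and its argument order are hypothetical, `gatk` is assumed to be on the PATH, and the `-Xmx8g` setting is simply carried over from the command posted above.

```shell
# Hypothetical wrapper around the two steps (name and arguments are assumptions).
sort_then_left_align() {
    local ref=$1 in_vcf=$2 sorted_vcf=$3 out_vcf=$4
    # Re-sort the records first (Picard SortVcf, run through the gatk wrapper).
    gatk SortVcf -I "$in_vcf" -O "$sorted_vcf" || return 1
    # Run the normalization step via the wrapper rather than `java -jar`.
    gatk --java-options "-Xmx8g" LeftAlignAndTrimVariants \
        -R "$ref" -V "$sorted_vcf" -O "$out_vcf"
}
```

It would be called as `sort_then_left_align reference.fa input.vcf.gz sorted.vcf.gz normalized.vcf.gz`, with all four file names as placeholders.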

