Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Funcotator: "java.lang.IllegalArgumentException: Invalid interval. Contig:chrUn_JTFH01001980v1_decoy start:-1 end:18"


5 comments

  • Kaina Millan

    I'd also like to add that half of my data ran perfectly fine: the tool returned "true" and there was no indication of the job failing. The second half of the data (the second batch of cohorts) is giving me the error described above. All cohorts were run with exactly the same code (above) and with GRCh38.d1.vd1.fa as the reference genome.

  • Kaina Millan

    I just tried running Funcotator with hg38-v0-GRCh38.primary_assembly.genome.fa from GATK's resource bundle, and it failed at the very start because of incompatible contigs:

    A USER ERROR has occurred: Input files Reference and Driving Variants have incompatible contigs: Dictionary Reference is missing contigs found in dictionary Driving Variants.

    With the GRCh38.d1.vd1.fa genome, my job at least made it almost to the end, but it stopped at "Contig:chrUn_JTFH01001980v1_decoy start:-1 end:18".

  • Gökalp Çelik

    Hi Kaina Millan

    Is there a specific reason you need to keep calls on decoy contigs in your VCF file? If not, we suggest restricting your variant calls to the primary contigs and excluding decoy, alt, and unlocalized contigs unless there is a very good reason to keep them. You can use SelectVariants with the -L parameter to limit the calls to the main chromosomes of hg38.

    I hope this helps. 
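
The suggestion above can be sketched as a small Python helper that assembles a SelectVariants command restricted to the primary chromosomes. This is only an illustration: the output file name, the input VCF name, and the helper functions are hypothetical, and whether to include chrM is an assumption left to the caller. The command is built and printed, not executed.

```python
import shlex


def primary_contigs(include_mito=False):
    """Return hg38-style primary contig names: chr1..chr22, chrX, chrY."""
    names = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]
    if include_mito:
        # Assumption: chrM is optional; include it only if you need it.
        names.append("chrM")
    return names


def select_variants_command(reference, input_vcf, output_vcf, contigs):
    """Assemble (but do not run) a SelectVariants command limited to `contigs`."""
    cmd = ["gatk", "SelectVariants", "-R", reference, "-V", input_vcf]
    for name in contigs:
        cmd += ["-L", name]  # one -L per contig to keep
    cmd += ["-O", output_vcf]
    return cmd


# Hypothetical VCF names, for illustration only.
command = select_variants_command(
    "GRCh38.d1.vd1.fa",
    "cohort.vcf.gz",
    "cohort.primary.vcf.gz",
    primary_contigs(),
)
print(shlex.join(command))
```

The filtered VCF would then be the input to Funcotator in place of the original file.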

     

  • Kaina Millan

    I appreciate you getting back to me! I am relatively new to WES. Could you give an example of when keeping calls from decoys would be necessary or useful for germline variant calling? In addition, would the -L parameter of SelectVariants also remove contigs with the suffix "_random"? I have found similar error logs indicating invalid contigs, such as "Contig:chr22_KI270736v1_random start:-1 end:18".
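
Because -L is an inclusion list, any contig that is not named is left out, whatever its suffix. A minimal sketch of that logic, assuming hg38-style names in which the primary contigs are chr1 to chr22, chrX, chrY, and chrM (the regular expression and the `is_primary` helper are illustrative, not part of GATK):

```python
import re

# Assumption: hg38-style naming; primary contigs are chr1..chr22, chrX, chrY, chrM.
PRIMARY = re.compile(r"chr([1-9]|1[0-9]|2[0-2]|X|Y|M)")


def is_primary(contig):
    """True only when the whole contig name is a primary chromosome."""
    return PRIMARY.fullmatch(contig) is not None


# Contig names taken from the error messages in this thread, plus two primaries.
examples = [
    "chr1",
    "chr22",
    "chr22_KI270736v1_random",
    "chrUn_JTFH01001980v1_decoy",
]
for name in examples:
    print(name, "keep" if is_primary(name) else "drop")
```

Under this naming assumption, both the "_random" and the "_decoy" contigs from the error logs fall outside the list and would not be passed to -L.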

  • Gökalp Çelik

    Hi again. 

    Decoy contigs are used to siphon problematic reads away from the primary contigs, so using them for variant calling is useless for the most part.

    You can limit the variant calls to primary contigs by using a BED or interval_list file containing the primary chromosome regions, or by passing chromosome names directly to the -L parameter, for example:

    -L chr1 -L chr2 ...

    and so on.

    I hope this helps. 
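
For the BED-file route mentioned above, one possible sketch derives the regions from the reference's FASTA index (.fai), whose first two tab-separated columns are the contig name and its length. The function name and the sample index lines are hypothetical, and the lengths shown are placeholders rather than real contig lengths.

```python
def fai_to_primary_bed(fai_lines, keep):
    """Yield BED lines (0-based start, half-open end) for contigs in `keep`."""
    for line in fai_lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        name, length = fields[0], int(fields[1])
        if name in keep:
            # A whole contig spans from 0 to its length in BED coordinates.
            yield f"{name}\t0\t{length}"


# Assumption: hg38-style primary contig names.
keep = {f"chr{i}" for i in range(1, 23)} | {"chrX", "chrY"}

# Placeholder .fai content; only the first two columns are used here.
sample_fai = [
    "chr1\t1000\t6\t60\t61\n",
    "chr22\t500\t1100\t60\t61\n",
    "chr22_KI270736v1_random\t200\t1700\t60\t61\n",
    "chrUn_JTFH01001980v1_decoy\t100\t2000\t60\t61\n",
]

bed_lines = list(fai_to_primary_bed(sample_fai, keep))
print("\n".join(bed_lines))
```

In practice the lines would be read from the reference's own .fai file and written to a .bed file, which can then be supplied with -L instead of listing each chromosome.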

