Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Errors about input files having missing or incompatible contigs

5 comments

  • Field -Ye Tian

    Dear GATK group,

    I have encountered a similar (if not the same) problem running the Mutect2 program.

    It says "A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.

    reference contigs = [NC_000001.11, NT_187361.1, NT_187362.1, NT_187363.1, NT_187364.1, NT_187365.1, NT_187366.1, NT_187367.1, NT_187368.1, NT_187369.1, NC_000002.12, NT_187370.1,.........(where I omitted many more items)

    features contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM, chr1_KI270706v1_random, chr1_KI270707v1_random, chr1_KI270708v1_random, chr1_KI270709v1_random, chr1_KI270710v1_random, chr1_KI270711v1_random, chr1_KI270712v1_random,........"

     

    I figured out that the mismatch is between the reference and the VCF resource files: 1000g_pon.hg38.vcf.gz and somatic-hg38_af-only-gnomad.hg38.vcf, both from https://console.cloud.google.com/storage/browser/gatk-best-practices/somatic-hg38/

    The reference file is the unpatched GRCh38 assembly from NCBI: https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/

    If I run the program without the pon and germline source files, the program works smoothly. 
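    The mismatch reported in the error can be checked before launching Mutect2 by comparing contig names directly. The following is a minimal sketch, not part of GATK: it assumes the reference has a samtools-style `.fai` index (contig name in the first tab-separated column) and that the VCF header declares its contigs in `##contig=<ID=...>` lines. The function names are hypothetical.

```python
def fai_contigs(fai_text):
    """Return contig names from the text of a .fai index (first column)."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def vcf_header_contigs(header_text):
    """Return contig IDs declared in ##contig=<ID=...> lines of a VCF header."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("#CHROM"):
            break  # the header ends here; records follow
        if line.startswith("##contig=<"):
            body = line[len("##contig=<"):].rstrip(">")
            for field in body.split(","):
                if field.startswith("ID="):
                    names.append(field[3:])
                    break
    return names


def compare_contigs(reference, features):
    """Split two contig lists into shared and file-specific names."""
    ref_set, feat_set = set(reference), set(features)
    return {
        "shared": sorted(ref_set & feat_set),
        "reference_only": sorted(ref_set - feat_set),
        "features_only": sorted(feat_set - ref_set),
    }
```

    An empty `shared` list corresponds to the "No overlapping contigs found" case quoted above, where the reference uses RefSeq accessions such as NC_000001.11 and the resource VCFs use names such as chr1.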

     

    I wonder whether the inconsistency is caused by the outdated GRCh38 assembly, and whether it can be solved by using the up-to-date GRCh38 patch 13.

     

    Many thanks.

     

  • Field -Ye Tian

    As a quick update, aligning to the latest GRCh38 patch 13 didn't solve the above-mentioned problem.
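    The two contig lists in the error appear to name the same assembly under different conventions (RefSeq accessions versus UCSC-style names), which would explain why a newer patch release does not help. One possible way to relate the two is a lookup table built from the NCBI assembly report for the assembly in use. The sketch below is only an illustration under that assumption: it expects the report's tab-separated layout with a `# Sequence-Name` header line containing `RefSeq-Accn` and `UCSC-style-name` columns, and the function name is hypothetical. Rows where either value is `na` are skipped.

```python
def refseq_to_ucsc(report_text):
    """Build a RefSeq-accession -> UCSC-style-name map from assembly report text."""
    columns = None
    mapping = {}
    for line in report_text.splitlines():
        if line.startswith("# Sequence-Name"):
            # The column header is the last comment line before the data rows.
            columns = line[2:].split("\t")
            continue
        if line.startswith("#") or not line.strip() or columns is None:
            continue
        row = dict(zip(columns, line.split("\t")))
        refseq = row.get("RefSeq-Accn", "na")
        ucsc = row.get("UCSC-style-name", "na")
        if refseq != "na" and ucsc != "na":
            mapping[refseq] = ucsc
    return mapping
```

    Such a table only translates names; whether the sequences behind the names are identical still has to be confirmed, for example by comparing contig lengths.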

     

  • maximus

    The solution stated above failed completely.

    I generated the BAM file from a particular hg38 reference sequence using bwa.

    Then I ran Mutect2 on the generated BAM with the same hg38 reference.

    With the same hg38 reference as the source, how could there be a difference in contig naming?

    Why can't I use the software smoothly?

  • maximus

    I have tried different reference sequences (UCSC vs RefSeq) and different sources for the germline resources, and I get either

    A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.

    Or

    A USER ERROR has occurred: An index is required but was not found for file /XXX/XXX/XXXX.vcf.gz. Support for unindexed block-compressed files has been temporarily disabled. Try running IndexFeatureFile on the input.

    How can such an extensively developed and maintained piece of software have such a bug that I can't even run a simple Mutect2 job as an initial small test?
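    The second error message itself names the fix: the block-compressed VCF needs an index, which IndexFeatureFile can create. The following is a small sketch of a pre-flight check, with hypothetical helper names. It assumes the index sits next to the file as `<file>.tbi` (tabix, for `.vcf.gz`) or `<file>.idx` (for plain `.vcf`), and that the input flag is `-I`, as in GATK 4.1 and later; earlier 4.0 releases used a different flag, so `gatk IndexFeatureFile --help` is worth checking for the installed version.

```python
from pathlib import Path


def find_index(feature_file):
    """Return the path of an existing sidecar index, or None if there is none."""
    for suffix in (".tbi", ".idx"):
        candidate = Path(str(feature_file) + suffix)
        if candidate.exists():
            return candidate
    return None


def index_command(feature_file):
    """Build the indexing command as an argument list (flag is an assumption)."""
    return ["gatk", "IndexFeatureFile", "-I", str(feature_file)]


def commands_for_missing_indexes(feature_files):
    """List the indexing commands needed for files that lack an index."""
    return [index_command(f) for f in feature_files if find_index(f) is None]
```

    The returned lists can be passed to `subprocess.run` or simply printed and run by hand.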

  • maximus

    Even when I provided only the input BAM files and the reference genome, without any germline resources, Mutect2 could not even produce a VCF file.
