Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

A USER ERROR has occurred: af-only-gnomad.hg38.vcf.gz because no suitable codecs found


6 comments

  • Genevieve Brandt

    Asha, the file may be corrupted or may have been overwritten in a different format. Try re-downloading it and see if that works.

  • Field -Ye Tian

    Hi Genevieve, 

    I ran into a rather similar problem running Mutect2. 

    I found from other threads that the files

    1000g_pon.hg38.vcf.gz and

    af-only-gnomad.hg38.vcf.gz

    are not strictly required but will help. 

    I downloaded both files from this link:

    https://console.cloud.google.com/storage/browser/gatk-best-practices/somatic-hg38;tab=objects?prefix=&forceOnObjectsSortingFiltering=false

    For one thing, I see nothing but meaningless characters when I open them in Excel.

    For another, I got the following error message:

    Caused by: htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file, for input source: /home/field/shared/GATK_files/somatic-hg38_1000g_pon.hg38.vcf

    I wonder if I've downloaded an encrypted version of the files? If so, what is the proper link to download them?

     

    Thank you so much.

     

  • Genevieve Brandt

    Hi Field -Ye Tian, did you unzip the 1000g_pon.hg38.vcf.gz file before trying to view it? Files with the .gz suffix are compressed and cannot be read with Excel.

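As an illustration of the point about compressed files, here is a minimal Python sketch for viewing the header of a gzip-compressed VCF without unzipping it to disk. The function names `read_vcf_header` and `has_chrom_line` are hypothetical helpers, not part of GATK or htsjdk; only the standard-library `gzip` module is used. It assumes the file really is gzip (or bgzip) data.

```python
import gzip


def read_vcf_header(path, max_lines=10000):
    """Return the leading '#' lines of a gzip-compressed VCF (hypothetical helper)."""
    header = []
    # "rt" decompresses on the fly and yields text lines.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for i, line in enumerate(handle):
            # Header lines start with '#'; stop at the first record line.
            if not line.startswith("#") or i >= max_lines:
                break
            header.append(line.rstrip("\n"))
    return header


def has_chrom_line(header):
    """True if the header contains the required #CHROM column line."""
    return any(line.startswith("#CHROM") for line in header)
```

Calling `has_chrom_line(read_vcf_header("1000g_pon.hg38.vcf.gz"))` on an intact download should return `True`; if `gzip.open` raises an error instead, the file is probably not gzip data at all.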
  • Field -Ye Tian

    Hi Genevieve,

    For some weird reason, the file was listed under the name "1000g_pon.hg38.vcf.gz", but when I downloaded it, I got the file "somatic-hg38_1000g_pon.hg38.vcf" instead. 

    The same thing happened when I downloaded "af-only-gnomad.hg38.vcf.gz".

    My apologies for forgetting to mention this. 

    I will also ask a few friends to check. Would you please take a look as well?

    Thank you very much.

    Field

  • Field -Ye Tian

    Hi Genevieve,

    The problem I posted can be solved by changing the downloaded file's suffix to .vcf.gz and then unzipping it. 

    However, I have since encountered another issue, namely:

    Input files reference and features have incompatible contigs: No overlapping contigs found.

    That would be a separate problem.

    Best.

     

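The fix described above (restoring the .gz suffix on a file that was saved as .vcf) can be sketched in Python. This is only an illustration under the assumption that the download is still gzip-compressed and merely lost its suffix; `is_gzip` and `restore_gz_suffix` are hypothetical names. The check relies on the two-byte gzip magic number `1f 8b`, which bgzip-compressed files also start with.

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip (and bgzip) stream


def is_gzip(path):
    """True if the file starts with the gzip magic number, whatever its name."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def restore_gz_suffix(path):
    """Rename e.g. x.vcf to x.vcf.gz when the bytes are gzip data (hypothetical helper).

    Returns the resulting path; files that are not gzip data, or that already
    end in .gz, are left untouched.
    """
    path = Path(path)
    if path.suffix != ".gz" and is_gzip(path):
        target = path.with_name(path.name + ".gz")
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        path.rename(target)
        return target
    return path
```

For example, `restore_gz_suffix("somatic-hg38_1000g_pon.hg38.vcf")` would rename the file to `somatic-hg38_1000g_pon.hg38.vcf.gz` only if its content is actually compressed.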
  • Genevieve Brandt

    Field -Ye Tian, great, thank you for the update, and glad you were able to solve the issue!

    You can post the separate problem in a different post for support, though I believe that same issue has been solved on the forum before, so please search the forum and see if the solution already exists.

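For the separate "No overlapping contigs found" error, the thread does not give a solution, but a quick way to inspect the situation is to compare the contig names declared in the VCF header with those in the reference's FASTA index. The sketch below is an assumption-laden illustration, not a GATK utility: the function names are hypothetical, it reads only `##contig` header lines, and it expects a standard `.fai` index (tab-separated, contig name in the first column). One frequent cause of an empty overlap is a naming mismatch such as "chr1" versus "1", though that is not confirmed for this case.

```python
import gzip
import re

# Captures the ID value from lines such as ##contig=<ID=chr1,length=248956422>
CONTIG_ID = re.compile(r"^##contig=<.*?\bID=([^,>]+)")


def vcf_header_contigs(vcf_path):
    """Contig names declared in the ##contig lines of a VCF (plain or .gz)."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    names = []
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # reached the #CHROM line or the first record
            match = CONTIG_ID.match(line)
            if match:
                names.append(match.group(1))
    return names


def fai_contigs(fai_path):
    """Contig names from a FASTA index (.fai): first tab-separated column."""
    with open(fai_path) as handle:
        return [line.rstrip("\n").split("\t", 1)[0] for line in handle if line.strip()]


def shared_contigs(vcf_path, fai_path):
    """Sorted contig names present in both the VCF header and the reference index."""
    return sorted(set(vcf_header_contigs(vcf_path)) & set(fai_contigs(fai_path)))
```

An empty list from `shared_contigs` would be consistent with the error message; printing the first few names from each side usually shows whether the two files use different naming schemes or different genome builds.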
