Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

BaseRecalibrator raised htsjdk.tribble.TribbleException$MalformedFeatureFile


12 comments

  • wenh06

    I've found the problem myself.

    The .vcf.gz files I downloaded (on my laptop: Windows 10, Chrome 80) from https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0/ "become" .vcf files.
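
    A quick way to confirm this (on a Linux/macOS shell) is to check the download and give it back its .gz extension; the file name below is just one bundle file as an example:

    file resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf
    # if it reports "gzip compressed data", restore the extension
    mv resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf \
       resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf.gz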

  • Bhanu Gandham

    Hi wenh06,

    Thank you for posting your solution for the benefit of the community!

  • Vincent Appiah

    Hi all, my issue is that when I use BaseRecalibrator, I get this error message:

    resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf must support random access to enable queries by interval. If it's a file, please index it using the bundled tool IndexFeatureFile

    I also noticed that said file does not come with an .idx file but rather a .tbi. Could that be the reason why?
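
    For reference, the indexing step the error message points to would look something like this (I believe the input flag is -I on recent GATK4 releases and was -F on older ones):

    gatk IndexFeatureFile \
        -I resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf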

  • Bhanu Gandham

    Hi Vincent Appiah,

    Please start a new post in a new thread with the exact command you are using, the version info, and the entire error log.

  • danilovkiri

    Vincent Appiah

    You have to compress the VCF file with bgzip (if that has not already been done; if it is unzipped, why?) and then index it with either

    tabix <file.vcf.gz>

    or

    bcftools index --tbi <filename>

    The .tbi index is perfectly fine; you can download it from the resource bundle along with the .vcf.gz files.
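
    For an uncompressed VCF the whole sequence would look roughly like this (file name as an example; the region in the last command is arbitrary, just to confirm that interval queries work):

    bgzip resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf
    tabix -p vcf resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf.gz
    # sanity check: print the records from one interval
    tabix resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf.gz chr20:1-100000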

     

  • Mohd Khairul Nizam Mohd Khalid

    I had the same error and I fixed it by downloading files for --known-sites from ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/hg38/
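
    For example, with wget and a placeholder for whichever known-sites file you need:

    wget 'ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/hg38/<filename>'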

  • Genevieve Brandt (she/her)

    Thank you for posting your solution Mohd Khairul Nizam Mohd Khalid!

  • Felipe Padilla

    wenh06, can you be more specific?

    I also downloaded from https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0/.

    You said they became .vcf files, but shouldn't they be .vcf files anyway? How did you make it work?

    Thanks in advance!

  • Vincent Appiah

    Felipe Padilla, what worked for me was to unzip the files.
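
    That is, roughly this, after giving the download back its .gz extension (file name as an example):

    gunzip resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf.gz
    # GATK then expects a matching index for the plain .vcf, which IndexFeatureFile can create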

  • Ashi

    I have the same error using the GATK 4.1.9.0 docker on HPC.

    I did not completely understand the previous comments, so please let me make sure I have it right.

    My reference directory has:
    hg38.fasta with index and dict files
    hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf 
    hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf.gz.tbi

    The command I used is like this:

    gatk BaseRecalibrator \
    -I Sample1.bam \
    -R hg38.fasta \
    --known-sites hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf \
    -O Sample1.recal_data.table

    Should I make a .vcf.tbi file (i.e., unzip)?
    Or should I make a .vcf.gz (paired with the .vcf.gz.tbi)?

  • Genevieve Brandt (she/her)

    Hi Ashi, these users have reported that this vcf file (hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf) is actually zipped. Change the name to hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf.gz and see if that works for you.
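
    Based on the directory listing you posted, the rename would be something like:

    mv hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf \
       hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf.gz

    and then point --known-sites at the renamed .vcf.gz so it pairs up with the .vcf.gz.tbi you already have.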

  • Ashi

    Hi,

    Thank you for the clarification.

    Now it works!
