Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

HaplotypeCaller

11 comments

  • Devendra Waikul

    I am helping set up a GATK4 pipeline. I am getting a fatal Java error for individual chromosome intervals in GenomicsDBImport. I am guessing I must have set the wrong output file extension for the HaplotypeCaller output.
    I have set the output to vcf_haplotype_output = output_haplotype.vcf, with the flag -ERC GVCF.

    Can anyone suggest what the output extension should be for this command?

     

  • Monete Rajão Gomes

    Hi Devendra,

     

    I think when you use the -ERC GVCF option, your output should be "output_haplotype.g.vcf" or (better) "output_haplotype.g.vcf.gz".

    This is described above in the section "Single-sample GVCF calling (outputs intermediate GVCF)".

    Hope this helps.
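
A minimal sketch of the naming advice above, assuming placeholder inputs (reference.fasta, sample1.bam) and a hypothetical helper, gvcf_name, that is not part of GATK. It derives a .g.vcf.gz output name from the BAM name and prints the HaplotypeCaller command instead of running it:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: sample1.bam -> sample1.g.vcf.gz
gvcf_name() {
    # Strip the directory and the .bam suffix, then append .g.vcf.gz
    printf '%s.g.vcf.gz\n' "$(basename "$1" .bam)"
}

bam="sample1.bam"            # placeholder input
out="$(gvcf_name "$bam")"

# Dry run: print the command for inspection; drop the leading "echo" to run it.
echo gatk HaplotypeCaller -R reference.fasta -I "$bam" -O "$out" -ERC GVCF
```

Keeping the name derivation in one place makes it harder to end up with a GVCF that carries a plain .vcf extension.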

  • Devendra Waikul

    Hello Monete Rajão Gomes,

    Thank you for the quick response.

    I have one more question. Did you face Java errors after the GenomicsDBImport command?

    15:33:00.356 INFO ProgressMeter - Starting traversal
    15:33:00.356 INFO ProgressMeter - Current Locus Elapsed Minutes Batches Processed Batches/Minute
    15:33:00.495 INFO GenomicsDBImport - Starting batch input file preload
    15:33:48.560 INFO GenomicsDBImport - Finished batch preload
    15:33:48.563 INFO GenomicsDBImport - Importing batch 1 with 31 samples
    03:08:43.526 INFO ProgressMeter - chr22:1 695.7 1 0.0
    03:08:43.531 INFO GenomicsDBImport - Done importing batch 1/1
    03:08:43.533 INFO ProgressMeter - chr22:1 695.7 1 0.0
    03:08:43.533 INFO ProgressMeter - Traversal complete. Processed 1 total batches in 695.7 minutes.
    03:08:43.533 INFO GenomicsDBImport - Import completed!
    03:08:43.533 INFO GenomicsDBImport - Shutting down engine
    [June 18, 2020 3:08:43 AM EDT] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 695.79 minutes.
    Runtime.totalMemory()=28427943936
    Tool returned:
    true
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGBUS (0x7) at pc=0x00007fca5fb9d330, pid=388337, tid=0x00007fda500b6740
    #
    # JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
    # Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 )
    # Problematic frame:
    #

  • Monete Rajão Gomes

    Hi Devendra Waikul,

    Unfortunately, I've never used GenomicsDBImport before. I've been using CombineGVCFs instead.

    Maybe you should look for this error on other sites or on the previous, read-only GATK v3 site.
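
For reference, a minimal sketch of the CombineGVCFs route mentioned above. All file names are placeholders and variant_args is a hypothetical helper, not part of GATK; the command is printed rather than executed:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: emit one "--variant FILE" pair per GVCF argument.
# Assumes file names contain no whitespace.
variant_args() {
    args=""
    for gvcf in "$@"; do
        args="$args --variant $gvcf"
    done
    # Trim the single leading space
    printf '%s\n' "${args# }"
}

inputs="$(variant_args sample1.g.vcf.gz sample2.g.vcf.gz)"   # placeholder GVCFs

# Dry run: $inputs is intentionally unquoted so it splits into separate arguments.
echo gatk CombineGVCFs -R reference.fasta $inputs -O cohort.g.vcf.gz
```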

  • Chunyang Bao

    Hi,

    I cannot find the "--genotyping_mode" option in this version. Can I assume DISCOVERY mode is the default? Also, I am wondering whether "GENOTYPE_GIVEN_ALLELES" is still supported in this version. Any thoughts?

    Chunyang

  • Joy Bordini

    Hi there,

     

    I noticed that HaplotypeCaller detects different variants (and a different number of variants) when I run it with a target file without the scatter-gather approach compared to when I run it with scatter-gather. Is this behaviour related to some internal implementation detail? Any other suggestions?

     

    Thanks

    Joy

  • Matt Snyder

    I just ran HaplotypeCaller and got the error below. Clearly my input is not properly named. That is easily fixable and is not an issue. 

    A USER ERROR has occurred: Couldn't read file /home/dnanexus/inputs/input7167917857716864453/archive. Error was: The file /home/dnanexus/inputs/input7167917857716864453/archive exists, but does not contain Features (ie., is not in a supported Feature file format such as vcf, bcf, bed, or interval_list), and does not have one of the supported interval file extensions ([.list, .intervals]). Please rename your file with the appropriate extension. If /home/dnanexus/inputs/input7167917857716864453/archive is NOT supposed to be a file, please move or rename the file at location /home/dnanexus/inputs/input7167917857716864453/archive

    I just wanted to report a very minor "error" in this error message. "supported interval file extensions" should also include at least ".interval_list" if it is meant to be an exhaustive list. My HaplotypeCaller call with my -L file works with the Picard-formatted interval_list. I'm not sure, so I thought I would bring it up for potential inclusion in a later release.

    Thanks for GATK! Love it!
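
A small sketch of a pre-flight check based only on the extensions named in the error message above, plus the .interval_list case reported to work with -L. The function name classify_interval_file is hypothetical, and the check looks at the file name only, not its contents, so it is a simplification of whatever GATK does internally:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print "supported" if the name ends in an extension
# mentioned in the error message (or .interval_list), else "unsupported".
classify_interval_file() {
    case "$1" in
        *.list|*.intervals|*.interval_list|*.bed|*.vcf|*.bcf)
            echo supported ;;
        *)
            echo unsupported ;;
    esac
}

# The extensionless path from the error message would be flagged before the run.
classify_interval_file /home/dnanexus/inputs/input7167917857716864453/archive
classify_interval_file targets.interval_list
```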

  • Layne Sadler

    Error while following the annotation example verbatim:
    `A USER ERROR has occurred: Unrecognized annotation group name: Standard`

  • Layne Sadler

    So I got this running (hope it finishes) after a few hours of troubleshooting. It would be nice to have links to information about upstream steps, such as generating supplemental files like the FASTA .dict and bam.bai.
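
A minimal sketch of those upstream steps, assuming samtools and gatk are on the PATH and using placeholder file names (reference.fasta, input.bam). The aux_files helper is hypothetical and only lists the default output names these tools are expected to write; the commands themselves are printed rather than executed:

```shell
#!/bin/sh
set -eu

ref="reference.fasta"   # placeholder reference
bam="input.bam"         # placeholder alignment file

# Hypothetical helper: list the companion files HaplotypeCaller looks for.
aux_files() {
    # $1 = reference FASTA, $2 = BAM
    printf '%s\n' "$1.fai" "${1%.*}.dict" "$2.bai"
}

# Dry run: drop the leading "echo" on each line to execute.
echo samtools faidx "$ref"                     # FASTA index (.fai)
echo gatk CreateSequenceDictionary -R "$ref"   # sequence dictionary (.dict)
echo samtools index "$bam"                     # BAM index (.bai)

aux_files "$ref" "$bam"
```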

  • Tayebe Ranjbarnejad

    Hi,

    I ran the following command to generate the GVCF file, but the generated file was rather large (~18 GB). In comparison, the VCF file that I got without the -ERC option was about 400 MB. In addition, I combined two GVCF files with CombineGVCFs, and the output file was 270 GB!

    Is there a problem here? Thank you for your help.

    gatk HaplotypeCaller -R resource/hg19.fa -I input.sorted.RG.MD.BQSR.bam -ERC BP_RESOLUTION -O output.g.vcf.gz
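
One likely factor in the sizes above: BP_RESOLUTION emits a record for every site, while -ERC GVCF condenses runs of non-variant sites into blocks, so GVCF-mode output is normally much smaller. A minimal sketch for comparing the two, reusing the file names from the command above; count_records is a hypothetical helper and the HaplotypeCaller command is printed rather than executed:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: count non-header records in a gzip/bgzip-compressed VCF.
count_records() {
    gzip -dc < "$1" | awk '!/^#/ { n++ } END { print n + 0 }'
}

# Dry run: same inputs as above, but with banded GVCF output.
echo gatk HaplotypeCaller -R resource/hg19.fa -I input.sorted.RG.MD.BQSR.bam \
    -ERC GVCF -O output.g.vcf.gz

# After both runs exist, compare record counts, for example:
#   count_records output.g.vcf.gz
```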
