Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

How to restore an interrupted execution while running HaplotypeCaller

5 comments

  • Anthony DiCi

    Thank you for your post, David Medina! I want to let you know we have received your question. We'll get back to you if we have any updates or follow-up questions.

    Please see our Support Policy for more details about how we prioritize responding to questions.

  • Richard Cadenillas

    Hi, everyone,

    I have the same question about restarting or recovering a HaplotypeCaller process.

  • Gökalp Çelik

    Hi Richard Cadenillas,

    HaplotypeCaller cannot be resumed or recovered, so you need to re-run it on the problematic intervals. However, we need more details to understand why it stops or stalls after a certain amount of time. Can you tell us more about your running environment, samples, reference genome, command line, etc.?

    Regards. 

  • Richard Cadenillas

    I had been running the following command for about two days:
    gatk --java-options "-Xmx54g" HaplotypeCaller --native-pair-hmm-threads 12 --reference GCF_000260255.1_OctDeg1.0_genomic.fna -ERC GVCF --input UACH8217.bam --output ./snp_reducido/UACH8217.gvcf
    After a strong storm the power went out and the HaplotypeCaller process could not finish. I was left with an incomplete UACH8217.gvcf file and no UACH8217.gvcf.idx file. Is there a way to resume the process?

  • Gökalp Çelik

    We recommend splitting your genome into non-overlapping intervals, calling each interval separately, and gathering all the files once every interval is complete. That way, after an interruption you only need to re-run the intervals that did not finish.

    Regards. 

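The split, call and gather approach recommended in the last reply could look roughly like the shell sketch below. It is not an official GATK workflow. It assumes bash, a samtools-style .fai index next to the reference, and one interval per contig. The function names, the GATK variable, the per-contig output layout and the .done marker files are all hypothetical, introduced only for illustration. The HaplotypeCaller options are the ones from the command posted above plus -L, and MergeVcfs is just one possible tool for the gather step.

```shell
#!/usr/bin/env bash
# Sketch only: per-contig HaplotypeCaller calls that can be resumed.
# Assumed (not from the thread): bash, "$ref.fai" exists, one interval per
# contig, and ".done" marker files to record which contigs finished.

GATK="${GATK:-gatk}"   # hypothetical override hook, defaults to gatk on PATH

# call_intervals REF BAM OUTDIR
# Calls each contig separately and skips contigs that already finished.
call_intervals() {
    local ref="$1" bam="$2" outdir="$3" contig
    mkdir -p "$outdir" || return 1
    # The first column of the .fai index holds contig names in reference order.
    while read -r contig _; do
        local out="$outdir/$contig.g.vcf.gz"
        # A marker from an earlier run means this contig is complete.
        [ -e "$out.done" ] && continue
        # Discard partial output left behind by an interrupted run.
        rm -f "$out" "$out.tbi"
        "$GATK" --java-options "-Xmx54g" HaplotypeCaller \
            --native-pair-hmm-threads 12 \
            --reference "$ref" \
            --input "$bam" \
            -ERC GVCF \
            -L "$contig" \
            --output "$out" < /dev/null || return 1
        # Written only after HaplotypeCaller exits successfully.
        touch "$out.done"
    done < "$ref.fai"
}

# gather_gvcfs REF OUTDIR MERGED
# Merges the per-contig files in reference order once all are complete.
gather_gvcfs() {
    local ref="$1" outdir="$2" merged="$3" contig
    local args=()
    while read -r contig _; do
        if [ ! -e "$outdir/$contig.g.vcf.gz.done" ]; then
            echo "contig not finished: $contig" >&2
            return 1
        fi
        args+=(-I "$outdir/$contig.g.vcf.gz")
    done < "$ref.fai"
    "$GATK" MergeVcfs "${args[@]}" -O "$merged"
}

# Example (paths follow the command in the thread; the parts directory is made up):
#   call_intervals GCF_000260255.1_OctDeg1.0_genomic.fna UACH8217.bam ./snp_reducido/UACH8217_parts \
#     && gather_gvcfs GCF_000260255.1_OctDeg1.0_genomic.fna ./snp_reducido/UACH8217_parts ./snp_reducido/UACH8217.g.vcf.gz
```

After an interruption, running the same call_intervals command again repeats only the contigs without a .done marker. For a reference with many small scaffolds, one call per contig adds a lot of start-up overhead, so grouping contigs into a smaller number of interval lists may be a better fit.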
