HaplotypeCaller
Hi,
I'm running GATK v4.2.0.0 and would like to run HaplotypeCaller through Snakemake, but something is not working for some samples. I'm calling GATK as follows:
rule HaplotypeCaller:
    input:
        Ref="reference/Canis_familiaris.CanFam3.1.dna.toplevel.fa",
        input1="output/output1/{sample}_fixed.bam"
    output:
        "vcfs/{sample}.g.vcf.gz"
    shell:
        "module load gatk/4.2.0.0 \n"
        "gatk --java-options '-Xmx4g -Xms2g' HaplotypeCaller -R {input.Ref} -I {input.input1} -O {output} -ERC GVCF"
The GATK output is below:
-------------------------------------------------------------------------------
Start of calculations [pon, 19 kwi 2021, 12:32:14 CEST]
Job is running on node:
-------------------------------------------------------------------------------
Support: support-hpc@man.poznan.pl
-------------------------------------------------------------------------------
gmp/5.1.3 load complete.
mpfr/3.1.2 load complete.
libmpc/1.0.1 load complete.
gcc/6.2.0 load complete.
openmpi/4.0.0_gcc620 load complete.
r/3.6.3-gcc620 load complete.
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 10
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 HaplotypeCaller
1 all
2
[Mon Apr 19 12:32:16 2021]
rule HaplotypeCaller:
input: reference/Canis_familiaris.CanFam3.1.dna.toplevel.fa, output/output1/S1545Nr23_fixed.bam
output: vcfs/S1545Nr23.g.vcf.gz
jobid: 1
wildcards: sample=S1545Nr23
'java8/jdk1.8.0_40' load complete.
gatk/4.2.0.0 load complete.
12:32:20.062 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/exp_soft/local/generic/gatk/4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Apr 19, 2021 12:32:20 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:32:20.209 INFO HaplotypeCaller - ------------------------------------------------------------
12:32:20.209 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.2.0.0
12:32:20.209 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
12:32:20.209 INFO HaplotypeCaller - Executing as zjanna@e0522 on Linux v3.10.0-1160.21.1.el7.x86_64 amd64
12:32:20.209 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_40-b26
12:32:20.209 INFO HaplotypeCaller - Start Date/Time: 19 kwietnia 2021 12:32:20 CEST
12:32:20.209 INFO HaplotypeCaller - ------------------------------------------------------------
12:32:20.210 INFO HaplotypeCaller - ------------------------------------------------------------
12:32:20.210 INFO HaplotypeCaller - HTSJDK Version: 2.24.0
12:32:20.210 INFO HaplotypeCaller - Picard Version: 2.25.0
12:32:20.210 INFO HaplotypeCaller - Built for Spark Version: 2.4.5
12:32:20.210 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:32:20.210 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:32:20.210 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:32:20.210 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:32:20.210 INFO HaplotypeCaller - Deflater: IntelDeflater
12:32:20.210 INFO HaplotypeCaller - Inflater: IntelInflater
12:32:20.210 INFO HaplotypeCaller - GCS max retries/reopens: 20
12:32:20.210 INFO HaplotypeCaller - Requester pays: disabled
12:32:20.211 INFO HaplotypeCaller - Initializing engine
12:32:20.941 INFO HaplotypeCaller - Done initializing engine
12:32:20.944 INFO HaplotypeCallerEngine - Tool is in reference confidence mode and the annotation, the following changes will be made to any specified annotations: 'StrandBiasBySample' will be enabled. 'ChromosomeCounts', 'FisherStrand', 'StrandOddsRatio' and 'QualByDepth' annotations have been disabled
12:32:21.003 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
12:32:21.003 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
12:32:21.018 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/opt/exp_soft/local/generic/gatk/4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
12:32:21.020 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/opt/exp_soft/local/generic/gatk/4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
12:32:21.061 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
12:32:21.061 INFO IntelPairHmm - Available threads: 10
12:32:21.061 INFO IntelPairHmm - Requested threads: 4
12:32:21.061 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
12:32:21.125 INFO ProgressMeter - Starting traversal
12:32:21.125 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
12:32:21.152 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
12:32:21.152 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
12:32:21.153 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec
12:32:21.153 INFO HaplotypeCaller - Shutting down engine
[19 kwietnia 2021 12:32:21 CEST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=2400714752
java.lang.IllegalArgumentException: Illegal Capacity: -2008499690
at java.util.ArrayList.<init>(ArrayList.java:156)
at htsjdk.samtools.AbstractBAMFileIndex.query(AbstractBAMFileIndex.java:282)
at htsjdk.samtools.DiskBasedBAMFileIndex.getSpanOverlapping(DiskBasedBAMFileIndex.java:61)
at htsjdk.samtools.BAMFileReader.getFileSpan(BAMFileReader.java:914)
at htsjdk.samtools.BAMFileReader.createIndexIterator(BAMFileReader.java:931)
at htsjdk.samtools.BAMFileReader.query(BAMFileReader.java:612)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:550)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.queryOverlapping(SamReader.java:417)
at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.loadNextIterator(SamReaderQueryingIterator.java:130)
at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.<init>(SamReaderQueryingIterator.java:69)
at org.broadinstitute.hellbender.engine.ReadsPathDataSource.prepareIteratorsForTraversal(ReadsPathDataSource.java:412)
at org.broadinstitute.hellbender.engine.ReadsPathDataSource.iterator(ReadsPathDataSource.java:336)
at org.broadinstitute.hellbender.engine.MultiIntervalLocalReadShard.iterator(MultiIntervalLocalReadShard.java:134)
at org.broadinstitute.hellbender.engine.AssemblyRegionIterator.<init>(AssemblyRegionIterator.java:86)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:188)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:173)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1058)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Using GATK jar /opt/exp_soft/local/generic/gatk/4.2.0.0/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4g -Xms2g -jar /opt/exp_soft/local/generic/gatk/4.2.0.0/gatk-package-4.2.0.0-local.jar HaplotypeCaller -R reference/Canis_familiaris.CanFam3.1.dna.toplevel.fa -I output/output1/S1545Nr23_fixed.bam -O vcfs/S1545Nr23.g.vcf.gz -ERC GVCF
[Mon Apr 19 12:32:21 2021]
Error in rule HaplotypeCaller:
jobid: 1
output: vcfs/S1545Nr23.g.vcf.gz
shell:
module load gatk/4.2.0.0
gatk --java-options '-Xmx4g -Xms2g' HaplotypeCaller -R reference/Canis_familiaris.CanFam3.1.dna.toplevel.fa -I output/output1/S1545Nr23_fixed.bam -O vcfs/S1545Nr23.g.vcf.gz -ERC GVCF
(exited with non-zero exit code)
Removing output files of failed job HaplotypeCaller since they might be corrupted:
vcfs/S1545Nr23.g.vcf.gz
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/users/zjanna/Switonski/.snakemake/log/2021-04-19T123216.354916.snakemake.log
=====
['S1545Nr23']
=====
-------------------------------------------------------------------------------
End of calculations [pon, 19 kwi 2021, 12:32:21 CEST].
-------------------------------------------------------------------------------
Does anyone know what's going on? I can't see any reason for this error.
Cheers
Joanna
-
Hi Joanna,
Could you validate your BAM file with ValidateSamFile? This could also be an issue with the BAM file index. You can re-index your BAM file with BuildBamIndex.
See if those help.
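For reference, the two suggested checks could be scripted roughly as below. This is only a minimal sketch, not an official recipe: the helper names (index_candidates, index_is_stale, diagnostic_commands) are hypothetical, the idea that an index older than its BAM may be out of date is an assumption rather than a confirmed diagnosis of this error, and the --MODE SUMMARY option is assumed from the Picard-style arguments that the gatk wrapper exposes.

```python
import os
import shlex


def index_candidates(bam_path):
    """Return the two conventional index names for a BAM (x.bam.bai and x.bai)."""
    root, _ = os.path.splitext(bam_path)
    return [bam_path + ".bai", root + ".bai"]


def index_is_stale(bam_path):
    """Return True if no index exists, or every index found is older than the BAM.

    Assumption: an index older than its BAM may no longer match it.
    """
    bam_mtime = os.path.getmtime(bam_path)
    existing = [p for p in index_candidates(bam_path) if os.path.exists(p)]
    return not existing or all(os.path.getmtime(p) < bam_mtime for p in existing)


def diagnostic_commands(bam_path):
    """Build the ValidateSamFile and BuildBamIndex command lines as argument lists."""
    validate = ["gatk", "ValidateSamFile", "-I", bam_path, "--MODE", "SUMMARY"]
    reindex = ["gatk", "BuildBamIndex", "-I", bam_path]
    return validate, reindex


if __name__ == "__main__":
    # BAM path taken from the failing job in the log above.
    bam = "output/output1/S1545Nr23_fixed.bam"
    if os.path.exists(bam) and index_is_stale(bam):
        print("Index is missing or older than the BAM; re-indexing may help.")
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in diagnostic_commands(bam):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The script only prints the commands; they can be pasted into a shell after module load gatk/4.2.0.0, or run with subprocess.run(cmd, check=True) once reviewed.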
Best,
Genevieve