GenotypeGVCFs hangs at GenomicsDBLibLoader
Greetings, I am dying of frustration. I have been running GATK on several different servers and on ONE server I have problems with ONE tool. The problem is with GenotypeGVCFs. I am using a GenomicsDB. GenotypeGVCFs starts, gets to the GenomicsDBLibLoader step, then sits there forever (a week is the longest I've let it run). Every other GATK tool runs fine on the server. My GenotypeGVCFs script runs fine on other servers.
I have tried: updating java; using an older version of GATK; running interactively; running as a different user; running in and not in a conda environment. I have also asked my Systems people for help but they are stumped.
I feel sure there is some ridiculously obvious thing that I'm missing. Can anyone help?
Thanks,
Sara
REQUIRED for all errors and issues:
a) GATK version used: gatk-4.5.0.0
b) java info
java --version
openjdk 17.0.10 2024-01-16
OpenJDK Runtime Environment Temurin-17.0.10+7 (build 17.0.10+7)
OpenJDK 64-Bit Server VM Temurin-17.0.10+7 (build 17.0.10+7, mixed mode, sharing)
c) my system
SLURM scheduler
Operating System: Rocky Linux 9.3 (Blue Onyx)
CPE OS Name: cpe:/o:rocky:rocky:9::baseos
Kernel: Linux 5.14.0-284.30.1.el9_2.x86_64
Architecture: x86-64
Firmware Version: 1201
d) Exact command used:
$GATK GenotypeGVCFs \
--java-options "-Xmx160g -Xms160g -XX:ParallelGCThreads=3" \
-V gendb://renamed.edited.cs10_NC_029855.1.1014.db \
-O cs10.110323.cohort.varOnly.4.5.vcf.gz \
-L NC_029855.1:1-415602 \
-R $CS10 \
--max-alternate-alleles 16 \
--call-genotypes true \
--verbosity DEBUG
e) Entire program log:
Using GATK jar /mnt/wasp/oppenheim/src/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx160g -Xms160g -XX:ParallelGCThreads=3 -jar /mnt/wasp/oppenheim/src/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar GenotypeGVCFs -V gendb://renamed.edited.cs10_NC_029855.1.1014.db -O /mnt/wasp/oppenheim/cannabis/output/cs10_alignments/gatk/variants/NC_029855.genotypes/cs10.NC_029855.1.110323.cohort.varOnly.4.5.vcf.gz -L NC_029855.1:1-415602 -R /mnt/wasp/oppenheim/cannabis/ref_genomes/cs10/GCF_900626175.2_cs10_genomic.fna --max-alternate-alleles 16 --call-genotypes true --verbosity DEBUG
16:12:07.468 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/wasp/oppenheim/src/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
16:12:07.531 DEBUG NativeLibraryLoader - Extracting libgkl_compression.so to /tmp/libgkl_compression7245376043081013504.so
16:12:07.762 INFO GenotypeGVCFs - ------------------------------------------------------------
16:12:07.767 INFO GenotypeGVCFs - The Genome Analysis Toolkit (GATK) v4.5.0.0
16:12:07.767 INFO GenotypeGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
16:12:07.768 INFO GenotypeGVCFs - Executing as oppenheim@biomix51.dbi.local on Linux v6.1.14-100.fc36.x86_64 amd64
16:12:07.768 INFO GenotypeGVCFs - Java runtime: OpenJDK 64-Bit Server VM v17.0.10+7
16:12:07.768 INFO GenotypeGVCFs - Start Date/Time: January 23, 2024 at 4:12:07 PM EST
16:12:07.768 INFO GenotypeGVCFs - ------------------------------------------------------------
16:12:07.768 INFO GenotypeGVCFs - ------------------------------------------------------------
16:12:07.769 INFO GenotypeGVCFs - HTSJDK Version: 4.1.0
16:12:07.769 INFO GenotypeGVCFs - Picard Version: 3.1.1
16:12:07.769 INFO GenotypeGVCFs - Built for Spark Version: 3.5.0
16:12:07.771 INFO GenotypeGVCFs - HTSJDK Defaults.BUFFER_SIZE : 131072
16:12:07.771 INFO GenotypeGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:12:07.771 INFO GenotypeGVCFs - HTSJDK Defaults.CREATE_INDEX : false
16:12:07.771 INFO GenotypeGVCFs - HTSJDK Defaults.CREATE_MD5 : false
16:12:07.771 INFO GenotypeGVCFs - HTSJDK Defaults.CUSTOM_READER_FACTORY :
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.DISABLE_SNAPPY_COMPRESSOR : false
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.EBI_REFERENCE_SERVICE_URL_MASK : https://www.ebi.ac.uk/ena/cram/md5/%s
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.NON_ZERO_BUFFER_SIZE : 131072
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.REFERENCE_FASTA : null
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.SAM_FLAG_FIELD_FORMAT : DECIMAL
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:12:07.772 INFO GenotypeGVCFs - HTSJDK Defaults.USE_CRAM_REF_DOWNLOAD : false
16:12:07.772 DEBUG ConfigFactory - Configuration file values:
16:12:07.776 DEBUG ConfigFactory - gcsMaxRetries = 20
16:12:07.776 DEBUG ConfigFactory - gcsProjectForRequesterPays =
16:12:07.776 DEBUG ConfigFactory - gatk_stacktrace_on_user_exception = false
16:12:07.776 DEBUG ConfigFactory - samjdk.use_async_io_read_samtools = false
16:12:07.776 DEBUG ConfigFactory - samjdk.use_async_io_write_samtools = true
16:12:07.776 DEBUG ConfigFactory - samjdk.use_async_io_write_tribble = false
16:12:07.776 DEBUG ConfigFactory - samjdk.compression_level = 2
16:12:07.776 DEBUG ConfigFactory - spark.kryoserializer.buffer.max = 512m
16:12:07.776 DEBUG ConfigFactory - spark.driver.maxResultSize = 0
16:12:07.776 DEBUG ConfigFactory - spark.driver.userClassPathFirst = true
16:12:07.776 DEBUG ConfigFactory - spark.io.compression.codec = lzf
16:12:07.776 DEBUG ConfigFactory - spark.executor.memoryOverhead = 600
16:12:07.777 DEBUG ConfigFactory - spark.driver.extraJavaOptions =
16:12:07.777 DEBUG ConfigFactory - spark.executor.extraJavaOptions =
16:12:07.777 DEBUG ConfigFactory - codec_packages = [htsjdk.variant, htsjdk.tribble, org.broadinstitute.hellbender.utils.codecs]
16:12:07.777 DEBUG ConfigFactory - read_filter_packages = [org.broadinstitute.hellbender.engine.filters]
16:12:07.777 DEBUG ConfigFactory - annotation_packages = [org.broadinstitute.hellbender.tools.walkers.annotator]
16:12:07.777 DEBUG ConfigFactory - cloudPrefetchBuffer = 40
16:12:07.777 DEBUG ConfigFactory - cloudIndexPrefetchBuffer = -1
16:12:07.777 DEBUG ConfigFactory - createOutputBamIndex = true
16:12:07.777 INFO GenotypeGVCFs - Deflater: IntelDeflater
16:12:07.777 INFO GenotypeGVCFs - Inflater: IntelInflater
16:12:07.778 INFO GenotypeGVCFs - GCS max retries/reopens: 20
16:12:07.778 INFO GenotypeGVCFs - Requester pays: disabled
16:12:07.778 INFO GenotypeGVCFs - Initializing engine
16:12:08.180 INFO GenomicsDBLibLoader - GenomicsDB native library version : 1.5.1-84e800e
-
UPDATE:
After testing a bunch of things, I have determined that if I run GenotypeGVCFs on a single sample gvcf file, or if I run GenotypeGVCFs on a multi-sample gvcf generated with CombineGVCFs, it works. But if I run it on the same samples combined with GenomicsDBImport, it stalls at the GenomicsDBLibLoader step.
My SysAdmin reports that it gets stuck at
futex(0x7f17840b8910, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 4009871, NULL, FUTEX_BITSET_MATCH_ANY
It is still the case that using a genomicsdb created with GenomicsDBImport as the -V input to GenotypeGVCFs works fine on other servers. It is frustrating to have to change my protocol, as my intention was to spread my work across servers, carrying out the exact same procedures on all my data.
So I am still soliciting advice on this.
Thanks!
Sara
-
Further update:
Using the same dataset combined using CombineGVCFs, the GenotypeGVCFs job runs fine. Can anyone suggest an explanation?
-
Hi Sara Oppenheim,
I don't believe it's hanging during load of the GenomicsDB library -- the fact that the log message "GenomicsDBLibLoader - GenomicsDB native library version : 1.5.1-84e800e" is output indicates that the library was actually loaded successfully.
More likely the issue has to do with memory. You are giving 160 GB of memory to Java via the "-Xmx160g" flag, but GenomicsDB is implemented in C/C++ and requires its own memory on top of the Java memory required by GATK. If your machine only has 160 GB of physical memory total, this means that you are leaving no memory for GenomicsDB.
I'd recommend decreasing your "-Xmx/-Xms" values to give less memory to Java and more memory to GenomicsDB, and then see if the tool is able to run to completion.
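As a rough sketch of that suggestion, the snippet below derives a smaller heap size from the machine's total memory and prints a replacement for the --java-options line in the original command. The helper name suggest_heap_gb and the 50% reserve are illustrative assumptions, not GATK guidance; the right split depends on the size of the GenomicsDB workspace.

```shell
#!/bin/sh
# Illustrative helper (not part of GATK): pick a Java heap size in GB that
# leaves a share of memory free for GenomicsDB's native (C/C++) allocations.
# Arguments: total memory in GB, then the percentage to keep outside the heap.
# The default reserve of 50% is an assumption, not a documented guideline.
suggest_heap_gb() {
    echo $(( $1 * (100 - ${2:-50}) / 100 ))
}

# On Linux, read total physical memory (reported in kB) from /proc/meminfo.
if [ -r /proc/meminfo ]; then
    total_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 }' /proc/meminfo)
    heap_gb=$(suggest_heap_gb "$total_gb")
    # Print the option string to substitute into the GenotypeGVCFs command.
    echo "--java-options \"-Xmx${heap_gb}g -Xms${heap_gb}g -XX:ParallelGCThreads=3\""
fi
```

Under SLURM the relevant ceiling is the memory requested for the job rather than the node's physical total, so it may make more sense to pass that figure to suggest_heap_gb directly.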
Another thing you could try, if that doesn't work: while the Java process is running and appears to be hung, you can run the "jstack" command ("jstack <gatk_process_id>") to inspect what's going on inside the process and see where it's spending its time.
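A minimal sketch of how that could be done: the function below takes a few thread dumps some seconds apart so they can be compared. The name dump_stacks, the output file names, and the default count and interval are all hypothetical choices; only the jstack command itself comes from the JDK.

```shell
#!/bin/sh
# Take several thread dumps of a running JVM and save each one to a file.
# Arguments: process id, number of dumps (default 3), seconds between dumps
# (default 10). JSTACK may be set to point at a specific jstack binary.
dump_stacks() {
    pid=$1
    count=${2:-3}
    interval=${3:-10}
    i=1
    while [ "$i" -le "$count" ]; do
        # -l adds information about locks held by each thread.
        "${JSTACK:-jstack}" -l "$pid" > "jstack.$pid.$i.txt"
        echo "wrote jstack.$pid.$i.txt"
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}

# Example call, assuming a single matching GATK process owned by the current user:
# dump_stacks "$(pgrep -u "$USER" -f 'gatk-package.*GenotypeGVCFs' | head -n 1)"
```

If the main thread shows the same frames in every dump, that points to where the process is stuck; a thread sitting in native code would be consistent with a wait inside the GenomicsDB library rather than in Java.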
Hope this helps,
David