import gcnvkernel fails; something to do with the Theano lock
REQUIRED for all errors and issues:
a) GATK version used:
gatk/4.1.4.1-gcccore-8.3.0-java-11
b) Exact command used:
gatk GermlineCNVCaller \
--run-mode COHORT \
-L $intervals \
--contig-ploidy-calls ${ploidyPrefix}-calls \
--annotated-intervals $annotatedIntervalData \
--interval-merging-rule OVERLAPPING_ONLY \
--output "$cohortOutputName" \
--output-prefix "${cohortOutputName}_${scatterFolderName}" \
--verbosity DEBUG \
--arguments_file $tsvCommands
I am using 100 scatter intervals (shards) and running them in parallel on the HPC.
The issue arises when these 100 jobs run in parallel, but not when they run serially. When run on the HPC as a submitted job, I get an error when the Python gcnvkernel package is loaded, and it exits with error code 139.
However, when running interactively, I get a different issue relating to Theano: it says the Theano compiler is locked and that I should delete the .theano subdirectory to remove the lock.
I don't have the exact warning, but it looks like this one from a Theano forum query:
INFO (theano.gof.compilelock): Waiting for existing lock by unknown process (I am process '2799')
INFO (theano.gof.compilelock): To manually release the lock, delete /Users/mas/.theano/compiledir_Darwin-14.5.0-x86_64-i386-64bit-i386-2.7.10-64/lock_dir
I believe this is at the heart of the HPC exit error, but I don't know why the interactive job would behave differently. It's as if, instead of waiting for the lock to be released, the submitted job just throws an error and gives up. Also, I don't know whether deleting the Theano folder while my jobs are running is a bad idea. Will this break the jobs that are currently running?
A bit of info: there are 20,000 temp folders in my theano subdirectory:
~/.theano/compiledir_Linux-4.12--default-x86_64-with-SuSE-12-x86_64-x86_64-3.6.2-64/
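One possible way to avoid the shared lock, offered as an untested assumption rather than a confirmed fix: Theano reads a `THEANO_FLAGS` environment variable, and its `base_compiledir` option controls where the compile cache (and its `lock_dir`) is created. Giving each scatter job its own directory would mean parallel jobs no longer share one lock under `~/.theano`. The helper name `theano_env_for_shard` and the scratch location below are hypothetical.

```python
import os
import tempfile


def theano_env_for_shard(shard_name, scratch_root, base_env=None):
    """Return a copy of the environment with a per-shard Theano compile dir.

    shard_name and scratch_root are illustrative: use whatever identifies a
    scatter job and whatever scratch space the cluster provides.
    """
    env = dict(os.environ if base_env is None else base_env)
    compiledir = os.path.join(scratch_root, "theano_" + shard_name)
    os.makedirs(compiledir, exist_ok=True)
    flag = "base_compiledir=" + compiledir
    # THEANO_FLAGS is a comma-separated list; keep any flags already set.
    existing = env.get("THEANO_FLAGS", "")
    env["THEANO_FLAGS"] = existing + "," + flag if existing else flag
    return env


if __name__ == "__main__":
    env = theano_env_for_shard("temp_0009_of_100", tempfile.gettempdir())
    print(env["THEANO_FLAGS"])
    # The gatk command for this shard would then be launched with env=env,
    # for example through subprocess.run.
```

The same effect can be had by exporting `THEANO_FLAGS` in the job script before calling gatk; the Python form is only used here so the flag handling is explicit.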
c) Entire program log:
The error I get from my HPC sub.e log is:
foss/2019b(24):ERROR:150: Module 'foss/2019b' conflicts with the currently loaded module(s) 'foss/2019a'
foss/2019b(24):ERROR:102: Tcl command execution failed: conflict foss
zlib/1.2.11-gcccore-8.3.0(26):ERROR:150: Module 'zlib/1.2.11-gcccore-8.3.0' conflicts with the currently loaded module(s) 'zlib/1.2.11-gcccore-8.2.0'
zlib/1.2.11-gcccore-8.3.0(26):ERROR:102: Tcl command execution failed: conflict zlib
gcc/8.3.0(24):ERROR:150: Module 'gcc/8.3.0' conflicts with the currently loaded module(s) 'gcc/8.2.0-2.31.1'
gcc/8.3.0(24):ERROR:102: Tcl command execution failed: conflict gcc
zlib/1.2.11-gcccore-8.3.0(26):ERROR:150: Module 'zlib/1.2.11-gcccore-8.3.0' conflicts with the currently loaded module(s) 'zlib/1.2.11-gcccore-8.2.0'
zlib/1.2.11-gcccore-8.3.0(26):ERROR:102: Tcl command execution failed: conflict zlib
bzip2/1.0.8-gcccore-8.3.0(30):ERROR:150: Module 'bzip2/1.0.8-gcccore-8.3.0' conflicts with the currently loaded module(s) 'bzip2/1.0.6-gcccore-8.2.0'
bzip2/1.0.8-gcccore-8.3.0(30):ERROR:102: Tcl command execution failed: conflict bzip2
xz/5.2.4-gcccore-8.3.0(22):ERROR:150: Module 'xz/5.2.4-gcccore-8.3.0' conflicts with the currently loaded module(s) 'xz/5.2.4-gcccore-8.2.0'
xz/5.2.4-gcccore-8.3.0(22):ERROR:102: Tcl command execution failed: conflict xz
gcccore/8.3.0(24):ERROR:150: Module 'gcccore/8.3.0' conflicts with the currently loaded module(s) 'gcccore/8.2.0'
gcccore/8.3.0(24):ERROR:102: Tcl command execution failed: conflict gcccore
zlib/1.2.11-gcccore-8.3.0(26):ERROR:150: Module 'zlib/1.2.11-gcccore-8.3.0' conflicts with the currently loaded module(s) 'zlib/1.2.11-gcccore-8.2.0'
zlib/1.2.11-gcccore-8.3.0(26):ERROR:102: Tcl command execution failed: conflict zlib
bzip2/1.0.8-gcccore-8.3.0(30):ERROR:150: Module 'bzip2/1.0.8-gcccore-8.3.0' conflicts with the currently loaded module(s) 'bzip2/1.0.6-gcccore-8.2.0'
bzip2/1.0.8-gcccore-8.3.0(30):ERROR:102: Tcl command execution failed: conflict bzip2
xz/5.2.4-gcccore-8.3.0(22):ERROR:150: Module 'xz/5.2.4-gcccore-8.3.0' conflicts with the currently loaded module(s) 'xz/5.2.4-gcccore-8.2.0'
xz/5.2.4-gcccore-8.3.0(22):ERROR:102: Tcl command execution failed: conflict xz
gcc/8.3.0(24):ERROR:150: Module 'gcc/8.3.0' conflicts with the currently loaded module(s) 'gcc/8.2.0-2.31.1'
gcc/8.3.0(24):ERROR:102: Tcl command execution failed: conflict gcc
06:21:26.022 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/home/n11137185/01_Tools/gatk2/4.1.4.1-gcccore-8.3.0-java-11/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
06:21:26.041 DEBUG NativeLibraryLoader - Extracting libgkl_compression.so to /tmp/libgkl_compression15947738765980768372.so
Sep 03, 2022 6:21:26 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
06:21:26.212 INFO GermlineCNVCaller - ------------------------------------------------------------
06:21:26.213 INFO GermlineCNVCaller - The Genome Analysis Toolkit (GATK) v4.1.4.1
06:21:26.213 INFO GermlineCNVCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
06:21:26.213 INFO GermlineCNVCaller - Executing as n11137185@cl4n011 on Linux v4.12.14-122.121-default amd64
06:21:26.213 INFO GermlineCNVCaller - Java runtime: OpenJDK 64-Bit Server VM v11.0.2+9
06:21:26.213 INFO GermlineCNVCaller - Start Date/Time: September 3, 2022 at 6:21:25 AM AEST
06:21:26.213 INFO GermlineCNVCaller - ------------------------------------------------------------
06:21:26.213 INFO GermlineCNVCaller - ------------------------------------------------------------
06:21:26.214 INFO GermlineCNVCaller - HTSJDK Version: 2.21.0
06:21:26.214 INFO GermlineCNVCaller - Picard Version: 2.21.2
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.BUFFER_SIZE : 131072
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.CREATE_INDEX : false
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.CREATE_MD5 : false
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.CUSTOM_READER_FACTORY :
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.DISABLE_SNAPPY_COMPRESSOR : false
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.EBI_REFERENCE_SERVICE_URL_MASK : https://www.ebi.ac.uk/ena/cram/md5/%s
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.NON_ZERO_BUFFER_SIZE : 131072
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.REFERENCE_FASTA : null
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.SAM_FLAG_FIELD_FORMAT : DECIMAL
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
06:21:26.215 INFO GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
06:21:26.216 INFO GermlineCNVCaller - HTSJDK Defaults.USE_CRAM_REF_DOWNLOAD : false
06:21:26.216 DEBUG ConfigFactory - Configuration file values:
06:21:26.219 DEBUG ConfigFactory - gcsMaxRetries = 20
06:21:26.219 DEBUG ConfigFactory - gcsProjectForRequesterPays =
06:21:26.219 DEBUG ConfigFactory - gatk_stacktrace_on_user_exception = false
06:21:26.219 DEBUG ConfigFactory - samjdk.use_async_io_read_samtools = false
06:21:26.219 DEBUG ConfigFactory - samjdk.use_async_io_write_samtools = true
06:21:26.219 DEBUG ConfigFactory - samjdk.use_async_io_write_tribble = false
06:21:26.219 DEBUG ConfigFactory - samjdk.compression_level = 2
06:21:26.219 DEBUG ConfigFactory - spark.kryoserializer.buffer.max = 512m
06:21:26.219 DEBUG ConfigFactory - spark.driver.maxResultSize = 0
06:21:26.219 DEBUG ConfigFactory - spark.driver.userClassPathFirst = true
06:21:26.219 DEBUG ConfigFactory - spark.io.compression.codec = lzf
06:21:26.219 DEBUG ConfigFactory - spark.executor.memoryOverhead = 600
06:21:26.219 DEBUG ConfigFactory - spark.driver.extraJavaOptions =
06:21:26.219 DEBUG ConfigFactory - spark.executor.extraJavaOptions =
06:21:26.219 DEBUG ConfigFactory - codec_packages = [htsjdk.variant, htsjdk.tribble, org.broadinstitute.hellbender.utils.codecs]
06:21:26.219 DEBUG ConfigFactory - read_filter_packages = [org.broadinstitute.hellbender.engine.filters]
06:21:26.219 DEBUG ConfigFactory - annotation_packages = [org.broadinstitute.hellbender.tools.walkers.annotator]
06:21:26.219 DEBUG ConfigFactory - cloudPrefetchBuffer = 40
06:21:26.219 DEBUG ConfigFactory - cloudIndexPrefetchBuffer = -1
06:21:26.219 DEBUG ConfigFactory - createOutputBamIndex = true
06:21:26.220 INFO GermlineCNVCaller - Deflater: IntelDeflater
06:21:26.220 INFO GermlineCNVCaller - Inflater: IntelInflater
06:21:26.220 INFO GermlineCNVCaller - GCS max retries/reopens: 20
06:21:26.220 INFO GermlineCNVCaller - Requester pays: disabled
06:21:26.220 INFO GermlineCNVCaller - Initializing engine
06:21:26.223 DEBUG ScriptExecutor - Executing:
06:21:26.223 DEBUG ScriptExecutor - python
06:21:26.223 DEBUG ScriptExecutor - -c
06:21:26.223 DEBUG ScriptExecutor - import gcnvkernel
06:21:34.416 DEBUG ScriptExecutor - Result: 139
06:21:34.416 INFO GermlineCNVCaller - Shutting down engine
[September 3, 2022 at 6:21:34 AM AEST] org.broadinstitute.hellbender.tools.copynumber.GermlineCNVCaller done. Elapsed time: 0.14 minutes.
Runtime.totalMemory()=2147483648
java.lang.RuntimeException: A required Python package ("gcnvkernel") could not be imported into the Python environment. This tool requires that the GATK Python environment is properly established and activated. Please refer to GATK README.md file for instructions on setting up the GATK Python environment.
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.checkPythonEnvironmentForPackage(PythonScriptExecutor.java:205)
at org.broadinstitute.hellbender.tools.copynumber.GermlineCNVCaller.onStartup(GermlineCNVCaller.java:286)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
Caused by: org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException:
python exited with 139
Command Line: python -c import gcnvkernel
at org.broadinstitute.hellbender.utils.python.PythonExecutorBase.getScriptException(PythonExecutorBase.java:75)
at org.broadinstitute.hellbender.utils.runtime.ScriptExecutor.executeCuratedArgs(ScriptExecutor.java:126)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeArgs(PythonScriptExecutor.java:170)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeCommand(PythonScriptExecutor.java:79)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.checkPythonEnvironmentForPackage(PythonScriptExecutor.java:198)
... 7 more
Using GATK jar /mnt/home/n11137185/01_Tools/gatk2/4.1.4.1-gcccore-8.3.0-java-11/gatk-package-4.1.4.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mnt/home/n11137185/01_Tools/gatk2/4.1.4.1-gcccore-8.3.0-java-11/gatk-package-4.1.4.1-local.jar GermlineCNVCaller --run-mode COHORT -L scatter/temp_0009_of_100/scattered.interval_list --contig-ploidy-calls ploidy-calls --annotated-intervals intervalDataAnnotated.tsv --interval-merging-rule OVERLAPPING_ONLY --output phase4Outputs --output-prefix phase4Outputs_temp_0009_of_100 --verbosity DEBUG --arguments_file /home/n11137185/05_GATK/gatkTiwi474Samples/TWI_474Samples_20220901_1000binSize0Padding100ShardsGatkBgAnalysis/refLists/tsvCommandsTWI_474Samples_20220901_1000binSize0Padding100Shards
-
I should just add one more thing:
When running in non-parallel mode, or when running the first or even the second job on the HPC, I do not get this error. The gcnvkernel module loads without a problem and returns a result of 0. It's only after a few jobs are running that this seems to occur, and I can't figure out the pattern. At the moment I am staggering the start of my jobs and that seems to help, but I don't know why, other than to guess that staggering avoids some kind of thread-safety protection mechanism.
06:21:26.223 DEBUG ScriptExecutor - python
06:21:26.223 DEBUG ScriptExecutor - -c
06:21:26.223 DEBUG ScriptExecutor - import gcnvkernel
06:21:34.416 DEBUG ScriptExecutor - Result: 0
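For reference, the failing and passing runs differ only in the result of `python -c "import gcnvkernel"`. By the usual Unix convention, a status above 128 means the process was killed by signal number (status - 128), so 139 would correspond to signal 11, a segmentation fault, rather than an ordinary Python import error. A small sketch of that decoding (the function name `describe_exit_code` is illustrative):

```python
import signal


def describe_exit_code(code):
    """Describe a process exit code using the common 128+N signal convention."""
    if code == 0:
        return "success"
    if code > 128:
        try:
            return "killed by " + signal.Signals(code - 128).name
        except ValueError:
            # No signal with that number is known on this platform.
            return "exit code %d (no matching signal)" % code
    return "exit code %d" % code


print(describe_exit_code(139))  # killed by SIGSEGV
print(describe_exit_code(0))    # success
```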
-
Hi simon lee,
Thank you for writing into the forum! I hope we can figure out why you are getting this error.
It looks like you are not using the Java version that we recommend: https://gatk.broadinstitute.org/hc/en-us/articles/360035532332-Java-version-issues. You are also using an old version of GATK. I know there have been some bugs we fixed in gCNV, so I would recommend you upgrade your GATK version.
Have you set up the proper Conda environment for GermlineCNVCaller? https://gatk.broadinstitute.org/hc/en-us/articles/5358874158235-GermlineCNVCaller
The error message you are getting looks related to the python environment so that's why I think it's important to start there.
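A quick way to test the environment outside of GATK is to repeat the check shown in the log, `python -c "import gcnvkernel"`, from inside the job script. The sketch below is only an illustration of that idea; the helper name `can_import` is made up, and it assumes the `python` on the job's PATH is the one GATK would pick up.

```python
import subprocess


def can_import(module, python="python"):
    """Run `<python> -c "import <module>"` and return (ok, returncode, stderr)."""
    result = subprocess.run(
        [python, "-c", "import " + module],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
    )
    return result.returncode == 0, result.returncode, result.stderr
```

Calling `can_import("gcnvkernel")` at the start of each shard and logging the three values would show whether the import already fails before GATK starts. Note that Python's `subprocess` reports a child killed by signal N as return code -N, so a segmentation fault would appear here as -11.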
Let me know,
Genevieve
-
Hi simon,
We haven't heard from you in a couple of days so we're going to close out this ticket. If you still require assistance, simply respond to this thread and we'll be happy to pick up where we left off!
Kind regards,
Genevieve