CNNScoreVariants Issues with environment and Java version
REQUIRED for all errors and issues:
a) GATK version used: 4.2.5.0
b) Exact command used: gatk CNNScoreVariants -I "/path/to/directory/sample.bam" -V "/path/to/directory/sample.g.vcf.gz" -R "/path/to/directory/reference_genome.genomic.fa" -O "/path/to/directory/sample.vcf" -tensor-type read_tensor
c) Entire program log:
Using GATK jar /path/to/directory/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /path/to/directory/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar CNNScoreVariants -I /path/to/directory/sample.bam -V /path/to/directory/sample.g.vcf.gz -R /path/to/directory/reference_genome.genomic.fa -O /path/to/direcotry/sample.vcf -tensor-type read_tensor
Error: Invalid or corrupt jarfile /path/to/directory/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar
I'm running this in the GATK environment that came with my version, but when I check the Java version I see:
java version "1.7.0_261"
OpenJDK Runtime Environment (rhel-2.6.22.2.el7_8-x86_64 u261-b02)
OpenJDK 64-Bit Server VM (build 24.261-b02, mixed mode)
So I updated to Java 1.8 (still in the GATK environment):
java version "1.8.0_321"
Java(TM) SE Runtime Environment (build 1.8.0_321-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.321-b07, mixed mode)
and I get the following error:
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /path/to/directory/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar CNNScoreVariants -I /path/to/directory/sample.bam -V /path/to/directory/sample.g.vcf.gz -R /path/to/directory/reference_genome.genomic.fa -O /path/to/direcotry/sample.vcf -tensor-type read_tensor
12:38:56.521 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/path/to/directory/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Apr 12, 2022 12:38:56 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:38:56.699 INFO CNNScoreVariants - ------------------------------------------------------------
12:38:56.699 INFO CNNScoreVariants - The Genome Analysis Toolkit (GATK) v4.2.5.0
12:38:56.699 INFO CNNScoreVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
12:38:56.699 INFO CNNScoreVariants - Executing as user@server on Linux v3.10.0-1160.24.1.el7.x86_64 amd64
12:38:56.699 INFO CNNScoreVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
12:38:56.700 INFO CNNScoreVariants - Start Date/Time: April 12, 2022 12:38:56 PM CDT
12:38:56.700 INFO CNNScoreVariants - ------------------------------------------------------------
12:38:56.700 INFO CNNScoreVariants - ------------------------------------------------------------
12:38:56.700 INFO CNNScoreVariants - HTSJDK Version: 2.24.1
12:38:56.700 INFO CNNScoreVariants - Picard Version: 2.25.4
12:38:56.700 INFO CNNScoreVariants - Built for Spark Version: 2.4.5
12:38:56.701 INFO CNNScoreVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:38:56.701 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:38:56.701 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:38:56.701 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:38:56.701 INFO CNNScoreVariants - Deflater: IntelDeflater
12:38:56.701 INFO CNNScoreVariants - Inflater: IntelInflater
12:38:56.701 INFO CNNScoreVariants - GCS max retries/reopens: 20
12:38:56.701 INFO CNNScoreVariants - Requester pays: disabled
12:38:56.701 INFO CNNScoreVariants - Initializing engine
12:38:58.621 INFO FeatureManager - Using codec VCFCodec to read file file:///path/to/directory/sample.g.vcf.gz
12:38:59.774 INFO CNNScoreVariants - Done initializing engine
12:38:59.775 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/path/to/directory/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
12:38:59.812 INFO CNNScoreVariants - Done scoring variants with CNN.
12:38:59.812 INFO CNNScoreVariants - Shutting down engine
[April 12, 2022 12:38:59 PM CDT] org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants done. Elapsed time: 0.06 minutes.
Runtime.totalMemory()=2559574016
java.lang.RuntimeException: A required Python package ("gatktool") could not be imported into the Python environment. This tool requires that the GATK Python environment is properly established and activated. Please refer to GATK README.md file for instructions on setting up the GATK Python environment.
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.checkPythonEnvironmentForPackage(PythonScriptExecutor.java:228)
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.start(StreamingPythonScriptExecutor.java:121)
at org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants.onTraversalStart(CNNScoreVariants.java:297)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1083)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException:
python exited with 1
Command Line: python -c import gatktool
Stdout:
Stderr: Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'gatktool'
at org.broadinstitute.hellbender.utils.python.PythonExecutorBase.getScriptException(PythonExecutorBase.java:75)
at org.broadinstitute.hellbender.utils.runtime.ScriptExecutor.executeCuratedArgs(ScriptExecutor.java:112)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeArgs(PythonScriptExecutor.java:193)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeCommand(PythonScriptExecutor.java:78)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.checkPythonEnvironmentForPackage(PythonScriptExecutor.java:221)
... 9 more
I tried different things and nothing worked until I recreated the environment under a different name. I still had to update the Java version to 1.8, but it's finally working. Any idea why I have to manually update the Java version, though?
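The stack trace above shows GATK probing the environment by running `python -c import gatktool`. The same probe can be run by hand before launching the tool. This is only a minimal sketch: `can_import` is a hypothetical helper name, not part of GATK, and it assumes that the `python` on PATH is the interpreter GATK will invoke.

```shell
# Hypothetical helper (not part of GATK): succeed if the given Python
# interpreter can import the named module, fail otherwise.
can_import() {
  # $1 = python executable, $2 = module name
  "$1" -c "import $2" >/dev/null 2>&1
}

# Assumption: `python` on PATH is the interpreter GATK will call.
if can_import python gatktool; then
  echo "gatktool is importable"
else
  echo "gatktool is NOT importable; is the GATK Conda environment active?"
fi
```

If the second message appears inside the activated environment, the environment itself is probably incomplete, which would be consistent with recreating it having fixed the problem.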
-
Hi Kathrin B,
How are you setting up your environment? Is this with Docker?
Best,
Genevieve
-
Hi Genevieve, sorry for the late reply! No, I'm not using Docker, as I'm connecting to an HPCC where I don't have admin privileges. I followed the article here to set up the environment.
-
Kathrin B could you try using bioconda? https://bioconda.github.io/user/install.html#install-conda
It looks like it might be more up to date and lead to fewer issues. Let me know how it goes!
-
Genevieve-Brandt-she-her, I switched to bioconda, but I'm still struggling with the environment. Do I have to create one myself?
-
Hi Kathrin B,
I was going back and forth with my colleagues regarding the best way for you to get your environment set up. We thought bioconda might work easily, but we don't have an environment we maintain.
I think the easiest thing for your situation is to use the Conda environment that you mentioned was already working, where you just had to manually update the Java version. The Conda environment shouldn't change the Java version, so it might be something on your HPCC that changes the Java version when you start up your Conda environment. I'm not quite sure without getting into the details, though, so it might be worth asking the team who sets up Java on your HPCC.
Let me know if you have any other questions.
Best,
Genevieve
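One way to test the idea that something on the cluster changes which `java` gets picked up is to compare the directory the shell resolves `java` from before and after activating the environment. A rough sketch under stated assumptions: `first_java_dir` is a hypothetical helper written for this illustration, and it assumes a POSIX-style shell such as bash.

```shell
# Hypothetical helper: print the first directory in a colon-separated
# search path that contains an executable file named `java`. This mirrors
# how the shell walks PATH, so it shows which Java install "wins".
first_java_dir() {
  # $1 = search path to inspect (defaults to the current PATH)
  old_ifs=$IFS
  IFS=:
  for dir in ${1:-$PATH}; do
    if [ -f "$dir/java" ] && [ -x "$dir/java" ]; then
      IFS=$old_ifs
      printf '%s\n' "$dir"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}

echo "java resolved from: $(first_java_dir || echo 'not found')"
```

Running this once before and once after activating the Conda environment shows whether activation (or a site-specific module or login script) moves a different Java directory to the front of PATH.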
-
Thank you so much for your help with this. Everything was messed up after installing the conda environment, but I finally got it to work again, and I'll just proceed as you suggested and update Java manually.
Thanks again!
Kathrin
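For anyone who ends up switching Java by hand in the same way, a small guard run before `gatk` can catch the wrong version early. In this thread the jar only loaded under Java 1.8, which matches the Java 8 requirement stated in the GATK 4.2 README. The sketch below is illustrative only; `java_major` is a hypothetical helper, not a GATK utility.

```shell
# Hypothetical helper: turn the first line of `java -version` output into
# a major version number (legacy "1.8.0_321" -> 8, modern "17.0.2" -> 17).
java_major() {
  # $1 = banner line, e.g.: java version "1.8.0_321"
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # legacy scheme: 1.N
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # modern scheme: N.x.y
  esac
}

# `java -version` writes its banner to stderr, hence the 2>&1.
banner=$(java -version 2>&1 | head -n 1)
major=$(java_major "$banner")
if [ "$major" = "8" ]; then
  echo "Java 8 detected"
else
  echo "Expected Java 8 but found: ${banner:-no java on PATH}"
fi
```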