CNNScoreVariants produces no output and prints java.lang.NullPointerException
Hi, I am using the Germline short variant discovery pipeline for a single sample, with GATK 4.2.3.0.
At the CNNScoreVariants stage, I run the tool on the output of HaplotypeCaller:
gatk CNNScoreVariants -V h1_dna.vcf.gz -R GRCh38.primary_assembly.genome.fa -O h1_dnaCNN.vcf
This is the log:
Using GATK jar /usr/local/hurcs/miniconda3/envs/gatk4-4.2.3.0/share/gatk4-4.2.3.0-0/gatk-package-4.2.3.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /usr/local/hurcs/miniconda3/envs/gatk4-4.2.3.0/share/gatk4-4.2.3.0-0/gatk-package-4.2.3.0-local.jar CNNScoreVariants -V h1_dna.vcf.gz -R GRCh38.primary_assembly.genome.fa -O h1_dnaCNN.vcf
12:53:27.601 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/hurcs/miniconda3/envs/gatk4-4.2.3.0/share/gatk4-4.2.3.0-0/gatk-package-4.2.3.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Dec 07, 2021 12:53:29 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:53:29.189 INFO CNNScoreVariants - ------------------------------------------------------------
12:53:29.189 INFO CNNScoreVariants - The Genome Analysis Toolkit (GATK) v4.2.3.0
12:53:29.189 INFO CNNScoreVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
12:53:29.191 INFO CNNScoreVariants - Executing as shay.kinreich@moriah-gw-02.cs.huji.ac.il on Linux v5.10.79-aufs-1 amd64
12:53:29.191 INFO CNNScoreVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_302-b08
12:53:29.192 INFO CNNScoreVariants - Start Date/Time: December 7, 2021 12:53:27 PM IST
12:53:29.192 INFO CNNScoreVariants - ------------------------------------------------------------
12:53:29.192 INFO CNNScoreVariants - ------------------------------------------------------------
12:53:29.193 INFO CNNScoreVariants - HTSJDK Version: 2.24.1
12:53:29.193 INFO CNNScoreVariants - Picard Version: 2.25.4
12:53:29.193 INFO CNNScoreVariants - Built for Spark Version: 2.4.5
12:53:29.193 INFO CNNScoreVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:53:29.193 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:53:29.193 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:53:29.194 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:53:29.194 INFO CNNScoreVariants - Deflater: IntelDeflater
12:53:29.194 INFO CNNScoreVariants - Inflater: IntelInflater
12:53:29.194 INFO CNNScoreVariants - GCS max retries/reopens: 20
12:53:29.194 INFO CNNScoreVariants - Requester pays: disabled
12:53:29.195 INFO CNNScoreVariants - Initializing engine
12:53:30.986 INFO FeatureManager - Using codec VCFCodec to read file file:///vol/sci/bio/data/nissim.benvenisty/shay.kinreich/RNA2CM/RNA2CM/data/h1_dna.vcf.gz
12:53:31.084 WARN IntelInflater - Zero Bytes Written : 0
12:53:31.196 WARN IntelInflater - Zero Bytes Written : 0
12:53:31.284 INFO CNNScoreVariants - Done initializing engine
12:53:31.286 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/usr/local/hurcs/miniconda3/envs/gatk4-4.2.3.0/share/gatk4-4.2.3.0-0/gatk-package-4.2.3.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
12:53:31.649 INFO CNNScoreVariants - Done scoring variants with CNN.
12:53:31.649 INFO CNNScoreVariants - Shutting down engine
[December 7, 2021 12:53:31 PM IST] org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants done. Elapsed time: 0.07 minutes.
Runtime.totalMemory()=21254144
java.lang.NullPointerException
at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.hasMessage(ProcessControllerAckResult.java:49)
at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.getDisplayMessage(ProcessControllerAckResult.java:69)
at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.waitForAck(StreamingProcessController.java:229)
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.waitForAck(StreamingPythonScriptExecutor.java:216)
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.sendSynchronousCommand(StreamingPythonScriptExecutor.java:183)
at org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants.onTraversalStart(CNNScoreVariants.java:313)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1083)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Thanks in advance
-
Hi Shay Kinreich,
Thanks for writing into the forum! Let's see if we can figure out what is causing this issue.
A similar problem has been posted to the forum before and it looks like there are a few suggestions to solve the problem: https://gatk.broadinstitute.org/hc/en-us/community/posts/360056339432-CNNScoreVariants-carshes-with-java-lang-NullPointerException. Could you take a look and try out the suggestions?
Best,
Genevieve
-
I've tried to follow the suggestions in that post. I'm using my university cluster, so I can't change the conda environment.
As far as I can tell, the conda env is up to date and the Python packages are all installed.
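The stack trace above fails in StreamingPythonScriptExecutor.waitForAck, that is, while the Java side waits for an acknowledgement from the Python process that CNNScoreVariants starts. One way to double-check the "packages are all installed" claim is to run a short script with the same Python interpreter that GATK picks up. This is only a sketch: the module names in ASSUMED_MODULES are an assumption about what the GATK conda environment provides for this tool, not a list taken from this thread, so adjust them to match your installation.

```python
import importlib.util
import sys

# ASSUMPTION: these are the Python modules the GATK conda environment is
# expected to provide for CNNScoreVariants. Edit the list if yours differs.
ASSUMED_MODULES = ["tensorflow", "keras", "vqsr_cnn", "gatktool"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or oddly registered modules;
            # treat those as not usable.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Show which interpreter is running, since a cluster may expose several.
    print("Python executable:", sys.executable)
    absent = missing_modules(ASSUMED_MODULES)
    if absent:
        print("Could not locate:", ", ".join(absent))
    else:
        print("All assumed modules were located.")
```

Locating a module only shows that it is installed for this interpreter; it does not prove that the import succeeds or that the versions are compatible, so a clean result here does not rule out an environment problem.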
-
I see. Could you make a conda environment outside of the cluster and verify that this issue persists when following the suggestions in that post?
-
I've tried that as well, but then I encountered a new problem:
(gatk) shay@ls-chenPc:~$ gatk
No command 'gatk' found, did you mean:
Command 'gitk' from package 'gitk' (main)
Command 'gak' from package 'gui-apt-key' (universe)
Command 'gawk' from package 'gawk' (main)
gatk: command not found
I tried to follow the instructions here:
https://gatk.broadinstitute.org/hc/en-us/articles/360036194592-Getting-started-with-GATK4
but it didn't work.
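The "command not found" message means the shell could not find an executable named gatk in any directory on PATH. An activated conda environment does not by itself guarantee that the launcher is there. A minimal sketch for checking this from Python follows; the helper name find_launcher is hypothetical and chosen only for illustration.

```python
import os
import shutil


def find_launcher(command="gatk"):
    """Return the full path of `command` if it is on PATH, otherwise None."""
    return shutil.which(command)


if __name__ == "__main__":
    location = find_launcher("gatk")
    if location is None:
        print("gatk is not on PATH. Directories searched:")
        # Print each PATH entry so a missing GATK directory is easy to spot.
        for entry in os.environ.get("PATH", "").split(os.pathsep):
            print("  ", entry)
    else:
        print("gatk launcher found at:", location)
```

If the launcher is missing, adding the directory that contains the unpacked gatk script to PATH, or calling the script by its full path, is the usual remedy.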
-
I recommend following this Quick Start Guide: https://github.com/broadinstitute/gatk#quick-start-guide
Or, running GATK with Docker: https://gatk.broadinstitute.org/hc/en-us/articles/360035889991--How-to-Run-GATK-in-a-Docker-container
-
Even though I still wasn't able to use the tool with conda, I successfully ran it with Docker.
Thank you for your help!
-
Great! Glad it is working now, thanks for the update!