CNNScoreVariants crashes with java.lang.NullPointerException
Can you please provide
a) GATK version used
v4.1.4.1
b) Exact GATK commands used
/usr/local/bioinf/gatk/latest4/gatk CNNScoreVariants --tmp-dir /local/scratch/xxx -R GRCh38.d1.vd1.fa -I MyNormalTest_recal4.bam -V MyTumorTest1_germline_0018-scattered.interval_list.vcf.gz --inter-op-threads 8 --intra-op-threads 8 -O MyTumorTest1_germline_0018-scattered.interval_list.vcf_CNNScoreFilter.vcf.gz
c) The entire error log if applicable.
15:58:56.393 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/bioinf/gatk/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jan 17, 2020 3:58:56 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
15:58:56.548 INFO CNNScoreVariants - ------------------------------------------------------------
15:58:56.548 INFO CNNScoreVariants - The Genome Analysis Toolkit (GATK) v4.1.4.1
15:58:56.548 INFO CNNScoreVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
15:58:56.548 INFO CNNScoreVariants - Executing as xxx@apollo-10.local on Linux v3.10.0-957.21.3.el7.x86_64 amd64
15:58:56.548 INFO CNNScoreVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_231-b11
15:58:56.549 INFO CNNScoreVariants - Start Date/Time: January 17, 2020 3:58:56 PM CET
15:58:56.549 INFO CNNScoreVariants - ------------------------------------------------------------
15:58:56.549 INFO CNNScoreVariants - ------------------------------------------------------------
15:58:56.549 INFO CNNScoreVariants - HTSJDK Version: 2.21.0
15:58:56.549 INFO CNNScoreVariants - Picard Version: 2.21.2
15:58:56.549 INFO CNNScoreVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:58:56.549 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:58:56.549 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:58:56.549 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:58:56.549 INFO CNNScoreVariants - Deflater: IntelDeflater
15:58:56.549 INFO CNNScoreVariants - Inflater: IntelInflater
15:58:56.549 INFO CNNScoreVariants - GCS max retries/reopens: 20
15:58:56.549 INFO CNNScoreVariants - Requester pays: disabled
15:58:56.550 INFO CNNScoreVariants - Initializing engine
15:58:57.036 INFO FeatureManager - Using codec VCFCodec to read file file:///data/projects/xxx/MyTumorTest1_germline_0018-scattered.interval_list.vcf.gz
15:58:57.133 INFO CNNScoreVariants - Done initializing engine
15:58:57.133 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/usr/local/bioinf/gatk/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_utils.so
15:58:57.230 INFO CNNScoreVariants - Done scoring variants with CNN.
15:58:57.230 INFO CNNScoreVariants - Shutting down engine
[January 17, 2020 3:58:57 PM CET] org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2124939264
java.lang.NullPointerException
at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.hasMessage(ProcessControllerAckResult.java:49)
at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.getDisplayMessage(ProcessControllerAckResult.java:69)
at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.waitForAck(StreamingProcessController.java:229)
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.waitForAck(StreamingPythonScriptExecutor.java:216)
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.sendSynchronousCommand(StreamingPythonScriptExecutor.java:183)
at org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants.onTraversalStart(CNNScoreVariants.java:317)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1046)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
Using GATK jar /usr/local/bioinf/gatk/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /usr/local/bioinf/gatk/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar CNNScoreVariants --tmp-dir /local/scratch/xxx -R GRCh38.d1.vd1.fa -I MyNormalTest_recal4.bam -V MyTumorTest1_germline_0018-scattered.interval_list.vcf.gz --inter-op-threads 8 --intra-op-threads 8 -O MyTumorTest1_germline_0018-scattered.interval_list.vcf_CNNScoreFilter.vcf.gz
-
Official comment
Hi Beri,
this fixed the problem, partly.
I created the conda env with the yml file bundled with the gatk zip file, and activated the env:
$ conda env create -f gatkcondaenv.yml -p /path/to/myGATK_env
$ conda activate /path/to/myGATK_env
When running the command as before I got the following error:
12:34:59.635 INFO CNNScoreVariants - Done scoring variants with CNN.
12:34:59.635 INFO CNNScoreVariants - Shutting down engine
[January 27, 2020 12:34:59 PM CET] org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants done. Elapsed time: 1.15 minutes.
Runtime.totalMemory()=2240806912
org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException: A nack was received from the Python process (most likely caused by a raised exception caused by): nkm received
: Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/path/to/myGATK_env/lib/python3.6/site-packages/vqsr_cnn/__init__.py", line 1, in <module>
from .vqsr_cnn.models import build_2d_annotation_model_from_args, build_1d_annotation_model_from_args
File "/path/to/myGATK_env/lib/python3.6/site-packages/vqsr_cnn/vqsr_cnn/__init__.py", line 1, in <module>
from .models import build_2d_annotation_model_from_args, build_1d_annotation_model_from_args
File "/path/to/myGATK_env/lib/python3.6/site-packages/vqsr_cnn/vqsr_cnn/models.py", line 14, in <module>
from . import plots
File "/path/to/myGATK_env/lib/python3.6/site-packages/vqsr_cnn/vqsr_cnn/plots.py", line 19, in <module>
from sklearn.metrics import roc_curve, auc, roc_auc_score, precision_recall_curve, average_precision_score
File "/path/to/myGATK_env/lib/python3.6/site-packages/sklearn/metrics/__init__.py", line 7, in <module>
from .ranking import auc
File "/path/to/myGATK_env/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 25, in <module>
from scipy.stats import rankdata
File "/path/to/myGATK_env/lib/python3.6/site-packages/scipy/stats/__init__.py", line 345, in <module>
from .morestats import *
File "/path/to/myGATK_env/lib/python3.6/site-packages/scipy/stats/morestats.py", line 12, in <module>
from numpy.testing.decorators import setastest
ModuleNotFoundError: No module named 'numpy.testing.decorators'
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.waitForAck(StreamingPythonScriptExecutor.java:222)
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.sendSynchronousCommand(StreamingPythonScriptExecutor.java:183)
at org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants.onTraversalStart(CNNScoreVariants.java:317)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1046)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
When I updated numpy inside the gatk conda env it fixed the problem:
(/path/to/myGATK_env) [rieder@zeus /data/path]$ conda update numpy
This updated the following packages in the gatk conda env:
blas 1.0-mkl --> 1.0-openblas
certifi anaconda::certifi-2016.2.28-py36_0 --> pkgs/main::certifi-2019.11.28-py36_0
numpy 1.13.3-py36ha266831_3 --> 1.18.1-py36h94c655d_0
openssl anaconda::openssl-1.0.2l-0 --> pkgs/main::openssl-1.0.2u-h7b6447c_0
scipy 1.0.0-py36hbf646e7_0 --> 1.3.2-py36he2b7bc3_0
After this update I was able to run CNNScoreVariants.
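For anyone who wants to check an environment before launching GATK, here is a minimal Python sketch that tries the imports from the traceback above (sklearn.metrics, which pulls in scipy.stats, which in turn needs a compatible numpy). The function names `check_import` and `diagnose` are made up for illustration and are not part of GATK or vqsr_cnn; run it with the Python of the activated conda env.

```python
import importlib


def check_import(module_name):
    """Try to import a module; return (ok, detail) instead of raising."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # ImportError or any error raised during import
        return False, f"{type(exc).__name__}: {exc}"
    return True, "imported"


def diagnose(modules=("numpy", "scipy.stats", "sklearn.metrics")):
    """Check each module in the import chain that failed in the traceback."""
    return {name: check_import(name) for name in modules}


if __name__ == "__main__":
    for name, (ok, detail) in diagnose().items():
        status = "OK" if ok else "FAILED"
        print(f"{name:20s} {status:7s} {detail}")
```

If scipy.stats or sklearn.metrics is reported as FAILED with a ModuleNotFoundError that mentions numpy, the numpy and scipy versions in the env probably do not match, which is the situation described above.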
Best
Dietmar
-
Hi Beri,
I get the same error with v4.1.3.0
-
Hi riederd,
Do you get the same error with an older version of GATK4?
-
-
Unfortunately no.
Best
-
Here is a possible solution from a previous forum post with a similar error:
"Are you running from within the gatk conda environment as described here? The environment must have been created using the version of GATK that you're running (I suspect this problem can occur if you have a conda environment from a previous gatk release). I would suggest recreating the conda environment using the gatk release you're running."
-
Thanks for sharing your answer Dietmar.
-
I'd suggest that the GATK developers update the gatkcondaenv.yml to include a newer version of numpy.
Best
Dietmar
-
The dev team have been made aware of the issue and will fix the versioning within conda.
Thanks
-
I'm running into this problem as well. I'm not sure where gatkcondaenv.yml stands, since I installed GATK4 with "conda install gatk4" from within my conda environment, i.e. I didn't download a tar file with GATK, install it, and then create the conda environment, which the above seems to suggest. (To be precise, I'm actually using "conda install gatk4=4.1.4.1" at the moment. Maybe moving forward to the current "4.1.5.0" would fix this, but I'm not using it since backing up to the earlier version helped with another problem.) Why would you do it that way if you want conda to do package management on the GATK installations anyway?
William
-
Hi WVNicholson
To create your conda env I recommend using the gatkcondaenv.yml that comes with the gatk tar file. We cannot help with issues arising from `conda install gatk4`, since that package is not maintained by us.
-
Try the following (taken from https://github.com/broadinstitute/gatk#python):
./conda env create -f gatkcondaenv.yml
ps -p $$ (to get the shell name; in my case: bash)
./conda init bash
source activate gatk
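After activating, it can help to confirm that Python really runs inside the gatk environment. The sketch below is only an illustration: `describe_conda_env` is a made-up helper, and it relies on the `CONDA_PREFIX` and `CONDA_DEFAULT_ENV` variables that `conda activate` sets in the shell.

```python
import os
import sys


def describe_conda_env(environ=None, executable=None):
    """Summarize which conda environment (if any) this interpreter runs in."""
    environ = os.environ if environ is None else environ
    executable = sys.executable if executable is None else executable
    prefix = environ.get("CONDA_PREFIX")      # path of the activated env
    name = environ.get("CONDA_DEFAULT_ENV")   # name (or path) of the activated env
    # True only when the running Python binary lives under the active env.
    python_in_env = bool(prefix) and os.path.abspath(executable).startswith(
        os.path.abspath(prefix) + os.sep
    )
    return {
        "active": prefix is not None,
        "name": name,
        "prefix": prefix,
        "python_in_env": python_in_env,
    }


if __name__ == "__main__":
    for key, value in describe_conda_env().items():
        print(f"{key}: {value}")
```

If `active` is False, or `python_in_env` is False, the shell is probably not using the environment created from gatkcondaenv.yml, and CNNScoreVariants may pick up a different Python.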
-
Thank you Swati Manekar!