Filter variants
Hello
I am running convolutional neural networks to filter variants (https://gatk.broadinstitute.org/hc/en-us/articles/360037226672-CNNScoreVariants). Is this step computationally intensive? It has been running for more than six days. Can anyone suggest ways to speed it up?
Thanks
-
Hi Priyadarshini Thirunavukkarasu,
That seems like a long time. It depends on how many variants you are running it on and whether you are using the 1D or 2D model. Could you provide some more information about your use case?
One other thing to check - is it still running or is the process hung on your machine?
Best,
Genevieve
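
One way to tell a slow run from a hung one is to check how long the log has been silent. The sketch below is a minimal, hypothetical helper (the names `last_log_time` and `silence` are made up for illustration and are not part of GATK). It assumes each log line starts with an `HH:MM:SS.mmm` timestamp and that the whole log was written on one calendar day, since the timestamps carry no date.

```python
import re
from datetime import date, datetime

# Assumed layout: each log line starts with a wall-clock time such as "09:05:19.445".
TIMESTAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.(\d{3})\s")


def last_log_time(lines, run_date):
    """Return the datetime of the last timestamped line, or None if there is none.

    Assumption: every line was written on run_date (the log itself has no date).
    """
    last = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            hour, minute, second, millis = (int(g) for g in match.groups())
            last = datetime(run_date.year, run_date.month, run_date.day,
                            hour, minute, second, millis * 1000)
    return last


def silence(lines, run_date, checked_at):
    """Return how long the log had been quiet at checked_at, or None."""
    last = last_log_time(lines, run_date)
    return None if last is None else checked_at - last


if __name__ == "__main__":
    # Illustrative input only; in practice read the lines from your own log file.
    sample = [
        "09:05:18.836 INFO CNNScoreVariants - Initializing engine",
        "09:05:19.445 INFO CNNScoreVariants - Done initializing engine",
    ]
    print(silence(sample, date(2021, 10, 26), datetime(2021, 10, 26, 15, 5, 34)))
```

A gap of hours with no new messages suggests the process is stuck rather than working through the variants.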
-
Hello
I am using the 2D model. I set a time limit of 6 days, so the process was terminated after 6 days.
Total SNPs: 412511
Total Indels: 11413
Priya
-
I see, yes, the process should not take so long.
Could you provide the command you used and the program log until it was terminated?
-
This is the command used:
gatk CNNScoreVariants \
-I /variants/1.bamout.bam \
-V /variants/1.vcf.gz \
-R /data/reference/gch38.fa \
-O /variants/filtered/1_scored.vcf \
--tensor-type read_tensor \
--transfer-batch-size 8 \
--inference-batch-size 8
-
Priyadarshini Thirunavukkarasu, could you also provide the program log? This is the output printed to the terminal while the command runs.
-
09:05:18.765 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/scicore/soft/apps/GATK/4.0.8.1-foss-2018b-Python-3.6.6/gatk-package-4.0.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
09:05:18.834 INFO CNNScoreVariants - ------------------------------------------------------------
09:05:18.835 INFO CNNScoreVariants - The Genome Analysis Toolkit (GATK) v4.0.8.1
09:05:18.835 INFO CNNScoreVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
09:05:18.835 INFO CNNScoreVariants - Executing as thirun0000@shi122.cluster.bc2.ch on Linux v3.10.0-1160.el7.x86_64 amd64
09:05:18.835 INFO CNNScoreVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_212-b03
09:05:18.835 INFO CNNScoreVariants - Start Date/Time: October 26, 2021 9:05:18 AM CEST
09:05:18.835 INFO CNNScoreVariants - ------------------------------------------------------------
09:05:18.835 INFO CNNScoreVariants - ------------------------------------------------------------
09:05:18.836 INFO CNNScoreVariants - HTSJDK Version: 2.16.0
09:05:18.836 INFO CNNScoreVariants - Picard Version: 2.18.7
09:05:18.836 INFO CNNScoreVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
09:05:18.836 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
09:05:18.836 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
09:05:18.836 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
09:05:18.836 INFO CNNScoreVariants - Deflater: IntelDeflater
09:05:18.836 INFO CNNScoreVariants - Inflater: IntelInflater
09:05:18.836 INFO CNNScoreVariants - GCS max retries/reopens: 20
09:05:18.836 INFO CNNScoreVariants - Using google-cloud-java fork https://github.com/broadinstitute/google-cloud-java/releases/tag/0.20.5-alpha-GCS-RETRY-FIX
09:05:18.836 WARN CNNScoreVariants -
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Warning: CNNScoreVariants is an EXPERIMENTAL tool and should not be used for production
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
09:05:18.836 INFO CNNScoreVariants - Initializing engine
09:05:19.278 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/GROUP/memory_optimization/variants/1.vcf.gz
09:05:19.445 INFO CNNScoreVariants - Done initializing engine
slurmstepd: error: *** JOB 64437 ON shi122 CANCELLED AT 2021-10-26T15:05:34 DUE TO TIME LIMIT ***
-
Thanks for sharing! I think you may have an issue with your Python environment. We've seen this before on the forum, where the job stops making progress before it really even starts.
See this forum post and solution for more information:
https://gatk.broadinstitute.org/hc/en-us/community/posts/4405273097627-CNNScoreVariants-not-working
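
As a quick first check of the Python environment, something like the sketch below can be run with the same interpreter the job uses. It only reports whether a few modules can be located; the module names listed (`vqsr_cnn`, `keras`, `tensorflow`, `numpy`) are assumptions about what the GATK conda environment provides, and `missing_modules` is a hypothetical helper, so adjust the list to your installation.

```python
import importlib.util

# Assumption: these are the modules the CNN tools need; edit to match your install.
REQUIRED_MODULES = ("vqsr_cnn", "keras", "tensorflow", "numpy")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names the current interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_modules()
    if gaps:
        print("Missing from this Python environment:", ", ".join(gaps))
    else:
        print("All listed modules were found.")
```

If any module is reported missing, the environment the job runs in is probably not the one GATK expects. A clean result does not prove the environment is correct, only that the modules can be found.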