GATK VariantFiltration Error
Hi,
I get the error below when I run the VariantFiltration command for variant filtration, in both GATK4 and GATK 3.7:
java -jar /gatk-4.2.2.0/gatk.jar VariantFiltration -R /media/Seagate/Project/GATK/GATK_reference_dict/GCF_Renamed.fasta -V /media/Seagate/Project/GATK/B_1/B_1_raw_snps.vcf -O /media/Seagate/Project/GATK/B_1/B_1_filtered1_snps.vcf --filter-name "basic_snp_filter" --filter-expression "QD < 2 || FS > 60 || MQ < 40 || MQRankSum < -12.5 || ReadPosRankSum < -8 || SOR > 3"
12:03:43.606 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/hedayat/Downloads/gatk-4.2.2.0/gatk.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 05, 2021 12:03:43 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:03:43.796 INFO VariantFiltration - ------------------------------------------------------------
12:03:43.797 INFO VariantFiltration - The Genome Analysis Toolkit (GATK) v4.2.2.0
12:03:43.797 INFO VariantFiltration - For support and documentation go to https://software.broadinstitute.org/gatk/
12:03:43.797 INFO VariantFiltration - Executing as hedayat@hedayat-hp on Linux v5.11.0-27-generic amd64
12:03:43.797 INFO VariantFiltration - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
12:03:43.798 INFO VariantFiltration - Start Date/Time: September 5, 2021 12:03:43 PM IRDT
12:03:43.798 INFO VariantFiltration - ------------------------------------------------------------
12:03:43.798 INFO VariantFiltration - ------------------------------------------------------------
12:03:43.798 INFO VariantFiltration - HTSJDK Version: 2.24.1
12:03:43.798 INFO VariantFiltration - Picard Version: 2.25.4
12:03:43.798 INFO VariantFiltration - Built for Spark Version: 2.4.5
12:03:43.799 INFO VariantFiltration - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:03:43.799 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:03:43.799 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:03:43.799 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:03:43.799 INFO VariantFiltration - Deflater: IntelDeflater
12:03:43.799 INFO VariantFiltration - Inflater: IntelInflater
12:03:43.799 INFO VariantFiltration - GCS max retries/reopens: 20
12:03:43.799 INFO VariantFiltration - Requester pays: disabled
12:03:43.799 INFO VariantFiltration - Initializing engine
12:03:44.784 INFO FeatureManager - Using codec VCFCodec to read file file:///media/Seagate/Project/GATK/B_1/B_1_raw_snps.vcf
12:03:45.603 INFO VariantFiltration - Done initializing engine
12:03:45.899 INFO ProgressMeter - Starting traversal
12:03:45.901 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
12:03:45.947 WARN JexlEngine - ![0,2]: 'QD < 2 || FS > 60 || MQ < 40 || MQRankSum < -12.5 || ReadPosRankSum < -8 || SOR > 3;' undefined variable QD
12:03:46.006 INFO VariantFiltration - Shutting down engine
[September 5, 2021 12:03:46 PM IRDT] org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=2396520448
java.lang.NumberFormatException: For input string: "10.90"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at org.apache.commons.jexl2.JexlArithmetic.toLong(JexlArithmetic.java:906)
at org.apache.commons.jexl2.JexlArithmetic.compare(JexlArithmetic.java:718)
at org.apache.commons.jexl2.JexlArithmetic.lessThan(JexlArithmetic.java:774)
at org.apache.commons.jexl2.Interpreter.visit(Interpreter.java:967)
at org.apache.commons.jexl2.parser.ASTLTNode.jjtAccept(ASTLTNode.java:18)
at org.apache.commons.jexl2.Interpreter.visit(Interpreter.java:1274)
at org.apache.commons.jexl2.parser.ASTOrNode.jjtAccept(ASTOrNode.java:18)
at org.apache.commons.jexl2.Interpreter.visit(Interpreter.java:1274)
at org.apache.commons.jexl2.parser.ASTOrNode.jjtAccept(ASTOrNode.java:18)
at org.apache.commons.jexl2.Interpreter.visit(Interpreter.java:1274)
at org.apache.commons.jexl2.parser.ASTOrNode.jjtAccept(ASTOrNode.java:18)
at org.apache.commons.jexl2.Interpreter.visit(Interpreter.java:1274)
at org.apache.commons.jexl2.parser.ASTOrNode.jjtAccept(ASTOrNode.java:18)
at org.apache.commons.jexl2.Interpreter.visit(Interpreter.java:1274)
at org.apache.commons.jexl2.parser.ASTOrNode.jjtAccept(ASTOrNode.java:18)
at org.apache.commons.jexl2.Interpreter.interpret(Interpreter.java:232)
at org.apache.commons.jexl2.ExpressionImpl.evaluate(ExpressionImpl.java:65)
at htsjdk.variant.variantcontext.JEXLMap.evaluateExpression(JEXLMap.java:186)
at htsjdk.variant.variantcontext.JEXLMap.get(JEXLMap.java:95)
at htsjdk.variant.variantcontext.JEXLMap.get(JEXLMap.java:15)
at htsjdk.variant.variantcontext.VariantContextUtils.match(VariantContextUtils.java:338)
at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.matchesFilter(VariantFiltration.java:452)
at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.filter(VariantFiltration.java:406)
at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.apply(VariantFiltration.java:353)
at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
I know that some variables such as QD are reported as undefined for my command, but my main problem is the error java.lang.NumberFormatException: For input string: "10.90", which is the same in GATK4 and GATK 3.7. It is worth mentioning that I can run tools such as HaplotypeCaller and SelectVariants without any errors or problems.
java -version
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (build 1.8.0_292-8u292-b10-0ubuntu1~20.04-b10)
OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)
-
Hi Reza,
I found several previous forum posts with the same error while running VariantFiltration which might have some helpful troubleshooting advice for you:
It looks like it may be an issue with the filters running into a string in your files. Could you take a look at these previous suggestions and let me know if they are helpful?
Kind regards,
Pamela
-
Sorry for the delay. Thanks for your answer, Pamela, but I did not find a solution to my problem there. I ran ValidateVariants and everything is OK. I decided to use bcftools to filter the VCF file. Do you recommend using another program to filter the VCF file output by GATK? Otherwise my work will be left half done.
Is the input string error a Java issue or a GATK issue?
-
Hi Reza,
Thank you for your response and for running ValidateVariants. The input string error is a Java issue with how the JEXL expression is evaluated rather than a GATK issue with your files, which I believe is why your ValidateVariants output looks good. I think your initial problem comes from an incompatibility between the thresholds you specify for filtration (e.g. MQ < 40) and the value the tool runs into in the VCF (10.90). Because the threshold is written as an integer, the comparison tries to read the annotation value as an integer as well, and that fails on a non-integer value such as 10.90. This article explains this more.
Was bcftools successful for you in filtering the VCF? I would recommend using VariantFiltration to try to avoid incompatibilities between GATK and other tools. I believe you can avoid the error with the '10.90' input string by writing the thresholds as float values, i.e. adding '.0' to the integer ones (for example, MQ < 40.0 instead of MQ < 40). Please let me know if you have any other questions or concerns.
Kind regards,
Pamela
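For reference, here is a minimal sketch of the suggestion above: the command from the original post with every integer threshold written as a float. The paths are copied from that post, the shell variable names are only for illustration, and the thread does not confirm whether this change resolved the error for this particular VCF.

```shell
#!/usr/bin/env bash
# Paths copied from the original post; adjust them for your own system.
GATK_JAR=/gatk-4.2.2.0/gatk.jar
REF=/media/Seagate/Project/GATK/GATK_reference_dict/GCF_Renamed.fasta
IN_VCF=/media/Seagate/Project/GATK/B_1/B_1_raw_snps.vcf
OUT_VCF=/media/Seagate/Project/GATK/B_1/B_1_filtered1_snps.vcf

# Same thresholds as the original command, but each one is a float literal
# (2 -> 2.0, 60 -> 60.0, 40 -> 40.0, -8 -> -8.0, 3 -> 3.0).
FILTER_EXPR='QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0 || SOR > 3.0'

# Only run GATK if the jar is actually present at the assumed location.
if [ -f "$GATK_JAR" ]; then
    java -jar "$GATK_JAR" VariantFiltration \
        -R "$REF" \
        -V "$IN_VCF" \
        -O "$OUT_VCF" \
        --filter-name "basic_snp_filter" \
        --filter-expression "$FILTER_EXPR"
else
    echo "gatk.jar not found at $GATK_JAR; skipping the run"
fi
```

Keeping the expression in single quotes inside a variable prevents the shell from interpreting the `<`, `>` and `||` characters before they reach GATK.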