VariantFiltration Issue
Good afternoon, I'm writing in the hope that someone can help me figure out an issue I am facing. I previously tested VariantFiltration with a similar (albeit simpler) command and had success. However, when I apply this script to my dataset, the tool initializes and the job terminates almost immediately. The output VCF file is generated, but it contains only the standard VCF header and column information, with no variants. Manual inspection of the input file tells me that variants should be flagged and written.
Colleagues have suggested that the problem is either formatting or the input file, but have been unable to point to anything specific. Changing the formatting hasn't yielded different results, and I cannot find anything wrong with the input file I am trying to use. I have also tried multiple VCF files and gotten the same results. I've combed through the forum and found nothing that gave different results when I tried it. There is something I am missing and I can't seem to put my finger on it. Any insight?
REQUIRED for all errors and issues:
a) GATK version used: gatk4-4.2.2.0-1
b) Exact command used:
gatk VariantFiltration -R CryptoDB-60_CparvumIowaII_Genome.fasta -V CpMUT917R_BX4R2_CpIA2_comp7mo.vcf -O CpMUT917R_BX4R2_CpIA2_comp7mo_filtered.vcf -filter "QD < 20" --filter-name "LowQD" -filter "QUAL < 200.0" --filter-name "QUAL200" -filter "SOR > 3.0" --filter-name "SOR3" -filter "FS > 1.0" --filter-name "FS1" -filter "MQ < 60.0" --filter-name "MQ60" -filter "MQRankSum > 1.0" --filter-name "MQRankSum0"
c) Entire program log:
Using GATK jar /gpfs1/home/e/m/emattice/miniconda3/share/gatk4-4.2.2.0-1/gatk-package-4.2.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gpfs1/home/e/m/emattice/miniconda3/share/gatk4-4.2.2.0-1/gatk-package-4.2.2.0-local.jar VariantFiltration -R CryptoDB-60_CparvumIowaII_Genome.fasta -V CpMUT917R_BX4R2_CpIA2_comp7mo.vcf -O CpMUT917R_BX4R2_CpIA2_comp7mo_filtered.vcf -filter QD < 20 --filter-name LowQD -filter QUAL < 200.0 --filter-name QUAL200 -filter SOR > 3.0 --filter-name SOR3 -filter FS > 1.0 --filter-name FS1 -filter MQ < 60.0 --filter-name MQ60 -filter MQRankSum > 1.0 --filter-name MQRankSum0
12:49:30.657 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs1/home/e/m/emattice/miniconda3/share/gatk4-4.2.2.0-1/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jan 04, 2023 12:49:30 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:49:30.746 INFO VariantFiltration - ------------------------------------------------------------
12:49:30.746 INFO VariantFiltration - The Genome Analysis Toolkit (GATK) v4.2.2.0
12:49:30.746 INFO VariantFiltration - For support and documentation go to https://software.broadinstitute.org/gatk/
12:49:30.746 INFO VariantFiltration - Executing as emattice@node309.cluster on Linux v3.10.0-1160.42.2.el7.x86_64 amd64
12:49:30.746 INFO VariantFiltration - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_302-b08
12:49:30.747 INFO VariantFiltration - Start Date/Time: January 4, 2023 12:49:30 PM EST
12:49:30.747 INFO VariantFiltration - ------------------------------------------------------------
12:49:30.747 INFO VariantFiltration - ------------------------------------------------------------
12:49:30.747 INFO VariantFiltration - HTSJDK Version: 2.24.1
12:49:30.747 INFO VariantFiltration - Picard Version: 2.25.4
12:49:30.747 INFO VariantFiltration - Built for Spark Version: 2.4.5
12:49:30.747 INFO VariantFiltration - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:49:30.747 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:49:30.747 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:49:30.747 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:49:30.747 INFO VariantFiltration - Deflater: IntelDeflater
12:49:30.747 INFO VariantFiltration - Inflater: IntelInflater
12:49:30.747 INFO VariantFiltration - GCS max retries/reopens: 20
12:49:30.748 INFO VariantFiltration - Requester pays: disabled
12:49:30.748 INFO VariantFiltration - Initializing engine
12:49:30.989 INFO FeatureManager - Using codec VCFCodec to read file file:///gpfs1/home/e/m/emattice/EBM_files/CpMUT_917R_BX4R2/CpIowaII/CpMUT917R_BX4R2_CpIA2_comp7mo.vcf
12:49:31.008 INFO VariantFiltration - Done initializing engine
12:49:31.049 INFO ProgressMeter - Starting traversal
12:49:31.050 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
12:49:31.065 INFO VariantFiltration - Shutting down engine
[January 4, 2023 12:49:31 PM EST] org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=1289224192
java.lang.NumberFormatException: For input string: "33.71"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at org.apache.commons.jexl2.JexlArithmetic.toLong(JexlArithmetic.java:906)
at org.apache.commons.jexl2.JexlArithmetic.compare(JexlArithmetic.java:718)
at org.apache.commons.jexl2.JexlArithmetic.lessThan(JexlArithmetic.java:774)
at org.apache.commons.jexl2.Interpreter.visit(Interpreter.java:967)
at org.apache.commons.jexl2.parser.ASTLTNode.jjtAccept(ASTLTNode.java:18)
at org.apache.commons.jexl2.Interpreter.interpret(Interpreter.java:232)
at org.apache.commons.jexl2.ExpressionImpl.evaluate(ExpressionImpl.java:65)
at htsjdk.variant.variantcontext.JEXLMap.evaluateExpression(JEXLMap.java:186)
at htsjdk.variant.variantcontext.JEXLMap.get(JEXLMap.java:95)
at htsjdk.variant.variantcontext.JEXLMap.get(JEXLMap.java:15)
at htsjdk.variant.variantcontext.VariantContextUtils.match(VariantContextUtils.java:338)
at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.matchesFilter(VariantFiltration.java:452)
at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.filter(VariantFiltration.java:406)
at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.apply(VariantFiltration.java:353)
at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
-
Without seeing the input VCF I can't guarantee that this is the problem, but I suspect that changing "QD < 20" to "QD < 20.0" might help. This is because the error message says:
java.lang.NumberFormatException: For input string: "33.71"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
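at java.lang.Long.parseLong(Long.java:589)
The stack trace shows the failure inside a JEXL less-than comparison (JexlArithmetic.lessThan calling toLong), so it looks like the integer literal 20 leads JEXL to treat both sides as longs. Since the float 33.71 can't be parsed as a long, I think that might be the problem. Maybe one of the first variants in your VCF has a QD of 33.71?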
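For reference, here is a rough sketch of how I would check this and rerun, assuming the same file names and thresholds as in your post. I haven't run it against your data. The only change to the filters is `20` becoming `20.0`; the other expressions already use decimal literals. The script only prints the arguments so you can review them, and the actual `gatk` call is left commented out.

```shell
#!/bin/sh
# Sketch only: same file names and thresholds as in the original post.
vcf="CpMUT917R_BX4R2_CpIA2_comp7mo.vcf"

# Rough check: if the VCF is in the current directory, show the first
# record whose INFO field contains QD=33.71 (the value from the exception).
if [ -f "$vcf" ]; then
  grep -v '^#' "$vcf" | grep 'QD=33\.71' | head -n 1
fi

# Build the argument list; the only change is "QD < 20" -> "QD < 20.0".
set -- VariantFiltration \
  -R CryptoDB-60_CparvumIowaII_Genome.fasta \
  -V "$vcf" \
  -O CpMUT917R_BX4R2_CpIA2_comp7mo_filtered.vcf \
  -filter "QD < 20.0" --filter-name "LowQD" \
  -filter "QUAL < 200.0" --filter-name "QUAL200" \
  -filter "SOR > 3.0" --filter-name "SOR3" \
  -filter "FS > 1.0" --filter-name "FS1" \
  -filter "MQ < 60.0" --filter-name "MQ60" \
  -filter "MQRankSum > 1.0" --filter-name "MQRankSum0"

# Print one argument per line for review before running anything.
cmd_preview=$(printf '%s\n' gatk "$@")
printf '%s\n' "$cmd_preview"

# Once the arguments look right, run the tool with:
# gatk "$@"
```

If the grep prints a record near the top of the file, that would fit the job stopping before any variants were written.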