ArrayList cannot be cast to String in VariantRecalibrator
Hi GATK team and community,
I've encountered a java.util.ArrayList cannot be cast to java.lang.String error while trying to run VariantRecalibrator.
GATK version used:
4.2.6.1
Exact command used:
gatk --java-options "-Xms63g -XX:+UseParallelGC -XX:ParallelGCThreads=3" \
  VariantRecalibrator \
  -V gs://hgdp-1kg/vqsr_pipeline/gnomad_genomes_v3.1_info.vcf.bgz \
  -O ${BATCH_TMPDIR}/VQSR__SNPsVariantRecalibratorScattered-IXH8P/recalibration \
  --tranches-file ${BATCH_TMPDIR}/VQSR__SNPsVariantRecalibratorScattered-IXH8P/tranches \
  --trust-all-polymorphic \
  -tranche 100.0 -tranche 99.95 -tranche 99.9 -tranche 99.8 -tranche 99.6 -tranche 99.5 -tranche 99.4 -tranche 99.3 -tranche 99.0 -tranche 98.0 -tranche 97.0 -tranche 90.0 \
  -an AS_QD -an AS_MQRankSum -an AS_ReadPosRankSum -an AS_FS -an AS_MQ \
  -mode SNP \
  --max-gaussians 6 \
  -resource:hapmap,known=false,training=true,truth=true,prior=15 gs://gcp-public-data--broad-references/hg38/v0/hapmap_3.3.hg38.vcf.gz \
  -resource:omni,known=false,training=true,truth=true,prior=12 gs://gcp-public-data--broad-references/hg38/v0/1000G_omni2.5.hg38.vcf.gz \
  -resource:1000G,known=false,training=true,truth=false,prior=10 gs://gcp-public-data--broad-references/hg38/v0/1000G_phase1.snps.high_confidence.hg38.vcf.gz \
  -resource:dbsnp,known=true,training=false,truth=false,prior=7 gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.gz -L ${BATCH_TMPDIR}/Make_1000_intervals-2nqFj/intervals/0000-scattered.interval_list --use-allele-specific-annotations --input-model ${BATCH_TMPDIR}/VQSR__SNPsVariantRecalibratorCreateModel-YClEf/model_file --output-tranches-for-scatter
Entire error log:
19:09:41.070 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
19:09:41.097 INFO VariantRecalibrator - ------------------------------------------------------------
19:09:41.098 INFO VariantRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1
19:09:41.098 INFO VariantRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
19:09:41.098 INFO VariantRecalibrator - Executing as root@hostname-e3a86afed9 on Linux v5.4.0-1042-gcp amd64
19:09:41.098 INFO VariantRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
19:09:41.098 INFO VariantRecalibrator - Start Date/Time: August 29, 2022 7:09:41 PM GMT
19:09:41.098 INFO VariantRecalibrator - ------------------------------------------------------------
19:09:41.098 INFO VariantRecalibrator - ------------------------------------------------------------
19:09:41.099 INFO VariantRecalibrator - HTSJDK Version: 2.24.1
19:09:41.099 INFO VariantRecalibrator - Picard Version: 2.27.1
19:09:41.099 INFO VariantRecalibrator - Built for Spark Version: 2.4.5
19:09:41.099 INFO VariantRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
19:09:41.099 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
19:09:41.099 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
19:09:41.099 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
19:09:41.099 INFO VariantRecalibrator - Deflater: IntelDeflater
19:09:41.099 INFO VariantRecalibrator - Inflater: IntelInflater
19:09:41.099 INFO VariantRecalibrator - GCS max retries/reopens: 20
19:09:41.099 INFO VariantRecalibrator - Requester pays: disabled
19:09:41.099 INFO VariantRecalibrator - Initializing engine
19:09:43.093 INFO FeatureManager - Using codec VCFCodec to read file gs://gcp-public-data--broad-references/hg38/v0/hapmap_3.3.hg38.vcf.gz
19:09:45.614 INFO FeatureManager - Using codec VCFCodec to read file gs://gcp-public-data--broad-references/hg38/v0/1000G_omni2.5.hg38.vcf.gz
19:09:48.115 INFO FeatureManager - Using codec VCFCodec to read file gs://gcp-public-data--broad-references/hg38/v0/1000G_phase1.snps.high_confidence.hg38.vcf.gz
19:09:50.675 INFO FeatureManager - Using codec VCFCodec to read file gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.gz
19:09:53.628 INFO FeatureManager - Using codec VCFCodec to read file gs://hgdp-1kg/vqsr_pipeline/gnomad_genomes_v3.1_info.vcf.bgz
19:09:57.965 INFO VariantRecalibrator - Done initializing engine
19:09:57.968 INFO TrainingSet - Found hapmap track: Known = false Training = true Truth = true Prior = Q15.0
19:09:57.968 INFO TrainingSet - Found omni track: Known = false Training = true Truth = true Prior = Q12.0
19:09:57.968 INFO TrainingSet - Found 1000G track: Known = false Training = true Truth = false Prior = Q10.0
19:09:57.968 INFO TrainingSet - Found dbsnp track: Known = true Training = false Truth = false Prior = Q7.0
19:09:58.022 WARN GATKVariantContextUtils - Can't determine output variant file format from output file extension "". Defaulting to VCF.
19:09:58.076 INFO ProgressMeter - Starting traversal
19:09:58.076 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
19:09:58.646 INFO VariantRecalibrator - Shutting down engine
[August 29, 2022 7:09:59 PM GMT] org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator done. Elapsed time: 0.30 minutes.
Runtime.totalMemory()=64827162624
org.broadinstitute.hellbender.exceptions.GATKException: Exception thrown at chr1:10007 [VC gs://hgdp-1kg/vqsr_pipeline/gnomad_genomes_v3.1_info.vcf.bgz @ chr1:10007 Q. of type=SNP alleles=[T*, C] attr={AC=0, AC_raw=1, AS_FS=3.97940, AS_MQ=32.2452, AS_MQRankSum=0.358000, AS_QD=3.60000, AS_QUALapprox=|18, AS_ReadPosRankSum=1.23100, AS_SB_TABLE=[3, 0|1, 1], AS_VarDP=|5, AS_pab_max=1.00000, FS=3.97940, MQ=32.2452, MQRankSum=0.358000, QD=3.60000, QUALapprox=18, ReadPosRankSum=1.23100, SB=[3, 0, 1, 1], VarDP=5} GT=[] filters=
    at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:145)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
    at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:136)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String
    at htsjdk.variant.variantcontext.CommonInfo.getAttributeAsDouble(CommonInfo.java:324)
    at htsjdk.variant.variantcontext.VariantContext.getAttributeAsDouble(VariantContext.java:820)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantDataManager.decodeAnnotation(VariantDataManager.java:359)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantDataManager.decodeAnnotations(VariantDataManager.java:328)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.addDatum(VariantRecalibrator.java:602)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.addVariantDatum(VariantRecalibrator.java:572)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.lambda$consumeQueuedVariants$0(VariantRecalibrator.java:543)
    at java.util.ArrayList.forEach(ArrayList.java:1257)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.consumeQueuedVariants(VariantRecalibrator.java:543)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.apply(VariantRecalibrator.java:522)
    at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:139)
    ... 20 more
Genevieve Brandt (she/her), the previous error I was experiencing has been fixed. This step is the one where we apply the model to each interval.
-
Hi Lindo Nkambule,
Thanks for writing in about this! We took a look at the program log, and the stack trace indicates that you were not running VariantRecalibrator in allele-specific mode. It looks like the second-to-last line of your command is missing a line continuation "\", so the --use-allele-specific-annotations flag never reaches the tool. That's why the code is in non-AS mode.
If you fix that line in your command, you shouldn't get this error!
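To see why a single missing "\" matters, here is a minimal sketch in plain shell. It does not call GATK: count_args is a hypothetical stand-in that only reports how many arguments it received, and the flags are copied from the command above purely as example text.

```shell
#!/bin/sh
# Hypothetical stand-in for a command-line tool (not part of GATK):
# it only reports how many arguments it was given.
count_args() {
  echo "$# args"
}

# Every line except the last ends in "\", so the shell joins the
# lines into one command and the tool sees all four arguments.
joined=$(count_args -mode SNP \
  --use-allele-specific-annotations \
  --output-tranches-for-scatter)

# The "\" after "SNP" is missing here, so the command ends at that line
# and the tool sees only two arguments. The next line is then run as a
# separate command, which fails because no such program exists; its
# error message and exit status are discarded to keep the demo quiet.
broken=$(
  count_args -mode SNP
  --use-allele-specific-annotations 2>/dev/null || true
)

echo "with continuation:    $joined"
echo "without continuation: $broken"
```

The first line printed reports 4 args and the second reports 2 args: the flag after the break is dropped from the tool's argument list without the tool itself ever complaining about it.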
Let me know if you have any further questions.
Best,
Genevieve
-
Thank you! That fixed the issue.
-
Great! Thanks for letting us know.