Error, ShouldNeverReachHereException, FuncotationMap in FilterFuncotations
FilterFuncotations stops with an error. The input file with the reference genome seems to pass ValidateVariants (no errors). It looks like "FuncotationMap" doesn't have enough values to go with the keys. I started with a .vcf file downloaded from Nebula Genomics, and sequentially used CNNScoreVariants, FilterVariantTranches (CNN_1D), and Funcotator, with default settings.
I am trying to find the most pathogenic variants. I considered using FilterVcf to remove synonymous and intron variants, but it doesn't look like it can do that. So then I tried FilterFuncotations, but it returns an error. What I want is some way to sort the variants by severity, to find the most pathogenic ones, but I don't know how to do that.
GATK version: 4.2.6.1
Java runtime: OpenJDK 64-Bit Server VM v11.0.14.1+1-Ubuntu-0ubuntu1.20.04
Excerpt:
[April 25, 2022 at 2:00:35 AM EDT] org.broadinstitute.hellbender.tools.funcotator.FilterFuncotations done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=319815680
org.broadinstitute.hellbender.exceptions.GATKException$ShouldNeverReachHereException: Cannot parse the funcotation attribute. Num values: 31 Num keys: 53
Copied from the terminal:
(gatk) aru@BioinformaticsVM:/mnt/sdb/gatk$ ./gatk FilterFuncotations --allele-frequency-data-source gnomad -O ./output/nebulaFilterFuncotations.vcf --ref-version hg38 -V ./output/nebulaFuncotatorAnnotated.vcf --java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true'
Using GATK jar /mnt/sdb/gatk/gatk-package-4.2.6.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -DGATK_STACKTRACE_ON_USER_EXCEPTION=true -jar /mnt/sdb/gatk/gatk-package-4.2.6.1-local.jar FilterFuncotations --allele-frequency-data-source gnomad -O ./output/nebulaFilterFuncotations.vcf --ref-version hg38 -V ./output/nebulaFuncotatorAnnotated.vcf
02:00:34.173 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/sdb/gatk/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
02:00:34.368 INFO FilterFuncotations - ------------------------------------------------------------
02:00:34.369 INFO FilterFuncotations - The Genome Analysis Toolkit (GATK) v4.2.6.1
02:00:34.369 INFO FilterFuncotations - For support and documentation go to https://software.broadinstitute.org/gatk/
02:00:34.369 INFO FilterFuncotations - Executing as aru@BioinformaticsVM on Linux v5.13.0-39-generic amd64
02:00:34.369 INFO FilterFuncotations - Java runtime: OpenJDK 64-Bit Server VM v11.0.14.1+1-Ubuntu-0ubuntu1.20.04
02:00:34.369 INFO FilterFuncotations - Start Date/Time: April 25, 2022 at 2:00:34 AM EDT
02:00:34.369 INFO FilterFuncotations - ------------------------------------------------------------
02:00:34.369 INFO FilterFuncotations - ------------------------------------------------------------
02:00:34.370 INFO FilterFuncotations - HTSJDK Version: 2.24.1
02:00:34.371 INFO FilterFuncotations - Picard Version: 2.27.1
02:00:34.371 INFO FilterFuncotations - Built for Spark Version: 2.4.5
02:00:34.371 INFO FilterFuncotations - HTSJDK Defaults.COMPRESSION_LEVEL : 2
02:00:34.371 INFO FilterFuncotations - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
02:00:34.371 INFO FilterFuncotations - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
02:00:34.371 INFO FilterFuncotations - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
02:00:34.371 INFO FilterFuncotations - Deflater: IntelDeflater
02:00:34.371 INFO FilterFuncotations - Inflater: IntelInflater
02:00:34.371 INFO FilterFuncotations - GCS max retries/reopens: 20
02:00:34.371 INFO FilterFuncotations - Requester pays: disabled
02:00:34.372 WARN FilterFuncotations -
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Warning: FilterFuncotations is an EXPERIMENTAL tool and should not be used for production
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
02:00:34.372 INFO FilterFuncotations - Initializing engine
02:00:34.518 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/sdb/gatk/./output/nebulaFuncotatorAnnotated.vcf
02:00:34.815 INFO FilterFuncotations - Done initializing engine
02:00:35.260 INFO ProgressMeter - Starting traversal
02:00:35.261 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
02:00:35.262 INFO FilterFuncotations - Starting pass 0 through the variants
02:00:35.778 ERROR FuncotationMap - Keys: Gencode_34_hugoSymbol, Gencode_34_ncbiBuild, Gencode_34_chromosome, Gencode_34_start, Gencode_34_end, Gencode_34_variantClassification, Gencode_34_secondaryVariantClassification, Gencode_34_variantType, Gencode_34_refAllele, Gencode_34_tumorSeqAllele1, Gencode_34_tumorSeqAllele2, Gencode_34_genomeChange, Gencode_34_annotationTranscript, Gencode_34_transcriptStrand, Gencode_34_transcriptExon, Gencode_34_transcriptPos, Gencode_34_cDnaChange, Gencode_34_codonChange, Gencode_34_proteinChange, Gencode_34_gcContent, Gencode_34_referenceContext, Gencode_34_otherTranscripts, ACMGLMMLof_LOF_Mechanism, ACMGLMMLof_Mode_of_Inheritance, ACMGLMMLof_Notes, ACMG_recommendation_Disease_Name, ClinVar_VCF_AF_ESP, ClinVar_VCF_AF_EXAC, ClinVar_VCF_AF_TGP, ClinVar_VCF_ALLELEID, ClinVar_VCF_CLNDISDB, ClinVar_VCF_CLNDISDBINCL, ClinVar_VCF_CLNDN, ClinVar_VCF_CLNDNINCL, ClinVar_VCF_CLNHGVS, ClinVar_VCF_CLNREVSTAT, ClinVar_VCF_CLNSIG, ClinVar_VCF_CLNSIGCONF, ClinVar_VCF_CLNSIGINCL, ClinVar_VCF_CLNVC, ClinVar_VCF_CLNVCSO, ClinVar_VCF_CLNVI, ClinVar_VCF_DBVARID, ClinVar_VCF_GENEINFO, ClinVar_VCF_MC, ClinVar_VCF_ORIGIN, ClinVar_VCF_RS, ClinVar_VCF_SSR, ClinVar_VCF_ID, ClinVar_VCF_FILTER, LMMKnown_LMM_FLAGGED, LMMKnown_ID, LMMKnown_FILTER
02:00:35.778 ERROR FuncotationMap - Values: , , , , , , , , , , , , , , , , , , , , , , , , , , , , false, ,
02:00:35.793 INFO FilterFuncotations - Shutting down engine
[April 25, 2022 at 2:00:35 AM EDT] org.broadinstitute.hellbender.tools.funcotator.FilterFuncotations done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=319815680
org.broadinstitute.hellbender.exceptions.GATKException$ShouldNeverReachHereException: Cannot parse the funcotation attribute. Num values: 31 Num keys: 53
at org.broadinstitute.hellbender.tools.funcotator.FuncotationMap.createAsAllTableFuncotationsFromVcf(FuncotationMap.java:224)
at org.broadinstitute.hellbender.tools.funcotator.FuncotatorUtils.lambda$createAlleleToFuncotationMapFromFuncotationVcfAttribute$5(FuncotatorUtils.java:2256)
at java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:178)
at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180)
at java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
at java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:699)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
at org.broadinstitute.hellbender.tools.funcotator.FuncotatorUtils.createAlleleToFuncotationMapFromFuncotationVcfAttribute(FuncotatorUtils.java:2255)
at org.broadinstitute.hellbender.tools.funcotator.filtrationRules.ArHetvarFilter.buildArHetByGene(ArHetvarFilter.java:77)
at org.broadinstitute.hellbender.tools.funcotator.filtrationRules.ArHetvarFilter.firstPassApply(ArHetvarFilter.java:50)
at org.broadinstitute.hellbender.tools.funcotator.FilterFuncotations.firstPassApply(FilterFuncotations.java:161)
at org.broadinstitute.hellbender.engine.TwoPassVariantWalker.nthPassApply(TwoPassVariantWalker.java:17)
at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:40)
at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:77)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:75)
at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:40)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
-
Hi Joyce Anon,
I still don't have an update to share, I think we'll have a chance to take another look early next week.
Best,
Genevieve
-
Hi Joyce Anon,
It looks like you are getting this FilterFuncotations error because there is an issue with the INFO field in your VCF. Could you make sure that the VCF is valid with our tool ValidateVariants?
There could also be a version mismatch between Funcotator and FilterFuncotations. Could you verify that you kept the version consistent?
Best,
Genevieve
-
Yes, ValidateVariants completed with no particular messages. No changes of any kind were made between running the two tools.
I did update something small, and also SciPy, in order to get one of the first two tools to work. I don't remember exactly what.
-
Thanks Joyce. Did you run ValidateVariants on the exact input to this FilterFuncotations tool?
One of my colleagues who is even more familiar with this tool is going to take a second look, but it might be a couple days before they have a chance. I'll keep you posted!
-
Yes, they both had the same input file. Thank you, I appreciate it. I have to get this working one way or another.
-
Thanks.
-
Hi Joyce Anon,
Thank you for your patience while I looked into this issue further. We think that this might be a bug in the code, so would you be able to share your input VCF file to FilterFuncotations so we can determine where the issue is coming from? Here's the github issue ticket: https://github.com/broadinstitute/gatk/issues/7865
Here are the instructions for how to upload a bug report: https://gatk.broadinstitute.org/hc/en-us/articles/360035889671
In the meantime, run Funcotator with the option --output-file-format MAF to get a MAF format with functional annotations. This file format is a .tsv data type and is easier to parse. You can filter this file for the variant classification field you are looking for. If re-running Funcotator won't work, one other option is to convert your Funcotator VCF output to MAF format. We don't have an official tool, but could potentially send you an internal script we have used.
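As a rough illustration of the MAF filtering suggested above, here is a minimal Python sketch. It assumes the MAF is tab-separated, that comment lines start with "#", and that the classification column is named Variant_Classification, as in the standard MAF layout. The function name filter_maf and the default list of excluded classifications are hypothetical choices for this example, so check them against the values actually present in your file.

```python
import csv

# Classifications to drop. This list is an assumption for illustration only;
# adjust it to the values that appear in your own MAF.
DEFAULT_EXCLUDED = {"Silent", "Intron", "IGR", "3'UTR", "5'UTR", "3'Flank", "5'Flank"}


def filter_maf(lines, excluded=DEFAULT_EXCLUDED, column="Variant_Classification"):
    """Yield MAF rows (as dicts) whose classification is not in `excluded`.

    `lines` is any iterable of text lines, for example an open file handle.
    """
    # Skip comment lines so the first remaining line is the column header.
    data_lines = (line for line in lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row[column] not in excluded:
            yield row


if __name__ == "__main__":
    # Tiny in-memory example with made-up gene names.
    sample = [
        "#version 2.4\n",
        "Hugo_Symbol\tVariant_Classification\n",
        "GENE1\tSilent\n",
        "GENE2\tMissense_Mutation\n",
        "GENE3\tIntron\n",
    ]
    for row in filter_maf(sample):
        print(row["Hugo_Symbol"], row["Variant_Classification"])
```

To run it on a real file, pass an open file handle in place of the sample list. The sketch only removes classes of variants; it does not rank the remaining ones by severity.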
Let me know when you have uploaded the file and if you have other questions!
Best,
Genevieve
-
If I upload the VCF file, is it publicly accessible? I'd rather keep it private.
-
Joyce Anon it is private. Just the GATK team can access the data for testing.
-
When I try to load in a browser or ping ftp.broadinstitute.org, I get a time out. I waited a few days, but still the same. Is there some particular way I need to use it?
-
Joyce Anon I just checked and on our end I did not see any new files. You won't be able to load the ftp server because it is private. But it is possible to upload a file to the ftp server and we will be able to access that file.
-
That's because I couldn't find a way to use it. Is there some particular way I need to use it? I don't have much energy, because of the extreme lifelong health issues that I need to identify.
-
I'm sorry you're having issues with it. There isn't anything extra you need to know besides the instructions on our site: https://gatk.broadinstitute.org/hc/en-us/articles/360035889671
There are a number of different articles online with instructions about uploading a file to an FTP server, here is a simple one if you haven't seen it yet.
-
The articles aren't useful, the one in the link doesn't work and I am bad at selecting from lists of options. However, I have found how to do it, it's rather obscure and unintuitive. It took me 17 days of searching...
-
Uploaded as Joyce1NebulaFuncotatorAnnotated.vcf and Joyce1NebulaFuncotatorAnnotated.vcf.idx
I don't know how to do the snippet thing since I don't think it's caused by a position, and I don't have the energy / cognitive function right now anyway.
Thanks for the help!
-
Thanks Joyce Anon, we'll take a look.
-
Any ideas about what is wrong?
-
Do you have any information yet about the nature of the bug, so that I might continue my health care and try to improve my lifelong disabilities?
-
Joyce Anon I don't have any updates at this time, we're still looking into it.
-
Thank you, I will also keep looking.
-
I am getting the same error in GATK v4.5.0.0. Has this error been resolved?
-
They never got back to me.
-
Hello, I'm sorry this has been outstanding for so long. It's definitely a bug. There's an open issue here (https://github.com/broadinstitute/gatk/issues/7865). It looks like it took a while for test data to be available and we lost track of it during the interim.
A complete guess is that there may be an issue related to quoting the fields in the VCF that's causing parsing to go awry, but we'll have to look into it more.
Our primary funcotator developer is out with a baby so it might take us a bit longer to find the answer.
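For anyone who wants to locate the records behind the "Num values: 31 Num keys: 53" message in the meantime, the following Python sketch may help. It rests on several assumptions that are not confirmed in this thread: that the FUNCOTATION header line lists the key names after the text "Funcotation fields are:", separated by "|"; that each funcotation in a record is wrapped in square brackets; and that values inside the brackets are separated by "|". The helper names are hypothetical, and the naive split here does not reproduce GATK's own parser, so treat any hits as a starting point rather than a diagnosis.

```python
import re

FIELDS_MARKER = "Funcotation fields are:"


def header_keys(header_line):
    """Extract the pipe-delimited key names from the FUNCOTATION header line."""
    text = header_line.split(FIELDS_MARKER, 1)[1]
    text = text.strip().rstrip(">").rstrip('"')
    return [key.strip() for key in text.split("|")]


def funcotation_blocks(info):
    """Return the bracketed funcotation strings found in an INFO column."""
    prefix = "FUNCOTATION="
    for entry in info.split(";"):
        if entry.startswith(prefix):
            return re.findall(r"\[([^\]]*)\]", entry[len(prefix):])
    return []


def find_mismatches(lines):
    """Yield (chrom, pos, n_values, n_keys) for each funcotation block
    whose number of values differs from the number of header keys."""
    keys = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO=<ID=FUNCOTATION,") and FIELDS_MARKER in line:
            keys = header_keys(line)
        elif line and not line.startswith("#") and keys is not None:
            cols = line.split("\t")
            if len(cols) < 8:
                continue  # not a complete VCF record
            for block in funcotation_blocks(cols[7]):
                n_values = len(block.split("|"))
                if n_values != len(keys):
                    yield cols[0], cols[1], n_values, len(keys)
```

Calling find_mismatches on an open handle to the Funcotator output VCF yields one tuple per suspicious funcotation, which could also be used to cut the file down to a small snippet for a bug report.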
-
That's okay, I diagnosed my disease almost a year ago, it's extremely rare and probably due to novel variants, it didn't show on any other DNA analyses either. Thank you for the update.