Funcotator data sources v1.7: gencode raises an error
(REQUIRED) For us to most effectively understand your question, please fill in your responses below:
Hi, I tried to run Funcotator on my VCF file using the provided v1.7 data sources, but I ran into an error with the gencode hg19 file that I don't know how to fix, because the error message is not clear to me. Is the provided gencode file incorrect, so that I should replace it with another file, or is there another problem? I hope somebody can help. I have included the error message below and marked in bold the lines that I think are the most important.
Thanks in advance for looking into it. If there are any questions about the error, please let me know.
a) I am using GATK version 4.1.7.0
b) The specific command I used is:
gatk Funcotator \
--variant $inputfile \
--reference $refhg19 \
--ref-version hg19 \
--data-sources-path $dbPATH \
--output $output \
--output-file-format VCF
c) Please post error log here
11:39:11.273 INFO Funcotator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
11:39:11.273 INFO Funcotator - Start Date/Time: August 7, 2020 11:39:10 AM CEST
11:39:11.273 INFO Funcotator - ------------------------------------------------------------
11:39:11.273 INFO Funcotator - ------------------------------------------------------------
11:39:11.274 INFO Funcotator - HTSJDK Version: 2.21.2
11:39:11.274 INFO Funcotator - Picard Version: 2.21.9
11:39:11.274 INFO Funcotator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
11:39:11.274 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
11:39:11.274 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
11:39:11.274 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
11:39:11.274 INFO Funcotator - Deflater: IntelDeflater
11:39:11.274 INFO Funcotator - Inflater: IntelInflater
11:39:11.274 INFO Funcotator - GCS max retries/reopens: 20
11:39:11.274 INFO Funcotator - Requester pays: disabled
11:39:11.274 INFO Funcotator - Initializing engine
...
resolved all datasource files
GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 28) (given: 34): ##description: evidence-based annotation of the human genome (GRCh38), version 34 (Ensembl 100), mapped to GRCh37 with gencode-backmap Continuing, but errors may occur.
11:39:15.609 WARN GencodeGtfCodec - GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 28) (given: 34): ##description: evidence-based annotation of the human genome (GRCh38), version 34 (Ensembl 100), mapped to GRCh37 with gencode-backmap Continuing, but errors may occur.
11:39:15.610 INFO FeatureManager - Using codec GencodeGtfCodec to read file file:///
db/funcotator_dataSources.v1.7.20200521s/gencode/hg19/gencode.v34lift37.annotation.REORDERED.gtf
11:39:15.631 WARN GencodeGtfCodec - GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 28) (given: 34): ##description: evidence-based annotation of the human genome (GRCh38), version 34 (Ensembl 100), mapped to GRCh37 with gencode-backmap Continuing, but errors may occur.
11:39:22.190 INFO Funcotator - Initializing Funcotator Engine...
11:39:22.220 INFO Funcotator - Creating a VCF file for output: file:funcotated.vcf
11:39:22.292 INFO ProgressMeter - Starting traversal
11:39:22.293 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
11:39:22.356 INFO VcfFuncotationFactory - ClinVar_VCF 20180401 cache hits/total: 0/0
11:39:22.364 INFO VcfFuncotationFactory - dbSNP 9606_b151 cache hits/total: 0/0
11:39:22.395 INFO Funcotator - Shutting down engine
[August 7, 2020 11:39:22 AM CEST] org.broadinstitute.hellbender.tools.funcotator.Funcotator done. Elapsed time: 0.19 minutes.
Runtime.totalMemory()=3196059648
htsjdk.tribble.TribbleException$MalformedFeatureFile: Error parsing line: LineIteratorImpl(SynchronousLineReader), for input source: file:///MY/FILE/PATH/funcotator_dataSources.v1.7.20200521s/gencode/hg19/gencode.v34lift37.annotation.REORDERED.gtf
at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.readNextRecord(TribbleIndexedFeatureReader.java:510)
at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.<init>(TribbleIndexedFeatureReader.java:426)
at htsjdk.tribble.TribbleIndexedFeatureReader.query(TribbleIndexedFeatureReader.java:297)
at org.broadinstitute.hellbender.engine.FeatureDataSource.refillQueryCache(FeatureDataSource.java:567)
at org.broadinstitute.hellbender.engine.FeatureDataSource.queryAndPrefetch(FeatureDataSource.java:536)
at org.broadinstitute.hellbender.engine.FeatureManager.getFeatures(FeatureManager.java:353)
at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:173)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.queryFeaturesFromFeatureContext(DataSourceFuncotationFactory.java:304)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.getFeaturesFromFeatureContext(DataSourceFuncotationFactory.java:219)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:197)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:172)
at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.lambda$createFuncotationMapForVariant$0(FuncotatorEngine.java:147)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.createFuncotationMapForVariant(FuncotatorEngine.java:157)
at org.broadinstitute.hellbender.tools.funcotator.Funcotator.enqueueAndHandleVariant(Funcotator.java:903)
at org.broadinstitute.hellbender.tools.funcotator.Funcotator.apply(Funcotator.java:857)
at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
Caused by: java.lang.NumberFormatException: For input string: "chr1:-:586071-586358"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.valueOf(Long.java:803)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature.<init>(GencodeGtfFeature.java:224)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfExonFeature.<init>(GencodeGtfExonFeature.java:19)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfExonFeature.create(GencodeGtfExonFeature.java:23)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature$FeatureType$4.create(GencodeGtfFeature.java:777)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature.create(GencodeGtfFeature.java:320)
at org.broadinstitute.hellbender.utils.codecs.gtf.AbstractGtfCodec.decode(AbstractGtfCodec.java:138)
at org.broadinstitute.hellbender.utils.codecs.gtf.AbstractGtfCodec.decode(AbstractGtfCodec.java:23)
at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.readNextRecord(TribbleIndexedFeatureReader.java:486)
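The "Caused by" line above shows that the string "chr1:-:586071-586358" reached Long.valueOf, so a field that the codec expected to be numeric held a location string instead. The trace does not say which GTF attribute that was. One way to find out is to scan the attribute column of the GTF for that exact value. The following is a minimal Python sketch, not part of GATK: the function name is made up for illustration, and the attribute parsing assumes the usual GTF layout of nine tab-separated columns with `key "value";` pairs in the last one.

```python
import re
from collections import Counter

# Matches GTF attribute pairs of the form: key "value"; or key value;
ATTRIBUTE = re.compile(r'\s*(\S+)\s+"?([^";]*)"?\s*;')


def attributes_containing(gtf_lines, needle):
    """Count, per attribute key, the records whose value equals `needle`.

    `gtf_lines` is any iterable of text lines, e.g. an open file handle.
    This helper is hypothetical and only illustrates the check.
    """
    hits = Counter()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a complete GTF record
        for key, value in ATTRIBUTE.findall(columns[8]):
            if value == needle:
                hits[key] += 1
    return hits
```

To use it, open the GTF named in the error message and pass the file handle together with the string from the exception, for example `attributes_containing(open(path_to_gtf), "chr1:-:586071-586358")`, where `path_to_gtf` is your own path. The keys it reports are the attributes to compare against what the installed GATK version can parse.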
-
Official comment
Tjitske - I responded to your GitHub post, but in case you didn't see it:
@Tjitskedv Typically we ask people to make a new issue in GitHub or create a forum post, rather than commenting on open pull requests.
To answer your question directly - do not use v1.7 datasources with GATK 4.1.7.X or 4.1.8.X. Those data sources are not yet compatible with the Funcotator code in the Master branch. When we release GATK 4.1.9.0 they will work with Funcotator. This is not a bug in the code - it is an artifact of how we have to do data source releases.
See issue #6708 for all the gory details.
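The rule in this reply can be written down as a small version check to run before starting a job. This is only a sketch in Python: the function and constant names are made up for illustration, and the 4.1.9.0 threshold comes from the statement above about the planned release, not from any GATK API.

```python
def parse_version(text):
    """Turn a dotted version such as '4.1.7.0' into (4, 1, 7, 0)."""
    return tuple(int(part) for part in text.split("."))


# Assumption taken from the reply above: the v1.7 data sources are
# expected to work starting with GATK 4.1.9.0.
MIN_GATK_FOR_V1_7 = parse_version("4.1.9.0")


def v1_7_datasources_supported(gatk_version):
    """Return True if this GATK version is expected to read v1.7 data sources."""
    return parse_version(gatk_version) >= MIN_GATK_FOR_V1_7
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "4.1.10.0" sorting before "4.1.9.0".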
-
Hi Tjitske de Vries, it looks like there is an issue with compatibility between your files. Where did you get your GTF file? Also, are any of your data sources using hg38 instead of hg19? This will lead to errors.
-
Hi Genevieve Brandt, thank you for your response. I downloaded the GTF file as part of the bundle funcotator_dataSources.v1.7.20200521s.tar.gz from https://gatk.broadinstitute.org/hc/en-us/articles/360042912011-Funcotator
All my data sources are hg19. I did not make any new data sources; I only use those from the provided bundle.
I used a Mutect2 VCF file that was filtered by FilterMutectCalls, and I use the exact same reference file for all steps.
Is there something that I can test, check, or change to fix the compatibility? (Removing the gencode data source is not an option, since it is the only one that is required.) Do I need another version?
I hope you can help.
Update: I downloaded the v1.6 data source bundle, which contains the GENCODE v19 file. I used that gencode file instead of the one from v1.7 (I copied it into the v1.7 folder and did not change any other file), and now it works. So I think the gencode-backmap file is not really compatible with Funcotator for hg19. Was this tested, or is this problem specific to my files? It seems to be a more general problem, so it may be good to check and update the files in the data source bundle.
Kind regards, Tjitske
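For anyone who wants to repeat this workaround, here is a rough Python sketch of the swap. It is not an official procedure. It assumes both bundles use the `<bundle>/gencode/hg19/` layout visible in the log path above (not verified here for the v1.6 bundle). Unlike the single-file copy described in the update, it replaces the whole gencode/hg19 directory so that the GTF and the files that accompany it stay together. The function name and the backup location are made up for illustration.

```python
import shutil
from pathlib import Path


def use_older_gencode(v1_6_root, v1_7_root, build="hg19"):
    """Replace <v1_7_root>/gencode/<build> with the copy from v1_6_root.

    The original v1.7 directory is moved next to the v1.7 bundle (outside
    the data sources tree) so it can be restored later. Hypothetical helper.
    """
    root = Path(v1_7_root).resolve()
    source = Path(v1_6_root).resolve() / "gencode" / build
    target = root / "gencode" / build
    backup = root.with_name(f"{root.name}.gencode_{build}_backup")

    if not source.is_dir():
        raise FileNotFoundError(f"no gencode directory at {source}")
    if backup.exists():
        raise FileExistsError(f"backup already exists at {backup}")

    if target.exists():
        shutil.move(str(target), str(backup))  # keep the v1.7 files
    shutil.copytree(str(source), str(target))
    return backup
```

Call it with the two unpacked bundle directories, for example `use_older_gencode(path_to_v1_6, path_to_v1_7)`, where both paths are your own. Moving the backup directory back into place undoes the change.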