GenomicsDBImport ERROR: java.lang.NumberFormatException: For input string
Hi there,
I had to redo the variant calling for two of my samples, and now I am trying to rebuild my GenomicsDB workspace in order to replace these two samples.
My script:
module load gatk/4.0.10.0
gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport --genomicsdb-workspace-path gDB5b --batch-size 215 --sample-name-map allgvcflist.txt --reader-threads 4 -L chr5
LOG:
A USER ERROR has occurred: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "0/0:28:72:28:0,72,942"
I ran ValidateVariants and it didn't report any errors:
gatk ValidateVariants -R /.../ucsc.hg19.fasta -V /.../A881.g.vcf --dbsnp dbsnp_138.hg19.vcf --validation-type-to-exclude ALL
Why is it giving this error and how can I solve it?
Thank you for any help
-
Hello!
Could you re-run ValidateVariants without the option --validation-type-to-exclude ALL and post your output here?
Thank you,
Genevieve
-
Hi Genevieve,
With:
gatk ValidateVariants -V A811.g.vcf --dbsnp dbsnp_138.hg19.vcf
LOG:
18:53:39.469 INFO ValidateVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13
18:53:39.470 INFO ValidateVariants - Start Date/Time: April 15, 2021 6:53:39 PM CEST
18:53:39.470 INFO ValidateVariants - ------------------------------------------------------------
18:53:39.470 INFO ValidateVariants - ------------------------------------------------------------
18:53:39.470 INFO ValidateVariants - HTSJDK Version: 2.21.2
18:53:39.470 INFO ValidateVariants - Picard Version: 2.21.9
18:53:39.470 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
18:53:39.471 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
18:53:39.471 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
18:53:39.471 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
18:53:39.471 INFO ValidateVariants - Deflater: IntelDeflater
18:53:39.471 INFO ValidateVariants - Inflater: IntelInflater
18:53:39.471 INFO ValidateVariants - GCS max retries/reopens: 20
18:53:39.471 INFO ValidateVariants - Requester pays: disabled
18:53:39.471 INFO ValidateVariants - Initializing engine
18:53:39.774 INFO FeatureManager - Using codec VCFCodec to read file file:///.../dbsnp_138.hg19.vcf
18:53:40.210 INFO FeatureManager - Using codec VCFCodec to read file file:///.../A881.g.vcf
18:53:40.235 INFO ValidateVariants - Done initializing engine
18:53:40.244 WARN ValidateVariants - REF validation cannot be done because no reference file was provided
18:53:40.244 WARN ValidateVariants - Other possible validations will still be performed
18:53:40.245 INFO ProgressMeter - Starting traversal
18:53:40.245 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
18:53:40.612 INFO ValidateVariants - Shutting down engine
[April 15, 2021 6:53:40 PM CEST] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=2039545856
***********************************************************************
A USER ERROR has occurred: Input A881.g.vcf fails strict validation of type ALL: one or more of the ALT allele(s) for the record at position chr1:14677 are not observed at all in the sample genotypes
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
With:
gatk ValidateVariants -V A811.g.vcf -R ucsc.hg19.fasta -gvcf
LOG:
18:56:53.540 INFO ValidateVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13
18:56:53.540 INFO ValidateVariants - Start Date/Time: April 15, 2021 6:56:53 PM CEST
18:56:53.540 INFO ValidateVariants - ------------------------------------------------------------
18:56:53.541 INFO ValidateVariants - ------------------------------------------------------------
18:56:53.541 INFO ValidateVariants - HTSJDK Version: 2.21.2
18:56:53.541 INFO ValidateVariants - Picard Version: 2.21.9
18:56:53.541 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
18:56:53.541 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
18:56:53.541 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
18:56:53.541 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
18:56:53.541 INFO ValidateVariants - Deflater: IntelDeflater
18:56:53.541 INFO ValidateVariants - Inflater: IntelInflater
18:56:53.541 INFO ValidateVariants - GCS max retries/reopens: 20
18:56:53.541 INFO ValidateVariants - Requester pays: disabled
18:56:53.541 INFO ValidateVariants - Initializing engine
18:56:53.974 INFO FeatureManager - Using codec VCFCodec to read file file:///.../A881.g.vcf
18:56:54.013 INFO ValidateVariants - Done initializing engine
18:56:54.024 WARN ValidateVariants - GVCF format is currently incompatible with allele validation. Not validating Alleles.
18:56:54.025 WARN ValidateVariants - IDS validation cannot be done because no DBSNP file was provided
18:56:54.025 WARN ValidateVariants - Other possible validations will still be performed
18:56:54.025 INFO ProgressMeter - Starting traversal
18:56:54.025 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
18:57:04.027 INFO ProgressMeter - chr1:195907996 0.2 2204000 13222677.7
18:57:14.027 INFO ProgressMeter - chr3:178627344 0.3 6914000 20739926.0
18:57:24.028 INFO ProgressMeter - chr6:149284209 0.5 11584000 23165683.4
18:57:34.028 INFO ProgressMeter - chr10:30208167 0.7 16304000 24454165.9
18:57:44.031 INFO ProgressMeter - chr13:97079150 0.8 21130000 25352957.6
18:57:54.032 INFO ProgressMeter - chr18:72056946 1.0 26448000 26445355.5
18:58:02.871 INFO ProgressMeter - chrUn_gl000243:35822 1.1 30907514 26936217.6
18:58:02.871 INFO ProgressMeter - Traversal complete. Processed 30907514 total variants in 1.1 minutes.
18:58:02.874 INFO ValidateVariants - Shutting down engine
[April 15, 2021 6:58:02 PM CEST] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 1.16 minutes.
Runtime.totalMemory()=2039545856
-
Ok, could you share the complete stack trace from the GenomicsDBImport run with the option --java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true'?
Here is how you use java options: https://gatk.broadinstitute.org/hc/en-us/articles/360035531892-GATK4-command-line-syntax
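For reference, a minimal sketch of that option applied to the GenomicsDBImport command from the original post. The memory settings and tool arguments are copied from that command; the variable name JAVA_OPTS and the wrapper function run_import are illustrative choices only. The JVM flags go together in one quoted --java-options string.

```shell
# Memory settings from the original command plus the stack-trace property,
# all in a single --java-options string.
JAVA_OPTS="-Xmx4g -Xms4g -DGATK_STACKTRACE_ON_USER_EXCEPTION=true"

# Hypothetical wrapper around the command from the original post.
run_import() {
  gatk --java-options "$JAVA_OPTS" GenomicsDBImport \
    --genomicsdb-workspace-path gDB5b \
    --batch-size 215 \
    --sample-name-map allgvcflist.txt \
    --reader-threads 4 \
    -L chr5
}

# Usage: run_import
```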
-
A USER ERROR has occurred: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
***********************************************************************
org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.lambda$getFeatureReadersInParallel$2(GenomicsDBImport.java:626)
at java.util.LinkedHashMap.forEach(LinkedHashMap.java:684)
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.getFeatureReadersInParallel(GenomicsDBImport.java:621)
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.createSampleToReaderMap(GenomicsDBImport.java:509)
at com.intel.genomicsdb.importer.GenomicsDBImporter.lambda$null$2(GenomicsDBImporter.java:560)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.lambda$getFeatureReadersInParallel$2(GenomicsDBImport.java:623)
... 8 more
Caused by: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at htsjdk.tribble.readers.TabixReader.getIntv(TabixReader.java:337)
at htsjdk.tribble.readers.TabixReader.access$500(TabixReader.java:48)
at htsjdk.tribble.readers.TabixReader$IteratorImpl.next(TabixReader.java:438)
at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:46)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:170)
at htsjdk.tribble.TabixFeatureReader$FeatureIterator.<init>(TabixFeatureReader.java:159)
at htsjdk.tribble.TabixFeatureReader.query(TabixFeatureReader.java:133)
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$1.query(GenomicsDBImport.java:696)
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$InitializedQueryWrapper.<init>(GenomicsDBImport.java:821)
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$InitializedQueryWrapper.<init>(GenomicsDBImport.java:813)
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.lambda$getFeatureReadersInParallel$1(GenomicsDBImport.java:614)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
-
I noticed that this could be due to errors in the tabix index files. I am re-indexing the files with tabix and re-running GenomicsDBImport.
I'll let you know if this was indeed the solution.
-
Hi @Anna,
Did this end up fixing the problem? If not, please send the complete stack trace, including all the lines before the user error:
A USER ERROR has occurred: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
***********************************************************************
org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
Thank you,
Genevieve
-
Hi Genevieve,
Yes, the problem was indeed with the tabix file. I ran it again after re-indexing and it gave me no errors.
Thank you,
Anna
-
Great, thanks for the update!
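Note on the fix: the stack trace above fails inside htsjdk's TabixReader while parsing a number, and the strings it chokes on are FORMAT and sample columns. That is consistent with an index that no longer matches its gVCF, for example when a sample is re-called and the old index is left in place. The thread does not confirm the exact cause, so treat that reading as an assumption. A minimal sketch for spotting such files before an import follows. It assumes a tab-separated sample map (sample name, then gVCF path) and an index stored next to each gVCF as <path>.tbi; the function name find_stale_indexes is hypothetical, and modification time is only a heuristic.

```shell
# Report gVCFs listed in a sample map whose tabix index is missing or
# older than the gVCF itself (a modification-time heuristic only).
# Assumed layout: sample<TAB>path per line, index at <path>.tbi.
find_stale_indexes() {
  map="$1"
  while IFS="$(printf '\t')" read -r sample gvcf rest || [ -n "$sample" ]; do
    # Skip blank or malformed lines that carry no path.
    [ -n "$gvcf" ] || continue
    idx="$gvcf.tbi"
    if [ ! -e "$idx" ]; then
      printf 'missing\t%s\t%s\n' "$sample" "$gvcf"
    elif [ "$gvcf" -nt "$idx" ]; then
      printf 'stale\t%s\t%s\n' "$sample" "$gvcf"
    fi
  done < "$map"
}

# Usage: find_stale_indexes allgvcflist.txt
```

Each reported file can then be re-indexed before re-running GenomicsDBImport, for example with `tabix -f -p vcf <path>` for a bgzip-compressed gVCF.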