Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GenomicsDBImport ERROR: java.lang.NumberFormatException: For input string

Answered

8 comments

  • Genevieve Brandt (she/her)

    Hello! 

    Could you re-run ValidateVariants without the option --validation-type-to-exclude ALL and post your output here?

    Thank you,

    Genevieve

  • @Anna

    Hi Genevieve,

    With:

    gatk ValidateVariants -V A811.g.vcf --dbsnp dbsnp_138.hg19.vcf

    LOG:

    18:53:39.469 INFO ValidateVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13
    18:53:39.470 INFO ValidateVariants - Start Date/Time: April 15, 2021 6:53:39 PM CEST
    18:53:39.470 INFO ValidateVariants - ------------------------------------------------------------
    18:53:39.470 INFO ValidateVariants - ------------------------------------------------------------
    18:53:39.470 INFO ValidateVariants - HTSJDK Version: 2.21.2
    18:53:39.470 INFO ValidateVariants - Picard Version: 2.21.9
    18:53:39.470 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    18:53:39.471 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    18:53:39.471 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    18:53:39.471 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    18:53:39.471 INFO ValidateVariants - Deflater: IntelDeflater
    18:53:39.471 INFO ValidateVariants - Inflater: IntelInflater
    18:53:39.471 INFO ValidateVariants - GCS max retries/reopens: 20
    18:53:39.471 INFO ValidateVariants - Requester pays: disabled
    18:53:39.471 INFO ValidateVariants - Initializing engine
    18:53:39.774 INFO FeatureManager - Using codec VCFCodec to read file file:///.../dbsnp_138.hg19.vcf
    18:53:40.210 INFO FeatureManager - Using codec VCFCodec to read file file:///.../A881.g.vcf
    18:53:40.235 INFO ValidateVariants - Done initializing engine
    18:53:40.244 WARN ValidateVariants - REF validation cannot be done because no reference file was provided
    18:53:40.244 WARN ValidateVariants - Other possible validations will still be performed
    18:53:40.245 INFO ProgressMeter - Starting traversal
    18:53:40.245 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    18:53:40.612 INFO ValidateVariants - Shutting down engine
    [April 15, 2021 6:53:40 PM CEST] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 0.03 minutes.
    Runtime.totalMemory()=2039545856
    ***********************************************************************

    A USER ERROR has occurred: Input A881.g.vcf fails strict validation of type ALL: one or more of the ALT allele(s) for the record at position chr1:14677 are not observed at all in the sample genotypes

    ***********************************************************************
    Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.

     

    With:
    gatk ValidateVariants -V A811.g.vcf -R ucsc.hg19.fasta -gvcf

    LOG:
    18:56:53.540 INFO ValidateVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13
    18:56:53.540 INFO ValidateVariants - Start Date/Time: April 15, 2021 6:56:53 PM CEST
    18:56:53.540 INFO ValidateVariants - ------------------------------------------------------------
    18:56:53.541 INFO ValidateVariants - ------------------------------------------------------------
    18:56:53.541 INFO ValidateVariants - HTSJDK Version: 2.21.2
    18:56:53.541 INFO ValidateVariants - Picard Version: 2.21.9
    18:56:53.541 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    18:56:53.541 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    18:56:53.541 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    18:56:53.541 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    18:56:53.541 INFO ValidateVariants - Deflater: IntelDeflater
    18:56:53.541 INFO ValidateVariants - Inflater: IntelInflater
    18:56:53.541 INFO ValidateVariants - GCS max retries/reopens: 20
    18:56:53.541 INFO ValidateVariants - Requester pays: disabled
    18:56:53.541 INFO ValidateVariants - Initializing engine
    18:56:53.974 INFO FeatureManager - Using codec VCFCodec to read file file:///.../A881.g.vcf
    18:56:54.013 INFO ValidateVariants - Done initializing engine
    18:56:54.024 WARN ValidateVariants - GVCF format is currently incompatible with allele validation. Not validating Alleles.
    18:56:54.025 WARN ValidateVariants - IDS validation cannot be done because no DBSNP file was provided
    18:56:54.025 WARN ValidateVariants - Other possible validations will still be performed
    18:56:54.025 INFO ProgressMeter - Starting traversal
    18:56:54.025 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    18:57:04.027 INFO ProgressMeter - chr1:195907996 0.2 2204000 13222677.7
    18:57:14.027 INFO ProgressMeter - chr3:178627344 0.3 6914000 20739926.0
    18:57:24.028 INFO ProgressMeter - chr6:149284209 0.5 11584000 23165683.4
    18:57:34.028 INFO ProgressMeter - chr10:30208167 0.7 16304000 24454165.9
    18:57:44.031 INFO ProgressMeter - chr13:97079150 0.8 21130000 25352957.6
    18:57:54.032 INFO ProgressMeter - chr18:72056946 1.0 26448000 26445355.5
    18:58:02.871 INFO ProgressMeter - chrUn_gl000243:35822 1.1 30907514 26936217.6
    18:58:02.871 INFO ProgressMeter - Traversal complete. Processed 30907514 total variants in 1.1 minutes.
    18:58:02.874 INFO ValidateVariants - Shutting down engine
    [April 15, 2021 6:58:02 PM CEST] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 1.16 minutes.
    Runtime.totalMemory()=2039545856

  • Genevieve Brandt (she/her)

    Ok, could you share the complete stack trace from the GenomicsDBImport run with the option --java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true'?

    Here is how you use java options: https://gatk.broadinstitute.org/hc/en-us/articles/360035531892-GATK4-command-line-syntax
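
For reference, a minimal sketch of where the flag goes on the command line. The wrapper name `run_import_with_trace` and its three arguments are hypothetical, since the original GenomicsDBImport command is not shown in this thread; only the `--java-options` value is taken from the error message above.

```shell
# Hypothetical wrapper: runs GenomicsDBImport with stack traces enabled
# for user errors.
#   $1 = input GVCF, $2 = GenomicsDB workspace path, $3 = interval
run_import_with_trace() {
    # --java-options comes right after "gatk" and before the tool name.
    gatk --java-options "-DGATK_STACKTRACE_ON_USER_EXCEPTION=true" \
        GenomicsDBImport \
        -V "$1" \
        --genomicsdb-workspace-path "$2" \
        -L "$3"
}
```

It could be called as `run_import_with_trace sample.g.vcf.gz my_database chr1`, where all three values are placeholders for your own inputs.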

  • @Anna

    A USER ERROR has occurred: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"

    ***********************************************************************
    org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.lambda$getFeatureReadersInParallel$2(GenomicsDBImport.java:626)
    at java.util.LinkedHashMap.forEach(LinkedHashMap.java:684)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.getFeatureReadersInParallel(GenomicsDBImport.java:621)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.createSampleToReaderMap(GenomicsDBImport.java:509)
    at com.intel.genomicsdb.importer.GenomicsDBImporter.lambda$null$2(GenomicsDBImporter.java:560)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.lambda$getFeatureReadersInParallel$2(GenomicsDBImport.java:623)
    ... 8 more
    Caused by: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at htsjdk.tribble.readers.TabixReader.getIntv(TabixReader.java:337)
    at htsjdk.tribble.readers.TabixReader.access$500(TabixReader.java:48)
    at htsjdk.tribble.readers.TabixReader$IteratorImpl.next(TabixReader.java:438)
    at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:46)
    at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:170)
    at htsjdk.tribble.TabixFeatureReader$FeatureIterator.<init>(TabixFeatureReader.java:159)
    at htsjdk.tribble.TabixFeatureReader.query(TabixFeatureReader.java:133)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$1.query(GenomicsDBImport.java:696)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$InitializedQueryWrapper.<init>(GenomicsDBImport.java:821)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$InitializedQueryWrapper.<init>(GenomicsDBImport.java:813)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.lambda$getFeatureReadersInParallel$1(GenomicsDBImport.java:614)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 more
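
The string in the exception, "GT:DP:GQ:MIN_DP:PL", is a VCF FORMAT field that reached an integer parser inside `TabixReader`. One way to narrow this down is to confirm that the records themselves have a numeric POS column; if they do, the index rather than the data becomes the likelier suspect. A rough sketch, where `find_bad_pos` is a hypothetical helper name and tab-delimited VCF text on standard input is assumed:

```shell
# Hypothetical helper: reads VCF text on stdin and reports every
# non-header line whose POS field (column 2) is not a plain integer.
find_bad_pos() {
    awk -F'\t' '!/^#/ && $2 !~ /^[0-9]+$/ { printf "line %d: POS=%s\n", NR, $2 }'
}
```

For a bgzipped file, something like `gzip -dc sample.g.vcf.gz | find_bad_pos` (the file name is a placeholder) prints nothing when every record has an integer POS.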

  • @Anna

    I noticed that this could be due to errors in the tabix index files. I am re-indexing the files and re-running GenomicsDBImport.
    I'll let you know whether this was indeed the solution.

  • Genevieve Brandt (she/her)

    Hi @Anna,

    Did re-indexing the files end up fixing the problem? If not, please send the complete stack trace, including all the lines before the user error:

    A USER ERROR has occurred: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"

    ***********************************************************************
    org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: Failure while waiting for FeatureReader to initialize with exception: java.lang.NumberFormatException: For input string: "GT:DP:GQ:MIN_DP:PL"

    Thank you,

    Genevieve

  • @Anna

    Hi Genevieve,

    Yes, the problem was indeed with the tabix file. I ran it again after re-indexing and it gave me no errors.

    Thank you,

    Anna
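
A sketch of the re-indexing step that resolved this thread, assuming bgzip-compressed GVCFs with `.tbi` indexes next to them and `tabix` on the PATH. The helper names and the timestamp check are illustrative assumptions: the thread does not say why the index was bad, and an index older than its VCF is only one common cause.

```shell
# Succeeds (exit 0) if the .tbi index is missing or older than the VCF.
# This is only a heuristic; a fresh-looking index can still be wrong.
index_is_stale() {
    [ ! -e "$1.tbi" ] || [ "$1.tbi" -ot "$1" ]
}

# Rebuild the tabix index for each stale file given as an argument.
reindex_if_stale() {
    for vcf in "$@"; do
        if index_is_stale "$vcf"; then
            echo "re-indexing $vcf"
            # -f overwrites an existing index; -p vcf selects the VCF preset.
            tabix -f -p vcf "$vcf" || return 1
        fi
    done
}
```

Usage would look like `reindex_if_stale *.g.vcf.gz`. To rebuild an index regardless of timestamps, run `tabix -f -p vcf` on the file directly; GATK's IndexFeatureFile tool is an alternative.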

  • Genevieve Brandt (she/her)

    Great, thanks for the update!
