No Output from GenomicsDBImport
Dear GATK Team,
I am using GATK version 4.1.9.0 for my WES data pipeline. To get accurate somatic calls, I am trying to generate a Panel of Normals (PoN) using the GenomicsDBImport module of GATK. However, GenomicsDBImport is not producing any output from my command. Here is the command I used, run with -DGATK_STACKTRACE_ON_USER_EXCEPTION=true so the full stack trace is printed:
gatk GenomicsDBImport \
-R /gatk_bundle/hg19_v0_Homo_sapiens_assembly19.fasta \
--variant normal1.vcf \
--variant normal2.vcf \
--variant normal3.vcf \
--variant normal4.vcf \
--variant normal5.vcf \
--variant normal6.vcf \
--variant normal7.vcf \
--variant normal8.vcf \
--variant normal9.vcf \
--variant normal10.vcf \
--variant normal11.vcf \
--variant normal12.vcf \
--variant normal13.vcf \
--variant normal14.vcf \
--variant normal15.vcf \
--variant normal16.vcf \
--variant normal17.vcf \
....
--variant normal80.vcf \
--genomicsdb-workspace-path pon_db \
--tmp-dir /tmp1 \
-L /gatk_bundle/hglft_genome_3bc14_d6f440.bed \
--sequence-dictionary /gatk_bundle/hg19_v0_Homo_sapiens_assembly19.dict \
--reader-threads 15 \
--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true'
For the interval list, I downloaded the hg38 target intervals from the GATK resource bundle and converted them to hg19 using the UCSC liftOver utility. GenomicsDBImport does not report any error for the command, but it also does not produce any results. Here are the details from the GenomicsDBImport log file:
17:16:16.069 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/akansha/vivekruhela/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jan 12, 2021 5:16:16 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
17:16:16.329 INFO GenomicsDBImport - ------------------------------------------------------------
17:16:16.329 INFO GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.1.9.0
17:16:16.329 INFO GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
17:16:16.330 INFO GenomicsDBImport - Executing as akansha@sbilab on Linux v4.4.0-169-generic amd64
17:16:16.330 INFO GenomicsDBImport - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_265-8u265-b01-0ubuntu2~16.04-b01
17:16:16.330 INFO GenomicsDBImport - Start Date/Time: January 12, 2021 5:16:16 PM IST
17:16:16.330 INFO GenomicsDBImport - ------------------------------------------------------------
17:16:16.330 INFO GenomicsDBImport - ------------------------------------------------------------
17:16:16.331 INFO GenomicsDBImport - HTSJDK Version: 2.23.0
17:16:16.331 INFO GenomicsDBImport - Picard Version: 2.23.3
17:16:16.331 INFO GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
17:16:16.331 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
17:16:16.331 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
17:16:16.331 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
17:16:16.331 INFO GenomicsDBImport - Deflater: IntelDeflater
17:16:16.331 INFO GenomicsDBImport - Inflater: IntelInflater
17:16:16.331 INFO GenomicsDBImport - GCS max retries/reopens: 20
17:16:16.331 INFO GenomicsDBImport - Requester pays: disabled
17:16:16.331 INFO GenomicsDBImport - Initializing engine
17:16:20.910 INFO FeatureManager - Using codec BEDCodec to read file file:///home/akansha/vivekruhela/gatk_bundle/hglft_genome_3bc14_d6f440.bed
17:16:20.921 INFO IntervalArgumentCollection - Processing 0 bp from intervals
17:16:20.980 INFO GenomicsDBImport - Done initializing engine
17:16:21.553 INFO GenomicsDBLibLoader - GenomicsDB native library version : 1.3.2-e18fa63
17:16:21.554 INFO GenomicsDBImport - Vid Map JSON file will be written to /home/akansha/vivekruhela/pon_db/vidmap.json
17:16:21.554 INFO GenomicsDBImport - Callset Map JSON file will be written to /home/akansha/vivekruhela/pon_db/callset.json
17:16:21.554 INFO GenomicsDBImport - Complete VCF Header will be written to /home/akansha/vivekruhela/pon_db/vcfheader.vcf
17:16:21.554 INFO GenomicsDBImport - Importing to workspace - /home/akansha/vivekruhela/pon_db
17:16:21.554 WARN GenomicsDBImport - GenomicsDBImport cannot use multiple VCF reader threads for initialization when the number of intervals is greater than 1. Falling back to serial VCF reader initialization.
17:16:21.554 INFO ProgressMeter - Starting traversal
17:16:21.554 INFO ProgressMeter - Current Locus Elapsed Minutes Batches Processed Batches/Minute
17:16:21.590 INFO GenomicsDBImport - Shutting down engine
[January 12, 2021 5:16:21 PM IST] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 0.09 minutes.
Runtime.totalMemory()=2761949184
java.lang.IndexOutOfBoundsException: Index: 0
at java.util.Collections$EmptyList.get(Collections.java:4456)
at org.genomicsdb.model.GenomicsDBImportConfiguration$ImportConfiguration.getColumnPartitions(GenomicsDBImportConfiguration.java:2083)
at org.genomicsdb.importer.GenomicsDBImporter.<init>(GenomicsDBImporter.java:203)
at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.traverse(GenomicsDBImport.java:745)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1049)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
The apparent error is an IndexOutOfBoundsException, which is not clear to me. Kindly suggest.
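In case it is relevant, the log above also reports "Processing 0 bp from intervals", so perhaps the lifted-over BED does not overlap the reference contigs at all. A minimal check along these lines (paths as in my command above) should show whether the contig names in the BED and in the sequence dictionary agree:

# Contig names used by the lifted-over BED
cut -f1 /gatk_bundle/hglft_genome_3bc14_d6f440.bed | sort -u

# Contig names declared in the sequence dictionary (@SQ SN: fields)
grep '^@SQ' /gatk_bundle/hg19_v0_Homo_sapiens_assembly19.dict | cut -f2 | sed 's/^SN://' | sort -u

If one file uses chr-prefixed names (chr1, chr2, ...) and the other does not (1, 2, ...), none of the intervals would match.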
I am rewriting my previous post as suggested by Genevieve Brandt (she/her). I am sorry for the poor formatting of my previous post; hopefully I have written all the details properly here.
-
Hello vivekruhela,
Thanks for reposting! I have created a ticket on our GitHub repository because I was not able to get to the bottom of this issue. You can see it here and follow along with the developers' discussion. I'll try to keep you informed when there is more information.
Genevieve
-
Hi vivekruhela,
Thank you for joining the discussion on GitHub to help us find the issue quickly! When doing the liftover, you will want to make sure that all of the contig naming conventions are consistent. I would recommend re-doing the liftover rather than just changing the chr prefix, so you don't run into any other issues.
We don't maintain the UCSC files, so I am not sure which one is best to use.
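If it helps, the liftover can also be redone with the Picard tools bundled in GATK, so that the output interval list carries the header of the same dictionary you use for the import. This is only a rough sketch: the hg38 BED, hg38 dictionary, and chain file names below are illustrative, and the chain's target contig names must match the dictionary given to -SD.

# Convert the original hg38 targets to a Picard interval list (header taken from the hg38 dictionary)
gatk BedToIntervalList \
    -I targets_hg38.bed \
    -O targets_hg38.interval_list \
    -SD Homo_sapiens_assembly38.dict

# Lift the intervals over to hg19; the -SD dictionary defines the contig names in the output
gatk LiftOverIntervalList \
    -I targets_hg38.interval_list \
    -O targets_hg19.interval_list \
    -SD /gatk_bundle/hg19_v0_Homo_sapiens_assembly19.dict \
    --CHAIN hg38ToHg19.over.chain

You can then pass the lifted interval list to GenomicsDBImport with -L in place of the BED; a run that picks up the intervals correctly should report a non-zero "Processing ... bp from intervals" line rather than 0 bp.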
Glad that we were able to figure out the error message, let us know if you need anything else!
Genevieve