Provider "genedb" not found when using SelectVariants & GenomicsDBImport
Hi,
I’m following the instructions for joint variant calling here: https://gatk.broadinstitute.org/hc/en-us/articles/360035889971 ((How to) Consolidate GVCFs for joint calling with GenotypeGVCFs)
I’m running GATK v4.1.5.0. I first create the GenomicsDB workspace with GenomicsDBImport, then read it back with SelectVariants:
/home/lucas_pkuhpc/lustre2/src/gatk-4/gatk GenomicsDBImport --genomicsdb-workspace-path my_database_all --intervals all_chr_intervals.list -V bam_temp/ERR107690.g.vcf.gz -V bam_temp/ERR107691.g.vcf.gz …. [200 more .g.vcf.gz files]
/home/lucas_pkuhpc/lustre2/src/gatk-4/gatk SelectVariants -R /home/lucas_pkuhpc/lustre2/LABDATA/2019__MicroHomologyMediatedIndels__XiangweHe_ZhejiangU//genome/Schizosaccharomyces_pombe.ASM294v2.dna.toplevel.fa -V genedb://my_database_all -O PombeGenomesAll_geneDB.vcf.gz
The SelectVariants step fails with the error message:
java.nio.file.ProviderNotFoundException: Provider "genedb" not found
Setting TILEDB_DISABLE_FILE_LOCKING=1 didn’t solve the problem.
…
However, if, instead of using GenomicsDBImport, I use CombineGVCFs (-O cohort.g.vcf.gz), then IndexFeatureFile, GenotypeGVCFs works fine.
--software versions--
java:
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
GATK:
The Genome Analysis Toolkit (GATK) v4.1.5.0
HTSJDK Version: 2.21.2
Picard Version: 2.21.9
-- error log below --
cat gat095219_2571903.err
Using GATK jar /lustre2/lucas_pkuhpc/src/gatk-4/gatk-package-4.1.5.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /lustre2/lucas_pkuhpc/src/gatk-4/gatk-package-4.1.5.0-local.jar SelectVariants -R /home/lucas_pkuhpc/lustre2/LABDATA/2019__MicroHomologyMediatedIndels__XiangweHe_ZhejiangU//genome/Schizosaccharomyces_pombe.ASM294v2.dna.toplevel.fa -V genedb://my_database_all -O PombeGenomesAll_geneDB.vcf.gz
09:52:45.105 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/lustre2/lucas_pkuhpc/src/gatk-4/gatk-package-4.1.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Mar 12, 2020 9:52:45 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
09:52:45.501 INFO SelectVariants - ------------------------------------------------------------
09:52:45.514 INFO SelectVariants - The Genome Analysis Toolkit (GATK) v4.1.5.0
09:52:45.514 INFO SelectVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
09:52:45.515 INFO SelectVariants - Executing as lucas_pkuhpc@c07b02n06 on Linux v3.10.0-957.el7.x86_64 amd64
09:52:45.515 INFO SelectVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_181-b13
09:52:45.515 INFO SelectVariants - Start Date/Time: March 12, 2020 9:52:45 AM CST
09:52:45.515 INFO SelectVariants - ------------------------------------------------------------
09:52:45.515 INFO SelectVariants - ------------------------------------------------------------
09:52:45.516 INFO SelectVariants - HTSJDK Version: 2.21.2
09:52:45.516 INFO SelectVariants - Picard Version: 2.21.9
09:52:45.516 INFO SelectVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
09:52:45.516 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
09:52:45.517 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
09:52:45.517 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
09:52:45.517 INFO SelectVariants - Deflater: IntelDeflater
09:52:45.517 INFO SelectVariants - Inflater: IntelInflater
09:52:45.517 INFO SelectVariants - GCS max retries/reopens: 20
09:52:45.517 INFO SelectVariants - Requester pays: disabled
09:52:45.518 INFO SelectVariants - Initializing engine
09:52:46.108 INFO SelectVariants - Shutting down engine
[March 12, 2020 9:52:46 AM CST] org.broadinstitute.hellbender.tools.walkers.variantutils.SelectVariants done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=2076049408
java.nio.file.ProviderNotFoundException: Provider "genedb" not found
at java.nio.file.FileSystems.newFileSystem(FileSystems.java:341)
at org.broadinstitute.hellbender.engine.GATKPathSpecifier.toPath(GATKPathSpecifier.java:57)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getCodecForFeatureInput(FeatureDataSource.java:348)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:330)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:281)
at org.broadinstitute.hellbender.engine.VariantWalker.initializeDrivingVariants(VariantWalker.java:58)
at org.broadinstitute.hellbender.engine.VariantWalkerBase.initializeFeatures(VariantWalkerBase.java:67)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:706)
at org.broadinstitute.hellbender.engine.VariantWalker.onStartup(VariantWalker.java:45)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
pkurun: error: c07b02n06: task 0: Exited with exit code 3
-
Hi Lucas Carey
You see this error because the URI scheme in the -V argument should be `gendb`, not `genedb`. If you change `genedb://my_database_all` to `gendb://my_database_all` in your command line, the issue should be fixed.
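For anyone hitting the same typo, a small pre-flight check along these lines could catch it before GATK starts. This is only a sketch: `check_gendb_uri` is a hypothetical helper (not part of GATK), it only covers local `gendb://` workspaces, and the reference path is shortened to a bare file name for readability.

```shell
#!/bin/sh
# Hypothetical helper, not part of GATK: succeed only if the value
# passed to -V starts with the gendb:// scheme.
check_gendb_uri() {
  case "$1" in
    gendb://*) return 0 ;;
    *) echo "expected a gendb:// URI, got: $1" >&2; return 1 ;;
  esac
}

workspace_uri="gendb://my_database_all"

if check_gendb_uri "$workspace_uri"; then
  # Dry run: print the corrected command. Remove 'echo' to actually run it.
  # The reference file name is shortened here (assumption for readability).
  echo gatk SelectVariants \
    -R Schizosaccharomyces_pombe.ASM294v2.dna.toplevel.fa \
    -V "$workspace_uri" \
    -O PombeGenomesAll_geneDB.vcf.gz
fi
```

With the misspelled `genedb://my_database_all`, the check prints a message to stderr and returns non-zero instead of letting the run fail inside Java.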
-
I'm getting a similar error. I'm trying to generate the panel of normals VCF to use Mutect2 for variant calling of tumors with matched normals following instructions here. I don't exactly know what input SelectVariants is looking for, but the GenomicsDB workspace path is not working.
GATK v4.1.4.1
Inputs:
TILEDB_DISABLE_FILE_LOCKING=1
gatk GenomicsDBImport -R /mnt/data/rbueno/ref_genomes/human/BWA-MEM/b37_decoys/hs37d5.fa \
-L /mnt/data/rbueno/ref_genomes/human/hs37d5_interval_files/Homo_sapiens_assembly19.interval.list \
--genomicsdb-workspace-path /PHShome/mi313/example_familial/output/pon_db \
-V /PHShome/mi313/example_familial/output/pon/HITS622847/HITS622847_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622851/HITS622851_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622855/HITS622855_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622858/HITS622858_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622862/HITS622862_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622866/HITS622866_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622870/HITS622870_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622875/HITS622875_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622849/HITS622849_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622856/HITS622856_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622860/HITS622860_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622864/HITS622864_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622868/HITS622868_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622872/HITS622872_normal.vcf.gz \
-V /PHShome/mi313/example_familial/output/pon/HITS622877/HITS622877_normal.vcf.gz
gatk SelectVariants \
-R /mnt/data/rbueno/ref_genomes/human/BWA-MEM/b37_decoys/hs37d5.fa \
-V gendb://PHShome/mi313/example_familial/output/pon_db \
-O /PHShome/mi313/example_familial/output/pon/pon.vcf
Error:
Using GATK jar /PHShome/mi313/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /PHShome/mi313/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar SelectVariants -R /mnt/data/rbueno/ref_genomes/human/BWA-MEM/b37_decoys/hs37d5.fa -V gendb://PHShome/mi313/example_familial/output/pon_db -O /PHShome/mi313/example_familial/output/pon/pon.vcf
16:15:44.209 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/PHShome/mi313/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
Mar 26, 2020 4:15:44 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
16:15:44.688 INFO SelectVariants - ------------------------------------------------------------
16:15:44.689 INFO SelectVariants - The Genome Analysis Toolkit (GATK) v4.1.4.1
16:15:44.689 INFO SelectVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
16:15:44.689 INFO SelectVariants - Executing as mi313@cmu013.research.partners.org on Linux v2.6.32-431.29.2.el6.x86_64 amd64
16:15:44.689 INFO SelectVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_121-b31
16:15:44.689 INFO SelectVariants - Start Date/Time: March 26, 2020 4:15:44 PM EDT
16:15:44.689 INFO SelectVariants - ------------------------------------------------------------
16:15:44.689 INFO SelectVariants - ------------------------------------------------------------
16:15:44.690 INFO SelectVariants - HTSJDK Version: 2.21.0
16:15:44.690 INFO SelectVariants - Picard Version: 2.21.2
16:15:44.690 INFO SelectVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:15:44.690 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:15:44.690 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:15:44.690 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:15:44.690 INFO SelectVariants - Deflater: IntelDeflater
16:15:44.690 INFO SelectVariants - Inflater: IntelInflater
16:15:44.690 INFO SelectVariants - GCS max retries/reopens: 20
16:15:44.691 INFO SelectVariants - Requester pays: disabled
16:15:44.691 INFO SelectVariants - Initializing engine
16:15:45.144 INFO SelectVariants - Shutting down engine
[March 26, 2020 4:15:45 PM EDT] org.broadinstitute.hellbender.tools.walkers.variantutils.SelectVariants done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=606601216
***********************************************************************
A USER ERROR has occurred: GenomicsDB workspace drivingVariantFile:gendb:///PHShome/mi313/example_familial/scripts/PHShome/mi313/example_familial/output/pon_db does not exist
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
-
Hi Mark Godek
Can you please check whether the GenomicsDB workspace is readable and writable? We have seen this issue before.
-
ls -lha is giving me
drwx------. 26 mi313 groupname 4.0K Mar 26 15:11 pon_db
so I believe it is read, write, and execute accessible.
-
Let me look into this and get back to you.
-
I also tried the steps from CreateSomaticPanelOfNormals
Input:
gatk CreateSomaticPanelOfNormals \
-R /mnt/data/rbueno/ref_genomes/human/BWA-MEM/b37_decoys/hs37d5.fa \
-V /PHShome/mi313/example_familial/output/pon_db \
-O pon.vcf.gz
Output:
Using GATK jar /PHShome/mi313/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /PHShome/mi313/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar CreateSomaticPanelOfNormals -R /mnt/data/rbueno/ref_genomes/human/BWA-MEM/b37_decoys/hs37d5.fa -V /PHShome/mi313/example_familial/output/pon_db -O pon.vcf.gz
13:28:05.816 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/PHShome/mi313/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
Mar 30, 2020 1:28:06 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
13:28:06.389 INFO CreateSomaticPanelOfNormals - ------------------------------------------------------------
13:28:06.389 INFO CreateSomaticPanelOfNormals - The Genome Analysis Toolkit (GATK) v4.1.4.1
13:28:06.389 INFO CreateSomaticPanelOfNormals - For support and documentation go to https://software.broadinstitute.org/gatk/
13:28:06.389 INFO CreateSomaticPanelOfNormals - Executing as mi313@cmu014.research.partners.org on Linux v2.6.32-431.29.2.el6.x86_64 amd64
13:28:06.390 INFO CreateSomaticPanelOfNormals - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_121-b31
13:28:06.390 INFO CreateSomaticPanelOfNormals - Start Date/Time: March 30, 2020 1:28:05 PM EDT
13:28:06.390 INFO CreateSomaticPanelOfNormals - ------------------------------------------------------------
13:28:06.390 INFO CreateSomaticPanelOfNormals - ------------------------------------------------------------
13:28:06.390 INFO CreateSomaticPanelOfNormals - HTSJDK Version: 2.21.0
13:28:06.390 INFO CreateSomaticPanelOfNormals - Picard Version: 2.21.2
13:28:06.390 INFO CreateSomaticPanelOfNormals - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:28:06.390 INFO CreateSomaticPanelOfNormals - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:28:06.391 INFO CreateSomaticPanelOfNormals - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:28:06.391 INFO CreateSomaticPanelOfNormals - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:28:06.391 INFO CreateSomaticPanelOfNormals - Deflater: IntelDeflater
13:28:06.391 INFO CreateSomaticPanelOfNormals - Inflater: IntelInflater
13:28:06.391 INFO CreateSomaticPanelOfNormals - GCS max retries/reopens: 20
13:28:06.391 INFO CreateSomaticPanelOfNormals - Requester pays: disabled
13:28:06.391 WARN CreateSomaticPanelOfNormals -
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Warning: CreateSomaticPanelOfNormals is a BETA tool and is not yet ready for use in production
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
13:28:06.391 INFO CreateSomaticPanelOfNormals - Initializing engine
13:28:07.964 INFO CreateSomaticPanelOfNormals - Shutting down engine
[March 30, 2020 1:28:07 PM EDT] org.broadinstitute.hellbender.tools.walkers.mutect.CreateSomaticPanelOfNormals done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=903348224
***********************************************************************
A USER ERROR has occurred: Couldn't read file file:///PHShome/mi313/example_familial/output/pon_db/. Error was: It isn't a regular file
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
-
Adding the gendb:// prefix to the -V argument let CreateSomaticPanelOfNormals recognize the GenomicsDB workspace. I will try the same format with SelectVariants.
gatk CreateSomaticPanelOfNormals -R /mnt/data/rbueno/ref_genomes/human/BWA-MEM/b37_decoys/hs37d5.fa -V gendb:///PHShome/mi313/example_familial/output/pon_db -O pon.vcf.gz
Using GATK jar /PHShome/mi313/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar
-
OK, SelectVariants works with
-V gendb:///PHShome/mi313/example_familial/output/pon_db
as well. The syntax with all the forward slashes was throwing me off. I thought I tried all permutations before asking, but I guess not.
Thanks.
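For later readers, here is why the slash count matters, as far as the error above suggests: the workspace path comes after `gendb://`, and a path without a leading slash appears to be resolved against the current working directory (hence the `.../scripts/PHShome/...` path in the error message). An absolute path already starts with `/`, so the result has three slashes. A minimal sketch, where `gendb_uri` is a hypothetical helper and not a GATK command:

```shell
#!/bin/sh
# Hypothetical helper, not part of GATK: build a gendb:// URI from a
# workspace path, turning relative paths into absolute ones first.
gendb_uri() {
  case "$1" in
    /*) printf 'gendb://%s\n' "$1" ;;            # absolute path: gendb:// + /path
    *)  printf 'gendb://%s/%s\n' "$PWD" "$1" ;;  # relative path: anchor at $PWD
  esac
}

gendb_uri /PHShome/mi313/example_familial/output/pon_db
# prints: gendb:///PHShome/mi313/example_familial/output/pon_db
```

The output can be passed straight to the tool, for example `-V "$(gendb_uri /PHShome/mi313/example_familial/output/pon_db)"`.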
-
Hi Mark Godek
So it's working now? I was looking into it too, and one suggestion that came up was to export the TILEDB_DISABLE_FILE_LOCKING environment variable:
export TILEDB_DISABLE_FILE_LOCKING=1
before running the GenomicsDBImport command. This helps make the database readable if you are using NFS.
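One thing worth checking if the setting seems to have no effect: a plain `VAR=1` on its own line only sets a shell variable, and child processes such as `gatk` do not see it unless it is exported. The sketch below illustrates that general shell behaviour; `child_sees` is a hypothetical helper, and this is not a claim about what happened in the runs above.

```shell
#!/bin/sh
# Hypothetical helper: report what a child process sees for the variable.
child_sees() {
  sh -c 'printf "%s\n" "${TILEDB_DISABLE_FILE_LOCKING:-unset}"'
}

# Plain assignment on its own line: not passed to child processes.
unset TILEDB_DISABLE_FILE_LOCKING
TILEDB_DISABLE_FILE_LOCKING=1
without_export=$(child_sees)

# Exported: inherited by every command started afterwards.
export TILEDB_DISABLE_FILE_LOCKING=1
with_export=$(child_sees)

echo "plain assignment: child sees $without_export"   # unset
echo "after export:     child sees $with_export"      # 1
```

Prefixing a single command (`TILEDB_DISABLE_FILE_LOCKING=1 gatk GenomicsDBImport ...` on one line) also passes the variable to that one command.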
-
I also got this error.
The way to make it work is to throw three slashes after gendb:, like "-V gendb:///home/your db path".
-
Thanks for posting this solution LG!