Why do I get 'java.lang.IllegalArgumentException: Dictionary cannot have size zero' when using GetPileupSummaries?
Hi! I am using GATK4 following the tutorial (How to) Call somatic mutations using GATK4 Mutect2 – GATK (broadinstitute.org) to detect somatic variants. I received an error when running GetPileupSummaries. The command line I used is:
gatk GetPileupSummaries -I /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf -L /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -O /gatk/my_data/wgs_BAM/step1_3/getpileupsummaries_LP6005115-DNA_B07.table
The entire error log has been pasted below. May I know what might cause this problem? Thanks for your help!
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar GetPileupSummaries -I /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf -L /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -O /gatk/my_data/wgs_BAM/step1_3/getpileupsummaries_LP6005115-DNA_B07.table
01:03:32.752 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 12, 2021 1:03:32 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
01:03:32.953 INFO GetPileupSummaries - ------------------------------------------------------------
01:03:32.954 INFO GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.2.0.0
01:03:32.954 INFO GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
01:03:32.954 INFO GetPileupSummaries - Executing as root@a2e87404023d on Linux v5.8.0-1039-azure amd64
01:03:32.954 INFO GetPileupSummaries - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
01:03:32.954 INFO GetPileupSummaries - Start Date/Time: September 12, 2021 1:03:32 AM GMT
01:03:32.955 INFO GetPileupSummaries - ------------------------------------------------------------
01:03:32.955 INFO GetPileupSummaries - ------------------------------------------------------------
01:03:32.955 INFO GetPileupSummaries - HTSJDK Version: 2.24.0
01:03:32.955 INFO GetPileupSummaries - Picard Version: 2.25.0
01:03:32.955 INFO GetPileupSummaries - Built for Spark Version: 2.4.5
01:03:32.955 INFO GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
01:03:32.955 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
01:03:32.955 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
01:03:32.956 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
01:03:32.956 INFO GetPileupSummaries - Deflater: IntelDeflater
01:03:32.956 INFO GetPileupSummaries - Inflater: IntelInflater
01:03:32.956 INFO GetPileupSummaries - GCS max retries/reopens: 20
01:03:32.956 INFO GetPileupSummaries - Requester pays: disabled
01:03:32.956 INFO GetPileupSummaries - Initializing engine
01:03:33.330 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
01:03:33.395 INFO GetPileupSummaries - Shutting down engine
[September 12, 2021 1:03:33 AM GMT] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=462946304
java.lang.IllegalArgumentException: Dictionary cannot have size zero
at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:798)
at org.broadinstitute.hellbender.utils.MRUCachingSAMSequenceDictionary.<init>(MRUCachingSAMSequenceDictionary.java:35)
at org.broadinstitute.hellbender.utils.GenomeLocParser.<init>(GenomeLocParser.java:78)
at org.broadinstitute.hellbender.utils.GenomeLocParser.<init>(GenomeLocParser.java:62)
at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.getTraversalParameters(IntervalArgumentCollection.java:180)
at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.getIntervals(IntervalArgumentCollection.java:111)
at org.broadinstitute.hellbender.engine.GATKTool.initializeIntervals(GATKTool.java:514)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:709)
at org.broadinstitute.hellbender.engine.LocusWalker.onStartup(LocusWalker.java:136)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
-
Hi Ruiqiao Bai,
Could you check that your VCF is not malformed or empty? You can use the GATK tool ValidateVariants, and you can also re-download the file to rule out a problem during the download.
Best,
Genevieve
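
As a quick complement to ValidateVariants, the sketch below counts the header lines and variant records in a plain or gzip/bgzip-compressed VCF, which is one way to confirm a file is not empty. The function name `summarize_vcf` and the returned keys are hypothetical, chosen only for illustration; this is not a GATK API and does not replace ValidateVariants.

```python
import gzip


def summarize_vcf(vcf_path):
    """Count meta lines, the #CHROM column header, and records in a VCF.

    Hypothetical helper for illustration only. Files ending in .gz are
    opened with gzip, which also reads bgzip-compressed files.
    """
    opener = gzip.open if vcf_path.endswith(".gz") else open
    meta_lines = 0
    has_column_header = False
    records = 0
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                meta_lines += 1          # e.g. ##fileformat, ##INFO, ##contig
            elif line.startswith("#CHROM"):
                has_column_header = True  # the single column-header line
            elif line.strip():
                records += 1             # any non-blank line after the header
    return {
        "meta_lines": meta_lines,
        "has_column_header": has_column_header,
        "records": records,
    }
```

Calling `summarize_vcf` on the path of the VCF in question returns a small dictionary; a `records` value of 0 or a missing column header would suggest the file is empty or truncated.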
-
Dear Genevieve,
Thanks for your reply! Neither of the VCF files I used is empty, and I have run ValidateVariants on both of them.
For unfiltered_LP6005115-DNA_B07.vcf, the command I used is:
gatk ValidateVariants -R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa -V /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf
The result is:
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar ValidateVariants -R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa -V /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf
01:01:45.092 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 14, 2021 1:01:45 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
01:01:45.282 INFO ValidateVariants - ------------------------------------------------------------
01:01:45.283 INFO ValidateVariants - The Genome Analysis Toolkit (GATK) v4.2.0.0
01:01:45.283 INFO ValidateVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
01:01:45.283 INFO ValidateVariants - Executing as root@6aaddf39e225 on Linux v5.8.0-1039-azure amd64
01:01:45.283 INFO ValidateVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
01:01:45.283 INFO ValidateVariants - Start Date/Time: September 14, 2021 1:01:45 AM GMT
01:01:45.283 INFO ValidateVariants - ------------------------------------------------------------
01:01:45.283 INFO ValidateVariants - ------------------------------------------------------------
01:01:45.284 INFO ValidateVariants - HTSJDK Version: 2.24.0
01:01:45.284 INFO ValidateVariants - Picard Version: 2.25.0
01:01:45.284 INFO ValidateVariants - Built for Spark Version: 2.4.5
01:01:45.284 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
01:01:45.284 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
01:01:45.284 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
01:01:45.285 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
01:01:45.285 INFO ValidateVariants - Deflater: IntelDeflater
01:01:45.285 INFO ValidateVariants - Inflater: IntelInflater
01:01:45.285 INFO ValidateVariants - GCS max retries/reopens: 20
01:01:45.285 INFO ValidateVariants - Requester pays: disabled
01:01:45.285 INFO ValidateVariants - Initializing engine
01:01:45.631 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf
01:01:45.939 INFO ValidateVariants - Done initializing engine
01:01:45.940 WARN ValidateVariants - IDS validation cannot be done because no DBSNP file was provided
01:01:45.940 WARN ValidateVariants - Other possible validations will still be performed
01:01:45.940 INFO ProgressMeter - Starting traversal
01:01:45.940 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
01:01:55.982 INFO ProgressMeter - chr3:38970624 0.2 323000 1930086.6
01:02:05.988 INFO ProgressMeter - chr6:11481309 0.3 648000 1939345.6
01:02:15.989 INFO ProgressMeter - chr15:20002206 0.5 1453000 2901261.3
01:02:20.698 INFO ProgressMeter - chrM:2354 0.6 1915074 3305936.6
01:02:20.698 INFO ProgressMeter - Traversal complete. Processed 1915074 total variants in 0.6 minutes.
01:02:20.698 INFO ValidateVariants - Shutting down engine
[September 14, 2021 1:02:20 AM GMT] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 0.59 minutes.
Runtime.totalMemory()=970981376
For lifted_small_exac_common_3.hg19.vcf.gz, the command I used is:
gatk ValidateVariants -R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
The result is:
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar ValidateVariants -R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
01:05:05.304 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 14, 2021 1:05:05 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
01:05:05.506 INFO ValidateVariants - ------------------------------------------------------------
01:05:05.507 INFO ValidateVariants - The Genome Analysis Toolkit (GATK) v4.2.0.0
01:05:05.507 INFO ValidateVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
01:05:05.507 INFO ValidateVariants - Executing as root@6aaddf39e225 on Linux v5.8.0-1039-azure amd64
01:05:05.507 INFO ValidateVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
01:05:05.507 INFO ValidateVariants - Start Date/Time: September 14, 2021 1:05:05 AM GMT
01:05:05.507 INFO ValidateVariants - ------------------------------------------------------------
01:05:05.507 INFO ValidateVariants - ------------------------------------------------------------
01:05:05.508 INFO ValidateVariants - HTSJDK Version: 2.24.0
01:05:05.508 INFO ValidateVariants - Picard Version: 2.25.0
01:05:05.508 INFO ValidateVariants - Built for Spark Version: 2.4.5
01:05:05.508 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
01:05:05.508 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
01:05:05.508 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
01:05:05.508 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
01:05:05.508 INFO ValidateVariants - Deflater: IntelDeflater
01:05:05.508 INFO ValidateVariants - Inflater: IntelInflater
01:05:05.509 INFO ValidateVariants - GCS max retries/reopens: 20
01:05:05.509 INFO ValidateVariants - Requester pays: disabled
01:05:05.509 INFO ValidateVariants - Initializing engine
01:05:05.864 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
01:05:05.980 INFO ValidateVariants - Done initializing engine
01:05:05.980 WARN ValidateVariants - IDS validation cannot be done because no DBSNP file was provided
01:05:05.980 WARN ValidateVariants - Other possible validations will still be performed
01:05:05.980 INFO ProgressMeter - Starting traversal
01:05:05.981 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
01:05:12.526 INFO ProgressMeter - chr21:45710681 0.1 59232 543080.7
01:05:12.526 INFO ProgressMeter - Traversal complete. Processed 59232 total variants in 0.1 minutes.
01:05:12.526 INFO ValidateVariants - Shutting down engine
[September 14, 2021 1:05:12 AM GMT] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 0.12 minutes.
Runtime.totalMemory()=749207552
I think these results mean there is no problem with the VCF files? Besides, unfiltered_LP6005115-DNA_B07.vcf was generated by running Mutect2, and lifted_small_exac_common_3.hg19.vcf.gz was generated by running LiftoverVcf. I am not sure whether this information helps.
-
One more piece of information I would like to add: before running Mutect2, I used AddOrReplaceReadGroups to resolve the error 'MISSING_READ_GROUP' in the input to Mutect2. Since I don't have the original values for some required fields of AddOrReplaceReadGroups (see http://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups), namely 'Read Group library', 'Read Group platform unit' and 'Read Group sample name', I used 'unknown', 'unknown' and 'LP6005115-DNA_B07' for them, respectively. Could this affect how GetPileupSummaries behaves? Thanks for your help!
-
My apologies! I didn't specify which file flagged the error message. Could you perform these checks with the gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz file?
Thank you!
-
Dear Genevieve,
Sure, for lifted_small_exac_common_3.hg19.vcf.gz, the command I used is:
gatk ValidateVariants -R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
The result is:
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar ValidateVariants -R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
01:05:05.304 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 14, 2021 1:05:05 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
01:05:05.506 INFO ValidateVariants - ------------------------------------------------------------
01:05:05.507 INFO ValidateVariants - The Genome Analysis Toolkit (GATK) v4.2.0.0
01:05:05.507 INFO ValidateVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
01:05:05.507 INFO ValidateVariants - Executing as root@6aaddf39e225 on Linux v5.8.0-1039-azure amd64
01:05:05.507 INFO ValidateVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
01:05:05.507 INFO ValidateVariants - Start Date/Time: September 14, 2021 1:05:05 AM GMT
01:05:05.507 INFO ValidateVariants - ------------------------------------------------------------
01:05:05.507 INFO ValidateVariants - ------------------------------------------------------------
01:05:05.508 INFO ValidateVariants - HTSJDK Version: 2.24.0
01:05:05.508 INFO ValidateVariants - Picard Version: 2.25.0
01:05:05.508 INFO ValidateVariants - Built for Spark Version: 2.4.5
01:05:05.508 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
01:05:05.508 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
01:05:05.508 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
01:05:05.508 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
01:05:05.508 INFO ValidateVariants - Deflater: IntelDeflater
01:05:05.508 INFO ValidateVariants - Inflater: IntelInflater
01:05:05.509 INFO ValidateVariants - GCS max retries/reopens: 20
01:05:05.509 INFO ValidateVariants - Requester pays: disabled
01:05:05.509 INFO ValidateVariants - Initializing engine
01:05:05.864 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
01:05:05.980 INFO ValidateVariants - Done initializing engine
01:05:05.980 WARN ValidateVariants - IDS validation cannot be done because no DBSNP file was provided
01:05:05.980 WARN ValidateVariants - Other possible validations will still be performed
01:05:05.980 INFO ProgressMeter - Starting traversal
01:05:05.981 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
01:05:12.526 INFO ProgressMeter - chr21:45710681 0.1 59232 543080.7
01:05:12.526 INFO ProgressMeter - Traversal complete. Processed 59232 total variants in 0.1 minutes.
01:05:12.526 INFO ValidateVariants - Shutting down engine
[September 14, 2021 1:05:12 AM GMT] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 0.12 minutes.
Runtime.totalMemory()=749207552
-
Thanks Ruiqiao Bai! Could you run UpdateVCFSequenceDictionary to make sure that the dictionary of the lifted_small_exac_common_3.hg19.vcf.gz file is up to date?
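
A VCF's sequence dictionary is carried by the `##contig` lines in its header, so one way to see whether the header has any is to list them directly. The sketch below is a minimal illustration, not a GATK API: the function name `contig_ids` is hypothetical, and it assumes each contig line has the form `##contig=<ID=...,...>` with no commas inside the ID value.

```python
import gzip
import re

# Matches a header line such as: ##contig=<ID=chr21,length=...>
CONTIG_LINE = re.compile(r"^##contig=<(.*)>\s*$")


def contig_ids(vcf_path):
    """Return the IDs declared in the ##contig header lines of a VCF.

    Hypothetical helper for illustration only. Reads plain text or
    gzip/bgzip-compressed files and stops at the end of the meta header.
    """
    opener = gzip.open if vcf_path.endswith(".gz") else open
    ids = []
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # reached #CHROM or the first record
            match = CONTIG_LINE.match(line)
            if match:
                # Assumption: the ID value contains no commas.
                id_match = re.search(r"(?:^|,)ID=([^,]+)", match.group(1))
                if id_match:
                    ids.append(id_match.group(1))
    return ids
```

An empty list returned for a file would mean its header declares no contigs, which is the situation UpdateVCFSequenceDictionary is meant to address. A non-empty list would suggest looking at the other inputs on the command line instead.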
-
Sure. Thanks for your help! I ran UpdateVCFSequenceDictionary with two different commands to produce two updated files, but neither of them resolved the error.
Specifically, the first command is:
gatk UpdateVCFSequenceDictionary \
-V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz \
-R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa \
--output /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updated_small_exac_common_3.hg19.vcf.gz \
--replace true
And the result is:
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar UpdateVCFSequenceDictionary -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa --output /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updated_small_exac_common_3.hg19.vcf.gz --replace true
02:47:49.005 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 16, 2021 2:47:49 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
02:47:49.283 INFO UpdateVCFSequenceDictionary - ------------------------------------------------------------
02:47:49.283 INFO UpdateVCFSequenceDictionary - The Genome Analysis Toolkit (GATK) v4.2.0.0
02:47:49.283 INFO UpdateVCFSequenceDictionary - For support and documentation go to https://software.broadinstitute.org/gatk/
02:47:49.284 INFO UpdateVCFSequenceDictionary - Executing as root@66ef95f83362 on Linux v5.8.0-1039-azure amd64
02:47:49.284 INFO UpdateVCFSequenceDictionary - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
02:47:49.284 INFO UpdateVCFSequenceDictionary - Start Date/Time: September 16, 2021 2:47:48 AM GMT
02:47:49.284 INFO UpdateVCFSequenceDictionary - ------------------------------------------------------------
02:47:49.284 INFO UpdateVCFSequenceDictionary - ------------------------------------------------------------
02:47:49.285 INFO UpdateVCFSequenceDictionary - HTSJDK Version: 2.24.0
02:47:49.285 INFO UpdateVCFSequenceDictionary - Picard Version: 2.25.0
02:47:49.285 INFO UpdateVCFSequenceDictionary - Built for Spark Version: 2.4.5
02:47:49.285 INFO UpdateVCFSequenceDictionary - HTSJDK Defaults.COMPRESSION_LEVEL : 2
02:47:49.285 INFO UpdateVCFSequenceDictionary - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
02:47:49.285 INFO UpdateVCFSequenceDictionary - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
02:47:49.285 INFO UpdateVCFSequenceDictionary - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
02:47:49.290 INFO UpdateVCFSequenceDictionary - Deflater: IntelDeflater
02:47:49.290 INFO UpdateVCFSequenceDictionary - Inflater: IntelInflater
02:47:49.290 INFO UpdateVCFSequenceDictionary - GCS max retries/reopens: 20
02:47:49.290 INFO UpdateVCFSequenceDictionary - Requester pays: disabled
02:47:49.290 INFO UpdateVCFSequenceDictionary - Initializing engine
02:47:49.754 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
02:47:49.886 INFO UpdateVCFSequenceDictionary - Done initializing engine
02:47:49.941 INFO ProgressMeter - Starting traversal
02:47:49.941 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
02:47:51.559 INFO ProgressMeter - chr21:45710681 0.0 59232 2197847.9
02:47:51.559 INFO ProgressMeter - Traversal complete. Processed 59232 total variants in 0.0 minutes.
02:47:51.665 INFO UpdateVCFSequenceDictionary - Shutting down engine
[September 16, 2021 2:47:51 AM GMT] org.broadinstitute.hellbender.tools.walkers.variantutils.UpdateVCFSequenceDictionary done. Elapsed time: 0.05 minutes.
Runtime.totalMemory()=467140608
Then I tried GetPileupSummaries using the generated file (i.e. updated_small_exac_common_3.hg19.vcf.gz):
gatk GetPileupSummaries -I /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf -L /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updated_small_exac_common_3.hg19.vcf.gz -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updated_small_exac_common_3.hg19.vcf.gz -O /gatk/my_data/wgs_BAM/step1_3/getpileupsummaries_LP6005115-DNA_B07.table
However, it seems that I still get the same error:
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar GetPileupSummaries -I /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf -L /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updated_small_exac_common_3.hg19.vcf.gz -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updated_small_exac_common_3.hg19.vcf.gz -O /gatk/my_data/wgs_BAM/step1_3/getpileupsummaries_LP6005115-DNA_B07.table
02:49:30.637 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 16, 2021 2:49:30 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
02:49:30.924 INFO GetPileupSummaries - ------------------------------------------------------------
02:49:30.925 INFO GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.2.0.0
02:49:30.925 INFO GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
02:49:30.925 INFO GetPileupSummaries - Executing as root@66ef95f83362 on Linux v5.8.0-1039-azure amd64
02:49:30.925 INFO GetPileupSummaries - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
02:49:30.926 INFO GetPileupSummaries - Start Date/Time: September 16, 2021 2:49:30 AM GMT
02:49:30.926 INFO GetPileupSummaries - ------------------------------------------------------------
02:49:30.927 INFO GetPileupSummaries - ------------------------------------------------------------
02:49:30.928 INFO GetPileupSummaries - HTSJDK Version: 2.24.0
02:49:30.929 INFO GetPileupSummaries - Picard Version: 2.25.0
02:49:30.929 INFO GetPileupSummaries - Built for Spark Version: 2.4.5
02:49:30.930 INFO GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
02:49:30.930 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
02:49:30.930 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
02:49:30.931 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
02:49:30.933 INFO GetPileupSummaries - Deflater: IntelDeflater
02:49:30.934 INFO GetPileupSummaries - Inflater: IntelInflater
02:49:30.934 INFO GetPileupSummaries - GCS max retries/reopens: 20
02:49:30.934 INFO GetPileupSummaries - Requester pays: disabled
02:49:30.935 INFO GetPileupSummaries - Initializing engine
02:49:31.389 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updated_small_exac_common_3.hg19.vcf.gz
02:49:31.514 INFO GetPileupSummaries - Shutting down engine
[September 16, 2021 2:49:31 AM GMT] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=503840768
java.lang.IllegalArgumentException: Dictionary cannot have size zero
at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:798)
at org.broadinstitute.hellbender.utils.MRUCachingSAMSequenceDictionary.<init>(MRUCachingSAMSequenceDictionary.java:35)
at org.broadinstitute.hellbender.utils.GenomeLocParser.<init>(GenomeLocParser.java:78)
at org.broadinstitute.hellbender.utils.GenomeLocParser.<init>(GenomeLocParser.java:62)
at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.getTraversalParameters(IntervalArgumentCollection.java:180)
at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.getIntervals(IntervalArgumentCollection.java:111)
at org.broadinstitute.hellbender.engine.GATKTool.initializeIntervals(GATKTool.java:514)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:709)
at org.broadinstitute.hellbender.engine.LocusWalker.onStartup(LocusWalker.java:136)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Besides, I have used another command for UpdateVCFSequenceDictionary:
gatk UpdateVCFSequenceDictionary \
-V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz \
--source-dictionary /gatk/my_data/wgs_BAM/addOrReplaceReadGroups/addOrReplaceReadGroups_LP6005115-DNA_B07.bam \
--output /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updatedDict_small_exac_common_3.hg19.vcf.gz \
--replace true
The result is:
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar UpdateVCFSequenceDictionary -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz --source-dictionary /gatk/my_data/wgs_BAM/addOrReplaceReadGroups/addOrReplaceReadGroups_LP6005115-DNA_B07.bam --output /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updatedDict_small_exac_common_3.hg19.vcf.gz --replace true
02:52:35.032 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 16, 2021 2:52:35 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
02:52:35.275 INFO UpdateVCFSequenceDictionary - ------------------------------------------------------------
02:52:35.276 INFO UpdateVCFSequenceDictionary - The Genome Analysis Toolkit (GATK) v4.2.0.0
02:52:35.276 INFO UpdateVCFSequenceDictionary - For support and documentation go to https://software.broadinstitute.org/gatk/
02:52:35.276 INFO UpdateVCFSequenceDictionary - Executing as root@66ef95f83362 on Linux v5.8.0-1039-azure amd64
02:52:35.276 INFO UpdateVCFSequenceDictionary - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
02:52:35.276 INFO UpdateVCFSequenceDictionary - Start Date/Time: September 16, 2021 2:52:34 AM GMT
02:52:35.276 INFO UpdateVCFSequenceDictionary - ------------------------------------------------------------
02:52:35.276 INFO UpdateVCFSequenceDictionary - ------------------------------------------------------------
02:52:35.277 INFO UpdateVCFSequenceDictionary - HTSJDK Version: 2.24.0
02:52:35.277 INFO UpdateVCFSequenceDictionary - Picard Version: 2.25.0
02:52:35.277 INFO UpdateVCFSequenceDictionary - Built for Spark Version: 2.4.5
02:52:35.277 INFO UpdateVCFSequenceDictionary - HTSJDK Defaults.COMPRESSION_LEVEL : 2
02:52:35.277 INFO UpdateVCFSequenceDictionary - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
02:52:35.277 INFO UpdateVCFSequenceDictionary - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
02:52:35.278 INFO UpdateVCFSequenceDictionary - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
02:52:35.278 INFO UpdateVCFSequenceDictionary - Deflater: IntelDeflater
02:52:35.278 INFO UpdateVCFSequenceDictionary - Inflater: IntelInflater
02:52:35.278 INFO UpdateVCFSequenceDictionary - GCS max retries/reopens: 20
02:52:35.278 INFO UpdateVCFSequenceDictionary - Requester pays: disabled
02:52:35.278 INFO UpdateVCFSequenceDictionary - Initializing engine
02:52:35.622 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
02:52:35.783 INFO UpdateVCFSequenceDictionary - Done initializing engine
02:52:35.816 INFO ProgressMeter - Starting traversal
02:52:35.817 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
02:52:37.419 INFO ProgressMeter - chr21:45710681 0.0 59232 2219812.6
02:52:37.434 INFO ProgressMeter - Traversal complete. Processed 59232 total variants in 0.0 minutes.
02:52:37.508 INFO UpdateVCFSequenceDictionary - Shutting down engine
[September 16, 2021 2:52:37 AM GMT] org.broadinstitute.hellbender.tools.walkers.variantutils.UpdateVCFSequenceDictionary done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=473432064
Then I tried GetPileupSummaries again with the second generated file (i.e. updatedDict_small_exac_common_3.hg19.vcf.gz), using the following command:
gatk GetPileupSummaries -I /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf -L /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updatedDict_small_exac_common_3.hg19.vcf.gz -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updatedDict_small_exac_common_3.hg19.vcf.gz -O /gatk/my_data/wgs_BAM/step1_3/getpileupsummaries_LP6005115-DNA_B07.table
However, I still get the error:
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar GetPileupSummaries -I /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf -L /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updatedDict_small_exac_common_3.hg19.vcf.gz -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updatedDict_small_exac_common_3.hg19.vcf.gz -O /gatk/my_data/wgs_BAM/step1_3/getpileupsummaries_LP6005115-DNA_B07.table
02:53:32.699 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 16, 2021 2:53:32 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
02:53:32.929 INFO GetPileupSummaries - ------------------------------------------------------------
02:53:32.929 INFO GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.2.0.0
02:53:32.929 INFO GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
02:53:32.930 INFO GetPileupSummaries - Executing as root@66ef95f83362 on Linux v5.8.0-1039-azure amd64
02:53:32.930 INFO GetPileupSummaries - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
02:53:32.930 INFO GetPileupSummaries - Start Date/Time: September 16, 2021 2:53:32 AM GMT
02:53:32.930 INFO GetPileupSummaries - ------------------------------------------------------------
02:53:32.930 INFO GetPileupSummaries - ------------------------------------------------------------
02:53:32.931 INFO GetPileupSummaries - HTSJDK Version: 2.24.0
02:53:32.931 INFO GetPileupSummaries - Picard Version: 2.25.0
02:53:32.931 INFO GetPileupSummaries - Built for Spark Version: 2.4.5
02:53:32.931 INFO GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
02:53:32.931 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
02:53:32.931 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
02:53:32.931 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
02:53:32.931 INFO GetPileupSummaries - Deflater: IntelDeflater
02:53:32.931 INFO GetPileupSummaries - Inflater: IntelInflater
02:53:32.931 INFO GetPileupSummaries - GCS max retries/reopens: 20
02:53:32.931 INFO GetPileupSummaries - Requester pays: disabled
02:53:32.931 INFO GetPileupSummaries - Initializing engine
02:53:33.314 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/updatedDict_small_exac_common_3.hg19.vcf.gz
02:53:33.407 INFO GetPileupSummaries - Shutting down engine
[September 16, 2021 2:53:33 AM GMT] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=469237760
java.lang.IllegalArgumentException: Dictionary cannot have size zero
at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:798)
at org.broadinstitute.hellbender.utils.MRUCachingSAMSequenceDictionary.<init>(MRUCachingSAMSequenceDictionary.java:35)
at org.broadinstitute.hellbender.utils.GenomeLocParser.<init>(GenomeLocParser.java:78)
at org.broadinstitute.hellbender.utils.GenomeLocParser.<init>(GenomeLocParser.java:62)
at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.getTraversalParameters(IntervalArgumentCollection.java:180)
at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.getIntervals(IntervalArgumentCollection.java:111)
at org.broadinstitute.hellbender.engine.GATKTool.initializeIntervals(GATKTool.java:514)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:709)
at org.broadinstitute.hellbender.engine.LocusWalker.onStartup(LocusWalker.java:136)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
-
Hi Ruiqiao Bai,
Thank you for your patience as we looked into this issue. Because your VCF headers are up to date and cause no problems elsewhere, we think the tool cannot find a sequence dictionary that matches the reference version of your input VCF. You can create a sequence dictionary from your reference fasta (the one used to create the VCF) with the CreateSequenceDictionary tool, then pass it directly to GetPileupSummaries with the --sequence-dictionary option.
This should hopefully solve your problem! Let me know how it goes.
Best,
Genevieve
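As a quick sanity check on the point above, the following is a minimal sketch (not part of GATK) that counts the `##contig` lines in a VCF header, since those lines are what carry the sequence dictionary inside a VCF. The function name `count_contig_lines` is hypothetical, and the sketch assumes a `gzip` that passes uncompressed input through unchanged when given `-dcf`, as GNU gzip does.

```shell
# Hypothetical helper: count the ##contig header lines in a VCF.
# Works on plain-text, gzip-compressed, and bgzip-compressed files.
count_contig_lines() {
    vcf="$1"
    # -d decompress, -c write to stdout, -f pass non-gzip input through as-is.
    # awk stops at the #CHROM line so the variant records are never scanned.
    gzip -dcf "$vcf" | awk '/^##contig=/ {n++} /^#CHROM/ {exit} END {print n+0}'
}

# Example (the path is a placeholder):
#   count_contig_lines my_common_variants.vcf.gz
```

A count of 0 suggests the VCF header has no sequence dictionary; a non-zero count means the header has one, so the problem probably lies in another input.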
-
Dear Genevieve,
Thank you so much for your help! I have created the .dict file using the following command:
gatk CreateSequenceDictionary -R /gatk/my_data/wgs_processing_facilitating_data/hg19.fa
Then I tested the following command and received a new error: ‘Input files master sequence dictionary and reads have incompatible contigs: No overlapping contigs found.’ Please see below for details. It seems that my ‘reads contigs’ are empty. How should I fix this issue?
gatk GetPileupSummaries -I /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf -L /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -O /gatk/my_data/wgs_BAM/step1_3/getpileupsummaries_LP6005115-DNA_B07.table --sequence-dictionary /gatk/my_data/wgs_processing_facilitating_data/hg19.dict
Using GATK jar /gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar GetPileupSummaries -I /gatk/my_data/wgs_BAM/step1_1/unfiltered_LP6005115-DNA_B07.vcf -L /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -V /gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz -O /gatk/my_data/wgs_BAM/step1_3/getpileupsummaries_LP6005115-DNA_B07.table --sequence-dictionary /gatk/my_data/wgs_processing_facilitating_data/hg19.dict
05:16:38.802 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Sep 21, 2021 5:16:39 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
05:16:39.029 INFO GetPileupSummaries - ------------------------------------------------------------
05:16:39.030 INFO GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.2.0.0
05:16:39.030 INFO GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
05:16:39.030 INFO GetPileupSummaries - Executing as root@e077d38362bc on Linux v5.8.0-1039-azure amd64
05:16:39.030 INFO GetPileupSummaries - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
05:16:39.031 INFO GetPileupSummaries - Start Date/Time: September 21, 2021 5:16:38 AM GMT
05:16:39.031 INFO GetPileupSummaries - ------------------------------------------------------------
05:16:39.031 INFO GetPileupSummaries - ------------------------------------------------------------
05:16:39.031 INFO GetPileupSummaries - HTSJDK Version: 2.24.0
05:16:39.031 INFO GetPileupSummaries - Picard Version: 2.25.0
05:16:39.032 INFO GetPileupSummaries - Built for Spark Version: 2.4.5
05:16:39.032 INFO GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
05:16:39.032 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
05:16:39.032 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
05:16:39.032 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
05:16:39.032 INFO GetPileupSummaries - Deflater: IntelDeflater
05:16:39.032 INFO GetPileupSummaries - Inflater: IntelInflater
05:16:39.032 INFO GetPileupSummaries - GCS max retries/reopens: 20
05:16:39.032 INFO GetPileupSummaries - Requester pays: disabled
05:16:39.032 INFO GetPileupSummaries - Initializing engine
05:16:39.542 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
05:16:39.664 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/wgs_processing_facilitating_data/hg38_to_hg19/lifted_small_exac_common_3.hg19.vcf.gz
05:16:40.381 INFO IntervalArgumentCollection - Processing 59112 bp from intervals
05:16:40.439 INFO GetPileupSummaries - Shutting down engine
[September 21, 2021 5:16:40 AM GMT] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=482869248
***********************************************************************
A USER ERROR has occurred: Input files master sequence dictionary and reads have incompatible contigs: No overlapping contigs found.
master sequence dictionary contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chrX, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr20, chrY, chr19, chr22, chr21, chr6_ssto_hap7, chr6_mcf_hap5, chr6_cox_hap2, chr6_mann_hap4, chr6_apd_hap1, chr6_qbl_hap6, chr6_dbb_hap3, chr17_ctg5_hap1, chr4_ctg9_hap1, chr1_gl000192_random, chrUn_gl000225, chr4_gl000194_random, chr4_gl000193_random, chr9_gl000200_random, chrUn_gl000222, chrUn_gl000212, chr7_gl000195_random, chrUn_gl000223, chrUn_gl000224, chrUn_gl000219, chr17_gl000205_random, chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chr9_gl000199_random, chrUn_gl000211, chrUn_gl000213, chrUn_gl000220, chrUn_gl000218, chr19_gl000209_random, chrUn_gl000221, chrUn_gl000214, chrUn_gl000228, chrUn_gl000227, chr1_gl000191_random, chr19_gl000208_random, chr9_gl000198_random, chr17_gl000204_random, chrUn_gl000233, chrUn_gl000237, chrUn_gl000230, chrUn_gl000242, chrUn_gl000243, chrUn_gl000241, chrUn_gl000236, chrUn_gl000240, chr17_gl000206_random, chrUn_gl000232, chrUn_gl000234, chr11_gl000202_random, chrUn_gl000238, chrUn_gl000244, chrUn_gl000248, chr8_gl000196_random, chrUn_gl000249, chrUn_gl000246, chr17_gl000203_random, chr8_gl000197_random, chrUn_gl000245, chrUn_gl000247, chr9_gl000201_random, chrUn_gl000235, chrUn_gl000239, chr21_gl000210_random, chrUn_gl000231, chrUn_gl000229, chrM, chrUn_gl000226, chr18_gl000207_random]
reads contigs = []
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
-
Ahh, I see, I missed the issue earlier. The -I file is supposed to be your BAM file containing your reads, not a VCF file.
You can see the tool docs for more information: https://gatk.broadinstitute.org/hc/en-us/articles/4405451312539-GetPileupSummaries
Please let me know if this solution works. If it does, I will submit a bug report ticket so that the GATK devs can make the error message more helpful for this problem.
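For anyone who wants to catch this mix-up before launching the tool, here is a minimal sketch of a pre-flight check (not part of GATK). It relies on the fact that a BAM file is BGZF (gzip-compatible) compressed and that its decompressed content begins with the magic bytes `BAM`. The function name `looks_like_bam` and the file names in the usage comment are hypothetical.

```shell
# Hypothetical helper: succeed (exit status 0) only if the file looks like a BAM.
# A plain-text VCF fails to decompress; a .vcf.gz decompresses to "##f...".
looks_like_bam() {
    magic=$(gzip -dc "$1" 2>/dev/null | head -c 3)
    [ "$magic" = "BAM" ]
}

# Example usage (paths are placeholders, substitute your own):
#   if looks_like_bam tumor.bam; then
#       gatk GetPileupSummaries -I tumor.bam \
#           -V common_variants.vcf.gz -L common_variants.vcf.gz \
#           -O pileups.table
#   else
#       echo "The -I input does not look like a BAM file" >&2
#   fi
```

The check reads only the first few bytes, so it is fast even on whole-genome BAMs. It does not validate the rest of the file or its index.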
-
Yes, it works! I really appreciate your help!
-
Thank you for the update! I'm sorry it took so long to identify. Here is the issue ticket I created: https://github.com/broadinstitute/gatk/issues/7479