BaseRecalibrator error
While generating the BaseRecalibrator table, I got the following error.
REQUIRED for all errors and issues:
a) GATK version used: 4.2.5.0
b) Exact command used: java -jar /home/tbiswas/softwares/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar BaseRecalibrator -I /scratch/tbiswas/IITK-P6-BD_fixmate_sorted_duprm.bam -R /home/tbiswas/hg38/hg38.fa --known-sites /scratch/tbiswas/largefiles/hg38_dnSNP.vcf -O /scratch/tbiswas/IITK-P6-BD_recal_data.table
c) Entire program log:
[tbiswas@un02 ~]$ java -jar /home/tbiswas/softwares/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar BaseRecalibrator -I /scratch/tbiswas/IITK-P6-BD_fixmate_sorted_duprm.bam -R /home/tbiswas/hg38/hg38.fa --known-sites /scratch/tbiswas/largefiles/hg38_dnSNP.vcf -O /scratch/tbiswas/IITK-P6-BD_recal_data.table
18:29:29.224 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/tbiswas/softwares/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 18, 2022 6:29:29 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
18:29:29.367 INFO BaseRecalibrator - ------------------------------------------------------------
18:29:29.367 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.5.0
18:29:29.367 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
18:29:29.367 INFO BaseRecalibrator - Executing as tbiswas@un02 on Linux v3.10.0-327.el7.x86_64 amd64
18:29:29.367 INFO BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_65-b17
18:29:29.368 INFO BaseRecalibrator - Start Date/Time: 18 October, 2022 6:29:29 PM IST
18:29:29.368 INFO BaseRecalibrator - ------------------------------------------------------------
18:29:29.368 INFO BaseRecalibrator - ------------------------------------------------------------
18:29:29.368 INFO BaseRecalibrator - HTSJDK Version: 2.24.1
18:29:29.368 INFO BaseRecalibrator - Picard Version: 2.25.4
18:29:29.368 INFO BaseRecalibrator - Built for Spark Version: 2.4.5
18:29:29.368 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
18:29:29.368 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
18:29:29.369 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
18:29:29.369 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
18:29:29.369 INFO BaseRecalibrator - Deflater: IntelDeflater
18:29:29.369 INFO BaseRecalibrator - Inflater: IntelInflater
18:29:29.369 INFO BaseRecalibrator - GCS max retries/reopens: 20
18:29:29.369 INFO BaseRecalibrator - Requester pays: disabled
18:29:29.369 INFO BaseRecalibrator - Initializing engine
18:29:30.337 INFO FeatureManager - Using codec VCFCodec to read file file:///scratch/tbiswas/largefiles/hg38_dnSNP.vcf
18:29:40.675 WARN IndexUtils - Feature file "file:///scratch/tbiswas/largefiles/hg38_dnSNP.vcf" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
18:30:06.334 INFO BaseRecalibrator - Shutting down engine
[18 October, 2022 6:30:06 PM IST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.62 minutes.
Runtime.totalMemory()=4928831488
***********************************************************************
A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.
reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY]
features contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT]
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
[tbiswas@un02 ~]$
Please let me know what I have to do.
Thank you.
Regards,
Tanay
-
Hi Tanay,
How did you solve this problem? I am facing the same problem.
-
It looks like the contig names don't match between your input files. The reference file has contigs named chr1, chr2, etc. The known-sites VCF seems to be based on a different reference, because its contigs are named 1, 2, 3, etc. Is it possible one of the files isn't actually based on hg38?
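A quick way to confirm the mismatch is to list the contig names each file actually uses. This is only a minimal sketch: it assumes the VCF is uncompressed and that a samtools-style .fai index exists for the reference, and the function names are hypothetical helpers, not GATK tools.

```shell
# Hypothetical helpers, not part of GATK.

# Print each distinct contig name used by the records of an
# uncompressed VCF read from stdin (header lines are skipped).
vcf_contigs() {
  awk -F'\t' '!/^#/ && !seen[$1]++ { print $1 }'
}

# Print the contig names from a FASTA index (.fai) read from stdin;
# the first tab-separated column of a .fai file is the sequence name.
fai_contigs() {
  cut -f1
}
```

You would feed the known-sites VCF to `vcf_contigs` and the reference's .fai file to `fai_contigs`, then compare the two lists by eye.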
I notice the warning:
18:29:40.675 WARN IndexUtils - Feature file "file:///scratch/tbiswas/largefiles/hg38_dnSNP.vcf" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
If you're sure that the files ARE matched and generated on the same reference build, you might try adding a sequence dictionary with the appropriate contigs to your known-sites VCF.
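If the coordinates really are on the same build and only the naming differs, another option is to rewrite the contig names in the known-sites VCF so they carry the "chr" prefix. The following is a rough sketch, not an official GATK procedure. It assumes an uncompressed VCF, that the only difference is the missing "chr" prefix, and that MT should map to chrM (that mapping is my assumption; your reference does not list a mitochondrial contig at all). The function name is hypothetical.

```shell
# Hypothetical helper, not part of GATK.
# Reads an uncompressed VCF on stdin and writes it to stdout with
# "chr"-prefixed contig names. ASSUMPTIONS: the coordinates already match
# the reference build, and MT corresponds to chrM.
add_chr_prefix() {
  awk 'BEGIN { FS = OFS = "\t" }
       # Rename contigs declared in ##contig header lines, if any exist.
       /^##contig=<ID=/ {
         if ($0 ~ /ID=MT[,>]/)    sub(/ID=MT/, "ID=chrM")
         else if ($0 !~ /ID=chr/) sub(/ID=/, "ID=chr")
         print; next
       }
       # Pass all other header lines through unchanged.
       /^#/ { print; next }
       # Rename the CHROM column of each record, skipping names
       # that already start with "chr".
       {
         if ($1 == "MT")         $1 = "chrM"
         else if ($1 !~ /^chr/)  $1 = "chr" $1
         print
       }'
}
```

You would redirect the original VCF into `add_chr_prefix`, write the result to a new file, and then re-index that file before passing it to --known-sites. Renaming only makes sense if the positions are truly hg38 coordinates; if the VCF comes from a different build, the names will match but the sites will be wrong.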