GATK configuration
REQUIRED for all errors and issues: I am running GATK for the first time, using the BaseRecalibrator tool.
a) GATK version used: 4.3.0
b) Exact command used: gatk BaseRecalibrator -I ../data/Afar_rD/Afar_1_dedup.bam -R ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa --known-sites ../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz -O Afar1_data.table
c) Entire program log:
(/opt/sw/gatk/4.3/gatk4_env) gatk BaseRecalibrator -I ../data/Afar_rD/Afar_1_dedup.bam -R ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa --known-sites ../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz -O Afar1_data.table
Using GATK jar /export/opt/sw/gatk/4.3/gatk4_env/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /export/opt/sw/gatk/4.3/gatk4_env/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar BaseRecalibrator -I ../data/Afar_rD/Afar_1_dedup.bam -R ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa --known-sites ../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz -O Afar1_data.table
12:56:11.284 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/export/opt/sw/gatk/4.3/gatk4_env/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
12:56:11.636 INFO BaseRecalibrator - ------------------------------------------------------------
12:56:11.636 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.3.0.0
12:56:11.636 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
12:56:11.637 INFO BaseRecalibrator - Executing as wondossen@planetsmasher.hgen.slu.se on Linux v3.10.0-693.21.1.el7.x86_64 amd64
12:56:11.637 INFO BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
12:56:11.637 INFO BaseRecalibrator - Start Date/Time: February 13, 2023 12:56:11 PM CET
12:56:11.637 INFO BaseRecalibrator - ------------------------------------------------------------
12:56:11.637 INFO BaseRecalibrator - ------------------------------------------------------------
12:56:11.638 INFO BaseRecalibrator - HTSJDK Version: 3.0.1
12:56:11.638 INFO BaseRecalibrator - Picard Version: 2.27.5
12:56:11.638 INFO BaseRecalibrator - Built for Spark Version: 2.4.5
12:56:11.639 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:56:11.639 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:56:11.639 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:56:11.639 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:56:11.639 INFO BaseRecalibrator - Deflater: IntelDeflater
12:56:11.639 INFO BaseRecalibrator - Inflater: IntelInflater
12:56:11.639 INFO BaseRecalibrator - GCS max retries/reopens: 20
12:56:11.639 INFO BaseRecalibrator - Requester pays: disabled
12:56:11.640 INFO BaseRecalibrator - Initializing engine
12:56:11.647 INFO BaseRecalibrator - Shutting down engine
[February 13, 2023 12:56:11 PM CET] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2423259136
***********************************************************************
A USER ERROR has occurred: Fasta dict file file:///export/proj/ethiopian_cattle/NOBACKUP/../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.dict for reference file:///export/proj/ethiopian_cattle/NOBACKUP/../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa does not exist. Please see http://gatkforums.broadinstitute.org/discussion/1601/how-can-i-prepare-a-fasta-file-to-use-as-reference for help creating it.
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
(/opt/sw/gatk/4.3/gatk4_env) pwd
/proj/ethiopian_cattle/NOBACKUP
(/opt/sw/gatk/4.3/gatk4_env)
-
Hi Wondessen Ayalew,
You need a FASTA dictionary to go along with your reference FASTA. Unfortunately, the error message points to a page that no longer appears to exist, but you can use CreateSequenceDictionary in Picard to create the dictionary.
(Note, if you are new to GATK/Picard: you can run Picard tools from GATK. So if you don't have Picard installed separately, you can run `gatk CreateSequenceDictionary -R ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa -O ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.dict`.)
This post may be helpful as well: https://gatk.broadinstitute.org/hc/en-us/articles/360035531652-FASTA-Reference-genome-format
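A quick way to see which companion files are missing before launching a GATK tool is sketched below. It assumes the naming visible in the error message above (the dictionary sits next to the FASTA with the extension replaced by .dict) and the usual samtools convention of appending .fai for the index. The function name check_reference is a hypothetical helper, not part of GATK.

```shell
# Sketch: report which companion files are missing next to a reference FASTA.
# check_reference is a hypothetical helper name, not a GATK command.
check_reference() {
    local ref="$1"
    local fai="${ref}.fai"        # FASTA index, created by: samtools faidx
    local dict="${ref%.*}.dict"   # sequence dictionary: last extension replaced by .dict
    local status=0

    if [ ! -f "$fai" ]; then
        echo "missing: $fai (create with: samtools faidx $ref)"
        status=1
    fi
    if [ ! -f "$dict" ]; then
        echo "missing: $dict (create with: gatk CreateSequenceDictionary -R $ref -O $dict)"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "ok: $ref has both .fai and .dict"
    fi
    return "$status"
}
```

For the reference in this thread, it would be called as `check_reference ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa`. It only tests that the files exist; it does not verify that they match the FASTA contents.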
-
Dear GATK team,
Thank you for your prompt response and valuable support. The problem still persists even though I generated an index file using "samtools faidx Bos_taurus.ARS-UCD1.2.dna.toplevel.fa". The files in my directory are listed below.
The error message is as follows:
(/opt/sw/gatk/4.3/gatk4_env) gatk BaseRecalibrator -I ../data/Afar_rD/Afar_1_dedup.bam -R ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa --known-sites ../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz -O Afar1_data.table
Using GATK jar /export/opt/sw/gatk/4.3/gatk4_env/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /export/opt/sw/gatk/4.3/gatk4_env/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar BaseRecalibrator -I ../data/Afar_rD/Afar_1_dedup.bam -R ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa --known-sites ../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz -O Afar1_data.table
06:37:47.263 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/export/opt/sw/gatk/4.3/gatk4_env/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
06:37:47.540 INFO BaseRecalibrator - ------------------------------------------------------------
06:37:47.541 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.3.0.0
06:37:47.541 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
06:37:47.541 INFO BaseRecalibrator - Executing as wondossen@planetsmasher.hgen.slu.se on Linux v3.10.0-693.21.1.el7.x86_64 amd64
06:37:47.541 INFO BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
06:37:47.541 INFO BaseRecalibrator - Start Date/Time: February 14, 2023 6:37:47 AM CET
06:37:47.541 INFO BaseRecalibrator - ------------------------------------------------------------
06:37:47.542 INFO BaseRecalibrator - ------------------------------------------------------------
06:37:47.542 INFO BaseRecalibrator - HTSJDK Version: 3.0.1
06:37:47.542 INFO BaseRecalibrator - Picard Version: 2.27.5
06:37:47.542 INFO BaseRecalibrator - Built for Spark Version: 2.4.5
06:37:47.543 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
06:37:47.543 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
06:37:47.543 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
06:37:47.543 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
06:37:47.543 INFO BaseRecalibrator - Deflater: IntelDeflater
06:37:47.543 INFO BaseRecalibrator - Inflater: IntelInflater
06:37:47.543 INFO BaseRecalibrator - GCS max retries/reopens: 20
06:37:47.543 INFO BaseRecalibrator - Requester pays: disabled
06:37:47.544 INFO BaseRecalibrator - Initializing engine
06:37:48.593 INFO FeatureManager - Using codec VCFCodec to read file file:///export/proj/ethiopian_cattle/NOBACKUP/../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz
06:37:48.606 INFO BaseRecalibrator - Shutting down engine
[February 14, 2023 6:37:48 AM CET] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=2171076608
***********************************************************************
A USER ERROR has occurred: An index is required but was not found for file /export/proj/ethiopian_cattle/NOBACKUP/../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz. Support for unindexed block-compressed files has been temporarily disabled. Try running IndexFeatureFile on the input.
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
(/opt/sw/gatk/4.3/gatk4_env)
Thank you!
-
Dear All,
Thank you for your valuable comments. Your suggestions resolved my GATK configuration questions. In addition, sorting the VCF file (dbSNPs) made the message posted in my second question go away.
Now, I could not find how to run the second-pass base recalibration needed for AnalyzeCovariates in GATK 4.3. Instead, I skipped the AnalyzeCovariates and PrintReads steps and went straight to HaplotypeCaller. Any suggestions?
Thank you!
-
Hello again,
This time it's complaining that your VCF doesn't have an index. You can use the tool IndexFeatureFile to fix that.
For example:
gatk IndexFeatureFile -I /export/proj/ethiopian_cattle/NOBACKUP/../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz
I think it's likely that sorting the file fixed the problem by accident, because the file was indexed as part of the sort operation.
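If it helps, here is a rough sketch of checking for the index first and only running IndexFeatureFile when it is absent. It assumes the common naming convention where the index sits next to the file (.tbi for a bgzip-compressed VCF, .idx for a plain-text one). The helper names has_feature_index and index_if_needed are made up for this example.

```shell
# Sketch: look for an index beside a known-sites file before calling GATK.
# has_feature_index and index_if_needed are hypothetical helper names.
has_feature_index() {
    local vcf="$1"
    case "$vcf" in
        *.gz) [ -f "${vcf}.tbi" ] ;;   # tabix index for block-compressed files
        *)    [ -f "${vcf}.idx" ] ;;   # Tribble index for plain-text files
    esac
}

index_if_needed() {
    local vcf="$1"
    if has_feature_index "$vcf"; then
        echo "index found for $vcf"
    else
        echo "no index for $vcf; running IndexFeatureFile"
        gatk IndexFeatureFile -I "$vcf"
    fi
}
```

Note that this only checks that an index file exists. It does not detect an index that is older than the VCF, so re-index after sorting or otherwise rewriting the file.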
It's probably fine to skip AnalyzeCovariates if there isn't anything unusual about your sequencing. Usually BQSR works fine and it's just a sanity check.
Did you run ApplyBQSR? That's the step that actually does the recalibration. So if you just run BaseRecalibrator without it, you haven't actually done anything to your data. That's probably fine too, since modern high-quality sequencing typically only benefits marginally from recalibration.
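To make the order of the steps concrete, here is a minimal sketch of the flow described above: BaseRecalibrator, then ApplyBQSR, then an optional second BaseRecalibrator pass on the recalibrated BAM so that AnalyzeCovariates has a before and an after table to compare. The helper names run_cmd and bqsr, the DRY_RUN switch, and the output file names are all assumptions made for illustration. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Sketch of the BQSR flow. run_cmd, bqsr and DRY_RUN are hypothetical helpers,
# and the output names derived from $prefix are placeholders.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"    # print the command only
    else
        "$@"         # run the command
    fi
}

bqsr() {
    local bam="$1" ref="$2" sites="$3" prefix="$4"

    # Step 1: build the recalibration table (the BAM is not modified).
    run_cmd gatk BaseRecalibrator -I "$bam" -R "$ref" --known-sites "$sites" -O "${prefix}.recal.table"

    # Step 2: write a new BAM with recalibrated base qualities.
    run_cmd gatk ApplyBQSR -I "$bam" -R "$ref" --bqsr-recal-file "${prefix}.recal.table" -O "${prefix}.recal.bam"

    # Optional sanity check: second pass on the recalibrated BAM, then compare tables.
    run_cmd gatk BaseRecalibrator -I "${prefix}.recal.bam" -R "$ref" --known-sites "$sites" -O "${prefix}.post.table"
    run_cmd gatk AnalyzeCovariates -before "${prefix}.recal.table" -after "${prefix}.post.table" -plots "${prefix}.bqsr.pdf"
}
```

With the files from this thread, a dry run would look like `DRY_RUN=1 bqsr ../data/Afar_rD/Afar_1_dedup.bam ../reference/Bos_taurus.ARS-UCD1.2.dna.toplevel.fa ../GATK_reso/ARS1.2PlusY_BQSR.vcf.gz Afar1`. The last two commands can be dropped if you skip the sanity check. The plotting step depends on R being available in the environment.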