GATK ApplyBQSR failing
Hi, I am new to bioinformatics and am trying to produce a VCF from a FASTQ as a training exercise. I am trying to run base recalibration on my BAM file using "Homo_sapiens_assembly38.dbsnp138.vcf", but ApplyBQSR fails with an error when creating the output file. I am not sure what I'm doing wrong.
REQUIRED for all errors and issues:
a) GATK version used: 4.5.0.0
b) Exact command used:
gatk ApplyBQSR \
-I /data/bam_files/Test01_markdup.bam \
-O data/bam_files/Test01_recalibrated.bam \
-R /data/reference_genome/hg38.fa \
--bqsr-recal-file /data/bam_files/recal_data.table
c) Entire program log:
16:37:59.255 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
16:37:59.334 INFO ApplyBQSR - ------------------------------------------------------------
16:37:59.336 INFO ApplyBQSR - The Genome Analysis Toolkit (GATK) v4.5.0.0
16:37:59.336 INFO ApplyBQSR - For support and documentation go to https://software.broadinstitute.org/gatk/
16:37:59.336 INFO ApplyBQSR - Executing as ?@459ed45575d8 on Linux v6.8.0-48-generic amd64
16:37:59.336 INFO ApplyBQSR - Java runtime: OpenJDK 64-Bit Server VM v17.0.9+9-Ubuntu-122.04
16:37:59.336 INFO ApplyBQSR - Start Date/Time: November 27, 2024 at 4:37:59 PM GMT
16:37:59.336 INFO ApplyBQSR - ------------------------------------------------------------
16:37:59.336 INFO ApplyBQSR - ------------------------------------------------------------
16:37:59.337 INFO ApplyBQSR - HTSJDK Version: 4.1.0
16:37:59.337 INFO ApplyBQSR - Picard Version: 3.1.1
16:37:59.337 INFO ApplyBQSR - Built for Spark Version: 3.5.0
16:37:59.337 INFO ApplyBQSR - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:37:59.337 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:37:59.337 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:37:59.338 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:37:59.338 INFO ApplyBQSR - Deflater: IntelDeflater
16:37:59.338 INFO ApplyBQSR - Inflater: IntelInflater
16:37:59.338 INFO ApplyBQSR - GCS max retries/reopens: 20
16:37:59.338 INFO ApplyBQSR - Requester pays: disabled
16:37:59.339 INFO ApplyBQSR - Initializing engine
16:37:59.420 INFO ApplyBQSR - Done initializing engine
16:37:59.445 INFO ApplyBQSR - Shutting down engine
[November 27, 2024 at 4:37:59 PM GMT] org.broadinstitute.hellbender.tools.walkers.bqsr.ApplyBQSR done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=285212672
htsjdk.samtools.util.RuntimeIOException: Error opening file: file:///gatk/data/bam_files/Test01_recalibrated.bam
at htsjdk.samtools.SAMFileWriterFactory.makeBAMWriter(SAMFileWriterFactory.java:306)
at htsjdk.samtools.SAMFileWriterFactory.makeBAMWriter(SAMFileWriterFactory.java:263)
at htsjdk.samtools.SAMFileWriterFactory.makeSAMOrBAMWriter(SAMFileWriterFactory.java:444)
at htsjdk.samtools.SAMFileWriterFactory.makeWriter(SAMFileWriterFactory.java:496)
at org.broadinstitute.hellbender.utils.read.ReadUtils.createCommonSAMWriterFromFactory(ReadUtils.java:1016)
at org.broadinstitute.hellbender.utils.read.ReadUtils.createCommonSAMWriter(ReadUtils.java:964)
at org.broadinstitute.hellbender.engine.GATKTool.createSAMWriter(GATKTool.java:847)
at org.broadinstitute.hellbender.tools.walkers.bqsr.ApplyBQSR.onTraversalStart(ApplyBQSR.java:113)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1096)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:149)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:217)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:166)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:209)
at org.broadinstitute.hellbender.Main.main(Main.java:306)
Caused by: java.nio.file.NoSuchFileException: data/bam_files/Test01_recalibrated.bam
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:218)
at java.base/java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:484)
at java.base/java.nio.file.Files.newOutputStream(Files.java:228)
at htsjdk.samtools.SAMFileWriterFactory.makeBAMWriter(SAMFileWriterFactory.java:294)
... 14 more
Using GATK jar /gatk/gatk-package-4.5.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.5.0.0-local.jar ApplyBQSR -I /data/bam_files/Test01_markdup.bam --bqsr-recal-file /data/bam_files/recal_data.table -O data/bam_files/Test01_recalibrated.bam
Traceback (most recent call last):
File "/home/user/modules/5_validate_bam_file.py", line 29, in <module>
run_docker_subprocess(path_to_data, image, GATK_command)
File "/home/user/modules/global_functions.py", line 27, in run_docker_subprocess
process = subprocess.run(docker_command, shell=True, check=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'docker run --rm -v /home/user/files/:/data --user 1012:1013 broadinstitute/gatk:4.5.0.0 bash -c 'gatk ApplyBQSR -I /data/bam_files/Test01_markdup.bam --bqsr-recal-file /data/bam_files/recal_data.table -O data/bam_files/Test01_recalibrated.bam'' returned non-zero exit status 3.
Just for reference, this is what I have run so far:
bwa mem (fastq > sam)
samtools sort -n (sam > bam)
samtools fixmate -m (bam > bam)
samtools sort -o (bam > bam)
samtools markdup (bam > bam)
-
Hi Kavi Jeshram
Looking at your command line, your output path (data/bam_files/Test01_recalibrated.bam) is a relative, local folder destination, whereas the input file is under the root location /data. Can you check that your destination folder is set properly?
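To illustrate the point about relative versus absolute paths, here is a minimal Python sketch. It assumes the process inside the container starts in /gatk, which is only inferred from the "Error opening file: file:///gatk/data/bam_files/Test01_recalibrated.bam" line in the log. The function name resolve_in_container is hypothetical and not part of GATK or Docker.

```python
from pathlib import PurePosixPath


def resolve_in_container(path: str, workdir: str = "/gatk") -> str:
    """Return the absolute path that a process started in `workdir` sees for `path`.

    `workdir` defaults to /gatk, an assumption inferred from the error log.
    """
    p = PurePosixPath(path)
    # Absolute paths are used as-is; relative ones are joined onto the working directory.
    return str(p if p.is_absolute() else PurePosixPath(workdir) / p)


# The -O value from the posted command (no leading slash).
print(resolve_in_container("data/bam_files/Test01_recalibrated.bam"))
# The same path with a leading slash, pointing into the mounted /data volume.
print(resolve_in_container("/data/bam_files/Test01_recalibrated.bam"))
```

Under that assumption, the first path resolves to /gatk/data/bam_files/Test01_recalibrated.bam, which matches the path in the error, while the second stays inside the mounted /data volume.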
-
Ah, it was being run in a Docker container, so luckily that did not matter, but good spot.
I found the issue though: I had previously moved my hg38.dict file, and it contained relative paths.
I changed them and now ApplyBQSR can find my hg38.fa :) Thanks for the help though.
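For anyone checking their own .dict file for the same thing, here is a rough Python sketch. A sequence dictionary is a SAM-style header in which each @SQ line can carry a UR tag holding the location of the reference. The helper names and the sample text below are made up for illustration, and whether a relative UR value is really what breaks a given run is an assumption based only on what is described above.

```python
import re

# A scheme prefix such as "file:" or "https:" at the start of the value.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def sq_uris(dict_text: str) -> list[tuple[str, str]]:
    """Return (SN, UR) pairs for every @SQ line that carries a UR tag."""
    pairs = []
    for line in dict_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Fields are tab-separated TAG:VALUE pairs; split on the first colon only,
        # because the UR value itself contains colons.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "UR" in tags:
            pairs.append((tags.get("SN", ""), tags["UR"]))
    return pairs


def relative_uris(dict_text: str) -> list[tuple[str, str]]:
    """Return the (SN, UR) pairs whose UR is neither an absolute path nor a URI with a scheme."""
    return [(sn, ur) for sn, ur in sq_uris(dict_text)
            if not (ur.startswith("/") or _SCHEME.match(ur))]


# Illustrative sample only, not the poster's actual hg38.dict.
SAMPLE = (
    "@HD\tVN:1.6\n"
    "@SQ\tSN:chr1\tLN:248956422\tUR:file:/data/reference_genome/hg38.fa\n"
    "@SQ\tSN:chr2\tLN:242193529\tUR:reference_genome/hg38.fa\n"
)

print(relative_uris(SAMPLE))
```

On the sample text this flags only the chr2 entry, whose UR value has no leading slash or scheme. To check a real file, read it with open(...).read() and pass the text to relative_uris.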