Could not find vcf.stats file after Mutect2
I ran Mutect2 for variant calling. The command completed successfully; however, when I then ran FilterMutectCalls, I got the following error message:
***********************************************************************
A USER ERROR has occurred: Mutect stats table somatic_1t.vcf.stats not found. When Mutect2 outputs a file calls.vcf it also creates a calls.vcf.stats file. Perhaps this file was not moved along with the vcf, or perhaps it was not delocalized from a virtual machine while running in the cloud.
When I looked for a solution, I found that similar issues had been reported earlier, but that bug was fixed in newer versions. Since I am using GATK 4.2.4.0, I wonder what the cause could be now. Clearly, my Mutect2 command did not generate the 'stats' file. Please suggest a solution.
REQUIRED for all errors and issues:
a) GATK version used: gatk/4.2.4.0
b) Exact command used:
#!/bin/bash
#SBATCH --time=23:30:00
#SBATCH --account=def-kar
#SBATCH --mem-per-cpu=3G
#SBATCH --cpus-per-task=16
module load StdEnv/2020
module load gatk/4.2.4.0
gatk Mutect2 \
-R ../ref/GRCh38.d1.vd1.fa \
-I recal_reads_1_n.bam \
-I recal_reads_1_t.bam \
-normal 20 \
--germline-resource ../cloudRes/af-only-gnomad.hg38.vcf.gz \
--panel-of-normals ../cloudRes/1000g_pon.hg38.vcf.gz \
-O somatic_1t.vcf
gatk FilterMutectCalls \
-R ../ref/GRCh38.d1.vd1.fa \
-V somatic_1t.vcf \
-O somatic_1t_filtered.vcf
c) Entire program log:
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1091)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Using GATK jar /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/gatk/4.2.4.0/gatk-package-4.2.4.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/gatk/4.2.4.0/gatk-package-4.2.4.0-local.jar Mutect2 -R ../ref/GRCh38.d1.vd1.fa -I recal_reads_1_n.bam -I recal_reads_1_t.bam -normal 20 --germline-resource ../cloudRes/af-only-gnomad.hg38.vcf.gz --panel-of-normals ../cloudRes/1000g_pon.hg38.vcf.gz -O somatic_1t.vcf
-
Hi Seke Keretsu,
Thank you for your much-appreciated patience and for writing to the GATK forum!
I reviewed the error you encountered with our developers and received some feedback on your next steps.
The cluster you are using doesn't seem to be saving your stats table properly when running the tool. Have you tried running the tool locally?
I'd recommend the following: firstly, ensure that your stats table is, in fact, being generated. If it is being generated and isn't being saved, our hypothesis is likely correct. In that case, go ahead and re-try running the tool on your local computer instead of a cluster.
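One way to make that first check is sketched below. It assumes the stats file sits next to the VCF under the name `<vcf>.stats`, as the error message describes; the function name `check_mutect_stats` is a hypothetical helper, not part of GATK.

```shell
#!/bin/bash
# Hypothetical helper (not part of GATK): confirm that Mutect2 left a
# non-empty stats file next to the VCF it wrote.
check_mutect_stats() {
    local vcf="$1"
    local stats="${vcf}.stats"   # naming taken from the error message: calls.vcf -> calls.vcf.stats
    if [ -s "$stats" ]; then
        echo "found: $stats"
        return 0
    fi
    echo "missing or empty: $stats" >&2
    return 1
}

# Intended use in the job script, between the two GATK steps:
#   check_mutect_stats somatic_1t.vcf || exit 1
```

Placing the check between Mutect2 and FilterMutectCalls makes the job stop with a clear message if the stats file was never written, rather than failing later inside FilterMutectCalls.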
I hope this helps! Please let me know if this leads you to success. If you have any further questions/issues in the meantime, please do not hesitate to reach out.
Best,
Anthony
-
Hi Seke Keretsu,
We haven't heard from you in a while so we're going to close out this ticket. If you still require assistance, simply respond to this email and we'll be happy to pick up where we left off!
Kind regards,
Anthony
-
Hi,
I am having a similar issue with my stats file not being written out correctly while running Mutect2 on the All of Us platform using the us.gcr.io/broad-gatk/gatk containers. I have tried versions 4.2.4.1, 4.3.0.0, and 4.4.0.0 and get the same error each time. Below is the error message (it looks like it cannot save my stats file to my bucket, although the VCF and its index file are written to the bucket):
04:39:30.412 INFO Mutect2 - Shutting down engine
[March 24, 2023 4:39:30 AM GMT] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 3,314.17 minutes.
Runtime.totalMemory()=27140816896
***********************************************************************
A USER ERROR has occurred: Encountered an IO exception while writing to gs:/fc-secure-3901d3c2-8b08-4cfd-a5d4-d95becbe0a44/wgs_1000033.PON1KG.mutect2.vcf.gz.stats.
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Using GATK jar /gatk/gatk-package-4.3.0.0-local.jar
Thank you!
-
Mutect2 can write the output VCF and VCF index to arbitrary paths, either local files or buckets. The stats file, however, can only be written locally. When we run Mutect2 in the cloud, e.g., in the featured workflow on Terra, the stats file is written to a local path on the VM running Mutect2, and Terra then copies it to a bucket when it delocalizes the outputs. The stats file path can never be a gs://____ path, however.
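Outside of Terra, the same pattern can be reproduced by hand: point `-O` at a local path so the stats file lands on local disk, then copy the outputs to the bucket yourself. The sketch below is only an illustration under assumptions: `stage_mutect_outputs` and the `COPY_CMD` variable are hypothetical names, the destination is a placeholder, and the copy defaults to `gsutil cp`.

```shell
#!/bin/bash
# Hypothetical helper: copy locally written Mutect2 outputs to a destination
# such as a bucket. COPY_CMD is an assumed override hook; it defaults to `gsutil cp`.
stage_mutect_outputs() {
    local vcf="$1" dest="$2"
    local f
    # The VCF and its stats file are both required by FilterMutectCalls.
    for f in "$vcf" "${vcf}.stats"; do
        if [ ! -e "$f" ]; then
            echo "missing local output: $f" >&2
            return 1
        fi
    done
    # Copy the required files, plus whichever index file exists (.tbi or .idx).
    for f in "$vcf" "${vcf}.stats" "${vcf}.tbi" "${vcf}.idx"; do
        [ -e "$f" ] || continue
        ${COPY_CMD:-gsutil cp} "$f" "$dest/" || return 1
    done
}

# Intended use (paths are placeholders):
#   run Mutect2 with -O /local/dir/calls.vcf.gz, then
#   stage_mutect_outputs /local/dir/calls.vcf.gz "gs://your-bucket/your-prefix"
```

The function refuses to copy anything if the stats file is absent, so a missing stats table is caught on the machine that ran Mutect2 rather than later at the FilterMutectCalls step.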