Proper references for Mutect2 working in tumour-only mode
Hi all!
I am a total newcomer to bioinformatics, so I apologize in advance if my question is naive ;)
I am adopting a pipeline for exome-seq analysis that was optimized about 2-3 years ago by an alumnus of our lab. It is based on the GDC DNA-Seq analysis pipeline: https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/DNA_Seq_Variant_Calling_Pipeline/
After a few adjustments from GATK3 to GATK4 it seems to work pretty well, but a problem arises at the Mutect2 step. We need to run it in tumor-only mode.
The problem is that Mutect2 only gets through a few chromosomes (chr3 to chr5 at most) and then stops without any errors. It creates an output file, but the file is empty. I have no idea what to do.
Importantly, when I ran it with only the panel of normals, it worked. But I guess that is not enough reference data for a proper analysis...
./gatk Mutect2 \
-R /mnt/c/NGS2/Homo_sapiens_assembly38.fasta \
-I /mnt/c/NGS2/AML_BWA_47_prnt_reads.bam \
-O /mnt/c/NGS2/AML_BWA_47_MuTect2.vcf \
--panel-of-normals /mnt/c/NGS2/resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
Do you have any suggestions on how I should approach this issue so that Mutect2 runs the whole analysis with proper references?
Looking forward to your response!
a) GATK version used: gatk-4.2.5.0
b) I am running:
/mnt/c/gatk-4.2.5.0/gatk Mutect2 \
-R /mnt/c/NGS2/Homo_sapiens_assembly38.fasta \
-I /mnt/c/NGS2/AML_BWA_47_prnt_reads.bam \
-O /mnt/c/NGS2/AML_BWA_47_MuTect2_test.vcf \
-L /mnt/c/NGS2/hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz \
-tumor AML \
--germline-resource /mnt/c/NGS2/gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz \
--panel-of-normals /mnt/c/NGS2/resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
c) Entire program log:
nicheworks@NICHE_SERWER:/mnt/c/NGS2$ /mnt/c/gatk-4.2.5.0/gatk Mutect2 \
> -R /mnt/c/NGS2/Homo_sapiens_assembly38.fasta \
> -I /mnt/c/NGS2/AML_BWA_47_prnt_reads.bam \
> -O /mnt/c/NGS2/AML_BWA_47_MuTect2_test.vcf \
> -L /mnt/c/NGS2/hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz \
> -tumor AML \
> --germline-resource /mnt/c/NGS2/gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz \
> --panel-of-normals /mnt/c/NGS2/resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
Using GATK jar /mnt/c/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mnt/c/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar Mutect2 -R /mnt/c/NGS2/Homo_sapiens_assembly38.fasta -I /mnt/c/NGS2/AML_BWA_47_prnt_reads.bam -O /mnt/c/NGS2/AML_BWA_47_MuTect2_test.vcf -L /mnt/c/NGS2/hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -tumor AML --germline-resource /mnt/c/NGS2/gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz --panel-of-normals /mnt/c/NGS2/resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
12:55:41.211 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/c/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Apr 05, 2022 12:55:41 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:55:41.339 INFO Mutect2 - ------------------------------------------------------------
12:55:41.340 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.2.5.0
12:55:41.340 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
12:55:41.341 INFO Mutect2 - Executing as nicheworks@NICHE_SERWER on Linux v4.4.0-19041-Microsoft amd64
12:55:41.342 INFO Mutect2 - Java runtime: OpenJDK 64-Bit Server VM v11.0.13+8-Ubuntu-0ubuntu1.20.04
12:55:41.343 INFO Mutect2 - Start Date/Time: April 5, 2022 at 12:55:41 PM CEST
12:55:41.343 INFO Mutect2 - ------------------------------------------------------------
12:55:41.344 INFO Mutect2 - ------------------------------------------------------------
12:55:41.345 INFO Mutect2 - HTSJDK Version: 2.24.1
12:55:41.345 INFO Mutect2 - Picard Version: 2.25.4
12:55:41.346 INFO Mutect2 - Built for Spark Version: 2.4.5
12:55:41.347 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:55:41.348 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:55:41.348 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:55:41.350 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:55:41.351 INFO Mutect2 - Deflater: IntelDeflater
12:55:41.352 INFO Mutect2 - Inflater: IntelInflater
12:55:41.352 INFO Mutect2 - GCS max retries/reopens: 20
12:55:41.353 INFO Mutect2 - Requester pays: disabled
12:55:41.354 INFO Mutect2 - Initializing engine
12:55:41.674 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/c/NGS2/resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
12:55:41.904 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/c/NGS2/gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz
12:55:41.977 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/c/NGS2/hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
12:55:45.193 WARN IntelInflater - Zero Bytes Written : 0
12:55:45.435 INFO IntervalArgumentCollection - Processing 3431504 bp from intervals
12:55:45.911 INFO Mutect2 - Done initializing engine
12:55:45.964 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/mnt/c/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
12:55:45.969 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/mnt/c/gatk-4.2.5.0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
12:55:46.000 INFO IntelPairHmm - Using CPU-supported AVX-512 instructions
12:55:46.001 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
12:55:46.003 INFO IntelPairHmm - Available threads: 24
12:55:46.004 INFO IntelPairHmm - Requested threads: 4
12:55:46.007 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
12:55:46.074 INFO ProgressMeter - Starting traversal
12:55:46.075 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
12:55:56.175 INFO ProgressMeter - chr1:11821890 0.2 3660 21742.6
12:56:06.203 INFO ProgressMeter - chr1:24070983 0.3 9000 26828.3
[...]
13:01:08.944 INFO ProgressMeter - chr4:122360812 5.4 351350 65292.7
13:01:18.945 INFO ProgressMeter - chr4:184670980 5.5 381670 68796.2
13:01:22.127 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
13:01:22.128 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
13:01:22.129 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec
13:01:22.131 INFO Mutect2 - Shutting down engine
[April 5, 2022 at 1:01:22 PM CEST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 5.68 minutes.
Runtime.totalMemory()=19008585728
java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0
at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
at java.base/java.util.Objects.checkIndex(Objects.java:372)
at java.base/java.util.ArrayList.get(ArrayList.java:459)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.isActive(Mutect2Engine.java:427)
at org.broadinstitute.hellbender.engine.AssemblyRegionIterator.loadNextAssemblyRegion(AssemblyRegionIterator.java:136)
at org.broadinstitute.hellbender.engine.AssemblyRegionIterator.next(AssemblyRegionIterator.java:112)
at org.broadinstitute.hellbender.engine.AssemblyRegionIterator.next(AssemblyRegionIterator.java:35)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:192)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:173)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
It's a pretty vague error message, so I'm not 100% sure what is triggering it, but I think you should take a second look at the resources you are using as input. Specifically, the panel of normals and the germline resource are not the ones we recommend.
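As a quick first check on the inputs, something like the sketch below could be used. It relies on the documented Mutect2 expectation that the germline resource carries allele frequencies in an AF INFO field. It only inspects the VCF header, so it cannot tell whether every record actually has a usable AF value. The function name has_af_header and the script name in the usage line are made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether a gzip/bgzip-compressed VCF declares an AF INFO
# field in its header. has_af_header is a hypothetical helper name.
has_af_header() {
  # Read via stdin so gzip does not reject unusual suffixes such as .bgz.
  gzip -dc < "$1" | awk '
    /^##INFO=<ID=AF,/ { found = 1; exit }  # AF declared in the header
    !/^#/             { exit }             # first data record: header is over
    END               { exit !found }      # status 0 only if AF was declared
  '
}

# Check every VCF given on the command line.
for vcf in "$@"; do
  if has_af_header "$vcf"; then
    echo "OK: AF declared in $vcf"
  else
    echo "WARNING: no AF INFO header in $vcf"
  fi
done
```

Usage, assuming the script is saved as check_af.sh: sh check_af.sh /mnt/c/NGS2/gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz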
Here is an old GitHub ticket where users solved a similar issue by fixing their input resources: https://github.com/broadinstitute/gatk/issues/4578
Here is an article with information about our public panel of normals: https://gatk.broadinstitute.org/hc/en-us/articles/360035890631-Panel-of-Normals-PON-
#19 in the Mutect2 FAQ covers the other resources of interest: https://gatk.broadinstitute.org/hc/en-us/articles/360050722212-FAQ-for-Mutect2
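For reference, a tumor-only call using the public somatic hg38 resources might look roughly like the sketch below. It assumes that af-only-gnomad.hg38.vcf.gz and 1000g_pon.hg38.vcf.gz (plus their index files) have been downloaded into /mnt/c/NGS2. Those two local paths are assumptions; the remaining paths are taken from the original post. By default the script only prints the command, and RUN=1 executes it.

```shell
#!/bin/sh
# Sketch of a tumor-only Mutect2 invocation; not a validated pipeline step.
set -eu

GATK="${GATK:-/mnt/c/gatk-4.2.5.0/gatk}"
DIR="${DIR:-/mnt/c/NGS2}"

# Assumed local copies of the public somatic hg38 resources.
GERMLINE="$DIR/af-only-gnomad.hg38.vcf.gz"
PON="$DIR/1000g_pon.hg38.vcf.gz"

# Build the command as positional parameters so paths stay safely quoted.
set -- "$GATK" Mutect2 \
  -R "$DIR/Homo_sapiens_assembly38.fasta" \
  -I "$DIR/AML_BWA_47_prnt_reads.bam" \
  --germline-resource "$GERMLINE" \
  --panel-of-normals "$PON" \
  -O "$DIR/AML_BWA_47_MuTect2.vcf"

# Dry run by default: print the command; run it only when RUN=1.
if [ "${RUN:-0}" = "1" ]; then
  "$@"
else
  echo "$*"
fi
```

The Mills_and_1000G gold-standard indels file is a known-sites resource rather than a panel of normals, which is why it is not passed to --panel-of-normals here. For exome data, -L would normally point to the capture kit's target intervals; that file is not named in the post, so it is left out of the sketch.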
Let me know how this goes.
Best,
Genevieve