Mutect2 reports a generic Java error and exits quickly.
I intend to run Mutect2 in tumor-normal paired mode. The command is adapted from the notebook at https://gatk.broadinstitute.org/hc/en-us/articles/360047232772--Notebook-Intro-to-using-Mutect2-for-somatic-data.
I have no idea what causes this error. I tried `chmod +777` on each file used by the command, but the run still failed.
Dependencies:
1. Both the PON and the germline resource were downloaded from Google buckets that I found elsewhere in the forum.
REQUIRED for all errors and issues:
a) GATK version used: gatk4-4.4.0.0-0
b) Exact command used:
Mutect2 -R /media/bioinfo/reference/human/hg38/bwamem2/hg38.fa \
-I results/PRJNA504942_WGS/align/bqsr/T34M54y_SRR9313682SRR9307284_bqsr.bam \
-I results/PRJNA504942_WGS/align/bqsr/N34M54y_SRR9313682SRR9307284_bqsr.bam \
-normal SRR9307284 -pon /media/bioinfo/reference/human/hg38/annotation/somaticcallings/1000g_pon.hg38.vcf.gz \
--germline-resource /media/bioinfo/reference/human/hg38/annotation/somaticcallings/af-only-gnomad.hg38.vcf.gz \
-L chr17 -O results/PRJNA504942_WGS/mutect2_TNpair/vcf/34M54y_SRR9313682SRR9307284_fragnorm.vcf
c) Entire program log:
Using GATK jar /media/bioinfo/anaconda/env/wgs/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /media/bioinfo/anaconda/env/wgs/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar Mutect2 -R /media/bioinfo/reference/human/hg38/bwamem2/hg38.fa -I results/PRJNA504942_WGS/align/bqsr/T34M54y_SRR9313682SRR9307284_bqsr.bam -I results/PRJNA504942_WGS/align/bqsr/N34M54y_SRR9313682SRR9307284_bqsr.bam -normal SRR9307284 -pon /media/bioinfo/reference/human/hg38/annotation/somaticcallings/1000g_pon.hg38.vcf.gz --germline-resource /media/bioinfo/reference/human/hg38/annotation/somaticcallings/af-only-gnomad.hg38.vcf.gz -L chr17 -O results/PRJNA504942_WGS/mutect2_TNpair/vcf/34M54y_SRR9313682SRR9307284_fragnorm.vcf
23:49:56.446 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/media/bioinfo/anaconda/env/wgs/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
23:49:56.463 INFO Mutect2 - ------------------------------------------------------------
23:49:56.465 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.4.0.0
23:49:56.465 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
23:49:56.465 INFO Mutect2 - Executing as caoyutao@YONGLab on Linux v6.5.0-15-generic amd64
23:49:56.465 INFO Mutect2 - Java runtime: OpenJDK 64-Bit Server VM v17.0.3-internal+0-adhoc..src
23:49:56.465 INFO Mutect2 - Start Date/Time: 2024年2月1日 CST 下午11:49:56
23:49:56.465 INFO Mutect2 - ------------------------------------------------------------
23:49:56.465 INFO Mutect2 - ------------------------------------------------------------
23:49:56.466 INFO Mutect2 - HTSJDK Version: 3.0.5
23:49:56.466 INFO Mutect2 - Picard Version: 3.0.0
23:49:56.466 INFO Mutect2 - Built for Spark Version: 3.3.1
23:49:56.466 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
23:49:56.466 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
23:49:56.466 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
23:49:56.467 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
23:49:56.467 INFO Mutect2 - Deflater: IntelDeflater
23:49:56.467 INFO Mutect2 - Inflater: IntelInflater
23:49:56.467 INFO Mutect2 - GCS max retries/reopens: 20
23:49:56.467 INFO Mutect2 - Requester pays: disabled
23:49:56.468 INFO Mutect2 - Initializing engine
23:49:56.585 INFO FeatureManager - Using codec VCFCodec to read file file:///media/bioinfo/reference/human/hg38/annotation/somaticcallings/1000g_pon.hg38.vcf.gz
23:49:56.648 INFO FeatureManager - Using codec VCFCodec to read file file:///media/bioinfo/reference/human/hg38/annotation/somaticcallings/af-only-gnomad.hg38.vcf.gz
23:49:56.682 INFO IntervalArgumentCollection - Processing 83257441 bp from intervals
23:49:56.696 INFO Mutect2 - Done initializing engine
23:49:56.709 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/media/bioinfo/anaconda/env/wgs/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
23:49:56.710 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/media/bioinfo/anaconda/env/wgs/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
23:49:56.717 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
23:49:56.718 INFO IntelPairHmm - Available threads: 16
23:49:56.718 INFO IntelPairHmm - Requested threads: 4
23:49:56.718 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
23:49:56.740 INFO ProgressMeter - Starting traversal
23:49:56.740 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
23:49:56.887 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
23:49:56.887 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
23:49:56.888 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec
23:49:56.889 INFO Mutect2 - Shutting down engine
[2024年2月1日 CST 下午11:49:56] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=285212672
java.lang.NullPointerException: Cannot invoke "Object.getClass()" because "this.comparator" is null
at htsjdk.samtools.ComparableSamRecordIterator.compareTo(ComparableSamRecordIterator.java:68)
at htsjdk.samtools.ComparableSamRecordIterator.compareTo(ComparableSamRecordIterator.java:36)
at java.base/java.util.PriorityQueue.siftUpComparable(PriorityQueue.java:647)
at java.base/java.util.PriorityQueue.siftUp(PriorityQueue.java:639)
at java.base/java.util.PriorityQueue.offer(PriorityQueue.java:330)
at htsjdk.samtools.MergingSamRecordIterator.addIfNotEmpty(MergingSamRecordIterator.java:161)
at htsjdk.samtools.MergingSamRecordIterator.<init>(MergingSamRecordIterator.java:94)
at org.broadinstitute.hellbender.engine.ReadsPathDataSource.prepareIteratorsForTraversal(ReadsPathDataSource.java:429)
at org.broadinstitute.hellbender.engine.ReadsPathDataSource.iterator(ReadsPathDataSource.java:336)
at org.broadinstitute.hellbender.engine.MultiIntervalLocalReadShard.iterator(MultiIntervalLocalReadShard.java:134)
at org.broadinstitute.hellbender.engine.AssemblyRegionIterator.<init>(AssemblyRegionIterator.java:88)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:188)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:173)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1098)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:149)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:217)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
-
Hello UTao Cao. This error looks like a genuine bug that we probably have to fix. It would help us run this to ground if you could test a few things for us. First, can you try running PrintReads on these inputs to see if the files can be read? Next, could you try running ValidateSamFile (https://gatk.broadinstitute.org/hc/en-us/articles/21905113856795-ValidateSamFile-Picard) on your input files to make sure they validate? Our particular concern is that there might be an ordering problem that is causing this exception.
If both of those pass, we would appreciate it if you could post the BAM headers for these files, as there might be an undefined or poorly defined sort order that is causing this exception. If that is the problem, then running SortSam on your input files first and then running them through Mutect2 would probably fix the issue.
-
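A minimal sketch of the header check and re-sort suggested above, assuming samtools and gatk are on the PATH. The function names header_sort_order and sort_if_needed are made up for illustration; SortSam is the Picard tool bundled with GATK. A header that declares a sort order does not prove the records are actually in that order, so this only covers the declared-order case.

```shell
# Read SAM header text on stdin and print the SO (sort order) value of the
# @HD line. Prints "missing" if there is no @HD line or it has no SO tag.
header_sort_order() {
  awk '
    /^@HD/ {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^SO:/) { print substr($i, 4); found = 1 }
      exit
    }
    END { if (!found) print "missing" }
  '
}

# Hypothetical helper: coordinate-sort a BAM with SortSam unless its header
# already declares SO:coordinate. Requires samtools and gatk on the PATH.
# Usage: sort_if_needed input.bam sorted.bam
sort_if_needed() {
  so=$(samtools view -H "$1" | header_sort_order)
  if [ "$so" = "coordinate" ]; then
    echo "$1 already declares SO:coordinate"
  else
    gatk SortSam -I "$1" -O "$2" --SORT_ORDER coordinate --CREATE_INDEX true
  fi
}
```

Usage would look like `sort_if_needed tumor.bam tumor.sorted.bam`, where both file names are placeholders.
-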
I am having a very similar problem, except that it takes more than 2 hours to show up.
In case it helps, I am running the script with:
openjdk 17.0.9 2023-10-17
OpenJDK Runtime Environment (build 17.0.9+9-Ubuntu-122.04)
OpenJDK 64-Bit Server VM (build 17.0.9+9-Ubuntu-122.04, mixed mode, sharing)
This is the script I am using (for which I got an output):
cd /home/user/genomics/gatk/
./gatk Mutect2 \
-R /home/user/data-02/reference/a_reference_genome.fna \
-I /home/user/data-02/2023_11_17_cell_line_12615/12615_sorted.bam \
-tumor 12615 \
-I /home/user/data-02/2023_11_13_cell_line_3684/3684_sorted.bam \
-normal 3684 \
-O /home/user/data-02/cells_comparison_results/mutect/somatic.vcf.gz
cd /home/user/data-02/scripts/
These are the results:
root@324579823:/home/user/data-02/cells_comparison_results/mutect# ls -lA
total 4896
-rw-r--r-- 1 root root 3883918 Feb 9 11:16 somatic.vcf.gz
-rw-r--r-- 1 root root 127610 Feb 9 11:16 somatic.vcf.gz.tbi
Here are the last few lines of the very long output I got:
11:14:59.874 INFO ProgressMeter - NW_003613811.1:1932257 129.8 2838510 21867.9
11:16:31.353 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 6.560276783000001
11:16:31.353 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 1693.8890707330002
11:16:31.353 INFO SmithWatermanAligner - Total compute time in native Smith-Waterman : 345.22 sec
11:16:31.353 INFO Mutect2 - Shutting down engine
[February 9, 2024 at 11:16:31 AM CET] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 131.37 minutes.
Runtime.totalMemory()=21021851648
java.lang.OutOfMemoryError: Java heap space
at htsjdk.samtools.SAMTextHeaderCodec$ParsedHeaderLine.<init>(SAMTextHeaderCodec.java:280)
at htsjdk.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:97)
at htsjdk.samtools.reference.ReferenceSequenceFileFactory.loadDictionary(ReferenceSequenceFileFactory.java:236)
at htsjdk.samtools.reference.AbstractFastaSequenceFile.findAndLoadSequenceDictionary(AbstractFastaSequenceFile.java:91)
at htsjdk.samtools.reference.AbstractFastaSequenceFile.lambda$new$9c19d50a$1(AbstractFastaSequenceFile.java:68)
at htsjdk.samtools.reference.AbstractFastaSequenceFile$$Lambda$225/0x00007fa42458fd20.get(Unknown Source)
at htsjdk.samtools.util.Lazy.get(Lazy.java:25)
at htsjdk.samtools.reference.AbstractFastaSequenceFile.getSequenceDictionary(AbstractFastaSequenceFile.java:140)
at htsjdk.samtools.reference.IndexedFastaSequenceFile.getSequenceDictionary(IndexedFastaSequenceFile.java:49)
at htsjdk.samtools.reference.AbstractIndexedFastaSequenceFile.<init>(AbstractIndexedFastaSequenceFile.java:67)
at htsjdk.samtools.reference.IndexedFastaSequenceFile.<init>(IndexedFastaSequenceFile.java:80)
at htsjdk.samtools.reference.IndexedFastaSequenceFile.<init>(IndexedFastaSequenceFile.java:98)
at htsjdk.samtools.reference.ReferenceSequenceFileFactory.getReferenceSequenceFile(ReferenceSequenceFileFactory.java:139)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:152)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:129)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:114)
at org.broadinstitute.hellbender.engine.ReferenceFileSource.<init>(ReferenceFileSource.java:35)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.makeStandardMutect2PostFilterReadTransformer(Mutect2Engine.java:205)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.makePostReadFilterTransformer(Mutect2.java:241)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:171)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1098)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:149)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:217)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:166)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:209)
at org.broadinstitute.hellbender.Main.main(Main.java:306)
Any idea what it may be?
-
Thanks for your suggestions. The input BAM files passed both the PrintReads and ValidateSamFile tests. The header is attached below.
@HD VN:1.6 SO:unknown GO:query
@SQ SN:chr1 LN:248956422
.....
@SQ SN:chrY_MU273398v1_fix LN:865743
@RG ID:SRR9217692 SM:SRR9217692 PL:ILLUMINA
@PG ID:bwa-mem2 PN:bwa-mem2 VN:2.2.1 CL:bwa-mem2 mem -t 5 -M -R @RG\tID:SRR9217692\tSM:SRR9217692\tPL:ILLUMINA /media/bioinfo/reference/human/hg38/bwamem2/hg38.fa ../noncodingHCC/results/PRJNA504942_WGS/cleanqc/SRR9217692_clean_1.fq.gz ../noncodingHCC/results/PRJNA504942_WGS/cleanqc/SRR9217692_clean_2.fq.gz
@PG ID:samtools PN:samtools PP:bwa-mem2 VN:1.17 CL:samtools sort -O bam -@ 5 -o ../noncodingHCC/results/PRJNA504942_WGS/align/SRR9217692_sort.bam
@PG ID:MarkDuplicates VN:Version:4.4.0.0 CL:MarkDuplicates --INPUT ../noncodingHCC/results/PRJNA504942_WGS/align/SRR9217692_sort.bam --OUTPUT ../noncodingHCC/results/PRJNA504942_WGS/align/markdup/SRR9217692_sort_markdup.bam --METRICS_FILE ../noncodingHCC/results/PRJNA504942_WGS/align/markdup/SRR9217692_markdup_metrics.txt --REMOVE_DUPLICATES false --ASSUME_SORT_ORDER queryname --OPTICAL_DUPLICATE_PIXEL_DISTANCE 2500 --TMP_DIR ../noncodingHCC/results/PRJNA504942_WGS/temp/markdup/SRR9217692 --VALIDATION_STRINGENCY SILENT --CREATE_MD5_FILE false --MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP 50000 --MAX_FILE_HANDLES_FOR_READ_ENDS_MAP 8000 --SORTING_COLLECTION_SIZE_RATIO 0.25 --TAG_DUPLICATE_SET_MEMBERS false --REMOVE_SEQUENCING_DUPLICATES false --TAGGING_POLICY DontTag --CLEAR_DT true --DUPLEX_UMI false --FLOW_MODE false --FLOW_QUALITY_SUM_STRATEGY false --USE_END_IN_UNPAIRED_READS false --USE_UNPAIRED_CLIPPED_END false --UNPAIRED_END_UNCERTAINTY 0 --FLOW_SKIP_FIRST_N_FLOWS 0 --FLOW_Q_IS_KNOWN_END false --FLOW_EFFECTIVE_QUALITY_THRESHOLD 15 --ADD_PG_TAG_TO_READS true --ASSUME_SORTED false --DUPLICATE_SCORING_STRATEGY SUM_OF_BASE_QUALITIES --PROGRAM_RECORD_ID MarkDuplicates --PROGRAM_GROUP_NAME MarkDuplicates --READ_NAME_REGEX <optimized capture of last three ':' separated fields as numeric values> --MAX_OPTICAL_DUPLICATE_SET_SIZE 300000 --VERBOSITY INFO --QUIET false --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false PN:MarkDuplicates
@PG ID:GATK ApplyBQSR VN:4.4.0.0 CL:ApplyBQSR --output ../noncodingHCC/results/PRJNA504942_WGS/align/bqsr/SRR9217692_sort_markdup_bqsr.bam --bqsr-recal-file ../noncodingHCC/results/PRJNA504942_WGS/align/bqsr/SRR9217692_bqsr.table --use-original-qualities true --static-quantized-quals 10 --static-quantized-quals 20 --static-quantized-quals 30 --input ../noncodingHCC/results/PRJNA504942_WGS/align/markdup/SRR9217692_sort_markdup.bam --reference /media/bioinfo/reference/human/hg38/bwamem2/hg38.fa --create-output-bam-index true --create-output-bam-md5 true --add-output-sam-program-record true --preserve-qscores-less-than 6 --quantize-quals 0 --round-down-quantized false --emit-original-quals false --global-qscore-prior -1.0 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-variant-index true --create-output-variant-md5 false --max-variants-per-shard 0 --lenient false --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false PN:GATK ApplyBQSR
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.17 CL:samtools view --header-only -o log/nohup_reportbug_headsoftumor.txt results/PRJNA504942_WGS/align/bqsr/T33M77y_SRR9217692SRR9208219_bqsr.bam
-
Hi UTao Cao, looking at your header, it might be a sorting issue, as James had previously suspected. I see that the sort order is unknown, and it is possible that your file is not coordinate-sorted. Therefore, my suggestion would be to sort your SAM/BAM file and retry.
gabriele tosadori, this looks like a different issue: your program ran out of Java heap memory, so I would retry with more memory (e.g. -Xmx64g).
-
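For reference, the gatk wrapper script takes JVM settings through its --java-options argument. The sketch below only prints the command line so that it can be inspected before running. mutect2_with_heap is a hypothetical helper, and 64g is simply the value suggested above, not a recommendation for any particular machine.

```shell
# Hypothetical helper: print (do not run) a Mutect2 command line with an
# explicit JVM heap limit passed through the gatk wrapper's --java-options.
# Usage: mutect2_with_heap <heap size, e.g. 64g> <Mutect2 arguments...>
mutect2_with_heap() {
  heap="$1"
  shift
  printf 'gatk --java-options "-Xmx%s" Mutect2' "$heap"
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
  printf '\n'
}
```

For example, `mutect2_with_heap 64g -R ref.fa -I tumor.bam -O out.vcf.gz` prints a command that can be copied into the script; the file names here are placeholders.
-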
Can Kockan, yes, it worked indeed. Well, at least I think it did. Is it correct if the last line Mutect2 prints is something like this:
10:13:08.683 INFO ProgressMeter - NW_003613915.1:1166429 147.3 3587110 24347.4
I have no idea what to expect from the standard output. I was actually expecting something like "mutect finished". So, can I assume it's done?
-
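One way to check rather than guess, sketched under the assumption that the run's output was redirected to a log file. In both failing logs posted in this thread, the "done. Elapsed time" line was printed and the Java stack trace came after it, so that line alone does not show success. log_has_java_error is a hypothetical helper that looks for a Java exception or error line; the shell exit status of the gatk command remains the primary signal.

```shell
# Typical capture of log and exit status (placeholders, not run here):
#   gatk Mutect2 ... > mutect2.log 2>&1; echo "exit status: $?"

# Hypothetical helper: read a log on stdin and return 0 if it contains a line
# that starts with a Java exception or error, non-zero otherwise.
log_has_java_error() {
  grep -Eq '^(Exception in thread|java\.[A-Za-z.]+(Exception|Error))'
}
```

Usage would be `log_has_java_error < mutect2.log && echo "run failed"`, where mutect2.log is a placeholder name.
-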
gabriele tosadori, that could still be an early termination; I'd expect an exit status as well. I'd check the output VCF to make sure, but I strongly suspect that this is similar to the following issue:
See the last comment in that post by Louis Bergelson, where he recommends also leaving some memory for non-heap use, which might help fix the issue.
-
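A small sketch of that sizing idea, assuming the machine's total RAM in GB is known. The 4 GB default reserve is an arbitrary illustrative number, not a value taken from the linked post, and heap_flag_for is a hypothetical helper.

```shell
# Hypothetical helper: derive a -Xmx flag that leaves some RAM outside the
# Java heap. Arguments: total RAM in GB, optional reserve in GB (default 4,
# an illustrative assumption). The heap is never set below 1 GB.
heap_flag_for() {
  total_gb="$1"
  reserve_gb="${2:-4}"
  heap_gb=$(( total_gb - reserve_gb ))
  if [ "$heap_gb" -lt 1 ]; then
    heap_gb=1
  fi
  printf '%s\n' "-Xmx${heap_gb}g"
}
```

The result can then be passed on, for example as `gatk --java-options "$(heap_flag_for 64)" Mutect2 ...`.
-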
Can Kockan, the command used for sorting is:
@PG ID:samtools PN:samtools PP:bwa-mem2 VN:1.17 CL:samtools sort -O bam -@ 5 -o ../noncodingHCC/results/PRJNA504942_WGS/align/SRR9217692_sort.bam
So the _sort.bam file should be coordinate-sorted. (I also checked the first several lines of the BAM files.)
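Checking more than the first few lines can be sketched as follows, assuming samtools is available to stream the records. records_coordinate_sorted is a hypothetical helper. It only tests that positions never decrease within a contig and that no contig reappears later; it does not compare the contig order against the @SQ lines.

```shell
# Hypothetical helper: read SAM records (no header lines) on stdin and print
# "sorted" or "unsorted". Records with RNAME "*" (unmapped) are skipped.
records_coordinate_sorted() {
  awk -F'\t' '
    $3 == "*" { next }
    $3 != chrom {
      if ($3 in seen) { bad = 1; exit }   # contig seen before: out of order
      seen[$3] = 1; chrom = $3; pos = 0
    }
    $4 + 0 < pos { bad = 1; exit }        # position went backwards
    { pos = $4 + 0 }
    END { print (bad ? "unsorted" : "sorted") }
  '
}
```

Usage would be along the lines of `samtools view input.bam | head -n 1000000 | records_coordinate_sorted`, with input.bam as a placeholder.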