Filtering Mutect output from FFPE-based targeted panel sequencing data
Hello,
I am using the GATK best practices workflow to find somatic variants in a set of tumour samples. The tissue was from FFPE and targeted panel sequencing was done on ~200 genes. There were no matched normals.
My question: I am struggling to figure out why so few variants seem to be filtered out during the filtration step. I expect a large number of false positives because of the source material, but I also expected that a great many of them would be filtered out for low depth, poor read quality, etc. My suspicion is that I have misconfigured one of the commands, but despite returning to the docs several times I am having trouble identifying the issue.
GATK version:
The Genome Analysis Toolkit (GATK) v4.2.0.0
HTSJDK Version: 2.24.0
Picard Version: 2.25.0
Commands used (removed full paths for ease of reading):
gatk BaseRecalibrator -I CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.bam -R human/GRCh37-lite.fa --known-sites forMutect2/1000G_phase1.snps.high_confidence.b37.vcf.gz --known-sites forMutect2/Mills_and_1000G_gold_standard.indels.b37.vcf.gz --known-sites forMutect2/dbsnp_138.b37.excluding_sites_after_129.vcf.gz -O CC-CHM-1347.qiaseq.recal_data.table
gatk ApplyBQSR -R human/GRCh37-lite.fa -I CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.bam --bqsr-recal-file CC-CHM-1347.qiaseq.recal_data.table -O CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.recal.bam
gatk BaseRecalibrator -I CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.recal.bam -R human/GRCh37-lite.fa --known-sites forMutect2/1000G_phase1.snps.high_confidence.b37.vcf.gz --known-sites forMutect2/Mills_and_1000G_gold_standard.indels.b37.vcf.gz --known-sites forMutect2/dbsnp_138.b37.excluding_sites_after_129.vcf.gz -O CC-CHM-1347.qiaseq.recal_data.post.table
gatk AnalyzeCovariates -before CC-CHM-1347.qiaseq.recal_data.table -after CC-CHM-1347.qiaseq.recal_data.post.table -csv CC-CHM-1347.qiaseq.AnalyzeCovariates.csv
Rscript_4.1 bin/BQSR.R CC-CHM-1347.qiaseq.AnalyzeCovariates.csv CC-CHM-1347.qiaseq.recal_data.table CC-CHM-1347.qiaseq.AnalyzeCovariates.pdf
samtools index -@ 24 CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.recal.bam
java -jar picard.jar ValidateSamFile IGNORE_WARNINGS=true MODE=VERBOSE I=CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.recal.bam
gatk Mutect2 -R human/GRCh37-lite.fa -I CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.recal.bam --germline-resource forMutect2/af-only-gnomad.raw.sites.vcf --panel-of-normals forMutect2/Mutect2-WGS-panel-b37.vcf --f1r2-tar-gz CC-CHM-1347.qiaseq.f1r2.tar.gz -O CC-CHM-1347.qiaseq.vcf.gz
gatk LearnReadOrientationModel -I CC-CHM-1347.qiaseq.f1r2.tar.gz -O CC-CHM-1347.qiaseq.read-orientation-model.tar.gz --num-em-iterations 50
gatk GetPileupSummaries -I CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.recal.bam -V forMutect2/small_exac_common_3.vcf -L forMutect2/small_exac_common_3.vcf -O CC-CHM-1347.qiaseq.getpileupsummaries.table
gatk CalculateContamination -I CC-CHM-1347.qiaseq.getpileupsummaries.table -tumor-segmentation CC-CHM-1347.qiaseq.segments.table -O CC-CHM-1347.qiaseq.contamination.table
gatk FilterMutectCalls -R human/GRCh37-lite.fa -V CC-CHM-1347.qiaseq.vcf.gz --tumor-segmentation CC-CHM-1347.qiaseq.segments.table --contamination-table CC-CHM-1347.qiaseq.contamination.table --ob-priors CC-CHM-1347.qiaseq.read-orientation-model.tar.gz -O CC-CHM-1347.qiaseq.filtered.vcf
# gatk Funcotator --variant CC-CHM-1347.qiaseq.filtered.vcf --reference human/GRCh37-lite.fa --ref-version hg19 --data-sources-path funcotator_dataSources.v1.7.20200521s --output CC-CHM-1347.qiaseq.filtered.funcotated.vcf --output-file-format VCF --force-b37-to-hg19-reference-contig-conversion
The log from the end of the Mutect2 command onward is below. If I can provide anything else that would help identify where I've gone wrong, please let me know!
Thanks!
15:59:03.068 INFO Mutect2 - 1404 read(s) filtered by: MappingQualityReadFilter
0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappingQualityNotZeroReadFilter
0 read(s) filtered by: MappedReadFilter
0 read(s) filtered by: NotSecondaryAlignmentReadFilter
0 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: NonChimericOriginalAlignmentReadFilter
0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
111743 read(s) filtered by: ReadLengthReadFilter
1094 read(s) filtered by: GoodCigarReadFilter
0 read(s) filtered by: WellformedReadFilter
114241 total reads filtered
15:59:03.068 INFO ProgressMeter - Y:59370601 37.4 10330799 275908.5
15:59:03.068 INFO ProgressMeter - Traversal complete. Processed 10330799 total regions in 37.4 minutes.
15:59:04.052 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 1.578157027
15:59:04.052 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 642.709457097
15:59:04.052 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 208.17 sec
15:59:04.052 INFO Mutect2 - Shutting down engine
[July 25, 2022 3:59:04 PDT PM] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 37.61 minutes.
Runtime.totalMemory()=4148690944
Tool returned:
SUCCESS
Using GATK jar /gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar LearnReadOrientationModel -I /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.f1r2.tar.gz -O /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.read-orientation-model.tar.gz --num-em-iterations 50
15:59:07.327 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 25, 2022 3:59:07 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
15:59:07.475 INFO LearnReadOrientationModel - ------------------------------------------------------------
15:59:07.476 INFO LearnReadOrientationModel - The Genome Analysis Toolkit (GATK) v4.2.0.0
15:59:07.476 INFO LearnReadOrientationModel - For support and documentation go to https://software.broadinstitute.org/gatk/
15:59:07.476 INFO LearnReadOrientationModel - Executing as madouglas@n315 on Linux v3.10.0-957.5.1.el7.x86_64 amd64
15:59:07.476 INFO LearnReadOrientationModel - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_172-b11
15:59:07.476 INFO LearnReadOrientationModel - Start Date/Time: July 25, 2022 3:59:07 PDT PM
15:59:07.476 INFO LearnReadOrientationModel - ------------------------------------------------------------
15:59:07.476 INFO LearnReadOrientationModel - ------------------------------------------------------------
15:59:07.477 INFO LearnReadOrientationModel - HTSJDK Version: 2.24.0
15:59:07.477 INFO LearnReadOrientationModel - Picard Version: 2.25.0
15:59:07.477 INFO LearnReadOrientationModel - Built for Spark Version: 2.4.5
15:59:07.477 INFO LearnReadOrientationModel - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:59:07.477 INFO LearnReadOrientationModel - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:59:07.477 INFO LearnReadOrientationModel - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:59:07.477 INFO LearnReadOrientationModel - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:59:07.477 INFO LearnReadOrientationModel - Deflater: IntelDeflater
15:59:07.477 INFO LearnReadOrientationModel - Inflater: IntelInflater
15:59:07.477 INFO LearnReadOrientationModel - GCS max retries/reopens: 20
15:59:07.477 INFO LearnReadOrientationModel - Requester pays: disabled
15:59:07.477 INFO LearnReadOrientationModel - Initializing engine
15:59:07.477 INFO LearnReadOrientationModel - Done initializing engine
15:59:07.483 INFO IOUtils - Extracting data from archive: file:///projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.f1r2.tar.gz
15:59:07.492 INFO IOUtils - Extracting file: ./CC-CHM-1347.alt_table
15:59:07.532 INFO IOUtils - Extracting file: ./CC-CHM-1347.ref_histogram
15:59:07.532 INFO IOUtils - Extracting file: ./CC-CHM-1347.alt_histogram
15:59:10.196 INFO LearnReadOrientationModel - Context AAC: with 36872 ref and 13053 alt examples, EM converged in 9 steps
15:59:12.407 INFO LearnReadOrientationModel - Context AAG: with 56032 ref and 18031 alt examples, EM converged in 8 steps
15:59:13.803 INFO LearnReadOrientationModel - Context AAT: with 59386 ref and 10784 alt examples, EM converged in 9 steps
15:59:14.550 INFO LearnReadOrientationModel - Context ACA: with 62776 ref and 6123 alt examples, EM converged in 7 steps
15:59:15.113 INFO LearnReadOrientationModel - Context ACC: with 45649 ref and 5334 alt examples, EM converged in 6 steps
15:59:15.738 INFO LearnReadOrientationModel - Context ACG: with 12885 ref and 4100 alt examples, EM converged in 7 steps
15:59:16.394 INFO LearnReadOrientationModel - Context ACT: with 51257 ref and 4650 alt examples, EM converged in 7 steps
15:59:17.160 INFO LearnReadOrientationModel - Context AGA: with 71480 ref and 6352 alt examples, EM converged in 7 steps
15:59:17.786 INFO LearnReadOrientationModel - Context AGC: with 56877 ref and 6538 alt examples, EM converged in 6 steps
15:59:18.559 INFO LearnReadOrientationModel - Context AGG: with 70196 ref and 8206 alt examples, EM converged in 6 steps
15:59:19.265 INFO LearnReadOrientationModel - Context ATA: with 48242 ref and 5570 alt examples, EM converged in 8 steps
15:59:20.367 INFO LearnReadOrientationModel - Context ATC: with 36539 ref and 9838 alt examples, EM converged in 7 steps
15:59:21.598 INFO LearnReadOrientationModel - Context ATG: with 51569 ref and 10801 alt examples, EM converged in 7 steps
15:59:22.793 INFO LearnReadOrientationModel - Context CAA: with 50594 ref and 11953 alt examples, EM converged in 7 steps
15:59:24.638 INFO LearnReadOrientationModel - Context CAC: with 46317 ref and 15401 alt examples, EM converged in 8 steps
15:59:26.762 INFO LearnReadOrientationModel - Context CAG: with 72743 ref and 20999 alt examples, EM converged in 7 steps
15:59:27.924 INFO LearnReadOrientationModel - Context CCA: with 68983 ref and 10067 alt examples, EM converged in 7 steps
15:59:28.785 INFO LearnReadOrientationModel - Context CCC: with 65909 ref and 9383 alt examples, EM converged in 6 steps
15:59:29.806 INFO LearnReadOrientationModel - Context CCG: with 20539 ref and 8359 alt examples, EM converged in 7 steps
15:59:30.450 INFO LearnReadOrientationModel - Context CGA: with 12097 ref and 4240 alt examples, EM converged in 7 steps
15:59:31.274 INFO LearnReadOrientationModel - Context CGC: with 18056 ref and 5994 alt examples, EM converged in 7 steps
15:59:32.041 INFO LearnReadOrientationModel - Context CTA: with 31225 ref and 5847 alt examples, EM converged in 8 steps
15:59:33.924 INFO LearnReadOrientationModel - Context CTC: with 53064 ref and 17512 alt examples, EM converged in 7 steps
15:59:35.868 INFO LearnReadOrientationModel - Context GAA: with 55152 ref and 16253 alt examples, EM converged in 8 steps
15:59:37.401 INFO LearnReadOrientationModel - Context GAC: with 29796 ref and 12516 alt examples, EM converged in 8 steps
15:59:38.082 INFO LearnReadOrientationModel - Context GCA: with 56641 ref and 7099 alt examples, EM converged in 6 steps
15:59:38.853 INFO LearnReadOrientationModel - Context GCC: with 59128 ref and 8146 alt examples, EM converged in 6 steps
15:59:39.604 INFO LearnReadOrientationModel - Context GGA: with 60397 ref and 7446 alt examples, EM converged in 6 steps
15:59:40.488 INFO LearnReadOrientationModel - Context GTA: with 31336 ref and 7216 alt examples, EM converged in 8 steps
15:59:41.331 INFO LearnReadOrientationModel - Context TAA: with 53717 ref and 6537 alt examples, EM converged in 9 steps
15:59:41.981 INFO LearnReadOrientationModel - Context TCA: with 62438 ref and 6420 alt examples, EM converged in 6 steps
15:59:44.325 INFO LearnReadOrientationModel - Context AAA: with 95158 ref and 19253 alt examples, EM converged in 9 steps
15:59:44.349 INFO LearnReadOrientationModel - Shutting down engine
[July 25, 2022 3:59:44 PDT PM] org.broadinstitute.hellbender.tools.walkers.readorientation.LearnReadOrientationModel done. Elapsed time: 0.62 minutes.
Runtime.totalMemory()=4347396096
Tool returned:
SUCCESS
Using GATK jar /gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar GetPileupSummaries -I /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.reheadered.RG.fixmate.sorted.recal.bam -V /projects/molonc/huntsman_lab/share/hg19_genome/forMutect2/small_exac_common_3.vcf -L /projects/molonc/huntsman_lab/share/hg19_genome/forMutect2/small_exac_common_3.vcf -O /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.getpileupsummaries.table
15:59:47.154 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 25, 2022 3:59:47 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
15:59:47.327 INFO GetPileupSummaries - ------------------------------------------------------------
15:59:47.327 INFO GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.2.0.0
15:59:47.327 INFO GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
15:59:47.327 INFO GetPileupSummaries - Executing as madouglas@n315 on Linux v3.10.0-957.5.1.el7.x86_64 amd64
15:59:47.327 INFO GetPileupSummaries - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_172-b11
15:59:47.328 INFO GetPileupSummaries - Start Date/Time: July 25, 2022 3:59:47 PDT PM
15:59:47.328 INFO GetPileupSummaries - ------------------------------------------------------------
15:59:47.328 INFO GetPileupSummaries - ------------------------------------------------------------
15:59:47.328 INFO GetPileupSummaries - HTSJDK Version: 2.24.0
15:59:47.328 INFO GetPileupSummaries - Picard Version: 2.25.0
15:59:47.328 INFO GetPileupSummaries - Built for Spark Version: 2.4.5
15:59:47.328 INFO GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:59:47.328 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:59:47.328 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:59:47.328 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:59:47.328 INFO GetPileupSummaries - Deflater: IntelDeflater
15:59:47.328 INFO GetPileupSummaries - Inflater: IntelInflater
15:59:47.329 INFO GetPileupSummaries - GCS max retries/reopens: 20
15:59:47.329 INFO GetPileupSummaries - Requester pays: disabled
15:59:47.329 INFO GetPileupSummaries - Initializing engine
15:59:47.704 INFO FeatureManager - Using codec VCFCodec to read file file:///projects/molonc/huntsman_lab/share/hg19_genome/forMutect2/small_exac_common_3.vcf
15:59:47.859 INFO FeatureManager - Using codec VCFCodec to read file file:///projects/molonc/huntsman_lab/share/hg19_genome/forMutect2/small_exac_common_3.vcf
15:59:48.170 INFO IntervalArgumentCollection - Processing 60040 bp from intervals
15:59:48.190 INFO GetPileupSummaries - Done initializing engine
15:59:48.191 INFO ProgressMeter - Starting traversal
15:59:48.191 INFO ProgressMeter - Current Locus Elapsed Minutes Loci Processed Loci/Minute
15:59:57.997 INFO GetPileupSummaries - 0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappingQualityNotZeroReadFilter
0 read(s) filtered by: MappedReadFilter
0 read(s) filtered by: PrimaryLineReadFilter
0 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
0 read(s) filtered by: MateOnSameContigOrNoMappedMateReadFilter
57 read(s) filtered by: GoodCigarReadFilter
0 read(s) filtered by: WellformedReadFilter
57 total reads filtered
15:59:57.998 INFO ProgressMeter - unmapped 0.2 765 4680.8
15:59:57.998 INFO ProgressMeter - Traversal complete. Processed 765 total loci in 0.2 minutes.
15:59:58.023 INFO GetPileupSummaries - Shutting down engine
[July 25, 2022 3:59:58 PDT PM] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 0.18 minutes.
Runtime.totalMemory()=3153592320
Tool returned:
SUCCESS
Using GATK jar /gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar CalculateContamination -I /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.getpileupsummaries.table -tumor-segmentation /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.segments.table -O /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.contamination.table
16:00:00.638 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 25, 2022 4:00:00 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
16:00:00.788 INFO CalculateContamination - ------------------------------------------------------------
16:00:00.789 INFO CalculateContamination - The Genome Analysis Toolkit (GATK) v4.2.0.0
16:00:00.789 INFO CalculateContamination - For support and documentation go to https://software.broadinstitute.org/gatk/
16:00:00.789 INFO CalculateContamination - Executing as madouglas@n315 on Linux v3.10.0-957.5.1.el7.x86_64 amd64
16:00:00.789 INFO CalculateContamination - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_172-b11
16:00:00.789 INFO CalculateContamination - Start Date/Time: July 25, 2022 4:00:00 PDT PM
16:00:00.789 INFO CalculateContamination - ------------------------------------------------------------
16:00:00.789 INFO CalculateContamination - ------------------------------------------------------------
16:00:00.790 INFO CalculateContamination - HTSJDK Version: 2.24.0
16:00:00.790 INFO CalculateContamination - Picard Version: 2.25.0
16:00:00.790 INFO CalculateContamination - Built for Spark Version: 2.4.5
16:00:00.790 INFO CalculateContamination - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:00:00.790 INFO CalculateContamination - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:00:00.790 INFO CalculateContamination - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:00:00.790 INFO CalculateContamination - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:00:00.790 INFO CalculateContamination - Deflater: IntelDeflater
16:00:00.790 INFO CalculateContamination - Inflater: IntelInflater
16:00:00.790 INFO CalculateContamination - GCS max retries/reopens: 20
16:00:00.790 INFO CalculateContamination - Requester pays: disabled
16:00:00.791 INFO CalculateContamination - Initializing engine
16:00:00.791 INFO CalculateContamination - Done initializing engine
16:00:00.853 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (1) to segment; using all data points to calculate kernel matrix.
16:00:00.879 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (1). Local changepoint costs will not be calculated for this window size.
16:00:00.879 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.887 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.891 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (2) to segment; using all data points to calculate kernel matrix.
16:00:00.891 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (2). Local changepoint costs will not be calculated for this window size.
16:00:00.891 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.892 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.892 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (2) to segment; using all data points to calculate kernel matrix.
16:00:00.892 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (2). Local changepoint costs will not be calculated for this window size.
16:00:00.892 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.893 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.893 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (1) to segment; using all data points to calculate kernel matrix.
16:00:00.893 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (1). Local changepoint costs will not be calculated for this window size.
16:00:00.894 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.894 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.894 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (5) to segment; using all data points to calculate kernel matrix.
16:00:00.897 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (5). Local changepoint costs will not be calculated for this window size.
16:00:00.897 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.898 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.898 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (1) to segment; using all data points to calculate kernel matrix.
16:00:00.899 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (1). Local changepoint costs will not be calculated for this window size.
16:00:00.899 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.899 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.899 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (6) to segment; using all data points to calculate kernel matrix.
16:00:00.900 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (6). Local changepoint costs will not be calculated for this window size.
16:00:00.900 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.900 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.901 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (3) to segment; using all data points to calculate kernel matrix.
16:00:00.901 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (3). Local changepoint costs will not be calculated for this window size.
16:00:00.901 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.901 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.902 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (1) to segment; using all data points to calculate kernel matrix.
16:00:00.902 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (1). Local changepoint costs will not be calculated for this window size.
16:00:00.902 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.903 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.903 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (5) to segment; using all data points to calculate kernel matrix.
16:00:00.904 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (5). Local changepoint costs will not be calculated for this window size.
16:00:00.904 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.904 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.905 WARN KernelSegmenter - Specified dimension of the kernel approximation (100) exceeds the number of data points (4) to segment; using all data points to calculate kernel matrix.
16:00:00.906 WARN KernelSegmenter - Number of points needed to calculate local changepoint costs (2 * window size = 100) exceeds number of data points (4). Local changepoint costs will not be calculated for this window size.
16:00:00.906 WARN KernelSegmenter - No changepoint candidates were found. The specified window sizes may be inappropriate, or there may be insufficient data points.
16:00:00.906 INFO KernelSegmenter - Found 0 changepoints after applying the changepoint penalty.
16:00:00.987 INFO CalculateContamination - Shutting down engine
[July 25, 2022 4:00:00 PDT PM] org.broadinstitute.hellbender.tools.walkers.contamination.CalculateContamination done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2250768384
Tool returned:
SUCCESS
Using GATK jar /gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar FilterMutectCalls -R /projects/molonc/huntsman_lab/madouglas/bin/reference_genomes/dlp_refdata/human/GRCh37-lite.fa -V /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.vcf.gz --tumor-segmentation /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.segments.table --contamination-table /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.contamination.table --ob-priors /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.read-orientation-model.tar.gz -O /projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.filtered.vcf
16:00:03.727 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gsc/software/linux-x86_64-centos7/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 25, 2022 4:00:03 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
16:00:03.875 INFO FilterMutectCalls - ------------------------------------------------------------
16:00:03.876 INFO FilterMutectCalls - The Genome Analysis Toolkit (GATK) v4.2.0.0
16:00:03.876 INFO FilterMutectCalls - For support and documentation go to https://software.broadinstitute.org/gatk/
16:00:03.876 INFO FilterMutectCalls - Executing as madouglas@n315 on Linux v3.10.0-957.5.1.el7.x86_64 amd64
16:00:03.876 INFO FilterMutectCalls - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_172-b11
16:00:03.876 INFO FilterMutectCalls - Start Date/Time: July 25, 2022 4:00:03 PDT PM
16:00:03.876 INFO FilterMutectCalls - ------------------------------------------------------------
16:00:03.876 INFO FilterMutectCalls - ------------------------------------------------------------
16:00:03.877 INFO FilterMutectCalls - HTSJDK Version: 2.24.0
16:00:03.877 INFO FilterMutectCalls - Picard Version: 2.25.0
16:00:03.877 INFO FilterMutectCalls - Built for Spark Version: 2.4.5
16:00:03.877 INFO FilterMutectCalls - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:00:03.877 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:00:03.877 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:00:03.877 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:00:03.877 INFO FilterMutectCalls - Deflater: IntelDeflater
16:00:03.877 INFO FilterMutectCalls - Inflater: IntelInflater
16:00:03.877 INFO FilterMutectCalls - GCS max retries/reopens: 20
16:00:03.877 INFO FilterMutectCalls - Requester pays: disabled
16:00:03.877 INFO FilterMutectCalls - Initializing engine
16:00:04.254 INFO FeatureManager - Using codec VCFCodec to read file file:///projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.vcf.gz
16:00:04.322 INFO FilterMutectCalls - Done initializing engine
16:00:04.397 INFO IOUtils - Extracting data from archive: file:///projects/molonc/scratch/madouglas/cn_signatures/targeted_panel_seq/aligned/CC-CHM-1347/CC-CHM-1347.qiaseq.read-orientation-model.tar.gz
16:00:04.405 INFO IOUtils - Extracting file: ./CC-CHM-1347.orientation_priors
16:00:04.443 INFO ProgressMeter - Starting traversal
16:00:04.443 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
16:00:04.444 INFO FilterMutectCalls - Starting pass 0 through the variants
16:00:07.846 INFO FilterMutectCalls - Finished pass 0 through the variants
16:00:08.075 INFO FilterMutectCalls - Starting pass 1 through the variants
16:00:10.311 INFO FilterMutectCalls - Finished pass 1 through the variants
16:00:10.373 INFO FilterMutectCalls - Starting pass 2 through the variants
16:00:12.565 INFO FilterMutectCalls - Finished pass 2 through the variants
16:00:12.572 INFO FilterMutectCalls - Starting pass 3 through the variants
16:00:14.592 INFO ProgressMeter - 13:28610087 0.2 43000 254212.2
16:00:15.688 INFO FilterMutectCalls - Finished pass 3 through the variants
16:00:15.697 INFO FilterMutectCalls - No variants filtered by: AllowAllVariantsVariantFilter
16:00:15.697 INFO FilterMutectCalls - 0 read(s) filtered by: AllowAllReadsReadFilter
16:00:15.698 INFO ProgressMeter - X:15833999 0.2 47776 254714.8
16:00:15.698 INFO ProgressMeter - Traversal complete. Processed 47776 total variants in 0.2 minutes.
16:00:15.825 INFO FilterMutectCalls - Shutting down engine
[July 25, 2022 4:00:15 PDT PM] org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls done. Elapsed time: 0.20 minutes.
Runtime.totalMemory()=2911371264
-
Hi Maxwell Douglas,
It is challenging to detect true variants when you are running tumor-only mode with FFPE data. The lack of a matched normal and the FFPE source material are two separate challenges, and both make it harder for FilterMutectCalls to succeed. I would expect that many of your variants are false positives, but also that FilterMutectCalls will have a hard time determining which are false positives and which are true positives.
We are continuing to improve Mutect2, so in the future you may find more success. Right now it looks like you are running the commands correctly; you are just up against a challenging data set. You can read more in this comment from David Benjamin, one of our Mutect2 developer leads: https://gatk.broadinstitute.org/hc/en-us/community/posts/360057810051/comments/360010970771
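As a quick check on what the filtering step actually did, it can help to tally the FILTER column of the filtered VCF. FilterMutectCalls keeps every record in its output and marks failing calls in that column, so the number of records alone does not show how many calls were filtered. Below is a minimal sketch, assuming a POSIX shell with awk and an uncompressed VCF on standard input; the function name `count_filters` is hypothetical and not part of GATK.

```shell
#!/bin/sh
# count_filters: hypothetical helper, not a GATK tool.
# Reads a VCF on stdin and prints "<filter name><TAB><count>" per line.
# Header lines (starting with '#') are skipped. A record that carries
# several filters (joined by ';' in column 7) is counted once per filter.
count_filters() {
    awk -F '\t' '
        !/^#/ {
            k = split($7, parts, ";")
            for (i = 1; i <= k; i++) n[parts[i]]++
        }
        END { for (f in n) printf "%s\t%d\n", f, n[f] }
    ' | LC_ALL=C sort
}

# Example call, using the output file name from the commands in the question:
# count_filters < CC-CHM-1347.qiaseq.filtered.vcf
```

A large PASS count relative to the other rows would suggest the filters are letting most calls through; a small one would suggest the calls are being marked but not removed from the file.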
Let me know if you have any further questions.
Best regards,
Genevieve
-
Hi Maxwell,
We haven't heard from you in a while so we're going to close out this ticket in our system. If you still require assistance, simply respond to this thread and we'll be happy to pick up where we left off!
Kind regards,
Genevieve