GetPileupSummaries all outputs zero
When I run GetPileupSummaries with a small example bam, all the output columns are zero, even though I know that the bam contains a read that overlaps the variants. This seems to be the same issue as in these two tickets:
But neither of those issues were resolved.
a) GATK version used:
The Genome Analysis Toolkit (GATK) v4.5.0.0
Java runtime: OpenJDK 64-Bit Server VM v21.0.2+13-LTS
b) Exact command used:
gatk GetPileupSummaries --input test.bam --variant small_exac_common_3.hg38.vcf.gz --output test.table --reference chr1.fa -max-af 0.3 --intervals test.bed
c) Entire program log:
Using GATK jar /Users/Steven.Strong/Downloads/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /Users/Steven.Strong/Downloads/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar GetPileupSummaries --input test.bam --variant small_exac_common_3.hg38.vcf.gz --output test.table --reference chr1.fa -max-af 0.3 --intervals test.bed
14:05:24.839 INFO NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/Steven.Strong/Downloads/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.dylib
14:05:24.862 WARN NativeLibraryLoader - Unable to load libgkl_compression.dylib from native/libgkl_compression.dylib (/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression16741880035170389975.dylib: dlopen(/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression16741880035170389975.dylib, 0x0001): tried: '/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression16741880035170389975.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e' or 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression16741880035170389975.dylib' (no such file), '/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression16741880035170389975.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e' or 'arm64')))
14:05:24.864 INFO NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/Steven.Strong/Downloads/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.dylib
14:05:24.866 WARN NativeLibraryLoader - Unable to load libgkl_compression.dylib from native/libgkl_compression.dylib (/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression1128659698013723509.dylib: dlopen(/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression1128659698013723509.dylib, 0x0001): tried: '/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression1128659698013723509.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e' or 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression1128659698013723509.dylib' (no such file), '/private/var/folders/5p/6168bwcn7lzcjvzl3kdkc52h0000gp/T/libgkl_compression1128659698013723509.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e' or 'arm64')))
14:05:24.927 INFO GetPileupSummaries - ------------------------------------------------------------
14:05:24.929 INFO GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.5.0.0
14:05:24.929 INFO GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
14:05:24.930 INFO GetPileupSummaries - Executing as Steven.Strong@MacBook-Pro-K7T94FCQK2.local on Mac OS X v14.4.1 aarch64
14:05:24.930 INFO GetPileupSummaries - Java runtime: OpenJDK 64-Bit Server VM v21.0.2+13-LTS
14:05:24.930 INFO GetPileupSummaries - Start Date/Time: April 15, 2024, 2:05:24 PM MDT
14:05:24.930 INFO GetPileupSummaries - ------------------------------------------------------------
14:05:24.930 INFO GetPileupSummaries - ------------------------------------------------------------
14:05:24.930 INFO GetPileupSummaries - HTSJDK Version: 4.1.0
14:05:24.930 INFO GetPileupSummaries - Picard Version: 3.1.1
14:05:24.931 INFO GetPileupSummaries - Built for Spark Version: 3.5.0
14:05:24.931 INFO GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
14:05:24.931 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
14:05:24.931 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
14:05:24.931 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
14:05:24.931 INFO GetPileupSummaries - Deflater: JdkDeflater
14:05:24.931 INFO GetPileupSummaries - Inflater: JdkInflater
14:05:24.931 INFO GetPileupSummaries - GCS max retries/reopens: 20
14:05:24.931 INFO GetPileupSummaries - Requester pays: disabled
14:05:24.931 INFO GetPileupSummaries - Initializing engine
14:05:24.954 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
14:05:24.955 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
14:05:25.003 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
14:05:25.010 INFO FeatureManager - Using codec VCFCodec to read file file:///Users/Steven.Strong/Desktop/small_exac_common_3.hg38.vcf.gz
14:05:25.016 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
14:05:25.016 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
14:05:25.036 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
14:05:25.063 INFO FeatureManager - Using codec BEDCodec to read file file:///Users/Steven.Strong/Desktop/test.bed
14:05:25.064 INFO IntervalArgumentCollection - Processing 2999997 bp from intervals
14:05:25.075 INFO GetPileupSummaries - Done initializing engine
14:05:25.078 INFO ProgressMeter - Starting traversal
14:05:25.078 INFO ProgressMeter - Current Locus Elapsed Minutes Loci Processed Loci/Minute
14:05:25.104 INFO GetPileupSummaries - 0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappingQualityNotZeroReadFilter
0 read(s) filtered by: MappedReadFilter
0 read(s) filtered by: PrimaryLineReadFilter
0 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
0 read(s) filtered by: MateOnSameContigOrNoMappedMateReadFilter
0 read(s) filtered by: GoodCigarReadFilter
0 read(s) filtered by: WellformedReadFilter
0 total reads filtered out of 2 reads processed
14:05:25.104 INFO ProgressMeter - unmapped 0.0 182 420000.0
14:05:25.104 INFO ProgressMeter - Traversal complete. Processed 182 total loci in 0.0 minutes.
14:05:25.104 INFO GetPileupSummaries - Shutting down engine
[April 15, 2024, 2:05:25 PM MDT] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=327155712
Tool returned:
SUCCESS
The reference is
s3://ngi-igenomes/
igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/Chromosomes/chr1.fa
The bam looks like this
$ samtools view -h test.bam
@HD VN:1.6 SO:coordinate
@SQ SN:chr1 LN:248956422 M5:6aef897c3d6ff0c78aff06ac189178dd
@RG ID:7dc37610-761d-4b61-b054-f2cf7a252e47 SM:7dc37610-761d-4b61-b054-f2cf7a252e47 PL:ILLUMINA
@PG ID:samtools PN:samtools VN:1.19.2 CL:samtools view -h -o test.bam
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.19.2 CL:samtools view -h test.bam
A00827:893:HM2TFDSX7:4:2149:1380:31156:GATCT-GTAGT 163 chr1 139099 1 96M = 139185 182 GGCCAGTGTGAGGCAAGACCTGGGCCTGTCTAGGCTGCTGGGAGACAGGCAGGAATCTGGCCAGGGAAGGTTGCCATGAGACAAAAGTTGGGCCTG FFFF:FFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF:,FFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF:FFFFFFFFF XG:i:0 XM:i:0 XN:i:0 XO:i:0 AS:i:0 XS:i:0 YS:i:-10 YT:Z:CP cr:i:0 cf:i:0 gf:f:58.791gr:f:60.4167 oe:f:0 ab:i:1 qm:f:37 qn:f:37 PG:Z:MarkDuplicates RG:Z:7dc37610-761d-4b61-b054-f2cf7a252e47 MD:Z:96 NM:i:0
A00827:893:HM2TFDSX7:4:2149:1380:31156:GATCT-GTAGT 83 chr1 139185 1 96M = 139099 -182 GTTGGGCCTGGAAAGGCCCTTGTGAAGCGTGAGCTTGGCCTAAAGAGGACACTGGGTGGCAGGAGCTGGGTGTGTAGAAGCTGCTGAAAGGTTGGG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF XG:i:0 XM:i:2 XN:i:0 XO:i:0 AS:i:-10 XS:i:-10 YS:i:0 YT:Z:CP cr:i:1 cf:i:0 gf:f:58.7912 gr:f:58.3333 oe:f:0 ab:i:1 qm:f:37 qn:f:37 PG:Z:MarkDuplicates RG:Z:7dc37610-761d-4b61-b054-f2cf7a252e47 MD:Z:28A19C47 NM:i:2
the intervals file is
$ cat test.bed
chr1 3 3000000
the output is:
$ cat test.table
#<METADATA>SAMPLE=7dc37610-761d-4b61-b054-f2cf7a252e47
contig position ref_count alt_count other_alt_count allele_frequency
chr1 139213 0 0 0 0.233
chr1 139233 0 0 0 0.231
I tried running gatk ValidateSamFile and I get no errors
-
Your mapping qualities are all 1 for these 2 reads. GetPileupSummaries tool requires a default minimum of 50 as mapping quality for counting reads.
--min-mapping-quality,-mmq <Integer>
Minimum read mapping quality Default value: 50.Can you adjust this value to 1 or 0 and see if it works. Also your aligner may not be an optimal one to provide proper mapping qualities therefore we recommend you review your tools for mapping.
I hope this helps.
-
That fixed it. Thanks so much! It might be useful if this filtering step was also logged like the other filtering steps, to make it easier to debug issues like this.
-
Thank you for the feedback. A PR is ready to add this log to the default output and will be available in the next release.
https://github.com/broadinstitute/gatk/pull/8781
Regards.
-
thanks!
Please sign in to leave a comment.
4 comments