GATK Picard SortSam/MarkDups Error with Snappy Loading
Hello, I am having trouble running GATK Picard MarkDuplicates/SortSam. I am running Picard 3.1.1 with OpenJDK 17.0 on a SLURM computing cluster.
Specifically, the problem appears after Picard MarkDuplicates finishes running, when the data is being compressed into a file with Snappy.
Below is the script I have been using:
module load gatk/4.4.0.0-gdwu64w
module load openjdk
module load picard/3.1.1-vqrqbn
module load snappy
cd /data/niams_fis/NIH/IDX_VP/LFATMM/Bowtie2/noMit_Bam
for file in woMit.bam; do
picard SortSam -I $file -O ${file%.bam}.sorted.bam -SO coordinate
done
The log output is below.
Using GATK jar /data/apps/software/spack/linux-rocky9-x86_64_v3/gcc-11.3.1/gatk-4.4.0.0-gdwu64wh43ech6cvrzv5ra6gu65l32vw/bin/gatk-package-4.4.0.0-local.jar
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data/apps/software/spack/linux-rocky9-x86_64_v3/gcc-11.3.1/gatk-4.4.0.0-gdwu64wh43ech6cvrzv5ra6gu65l32vw/bin/gatk-package-4.4.0.0-local.jar SortSam -I 0BR1.GRCh38p13.bam -O 0BR1.GRCh38p13.sorted.bam -SO coordinate
Running:
17:02:11.294 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/apps/software/spack/linux-rocky9-x86_64_v3/gcc-11.3.1/picard-3.0.0-vhvs3zf7zu7mqlrps62ioirjufmcouof/bin/picard.jar!/com/intel/gkl/native/libgkl_compression.so
17:02:11.312 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (/tmp/patelvik/libgkl_compression13232829192930099505.so: /tmp/patelvik/libgkl_compression13232829192930099505.so: failed to map segment from shared object)
17:02:11.312 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/apps/software/spack/linux-rocky9-x86_64_v3/gcc-11.3.1/picard-3.0.0-vhvs3zf7zu7mqlrps62ioirjufmcouof/bin/picard.jar!/com/intel/gkl/native/libgkl_compression.so
17:02:11.315 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (/tmp/patelvik/libgkl_compression13540030495079064559.so: /tmp/patelvik/libgkl_compression13540030495079064559.so: failed to map segment from shared object)
[Tue Mar 26 17:02:11 EDT 2024] SortSam --INPUT 0_35_BR1.GRCh38p13.woMit.bam --OUTPUT 0_35_BR1.GRCh38p13.woMit.sorted.bam --SORT_ORDER coordinate --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
[Tue Mar 26 17:02:11 EDT 2024] Executing as patelvik@ai-hpcgpu26.niaid.nih.gov on Linux 5.14.0-284.30.1.el9_2.x86_64 amd64; OpenJDK 64-Bit Server VM 17.0.8.1+1; Deflater: Jdk; Inflater: Jdk; Provider GCS is not available; Picard version: Version:3.0.0
17:02:11.349 WARN IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
17:02:11.370 WARN IntelDeflaterFactory - Intel Deflater not supported, using Java.util.zip.Deflater
INFO 2024-03-26 17:02:11 SortSam Seen many non-increasing record positions. Printing Read-names as well.
WARNING 2024-03-26 17:02:13 SnappyLoader Snappy native library failed to load.
org.xerial.snappy.SnappyError: [UNSUPPORTED_PLATFORM] pure-java snappy requires access to java.nio.Buffer raw address field
at org.xerial.snappy.pure.UnsafeUtil.<clinit>(UnsafeUtil.java:49)
at org.xerial.snappy.pure.SnappyRawCompressor.writeUncompressedLength(SnappyRawCompressor.java:405)
at org.xerial.snappy.pure.SnappyRawCompressor.compress(SnappyRawCompressor.java:111)
at org.xerial.snappy.pure.PureJavaSnappy.rawCompress(PureJavaSnappy.java:128)
at org.xerial.snappy.Snappy.rawCompress(Snappy.java:450)
at org.xerial.snappy.Snappy.compress(Snappy.java:123)
at org.xerial.snappy.SnappyOutputStream.compressInput(SnappyOutputStream.java:380)
at org.xerial.snappy.SnappyOutputStream.flush(SnappyOutputStream.java:334)
at org.xerial.snappy.SnappyOutputStream.close(SnappyOutputStream.java:419)
at htsjdk.samtools.util.SnappyLoader.<init>(SnappyLoader.java:62)
at htsjdk.samtools.util.SnappyLoader.<init>(SnappyLoader.java:48)
at htsjdk.samtools.util.TempStreamFactory.getSnappyLoader(TempStreamFactory.java:42)
at htsjdk.samtools.util.TempStreamFactory.wrapTempOutputStream(TempStreamFactory.java:74)
at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:251)
at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:182)
at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:202)
at picard.sam.SortSam.doWork(SortSam.java:163)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:289)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:104)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:114)
[Tue Mar 26 17:02:13 EDT 2024] picard.sam.SortSam done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=5502926848
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.pure.UnsafeUtil
at org.xerial.snappy.pure.SnappyRawCompressor.writeUncompressedLength(SnappyRawCompressor.java:412)
at org.xerial.snappy.pure.SnappyRawCompressor.compress(SnappyRawCompressor.java:111)
at org.xerial.snappy.pure.PureJavaSnappy.rawCompress(PureJavaSnappy.java:128)
at org.xerial.snappy.Snappy.rawCompress(Snappy.java:450)
at org.xerial.snappy.Snappy.compress(Snappy.java:123)
at org.xerial.snappy.SnappyOutputStream.compressInput(SnappyOutputStream.java:380)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:130)
at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:212)
at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:168)
at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:40)
at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:254)
at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:182)
at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:202)
at picard.sam.SortSam.doWork(SortSam.java:163)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:289)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:104)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:114)
Suppressed: java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.pure.UnsafeUtil
at org.xerial.snappy.pure.SnappyRawCompressor.writeUncompressedLength(SnappyRawCompressor.java:412)
at org.xerial.snappy.pure.SnappyRawCompressor.compress(SnappyRawCompressor.java:111)
at org.xerial.snappy.pure.PureJavaSnappy.rawCompress(PureJavaSnappy.java:128)
at org.xerial.snappy.Snappy.rawCompress(Snappy.java:450)
at org.xerial.snappy.Snappy.compress(Snappy.java:123)
at org.xerial.snappy.SnappyOutputStream.compressInput(SnappyOutputStream.java:380)
at org.xerial.snappy.SnappyOutputStream.flush(SnappyOutputStream.java:334)
at org.xerial.snappy.SnappyOutputStream.close(SnappyOutputStream.java:419)
at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:259)
-
Hello,
I'm not sure what's causing this exactly. What sort of machines is your cluster using? I'm assuming they're not x86-64 machines, since the Intel libraries also aren't loading.
I don't know exactly why Snappy isn't loading, but it sounds like the native library isn't available for whatever your system is.
You should be able to turn off snappy native compression by providing a system property.
-Dsamjdk.snappy.disable=true
I think that should work around your problem.
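As a minimal sketch of where the property goes when Picard is launched directly with java -jar: the property sits before -jar, so the JVM receives it rather than Picard. The jar path below is a hypothetical placeholder, and build_sortsam_cmd is a helper name made up for this illustration. The function only prints the command line so it can be checked before anything is run.

```shell
# Hypothetical jar location; replace with the picard.jar your module installs.
PICARD_JAR="${PICARD_JAR:-/path/to/picard.jar}"

# Print a SortSam command line for one BAM file, with the Snappy property
# placed before -jar so that the JVM (not Picard) receives it.
build_sortsam_cmd() {
    input="$1"
    output="${input%.bam}.sorted.bam"
    echo "java -Dsamjdk.snappy.disable=true -jar $PICARD_JAR SortSam -I $input -O $output -SO coordinate"
}

# Show the command for the file from the original script.
build_sortsam_cmd woMit.bam
```

Once the printed line looks right, it can be copied into the job script in place of the picard wrapper call.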
-
Hello,
Thank you for the info. We are using Linux-based systems running under a SLURM scheduler. I asked the cluster team to install Snappy as a separate module that can be loaded as well. However, even when both GATK and Snappy are loaded, Picard is still unable to access the library. I assume it is some kind of permissions error, but I'm not sure.
With the flag above, would the command look like this:
picard --java-options -Dsamjdk.snappy.disable=true SortSam -I $file -O ${file%.bam*}.sorted.bam -SO coordinate
or like this
picard SortSam -Dsamjdk.snappy.disable=true -I $file -O ${file%.bam*}.sorted.bam -SO coordinate
-
So we bundle Snappy with Picard, and it includes native code for a lot of different platforms. Do you know if your machines are x86-64 or some sort of ARM machine? I don't think you'll be able to easily provide your own version of Snappy, because Picard won't be able to locate it.
It looks like you're using some sort of wrapper script to run Picard, since you're not just invoking it as java -jar picard.jar. You'll have to look into the exact details of that script, since it's not something that we provide. The Snappy option has to be provided to the JVM that's running Picard, not as an argument to the Picard program itself. So I imagine the first one is probably the right thing to do, if that argument exists in your script.
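A short sketch of two things that may help here, under stated assumptions. First, uname -m reports the CPU architecture of the node it runs on (x86_64 or aarch64), which answers the platform question. Second, if the wrapper script turns out not to accept JVM options, the JVM itself reads the JAVA_TOOL_OPTIONS environment variable at startup, so the property can be set there without touching the wrapper. Whether your cluster's wrapper preserves environment variables is an assumption you would need to verify.

```shell
# Report the CPU architecture: x86_64 for Intel/AMD nodes, aarch64 for 64-bit ARM.
arch="$(uname -m)"
echo "CPU architecture: $arch"

# Append the property to JAVA_TOOL_OPTIONS, keeping anything already set there.
export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS:+$JAVA_TOOL_OPTIONS }-Dsamjdk.snappy.disable=true"
echo "JAVA_TOOL_OPTIONS=$JAVA_TOOL_OPTIONS"

# The wrapper can then be called as before, for example:
#   picard SortSam -I "$file" -O "${file%.bam}.sorted.bam" -SO coordinate
```

When the variable is honored, the JVM prints a "Picked up JAVA_TOOL_OPTIONS" line on startup, which is a quick way to confirm the property reached it.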