ApplyBQSR wastes too much disk space
I made the mistake of using the default compression level for ApplyBQSR, and now I have terabytes of wasted space in my project bam files. The recalibrated files take up twice the disk space of the original files. Is there a way to change the compression level of existing base quality score recalibrated bam files?
It does not make sense to keep multiple versions of the same sample or run our analyses on different bam files (recalibrated vs non-recalibrated), so we want to keep the BQSR files for long term storage and future use. This is a massive waste of space.
I am experimenting with running ApplyBQSR with "--java-options -Dsamjdk.compression_level=5", but I would really like to change the compression level of the existing bam files. Is this possible?
-
This might work:
samtools view -@ 6 -h -b --output-fmt-option level=6 -o compressed.bam uncompressed.bqsr.bam
You should be able to leave out "--output-fmt-option level=6" and it should work just as well, since samtools' default BAM compression should already be around that level.
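If there are many files to redo, the same command can be wrapped in a small function and run in a loop. This is only a rough sketch: the function name `recompress_bam`, the `.recompressed.bam` output suffix, and the `DRY_RUN` variable are made up for illustration. It uses the `samtools view` options from the command above plus `samtools quickcheck` as a basic sanity check, and it never deletes the input file.

```shell
#!/bin/sh
# Hypothetical helper: recompress one BAM with samtools at a given level.
# Set DRY_RUN=1 to print the command instead of running it.
recompress_bam() {
    in_bam=$1
    level=${2:-6}                                # default to level 6, as in the command above
    out_bam="${in_bam%.bam}.recompressed.bam"    # assumed naming scheme, adjust as needed

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "samtools view -@ 6 -h -b --output-fmt-option level=${level} -o ${out_bam} ${in_bam}"
        return 0
    fi

    # Write the recompressed copy, then check that it has a header and an intact EOF block.
    samtools view -@ 6 -h -b --output-fmt-option "level=${level}" -o "${out_bam}" "${in_bam}" &&
        samtools quickcheck "${out_bam}"
}

# Example: preview the commands for every BQSR BAM in the current directory.
# for f in *.bqsr.bam; do DRY_RUN=1 recompress_bam "$f"; done
```

Comparing the sizes of the old and new files (for example with `ls -l`) before removing anything is probably a good idea.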
-
Hi registered_user,
Thank you for providing this solution that may be helpful for other users in the same situation! Was this successful for you at changing the bam compression?
Kind regards,
Pamela
-
I was able to get a ~25% file size reduction by running the bam file through samtools view. However, trying to use 'ApplyBQSR --java-options "-Dsamjdk.compression_level=6"' results in this error:
Error: Could not find or load main class "-Dsamjdk.compression_level=6"
I'm working on solving that now so I don't need to run samtools view on all future files.
-
Solution:
gatk --java-options "-Dsamjdk.compression_level=6" ApplyBQSR [options]
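For future runs, the working command can be wrapped in a shell function so the compression setting is not forgotten. A minimal sketch, assuming `gatk` is on the PATH: the names `apply_bqsr_compressed` and `GATK_COMPRESSION_LEVEL` are hypothetical, and the ApplyBQSR options are passed through untouched. It simply mirrors the argument order of the command above, with `--java-options` placed between `gatk` and the tool name.

```shell
#!/bin/sh
# Hypothetical wrapper around the working command above.
# GATK_COMPRESSION_LEVEL is an assumed variable name; it defaults to 6.
apply_bqsr_compressed() {
    level=${GATK_COMPRESSION_LEVEL:-6}
    # --java-options goes before the tool name; "$@" carries the ApplyBQSR options as given.
    gatk --java-options "-Dsamjdk.compression_level=${level}" ApplyBQSR "$@"
}

# Usage: apply_bqsr_compressed [options]
```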