Change the memory limit when running SVPreprocess
If you are seeing an error, please provide (REQUIRED):
a) GATK version used:
b) Exact command used:
mx="-Xmx200g"
ms="-Xms40g"
java -cp ${classpath} ${ms} ${mx} -jar ${SV_DIR}/lib/SVToolkit.jar
# Run preprocessing (if running by chromosome makes sense, we can add the option -L ${chr})
# For large scale use, you should use -reduceInsertSizeDistributions, but this is too slow for the installation test.
# The method employed by -computeGCProfiles requires a GC mask and is currently only supported for human genomes.
# rdmask.bed contains all of the base autosomal sequence in the reference genome
java -cp ${classpath} ${ms} ${mx} \
org.broadinstitute.gatk.queue.QCommandLine \
-S ${SV_DIR}/qscript/SVPreprocess.q \
-S ${SV_DIR}/qscript/SVQScript.q \
-gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
--disableJobReport \
-deleteIntermediateDirs true \
-cp ${classpath} \
-configFile genstrip_parameters.txt \
-genderMapFile gender.map \
-tempDir ${SV_TMPDIR}/tmp \
-R reference.fa \
-genomeMaskFile svmask.fasta \
-copyNumberMaskFile gcmask.fasta \
-readDepthMaskFile rdmask.bed \
-runDirectory ${runDir} \
-md ${runDir}/metadata \
-disableGATKTraversal \
-useMultiStep \
-reduceInsertSizeDistributions true \
-computeGCProfiles true \
-computeReadCounts true \
-ploidyMapFile ploidymap.txt \
-jobLogDir ${runDir}/logs \
-I ${input} \
-jobRunner ParallelShell \
-run \
|| exit 1
cd ${runDir} && tar -cvzf ${runDir}.tar.gz metadata || exit 1
c) Entire error log:
ERROR 12:29:38,053 FunctionEdge - Error: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/tmp' '-cp' '/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/SVToolkit.jar:/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/gatk/Queue.jar' '-cp' '/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/SVToolkit.jar:/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.ReduceInsertSizeHistograms' '-I' '/public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/metadata/isd.hist.bin' '-O' '/public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/metadata/isd.dist.bin'
ERROR 12:29:38,248 FunctionEdge - Contents of /public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/logs/SVPreprocess-9.out:
INFO 12:28:17,013 HelpFormatter - -------------------------------------------------------------------
INFO 12:28:17,017 HelpFormatter - Program Name: org.broadinstitute.sv.apps.ReduceInsertSizeHistograms
INFO 12:28:17,022 HelpFormatter - Program Args: -I /public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/metadata/isd.hist.bin -O /public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/metadata/isd.dist.bin
INFO 12:28:17,026 HelpFormatter - Executing as cche@s002 on Linux 3.10.0-862.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_161-b14.
INFO 12:28:17,027 HelpFormatter - Date/Time: 2022/01/11 12:28:17
INFO 12:28:17,027 HelpFormatter - -------------------------------------------------------------------
INFO 12:28:17,027 HelpFormatter - -------------------------------------------------------------------
Processing SRR7760084/SRR7760084/null ...
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.Integer.valueOf(Integer.java:832)
at org.broadinstitute.sv.metadata.isize.EmpiricalInsertSizeDistribution2.getAbsoluteValueMap(EmpiricalInsertSizeDistribution2.java:560)
at org.broadinstitute.sv.metadata.isize.EmpiricalInsertSizeDistribution2.init(EmpiricalInsertSizeDistribution2.java:153)
at org.broadinstitute.sv.metadata.isize.EmpiricalInsertSizeDistribution2.<init>(EmpiricalInsertSizeDistribution2.java:49)
at org.broadinstitute.sv.metadata.isize.EmpiricalInsertSizeDistribution.create(EmpiricalInsertSizeDistribution.java:46)
at org.broadinstitute.sv.apps.ReduceInsertSizeHistograms.run(ReduceInsertSizeHistograms.java:71)
at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:58)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
at org.broadinstitute.sv.commandline.CommandLineProgram.runAndReturnResult(CommandLineProgram.java:31)
at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:27)
at org.broadinstitute.sv.apps.ReduceInsertSizeHistograms.main(ReduceInsertSizeHistograms.java:51)
INFO 12:41:13,772 QGraph - Writing incremental jobs reports...
INFO 12:41:13,794 QGraph - 1241 Pend, 0 Run, 1 Fail, 9 Done
INFO 12:41:13,804 QCommandLine - Writing final jobs report...
INFO 12:41:13,805 QCommandLine - Done with errors
INFO 12:41:13,854 QGraph - -------
INFO 12:41:13,855 QGraph - Failed: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/tmp' '-cp' '/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/SVToolkit.jar:/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/gatk/Queue.jar' '-cp' '/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/SVToolkit.jar:/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/public/home/cche/Zhengzhuqing/01.software/Pop_Genentic/install/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.ReduceInsertSizeHistograms' '-I' '/public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/metadata/isd.hist.bin' '-O' '/public/home/cche/lixin/04_data/01.Suidae/01.alignment/02.short_reads/02.GenomeSTRiP/02.Metadata/SRR7760084/metadata/isd.dist.bin'
My question is how to set the memory requested for org.broadinstitute.sv.apps.ReduceInsertSizeHistograms, which is launched with -Xmx2048m regardless of the -Xmx200g I pass on the command line. Thank you in advance.
Best wishes,
Zheng zhuqing
-
Thank you for your post! Bob Handsaker has been tagged and will get back to you shortly.
-
Dear all,
Here are the comments from Bob. Setting "-memLimit 8" works for me.
Hi,
One approach is to pass -memLimit N (gigabyte units, e.g. -memLimit 4), which will set the default java memory limit for all of the individual jobs (not just ReduceInsertSizeHistograms). This will only affect jobs that do not specifically set a different memory limit in the Q script.
The other option is to edit the Q script itself to set a higher memory limit for ReduceInsertSizeHistograms only. In qscript/SVQScript.q see the definition for ReduceInsertSizeHistograms and add 'this.memoryLimit = Some(4)' or whatever you need. You will see other examples of commands that override the default memory limits in this way (e.g. MergeSamFiles requests 8g).
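Before editing the Q script, it may help to see where it already sets limits. The following is a minimal sketch, assuming grep is available and that SV_DIR points at the Genome STRiP installation. The helper name list_memory_overrides and the /path/to/svtoolkit fallback are placeholders for illustration, not part of Genome STRiP.

```shell
#!/bin/sh
# list_memory_overrides FILE
# Print, with line numbers, every line of a Q script that mentions
# ReduceInsertSizeHistograms or assigns memoryLimit.
# (Hypothetical helper for illustration only.)
list_memory_overrides() {
  grep -n -e 'ReduceInsertSizeHistograms' -e 'memoryLimit' "$1"
}

# Placeholder path, used only when SV_DIR is not set in the environment.
qscript="${SV_DIR:-/path/to/svtoolkit}/qscript/SVQScript.q"

if [ -f "${qscript}" ]; then
  list_memory_overrides "${qscript}"
else
  echo "Q script not found: ${qscript}" >&2
fi
```

The printed line numbers show where ReduceInsertSizeHistograms is defined and which other commands already set memoryLimit, which gives a pattern to copy when adding the override.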
Since you are using -jobRunner ParallelShell, giving large amounts of memory to the parent Queue process with
ms="-Xms40g"
is actually counter-productive. The parent process is just tracking what needs to be done and forking child processes to do the work. So you should just give it 4g (or maybe 8g if you are processing thousands of input files) and that should be plenty.
-Bob
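For anyone landing here later, the advice above could be applied to the original command roughly as follows. This is a sketch under assumptions rather than a tested configuration: the /path/to/svtoolkit fallback is a placeholder, the classpath mirrors the three jars shown in the error log, -Xms is dropped, and the value 8 is simply the one that worked in this thread. The script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Placeholder defaults (assumptions), used only when the variables are unset.
SV_DIR="${SV_DIR:-/path/to/svtoolkit}"
classpath="${classpath:-${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar}"

# Small heap for the parent Queue process, which only tracks and forks child jobs.
mx="-Xmx4g"
# Default heap, in gigabytes, for child jobs that do not set their own limit.
mem_limit_gb=8

# Build the argument list in the positional parameters so quoting is preserved.
set -- java "${mx}" -cp "${classpath}" \
  org.broadinstitute.gatk.queue.QCommandLine \
  -S "${SV_DIR}/qscript/SVPreprocess.q" \
  -S "${SV_DIR}/qscript/SVQScript.q" \
  -memLimit "${mem_limit_gb}" \
  -jobRunner ParallelShell

# Append the remaining options from the original command in the same way, e.g.
#   set -- "$@" -R reference.fa -I "${input}" -run

# Print the command for review; execute it with "$@" once it looks right.
echo "$*"
```

Keeping the parent heap and the child-job default in separate variables makes it clear that -Xmx only sizes the Queue process, while -memLimit is what reaches jobs such as ReduceInsertSizeHistograms.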