GenomeStrip SVPreprocess works for a chromosome region but fails for the whole genome
Dear Forum member,
I am trying to run SVPreprocess on my genome samples. When I run it with "-L chr1:61700000-61900000" it works fine. So I decided to run it on the whole genome and removed the -L option. Then it started throwing errors. Below is my command. I am not sure where I am making the mistake. I have downloaded the bundle provided via the link on the GenomeStrip web page.
. /u/local/Modules/default/init/modules.sh
module load java/1.8.0_77
module load R/3.6.1
module load samtools
genome='/u/home/Resource/GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.fasta'
rmd_genome='/u/home/Resource/GenomeStrip/Homo_sapiens_assembly38/'
input_bam_list='/u/home/GenomeStrip/bam.list'
output_dir='/u/home/GenomeStrip/SV'
SV_TMPDIR='/u/home/tmp/'
export SV_DIR=/u/home/Tools/svtoolkit
export PATH=$PATH:/u/home/Tools/svtoolkit/bwa/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/u/home/Tools/svtoolkit/bwa/
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"
java -Xmx24g -cp ${classpath} \
org.broadinstitute.gatk.queue.QCommandLine \
-S ${SV_DIR}/qscript/SVPreprocess.q \
-S ${SV_DIR}/qscript/SVQScript.q \
-cp ${classpath} \
-gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
-tempDir ${SV_TMPDIR} \
-configFile ${SV_DIR}/conf/genstrip_parameters.txt \
-R "${genome}" \
-rmd "${rmd_genome}" \
-ploidyMapFile ${rmd_genome}/Homo_sapiens_assembly38.ploidymap.txt \
-I "${input_bam_list}" \
-md "${output_dir}"/Metadat \
-jobLogDir "${output_dir}"/logs \
-useMultiStep \
-computeGCProfiles true \
-computeReadCounts true \
-P chimerism.use.correction:false \
-run
-
Hi, APato,
It looks like out of 702 jobs, four failed, but most ran to completion. Here's the final tally:
INFO 23:27:05,377 QCommandLine - Script failed: 56 Pend, 0 Run, 4 Fail, 642 Done
Did you try just rerunning? On many compute clusters, ours included, jobs will occasionally fail due to a file system error or a node that is having difficulty, etc. These SVToolkit pipelines use Queue, which is a workflow manager that runs multi-step pipelines and can retry failures. If you rerun the exact same command, it will retry the four failed jobs and then if these succeed, it will then run the pending jobs (which depend on some of the failed jobs). This is the first thing to try.
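A minimal sketch of automating the rerun, assuming the full java ... -run command above has been saved in a script (the name run_svpreprocess.sh is hypothetical) and assuming Queue exits with a non-zero status when it reports "Script failed". The helper retry_until_done is illustrative only and is not part of SVToolkit or Queue:

```shell
# Hypothetical helper: rerun a command until it succeeds or the attempt
# limit is reached. Relies on the command returning a non-zero exit status
# on failure (assumed here for Queue; verify on your installation).
retry_until_done() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        echo "Attempt ${attempt} of ${max_attempts}" >&2
        if "$@"; then
            return 0    # command succeeded, stop retrying
        fi
        attempt=$((attempt + 1))
    done
    return 1            # every attempt failed
}

# Example (run_svpreprocess.sh is a hypothetical wrapper around the
# unchanged SVPreprocess command):
#   retry_until_done 5 sh run_svpreprocess.sh
```

Because the command is rerun unchanged, each attempt only has to redo the jobs that have not yet completed. A small attempt limit keeps a persistent failure (for example, out of memory) from looping indefinitely.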
If one of the jobs fails again, then you should dig into the detailed output logs from the failed jobs. The file names are listed in the output logs you sent. Feel free to send these detailed log files along if the cause is not obvious (e.g. out of memory, disk full, etc.).
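As a rough sketch of that first inspection step, assuming the top-level Queue output was saved to a file (the name queue.log is hypothetical), the following helpers pull out the final job tally and the ERROR lines. The function names are illustrative, and the patterns are based only on the log lines quoted in this thread:

```shell
# Print the most recent "N Pend, N Run, N Fail, N Done" tally line.
last_tally() {
    grep -E '[0-9]+ Pend, [0-9]+ Run, [0-9]+ Fail, [0-9]+ Done' "$1" | tail -n 1
}

# Print only the number of failed jobs from that tally line.
failed_count() {
    last_tally "$1" | sed -E 's/.* ([0-9]+) Fail,.*/\1/'
}

# List ERROR lines with their line numbers in the saved log.
error_lines() {
    grep -n -E '^[[:space:]]*ERROR' "$1"
}

# Example (queue.log is a hypothetical file holding the Queue output):
#   failed_count queue.log
#   error_lines queue.log
```

The ERROR lines show the command of each failed job, and the nearby "Output written to" lines give the per-job .out files to open next.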
-
Hi APato
Bob Handsaker will be able to help you out with your question.
-
You don't actually say what the error is.
-
Bob Handsaker I deleted that error file, but I reran the same script on another BAM file. The only difference is the path. I grepped for "ERROR" along with some lines above and below it. I hope this helps.
Thanks
INFO 09:21:39,763 FunctionEdge - Output written to /u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/logs/SVPreprocess-6.out
INFO 09:30:15,204 FunctionEdge - Starting: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/u/home/a/
ashokpat/project-gandalm/tmp' '-cp' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/SVToolkit.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/u/hom
e/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/Queue.jar' org.broadinstitute.sv.main.SVCommandLine '-T' 'ComputeInsertSizeHistogramsWalker' '-R' '/u/home/a/ashokpat/project-gandalm/Resou
rce/GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.fasta' '-I' '/u/project/gandalm/shared/GenomicDatasets/1000Genomes/1000G_2504_high_coverage/high_coverage_alignment/20150511_IGSR
_highcov_cram_no_alt/NA19017/high_cov_alignment/S1_Remap_RG.bam' '-O' '/u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/Metadata/isd/S1_Remap_RG.hist.bin' '-disableGATKTraversal' 'true'
'-md' '/u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/Metadata' '-configFile' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/conf/genstrip_parameters.txt' '-configFile' '/u/home/a
/ashokpat/project-gandalm/Resource/GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.gsparams.txt' '-P' 'chimerism.use.correction:false' '-chimerismFile' '/u/home/a/ashokpat/project-
gandalm/GenomeStrip/SV_2BAM/Metadata/isd/S1_Remap_RG.chimer.dat' '-createHistogramFile' 'true' -createEmpty
INFO 09:30:15,205 FunctionEdge - Output written to /u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/logs/SVPreprocess-7.out
INFO 10:20:31,848 FunctionEdge - Starting: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/u/home/a/
ashokpat/project-gandalm/tmp' '-cp' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/SVToolkit.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/u/hom
e/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/Queue.jar' org.broadinstitute.sv.main.SVCommandLine '-T' 'ComputeInsertSizeHistogramsWalker' '-R' '/u/home/a/ashokpat/project-gandalm/Resou
rce/GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.fasta' '-I' '/u/project/gandalm/shared/GenomicDatasets/1000Genomes/1000G_2504_high_coverage/high_coverage_alignment/20150511_IGSR
_highcov_cram_no_alt/HG00096/high_cov_alignment/S2_Remap_RG.bam' '-O' '/u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/Metadata/isd/S2_Remap_RG.hist.bin' '-disableGATKTraversal' 'true'
'-md' '/u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/Metadata' '-configFile' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/conf/genstrip_parameters.txt' '-configFile' '/u/home/a
/ashokpat/project-gandalm/Resource/GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.gsparams.txt' '-P' 'chimerism.use.correction:false' '-chimerismFile' '/u/home/a/ashokpat/project-
gandalm/GenomeStrip/SV_2BAM/Metadata/isd/S2_Remap_RG.chimer.dat' '-createHistogramFile' 'true' -createEmpty
INFO 10:20:31,849 FunctionEdge - Output written to /u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/logs/SVPreprocess-8.out
INFO 11:07:23,629 QGraph - 694 Pend, 6 Run, 0 Fail, 2 Done
ERROR 11:07:23,650 FunctionEdge - Error: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/u/home/a/ash
okpat/project-gandalm/tmp' '-cp' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/SVToolkit.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/u/home/a
/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/Queue.jar' '-cp' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/SVToolkit.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/
gatk/GenomeAnalysisTK.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.SVToolkitInfo'
ERROR 11:07:23,675 FunctionEdge - Contents of /u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/Metadata/svtoolkit.version.dat:
SVToolkit version 2.00 (build 1949)
Build date: 2020/01/21 11:50:19
Web site: http://www.broadinstitute.org/software/genomestrip
INFO 11:07:23,679 FunctionEdge - Done: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/u/home/a/asho
kpat/project-gandalm/tmp' '-cp' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/SVToolkit.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/u/home/a/
ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/Queue.jar' '-cp' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/SVToolkit.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/g
atk/GenomeAnalysisTK.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.ComputeGenomeSizes' '-O' '/u/home/a/ashokpat/project-gandalm/GenomeS
trip/SV_2BAM/Metadata/genome_sizes.txt' '-R' '/u/home/a/ashokpat/project-gandalm/Resource/GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.fasta' '-ploidyMapFile' '/u/home/a/ashokpa
t/project-gandalm/Resource/GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.ploidymap.txt' '-genomeMaskFile' '/u/home/a/ashokpat/project-gandalm/Resource/GenomeStrip/Homo_sapiens_ass
embly38/Homo_sapiens_assembly38.svmask.fasta'
INFO 11:07:23,683 FunctionEdge - Done: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/u/home/a/asho
kpat/project-gandalm/tmp' '-cp' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/SVToolkit.jar:/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/u/home/a/
ashokpat/project-gandalm/Tools/svtoolkit/lib/gatk/Queue.jar' org.broadinstitute.sv.main.SVCommandLine '-T' 'ComputeInsertSizeHistogramsWalker' '-R' '/u/home/a/ashokpat/project-gandalm/Resource/
GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.fasta' '-I' '/u/project/gandalm/shared/GenomicDatasets/1000Genomes/1000G_2504_high_coverage/high_coverage_alignment/20150511_IGSR_hig
hcov_cram_no_alt/NA19017/high_cov_alignment/S1_Remap_RG.bam' '-O' '/u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/Metadata/isd/S1_Remap_RG.hist.bin' '-disableGATKTraversal' 'true' '-m
d' '/u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/Metadata' '-configFile' '/u/home/a/ashokpat/project-gandalm/Tools/svtoolkit/conf/genstrip_parameters.txt' '-configFile' '/u/home/a/ash
okpat/project-gandalm/Resource/GenomeStrip/Homo_sapiens_assembly38/Homo_sapiens_assembly38.gsparams.txt' '-P' 'chimerism.use.correction:false' '-chimerismFile' '/u/home/a/ashokpat/project-gand
alm/GenomeStrip/SV_2BAM/Metadata/isd/S1_Remap_RG.chimer.dat' '-createHistogramFile' 'true' -createEmpty
INFO 11:07:23,686 FunctionEdge - Done: samtools index /u/home/a/ashokpat/project-gandalm/GenomeStrip/SV_2BAM/Metadata/headers.bam
-
It would be helpful to have the full output. Bhanu Gandham is there a way to attach a file in the forum? Or how do you ask users to submit big reports?
-
Hi Bob Handsaker and APato
Here are some instructions on how to submit bug reports: https://gatk.zendesk.com/hc/en-us/articles/360035889671
-
Bob Handsaker I just uploaded the error files to the FTP under the file name Ashok0206SV.tar.gz
You will see four different files. SVchr1, run with the -L option, works perfectly. SVall, run without the -L option, gives the error. If you need any further information, please let me know.
Thanks
-
Bob Handsaker yes, I had to rerun the same script 4 times and finally all jobs were done. Is there any way I can accept the above comment to indicate that it solved my issue, so that it helps other community members if needed?
-
Glad you were able to get it working. Perhaps Bhanu Gandham knows how to accept the comment.
-
Hi APato
I agree that indicating this discussion solved your issue will help other community members. Thank you for thinking of that!
I have marked this discussion "Answered". You can also upvote the comments that helped you the most!