Genome STRiP CNV Discovery pipeline error
If you are seeing an error, please provide (REQUIRED):
a) Genome STRiP version used: 2.00.1958
b) Exact command used:
#run cnv discovery
java -cp ${classpath} ${mx} \
org.broadinstitute.gatk.queue.QCommandLine \
-S ${SV_DIR}/qscript/discovery/cnv/CNVDiscoveryPipeline.q \
-S ${SV_DIR}/qscript/SVQScript.q \
-cp ${classpath} \
-gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
--disableJobReport \
-configFile ${SV_DIR}/conf/genstrip_parameters.txt \
-R /cbio/projects/003/thandeka/my_scripts/Homo_sapiens_assembly38/Homo_sapiens_assembly38.fasta \
-I /cbio/projects/003/thandeka/my_scripts/input-cram-100bp-files.list \
-md /cbio/projects/003/thandeka/my_scripts/SVDiscovery-100bp.pipeline/metadata \
-runDirectory ${runDir} \
-intervalList /cbio/projects/003/thandeka/my_scripts/Homo_sapiens_assembly38/Homo_sapiens_assembly38.interval.list \
-L chr1 \
-disableGATKTraversal \
-jobLogDir ${runDir}/logs \
-tilingWindowSize 1000 \
-tilingWindowOverlap 500 \
-maximumReferenceGapLength 1000 \
-boundaryPrecision 100 \
-minimumRefinedLength 500 \
-produceAuxiliaryFiles \
-maxConcurrentRun 60 \
-jobRunner Drmaa \
-gatkJobRunner Drmaa \
-jobNative "--mem=230000 --mincpus=32 --time=96:00:00" \
-jobQueue Main \
-run
c) Entire error log:
INFO 23:13:46,087 HelpFormatter - ------------------------------------------------------------
INFO 23:13:46,090 HelpFormatter - Program Name: org.broadinstitute.sv.discovery.SVDepthScanner
INFO 23:13:46,094 HelpFormatter - Program Args: -R /cbio/projects/003/thandeka/my_scripts/Homo_sapiens_assembly38/Homo_sapiens_assembly38.fasta -genomeMaskFile /cbio/projects/003/thandeka/my$
INFO 23:13:46,101 HelpFormatter - Executing as marhwayiza@compute-071 on Linux 4.15.0-109-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_261-b12.
INFO 23:13:46,102 HelpFormatter - Date/Time: 2020/10/18 23:13:46
INFO 23:13:46,102 HelpFormatter - ------------------------------------------------------------
INFO 23:13:46,102 HelpFormatter - ------------------------------------------------------------
INFO 23:13:46,214 SVDepthScanner - Opening reference sequence ...
INFO 23:13:46,216 SVDepthScanner - Opened reference sequence.
INFO 23:13:46,216 SVDepthScanner - Opening genome mask ...
INFO 23:13:46,218 SVDepthScanner - Opened genome mask.
Exception in thread "main" java.lang.RuntimeException: End of file while reading fasta file: /cbio/projects/003/thandeka/my_scripts/Homo_sapiens_assembly38/Homo_sapiens_assembly38.lcmask.fasta
at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:69)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
at org.broadinstitute.sv.commandline.CommandLineProgram.runAndReturnResult(CommandLineProgram.java:31)
at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:27)
at org.broadinstitute.sv.discovery.SVDepthScanner.main(SVDepthScanner.java:84)
Caused by: java.lang.RuntimeException: End of file while reading fasta file: /cbio/projects/003/thandeka/my_scripts/Homo_sapiens_assembly38/Homo_sapiens_assembly38.lcmask.fasta
at org.broadinstitute.sv.util.fasta.IndexedFastaFile.readSequence(IndexedFastaFile.java:222)
at org.broadinstitute.sv.util.fasta.IndexedFastaFile.getSequence(IndexedFastaFile.java:160)
at org.broadinstitute.sv.util.fasta.IndexedFastaFile.getSequence(IndexedFastaFile.java:131)
at org.broadinstitute.sv.mask.GenomeMaskFastaFile.getBitMask(GenomeMaskFastaFile.java:60)
at org.broadinstitute.sv.mask.GenomeMaskCompositeMask.getBitMask(GenomeMaskCompositeMask.java:41)
at org.broadinstitute.sv.util.GenomeWindowGenerator$WindowIterator.start(GenomeWindowGenerator.java:219)
at org.broadinstitute.sv.util.GenomeWindowGenerator$WindowIterator.<init>(GenomeWindowGenerator.java:191)
at org.broadinstitute.sv.util.GenomeWindowGenerator.createWindowIterator(GenomeWindowGenerator.java:62)
at org.broadinstitute.sv.discovery.ReadDepthScannerAlgorithm.createIntervalIterator(ReadDepthScannerAlgorithm.java:76)
at org.broadinstitute.sv.discovery.SVDepthScanner.run(SVDepthScanner.java:108)
at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:58)
The error occurs at cnv-stage1. I have re-run the script three times and get the same error each time. I have also tried running the analysis on the entire dataset, without restricting it to a specific chromosome, as well as with a chromosome specified (-L chr1), and the error is the same in both cases.
Please advise on what I can do to solve this issue.
-
Thank you for your post. Bob Handsaker has been tagged and will get back to you shortly.
-
It looks like the file
/cbio/projects/003/thandeka/my_scripts/Homo_sapiens_assembly38/Homo_sapiens_assembly38.lcmask.fasta
is corrupted, probably truncated. For comparison, this is the size and checksum of my copy:
$ ls -l Homo_sapiens_assembly38.lcmask.fasta
-r--r--r-- 1 handsake cnp 3281778217 Apr 11 2016 Homo_sapiens_assembly38.lcmask.fasta
$ sum Homo_sapiens_assembly38.lcmask.fasta
61192 3204862
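The size and checksum in the listing above can be compared against a local copy in one step. This is a minimal sketch, not part of Genome STRiP: `check_mask` is a hypothetical helper name, and it assumes a `sum` that uses the BSD algorithm with 1 KiB blocks (the GNU default), which is consistent with the block count shown (3281778217 bytes round up to 3204862 blocks). Other `sum` implementations may print different values for the same file.

```shell
#!/bin/sh
# Reference values copied from the listing of the intact copy above.
EXPECTED_BYTES=3281778217
EXPECTED_SUM="61192 3204862"

# check_mask FILE [BYTES] [SUM]
# Hypothetical helper: succeeds only if FILE has the expected size and checksum.
# BYTES and SUM default to the reference values above.
check_mask() {
    file=$1
    expected_bytes=${2:-$EXPECTED_BYTES}
    expected_sum=${3:-$EXPECTED_SUM}

    # Size check first: it is cheap and catches truncation immediately.
    actual_bytes=$(wc -c < "$file" | tr -d ' ')
    if [ "$actual_bytes" != "$expected_bytes" ]; then
        echo "size mismatch: $actual_bytes bytes, expected $expected_bytes"
        return 1
    fi

    # Reading from stdin keeps the file name out of the sum output.
    actual_sum=$(sum < "$file" | awk '{print $1, $2}')
    if [ "$actual_sum" != "$expected_sum" ]; then
        echo "checksum mismatch: $actual_sum, expected $expected_sum"
        return 1
    fi

    echo "OK: size and checksum match"
}
```

Calling `check_mask Homo_sapiens_assembly38.lcmask.fasta` prints the first mismatch it finds and returns a non-zero status, so it can also be used as a guard before submitting the pipeline.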
-
I don't think I quite understand what you mean. I used the above commands and this was the output:
$ ls -l Homo_sapiens_assembly38.lcmask.fasta
-r--r--r-- 1 marhwayiza cbio-group 793259520 Jul 10 16:27 Homo_sapiens_assembly38.lcmask.fasta
$ sum Homo_sapiens_assembly38.lcmask.fasta
22463 774668
If this file is damaged or truncated, how do I fix it?
-
The simplest thing would be to download it again.
ftp://ftp.broadinstitute.org/pub/svtoolkit/reference_metadata_bundles
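After downloading, one way to confirm the file is no longer truncated is to check it against its FASTA index. The sketch below is an illustration only and rests on an assumption: that a samtools-style `.fai` index sits next to the mask file (five tab-separated columns: name, length, offset, bases per line, bytes per line). The helper names `fai_min_bytes` and `check_fasta_against_index` are hypothetical. It only detects a file that is too short; it does not detect corrupted content.

```shell
#!/bin/sh
# fai_min_bytes FAI
# Print the smallest file size (in bytes) at which the last base of every
# sequence listed in the index can still be read.
fai_min_bytes() {
    awk -F'\t' '
        $2 > 0 {
            p = $2 - 1                              # 0-based index of the last base
            end = $3 + int(p / $4) * $5 + (p % $4) + 1
            if (end > max) max = end
        }
        END { printf "%.0f\n", max }
    ' "$1"
}

# check_fasta_against_index FASTA
# Succeeds if FASTA is at least as long as FASTA.fai requires.
check_fasta_against_index() {
    fasta=$1
    need=$(fai_min_bytes "$fasta.fai")
    have=$(wc -c < "$fasta" | tr -d ' ')
    if [ "$have" -lt "$need" ]; then
        echo "truncated: $have bytes, index requires at least $need"
        return 1
    fi
    echo "OK: $have bytes, index requires at least $need"
}
```

For example, `check_fasta_against_index Homo_sapiens_assembly38.lcmask.fasta` would report a shortfall if the download was interrupted again.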
-
@Bob Thank you. I will download it again and re-run the script.