RExecutor/ProcessExecutor invalid file argument error
In Picard's RExecutor, % symbols in the requested output filename cause the R process to crash with an "invalid 'file' argument" error generated by R's pdf() function. The workaround is to replace each '%' with '%%', but it would be better if Picard did this automatically when it passes a filename to RExecutor. This was reported as an issue on GitHub here:
https://github.com/broadinstitute/picard/issues/1412
The error would occur in any Picard command that runs an R subprocess and uses a user-supplied filename to generate a PDF.
If you are seeing an error, please provide (REQUIRED):
a) GATK version used: 2.22.4
b) Exact command used:
java -Xms4G -Xmx4G -Dpicard.useLegacyParser=false -jar /opt/picard.jar CollectRnaSeqMetrics -REF_FLAT refFlat_hg38.txt -RIBOSOMAL_INTERVALS hg38d1_rmsk_rrna.txt -STRAND_SPECIFICITY NONE -CHART_OUTPUT example+%28sample%29.rnaqc.pdf -METRIC_ACCUMULATION_LEVEL ALL_READS -INPUT example+%28sample%29.star.sort.bam -OUTPUT picard.rawoutput.txt -VALIDATION_STRINGENCY SILENT
c) Entire error log:
INFO 2021-03-30 05:43:48 RExecutor Executing R script via command: Rscript /tmp/alanh/script8745341533681278798.R /home/alanh/test/work/picard.rawoutput.txt /home/alanh/test/work/example+%28sample%29.rnaqc.pdf example+%28sample%29.star.sort.bam
ERROR 2021-03-30 05:43:49 ProcessExecutor Error in pdf(outputFile) :
ERROR 2021-03-30 05:43:49 ProcessExecutor invalid 'file' argument '/home/alanh/test/work/example+%28sample%29.rnaqc.pdf'
ERROR 2021-03-30 05:43:49 ProcessExecutor Execution halted
[Tue Mar 30 05:43:49 UTC 2021] picard.analysis.CollectRnaSeqMetrics done. Elapsed time: 17.34 minutes.
Runtime.totalMemory()=4294443008
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" picard.PicardException: Problem invoking R to generate plot.
at picard.analysis.CollectRnaSeqMetrics.finish(CollectRnaSeqMetrics.java:202)
at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:177)
at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:94)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
NOTE: this works fine (only the -CHART_OUTPUT value has each '%' doubled):
java -Xms4G -Xmx4G -Dpicard.useLegacyParser=false -jar /opt/picard.jar CollectRnaSeqMetrics -REF_FLAT refFlat_hg38.txt -RIBOSOMAL_INTERVALS hg38d1_rmsk_rrna.txt -STRAND_SPECIFICITY NONE -CHART_OUTPUT example+%%28sample%%29.rnaqc.pdf -METRIC_ACCUMULATION_LEVEL ALL_READS -INPUT example+%28sample%29.star.sort.bam -OUTPUT picard.rawoutput.txt -VALIDATION_STRINGENCY SILENT
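As far as I understand it, R's pdf() treats its file argument as a C-style format string (that is how names like Rplot%03d.pdf get page numbers), so a literal '%' has to be doubled. A minimal sketch of the escaping that the workaround does by hand is below. The class and method names are hypothetical and this is not taken from Picard's code; it only illustrates the idea of doubling '%' before the filename is handed to the R script.

```java
public class PercentEscapeSketch {

    // Hypothetical helper (not Picard's actual implementation):
    // doubles every '%' so a format-string-aware consumer such as
    // R's pdf() sees a literal percent sign.
    static String escapePercentForR(String path) {
        // String.replace does a literal (non-regex) replacement of all occurrences.
        return path.replace("%", "%%");
    }

    public static void main(String[] args) {
        String chartOutput = "example+%28sample%29.rnaqc.pdf";
        // Prints the same name used for -CHART_OUTPUT in the working command above.
        System.out.println(escapePercentForR(chartOutput));
    }
}
```

Only the name passed on to R would need this treatment; the -INPUT filename in the command above is left untouched.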
-
Thank you for posting this workaround, alanhoyle. I will find out from the Picard team whether they can make changes so that the workaround is no longer necessary.
-
I've made a pull request that could address this. https://github.com/broadinstitute/picard/pull/1671
-
This pull request has been accepted in Picard, so I suspect it'll propagate forward to GATK eventually.
-
Thank you alanhoyle for your contribution! The PR should appear in the next GATK release.