MINIMUM_PCT for picard.jar CollectInsertSizeMetrics
Other than trial and error, how do you know what value to set MINIMUM_PCT to for picard.jar CollectInsertSizeMetrics? With this dataset it crashes at 0.15 but works at 0.16. I had to run the job about 7 times to find that out. Is there a more intelligent way of figuring this out?
Can you please provide
a) GATK version used
picard-1.134 but question applies to all versions of CollectInsertSizeMetrics
b) Exact GATK commands used
java -Djava.io.tmpdir=/data1/BIOINFORMATICS/TEMP/ -Xmx20g \
  -jar SOFTWARE/broadinstitute-picard-1.134/broadinstitute-picard-a7a08c4/dist/picard.jar CollectInsertSizeMetrics \
  INPUT=R3e_MARK_DUPLICATES/79130_Met_fixmate_novosort_dupsrm.bam \
  TMP_DIR=/data1/BIOINFORMATICS/TEMP/ \
  OUTPUT=R3_STATS/79130_Met_fixmate_novosort_dupsrm.bam_insert_size_metrics.txt \
  HISTOGRAM_FILE=R3_STATS/79130_Met_fixmate_novosort_dupsrm.bam_insert_size_metrics.pdf \
  REFERENCE_SEQUENCE=/data1/BIOINFORMATICS/REFERENCES/NIMBLEGEN/HumanGenome/HG-38/hg38.fa \
  LEVEL=ALL_READS \
  ASSUME_SORTED=true \
  METRIC_ACCUMULATION_LEVEL=READ_GROUP \
  VALIDATION_STRINGENCY=LENIENT \
  MINIMUM_PCT=0.15
c) The entire error log if applicable.
INFO 2020-02-03 00:51:33 SinglePassSamProgram Processed 16,000,000 records. Elapsed time: 00:01:01s. Time for last 1,000,000: 3s. Last read position: chr11:1,199,353
INFO 2020-02-03 00:51:36 SinglePassSamProgram Processed 17,000,000 records. Elapsed time: 00:01:05s. Time for last 1,000,000: 3s. Last read position: chr11:113,449,929
INFO 2020-02-03 00:51:40 SinglePassSamProgram Processed 18,000,000 records. Elapsed time: 00:01:09s. Time for last 1,000,000: 3s. Last read position: chr12:88,058,943
INFO 2020-02-03 00:51:44 SinglePassSamProgram Processed 19,000,000 records. Elapsed time: 00:01:12s. Time for last 1,000,000: 3s. Last read position: chr14:20,456,519
INFO 2020-02-03 00:51:47 SinglePassSamProgram Processed 20,000,000 records. Elapsed time: 00:01:16s. Time for last 1,000,000: 3s. Last read position: chr15:20,566,510
INFO 2020-02-03 00:51:51 SinglePassSamProgram Processed 21,000,000 records. Elapsed time: 00:01:19s. Time for last 1,000,000: 3s. Last read position: chr16:1,592,172
INFO 2020-02-03 00:51:54 SinglePassSamProgram Processed 22,000,000 records. Elapsed time: 00:01:23s. Time for last 1,000,000: 3s. Last read position: chr16:30,291,085
INFO 2020-02-03 00:51:58 SinglePassSamProgram Processed 23,000,000 records. Elapsed time: 00:01:26s. Time for last 1,000,000: 3s. Last read position: chr17:40,820,286
INFO 2020-02-03 00:52:02 SinglePassSamProgram Processed 24,000,000 records. Elapsed time: 00:01:30s. Time for last 1,000,000: 3s. Last read position: chr18:44,526,992
INFO 2020-02-03 00:52:05 SinglePassSamProgram Processed 25,000,000 records. Elapsed time: 00:01:34s. Time for last 1,000,000: 3s. Last read position: chr19:51,169,649
INFO 2020-02-03 00:52:09 SinglePassSamProgram Processed 26,000,000 records. Elapsed time: 00:01:37s. Time for last 1,000,000: 3s. Last read position: chr21:10,482,682
INFO 2020-02-03 00:52:12 SinglePassSamProgram Processed 27,000,000 records. Elapsed time: 00:01:41s. Time for last 1,000,000: 3s. Last read position: chrX:46,118,588
INFO 2020-02-03 00:52:16 SinglePassSamProgram Processed 28,000,000 records. Elapsed time: 00:01:44s. Time for last 1,000,000: 3s. Last read position: chrX:110,055,319
INFO 2020-02-03 00:54:37 RExecutor Executing R script via command: Rscript /data1/BIOINFORMATICS/TEMP/script3814707552746657418.R /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3_STATS/79130_Met_fixmate_novosort_dupsrm.bam_insert_size_metrics.txt /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3_STATS/79130_Met_fixmate_novosort_dupsrm.bam_insert_size_metrics.pdf 79130_Met_fixmate_novosort_dupsrm.bam
INFO 2020-02-03 00:54:38 ProcessExecutor [1] "Not creating insert size PDF as there are duplicated header names: All_Reads"
INFO 2020-02-03 00:54:38 ProcessExecutor [2] "Not creating insert size PDF as there are duplicated header names: unknown"
[Mon Feb 03 00:54:38 EST 2020] picard.analysis.CollectInsertSizeMetrics done. Elapsed time: 4.13 minutes.
Runtime.totalMemory()=7563378688
-
Hi,
The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.
Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.
We cannot guarantee a reply, however, we ask other community members to help out if you know the answer.
For context, check out our support policy.
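On the original question, here is a rough sketch of how a threshold could be picked without repeated runs. It assumes the Picard documentation's description of MINIMUM_PCT: pair-orientation categories (FR, RF, TANDEM) holding less than that fraction of the read pairs are discarded. Under that assumption, knowing each orientation's share of the read pairs shows which categories survive a given value. The counts below are made up for illustration and are not taken from this dataset. The function names are hypothetical, and treating a share exactly equal to the threshold as "kept" is also an assumption. Real counts might come from the READ_PAIRS and PAIR_ORIENTATION columns of a metrics file produced by a single run with a very low MINIMUM_PCT.

```python
# Hypothetical read-pair counts per orientation (NOT from the thread's BAM).
# Replace with real numbers, e.g. READ_PAIRS per PAIR_ORIENTATION row of a
# metrics file (assumed layout).
example_counts = {"FR": 8400, "RF": 1550, "TANDEM": 50}


def orientation_fractions(counts):
    """Return each orientation's share of all read pairs (0.0 to 1.0)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read pairs counted")
    return {name: n / total for name, n in counts.items()}


def kept_categories(counts, minimum_pct):
    """List orientations whose share is at least minimum_pct.

    Assumption: a category is kept when its fraction >= minimum_pct;
    Picard's handling of the exact boundary may differ.
    """
    fractions = orientation_fractions(counts)
    return sorted(name for name, frac in fractions.items() if frac >= minimum_pct)


if __name__ == "__main__":
    # Show which categories would remain at a few candidate thresholds.
    for threshold in (0.05, 0.15, 0.16):
        print(threshold, kept_categories(example_counts, threshold))
```

With these made-up counts, RF holds 15.5% of the pairs, so it would be kept at 0.15 and dropped at 0.16. Whether that is what happens with the data in this thread would need to be checked against the real counts.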