Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

AS_MQRankSum crashes R script in VQSR

8 comments

  • Bhanu Gandham

    Hi timh,

    Please post the exact command you used and the entire error log.

  • timh

    Here you are. (The exact same command works without `-an AS_MQRankSum`.)

    java -jar ~/Programs/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar VariantRecalibrator -R ref.fa -V test.4.1.4.1.vcf.gz -AS --resource:ref500,known=false,training=true,truth=true,prior=20.0 random500.vcf --resource:complete,known=true,training=false,truth=false,prior=5.0 all-dbsnp.vcf -an AS_MQRankSum -an AS_QD -an AS_SOR -an AS_MQ -an DP -an AS_ReadPosRankSum -mode SNP --output test.recal --tranches-file test.tranches --truth-sensitivity-tranche 100.0 --truth-sensitivity-tranche 95.0 --truth-sensitivity-tranche 99.0 --output-model test.model -rscript-file test.plots.R --max-gaussians 2
    11:08:56.062 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/user/Programs/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Jan 27, 2020 11:08:56 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    11:08:56.320 INFO VariantRecalibrator - ------------------------------------------------------------
    11:08:56.320 INFO VariantRecalibrator - The Genome Analysis Toolkit (GATK) v4.1.4.1
    11:08:56.320 INFO VariantRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
    11:08:56.320 INFO VariantRecalibrator - Executing as user@user-ThinkPad-X260 on Linux v5.3.0-26-generic amd64
    11:08:56.320 INFO VariantRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_151-b12
    11:08:56.320 INFO VariantRecalibrator - Start Date/Time: 27 January 2020 11:08:56 AM
    11:08:56.321 INFO VariantRecalibrator - ------------------------------------------------------------
    11:08:56.321 INFO VariantRecalibrator - ------------------------------------------------------------
    11:08:56.321 INFO VariantRecalibrator - HTSJDK Version: 2.21.0
    11:08:56.321 INFO VariantRecalibrator - Picard Version: 2.21.2
    11:08:56.321 INFO VariantRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    11:08:56.321 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    11:08:56.321 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    11:08:56.321 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    11:08:56.321 INFO VariantRecalibrator - Deflater: IntelDeflater
    11:08:56.321 INFO VariantRecalibrator - Inflater: IntelInflater
    11:08:56.321 INFO VariantRecalibrator - GCS max retries/reopens: 20
    11:08:56.322 INFO VariantRecalibrator - Requester pays: disabled
    11:08:56.322 INFO VariantRecalibrator - Initializing engine
    11:08:56.635 INFO FeatureManager - Using codec VCFCodec to read file file:///WorkDir/random500.vcf
    11:08:56.654 INFO FeatureManager - Using codec VCFCodec to read file file:///WorkDir/all-dbsnp.vcf
    11:08:56.660 INFO FeatureManager - Using codec VCFCodec to read file file:///WorkDir/test.4.1.4.1.vcf.gz
    11:08:56.690 WARN IndexUtils - Feature file "/WorkDir/random500.vcf" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
    11:08:56.691 WARN IndexUtils - Feature file "/WorkDir/all-dbsnp.vcf" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
    11:08:56.704 INFO VariantRecalibrator - Done initializing engine
    11:08:56.706 INFO TrainingSet - Found ref500 track: Known = false Training = true Truth = true Prior = Q20.0
    11:08:56.707 INFO TrainingSet - Found complete track: Known = true Training = false Truth = false Prior = Q5.0
    11:08:56.713 WARN GATKVariantContextUtils - Can't determine output variant file format from output file extension "recal". Defaulting to VCF.
    11:08:56.738 INFO ProgressMeter - Starting traversal
    11:08:56.738 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    11:08:57.451 INFO ProgressMeter - ref:4262288 0.0 10335 870927.0
    11:08:57.451 INFO ProgressMeter - Traversal complete. Processed 10335 total variants in 0.0 minutes.
    11:08:57.454 INFO VariantDataManager - AS_MQRankSum: mean = -0.01 standard deviation = 0.04
    11:08:57.461 INFO VariantDataManager - AS_QD: mean = 31.15 standard deviation = 3.15
    11:08:57.466 INFO VariantDataManager - AS_SOR: mean = 1.07 standard deviation = 0.53
    11:08:57.472 INFO VariantDataManager - AS_MQ: mean = 59.63 standard deviation = 2.60
    11:08:57.476 INFO VariantDataManager - DP: mean = 162.93 standard deviation = 33.57
    11:08:57.482 INFO VariantDataManager - AS_ReadPosRankSum: mean = 0.55 standard deviation = 1.01
    11:08:57.518 INFO VariantDataManager - Annotation order is: [DP, AS_MQ, AS_MQRankSum, AS_QD, AS_ReadPosRankSum, AS_SOR]
    11:08:57.520 INFO VariantDataManager - Training with 498 variants after standard deviation thresholding.
    11:08:57.520 WARN VariantDataManager - WARNING: Training with very few variant sites! Please check the model reporting PDF to ensure the quality of the model is reliable.
    11:08:57.524 INFO GaussianMixtureModel - Initializing model with 100 k-means iterations...
    11:08:57.592 INFO VariantRecalibratorEngine - Finished iteration 0.
    11:08:57.618 INFO VariantRecalibratorEngine - Finished iteration 5. Current change in mixture coefficients = 0.09612
    11:08:57.627 INFO VariantRecalibratorEngine - Convergence after 9 iterations!
    11:08:57.631 INFO VariantRecalibratorEngine - Evaluating full set of 10251 variants...
    11:08:57.914 INFO VariantDataManager - Selected worst 383 scoring variants --> variants with LOD <= -5.0000.
    11:08:57.914 INFO GaussianMixtureModel - Initializing model with 100 k-means iterations...
    11:08:57.918 INFO VariantRecalibratorEngine - Finished iteration 0.
    11:08:57.923 INFO VariantRecalibratorEngine - Finished iteration 5. Current change in mixture coefficients = 0.02624
    11:08:57.931 INFO VariantRecalibratorEngine - Finished iteration 10. Current change in mixture coefficients = 0.02619
    11:08:57.936 INFO VariantRecalibratorEngine - Convergence after 13 iterations!
    11:08:57.941 INFO VariantRecalibratorEngine - Evaluating full set of 10251 variants...
    11:08:58.262 INFO TrancheManager - Finding 3 tranches for 10251 variants
    11:08:58.277 INFO TrancheManager - TruthSensitivityTranche threshold 100.00 => selection metric threshold 0.000
    11:08:58.286 INFO TrancheManager - Found tranche for 100.000: 0.000 threshold starting with variant 0; running score is 0.000
    11:08:58.286 INFO TrancheManager - TruthSensitivityTranche is TruthSensitivityTranche targetTruthSensitivity=100.00 minVQSLod=-39056.1309 known=(9983 @ 0.4954) novel=(268 @ 1.0775) truthSites(499 accessible, 499 called), name=anonymous]
    11:08:58.287 INFO TrancheManager - TruthSensitivityTranche threshold 95.00 => selection metric threshold 0.050
    11:08:58.291 INFO TrancheManager - Found tranche for 95.000: 0.050 threshold starting with variant 1453; running score is 0.050
    11:08:58.291 INFO TrancheManager - TruthSensitivityTranche is TruthSensitivityTranche targetTruthSensitivity=95.00 minVQSLod=0.7390 known=(8798 @ 0.4911) novel=(0 @ 0.0000) truthSites(499 accessible, 474 called), name=anonymous]
    11:08:58.291 INFO TrancheManager - TruthSensitivityTranche threshold 99.00 => selection metric threshold 0.010
    11:08:58.294 INFO TrancheManager - Found tranche for 99.000: 0.010 threshold starting with variant 729; running score is 0.010
    11:08:58.295 INFO TrancheManager - TruthSensitivityTranche is TruthSensitivityTranche targetTruthSensitivity=99.00 minVQSLod=-1.4924 known=(9522 @ 0.4958) novel=(0 @ 0.0000) truthSites(499 accessible, 494 called), name=anonymous]
    11:08:58.296 INFO VariantRecalibrator - Writing out recalibration table...
    11:08:58.476 INFO VariantRecalibrator - Writing out visualization Rscript file...
    11:08:58.480 INFO VariantRecalibrator - Building DP x AS_MQ plot...
    11:08:58.482 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:08:58.713 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:08:58.998 INFO VariantRecalibrator - Building DP x AS_MQRankSum plot...
    11:08:58.999 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:08:59.211 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:08:59.460 INFO VariantRecalibrator - Building DP x AS_QD plot...
    11:08:59.462 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:08:59.676 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:08:59.900 INFO VariantRecalibrator - Building DP x AS_ReadPosRankSum plot...
    11:08:59.900 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:00.094 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:00.306 INFO VariantRecalibrator - Building DP x AS_SOR plot...
    11:09:00.306 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:09:00.500 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:09:00.725 INFO VariantRecalibrator - Building AS_MQ x AS_MQRankSum plot...
    11:09:00.726 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:00.979 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:01.273 INFO VariantRecalibrator - Building AS_MQ x AS_QD plot...
    11:09:01.274 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:09:01.506 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:09:01.766 INFO VariantRecalibrator - Building AS_MQ x AS_ReadPosRankSum plot...
    11:09:01.766 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:02.000 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:02.254 INFO VariantRecalibrator - Building AS_MQ x AS_SOR plot...
    11:09:02.254 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:09:02.485 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:09:02.743 INFO VariantRecalibrator - Building AS_MQRankSum x AS_QD plot...
    11:09:02.743 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:02.931 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:03.150 INFO VariantRecalibrator - Building AS_MQRankSum x AS_ReadPosRankSum plot...
    11:09:03.150 INFO VariantRecalibratorEngine - Evaluating full set of 3600 variants...
    11:09:03.341 INFO VariantRecalibratorEngine - Evaluating full set of 3600 variants...
    11:09:03.550 INFO VariantRecalibrator - Building AS_MQRankSum x AS_SOR plot...
    11:09:03.551 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:03.746 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:03.952 INFO VariantRecalibrator - Building AS_QD x AS_ReadPosRankSum plot...
    11:09:03.952 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:04.134 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:04.344 INFO VariantRecalibrator - Building AS_QD x AS_SOR plot...
    11:09:04.344 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:09:04.526 INFO VariantRecalibratorEngine - Evaluating full set of 3721 variants...
    11:09:04.756 INFO VariantRecalibrator - Building AS_ReadPosRankSum x AS_SOR plot...
    11:09:04.757 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:04.959 INFO VariantRecalibratorEngine - Evaluating full set of 3660 variants...
    11:09:05.171 INFO VariantRecalibrator - Executing: Rscript /WorkDir/test.plots.R
    11:09:06.686 INFO VariantRecalibrator - Shutting down engine
    [27 January 2020 11:09:06 AM] org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator done. Elapsed time: 0.18 minutes.
    Runtime.totalMemory()=718274560
    org.broadinstitute.hellbender.utils.R.RScriptExecutorException:
    Rscript exited with 1
    Command Line: Rscript -e tempLibDir = '/tmp/Rlib.5162024666942531667';source('/WorkDir/test.plots.R');
    Stdout:
    Stderr: Warning: Ignoring unknown parameters: legend
    Error in f(..., self = self) : Breaks and labels are different lengths
    Calls: source ... guide_train -> guide_train.legend -> <Anonymous> -> f
    In addition: Warning messages:
    1: Non Lab interpolation is deprecated
    2: Removed 1 rows containing missing values (geom_tile).
    3: Removed 1 rows containing missing values (geom_point).
    4: Removed 1 rows containing missing values (geom_point).
    5: Removed 1 rows containing missing values (geom_point).
    Execution halted

    at org.broadinstitute.hellbender.utils.R.RScriptExecutor.getScriptException(RScriptExecutor.java:80)
    at org.broadinstitute.hellbender.utils.R.RScriptExecutor.getScriptException(RScriptExecutor.java:19)
    at org.broadinstitute.hellbender.utils.runtime.ScriptExecutor.executeCuratedArgs(ScriptExecutor.java:126)
    at org.broadinstitute.hellbender.utils.R.RScriptExecutor.exec(RScriptExecutor.java:126)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.createVisualizationScript(VariantRecalibrator.java:1121)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.onTraversalSuccess(VariantRecalibrator.java:702)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1050)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
    at org.broadinstitute.hellbender.Main.main(Main.java:292)

     

  • timh

    Update: this happens with only some of my datasets; it works fine for others. So I am guessing the problem is not VQSR but my data. I will have a closer look. Thanks.

  • Bhanu Gandham

    Thank you for the update, timh. Please post your solution here once you find it so the community can benefit from it too. Thank you!

  • Bhanu Gandham

    Hi timh,

    Out of curiosity, did you edit the R script code at all? I am wondering if that is the issue here.

  • timh

    Hi. No, I didn't change the R script.

  • Marcin

    I have just run into the same problem. It seems to be caused by R's limit on script line length: the definition of the 'surface' variable is too long. This should be a relatively easy fix.

    BTW, is it possible for VariantRecalibrator to create the R script and NOT execute it but exit gracefully (without any error code)? Failure of the R script breaks my pipeline...

    Thanks, Marcin
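
One way to check the line-length hypothesis above is to measure the longest line in the generated R script. The following is a minimal sketch, not a diagnosis: the helper name `longest_line` is made up for illustration, and the file name `test.plots.R` is simply the one passed to `-rscript-file` in the command posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print "<length> <line number>" for the longest line
# of the file given as $1 (length as counted by awk's length()).
longest_line() {
    awk '{ n = length($0); if (n > max) { max = n; line = NR } }
         END { printf "%d %d\n", max + 0, line + 0 }' "$1"
}

# Usage (file name taken from the command earlier in this thread):
#   longest_line test.plots.R
```

If the reported line is the one that defines `surface`, that would be consistent with the explanation above, although it does not by itself prove that line length is the cause.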

  • Bhanu Gandham

    Hi Marcin,

    That is not possible, but as a workaround you could run VariantRecalibrator without the `-rscript-file` argument to avoid breaking your pipeline.
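
In a pipeline script, one way to apply this workaround is to make the plotting argument opt-in, so that a normal run omits `-rscript-file`. The following is a minimal sketch under stated assumptions: the function name `vqsr_plot_args` and the environment variable `VQSR_PLOTS` are hypothetical and not part of GATK, and the remaining arguments would be the ones from the command posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: emit the plotting arguments only when VQSR_PLOTS=1.
# With VQSR_PLOTS unset (the default), nothing is printed, so the tool is
# invoked without -rscript-file, as suggested in the reply above.
vqsr_plot_args() {
    if [ "${VQSR_PLOTS:-0}" = "1" ]; then
        printf '%s %s\n' "-rscript-file" "$1"
    fi
}

# Usage sketch (not executed here; other arguments as in the command above):
#   java -jar gatk-package-4.1.4.1-local.jar VariantRecalibrator ... \
#       $(vqsr_plot_args test.plots.R)
```

The unquoted `$(...)` relies on shell word splitting to turn the output into two arguments, so this form assumes the script path contains no spaces. Plots can still be requested for an individual run by setting `VQSR_PLOTS=1`.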
