Recal file with malformed header
Hi,
I'm trying to run VQSR as per the documentation, and I've successfully run the variant recalibration step for both indels and SNVs, but I get an error when trying to apply these recalibrations:
08:34:23.147 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
May 14, 2020 8:34:23 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
08:34:23.782 INFO ApplyVQSR - ------------------------------------------------------------
08:34:23.783 INFO ApplyVQSR - The Genome Analysis Toolkit (GATK) v4.1.7.0
08:34:23.784 INFO ApplyVQSR - For support and documentation go to https://software.broadinstitute.org/gatk/
08:34:23.784 INFO ApplyVQSR - Executing as erikfas@sens2020519-b10.uppmax.uu.se on Linux v3.10.0-1127.el7.x86_64 amd64
08:34:23.785 INFO ApplyVQSR - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
08:34:23.785 INFO ApplyVQSR - Start Date/Time: 14 May 2020 08:34:23 CEST
08:34:23.785 INFO ApplyVQSR - ------------------------------------------------------------
08:34:23.785 INFO ApplyVQSR - ------------------------------------------------------------
08:34:23.786 INFO ApplyVQSR - HTSJDK Version: 2.21.2
08:34:23.787 INFO ApplyVQSR - Picard Version: 2.21.9
08:34:23.787 INFO ApplyVQSR - HTSJDK Defaults.COMPRESSION_LEVEL : 2
08:34:23.787 INFO ApplyVQSR - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
08:34:23.787 INFO ApplyVQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
08:34:23.787 INFO ApplyVQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
08:34:23.788 INFO ApplyVQSR - Deflater: IntelDeflater
08:34:23.788 INFO ApplyVQSR - Inflater: IntelInflater
08:34:23.788 INFO ApplyVQSR - GCS max retries/reopens: 20
08:34:23.788 INFO ApplyVQSR - Requester pays: disabled
08:34:23.788 INFO ApplyVQSR - Initializing engine
08:34:24.180 INFO FeatureManager - Using codec VCFCodec to read file file:///castor/project/proj/nbis-analysis/results/vqsr/jointGT.7of7-1.vqsr.indels.recal
08:34:24.498 INFO FeatureManager - Using codec VCFCodec to read file file:///castor/project/proj/nbis-analysis/results/jointGT.7of7-1.ann.vcf.gz
08:34:25.136 INFO ApplyVQSR - Done initializing engine
08:34:25.206 INFO ApplyVQSR - Shutting down engine
[14 May 2020 08:34:25 CEST] org.broadinstitute.hellbender.tools.walkers.vqsr.ApplyVQSR done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=6227755008
***********************************************************************
A USER ERROR has occurred: File /castor/project/proj/nbis-analysis/results/vqsr/jointGT.7of7-1.vqsr.indels.recal is malformed: Expected 11 elements in header line 1 10114 . N<VQSR> . . END=10115;NEGATIVE_TRAIN_SITE;VQSLOD=-1.8269;culprit=DP
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Using GATK jar /castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx6G -Xms6G -jar /castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar ApplyVQSR -V results/jointGT.7of7-1.ann.vcf.gz --recal-file results/vqsr/jointGT.7of7-1.vqsr.indels.recal --tranches-file results/vqsr/jointGT.7of7-1.vqsr.indels.recal --truth-sensitivity-filter-level 90.0 --create-output-variant-index true -mode INDEL -O results//vqsr/vqsr.indel-applied.jointGT.7of7-1.vcf
I tried looking at the rest of the forum, but didn't find any help regarding this (I only found similar errors reported for HaplotypeCaller). The command used to run ApplyVQSR is as follows (as part of a Snakemake pipeline, hence the wildcards within {curly brackets}):
gatk --java-options "-Xmx6G -Xms6G" ApplyVQSR \
-V {input.vcf} \
--recal-file {input.indels_recal} \
--tranches-file {input.indels_recal} \
--truth-sensitivity-filter-level 90.0 \
--create-output-variant-index true \
-mode INDEL \
-O {params.resultsdir}/vqsr/vqsr.indel-applied.{wildcards.sample}.vcf
The command for running the recalibration (which completes successfully) is as follows:
gatk --java-options "-Xmx6G -Xms6G" VariantRecalibrator \
-V {input} \
--trust-all-polymorphic \
-tranche 100.0 -tranche 99.95 -tranche 99.9 -tranche 99.0 \
-tranche 97.0 -tranche 95.0 -tranche 90.0 \
-an FS -an ReadPosRankSum -an MQRankSum -an QD -an SOR -an DP \
-mode INDEL \
--max-gaussians 4 \
-resource:mills,known=false,training=true,truth=true,prior=12 {params.mills} \
-resource:dbsnp,known=true,training=false,truth=false,prior=2 {params.dbsnp} \
-O {output.recal} \
--tranches-file {output.tranches}
I am running GATK using Conda with the following versions (output from gatk --version):
The Genome Analysis Toolkit (GATK) v4.1.7.0
HTSJDK Version: 2.21.2
Picard Version: 2.21.9
Any idea what is happening? Thanks in advance!
-
- It seems your recal file is malformed. Perhaps VariantRecalibrator ran out of memory, or something else caused the tool to generate a malformed recal table. Can you recreate your recal file and try again?
- If you still see the same error, can you please share a few records from your recal file?
-
Hi, Bhanu!
I already tried to re-create the recal file, and still get the same error when running ApplyVQSR. Here are parts of the recal header (with its tranches and contigs shortened for brevity) plus some records:
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="Site contains at least one allele that passes filters">
##GATKCommandLine=<ID=VariantRecalibrator,CommandLine="VariantRecalibrator --mode INDEL --max-gaussians 4 --resource:mills,known=false,training=true,truth=true,prior=12 mills:/castor/project/proj/nbis-analysis/data/annotations/Mills_and_1000G_gold_standard.indels.b37.vcf --resource:dbsnp,known=true,training=false,truth=false,prior=2 dbsnp:/castor/project/proj/nbis-analysis/data/annotations/dbsnp_138.b37.vcf --output results/vqsr/jointGT.7of7-1.vqsr.indels.recal --tranches-file results/vqsr/jointGT.7of7-1.vqsr.indels.tranches --use-annotation FS --use-annotation ReadPosRankSum --use-annotation MQRankSum --use-annotation QD --use-annotation SOR --use-annotation DP --truth-sensitivity-tranche 100.0 --truth-sensitivity-tranche 99.95 --truth-sensitivity-tranche 99.9 --truth-sensitivity-tranche 99.0 --truth-sensitivity-tranche 97.0 --truth-sensitivity-tranche 95.0 --truth-sensitivity-tranche 90.0 --trust-all-polymorphic true --variant results/jointGT.7of7-1.ann.vcf.gz --use-allele-specific-annotations false --max-negative-gaussians 2 --max-iterations 150 --k-means-iterations 100 --standard-deviation-threshold 10.0 --shrinkage 1.0 --dirichlet 0.001 --prior-counts 20.0 --maximum-training-variants 2500000 --minimum-bad-variants 1000 --bad-lod-score-cutoff -5.0 --mq-cap-for-logit-jitter-transform 0 --mq-jitter 0.05 --debug-stdev-thresholding false --target-titv 2.15 --ignore-all-filters false --sample-every-Nth-variant 1 --output-tranches-for-scatter false --vqslod-tranche 10.0 --vqslod-tranche 9.9 --vqslod-tranche 9.8 --vqslod-tranche 9.700000000000001 (... lots and lots of tranche commands with many decimals ...) 
--replicate 200 --max-attempts 1 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false",Version="4.1.7.0",Date="14 May 2020 08:00:43 CEST">
##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
##INFO=<ID=NEGATIVE_TRAIN_SITE,Number=0,Type=Flag,Description="This variant was used to build the negative training set of bad variants">
##INFO=<ID=POSITIVE_TRAIN_SITE,Number=0,Type=Flag,Description="This variant was used to build the positive training set of good variants">
##INFO=<ID=VQSLOD,Number=1,Type=Float,Description="Log odds of being a true variant versus being false under the trained gaussian mixture model">
##INFO=<ID=culprit,Number=1,Type=String,Description="The annotation which was the worst performing in the Gaussian mixture model, likely the reason why the variant was filtered out">
##contig=<ID=1,length=249250621>
(...)
##contig=<ID=NC_007605,length=171823>
##contig=<ID=hs37d5,length=35477943>
##source=VariantRecalibrator
#CHROM POS ID REF ALT QUAL FILTER INFO
1 10114 . N <VQSR> . . END=10115;NEGATIVE_TRAIN_SITE;VQSLOD=-1.8269;culprit=DP
1 10146 . N <VQSR> . . END=10147;NEGATIVE_TRAIN_SITE;VQSLOD=-1.6088;culprit=MQRankSum
1 10234 . N <VQSR> . . END=10235;NEGATIVE_TRAIN_SITE;VQSLOD=-0.9603;culprit=MQRankSum
1 10403 . N <VQSR> . . END=10440;NEGATIVE_TRAIN_SITE;VQSLOD=-1.7045;culprit=DP
1 10439 . N <VQSR> . . END=10440;NEGATIVE_TRAIN_SITE;VQSLOD=-1.5845;culprit=SOR
1 10616 . N <VQSR> . . END=10637;NEGATIVE_TRAIN_SITE;VQSLOD=-1.7792;culprit=SOR
1 10815 . N <VQSR> . . END=10815;NEGATIVE_TRAIN_SITE;VQSLOD=-0.9333;culprit=DP
1 13656 . N <VQSR> . . END=13658;NEGATIVE_TRAIN_SITE;VQSLOD=-2.0377;culprit=DP
1 13957 . N <VQSR> . . END=13958;NEGATIVE_TRAIN_SITE;VQSLOD=-1.3940;culprit=MQRankSum
1 15219 . N <VQSR> . . END=15230;NEGATIVE_TRAIN_SITE;VQSLOD=-1.5739;culprit=DP
1 15903 . N <VQSR> . . END=15903;NEGATIVE_TRAIN_SITE;VQSLOD=-1.5544;culprit=DP
1 16911 . N <VQSR> . . END=16912;NEGATIVE_TRAIN_SITE;VQSLOD=-2.1121;culprit=DP
1 17961 . N <VQSR> . . END=17962;NEGATIVE_TRAIN_SITE;VQSLOD=-1.5606;culprit=DP
1 19190 . N <VQSR> . . END=19191;NEGATIVE_TRAIN_SITE;VQSLOD=-2.2864;culprit=DP
It looked weird to me that there were so many tranche arguments, since they were not the ones that I entered. I have no clue whether that is expected behaviour (perhaps more tranches than entered are used so that all those entered can be output?), but I thought I'd mention it. They are written like so:
--vqslod-tranche 9.700000000000001 --vqslod-tranche 9.600000000000001 --vqslod-tranche 9.500000000000002 --vqslod-tranche 9.400000000000002 --vqslod-tranche 9.300000000000002 --vqslod-tranche 9.200000000000003 --vqslod-tranche 9.100000000000003 --vqslod-tranche 9.000000000000004 --vqslod-tranche 8.900000000000004 --vqslod-tranche 8.800000000000004 --vqslod-tranche 8.700000000000005 --vqslod-tranche 8.600000000000005 --vqslod-tranche 8.500000000000005 --vqslod-tranche 8.400000000000006 --vqslod-tranche 8.300000000000006 --vqslod-tranche 8.200000000000006 --vqslod-tranche 8.100000000000007 --vqslod-tranche 8.000000000000007 --vqslod-tranche 7.9000000000000075 --vqslod-tranche 7.800000000000008 --vqslod-tranche 7.700000000000008
Though with many more. Again, don't know if that is fine, but that was the one thing that stood out to me.
-
We see this error when the VCF index file is malformed. Try removing the vcf.idx, re-indexing the VCF with the IndexFeatureFile tool, and then running ApplyVQSR again. That should resolve this.
-
I've now removed the index file (which was a tabix file, .tbi, not .idx) and run IndexFeatureFile, which also created a tabix file rather than an .idx. The command ran successfully:
$ gatk IndexFeatureFile -I jointGT.7of7-1.ann.vcf.gz
Using GATK jar /castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar IndexFeatureFile -I jointGT.7of7-1.ann.vcf.gz
10:57:20.225 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
May 19, 2020 10:57:20 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
10:57:20.632 INFO IndexFeatureFile - ------------------------------------------------------------
10:57:20.632 INFO IndexFeatureFile - The Genome Analysis Toolkit (GATK) v4.1.7.0
10:57:20.632 INFO IndexFeatureFile - For support and documentation go to https://software.broadinstitute.org/gatk/
10:57:20.634 INFO IndexFeatureFile - Executing as erikfas@sens2020519-bianca.uppmax.uu.se on Linux v3.10.0-1127.el7.x86_64 amd64
10:57:20.634 INFO IndexFeatureFile - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
10:57:20.634 INFO IndexFeatureFile - Start Date/Time: 19 May 2020 10:57:20 CEST
10:57:20.634 INFO IndexFeatureFile - ------------------------------------------------------------
10:57:20.634 INFO IndexFeatureFile - ------------------------------------------------------------
10:57:20.635 INFO IndexFeatureFile - HTSJDK Version: 2.21.2
10:57:20.635 INFO IndexFeatureFile - Picard Version: 2.21.9
10:57:20.635 INFO IndexFeatureFile - HTSJDK Defaults.COMPRESSION_LEVEL : 2
10:57:20.635 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:57:20.635 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:57:20.635 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:57:20.635 INFO IndexFeatureFile - Deflater: IntelDeflater
10:57:20.635 INFO IndexFeatureFile - Inflater: IntelInflater
10:57:20.635 INFO IndexFeatureFile - GCS max retries/reopens: 20
10:57:20.635 INFO IndexFeatureFile - Requester pays: disabled
10:57:20.636 INFO IndexFeatureFile - Initializing engine
10:57:20.636 INFO IndexFeatureFile - Done initializing engine
10:57:21.021 INFO FeatureManager - Using codec VCFCodec to read file file:///castor/project/proj/nbis-analysis/results/jointGT.7of7-1.ann.vcf.gz
10:57:21.064 INFO ProgressMeter - Starting traversal
10:57:21.064 INFO ProgressMeter - Current Locus Elapsed Minutes Records Processed Records/Minute
10:57:31.075 INFO ProgressMeter - 2:31269948 0.2 684000 4100719.4
10:57:41.075 INFO ProgressMeter - 3:112831524 0.3 1503000 4506521.4
10:57:51.076 INFO ProgressMeter - 5:23774328 0.5 2325000 4648295.6
10:58:01.077 INFO ProgressMeter - 6:159886164 0.7 3146000 4717466.8
10:58:11.081 INFO ProgressMeter - 8:132734703 0.8 3966000 4757582.4
10:58:21.081 INFO ProgressMeter - 11:30287046 1.0 4812000 4810637.0
10:58:31.090 INFO ProgressMeter - 13:105287826 1.2 5647000 4838488.6
10:58:41.099 INFO ProgressMeter - 17:12243983 1.3 6426000 4817452.6
10:58:51.109 INFO ProgressMeter - 21:32905354 1.5 7205000 4800986.2
10:58:56.460 INFO ProgressMeter - GL000192.1:534015 1.6 7666139 4821723.8
10:58:56.460 INFO ProgressMeter - Traversal complete. Processed 7666139 total records in 1.6 minutes.
10:58:57.034 INFO IndexFeatureFile - Successfully wrote index to /castor/project/proj/nbis-analysis/results/jointGT.7of7-1.ann.vcf.gz.tbi
10:58:57.034 INFO IndexFeatureFile - Shutting down engine
[19 May 2020 10:58:57 CEST] org.broadinstitute.hellbender.tools.IndexFeatureFile done. Elapsed time: 1.61 minutes.
Runtime.totalMemory()=87293952
Tool returned:
/castor/project/proj/nbis-analysis/results/jointGT.7of7-1.ann.vcf.gz.tbi
I then tried to re-run ApplyVQSR, but I got exactly the same error:
11:02:06.900 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
May 19, 2020 11:02:07 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
11:02:07.924 INFO ApplyVQSR - ------------------------------------------------------------
11:02:07.924 INFO ApplyVQSR - The Genome Analysis Toolkit (GATK) v4.1.7.0
11:02:07.925 INFO ApplyVQSR - For support and documentation go to https://software.broadinstitute.org/gatk/
11:02:07.926 INFO ApplyVQSR - Executing as erikfas@sens2020519-bianca.uppmax.uu.se on Linux v3.10.0-1127.el7.x86_64 amd64
11:02:07.926 INFO ApplyVQSR - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
11:02:07.927 INFO ApplyVQSR - Start Date/Time: 19 May 2020 11:02:06 CEST
11:02:07.927 INFO ApplyVQSR - ------------------------------------------------------------
11:02:07.927 INFO ApplyVQSR - ------------------------------------------------------------
11:02:07.929 INFO ApplyVQSR - HTSJDK Version: 2.21.2
11:02:07.929 INFO ApplyVQSR - Picard Version: 2.21.9
11:02:07.929 INFO ApplyVQSR - HTSJDK Defaults.COMPRESSION_LEVEL : 2
11:02:07.930 INFO ApplyVQSR - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
11:02:07.930 INFO ApplyVQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
11:02:07.930 INFO ApplyVQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
11:02:07.931 INFO ApplyVQSR - Deflater: IntelDeflater
11:02:07.931 INFO ApplyVQSR - Inflater: IntelInflater
11:02:07.931 INFO ApplyVQSR - GCS max retries/reopens: 20
11:02:07.932 INFO ApplyVQSR - Requester pays: disabled
11:02:07.932 INFO ApplyVQSR - Initializing engine
11:02:08.443 INFO FeatureManager - Using codec VCFCodec to read file file:///castor/project/proj/nbis-analysis/results/vqsr/jointGT.7of7-1.vqsr.indels.recal
11:02:08.589 INFO FeatureManager - Using codec VCFCodec to read file file:///castor/project/proj/nbis-analysis/results/jointGT.7of7-1.ann.vcf.gz
11:02:09.481 INFO ApplyVQSR - Done initializing engine
11:02:09.559 INFO ApplyVQSR - Shutting down engine
[19 May 2020 11:02:09 CEST] org.broadinstitute.hellbender.tools.walkers.vqsr.ApplyVQSR done. Elapsed time: 0.05 minutes.
Runtime.totalMemory()=6227755008
***********************************************************************
A USER ERROR has occurred: File /castor/project/proj/nbis-analysis/results/vqsr/jointGT.7of7-1.vqsr.indels.recal is malformed: Expected 11 elements in header line 1 10114 . N <VQSR> . . END=10115;NEGATIVE_TRAIN_SITE;VQSLOD=-1.8269;culprit=DP
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Using GATK jar /castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx6G -Xms6G -jar /castor/project/proj/nbis-analysis/5076-env/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar ApplyVQSR -V results/jointGT.7of7-1.ann.vcf.gz --recal-file results/vqsr/jointGT.7of7-1.vqsr.indels.recal --tranches-file results/vqsr/jointGT.7of7-1.vqsr.indels.recal --truth-sensitivity-filter-level 90.0 --create-output-variant-index true -mode INDEL -O results//vqsr/vqsr.indel-applied.jointGT.7of7-1.vcf
So, that didn't work. What else could be the problem? Or is it something related to the index being a tabix file rather than an .idx file, as you said it should be?
-
Can you reindex your jointGT.7of7-1.vqsr.indels.recal file and try again?
-
Just did, for both recal files (which indeed gives .idx files rather than .tbi), but the exact same error remains. I have also tried re-running the entire process, but the results are the same :-(
-
It seems that incorrect inputs provided to ApplyVQSR are causing this issue.
-
The recal file and the tranches file are not the same. The command you shared with us passes --tranches-file {input.indels_recal}, which is the recal file again; it should instead be something like --tranches-file {indels.tranches}.
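One way to guard against this particular mix-up is to wrap the call in a small shell function that refuses to run when both options point at the same path. This is only a sketch: the function name `apply_vqsr_indels` and its positional arguments are hypothetical, while the ApplyVQSR options are the ones from the command shared above.

```shell
# Hypothetical wrapper (not part of GATK): refuse to call ApplyVQSR when the
# same path is passed as both the recal file and the tranches file.
apply_vqsr_indels() {
    vcf=$1
    recal=$2
    tranches=$3
    out=$4
    if [ "$recal" = "$tranches" ]; then
        echo "error: --recal-file and --tranches-file are the same file: $recal" >&2
        return 1
    fi
    # Same options as in the thread, with a separate tranches file.
    gatk --java-options "-Xmx6G -Xms6G" ApplyVQSR \
        -V "$vcf" \
        --recal-file "$recal" \
        --tranches-file "$tranches" \
        --truth-sensitivity-filter-level 90.0 \
        --create-output-variant-index true \
        -mode INDEL \
        -O "$out"
}

# Example call, using the file names that appear in the logs of this thread:
# apply_vqsr_indels results/jointGT.7of7-1.ann.vcf.gz \
#     results/vqsr/jointGT.7of7-1.vqsr.indels.recal \
#     results/vqsr/jointGT.7of7-1.vqsr.indels.tranches \
#     results/vqsr/vqsr.indel-applied.jointGT.7of7-1.vcf
```

The check only compares the two path strings, so it would not catch a wrong file passed under a different name.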
- We expect a tranches file to look like this:
# Variant quality score tranches file
# Version number 5
targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity
90.00,45054,9606,2.4320,2.2447,17.9056,VQSRTrancheSNP0.00to90.00,SNP,18665,16798,0.9000
99.00,58388,71324,2.3497,2.2484,2.6101,VQSRTrancheSNP90.00to99.00,SNP,18665,18478,0.9900
99.90,59200,71948,2.3147,2.2346,-1.4635,VQSRTrancheSNP99.00to99.90,SNP,18665,18646,0.9990
100.00,59500,72321,2.2991,2.2239,-191.2854,VQSRTrancheSNP99.90to100.00,SNP,18665,18665,1.0000
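Since the two file types start differently (the tranches file with the comment line shown above, the recal file with a `##fileformat=VCF...` line), a quick pre-flight check on the first line can tell them apart. The following is a minimal sketch, assuming uncompressed text files; the helper name `classify_vqsr_file` and the demo file names are made up for illustration.

```shell
# Hypothetical helper (not part of GATK): guess the file type from line 1.
classify_vqsr_file() {
    case "$(head -n 1 "$1" 2>/dev/null)" in
        "# Variant quality score tranches file"*) echo "tranches" ;;
        "##fileformat=VCF"*)                      echo "vcf" ;;
        *)                                        echo "unknown" ;;
    esac
}

# Demo with two tiny stand-in files written to a temporary directory.
demo_dir=$(mktemp -d)
printf '# Variant quality score tranches file\n# Version number 5\n' > "$demo_dir/indels.tranches"
printf '##fileformat=VCFv4.2\n' > "$demo_dir/indels.recal"

classify_vqsr_file "$demo_dir/indels.tranches"   # prints: tranches
classify_vqsr_file "$demo_dir/indels.recal"      # prints: vcf
```

Running this on whatever is passed to --tranches-file before calling ApplyVQSR would flag a recal file given by mistake.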
-
-
Ah, that was indeed the problem! I was not giving it the tranches file, but rather the recal file once again. Definitely a user error here, totally my fault, but I wonder why the error message is so uninformative. The problem is clearly that I gave it the wrong file, and now that I think about it, it does say that it expects 11 elements in the header line. It would probably be useful to add something about checking whether it really is a tranches file? Or maybe I'm just stupid for not realizing this myself :-P
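For anyone wondering where the 11 comes from: the header line of the tranches file quoted above has 11 comma-separated fields, while a recal record contains no commas at all. The sketch below just counts fields. It assumes the tranches reader skips lines starting with `#` and splits the first remaining line on commas, which is my guess based on the error message quoting the first recal record, not something confirmed in this thread. The helper name `count_csv_fields` is made up.

```shell
# Hypothetical helper: count comma-separated fields in a single line.
count_csv_fields() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Header line of a tranches file, as quoted earlier in the thread.
tranches_header='targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity'
# First record of the recal file, as quoted in the error message.
recal_record='1 10114 . N <VQSR> . . END=10115;NEGATIVE_TRAIN_SITE;VQSLOD=-1.8269;culprit=DP'

count_csv_fields "$tranches_header"   # prints 11
count_csv_fields "$recal_record"      # prints 1
```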
-
Nah, it was an honest mistake. I didn't see it at first either. Glad it works now though.