Values for QD annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations.
REQUIRED for all errors and issues:
a) GATK version used:
b) Exact command used:
c) Entire program log:
See forum topic details at forum guidelines page: https://gatk.broadinstitute.org/hc/en-us/articles/360053845952-Forum-Guidelines
Hi there, I am new to GATK. Below is the command I am running:
gatk VariantRecalibrator -R Homo_sapiens_hg38.fasta -L chrM -V SNP_samples.vcf --trust-all-polymorphic -tranche 100.0 -tranche 99.95 -tranche 99.90 -tranche 99.80 -tranche 99.70 -tranche 99.60 -tranche 99.50 -tranche 99.40 -tranche 99.30 -tranche 99.0 -tranche 98.0 -tranche 97.0 -tranche 90.0 -an QD -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an SOR -an DP -mode SNP --max-gaussians 6 --resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap.vcf.gz --resource:omni,known=false,training=true,truth=false,prior=12.0 omni.vcf.gz --resource:1000G,known=false,training=true,truth=false,prior=10.0 1000GI.vcf.gz --resource:dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp138.vcf -O input_snp.recal --tranches-file input.tranches
I got this warning:
WARN GATKVariantContextUtils - Can't determine output variant file format from output file extension "recal". Defaulting to VCF.
and this error:
A USER ERROR has occurred: Bad input: Values for QD annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations.
Here is my full log:
Using GATK jar /home/tahi/anaconda3/envs/annot/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/tahi/anaconda3/envs/annot/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar VariantRecalibrator -R known_snp/Homo_sapiens_hg38.fasta -L chrM -V SNP_samples.vcf --trust-all-polymorphic -tranche 100.0 -tranche 99.95 -tranche 99.90 -tranche 99.80 -tranche 99.70 -tranche 99.60 -tranche 99.50 -tranche 99.40 -tranche 99.30 -tranche 99.0 -tranche 98.0 -tranche 97.0 -tranche 90.0 -an QD -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an SOR -an DP -mode SNP --max-gaussians 6 --resource:hapmap,known=false,training=true,truth=true,prior=15.0 known_snp/hapmap.vcf.gz --resource:omni,known=false,training=true,truth=false,prior=12.0 known_snp/omni.vcf.gz --resource:1000G,known=false,training=true,truth=false,prior=10.0 known_snp/1000GI.vcf.gz --resource:dbsnp,known=true,training=false,truth=false,prior=2.0 known_snp/dbsnp138.vcf -O input_snp.recal --tranches-file input.tranches
03:39:03.716 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/tahi/anaconda3/envs/annot/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Feb 22, 2022 3:39:03 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
03:39:03.886 INFO VariantRecalibrator - ------------------------------------------------------------
03:39:03.886 INFO VariantRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.5.0
03:39:03.886 INFO VariantRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
03:39:03.886 INFO VariantRecalibrator - Executing as tahi@tahi-GL553VD on Linux v5.11.0-49-generic amd64
03:39:03.886 INFO VariantRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v11.0.9.1-internal+0-adhoc..src
03:39:03.886 INFO VariantRecalibrator - Start Date/Time: February 22, 2022 at 3:39:03 AM EST
03:39:03.886 INFO VariantRecalibrator - ------------------------------------------------------------
03:39:03.886 INFO VariantRecalibrator - ------------------------------------------------------------
03:39:03.887 INFO VariantRecalibrator - HTSJDK Version: 2.24.1
03:39:03.887 INFO VariantRecalibrator - Picard Version: 2.25.4
03:39:03.887 INFO VariantRecalibrator - Built for Spark Version: 2.4.5
03:39:03.887 INFO VariantRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
03:39:03.887 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
03:39:03.887 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
03:39:03.887 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
03:39:03.888 INFO VariantRecalibrator - Deflater: IntelDeflater
03:39:03.888 INFO VariantRecalibrator - Inflater: IntelInflater
03:39:03.888 INFO VariantRecalibrator - GCS max retries/reopens: 20
03:39:03.888 INFO VariantRecalibrator - Requester pays: disabled
03:39:03.888 INFO VariantRecalibrator - Initializing engine
03:39:04.153 INFO FeatureManager - Using codec VCFCodec to read file file:///home/tahi/Working_Space/NGS/analysis/6.calling/6.2.variant_discovery/known_snp/hapmap.vcf.gz
03:39:04.319 INFO FeatureManager - Using codec VCFCodec to read file file:///home/tahi/Working_Space/NGS/analysis/6.calling/6.2.variant_discovery/known_snp/omni.vcf.gz
03:39:04.382 INFO FeatureManager - Using codec VCFCodec to read file file:///home/tahi/Working_Space/NGS/analysis/6.calling/6.2.variant_discovery/known_snp/1000GI.vcf.gz
03:39:04.439 INFO FeatureManager - Using codec VCFCodec to read file file:///home/tahi/Working_Space/NGS/analysis/6.calling/6.2.variant_discovery/known_snp/dbsnp138.vcf
03:39:04.531 INFO FeatureManager - Using codec VCFCodec to read file file:///home/tahi/Working_Space/NGS/analysis/6.calling/6.2.variant_discovery/SNP_samples.vcf
03:39:04.612 INFO IntervalArgumentCollection - Processing 16569 bp from intervals
03:39:04.671 INFO VariantRecalibrator - Done initializing engine
03:39:04.673 INFO TrainingSet - Found hapmap track: Known = false Training = true Truth = true Prior = Q15.0
03:39:04.673 INFO TrainingSet - Found omni track: Known = false Training = true Truth = false Prior = Q12.0
03:39:04.673 INFO TrainingSet - Found 1000G track: Known = false Training = true Truth = false Prior = Q10.0
03:39:04.673 INFO TrainingSet - Found dbsnp track: Known = true Training = false Truth = false Prior = Q2.0
03:39:04.690 WARN GATKVariantContextUtils - Can't determine output variant file format from output file extension "recal". Defaulting to VCF.
03:39:04.784 INFO ProgressMeter - Starting traversal
03:39:04.784 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
03:39:04.803 INFO ProgressMeter - unmapped 0.0 28 88421.1
03:39:04.803 INFO ProgressMeter - Traversal complete. Processed 28 total variants in 0.0 minutes.
03:39:04.811 INFO VariantRecalibrator - Shutting down engine
[February 22, 2022 at 3:39:04 AM EST] org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=470810624
***********************************************************************
A USER ERROR has occurred: Bad input: Values for QD annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations.
I would appreciate your help finding the underlying problem, as I could not find a solution on the forum.
Warm regards
-
Hi tahi,
One of the annotations you are using to build your VariantRecalibrator model is QD (-an QD) but you have not added QD to your VCF file.
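To confirm this before re-running, it can help to count how many records in the callset actually carry QD in the INFO column. The following is only a rough sketch: `count_info_key` is a hypothetical helper name (not part of GATK), and it assumes an uncompressed, tab-delimited VCF such as SNP_samples.vcf.

```shell
# Hypothetical helper (not a GATK tool): count VCF records whose INFO
# column (field 8) contains a given key, e.g. QD.
# Prints "<records with key> <total records>".
# Usage: count_info_key QD SNP_samples.vcf
count_info_key() {
  key="$1"
  vcf="$2"
  awk -F'\t' -v key="$key" '
    /^#/ { next }                      # skip header lines
    {
      total++
      n = split($8, kv, ";")           # INFO entries are separated by ";"
      for (i = 1; i <= n; i++) {
        # match either a flag ("QD") or a key=value entry ("QD=...")
        if (kv[i] == key || index(kv[i], key "=") == 1) { hits++; break }
      }
    }
    END { printf "%d %d\n", hits, total }
  ' "$vcf"
}
```

If the first number is 0, no record has QD and the error is expected. If it is nonzero, the records that do carry QD may simply not overlap any of the training resources within the -L interval.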
You can add annotations to your VCF file with the GATK tool VariantAnnotator. The tool documentation page is here: https://gatk.broadinstitute.org/hc/en-us/articles/4418054223003-VariantAnnotator
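As a rough illustration, a VariantAnnotator call requesting the QualByDepth (QD) annotation might look like the sketch below. The wrapper function `annotate_qd`, the `DRY_RUN` switch, and the BAM and output file names are assumptions made for this example and do not come from the original post. Whether QD can actually be computed depends on the depth information available in the VCF or in the reads you supply, so please check the documentation page above for your GATK version.

```shell
# Sketch only: wrapper name, DRY_RUN switch, and file names are illustrative.
# Builds a VariantAnnotator command that requests QualByDepth (QD).
# Set DRY_RUN=1 to print the command instead of running it.
annotate_qd() {
  ref="$1"      # reference FASTA, e.g. Homo_sapiens_hg38.fasta
  in_vcf="$2"   # callset missing QD, e.g. SNP_samples.vcf
  bam="$3"      # reads the calls were made from (assumed to be available)
  out_vcf="$4"  # annotated output VCF
  set -- gatk VariantAnnotator -R "$ref" -V "$in_vcf" -I "$bam" -A QualByDepth -O "$out_vcf"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 annotate_qd Homo_sapiens_hg38.fasta SNP_samples.vcf sample.bam SNP_samples.annotated.vcf` prints the command so you can review it first. The annotated output would then be passed to VariantRecalibrator with -V.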
Please let me know if you have any further questions.
Best,
Genevieve