What happens with VariantRecalibrator when annotations are missing?
REQUIRED for all errors and issues:
a) GATK version used: 4.2.6.1
b) Exact command used:
gatk VariantRecalibrator \
-R GRCh38.fasta \
-V Input.vcf.bgz \
--resource:hapmap,known=false,training=true,truth=true,prior=15.0 resources_broad_hg38_v0_hapmap_3.3.hg38.vcf.gz \
--resource:omni,known=false,training=true,truth=true,prior=12.0 resources_broad_hg38_v0_1000G_omni2.5.hg38.vcf.gz \
--resource:1000G,known=false,training=true,truth=false,prior=10.0 1000G_phase1.snps.high_confidence.hg38.vcf.gz \
--resource:dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp_138.hg38.vcf.gz \
-an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -an DP \
-mode SNP \
--trust-all-polymorphic \
-tranche 100.0 -tranche 99.9 -tranche 99.8 -tranche 99.7 -tranche 99.6 -tranche 99.5 -tranche 99.4 -tranche 99.3 -tranche 99.2 -tranche 99.0 -tranche 90.0 \
-O Output.recal \
--tranches-file Output.recal.tranches \
--rscript-file Output.recal.plots.R
I started running VariantRecalibrator and it seems to run OK, but my VCF does not contain all of the annotations I asked it to use. Does it not fail when annotations are missing from the file? Does it calculate them on the fly?
-
Hi Matt Johnson,
Not all variants will have the complete set of annotations. In particular, homozygous SNP positions will lack MQRankSum and similar annotations, which are valid only for heterozygous positions. In such cases VariantRecalibrator is smart enough to recognize why the annotation is missing and does not penalize those sites for the lack of information. There are still plenty of other parameters to check for those sites, such as presence in the truth data.
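To see how often each requested annotation is actually absent, a quick count over the INFO column can help. The following is only a minimal sketch: count_missing_annotations is a hypothetical helper, not part of GATK, and it assumes uncompressed VCF text on stdin with the annotations stored as INFO keys. It reports what is present in the file and says nothing about how VariantRecalibrator treats those sites.

```shell
# Hypothetical helper (not a GATK tool): for each annotation name given as an
# argument, count the variant records whose INFO column lacks that key.
# Reads uncompressed VCF text from stdin.
# Output columns: annotation, records missing it, total records.
count_missing_annotations() {
  awk -v keys="$*" '
    BEGIN { FS = "\t"; n = split(keys, want, " ") }
    /^#/ { next }                        # skip meta and header lines
    {
      total++
      split("", present)                 # portable way to clear the array
      m = split($8, fields, ";")         # column 8 of a VCF record is INFO
      for (i = 1; i <= m; i++) {
        split(fields[i], kv, "=")        # KEY=VALUE, or a bare flag
        present[kv[1]] = 1
      }
      for (j = 1; j <= n; j++)
        if (!(want[j] in present)) missing[want[j]]++
    }
    END {
      for (j = 1; j <= n; j++)
        printf "%s\t%d\t%d\n", want[j], missing[want[j]] + 0, total + 0
    }
  '
}
```

With the file from the command above, it could be run as:
gzip -dc < Input.vcf.bgz | count_missing_annotations QD MQ MQRankSum ReadPosRankSum FS SOR DP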
Regards.