VariantRecalibrator resource
Hello, I am using the command below. I am working with mammalian genomes (sheep). I cannot locate the TRUEVAR files. Is there any alternative? Could they be substituted by a high-density chip array, e.g. the 500K from Illumina, even if the density is obviously much lower?
Thank you
a) GATK version used: 4.5.0.0
b) Exact command used:
java -d64 -Xmx48g -jar ${GenomeAnalysisTK.jar} -T VariantRecalibrator -R ${REF} -input ${allsample_joint}.vcf.gz
-resource: dbSNP, known=false, training=true, truth=true, prior=15.0${TRUEVAR}
-resource: dbSNP, known=true, training=false, truth=false, prior=2.0${KNOWNVAR}
-an DP -an QD -an MQRankSum -an ReadPosRankSum -an FS -an SOR -mode SNP
-tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0
-recalFile ${allsample_joint}_recalibrate_SNP.recal
-tranchesFile ${allsample_joint}_recalibrate_SNP.tranches
-rscriptFile ${allsample_joint}_recalibrate_SNP_plots.R
-
Data sources for sheep are available from Ensembl, iSheep, and some other related databases. Your mileage may vary; however, if these are the only sources for your organism, we think they are a good place to start.
Here is what I could locate by checking the dbSNP archives. I am sure a Google search will turn up even more resources.
https://ftp.ncbi.nih.gov/snp/organisms/archive/sheep_9940/VCF/
Please pay attention to the reference genome you use to map your reads: compatibility between your reference genome and these external sources is of the utmost importance for your workflow to run to completion properly.
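As a rough first check of that compatibility, the sketch below compares the contig names used in a resource VCF against the contigs listed in the reference's `.fai` index. It is only a minimal illustration: the function names are hypothetical, it assumes the reference has been indexed with `samtools faidx`, and it compares names only. Matching names do not prove that the coordinates come from the same assembly version, so the assembly stated by the data source still needs to be checked by hand.

```python
import gzip
import re


def open_text(path):
    """Open a plain or gzip/bgzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def fai_contigs(fai_path):
    """Return {contig name: length} read from a samtools .fai index."""
    contigs = {}
    with open(fai_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                contigs[fields[0]] = int(fields[1])
    return contigs


def vcf_contigs(vcf_path):
    """Return the set of contig names found in a VCF header and its records."""
    names = set()
    with open_text(vcf_path) as handle:
        for line in handle:
            if line.startswith("##contig="):
                # Header form: ##contig=<ID=name,length=...>
                match = re.search(r"ID=([^,>]+)", line)
                if match:
                    names.add(match.group(1))
            elif not line.startswith("#") and line.strip():
                # Data record: the first tab-separated column is CHROM
                names.add(line.split("\t", 1)[0])
    return names


def missing_from_reference(vcf_path, fai_path):
    """Return the sorted contig names used by the VCF but absent from the reference."""
    return sorted(vcf_contigs(vcf_path) - set(fai_contigs(fai_path)))
```

Calling `missing_from_reference` with the path to a downloaded resource VCF and the path to your reference's `.fai` file returns an empty list when every contig name in the resource is also present in the reference. A non-empty list (for example `1` in the resource versus `chr1` in the reference) suggests the resource needs renaming or lifting over before it is passed to VariantRecalibrator.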
I hope this helps.
-
Gökalp Çelik, thank you very much! Really appreciated!