How can I get a per-sample summary from VariantEval?
I want to get the number of SNPs, indels and MNPs in each sample. The overview of VariantEval says "These metrics include number of s per sample", and "Sample and variable summary" is mentioned in the Caveat part of the VariantEval introduction.
My command is:
vcf_filename=../8_SNP_no_mtDNA.vcf.gz
fasta_file=../S288C_Num.fna
gatk VariantEval \
-R $fasta_file \
-O ./output.eval.grp \
--eval $vcf_filename
The content of the resulting file is:
#:GATKReport.v1.1:9
#:GATKTable:11:3:%s:%s:%s:%s:%s:%d:%d:%d:%.2f:%d:%.2f:;
#:GATKTable:CompOverlap:The overlap between eval and comp sites
CompOverlap CompFeatureInput EvalFeatureInput JexlExpression Novelty nEvalVariants novelSites nVariantsAtComp compRate nConcordant concordantRate
CompOverlap none eval none all 151599 151599 0 0.00 0 0.00
CompOverlap none eval none known 0 0 0 0.00 0 0.00
CompOverlap none eval none novel 151599 151599 0 0.00 0 0.00
#:GATKTable:30:3:%s:%s:%s:%s:%s:%d:%d:%d:%d:%.8f:%.8f:%d:%d:%d:%d:%d:%d:%d:%d:%d:%d:%d:%d:%d:%.2e:%.2f:%.2f:%.2e:%.2f:%.2f:;
#:GATKTable:CountVariants:Counts different classes of variants in the sample
CountVariants CompFeatureInput EvalFeatureInput JexlExpression Novelty nProcessedLoci nCalledLoci nRefLoci nVariantLoci variantRate variantRatePerBp nSNPs nMNPs nInsertions nDeletions nComplex nSymbolic nMixed nNoCalls nHets nHomRef nHomVar nSingletons nHomDerived heterozygosity heterozygosityPerBp hetHomRatio indelRate indelRatePerBp insertionDeletionRatio
CountVariants none eval none all 12157105 852411 700812 151599 0.01246999 80.00000000 151599 0 0 0 0 0 0 369859 185890 36717503 1937654 18894 0 1.53e-02 65.00 0.10 0.00e+00 0.00 0.00
CountVariants none eval none known 12157105 0 0 0 0.00000000 0.00000000 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0.00 0.00 0.00e+00 0.00 0.00
CountVariants none eval none novel 12157105 852411 700812 151599 0.01246999 80.00000000 151599 0 0 0 0 0 0 369859 185890 36717503 1937654 18894 0 1.53e-02 65.00 0.10 0.00e+00 0.00 0.00
#:GATKTable:7:60:%s:%s:%s:%s:%s:%d:%.2f:;
#:GATKTable:IndelLengthHistogram:Indel length histogram
IndelLengthHistogram CompFeatureInput EvalFeatureInput JexlExpression Novelty Length Freq
IndelLengthHistogram none eval none all -10 0.00
IndelLengthHistogram none eval none all -9 0.00
IndelLengthHistogram none eval none all -8 0.00
IndelLengthHistogram none eval none all -7 0.00
IndelLengthHistogram none eval none all -6 0.00
IndelLengthHistogram none eval none all -5 0.00
IndelLengthHistogram none eval none all -4 0.00
IndelLengthHistogram none eval none all -3 0.00
IndelLengthHistogram none eval none all -2 0.00
IndelLengthHistogram none eval none all -1 0.00
IndelLengthHistogram none eval none all 1 0.00
IndelLengthHistogram none eval none all 2 0.00
IndelLengthHistogram none eval none all 3 0.00
IndelLengthHistogram none eval none all 4 0.00
IndelLengthHistogram none eval none all 5 0.00
IndelLengthHistogram none eval none all 6 0.00
IndelLengthHistogram none eval none all 7 0.00
IndelLengthHistogram none eval none all 8 0.00
IndelLengthHistogram none eval none all 9 0.00
IndelLengthHistogram none eval none all 10 0.00
IndelLengthHistogram none eval none known -10 0.00
IndelLengthHistogram none eval none known -9 0.00
IndelLengthHistogram none eval none known -8 0.00
IndelLengthHistogram none eval none known -7 0.00
IndelLengthHistogram none eval none known -6 0.00
IndelLengthHistogram none eval none known -5 0.00
IndelLengthHistogram none eval none known -4 0.00
IndelLengthHistogram none eval none known -3 0.00
IndelLengthHistogram none eval none known -2 0.00
IndelLengthHistogram none eval none known -1 0.00
IndelLengthHistogram none eval none known 1 0.00
IndelLengthHistogram none eval none known 2 0.00
IndelLengthHistogram none eval none known 3 0.00
IndelLengthHistogram none eval none known 4 0.00
IndelLengthHistogram none eval none known 5 0.00
IndelLengthHistogram none eval none known 6 0.00
IndelLengthHistogram none eval none known 7 0.00
IndelLengthHistogram none eval none known 8 0.00
IndelLengthHistogram none eval none known 9 0.00
IndelLengthHistogram none eval none known 10 0.00
IndelLengthHistogram none eval none novel -10 0.00
IndelLengthHistogram none eval none novel -9 0.00
IndelLengthHistogram none eval none novel -8 0.00
IndelLengthHistogram none eval none novel -7 0.00
IndelLengthHistogram none eval none novel -6 0.00
IndelLengthHistogram none eval none novel -5 0.00
IndelLengthHistogram none eval none novel -4 0.00
IndelLengthHistogram none eval none novel -3 0.00
IndelLengthHistogram none eval none novel -2 0.00
IndelLengthHistogram none eval none novel -1 0.00
IndelLengthHistogram none eval none novel 1 0.00
IndelLengthHistogram none eval none novel 2 0.00
IndelLengthHistogram none eval none novel 3 0.00
IndelLengthHistogram none eval none novel 4 0.00
IndelLengthHistogram none eval none novel 5 0.00
IndelLengthHistogram none eval none novel 6 0.00
IndelLengthHistogram none eval none novel 7 0.00
IndelLengthHistogram none eval none novel 8 0.00
IndelLengthHistogram none eval none novel 9 0.00
IndelLengthHistogram none eval none novel 10 0.00
#:GATKTable:30:3:%s:%s:%s:%s:%s:%d:%d:%d:%d:%d:%s:%s:%s:%s:%s:%d:%s:%s:%s:%s:%s:%s:%s:%s:%s:%s:%s:%s:%s:%s:;
#:GATKTable:IndelSummary:Evaluation summary for indels
IndelSummary CompFeatureInput EvalFeatureInput JexlExpression Novelty n_SNPs n_singleton_SNPs n_indels n_singleton_indels n_indels_matching_gold_standard gold_standard_matching_rate n_multiallelic_indel_sites percent_of_sites_with_more_than_2_alleles SNP_to_indel_ratio SNP_to_indel_ratio_for_singletons n_novel_indels indel_novelty_rate n_insertions n_deletions insertion_to_deletion_ratio n_large_deletions n_large_insertions insertion_to_deletion_ratio_for_large_indels n_coding_indels_frameshifting n_coding_indels_in_frame frameshift_rate_for_coding_indels SNP_het_to_hom_ratio indel_het_to_hom_ratio ratio_of_1_and_2_to_3_bp_insertions ratio_of_1_and_2_to_3_bp_deletions
IndelSummary none eval none all 153745 18894 0 0 0 NA 0 NA NA NA 0 NA 0 0 NA 0 0 NA 0 0 NA 0.10 NA NA NA
IndelSummary none eval none known 0 0 0 0 0 NA 0 NA NA NA 0 NA 0 0 NA 0 0 NA 0 0 NA NA NA NA NA
IndelSummary none eval none novel 153745 18894 0 0 0 NA 0 NA NA NA 0 NA 0 0 NA 0 0 NA 0 0 NA 0.10 NA NA NA
#:GATKTable:13:3:%s:%s:%s:%s:%s:%.2f:%d:%d:%d:%d:%s:%.2f:%.2f:;
#:GATKTable:MetricsCollection:Metrics Collection
MetricsCollection CompFeatureInput EvalFeatureInput JexlExpression Novelty concordantRate nSNPs nSNPloci nIndels nIndelLoci indelRatio indelRatioLociBased tiTvRatio
MetricsCollection none eval none all 0.00 153745 151599 0 0 NA 0.00 2.40
MetricsCollection none eval none known 0.00 0 0 0 0 NA 0.00 0.00
MetricsCollection none eval none novel 0.00 153745 151599 0 0 NA 0.00 2.40
#:GATKTable:20:3:%s:%s:%s:%s:%s:%d:%d:%d:%.5f:%.3f:%d:%d:%.5f:%.3f:%d:%d:%.2f:%d:%d:%s:;
#:GATKTable:MultiallelicSummary:Evaluation summary for multi-allelic variants
MultiallelicSummary CompFeatureInput EvalFeatureInput JexlExpression Novelty nProcessedLoci nSNPs nMultiSNPs processedMultiSnpRatio variantMultiSnpRatio nIndels nMultiIndels processedMultiIndelRatio variantMultiIndelRatio nTi nTv TiTvRatio knownSNPsPartial knownSNPsComplete SNPNoveltyRate
MultiallelicSummary none eval none all 12157105 151599 2092 0.00017 0.014 0 0 0.00000 NaN 1744 2494 0.70 0 0 100.00
MultiallelicSummary none eval none known 12157105 0 0 0.00000 NaN 0 0 0.00000 NaN 0 0 NaN 0 0 NA
MultiallelicSummary none eval none novel 12157105 151599 2092 0.00017 0.014 0 0 0.00000 NaN 1744 2494 0.70 0 0 100.00
#:GATKTable:14:3:%s:%s:%s:%s:%s:%d:%d:%.2f:%d:%d:%.2f:%d:%d:%.2f:;
#:GATKTable:TiTvVariantEvaluator:Ti/Tv Variant Evaluator
TiTvVariantEvaluator CompFeatureInput EvalFeatureInput JexlExpression Novelty nTi nTv tiTvRatio nTiInComp nTvInComp TiTvRatioStandard nTiDerived nTvDerived tiTvDerivedRatio
TiTvVariantEvaluator none eval none all 105478 44029 2.40 0 0 0.00 0 0 0.00
TiTvVariantEvaluator none eval none known 0 0 0.00 0 0 0.00 0 0 0.00
TiTvVariantEvaluator none eval none novel 105478 44029 2.40 0 0 0.00 0 0 0.00
#:GATKTable:24:3:%s:%s:%s:%s:%s:%d:%d:%d:%d:%d:%.2f:%.2f:%.2f:%.2f:%d:%d:%d:%d:%d:%d:%d:%d:%d:%d:;
#:GATKTable:ValidationReport:Assess site accuracy and sensitivity of callset against follow-up validation assay
ValidationReport CompFeatureInput EvalFeatureInput JexlExpression Novelty nComp TP FP FN TN sensitivity specificity PPV FDR CompMonoEvalNoCall CompMonoEvalFiltered CompMonoEvalMono CompMonoEvalPoly CompPolyEvalNoCall CompPolyEvalFiltered CompPolyEvalMono CompPolyEvalPoly CompFiltered nDifferentAlleleSites
ValidationReport none eval none all 0 0 0 0 0 NaN 100.00 NaN NaN 0 0 0 0 0 0 0 0 0 0
ValidationReport none eval none known 0 0 0 0 0 NaN 100.00 NaN NaN 0 0 0 0 0 0 0 0 0 0
ValidationReport none eval none novel 0 0 0 0 0 NaN 100.00 NaN NaN 0 0 0 0 0 0 0 0 0 0
#:GATKTable:20:3:%s:%s:%s:%s:%s:%d:%d:%d:%.2f:%s:%d:%.2f:%.1f:%d:%s:%d:%.1f:%d:%s:%d:;
#:GATKTable:VariantSummary:1000 Genomes Phase I summary of variants table
VariantSummary CompFeatureInput EvalFeatureInput JexlExpression Novelty nSamples nProcessedLoci nSNPs TiTvRatio SNPNoveltyRate nSNPsPerSample TiTvRatioPerSample SNPDPPerSample nIndels IndelNoveltyRate nIndelsPerSample IndelDPPerSample nSVs SVNoveltyRate nSVsPerSample
VariantSummary none eval none all 46 12157105 151599 2.40 100.00 46164 2.85 46164.0 0 NA 0 0.0 0 NA 0
VariantSummary none eval none known 46 12157105 0 0.00 NA 0 0.00 0.0 0 NA 0 0.0 0 NA 0
VariantSummary none eval none novel 46 12157105 151599 2.40 100.00 46164 2.85 46164.0 0 NA 0 0.0 0 NA 0
There is no per-sample summary in this output: I did not get the number of SNPs, indels and MNPs for each sample. How can I get it?
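For reference, per-sample counts can also be tallied directly from the genotype columns of the VCF, independently of VariantEval. The following is a minimal sketch, not a GATK feature: `per_sample_counts` is a hypothetical helper name, and the classification rule (REF and ALT both one base is a SNP, equal lengths longer than one base is an MNP, different lengths is an indel) is a simplifying assumption that may not match VariantEval's own definitions. For each sample it counts the sites where the genotype carries at least one non-reference allele of each kind.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count, per sample, the sites carrying a non-reference
# SNP, MNP or indel allele. Reads uncompressed VCF from files or from stdin.
per_sample_counts() {
    awk -F'\t' '
        /^##/ { next }                       # meta-information lines
        /^#CHROM/ {                          # header: sample names start at column 10
            ncol = NF
            for (i = 10; i <= NF; i++) name[i] = $i
            next
        }
        $9 !~ /^GT(:|$)/ { next }            # skip records without a leading GT key
        {
            reflen = length($4)
            nalt = split($5, alt, ",")
            # Assumed rule: classify each ALT allele by its length relative to REF.
            for (a = 1; a <= nalt; a++) {
                if (alt[a] !~ /^[ACGTNacgtn]+$/)   kind[a] = "other"  # symbolic, "*", breakends
                else if (length(alt[a]) != reflen) kind[a] = "indel"
                else if (reflen == 1)              kind[a] = "snp"
                else                               kind[a] = "mnp"
            }
            for (i = 10; i <= NF; i++) {
                split($i, field, ":")
                ng = split(field[1], g, "[/|]")    # unphased or phased genotype
                split("", seen)                    # kinds already counted for this sample and site
                for (j = 1; j <= ng; j++) {
                    if (g[j] !~ /^[0-9]+$/) continue       # missing allele (".")
                    a = g[j] + 0
                    if (a < 1 || a > nalt) continue        # reference allele or out of range
                    if (!(kind[a] in seen)) { seen[kind[a]] = 1; count[i, kind[a]]++ }
                }
            }
        }
        END {
            print "sample\tnSNPs\tnMNPs\tnIndels"
            for (i = 10; i <= ncol; i++)
                printf "%s\t%d\t%d\t%d\n", name[i], count[i, "snp"], count[i, "mnp"], count[i, "indel"]
        }
    ' "$@"
}
```

For a compressed file such as the one in the command above, the input could be streamed, for example `gzip -dc ../8_SNP_no_mtDNA.vcf.gz | per_sample_counts`. Because the classification rule is an assumption, the totals are not expected to match the VariantEval tables exactly.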
-
Hi rq m,
Thank you for writing in. I'm not sure I understand which metrics you want that are not present in your VariantEval output. It looks like the numbers of SNPs and MNPs per sample are shown under "CountVariants" as "nSNPs" and "nMNPs", and indel numbers are calculated by "IndelSummary" as "n_indels" in your output. Please let me know if these are not the output and metrics you are looking for.
Kind regards,
Pamela
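The tables and columns named in this reply (`CountVariants` with `nSNPs` and `nMNPs`, `IndelSummary` with `n_indels`) can be pulled out of the report by name instead of by counting columns. The following is a minimal sketch, assuming the whitespace-separated GATKReport layout shown in the question; `report_columns` is a hypothetical helper, not part of GATK. In the report above each table has one row per Novelty value (all, known, novel), so the extracted numbers are callset-level totals. A per-sample breakdown would only appear if the report had been generated with a per-sample stratification; VariantEval lists a `Sample` stratification module, reportedly selected with `-ST Sample`, but treat that flag as an assumption and check the documentation for your GATK version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print selected columns of one table in a GATKReport.
# usage: report_columns REPORT TABLE COLUMN [COLUMN...]
# e.g.   report_columns ./output.eval.grp CountVariants Novelty nSNPs nMNPs
#        report_columns ./output.eval.grp IndelSummary Novelty n_indels
report_columns() {
    local report=$1 table=$2
    shift 2
    awk -v table="$table" -v wanted="$*" '
        BEGIN { n = split(wanted, want, " ") }
        /^#/ || NF == 0 { next }             # "#:GATKReport" / "#:GATKTable" lines, blanks
        $1 == table {
            if (!have_header) {
                # The first row that starts with the table name holds the column names.
                for (i = 1; i <= NF; i++) col[$i] = i
                have_header = 1
                for (k = 1; k <= n; k++)
                    if (!(want[k] in col)) {
                        print "no column named " want[k] " in " table > "/dev/stderr"
                        exit 1
                    }
            }
            line = ""
            for (k = 1; k <= n; k++) line = line (k > 1 ? "\t" : "") $(col[want[k]])
            print line                       # header row first, then one line per data row
        }
    ' "$report"
}
```

Selecting by column name keeps the extraction stable if an extra stratification column (such as a sample column) shifts the positions of the metric columns.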