RnaSeqMetrics – GATK

Metrics

Category Metrics

Overview

Metrics about the alignment of RNA-seq reads within a SAM file to genes, produced by the CollectRnaSeqMetrics program and usually stored in a file with the extension ".rna_metrics".
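As a rough illustration of how such a file might be read, the sketch below assumes the usual Picard-style metrics layout: comment lines starting with `#`, a `## METRICS CLASS` line, one tab-separated header row, then value rows up to the first blank line. That layout is an assumption and is not described on this page. The function name `read_metrics` and the sample text are hypothetical.

```python
import io


def read_metrics(handle):
    """Return the metrics rows of a Picard-style metrics file as a list of dicts.

    Assumed layout (not taken from this page): a '## METRICS CLASS' line,
    then a tab-separated header row, then value rows until a blank line.
    Values are returned as strings; convert them as needed.
    """
    lines = (line.rstrip("\r\n") for line in handle)
    for line in lines:
        if line.startswith("## METRICS CLASS"):
            break
    else:
        raise ValueError("no '## METRICS CLASS' section found")

    header = next(lines, "").split("\t")
    rows = []
    for line in lines:
        if not line.strip():  # a blank line ends the metrics section
            break
        rows.append(dict(zip(header, line.split("\t"))))
    return rows


# Illustrative file content only; real files carry many more columns.
sample = (
    "## htsjdk.samtools.metrics.StringHeader\n"
    "# CollectRnaSeqMetrics (illustrative header line)\n"
    "\n"
    "## METRICS CLASS\tpicard.analysis.RnaSeqMetrics\n"
    "PF_BASES\tPF_ALIGNED_BASES\tCODING_BASES\tUTR_BASES\n"
    "1000\t800\t400\t200\n"
    "\n"
    "## HISTOGRAM\tjava.lang.Integer\n"
)
rows = read_metrics(io.StringIO(sample))
```

With a real file, `read_metrics(open("sample.rna_metrics"))` would be the equivalent call, where the file name is a placeholder.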

This table summarizes the values that are specific to this metric.

Metric	Summary
PF_BASES	The total number of PF bases including non-aligned reads.
PF_ALIGNED_BASES	The total number of aligned PF bases. Non-primary alignments are not counted. Bases in aligned reads that do not correspond to reference (e.g. soft clips, insertions) are not counted.
RIBOSOMAL_BASES	Number of bases in primary alignments that align to ribosomal sequence.
CODING_BASES	Number of bases in primary alignments that align to a non-UTR coding base for some gene, and not ribosomal sequence.
UTR_BASES	Number of bases in primary alignments that align to a UTR base for some gene, and not a coding base.
INTRONIC_BASES	Number of bases in primary alignments that align to an intronic base for some gene, and not a coding or UTR base.
INTERGENIC_BASES	Number of bases in primary alignments that do not align to any gene.
IGNORED_READS	Number of primary alignments that are mapped to a sequence specified on the command line as IGNORED_SEQUENCE. These are not counted in PF_ALIGNED_BASES, CORRECT_STRAND_READS, INCORRECT_STRAND_READS, or any of the base-counting metrics. These reads are counted in PF_BASES.
CORRECT_STRAND_READS	Number of aligned reads that are mapped to the correct strand. 0 if library is not strand-specific.
INCORRECT_STRAND_READS	Number of aligned reads that are mapped to the incorrect strand. 0 if library is not strand-specific.
NUM_R1_TRANSCRIPT_STRAND_READS	The number of reads that support the model where R1 is on the strand of transcription and R2 is on the opposite strand.
NUM_R2_TRANSCRIPT_STRAND_READS	The number of reads that support the model where R2 is on the strand of transcription and R1 is on the opposite strand.
NUM_UNEXPLAINED_READS	The number of reads for which the transcription strand model could not be inferred.
PCT_R1_TRANSCRIPT_STRAND_READS	The fraction of reads that support the model where R1 is on the strand of transcription and R2 is on the opposite strand. For unpaired reads, it is the fraction of reads that are on the transcription strand (out of all the reads).
PCT_R2_TRANSCRIPT_STRAND_READS	The fraction of reads that support the model where R2 is on the strand of transcription and R1 is on the opposite strand. For unpaired reads, it is the fraction of reads that are on the strand opposite to the transcription strand (out of all the reads).
PCT_RIBOSOMAL_BASES	Fraction of PF_ALIGNED_BASES that mapped to regions encoding ribosomal RNA, RIBOSOMAL_BASES/PF_ALIGNED_BASES.
PCT_CODING_BASES	Fraction of PF_ALIGNED_BASES that mapped to protein coding regions of genes, CODING_BASES/PF_ALIGNED_BASES.
PCT_UTR_BASES	Fraction of PF_ALIGNED_BASES that mapped to untranslated regions (UTR) of genes, UTR_BASES/PF_ALIGNED_BASES.
PCT_INTRONIC_BASES	Fraction of PF_ALIGNED_BASES that correspond to gene introns, INTRONIC_BASES/PF_ALIGNED_BASES.
PCT_INTERGENIC_BASES	Fraction of PF_ALIGNED_BASES that mapped to intergenic regions of genomic DNA, INTERGENIC_BASES/PF_ALIGNED_BASES.
PCT_MRNA_BASES	Fraction of PF_ALIGNED_BASES that mapped to UTRs or coding regions of mRNA transcripts, PCT_UTR_BASES + PCT_CODING_BASES.
PCT_USABLE_BASES	The number of bases mapping to mRNA divided by the total number of PF bases, (CODING_BASES + UTR_BASES)/PF_BASES.
PCT_CORRECT_STRAND_READS	Fraction of reads corresponding to mRNA transcripts that map to the correct strand of the reference genome, CORRECT_STRAND_READS/(CORRECT_STRAND_READS + INCORRECT_STRAND_READS). 0 if library is not strand-specific.
MEDIAN_CV_COVERAGE	The median coefficient of variation (CV), i.e. stdev/mean, of the coverage values of the 1000 most highly expressed transcripts that have a length greater than the END_BIAS_BASES parameter. Ideal value = 0.
MEDIAN_5PRIME_BIAS	The median 5 prime bias of the 1000 most highly expressed transcripts that have a length greater than the END_BIAS_BASES parameter. The 5 prime bias is calculated per transcript as: mean coverage of the 5 prime-most bases divided by the mean coverage of the whole transcript. The number of end-bases used is specified by the END_BIAS_BASES parameter.
MEDIAN_3PRIME_BIAS	The median 3 prime bias of the 1000 most highly expressed transcripts that have a length greater than the END_BIAS_BASES parameter. The 3 prime bias is calculated per transcript as: mean coverage of the 3 prime-most bases divided by the mean coverage of the whole transcript. The number of end-bases used is specified by the END_BIAS_BASES parameter.
MEDIAN_5PRIME_TO_3PRIME_BIAS	The ratio of coverage at the 5 prime end to the 3 prime end based on the 1000 most highly expressed transcripts that have a length greater than the END_BIAS_BASES parameter. The number of end-bases used is specified by the END_BIAS_BASES parameter.
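
The ratio formulas in the table can be checked by hand. The sketch below recomputes the PCT_* base and strand fractions from raw counts, and the per-transcript 5 prime and 3 prime bias from a coverage vector, following only the formulas stated above. The function names, the zero-denominator handling, and the example numbers are assumptions made for illustration; this is not the CollectRnaSeqMetrics implementation, and it omits the selection of the 1000 most highly expressed transcripts.

```python
from statistics import mean, median


def ratio(numerator, denominator):
    """Divide, returning 0.0 for a zero denominator (an assumption of this sketch)."""
    return numerator / denominator if denominator else 0.0


def derived_fractions(m):
    """Recompute the PCT_* columns from raw counts using the table's formulas."""
    aligned = m["PF_ALIGNED_BASES"]
    out = {
        "PCT_RIBOSOMAL_BASES": ratio(m["RIBOSOMAL_BASES"], aligned),
        "PCT_CODING_BASES": ratio(m["CODING_BASES"], aligned),
        "PCT_UTR_BASES": ratio(m["UTR_BASES"], aligned),
        "PCT_INTRONIC_BASES": ratio(m["INTRONIC_BASES"], aligned),
        "PCT_INTERGENIC_BASES": ratio(m["INTERGENIC_BASES"], aligned),
    }
    out["PCT_MRNA_BASES"] = out["PCT_UTR_BASES"] + out["PCT_CODING_BASES"]
    # Note the different denominator: PF_BASES, not PF_ALIGNED_BASES.
    out["PCT_USABLE_BASES"] = ratio(m["CODING_BASES"] + m["UTR_BASES"], m["PF_BASES"])
    out["PCT_CORRECT_STRAND_READS"] = ratio(
        m["CORRECT_STRAND_READS"],
        m["CORRECT_STRAND_READS"] + m["INCORRECT_STRAND_READS"],
    )
    return out


def end_bias(coverage, end_bases):
    """Return (5 prime bias, 3 prime bias) for one transcript.

    coverage: per-base depth along the transcript, ordered 5 prime to 3 prime.
    end_bases: number of end bases, playing the role of END_BIAS_BASES.
    """
    if len(coverage) <= end_bases:
        raise ValueError("transcript must be longer than end_bases")
    whole = mean(coverage)
    return mean(coverage[:end_bases]) / whole, mean(coverage[-end_bases:]) / whole


def median_end_biases(transcripts, end_bases):
    """Median 5 prime and 3 prime bias over a collection of coverage vectors."""
    biases = [end_bias(cov, end_bases) for cov in transcripts]
    return median(b[0] for b in biases), median(b[1] for b in biases)


# Made-up counts, chosen so the four base categories sum to PF_ALIGNED_BASES.
example = {
    "PF_BASES": 1000, "PF_ALIGNED_BASES": 800,
    "RIBOSOMAL_BASES": 40, "CODING_BASES": 400, "UTR_BASES": 200,
    "INTRONIC_BASES": 100, "INTERGENIC_BASES": 60,
    "CORRECT_STRAND_READS": 90, "INCORRECT_STRAND_READS": 10,
}
fractions = derived_fractions(example)
five_prime, three_prime = end_bias([1, 1, 2, 2, 2, 2, 3, 3], end_bases=2)
```

In this made-up example, PCT_MRNA_BASES is larger than PCT_USABLE_BASES because the former is relative to aligned bases while the latter is relative to all PF bases, including those in unaligned reads.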

GATK version 4.5.0.0 built at Tue, 9 Jan 2024 14:37:17 -0500.
