Data-driven miRNA sequencing Normalization Assessment
assessNormalization.Rd
Arguments
- raw
Raw read count matrix (rows = genes, cols = samples). The rows and columns of the count matrix must be named, where rownames(raw) are the marker names and colnames(raw) are the sample names.
- normalized
Named list of normalized count matrices. Each matrix holds the normalized read count matrix corresponding to a normalization method under study. Each list member must be named (e.g. after the used normalization). Each matrix in normalized must be named, where the row names are the marker names and the column names are the sample names. A list of normalized counts can be generated using the applyNormalization function.
- negControls
Vector of negative control markers as generated by the function defineControls.
- posControls
Vector of positive control markers as generated by the function defineControls.
- clusters
Named vector of clusters. Associates each miRNA in raw to a polycistronic cluster. Usually generated using the function defineClusters.
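The naming requirements above are easy to get wrong, so a small base-R sketch of inputs in the expected shape may help. Everything here is illustrative: the marker names, sample names, cluster labels, and counts are made up, and the counts-per-million scaling only stands in for a real normalization method (in practice the list would typically come from applyNormalization).

```r
# Toy inputs in the shape the arguments describe (all values are made up).
set.seed(1)
markers <- paste0("miR-", 1:6)      # hypothetical marker names
samples <- paste0("sample", 1:4)    # hypothetical sample names

# raw: named count matrix, rows = markers, columns = samples
raw <- matrix(rpois(24, lambda = 50), nrow = 6,
              dimnames = list(markers, samples))

# normalized: named list of matrices carrying the same dimnames as raw.
# Counts-per-million scaling is a placeholder for a real normalization.
cpm <- sweep(raw, 2, colSums(raw), "/") * 1e6
normalized <- list(CPM = cpm)

# clusters: named vector mapping each marker in raw to a cluster label
clusters <- setNames(rep(c("clusterA", "clusterB"), each = 3), markers)

# Consistency checks on the names before running an assessment
stopifnot(
  !is.null(names(normalized)),
  all(vapply(normalized,
             function(m) identical(dimnames(m), dimnames(raw)),
             logical(1))),
  identical(names(clusters), rownames(raw))
)
```

Checking the names up front, as in the last step, catches mismatched or missing row and column names before any metric is computed.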
Value
DANA assessment metrics for the provided normalized counts (one set of metrics for each normalized count matrix). DANA computes two assessment metrics:
- cc
cc measures the preservation of biological signals before versus after normalization. A high value indicates a high preservation of biological signals (cc <= 1). In particular, cc is the concordance correlation coefficient of the within-cluster partial correlation among positive controls before and after normalization.
- mscr
mscr measures the relative reduction of handling effects before versus after normalization. A high mscr indicates higher removal of handling effects. In particular, mscr is the mean-squared correlation reduction in negative controls before and after normalization.
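To make the two definitions concrete, here is a rough base-R sketch of the underlying quantities. It is not DANA's implementation: ccc() is the standard concordance correlation coefficient (Lin's definition, with 1/n moments), msc_reduction() assumes that "relative reduction" means (before - after) / before applied to mean squared correlations, and the correlation vectors are hypothetical values rather than partial correlations estimated from data.

```r
# Concordance correlation coefficient between two numeric vectors
# (Lin's definition, using population 1/n moments).
ccc <- function(x, y) {
  mx <- mean(x)
  my <- mean(y)
  sxy <- mean((x - mx) * (y - my))
  sx2 <- mean((x - mx)^2)
  sy2 <- mean((y - my)^2)
  2 * sxy / (sx2 + sy2 + (mx - my)^2)
}

# Relative reduction in mean squared correlation (assumed form of mscr).
msc_reduction <- function(cor_before, cor_after) {
  (mean(cor_before^2) - mean(cor_after^2)) / mean(cor_before^2)
}

# Hypothetical correlation values, for illustration only
pos_before <- c(0.80, 0.60, 0.70, 0.90)  # positive controls, raw
pos_after  <- c(0.75, 0.55, 0.72, 0.85)  # positive controls, normalized
neg_before <- c(0.50, 0.40, 0.60)        # negative controls, raw
neg_after  <- c(0.10, 0.20, 0.15)        # negative controls, normalized

ccc(pos_before, pos_before)              # identical inputs give 1
ccc(pos_before, pos_after)               # below 1 once the values drift apart
msc_reduction(neg_before, neg_after)     # positive when correlations shrink
```

Under these assumptions, a normalization that leaves the positive-control correlations unchanged gives a cc of 1, and one that shrinks the negative-control correlations gives a positive mscr.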
When selecting a normalization method for the raw data, one should aim for the best possible trade-off of high cc and high mscr.
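One possible way to narrow down the candidates, sketched below in base R, is to discard every method that another method beats on both metrics. This is only an illustration of the trade-off idea, not a selection rule prescribed by DANA: the metrics table, its layout (one row per method with cc and mscr columns), the method names, and the values are all hypothetical.

```r
# Hypothetical metrics table: one row per normalization method (values made up).
metrics <- data.frame(
  cc   = c(0.95, 0.90, 0.70, 0.85),
  mscr = c(0.20, 0.45, 0.60, 0.40),
  row.names = c("methodA", "methodB", "methodC", "methodD")
)

# A method is dominated if another method is at least as good on both
# metrics and strictly better on at least one of them.
is_dominated <- function(i, m) {
  any(m$cc >= m$cc[i] & m$mscr >= m$mscr[i] &
        (m$cc > m$cc[i] | m$mscr > m$mscr[i]))
}
dominated <- vapply(seq_len(nrow(metrics)), is_dominated, logical(1),
                    m = metrics)

# Methods that remain are the ones offering a defensible trade-off
candidates <- rownames(metrics)[!dominated]
candidates
```

In this toy table, methodD is dropped because methodB has both a higher cc and a higher mscr. The choice among the remaining methods depends on how much signal preservation one is willing to give up for stronger removal of handling effects.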