Variance stabilizing normalization
vs.norm.Rd
Normalizes a training dataset with vsn and stores the fitted vsn model as the reference for frozen variance stabilizing normalization of a test dataset. Two other modes are available: normalize only a training dataset without frozen normalizing a test dataset, or frozen normalize only a test dataset against a previously stored reference.
Arguments
- train
training dataset to be variance stabilizing normalized. The dataset must have probes as rows and samples as columns. This can be left unspecified if ref.dis is supplied to frozen normalize a test set.
- test
test dataset to be frozen variance stabilizing normalized. The dataset must have probes as rows and samples as columns. The number of rows must equal the number of rows in the training set. By default, the test set is not specified (test = NULL) and no frozen normalization is performed.
- ref.dis
reference distribution used to frozen variance stabilizing normalize a test set against a previously normalized training set. This is required when train is not supplied. By default, ref.dis = NULL.
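The argument combinations described above can be sketched as a small input check. This is only an illustration in base R, not the package's implementation: the helper name check_vs_norm_inputs, its error messages, and the mode labels are hypothetical, and treating "neither train nor test supplied" as an error is an assumption.

```r
# Hypothetical helper (not part of the package): validates the documented
# argument combinations and reports which normalization mode would apply.
check_vs_norm_inputs <- function(train = NULL, test = NULL, ref.dis = NULL) {
  # Assumption: with neither dataset supplied there is nothing to normalize.
  if (is.null(train) && is.null(test)) {
    stop("supply at least one of 'train' or 'test'")
  }
  # Without a training set, a previously stored reference is required.
  if (is.null(train) && is.null(ref.dis)) {
    stop("'ref.dis' is required when 'train' is not supplied")
  }
  # Training and test sets must have the same number of rows (probes).
  if (!is.null(train) && !is.null(test) && nrow(train) != nrow(test)) {
    stop("'train' and 'test' must have the same number of rows")
  }
  if (is.null(test)) {
    "train only"
  } else if (is.null(train)) {
    "frozen test only"
  } else {
    "train and frozen test"
  }
}

# Toy matrices: 5 probes (rows), 4 training and 2 test samples (columns).
train.toy <- matrix(rnorm(20), nrow = 5)
test.toy <- matrix(rnorm(10), nrow = 5)
mode.both <- check_vs_norm_inputs(train = train.toy, test = test.toy)
```

A check of this kind fails early on a row mismatch or a missing reference, before any model fitting is attempted.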
Value
a list of two datasets and one reference distribution:
- train.mn
the normalized training set
- test.fmn
the frozen normalized test set, if a test set is specified
- ref.dis
the reference distribution
References
Wolfgang Huber, Anja von Heydebreck, Holger Sueltmann, Annemarie Poustka and Martin Vingron. Variance Stabilization Applied to Microarray Data Calibration and to the Quantification of Differential Expression. Bioinformatics 18, S96-S104 (2002).
Examples
if (FALSE) {
  set.seed(101)
  group.id <- substr(colnames(nuhdata.pl), 7, 7)
  train.ind <- colnames(nuhdata.pl)[c(sample(which(group.id == "E"), size = 64),
                                      sample(which(group.id == "V"), size = 64))]
  train.dat <- nuhdata.pl[, train.ind]
  test.dat <- nuhdata.pl[, !colnames(nuhdata.pl) %in% train.ind]

  # normalize only training set
  data.vsn <- vs.norm(train = train.dat)
  str(data.vsn)

  # normalize training set and frozen normalize test set
  data.vsn <- vs.norm(train = train.dat, test = test.dat)
  str(data.vsn)

  # frozen normalize test set with reference distribution
  ref <- vs.norm(train = train.dat)$ref.dis
  data.vsn <- vs.norm(test = test.dat, ref.dis = ref)
  str(data.vsn)
}