Test for optimal data processing conditions
Source: R/metabolomics_computation.R (met.test_normalization.Rd)
met.test_normalization runs a metabolomics analysis workflow with
several normalization/transformation/scaling combinations in parallel. The workflow
includes univariate analysis (ANOVA) and multivariate analyses (PCA and PLS-DA) and
creates a report with various performance indicators that help decide on
optimal data preprocessing conditions.
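To make concrete why the choice of scaling can change downstream results, the sketch below applies the conventional textbook definitions of four scaling options named on this page to a single feature. `scale_feature` is a hypothetical helper written for illustration only; the implementation inside met.normalize may differ in detail.

```r
# Conventional definitions of common scaling methods (assumed here;
# not taken from the package source).
scale_feature <- function(x, method = c("MeanCenter", "AutoNorm",
                                        "ParetoNorm", "RangeNorm")) {
  method <- match.arg(method)
  centered <- x - mean(x)
  switch(method,
    MeanCenter = centered,                        # subtract the mean only
    AutoNorm   = centered / sd(x),                # unit variance
    ParetoNorm = centered / sqrt(sd(x)),          # square root of the SD
    RangeNorm  = centered / (max(x) - min(x))     # divide by the value range
  )
}

x <- c(2, 4, 6, 8)  # made-up intensities of one feature across four samples
methods <- c("MeanCenter", "AutoNorm", "ParetoNorm", "RangeNorm")
scaled <- sapply(methods, function(m) scale_feature(x, m))
print(round(scaled, 3))
```

Each column is centered at zero, but the spread differs by method, which is why the workflow compares several conditions side by side.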
Usage
met.test_normalization(
mSetObj,
test_conditions = NULL,
ref = NULL,
class_order = FALSE,
alpha = 0.05,
lfc = 2,
posthoc_method = "tukey",
nonpar = FALSE,
pls.cv.method = "LOOCV",
pls.cv.k = 10,
dpi = 300,
pls.data = "all",
permut.num = 500,
vip.thresh = 1,
report = FALSE,
report.dir = NULL,
export = FALSE,
export.format = "pdf",
export.dir = "met.Test_Normalization"
)
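A call might be assembled as in the following sketch. The object name `mSet` is hypothetical and stands for an mSetObj created beforehand with met.read_data; the chosen conditions and thresholds are illustrative values, not recommendations. The call itself is guarded so that the snippet also runs where the function or the object is not available.

```r
# Illustrative argument values; all names match the Usage signature above.
norm_test_args <- list(
  test_conditions = c("LogNorm", "LogNorm/AutoNorm", "CrNorm/ParetoNorm"),
  alpha           = 0.05,
  lfc             = 1,
  posthoc_method  = "tukey",
  pls.cv.method   = "CV",
  pls.cv.k        = 5,
  report          = TRUE,
  export          = TRUE,
  export.format   = "png"
)

# `mSet` is an assumed object name; it is not defined in this snippet.
if (exists("met.test_normalization") && exists("mSet")) {
  do.call(met.test_normalization, c(list(mSetObj = mSet), norm_test_args))
}
```

Collecting the arguments in a list first makes it easy to keep the tested settings alongside the generated report.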
Arguments
- mSetObj
Enter the name of the created mSetObj (see met.read_data).
- test_conditions
(Character vector) Enter the combinations of rowNorm, transNorm, and scaleNorm, separated by "/", to be tested in the workflow. See met.normalize for suitable options. If NA, the normalization test workflow will automatically choose c("NULL", "MeanCenter", "LogNorm", "LogNorm/AutoNorm", "CrNorm", "CrNorm/AutoNorm", "AutoNorm", "RangeNorm", "ParetoNorm") as default conditions. Please note: the rowNorm option "SpecNorm" is not supported by this workflow.
- ref
(Character vector) Enter the name(s) of the reference sample(s) or the reference feature(s) for rowNorm = "GroupPQN", "SamplePQN", or "CompNorm" in their respective order in test_conditions. Add "NULL" if the ref argument is not applicable to the respective test condition.
- class_order
(Logical, TRUE or FALSE) Whether class order matters (e.g., when classes represent time points or disease severity).
- alpha
(Numeric) Enter significance threshold for adjusted p values (false discovery rate - FDR; for ANOVA with post-hoc analyses).
- lfc
(Numeric) Enter relevance threshold for log2 fold changes in pair-wise comparisons (for ANOVA with post-hoc analyses).
- posthoc_method
(Character) Enter the name of the ANOVA post-hoc test, "fisher" or "tukey".
- nonpar
(Logical) Use a non-parametric ANOVA test (TRUE) or not (FALSE).
- pls.cv.method
(Character) Enter one of two methods for PLS-DA model (cross) validation: "LOOCV" performs leave-one-out cross validation; "CV" performs k-fold cross validation.
- pls.cv.k
(Numeric) The number of (randomized) groups that the dataset is split into during cross validation if pls.cv.method = "CV". This value must be equal to or smaller than the number of samples.
- dpi
(Numeric) Resolution of PNG images (default is 300 dpi).
- pls.data
(Character) Enter "all" to train the PLS(-DA) model on your whole (filtered and normalized) dataset or "anova" to use a subset of features defined as significant based on ANOVA analysis.
- permut.num
(Numeric) Number of permutations in PLS-DA permutation tests.
- vip.thresh
(Numeric) Enter a chosen relevance threshold for PLS-DA VIP scores.
- report
(Logical) Generate a report with results of this workflow (TRUE) or not (FALSE).
- report.dir
(Character) Enter the name or location of the folder in which the report files are generated if report = TRUE. If NULL (the default), a new folder "Report_date-time" is created in your working directory.
- export
(Logical) Export generated plots as PDF or PNG files (TRUE) or not (FALSE).
- export.format
(Character, "png" or "pdf") Image file format (if export = TRUE).
- export.dir
(Character) Enter the name or location of a folder in which all generated files are saved.
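The pairing of test_conditions and ref described above can be checked before starting a long run. The following is a minimal sketch under one reading of the argument descriptions: one ref entry per test condition, with "NULL" where no reference applies. `check_conditions` is a hypothetical helper and not part of the package, and the reference names "Control" and "InternalStandard" are made up.

```r
# Hypothetical pre-flight check; assumes one `ref` entry per test condition.
check_conditions <- function(test_conditions, ref = NULL) {
  # Split each condition such as "GroupPQN/LogNorm/AutoNorm" into its parts.
  tokens <- strsplit(test_conditions, "/", fixed = TRUE)

  # "SpecNorm" is documented as unsupported by this workflow.
  has_spec <- vapply(tokens, function(t) "SpecNorm" %in% t, logical(1))
  if (any(has_spec)) {
    stop("rowNorm option 'SpecNorm' is not supported by this workflow")
  }

  # These rowNorm options are documented as requiring a reference.
  ref_methods <- c("GroupPQN", "SamplePQN", "CompNorm")
  needs_ref <- vapply(tokens, function(t) any(t %in% ref_methods), logical(1))

  if (is.null(ref)) ref <- rep("NULL", length(test_conditions))
  if (length(ref) != length(test_conditions)) {
    stop("'ref' needs one entry per test condition (use \"NULL\" as filler)")
  }
  if (any(needs_ref & ref == "NULL")) {
    stop("missing reference for: ",
         paste(test_conditions[needs_ref & ref == "NULL"], collapse = ", "))
  }

  data.frame(condition = test_conditions, ref = ref, needs_ref = needs_ref,
             stringsAsFactors = FALSE)
}

conds <- c("LogNorm/AutoNorm", "GroupPQN/LogNorm/AutoNorm", "CompNorm/LogNorm")
refs  <- c("NULL", "Control", "InternalStandard")  # made-up reference names
cond_table <- check_conditions(conds, refs)
print(cond_table)
```

Vectors that pass this check can then be supplied as test_conditions and ref; whether the function itself enforces the same rules is not stated on this page.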
Author
Nicolas T. Wirth (mail.nicowirth@gmail.com), Technical University of Denmark. License: GNU GPL (>= 2)