Test for optimal data processing conditions — met.test

met.test_normalization runs a metabolomics analysis workflow with several normalization/transformation/scaling combinations in parallel. The workflow includes univariate analysis (ANOVA) and multivariate analyses (PCA and PLS-DA) and creates a report with performance indicators that help in choosing optimal data preprocessing conditions.

Usage

met.test_normalization(
  mSetObj,
  test_conditions = NULL,
  ref = NULL,
  class_order = FALSE,
  alpha = 0.05,
  lfc = 2,
  posthoc_method = "tukey",
  nonpar = FALSE,
  pls.cv.method = "LOOCV",
  pls.cv.k = 10,
  dpi = 300,
  pls.data = "all",
  permut.num = 500,
  vip.thresh = 1,
  report = FALSE,
  report.dir = NULL,
  export = FALSE,
  export.format = "pdf",
  export.dir = "met.Test_Normalization"
)

Arguments

mSetObj

Enter the name of the created mSetObj (see met.read_data)

test_conditions

(Character vector) Enter the combinations of rowNorm, transNorm, and scaleNorm to be tested in the workflow, with the steps of each combination separated by "/". See met.normalize for suitable options. If NULL (the default), the workflow automatically uses c("NULL", "MeanCenter", "LogNorm", "LogNorm/AutoNorm", "CrNorm", "CrNorm/AutoNorm", "AutoNorm", "RangeNorm", "ParetoNorm") as default conditions. Note: the rowNorm option "SpecNorm" is not supported by this workflow.
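To illustrate the expected format, the following base R sketch splits a few condition strings on "/". It only shows how the strings are structured; how met.test_normalization assigns each step to rowNorm, transNorm, or scaleNorm internally is not covered here.

```r
# Example condition strings taken from the default set described above
test_conditions <- c("NULL", "LogNorm", "LogNorm/AutoNorm", "CrNorm/AutoNorm")

# Split each condition into its individual normalization steps
steps <- strsplit(test_conditions, "/", fixed = TRUE)
names(steps) <- test_conditions

# Number of steps per condition
n_steps <- lengths(steps)
print(n_steps)
```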

ref

(Character vector) Enter the name(s) of the reference sample(s) or the reference feature(s) for rowNorm = "GroupPQN", "SamplePQN", or "CompNorm" in their respective order in test_conditions. Add "NULL" if the ref argument is not applicable to the respective test_condition.
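The pairing of ref with test_conditions can be sketched as follows. This assumes one ref entry per condition, in the same order; the sample name "QC" and the feature name "Internal_Standard" are made up for illustration.

```r
# Hypothetical conditions; "QC" and "Internal_Standard" are made-up names
test_conditions <- c("GroupPQN/LogNorm", "LogNorm", "CompNorm/AutoNorm")
ref <- c("QC", "NULL", "Internal_Standard")

# Assumption: one ref entry per condition, in the same order
stopifnot(length(ref) == length(test_conditions))

# Conditions whose row normalization requires a reference
needs_ref <- grepl("GroupPQN|SamplePQN|CompNorm", test_conditions)

# Side-by-side view of each condition and its reference
pairing <- data.frame(condition = test_conditions, ref = ref,
                      needs_ref = needs_ref)
print(pairing)
```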

class_order

(Logical, TRUE or FALSE) Whether the order of the classes matters (e.g., time points or disease severity).

alpha

(Numeric) Enter significance threshold for adjusted p values (false discovery rate - FDR; for ANOVA with post-hoc analyses).

lfc

(Numeric) Enter relevance threshold for log2 fold changes in pair-wise comparisons (for ANOVA with post-hoc analyses).
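As a rough illustration of how alpha and lfc act together, the sketch below flags features in a small made-up results table. The exact comparison operators used by the workflow are an assumption here (adjusted p value below alpha and absolute log2 fold change above lfc).

```r
alpha <- 0.05
lfc <- 2

# Made-up results for four features (illustration only)
res <- data.frame(
  feature = c("F1", "F2", "F3", "F4"),
  p_adj   = c(0.001, 0.20, 0.03, 0.04),
  log2fc  = c(2.5, 3.0, -2.2, 0.8)
)

# Assumption: a feature passes if adjusted p < alpha AND |log2FC| > lfc
res$significant <- res$p_adj < alpha & abs(res$log2fc) > lfc
print(res[res$significant, ])
```

With these values, a feature with a large fold change but a high adjusted p value (F2) and a feature with a low p value but a small fold change (F4) are both excluded.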

posthoc_method

(Character) Enter the name of the ANOVA post-hoc test, "fisher" or "tukey".

nonpar

(Logical) Use a non-parametric ANOVA test (TRUE) or not (FALSE).

pls.cv.method

(Character) Enter one of two methods for PLS-DA model (cross) validation:

"LOOCV" performs leave-one-out cross validation
"CV" performs k-fold cross validation.

pls.cv.k

(Numeric) The number of (randomized) groups that the dataset is split into during cross validation if pls.cv.method = "CV". This value must be equal to or smaller than the number of samples.
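The idea behind k-fold splitting can be sketched in base R as shown below. This is a generic illustration, not the splitting routine used by the workflow; the sample count is hypothetical.

```r
set.seed(1)        # for a reproducible split
n_samples <- 24    # hypothetical number of samples
k <- 10            # corresponds to pls.cv.k

# k must not exceed the number of samples
stopifnot(k <= n_samples)

# Randomly assign each sample to one of k folds of near-equal size
folds <- sample(rep(seq_len(k), length.out = n_samples))
print(table(folds))
```

Each fold is held out once while the model is fitted on the remaining folds; with k equal to the number of samples, this reduces to leave-one-out cross validation.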

dpi

(Numeric) Resolution of PNG images (default is 300 dpi).

pls.data

(Character) Enter "all" to train the PLS(-DA) model on your whole (filtered and normalized) dataset or "anova" to use a subset of features defined as significant based on ANOVA analysis.

permut.num

(Numeric) Number of permutations in PLS-DA permutation tests.
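The principle of a label permutation test can be sketched as follows. This is a simplified stand-in, not the PLS-DA procedure itself: the data are simulated, and the test statistic is a plain difference in class means rather than a PLS-DA performance measure.

```r
set.seed(42)
permut.num <- 500

# Simulated data: one feature, two classes of 10 samples each
classes <- rep(c("A", "B"), each = 10)
x <- c(rnorm(10, mean = 0), rnorm(10, mean = 1.5))

# Test statistic: absolute difference between class means
stat <- function(values, labels) {
  abs(mean(values[labels == "A"]) - mean(values[labels == "B"]))
}

observed <- stat(x, classes)

# Recompute the statistic after shuffling the class labels
perm_stats <- replicate(permut.num, stat(x, sample(classes)))

# Empirical p value (+1 correction so it is never exactly 0)
p_value <- (sum(perm_stats >= observed) + 1) / (permut.num + 1)
print(p_value)
```

A larger permut.num gives a finer-grained empirical p value at the cost of longer run time; the smallest attainable value here is 1 / (permut.num + 1).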

vip.thresh

(Numeric) Enter a chosen relevance threshold for PLS-DA VIP scores.

report

(Logical) Generate a report with results of this workflow (TRUE) or not (FALSE).

report.dir

(Character) Enter the name or location of the folder in which the report files are generated if report = TRUE. If NULL (the default), a new folder "Report_date-time" is created in your working directory.

export

(Logical) Export generated plots as PDF or PNG files (TRUE) or not (FALSE).

export.format

(Character, "png" or "pdf") image file format (if export = TRUE).

export.dir

(Character) Enter the name or location of a folder in which all generated files are saved.
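Example

A call might be assembled as in the sketch below. The argument names come from the Usage section; the chosen values are arbitrary, and mSet is a hypothetical object assumed to have been created with met.read_data. The call itself is commented out because it requires the package and a loaded dataset.

```r
# Hypothetical settings; argument names as listed under Usage
args <- list(
  test_conditions = c("NULL", "LogNorm", "LogNorm/AutoNorm", "ParetoNorm"),
  alpha           = 0.05,
  lfc             = 2,
  posthoc_method  = "tukey",
  pls.cv.method   = "CV",
  pls.cv.k        = 10,
  permut.num      = 500,
  report          = TRUE,
  export          = TRUE,
  export.format   = "png"
)

# mSet is assumed to exist (see met.read_data); uncomment to run:
# res <- do.call(met.test_normalization, c(list(mSetObj = mSet), args))
str(args)
```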

Author

Nicolas T. Wirth (mail.nicowirth@gmail.com), Technical University of Denmark

License: GNU GPL (>= 2)