prot.workflow performs variance stabilization normalization (prot.normalize_vsn), missing value imputation (prot.impute), principal component analysis (prot.pca), differential enrichment testing (prot.test_diff), and, optionally, pathway enrichment analysis. Standardized plots and a report can also be generated and exported as separate files (see the plot, export, and report arguments).

Usage

prot.workflow(
  se,
  normalize = TRUE,
  imp_fun = c("SampMin", "man", "bpca", "knn", "QRILC", "MLE", "MinDet", "MinProb",
    "min", "zero", "mixed", "nbavg"),
  q = 0.01,
  knn.rowmax = 0.5,
  type = c("all", "control", "manual"),
  control = NULL,
  contrast = NULL,
  alpha = 0.05,
  alpha_pathways = 0.1,
  lfc = 1,
  heatmap.show_all = TRUE,
  heatmap.kmeans = FALSE,
  k = 6,
  heatmap.col_limit = NA,
  heatmap.show_row_names = TRUE,
  heatmap.row_font_size = 6,
  volcano.add_names = FALSE,
  volcano.label_size = 2.5,
  volcano.adjusted = TRUE,
  plot = FALSE,
  export = FALSE,
  report = TRUE,
  report.dir = NULL,
  pathway_enrichment = FALSE,
  pathway_kegg = FALSE,
  kegg_organism = NULL,
  custom_pathways = NULL,
  out.dir = NULL
)

Arguments

se

SummarizedExperiment object, proteomics data parsed with prot.read_data.

normalize

(Logical) Should the data be normalized via variance stabilization normalization?

imp_fun

(Character string) Function used for data imputation. One of "SampMin", "man", "bpca", "knn", "QRILC", "MLE", "MinDet", "MinProb", "min", "zero", "mixed", or "nbavg". See prot.impute for details.

q

(Numeric) q value for imputing missing values with method imp_fun = 'MinProb'.

knn.rowmax

(Numeric) The maximum fraction of missing data allowed in any row for imp_fun = 'knn'. Default: 0.5 (that is, 50%).
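
To see which proteins a given knn.rowmax value would affect, the fraction of missing values per row can be computed directly. This is a minimal sketch in base R on a made-up intensity matrix (the matrix, protein names, and sample names are hypothetical); how prot.impute treats rows above the threshold is not described on this page.

```r
# Hypothetical log-intensity matrix: 3 proteins (rows) x 4 samples (columns).
m <- matrix(
  c(20.1, NA,   19.8, 20.4,
    NA,   NA,   NA,   18.2,
    22.0, 21.7, NA,   NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3"), paste0("S", 1:4))
)

knn.rowmax <- 0.5

# Fraction of missing values in each row.
frac_missing <- rowMeans(is.na(m))

# Rows whose missing fraction exceeds the threshold
# (a strict ">" comparison is an assumption here).
exceeds <- frac_missing > knn.rowmax

data.frame(frac_missing, exceeds)
```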

type

(Character string) Type of differential analysis to perform. "all" (contrast each condition with every other condition), "control" (contrast each condition to a defined control condition), "manual" (manually define selected conditions).

control

(Character string) The name of the control condition if type = "control".

contrast

(Character string or vector of strings) Define the contrasts to be tested if type = "manual", in the form "ConditionA_vs_ConditionB" or c("ConditionA_vs_ConditionC", "ConditionB_vs_ConditionC").
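
The "ConditionA_vs_ConditionB" format can be assembled from condition names in base R. The sketch below uses hypothetical condition names ("WT", "KO1", "KO2") and only illustrates the string format for each type; the orientation that prot.workflow itself chooses for type = "all" is not specified here.

```r
# Hypothetical condition names; replace with the conditions in your own data.
conditions <- c("WT", "KO1", "KO2")

# Every pairwise combination, as described for type = "all".
pairs <- combn(conditions, 2)
all_contrasts <- paste0(pairs[1, ], "_vs_", pairs[2, ])

# Each remaining condition against one control, as described for type = "control".
control <- "WT"
control_contrasts <- paste0(setdiff(conditions, control), "_vs_", control)

# A hand-picked selection in the format expected by the contrast argument
# when type = "manual".
manual_contrasts <- c("KO1_vs_WT", "KO2_vs_KO1")

all_contrasts
control_contrasts
```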

alpha

(Numeric) Significance threshold for adjusted p values.

alpha_pathways

(Numeric) Significance threshold for adjusted p values in pathway enrichment analysis.

lfc

(Numeric) Relevance threshold for log2(fold change) values. Only proteins with a |log2(fold change)| value above lfc for a given contrast are considered "significant" (if they additionally fulfil the alpha criterion).
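
The combined effect of alpha and lfc can be illustrated on a small table. This is a sketch on made-up values, not the function's internal code: the data frame and its column names (protein, log2fc, p_adj) are hypothetical, and the use of strict inequalities at the boundaries is an assumption.

```r
# Hypothetical results for five proteins in one contrast.
res <- data.frame(
  protein = c("P1", "P2", "P3", "P4", "P5"),
  log2fc  = c(2.1, -1.6, 0.4, -0.2, 1.3),
  p_adj   = c(0.001, 0.030, 0.002, 0.600, 0.200)
)

alpha <- 0.05  # significance threshold for adjusted p values
lfc   <- 1     # relevance threshold for |log2(fold change)|

# A protein counts as significant only if both criteria hold.
res$significant <- res$p_adj < alpha & abs(res$log2fc) > lfc

res
```

In this example P3 has a small adjusted p value but fails the lfc criterion, and P5 has a large fold change but fails the alpha criterion, so neither is flagged.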

heatmap.show_all

(Logical) Shall all samples be displayed in the heat map (TRUE) or only the samples contained in the defined contrast (FALSE)?

heatmap.kmeans

(Logical) Shall the proteins be clustered in the heat map by k-means clustering (TRUE) or not (FALSE)?

k

(Integer) Number of protein clusters in heat map if heatmap.kmeans = TRUE.

heatmap.col_limit

(Integer) Define the outer breaks in the heat map legend. Example: if heatmap.col_limit = 3, the color scale will span from -3 to 3. All values below -3 will have the same color as -3, and all values above 3 will have the same color as 3.
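
The effect of the limit on the color mapping is equivalent to clamping values to the range [-limit, limit]. The base R sketch below shows this on made-up numbers; it illustrates the described behavior and is not taken from the heat map implementation.

```r
col_limit <- 3

# Hypothetical heat map values, some outside the [-3, 3] range.
values <- c(-5.2, -3, -1.4, 0, 2.7, 3.8)

# Values beyond the limits are mapped to the nearest limit,
# so they receive the same color as -3 or 3.
clamped <- pmax(pmin(values, col_limit), -col_limit)

clamped
```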

heatmap.show_row_names

(Logical) Show protein names in heat map (TRUE) or not (FALSE).

heatmap.row_font_size

(Numeric) Font size of protein names in heat maps if heatmap.show_row_names = TRUE.

volcano.add_names

(Logical) Show protein names in volcano plots (TRUE) or not (FALSE).

volcano.label_size

(Numeric) Font size of protein names in volcano plots if volcano.add_names = TRUE.

volcano.adjusted

(Logical) Shall adjusted p values be shown on the y axis of volcano plots (TRUE) or raw p values (FALSE)?

plot

(Logical) Show the generated plots in the Plots pane of RStudio (TRUE) or not (FALSE).

export

(Logical) Export the generated plots as PNG and PDF files (TRUE) or not (FALSE).

report

(Logical) Render and export a report in PDF and HTML format that summarizes the results (TRUE) or not (FALSE).

report.dir

(Character string) Provide the name of or path to a folder into which the report will be saved.

pathway_enrichment

(Logical) Perform pathway over-representation analysis for each tested contrast (TRUE) or not (FALSE).

pathway_kegg

(Logical) Perform pathway over-representation analysis with gene sets in the KEGG database (TRUE) or not (FALSE).

kegg_organism

(Character string) Identifier of the organism in the KEGG database (if pathway_kegg = TRUE).

custom_pathways

(Data frame) Data frame providing custom pathway annotations. The table must contain a "Pathway" column listing the pathways identified in the studied organism, and an "Accession" column listing the proteins (or genes) each pathway is composed of. The Accession entries must match the protein IDs.

out.dir

(Character string) Absolute path to the location to which result TXT files should be exported.

Value

A list containing a SummarizedExperiment object for every computation step of the workflow, a pca object, and lists of up- or downregulated pathways for each tested contrast and method (KEGG and/or custom).