This function performs a complete RNA sequencing workflow, including imputation of missing values, normalization, principal component analysis, differential expression analysis, and pathway analysis. The function also provides several options for plotting, exporting plots, and creating a report.
Usage
rna.workflow(
se,
imp_fun = c("zero", "man", "bpca", "knn", "QRILC", "MLE", "MinDet", "MinProb", "min",
"zero", "mixed", "nbavg", "SampMin"),
q = 0.01,
knn.rowmax = 0.5,
type = c("all", "control", "manual"),
design = "~ condition",
size.factors = NULL,
altHypothesis = c("greaterAbs", "lessAbs", "greater", "less"),
control = NULL,
contrast = NULL,
controlGenes = NULL,
pAdjustMethod = c("IHW", "BH"),
alpha = 0.05,
alpha.independent = 0.1,
alpha_pathways = 0.1,
lfcShrink = TRUE,
shrink.method = c("apeglm", "ashr", "normal"),
lfc = 2,
heatmap.show_all = TRUE,
heatmap.kmeans = F,
k = 6,
heatmap.col_limit = NA,
heatmap.show_row_names = TRUE,
heatmap.row_font_size = 6,
volcano.add_names = FALSE,
volcano.label_size = 2.5,
volcano.adjusted = TRUE,
plot = FALSE,
export = FALSE,
report = TRUE,
report.dir = NULL,
pathway_enrichment = FALSE,
pathway_kegg = FALSE,
kegg_organism = NULL,
custom_pathways = NULL,
quiet = FALSE
)
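As a rough sketch of a typical call, a control-based analysis might look like the following. It assumes that the package providing rna.workflow is attached, that `se` is an existing SummarizedExperiment, and that its conditions include a level named "WT" (a hypothetical label). The guard only keeps the snippet from failing when those objects are absent.

```r
# Sketch only: `se` and the condition label "WT" are placeholders for your own data.
if (exists("rna.workflow") && exists("se")) {
  res <- rna.workflow(
    se,
    imp_fun = "zero",       # replace missing values with zero
    type = "control",       # contrast each condition with the control
    control = "WT",         # hypothetical control condition
    pAdjustMethod = "BH",   # Benjamini-Hochberg adjustment
    alpha = 0.05,
    lfc = 2,
    plot = FALSE,
    export = FALSE,
    report = FALSE          # skip report generation for a quick run
  )
}
```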
Arguments
- se
A SummarizedExperiment object, generated with read_prot().
- imp_fun
(Character string) Function used for data imputation: "SampMin", "man", "bpca", "knn", "QRILC", "MLE", "MinDet", "MinProb", "min", "zero", "mixed", or "nbavg". See rna.impute for details.
- q
(Numeric) q value for imputing missing values with method imp_fun = 'MinProb'.
- knn.rowmax
(Numeric) The maximum percent missing data allowed in any row for imp_fun = 'knn'. Default: 0.5.
- type
(Character string) Type of differential analysis to perform. "all" (contrast each condition with every other condition), "control" (contrast each condition to a defined control condition), "manual" (manually define selected conditions).
- design
Formula for the design matrix.
- size.factors
Optional: Manually define size factors for normalization.
- altHypothesis
Specify those genes you are interested in finding. The test provides p values for the null hypothesis, the complement of the set defined by altHypothesis. For further details, see results.
- control
Control condition; required if type = "control".
- contrast
(String or vector of strings) Defined test(s) for differential analysis in the form "A_vs_B"; required if type = "manual".
- controlGenes
Genes to use for size factor estimation (e.g. housekeeping or spike-in genes).
- pAdjustMethod
Method for adjusting p values. Available options are "IHW" (Independent Hypothesis Weighting) and "BH" (Benjamini-Hochberg).
- alpha
Significance threshold for adjusted p values.
- alpha.independent
Adjusted p value threshold for independent filtering, or NULL. If the adjusted p-value cutoff (FDR) will be a value other than 0.1, alpha.independent should be set to that value.
- alpha_pathways
Significance threshold for pathway analysis.
- lfcShrink
Use shrinkage to calculate log2 fold change values.
- shrink.method
Method for shrinkage. Available options are "apeglm", "ashr", "normal". See lfcShrink for details.
- lfc
Relevance threshold for absolute log2(fold change) values. Used to filter unshrunken lfc values or in shrinkage method "apeglm" or "normal".
- heatmap.show_all
Shall all samples be displayed in the heatmap or only the samples contained in the defined "contrast"? (only applicable for type = "manual")
- heatmap.kmeans
Shall the genes be clustered in the heat map?
- k
Number of gene clusters in the heat map if heatmap.kmeans = TRUE.
- heatmap.col_limit
Define the breaks in the heat map legends.
- heatmap.show_row_names
Show gene names in heat map?
- heatmap.row_font_size
Font size of gene names if heatmap.show_row_names = TRUE.
- volcano.add_names
Display names next to symbols in volcano plot.
- volcano.label_size
Size of labels in volcano plot.
- volcano.adjusted
Display adjusted p-values on y axis of volcano plot?
- plot
Shall plots be returned in the Plots pane?
- export
Shall plots be exported as PDF and PNG files?
- report
Shall a report (HTML and PDF) be created?
- report.dir
Folder name for created report (if report = TRUE).
- pathway_enrichment
Perform pathway over-representation analysis for each tested contrast.
- pathway_kegg
Perform pathway over-representation analysis with gene sets in the KEGG database.
- kegg_organism
Name of the organism in the KEGG database (if pathway_kegg = TRUE).
- custom_pathways
Dataframe providing custom pathway annotations.
- quiet
Suppress messages and warnings.
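To illustrate the "A_vs_B" contrast format used by the type, control, and contrast arguments, the following base R sketch builds such strings for hypothetical condition labels. How the function orders A and B internally is not stated here, so the ordering below is an assumption.

```r
# Hypothetical condition labels; replace with the conditions in your own data.
conditions <- c("WT", "KO1", "KO2")

# type = "all": every pairwise comparison between conditions.
pairs <- combn(conditions, 2)
contrasts_all <- paste(pairs[1, ], pairs[2, ], sep = "_vs_")

# type = "control": each non-control condition against the control.
control <- "WT"
contrasts_control <- paste(setdiff(conditions, control), control, sep = "_vs_")

# type = "manual": the strings are supplied directly via `contrast`.
contrasts_manual <- c("KO2_vs_KO1")

print(contrasts_all)
print(contrasts_control)
```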
Value
The function returns a SummarizedExperiment object with added columns for log2 fold change, p-values and adjusted p-values for each comparison. It also includes a column for significant genes for each comparison and a column for significant genes overall. Additionally, the function generates various plots and a report (if specified).
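As a simplified illustration of how a per-comparison significance column can follow from the alpha and lfc thresholds, the sketch below flags rows of a toy table. The column names and the exact comparison operators are assumptions and may differ from what rna.workflow uses internally.

```r
# Toy results table; column names are hypothetical.
res_tab <- data.frame(
  gene   = c("g1", "g2", "g3", "g4"),
  log2FC = c(2.8, -3.1, 0.4, 2.5),
  padj   = c(0.001, 0.030, 0.002, 0.200)
)

alpha <- 0.05  # threshold on adjusted p values
lfc   <- 2     # threshold on absolute log2 fold change

# A gene is flagged when it passes both thresholds; NA p values count as not significant.
res_tab$significant <- !is.na(res_tab$padj) &
  res_tab$padj < alpha &
  abs(res_tab$log2FC) >= lfc

print(res_tab)
```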