Package 'MKomics' reference manual

Title:	Omics Data Analysis
Description:	Similarity plots based on correlation and median absolute deviation (MAD); adjusting colors for heatmaps; aggregate technical replicates; calculate pairwise fold-changes and log fold-changes; compute one- and two-way ANOVA; simplified interface to package 'limma' (Ritchie et al. (2015), <doi:10.1093/nar/gkv007> ) for moderated t-test and one-way ANOVA; Hamming and Levenshtein (edit) distance of strings as well as optimal alignment scores for global (Needleman-Wunsch) and local (Smith-Waterman) alignments with constant gap penalties (Merkl and Waack (2009), ISBN:978-3-527-32594-8).
Authors:	Matthias Kohl [aut, cre] (0000-0001-9514-8910)
Maintainer:	Matthias Kohl <[email protected]>
License:	LGPL-3
Version:	0.8
Built:	2025-02-07 05:28:22 UTC
Source:	https://github.com/stamats/mkomics

Omics Data Analysis.

Description

Similarity plots based on correlation and median absolute deviation (MAD); adjusting colors for heatmaps; aggregate technical replicates; calculate pairwise fold-changes and log fold-changes; computate one- and two-way ANOVA; simplified interface to package 'limma' (Ritchie et al. (2015), <doi:10.1093/nar/gkv007>) for moderated t-test and one-way ANOVA; Hamming and Levenshtein (edit) distance of strings as well as optimal alignment scores for global (Needleman-Wunsch) and local (Smith-Waterman) alignments with constant gap penalties (Merkl and Waack (2009), ISBN:978-3-527-32594-8).

Details

Package:	MKomics
Type:	Package
Version:	0.8
Date:	2022-09-06
Depends:	R(>= 3.5.0)
Imports:	stats, utils, graphics, grDevices, RColorBrewer, robustbase, limma, grid, circlize, ComplexHeatmap
Suggests:	knitr, rmarkdown
License:	LGPL-3
URL:	https://www.stamats.de/

library(MKomics)

Author(s)

Matthias Kohl https://www.stamats.de

Maintainer: Matthias Kohl [email protected]

Correlation Distance Matrix Computation

Description

The function computes and returns the correlation and absolute correlation distance matrix computed by using the specified distance measure to compute the distances between the rows of a data matrix.

Usage

corDist(x, method = "pearson", diag = FALSE, upper = FALSE, abs = FALSE,
        use = "pairwise.complete.obs", ...)
corDist(x, method = "pearson", diag = FALSE, upper = FALSE, abs = FALSE,
        use = "pairwise.complete.obs", ...)

Arguments

`x`	a numeric matrix or data frame
`method`	the correlation distance measure to be used. This must be one of `"pearson"`, `"spearman"`, `"kandall"`, `"cosine"`, `"mcd"` or `"ogk"`, respectively. Any unambiguous substring can be given.
`diag`	logical value indicating whether the diagonal of the distance matrix should be printed by 'print.dist'.
`upper`	logical value indicating whether the upper triangle of the distance matrix should be printed by 'print.dist'.
`abs`	logical, compute absolute correlation distances
`use`	character, correponds to argument `use` of function `cor`
`...`	further arguments to functions `covMcd` or `covOGK`, respectively.

Details

The function computes the Pearson, Spearman, Kendall or Cosine sample correlation and absolute correlation; confer Section 12.2.2 of Gentleman et al (2005). For more details about the arguments we refer to functions dist and cor. Moreover, the function computes the minimum covariance determinant or the orthogonalized Gnanadesikan-Kettenring estimator. For more details we refer to functions covMcd and covOGK, respectively.

Value

corDist returns an object of class "dist"; cf. dist.

Note

A first version of this function appeared in package SLmisc.

Author(s)

Matthias Kohl [email protected]

References

Gentleman R. Ding B., Dudoit S. and Ibrahim J. (2005). Distance Measures in DNA Microarray Data Analysis. In: Gentleman R., Carey V.J., Huber W., Irizarry R.A. and Dudoit S. (editors) Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer.

P. J. Rousseeuw and A. M. Leroy (1987). Robust Regression and Outlier Detection. Wiley.

P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212-223.

Pison, G., Van Aelst, S., and Willems, G. (2002), Small Sample Corrections for LTS and MCD, Metrika, 55, 111-123.

Maronna, R.A. and Zamar, R.H. (2002). Robust estimates of location and dispersion of high-dimensional datasets; Technometrics 44(4), 307-317.

Gnanadesikan, R. and John R. Kettenring (1972). Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28, 81-124.

Examples

## only a dummy example
M <- matrix(rnorm(1000), ncol = 20)
D <- corDist(M)
## only a dummy example
M <- matrix(rnorm(1000), ncol = 20)
D <- corDist(M)

Plot of similarity matrix based on correlation

Description

Plot of similarity matrix. This function is a slight modification of function plot.cor of the archived package "sma".

Usage

corPlot(x, new = FALSE, col, minCor, 
        labels = FALSE, lab.both.axes = FALSE, labcols = "black", 
        title = "", cex.title = 1.2, 
        protocol = FALSE, cex.axis = 0.8, 
        cex.axis.bar = 1, signifBar = 2, ...)
        
corPlot2(x, new = FALSE, col, minCor = 0.5, labels = FALSE, 
         row.width = 6, column.height = 6, lab.both.axes = TRUE, 
         fontsize.axis = 12, title = "", fontsize.title = 16, 
         signifBar = 2, use_raster = FALSE)
corPlot(x, new = FALSE, col, minCor, 
        labels = FALSE, lab.both.axes = FALSE, labcols = "black", 
        title = "", cex.title = 1.2, 
        protocol = FALSE, cex.axis = 0.8, 
        cex.axis.bar = 1, signifBar = 2, ...)
        
corPlot2(x, new = FALSE, col, minCor = 0.5, labels = FALSE, 
         row.width = 6, column.height = 6, lab.both.axes = TRUE, 
         fontsize.axis = 12, title = "", fontsize.title = 16, 
         signifBar = 2, use_raster = FALSE)

Arguments

`x`	data or correlation matrix, respectively
`new`	If `new=FALSE`, `x` must already be a correlation matrix. If `new=TRUE`, the correlation matrix for the columns of `x` is computed and displayed in the image.
`col`	colors palette for image. If missing, the `RdYlGn` palette of `RColorBrewer` is used.
`minCor`	numeric value in [-1,1], used to adjust `col`
`labels`	vector of character strings to be placed at the tickpoints, labels for the columns of `x`.
`lab.both.axes`	logical, display labels on both axes
`labcols`	colors to be used for the labels of the columns of `x`. `labcols` can have either length 1, in which case all the labels are displayed using the same color, or the same length as `labels`, in which case a color is specified for the label of each column of `x`.
`title`	character string, overall title for the plot.
`cex.title`	numerical value giving the amount by which plotting text and symbols should be magnified relative to the default; cf. `par`, `cex.main`.
`fontsize.title`	numerical value giving the fontsize of the title.
`protocol`	logical, display color bar without numbers.
`cex.axis`	The magnification to be used for axis annotation relative to the current setting of 'cex'; cf. `par`.
`fontsize.axis`	numerical value giving the fontsize of the axis labels.
`cex.axis.bar`	The magnification to be used for axis annotation of the color bar relative to the current setting of 'cex'; cf. `par`.
`signifBar`	integer indicating the precision to be used for the bar.
`row.width`	numerical value giving width of the row in centimeters; i.e., can be used to change space available for the labels.
`column.height`	numerical value giving the height of the column in centimeters; i.e., can be used to change space available for the labels.
`use_raster`	see `Heatmap`.
`...`	graphical parameters may also be supplied as arguments to the function (see `par`). For comparison purposes, it is good to set `zlim=c(-1,1)`.

Details

This functions generates the so called similarity matrix (based on correlation) for a microarray experiment.

If min(x), respectively min(cor(x)) is smaller than minCor, the colors in col are adjusted such that the minimum correlation value which is color coded is equal to minCor.

The newer function corPlot2 is based on function Heatmap of Bioconductor package ComplexHeatmap.

Value

invisible()

Note

A first version of function corPlot appeared in package SLmisc.

Author(s)

Matthias Kohl [email protected]

References

Sandrine Dudoit, Yee Hwa (Jean) Yang, Benjamin Milo Bolstad and with contributions from Natalie Thorne, Ingrid Loennstedt and Jessica Mar. sma: Statistical Microarray Analysis.
http://www.stat.berkeley.edu/users/terry/zarray/Software/smacode.html

Examples

## only a dummy example
M <- matrix(rnorm(1000), ncol = 20)
colnames(M) <- paste("Sample", 1:20)
M.cor <- cor(M)

corPlot(M.cor, minCor = min(M.cor))
corPlot(M.cor, minCor = min(M.cor), lab.both.axes = TRUE)
corPlot(M.cor, minCor = min(M.cor), protocol = TRUE)
corPlot(M.cor, minCor = min(M.cor), signifBar = 1)

corPlot2(M.cor, minCor = min(M.cor))
corPlot2(M.cor, minCor = min(M.cor), lab.both.axes = FALSE)
corPlot2(M.cor, minCor = min(M.cor), signifBar = 1)
## only a dummy example
M <- matrix(rnorm(1000), ncol = 20)
colnames(M) <- paste("Sample", 1:20)
M.cor <- cor(M)

corPlot(M.cor, minCor = min(M.cor))
corPlot(M.cor, minCor = min(M.cor), lab.both.axes = TRUE)
corPlot(M.cor, minCor = min(M.cor), protocol = TRUE)
corPlot(M.cor, minCor = min(M.cor), signifBar = 1)

corPlot2(M.cor, minCor = min(M.cor))
corPlot2(M.cor, minCor = min(M.cor), lab.both.axes = FALSE)
corPlot2(M.cor, minCor = min(M.cor), signifBar = 1)

Generate colors for heatmaps

Description

This function modifies a given color vector as used for heatmaps.

Usage

heatmapCol(data, col, lim, na.rm = TRUE)
heatmapCol(data, col, lim, na.rm = TRUE)

Arguments

`data`	matrix or data.frame; data which shall be displayed in a heatmap; ranging from negative to positive numbers.
`col`	vector of colors used for heatmap.
`lim`	constant colors are used for data below `-lim` resp. above `lim`.
`na.rm`	logical; remove `NA` values.

Details

Colors below and above a specified value are kept constant. In addition, the colors are symmetrizised.

Value

vector of colors

Note

A first version of this function appeared in package SLmisc.

Author(s)

Matthias Kohl [email protected]

Examples

data.plot <- matrix(rnorm(100*50, sd = 1), ncol = 50)
colnames(data.plot) <- paste("patient", 1:50)
rownames(data.plot) <- paste("gene", 1:100)
data.plot[1:70, 1:30] <- data.plot[1:70, 1:30] + 3
data.plot[71:100, 31:50] <- data.plot[71:100, 31:50] - 1.4
data.plot[1:70, 31:50] <- rnorm(1400, sd = 1.2)
data.plot[71:100, 1:30] <- rnorm(900, sd = 1.2)
nrcol <- 128

require(RColorBrewer)
myCol <- rev(colorRampPalette(brewer.pal(10, "RdBu"))(nrcol))
heatmap(data.plot, col =  myCol, main = "standard colors")
myCol2 <- heatmapCol(data = data.plot, col = myCol, 
                     lim = min(abs(range(data.plot)))-1)
heatmap(data.plot, col = myCol2, main = "heatmapCol colors")
data.plot <- matrix(rnorm(100*50, sd = 1), ncol = 50)
colnames(data.plot) <- paste("patient", 1:50)
rownames(data.plot) <- paste("gene", 1:100)
data.plot[1:70, 1:30] <- data.plot[1:70, 1:30] + 3
data.plot[71:100, 31:50] <- data.plot[71:100, 31:50] - 1.4
data.plot[1:70, 31:50] <- rnorm(1400, sd = 1.2)
data.plot[71:100, 1:30] <- rnorm(900, sd = 1.2)
nrcol <- 128

require(RColorBrewer)
myCol <- rev(colorRampPalette(brewer.pal(10, "RdBu"))(nrcol))
heatmap(data.plot, col =  myCol, main = "standard colors")
myCol2 <- heatmapCol(data = data.plot, col = myCol, 
                     lim = min(abs(range(data.plot)))-1)
heatmap(data.plot, col = myCol2, main = "heatmapCol colors")

Compute MAD between colums of a matrix or data.frame

Description

Compute MAD between colums of a matrix or data.frame. Can be used to create a similarity matrix for a microarray experiment.

Usage

madMatrix(x)
madMatrix(x)

Arguments

`x`	matrix or data.frame

Details

This functions computes the so called similarity matrix (based on MAD) for a microarray experiment; cf. Buness et. al. (2004).

Value

matrix of MAD values between colums of x

Note

A first version of this function appeared in package SLmisc.

Author(s)

Matthias Kohl [email protected]

References

Andreas Buness, Wolfgang Huber, Klaus Steiner, Holger Sueltmann, and Annemarie Poustka. arrayMagic: two-colour cDNA microarray quality control and preprocessing. Bioinformatics Advance Access published on September 28, 2004. doi:10.1093/bioinformatics/bti052

Examples

## only a dummy example
madMatrix(matrix(rnorm(1000), ncol = 10))
## only a dummy example
madMatrix(matrix(rnorm(1000), ncol = 10))

Plot of similarity matrix based on MAD

Description

Plot of similarity matrix based on MAD between microarrays.

Usage

madPlot(x, new = FALSE, col, maxMAD = 3, labels = FALSE, 
        labcols = "black", title = "", protocol = FALSE, ...)

madPlot2(x, new = FALSE, col, maxMAD = 3, labels = FALSE, 
         row.width = 6, column.height = 6, 
         lab.both.axes = TRUE, fontsize.axis = 12,
         title = "", fontsize.title = 16, signifBar = 2)
madPlot(x, new = FALSE, col, maxMAD = 3, labels = FALSE, 
        labcols = "black", title = "", protocol = FALSE, ...)

madPlot2(x, new = FALSE, col, maxMAD = 3, labels = FALSE, 
         row.width = 6, column.height = 6, 
         lab.both.axes = TRUE, fontsize.axis = 12,
         title = "", fontsize.title = 16, signifBar = 2)

Arguments

`x`	data or correlation matrix, respectively
`new`	If `new=FALSE`, `x` must already be a matrix with MAD values. If `new=TRUE`, the MAD matrix for the columns of `x` is computed and displayed in the image.
`col`	colors palette for image. If missing, the `RdYlGn` palette of `RColorBrewer` is used.
`maxMAD`	maximum MAD value displayed
`labels`	vector of character strings to be placed at the tickpoints, labels for the columns of `x`.
`labcols`	colors to be used for the labels of the columns of `x`. `labcols` can have either length 1, in which case all the labels are displayed using the same color, or the same length as `labels`, in which case a color is specified for the label of each column of `x`.
`title`	character string, overall title for the plot.
`fontsize.title`	numerical value giving the fontsize of the title.
`protocol`	logical, display color bar without numbers
`lab.both.axes`	logical, display labels on both axes
`fontsize.axis`	numerical value giving the fontsize of the axis labels.
`signifBar`	integer indicating the precision to be used for the bar.
`row.width`	numerical value giving width of the row in centimeters; i.e., can be used to change space available for the labels.
`column.height`	numerical value giving the height of the column in centimeters; i.e., can be used to change space available for the labels.
`...`	graphical parameters may also be supplied as arguments to the function (see `par`). For comparison purposes, it is good to set `zlim=c(-1,1)`.

Details

This functions generates the so called similarity matrix (based on MAD) for a microarray experiment; cf. Buness et. al. (2004). The function is similar to corPlot.

Note

A first version of this function appeared in package SLmisc.

Author(s)

Matthias Kohl [email protected]

References

Examples

## only a dummy example
set.seed(13)
x <- matrix(rnorm(1000), ncol = 10)
x[1:20,5] <- x[1:20,5] + 10
madPlot(x, new = TRUE, maxMAD = 2.5)
madPlot2(x, new = TRUE, maxMAD = 2.5)
## in contrast
corPlot2(x, new = TRUE, minCor = -0.5)
## only a dummy example
set.seed(13)
x <- matrix(rnorm(1000), ncol = 10)
x[1:20,5] <- x[1:20,5] + 10
madPlot(x, new = TRUE, maxMAD = 2.5)
madPlot2(x, new = TRUE, maxMAD = 2.5)
## in contrast
corPlot2(x, new = TRUE, minCor = -0.5)

Moderated 1-Way ANOVA

Description

Performs moderated 1-Way ANOVAs based on Bioconductor package limma.

Usage

mod.oneway.test(x, group, repeated = FALSE, subject, adjust.method = "BH", 
                sort.by = "none")
mod.oneway.test(x, group, repeated = FALSE, subject, adjust.method = "BH", 
                sort.by = "none")

Arguments

`x`	a (non-empty) numeric matrix of data values.
`group`	an optional factor representing the groups.
`repeated`	logical indicating whether there are repeated-measures.
`subject`	factor with subject IDs; required if `repeated = TRUE`.
`adjust.method`	see `p.adjust`
`sort.by`	see `toptable`

, where "logFC" corresponds to difference in means.

Details

The function uses Bioconductor package limma to compute moderated 1-way ANOVAs. For more details we refer to ebayes.

Value

A data.frame with the results.

References

Ritchie, M.E., Phipson, B., Wu, D., Hu, Y., Law, C.W., Shi, W., and Smyth, G.K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43(7), e47.

Examples

set.seed(123)
X <- rbind(matrix(rnorm(5*20), nrow = 5, ncol = 20),
           matrix(rnorm(5*20, mean = 1), nrow = 5, ncol = 20))
gr <- factor(c(rep("A1", 5), rep("B2", 5), rep("C3", 5), rep("D4", 5)))
mod.oneway.test(X, gr)

## Welch 1-Way ANOVA (not moderated)
ow.test <- function(x, g){
  res <- oneway.test(x ~ g)
  c(res$statistic, res$p.value)
}
ow.res <- t(apply(X, 1, ow.test, g = gr))
colnames(ow.res) <- c("F", "p.value")
ow.res

## repeated measures
X <- rbind(matrix(rnorm(6*18), nrow = 6, ncol = 18),
           matrix(rnorm(6*18, mean = 1), nrow = 6, ncol = 18))
gr <- factor(c(rep("T1", 6), rep("T2", 6), rep("T3", 6)))
subjectID <- factor(c(rep(1:6, 3)))
mod.oneway.test(X, gr, repeated = TRUE, subject = subjectID)
set.seed(123)
X <- rbind(matrix(rnorm(5*20), nrow = 5, ncol = 20),
           matrix(rnorm(5*20, mean = 1), nrow = 5, ncol = 20))
gr <- factor(c(rep("A1", 5), rep("B2", 5), rep("C3", 5), rep("D4", 5)))
mod.oneway.test(X, gr)

## Welch 1-Way ANOVA (not moderated)
ow.test <- function(x, g){
  res <- oneway.test(x ~ g)
  c(res$statistic, res$p.value)
}
ow.res <- t(apply(X, 1, ow.test, g = gr))
colnames(ow.res) <- c("F", "p.value")
ow.res

## repeated measures
X <- rbind(matrix(rnorm(6*18), nrow = 6, ncol = 18),
           matrix(rnorm(6*18, mean = 1), nrow = 6, ncol = 18))
gr <- factor(c(rep("T1", 6), rep("T2", 6), rep("T3", 6)))
subjectID <- factor(c(rep(1:6, 3)))
mod.oneway.test(X, gr, repeated = TRUE, subject = subjectID)

Moderated t-Test

Description

Performs moderated t-tests based on Bioconductor package limma.

Usage

mod.t.test(x, group = NULL, paired = FALSE, subject, adjust.method = "BH",
           sort.by = "none", na.rm = TRUE)
mod.t.test(x, group = NULL, paired = FALSE, subject, adjust.method = "BH",
           sort.by = "none", na.rm = TRUE)

Arguments

`x`	a (non-empty) numeric matrix of data values.
`group`	an optional factor representing the groups.
`paired`	a logical indicating whether you want a paired test.
`subject`	factor with subject IDs; required if `paired = TRUE`.
`adjust.method`	see `p.adjust`
`sort.by`	see `toptable`

, where "logFC" corresponds to difference in means.

na.rm

logical. Should missing values (including NaN) be omitted from the calculations of group means?

Details

The function uses Bioconductor package limma to compute moderated t-tests. For more details we refer to ebayes.

Value

A data.frame with the results.

References

Examples

## One-sample test
X <- matrix(rnorm(10*20, mean = 1), nrow = 10, ncol = 20)

mod.t.test(X)
## corresponds to
library(limma)
design <- matrix(1, nrow = ncol(X), ncol = 1)
colnames(design) <- "A"
fit1 <- lmFit(X, design)
fit2 <- eBayes(fit1)
topTable(fit2, coef = 1, number = Inf, confint = TRUE, sort.by = "none")[,-4]

## Two-sample test
set.seed(123)
X <- rbind(matrix(rnorm(5*20), nrow = 5, ncol = 20),
           matrix(rnorm(5*20, mean = 1), nrow = 5, ncol = 20))
g2 <- factor(c(rep("group 1", 10), rep("group 2", 10)))

mod.t.test(X, group = g2)
## corresponds to
design <- model.matrix(~ 0 + g2)
colnames(design) <- c("group1", "group2")
fit1 <- lmFit(X, design)
cont.matrix <- makeContrasts(group1vsgroup2="group1-group2", levels=design)
fit2 <- contrasts.fit(fit1, cont.matrix)
fit3 <- eBayes(fit2)
topTable(fit3, coef = 1, number = Inf, confint = TRUE, sort.by = "none")[,-4]

## Paired two-sample test
subjID <- factor(rep(1:10, 2))
mod.t.test(X, group = g2, paired = TRUE, subject = subjID)
## One-sample test
X <- matrix(rnorm(10*20, mean = 1), nrow = 10, ncol = 20)

mod.t.test(X)
## corresponds to
library(limma)
design <- matrix(1, nrow = ncol(X), ncol = 1)
colnames(design) <- "A"
fit1 <- lmFit(X, design)
fit2 <- eBayes(fit1)
topTable(fit2, coef = 1, number = Inf, confint = TRUE, sort.by = "none")[,-4]

## Two-sample test
set.seed(123)
X <- rbind(matrix(rnorm(5*20), nrow = 5, ncol = 20),
           matrix(rnorm(5*20, mean = 1), nrow = 5, ncol = 20))
g2 <- factor(c(rep("group 1", 10), rep("group 2", 10)))

mod.t.test(X, group = g2)
## corresponds to
design <- model.matrix(~ 0 + g2)
colnames(design) <- c("group1", "group2")
fit1 <- lmFit(X, design)
cont.matrix <- makeContrasts(group1vsgroup2="group1-group2", levels=design)
fit2 <- contrasts.fit(fit1, cont.matrix)
fit3 <- eBayes(fit2)
topTable(fit3, coef = 1, number = Inf, confint = TRUE, sort.by = "none")[,-4]

## Paired two-sample test
subjID <- factor(rep(1:10, 2))
mod.t.test(X, group = g2, paired = TRUE, subject = subjID)

A function for Analysis of Variance

Description

This function is a slight modification of function Anova of package genefilter.

Usage

oneWayAnova(cov, na.rm = TRUE, var.equal = FALSE)oneWayAnova(cov, na.rm = TRUE, var.equal = FALSE)

Arguments

`cov`	The covariate. It must have length equal to the number of columns of the array that the result of `oneWayAnova` will be applied to.
`na.rm`	a logical value indicating whether `NA` values should be stripped before the computation proceeds.
`var.equal`	a logical variable indicating whether to treat the variances in the samples as equal. If `TRUE`, then a simple F test for the equality of means in a one-way analysis of variance is performed. If `FALSE`, an approximate method of Welch (1951) is used, which generalizes the commonly known 2-sample Welch test to the case of arbitrarily many samples.

Details

The function returned by oneWayAnova uses oneway.test to perform a one-way ANOVA, where x is the set of gene expressions. The F statistic for an overall effect is computed and the corresponding p-value is returned.

The function Anova of package genefilter instead compares the computed p-value to a prespecified p-value and returns TRUE, if the computed p-value is smaller than the prespecified one.

Value

oneWayAnova returns a function with bindings for cov that will perform a one-way ANOVA.

The covariate can be continuous, in which case the test is for a linear effect for the covariate.

Note

A first version of this function appeared in package SLmisc.

Author(s)

Matthias Kohl [email protected]

References

R. Gentleman, V. Carey, W. Huber and F. Hahne (2006). genefilter: methods for filtering genes from microarray experiments. R package version 1.13.7.

Examples

set.seed(123)
af <- oneWayAnova(c(rep(1,5),rep(2,5)))
af(rnorm(10))
set.seed(123)
af <- oneWayAnova(c(rep(1,5),rep(2,5)))
af(rnorm(10))

Compute pairwise fold changes

Description

This function computes pairwise fold changes. It also works for logarithmic data.

Usage

pairwise.fc(x, g, ave = mean, log = TRUE, base = 2, mod.fc = TRUE, ...)
pairwise.fc(x, g, ave = mean, log = TRUE, base = 2, mod.fc = TRUE, ...)

Arguments

`x`	numeric vector.
`g`	grouping vector or factor
`ave`	function to compute the group averages.
`log`	logical. Is the data logarithmic?
`base`	If `log = TRUE`, the base which was used to compute the logarithms.
`mod.fc`	logical. Return modified fold changes? (see details)
`...`	optional arguments to `ave`.

Details

The function computes pairwise fold changes between groups, where the group values are aggregated using the function which is given by the argument ave.

The fold changes are returned in a slightly modified form if mod.fc = TRUE. Fold changes FC which are smaller than 1 are reported as to -1/FC.

The implementation is in certain aspects analogously to pairwise.t.test.

Value

Vector with pairwise fold changes.

Author(s)

Matthias Kohl [email protected]

Examples

set.seed(13)
x <- rnorm(100) ## assumed as log2-data
g <- factor(sample(1:4, 100, replace = TRUE))
levels(g) <- c("a", "b", "c", "d")
pairwise.fc(x, g)

## some small checks
res <- by(x, list(g), mean)
2^(res[[1]] - res[[2]]) # a vs. b
-1/2^(res[[1]] - res[[3]]) # a vs. c
2^(res[[1]] - res[[4]]) # a vs. d
-1/2^(res[[2]] - res[[3]]) # b vs. c
-1/2^(res[[2]] - res[[4]]) # b vs. d
2^(res[[3]] - res[[4]]) # c vs. d
set.seed(13)
x <- rnorm(100) ## assumed as log2-data
g <- factor(sample(1:4, 100, replace = TRUE))
levels(g) <- c("a", "b", "c", "d")
pairwise.fc(x, g)

## some small checks
res <- by(x, list(g), mean)
2^(res[[1]] - res[[2]]) # a vs. b
-1/2^(res[[1]] - res[[3]]) # a vs. c
2^(res[[1]] - res[[4]]) # a vs. d
-1/2^(res[[2]] - res[[3]]) # b vs. c
-1/2^(res[[2]] - res[[4]]) # b vs. d
2^(res[[3]] - res[[4]]) # c vs. d

Compute pairwise log-fold changes

Description

The function computes pairwise log-fold changes.

Usage

pairwise.logfc(x, g, ave = mean, log = TRUE, base = 2, ...)
pairwise.logfc(x, g, ave = mean, log = TRUE, base = 2, ...)

Arguments

`x`	numeric vector.
`g`	grouping vector or factor
`ave`	function to compute the group averages.
`log`	logical. Is the data logarithmic?
`base`	If `log = TRUE`, the base which was used to compute the logarithms.
`...`	optional arguments to `ave`.

Details

The function computes pairwise log-fold changes between groups, where the group values are aggregated using the function which is given by the argument ave.

The implementation is in certain aspects analogously to pairwise.t.test.

Value

Vector with pairwise log-fold changes.

Author(s)

Matthias Kohl [email protected]

Examples

set.seed(13)
x <- rnorm(100) ## assumed as log2-data
g <- factor(sample(1:4, 100, replace = TRUE))
levels(g) <- c("a", "b", "c", "d")
pairwise.logfc(x, g)

## some small checks
res <- by(x, list(g), mean)
res[[1]] - res[[2]] # a vs. b
res[[1]] - res[[3]] # a vs. c
res[[1]] - res[[4]] # a vs. d
res[[2]] - res[[3]] # b vs. c
res[[2]] - res[[4]] # b vs. d
res[[3]] - res[[4]] # c vs. d
set.seed(13)
x <- rnorm(100) ## assumed as log2-data
g <- factor(sample(1:4, 100, replace = TRUE))
levels(g) <- c("a", "b", "c", "d")
pairwise.logfc(x, g)

## some small checks
res <- by(x, list(g), mean)
res[[1]] - res[[2]] # a vs. b
res[[1]] - res[[3]] # a vs. c
res[[1]] - res[[4]] # a vs. d
res[[2]] - res[[3]] # b vs. c
res[[2]] - res[[4]] # b vs. d
res[[3]] - res[[4]] # c vs. d

Pairwise Moderated t-Tests

Description

Performs pairwise moderated t-tests (unpaired) based on Bioconductor package limma.

Usage

pairwise.mod.t.test(x, group, adjust.method = "BH", sort.by = "none")
pairwise.mod.t.test(x, group, adjust.method = "BH", sort.by = "none")

Arguments

`x`	a (non-empty) numeric matrix of data values.
`group`	an optional factor representing the groups.
`adjust.method`	see `p.adjust`
`sort.by`	see `toptable`

, where "logFC" corresponds to difference in means.

Details

The function uses Bioconductor package limma to compute pairwise moderated t-tests. For more details we refer to ebayes.

Value

A data.frame with the results.

References

Examples

set.seed(123)
X <- rbind(matrix(rnorm(5*20), nrow = 5, ncol = 20),
           matrix(rnorm(5*20, mean = 1), nrow = 5, ncol = 20))
gr <- factor(c(rep("A1", 5), rep("B2", 5), rep("C3", 5), rep("D4", 5)))
mod.oneway.test(X, gr)
pairwise.mod.t.test(X, gr)
set.seed(123)
X <- rbind(matrix(rnorm(5*20), nrow = 5, ncol = 20),
           matrix(rnorm(5*20, mean = 1), nrow = 5, ncol = 20))
gr <- factor(c(rep("A1", 5), rep("B2", 5), rep("C3", 5), rep("D4", 5)))
mod.oneway.test(X, gr)
pairwise.mod.t.test(X, gr)

Compute mean of replicated spots

Description

Compute mean of replicated spots where additionally spot flags may incorporated.

Usage

repMeans(x, flags, use.flags = NULL, ndups, spacing, method, ...)
repMeans(x, flags, use.flags = NULL, ndups, spacing, method, ...)

Arguments

`x`	matrix or data.frame of expression values
`flags`	matrix or data.frame of spot flags; must have same dimension as `x`
`use.flags`	should flags be included and in which way; cf. section details
`ndups`	integer, number of replicates on chip. The number of rows of `x` must be divisible by `ndups`
`spacing`	the spacing between the rows of 'x' corresponding to replicated spots, `spacing = 1` for consecutive spots; cf. function `unwrapdups` in package `"limma"`
`method`	function to aggregate the replicated spots. If missing, the mean is used.
`...`	optional arguments to `method`.

Details

The incorporation of spot flags is controlled via argument use.flags.

NULL: flags are not used; minimum flag value of replicated spots is returned

"max": only spots with flag value equal to the maximum flag value of replicated spots are used

"median": only spots with flag values larger or equal to median of replicated spots are used

"mean": only spots with flag values larger or equal to mean of replicated spots are used

Value

LIST with components

`exprs`	mean of expression values
`flags`	flags for mean expression values

Note

A first version of this function appeared in package SLmisc.

Author(s)

Matthias Kohl [email protected]

Examples

## only a dummy example
M <- matrix(rnorm(1000), ncol = 10)
FL <- matrix(rpois(1000, lambda = 10), ncol = 10) # only for this example
res <- repMeans(x = M, flags = FL, use.flags = "max", ndups = 5, spacing = 20)
## only a dummy example
M <- matrix(rnorm(1000), ncol = 10)
FL <- matrix(rpois(1000, lambda = 10), ncol = 10) # only for this example
res <- repMeans(x = M, flags = FL, use.flags = "max", ndups = 5, spacing = 20)

Plot of a similarity matrix.

Description

Plot of similarity matrix.

Usage

simPlot(x, col, minVal, labels = FALSE, lab.both.axes = FALSE, 
        labcols = "black", title = "", cex.title = 1.2, 
        protocol = FALSE, cex.axis = 0.8, 
        cex.axis.bar = 1, signifBar = 2, ...)
        
simPlot2(x, col, minVal, labels = FALSE, 
         row.width = 6, column.height = 6, lab.both.axes = TRUE, 
         fontsize.axis = 12, title = "", fontsize.title = 16, 
         signifBar = 2)
simPlot(x, col, minVal, labels = FALSE, lab.both.axes = FALSE, 
        labcols = "black", title = "", cex.title = 1.2, 
        protocol = FALSE, cex.axis = 0.8, 
        cex.axis.bar = 1, signifBar = 2, ...)
        
simPlot2(x, col, minVal, labels = FALSE, 
         row.width = 6, column.height = 6, lab.both.axes = TRUE, 
         fontsize.axis = 12, title = "", fontsize.title = 16, 
         signifBar = 2)

Arguments

`x`	quadratic data matrix.
`col`	colors palette for image. If missing, the `RdYlGn` palette of `RColorBrewer` is used.
`minVal`	numeric, minimum value which is display by a color; used to adjust `col`
`labels`	vector of character strings to be placed at the tickpoints, labels for the columns of `x`.
`lab.both.axes`	logical, display labels on both axes
`labcols`	colors to be used for the labels of the columns of `x`. `labcols` can have either length 1, in which case all the labels are displayed using the same color, or the same length as `labels`, in which case a color is specified for the label of each column of `x`.
`title`	character string, overall title for the plot.
`cex.title`	A numerical value giving the amount by which plotting text and symbols should be magnified relative to the default; cf. `par`, `cex.main`.
`fontsize.title`	numerical value giving the fontsize of the title.
`protocol`	logical, display color bar without numbers
`cex.axis`	The magnification to be used for axis annotation relative to the current setting of 'cex'; cf. `par`.
`fontsize.axis`	numerical value giving the fontsize of the axis labels.
`cex.axis.bar`	The magnification to be used for axis annotation of the color bar relative to the current setting of 'cex'; cf. `par`.
`signifBar`	integer indicating the precision to be used for the bar.
`row.width`	numerical value giving width of the row in centimeters; i.e., can be used to change space available for the labels.
`column.height`	numerical value giving the height of the column in centimeters; i.e., can be used to change space available for the labels.
`...`	graphical parameters may also be supplied as arguments to the function (see `par`). For comparison purposes, it is good to set `zlim=c(-1,1)`.

Details

This functions generates a so called similarity matrix.

If min(x) is smaller than minVal, the colors in col are adjusted such that the minimum value which is color coded is equal to minVal.

Value

invisible()

Note

The function is a slight modification of function corPlot.

Author(s)

Matthias Kohl [email protected]

References

Examples

## only a dummy example
M <- matrix(rnorm(1000), ncol = 20)
colnames(M) <- paste("Sample", 1:20)
M.cor <- cor(M)

simPlot(M.cor, minVal = min(M.cor))
simPlot(M.cor, minVal = min(M.cor), lab.both.axes = TRUE)
simPlot(M.cor, minVal = min(M.cor), protocol = TRUE)
simPlot(M.cor, minVal = min(M.cor), signifBar = 1)

simPlot2(M.cor, minVal = min(M.cor))
simPlot2(M.cor, minVal = min(M.cor), lab.both.axes = FALSE)
simPlot2(M.cor, minVal = min(M.cor), signifBar = 1)
## only a dummy example
M <- matrix(rnorm(1000), ncol = 20)
colnames(M) <- paste("Sample", 1:20)
M.cor <- cor(M)

simPlot(M.cor, minVal = min(M.cor))
simPlot(M.cor, minVal = min(M.cor), lab.both.axes = TRUE)
simPlot(M.cor, minVal = min(M.cor), protocol = TRUE)
simPlot(M.cor, minVal = min(M.cor), signifBar = 1)

simPlot2(M.cor, minVal = min(M.cor))
simPlot2(M.cor, minVal = min(M.cor), lab.both.axes = FALSE)
simPlot2(M.cor, minVal = min(M.cor), signifBar = 1)

Function to compute distances between strings

Description

The function can be used to compute distances between strings.

Usage

stringDist(x, y, method = "levenshtein", mismatch = 1, gap = 1)
stringDist(x, y, method = "levenshtein", mismatch = 1, gap = 1)

Arguments

`x`	character vector, first string
`y`	character vector, second string
`method`	character, name of the distance method. This must be `"levenshtein"` or `"hamming"`. Default is the classical Levenshtein distance.
`mismatch`	numeric, distance value for a mismatch between symbols
`gap`	numeric, distance value for inserting a gap

Details

The function computes the Hamming and the Levenshtein (edit) distance of two given strings (sequences).

In case of the Hamming distance the two strings must have the same length.

In case of the Levenshtein (edit) distance a scoring and a trace-back matrix are computed and are saved as attributes "ScoringMatrix" and "TraceBackMatrix". The characters in the trace-back matrix reflect insertion of a gap in string y (d: deletion), match (m), mismatch (mm), and insertion of a gap in string x (i).

Value

stringDist returns an object of S3 class "stringDist" inherited from class "dist"; cf. dist.

Note

The function is mainly for teaching purposes.

For distances between strings and string alignments see also Bioconductor package Biostrings.

Author(s)

Matthias Kohl [email protected]

References

R. Merkl and S. Waack (2009). Bioinformatik Interaktiv. Wiley.

Examples

x <- "GACGGATTATG"
y <- "GATCGGAATAG"
## Levenshtein distance
d <- stringDist(x, y)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")

## Hamming distance
stringDist(x, y)
x <- "GACGGATTATG"
y <- "GATCGGAATAG"
## Levenshtein distance
d <- stringDist(x, y)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")

## Hamming distance
stringDist(x, y)

Function to compute similarity scores between strings

Description

The function can be used to compute similarity scores between strings.

Usage

stringSim(x, y, global = TRUE, match = 1, mismatch = -1, gap = -1, minSim = 0)
stringSim(x, y, global = TRUE, match = 1, mismatch = -1, gap = -1, minSim = 0)

Arguments

`x`	character vector, first string
`y`	character vector, second string
`global`	logical; global or local alignment
`match`	numeric, score for a match between symbols
`mismatch`	numeric, score for a mismatch between symbols
`gap`	numeric, penalty for inserting a gap
`minSim`	numeric, used as required minimum score in case of local alignments

Details

The function computes optimal alignment scores for global (Needleman-Wunsch) and local (Smith-Waterman) alignments with constant gap penalties.

Scoring and trace-back matrix are computed and saved in form of attributes "ScoringMatrix" and "TraceBackMatrix". The characters in the trace-back matrix reflect insertion of a gap in string y (d: deletion), match (m), mismatch (mm), and insertion of a gap in string x (i). In addition stop indicates that the minimum similarity score has been reached.

Value

stringSim returns an object of S3 class "stringSim" inherited from class "dist"; cf. dist.

Note

The function is mainly for teaching purposes.

For distances between strings and string alignments see also Bioconductor package Biostrings.

Author(s)

Matthias Kohl [email protected]

References

R. Merkl and S. Waack (2009). Bioinformatik Interaktiv. Wiley.

Examples

x <- "GACGGATTATG"
y <- "GATCGGAATAG"

## optimal global alignment score
d <- stringSim(x, y)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")

## optimal local alignment score
d <- stringSim(x, y, global = FALSE)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")
x <- "GACGGATTATG"
y <- "GATCGGAATAG"

## optimal global alignment score
d <- stringSim(x, y)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")

## optimal local alignment score
d <- stringSim(x, y, global = FALSE)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")

Function to trace back

Description

Function computes an optimal global or local alignment based on a trace back matrix as provided by function stringDist or stringSim.

Usage

traceBack(D, global = TRUE)
traceBack(D, global = TRUE)

Arguments

`D`	object of class `"stringDist"`
`global`	logical, global or local alignment

Details

Computes one possible optimal global or local alignment based on the trace back matrix saved in an object of class "stringDist" or "stringSim".

Value

matrix: pairwise global/local alignment

Note

The function is mainly for teaching purposes.

For distances between strings and string alignments see Bioconductor package Biostrings.

Author(s)

Matthias Kohl [email protected]

References

R. Merkl and S. Waack (2009). Bioinformatik Interaktiv. Wiley.

Examples

x <- "GACGGATTATG"
y <- "GATCGGAATAG"

## Levenshtein distance
d <- stringDist(x, y)
## optimal global alignment
traceBack(d)

## Optimal global alignment score
d <- stringSim(x, y)
## optimal global alignment
traceBack(d)

## Optimal local alignment score
d <- stringSim(x, y, global = FALSE)
## optimal local alignment
traceBack(d, global = FALSE)
x <- "GACGGATTATG"
y <- "GATCGGAATAG"

## Levenshtein distance
d <- stringDist(x, y)
## optimal global alignment
traceBack(d)

## Optimal global alignment score
d <- stringSim(x, y)
## optimal global alignment
traceBack(d)

## Optimal local alignment score
d <- stringSim(x, y, global = FALSE)
## optimal local alignment
traceBack(d, global = FALSE)

A function for Analysis of Variance

Description

This function is a slight modification of function Anova of package genefilter.

Usage

twoWayAnova(cov1, cov2, interaction, na.rm = TRUE)twoWayAnova(cov1, cov2, interaction, na.rm = TRUE)

Arguments

`cov1`	The first covariate. It must have length equal to the number of columns of the array that the result of `twoWayAnova` will be applied to.
`cov2`	The second covariate. It must have length equal to the number of columns of the array that the result of `twoWayAnova` will be applied to.
`interaction`	logical, should interaction be considered
`na.rm`	a logical value indicating whether 'NA' values should be stripped before the computation proceeds.

Details

The function returned by twoWayAnova uses lm to fit a linear model of the form lm(x ~ cov1*cov2), where x is the set of gene expressions. The F statistics for the main effects and the interaction are computed and the corresponding p-values are returned.

Value

twoWayAnova returns a function with bindings for cov1 and cov2that will perform a two-way ANOVA.

Note

A first version of this function appeared in package SLmisc.

Author(s)

Matthias Kohl [email protected]

References

R. Gentleman, V. Carey, W. Huber and F. Hahne (2006). genefilter: methods for filtering genes from microarray experiments. R package version 1.13.7.

Examples

set.seed(123)
af1 <- twoWayAnova(c(rep(1,6),rep(2,6)), rep(c(rep(1,3), rep(2,3)), 2))
af2 <- twoWayAnova(c(rep(1,6),rep(2,6)), rep(c(rep(1,3), rep(2,3)), 2), 
                   interaction = FALSE)
x <- matrix(rnorm(12*10), nrow = 10)
apply(x, 1, af1)
apply(x, 1, af2)
set.seed(123)
af1 <- twoWayAnova(c(rep(1,6),rep(2,6)), rep(c(rep(1,3), rep(2,3)), 2))
af2 <- twoWayAnova(c(rep(1,6),rep(2,6)), rep(c(rep(1,3), rep(2,3)), 2), 
                   interaction = FALSE)
x <- matrix(rnorm(12*10), nrow = 10)
apply(x, 1, af1)
apply(x, 1, af2)

Package 'MKomics'

Help Index

Omics Data Analysis.

Description

Details

Author(s)

Correlation Distance Matrix Computation

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

Examples

Plot of similarity matrix based on correlation

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

Examples

Generate colors for heatmaps

Description

Usage

Arguments

Details

Value

Note

Author(s)

Examples

Compute MAD between colums of a matrix or data.frame

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

See Also

Examples

Plot of similarity matrix based on MAD

Description

Usage

Arguments

Details

Note

Author(s)

References

See Also

Examples

Moderated 1-Way ANOVA

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Moderated t-Test

Description

Usage

Arguments

Details

Value

References

See Also

Examples

A function for Analysis of Variance

Description

Usage

Arguments

Details

Value