ordinate

distance

function, and plotting is facilitated a lot by

plot_ordination

function. See their pages for further details and examples.

Examples

Load and trim data

Load the necessary packages and data.

library("phyloseq"); library("ggplot2")
data(GlobalPatterns)

We will use the same data-trimming as in the plot_ordination examples. For trying this code yourself, see the trimming code-chunk at the beginning of the example-set.

Ordination methods using the `ordinate` function

(Unconstrained) Correspondence Analysis (CA)

Let's start with an unconstrained correspondence analysis on the abundance table of OTUs in each sample.

GP.ca <- ordinate(GP1, "CCA")
p1 <- plot_ordination(GP1, GP.ca, type="split", color="Phylum", shape="human", label="SampleType")

Before we make the graphic, let's first define a color scheme and background to use in all split plots, using some of the output from the plot_ordination function to define parameters that will be consistent in later split plots:

my.cols <- rainbow(length(levels(p1$data$Phylum)))
names(my.cols) <- levels(p1$data$Phylum)
my.cols["samples"] <- "black"
my.background <- "opts(panel.background = theme_rect(colour=\\"gray76\\", fill=\\"gray76\\"))"

Now make the plot, adding the new theme parameters

p1 + scale_colour_manual(values=my.cols) + eval(parse(text=my.background))

Detrended Correspondence Analysis (DCA)

GP.dca <- ordinate(GP1, "DCA")
p2 <- plot_ordination(GP1, GP.dca, type="split", color="Phylum", shape="human", label="SampleType") 
p2 + scale_colour_manual(values=my.cols) + eval(parse(text=my.background))

Constrained Correspondence Analysis (CCA).

Let's use Constrained (aka "Canonical") Correspondence Analysis to contrain the first axis on the known "SampleType" groups.

GP.cca <- ordinate(GP1 ~ SampleType, "CCA")
p3 <- plot_ordination(GP1, GP.cca, type="split", color="Phylum", shape="human", label="SampleType") 
p3 + scale_colour_manual(values=my.cols) + eval(parse(text=my.background))

Double Principle Coordinate Analysis (DPCoA)

DPCoA in this context uses the patristic-distances defined in the phylogenetic tree as the additional distance matrix (between the OTUs), which means that a phylogenetic tree is required. A tree is included in our GP1 object, and was trimmed in parallel with the other trimming we did above, so it can be used directly.

GP.dpcoa <- ordinate(GP1, "dpcoa")
p4 <- plot_ordination(GP1, GP.dpcoa, type="split", color="Phylum", shape="human", label="SampleType") 
p4 + scale_colour_manual(values=my.cols) + eval(parse(text=my.background))

Redundancy Analysis (RDA)

GP.rda <- ordinate(GP1, "RDA")
p5 <- plot_ordination(GP1, GP.rda, type="split", color="Phylum", shape="human", label="SampleType") 
p5 + scale_colour_manual(values=my.cols) + eval(parse(text=my.background))

Notice that the axis labels for this unconstrained-RDA are "PC1" and "PC2", indicating that these are "principle components". These axis labels would be "RDA1" and so on if we constrained by the SampleType as we did in the CCA example above (for instance). In principle you can constrain the RDA or CCA by any variable in the sampleData table. The choice of variable is important, and you may want to consult other methods and the literature for determining how exactly you want to make that choice.

Non-Metric Multidimensional Scaling (NMDS)

The default distance-method used by ordinate is the unweighted-UniFrac distance (which, of course, also requires that your data include a tree. Try tre(GP1) to see). However, the NMDS option, which uses vegan::metaMDS, will also calculate species coordinates and allow a biplot representation if you specific one of the distances listed in vegan::vegdist. In this case, we want to try and stick with the biplot representation as in the previous plots, so we will use the Bray-Curtis distance ("bray").

GP.nmds <- ordinate(GP1, "NMDS", distance="bray")
p6 <- plot_ordination(GP1, GP.nmds, type="split", color="Phylum", shape="human", label="SampleType") 
p6 + scale_colour_manual(values=my.cols) + eval(parse(text=my.background))

(classical) Multidimensional Scaling (MDS) / Principle Coordinate Analysis (PCoA)

Let's look at an example using weighted-UniFrac and PCoA, like you might often find in recent high-throughput phylogenetic-sequencing articles (including the Global Patterns article). This will not be a bi-plot. We will instead shade by SampleType.

Note that because weighted-UniFrac can take a long time on 4000+ OTUs, even with the "Fast UniFrac" implementation in phyloseq, I will further specify a parallel = TRUE option, and precede my ordinate call with a parallel backend call and 7 cores.

library("doParallel")
registerDoParallel(cores=7)
GP.wufmds <- ordinate(GP1, "PCoA", distance="unifrac", weighted=TRUE, parallel=TRUE)
Note: alternatively, if you want to just get the weighted-UniFrac distance matrix...

```R
GP.wuf.dist <- distance(GP1, weighted=TRUE, parallel=TRUE)

or

GP.wuf.dist <- UniFrac(GP1, weighted=TRUE, parallel=TRUE)

Now make the plot so we can compare...

p7 <- plot_ordination(GP1, GP.wufmds, color="SampleType", shape="human", label="SampleType") 
p7 + geom_point(size=5) + geom_line() + eval(parse(text=my.background))

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ordinate

ordinate

see also...

distance

plot_ordination

Examples

Load and trim data

Ordination methods using the `ordinate` function

Clone this wiki locally

ordinate

ordinate

see also...

distance

plot_ordination

Examples

Load and trim data

Ordination methods using the ordinate function

Clone this wiki locally

Ordination methods using the `ordinate` function