Ordination: Biological Applications
This page covers ordination workflows for microbiome and ecological data, including statistical testing with PERMANOVA and beta dispersion analysis. The core methods are described in the main ordination page.
Microbiome Workflow with phyloseq
The phyloseq package integrates ordination with sample metadata and taxonomic information, making it straightforward to colour and annotate plots by any variable in your experiment.
library(phyloseq)
library(vegan)
library(ggplot2)
data(GlobalPatterns)
gp <- GlobalPatterns
# 1. Filter rare taxa (present in at least 10% of samples with > 3 reads)
gp_filtered <- filter_taxa(gp,
function(x) sum(x > 3) > (0.1 * length(x)), TRUE)
# 2. Relative abundance transformation
gp_rel <- transform_sample_counts(gp_filtered, function(x) x / sum(x))
# 3. NMDS via phyloseq wrapper
set.seed(123)
gp_nmds <- ordinate(gp_rel, method = "NMDS", distance = "bray")
gp_nmds$stress
# 4. Plot coloured by sample type
plot_ordination(gp_rel, gp_nmds,
colour = "SampleType",
title = paste0("NMDS (stress = ",
round(gp_nmds$stress, 3), ")")) +
geom_point(size = 3, alpha = 0.7) +
stat_ellipse() +
theme_minimal()
UniFrac and PCoA
UniFrac distances incorporate phylogenetic relationships between taxa, weighting compositional differences by how evolutionarily distant the taxa involved are. Weighted UniFrac accounts for abundance; unweighted UniFrac treats all taxa as equally present or absent.
Because UniFrac preserves exact phylogenetic distances, PCoA is the natural companion method.
# Requires a phylogenetic tree in the phyloseq object
gp_pcoa <- ordinate(gp_rel, method = "PCoA", distance = "wunifrac")
var_exp <- round(gp_pcoa$values$Relative_eig[1:2] * 100, 1)
plot_ordination(gp_rel, gp_pcoa,
colour = "SampleType",
title = "PCoA with Weighted UniFrac") +
geom_point(size = 3, alpha = 0.7) +
labs(x = paste0("PCo1 (", var_exp[1], "%)"),
y = paste0("PCo2 (", var_exp[2], "%)")) +
theme_minimal()
PERMANOVA
PERMANOVA (permutational multivariate analysis of variance) tests whether groups differ significantly in multivariate space. It partitions variance in the distance matrix between groups and within groups, then assesses significance by permutation rather than distributional assumptions.
library(vegan)
# Extract OTU table and metadata from phyloseq object
otu <- as(otu_table(gp_rel), "matrix")
if (taxa_are_rows(gp_rel)) otu <- t(otu)
meta <- as(sample_data(gp_rel), "data.frame")
# Distance matrix
dist_bray <- vegdist(otu, method = "bray")
# PERMANOVA
# H0: no difference in community composition between sample types
perm_result <- adonis2(dist_bray ~ SampleType,
data = meta,
permutations = 999)
print(perm_result)
#> R2 tells you what fraction of total variance is explained by SampleType
#> p-value from permutation test
The R2 value is an effect size: it tells you what proportion of the total compositional variance is associated with your grouping variable. A significant p-value with a low R2 means the groups differ but the grouping explains little of the overall variation.
Pairwise comparisons
library(pairwiseAdonis)
pairwise_result <- pairwise.adonis(dist_bray, meta$SampleType,
perm = 999)
print(pairwise_result)
# Bonferroni-corrected p-values for all pairwise group comparisons
Beta Dispersion
PERMANOVA tests whether group centroids differ in multivariate space. But a significant result can also arise if one group is simply more variable than another, even if the centroids are in the same location. Beta dispersion analysis checks for this.
# Test homogeneity of multivariate dispersion
# H0: all groups have equal dispersion around their centroid
beta_disp <- betadisper(dist_bray, meta$SampleType)
permutest(beta_disp, permutations = 999)
# Visualise
plot(beta_disp, main = "Beta Dispersion by Sample Type")
boxplot(beta_disp, main = "Distance to Centroid")
If beta dispersion is significant, interpret PERMANOVA results with care. A significant PERMANOVA may reflect differences in spread rather than differences in location. Report both tests together.
Ecological Data: Dune Meadow Example
A complete workflow for species abundance data using the dune dataset:
library(vegan)
library(ggplot2)
data(dune)
data(dune.env)
# 1. Hellinger transformation
dune_hell <- decostand(dune, method = "hellinger")
# 2. NMDS
set.seed(123)
nmds <- metaMDS(dune_hell, distance = "euclidean",
k = 2, trymax = 100, autotransform = FALSE)
nmds$stress
# 3. Fit environmental variables
env_fit <- envfit(nmds, dune.env, permutations = 999)
env_fit
# 4. Plot with significant environmental vectors
scores_df <- as.data.frame(scores(nmds, display = "sites"))
scores_df$Management <- dune.env$Management
ggplot(scores_df, aes(NMDS1, NMDS2, colour = Management)) +
geom_point(size = 3) +
stat_ellipse(linetype = 2) +
theme_minimal()
plot(nmds, type = "n")
points(nmds, display = "sites", pch = 19,
col = as.numeric(dune.env$Management))
plot(env_fit, p.max = 0.05, col = "darkred")
# 5. PERMANOVA
dist_hell <- dist(dune_hell)
adonis2(dist_hell ~ Management + Moisture, data = dune.env,
permutations = 999)
# 6. Beta dispersion
beta_disp <- betadisper(dist_hell, dune.env$Management)
permutest(beta_disp)
Reporting Results
A complete ordination analysis should report:
- Distance metric used and justification
- NMDS stress value (or % variance for PCoA axes)
- PERMANOVA R2 and p-value
- Beta dispersion test result
- Any data transformations applied
Example: "Bray-Curtis dissimilarities were calculated on relative abundance data. NMDS ordination (stress = 0.12) showed clear separation between sample types. PERMANOVA confirmed significant differences in community composition (R2 = 0.43, p < 0.001). Beta dispersion did not differ significantly between groups (p = 0.21), supporting the interpretation that PERMANOVA reflects differences in community location rather than spread."