# Permutation Tests
A permutation test assesses statistical significance without assuming any particular distribution for the test statistic. Instead of comparing an observed result to a theoretical reference distribution, it builds the null distribution directly from the data by repeatedly shuffling the observations. This makes permutation tests well-suited to the complex, multivariate statistics used in community ecology, where parametric assumptions rarely hold.
## How a Permutation Test Works
The logic is the same regardless of the model being tested:
1. Fit the real model and record the test statistic (e.g. F, R², pseudo-F).
2. Shuffle the response variable (or residuals) to break any true relationship.
3. Refit the model on the shuffled data and record the statistic.
4. Repeat steps 2–3 many times to build the null distribution.
5. Compute the p-value as the proportion of permuted statistics that are as large as or larger than the observed one.
A small p-value means that very few random shuffles produced a result as extreme as what was actually observed. The relationship is unlikely to have arisen by chance.
## The p-value as a Proportion

The p-value is computed as

p = (number of permuted statistics >= observed + 1) / (number of permutations + 1)

The + 1 in the numerator and denominator includes the observed statistic itself, so the p-value can never be exactly zero. With 999 permutations, the minimum achievable p-value is 1 / (999 + 1) = 0.001. This is why 999 is the conventional default: it gives a clean minimum of 0.001 and is fast to compute.
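The + 1 correction can be checked numerically in base R. The null values below are simulated stand-ins, not output from a real model:

```r
set.seed(1)
perm_stats <- rnorm(999)   # stand-in null distribution
observed <- 3.2            # stand-in observed statistic

# +1 in numerator and denominator counts the observed value itself,
# so the p-value can never be exactly zero
p <- (sum(perm_stats >= observed) + 1) / (length(perm_stats) + 1)

min_p <- 1 / (999 + 1)     # smallest achievable p-value: 0.001
```

Even when no permuted value reaches the observed statistic, `p` bottoms out at 1/1000 rather than zero.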
## Permutation Tests in vegan
The functions used in this course all share the same permutations argument. The permutation logic is identical across methods; only the test statistic changes.
### Testing community composition differences

adonis2() tests whether groups differ in community composition; a significant result can reflect a difference in centroid (location), in dispersion (spread), or both. The pseudo-F statistic compares between-group to within-group variation.
```r
library(vegan)

# Test whether management type explains community composition
perm_result <- adonis2(
  dune ~ Management,
  data = dune.env,
  method = "bray",
  permutations = 999
)
perm_result
```
Under the null hypothesis, the group labels carry no information. Shuffling them repeatedly and recomputing pseudo-F builds the reference distribution against which the observed pseudo-F is compared.
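"Shuffling the labels" is just sample() applied to the grouping factor; a minimal sketch with made-up labels (not the dune data):

```r
# Hypothetical management labels for six samples
groups <- factor(c("BF", "BF", "HF", "HF", "NM", "NM"))

set.seed(42)
shuffled <- sample(groups)   # random reassignment of labels to samples

# The shuffle preserves group sizes; only the label-sample pairing changes
table(shuffled)
```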
### Testing RDA and CCA models

anova.cca() tests the overall model, individual terms, or individual constrained axes using an F statistic. The permutation logic is the same as in PERMANOVA.
```r
# Fit a constrained ordination first (example model;
# Management and A1 are columns of dune.env)
rda_fit <- rda(dune ~ Management + A1, data = dune.env)

# Overall model test
anova(rda_fit, permutations = 999)

# Term-by-term (sequential, Type I)
anova(rda_fit, by = "term", permutations = 999)

# Axis-by-axis
anova(rda_fit, by = "axis", permutations = 999)
```
For term-by-term tests, the order of predictors in the formula matters. Each term is tested after those that precede it. Use by = "margin" for marginal tests that treat all other terms as covariates.
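A self-contained sketch of a marginal test; the two-predictor model is assumed for illustration (Management and A1 are columns of dune.env):

```r
library(vegan)
data(dune, dune.env)

# Example model with two predictors
rda_fit <- rda(dune ~ Management + A1, data = dune.env)

# Each term is tested with all other terms held as covariates
anova(rda_fit, by = "margin", permutations = 999)
```

Unlike the sequential test, the marginal p-values here do not change if the predictors are listed in a different order.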
### Testing homogeneity of dispersion
betadisper() followed by permutest() tests whether groups differ in their spread around a centroid. This is a key assumption check for PERMANOVA.
```r
# Compute dispersion
disp <- betadisper(
  vegdist(dune, method = "bray"),
  group = dune.env$Management
)

# Permutation test on dispersion differences
permutest(disp, permutations = 999)
```
A non-significant result here supports the interpretation that a significant PERMANOVA reflects a difference in location (centroid) rather than spread.
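If the dispersion test does come out significant, permutest() can also run pairwise comparisons to locate which groups differ (a self-contained sketch repeating the setup above):

```r
library(vegan)
data(dune, dune.env)

disp <- betadisper(
  vegdist(dune, method = "bray"),
  group = dune.env$Management
)

# Pairwise permutation tests between all pairs of groups
permutest(disp, pairwise = TRUE, permutations = 999)
```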
## The permutation loop in plain R
The code below applies the permutation idea to a simple linear regression. The logic is exactly what vegan does internally.
```r
# ---- Permutation test: manual illustration ----
set.seed(123)
x <- rnorm(50)                 # example data: y depends weakly on x
y <- 0.5 * x + rnorm(50)

# Observed F statistic from the real model
observed_F <- summary(lm(y ~ x))$fstatistic[1]

# Build null distribution by shuffling y
perm_F <- replicate(999, {
  y_perm <- sample(y)                     # break the relationship
  summary(lm(y_perm ~ x))$fstatistic[1]   # refit and extract F
})

# p-value: proportion of permuted F >= observed F,
# with +1 counting the observed statistic itself
p_value <- (sum(perm_F >= observed_F) + 1) / (999 + 1)
```
sample(y) shuffles the response at random, simulating the null hypothesis that x has no effect on y. The observed F is then compared against 999 such shuffles.
## Choosing the Number of Permutations
| Permutations | Min. p-value | Typical use |
|---|---|---|
| 99 | 0.01 | Quick checks during model building |
| 999 | 0.001 | Standard reporting |
| 9999 | 0.0001 | Borderline results, final publication |
More permutations increase the precision of the p-value but do not increase the power of the test. Power is determined by sample size and effect size, not by how many times you shuffle.
## What Permutation Tests Do Not Assume
Permutation tests are distribution-free with respect to the test statistic: they do not assume normality, homoscedasticity, or any particular error distribution. However, they do carry one important assumption: observations must be exchangeable under the null hypothesis. In practice this means samples should be independent. Repeated measures, time series, or spatially autocorrelated data require restricted permutation schemes, which vegan supports via the how() function from the permute package.
```r
library(permute)

# Restrict permutations to within blocks (e.g. repeated measures);
# this assumes your data include a blocking factor such as dune.env$Block
h <- how(blocks = dune.env$Block)
anova(rda_fit, permutations = h)
```
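Other restricted schemes follow the same pattern. For ordered data such as time series, how() can restrict permutations to cyclic shifts that preserve temporal order; Within() and shuffle() are part of the permute package, and the scheme below is illustrative:

```r
library(permute)

# Cyclic shifts within each series preserve temporal autocorrelation
h_series <- how(within = Within(type = "series"))

# Inspect one permutation of 10 ordered observations:
# the sequence is rotated, not fully shuffled
shuffle(10, control = h_series)
```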
## Quick Reference

| Function | Package | What is permuted | Test statistic |
|---|---|---|---|
| `adonis2()` | vegan | Group labels | pseudo-F |
| `anova(rda_fit)` | vegan | Residuals | F |
| `anova(cca_fit)` | vegan | Residuals | F |
| `permutest(betadisper())` | vegan | Group labels | F |
| `mantel()` | vegan | Row/column order of one matrix | Mantel r |