# Pseudoreplication

Hurlbert (1984) [1] defined pseudoreplication as the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent. The error arises when an F-ratio in an analysis of variance (ANOVA) table is formed with the residual mean square as the denominator rather than the among-unit mean square. The misformed F-ratio fails to account for among-unit effects when declaring statistical significance. When the number of units is small (e.g. four tanks, two treated and two untreated, with many measurements per tank), the misformed F-ratio is vulnerable to unit (tank) effects and can yield a statistically significant treatment mean square when there is no treatment effect. The error often arises from the default setting in many statistical packages, which is to form the F-ratio relative to the residual mean square. It is avoided by forming the F-ratio relative to the next lower random factor in the ANOVA table (tanks in the example above) rather than the lowest level (the residual mean square in the example above).
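The two ways of forming the F-ratio can be compared with a minimal sketch of the four-tank example. It assumes a balanced nested design (the same number of tanks per treatment and of measurements per tank). The function name `nested_f_ratios`, the dictionary layout, and the measurements are made up for illustration and do not come from Hurlbert's paper.

```python
from statistics import mean


def nested_f_ratios(data):
    """Return both F-ratios for a balanced nested design.

    `data` maps each treatment label to a list of units (tanks);
    each unit is a list of measurements. Balance is assumed, not checked.
    """
    units_per_treatment = list(data.values())
    a = len(units_per_treatment)            # number of treatments
    b = len(units_per_treatment[0])         # units (tanks) per treatment
    n = len(units_per_treatment[0][0])      # measurements per unit

    grand = mean(y for units in units_per_treatment for unit in units for y in unit)

    ss_treat = ss_unit = ss_resid = 0.0
    for units in units_per_treatment:
        treat_mean = mean(y for unit in units for y in unit)
        ss_treat += b * n * (treat_mean - grand) ** 2
        for unit in units:
            unit_mean = mean(unit)
            ss_unit += n * (unit_mean - treat_mean) ** 2
            ss_resid += sum((y - unit_mean) ** 2 for y in unit)

    df_treat = a - 1
    df_unit = a * (b - 1)
    df_resid = a * b * (n - 1)

    ms_treat = ss_treat / df_treat
    ms_unit = ss_unit / df_unit
    ms_resid = ss_resid / df_resid

    return {
        # Misformed ratio: treatment tested over the residual mean square.
        "f_pseudo": ms_treat / ms_resid,
        "df_pseudo": (df_treat, df_resid),
        # Ratio tested over the among-unit (tank) mean square.
        "f_correct": ms_treat / ms_unit,
        "df_correct": (df_treat, df_unit),
    }


# Illustrative data: two tanks per treatment, three measurements per tank.
tanks = {
    "control": [[9, 10, 11], [13, 14, 15]],
    "treated": [[11, 12, 13], [15, 16, 17]],
}
result = nested_f_ratios(tanks)
print(result)
```

With these made-up numbers the misformed ratio is 12.0 on (1, 8) degrees of freedom, above the 5% critical value of about 5.32. The ratio formed over the tank mean square is 0.5 on (1, 2) degrees of freedom, far below the critical value of about 18.51. The tank-to-tank spread, not the treatment, accounts for most of the variation.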

## Replication

Replication increases the precision of an estimate, while randomization addresses the broader applicability of a sample to a population. Replication must be appropriate: replication at the experimental unit level must be considered, in addition to replication within units.

## Statistics and replication

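Statistical tests (e.g. the t-test and the related ANOVA family of tests) rely on adequate replication to estimate statistical confidence. Tests based on the chi-squared, t, and F distributions assume homogeneous, normal, and independent errors. Measurements taken within the same experimental unit typically share unit effects and so violate the independence assumption.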

## Types

In his paper, Hurlbert distinguished four types of pseudoreplication.

• Simple pseudoreplication occurs when there is no true replication of treatments because repeated measurements on the same experimental unit are treated as independent in the statistical analysis. This is essentially subsampling, which is often informative and interesting, but it does not support a valid test of the treatment effect, as there is only one experimental unit per treatment.[2]
• Temporal pseudoreplication is similar to simple pseudoreplication, except that the multiple samples from each experimental unit are taken over a period of time rather than simultaneously. This may be valid in some cases where time is a factor; in that instance, each sampling time can be seen as an experimental unit.
• Sacrificial pseudoreplication occurs when observational data are pooled prior to statistical analysis (thus depressing the calculated variance of the observations), or when the multiple samples or measurements taken from each experimental unit are treated as independent (thus "sacrificing" some of the sample variance by inflating the denominator).
• Implicit pseudoreplication occurs when subsamples are not explicitly described as replicates, but nonetheless have non-overlapping standard errors and/or are described with confidence intervals, often with associated graphs.
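One common way to avoid treating subsamples as replicates is to reduce each experimental unit to a single summary value before testing. The sketch below compares a pooled-variance two-sample t statistic computed on all subsamples with the same statistic computed on unit means. The helper `pooled_t` and the data are hypothetical illustrations, and a pooled-variance t-test is assumed here only for simplicity.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    df = len(x) + len(y) - 2
    pooled_var = ((len(x) - 1) * variance(x) + (len(y) - 1) * variance(y)) / df
    t = (mean(y) - mean(x)) / sqrt(pooled_var * (1 / len(x) + 1 / len(y)))
    return t, df


# Illustrative data: each inner list holds the subsamples from one tank.
control_tanks = [[9, 10, 11], [13, 14, 15]]
treated_tanks = [[11, 12, 13], [15, 16, 17]]

# Pseudoreplicated analysis: every subsample is counted as a replicate.
control_all = [y for tank in control_tanks for y in tank]
treated_all = [y for tank in treated_tanks for y in tank]
t_sub, df_sub = pooled_t(control_all, treated_all)

# Unit-level analysis: one mean per tank, so the tank is the replicate.
control_means = [mean(tank) for tank in control_tanks]
treated_means = [mean(tank) for tank in treated_tanks]
t_unit, df_unit = pooled_t(control_means, treated_means)

print(f"subsamples as replicates: t = {t_sub:.3f}, df = {df_sub}")
print(f"tank means as replicates: t = {t_unit:.3f}, df = {df_unit}")
```

The subsample analysis claims 10 degrees of freedom, while the unit-level analysis has only 2, which reflects the number of independent tanks. With a single unit per treatment, as in simple pseudoreplication, the unit-level test has zero degrees of freedom and cannot be computed at all.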

There are also other, related kinds of pseudoreplication.

• Inadequate sample replication occurs where there is sufficient replication of treatments, but the sample size is not great enough for statistical analysis. This is the inverse of simple pseudoreplication.

## Notes

Hurlbert [1] reported pseudoreplication in 48% of the studies he examined that used inferential statistics. When time and resources limit the number of experimental units, and unit effects cannot be eliminated statistically by testing over the unit variance, it is important to use other sources of information to evaluate the degree to which an F-ratio is inflated by unit effects.

## References

