Mediation (statistics)
A simple statistical mediation model.

In statistics, a mediation model is one that seeks to identify and explicate the mechanism that underlies an observed relationship between an independent variable and a dependent variable via the inclusion of a third explanatory variable, known as a mediator variable. Rather than hypothesizing a direct causal relationship between the independent variable and the dependent variable, a mediational model hypothesizes that the independent variable causes the mediator variable, which in turn causes the dependent variable. The mediator variable, then, serves to clarify the nature of the relationship between the independent and dependent variables.[1] While the concept of mediation as defined within psychology is theoretically appealing, the methods used to study mediation empirically have been challenged by statisticians and epidemiologists[2][3][4] and interpreted formally.[5]

## Direct versus indirect effects

In the diagram shown above, assuming linear relationships, the indirect effect is the product of path coefficients A and B, while the direct effect is the coefficient C. The total effect measures the extent to which the dependent variable changes when the independent variable increases by one unit. In contrast, the indirect effect (sometimes referred to as the mediated effect) measures the extent to which the dependent variable changes when the independent variable is held fixed and the mediator variable changes to the level it would have attained had the independent variable increased by one unit.[4][5] In linear systems, the total effect is equal to the sum of the direct and indirect effects (C + AB in the model above). In nonlinear models, the total effect is not generally equal to the sum of the direct and indirect effects, but to a modified combination of the two.[5]
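As a minimal sketch of the linear decomposition, the following simulates data with hypothetical path values (A = 0.5, B = 0.8, C = 0.3) and estimates the paths by ordinary least squares. The helper names and the simulated data are illustrative assumptions, not part of the cited sources. With least squares, the estimated total effect equals the estimated C + AB exactly within the sample.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def path_coefficients(x, m, y):
    """Return OLS estimates (A, B, C, total) for the simple mediation model."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                       # slope of M on X
    b = (sxx * smy - sxm * sxy) / det   # slope of Y on M, adjusting for X
    c = (smm * sxy - sxm * smy) / det   # slope of Y on X, adjusting for M
    total = sxy / sxx                   # slope of Y on X alone
    return a, b, c, total

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]                          # hypothetical A = 0.5
y = [0.3 * xi + 0.8 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]   # hypothetical C = 0.3, B = 0.8

a, b, c, total = path_coefficients(x, m, y)
print(f"A={a:.3f} B={b:.3f} C={c:.3f}")
print(f"indirect AB={a * b:.3f} total={total:.3f} C+AB={c + a * b:.3f}")
```

The error terms here are drawn independently, which is the condition under which the regression coefficients can be read as the path coefficients of the diagram.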

## Complete versus partial mediation

When the measured effect between the independent variable and the dependent variable is zero upon fixing the mediator variable, the mediation effect is said to be complete (C = 0 in the diagram above). If, however, the measured effect changes upon fixing the mediator but remains significantly different from zero, the mediation effect is said to be partial. In all cases, the operation of "fixing a variable" must be distinguished from that of "controlling for a variable," a term that has been used inappropriately in the literature.[3][4][6] The former stands for physically fixing the variable, while the latter stands for conditioning on it, adjusting for it, or adding it to the regression model. The two notions coincide only when all error terms (not shown in the diagram) are statistically uncorrelated. When errors are correlated, adjustments must be made to neutralize those correlations before embarking on mediation analysis (see Bayesian Networks).

In order for either partial or complete mediation to be established, the reduction in variance explained by the independent variable must be significant as determined by one of several tests, such as the Sobel test (1982). The effect of an independent variable on the dependent variable can become nonsignificant when the mediator is introduced simply because a trivial amount of variance is explained (i.e., not true mediation). Thus, it is imperative to show a significant reduction in variance explained by the independent variable before asserting either partial or complete mediation. Hayes (2009) shows that it is possible to have statistically significant indirect effects in the absence of a total effect. This can be explained by the presence of several mediating paths that cancel each other out and become noticeable when one of the cancelling mediators is controlled for. This implies that the terms 'complete' and 'partial' mediation should always be interpreted relative to the set of variables that are present in the model.
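As an illustration of the Sobel test, the sketch below computes the first-order Sobel statistic, which divides the product of the two estimated paths by an approximate standard error and refers the result to a standard normal distribution. Here `a` and `b` stand for estimates of path coefficients A and B, and `se_a` and `se_b` for their standard errors; the function name and the input numbers are hypothetical.

```python
import math

def sobel_test(a, se_a, b, se_b):
    """Return the Sobel z statistic and its two-sided normal p-value."""
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_ab
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p

# Hypothetical estimates: a = 0.5 (SE 0.1) and b = 0.4 (SE 0.1).
z, p = sobel_test(0.5, 0.1, 0.4, 0.1)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The normal reference distribution is itself an approximation, since a product of two estimates is generally not normally distributed in small samples.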

## Suppression

Suppression is defined as "a variable which increases the predictive validity of another variable (or set of variables) by its inclusion into a regression equation".[7] For instance, when examining the effect of a treatment (e.g., medication) on an outcome (e.g., healing from a disease), suppression means that the direct effect of the treatment on the outcome does not drop when the mediator is included, as one would expect under mediation; the opposite happens. The inclusion of the suppressor variable in the equation increases, rather than decreases, the relation between the treatment and the outcome.[7][8] This, too, can be explained by cancellation: disabling one mediating path may disturb the balance between otherwise cancelling paths.
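A worked illustration in the linear case, using hypothetical path coefficients chosen only to show the cancellation: suppose A = 1, B = -0.5 and C = 0.6 in the notation of the diagram, with uncorrelated error terms.

\begin{align}
\text{indirect effect} & = AB = (1)(-0.5) = -0.5 \\
\text{total effect} & = C + AB = 0.6 - 0.5 = 0.1
\end{align}

A regression of the outcome on the treatment alone would recover roughly the total effect of 0.1, whereas adding the mediator to the equation would recover the direct effect of 0.6, so the estimated relation between treatment and outcome grows rather than shrinks.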

Pearl (2000, page 139)[6] has argued that "suppression" may emanate from confusing causal and associational relationships, as in Simpson's paradox.

## Moderated mediation

Mediation and moderation can co-occur in statistical models. It is possible to mediate moderation and moderate mediation.

Moderated mediation occurs when the effect of the treatment A on the mediator B, and/or the partial effect of B on the outcome C, depends on the level of another variable (D). This definition has been outlined by Muller, Judd, and Yzerbyt (2005)[9] and Preacher, Rucker, and Hayes (2007).[10]
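One way to make this concrete is a conditional indirect effect. The sketch below assumes a linear model in which only the treatment-to-mediator path is moderated, so that the effect of the treatment on the mediator is `a0 + a1 * d` when the moderator D equals `d`. The coefficient names and values are hypothetical and serve only as an illustration.

```python
def conditional_indirect_effect(a0, a1, b, d):
    """Indirect effect of treatment on outcome via the mediator when the moderator equals d.

    Assumes mediator = (a0 + a1 * d) * treatment + noise and
    outcome = c * treatment + b * mediator + noise.
    """
    return (a0 + a1 * d) * b

# Hypothetical coefficients: the treatment-to-mediator path strengthens as D grows.
for d in (-1.0, 0.0, 1.0):
    print(d, conditional_indirect_effect(a0=0.4, a1=0.2, b=0.5, d=d))
```

Under these assumptions the indirect effect is no longer a single number but a function of the moderator, which is what distinguishes moderated mediation from simple mediation.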

## Mediated moderation

Mediated moderation is a variant of both moderation and mediation. Here there is initially overall moderation, and the direct effect of the moderator variable on the outcome is mediated, either at the A-B path or at the B-C path. The main difference between mediated moderation and moderated mediation is that in the former there is initial moderation and this effect is mediated, whereas in the latter there is no moderation, but either the effect of the treatment (A) on the mediator (B) or the effect of the mediator (B) on the outcome (C) is moderated.[9]

## Mediator variable

A mediator variable (or mediating variable, or intervening variable) in statistics is a variable that describes how rather than when effects will occur by accounting for the relationship between the independent and dependent variables. A mediating relationship is one in which the path relating A to C is mediated by a third variable (B).

For example, a mediating variable explains the actual relationship between the following variables. Most people will agree that older drivers (up to a certain point) are better drivers. Thus:

aging $\to$ better driving

But what is missing from this relationship is a mediating variable that is actually causing the improvement in driving: experience. The mediated relationship would look like the following:

aging $\to$ increased experience driving a car $\to$ better driving

Mediating variables are often contrasted with moderating variables, which pinpoint the conditions under which an independent variable exerts its effects on a dependent variable. A moderating relationship can be thought of as an interaction. It occurs when the relationship between variables A and B depends on the level of C.

## Significance of mediation

Bootstrapping [1][2][3] is becoming the most popular method of testing mediation because it does not require the normality assumption to be met and because it can be used effectively with smaller sample sizes (N<25). Mediation nonetheless continues to be determined most frequently using either (1) the logic of Baron and Kenny [4] or (2) the Sobel test. This is changing, and it is becoming increasingly difficult to publish tests of mediation based purely on the Baron and Kenny method or on tests that make distributional assumptions, such as the Sobel test. See Hayes (2009) for a discussion.
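A minimal sketch of a percentile bootstrap for the indirect effect is given below. It assumes a simple linear mediation model estimated by least squares and resamples whole cases with replacement; the function names, the simulated data and the choice of 1000 resamples are illustrative assumptions rather than a prescribed procedure.

```python
import random

def indirect_effect(x, m, y):
    """OLS estimate of the product a*b for the simple mediation model."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx                                          # slope of M on X
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)   # slope of Y on M given X
    return a * b

def bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lo = estimates[int(n_boot * alpha / 2)]
    hi = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

# Simulated data with hypothetical true paths a = 0.5 and b = 0.8.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.8 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y)
print(f"indirect effect = {point:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

An interval that excludes zero is taken as evidence of an indirect effect, and no normality assumption is placed on the sampling distribution of the product.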

## The Mediation Formula

Baron and Kenny's method of evaluating the degree to which an effect is mediated by a given path is applicable in linear systems only. In nonlinear models, especially those involving categorical variables and strong interactions, direct and indirect effects cannot be defined in terms of adding the putative mediator variable to a regression model. Instead, the following counterfactual definitions must be invoked [4][5]:

The direct effect DE measures the expected change in the dependent variable (Y) when the independent variable (X) is increased by one unit, say from x to x+1, while the mediator variable (M) is held fixed at the level it would have attained before the change.

The indirect effect IE measures the expected change in the dependent variable (Y) when the independent variable (X) is held fixed, and the mediator variable (M) changes to the level it would have attained had the independent variable increased by one unit, say from x to x+1.

For the case of error independence (or no confoundedness), Pearl [5] derived closed-form expressions for both DE and IE, called the Mediation Formulas:

\begin{align}
DE & = \sum_m [E(Y|x+1,m)-E(Y|x,m)] \, P(m|x) \\
IE & = \sum_m E(Y|x,m) \, [P(m|x+1)-P(m|x)]
\end{align}

where m ranges over the values that the mediator variable can take.

DE gives the effect remaining after suppressing the M-mediated path, while IE gives the effect remaining after suppressing the direct path from X to Y. If TE is the total effect, then 1-DE/TE measures the fraction of response owed to mediation, while IE/TE measures the fraction explained by mediation. When the output (Y) is binary, 1-DE/TE measures the percentage of responding units for which mediation was necessary, while IE/TE measures the percentage for which mediation was sufficient.
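The Mediation Formulas can be evaluated directly from tables of E(Y|x,m) and P(m|x). The sketch below does so for a binary mediator; the function name, the table layout and the numbers are made up for illustration. The total effect is computed as E(Y|x+1) - E(Y|x), which is assumed valid under the same no-confounding condition.

```python
def mediation_formula(e_y, p_m, x0, x1):
    """Return (DE, IE, TE) for the change from x0 to x1, assuming no confounding.

    e_y[(x, m)] holds E(Y | x, m); p_m[x][m] holds P(m | x).
    """
    ms = p_m[x0].keys()
    de = sum((e_y[(x1, m)] - e_y[(x0, m)]) * p_m[x0][m] for m in ms)
    ie = sum(e_y[(x0, m)] * (p_m[x1][m] - p_m[x0][m]) for m in ms)
    te = (sum(e_y[(x1, m)] * p_m[x1][m] for m in ms)
          - sum(e_y[(x0, m)] * p_m[x0][m] for m in ms))
    return de, ie, te

# Illustrative (made-up) conditional expectations and mediator distributions.
e_y = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.8}
p_m = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}}

de, ie, te = mediation_formula(e_y, p_m, x0=0, x1=1)
print(f"DE={de:.3f} IE={ie:.3f} TE={te:.3f}")
print(f"1 - DE/TE = {1 - de / te:.3f}, IE/TE = {ie / te:.3f}")
```

With these numbers DE + IE does not equal TE, because the effect of X on Y differs across levels of M, so the two fractions 1 - DE/TE and IE/TE do not coincide.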

The Mediation Formulas are applicable to all distributions, and to all types of variables, and they enable analysts to estimate direct and indirect effects efficiently, using both parametric and nonparametric regression.[11][12]

Due to non-linearities, the total effect may be non-zero even in the absence of direct and indirect effects. This would occur, for example, when Y requires the presence of both M=1 and X=1, and M=X; neither the direct nor indirect path alone can trigger a response while the combined paths can.[11]
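To trace this example through the Mediation Formulas, take binary variables with Y = XM and M = X, and consider the change from x = 0 to x = 1. The conditional expectations are read off the structural equation, E(Y|x,m) = xm, and the mediator distribution is degenerate: P(M=0|X=0) = 1 and P(M=1|X=1) = 1. The m = 1 term of DE vanishes because P(M=1|X=0) = 0.

\begin{align}
DE & = [E(Y|1,0)-E(Y|0,0)] \, P(M=0|X=0) = (0-0)(1) = 0 \\
IE & = E(Y|0,0) \, [P(M=0|1)-P(M=0|0)] + E(Y|0,1) \, [P(M=1|1)-P(M=1|0)] = 0(-1) + 0(1) = 0 \\
TE & = E(Y|X=1) - E(Y|X=0) = 1 - 0 = 1
\end{align}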

## Multilevel Mediation

Mediation analyses often involve more than one level of analysis (i.e., multilevel modeling). For example, schools with the resources to hire many teachers may make students feel less socially isolated, which may then improve their individual grades and performance in school. Students are nested within schools, creating a multilevel data structure. In this school example, a level two variable (i.e., schools’ resources) is hypothesized to cause a level one variable (i.e., students’ grades and performance), and this relationship is mediated by a level one variable (i.e., student perceptions of social isolation), which represents a 2-1-1 multilevel mediation model. Adding higher levels of analysis introduces additional sources of variance, which require statistical models that account for them. For example, there may be variance between schools (i.e., schools differ in the amount of resources they have) and also variance within schools (i.e., students differ in how socially isolated they feel and also in the grades they obtain). Preacher, Zyphur, and Zhang (2010) suggest using structural equation modeling techniques to estimate a wide variety of multilevel mediation models and to account for variance components at different levels of analysis. They offer four suggestions for conducting multilevel mediation analyses. (1) Identify the mediation hypothesis to be tested in order to determine the type of multilevel mediation model to be estimated (e.g., 2-1-1, 2-2-1, and so on). (2) Ensure that there is enough between-cluster variability to support using multilevel structural equation modeling. (3) Fit the within-cluster model. (4) Fit the between-cluster and within-cluster models simultaneously. With this full hypothesized model, the indirect effect can be estimated to test the mediation hypotheses.
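Between-cluster variability, as in suggestion (2), is commonly summarized with an intraclass correlation, although the text above does not prescribe a particular statistic. As a sketch under that assumption, the following computes the one-way ANOVA estimate ICC(1) for equally sized clusters; the function name and toy data are hypothetical.

```python
def icc1(groups):
    """One-way ANOVA intraclass correlation for equally sized clusters."""
    k = len(groups[0])  # observations per cluster
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes balanced clusters")
    n_groups = len(groups)
    grand = sum(sum(g) for g in groups) / (n_groups * k)
    means = [sum(g) / k for g in groups]
    # Mean squares between and within clusters.
    msb = k * sum((mu - grand) ** 2 for mu in means) / (n_groups - 1)
    msw = sum((v - mu) ** 2 for g, mu in zip(groups, means) for v in g) / (n_groups * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Toy data: three schools with three students each.
print(icc1([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

A value near zero would indicate that almost all variance lies within clusters, in which case a multilevel model adds little.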

## References

1. ^ MacKinnon, D. P. (2008). Introduction to Statistical Mediation Analysis. New York: Erlbaum.
2. ^ Bullock, J. G., Green, D. P., Ha, S. E. (2010). Yes, but what's the mechanism? (Don't expect an easy answer). Journal of Personality & Social Psychology, 98(4):550-558.
3. ^ a b Kaufman, J. S., MacLehose, R. F., Kaufman, S. (2004). A further critique of the analytic strategy of adjusting for covariates to identify biologic mediation. Epidemiology Innovations and Perspectives, 1:4.
4. ^ a b c d Robins, J. M., Greenland, S. (1992). "Identifiability and exchangeability for direct and indirect effects". Epidemiology, 3(2):143–55.
5. ^ a b c d e Pearl, J. (2001) "Direct and indirect effects". Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, 411–420.
6. ^ a b Pearl, J. (2000) Causality: Models, Reasoning and Inference, Cambridge University Press. 2nd edition (2009).
7. ^ a b MacKinnon, D. P., Krull, J. L., Lockwood, C. M. (2000). Equivalence of the Mediation, Confounding and Suppression Effect. Prevention Science, 1(4): 173–181.
8. ^ Shrout, P. E., & Bolger, N. (2002). Mediation in experimental and nonexperimental studies: new procedures and recommendations. Psychological Methods, 7(4), 422–445.
9. ^ a b Muller, D., Judd, C. M., Yzerbyt, V. Y. (2005). When moderation is mediated and mediation is moderated. Journal of Personality and Social Psychology, 89(6), 852–863.
10. ^ Preacher, K. J., Rucker, D. D. & Hayes, A. F. (2007). Assessing moderated mediation hypotheses: Strategies, methods, and prescriptions. Multivariate Behavioral Research, 42, 185–227.
11. ^ a b Pearl, J., (2010). "The Mediation Formula: A guide to the assessment of causal pathways in non-linear models". UCLA Computer Science Department, Technical Report R-363, January 2011. To appear in C. Berzuini, P. Dawid, and L. Bernardinelli (Eds.), Causality: Statistical Perspectives and Applications. Forthcoming, 2011.
12. ^ Imai, K., Keele, L., Yamamoto, T. (2010). Identification, inference, and sensitivity analysis for causal mediation effects. Statistical Science, 25(1):51–71.

Wikimedia Foundation. 2010.
