July 17, 2007
Identification of Causal Parameters in Randomized Studies with Mediating Variables
Michael Sobel sent me this paper, which will appear in the Journal of Educational and Behavioral Statistics. It's about mediation: a crucial issue in causal inference and a difficult issue to think about. The usual rhetorical options here are:
- Blithe acceptance of structural equation models (of the form, "we ran the analysis and found that A mediates the effects of X on Y")
- Blanket dismissal (of the form, "estimating mediation requires uncheckable assumptions, so we won't do it")
- Claims of technological wizardry (of the form, "with our new method you can estimate mediation from observational data")
For example, in our book, Jennifer and I illustrate that regression estimates of mediation make strong assumptions, and we vaguely suggest that something better might come along. We don't provide any solutions or even much guidance.
Michael has thought hard about these problems for a long time. (For example, see here and here, or for some laffs, here.) Michael's also notorious for pointing out that the phrase "causal effect" is redundant: all effects are causal. Anyway, I was interested to see what he has to say about mediation. Here's the abstract of the paper:
Randomized trials are used to assess the effectiveness of one or more treatments in inducing outcomes of interest. Treatments are typically designed to target key mediating variables that are thought to be causally related to the outcome. Thus, researchers want to know not only if the treatment is effective, but how the mediators affect the outcome. Data from such studies are often analyzed using recursive linear structural equation models, and model coefficients, including the coefficient relating the mediator(s) to the outcome, are endowed with a causal interpretation. However, because only assignment to treatment groups is randomized, not assignment to the mediators, the latter are self selected treatments. In order to believe that the so-called “direct effect” of the mediator on the outcome variable in a structural equation model warrants a causal interpretation one must believe there is no selection bias with respect to the mediator. Holland (1988) studied the case of a single continuous mediator. He criticized the use of structural equation models and showed how to estimate the effect of the mediator on the outcome using treatment assignment as an instrumental variable. However, the assumptions he used to justify the instrumental variable approach are overly strong and substantively implausible. This paper has several goals: 1) to make explicit the assumptions needed to justify equating the parameters of structural equation models with the effects of mediators, 2) to provide weaker and more plausible conditions than those used by Holland under which the instrumental variable estimand may be interpreted as a causal parameter, and 3) to extend the analysis to include the case where subjects do not necessarily take up the treatment to which they are assigned. I also briefly discuss the role of covariates and other possible assumptions to aid in the identification of mediated effects in randomized studies.
Now I just have to read the damn thing...
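A toy simulation can make the selection problem in the abstract concrete. This sketch is not taken from the paper: the data-generating process, the coefficient values, and the function names (`simulate`, `ols_mediator_coef`, `iv_estimate`) are all assumptions chosen for illustration. Treatment assignment Z is randomized, the mediator M is not, and an unobserved variable U drives both M and the outcome Y. The simulation also assumes a constant mediator effect and no direct path from Z to Y, which are strong simplifying assumptions that real studies may not satisfy.

```python
import numpy as np

def simulate(n=200_000, beta=2.0, seed=0):
    """Hypothetical data: randomized Z, self-selected mediator M, outcome Y."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n).astype(float)  # randomized treatment assignment
    u = rng.normal(size=n)                        # unobserved cause of both M and Y
    m = 1.0 * z + u + rng.normal(size=n)          # mediator: moved by Z, confounded by U
    y = beta * m + 1.5 * u + rng.normal(size=n)   # outcome: assumed no direct Z -> Y path
    return z, m, y

def ols_mediator_coef(z, m, y):
    """Coefficient on M from a least-squares regression of Y on (1, Z, M)."""
    X = np.column_stack([np.ones_like(z), z, m])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[2]

def iv_estimate(z, m, y):
    """Instrumental variable (Wald) estimate: cov(Z, Y) / cov(Z, M)."""
    return np.cov(z, y)[0, 1] / np.cov(z, m)[0, 1]

z, m, y = simulate()
ols = ols_mediator_coef(z, m, y)
iv = iv_estimate(z, m, y)
print(f"true effect of M on Y: 2.0, regression: {ols:.2f}, IV: {iv:.2f}")
```

Under these assumptions the regression coefficient on M absorbs part of the effect of U and overstates the mediator's effect, while the instrumental variable ratio lands near the true value. If the assumed absence of a direct Z-to-Y path is dropped, the IV ratio is no longer guaranteed to recover the mediator's effect either, which is why the conditions for interpreting it matter.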
P.S.
This paper by Heckman and Vytlacil also seems relevant to the discussion.
Posted by Andrew at July 17, 2007 6:28 AM