Where do instrumental variables come from? And my own favorite (unused) example

We were discussing the Angrist and Pischke book with Paul Rosenbaum and I mentioned my struggle with instrumental variables: where do they come from, and doesn’t it seem awkward when you see someone studying a causal question and looking around for an instrument?

And Paul said: No, it goes the other way. What Angrist and his colleagues do is to find the instrument first, and then they go from there. They might see something in the newspaper or hear something on the radio and think: Hey–there’s a natural experiment–it could make a good instrument! And then they go from there.

This sounded odd to me at first, but on reflection I actually prefer it to the usual presentation of instrumental variables. The “find the IV first” approach is cleaner: in this story, all causation flows from the IV, which then has various downstream consequences. So if you have a few key researchers such as Angrist keeping their ears open, hearing about potential IVs, then you’ll learn some things. This approach also fits in with my fail-safe method of understanding IVs when I get stuck with the usual interpretation.
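To make the “all causation flows from the IV” picture concrete, here’s a minimal simulated sketch (my own toy illustration, with made-up numbers and variable names): an instrument z shifts a treatment T, which in turn affects an outcome y, while an unobserved confounder u spoils the naive regression of y on T.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy setup: z is the instrument (e.g., a natural-experiment assignment),
# u is an unobserved confounder, T is the treatment, y is the outcome.
z = rng.binomial(1, 0.5, n)
u = rng.normal(0, 1, n)
T = 0.8 * z + u + rng.normal(0, 1, n)
y = 2.0 * T + u + rng.normal(0, 1, n)   # true effect of T on y is 2.0

# Naive regression of y on T is biased because u drives both T and y.
ols = np.cov(T, y)[0, 1] / np.var(T, ddof=1)

# Wald/IV estimator: Cov(z, y) / Cov(z, T) recovers the effect,
# because z affects y only through T (the exclusion restriction).
iv = np.cov(z, y)[0, 1] / np.cov(z, T)[0, 1]

print(f"naive OLS: {ols:.2f}   IV: {iv:.2f}   truth: 2.00")
```

The point of the sketch is just that once you have a z with these properties, you can use it to study whatever downstream T -> y relationships it touches, which is why leading with the instrument can pay off for more than one question.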

Sometimes the “lead with the natural experiment” approach can lead to missteps, as illustrated by Angrist and Pischke’s overinterpretation of David Lee’s work on incumbency in elections. (See here for my summary of Lee’s research along with a discussion of why he’s estimating the “incumbent party advantage” rather than the advantage of individual incumbency.) But generally it seems like the way to go, much better than the standard approach of starting with a causal goal of interest and then looking around for an IV.

In this spirit, let me again mention my own pet idea for a natural experiment:

The Flynn effect, and the occasional re-norming of IQ scores that follows from it, cause jumps in the number of people classified as mentally retarded (conventionally, an IQ below 70, which is two standard deviations below the mean when the test is scaled to a mean of 100 and a standard deviation of 15). When the tests are rescaled, the proportion of people labeled “retarded” jumps up. This seems like a natural experiment that could be a good opportunity to study the effects of classifying people this way at the margin. If the renorming is done differently in different states or countries, that would provide more opportunities for identifying treatment effects.
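As a rough sketch of the kind of discontinuity I have in mind (the numbers below are invented for illustration, not taken from any actual renorming), here is a small simulation of how a fixed cutoff of 70 plus a rescaling of the score distribution produces a jump in the share of people classified below the threshold:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Invented numbers: under the old norms the Flynn effect has drifted the
# observed mean of the current population up to about 103 (sd 15).
old_scores = rng.normal(103, 15, n)

# Renorming rescales the same underlying distribution back to mean 100, sd 15.
new_scores = 100 + 15 * (old_scores - old_scores.mean()) / old_scores.std()

cutoff = 70  # two standard deviations below the renormed mean of 100
print(f"classified below {cutoff} under old norms: {np.mean(old_scores < cutoff):.2%}")
print(f"classified below {cutoff} under new norms: {np.mean(new_scores < cutoff):.2%}")
```

A jump from roughly 1.4% to 2.3% of the population at the cutoff is the sort of discontinuity that could serve as an instrument, provided the timing of the renorming and the classification rule satisfy the usual exclusion conditions.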

I think it would be so cool if someone could take this idea and run with it.

7 thoughts on “Where do instrumental variables come from? And my own favorite (unused) example”

  1. Could there also be a downside to this approach? Namely that the instrument comes first and the ideas come later. Researchers might end up studying things that are of little value but where causal identification can be done in a neat way and publications come easily. People might be put off from studying questions that are of greater importance but where causal identification is harder if not impossible to achieve.

  2. I think somebody did exploit the Flynn Effect in such a way. See Kanaya, Scullin & Ceci, “The Flynn Effect & US Policies,” American Psychologist, 2003.

    Ben

  3. Prof. Andrew,
    I am with you (as usually happens) 100%. Instruments don't fall from a tree… causal analysis should come first.
    Anyway, I don't understand what Paul Rosenbaum means: a natural experiment and an instrument are two different things. Econometricians usually do IV regression in order to "control" for the endogeneity problem that arises when some of the regressors are correlated with the error term. On the other hand, a natural experiment, in principle, can allow the researcher to identify the causal relationships in a system. I think they are two alternative ways to treat the general problem of endogeneity in econometrics. Am I missing something?

    dani.

  4. Ben,

    I took a look at the Kanaya et al. paper. They do discuss the renorming of IQ scores, but they don't use it as an instrument for the effect of classification in the way that I was suggesting.

    Dani,

    To be fair to Paul, what you're seeing is my recollection of what he said. But I think the key point here is that sometimes it makes sense to start with the natural experiment, which can then be viewed as an instrument for various treatments of interest.

  5. I'm always confused about IV. Can we say anything about effect sizes with IV? Won't the effect sizes be different if we had different instruments? Is this what we would see if we practiced "cause first, then find the IV" (and happened to find more than one instrument)?

  6. > Researchers might end up studying things that are of little value

    There is a nice quote by Alfred J. Lotka (mathematical biology) "somewhere" about it being wiser to try to answer the questions you can rather than the questions you want to answer or think are very important.

    I "beleive" he is right – and its probably a mistake for "people" to think IVs are "usually" or perhaps even more than just very occassionally available for their problem…

  7. Hi All,

    (1) The idea that one instrument may provide causal leverage for many questions has been developed in a couple of papers by Alan Gerber, Don Green, and Edward Kaplan. In their formulation the "randomized controlled experiment" is the "instrument" (Angrist, Imbens, and Rubin 1996 provide a now-canonical explanation of how such an experiment is a good instrument):

    "The Downstream Benefits of Experimentation."
    Donald P. Green and Alan S. Gerber. Political Analysis 10:4 (Autumn 2002) 394-402.

    The Illusion of Learning from Observational Research."
    Alan S. Gerber, Donald P. Green, and Edward H. Kaplan. from Problems and Methods in the Study of Politics, edited by Ian Shapiro (Yale University, Connecticut), Rogers M. Smith (University of Pennsylvania), Tarek E. Masoud (Yale University, Connecticut)
    Cambridge University Press, 2004.

    (2) The Angrist, Imbens, and Rubin 1996 piece encourages us to see experiments as useful primarily as instrument-creating devices. Their reconceptualization of "instrumental variables" thus lets "natural experiments" and "instruments" do the same kinds of causal-inference work, provided both satisfy the exclusion restriction and other conditions [this as a response to Dani's question about the difference between instruments and natural experiments].

    (3) Both sets of work emphasize that IVs are very difficult to come by and to justify [and other work, say by Imbens and Rosenbaum 2005 and/or Bound, Jaeger, and Baker 1995, emphasizes the problems arising from statistical inference with weak instruments]. Perhaps the idea of using the same IV for multiple purposes has merit, as long as we are aware (as Dunning 2008 explains) that the meaning of the causal effect will ultimately depend on the characteristics of the instrument, and that different valid instruments can produce different valid causal-effect estimates. I've never seen anyone use multiple valid IVs to estimate the same idealized causal effect (not all combined in one linear model, but each IV separately and differently, yet validly, identifying the same T -> y relationship [using z -> T -> y, to follow The Failsafe Method]). I guess they are so hard to come by that it would be very rare to have more than one IV on hand.

    Jake
