Results matching “R”

Don't blame gerrymandering: the update

This 2006 article by Alan Abramowitz, Brad Alexander, and Matthew Gunning finds, consistent with our earlier research, that declining competitiveness in U.S. House elections cannot be explained by gerrymandering:

Competition in U.S. House elections has been declining for more than 50 years and, based on both incumbent reelection rates and the percentage of close races, the 2002 and 2004 House elections were the least competitive of the postwar era. This article tests three hypotheses that attempt to explain declining competition in House elections: the redistricting hypothesis, the partisan polarization hypothesis, and the incumbency hypothesis. We find strong support for both the partisan polarization hypothesis and the incumbency hypothesis but no support for the redistricting hypothesis. Since the 1970s there has been a substantial increase in the number of House districts that are safe for one party and a substantial decrease in the number of marginal districts. However, this shift has not been caused by redistricting but by demographic change and ideological realignment within the electorate. Moreover, even in the remaining marginal districts most challengers lack the financial resources needed to wage competitive campaigns. The increasing correlation among district partisanship, incumbency, and campaign spending means that the effects of these three variables tend to reinforce each other to a greater extent than in the past. The result is a pattern of reinforcing advantages that leads to extraordinarily uncompetitive elections.

So that's the story. Don't blame gerrymandering.

P.S. They didn't cite our 1991 AJPS article! A regrettable oversight, I'm sure. . . .

Don't blame gerrymandering

Matthew Yglesias quotes Richard Cohen presenting a common misconception:

Reality is real. No amount of lofty rhetoric is going to change the way members of Congress are elected. Most of them come from exquisitely gerrymandered districts created by computers that could, if good taste allowed, part the marital bed, separating husband from wife if they were of different political parties. This system created districts that are frequently reliably liberal or conservative. The computer has deleted the middle.

I can't disagree with Cohen's first sentence above, but I part company with him after that.

From Jessica, I saw a review by "Econjeff" of my review of Joshua Angrist and Jorn-Steffen Pischke's new book, "Mostly Harmless Econometrics: An Empiricist's Companion."

Econjeff pretty much agrees with what I wrote, but with one comment:

I [Econjeff] am a bit surprised by Gelman's call for more on hierarchical models; I think economists are right to treat these as a combination of useful pedagogical tool for education research design and an unnecessarily functional-form dependent way to get the standard errors right when the unit of treatment differs from the units available in the data.

I think this is a common perception of multilevel (hierarchical) models among economists. Regular readers of this blog will not be surprised to hear that I disagree completely! The purpose of a multilevel model is not to "get the standard errors right" but rather to model structure in the data.

An analogy that might help here for economists is time series analysis. If you have data with time series structure and you ignore it, you can get over-optimistic standard errors. But that's not the main reason people do time series modeling. The main reason is that the time series structure is interesting and important in its own right. We are interested in individual and contextual effects and unexplained variation at the individual and group levels, just as we are interested in autocorrelation, periodicity, long-range dependence, and so forth.
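
To make the contrast concrete, here's a minimal sketch in R using lme4 (my choice of tool here; nothing in Econjeff's comment or my reply specifies software). The point is that the group-level variance and the partially pooled group effects are quantities of direct interest, not just a correction to the standard errors:

    library(lme4)

    # fake data: students nested within schools, with real between-school variation
    set.seed(123)
    J <- 20                                  # schools
    n <- 30                                  # students per school
    school <- factor(rep(1:J, each = n))
    a <- rnorm(J, 0, 0.5)                    # school-level effects
    x <- rnorm(J * n)
    y <- 1 + 0.3 * x + a[school] + rnorm(J * n, 0, 1)

    fit <- lmer(y ~ x + (1 | school))

    VarCorr(fit)        # between-school vs. within-school variation, interesting in itself
    ranef(fit)$school   # partially pooled estimates of the individual school effects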

See chapters 1 and 11 of ARM for more discussion of motivations for multilevel modeling.

The myth of the myth of bipartisanship

James Morone writes in the New York Times:

[President Obama] seems eager to put aside small political differences and to restore a culture of cooperation in Washington. But it's going to be a long, hard effort because, well, that golden bipartisan era never existed.

The popular myth of getting past politics, in its modern form, dates back to the 1880s, when reformers known as Mugwumps challenged the corrupt bosses, powerful parties and political machines. . . . And while the Mugwumps eventually achieved a lot of their reforms, their larger aspiration -- nonpartisan politics -- always slipped out of reach.

Morone then gives some examples, but I don't think they make his case so well. For example, he writes:

Yet modern Mugwumps keep searching for a nonpartisan golden age to emulate. They point, for example, to the early years of the cold war when foreign policy consensus repudiated isolationism and engaged the world. That elite consensus never reached as far as Congress, where the House Un-American Activities Committee was hunting Joe McCarthy's slippery list of Reds and traitors.

But NATO passed with bipartisan support. Beyond this, the 1950s and early 1960s were a relatively nonpartisan era by many measures.

Later, Morone writes:

Ronald Reagan's fierce attachment to three verities -- markets are good, government is bad, communism is evil -- also meant little reaching out to the other side. His every move reverberated with the cold war philosophy he described so simply: "We win and they lose."

But when Reagan was president, the House of Representatives was controlled by the Democratic party. So his programs had to have some bipartisan support.

Summary

I'm not saying that partisanship doesn't work, or that Obama shouldn't be partisan---or, for that matter, that congressional Republicans should be less partisan themselves. I'm just pointing out that, in some of these historical examples purporting to show partisanship, the actual story isn't so simple.

"In Pain and Joy of Envy, the Brain May Play a Role"

May play a role??? I guess the jury is still out on whether the seat of envy is actually in the liver. . . .

This level of scientific illiteracy disturbs me. I'm not knocking the news article or the scientific study being described there, just the headline, which is in a class by itself.

Carl Bialik writes:

There hasn't been a single 7-3 finish in the NFL since the league adopted the two-point conversion rule in 1994 . . . "Football scores are funny," Driner wrote me [Bialik] in an email. "Did you know that teams win more often when they score 13 points than when they score 14? It's a cause-effect thing. In order to get 13, you (usually) need two field goals. And teams don't kick field goals if they're down by 20 points. So teams lose 35-14 more often than they lose 35-13. That's why scoring 13 is better correlated with winning than scoring 14 is."

And, most amazingly,

An NFL game hasn't finished with a score of 7-0 in over a quarter-century.

More boringly, the most common final score is 20-17.

The mysteries of the spam filter

I just received an email from "info@googlelotto.com" with subject line, "Your email just won £500,000 British Pounds in our anniversary promo." This email went into my inbox; it did not get caught by the spam filter.

What I wanna know is, if "Your email just won £500,000 British Pounds in our anniversary promo" isn't spam, what is???

Chris Masse writes:

The reality check is that the social utility of the prediction markets is marginal. The added accuracy is minute, and, anyway, doesn't fill up the gap between expectations and omniscience (which is how people judge forecasters). In our view, the social utility of the prediction markets lies in efficiency, not in accuracy. In complicated situations, the prediction markets integrate facts and expertise much faster than the mass media do. It is their velocity that we should put to work.

Interesting. This relates to other technology-based ways of aggregating information, such as using cell phone traffic to track epidemics.

From a recent referee report I wrote

Another bad graph

Jeff Jenkins writes:

Here's Lindley. I suspect I'd agree with Lindley on just about any issue of statistical theory and practice. I've read some of Lindley's old articles and contributions to discussions and, even when he seemed like something of an extremist at the time, in retrospect he always seems to be correct. That said, I disagree with him on Taleb. I think the difference is that Lindley was evaluating The Black Swan based on its statistical content, whereas I liked the book because it was full of ideas and stories that sparked thoughts in my mind (and, I think, in the minds of many readers).

Also, I disagree with Lindley 100% about Karl Popper. Even though, again, I think Lindley and I are extremely close on issues of statistical practice and theory.

And here's Robert. I like his connection of "black swans" to "model shift." This fits in well with my three stages of Bayesian Data Analysis (model building, model fitting, model checking), with model checking being the all-important but often neglected ugly sister. (As I've discussed many times, you rarely see graphical model checks in a published paper, because either (a) the model didn't fit, in which case, at worst you'd be too embarrassed to admit it, or at best you'd fix the model and there'd be nothing to report, or (b) the model fits ok, in which case the model check is probably only worth a sentence or two.)

From a philosophical point of view, I think the most important point of confusion about Bayesian inference is the idea that it's about computing the probability that a model is true. In all the areas I've ever worked on, the model is never true. But what you can do is find out that certain important aspects of the data are highly unlikely to be captured by the fitted model, which can facilitate a "model shift" moment. This sort of falsification is why I believe Popper's philosophy of science to be a good fit to Bayesian data analysis.

Also, I agree with Christian's characterization of Black-Scholes etc. as not "an accurate representation of reality, but rather a gentleman's agreement between traders that served to agree on prices." The way I put it was that these graduate programs in "financial mathematics / financial engineering" served a useful function by screening for students who were mathematically able and willing to work hard. It's too bad they couldn't have been learning statistics instead, but, for better or worse, competence in statistics is easier to fake than competence in math.

Christian also has an interesting conclusion:

Encouraging a total mistrust of anything scientific or academic is not helping in solving issues, but most surely pushes people in the arms of charlatans with ready answers.

I wonder what Taleb would say about this. Possibly he'd reply that it's better to have citizens think critically than be awed by their financial advisors.

Boxplot challenge

In response to the comments here, I say:

I have never ever seen an example where I've felt a boxplot was appropriate. I'm open to being convinced, but I don't think you'll be able to convince me. Bring on the examples!

Retrofitting Suburbia

The earliest postwar suburbs are sixty years old. Ideas for what to do with them, from Ellen Dunham-Jones and June Williamson.


Correlations and absolute differences

"Perpetually Statistically Curious" writes:

Say you have two variables, Y1 and Y2, whose correlation depends on the value of a third dichotomous variable, X. Now say you take the absolute value of the difference between Y1 and Y2, and regress that absolute difference on the dichotomous (indicator) variable, X. My sense is that the expected value of the coefficient for the variable X in the regression would be related in a deterministic way with the gap between the correlations between Y1 and Y2 at the different values of X. But how?

This comes up in research on identical and fraternal twins, where the chief research interest is in the degree of similarity on some trait between identical twins relative to similarity on some trait between fraternal twins.
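
I won't work out the algebra here, but under the simplest version of the setup--bivariate normal with unit variances, which is my simplifying assumption rather than the questioner's--the expected absolute difference is E|Y1 - Y2| = 2*sqrt((1 - rho)/pi), so the coefficient on X is a deterministic function of the two correlations. A minimal simulation sketch in R (the correlation values are made up):

    library(MASS)   # for mvrnorm

    expected_absdiff <- function(rho, n = 1e5) {
      Y <- mvrnorm(n, mu = c(0, 0), Sigma = matrix(c(1, rho, rho, 1), 2, 2))
      mean(abs(Y[, 1] - Y[, 2]))
    }

    rho0 <- 0.25    # correlation of Y1, Y2 when X = 0 (say, fraternal twins)
    rho1 <- 0.60    # correlation of Y1, Y2 when X = 1 (say, identical twins)

    set.seed(1)
    simulated   <- expected_absdiff(rho1) - expected_absdiff(rho0)
    theoretical <- 2 * sqrt((1 - rho1) / pi) - 2 * sqrt((1 - rho0) / pi)
    c(simulated = simulated, theoretical = theoretical)   # the coefficient on X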

In the spirit of Bullwinkle, I think that all blog entries should be required to have two titles. . . .

Anyway, Seth linked to this amusing note by Preston McAfee.

P.S. In a comment to my earlier entry, somebody linked to McAfee's free introductory economics textbook. I started reading it, and it seems great so far. Maybe if I'd read a book like this thirty years ago I would've become an economist. Or maybe not, I dunno. It's not like my statistics textbooks were so delightful; I just liked the subject. And I've never read a poli sci textbook in my life.

John and I gave our presentation on statistical graphics today, and then coincidentally I found this monograph by Rafe Donahue (link from Helen DeWitt). I started skimming and it looks pretty good so far. He uses horizontal jittering instead of the horrible boxplot, and that makes me happy already. On the other hand--since I'm being superficial here--I'm not a fan of the marginal-notes style of referencing. I always feel that this style draws undue attention to what are ultimately the least important parts of the book.
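
For what it's worth, here's roughly what I mean by horizontal jittering, as a minimal base-R sketch (using stripchart and made-up data; I don't know what Donahue himself uses):

    # fake data: a measurement in four groups
    set.seed(1)
    y <- c(rnorm(40, 1), rnorm(40, 1.5), rexp(40), rnorm(40, 2, 2))
    group <- rep(c("A", "B", "C", "D"), each = 40)

    # instead of boxplot(y ~ group), show every point, jittered within its row
    stripchart(y ~ group, method = "jitter", jitter = 0.15,
               pch = 20, col = "gray40", xlab = "y")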

More seriously, Donahue's monograph looks interesting, and I'll have to read it more carefully. I've been looking for something on graphics that goes beyond the nuts and bolts of how to make a particular graph and considers what should actually be plotted and why.

On a theoretical level, I wonder how his ideas connect to my ideas of exploratory data analysis and statistical modeling (see here and here). I think the connections are there (as in Donahue's principles #28, 43, 52, and 86: "The data display is the model").

Actually, many of his principles are things that I tell people also. Just today I discussed how you have to tell the viewer what the plot is (Donahue's principle #23).

P.S. A minor point: Donahue's principle #53 is, "Plot cause versus effect." Doesn't he mean, "Plot effect versus cause"? Usually we say y vs. x, not x vs. y. Or else I'm missing something here.

More on those $150 textbooks

Just a few thoughts in response to all the comments:

1. Several people point out that it is the publisher, not the author, who decides the cost of a book. That's right. The author has some input, and for almost all of my books, I've talked with the publisher, before signing any contract, about keeping the price down. We insisted that Bayesian Data Analysis sell for $45, Teaching Statistics sold for $40, and ARM sold for $40 as well. I thought that A Quantitative Tour of the Social Sciences was going to sell for around $25 but now I see on Amazon that it's selling for $33; I don't really know what's up with that. And, as a trade book, Red State, Blue State was always going to be reasonably priced--people aren't generally prepared to pay $40 for a book that they don't feel they need for their work--and, as for the Applied Bayesian Modeling book co-edited with Meng, we never tried to keep the price down, and as a result the publisher charged $100 for it.

2. Ragna is irritated that my Teaching Statistics book is selling for $190 at the local bookstore. This is simply a mistake--they seem to have ordered the hardcover rather than the softcover. Teaching Statistics is a great book but I wouldn't pay $190 for it. Annoyingly enough, if you look up the book on Amazon it sends you to the hardcover. But if you look carefully you can find the softcover for $63 ("list price $70"). I don't know how this happened. It was $40 when it came out.

3. Bayesian Data Analysis now costs $60 on Amazon. But, to be fair, it has been well over a decade since the original $45 version came out. I'd like it to still be $45 but I don't have much influence over this. It's a matter of negotiation.

4. I understand that if the book sells for more, the author probably makes more money. Certainly for technical books. I'd guess that if all my books were doubled in price, they'd sell more than half as well as they sell now (and, conversely, if they were halved in price, I doubt they'd sell at anything like twice their current rate). But my books don't make a lot of money for me (and, as for my book with Deb, we donate all the royalties to charity). What the books do do is make money for the publishers. That's fine, but making money for publishers is not one of my major goals in life.

5. I'll have to look into this open source thing. I'm a traditionalist myself and like hard-copy books. I've seen how students work on the computer: they seem to have the ability to only look at one window at a time, and so I think they need the hard copy of the textbook.

6. Some people were surprised that I didn't already know that these books were expensive. Yes, I know that technical books are expensive (hence my struggle to keep my own books under $40), but . . . an intro stat book? These things don't have a lot of content. $150 seems like a lot. If you pay $70 for Jun Liu's book on statistical computing . . . well, you get Jun Liu's book--that's a pretty good deal! But paying twice as much for something generic--that just seems horrible.

7. In answer to the questions of what my book will be like: I'm not sure! Seeing the $150 books makes me want to quickly write a generic book for $10 or free or whatever, just to do my part to destroy the market for the $150 books, but, no, I'm gonna do something new. I'm still struggling to figure out how it should be structured.

8. In answer to Bob's question: It's my impression that the Ivy League colleges get zillions of applicants, so they have no motivation to break the coalition and charge less for tuition. But, for an intro textbook, it would only take one author to change things, right?

9. I believe that many intro stat books, including Dick DeVeaux's and many others, have strengths. I have my own ideas for how to teach intro statistics, but I'm certainly not trying to claim that the current books are pure crap. And if the choice were DeVeaux's book for $100 or a generic book for $40, I'd probably assign DeVeaux's for $100. I think it would be worth the students' $60 to learn from a better book. But what amazes me is that even completely generic books are selling for well over $100.

10. Yes, I agree with everyone on the basic economic argument that it's the profs who assign the texts but the students (or their parents) who pay. Nonetheless . . . how did they even get the chutzpah to charge $150 in the first place?

11. Sometimes I sort of wish that Jennifer and I had self-published ARM or gone with some zero-margin publisher such as Dover, who do publish new books, by the way, including some great kids' activity booklets for something like $1.50 each. Anyway, if the goal is a $40 book, I think I can go with a regular publisher; after all, Cambridge is selling ARM for $40 and will sell Regression and Other Stories for a bit less, I believe.

12. From some of the resources provided by the commenters, it seems as if free textbooks are out there, and so maybe the current $150 texts are just the last bit of profit-taking before the collapse. I'd love to see time series plots of intro textbook prices in various fields.

13. Regarding the issue of homework questions and test banks: This is a real concern, I agree, or at least it should be a concern. In the courses I've seen, instructors don't actually use these test banks, but maybe that just means we're not getting our money's worth.

14. I noticed a remark on cost per page. As an author of a couple of reasonably priced 600+ page books, I'm sympathetic to this argument . . . but, no, I don't think there's 600 pages worth of material in these intro stat books. My impression is that, at some point, a book being heavy makes it that much harder to use.

I'm not gonna miss this!

The CUIPS Professional Development Seminar: "How (Not) to Present Quantitative Results," Thursday, February 12, 12:30-2:00pm, 707 IAB.

The mystery of the $150 textbooks

I received a free copy in the mail of an introductory statistics textbook; I guess the publisher wants me to adopt it for my courses. The book isn't bad, actually it's pretty good: it follows the "Moore and McCabe" format, starting with descriptive statistics (up to correlation and regression), then a bit on data collection, then probability, then statistical inference, and at the end chapters on various more advanced topics.

I showed the book to Yu-Sung and he said: Wow, it's pretty fancy. I bet it costs $150. I didn't believe him, but we checked on Amazon and lo! it really does retail for that much. What the . . . ? I asked around and, indeed, it's commonplace for students to pay well over $100 for introductory textbooks.

Well. I'm planning to write an introductory textbook of my own and I'd like to charge $10 for it. Maybe this isn't possible, but I think $40 should be doable. And why would anybody require their students to pay $150 for a statistics book when something better is available at less than 1/3 the price?

This won't be easy, because I'm planning to write an entirely new kind of intro book, starting from scratch. But why hasn't someone written a more conventional book at a cut-rate price? Or maybe they have, and I just haven't heard about it?

It just mystifies me that, in all these different fields, it's considered acceptable to charge $150 for a textbook. I'd think that all you need is one cartel-breaker in each field and all the prices would come tumbling down. But apparently not. I just don't understand.

P.S. More thoughts here.

AT writes:

I've got a count-based data set with a lot of zeroes present. I'm using zero-inflated modeling to capture the shape, and I want to test goodness-of-fit from both ends -- under- and overfitting. I've read your 1996 paper with XL and Hal Stern which recommends a "discrepancy measure" as being a good quantity to calculate with posterior predictive data. The main suggestion there was to use a chi-square statistic, but I'm sure this would be inappropriate in this case given that the zero cases would drive the entire statistic (and breaking the minimum-cell-size rule for the chi-square about 500 times in the process.) I suppose we could correct for this by doing the square-root trick to stabilize variance, but that still doesn't seem like it would resolve the problem with the zeroes. Any thoughts as to how to find a good discrepancy measure to check?

My generic response is that we always want the test summaries to relate to the substantive questions of interest. In this case, I don't have the context but I can make some quick suggestions, such as to create two test summaries: (a) the percentage of zeroes, and (b) some summary of the fit of the counts when they are not zero.

The so-called minimum cell size rule is irrelevant, since you can compute the reference distribution directly using simulation. And issues such as stabilizing variance are not particularly relevant either, except inasmuch as they allow your test to more accurately capture the aspects of the data that are important for you to fit with your model.
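
To be concrete, a check along those lines might look like this minimal R sketch, where the observed data and the replicated datasets are placeholders I'm simulating directly rather than draws from AT's actual fitted model:

    # placeholder observed data and replicated datasets (stand-ins for the real
    # data and for simulations from the fitted zero-inflated model)
    set.seed(1)
    n <- 500
    y <- rbinom(n, 1, 0.6) * rpois(n, 3)
    n_sims <- 1000
    y_rep <- t(replicate(n_sims, rbinom(n, 1, 0.6) * rpois(n, 3)))

    # test summary (a): proportion of zeroes
    T_zero_obs <- mean(y == 0)
    T_zero_rep <- apply(y_rep, 1, function(yr) mean(yr == 0))
    mean(T_zero_rep >= T_zero_obs)     # posterior predictive p-value

    # test summary (b): a summary of the nonzero counts, e.g. their mean
    T_pos_obs <- mean(y[y > 0])
    T_pos_rep <- apply(y_rep, 1, function(yr) mean(yr[yr > 0]))
    mean(T_pos_rep >= T_pos_obs)

    # the reference distribution comes directly from the simulations
    hist(T_zero_rep, main = "Replicated proportion of zeroes")
    abline(v = T_zero_obs, lwd = 2)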

We were discussing the Angrist and Pischke book with Paul Rosenbaum and I mentioned my struggle with instrumental variables: where do they come from, and doesn't it seem awkward when you see someone studying a causal question and looking around for an instrument?

And Paul said: No, it goes the other way. What Angrist and his colleagues do is to find the instrument first, and then they go from there. They might see something in the newspaper or hear something on the radio and think: Hey--there's a natural experiment--it could make a good instrument! And then they go from there.

This sounded funny at first, but actually I prefer it to the usual presentation of instrumental variables. The "find the IV first" approach is cleaner: in this story, all causation flows from the IV, which has various consequences. So if you have a few key researchers such as Angrist keeping their ears open, hearing of IV's, then you'll learn some things. This approach also fits in with my fail-safe method of understanding IV's when I get stuck with the usual interpretation.

Sometimes the "lead with the natural experiment" approach can lead to missteps, as illustrated by Angrist and Pischke's overinterpretation of David Lee's work on incumbency in elections. (See here for my summary of Lee's research along with a discussion of why he's estimating the "incumbent party advantage" rather than the advantage of individual incumbency.) But generally it seems like the way to go, much better than the standard approach of starting with a causal goal of interest and then looking around for an IV.

In this spirit, let me again mention my own pet idea for a natural experiment:

The Flynn effect, and the related occasional re-norming of IQ scores, causes jumps in the number of people classified as mentally retarded (conventionally, an IQ of 70, which is two standard deviations below the mean if the mean is scaled at 100). When they rescale the tests, the proportion of people labeled "retarded" jumps up. Seems like a natural experiment that might be a good opportunity to study effects of classifying people in this way on the margin. If the renorming is done differently in different states or countries, this would provide more opportunity for identifying treatment effects.

I think it would be so cool if someone could take this idea and run with it.

Stigler's Law in action

I received the following email:

**, Co-Chairman and Co-Chief Executive Officer of **, wants to use the following quote in his upcoming presentation:

"The law of unintended consequences is what happens when a simple system tries to regulate a complex system. The political system is simple, it operates with limited information (rational ignorance), short time horizons, low feedback, and poor and misaligned incentives. Society in contrast is a complex, evolving, high-feedback, incentive-driven system. When a simple system tries to regulate a complex system you often get unintended consequences."

** would like to attribute the quote to the rightful author but is having difficulties in locating its origin. Can you please clarify if this is something that you said, and if so, where and when you said it? If you did not say this, then can you please tell us who was its original author? Thank you in advance for your help.

I replied:

What I wrote about the law of unintended consequences is here and here. The paragraph you give below is from Alex Tabarrok and is given in the first link above.

It's a little scary that the most famous thing I ever wrote was actually written by someone else!

Our new R package: R2jags

I occasionally get emails from JAGS users asking about our new R package, R2jags. Basically, R2jags runs JAGS via R and makes post-analysis easier to do in R. Taking advantage of the functions provided by JAGS, rjags, and R2WinBUGS, R2jags allows users to run BUGS models in the same way as they would in R2WinBUGS. In addition, R2jags has some powerful features that facilitate the model fitting:

  1. If your model does not converge, update it. If you have used R2WinBUGS, you know the frustration of a model that has not converged; the best thing you can do is to use the current parameter values as the starting values for the next MCMC run. In R2jags, you just type in the R console:

    fit <- update(fit) 
    

    This will update the model.
  2. What if you have to shut down your machine but the model has still not converged? In R2jags, you can go ahead and save the R workspace and shut down your machine. When you are ready to run the model again, load the workspace in R and type:

    recompile(fit)
    fit.upd <- update(fit)
    

    This will recompile the model, which means you can update the model again!!

If you want to explore more features of R2jags, just type ?jags in the R console. The example code there demonstrates all the functions in R2jags.
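
For anyone who hasn't seen the package, a minimal end-to-end call looks something like this sketch (the toy model, data, and file name are placeholders of mine, not from the package's documentation):

    library(R2jags)

    # fake data for a toy normal-mean model, just to show the call
    set.seed(1)
    y <- rnorm(50, mean = 3, sd = 2)
    N <- length(y)

    # write the JAGS/BUGS model to a file
    model_string <- "
    model {
      for (i in 1:N) {
        y[i] ~ dnorm(mu, tau)
      }
      mu  ~ dnorm(0, 0.0001)
      tau ~ dgamma(0.001, 0.001)
      sigma <- 1 / sqrt(tau)
    }"
    writeLines(model_string, "toy_model.txt")

    fit <- jags(data = list(y = y, N = N), inits = NULL,
                parameters.to.save = c("mu", "sigma"),
                model.file = "toy_model.txt",
                n.chains = 3, n.iter = 2000)
    print(fit)

    # and, as described above, if it hasn't converged:
    fit <- update(fit)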

Installing JAGS and rjags can be tricky on Mac OS or Linux. I have a blog entry here that shows how this can be done; if you are not a Windows user, that post might help you.


I recently reviewed a report that used posterior predictive checks (that is, taking the fitted model and using it to simulate replicated data, which are then compared to the observed dataset). One of the other reviewers wrote (in response to the report, not to me):

The model goodness-of-fit statistics that the authors present on this page are biased, and should be interpreted with at least some caution. They give an over-optimistic evaluation of the fit of the hierarchical Bayes model. This is because the data are used twice: once to fit the model, and once again to assess the fit of the model. In fact, the posterior p-values are not asymptotically uniform, as they should be.

I completely disagree! I've discussed this point before. But the attitude expressed in the above quote is held strongly enough, and commonly enough, that I'm willing to spend some time trying to clear things up.

Let's unpack things.

Partisanship: good or bad?

Nancy Rosenblum posted an article based on her recent book, "On the Side of the Angels: An Appreciation of Parties and Partisanship," which she describes as her "analysis of antipartyism and attempt at rehabilitation." Following up at Cato Unbound is Brink Lindsey, who writes that "under present circumstances at least, partisan zeal ought to be attacked rather than defended."

I'll summarize what Rosenblum and Lindsey have to say and then give my reaction (much of which is based on data from our Red State, Blue State book).

I'm always yammering on about the difference between significant and non-significant, etc. But the other day I heard a talk where somebody made an even more basic error: He showed a pattern that was not statistically significantly different from zero and he said it was zero. I raised my hand and said something like: It's not _really_ zero, right? The data you show are consistent with zero but they're consistent with all sorts of other patterns too. He replied, no, it really is zero: look at the confidence interval.

Grrrrrrr.

Cartoon

I don't see the humor here, but two different people emailed this to me so I think there's some sort of legal requirement that I blog it. . . .

I had a discussion with Christian Robert about the mystical feelings that seem to be sometimes inspired by Bayesian statistics. Christian began by describing this article that was on the web about constructing Bayes' theorem for simple binomial outcomes with two possible causes as "indeed funny and entertaining (at least at the beginning) but, as a mathematician, I [Christian] do not see how these many pages build more intuition than looking at the mere definition of a conditional probability and at the inversion that is the essence of Bayes' theorem. The author agrees to some level about this . . . there is however a whole crowd on the blogs that seems to see more in Bayes's theorem than a mere probability inversion . . . a focus that actually confuses--to some extent--the theorem [two-line proof, no problem, Bayes' theorem being indeed tautological] with the construction of prior probabilities or densities [a forever-debatable issue]."

I replied that there are several different points of fascination about Bayes:

Yes, I understand that it's frustrating to not be able to drive your expensive SUV at the maximum possible speed attainable by that magnificent machine . . . but, really, how fast do you really expect to be traveling on NYC streets in a snowstorm during rush hour???

Eric Loken writes:

Last week the New York Times published an article on a possible Obama effect on test scores of black test takers. . . . The authors claim that they gave a short academic aptitude type test to black and white test-takers. When they administered the test last summer, they noted a difference between average scores for blacks and whites. However, after (now) President Obama had received his party's nomination and given his acceptance speech, the difference in scores disappeared. The theory is that Obama's rise has had a positive motivating influence on test taking performance.

Eric then gives some background:

Now that we're on the topic of econometrics . . . somebody recommended to me a book by Deirdre McCloskey. I can't remember who gave me this recommendation, but the name did ring a bell, and then I remembered I wrote some other things about her work a couple years ago. See here.

And, because not everyone likes to click through, here it all is again:

Mostly Harmless Econometrics

I just read the new book, "Mostly Harmless Econometrics: An Empiricist's Companion," by Joshua Angrist and Jorn-Steffen Pischke. It's an excellent book and, I think, well worth your $35. I recommend that all of you buy it.

I also have a few comments.

The economy and attitudes toward risk

On the often-interesting judgment and decision making listserv, George Christopoulos wrote:

It seems that in situations similar to the present economic situation economic agents are less willing to take risks and instead they prefer safer options.

Could somebody point to studies that show this negative relationship between depression/recession (or generally when wealth resources are low) and increased (relative?) risk aversion?

There were a couple of responses on the list, but they seemed to me to miss the point slightly. The respondents referred to econ literature on stock market trading and on wealth and economic decision making, but my impression was that Christopoulos was looking for something more psychological: something like a meta-analysis of studies of uncertainty aversion (I prefer to avoid the term "risk aversion" or even "loss aversion," for reasons I've discussed at length on this blog) over time, to see if subjects in an identical experiment show more uncertainty aversion in bad times than good.

The next step would be to analyze such data to separate out, to the extent possible, effects of individual economic status and national trends. The hypothesis might be that both have effects: that people suffering personal reversals might show more uncertainty aversion, and that, on top of this, everyone might tend to show more uncertainty aversion during economic downturns.

Could be an interesting study, although I doubt that such data are available.

Hey, this looks cool!

Visualization and Control in Insect Flight

Atilla Bergou, Physics Department, Cornell University

Insects have a 100 million year head-start on us in learning how to fly. Thus, we have a lot to learn from them. Currently, one of the greatest challenges in this study is the accurate measurement, characterization and visualization of the motions of these animals. Recent advances in high-speed videography have allowed us to begin exploiting techniques from computer vision which hold immense promise to resolve these problems. In this talk, I will show our efforts in incorporating ideas from computer vision and physics to study the complex motion of an insect's wing. This motion is due not only to muscular activation but also to fluid, inertial, and elastic forces. Thus, it may be that not all aspects of the wing motion are actively controlled by the insect. We ask whether changes in the wing orientation of flying fruit flies are actuated by insect muscles, or if their wings turn over passively like a falling leaf. By applying a three-dimensional reconstruction technique to high-speed films of freely flying fruit flies, we are able to capture their intricate motion at a level of detail that has previously been impossible. We extract the detailed wing kinematics of flies using a novel motion tracking algorithm, compute the forces acting on the wings and infer whether flapping flight is possible without pitching control.

The talk is 3pm Wed 4 Feb CESPR 414 Sindeband East.

Radical transparency

Seems like a good idea to me. This story reminds me of when my course listing mysteriously got removed from the department's website. It took something like two years to get it back up.

Next Friday's Teaching Statistics class

The guest teacher will be Prof. Shigeo Hirano of the Dept. of Political Science. If he has any extra readings for you, I'll let you know!

A McKinsey interview (sorry, it's behind a registration wall, but the registration is worth it if you're interested in business topics or "futurism") with Google's chief economist Hal Varian has an interesting quote:

I keep saying the sexy job in the next ten years will be statisticians. People think I'm joking, but who would've guessed that computer engineers would've been the sexy job of the 1990s? The ability to take data--to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it--that's going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it.
I think statisticians are part of it, but it's just a part. You also want to be able to visualize the data, communicate the data, and utilize it effectively. But I do think those skills--of being able to access, understand, and communicate the insights you get from data analysis--are going to be extremely important. Managers need to be able to access and understand the data themselves.

I'm sure everyone reading this blog will feel warmer and fuzzier now. :) But the teaching of introductory statistics really has to convey how to:


  • capture data relevant to the problems

  • visualize, communicate and effectively utilize the data

  • access, understand, and communicate the insights of data analysis


It would definitely turn fewer students off than the usual package full of integrals, density functions, t-tests and p-values.

Update 5.28.09: Michael E. Driscoll has a better-written description of the above points, along with the guidelines.

Sidney Redner sends along this article by D. Volovik, M. Mobilia, and himself (apparently physicists think it's tacky to include first names in their articles), which begins:

Letting the side down

The phone just rang. I picked it up and heard: "Could I speak with the youngest female in the household who is eligible to vote?" My reply: "Sorry, we're busy."

I'm either a traitor to my profession by not participating in a poll, or a contributor by increasing the problem of missing data.

Ed Vul, Christine Harris, Piotr Winkielman, and Harold Pashler wrote an article where:

1. They point out that correlations reported in fMRI medical imaging studies are commonly overstated because researchers tend to report only the highest correlations, or only those correlations that exceed some threshold.

2. They suggest that these statistical problems are leading researchers, and the general public, to overstate the connections between social behaviors and specific brain patterns.

After posting on this article, I received a bunch of comments and questions as well as some responses:

This article by Jabbi, Keysers, Singer, and Stephan argues that, because brain imaging researchers adjust their p-values and significance thresholds for multiple comparisons (the thousands of voxels in a brain image), their statistical methods don't have the problems that Vul et al. claimed.

This reply by Vul to the Jabbi et al. article. Here Vul argues that adjustment of significance levels does not stop the selected correlations themselves from being too high. I found Vul's argument here to be convincing. Multiple comparisons methods control the rate of false alarms in a setting where true effects are zero--but I don't see that as relevant to the imaging setting, where differences are not in fact zero. Lots of things affect blood flow in the brain, and we would never expect the average scans of two different groups of people to be the same.
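
A toy simulation (my own, not taken from any of these papers) makes the point: even if every voxel has the same modest true correlation with behavior, the correlations that survive a stringent voxel-wise threshold are by construction the largest ones, so they overestimate the truth.

    set.seed(1)
    n_subjects <- 20
    n_voxels   <- 5000
    rho_true   <- 0.3     # the same modest true correlation at every voxel

    behavior <- rnorm(n_subjects)
    # each voxel's signal = scaled behavior + noise, so cor(behavior, voxel) = 0.3
    voxels <- sapply(1:n_voxels, function(v)
      rho_true * behavior + sqrt(1 - rho_true^2) * rnorm(n_subjects))

    r <- cor(behavior, voxels)[1, ]
    tstat <- r * sqrt((n_subjects - 2) / (1 - r^2))
    p <- 2 * pt(-abs(tstat), df = n_subjects - 2)

    # keep only voxels passing a stringent threshold; a stricter correction
    # only raises the bar, so the surviving correlations get even larger
    selected <- r[p < 0.001]
    c(true = rho_true, n_selected = length(selected),
      mean_selected = mean(selected), min_selected = min(selected))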

This article by Lieberman, Berkman, and Wager, who defend social neuroscience and argue the following:

1. They accept Vul et al.'s point 1 above (correlations are overstated) but present some evidence that the correlations aren't as overstated as Vul et al. might fear.

2. They disagree with the implied claim that the overstated correlations have distorted scientists' understanding of social neuroscience research.

3. They object to Vul et al.'s focusing on social neuroscience, given that the same statistical issues arise in all sorts of brain imaging studies.

4. They point out some specific areas where Vul et al. mischaracterized the data-analytic methods used in this field.

I think Lieberman et al. make some good points, but, as Vul et al. point out, researchers often do use correlations to summarize their results. And, even if said correlations survived a multiple-comparisons analysis, readers might interpret these at face value without understanding the selection issue. So all this shake-out is probably a good thing, especially where correlation estimates are being compared to each other.

My thoughts

First off, I haven't worked seriously in medical imaging for nearly 20 years and have only one published paper in the area, so my comments are mostly informed by my perspective on general statistical issues, as well as my own experience thinking about estimation of effect sizes in studies with low statistical power.

Regarding the singling-out of social neuroscience, I see the point of Lieberman et al. I was thinking that maybe one reason for this is that in social neuroscience it's perhaps more difficult to get external validation in the way that might be more possible in other areas of neuroscience where there is some measurement in the blood or whatever that can be taken. I'm not sure about this, just a conjecture.

It's hard for me to believe that the approach based on separate analyses of voxels and p-values is really the best way to go. The null hypothesis of zero correlations isn't so interesting. What's really of interest is the pattern of where the differences are in the brain.

Related to this point: when trying to understand differences in brain processing between different sorts of people (or between people doing different tasks), the maximum correlation among voxels is ultimately not what you're looking for. That is why researchers summarize using regions of interest (as in p.7 of the Lieberman et al. article). Vul et al. were correct to warn about overinterpretation of correlations that have been selected as the maximum: the naive reader can see such correlations (and accompanying scatterplots) and think that certain personality traits are more predictable from brain scans than they actually are.

I think the way forward will be to go beyond correlations and the horrible multiple-comparisons framework, which causes so much confusion. Vul et al. and Lieberman et al. both point out that classical multiple comparisons adjustments do not eliminate the systematic overstatement of correlations. A hierarchical Bayes approach (using some sort of mixture for the population of voxel differences, ideally modeled hierarchically with voxels grouped within regions of interest) would help here.

And now for some amateur psychologizing (unsupported by any statistical analysis, correlational or other)

I suspect that one of the motivations of Vul et al. in writing their article was frustration at too-good-to-be-true numbers which they felt led to exaggerated claims of neuro super-science.

Conversely, I suspect one of the frustrations of Lieberman et al. is that they are doing a lot more than correlations and fishing expeditions--they're running experiments to test theories in psychology and trying to synthesize results from many different labs. And from that perspective it must be frustrating for them to see a criticism (featured in the popular press) that is so focused on correlation, which is really the least of their concerns.

It also seems that both sides were irritated by what they saw as giddy press coverage: on one side, claims of dramatic breakthroughs in understanding the biological basis of behavior and personality; on the other, claims of a dramatic Emperor-has-no-clothes debunking. As scientists, most of us welcome press coverage--after all, we think this work is important and we'd like others to know about it--but . . . fawning press coverage of something that we think is wrong--that's just annoying.

P.S. Wager is a friend--he teaches in the psychology department here--but I don't think my personal knowledge has hindered my evaluation here.

P.P.S. I ran the above by various people involved and they gave some helpful clarifications. But I've probably left in a couple of sloppy statements here and there.

Is it Art?

John Lanchester asks this question about video games. I have a few observations to add:

1. When I was a teenager, my friends and I spent tons of time at the arcade, just throwing quarters into videogames. (I always preferred pinball, but videogames were more available.) I haven't played a videogame in decades and have zero interest in doing so. (This is not something I'm necessarily proud of, just a statement of fact.) And now I can't figure out what about videogames was so appealing to us, back then.

2. Different people read different sorts of books, often with little overlap. Stephen King, John Grisham, Danielle Steele, etc., are the super-sellers, but lots of people would read one of these and not the others--and lots of readers don't read these blockbusters at all. In contrast, everybody who's into movies is aware of the latest major releases, and it's my impression that people who would rarely read a bestseller of the Stephen King sort would still watch a big-budget movie.

To put it another way: My impression is that the default for reading is to pick something in a niche that you're interested in, but the default for movie watching is to start with the blockbusters and then go from there.

3. But with old movies, I think it's different. Back in the old days, everybody might watch whatever old movie was being shown on TV that night, but what with videos people will now make their choices. And if a movie is old, the whole blockbuster thing seems irrelevant.

I don't know exactly how video games fit into all this; I just wanted to point out that the same audiences seem to expect different things from different media.

I got an email from the people at Revolution computing reporting:

REvolution Computing has now made a public version of its commercial grade REvolution R program available for download from its website. REvolution R is REvolution Computing's distribution of the popular R statistical software, optimized for use in commercial environments. A key feature includes the use of powerful optimized libraries capable of boosting performance by a factor of 5 or 10 for commonly used operations.

Here it is.

There's also a payware version, which "features advanced functionality, including ParallelR, which speeds deployment across both multiprocessor workstations and clusters to enable the same codes to be used for prototyping and production. REvolution R Enterprise is functional with 64-bit platforms and Linux enterprise platforms and provides for telephone support and response guarantees."

I don't know anything about this but at least in theory it sounds like a good idea! If anyone has any comments, feel free to share them.

Seth asks the above question. I have a couple of thoughts:

1. It's hard to write a book that's easy to read. As Seth points out, it's a Venn diagram situation: you're looking for the overlap between three groups of people: those who know the material, those who can write well, and those who are willing to put in the effort to write a book.

2. Some subjects are so urgent that they're worth writing--and reading--about even if hard to read. For example, if you're a terrorist and want to build an A-bomb (or, to use another example, if you're a social scientist and you want to use quantitative methods), sure, you'd prefer clear instructions, but something that's hard to read is still much better than nothing.

3. What's hard for you to read might be easy for somebody else. An extreme example is foreign language or the passage of time. But, beyond that, there's familiarity. For example, I find the news section of the newspaper much easier than the comics to read. The comics are much simpler, but I'm not familiar with them, and to read them I have to enter all these different stories. In contrast, news stories follow a predictable pattern: drought in the Great Plains, people blowing each other up in the Middle East, and so forth.

4. Standards differ. Bayesian Data Analysis is considered by many to be well-written but it's much less easy to read than Data Analysis Using Regression and Multilevel/Hierarchical Models, which in turn is, I'm sure, much less easy to read than the collected works of Len Mlodinow. But, then again, Mlodinow is harder to read than Stephen King. If Stephen King wants to write statistics books, that's fine, but I'm afraid that would wreck things for the rest of us. It would set the bar too high.

5. And, of course, some people write books to be hard to read on purpose. James Joyce and so forth. If you, like me, aren't a big James Joyce fan, just pick your favorite writer who you find challenging to read (in a good way).

One more John Updike story

John Updike just passed away, and coincidentally I noticed a story by him in the most recent New Yorker. Well, actually the story was by someone called "Antonya Nelson" but it was clearly a John Updike story. Not angry enough to be a John Cheever story, not clipped enough to be a Raymond Carver story, not smooth enough to be a Richard Ford story. Based on this evidence, I expect we'll continue to see new Updike stories for a while.

P.S. I'm a big fan of Updike. Rabbit, Run is my favorite.

Taking a standardized test multiple times

When we have a grad school applicant who's taken the GRE or TOEFL multiple times, we typically just look at the highest score. It's my impression that pretty much everybody does this, even though basic statistical principles would suggest taking the average. Eric Rasmusen reminded me of this point in the context of the SAT, which apparently has changed its policy to encourage multiple test-taking even more, by allowing students to report only their highest score. Throwing away information--that doesn't sound like a good idea! But, as Rasmusen points out, it might make more money for the organizations that administer the test.
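
A quick simulation (with made-up numbers, nothing to do with how the tests are actually scored) shows the issue: the best of several noisy attempts is a biased-up estimate of ability, whereas the average is simply a less noisy one.

    set.seed(1)
    n_students <- 1e4
    n_attempts <- 3
    ability <- rnorm(n_students, 600, 80)              # each student's "true" score
    scores  <- matrix(ability + rnorm(n_students * n_attempts, 0, 40),
                      n_students, n_attempts)          # each attempt = ability + noise

    best <- apply(scores, 1, max)
    avg  <- rowMeans(scores)

    c(bias_of_best = mean(best - ability),   # systematically too high
      bias_of_avg  = mean(avg - ability),    # roughly zero
      rmse_best    = sqrt(mean((best - ability)^2)),
      rmse_avg     = sqrt(mean((avg - ability)^2)))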

According to the linked news article, students "will have the option of choosing which scores to send to colleges while hiding those they do not want admissions officials to see." My question is: will their score report state whether they've chosen this option? If so, it should be possible to at least try to correct for the bias.

In any case, all this discussion makes me think we should be more careful about just looking at the maximum when our applicants take the GRE multiple times. And then there's the possibility of cheating. . . . I guess the real lesson is that these admissions decisions aren't going to be perfect, and we should think more about how to incorporate this perspective into our admissions process.

Data sources

Pippa Norris writes:

I realize that it is clunky but if you could always, always cite the survey source and date below each figure, this would make the book much easier to read and interpret. If I know the source, then it is easier to judge the meaning of the presentation, the exact questions used, and the reliability of the data. I use a lot of figures in presenting my own work, to the despair of my publisher, and I know how difficult it is to both combine elegance and simplicity with technical details in a compact space. But if we don't provide these details, then this is such a bad role model for our students!

She's got a point. I'm a big believer in having the graph and caption be a self-contained entity--as I tell my students, you have to think about people like me who only read the graphs--but I've rarely put the data source right in there too. In our book we have all the data sources listed in the notes at the end, but I agree that putting sources right on the graph would be a good idea. Actually, I think what I want to do is write some R functions to make graphs just the way I like them, and one option on the graph will be to give the data sources in small print near the bottom.
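
Something like this minimal base-R sketch is the kind of thing I have in mind (the helper-function name and placement are just illustrative):

    # add a data-source note in small print at the bottom of the current plot
    add_source <- function(source_text) {
      mtext(source_text, side = 1, line = 4, adj = 0, cex = 0.7, col = "gray30")
    }

    plot(mtcars$wt, mtcars$mpg,
         xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
         main = "Fuel economy vs. weight")
    add_source("Source: Motor Trend road tests, 1974 (R's built-in mtcars data).")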

Sports fans as potential Republicans?

Brad Miner writes:

With the Super Bowl coming up this weekend, I [Miner] want to write about sports, which I consider a key to building a larger conservative coalition in America. . . .

If you did a survey of the political philosophies of 75,000 randomly selected Americans you'd expect the usual--if somewhat mystifying--results: "Only about one-in-five Americans currently call themselves liberal (21%), while 38% say they are conservative and 36% describe themselves as moderate." So said the folks at Pew Research, and this was after the November election.

Do that same poll among the fans at Raymond James Stadium in Tampa on Sunday and the results would likely be more like 15% liberal, 30% moderate, and 50% conservative. And a bunch of those liberals would probably be gun owners.

Obviously those numbers are just speculation on my part, but I guarantee that Steelers fans are more conservative than all Pennsylvanians and ditto Cardinals devotees and the rest of Arizona. Which is not to say that these folks cast their ballots in November more for McCain than Obama. That's the problem.

What do the data say?

Yu-Sung and I looked at the "attended sporting event in the past year" item in the General Social Survey. (Unfortunately, the question was only asked once, in the 1993-1996 survey.) 56% of respondents said they "attended an amateur or professional sports event" during the past twelve months. How do they differ from the 44% who didn't?

[Graph: party identification and political ideology, comparing sports-event attenders to non-attenders]

So, at least in the mid-1990s, sports attenders were quite a bit more Republican than other Americans (the categories in the graph above are Strong Democrat, Democrat, lean Democrat, Independent, lean Republican, Republican, strong Republican), but not much different in their liberal-conservative ideology.

So these data do not appear to support Miner's claim. Miner expected sports fans to label themselves as more conservative but maybe not to be more likely to vote Republican; actually, sports fans were more likely to call themselves Republican but no more likely to describe themselves as conservative.

Some other issues:

1. The sporting event attended could be the Super Bowl or your kid's soccer game. Maybe more dramatic results would be obtained by considering a more restricted group of sports fans.

2. There are lots of surveys of TV watching, so I'm sure there are tons of data that would let you crosstab ideology, voting, and spectator sports watching.

3. More generally, we never want to rely too strongly on just one survey. Still, it's fun to look.

P.S. Sometimes people ask me how much time blogging takes me. This took about an hour: 15 minutes for me to read Miner's article and think about it, 10 minutes for Yu-Sung to get the crosstabs, 20 minutes for me to make the graphs, and 15 minutes for me to write the blog entry.

And, yes, this means I have a lot of real work that I've been putting off. . . .

Burt Monroe writes:

I [Burt] sent an entry for the Chance visualization contest. By the time I'd ferreted out the original in our library and quizzed family friends (my father was a biologist) about bacterial taxonomy, I ended up writing a goofy little paper about the whole thing.

Here's Burt's article, and here's his graph:

[Graph: Burt Monroe's contest entry]

This is pretty, although to my eye it looks a little busy. I'd probably favor including this information in multiple plots. That said, I didn't actually look at the data or the problem--all I did was post the announcement--so maybe it's the best thing to do. I like that Burt investigated the context of the problem and didn't just treat this as a "dataset" to be graphed.

A diagram of graphs

Jess sends along this, which isn't a bad idea although I disagree with how it's organized. For one thing, I think just about all graphs are comparisons; for another, I think line graphs are often the way to go, so I'm unhappy to see them in only a few of the pictures here; for another, the scatterplot-plus-regression-line, which I love, isn't anywhere to be found. But I appreciate the thought.

A longstanding principle in statistics

Hal Pashler writes:

I [Pashler] thought you guys would enjoy this charming little 1950 paper by Edward Cureton entitled "Validity, Reliability, and Baloney" (Dirk Vorberg, a German math psych guy, sent it). Long before machine learning, it seems that psychometrics people were confronting this issue--and the concrete form it took was "What should we make of validation measures computed with the same data that were used to select out particular items for inclusion in the test?". Just swap voxels for items, and it's the same problem [as in the Vul, Harris, Winkielman, and Pashler paper on suspiciously high correlations in brain imaging studies].

This reminds me of a longstanding principle in statistics, which is that, whatever you do, somebody in psychometrics already did it long before. I've noticed this a few times. Once, about ten years ago, I was at a conference where computer scientists were talking about some pretty elaborate statistical models, and I realized these were the same as some things I'd seen Iven Van Mechelen and his colleagues working on in the Psychology Department at Leuven. Then, more recently, I wrote this article with David Park on splitting a predictor into three parts, and it turned out that similar work had been done back in 1928 (!) by the psychometric researcher T. L. Kelley (and, oddly enough, by E. Cureton in 1957).

I received the following email from a Ph.D. student who wishes to remain anonymous:

I came into operations research with a masters in mechanical engineering, therefore my statistical analysis is very unbalanced. I took stochastic processes I and II (both PhD level courses) but I have no applied background in statistics. Now I am doing a lot of number crunching using R but I believe I still lack broad understanding of statistical tools and I don't know enough about analysis (although I believe I have a solid understanding of the basics like mean, divergence, confidence interval, etc.)

Given my embarrassing situation, I would appreciate it if you would please, in a blog post, lay out a learning plan for people like me who would like to dive deeper into statistical analysis and do it on a daily basis but come from weird backgrounds like mechanical engineering.

My reply: Since you're not at Columbia, I can't simply recommend that you take my course. You could read my books; that might help. More seriously, you gotta think about all the great skills you have that many statistics students don't have: you can make yourself useful in a lot of different sorts of projects . . .

A cool interactive statistics web course

Aleks sends in this link to an interactive statistics web course and writes, "It requires registration, but I highly recommend you go through this. You will probably not like the statistics teaching style that goes back to Fisher's 1930s textbook, but it's hard not to appreciate the effort that went into interactive 3D demonstrations of various concepts."

There is something incredibly old-fashioned about what they're teaching, but I agree that the 3-D graphics are extremely impressive. Probably the wave of the future, once it can be combined with a more up-to-date (that is, ARM-style) set of statistical concepts and methods.

Peter Flom is a statistical consultant I know, who has worked with graduate students and researchers in the social sciences, education, medicine and other fields. He earned his PhD in psychometrics from Fordham University in 1999, and has been first author and co-author on numerous publications. He has also assisted in grant writing and dissertation preparation. He is based in New York City, but also works remotely.

Nate points out that Obama's approval rating at the start of his presidency is higher than anyone since John Kennedy, and he writes, "In comparison with Ronald Reagan, however, Obama's approval is quite a bit more impressive. Indeed, it is hard to mount a credible argument that Reagan began his term with more political capital than Obama."

I recall reading that Reagan's popularity jumped after his attempted assassination two months into his presidency. (See here for some graphs.) So I suspect that the impression of Reagan being initially popular comes from that point, not from January 20.

P.S. Recall that Reagan, like Roosevelt, was a statistician.

Urban Obama

Mark Blumenthal links to this article by Nate Silver, who writes, "If Bill Clinton was the first black president, then Barack Obama might be the first urban one."

This reminds me of some of our recent discussions here:

- Who was the last urban president before Obama? (I said Nixon, who lived in New York before his 1968 presidential run; a commenter said Kennedy, who was from Boston.)

- County-level vote swings by population. Democrats have been gaining in urban areas. The gain has been pretty steady over the past three elections, so I don't know how much should be attributed to Obama's urban-ness in particular. Graphs here along with much discussion:

swingspop_more.png

What do we see?
1. The large-county/small-county differential in Obama's gains was particularly strong in the south and did not occur at all in the northeast. For example, Obama won 84% of the two-party vote in Philadelphia--but Kerry got 80% there four years ago. This 4% swing was about the same as Obama's swing nationally. Part of the issue here is that Obama had almost no room for improvement in these places.

2. The pattern of Democrats improving more in large-population counties is not unique to 2008. Gore did (relatively) well in big counties in all regions in 2000.

Christian Robert has some thoughts on my paper with Aleks, Yu-Sung, and Grazia on weakly informative priors for logistic regression. Christian writes:

I [Christian] would have liked to see a comparison of bayesglm with the generalised g-prior perspective we develop in Bayesian Core . . . starting with a g-like-prior on the parameters and using a non-informative prior on the factor g allows for both a natural data-based scaling and an accounting of the dependence between the covariates. This non-informative prior on g then amounts to a generalised t prior on the parameter, once g is integrated.

This sounds interesting. I agree that it makes sense to use a hierarchical model for the coefficients so that they are scaled relative to each other.

Regarding the pre-scaling that we do: I think something of this sort is necessary in order to be able to incorporate prior information. For example, if you are regressing earnings on height, it makes a difference if height is in inches, feet, meters, kilometers, etc. (Although any scale is ok if you take logs first.) I agree that the pre-scaling can be thought of as an approximation to a more formal hierarchical model of the scaling. Aleks and I discussed this when working on the bayesglm project, but it wasn't clear how to easily implement such scaling. It's possible that the t-family prior can be interpreted as some sort of mixture with a normal prior on the scaling.
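To make the units point concrete, here's a minimal sketch of the kind of pre-scaling being discussed (my own toy example, not code from the paper): center each non-binary input and rescale it to have standard deviation 0.5, so that the prior means the same thing whether height was recorded in inches or in meters.

```r
# Rescale a non-binary predictor to mean 0, sd 0.5 (i.e., divide by two sd's).
rescale_predictor <- function(x) (x - mean(x)) / (2 * sd(x))

height_in <- c(62, 65, 68, 70, 73)   # hypothetical heights in inches
height_m  <- height_in * 0.0254      # the same heights in meters

rescale_predictor(height_in)         # identical output...
rescale_predictor(height_m)          # ...so the prior no longer depends on the units
```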

In any case, maybe Aleks can try Christian's model on our corpus and see what happens. Christian links to his code, which would be a good place to start.

Multicolor text in R

Hey, I've wanted to do this for a while! Example code here.

I gotta say, I find the expression() function incredibly difficult to use. Examples are key.
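For what it's worth, here's one way to get multicolor text that avoids expression() entirely; this is a minimal sketch of my own, not the linked example: place each colored piece with text() and use strwidth() to figure out where the next piece should start.

```r
plot.new()   # sets up a [0,1] x [0,1] coordinate system
pieces <- c("red, ", "green, ", "and blue text")
colors <- c("red", "darkgreen", "blue")
x <- 0.1
for (i in seq_along(pieces)) {
  text(x, 0.5, pieces[i], col = colors[i], adj = 0)  # adj = 0: left-align at x
  x <- x + strwidth(pieces[i])                       # advance past what was just drawn
}
```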

Arthur Brooks writes:

Over the past several years, studies have consistently shown that people on the political right outperform those on the left when it comes to charity. This pattern appears to have held -- increased, even -- in 2008.

In May of last year, the Gallup polling organization asked 1,200 American adults about their giving patterns. People who called themselves "conservative" or "very conservative" made up 42% of the population surveyed, but gave 56% of the total charitable donations. In contrast, "liberal" or "very liberal" respondents were 29% of those polled but gave just 7% of donations.

Is it something I said?

I had a grant application turned down and wrote the following polite email to the program director:

Dear Dr. ***,

I am sorry to hear this. In particular, I can't understand how the panel could've thought that the methods are "not in themselves new." Clearly we have more work to do in explaining our proposal.

But I will look on the upside, which is that ** must have received some excellent proposals to fund that were even better than ours! So congratulations on that.

Yours
Andrew Gelman

I was surprised that he did not respond, but when I related the story to my colleagues, they explained to me that the director might have thought I was being sarcastic in my email. I was actually sincere. But intonation is notoriously difficult to convey via email.

What do Americans think of gay rights?

Justin Phillips, Jeff Lax, and I wrote this article summarizing some of the findings of their recent research on gay rights in the states:

In his address at the Democratic convention, Barack Obama said, "surely we can agree that our gay and lesbian brothers and sisters deserve to visit the person they love in the hospital and to live lives free of discrimination."

What was he thinking, saying this to the nation? California was on the way to a contentious battle over same-sex marriage and the issue has arisen in other states as well. Isn't gay rights a wedge issue that Democrats should try to avoid?

Yes, Americans are conflicted about same-sex marriage, but one thing they mostly agree on is support for antidiscrimination laws.

In surveys, 72% of Americans support laws prohibiting employment discrimination on the basis of sexual orientation. An even greater number answer yes when asked, "Do you think homosexuals should have equal rights in terms of job opportunities?" This consensus is remarkably widespread: in all states a majority support antidiscrimination laws protecting gays and lesbians, and in all but 10 states this support is 70% or higher.

But people do not uniformly support gay rights. When asked whether gays should be allowed to work as elementary school teachers, 48% of Americans say no. We could easily understand a consistent pro-gay or anti-gay position. But what explains this seeming contradiction within public opinion--that gays should be legally protected against discrimination but at the same time not be allowed to be teachers?

If anything, we could imagine people holding an opposite constellation of views, saying that gays should not be forbidden to be public school teachers but still allowing private citizens to discriminate against gays. A libertarian, for example, might take that position, but it does not appear to be popular among Americans.

We understand the contradictory attitude on gay rights in terms of framing.

Elizabeth Suhay sent me this article examining the mechanisms underlying social influence. Here's the abstract:

Citizens often feel pressured to adopt the beliefs held by their peers, conforming to the views of the majority even in the absence of rational argument. However, few scholars have investigated the mechanisms underlying this "mindless" conformity to group pressure. Drawing on recent research in psychology, this manuscript puts forward a new theory of group influence called Social-Emotional Influence Theory which states that subjective group identification and self-conscious emotions (e.g., pride and shame) are critical to understanding political conformity. We feel pride when we conform to, and shame when we deviate from, in-group beliefs and behaviors; these emotional reactions motivate conformity. SEI Theory is tested with an experimental study of group influence among Midwestern American Catholics with respect to social conservatism. The evidence supports SEI theory: Identification with other Catholics mediated group influence over participants' conservative views, and self-conscious emotions appeared to play a key role in explaining that influence.

I like that Suhay presents her theory as complementary to, rather than competing with, the more traditional quasi-rational information-processing models of Diana Mutz and others.

Regarding the social conservatism of American Catholics, I wonder what Suhay would say about Rudy de la Garza's finding that Latino Americans have conservative views on abortion, but very few of them state abortion as an important determinant in their vote. This suggests that there's another choice point, which is how much to consider attitudes on social issues when deciding how to vote.

And here's her picture (sorry, no cool data graphs this time):

Lessons from the 2008 election

My thoughts here.

And, for a different audience, a discussion of Red State, Blue State.

As always, I thank my collaborators for a lot of the analysis that I'm summarizing in these articles and discussions.

I discussed here the gradual decline in the variation of relative vote swings by state:

swings.png

The next step was to do this calculation by counties. For each presidential election year in the graph below, I computed the interquartile range (that is, the 75th percentile minus the 25th percentile) of the swings in vote proportions for the Republicans for the 3000 or so counties in the United States. I exclude third-party votes. For each year, I computed the interquartile range for all the counties in the U.S. and also just for the counties outside of the South.

countyswingstimeseries.png

I only seem to have the data at hand back to 1968, which is why this graph only goes back to 1972. As with the statewide swings, there has been a steady decline, in this case much more dramatic in the South. The decline is gradual, but we're clearly at a lower level of variation now than we were 30 years ago.
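For concreteness, here's a minimal sketch of the calculation described above, with simulated vote shares standing in for the real county returns (the object names and numbers are my own, purely illustrative):

```r
set.seed(2)
years <- seq(1972, 2008, by = 4)
n_counties <- 3000
# Simulated two-party Republican vote shares: rows = counties, columns = years.
rep_share <- matrix(runif(n_counties * length(years), 0.2, 0.8),
                    nrow = n_counties, dimnames = list(NULL, years))

swings <- rep_share[, -1] - rep_share[, -ncol(rep_share)]  # swing from one election to the next
iqr_by_year <- apply(swings, 2, IQR)                       # spread of swings for each election pair
plot(years[-1], iqr_by_year, type = "b",
     xlab = "election year", ylab = "IQR of county-level swings")
```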

A few days ago, I discussed an interesting article that said that it's actually not so unusual for countries to be multilingual. Ubs disagrees:

Although there is more to nationalism than just language, the idea of identifying a political state with a single language is a central idea of nationalism. When you contemplate why it is that today we expect any state to have a single language and think of Canada or Belgium as "weird" (and don't forget Switzerland), what you're really contemplating is why the nation-state has become dominant in the modern world.

Among those who study such things, the standard and mainstream thesis is that there is a fundamental incompatibility between a multinational state and a modern state. Historical evidence is abundant that nationalism tends to occur simultaneously with industrialization, and there are plenty of plausible reasons why this should be so. In a traditional society, where the economy is primarily agricultural and power flows hierarchically, a local noble who does not speak the language of the capital is at no particular disadvantage; in a modern society, where production is aimed for the market, literacy is essential to economic success, and political power flows through a central establishment, he is disenfranchised.

Historians continue to debate the exact nature and significance of the connection between modernization and nationalism, but no one can ignore the question. The two central examples are the Habsburg and Ottoman empires. The 19th century history of both states is completely dominated by their efforts to reconcile multinationalism with modernization. The Ottoman Empire remained multinational and as a result failed to effectively modernize. The Habsburg empire did modernize but was unable to remain multinational.

The notion that language became a problem in Habsburg lands only after the end of World War I, as your quoted excerpt seems to imply, is ridiculous. The language problem dominated the empire's politics from its founding in the Napoleonic wars until its defeat. Following the links, I see that Kamusella's book is 1,168 pages long, so I'm sure he has plenty to say about this, but if his thesis is merely that multilingualism is extinct in central Europe because the mean British, French, and Americans "delegitimized" multilingualism, then either he is naive or he thinks we are.

I would say -- and I believe this is a pretty mainstream view -- that the multilingual nature of the Habsburg empire (and likewise the Ottoman) put it at a disadvantage vis-à-vis nation-states like France and Britain. As a result it was unable to recover after defeat as Germany did. The victors of World War I did not "choose to delegitimize" Austria-Hungary; they defeated it, and they destroyed it. The bundle of smaller states that filled the void were created not in the pursuit of any unilingual ideal, but simply for the usual geopolitical reasons.

It is true that Wilson paid lip service to the idea of drawing state boundaries to match national identities, but this premise was used only where it was politically convenient. It was easily abandoned in South Tyrol, Sudetenland, and Asia Minor, among other places, and the victors' preservation of bilingual Belgium was the very opposite of delegitimization. The principle behind the carving up of Central Europe after World War I was not any "ideal of ethnolinguistic homogeneity", but rather an attempt to secure all the economically productive property under the political control of the victors.

But I digress. Reading the post again, I think I probably don't disagree with Kamusella nearly as much as I initially thought I did. I think the excerpt and the context in which it was presented rubbed me the wrong way.

The point is that if you had "always thought of monolingual countries as a default rather than a construct", you were absolutely right. At least for any modern state.

Far be it from me to argue with someone who can not only use the expression "vis-à-vis" in a sentence but also knows how to put the accent into an email. But . . . what about China, India, and the former Soviet Union?

P.S. Ubs also points out:

If you say "monolingual" you're mixing languages (not that that hasn't been done before): Bilingual, multilingual, unilingual; polyglot, monoglot (and I suppose, though I've never heard it, "duoglot").

I'll try "unilingual" on for size. (I can't bring myself to say "monoglot," given that "polyglot" sounds weird enough as it is.

Annals of Spam

This one just came from Life Science Journals:

Based on your research profile, we would like to offer you the following free subscription to Nature Biotechnology.

Click here to sign-up for your free subscription which is available without any obligation to qualified scientists.

What part of my "research profile" are they talking about??

"Can economists be trusted?"

Mark Thoma has an interesting discussion of the challenge that the economics profession, and individual economists, have when they give policy recommendations.

Mark's basic point goes as follows. Consider the following four stages of a model:

(a) assumptions about fundamental principles of how the world works,
(b) normative principles (that is, fundamental goals, views about how the world should be),
(c) conclusions about the likely effects of policies,
(d) recommendations about policies.

In any rigorous economic model, there should be a mapping leading from (a) to (c). Further reasoning (possibly mathematical modeling, as in cost-benefit analysis) will take you from (b) and (c) to (d).

That's all fine. But Mark's point is that the reasoning can go the other way too: start with (b) and (d), and then you can figure out what (c) needs to be, and then you can go back one more step and figure out what model (a) you need to get started! Even if economists are not doing this reasoning-from-conclusions-to-assumptions explicitly, you could well believe it's going on implicitly as well as being induced by various pressures such as the selection of what research results to report and even what problems to work on.

This is inevitable, and I discuss it in the decision analysis chapter (22, I think it is) of Bayesian Data Analysis. We call it the garbage-in-garbage-out problem: If you can come up with any decision you'd like by just altering the inputs of your analysis, then what's the point of decision analysis (or, by extension to the above-linked example, economic modeling) at all?

My answer is something that I call "institutional decision analysis," which has two principles:

1. It can be a good idea to provide reasoning to justify your decisions. As an individual person, you might not have to justify your personal decisions to anyone (except to your spouse), but an institution--whether it be a business, a government agency, a nonprofit organization, or some other grouping--often needs some path of bread crumbs connecting assumptions to recommendations. (Here, I carefully say "connecting" rather than "leading from" to be agnostic about the direction of the reasoning.)

2. As Mark noted, an overall decision recommendation on anything important is likely to depend on assumptions to such an extent that it's probably fair to say that the analyst is reasoning from conclusions to assumptions (from (d) to (c) and then to (a), in my above notation). But, even then, formal decision analysis can be useful in making relative recommendations. This is the point that we made in our article about decision making for home radon [link fixed]. In the economics context, this might suggest that economists of different political persuasions could still give useful recommendations about how to spend money or cut taxes, or where in the economy such policies would make more or less sense.

Strange Maps has this cool picture of Polish election results compared to the pre-1914 partition border:

poland_2007_election_results_1.jpg

I can't tell what the colors represent, but it's striking nonetheless. In linking to it, Matt Yglesias writes,

History's impact can often be surprisingly long-lasting. It's been a long time since taking midwestern agricultural products via train to Chicago and then by boat across the Great Lakes, across the Erie Canal, down the Hudson, and to the port at New York was a major element in the American economy. But we still have two giant cities in Chicago and New York . . . I wouldn't be surprised if the German-run bit of Poland was richer in 1918 than the rest of it, and that the differential has persisted since then. By the same token, we can expect the East Germany part of Germany to remain poorer than the West Germany part for a long time.

Here are some graphs that I posted a couple of years ago and that found their way into chapter 5:

stateincometrends.png

More pictures here (for those of you who haven't bought the book yet). For the book, we cleaned up the graphs a bit, but the general point remains: the states that are rich and poor now are the ones that were rich and poor 80 years ago.

Mark Blumenthal writes about some of the elaborate analyses done by Catalist and other political consulting firms that helped organize the Democrats' get-out-the-vote efforts in 2008. Presumably the Republicans will be doing this next time as well. It brings swing-voter targeting to the next level. It would also be interesting to do some multilevel modeling to put together their county- and individual-level analyses (and, soon, their precinct-level analyses). Also it's worth thinking about how national politics might change as these techniques become more widely used.

Fiction and reality

In a discussion of her recent Aikenesque historical novel, Jenny links to a reviewer who liked the book but didn't think the fantasy elements fit in so well with the rest of the book. In this case, the fantasy part involved communication between living and dead people ("spiritualism"). Jenny then links to Colleen Mondor, who also liked the book and said that she didn't mind the fantasy element since she (Colleen) has talked to dead people herself.

On this particular point, I have no problem with people talking with dead people. But I'm skeptical about claims that the dead people are talking back.

Here I'm talking about real life, not fiction. I certainly agree that fiction can be "true to life" even while violating recorded history or the laws of physics or just about anything else. Think about Stephen King, for example. I imagine there must even be some stream of science fiction (if you'd call it that) centered around, not new technology or alternate history or fantasy, but violations of logic and continuity. For example, a guy goes out of the house wearing a red shirt and later it's green. Or he gets in his car to go to work, but when he gets to work, he's getting off the bus. That sort of thing is impossible by anyone's standards--of course I'm excluding rational explanations, blackouts, Mission Impossible-style kidnappings and staged sets, etc.--but in some ways it's true to lived experience. (Yes, I realize that some of Philip K. Dick's books are sort of like this--for example, Time out of Joint--but here I'm thinking of even more extreme continuity violations, the sort of thing you'd see in a poorly made low-budget movie where somebody lost the script.)

Anyway, my point in bringing this up is to separate any disagreements about the ability of dead people to talk, from the larger question of getting human insight (or at least a good story) from something that not only didn't happen, but couldn't happen.

P.S. All this blogging is a clear sign that I have lots of work I've been putting off! One thing that happened is we just moved (around the corner). Our new apartment has an airy living room with lots of bookshelves, and sitting here seeing all the books gets me thinking more about literature. . . .

Kenny sent me this article by Bill James endorsing Hal "Bayesian Data Analysis" Stern's dis of the BCS. I'd like to add a statistical point, which is a point that Hal and I have discussed once or twice: There is an inherent tension between two goals of any such rating system:

1. Ranking the teams by inherent ability.

2. Scoring the teams based on their season performance.

Here's an example. Consider two teams that played identical opponents in the season, with team A having a 12-0 record and team B going 9-3. But here's the hitch: in my story, team B actually had a much better point differential than team A during the season. That is, team A won a bunch of games by scores of 17-16 or whatever, and team B won a bunch of games 21-3 (with three close losses). Also assume that none of the games were true run-up-the-score blowouts.

In that case, I'd expect that team B is actually better than team A. Not just "better" in some abstract sense but also in a predictive sense. If A and B were playing some third team C, I'd guess (in the absence of other information) that B's probability of winning is greater than A's.

But, as a matter of fairness, I think you've gotta give the higher ranking to team A. They won all 12 games--what more can you ask?
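Here's a toy version of the example, purely my own illustration with invented numbers: ranking by record rewards team A, while ranking by average point differential, which under many simple models of team strength is the better predictor of future results, favors team B.

```r
teams <- data.frame(
  team       = c("A", "B"),
  wins       = c(12, 9),
  losses     = c(0, 3),
  avg_margin = c(1, 10)    # hypothetical average per-game point differentials
)

teams[order(-teams$wins), ]        # the "reward" ordering: A first
teams[order(-teams$avg_margin), ]  # the "inference" ordering: B first
```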

OK, you might say you could resolve this particular problem by only using wins/losses, not using score differentials. But this doesn't really solve the general problem, where teams have different schedules, maybe nobody went 12-0, etc.

My real point with this example is not to recommend a particular ranking strategy but to point out the essential tension between inference and reward in this setting. That's why, as Hal notes, it's important to state clearly what are the goals.

P.S. It's been argued that a more appropriate system is to change the rules of football to make it less damaging to the health of the players (see here for a review of some data). I certainly agree that this is a more important issue than the scoring system. In statistics we often use sports examples to illustrate more general principles, but it is always good to be aware of the reality underlying any example. It also makes sense to me that people who are closer than I am to the reality of the situation would be less amused by the thoughts of Bill James and others about the intellectual issues in the idealized system.

Rationality of voting

Gur Huberman writes, regarding the Edlin, Gelman, and Kaplan article in The Economist's Voice:

Can you extend the charity/rationality argument to explain why people in non-battleground states (e.g., NY) vote? Even if charity motivation is a partial explanation for voting, an implication would be that voter turnout is higher in battleground states, other things being equal. However, I am afraid that this prediction is consistent with many other explanations of why people vote.

Another issue that has intrigued me for years: I am under the impression that voter turnout is lower in local elections and in midterm elections. In midterm elections there's less at stake, so your charity story seems to cover that. But, selfishly speaking, it may well be that who my mayor is may have a stronger impact on my life than who my president is. (Quantifying this last statement is challenging.) If so, why am I more likely to vote in a presidential election than in a mayoral one? Your charity theory may help answer the question.

My reply

1. I think there are many reasons for voting, and in NY it's not particularly rational for instrumental reasons.

2. In our article a couple years ago in the journal Rationality and Society, Edlin, Kaplan, and I discuss the coexistence of many different models for voting. For example, there is the "psychological" model that we are more likely to vote in an election that more people are talking about. People are more likely to talk about an election that is close and that is viewed as important. So the psychological and economic/rational explanations coincide in this way. (Similarly, you could consider psychological or economic rationales for purchases. For example, if I buy something on sale, I'm economically motivated to save money and psychologically motivated because of the pleasure in "getting a deal.") These two things reinforce each other; I see them as parallel, not competing, explanations.

3. Your mayor may have more of an impact on _your_ life, but total impact is proportional to the total number of people affected. And that doesn't even get into foreign policy (not an issue for local politics unless you happen to live in, say, Berkeley, California).

"Real men keep p values to themselves"

AT points me to this column by Andrew Leonard, who writes:

I [Leonard] asked my readers to explain Felix Salmon's statement that "I'd say p=0.3 right now that Barack Obama's first major act as POTUS will be the nationalization of Citigroup."

What follows are some amusing quotes, including this one from Kobi "diagnostics for multiple imputations" Abayomi:

People pick a cutoff -- arbitrarily, really -- and a p-value lower than the cutoff is pronounced "Significant." Alchemy! . . . p-values are a bit passe, if not completely gauche -- statistically speaking. Modelling, these days, is more particular (if not exact). Macho statisticians are proud of tight (posterior (bayesian)) confidence intervals. Real men keep p-values to themselves.

I think there's some confusion here. Leonard's correspondents were making things too complicated. Salmon was using "p" as jargon to represent "probability," not "p-value." Thus, he was saying that he saw it as a 3 in 10 chance that Obama's first major act would be to nationalize Citigroup. Everybody is so hot under the collar about p-values that they didn't notice the direct interpretation.

As I've discussed before, in 2008 the red/blue map was not redrawn; it was more of a national partisan swing:

2004_2008_actual.png

I also posted some graphs of previous vote swings that were less uniform.

But maybe it makes sense to study this more systematically. For every pair of consecutive elections since 1952 (that is, 1952/1956, 1956/1960, . . . , 2004/2008), I compute the interquartile range (that is, the 75th percentile minus the 25th percentile) of the swings in statewide vote proportions for the Republicans. I exclude third-party votes and also exclude states that were won by third parties. For each year, I computed the interquartile range for all 50 states (plus D.C. when appropriate) and also just for the non-southern states. Here's what I found:

swings.png

The past several decades have seen a steady decline in the variation of statewide vote swings. (The big spike in the graph is 1976, when Jimmy Carter did very well in a bunch of southern states that Nixon carried in 1972.)

To put it another way, the red-blue map is much more stable from election to election than it used to be. What's going on? I'm not sure, but I think this is an important stylized fact.

Here's the introduction to Jenny's new book, which is all about the pure nature/nurture distinction, "pure" in the sense of being uncontaminated by the scientific perspective of modern biology. In that sense it reminds me of The Passions and the Interests: Political Arguments for Capitalism before Its Triumph, by the great Albert Hirschman.

Reading this intro reminds me that authors often say that nobody reads the introduction, but in my experience a lot of people do. One way I can tell is that reviews sometimes pick up on items mentioned in the intro; another is that people pick up on personal info in the acknowledgments.

I was also reminded that the first time Jenny told me about her book-in-progress, I thought she said the title was "Braiding" (it was her mid-Atlantic accent), which oddly enough wouldn't be a bad title for the book.

It's funny how often such malapropisms are possible; for example, I had a friend who once said she just wanted to bleed into the woodwork. Another time she said she wanted to get on the right tract. In the case of Braiding the malapropism came from the listener not the speaker, but I think the principles are the same.

By Aleks, Grazia, Yu-Sung and myself. Here's the article, and here's the abstract:

We propose a new prior distribution for classical (nonhierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-t prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. Cross-validation on a corpus of datasets shows the Cauchy class of prior distributions to outperform existing implementations of Gaussian and Laplace priors.

We recommend this prior distribution as a default choice for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small), and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation.

We implement a procedure to fit generalized linear models in R with the Student-t prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several applications, including a series of logistic regressions predicting voting preferences, a small bioassay experiment, and an imputation model for a public health data set.

I love this stuff, and I'm interested in applying the concept of weakly informative prior distributions for many other models.
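As a usage note, here's roughly what this looks like from R, assuming the bayesglm() interface in the arm package; the tiny dataset is made up and chosen to have complete separation, the situation where ordinary maximum likelihood breaks down:

```r
library(arm)

x <- c(-2, -1.5, -1, -0.5, 0.5, 1, 1.5, 2)
y <- c( 0,    0,  0,    0,   1, 1,   1, 1)   # y is perfectly separated by x

# glm(y ~ x, family = binomial)   # MLE: the coefficient on x diverges
fit <- bayesglm(y ~ x, family = binomial(link = "logit"),
                prior.scale = 2.5, prior.df = 1)   # Cauchy(0, 2.5) prior on the coefficients
display(fit)
```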

This semester I'm teaching my "how to teach" class: The Teaching of Statistics at the University Level. (Stat 6600, for those of you here at Columbia.) I'll post more on that in a bit. Here I want to talk about an idea I had as I was falling asleep last night, of a new course I'd like to teach sometime.

The new course will be called Statistical Communication and it will cover the following topics:

- Graphical presentation (not just of raw data, also visualization of inferences)

- Writing research reports

- Writing computer code that can be used by others

- Working with colleagues (including "consulting" but also research collaborations)

- Email, blogging, hallway conversations, and other informal interactions

I think there was some other aspect of statistical communication that I wanted to include that I can't remember right now. The big idea is that maybe something is to be learned by thinking about all these activities as modes of communicating statistical ideas.

Language and politics

Helen DeWitt links to this interesting excerpt from a book by Tomasz Kamusella about the politics of language in central Europe. The basic idea is that we're all too used to thinking that a country should have a single language, and the exceptions (for example, Canada, Belgium, old-time Austria-Hungary) seem weird to us. For example, it's always seen as a big joke in the U.S. that some people in Canada insist on speaking French. China shouldn't be a joke but, hey, they all speak "Chinese," right? And India doesn't really count because they're all supposed to speak English. Anyway, India's not just a country, it's most of a subcontinent, so that's different. And African countries have "tribes" so that doesn't really count either. And, sure, they speak 23 languages in Guatemala, but the official language is Spanish, so that's fine, right? Back when Russia was the U.S.S.R., I certainly had no idea that they spoke Ukrainian and all those other languages there. And of course lots of people in the U.S. get upset that people insist on speaking Spanish here.

Kamusella writes,

Although the Western European pedigree of politics of language is at present conveniently forgotten, the phenomenon of language politicization is said to be now most visible in Central Europe. It is so because after World War I, the formerly multilingual Western European powers of France and the United Kingdom with the support of the United States chose to delegitimize the existence of Austria-Hungary on the account of its multilingualism and multiethnicity. By the same token, the victorious powers legitimized various ethnonational (formerly, often marginal) movements, which defined their postulated nations in terms of language. The national principle steeped in the ideal of ethnolinguistic homogeneity allowed these movements to carve up Central Europe into a multitude of ethnolinguistic nation-states. What followed with vengeance was forced ethnolinguistic homogenization pursued to assimilate 'non-national elements' within a nation-state. . . .

The declaration of more than one language per person was not permitted, which by default excluded the phenomenon of bi- and multilingualism from official scrutiny. The logic of this exclusion stemmed from the conviction that a person can belong to one nation only. By the same token, declarations of variously named dialects, already construed as 'belonging to' a national language, were noted as declarations of this national language. . . .

And then some statistics:

Nowadays, in comparison to the majority of extant polities worldwide, most of the nation-states of Central Europe are unnaturally homogenous in their ethnolinguistic composition. Non-Polish-speakers constitute less than 1 percent of Poland's population, non-Magyar-speakers amount to 2 percent of Hungary's inhabitants, non-Czech-speakers are less than 3 percent in the Czech Republic's populace, non-Romanian-speakers constitute less than 11 percent of Romania's inhabitants, and non-Slovak-speakers amount to less than 15 percent of Slovakia's populace. . . .

I like to say I speak 1 3/4 languages. I wish I could speak more. But, until reading this, I'd always thought of monolingual countries as a default rather than a construct. Interesting stuff.

Chris Blattman writes,

Several aspiring graduate students have written me [Blattman] about becoming an impact evaluator. . . . I think the best advice is: don't get a PhD to do evaluations. The randomized evaluation is just one tool in the knowledge toolbox. . . . Yes, the randomized evaluation remains the "gold standard" for important (albeit narrow) questions. Social science, however, has a much bigger toolbox for a much broader (and often more interesting) realm of inquiry. . . .

I pretty much agree with Chris on the substance of his remarks, but I think he's missing something when he merges "impact evaluation" and "randomized evaluation" into a single concept. Policy analysis is a big area, and it certainly includes observational studies. We care about the impacts of all sorts of policies that can't be directly studied using experimentation.

P.S. In a different direction, it's interesting to me that policy evaluation is considered part of economics (a little bit) but not really part of political science--but maybe things are changing.

I'll be speaking on Red State, Blue State on Mon 12 Jan at the Medical University of South Carolina, Dept of Biostatistics, Bioinformatics, and Epidemiology. The location is Cannon Place, room 301, and the time is noon.

Nate Silver and Greg Mankiw have an interesting exchange about the use of exogenous instruments to estimate causal effects. Unfortunately, the subject is macroeconomics, a topic on which I know next to nothing beyond what I learned in Mr. Cutlip's econ class in 11th grade. But I think it is, in Greg's phrase, "a teachable moment" on the subject of causal inference.

Greg summarizes the exchange pretty well, although I think he's missing a key point.

Nate noticed a newspaper article where Greg related research by Christina and David Romer on the effects of "exogenous" tax cuts on the economy. Nate writes:

The type of tax cut that Romer and Romer think falls into this category is what they call an "exogenous" tax cut -- one designed not to counter business cycles, but rather a "spontaneous" tax cut under relatively healthy economic circumstances.

This is very much not the type of tax cut that we are contemplating right now. Instead, what is being contemplated is a countercyclical action in an unhealthy economy designed to return the economy to normal growth. Romer and Romer are not all that keen on this type of tax cut; in fact, they argue that such "countercyclical fiscal policy is not achieving its intended purpose" . . .

Greg replies:

Why did the Romers focus on exogenous policy changes? The reason is that these are the only changes that can be used to reliably identify the effects of tax policy. . . . The Romers focus on exogenous tax changes for the same reason doctors conduct randomized drug trials--not because they are interested in randomization as a prescriptive tool, but because randomization solves a statistical identification problem.

And now here are my thoughts, again with full recognition that I can really only comment on the statistical issues here, not the economics.

First, Greg is right that it is generally considered desirable or even optimal to estimate treatment effects using randomized experiments or exogenous implementations (but see here for an opposite view from James Heckman), even when the ultimate goal is to understand how the intervention works in the wild, so to speak.

But there is the potential for treatment interactions--that is, a treatment might be more effective in some conditions than in others. There's lots of evidence for treatment interactions in various settings, ranging from education to job training. And this is what Nate is talking about. Again, without attempting to comment on the economics, the treatment effect could vary enough that Nate could be right about the direct relevance of the Romers' study of exogenous tax changes.

To put it another way, Greg is talking about identifiability and Nate is talking about generalizability.
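Here's a small simulation of that distinction, entirely my own toy example with made-up labels: the treatment is randomly assigned, so identification is not the problem, but its effect differs across conditions, so an estimate obtained under one condition need not carry over to the other.

```r
set.seed(1)
n <- 1000
condition <- rbinom(n, 1, 0.5)   # 0 = "healthy economy", 1 = "downturn" (hypothetical labels)
treatment <- rbinom(n, 1, 0.5)   # randomly assigned
effect    <- ifelse(condition == 0, 2, 0.5)   # the treatment works less well under condition 1
y <- 1 + effect * treatment + rnorm(n)

coef(lm(y ~ treatment, subset = condition == 0))["treatment"]  # cleanly identified, near 2
coef(lm(y ~ treatment, subset = condition == 1))["treatment"]  # the effect that matters elsewhere, near 0.5
```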

Greg writes, "I usually don't respond to blogosphere commentary on my work because, after all, time is scarce." But since he's had time to respond once, perhaps he'll be able to respond again and clarify this issue. (I think my time is particularly non-scarce since I'm responding to blogosphere commentary on somebody else's work!) In any case, I like the idea of shifting the debate to a discussion of treatment interactions since then it might be more possible to resolve this on a technical level. Perhaps a teachable moment for me as well as for others.
