Recently in Political Science Category

A few days ago I discussed the evaluation of somewhat-plausible claims that are somewhat supported by theory and somewhat supported by statistical evidence. One point I raised was that an implausibly large estimate of effect size can be cause for concern:

Uri Simonsohn (the author of the recent rebuttal of the name-choice article by Pelham et al.) argued that the implied effects were too large to be believed (just as I was arguing above regarding the July 4th study), which makes more plausible his claims that the results arise from methodological artifacts.

That calculation is straight Bayes: the distribution of systematic errors has much longer tails than the distribution of random errors, so the larger the estimated effect, the more likely it is to be a mistake. This little theoretical result is a bit annoying, because it is the larger effects that are the most interesting!

Larry Bartels notes that my reasoning above is a bit incoherent:

I [Bartels] strongly agree with your bottom line that our main aim should be "understanding effect sizes on a real scale." However, your paradoxical conclusion ("the larger the estimated effect, the more likely it is to be a mistake") seems to distract attention from the effect size of primary interest: the magnitude of the "true" (causal) effect.

If the model you have in mind is b=c+d+e, where b is the estimated effect, c is the "true" (causal) effect, d is a "systematic error" (in your language), and e is a "random error," your point seems to be that your posterior belief regarding the magnitude of the "systematic error," E(d|b), is increasing in b. But the more important fact would seem to be that your posterior belief regarding the magnitude of the "true" (causal) effect, E(c|b), is also increasing in b (at least for plausible-seeming distributional assumptions).

Your prior uncertainty regarding the distributions of these various components will determine how much of the estimated effect you attribute to c and how much you attribute to d, and in the case of "wacky claims" you may indeed want to attribute most of it to d; nevertheless, it seems hard to see why a larger estimated effect should not increase your posterior estimate of the magnitude of the true causal effect, at least to some extent.

Conversely, your skeptical assessment of the flaws in the design of the July 4th study may very well lead you to believe that d>>0; but wouldn't that same skepticism have been warranted (though it might not have been elicited) even if the estimated effect had happened to look more plausible (say, half as large or one-tenth as large)?

Focusing on whether a surprising empirical result is "a mistake" (whatever that means) seems to concede too much to the simple-minded is-there-an-effect-or-isn't-there perspective, while obscuring your more fundamental interest in "understanding [true] effect sizes on a real scale."

Larry's got a point. I'll have to think about this in the context of an example. Maybe a more correct statement would be that, given reasonable models for c, d, and e, if the estimate b gets implausibly large, the estimate for c does not increase proportionally. I actually think there will be some (non-Gaussian) models for which, as b gets larger, E(c|b) can actually go back toward zero. But this will depend on the distributional form.
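This can be checked numerically in Bartels's model b = c + d + e. Here's a minimal sketch (my numbers, not anyone's fitted model): put a normal prior on the true effect c, give the combined error d + e long Student-t tails, and compute the posterior mean E(c|b) on a grid for increasingly large estimates b.

```python
import numpy as np

# Grid over the true effect c, with a N(0,1) prior (all scales hypothetical).
c = np.linspace(-10, 10, 2001)
prior = np.exp(-0.5 * c ** 2)

def t3_kernel(x, scale=2.0):
    # Student-t(3) kernel for the combined error d + e:
    # much longer tails than the prior on c.
    z = x / scale
    return (1 + z ** 2 / 3) ** -2 / scale

def posterior_mean(b):
    post = prior * t3_kernel(b - c)   # unnormalized posterior on the grid
    post /= post.sum()
    return float((c * post).sum())

for b in (1, 3, 10):
    print(b, round(posterior_mean(b), 2))
```

With these tails, E(c|b) rises with b at first but stops tracking it: at b = 10 the posterior mean stays close to the prior, which is the sense in which a huge estimate is mostly evidence of error rather than of a huge true effect.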

I agree that "how likely is it to be a mistake" is the wrong way to look at things. For example, in the July 4th study, there are a lot of sources of variation, only some of which are controlled for in the analysis that was presented. No analysis is perfect, so the "mistake" framing is generally not so helpful.

Around these parts we see a continuing flow of unusual claims supported by some statistical evidence. The claims are varyingly plausible a priori. Some examples (I won't bother to supply the links; regular readers will remember these examples and newcomers can find them by searching):

- Obesity is contagious
- People's names affect where they live, what jobs they take, etc.
- Beautiful people are more likely to have girl babies
- More attractive instructors have higher teaching evaluations
- In a basketball game, it's better to be behind by a point at halftime than to be ahead by a point
- Praying for someone without their knowledge improves their recovery from heart attacks
- A variety of claims about ESP

How should we think about these claims? The usual approach is to evaluate the statistical evidence--in particular, to look for reasons that the claimed results are not really statistically significant. If nobody can shoot down a claim, it survives.

The other part of the story is the prior. The less plausible the claim, the more carefully I'm inclined to check the analysis.

But what does it mean, exactly, to check an analysis? The key step is to interpret the findings quantitatively: not just as significant/non-significant but as an effect size, and then to look at the implications of the estimated effect.

I'll explore this in the context of two examples, one from political science and one from psychology. An easy example is one in which the estimated effect is completely plausible (for example, the incumbency advantage in U.S. elections), or in which it is completely implausible (for example, a new and unreplicated claim of ESP).

Neither of the examples I consider here is easy: both of the claims are odd but plausible, and both are supported by data, theory, and reasonably sophisticated analysis.

The effect of rain on July 4th

My co-blogger John Sides linked to an article by Andreas Madestam and David Yanagizawa-Drott that reports that going to July 4th celebrations in childhood had the effect of making people more Republican. Madestam and Yanagizawa-Drott write:

Using daily precipitation data to proxy for exogenous variation in participation on Fourth of July as a child, we examine the role of the celebrations for people born in 1920-1990. We find that days without rain on Fourth of July in childhood have lifelong effects. In particular, they shift adult views and behavior in favor of the Republicans and increase later-life political participation. Our estimates are significant: one Fourth of July without rain before age 18 raises the likelihood of identifying as a Republican by 2 percent and voting for the Republican candidate by 4 percent. . . .

Here was John's reaction:

In sum, if you were born before 1970, and experienced sunny July 4th days between the ages of 7-14, and lived in a predominantly Republican county, you may be more Republican as a consequence.

When I [John] first read the abstract, I did not believe the findings at all. I doubted whether July 4th celebrations were all that influential. And the effects seem to occur too early in the life cycle: would an 8-year-old be affected politically? Doesn't the average 8-year-old care more about fireworks than patriotism?

But the paper does a lot of spadework and, ultimately, I was left thinking "Huh, maybe this is true." I'm still not certain, but it was worth a blog post.

My reaction is similar to John's but a bit more on the skeptical side.

Let's start with effect size. One July 4th without rain increases the probability of Republican vote by 4%. From their Figure 3, the number of rain-free July 4ths is between 6 and 12 for most respondents. So if we go from the low to the high end, we get an effect of 6*4%, or 24%.

[Note: See comment below from Winston Lim. If the effect is 24% (not 24 percentage points!) on the Republican vote and 0% on the Democratic vote, then the effect on the vote share R/(D+R) is 1.24/2.24 - 1/2, or approximately 5 percentage points. So the estimate is much less extreme than I'd thought. The confusion arose because I am used to seeing results reported in terms of the percent of the two-party vote share, but these researchers used a different form of summary.]
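The arithmetic in that note, spelled out (a back-of-envelope check, not the paper's own calculation):

```python
# Start from an even split. A 24% relative increase in the Republican
# vote, with the Democratic vote unchanged, moves the two-party share
# R/(D+R) by only about five points:
r, d = 1.24, 1.00
shift = r / (r + d) - 0.5
print(round(shift, 3))  # 0.054
```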

Does a childhood full of sunny July 4ths really make you 24 percentage points more likely to vote Republican? (The authors find no such effect when considering the weather in a few other days in July.) I could imagine an effect--but 24 percent of the vote? The number seems too high--especially considering the expected attenuation (noted in section 3.1 of the paper) because not everyone goes to a July 4th celebration and because the authors don't actually know the counties where the survey respondents lived as children. It's hard enough to believe an effect size of 24%, but it's really hard to believe 24% as an underestimate.

So what could've gone wrong? The most convincing part of the analysis was that they found no effect of rain on July 2, 3, 5, or 6. But this made me wonder about the other days of the year. I'd like to see them automate their analysis and loop it through all 365 days, then make a graph showing how the coefficient for July 4th fits in. (I'm not saying they should include all 365 in a single regression--that would be a mess. Rather, I'm suggesting the simpler option of 365 analyses, each for a single date.)
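Here's a sketch of that 365-regressions idea on simulated data (everything here is made up: a fake effect is injected on July 4th only, and each calendar date gets its own one-predictor regression):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000                      # hypothetical survey respondents
july4 = 184                   # 0-based day-of-year index of July 4th

# Fake data: rain-free childhood days for every calendar date, plus an
# outcome that responds (by construction) only to the July 4th count.
rainfree = rng.poisson(9, size=(n, 365)).astype(float)
y = 0.08 * rainfree[:, july4] + rng.normal(0, 1, n)

def slope(x, y):
    # OLS slope of y on a single predictor x.
    x = x - x.mean()
    return float((x * (y - y.mean())).sum() / (x * x).sum())

coefs = np.array([slope(rainfree[:, day], y) for day in range(365)])

# Plotting coefs (one point per date) would show how the July 4th
# coefficient stands out against the 364 placebo dates.
print(coefs[july4], np.abs(np.delete(coefs, july4)).max())
```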

Otherwise there are various features in the analysis that could cause problems. The authors predict individual survey responses given the July 4th weather when respondents were children, in the counties where they currently reside. Right away we can imagine all sorts of biases based on who moves and who stays put.

Setting aside these measurement issues, the big identification issue is that counties with more rain might be systematically different from counties with less rain. To the extent the weather can be considered a random treatment, the randomization is occurring across years within counties. The authors attempt to deal with this by including "county fixed effects"--that is, allowing the intercept to vary by county. That's OK, but their data span a 70-year period, and counties have changed a lot politically in 70 years. They also include linear time trends for states, which helps some more, but I'm still a little concerned about systematic differences not captured in these trends.

No study is perfect, and I'm not saying these are devastating criticisms. I'm just trying to work through my thoughts here.

The effects of names on life choices

For another example, consider the study by Brett Pelham, Matthew Mirenberg, and John Jones of the dentists named Dennis (and the related stories of people with names beginning with F getting low grades, baseball players with K names getting more strikeouts, etc.). I found these claims varyingly plausible: the business with the grades and the strikeouts sounded like a joke, but the claims about career choices etc seemed possible.

My first step in trying to understand these claims was to estimate an effect size: my crude estimate was that, if the research findings were correct, about 1% of people choose their career based on their first names.

This seemed possible to me, but Uri Simonsohn (the author of the recent rebuttal of the name-choice article by Pelham et al.) argued that the implied effects were too large to be believed (just as I was arguing above regarding the July 4th study), which makes more plausible his claims that the results arise from methodological artifacts.

That calculation is straight Bayes: the distribution of systematic errors has much longer tails than the distribution of random errors, so the larger the estimated effect, the more likely it is to be a mistake. This little theoretical result is a bit annoying, because it is the larger effects that are the most interesting!

Simonsohn moved the discussion forward by calibrating the effect-size questions to other measurable quantities:

We need a benchmark to make a more informed judgment if the effect is small or large. For example, the Dennis/dentist effect should be much smaller than parent-dentist/child-dentist. I think this is almost certainly true but it is an easy hurdle. The J marries J effect should not be much larger than the effect of, say, conditioning on going to the same high-school, having sat next to each other in class for a whole semester.

I have no idea if that hurdle is passed. These are arbitrary thresholds for sure, but better I'd argue than both my "100% increase is too big", and your "pr(marry smith) up from 1% to 2% is ok."


No easy answers. But I think that understanding effect sizes on a real scale is a start.

Dave Backus points me to this review by anthropologist Mike McGovern of two books by economist Paul Collier on the politics of economic development in Africa. My first reaction was that this was interesting but non-statistical so I'd have to either post it on the sister blog or wait until the 30 days of statistics was over. But then I looked more carefully and realized that this discussion is very relevant to applied statistics.

Here's McGovern's substantive critique:

Much of the fundamental intellectual work in Collier's analyses is, in fact, ethnographic. Because it is not done very self-consciously and takes place within a larger econometric rhetoric in which such forms of knowledge are dismissed as "subjective" or worse still biased by the political (read "leftist") agendas of the academics who create them, it is often ethnography of a low quality. . . .

Despite the adoption of a Naipaulian unsentimental-dispatches-from-the-trenches rhetoric, the story told in Collier's two books is in the end a morality tale. The tale is about those countries and individuals with the gumption to pull themselves up by their bootstraps or the courage to speak truth to power, and those power-drunk bottom billion elites, toadying sycophants, and soft-hearted academics too blinded by misplaced utopian dreams to recognize the real causes of economic stagnation and civil war. By insisting on the credo of "just the facts, ma'am," the books introduce many of their key analytical moves on the sly, or via anecdote. . . . This is one explanation of how he comes to the point of effectively arguing for an international regime that would chastise undemocratic leaders by inviting their armies to oust them--a proposal that overestimates the virtuousness of rich countries (and poor countries' armies) while it ignores many other potential sources of political change . . .

My [McGovern's] aim in this essay is not to demolish Collier's important work, nor to call into question development economics or the use of statistics. . . . But the rhetorical tics of Collier's books deserve some attention. . . . if his European and North American audiences are so deeply (and, it would seem, so easily) misled, why is he quick to presume that the "bottom billion" are rational actors? Mightn't they, too, be resistant to the good sense purveyed by economists and other demystifiers?

Now to the statistical modeling, causal inference, and social science. McGovern writes of Collier (and other quantitatively-minded researchers):

Portions of the two books draw on Collier's academic articles to show one or several intriguing correlations. Having run a series of regressions, he identifies counterintuitive findings . . . However, his analysis is typically a two-step process. First, he states the correlation, and then, he suggests an explanation of what the causal process might be. . . . Much of the intellectual heavy lifting in these books is in fact done at the level of implication or commonsense guessing.

This pattern (of which McGovern gives several convincing examples) is what statistician Kaiser Fung calls story time--that pivot from the quantitative finding to the speculative explanation. My favorite recent example remains the claim that "a raise won't make you work harder." As with McGovern's example, the "story time" hypothesis there may very well be true (under some circumstances), but the statistical evidence doesn't come close to proving the claim or even convincing me of its basic truth.

The story of story time

But story time can't be avoided. On one hand, there are real questions to be answered and real decisions to be made in development economics (and elsewhere), and researchers and policymakers can't simply sit still and say they can't do anything because the data aren't fully persuasive. (Remember the first principle of decision analysis: Not making a decision is itself a decision.)

From the other direction, once you have an interesting quantitative finding, of course you want to understand it, and it makes sense to use all your storytelling skills here. The challenge is to go back and forth between the storytelling and the data. You find some interesting result (perhaps an observational data summary, perhaps an analysis of an experiment or natural experiment), this motivates a story, which in turn suggests some new hypotheses to be studied. Yu-Sung and I were just talking about this today in regard to our article on public opinion about school vouchers.

The question is: How do quantitative analysis and story time fit into the big picture? Mike McGovern writes that he wishes Paul Collier had been more modest in his causal claims, presenting his quantitative findings as "intriguing and counterintuitive correlations" and frankly recognizing that exploration of these correlations requires real-world understanding, not just the rhetoric of hard-headed empiricism.

I agree completely with McGovern--and I endeavor to follow this sort of modesty in presenting the implications of my own applied work--and I think it's a starting point for Collier and others. Once they recognize that, indeed, they are in story time, they can think harder about the empirical implications of their stories.

The trap of "identifiability"

As Ole Rogeberg writes (following up on ideas of James Heckman and others), the search for clean identification strategies in social research can be a trap, in that it can result in precise but irrelevant findings tied to broad but unsupported claims. Rogeberg has a theoretical model explaining how economists can be so rigorous in parts of their analysis and so unrigorous in others. Rogeberg sounds very much like McGovern when he writes:

The puzzle that we try to explain is this frequent disconnect between high-quality, sophisticated work in some dimensions, and almost incompetently argued claims about the real world on the other.

The virtue of description

Descriptive statistics is not just for losers. There is value in revealing patterns in observational data, correlations or predictions that were not known before. For example, political scientists were able to forecast presidential election outcomes using information available months ahead of time. This has implications about political campaigns--and no causal identification strategy was needed. Countries with United Nations peacekeeping take longer, on average, to revert to civil war, compared to similarly-situated countries without peacekeeping. A fact worth knowing, even before the storytelling starts. (Here's the link, which happens to also include another swipe at Paul Collier, this time from Bill Easterly.)

I'm not convinced by every correlation I see. For example, there was this claim that warming increases the risk of civil war in Africa. As I wrote at the time, I wanted to see the time series and the scatterplot. A key principle in applied statistics is that you should be able to connect between the raw data, your model, your methods, and your conclusions.

The role of models

In a discussion of McGovern's article, Chris Blattman writes:

Economists often take their models too seriously, and too far. Unfortunately, no one else takes them seriously enough. In social science, models are like maps; they are useful precisely because they don't explain the world exactly as it is, in all its gory detail. Economic theory and statistical evidence doesn't try to fit every case, but rather find systematic tendencies. We go wrong to ignore these regularities, but we also go wrong to ignore the other forces at work--especially the ones not so easily modeled with the mathematical tools at hand.

I generally agree with what Chris writes, but here I think he's a bit off by taking statistical evidence and throwing it in the same category as economic theory and models. My take-away from McGovern is that the statistical evidence of Collier et al. is fine; the problem is with the economic models which are used to extrapolate from the evidence to the policy recommendations. I'm sure Chris is right that economic models can be useful in forming and testing statistical hypotheses, but I think the evidence can commonly be assessed on its own terms. (This is related to my trick of understanding instrumental variables by directly summarizing the effect of the instrument on the treatment and the outcome without taking the next step and dividing the coefficients.)

To put it another way: I would separate the conceptually simple statistical models that are crucial to understanding evidence in any complex-data setting, from the economics (or, more generally, social science) models that are needed to apply empirical correlations to real-world decisions.

I'm involved (with Irv Garfinkel and others) in a planned survey of New York City residents. It's hard to reach people in the city--not everyone will answer their mail or phone, and you can't send an interviewer door-to-door in a locked apartment building. (I think it violates IRB to have a plan of pushing all the buzzers by the entrance and hoping someone will let you in.) So the plan is to use multiple modes, including phone, in-person household interviews, random street intercepts, and mail.

The question then is how to combine these samples. My suggested approach is to divide the population into poststrata based on various factors (age, ethnicity, family type, housing type, etc.), then to pool responses within each poststratum, then to run some regressions including poststrata and also indicators for mode, to understand how respondents from different modes differ, after controlling for the demographic/geographic adjustments.
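A toy version of that suggestion, with simulated data and made-up mode effects (none of this is from the actual survey): fit one regression with poststratum indicators and mode indicators, and read off how each mode differs from phone after the demographic adjustment.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical multi-mode survey: 8 poststrata (say, age x housing-type
# cells), four modes, and an outcome with made-up stratum and mode effects.
n, n_strata = 1200, 8
modes = ["phone", "in_person", "street", "mail"]
stratum = rng.integers(0, n_strata, n)
mode = rng.integers(0, len(modes), n)
stratum_fx = rng.normal(0, 1, n_strata)
mode_fx = np.array([0.0, 0.1, -0.1, 0.3])   # e.g. mail skews +0.3
y = stratum_fx[stratum] + mode_fx[mode] + rng.normal(0, 1, n)

# One regression: a dummy per poststratum plus mode indicators
# (phone as the baseline, so no separate overall intercept is needed).
X = np.zeros((n, n_strata + len(modes) - 1))
X[np.arange(n), stratum] = 1.0
for m in range(1, len(modes)):
    X[:, n_strata + m - 1] = (mode == m)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
mode_coefs = coef[n_strata:]   # in_person, street, mail vs. phone
print(dict(zip(modes[1:], mode_coefs.round(2))))
```

The mode coefficients recover (up to sampling noise) the differences that were built into the fake data, which is the quantity the combined analysis is after.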

Maybe this has already been done and written up somewhere?

P.S. As you try to do this sort of thing more carefully you run up against the sorts of issues discussed in the Struggles paper. So this is definitely statistical research, not merely an easy application of existing methods.

P.P.S. Cyrus has some comments, which for convenience I'll repost here:

It's interesting to consider this problem by combining a "finite population" perspective with some ideas about "principal strata" from the causal inference literature. Suppose a finite population U from which we draw a sample of N units. We have two modes of contact, A and B. Suppose for the moment that each unit can be characterized by one of the following response types (these are the "principal strata"):

Type | Mode A response | Mode B response
I    | Yes             | Yes
II   | Yes             | No
III  | No              | Yes
IV   | No              | No

Then, there are two cases to consider, depending on whether mode of contact affects response:

Mode of contact does not affect response

This might be a valid assumption if the questions of interest are not subject to social desirability biases, interviewer effects, etc. In this case, it is easy to define a target parameter as the average response in the population. You could proceed efficiently by first applying mode A to the sample, and then applying mode B to those who did not respond with mode A. At the end, you would have outcomes for types I, II, and III units, and you'd have an estimate of the rate of type IV units in the population. You could content yourself with an estimate for the average response on the type I, II, and III subpopulation. If you wanted to recover an estimate of the average response for the full population (including type IV's), you would effectively have to impute values for type IV respondents. This could be done by using auxiliary information either to genuinely impute or (in a manner that is pretty much equivalent) to determine which type I, II, or III units resemble the missing type IV units, and up-weight. In any case, if the response of interest has finite support, one could also compute "worst case" (Manski-type) bounds on the average response by imputing maximum and minimum values to type IV units.
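The worst-case bounds at the end of that paragraph are simple to compute. A sketch with made-up numbers (responses on a 0-1 scale, 15% of the population being type IV units unreachable by either mode):

```python
# Manski-style bounds: impute the support's minimum and maximum to the
# type IV (never-respondent) share of the population.
p_respond, mean_respond = 0.85, 0.62   # hypothetical observed share and mean
support_min, support_max = 0.0, 1.0

lower = p_respond * mean_respond + (1 - p_respond) * support_min
upper = p_respond * mean_respond + (1 - p_respond) * support_max
print(lower, upper)  # the population mean must lie in roughly [0.527, 0.677]
```

The width of the bounds is just the never-respondent share times the width of the response scale, so the bounds are informative only when that share is small.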

Mode of contact affects response

This might be relevant if, for example, the modes of contact are phone call versus face-to-face interview, and outcomes being measured vary depending on whether the respondent feels more or less exposed in the interview situation. This possibility makes things a lot trickier. In this case, each unit is characterized by a response under mode A and another under mode B (that is, two potential outcomes). One immediately faces a quandary of defining the target parameter. Is it the average of responses under the two modes of contact? Maybe it is some "latent" response that is imperfectly revealed under the two modes of contact? If so, how can we characterize this "imperfection"? Furthermore, only for type I individuals will you be able to obtain information on both potential responses. Does it make sense to restrict ourselves to this subpopulation? If not, then we would again face the need for imputation. A design that applied both mode A and mode B to the complete sample would mechanically reveal the proportion of type I units in the population, and by implication would identify the proportion of type II, III, and IV units. For type II units we could use mode A responses to improve imputations for mode B responses, and vice versa for type III respondents. Type IV respondents' contributions to our estimate of the "average response" would be based purely on auxiliary information. Again, one could construct worst case bounds by imputing maximum and minimum response values for each of the missing response types.

One wrinkle that I ignored above was that the order of modes of contact may affect either response behavior or outcomes reported. This multiplies the number of potential response behaviors and the number of potential outcome responses given that the unit is interviewed. You could get some way past these issues by randomizing the order of mode of contact--e.g. A then B for one half, and B then A for the other half. But you would have to impose some more assumptions to make use of this random assignment. E.g., you'd have to assume that A-then-B always-responders are exchangeable with B-then-A always-responders in order to combine the information from the always-responders in each half-sample. Or, you could "shift the goal posts" by saying that all you are interested in is the average of responses from modes A and B under the A-then-B design.



The above analysis did not explore how other types of assumptions might help to identify the population average. Andy's proposal to use post-stratification and regressions relies (according to my understanding) on the assumption that potential outcomes are independent of mode of contact conditional on covariates. Formally, if the mode of contact is M taking on values A or B, the potential outcome under mode of contact m is y(m), T is the principal stratum, and X is a covariate, then \left[y(A),y(B)\right] \perp M | T, X implies that

E(y(m)|T,X) = E(y(m)|M=m, T,X) = E(y(m)|M \ne m, T,X).

As discussed above, the design that applies modes A and B to all units in the sample can determine principal stratum membership, and so these covariate- and principal-stratum specific imputations can be applied. Ordering effects will again complicate things, and so more assumptions would be needed. A worthwhile type of analysis would be to study evidence of mode-of-contact as well as ordering effects among the type I (always-responder) units.

Now, it may be that mode of contact affects response but units are contacted via either mode A or B. Then, a unit's principal stratum membership is not identifiable, nor is the proportion of types I through IV identifiable (we would end up with two mixtures of responding and non-responding types, with no way to parse out relative proportions of the different types). If some kind of response "monotonicity" held, then that would help a little. Response monotonicity would mean that either type II or type III responders didn't exist. Otherwise, we would have to impose more stringent assumptions. The common one would be that principal stratum membership is independent of potential responses conditional on covariates. This is a classic "ignorable non-response" assumption, and it suffers from having no testable implications.

Ratio estimates are common in statistics. In survey sampling, the ratio estimate is when you use y/x to estimate Y/X (using the notation in which x,y are totals of sample measurements and X,Y are population totals).

In textbook sampling examples, the denominator X will be an all-positive variable, something that is easy to measure and is, ideally, close to proportional to Y. For example, X is last year's sales and Y is this year's sales, or X is the number of people in a cluster and Y is some count.

Ratio estimation doesn't work so well if X can be either positive or negative.

More generally we can consider any estimate of a ratio, with no need for a survey sampling context. The problem with estimating Y/X is that the very interpretation of Y/X can change completely if the sign of X changes.

Everything is ok for a point estimate: you get X.hat and Y.hat, you can take the ratio Y.hat/X.hat, no problem. But the inference falls apart if you have enough uncertainty in X.hat that you can't be sure of its sign.

This problem has been bugging me for a long time, and over the years I've encountered various examples in different fields of statistical theory, methods, and applications. Here I'll mention a few:
- LD50
- Ratio of regression coefficients
- Incremental cost-effectiveness ratio
- Instrumental variables
- Fieller-Creasy problem


LD50

We discuss this in section 3.7 of Bayesian Data Analysis. Consider a logistic regression model, Pr(y=1) = invlogit (a + bx), where x is the dose of a drug given to an animal and y=1 if the animal dies. The LD50 (lethal dose, 50%) is the value x for which Pr(y=1)=0.5. That is, a+bx=0, so x = -a/b. This is the value of x for which the logistic curve goes through 0.5 so there's a 50% chance of the animal dying.

The problem comes when there is enough uncertainty about b that its sign could be either positive or negative. If so, you get an extremely long-tailed distribution for the LD50, -a/b. How does this happen? Roughly speaking, the estimate for a has a normal dist, the estimate for b has a normal dist, so their ratio has a Cauchy-like dist, in which it can appear possible for the LD50 to take on values such as 100,000 or -300,000 or whatever. In a real example (such as in section 3.7 of BDA), these sorts of extreme values don't make sense.
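A quick simulation makes the point (the numbers here are invented, not the BDA example): draw normal estimates for a and b, once with the sign of b essentially certain and once with it in doubt, and compare the spread of the implied LD50 = -a/b.

```python
import numpy as np

rng = np.random.default_rng(2)

def ld50_draws(b_mean, b_se, a_mean=-1.0, a_se=0.5, n=100_000):
    # Normal sampling distributions for the estimates of a and b;
    # the implied LD50 is the ratio -a/b.
    a = rng.normal(a_mean, a_se, n)
    b = rng.normal(b_mean, b_se, n)
    return -a / b

certain = ld50_draws(b_mean=2.0, b_se=0.2)    # b almost surely positive
uncertain = ld50_draws(b_mean=0.5, b_se=0.4)  # sign of b in real doubt

# Central 95% intervals: tight in the first case, absurdly wide
# (Cauchy-like tails) in the second.
for draws in (certain, uncertain):
    print(np.percentile(draws, [2.5, 97.5]))
```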

The problem is that the LD50 has a completely different interpretation if b>0 than if b<0. If b>0, then x is the point at which any higher dose has a more than 50% chance of killing. If b<0, then any dose lower than x has a more than 50% chance to kill. The interpretation of the model changes completely. LD50 by itself is pretty pointless, if you don't know whether the curve goes up or down. And values such as LD50=100,000 are pretty meaningless in this case.

Ratio of regression coefficients

Here's an example. Political scientist Daniel Drezner pointed to a report by James Gwartney and Robert A. Lawson, who wrote:

Economic freedom is almost 50 times more effective than democracy in restraining nations from going to war. In new research published in this year's report [2005], Erik Gartzke, a political scientist from Columbia University, compares the impact of economic freedom on peace to that of democracy. When measures of both economic freedom and democracy are included in a statistical study, economic freedom is about 50 times more effective than democracy in diminishing violent conflict. The impact of economic freedom on whether states fight or have a military dispute is highly significant while democracy is not a statistically significant predictor of conflict.

What Gartzke did was run a regression and take the coefficient for economic freedom and divide it by the coefficient for democracy. Now I'm not knocking Gartzke's work, nor am I trying to make some smug slam on regression. I love regression and have used it for causal inference (or approximate causal inference) in my own work.

My only problem here is that ratio of 50. If beta.hat.1/beta.hat.2=50, you can bet that beta.hat.2 is not statistically significant. And, indeed, if you follow the link to Gartzke's chapter 2 of this report, you find this:

[Table from Gartzke's chapter: coefficient on economic freedom -0.567 (s.e. 0.179); coefficient on democracy score -0.011 (s.e. 0.065).]
The "almost 50" above is the ratio of the estimates -0.567 and -0.011. (567/11 is actually over 50, but I assume that you get something less than 50 if you keep all the significant figures in the original estimates.) In words, each unit on the economic freedom scale corresponds to a difference of 0.567 on the probability (or, in this case, I assume the logit probability) of a militarized interstate dispute, while a difference of one unit on the democracy score corresponds to a difference of 0.011 on the outcome.

A factor of 50 is a lot, no?

But now look at the standard errors. The coefficient for the democracy score is -0.011 +/- 0.065. So the data are easily consistent with a coefficient of -0.011, or 0.1, or -0.1. All of these are a lot less than 0.567. Even if we put the coef of economic freedom at the low end of its range in absolute value (say, 0.567 - 2*0.179 = 0.2) and put the coef of the democracy score at the high end (say, 0.011 + 2*0.065=0.14)--even then, the ratio is still 1.4, which ain't nothing. (Economic freedom and democracy score both seem to be defined roughly on a 1-10 scale, so it seems plausible to compare their coefficients directly without transformation.) So, in the context of Gartzke's statistical and causal model, his data are saying something about the relative importance of the two factors.

But, no, I don't buy the factor of 50. One way to see the problem is: what if the coef of democracy had been +0.011 instead of -0.011? Given the standard error, this sort of thing could easily have occurred. The implication would be that democracy is associated with more war. Could be possible. Would the statement then be that economic freedom is negative 50 times more effective than democracy in restraining nations from going to war??

Or what if the coef of democracy had been -0.001? Then you could say that economic freedom is 500 times as important as democracy in preventing war.

The problem is purely statistical. The ratio beta.1/beta.2 has a completely different meaning according to the signs of beta.1 and beta.2. Thus, if the sign of the denominator (or, for that matter, the numerator) is uncertain, the ratio is super-noisy and can be close to meaningless.
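You can see how noisy the ratio is by propagating the reported standard errors. This sketch treats the two estimates as independent normals, which is a simplification (the actual sampling distributions may be correlated); the point estimates and standard errors are the ones quoted above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Reported estimates and standard errors, treated as independent normals:
econ = rng.normal(-0.567, 0.179, n)  # coefficient on economic freedom
dem  = rng.normal(-0.011, 0.065, n)  # coefficient on democracy score

ratio = econ / dem

# The denominator's sign flips in a large share of draws,
# so the "factor of 50" is essentially undefined.
print(np.mean(dem > 0))                   # share of draws with a positive sign
print(np.percentile(ratio, [5, 50, 95]))  # wildly dispersed
```

Roughly 40% of the draws put the democracy coefficient on the wrong side of zero, and the simulated ratio swings between huge negative and huge positive values.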

Incremental cost-effectiveness ratio

Several years ago Dan Heitjan pointed me to some research on the problem of comparing two treatments that can vary on cost and efficacy.

Suppose the old treatment has cost C1 and efficacy E1, and the new treatment has cost C2 and efficacy E2. The incremental cost-effectiveness ratio is (C2-C1)/(E2-E1). In the usual scenario in which cost and efficacy both increase, we want this ratio to be low: the least additional cost per additional unit of efficacy.

Now suppose that C1,E1,C2,E2 are estimated from data, so that your estimated ratio is (C2.hat-C1.hat)/(E2.hat-E1.hat). No problem, right? No problem . . . as long as the signs of C2-C1 and E2-E1 are clear. But suppose the signs are uncertain--that could happen--so that we are not sure whether the new treatment is actually better, or whether it is actually more expensive.

Consider the four quadrants:
1. C2 > C1 and E2 > E1. The new treatment costs more and works better. The incremental cost-effectiveness ratio is positive, and we want it to be low.
2. C2 > C1 and E2 < E1. The new treatment costs more and works worse. The incremental cost-effectiveness ratio is negative, and the new treatment is worse no matter what.
3. C2 < C1 and E2 > E1. The new treatment costs less and works better! The incremental cost-effectiveness ratio is negative, and the new treatment is better no matter what.
4. C2 < C1 and E2 < E1. The new treatment costs less and works worse. The incremental cost-effectiveness ratio is positive, and we want it to be high (that is, a great gain in cost for only a small drop in efficacy).

Consider especially quadrants 1 and 4. An estimate or a confidence interval for the incremental cost-effectiveness ratio is meaningless if you don't know which quadrant you're in.
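The sign ambiguity is mechanical, as a toy classifier shows (the cost and efficacy numbers here are invented for illustration):

```python
def icer_quadrant(c1, e1, c2, e2):
    """Classify the cost-effectiveness comparison of a new treatment
    (c2, e2) against an old one (c1, e1). The same positive ICER means
    opposite things in quadrants 1 and 4. (Assumes e1 != e2.)"""
    dc, de = c2 - c1, e2 - e1
    icer = dc / de
    if dc > 0 and de > 0:
        return "1: costs more, works better (want ICER low)", icer
    if dc > 0 and de < 0:
        return "2: costs more, works worse (dominated)", icer
    if dc < 0 and de > 0:
        return "3: costs less, works better (dominant)", icer
    return "4: costs less, works worse (want ICER high)", icer

# Same ICER of 5000, opposite interpretations:
print(icer_quadrant(1000, 1.0, 2000, 1.2))  # quadrant 1
print(icer_quadrant(2000, 1.2, 1000, 1.0))  # quadrant 4
```

The two calls return the identical ratio, but in one case a low value is good and in the other a high value is good, so reporting the ratio without the quadrant tells you nothing.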

Here are the references for this one:

Heitjan, Daniel F., Moskowitz, Alan J. and Whang, William (1999). Bayesian estimation of cost-effectiveness ratios from clinical trials. Health Economics 8, 191-201.

Heitjan, Daniel F., Moskowitz, Alan J. and Whang, William (1999). Problems with interval estimation of the incremental cost-effectiveness ratio. Medical Decision Making 19, 9-15.

Instrumental variables

This is another ratio of regression coefficients. For a weak instrument, the denominator can be so uncertain that its sign could go either way. But if you can't get the sign right for the instrument, the ratio estimate doesn't mean anything. So, paradoxically, when you use a more careful procedure to compute uncertainty in an instrumental variables estimate, you can get huge uncertainty estimates that are inappropriate.

Fieller-Creasy problem

This is the name in classical statistics for the problem of estimating the ratio of two parameters that are identified with independent normally distributed data. It's sometimes referred to as the problem of the ratio of two normal means, but I think the above examples are more realistic.

Anyway, the Fieller-Creasy problem is notoriously difficult: how can you get an interval estimate with close to 95% coverage? The problem, again, is that there aren't really any examples where the ratio has any meaning if the denominator's sign is uncertain (at least, none that I know of; as always, I'm happy to be educated further by my correspondents). And all the statistical difficulties in inference here come from problems where the denominator's sign is uncertain.

So I think the Fieller-Creasy problem is a non-problem. Or, more to the point, a problem that there is no point in solving. Which is one reason it's so hard to solve (recall the folk theorem of statistical computing).

P.S. This all-statistics binge is pretty exhausting! Maybe this one can count as 2 or 3 entries?

When it rains it pours . . .

John Transue writes:

I saw a post on Andrew Sullivan's blog today about life expectancy in different US counties. With a bunch of the worst counties being in Mississippi, I thought that it might be another case of analysts getting extreme values from small counties.

However, the paper (see here) includes a pretty interesting methods section. This is from page 5, "Specifically, we used a mixed-effects Poisson regression with time, geospatial, and covariate components. Poisson regression fits count outcome variables, e.g., death counts, and is preferable to a logistic model because the latter is biased when an outcome is rare (occurring in less than 1% of observations)."

They have downloadable data. I believe that the data are predicted values from the model. A web appendix also gives 90% CIs for their estimates.

Do you think they solved the small county problem and that the worst counties really are where their spreadsheet suggests?

My reply:

I don't have a chance to look in detail but it sounds like they're on the right track. I like that they cross-validated; that's what we did to check we were ok with our county-level radon estimates.

Regarding your question about the small county problem: no matter what you do, all maps of parameter estimates are misleading. Even the best point estimates can't capture uncertainty. As noted above, cross-validation (at the level of the county, not of the individual observation) is a good way to keep checking.

Brendan Nyhan points me to this from Don Taylor:

Can national data be used to estimate state-level results? . . . A challenge is the fact that the sample size in many states is very small . . . Richard [Gonzales] used a regression approach to extrapolate this information to provide state-level estimates of support for health reform:
To get around the challenge presented by small sample sizes, the model presented here combines the benefits of incorporating auxiliary demographic information about the states with the hierarchical modeling approach commonly used in small area estimation. The model is designed to "shrink" estimates toward the average level of support in the region when there are few observations available, while simultaneously adjusting for the demographics and political ideology in the state. This approach therefore takes fuller advantage of all information available in the data to estimate state-level public opinion.

This is a great idea, and it is already being used all over the place in political science. For example, here. Or here. Or here.

See here for an overview article, "How should we estimate public opinion in the states?" by Jeff Lax and Justin Phillips.

It's good to see practical ideas being developed independently in different fields. I know that methods developed by public health researchers have been useful in political science, and I hope that in turn they can take advantage of the progress we've made in multilevel regression and poststratification.
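The "shrinkage" idea in the quoted passage can be sketched in a few lines. This is a toy precision-weighted version with invented numbers, not the full hierarchical regression with demographic adjustment that the quote describes:

```python
# Toy partial pooling: compromise between a state's raw mean and its
# region's mean, weighted by precision. sigma2 is the within-state
# sampling variance, tau2 the between-state variance (both invented).
def shrink(state_mean, n, region_mean, sigma2=0.25, tau2=0.01):
    w = (n / sigma2) / (n / sigma2 + 1 / tau2)
    return w * state_mean + (1 - w) * region_mean

region_mean = 0.50
print(shrink(0.70, n=10,   region_mean=region_mean))  # pulled toward 0.50
print(shrink(0.70, n=1000, region_mean=region_mean))  # stays near 0.70
```

A state with 10 respondents gets pulled most of the way toward the regional average, while a state with 1000 respondents keeps essentially its raw estimate, which is the behavior the quote describes.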

This is Many Bills, a visualization of US bills by IBM:



I learned about it a few days ago from Irene Ros at Foo Camp. It definitely looks better than my own analysis of US Senate bills.

In case you haven't been following:

- Top ten excuses for plagiarism

- Why I won't be sad to see Anthony Weiner retire

- U.S. voter participation has not fallen steadily over the past few decades

- Scott Adams had an interesting idea

Early this afternoon I made the plan to teach a new course on sampling, maybe next spring, with the primary audience being political science Ph.D. students (although I hope to get students from statistics, sociology, and other departments). Columbia already has a sampling course in the statistics department (which I taught for several years); this new course will be centered around political science questions. Maybe the students can start by downloading data from the National Election Studies and General Social Survey and running some regressions, then we can back up and discuss what is needed to go further.

About an hour after discussing this new course with my colleagues, I (coincidentally) received the following email from Mike Alvarez:

If you were putting together a reading list on sampling for a grad course, what would you say are the essential readings? I thought I'd ask you because I suspect you might have taught something along these lines.

I pointed Mike here and here.

To which Mike replied:

I wasn't too far off your approach to teaching this. I agree with your blog posts that the Groves et al. book is the best basic text to use on survey methodology that is currently out there. On sampling I have in the past relied on some sort of nonlinear combination of Kish and a Wiley text by Levy and Lemeshow, though that was unwieldy for students. I'll have to look more closely at Lohr, my impression of it when I glanced at it was like yours, that it sort of underrepresented some of the newer topics.

I think Lohr's book is great, but it might not be at quite the right level for political science students. I want something that is (a) more practical and (b) more focused on regression modeling rather than following the traditional survey sampling textbook approach of just going after the population mean. I like the Groves et al. book but it's more of a handbook than a textbook. Maybe I'll have to put together a set of articles. Also, I'm planning to do it all in R. Stata might make more sense but I don't know Stata.

Any other thoughts and recommendations would be appreciated.

Memorial Day question


When I was a kid they shifted a bunch of holidays to Monday. (Not all the holidays: they kept New Year's, Christmas, and July 4th on fixed dates, they kept Thanksgiving on a Thursday, and for some reason the shifted Veterans Day didn't stick. But they successfully moved Washington's Birthday, Memorial Day, and Columbus Day.)

It makes sense to give people a 3-day weekend. I have no idea why they picked Monday rather than Friday, but either one would do, I suppose.

My question is: if this Monday holiday thing was such a good idea, why did it take them so long to do it?

I'm a few weeks behind in my New Yorker reading and so just recently read this fascinating article by Ryan Lizza on the current administration's foreign policy. He gives some insights into Obama's transformation from antiwar candidate to a president conducting three wars.

Speaking as a statistician, though, what grabbed my eye was a doctrine of journalist/professor/policymaker Samantha Power. Lizza writes:

In 2002, after graduating from Harvard Law School, she wrote "A Problem from Hell," which surveyed the grim history of six genocides committed in the twentieth century. Propounding a liberal-interventionist view, Power argued that "mass killing" on the scale of Rwanda or Bosnia must be prevented by other nations, including the United States. She wrote that America and its allies rarely have perfect information about when a regime is about to commit genocide; a President, therefore, must have "a bias toward belief" that massacres are imminent.

From a statistical perspective, this sounds completely wrong! If you want to argue that it's a good idea to intervene, even if you're not sure, or if you want to argue that it's wise to intervene, even if the act of intervention will forestall the evidence for genocide that would be the motivation for intervention, that's fine. It's a cost-benefit analysis and it's best to lay out the costs and benefits as clearly as possible (within the constraints established by military and diplomatic secrecy). But to try to shade the probabilities to get the decision you want . . . that doesn't seem like a good idea at all!

To be fair, the above quote predates the Iraq WMD fiasco, our most notorious recent example of a "bias toward belief" that influenced policy. Perhaps Power has changed her mind on the virtues of biasing one's belief.

P.S. Samantha Power has been non-statistical before.

P.P.S. Just in case anyone wants to pull the discussion in a more theoretical direction: No, Power's (and, for that matter, Cheney's) "bias toward belief" is not simply a Bayesian prior. My point here is that she's constructing a belief system (a prior) based not on a model of what's happening or even on a subjective probability but rather on what she needs to get the outcome she wants. That's not Bayes. In Bayes, the prior and the utility function are separate.

I encountered this news article, "Chicago school bans some lunches brought from home":

At Little Village, most students must take the meals served in the cafeteria or go hungry or both. . . . students are not allowed to pack lunches from home. Unless they have a medical excuse, they must eat the food served in the cafeteria. . . . Such discussions over school lunches and healthy eating echo a larger national debate about the role government should play in individual food choices. "This is such a fundamental infringement on parental responsibility," said J. Justin Wilson, a senior researcher at the Washington-based Center for Consumer Freedom, which is partially funded by the food industry. . . . For many CPS parents, the idea of forbidding home-packed lunches would be unthinkable. . . .

If I had read this two years ago, I'd be at one with J. Justin Wilson and the outraged kids and parents. But last year we spent a sabbatical in Paris, where . . . kids aren't allowed to bring lunches to school. The kids who don't go home for lunch have to eat what's supplied by the lunch ladies in the cafeteria. And it's just fine. Actually, it was more than fine because we didn't have to prepare the kids' lunches every day. When school let out, the kids would run to the nearest boulangerie and get something sweet. So they didn't miss out on the junk food either.

I'm not saying the U.S. system or the French system is better, nor am I expressing an opinion on how they do things in Chicago. I just think it's funny how a rule which seems incredibly restrictive from one perspective is simply, for others, the way things are done. I'll try to remember this story next time I'm outraged at some intolerable violation of my rights.

P.S. If they'd had the no-lunches-from-home rule when I was a kid, I definitely would've snuck food into school. In high school the wait for lunchtime was interminable.

Ryan King writes:

This involves causal inference, hierarchical setup, small effect sizes (in absolute terms), and will doubtless be heavily reported in the media.

The article is by Manudeep Bhuller, Tarjei Havnes, Edwin Leuven, and Magne Mogstad and begins as follows:

Does internet use trigger sex crime? We use unique Norwegian data on crime and internet adoption to shed light on this question. A public program with limited funding rolled out broadband access points in 2000-2008, and provides plausibly exogenous variation in internet use. Our instrumental variables and fixed effect estimates show that internet use is associated with a substantial increase in reported incidences of rape and other sex crimes. We present a theoretical framework that highlights three mechanisms for how internet use may affect reported sex crime, namely a reporting effect, a matching effect on potential offenders and victims, and a direct effect on crime propensity. Our results indicate that the direct effect is non-negligible and positive, plausibly as a result of increased consumption of pornography.

How big is the effect?

About 15 years ago I ran across this book and read it, just for fun. Rhoads is a (nonquantitative) political scientist and he's writing about basic economic concepts such as opportunity cost, marginalism, and economic incentives. As he puts it, "welfare economics is concerned with anything any individual values enough to be willing to give something up for it."

The first two-thirds of the book is all about the "economist's view" (personally, I'd prefer to see it called the "quantitative view") of the world and how it applies to policy issues. The quick message, which I think is more generally accepted now than in the 1970s when Rhoads started working on this book, is that free-market processes can do better than governmental rules in allocating resources. Certain ideas that are obvious to quantitative people--for example, we want to reduce pollution and reduce the incentives to pollute, but it does not make sense to try to get the level of a pollutant all the way down to zero if the cost is prohibitively high--are not always so obvious to others. The final third of Rhoads's book discusses difficulties economists have had when trying to carry their dollar-based reasoning over to the public sector. He considers the logical tangles with the consumer-is-always-right philosophy and also discusses how economists sometimes lose credibility on topics where they are experts by pushing oversimplified ideas in non-market-based settings.

I like the book a lot. Very few readers will agree with Rhoads on all points but that isn't really the point. He explains the ideas and the historical background well, and the topics cover a wide range, from why it makes sense to tax employer-provided health insurance to various ways in which arguments about externalities have been used to motivate various silly (in his opinion, and mine) government subsidies. I also enjoyed the bits of political science that Rhoads tosses in throughout (for example, his serious discussion in chapter 11 of direct referenda, choosing representatives by lot, and various other naive proposals for political reform).

During the 25 years since the publication of Rhoads's book, much has changed in the relation between economics and public policy. Most notably, economists have stepped out of the shadows. No longer mere technicians, they are now active figures in the public debate. Paul Volcker, Alan Greenspan, and to a lesser extent Lawrence Summers have become celebrities in a way that has been rare among government economic officials. (Yes, Galbraith and Friedman were famous in an earlier era but as writers on economics. They were not actually pulling the levers of power at the time that they were economic celebrities.) And microeconomics, characterized by Rhoads as the ugly duckling of the field, has come into its own with Freakonomics and the rest.

Up until the financial crash of 2008--and even now, still--economists have been riding high. And they'd like to ride higher. For example, a few years ago economist Matthew Kahn asked why there aren't more economists in higher office--and I suspect many other prominent economists have thought the same thing. I looked up the numbers of economists in the employed population, and it turned out that they were in fact overrepresented in Congress. This is not to debate the merits of Kahn's argument--perhaps Congress would indeed be better if it included more economists--but rather to note that economists have moved from being a group with backroom influence to wanting more overt power.

So, with this as background, Rhoads's book is needed now more than ever. It's important for readers of all political persuasions to understand the power and generality of the economist's view. Rhoads's son Chris recently informed me that his father is at work on a second edition, so I pulled my well-worn copy of the first edition off the shelf. I hope the comments below will be useful during the preparation of the revision.

What follows is not intended as any sort of a review; it is merely a transcription and elaboration of the post-it notes that I put in, fifteen years ago, noting issues that I had. (In case you're wondering: yes, the notes are still sticky.)

- On page 102, Rhoads explains why economists think that price controls and minimum wage laws are bad for low-income Americans: "It is striking that there is almost no support for any of these price control measures even among the most equity-conscious economists. . . . The real issue is, in large measure, ignorance." This could be, but I'd also guess (although I haven't had a chance to check the numbers) that price controls and minimum wage are more popular among low-income than high-income voters. This does not exactly contradict Rhoads's claim--after all, poorer people might well be less well informed about economic principles--but it makes me wonder. The political scientist in me suspects that a policy that is supported by poorer people and opposed by richer people might well be a net benefit to people on the lower end of the economic distribution. Rhoads points out that there are more economically efficient forms of transfer--for example, direct cash payments to the poor--but that's not so relevant if such policies aren't about to be implemented because of political resistance.

Later on, Rhoads approvingly quotes an economist who writes, "Rent controls destroy incentives to maintain or rehabilitate property, and are thus an assured way to preserve slums." This may have sounded reasonable when it was written in 1970 but seems naive from a modern-day perspective. Sure, you want a good physical infrastructure, you don't want the pipes to break, etc., but what really makes a neighborhood a slum is crime. Rent control can give people a stake in their location (as with mortgage tax breaks, through the economic inefficiency of creating an incentive to not move). There might be better policies to encourage stability--or maybe increased turnover in dwellings is actually preferable--but the path from "incentives to maintain or rehabilitate property" and "slums" is far from clear.

- On page 139, Rhoads writes: "Most of the costs of business safety regulation fall on consumers." Again, this might be correct, but my impression is that the strongest opposition to these regulations comes from business operators, not from consumers. Much of this opposition perhaps arises from costs that are not easily measured in dollars: for example, filling out endless forms and worrying about rules and deadlines. This sort of paperwork load is a constant cost that is borne by managers, not consumers. Anyway, my point is the same as above: as a political scientist, I'm skeptical of the argument that consumers bear most of the costs, given that business operators are (I think) the ones who really oppose these regulations. I'm not arguing that any particular regulation is a good idea, just that it seems naive to take economists' somewhat ideologically-loaded claims at face value here.

- On page 217, Rhoads quotes an economics journalist who writes, "Through its tax laws, government can help create a climate for risk-taking. It ought to prey on the greed in human nature and the industriousness in the American character. Otherwise, stand aside." I have a few thoughts on these lines which perhaps sound a bit different now than in 1980 when they first appeared. Most obviously, a natural consequence of greed + industriousness is . . . theft. There's an even larger problem with this attitude, though, even setting aside moral hazard (those asymmetrical bets in which the banker gets rich if he wins but the taxpayer covers any loss). Even in a no-free-lunch environment in which risks are truly risky, why is "a climate for risk-taking" supposed to be a good thing? This seems a leap beyond the principles of economic efficiency that came in the previous chapters, and I have some further thoughts about this below.

- On page 20, Rhoads criticizes extreme safety laws and writes, "There would be nothing admirable about a society that watched the quality of its life steadily decline in hot pursuit of smaller and smaller increments of life extension." He was ahead of his time in considering this issue. Nowadays with health care costs crowding out everything else, we're all aware of this tradeoff as expressed, for example, in these graphs showing the U.S. spending twice as much on health as other countries with no benefit in life expectancy. It turned out, though, that the culprit was not safety laws but rather the tangled mixture of public and private care that we have in this country. This example suggests that the economist's view of the world can be a valuable perspective without always offering a clear direction for improvement.

Another example from Rhoads's book is nuclear power plants. Some economists argue on free-market grounds that the civilian nuclear industry should be left to fend for itself without further government support, while others argue on efficiency grounds that nuclear power is safe and clean and should be subsidized (see p. 230). Ultimately I agree with Rhoads that this comes down to costs and benefits (and I definitely think like an economist in that way), but in the meantime there is a clash of two fundamental principles: free markets on one side and efficiency on the other. (The economists who support nuclear power on efficiency grounds cannot simply rely on the free market because of existing market-distorting factors such as safety regulations, fossil fuel subsidies, and various complexities in the existing energy supply system.)

- Finally, when economists talk about fundamental principles, they often bring in their value judgments for free. For example, on page 168 Rhoads quotes an economics writer who doubts that "we need the government to subsidize high-brow entertainment--theater, ballet, opera and television drama . . . Let people decide for themselves whether they want to be entertained by the Pittsburgh Steelers or the local symphony." Well, sure, we definitely don't need subsidies for any of these things. The question is not of need but rather of discretionary spending, given that money is indeed being disbursed as part of the political process. But what I really wonder is: what does this guy (not Rhoads, but the writer he quotes) have against the local symphony? The Pittsburgh Steelers are already subsidized! (Everybody knows this. I just did a quick search on "pittsburgh steelers subsidy" and came across this blog by Skip Sauer with this line: "Three Rivers Stadium in Pittsburgh still was carrying $45 million in debt at the time of its demolition in 2001.")

I hope that in his revision, Rhoads will elaborate on the dominant perspectives of different social science fields. Crudely speaking, political scientists speak to princes, economists speak to business owners, and sociologists speak to community organizers. If we're not careful, we political scientists can drift into a "What should the government do?" attitude which presupposes that the government's goals are reasonable. Similarly, economists have their own cultural biases, such as preferring football to the symphony and, more importantly, viewing risk taking as a positive value in and of itself.

In summary, I think The Economist's View of the World is a great book and I look forward to the forthcoming second edition. I think it's extremely important to see the economist's perspective with its strengths and limitations in a single place.

I was checking the Dilbert blog (sorry! I was just curious what was up after the events of a few weeks ago) and saw this:

I had a couple of email exchanges with Jan-Emmanuel De Neve and James Fowler, two of the authors of the article on the gene that is associated with life satisfaction which we blogged the other day. (Bruno Frey, the third author of the article in question, is out of town according to his email.) Fowler also commented directly on the blog.

I won't go through all the details, but now I have a better sense of what's going on. (Thanks, Jan and James!) Here's my current understanding:

1. The original manuscript was divided into two parts: an article by De Neve alone published in the Journal of Human Genetics, and an article by De Neve, Fowler, Frey, and Nicholas Christakis submitted to Econometrica. The latter paper repeats the analysis from the Adolescent Health survey and also replicates with data from the Framingham heart study (hence Christakis's involvement).

The Framingham study measures a slightly different gene and uses a slightly different life-satisfaction question compared to the Adolescent Health survey, but De Neve et al. argue that they're close enough for the study to be considered a replication. I haven't tried to evaluate this particular claim but it seems plausible enough. They find an association with a p-value of exactly 0.05. That was close! (For some reason they don't control for ethnicity in their Framingham analysis--maybe that would pull the p-value to 0.051 or something like that?)

2. Their gene is correlated with life satisfaction in their data and the correlation is statistically significant. The key to getting statistical significance is to treat life satisfaction as a continuous response rather than to pull out the highest category and call it a binary variable. I have no problem with their choice; in general I prefer to treat ordered survey responses as continuous rather than discarding information by combining categories.

3. But given their choice of a continuous measure, I think it would be better for the researchers to stick with it and present results as points on the 1-5 scale. From their main regression analysis on the Adolescent Health data, they estimate the effect of having two (compared to zero) "good" alleles as 0.12 (+/- 0.05) on a 1-5 scale. That's what I think they should report, rather than trying to use simulation to wrestle this into a claim about the probability of describing oneself as "very satisfied."

They claim that having the two alleles increases the probability of describing oneself as "very satisfied" by 17%. That's not 17 percentage points, it's 17%, thus increasing the probability from 41% to 1.17*41% = 48%. This isn't quite the 46% that's in the data but I suppose the extra 2% comes from the regression adjustment. Still, I don't see this as so helpful. I think they'd be better off simply describing the estimated improvement as 0.1 on a 1-5 scale. If you really really want to describe the result for a particular category, I prefer percentage points rather than percentages.
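The percent-versus-percentage-point distinction is easy to check with a line of arithmetic. A sketch of the conversion described above, using the 41% baseline from the data:

```python
# Relative (%) vs. absolute (percentage-point) change in a probability.
baseline = 0.41           # share "very satisfied" with zero "good" alleles
relative_increase = 0.17  # the paper's 17% is relative, not percentage points

new_prob = baseline * (1 + relative_increase)
print(round(new_prob, 2))             # about 0.48
print(round(new_prob - baseline, 2))  # an absolute rise of about 7 points
```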

4. Another advantage of describing the result as 0.1 on a 1-5 scale is that it is more consistent with the intuitive notion of 1% of variance explained. It's good they have this 1% in their article--I should present such R-squared summaries in my own work, to give a perspective on the sizes of the effects that I find.

5. I suspect the estimated effect of 0.1 is an overestimate. I say this for the usual reason, discussed often on this blog, that statistically significant findings, by their very nature, tend to be overestimates. I've sometimes called this the statistical significance filter, although "hurdle" might be a more appropriate term.

6. Along with the 17% number comes a claim that having one allele gives an 8% increase. 8% is half of 17% (subject to rounding) and, indeed, their estimate for the one-allele case comes from their fitted linear model. That's fine--but the data aren't really informative about the one-allele case! I mean, sure, the data are perfectly consistent with the linear model, but the nature of leverage is such that you really don't get a good estimate on the curvature of the dose-response function. (See my 2000 Biostatistics paper for a general review of this point.) The one-allele estimate is entirely model-based. It's fine, but I'd much prefer simply giving the two-allele estimate and then saying that the data are consistent with a linear model, rather than presenting the one-allele estimate as a separate number.

7. The news reports were indeed horribly exaggerated. No fault of the authors but still something to worry about. The Independent's article was titled, "Discovered: the genetic secret of a happy life," and the Telegraph's was not much better: "A 'happiness gene' which has a strong influence on how satisfied people are with their lives, has been discovered." An effect of 0.1 on a 1-5 scale: an influence, sure, but a "strong" influence?

8. There was some confusion with conditional probabilities that made its way into the reports as well. From the Telegraph:

The results showed that a much higher proportion of those with the efficient (long-long) version of the gene were either very satisfied (35 per cent) or satisfied (34 per cent) with their life - compared to 19 per cent in both categories for those with the less efficient (short-short) form.

After looking at the articles carefully and having an email exchange with De Neve, I can assure you that the above quote is indeed wrong, which is really too bad because it was an attempted correction of an earlier mistake. The correct numbers are not 35, 34, 19, 19. Rather, they are 41, 46, 37, 44. A much less dramatic difference: changes of 4 and 2 percentage points rather than 16 and 15. The Telegraph reporter was giving P(gene|happiness) rather than P(happiness|gene). What seems to have happened is that he misread Figure 2 in the Human Genetics paper. He then may have got stuck on the wrong track by expecting to see a difference of 17%.
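The two directions of conditioning are easy to mix up when reading a two-way table. Here's a toy illustration; the counts below are hypothetical, chosen only to roughly match the proportions discussed in this post, not taken from the paper:

```python
# P(happiness | gene) vs. P(gene | happiness) from one contingency table.
table = {                       # rows: genotype; columns: survey response
    "two_alleles":  {"very_satisfied": 350, "other": 510},
    "zero_alleles": {"very_satisfied": 190, "other": 320},
}

# Condition on genotype, as the paper does: P(very satisfied | gene).
for gene, row in table.items():
    print(gene, round(row["very_satisfied"] / sum(row.values()), 2))

# Condition on response, as the reporter did: P(gene | very satisfied).
total_vs = sum(row["very_satisfied"] for row in table.values())
for gene, row in table.items():
    print(gene, round(row["very_satisfied"] / total_vs, 2))
```

The two loops print different numbers from the same table, which is exactly the kind of swap that can turn a 4-point gap into an 18-point one.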

9. The abstract for the Human Genetics paper reports a p-value of 0.01. But the baseline model (Model 1 in Table V of the Econometrica paper) reports a p-value of 0.02. The lower p-values are obtained by models that control for a big pile of intermediate outcomes.

10. In section 3 of the Econometrica paper, they compare identical to fraternal twins (from the Adolescent Health survey, it appears) and estimate that 33% of the variation in reported life satisfaction is explained by genes. As they say, this is roughly consistent with estimates of 50% or so from the literature. I bet their 33% has a big standard error, though: one clue is that the difference in correlations between identical and fraternal twins is barely statistically significant (at the 0.03 level, or, as they quaintly put it, 0.032). They also estimate 0% of the variation to be due to common environment, but again that 0% is gonna be a point estimate with a huge standard error.

I'm not saying that their twin analysis is wrong. To me the point of these estimates is to show that the Adolescent Health data are consistent with the literature on genes and happiness, thus supporting the decision to move on with the rest of their study. I don't take their point estimates of 33% and 0% seriously but it's good to know that the twin results go in the expected direction.

11. One thing that puzzles me is why De Neve et al. only studied one gene. I understand that this is the gene that they expected to relate to happiness and life satisfaction, but . . . given that it only explains 1% of the variation, there must be hundreds or thousands of genes involved. Why not look at lots and lots? At the very least, the distribution of estimates over a large sample of genes would give some sense of the variation that might be expected. I can't see the point of looking at just one gene, unless cost is a concern. Are other gene variants already recorded for the Adolescent Health and Framingham participants?

12. My struggles (and the news reporters' larger struggles) with the numbers in these articles make me feel, even more strongly than before, the need for a suite of statistical methods for building from simple comparisons to more complicated regressions. (In case you're reading this, Bob and Matt3, I'm talking about the network of models.)

As researchers, transparency should be our goal. This is sometimes hindered by scientific journals' policies of brevity. You can end up having to remove lots of the details that make a result understandable.

13. De Neve concludes the Human Genetics article as follows:

There is no single "happiness gene." Instead, there is likely to be a set of genes whose expression, in combination with environmental factors, influences subjective well-being.

I would go even further. Accepting their claim that between one-third and one-half of the variation in happiness and life satisfaction is determined by genes, and accepting their estimate that this one gene explains as much as 1% of the variation, and considering that this gene was their #1 candidate (or at least a top contender) for the "happiness gene" . . . my guess is that the set of genes that influence subjective well-being is a very large number indeed! The above disclaimer doesn't seem disclaimery-enough to me, in that it seems to leave open the possibility that this "set of genes" might be just three or four. Hundreds or thousands seems more like it.

I'm reminded of the recent analysis that found that the simple approach of predicting child's height using a regression model given parents' average height performs much better than a method based on combining 54 genes.

14. Again, I'm not trying to present this as any sort of debunking, merely trying to fit these claims in with the rest of my understanding. I think it's great when social scientists and public health researchers can work together on this sort of study. I'm sure that in a couple of decades we'll have a much better understanding of genes and subjective well-being, but you have to start somewhere. This is a clean study that can be the basis for future research.

Hmmm . . . .could I publish this as a letter in the Journal of Human Genetics? Probably not, unfortunately.

P.S. You could do this all yourself! This and my earlier blog on the happiness gene study required no special knowledge of subject matter or statistics. All I did was tenaciously follow the numbers and pull and pull until I could see where all the claims were coming from. A statistics student, or even a journalist with a few spare hours, could do just as well. (Why I had a few spare hours to do this is another question. The higher procrastination, I call it.) I probably could've done better with some prior knowledge--I know next to nothing about genetics and not much about happiness surveys either--but I could get pretty far just tracking down the statistics (and, as noted, without any goal of debunking or any need to make a grand statement).

P.P.S. See comments for further background from De Neve and Fowler!

I took the above headline from a news article in the (London) Independent by Jeremy Laurance reporting a study by Jan-Emmanuel De Neve, James Fowler, and Bruno Frey that reportedly just appeared in the Journal of Human Genetics.

One of the pleasures of blogging is that I can go beyond the usual journalistic approaches to such a story: (a) puffing it, (b) debunking it, (c) reporting it completely flatly. Even convex combinations of (a), (b), (c) do not allow what I'd like to do, which is to explore the claims and follow wherever my exploration takes me. (And one of the pleasures of building my own audience is that I don't need to endlessly explain background detail as was needed on a general-public site such as 538.)

OK, back to the genetic secret of a happy life. Or, in the words of the authors of the study, a gene that "explains less than one percent of the variation in life satisfaction."

"The genetic secret" or "less than one percent of the variation"?

Perhaps the secret of a happy life is in that one percent??

I can't find a link to the journal article which appears based on the listing on De Neve's webpage to be single-authored, but I did find this Googledocs link to a technical report from January 2010 that seems to have all the content. Regular readers of this blog will be familiar with earlier interesting research of Fowler and Frey working separately; I had no idea that they have been collaborating.

De Neve et al. took responses to a question on life satisfaction from a survey that was linked to genetic samples. They looked at a gene called 5HTT which, according to their literature review, has been believed to be associated with happy feelings.

I haven't taken a biology class since 9th grade, so I'll give a simplified version of the genetics. You can have either 0, 1, or 2 alleles of the gene in question. Of the people in the sample, 20% have 0 alleles, 45% have 1 allele, and 35% have 2. The more alleles you have, the happier you'll be (on average): The percentage of respondents describing themselves as "very satisfied" with their lives is 37% for people with 0 alleles, 38% for those with one allele, and 41% for those with two alleles.

The key comparison here comes from the two extremes: 2 alleles vs. 0. People with 2 alleles are 4 percentage points (more precisely, 3.6 percentage points) more likely to report themselves as very satisfied with their lives. The standard error of this difference in proportions is sqrt(.41*(1-.41)/862+.37*(1-.37)/509) = 0.027, so the difference is not statistically significant at a conventional level.
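That standard-error calculation can be checked directly. This is the usual formula for the difference of two independent proportions, with the group sizes from the text:

```python
from math import sqrt

p2, n2 = 0.41, 862  # "very satisfied" share and sample size, 2-allele group
p0, n0 = 0.37, 509  # same, 0-allele group

diff = p2 - p0
se = sqrt(p2 * (1 - p2) / n2 + p0 * (1 - p0) / n0)
print(round(se, 3))      # 0.027
print(diff / se < 1.96)  # True: not significant at the conventional 5% level
```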

But in their abstract, De Neve et al. reported the following:

Having one or two alleles . . . raises the average likelihood of being very satisfied with one's life by 8.5% and 17.3%, respectively.

How did they get from a non-significant difference of 4% (I can't bring myself to write "3.6%" given my aversion to fractional percentage points) to a statistically significant 17.3%?

A few numbers that I can't figure out at all!

Here's the summary from Stephen Adams, medical correspondent of the Daily Telegraph:

The researchers found that 69 per cent of people who had two copies of the gene said they were either satisfied (34) or very satisfied (35) with their life as a whole.

But among those who had no copy of the gene, the proportion who gave either of these answers was only 38 per cent (19 per cent 'very satisfied' and 19 per cent 'satisfied').

This leaves me even more confused! According to the table on page 21 of the De Neve et al. article, 46% of people who had two copies of the gene described themselves as satisfied and 41% described themselves as very satisfied. The corresponding percentages for those with no copies were 44% and 37%.

I suppose the most likely explanation is that Stephen Adams just made a mistake, but it's no ordinary confusion because his numbers are so specific. Then again, I could just be missing something big here. I'll email Fowler for clarification but I'll post this for now so you loyal blog readers can see error correction (of one sort or another) in real time.

Where did the 17% come from?

OK, so setting Stephen Adams aside, how can we get from a non-significant 4% to a significant 17%?

- My first try is to use the numerical life-satisfaction measure. Average satisfaction on a 1-5 scale is 4.09 for the 0-allele people in this sample and 4.25 for the 2-allele people, and the difference has a standard error of 0.05. Hey--a difference of 0.16 with a standard error of 0.05--that's statistically significant! So it doesn't seem just like a fluctuation in the data.

- The main analysis of De Neve et al., reported in their Table 1, appears to be a least-squares regression of well-being (on that 1-5 scale), using the number of alleles as a predictor and also throwing in some controls for ethnicity, sex, age, and some other variables. They include error terms for individuals and families but don't seem to report the relative sizes of the errors. In any case, the controls don't seem to do much. Their basic result (Model 1, not controlling for variables such as marital status which might be considered as intermediate outcomes of the gene) yields a coefficient estimate of 0.06.

They then write, "we summarize the results for 5HTT by simulating first differences from the coefficient covariance matrix of Model 1. Holding all else constant and changing the 5HTT gene of all subjects from zero to one long allele would increase the reporting of being very satisfied with one's life in this population by about 8.5%." Huh? I completely don't understand this. It looks to me that the analyses in Table 1 are regressions on the 1-5 scale. So how can they transfer these to claims about "the reporting of being very satisfied"? Also, if it's just least squares, why do they need to work with the covariance matrix? Why can't they just look at the coefficient itself?

- They report (in Table 5) that whites have higher life satisfaction responses than blacks but lower numbers of alleles, on average. So controlling for ethnicity should increase the coefficient. I still can't see it going all the way from 4% to 17%. But maybe this is just a poverty of my intuition.

- OK, I'm still confused and have no idea where the 17% could be coming from. All I can think of is that the difference between 0 alleles and 2 alleles corresponds to an average difference of 0.16 in happiness on that 1-5 scale. And 0.16 is practically 17%, so maybe when you control for things the number jumps around a bit. Perhaps the result of their "first difference" calculations was somehow to carry that 0.16 or 0.17 and attribute it to the "very satisfied" category?

1% of variance explained

One more thing . . . that 1% quote. Remember? "the 5HTT gene explains less than one percent of the variation in life satisfaction." This is from page 14 of the De Neve, Fowler, and Frey article. 1%? How can we understand this?

Let's do a quick variance calculation:

- Mean and sd of life satisfaction responses (on the 1-5 scale) among people with 0 alleles: 4.09 and 0.8
- Mean and sd of life satisfaction responses (on the 1-5 scale) among people with 2 alleles: 4.25 and 0.8
- The difference is 0.16 so the explained variance is (0.16/2)^2 = 0.08^2
- Finally, R-squared is explained variance divided by total variance: (0.08/0.8)^2 = 0.01.
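The bullet-point calculation above, spelled out:

```python
mean0, mean2, sd = 4.09, 4.25, 0.8  # numbers from the list above

explained_sd = (mean2 - mean0) / 2  # 0.08: half the 0-vs-2-allele gap
r_squared = (explained_sd / sd) ** 2
print(round(r_squared, 2))          # 0.01, i.e. 1% of variance explained
```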

A difference of 0.16 on a 1-5 scale ain't nothing (it's approximately the same as the average difference in life satisfaction, comparing whites and blacks), especially given that most people are in the 4 and 5 categories. But it only represents 1% of the variance in the data. It's hard for me to hold these two facts in my head at the same time. The quick answer is that the denominator of the R-squared--the 0.8--contains lots of individual variation, including variation in the survey response. Still, 1% is such a small number. No surprise it didn't make it into the newspaper headline . . .

Here's another story of R-squared = 1%. Consider a 0/1 outcome with about half the people in each category. For example, half the people with some disease die in a year and half live. Now suppose there's a treatment that increases the survival rate from 50% to 60%. The unexplained sd is 0.5 and the explained sd is 0.05, hence R-squared is, again, 0.01.
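The same arithmetic for the 0/1 example, as a quick sketch; here the total sd is computed exactly as sqrt(0.55 * 0.45), which comes out close to the 0.5 used above:

```python
from math import sqrt

p_control, p_treated = 0.50, 0.60  # survival rates in the two halves
overall = (p_control + p_treated) / 2

explained_sd = (p_treated - p_control) / 2  # 0.05
total_sd = sqrt(overall * (1 - overall))    # ~0.497, roughly 0.5
r_squared = (explained_sd / total_sd) ** 2
print(round(r_squared, 2))                  # again 0.01
```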

Summary (for now):

I don't know where the 17% came from. I'll email James Fowler and see what he says. I'm also wondering about that Daily Telegraph article but it's usually not so easy to reach newspaper journalists so I'll let that one go for now.

P.S. According to his website, Fowler was named the most original thinker of the year by The McLaughlin Group. On the other hand, our sister blog won an award by the same organization that honored Peggy Noonan. So I'd call that a tie!

P.P.S. Their data come from the National Survey of Adolescent Health, which for some reason is officially called "Add Health." Shouldn't that be "Ad Health" or maybe "Ado Health"? I'm confused where the extra "d" is coming from.

P.P.P.S. De Neve et al. note that the survey did not actually ask about happiness, only about life satisfaction. We all know people who appear satisfied with their lives but don't seem so happy, but the presumption is that, in general, things associated with more life satisfaction are also associated with happiness. The authors also remark upon the limitations using a sample of adolescents to study life satisfaction. Not their fault--as is appropriate, they use the data they have and then discuss the limitations of their analysis.

P.P.P.P.S. De Neve and Fowler have a related paper with a nice direct title, "The MAOA Gene Predicts Credit Card Debt." This one, also from Add Health, reports: "Having one or both MAOA alleles of the low efficiency type raises the average likelihood of having credit card debt by 14%." For some reason I was having difficulty downloading the pdf file (sorry, I have a Windows machine!) so I don't know how to interpret the 14%. I don't know if they've looked at credit card debt and life satisfaction together. Being in debt seems unsatisfying; on the other hand you could go in debt to buy things that give you satisfaction, so it's not clear to me what to expect here.

P.P.P.P.P.S. I'm glad Don Rubin didn't read the above-linked article. Footnote 9 would probably make him barf.

P.P.P.P.P.P.S. Just to be clear: The above is not intended to be a "debunking" of the research of De Neve, Fowler, and Frey. It's certainly plausible that this gene could be linked to reported life satisfaction (maybe, for example, it influences the way that people respond to survey questions). I'm just trying to figure out what's going on, and, as a statistician, it's natural for me to start with the numbers.

P.^7S. James Fowler explains some of the confusion in a long comment.

Much-honored playwright Tony Kushner was set to receive one more honor--a degree from John Jay College--but it was suddenly taken away from him on an 11-1 vote of the trustees of the City University of New York. This was the first rejection of an honorary degree nomination since 1961.

The news article focuses on one trustee, Jeffrey Wiesenfeld, an investment adviser and onetime political aide, who opposed Kushner's honorary degree, but to me the relevant point is that the committee as a whole voted 11-1 to ding him.

Kushner said, "I'm sickened that this is happening in New York City. Shocked, really." I can see why he's shocked, but perhaps it's not so surprising that it's happening in NYC. Recall the famous incident from 1940 in which Bertrand Russell was invited and then uninvited to teach at City College. The problem that time was Russell's views on free love (as they called it back then). There seems to be a long tradition of city college officials being willing to risk controversy to make a political point.

P.S. I was trying to imagine what these 11 trustees could've been thinking . . . my guess is it was some sort of group-dynamics thing. They started talking about it and convinced each other that the best thing to do would be to set Kushner's nomination aside. I bet if they'd had to decide separately most of them wouldn't have come to this conclusion. And I wouldn't be surprised if, five minutes after walking away from that meeting, most of those board members suddenly thought, Uh oh--we screwed up on this one! As cognitive psychologists have found, this is one of the problems with small-group deliberation: a group of people can be led to a decision which is not anywhere near the center of their positions considered separately.

A friend asks the above question and writes:

This article left me thinking - how could the IRS not notice that this guy didn't file taxes for several years? Don't they run checks and notice if you miss a year? If I write a check out of order, there's an asterisk next to the check number in my next bank statement showing that there was a gap in the sequence.

If you ran the IRS, wouldn't you do this: SSNs are issued sequentially. Once a SSN reaches 18, expect it to file a return. If it doesn't, mail out a postage paid letter asking why not with check boxes such as Student, Unemployed, etc. Follow up at reasonable intervals. Eventually every SSN should be filing a return, or have an international address. Yes this is intrusive, but my goal is only to maximize tax revenue. Surely people who do this for a living could come up with something more elegant.

My response:

I dunno, maybe some confidentiality rules? The other thing is that I'm guessing the IRS gets lots of pushback when they hassle rich and influential people. I'm sure it's much less effort for them to go after the little guy, even though that's less cost-effective. And behind this is a lack of societal consensus that the IRS are the good guys. They're enforcing a law that something like a third of the people oppose! But I agree: given that we need taxes, I think we should go after the cheats.

Perhaps some informed readers out there can supply more context.

John Sides followed up on a discussion of his earlier claim that political independents vote for president in a reasonable way based on economic performance. John's original post led to the amazing claim by New Republic writer Jonathan Chait that John wouldn't "even want to be friends with anybody who" voted in this manner.

I've been sensitive to discussions of rationality and voting ever since Aaron Edlin, Noah Kaplan, and I wrote our article on voting as a rational choice: why and how people vote to improve the well-being of others.

Models of rationality are controversial in politics, just as they are in other fields ranging from economics to criminology. On one side you have people trying to argue that all behavior is rational, from lottery playing to drug addiction to engaging in email with exiled Nigerian royalty. Probably the only behavior that nobody has yet claimed is rational is blogging, but I bet that's coming too. From the other direction, lots of people point to strong evidence of subject-matter ignorance in all fields ranging from demography to the Federal budget to demonstrate that, even if voters think they're being rational, they can't be making reasoned decisions in any clear sense.

Here's what I want to add. In the usual debates, people argue about whether a behavior is rational or not. Or, at a more sophisticated level, people might dispute how rational or irrational a given action is. But I don't think this is the right way of thinking about it.

People have many overlapping reasons for anything they do. For a behavior to be "rational" does not mean that a person does it as the result of a reasoned argument but rather that some aspects of that behavior could be modeled as such. This comes up in section 5.2 of my article with Edlin and Kaplan: To model a behavior as rational does not compete with more traditional psychological explanations; it reinforces them.

For example, voter turnout is higher in elections that are anticipated to be close. This has a rational explanation--if an election is close, it's more likely that you will cast the deciding vote--and also a process explanation: if an election is close, candidates will campaign harder, more people will talk about the election, and a voter is more likely to want to be part of the big story. These two explanations work together, they don't compete: it's rational for you to vote, and it's also rational for the campaigns to try to get you to vote, to make the race more interesting to increase your motivation level.

I don't anticipate that this note will resolve some of the debates about participation of independents in politics but I hope that this clarifies some of the concerns about the "rationality" label.

P.S. John is better at engaging journalists than I am. When Chait wrote something that I didn't like and then responded to my response, I grabbed on a key point in his response and emphasized our agreement, thus ending the debate (such as it was), rather than emphasizing our remaining points of disagreement. John is better at keeping the discussion alive.

I was invited by the Columbia University residence halls to speak at an event on gay marriage. (I've assisted my colleagues Jeff Lax and Justin Phillips in their research on the topic.) The event sounded fun--unfortunately I'll be out of town that weekend so can't make it--but it got me thinking about how gay marriage and other social issues are so relaxing to think about because there's no need for doubt.

About half of Americans support same-sex marriage and about half oppose it. And the funny thing is, you can be absolutely certain in your conviction, from either direction. If you support, it's a simple matter of human rights, and it's a bit ridiculous to suppose that if gay marriage is allowed, it will somehow wreck all the straight marriages out there. Conversely, you can oppose on the clear rationale of wanting to keep marriage the same as it's always been, and suggest that same-sex couples can be free to get together outside of marriage, as they always could. (Hey, it was good enough for Abraham Lincoln and his law partner!)

In contrast, the difficulty of expressing opinions about the economy, or about foreign policy, is that you have to realize at some level that you might be wrong.

For example, even Paul Krugman must occasionally wonder whether maybe the U.S. can't really afford another trillion dollars of debt, and even William Beach (he of the 2.8% unemployment rate forecast, later updated to a still-implausible point forecast of 4.3%) must occasionally wonder whether massive budget cuts will really send the economy into nirvana.

Similarly, even John McCain must wonder on occasion whether it would've been better to withdraw from Iraq in 2003, or 2004, or 2005. And even a firm opponent of the war such as the Barack Obama of early 2008 must have occasionally thought that maybe the invasion wasn't such a bad idea on balance.

I don't really have anything more to say on this. I just think it's interesting how there can be so much more feeling of certainty about social policy.

Asymmetry in Political Bias


Tyler Cowen points to an article by Riccardo Puglisi, who writes:

Controlling for the activity of the incumbent president and the U.S. Congress across issues, I find that during a presidential campaign, The New York Times gives more emphasis to topics on which the Democratic party is perceived as more competent (civil rights, health care, labor and social welfare) when the incumbent president is a Republican. This is consistent with the hypothesis that The New York Times has a Democratic partisanship, with some "anti-incumbent" aspects . . . consistent with The New York Times departing from demand-driven news coverage.

I haven't read the article in question but the claim seems plausible to me. I've often thought there is an asymmetry in media bias, with Democratic reporters--a survey a few years ago found that twice as many journalists identify as Democrats as Republicans--biasing their reporting by choosing which topics to focus on, and Republican news organizations (notably Fox News and other Murdoch organizations) biasing in the other direction by flat-out attacks.

I've never been clear on which sort of bias is more effective. On one hand, Fox can create a media buzz out of nothing at all; on the other hand, perhaps there's something more insidious about objective news organizations indirectly creating bias by their choice of what to report.

But I've long thought that this asymmetry should inform how media bias is studied. It can't be a simple matter of counting stories or references to experts and saying that Fox is more biased or the Washington Post is more biased or whatever. Some of the previous studies in this area are interesting but to me don't get at either of the fundamental sorts of bias mentioned above. You have to look for bias in different ways to capture these multiple dimensions. Based on the abstract quoted above, Puglisi may be on to something; maybe this could be a useful start to getting at the big picture.

Tyler Cowen asks what is the ultimate left-wing novel? He comes up with John Steinbeck and refers us to this list by Josh Leach that includes social-realist novels from around 1900. But Cowen is looking for something more "analytically or philosophically comprehensive."

My vote for the ultimate left-wing novel is 1984. The story and the political philosophy fit together well, and it's also widely read (which is an important part of being the "ultimate" novel of any sort, I think; it wouldn't do to choose something too obscure). Or maybe Gulliver's Travels, but I've never actually read that, so I don't know if it qualifies as being left-wing. Certainly you can't get much more political than 1984, and I don't think you can get much more left-wing either. (If you get any more left-wing than that, you start to loop around the circle and become right-wing. For example, I don't think that a novel extolling the brilliance of Stalin or Mao would be considered left-wing in a modern context.)

Native Son (also on Leach's list) seems like another good choice to me, but I'm sticking with 1984 as being more purely political. For something more recent you could consider something such as What a Carve Up by Jonathan Coe.

P.S. Cowen's correspondent wrote that "the book needs to do two things: justify the welfare state and argue the limitations of the invisible hand." But I don't see either of these as particularly left-wing. Unless you want to argue that Bismarck was a left-winger.

P.P.S. Commenters suggest Uncle Tom's Cabin and Les Miserables. Good choices: they're big novels, politically influential, and left-wing. There's probably stuff by Zola etc. too. I still stand by 1984. Orwell was left-wing and 1984 was his novel. I think the case for 1984 as a left-wing novel is pretty iron-clad.

These are based on raw Pew data, reweighted to adjust for voter turnout by state, income, and ethnicity. No modeling of vote on age, education, and ethnicity.


I think our future estimates based on the 9-way model will be better, but these are basically OK, I think. All but six of the dots in the graph are based on sample sizes greater than 30.

I published these last year but they're still relevant, I think. There's lots of confusion when it comes to education and voting.

Catherine Rampell highlights this stunning Gallup Poll result:

6 percent of Americans in households earning over $250,000 a year think their taxes are "too low." Of that same group, 26 percent said their taxes were "about right," and a whopping 67 percent said their taxes were "too high."

OK, fine. Most people don't like taxes. No surprise there. But get this next part:

And yet when this same group of high earners was asked whether "upper-income people" paid their fair share in taxes, 30 percent said "upper-income people" paid too little, 30 percent said it was a "fair share," and 38 percent said it was too much.

30 percent of these upper-income people say that upper-income people pay too little, but only 6 percent say that they personally pay too little. 38 percent say that upper-income people pay too much, but 67 percent say they personally pay too much.

Leslie McCall spoke in the sociology department here the other day to discuss changes in attitudes about income inequality as well as changes in attitudes about attitudes about income inequality. (That is, she talked about what survey respondents say, and she talked about what scholars have said about what survey respondents say.)

On the plus side, the talk was interesting. On the downside, I had to leave right at the start of the discussion so I didn't have a chance to ask my questions. So I'm placing them below.

I can't find a copy of McCall's slides so I'll link to this recent op-ed she wrote on the topic of "Rising Wealth Inequality: Should We Care?" Her title was "Americans Aren't Naive," and she wrote:

Understanding what Americans think about rising income inequality has been hampered by three problems.

First, polls rarely ask specifically about income inequality. They ask instead about government redistributive policies, such as taxes and welfare, which are not always popular. From this information, we erroneously assume that Americans don't care about inequality. . . . Second, surveys on inequality that do exist are not well known. . . . Third . . . politicians and the media do not consistently engage Americans on the issue. . . .

It is often said that Americans care about opportunity and not inequality, but this is very misleading. Inequality can itself distort incentives and restrict opportunities. This is the lesson that episodes like the financial crisis and Great Recession convey to most Americans.

What follows is not any attempt at an exposition, appreciation, or critique of McCall's work but rather just some thoughts that arose, based on some notes I scrawled during her lecture:

1. McCall is looking at perceptions of perceptions. This reminds me of our discussions in Red State Blue State about polarization and the perception of polarization. The idea is that, even if American voters are not increasingly polarized in their attitudes, there is a perception of polarization, and this perception can itself have consequences (for example, in the support offered to politicians on either side who refuse to compromise).

2. McCall talked about meritocracy and shared a quote from Daniel Bell (whom she described as "conservative," which surprised me, but I guess it would be accurate to call him the most liberal of the neoconservatives) about how meritocracy could be good or bad, with bad meritocracy associated with meritocrats who abuse their positions of power and degrade those below them on the social ladder.

At this point I wanted to jump up and shout James "the Effect" Flynn's point that meritocracy is a self-contradiction. As Flynn put it:

The case against meritocracy can be put psychologically: (a) The abolition of materialist-elitist values is a prerequisite for the abolition of inequality and privilege; (b) the persistence of materialist-elitist values is a prerequisite for class stratification based on wealth and status; (c) therefore, a class-stratified meritocracy is impossible.

Flynn also points out that the promotion and celebration of the concept of "meritocracy" is also, by the way, a promotion and celebration of wealth and status--these are the goodies that the people with more merit get:

People must care about that hierarchy for it to be socially significant or even for it to exist. . . . The case against meritocracy can also be put sociologically: (a) Allocating rewards irrespective of merit is a prerequisite for meritocracy, otherwise environments cannot be equalized; (b) allocating rewards according to merit is a prerequisite for meritocracy, otherwise people cannot be stratified by wealth and status; (c) therefore, a class-stratified meritocracy is impossible.

In short, when people talk about meritocracy they tend to focus on the "merit" part (Does Kobe Bryant have as much merit as 10,000 schoolteachers? Do doctors have more merit than nurses? Etc.), but the real problem with meritocracy is that it's an "ocracy."

This point is not in any way a contradiction or refutation of McCall. I just think that, to the extent that debates over "just deserts" are a key part of her story, it would be useful to connect to Flynn's reflections on the impossibility of a meritocratic future.

3. I have a few thoughts on the competing concepts of opportunity vs. redistribution, which were central to McCall's framing.

a. Loss aversion. Opportunity sounds good because it's about gains. In contrast, I suspect that, when we think about redistribution, losses are more salient. (Redistribution is typically framed as taking from group A and giving to group B. There is a vague image of a bag full of money, and of course you have to take it from A before giving it to B.) So to the extent there is loss aversion (and I think there is), redistribution is always gonna be a tough sell.

b. The path from goal to policy. If you're going to cut taxes, what services do you plan to cut? If you plan to increase services, who's going to pay for it? Again, economic opportunity sounds great because you're not taking it from anybody. This is not just an issue of question wording in a survey; I think it's fundamental to how people think about inequality and redistribution.

I suspect the cognitive (point "a" above) and political (point "b") framing are central to people's struggles in thinking about economic opportunity. The clearest example is affirmative action, where opportunity for one group directly subtracts from opportunity for others.

4. As I remarked during McCall's talk, I was stunned that more than half the people did not think that family or ethnicity helped people move up in the world. We discussed the case of George W. Bush, who certainly benefited from family connections but can't really be said to have moved up in the world--for him, being elected president was just a way to stand still, intergenerationally speaking. As well as being potentially an interesting example for McCall's book-in-progress, the story of G. W. Bush illustrates some of the inherent contradictions in thinking about mobility: not everyone can move up, at least not in a relative sense.

5. McCall talked about survey results on Americans' views of rich people and, I think, of corporate executives. This reminds me of survey data from 2007 on Americans' views of corporations:

Nearly two-thirds of respondents say corporate profits are too high, but, according to a Pew research report, "more than seven in ten agree that 'the strength of this country today is mostly based on the success of American business' - an opinion that has changed very little over the past 20 years." People like business in general (except for those pesky corporate profits) but they love individual businesses, with 95% having a favorable view of Johnson and Johnson (among those willing to give a rating), 94% liking Google, 91% liking Microsoft, . . . I was surprised to find that 70% of the people were willing to rate Citibank, and of those people, 78% had a positive view. I don't have a view of Citibank one way or another, but it would seem to me to be the kind of company that people wouldn't like, even in 2007. Were banks ever popular? I guess so.

The Pew report broke things down by party identification (Democrat or Republican) and by "those who describe their household as professional or business class; those who call themselves working class; and those who say their family or household is struggling."

Republicans tend to like corporations, with little difference between the views of professional-class and working-class Republicans. For Democrats, though, there's a big gap, with professionals having a generally more negative view compared to the working class. Follow the link for some numbers and further discussion of some fascinating patterns that I can't easily explain.

6. In current debates over the federal budget, liberals favor an economic stimulus (i.e., deficit spending) right now, while conservatives argue that not only should we decrease the deficit but that our entire fiscal structure is unsustainable: we can't afford the generous pensions and health care that have been promised to everyone. The crisis in the euro is often taken by fiscal conservatives as a signal that the modern welfare state is a pyramid scheme, and something has to get cut.

When the discussion shifts to the standard of living of the middle class, though, we get a complete reversal. McCall's op-ed was part of an online symposium on wealth inequality. One thing that struck me about the discussions there was the reversal of the usual liberal/conservative perspectives on fiscal issues.

Liberals who are fine with deficits at the national level argue that, in the words of Michael Norton, "the expansion of consumer credit in the United States has allowed middle class and poor Americans to live beyond their means, masking their lack of wealth by increasing their debt." From the other direction, conservatives argue that Americans are doing just fine, with Scott Winship reporting that "four in five Americans have exceeded the income their parents had at the same age."

From the left, we hear that America is rich but Americans are broke. From the right, the story is the opposite: America (along with Europe and Japan) is broke, but individual Americans are doing fine.

I see the political logic to these positions. If you start from the (American-style) liberal perspective favoring government intervention in the economy, you'll want to argue that (a) people are broke and need the government's help, and (b) we as a society can afford it. If you start from the conservative perspective favoring minimal government intervention, you'll want to argue that (a) people are doing just fine as they are, and (b) anyway, we can't afford to help them.

I won't try to adjudicate these claims: as I've written a few dozen times in this space already, I have no expertise in macroeconomics (although I did get an A in the one and only econ class I ever took, which was in 11th grade). I bring them up in order to demonstrate the complicated patterns between economic ideology, political ideology, and views about inequality.

This one was so beautiful I just had to repost it:

From the New York Times, 9 Sept 1981:


If I could change Park Slope I would turn it into a palace with queens and kings and princesses to dance the night away at the ball. The trees would look like garden stalks. The lights would look like silver pearls and the dresses would look like soft silver silk. You should see the ball. It looks so luxurious to me.

The Park Slope ball is great. Can you guess what street it's on? "Yes. My street. That's Carroll Street."

-- Jennifer Chatmon, second grade, P.S. 321

This was a few years before my sister told me that she felt safer having a crack house down the block because the cops were surveilling it all the time.

Happy tax day!


Your taxes pay for the research funding that supports the work we do here, some of which appears on this blog and almost all of which is public, free, and open-source. So, to all of the taxpayers out there in the audience: thank you.

NYC 1950


Coming back from Chicago we flew right over Manhattan. Very impressive as always, to see all those buildings so densely packed. But think of how impressive it must have seemed in 1950! The world had a lot less of everything back in 1950 (well, we had more oil in the ground, but that's about it), so Manhattan must have just seemed amazing. I can see how American leaders of that period could've been pretty smug. Our #1 city was leading the world by so much, it was decades ahead of its time, still impressive even now after 60 years of decay.

A few years ago Larry Bartels presented this graph, a version of which later appeared in his book Unequal Democracy:


Larry looked at the data in a number of ways, and the evidence seemed convincing that, at least in the short term, the Democrats were better than Republicans for the economy. This is consistent with Democrats' general policies of lowering unemployment, as compared to Republicans lowering inflation, and, by comparing first-term to second-term presidents, he found that the result couldn't simply be explained as a rebound or alternation pattern.

The question then arose, why have the Republicans won so many elections? Why aren't the Democrats consistently dominating? Non-economic issues are part of the story, of course, but lots of evidence shows the economy to be a key concern for voters, so it's still hard to see how, with a pattern such as shown above, the Republicans could keep winning.

Larry had some explanations, largely having to do with timing: under Democratic presidents the economy tended to improve at the beginning of the four-year term, while gains under Republicans tended to occur in years 3 and 4--just in time for the next campaign!

See here for further discussion (from five years ago) of Larry's ideas from the perspective of the history of the past 60 years.

Enter Campbell

Jim Campbell recently wrote an article, to appear this week in The Forum (the link should become active once the issue is officially published), claiming that Bartels is all wrong--or, more precisely, that Bartels's finding of systematic differences in performance between Democratic and Republican presidents is not robust and goes away when you control for the economic performance leading into a president's term.

Here's Campbell:

Previous estimates did not properly take into account the lagged effects of the economy. Once lagged economic effects are taken into account, party differences in economic performance are shown to be the effects of economic conditions inherited from the previous president and not the consequence of real policy differences. Specifically, the economy was in recession when Republican presidents became responsible for the economy in each of the four post-1948 transitions from Democratic to Republican presidents. This was not the case for the transitions from Republicans to Democrats. When economic conditions leading into a year are taken into account, there are no presidential party differences with respect to growth, unemployment, or income inequality.

For example, using the quarterly change in GDP measure, the economy was in free fall in Fall 2008 but in recovery during the third and fourth quarters of 2009, so this counts as Obama coming in with a strong economy. (Campbell emphasizes that he is following the lead of Bartels and counting a president's effect on the economy to not begin until year 2.)

It's tricky. Bartels's claims are not robust to changes in specifications, but Campbell's conclusions aren't completely stable either. Campbell finds one thing if he controls for previous year's GNP growth but something else if he controls only for GNP growth in the 3rd and 4th quarter of the previous year. This is not to say Campbell is wrong but just to say that any atheoretical attempt to throw in lags can result in difficulty in interpretation.

I'm curious what Doug Hibbs thinks about all this; I don't know why, but to me Hibbs exudes an air of authority on this topic, and I'd be inclined to take his thoughts on these matters seriously.

What struck me the most about Campbell's paper was ultimately how consistent its findings are with Bartels's claims. This perhaps shouldn't be a surprise, given that they're working with the same data, but it did surprise me because their political conclusions are so different.

Here's the quick summary, which (I think) both Bartels and Campbell would agree with:

- On average, the economy did a lot better under Democratic than Republican presidents in the first two years of the term.

- On average, the economy did slightly better under Republican than Democratic presidents in years 3 and 4.

These two facts are consistent with the Hibbs/Bartels story (Democrats tend to start off by expanding the economy and pay the price later, while Republicans are more likely to start off with some fiscal or monetary discipline) and also consistent with Campbell's story (Democratic presidents tend to come into office when the economy is doing OK, and Republicans are typically only elected when there are problems).

But the two stories have different implications regarding the finding of Hibbs, Rosenstone, and others that economic performance in the last years of a presidential term predicts election outcomes. Under the Bartels story, voters are myopically chasing short-term trends, whereas in Campbell's version, voters are correctly picking up on the second derivative (that is, the trend in the change of the GNP from beginning to end of the term).

Consider everyone's favorite example: Reagan's first term, when the economy collapsed and then boomed. The voters (including Larry Bartels!) returned Reagan by a landslide in 1984: were they suckers for following a short-term trend or were they savvy judges of the second derivative?

I don't have any handy summary here--I don't see a way to declare a winner in the debate--but I wanted to summarize what seem to me to be the key points of agreement and disagreement in these very different perspectives on the same data.

One way to get leverage on this would be to study elections for governor and state economies. Lots of complications there, but maybe enough data to distinguish between the reacting-to-recent-trends and reacting-to-the-second-derivative stories.

P.S. See below for comments by Campbell.

Jonathan Chait writes:

Parties and candidates will kill themselves to move the needle a percentage point or two in a presidential race. And again, the fundamentals determine the bigger picture, but within that big picture political tactics and candidate quality still matters around the margins.

I agree completely. This is the central message of Steven Rosenstone's excellent 1983 book, Forecasting Presidential Elections.

So, given that Chait and I agree 100%, why was I so upset at his recent column on "The G.O.P.'s Dukakis Problem"?

I'll put the reasons for my displeasure below the fold because my main point is that I'm happy with Chait's quote above. For completeness I want to explain where I'm coming from but my take-home point is that we're mostly in agreement.

Jonathan Chait writes that the most important aspect of a presidential candidate is "political talent":

Republicans have generally understood that an agenda tilted toward the desires of the powerful requires a skilled frontman who can pitch Middle America. Favorite character types include jocks, movie stars, folksy Texans and war heroes. . . . [But the frontrunners for the 2012 Republican nomination] make Michael Dukakis look like John F. Kennedy. They are qualified enough to serve as president, but wildly unqualified to run for president. . . . [Mitch] Daniels's drawbacks begin -- but by no means end -- with his lack of height, hair and charisma. . . . [Jeb Bush] suffers from an inherent branding challenge [because of his last name]. . . . [Chris] Christie . . . doesn't cut a trim figure and who specializes in verbally abusing his constituents. . . . [Haley] Barbour is the comic embodiment of his party's most negative stereotypes. A Barbour nomination would be the rough equivalent of the Democrats' nominating Howard Dean, if Dean also happened to be a draft-dodging transsexual owner of a vegan food co-op.

Chait continues:

The impulse to envision one of these figures as a frontman represents a category error. These are the kind of people you want advising the president behind the scenes; these are not the people you put in front of the camera. The presidential candidate is the star of a television show about a tall, attractive person who can be seen donning hard hats, nodding at the advice of military commanders and gazing off into the future.

Geddit? Mike Dukakis was short, ethnic-looking, and didn't look good in a tank. (He did his military service in peacetime.) And did I mention that his middle name was Stanley? Who would vote for such a jerk?

All I can say is that Dukakis performed about as well in 1988 as would be predicted from the economy at the time. Here's a graph based on Doug Hibbs's model:


Sorry, but I don't think the Democrats would've won the 1988 presidential election even if they'd had Burt Reynolds at the top of the ticket. And, remember, George H. W. Bush was widely considered to be a wimp and a poser until he up and won the election. Conversely, had Dukakis won (which he probably would've, had the economy been slumping that year), I think we'd be hearing about how he was a savvy low-key cool dude.

Let me go on a bit more about the 1988 election.

In politics, as in baseball, hot prospects from the minors can have trouble handling big-league pitching.

We all have opinions about the federal budget and how it should be spent. Infrequently, those opinions are informed by some knowledge about where the money actually goes. It turns out that most people don't have a clue. What about you? Here, take this poll/quiz and then compare your answers to (1) what other people said in a CNN poll that asked about these same items, and (2) the real answers.

Quiz is below the fold.

Tyler Cowen links to this article by Matt Ridley that manages to push all my buttons. Ridley writes:

Radford writes:

The word "conservative" gets used many ways, for various political purposes, but I would take its basic meaning to be someone who thinks there's a lot of wisdom in traditional ways of doing things, even if we don't understand exactly why those ways are good, so we should be reluctant to change unless we have a strong argument that some other way is better. This sounds very Bayesian, with a prior reducing the impact of new data.

I agree completely, and I think Radford will very much enjoy my article with Aleks Jakulin, "Bayes: radical, liberal, or conservative?" Radford's comment also fits with my increasing inclination to use informative prior distributions.

This is a chance for me to combine two of my interests--politics and statistics--and probably to irritate both halves of the readership of this blog. Anyway...

I recently wrote about the apparent correlation between Bayes/non-Bayes statistical ideology and liberal/conservative political ideology:

The Bayes/non-Bayes fissure had a bit of a political dimension--with anti-Bayesians being the old-line conservatives (for example, Ronald Fisher) and Bayesians having more of a left-wing flavor (for example, Dennis Lindley). Lots of counterexamples at an individual level, but my impression is that on average the old curmudgeonly, get-off-my-lawn types were (with some notable exceptions) more likely to be anti-Bayesian.

This was somewhat based on my experiences at Berkeley. Actually, some of the cranky anti-Bayesians were probably Democrats as well, but when they were being anti-Bayesian they seemed pretty conservative.

Recently I received an interesting item from Gerald Cliff, a professor of mathematics at the University of Alberta. Cliff wrote:

I took two graduate courses in Statistics at the University of Illinois, Urbana-Champaign in the early 1970s, taught by Jacob Wolfowitz. He was very conservative, and anti-Bayesian. I admit that my attitudes towards Bayesian statistics come from him. He said that if one has a population with a normal distribution and unknown mean which one is trying to estimate, it is foolish to assume that the mean is random; it is fixed, and currently unknown to the statistician, but one should not assume that it is a random variable.

Wolfowitz was in favor of the Vietnam War, which was still on at the time. He is the father of Paul Wolfowitz, active in the Bush administration.

To which I replied:

Very interesting. I never met Neyman while I was at Berkeley (he had passed away before I got there) but I've heard that he was very liberal politically (as was David Blackwell). Regarding the normal distribution comment below, I would say:

1. Bayesians consider parameters to be fixed but unknown. The prior distribution is a regularization tool that allows more stable estimates.

2. The biggest assumptions in probability models are typically not in the prior distribution but in the data model. In this case, Wolfowitz was willing to assume a normal distribution with no question but then balked at using any knowledge about its mean. It seems odd to me, as a Bayesian, for one's knowledge to be divided so sharply: zero knowledge about the parameter, perfect certainty about the distributional family.
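As a toy illustration of point 1 (a minimal sketch with made-up numbers, not anyone's actual analysis): in the conjugate normal model with known variance, the posterior mean is a precision-weighted average of the prior mean and the sample mean, so the prior acts as a regularizer that pulls a noisy estimate toward the prior mean while the parameter itself remains fixed and unknown.

```python
import numpy as np

def posterior_mean(y, sigma, mu0, tau0):
    """Posterior mean of theta for y_i ~ N(theta, sigma^2), sigma known,
    with prior theta ~ N(mu0, tau0^2): a precision-weighted average of
    the prior mean mu0 and the sample mean of y."""
    prec_prior = 1.0 / tau0**2          # prior precision
    prec_data = len(y) / sigma**2       # data precision
    return (prec_prior * mu0 + prec_data * np.mean(y)) / (prec_prior + prec_data)

rng = np.random.default_rng(0)
y = rng.normal(1.0, 2.0, size=10)        # small sample, sigma = 2 (made up)
mle = y.mean()                            # classical (maximum likelihood) estimate
post = posterior_mean(y, 2.0, 0.0, 1.0)   # informative prior centered at 0
# post lies between the prior mean (0) and the MLE: partial shrinkage
```

With a very diffuse prior (large tau0) the posterior mean approaches the maximum likelihood estimate, which is one way to see the classical answer as a limiting case rather than a rival philosophy.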

To return to the political dimension: From basic principles, I don't see any strong logical connection between Bayesianism and left-wing politics. In statistics, non-Bayesian ("classical") methods such as maximum likelihood are often taken to be conservative, as compared to the more assumption-laden Bayesian approach, but, as Aleks Jakulin and I have argued, the labeling of a statistical method as liberal or conservative depends crucially on what is considered your default.

As statisticians, we are generally trained to respect conservatism, which can sometimes be defined mathematically (for example, nominal 95% intervals that contain the true value more than 95% of the time) and sometimes with reference to tradition (for example, deferring to least-squares or maximum-likelihood estimates). Statisticians are typically worried about messing with data, which perhaps is one reason that the Current Index to Statistics lists 131 articles with "conservative" in the title or keywords and only 46 with the words "liberal" or "radical."

In that sense, given that, until recently, non-Bayesian approaches were the norm in statistics, it was the more radical group of statisticians (on average) who wanted to try something different. And I could see how a real hardline conservative such as Wolfowitz could see a continuity between anti-Bayesian skepticism and political conservatism, just as, on the other side of the political spectrum, a leftist such as Lindley could ally Bayesian thinking with support of socialism, a planned economy, and the like.

As noted above, I don't think these connections make much logical sense but I can see where they were coming from (with exceptions, of course, as noted regarding Neyman above).

After noting the increasing political conservatism of people in the poorer states, Richard Florida writes:

The current economic crisis only appears to have deepened conservatism's hold on America's states. This trend stands in sharp contrast to the Great Depression, when America embraced FDR and the New Deal.

Liberalism, which is stronger in richer, better-educated, more-diverse, and, especially, more prosperous places, is shrinking across the board and has fallen behind conservatism even in its biggest strongholds. This obviously poses big challenges for liberals, the Obama administration, and the Democratic Party moving forward.

But the much bigger, long-term danger is economic rather than political. This ideological state of affairs advantages the policy preferences of poorer, less innovative states over wealthier, more innovative, and productive ones. American politics is increasingly disconnected from its economic engine. And this deepening political divide has become perhaps the biggest bottleneck on the road to long-run prosperity.

What are my thoughts on this?

First, I think Obama would be a lot more popular had he been elected in 1932, rather than 1930.

Second, transfers from the richer, more economically successful states to the poorer, less developed states are not new. See, for example, this map from 1924 titled "Good Roads Everywhere" that shows a proposed system of highways spanning the country, "to be built and forever maintained by the United States Government."


The map, made by the National Highways Association, also includes the following explanation for the proposed funding system: "Such a system of National Highways will be paid for out of general taxation. The 9 rich densely populated northeastern States will pay over 50 per cent of the cost. They can afford to, as they will gain the most. Over 40 per cent will be paid for by the great wealthy cities of the Nation. . . . The farming regions of the West, Mississippi Valley, Southwest and South will pay less than 10 per cent of the cost and get 90 per cent of the mileage." [emphasis added] Beyond its quaint slogans ("A paved United States in our day") and ideas that time has passed by ("Highway airports"), the map gives a sense of the potential for federal taxing and spending to transfer money between states and regions.

This story reminds me that, when I was in grad school, the state of Massachusetts instituted a seat-belt law which became a big controversy. A local talk show host made it his pet project to shoot down the law, and he succeeded! There was a ballot initiative and the voters repealed the seat belt law. A few years later the law returned (it was somehow tied in with Federal highway funding, I think, the same way they managed to get all the states to up the drinking age to 21), and, oddly enough, nobody seemed to care the second time around.

It's funny how something can be a big political issue one year and nothing the next. I have no deep insights on the matter, but it's worth remembering that these sorts of panics are nothing new. Recall E.S. Turner's classic book, Roads to Ruin. I think there's a research project in here, to understand what gets an issue to be a big deal and how it is that some controversies just fade away.

Reviewing a research article by Michael Spence and Sandile Hlatshwayo about globalization (a paper with the sobering message that "higher-paying jobs [are] likely to follow low-paying jobs in leaving US"), Tyler Cowen writes:

It is also a useful corrective to the political conspiracy theories of changes in the income distribution. . .

Being not-so-blissfully ignorant of macroeconomics, I can focus on the political question, namely these conspiracy theories.

I'm not quite sure what Cowen is referring to here--he neglects to provide a link to the conspiracy theories--but I'm guessing he's referring to the famous graph by Piketty and Saez showing how very high-end incomes (top 1% or 0.1%) have, since the 1970s, risen much more dramatically in the U.S. than in other countries, along with claims by Paul Krugman and others that much of this difference can be explained by political changes in the U.S. In particular, top tax rates in the U.S. have declined since the 1970s and the power of labor unions has decreased. The argument that Krugman and others make on the tax rates is both direct (the government takes away money from people with high incomes) and indirect (with higher tax rates, there is less incentive to seek or to pay high levels of compensation). And there's an even more indirect argument that as the rich get richer, they can use their money in various ways to get more political influence.

Anyway, I'm not sure what the conspiracy is. I mean, whatever Grover Norquist might be doing in a back room somewhere, the move to lower taxes was pretty open. According to Dictionary.com, a conspiracy is "an evil, unlawful, treacherous, or surreptitious plan formulated in secret by two or more persons; plot."

Hmm . . . I suppose Krugman etc. might in fact argue that there has been some conspiracy going on--for example, of employers conspiring to use various illegal means to thwart union drives--but I'd also guess that to him and others on the left or center-left, most of the political drivers of inequality changes have been open, not conspiratorial.

I might be missing something here, though; I'd be interested in hearing more. At this point I'm not sure if Cowen's saying that these conspiracies don't exist, or whether they exist (and are possibly accompanied by similar conspiracies on the other side) but have been ineffective. Also I might be completely wrong in assigning Cowen's allusion to Krugman etc.

This discussion is relevant to this here blog because the labeling of a hypothesis as a "conspiracy" seems relevant to how it is understood and evaluated.

I put it on the sister blog so you loyal readers here wouldn't be distracted by it.

This is pretty amazing.

Gregory Eady writes:

I'm working on a paper examining the effect of superpower alliance on a binary DV (war). I hypothesize that the size of the effect is much higher during the Cold War than it is afterwards. I'm going to run a Chow test to check whether this effect differs significantly between 1960-1989 and 1990-2007 (Scott Long also has a method using predicted probabilities), but I'd also like to show the trend graphically, and thought that your "Secret Weapon" would be useful here. I wonder if there is anything I should be concerned about when doing this with a (rare-events) logistic regression. I was thinking to graph the coefficients in 5-year periods, moving a single year at a time (1960-64, 1961-65, 1962-66, and so on), reporting the coefficient in the graph for the middle year of each 5-year range.

My reply:

I don't know nuthin bout no Chow test but, sure, I'd think the secret weapon would work. If you're analyzing 5-year periods, it might be cleaner just to keep the periods disjoint. Set the boundaries of these periods in a reasonable way (if necessary using periods of unequal lengths so that your intervals don't straddle important potential change points). I suppose in this case you could do 1960-64, 65-69, ..., and this would break at 1989/90 so it would be fine. If you're really running into rare events, though, you might want 10-year periods rather than 5-year.
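The "secret weapon" here just means fitting the same model separately to each time period and plotting the estimates side by side. A minimal sketch in Python (the dyad-year data, sample sizes, and effect sizes below are all simulated for illustration, not Eady's actual data):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_logit(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson; returns coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        H = (X * (p * (1 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(H, grad)
    return beta

# Simulated dyad-year data: the "alliance" effect on war is larger before 1990.
data = []
for yr in range(1960, 2008):
    ally = rng.integers(0, 2, 500).astype(float)
    effect = 1.5 if yr < 1990 else 0.5
    p_war = 1.0 / (1.0 + np.exp(-(-3.0 + effect * ally)))
    war = (rng.random(500) < p_war).astype(float)
    data.append((yr, ally, war))

# Secret weapon: fit the same model separately in disjoint 5-year periods
# (1960-64, 1965-69, ..., breaking cleanly at 1989/90) and report the
# alliance coefficient for each period.
period_coefs = {}
for start in range(1960, 2005, 5):
    chunk = [d for d in data if start <= d[0] < start + 5]
    ally = np.concatenate([d[1] for d in chunk])
    war = np.concatenate([d[2] for d in chunk])
    X = np.column_stack([np.ones_like(ally), ally])
    period_coefs[start] = fit_logit(X, war)[1]

for start in sorted(period_coefs):
    print(f"{start}-{start + 4}: alliance coefficient = {period_coefs[start]:.2f}")
```

Because the periods are disjoint, the per-period estimates are independent, and a change at the Cold War boundary shows up directly in the plotted (here, printed) coefficients without any overlapping-window smearing.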

Rajiv Sethi has some very interesting things to say:

As the election season draws closer, considerable attention will be paid to prices in prediction markets such as Intrade. Contracts for potential presidential nominees are already being scrutinized for early signs of candidate strength. . . .

This interpretation of prices as probabilities is common and will be repeated frequently over the coming months. But what could the "perceived likelihood according to the market" possibly mean?

Prediction market prices contain valuable information about this distribution of beliefs, but there is no basis for the common presumption that the price at last trade represents the beliefs of a hypothetical average trader in any meaningful sense [emphasis added]. In fact, to make full use of market data to make inferences about the distribution of beliefs, one needs to look beyond the price at last trade and examine the entire order book.

Sethi looks at some of the transaction data and continues:

What, then, can one say about the distribution of beliefs in the market? To begin with, there is considerable disagreement about the outcome. Second, this disagreement itself is public information: it persists despite the fact that it is commonly known to exist. . . . the fact of disagreement is not itself considered to be informative, and does not lead to further belief revision. The most likely explanation for this is that traders harbor doubts about the rationality or objectivity of other market participants. . . .

More generally, it is entirely possible that beliefs are distributed in a manner that is highly skewed around the price at last trade. That is, it could be the case that most traders (or the most confident traders) all fall on one side of the order book. In this case the arrival of seemingly minor pieces of information can cause a large swing in the market price.

Sethi's conclusion:

There is no meaningful sense in which one can interpret the price at last trade as an average or representative belief among the trading population.

This relates to a few points that have come up here on occasion:

1. We're often in the difficult position of trying to make inferences about marginal (in the economic sense) quantities from aggregate information.

2. Markets are impressive mechanisms for information aggregation but they're not magic. The information has to come from somewhere, and markets are inherently always living in the phase transition between stability and instability. (It is the stability that makes prices informative and the instability that allows the market to be liquid.)

3. If the stakes in a prediction market are too low, participants have the incentive and ability to manipulate it; if the stakes are too high, you have to worry about point-shaving.

This is not to say that prediction markets are useless, just that they are worth studying seriously in their own right, not to be treated as oracles. By actually looking at and analyzing some data, Sethi goes far beyond my sketchy thoughts in this area.

John Sides points to this discussion (with over 200 comments!) by political scientist Charli Carpenter of her response to a student from another university who emailed with questions that look like they come from a homework assignment. Here's the student's original email:

Hi Mr. Carpenter,

I am a fourth year college student and I have the honor of reading one of your books and I just had a few questions... I am very fascinated by your work and I am just trying to understand everything. Can you please address some of my questions? I would greatly appreciate it. It certainly help me understand your wonderful article better. Thank you very much! :)

1. What is the fundamental purpose of your article?

2. What is your fundamental thesis?

3. What evidence do you use to support your thesis?

4. What is the overall conclusion?

5. Do you feel that you have a fair balance of opposing viewpoints?


After a series of emails in which Carpenter explained why she thought these questions were a form of cheating on a homework assignment and the student kept dodging the issues, Carpenter used the email address to track down the student's name and then contacted the student's university.

I have a few thoughts on this.

- Carpenter and her commenters present this bit of attempted cheating as a serious violation on the student's part. I see where she's coming from--after all, asking someone else to do your homework for you really is against the rules--but, from the student's perspective, sending an email to an article's author is just a slightly enterprising step beyond scouring the web for something written on the article. And you can't stop students from searching the web. All you can hope for is that students digest any summaries they read and ultimately spit out some conclusions in their own words.

- To me, what would be most annoying about receiving the email above is how insulting it is:

Will Wilkinson adds to the discussion of Jonathan Haidt's remarks regarding the overwhelming prevalence of liberal or left-wing attitudes among psychology professors. I pretty much agree with Wilkinson's overview:

Folks who constantly agree with one another grow insular, self-congratulatory, and not a little lazy. The very possibility of disagreement starts to seem weird or crazy. When you're trying to do science about human beings, this attitude's not so great.

Wilkinson also reviewed the work of John Jost in this area. Jost is a psychology researcher with the expected liberal/left political leanings, but his relevance here is that he has actually done research on political attitudes and personality types. In Wilkinson's words:

Jost has done plenty of great work that helps explain not only why the best minds in science are liberal, but why most scientists-most academics, even-are liberal. Individuals with the personality trait that most strongly predicts an inclination toward liberal politics also predict an attraction to academic careers. That's why, as Haidt well notes, it's silly to expect the distribution of political opinion in academia to mirror the distribution of opinion in society at large. . . . one of the most interesting parts of Jost's work shows how personality, which is largely hereditary, predicts political affinity. Of the "Big Five" personality traits, "openness to experience" and "conscientiousness" stand out for their effects on political inclination. . . . the content of conservatism and liberalism changes over time. We live in a liberal and liberalizing culture, so today's conservatives, for example, are very liberal compared to conservatives of their grandparents' generation. But there is a good chance they inherited some of their tendency toward conservatism from grandparents.

University professors and military officers

The cleanest analogy, I think, is between college professors (who are disproportionately liberal Democrats) and military officers (mostly conservative Republicans; see this research by Jason Dempsey). In both cases there seems to be a strong connection between the environment and the ideology. Universities have (with some notable exceptions) been centers of political radicalism for centuries, just as the military has long been a conservative institution in most places (again, with some exceptions).

And this is true even though many university professors are well-paid, live well, and send their kids to private schools, and even though the U.S. military has been described as one of the few remaining bastions of socialism in the 21st century.

Responding to a proposal to move the journal Political Analysis from double-blind to single-blind reviewing (that is, authors would not know who is reviewing their papers but reviewers would know the authors' names), Tom Palfrey writes:

I agree with the editors' recommendation. I have served on quite a few editorial boards of journals with different blinding policies, and have seen no evidence that double blind procedures are a useful way to improve the quality of articles published in a journal. Aside from the obvious administrative nuisance and the fact that authorship anonymity is a thing of the past in our discipline, the theoretical and empirical arguments in both directions lead to an ambiguous conclusion. Also keep in mind that the editors know the identity of the authors (they need to know for practical reasons), their identity is not hidden from authors, and ultimately it is they who make the accept/reject decision, and also lobby their friends and colleagues to submit "their best work" to the journal. Bias at the editorial level is far more likely to affect publication decisions than bias at the referee level, and double blind procedures don't affect this. One could argue then that perhaps the main thing double blinding does is shift the power over journal content even further from referees and associate editors to editors. It certainly increases the informational asymmetry.

Another point of fact is that the use of double blind procedures in economics and political science shares essentially none of the justifications for it with the other science disciplines from which the idea was borrowed. In these other disciplines, like biology, such procedures exist for different (and good) reasons. Rather than a concern about biasing in favor of well-known versus lesser-known authors, in these other fields it is driven by a concern of bias because of the rat-race competition over a rapidly moving frontier of discovery. Because of the speed at which the frontier is moving, authors of new papers are intensely secretive (almost paranoid) about their work. Results are kept under wrap until the result has been accepted for publication - or in some cases until it is actually published. [Extra, Extra, Read All About It: PNAS article reports that Caltech astronomer Joe Shmoe discovered a new planet three months ago...] Double blind is indeed not a fiction in these disciplines. It is real, and it serves a real purpose. Consider the contrast with our discipline, in which many researchers drool over invitations from top places to present their newest results, even if the paper does not yet exist or is in very rough draft form. Furthermore, financial incentives for bias in these other disciplines are very strong, given the enormous stakes of funding. [Think how much a new telescope costs.] Basically none of the rationales for double blinding in those disciplines applies to political science. One final note. In those disciplines, editors are often "professional" editors. That is, they do not have independent research careers. This may have to do with the potential bias that results from intense competition in disciplines where financial stakes are enormous and the frontier of discovery moves at 'blinding' speed.

Tom's comparison of the different fields was a new point to me and it seems sensible.

I'd also add that I'm baffled by many people's attitudes toward reviewing articles for journals. As I've noted before, I don't think people make enough of the fact that editing and reviewing journal articles is volunteer work. Everyone's always getting angry at referees and saying what they should or should not do, but, hey--we're doing it for free. In this situation, I think it's important to get the most you can out of all participants.

Mark Palko asks what I think of this news article by John Tierney. The article's webpage is given the strange incomplete title above.

My first comment is that the headline appears false. I didn't see any evidence presented of liberal bias. (If the headline says "Social psychologists detect," I expect to see some detection, not just anecdotes.) What I did see was a discussion of the fact that most academic psychologists consider themselves politically liberal (a pattern that holds for academic researchers in general), along with some anecdotes of moderates over the years who have felt their political views disrespected by the liberal majority.

Seeing Sarah Palin's recent witticism:

It's no wonder Michelle Obama is telling everybody you need to breast feed your babies ... the price of milk is so high!

I was reminded of Dan Quayle's quip during the 1988 campaign:

The governor of Massachusetts, he lost his top naval adviser last week. His rubber ducky drowned in the bathtub.

And this got me wondering: how often do legitimate political figures--not talk show hosts, but actual politicians--communicate via schoolyard-style taunts?

John Sides discusses how his scatterplot of unionization rates and budget deficits made it onto cable TV news:

It's also interesting to see how he [journalist Chris Hayes] chooses to explain a scatterplot -- especially given the evidence that people don't always understand scatterplots. He compares pairs of cases that don't illustrate the basic hypothesis of Brooks, Scott Walker, et al. Obviously, such comparisons could be misleading, but given that there was no systematic relationship depicted in that graph, these particular comparisons are not.

This idea--summarizing a bivariate pattern by comparing pairs of points--reminds me of a well-known statistical identity, which I refer to in a paper with David Park:


John Sides is certainly correct that if you can pick your pair of points, you can make extremely misleading comparisons. But if you pick every pair of points, and average over them appropriately, you end up with the least-squares regression slope.

Pretty cool, and it helps develop our intuition about the big-picture relevance of special-case comparisons.
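The identity is easy to check numerically: the least-squares slope equals a weighted average of all pairwise slopes (y_i - y_j)/(x_i - x_j), with each pair weighted in proportion to (x_i - x_j)^2. A quick sketch with simulated data:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
x = rng.normal(size=20)
y = 2.0 * x + rng.normal(size=20)

# Least-squares slope the usual way.
ols_slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)

# The same slope as a weighted average of all pairwise slopes,
# with weights proportional to the squared x-distance of each pair.
pairs = list(combinations(range(len(x)), 2))
num = sum((x[i] - x[j]) ** 2 * (y[i] - y[j]) / (x[i] - x[j]) for i, j in pairs)
den = sum((x[i] - x[j]) ** 2 for i, j in pairs)
pairwise_slope = num / den

print(ols_slope, pairwise_slope)  # the two agree to floating-point precision
```

So a cherry-picked pair of points can say anything, but averaging over every pair (with these weights) recovers exactly the regression slope.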

Amy Cohen points me to this blog by Jim Manzi, who writes:

Harold Pollack writes:

Tyler Cowen discusses his and Bryan Caplan's reaction to that notorious book by Amy Chua, the Yale law professor who boasts of screaming at her children, calling them "garbage," not letting them go to the bathroom when they were studying piano, etc. Caplan thinks Chua is deluded (in the sense of not being aware of research showing minimal effects of parenting on children's intelligence and personality), foolish (in writing a book and making recommendations without trying to learn about the abundant research on child-rearing), and cruel. Cowen takes a middle view in that he doesn't subscribe to Chua's parenting strategies but he does think that his friends' kids will do well (and partly because of his friends' parenting styles, not just from their genes).

Do you view yourself as special?

I have a somewhat different take on the matter, an idea that's been stewing in my mind for a while, ever since I heard about the Wall Street Journal article that started this all. My story is that attitudes toward parenting are to some extent derived from attitudes about one's own experiences as a child.

Mike Cohen writes:

Education and Poverty


Jonathan Livengood writes:

There has been some discussion about the recent PISA results (in which the U.S. comes out pretty badly), for example here and here. The claim being made is that the poor U.S. scores are due to rampant individual- or family-level poverty in the U.S. They claim that when one controls for poverty, the U.S. comes out on top in the PISA standings, and then they infer that poverty causes poor test scores. The further inference is then that the U.S. could improve education by the "simple" action of reducing poverty. Anyway, I was wondering what you thought about their analysis.

My reply: I agree this is interesting and I agree it's hard to know exactly what to say about these comparisons. When I'm stuck in this sort of question, I ask, WWJD? In this case, I think Jennifer would ask what are the potential interventions being considered. Various ideas for changing the school system would perhaps have different effects on different groups of students. I think that would be a useful way to focus discussion, to consider the effects of possible reforms in the U.S. and elsewhere. See here and here, for example.

P.S. Livengood has some graphs and discussion here.



Pete Gries writes:

I [Gries] am not sure if what you are suggesting by "doing data analysis in a patternless way" is a pitch for deductive over inductive approaches as a solution to the problem of reporting and publication bias. If so, I may somewhat disagree. A constant quest to prove or disprove theory in a deductive manner is one of the primary causes of both reporting and publication bias. I'm actually becoming a proponent of a remarkably non-existent species - "applied political science" - because there is so much animosity in our discipline to inductive empirical statistical work that seeks to answer real world empirical questions rather than contribute to parsimonious theory building. Anyone want to start a JAPS - Journal of Applied Political Science? Our discipline is in danger of irrelevance.

My reply: By "doing data analysis in a patternless way," I meant statistical methods such as least squares, maximum likelihood, etc., that estimate parameters independently without recognizing the constraints and relationships between them. If you estimate each study on its own, without reference to all the other work being done in the same field, then you're depriving yourself of a lot of information and inviting noisy estimates and, in particular, overestimates of small effects.
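As a concrete (and entirely simulated) illustration of the difference, here is a sketch comparing no-pooling estimates with a simple empirical-Bayes partial-pooling estimate in a normal-normal model; all the numbers are made up. The partially pooled estimates are shrunk toward the grand mean, which is what tames the noisy overestimates of small effects:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate J small studies whose true effects come from a common distribution.
J = 50
true_effects = rng.normal(0.1, 0.1, J)   # small true effects
se = 0.3                                  # large per-study standard error
estimates = true_effects + rng.normal(0, se, J)

# No pooling: each study analyzed on its own (its own raw estimate).
no_pool = estimates

# Partial pooling (empirical Bayes): shrink each estimate toward the
# grand mean by a factor determined by the estimated between-study variance.
grand_mean = estimates.mean()
tau2 = max(estimates.var() - se ** 2, 0.0)  # moment estimate, floored at zero
shrink = tau2 / (tau2 + se ** 2)
partial_pool = grand_mean + shrink * (estimates - grand_mean)

def rmse(est):
    return np.sqrt(np.mean((est - true_effects) ** 2))

print(rmse(no_pool), rmse(partial_pool))  # partial pooling has lower error
```

The no-pooling estimates err by roughly the standard error of each study, while the shrunken estimates borrow strength across studies; the larger the noise relative to the true variation, the more the shrinkage helps.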

I saw this picture staring at me from the newsstand the other day:


Here's the accompanying article, by Michael Scherer and Michael Duffy, which echoes some of the points I made a few months ago, following the midterm election:

Why didn't Obama do a better job of leveling with the American people? In his first months in office, why didn't he anticipate the example of the incoming British government and warn people of economic blood, sweat, and tears? Why did his economic team release overly-optimistic graphs such as shown here? Wouldn't it have been better to have set low expectations and then exceed them, rather than the reverse?

I don't know, but here's my theory. When Obama came into office, I imagine one of his major goals was to avoid repeating the experiences of Bill Clinton and Jimmy Carter in their first two years.

Clinton, you may recall, was elected with less than 50% of the vote, was never given the respect of a "mandate" by congressional Republicans, wasted political capital on peripheral issues such as gays in the military, spent much of his first two years on centrist, "responsible" politics (budgetary reform and NAFTA) which didn't thrill his base, and then got rewarded with a smackdown on health care and a Republican takeover of Congress. Clinton may have personally weathered the storm but he never had a chance to implement the liberal program.

Carter, of course, was the original Gloomy Gus, and his term saw the resurgence of the conservative movement in this country, with big tax revolts in 1978 and the Reagan landslide two years after that. It wasn't all economics, of course: there were also the Russians, Iran, and Jerry Falwell pitching in.

Following Plan Reagan

From a political (but not a policy) perspective, my impression was that Obama's model was not Bill Clinton or Jimmy Carter but Ronald Reagan. Like Obama in 2008, Reagan came into office in 1980 in a bad economy and inheriting a discredited foreign policy. The economy got steadily worse in the next two years, the opposition party gained seats in the midterm election, but Reagan weathered the storm and came out better than ever.

If the goal was to imitate Reagan, what might Obama have done?

- Stick with the optimism and leave the gloom-and-doom to the other party. Check.
- Stand fast in the face of a recession. Take the hit in the midterms with the goal of bouncing back in year 4. Check.
- Keep ideological purity. Maintain a contrast with the opposition party and pass whatever you can in Congress. Check.

The Democrats got hit harder in 2010 than the Republicans in 1982, but the Democrats had further to fall. Obama and his party in Congress can still hope to bounce back in two years.

Also recall that Reagan, like Roosevelt, was a statistician.

Lies, Damn Lies...that's pretty much it.


This post is by Phil Price.

We're all used to distortions and misleading statements in political discourse -- the use of these methods is one thing on which politicians are fairly nonpartisan. But I think it's rare to see an outright lie, especially about a really major issue. We had a doozy yesterday, when Congresswoman Michele Bachmann presented a graphic that attributed the 2009 federal budget to the Obama administration. Oddly, most of the other facts and figures she presented were correct, although some of them seem calculatedly misleading. If you're going to lie about something really big, why not just lie about everything?

Joan Nix writes:

Your comments on this paper by Scott Carrell and James West would be most appreciated. I'm afraid the conclusions of this paper are too strong given the data set and other plausible explanations. But given where it is published, this paper is receiving and will continue to receive lots of attention. It will be used to draw deeper conclusions regarding effective teaching and experience.

Nix also links to this discussion by Jeff Ely.

I don't completely follow Ely's criticism, which seems to me to be too clever by half, but I agree with Nix that the findings in the research article don't seem to fit together very well. For example, Carrell and West estimate that the effects of instructors on performance in the follow-on class are as large as the effects on the class they're teaching. This seems hard to believe, and it seems central enough to their story that I don't know what to think about everything else in the paper.

My other thought about teaching evaluations is from my personal experience. When I feel I've taught well--that is, in semesters when it seems that students have really learned something--I tend to get good evaluations. When I don't think I've taught well, my evaluations aren't so good. And, even when I think my course has gone wonderfully, my evaluations are usually far from perfect. This has been helpful information for me.

That said, I'd prefer to have objective measures of my teaching effectiveness. Perhaps surprisingly, statisticians aren't so good about measurement and estimation when applied to their own teaching. (I think I've blogged on this on occasion.) The trouble is that measurement and evaluation take work! When we're giving advice to scientists, we're always yammering on about experimentation and measurement. But in our own professional lives, we pretty much throw all our statistical principles out the window.

P.S. What's this paper doing in the Journal of Political Economy? It has little or nothing to do with politics or economics!

P.P.S. I continue to be stunned by the way in which tables of numbers are presented in social science research papers with no thought of communication--for example, tables of interval estimates such as "(.0159, .0408)." (What are all those digits for? And what do these numbers have to do with anything at all?) If the words, sentences, and paragraphs of an article were put together in such a stylized, unthinking way, the article would be completely unreadable. Formal structures with almost no connection to communication or content . . . it would be like writing the entire research article in iambic pentameter with an a,b,c,b rhyme scheme, or somesuch. I'm not trying to pick on Carrell and West here--this sort of presentation is nearly universal in social science journals.

Trends in partisanship by state


Matthew Yglesias discusses how West Virginia used to be a Democratic state but is now solidly Republican. I thought it would be helpful to expand this to look at trends since 1948 (rather than just 1988) and all 50 states (rather than just one). This would represent a bit of work, except that I already did it a couple years ago, so here it is (right-click on the image to see the whole thing):

Third-party Dream Ticket


Who are the only major politicians who are viewed more positively than negatively by the American public?

(See page 3 of this report.)

Seeing as the Freakonomics people were kind enough to link to my list of five recommended books, I'll return the favor and comment on a remark from Levitt, who said:

This post is by Phil Price.

An Oregon legislator, Mitch Greenlick, has proposed to make it illegal in Oregon to carry a child under six years old on one's bike (including in a child seat) or in a bike trailer. The guy says "We've just done a study showing that 30 percent of riders biking to work at least three days a week have some sort of crash that leads to an injury... When that's going on out there, what happens when you have a four year old on the back of a bike?" The study is from Oregon Health Sciences University, at which the legislator is a professor.

Greenlick also says "If it's true that it's unsafe, we have an obligation to protect people. If I thought a law would save one child's life, I would step in and do it. Wouldn't you?"

There are two statistical issues here. The first is in the category of "lies, damn lies, and statistics," and involves the statement about how many riders have injuries. As quoted on a blog, the author of the study in question says that, when it comes to what is characterized as an injury, "It could just be skinning your knee or spraining your ankle, but it couldn't just be a near miss." By this standard, lots of other things one might do with one's child -- such as playing with her, for instance -- might be even more likely to cause injury.

Substantial numbers of people have been taking their children on bikes for quite a while now, so although it may be impossible to get accurate numbers for the number of hours or miles ridden, there should be enough data on fatalities and severe injuries to get a semi-quantitative idea of how dangerous it is to take a child on a bike or in a bike trailer. And when I say "dangerous" I mean, you know, actually dangerous.

The second problem with Greenlick's approach is that it seems predicated on the idea that, in his words, "If I thought a law would save one child's life, I would step in and do it. Wouldn't you?" Well, no, and in fact that is just a ridiculous principle to apply. Any reasonable person should be in favor of saving children's lives, but not at all cost. We could make it illegal to allow children to climb trees, to eat peanuts, to cross the street without holding an adult's hand...perhaps they shouldn't be allowed to ride in cars. Where would it end?

Finally, a non-statistical note: another state rep has commented regarding this bill, saying that "this is the way the process often works: a legislator gets an idea, drafts a bill, introduces it, gets feedback, and then decides whether to try to proceed, perhaps with amendments, or whether to let it die." If true, this is a really wasteful and inefficient system. Better would be "a legislator gets an idea, does a little research to see if it makes sense, introduces it,..." Introducing it before seeing if it makes sense is probably a lot easier in the short run, but it means a lot of administrative hassle in introducing the bills, and it makes people waste time and effort trying to kill or modify ill-conceived bills.

Problems with Haiti elections?


Mark Weisbrot points me to this report trashing a recent OAS report on Haiti's elections. Weisbrot writes:

The two simplest things that are wrong with the OAS analysis are: (1) By looking only at a sample of the tally sheets and not using any statistical test, they have no idea how many other tally sheets would also be thrown out by the same criteria that they used, and how that would change the result and (2) The missing/quarantined tally sheets are much greater in number than the ones that they threw out; our analysis indicates that if these votes had been counted, the result would go the other way.

I have not had a chance to take a look at this myself but I'm posting it here so that experts on election irregularities can see this and give their judgments.

P.S. Weisbrot updates:

Mark Lilla recalls some recent Barack Obama quotes and then writes:

If this is the way the president and his party think about human psychology, it's little wonder they've taken such a beating.

In the spirit of that old line, "That and $4.95 will get you a tall latte," let me agree with Lilla and attribute the Democrats' losses in 2010 to the following three factors:

1. A poor understanding of human psychology;

2. The Democrats holding unified control of the presidency and congress with a large majority in both houses (factors that are historically associated with big midterm losses); and

3. A terrible economy.

I will let you, the readers, make your best guesses as to the relative importance of factors 1, 2, and 3 above.

Don't get me wrong: I think psychology is important, as is the history of ideas (the main subject of Lilla's article), and I'd hope that Obama (and also his colleagues in both parties in congress) can become better acquainted with psychology, motivation, and the history of these ideas. I just think it's stretching things to bring in the election as some sort of outcome of the Democrats' understanding of political marketing.

Later on, Lilla writes of "the Tea Party's ire, directed at Democrats and Republicans alike . . . " Huh? The Tea Party activists are conservative Republicans. Are there any Democrats that the Tea Party participants like? Zell Miller, maybe?

Lilla concludes with an inspiring story of Muhammad Ali coming to Harvard and delivering a two-line poem, at which point, in Lilla's words, "The students would have followed him anywhere." He seems to attribute this to Ali's passion ("In our politics, history doesn't happen when a leader makes an argument, or even strikes a pose. It happens when he strikes a chord. And you don't need charts and figures to do that; in fact they get in the way. You only need two words."), but is that really right? Ali is a culture hero for many reasons, and my guess is the students would've followed him anywhere--even if he'd given them charts and figures. Actually, then maybe they'd have had more of an idea of where he was leading them!

It says in the article linked above that Lilla is a professor at Columbia, and, looking him up, I see that he won an award from the American Political Science Association. So I'm a bit surprised to see him write some of the things he writes above, about the Tea Party and attributing the 2010 election to a lack of understanding of psychology. (I assume the Muhammad Ali story is just poetic license.) Probably I'm missing something here; maybe I can ask him directly at some point.

Jas sends along this paper (with Devin Caughey), entitled Regression-Discontinuity Designs and Popular Elections: Implications of Pro-Incumbent Bias in Close U.S. House Races, and writes:

The paper shows that regression discontinuity does not work for US House elections. Close House elections are anything but random. It isn't election recounts or something like that (we collect recount data to show that it isn't). We have collected much new data to try to hunt down what is going on (e.g., campaign finance data, CQ pre-election forecasts, and corrections of many errors in the Lee dataset). The substantive implications are interesting. We also have a section that compares in detail Gelman and King versus the Lee estimand and estimator.
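The sorting concern can be illustrated with a quick density check around the cutoff. The data below are simulated, not from the paper; the check itself (comparing counts of races just below and just above a zero margin, a crude bin-count version of the McCrary density test) is the standard diagnostic for non-random close elections.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated two-party vote margins for 5000 hypothetical House races
# (positive = incumbent-party win).
margins = rng.normal(0.0, 0.10, 5000)

# Inject sorting: 70% of near-ties get pushed to the incumbent's side,
# mimicking a pro-incumbent imbalance in very close races.
near_tie = np.abs(margins) < 0.005
flip_up = near_tie & (rng.random(5000) < 0.7)
margins[flip_up] = np.abs(margins[flip_up])

# If close elections were as-if random, counts in narrow bins on either
# side of the cutoff would be roughly equal.
h = 0.005
just_below = int(np.sum((margins >= -h) & (margins < 0)))
just_above = int(np.sum((margins >= 0) & (margins < h)))
print(just_below, just_above)  # a large excess above zero signals sorting
```

This is only the simplest version of the diagnostic; the real test fits local densities on each side of the cutoff rather than comparing raw bin counts.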

I had a few comments:

Bayes in China update


Some clarification on the Bayes-in-China issue raised last week:

1. We heard that the Chinese publisher cited the following pages that might contain politically objectionable materials: 3, 5, 21, 73, 112, 201.

2. It appears that, as some commenters suggested, the objection was to some of the applications, not to the Bayesian methods.

3. Our book is not censored in China. In fact, as some commenters mentioned, it is possible to buy it there, and it is also available in university libraries there. The edition of the book which was canceled was intended to be a low-cost reprint of the book. The original book is still available. I used the phrase "Banned in China" as a joke and I apologize if it was misinterpreted.

4. I have no quarrel with the Chinese government or with any Chinese publishers. They can publish whatever books they would like. I found this episode amusing only because I do not think my book on regression and multilevel models has any strong political content. I suspect the publisher was being unnecessarily sensitive to potentially objectionable material, but this is their call. I thought this was an interesting story (which is why I posted the original email on the blog) but I did not, and do not, intend it as any sort of comment on the Chinese government, Chinese society, etc. China is a big country and this is one person at one publisher making one decision. That's all it is; it's not a statement about China in general.

I did not write the above out of any fear of legal action etc. I just think it's important to be fair and clear, and it is possible that some of what I wrote could have been misinterpreted in translation. If anyone has further questions on this, feel free to ask in the comments and I will clarify as best as I can.

I received the following in email from our publisher:

I write with regards to the project to publish a China Edition of your book "Data Analysis Using Regression and Multilevel/Hierarchical Models" (ISBN-13: 9780521686891) for the mainland Chinese market. I regret to inform you that we have been notified by our partner in China, Posts & Telecommunications Press (PTP), that due to various politically sensitive materials in the text, the China Edition has not met with the approval of the publishing authorities in China, and as such PTP will not be able to proceed with the publication of this edition. We will therefore have to cancel plans for the China Edition of your book. Please accept my apologies for this unforeseen development. If you have any queries regarding this, do feel free to let me know.

Oooh, it makes me feel so . . . subversive. It reminds me how, in Sunday school, they told us that if we were ever visiting Russia, we should smuggle Bibles in our luggage because the people there weren't allowed to worship.

Xiao-Li Meng told me once that in China they didn't teach Bayesian statistics because the idea of a prior distribution was contrary to Communism (since the "prior" represented the overthrown traditions, I suppose).

And then there's this.

I think that the next printing of our book should have "Banned in China" slapped on the cover. That should be good for sales, right?

P.S. Update here.

The other day I posted some evidence that, however things used to be, congressional elections are increasingly nationalized, and it's time to retire Tip O'Neill's slogan, "all politics is local." (The discussion started with a remark by O.G. blogger Mickey Kaus; I also explain why I disagree with Jonathan Bernstein's disagreement with me.)

Alan Abramowitz writes in with an analysis of National Election Study from a recent paper of his:

Average Correlations of House and Senate Votes with Presidential Job Evaluations by Decade

Decade       House vote   Senate vote
1972-1980    .31          .28
1982-1990    .39          .42
1992-2000    .43          .50
2002-2008    .51          .57

This indeed seems like strong evidence of nationalization, consistent with other things we've seen. I also like Abramowitz's secret-weapon-style analysis, breaking the data up by decade rather than throwing all the data in at once and trying to estimate a trend.
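The secret-weapon idea, estimating the same quantity separately within each subset rather than fitting one pooled trend, can be sketched as follows. The respondent-level data here are simulated to match the correlations in the table above; the real analysis of course uses the NES microdata.

```python
import numpy as np

rng = np.random.default_rng(1)

# Target approval/House-vote correlations by decade, from the table above.
targets = {"1972-1980": 0.31, "1982-1990": 0.39,
           "1992-2000": 0.43, "2002-2008": 0.51}

results = {}
for decade, rho in targets.items():
    n = 2000  # hypothetical respondents per decade
    approval = rng.standard_normal(n)
    # House vote correlated with presidential approval at roughly rho
    house_vote = rho * approval + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    results[decade] = np.corrcoef(approval, house_vote)[0, 1]

for decade, r in results.items():
    print(decade, round(r, 2))
```

Computing the statistic decade by decade, instead of estimating a single time trend, lets the nationalization pattern show itself without imposing a functional form on it.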

Bribing statistics


I Paid a Bribe by Janaagraha, a Bangalore-based not-for-profit, harnesses the collective energy of citizens and asks them to report on the nature, number, pattern, types, location, frequency, and values of corruption activities. These reports would be used to argue for improving governance systems and procedures, tightening law enforcement and regulation, and thereby reducing the scope for corruption.

Here's a presentation of data from the application:

Transparency International could make something like this much more widely available around the world.

While awareness is good, follow-up is even better. For example, it's known that New York's subway signal inspections were being falsified. Signal inspections are pretty serious stuff, as failures lead to disasters, such as the one in Washington. Nothing much happened afterward: the person responsible (making $163k a year) was merely reassigned.

Yesterday I wrote that Mickey Kaus was right to point out that it's time to retire Tip O'Neill's famous dictum that "all politics are local." As Kaus points out, all the congressional elections in recent decades have been nationalized. The slogan is particularly silly for Tip O'Neill himself. Sure, O'Neill had to have a firm grip on local politics to get his safe seat in the first place, but after that it was smooth sailing.

Jonathan Bernstein disagrees, writing:

All politics are local -- not


Mickey Kaus does a public service by trashing Tip O'Neill's famous dictum that "all politics are local." As Kaus points out, all the congressional elections in recent decades have been nationalized.

I'd go one step further and say that, sure, all politics are local--if you're Tip O'Neill and represent an ironclad Democratic seat in Congress. It's easy to be smug about your political skills if you're in a safe seat and have enough pull in state politics to avoid your district getting gerrymandered. Then you can sit there and sagely attribute your success to your continuing mastery of local politics rather than to whatever it took to get the seat in the first place.

Frank Hansen writes:

Columbus Park is on Chicago's west side, in the Austin neighborhood. The park is a big green area which includes a golf course.

Here is the google satellite view.

Here is the nytimes page. Go to Chicago, and zoom over to the census tract 2521, which is just north of the horizontal gray line (Eisenhower Expressway, aka I290) and just east of Oak Park. The park is labeled on the nytimes map.

The census data have around 50 dots (they say 50 people per dot) in the park which has no residential buildings.

Congressional district is Danny Davis, IL7. Here's a map of the district.

So, how do we explain the map showing ~50 dots' worth of people living in the park? What's up with the algorithm to place the dots?

I dunno. I leave this one to you, the readers.
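One plausible explanation, offered here only as a guess: dot-density maps often scatter a tract's dots uniformly at random within the tract's geometry, ignoring land use, so a populous tract whose boundary includes the park will inevitably show dots "living" in the park. A minimal sketch of that hypothetical placement scheme:

```python
import random

random.seed(0)

def place_dots(tract_pop, bbox, people_per_dot=50):
    """Scatter tract_pop // people_per_dot dots uniformly at random in the
    tract's bounding box, with no knowledge of where people actually live."""
    xmin, ymin, xmax, ymax = bbox
    n_dots = tract_pop // people_per_dot
    return [(random.uniform(xmin, xmax), random.uniform(ymin, ymax))
            for _ in range(n_dots)]

# Hypothetical tract of 2500 residents whose geometry includes the park.
dots = place_dots(2500, (0.0, 0.0, 1.0, 1.0))

# Pretend the eastern half of the tract is parkland with no housing.
in_park = sum(1 for x, y in dots if x > 0.5)
print(len(dots), in_park)  # 50 dots total; a good share land in the "park"
```

If the real algorithm works anything like this, the ~50 dots in Columbus Park would just be the tract's population spread over its whole area, golf course included.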

As we said in Red State, Blue State, it's not the Prius vs. the pickup truck, it's the Prius vs. the Hummer. Here's the graph:


Or, as Ross Douthat put it in an op-ed yesterday:

This means that a culture war that's often seen as a clash between liberal elites and a conservative middle America looks more and more like a conflict within the educated class -- pitting Wheaton and Baylor against Brown and Bard, Redeemer Presbyterian Church against the 92nd Street Y, C. S. Lewis devotees against the Philip Pullman fan club.

Our main motivation for doing this work was to change how the news media think about America's political divisions, and so it's good to see our ideas getting mainstreamed and moving toward conventional wisdom.

See below. W. D. Burnham is a former professor of mine, T. Ferguson does important work on money and politics, and J. Stiglitz is a colleague at Columbia (whom I've never actually met). Could be interesting.

I came across an interesting article by T. W. Farnam, "Political divide between coasts and Midwest deepening, midterm election analysis shows."

There was one thing that bugged me, though.

Near the end of the article, Farnam writes:

Latinos are not swing voters . . . Exit polls showed that 60 percent of Latino voters favored Democratic House candidates - a relatively steady proportion with the 69 percent the party took in 2006, the year it captured 31 seats.

Huh? In what sense is 60% close to 69%? That's a swing of 9 percentage points. The national swing to the Republicans can be defined in different ways (depending on how you count uncontested races, and whether you go with total vote or average district vote) but in any case was something like 8 percentage points.

The swing among Latinos was, thus, about the same as the national swing. At least based on these data, the statement "Latinos are not swing voters" does not seem supported by the facts. Unless you also want to say that whites are not swing voters either.
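The arithmetic behind that comparison is simple enough to check directly (the 8-point national figure is the post's approximation, which, as noted, depends on how you count):

```python
# Democratic share of the Latino House vote, per the exit polls quoted above.
latino_2006 = 69
latino_2010 = 60
latino_swing = latino_2006 - latino_2010  # percentage-point swing to the GOP

national_swing = 8  # approximate national swing, per the post

print(latino_swing, national_swing)  # 9 vs. 8: about the same swing
```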

Just chaid


Reading somebody else's statistics rant made me realize the inherent contradictions in much of my own statistical advice.

Incumbency advantage in 2010



See here for the full story.

Dan Hopkins sends along this article:

[Hopkins] uses regression discontinuity design to estimate the turnout and election impacts of Spanish-language assistance provided under Section 203 of the Voting Rights Act. Analyses of two different data sets - the Latino National Survey and California 1998 primary election returns - show that Spanish-language assistance increased turnout for citizens who speak little English. The California results also demonstrate that election procedures can influence outcomes, as support for ending bilingual education dropped markedly in heavily Spanish-speaking neighborhoods with Spanish-language assistance. The California analyses find hints of backlash among non-Hispanic white precincts, but not with the same size or certainty. Small changes in election procedures can influence who votes as well as what wins.

Beyond the direct relevance of these results, I find this paper interesting as an example of research that is fundamentally quantitative. The qualitative finding--"Spanish-language assistance increased turnout for citizens who speak little English"--reaches deep into dog-bites-man territory. What makes the paper work is that the results are quantitative (for example, comparing direct effect to backlash).

P.S. I love love love that Hopkins makes his points with graphs that display data and fitted models.

P.P.S. The article is on a SSRN website that advertises "*NEW* One-Click Download." Huh? You click on a pdf and it downloads. That's standard, no? What would be the point of "two-click download"? Screening out the riff-raff?

