Philip Stark sent along this set of calculations on the probability that the hidden message in Gov. Schwarzenegger's message could've occurred by chance. The message, if you haven't heard, is:
Tuesday 3 Nov, 4-5:30pm in Room R505, Department of Government, LSE.
Culture wars, voting and polarization: divisions and unities in modern American politics
On the night of the 2000 presidential election, Americans sat riveted in front of their televisions as polling results divided the nation's map into red and blue states. Since then the color divide has become a symbol of a culture war that thrives on stereotypes--pickup-driving red-state Republicans who vote based on God, guns, and gays; and elitist, latte-sipping blue-state Democrats who are woefully out of touch with heartland values. But how does this fit into other ideas about America being divided between the haves and the have-nots? Is political polarization real, or is the real concern the perception of polarization?
This work is joint with David Park, Boris Shor, Joseph Bafumi, Jeronimo Cortina, and Delia Baldassarri.
(Here's a video version of the talk, from when I gave it at Google.)
I'll be interested to see if people can explain to me the relevance (or lack thereof) of this work to politics in Britain and other countries.
P.S. I'm speaking at LSE on Monday also (on a different topic).
P.P.S. I'll be speaking again a couple times in London later in the academic year, but on other topics. All my talks there will be different.
Lane Kenworthy, Yu-Sung Su, and I write:
Income inequality in the United States has risen during the past several decades. Has this produced an increase in partisan voting differences between rich and poor? We examine trends from the 1940s through the 2000s in the country as a whole and in the states. We find no clear relation between income inequality and class-based voting.
This article will appear in a special issue of Social Science Quarterly on the topic of "Inequality and Poverty: American and International Perspectives." We have some pretty graphs, some of which appeared in the Red State, Blue State book and some of which didn't.
P.S. "We find no clear relation . . .": That works great in an academic article but I don't think we'll be grabbing the headlines anytime soon.
My colleague Boris Shor has performed some analysis (jointly with Nolan McCarty) on the ideological positions of state legislators. The estimates are based on state legislative voting, which might make you wonder how you could possibly compare legislators in one state with those in another. The trick is that some state representatives (for example, Barack Obama) also end up in Congress. There are enough of these overlap cases that you can put legislators from all 50 states on a common scale.
Boris and Nolan most recently applied their method to compare Deirdre Scozzafava, a state assemblywoman running on the Republican ticket in a special election in New York's 23rd congressional district. Boris writes:
Robin Blackburn in Port-au-Prince:
The conference was opened by the prime minister, Michele Pierre-Louis, who was appointed despite a scurrilous campaign by opposition forces, who argued that appointing a lesbian to such a prominent position was a violation of Haitian manhood. Pierre-Louis had been the director of an NGO known as Fokal (Fondasyon Konesans ak Libete). In choosing her, Preval was thought to have made an adroit move, pleasing the NGO and donor communities: Fokal is supported by George Soros and various Canadian charities.
The above title is a joke. I haven't actually seen the book. As a big-time blogger, I get some books in the mail to review, but maybe this one is sitting in my NYC office. Anyway, the backlash has begun, so maybe this is the right time to buy low and be the first to offer the contrarian claim that, despite what everybody's saying, the book is awesome.
From a short-term economics standpoint, the controversy has gotta be good for the book. So far, Levitt and Dubner have put the words "GLOBAL COOLING" on the cover of their book, they've endorsed a report saying that future trends are "virtually assuring us of about 30 years of global cooling," and that "even if man is warming the planet, it is a small part compared with nature," and they've written that "we believe that rising global temperatures are a man-made phenomenon and that global warming is an important issue to solve." That last bit should do the job of ticking off anybody who was with them so far! (I was actually surprised when reading the comments on that last quote--where Levitt assures us that they do believe in global warming--that pretty much all the global-warming-skeptics among the commenters still seem to think that Levitt is on their side. I guess half a loaf is better than none at all, politically speaking, but I'm surprised that more of them didn't get angry at Levitt for saying that.)
I don't want to get into the substance of climate models, a subject on which I've worked only a little bit. (The paper we wrote a few years ago never got published--actually, we never finished it enough to submit it anywhere--and our current work on the topic is still in the research-and-writing-up stage.) But I do want to speculate a bit on the political angle.
As I'm sure you know by now, I'm interested in differences between rich and poor. Higher-income people are more likely to vote Republican, and we've seen this in many different subgroups of the population. Among whites, among blacks, among religious attenders, etc., the poorer voters among these subgroups are more Democratic and the richer ones are more Republican.
This got me wondering: What are the subgroups of the population for which this isn't true? Or, more generally, how do rich and poor differ in their voting patterns, in different subgroups of the population?
Here's what we found, courtesy of the 2000 and 2004 Annenberg surveys. For each group, we're looking at Republican share of the two-party vote intention among people in the upper third of family income, minus Republican share ... among people in the lower third of family income:
(Click on any of these graphs to see larger versions.)
A striking pattern. The differences between rich and poor are much larger among conservative, Republican groups than among liberal, Democratic groups. At the very bottom of the graph above, you see a few groups where richer people are more likely to vote Democratic. All of these are groups that are mostly liberal and Democratic.
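To make the quantity plotted in these graphs concrete, here is a minimal sketch of the rich-minus-poor gap for a single subgroup. The function names and the respondent counts are made up for illustration; they are not the Annenberg numbers.

```python
def two_party_rep_share(rep, dem):
    """Republican share of the two-party vote intention (third parties and undecideds excluded)."""
    return rep / (rep + dem)


def rich_poor_gap(upper, lower):
    """Republican share in the upper income third minus the share in the lower third.

    upper, lower: (rep, dem) respondent counts within one subgroup.
    """
    return two_party_rep_share(*upper) - two_party_rep_share(*lower)


# Hypothetical counts for one subgroup, invented purely for illustration.
gap = rich_poor_gap(upper=(60, 40), lower=(45, 55))
print(f"rich-poor gap: {gap:+.0%}")
```

A positive gap means the richer respondents in that subgroup lean more Republican than the poorer ones; a negative gap corresponds to the few groups at the bottom of the graph.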
Kathy Dopp pointed me to this analysis she did regarding the Afghan election. I don't have the time/energy to look into this myself right now but I thought I'd pass this along so that others can comment if they'd like.
Bear with me. I've got a lot of graphs here (made jointly with Daniel Lee). Click on any of them to see the full-size versions.
I'll start with our main result. From the 2004 Annenberg surveys:
Providing health insurance for people who do not already have it--should the federal government spend more on it, the same as now, less, or no money at all?
The maps below show our estimated percentages of people responding "more" (rather than "the same," "less," or "none") to this question:
Increased government spending on health was particularly favored by people under 65 and those in the lower end of the income distribution. Older and higher-income people are much more likely to be in opposition. And, yes, there's some variation by state--you can see a band of states in the middle of the country showing opposition--but age and income explain a lot more.
A while ago I was invited by Keying Ye to contribute to a book of essays, Frontiers of Statistical Decision Making and Bayesian Analysis, in honor of the great Jim Berger. Here's my chapter, which begins:
Jim Berger has made important contributions in many areas of Bayesian statistics, most notably on the topics of statistical decision theory and prior distributions. It is the latter subject which I shall discuss here. I will focus on the applied work of my collaborators and myself, not out of any claim for its special importance but because these are the examples with which I am most familiar. A discussion of the role of the prior distribution in several applied examples will perhaps be more interesting than the alternative of surveying the gradual progress of Bayesian inference in political science (or any other specific applied field).
I will go through four examples that illustrate different sorts of prior distributions as well as my own progress--in parallel with the rest of the statistical research community--in developing tools for including prior information in statistical analyses . . .
Following up on some links, I came across this:

As a beneficiary of indoor smoking bans, I can't say that I agree with the sentiment, but the poster is pretty clever, and it got me thinking. Imagine Churchill on his regular dose of alcohol but without the moderating influence of the tobacco. Maybe it would've been a disaster. Seems like a joke, but maybe we'd all be blogging in German right now. I'd like to think, though, Churchill would've switched to chewing tobacco and all would be ok. A spittoon in the corner is a small price to pay for freedom.
Christopher Rhoads writes:
Interested to know what your comment would be on the following article, which includes the following lines:
Lee Sigelman writes that Senator Coburn of Oklahoma is proposing to zero out funding for the National Science Foundation's political science program. It's hard for me to believe this will even come close to happening, and conflict of interest prevents me from saying anything at all trustworthy on the subject--I've had nearly continuous NSF funding for the past 23 years--but I'll tell ya this: I clicked through to Sen. Coburn's list of NSF-funded projects that he'd like to cut, which included:
- $91,601 to conduct a survey to determine why people are for or against American military conflicts.
- $8,992 to study campaign finance reform, with the stated intent of providing "a basis for assessing future proposed changes to campaign finance regulations."
- $958 for a direct mail survey of the residents of Celebration, Florida regarding their feelings of living in a privately operated city.
Following my comments on their article on U.S. military funding and conflict in Colombia, Oeindrila Dube and Suresh Naidu wrote:
Thanks for the comments on our paper. It seemed that you viewed the correlations in the analysis as an interesting descriptive exercise, but not interpretable as causal. We agree with you that the most interesting social science is often causal, and in this case in particular the causal claims are the main results. The paper's punchline is that military aid needs to be reconsidered when there is collusion between the army and non-state armed groups, and we couldn't make this claim if we thought the results were purely descriptive.
In the paper, we do a lot of sample splitting and parametric time controls to rule out the possibility that this is a spurious effect. For example, our results are robust to including a base-specific time trend, along with a base-specific post-2001 dummy.
Possibly the best evidence against a strict "conflict" time-series interpretation is that there is no effect (positive or negative) of US military aid on guerrilla attacks near Colombian military bases. In other words, it's not just an increase in conflict on all sides, but an increase in paramilitary attacks in particular.
The "differential time trend" that could drive our effect would have to be a) steeply nonlinear b) only applicable to paramilitaries in base municipalities, and c) would have to be fairly unique to the base municipalities, given the wide variety of alternate control groups we examine. So we think this is not a likely alternative explanation that can account for the effects.
To which I replied:
First off, I still would prefer associational language followed by causal speculation. But I can respect your different choice of emphasis. Now to get to details: my basic alternative model goes as follows:
- Conflict in Colombia increased during the early 2000's.
- U.S. military aid, in the U.S. and elsewhere, increased during that period also.
- Most of the paramilitary attacks (and, thus, most of the increase in paramilitary attacks) occurred near military bases.
Thus, I'm not so impressed by the "differential time trend" argument. It's unsurprising (but nonetheless worth noting, as you do) that there are fewer guerilla attacks near military bases. But that doesn't mean that the paramilitary attacks wouldn't have increased in the absence of U.S. aid.
None of the above really contradicts your main political story, which is that the Colombian military is involved in paramilitary attacks, and that U.S. aid is an enabler for this sort of violence.
My story above is consistent with your causal story--more U.S. aid, more resources for the military, more paramilitary attacks. It's also consistent with a different causal story, which goes like this: more conflict, more paramilitary attacks, also more U.S. aid which actually serves to stop the situation from getting worse. The argument is, yes, the U.S. is giving weapons to the bad guys, but by doing so, it co-opts them and restrains their behavior.
OK, I'm not saying this latter argument is true, but I think your strongest argument against it is to say something like: "Sure, it's possible that things would be getting even worse in the absence of U.S. military aid. But given that, during the time that aid was higher, violence was also higher--and we're talking here about violence being done by the allies of the recipients of the aid--well, maybe aid isn't such a good idea." That is, you can put the burden of proof on the advocates of aid. Hey, it costs money and it's going to some unsavory characters. You shouldn't have to prove that aid is hurting; I think it would be more defensible, from a statistical/econometric point of view, to show the association and put the ball in their court.
P.S. Just to be clear: I don't have any strong feeling that you're wrong or any goal of "debunking" your paper. It's interesting and important work and I'm trying to understand it better.
And then they shot back with:
Regarding the stylistic point about associations and causal claims, we think this is perhaps discipline-specific, as the style in economics seems to be to make a causal claim and then rule out all the alternative causal stories as much as possible. I'm sure this is probably one of many idiosyncrasies that irks non-economists.
The substantive question is why paramilitary attacks (and paramilitary attacks specifically, rather than other measures of conflict), increase more in places near bases. The account we put forward is that this occurs because the Colombian military funnels a share of its resources to paramilitary groups. Thus, if US military aid translates into more resources for the military which are shared with paramilitary groups, the implication is that in the absence of increases in US military aid, paramilitary attacks would not have increased by as much as they did.
Now the alternative account you put forward is "more conflict, more paramilitary attacks, also more U.S. aid which actually serves to stop the situation from getting worse. The argument is, yes, the U.S. is giving weapons to the bad guys, but by doing so, it co-opts them and restrains their behavior."
It seems like you have two distinct things in mind, that overall conflict is a source of bias, and an associated conjecture that this omitted variable (overall conflict) upward biases our main coefficient since it is positively correlated with paramilitary attacks and positively correlated with the aid shock. First, we explicitly address and rule out potential omitted variables using a number of empirical specifications. But, even if there is an omitted variable correlated with U.S. military aid that differentially affects paramilitary attacks in base municipalities, it is not clear whether the direction of the bias would be positive. As an example, say a change in Colombian government leads the state to become more effective in fighting the guerilla insurgency, and the US rewards the state with more military aid, while paramilitary activity declines differentially in base regions, as this activity becomes less necessary with greater military effectiveness. In this case, the omitted variable (stronger Colombian state) is negatively correlated with paramilitary attacks and positively correlated with the aid shock, and this would lead us to underestimate the true effect of U.S. aid on paramilitary activity.
Moreover, we think we do a good job ruling out "conflict in general" at the national, state, or municipality level as a confounding variable. "Overall conflict" variation at the country level is absorbed by year fixed effects, and conflict at the department level is absorbed by the department x year fixed effects. At the municipal level, it is NOT the case that we observe increases in overall conflict, such as total number of clashes amongst all armed actors at the municipal level. (In our data, attacks are one-sided events carried out by a particular group. The fact that we see paramilitary attacks increase means we are specifically observing increases in events that involve only paramilitary groups - e.g., the paramilitaries attack a village or destroy some type of infrastructure.) Also, in every specification we find no effect on the guerrilla attacks, and we think you are not taking the non-effect sufficiently seriously in terms of countering the overall conflict account. The guerilla non-effect actually provides very robust evidence that the U.S. military aid is not just correlated with any type of conflict, but rather with attacks by a particular group (which has no regional spillovers).
In addition, our base-specific linear trend and post-2001 dummy specification should convince you that our effect is not merely a post-2001 increase in conflict that manifests particularly as paramilitary attacks in base municipalities.
Your alternative account suggests that more aid to paramilitary organizations could actually result in less violence. While it is challenging to know what the counterfactual would have been in the absence of increased aid, Figure 2 shows that when aid rises sharply in 1999 there is a differential increase in aid in the base regions, and when aid decreases in 2001, there is a corresponding closing of differential decrease in the base regions. This seems inconsistent with the idea that lower aid translates into more paramilitary activity. Also, after 2002, when aid rises again, the differential increases yet another time. It is difficult to explain this pattern with the account you put forward, which would have to require additional coincidental reasons why paramilitary attacks should increase more in base regions precisely in 1999, then decline in 2001, and then rise again in 2002. This is possible, but seems unlikely.
We were thinking of some ideas that would be consistent with your alternative account, of why more aid to paramilitary organizations could actually lower violence. One story here could be deterrence - that stronger paramilitaries deter the guerillas resulting in fewer attacks by guerillas or fewer clashes between guerillas and paramilitaries. But, our results do not show a fall in guerilla attacks or clashes amongst the two groups; rather the coefficient on these other variables is close to 0 and they are statistically insignificant, which is inconsistent with the deterrence account.
Another reason could be dependence, that in the short run U.S. aid increases paramilitary violence, but it also induces paramilitary reliance on the Colombian military for supplies, which increases the sway the government has vis-à-vis this group, potentially leading to future demobilization. Thus in the long-run, U.S. military aid reduces paramilitary violence. While this process could take "long and variable lags" to manifest, it is important to note that we see a dramatic increase in paramilitary activity in 2005, despite a half-decade of huge U.S. military transfers to Colombia. Thus we do not see evidence of this dependence account in our data.
Keith Ellis writes:
I've been wondering about the use of sophisticated mathematical techniques to discover what the real-world political ideologies are, starting without conventional preconceptions.
The core idea I had when this came to me many years ago was by way of reading some technical articles about color vision. I was struck by one paper, which I could barely understand, which attempted to determine the "spatial" dimensionality of color vision...I recall vaguely that the conclusion was that it is best described in a 28 or so dimensional space. This connected up, conceptually in my imagination, with what was then the nascent specialty of the stuff involved in the Netflix prize--I can't recall the technical term...preference modeling?
Culture wars, voting, and polarization: divisions and unities in modern American politics
I will be discussing various recent work, including material that has appeared in this article and this book.
And here's the video from the last time I gave this lecture.
Ecole Doctorale
Third-floor meeting room (salle de réunion du 3ème étage)
199, boulevard Saint-Germain
Monday 5 October 2009, 5:30pm
Enjoy.
I was in the library the other day and saw a new book, Why are Jews Liberals?, by O.G. neoconservative Norman Podhoretz. This is right up my alley, research-wise, and so I took a look. I don't think Podhoretz's book will match the sales of Thomas Frank's similar record of frustration, "What's the Matter with Kansas?"--there are very few writers out there who can match Frank's skill with the perceptive quip--but this new book has something to offer, if nothing else by presenting the view of an influential battler in the world of political ideas.
Podhoretz's argument (here's a quick summary) goes as follows. Jews in America vote overwhelmingly for Democrats, even though you'd expect from their income levels that Jews would lean Republican. Expanding this, Podhoretz gives three reasons why Jews should vote for Republicans:
I'd much much rather have the Washington Post have a competition for America's Next Great Reporters. We have enough Next Great Pundits as it is.
"The cab driver - who witnesses said was talking on his cell phone and appeared distracted - slowed briefly but then tried to speed away . . ." (link from Streetsblog). (And this other source says he "has several traffic violations on his driving record.")
Something similar happened to us not long ago (although, luckily, nobody was hurt). This time the cops told the driver (who, again, had to be stopped by people on the street so she couldn't drive off) not to worry about it.
P.S. I'm not saying the drivers should go to prison. The more appropriate sanction would be to get them out of the driver's seat of a car. For the guy who killed the kid and tried to escape, perhaps forbidding him to drive for 20 years would be appropriate. But really it should never have reached that point: if each of the "several traffic violations" had resulted in his car being taken away and him being forbidden to drive for some period of time, then it's likely he wouldn't have been on the road that day and the kid would still be alive.
On the other hand, if hitting a kid and driving away is considered OK, then of course you'll still see people driving that way.
Macartan Humphreys sends along this article where he proves that the requirement of "compactness" in districting, if interpreted as requiring districts to be convex, does not by itself stop a majority party from gerrymandering:
Gerrymandering--the manipulation of electoral boundaries to maximize constituency wins--is often seen as a pathology of democratic systems. A commonly cited cure is to require that electoral constituencies have a `compact' shape. But how much of a constraint does compactness in fact place on would-be gerrymanderers? By applying a theorem of Kaneko, Kano, and Suzuki (2004) to the two party situation we show that a gerrymanderer can always create equal sized convex constituencies that translate a margin of k voters into a margin of at least k constituency wins. Thus even with a small margin a majority party can win all constituencies. Moreover there always exists some population distribution such that all divisions into equal sized convex constituencies translate a margin of k voters into a margin of exactly k constituencies. Thus a convexity constraint can sometimes prevent a gerrymanderer from generating any wins for a minority party.
Chris Blattman reports on a study by Seema Jayachandran and Ilyana Kuziemko that makes the following argument:
Medical research indicates that breastfeeding suppresses post-natal fertility. We [Jayachandran and Kuziemko] model the implications for breastfeeding decisions and test the model's predictions using survey data from India. . . . mothers with no or few sons want to conceive again and thus limit their breastfeeding. . . . Because breastfeeding protects against water- and food-borne disease, our model also makes predictions regarding health outcomes. We find that child-mortality patterns mirror those of breastfeeding with respect to gender and its interactions with birth order and ideal family size. Our results suggest that the gender gap in breastfeeding explains 14 percent of excess female child mortality in India, or about 22,000 "missing girls" each year.
Interesting. I wonder what Monica Das Gupta would say about this study--she seems to be the expert in this area.
Huh?
The only thing that really puzzles me about Jayachandran and Kuziemko's article is that, on one hand, they produce an estimate of 14%, but on the other, they write:
In contrast to conventional explanations, excess female mortality due to differential breastfeeding is largely an unintended consequence of parents' desire to have more sons rather than an explicit decision to allocate fewer resources to daughters.
But they just said their explanation only explains 14%. Doesn't that suggest that the other 86% arises from infanticide and other "explicit decisions"? The difference between "14%" and "largely" is so big that I think I must be missing something here. Perhaps someone can explain? Thanks.
Tom Holbrook writes:
I just saw your post on the generic ballot and thought you might be interested in something I posted just the other day. My post was stimulated by something Charlie Cook had written a couple of weeks ago, and I hadn't yet seen the Bafumi, Erikson, and Wlezien article. Anyway, I find that there isn't much connection between generic ballots and midterm results this far (14 months) out from the election. My analysis doesn't break down by in-party and out-party, and it uses data much farther out than Bafumi et al. used, so it is not directly comparable to their work; but I thought you might find it interesting.
Under the heading, "Republicans not in a position to retake the House (yet)," Chris Bowers estimates that the Democrats have a 41.2%-37.7% lead in recent generic House polling. Bowers writes, "Democrats are, after all, still winning."
But it's not so simple. In research published a couple years ago, Joe Bafumi, Bob Erikson, and Chris Wlezien found that, yes, generic party ballots are highly predictive of House voting--especially in the month or two before the election--but that early polling can be improved by adjusting for political conditions. In particular, the out-party consistently outperforms the generic polls.
The paper accompanying this graph was among the first public predictions of a Democratic takeover in 2006.
Bafumi, Erikson, and Wlezien's analysis doesn't go back before 300 days before the election, but if we take the liberty of extrapolating . . . The current state of the generic polls gives the Democrats .412/(.412+.377) = 52% of the two-party vote. Going to the graph, we see, first, that 52% for the Democrats is near historic lows (comparable to 1946, 1994, and 1998) and that the expected Democratic vote--given that their party holds the White House--is around -3%, or a 53-47 popular vote win for the Republicans.
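For anyone who wants to check the two-party arithmetic, here is a minimal sketch. It only covers the conversion from the 41.2%-37.7% generic-ballot numbers to a two-party share; the step from there to the expected vote is read off the graph and is not reproduced here. The function name is just for illustration.

```python
def two_party_share(dem, rep):
    """Democratic share of the two-party vote, dropping undecideds and third parties."""
    return dem / (dem + rep)


# Generic-ballot numbers quoted above: Democrats 41.2%, Republicans 37.7%.
share = two_party_share(0.412, 0.377)
print(f"Democratic two-party share: {share:.1%}")  # prints 52.2%
```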
Would 53% of the popular vote be enough for the Republicans to win a House majority? A quick look, based on my analysis with John Kastellec and Jamie Chandler of seats and votes in Congress, suggests yes.
It's still early--and there's a lot of scatter in those scatterplots--but if the generic polls remain this close, the Republican Party looks to be in good shape in 2010.
P.S. Is there any hope for the Democrats? Sure. Beyond the general uncertainty in prediction, there is the general unpopularity of Republicans; also, it will be year 2 of the presidential term, not year 6 which is historically the really bad year for the incumbent party. Still and all, the numbers now definitely do not look good for the Democrats.
I received the following email the other day:
I read the abstract for your paper What is the probability your vote will make a difference? with Nate Silver, Aaron Edlin [to appear in Economic Inquiry]. I'd note that the abstract prima facie contains an error. Your sentence in the abstract, "On average, a voter in America had a 1 in 60 million chance of being decisive in the presidential election." can not be correct. If we assume that this sentence is correct that means that given the actual turnout of 132,618,580 people the sum total probability of voters being decisive is larger than one. This of course [sic] is impossible. The total amount of decisiveness must be at most one (although obviously the sum total can be lower than one if the voters are not equally disposed to both candidates). . . .
The above argument is at first appealing but is not actually correct. Actually the total probability can exceed 1. For a simple mathematical example, see p.425 of this paper. The reason the total probability can exceed 1 is that it is possible for many voters to be decisive at the same time.
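Here is a toy illustration of the point (my own sketch, not necessarily the example in the paper linked above). Suppose an odd number of voters each flip a fair coin, independently, to choose between two candidates. A voter is decisive when the other voters split evenly, and when that happens for one voter on the winning side it happens for all of them at once. The sum of the individual probabilities is the expected number of decisive voters, and nothing forces that to stay below 1.

```python
from fractions import Fraction
from itertools import product


def total_decisiveness(n_voters):
    """Sum over voters of P(voter is decisive) in a toy coin-flip election.

    Assumes n_voters is odd and every voter independently picks one of two
    candidates with probability 1/2. A voter is decisive when the remaining
    voters split exactly evenly.
    """
    n_others = n_voters - 1
    outcomes = list(product([0, 1], repeat=n_others))
    tied = sum(1 for votes in outcomes if 2 * sum(votes) == n_others)
    p_decisive = Fraction(tied, len(outcomes))
    # By symmetry every voter has the same probability, so the total is n times it.
    return n_voters * p_decisive


print(total_decisiveness(3))  # 3/2
print(total_decisiveness(5))  # 15/8

# The correspondent's own numbers: turnout times a 1-in-60-million chance per voter.
print(132_618_580 / 60_000_000)  # about 2.21
```

With three voters, each one is decisive with probability 1/2, so the total is 3/2. A total of about 2.2 for the actual election is therefore not a contradiction.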
Mark Thoma links to this article by Bill Easterly about the history of economic development in the mid-twentieth century. Easterly writes:
Why does this history matter today . . . I [Easterly] do NOT mean to imply guilt by association for development as imperialist and racist; there are many theories of development and many who work on development (including many from developing countries themselves) that have nothing to do with imperialism and racism.
But I [Easterly] think the origin of development as cover for imperialism and racism did have toxic legacies for some. First, it meant that the concept of development was determined to fit a propaganda imperative; it was NOT a breakthrough in thought by economists. Second, it followed that development from the beginning would stress the central role of Western aid to help the helpless natives. . . And this history also seems strangely relevant with today's "humanitarian" nouveau-imperialism to invade and fix "failed states" like Iraq and Afghanistan.
I defer to Easterly both on the history and the economics of international development, but I do have one criticism of his argument. It is my impression that a lot of ideas in economic development are not just about the interaction between "first world" and "third world" countries (Easterly's focus) but also relate to struggles within individual third world countries. In some countries, the international development people were opposing white elites. This doesn't mean that either side was necessarily correct in its economic assumptions, but it seems a bit extreme to think of economic development experts as supporting white superiority.
Of all the first-world institutions that were influencing poorer countries during those times fifty years ago, I'd think that the international development community was one of the less racist.
Andrew Therriault writes:
I'm creating a model of issue emphasis in political campaigns as a product of public opinion (so candidates choose what to discuss strategically based on which issue will help them most), and the data I'm using combines candidates' ad spending (coded by issue) with the public's issue positions in the candidates' districts. Thus far, I've used percentage of ad spending per issue for each candidate as my DV in OLS and tobit models. I know that this specification is not optimal, though, because of the correlation between each candidate's observations (since they are constrained to sum to 100).
Peter Loewen sends along this article that follows up on some work of James Fowler, Aaron Edlin, Noah Kaplan, and myself on rational voter turnout. He goes with the rational-but-not-entirely-self-interested model that we all use (and which first appeared, as far as we know, in a book by Derek Parfit in 1984), and tests it empirically:
Lynn Vavreck writes:
I just heard the Carter interview about Obama and racism. Simon Jackman and I have a bit of evidence on this from the Cooperative Campaign Analysis Project. These are data from white, registered voters nationwide about stereotypes of different groups. You can see, roughly a third of the people think blacks are inferior to whites on lazy v. hardworking and similarly on intelligent/unintelligent: [click for larger version of the table]
This, of course, doesn't answer the question about whether Carter is right -- but, it does provide some systematic evidence for his claim that many Americans don't think African Americans are "qualified" (his words) to lead this country.
Some people would take these data as evidence of racism, but I have a more positive spin. The table gives the average rating for whites, and for southern whites, and from these you can back out the implied rating for non-southern whites. And we're lookin good. We're as intelligent as Asians and almost as hardworking! In the words of a famous non-hardworking, non-southern white: Woo-hoo!
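The backing-out step is just a weighted-average identity. Here is a minimal sketch, assuming you know (or are willing to guess) what share of the white respondents are southern. The function name is hypothetical, and the numbers in the example are made up for illustration, not taken from the table.

```python
def implied_subgroup_mean(overall_mean, known_mean, known_share):
    """Back out the mean of the remaining subgroup from a weighted average.

    overall_mean = known_share * known_mean + (1 - known_share) * other_mean,
    so solve for other_mean. known_share must be in [0, 1).
    """
    if not 0 <= known_share < 1:
        raise ValueError("known_share must be in [0, 1)")
    return (overall_mean - known_share * known_mean) / (1 - known_share)

# Made-up illustration: all whites rated 5.0, southern whites 4.0,
# and half of the white respondents assumed to be southern.
print(implied_subgroup_mean(5.0, 4.0, 0.5))  # 6.0
```

The smaller the known subgroup, the less its rating moves the implied rating for everyone else, so the answer depends on the share you plug in.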
Of course I don't buy this--no way are non-southern whites as hard-working and intelligent as Asian Americans--I mean, c'mon. But it's good to know that white people, at least, still think this. Now I want to see Lynn and Simon break down their respondents by where they live. Do southern whites themselves think they are lazier and dumber than non-southern whites??
P.S. The question stem reads:
Now, some questions about different groups in our society. Rate each group on the following scale, where "1" means you think almost all of the people in that group are "lazy"; and "7" means that you think almost everyone in the group is "hardworking."
A correspondent who wishes to remain anonymous sends this in:
From Pat Robertson's Regent University course catalog: GOV 601 Quantitative Analysis (3) Skills for quantitative data gathering, measurement, policy analysis and program evaluation. Research and sampling design, surveys, data collection and data reduction and display. Review of basic statistics through multivariate analysis, z-scores, regression through the use of statistical computer package (SPSS), and a Judeo-Christian perspective on the use of statistics.
I wonder if they teach the principle that God is in every leaf of every tree.
(I looked for the course description online but couldn't find it. But the description seems consistent with others in the catalog.)
Doug writes:
Probability sampling is a great invention, but rhetoric has overtaken reality here. Both of the probability samples in this study had large amounts of nonresponse, so that the real selection probability--i.e., the probability of being selected by the surveyor and the respondent choosing to participate--is not known. Usually a fairly simple nonresponse model is adequate, but the accuracy of the estimates depends on the validity of the model, as it does for non-probability samples. Nonresponse is a form of self-selection. All of us who work with non-probability samples should spend our efforts trying to improve the modeling and methods for dealing with the problem, instead of pretending it doesn't exist.
Good stuff. Read the whole thing. Doug was, along with me and several others, an advisor on the recent report on National Election Study weighting.
David Shor writes:
You did some initial analysis of Nate's election forecasting work over on FiveThirtyEight. Forgive me for plugging in my own [Shor's] work, but I did pre-election poll aggregation using a different methodology, and performed about on par with Nate. (Slightly better with state level presidential races, worse with senate races, and quite a bit better with the popular vote).
A comparison of our results (and a spreadsheet with quite a bit of data) is available here.
In terms of possible improvements for next cycle:
Tyler Cowen links to an article by economist Ed Glaeser on urban political activists Jane Jacobs and Robert Moses. Moses, who ran various NYC government commissions in the mid-twentieth century, is famous for organizing the construction of bridges and structuring the financing so that he controlled the flow of money from the tolls. This independent source of funding gave him a huge amount of power within the government to do almost whatever he wanted--for a while, until Jacobs and others mustered the popular support to stop him. Given my experiences at Columbia University, I can appreciate Moses's bureaucratic acumen: in any organization I've been involved in, there aren't so many sources of free money--that is, funds that haven't already been allocated to some expense. Free money is a source of power. I imagine this is true within corporations as well.
That's all tactics, though. What's relevant for Glaeser's article is what Robert Moses did with his money and power, which was to build some highways and attempt to build others that, on the plus side, would make it faster for people to go through New York City on the way to or from other places and, on the minus side, would destroy some neighborhoods and make many of the un-destroyed neighborhoods less pleasant to be in (by being next to a highway, disconnected from the rest of the city, etc).
What about the specifics? Glaeser agrees that Moses's proposed lower Manhattan expressway was a bad idea, as was his highway that destroyed a neighborhood in the Bronx. On the plus side, Glaeser supports Moses's parks and swimming pools and describes his roads and bridges as "not all bad."
One thing that interests me about Glaeser's discussion is that, implicitly, there are two levels of liberal-conservative dispute here.
The National Election Study is hugely important in political science, but, as with just about all surveys, it has problems of coverage and nonresponse. Hence, some adjustment is needed to generalize from sample to population.
Matthew DeBell and Jon Krosnick wrote this report summarizing some of the choices that have to be made when considering adjustments for future editions of the survey. The report was put together in consultation with several statisticians and political scientists: Doug Rivers, Martin Frankel, Colm O'Muircheartaigh, Charles Franklin, and me. Survey weighting isn't easy, and this sort of report is just about impossible to write--you can't help leaving things out. They did a good job, though, and it's great to have this stuff put down in an official way, so that people can work off of it when going forward.
It's a lot harder to write a procedure for general use than to do a single analysis oneself.
Some corrections
I have a few corrections to add to the report that unfortunately didn't make it into the final version (no doubt because of space limitations):
John Sides's recent blog inspires me to resurrect this note of mine from a couple years ago:
Recently I posted some graphs showing that liberal Democrats have a similar income profile to the general population, while conservative Republicans are more concentrated in the higher half of the income range.
Some people conjectured that the patterns might depend on whether people are thinking of liberalism/conservatism as representing social or economic issues. So Daniel and I redid the calculations, looking separately at three different measures of survey respondents' ideology as derived from the 2000 Annenberg survey:
- self-positioning on the liberal/conservative scale
- position on a scale of economic ideology (based on combining the responses to several survey questions; details in Red State, Blue State)
- position on a social ideology scale (based on combining several other survey responses).
This gave us three 3x3 grids of graphs, one for each of the three ideology scales. This was too much to display, so Daniel and I reduced the data as follows: For each ideology measure, we created five categories of people:
- Liberal Democrats
- Moderate Democrats or Liberal Independents
- Neutral (these included Conservative Democrats, Moderate Independents, and Liberal Republicans)
- Moderate Republicans or Conservative Independents
- Conservative Republicans
This five-point scale takes you from one extreme of ideological partisanship to the other, and with only five categories instead of nine, it's easier to display.
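One way to code the collapse from nine cells to five, as a sketch: score ideology and party each as -1, 0, or +1 and add them. The additive coding and the names below are my shorthand for illustration, not necessarily how the original calculation was done, but the sum reproduces the five categories listed above.

```python
# Hypothetical codings: -1 = left, 0 = middle, +1 = right.
IDEOLOGY = {"liberal": -1, "moderate": 0, "conservative": 1}
PARTY = {"democrat": -1, "independent": 0, "republican": 1}

LABELS = {
    -2: "Liberal Democrats",
    -1: "Moderate Democrats or Liberal Independents",
    0: "Neutral",
    1: "Moderate Republicans or Conservative Independents",
    2: "Conservative Republicans",
}

def partisan_ideology_category(ideology, party):
    """Map one of the nine (ideology, party) cells to the five-point scale."""
    return LABELS[IDEOLOGY[ideology] + PARTY[party]]

print(partisan_ideology_category("conservative", "democrat"))  # Neutral
```

The three "Neutral" cells are exactly the ones where ideology and party cancel, which is why the middle category mixes Conservative Democrats, Moderate Independents, and Liberal Republicans.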
Here's what we found:
There are some differences between the different measures of ideology, but the take-home point for me is that the patterns are basically consistent: liberal Democrats by any measure are pretty well distributed across the income scale, and conservative Republicans are more concentrated among the upper incomes.
S. V. Subramanian, Tim Huijts, and Jessica Perkins report:
Studies have largely examined the association between political ideology and health at the aggregate/ecological level. Using individual-level data from 29 European countries, we investigated whether self-reports of political ideology and health are associated. In adjusted models, we found an inverse association between political ideology and self-rated poor health; for a unit increase in the political ideology scale (towards right) the odds ratio (OR) for reporting poor health decreased (OR 0.95, 95% confidence interval 0.94-0.96). Although political ideology per se is unlikely to have a causal link to health, it could be a marker for health-promoting latent attitudes, values and beliefs.
No pretty graphs, unfortunately. But interesting.
I received the following email:
Nate Persily, Steve Ansolabehere, and Charles Stewart just completed this article addressing the relevance of the Voting Rights Act in light of Barack Obama's presidential victory:
Michael Bailey writes:
I saw your blog post on Supreme Court ideal points. I [Bailey] have long shared a [disagreement with] the Martin and Quinn scores in 1973; I have an AJPS paper which goes to great lengths to think about that and uses cross time (and cross institutional) bridging observations to get estimates that are, in my opinion, more plausible over time.
See page 435 for a discussion of Martin and Quinn scores and p. 444 for my [Bailey's] alternative results.
I don't have anything to add at this time; I'd just like to say that this sort of scholarly discussion is great: it is often through disagreements at particular points that we make scientific progress.
Charles Murray posts this interesting graph:

I'll give his explanation and then some discussion of my own. First, Murray:
Tom Schaller asks, "Why are senior citizens crying "socialism" at town halls?"
As we like to say in academia: I don't know the answer, so let me tell you something I do know. (Graphs made in collaboration with Daniel Lee.)
First, who has health insurance (from the 2000 Annenberg survey):

Next, should the government spend more on health care (this time from 2004):

Some Obamacare supporters say: Senior citizens have Medicare, which is a government plan, so they should support public health care provision, right? But maybe some people on Medicare are suspicious of expanded government involvement in health care because they see it as competing with Medicare for scarce dollars.
Here are a couple more graphs (pretty similar to the second graph above):
The Boston Review just published an article by John Sides and myself on the 2008 election, along with discussions from several journalists, political scientists, and political activists. Here are the summaries:
Andrew Gelman and John Sides: American presidential elections always turn into stories. Because these stories capture the public imagination, they have real political importance. Unfortunately, they are often wrong. The narrative of Obama's victory is no exception.
Rick Perlstein: Our media will not--cannot--explain the slow, steady work that produces election victories.
Michael C. Dawson: In 2008, there were 2 million more African-American voters than in 2004.
Richard Johnston and Emily Thorson: Sarah Palin's approval ratings moved John McCain's support with unparalleled precision.
Mark Schmitt: We should not dismiss the idea that Obama created a new electoral map.
Andrew Gelman and John Sides respond: In elections, what is certain is almost never new; what is new is almost never certain.
The article that John and I wrote is based on some blogging we did right after the election (especially this, this, and this from me, and this, this, and this from John).
It's always fun having an article with comments, to get views from different perspectives. We keep banging on about the importance of "the fundamentals," but I think a lot of our ideas are brought out more clearly in the context of the detailed points made in the discussions.
It's too bad they weren't able to run our article with all its graphs. (Many of these graphs will appear in the forthcoming second edition of Red State, Blue State, however, with its extra chapter on the 2008 election.)
Here's a paper from Ryan Enos:
The effect of group threat on voter mobilization has been tested using observational data across a number of different geographies and units of analysis. Previous studies have yielded inconsistent findings. To date, no study of voter mobilization has directly manipulated group threat using a controlled experiment. I take advantage of the unique racial geography of Los Angeles County, California, which brings different racial/ethnic groups into close, yet spatially separated, proximity. This geography allows for a randomized, controlled experiment to directly test the effects of stimulating racial threat on voter turnout. A test of 3,666 African American and Hispanic voters shows an average treatment effect of 2.3 percentage points. The effect is 50% larger for African Americans than Hispanics. These results suggest that even low propensity voters are aware of the geographic proximity of other groups and can be motivated to participate by this awareness.
See page 21 of the article for an example of the treatment, which includes a map and this bit of text:

But what really interested me about the article was that he imputed ethnicity using available information on last names. (Go to the article and search on "surname.")
P.S. But, boy, does this paper need some good graphs! I like the paper and want to plug it here, but there's no grabby graph. I'd like to see that scatterplot of raw data with fitted lines, showing what the researcher found and how these findings came from the data. Regression tables are fine (well, not really; they should be graphs, but that's another story), but I wanna see what's happening. I wanna see what's happening.
Here's a cool blog with all kinds of quantitative social science ideas. Good stuff.
Allen Hurlbert writes:
I saw your 538 post [on the partisan allegiances of sports fans] and it reminded me of some playful data analysis I [Hurlbert] did a couple months ago based on NewsMeat.com's compilation of sports celebrity campaign contributions. Glancing through the list I thought I noticed some interesting patterns in the partisan nature of various sports, so I downloaded the data and created this figure:

Matt Ginsberg writes:
I saw your mention on 538.com [see also this article and this with Edlin and Kaplan]; a long time ago (80's), I [Ginsberg] wrote an article with Mike Genesereth and Jeff Rosenschein about rationality for automated agents in collaborative environments. The punch line, which probably bears on this issue as well, is that the strategy, "Act in such a way that if all the other agents were designed identically, we'd do optimally" is provably a Pareto-optimal way to design such agents. It's a nice result: handles the prisoner's dilemma, why you should vote, throw yourself on the grenade, etc.
Ginsberg's papers on the topic are here and here. I like the idea of framing the problem in terms of designing intelligent agents. This bypasses some of the normative vs. descriptive issues that cloud the analysis of rationality in human behavior.
Jeff and Justin found, based on survey data from 1994-2008, that gay marriage is most popular among the under-30s and least popular among the over-65s, and it's a big gap: support for gay rights is about 35 percentage points higher among the young than among the old.
To explore these age patterns some more, Daniel and I did some simple analyses of attitudes on gays from three questions on the 2004 Annenberg survey, which had a large enough sample size that we could pretty much plot the raw numbers by age.
First, do you favor a state law allowing same sex marriage? As expected from Jeff and Justin's analysis, the younger you are, the more likely you are to support same-sex marriage:

How do we understand this? Perhaps younger Americans are more likely to know someone gay, thus making them more tolerant of alternative lifestyles.
It's not so simple. Let's look at the responses to the question "Do you know any gay people?" As of 2004, a bit over half the people under 55 reported knowing someone gay; from there on, it drops off a cliff. Only about 15% of 80-year-olds know any gay people. (The data are a little noisy at the very end, where sample sizes become smaller.)

This isn't what I was expecting. I thought that people under 30 would be much more likely to say they know a gay person. But the probability actually goes up slightly from ages 18 to 45. I guess this makes sense: during those years, you meet more people, some of whom might be gay.
Benjamin Kay writes:
I just finished the Stata Journal article you wrote. In it I found the following quote: "On the other hand, I think there is a big gap in practice when there is no discussion of how to set up the model, an implicit assumption that variables are just dumped raw into the regression." I saw James Heckman (famous econometrician and labor economist) speak on Friday, and he mentioned that using test scores in many kinds of regressions is problematic, because the assignment of a score is somewhat arbitrary even if the order is not. He suggested that positive, monotonic transformations of scores contain the same information but lead to different standard errors if, in your words, they are just "dumped into the regression". It was somewhat of a throwaway remark, but considering it longer, I imagine he means that a difference of test scores need have no constant effect. The remedy he suggested was to recalibrate exam scores such that they have some objective meaning. For example, on a mechanics exam scored between one and a hundred, one can pass (65) only if they successfully rebuild the engine in the time allotted, but better scores indicate higher quality or faster speed. In this example one might change it to a binary variable for passing or not, an objective test of a set of competencies. However, doing that clearly throws away information.
Do you or the readers of Statistical Modeling, Causal Inference, and Social Science blog have any advice here? The transformation of the variable is problematic and the critique of transformations on using it raw seems a serious one, but the act of narrowly mapping it onto a set of objective discrete skills seems to destroy lots of information. Percentile ranks on exams might be a substitute for the raw scores in many cases, but introduces other problems like in comparisons between groups.
My reply: Heckman's suggestion sounds like it would be good in some cases but it wouldn't work for something like the SAT which is essentially a continuous measure. In other cases, such as estimated ideal point measures for congressmembers, it can make sense to break a single continuous ideal-point measure into two variables: political party (a binary variable: Dem or Rep) and the ideology score. This gives you the benefits of discretization without the loss of information.
In chapter 4 of ARM we give a bunch of examples of transformations, sometimes on single variables, sometimes combining variables, sometimes breaking up a variable into parts. A lot of information is coded in how you represent a regression function, and it's criminal to just take the data as they appear in the Stata file and just dump them in raw. But I have the horrible feeling that many people either feel that it's cheating to transform the variables, or that it doesn't really matter what you do to the variables, because regression (or matching, or difference-in-differences, or whatever) is a theorem-certified bit of magic.
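The point in Kay's email, that a monotonic transformation keeps the ordering of the scores but changes what the regression reports, is easy to check on fake data. The sketch below is only an illustration: the data are simulated, the outcome is assumed to be linear in the raw score, the log is just one possible transformation, and the function name is hypothetical.

```python
import math
import random

def ols_slope_and_se(x, y):
    """Slope and its standard error from a one-predictor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se

rng = random.Random(0)  # fixed seed so the fake data are reproducible
scores = [rng.uniform(1, 100) for _ in range(200)]       # simulated test scores
outcome = [0.05 * s + rng.gauss(0, 1) for s in scores]   # simulated outcome

# Same respondents, same ordering of scores, two codings of the predictor.
raw_slope, raw_se = ols_slope_and_se(scores, outcome)
log_slope, log_se = ols_slope_and_se([math.log(s) for s in scores], outcome)

print("raw score: slope / se =", raw_slope / raw_se)
print("log score: slope / se =", log_slope / log_se)
```

The two fits use identical information about who scored higher than whom, yet the slopes, standard errors, and their ratios differ, which is why the choice of scale deserves some thought before the variable goes into the model.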
Daniel Lee and I made these graphs showing the income distribution of voters self-classified by ideology (liberal, moderate, or conservative) and party identification (Democrat, Independent, or Republican). We found some surprising patterns:

Each line shows the income distribution for the relevant category of respondents, normalized to the income distribution of all voters. Thus, a flat line would represent a group whose income distribution is identical to that of the voters at large. The height of the line represents the size of the group; thus, for example, there were very few liberal Republicans, especially by 2008.
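Here is a sketch of one reading of that normalization: within each income bracket, compute the share of all voters who fall in a given ideology-and-party group. Under this reading a group with the same income distribution as the electorate gives a flat line, and the height of the line is the group's share of voters. The function name and the toy data are hypothetical, and survey weights are ignored.

```python
from collections import Counter

def normalized_income_profile(respondents):
    """respondents: iterable of (income_bracket, group) pairs.

    Returns {group: {income_bracket: share of that bracket's voters in the group}}.
    """
    respondents = list(respondents)
    bracket_totals = Counter(bracket for bracket, _ in respondents)
    cell_counts = Counter(respondents)
    profile = {}
    for (bracket, group), n in cell_counts.items():
        profile.setdefault(group, {})[bracket] = n / bracket_totals[bracket]
    return profile

# Toy data: group "A" is half of every bracket, so its line is flat at 0.5.
toy = [("low", "A"), ("low", "B"),
       ("high", "A"), ("high", "A"), ("high", "B"), ("high", "B")]
print(normalized_income_profile(toy)["A"])  # {'low': 0.5, 'high': 0.5}
```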
The most striking patterns to me are:
1. The alignment of income with party identification is close to zero among liberals, moderate among moderates, and huge among conservatives. If you're conservative, then your income predicts your party identification very well.
2. First focus on Democrats. Liberal Democrats are spread among all income groups, but conservative Democrats are concentrated in the lower brackets.
3. Conservative Republicans--the opposite of liberal Democrats, if you will--are twice as concentrated among the rich as among the poor.
Putting factors 2 and 3 together, we find that ideological partisans (liberal Democrats and conservative Republicans) are not opposites in their income distributions. In particular, richer voters are more prevalent in these groups.
Which might be relevant for the debates over health care, taxes, and other political issues that have a redistributive dimension.
P.S. The 2000 and 2004 data are from the National Annenberg Election Survey; 2008 is from the Pew Research pre-election surveys. We show all three years to indicate the persistence of the general pattern. As a way of showing uncertainty and variation, this is much more effective than displaying standard errors, I think.
In the aftermath of linking to my article with Aaron and Nate about the probability of your vote being decisive, Conor Clarke writes:
If your decision to vote is motivated by the sense that "one vote can make a difference," you are being substantially less rational than someone who never leaves the house for fear of being killed by a meteor. Voting is irrational.
I completely disagree with this last statement, and I know that Aaron does also. Here's what we wrote on pages 4-5 of our article:
Reza Esfandiari sent me this article regarding statistical analyses of the recent election in Iran. Esfandiari looks at the data and concludes that the election was fair and that the analyses contending otherwise were flawed. I haven't looked at this report in detail and offer no endorsement or criticism, just putting it out there so that anyone who might be interested can take a look themselves.
In the "Conservatives are nicer than liberals" controversy, there was a question about who has more money, conservatives or liberals. I've written a lot about income and voting, but I realized I'd never actually looked at income and political ideology. Here are the data, from respondents to the Pew pre-election polls in 2008:

The poorest people are more likely to be liberal, and the richest are more likely to identify as moderate rather than conservative, but overall there's less going on here than I would've expected.
In contrast, the relation between income and party identification is strong, and goes in the expected direction:

There must be a lot of low-income moderate Democrats and high-income moderate Republicans out there.
P.S. For the purpose of understanding charitable giving, I'd rather know wealth than income. Or maybe something like "disposable income." It's harder to get this from survey data, though.
A correspondent writes:
I'm doing some personal research on the correlation between family income and political affiliation and I was hoping you can help. I came across some illuminating maps that you created and was wondering where you got your data from. I can't seem to find any hard data on the subject so any help would be greatly appreciated. I [my correspondent] am looking into the assertion that conservatives are more generous than liberals. Specifically, I'm trying to debunk the thesis of Arthur C. Brooks' Who Really Cares: The Surprising Truth About Compassionate Conservatism. In this book, Brooks argues that liberals are less generous than conservatives and uses hard data to substantiate the claim. While I believe most of his analysis is spot on, I think that his results might be skewed by the way he measures generosity.
Brooks measures generosity as the percentage of income spent on charitable giving. I think that a better measure would be charitable giving as a percentage of disposable family income; people don't give away what they can't afford to. This is significant because, if your maps are correct, there's the distinct possibility that conservatives make more than liberals on average and therefore have more to give. If I can get data on income as a function of political affiliation I can correct for non-disposable income and see if it makes a significant difference in the results.
My reply:
First, I'd like to point you to some updated maps that I've made of income and voting.
Our data came from the Pew Research Center. We used their polls taken during the few months before the election. We also adjusted for voter turnout using the Current Population Survey post-election supplement, but that's less important, I think. (Yair and I are in the midst of writing up an article describing exactly what we did.)
Finally, Arthur Brooks's findings seem plausible enough to me, even after controlling for income. My own pet explanation is in terms of default behavior. Or, to put it even more strongly, as commenters Ockham and Ubs wrote here, you're much more likely to give to charity if somebody is asking you to do so--and conservatives might very well be more likely than liberals to be in settings where someone is personally asking them to give to charity.
As I wrote a couple of weeks ago, the Republicans need something like a 7% swing in the national vote to take back the House of Representatives in 2010.
From Erikson, Bafumi, and Wlezien, here is a graph predicting the Democratic party's vote share in midterm elections, given their support in a generic party ballot from polls taken during the 300 days before the election:
The higher line in each graph (in red) corresponds to elections where the incumbent president is a Republican, and the lower line (in blue) corresponds to elections such as 2010, where the incumbent is a Democrat.
Alan Reifman writes:
I [Reifman] have created a new website to compile poll results on specific provisions of the health care reform debate. Today, I review the polling on universality, personal/individual mandates, and employer mandates. I discuss in the Welcome Statement on my page how I aim to go beyond what is currently available on sites such as Pollster.com and Polling Report.
I ran into John Barnard a few hours ago and he told me that he likes the blog but he hates the political stuff. So, John, you can skip this one. Although there is a bit of statistics near the end, so if you want you can click through and search for two asterisks (**); I've labeled the statistical content, just this once, to make your life slightly easier!
Following Paul Krugman, John Sides considers how one might measure the ideological position of conservative political commentator Michelle Malkin. I'd heard the name but I don't have any TV reception and didn't really know what she stood for. Going to her webpage, I see she's written three books: "Invasion: How America Still Welcomes Terrorists, Criminals, and Other Foreign Menaces to Our Shores," "In Defense of Internment: The Case for 'Racial Profiling' in World War II and the War on Terror," and "Unhinged: Exposing Liberals Gone Wild." From her blog, she also appears to have conservative economic views, although it's hard to separate this from partisanship without going back to posts from previous years.
Krugman wants a "scale of positions on political matters ... we might find that only 19 percent of Americans are to the right of Michelle Malkin, while 23 percent are to the left of Michael Moore." I don't have enough of a sense about Malkin, but I'm pretty sure that much less than 23% of Americans are to the left of Michael Moore. In chapter 8 of Red State, Blue State is this graph from Joe Bafumi and Michael Herron estimating the ideological positions of congressmembers and voters:
Aaron Edlin just sent me this article by Pinar Karaca-Mandic and himself from 2006:
We [Edlin and Karaca-Mandic] estimate auto accident externalities (more specifically insurance externalities) using panel data on state-average insurance premiums and loss costs. Externalities appear to be substantial in traffic-dense states: in California, for example, we find that the increase in traffic density from a typical additional driver increases total statewide insurance costs of other drivers by $1,725-$3,239 per year, depending on the model. High-traffic density states have large economically and statistically significant externalities in all specifications we check. In contrast, the accident externality per driver in low-traffic states appears quite small. On balance, accident externalities are so large that a correcting Pigouvian tax could raise $66 billion annually in California alone, more than all existing California state taxes during our study period, and over $220 billion per year nationally.
Interesting stuff. I don't have it in me right now to check all these numbers, but the argument looks to be laid out clearly enough that the experts in the area can work it out. Also, it all seems to be about accidents to other cars; I'm not sure where they factor in the costs due to running over pedestrians.
Avi Feller and Chris Holmes sent me a new article on estimating varying treatment effects. Their article begins:
Randomized experiments have become increasingly important for political scientists and campaign professionals. With few exceptions, these experiments have addressed the overall causal effect of an intervention across the entire population, known as the average treatment effect (ATE). A much broader set of questions can often be addressed by allowing for heterogeneous treatment effects. We discuss methods for estimating such effects developed in other disciplines and introduce key concepts, especially the conditional average treatment effect (CATE), to the analysis of randomized experiments in political science. We expand on this literature by proposing an application of generalized additive models to estimate nonlinear heterogeneous treatment effects. We demonstrate the practical importance of these techniques by reanalyzing a major experimental study on voter mobilization and social pressure and a recent randomized experiment on voter registration and text messaging from the 2008 US election.
This is a cool paper--they reanalyze data from some well-known experiments and find important interactions. I just have a few comments to add:
Bob Shapiro, author of two important books on public opinion (The Rational Public, 1992, with Benjamin Page, and Politicians Don't Pander, 2000, with Lawrence Jacobs) sent me this report he just wrote with Sara Arrow, comparing public opinion for Obama's health care initiative with opinion in 1993-94, when Bill Clinton's health plan crashed and burned. They write:
Following up on our earlier discussion of the administrative costs of Medicare and private insurers, Robert Book sent me a report on Illusions of Cost Control in Public Health Care Plans, which is full of numbers and argues that "Medicare's administrative costs are a lower percentage of the total not because Medicare has cheaper administration, but because it has more expensive patients." I don't know enough to evaluate these arguments, but I like that he has a lot of numbers and graphs right out there, so that any disputes can be on specific points.
I do have one question, which probably reflects my ignorance of health-economics terminology more than anything else. Book writes, "Claims processing is the only category that is at all sensitive to the level of health care utilization." From my personal experience with the health care system, I associate "administrative costs" with the many levels of clerks and paper-pushers you have to deal with before you get to see a doctor or nurse. I'm not quite sure how "claims processing" is defined, but I see a lot of full-time employees (as well as, I assume, some higher-paid full-time employees in some back room) who aren't doing anything health-related; they're just minding the store. And this all seems pretty much proportional to health care utilization: I assume that if people are going to the doctor twice as often, or doing more complicated procedures, there are that many extra visits, that many extra forms to fill out, etc. I've been in hospital wards at night where there is no doctor to be seen, maybe no nurse, but three or four administrative employees appear to be continuously busy with something or another.
This is not intended as a criticism of Book's argument, just a thought that some of these seemingly neutral terms such as "administrative costs" can be confusing.
Nate Silver links to a Congressional Quarterly list of ratings for 2010 congressional races and concludes that, although these listings give a sense of which races are more likely to be competitive, the CQ chart doesn't really say much about the chance that there will be a "wave" election that would switch partisan control to the Republicans.
The same day, Matthew Yglesias links to a recent Congressional Quarterly report entitled, "2010 House Outlook: Democrats Look Secure" and concludes that, yes, the Democrats look secure to keep their House and Senate majorities.
What should we believe? For the purpose of campaign strategy, you need to look at the races in each district, but to get a sense of what's going to happen overall, I think the best approach is to look at the national vote. There's lots of variation, but, overall, swings occur nationally.
Here's a graph I made after the election, showing the average Democratic share of the two-party vote for the House of Representatives and for president for the past sixty years:

From this picture, it looks possible but unlikely that there will be a 6% swing toward the Republicans (which is what it would take for them to bring their average district vote from 44% to 50%). Historically speaking, a 6% swing is a lot. The biggest shifts in the past few decades appear to be 1946-48, 1956-58, and 1972-74 (in favor of the Democrats) and 1964-66 and 1992-94 (for the Republicans). I don't know if any of these would quite be enough to swing the House majority. A more likely outcome, if the Republicans indeed improve in next year's election, is for them to make some gains but still be in the minority.
The other factor helping the Democrats is incumbency, which helps lock in a congressional majority (as it did for the Republicans after 1994) by bumping up the vote shares of the new congressmembers elected in swing districts. In 2008, John Kastellec, Jamie Chandler, and I estimated that the Republicans would need something like 51% of the average district vote to have an even shot of winning a majority of House seats.
I really like this post of Nate Silver's. Ideal-point models and other fancy statistical techniques are fine, but I'm a big fan of using the simple, directly-interpretable summary when it makes the point.
Mike Barnicle's already on the case. So now it's time for the classy upscale take on the story.
John Sides links to this quote from Barney Frank:
Not for the first time, as a -- a -- an elected official, I envy economists. Economists have available to them, in an analytical approach, the counterfactual. Economists can explain that a given decision was the best one that could be made, because they can show what would have happened in the counterfactual situation. They can contrast what happened to what would have happened. No one has ever gotten reelected where the bumper sticker said, "It would have been worse without me." You probably can get tenure with that. But you can't win office.
I have two thoughts on this. First, I think Frank is a bit too confident in economists' ability to "show what would have happened in the counterfactual situation." Maybe "estimate" or "guess" or "hypothesize" would be a bit stronger than "show." Recall this notorious graph, which shows the unintentional counterfactual of some economic predictions:

Second, I don't know how Frank can say that about "no one has ever gotten reelected . . ." In Frank's district in Massachusetts, it would take a lot--a lot--for a Democrat to not get reelected.
Original title of article: "Estimating turnout, vote intention, and issue attitudes in subsets of the population"
New title: "Who votes? How did they vote? And what were they thinking?"
Freedom House is currently seeking individuals with demonstrated professional experience to work with civil society organizations in Egypt through the International Executive Volunteers (IEV) program for 3 months beginning in September 2009.
Volunteers must have a minimum of five years of relevant professional experience, the ability to commit to 3 months of service, and a resourceful, innovative personality. Previous overseas experience, particularly in Egypt and in the Middle East and North Africa is preferred.
Statistician/Polling Specialist
A statistician/polling specialist has been requested to provide support in the preparation and analysis of survey methodology and questionnaire data. Tasks will include designing work plans, managing logistics, reporting results to targeted groups, and developing relationships with key constituencies. Additional knowledge or expertise is needed for volunteer management - recruiting, retaining, and training for key projects. Arabic language skills are preferred but not required.
Ben Hyde pointed me to this data-based dating site. I have no comments on how it works for dates, but they have a lot of fun maps, for example this:
Are some human lives worth more than others?


268,864 people have answered
And this:
If you knew for sure you would not get caught,
would you commit murder for any reason?


359,761 people have answered
This is great; I can't resist giving a couple more:
What with all this discussion of causal inference, I thought I'd rerun a blog entry from a couple years ago about my personal trick for understanding instrumental variables:
A student at another university writes in with some questions about Red State, Blue State:

To learn why I made this graph, see here.
A websearch turned up this link to our report on Jeff and Justin's research. It's great to see this stuff out there, but, really, "LGBTQI"? The way things are going, we'll be going through the whole alphabet soon! There's gotta be another way. Once you have "Q" in there, doesn't that pretty much cover all the contingencies?
I visited AT&T Labs today--lots of fun, a great group of people, an interesting mix of statistics and machine learning. They showed me some cool visualizations that I'll display soon.
Anyway, while I was there, somebody asked me about voters with different educational levels. In discussing it, we realized we wanted to break this down by ethnicity and age. So I quickly prepared a grid of graphs for him.
On the train ride back, I spent a few minutes making the graphs prettier:

These are based on raw Pew data, reweighted to adjust for voter turnout by state, income, and ethnicity. No modeling of vote on age, education, and ethnicity. I think our future estimates based on the 9-way model will be better, but these are basically OK, I think. All but six of the dots in the graph are based on sample sizes greater than 30.
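For readers unfamiliar with this kind of reweighting, here is a generic sketch of the idea: each respondent gets a weight equal to their cell's share of the target population (here, voters) divided by that cell's share of the sample. This is not the actual adjustment used for the graphs. The cell labels and function names are hypothetical, and the real cells would be combinations of state, income, and ethnicity.

```python
from collections import Counter

def poststratification_weights(sample_cells, population_shares):
    """sample_cells: one cell label per respondent.
    population_shares: dict mapping cell label -> share of the target population.
    Returns one weight per respondent: population share / sample share."""
    n = len(sample_cells)
    sample_shares = {c: k / n for c, k in Counter(sample_cells).items()}
    return [population_shares[c] / sample_shares[c] for c in sample_cells]

def weighted_mean(values, weights):
    """Weighted average of respondent-level values (e.g. 1 = voted Democratic)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

A cell that is overrepresented in the sample relative to the electorate gets a weight below 1, and an underrepresented cell gets a weight above 1.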
A correspondent read my recent note on the limited influence of the median voter and writes:
My understanding of median voter theorem is that each election has its own median voter, and that the median voter's influence is limited to the outcome of that election only. I don't understand, then, why the graph in your post is evidence that the median voter has little influence. It seems to me that there are two elections being considered in that graph, with two different median voters. The graph appears to consider "moderation" to be having a moderate voting record in Congress, but it seems to me that the median voter in Congress is likely quite different from the median voter in any particular Congressional district. The power of the median voter in Congress, it seems to me, is to affect the outcome of Congressional votes, not to improve his own chances for re-election, which are determined by his proximity to the median voter in his district. Thus, I'm not sure why we would expect moderation, as measured by the median Congressional voter, to translate into electoral success, which we would expect to be determined by the median district voter.
My reply:
At our sister blog, Tom Schaller says no:
Is Sanford a cad for bolting his family on Father's Day weekend? Of course, but that is a private, moral failing, rather than a failure of public duty. . . . I [Schaller] oppose most of what Mr. Sanford stands for politically. His showy rejection of federal stimulus money targeted for his state was a crass publicity stunt designed to garner national attention for Mr. Sanford at the expense of his constituents, many of whom are struggling economically. . . . Should Mr. Sanford's ambitions founder on the shoals of a personal scandal, however, yet another opportunity will be lost to establish the long-overdue separation between private comportment and public service. So here's hoping he doesn't resign or, if he does, it is a matter of personal choice rather than him bowing to political pressure.
I see where Schaller is coming from. Lots of people have complicated personal lives, and it's not clear at all that these difficulties have much if anything to do with governing. But I don't know if I agree with him on the wall of separation between private comportment and public service.
Consider the Sanford case. Schaller's a Democrat, so he can evaluate Sanford on his policies. But if Schaller were a Republican, he might very well want Sanford out of there because he tarnishes the brand, makes the party a laughingstock, etc. Also makes it harder for Sanford to convincingly follow a "family values" agenda which Schaller (if he were a Republican) might want. These are legitimate concerns for a Republican to have. Even if you don't think Sanford's personal indiscretions are important, you might want him gone and replaced by a more effective Republican. Just as, from the other direction, a Democrat would've preferred a zipped-fly version of Bill Clinton.
Sometimes you hear discussion of how the red states get more from the government than they pay in taxes while the blue states get less and pay more. This is slightly misleading because the blue states are richer and rich people pay a higher rate of income tax, but it does raise the interesting question of the regionally distributive effects of national taxing and spending policies.
For some perspective on where this is coming from: In our office is a map from 1924 titled "Good Roads Everywhere" that shows a proposed system of highways spanning the country, "to be built and forever maintained by the United States Government." The map, made by the National Highways Association, also includes the following explanation for the proposed funding system: "Such a system of National Highways will be paid for out of general taxation. The 9 rich densely populated northeastern States will pay over 50 per cent of the cost. They can afford to, as they will gain the most. Over 40 per cent will be paid for by the great wealthy cities of the Nation. . . . The farming regions of the West, Mississippi Valley, Southwest and South will pay less than 10 per cent of the cost and get 90 per cent of the mileage." Beyond its quaint slogans ("A paved United States in our day") and ideas that time has passed by ("Highway airports"), the map gives a sense of the potential for federal taxing and spending to transfer money between states and regions.
P.S. Yes, I posted this last year, but without the pretty map image (click on it for higher resolution, which unfortunately still isn't quite good enough to make out the text).
Back in April, in an article about partisan perceptions of the economy, John Sides and I wrote:
My final thoughts on those Iran vote analyses:
Sometimes people think it's a disaster when you have more predictors than data points, but I always point out that, no, it's better to have 9 predictors than just 1 or 2. After all, if you really wanted just 1 or 2, you could just throw out most of your data!
Nate's chart is excellent, especially the ordering of the candidates in order of the percent favoring resignation:
I also like the gratuitous exclamation marks, which add fun value without actually making the graph any harder to read. The key reason this works is that Nate wisely did not fill in the blank squares with "No!"s.
My only comments are:
I've been assured, and I believe, that the effective way to get rid of the roaches in your apartment is to clean the place, put poison in the cracks, and then seal them. Some people do that. But a lot of people go for the "bombing" approach: the exterminator comes to the building once a month, drops the bomb, leaves, and comes back the next month.
My question is: what are these people thinking?? Why do these people willingly get bombed once a month instead of following the simpler and effective approach? Part of this is ignorance, surely, but I think there's more to it than that, some underlying psychological appeal. I don't think it's just ignorance because, when I talk with people who get bombed and discuss the "clean, poison, and seal" approach, I've found them to be very resistant and (I would say) "defensive." They seem to want to believe that bombing is effective and really don't want to hear about alternative strategies.
What's going on? I have some theories. Maybe bombing seems like less effort than cleaning the food out of your closet and sealing the cracks. Also it seems sort of decisive. On the other hand, shouldn't people pause a little when they think about needing the exterminator every month? Yet, that doesn't seem to bother people. Conceptually, getting the exterminator to bomb your apartment feels to me a bit like "taking a pill." Maybe there's some technological appeal. Sort of like the way that photovoltaics are sexy in a way that passive solar isn't.
I don't know. I'll have to ask some psychologists of my acquaintance who work on environmental decision making.
Hall, J.L., L.W. Miratrix, P.B. Stark, M. Briones, E. Ginnold, F. Oakley, M. Peaden, G. Pellerin, T. Stanionis and T. Webber, 2009. Implementing Risk-Limiting Audits in California, USENIX EVT/WOTE, In press.
Related discussion here.
My former Berkeley colleague Phil Stark has written a series of articles on election auditing which might be of interest to some of you. Here they are:
Stark, P.B., 2009. Auditing a collection of races simultaneously. Working draft.
Phil has an interesting background: he got into statistics after working on inverse problems in geology. The methods he uses are based on exact error bounds, really much different than the Bayesian stuff I do, much more focused on getting conservative p-values and the like. As a result, the things he does in his papers are nothing at all like what I would do in these problems.
In a larger sense, though, I believe in methodological pluralism, and I'm glad to see a researcher such as Phil, who's working from a statistical framework so different from mine, work on these problems.
Update here and data here. I haven't looked at this in detail, but Walter Mebane is the expert on this stuff so I'm inclined to believe him. Even though he uses tables instead of graphs.
Again, just to emphasize: this sort of statistical analysis doesn't prove anything by itself, but it can be useful in giving people a sense of where to focus attention if they want to look further.
Alex Scacco and Bernd Beber follow up on their analysis of the Iran election data:
After we wrote our op-ed using the province-level data, we've now also done some preliminary tests with the county-level data. In the latter dataset, the last digits don't appear fraudulent. Why might we find suspicious last digits at the province level, while, at the same time, Walter Mebane and Boudewijn Roukema find evidence that first and second digits are fishy at the county level? We can only speculate about what happened behind closed doors, but here is a scenario of top-down fraud that is consistent with the patterns found in the quantitative analyses mentioned above:
Bernd Beber and Alex Scacco present another quantitative analysis of the Iranian election data, this time looking at last digits. They write:
[Suspicions of fraud] have led experts to speculate that the election results released by Iran's Ministry of the Interior had been altered behind closed doors. But we don't have to rely on suggestive evidence alone. We can use statistics more systematically to show that this is likely what happened. Here's how. We'll concentrate on vote counts -- the number of votes received by different candidates in different provinces -- and in particular the last and second-to-last digits of these numbers. For example, if a candidate received 14,579 votes in a province (Mr. Karroubi's actual vote count in Isfahan), we'll focus on digits 7 and 9.
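A rough sketch of the kind of last-digit check described in the quote: pull out the last two digits of each vote count and compare the distribution of last digits to a uniform distribution over 0-9. The choice of a Pearson chi-square statistic is an assumption for illustration, not necessarily the test Beber and Scacco ran, and the function names are hypothetical.

```python
from collections import Counter

def last_two_digits(count):
    """Return (second-to-last digit, last digit) of a nonnegative integer count."""
    return (count // 10) % 10, count % 10

def last_digit_chi_square(counts):
    """Pearson chi-square statistic comparing the last digits of `counts`
    to a uniform distribution over 0-9 (9 degrees of freedom)."""
    observed = Counter(c % 10 for c in counts)
    expected = len(counts) / 10
    return sum((observed[d] - expected) ** 2 / expected for d in range(10))
```

For the example in the quote, last_two_digits(14579) gives (7, 9). A large statistic relative to a chi-square distribution with 9 degrees of freedom would indicate that some last digits show up more often than a uniform distribution predicts.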
Benford's law is an amusing mathematical pattern in which the first digits of randomly sampled numbers tend to have a distribution in which 1 is the most common first digit, followed by 2, then 3, and so forth. It's the distribution of digits that arises from numbers that are sampled uniformly on a logarithmic scale.
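A quick way to see the connection between the log scale and the first digits is to simulate it. The sketch below uses the standard Benford formula, P(first digit = d) = log10(1 + 1/d), and compares it with numbers drawn uniformly on a log10 scale over a whole number of decades. The function names and the choice of six decades are arbitrary assumptions.

```python
import math
import random
from collections import Counter

def benford_probabilities():
    """Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading nonzero digit of a positive number."""
    # Scientific notation puts the leading digit first, e.g. '1.4579...e+04'.
    return int(f"{x:.15e}"[0])

def simulate_first_digits(n, decades=6, seed=0):
    """Draw n numbers uniformly on a log10 scale spanning `decades` whole
    decades and return the observed first-digit proportions."""
    rng = random.Random(seed)
    counts = Counter(first_digit(10 ** rng.uniform(0, decades)) for _ in range(n))
    return {d: counts[d] / n for d in range(1, 10)}
```

Under the formula, 1 comes up as the first digit about 30% of the time and 9 less than 5% of the time, and the simulated proportions should land close to those values.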
In our Teaching Statistics book, Deb and I describe a classroom demonstration where we show how Benford's law applies to street addresses sampled randomly from the telephone book. In a more serious vein, Walter Mebane has written about the application of Benford's law to vote counts.
In the past several days, a few people have asked me about applying these ideas to the recent Iranian election. Today, Stephane Reissfelder pointed me to an article by Boudewijn Roukema, which states:
The results of the 2009 Iranian presidential election presented by the Iranian Ministry of the Interior (MOI) are analysed based on Benford's Law and an empirical variant of Benford's Law. The null hypothesis that the vote count distributions satisfy these distributions is rejected at a significance of p < 0.007, based on the presence of 41 vote counts for candidate K that start with the digit 7, compared to an expected 21.2-22 occurrences expected for the null hypothesis. A less significant anomaly suggested by Benford's Law could be interpreted as an overestimate of candidate A's total vote count by several million votes. Possible signs of further anomalies are that the logarithmic vote count distributions of A, R, and K are positively skewed by 4.6, 5.8, and 2.5 standard errors in the skewness respectively, i.e. they are inconsistent with a log-normal distribution with p ~ 4 × 10^-6, 7 × 10^-9, and 1.2 × 10^-2 respectively. M's distribution is not significantly skewed.
I don't buy it. First off, the whole first-digit-of-7 thing seems irrelevant to me. Second, the sample size is huge, so a p-value of 0.007 isn't so impressive. After all, we wouldn't expect the model to really be true with actual votes. It's just a model! Finally, I don't see why we should be expecting distributions to be lognormal.
Maybe there's something I'm missing here, but that's my quick take. This is not to say that I think the election was fair, or rigged, or whatever--I have absolutely zero knowledge on that matter--just that I don't find this analysis convincing of anything. I will say, though, that Roukema deserves credit for presenting the analysis clearly.
P.S. In response to comments: let me emphasize that I'm not saying that I think nothing funny was going on in the election. As I wrote, I'm commenting on the statistics, I don't know the facts on the ground. To move my comments in a more constructive direction (I hope), let me pull out this useful comment from Roukema's article: "One possible method to test whether this is just an odd fluke would be to check the validity of the vote counts for candidate K in the voting areas where the official number of votes for K starts with the digit 7." Further investigation could be a good thing here.
I did not find Roukema's argument convincing; that does not mean that I consider it a bad thing that the article was written. The article is a first draft of an analysis; it might end up leading to nothing, or it might be unconvincing as it stands now but lead to some important breakthroughs. We can see what further analysis turns up. Again, my verdict is not a Yes or a No, it's an "I'm not convinced."
As part of our Red State, Blue State research, we developed statistical tools for estimating public opinion among subsets of the population. Recently Yu-Sung Su, Yair Ghitza, and I applied these methods to see where school vouchers are more or less popular.
We started with the 2000 National Annenberg Election Survey, which had responses from about 50,000 randomly-sampled Americans to the question: "Give tax credits or vouchers to help parents send their children to private schools--should the federal government do this or not?" 45% of those who expressed an opinion on this question said yes, but the percentage varied a lot by state, income level, and religious/ethnic group. These maps show our estimates:
(Click on image to see larger version.)
Vouchers are most popular among high-income white Catholics and Evangelicals and low-income Hispanics. In general, among white groups, the higher the income, the more popular are school vouchers. But among nonwhites, it goes the other way, with vouchers being popular in the lower income categories but then becoming less popular among the middle class.
You can also see that support for vouchers roughly matches Republican voting, but not completely. Vouchers are popular in the heavily Catholic Northeast and California, less so in many of the mostly Protestant states in the Southeast. We also see a regional pattern among African Americans, where vouchers are most popular outside the South.
We checked our results by fitting the same model to the Annenberg survey from 2004, and, much to our relief, we found similar patterns:
Fred Bookstein was at my talk in Seattle on voting power (the relevant articles are here and here) but didn't get a chance to ask a question, so he's asking it now:
Why is voting power considered a "good" in all those models? What is good about it? With what generally shared human desiderata, if any, is it associated?
As the saying goes, everybody wants to go to heaven but nobody wants to die. Or, to put in political terms, people want lower taxes and more government services--with the gap filled, presumably, with a mixture of borrowed funds and savings realized by cutting government waste. In their new book "Class War? What Americans Really Think about Economic Inequality," Benjamin Page and Lawrence Jacobs put together survey data and make a convincing case that this cynical story is not a fair summary of public opinion in the United States. Actually, most Americans--Democrats and Republicans alike--support government intervention in health care, education, and jobs, and are willing to pay more in taxes for these benefits.
Page and Jacobs recognize that Americans are confused on some of these issues, for example not realizing that sales taxes cost lower-income people more, as a percentage of their earnings, while the personal income tax hits higher-income groups more, on average. The result is widespread confusion about what are the most effective ways to pay for government spending. People are also confused about how to cut the budget. To choose a well-known example that is not in the book at hand, Americans overwhelmingly support reducing the share of the federal budget that goes to foreign aid, but they also vastly overestimate the current share of the budget that goes to this purpose (average estimate of 15%, compared to an actual value of 0.3%).
Confusions on specific tax and budget items aside, Page and Jacobs are persuasive that majority public opinion is consistent with tax increases targeted to specific government programs aimed at bringing a basic standard of living and economic opportunity to all Americans. They discuss how survey respondents generally feel that such an expansion of the role of government is consistent with generally expressed free-market attitudes, a philosophy which they call "conservative egalitarianism."
This is a book of public opinion, not policy, and the authors offer no judgment on whether the public's majority preference is achievable. For example, a vast majority of Americans--including 80% of Republicans--feel that "Government should spend whatever is necessary to ensure that all children have really good public schools they can go to" (p. 59), and another clear majority--this time including 60% of Republicans--agree with the statement that "The government in Washington ought to see to it that everyone who wants to work can find a job" (p. 62). It is an open question whether these goals are possible given the tax increases that voters are willing to accept.
Carl Klarner writes:
I'm currently doing work on state legislative elections that uses Democratic success as the dependent variable. I do these analyses with either the percent of the two-party vote for the Democrat as Y, or a dichotomous measure of a Democratic victory as Y.
Jeff Lax and Justin Phillips posted this summary of attitudes on a bunch of gay rights questions:
They did it all using multilevel regression and poststratification. And a ton of effort.
P.S. My only criticisms of the above graph are:
(a) I'd just put labels at 20%, 30%, 40%, etc. I think the labels at 25, 35, etc., are overkill and make the numbers harder to read. And the tick marks should be smaller.
(b) The use of color and the legend on the upper left are well done. But they should place the items in the legend in the same order as the averages in the graphs. Thus, it should be same-sex marriage, then 2nd parent adoption, then civil unions, then health benefits, and so forth.
As I noted a couple days ago, gay marriage has had the largest recent increases in popularity in liberal states where the general population was already pro-gay.
But if you count the number of same-sex couples, you see something different, with the fastest increases in conservative areas of the country. Gary Gates writes:
You discussed the issue of social networks and knowing gay people as a possible explanation. You might want to look at some of the work of Greg Herek (a psychologist at UC-Davis) who is now saying that "knowing" someone is becoming a much less salient predictor of support for gay rights. Since nearly everyone now knows a gay person, he claims that the issue today is more whether or not you have a closer personal relationship with a gay person. Your findings were also intriguing to me when comparing them to some of the work I [Gates] have done on the enumeration of same-sex couples in the US Census and the American Community Survey. This paper looks at changes in the counts over time.
I [Gates] find the largest changes (which I interpret as increased visibility of same-sex couples) in the most conservative parts of the country.
I looked at Gates's report and it looks like good stuff. It would definitely be a good idea to reconcile his findings of the largest increases in conservative parts of the country, with Lax and Phillips's findings that public opinion on gay marriage has changed fastest in liberal states.
I saw this one today, can't figure it out:
"Don't take away my rights because you won't control your child"
What is this, the right to punch somebody else's kids?? I can't imagine somebody exercising that particular right very often before getting hurt.
It's a funny thing: we typically think of bumper sticker slogans as being simplistic, but in this case it appears to be the opposite: the compression of an idea into a short phrase has made it incomprehensible to outsiders such as myself. Or maybe that's the point. I wouldn't want to see the owner of this car near any kids, that's for sure.

(Right click on "View image" to see the whole thing.)
Among whites, vouchers appear to be more popular with the upper middle class and rich (with predictable religious variation: the strongest support is among Catholics, then born-again Protestants, then others).
Among blacks and hispanics, though, vouchers are more popular among the poor.
We'll have to check this on some other data.
Some details:
At our sister blog, Lee discusses strategic retirement, or lack thereof, in the Supreme Court.
This is a good time for me to bring up my point that congressmembers and senators appear to make these decisions non-strategically, being more likely to retire when their party most needs them and their incumbency advantage and being less likely to retire when they could be replaced more costlessly. (For example, Frank Lautenberg running for reelection in 2008, a year when the Democrats could well have afforded a fresh face in a New Jersey senate race with little chance of losing.)
A couple days ago, I wrote, of Martin and Quinn's estimated positions of Supreme Court justices, that
I don't know whether to believe the numbers. Is the Anthony Kennedy of 2007 (ideology score 0.14) really so close to Hugo Black in 1970 (ideology score 0.06)? To look at it another way, according to these numbers, in 1973 (the year of Roe v. Wade), six of the justices are colored red and the median justice is listed at 0.67. In 2007, only five are red and the median is at 0.14. In fact, in 2005 the median is listed as -0.07, or slightly to the left of center. Is it really plausible that the court was more liberal in 2005 than in 1973? Maybe so, but something looks fishy to me here.
In reply, Andrew Martin wrote:
re: Black and Kennedy, I [Martin] tend to think of them as pretty similar. Both were moderates (although on somewhat different types of cases), one a moderate Dem the other a moderate Rep. So them being close is not implausible. There were a couple of very liberal decisions in the 1972 term (when Roe was decided), including a Roe and a death penalty case. But even on civil liberties the court reached a conservative decision in Miller (the obscenity case). And there were some more conservative decisions in other areas of law. Today's court is surely more conservative on civil liberties issues (although there haven't really been many cases...), but may be a little to the left on some other issues (Hamdan). Today it all gets down to what Kennedy wants to do. If it were what Roberts would do the court would be far to the right.
That's an argument for plausibility, but the argument may be implausible. We tend to think about the Court in terms of the most politically salient cases, but the model treats Roe the same as, say, a tax case. And, of course, the measures have huge limitations because they are just based on binary data, on all cases, with some reasonably strong model assumptions, etc.
The point about counting different domains of the law is interesting, along with the age-period-cohort sort of question of how you can even try to align left-right today with the corresponding positions in 1972.









