John Sides's recent blog inspires me to resurrect this note of mine from a couple years ago:
August 2009 Archives
Recently I posted some graphs showing that liberal Democrats have a similar income profile to the general population, while conservative Republicans are more concentrated in the higher half of the income range.
Some people conjectured that the patterns might depend on whether people are thinking of liberalism/conservatism as representing social or economic issues. So Daniel and I redid the calculations, looking separately at three different measures of survey respondents' ideology as derived from the 2000 Annenberg survey:
- self-positioning on the liberal/conservative scale
- position on a scale of economic ideology (based on combining the responses to several survey questions; details in Red State, Blue State)
- position on a social ideology scale (based on combining several other survey responses).
This gave us three 3x3 grids of graphs, one for each of the three ideology scales. This was too much to display, so Daniel and I reduced the data as follows: For each ideology measure, we created five categories of people:
- Liberal Democrats
- Moderate Democrats or Liberal Independents
- Neutral (these included Conservative Democrats, Moderate Independents, and Liberal Republicans)
- Moderate Republicans or Conservative Independents
- Conservative Republicans
This five-point scale takes you from one extreme of ideological partisanship to the other, and with only five categories instead of nine, it's easier to display.
Here's what we found (click on the graph to make it larger):
There are some differences between the different measures of ideology, but the take-home point for me is that the patterns are basically consistent: liberal Democrats by any measure are pretty well distributed across the income scale, and conservative Republicans are more concentrated among the upper incomes.
Mary Towner sends along this article by herself and Barney Luttbeg that discusses the Trivers-Willard hypothesis and its applications to humans.
I think that Towner and Luttbeg agree with David Weakliem and myself on the substance, but I disagree with them on the question of what models to fit. It's not so much a Bayesian or non-Bayesian question--we use both approaches in our article--but rather a question of whether to treat parameters as continuous or discrete. In their example on page 100, they consider models in which the probability of boy births is 0.50 and 0.53. I think it would make more sense to consider theta to be a continuous parameter with distribution centered on the historical value of 0.515. Neither of those hypothesized values seems very plausible to me. On the substance, though, I think we're all on the same page.
P.S. I was curious.
Ban Chuan Cheah writes: "This paper may be relevant to a recent entry on your blog." Here's the abstract:
Multilevel models are used to revisit Moulton's (1990) work on clustering. Moulton showed that when aggregate level data is combined with micro level data, the estimated standard errors from OLS estimates on the aggregate data are too small leading the analyst to reject the null hypothesis of no effect. Simulations using similar data suggest that even when corrected for clustering, the null hypothesis is over-rejected compared to the estimates obtained from multilevel models. The relationship between survey sampling and Moulton's correction is also explored. The parallel between these two areas is extended into multiway clustering. Simulations using a data set with students clustered within classrooms and classrooms within schools suggest that the over-rejection rate from multilevel models is smaller than those corrected for clustering. This is particularly true when the number of clusters (classrooms) is small. The results suggest that modeling the clustering of the data using a multilevel methods is a better approach than fixing the standard errors of the OLS estimate.
S. V. Subramanian, Tim Huijts, and Jessica Perkins report:
Studies have largely examined the association between political ideology and health at the aggregate/ecological level. Using individual-level data from 29 European countries, we investigated whether self-reports of political ideology and health are associated. In adjusted models, we found an inverse association between political ideology and self-rated poor health; for a unit increase in the political ideology scale (towards right) the odds ratio (OR) for reporting poor health decreased (OR 0.95, 95% confidence interval 0.94-0.96). Although political ideology per se is unlikely to have a causal link to health, it could be a marker for health-promoting latent attitudes, values and beliefs.
No pretty graphs, unfortunately. But interesting.
I received the following email:
MICE, by Stef van Buuren and others, is an R package with many similarities to our "mi". Actually, MICE came first, and it's recently been updated. Should be available as an R download very soon. It would be interesting to see how the programs differ. We should probably look more carefully into cases where they give different results.
Nate Persily, Steve Ansolabehere, and Charles Stewart just completed this article addressing the relevance of the Voting Rights Act in light of Barack Obama's presidential victory:
Jenny has declared email bankruptcy but is watching her debts pile up again. I have (with effort) followed the Inbox Zero route. John Cook thinks email isn't the problem; on the other hand, he's reacting to a chorus of people telling him that email is ruining their lives, and maybe they have some good reason for saying this. Cook's commenter Heather appears to be staying barely above water with 200 messages in her inbox, while commenter Mr. Gunn recommends a technological solution.
From the comfort of my empty inbox, I thought of another big issue with email. Actually, a huge issue.
Email is a way to feel like you're working without actually thinking very hard. Sort of like blogging, actually--but blogging at least has the side benefit of sharing information with the world, focusing one's thoughts, etc. Actually, one of the unanticipated advantages of blogging, for me, was to organize the ideas that otherwise were going out into a million little emails.
I could--and have, often enough--put all of my work effort on some days into working with the inbox. What's the problem with that? First, I'm letting others drive my priorities. Some of this is fine--I certainly don't delude myself that I'm like that guy who sat in a room by himself and proved Fermat's Last Theorem--but at some point I think a little more direction to my work is useful. Second, inbox-handling just isn't usually the highest-quality thinking. It's just hard enough to occupy my mind without actually pushing me. I might as well just be playing Tetris for two hours.
My plan with Inbox Zero is to spend less total time on my email. I will spend some of the released time on more interesting, useful work, and it will also free up time for leisure.
The next step is to cut back on blogging. (Things will improve once we get the Scheduled Posting feature working again, so I can just write 10 blog entries, schedule them, and not have to think about them anymore.)
Michael Bailey writes:
I saw your blog post on Supreme Court ideal points. I [Bailey] have long shared a [disagreement with] the Martin and Quinn scores in 1973; I have an AJPS paper which goes to great lengths to think about that and uses cross time (and cross institutional) bridging observations to get estimates that are, in my opinion, more plausible over time.
See page 435 for a discussion of Martin and Quinn scores and p. 444 for my [Bailey's] alternative results.
I don't have anything to add at this time; I'd just like to say that this sort of scholarly discussion is great: it is often through disagreements at particular points that we make scientific progress.
Lucian Bebchuk writes:
Financial firms seeking to retain talent are reported to be making substantial use of guaranteed bonuses, and the French Economy Minister recently called for limiting such bonuses. While many now focus on how guaranteed bonuses affect the level of pay, my [Bebchuk's] piece focuses on their effect on incentives. I show that guaranteed bonuses create perverse incentives to take excessive risks, and consequently could well be worse for incentives than straight salary. . . .
The above discussion has implications that go beyond the question of guaranteed bonuses. It's now well recognized that bonus plans based on short-term results which may turn out to be illusory can produce excessive risk-taking, and that plans should therefore be structured to account for the time horizon of risks. But even though tying bonus plans to long-term results is desirable, it isn't sufficient to avoid excessive incentives to take risks. Bonus plans tied to long-term results can still produce such incentives if they reward executives for the upside produced by their choices but insulate them from a significant part of the downside. Bonus plans that provide executives with such insulation from downsides - either by establishing a guaranteed floor or otherwise - can seriously backfire. . . .
I can see why the bankers want such incentives--as a tenured professor, I can see the appeal of a system with a floor but no ceiling--but Bebchuk makes a convincing argument that the incentives aren't good. So maybe it's just as well that professors don't get fat bonuses as part of their compensation packages.
Everybody says OmniGraphSketcher is great, but I can't use it because I don't have a Mac. Is there anything that people recommend for Windows? I'm trying to do a lot of remote meetings with people, so this seems like it could be a useful tool.
Rebecca Weitz-Shapiro points me to this blog by Luis von Ahn, suggesting that college lectures be replaced by high-production-quality Hollywood-style videos. You can click over to see his arguments, but, I gotta say, I don't buy it. When I teach a course, the goal of the classroom time (I don't call it "lectures") is to get the students engaged in the material and thinking hard, not to sit back and be entertained, which is what you're getting from the Hollywood video.
Fang Chen writes:
I work at SAS on Bayesian software development . . . SAS has just reopened some positions and we are in the process of finding and attracting talented individuals who might be interested in making a career out of developing Bayesian software. What we essentially look for are people who are relatively well versed in Bayesian statistics, have had extensive hands-on experience in MCMC/Bayesian modeling (preferably using one of the low-level languages like C), and are interested in making relevant software.
If you know someone, a graduating student maybe, who fits this description, do you mind passing along the information? The job description/application can be found here, searching for job number 09001613.
Charles Murray posts this interesting graph:

I'll give his explanation and then some discussion of my own. First, Murray:
It took me a week of hard work, culminating now at 5 in the morning. (I haven't gone to sleep so, no, it's not "before 4pm," as the saying goes.)
This time it's for real. I will never again read an email without immediately handling it. It works with referee reports, why can't it work for everything?
P.S. Now I have a backlog of 35 blog entries which I'll have to spread over the next month.
From Ben Olding, the MIT cell phone data and a corresponding article. I can't remember what this was about, but when Ben described it to me a few months ago, I recall that it sounded cool.
I'm still waiting for someone to work with me to reanalyze Iyengar and Fishman's speed-dating data using hierarchical models.
Lee Sigelman points to this article by physicist Rick Trebino describing his struggles to publish a correction in a peer-reviewed journal. It's pretty frustrating, and by the end of it--hell, by the first third of it--I share Trebino's frustration. It would be better, though, if he'd link to his comment and the original article that inspired it. Otherwise, how can we judge his story? Somehow, by the way that it's written, I'm inclined to side with Trebino, but maybe that's not fair--after all, I'm only hearing half of the story.
Anyway, reading Trebino's entertaining rant (and I mean "rant" in a good way, of course) reminded me of my own three stories on this topic. Rest assured, none of them are as horrible as Trebino's.
I'm too tired to think about this one, but maybe some of you out there have some ideas.
Chaz Littlejohn writes:
Tom Schaller asks, "Why are senior citizens crying "socialism" at town halls?"
As we like to say in academia: I don't know the answer, so let me tell you something I do know. (Graphs made in collaboration with Daniel Lee.)
First, who has health insurance (from the 2000 Annenberg survey):

Next, should the government spend more on health care (this time from 2004):

Some Obamacare supporters say: Senior citizens have Medicare, which is a government plan, so they should support public health care provision, right? But maybe some people on Medicare are suspicious of expanded government involvement in health care because they see it as competing with Medicare for scarce dollars.
Here are a couple more graphs (pretty similar to the second graph above):
Lee Wilkinson sends along this cool paper demonstrating a simple but effective automatic classifier:
Linf is a classifier that was designed to address the curse of dimensionality and polynomial complexity by using projection, binning, and covering in a sequential framework. For class-labeled points in high-dimensional space, Linf employs computationally-efficient methods to construct 2D projections and sets of rectangular regions on those projections that contain points from only one class. Linf organizes these sets of projections and regions into a decision list for scoring new data points. Linf is not a hybrid or modification of existing classifiers; it employs a new covering algorithm. The accuracy of Linf on widely-used benchmark datasets is comparable to the accuracy of competitive classifiers and, in some important cases, exceeds the accuracy of competitors. Its computational complexity is sub-linear in number of instances and number of variables and quadratic in number of classes.
I also like the article's delightfully understated conclusion. After a page of bullet points on the virtues of their method, the authors write:
Given these distinctive features and its fundamental differences from other classifiers, Linf is a candidate for inclusion in portfolios of classifiers.
If only all of us could be so modest.
P.S. Table 1 and Figure 5 shouldn't be in alphabetical order, and I think Figure 6 would work better as a parallel coordinates plot. These are pretty minor comments, but Lee is an authority on statistical graphics so I hold him to a high standard.
The Boston Review just published an article by John Sides and myself on the 2008 election, along with discussions from several journalists, political scientists, and political activists. Here are the summaries:
Andrew Gelman and John Sides: American presidential elections always turn into stories. Because these stories capture the public imagination, they have real political importance. Unfortunately, they are often wrong. The narrative of Obama's victory is no exception.
Rick Perlstein: Our media will not--cannot--explain the slow, steady work that produces election victories.
Michael C. Dawson: In 2008, there were 2 million more African-American voters than in 2004.
Richard Johnston and Emily Thorson: Sarah Palin's approval ratings moved John McCain's support with unparalleled precision.
Mark Schmitt: We should not dismiss the idea that Obama created a new electoral map.
Andrew Gelman and John Sides respond: In elections, what is certain is almost never new; what is new is almost never certain.
The article that John and I wrote is based on some blogging we did right after the election (especially this, this, and this from me, and this, this, and this from John).
It's always fun having an article with comments, to get views from different perspectives. We keep banging on about the importance of "the fundamentals," but I think a lot of our ideas are brought out more clearly in the context of the detailed points made in the discussions.
It's too bad they weren't able to run our article with all its graphs. (Many of these graphs will appear in the forthcoming second edition of Red State, Blue State, however, with its extra chapter on the 2008 election.)
Ubs writes:
I wonder if I can get your thoughts on sabermetric baseball stats. Basically I'm trying to think about them more intelligently so that my instinctive skepticism can be better grounded in real science. There's one specific issue I'm focusing on, but also some more general stuff.
Here's a paper from Ryan Enos:
The effect of group threat on voter mobilization has been tested using observational data across a number of different geographies and units of analysis. Previous studies have yielded inconsistent findings. To date, no study of voter mobilization has directly manipulated group threat using a controlled experiment. I take advantage of the unique racial geography of Los Angeles County, California, which brings different racial/ethnic groups into close, yet spatially separated, proximity. This geography allows for a randomized, controlled experiment to directly test the effects of stimulating racial threat on voter turnout. A test of 3,666 African American and Hispanic voters shows an average treatment effect of 2.3 percentage points. The effect is 50% larger for African Americans than Hispanics. These results suggest that even low propensity voters are aware of the geographic proximity of other groups and can be motivated to participate by this awareness.
See page 21 of the article for an example of the treatment, which includes a map and this bit of text:

But what really interested me about the article was that he imputed ethnicity using available information on last names. (Go to the article and search on "surname.")
P.S. But, boy, does this paper need some good graphs! I like the paper and want to plug it here, but there's no grabby graph. I'd like to see that scatterplot of raw data with fitted lines, showing what the researcher found and how these findings came from the data. Regression tables are fine (well, not really; they should be graphs, but that's another story), but I wanna see what's happening. I wanna see what's happening.
Paul Cross writes:
In reading your book and papers on multilevel modeling I've noticed that you do not do much explicit modeling of spatial or temporal effects. I'm wondering if this is philosophically driven, perhaps because you prefer to get at underlying cause of the spatial correlations rather than just describing the spatial (or temporal) patterns. On a more technical level though, are there issues associated with including hierarchical effects in spatial models (e.g. Besag-York and Mollie convolution models)? Do spatial and non-spatial predictors compete with one another to predict the outcome resulting in biased estimates of both? Is this a simple issue of confounding, and if so how would one know that a non-spatial explanatory variable was confounding the spatial effects? Would it require a separate analysis of the spatial properties of all explanatory variables. As a disease ecologist, separating importance of covariates from the contagious process across time and space is a central problem.
My response: I think I'd be a better person if I fit spatial and time correlations. And similar models for other continuous predictors: for example, when modeling voting given income, maybe some sort of spline model instead of a sloppy combination of linear and categorical factors. I think the big reason I don't do it is that it takes a lot of work, and I'd rather put the effort into modeling interactions. Often I put in spatial patterns in a crude way, for example including regions as predictors when modeling U.S. states. But I do think that spatial modeling can be a good idea--don't take my laziness as an anti-endorsement.
P.S. I prefer the term "patterns" rather than "effects" in this context.
Here's a cool blog with all kinds of quantitative social science ideas. Good stuff.
Ian Stevenson writes:
Allen Hurlbert writes:
I saw your 538 post [on the partisan allegiances of sports fans] and it reminded me of some playful data analysis I [Hurlbert] did a couple months ago based on NewsMeat.com's compilation of sports celebrity campaign contributions. Glancing through the list I thought I noticed some interesting patterns in the partisan nature of various sports, so I downloaded the data and created this figure:

Jeff Lane writes:
I was just talking with Delia about two-stage regressions compared to multilevel analysis and we were looking at Two-Stage Regression and Multilevel Modeling: A Discussion of Several Papers for the Journal "Political Analysis" and the 2005 blog discussion, in which you posted the following response to someone struggling with choice of models:
Matt Ginsberg writes:
I saw your mention on 538.com [see also this article and this with Edlin and Kaplan]; a long time ago (80's), I [Ginsberg] wrote an article with Mike Genesereth and Jeff Rosenschein about rationality for automated agents in collaborative environments. The punch line, which probably bears on this issue as well, is that the strategy, "Act in such a way that if all the other agents were designed identically, we'd do optimally" is provably a Pareto-optimal way to design such agents. It's a nice result: handles the prisoner's dilemma, why you should vote, throw yourself on the grenade, etc.
Ginsberg's papers on the topic are here and here. I like the idea of framing the problem in terms of designing intelligent agents. This bypasses some of the normative vs. descriptive issues that cloud the analysis of rationality in human behavior.
Lee Wilkinson writes:
Also, someone asked me yesterday about Central Limit Theorem Java applets. I [Lee] looked out there and wasn't too impressed with the ones I saw. They didn't convey the essential aspects of the theorem and they were cluttered with unnecessary detail. So I [Lee] wrote this one.
Looks good to me!
I received the following question in the mail:
Aleks points me to this graph plotting program. I don't know anything about it, but, hey, maybe it's good.
I received the following email:
Aaron Gullickson writes:
I received this question in the mail:
Your Biometrics article, Multiple imputation for model checking: completed-data plots with missing and latent data, suggests diagnostics when the missing values of a dataset are filled in by multiple imputation. But suppose we have two equivalent files--File A with variable y left-censored at known threshold and File B with y fully observed. We draw multiple imputations of censored y in File A. (1) Can we validate our imputation model by setting y in File B as left-censored according to the inclusion indicator from A, performing multiple imputation of these "censored" data, and comparing imputed to observed values? (2) In particular, what diagnostic measure(s) would tell us whether the imputed and observed values fit closely enough to validate our imputation model?
My reply: I'm a little confused: if you already have File B, what do you need File A for? Do the two files have different data, or are you just using this to validate your imputation model? If the latter, then, yes, you can see whether the observations in File B are consistent with the predictive distributions obtained from your multiple imputations on File A. You wouldn't expect the imputations to be perfect, but you'd like the imputed 50% intervals to have approximate 50% coverage, you'd like the average values of the true data to equal the predictions from the imputations, on average, and conditional on any information in the observed data in File A. (But the imputations don't have to--and, in general, shouldn't--be correct on average, conditional on the hidden true values.)
You may also be interested in my 2004 article, Exploratory data analysis for complex models, which actually has an example on death-penalty sentencing, with censored data.
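The coverage check described in my reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the imputations are stored as an array with one row per imputation draw and one column per censored observation, the held-out true values come from File B, and the function name `interval_coverage` is made up for this example. It checks marginal coverage only; it does not do the conditional-on-observed-data checks mentioned above.

```python
import numpy as np


def interval_coverage(imputations, truth, level=0.5):
    """Fraction of true values falling inside the central `level` interval
    of their imputation draws.

    imputations: array of shape (n_draws, n_obs), one row per imputation.
    truth:       array of shape (n_obs,), the held-out observed values.
    """
    imputations = np.asarray(imputations, dtype=float)
    truth = np.asarray(truth, dtype=float)
    tail = (1.0 - level) / 2.0
    # Per-observation interval endpoints, taken across the imputation draws.
    lower = np.quantile(imputations, tail, axis=0)
    upper = np.quantile(imputations, 1.0 - tail, axis=0)
    inside = (truth >= lower) & (truth <= upper)
    return inside.mean()


# Toy illustration with simulated data (not real survey data): when the
# imputation model matches the data-generating process, coverage of the
# 50% intervals should come out close to one half.
rng = np.random.default_rng(0)
truth = rng.normal(size=2000)
imputations = rng.normal(size=(200, 2000))
coverage = interval_coverage(imputations, truth, level=0.5)
```

Coverage far from the nominal level (say, 20% or 90% for nominal 50% intervals) would be a sign that the imputation model is too narrow or too wide.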
Created by the garden team at Bailey House, a supportive housing facility for people living with HIV/AIDS.
John Q. writes:
Shane Murphy writes:
I recently played Risk for the first time in decades and was immediately reminded of something that my sister and I noticed when we used to play as kids: the first player has a huge advantage. I think it would be easy to fix by just giving extra armies for the players who don't go first (for example, in the three-player game, giving two extra armies to the player who goes second, and four extras to the player who goes third), but the funny thing to me is that:
1. In the rules there is no suggestion to do this.
2. In all our games of Risk, my sister and I never thought of making the adjustment ourselves.
Sure, a lot of games have a first-mover advantage, but in Risk the advantage is (a) large and (b) easy to correct.
Jeff and Justin found, based on survey data from 1994-2008, that gay marriage is most popular among the under-30s and least popular among the over-65s, and it's a big gap: support for gay rights is about 35 percentage points higher among the young than among the old.
To explore these age patterns some more, Daniel and I did some simple analyses of attitudes on gays from three questions on the 2004 Annenberg survey, which had a large enough sample size that we could pretty much plot the raw numbers by age.
First, do you favor a state law allowing same sex marriage? As expected from Jeff and Justin's analysis, the younger you are, the more likely you are to support same-sex marriage:

How do we understand this? Perhaps younger Americans are more likely to know someone gay, thus making them more tolerant of alternative lifestyles.
It's not so simple. Let's look at the response to the question, Do you know any gay people? As of 2004, a bit over half the people under 55 reported knowing someone gay; from there on, it drops off a cliff. Only about 15% of 80-year-olds know any gay people. (The data are a little noisy at the very end, where sample sizes become smaller.)

This isn't what I was expecting. I thought that people under 30 would be much more likely to say they know a gay person. But the probability actually goes up slightly from ages 18 to 45. I guess this makes sense: during those years, you meet more people, some of whom might be gay.
Mike Maltz writes:
This is an hour-long TV show, but well worth watching, even for those (like me) who have seen Rosling's presentations to the TED conference. It's in Swedish, but captioned in English.
Benjamin Kay writes:
I just finished the Stata Journal article you wrote. In it I found the following quote: "On the other hand, I think there is a big gap in practice when there is no discussion of how to set up the model, an implicit assumption that variables are just dumped raw into the regression."
I saw James Heckman (famous econometrician and labor economist) speak on Friday, and he mentioned that using test scores in many kinds of regressions is problematic, because the assignment of a score is somewhat arbitrary even if the order is not. He suggested that positive, monotonic transformations of scores contain the same information but lead to different standard errors if, in your words, they are just "dumped" into the regression. It was somewhat of a throwaway remark, but considering it longer, I imagine he means that a difference of test scores need have no constant effect. The remedy he suggested was to recalibrate exam scores such that they have some objective meaning. For example, on a mechanics exam scored between one and a hundred, one can pass (65) only if one successfully rebuilds the engine in the time allotted, but better scores indicate higher quality or faster speed. In this example one might change it to a binary variable for passing or not, an objective testing of a set of competencies. However, doing that clearly throws away information.
Do you or the readers of Statistical Modeling, Causal Inference, and Social Science blog have any advice here? The transformation of the variable is problematic and the critique of transformations on using it raw seems a serious one, but the act of narrowly mapping it onto a set of objective discrete skills seems to destroy lots of information. Percentile ranks on exams might be a substitute for the raw scores in many cases, but introduces other problems like in comparisons between groups.
My reply: Heckman's suggestion sounds like it would be good in some cases but it wouldn't work for something like the SAT which is essentially a continuous measure. In other cases, such as estimated ideal point measures for congressmembers, it can make sense to break a single continuous ideal-point measure into two variables: political party (a binary variable: Dem or Rep) and the ideology score. This gives you the benefits of discretization without the loss of information.
In chapter 4 of ARM we give a bunch of examples of transformations, sometimes on single variables, sometimes combining variables, sometimes breaking up a variable into parts. A lot of information is coded in how you represent a regression function, and it's criminal to just take the data as they appear in the Stata file and just dump them in raw. But I have the horrible feeling that many people either feel that it's cheating to transform the variables, or that it doesn't really matter what you do to the variables, because regression (or matching, or difference-in-differences, or whatever) is a theorem-certified bit of magic.
Richard Morey writes:
On your blog a while back, you asked why more people aren't using Hybrid (Hamiltonian) Monte Carlo. I have tried it, and found that it works quite well for many applications, but not so well for others (parameters with bounded space, and parameters whose log-posterior has exponential functions in them, specifically). When I started using it, there wasn't much out there about it, precisely because it hasn't caught on. Well, to help remedy that a bit, I've created a CRAN package to do hybrid Monte Carlo sampling (HybridMC), and I thought this may be of interest to your readers. The back end is written in C, so it is quite fast. I've had good luck with it so far.
Cool. We should take a look at this.
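For anyone who hasn't seen the algorithm, here is a bare-bones sketch of hybrid (Hamiltonian) Monte Carlo in Python. To be clear, this is not the HybridMC package or its interface; it is a generic textbook-style version with a unit mass matrix, a fixed step size, and a fixed number of leapfrog steps, and the function name and arguments are made up for illustration. The user supplies the log density and its gradient.

```python
import numpy as np


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples with a basic Hamiltonian Monte Carlo sampler.

    log_prob:      function returning the log density (up to a constant).
    grad_log_prob: function returning the gradient of log_prob.
    x0:            starting point, a 1-D array.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_samples, x.size))
    for i in range(n_samples):
        # Fresh Gaussian momentum at the start of every trajectory.
        p0 = rng.normal(size=x.shape)
        x_new, p = x.copy(), p0.copy()
        # Leapfrog integration: half step in momentum, alternating full
        # steps in position and momentum, final half step in momentum.
        p += 0.5 * step_size * grad_log_prob(x_new)
        for step in range(n_leapfrog):
            x_new += step_size * p
            if step < n_leapfrog - 1:
                p += step_size * grad_log_prob(x_new)
        p += 0.5 * step_size * grad_log_prob(x_new)
        # Metropolis accept/reject on the change in total energy.
        h_old = -log_prob(x) + 0.5 * (p0 @ p0)
        h_new = -log_prob(x_new) + 0.5 * (p @ p)
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new
        samples[i] = x
    return samples


# Toy target: a two-dimensional standard normal distribution.
draws = hmc_sample(lambda x: -0.5 * (x @ x), lambda x: -x,
                   x0=np.zeros(2), n_samples=2000)
```

The sketch also hints at where the difficulties Morey mentions come from: nothing in the leapfrog steps keeps a bounded parameter inside its bounds, and the sampler leans entirely on the gradient of the log-posterior, so one would typically transform such parameters to an unconstrained scale first.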
David Blei is teaching this cool new course at Princeton in the fall. I'll give the description and then my thoughts.
Just in case you thought this blog was all fluffy political stuff . . . Kaisey Mandel writes:
Ed Sanchez writes:
My company, Cumulo Software, has developed a very powerful technology that allows you to turn any R program into a web service in minutes. There is no network programming - you only need to parse simple command line arguments inside R, and then return values via 'cat'. We have many samples in R, and we are adding more every day.
Our product is called SAASi, and it has an easy to use web interface to define web services. All web services created with SAASi have strict access controls that you specify. We also provide detailed usage statistics that you can use to monetize your web services, among other things.
We are hoping to attract R experts that want to bring innovative R technologies online.
I haven't had a chance to look at this, but I thought it might interest some of you.
Daniel Lee and I made these graphs showing the income distribution of voters self-classified by ideology (liberal, moderate, or conservative) and party identification (Democrat, Independent, or Republican). We found some surprising patterns:

Each line shows the income distribution for the relevant category of respondents, normalized to the income distribution of all voters. Thus, a flat line would represent a group whose income distribution is identical to that of the voters at large. The height of the line represents the size of the group; thus, for example, there were very few liberal Republicans, especially by 2008.
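To make the normalization concrete, here is a small Python sketch of one way to read it: for each income bin, divide the number of respondents in the group by the number of all voters in that bin. Under that reading a group with the same income profile as the electorate gives a flat line, and the height of the line is the group's share of voters. The counts and the function name `normalized_profile` are invented for illustration; this is not the code behind the graphs.

```python
def normalized_profile(group_counts, all_counts):
    """Share of each income bin's voters who belong to the group."""
    if len(group_counts) != len(all_counts):
        raise ValueError("both lists must cover the same income bins")
    return [g / a for g, a in zip(group_counts, all_counts)]


# Hypothetical counts by income bin, lowest to highest.
all_voters = [200, 300, 300, 200]
same_shape_group = [20, 30, 30, 20]   # same income profile as all voters
rich_skewed_group = [10, 30, 45, 40]  # more concentrated at the top

flat = normalized_profile(same_shape_group, all_voters)
skewed = normalized_profile(rich_skewed_group, all_voters)
print(flat)    # every bin has the same share, so the line is flat
print(skewed)  # the share rises with income
```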
The most striking patterns to me are:
1. The alignment of income with party identification is close to zero among liberals, moderate among moderates, and huge among conservatives. If you're conservative, then your income predicts your party identification very well.
2. First focus on Democrats. Liberal Democrats are spread among all income groups, but conservative Democrats are concentrated in the lower brackets.
3. Conservative Republicans--the opposite of liberal Democrats, if you will--are twice as concentrated among the rich as among the poor.
Putting factors 2 and 3 together, we find that ideological partisans (liberal Democrats and conservative Republicans) are not opposites in their income distributions. In particular, richer voters are more prevalent in these groups.
Which might be relevant for the debates over health care, taxes, and other political issues that have a redistributive dimension.
P.S. The 2000 and 2004 data are from the National Annenberg Election Survey; 2008 is from the Pew Research pre-election surveys. We show all three years to indicate the persistence of the general pattern. As a way of showing uncertainty and variation, this is much more effective than displaying standard errors, I think.
In the aftermath of linking to my article with Aaron and Nate about the probability of your vote being decisive, Conor Clarke writes:
If your decision to vote is motivated by the sense that "one vote can make a difference," you are being substantially less rational than someone who never leaves the house for fear of being killed by a meteor. Voting is irrational.
I completely disagree with this last statement, and I know that Aaron does also. Here's what we wrote on pages 4-5 of our article:
Reza Esfandiari sent me this article regarding statistical analyses of the recent election in Iran. Esfandiari looks at the data and concludes that the election was fair and that the analyses contending otherwise were flawed. I haven't looked at this report in detail and offer no endorsement or criticism, just putting it out there so that anyone who might be interested can take a look themselves.
From Jeet Heer:
Some examples of business names that don't make sense:
1. Icarus Air Travel. Icarus only had one flight and it ended badly.
2. The Abelard School, a private academy. Abelard was best known for sleeping with a student.
3. Gandhi's Fine Indian Cuisine. Gandhi was not known to be a hearty eater or gourmand.
4. Mecca Jeans. Is it a good idea to wear jeans at Mecca?
5. Ponce De Leon Federal Bank. Ponce De Leon supposedly went searching for the fountain of youth. Even though the story is not true, still that's what his name means to most people. Would you trust him with your life savings?
Good points, all.
Daniel Lakeland writes:
My wife sent me this link, saying how cool it looked. I [Lakeland] told her it was one of the worst things I'd seen in a long time...Apparently it won the Guardian's "Visualization Contest"...
Hal Varian pointed me to this article in The Economist:
Instrumental variables help to isolate causal relationships. But they can be taken too far
"Like elaborately plumed birds...we preen and strut and display our t-values." That was Edward Leamer's uncharitable description of his profession in 1983. Mr Leamer, an economist at the University of California in Los Angeles, was frustrated by empirical economists' emphasis on measures of correlation over underlying questions of cause and effect, such as whether people who spend more years in school go on to earn more in later life. Hardly anyone, he wrote gloomily, "takes anyone else's data analyses seriously". To make his point, Mr Leamer showed how different (but apparently reasonable) choices about which variables to include in an analysis of the effect of capital punishment on murder rates could lead to the conclusion that the death penalty led to more murders, fewer murders, or had no effect at all.
In the years since, economists have focused much more explicitly on improving the analysis of cause and effect, giving rise to what Guido Imbens of Harvard University calls "the causal literature". The techniques at the heart of this literature--in particular, the use of so-called "instrumental variables"--have yielded insights into everything from the link between abortion and crime to the economic return from education. But these methods are themselves now coming under attack.
I don't really think this one is of general interest so I'll put it all below the jump . . .
Devin Pope writes:
I wanted to send you an updated version of Jonah Berger and my basketball paper that shows that teams that are losing at halftime win more often than expected.
This new version is much improved. It has 15x more data than the earlier version (thanks to blog readers) and analyzes both NBA and NCAA data.
Also, you will notice if you glance through the paper that it has benefited quite a bit from your earlier critiques. Our empirical approach is very similar to the suggestions that you made.
See here and here for my discussion of the earlier version of Berger and Pope's article.
Here's the key graph from the previous version:

And here's the update:

Much better--they got rid of that wacky fifth-degree polynomial that made the lines diverge in the graph from the previous version of the paper.
What do we see from the new graphs?
I published an article in the Stata Journal even though I don't know how to use Stata.
I thought this was funny. I'm not sure if Mankiw is making a joke about what Ken Rogoff thinks is a "dystopia" or whether he's making a more general joke about how economists think, but either way I was amused.
(I have no opinion one way or another on the economic analysis. I just thought it was a funny use of the term "dystopia," which I usually associate more with Mad Max than with inflation or tax increases. Actually, I thought some economists thought that a bit of inflation was a good thing?)
Cosma Shalizi writes:
Kevin Kelly has an interesting take on Ockham's razor, which is basically that it helps you converge to the truth faster than methods which add unnecessary complexities let you do. I think his clearest paper about it is this, though sadly it looks like he removed the cartoons he had in the draft versions.
I took a look. Here's the abstract:
Explaining the connection, if any, between simplicity and truth is among the deepest problems facing the philosophy of science, statistics, and machine learning. Say that an efficient truth-finding method minimizes worst-case costs en route to converging to the true answer to a theory choice problem. Let the costs considered include the number of times a false answer is selected, the number of times opinion is reversed, and the times at which the reversals occur. It is demonstrated that (1) always choosing the simplest theory compatible with experience and (2) hanging onto it while it remains simplest is both necessary and sufficient for efficiency.
This is fine, but I don't see it applying in the sorts of problems I work on, in which "converging on the true answer" requires increasingly complicated models as more data arrive. To put it another way, I don't work on "theory choice problems," and I'm invariably selecting "false answers."
P.S. I'm not saying this to mock Kelly's paper; I can imagine this can be useful in some settings, just maybe not in problems such as mine where I would like my models to be more, not less, inclusive.
Nils Hjort, Chris Holmes, Peter Muller, and Stephen Walker have come out with a new book on Bayesian Nonparametrics. It's great stuff, makes me realize how ignorant I am of this important area of statistics. Here are the chapters:
0. An invitation to Bayesian nonparametrics (Hjort, Holmes, Muller, and Walker)
1. Bayesian nonparametric methods: motivation and ideas (Walker)
2. The Dirichlet process, related priors and posterior asymptotics (Subhashis Ghosal)
3. Models beyond the Dirichlet process (Antonio Lijoi and Igor Prunster)
4. Further models and applications (Hjort)
5. Hierarchical Bayesian nonparametric models with applications (Yee Whye Teh and Michael I. Jordan)
6. Computational issues arising in Bayesian nonparametric hierarchical models (Jim Griffin and Chris Holmes)
7. Nonparametric Bayes applications to biostatistics (David Dunson)
8. More nonparametric Bayesian models for biostatistics (Muller and Fernando Quintana)
I have a bunch of comments, mostly addressed at some offhand remarks about Bayesian analysis made in chapters 0 and 1. But first I'll talk a little bit about what's in the book.
David Afshartous writes:
I recall you had a post awhile back RE the difficulty kids have excelling in statistics versus mathematics, e.g., there are few statistics prodigies yet many mathematics prodigies. In any event, my 10 year old nephew was on his school math team last year and I helped him with his homework which consisted mainly of previous math competition problems (2xweek via skype video). It seemed like they were developing a bag of tricks and not learning the underlying material behind the problems. As he is on the fence about joining the math team in the fall, I'm thinking about continuing our weekly meetings but teaching him basic statistics/probability instead. As I don't want to turn him off from the subject at an early age, my guess is that I should focus on fun probability problems that he can relate to (e.g., binomial problems related to basketball, or perhaps mix in some intriguing aspects of the history of probability) and then later introduce additional material. I'd like to come up with a plan for the semester and would appreciate any advice you have on what a 10 yr old should be taught in statistics/probability.
My reply:
First off, I envy your nephew. I had zero math education at age 10. No math team, nothing like that. I just considered myself lucky when the teacher let me sit in the library and read books.
I do remember math team from high school, and I agree that much of it was centered around silly tricks. On the other hand, silly math tricks are still math. I don't know that he really needs to learn the underlying principles right away. Maybe what it really takes is the proverbial 10,000 hours of practice. If he's enjoying it, that should be fine.
If you're doing statistics and probability . . . I really have no idea! I personally like a lot of the games in my Bag of Tricks book, so you could start with some of them. A natural area of applications would be board games, if he likes Monopoly or Scrabble or whatever, there are a lot of probabilities to calculate. You could also try getting a little roulette set, if you're not worried about turning him into a gambling addict.
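As one example of the sort of board-game calculation I have in mind, here is a short Python sketch that enumerates the 36 equally likely outcomes of two fair dice and gives exact probabilities for each total and for rolling doubles. The function name is made up, and a 10-year-old would presumably do this with a 6x6 table on paper rather than a program.

```python
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Exact probability of each total when rolling two fair dice."""
    dist = {}
    for a, b in product(range(1, 7), repeat=2):
        # Each of the 36 ordered outcomes has probability 1/36.
        dist[a + b] = dist.get(a + b, 0) + Fraction(1, 36)
    return dist


dist = two_dice_distribution()
# Doubles: the six outcomes where both dice show the same face.
p_doubles = Fraction(
    sum(1 for a, b in product(range(1, 7), repeat=2) if a == b), 36
)

for total in sorted(dist):
    print(total, dist[total])
print("doubles", p_doubles)
```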
Any other ideas out there?
In the "Conservatives are nicer than liberals" controversy, there was a question about who has more money, conservatives or liberals. I've written a lot about income and voting, but I realized I'd never actually looked at income and political ideology. Here are the data, from respondents to the Pew pre-election polls in 2008:

The poorest people are more likely to be liberal, and the richest are more likely to identify as moderate rather than conservative, but overall there's less going on here than I would've expected.
In contrast, the relation between income and party identification is strong, and goes in the expected direction:

There must be a lot of low-income moderate Democrats and high-income moderate Republicans out there.
P.S. For the purpose of understanding charitable giving, I'd rather know wealth than income. Or maybe something like "disposable income." It's harder to get this from survey data, though.
Every couple of months there's a new version of R. Now it's R 2.9.1. I better download it, since some packages I use might depend on the latest version.
Can somebody out there in R-land please put an "update" button in the R console? Or, better yet, have R check occasionally for updates and then allow me to install with one click, in a way that will transfer all my downloaded packages automatically.
Thanks.
P.S. Yes, yes, I know. R is free, and if I really want this done, I can do it myself. But I'm doing other things for R! And others would be much better able than I to set up the automatic install as described above.
P.P.S. If youall are making changes to R for me, I also suggest replacing the current display of lm and glm fits with the output from display() in the arm package.
A correspondent writes:
I'm doing some personal research on the correlation between family income and political affiliation and I was hoping you can help. I came across some illuminating maps that you created and was wondering where you got your data from. I can't seem to find any hard data on the subject so any help would be greatly appreciated.
I [my correspondent] am looking into the assertion that conservatives are more generous than liberals. Specifically, I'm trying to debunk the thesis of Arthur C. Brooks' Who Really Cares: The Surprising Truth About Compassionate Conservatism. In this book, Brooks argues that liberals are less generous than conservatives and uses hard data to substantiate the claim. While I believe most of his analysis is spot on, I think that his results might be skewed by the way he measures generosity.
Brooks measures generosity as the percentage of income spent on charitable giving. I think that a better measure would be charitable giving as a percentage of disposable family income; people don't give away what they can't afford to. This is significant because, if your maps are correct, there's the distinct possibility that conservatives make more than liberals on average and therefore have more to give. If I can get data on income as a function of political affiliation I can correct for non-disposable income and see if it makes a significant difference in the results.
My reply:
First, I'd like to point you to some updated maps that I've made of income and voting.
Our data came from the Pew Research Center. We used their polls taken during the few months before the election. We also adjusted for voter turnout using the Current Population Survey post-election supplement, but that's less important, I think. (Yair and I are in the midst of writing up an article describing exactly what we did.)
Finally, Arthur Brooks's findings seem plausible enough to me, even after controlling for income. My own pet explanation is in terms of default behavior. Or, to put it even more strongly, as commenters Ockham and Ubs wrote here, you're much more likely to give to charity if somebody is asking you to do so--and conservatives might very well be more likely than liberals to be in settings where someone is personally asking them to give to charity.
Christopher Beam's recent news article on qalys includes this amazing quote:
QALYs also assume that a year lived by an 80-year-old is worth less than one lived by a 20-year-old. But that's not accurate, says Dana Goldman of the RAND Corp. "It's not taking into account hope, not taking into account the chance of living to see your daughter's wedding, it's not getting at the extra value we put on the end of life." Yes, the U.S. health care system has to rein in costs, says Goldman, but "QALY is not ready for prime time."
Maybe this guy is being taken out of context, but . . . "the chance of living to see your daughter's wedding"??? There's always individual variation; that doesn't mean you can't try to capture averages.
As I wrote a couple of weeks ago, the Republicans need something like a 7% swing in the national vote to take back the House of Representatives in 2010.
From Erikson, Bafumi, and Wlezien, here is a graph predicting the Democratic party's vote share in midterm elections, given their support in a generic party ballot from polls taken during the 300 days before the election:
The higher line in each graph (in red) corresponds to elections where the incumbent president is a Republican, and the lower line (in blue) corresponds to elections such as 2010, where the incumbent is a Democrat.
Alan Reifman writes:
I [Reifman] have created a new website to compile poll results on specific provisions of the health care reform debate. Today, I review the polling on universality, personal/individual mandates, and employer mandates. I discuss in the Welcome Statement on my page how I aim to go beyond what is currently available on sites such as Pollster.com and Polling Report.
From a subscription card insert in the New Yorker:
EXTRA! REGISTER ONLINE NOW FOR YOUR CHANCE TO WIN $50,000 CASH
FROM THE NEW YORKER
I guess I already knew that once they were affiliated with Dennis Miller, the New Yorker had already jumped it. . . .
Xiao-Li wrote an article on his experiences putting together a statistics course for non-statistics students at Harvard. Xiao-Li asked for any comments, so I'm giving some right here:
I think the ideas in the article are excellent.
The challenges of getting students actively involved in statistics learning have motivated me to write a book on teaching statistics, develop a course on training graduate students to teach statistics, and even to offer general advice on the topic.
But I have not put it all together into a successful introductory course the way Xiao-Li has, and so I read his article with interest, seeking tips in how we can do better in our undergraduate teaching.
The only thing I really disagree with is Xiao-Li's description of statisticians as "traffic cops on the information highway." Sure, it sounds good, but often I find my most important role as a statistician is to tell people it's ok to look at their data, it's ok to fit their models and graph their inferences. There's always time to go back and check for statistical significance, but I've found the biggest mistakes are when scientists, fearing the statistician over their shoulder, discard much of their information and don't spend enough time looking at what they have left.
I'm certainly not arguing that simple methods are all we need. (See here for my recent advertisement for fancy modeling). What I'm saying is that I'm happier being an enabler than a police officer. I think I've done more good by saying yes than by saying no.
On the other hand, in Xiao-Li's defense, he's prevented three false discoveries (see bottom of page 206 of his article), whereas I've proved one false theorem. So perhaps we just put different values on our Type 1 and Type 2 errors!
To return to XL's article, on pages 207-208 he tells a story involving a scientist who was stopped just in time before making a big mistake, by discussing the questionable analysis with Policeman Meng, who noticed the problem. I assume we can all agree that the crucial step in this process was that the scientist was (a) worried that something might be wrong and (b) went to a statistician for help. I'd like to believe that many of the readers of this article would've been able to find the problem, but this sort of eagle-eyed criticism is different from what I think of as the most common bit of policing, which is statisticians giving scientists a hard time about technicalities.
Or, to put it another way, I don't mind the statistician as critic, but I don't think we should have the police officer's traditional power to arrest and detain people at will. Except maybe in some extraordinary cases.
To return to undergraduate education: I've taught undergraduate statistics several times at Berkeley and at Columbia. Berkeley had an exciting undergraduate program with about 15 juniors and seniors taking a bunch of topics classes. I have fond memories of my survey sampling and decision analysis classes and also of the department's annual graduation ceremony, which included B.A.'s, M.A.'s, and Ph.D.'s in one big celebration. I've heard that the program has since grown to about 50 students. At Columbia, in contrast, we have something in the neighborhood of 0 statistics majors. It's a feedback loop: few courses, few students, few courses, etc. I think this was the case at Harvard for many many years, although maybe it's changed recently.
My point? The intro courses at Berkeley for non-majors were very well organized, much more so than at Columbia, at least until recently. Perhaps no coincidence. I suspect it's easier to confidently teach statistics to non-majors if you have a good relationship with the select group of undergraduates who are interested enough in statistics to major in it. And, conversely, an excellent suite of introductory statistics classes is a great way to interest students in further study.
Teacher training is also important, as Xiao-Li indicates in the last sentence of his article. At Berkeley there was no formal course in statistics teaching, but most of the Ph.D. students went through the "boot camp" of serving as T.A.'s in large courses under the supervision of experienced lecturers such as Roger Purves; between this direct experience and word-of-mouth guidance from other students in the doctoral program, they quickly learned which way was up. At Columbia we have recently revived our course, The Teaching of Statistics at the University Level, and I hope that this course--and similar efforts at Harvard and other universities--will help move us in the right direction.
In addition, wider awareness of statistical issues outside of academia (for example, at our sister blog) will, I hope, make college students demand statistical thinking in all their classes, whether taught by statisticians or not. It wouldn't be a bad thing for a student in a purely qualitatively-taught history class to consider the role of selection bias in the gathering of historical data (see Part 2 of A Quantitative Tour for more on this sort of thing), just as it isn't a bad thing for a student in a statistics class to think about the social implications of some of the methods we use.
David Spiegelhalter and Ken Rice wrote this excellent short article on Bayesian statistics. I think it's far superior to the Wikipedia articles on Bayes, most of which focus too much on discrete models for my taste.
Peter Flom writes:
I am now up for a position which would require teaching some introductory statistics to people studying to work in health care. Mostly, these people will have only a HS diploma, and it may be a fairly old HS diploma (a lot of them are returning to school).
For the interview, though, I am assigned to give a 30 minute talk (no powerpoint or anything, just a white board).
Alan Bergland writes:
I am a graduate student studying evolutionary biology at Brown University. I am writing you with what I think is a simple question, but I cannot seem to find an answer I feel comfortable with.
I am trying to test a planned contrast using posterior distributions from a mixed model (the mixed model is calculated in lme4, and the simulations in arm). The model is fairly complicated, but at the end of the day, there are two fixed effect treatments with two levels each that I am interested in. Let's call these fixed effects "treatment A" (with levels A and a) and "treatment B" (with levels B and b). I am interested in the interaction between treatment A and treatment B, but have a specific hypothesis about the form of that interaction I would like to test. Specifically, I would like to test if ab is less than Ab & aB=AB.
As you and Jennifer Hill suggest in your Multilevel/Hierarchical models book (p. 20), I could test if ab
Once I can calculate the probability that Ab=AB, would it be reasonable to calculate the probability that (ab is less than Ab & aB=AB) as Pr(ab is less than Ab)*Pr(aB=AB)?
My reply:
1. Don't use arm's sim() function for lmer() objects. The current version is wrong; we're fixing it now, and the replacement should be available in about a month.
2. I don't recommend testing if aB=AB. At least in the sorts of problems I work on, no two comparisons are exactly equal. I think it makes more sense to estimate the relevant comparison, get the confidence interval, and make a graph. You could also do things like calculate the posterior probability (based on simulations) that ab < AB & |aB - AB|
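To illustrate the simulation-based calculation in point 2, here is a rough Python sketch. Everything in it is an assumption made for illustration: the draws are faked with a random number generator rather than taken from a fitted model, the tolerance `delta` is a hypothetical value, and the function name is made up. The point is only that a joint posterior probability is estimated by counting the simulation draws in which both conditions hold at once, which in general is not the product of the two separate probabilities.

```python
import random


def joint_and_marginals(draws, delta):
    """Estimate Pr(ab < AB and |aB - AB| < delta) and the two marginals.

    draws is a list of dicts, one per simulation, giving the simulated
    cell means "ab", "aB", and "AB". delta is a hypothetical tolerance.
    """
    n = len(draws)
    first = [d["ab"] < d["AB"] for d in draws]
    second = [abs(d["aB"] - d["AB"]) < delta for d in draws]
    p_first = sum(first) / n
    p_second = sum(second) / n
    # Count draws where both conditions hold in the same simulation.
    p_joint = sum(1 for f, s in zip(first, second) if f and s) / n
    return p_joint, p_first, p_second


# Fake posterior draws, standing in for output from a fitted model.
rng = random.Random(1)
fake_draws = [
    {"ab": rng.gauss(0.5, 0.5), "aB": rng.gauss(1.0, 0.5), "AB": rng.gauss(1.0, 0.5)}
    for _ in range(4000)
]

p_joint, p_first, p_second = joint_and_marginals(fake_draws, delta=0.3)
print(p_joint, p_first, p_second, p_first * p_second)
```

Because both conditions involve AB, they are dependent across draws, so the joint estimate will typically differ from the product printed at the end.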
Ryan Richt writes:
I wondered if you have a quick moment to dig up an old post of your own that I cannot find by searching. I read an entry where you discussed if there really was a difference between a prior of 1/2 meaning that we have no knowledge of a coin flip, or meaning we are exactly certain that its generative distribution is 1/2.
I'm only 24 and just got my masters last year, but I now have my own summer interns (who of course I encourage to read ET Jaynes and see the bayesian light) and one of them basically asked that question today.
My reply: The two original blog entries are here and here. Here's my published article. And here's a link discussing actual wrestlers and boxers. (Apparently the wrestler would win.)
The talks from the mini-conference are up on the website. The speakers:
Martin Lindquist (Dept of Statistics, Columbia)
Ed Vul (Dept of Brain and Cognitive Sciences, MIT)
Nikolaus Kriegeskorte (Laboratory of Brain and Cognition, NIH)
Tor Wager (Dept of Psychology, Columbia)
Andrew Gelman (Dept of Statistics, Columbia)
Daphna Shohamy (Dept of Psychology, Columbia)
Cosma Shalizi (Dept of Statistics, CMU)
Pat Shrout (Dept of Psychology, NYU)
The powerpoints are up, and also videos of our presentations. If you listen carefully, you can hear the raucous laughter in the background. . . .
Here (I found it through a link from Jenny Davidson). Only one update in the past six months, but still, it's the great Luc Sante...
Alfred Inselberg, the inventor of parallel coordinates, sent along this fascinating handout with a bunch of color graphs illustrating the power of the parallel-coordinates idea.
Here's a cool picture, along with Inselberg's caption:
In the background is a dataset with 32 variables and 2 categories. On the left is the plot of the first two variables in the original order, on the right are the best two variables after classification. The algorithm discovers the best 9 variables (features) needed to describe the classification rule, with 4% error, and orders them according to their predictive power.
A couple more below:
Ian Fellows writes:
Being as you are an R user at the intersection of the social sciences and statistics, I thought some recent work I've done might be of interest to you. SPSS has long dominated the teaching and practice of statistics in the social sciences (at least among non-statisticians). I've created a new menu driven data analysis graphical user interface aimed at replacing SPSS (or at least that's the long term lofty goal). It has just been released under GPL-2 on CRAN. Feel free to check out some screen shots in the online wiki manual (not yet complete).
I don't know SPSS, but just yesterday someone told me that people can run R from SPSS and get a convenient menu system, so if this freeware would have the same capacity, that would be great. Here's the description:
I ran into John Barnard a few hours ago and he told me that he likes the blog but he hates the political stuff. So, John, you can skip this one. Although there is a bit of statistics near the end, so if you want you can click through and search for two asterisks (**); I've labeled the statistical content, just this once, to make your life slightly easier!
Following Paul Krugman, John Sides considers how one might measure the ideological position of conservative political commentator Michelle Malkin. I'd heard the name but I don't have any TV reception and didn't really know what she stood for. Going to her webpage, I see she's written three books: "Invasion: How America Still Welcomes Terrorists, Criminals, and Other Foreign Menaces to Our Shores," "In Defense of Internment: The Case for 'Racial Profiling' in World War II and the War on Terror," and "Unhinged: Exposing Liberals Gone Wild." From her blog, she also appears to have conservative economic views, although it's hard to separate this from partisanship without going back to posts from previous years.
Krugman wants a "scale of positions on political matters ... we might find that only 19 percent of Americans are to the right of Michelle Malkin, while 23 percent are to the left of Michael Moore." I don't have enough of a sense about Malkin, but I'm pretty sure that much less than 23% of Americans are to the left of Michael Moore. In chapter 8 of Red State, Blue State is this graph from Joe Bafumi and Michael Herron estimating the ideological positions of congressmembers and voters:
Aaron Edlin just sent me this article by Pinar Karaca-Mandic and himself from 2006:
We [Edlin and Karaca-Mandic] estimate auto accident externalities (more specifically insurance externalities) using panel data on state-average insurance premiums and loss costs. Externalities appear to be substantial in traffic-dense states: in California, for example, we find that the increase in traffic density from a typical additional driver increases total statewide insurance costs of other drivers by $1,725-$3,239 per year, depending on the model. High-traffic density states have large economically and statistically significant externalities in all specifications we check. In contrast, the accident externality per driver in low-traffic states appears quite small. On balance, accident externalities are so large that a correcting Pigouvian tax could raise $66 billion annually in California alone, more than all existing California state taxes during our study period, and over $220 billion per year nationally.
Interesting stuff. I don't have it in me right now to check all these numbers, but the argument looks to be laid out clearly enough that the experts in the area can work it out. Also, it all seems to be about accidents to other cars; I'm not sure where they factor in the costs due to running over pedestrians.
Kobi forwarded this on, I don't know anything about it but it looks like it could be interesting:



