This note by Steve Hsu on the history of the Wranglers (winners of a mathematics competition held each year from 1753-1909 at Cambridge University) reminded me of my experience in the U.S. math olympiad training program in high school. At the time, it seemed clear that we were ordered by ability (with my position somewhere between 15th and 20th out of 24!). In retrospect, I think there are a lot of tricks to solving and writing up solutions to "Olympiad problems," and I didn't know a lot of these tricks.

It was the usual paradox of measurement: I was confusing reliability with validity, as they say in the psychometric literature.

Daljit Dhadwal writes:

On the Ask Metafilter site, someone asked the following:

How does statistical analysis differ when analyzing the entire population rather than a sample? I need to do some statistical analysis on legal cases. I happen to have the entire population rather than a sample. I'm basically interested in the relationship between case outcomes and certain features (e.g., time, the appearance of certain words or phrases in the opinion, the presence or absence of certain issues). Should I do anything different than I would if I were using a sample? For example, is a p-value meaningful in this kind of case?

My reply:

This is a question that comes up a lot. For example, what if you're running a regression on the 50 states? These aren't a sample from a larger number of states; they're the whole population.

To get back to the question at hand, it might be that you're thinking of these cases as a sample from a larger population that includes future cases as well. Or, to put it another way, maybe you're interested in making predictions about future cases, in which case the relevant uncertainty comes from the year-to-year variation. That's what we did when estimating the seats-votes curve: we set up a hierarchical model with year-to-year variation estimated from a separate analysis. (Original model is here, later version is here.)

So, one way of framing the problem is to think of your "entire population" as a sample from a larger population, potentially including future cases. Another frame is to think of there being an underlying probability model. If you're trying to understand the factors that predict case outcomes, then the implicit full model includes unobserved factors (related to the notorious "error term") that contribute to the outcome. If you set up a model including a probability distribution for these unobserved factors, standard errors will emerge.
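Here is a minimal sketch of the second framing, assuming simulated data and a plain linear model with independent normal errors (the function name `ols_with_se`, the coefficients, and the noise level are all made up for illustration). The 50 points play the role of an "entire population" such as the 50 states; the standard errors come from the assumed distribution of the error term, not from any sampling of units.

```python
import numpy as np

def ols_with_se(x, y):
    """Fit y = a + b*x by least squares; return coefficients and standard errors."""
    X = np.column_stack([np.ones_like(x), x])   # intercept column plus predictor
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)            # estimated variance of the error term
    cov = sigma2 * np.linalg.inv(X.T @ X)       # classical covariance of the estimates
    return beta, np.sqrt(np.diag(cov))

# Hypothetical "whole population": 50 units, not a sample from anything larger.
rng = np.random.default_rng(0)
n_states = 50
x = rng.normal(size=n_states)
y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=n_states)  # unobserved factors enter here

beta, se = ols_with_se(x, y)
print("estimates:      ", beta)
print("standard errors:", se)
```

The uncertainty here reflects how different the fitted coefficients could have been under other draws of the unobserved factors, which is one way to give a p-value meaning even when every unit has been observed.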

After finding the Howard Wainer interview, I looked up the entire series of Profiles in Research published by the Journal of Educational and Behavioral Statistics. I don't have much to say about most of these interviews: some of these people I'd never heard of, and I don't really have much research overlap with the others. Probably I have the most overlap with R. D. Bock, who's done a lot of work on multilevel modeling, but, for whatever reason, his stories didn't grab my interest.

But I was curious about the interview with Arthur Jensen. I've never met him--he gave a talk at the Berkeley statistics department once when I was there, but for some reason I wasn't able to attend the talk. But I've heard of him. As the interviewers (Daniel Robinson and Howard Wainer) state:

More on the median voter

A correspondent read my recent note on the limited influence of the median voter and writes:

My understanding of median voter theorem is that each election has its own median voter, and that the median voter's influence is limited to the outcome of that election only. I don't understand, then, why the graph in your post is evidence that the median voter has little influence. It seems to me that there are two elections being considered in that graph, with two different median voters. The graph appears to consider "moderation" to be having a moderate voting record in Congress, but it seems to me that the median voter in Congress is likely quite different from the median voter in any particular Congressional district. The power of the median voter in Congress, it seems to me, is to affect the outcome of Congressional votes, not to improve his own chances for re-election, which are determined by his proximity to the median voter in his district. Thus, I'm not sure why we would expect moderation, as measured by the median Congressional voter, to translate into electoral success, which we would expect to be determined by the median district voter.

My reply:

Should Mark Sanford resign?

At our sister blog, Tom Schaller says no:

Is Sanford a cad for bolting his family on Father's Day weekend? Of course, but that is a private, moral failing, rather than a failure of public duty. . . .

I [Schaller] oppose most of what Mr. Sanford stands for politically. His showy rejection of federal stimulus money targeted for his state was a crass publicity stunt designed to garner national attention for Mr. Sanford at the expense of his constituents, many of whom are struggling economically. . . . Should Mr. Sanford's ambitions founder on the shoals of a personal scandal, however, yet another opportunity will be lost to establish the long-overdue separation between private comportment and public service. So here's hoping he doesn't resign or, if he does, it is a matter of personal choice rather than him bowing to political pressure.

I see where Schaller is coming from. Lots of people have complicated personal lives, and it's not clear at all that these difficulties have much if anything to do with governing. But I don't know if I agree with him on the wall of separation between private comportment and public service.

Consider the Sanford case. Schaller's a Democrat, so he can evaluate Sanford on his policies. But if Schaller were a Republican, he might very well want Sanford out of there because he tarnishes the brand, makes the party a laughingstock, etc. Also makes it harder for Sanford to convincingly follow a "family values" agenda which Schaller (if he were a Republican) might want. These are legitimate concerns for a Republican to have. Even if you don't think Sanford's personal indiscretions are important, you might want him gone and replaced by a more effective Republican. Just as, from the other direction, a Democrat would've preferred a zipped-fly version of Bill Clinton.

Some time ago FlowingData had an article on visualizing tables - which really is about visualizing spreadsheets in terms of correlations between columns. While Circos generates very colorful displays:

circos.png

Today I was impressed by a much cleaner and Tuftier variant on the theme by Mike Bostock, called Dependency Tree:

dependency-tree.png

Click on the link; it's interactive. Jeff Heer and Bostock also have a new JavaScript visualization toolkit out, ProtoVis, which simplifies the creation of such stuff. The computer scientist in me finds this development very cool. But I still like my correlation matrices.

Sometimes you hear discussion of how the red states get more from the government than they pay in taxes while the blue states get less and pay more. This is slightly misleading because the blue states are richer and rich people pay a higher rate of income tax, but it does raise the interesting question of the regionally distributive effects of national taxing and spending policies.

minimap.jpg

For some perspective on where this is coming from: In our office is a map from 1924 titled "Good Roads Everywhere" that shows a proposed system of highways spanning the country, "to be built and forever maintained by the United States Government." The map, made by the National Highways Association, also includes the following explanation for the proposed funding system: "Such a system of National Highways will be paid for out of general taxation. The 9 rich densely populated northeastern States will pay over 50 per cent of the cost. They can afford to, as they will gain the most. Over 40 per cent will be paid for by the great wealthy cities of the Nation. . . . The farming regions of the West, Mississippi Valley, Southwest and South will pay less than 10 per cent of the cost and get 90 per cent of the mileage." Beyond its quaint slogans ("A paved United States in our day") and ideas that time has passed by ("Highway airports"), the map gives a sense of the potential for federal taxing and spending to transfer money between states and regions.

P.S. Yes, I posted this last year, but without the pretty map image (click on it for higher resolution, which unfortunately still isn't quite good enough to make out the text).

The Howard Wainer story.

One of the fun parts is this story from his days as an assistant professor:

Casey Mulligan is consistent

Back in April, in an article about partisan perceptions of the economy, John Sides and I wrote:

A scary thought

A colleague and I were talking the other day about how much we pay our research assistants. It turns out that she pays much more. In fact, sometimes I don't get around to paying my research assistants at all, but she pays hers a decent amount.

My colleague, who's an untenured professor, said that was understandable because she makes less money than I do, so she can better relate to the students' lifestyles. That's a pretty scary thought--it should really go the other way, right? I get paid more so I should be able to afford to be more generous. But maybe she's right; if so, it's a sobering insight.

One major impediment, scientists agree, is the grant system itself. It has become a sort of jobs program, a way to keep research laboratories going year after year . . .

I was on an NIH panel a couple of years ago with about 25 other scientists, reviewing something like 90 grants. It was pointless. 25 people is just too many to make a decision. What happened was that there were 3 or 4 people who were experienced in the process, who ended up guiding the entire discussion.

The highlight--or, I should say, lowlight--was when we were reviewing a proposal involving the study of the carcinogenic effects of hookah (water pipe) smoking. I asked if this was really such a big deal, and one of the panel members told me that smoking tobacco through a hookah is something like 10 times worse than smoking a cigarette. If so, the public health consequences could be pretty serious, even if not so many people did it. I said this sounded like a reasonable point to me. Then this guy across the table from me spoke up and said that he knew somebody who was 80 years old, had been smoking with a hookah all his life and was none the worse for it. At this point, I blew up. I couldn't believe that the "my elderly aunt smokes and she didn't get cancer" argument could be brought up at an NIH panel!

My final thoughts on those Iran vote analyses:

Our article (by Yu-Sung, Jennifer, Masanao, and myself, and based also on work with Kobi, Grazia, and Peter Messeri) will be appearing in the Journal of Statistical Software, in a special issue on missing-data imputation. Here's the abstract:

Our mi package in R has several features that allow the user to get inside the imputation process and evaluate the reasonableness of the resulting models and imputations. These features include: flexible choice of predictors, models, and transformations for chained imputation models; binned residual plots for checking the fit of the conditional distributions used for imputation; and plots for comparing the distributions of observed and imputed data in one and two dimensions. In addition, we use Bayesian models and weakly informative prior distributions to construct more stable estimates of imputation models. Our goal is to have a demonstration package that (a) avoids many of the practical problems that arise with existing multivariate imputation programs, and (b) demonstrates state-of-the-art diagnostics that can be applied more generally and can be incorporated into the software of others.

We've made lots of improvements since listing the package last year (here). There's still a lot more work to do, in many different directions (including multilevel models, nonignorable models, the self-cleaning oven, and making the program run faster in all sorts of ways), and we keep improving it. But it's good to have something out there.

To actually get the R package, just open your R window, click on Packages, Install packages, and grab mi.

Pinchas Lev writes:

Sometimes people think it's a disaster when you have more predictors than data points, but I always point out that, no, it's better to have 9 predictors than just 1 or 2. After all, if you really wanted just 1 or 2, you could just throw out most of your data!

Nate's chart is excellent, especially the ordering of the candidates by the percent favoring resignation:

sanford2.PNG

I also like the gratuitous exclamation marks, which add fun value without actually making the graph any harder to read. The key reason this works is that Nate wisely did not fill in the blank squares with "No!"s.

My only comments are:

Andrew Knight points me to this Kafkaesque report on Bayesian methods and evidence-based medicine. It's always good to see things like this out there.

My main disagreement with the report is on their framework in which there is a fixed data model and different choices of prior distribution. As we discuss in Section 2.8 of Bayesian Data Analysis, I much prefer the framework in which a single prior distribution (or "population distribution") is applied to many different data settings. I think that framing it my way makes the benefits of Bayesian inference much clearer.

I also don't like all the tables. But that's not really a Bayesian issue.

The roach-bombing puzzle

I've been assured, and I believe, that the effective way to get rid of the roaches in your apartment is to clean the place, put poison in the cracks, and then seal them. Some people do that. But a lot of people go for the "bombing" approach: the exterminator comes to the building once a month, drops the bomb, leaves, and comes back the next month.

My question is: what are these people thinking?? Why do these people willingly get bombed once a month instead of following the simpler and effective approach? Part of this is ignorance, surely, but I think there's more to it than that, some underlying psychological appeal. I don't think it's just ignorance because, when I talk with people who get bombed and discuss the "clean, poison, and seal" approach, I've found them to be very resistant and (I would say) "defensive." They seem to want to believe that bombing is effective and really don't want to hear about alternative strategies.

What's going on? I have some theories. Maybe bombing seems like less effort than cleaning the food out of your closet and sealing the cracks. Also it seems sort of decisive. On the other hand, shouldn't people pause a little when they think about needing the exterminator every month? Yet, that doesn't seem to bother people. Conceptually, getting the exterminator to bomb your apartment feels to me a bit like "taking a pill." Maybe there's some technological appeal. Sort of like the way that photovoltaics are sexy in a way that passive solar isn't.

I don't know. I'll have to ask some psychologists of my acquaintance who work on environmental decision making.

Hall, J.L., L.W. Miratrix, P.B. Stark, M. Briones, E. Ginnold, F. Oakley, M. Peaden, G. Pellerin, T. Stanionis and T. Webber, 2009. Implementing Risk-Limiting Audits in California, USENIX EVT/WOTE, In press.

Related discussion here.

Donna Harrington writes:

I will be teaching a new multilevel models course in the fall and am currently reading your text, /Data Analysis Using Regression and Multilevel/Hierarchical Models/ as I prepare. I am enjoying the book and am considering adopting it for use in the course.

Would you be willing to share the syllabus you have used for your Applied Regression and Multilevel Models course? I am particularly interested in seeing how much of the book you use in a one semester course.

My reply:

I have to admit that, over the years, I've made my syllabuses less and less detailed as I've focused more and more on writing the books. For a multilevel modeling course, I suggested the following:

- chapters 3,4,5: linear and logistic regression
- chapter 7: basics of simulation
- chapter 9: basics of causal inference
- chapters 11-14: multilevel linear and logistic regression (up to and including varying-intercept, varying-slope models)
- chapter 18: all the theory that they'll need.

For an introductory course, my usual strategy for a one-semester course is to focus on chapters 2-10: that is, cover everything except multilevel modeling. Linear regression, logistic, glm, computation, and causal inference. Then for the last part of the course, I can choose among some options, including: intro to multilevel models, sample size and power calculations, and missing data imputation.

P.S. To those of you who haven't had the opportunity to take a course from me: Don't worry about it. I'm better at writing than teaching. Maybe you're better off learning out of one of my books with somebody else actually teaching the class.

A political scientist writes:

Here's a question that occurred to me that others may also have. I imagine "Mister P" will become a popular technique to circumvent sample size limitations and create state-level data for various public opinion variables. Just wondering: are there any reasons why one wouldn't want to use such estimates as a state-level outcome variable? In particular, does the dependence between observations caused by borrowing strength in the multilevel model violate the independence assumptions of standard statistical models? Lax and Phillips use "Mister P" state-level estimates as a predictor, but I'm not sure if someone has used them as an outcome or whether it would be appropriate to do so.

First off, I love that the email to me was headed, "mister p question." And I know Jeff will appreciate that too. We had many discussions about what to call the method.

To get back to the question at hand: yes, I think it should be ok to use estimates from Mister P as predictor or outcome variables in a subsequent analysis. In either case, it could be viewed as an approximation to a full model that incorporates your regression of interest, along with the Mr. P adjustments.

I imagine, though, that there are settings where you could get the wrong answer by using the Mr. P estimates as predictors or as outcomes. One way I could imagine things going wrong is through varying sample sizes. Estimates will get pooled more in the states with fewer respondents, and I could see this causing a problem. For a simple example, imagine a setting with a weak signal, lots of noise, and no state-level predictors. Then you'd "discover" that small states are all near the average, and large states are more variable.
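The simple example can be sketched in a few lines, under loud assumptions: the data are simulated, the noise and signal standard deviations (`sigma`, `tau`) are treated as known rather than estimated, and the hypothetical `partial_pool` function is ordinary normal-normal shrinkage toward the grand mean, a stand-in for the pooling step and not the full Mr. P procedure.

```python
import numpy as np

def partial_pool(ybar, n, sigma, tau):
    """Shrink each state's raw mean toward the grand mean.

    The weight on the raw mean is tau^2 / (tau^2 + sigma^2 / n),
    so states with fewer respondents are pulled in more.
    """
    mu = np.average(ybar, weights=n)            # respondent-weighted grand mean
    w = tau**2 / (tau**2 + sigma**2 / n)
    return mu + w * (ybar - mu)

rng = np.random.default_rng(1)
sigma, tau = 1.0, 0.1                           # lots of noise, weak signal (assumed known)
n = np.repeat([20, 2000], 25)                   # 25 small states, 25 large states
theta = rng.normal(0.0, tau, size=n.size)       # true state effects, no predictors
ybar = rng.normal(theta, sigma / np.sqrt(n))    # raw state means
est = partial_pool(ybar, n, sigma, tau)

small, large = n == 20, n == 2000
print("sd of raw means:        small %.3f, large %.3f" % (ybar[small].std(), ybar[large].std()))
print("sd of pooled estimates: small %.3f, large %.3f" % (est[small].std(), est[large].std()))
```

In the raw means the small states look more variable, because of noise; after pooling the pattern flips and the small states bunch near the average. A second-stage regression that treats the pooled estimates as data could mistake that artifact for a real relation with state size.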

Another way a problem could arise, perhaps, is if you have a state-level predictor that is not statistically significant but still induces a correlation. With the partial pooling, you'll see a stronger relation with the predictor in the Mr. P estimates than in the raw data, and if you pipe this through to a regression analysis, I could imagine you could see statistical significance when it's not really there.

I think there's an article to be written on this.
