June 2005 Archives

Someone sent me a question about whether it makes sense to use multilevel modeling in a study of polls from many countries. I'll give the question and my response. The topic has been on my mind because I just wrote a discussion on this issue for the forthcoming issue of Political Analysis.

The question:

Phil Price pointed me to this:


The estimation procedure is OK (except for the calculation error, noted on the webpage) but I'd like to see an uncertainty interval.

Interactive graphics


Anthony Unwin writes,

The sample R code in Appendix C of GCSR (2nd edition) is pretty helpful, but I'm not happy with the graphics (surprise, surprise!). Your code for producing a collection of histograms means that they are all individually scaled. For comparative purposes they should, of course, be common scaled.

I'm looking forward to your reaction to my suggestion that you should incorporate interactive graphics in your course. One nice example of interaction that just occurs to me is to select a group of graphics with the mouse and then ask the system, perhaps via a pop-up dialog as in MANET, to common scale them.
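Unwin's common-scaling point is easy to demonstrate. Here is a minimal sketch of the difference between per-panel bins and shared bin edges (my own illustration, in Python rather than the book's R):

```python
import numpy as np

rng = np.random.default_rng(0)
groups = [rng.normal(mu, 1, 200) for mu in (0, 2, 5)]

# Individually scaled: each histogram gets its own bin range,
# so bar positions are not comparable across panels.
own_bins = [np.histogram(g, bins=30) for g in groups]

# Common scaled: one set of bin edges spanning all the data,
# so the panels line up and can be compared directly.
edges = np.linspace(min(g.min() for g in groups),
                    max(g.max() for g in groups), 31)
common = [np.histogram(g, bins=edges)[0] for g in groups]

# A plotting layer would also share the y-axis, e.g., the
# maximum count across all panels.
ymax = max(c.max() for c in common)
```

A plotting front end would then draw each panel with the same `edges` and the same y-limit, which is exactly the "common scale" Unwin is asking for.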

I replied,

On June 20, we had a miniconference on causal inference at the Columbia University Statistics Department. The conference consisted of six talks and lots of discussion. One topic of discussion was the use of propensity scores in causal inference, specifically, discarding data based on propensity scores. Discarding data (e.g., discarding all control units whose propensity scores are outside the range of the propensity scores in the treated group) can reduce or eliminate extrapolation, a potential cause of bias if the treated and control groups have different distributions of background covariates. However, it's sort of unappealing to throw out data, and can sometimes lead to treatment effect estimates for an ill-defined subset of the population. There was discussion on the extent to which modeling can be done using all available data without extrapolation. Other topics of discussion included bounds, intermediate outcomes, and treatment interactions. For more information, click here.

Tyler Cowen notes, from Harper's magazine, the following survey result: "Average percentage of the U.K. population that Britons believe to be immigrants: 21. Actual percentage: 8."

A survey from 1995 in the U.S.

This reminded me of something I saw in the Washington Post about 10 years ago, that said that Americans, on average, overestimate the percentage of minorities in the country. I went to Nexis and looked it up (searched on "survey, black, hispanic", for the years 1991-1995 in the Post) and found it.

From the Post article, "Most whites, blacks, Hispanics and Asian Americans said the black population, which is about 12 percent, was twice that size." They similarly way overestimated the percentage of Hispanics and Asians in the country.

There were also systematic misperceptions about economic status. Once again, from the Post article, "A majority of white Americans have fundamental misconceptions about the economic circumstances of black Americans, according to a new national survey, with most saying that the average black is faring as well or better than the average white in such specific areas as jobs, education and health care. That's not true. Government statistics show that whites, on average, earn 60 percent more than blacks, are far more likely to have medical insurance and more than twice as likely to graduate from college."

Understanding the misperceptions

There's really a lot going on here and I'm not sure how to think about it all. These misperceptions seem important from a political perspective. How to understand where they come from? I wonder if basic cognitive biases can explain the misperceptions about the percentages of minorities. In particular, it is natural to bias your estimate of unknown probabilities toward 50/50 (Erev, Wallsten, and Budescu have written about this). Given that blacks, Hispanics, and Asians represent "natural kinds" or partitions of the population, it maybe should be no surprise that people overestimate their proportions. This would also explain the U.K. result.

This sort of reasoning is also consistent with the famous survey in which people grossly overestimate the proportion of the U.S. budget that goes to foreign aid. Small proportions will be overestimated.
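As a back-of-the-envelope check (my own toy calculation, not an analysis from either survey), a simple shrinkage-toward-50/50 model is consistent with both numbers:

```python
# Toy model: perceived proportion = true + w * (0.5 - true),
# i.e., estimates shrink part of the way toward 50/50.
def perceived(true_p, w):
    return true_p + w * (0.5 - true_p)

# Calibrate w on the U.K. immigration numbers: true 8%, perceived 21%.
w = (0.21 - 0.08) / (0.5 - 0.08)

# The same w applied to the U.S. black population (about 12%)
# predicts roughly the doubling reported in the Post survey.
us_pred = perceived(0.12, w)
print(round(w, 2), round(us_pred, 2))  # 0.31 0.24
```

The same shrinkage weight fit to the U.K. result predicts an estimate of about 24% for a true proportion of 12%, close to the "twice that size" response in the Post survey.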

The survey questions on economic views seem more complicated in that they would naturally be tied in with political ideology. It may be that a lot of whites don't want to answer Yes to the question, "Are blacks worse off than whites?" because they associate the question with specific policies, such as welfare benefits, that they don't like. Some of the quotes in the Post article (see below) seem relevant to this point. As Joe Bafumi showed in his Ph.D. thesis on the "stubborn American voter," it can be difficult to get accurate responses, even on a factual question, if people associate it with a political position.

As the saying goes, further research is needed here.

Jon Baron pointed me to this page which has the following funny story from Deb Frisch. (The story is also here.)

I've always wanted to do role-playing demonstrations--activities in which different students play different roles in the context of a statistical problem--in my statistics classes, but I've rarely gotten them to work.

The only time it was effective was when I was teaching statistical consulting. I got two students to play the role of "consultants" and two students to be the "clients" (with a prepared folder of material from an actual consulting project in an earlier semester), and then when it was done, the other students commented on the performance of the "consultants." Anyway, I'd like to have role-playing demos for intro statistics classes.

I came across this set of games for teaching history, developed by Mark Carnes of Barnard College.

Deb Nolan had the following reaction:

I finally got a computer that could play the video (I have a new Mac). So I watched the role-playing video at Barnard. A long time ago, two of my students did a role-playing presentation of their data analysis project. They argued about whether HIV causes AIDS; one played Peter Duesberg. It was quite entertaining. I think it's a great idea. We could work a few of them into our demos and projects.

But I'm still not quite sure how to implement it. Perhaps Tian has some ideas?

Diet soda and weight gain


I wonder what Seth Roberts thinks about this:

Study links diet soda to weight gain


San Antonio Express-News

A review of 26 years of patient data found that people who drink diet soft drinks were more likely to become overweight.

Not only that, but the more diet sodas they drank, the higher their risk of later becoming overweight or obese -- 65 percent more likely for each diet drink per day.

Bayesian inference proceeds by taking the likelihoods from different data sources and then combining them with a prior distribution (or, more generally, a hierarchical model). The likelihood is key. For example, in a meta-analysis (such as the three examples in Chapter 5 of our book), you need the likelihood for each separate experiment. No funny stuff, no posterior distributions, just the likelihood. In a linear model setting, it's convenient to have unbiased estimates. I don't want everybody coming to me with their posterior distributions--I'd just have to divide away their prior distributions before getting to my own analysis.

Sort of like a trial, where the judge wants to hear what everybody saw--not their individual inferences, but their raw data. Anyway, it's kind of funny since we're always saying how Bayesian inference is the best, but really we don't want other people preprocessing their data in this way. When combining subjective estimates, the challenge is that there are no pure, unbiased data points.
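In the simplest normal case, combining the likelihoods from separate experiments reduces to a precision-weighted average (a standard fixed-effects meta-analysis sketch; the numbers here are my own made-up example):

```python
# Each study reports an unbiased estimate and its standard error.
estimates = [1.2, 0.8, 1.5]
ses = [0.4, 0.3, 0.6]

# Multiplying the normal likelihoods gives a precision-weighted average:
# each study is weighted by 1/se^2.
weights = [1 / s**2 for s in ses]
combined = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
combined_se = (1 / sum(weights)) ** 0.5

print(round(combined, 3), round(combined_se, 3))  # 1.021 0.223
```

This is exactly the "just give me your likelihood" point: the combination needs each study's estimate and standard error, not its posterior.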

See part 1 of this talk for more details.

I wrote a while ago on the Flynn effect (the increase in population IQ from 1940 to 1990 in many countries) and Flynn's comments on the impossibility of meritocracy.

Several years ago, Seth Roberts, who told me about all this, had the idea of measuring changes in intelligence over time by looking at the complexity of newspapers and magazines. From a casual reading of Time magazine, etc., from 1950 or so, as compared to today, Seth had the impression that the articles had become more sophisticated.

About eight years ago, I set a couple of students to the task of scanning in some old magazine articles and looking at changes from 1950 to the present. They then compared the articles using some simple readability formulas (letters per word, words per sentence, and a couple of other things--basically, whatever was already coded into Word). Nothing much came of it, and we forgot about the project.

Then recently I learned that Steven Johnson has written a book in which he found that TV shows have gotten more complex over the past few years, and directly connected it to the Flynn effect. I'm curious what Seth thinks about this--it seems to confirm his hypothesis.

In a series of blog entries, Carrie McLaren argues with Johnson (the commenters on the blog have lots of interesting things to say too). I don't have anything interesting to add here. I haven't read Johnson's book but it appears that he analyzed content rather than simply using things like readability formulas, which perhaps is why he found interesting results whereas we got stuck.

Physicists . . .


A colleague writes,

hi andrew,

here's a small question from a physicist friend:

can you point to a good reference on why regressing y/x against x is a bad thing to do...?

My short answer: it's not necessarily a bad thing to do at all. It depends on the context. In short: what are x and y?

My context-free comment is that if you're considering y/x, perhaps they are both positive, in which case it might make sense to work with log(x) and log(y), in which case the regression of log(y/x) on log(x) is the same as the regression of log(y) on log(x), with 1 subtracted from the slope (since log(y/x)=log(y)-log(x)).
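The algebra is easy to check by simulation (an illustrative sketch with made-up data):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.exp(rng.normal(0, 1, 500))                # positive predictor
y = x ** 1.5 * np.exp(rng.normal(0, 0.1, 500))   # positive response

# Slope of log(y) on log(x)
b1 = np.polyfit(np.log(x), np.log(y), 1)[0]
# Slope of log(y/x) on log(x)
b2 = np.polyfit(np.log(x), np.log(y / x), 1)[0]

# Since log(y/x) = log(y) - log(x), the slopes differ by exactly 1
# (up to floating-point error).
print(b1 - b2)
```

Because least squares is linear in the response, subtracting log(x) from the response shifts the slope by exactly 1; nothing else changes.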

P.S. I think it's ok for me to make fun of physicists since I majored in physics in college and switched to statistics because physics was too hard for me.

Iain Pardoe writes,

I was wondering if you might have any thoughts on the following ...

Suppose I have data collected over a period of years, with a response y and some predictors x. I want to predict y for the following year based on data collected up to that year. One approach is to refit the model each year using ALL the data collected up to that year. But what if you expect the relationship between y and x to change over time? Then you want to down-weight data from further in the past when fitting the model each year. You could ignore any data that is, say, more than 10 years old, but this seems a little ad hoc. What might be a reasonable approach that isn't so ad hoc?

Any thoughts?
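One standard, less ad hoc option is exponential down-weighting: give an observation from k years back weight lambda^k and fit weighted least squares, choosing lambda by out-of-sample predictive performance. A hypothetical sketch (the function name and toy data are mine, not from the question):

```python
import numpy as np

def exp_weighted_fit(X, y, years, current_year, lam=0.9):
    """Weighted least squares with weight lam**(years back)."""
    w = lam ** (current_year - years)
    W = np.sqrt(w)
    # Rescale rows by sqrt(weight), then solve ordinary least squares.
    beta, *_ = np.linalg.lstsq(X * W[:, None], y * W, rcond=None)
    return beta

# Toy data: intercept + one predictor, with the slope drifting over time.
rng = np.random.default_rng(2)
n = 300
years = rng.integers(1990, 2005, n)
x = rng.normal(0, 1, n)
slope = 1 + 0.1 * (years - 1990)       # relationship changes over time
y = slope * x + rng.normal(0, 0.5, n)
X = np.column_stack([np.ones(n), x])

beta_all = exp_weighted_fit(X, y, years, 2005, lam=1.0)  # no down-weighting
beta_dw = exp_weighted_fit(X, y, years, 2005, lam=0.7)   # favor recent years
print(beta_all[1], beta_dw[1])
```

With lam=1 this reduces to ordinary least squares on all the data; with lam<1 the fitted slope tracks the recent relationship rather than the historical average. The hard cutoff at 10 years is the special case of weight 1 up to the cutoff and 0 after.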

I am sometimes contacted by people who want to conduct a survey, or who are planning to teach survey sampling, and want to know what to read. I recommend two books.

For the statistical theory and methods of sampling: Sampling: Design and Analysis, by Sharon Lohr (Arizona State University). This is a great book, combining the practical orientation of Kish (1965) with the clear notation of Cochran (1977). No other book I know of comes close to Lohr's. My only (minor) criticism of Lohr's book is that, when it comes to some areas on the research frontier (for example, poststratification with many categories), it is not always clear that there are open questions. I wouldn't mind seeing a few loose ends. I expect more of this will be in the forthcoming second edition.

For practical issues of conducting a survey: Survey Methodology, by Bob Groves, Floyd Fowler, Mick Couper, James Lepkowski, Eleanor Singer, and R. Tourangeau (Survey Research Center, University of Michigan). Lots of cool stuff, all in one place. These guys really know what they're doing.

A third book that's interesting is Analysis of Health Surveys, by Korn and Graubard. It has excellent material on analyzing survey data collected by others, a topic that does not get much emphasis in other books.

Baby Names


This is a really fun website.

You type in a name and it plots the popularity of the name since 1880. I of course first typed in my own name, and learned that it wasn't very common (110th most popular) when I was born, but was very common (4th most popular) in the 1990's. Which means that most of the Samanthas out there are much younger than I am. Does that mean people might expect me to be younger than I am because of my name? There are a lot of names that I associate with older people, but I can't think of too many that I associate with young people. Maybe that's just because I don't know many kids, though.

Objective and Subjective Bayes


Turns out I'm less of an objective Bayesian than I thought I was. I'm objective, and I'm Bayesian, but not really an Objective Bayesian. Last week I was at the OBayes 5 (O for objective) meeting in Branson, MO. It turns out that most of the Objective Bayes research is much more theoretical than I am. I like working with data, and I just can't deal with prior distributions that are three pages long, even if they do have certain properties of objectiveness.

Russ Lenth (Department of Statistics, University of Iowa) wrote a great article on sample size and power calculations in The American Statistician in 2001. I was looking for it as a reference for Barry's comment on this entry.

Anyway, I saw that Lenth has a webpage with a power/sample-size calculator and also some of the advice from his article, in easily digestible form. Perhaps this will be helpful to some of you all. I'm not happy with most of what's been written on sample size and power calculations in the statistical and biostatistical literature.

Also, here are some of my ramblings on power calculations.
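For reference, the basic calculation that tools like Lenth's automate can be sketched in a few lines (a simplified two-sample z-test version, not Lenth's actual code):

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample_z(delta, sd, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample z-test when the true
    difference in means is delta and both groups have size n_per_group."""
    se = sd * sqrt(2 / n_per_group)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z = delta / se
    # P(reject) = P(|observed z| > z_crit) under the alternative
    return (1 - NormalDist().cdf(z_crit - z)) + NormalDist().cdf(-z_crit - z)

# Example: detect a half-standard-deviation difference with 100 per group
p = power_two_sample_z(delta=5, sd=10, n_per_group=100)
print(round(p, 2))  # about 0.94
```

The z-test approximation slightly overstates the power of the corresponding t-test for small samples, but it makes the structure of the calculation clear.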

I was spell-checking an article in WinEdt. It didn't like Ansolabehere and suggested "manslaughter" instead.

The difference between "statistically significant" and "not statistically significant" is not in itself necessarily statistically significant.

By this, I mean more than the obvious point about arbitrary divisions, that there is essentially no difference between something significant at the 0.049 level or the 0.051 level. I have a bigger point to make.

It is common in applied research--in the last couple of weeks, I have seen this mistake made in a talk by a leading political scientist and in a paper by a psychologist--to compare two effects from two different analyses, one of which is statistically significant and one of which is not, and then to try to interpret/explain the difference, without any recognition that the difference itself was not statistically significant.

Let me explain. Consider two experiments, one giving an estimated effect of 25 (with a standard error of 10) and the other with an estimate of 10 (with a standard error of 10). The first is highly statistically significant (with a p-value of 1.2%) and the second is clearly not statistically significant (with an estimate that is no bigger than its s.e.).

What about the difference? The difference is 15 (with a s.e. of sqrt(10^2+10^2)=14.1), which is clearly not statistically significant! (The z-score is only 1.1.)
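In code, with the numbers above:

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(est, se):
    """Two-sided p-value from a normal approximation."""
    z = est / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

p1 = two_sided_p(25, 10)                # first experiment: z = 2.5, p = 0.012
p2 = two_sided_p(10, 10)                # second experiment: z = 1.0, p = 0.32
se_diff = sqrt(10**2 + 10**2)           # s.e. of the difference = 14.1
p_diff = two_sided_p(25 - 10, se_diff)  # z = 1.06, p = 0.29: not significant

print(round(p1, 3), round(p2, 2), round(p_diff, 2))
```

One experiment is "significant" and one is not, yet the p-value for the comparison between them is about 0.29.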

This is a surprisingly common mistake. The two effects seem sooooo different, that it is hard for people to even think that their difference might be explained purely by chance.

For a horrible example of this mistake, see the paper, Blackman, C. F., Benane, S. G., Elliott, D. J., House, D. E., and Pollock, M. M. (1988). Influence of electromagnetic fields on the efflux of calcium ions from brain tissue in vitro: a three-model analysis consistent with the frequency response up to 510 Hz. Bioelectromagnetics 9, 215-227. (I encountered this example at a conference in radiation and health in 1989. I sent a letter to Blackman asking him for a copy of his data so we could improve the analysis, but he refused, saying the raw data were on logbooks and it would be too much effort to copy them. We'll be discussing the example further in our forthcoming book on applied regression and multilevel modeling.)

I spoke last week at a workshop at Smith College on teaching statistics to undergraduate political science students. The organizers of the conference were Paul Gronke (Reed College) and Howard Gold (Smith College).

Here's my talk. The talk was accompanied by several demonstrations and handouts, and this slideshow by itself has parts that may be hard to follow without that supplementary material.

It was lots of fun. The 20 or so people at the workshop enjoyed the demonstrations and there was lively discussion about teaching research methods to undergraduates in general and political science students in particular.

Normal curves


From Tian comes this picture from a Chinese news agency:


What's the deal? The picture looks a little fishy to me since the rightmost normal curve appears in front of the person whose body is in the foreground of the photo. But if things really looked like that, I would've loved to have been there to see it!



We submitted a paper to a leading statistics journal, and one of the review reports included the following sentence:

Although the statistical methodology used is not particularly complex, sometimes a straightforward solution to a problem can be even more elegant than something that is technically more impressive.

At first I was a little miffed that they referred to our methods as "not particularly complex" but then I realized that this is really a victorious moment for applied Bayesian data analysis. Our paper used multilevel modeling, an adaptive Gibbs/Metropolis algorithm, posterior predictive checking, as well as tons of graphs and postprocessing of inferences.

Not too many years ago, we would have had to deal with generalized skepticism about Bayes, prior distributions, exchangeability, blah blah blah. Maybe even some objection to using a probability model at all. (One of my colleagues where I used to work once told me, "We don't believe in models.") And the reviewers who liked the paper would have gone on about how innovative it was. It's good to be able to skip over all that and go straight to the modeling (the "science," as Rubin would put it).

David Budescu writes,

We ran an experiment where subject made predictions about future value of many stocks based on their past performance. More precisely, they were asked to estimate 7 quantiles of the distribution of each stock:

Q05, Q15, Q25, Q50, Q75, Q85, and Q95

I would like to estimate the mean and SD (or variance) of this distribution based on these quantiles subject to weak assumptions (symmetry and unimodality) but without assuming a particular distribution.

I know of some methods (e.g. Pearson & Tukey, Biometrika, 1965) that use only 3 of these quantiles (Q05, Q50, and Q95) but I hate not to use all the data I have collected.

Does anyone know of a more general and flexible solution?

Any thoughts? Of course, some distribution would have to be assumed. Also, I wonder about assuming symmetry since the data would be there to reject the hypothesis of symmetry in some settings. Also, of course, I wonder whether the mean and sd are really what you want. Well, I can see the mean, since it's $, but I'm not so sure that the sd is what's wanted.
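One simple option--a sketch of my own, not a method from the literature cited in the question--is to pick a symmetric unimodal family such as the normal and least-squares fit its mean and sd to all seven elicited quantiles. This reduces to regressing the quantiles on the standard normal quantiles:

```python
from statistics import NormalDist

probs = [0.05, 0.15, 0.25, 0.50, 0.75, 0.85, 0.95]
z = [NormalDist().inv_cdf(p) for p in probs]

def fit_normal(quantiles):
    """Least-squares fit of q_i = mu + sigma * z_i over all seven
    quantiles: a simple linear regression of q on z."""
    n = len(z)
    zbar = sum(z) / n           # = 0, since the probs are symmetric
    qbar = sum(quantiles) / n
    sigma = sum((zi - zbar) * qi for zi, qi in zip(z, quantiles)) / \
            sum((zi - zbar) ** 2 for zi in z)
    mu = qbar - sigma * zbar
    return mu, sigma

# Check: the quantiles of an exact N(100, 15) should be recovered.
true = [100 + 15 * zi for zi in z]
mu, sigma = fit_normal(true)
print(round(mu, 1), round(sigma, 1))  # 100.0 15.0
```

Unlike the three-quantile Pearson-Tukey method, this uses all seven responses, and the residuals from the fit give a rough check on the assumed shape.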

Jasjeet writes,

Hi Andrew,

I saw your recent exchange on falsification on your blog. I mostly agree with you, but I think the view of falsification presented is a little too simple---this is an issue with Popper's stance. I say this even though I'm very sympathetic to Popper's position. I suggest that you check out Quine's "Two Dogmas of Empiricism". Originally published in The Philosophical Review 60 (1951): 20-43. Reprinted in W.V.O. Quine, From a Logical Point of View (Harvard University Press, 1953). This is generally considered one of the canonical articles on the issue.

You may be interested to know that Kuhn tried to distance himself from the dominant reading of his work. Silvan Schweber, a historian of science who knew Kuhn, tells wonderfully funny stories about this at dinner parties. BTW, if you are interested in this stuff, you should check out Schweber's _QED and the Men Who Made It_. It is a great *history* of science book which also engages many philosophical issues. Philosophers of science generally bore me now. I say this as someone who spent many years reading this stuff. Philosophers of science became boring once there arose a sharp division between them and actual scientists. This was not true of earlier philosophers such as the logical positivists and people like Russell. But the second half of the 20th century was hard on philosophy...on this issue you should check out the work of your Columbia colleague Jacques Barzun ("From Dawn to Decadence" etc).

But if you do read some of these people, I would really like to get your thoughts on what Richard Miller says about Bayesians in his "Fact and Method". Are they the modern logical positivists? Alas, I sometimes think so. One would think that the failure of Russell's Principia Mathematica, Godel, and all of that would have killed logical positivism, but it hasn't.....


"A coin with probability p > 0 of turning up heads is tossed . . . " -- Woodroofe, Probability with Applications (1975, p. 108)

"Suppose a coin having probability 0.7 of coming up heads is tossed . . . " -- Ross, Introduction to Probability Models (2000, p. 82)

The biased coin is the unicorn of probability theory--everybody has heard of it, but it has never been spotted in the flesh. As with the unicorn, you probably have some idea of what the biased coin looks like--perhaps it is slightly lumpy, with a highly nonuniform distribution of weight. In fact, the biased coin does not exist, at least as far as flipping goes.

Bill Browne, a statistician who does tons of work on multilevel models, especially for educational applications, has a 3-year postdoctoral position available in computational applied statistics. It looks interesting!

Chad on ethics


Chad Heilig is a statistics Ph.D. graduate of Berkeley who has moved from theoretical statistics to work at the CDC. He recently wrote a paper on ethics in statistics that will appear in Clinical Trials. The paper is interesting to read--it presents a historical overview of some ideas about ethics and statistics in medical studies.

Two key ethical dilemmas in clinical trials are:

(1) The conflict between the goal of saving future lives (by learning as much as possible, right away, about effectiveness of treatments), and the goal of treating current patients as effectively as possible (which, in some settings, means using the best available treatment, and in others means using something new--but will not, in general, correspond to random assignment).

(2) The conflict between the goals in (1)--to help current and future patients--and the goals of the researcher, which can include pure scientific knowledge as well as $, glory, etc.

As Chad points out, it's a challenge to quantify either of these tradeoffs. For example, how many lives will be saved by performing a large randomized trial on some drug, as compared to using it when deemed appropriate and then learning its effectiveness from observational studies? (It's well known that observational studies can give wrong answers in such settings.)

I completely disagree with the following statement on page 5 of the paper, which Chad attributes to Palmer (1993): "Where individual ethics is favored, one ought to employ Bayesian statistical methods; where collective ethics is favored, frequentist methods apply." This doesn't make sense to me. (For one thing, "frequentist methods" is an extremely general class which includes Bayesian methods as a special case.)

For a copy of the paper, email Chad at cqh9@cdc.gov


About this Archive

This page is an archive of entries from June 2005 listed from newest to oldest.
