Too clever by, hmmm, about 5% a year

Coblogger John Sides quotes a probability calculation by Eric Lawrence that, while reasonable on a mathematical level, illustrates a road-to-error-is-paved-with-good-intentions attitude that bothers me, and that I see a lot in statistics and quantitative social science.

I’ll repeat Lawrence’s note and then explain what bothers me.

Here’s Lawrence:

In today’s Wall Street Journal, Nate Silver of 538.com makes the case that most people are “horrible assessors of risk.” . . . This trickiness can even trip up skilled applied statisticians like Nate Silver. This passage from his piece caught my [Lawrence’s] eye:

“The renowned Harvard scholar Graham Allison has posited that there is greater than a 50% likelihood of a nuclear terrorist attack in the next decade, which he says could kill upward of 500,000 people. If we accept Mr. Allison’s estimates–a 5% chance per year of a 500,000-fatality event in a Western country (25,000 casualties per year)–the risk from such incidents is some 150 times greater than that from conventional terrorist attacks.”

Lawrence continues:

Here Silver makes the same mistake that helped to lay the groundwork for modern probability theory. The idea that a 5% chance a year implies a 50% chance over 10 years suggests that in 20 years, we are certain that there will be a nuclear attack. But . . . the problem is analogous to the problem that confounded Chevalier de Méré, who consulted his friends Pascal and Fermat, who then derived several laws of probability. . . . A way to see that this logic is wrong is to consider a simple die roll. The probability of rolling a 6 is 1/6. Given that probability, however, it does not follow that the probability of rolling a 6 in 6 rolls is 1. To follow the laws of probability, you need to factor in the probability of rolling 2 6s, 3 6s, etc.

So how can we solve Silver’s problem? The simplest way turns the problem around and solves for the probability of not having a nuclear attack. Then, preserving the structure of yearly probabilities and the decade range, the problem becomes P(no nuclear attack in ten years) = .5 = some probability p raised to the 10th power. After we muck about with logarithms and such, we find that our p, which denotes the probability of an attack not occurring each year, is .933, which in turn implies that the annual probability of an attack is .067.

But does that make a difference? The difference in probability is less than .02. On the other hand, our revised annual risk is a third larger. . . .
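
For readers who want to check Lawrence's arithmetic, the complement trick takes a few lines of Python (a minimal sketch, with his independence-across-years assumption baked in):

    # If P(no attack in ten years) = 0.5 and each year independently has
    # the same "no attack" probability p, then p**10 = 0.5.
    p_no_attack = 0.5 ** (1 / 10)    # ~0.933
    p_attack = 1 - p_no_attack       # ~0.067, vs. the naive 0.05
    print(round(p_no_attack, 3), round(p_attack, 3))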

OK, so Lawrence definitely means well; he’s gone to the trouble to write this explanatory note and even put in some discussion of the history of probability theory. And this isn’t a bad teaching example. But I don’t like it here. The trouble is that there’s no reason at all to think of the possibility of a nuclear terrorist attack as independent in each year. One could, of course, go the next step and try a correlated probability model–and, if the correlations are positive, this would actually increase the probability in any given year–but that misses the point too. Silver is making an expected-value calculation, and for that purpose, it’s exactly right to divide by ten to get a per-year estimate. Beyond this, Allison’s 50% has got to be an extremely rough speculation (to say the least), and I think it confuses rather than clarifies matters to pull out the math. Nate’s approximate calculation does the job without unnecessary distractions. Although I guess Lawrence’s comment illustrates that Nate might have done well to include a parenthetical aside to explain himself to sophisticated readers.
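
To make the contrast concrete, here is the expected-value version of the calculation (a sketch using Allison's round numbers). No independence assumption is needed, because expected values add across years regardless of dependence:

    # 50% chance of a 500,000-fatality event some time in the decade:
    expected_deaths_per_decade = 0.5 * 500_000                    # 250,000
    expected_deaths_per_year = expected_deaths_per_decade / 10    # 25,000
    print(expected_deaths_per_year)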

This sort of thing has happened to me on occasion. For example, close to 20 years ago I gave a talk on some models of voting and partisan swing. To model vote shares, which fall between 0 and 1, we first did a logistic transformation. After the talk, someone in the audience–a world-famous statistician who I respect a lot (but who doesn’t work in social science)–asked about the transformation. I replied that, yeah, I didn’t really need to do it: nearly all the vote shares were between 0.2 and 0.8, and the logit was close to linear in that range; we just did logit to be on the safe side. [And, actually, in later versions of this research, we ditched the logit as being a distraction that hindered the development of further sophistication in the aspects of the model that really did matter.] Anyway, my colleague responded to my response by saying, No, he wasn’t saying I should use untransformed data. Rather, he was asking why I hadn’t used a generalized linear model; after all, isn’t that the right thing to do with discrete data? I tried to explain that, while election data are literally discrete (there are no fractional votes), in practice we can think of congressional election data as continuous. Beyond this, a logit model would have an irrelevant-because-so-tiny sqrt(p(1-p)/n) error term, which would require me to add an error term to the model anyway, which would basically take me back to the model I was already starting with. This point completely passed him by, and I think he was left with the impression that I was being sloppy. Which I wasn’t, at all. In retrospect, I suppose a slide on this point would’ve helped; I’d just assumed that everyone in the audience would automatically understand the irrelevance of discrete-data models to elections with hundreds of thousands of votes. I was wrong and hadn’t realized the accumulation of insights that any of us gather when working within an area of application, insights which aren’t so immediately available to outsiders–especially when they’re coming into the room thinking of me (or Nate Silver, as above) as an “applied statistician” who might not understand the mathematical subtleties of probability theory.
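
Two quick numerical checks of the claims in that story (a sketch; the 60%-of-200,000-votes district is a made-up illustration, not from our data):

    import math

    def logit(p):
        return math.log(p / (1 - p))

    # 1. The logit is close to linear for vote shares between 0.2 and 0.8:
    #    each 0.1 step in p moves the logit by a roughly constant amount
    #    (between 0.41 and 0.54), with no explosive curvature.
    for p in [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]:
        print(p, round(logit(p), 3))

    # 2. The binomial sqrt(p(1-p)/n) term is negligible at this scale:
    p, n = 0.6, 200_000
    print(math.sqrt(p * (1 - p) / n))   # ~0.0011: about a tenth of a percentage
                                        # point, tiny next to real swings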

P.S. Conflict-of-interest note: I post on Sides’s blog and on Silver’s blog, so I’m conflicted in all directions! On the other hand, neither of them pays me (nor does David Frum, for that matter; as a blogger, I’m doing my part to drive down the pay rates for content providers everywhere), so I don’t think there’s a conflict of interest as narrowly defined.

13 thoughts on “Too clever by, hmmm, about 5% a year”

  1. Sheesh, Lawrence is an idiot. And this whole discussion is overcomplicated: Nate could have just pointed out that a 50% chance of 500,000 deaths in a decade means an expectation of 250K nuclear-terrorism deaths per decade, which is waaaay bigger than the conventional terrorism death rate. No need to do a per-day or per-year calculation at all. (Oh, I see a commenter at the WSJ site said the same thing. He's right).

    More striking to me is the 50% estimate. Come on, there can't really be a 50% chance of a terrorist nuclear strike in the next 10 years, and if there is one, I doubt it'll kill 500K people. Admittedly I'm not a "renowned Harvard scholar," but I know enough to know that this probability is way too high. If Allison is willing to offer that bet, I'll put $1000 on the "no" side of the question "Will a terrorist nuclear attack in the next 10 years kill more than 500,000 people?" And, just to show I'm not a quibbler, I'll take the bet at 300,000 people, and I'll make it any number of nuclear terrorist attacks combined. So if 2 bombs combine to kill 350,000 people, I lose. Graham Allison, if you're reading this and willing to wager, get in touch and let's do it. –Phil Price

  2. Phil:

    1. I think you're going overboard in calling Lawrence an idiot. He made three common probability-theory mistakes: (1) confusing the probability of a single event with an expected value, (2) assuming independence of non-independent events, (3) jumping too quickly to criticize someone else who was actually doing things correctly. But, as I said, these are common errors; a person doesn't have to be an idiot to make them.

    2. I think the odds are less than 50% that Graham Allison is reading this. I doubt he even reads 538.com, and I didn't even post this there.

  3. I agree with Phil here. But the reason I'm commenting is that story about the world-famous statistician. There are thousands of votes and he thinks the data should be treated as discrete? That's stunning. It reminds me of a world-famous Berkeley statistics professor who complained that a certain experiment had only 4 subjects. (With lots of observations on each one.) Either he was clueless about how science actually works or there were big areas of science he knew nothing about. He was in an ivory tower within an ivory tower.

  4. Andrew, I agree with your general philosophy in the concluding paragraph. That type of reasoning has led to a resurgence in the use of the linear probability model in some circles, for example. And I agree that the expected-value logic makes sense when thinking about the expected number of lives lost.

    What I can't wrap my mind around is using the expected-value logic when manipulating probabilities. A 150% chance over a 30-year period doesn't make sense to me. But that may be due to a failure of imagination on my part.

    All that said, this is a small point in the context of the larger debate, and I agree that the 50% estimate has a wide range of uncertainty around it.

  5. Distractions are in the eye of the beholder, but we should only distract when it's important and not distract when it's just a distraction (at least to others, I guess).

    I was warned early in my career that there was a danger that, in choosing not to be distracted by unimportant details, others would interpret it as my not knowing better or avoiding things I did not know how to do properly.

    So Andrew – you used the lack of linear fit as the error term? Neat; that probably would have been worth a comment for most of us.

    Keith

  6. I'll agree with Andrew and disagree with Phil (and Seth?) on the claim that I am not an idiot. On Andrew's other points, though:

    (1) I read the 50% likelihood statement as a probability claim, not an expected value claim.
    (2) As I stated elsewhere, I assumed independence because Silver assumed independence.
    (3) If my read in (1) was in error, then I was too quick to critique.

  7. I'm with Eric here – I don't see how you can read "a 5% chance per year of a 500,000-fatality event in a Western country" as an expected value statement.

    Yes, it is correct for the expected-value purpose to divide the 500,000 by 20 and get to 25,000/year – but no, there is no way that can or should be expressed as a "5% chance". So I do think Nate is at best very sloppy here. And I think most readers will have read exactly what Eric read – that you can deduce from a 50% probability over 10 years that there is a 5% probability every year. And this is false under all but one possible set of assumptions (and it's very clear that Nate isn't referring to that).

    I think it was unwise of Eric to assume independence and do his little calculation without saying so – because contrary to what he says it's not at all clear if Nate assumes independence.

  8. Eric, I'm not sure what your point (1) means. If there's a 50% probability of a nuclear terrorist attack that kills 500,000 people, that means an "expected value" of 250,000 nuclear-terrorism deaths. (Perhaps the issue is the term "expected value", which is yet another example of statisticians choosing to use terminology in a way that differs from its use in normal language). I'm not sure how a "probability claim" differs from an "expected value" claim.
    With regard to your point (2) neither you nor Silver has to assume independence. The point of Silver's analysis was to compare the magnitude of the nuclear-terrorism threat to the magnitude of the conventional terrorism threat. For that, you can compare the "expected" 250,000 nuclear-terrorism deaths in the next decade to the number of "expected" conventional terrorism deaths in the next decade…or, you can divide both of these numbers by ten years (or 120 months, or whatever) if you want to compare expected deaths per year, per month, per day. Independence, or lack thereof, doesn't come into it, if your goal is to just come up with an expected yearly death rate. It's true that Silver explained this wrong too, but at least he got the right answer!

    As far as what we might call your point (0), that you're not an idiot…eh, maybe I was too quick to judge. If Andrew says you're not, I've gotta give that some weight. Perhaps I should have said "Lawrence's analysis is consistent with him being an idiot." ;)

  9. First, I'll defend Silver. In a short WSJ piece, he doesn't have room to state assumptions, add caveats, footnotes, etc. I read the 5% per year to imply independence, as the unconditional probability in any given year is the same as obtained by conditioning on the past. That's why I assumed independence. If I were doing serious work on the subject, I wouldn't do so.

    And to Phil, on my point (1), all I'm saying is that with a fair die, the expected number of 6s in two rolls of the die is 1/3, whereas the probability of rolling at least one six in two throws is 11/36. Those are two different claims (see the quick check below). I appreciate your willingness to update your belief on point (0), though my willingness to argue statistics with Andrew may be prima facie evidence of idiocy. And if I were doing serious work on this question, I'd probably start by reading this:

    "Estimating the Probability of Events That Have Never Occurred," Journal of the American Statistical Association, V. 93, pp. 1-9, 1998

  10. I think that whatever the merits of the 50% probability and its transforms, the 500,000 deaths is a very dodgy number. Why would we assume that a terrorist-created device would kill three times as many people as the Hiroshima bomb?

  11. The question of the probability of a nuclear attack was discussed by Warren Buffett in his biography, The Snowball, on p. 641. Buffett said "If there's a 10% probability that something will happen in a year, there is a 99.5% probability that it will happen in 50 years. But if you can get that probability down to 3%, that reduces the probability to only 78% in 50 years. And if you can get it down to 1%, there is only a 40% probability in 50 years."

    Can anyone tell me how he arrived at these calculations? Thanks.
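
    Buffett's figures follow from the same complement rule Lawrence used above, assuming the yearly events are independent (a quick check; the rounding matches his numbers):

        # P(at least one occurrence in 50 years) = 1 - (1 - annual)**50
        for annual in [0.10, 0.03, 0.01]:
            print(annual, round(1 - (1 - annual) ** 50, 3))
        # -> 0.995, 0.782, 0.395: Buffett's 99.5%, 78%, and ~40%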
