Overconfidence in historical predictions; also a discussion of graphical displays of scientific results

Bryan Caplan writes about a cool paper from 1999 by Philip Tetlock on overconfidence in historical predictions. Here’s Caplan’s summary:

Tetlock’s piece explores the overconfidence of foreign policy experts on both historical “what-ifs” (“Would the Bolshevik takeover have been averted if World War I had not happened?”) and actual predictions (“The Soviet Union will collapse by 1993.”) The highlights:

1. Liberals believe that relatively minor events could have made the Soviet Union a lot better; conservatives believe that relatively minor events could have made South Africa a lot better.

2. Tetlock asked experts how they would react if a research team announced the discovery of new evidence. He randomly varied the slant of the evidence. He found a “pervasiveness of double standards: experts switched on the high-intensity search light of skepticism only for dissonant results.”

3. Tetlock began collecting data on foreign policy experts’ predictions back in the 1980s. For example, in 1988 he asked Sovietologists whether the USSR would still be around in 1993. Overall, experts who said they were 80% or more certain were in fact right only 45% of the time. (A small sketch of this kind of calibration check follows the list.)

4. How did experts cope with their failed predictions? “[F]orecasters who had greater reason to be surprised by subsequent events managed to retain nearly as much confidence in the fundamental soundness of their judgments of political causality as forecasters who had less reason to be surprised.” The experts who made mistakes often announced that it didn’t matter because prediction is pretty much impossible anyway (but then why did they assign high probabilities in the first place?!). The mistaken experts also often said they were “almost right” (e.g., the coup against Gorbachev could have saved Communism), but correct experts very rarely conceded that they were “almost wrong” for similar reasons.
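Tetlock’s calibration finding (item 3) is easy to express as a computation: group predictions by stated confidence, then compare each group’s stated confidence with the proportion of predictions that actually came true. Here is a minimal Python sketch of that check; the forecast data below are invented placeholders for illustration, not Tetlock’s data.

```python
# Minimal calibration check: compare experts' stated confidence with
# the rate at which their predictions actually came true.
# The forecasts here are hypothetical, for illustration only.

from collections import defaultdict

# (stated probability that the prediction is correct, whether it came true)
forecasts = [
    (0.9, False), (0.8, True), (0.95, False), (0.8, False),
    (0.6, True), (0.5, True), (0.9, True), (0.85, False),
]

# Bin forecasts by stated confidence and tally outcomes.
bins = defaultdict(list)
for stated, correct in forecasts:
    key = round(stated, 1)  # bins at 0.5, 0.6, ..., 0.9
    bins[key].append(correct)

for key in sorted(bins):
    outcomes = bins[key]
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated ~{key:.1f}: right {hit_rate:.0%} of the time (n={len(outcomes)})")

# Overconfidence shows up as hit rates well below stated confidence,
# e.g. Tetlock's experts at 80%+ confidence being right only 45% of the time.
```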

Caplan goes on to discuss the possibility that forecasters might have been better calibrated if they had been betting money on their predictions. This is an interesting point, but I’d like to take the discussion in a different direction. Beyond the general interest in cognitive illusions I’ve had since reading the Kahneman, Slovic, and Tversky book way back when, Tetlock’s study interests me because it connects to Niall Ferguson’s work on potential outcomes in historical studies and Joe Bafumi’s work on the stubborn American voter.

Virtual history and stubborn voters

Ferguson edited a book on “virtual history” in which he considered historical speculations (retroactive “what-ifs”) in the potential-outcome framework used in statistical inference. These ideas also come up in other fields, such as law (as pointed out here by Don Rubin). I’m not quite sure how overconfidence fits in here, but it seems relevant.
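To spell out the connection (this is my own gloss in generic potential-outcomes notation, not something quoted from Ferguson’s book or from Rubin), a historical what-if is a question about an unobserved potential outcome:

```latex
% A historical "what-if" in potential-outcome notation (a gloss, not
% notation taken from Ferguson or Rubin). For a historical unit i
% (say, Russia in 1917), let Y_i(1) be the outcome under the event that
% occurred (World War I happens) and Y_i(0) the outcome had it not.
% History reveals only Y_i(1); the what-if asks about Y_i(0), or
% equivalently about the unit-level causal effect
\[
  \tau_i = Y_i(1) - Y_i(0),
\]
% which cannot be identified from the single observed history without
% further assumptions; that is one reason confident answers to
% historical what-ifs deserve skepticism.
```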

Joe Bafumi, in “The Stubborn American Voter” (here’s an old link; I don’t have a link to the updated version of the paper), found that in the past twenty years or so, Americans have become more partisan, not only in their opinions but also in their views on factual matters. This seems similar to what Tetlock found and also suggests that the time dimension is relevant. Joe also considers views of elites vs. average Americans.

Finally . . .

Tetlock’s paper was great, but I’d like it even better if the results were presented as graphs rather than tables of numbers. In my experience, graphical presentations make results clearer and, even more important, can generate new hypotheses and reject existing hypotheses I didn’t realize I had.

My impression is that statisticians and data analysts see graphics as an “exploratory” tool for looking at data, maybe useful when selecting a model, but then when they get their real results, they present the numbers. But in my conception of exploratory data analysis (see also here for Andreas Buja’s comment and here for my rejoinder), graphs are about comparisons. And, as is clear from Caplan’s summary, Tetlock’s paper is all about comparisons: stated probabilities compared to actual frequencies, liberals compared to conservatives, and so on. So I think something useful could be learned by re-expressing Tetlock’s Tables 1, 2, 3, and 4 as graphs, along the lines sketched below. (Perhaps a good term project for a student in my regression and multilevel modeling class this fall?)
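As one concrete possibility, here is a short matplotlib sketch of the kind of calibration graph that could replace a table of numbers: stated probability on the x-axis, observed frequency of correct predictions on the y-axis, with the diagonal marking perfect calibration. The plotted values are invented placeholders, not Tetlock’s actual table entries.

```python
# Sketch of a calibration plot that could replace a table of numbers.
# The values below are invented placeholders, not Tetlock's results.
import matplotlib.pyplot as plt

stated = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]          # experts' stated probabilities
observed = [0.40, 0.45, 0.48, 0.45, 0.50, 0.55]  # proportion correct (hypothetical)

fig, ax = plt.subplots(figsize=(4, 4))
# The 45-degree line: stated probability equals observed frequency.
ax.plot([0, 1], [0, 1], linestyle="--", color="gray", label="perfect calibration")
ax.plot(stated, observed, marker="o", color="black", label="experts (hypothetical)")
ax.set_xlabel("Stated probability")
ax.set_ylabel("Observed frequency correct")
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.legend(loc="upper left")
ax.set_title("Points below the line = overconfidence")
fig.tight_layout()
plt.show()
```

Points falling below the diagonal, like the 80%-confident-but-45%-right experts, show overconfidence at a glance; that is exactly the comparison a table of numbers forces the reader to reconstruct mentally.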