Statistical methods for aggregating pre-election polls

David Shor writes:

You did some initial analysis of Nate’s election forecasting work over on FiveThirtyEight.

Forgive me for plugging my own [Shor’s] work, but I did pre-election poll aggregation using a different methodology and performed about on par with Nate (slightly better on state-level presidential races, worse on Senate races, and quite a bit better on the popular vote).

A comparison of our results (and a spreadsheet with quite a bit of data) is available here.

In terms of possible improvements for next cycle:

1) Nate’s strong performance in the Senate races, where there were very few polls, suggests that it would be wise to incorporate regression estimates into forecasts (see North Dakota). How to do so in a rigorous fashion is a very good question.

2) I’m obviously biased here, but I think Nate should move more toward a Hidden Markov Model framework. In its current form, his model’s estimators for house effects and pollster-introduced error are inefficient and complicated, whereas both can be incorporated very elegantly under an HMM.

Better, this would make things parametric, which would make it easier to model non-traditional scenarios (the Georgia run-off in 2008 is a good example).

And it would clean up a lot of the contradictory assumptions that accumulate when a model is built up over time. For example, the functional form of FiveThirtyEight’s temporal error assumes a stationary random walk, while the exponential decay of poll weights over time is inconsistent with a random walk (see here).

3) House effects and pollster-introduced error are important and should be incorporated into models. Nate has shown this pretty well.
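
To make the hidden Markov idea in point 2 concrete, here is a minimal sketch of the simplest Gaussian version: true support follows a random walk, and each poll is a noisy reading of it. This is an illustration under stated assumptions, not Shor’s or FiveThirtyEight’s actual model. The function name `kalman_poll_filter`, the daily drift `drift_sd`, the prior, and the three example polls are all hypothetical, and house effects are left out (they could enter as a per-pollster offset subtracted from each reading).

```python
import math

def kalman_poll_filter(polls, drift_sd=0.3, prior_mean=50.0, prior_sd=10.0):
    """Filter time-ordered polls given as (day, share_pct, sample_size).

    Assumed latent model: true support is a random walk with daily
    standard deviation drift_sd (in percentage points); each poll is
    true support plus binomial sampling noise.
    Returns a list of (day, filtered_mean, filtered_sd).
    """
    mean, var = prior_mean, prior_sd ** 2
    last_day = None
    out = []
    for day, share, n in polls:
        # Predict step: uncertainty grows with the days since the last poll.
        if last_day is not None:
            var += (day - last_day) * drift_sd ** 2
        last_day = day
        # Sampling variance of a poll proportion, in percentage points squared.
        p = share / 100.0
        obs_var = 100.0 ** 2 * p * (1.0 - p) / n
        # Update step: precision-weighted average of prediction and poll.
        gain = var / (var + obs_var)
        mean = mean + gain * (share - mean)
        var = (1.0 - gain) * var
        out.append((day, mean, math.sqrt(var)))
    return out

# Made-up polls: (day, candidate share in percent, sample size).
example_polls = [(0, 52.0, 1000), (3, 50.0, 800), (7, 51.0, 1200)]
filtered = kalman_poll_filter(example_polls)
for day, m, s in filtered:
    print(f"day {day}: estimate {m:.2f} +/- {s:.2f}")
```

In this setup the weight given to each new poll is not chosen by hand: it falls out of the assumed drift and the poll’s sample size, which is the sense in which a single parametric model can replace separately tuned adjustments.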

My comments:

1. It should definitely help to include state-level predictors in a regression model. Even something as simple as the vote in the previous presidential election would help a lot, I’d think.

1a. For predicting House and Senate races, you’ll also want to include incumbency (worth about 10 percentage points in the House and I think about half that in the Senate), both for predicting the upcoming election and for adjusting past elections to get a measure of the normal vote for the parties.

2. A latent Markov model would be fine, maybe something like the models that Mark Glickman has implemented for rating football teams and chess players. In both of these situations, the true rankings change over time.

2a. In some ways, any poll aggregation system will have contradictory assumptions because there are really two goals: (i) to estimate the current state of public opinion, and (ii) to forecast the election. Focusing solely on current opinion is silly–after all, it’s ultimately about who wins the election–but focusing solely on prediction is frustrating because then, for the presidential election, each week’s polls contain very little information.
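
Comments 1 and 1a can be sketched as a small regression of the current vote share on the previous vote share and an incumbency indicator. Everything here is an illustrative assumption: the helper names `solve_linear` and `fit_ols`, the coding of incumbency as +1 (Democratic incumbent), -1 (Republican incumbent), or 0 (open seat), and the toy districts, whose outcomes are generated from a made-up rule purely so that the fit can be checked. The coefficients in that rule are not estimates of anything.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y):
    """Least-squares coefficients for y ~ intercept + predictors in rows."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[a] * xi[b] for xi in x) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(x, y)) for a in range(k)]
    return solve_linear(xtx, xty)

# Toy districts: (previous Democratic share in percent, incumbency code).
districts = [(45.0, -1), (55.0, 1), (50.0, 0), (60.0, 1),
             (40.0, -1), (52.0, 0), (48.0, 1), (58.0, -1)]

# Made-up generating rule, used only to produce checkable toy outcomes.
TOY_INTERCEPT, TOY_SLOPE, TOY_INCUMBENCY = 5.0, 0.9, 4.0
outcomes = [TOY_INTERCEPT + TOY_SLOPE * prev + TOY_INCUMBENCY * inc
            for prev, inc in districts]

intercept, slope, incumbency = fit_ols(districts, outcomes)

# Forecast for a hypothetical open seat that was 51% Democratic last time.
open_seat_forecast = intercept + slope * 51.0 + incumbency * 0
print(round(open_seat_forecast, 2))
```

Setting the incumbency code to 0 for every district in the fitted equation gives one version of the "normal vote" adjustment mentioned in 1a: the share a party would be expected to get with no incumbent running.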