Prediction market and polls

Ted Dunning sent me this graph:

So, how do the polling data compare to the contract prices from Intrade on the day before the election? Below is a graph with a data point for each state, with the horizontal axis representing the polling data and the vertical axis representing the Intrade contract price.

[Figure: intrade_v_polls_11.3.png. Intrade contract price versus polling data, one point per state.]

The quick message that I get from here is that Intrade prices are way biased toward 50/50. For example, the price for DC is something like .04, which is ridiculous. (To two decimal places, it should certainly be .00.)

8 thoughts on “Prediction market and polls”

  1. Actually, there's a fairly large transaction cost associated with buying a contract on Intrade. Therefore, you wouldn't expect contracts to converge to 0.00 as there aren't actually arbitrage opportunities once transaction costs are taken into account. Also, no one will offer an initial contract of 0.00 (or 1.00) as there's no money to be made.

    Note that this is only a problem for contracts where the price is basically constant over the life of the contract, such as the DC contract. Another implication is that very few trades should happen for such contracts, which I suspect you'll also find.

  2. The bias towards 50/50, at least at the edges (like DC), can certainly be explained by the overhead of doing 'business' at Intrade. The cost of participation is $.05 per contract.

    From Intrade:
    http://www.intrade.com/jsp/intrade/help/howitwork
    "We charge a commission of 5¢ for all contracts executed on the exchange. Let's take the example we discussed above, you bought 6 contracts at 50 in the morning and sold them in the afternoon at 73. On the purchase of 6 contracts, your commission was 30¢ (5¢ X 6 contracts); on the sale of 6 contracts at 73 your commission was again 30¢."
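As a rough check on the arithmetic in the quoted help text, here is a minimal sketch. The 5¢ commission comes from the quote; the value of 10¢ per price point (so that a contract settling at 100 would pay $10) is an assumption that the quote does not state, and the function names are made up for illustration.

```python
# Sketch of the commission arithmetic in the Intrade example quoted above.
# COMMISSION_CENTS is taken from the quote; CENTS_PER_POINT is an ASSUMPTION
# (prices run 0-100 points, each point assumed to be worth 10 cents).
COMMISSION_CENTS = 5
CENTS_PER_POINT = 10  # assumed, not stated in the quoted text


def commission(contracts):
    """Commission in cents for one execution of `contracts` contracts."""
    return COMMISSION_CENTS * contracts


def round_trip_net(contracts, buy_price, sell_price):
    """Net profit in cents from buying and later selling, after both commissions."""
    gross = contracts * (sell_price - buy_price) * CENTS_PER_POINT
    return gross - 2 * commission(contracts)


def break_even_move():
    """Price move (in points) a round trip needs just to cover its commissions."""
    return 2 * COMMISSION_CENTS / CENTS_PER_POINT


if __name__ == "__main__":
    print(commission(6))              # commission per side, in cents
    print(round_trip_net(6, 50, 73))  # net profit in cents, under the assumption
    print(break_even_move())          # points of movement needed to break even
```

Under these assumptions a round trip has to gain a full point before it covers its fees, which is in the spirit of the point that prices near the edges leave little room once costs are counted.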

  3. Bias towards 50/50 is what you get when people wish to reinvest the proceeds from trading: an additional 0.05 doesn't gain you much, but it prevents a financial wipe-out.

    Below I have a section from my master's thesis (which didn't invent anything new; these results go back to Kelly and Shannon). Towards the end I mention how 50/50 is minimax-optimal in the case of perfect ignorance.

    It's not ridiculous that traders doubt themselves and their data a little. Any self-doubt simply manifests itself as a bias toward 50/50.

    ----

    Assume a gambling game with n possible outcomes. We estimate the probability of each outcome with p_i, i = 1, 2, …, n. We place a bet of M*r_i coins on each outcome. Once the outcome is known to be j, we get N*M*r_j coins back. How should we distribute the coins if our knowledge of the distribution were perfect?

    We have M coins. Because we cannot incur a loss in this game if we play properly, it pays to use all of them. Let’s bet the M coins on the outcomes from O = {i : p_i = max_j p_j}, so that for every j not in O, r_j = 0. If this were not an optimal bet, there would exist k coins that should be moved from an outcome i in O to an outcome j outside O, so that we would earn more. We would then earn on average N*k*p_j − N*k*p_i coins more. But since no j outside O has a larger or equal probability, this difference is negative and the move always makes a loss. Such a bet is thus at least locally optimal, and optimal for n = 2. This is a bold strategy. The average profit made by the bold strategy in a game is −M + N M max_j p_j. The expected return on investment (ROI) given a particular probability distribution is N max_j p_j, which is maximum, so the bold strategy is max-optimal.

    If we are ignorant about the probability distribution, we bet M/n coins on each outcome, r_i = 1/n. Whatever the outcome, we never lose money, and in the worst case we get back the M coins we staked (taking N = n), implying an ROI of 0%. Proportional betting is minimax-optimal, as we have a guaranteed lower bound. In such proportional betting, we pay no attention to the probability distribution.

    We could try a timid betting strategy, which takes the probability distribution into account: we bet the proportion r_i = p_i of the M coins on outcome i. We thus spend all the coins, since the probabilities sum to 1. The expected ROI is N Sum_i[p_i^2].
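The three single-round strategies above can be compared with a short sketch. It assumes that a stake of M*r_j on the winning outcome j pays back N*M*r_j, and the example takes N = n; the probabilities are hypothetical and chosen only for illustration.

```python
# Expected coins returned per coin staked for the three strategies above.
# ASSUMPTIONS: a stake of M*r_j on the winning outcome j pays back N*M*r_j,
# and the example uses N = n (payout multiplier equals number of outcomes).


def bold(p):
    """Put everything on the most probable outcome(s), splitting ties evenly."""
    best = max(p)
    winners = [i for i, pi in enumerate(p) if pi == best]
    return [1 / len(winners) if i in winners else 0.0 for i in range(len(p))]


def uniform(p):
    """Ignore the probabilities: r_i = 1/n."""
    return [1 / len(p)] * len(p)


def timid(p):
    """Bet in proportion to the probabilities: r_i = p_i."""
    return list(p)


def expected_return(p, r, N):
    """Expected coins returned per coin staked: N * sum_i p_i * r_i."""
    return N * sum(pi * ri for pi, ri in zip(p, r))


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]  # hypothetical probabilities, for illustration only
    N = len(p)
    for strategy in (bold, uniform, timid):
        print(strategy.__name__, expected_return(p, strategy(p), N))
```

For a single round the bold bet has the highest expected return and the uniform bet the lowest, with the timid bet in between, matching the formulas N max_j p_j, 1 (when N = n), and N Sum_i[p_i^2].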

    Let us try to find a strategy which will maximize our expected earnings in the long run, after several repetitions of the game. Our capital in the (k+1)-th game is what we obtained in the k-th. For a sufficiently large number of game repetitions, long-term capital is exponentially larger for the strategy that maximizes E_p[ln r] = Sum_i[p_i ln r_i] than for any other strategy.
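The long-run criterion can be illustrated with a small sketch of the expected log-growth per round. It assumes the same setup as above (staking the fraction r_j on the winning outcome multiplies capital by N*r_j, with N = n in the example) and hypothetical probabilities. That Sum_i[p_i ln r_i] is maximized at r_i = p_i is the standard Kelly result (it follows from Gibbs' inequality); the excerpt does not spell this out, so treat it as an added remark.

```python
import math

# Expected log-growth per round, sum_i p_i * ln(N * r_i), for a bettor who
# reinvests everything each round. ASSUMPTION: staking the fraction r_j on the
# winning outcome multiplies capital by N * r_j (N = n in the example below).


def log_growth(p, r, N):
    """Expected ln of the per-round capital multiplier; -inf if ruin is possible."""
    total = 0.0
    for pi, ri in zip(p, r):
        if pi == 0:
            continue          # impossible outcomes contribute nothing
        if ri == 0:
            return -math.inf  # a possible outcome with no stake wipes us out
        total += pi * math.log(N * ri)
    return total


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]  # hypothetical probabilities, for illustration only
    N = len(p)
    print("timid  ", log_growth(p, p, N))                # r_i = p_i
    print("uniform", log_growth(p, [1 / N] * N, N))      # r_i = 1/n
    print("bold   ", log_growth(p, [1.0, 0.0, 0.0], N))  # all on the favourite
```

In this sketch the bold bet, best in a single round, has a log-growth of minus infinity because one bad outcome ends the game, while the timid bet grows and the uniform bet merely holds its capital.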

    My betting setup might differ from Intrade's – would be grateful for corrections.

  4. Given the fact that Intrade contracts aren't expected to reach 0 or 1, that curve fit should probably be redone. In fact, it's a not-so-good predictor of the data anyway — I believe I could do better in short order if the data were easily available in one place (I'm too lazy to go and get it myself). I've got some ready-built code for computing arbitrarily scaled and shifted logistic curves, and MATLAB's non-linear least squares routine would do all the heavy lifting.
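A scaled and shifted logistic fit of the kind suggested here might look like the sketch below. It is not the commenter's MATLAB code and does not use a non-linear least squares routine: it swaps in a plain grid search over the slope k and midpoint x0, solving for the lower and upper asymptotes lo and hi in closed form, since the model is linear in them once k and x0 are fixed. The data are synthetic and noise-free, not real Intrade or polling numbers, and all names are hypothetical.

```python
import math

# Sketch: least-squares fit of  price = lo + (hi - lo) / (1 + exp(-k * (x - x0))).
# NOT the commenter's method: a grid search over (k, x0), with lo and hi
# obtained by ordinary linear regression of price on the logistic term.


def logistic(x, k, x0):
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))


def fit_scaled_logistic(xs, ys, ks, x0s):
    """Return (sse, lo, hi, k, x0) minimising squared error over the grid."""
    n = len(xs)
    best = None
    for k in ks:
        for x0 in x0s:
            s = [logistic(x, k, x0) for x in xs]
            s_mean, y_mean = sum(s) / n, sum(ys) / n
            sxx = sum((si - s_mean) ** 2 for si in s)
            if sxx == 0:
                continue  # curve is flat on this data; slope undefined
            sxy = sum((si - s_mean) * (yi - y_mean) for si, yi in zip(s, ys))
            b = sxy / sxx            # b = hi - lo
            a = y_mean - b * s_mean  # a = lo
            sse = sum((a + b * si - yi) ** 2 for si, yi in zip(s, ys))
            if best is None or sse < best[0]:
                best = (sse, a, a + b, k, x0)
    return best


if __name__ == "__main__":
    # Synthetic data: x is a hypothetical poll margin, y a price in [0, 1]
    # generated with asymptotes 0.04 and 0.96 rather than 0 and 1.
    xs = list(range(-30, 31, 5))
    ys = [0.04 + 0.92 * logistic(x, 0.3, 0.0) for x in xs]
    ks = [0.05 * i for i in range(1, 21)]
    x0s = [0.5 * i for i in range(-10, 11)]
    sse, lo, hi, k, x0 = fit_scaled_logistic(xs, ys, ks, x0s)
    print(lo, hi, k, x0)
```

Letting lo and hi float is what allows the fitted curve to level off short of 0 and 1, which is the behaviour the comment expects from Intrade prices.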

  5. It's interesting that although the prices may be "biased towards 50/50" in some sense (because DC isn't priced at .00), the prices are actually biased away from 50/50 compared to that logistic curve. At least, for the central part of the graph. Is there any good reason to think that the data should fit a logistic curve?

  6. Kenny, the logistic was fit to the data; it is one of the default models.

    Terrell, great point! The mechanics of the market are very different from the description in my comment: they provide no incentive to try to capture real probabilities! Instead, they create incentives for biasing probabilities towards 0/100, with additional complexity coming from fees and volatility.
