Bayesian hierarchical model for the prediction of soccer results

Gianluca Baio sends along this article (coauthored with Marta Blangiardo):

The problem of modelling football [soccer] data has become increasingly popular in the last few years and many different models have been proposed with the aim of estimating the characteristics that bring a team to lose or win a game, or to predict the score of a particular match. We propose a Bayesian hierarchical model to address both these aims and test its predictive strength on data about the Italian Serie A championship 1991-1992. To overcome the issue of overshrinkage produced by the Bayesian hierarchical model, we specify a more complex mixture model that results in better fit to the observed data. We test its performance using an example about the Italian Serie A championship 2007-2008.

I like the use of the hierarchical model and the focus on prediction. I’m wondering, though, shouldn’t the model include a correlation between the “attack” and “defense” parameters? Or maybe that’s in the model but I didn’t notice it.
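To make the question concrete, here is a minimal Python sketch of what correlated team effects could look like. It is not the model from the paper. The function names, the standard deviations, and the correlation rho are all hypothetical values chosen for illustration.

```python
import math
import random


def draw_team_effects(n_teams, sd_att, sd_def, rho, rng):
    """Draw (attack, defense) pairs from a zero-mean bivariate normal.

    rho is the assumed correlation between the two effects (hypothetical).
    """
    effects = []
    for _ in range(n_teams):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        attack = sd_att * z1
        # Cholesky factor of the 2x2 correlation matrix mixes z1 into defense.
        defense = sd_def * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        effects.append((attack, defense))
    return effects


def sample_correlation(pairs):
    """Plain Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
# Illustrative values only: 20 teams, equal spread, rho = -0.5.
teams = draw_team_effects(20, 0.3, 0.3, -0.5, rng)
print(sample_correlation(teams))
```

With only 20 teams the sample correlation is noisy. That is part of why it helps to put the correlation in the group-level model and let the data inform it, rather than estimating it from the fitted team effects afterward.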

Oooh, an ugly, ugly table! Table 2 breaks every rule in the book, and I don’t mean that in a good way. Too many significant digits (if the 95% interval is [-0.52, +0.06], then, no, you don’t need to report the posterior mean to four decimal places), no group-level predictors, and the teams are laid out in alphabetical order. Table 3 is much better (although it would be better still as a graph, I think). Figure 3 is getting better, but the words on the graph are too tiny; they’re unreadable. And it would be my preference to add a few sentences to the caption to explain what’s going on. Figure 5 is looking pretty, but it reverts to the horrible, horrible alphabetical order, this time made even worse by putting the alphabet in reverse.

And the WinBUGS code has that discredited dgamma(epsilon, epsilon) model. (This is described on page 4 as a “flat” prior distribution, which isn’t right at all.) Not a huge deal, but something I notice because I’ve spent a lot of time thinking about it.
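A quick way to see why “flat” is the wrong word is to evaluate the density directly. The sketch below assumes the usual BUGS convention that dgamma(a, b) takes a shape and a rate and that the prior is placed on a precision tau. The value epsilon = 0.001 and the function names are chosen here for illustration and are not taken from the paper.

```python
import math


def gamma_logpdf(x, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def implied_sd_logpdf(sigma, shape, rate):
    """Log density of sigma = 1/sqrt(tau) when tau ~ Gamma(shape, rate)."""
    tau = sigma ** -2
    # Change of variables: |d tau / d sigma| = 2 * sigma**-3.
    return gamma_logpdf(tau, shape, rate) + math.log(2.0) - 3.0 * math.log(sigma)


eps = 0.001  # illustrative value, not from the paper

# A flat prior would give density ratios of 1 between any two points.
ratio_tau = math.exp(gamma_logpdf(0.01, eps, eps) - gamma_logpdf(1.0, eps, eps))
ratio_sd = math.exp(implied_sd_logpdf(0.1, eps, eps) - implied_sd_logpdf(1.0, eps, eps))
print("density ratio, tau = 0.01 vs tau = 1:", ratio_tau)
print("density ratio, sigma = 0.1 vs sigma = 1:", ratio_sd)
```

Both ratios come out far from 1, so the prior is not flat on the precision and not flat on the standard deviation either. For small epsilon the density on tau behaves roughly like 1/tau over a wide range, and how much that matters depends on how close the group-level variance is to zero.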

1 thought on “Bayesian hierarchical model for the prediction of soccer results”

  1. They stepped on a forgotten land mine?

    OK maybe the dgamma(epsilon, epsilon) model did not blow up their research?

    My guess is they started from un-updated WinBugs example code – i.e. those forgotten land mines.

    For students who might have some time to spare

    Define a replicable web search strategy to find introductory WinBugs examples

    Count the proportion that have the discredited rather than currently accepted "flat" priors

    Maybe code F for just discredited, B for just currently accepted or A+ for both with a pointer to the discreditation

    Then ask for funding to get rid of the Fs that are found

    K?
