Does posterior predictive model checking fit with the operational subjective approach?

David Rohde writes:

I have been thinking a lot lately about your Bayesian model checking approach. This is in part because I have been working on exploratory data analysis and, wishing to avoid controversy and mathematical statistics, we omitted model checking from our discussion. This is something that the refereeing process picked us up on, and we ultimately added a critical discussion of null-hypothesis testing to our paper. The exploratory technique we discussed was essentially a 2D histogram approach, but we used Polya models as a formal model for the histogram. We are currently working on a new paper, and we are thinking through how, or if, we should do “confirmatory analysis” or model checking in the paper.
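[For readers unfamiliar with the term, here is a minimal sketch of what a Polya (Dirichlet-multinomial) model for a 2D histogram might look like. The data, grid, and prior concentration below are purely illustrative assumptions, not anything from the paper being described: the bin counts get a Dirichlet posterior, from which smoothed cell probabilities and predictive draws follow.]

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 2D data; in an application this would be the real dataset.
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(size=500)

# 2D histogram: counts over a fixed grid of bins.
counts, xedges, yedges = np.histogram2d(x, y, bins=10)

# Polya (Dirichlet-multinomial) model for the histogram:
# cell probabilities p ~ Dirichlet(alpha), counts ~ Multinomial(n, p),
# so the posterior is Dirichlet(alpha + counts).
alpha = 1.0  # symmetric prior concentration per cell (an illustrative choice)
posterior_alpha = alpha + counts.ravel()

# Posterior draws of the cell probabilities, and the posterior mean (a smoothed histogram).
p_draws = rng.dirichlet(posterior_alpha, size=1000)
p_mean = posterior_alpha / posterior_alpha.sum()
print(p_mean.reshape(counts.shape).round(3))
```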

What I find most admirable about your statistical work is that you clearly use the Bayesian approach to do useful applied statistical analysis. My own attempts at applied Bayesian analysis make me greatly admire your applied successes. On the other hand, it may be that I am one of those naïve readers of Savage and Lindley who frustrated you in the 80s, but it seems to me that your use of model checks, and discussion of “model fit”, is a bit awkward from the strict Bayesian / operational subjective viewpoint. Anyway, I thought I would put this “naïve” view to you and give you a chance to shoot it down.

You often say that “all models are wrong” but it seems to me that there are two very different interpretations of this. The first makes a concession to frequentism and says that in the long run the histogram of the data will not resemble any model within the parameterized family. The second says that my (or someone else’s) subjective probabilities are not perfectly reflected in the model-prior combination. These two possibilities are discussed in Probabilism by de Finetti (unfortunately the discussion is drawn out over several pages, but the article is here).

My understanding of the Bayesian, and in particular the operational subjective, approach is that emphasis is placed upon conditional or posterior probabilities rather than on model fit. Sometimes I am tempted to think that the key insight of Bayesian thinking is that it is often inappropriate to think you can “estimate parameters” or do “model selection”. That is, research aimed at discovering a universal truth or law will often fail, yet useful predictive statements can routinely be made on the basis of data, even messy social science data.

The concept of model selection seems to rest on the assumption that if the probability of the data conditional on a model is low, then the model might be questionable. The difficulty for me here is that, as I see it, the probability of the data is always low (so long as the dataset contains a reasonable amount of information).
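[A small arithmetic illustration of this point, using hypothetical numbers rather than anything from the discussion above: even when the model is exactly right, the joint probability of any particular dataset is astronomically small, which is why the absolute probability of the data by itself says little about model adequacy.]

```python
import numpy as np

# Even under the "true" model, the joint probability of a modest dataset is tiny.
# Example: 100 flips of a fair coin; any particular sequence has probability 2**(-100).
print(2.0 ** -100)  # about 7.9e-31

# The same effect for a continuous model: joint log density of 100 draws from N(0, 1),
# evaluated under the very model that generated them.
rng = np.random.default_rng(0)
y = rng.standard_normal(100)
log_p = np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * y**2)
print(log_p)  # a large negative number, even though the model is exactly right
```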

I think the idea that you protested against in the 80s is a lazy form of coherence. While it is coherent to specify P(y^{rep},y) and then compute P(y^{rep}|y), the value of this relies on the initial specification being reasonable. The analysis is more convincing if the analyst goes through real angst in specifying P(y^{rep},y), in particular by thinking about marginal distributions of P(y^{rep},y) and maybe even checking whether estimators that you expect to work roughly correspond with P(y^{rep}|y). Using Monte Carlo to plot samples from P(y^{rep}|y) against the real data also seems to follow this sort of path. I would again argue that this can and perhaps should be viewed as a prior-prior coherence check rather than a model check… i.e., we are checking whether it is coherent to believe that some (perhaps many) samples of marginalizations of P(y^{rep}|y) should resemble marginalizations of y.
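[To make the Monte Carlo idea concrete, here is a minimal sketch of such a check in Python. It assumes, purely for illustration, a normal model with the standard noninformative prior; the simulated data and the choice of test statistic are hypothetical stand-ins, not anything from the correspondence above.]

```python
import numpy as np

rng = np.random.default_rng(0)

# "Observed" data: in practice this would be the real dataset.
y = rng.standard_t(df=3, size=100)  # heavier-tailed than the normal model assumes
n = len(y)

# Model: y_i ~ Normal(mu, sigma^2) with the standard noninformative prior on (mu, log sigma).
# Posterior draws follow the usual conjugate results:
#   sigma^2 | y ~ (n-1) s^2 / chi^2_{n-1},   mu | sigma^2, y ~ Normal(ybar, sigma^2 / n).
n_draws = 4000
ybar, s2 = y.mean(), y.var(ddof=1)
sigma2_draws = (n - 1) * s2 / rng.chisquare(n - 1, size=n_draws)
mu_draws = rng.normal(ybar, np.sqrt(sigma2_draws / n))

# Simulate replicated datasets y_rep from the posterior predictive distribution P(y^rep | y).
y_rep = rng.normal(mu_draws[:, None], np.sqrt(sigma2_draws)[:, None], size=(n_draws, n))

# Test statistic: the largest absolute observation (sensitive to tail behaviour).
T_obs = np.max(np.abs(y))
T_rep = np.max(np.abs(y_rep), axis=1)

# Posterior predictive p-value: how often the replications are at least as extreme as the data.
p_value = np.mean(T_rep >= T_obs)
print(f"T(y) = {T_obs:.2f}, posterior predictive p-value = {p_value:.3f}")
```

[In place of the single summary statistic, one could equally plot the replicated datasets against the real data, which is the comparison described in the paragraph above.]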

As mentioned above, my interest in the operational subjective view of probability theory isn’t always useful when I try to do applied statistics. I hope I am not coming across as overly dogmatic. I am really interested in whether you distinguish your practical statistical thinking from the operational subjective approach. When you talk about model checking and your admiration for Popper, it seems possible that you might.

On the merits and problems of the operational subjective perspective, I very much like this quote from David Lane: “on practical grounds, one’s scientific experiments must be directed in ways that draw upon expert information and will persuade other people (and worse, committees of people) to the inferences one makes; operational subjectivism correctly points out the logical dilemmas this entails, but self-righteously eschews the convenient moral compromises that make traditional research methods valuable in practice.” The quote is from David Lane’s review of Frank Lad’s Operational Subjective Statistical Methods.

My reply:

1. When I say “all models are wrong,” I mean that the mathematical assumptions of our models are simplifications of reality. You focus on probability distributions of errors, but it’s my impression that the deterministic assumptions of the model–additivity, linearity, etc.–are more important. We discuss this point in ARM, in chapter 4, I believe.

2. I don’t see the problem with “estimating parameters.” It’s all done in the context of a model, but that’s ok. I do agree with you, though, that statistics won’t tell you any universal truths; what it will tell you is ways in which the observed data depart from a model.

3. I don’t quite get your point about an analysis being “more convincing if the analyst goes through real angst.” The analysis should be somewhat independent of the analyst, no? Here I’m speaking of logical independence, not statistical independence.

4. I prefer thinking of model checks rather than trying to decide if you’re checking the prior or the likelihood. There are some interesting distinctions here, but overall I’d prefer to treat the entire model as a single entity.

5. As I’ve written in various places, I don’t like thinking in terms of subjective probability. All of statistics is subjective, in the sense that choices have to be made in deciding what models to consider and what data to analyze, but I don’t see Bayesian inference as any more subjective than classical inference. Not one bit. I could actually argue the opposite, that classical statistics, with its choices of estimators, tuning parameters, optimality criteria, etc., is actually more subjective. But I’d just as soon not even bother to make that argument.

1 thought on “Does posterior predictive model checking fit with the operational subjective approach?”

  1. In Engineering, Physics, and Applied Mathematics type modeling, there is a tendency for the crowd to split into two groups. Group one are people interested in what is likely to happen, and group two are people interested in what they can prove about the system.

    Group one tends to do things like use asymptotic series that diverge, numerical estimates of the order of magnitude of certain terms to discard them if possible, and numerical calculations combined with qualitative simplifications to get an idea roughly of when the model works to predict the observed data. Group one is the group you want to go to if you are designing a widget, since they focus on whether the math describes the widget.

    Group two tends to do things like prove that an infinite number of terms of a given series expansion of the solution eventually converges in mean squared error to the right answer to the given equation. Group two will never ask whether the equation is the right equation for your widget; to them that question is more or less uninteresting. Group two are basically pure mathematicians working on math problems that group one invented but chose to ignore because they weren't immediately relevant to the widget.

    Within statistics, I think this is basically the distinction between an applied statistician / data analyst and a mathematical statistician. For the most part, the frequentist / Bayesian divide is not that interesting to the group one guys; the more important question is whether we have learned something about the real world.

    Now, in my opinion, we need a little bit of both, but more of group one than of group two, since the results of group two tend to apply across many disciplines, but the results of group one are models specific to real world problems which are a bit different each time.

    To group one modelers, the statement "every model is wrong" is obvious because they have fit many models and seen how each of those models will sometimes fail to help them understand some aspect of a real world question. To group two the question is almost meaningless, since their interests tend to be in the form of statements like "assuming model x is true, what are the consequences for prediction y," and if you assume model x is true, then the statement that it is "wrong" is hard to understand.

    All this being said, it seems to me that Andrew's success is that he's a group one guy who has figured out how to build better models by applying some of the results of group two guys (basically the computational machinery of Bayesian inference).

    Every time someone tries to pin him down on subjectivist vs objectivist ideas he looks at the real world and says something about how subjective the whole process of scientific investigation is, whereas the question asker is usually working within some formal framework in which "subjectivist" and "objectivist" have formal meanings so the two people are in essence talking about different things.
