Informative and noninformative priors

Neal writes,

As I start your Bayesian stuff, can I ask you the same question I asked Boris a few years ago? Namely, as you note, noninformative priors simply represent the situation where we know very little and want the data to speak (so in the end not too far from the classical view). Can you point me to any social science (closer to ps is better) where people actually update, so that the prior in a second study is the posterior of the first (whether or not the two studies are done by the same person)?

Equivalently, point me to a study which uses non-inf priors as more than a toy (I know the piece by Gill and his student).

Btw, do you know the old piece by Harry Roberts, saying that as a scientist all we can report is the likelihood, and that everyone should put in their own prior and then produce their own posterior? So all articles would just be a computer program that takes as input my prior and produces my posterior, given the likelihood surface estimated by the author.

My reply: now I like weakly informative priors. But that’s new since our books. Regarding informative priors in applied research, we can distinguish three categories:

(1) Prior distributions giving numerical information that is crucial to estimation of the model. This would be a traditional informative prior, which might come from a literature review or explicitly from an earlier data analysis.

(2) Prior distributions that are not supplying any controversial information but are strong enough to pull the data away from inappropriate inferences that are consistent with the likelihood. This might be called a weakly informative prior.

(3) Prior distributions that are uniform, or nearly so, and basically allow the information from the likelihood to be interpreted probabilistically. These are noninformative priors, or maybe, in some cases, weakly informative.

I have examples of (1), (2), and (3) in my own applied research. Category (3) is the most common for me, but an example of (1) is my 1990 paper with King on seats-votes curves, where we fit a mixture model and used an informative prior to constrain the locations, scales, and masses of the three components. Another example of (1) is my 1996 paper with Bois and Jiang, where we used an informative prior distribution for several parameters in a toxicology model. We were careful to parameterize the model so that these priors made sense, and the model also had an interesting two-level structure, which we discuss in that paper and also in Section 9.1 of Bayesian Data Analysis.
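To give a rough numerical sense of how these three categories behave, here is a minimal sketch in Python (a toy example with made-up numbers, not drawn from any of the papers above): the same data are combined with an informative, a weakly informative, and a nearly flat prior in a conjugate normal-mean model.

```python
# Toy comparison of informative, weakly informative, and noninformative priors
# in a conjugate normal-mean model with known data standard deviation.
# All numbers here are illustrative assumptions, not from the papers cited above.
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0                                    # known data sd
y = rng.normal(loc=1.5, scale=sigma, size=20)  # hypothetical data
n, ybar = len(y), y.mean()

def posterior(m0, s0):
    """Posterior mean and sd for theta ~ N(m0, s0^2), y_i ~ N(theta, sigma^2)."""
    prec = 1 / s0**2 + n / sigma**2
    mean = (m0 / s0**2 + n * ybar / sigma**2) / prec
    return mean, prec ** -0.5

priors = {
    "(1) informative (e.g., from an earlier analysis)": (1.0, 0.2),
    "(2) weakly informative (generic, regularizing)":   (0.0, 2.5),
    "(3) noninformative (essentially flat)":            (0.0, 100.0),
}
for label, (m0, s0) in priors.items():
    mean, sd = posterior(m0, s0)
    print(f"{label}: posterior mean {mean:.2f}, sd {sd:.2f}")
```

With a flat or very wide prior the posterior essentially reproduces the likelihood; the weakly informative prior pulls the estimate only slightly; the tight informative prior moves it noticeably toward the prior mean.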

Regarding your question about models where people actually update: we did this in our radon analysis, where the posterior distribution from a national data analysis (based on data from over 80,000 houses) gives an inference for each county in the U.S., which in turn is used as the prior distribution for the radon level in your house, which can then be updated with a measurement from your own house.
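Here is a hedged sketch of that updating chain with made-up numbers: a county-level estimate (standing in for the posterior from the national analysis) serves as the prior for the log radon level in one house and is then updated with a single measurement from that house.

```python
# Sketch of the county-to-house updating chain described above.
# The means, standard deviations, and measurement are illustrative assumptions.
import math

# "Prior" for this house = county-level posterior on the log pCi/L scale (assumed values):
county_mean, county_sd = 1.0, 0.6

# One measurement in the house, with an assumed measurement error on the log scale:
y, meas_sd = 1.8, 0.4

# Conjugate normal update: precisions add, means are precision-weighted.
post_prec = 1 / county_sd**2 + 1 / meas_sd**2
post_mean = (county_mean / county_sd**2 + y / meas_sd**2) / post_prec
post_sd = post_prec ** -0.5

print(f"house-level posterior: mean {post_mean:.2f}, sd {post_sd:.2f} (log scale)")
print(f"point estimate of radon level: {math.exp(post_mean):.2f} pCi/L")
```

In the hierarchical analysis the county estimates themselves come from partially pooling each county's data toward the national distribution, so the same kind of update happens one level up as well.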

One of the convenient things about doing applied statistics is that eventually I can come up with an example for everything from my own experience. (This also makes it fun to write books.)

Regarding your last comment: yes, there is an idea that a Bayesian wants everyone else to be non-Bayesian so that he or she can do cleaner analyses. I discuss that idea in this talk from 2003 which I’ve been too lazy to write up as a paper.

4 thoughts on "Informative and noninformative priors"

  1. Regarding "noninformative", "weakly informative" and "informative", do these terms have any meaning beyond a subjective assertion?

    It seems to me that every inference must be sensitive to the prior, in that one can simply turn Bayes' Theorem around and say that in order to get posterior P(X|O) from likelihood P(O|X), we just choose the prior P(X) ∝ P(X|O)/P(O|X). So long as the likelihood does not actually vanish anywhere that we want a nonzero posterior (which in the real world with continuous variables is surely the case), this will always work. Indeed, via a transformation of variables it will even be possible to present this prior as "uniform" in some variable or other.

    So saying that the posterior is not sensitive to the prior (or not very sensitive) seems to be nothing more than a (subjective?) judgment about which priors are considered reasonable, and is not really an objective measure of sensitivity at all.
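To make the reverse-engineering argument in this comment concrete, here is a small numeric sketch (an illustration added here, with arbitrary made-up numbers): on a grid of parameter values, pick any target "posterior," divide it by the likelihood to get a prior, and running Bayes' rule forward with that prior reproduces the target.

```python
# Numeric illustration of choosing a prior proportional to (desired posterior) / (likelihood).
# The observation, grid, and target shape are arbitrary assumptions for illustration.
import numpy as np

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

theta = np.linspace(-3, 3, 601)          # grid of parameter values
y = 0.5                                  # a single observation (made-up value)
lik = normal_pdf(y, theta, 1.0)          # likelihood P(O|X); nowhere zero on this grid

# An arbitrary target "posterior" (a two-bump shape), normalized on the grid:
target = normal_pdf(theta, -1.5, 0.3) + normal_pdf(theta, 2.0, 0.4)
target /= target.sum()

# Turn Bayes around: choose the prior proportional to target / likelihood ...
prior = target / lik
prior /= prior.sum()

# ... and Bayes' rule run forward with that prior recovers the target (up to rounding).
post = prior * lik
post /= post.sum()
print("max abs difference from target:", np.abs(post - target).max())
```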

  2. James,

    There is a difference between a prior distribution that includes numerical information specific to the problem at hand (this would be an "informative" prior) and a model that is more generic. My recent contribution is to categorize these generic prior distributions as "noninformative" or "weakly informative."

    I agree that there is no unique noninformative or uniform prior distribution. But whatever noninformative prior distribution is used, the intent is for the inferences to not be sensitive to the details of the choice of prior in this context. The statement about sensitivity depends on the likelihood: it is not a property of the prior alone.

    A "weakly informative" prior has some information but more generic information, not specific to the particular problem at hand. We discuss this in our Statistica Sinica article and at more length in a new paper that we're almost finished with.

  3. Thanks, but I'm not sure that I'm a great deal further forward.

    In order to consider that the inferences are not sensitive to the details of the prior, you have to subjectively limit the range of priors you consider acceptable to start with, right (even within a generic class of weakly informative priors)? E.g., if you choose your weakly informative prior to be uniform and bounded, you still have to limit the range of possible bounds, or else your results will be radically changed (just consider an arbitrary pair of disjoint priors).
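As a tiny illustration of the disjoint-priors point (again a toy example added here, not from the comment), two bounded uniform priors with non-overlapping support push the posterior into different regions no matter what the data say:

```python
# Two disjoint bounded uniform priors force the posterior into different regions.
# Grid, bounds, and the single observation are made up for illustration.
import numpy as np

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

theta = np.linspace(-5, 5, 1001)
lik = normal_pdf(0.0, theta, 1.0)             # likelihood from one observation y = 0

for lo, hi in [(-4.0, -2.0), (2.0, 4.0)]:     # two disjoint uniform priors
    prior = ((theta >= lo) & (theta <= hi)).astype(float)
    post = prior * lik
    post /= post.sum()
    print(f"uniform({lo}, {hi}) prior -> posterior mean {np.dot(theta, post):.2f}")
```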
