A dismal theorem?

James Annan writes,

I wonder if you would consider commenting on Marty Weitzman’s “Dismal Theorem”, which purports to show that all estimates of what he calls a “scaling parameter” (climate sensitivity is one example) must be long-tailed, in the sense of having a pdf that decays as an inverse polynomial and not faster. The conclusion he draws is that using a standard risk-averse loss function gives an infinite expected loss, and always will for any amount of observational evidence.

I looked up Weitzman and found this paper, “On Modeling and Interpreting the Economics of Catastrophic Climate Change,” which discusses his “dismal theorem.” I couldn’t bring myself to put in the effort to understand exactly what he was saying, but I caught something about posterior distributions having fat tails. That’s true–this is a point made in many Bayesian statistics texts, including ours (chapter 3) and many that came before us (for example, Box and Tiao). With any finite sample, it’s hard to rule out the hypothesis of a huge underlying variance. (Fundamentally, the reason is that, if the underlying distribution truly does have fat tails, it’s possible for them to be hidden in any reasonable sample. It’s that Black Swan thing all over again.) I think that Weitzman is making some deeper technical point, and I’m sure I’m disappointing Annan by not having more to say on this . . .
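
To make the fat-tails point concrete, here is a minimal simulation sketch (just the textbook normal-with-unknown-variance setup, not anything from Weitzman's paper): with only a handful of observations, the posterior predictive distribution is a t with few degrees of freedom, and its tails decay polynomially rather than exponentially.

    import numpy as np
    from scipy import stats

    # Normal data with unknown mean and variance, noninformative prior:
    # the posterior predictive for a new observation is Student-t with
    # n-1 degrees of freedom, so its tails are polynomial, not exponential.
    rng = np.random.default_rng(0)
    n = 5
    y = rng.normal(loc=0.0, scale=1.0, size=n)
    ybar, s = y.mean(), y.std(ddof=1)

    scale = s * np.sqrt(1 + 1 / n)
    pred = stats.t(df=n - 1, loc=ybar, scale=scale)
    gauss = stats.norm(loc=ybar, scale=scale)  # same center and scale, thin tails

    for k in (3, 5, 10):
        x = ybar + k * scale
        print(f"P(new y > {k} scale units): t = {pred.sf(x):.1e}, normal = {gauss.sf(x):.1e}")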

More

Searching on the web, I found this article by William Nordhaus criticizing Weitzman’s reasoning. Unfortunately, Nordhaus’s article just left me more confused: he kept talking about a utility function of the form U(c) = (c^(1-a) - 1)/(1-a), which doesn’t seem to be relevant to the climate change example. Or to any other example, for that matter. Attempting to model risk aversion with a utility function–that’s so 1950s, dude! It’s all about loss aversion and uncertainty aversion nowadays. This isn’t Nordhaus’s fault–he seems to be working off of Weitzman’s model–but it’s hard for me to know how to evaluate any of this stuff if it’s based on this sort of model.

Also, I don’t buy Nordhaus’s argument on page 4 that you can deduce our implicit value of non-extinction by looking at how much the U.S. government spends on avoiding asteroid impacts. This reminds me of the sorts of comparisons people do, things like total spending on cosmetics or sports betting compared to cancer research. I already know that we spend money on short-term priorities–I wouldn’t use that to make broad claims about the “negative utility of extinction.”

Back to Weitzman’s paper

I find abbreviations such as DT (for the “dismal theorem”) and GHG (for greenhouse gases) to be distracting. I don’t know if this is fair of me. I don’t mind U.S. or FBI or EPA or other common abbreviations, but I find it really annoying to read a phrase such as, “Phrased differently, is DT an economics version of an impossibility theorem which signifies that there are fat-tailed situations where economic analysis is up against a strong constraint on the ability of any quantitative analysis to inform us without committing to a VSL-like parameter and an empirical CBA framework that is based upon some explicit numerical estimates of the miniscule [sic] probabilities of all levels of catastrophic impacts up to absolute disaster?” The concepts are tricky enough as it is without me having to try to flip back and find out what is meant by DT, VSL, and CBA. But, if Weitzman were to spell out all the words, would the other economists think he’s some sort of rube? I just don’t know the rules here.

On page 37, near the end of the paper, Weitzman writes, “A so-called ‘Integrated Assessment Model’ (hereafter ‘IAM’) . . .” I was reminded of Raymond Chandler’s advice for writers: “When in doubt, have a man come through the door with a gun in his hand.” Or, in this case, an abbreviation. Never let your readers relax, that’s my motto.

I’m not sure how to think about the decision analysis questions. For example, Weitzman writes, “Should we have foregone the industrial revolution because of the GHGs it generated?” But I don’t think that foregoing the industrial revolution was ever a live option.

P.S. I have to admit, “miniscule” sounds right. It begins with “mini,” after all.

15 thoughts on “A dismal theorem?”

  1. That utility function is actually very common in finance & economics (it's called a constant relative risk aversion, or CRRA, function); it's not the most realistic, though, especially for modeling catastrophic risk, as it implies that absolute risk aversion is inversely proportional to consumption. This kind of stuff is one of the reasons I'm leaving econometrics for statistics (plus the broader areas of application, etc.).

  2. U(c) = c^(1-a)/(1-a) is the most commonly used utility function in economics; it satisfies expected utility and constant relative risk aversion (-c * U''(c)/U'(c) is constant).
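
    As a quick numerical sanity check of that property, here is a minimal sketch (my own, using finite-difference derivatives) verifying that -c * U''(c)/U'(c) comes out equal to a for the power utility at any consumption level:

      import numpy as np

      def crra(c, a):
          # power (CRRA) utility; a = 1 is the log-utility limit
          return np.log(c) if a == 1 else c**(1 - a) / (1 - a)

      def relative_risk_aversion(c, a, h=1e-3):
          # -c * U''(c) / U'(c), with derivatives taken by central differences
          u1 = (crra(c + h, a) - crra(c - h, a)) / (2 * h)
          u2 = (crra(c + h, a) - 2 * crra(c, a) + crra(c - h, a)) / h**2
          return -c * u2 / u1

      for a in (0.5, 2.0, 4.0):
          print(a, [round(relative_risk_aversion(c, a), 3) for c in (1.0, 10.0, 100.0)])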

  3. I'm surprised you have such a big problem with the utility function they use. That's the standard function used in nearly every single economic application you could ever read. I assume you must not read very many economics papers. If you did, I really doubt you would say that it's all about loss aversion nowadays.

    The main problem with loss aversion is that it's really hard to get those sorts of models to become tractable. The power utility model is nice because it very often produces analytic solutions to models.

    So it seems odd for you to say that you don't know how to interpret a model using power utility. It would be far harder to interpret any other model because you would never get an analytic solution — you would be reduced to numeric approximations to everything. Moreover, the entire field of economics seems to be able to interpret models with power utility just fine. I'm not sure I see why you find this sort of function so bizarre.

    Clicking through to your post about loss aversion, I'm not especially convinced you understood any of that either. The Rabin paper is in general a critique of constant absolute risk aversion utility models, not constant relative risk aversion models like the power utility function Nordhaus and Weitzman discuss. With CRRA utility, the main problems arise in reconciling experimental evidence with market behavior like in the stock market. The fact that loss aversion exists on some small scale in no way implies that we would want to drop the entire concept of a concave utility function, and I highly doubt that Rabin would want anybody to draw that conclusion from his work.

    Furthermore, we know that many of the lab-based results that Rabin was appealing to don't tend to hold in the real world. In a long series of papers published in top-tier economics journals (AER, etc.), John List has shown that experienced traders tend not to display loss aversion or the other characteristics of prospect theory. Prospect theory has also proven to be pretty fragile even in lab settings.

    I guess what I'm trying to point out is that your post is bizarre. Power utility is the standard model used in economics. Rabin's critique in no way implied that we should drop that framework. At best, it implied that we shouldn't use power utility to model small scale risks. In the large scale, power utility has proven in a variety of areas to be surprisingly effective. Moreover, since climate change is certainly a large scale risk, loss aversion is nowhere close to being an important issue.

  4. Anon,

    Thanks for the comments. Just to be clear: my problem with the use of utilities to capture the risk-aversion phenomenon has nothing to do with Rabin's paper. The problem was clear to me from simple reasoning, was confirmed when I taught decision analysis classes, and was then given further theoretical structure when I read the literature on loss aversion, uncertainty aversion, and framing. It was after all of that that Rabin's paper was pointed out to me, and I felt that he made a nice mathematical point.

    I have no problem with a concave utility function–as long as you don't try to use the utility function to capture all sorts of preference issues that are better modeled by framing, etc. But, first, the exponential function seems silly to me, and second, I'm not quite sure how it works with climate change. On the second point I'm merely ignorant; I didn't read the papers carefully enough to figure out what quantity X they were computing U(X) for.

    Tc,

    As with the comment above, I recognize that the exponential utility function has nice properties, and indeed people realized that in the 1950s. It just doesn't make a lot of sense to me in any example I've ever seen, and I also think you're way overloading the utility function by trying to use it to handle phenomena such as risk aversion.

  5. I agree on the abbreviations. They're common in statistics papers, too. Is journal space so scarce that we can't spell out "dismal theorem"?

  6. Andrew – The issue you touch on with overloading is actually quite significant in finance. The parameter in this utility (a, in your example) also ends up controlling intertemporal substitution behavior. This is especially nasty in some types of asset pricing models.

    Anon – I agree that loss aversion and other behavioral issues aren't tremendously relevant for the topic at hand. However, I don't generally find tractability to be a particularly convincing argument for using power utility to price catastrophic risk (heck, I've even seen arguments that, in the long run, we should only pay attention to log utility anyway; not terribly convincing, but they're out there). For risks that are of such a nature that they would substantially alter the structure of the model, such simplifications become suspect. I think of this as somewhat analogous to "value of a life" studies. The structure they put on price vs. mortality probability is often somewhat unconvincing, as one would expect price to increase very steeply as the probability increases, making that section substantially more difficult to estimate. Ok, done for now.

  7. OK, so I should probably actually read the paper(s) before commenting, but from James' question it sounds like Weitzman's point is precisely that the utility function doesn't work properly in these instances…

  8. Professor Gelman,
    This is something not related to this post, but I didn't know where else to ask: I would like to ask your opinion, if you have the time, about a test for nonparametrically comparing two means. Apparently there's a big fuss about it nowadays in certain circles within experimental economics. The paper is here: http://www.iue.it/Personal/Schlag/papers/exacthyp

    I have the impression that these results have been known for a long time in statistics, but I wanted to ask a specialist. Also, it is not clear to me what the applicability of such methods is, but that is another question.

  9. I wanted to use the word as a technical term in a paper I was writing, so I looked it up, and apparently both "miniscule" and "minuscule" are now acceptable spellings in some dictionaries. Firefox, however, agrees with you that only the second is correct.

  10. Thanks for commenting on it. I realise that the economics might be most interesting to you but you're right that I'm a bit disappointed, because you don't deal with the crux of the Bayesian estimation bit, which (AIUI) is the way that Marty outlaws the concept of a physical constant. I agree with you that long-tailed distributions often arise in practice, but he is claiming that they arise even in the case of Gaussian likelihood functions (which he explicitly uses). The way he does this is that instead of estimating some arbitrary physical parameter c (say) by making observations of the system and constructing a likelihood, he seems to think we should talk about the "distribution of c" which he claims looks like N(m,s) where s (at least) is unknown, and then we have to estimate these hyperparameters by considering our observations as samples from "the pdf of c". Thus, as more information is gathered, our estimate will converge to a pdf of finite (but currently unknown) width, rather than a point estimate. But this approach seems completely incompatible with standard use of Bayesian methods in the physical sciences. E.g., D'Agostini explicitly uses examples such as "the mass of an electron" in his book (available here), not "the distribution of electron mass," because in the standard model this value is indeed a constant, not a distribution.
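
    To illustrate that distinction with a toy sketch (my own reading of it, not Weitzman's actual setup): if c is a constant measured with noise, the posterior for c narrows as observations accumulate, whereas inference about "the distribution of c" converges to a distribution of finite width, namely the noise scale itself.

      import numpy as np

      # Toy contrast: the data are noisy measurements of one true constant.
      rng = np.random.default_rng(1)
      c_true, noise_sd = 3.0, 1.0

      for n in (10, 100, 10_000):
          y = rng.normal(c_true, noise_sd, size=n)

          # (a) c is a physical constant: with known noise sd and a flat prior,
          # the posterior for c is N(mean(y), noise_sd/sqrt(n)) and collapses
          # toward a point as n grows.
          post_sd_constant = noise_sd / np.sqrt(n)

          # (b) "the distribution of c": treat each y_i as a draw from N(m, s)
          # and estimate s; the inferred width converges to noise_sd, not to zero.
          s_hat = y.std(ddof=1)

          print(f"n={n:6d}  posterior sd of the constant: {post_sd_constant:.4f}"
                f"   estimated width of 'the pdf of c': {s_hat:.3f}")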

  11. ag: Nordhaus's article just left me more confused: he kept talking about a utility function of the form U(c) = (c^(1-a) - 1)/(1-a)… Attempting to model risk aversion with a utility function–that's so 1950s, dude! It's all about loss aversion and uncertainty aversion nowadays.

    tc: U(c) = c^(1-a)/(1-a) is the most commonly used utility function in economics; it satisfies expected utility and constant relative risk aversion (-c * U''(c)/U'(c) is constant).

    anon: I'm surprised you have such a big problem with the utility function they use. That's the standard function used in nearly every single economic application you could ever read.
    the main problem with loss aversion is that it's really hard to get those sorts of models to become tractable. The power utility model is nice because it very often produces analytic solutions to models.

    anon is right on. first, the power function u(x) = x^k, 0 < k < 1, is widely used in economics; it exhibits constant relative risk aversion (CRRA) and is by far the most intuitively plausible shape for gains. contrast it with the prospect theory value function, e.g. via the cumulative prospect theory calculator here: http://psych.fullerton.edu/mbirnbaum/calculators/cpt_calculator.htm

    Of course, this shows why prospect theory is totally useless in terms of modeling. risk aversion is defined everywhere except at the status quo, where all the action is. ouch.

    this is why most practicing economists ignore prospect theory and use a continuous power function to model behavior, which is exactly what K&T showed is a stupid thing to do, if you care about empirical validity. Get it?

  12. aarrggh. i forgot that you can't use greater than signs or blog software gets html-based confusion. here's what i wrote:

    the power function u(x)=x to the k where k is between 0 and 1 is widely used in economics. it exhibits constant relative risk aversion (CRRA). it is by far the most intuitively plausible shape for gains. It is usefully contrasted with the negative exponential function which exhibits constant absolute risk aversion. the risk aversion demo in your 1998 paper implicitly assumes CARA. Like Rabin, you discovered that assuming CARA leads to kooky predictions. this is why, like tc says, CRRA is the most commonly used function in econ. you need CARA or CRRA to make EU tractable and CRRA is much more empirically plausible than CARA.

  13. If you upgrade Gelman (1998) to a formal, general proof you get this:

    If a decision maker always rejects a p chance of w+1 and a 1-p chance of w-1, the decision maker will always reject a 50-50 chance of win infinity/lose X, where X = ln 2/ln(p/(1-p)).

    Proof to follow.

    AG – can you figure it out?
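
    In the meantime, here is a quick numerical check of that threshold under a CARA (exponential) utility calibrated to the small gamble; a sketch of the claim only, not the promised proof.

      import numpy as np

      # CARA utility u(w) = -exp(-r*w). Indifference to the small gamble
      # (win 1 w.p. p, lose 1 w.p. 1-p) pins down r = ln(p/(1-p)); the 50-50
      # "win infinity / lose X" gamble is then rejected once X > ln(2)/r.
      p = 0.6
      r = np.log(p / (1 - p))
      X_threshold = np.log(2) / r
      print("threshold X:", round(X_threshold, 3))

      def cara(w):
          return -np.exp(-r * w)

      w = 0.0  # wealth drops out of the comparison under CARA
      for X in (0.5 * X_threshold, 2 * X_threshold):
          eu_gamble = 0.5 * 0.0 + 0.5 * cara(w - X)  # winning "infinity" has utility -> 0
          print(f"X = {X:.2f}: take the 50-50 gamble? {eu_gamble > cara(w)}")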

  14. Clemen (1991, pp. 379-382) Risk tolerance and the exponential utility function

    U(x) = 1 - e^(-x/R)

    This [the exponential utility function] function is concave, and thus can be used to represent risk-averse preferences. As x becomes large, U(x) approaches 1. The utility of zero, U(0), is equal to 0, and the utility for negative x (being in debt) is negative. In the exponential utility function, R is a parameter that determines how risk-averse the utility function is. In particular, R is called the risk tolerance. Larger values of R make the exponential utility function flatter, while smaller values make it more concave or more risk-averse.

    How can R be determined? A variety of ways exist, but it turns out that R has a very intuitive interpretation that makes its assessment relatively easy. Consider the gamble

    Win $Y with probability 0.5
    Lose $Y/2 with probability 0.5

    Would you be willing to take this gamble if Y were $100? $2000? $35,000? At what point would the risk become intolerable? The largest value of Y for which you would prefer to take the gamble rather than not take it is approximately equal to your risk tolerance. This is the value that you can use for R in your exponential utility function.
    For example, let Y=$900. Hence, R=$900. Using this assessment in the exponential utility function would result in the utility function U(x) = 1 - e^(-x/900).

    What are reasonable R values? Howard (1988) suggests certain guidelines for determining a corporation’s risk tolerance in terms of total sales, net income, or equity. Reasonable values of R appear to be approximately 6.4% of total sales, 1.24 times net income, or 15.7% of equity. These figures are based on observations that Howard has made in the course of consulting with various companies.

    Using the exponential utility function seems like magic, doesn’t it? One assessment, and we are finished! Why bother with all of those certainty equivalents that we discussed above? You know, however, that you never get something for nothing, and that definitely is the case here. The exponential utility function has a specific kind of curvature and implies a certain kind of risk attitude. This risk attitude is called constant risk aversion. Essentially it means that no matter how much wealth you have – how much money in your pocket or bank account – you would view a particular gamble in the same way. The gamble’s risk premium would be the same no matter how much money you have. Is constant risk aversion reasonable? Maybe it is for some people. Many individuals might be less risk-averse if they had more wealth.
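
    As a quick numeric check of that assessment rule (a small sketch of the claim, not from the book): with R set equal to Y, the 50-50 win-Y/lose-Y/2 gamble comes out almost exactly at indifference under U(x) = 1 - e^(-x/R), which is why the largest acceptable Y approximates the risk tolerance.

      import numpy as np

      def exp_utility(x, R):
          # exponential utility with risk tolerance R, as in the excerpt
          return 1 - np.exp(-x / R)

      # Assessment gamble: win Y with probability 0.5, lose Y/2 with probability 0.5.
      # The largest Y you would still accept is approximately your R.
      R = 900.0
      for Y in (300.0, 900.0, 2000.0):
          eu = 0.5 * exp_utility(Y, R) + 0.5 * exp_utility(-Y / 2, R)
          verdict = "accept" if eu > 0 else "reject"
          print(f"Y = {Y:6.0f}: expected utility = {eu:+.4f} ({verdict} vs. U(0) = 0)")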
