Harry Selker and Alastair Wood say yes.
Recently in Decision Theory Category
A correspondent writes:
You may be interested in this article by Matthew Rabin which makes the point that you make in your article: if you are an expected utility maximizer then turning down small actuarially unfair bets (e.g. 50% win $120; 50% lose $100) implies that you would never accept a bet where you could lose $1000 (even if you might win an infinite amount of money). (But proved in more generality.) This was taught to me in the first year of my econ phd program (which I'm currently in!) as why you probably don't want to extrapolate from decisions over small bets to risk aversion in general, not as why we should throw out risk aversion and expected utility maximization completely. Of course, decision theorists do all kinds of things to try to "fix" this problem.
My reply: Yitzhak (as we called him in high school) wrote his paper after mine had appeared; unfortunately my article was in a statistics journal and he had not heard about it. (This was before I could publicize everything on the blog. And, even now, I think a few papers of mine manage to get out there without being noticed.)
I'm glad they teach this stuff in grad schools now--although, in a way, this still proves my point, in that the nonlinear-utility-function-for-money model is still considered such a standard that they feel the need to debunk it.
My correspondent replied: "I wouldn't call it a debunking....we still go on to use it as the workhorse model in everything we do...."
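To make the correspondent's calibration point concrete, here is a minimal sketch in Python. It assumes constant absolute risk aversion (CARA) utility, u(x) = -exp(-a x), which is an illustrative choice and a much narrower case than Rabin's general result; the function names are hypothetical. It finds the smallest risk-aversion coefficient at which the 50/50 win-$120/lose-$100 bet is turned down, then asks how large a loss such a person would accept in a 50/50 bet with unlimited upside.

```python
import math

def indifference_coefficient(gain, loss, lo=1e-9, hi=0.1, iters=200):
    """Bisect for the CARA coefficient a at which a 50/50 bet
    (win `gain`, lose `loss`) is exactly as good as declining it."""
    def gap(a):
        # Expected utility of the bet minus utility of declining (u(0) = -1).
        return 1.0 - 0.5 * math.exp(-a * gain) - 0.5 * math.exp(a * loss)
    # gap > 0 for small a (accept the bet), gap < 0 for large a (reject it).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def max_loss_with_unbounded_gain(a):
    """Largest 50/50 loss a CARA agent accepts when the win is unlimited.
    As the gain grows, exp(-a * gain) -> 0, so acceptance requires
    0.5 * exp(a * loss) <= 1, that is, loss <= ln(2) / a."""
    return math.log(2.0) / a

a_star = indifference_coefficient(gain=120, loss=100)
loss_cap = max_loss_with_unbounded_gain(a_star)
print(f"smallest a that rejects the small bet: {a_star:.5f}")
print(f"largest loss accepted at that a, even with unlimited upside: ${loss_cap:.0f}")
```

Anyone more risk averse than this (a larger a) tolerates an even smaller loss, so under these assumptions turning down the small bet does imply turning down any 50/50 bet that risks $1000, whatever the prize.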
I think there are good and bad things about this "workhorse model":
I don't like the term "risk aversion" (see here and here). For a long time I've been meaning to write something longer and more systematic on the topic, but every once in a while I see something that reminds me of the slipperiness of the topic.
For example, Alex Tabarrok asks, "Why are Americans more risk averse about medicine than Europeans?" It's a good question, and it's something I've wondered about myself. But I don't know what he's talking about when he says that "the stereotype is that Americans are more risk-loving" than Europeans. Huh? Americans are notorious for worrying about risks, with car seats, bike helmets, high railings on any possible place where someone could fall, Purell bottles everywhere, etc etc. The commenters on Alex's blog are all talking about drug company regulations, but it seems like a broader cultural thing to me.
But I'm bothered by the term "risk aversion." Why exactly is it appropriate to refer to strict rules on drug approvals as "risk averse"? In a general English-language use of the words, I understand it, but it gets slippery when you try to express it more formally.
The questions are no big deal, but what I find interesting is that medical schools do personal interviews at all. No place where I've ever worked has interviewed grad school applicants. It's hard for me to see what you get from it, that it would be worth the cost. I guess there must be quite a bit of psychology literature on this question.
Gustaf Granath writes:
I am an ecologist. I have been struggling with a problem for some time now and even asked some statisticians about this. It would be interesting for me (and maybe other people reading your blog) to hear your opinion. So far, I have not received a satisfying answer from anyone. I am doing a meta-analysis (in ecology with normal dist. data) using two different approaches. My first approach is a frequentist mixed-model, assuming independence of each sample. The second approach is a hierarchical Bayesian model, modelling the dependence structure in the data set (e.g., multiple outcomes from each study). I want to investigate if my covariates are important, and since I have many candidate covariates, I need to do some kind of model selection. My question is then: is there a model selection tool that can be applied on both approaches?
One of my favorite cartoons, by Charles Barsotti, shows a hot dog excitedly saying, "Hey everybody, we've been invited to a cookout!" I share this with my classes when I teach decision analysis to emphasize that different people (or, more generally, "agents") have different goals.
I was thinking about this point recently after a discussion here a couple weeks ago about the first-player advantage in Risk. Commenter Ken Williams suggested solving the problem by alternating games. In considering why this seems like a bad idea to me (beyond the impracticality of playing several games of Risk back to back), I realized that the relevant issue here is not fairness but rather fun, or playability. After all, for fairness alone you only need to randomize who starts first and that solves the problem. But, if there's a huge first-player advantage, the game still might not be so playable. It's not always a lot of fun to play a game if you know to start with that you're gonna lose.
Pete Lindstrom writes:
I was wondering if you could blog on the points discussed in the WSJ at this link. Apparently, there is a controversy over ways to use clinical data to calculate risks - one method adjusting for time and another using absolute numbers for the entire length of the study.
My (wholly inadequate) reply: This is interesting, but I have to say, I find the article pretty confusing. It's written in the standard journalistic style of going forward and backward in time, rather than in the scientific-journal style of presenting the data and models all in one place. If this was something I had to do, I'd puzzle through what's happening here. Luckily for me, I'm blogging just for fun and so I'll just let the question sit for others to worry about.
I get so irritated when economists and political scientists try to explain every sort of irrational behavior in life as being part of some utility function.
That's one reason I love this paper by Erik Snowberg and Justin Wolfers, "Explaining the Favorite-Longshot Bias: Is it Risk-Love or Misperceptions." They conclude that, yes, it's misperceptions:
Jenny has declared email bankruptcy but is watching her debts pile up again. I have (with effort) followed the Inbox Zero route. John Cook thinks email isn't the problem; on the other hand, he's reacting to a chorus of people telling him that email is ruining their lives, and maybe they have some good reason for saying this. Cook's commenter Heather appears to be staying barely above water with 200 messages in her inbox, while commenter Mr. Gunn recommends a technological solution.
From the comfort of my empty inbox, I thought of another big issue with email. Actually, a huge issue.
Email is a way to feel like you're working without actually thinking very hard. Sort of like blogging, actually--but blogging at least has the side benefit of sharing information with the world, focusing one's thoughts, etc. Actually, one of the unanticipated advantages of blogging, for me, was to organize the ideas that otherwise were going out into a million little emails.
I could--and have, often enough--put all of my work effort on some days into working with the inbox. What's the problem with that? First, I'm letting others drive my priorities. Some of this is fine--I certainly don't delude myself that I'm like that guy who sat in a room by himself and proved Fermat's Last Theorem--but at some point I think a little more direction to my work is useful. Second, inbox-handling just isn't usually the highest-quality thinking. It's just hard enough to occupy my mind without actually pushing me. I might as well just be playing Tetris for two hours.
My plan with Inbox Zero is to spend less total time on my email. I will spend some of the released time on more interesting, useful work, and it will also free up time for leisure.
The next step is to cut back on blogging. (Things will improve once we get the Scheduled Posting feature working again, so I can just write 10 blog entries, schedule them, and not have to think about them anymore.)
Lucian Bebchuk writes:
Financial firms seeking to retain talent are reported to be making substantial use of guaranteed bonuses, and the French Economy Minister recently called for limiting such bonuses. While many now focus on how guaranteed bonuses affect the level of pay, my [Bebchuk's] piece focuses on their effect on incentives. I show that guaranteed bonuses create perverse incentives to take excessive risks, and consequently could well be worse for incentives than straight salary. . . . The above discussion has implications that go beyond the question of guaranteed bonuses. It's now well recognized that bonus plans based on short-term results which may turn out to be illusory can produce excessive risk-taking, and that plans should therefore be structured to account for the time horizon of risks. But even though tying bonus plans to long-term results is desirable, it isn't sufficient to avoid excessive incentives to take risks. Bonus plans tied to long-term results can still produce such incentives if they reward executives for the upside produced by their choices but insulate them from a significant part of the downside. Bonus plans that provide executives with such insulation from downsides - either by establishing a guaranteed floor or otherwise - can seriously backfire. . . .
I can see why the bankers want such incentives--as a tenured professor, I can see the appeal of a system with a floor but no ceiling--but Bebchuk makes a convincing argument that the incentives aren't good. So maybe it's just as well that professors don't get fat bonuses as part of their compensation packages.
In the aftermath of linking to my article with Aaron and Nate about the probability of your vote being decisive, Conor Clarke writes:
If your decision to vote is motivated by the sense that "one vote can make a difference," you are being substantially less rational than someone who never leaves the house for fear of being killed by a meteor. Voting is irrational.
I completely disagree with this last statement, and I know that Aaron does also. Here's what we wrote on pages 4-5 of our article:
Alan Bergland writes:
I am a graduate student studying evolutionary biology at Brown University. I am writing you with what I think is a simple question, but I cannot seem to find an answer I feel comfortable with. I am trying to test a planned contrast using posterior distributions from a mixed model (the mixed model is calculated in lme4, and the simulations in arm). The model is fairly complicated, but at the end of the day, there are two fixed effect treatments with two levels each that I am interested in. Let's call these fixed effects "treatment A" (with levels A and a) and "treatment B" (with levels B and b). I am interested in the interaction between treatment A and treatment B, but have a specific hypothesis about the form of that interaction I would like to test. Specifically, I would like to test if ab is less than Ab & aB=AB.
As you and Jennifer Hill suggest in your Multilevel/Hierarchical models book (p. 20), I could test if ab
Once I can calculate the probability that Ab=AB, would it be reasonable to calculate the probability that (ab is less than Ab & aB=AB) as Pr(ab is less than Ab)*Pr(aB=AB)?
My reply:
1. Don't use arm's sim() function for lmer() objects. The current version is wrong; we're fixing it now, and the replacement should be available in about a month.
2. I don't recommend testing if aB=AB. At least in the sorts of problems I work on, no two comparisons are exactly equal. I think it makes more sense to estimate the relevant comparison, get the confidence interval, and make a graph. You could also do things like calculate the posterior probability (based on simulations) that ab < AB & |aB - AB|
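On the correspondent's question about multiplying probabilities: with posterior simulations in hand, the probability of a compound event can be read off directly as the fraction of draws in which the whole event holds, with no need to assume the two parts are independent. Here is a minimal sketch in Python using synthetic draws (not output from lmer() or sim()); the cell-mean values, the shared "A effect," and the tolerance `tol` are all made up for illustration.

```python
import random

def prob(draws, event):
    """Fraction of simulation draws in which `event` is true."""
    return sum(1 for d in draws if event(d)) / len(draws)

# Synthetic stand-in for posterior draws of the four cell means.
rng = random.Random(1)
draws = []
for _ in range(4000):
    eff_a = rng.gauss(0.0, 0.7)  # shared uncertainty in the A effect
    draws.append({
        "ab": 1.0 + rng.gauss(0.0, 0.5),
        "Ab": 2.0 + eff_a + rng.gauss(0.0, 0.5),
        "aB": 2.1 + rng.gauss(0.0, 0.5),
        "AB": 2.0 + eff_a + rng.gauss(0.0, 0.5),
    })

tol = 0.5  # hypothetical threshold for "aB and AB are practically equal"

p_less = prob(draws, lambda d: d["ab"] < d["Ab"])
p_close = prob(draws, lambda d: abs(d["aB"] - d["AB"]) < tol)
p_joint = prob(
    draws,
    lambda d: d["ab"] < d["Ab"] and abs(d["aB"] - d["AB"]) < tol,
)

print(f"Pr(ab < Ab)                    = {p_less:.3f}")
print(f"Pr(|aB - AB| < tol)            = {p_close:.3f}")
print(f"product of the two marginals   = {p_less * p_close:.3f}")
print(f"joint probability from draws   = {p_joint:.3f}")
```

The product of the two marginal probabilities equals the joint probability only if the two events are independent across draws, and there is no reason to expect that when the comparisons share parameters, as they do here through the A effect.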
Aaron Edlin just sent me this article by Pinar Karaca-Mandic and himself from 2006:
We [Edlin and Karaca-Mandic] estimate auto accident externalities (more specifically insurance externalities) using panel data on state-average insurance premiums and loss costs. Externalities appear to be substantial in traffic-dense states: in California, for example, we find that the increase in traffic density from a typical additional driver increases total statewide insurance costs of other drivers by $1,725-$3,239 per year, depending on the model. High-traffic density states have large economically and statistically significant externalities in all specifications we check. In contrast, the accident externality per driver in low-traffic states appears quite small. On balance, accident externalities are so large that a correcting Pigouvian tax could raise $66 billion annually in California alone, more than all existing California state taxes during our study period, and over $220 billion per year nationally.
Interesting stuff. I don't have it in me right now to check all these numbers, but the argument looks to be laid out clearly enough that the experts in the area can work it out. Also, it all seems to be about accidents to other cars; I'm not sure where they factor in the costs due to running over pedestrians.
Alex Tabarrok and Matthew Yglesias comment on "the marginal utility of money income." I'll have to write something longer about this some day, but for now let me just reiterate my current understanding that there is no such thing as a utility function. Rather than people arguing over the shape of the utility function, I hope they can move forward to thinking more directly about what people will do with their money.
From my earlier blog entry:
Greg Mankiw links to an article that illustrates the challenges of interpreting raw numbers causally. This would really be a great example for your introductory statistics or economics classes, because the article, by Robert Book, starts off by identifying a statistical error and then goes on to make a nearly identical error of its own! Fun stuff.
Pinchas Lev writes:
Greg Mankiw writes:
The next time you hear someone cavalierly point to international comparisons in life expectancy as evidence against the U.S. healthcare system, you should be ready to explain how schlocky that argument really is.
He points to the following claim by Gary Becker:
National differences in life expectancies are a highly imperfect indicator of the effectiveness of health delivery systems. For example, life styles are important contributors to health, and the US fares poorly on many life style indicators, such as incidence of overweight and obese men, women, and teenagers. To get around such problems, some analysts compare not life expectancies but survival rates from different diseases. The US health system tends to look pretty good on these comparisons.
Becker cites a study that finds that the U.S. does better than Europe in cancer survival rates and in the availability of hip and knee replacements and cataract surgery.
It makes a lot of sense to think of health as multidimensional, so that some countries can do better in life expectancy while others do better in hip replacements and cancer survival.
But I disagree with Mankiw's claim that it's "schlocky" to compare life expectancy. If the U.S. really is spending lots more per person on health care and really getting less in life expectancy compared to other countries . . . that seems like relevant information.
The above remark, which came in the midst of my discussion of an analysis of Iranian voting data, illustrates a gap--nay, a gulf--in understanding between statisticians and (many) nonstatisticians, one of whom commented that my quote "makes it sound that [I] have not a shred of a clue what a p-value is."
Perhaps it's worth a few sentences of explanation.
Catherine Rampell posted some attractive county-level Human Development Index maps and also discussed my criticisms of the index: I wrote, "if you go by the maps that everybody's linking to...you're pretty much just mapping state income and giving it a fancy transformation and a fancy new name." In its defense, she wrote:
Which is, I [Rampell] suppose, why the American Human Development Index, an adapted version of the U.N.'s original H.D.I., was created: because the U.N.'s index was not designed to capture the levels of variation that would occur within a single country. It was designed to make international comparisons.
This, to me, indicates the problem with the index. It was advertised as putting U.S. states on an international scale (Louisiana vs. Croatia and all that) but, if it needs to be redefined for the U.S., it seems to me that you're losing the universal interpretation, which is a big justification for the index in the first place. At this point, I'd rather map each of the components of the index separately (as Rampell actually does illustrate on her blog).
Greg Mankiw reports on an article by Betsey Stevenson and Justin Wolfers that finds:
By many objective measures the lives of women in the United States have improved over the past 35 years, yet we show that measures of subjective well-being indicate that women's happiness has declined both absolutely and relative to men. . . . Relative declines in female happiness have eroded a gender gap in happiness in which women in the 1970s typically reported higher subjective well-being than did men. . . .
Mankiw concludes: "It sounds like either the women's movement was a mistake or subjective happiness is not the right objective." The bit about the women's movement doesn't make sense to me--this reasoning seems to contradict the point Mankiw made a few days ago about the difficulty of making inferences based on n=1.
If I had to make a quick guess, I would've gone with the hypothesis of economic stress combined with the difficulty of having a job and taking care of the kids, but Stevenson and Wolfers discuss this issue (see pages numbered 15 and 17 and Table 3 of the linked article) and show that the data don't particularly support this hypothesis.
Getting back to Mankiw's comment: Setting aside the line about the women's movement--who knows, maybe the women's movement was a mistake, it's hard to say with n=1 what might have happened in its absence--I think he's right that subjective happiness is not an "objective." People have written about this: you don't become happy by aiming for happiness as an objective, you become happy by doing things that make you happy (or, just by being the kind of person who's happy in any case). It's an interesting issue, but I'm not sure how this is relevant to the Stevenson and Wolfers study.
P.S. If I were Betsey Stevenson, I might be a little unhappy that Mankiw referred to the authors unalphabetically as Wolfers and Stevenson!
P.P.S. Mankiw has fixed this and put the authors in the correct order.
This was pretty yucky:
Adderall, a stimulant composed of mixed amphetamine salts, is commonly prescribed for children and adults who have been given a diagnosis of attention-deficit hyperactivity disorder. But in recent years Adderall and Ritalin, another stimulant, have been adopted as cognitive enhancers: drugs that high-functioning, overcommitted people take to become higher-functioning and more overcommitted. . . . In 2005, a team led by Sean Esteban McCabe, a professor at the University of Michigan's Substance Abuse Research Center, reported that in the previous year 4.1 per cent of American undergraduates had taken prescription stimulants for off-label use; at one school, the figure was twenty-five per cent. . . . white male undergraduates at highly competitive schools--especially in the Northeast--are the most frequent collegiate users of neuroenhancers.
Lots of creepy stories if you follow the link. Or maybe I have the wrong attitude: I don't happen to need these sorts of drugs, so who am I to say that others shouldn't be able to attain similar levels of productivity through chemical means? Maybe I'm like somebody with two good legs, complaining about the development of a new super-efficient prosthetic limb.
Anyway, without passing judgment on any of this, I'd just have to say that I feel fortunate to have grown up in a noncompetitive environment, in which nobody was telling us that we had to work twice as hard to compete in the global marketplace, etc. I also consider myself fortunate to have grown up before success was defined as becoming super-rich. There really does seem to be more pressure now on students--more opportunities, sure, but more pressure, a tradeoff that I wouldn't like, I think.
David Fox writes:
As a 'classically' trained statistician who works on 'real' problems (mainly environmental ones) I have come to appreciate the utility and benefits of working within a Bayesian framework. I would not classify myself as a 'convert' but prefer to have an array of statistical tools from which I can select the most appropriate one for the job at hand. As they say - if all you've got is a hammer, then the whole world's a nail! On the issue of choice of priors, I believe this is an absolute strength in the evaluation and setting of environmental regulatory limits. In situations characterized by high levels of data paucity but rich with expert knowledge (albeit diverse), why would you choose to ignore the latter? However, I should get to the real purpose of this email. A rather fierce debate has been taking place among academics in our departments of Botany and Mathematics and Statistics about the use of a 'new' form of decision-making under extreme uncertainty. It is called Info-Gap (short for information gap) Theory and owes its existence to Prof. Yakov Ben-Haim at Technion in Israel (Ben-Haim 2006). Yakov is well known to the aforementioned academics - he visits here regularly and has done a remarkably good job at 'selling' his product - to the extent that some staff and students in our Botany department and The Australian Centre of Excellence in Risk Analysis (http://www.acera.unimelb.edu.au) have enthusiastically (and some would say, blindly) embraced this 'new' paradigm for decision-making under extreme uncertainty. I must plead mea culpa, having been swept up in the initial enthusiasm and published a couple of papers which use info-gap. However, I have a growing unease that IG is not 'new' but in fact a variant of existing methodologies. While not wishing to draw you into our local debate, I was wondering if you have ever heard of info-gap theory and if you have, do you have an opinion? Prof. Ben-Haim has recently launched his own web site (http://www.info-gap.com) presumably in response to the 'hi-jacking' of the Wikipedia entry (http://en.wikipedia.org/wiki/Info-gap_decision_theory) by IG's most strident local critic, Moshe Sniedovich. Sniedovich has also established a web site (http://info-gap.moshe-online.com/) and a quick look will demonstrate the ferocity of the debate.
Just today, the following paragraph in a paper I was reading [Hickey, G.L., Craig, P.S., and Hart, A. (2009) On the application of loss functions in determining assessment factors for ecological risk. Ecotoxicology and Environmental Safety, 72, 293-300] caught my attention:
"There do exist other forms of risk measurement. However, by a very well-known theorem of Wald (1950), any admissible decision rule is a Bayes rule with respect to some prior distribution (possibly an improper prior distribution), whereby admissibility is defined to mean that no other decision rule dominates it in terms of risk. It is therefore argued by many, for example, Bernardo and Smith (2000) that it is pointless to work in decision-theory outside the Bayesian framework". This accords with my own gut feeling that IG Theory is in fact a Bayes Rule with a non-informative prior.
My reply: I had never heard about Dr. Ben-Haim or his methods before receiving this email. I checked out the links but couldn't really see the point in this approach. The mathematics looked complicated and appeared to be a distraction from the more important goals of modeling the decision problems directly.
For some of my thoughts on Bayesian decision analysis, see chapter 22 of Bayesian Data Analysis (second edition). Bayesian decision analysis is a lot more flexible than people realize, I think, especially when used in the context of hierarchical modeling. See here for a brief discussion of my idea of "institutional decision analysis" and here for an example of Bayesian decision analysis in action.
In my article on the boxer, the wrestler, and the coin flip, I discuss some fundamental difficulties with Bayesian robustness and similar approaches.
Finally, I don't know that I'd agree with the statement that it's "pointless" to work in non-Bayesian decision theory. For me, I've found the Bayesian approach to do the job, but I can imagine there are settings where other methods can be useful. I'm not, however, a fan of those 1950's-style alternatives such as "minimax regret" and all the rest. I offer no comment on Info-Gap since I didn't put in the effort to try to understand exactly what it is.
In a long review of David Boyd Haycock's "Mortal Coil: A Short History of Living Longer," Steven Shapin discusses historical and recent proposals for extending the human lifespan. Shapin's article seems off to me: he just seems to spend too much time mocking the idea of extending life. He keeps bringing up silly examples such as the biblical Methuselah, Mel Brooks's 2000 year old man, and Old Tom Parr, who lived in the 1600s and claimed to have lived to 150 years old. (Shapin didn't even need to go back that far; I remember as a child reading of a bunch of Russians who claimed to have lived to about 150--as I recall, they ate a lot of yoghurt.)
This is all fine--after all, Shapin's a historian and is reviewing a history book--but he seems a bit too eager to laugh at modern life-extenders such as Roy Walford, who promoted the caloric restriction diet but died at age 79. Connecting to the Bible and even Old Tom Parr is fine, but why does Shapin keep bringing them up in his review? Not to mention bringing up the "Groundhog Day worry about endless boredom . . . the meaninglessness of life in a world without death . . ." I mean, talk about weak arguments in favor of mortality!
My guess is that to Shapin--as to me--the potential of much longer life is scary. I'd love to live to 150 or beyond (I think); certainly I'm not happy about the idea that my life is more than half over!--but, still, there's something scary here, and not just because of issues such as environmental devastation, global inequities, and so forth. I think the scary thing is: What if the calorie-restricters and vitamin-poppers are right? What if we could live to 150 if we only lived right? Then when we die peacefully in our beds at 80, we can be torn up about the 70 years that we're going to miss. I mean, who really wants this guy (with his "private blog" and all the rest) to be right? Far better to laugh it off or just not think about it. That's what I do.
P.S. I was also surprised that Shapin didn't discuss the theory (which I first read in Plagues and Peoples, I believe) that premodern hunter-gatherers lived healthier lives than those in agricultural societies, at least until recently. This would relate to the historical stories of the ancients having long lives.
Richard Posner defended the rationality of people who bought stocks during the bubble, writing:
People buy common stock when stock prices are rising. They (notoriously) bought houses during the early 2000s when house prices were rising. Since almost no one can predict the ups and downs of the stock market or the housing market, these purchases must have been motivated, Akerlof and Shiller argue, by something other than a rational investment strategy. But this is not at all obvious . . . Stocks have generally been a good investment, at least when held for a considerable period. . . .
I agree with Nate, who disagrees with Richard Posner by pointing out that, in fact, there was evidence that stocks were overpriced during the early 2000s, even at the time.
I'd like to add one comment. During all these bubble years, the experts were telling us over and over again how we should be buying stocks, how stocks were the best investment over the long term, and how we were all irrational for not putting more of our money into the stock market.
What's the logic here? People were being irrational by hesitating to buy stocks when they were going up, then they were finally being rational by buying stocks when they had very high prices?
I think all this discussion is hindered by the overloading of the term "rational." I imagine that just about everybody takes his or her money management seriously, and I'm sure people are trying to behave rationally with their investments. The trouble is that there are lots of rules out there to follow, so there's more than one way to be rational. I agree with Nate that Posner's implicit assumption--that people were following expert advice, and so they must have been applying (prospectively) good judgment--is misguided.
Phil went on vacation to Panama (among other places). I said, Panama? Who goes to Panama? Phil said, What do you mean, who goes to Panama? I said, people go to Costa Rica, they go to Guatemala, who goes to Panama?
Phil replied:
According to http://www.thinkpanama.com/panama-weekly/category/panama-tourism and http://www.travelime.com/news/533/ the number of tourists that visited Panama last year was almost exactly the same as the number that visited Guatemala, 1.6M in each case.
OK.
Ian Ayres suggests a gas tax that would start off with a rebate:
The government would offer a $500 advance tax rebate each year for every car you choose to sign up for the tax. In return, you would commit to pay an extra $1 for each gallon of gas you buy.
For obvious reasons, I like this idea--I'd like to get that extra $500. And since the government is giving out stimulus money anyway, now's the time to try it!
But I'm puzzled by their suggested implementation:
The actual tax paid would be based on miles driven and fuel economy. Thus a Chevy Impala rated at 19 m.p.g. would be charged $5.26 each 100 miles, while a Prius rated at 46 m.p.g. would be charged $2.17 per 100 miles.
Wouldn't it be simpler to just charge $1 per gallon of gas (with people who didn't get the rebate getting some sort of sticker exempting them from the tax)? Why have a complicated system based on miles per gallon when you can simply tax the gas itself?
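The quoted per-mile charges are just the $1-per-gallon tax applied at each car's rated fuel economy, which is why taxing the gas directly looks like the simpler route. A quick check of the arithmetic in Python (the function name is mine, and it assumes the car gets exactly its rated m.p.g.):

```python
def charge_per_100_miles(rated_mpg, tax_per_gallon=1.0):
    """Tax owed per 100 miles when driving exactly at the rated fuel economy."""
    gallons_per_100_miles = 100.0 / rated_mpg
    return gallons_per_100_miles * tax_per_gallon

for car, mpg in [("Chevy Impala", 19), ("Prius", 46)]:
    print(f"{car}: ${charge_per_100_miles(mpg):.2f} per 100 miles")
```

This reproduces the $5.26 and $2.17 figures in the quote. The two schemes would differ only for drivers whose actual mileage departs from the rating.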
In any case, I get Ayres's main point, which is that this rebate system is more a way to make things psychologically palatable to people than a realistic policy suggestion.
Perhaps another way to go on this would be to follow the "you polluted, you clean it up" policy, by which the tax is more directly tied to the cost of keeping the roads going, securing the supply of oil, cleaning the air, retrofitting coal plants to pollute less, etc. Maybe people would be less unhappy paying a higher gas tax if it were clearly going to maintaining the transport system and cleaning up the pollution it creates?
Chris Masse pointed me to this blog by Panos Ipeirotis, who argues that some online prediction markets give probabilities that are too good to be true:
Yes, I understand that it's frustrating to not be able to drive your expensive SUV at the maximum possible speed attainable by that magnificent machine . . . but, really, how fast do you really expect to be traveling on NYC streets in a snowstorm during rush hour???
On the often-interesting judgment and decision making listserv, George Christopoulos wrote:
It seems that in situations similar to the present economic situation economic agents are less willing to take risks and instead they prefer safer options. Could somebody point to studies that show this negative relationship between depression/recession (or generally when wealth resources are low) and increased (relative?) risk aversion?
There were a couple of responses on the list, but they seemed to me to miss the point slightly. The respondents referred to econ literature on stock market trading and on wealth and economic decision making, but my impression was that Christopoulos was looking for something more psychological: something like a meta-analysis of studies of uncertainty aversion (I prefer to avoid the term "risk aversion" or even "loss aversion," for reasons I've discussed at length on this blog) over time, to see if subjects in an identical experiment show more uncertainty aversion in bad times than good.
The next step would be to analyze such data to separate out, to the extent possible, effects of individual economic status and national trends. The hypothesis might be that both have effects: that people suffering personal reversals might show more uncertainty aversion, and that, on top of this, everyone might tend to show more uncertainty aversion during economic downturns.
Could be an interesting study, although I doubt that such data are available.
Seems like a good idea to me. This story reminds me of when my course listing mysteriously got removed from the department's website. It took something like two years to get it back up.
Mark Thoma has an interesting discussion of the challenge that the economics profession, and individual economists, have when they give policy recommendations.
Mark's basic point goes as follows. Consider the following four stages of a model:
(a) assumptions about fundamental principles of how the world works,
(b) normative principles (that is, fundamental goals, views about how the world should be),
(c) conclusions about the likely effects on policy,
(d) recommendations about policies.
In any rigorous economic model, there should be a mapping leading from (a) to (c). Further reasoning (possibly mathematical modeling, as in cost-benefit analysis) will take you from (b) and (c) to (d).
That's all fine. But Mark's point is that the reasoning can go the other way too: start with (b) and (d), and then you can figure out what (c) needs to be, and then you can go back one more step and figure out what model (a) you need to get started! Even if economists are not doing this reasoning-from-conclusions-to-assumptions explicitly, you could well believe it's going on implicitly as well as being induced by various pressures such as the selection of what research results to report and even what problems to work on.
This is inevitable, and I discuss it in the decision analysis chapter (22, I think it is) of Bayesian Data Analysis. We call it the garbage-in-garbage-out problem: If you can come up with any decision you'd like by just altering the inputs of your analysis, then what's the point of decision analysis (or, by extension to the above-linked example, economic modeling) at all?
My answer is something that I call "institutional decision analysis," which has two principles:
1. It can be a good idea to provide reasoning to justify your decisions. As an individual person, you might not have to justify your personal decisions to anyone (except to your spouse), but an institution--whether it be a business, a government agency, a nonprofit organization, or some other grouping--often needs some path of bread crumbs connecting assumptions to recommendations. (Here, I carefully say "connecting" rather than "leading from" to be agnostic about the direction of the reasoning.)
2. As Mark noted, an overall decision recommendation on anything important is likely to depend on assumptions to such an extent that it's probably fair to say that the analyst is reasoning from conclusions to assumptions (from (d) to (c) and then to (a), in my above notation). But, even then, formal decision analysis can be useful in making relative recommendations. This is the point that we made in our article about decision making for home radon [link fixed]. In the economics context, this might suggest that economists of different political persuasions could still give useful recommendations about how to spend money or cut taxes, or where in the economy such policies would make more or less sense.
Gur Huberman writes, regarding the Edlin, Gelman, and Kaplan article in The Economist's Voice:
Can you extend the charity/rationality argument to explain why people in non battleground states (e.g., NY) vote? Even if charity motivation is a partial explanation for voting, an implication would be that voter turnout is higher in battleground states, other things being equal. However, I am afraid that this prediction is consistent with many other explanations of why people vote. Another issue that has intrigued me for years: I am under the impression that voter turnout is lower in local elections and in midterm elections. In midterm elections there's less at stake, so your charity story seems to cover that. But, selfishly speaking, it may well be that who my mayor is may have a stronger impact on my life than who my president is. (Quantifying this last statement is challenging.) If so, why am I more likely to vote in a presidential election than in a mayoral one? Your charity theory may help answer the question.
My reply:
1. I think there are many reasons for voting, and in NY it's not particularly rational for instrumental reasons.
2. In our article a couple years ago in the journal Rationality and Society, Edlin, Kaplan, and I discuss the coexistence of many different models for voting. For example, there is the "psychological" model that we are more likely to vote in an election that more people are talking about. People are more likely to talk about an election that is close and that is viewed as important. So the psychological and economic/rational explanation coincide in this way. (Similarly, you could consider psychological or economic rationales for purchases. For example, if I buy something on sale, I'm economically motivated to save money and psychologically motivated because of the pleasure in "getting a deal.") These two things reinforce each other; I see them as parallel, not competing, explanations.
3. Your mayor may have more of an impact on _your_ life, but total impact is proportional to total #people affected. And that doesn't even get into foreign policy (not an issue for local politics unless you happen to live in, say, Berkeley, California).
John Quiggin sent me this article of his from 1987 that made the same argument as my paper with Edlin and Kaplan on why and how it's rational to vote. In his article, Quiggin wrote:
There is strong evidence that voting behaviour is both ends-directed and rational. That is, electors choose to vote because of the effects their vote will have, and do not vote if these effects are insufficient to outweigh the costs of voting. However, as Downs' paradox shows, rationality and egoism together imply non-voting. The evidence suggests that egoism is the postulate which must be abandoned. . . . voters' interest in political information increases with the importance of political choices. Once again, this is consistent with rationality but not with egoism.
Our article had more math and more focus on U.S. politics but the basic point is the same.
Also let me use this as yet another excuse to plug a wonderful article, The Norm of Self-Interest, by psychologist Dale Miller, in which he argues the following:
A norm exists in Western cultures that specifies self-interest both is and ought to be a powerful determinant of behavior. This norm influences people's actions and opinions as well as the accounts they give for their actions and opinions. In particular, it leads people to act and speak as though they care more about their material self-interest than they do.
Ted Dunning sent me this graph:
So, how do the polling data compare to the contract prices from Intrade on the day before the election? Below is a graph with a data point for each state, with the horizontal axis representing the polling data and the vertical axis representing the Intrade contract price.

The quick message that I get from here is that Intrade prices are way biased toward 50/50. For example, the price for DC is something like .04, which is ridiculous. (To two decimal places, it should certainly be .00).
Jim Dannemiller writes:
I ran across your discussion on retrospective power and power calculations in general, and I thought that you might be interested in this manuscript [link fixed] that Ron Serlin and I are working on at present.
Their idea is to formally put costs into the model and then optimize, instead of setting Type 1 and Type 2 error rates ahead of time. I'm already on record as not being a fan of the concepts of Type 1 and Type 2 errors; that said, we do work in that framework in chapter 20 of our book, and my guess is that it does make sense to put costs into the model explicitly. So I imagine this article by Dannemiller and Serlin is a step forward.
P.S. Jim also said he liked our Teaching Statistics book!
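The idea described above, putting costs into the model and then optimizing, can be illustrated with a rough sketch. This is not taken from the Dannemiller and Serlin manuscript; it is a generic one-sided z-test in which every number (effect size, prior probability of the null, cost per subject, costs of the two kinds of error) is a made-up assumption, and the function name `expected_cost` is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for tail probabilities

def expected_cost(n, z_crit, effect=0.3, sd=1.0, prior_null=0.5,
                  cost_per_subject=1.0, cost_false_positive=2000.0,
                  cost_false_negative=1000.0):
    """Expected total cost of a one-sided z-test with n subjects.

    All default values are illustrative assumptions, not estimates.
    """
    std_error = sd / n ** 0.5
    # Probability of rejecting when the true effect is zero.
    type1 = 1.0 - STD_NORMAL.cdf(z_crit)
    # Probability of failing to reject when the true effect equals `effect`.
    type2 = STD_NORMAL.cdf(z_crit - effect / std_error)
    return (cost_per_subject * n
            + prior_null * type1 * cost_false_positive
            + (1.0 - prior_null) * type2 * cost_false_negative)

# Grid search over sample size and critical value instead of fixing
# the error rates in advance.
candidates = [(n, z / 100) for n in range(5, 501, 5) for z in range(0, 401, 5)]
best_n, best_z = min(candidates, key=lambda nz: expected_cost(*nz))

print("sample size:", best_n, "critical z:", best_z)
print("implied Type 1 rate:", round(1.0 - STD_NORMAL.cdf(best_z), 3))
```

In this setup the error rates come out of the optimization rather than going into it, so changing the assumed costs changes both the sample size and the implied threshold.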
In his aforementioned chapter, Stephen Senn writes:
"In order to interpret a trial it is necessary to know its power": This is a rather silly point of view that nevertheless continues to attract adherents. A power calculation is used for planning trials and is effectively superseded once the data are in. . . . An analogy may be made. In determining to cross the Atlantic it is important to consider what size of boat it is prudent to employ. If one sets sail from Plymouth and several days later sees the Statue of Liberty and the Empire State Building, the fact that the boat employed was rather small is scarcely relevant to deciding whether the Atlantic was crossed.
I used to think this too, but after writing my paper with David Weakliem, I've changed my stance on the relevance of retrospective power calculations. In that article, Weakliem and I discussed the problem of Type M (magnitude) errors, where the true effect is small but it is estimated to be large. One problem with underpowered studies is that, when they do turn up statistically significant results, they tend to be huge compared to the true effect sizes.
On the other hand, large studies can be a huge waste of effort, so I don't really know what I would recommend for medical research.
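The Type M point can be checked with a small simulation. This is a minimal sketch under assumed numbers: the estimate is taken to be normally distributed around the true effect, "statistically significant" means more than 1.96 standard errors from zero, and the effect sizes and standard errors are invented for illustration. The function name `simulate_significant_estimates` is hypothetical.

```python
import random
import statistics

def simulate_significant_estimates(true_effect, std_error, n_sims=100_000, seed=0):
    """Return (exaggeration ratio, power) for a study design.

    Estimates are drawn from Normal(true_effect, std_error). The
    exaggeration ratio is the mean absolute size of the statistically
    significant estimates divided by the true effect.
    """
    rng = random.Random(seed)
    threshold = 1.96 * std_error
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > threshold:
            significant.append(abs(estimate))
    ratio = statistics.mean(significant) / true_effect
    power = len(significant) / n_sims
    return ratio, power

# Underpowered design: standard error is five times the true effect.
low_power_ratio, low_power = simulate_significant_estimates(1.0, 5.0)
# Well-powered design: standard error is a quarter of the true effect.
high_power_ratio, high_power = simulate_significant_estimates(1.0, 0.25)

print(f"low power:  power={low_power:.2f}, exaggeration={low_power_ratio:.1f}x")
print(f"high power: power={high_power:.2f}, exaggeration={high_power_ratio:.2f}x")
```

In the underpowered design, any estimate that clears the significance threshold must be at least 9.8 times the true effect, so the significant results are necessarily large overestimates; in the well-powered design the significant estimates sit close to the truth.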
Following this discussion of his statistical advice, Stephen Senn sent me this chapter on "Determining the Sample Size", which has a great beginning:
Clinical trials are expensive, whether the cost is counted in money or in human suffering, but they are capable of providing results which are extremely valuable, whether the value is measured in drug company profits or successful treatment of future patients. Balancing potential value against actual cost is thus an extremely important and delicate matter and since, other things being equal, both cost and value increase the more patients are recruited, determining the number needed is an important aspect of planning any trial. It is hardly surprising, therefore, that calculating the sample size is regarded as being an important duty of the medical statistician working in drug development.
I'll pile on the references by linking to chapter 20 of my book with Jennifer, which, compared to Senn's book, has a bit more calculation and a bit less discussion. I suspect many readers will benefit from reading both. (Full link to the book here.)
Finally, here's a link with a couple more references, including a great little article by Russ Lenth.
This discussion from Keynes (from Robert Skidelsky, linked from Steve Hsu) reminds me of a frustrating conversation I've sometimes had with economists regarding the concept of "risk aversion."
Mankiw calculates that McCain's tax plan would tax him at a marginal rate of 83%, while Obama's would tax his marginal dollar at 93%. He concludes:
The bottom line: If you are one of those people out there trying to induce me [Mankiw] to do some work for you, there is a good chance I will turn you down. And the likelihood will go up after President Obama puts his tax plan in place. I expect to spend more time playing with my kids. They will be poorer when they grow up, but perhaps they will have a few more happy memories.
I don't quite follow Mankiw's reasoning on the marginal tax rates, except I do get his point that his marginal dollars are all ultimately going to his kids--none of it will be spent in his lifetime, so in that sense he's talking about different varieties of an estate tax.
I'm more interested in the decision implications.
To start with, it does sound like Mankiw's kids are already well provided for, and, although I'm sure they'd disagree with me on this, it's not clear that they would benefit from having more money in the bank when their parents are gone.
So, from that point of view, the question is why Mankiw isn't already spending more time playing with his kids? I can't speak for him, but for me, I have to say that it can be fun to work (or even to write blog entries). But, more than that, I feel a sense of obligation to get things done. At some level, getting paid is part of the motivation, but in any particular example I'm not quite sure how it fits in. I do lots of work things that pay me $0; I think they're important, so I do them.
On the other hand, if I really, really didn't need the money, I could set my salary to $0 and spend the money on extra postdocs. That would be pretty cool but I can't really live on $0 and keep my current lifestyle.
For Mankiw, I'm not sure; maybe he makes enough from his textbooks that he doesn't need much of his academic salary and could possibly do more by converting it into postdocs and research assistants. Or maybe he already has more research assistants than he knows what to do with; I don't know. But his division of waking hours into "working" or "playing with kids" is, I would guess, not very sensitive to the marginal tax rate.
Sustainable Energy and the challenge of connecting technical findings to the policymaking literature
I went to the webpage of physicist / computer scientist David MacKay and found that he had written a book on energy policy for general audiences. It's basically a physics book where he computes the energy costs of different aspects of our lifestyles and then estimates the potential for getting power from various non-carbon-emitting sources. It's a fun read and I recommend taking a look. I don't know enough to offer any serious endorsement or criticism of his claims, but he presents his reasoning very clearly, which I like. He has lots of graphs, and I view his book as being somewhat in the spirit of Red State, Blue State, as organizing a bunch of information so that the reader is in a better position to make his or her own judgments. (Again, I'm in no position to endorse or criticize MacKay's specific recommendations.)
My main suggestion is that MacKay follow up on one of his suggestions and connect his work to that of advocates on different sides of the issue. He begins his book as follows:
I [MacKay] recently read two books, one by a physicist, and one by an economist. In Out of Gas, Caltech physicist David Goodstein describes an impending energy crisis brought on by The End of the Age of Oil. . . . In The Skeptical Environmentalist, Bjørn Lomborg paints a completely different picture. "Everything is fine." Indeed, "everything is getting better." Furthermore, "we are not headed for a major energy crisis," and "there is plenty of energy." How could two smart people come to such different conclusions? I had to get to the bottom of this.
This sounded good, and I was looking forward to the resolution. But in all the rest of the book, MacKay never mentioned Goodstein or Lomborg again (except once in a brief aside to say that their books are "full of interesting numbers and back-of-envelope calculations," and once to cite Lomborg's estimate of bird deaths caused by wind turbines)!
This was a letdown. I think MacKay's argument would be stronger if he could loop back and address the arguments of Goodstein, Lomborg, and others.
In response to something Robin Hanson wrote on his blog (sorry I can't find the exact link, I think it was at the end of July, 2008), I wrote:
Dan Lakeland writes:
I recently enrolled as a PhD student in a civil engineering program. My interest could be described as the application of data and risk analysis to engineering modelling, design methods, and decision making. The field is pretty ripe, and infrastructure risk analysis is a common topic these days, but the simulations and statistical approaches taken so far have been a bit unsatisfactory. For example people studying the impact of bridge failures during earthquakes on the local economy might assume a constant cost per person-hour of delay throughout the rebuild period, or people might build statistical models of probability of building collapse, but I would call them pretty much prior distributions, not really based on much data, or based on a finite element computer model of the physics of a single model building.
Several years ago I was at the Library of Congress and asked where to go to get to the stacks. The guard told me that the stacks were closed. I asked, when did that happen? He replied that the Library of Congress had never had open stacks. The funny thing is, I knew he was wrong, because in high school I went to the Library of Congress a couple of times and I remember roaming the stacks, which were positioned sort of like spokes in a wheel. It was so cool to go to the stacks and see all the books written by an author. (I also remember looking for the book, "Get Even: The Complete Book of Dirty Tricks," which was in the card catalog--remember those?--but not on the shelves. But only Members of Congress can check books out of the Library of Congress. Hmmm.....) It's annoying how people can be so sure of themselves. The guard had probably been working there 10 years and so he thought he knew everything about the place.
See here:
We humans seem to be born with a number line in our head. But a May 30 study in Science suggests it may look less like an evenly segmented ruler and more like a logarithmic slide rule on which the distance between two numbers represents their ratio (when divided) rather than their difference (when subtracted).
This is consistent with our analysis in chapter 5 of our book of decisions of Bangladeshis about whether to switch wells because of arsenic in drinking water. Among households with dangerous wells (arsenic content higher than 50 (in some units)), we predicted whether a household switches wells, given two predictors:
- distance to the nearest safe well;
- arsenic level of their existing well.
The data were consistent with the model that people weight "distance to nearest safe well" linearly but weight "arsenic level" on the log scale. As we discuss in our book, this makes psychological sense: distance is something you perceive directly and linearly, by walking (it takes twice as much time and effort to walk 200m as to walk 100m), whereas arsenic level is just a number and, as such, going from 50 to 100 seems about the same, psychologically, as going from 100 to 200 or 200 to 400--even though, in reality, that last jump is four times as bad as the first (arsenic being a cumulative poison).
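The arithmetic behind the log-scale point can be written out in a few lines. This is only an illustration of the two scales using the arsenic levels mentioned above; it is not the fitted model from the book, and the helper name `scale_changes` is hypothetical.

```python
import math

def scale_changes(low, high):
    """Return (linear change, log-scale change) for a jump from low to high."""
    return high - low, math.log(high / low)

for low, high in [(50, 100), (100, 200), (200, 400)]:
    linear, logged = scale_changes(low, high)
    print(f"{low:>3} -> {high:>3}: linear change = {linear:>3}, log change = {logged:.3f}")
```

Each doubling is the same step on the log scale (log 2, about 0.69), while on the linear scale the jump from 200 to 400 is four times the jump from 50 to 100.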
John Sides points to a news report of a hit-and-run driver who "struck and slightly injured a pedestrian while driving his sports car in downtown Washington" and then said, "I didn’t know I hit him…I feel terrible…[But] he’s not dead, that’s the main thing." He was fined $50.
It often seems to happen this way, that punishments for reckless driving are much less severe than the effect of the crime itself. (Even being "slightly injured" in a car crash has gotta be a personal loss of much more than $50, not even counting hospital costs.) This is particularly striking given that not every offender is caught, so you might think that punishments would be higher for their deterrent value.
Why are the punishments so low? One reason is that many of the legislators who write the laws and judges who decide sentencing are themselves dangerous drivers at times, and I suspect that it can be easier for them to identify with the criminals than the victims. (If Gary Larson were writing the laws, there'd probably be a death penalty for running over a dog.)
But I think there's something deeper going on, having to do with retrospective and prospective decision analysis. In the driving example, it goes like this.
1. Suppose somebody (e.g., Dick Cheney) is driving dangerously but nobody is hurt, or not seriously. Then the response is that no serious harm was done--it's just one of those things--so no point in having a big punishment.
2. Suppose somebody (e.g., Ted Kennedy or Laura Bush) is driving dangerously and seriously injures or kills someone. Then the response is that it's a terrible tragedy but very bad luck, so what is gained by seriously punishing the driver.
The issue is that deaths and serious injuries are also rare--even if you drive recklessly, it's extremely unlikely that you'll kill someone in any given outing. So you're stuck between punishing the almosts and might-have-beens or really laying down the hammer on the serious cases. No option seems quite right. Although I guess in this case the pedestrian will do all right because he'll probably sue the driver for a couple of million dollars.
Alex Tabarrok has an interesting discussion of saving strategies. Alex writes:
There are people who don't save much because they have very low incomes, their behavior does not seem to be in error, especially when we take into consideration the various welfare programs that will cover people in their old age. . . . So let's focus on people with moderate to high incomes. . . . Over confidence and in particular the idea that we are special and will live a long life suggests the error is saving too much. . . . Availability bias probably also suggests we save too much - we see people who saved too little in the street but the ones who saved too much are dead and gone. . . . I do not know which error is more prevalent but if we are to be neither spendthrift nor miser we need to recognize both types of error.
My guess is that Alex is a little too optimistic about people's savings strategies, given all the credit card debt out there. Also, as some of his commenters note, it's easy for people to get used to a particular spending pattern, and it's easier to ramp it up than to scale it down. So, for psychological purposes, it might be better to plan for a gradually increasing standard of living than something completely flat over time.
But I'm sympathetic with Alex's general point that both kinds of errors are relevant. It reminds me of when I asked the students in my decision analysis class to raise their hands if they'd never missed a flight. I then said to them: You go to the airport too early! A retrospective rather than a prospective analysis but still essentially correct, I think.
This is ok, but I like my solution better.
Eddie Randolph writes,
I was wondering if you had any thoughts on the World Rock Paper Scissors Contest currently being held. Do you think it will be won by someone who plays intuitively, or a master strategist? If you think the strategist will win, do you think they will employ strategies from the book you pointed to in your blog?
My reply: I love rock-paper-scissors but I'm afraid I have no deep theories. I'd guess that it's pretty random, that whoever wins one year wouldn't have much better than a random chance of doing well the next year.
This entry received the following comment:
You can't compare each round as a parallel test because they move the tee boxes and hole locations every day. This makes the course much more difficult on some days and that is what separates the best from the worst. a href="http://www.sporthaley.com" Women's Golf Clothing /a
(I've purposely unlinked the html.) Also, the commenter's name is given as "Women's Golf Clothing," and the above link is given as the referring url.
I don't know what to make of this sort of thing. It's hard for me to believe it's pure spam--could a bot really read the entry and make that comment? But what person would sign his or her comment as "Women's Golf Clothing"? We get this kind of semi-spam on the blog comments now and then, and I'm never sure what to think about it.
P.S. Just to be clear, here's another comment that clearly is 100% spam. It's for the blog entry on "Baby Names," it's by "baby boy," and it says, "I can think of "yang" names, you can never know." That's clearly spam, unlike the above comment which was human-generated.
We went by the pool in Central Park today and it was drained of water. Wha . . . ? It was only about 100 degrees out there today.
Aaahhhh, I see: "Pool Opens July 1st and closes after Labor Day". (Note also that in the picture on the website, the pool has no people in it. That's no coincidence: when there are people in it, it's jam-packed with people.)
OK, I have an idea: no A/C in city offices until all the swimming pools open. If the city doesn't want to open the pool until July 1st, no problem: it's probably not so hot outside, and I'm sure the desk workers can make do with fans.
And, just in case they decide to do what they did in previous years and open up only half of the pool (I'm not kidding, they roped half the pool off so that a few zillion people were crammed into 50% of the space), then they can do the same thing in city offices: half of the offices can have A/C, half won't. I'm not sure what to do about the mayor. Maybe give him A/C half the time?
P.S. When open, the pool hours are from 11-3 and 4-7. Of course, this means that city offices should also be air conditioned only during these times.
We're more likely to listen to expensive advice. Dog bites man, or is there something I'm missing here?
P.S. I know from personal experience that if you raise your consulting fee high enough, people do start to choke on the price and either say no or reduce the number of hours they want. Both of which are good outcomes from my perspective.
This article by Tim Harford reminds me of an example I used to give in my decision analysis class:
When I was younger, people used to complain about candy bars getting smaller and smaller. (For example, Stephen Jay Gould has a graph in one of his books showing the size of the standard Hershey bar declining from 2 ounces in 1965 gradually down to 1.2 ounces in 1980, and for that matter I can recall tunafish cans gradually declining from 8 ounces to 6 ounces.) And I remember going to the candy machine with my quarter and picking out the candy bar that was heaviest--I don't remember which one--even if it wasn't my favorite flavor, to get the most value for the money.
But now I realize that, rationally, candymakers should charge more for smaller candy bars. The joy from eating the candy is basically discrete--I'll get essentially no more joy from a 1.7-ounce bar than from a 1.4-ounce bar. But the larger bar will be worse for my health (no big deal if I eat just one, but with some cumulative effect if I eat one every day, similarly with the sodas and so forth). And, given the well-known fact that nobody can eat just part of a candy bar, I get more net utility from the small bar, thus they should charge more.
See here for a link to a research study on this.
When you leave a voice mail, please say your name and phone number slowly and clearly. Thank you.
Michail Fragkias writes,
Chris Wiggins points me to this column by John Tierney reporting research by Keith Chen on cognitive dissonance--that well-known phenomenon whereby we change our preferences to match our pre-existing decisions (for example, not wanting to hear bad news about one's preferred presidential candidate). Chen wrote a paper claiming that cognitive dissonance is not nearly as important as everyone thought it was. For Tierney's column, Chen writes,
All of the studies I [Chen] talk about take as their basic model a famous and incredibly influential experiment by Jack Brehm in 1956; the first study, in fact, which psychologists took to demonstrate cognitive dissonance. In Brehm’s study and its modern variants, subjects are first asked to rate or rank a bunch of goods based on how much they like them. Then, subjects are offered a choice between two of the goods they just rated, and are told they can take the good they choose home with them as payment for the study. They are then asked to re-rate all of the original goods; cognitive dissonance theory suggests that people would have a better opinion of the good they choose after choosing it than before. So, for example, subjects may first be asked to rank 15 goods from 1 to 15, with 1 being the best and 15 being the worst. Then, a subject would be asked to choose between two goods they initially ranked similarly, say the goods they ranked 7 and 9. After making this choice, psychologists have looked at whether, if asked to rank these goods again, the chosen good rises in rank, and if the rejected good falls. This seems like a perfectly reasonable thing to look at; but there’s a big problem in how this has been done.
The problem is, when you ask subjects to choose between goods they ranked 7 and 9 (call these goods A and B), many subjects choose good B, (the good they initially ranked lower). Typically about one-quarter to one-third of subjects do this. Now, why people do this isn’t entirely clear, but one thought is that it indicates that asking subjects to rank goods from best to worst isn’t a perfect measure of how they feel. Some of them might not take the task that seriously; some might get confused by all the choices. So while they’ll initially rank B below A in the list of items, when they actually focus just on those two items they realize they actually prefer B to A.
The real problem, though, is what psychology studies did with subjects who “switched” — that is, those who chose the good they initially said they liked less. What many studies did (following the original Brehm study) is to exclude from the study those subjects who choose good B. Is this a problem that can bias their findings? Yes, if we think that the subjects being excluded from study are systematically different from those who aren’t. Specifically, when we drop subjects who choose good B over good A, we may be systematically dropping subjects who like good B (more than good A). Ignoring this possibility is like ignoring Monty’s choice and what it tells us about where the car is likely to be. By throwing out subjects, a study “stacks the deck” of remaining subjects with people who like good A more than they like good B. In fact, all remaining subjects have signaled they feel that way, twice. Maybe it shouldn’t be a surprise then, when asked to re-rank these items, the rank of good A rises; it originally ranked 7 from a larger group, then those who on second thought didn’t like it so much, were dropped.
Many studies that examine “spreading” look at how much A goes up and B goes down only for those people who chose A (as just described). Others look at whether the chosen good (A if you choose A and B if you choose B) went up and the non-chosen went down for everyone. This is also problematic for exactly the same reason; we shouldn’t be surprised that people like the things they choose, and the experiment needs to take that into account before it can correctly claim that dissonance is occurring.
This is interesting. I'm certainly sympathetic to the argument that preferences don't exist, independently of the settings in which they are chosen. (See here and here.) On the other hand, the desire to avoid cognitive dissonance seems real to me. I'd like to see how this work fits into the general literature on the topic. (Also, I'm not quite sure why Tierney calls this "social psychology." Isn't it "cognitive psychology"? I'm sure there's something I'm missing here.)
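The selection effect in Chen's argument quoted above can be sketched with a simulation in which preferences never change at all. This is an assumption-laden toy model, not a reconstruction of any of the studies: each subject has a fixed true liking for A relative to B, ratings and choices are that liking plus independent noise, and true likings are assumed to vary much more across subjects than the rating noise does. The function name `simulate_spreading` and all parameter values are hypothetical.

```python
import random
import statistics

def simulate_spreading(n_subjects=100_000, pref_sd=3.0, noise_sd=1.0, seed=1):
    """Return mean rating change (second minus first) for all retained
    subjects and for only those who chose A. No preference ever changes."""
    rng = random.Random(seed)
    change_all, change_chose_a = [], []
    for _ in range(n_subjects):
        true_gap = rng.gauss(0.0, pref_sd)  # fixed liking for A minus liking for B
        first_rating = true_gap + rng.gauss(0.0, noise_sd)
        # Keep only pairs initially rated close together with A slightly ahead.
        if not 0.0 < first_rating < 1.0:
            continue
        # The choice is a second noisy reading of the same fixed preference.
        chose_a = true_gap + rng.gauss(0.0, noise_sd) > 0.0
        second_rating = true_gap + rng.gauss(0.0, noise_sd)
        change = second_rating - first_rating
        change_all.append(change)
        if chose_a:
            change_chose_a.append(change)
    return statistics.mean(change_all), statistics.mean(change_chose_a)

mean_all, mean_chose_a = simulate_spreading()
print(f"mean rating change, everyone retained: {mean_all:+.2f}")
print(f"mean rating change, chose A only:      {mean_chose_a:+.2f}")
```

Under these assumptions, dropping the subjects who chose B leaves a group whose rating of A relative to B appears to rise after the choice, even though the model contains no dissonance mechanism. How large the apparent "spreading" is depends on the assumed noise levels.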
James Annan writes,
I wonder if you would consider commenting on Marty Weitzman's "Dismal Theorem", which purports to show that all estimates of what he calls a "scaling parameter" (climate sensitivity is one example) must be long-tailed, in the sense of having a pdf that decays as an inverse polynomial and not faster. The conclusion he draws is that using a standard risk-averse loss function gives an infinite expected loss, and always will for any amount of observational evidence.
I looked up Weitzman and found this paper, "On Modeling and Interpreting the Economics of Catastrophic Climate Change," which discusses his "dismal theorem." I couldn't bring myself to put in the effort to understand exactly what he was saying, but I caught something about posterior distributions having fat tails. That's true--this is a point made in many Bayesian statistics texts, including ours (chapter 3) and many that came before us (for example, Box and Tiao). With any finite sample, it's hard to rule out the hypothesis of a huge underlying variance. (Fundamentally, the reason is that, if the underlying distribution truly does have fat tails, it's possible for them to be hidden in any reasonable sample. It's that Black Swan thing all over again.) I think that Weitzman is making some deeper technical point, and I'm sure I'm disappointing Annan by not having more to say on this . . .
Seth is skeptical of skepticism in evaluating scientific research. He starts by pointing out that it can be foolish to ignore data, just because they don't come from a randomized experiment. The "gold standard" of double-blind experimentation has become an official currency, and Seth is arguing for some bimetallism. To continue with this ridiculous analogy, a little bit of inflation is a good thing: some liquidity in scientific research is needed in order to keep the entire enterprise moving smoothly.
As Gresham has taught us, if observational studies are outlawed, then only outlaws will do observational studies.
I think Seth goes too far, though, and that brings up an interesting question.
I was sorry to see Steven Levitt repeating the claim about driving a car being good for the environment. I wrote about this last week when it appeared in the other New York Times column of John Tierney, but perhaps it's worth repeating:
Chris Paulse sends in this amusing advice which could be used as an example in teaching decision analysis.
Jim Hammitt sends along this interesting report comparing different measures of risk when evaluating public health options:
There is long-standing debate whether to count "lives saved" or "life-years saved" when evaluating policies to reduce mortality risk. Historically, the two approaches have been applied in different domains. Environmental and transportation policies have often been evaluated using lives saved, while life-years saved has been the preferred metric in other areas of public health including medicine, vaccination, and disease screening. . . Describing environmental, health, and safety interventions as "saving lives" or "saving life-years" can be misleading. . . . Reducing the risk of dying now increases the risk of dying later, so these lives are not saved forever but life-years are gained. . . .
We discuss some of these issues in our article on home radon risks. Beyond this, I have two comments on Jim Hammitt's paper:
1. I wish he'd talked about Qalys. I just like the sound of that word. Qaly, qaly, qaly. (It's pronounced "Qualy")
2. He talks briefly about "willingness to pay." I've always thought this can be a misleading concept. Sometimes it's really "ability to pay." Give someone a lot more money and he or she becomes more able to pay for things, including risk reduction. True, this induces more willingness to pay, but to me the ability is the driving factor. I think the key is what comparison is being made. If you're considering one person and comparing several risks, then the question is, what are you willing to pay for. But if you are considering several people with different financial situations, then the more relevant question might be, who is able to pay.
1. This article by Carl Elliott reminded me why institutional review boards (IRBs) are needed.
2. This site (via Seth) reminds me of why IRBs can be a bad thing.
For me, IRBs are typically a waste of time, nothing more, but for others they are a (potential) protection against health hazards and exploitation, and for others they are a barrier to research progress.
I don't know anything about basketball (except that the players are shorter than they say they are, and I don't really even know that); nonetheless, in the recent-but-still-grand tradition of blogging . . .
I'll try to clarify my recent entry on unintended consequences by focusing on a less politically-loaded example.
Millions of people in south Asia are exposed to high levels of arsenic in their drinking water. It's a natural contaminant (something to do with the soil chemistry) but it's become an increasingly important problem in the past decades because people have been digging millions of deep (~ 100 feet) tubewells. The background is that the surface water is often contaminated, and international organizations have been encouraging the locals to dig these tubewells which draw clean water from hundreds of feet below ground. Unfortunately, some of that water is contaminated with arsenic. A true unintended consequence. But what to do next?
There are various solutions out there, including a low-cost device for purifying surface water. My connection to this is that I've been involved in a project to give information to people in Bangladesh about where and how deep to dig to find arsenic-free deep water. In some places you have to drill hundreds of feet deep, and this can be expensive (relative to Bangladeshis' incomes). So we're setting up an insurance system for people there, so they can pay a little bit more but be assured of eventually getting a safe well, or their money back. The idea is to provide incentives for well-drillers also, to set up an ongoing system where there is trust and so that safe wells can be installed.
More unintended consequences?
Two concerns about unintended consequences arise. First, on the physical level, there is a concern that, if people build wells taking clean water from deep aquifers, they'll start using that water more and more (just as we in the developed world flush our toilets with fresh water, etc), leading to changes in the water flow that might bring arsenic down there or have other bad consequences. I don't know enough to evaluate this concern so I'm just trusting my colleagues on this.
The second concern is something I mentioned to my collaborators the other day: should we really be offering this insurance scheme at all? The goal of the program is to get people to dig deeper wells than they otherwise would've done, by setting up incentives for customers and well-drillers to get together. (I should explain that this is intended to be a revenue-neutral, "at cost," system: not a subsidy for Bangladeshis to dig wells, but not a moneymaker for us, either. The money would be made by the drillers, and this would provide an incentive for the program to continue.)
Anyway, I asked my collaborators whether maybe we shouldn't be doing this program at all, since we're trying to get people to do something they wouldn't do themselves.
One of my colleagues replied that, no, it was a good idea, and for us not to do it would be "paternalistic" in that we're saying that we know what's best for the locals. We can offer the insurance and they can decide. But, wait! I said. If we really want to be non-paternalistic, we wouldn't get involved at all, right?
Defaults
It seems that these debates come down to the choice of the default. If the default is to do our insurance program, then it's paternalistic to consider not doing it. But if the default is for us to stop messing around in Bangladesh, then it's paternalistic to try to motivate them to dig deep wells. (The unintended consequence of the mid-1990s intervention--encouraging moderately deep tube wells--is cautionary, but it's not clear that this should be a message that we shouldn't get involved.)
Melissa Lafsky, writing in Freakonomics, discusses how biofuels, which have been proposed as an environmentally-friendly alternative energy source, have been estimated to create more pollution than drilling for more oil. And then, of course, climate change is itself a huge unintended consequence of industrialization. I just have a couple of comments.
1. Alex Tabarrok wrote:
The law of unintended consequences is what happens when a simple system tries to regulate a complex system. The political system is simple, it operates with limited information (rational ignorance), short time horizons, low feedback, and poor and misaligned incentives. Society in contrast is a complex, evolving, high-feedback, incentive-driven system. When a simple system tries to regulate a complex system you often get unintended consequences.
I like this description but it doesn't quite fit either of the examples here. To start with, climate change was an unanticipated consequence of industrialization. But industrialization was not designed to regulate the climate (schemes such as cloud-seeding aside). So maybe Alex's paragraph is more of a description of perverse unintended consequences.
To take the other example: Yes, biofuels were proposed to regulate climate change, so the first half of Alex's description works. But the second part isn't quite appropriate, because the unintended consequences were discovered in advance. According to the quoted report, "Prior analyses made an accounting error." So in this case it doesn't sound like a problem in anticipating feedback.
2. This brings me to my second point, which is that the problem seems to have been discovered before the massive shift to biofuels actually happened, so the problem "for the next 93 years" won't really happen. According to the article, "scientists [are] already calling for government reform on biofuel policies." So this is more of an anticipated than an actual unintended consequence.
I'm doing some work related to Geologic Carbon Sequestration (also called Geologic Capture and Storage): the idea is to capture carbon dioxide from power plants or other industrial sources, and pump it deep underground into geologic formations that will trap it for centuries or millennia. It sounds desperate, and I initially had substantial misgivings, but upon looking into it more I think it is a good idea and we should get started. But there are still some major political, legal, and economic issues that have to be resolved. I'm working on an issue that is on the border between the technical and regulatory spheres:
The State of California will soon be called on to decide whether sequestration should be allowed in some specific places. The government will have to assess the risks and decide whether a particular spot is "safe enough." Any site that is likely to be proposed will be considered by experts to be highly likely to retain almost all of the CO2 that is pumped into it...but of course, the experts could be wrong. It's very hard to characterize the subsurface---there could be faults or old boreholes that you don't know about---so maybe the CO2 could leak out. If it does, bad things can happen.
According to Andrew Sullivan, a political commentator named Michael Graham wrote,
I am so confident of both a Patriots win today and a Romney win in Massachusetts on Tuesday that I made this pledge on the air Friday: 'If the NY Giants beat the Patriots in the Super Bowl, I will cast my Super Duper Tuesday primary vote for (shudder) John McCain.'
But . . . the Patriots were favored by 14 points, and if you look up "football" in the index of Bayesian Data Analysis, you'll see that football point spreads are accurate to within a standard deviation of 14 points, with the discrepancy being approximately normally distributed. So, a 14-point underdog has something like a 15% chance of winning. It's funny how people don't get this sort of thing.
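The arithmetic behind that figure can be sketched in a few lines, assuming the favorite's actual margin of victory is normally distributed around the point spread with a standard deviation of 14 points, and ignoring ties. The function names are just for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def underdog_win_probability(spread, sd=14.0):
    """Probability the favorite's margin falls below zero.

    Assumes margin ~ Normal(spread, sd), the approximation described above.
    """
    return normal_cdf(-spread / sd)

p = underdog_win_probability(14.0)
print(round(p, 3))
```

A 14-point spread with a 14-point standard deviation puts the upset exactly one standard deviation out, so the answer is just the lower tail of the normal distribution below -1, which is about 16%.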
On the other hand, his pledge is nonenforceable so it's no big deal.
Robin Hanson suggested here an experimental design in which patients, instead of randomly assigned to particular treatments, are randomly given restrictions (so that each patient would have only n-1 options to consider, with the one option removed at random). I asked some experts about this design and got the following responses.
Eric Bradlow wrote:
I think "exclusion", more generally, in Marketing has been done in the following ways:
[1] A fractional design -- each person only sees a subset of the choices, items, or attributes of a product (intentionally) on the part of the experimenter. Of course, this is commonly done to reduce complexity of the task while trading off the ability to estimate a full set of interactions. The challenge here, and I wrote a paper about this in JMR in 2006, is that people infer the values of the missing attributes and do not, despite instructions, ignore them. Don Rubin actually wrote an invited discussion on my piece. So, random exclusion on the part of the experimenter is done all of the time.
[2] A second way exclusion is sometimes done is prior to the choice or consumption task, you let the respondent remove "unacceptable" alternatives. There was a paper by Seenu Srinivasan of Stanford on this. In this manner, the respondent eliminates "dominated/would never choose alternatives". This is again done for the purposes of reducing task complexity.
[3] A third set of studies I have seen, and Eric Johnson can comment on the psychology of this much more than I can, is something that Dan Ariely (now of Duke, formerly of MIT) and colleagues have done, which seems closest to this post. In these sets of studies, alternatives are presented and then "start to shrink and/or vanish". What is interesting is that these alternatives that he does this to are not the preferred ones and it has a dramatic effect on people's preferences. I always found these studies fascinating.
[4] A fourth set of related work, of which Eric Johnson has great fame, is a "mouse-lab" like experiment where you allow people to search alternatives until they want to stop. This then becomes a sequential search problem; however, people exclude alternatives when they want to stop.
So, Andy, I agree with your posting that:
(a) Marketing researchers have done some of this.
(b) Depending on who is doing the excluding, one will have to model this as a two-step process, where the first step is a self-selection (observational study like likelihood piece, if one is going to be model-based).
The aforementioned Eric Johnson then wrote:
I think there are at least two important thoughts here:
(1) random inclusion for learning... Decision-making has changed the way we think about preferences: They are discovered (or constructed) not 'read' from a table (thus Eric B.'s point 3).
A related point is that a random option can discover a preference (gee, I never thought I liked ceviche....) so there may be value in adding random options to the respondent... The late Hillel Einhorn wrote about 'making mistakes to learn.'
(2) "New Wave" choice modeling often consists of generating the experimental design on the fly: Adaptive conjoint. By definition, these models use the results from one choice to eliminate a bunch of possible options and focus on those that have the most information. Olivier Toubia at Columbia Marketing is a master of this.
To elaborate on Eric B.'s points:
Consumer Behavior research shows that elimination is a major part of choice for consumers, probably determining much of the variance in what is chosen. Make choice easier, learning harder.
There is an interesting tradeoff for both the individual and larger publics here: You try an option you are likely not to like (treatment which may well not work). If you are surprised, then you (or subsequent patients) benefit for a long time. Since this is an intertemporal choice, people may not experiment enough.
Finally, Dan "Decision Science News" Goldstein added:
I've never seen a firm implement such a design in practice, neither when I worked in industry, nor when I judged "marketing effectiveness" competitions.
My own thoughts are, first, that there are a lot of interesting ideas in experimental design beyond the theory in the textbooks. It would be worth thinking systematically about this (someday). Second, I want to echo Eric Johnson's comment about preferences being constructed, not "read off a table" from some idealized utility function. Utility theory is beautiful but it distresses me that people think it fits reality in an even approximate way.
Robin Hanson writes,
To make sense of social complexity we would ideally want to add lots of randomization to people's real choices, and then collect lots of data on what happens to them. But this seems a lot to ask of people. For example, people who eat at a restaurant might be willing to tell you how they felt later after eating there, but they'd be reluctant to eat a random item from the menu even one percent of the time.
Would people be more willing to have a few of their options randomly excluded? For example, would people mind much if on a menu of one hundred items one of the items was randomly excluded each time - "sorry we are out of that today"? Data about choices under such reduced menus would still have a key randomization component.
This idea occurred to me while talking to a cancer doctor who thought he could get thousands of cancer patients to agree to release data on their progress, but who would be more reluctant to accept a random treatment. Once standard drugs have failed, there are about twenty alternative drugs a patient could try, which they usually pick based on the side effects etc. Patients probably wouldn't mind much having one of these options taken off the menu.
My thoughts:
I think I'd eat a random item 1% of the time as part of an experiment--after all, 1% of the time would correspond to three lunches per year.
To get to your main proposal: I think if you exclude one item, you'll get a study that is a mix of experiment and observational study, which could probably be analyzed in a way more robustly than purely observational data could be analyzed, but requiring more information than the analysis of a pure experiment.
This sounds like something that marketing researchers might have studied too.
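Here is a minimal sketch of how the random-exclusion design could be simulated, just to show why it mixes experiment and observation: the exclusion is randomized, but the choice among the remaining options is still up to the person. Everything here (the option labels, the rankings, the function names) is hypothetical and made up for illustration.

```python
import random

def restricted_menu(options, rng):
    """Remove one option at random; return it along with the remaining menu."""
    excluded = rng.choice(options)
    menu = [o for o in options if o != excluded]
    return excluded, menu

def simulate(options, preferences, rng):
    """Each person picks their highest-ranked option still on the menu.

    preferences holds one ranking per person, most preferred first.
    """
    records = []
    for ranking in preferences:
        excluded, menu = restricted_menu(options, rng)
        choice = next(o for o in ranking if o in menu)
        records.append({
            "excluded": excluded,
            "choice": choice,
            "top_choice_removed": excluded == ranking[0],
        })
    return records

rng = random.Random(0)
options = ["A", "B", "C", "D"]  # hypothetical treatments
preferences = [rng.sample(options, len(options)) for _ in range(200)]
records = simulate(options, preferences, rng)
print(sum(r["top_choice_removed"] for r in records), "of", len(records),
      "people were pushed off their first choice")
```

The people whose first choice was removed are a random subset, so comparing them with everyone else is a genuine experiment; which fallback they pick is self-selected, and that part has to be modeled as in an observational study.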
Stephen Dubner and Steven Levitt wrote this Freakonomics column, which concludes, "if there is any law more powerful than the ones constructed in a place like Washington, it is the law of unintended consequences." What I'm wondering is, what sort of law is this? Obviously it's not a real "law" like the law of gravity or even one of those social-science laws like Gresham's law or the statement that democracies usually don't fight each other. But it's supposed to be more than just a joke in the manner of Murphy's law, right?
I've remarked previously that unintended consequences often were actually intended but Dubner and Levitt's examples seem actually unintended. So these seem like real examples, but I don't know what it takes for this to be a "law." Surely there must be dozens of other examples of intended consequences that actually happened? Or unintended consequences which, although unfortunate, were minor compared to the intended consequences? The Freakonomics article was interesting; now I want to hear a statement of the law itself...
P.S. Interesting comments below. Also, Alex Tabarrok has further elaboration:
The law of unintended consequences is what happens when a simple system tries to regulate a complex system. The political system is simple, it operates with limited information (rational ignorance), short time horizons, low feedback, and poor and misaligned incentives. Society in contrast is a complex, evolving, high-feedback, incentive-driven system. When a simple system tries to regulate a complex system you often get unintended consequences.
Somebody writes,
I am looking for interesting, unusual datasets for a data analysis class I am teaching, and I heard by email from Ray Fisman that you have a sanitized version of the data from his speed dating experiment.
Indeed, the data are here; we use them in a homework assignment in our book. The data were collected by Ray Fisman and Sheena Iyengar, an economist and a psychologist at the business school here, and they summarized their findings in this paper:
We study dating behavior using data from a Speed Dating experiment where we generate random matching of subjects and create random variation in the number of potential partners. Our design allows us to directly observe individual decisions rather than just final matches. Women put greater weight on the intelligence and the race of partner, while men respond more to physical attractiveness. Moreover, men do not value women's intelligence or ambition when it exceeds their own. Also, we find that women exhibit a preference for men who grew up in affluent neighborhoods. Finally, male selectivity is invariant to group size, while female selectivity is strongly increasing in group size.
What I really want to do with these data is what I suggested to Ray and Sheena several years ago when they first told me about the study: a multilevel model that allows preferences to vary by person, not just by sex. Multilevel modeling would definitely be useful here, since you have something like 10 binary observations and 6 parameters to estimate for each person.
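One way to write down the kind of model I have in mind, as a sketch only: the notation is mine, and the choice of a logistic link and a normal population distribution are assumptions rather than anything taken from the paper.

```latex
\begin{align*}
\Pr(y_{ij} = 1) &= \operatorname{logit}^{-1}\!\left(x_{ij}^{\top}\beta_i\right), \\
\beta_i &\sim \mathrm{N}\!\left(\mu_{s[i]},\, \Sigma_{s[i]}\right).
\end{align*}
```

Here $y_{ij}$ is 1 if person $i$ wants to see partner $j$ again, $x_{ij}$ is the vector of six predictors for that pairing, $\beta_i$ is person $i$'s own vector of six preference coefficients, and $s[i]$ is the sex of person $i$. With only about 10 binary observations per person, the individual $\beta_i$ cannot be estimated well on their own; the second line is what lets them borrow strength from the group.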
I'm hoping that some pair of students analyzes these data as a project in my class this spring. I suspect that we could learn some interesting things. Also, once the model has been fitted successfully once, Ray, Sheena, and others would be able to fit it to other similar datasets easily enough.
Finally, let me thank Ray and Sheena again for making their data available to all.
Seth forwarded me this article [link fixed, I hope] from Nassim Taleb:
If a prediction market is not liquid enough, it's possible to manipulate it by throwing in small sums of money (thus, for example, a political candidate could boost his price by buying a bunch of shares). Presumably this could be useful, for example if you pump up your market share price, this might induce donors to contribute to the winning cause or could help attract endorsements.
At the other extreme, if the market is too liquid, there's a potential "moral hazard" or motivation to throw an election, to purposely hurt your side in order to make money on the pointspread if you've already placed a large bet in the other direction.
Now here's my question: there's clearly a sense in which a prediction market can be too small (too illiquid) to be trusted, and conversely if it is too large (too liquid) you get problems in the other direction. Is there an intermediate zone in which the market is liquid enough so it can't be easily manipulated, but not so liquid that it motivates point-shaving? Or do the zones of "too illiquid" and "too liquid" actually overlap, so there's no market size that does the job?
I imagine the answer would depend on some external parameters, such as the ease or difficulty of enforcing insider-trading restrictions. Possibly there's some theoretical work in this area. Justin? Robin?
P.S. I'm raising the questions above in all sincerity. This post is not intended to be a devastating argument that shoots down prediction markets; I'd just like to know if these issues have been considered and resolved in some way. A lot of the casual discussions of prediction markets have been of the "they're cool" or "they're silly" variety, but I imagine the researchers in this area have considered ways of assessing the problems arising from the issues noted above.
P.P.S. This paper by Robin Hanson (see comment below) discusses the first of these points, presenting theory and evidence that low-volume markets are hard to manipulate and thus implying that there is an intermediate zone where the markets can work well.
Chris Masse sent these links: Using Prediction Markets to Track Information Flows: Evidence from Google, by Cowgill, Wolfers, and Zitzewitz, and a news article by Noam Cohen. Here's the abstract of the Cowgill et al. paper:
In the last 2.5 years, Google has conducted the largest corporate experiment with prediction markets we are aware of. In this paper, we illustrate how markets can be used to study how an organization processes information. We document a number of biases in Google’s markets, most notably an optimistic bias. Newly hired employees are on the optimistic side of these markets, and optimistic biases are significantly more pronounced on days when Google stock is appreciating. We find strong correlations in trading for those who sit within a few feet of one another; social networks and work relationships also play a secondary explanatory role. The results are interesting in light of recent research on the role of optimism in entrepreneurial firms, as well as recent work on the importance of geographical and social proximity in explaining information flows in firms and markets.
I love this sort of thing. In grad school I remember we talked about setting up a "betting board" where people could put up slips of paper with proposed bets, and then you could accept a bet by signing it with your name. We never did anything with it, and the technology is better now... The Cowgill et al. paper is interesting in how they go beyond the usual "prediction markets are cool" story to look into what information is really being used in the market.
P.S. I gotta say, though: Think harder about your tabular presentations! Do you really care that a certain coefficient is estimated at -0.188 with a standard error of 0.072??? It would be great if the younger economists, working on cool projects like this, could take the lead on graphical presentation--which, after all, is all about getting more information out of your analyses.
P.P.S. In his news article, Cohen writes:
A question never addressed in the report is what would seemingly be most interesting to an outsider: Do prediction markets work? Unlike surveys, the markets rely on something, I think the technical term is ... oh, yeah, greed, to get their results.
Ask me who I think will win a baseball game, an election and an Oscar, and I can try to be objective, but I can’t help being influenced by who I would like to see win. (The Yankees, Fred Thompson, Pee-wee Herman; or is it the Yankees, Pee-wee Herman, Fred Thompson?) Put $5 on it, however, and suddenly I am willing to use all the information I have at my disposal to come up with the best answer.
The attribution to "greed" seems naive to me. I'd be interested to hear the comments of Justin Wolfers or Robin Hanson or others who have thought more about these issues. I agree that a $5 bet can (for some people) induce some sincerity, but I wouldn't call that "greed"--unless they're paying New York Times reporters a lot less than I think, $5 seems below the "greed" threshold. Rather, I'd say that the $5 represents some signal that it's appropriate to take it seriously.
Also, not to keep going on about polls and forecasts, but (most) political polls are not set up to ask the question of "who will win" but rather the question of who would you like to see win. The point of the poll is to ask respondents something that they know about and is of general interest--in this case, their views on the issues, which candidate they support, etc. The voters--the general voting population--are the people who determine who wins the election, which is quite a bit different from the "Yankees" and "Pee-Wee Herman" examples given in the news article. (Yes, I know he's just being amusing, but I think there is a serious underlying point, which is that elections are not just something that people predict, they're also something that we jointly decide with our votes.)
This looks interesting. Don Saari writes,
We would like to call your attention to an IMBS conference that will be held on January 25-27, 2008. The topic is Luce and Raiffa After Fifty Years-What Is Next? It has been 50 years since the Duncan Luce and Howard Raiffa book, Games and Decisions: Introduction and Critical Survey, was first published. Our conference is meant both to honor this book that has had such a powerful impact, and to adopt the spirit of the Luce-Raiffa book by critically examining where game theory is today and where it should be in the future.
I love the Luce and Raiffa book. The funny thing is, it describes various unsolved problems with the implication that, in a few years, all of game theory will be cleaned up. Actually, I think this book represents the high-water mark of the idea of game theory as an all-encompassing tool in social science. Game theory has seen lots of important specific advances since then but its limitations have become clearer too. Here's a website with the conference--unfortunately, only a list of speakers so far, no titles or abstracts, but maybe that will change soon.
Futurescanner is a website full of forecasts. (I heard about this from an unsolicited email but it looks interesting.) It would be fun (at least for me) to see forecasts about statistical methods. The challenge would be in stating the problems clearly enough you could unambiguously state when they were solved.
From the Judgment and Decision Making list, I saw this interesting article by Scott Armstrong:
I [Armstrong], along with Kesten Green and Willie Soon, audited the forecasting methods used by the authors of the government's administrative reports to support their strategy to list polar bears as an endangered species. As it turns out, the forecasts were based primarily on judgmental methods. We concluded that the forecasts of polar bear populations were not derived from scientific forecasting procedures. It would be irresponsible to classify polar bears as endangered on the basis of such forecasts.
Bob Clemen replied with some more general questions about how to evaluate forecasting methods:
Kaiser and I had the following discussion of rationality, following my earlier discussion of the rationality of voting. I wrote:
Any given behavior can be analyzed by economists either in a way as to show why it's really rational (even though it doesn't look that way) or really irrational (even if it looks normal enough). I haven't quite figured out the rules for how they decide which way to lean in any given case.
Kaiser then wrote:
As for rational/irrational, I'm confused by the Kahneman work: he's saying irrationality is an anomaly which seems to indicate he thinks people should be rational but then if everyone is "irrational," could it be the theory is wrong in which case we shouldn't call that anomalous?
I replied:
Regarding rationality, my impression is that psychologists, unlike economists and political scientists, don't care so much about "rationality." Psychologists think of rationality as a process--as a way of thinking and making decisions--not as a particular algorithm. In that sense, Kahneman et al. are pointing out that much of our everyday rational thinking has systematic problems. It's no surprise that any particular form of rationality will be imperfect. What's interesting is the ways in which people make mistakes.
I cleaned out my inbox again. This time I mean business. I'm gonna read my email every day at 4pm (approximately) and deal with every email immediately, right then. No more of this e-mail-all-day-and-all-night nonsense!
Garrett Glasgow sent along this study on the effectiveness of suicide barriers on bridges:
With support from mental health workers, elected officials, the California Highway Patrol, and the local community, Caltrans has announced their intention to install a suicide prevention barrier on the Cold Spring Bridge by 2010 at a cost of $605,000. During the course of the debate a number of people have claimed that such a barrier would not only deter suicides at the Cold Spring Bridge, but actually prevent suicides and thus save lives. This claim is unfounded. A review of the evidence presented in favor of building the barrier and my own research reveals that there is no evidence that installing a suicide prevention barrier on the Cold Spring Bridge would save lives.
As Garrett writes, "there is a distinction between preventing suicides and preventing suicides at a particular location."
A reader writes, regarding my review of The Black Swan,
Frederic Bois (winner of the Outstanding Statistical Application Award from the American Statistical Association, among other accomplishments) told me about the following job opportunities for modeling in toxicology, decision analysis, and risk assessment. Frederic is great, and so I assume the job is also. Here's the announcement:
Like Dave Krantz, I'm down on the decision-theoretic concept of "utility" because it doesn't really exist.
The utility function doesn't exist
You cannot, in general, measure utility directly, and attempts to derive it based on preferences (based on the von Neumann-Morgenstern theory) won't always work either because:
1. Actual preferences aren't necessarily coherent, meaning that there is no utility function that can produce all these preferences.
2. Preferences themselves don't in general exist until you ask people (or, to be even more rigorous, place them in a decision setting).
So, yeah, utility theory is cool, but I don't see utility as something that's Platonically "out there" in the sense that I can talk about Joe's utility function for money, or whatever.
Call it value, not utility
The above is commonplace (although perhaps not as well known as it should be). But my point here is something different, a point about terminology. I would prefer to follow the lead of some decision analysis books and switch from talking about "utility" to talking about "value." To the extent the utility function has any meaning, it's about preferences, or how you value things. I don't think it's about utility, or how useful things are. (Yes, I understand the idea of utility in social choice theory, where you're talking about what's useful to society in general, but even there I'd say you're really talking about what society values, or what you value for society.)
Just play around with the words for a minute. Instead of "my utility function for money" or "my utility for a washer and a dryer, compared to my utility for two washers or two dryers" (to take a standard example of a nonadditive utility function) or "my utility for a Picasso or for an SUV," try out "my value function for money" or "the value I assign to a washer and a dryer, compared to the value I assign to two washers or two dryers" or "the value I assign to a Picasso or to an SUV." This terminology sounds much better to me.
P.S. See Dave's comments here.
Dave had these comments on my recent thoughts on utility and value functions:
I [Dave] agree with the negatives about "utility" as a word and as a Platonic function (attached to each individual).
In teaching, I tend to discuss "subjective value." In my decision making course for undergrads I talk about optimization with respect to "objective" values, including physical, biological, and economic indices (e.g., maximum area, maximum sustainable yield, maximum profit), and with respect to subjective value, measured in a variety of ways; then I emphasize that many decision rules do not maximize anything -- because the weighting or even the existence of many goals is context dependent, and because some goals are converted into constraints. Optimization is thus subject to constraint and performed with context-dependent weights.
A standard use for "value function" in behavioral economics derives from Tversky & Kahneman's Prospect Theory; one of the blog contributors complains about that. And the emphasis on choice of words leads another contributor to treat the issue as one of words, rather than concepts and facts, no more important than "degrees of freedom" (which, of course, is a venerable term used relatedly in physics and in statistics).
I don't think there is an easy cure via terminology, though I feel you are on the right track here.
Kaiser writes,
Been leafing through the "Super Crunchers" book over the weekend. . . . Halfway through it, I am still trying to figure out if "super crunching" means traditional statistics or data mining. It is not without irony that the author seems to equate the two. Regardless, it's still good publicity for our field.
One example that seemed to have caught on [comes] from a book called "Decision Traps" by Russo and Schoemaker (who I think are business consultants). The idea is a catchy one, which is to illustrate the "over-confidence" of decision makers. The trick they used is to ask people to provide interval estimates at 90% confidence to a list of 10 questions such as "What was Martin Luther King Jr's age at death?", and "In what year was Mozart born?". Out of 1000+ respondents, they found that "less than 1 percent of the people gave ranges that included the right answer nine or ten times. Ninety-nine percent of people were overconfident." (pp.112-114 in the book). . . . Have you done anything similar with your students?
As far as I know, the original idea came from an example of Alpert and Raiffa. I've had lots of success doing an adaptation of the Alpert and Raiffa demo in class; see Section 13.2.2 of my book with Nolan, Teaching Statistics, or Section 4 of this paper.
There's also a more standard confidence coverage demo in Section 8.4 of Teaching Statistics. That one works well in class too.
The great Bill James writes:
In sports, mathematical analysis is old news as applied to baseball, basketball, and football. . . . But it has not yet been applied to leagues. . . . Rather than beginning with the question "How does a team win?" - the query that has been the basis of all sports research to this point - what if we begin by asking "How does a league succeed?"
Take the problem of what we could call NBA "sluggishness." In the regular season, players simply don't seem to be playing hard all the time. . . . The NBA's problem is that the underlying mathematics of the league are screwed up. . . . In the NBA, the element of predetermination is simply too high. Simply stated, the best team wins too often. If the best team always wins, then the sequence of events leading to victory is meaningless. Who fights for the rebound, who sacrifices his body to keep the ball from rolling out of bounds doesn't matter. The greater team is going to come out on top anyway. . . . Everybody knows who's going to win. Why do the players seem to stand around on offense? Why is showboating tolerated? Because it doesn't matter. . . .
So how should the NBA correct this? Lengthen the shot clock. Shorten the games. Move in the 3-point line. Shorten the playoffs.
If you reduce the number of possessions in a game by giving teams more time to hold the ball, you make it more likely that the underdog can win - for the same reason that Bubba Watson is a lot more likely to beat Tiger Woods at golf over three days than he is over four. It's simple math. The longer the contest lasts, the more certain the better team is to win. If the NBA went back to shorter playoff series - for example from best-of-seven games to best-of-three - an upset in that series would become a much more realistic possibility. A three-game series would make the homecourt advantage much more important, which, in turn, would make the regular season games much more important. The importance of each game is inversely related to the frequency with which the best team wins. . . .
I see James's point (and I continue to enjoy his writing style, so memorably and affectionately parodied by Veronica Geng a couple of decades ago), but I disagree with the remedy of adding more randomness. I don't think I really want to see the best team lose a lot. One appeal of a top-level sporting contest is seeing top players perform at their peak. Despite the popular models of the "binomial, p=.55" type, which team is "best" is not generally defined. In baseball, it depends so much on who is pitching; in football, some new plays can make the difference. Not to mention practice, discipline, teamwork, and getting some sleep the night before the game. Ideally (to me), the outcome of a game is unpredictable not because the worse team has a good chance of winning, but because it takes a special effort for a team to be the best. (Even in a deterministic game such as chess, the "best" (according to rankings) player does not always win.)
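To see how much the length of a series matters under the simple "binomial, p=.55" model, here is a rough sketch. It assumes every game is independent with the same win probability p for the better team, and it ignores home-court advantage; the function name is just for illustration.

```python
from math import comb


def series_win_prob(p, n_games):
    """Probability that a team with per-game win probability p wins a
    best-of-n_games series (n_games odd), assuming independent games.

    Playing out all n_games and counting a majority gives the same answer
    as stopping once the series is clinched.
    """
    need = n_games // 2 + 1
    return sum(
        comb(n_games, k) * p**k * (1 - p) ** (n_games - k)
        for k in range(need, n_games + 1)
    )


for n in (1, 3, 5, 7):
    print(f"best-of-{n}: {series_win_prob(0.55, n):.3f}")
```

With p = .55 the better team wins a best-of-three about 57.5% of the time and a best-of-seven about 60.8% of the time. So in this toy model the shorter series does help the underdog, but only by a few percentage points.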
These issues lead into a larger question about scoring systems in games, a paradox of sorts that continues to confuse me: on one hand, you don't want the outcome to be random; on the other hand, you want the team that is behind to have a reasonable chance of catching up. I remember when I was a kid, my dad said that the tennis scoring system (games, set, match) was better than the ping-pong system (first player who gets 21 wins) because in tennis, you can always catch up. On the other hand, in a competitive game of ping-pong, you should never be down 20-0 in the first place. There must be some principles here that can be stated mathematically, but I'm not quite sure how to state them. Perhaps someone has already looked into this.
P.S. I feel awkward disagreeing with Bill James, whose writings were one of the reasons I went into statistics. But I'm disagreeing with him about basketball, not baseball, so maybe it's ok.
The great David Owen reviewed a history of bridge (the card game) recently in the New Yorker. Among other things, he noted the decline in popularity of bridge and the rise of poker. But the fall of bridge and rise of poker were not simultaneous. Poker has never really gone away, but its recent ESPN-level popularity postdates bridge's decline by decades. More to the point, I prefer poker to bridge. At my weak-amateur level, I think poker is more of a skill game than bridge is. To put it another way, both poker and bridge have routine elements. But in bridge, the routine elements are crucial and require a lot of focus--play out those cards right, or you lose. In poker, the key routine element is to fold crappy cards (most of the time), and that's easy. This is one reason I find poker to be more fun--I can focus on the important moments. (Yes, I'm sure it's different for good players of either game.)
Someone writes in with the following question:
In this discussion of Allegra Goodman's novel Intuition, Barry wrote, "brilliant people are at least as capable of being dishonest as ordinary people." The novel is loosely based on some scientific fraud scandals from the 1980s, and one of its central characters, a lab director, is portrayed as brilliant and a master of details, but she makes a mistake by brushing aside evidence of fraud by a postdoc in her lab. One might describe the lab director's behavior as "soft cheating" since, given the context of the novel, she had to have been deluding herself by ignoring the clear evidence of a problem.
Anyway, the question here is: are brilliant scientists at least as likely to cheat? I have no systematic data on this and am not sure how to get this information. One approach would be to randomly sample scientists, index them by some objective measure of "brilliance" (even something like asking their colleagues to rate their brilliance on a 1-10 scale and then taking averages would probably work), then do a thorough audit of their work to look for fraud, and then regress Pr(fraud) on brilliance. This would work if the prevalence of cheating were high enough. Another approach would be to do a case-control study of cheaters and non-cheaters, but the selection issues would seem to be huge here, since you'd be only counting the cheaters who got caught. Data might also be available within colleges on the GPA's and SAT scores of college students who were punished for cheating; we could compare these to the scores of the general population of students. And there might be useful survey data of students, asking questions like "do you cheat" and "what's your SAT" or whatever. I guess there might even be a survey of scientists, but it seems harder to imagine they'd admit to cheating.
Daniel Kahneman posted the following on the Judgment and Decision Making site:
Have there been studies of the calibration of expert players in judgments of chess situations -- e.g., probability that white will win?
In terms of the amount and quality of experience and feedback, chess players are at least as privileged as weather forecasters and racetrack bettors -- but they don't have the experience of expressing their judgments in probabilities. I [Kahneman] am guessing that the distinction between a game that is "certainly lost" and "probably lost" is one that very good players can make reliably, but I know of no evidence.

Despite knowing much less about decision making and (likely) less about chess than Kahneman, I have three conjectures:
For years, Dave Krantz has been telling me about his goal-based model of decision analysis. It's always made much more sense to me than the usual framework of decision trees and utility theory (which, I agree with Dave, is not salvaged by bandaids such as nonlinear utilities and prospect theory). But, much as I love Dave's theory, or proto-theory, I always get confused when I try to explain it to others (or to myself): "it's, uh, something about defining decisions based on goals, rather than starting with the decision options, uh, ...." So I was thrilled to find that Dave and Howard Kunreuther just published an article describing the theory. Here's the abstract:
We propose a constructed-choice model for general decision making. The model departs from utility theory and prospect theory in its treatment of multiple goals and it suggests several different ways in which context can affect choice.
It is particularly instructive to apply this model to protective decisions, which are often puzzling. Among other anomalies, people insure against non-catastrophic events, underinsure against catastrophic risks, and allow extraneous factors to influence insurance purchases and other protective decisions. Neither expected-utility theory nor prospect theory can explain these anomalies satisfactorily. To apply this model to the above anomalies, we consider many different insurance-related goals, organized in a taxonomy, and we consider the effects of context on goals, resources, plans and decision rules.
The paper concludes by suggesting some prescriptions for improving individual decision making with respect to protective measures.
Going to their paper, Table 1 shows the classical decision-analysis framework, and Table 2 shows the new model, which I agree is better. I want to try to apply it to our problem of digging low-arsenic wells for drinking water in Bangladesh.
Is vs. should
I have a couple of qualms about Dave's approach, though, which involve distinguishing between descriptive and normative concerns. This comes up in all models of decision making: on one hand, you can't tell people what to do (at best, you can point out inconsistencies in their decisions or preferences), but on the other hand these theories are supposed to provide guidance, not just descriptions of our flawed processes.
Anyway, I'm not so thrilled with goals such as in Krantz and Kunreuther's Table 5, of "avoid regretting a modest loss." The whole business of including "regret" in a decision model has always seemed to me to be too clever by half. Especially given all the recent research on the difficulties of anticipating future regret. I'd rather focus on more stably-measurable outcomes.
Also, Figure 4 is a bit scary to me. All those words in different sizes! It looks like one of those "outsider art" things:

In all seriousness, though, I think this paper is great. The only model of decision making I've seen that has the potential to make sense.
Need a better name
But I wish they wouldn't call their model "Aristotelian." As a former physics student, I don't have much respect for Aristotle, who seems to have gotten just about everything wrong. Can't they come up with a Galilean model?
Bernard Guerrero asks what I think of this. My response is that people can be irrational all the time--let's face it, we're a bunch of animals. Voters can have incoherent preferences (e.g., more services but less taxes), consumers can make mistakes (buying that brand-name $40,000 car and then being upset that they have no money left), forecasters can make mistakes (even setting aside "moral hazard" settings, there are lots of notorious problems such as people attaching insufficient probability to the "all else" category).
ARIMA models etc. can be overrated--lots of people seem to think these are the only models out there. Cavan Reilly has a fun example--chapter 27 in my book with Meng--of a 6-parameter predator-prey model that way outperforms standard time series models (with 11 or more parameters) in forecasting the famous Canadian lynx series. So I'm not surprised that ARIMAs can be beaten.
I agree with Bernard that you'd want to know where the survey forecasts come from. The surveys themselves are of forecasts. (This is different than the familiar use of surveys of forthcoming elections, where people are asked whom they would vote for if the election were held today. The Ang et al. paper is using surveys where people are explicitly asked to forecast.) It does sound like a classic "wisdom of crowds" averaging.
P.S. Two of the authors are at Columbia. I haven't met them. Perhaps they can speak in our quantitative social science seminar in the fall.
Dan Goldstein writes, "Marketing is JDM [judgment and decision making] with teeth." Wouldn't that be dentistry?
More thoughts on the backseat driver principle from Ubs:
Anders Sandberg writes here about how the finish-the-plate bias can lead people to overeat, simply because food comes in larger packages (which, in turn, presumably arises because food is so cheap to produce). Anyway, this reminds me of an insight I had several years ago which I used to tell the students in my decision analysis classes. They were always skeptical but maybe now with research behind me on this, they'll believe me.
Anyway, here goes:
When I was younger, people used to complain about candy bars getting smaller and smaller. (For example, Stephen Jay Gould has a graph in one of his books showing the size of the standard Hershey bar declining from 2 ounces in 1965 gradually down to 1.2 ounces in 1980, and for that matter I can recall tunafish cans gradually declining from 8 ounces to 6 ounces.) And I remember going to the candy machine with my quarter and picking out the candy bar that was heaviest--I don't remember which one--even if it wasn't my favorite flavor, to get the most value for the money.
But now I realize that, rationally, candymakers should charge more for smaller candy bars. The joy from eating the candy is basically discrete--I'll get essentially no more joy from a 1.7-ounce bar than from a 1.4-ounce bar. But the larger bar will be worse for my health (no big deal if I eat just one, but with some cumulative effect if I eat one every day, similarly with the sodas and so forth). And, given the well-known fact that nobody can eat just part of a candy bar, I get more net utility from the small bar, thus they should charge more.
The driver overestimates his control over the situation (including his own car as well as others on the road). The backseat driver ("Whoa--you're taking that curve too fast!") underestimates the driver's control. As a driver, I listen to the passengers because they provide a useful corrective. Even if the backseat driver is sometimes annoying, it makes sense to listen.
More generally: I'll take anybody's advice seriously.
One of the small puzzles of decision analysis is that:
(a) Plans have lots of problems--things commonly don't go according to plan, plans notoriously exclude key possibilities that the planner didn't think of, plans can encourage tunnel vision, etc. But . . .
(b) Plans are helpful. In fact, it's hard to do much of anything useful without a plan. (I'm sure people will come up with counterexamples here, but certainly in my own work and life, not much happens if I don't plan it. Serendipitous encounters are fine but don't add up to much.)
Beyond this, one could add that economic activity seems to work well with minimal planning (just enough structure and rules to set up "the marketplace") but individual actors plan, and need to plan, all the time.
This puzzle is particularly interesting to me as I do work in applied decision analysis.
So what's the solution to the puzzle?
Ubs pointed me to this entry at Mental Floss linking to this article by Graham Walker, who's described as "a co-author of the Official Rock Paper Scissors Strategy Guide (published by Simon and Schuster) and five-time organizer of the World Rock Paper Scissors Championships." Hey, I wanted to organize a RPS tournament one winter in college but everybody thought it was a silly idea. Credit goes to those who put in the effort.
Anyway, here are Walker's suggestions. I thought it was just going to be a joke--"rock always wins" and all that--but they actually look pretty good to me. The comments at the end of the article are interesting too.
The secret to winning at RPS
Basically, there are two ways to win at RPS. First is to take one throw away from your opponent's options. ie - If you can get your opponent to not play rock, then you can safely go with scissors as it will win against paper and stalemate against itself. Seems impossible right? Not if you know the subtle ways you can manipulate someone. The art is to not let them know you are eliminating one of their options. The second way is to force your opponent into making a predictable move. Obviously, the key is that it has to be done without them realizing that you are manipulating them.
Most of the following techniques use variations on these basic principles. How well it works for you depends upon how well you can subtly manipulate your opponent without them figuring out what you are doing. So, now that the background is out of the way, let's get into these techniques:
1 - Rock is for Rookies
In RPS circles a common mantra is "Rock is for Rookies" because males have a tendency to lead with Rock on their opening throw. It has a lot to do with the idea that Rock is perceived as "strong" and "forceful", so guys tend to fall back on it. Use this knowledge to take an easy first win by playing Paper. This tactic is best done in pedestrian matches against someone who doesn't play that much and generally won't work in tournament play.
2 - Scissors on First
The second step in the 'Rock is for Rookies' line of thinking is to play scissors as your opening move against a more experienced player. Since you know they won't come out with rock (since it is too obvious), scissors is your obvious safe move to win against paper or stalemate to itself.
3 - The Double Run
When playing with someone who is not experienced at RPS, look out for double runs or in other words, the same throw twice. When this happens you can safely eliminate that throw and guarantee yourself at worst a stalemate in the next game. So, when you see a two-Scissor run, you know their next move will be Rock or Paper, so Paper is your best move. Why does this work? People hate being predictable and the perceived hallmark of predictability is to come out with the same throw three times in a row.
4 - Telegraph Your Throw
Tell your opponent what you are going to throw and then actually throw what you said. Why? As long as you are not playing someone who actually thinks you are bold enough to telegraph your throw and then actually deliver it, you can eliminate the throw that beats the throw you are telegraphing. So, if you announce rock, your opponent won't play paper, which means coming out with that rock will give you at worst a stalemate and at best the win.
5 - Step Ahead Thinking
Don't know what to do for your next throw? Try playing the throw that would have lost to your opponent's last throw. Sounds weird but it works more often than not. Why? Inexperienced (or flustered) players will often subconsciously deliver the throw that beat their last one. Therefore, if your opponent played paper, they will very often play Scissors, so you go Rock. This is a good tactic in a stalemate situation or when your opponent lost their last game. It is not as successful after a player has won the last game as they are generally in a more confident state of mind which causes them to be more active in choosing their next throw.
6 - Suggest A Throw
When playing against someone who asks you to remind them about the rules, take the opportunity to subtly "suggest a throw" as you explain to them by physically showing them the throw you want them to play. ie "Paper beats Rock, Rock beats scissors (show scissors), Scissors (show scissors again) beats paper." Believe it or not, when people are not paying attention their subconscious mind will often accept your "suggestion". A very similar technique is used by magicians to get someone to take a specific card from the deck.
7 - When All Else Fails Go With Paper
Haven't a clue what to throw next? Then go with Paper. Why? Statistically, in competition play, it has been observed that scissors is thrown the least often. Specifically, it gets delivered 29.6% of the time, so it slightly under-indexes against the expected average of 33.33% by 3.73 percentage points. Obviously, knowing this only gives you a slight advantage, but in a situation where you just don't know what to do, even a slight edge is better than none at all.
8 - The Rounder's Ploy
This technique falls into more of a 'cheating' category, but if you have no honour and can live with yourself the next day, you can use it to get an edge. The way it works is when you suggest a game with someone, make no mention of the number of rounds you are going to play. Play the first match and if you win, take it as a win. If you lose, without missing a beat start playing the 'next' round on the assumption that it was a best 2 out of 3. No doubt you will hear protests from your opponent but stay firm and remind them that 'no one plays best of one for a kind of decision that you two are making'. No, this devious technique won't guarantee you the win, but it will give you a chance to battle back to even and start again.
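The "go with Paper" advice in tip 7 can be checked with a little arithmetic. Here is a rough sketch of the expected payoff (+1 for a win, -1 for a loss, 0 for a tie) of each throw against an opponent who plays scissors 29.6% of the time. Walker gives only the scissors frequency, so the even split of the remaining 70.4% between rock and paper is purely an assumption for illustration, as are the names below.

```python
# BEATS[x] is the throw that x defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def expected_payoff(my_throw, opp_freq):
    """Expected payoff of my_throw (+1 win, -1 loss, 0 tie) against an
    opponent who plays each throw with the given frequency."""
    win = opp_freq[BEATS[my_throw]]
    lose = sum(f for throw, f in opp_freq.items() if BEATS[throw] == my_throw)
    return win - lose


scissors = 0.296            # frequency quoted in the article
rest = (1 - scissors) / 2   # ASSUMPTION: rock and paper split the rest evenly
opp_freq = {"rock": rest, "paper": rest, "scissors": scissors}

payoffs = {throw: expected_payoff(throw, opp_freq) for throw in BEATS}
for throw, value in payoffs.items():
    print(f"{throw:9s} {value:+.3f}")
```

Under that assumed split, paper comes out slightly ahead, rock slightly behind, and scissors breaks even, which matches the article's "slight edge." A different split between rock and paper would change the numbers.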
I'd like to move from basketball to something more important: geriatric care, a topic I was reminded of after reading this interesting article by Atul Gawande.
The article starts with some general discussion of the science of human aging, then moves to consider options for clinical treatment. Gawande learns a lot from observing a gerontologist's half-hour meeting with a patient. He tells a great story (too long to make sense to repeat here), although I suspect he was choosing the best out of the many patients he observed. He notes:
In the story of Jean Gavrilles and her geriatrician, there’s a lesson about frailty. Decline remains our fate; death will come. But, until that last backup system inside each of us fails, decline can occur in two ways. One is early and precipitately, with an old age of enfeeblement and dependence, sustained primarily by nursing homes and hospitals. The other way is more gradual, preserving, for as long as possible, your ability to control your own life.
Good medical care can influence which direction a person’s old age will take. Most of us in medicine, however, don’t know how to think about decline. We’re good at addressing specific, individual problems: colon cancer, high blood pressure, arthritic knees. Give us a disease, and we can do something about it. But give us an elderly woman with colon cancer, high blood pressure, arthritic knees, and various other ailments besides—an elderly woman at risk of losing the life she enjoys—and we are not sure what to do.
Gawande continues with a summary of this study:
Several years ago, researchers in St. Paul, Minnesota, identified five hundred and sixty-eight men and women over the age of seventy who were living independently but were at high risk of becoming disabled because of chronic health problems, recent illness, or cognitive changes. With their permission, the researchers randomly assigned half of them to see a team of geriatric specialists. The others were asked to see their usual physician, who was notified of their high-risk status. Within eighteen months, ten per cent of the patients in both groups had died. But the patients who had seen a geriatrics team were a third less likely to become disabled and half as likely to develop depression. They were forty per cent less likely to require home health services.
Little of what the geriatricians had done was high-tech medicine: they didn’t do lung biopsies or back surgery or PET scans. Instead, they simplified medications. They saw that arthritis was controlled. They made sure toenails were trimmed and meals were square. They looked for worrisome signs of isolation and had a social worker check that the patient’s home was safe.
But now comes the kicker:
How do we reward this kind of work? Chad Boult, who was the lead investigator of the St. Paul study and a geriatrician at the University of Minnesota, can tell you. A few months after he published his study, demonstrating how much better people’s lives were with specialized geriatric care, the university closed the division of geriatrics.
“The university said that it simply could not sustain the financial losses,” Boult said from Baltimore, where he is now a professor at the Johns Hopkins Bloomberg School of Public Health.
One of the problems comes from the "separate accounts" fallacy in decision making:
On average, in Boult’s study, the geriatric services cost the hospital $1,350 more per person than the savings they produced, and Medicare, the insurer for the elderly, does not cover that cost. It’s a strange double standard. No one insists that a twenty-five-thousand-dollar pacemaker or a coronary-artery stent save money for insurers. It just has to maybe do people some good. Meanwhile, the twenty-plus members of the proven geriatrics team at the University of Minnesota had to find new jobs. Scores of medical centers across the country have shrunk or closed their geriatrics units. Several of Boult’s colleagues no longer advertise their geriatric training for fear that they’ll get too many elderly patients. “Economically, it has become too difficult,” Boult said.
But the finances are only a symptom of a deeper reality: people have not insisted on a change in priorities. We all like new medical gizmos and demand that policymakers make sure they are paid for. They feed our hope that the troubles of the body can be fixed for good. But geriatricians? Who clamors for geriatricians? What geriatricians do—bolster our resilience in old age, our capacity to weather what comes—is both difficult and unappealingly limited. It requires attention to the body and its alterations. It requires vigilance over nutrition, medications, and living situations.
On the plus side, Baltimore has much better weather than St. Paul.
From the article by Boult et al. (you might notice a shift in style from the New Yorker to the Journal of the American Geriatrics Society):
PARTICIPANTS: A population-based sample of community-dwelling Medicare beneficiaries age 70 and older who were at high risk for hospital admission in the future (N = 568).
INTERVENTION: Comprehensive assessment followed by interdisciplinary primary care.
MEASUREMENTS: Functional ability, restricted activity days, bed disability days, depressive symptoms, mortality, Medicare payments, and use of health services. Interviewers were blinded to participants' group status.
RESULTS: Intention-to-treat analysis showed that the experimental participants were significantly less likely than the controls to lose functional ability (adjusted odds ratio (aOR) = 0.67, 95% confidence interval (CI) = 0.47–0.99), to experience increased health-related restrictions in their daily activities (aOR = 0.60, 95% CI = 0.37–0.96), to have possible depression (aOR = 0.44, 95% CI = 0.20–0.94), or to use home healthcare services (aOR = 0.60, 95% CI = 0.37–0.92) during the 12 to 18 months after randomization. Mortality, use of most health services, and total Medicare payments did not differ significantly between the two groups. The intervention cost $1,350 per person.
CONCLUSION: Targeted outpatient GEM slows functional decline.
