September 10, 2008
Robin Hanson and I discuss adjusting for variables you shouldn't adjust for (for example, adjusting grades given sex, race, or pre-test scores)
In response to something Robin Hanson wrote on his blog (sorry I can't find the exact link, I think it was at the end of July, 2008), I wrote:
Regarding the general point of adjusting for variables such as sex and race (and, for that matter, previous test scores) that you're not "supposed" to adjust for, see the last (non-footnote) paragraph on page 4 that continues on to page 5 in this article.
The short version is that sometimes "fairness" or "the rules" are more important than inference about ability. Also, a decision rule that is optimal for each individual is not necessarily best for the group. Randomized rules are not in general optimal but they can provide a mix of outcomes that might be desirable in aggregate.
Robin responded:
Well I'd feel better if we had a coherent theory of "fairness" or "the rules" we could use to determine when we should not infer on all info available. Otherwise I fear those are just empty excuses for not inferring things we don't want to infer. I can see the abstract possibility, but I'd want to see a concrete argument applied correctly to a specific circumstance, not just some vague hand-waving about "fairness."
Then I wrote:
I think the course grades example is a real one. Suppose I give a pre-test at the beginning of the course, then at the end I give a final exam. For simplicity suppose these are the only 2 pieces of info that we have on the students, and then imagine we can use these to predict future performance (e.g., grades in a future course). Once the course is over, the pre-test probably adds information (beyond what's in the final exam alone), but it wouldn't really be fair to use the pre-test to assign the final grade.
The principle here, such as it is, seems to be that a course grade should be based on things done in the course. A justification for this principle is that, when a future employer (for example) sees a transcript, he or she can best understand it if the separate course grades represent separate pieces of information. As a Stat 100 instructor, my job in assigning grades is to record how well the students did in Stat 100; it's not my job to second-guess transcript readers by giving my Bayesian estimate of the students' true ability.
More generally, then, this is the principle that keeping information segregated has a benefit in making it easier for outsiders to make best use of the information. Similarly, if I buy a widget on Amazon and rate it, I'm making the best contribution to society if I accurately describe my own experience with the widget--rather than reading everyone else's reviews and then Bayesianly shrinking my own judgment to the common mean. Any analyst can do this; the contribution I'm making as a rater is to describe my own experiences, and diluting this with other information will just make it more difficult for others to make use of what I'm telling them.
And Robin wrote:
Well I completely agree that we want to keep info sources as modular as possible, so that we simplify as much as possible the task of combining info sources, including the task of updating the total given updates to each part. And in the case of course grades I agree that modularity suggests one grade a course only on work done for that course. But it seems to me that in this case we were discussing, the more modular choice is in fact to adjust the test scores for what we know about differing variances of different groups. If we don't in fact do that with the test score, I don't see another plausible process whereby that info will be included in the final result. Are you more imaginative here than I?
Posted by Andrew at 9:34 PM | Comments (5) | TrackBack
September 3, 2008
Melding statistics with engineering?
Dan Lakeland writes:
I recently enrolled as a PhD student in a civil engineering program. My interest could be described as the application of data and risk analysis to engineering modelling, design methods, and decision making.
The field is pretty ripe, and infrastructure risk analysis is a common topic these days, but the simulations and statistical approaches taken so far have been a bit unsatisfactory. For example people studying the impact of bridge failures during earthquakes on the local economy might assume a constant cost per person-hour of delay throughout the rebuild period, or people might build statistical models of probability of building collapse, but I would call them pretty much prior distributions, not really based on much data, or based on a finite element computer model of the physics of a single model building.
I think the application of data to engineering is bizarrely a rather new field. Or at least in a renaissance. Back in the 50s or earlier they used to do lots of tests, and generate graphical nomographs of the results (like the Moody chart for fluid flow friction factors), but these days the emphasis is on detailed finite element analyses, which tell you exactly how some model will perform, but don't deal at all with the difference between your model assumptions and reality.
I'm attaching an article that I'm reading for an earthquake soil mechanics class, which shows pretty much the state of the art of applications of (Bayesian) statistics to engineering. A CPT test is a test where they push a cone on the end of a long rod into the ground and measure the pressure being applied to the cone as a function of depth. Another paper I've read uses artificial neural networks to predict the shear capacity of reinforced concrete beams. Engineers typically don't like ANN-type approaches because they're data oriented and don't have explanatory power in terms of physics. On the other hand, the ANN model, because it's based on data, is a much better fit to real performance than the existing physics-based models.
I wonder if you might comment in your blog on melding statistics with engineering, especially how we can use data together with deterministic models and build better engineering decision rules, both for everyday engineering and for social investment decisions such as building code requirements for extreme events like earthquakes, hurricanes, and so forth.
What decision theory books or articles do you know of that might be useful and relevant to this field?
My reply
I've long thought of statistics as a branch of engineering rather than a science. To me, statistics is all about building tools to solve problems. On the other hand, departments of Operations Research and Industrial Engineering tend to focus on probability theory rather than applied statistics, so I think we need our own departments.
Getting to your specific question: yes, I know what you're talking about. Back in high school and college I spent a few summers working in a lab programming finite element methods. Ultimately this was all statistical, but I didn't see that at the time. I imagine there's been a huge amount of work in this area in the past 25 years, with iterative methods for refining grid boxes and so forth. It would be a fun area to work in. But I suspect it would be an effort to translate it into statistical language.
It seems to me that engineers and physicists work very hard at solving particular problems, which are often big and difficult. Statisticians develop general tools for easy problems (e.g., logistic regression), which is a different sort of challenge. I think there's great potential for putting these perspectives together but I'm not quite clear where to start. I've seen some articles in statistics journals addressing your concerns but I haven't been so impressed by what I've seen there. Probably a better strategy is to start with the engineering literature and add uncertainty to that.
Posted by Andrew at 12:17 AM | Comments (8) | TrackBack
August 12, 2008
Just one of those things
Several years ago I was at the Library of Congress and asked where to go to get to the stacks. The guard told me that the stacks were closed. I asked, when did that happen? He replied that the Library of Congress had never had open stacks. The funny thing is, I knew he was wrong, because in high school I went to the Library of Congress a couple of times and I remember roaming the stacks, which were positioned sort of like spokes in a wheel. It was so cool to go to the stacks and see all the books written by an author. (I also remember looking for the book, "Get Even: The Complete Book of Dirty Tricks," which was in the card catalog--remember those?--but not on the shelves. But only Members of Congress can check books out of the Library of Congress. Hmmm.....) It's annoying how people can be so sure of themselves. The guard had probably been working there 10 years and so he thought he knew everything about the place.
Posted by Andrew at 12:36 AM | Comments (10) | TrackBack
August 6, 2008
A Natural Log: Our Innate Sense of Numbers is Logarithmic, Not Linear
See here:
We humans seem to be born with a number line in our head. But a May 30 study in Science suggests it may look less like an evenly segmented ruler and more like a logarithmic slide rule on which the distance between two numbers represents their ratio (when divided) rather than their difference (when subtracted).
This is consistent with our analysis in chapter 5 of our book of decisions of Bangladeshis about whether to switch wells because of arsenic in drinking water. Among households with dangerous wells (arsenic content higher than 50 (in some units)), we predicted whether a household switches wells, given two predictors:
- distance to the nearest safe well;
- arsenic level of their existing well.
The data were consistent with the model that people weight "distance to nearest safe well" linearly but weight "arsenic level" on the log scale. As we discuss in our book, this makes psychological sense: distance is something you perceive directly and linearly, by walking (it takes twice as much time and effort to walk 200m as to walk 100m), whereas arsenic level is just a number and, as such, going from 50 to 100 seems about the same, psychologically, as going from 100 to 200 or 200 to 400--even though, in reality, that last jump is four times as bad as the first (arsenic being a cumulative poison).
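To make the log-scale point concrete, here is a minimal sketch in Python. The coefficients are made up for illustration (they are not the estimates from the book), and the function names are hypothetical. The only thing it is meant to show is that, in a logistic model with distance entering linearly and arsenic entering as log(arsenic), every doubling of arsenic shifts the linear predictor by the same amount, while every extra 100m of distance shifts it by the same amount.

```python
import math

# Hypothetical coefficients, chosen only for illustration
# (NOT the fitted values from the well-switching analysis).
INTERCEPT = -2.0
B_DIST_100M = -0.9    # change in log-odds per 100 m to the nearest safe well
B_LOG_ARSENIC = 0.8   # change in log-odds per unit of log(arsenic)


def linear_predictor(dist_m, arsenic):
    """Log-odds of switching: linear in distance, logarithmic in arsenic."""
    return (INTERCEPT
            + B_DIST_100M * (dist_m / 100.0)
            + B_LOG_ARSENIC * math.log(arsenic))


def prob_switch(dist_m, arsenic):
    """Inverse-logit of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(dist_m, arsenic)))


def doubling_steps(dist_m, levels):
    """Change in log-odds between successive arsenic levels."""
    return [linear_predictor(dist_m, hi) - linear_predictor(dist_m, lo)
            for lo, hi in zip(levels, levels[1:])]


if __name__ == "__main__":
    # 50 -> 100, 100 -> 200, 200 -> 400 are all the same step on this scale.
    print(doubling_steps(100, [50, 100, 200, 400]))
    print(prob_switch(100, 50), prob_switch(100, 400))
```

Under these assumptions each doubling adds B_LOG_ARSENIC * log(2) to the log-odds, which is the sense in which 50 to 100 "seems about the same" as 200 to 400.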
Posted by Andrew at 12:55 AM | Comments (3) | TrackBack
July 24, 2008
Traffic fines and retrospective and prospective decision analysis
John Sides points to a news report of a hit-and-run driver who "struck and slightly injured a pedestrian while driving his sports car in downtown Washington" and then said, "I didn’t know I hit him…I feel terrible…[But] he’s not dead, that’s the main thing." He was fined $50.
It often seems to happen this way, that punishments for reckless driving are much less severe than the effect of the crime itself. (Even being "slightly injured" in a car crash has gotta be a personal loss of much more than $50, not even counting hospital costs.) This is particularly striking given that not every offender is caught, so you might think that punishments would be higher for their deterrent value.
Why are the punishments so low? One reason is that many of the legislators who write the laws and judges who decide sentencing are themselves dangerous drivers at times, and I suspect that it can be easier for them to identify with the criminals than the victims. (If Gary Larson were writing the laws, there'd probably be a death penalty for running over a dog.)
But I think there's something deeper going on, having to do with retrospective and prospective decision analysis. In the driving example, it goes like this.
1. Suppose somebody (e.g., Dick Cheney) is driving dangerously but nobody is hurt, or not seriously. Then the response is that no serious harm was done--it's just one of those things--so no point in having a big punishment.
2. Suppose somebody (e.g., Ted Kennedy or Laura Bush) is driving dangerously and seriously injures or kills someone. Then the response is that it's a terrible tragedy but very bad luck, so what is gained by seriously punishing the driver.
The issue is that deaths and serious injuries are also rare--even if you drive recklessly, it's extremely unlikely that you'll kill someone in any given outing. So you're stuck between punishing the almosts and might-have-beens or really laying down the hammer on the serious cases. No option seems quite right. Although I guess in this case the pedestrian will do all right because he'll probably sue the driver for a couple of million dollars.
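The prospective side of this can be sketched with a little arithmetic. If the goal is some target expected penalty per reckless outing, the penalty that has to be imposed when it actually applies scales inversely with the probability that it applies. The probabilities and dollar amounts below are hypothetical, picked only to show the orders of magnitude involved, and the function name is made up.

```python
def required_penalty(target_expected_penalty, prob_penalty_applies):
    """Penalty needed so that (probability it applies) x (penalty)
    equals the target expected penalty per reckless outing."""
    if not 0.0 < prob_penalty_applies <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return target_expected_penalty / prob_penalty_applies


if __name__ == "__main__":
    target = 50.0  # hypothetical target expected penalty per outing, in dollars

    # Option 1: punish the "almosts". Assume (hypothetically) a 1% chance
    # of being caught driving recklessly when nobody is hurt.
    print(required_penalty(target, 0.01))

    # Option 2: punish only the serious cases. Assume (hypothetically) a
    # 1-in-100,000 chance that a reckless outing seriously injures someone.
    print(required_penalty(target, 1e-5))
```

With these made-up inputs, the first option needs a fine in the thousands of dollars for an outing where nobody was hurt, and the second needs a penalty in the millions for what looks, retrospectively, like bad luck. That is the bind described above.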
Posted by Andrew at 10:19 AM | Comments (9) | TrackBack
July 18, 2008
Are you saving too little or too much?
Alex Tabarrok has an interesting discussion of saving strategies. Alex writes:
There are people who don't save much because they have very low incomes, their behavior does not seem to be in error, especially when we take into consideration the various welfare programs that will cover people in their old age. . . . So let's focus on people with moderate to high incomes. . . . Over confidence and in particular the idea that we are special and will live a long life suggests the error is saving too much. . . . Availability bias probably also suggests we save too much - we see people who saved too little in the street but the ones who saved too much are dead and gone. . . . I do not know which error is more prevalent but if we are to be neither spendthrift nor miser we need to recognize both types of error.
My guess is that Alex is a little too optimistic about people's savings strategies, given all the credit card debt out there. Also, as some of his commenters note, it's easy for people to get used to a particular spending pattern, and it's easier to ramp it up than to scale it down. So, for psychological purposes, it might be better to plan for a gradually increasing standard of living than something completely flat over time.
But I'm sympathetic with Alex's general point that both kinds of errors are relevant. It reminds me of when I asked the students in my decision analysis class to raise their hands if they'd never missed a flight. I then said to them: You go to the airport too early! A retrospective rather than a prospective analysis but still essentially correct, I think.
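The airport remark can be turned into a small prospective calculation. The sketch below assumes a toy model that is not from the class: the trip to the gate takes a fixed minimum time plus an exponentially distributed delay, every minute of buffer costs a fixed amount, and a missed flight costs a fixed amount. All numbers and names are hypothetical.

```python
import math

# Hypothetical inputs, for illustration only.
WAIT_COST_PER_MIN = 1.0   # dollars per minute of buffer allowed
MISS_COST = 500.0         # dollars lost if the flight is missed
MIN_TRIP_MIN = 60.0       # minimum time needed to reach the gate
MEAN_DELAY_MIN = 20.0     # mean of the exponential delay on top of that


def prob_miss(buffer_min):
    """P(miss the flight) if you leave buffer_min minutes before departure."""
    if buffer_min <= MIN_TRIP_MIN:
        return 1.0
    return math.exp(-(buffer_min - MIN_TRIP_MIN) / MEAN_DELAY_MIN)


def expected_cost(buffer_min):
    """Cost of the time allowed plus the expected cost of missing the flight.
    Simplification: the whole buffer is charged at the waiting rate."""
    return WAIT_COST_PER_MIN * buffer_min + MISS_COST * prob_miss(buffer_min)


def best_buffer(candidates):
    """Candidate buffer with the smallest expected cost."""
    return min(candidates, key=expected_cost)


if __name__ == "__main__":
    b = best_buffer(range(60, 301))
    print(b, prob_miss(b))
```

In this toy setup the cost-minimizing buffer leaves a miss probability of a few percent per flight rather than zero, so someone following it for long enough would expect to miss a flight now and then.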
Posted by Andrew at 6:40 AM | Comments (6) | TrackBack
June 27, 2008
Paternalistic software
This is ok, but I like my solution better.
Posted by Andrew at 9:49 PM | Comments (2) | TrackBack
June 26, 2008
rps
Eddie Randolph writes,
I was wondering if you had any thoughts on the World Rock Paper Scissors Contest currently being held. Do you think it will be won by someone who plays intuitively, or a master strategist? If you think the strategist will win, do you think they will employ strategies from the book you pointed to in your blog?
My reply: I love rock-paper-scissors but I'm afraid I have no deep theories. I'd guess that it's pretty random, that whoever wins one year wouldn't have much better than a random chance of doing well the next year.
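One piece of standard game theory that is consistent with the "pretty random" guess: a player who picks rock, paper, and scissors with probability 1/3 each breaks even on average against any opponent strategy. The sketch below checks that by direct calculation. It says nothing about how actual tournament players behave, since people are not perfect randomizers.

```python
MOVES = ("rock", "paper", "scissors")

# PAYOFF[i][j] is the row player's payoff when playing MOVES[i]
# against MOVES[j]: +1 for a win, -1 for a loss, 0 for a tie.
PAYOFF = [
    [0, -1, 1],   # rock:     loses to paper, beats scissors
    [1, 0, -1],   # paper:    beats rock, loses to scissors
    [-1, 1, 0],   # scissors: loses to rock, beats paper
]


def expected_payoff(mine, theirs):
    """Expected payoff to me, given two mixed strategies
    (each a list of three probabilities in MOVES order)."""
    return sum(mine[i] * theirs[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))


if __name__ == "__main__":
    uniform = [1 / 3, 1 / 3, 1 / 3]
    for opponent in ([1, 0, 0], [0.5, 0.3, 0.2], [0, 0, 1]):
        print(opponent, expected_payoff(uniform, opponent))
```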
Posted by Andrew at 12:58 AM | Comments (1) | TrackBack
June 16, 2008
Non-spam comments with spam links
This entry received the following comment:
You can't compare each round as a parallel test because they move the tee boxes and hole locations every day. This makes the course much more difficult on some days and that is what separates the best from the worst.a href="http://www.sporthaley.com" Women's Golf Clothing /a
(I've purposely unlinked the html.) Also, the commenter's name is given as "Women's Golf Clothing," and the above link is given as the referring url.
I don't know what to make of this sort of thing. It's hard for me to believe it's pure spam--could a bot really read the entry and make that comment? But what person would sign his or her comment as "Women's Golf Clothing"? We get this kind of semi-spam on the blog comments now and then, and I'm never sure what to think about it.
P.S. Just to be clear, here's another comment that clearly is 100% spam. It's for the blog entry on "Baby Names," it's by "baby boy," and it says, "I can think of "yang" names, you can never know." That's clearly spam, unlike the above comment which was human-generated.
Posted by Andrew at 2:00 PM | Comments (17) | TrackBack
June 11, 2008
This city sucks, or, a suggested equivalence principle
We went by the pool in Central Park today and it was drained of water. Wha . . . ? It was only about 100 degrees out there today.
Aaahhhh, I see: "Pool Opens July 1st and closes after Labor Day". (Note also that in the picture on the website, the pool has no people in it. That's no coincidence: when there are people in it, it's jam-packed with people.)
OK, I have an idea: no A/C in city offices until all the swimming pools open. If the city doesn't want to open the pool until July 1st, no problem: it's probably not so hot outside, and I'm sure the desk workers can make do with fans.
And, just in case they decide to do what they did in previous years and open up only half of the pool (I'm not kidding, they roped half the pool off so that a few zillion people were crammed into 50% of the space), then they can do the same thing in city offices: half of the offices can have A/C, half won't. I'm not sure what to do about the mayor. Maybe give him A/C half the time?
P.S. When open, the pool hours are from 11-3 and 4-7. Of course, this means that city offices should also be air conditioned only during these times.
Posted by Andrew at 12:24 AM | Comments (13) | TrackBack
May 20, 2008
Everybody knows this already, no?
We're more likely to listen to expensive advice. Dog bites man, or is there something I'm missing here?
P.S. I know from personal experience that if you raise your consulting fee high enough, people do start to choke on the price and either say no or reduce the number of hours they want. Both of which are good outcomes from my perspective.
Posted by Andrew at 2:13 AM | Comments (1) | TrackBack
May 19, 2008
Happy smokers pay more for cigs; connection to the well-known fact that nobody can eat just part of a candy bar
This article by Tim Harford reminds me of an example I used to give in my decision analysis class:
When I was younger, people used to complain about candy bars getting smaller and smaller. (For example, Stephen Jay Gould has a graph in one of his books showing the size of the standard Hershey bar declining from 2 ounces in 1965 gradually down to 1.2 ounces in 1980, and for that matter I can recall tunafish cans gradually declining from 8 ounces to 6 ounces.) And I remember going to the candy machine with my quarter and picking out the candy bar that was heaviest--I don't remember which one--even if it wasn't my favorite flavor, to get the most value for the money.
But now I realize that, rationally, candymakers should charge more for smaller candy bars. The joy from eating the candy is basically discrete--I'll get essentially no more joy from a 1.7-ounce bar than from a 1.4-ounce bar. But the larger bar will be worse for my health (no big deal if I eat just one, but with some cumulative effect if I eat one every day, similarly with the sodas and so forth). And, given the well-known fact that nobody can eat just part of a candy bar, I get more net utility from the small bar, thus they should charge more.
See here for a link to a research study on this.
Posted by Andrew at 12:55 AM | Comments (2) | TrackBack
May 7, 2008
Speak clearly
When you leave a voice mail, please say your name and phone number slowly and clearly. Thank you.
Posted by Andrew at 10:29 AM | Comments (4) | TrackBack
April 10, 2008
Distinction between decision analysis and 'statistical decision theory'
Michail Fragkias writes,
While reading Chapter 22 of your book, Bayesian Data Analysis (2nd ed.), I came upon the section on the *Distinction between decision analysis and ‘statistical decision theory’* (p. 543-44) in which you seem quite critical of statistical decision theory, suggesting that it is not useful for real decision problems. Actually, I was quite confused by the three paragraphs as I thought that they do not reflect statistical decision theory as I understand it.
A disclaimer: I’m an applied economist by training and recently decided to educate myself in Bayesian statistics and econometrics due to my developing interest in statistical decision theory. Obviously, I started working through your book on Bayesian Data Analysis to get a clear exposition of foundations, but did decide to look ahead at Ch. 22 early on due to my interests. Comparing the content of these paragraphs with the treatment in the 2000 book by French and Rios Insua ‘Statistical Decision Theory’, for example, it seems to me that the core idea of statistical decision theory is misrepresented. Unfortunately, there is, though, no reference in this section to the original work that is being criticized. On the issue of real examples of usage of statistical decision theory, I’m aware of useful applications in economics, such as the work by Brock, Durlauf and West (http://www.nber.org/papers/w10025).
Am I missing something being new to the field?
My reply: I have not read the book that you cite. I am a big fan of decision analysis and decision theory, though. The thing that I don't like is so-called "statistical decision theory" in which an "estimator" is chosen based on minimizing some theoretically chosen measure of loss. It's a relic from the 1940s and 1950s, the idea that choosing a statistical estimator should be treated as a decision problem. The decision analysis that I like is set up with actual losses (dollars, lives, whatever).
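As a minimal sketch of decision analysis with actual losses: take draws from a posterior distribution, write down the loss of each available action in real units (dollars here), and pick the action with the smallest expected loss. Everything in the example is made up for illustration: the posterior draws, the action names, and the loss functions are hypothetical and are not taken from the book.

```python
def expected_loss(loss_fn, draws):
    """Average loss of one action over posterior draws of the unknown."""
    return sum(loss_fn(theta) for theta in draws) / len(draws)


def best_action(actions, draws):
    """Name of the action with the smallest expected loss."""
    return min(actions, key=lambda name: expected_loss(actions[name], draws))


if __name__ == "__main__":
    # Hypothetical posterior draws of some exposure level.
    draws = [1.0, 2.0, 3.0, 10.0]

    # Hypothetical dollar losses for each action.
    actions = {
        "remediate": lambda theta: 2000.0,             # fixed cost
        "do nothing": lambda theta: 100.0 * theta ** 2,  # harm grows fast
    }

    print(best_action(actions, draws))

    # Plugging a point estimate (the posterior mean) into the loss
    # gives a different picture from averaging the loss over the draws.
    mean_theta = sum(draws) / len(draws)
    print(actions["do nothing"](mean_theta),
          expected_loss(actions["do nothing"], draws))
```

With these made-up numbers, plugging in the posterior mean makes "do nothing" look cheaper than remediation, while averaging the dollar loss over the draws favors remediating. The decision depends on the losses and the whole posterior, not on which point estimator was chosen.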
Posted by Andrew at 12:19 AM | Comments (0) | TrackBack
April 8, 2008
Dissonance on cognitive dissonance
Chris Wiggins points me to this column by John Tierney reporting research by Keith Chen on cognitive dissonance--that well-known phenomenon whereby we change our preferences to match our pre-existing decisions (for example, not wanting to hear bad news about one's preferred presidential candidate). Chen wrote a paper claiming that cognitive dissonance is not nearly as important as everyone thought it was. For Tierney's column, Chen writes,
All of the studies I [Chen] talk about take as their basic model a famous and incredibly influential experiment by Jack Brehm in 1956; the first study, in fact, which psychologists took to demonstrate cognitive dissonance. In Brehm’s study and its modern variants, subjects are first asked to rate or rank a bunch of goods based on how much they like them. Then, subjects are offered a choice between two of the goods they just rated, and are told they can take the good they choose home with them as payment for the study. They are then asked to re-rate all of the original goods; cognitive dissonance theory suggests that people would have a better opinion of the good they choose after choosing it than before.
So, for example, subjects may first be asked to rank 15 goods from 1 to 15, with 1 being the best and 15 being the worst. Then, a subject would be asked to choose between two goods they initially ranked similarly, say the goods they ranked 7 and 9. After making this choice, psychologists have looked at whether, if asked to rank these goods again, the chosen good rises in rank, and if the rejected good falls. This seems like a perfectly reasonable thing to look at; but there’s a big problem in how this has been done.
The problem is, when you ask subjects to choose between goods they ranked 7 and 9 (call these goods A and B), many subjects choose good B, (the good they initially ranked lower). Typically about one-quarter to one-third of subjects do this. Now, why people do this isn’t entirely clear, but one thought is that it indicates that asking subjects to rank goods from best to worst isn’t a perfect measure of how they feel. Some of them might not take the task that seriously; some might get confused by all the choices. So while they’ll initially rank B below A in the list of items, when they actually focus just on those two items they realize they actually prefer B to A.
The real problem, though, is what psychology studies did with subjects who “switched” — that is, those who chose the good they initially said they liked less. What many studies did (following the original Brehm study) is to exclude from the study those subjects who choose good B. Is this a problem that can bias their findings? Yes, if we think that the subjects being excluded from study are systematically different from those who aren’t. Specifically, when we drop subjects who choose good B over good A, we may be systematically dropping subjects who like good B (more than good A). Ignoring this possibility is like ignoring Monty’s choice and what it tells us about where the car is likely to be. By throwing out subjects, a study “stacks the deck” of remaining subjects with people who like good A more than they like good B. In fact, all remaining subjects have signaled they feel that way, twice. Maybe it shouldn’t be a surprise then, when asked to re-rank these items, the rank of good A rises; it originally ranked 7 from a larger group, then those who on second thought didn’t like it so much, were dropped.
Many studies that examine “spreading” look at how much A goes up and B goes down only for those people who chose A (as just described). Others look at whether the chosen good (A if you choose A and B if you choose B) went up and the non-chosen went down for everyone. This is also problematic for exactly the same reason; we shouldn’t be surprised that people like the things they choose, and the experiment needs to take that into account before it can correctly claim that dissonance is occurring.
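The selection argument in the quoted passage can be checked with a small simulation. The sketch below is a toy model, not Chen's analysis: each subject has a fixed true preference gap between goods A and B that never changes (so there is no dissonance by construction), ratings are the true gap plus independent noise, only pairs where A was initially rated slightly above B are used, and subjects choose according to their true preference. The distributions, the cutoff of 0.5, and the function name are all assumptions.

```python
import random


def simulated_spread(n_subjects=100_000, noise_sd=1.0, seed=0):
    """Average change in the rated A-minus-B gap among subjects who
    initially rated A slightly above B and then chose A. True
    preferences are fixed, so any change is a selection artifact."""
    rng = random.Random(seed)
    spreads = []
    for _ in range(n_subjects):
        true_gap = rng.gauss(0.0, 1.0)                   # fixed preference, A minus B
        first_gap = true_gap + rng.gauss(0.0, noise_sd)  # noisy first rating
        if not 0.0 < first_gap < 0.5:
            continue  # only use pairs where A was rated a bit above B
        if true_gap <= 0.0:
            continue  # subject chose B: dropped, as in the original design
        second_gap = true_gap + rng.gauss(0.0, noise_sd)  # noisy re-rating
        spreads.append(second_gap - first_gap)
    return sum(spreads) / len(spreads)


if __name__ == "__main__":
    print(simulated_spread())
```

Under these assumptions the average "spread" among the retained subjects comes out positive even though nobody's preferences moved, which is the deck-stacking effect described in the quote.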
This is interesting. I'm certainly sympathetic to the argument that preferences don't exist independently of the settings in which they are chosen. (See here and here.) On the other hand, the desire to avoid cognitive dissonance seems real to me. I'd like to see how this work fits into the general literature on the topic. (Also, I'm not quite sure why Tierney calls this "social psychology." Isn't it "cognitive psychology"? I'm sure there's something I'm missing here.)
Posted by Andrew at 6:04 PM | Comments (5) | TrackBack
April 4, 2008
A dismal theorem?
James Annan writes,
I wonder if you would consider commenting on Marty Weitzman's "Dismal Theorem", which purports to show that all estimates of what he calls a "scaling parameter" (climate sensitivity is one example) must be long-tailed, in the sense of having a pdf that decays as an inverse polynomial and not faster. The conclusion he draws is that using a standard risk-averse loss function gives an infinite expected loss, and always will for any amount of observational evidence.
I looked up Weitzman and found this paper, "On Modeling and Interpreting the Economics of Catastrophic Climate Change," which discusses his "dismal theorem." I couldn't bring myself to put in the effort to understand exactly what he was saying, but I caught something about posterior distributions having fat tails. That's true--this is a point made in many Bayesian statistics texts, including ours (chapter 3) and many that came before us (for example, Box and Tiao). With any finite sample, it's hard to rule out the hypothesis of a huge underlying variance. (Fundamentally, the reason is that, if the underlying distribution truly does have fat tails, it's possible for them to be hidden in any reasonable sample. It's that Black Swan thing all over again.) I think that Weitzman is making some deeper technical point, and I'm sure I'm disappointing Annan by not having more to say on this . . .
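The textbook version of the fat-tails point can be illustrated numerically. In the standard normal model with unknown mean and variance and the usual noninformative prior, the marginal posterior for the mean is a Student t with n-1 degrees of freedom, whose density decays polynomially, whereas with a known variance it would be normal, with tails decaying much faster. The sketch below just compares the two standardized densities far from the center. It is the generic textbook case, not a reconstruction of Weitzman's model, and the function names are made up.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def student_t_pdf(z, df):
    """Standard Student-t density with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2.0)
                - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(z * z / df))


def tail_ratio(z, df):
    """How much more density the t puts at z than the normal does."""
    return student_t_pdf(z, df) / normal_pdf(z)


if __name__ == "__main__":
    # df = 4 corresponds to a sample of n = 5 in the textbook setup.
    for z in (2, 5, 10):
        print(z, tail_ratio(z, df=4))
```

The ratio grows without bound as z increases, which is one way of seeing why a small sample cannot rule out very extreme values once the variance is treated as unknown.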
More
Searching on the web, I found this article by William Nordhaus criticizing Weitzman's reasoning. Unfortunately, Nordhaus's article just left me more confused: he kept talking about a utility function of the form U(c) = (1-c^(1-a))/(1-a), which doesn't seem to be relevant to the climate change example. Or to any other example, for that matter. Attempting to model risk aversion with a utility function--that's so 1950s, dude! It's all about loss aversion and uncertainty aversion nowadays. This isn't Nordhaus's fault--he seems to be working off of Weitzman's model--but it's hard for me to know how to evaluate any of this stuff if it's based on this sort of model.
Also, I don't buy Nordhaus's argument on page 4 that you can deduce our implicit value of non-extinction by looking at how much the U.S. government spends on avoiding asteroid impacts. This reminds me of the sorts of comparisons people do, things like total spending on cosmetics or sports betting compared to cancer research. I already know that we spend money on short-term priorities--I wouldn't use that to make broad claims about the "negative utility of extinction."
Back to Weitzman's paper
I find abbreviations such as DT (for the "dismal theorem") and GHG (for greenhouse gases) to be distracting. I don't know if this is fair of me. I don't mind U.S. or FBI or EPA or other common abbreviations, but I find it really annoying to read a phrase such as, "Phrased differently, is DT an economics version of an impossibility theorem which signifies that there are fat-tailed situations where economic analysis is up against a strong constraint on the ability of any quantitative analysis to inform us without committing to a VSL-like parameter and an empirical CBA framework that is based upon some explicit numerical estimates of the miniscule [sic] probabilities of all levels of catastrophic impacts up to absolute disaster?" The concepts are tricky enough as it is without me having to try to flip back and find out what is meant by DT, VSL, and CBA. But, if Weitzman were to spell out all the words, would the other economists think he's some sort of rube? I just don't know the rules here.
On page 37, near the end of the paper, Weitzman writes, "A so-called Integrated Assessment Model (hereafter IAM) . . ." I was reminded of Raymond Chandler's advice for writers: "When in doubt, have a man come through the door with a gun in his hand." Or, in this case, an abbreviation. Never let your readers relax, that's my motto.
I'm not sure how to think about the decision analysis questions. For example, Weitzman writes, "Should we have foregone the industrial revolution because of the GHGs it generated?" But I don't think that foregoing the industrial revolution was ever a live option.
P.S. I have to admit, "miniscule" sounds right. It begins with "mini," after all.
Posted by Andrew at 2:39 AM | Comments (15) | TrackBack
April 2, 2008
The limits of open-mindedness in evaluating scientific research
Seth is skeptical of skepticism in evaluating scientific research. He starts by pointing out that it can be foolish to ignore data, just because they don't come from a randomized experiment. The "gold standard" of double-blind experimentation has become an official currency, and Seth is arguing for some bimetallism. To continue with this ridiculous analogy, a little bit of inflation is a good thing: some liquidity in scientific research is needed in order to keep the entire enterprise moving smoothly.
As Gresham has taught us, if observational studies are outlawed, then only outlaws will do observational studies.
I think Seth goes too far, though, and that brings up an interesting question.
In the discussion on his blog, Seth appears to hold the position that all published research has value. (At least, I brought up a notorious example of error-ridden research, and Seth responded that "I don’t agree that this means its info is useless.") But if all published research, even that with crippling errors, is useful, then presumably this is true of some large fraction of unpublished research, right?
At this point, even setting aside monkeys-at-a-typewriter arguments, there's the question of what we're supposed to do with the mountain of research: millions of published articles each year, plus who knows how many undergraduate term papers, high school science fair projects, etc. I think there are some process-type solutions out there, things like Wikipedia and Slashdot or whatever (which have their own biases, but let a zillion flowers bloom, etc.). But that seems like a cop-out to me, since ultimately someone has to read the papers and judge whether it's worth trying to replicate studies, and so forth. Somewhere it's gotta be relevant that a paper has mistakes, right?
Posted by Andrew at 9:37 AM | Comments (11) | TrackBack
March 20, 2008
The "all else equal fallacy," one more time (this time using econ jargon), and also a discussion of the perils of "crossover" arguments
I was sorry to see Steven Levitt repeating the claim about driving a car being good for the environment. I wrote about this last week when it appeared in the other New York Times column of John Tierney, but perhaps it's worth repeating:
These guys are making a classic statistical error, I think, which is to assume that all else is held constant. This is the error that also leads people to misinterpret regression coefficients causally. (See chapters 9 and 10 of our book for discussion of this point.) In this case, the error is to assume that the walker and the driver will be making the same trip. In general, the driver will take longer trips--that's one of the reasons for having a car, that you can easily take longer trips. Anyway, my point is not to get into a long discussion of transportation pricing, just to point out that this seemingly natural calculation is inappropriate because of its mistaken assumption that you can realistically change one predictor, leaving all the others constant.
Unintended consequences of an economist forgetting about unintended consequences
I'm surprised that Levitt didn't notice this, given that the distinction between "exogenous" and "endogenous" variables is such a big deal in economics. In fact, an important contribution that economists often make to public policy debates is to emphasize that you can't simply assume "all else held equal" in an analysis. Indeed, Levitt himself made this point in his column a couple months ago, in discussing unintended consequences. One of the consequences of switching from driving to walking is that you take shorter trips. Maybe this is a good thing, maybe it's a bad thing, but I don't think it makes a lot of sense to say, "Be Green: Drive" without realizing that distance traveled is affected by the choice.
P.S. Levitt buttresses his argument with the statement, "Chris Goodall [the person who made the walking/driving comparison] is no right-wing nut; he is an environmentalist and author of the book How to Live a Low-Carbon Life." How relevant is this? Even a "right-wing nut" could make a good point, right? More to the point, I think we have to be careful about automatically trusting "crossover" arguments. Do we have to believe something, just because it comes from somebody who we wouldn't expect to say it? I worry that this sort of crossover appeal is so appealing that otherwise-skeptical commentators (such as Levitt) forget their usual skepticism.
P.P.S. Yes, I realize that Levitt might just be trying to be amusing and thought-provoking rather than making a claim about public policy. From the standpoint of economics and statistics, though, I think this is really a great opportunity to explain why the "all else equal" assumption can cause problems. A great example for a course in linear regression or econometrics.
Posted by Andrew at 2:32 PM | Comments (8) | TrackBack
March 18, 2008
Free airline vouchers
Chris Paulse sends in this amusing advice which could be used as an example in teaching decision analysis.
Posted by Andrew at 8:48 AM | Comments (0) | TrackBack
March 14, 2008
Valuing "Lives Saved" vs. "Life-Years Saved," leading to a discussion of the flawed concept of "willingness to pay"
Jim Hammitt sends along this interesting report comparing different measures of risk when evaluating public health options:
There is long-standing debate whether to count "lives saved" or "life-years saved" when evaluating policies to reduce mortality risk. Historically, the two approaches have been applied in different domains. Environmental and transportation policies have often been evaluated using lives saved, while life-years saved has been the preferred metric in other areas of public health including medicine, vaccination, and disease screening. . . Describing environmental, health, and safety interventions as "saving lives" or "saving life-years" can be misleading. . . . Reducing the risk of dying now increases the risk of dying later, so these lives are not saved forever but life-years are gained. . . .
We discuss some of these issues in our article on home radon risks. Beyond this, I have two comments on Jim Hammitt's paper:
1. I wish he'd talked about Qalys. I just like the sound of that word. Qaly, qaly, qaly. (It's pronounced "Qualy")
2. He talks briefly about "willingness to pay." I've always thought this can be a misleading concept. Sometimes it's really "ability to pay." Give someone a lot more money and he or she becomes more able to pay for things, including risk reduction. True, this induces more willingness to pay, but to me the ability is the driving factor. I think the key is what comparison is being made. If you're considering one person and comparing several risks, then the question is, what are you willing to pay for. But if you are considering several people with different financial situations, then the more relevant question might be, who is able to pay.
Posted by Andrew at 10:01 PM | Comments (13) | TrackBack
February 27, 2008
Two sides to the IRB story
1. This article by Carl Elliott reminded me why institutional review boards (IRBs) are needed.
2. This site (via Seth) reminds me of why IRBs can be a bad thing.
For me, IRBs are typically a waste of time, nothing more, but for others they are a (potential) protection against health hazards and exploitation, and for others they are a barrier to research progress.
Posted by Andrew at 12:01 AM | Comments (0) | TrackBack
February 23, 2008
Basketball statistics
I don't know anything about basketball (except that the players are shorter than they say they are, and I don't really even know that); nonetheless, in the recent-but-still-grand tradition of blogging . . .
This doesn't stop me from writing about the topic (see previous thoughts on plus-minus statistics, competitive balance, racial bias, and regression models). Anyway, this attracted the notice of Eli Witus, who presumably does know something about the sport. Eli writes,
You might be interested in a recent blog post as it addresses what I think is a flaw in the methodology of Dave Berri's Wins Produced system, which you have discussed before.
I don't have any formal statistical training, so I am learning as I go. Here's another post that you might be interested in. I am very interested in multilevel modeling--I think it could be very useful in basketball since the game is much more interactive than baseball, and player statistics are heavily dependent on the context of the player's teammates and coach. I think multilevel modeling could help answer questions about how a player's statistics are likely to change if he changes teams.
Cool stuff. I agree about the multilevel modeling. And here's a recent post with some pretty graphs.
P.S. To reduce my credibility even further, let me admit that my 12-year-old nephew can regularly beat me at Horse.
Posted by Andrew at 9:14 PM | Comments (3) | TrackBack
February 14, 2008
Discussion of unintended consequences as a battle over defaults
I'll try to clarify my recent entry on unintended consequences by focusing on a less politically-loaded example.
Millions of people in south Asia are exposed to high levels of arsenic in their drinking water. It's a natural contaminant (something to do with the soil chemistry) but it's become an increasingly important problem in the past decades because people have been digging millions of deep (~ 100 feet) tubewells. The background is that the surface water is often contaminated, and international organizations have been encouraging the locals to dig these tubewells which draw clean water from hundreds of feet below ground. Unfortunately, some of that water is contaminated with arsenic. A true unintended consequence. But what to do next?
There are various solutions out there, including a low-cost device for purifying surface water. My connection to this is that I've been involved in a project to give information to people in Bangladesh about where and how deep to dig to find arsenic-free deep water. In some places you have to drill hundreds of feet deep, and this can be expensive (relative to Bangladeshis' incomes). So we're setting up an insurance system for people there, so they can pay a little bit more but be assured of eventually getting a safe well, or their money back. The idea is to provide incentives for well-drillers also, to set up an ongoing system where there is trust and so that safe wells can be installed.
More unintended consequences?
Two concerns about unintended consequences arise. First, on the physical level, there is a concern that, if people build wells taking clean water from deep aquifers, they'll start using that water more and more (just as we in the developed world flush our toilets with fresh water, etc), leading to changes in the water flow that might bring arsenic down there or have other bad consequences. I don't know enough to evaluate this concern so I'm just trusting my colleagues on this.
The second concern is something I mentioned to my collaborators the other day: should we really be offering this insurance scheme at all? The goal of the program is to get people to dig deeper wells than they otherwise would've done, by setting up incentives for customers and well-drillers to get together. (I should explain that this is intended to be a revenue-neutral, "at cost," system: not a subsidy for Bangladeshis to dig wells, but not a moneymaker for us, either. The money would be made by the drillers, and this would provide an incentive for the program to continue.)
Anyway, I asked my collaborators whether maybe we shouldn't be doing this program at all, since we're trying to get people to do something they wouldn't do themselves.
One of my colleagues replied that, no, it was a good idea, and for us not to do it would be "paternalistic" in that we're saying that we know what's best for the locals. We can offer the insurance and they can decide. But, wait! I said. If we really want to be non-paternalistic, we wouldn't get involved at all, right?
Defaults
It seems that these debates come down to the choice of the default. If the default is to do our insurance program, then it's paternalistic to consider not doing it. But if the default is for us to stop messing around in Bangladesh, then it's paternalistic to try to motivate them to dig deep wells. (The unintended consequence of the mid-1990s intervention--encouraging moderately deep tube wells--is cautionary, but it's not clear that this should be a message that we shouldn't get involved.)
Posted by Andrew at 10:27 AM | Comments (7) | TrackBack
Battle of the unintended consequences (or, unintended battle of the consequences?)
Melissa Lafsky, writing in Freakonomics, discusses how biofuels, which have been proposed as an environmentally-friendly alternative energy source, have been estimated to create more pollution than drilling for more oil. And then, of course, climate change is itself a huge unintended consequence of industrialization. I just have a couple of comments.
1. Alex Tabarrok wrote:
The law of unintended consequences is what happens when a simple system tries to regulate a complex system. The political system is simple, it operates with limited information (rational ignorance), short time horizons, low feedback, and poor and misaligned incentives. Society in contrast is a complex, evolving, high-feedback, incentive-driven system. When a simple system tries to regulate a complex system you often get unintended consequences.
I like this description but it doesn't quite fit either of the examples here. To start with, climate change was an unanticipated consequence of industrialization. But industrialization was not designed to regulate the climate (schemes such as cloud-seeding aside). So maybe Alex's paragraph is more of a description of perverse unintended consequences.
To take the other example: Yes, biofuels were proposed to regulate climate change, so the first half of Alex's description works. But the second part isn't quite appropriate, because the unintended consequences were discovered in advance. According to the quoted report, "Prior analyses made an accounting error." So in this case it doesn't sound like a problem in anticipating feedback.
2. This brings me to my second point, which is that the problem seems to have been discovered before the massive shift to biofuels actually happened, so the problem "for the next 93 years" won't really happen. According to the article, "scientists [are] already calling for government reform on biofuel policies." So this is more of an anticipated than an actual unintended consequence.
3. Unintended consequences are interesting, but the law of unintended consequences isn't always so useful in telling us what we should do, since in this case the problem that we're trying to combat is itself an unintended consequence. I don't really know what to do with this. These discussions often seem to give the implicit recommendation to do nothing, but I'm not quite sure what "doing nothing" would mean. Reduce fossil fuel consumption to 18th-century levels? Freeze consumption at exactly the current levels? Invade Brazil so that they can't implement biofuels policies? Any policy, even the default (whatever that is) might have unintended consequences. I think that's the best message to take from these discussions: that all policies should be examined carefully. But we knew that already, right? I'm not trying to pick on the Freakonomics people here; I'm just trying to figure out where this is all going.
Posted by Andrew at 8:00 AM | Comments (4) | TrackBack
February 5, 2008
How to ensure the "expected loss" is low in the real world
I'm doing some work related to Geologic Carbon Sequestration (also called Geologic Capture and Storage): the idea is to capture carbon dioxide from power plants or other industrial sources, and pump it deep underground into geologic formations that will trap it for centuries or millennia. It sounds desperate, and I initially had substantial misgivings, but upon looking into it more I think it is a good idea and we should get started. But there are still some major political, legal, and economic issues that have to be resolved. I'm working on an issue that is on the border between the technical and regulatory spheres:
The State of California will soon be called on to decide whether sequestration should be allowed in some specific places. The government will have to assess the risks and decide whether a particular spot is "safe enough." Any site that is likely to be proposed will be considered by experts to be highly likely to retain almost all of the CO2 that is pumped into it...but of course, the experts could be wrong. It's very hard to characterize the subsurface---there could be faults or old boreholes that you don't know about---so maybe the CO2 could leak out. If it does, bad things can happen.
One bad thing that could happen is that the carbon dioxide could leak into an aquifer and turn the water into something like Perrier. Doesn't sound so bad---could even be worth a lot of money---but actually it could be a big problem. Carbonated water is acidic (carbonic acid), and could dissolve minerals and leach bad stuff like lead or arsenic into the water. That's bad if the aquifer supplies drinking water!
One way to assess a site is to look at the "expected loss" or "expected cost" of using the site. For example, suppose the probability that CO2 will leak into the aquifer is 0.001, and if this does happen then the cost (e.g. the cost of purifying the water so people can still drink it) has a Net Present Value of $100 Million. Then the expected loss from this particular failure mode is $100,000.
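The arithmetic above can be checked with a minimal Python sketch. The function name expected_loss is just an illustrative label, not anything from a library, and the numbers are the made-up ones from the example.

```python
def expected_loss(p_failure, cost_if_failure):
    """Expected loss from a single failure mode: probability times cost."""
    return p_failure * cost_if_failure

# Illustrative numbers from the text: a 0.001 chance of a leak into the
# aquifer, and a cleanup cost with a Net Present Value of $100 million.
loss = expected_loss(0.001, 100e6)
print(f"${loss:,.0f}")  # prints $100,000
```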
Here's the thing: we can estimate the cost, if there is a failure, fairly accurately: you can look at how much it costs to build a filtration plant or to import water from somewhere else, and get it maybe within a factor of 2. But the other part of the equation, the failure probability, is just impossible to answer. Some people will say it's really high, some will say it's really low. There's very little in the way of empirical data, since there are only a few large-scale sequestration sites operating.
If we want to be sure that (probability of failure) x (cost of failure) is low, a common-sense idea is to start with sites where the cost of failure would be low: then it doesn't matter if the probability of failure is a lot higher than we thought. As more sites are used, and are monitored for a few decades, we'll learn more about our ability to predict subsurface CO2 behavior --- we'll never know less than we know right now! --- and then we can relax the constraint on the (cost of failure) term and start looking at the putative product.
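To see why starting with low-cost sites protects against a badly wrong probability estimate, here is a rough Python sketch. Everything in it is hypothetical: the function name, the factor-of-100 misjudgment, and the $1 million "low-cost" site are assumptions made only for illustration.

```python
def worst_case_expected_loss(p_estimate, misjudgment_factor, cost_if_failure):
    """Expected loss if the true failure probability turns out to be
    misjudgment_factor times the estimate (capped at 1)."""
    p_true = min(1.0, p_estimate * misjudgment_factor)
    return p_true * cost_if_failure

p_estimate = 0.001   # hypothetical expert estimate of the failure probability
factor = 100         # suppose the experts are too optimistic by a factor of 100

# Hypothetical sites: failure costs $100 million at one, $1 million at the other.
high_cost_site = worst_case_expected_loss(p_estimate, factor, 100e6)
low_cost_site = worst_case_expected_loss(p_estimate, factor, 1e6)

print(f"high-cost site: ${high_cost_site:,.0f}")
print(f"low-cost site:  ${low_cost_site:,.0f}")
```

Even with the probability off by two orders of magnitude, the expected loss at the low-cost site stays small, which is the whole point of bounding the (cost of failure) term first.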
You're with me so far, right? OK, then here's where I need help: I need some examples of places where this common-sense idea was EXPLICITLY used in creating regulatory policy. I hope someone who reads this blog can help me here.
Posted by Phil at 2:14 PM | Comments (18) | TrackBack
Converting point spreads into odds, or, if only he'd read chapter 1 of Bayesian Data Analysis
According to Andrew Sullivan, a political commentator named Michael Graham wrote,
I am so confident of both a Patriots win today and a Romney win in Massachusetts on Tuesday that I made this pledge on the air Friday: 'If the NY Giants beat the Patriots in the Super Bowl, I will cast my Super Duper Tuesday primary vote for (shudder) John McCain.'
But . . . the Patriots were favored by 14 points, and if you look up "football" in the index of Bayesian Data Analysis, you'll see that football point spreads are accurate to within a standard deviation of 14 points, with the discrepancy being approximately normally distributed. So, a 14-point underdog has something like a 15% chance of winning. It's funny how people don't get this sort of thing.
On the other hand, his pledge is nonenforceable so it's no big deal.
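For anyone who wants to check the "something like 15%" figure, here is a minimal Python sketch. It assumes, as stated above, that the actual margin is approximately normal around the point spread with a standard deviation of 14 points; the function names are just illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def underdog_win_probability(spread, sd=14.0):
    """P(favorite's margin < 0) when margin ~ Normal(spread, sd)."""
    return normal_cdf(-spread / sd)

p = underdog_win_probability(14.0)
print(round(p, 3))  # 0.159
```

A 14-point spread with a 14-point standard deviation puts the upset exactly one standard deviation out, hence roughly a one-in-six chance.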
Posted by Andrew at 2:26 AM | Comments (4) | TrackBack
January 31, 2008
Random restriction as an alternative to random assignment? A mini-seminar from the experts
Robin Hanson suggested here an experimental design in which patients, instead of being randomly assigned to particular treatments, are randomly given restrictions (so that each patient would have only n-1 options to consider, with the one option removed at random). I asked some experts about this design and got the following responses.
Eric Bradlow wrote:
I think "exclusion", more generally, in Marketing has been done in the following ways:
[1] A fractional design -- each person only sees a subset of the choices, items, or attributes of a product (intentionally) on the part of the experimenter. Of course, this is commonly done to reduce complexity of the task while trading off the ability to estimate a full set of interactions. The challenge here, and I wrote a paper about this in JMR in 2006, is that people infer the values of the missing attributes and do not, despite instructions, ignore them. Don Rubin actually wrote an invited discussion on my piece. So, random exclusion on the part of the experimenter is done all of the time.
[2] A second way exclusion is sometimes done is prior to the choice or consumption task, you let the respondent remove "unacceptable" alternatives. There was a paper by Seenu Srinivasan of Stanford on this. In this manner, the respondent eliminates "dominated/would never choose alternatives". This is again done for the purposes of reducing task complexity.
[3] A third set of studies I have seen, and Eric Johnson can comment on the psychology of this much more than I can, is something that Dan Ariely (now of Duke, formerly of MIT) and colleagues have done, which seems closest to this post. In these sets of studies, alternatives are presented and then "start to shrink and/or vanish". What is interesting is that these alternatives that he does this to are not the preferred ones and it has a dramatic effect on people's preferences. I always found these studies fascinating.
[4] A fourth set of related work, of which Eric Johnson has great fame, is a "mouse-lab" like experiment where you allow people to search alternatives until they want to stop. This then becomes a sequential search problem; however, people exclude alternatives when they want to stop.
So, Andy, I agree with your posting that:
(a) Marketing researchers have done some of this.
(b) Depending on who is doing the excluding, one will have to model this as a two-step process, where the first step is a self-selection (observational study like likelihood piece, if one is going to be model-based).
The aforementioned Eric Johnson then wrote:
I think there are at least two important thoughts here:
(1) random inclusion for learning... Decision-making has changed the way we think about preferences: They are discovered (or constructed) not 'read' from a table (thus Eric B.'s point 3).
A related point is that a random option can discover a preference (gee, I never thought I liked ceviche....) so there may be value in adding random options to the respondent. . . . The late Hillel Einhorn wrote about 'making mistakes to learn.'
(2) "New Wave" choice modeling often consists of generating the experimental design on the fly: Adaptive conjoint. By definition, these models use the results from one choice to eliminate a bunch of possible options and focus on those that have the most information. Olivier Toubia at Columbia Marketing is a master of this.
To elaborate on Eric B.'s points:
Consumer Behavior research shows that elimination is a major part of choice for consumers, probably determining much of the variance in what is chosen. Make choice easier, learning harder.
There is an interesting tradeoff for both the individual and larger publics here: You try an option you are likely not to like (treatment which may well not work). If you are surprised, then you (or subsequent patients) benefit for a long time. Since this is an intertemporal choice, people may not experiment enough.
Finally, Dan "Decision Science News" Goldstein added:
I've never seen a firm implement such a design in practice, neither when I worked in industry, nor when I judged "marketing effectiveness" competitions.
My own thoughts are, first, that there are a lot of interesting ideas in experimental design beyond the theory in the textbooks. It would be worth thinking systematically about this (someday). Second, I want to echo Eric Johnson's comment about preferences being constructed, not "read off a table" from some idealized utility function. Utility theory is beautiful but it distresses me that people think it fits reality in an even approximate way.
Posted by Andrew at 12:12 AM | Comments (1) | TrackBack
January 28, 2008
Random restriction as an alternative to random assignment?
Robin Hanson writes,
To make sense of social complexity we would ideally want to add lots of randomization to people's real choices, and then collect lots of data on what happens to them. But this seems a lot to ask of people. For example, people who eat at a restaurant might be willing to tell you how they felt later after eating there, but they'd be reluctant to eat a random item from the menu even one percent of the time.
Would people be more willing to have a few of their options randomly excluded? For example, would people mind much if on a menu of one hundred items one of the items was randomly excluded each time - "sorry we are out of that today"? Data about choices under such reduced menus would still have a key randomization component.
This idea occurred to me while talking to a cancer doctor who thought he could get thousands of cancer patients to agree to release data on their progress, but who would be more reluctant to accept a random treatment. Once standard drugs have failed, there are about twenty alternative drugs a patient could try, which they usually pick based on the side effects etc. Patients probably wouldn't mind much having one of these options taken off the menu.
My thoughts:
I think I'd eat a random item 1% of the time as part of an experiment--after all, 1% of the time would correspond to three lunches per year.
To get to your main proposal: I think if you exclude one item, you'll get a study that is a mix of experiment and observational study, which could probably be analyzed in a way more robustly than purely observational data could be analyzed, but requiring more information than the analysis of a pure experiment.
This sounds like something that marketing researchers might have studied too.
P.S. See here for much more from the marketing researchers.
Posted by Andrew at 12:37 AM | Comments (2) | TrackBack
January 22, 2008
What kind of law is the Law of Unintended Consequences?
Stephen Dubner and Steven Levitt wrote this Freakonomics column, which concludes, "if there is any law more powerful than the ones constructed in a place like Washington, it is the law of unintended consequences." What I'm wondering is, what sort of law is this? Obviously it's not a real "law" like the law of gravity or even one of those social-science laws like Gresham's law or the statement that democracies usually don't fight each other. But it's supposed to be more than just a joke in the manner of Murphy's law, right?
I've remarked previously that unintended consequences often were actually intended, but Dubner and Levitt's examples seem actually unintended. So these seem like real examples, but I don't know what it takes for this to be a "law." Surely there must be dozens of other examples of intended consequences that actually happened? Or unintended consequences which, although unfortunate, were minor compared to the intended consequences? The Freakonomics article was interesting; now I want to hear a statement of the law itself...
P.S. Interesting comments below. Also, Alex Tabarrok has further elaboration:
The law of unintended consequences is what happens when a simple system tries to regulate a complex system. The political system is simple, it operates with limited information (rational ignorance), short time horizons, low feedback, and poor and misaligned incentives. Society in contrast is a complex, evolving, high-feedback, incentive-driven system. When a simple system tries to regulate a complex system you often get unintended consequences.
Posted by Andrew at 2:18 AM | Comments (14) | TrackBack
January 21, 2008
The speed-dating data
Somebody writes,
I am looking for interesting, unusual datasets for a data analysis class I am teaching, and I heard by email from Ray Fisman that you have a sanitized version of the data from his speed dating experiment.
Indeed, the data are here; we use them in a homework assignment in our book. The data were collected by Ray Fisman and Sheena Iyengar, an economist and a psychologist at the business school here, and they summarized their findings in this paper:
We study dating behavior using data from a Speed Dating experiment where we generate random matching of subjects and create random variation in the number of potential partners. Our design allows us to directly observe individual decisions rather than just final matches. Women put greater weight on the intelligence and the race of partner, while men respond more to physical attractiveness. Moreover, men do not value women's intelligence or ambition when it exceeds their own. Also, we find that women exhibit a preference for men who grew up in affluent neighborhoods. Finally, male selectivity is invariant to group size, while female selectivity is strongly increasing in group size.
What I really want to do with these data is what I suggested to Ray and Sheena several years ago when they first told me about the study: a multilevel model that allows preferences to vary by person, not just by sex. Multilevel modeling would definitely be useful here, since you have something like 10 binary observations and 6 parameters to estimate for each person.
I'm hoping that some pair of students analyzes these data as a project in my class this spring. I suspect that we could learn some interesting things. Also, once the model has been fitted successfully once, Ray, Sheena, and others would be able to fit it to other similar datasets easily enough.
Finally, let me thank Ray and Sheena again for making their data available to all.
Posted by Andrew at 12:03 AM | Comments (0) | TrackBack
January 18, 2008
The Irrelevance of "Probability"?
Seth forwarded me this article [link fixed, I hope] from Nassim Taleb:
I [Taleb] spent a long time believing in the centrality of probability in life and advocating that we should express everything in terms of degrees of credence, with unitary probabilities as a special case for total certainties, and null for total implausibility. Critical thinking, knowledge, beliefs, everything needed to be probabilized. Until I came to realize, twelve years ago, that I was wrong in this notion that the calculus of probability could be a guide to life and help society. Indeed, it is only in very rare circumstances that probability (by itself) is a guide to decision making. It is a clumsy academic construction, extremely artificial, and nonobservable. Probability is backed out of decisions; it is not a construct to be handled in a standalone way in real-life decision-making. It has caused harm in many fields. . . . We can easily see that when it comes to small odds, decision making no longer depends on the probability alone. It is the pair probability times payoff (or a series of payoffs), the expectation, that matters. . . .
What causes severe mistakes is that, outside the special cases of casinos and lotteries, you almost never face a single probability with a single (and known) payoff. You may face, say, a 5% probability of an earthquake of magnitude 3 or higher, a 2% probability of one of 4 or higher, etc. The same with wars: you have a risk of different levels of damage, each with a different probability. "What is the probability of war?" is a meaningless question for risk assessment. . . .
The point is mathematically simple but does not register easily. I've enjoyed giving math students the following quiz (to be answered intuitively, on the spot). In a Gaussian world, the probability of exceeding one standard deviation is ~16%. What are the odds of exceeding it under a distribution of fatter tails (with same mean and variance)? The right answer: lower, not higher — the number of deviations drops, but the few that take place matter more. It was entertaining to see that most of the graduate students get it wrong. . . .
Another complication is that just as probability and payoff are inseparable, so one cannot extract another complicated component, utility, from the decision-making equation. . . .
I'd just like to add two points. First, utility doesn't exist either...
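Taleb's quiz in the quoted passage can be checked numerically for one particular fat-tailed choice. The sketch below is only an illustration under an assumption of mine, not anything from his article: it uses a Laplace (double-exponential) distribution scaled to mean 0 and variance 1 as the fatter-tailed alternative. Other fat-tailed distributions would give different numbers.

```python
import math

def normal_tail(x):
    """P(X > x) for a standard normal (mean 0, variance 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def laplace_tail(x):
    """P(X > x), for x >= 0, under a Laplace distribution with mean 0 and
    variance 1. The Laplace variance is 2*b**2, so the scale is b = 1/sqrt(2)."""
    b = 1.0 / math.sqrt(2.0)
    return 0.5 * math.exp(-x / b)

normal_1sd = normal_tail(1.0)
laplace_1sd = laplace_tail(1.0)
print(round(normal_1sd, 3))   # 0.159
print(round(laplace_1sd, 3))  # 0.122

# Far out in the tail the ordering reverses: large deviations are much
# more likely under the fatter-tailed distribution.
normal_4sd = normal_tail(4.0)
laplace_4sd = laplace_tail(4.0)
print(laplace_4sd > normal_4sd)  # True
```

So in this example the chance of exceeding one standard deviation does drop (about 12% versus about 16%), while four-standard-deviation events become far more common.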
Posted by Andrew at 12:05 AM | Comments (18) | TrackBack
January 11, 2008
Limits on prediction markets?
If a prediction market is not liquid enough, it's possible to manipulate it by throwing in small sums of money (thus, for example, a political candidate could boost his price by buying a bunch of shares). Presumably this could be useful, for example if you pump up your market share price, this might induce donors to contribute to the winning cause or could help attract endorsements.
At the other extreme, if the market is too liquid, there's a potential "moral hazard" or motivation to throw an election, to purposely hurt your side in order to make money on the pointspread if you've already placed a large bet in the other direction.
Now here's my question: there's clearly a sense in which a prediction market can be too small (too illiquid) to be trusted, and conversely if it is too large (too liquid) you get problems in the other direction. Is there an intermediate zone in which the market is liquid enough so it can't be easily manipulated, but not so liquid that it motivates point-shaving? Or do the zones of "too illiquid" and "too liquid" actually overlap, so there's no market size that does the job?
I imagine the answer would depend on some external parameters, such as the ease or difficulty of enforcing insider-trading restrictions. Possibly there's some theoretical work in this area. Justin? Robin?
P.S. I'm raising the questions above in all sincerity. This post is not intended to be a devastating argument that shoots down prediction markets; I'd just like to know if these issues have been considered and resolved in some way. A lot of the casual discussions of prediction markets have been of the "they're cool" or "they're silly" variety, but I imagine the researchers in this area have considered ways of assessing the problems arising from the issues noted above.
P.P.S. This paper by Robin Hanson (see comment below) discusses the first of these points, presenting theory and evidence that low-volume markets are hard to manipulate and thus implying that there is an intermediate zone where the markets can work well.
Posted by Andrew at 9:33 AM | Comments (4) | TrackBack
January 8, 2008
Google's prediction markets
Chris Masse sent these links: Using Prediction Markets to Track Information Flows: Evidence from Google, by Cowgill, Wolfers, and Zitzewitz, and a news article by Noam Cohen. Here's the abstract of the Cowgill et al. paper:
In the last 2.5 years, Google has conducted the largest corporate experiment with prediction markets we are aware of. In this paper, we illustrate how markets can be used to study how an organization processes information. We document a number of biases in Google’s markets, most notably an optimistic bias. Newly hired employees are on the optimistic side of these markets, and optimistic biases are significantly more pronounced on days when Google stock is appreciating. We find strong correlations in trading for those who sit within a few feet of one another; social networks and work relationships also play a secondary explanatory role. The results are interesting in light of recent research on the role of optimism in entrepreneurial firms, as well as recent work on the importance of geographical and social proximity in explaining information flows in firms and markets.
I love this sort of thing. In grad school I remember we talked about setting up a "betting board" where people could put up slips of paper with proposed bets, and then you could accept a bet by signing it with your name. We never did anything with it, and the technology is better now... The Cowgill et al. paper is interesting in how they go beyond the usual "prediction markets are cool" story to look into what information is really being used in the market.
P.S. I gotta say, though: Think harder about your tabular presentations! Do you really care that a certain coefficient is estimated at -0.188 with a standard error of 0.072??? It would be great if the younger economists, working on cool projects like this, could take the lead on graphical presentation--which, after all, is all about getting more information out of your analyses.
P.P.S. In his news article, Cohen writes:
A question never addressed in the report is what would seemingly be most interesting to an outsider: Do prediction markets work? Unlike surveys, the markets rely on something, I think the technical term is ... oh, yeah, greed, to get their results.
Ask me who I think will win a baseball game, an election and an Oscar, and I can try to be objective, but I can’t help being influenced by who I would like to see win. (The Yankees, Fred Thompson, Pee-wee Herman; or is it the Yankees, Pee-wee Herman, Fred Thompson?) Put $5 on it, however, and suddenly I am willing to use all the information I have at my disposal to come up with the best answer.
The attribution to "greed" seems naive to me. I'd be interested to hear the comments of Justin Wolfers or Robin Hanson or others who have thought more about these issues. I agree that a $5 bet can (for some people) induce some sincerity, but I wouldn't call that "greed"--unless they're paying New York Times reporters a lot less than I think, $5 seems below the "greed" threshold. Rather, I'd say that the $5 represents some signal that it's appropriate to take it seriously.
Also, not to keep going on about polls and forecasts, but (most) political polls are not set up to ask the question of "who will win" but rather the question of who would you like to see win. The point of the poll is to ask respondents something that they know about and is of general interest--in this case, their views on the issues, which candidate they support, etc. The voters--the general voting population--are the people who determine who wins the election, which is quite a bit different from the "Yankees" and "Pee-Wee Herman" examples given in the news article. (Yes, I know he's just being amusing, but I think there is a serious underlying point, which is that elections are not just something that people predict, they're also something that we jointly decide with our votes.)
Posted by Andrew at 12:21 AM | Comments (4) | TrackBack
January 7, 2008
Luce and Raiffa After Fifty Years
This looks interesting. Don Saari writes,
We would like to call your attention to an IMBS conference that will be held on January 25-27, 2008. The topic is Luce and Raiffa After Fifty Years-What Is Next? It has been 50 years since the Duncan Luce and Howard Raiffa book, Games and Decisions: Introduction and Critical Survey, was first published. Our conference is meant both to honor this book that has had such a powerful impact, and to adopt the spirit of the Luce-Raiffa book by critically examining where game theory is today and where it should be in the future.
I love the Luce and Raiffa book. The funny thing is, it describes various unsolved problems with the implication that, in a few years, all of game theory will be cleaned up. Actually, I think this book represents the high-water mark of the idea of game theory as an all-encompassing tool in social science. Game theory has seen lots of important specific advances since then but its limitations have become clearer too. Here's a website with the conference--unfortunately, only a list of speakers so far, no titles or abstracts, but maybe that will change soon.
Posted by Andrew at 5:01 PM | Comments (0) | TrackBack
December 6, 2007
A spray that will improve your memory
Futurescanner is a website full of forecasts. (I heard about this from an unsolicited email but it looks interesting.) It would be fun (at least for me) to see forecasts about statistical methods. The challenge would be in stating the problems clearly enough that you could unambiguously say when they were solved.
Posted by Andrew at 12:36 AM | Comments (1) | TrackBack
December 3, 2007
Are polar bears endangered? And can this be addressed using decision analysis?
From the Judgment and Decision Making list, I saw this interesting article by Scott Armstrong:
I [Armstrong], along with Kesten Green and Willie Soon, audited the forecasting methods used by the authors of the government's administrative reports to support their strategy to list polar bears as an endangered species. As it turns out, the forecasts were based primarily on judgmental methods. We concluded that the forecasts of polar bear populations were not derived from scientific forecasting procedures. It would be irresponsible to classify polar bears as endangered on the basis of such forecasts.
Bob Clemen replied with some more general questions about how to evaluate forecasting methods:
Scott, after your paper about global warming went around (with the associated offer to bet with Al Gore), I [Clemen] went and read the paper and reflected on what I knew about the IPCC. You are correct in general, as far as I can tell: These things are rarely forecasted in a way that uses the principles of forecasting that the IIF (and you especially) have worked so hard to develop. I do not want to take issue with the principles; I think the development and promulgation of those ideas is a huge contribution toward better forecasting.
The polar bear paper made me think further. I'd like to make three points:
1) Why pick on global warming or polar bears? Isn't it the case that many forecasts, especially those based on complex physical or natural models, are made in a way that would be "unscientific" according to the principles? I suspect that most of our big public-policy decisions are based on unscientific forecasts.
Your papers may get some attention for scientific forecasting, but I wonder if a more productive approach would be to work on specific issues, trying to improve the forecasting methods used in those particular arenas. I really worry that your challenges are liable to alienate precisely those scientists whom you want to do a better job. Why not help them instead of accuse them?
2) A statement on page 4 of the polar bear paper does raise a question. The statement is, "Some reviewers of our research have suggested that the principles do not apply to the physical sciences." I do not want to claim that this is true, but instead to flip it on its head: To what extent have the forecasting principles themselves been developed using studies of long-range forecasts (typically judgmental) that
a) Are based on complex natural models such as climate, air quality, ground water transport, or pharmacokinetic models?
b) Have tried to forecast as far into the future as 30, 50, or 100 years (or more) under conditions of recent radical change in a critical element of the system (like increased CO2 concentration in the atmosphere)?
You may be able to answer these questions, and I would be very interested in the answer. To the extent that the forecasting principles were not based on such studies, then it may be a tough sell to use the principles to argue that climate change and similar forecasts are not valid. Kinda like extrapolating beyond the range of the data.
3) Points 1 and 2 aside, I will be the first (well, maybe second after you) to say that there are both better and worse ways to make expert-based judgmental forecasts. . . . here is a paper by a few folks who used expert climatologists back in the early 1990s to come up with long-range probabilistic judgments of climate change (sorry, not global climate change): DeWispelare, A., Herren, L., & Clemen, R. T. (1995). The use of probability elicitation in the high-level nuclear waste regulation program. International Journal of Forecasting 11, 5-24.
My thoughts:
1. As Armstrong et al. imply in their article, the polar bears here represent the larger issue of government regulation on environmental issues. This is important when considering this as a decision-analytic problem because some of the strategies recommended to save the polar bears would also be intended to mitigate other environmental consequences. Thus, I expect that an analysis looking at polar bears in isolation will underestimate the benefits of action here.
2. Armstrong et al. question the use of scientific consensus as a method for making environmental policy decisions. I'm not really sure what to do here: even if, as they note, forecasters have sometimes been wrong in the past, can we really do better than the consensus? The scientific consensus, with its peer review, would seem to me to be one of the best examples of the so-called wisdom of crowds. On the other hand, policy is everybody's business, and if you disagree with the consensus you should feel free to say so. There's some idea that people could agree on the science and disagree on the policy--thus focusing attention on the value function. But in practice it seems hard to do this: once people disagree on the policy, they go back and fight about the facts. (Consider weapons of mass destruction, IQ and ethnicity, Alger Hiss, the Swift Boat veterans, or even silly things like Noah's ark.)
Posted by Andrew at 3:25 PM | Comments (4) | TrackBack
November 30, 2007
Psychologists, economists, and ideas of rationality
Kaiser and I had the following discussion of rationality, following my earlier discussion of the rationality of voting. I wrote:
Any given behavior can be analyzed by economists either in such a way as to show why it's really rational (even though it doesn't look that way) or really irrational (even if it looks normal enough). I haven't quite figured out the rules for how they decide which way to lean in any given case.
Kaiser then wrote:
As for rational/irrational, I'm confused by the Kahneman work: he's saying irrationality is an anomaly which seems to indicate he thinks people should be rational but then if everyone is "irrational," could it be the theory is wrong in which case we shouldn't call that anomalous?
I replied:
Regarding rationality, my impression is that psychologists, unlike economists and political scientists, don't care so much about "rationality." Psychologists think of rationality as a process--as a way of thinking and making decisions--not as a particular algorithm. In that sense, Kahneman et al. are pointing out that much of our everyday rational thinking has systematic problems. It's no surprise that any particular form of rationality will be imperfect. What's interesting is the ways in which people make mistakes.
Posted by Andrew at 3:44 AM | Comments (5) | TrackBack
November 29, 2007
My new email policy
I cleaned out my inbox again. This time I mean business. I'm gonna read my email every day at 4pm (approximately) and deal with every email immediately, right then. No more of this e-mail-all-day-and-all-night nonsense!
Posted by Andrew at 3:02 AM | Comments (2) | TrackBack
November 28, 2007
Do suicide barriers save lives?
Garrett Glasgow sent along this study on the effectiveness of suicide barriers on bridges:
With support from mental health workers, elected officials, the California Highway Patrol, and the local community, Caltrans has announced their intention to install a suicide prevention barrier on the Cold Spring Bridge by 2010 at a cost of $605,000. During the course of the debate a number of people have claimed that such a barrier would not only deter suicides at the Cold Spring Bridge, but actually prevent suicides and thus save lives. This claim is unfounded. A review of the evidence presented in favor of building the barrier and my own research reveals that there is no evidence that installing a suicide prevention barrier on the Cold Spring Bridge would save lives.
As Garrett writes, "there is a distinction between preventing suicides and preventing suicides at a particular location."
Posted by Andrew at 2:51 AM | Comments (9) | TrackBack
November 6, 2007
Taleb and military officers
A reader writes, regarding my review of The Black Swan,
I think that your view of military officers and the self-selection occurring at Taleb's Las Vegas meeting might be less accurate than Taleb's even though he might have accidentally hit upon truth through self-selecting. As a former infantry officer, I can attest that the military officers that I knew tended to be more inquisitive, intellectually curious, and comfortable with uncertainty (at least professionally) than most of the professionals that I've worked with since that time (in engineering, construction, and consulting). I would argue that (at least in the US) the military makes a concerted effort to get its officers to think about, deal with, and live with risk and uncertainty. I remember my plebe Military Science class at West Point. One of the first things we learned was to think of an officer's role in war as management of chaos. There is always the recognition that the unexpected can happen and much training is meant to create that understanding. Always expect the unexpected (although literally this is probably impossible). From my experience, people outside the military tend to have very mistaken notions of military officers and military leadership in general. I remember that we had a visiting Philosophy professor when I was a senior in college. She was a pacifist from one of the UC campuses (strange to choose to spend a semester at West Point). She noted that her perceptions had changed significantly in terms of her respect for military officers as well as the creativity and intellectual curiosity of cadets (which I think she thought was out of place given the uniformity of our environment and dress).
Granted, however much I might argue that these qualities exist in the general population of military officers at a higher rate than outside the military, West Point is definitely selecting for these qualities to a great extent in its professors, so they would exist at a higher level there.
I do agree with your note that even a poor plan held tentatively might be better than no plan. I think this idea was behind Patton's quote: "A good plan executed with vigor now is better than a perfect plan ten minutes from now." And then, there is the famous quote from Von Moltke: "No plan survives first contact" [with the enemy]. Ian Mitroff makes this point in his approach to crisis management. He seems to understand that you cannot predict the black swans but argues that contingency planning for different categories of crises along with "early warning signals" to tell you when something is amiss will help you prepare for the unexpected crisis. An example (my own) might be that a plan to evacuate New York for one stated reason (say threat of terrorist nuclear attack) might also be helpful if you needed to evacuate NY for some other reason (maybe a tsunami in the Atlantic).
My reply: I have little knowledge of the military, but I suspect that what Taleb observed may have been what you observed, plus the selection bias.
Posted by Andrew at 12:23 AM | Comments (1) | TrackBack
October 28, 2007
Bayes pays: toxicology and decision analysis division
Frederic Bois (winner of the Outstanding Statistical Application Award from the American Statistical Association, among other accomplishments) told me about the following job opportunities for modeling in toxicology, decision analysis, and risk assessment. Frederic is great, and so I assume the job is also. Here's the announcement:
INERIS - Institut National de l'Environnement Industriel et des Risques
Computational toxicology aims at modelling the links between chemical exposure and toxicity for humans or the environment. With the progress of toxicological sciences and a host of other scientific disciplines, a large body of knowledge is now available from fundamental or applicative research. Such knowledge enables detailed qualitative and quantitative descriptions of toxicity. Yet, acquisition of new knowledge on thousands of substances, as prescribed by new regulations, should be prioritised with respect to scientific uncertainty, considerations of animal welfare, consumer protection, and sustainable economic development. In that context, the assessment and reduction of uncertainty is a priority for risk and policy analysis and needs major improvements.
To answer those challenges, INERIS, a public institution, is gathering a fast growing team of statisticians, chemists and toxicologists, to develop state-of-the-art software and research projects. The tools and concepts developed will help improve decision-making on chemical safety, reduce industrial development costs, and promote the use of alternatives to animal testing. Funding will come from both private industries and public funding bodies (National Agency for Research, European Commission…).
INERIS is looking for a
Research Scientist in decision-analysis applied to toxicology
Your overall research theme will be Bayesian decision analysis for the development of optimal testing strategies, in the context of predictive assessment of the relationships between chemical exposure and toxicity. As a successful candidate, you will have a background and research experience in such techniques. A PhD in statistics or econometrics is required. A prior experience in Bayesian techniques applied to decision/risk analysis is needed. Experience with biological processes and chemistry will facilitate your insertion in the team; alternatively you may come from a different perspective but with a strong desire to master a new field.
Research Scientist in statistics applied to risk analysis in toxicology
As a successful candidate, you will have a background and research experience in modelling and statistical inference for biological or clinical data. Familiarity with Bayesian techniques applied to risk analysis will be a plus. A PhD in statistics is required. You will develop statistical inference methods and models to assess the relationships between chemical exposure and toxicity, for a very wide array of substances and effects.
Research Scientist in Chemometrics applied to Toxicology
Your challenge is to develop robust methods for statistical inference on the relationships between chemical structure and toxicity or physicochemical properties. That will require scientific expertise on statistical and physico-chemical modelling of toxicity, and on the management of chemical databases. A PhD in Chemometrics or statistics is required. A prior experience in QSAR research and data mining applied to toxicology will be appreciated.
For all three positions, your responsibilities will include:
- Developing the scientific expertise in your field, producing scientific publications;
- Assuring the general management and scientific advancement of research/expertise programs in partnership with public authorities or industry;
- Assuring funding for scientific research by means of grant proposal submissions to national and international institutions.
All three positions are permanent appointments at INERIS main site (Verneuil-en-Halatte, Picardie, 30 minutes North of Paris by train). Senior and junior candidates may apply, salary will be adjusted to competence and experience. Women and handicapped persons are encouraged to consider this offer.
For general enquiries please contact Dr. Bois (frederic.bois@ineris.fr).
Please send your application to Dr. Bois (frederic.bois@ineris.fr) and Dr. Mombelli (enrico.mombelli@ineris.fr).
INERIS, DRC, Parc Alata BP2, F-60550 Verneuil-en-Halatte (www.ineris.fr)
See also here.
Posted by Andrew at 6:01 PM | Comments (0) | TrackBack
October 25, 2007
Don't say "utility function," say "value function"
Like Dave Krantz, I'm down on the decision-theoretic concept of "utility" because it doesn't really exist.
The utility function doesn't exist
You cannot, in general, measure utility directly, and attempts to derive it based on preferences (based on the von Neumann-Morgenstern theory) won't always work either because:
1. Actual preferences aren't necessarily coherent, meaning that there is no utility function that can produce all these preferences.
2. Preferences themselves don't in general exist until you ask people (or, to be even more rigorous, place them in a decision setting).
So, yeah, utility theory is cool, but I don't see utility as something that's Platonically "out there" in the sense that I can talk about Joe's utility function for money, or whatever.
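To make the first point concrete: if someone states strict pairwise preferences that go in a circle (A over B, B over C, C over A), no assignment of numbers can reproduce them. Here is a minimal sketch, assuming preferences arrive as (better, worse) pairs. The function name and the toy data are hypothetical, and this only checks ordinal consistency; it says nothing about the lottery-based axioms of the full theory.

```python
from collections import defaultdict, deque

def utility_from_preferences(prefs):
    """prefs: iterable of (better, worse) strict pairwise preferences.
    Returns a dict item -> number with u(better) > u(worse) for every pair,
    or None if the preferences contain a cycle (no such function exists)."""
    preferred_over = defaultdict(set)  # worse item -> items preferred to it
    n_worse = defaultdict(int)         # item -> number of items it beats
    items = set()
    for better, worse in prefs:
        items.update((better, worse))
        if better not in preferred_over[worse]:
            preferred_over[worse].add(better)
            n_worse[better] += 1
    # Kahn's topological sort, starting from items that beat nothing
    queue = deque(sorted(x for x in items if n_worse[x] == 0))
    utility = {}
    while queue:
        x = queue.popleft()
        utility[x] = len(utility)      # later in the order = higher value
        for y in sorted(preferred_over[x]):
            n_worse[y] -= 1
            if n_worse[y] == 0:
                queue.append(y)
    return utility if len(utility) == len(items) else None

coherent = [("picasso", "suv"), ("suv", "washer")]
cyclic = [("a", "b"), ("b", "c"), ("c", "a")]
print(utility_from_preferences(coherent))
print(utility_from_preferences(cyclic))
```

The coherent set gets a ranking; the cyclic set returns None, which is the sense in which incoherent preferences have no utility function behind them.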
Call it value, not utility
The above is commonplace (although perhaps not as well known as it should be). But my point here is something different, a point about terminology. I would prefer to follow the lead of some decision analysis books and switch from talking about "utility" to talking about "value." To the extent the utility function has any meaning, it's about preferences, or how you value things. I don't think it's about utility, or how useful things are. (Yes, I understand the idea of utility in social choice theory, where you're talking about what's useful to society in general, but even there I'd say you're really talking about what society values, or what you value for society.)
Just play around with the words for a minute. Instead of "my utility function for money" or "my utility for a washer and a dryer, compared to my utility for two washers or two dryers" (to take a standard example of a nonadditive utility function) or "my utility for a Picasso or for an SUV," try out "my value function for money" or "the value I assign to a washer and a dryer, compared to the value I assign to two washers or two dryers" or "the value I assign to a Picasso or to an SUV." This terminology sounds much better to me.
P.S. See Dave's comments here.
Posted by Andrew at 10:07 PM | Comments (21) | TrackBack
Dave Krantz on utility and value
Dave had these comments on my recent thoughts on utility and value functions:
I [Dave] agree with the negatives about "utility" as a word and as a Platonic function (attached to each individual).
In teaching, I tend to discuss "subjective value." In my decision making course for undergrads I talk about optimization with respect to "objective" values, including physical, biological, and economic indices (e.g., maximum area, maximum sustainable yield, maximum profit), and with respect to subjective value, measured in a variety of ways; then I emphasize that many decision rules do not maximize anything -- because the weighting or even the existence of many goals is context dependent, and because some goals are converted into constraints. Optimization is thus subject to constraint and performed with context-dependent weights.
A standard use for "value function" in behavioral economics derives from Tversky & Kahneman's Prospect Theory; one of the blog contributors complains about that. And the emphasis on choice of words leads another contributor to treat the issue as one of words, rather than concepts and facts, no more important than "degrees of freedom" (which, of course, is a venerable term used relatedly in physics and in statistics).
I don't think there is an easy cure via terminology, though I feel you are on the right track here.
Posted by Andrew at 2:31 PM | Comments (0) | TrackBack
October 22, 2007
The probability coverage demonstration
Kaiser writes,
Been leafing through the "Super Crunchers" book over the weekend. . . . Halfway through it, I am still trying to figure out if "super crunching" means traditional statistics or data mining. It is not without irony that the author seems to equate the two. Regardless, it's still good publicity for our field.
One example that seemed to have caught on [comes] from a book called "Decision Traps" by Russo and Schoemaker (who I think are business consultants). The idea is a catchy one, which is to illustrate the "over-confidence" of decision makers. The trick they used is to ask people to provide interval estimates at 90% confidence to a list of 10 questions such as "What was Martin Luther King Jr's age at death?", and "In what year was Mozart born?". Out of 1000+ respondents, they found that "less than 1 percent of the people gave ranges that included the right answer nine or ten times. Ninety-nine percent of people were overconfident." (pp.112-114 in the book). . . . Have you done anything similar with your students?
As far as I know the original idea came from an example of Alpert and Raiffa. I've had lots of success doing an adaptation of the Alpert and Raiffa demo in class; see Section 13.2.2 of my book with Nolan, Teaching Statistics, or section 4 of this paper.
There's also a more standard confidence coverage demo in Section 8.4 of Teaching Statistics. That one works well in class too.
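One thing worth keeping in mind when reading the "nine or ten out of ten" figure: even a perfectly calibrated respondent would not always hit that mark. Here is a quick sketch of the benchmark, assuming the 10 questions are independent and each 90% interval really does cover with probability 0.9 (both are my simplifying assumptions, and the function name is mine):

```python
from math import comb

def prob_at_least(k, n, p):
    """P(at least k successes in n independent trials, success prob p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Chance that a perfectly calibrated respondent covers the truth
# in 9 or 10 of 10 intervals: roughly 0.736
print(prob_at_least(9, 10, 0.9))
```

So under these assumptions about a quarter of perfectly calibrated people would fall short of nine, and "ninety-nine percent were overconfident" is a slight overstatement as a matter of counting. But fewer than 1 percent reaching nine or ten is so far below 74 percent that the overconfidence story survives easily.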
Posted by Andrew at 10:35 PM | Comments (0) | TrackBack
October 15, 2007
Maintaining competitive balance in basketball: I disagree with Bill James
The great Bill James writes:
In sports, mathematical analysis is old news as applied to baseball, basketball, and football. . . . But it has not yet been applied to leagues. . . . Rather than beginning with the question "How does a team win?" - the query that has been the basis of all sports research to this point - what if we begin by asking "How does a league succeed?"
Take the problem of what we could call NBA "sluggishness." In the regular season, players simply don't seem to be playing hard all the time. . . . The NBA's problem is that the underlying mathematics of the league are screwed up. . . . In the NBA, the element of predetermination is simply too high. Simply stated, the best team wins too often. If the best team always wins, then the sequence of events leading to victory is meaningless. Who fights for the rebound, who sacrifices his body to keep the ball from rolling out of bounds doesn't matter. The greater team is going to come out on top anyway. . . . Everybody knows who's going to win. Why do the players seem to stand around on offense? Why is showboating tolerated? Because it doesn't matter. . . .
So how should the NBA correct this? Lengthen the shot clock. Shorten the games. Move in the 3-point line. Shorten the playoffs.
If you reduce the number of possessions in a game by giving teams more time to hold the ball, you make it more likely that the underdog can win - for the same reason that Bubba Watson is a lot more likely to beat Tiger Woods at golf over three days than he is over four. It's simple math. The longer the contest lasts, the more certain the better team is to win. If the NBA went back to shorter playoff series - for example from best-of-seven games to best-of-three - an upset in that series would become a much more realistic possibility. A three-game series would make the homecourt advantage much more important, which, in turn, would make the regular season games much more important. The importance of each game is inversely related to the frequency with which the best team wins. . . .
I see James's point (and I continue to enjoy his writing style, so memorably and affectionately parodied by Veronica Geng a couple of decades ago), but I disagree with the remedy of adding more randomness. I don't think I really want to see the best team lose a lot. One appeal of a top-level sporting contest is seeing top players perform at their peak. Despite the popular models of the "binomial, p=.55" type, which team is "best" is not generally defined. In baseball, it depends so much on who is pitching; in football, some new plays can make the difference. Not to mention practice, discipline, teamwork, and getting some sleep the night before the game. Ideally (to me), the outcome of a game is unpredictable not because the worse team has a good chance of winning, but because it takes a special effort for a team to be the best. (Even in a deterministic game such as chess, the "best" (according to rankings) player does not always win.)
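For what it's worth, the "simple math" behind the series-length argument is easy to write down under the very "binomial, p=.55" model I just complained about. Here is a minimal sketch, assuming independent games with a fixed per-game win probability and no homecourt effect (all assumptions of the toy model; the function name is mine):

```python
from math import comb

def series_win_prob(p, n_games):
    """Probability that a team with per-game win probability p wins a
    best-of-n_games series (n_games odd), games assumed independent.
    Equivalent to winning a majority if all n_games were played."""
    need = n_games // 2 + 1
    return sum(comb(n_games, k) * p**k * (1 - p)**(n_games - k)
               for k in range(need, n_games + 1))

for n in (1, 3, 5, 7):
    print(n, series_win_prob(0.55, n))
```

The better team's chance rises with the length of the series, which is James's point. Whether a fixed p is a sensible description of a team is a separate question.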
These issues lead into a larger question about scoring systems in games, a paradox of sorts that continues to confuse me: on one hand, you don't want the outcome to be random; on the other hand, you want the team that is behind to have a reasonable chance of catching up. I remember when I was a kid, my dad said that the tennis scoring system (games, set, match) was better than the ping-pong system (first player who gets 21 wins) because in tennis, you can always catch up. On the other hand, in a competitive game of ping-pong, you should never be down 20-0 in the first place. There must be some principles here that can be stated mathematically, but I'm not quite sure how to state them. Perhaps someone has already looked into this.
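One small piece of this can at least be computed: the chance of coming back from a given score in a first-to-21 race. This is only a sketch of the ping-pong side, assuming each point is won independently with a fixed probability and ignoring the win-by-two rule (both simplifications are mine, as is the function name). It is not a theory of scoring systems.

```python
from functools import lru_cache

def race_win_prob(p, a, b, target=21):
    """Probability that a player currently on a points beats an opponent
    on b points in a first-to-target race, winning each point
    independently with probability p. Ignores the win-by-two rule."""
    @lru_cache(maxsize=None)
    def win(x, y):
        if x == target:
            return 1.0
        if y == target:
            return 0.0
        return p * win(x + 1, y) + (1 - p) * win(x, y + 1)
    return win(a, b)

print(race_win_prob(0.5, 0, 0))    # even match, even score
print(race_win_prob(0.5, 10, 15))  # even match, five points behind
print(race_win_prob(0.5, 0, 20))   # even match, down 20-0
```

In tennis the comparable calculation would restart with each new game and set, so a lopsided early score carries over much less, which I think is what my dad was getting at.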
P.S. I feel awkward disagreeing with Bill James, whose writings were one of the reasons I went into statistics. But I'm disagreeing with him about basketball, not baseball, so maybe it's ok.
Posted by Andrew at 9:08 AM | Comments (4) | TrackBack
September 21, 2007
Bridge vs. poker
The great David Owen reviewed a history of bridge (the card game) recently in the New Yorker. Among other things, he noted the decline in popularity of bridge and the rise of poker. But the fall of bridge and rise of poker were not simultaneous. Poker has never really gone away, but its recent ESPN-level popularity postdates bridge's decline by decades. More to the point, I prefer poker to bridge. At my weak-amateur level, I think poker is more of a skill game than bridge is. To put it another way, both poker and bridge have routine elements. But in bridge, the routine elements are crucial and require a lot of focus--play out those cards right, or you lose. In poker, the key routine element is to fold crappy cards (most of the time), and that's easy. This is one reason I find poker to be more fun--I can focus on the important moments. (Yes, I'm sure it's different for good players of either game.)
Posted by Andrew at 7:17 AM | Comments (0) | TrackBack
September 10, 2007
Bayes and risk
Someone writes in with the following question:
I've been studying Information Technology risk for some time now and so your work is of great interest. In IT risk we have several problems that a Bayesian approach would seem to help us address. Namely:
1.) We have only about 10 years of information
2.) The relevancy of that information changes somewhat quickly - sometimes weekly, sometimes monthly (thanks microsoft patch day) so it's difficult to take any empiricist approach.
3.) We have very small sample sizes (the details of a threat action is rarely shared information).
What I'm discovering is that:
1.) Lack of common definition. Risk can be Threat can be Vulnerability can be Hazard, etc... Our standards bodies (the ISO) aren't helping here, just making this problem worse by committee-think.
2.) Most IT security folks (notice I didn't use "IT risk") have an engineering background and therefore a frequentist perspective. As such, they reject the notion that probabilities can be attached to risk.
3.) They love the garbage in-garbage out argument. Similarly, it is commonly argued that "opinions" cannot be useful information.
I believe that the use of Bayes has the ability to significantly improve our profession. I believe that there are very smart people in our profession. What is troubling is the amount of evangelism it is taking to educate even the most intelligent IT Security folks. That said, I have a couple of questions for you if you have the time to consider them.
1.) Taleb rails against the use of Gaussian distributions. Most smart IT security folks have read Taleb, and therefore discount the notion of using them. But didn't Jaynes have a position that Gaussian was actually an appropriate distribution to use when the actual distribution was uncertain?
2.) How do you deal with the frequentists and the tendency to casually dismiss inference because of "garbage-in, garbage-out"? I've pointed out that "fraudulent" use of data to push an agenda is not limited to any particular discipline - probability theory or not. However, the frequentists are still disturbed at the idea of using their experience and then accounting for their (residual?) uncertainty.
3.) We define risk as a value derived by the probable frequency of a loss event, and the probable impact of that event. Are we insane in our attempt to attach probabilities to risk?
My reply:
Garbage-in, garbage-out is a real concern in statistical modeling and decision analysis. I discuss it a bit in this talk and in Chapter 22 of Bayesian Data Analysis. Classical decision theory does not always handle the GIGO problem well.
But I don't see why to single out Bayes! Any statistical method has assumptions. Maximum likelihood, for example, can be much more unstable than Bayes--that's why Bayesian inference is sometimes called "regularization." See here, for example.
Regarding Taleb and the Gaussian distribution, I actually had a discussion with him on this. The t distribution can be interpreted as a scale mixture of Gaussians (that is, a Gaussian distribution where the scale itself varies). I've used the Gaussian distribution a lot (see all the examples in our books) but the t is probably a better general choice.
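The scale-mixture interpretation can be checked by simulation. The sketch below is only an illustration (the function name, the seed, and the choice of 10 degrees of freedom are arbitrary): it draws a random scale from a scaled inverse-chi-square distribution, draws a Gaussian with that scale, and compares the sample variance to the known variance of the t distribution, nu/(nu-2).

```python
import random


def t_draw(nu, rng):
    """One draw from a t distribution with `nu` degrees of freedom, built as a
    Gaussian whose variance is itself random (a scale mixture of Gaussians)."""
    # chi-square with nu degrees of freedom == Gamma(shape=nu/2, scale=2)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    sigma = (nu / chi2) ** 0.5      # random scale for this draw
    return rng.gauss(0.0, sigma)    # Gaussian given that scale


rng = random.Random(0)
nu = 10
n = 100_000
draws = [t_draw(nu, rng) for _ in range(n)]
mean = sum(draws) / n
sample_var = sum((x - mean) ** 2 for x in draws) / n
print("sample variance:             ", sample_var)
print("t variance, nu / (nu - 2):   ", nu / (nu - 2))
```

Small values of nu give occasional very large scales, which is where the heavier tails come from; as nu grows the scale settles near 1 and the draws look Gaussian.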
Finally, I think it makes a lot of sense to attach probabilities to risks. You just have to recognize the models used in creating these probabilities. You should check the fit of the model (by comparing replicated data to observed data) and alter it as necessary. Low probabilities can be estimated by a combination of empirical work and theoretical modeling (for example, here is our paper on estimating the probability of events that have never occurred).
Posted by Andrew at 1:55 PM | Comments (4) | TrackBack
August 17, 2007
Are brilliant scientists less likely to cheat?
In this discussion of Allegra Goodman's novel Intuition, Barry wrote, "brilliant people are at least as capable of being dishonest as ordinary people." The novel is loosely based on some scientific fraud scandals from the 1980s, and one of its central characters, a lab director, is portrayed as brilliant and a master of details, but who makes a mistake by brushing aside evidence of fraud by a postdoc in her lab. One might describe the lab director's behavior as "soft cheating" since, given the context of the novel, she had to have been deluding herself by ignoring the clear evidence of a problem.
Anyway, the question here is: are brilliant scientists at least as likely to cheat? I have no systematic data on this and am not sure how to get this information. One approach would be to randomly sample scientists, index them by some objective measure of "brilliance" (even something like asking their colleagues to rate their brilliance on a 1-10 scale and then taking averages would probably work), then do a thorough audit of their work to look for fraud, and then regress Pr(fraud) on brilliance. This would work if the prevalence of cheating were high enough. Another approach would be to do a case-control study of cheaters and non-cheaters, but the selection issues would seem to be huge here, since you'd be only counting the cheaters who got caught. Data might also be available within colleges on the GPA's and SAT scores of college students who were punished for cheating; we could compare these to the scores of the general population of students. And there might be useful survey data of students, asking questions like "do you cheat" and "what's your SAT" or whatever. I guess there might even be a survey of scientists, but it seems harder to imagine they'd admit to cheating.
Arguments that brilliant scientists are more likely to cheat
Goodman makes the argument (through fictional example) in her book that brilliant scientists are more likely to be successful lab directors, thus under more pressure to keep getting grants (many mouths to feed), thus susceptible to soft cheating, at least. Similarly, the cheating postdoc is described as so smart he never had to work hard in college, again under high expectations and cheating partly to maintain his reputation as the golden boy. On the other side, a more ordinary "worker bee" type will not be expected to come up with a brilliant insight, and so won't be under that pressure to cheat.
Another argument that brilliant scientists are more likely to cheat comes from some of the standard "overcoming bias" ideas, that a brilliant person is more likely to have made daring correct conjectures in the past, then when the person comes up with a new conjecture, he or she is more likely to believe in it and then fake the data. (I'm assuming that scientific cheating of the sort that's interesting is in the lines of twisting the data to support a conclusion that you think is true. If you don't even think the hypothesis is true, there's not much point to faking the evidence, since later scientists will overturn you anyway. The motivation for cheating is that you're sure you're right, and so you overconfidently discard the cases that don't support your case.)
Arguments that brilliant scientists are less likely to cheat
I'm half-convinced by the overconfidence argument above, but overall I suspect that brilliant scientists are more likely to be honest than less-brilliant scientists, at least in their own field of research. I say this partly because science is, to some extent, about communication, and transparency is helpful here. Also, as illustrated (fictionally) in Goodman's book, fraud is often done to cover up unsuccessful research. If you're brilliant, it's likely that your research will be successful: even if you don't achieve your big goals--even brilliant people will, perhaps should, bite off more than they can chew--you should get some productive spinoffs, and the simple cost-benefit analysis suggests that cheating would stand to lose you more than you'd gain.
Conversely, for a more mediocre scientist, cheating may be a roll of the dice, which, if it succeeds, can bring you to a plateau, and if it fails, you won't be that much worse off than before--you don't have such a big potential reputation to lose. And if the stakes are low, the cheating might never be discovered: you get the paper, the job, tenure or whatever, your findings are never replicated, and you move on.
Thinking of honesty as a behavior rather than a character trait
The other thing is that it might make more sense to think of honesty as a behavior rather than a character trait. I'm pretty honest (I think), but that also makes me an unpracticed liar (and, unsurprisingly, a bad liar). So the smart move for me is not to lie--again, more to lose than to gain (in my estimated expected value). But if I worked in a profession where dishonesty--or, to put it more charitably, hiding the truth--was necessary, something involving negotiation or legal maneuvers or whatever, then I'd probably get better at lying and then maybe I'd start doing more of it in other aspects of life.
Science seems to me like an area where lying isn't generally very helpful, so I don't see that the best scientists would be good or practiced liars. The incentives, at least for the very best work, go the other way.
P.S. Thanks to Robin Hanson for encouraging me to present arguments on both sides of the question.
Posted by Andrew at 3:01 PM | Comments (6) | TrackBack
July 16, 2007
Calibration in chess
Daniel Kahneman posted the following on the Judgment and Decision Making site:
Have there been studies of the calibration of expert players in judgments of chess situations -- e.g., probability that white will win?
In terms of the amount and quality of experience and feedback, chess players are at least as privileged as weather forecasters and racetrack bettors -- but they don't have the experience of expressing their judgments in probabilities. I [Kahneman] am guessing that the distinction between a game that is "certainly lost" and "probably lost" is one that very good players can make reliably, but I know of no evidence.

Despite knowing much less about decision making and (likely) less about chess than Kahneman, I have three conjectures:
1. Players would show superadditivity in the sense of overstating their own chances of winning. To put it another way, suppose that both players in a game give you Pr(I win), Pr(I tie), Pr(I lose). Call these W1, W2, W3 (for white) and B1, B2, B3 (for black). My conjecture is that (W1+B1) > (W3+B3)--that is, that the total "I win" probability exceeds the total "I lose" probability. It would be interesting to see this on average and also for individual games and times of the game.
2. Players would show the usual overconfidence in probability statements, for example, events that are stated to happen 90% of the time only happening 75% of the time, and so forth.
3. Aspects of both points above might be explained by the idea that chess players, like the rest of us, tend to make their probability statements about the ideal, rather than the actual, game outcome. For example, suppose you were to do a study to measure probability judgments and find the (generically) expected overconfidence: when players predict a 99% chance of victory, it only happens 90% of the time, or whatever. On those 10% of the times when his or her prediction is wrong, I could imagine he or she explaining it away as some blunder that "wasn't supposed to happen" and so shouldn't count.
Similarly, before the game even starts, each player's probability of winning can be calculated based on who is playing white, who is black, and their ratings (see here), but I would imagine that, before the game begins, each player overestimates his or her own winning probability, thinking "this time I'll play harder" or something similar.
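The rating-based calculation linked above is not reproduced here, so as a stand-in here is the standard Elo expected-score formula. Two caveats: it gives an expected score (win = 1, draw = 1/2, loss = 0), not a pure win probability, and it ignores who has white, so it is only a rough proxy for the calculation described. The ratings in the example are made up.

```python
def elo_expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B
    (a win counts 1, a draw 1/2, a loss 0)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings, for illustration only.
print(elo_expected_score(2000, 2000))
print(elo_expected_score(2200, 2000))
```

A player's stated pre-game probability could then be compared with this number to see whether the "this time I'll play harder" optimism shows up.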
This ties in a bit to the "is vs. should" or "descriptive vs. normative" distinction in decision analysis. I think it would be natural to assess the chances of winning in the well-fought game of the player's imagination rather than in the calibrated empirical world of all realistic possibilities.
Anyway, it would be fun to see the data. And I'm probably being overconfident about my own conjectures above.
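If someone did collect such judgments, conjectures 1 and 2 could be checked with something like the following. This is a minimal sketch: the function names are made up, and the numbers are invented purely to show the mechanics, not real data.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """forecasts: iterable of (stated_probability, happened) pairs.
    Returns {stated_probability: (observed_frequency, count)}."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for stated, happened in forecasts:
        counts[stated] += 1
        hits[stated] += int(happened)
    return {s: (hits[s] / counts[s], counts[s]) for s in sorted(counts)}


def win_minus_lose(white, black):
    """white, black: each player's own (Pr(I win), Pr(I tie), Pr(I lose)).
    Positive values are what conjecture 1 predicts: (W1 + B1) > (W3 + B3)."""
    return (white[0] + black[0]) - (white[2] + black[2])


# Made-up numbers, purely for illustration (not real data).
forecasts = [(0.9, True)] * 3 + [(0.9, False)] * 1 + [(0.6, True), (0.6, False)]
print(calibration_table(forecasts))
print(win_minus_lose((0.5, 0.3, 0.2), (0.4, 0.3, 0.3)))
```

In the invented data, events stated at 90% happen 75% of the time, which is the pattern conjecture 2 describes; with real data you would want many judgments per probability level before reading anything into the table.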
Update from the comments
In the comments, Smiley and James suggest that chess players evaluate the position rather than the players. This would lead to classical overconfidence (bias #2 above) because evaluation of "the position" would tend to imply near-optimal play and would discount the possibility of blunders or simply of aspects of the position that are not noticed. It would also lead to superadditivity from some version of the endowment effect (overvaluing my position because it's mine). Koray points out that chess programs perform these evaluations automatically so maybe these could be compared to players' personal evaluations.
And Lemmus points out that you could have observers other than the players make the probability evaluations also--some observers who are watching the games, others who know something about the players but aren't watching live, and others who only see the position (and possibly how the players got there).
Posted by Andrew at 6:30 AM | Comments (8) | TrackBack
July 13, 2007
Goals and plans in decision making
For years, Dave Krantz has been telling me about his goal-based model of decision analysis. It's always made much more sense to me than the usual framework of decision trees and utility theory (which, I agree with Dave, is not salvaged by bandaids such as nonlinear utilities and prospect theory). But, much as I love Dave's theory, or proto-theory, I always get confused when I try to explain it to others (or to myself): "it's, uh, something about defining decisions based on goals, rather than starting with the decision options, uh, ...." So I was thrilled to find that Dave and Howard Kunreuther just published an article describing the theory. Here's the abstract:
We propose a constructed-choice model for general decision making. The model departs from utility theory and prospect theory in its treatment of multiple goals and it suggests several different ways in which context can affect choice.
It is particularly instructive to apply this model to protective decisions, which are often puzzling. Among other anomalies, people insure against non-catastrophic events, underinsure against catastrophic risks, and allow extraneous factors to influence insurance purchases and other protective decisions. Neither expected-utility theory nor prospect theory can explain these anomalies satisfactorily. To apply this model to the above anomalies, we consider many different insurance-related goals, organized in a taxonomy, and we consider the effects of context on goals, resources, plans and decision rules.
The paper concludes by suggesting some prescriptions for improving individual decision making with respect to protective measures.
Going to their paper, Table 1 shows the classical decision-analysis framework, and Table 2 shows the new model, which I agree is better. I want to try to apply it to our problem of digging low-arsenic wells for drinking water in Bangladesh.
Is vs. should
I have a couple of qualms about Dave's approach, though, which involve distinguishing between descriptive and normative concerns. This comes up in all models of decision making: on one hand, you can't tell people what to do (at best, you can point out inconsistencies in their decisions or preferences), but on the other hand these theories are supposed to provide guidance, not just descriptions of our flawed processes.
Anyway, I'm not so thrilled with goals such as in Krantz and Kunreuther's Table 5, of "avoid regretting a modest loss." The whole business of including "regret" in a decision model has always seemed to me to be too clever by half. Especially given all the recent research on the difficulties of anticipating future regret. I'd rather focus on more stably-measurable outcomes.
Also, Figure 4 is a bit scary to me. All those words in different sizes! It looks like one of those "outsider art" things:

In all seriousness, though, I think this paper is great. The only model of decision making I've seen that has the potential to make sense.
Need a better name
But I wish they wouldn't call their model "Aristotelian." As a former physics student, I don't have much respect for Aristotle, who seems to have gotten just about everything wrong. Can't they come up with a Galilean model?
Posted by Andrew at 6:50 AM | Comments (5) | TrackBack
July 7, 2007
Wisdom of crowds
Bernard Guerrero asks what I think of this. My response is that people can be irrational all the time--let's face it, we're a bunch of animals. Voters can have incoherent preferences (e.g., more services but less taxes), consumers can make mistakes (buying that brand-name $40,000 car and then being upset that they have no money left), forecasters can make mistakes (even setting aside "moral hazard" settings, there are lots of notorious problems such as people attaching insufficient probability to the "all else" category).
ARIMA models etc. can be overrated--lots of people seem to think these are the only models out there. Cavan Reilly has a fun example--chapter 27 in my book with Meng--of a 6-parameter predator-prey model that way outperforms standard time series models (with 11 or more parameters) in forecasting the famous Canadian lynx series. So I'm not surprised that ARIMAs can be beaten.
I agree with Bernard that you'd want to know where the survey forecasts come from. The surveys themselves are of forecasts. (This is different than the familiar use of surveys of forthcoming elections, where people are asked whom they would vote for if the election were held today. The Ang et al. paper is using surveys where people are explicitly asked to forecast.) It does sound like a classic "wisdom of crowds" averaging.
P.S. Two of the authors are at Columbia. I haven't met them. Perhaps they can speak in our quantitative social science seminar in the fall.
Posted by Andrew at 10:14 PM | Comments (2) | TrackBack
July 5, 2007
Judgment, decision making, and marketing
Dan Goldstein writes, "Marketing is JDM [judgment and decision making] with teeth." Wouldn't that be dentistry?
Posted by Andrew at 6:47 AM | Comments (1) | TrackBack
July 2, 2007
Multidimensionality and the backseat driver principle
More thoughts on the backseat driver principle from Ubs:
No doubt you've heard the factoid that, when asked if they are a better-than-average driver, x% of respondents will claim that they are, where x is something much greater than 50. Assuming this is true, and not just a made-up urban-myth statistic, the irony is that many of them (x-50%, at least) must be wrong.
My [Ubs's] epiphany is that maybe they're not wrong. In the supposed poll, the definition of "good driver" is never specified.
A certain person who is near and dear to me (but shall remain nameless) is an aggressive driver. She goes fast, changes lanes a lot, darts in front of other cars, frequently talks on the phone while driving, etc. I, on the other hand, tend to drive slowly, carefully watch the other cars around me, and often defer to other drivers even if they're being unreasonable.
If you were to ask this person, I have no doubt that she would say she is a better driver than me. She would say so based on the fact that her ability-related driving skills are superior to mine. And they are. She really is better than I am at maneuvering a car through traffic; she has a much greater capacity to multitask while driving; and her reaction time, though not as fast as she thinks it is, is surely faster than mine. If ever we were on some goofy reality TV show where contestants are asked to race through an obstacle course while talking on the phone, eating a burrito, and listening to loud music, she would be our team's favorite for that contest.
You can see where this is going. If you were to ask me, I would say that I'm the better driver, based on the fact that I am safer and more careful. I may be less able to multitask while driving, so I simply don't. I don't talk on the phone while driving, I don't tailgate, I rarely exceed the speed limit, etc. Any insurance company would consider me less likely to get in an accident.
Which of these skill sets represents the better driver? Either definition is plausible. It's hardly surprising that more than 50% of drivers choose to value the skill which they are better at.
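Ubs's argument can be illustrated with a toy simulation. The setup is entirely an assumption for the sake of the sketch: each driver has two independent, normally distributed skills (call them maneuvering and caution), defines "good driving" by whichever skill he or she is better at, and claims to be better than the median driver on that skill.

```python
import random
import statistics


def share_claiming_above_median(n_drivers=10_000, seed=0):
    """Each driver has two independent skills (say, maneuvering and caution).
    Each driver judges "good driving" by whichever skill they are better at,
    and claims to be above the median driver on that chosen skill."""
    rng = random.Random(seed)
    skills = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_drivers)]
    medians = (
        statistics.median(s[0] for s in skills),
        statistics.median(s[1] for s in skills),
    )
    claims = 0
    for a, b in skills:
        dim = 0 if a >= b else 1          # the skill this driver values
        if (a, b)[dim] > medians[dim]:    # above the median on that skill
            claims += 1
    return claims / n_drivers


print(share_claiming_above_median())
```

Under these assumptions about 75% of drivers are above the median on their own preferred skill (a driver fails only by being below the median on both, which happens with probability 1/4), and every one of them is correct by his or her own definition.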
and then my reply:
I agree--it's something I've thought of for awhile. Political scientists are particularly aware of these multidimensionality issues because they arise in voting. There are some dimensions in which people can agree on how to evaluate different political parties (for example, nearly everyone favors economic growth, nearly everyone was unhappy with Jimmy Carter over Iran and with George Bush over Katrina), but there are other dimensions in which lots of people disagree.
The other thing this reminds me of is, about 20 years ago, a friend recommended to me a "pop evolutionary psychology" book (I think you know what I mean by this category: one of these books that explains much of human behavior based on what was adaptive 100,000 years ago; this must have been one of the early books of this type) that he really liked.
I flipped open the book and naturally turned to the chapter about relationships and sex. The author stated that every one of us is on a 1-10 scale of desirability (a mix of attractiveness, health, $, and whatever other good features you might want in a partner), and that through a natural bargaining process, the 10's end up with 10's, the 9's end up with 9's, etc. He said that, after he came up with this theory, a friend of his said it was very helpful to him because he realized he was an 8 who was searching for 10's and being chased by 6's.
As you can probably predict given what I've written so far, I was gobsmacked by this guy's assumption of unidimensionality, the scale leading directly and unambiguously from Brad Pitt to the Unabomber (and where his friend just happens to be an 8. Yeah, right).
Posted by Andrew at 1:24 AM | Comments (3) | TrackBack
Fair candy pricing
Anders Sandberg writes here about how the finish-the-plate bias can lead people to overeat, simply because food comes in larger packages (which, in turn, presumably arises because food is so cheap to produce). Anyway, this reminds me of an insight I had several years ago which I used to tell the students in my decision analysis classes. They were always skeptical but maybe now with research behind me on this, they'll believe me.
Anyway, here goes:
When I was younger, people used to complain about candy bars getting smaller and smaller. (For example, Stephen Jay Gould has a graph in one of his books showing the size of the standard Hershey bar declining from 2 ounces in 1965 gradually down to 1.2 ounces in 1980, and for that matter I can recall tunafish cans gradually declining from 8 ounces to 6 ounces.) And I remember going to the candy machine with my quarter and picking out the candy bar that was heaviest--I don't remember which one--even if it wasn't my favorite flavor, to get the most value for the money.
But now I realize that, rationally, candymakers should charge more for smaller candy bars. The joy from eating the candy is basically discrete--I'll get essentially no more joy from a 1.7-ounce bar than from a 1.4-ounce bar. But the larger bar will be worse for my health (no big deal if I eat just one, but with some cumulative effect if I eat one every day, similarly with the sodas and so forth). And, given the well-known fact that nobody can eat just part of a candy bar, I get more net utility from the small bar, thus they should charge more.
Posted by Andrew at 12:29 AM | Comments (6) | TrackBack
June 22, 2007
The backseat driver principle
The driver overestimates his control over the situation (including his own car as well as others on the road). The backseat driver ("Whoa--you're taking that curve too fast!") underestimates the driver's control. As a driver, I listen to the passengers because they provide a useful corrective. Even if the backseat driver is sometimes annoying, it makes sense to listen.
More generally: I'll take anybody's advice seriously.
Posted by Andrew at 8:42 AM | Comments (5) | TrackBack
June 12, 2007
One reason why plans are good
One of the small puzzles of decision analysis is that:
(a) Plans have lots of problems--things commonly don't go according to plan, plans notoriously exclude key possibilities that the planner didn't think of, plans can encourage tunnel vision, etc. But . . .
(b) Plans are helpful. In fact, it's hard to do much of anything useful without a plan. (I'm sure people will come up with counterexamples here, but certainly in my own work and life, not much happens if I don't plan it. Serendipitous encounters are fine but don't add up to much.)
Beyond this, one could add that economic activity seems to work well with minimal planning (just enough structure and rules to set up "the marketplace") but individual actors plan, and need to plan, all the time.
This puzzle is particularly interesting to me as I do work in applied decision analysis.
So what's the solution to the puzzle?
I don't really have a solution, but in talking with Dave Krantz yesterday I thought of one advantage of plans, even bad plans. Suppose you have a particular goal and are setting up a plan, considering two decision options, A or B. According to the plan, decision A will work by first implementing step 1, then step 2. Decision B will work by first implementing step x, then step y. Graphically:
option A --> 1 --> 2 --> Goal
option B --> x --> y --> Goal
This plan may have problems, but it clearly sets up the roles of 1,2,x,y. Without the plan, it could be easy to hold both A and B in your mind simultaneously, blurring the distinction. In particular, it could be easy to vaguely imagine that you could do step 1, then step y.
To summarize: one advantage of a plan is it enforces a certain logical consistency and can clarify the relations between intermediate steps.
P.S. Lots of interesting comments here
Posted by Andrew at 8:06 PM | Comments (3) | TrackBack
May 21, 2007
How to Win at Rock-Paper-Scissors
Ubs pointed me to this entry at Mental Floss linking to this article by Graham Walker, who's described as "a co-author of the Official Rock Paper Scissors Strategy Guide (published by Simon and Schuster) and five-time organizer of the World Rock Paper Scissors Championships." Hey, I wanted to organize a RPS tournament one winter in college but everybody thought it was a silly idea. Credit goes to those who put in the effort.
Anyway, here are Walker's suggestions. I thought it was just going to be a joke--"rock always wins" and all that--but they actually look pretty good to me. The comments at the end of the article are interesting too.
The secret to winning at RPS
Basically, there are two ways to win at RPS. First is to take one throw away from your opponent's options. ie - If you can get your opponent to not play rock, then you can safely go with scissors as it will win against paper and stalemate against itself. Seems impossible right? Not if you know the subtle ways you can manipulate someone. The art is to not let them know you are eliminating one of their options. The second way is to force your opponent into making a predictable move. Obviously, the key is that it has to be done without them realizing that you are manipulating them.
Most of the following techniques use variations on these basic principles. How well it works for you depends upon how well you can subtly manipulate your opponent without them figuring out what you are doing. So, now that the background is out of the way, let's get into these techniques:
1 - Rock is for Rookies
In RPS circles a common mantra is "Rock is for Rookies" because males have a tendency to lead with Rock on their opening throw. It has a lot to do with the idea that Rock is perceived as "strong" and "forceful", so guys tend to fall back on it. Use this knowledge to take an easy first win by playing Paper. This tactic is best done in pedestrian matches against someone who doesn't play that much and generally won't work in tournament play.
2 - Scissors on First
The second step in the 'Rock is for Rookies' line of thinking is to play scissors as your opening move against a more experienced player. Since you know they won't come out with rock (since it is too obvious), scissors is your obvious safe move to win against paper or stalemate to itself.
3 - The Double Run
When playing with someone who is not experienced at RPS, look out for double runs or in other words, the same throw twice. When this happens you can safely eliminate that throw and guarantee yourself at worst a stalemate in the next game. So, when you see a two-Scissor run, you know their next move will be Rock or Paper, so Paper is your best move. Why does this work? People hate being predictable and the perceived hallmark of predictability is to come out with the same throw three times in a row.
4 - Telegraph Your Throw
Tell your opponent what you are going to throw and then actually throw what you said. Why? As long as you are not playing someone who actually thinks you are bold enough to telegraph your throw and then actually deliver it, you can eliminate the throw that beats the throw you are telegraphing. So, if you announce rock, your opponent won't play paper which means coming out with that scissors will give you at worst a stalemate and at best the win.
5 - Step Ahead Thinking
Don't know what to do for your next throw? Try playing the throw that would have lost to your opponent's last throw. Sounds weird but it works more often than not. Why? Inexperienced (or flustered) players will often subconsciously deliver the throw that beats their last one. Therefore, if your opponent played paper, they will very often play Scissors, so you go Rock. This is a good tactic in a stalemate situation or when your opponent lost their last game. It is not as successful after a player has won the last game as they are generally in a more confident state of mind which causes them to be more active in choosing their next throw.
6 - Suggest A Throw
When playing against someone who asks you to remind them about the rules, take the opportunity to subtly "suggest a throw" as you explain to them by physically showing them the throw you want them to play. ie "Paper beats Rock, Rock beats scissors (show scissors), Scissors (show scissors again) beats paper." Believe it or not, when people are not paying attention their subconscious mind will often accept your "suggestion". A very similar technique is used by magicians to get someone to take a specific card from the deck.
7 - When All Else Fails Go With Paper
Haven't a clue what to throw next? Then go with Paper. Why? Statistically, in competition play, it has been observed that scissors is thrown the least often. Specifically, it gets delivered 29.6% of the time, so it slightly under-indexes against the expected average of 33.33% by 3.73%. Obviously, knowing this only gives you a slight advantage, but in a situation where you just don't know what to do, even a slight edge is better than none at all.
8 - The Rounder's Ploy
This technique falls into more of a 'cheating' category, but if you have no honour and can live with yourself the next day, you can use it to get an edge. The way it works is when you suggest a game with someone, make no mention of the number of rounds you are going to play. Play the first match and if you win, take it as a win. If you lose, without missing a beat start playing the 'next' round on the assumption that it was a best 2 out of 3. No doubt you will hear protests from your opponent but stay firm and remind them that 'no one plays best of one for a kind of decision that you two are making'. No, this devious technique won't guarantee you the win, but it will give you a chance to battle back to even and start again.
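Suggestion 7 is the one that can be checked with a quick expected-value calculation. The 29.6% figure for scissors is from the quoted article; the article does not say how the remaining 70.4% splits between rock and paper, so the even split below is my assumption, made only for this sketch. Payoffs are +1 for a win, 0 for a draw, -1 for a loss.

```python
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def expected_payoff(my_throw, opp_probs):
    """Expected result of one throw (+1 win, 0 draw, -1 loss) against an
    opponent who throws according to `opp_probs`."""
    wins = opp_probs[BEATS[my_throw]]
    losses = sum(p for throw, p in opp_probs.items() if BEATS[throw] == my_throw)
    return wins - losses


# Scissors at 29.6% is from the quoted article; splitting the remaining 70.4%
# evenly between rock and paper is an assumption made only for this sketch.
opp = {"rock": 0.352, "paper": 0.352, "scissors": 0.296}
for throw in BEATS:
    print(throw, round(expected_payoff(throw, opp), 3))

best = max(BEATS, key=lambda t: expected_payoff(t, opp))
print("best response:", best)
```

Under that assumed split, paper comes out ahead by about 0.056 per throw, scissors breaks even, and rock loses by the same 0.056. More generally, paper's edge is Pr(opponent throws rock) minus 0.296, so the advice holds whenever opponents throw rock more often than scissors. Against a uniformly random opponent every throw has expected payoff zero.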
Walker (not "Thrower," huh?) introduces these suggestions with an excellent perspective on game theory:
Contrary to what you might think RPS is not simply a game of luck or chance. While it is true that from a mathematical perspective the 'optimum' strategy is to play randomly, it still is not a winning strategy for two reasons. First, 'optimum' in this case means you should win, lose and draw an equal number of times (hardly a winning strategy over the long term). Second, Humans, try as they might, are terrible at trying to be random, in fact often humans in trying to approximate randomness become quite predictable. So knowing that there is always something motivating your opponent's actions, there are a couple of tricks and techniques that you can use to tip the balance in your favour.
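To make the game-theory point concrete, here is a minimal sketch in R. It takes the 29.6% frequency for scissors from tip 7; the split of the remaining 70.4% between rock and paper is an assumption I made up for illustration, as is the name expected.payoff. Under those assumptions, a random player breaks even on average, while always throwing paper picks up a small positive edge.

```r
# Payoff from my point of view: +1 for a win, 0 for a tie, -1 for a loss.
throws <- c("rock", "paper", "scissors")
payoff <- matrix(c( 0, -1,  1,
                    1,  0, -1,
                   -1,  1,  0),
                 nrow=3, byrow=TRUE, dimnames=list(me=throws, opponent=throws))

# Opponent frequencies: scissors = 0.296 comes from the quoted tip;
# rock and paper are ASSUMED values that split the remaining 0.704.
p.opponent <- c(rock=0.354, paper=0.350, scissors=0.296)

# Expected payoff of always playing each throw against that mix.
expected.payoff <- drop(payoff %*% p.opponent)
print(round(expected.payoff, 3))

# A uniformly random player averages the three throws and nets zero.
random.payoff <- sum(rep(1/3, 3) * expected.payoff)
print(random.payoff)
```

The edge for paper is only a few percentage points per throw, and it depends entirely on the assumed rock and paper frequencies, so treat it as an illustration of "a slight edge is better than none" rather than as an estimate.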
Something about that British spelling makes it all seem so much more sophisticated. . . . One other thing: the Mental Floss entry has some commenters who mention the Dynamite throw. I remember playing best-of-5 as a kid with Dynamite (fist clenched with index finger out) as an option. Dynamite beat all the other options (except it tied with Dynamite, of course) but you could only throw it once during the best-of-5 game.
Finally, as I'm sure many people have pointed out, some aspects of RPS are mirrored in the interaction between baseball pitcher and hitter, since the batter has to pretty much guess at the pitch before it's thrown (with some room for adjustment).
Posted by Andrew at 7:25 AM | Comments (5) | TrackBack
May 4, 2007
Effectiveness of geriatric specialists, leading to a brief discussion of the "separate accounts" fallacy in decision making and a comparison of the climates of Baltimore and St. Paul
I'd like to move from basketball to something more important: geriatric care, a topic I was reminded of after reading this interesting article by Atul Gawande.
The article starts with some general discussion of the science of human aging, then moves to consider options for clinical treatment. Gawande learns a lot from observing a gerontologist's half-hour meeting with a patient. He tells a great story (too long to make sense to repeat here), although I suspect he was choosing the best out of the many patients he observed. He notes:
In the story of Jean Gavrilles and her geriatrician, there’s a lesson about frailty. Decline remains our fate; death will come. But, until that last backup system inside each of us fails, decline can occur in two ways. One is early and precipitately, with an old age of enfeeblement and dependence, sustained primarily by nursing homes and hospitals. The other way is more gradual, preserving, for as long as possible, your ability to control your own life.
Good medical care can influence which direction a person’s old age will take. Most of us in medicine, however, don’t know how to think about decline. We’re good at addressing specific, individual problems: colon cancer, high blood pressure, arthritic knees. Give us a disease, and we can do something about it. But give us an elderly woman with colon cancer, high blood pressure, arthritic knees, and various other ailments besides—an elderly woman at risk of losing the life she enjoys—and we are not sure what to do.
Gawande continues with a summary of this study:
Several years ago, researchers in St. Paul, Minnesota, identified five hundred and sixty-eight men and women over the age of seventy who were living independently but were at high risk of becoming disabled because of chronic health problems, recent illness, or cognitive changes. With their permission, the researchers randomly assigned half of them to see a team of geriatric specialists. The others were asked to see their usual physician, who was notified of their high-risk status. Within eighteen months, ten per cent of the patients in both groups had died. But the patients who had seen a geriatrics team were a third less likely to become disabled and half as likely to develop depression. They were forty per cent less likely to require home health services.
Little of what the geriatricians had done was high-tech medicine: they didn’t do lung biopsies or back surgery or PET scans. Instead, they simplified medications. They saw that arthritis was controlled. They made sure toenails were trimmed and meals were square. They looked for worrisome signs of isolation and had a social worker check that the patient’s home was safe.
But now comes the kicker:
How do we reward this kind of work? Chad Boult, who was the lead investigator of the St. Paul study and a geriatrician at the University of Minnesota, can tell you. A few months after he published his study, demonstrating how much better people’s lives were with specialized geriatric care, the university closed the division of geriatrics.
“The university said that it simply could not sustain the financial losses,” Boult said from Baltimore, where he is now a professor at the Johns Hopkins Bloomberg School of Public Health.
One of the problems comes from the "separate accounts" fallacy in decision making:
On average, in Boult’s study, the geriatric services cost the hospital $1,350 more per person than the savings they produced, and Medicare, the insurer for the elderly, does not cover that cost. It’s a strange double standard. No one insists that a twenty-five-thousand-dollar pacemaker or a coronary-artery stent save money for insurers. It just has to maybe do people some good. Meanwhile, the twenty-plus members of the proven geriatrics team at the University of Minnesota had to find new jobs. Scores of medical centers across the country have shrunk or closed their geriatrics units. Several of Boult’s colleagues no longer advertise their geriatric training for fear that they’ll get too many elderly patients. “Economically, it has become too difficult,” Boult said.
But the finances are only a symptom of a deeper reality: people have not insisted on a change in priorities. We all like new medical gizmos and demand that policymakers make sure they are paid for. They feed our hope that the troubles of the body can be fixed for good. But geriatricians? Who clamors for geriatricians? What geriatricians do—bolster our resilience in old age, our capacity to weather what comes—is both difficult and unappealingly limited. It requires attention to the body and its alterations. It requires vigilance over nutrition, medications, and living situations.
On the plus side, Baltimore has much better weather than St. Paul.
From the article by Boult et al. (you might notice a shift in style from the New Yorker to the Journal of the American Geriatrics Society):
PARTICIPANTS: A population-based sample of community-dwelling Medicare beneficiaries age 70 and older who were at high risk for hospital admission in the future (N = 568).
INTERVENTION: Comprehensive assessment followed by interdisciplinary primary care.
MEASUREMENTS: Functional ability, restricted activity days, bed disability days, depressive symptoms, mortality, Medicare payments, and use of health services. Interviewers were blinded to participants' group status.
RESULTS: Intention-to-treat analysis showed that the experimental participants were significantly less likely than the controls to lose functional ability (adjusted odds ratio (aOR) = 0.67, 95% confidence interval (CI) = 0.47–0.99), to experience increased health-related restrictions in their daily activities (aOR = 0.60, 95% CI = 0.37–0.96), to have possible depression (aOR = 0.44, 95% CI = 0.20–0.94), or to use home healthcare services (aOR = 0.60, 95% CI = 0.37–0.92) during the 12 to 18 months after randomization. Mortality, use of most health services, and total Medicare payments did not differ significantly between the two groups. The intervention cost $1,350 per person.
CONCLUSION: Targeted outpatient GEM slows functional decline.
P.S. Dennis Miller alert: Since I'm mentioning the New Yorker, I'll have to link to this again.
Posted by Andrew at 12:15 AM | Comments (4) | TrackBack
April 25, 2007
More Black Swan
Here are Taleb's comments on my comments on his book.
Posted by Andrew at 12:46 AM | Comments (0) | TrackBack
April 20, 2007
The norm of self-interest
Aleks's comments here, in particular the bit about selfishness, remind me of one of my favorite papers, "The norm of self-interest" by the psychologist Dale Miller. Here's the abstract:
The self-interest motive is singularly powerful according to many of the most influential theories of human behavior and the layperson alike. In the present article the author examines the role the assumption of self-interest plays in its own confirmation. It is proposed that a norm exists in Western cultures that specifies self-interest both is and ought to be a powerful determinant of behavior. This norm influences people's actions and opinions as well as the accounts they give for their actions and opinions. In particular, it leads people to act and speak as though they care more about their material self-interest than they do. Consequences of misinterpreting the "fact" of self-interest are discussed.
(Related work by Noah Kaplan, Aaron Edlin, and myself here, distinguishing rationality from selfishness as motivations for voting.)
Posted by Andrew at 12:57 AM | Comments (7) | TrackBack
April 18, 2007
We Don't Quite Know What We are Talking About When We Talk About Volatility
Following up (sort of) on my comments on The Black Swan . . .
Dan Goldstein and Nassim Taleb write in their paper: "Finance professionals, who are regularly exposed to notions of volatility, seem to confuse mean absolute deviation with standard deviation, causing an underestimation of 25% with theoretical Gaussian variables. In some fat tailed markets the underestimation can be up to 90%. The mental substitution of the two measures is consequential for decision making and the perception of market variability."
This interests me, partly because I've recently been thinking about summarizing variation by the mean absolute difference between two randomly sampled units (in mathematical notation, E(|x_i-x_j|)), because that seems like the clearest thing to visualize. Fred Mosteller liked the interquartile range, but that's a little too complicated for me; also, I like to do some actual averaging, not just medians, which miss some important information. I agree with Goldstein and Taleb that there's not necessarily any good reason for using sd (except for mathematical convenience in the Gaussian model).
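For the Gaussian case the relation between these summaries can be written down exactly, and a short R sketch makes the 25% figure visible. The closed forms are standard results for the normal distribution (mean absolute deviation = sigma*sqrt(2/pi), and E(|x_i-x_j|) = 2*sigma/sqrt(pi)); the variable names, seed, and sample size are my own arbitrary choices.

```r
sigma <- 1

# Closed forms for a normal distribution with standard deviation sigma.
mad.normal <- sigma * sqrt(2/pi)        # E|x - mean|
pair.diff.normal <- 2*sigma/sqrt(pi)    # E|x_i - x_j| for independent draws

# Simulation check (the seed and sample size are arbitrary choices).
set.seed(1)
x <- rnorm(1e5, 0, sigma)
mad.sim <- mean(abs(x - mean(x)))
# Pair the first half of the draws with the second half.
pair.diff.sim <- mean(abs(x[1:5e4] - x[5e4 + 1:5e4]))

print(c(sd.over.mad = sigma/mad.normal, mad.sim = mad.sim,
        pair.diff = pair.diff.normal, pair.diff.sim = pair.diff.sim))
```

The ratio sd/MAD is sqrt(pi/2), about 1.25, which is one way to read the "25%" in the quote: someone who thinks in terms of mean absolute deviation but is handed a standard deviation is looking at a number about a quarter larger than the one they have in mind.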
Posted by Andrew at 5:23 AM | Comments (0) | TrackBack
April 13, 2007
Lotteries: A Waste of Hope
Statisticians are always looking for ways to convince people not to play the lottery. Here's another reason (from Eliezer Yudkowsky).
Posted by Andrew at 7:54 AM | Comments (1) | TrackBack
February 21, 2007
Discontinuities in valuation of poker hands
Jacks are better than 10's, but just a little better. I suspect jacks are generally overvalued because they are face cards. I'm not sure how I'd use this knowledge (supposing for a moment that my conjecture is actually true), it's just a thought I had after reading a couple of books about poker.
Posted by Andrew at 12:25 AM | Comments (5) | TrackBack
February 13, 2007
Truth is stranger than fiction
Robin Hanson asks the following question here:
How does the distribution of truth compare to the distribution of opinion? That is, consider some spectrum of possible answers, like the point difference in a game, or the sea level rise in the next century. On each such spectrum we could get a distribution of (point-estimate) opinions, and in the end a truth. So in each such case we could ask for truth's opinion-rank: what fraction of opinions were less than the truth? For example, if 30% of estimates were below the truth (and 70% above), the opinion-rank of truth was 30%.
If we look at lots of cases in some topic area, we should be able to collect a distribution for truth's opinion-rank, and so answer the interesting question: in this topic area, does the truth tend to be in the middle or the tails of the opinion distribution? That is, if truth usually has an opinion rank between 40% and 60%, then in a sense the middle conformist people are usually right. But if the opinion-rank of truth is usually below 10% or above 90%, then in a sense the extremists are usually right.
My response:
1. As Robin notes, this is ultimately an empirical question which could be answered by collecting a lot of data on forecasts/estimates and true values.
2. However, there is a simple theoretical argument that suggests that truth will be, generally, more extreme than point estimates; that is, the opinion-rank (as defined above) will have a distribution that is more concentrated at the extremes as compared to a uniform distribution.
The argument (with pictures) goes as follows:
Suppose that everybody's Bayesian, everybody has the same prior distribution, but with different small amounts of data. To give some notation: suppose we will be looking at a sequence of parameters, theta_1, theta_2, theta_3, ... with a common prior distribution p(theta), which represents the true distribution of this population of theta's. (We could further suppose a hierarchical structure, so that p(theta) has hyperparameters that are estimated from data, but this is not necessary for our discussion here.) For simplicity, suppose p(theta) is a normal (bell-shaped) curve centered at 0 with standard deviation sigma.
Now suppose you get some data, y, on a parameter, theta, and summarize your inference by a point estimate which is your posterior mean, theta.hat = E(theta|y). Averaging over all possible data y that you might see, this posterior mean has a sampling distribution which is centered about 0 but with a standard deviation less than sigma. This derives from an application of the basic variance-decomposition identity: var(theta.hat) = var(E(theta|y)) = var(theta) - E(var(theta|y)), which tells us that the theta.hat's are less variable than the underlying thetas. (This is a point we make in our paper, All Maps of Parameter Estimates are Misleading, and it also is discussed in some papers by Tom Louis.)
Here's some R code, producing the graph below:
J <- 200                      # number of parameters (topics)
mu.theta <- 0
sigma.theta <- 1
theta <- rnorm (J, mu.theta, sigma.theta)

n <- 100                      # number of opinions per parameter
sigma.y <- .5
y <- array (NA, c(J,n))
theta.hat <- array (NA, c(J,n))
theta.hat.rank <- rep (NA, J)

for (j in 1:J){
  y[j,] <- rnorm (n, theta[j], sigma.y)
  # posterior mean: precision-weighted average of the data and the prior mean
  theta.hat[j,] <- (y[j,]/sigma.y^2 + mu.theta/sigma.theta^2)/
    (1/sigma.y^2 + 1/sigma.theta^2)
  # opinion-rank of the truth: fraction of point estimates below the true value
  theta.hat.rank[j] <- mean (theta.hat[j,] < theta[j])
}

par (mfrow=c(2,2))
x.range <- range (theta,y,theta.hat)
hist (theta, xlim=x.range, yaxt="n", ylab="", main="True parameter values")
hist (y, xlim=x.range, yaxt="n", ylab="", main="Data")
hist (theta.hat, xlim=x.range, yaxt="n", ylab="", main="Point estimates of parameters")
hist (theta.hat.rank, main="Opinion-ranks of estimates")

Getting back to Robin's question: so, if everybody is Bayesian, using a prior distribution that correctly reflects the distribution of the underlying parameters being modeled, then, the point estimates will, on average, be closer to the center of the distribution as compared to the true values. (To put it another way, the parameter estimates are shrunk toward the prior mean.) And so the truth will look stranger than fiction--if fiction is thought of as point estimates!
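The amount of shrinkage can be checked numerically. The sketch below reuses the same sigma.theta = 1 and sigma.y = .5 as in the simulation, but with a single observation per parameter; the seed, the number of draws, and the names post.var and var.theta.hat are my own choices for illustration.

```r
sigma.theta <- 1
sigma.y <- .5

# Posterior variance of theta given one observation y ~ N(theta, sigma.y^2).
post.var <- 1/(1/sigma.y^2 + 1/sigma.theta^2)
# Variance decomposition: var(theta.hat) = var(theta) - E(var(theta|y)).
var.theta.hat <- sigma.theta^2 - post.var

# Simulation check (seed and number of draws are arbitrary choices).
set.seed(2007)
theta <- rnorm(1e5, 0, sigma.theta)
y <- rnorm(1e5, theta, sigma.y)
# Posterior mean with prior mean 0, so the prior term drops out.
theta.hat <- (y/sigma.y^2)/(1/sigma.y^2 + 1/sigma.theta^2)

print(c(sd.truth = sd(theta), sd.estimates = sd(theta.hat),
        sd.theory = sqrt(var.theta.hat)))
```

With these settings the theoretical variance of the point estimates is 1 - 0.2 = 0.8, so their standard deviation is about 0.89 compared to 1 for the true values. That is a modest shrinkage, and it gets stronger as the data get noisier relative to the prior.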
3. This point arises in many statistical examples: one's best guess is inherently more sober than what might possibly happen, which is one argument for considering fanciful possibilities in fiction. Taking your best point estimate at every step of the way will not give a realistic simulation of reality. Reality occasionally includes the unexpected.
4. We can apply this reasoning to sports scores, for example. Football games can be predicted to an accuracy of about 14 points (that is, the difference between the score differential and the point spread has an approximate normal distribution with mean 0 and standard deviation 14); see chapter 1 of Bayesian Data Analysis and some data here. Looking at these data:
- The average difference between winner's and loser's score is 12 points.
- The average spread (point prediction of difference between winner and loser) is 5.3 points.
- 71% of the time, the score is more extreme (in difference between winner's and loser's score) than the spread. (The favorite beats the spread in about half the games, and in another 20% or so of the games, the underdog actually wins by a larger margin than the favorite was favored.)
- The distribution of actual game outcomes (as measured by score differentials) is more extreme than the distribution of the point predictions.
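As a rough consistency check on the 71% figure, here is a back-of-the-envelope calculation in R. It assumes every game has the average spread of 5.3 points (a simplification, since real spreads vary from game to game) and that the score differential is normal with mean equal to the spread and standard deviation 14, as stated above. The variable names are hypothetical.

```r
spread <- 5.3       # average point spread, from the data summary above
sigma.game <- 14    # sd of (score differential - spread), from the text

# Assume outcome ~ N(spread, sigma.game^2), measured from the favorite's side.
# The favorite wins by more than the spread:
p.favorite.covers <- 1 - pnorm(spread, mean=spread, sd=sigma.game)
# The underdog wins by more than the favorite was favored:
p.underdog.blowout <- pnorm(-spread, mean=spread, sd=sigma.game)
p.more.extreme <- p.favorite.covers + p.underdog.blowout

print(round(c(p.favorite.covers, p.underdog.blowout, p.more.extreme), 3))
```

Under these assumptions the favorite covers half the time, the underdog wins big a bit over 20% of the time, and the total comes out near 72%, close to the 71% observed in the data.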
Posted by Andrew at 10:58 AM | Comments (0) | TrackBack
February 10, 2007
Detecting lies
Robin Hanson points to a list of methods for detecting lies:
1. Look for inconsistencies
2. Ask unexpected questions
3. Compare to when they truth-tell
4. Watch for fake smiles and emotions
5. Listen to your gut reaction
6. Watch for microexpressions
7. Are words and gestures consistent
8. Are they unusually uneasy
9. Watch for too much detail
10. Focus on the truths you find
In response, billswift comments that items 2,4,5,8,9 would be failed by autistic people, even if not lying.
I'd like to add to this that items 4,5,6,8 can be failed by people with Tourette's syndrome, since an inability to look people in the eye is often taken as a sign of untrustworthiness (hence, flagging items 4 and 5), twitching can be taken as a sign of uneasiness (item 8) as well as allowing the observer to read in all sorts of microexpressions (item 6). After all, "shifty-eyed" people are liars, right?
More generally, the whole "gut reaction" thing can reinforce prejudice against anyone who behaves differently.
This is not to say that these lie-detection methods don't work--I'd be interested in seeing the details of an empirical study--but it's no fun being on the other end of this sort of appraisal.
P.S. In a comment on Robin's blog, Anders Sandberg writes,
As for detecting lies, it is better to combine cues than look for individual cues. Aldert Vrij & Samantha Mann, Detecting Deception: The Benefit of Looking at a Combination of Behavioral, Auditory and Speech Content Related Cues in a Systematic Manner, Group Decision and Negotiation 13: 61–79, 2004 http://www.springerlink.com/content/r6x2363031787h1x/ lists a variety of possible cues and talks about tests of detecting lies using them. They conclude that:
"there is growing evidence that CBCA scores, Reality Monitoring scores and some nonverbal cues, particularly illustrators and hand and finger movements are useful to look at. It sounds reasonable to suggest that the more these cues occur simultaneously in a person’s response, the more likely it is that the person is lying. Our own study (Vrij et al., 2000) showed that lie detection with each of the cues individually did not result in high hit rates. In other words, it is essential to work with multiple cue models."
"The combined analyses revealed the most accurate classification of liars and truth tellers with a total hit rate of 81% (85% lie detection hit rate and 77% truth detection hit rate)."
One useful trick according to them is to compare the possible lie with a baseline of normal behavior for the person. But the method will only work if it is applied correctly, and they in particular point out the problems caused by making accusations that lead to biases in both the interviewer and interviewee. There is also a lot of widespread myths about individually reliable cues such as "liars look away" and "liars make many movements", making many "expert" lie detectors actually worse than normal people at deception detection because they only look at single factors. The authors actually suggest a method to train away this bias, by having police either state whether people in a video are lying, or whether they have to think hard. Afterwards they can confront their scores and see that the thinking hard approach works much better.
Posted by Andrew at 12:31 AM | Comments (8) | TrackBack
February 9, 2007
Internet weather forecast accuracy
David Madigan pointed me to this interesting analysis of internet weather forecasts. I think the person who wrote the article was pretty annoyed. Key quote:
The hail, rain and lightning eventually subsided, but the most alarming news was waiting on cell phone voicemail. A friend who lived in the area had called frantically, knowing we were at the park, as the local news was reporting multiple people had been struck by lightning at Schlitterbahn during the storm.
"So much for the 0% chance of rain," I repeated.
The post continues with analysis of temperature forecasts, but maybe they'll go back and look at precipitation too. They also have to work on their graphs--but that's the trouble with using Excel, I suppose. Here's an example:

Posted by Andrew at 11:43 AM | Comments (4) | TrackBack
February 8, 2007
The fallacy of the one-sided bet (for example, risk, God, torture, and lottery tickets)
As a researcher and teacher in decision analysis, I've noticed a particular argument that seems to have a lot of appeal to people who don't know better. I'll call it the one-sided bet. Some examples:
- How much money would you accept in exchange for a 1-in-a-billion chance of immediate death? Students commonly say they wouldn't take this wager for any amount of money. Then I have to explain that they will do things such as cross the street to save $1 on some purchase, there's some chance they'll get run over when crossing the street, etc. (See Section 6 of this paper; it's also in our Teaching Statistics book.)
- Goals of bringing the levels of various pollutants down to zero. With plutonium, I'm with ya, but other things occur naturally, and at some point there's a cost to getting them lower. And if you want to get radiation exposure down to zero, you can start by not flying and not living in Denver.
- Pascal's wager: that's the argument that you might as well believe in God because if he (she?) exists, it's an infinite benefit, and if there is no god, it's no loss. (This ignores possibilities such as: God exists but despises believers, and will send everyone but atheists to hell. I'm not saying that this is highly likely, just that, once you accept the premise, there are costs to both sides of the bet.) See also this from Alex Tabarrok and this from Lars Osterdal.
- Torture and the ticking time bomb: the argument that it's morally defensible (maybe even imperative) to torture a prisoner if this will yield even a small probability of finding where the ticking (H)-bomb is that will obliterate a large city. Again, this ignores the other side of the decision tree: the probability that, by torturing someone, you will motivate someone else to blow up your city.
- Anything having to do with opportunity cost.
- The argument for buying a lottery ticket: $1 won't affect my lifestyle at all, but even a small chance of $1 million--that will make a difference! Two fallacies here. First, most lottery buyers will get more than 1 ticket, so realistically you might be talking hundreds of dollars a year, which indeed could affect your standard of living. Second, there actually is a small chance that the $1 can change your life--for example, that might be the extra dollar you need to buy a nice suit that gets you a good job, or whatever.
There are probably other examples of this sort of argument. The key aspect of the fallacy is not that people are (necessarily) making bad choices, but that they only see half of the problem and thus don't realize there are tradeoffs at all.
P.S. When I was young and stupid, I spent some time trying to convince a student in my intro statistics class that it was a bad idea to play the lottery. In retrospect, I should've told him that it was fine, and just delineated where the probability calculations were relevant (for example, if he were to play the lottery twice a week for a year, or whatever).
Posted by Andrew at 10:04 AM | Comments (10) | TrackBack
February 2, 2007
"Unintended consequences" often were actually intended
I don't have much to say here, except that the concept of "unintended consequences" is so appealing that I think it's often applied to settings where the consequences actually were anticipated and intended, at least by some of the parties involved.
Posted by Andrew at 12:37 AM | Comments (5) | TrackBack
November 28, 2006
"Happiness" from economic and psychological perspectives
In a comment on this entry, Thom writes,
I'm not convinced that what we call happiness is a single thing. We could probably divide it into (at least) two concepts - local happiness "this instant" and general happiness. I think that having children relates more to the latter (or possibly towards a related concept like fulfilment).
Beyond that you'd need a theoretical account of happiness to make sense of what's going on. The (naive) economic analysis is that happiness leads to inaction, but some theories of emotion propose the opposite (with evidence in support). For example the broaden and build theory of emotion proposes that the evolutionary function of positive emotions is to build resources - so you'd maybe expect happy people to plan for the future (whereas we know very unhappy people don't).
I'm especially interested in his second comment--the point about action and inaction is something I'd never thought about. From an economic standpoint, if you are at a maximum of relative happiness, you would want to do what it takes to stay there (which might be inaction, but it might be to work your tail off, if, for example, you're happy but in major debt). For unhappy people, one could try a reverse explanation: if you're unhappy despite everything you've tried, then maybe giving up seems like the best alternative.
Posted by Andrew at 8:59 AM | Comments (2) | TrackBack
November 6, 2006
Serenity prayer (rerun)
This (sent to me several months ago by Will Fitzgerald) is so great I had to run it again:

Posted by Andrew at 12:41 AM | Comments (0) | TrackBack
November 4, 2006
Civil liberties and war
Adam Berinsky is presenting this paper at the New York Area Political Psychology Meeting today. I don't have much to say about the content of the paper, except that a key issue would seem to me to be framing: are civil liberties a luxury (as our math professors would say in college when proving a theorem, "culture") that we can't afford in wartime, or are civil liberties a form of security that is needed more than ever during a war? I would think that many of the controversies about civil liberties--in policy discussions and in public opinion--depend on this framing.
In any case, I have some comments about the graphs in the paper. First, I like how the paper follows in the Page and Shapiro tradition of presenting results graphically rather than as tables. For the Berinsky paper, I'd recommend more consistency in the presentation, basically displaying the information, wherever possible, as line plots with time on the x-axis. This parallelism will make the paper easier to read, I think--partly because the graphs can be made physically small and thus fit into the text better, also because a compact display allows more information to be displayed and be made visible in one place (so that the reader--and the researcher--can see more comparisons and learn more).
In detail:
The x-axes should be cleaner. I'd recommend either putting a tick mark at Jan 1 for each year, or else showing year boundaries on the x-axis and putting the year labels between tick marks (so that, for example, "2003" is placed between the 1 Jan 2003 and 1 Jan 2004 tick marks). It's confusing to read graphs such as Fig 5.1 with tick marks at "Jul-2001", "May-2003", "Mar-2003", etc.
Fig 5.2 is hard to read. I'd recommend actually replacing Fig 5.2 by 3 small figures, one for each of the poll questions you're analyzing. Figs would be on common scales, and for each fig, you can show the time series for Reps, Dems, and Independents. I'd also like to see these go back before 1995. Perhaps you can get similar questions from NES?
Figs 5.3 and 5.4 should be combined as time series. Also, I'd like to see these questions ordered in increasing (or decreasing) support for the "no on civil liberties" response.
Fig 5.5 should have some data on it. Actually, I think it should be rewritten as a time series, with year on the x-axis and 2 lines for the 2 levels of war support (0 and 1). Also, I'd make this richer in info by considering subsets of the population. A famous example is education: highly-educated people supported the war more.
Fig 5.6 ("Threat and intolerance") could use a more descriptive title. What are the questions here? It's good for figures to be self-contained. Also, the lines should be labeled directly (not with a legend) and the x-axis should just have labels every 10 years. Again, maybe more could be learned by looking at subsets of the population or at other questions.
Fig 5.7: a little confusing. Maybe breaking up into 2 or 3 or 4 little plots (arranged on a grid) would help. Also, I'd label the x-axis as discussed in the Fig 5.1 comment.
Figs 5.8 and 5.9 should be combined and presented as time series.
Posted by Andrew at 7:47 AM | Comments (0) | TrackBack
November 1, 2006
"Loss aversion" isn't always
This entry by Will Wilkinson reminded me of something that's bugged me for a while, which is the use of the term "loss aversion" to describe something that I'd rather call "uncertainty aversion," if that. (Wilkinson doesn't actually do this thing that irritates me--he actually is talking about loss aversion, referring to actual aversion to loss--but he reminds me of this issue.)
As I wrote before,
If a person is indifferent between [x+$10] and [55% chance of x+$20, 45% chance of x], for any x, then this attitude cannot reasonably be explained by expected utility maximization. The required utility function for money would curve so sharply as to be nonsensical (for example, U($2000)-U($1000) would have to be less than U($1000)-U($950)). This result is shown in a specific case as a classroom demonstration in Section 5 of a paper of mine in the American Statistician in 1998 and, more generally, as a mathematical theorem in a paper by my old economics classmate Matthew Rabin in Econometrica in 2000. . . .
Matt attributes the risk-averse attitude at small scales to "loss aversion." As Deb points out, this can't be the explanation, since if the attitude is set up as "being indifferent between [x+$10] and [55% chance of x+$20, 45% chance of x]", then no losses are involved. I attributed the attitude to "uncertainty aversion," which has the virtue of being logically possible in this example, but which, thinking about it now, I don't really believe.
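The arithmetic behind the U($2000)-U($1000) comparison is short enough to write out in R. This is only a sketch of the specific example quoted above, not Rabin's general theorem; the variable names and the choice to scale the first $10 step to 1 are mine.

```r
# Indifference between [x+10] and [55% x+20, 45% x] for every x gives
#   U(x+10) = 0.55*U(x+20) + 0.45*U(x),
# so 0.55*(U(x+20) - U(x+10)) = 0.45*(U(x+10) - U(x)):
# each successive $10 step is worth 0.45/0.55 = 9/11 of the one before it.
ratio <- 0.45/0.55

# Utility of each $10 step starting at $950, with the first step scaled to 1.
# Step k (k = 0, 1, ..., 104) runs from 950 + 10k to 960 + 10k.
step.value <- ratio^(0:104)

gain.950.to.1000 <- sum(step.value[1:5])      # 5 steps: 950 -> 1000
gain.1000.to.2000 <- sum(step.value[6:105])   # 100 steps: 1000 -> 2000

print(round(c(gain.950.to.1000, gain.1000.to.2000), 3))
```

On this scale, going from $950 to $1000 is worth about 3.48 units of utility while going from $1000 to $2000 is worth only about 2.02, which is the nonsensical curvature described in the quote.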
Right now, I'm inclined to attribute small-stakes risk aversion to some sort of rule-following. For example, it makes sense to be risk averse for large stakes, and a natural generalization is to continue that risk aversion for payoffs in the $10, $20, $30 range. Basically, a "heuristic" or a simple rule giving us the ability to answer this sort of preference question.
There was some discussion of this on the blog last year. To recap briefly, no, I don't think this example is loss aversion, since no losses are involved. Yes, you could shift the problem by subtracting, to get losses, but that's not how it's framed. Getting back to the $40,$50,$60 example: if you want, you can say that the very mention of the $50 makes anything less seem like a loss, but I don't see it. I think the evidence is that people react to actual losses much more strongly than to a non-gain.
Risk aversion. No, it's loss aversion. No, it's uncertainty aversion. No, it's rule-following.
Anyway, my problem here is with "loss aversion" used in an automatic way to summarize various aspects of irrationality (such as avoidance of expected monetary value for small dollar amounts). My take on it (which is probably historically inaccurate) was that decision scientists first simply assumed that people used expected monetary value. Then they coined the term "risk aversion" and associated it with concave utility functions. Simple calculations (such as mine and Matt's, mentioned above) made it clear to many people (eventually everyone, I hope) that the typical non-EMV attitudes cannot be sensibly fit into an expected-utility framework. This led to ideas such as prospect theory which had aspects of expected utility but with biases caused by framing, confusions about probability, loss aversion, and so forth.
Now loss aversion is the catchphrase--and I agree, it's an improvement on the now-meaningless "risk aversion"--but I think it's silly to apply "loss aversion" to settings with no losses. Really, in some of these settings, I don't see "aversion" at all but rather a preference for certainty (perhaps "uncertainty aversion") or even just the following of a rule.
The big issue
The big issue pointed out implicitly by Wilkinson (and others) is that people often seem to respond to the trend rather than the absolute level of the economy. I'm certainly not meaning to imply that, in battling over terminology, I'm resolving these deeper issues. My goal here is simply to point out that some commonly-used terms can have misleading implications.
Regarding Wilkinson's actual entry, his discussion is interesting, but I'm confused by his main point, which seems to be:
(a) Middle-class Americans shouldn't be so scared about losses--they'd still be able to get by OK on half their incomes.
(b) By being less afraid of losses, a middle-class American could take more risks which could result in a doubling of his or her income.
But, if point (a) is true, and you could easily live on half, then what's the motivation to double your income? Shouldn't we all just be taking more vacations?
I'm not trying to disagree with Wilkinson's point that many people's economic lives might not be so precarious as they think--as he puts it, middle-class Americans get a lot of things for free. I just don't see why this implies that people should be taking more risks.
Posted by Andrew at 12:26 AM | Comments (4) | TrackBack
September 28, 2006
Interesting decision analysis project
Pre-Doctoral Clinical Research Fellowship at MSKCC
The Department of Psychiatry & Behavioral Sciences of Memorial Sloan-Kettering Cancer Center (MSKCC; www.mskcc.org) invites applications for a part-time pre-doctoral clinical research fellowship in the behavioral aspects of cancer prevention and control.
The position is supported by the National Cancer Institute (NCI) and provides mentored training in cancer control and prevention (smoking cessation) activities and quality of life in cancer. The fellowship provides an excellent opportunity for research career development in a healthcare setting. Current projects include a randomized intervention trial using real-time data capture (RTDC) methodology via handheld computers to promote presurgical smoking cessation among newly diagnosed cancer patients, and a study of early adjustment and quality of life in lung cancer patients treated surgically. Pre-doctoral fellows actively participate in project development and implementation, including data management, data analyses, and dissemination of findings via professional presentations and manuscript preparation. Fellows also are encouraged to attend weekly formal lectures and seminars and an advanced colloquium in research design and statistical methods.
Stipends are highly competitive and benefits are excellent. Conference/travel funds are provided. The ideal candidate should be enrolled in graduate school and have completed at least one, preferably two, years of graduate studies in psychology, public health, or a related field. The position is part-time, with a minimum commitment of 15 hours/week. Strong interest in clinical research and solid computer and quantitative skills preferred. We encourage pre-doctoral fellows to pursue master’s theses and doctoral dissertations in related areas and provide mentoring for completion of these and other academic requirements. Send cover letter summarizing research interests/experiences, curriculum vitae and three professional references to: Jamie Ostroff, Ph.D. (ostroffj@mskcc.org) and/or Jack Burkhalter, Ph.D. (burkhalj@mskcc.org), Department of Psychiatry & Behavioral Sciences, Memorial Sloan-Kettering Cancer Center, 641 Lexington Ave., 7th floor, New York, NY 10022.
Posted by Andrew at 12:39 AM | Comments (0) | TrackBack
September 27, 2006
Variable ordering fallacy: why people continue to disagree
A couple of debates seem to never stop: nature vs nurture, ability versus luck, role of society vs personal responsibility. The fundamental problem in these discussions is that one group of people considers one of the causes more important than the other one, and the other group disagrees. In this entry, I will attempt to show an explanation of this problem with my interaction analysis framework.
I have taken the "rodents" dataset. Cases are apartments in New York City, the covariates are the number of defects, the poverty score and the race for the apartment, whereas the outcome is whether there were rodents found in the building. The result of the analysis in the form of an interaction graph is as follows:

The defects are clearly by far the best predictors of rodents (13.2% of explained variation), this is followed by race (7.9%) and then by the poverty score (7.1%). What is important is that none of the covariates is explained away by the others. The links between covariates indicate the correction that is necessary as both covariates provide in part the same information about the outcome. In particular, should we predict rodents using poverty and race, the actual amount of variance explained would be 7.1+7.9-3.0=12.0%.
The trouble is that -3.0 factor. If race and poverty weren't correlated, it would be zero (or positive). But as they are correlated, there is ambiguity with respect to what is primary, race or poverty, in predicting the rodents. In particular, one could say that the increased frequency of rodents among minorities can be explained by poverty. With this, we would assign 7.1% of explained variance to poverty and 7.9-3.0=4.9% to race.
On the other hand, we could say that minorities have a cultural bias, an example of which is that they don't keep as many pets like cats and dogs that prey upon rodents. Thus, cultural biases can explain an increased likelihood of rodents, along with, say, racist landlords that refuse to fix cracks in an apartment of a householder of the wrong race. Poverty could also be a consequence of these cultural biases (preferring one profession to another) or even race directly, either in terms of innate ability, in terms of discrimination or in terms of the "poverty trap". With such an interpretation we would allocate 7.9% of explained variance to race, a proxy for culture, and 7.1-3.0=4.1% to poverty.
Same data, same model, but two interpretations: because of the correlation between race and poverty, we do not know how to divide the 3% of shared information among the two variables. People will continue to disagree. Sometimes it is possible to resolve this dilemma when one variable completely explains away the other one, but this isn't the case here. What to do?
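The arithmetic behind the two interpretations can be laid out side by side. This is only a sketch using the percentages quoted above; the `allocate` helper is a hypothetical name introduced here, and it simply credits whichever variable is named first with its full share.

```python
# Percent of explained variation quoted in the post for the rodents data.
alone = {"poverty": 7.1, "race": 7.9}
overlap = 3.0  # information about rodents shared by poverty and race

# Variation explained by poverty and race together.
joint = sum(alone.values()) - overlap


def allocate(first, second):
    """Give `first` its full share; `second` gets what remains of the joint total."""
    return {first: alone[first], second: round(joint - alone[first], 1)}


poverty_first = allocate("poverty", "race")  # "poverty explains the race effect"
race_first = allocate("race", "poverty")     # "race (culture) explains poverty"

print(joint, poverty_first, race_first)
```

Both allocations add up to the same 12.0%; the only thing that differs is which variable absorbs the 3.0% overlap, and that depends on the ordering you choose rather than on the data.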
If poverty and race were not correlated, this problem would not appear. So one way of remedying the problem would be through controlled experiment. The trouble is that one cannot change someone else's race at random.
Another is the shut up and calculate approach: just employ logistic regression and see what the coefficients are:
glm(formula = y ~ defects + poor + as.factor(race), family = binomial,
    data = nd)
                 coef.est coef.se
(Intercept)         -3.10    0.06
z.defects            1.38    0.04
z.poor               0.60    0.05
as.factor(race)2     1.07    0.06
as.factor(race)3     1.08    0.08
as.factor(race)4     1.34    0.07
as.factor(race)5     0.69    0.09
as.factor(race)6     0.85    0.45
as.factor(race)7     0.60    0.27
  n = 13931, k = 9
  residual deviance = 12427.1, null deviance = 15185.1 (difference = 2758.1)
The regression gives specific values that assign an importance to each covariate. In this case, race is more important than poverty. The trouble is that the regression coefficients are sometimes haphazard or even counterintuitive as measures of feature importance. Imagine that y = a*x1+b*x2+e, and that x1=x2: then every pair (a, b) with the same sum a+b=c fits the data equally well, so the data cannot say how to split c between the two covariates.
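A small simulation makes the x1=x2 case concrete. This is a sketch on made-up data (the sample size, the true sum of 3, and the noise level are all assumptions chosen for illustration), written in plain Python rather than R so that it needs no extra packages.

```python
import random

random.seed(0)

# Hypothetical data: two identical covariates and an outcome with a + b = 3.
x1 = [random.gauss(0, 1) for _ in range(200)]
x2 = list(x1)  # x2 is an exact copy of x1
y = [3.0 * u + random.gauss(0, 0.5) for u in x1]


def rss(a, b):
    """Residual sum of squares for the fit y = a*x1 + b*x2."""
    return sum((yi - a * u - b * v) ** 2 for yi, u, v in zip(y, x1, x2))


# Very different (a, b) pairs that all share the same sum a + b = 3.
same_sum = {(a, 3.0 - a): rss(a, 3.0 - a) for a in (-10.0, 0.0, 1.5, 3.0, 13.0)}
spread = max(same_sum.values()) - min(same_sum.values())

print(spread)       # zero up to floating-point rounding
print(rss(1.0, 1.0) > rss(1.5, 1.5))  # changing the sum does change the fit
```

The fit depends only on a+b, so a split of (13, -10) is exactly as good as (1.5, 1.5). With covariates that are strongly but not perfectly correlated, the coefficients are determined but only weakly, which is the practical version of the same problem.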
For that matter, given the same data with correlated covariates, people will continue to disagree on how important individual covariates are. Regression coefficient magnitude can be seen as a tiebreaker, but if one denies the authority/truth of the linear best-fitting model, it can be questioned. In many cases, it is impossible to disentangle variables that always tend to stick together. Trying to separate them would be artificial.
Posted by Aleks Jakulin at 8:50 AM | Comments (0) | TrackBack
September 13, 2006
Should you wear a bicycle helmet?
Rebecca pointed me to this interesting article by Ben Hoyle in the London Times, "Helmeted cyclists in more peril on the road." Hoyle writes:
Cyclists who wear helmets are more likely to be knocked off their bicycles than those who do not, according to research. Motorists give helmeted cyclists less leeway than bare-headed riders because they assume that they are more proficient. They give a wider berth to those they think do not look like “proper” cyclists, including women, than to kitted-out “lycra-clad warriors”.
Ian Walker, a traffic psychologist, was hit by a bus and a truck while recording 2,500 overtaking manoeuvres. On both occasions he was wearing a helmet.
During his research he measured the exact distance of passing traffic using a computer and sensor fitted to his bicycle. Half the time Dr Walker, of the University of Bath, was bare-headed. For the other half he wore a helmet and has the bruises to prove it.
He even wore a wig on some of his trips to see if drivers gave him more room if they thought he was a woman. They did.
He was unsure whether the protection of a helmet justified the higher risk of having a collision. “We know helmets are useful in low-speed falls, and so are definitely good for children.”
On average, drivers steered an extra 3.3 in away from those without helmets to those wearing the safety hats. Motorists were twice as likely to pass “very close” to the cyclist if he was wearing a helmet.
Not just risk compensation
This is interesting: I was aware of the "risk compensation" idea, that helmeted riders will ride less safely, thus increasing the risk of accident (although the accident itself may be less likely to cause serious injury), as has been claimed with seat belts, antilock brakes, and airbags for cars. (If it were up to me, I would make car bumpers illegal, since they certainly seem to introduce a "moral hazard" or incentive to drive less carefully.)
But I hadn't thought of the idea that the helmet could be providing a signal to the driver. From the article, it appears that the optimal solution might be a helmet, covered by a wig . . .
The distinction between risk compensation altering one's own behavior, and perceptions altering others' behavior, is important in making my own decision. On the other hand, my small n experience is that I have a friend who was seriously injured after crashing at low speed with no helmet. So it's tricky for me to put all the information together in making a decision.
Attitudes
The news article concludes with,
He [Walker] said: “When drivers overtake a cyclist, the margin for error they leave is affected by the cyclist’s appearance. Many see cyclists as a separate subculture. “They hold stereotyped ideas about cyclists. There is no real reason to believe someone with a helmet is any more experienced than someone without.”
I don't know the statistics on that, but I do think there's something to this "subculture" business. People on the road definitely seem to have strong "attitudes" to each other based on minimal information.
Self-experimentation
Finally, Rebecca pointed out that this is another example of self-experimentation. As with Seth's research, the self-experimenter here appears to have a lot of expert knowledge to guide his theories and data collection. Also amusing, of course, is that his name is Walker.
Posted by Andrew at 12:39 AM | Comments (10) | TrackBack
August 15, 2006
Judgment and decision making journal
Dan Goldstein links to this new online journal on decision analysis. It looks pretty interesting. I am positively disposed toward this article by David Gal, since what is often described as "loss aversion" is often better characterized as "uncertainty aversion" (see here, for example).
Posted by Andrew at 6:12 AM | Comments (0) | TrackBack
July 12, 2006
One or two-year research position in psychology/economics
Sheena Iyengar is a professor of psychology in the business school here who has worked on some interesting projects (including the speed-dating experiment). She writes,
I [Sheena] am looking for an ambitious, dedicated, and promising graduating senior interested in a full-time research assistant position for one to two years beginning August 1, 2006. Potential applicants should have a degree in either social/cognitive psychology or economics with an interest in the intersection of economics and the psychology of judgment and decision making. Preference is given to candidates with a strong math background and good writing skills who have had some research experience in a laboratory already. The salary for this position is $45,000 and includes all health benefits. The research assistant will be responsible for running experiments, managing a laboratory, conducting statistical analyses, and will have the opportunity to co-author in journal publications. It is a truly excellent opportunity for someone who is interested in pursuing a Ph.D. in behavioral economics, psychology, and/or related disciplines. If you are interested in applying for this position, please e-mail me, Professor Sheena S. Iyengar, at ss957@columbia.edu or call at 212 854-8308. I will be interviewing potential applicants immediately.
It looks interesting to me . . .
Posted by Andrew at 7:10 AM | Comments (0) | TrackBack
June 30, 2006
"The more rapid access to drugs on the market enabled by the Prescription Drug User Fee Act saved the equivalent of 180 to 310 thousand life-years BETWEEN 19XX AND XXXX."
As discussed here, I've been interested in finding studies of the costs and benefits of approvals of new medical treatments, but not in the narrow sense of the costs and benefits to those being treated, but the larger balance sheet, including costs of running the study, risks to participants, and likely gains to the general population. (For example, approving a study early allows for potentially more gains to the general population but also more risks of unforeseen adverse events.)
Jim Hammitt pointed me to this paper by Tomas J. Philipson, Ernst R. Berndt, Adrian H. B. Gottschalk, Matthew W. Strobeck, entitled "Assessing the Safety and Efficacy of the FDA: The Case of the Prescription Drug User Fee Acts." Here's the summary of the paper, and here's the abstract:
The US Food and Drug Administration (FDA) is estimated to regulate markets accounting for about 20% of consumer spending in the US. This paper proposes a general methodology to evaluate FDA policies, in general, and the central speed-safety tradeoff it faces, in particular. We apply this methodology to estimate the welfare effects of a major piece of legislation affecting this tradeoff, the Prescription Drug User Fee Acts (PDUFA). We find that PDUFA raised the private surplus of producers, and thus innovative returns, by about $11 to $13 billion. Dependent on the market power assumed of producers while having patent protection, we find that PDUFA raised consumer welfare between $5 to $19 billion; thus the combined social surplus was raised between $18 to $31 billions. Converting these economic gains into equivalent health benefits, we find that the more rapid access of drugs on the market enabled by PDUFA saved the equivalent of 180 to 310 thousand life-years. Additionally, we estimate an upper bound on the adverse effects of PDUFA based on drugs submitted during PDUFA I/II and subsequently withdrawn for safety reasons, and find that an extreme upper bound of about 56 thousand life-years were lost. We discuss how our general methodology could be used to perform a quantitative and evidence-based evaluation of the desirability of other FDA policies in the future, particularly those affecting the speed-safety tradeoff.
I haven't read the paper (that takes more effort than linking to it!) but I like that they're trying to measure all the costs and benefits quantitatively.
Posted by Andrew at 8:55 AM | Comments (0) | TrackBack
June 1, 2006
Costs and benefits of expedited drug approvals
Marcia Angell has an interesting article in the New York Review of Books on the case of Vioxx, the painkiller drug that was withdrawn after it was found to cause heart attacks. (She cites an estimate of tens of thousands of heart attacks caused by the use of Vioxx and related drugs, referring to Eric J. Topol, "Failing the Public Health—Rofecoxib, Merck, and the FDA," The New England Journal of Medicine, October 21, 2004.) Angell writes,
In late 1998 and early 1999, Celebrex and then Vioxx were approved by the FDA. They were given rapid "priority" reviews—which means the FDA believed them likely to be improvements over drugs already sold to treat arthritis pain. Was that warranted? Neither drug was ever shown to be any better for pain relief than over-the-counter remedies such as aspirin or ibuprofen (Advil) or naproxen (Aleve). But theory predicted that COX-2 inhibitors would be easier on the stomach, and that was the reason for the enthusiasm. As it turned out, though, only Vioxx was shown to reduce the rate of serious stomach problems, like bleeding ulcers, and then, mainly in people already prone to these problems, a small fraction of users. In other words, the theory just didn't work out as anticipated. Furthermore, people vulnerable to stomach ulcers could probably get the same protection and pain relief by taking a proton-pump inhibitor (like Prilosec) along with an over-the-counter pain reliever. So the COX-2 inhibitors did not really fill an unmet need, despite the one seemingly attractive claim made in favor of them.
She also goes into detail on conflict of interest in the FDA advisory committees, and recommends that the FDA shouldn't approve new drugs so hastily. This sounds like a good recommendation for Vioxx etc. (tens of thousands of heart attacks doesn't seem good). But how many drugs are there on the other side--effective drugs that are still waiting for approval? I'm curious what Angell's colleagues at the Harvard Center for Risk Analysis would say. Would it be possible to have an approval process that catches the Vioxx-type drugs but approves others faster?
Posted by Andrew at 7:14 AM | Comments (2) | TrackBack
May 4, 2006
Conservative decisions of football coaches: fourth-down conversion and default decisions
Tyler Cowen links to a paper by David Romer on football coaches' fourth-down decisions (punt, go for a field goal, or go for a first down or touchdown). Apparently a coach would increase the probability of winning the game by going for the first down or touchdown much more often, and punting and going for the field goal much less often.
I've heard this before--that "going for it" is the "percentage play"--and Romer asks in this paper why should it be. After all, the economic incentives clearly favor the idea of winning games. It seems like a huge Moneyball-style opportunity, the pro-sports equivalent of the famous $20 on the street in that joke about the 2 economists. And, for that matter, "going for it" on 4th down is a more exciting play, so it should make the fans happy (as compared to Moneyball-like strategies such as being patient at the plate and drawing walks, which arguably are so boring as to potentially lose fans).
Conservatism everywhere
As Romer points out, the conservative strategy of football coaches is a special case of the conservatism in decision making that appears in many contexts in the Kahneman-Slovic-Tversky literature. I like the term "conservatism" here and think it preferable to "risk aversion," a term that is so vague as to have no meaning anymore, I think.
At least from anecdotal evidence (e.g., stories about Woody Hayes), my impression is that football coaches are conservative in other ways too, and maybe these attitudes go together. In any case, my impression from reading Bill James and Moneyball is that sports decisions are often made more on flashy numbers than on more relevant data analysis. (I'm sure that this is true of the rest of us too in making our decisions--the sports coaches are just in the embarrassing position of having more hard data available.)
A rational reason for conservatism in this case
In the particular example of fourth-down conversion, somebody--I think it was Bill James--pointed out a possibly rational reason for coaches to be conservative. The argument goes as follows: if a strategy succeeds and the game is won, everyone's happy. The real issue comes when it fails. If the coach did the standard strategy and fails, then hey, it's too bad, but everyone (well, everyone but George Steinbrenner) knows you can't win 'em all. But if the coach does something that is perceived to be "radical" and it fails, then he looks bad and is much more easily Monday-morning-quarterbacked. Even if the probability of winning is higher under the radical strategy, the medium-term expected payoff (i.e., probability that the coach keeps his job at the end of the season) could be higher under the conservative strategy.
How does this differ from Romer's theory? Romer suggests a risk-aversion based on probability of winning. In my theory, the "default strategy" plays a key role. There is path dependence and an economic motivation to follow the default strategy. (This is in addition to the much-observed psychological phenomenon that people do the default, even at significant personal financial costs.) Romer doesn't mention the idea of defaults in his paper but I think that's the next step in studying the phenomenon of conservatism in decision making.
I'm curious what Hal Stern thinks of all of this.
Posted by Andrew at 6:51 AM | Comments (4) | TrackBack
Distinction between different scenarios of group decision-making
A few years ago, I was at a mini-conference on information aggregation in decision making, at which there was a lot of discussion of group decision-making procedures, and individual strategies in group decision contexts. I was bothered that there was a lot of talk about decision-making rules, but not so much about the ways that rules interact with the types of decision problems, which I categorized as:
1. combining information (as in perception and estimation tasks)
2. combining attitudes (as in national elections)
3. combining interests (as in competitive games and distributive politics)
I considered three different group-decision scenarios: (1) "inference," (2) "difference of opinion," and (3) "conflict of interest," and discussed demarcation points to identify the scenarios. My claim is that different information-combining strategies are appropriate in these different scenarios, and that blurring these distinctions (for example, thinking of the "marketplace of ideas" or analogizing from Arrow's theorem of preferences to rules for ranking Google pages) can mislead.
For more, see pages 16-26 of this set of slides.
Posted by Andrew at 12:29 AM | Comments (0) | TrackBack
April 11, 2006
Murphy's laws for grunts
Marty Ringo sent me the following comments on my paper on the (mis)application of the Prisoner's Dilemma to trench warfare. I appreciate the comments, especially given that the closest I've come to military service was the Boy Scouts when I was 11, and the last time I was in combat was a fistfight in 7th grade. Anyway, Ringo writes,
As an ex-NCO (non-commissioned officer, i.e. sergeant) I have varying degrees of prejudice about academics writing about combat. This is, of course, self-contradicting since I am a semi-academic and most of what I know about combat has been acquired from reading in my post-military life.
S.L.A. Marshall's famous study on combat fire found that few, maybe something like 20%, of the soldiers in WW II actually fired their weapons in combat. Marshall's research methods have been since questioned, but the point he raised still lingers. A famous WW I joke pertains to this issue.
There once was the young enlisted soldier who had risen through courage and competence to sergeant and had been recommended for commission. To get his lieutenant's bars, he had to pass a tough test of military knowledge. Despite his lack of formal education, the young man got every answer correct but one. The examining colonel was delighted, and, hoping to assist this military prodigy, reviewed the one incorrect answer with the youth. The question posed a battle situation in which the company in question was pinned down on a rocky ridge by fire from a superior position, and then asked how the officer should raise a flag to boost the company's morale. The young man explained how this could be done with an intricate series of ropes hooked to the flagpole and thrown from one group of men to another without ever exposing anyone to fire. And then, everyone would pull together and up goes the flag. This was wrong. The correct answer was "Sergeant erect a flagpole!"
The point here is that commanders do not order the troops to fire; sergeants do, and the commanders who really thought that shooting rifles at entrenched positions made much of a difference were seldom in contact with the sergeants, let alone the troops below the NCO ranks.
In infantry tactics there is a thing called suppressing fire or suppression. It is a wonderful thing in theory: the idea being your fire will keep the enemy from firing or at least from firing accurately. In Vietnam when patrols were hit--and since 80+% of combat incidents were enemy (NVA or Viet Cong) initiated, that was a standard form of combat--the lieutenants would yell out, “Return fire, return fire.” Sometimes men didn’t and got pinned down; sometimes men did and got pinned down, sometimes men didn’t and didn’t get pinned down and…. There are a lot of it-depends in trying to draw a conclusion. The Army today tries to draw conclusions from every combat action, but the extent such reviews are successful is more a matter of case study than paradigmatic analysis.
Ringo also points out that the "Tit for Tat" strategy comes from Anatol Rapoport, whom I did not reference in my article. Finally, he gives us the following "Murphy's Laws for Grunts":
Murphy’s Laws for grunts
- Murphy was a grunt.
- Tracers work both ways.
- Suppressive fire - won't.
- Try to look unimportant; the enemy may be low on ammo and not want to waste a bullet on you.
- Never share a foxhole with anyone braver than yourself.
- Never forget that your weapon was made by the lowest bidder.
- If your attack is going really well, it's an ambush.
- The retreating enemy that’s falling back is just trying to suck you into a serious ambush.
- Teamwork is essential; it gives the enemy other people to shoot at.
- Don't look conspicuous; it draws fire.
- If the enemy is within range, so are you.
- Incoming fire has the right of way.
- If the Platoon Sergeant can see you, so can the enemy.
- The most dangerous thing in combat is a Second Lieutenant with a map and a compass.
- Those who DO hesitate under fire usually DO NOT end up KIA or WIA.
- Walking point = sniper bait. [Actually in Vietnam it wasn’t true. Snipers on both sides primarily went after the officers or radio carriers. However, anti-personnel mines, for obvious reasons, were another thing.]
Note [writes Ringo] there are many Murphy’s Laws for grunts that have the opposite message, e.g. “When in a fire fight, kill as many as you can, the one you miss may not miss tomorrow.” However, the “watch your rear end” messages appear to dominate by over 2 to 1.
Posted by Andrew at 12:21 AM | Comments (2) | TrackBack
March 29, 2006
The serenity prayer and Venn diagrams
The Serenity Prayer, attributed to Reinhold Niebuhr and now associated with Alcoholics Anonymous, goes,
God give me the serenity to accept things which cannot be changed; Give me courage to change things which must be changed; And the wisdom to distinguish one from the other.
I think this would make a great classroom example of Venn diagrams (as used for set theory and probability). There are two sets:
A: "things which cannot be changed"
B: "things which must be changed"
and the prayer implicitly assumes that A and B are disjoint and exhaustive (that is, that every item in the universe of "things" being considered is either in A or in B, but not both).
But if you draw the Venn diagram, you can see the possibility of:
not-A and not-B: this is ok, things which can be changed but do not need to be changed
A and B: this is bad, these are the things which cannot be changed but must be changed!
This is a great example, in that the rhetoric of the prayer is so compelling that it's easy to miss, at first, these other two categories, but the Venn diagram makes it clear. Also, many students will already have been exposed to this prayer, and the others will probably find it interesting. How does the Venn diagram version affect how the prayer says we should live our lives?
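For a classroom handout, the four regions of the diagram can be enumerated mechanically. This is only an illustrative sketch: the two booleans stand for membership in A and B as defined above, and the labels are paraphrases of the text rather than anything in the prayer itself.

```python
from itertools import product

# Label each region of the Venn diagram by membership in
# A ("cannot be changed") and B ("must be changed").
labels = {
    (True, False): "A only: accept it (serenity)",
    (False, True): "B only: change it (courage)",
    (False, False): "neither: can be changed but need not be (ok)",
    (True, True): "both: cannot be changed but must be (bad)",
}

regions = {combo: labels[combo] for combo in product([True, False], repeat=2)}

# The prayer only speaks to regions where exactly one of A, B holds.
not_covered = sorted(combo for combo in regions if combo[0] == combo[1])

for (in_a, in_b), text in regions.items():
    print(f"in A: {in_a!s:5}  in B: {in_b!s:5}  {text}")
```

The two combinations left in `not_covered` are exactly the regions that the rhetoric of the prayer skips over.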
P.S. The above version of the prayer is from Niebuhr. As Jim Lebeau notes in a comment, the actual version used by AA goes "God grant us the serenity to accept the things we cannot change, courage to change the things we can . . .", which doesn't quite work as a Venn diagram example.
P.P.S. The pretty picture below is from Will Fitzgerald (see comments).

Posted by Andrew at 12:11 AM | Comments (8) | TrackBack
March 27, 2006
Newcomb's paradox solved using statistical reasoning
Newcomb's paradox is considered to be a big deal, but it's actually straightforward from a statistical perspective. The paradox goes as follows: you are shown two boxes, A and B. Box A contains either $1 million or $0, and Box B contains $1000. You are given the following options: (1) take the money (if any) that's in Box A, or (2) take all the money (if any) that's in Box A, plus the $1000 in Box B. Nothing can happen to the boxes between the time that you make the decision and when you open them and take the money, so it's pretty clear that the right choice is to take both boxes. (Well, assuming that an extra $1000 will always make you happier...)
The hitch is that, ahead of time, somebody decided whether to put $1 million or $0 into Box A, and that Somebody did so in a crafty way, putting in $1 million if he or she thought you would pick Box A only, and $0 if he or she thought you would pick Box A and B. Let's suppose that this Somebody is an accurate forecaster of which option you would choose. In that case, it's easy to calculate that the expected gain of people who pick only Box A is greater than the expected gain of people who would pick both A and B. (For example, if Somebody gets it right 70% of the time, for either category of person, then the expected monetary value for the "believers" who pick only box A is 0.7*($1,000,000) + 0.3*0 = $700,000, and the expected monetary value for the "greedy people" who pick both A and B is 0.7*$1000 + 0.3*$1,001,000 = $301,000.) So the A-pickers do better, on average, than the A-and-B-pickers.
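The expected-value comparison is easy to redo for any forecaster accuracy. Here is a minimal sketch, assuming (as in the example above) that the forecaster is equally accurate for both kinds of people; the function name and its default arguments are introduced here for illustration.

```python
def expected_values(accuracy, big=1_000_000, small=1_000):
    """Expected monetary value for A-only pickers and for A-and-B pickers.

    `accuracy` is the probability that the forecaster predicts a person's
    choice correctly, assumed to be the same for both kinds of person.
    """
    # A-only pickers find the $1 million when they were predicted correctly.
    a_only = accuracy * big
    # A-and-B pickers always get $1000, plus $1 million when mispredicted.
    a_and_b = accuracy * small + (1 - accuracy) * (big + small)
    return a_only, a_and_b


a_only, a_and_b = expected_values(0.7)
print(round(a_only), round(a_and_b))
```

With accuracy 0.7 this reproduces the $700,000 and $301,000 figures. At accuracy 0.5 (a coin-flip forecaster) the A-and-B pickers come out ahead by the $1000, so the population-level advantage of A-pickers depends entirely on the forecaster doing better than chance.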
The paradox
The paradox, as has been stated, is that from the perspective of the particular decision, it's better to pick A and B, but from the perspective of expected monetary value, it appears better to pick just A.
Resolution of the paradox
It's better to pick A and B. The people who pick A do better than the people who pick A and B, but that doesn't mean it's better for you to pick A. This can be explained in a number of statistical frameworks:
- Ecological correlation: the above expected monetary value calculation compares the population of A-pickers with the population of A-and-B-pickers. It does not compare what would happen to an individual. Here's an analogy: one year, I looked at the correlation between students' midterm exam scores and the number of pages in their exam solutions. There was a negative correlation: the students who wrote the exams in 2 pages did the best, the students who needed 3 pages did a little worse, and so forth. But for any given student, writing more pages could only help. Writing fewer pages would give them an attribute of the good students, but it wouldn't actually help their grades.
- Random variables: label X as the variable for whether the Somebody would predict you are an A-picker, and label Y as the decision you actually take. In the population, there is a positive correlation between X and Y. But X occurs before Y. Changing Y won't change X, any more than painting dots on your face will give you chicken pox. Yes, it would be great to be identified as an A-picker, but picking A won't change your status on this.
One more thing
Some people have claimed to "resolve" Newcomb's paradox by saying that this accurate-forecasting Somebody can't exist; the Somebody is identified with God, time travel, reverse causation, or whatever. But from a statistical point of view, it shouldn't be hard at all to come up with an accurate forecast. Just do a little survey, ask people some background questions (age, sex, education, occupation, etc.), then ask them if they'd pick A or A-and-B in this setting. Even a small survey should allow you to fit a regression model that would predict the choice pretty well. Of course, you don't really know what people would do when presented with the actual million dollars, but I think you'd be able to forecast to an accuracy of quite a bit better than 50%, just based on some readily-available predictors.
Posted by Andrew at 12:01 AM | Comments (11) | TrackBack
February 17, 2006
Random selection of judges
Sean Schubert pointed me to this article, "Figure Skating Scoring Found to Leave Too Much to Chance":
The overseers of international figure skating scoring instituted a new system in 2004, designed to reduce the chances of vote fixing or undue bias after the scandal during the Winter Olympics in Salt Lake City in 2002. Under the old rules eight known national judges scored a program up to six points with the highest and lowest scores dropped. Under the new rules, 12 anonymous judges score a program on a 10-point scale. A computer then randomly selects nine of the 12 judges to contribute to the final score. The highest and lowest individual scores in each of the five judging categories are then dropped and the remaining scores averaged and totaled to produce the final result.
This random elimination of three judges results in 220 possible combinations of nine-judge panels, explains John Emerson, a statistician at Yale University. And according to his analysis of results from the short program at the Ladies' 2006 European Figure Skating Championships, the computer's choice of random judges can have a tremendous--and hardly fair--impact on the skaters' rankings. "Only 50 of the 220 possible panels would have resulted in the same ranking of the skaters following the short program," Emerson writes in a statement announcing his findings.
I have to say, selecting judgments at random seems pretty wacky to me. Why not just average all of them? I also think it's funny that the ratings are from 0 to 10, with increments of 0.25. Why not just score them from 0 to 40 with increments of 1, or 0 to 4 with increments of 0.1?
The article does point out an interesting problem, which is that judges are perhaps giving too-low scores at the beginning to leave room later. The system of adding or deducting points from the base value seems like a step toward fixing this. But they should set the base value low (e.g., at 3, rather than at 7), so that they'll have more resolution at the upper end of the scale. As things stand, the scale might be better designed for picking the worst skater than the best!
On the other hand, I'm not so strongly moved by Emerson's argument that removing different judges would change the outcome. As Tom Louis has written, rankings are pretty random anyway, and, in any case, things would be different if a different set of 12 judges were selected.
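The figure of 220 panels in the quoted article is just the number of ways to choose 9 judges out of 12, and it is easy to enumerate them. The sketch below does that and, for one skater in one judging category, shows how the trimmed average moves around depending on which panel the computer happens to draw. The scores are made up for illustration (they are not from the 2006 championships), and the real system totals five categories rather than one.

```python
from itertools import combinations
from math import comb

N_JUDGES, PANEL_SIZE = 12, 9

# Every possible nine-judge panel, as tuples of judge indices.
panels = list(combinations(range(N_JUDGES), PANEL_SIZE))
print(len(panels), comb(N_JUDGES, PANEL_SIZE))


def trimmed_mean(scores):
    """Drop the single highest and lowest score, then average the rest."""
    ordered = sorted(scores)
    kept = ordered[1:-1]
    return sum(kept) / len(kept)


# Hypothetical scores from the 12 judges for one skater in one category.
scores = [5.00, 5.25, 5.25, 5.50, 5.50, 5.75, 5.75, 6.00, 6.00, 6.25, 6.50, 7.00]

panel_scores = [trimmed_mean([scores[j] for j in panel]) for panel in panels]
all_judges_mean = sum(scores) / len(scores)

print(min(panel_scores), max(panel_scores), all_judges_mean)
```

The spread between the smallest and largest panel score is the noise added by the random selection step. Averaging all 12 judges, as suggested above, removes that particular source of variation, though not the variation from which 12 judges were chosen in the first place.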
Posted by Andrew at 12:10 AM | Comments (6) | TrackBack
February 14, 2006
Evaluation of multilevel decision trees
The evaluation of decision trees under uncertainty is difficult because of the required nested operations of maximizing and averaging. Pure maximizing (for deterministic decision trees) or pure averaging (for probability trees) are both relatively simple because the maximum of a maximum is a maximum, and the average of an average is an average. But when the two operators are mixed, no simplification is possible, and one must evaluate the maximization and averaging operations in a nested fashion, following the structure of the tree. Nested evaluation requires large sample sizes (for data collection) or long computation times (for simulations).
An alternative to full nested evaluation is to perform a random sample of evaluations and use statistical methods to perform inference about the entire tree. We show that the most natural estimate is biased and consider two alternatives: the parametric bootstrap, and hierarchical Bayes inference. We explore the properties of these inferences through a simulation study.
Here's the paper (by Erwann Rogard, Hao Lu, and myself).
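To make the nesting concrete, here is a small Python sketch. It is not the method of the paper: the tree representation, the toy numbers, and the "plug-in" estimate (simulate each action a few times, average, then take the maximum) are all my own assumptions, chosen as one plausible reading of "the most natural estimate."

```python
import random


def evaluate(node):
    """Exact nested evaluation: max at decision nodes, expectation at chance nodes."""
    kind = node[0]
    if kind == "leaf":
        return node[1]
    if kind == "decision":
        return max(evaluate(child) for child in node[1])
    if kind == "chance":
        return sum(p * evaluate(child) for p, child in node[1])
    raise ValueError(f"unknown node type: {kind}")


# Toy tree (hypothetical numbers): two actions with the same expected value, 5.
tree = ("decision", [
    ("chance", [(0.5, ("leaf", 10.0)), (0.5, ("leaf", 0.0))]),
    ("chance", [(0.9, ("leaf", 6.0)), (0.1, ("leaf", -4.0))]),
])


def plug_in_estimate(tree, n_draws, rng):
    """Max over actions of the sample mean of n_draws simulated outcomes."""
    means = []
    for _, outcomes in tree[1]:
        probs = [p for p, _ in outcomes]
        values = [leaf[1] for _, leaf in outcomes]
        draws = rng.choices(values, weights=probs, k=n_draws)
        means.append(sum(draws) / n_draws)
    return max(means)


exact = evaluate(tree)
rng = random.Random(0)
reps = 2000
avg_plug_in = sum(plug_in_estimate(tree, 10, rng) for _ in range(reps)) / reps
print(exact, avg_plug_in)
```

Each sample mean is unbiased on its own, but the maximum of noisy averages tends to exceed the maximum of the true averages, so the plug-in estimate runs high on average. That is the kind of bias that motivates looking at alternatives such as the bootstrap or a hierarchical model.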
Posted by Andrew at 9:38 AM | Comments (0) | TrackBack
January 3, 2006
Catch-22: Without data, how do you know how to sample?
As part of a "carbon trading" program, a program is being instituted to reduce energy use for streetlights in a developing country. Here's how it works: (1) "baseline" energy use is established for the existing street light system, (2) some of the lights will be replaced with new lights that are more energy efficient and will thus consume less energy, and (3) the company that does the installation will be reimbursed based on the reduction in consumption. (No reduction, no money).
Simple enough on paper, but we live in a messy world. For example, the electricity provided by the grid is often substantially below the nominal voltage, so the existing lamps (which do not include voltage regulators) often put out much less light than they should, but also consume less electricity than they should. The new lights include voltage regulators so they always operate at their nominal power consumption. It's entirely possible that replacing the old lights with the new ones will increase the light output but generate no energy savings (or even negative savings) and thus no reduction in carbon dioxide production.
One possibility would be to use new lamps that have the same light output as the current lamps, rather than the same nominal energy consumption. But it's not clear that the municipalities involved will agree to that, for one thing. (For instance, the voltage that is provided varies with time, so even though the existing lamps often operate well below their nominal light output, they sometimes do achieve it). Also, lamps only come in discrete steps of light output, so there may be no way to provide the same amount of light as is currently provided.
Another problem --- the one that prompted this blog entry --- is how to establish the baseline energy use, and determine the energy savings of the replacement lamps. Lamps are not individually metered, although meters can be installed temporarily (at some expense). The actual energy consumption of an existing lamp, and its light output, depend on the lamp's age and on the voltage that it gets. As mentioned above, the voltage varies with time...but it does so differently for different lamps, depending on the distance from the power plant and on the local electric loads. There seem to be no existing records on voltage-vs-time for any locations, much less for the large number of towns that might participate in this program.
We need to figure out how to predict in advance the energy savings that can be expected from various lamp replacement strategies, with enough precision that all of the actors can figure out whether to proceed and, if so, how large a program to commit to. We also need to figure out how to monitor the actual savings. These seem like the same issue, but they're not: we have almost no data on which to base our savings predictions, but once the program starts we can have data collection as part of it.
For evaluating the actual savings once the program starts, we're thinking of a paired-comparisons approach: every time they go out to replace an existing lamp with a new one, they'll install (for a couple of weeks) a monitor on an adjacent lamp that is not being replaced. The new lamp's energy consumption is very predictable (because it has a voltage regulator) so it doesn't need its own monitor. Basically we'll be using the adjacent non-replaced lamp to get an estimate of what the other lamp would have consumed, had it not been replaced with a new one. (A side benefit of this approach is the reduced need for travel: it takes time and money to go all over the place installing monitoring equipment, so if extra trips can be avoided, that's a bonus).
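As a rough sketch of the bookkeeping this implies, the savings for one replaced lamp could be estimated as the adjacent lamp's monitored consumption minus the new lamp's nominal consumption. The function name, the units, and all the numbers below are hypothetical, and the sketch assumes the adjacent lamp is a fair stand-in for the one that was replaced (same type, similar age, same supply voltage).

```python
from statistics import mean


def estimated_savings_kwh(adjacent_watts, new_lamp_watts, hours):
    """Estimate energy saved by one replacement over `hours` of operation.

    adjacent_watts: power readings (W) from the monitored, non-replaced neighbor,
                    used as a proxy for what the old lamp would have drawn.
    new_lamp_watts: nominal power (W) of the regulated replacement lamp.
    """
    baseline_watts = mean(adjacent_watts)
    return (baseline_watts - new_lamp_watts) * hours / 1000.0


# Hypothetical case 1: the old lamp runs near nominal voltage, so there are savings.
print(estimated_savings_kwh([180.0, 200.0, 220.0], 150.0, hours=10))

# Hypothetical case 2: the grid runs well below nominal voltage, so the old lamp
# draws less than the regulated new one and the "savings" come out negative.
print(estimated_savings_kwh([120.0, 130.0, 140.0], 150.0, hours=10))
```

The second case is the scenario described earlier, where the new lamps give more light but no reduction in energy use.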
But to predict the savings in the first place, we've got problems. We know the voltage varies with time and with distance from the power plant, but we don't know how. We know the power consumption of the lamps varies with voltage and with the age of the lamp, but we don't know how. If we understood the dependencies, then we could simulate some different situations and see how various sampling schemes would perform, but many of the parameters are very uncertain.
So we've got a Catch-22: we can't determine the right sampling strategy without knowing something about the spatial and temporal variability, but we won't be able to get any data until the sampling plan has been approved.
If you have any experience or advice for this kind of problem, please post it here!
Posted by Phil at 5:28 PM | Comments (2) | TrackBack
November 1, 2005
Special Halloween edition
Here's the abstract to today's brown bag seminar in the Marketing Department (331 Uris Hall, 1:30pm, for you locals). If you read the abstract you'll see the Halloween connection.
On the Consumption of Negative Feelings
(Eduardo B. Andrade, UC Berkeley & Joel B. Cohen, University of Florida)
Abstract:
If the hedonistic assumption (i.e., people’s willingness to pursue pleasure and avoid pain) holds, why do individuals expose themselves to events known to elicit negative feelings? In this article, we assess how (1) the intensity of the negative feelings, (2) the positive feelings in the aftermath, and (3) the coactivation of positive and negative feelings contribute to our understanding of the phenomenon. In a series of 4 studies, horror and non-horror movie watchers are asked to report their positive and negative feelings either after (experiment 1) or while (experiments 2A, 2B, and 3) they are exposed to a horror movie. The results converge with a coactivation-based model and highlight the importance of a protective frame.
Posted by Andrew at 12:28 AM | Comments (0) | TrackBack
October 31, 2005
The "white male effect"
Dave Krantz pointed me to a paper by Kahan, Braman, Gastil, Slovic, and Mertz on "Gender, race, and risk perception: the influence of cultural status anxiety," which explores the "white male effect," which is the "tendency of white males to fear all manner of risk less than women and minorities," a pattern first noted by Slovic and others in the early 1990s. Finucane and Slovic (1999) wrote that “the white-male effect seemed to be caused by about 30 percent of the white male sample that judged risks to be extremely low.”
Here's the abstract of the new paper:
Why do white men fear various risks less than women and minorities? Known as the “white male effect,” this pattern is well documented but poorly understood. This paper proposes a new explanation: cultural status anxiety. The cultural theory of risk posits that individuals selectively credit and dismiss asserted dangers in a manner supportive of their preferred form of social organization. This dynamic, it is hypothesized, drives the white male effect, which reflects the risk skepticism that hierarchical and individualistic white males display when activities integral to their status are challenged as harmful. The paper presents the results of an 1800-person survey that confirmed that cultural worldviews moderate the impact of sex and race on risk perception in patterns consistent with status anxieties. It also discusses the implication of these findings for risk regulation and communication.
The paper is interesting, and I'm sympathetic to its general arguments--it certainly makes sense to me that risk perceptions, and perceptions about uncertainties in general, will be influenced by cultural values. But I have a couple of concerns relating to how the data were collected and analyzed.
The findings of the article come from regression analyses of responses to a national survey. They asked people about their perceptions of risks of environmental danger, guns, and abortion. They also asked some cultural worldview and personality questions, along with demographics. They found that the cultural worldview questions were predictive of risk attitudes.
I'm just a little worried that they may be measuring political views as much as risk attitudes. For example, one of the agree/disagree statements is "Women who get abortions are putting their health in danger." Statistically, my impression is that the health risk from abortion itself is low, but a person who opposes abortion might answer Yes to the question, on the grounds that a lifestyle associated with frequent abortions is risky. My point here is that the answer to the question itself could have a political twist to it. Although the question is nominally about risks, I don't know how much it's really telling us about risk perception.
I'm not saying that this is a devastating critique. Understanding the "white male effect" is a challenge, and cultural world view, etc., has got to be relevant. But this particular study maybe could be interpreted in other ways.
The paper would be clearer if the tables were made into graphs. Table 1, for instance, is dominated by a weird visual effect having to do with the lengths of the labels. It also includes irrelevant information such as that the sd of the ages in the data is 16.99.
More importantly, Tables 2, 3, 4, 5, 7, 8, 10, and 11 could be nicely combined into a single display that conveys what is happening and allows the groups to be compared. Tables 6, 9, 12, and 13 could be combined also.
Figures 2, 3, 4, and 5 are ok, but there's a real opportunity missed to throw in some data. Also, the lines could be labeled directly rather than through different dottings.
This is an interesting paper, so it might be a good example for my statistical graphics class. One of the assignments will be to take a tabular presentation from an article of interest, redo as graphs, and discuss the effect on how the information is conveyed.
P.P.S. See here for more on this topic from Dan Kahan, the first author of the paper under discussion.
Posted by Andrew at 12:54 AM | Comments (1) | TrackBack
October 18, 2005
An easy decision for a statistician
I went to Radio Shack the other day and bought a telephone answering machine.
Q: Did I want to buy the extended warranty for $5.99? [Students: figure this one out before continuing...]
A: No.
Posted by Andrew at 12:00 AM | Comments (3) | TrackBack
September 28, 2005
The rational animal, or the irrational computer
It seems to me that from the "liberal" (in the U.S. political sense) perspective, man [humans] used to be the "rational animal" but is now the "irrational computer," and this worries me a bit.
The rational animal
For an example of the first view, here's a quote I just googled:
"We believed . . . that man was a rational animal, endowed by nature with rights, and with an innate sense of justice; and that he could be restrained from wrong and protected in right, by moderate powers, confided to persons of his own choice, and held to their duties by dependence on his own will." -- Thomas Jefferson, 1823
The idea being that our rationality is what separates us from the beasts, either individually (as in the Jefferson quote) or through collective action, as in Locke and Hobbes. If the comparison point is animals, then our rationality is a real plus!
The irrational computer
Nowadays, though, it seems almost the opposite, that people are viewed as irrational computers. To put it another way, if the comparison point is a computer, then what makes us special is not our rationality but our emotions.
I was thinking about this when reading in n+1 magazine the review by Megan Falvey of the book "Freakonomics."
Our description of the rational self supports the real-world conditions under which some futures seem more attainable than others. It coaxes us into wholehearted, personally felt participation with capitalist regulation. Levitt’s calculating individual is the ideal subject of contemporary neoliberal economic reform, in particular the expansion of the market into all possible areas of life.
The idea seems to be that "the description of the rational self" excludes warmer aspects of human nature. That I'll definitely believe. But I still think rationality is a good thing--perhaps my bias as a scientist.
Decoupling rationality and selfishness
Rationality can serve other-directed as well as selfish goals. Yes, I can rationally try to get the best deal on a new TV, but the Red Cross can also use rationality (for example, in the form of mathematical optimization) to deliver help to as many people as possible. Or Novartis can use rationality (in the form of up-to-date biostatistical methods) to increase the chance of developing an effective drug--this can serve both selfish and unselfish purposes.
The decoupling of rationality and selfishness is a point we made here, in the context of considering voting as a rational way to attempt to improve the well-being of others as well as oneself.
To get back to Falvey's book review: I'm not attempting to address the details of her disagreements with Levitt and Dubner, just to express my distress that she sees rationality as a problem. Considering the alternatives, I think rationality is pretty good. But it is useful to think about the goals to which the rationality is directed.
Posted by Andrew at 12:08 PM | Comments (5) | TrackBack
August 31, 2005
Statistics and decision science job in Australia
I got the following by email. It's good to see this kind of commitment to interdisciplinary work in statistics and decision analysis:
From Eddie Anderson, Professor of Operations Management, AGSM:
The Australian Graduate School of Management (AGSM) is hoping to hire a new faculty member in Statistics to start some time during the calendar year 2006. At this stage we are considering possible candidates at any level and are seeking expressions of interest. We expect to formally advertise a position before the end of 2005.
The AGSM is regularly rated as the best business school in Australia and one of the best in the region. The AGSM is a joint venture between the University of Sydney and the University of New South Wales. We have a faculty of around 42 full time academics, but there are close research links with faculty at the University of New South Wales (where the AGSM is located). You can find more information about the School at www.agsm.edu.au.
The School has an established area of research excellence in decision sciences and we would like whoever is appointed to this position to complement this research strength, which includes people working in the areas of strategy, OB and marketing. This focus on decision-making in a business context may make the position particularly suited for someone who works in the Bayesian framework.
The characteristics of the person we are looking for are as follows:
1. A strong research record with a record of (or potential for) publishing in tier 1 journals (like JASA).
2. The ability to be successful in the classroom with our MBA students.
3. The ability to interact on research with other faculty across the School, particularly within the School’s research strength in Decision Sciences.
4. The ability to teach outside of core Statistics areas on topics such as Data Mining, Decision Analysis or Econometrics (there are opportunities in these areas for executive teaching as well).
I would be grateful for your help in this search. If there is anyone you know who you think may be suitable for this position then please let me have their details, or ask them to contact me directly on eddiea@agsm.edu.au.
Best regards
Eddie Anderson
Posted by Andrew at 6:20 AM | Comments (0) | TrackBack
August 24, 2005
Terrorist Risk Revisited
There's a fun little article in the Harvard Magazine on risk perception. David Ropeik and George Gray at the Harvard School of Public Health wrote a book, Risk: A Practical Guide for Deciding What's Really Safe and What's Really Dangerous in the World around You, which sounds interesting. The article also mentions a study by the University of Michigan Transportation Research Institute comparing motor-vehicle deaths in October - December, 2001 (right after the September 11 attacks) to the same period in the previous year. (Click here for a previous post and comments on this topic.) The Michigan study concludes that there were 1,018 more traffic deaths in late 2001 than in late 2000 -- I haven't read the study myself, so I'm just passing along what they report. (Is 1,018 large relative to the average number of traffic deaths and its variability? I don't know.)
In a similar vein, I keep telling my mom how much more likely it must be that I'll be hit by a car or by lightning than be bombed on the subway. I don't think it makes her worry about me any less.
Posted by Sam at 11:56 AM | Comments (1) | TrackBack
July 25, 2005
Terrorism and Statistics
There was an interesting editorial in Sunday's New York Times about the anxiety produced by terrorism and people's general inability to deal rationally with said anxiety. All kinds of interesting stuff that I didn't know or hadn't thought about. Nassim Nicholas Taleb, a professor at UMass Amherst, writes that risk avoidance is governed mainly by emotion rather than reason, and our emotional systems tend to work in the short term: fight or flight; not fight, flight, or look at the evidence and make an informed decision based on the likely outcomes of various choices. Dr. Taleb points out that Osama bin Laden "continued killing Americans and Western Europeans in the aftermath of Sept. 11": People flew less and drove more, and the risk of death in an automobile is higher than the risk in an airplane. If you're afraid of an airplane hijacking, though, you're probably not thinking that way. It would be interesting to do a causal analysis of the effect of the September 11 terrorist attacks on automobile deaths (maybe someone already has?).
Posted by Sam at 10:32 AM | Comments (6) | TrackBack
July 18, 2005
Overconfidence in historical predictions; also a discussion of graphical displays of scientific results
Bryan Caplan writes about a cool paper from 1999 by Philip Tetlock on overconfidence in historical predictions. Here's Caplan's summary:
Tetlock's piece explores the overconfidence of foreign policy experts on both historical "what-ifs" ("Would the Bolshevik takeover have been averted if World War I had not happened?") and actual predictions ("The Soviet Union will collapse by 1993.") The highlights:
# Liberals believe that relatively minor events could have made the Soviet Union a lot better; conservatives believe that relatively minor events could have made South Africa a lot better.
# Tetlock asked experts how they would react if a research team announced the discovery of new evidence. He randomly varied the slant of the evidence. He found a "pervasiveness of double standards: experts switched on the high-intensity search light of skepticism only for dissonant results."
# Tetlock began collecting data on foreign policy experts' predictions back in the 80's. For example, in 1988 he asked Sovietologists whether the USSR would still be around in 1993. Overall, experts who said they were 80% or more certain were in fact right only 45% of the time.
# How did experts cope with their failed predictions? "[F]orecasters who had greater reason to be surprised by subsequent events managed to retain nearly as much confidence in the fundamental soundness of their judgments of political causality as forecasters who had less reason to be surprised." The experts who made mistakes often announced that it didn't matter because prediction is pretty much impossible anyway (but then why did they assign high probabilities in the first place?!) The mistaken experts also often said they were "almost right" (e.g. the coup against Gorbachev could have saved Communism) but correct experts very rarely conceded that they were "almost wrong" for similar reasons.
Caplan goes on to discuss the possibility that forecasters might have been better calibrated if they had been betting money on their predictions. This is an interesting point but I'd like to take the discussion in a different direction. Beyond the general interest in cognitive illusions I've had since reading the Kahneman, Slovic, and Tversky book way back when, Tetlock's study interests me because it interacts with Niall Ferguson's work on potential outcomes in historical studies and Joe Bafumi's work on the stubborn American voter.
Virtual history and stubborn voters
Ferguson edited a book on "virtual history" in which he considered historical speculations, and retroactive historical speculations, in the potential-outcome framework that is used in statistical inference. These ideas also come up in other fields, such as law (as pointed out here by Don Rubin). I'm not quite sure how overconfidence fits in here but it seems relevant.
Joe Bafumi in the "stubborn American voter" (here's an old link; I don't have a link to the updated version of the paper) found that in the past twenty years or so, Americans have become more partisan, not only in their opinions, but also in their views on factual matters. This seems similar to what Tetlock found and also suggests that the time dimension is relevant. Joe also considers views of elites vs. average Americans.
Finally . . .
Tetlock's paper was great but I'd like it even better if the results were presented as graphs rather than tables of numbers. In my experience, graphical presentations make results clearer, but even more important, can generate new hypotheses and reject existing hypotheses I didn't realize I had.
My impression is that statisticians and data analysts see graphics as an "exploratory" tool for looking at data, maybe useful when selecting a model, but then when they get their real results, they present the numbers. But in my conception of exploratory data analysis (see also here for Andreas Buja's comment and here for my rejoinder), graphs are about comparisons. And, as is clear from Caplan's summary, Tetlock's paper is all about comparisons--stated probabilities compared to actual probabilities, liberals compared to conservatives, and so on. So I think something useful could possibly be learned by re-expressing Tetlock's Tables 1, 2, 3, and 4 as graphs. (Perhaps a good term project for a student in my regression and multilevel modeling class this fall?)
Posted by Andrew at 12:21 AM | Comments (0) | TrackBack
July 12, 2005
Dave Krantz on decision analysis and quantum physics, leading to a Jim Thompson reference and then back to Penrose's theory that consciousness is inherently quantum-mechanical
Commenting on my thoughts about decision analysis and Schroedinger's cat (see here for my clarifications), Dave Krantz writes,
I'd first like to comment on the cat example, and then turn to the relationship to probabilistic modelling of choice.
I think one can gain clarity by thinking about simpler analogs to Schroedinger's cat. Instead of poison gas being released, killing the cat, let's suppose that a single radioactive decay just releases one molecule of hydrogen (H2) into an otherwise empty (hard vacuum) cat box. Now an H2 molecule is something that, in principle, one can describe pretty well by a rather complicated wave function. The wave function for an H2 molecule confined to a small volume, however, is different from the wave function for an H2 molecule confined to a much larger cat box. At any point in time, our best description (vis-a-vis potential measurements we could make that would interact with the H2 molecule) is a superposition of these two wave functions, narrowly or broadly confined. As long as we don't know whether the radioactive decay has taken place, and we make no observation that directly or indirectly interacts with the H2 molecule, the superposition continues to be the best physical model.
This example points up the fact that Schroedinger's cat involves two different puzzles. The first is epistemological: we are used to thinking of a cat as alive or dead, but equally used to thinking of a H2 molecule as confined narrowly or broadly. How can it be both? But this way of thinking just won't work in QM. The point of the double-slit experiments is to show clearly that an unobserved photon does NOT go through one slit or the other, it goes through both, in the sense of its wave function giving rise to coherent circularly symmetric waves emanating from each slit and interfering. It is equally wrong to think that a H2 molecule is either confined narrowly or broadly. Observations are going to be accounted for by assuming a superposition.
The second puzzle arises because a cat cannot in practice be described by a single wave function at all. That's at least true of an ordinary cat, subject to many sorts of observation. But in practice, even an unobserved cat is not describable by a wave function. There are wave functions for each molecule, but the best descriptions do not collapse these into a single wave function. Coherence fails. To take an analogy, one can get monochromatic light by passing a beam through an interference filter; though the frequencies of the different photons are all alike, the phases still vary randomly. This is very different from the coherent light of a laser, where everything is in phase.
There is a real problem of understanding when incoherent wave functions collapse into a single coherent one. This has been dramatized, in recent years, by studies of Bose-Einstein condensates. Rubidium atoms can be very near one another, yet still incoherent; but at low temperatures, they become a single molecular system, with a condensed wave function. The study of conditions for coherence is on-going, as I understand it. A cat is outside the boundaries of coherence.
Epistemologically, the introduction of probabilities as fundamental terms in choice modelling is rather analogous to the introduction of probabilities in QM measurement. It has always struck me as curious that the two happened in the same year, 1927: Born developed the probabilistic interpretation of QM measurement and Thurstone formulated the law of comparative judgment.
Where the analogy breaks down, however, is that there isn't any analog to a wave function in choice models. Thurstone actually tried to introduce something like it, with his discriminal processes, but from the start, discriminal processes were postulated to be independent rather than coherent random variables. Thus, I don't see much point in pushing the analogy of any DM problem with the Schroedinger cat problem, where the essence is superposition rather than independence.
My thoughts
OK, that was Dave talking. To address his last point, yes, I don't see where the complex wave function would come in. (Dsquared makes the same point in the comments to this entry. In probability theory we're all happy to use Boltzmann statistics (i.e., classical probability theory). I've never seen anyone make a convincing case (or even try to make a case) that, for example, Fermi-Dirac statistics should be used for making business decisions.)
But Dave's point above about "coherence" is exactly what I was talking about. Also there's the bit about the collapse of the wave function (or of the decision tree). But I suppose Dave would say that, without complex wavefunctions, there's no paradox there. With classical Boltzmann statistics, the cat really is just alive or dead all along, with no need for superposition of states.
Jim Thompson's cat
Hmmm...my feeling is that the act of deliberation, or even just of keeping a decision "open" or "alive," creates a superposition of states. If I'm deciding whether or not to flip the switch, then I wouldn't say that the cat is "either alive or dead." I haven't decided yet! In The Killer Inside Me, Jim Thompson writes, "How can you hurt someone that's already dead?", but I don't take such a fatalistic position.
Roger Penrose's consciousness
But hey, let's take this one step further. In my experiment (as opposed to Schroedinger's), the cat is alive or dead based on my decision of whether to flip a switch (and, in turn, this decision is ultimately coupled with other outcomes of interest; e.g., the switch also turns off the light in the next room, which encourages the lab assistant to go home for the day, and then he might bump into someone on the subway, etc., etc.). If it is true, as Penrose claims in The Emperor's New Mind, that consciousness is inherently quantum-mechanical and non-algorithmic, then my decision of whether to flip the switch indeed must be modeled as a superposition of wave functions. Although then I'm not quite sure how deliberation fits into all this.
Anyway, to get more positivistic for a moment, maybe the next research step is to formulate some actual decision problems (or realistic-seeming fake problems) in terms of coherence, and see if anything useful comes of it.
P.S. Dave is very modest on his webpage but he's actually the deepest thinker I know of in decision analysis.
P.P.S. It's funny that Dave has a cat living in a "cat box," which I always thought was equivalent to the litterbox (so I recall from my catful days). Maybe "cat container" would be a better phrase?
Posted by Andrew at 7:36 AM | Comments (5) | TrackBack
Decision analysis, Penrose style
I appreciated the comments on my recent entry on decision analysis and Schroedinger's cat.
Some comments
Chris sent some general links, and Simon and Dsquared referred to some specific decision problems in finance--an area I know nothing about but that certainly seems like a place where formal decision analysis would be useful.
Deb referred to the expected value of information (a concept I remember from teaching classes in decision analysis) and wonders why I have to bring quantum mechanics and Roger Penrose into the picture.
Why bring in quantum mechanics?
I bring up quantum mechanics for two reasons. First, making a decision has the effect of discretizing a continuous world. (Just as, in politics, a winner-take-all election converts a divided populace into a unidirectional mandate.) I see a strong analogy here to the collapsing of the wave function. To bring in a different physics analogy, decision-making crystallizes a fluid world into a single frozen choice.
The second connection to quantum mechanics arises because decisions are not made in isolation, and when we wait on a decision, it tends to get "entangled" with other decisions, producing a garden of forking paths that is a challenge to analyze. At some point--even, possibly, before the "expected value of additional information" crosses the zero line--decisions get made, or decision-making gets forced upon us, because it's just too costly for all concerned to live with all the uncertainty. (I wouldn't say this is true of all decisions or even most decisions, but it can arise, especially I think in decisions which are loosely coupled to other decisions--for example, a business decision that affects purchasing, hiring in other divisions, planning, etc.) This is the Penrose connection--that quantum states (or decisions) get resolved when they are entangled with enough mass.
P.S.
The other thing I learned is that links don't always work. Chris sent me this link, Simon sent this, and Dsquared sent this. My success: 0/3. 1 broken link and 2 with password required.
Posted by Andrew at 12:23 AM | Comments (6) | TrackBack
July 8, 2005
Decision analysis and quantum mechanics; or, making a decision about Schroedinger's cat
One of the mysteries of quantum mechanics (as I recall from my days as a physics major, and from reading Roger Penrose's books) is the jump from complex probability amplitudes to observed outcomes, and the relation between observation and measurement. Heisenberg, 2-slit experiment, and that cat that's both alive and dead, until it's observed, at which point it becomes either alive or dead. As I recall from reading The Emperor's New Mind, Penrose believed that it was not the act of measurement that collapsed the cat's wavefunction, but rather the cat's (or, more precisely, the original electron whose state was uncertain) getting entangled with enough mass that the two possibilities could not simultaneously exist.
OK, fine. I haven't done any physics since 1986 so I can't comment on this. But it reminded me of something similar in decision making.
Consider a decision that must be made at some unspecified but approximately-known time in the future. For example, a drug company must choose which among a set of projects to pursue (and does not have the resources to pursue all of them). The choice need not be made immediately, and waiting will allow more information to be gathered to make a more informed decision. At the same time, the clock is ticking and there are losses associated with delay. In addition to the obvious losses (not going full-bore on a promising project leads to a later expected release date, thus fewer lives saved and less money made), waiting ties up other resources of suppliers, customers, etc. [Yes, this example is artificial--I'm sure I can think of something better--but please bear with me on the general point.]
So this is the connection to quantum mechanics. We have a decision, which will ultimately either kill a cat or not, and it makes sense to keep the decision open as long as possible, but at some point it becomes entangled with enough other issues that the decision basically makes itself, or, to put it another way, the decision just has to be made. The act of decision is equivalent to taking a measurement in the physical experiment.
I think there's something here, although I'm not quite sure what.
P.S. Further discussion here.
Posted by Andrew at 2:33 AM | Comments (7) | TrackBack
June 27, 2005
What's so funny about decision analysis?
Jon Baron pointed me to this page which has the following funny story from Deb Frisch. (The story is also here.)
A day in the life of a decision scientist
2:00 P.M. Need to be at Dulles airport by 5:30 for flight to Kansas City (via Chicago) for Judgment and Decision Making (JDM) conference. Need to decide whether to take 3:15 or 3:45 bus to Dulles. Gut says 3:45 since the benefit of an additional half hour at home is greater than the slightly increased risk of a missed flight. Head says it's Friday afternoon, might be big crowds on highway and at airport, better safe than sorry. Decide to take 3:15 shuttle but don't leave house in time. Take the 3:45 instead. Get to Dulles in plenty of time.
4:00 P.M. Get in long line of United Premier members. After 10 minutes, realize there are two lines - human vs. non-human check-in machines. I'm in the twice-as-long human line, even though I have an e-ticket. If I switch now though, I'll be behind people who arrived 10 minutes after me. In order to avoid feeling like a loser, I stay in human line. Check bag (even though this was not my original intention) to justify the extra wait.
6:00 P.M. United terminal in O'Hare airport. Go to Berghoff Café for dinner. Order cheese pizza and small beer. Price of pizza ($3.50) is written on menu. Price of beer is not. Reach cashier and learn that price of beer=price of pizza=ridiculous price for 14 oz. of beer. Feel flash of anger at sleazy marketing ploy. Forgive Berghoff's because pizza is really good.
8:00 P.M. United flight to Kansas City. Wish I had a magazine. Sit down and see Newsweek in seatback. Feel excitement and small surge of irrational pride. Remove magazine. It is Polish Newsweek. Experience disappointment. Feel worse than I did when I first sat down. Derive satisfaction from observing the endowment effect and loss aversion in action. Combine satisfaction with disappointment and arrive at slightly less than neutral.
10:00 P.M. Arrive at Hyatt hotel. Am told the type of room I'd reserved (non-smoking king) was sold out. Do I want a king suite instead? I am tired and experience change aversion. I want the room I reserved. I ask if the suite will cost more. Am told the only difference is that the suite is larger and has a Murphy bed instead of a regular bed. Interrogate desk clerk to determine whether quality of Murphy mattress is greater than or equal to quality of regular mattress. He assures me there is no difference. Get to room, turn on light and inspect bed visually and dorsally. Try to retrieve memories of other hotel beds. Due to recency and frequency, all I can think of is my own bed. Too tired to continue research. Go to sleep.
--Deborah Frisch
I read this and indeed found it hilarious, especially the bit about the Newsweek magazine in Polish. There's something inherently funny about applying decision analyses to these personal situations. Which is one reason why I have always been distrustful of the examples in many decision analysis textbooks, hypothetical problems such as using decision analysis to decide what dessert to eat, or whatever. This in turn motivated the idea of institutional decision analysis, in which the focus is on having decision procedures that can be justified in settings with multiple stakeholders. The challenge is to apply this idea in real settings.
Posted by Andrew at 12:01 AM | Comments (1) | TrackBack
June 13, 2005
Jon Baron on intuitive judgment
In the comments to that entry is a link to an empirical study by Paul Deignan of information processing and political ideology.
Posted by Andrew at 8:54 AM | Comments (0) | TrackBack
June 8, 2005
A question in psychological measurement
David Budescu writes,
We ran an experiment where subjects made predictions about future value of many stocks based on their past performance. More precisely, they were asked to estimate 7 quantiles of the distribution of each stock: Q05, Q15, Q25, Q50, Q75, Q85, and Q95.
I would like to estimate the mean and SD (or variance) of this distribution based on these quantiles subject to weak assumptions (symmetry and unimodality) but without assuming a particular distribution.
I know of some methods (e.g. Pearson & Tukey, Biometrika, 1965) that use only 3 of these quantiles (Q05, Q50, and Q95) but I hate not to use all the data I have collected.
Does anyone know of a more general and flexible solution?
Any thoughts? Of course, some distribution would have to be assumed. Also, I wonder about assuming symmetry since the data would be there to reject the hypothesis of symmetry in some settings. Also, of course, I wonder whether the mean and sd are really what you want. Well, I can see the mean, since it's $, but I'm not so sure that the sd is what's wanted.
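To make the "some distribution would have to be assumed" point concrete, here is a minimal sketch of one way to use all seven quantiles at once. It assumes a normal shape, which is stronger than the symmetry and unimodality David asks for, and it fits q_p = mean + sd * z_p by least squares across the seven stated quantiles. The function name and the example responses are hypothetical, not from David's experiment.

```python
from statistics import NormalDist

# Probability levels of the seven elicited quantiles.
PROBS = [0.05, 0.15, 0.25, 0.50, 0.75, 0.85, 0.95]

def fit_normal_to_quantiles(quantiles, probs=PROBS):
    """Least-squares fit of q_p = mean + sd * z_p (normal shape is an assumption)."""
    z = [NormalDist().inv_cdf(p) for p in probs]  # standard normal z-scores
    n = len(z)
    z_bar = sum(z) / n
    q_bar = sum(quantiles) / n
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (qi - q_bar) for zi, qi in zip(z, quantiles))
    sd = sxy / sxx              # slope of quantiles on z-scores
    mean = q_bar - sd * z_bar   # intercept
    return mean, sd

# Hypothetical responses from one subject for one stock (in dollars).
stated = [82.0, 91.0, 95.0, 100.0, 106.0, 110.0, 121.0]
mean, sd = fit_normal_to_quantiles(stated)
print(mean, sd)
```

The residuals from this fit (stated quantile minus mean + sd * z_p) would also give a rough check on symmetry: if the upper quantiles sit systematically farther from the median than the lower ones, the normal assumption is doing damage.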
Posted by Andrew at 12:17 AM | Comments (6) | TrackBack
May 10, 2005
Regression modeling and meta-analysis for decision making; or, We thank Kevin Brancato and Hailin Lou for research assistance . . .
I noticed the blog of Kevin Brancato. I've been enjoying reading the blog entries, especially since Kevin is a former student of ours at Columbia! His paper on macroeconomic statistics is also interesting (and relevant to some of my work).
Kevin worked as a research assistant for me a few years ago on a project which eventually appeared in the Journal of Business and Economic Statistics under the title, "Regression Modeling and Meta-Analysis for Decision Making: A Cost-Benefit Analysis of Incentives in Telephone Surveys."
Here's the abstract of the paper:
Regression models are often used, explicitly or implicitly, for decision making. However, the choices made in setting up the models (e.g., inclusion of predictors based on statistical significance) do not map directly into decision procedures. Bayesian inference works more naturally with decision analysis but presents problems in practice when noninformative prior distributions are used with sparse data. We do not attempt to provide a general solution to this problem, but rather present an application of a decision problem in which inferences from a regression model are used to estimate costs and benefits. Our example is a reanalysis of a recent meta-analysis of incentives for reducing survey nonresponse. We then apply the results of our fitted model to the New York City Social Indicators Survey, a biennial telephone survey with a high nonresponse rate. We consider the balance of estimated costs, cost savings, and response rate for different choices of incentives. The explicit analysis of the decision problem reveals the importance of interactions in the fitted regression model.
It was our attempt to perform a full decision analysis, rather than simply looking at some regression coefficients.
P.S. Yes, the tables should be graphs. Especially Table 2.
P.P.S. I don't know what's up with Hailin Lou.
P.P.P.S. As a former resident of the D.C. suburbs, I don't share Kevin's enthusiasm for a plan to add lanes to the Beltway.
Posted by Andrew at 12:32 AM | Comments (0) | TrackBack
May 2, 2005
Altruism and voter turnout: is it a good thing that nicer people are more likely to vote?
Are nicer and better-informed citizens more likely to vote?
James Fowler (political science, UC Davis) wrote an interesting paper about a lab experiment he conducted, demonstrating the connection between other-regarding preferences and voter turnout. Here's the abstract:
Scholars have recently reworked the traditional calculus of voting model by adding a term for benefits to others. Although the probability that a single vote affects the outcome of an election is quite small, the number of people who enjoy the benefit when the preferred alternative wins is large. As a result, people who care about benefits to others and who think one of the alternatives makes others better off are more likely to vote. I test the altruism theory of voting in the laboratory by using allocations in a dictator game to reveal the degree to which each subject is concerned about the well-being of others. The main findings suggest that variation in concern for the well-being of others in conjunction with strength of party identification is a significant factor in individual turnout decisions. Partisan altruists are much more likely to vote than their nonpartisan or egoist peers.
I especially like this paper because it is consistent with the model of Edlin, Kaplan, and myself of the rationality of voting based on social motivations. As we (and Fowler) point out, there's no reason that "rationality" has to mean "selfishness."
Many researchers in political science and economics seem to feel that it is "cheating" to introduce other-directed preferences into a rational choice model, but given both the logic and the evidence (Fowler's paper gives some experimental evidence, and our paper has lots of observational evidence), I don't see that selfishness makes much sense in this setting.
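The arithmetic behind the "benefits to others" term can be sketched in a few lines. This is only an illustration of the general form (probability of being decisive, times own benefit plus a discounted benefit to others, minus the cost of voting); the function name and every number below are hypothetical and are not taken from Fowler's paper or ours.

```python
def expected_net_benefit(p_decisive, benefit_self, benefit_per_other,
                         n_others, altruism_weight, cost):
    """Expected utility of voting minus its cost, with a term for others."""
    social = altruism_weight * n_others * benefit_per_other
    return p_decisive * (benefit_self + social) - cost

# All numbers are hypothetical.
common = dict(p_decisive=1e-7, benefit_self=1000.0, benefit_per_other=1000.0,
              n_others=100_000_000, cost=10.0)

selfish = expected_net_benefit(altruism_weight=0.0, **common)
altruist = expected_net_benefit(altruism_weight=0.01, **common)
print(selfish, altruist)
```

With these made-up inputs the purely selfish voter's expected net benefit is negative, while a voter who puts even a 1% weight on each other person's benefit comes out positive, because the tiny probability of being decisive is multiplied by a very large number of beneficiaries.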
Some other comments
Fowler's experimental results show voting to be correlated with various attitudes and behaviors. But his conclusion is all about why people vote. I'm also interested in the implications about who votes. If selfish people don't vote, is that perhaps a good feature of our system?
A related point (also discussed in our paper) is that, if people are voting because of altruism rather than selfishness, this has implications for how people vote, as well as why they vote. In particular, one would expect political pitches to be made on more altruistic grounds.
On to the graphical presentation . . . I like Fowler's Figure 2. It could be slightly improved by having the y-axes range from 0 to 1 (since this is the range of probabilities), with 0 and 1 being the sharp endpoints of the y-range (in R or S, you can do this using ylim=c(0,1), yaxs="i" in the plotting command). Also I'd recommend making the graphs slightly wider than they are high. The x and y axes are on different units, so it's a little confusing to make the plots square.
Fowler's Figure 1 can be done better. There's no need for three colors, and the up-and-down patterns are confusing. Better would be to have 3 histograms (on a common scale), one for each of the 3 conditions, with labels on top. This'll be much clearer.
Tables 1 and 2 would be better as figures; see Gelman, Pasarica, Dodhia, "Let's practice what we preach: turning tables into graphs" from The American Statistician (2002). I mean, the tables are ok by the usual standard of social science papers, but the substance of the paper is so strong, that why not take the model presentation to the next level with some clear graphs? (If you were to insist on keeping the tables--and I think this would be a big mistake--then you must round off all the numbers to 1 decimal. Given your se's, there is essentially no information in the 2nd decimal places. Yes, that means that correlations of 0.09 and 0.06 will be rounded off to 0.1, and correlations of 0.03 will be rounded off to 0.0. That's fine--there's really nothing statistically distinguishing these numbers anyway.)
Also . . .
At his website, Fowler has several other papers on related topics.
Posted by Andrew at 10:18 AM | Comments (5) | TrackBack
April 19, 2005
Loss aversion etc
If a person is indifferent between [x+$10] and [55% chance of x+$20, 45% chance of x], for any x, then this attitude cannot reasonably be explained by expected utility maximization. The required utility function for money would curve so sharply as to be nonsensical (for example, U($2000)-U($1000) would have to be less than U($1000)-U($950)). This result is shown in a specific case as a classroom demonstration in Section 5 of a paper of mine in the American Statistician in 1998 and, more generally, as a mathematical theorem in a paper by my old economics classmate Matthew Rabin in Econometrica in 2000.
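The parenthetical claim can be checked with a few lines of arithmetic. Under expected utility, indifference at every x means U(x+10) = 0.55 U(x+20) + 0.45 U(x), so each successive $10 step adds only 0.45/0.55 = 9/11 as much utility as the step before it. The sketch below assumes the indifference holds at every multiple of $10 from $0 to $2000 and measures utility in units of the first $10 step; the helper name is made up for illustration.

```python
# Indifference between a sure x+$10 and [55%: x+$20, 45%: x] at every x gives
#   U(x+20) - U(x+10) = (0.45 / 0.55) * (U(x+10) - U(x)).
RATIO = 0.45 / 0.55

def utility_gain(lo, hi, step=10):
    """U(hi) - U(lo), in units of the utility of the $10 step starting at $0."""
    first = lo // step
    last = hi // step
    return sum(RATIO ** k for k in range(first, last))

gain_950_to_1000 = utility_gain(950, 1000)
gain_1000_to_2000 = utility_gain(1000, 2000)

# The ratio does not depend on where the $10 grid starts.
print(gain_1000_to_2000 / gain_950_to_1000)
```

The printed ratio is about 0.58: going from $1000 to $2000 is worth less, in utility terms, than going from $950 to $1000, which is the nonsensical implication.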
I was thinking about this stuff recently because of a discussion I had with Deb Frisch on her blog. I like Matt's 2000 paper a lot, but Deb seems to be really irritated by it. Her main source of irritation seems to be that Matt writes, "The theorem is entirely 'nonparametric,' assuming nothing about the utility function except concavity." But actually he makes fairly strong assumptions about preferences (basically, a more general version of my [x, x+$10, x+$20] gamble above), and under expected utility, this has strong implications about the utility function.
Matt's key assumption could be called "translation invariance"--the point is that the small-stakes risk aversion holds at a wide range of wealth levels. That's the key assumption--the exact functional form isn't the issue. Deb compares to a power-law utility function, but expected-utility preferences under this power law would not show substantial small-scale risk aversion across a wide range of initial wealth levels.
Deb did notice one mistake in Matt's paper (and in mine too). Matt attributes the risk-averse attitude at small scales to "loss aversion." As Deb points out, this can't be the explanation, since if the attitude is set up as "being indifferent between [x+$10] and [55% chance of x+$20, 45% chance of x]", then no losses are involved. I attributed the attitude to "uncertainty aversion," which has the virtue of being logically possible in this example, but which, thinking about it now, I don't really believe.
Right now, I'm inclined to attribute small-stakes risk aversion to some sort of rule-following. For example, it makes sense to be risk averse for large stakes, and a natural generalization is to continue that risk aversion for payoffs in the $10, $20, $30 range. Basically, a "heuristic" or a simple rule giving us the ability to answer this sort of preference question.
Attitudes, not preference or actions
By the way, I've used the term "attitude" above, rather than "preference." I think "preference" is too much of a loaded word. For example, suppose I ask someone, "Do you prefer $20 or [55% chance of $30, 45% chance of $10]?" If he or she says, "I prefer the $20," I don't actually consider this any sort of underlying preference. It's a response to a question. Even if it's set up as a real choice, where they really get to pick, it's just a preference in a particular setting. But for most of these studies, we're really talking about attitudes.
Posted by Andrew at 12:14 AM | Comments (9) | TrackBack
March 30, 2005
Surfing the web, or From the 10th floor to the 7th floor in four steps
So I clicked on the link on our webpage to Decision Science News, flipped through there and then on to his links . . . hmmm, a link to the psychologist Jon Baron, who studies thinking and decision making. . .
Baron's blog is pretty cool too. Sort of halfway between a science blog (like ours and Decision Science News) and an opinion blog (like the 3 million other blogs out there). It's Baron's opinions, but backed by his perspectives as a leading decision scientist. (In this post, he briefly discusses treatments for obesity. I should forward him the reference to Seth's article on self-experimentation (or maybe the link about the psychology professor who told us to take drugs).)
Well, Jon has his own links (including Decision Science News) . . . I clicked through, and the only other one that was interesting was the blog of Deb Frisch, another psychology professor and decision scientist. Her blog is definitely more of the "personal commentary on issues and current events" style, but the issues and current events she discusses are of interest to me too, so I enjoyed reading it. She has a confrontational style, which shows up in her comment to this entry. It would probably be fun to be a student in one of her classes.
Frisch's blog had interesting stuff. Right near the top there was a link to an implementation of Eliza, which I of course had heard about but had never tried out. That same entry has a link to a blog called Econlog, by Arnold Kling and Bryan Caplan. Frisch links to Econlog only to mock them, but actually it had some interesting stuff. (Although I'm not inclined to agree with them when they write, "Cato is right to want to topple Social Security. If you don't have the common sense to save for your own retirement, you shouldn't come crying to the taxpayers when your hair turns gray." Seems a little harsh, especially given the many cognitive illusions that decision scientists have discovered over the past 40 years!)
But I don't have to agree with Kling and Caplan to read their blog. Actually, their most recent posting referred to a study on effects of pre-kindergarten education--a topic of great interest to me right now. The funny thing is, two of the three authors of the study are at the Columbia School of Social Work, and one of the authors is Jane Waldfogel, who I know--she works in my building--and is in fact a co-organizer of this seminar series.
I'll have to read the paper more carefully before commenting on it, but, hey, it only took me 4 links to find out what's being done on the 7th floor of my building! As well as learning some other stuff on the way.
Posted by Andrew at 12:48 AM | Comments (2)
March 22, 2005
Decision Science News
Dan Goldstein, who runs the Center for Decision Sciences seminar at Columbia (along with Dave Krantz and Elke Weber) has a blog called Decision Science News.
I've learned a lot from some of the presentations at the decision science seminar (see here) and even spoke at it myself once (on this topic), and am generally interested in the topic, so I was curious to see the blog. It presents short descriptions of interesting recent work in decision science, especially in marketing. Reading this blog is a good way to get a sense of what the decision researchers are thinking about nowadays.
As a former student of Gigerenzer, Dan is perhaps sympathetic to my views on institutional decision analysis (probabilities represent an agreed-upon hypothesized model, used for convenience, rather than subjective states of knowledge). (See here for more.)
Posted by Andrew at 5:59 AM | Comments (0) | TrackBack
February 23, 2005
Causal inference and decision trees
Causal inference and decision analysis are two areas of statistics in which I've seen very little overlap: the work in causal inference is typically very "foundational" with continuing reassessment based on first principles, whereas decision analysis is more of a meat-and-potatoes Bayesian inference--slap down a probability model, stick in a utility function, and turn the crank. (With all this processing, this must be ground beef and mashed potatoes.)
Actually, though, causal inference and decision analysis are connected at a fundamental level. Both involve manipulation and potential outcomes. In causal inference, the "causal effect" (or, as Michael Sobel would say, the "effect") is the difference between what would happen under treatment A and what would happen under treatment B. The key to this definition is that either treatment could be applied to the experimental unit by some agent (the "experimenter").
In parallel, decision analysis concerns what would happen if decision A or decision B were chosen. When drawing decision trees, we let squares and circles represent decision and uncertainty nodes, respectively. To map on to causal inference, the squares would represent potential treatments and the circles would represent uncertainty in outcomes--or population variability.
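For readers who haven't drawn one of these trees, the square/circle convention corresponds to a simple recursive calculation: average over the branches at a circle, take the best branch at a square. The following is a minimal sketch under that convention; the tuple representation, function names, and the toy tree are all made up for illustration.

```python
def evaluate(node):
    """Return the expected utility of a decision-tree node.

    Leaves are numbers (utilities). A decision node ("square") is
    ("decision", {option_name: subtree}); a chance node ("circle") is
    ("chance", [(probability, subtree), ...]).
    """
    if isinstance(node, (int, float)):
        return float(node)
    kind, body = node
    if kind == "decision":
        # The decision maker picks the option with the highest expected utility.
        return max(evaluate(child) for child in body.values())
    if kind == "chance":
        # Uncertainty is averaged out using the branch probabilities.
        return sum(p * evaluate(child) for p, child in body)
    raise ValueError(f"unknown node type: {kind}")

def best_option(decision_node):
    """Name of the highest-valued option at a decision node."""
    kind, options = decision_node
    if kind != "decision":
        raise ValueError("best_option needs a decision node")
    return max(options, key=lambda name: evaluate(options[name]))

# Toy tree: option A is a 50/50 gamble between 10 and 0; option B is a sure 4.
tree = ("decision", {
    "A": ("chance", [(0.5, 10), (0.5, 0)]),
    "B": 4,
})
print(best_option(tree), evaluate(tree))
```

In the causal-inference reading, the keys of the decision node play the role of the potential treatments, and the difference between the values of two options is the expected causal effect of choosing one over the other.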
In practice, the two areas of research are not always so closely connected. For example, in our decision analysis for home radon, the key decision is whether to remediate your house for radon. The causal effect of this decision on reducing the probability of lung cancer death is assumed to follow a specified functional form as estimated from previous studies. For our decision analysis we don't worry too much about the details of where that estimate came from.
But in thinking about causal effects, the decision-making framework might be helpful in distinguishing among different possible potential-outcome frameworks.
Posted by Andrew at 12:54 AM | Comments (2)
February 2, 2005
Using base rate information?
Aleks points to this blog entry from "HedgeFundGuy" on bias in decision making. HedgeFundGuy passes on a report that finds that people's opinions are strongly biased by their political leanings, then he gives his take on the findings--he thinks that this so-called bias isn't really a problem, it's just evidence of reasonable Bayesian thinking.
I'll first copy out what HedgeFundGuy had to say (including his own copy of the report of the study), then give my take, which is slightly different than his.
HedgeFundGuy writes:
A recent Miami Herald article on some academic research on bias is quite illuminating. Because it's a registration required link (ugh) I'll snip the best parts.
Drew Westen is a professor of psychology at Emory University, and author of a new and still-unpublished study testing whether people make decisions based on bias or fact. Bias won hands down.
In a key scenario, respondents were led to believe a soldier was accused of torturing people at Abu Ghraib prison in Iraq. The fictional soldier claimed to have been following orders from superiors who told him the Geneva Convention had been suspended. He supposedly wanted to subpoena President Bush and Defense Secretary Donald Rumsfeld to prove his case. Respondents were asked if he should have that right.
Some were presented with strong "evidence" corroborating the soldier's story. Others had only his word to go on.
But the strength or weakness of the evidence turned out to be immaterial. Researchers were able to predict people's opinion over 80 percent of the time based simply on their opinions of the Bush administration, the GOP, the military and human rights groups. Those who had less affection for the president sided with the soldier even when the evidence was weak. And fans of the president tended to side with him even when the evidence was overwhelming.
We believe what we want, facts be damned.
"The scary thing," says Westen, "is the extent to which you can imagine this influencing jury decisions, boardroom decisions, political decisions . . ."
It sounds like solid research and I believe it. But I'm not so pessimistic on the conclusion. Instead of bias, I would just say we are all Bayesians. We see things through a filter, but that allows us to process information faster and more efficiently. Sure, sometimes our preconceptions are mistaken and unhelpful, but generally we apply preconceptions every day to social and logistical problems big and small.
If you told me that someone I generally find unreliable or mistaken in his worldview (for me, Michael Moore or Ralph Nader) believed X, you would have to add a lot of data clearly pointing to X in order for me to also believe X. In contrast, if you told me that Milton Friedman or Richard Posner believed Y, I could probably withstand seeing some data suggesting not-Y and still believe Y, based on my faith in Friedman or Posner. In a more pedestrian fashion, when my wife says my shirt doesn't match, I believe her without checking myself, but if she told me my spark plugs needed changing, I would pretty much ignore her. People and groups have credibility on different issues, and their alignment with certain positions causes me to have greater or lesser belief in those positions irrespective of the data. Those starting points then require more or less corroborating data depending on my initial skepticism.
I remember Fama and French's influential Journal of Finance 1992 article showing little evidence for the CAPM. It was so persuasive because the authors were and are efficient markets advocates, and CAPM is aligned with the efficient markets camp. Their conclusion against the CAPM suggested the data must have been very weak indeed. If that article was written by an unknown, or some 'animal spirits' advocate at Harvard it would not have been nearly as persuasive. That's bias, but that's also rational.
My take on it:
I sympathize with HedgeFundGuy's desire to debunk, or demystify [actually, I first typed this as "demistify" but that makes sense too!], the study. (As Aleks points out, this is the direction of Gigerenzer's work on interpreting common mental "mistakes" as cognitively efficient behavior.)
However, I'm a little skeptical of HedgeFundGuy's skepticism. First off, I disagree with HedgeFundGuy's claim that "we are all Bayesians." As far as I am aware of the research on judgment and decision making by Kahneman, Tversky, Krantz, . . ., we are not Bayesians--at least, our preferences and decisions do not follow Bayesian rules. (Technically, in many settings people do not use base rate information, and in just about all settings, people fail to account for sample size in a way consistent with likelihood/Bayesian inference.)
Now, maybe that's OK that we're not Bayesian--I suspect Gigerenzer isn't bothered by it--but, given that so many experiments show that people don't make use of base rate information when they definitely should, I'm wary of suddenly turning around and applauding an experiment that shows that, in some settings, people use base-rate information too much.
There's base-rate information, and there's data information (well, that division is artificial, since as HedgeFundGuy points out, we interpret the data information (the "likelihood") in light of our beliefs about its source), and people weight them different ways in different circumstances. There's clearly something "psychological" going on, and some of it can perhaps be interpreted as efficient use of scarce mental resources. But interpreting this as Bayesian inference--no way.
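To see what actual Bayesian updating would look like here, it helps to write down the rule: posterior odds = prior odds times the likelihood ratio of the evidence. The sketch below uses made-up priors and likelihood ratios (none of these numbers come from Westen's study) for a "skeptic" and a "supporter" of some claim, each shown weak and then strong evidence.

```python
def posterior(prior, likelihood_ratio):
    """Posterior probability of a claim.

    likelihood_ratio is P(evidence | claim true) / P(evidence | claim false).
    """
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# Hypothetical priors and evidence strengths.
skeptic, supporter = 0.1, 0.9
for label, lr in [("weak evidence", 2.0), ("strong evidence", 20.0)]:
    print(label, round(posterior(skeptic, lr), 3), round(posterior(supporter, lr), 3))
```

Different priors do lead to different conclusions from the same evidence, which is HedgeFundGuy's point. But for any fixed prior, a Bayesian's posterior has to move more under strong evidence than under weak evidence. A finding that the strength of the evidence was immaterial is therefore hard to square with Bayesian inference.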
To the extent that we all have political opinions, and we would like those who disagree with us to learn a bit from unpleasant facts, I think findings such as reported in this paper are indeed disturbing.
Posted by Andrew at 12:29 AM | Comments (3)
January 27, 2005
Thoughts on Eric Johnson's talk
Eric Johnson (a psychologist at the Columbia Business School) spoke today at the Decision Sciences seminar.
A fascinating talk
His topic was "decisions as memory" (maybe I'm getting the exact words wrong here), and the key idea was that, in the process of making a decision, a person queries his or her memory, thinking of good and bad aspects of different decision options. There's lots of research on memory that covers all sorts of artifacts (for example, when you remember one item, you will be led to similar items). The idea of this research program is that, if memory is a key part of judgment and decision making, then many of the weird (or at least, non-normative) aspects of decision making--which have been studied by Kahneman, Tversky, and others over the years--can maybe be explained at a more cognitively basic level as quirks of how we remember things and how we access these memories.
My comments/questions
I had a bunch of comments on the talk (which will probably be incomprehensible unless you were there, but it's helpful for me to put them down):
- Many of his examples were consumer purchases (buying a car, paying for a mug, deciding whether to eat dessert at a restaurant). He also said that the sort of tradeoffs people have are "too many things they want to do." Later on he discussed the choice of which candidate to vote for in an election. I suspect there are big differences between self-focused consumer decisions and other-focused decisions about political preferences.
- Similarly, he referred to the "cost to buy that product, make that commute, prefer that policy." Once again, I see a difference between buying a product (which involves personal costs and benefits) and choosing among policies (no personal costs, and with larger societal effects).
From a psychological perspective, I don't have a problem with what Johnson was saying. All these decisions might involve accessing memories, so there might be important similarities in how we think about these questions. But at some level it seems inappropriate to use the language of personal cost and choice to describe policy decisions.
- In discussing people's willingness to donate organs upon their deaths, he referred to some suggestions that have been made to set up a market for human organs. He said that "a market implies that people have implicit preferences." I don't see why he said that. A market is a forum for buying and selling. Why are implicit preferences necessary? Or is he saying that a market would collapse if preferences are not implicit? I don't understand.
- Also in discussing organ donation, he said that the reason people check off the "I'll be a donor" box is in anticipation of their future happiness. He also said that "the benefits are the good feeling about being a donor." When I checked off that box (or however I consented), I thought the benefits are that someone's life might be saved after I die. Or maybe the benefit is just the aesthetic satisfaction of recycling. I don't know exactly, but I wouldn't really say it gave me a good feeling, or that I did it because I thought it would make me happy.
Once again, I'm fascinated by the idea of memory-querying being a key part of decision making. But I'm skeptical of the hedonistic framework being applied so indiscriminately.
- At some point during the talk I was thinking about Dave Krantz's statement that hard decisions involve tradeoffs--tradeoffs between decision options, and tradeoffs between attributes of any given decision. Johnson's theory didn't seem to capture that struggle-in-my-mind that I feel when making a decision. (But, again, the theory has a really cool, unifying explanation of cognitive biases such as "anchoring.")
- Johnson refers to decisions as "predictions of future utility." I don't understand why he uses the term "utility." In the decision-analytic literature, I always thought that "utility" is defined as that thing which, when averaged over, describes a set of consistent decisions (as in von Neumann and Morgenstern's utility theory, where the very existence of utilities is derived from a set of axioms of consistent preferences). Since Johnson is working outside the "expected utility" framework, I don't really know what he means when he talks about "utility" (or, for that matter, anticipated future utility).
- In describing a buy-the-mug experiment, he characterized the "endowment effect" by saying: "that's a disturbing effect for economists." Perhaps true--I expect Johnson knows more about economists' views than I do--but if true, I'm disturbed that it's a disturbing effect. I would think that economists' main goal would be to describe and understand our commercial behavior, buying and selling, working, poverty and wealth, etc. Learning about the endowment effect should allow them to understand these things better--why would it be disturbing?
- He said something about the virtue of "choice" in retirement plans, which I didn't quite follow because earlier he had said that people tend to choose the default retirement plan--which would seem to imply that most people don't want choice in that aspect of their lives (or, at least, that they don't want choice so badly that they would actually exercise it).
- Johnson is a professor of marketing. Some of the studies he described have the flavor of "subliminal advertising." For example, changing the background of a web site affected what products people wanted to buy. Now, fundamentally, I don't see anything wrong with this sort of manipulation--after all, the website has to have some background, so why not set it to get what you want. But at the same time there's something vaguely creepy about it, no?
- From a political scientist's perspective, this work reminded me of the studies of issue framing. The research goal would be to connect "framing" with "memory," I guess.
Johnson et al.'s papers
The papers (by Eric Johnson, Elke Weber, and Daniel Goldstein) underlying the talk are here and here.
P.S. Links don't work anymore!
Posted by Andrew at 8:00 PM | Comments (1)
January 18, 2005
Neurobiology and decision making
David Laibson, Samuel M. McClure, George Loewenstein, and Jonathan D. Cohen will be speaking this Thurs, 20 Jan, 2:30-4pm, at 404 IAB, on "Neuroeconomics and Impulsivity." Their article is available here.
Their studies suggest that short-term and long-term rewards activate different areas of the brain. Personally, I think economists worry too much about intertemporal choice as a factor in decision making. I've been convinced by Dave Krantz that the idea of time discounting (for example, that an item now is equivalent in utility to 1.05 items to be delivered in a year) is not as universally applicable to decision analysis as seems generally assumed. I'll get into this more another time.
In any case, the article looks interesting and I expect the talk will be interesting also.
Posted by Andrew at 11:44 AM | Comments (0)
December 30, 2004
What is the value of a life?
What is the value of a life, and can it be estimated by finding a wage premium for risk?
Regressions of wage as a function of risk
This paper by Dora Costa and Matthew Kahn says yes, and they use historical data across industries to estimate the value of risk--the increase in pay for riskier jobs. This article by W. Kip Viscusi and Joseph Aldy offers a comprehensive review of studies estimating the dollar value that people implicitly assign to risk, based on estimated wage premiums.
Skepticism about regressions of wage as a function of risk
On the other hand, Peter Dorman and Paul A. Hagstrom argue that there is no evidence that risky jobs pay more. Dorman also wrote a book on this topic that I found convincing. If you're not careful, you can get negative coefficients in these wage/risk models.
Inevitable inconsistency
In its general form, the question of risk compensation must have multiple answers depending on context. It is well known that people are generally not good at understanding risks, even of an essentially monetary nature. (For example, my dad persists in buying car insurance that reimburses him for the car value if it is destroyed in a crash, even though he could easily afford to replace it. That is, I think he'd be better off "self-insuring" the value of the car. But, nooooo...., he wants the "peace of mind.") When you start bringing in life and death, well, of course people will be inconsistent. Perhaps having risk premiums in some industries and not others. And, as Dorman points out, a lot will depend on the forms of collective bargaining available.
I care about this for (at least) two reasons. First, the value of a life is a wonderful example when teaching decision analysis. Math and stat students who have not previously thought about the topic will commonly refuse to assign a dollar value to even the smallest risk (for example, I had a student who would not accept $1 for a 1-in-a-trillion chance of death).
The second reason for caring about the value of a life is that it is relevant for decisions such as whether to measure your house for radon or larger policy questions. It's frustrating that we can't just use wage regressions to estimate how much people value their lives, but to be realistic I have to recognize that the cost of a risk depends on its context. Statistical decision analysis can still be useful, however, in ensuring that decision recommendations are consistent across different conditions and localities.
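To spell out the arithmetic behind the example of the student who would not accept $1, here is a sketch that assumes a simple expected-value rule (an assumption for illustration, not a claim about how the student actually reasons): accept a payment x for an added risk of death q whenever x exceeds q times V, where V is the dollar value placed on one's own life.

```latex
x > qV \iff V < \frac{x}{q},
\qquad
\frac{x}{q} = \frac{\$1}{10^{-12}} = \$10^{12}.
```

Under that rule, refusing $1 for a 1-in-a-trillion chance of death corresponds to valuing one's life at more than $1 trillion.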
P.S. See here for much more on this.
Posted by Andrew at 5:01 PM | Comments (2) | TrackBack
December 29, 2004
Type 1, type 2, type S, and type M errors
In statistics, we learn about Type 1 and Type 2 errors. For example, from an intro stat book:
A Type 1 error is committed if we reject the null hypothesis when it is true.
A Type 2 error is committed if we accept the null hypothesis when it is false.
(Usually these are written as I and II, in the manner of World Wars and Super Bowls, but to keep things clean with later notation I'll stick with 1 and 2.)
Actually, though . . .
Never a Type 1 or Type 2 error
I've never in my professional life made a Type 1 error or a Type 2 error. But I've made lots of errors. How can this be?
A Type 1 error occurs only if the null hypothesis is true (typically if a certain parameter, or difference in parameters, equals zero). In the applications I've worked on, in social science and public health, I've never come across a null hypothesis that could actually be true, or a parameter that could actually be zero.
A Type 2 error occurs only if I claim that the null hypothesis is true, and I would certainly not do that, given my statement above!
But errors nonetheless
But I certainly have made errors! How can they be classified? For simplicity, let's suppose we're considering parameters theta, for which the "null hypothesis" is that theta=0. (For example, theta could be a regression coefficient, or a comparison between two treatment effects. In any given study, there might be many thetas of interest.)
A Type S error is an error of sign. I make a Type S error by claiming with confidence that theta is positive when it is, in fact, negative, or by claiming with confidence that theta is negative when it is, in fact, positive. I think it's fair to say that classical 2-sided hypothesis testing fits this framework: for example, if our 95% interval for theta is [.1, .3], or if we say that theta.hat = .2 and is statistically significantly different from zero, then our scientific claim is that theta is positive, not simply that it's nonzero.
A Type M error is an error of magnitude. I make a Type M error by claiming with confidence that theta is small in magnitude when it is in fact large, or by claiming with confidence that theta is large in magnitude when it is in fact small. The well-known problem of publication bias could lead to systematic Type M errors, with large-magnitude findings more likely to be reported.
See here for more on Type S and Type M errors.
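One way to make these definitions concrete is a small simulation. The sketch below is only an illustration, not the method of the paper linked above: it assumes a single true theta, normally distributed estimates with known standard error, and a "confident claim" defined as an estimate more than 1.96 standard errors from zero. The function name, the parameter values, and the use of an average exaggeration ratio as a summary of Type M error are all assumptions.

```python
import random

def type_s_m_rates(theta, se, n_sims=100_000, z=1.96, seed=0):
    """Simulate theta_hat ~ Normal(theta, se) and summarize the confident
    claims, i.e. those with |theta_hat| > z * se.

    Returns (share of simulations that yield a confident claim,
             share of confident claims with the wrong sign,
             mean of |theta_hat| / |theta| among confident claims).
    """
    rng = random.Random(seed)
    confident = 0
    wrong_sign = 0
    total_ratio = 0.0
    for _ in range(n_sims):
        theta_hat = rng.gauss(theta, se)
        if abs(theta_hat) > z * se:
            confident += 1
            if theta_hat * theta < 0:
                wrong_sign += 1          # Type S: confident, but wrong sign
            total_ratio += abs(theta_hat) / abs(theta)
    if confident == 0:
        return 0.0, float("nan"), float("nan")
    return confident / n_sims, wrong_sign / confident, total_ratio / confident

# Hypothetical setting: a small true effect measured with a lot of noise.
share_confident, type_s, exaggeration = type_s_m_rates(theta=0.1, se=0.5)
print(share_confident, type_s, exaggeration)
```

In a noisy setting like this one, any estimate that clears the significance threshold must be far larger in magnitude than the true theta, which is the sense in which selection on significance can produce Type M errors.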
So what?
Does this matter? If we just do straight Bayesian inference with continuous prior distributions and work with posterior inferences, then it's not really so important. If we want, we can compute Type S and Type M error rates corresponding to various posterior summaries (that's what we do in the paper linked to above) but this is just a theoretical curiosity.
Thinking about error rates does make a difference, however, if we start selecting procedures based on their Type 1 error rates, Type 2 error rates or whatever. Then I think you're asking for trouble, for the reasons noted above. Thus these ideas could be useful in pointing us away from theoretical and methodological dead ends.
Posted by Andrew at 12:02 AM | Comments (1)
December 17, 2004
Radon webpage is back up and running
Our radon risk page (created jointly with Phil Price of the Indoor Environment Division, Lawrence Berkeley National Laboratory), is fully functional again.
You can now go over to the map, click on your state and then your county, give information about your house, give your risk tolerance (or use the default value), and get a picture of the distribution of radon levels in houses like yours. You also get an estimate of the dollar costs and lives saved from four different decision options along with a decision recommendation. (Here's an example of the output.)
We estimate that if all homeowners in the U.S. followed the instructions on this page, there would be a net savings of about $10 billion (with no additional loss of life) compared to what would happen if everybody followed the EPA's recommendation.
Posted by Andrew at 7:23 AM | Comments (0)
November 23, 2004
Arsenic in Bangladesh; sharing wells
Many of the wells used for drinking water in Bangladesh and other South Asian countries are contaminated with natural arsenic, affecting an estimated 100 million people. Arsenic is a cumulative poison, and exposure increases the risk of cancer and other diseases.
Is my well safe?
One of the challenges of reducing arsenic exposure is that there's no easy way to tell if your well is safe. Kits for measuring arsenic levels exist (and the evidence is that arsenic levels are stable over time in any given well), but we and other groups are just beginning to make these kits widely available locally.
Suppose your neighbor's well is low in arsenic. Does this mean that you can relax? Not necessarily. Below is a map of arsenic levels in all the wells in a small area (see the scale of the axes) in Araihazar upazila in Bangladesh:

Blue and green dots are the safest wells, yellow and orange exceed the Bangladesh standard of 50 micrograms per liter, and red and black indicate the highest levels of arsenic.
Bad news: dangerous wells are near safe wells
As you can see, even if your neighbor has a blue or green well, you're not necessarily safe. (The wells are located where people live. The empty areas between the wells are mostly cropland.) Safe and dangerous wells are intermingled.
Good news: safe wells are near dangerous wells
There is an upside, though: if you currently use a dangerous well, you are probably close to a safe well. The following histogram shows the distribution of distances to the nearest safe well, for the people in the map above who currently (actually, as of 2 years ago) have wells that are yellow, orange, red, or black:

Switching and sharing
So if you are told where that safe well is, maybe you can ask your neighbor who owns that well to share. In fact, a study by Alex Pfaff, Lex van Geen, and others has found that people really do switch wells when they are told that their well is unsafe. We're currently working on a cell-phone-based communication system to allow people in Bangladesh to get some of this information locally.
General implications for decision analysis
This is an interesting example for decision analysis because decisions must be made locally, and the effectiveness of various decision strategies can be estimated using direct manipulation of data, bypassing formal statistical analysis.
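As a rough sketch of what "direct manipulation of data" can look like here, the distance from each unsafe well to its nearest safe well can be computed straight from well coordinates and arsenic measurements. The coordinates and arsenic levels below are made up, and treating "safe" as at or below the 50 micrograms per liter standard is an assumption based on the color coding described above.

```python
import math

def nearest_safe_distances(wells, threshold=50.0):
    """wells: list of (x_meters, y_meters, arsenic_ug_per_liter).

    For each well above the threshold, return the distance in meters
    to the nearest well at or below the threshold.
    """
    safe = [(x, y) for x, y, a in wells if a <= threshold]
    unsafe = [(x, y) for x, y, a in wells if a > threshold]
    if not safe:
        return []
    return [
        min(math.hypot(x - sx, y - sy) for sx, sy in safe)
        for x, y in unsafe
    ]

# Made-up wells, purely for illustration (not the Araihazar data).
wells = [
    (0, 0, 10),
    (30, 40, 120),
    (100, 0, 300),
    (100, 50, 20),
    (200, 200, 75),
]
distances = nearest_safe_distances(wells)
print(distances)
```

A histogram of these distances is the kind of summary shown above, and a candidate strategy (for example, "switch if a safe well is within some walking distance") can be evaluated by counting how many households it would cover.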
Other details
Things are really more complicated than this because the depth of the well is an important predictor, with different depths being "safe zones" in different areas, and people are busy drilling new wells as well as using and measuring existing ones. Some more details are at our papers in Risk Analysis and Environmental Science & Technology.
Posted by Andrew at 12:47 PM | Comments (0)
November 16, 2004
Institutional decision analysis
The term "decision analysis" has multiple meanings in Bayesian statistics. When we use the term here, we are not talking about problems of parameter estimation, squared error loss, etc. Rather, we use "decision analysis" to refer to the solution of particular decision problems (such as in medicine, public health, or business) by averaging over uncertainties as estimated from a probability model. (See here for an example.)
That said, decision analysis has fundamental difficulties, most notably that it requires one to set up a utility function, which on one hand can be said to represent subjective feelings but on the other hand is presumably solid enough that it is worth using as the basis for potentially elaborate calculations.
From a foundational perspective, this problem can be resolved using the concept of institutional decision analysis.
Personal vs. institutional decision analysis
Statistical inference has an ambiguous role in decision making. Under a "subjective" view of probability (which I do not generally find useful; see Chapter 1 of Bayesian Data Analysis), posterior inferences represent the personal beliefs of the analyst, given his or her prior information and data. These can then be combined with a subjective utility function and input into a decision tree to determine the optimal decision, or sequence of decisions, so as to maximize subjective expected utility. This approach has serious drawbacks as a procedure for personal decision making, however. It can be more difficult to define a utility function and subjective probabilities than to simply choose the most appealing decision. The formal decision-making procedure has an element of circular reasoning, in that one can typically come to any desired decision by appropriately setting the subjective inputs to the analysis.
In practice, then, personal decision analysis is most useful when the inputs (utilities and probabilities) are well defined. For example, in a decision problem of the costs and benefits of screening for cancer, the utility function is noncontroversial--years of life, with a slight adjustment for quality of life--and the relevant probabilities are estimated from the medical literature. Bayesian decision analysis then serves as a mathematical tool for calculating the expected value of the information that would come from the screening.
In institutional settings--for example, businesses, governments, or research organizations--decisions need to be justified, and formal decision analysis has a role to play in clarifying the relation between the assumptions required to build and apply a relevant probability model and the resulting estimates of costs and benefits.
We introduce the term "institutional decision analysis" to refer to the process of transparently setting up a probability model, utility function, and an inferential framework leading to cost estimates and decision recommendations. Depending on the institutional setting, the decision analysis can be formalized to different extents.
In general, there are many ways in which statistical inferences can be used to inform decision-making. The essence of the "objective" or "institutional" Bayesian approach is to clearly identify the model assumptions and data used to form the inferences, evaluate the reasonableness and the fit of the model's predictions (which include decision recommendations as a special case), and then expand the model as appropriate to be more realistic. The most useful model expansions are typically those that allow more information to be incorporated into the inferences.
Further discussion and several examples appear in Chapter 22 of "Bayesian Data Analysis."
Posted by Andrew at 12:51 PM | Comments (2)
November 10, 2004
How to save $10 billion
Radon is a radioactive gas that is generally believed to cause lung cancer, even in low concentrations, and might exist in high concentrations in the basement of your house (see the map).
The EPA recommends that you should test your home for radon and then fix the problem if your measurement is 4 picoCuries per liter or higher. We estimate that this strategy, if followed, would cost about $25 billion and save about 110,000 lives over the next thirty years.
We can do much better by using existing information on radon levels to target homes that are likely to have high levels. If measurements are more targeted, we estimate that the same savings of 110,000 lives can be achieved at a cost of only $15 billion. The problem with the EPA's recommendation is that, by measuring everyone, including those who will probably have very low radon, it increases the number of false alarms--high measurements that occur just by chance in low-radon houses.
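The false-alarm point can be illustrated with a toy simulation. Everything in this sketch is an assumption made for illustration: log-normal variation in true radon levels within an area, multiplicative measurement noise, a single measurement per house, and the specific numbers. It is not the model or the data behind the estimates above.

```python
import math
import random

def false_alarm_share(geo_mean, n=200_000, log_sd=0.8, meas_sd=0.5,
                      action_level=4.0, seed=1):
    """Among houses whose single measurement is at or above the action
    level (in pCi/L), return the share whose true level is below it.

    geo_mean: assumed geometric mean of true radon levels in the area.
    log_sd, meas_sd: assumed sd of log true levels and of log-scale noise.
    """
    rng = random.Random(seed)
    log_action = math.log(action_level)
    flagged = 0
    false_alarms = 0
    for _ in range(n):
        true_log = rng.gauss(math.log(geo_mean), log_sd)
        measured_log = true_log + rng.gauss(0.0, meas_sd)
        if measured_log >= log_action:
            flagged += 1
            if true_log < log_action:
                false_alarms += 1    # high reading, but truly low house
    return false_alarms / flagged if flagged else float("nan")

low_area = false_alarm_share(geo_mean=0.5)   # hypothetical low-radon area
high_area = false_alarm_share(geo_mean=3.0)  # hypothetical high-radon area
print(f"false-alarm share: low area {low_area:.2f}, high area {high_area:.2f}")
```

Under these assumptions, a high reading in the low-radon area is much more likely to be a chance result than the same reading in the high-radon area, which is the intuition behind targeting measurements.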
We found formal decision analysis to be a useful tool in quantifying the recommendations of where to measure and remediate. (For more details, see Section 22.4 of Bayesian Data Analysis and this paper).
Click here to see what to do for your house.
Posted by Andrew at 4:29 PM | Comments (0)
November 9, 2004
Cost-benefit analysis, goal-based decision analysis, and Dave Krantz's talk
Carrie McLaren has an interesting interview with Frank Ackerman and Lisa Heinzerling in the current Stay Free magazine, on the topic of cost-benefit analysis, as it is used in environmental regulations (for example, how much money is it worth spending to reduce arsenic exposures by a specified amount). Apparently, a case of chronic bronchitis has been judged to have a cost of $260,000, and IQ points are worth $8300 each. Ackerman and Heinzerling argue that cost-benefit analysis is "fundamentally flawed," basically because it involves a lot of arbitrary choices that allow regulators to do whatever they want and justify their choices with numbers.
This made me a little worried, since I've done some cost-benefit analysis myself! In particular, I'm sympathetic to the argument that cost-benefit analysis requires arbitrary choices of the value of a life (for example). Garbage in, garbage out, and all that. But, on the plus side, cost-benefit analysis allows one to quantify the gains from setting priorities. Even if you don't "believe" a particular value specified for value of a life, you can calculate conditional on that assumed value, as a starting point to understanding the full costs of different decision options.
With this mixture of interest and skepticism as background, I was interested to read the following exchange in the Stay Free interview:
Stay Free: To play devil's advocate, proponents of cost-benefit analysis argue that obviously some ways of preserving our environment or our health are cheaper and better than others, so can't cost-benefit analysis help with that?
Heinzerling: It doesn't necessarily help with that. What might help is setting a goal and then thinking about creative ways to get to that goal most cheaply.
[back to me:] OK, but what's the difference between (a) setting dollar values for various risks and then doing cost-benefit analysis, and (b) setting a goal and then trying to achieve it most cheaply? Either way, you'll choose a point along the "efficient frontier" of non-dominated strategies. Setting the dollar values or setting a goal are just two different ways of parameterizing where you'll end up on that frontier.
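To make the "efficient frontier" point concrete, here is a small sketch with made-up strategies, each described by a cost and a number of lives saved. The strategy names, the numbers, the dollar value per life, and the goal are all hypothetical. The sketch finds the non-dominated strategies and then picks one using approach (a) and one using approach (b).

```python
# Hypothetical strategies: (name, cost in $ millions, lives saved)
strategies = [
    ("A", 0, 0), ("B", 10, 50), ("C", 20, 80),
    ("D", 25, 60), ("E", 40, 100), ("F", 15, 40),
]

def dominated(s, others):
    """True if some other strategy costs no more and saves at least as
    many lives, with at least one of the two strictly better."""
    _, cost, lives = s
    return any(
        c2 <= cost and b2 >= lives and (c2 < cost or b2 > lives)
        for _, c2, b2 in others
    )

frontier = [s for s in strategies if not dominated(s, strategies)]

def best_by_dollar_value(value_per_life):
    """(a) Put a dollar value on a life and maximize net benefit."""
    return max(strategies, key=lambda s: value_per_life * s[2] - s[1])

def cheapest_meeting_goal(min_lives):
    """(b) Set a goal and reach it as cheaply as possible
    (ties on cost are broken in favor of more lives saved)."""
    feasible = [s for s in strategies if s[2] >= min_lives]
    return min(feasible, key=lambda s: (s[1], -s[2]))

choice_a = best_by_dollar_value(0.5)  # assumed $0.5 million per life
choice_b = cheapest_meeting_goal(70)  # assumed goal of 70 lives saved
print([s[0] for s in frontier], choice_a[0], choice_b[0])
```

In this toy setup both rules select a strategy on the frontier; changing the assumed dollar value or the assumed goal moves the choice along it.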
Well, this stuff still confuses me, but according to Dave Krantz of the psychology department at Columbia, there is an alternative framework of decision analysis based on goals, rather than utilities, that is a better fit to how people actually make decisions (and perhaps how they should make decisions). Basically, the claim is that, in practice, (a) and (b) of the previous paragraph are not the same.
P.S. See here for possibly more discussion on this.
Posted by Andrew at 1:02 PM | Comments (1) | TrackBack
October 13, 2004
Why it's rational to vote
The chance that your vote will be decisive in the Presidential election is, at best, about 1 in 10 million. So why vote?
Schematic cost-benefit analysis
To express formally the decision of whether to vote:
U = p*B - C, where
U = the relative utility of going and casting a vote
p = probability that, by voting, you will change the election outcome
B = the benefit you would feel from your candidate winning (compared to the other candidate winning)
C = the net cost of voting
The trouble is, if p is 1 in 10 million, then for any reasonable value of B, the product p*B is essentially zero (for example, even if B is as high as $10000, p*B is 1/10 of one cent), and this gives no reason to vote.
The usual explanation
Actually, though, about half the people vote. The simplest utility-theory explanation is that the net cost C is negative for these people--that is, the joy of voting (or the satisfying feeling of performing a civic duty) outweighs the cost in time of going out of your way to cast a vote.
The "civic duty" rationale for voting fails to explain why voter turnout is higher in close elections and in important elections, and it fails to explain why citizens give small-dollar campaign contributions to national candidates. If you give Bush or Kerry $25, it's not because you're expecting a favor in return, it's because you want to increase your guy's chance of winning the election. Similarly, the argument of "it's important to vote, because your vote might make a difference" ultimately comes down to that number p, the probability that your vote will, in fact, be decisive.
Our preferred explanation
We understand voting as a rational act, given that a voter is voting to benefit not just himself or herself, but also the country (or the world) at large. (This "social" motivation is in fact consistent with opinion polls, which find, for example, that voting decisions are better predicted by views on the economy as a whole than by personal financial situations.)
In the equation above, B represents my gain in utility by having my preferred candidate win. If I think that Bush (or Kerry) will benefit the country as a whole, then my view of the total benefit from that candidate winning is some huge number, proportional to the population of the U.S. To put it (crudely) in monetary terms, if my candidate's winning is equivalent to an average $100 for each person (not so unreasonable given the stakes in the election), then B is about $30 billion. Even if I discount that by a factor of 100 (on the theory that I care less about others than myself), we're still talking $300 million, which when multiplied by p=1/(10 million) is a reasonable $30.
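The two calculations of p*B in this post can be checked with a few lines of code. This is only a sketch of the arithmetic: the function name is made up, and the population of 300 million is an assumption implied by the $30 billion figure above.

```python
def expected_benefit(p, benefit):
    """Expected benefit p*B of casting a vote, in dollars."""
    return p * benefit

p = 1 / 10_000_000  # chance of casting the decisive vote, as in the post

# Selfish voter: B is a personal benefit of $10,000.
selfish = expected_benefit(p, 10_000)

# Social voter: $100 per person across an assumed population of
# 300 million, discounted by a factor of 100.
population = 300_000_000
social_benefit = 100 * population / 100
social = expected_benefit(p, social_benefit)

print(f"selfish p*B = ${selfish:.4f}")
print(f"social  p*B = ${social:.2f}")
```

Voting is worthwhile in this framework when p*B exceeds the net cost C, so the comparison is between C and a tenth of a cent in the first case, and between C and roughly $30 in the second.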
Some empirical evidence
As noted above, voter turnout is higher in close elections and important elections. These findings are consistent with the idea that it makes more sense to vote when your vote is more likely to make a difference, and when the outcome is more important.
As we go from local, to state, to national elections, the size of the electorate increases, and thus the probability decreases of your vote being decisive, but voter turnout does not decrease. This makes sense in our explanation because national elections affect more people, thus the potential benefit B is multiplied by a larger number, canceling out the corresponding decrease in the probability p.
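The cancellation can be written out under two simplifying assumptions, introduced here for illustration: the probability of a decisive vote is inversely proportional to the number of people affected, N (taken as proportional to the size of the electorate), and the social benefit is proportional to N. The symbols k (a constant reflecting how close the election is), b (the average per-person benefit), and alpha (the discount applied to benefits to others) are not in the original notation.

```latex
p \approx \frac{k}{N}, \qquad B \approx \alpha\, b\, N
\quad\Longrightarrow\quad
pB \approx \frac{k}{N}\cdot \alpha\, b\, N = k\,\alpha\, b .
```

Under these assumptions the expected benefit of voting does not depend on N, which is consistent with turnout not falling as elections get larger.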
People often vote strategically when they can (in multicandidate races, not wanting to "waste" their votes on candidates who don't seem to have a chance of winning). Not everyone votes strategically, but the fact that many people do is evidence that they are voting to make a difference, not just to scratch an itch or satisfy a civic duty.
As noted above, people actually say they are voting for social reasons. For example, in the 2001 British Election Study, only 25% of respondents thought of political activity as a good way to get "benefits for me and my family" whereas 66% thought it a good way to obtain "benefits for groups that people care about like pensioners and the disabled."
Implications for voting
First, it can be rational to vote with the goal of making a difference in the election outcome (not simply because you enjoy the act of voting or would feel bad if you didn't vote). If you choose not to vote, you are giving up this small but nonzero chance to make a huge difference.
Second, if you do vote, it is rational to prefer the candidate who will help the country as a whole. Rationality, in this case, is distinct from selfishness.
See here for the full paper (joint work with Aaron Edlin and Noah Kaplan).
Posted by Andrew at 5:02 PM | Comments (10)