April 2007 Archives

This paper by David Blanchflower and Andrew Oswald (from the Australian Economic Review in 2005) looks interesting. I'm interested in happiness (who isn't?) but this paper particularly interests me because it addresses a special case of the general statistical problem of summarizing multivariate data by indexes. Here's the abstract:

According to the well-being measure known as the U.N. Human Development Index, Australia now ranks 3rd in the world and higher than all other English-speaking nations. This paper questions that assessment. It reviews work on the economics of happiness, considers implications for policymakers, and explores where Australia lies in international subjective well-being rankings. Using new data on approximately 50,000 randomly sampled individuals from 35 nations, the paper shows that Australians have some of the lowest levels of job satisfaction in the world. Moreover, among the sub-sample of English-speaking nations, where a common language should help subjective measures to be reliable, Australia performs poorly on a range of happiness indicators. The paper discusses this paradox. Our purpose is not to reject HDI methods, but rather to argue that much remains to be understood in this area.

I recommend that the next paper these folks write present the results in graphical, not tabular, form, and order the countries in some reasonable way (for example, in order of per-capita GDP) rather than alphabetically. For example, do we really need to know that Australia has a value of 5.39 for one index and 5.62 for another? These comments apply to the raw data and also to the displays of regression coefficients.

Going on to the substance of the paper, I have no particular comments. It is admirably crisp and speaks for itself and modestly focuses on the statistical issues.

Too much information?


Aleks sent me the link to this site. Seth might like it--except that it seems to be set up only to monitor data, not to record experiments.

Mediation


Rahul writes:

Baby-faced politicians lose


Greg Laun pointed me to this paper by Alexander Todorov, Anesu Mandisodza, Amir Goren, and Crystal Hall, whose abstract states:

Inferences of competence based solely on facial appearance predicted the outcomes of U.S. congressional elections better than chance (e.g., 68.8% of the Senate races in 2004) and also were linearly related to the margin of victory. These inferences were specific to competence and occurred within a 1-second exposure to the faces of the candidates. The findings suggest that rapid, unreflective trait inferences can contribute to voting choices, which are widely assumed to be based primarily on rational and deliberative considerations.

MCSim lives!


MCSim is some software that Frederic Bois wrote for our toxicology research over 10 years ago. I didn't know it was still around, until Bill Harris wrote,

Jouni writes that SAS now has a Bayesian module. I agree with Jouni that "The Bayesian probability reflects a person's subjective beliefs" is not really the kind of phrase you expect to hear from a modern practicing Bayesian methodologist. This definition would immediately invalidate the use of Bayesian methods in any field of science, I'd think. Well, maybe not in sociology . . .

Greg Mankiw and Tyler Cowen point to the release of this book by Bryan Caplan, so it might be worth pointing to my discussion of an earlier version of the book that he showed me when I visited his university in 2005. I don't like the title (unsurprisingly, since I wrote a paper called Voting as a rational choice), but Caplan's book is interesting.

My full comments are here, and here's the short version:

This paper by Catherine Crouch, Jessica Watkins, Adam Fagen, and Eric Mazur looks pretty exciting to me:

Peer Instruction is an instructional strategy for engaging students during class through a structured questioning process that involves every student. Here we describe Peer Instruction (hereafter PI) and report data from more than ten years of teaching with PI in the calculus- and algebra-based introductory physics courses for non-majors at Harvard University, where this method was developed. Our results indicate increased student mastery of both conceptual reasoning and quantitative problem solving upon implementing PI. Gains in student understanding are greatest when the PI questioning strategy is accompanied by other strategies that increase student engagement, so that every element of the course serves to involve students actively. We also provide data on gains in student understanding and information about implementation obtained from a survey of almost four hundred instructors using PI at other institutions. We find that most of these instructors have had success using PI, and that their students understand basic mechanics concepts at the level characteristic of courses taught with interactive engagement methods. Finally, we provide a sample set of materials for teaching a class with PI, and provide information on the extensive resources available for teaching with PI.

Their stuff is all about physics. I'd like to do it with statistics. I think it could revolutionize the (currently crappy) state of statistics instruction.

More Black Swan


Jonathan Nagler posts this mini-conference:

The psychology of power


In a comment on this entry, Chris points to this interview with Deborah Gruenfeld. Some excerpts:

I [Gruenfeld] have been studying the psychological consequences of having power for the past seven years . . . There are just so many good examples of people with power who behave in ways that demand some kind of psychological explanation.

For example, I had a brief career in journalism, and I occasionally met with Jann Wenner, the founder and publisher of Rolling Stone. . . . He had in his office a small refrigerator within arm’s reach of his desk. As far as I could tell, there were only two things in there: a bottle of vodka and a bag of raw onions. While we were meeting, he would reach over, open the door, drink vodka straight out of the bottle, and eat onions. What’s striking about it now is that none of us ever said anything to him about this, and he never even offered to share! He seemed to think it was perfectly appropriate to do this in a meeting. And that is, I think, a classic example of what we think is going on with power, which is what we call “disinhibition.”

Gruenfeld continues:

NYC R users group


I received this in the email. I know nothing about it, just passing it on:

Seth tested his balance every day, sometimes when eating flaxseed oil and sometimes when eating olive oil, and found the following:

flaxseed.jpg

This is a pretty graph, and shows that Seth's balance improved when he ate flaxseed oil and got worse with the olive oil. He conjectures:

A possible explanation is that when the concentration of omega-3 in the blood is low, the omega-3 in cell membranes slowly “evaporates” into the blood. When a cell’s membranes lose omega-3, it doesn’t work as well.

But . . .

As a statistician, my first thought was some sort of measurement bias: Seth knew when he was taking olive oil and when he was taking flaxseed oil, and staying balanced is a tricky enough task that I could well imagine that the results could be affected by his expectations.

Flying blind

I'd be more convinced by a blinded experiment. This is tricky with a self-experiment but it could be done. For example:

1. Get 50 identical vials and pour olive oil into 25 of them and flaxseed oil into the other 25. Label them (e.g., "o" and "f"), then cover up the labels with removable stickers.

2. Mix up the vials in a bag (this is sometimes called "physical randomization" in the sampling literature), then use one vial per day. After use, place them on a shelf in order. Each day, measure your balance and whatever else you want to record.

3. When the experiment is over, peel off the stickers and identify which oil was eaten on which day.

4. If the two oils can be told apart by smell, clip your nose (this might sound weird, but Seth was actually already doing this). If they taste different, mix with some strong bitter flavor (this might mess up Seth's weight-loss experiment but should be OK for the balance study). If they look different, add food coloring or just use opaque bottles and don't look inside before drinking.

This simple experiment, with complete randomization, might not capture the time trends Seth is looking for. It would be simple enough to alter the experiment, for example by replacing the vials with larger containers and setting the unit of randomization to be the ten-day period rather than the day. You could even do something trickier, maybe with the assistance of a friend, to set up a pattern with long strings of o's and f's without knowing exactly when the switches will occur.
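The two designs described above (one vial per day, or ten-day periods as the unit of randomization) could be sketched in software as follows. This is only an illustration: the function names, the "o"/"f" labels, and the choice of six ten-day blocks are my assumptions, and for real blinding the assignment key would have to be generated and held by someone other than the experimenter until the study ends.

```python
import random


def randomize_vials(n_per_oil=25, seed=None):
    """Return a shuffled list of hidden labels, one vial per day."""
    rng = random.Random(seed)
    vials = ["o"] * n_per_oil + ["f"] * n_per_oil
    rng.shuffle(vials)  # software stand-in for mixing the vials in a bag
    return vials


def randomize_blocks(n_blocks=6, block_len=10, seed=None):
    """Randomize multi-day periods instead of single days (hypothetical sizes)."""
    rng = random.Random(seed)
    half = n_blocks // 2
    blocks = ["o"] * half + ["f"] * (n_blocks - half)
    rng.shuffle(blocks)
    # Expand each block label into block_len consecutive days.
    return [oil for oil in blocks for _ in range(block_len)]


daily = randomize_vials(seed=1)     # 50 days, complete randomization
blocked = randomize_blocks(seed=1)  # 60 days in ten-day runs of one oil
print("".join(daily))
print("".join(blocked))
```

The blocked version keeps long runs of each oil, which is what you would want if the effect builds up or fades over several days.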

Why Seth's existing experiment is a good thing: I'm not slamming unblinded studies

I hope Seth (or one of his correspondents) does this randomized experiment. In the meantime, Seth's results provide a potentially important contribution by motivating new hypotheses. The unblinded experiment was so easy to do (within the context of Seth's earlier experiments), and placing a requirement such as blinding might have increased the required effort to the extent that Seth might not have gotten around to doing it.

Maybe Seth could make blinding (where possible) a routine part of his future experiments, though. Just as he's trained himself to perform disciplined self-experiments with precise and regular measurements (something that I never get around to doing when trying out new teaching methods, for example), maybe he could take the next step with blinding.

Benjamin Page is speaking on this paper:

Data from the 2006 CCGA national survey once again indicate that the American public is much more multilateralist than U.S. foreign policy officials. Large majorities of Americans favor several specific steps to strengthen the UN, support Security Council intervention for peacekeeping and human rights, and favor working more within the UN even if it constrains U.S. actions. Large majorities also favor the Kyoto agreement on global warming, the International Criminal Court, the Comprehensive Nuclear Test Ban Treaty, and the new inspection agreement on biological weapons. Large majorities favor multilateral uses of U.S. troops for peacekeeping and humanitarian purposes, but majorities oppose most major unilateral engagements.

He continues:

Pierre points to the proceedings of the first of the Valencia International Meetings on Bayesian Statistics [31MB PDF!] in 1979.

Browsing through, I am surprised that they look very much like a blog, with good papers and a lot of good commentary and discussion, something that we have discussed before. Then it took an airplane flight to the beautiful Mediterranean beaches, but today, thanks to the internet, we can stay in our offices and chat online. Hmm.

Perhaps that flight was what motivated people to show up and contribute something interesting. But receiving commentary, especially commentary from good researchers, is also something. Peer review with anonymous reviews that nobody looks at is such a waste of human effort. Make commentaries, not reviews! Pick good papers! Pick good authors! Pick trusted commentators! Pick good rankers! Pick good editors (who list papers on a topic or who invite authors to write on a topic)! Pick interesting topics! The monolithic nightmare of conferences, tomes and publishers should go away.

For some inspiration, look at websites such as Reddit or Yelp!. Reddit shows how good stuff can rise higher (but it fails because people ranking are not trusted). Yelp shows how one can pick and reward good reviewers (but fails because it's hard to find good stuff).

Glossary: clog = conference log.

The norm of self-interest


Aleks's comments here, in particular the bit about selfishness, remind me of one of my favorite papers, "The norm of self-interest" by the psychologist Dale Miller. Here's the abstract:

The self-interest motive is singularly powerful according to many of the most influential theories of human behavior and the layperson alike. In the present article the author examines the role the assumption of self-interest plays in its own confirmation. It is proposed that a norm exists in Western cultures that specifies self-interest both is and ought to be a powerful determinant of behavior. This norm influences people's actions and opinions as well as the accounts they give for their actions and opinions. In particular, it leads people to act and speak as though they care more about their material self-interest than they do. Consequences of misinterpreting the "fact" of self-interest are discussed.

(Related work by Noah Kaplan, Aaron Edlin, and myself here, distinguishing rationality from selfishness as motivations for voting.)

This looks interesting (it's this Saturday, from 10:30 to 4:00 in 801 International Affairs Building):

Susan Fiske and Lasana Harris (Princeton), "Which Groups We Consider Least Human: Evidence From Social Cognition and Social Neuroscience."

Mark Peffley (Kentucky), "Racial Polarization in Criminal Justice Attitudes."

Shawn Rosenberg (UC Irvine and Princeton), "Types of Democratic Deliberation: Can the People Govern?"

Following up (sort of) on my comments on The Black Swan . . .

Dan Goldstein and Nassim Taleb write in their paper: "Finance professionals, who are regularly exposed to notions of volatility, seem to confuse mean absolute deviation with standard deviation, causing an underestimation of 25% with theoretical Gaussian variables. In some fat tailed markets the underestimation can be up to 90%. The mental substitution of the two measures is consequential for decision making and the perception of market variability."

This interests me, partly because I've recently been thinking about summarizing variation by the mean absolute difference between two randomly sampled units (in mathematical notation, E(|x_i - x_j|)), because that seems like the clearest thing to visualize. Fred Mosteller liked the interquartile range, but that's a little too complicated for me; also, I like to do some actual averaging, not just medians, which miss some important information. I agree with Goldstein and Taleb that there's not necessarily any good reason for using sd (except for mathematical convenience in the Gaussian model).
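To make the three summaries concrete, here is a rough simulation sketch comparing the standard deviation, the mean absolute deviation, and the mean absolute difference between two randomly sampled units. The function name and the simulated standard-normal data are my own assumptions for illustration. For a Gaussian, the sd is sqrt(pi/2), or about 1.25, times the mean absolute deviation (the likely source of the 25% figure in the quote), and the mean absolute pairwise difference is 2/sqrt(pi), or about 1.13, times the sd.

```python
import math
import random
import statistics


def spread_summaries(xs, n_pairs=100_000, seed=0):
    """Return (sd, mean absolute deviation, mean absolute pairwise difference)."""
    rng = random.Random(seed)
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    mad = statistics.fmean(abs(x - mean) for x in xs)
    # Monte Carlo estimate of E|x_i - x_j| over randomly sampled distinct pairs.
    total = 0.0
    for _ in range(n_pairs):
        a, b = rng.sample(xs, 2)
        total += abs(a - b)
    return sd, mad, total / n_pairs


data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(50_000)]  # simulated Gaussian data
sd, mad, pair_diff = spread_summaries(xs)

print(f"sd / mad       = {sd / mad:.3f} (Gaussian theory {math.sqrt(math.pi / 2):.3f})")
print(f"pair_diff / sd = {pair_diff / sd:.3f} (Gaussian theory {2 / math.sqrt(math.pi):.3f})")
```

Swapping in fat-tailed data for `xs` would show the ratios drifting away from the Gaussian values, which is the Goldstein and Taleb point.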

Duncan Watts (of Columbia's sociology department) wrote an article in the New York Times the other day:

As anyone who follows the business of culture is aware, the profits of cultural industries depend disproportionately on the occasional outsize success — a blockbuster movie, a best-selling book or a superstar artist — to offset the many investments that fail dismally. What may be less clear to casual observers is why professional editors, studio executives and talent managers, many of whom have a lifetime of experience in their businesses, are so bad at predicting which of their many potential projects will make it big. How could it be that industry executives rejected, passed over or even disparaged smash hits like “Star Wars,” “Harry Potter” and the Beatles, even as many of their most confident bets turned out to be flops? It may be true, in other words, that “nobody knows anything,” as the screenwriter William Goldman once said about Hollywood. But why?

Duncan continues:

J. Robert Lennon has a blog!


J. Robert Lennon, one of my favorite authors, has a blog (with his wife). It's interesting to see what a Real Writer thinks about literature. Also a bit disillusioning . . .

Trust and institutions


Lanlan Wang sent along this paper. Here's the abstract:

One thing that bugs me is that there seems to be so little model checking done in statistics. As I wrote in this referee report,

I'd like to see some graphs of the raw data, along with replicated datasets from the model. The paper admirably connects the underlying problem to the statistical model; however, the Bayesian approach requires a lot of modeling assumptions, and I'd be a lot more convinced if I could (a) see some of the data and (b) see that the fitted model would produce simulations that look somewhat like the actual data. Otherwise we're taking it all on faith.

But, why, if this is such a good idea, do people not do it? I don't buy the cynical answer that people don't want to falsify their own models. My preferred explanation might be called sociological and goes as follows: We're often told to check model fit. But suppose we fit a model, write a paper, and check the model fit with a graph. If the fit is ok, then why bother with the graph: the model is OK, right? If the fit shows problems (which, realistically, it should, if you think hard enough about how to make your model-checking graph), then you better not include the graph in the paper, or the reviewers will reject, saying that you should fix your model. And once you've fit the better model, no need for the graph.

The result is: (a) a bloodless view of statistics in which only the good models appear, leaving readers in the dark about all the steps needed to get there; or, worse, (b) statisticians (and, in general, researchers) not checking the fit of their model in the first place, so that neither the original researchers nor the readers of the journal learn about the problems with the model.

One more thing . . .

You might say that there's no reason to bother with model checking since all models are false anyway. I do believe that all models are false, but for me the purpose of model checking is not to accept or reject a model, but to reveal aspects of the data that are not captured by the fitted model. (See chapter 6 of Bayesian Data Analysis for some examples.)
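Here is a minimal sketch of what checking a fitted model against replicated datasets can look like. Everything in it is made up for illustration: the "observed" data are simulated from a skewed distribution, the model is a normal fit by plugging in the sample mean and sd (a simplification; a full Bayesian check would draw the parameters from their posterior), and the test statistic is the sample minimum.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical "observed" data: skewed and strictly positive.
observed = [rng.expovariate(1.0) for _ in range(200)]

# Fit a normal model by plugging in the sample mean and standard deviation.
mu_hat = statistics.fmean(observed)
sigma_hat = statistics.stdev(observed)


def replicate(n):
    """Simulate one replicated dataset of size n from the fitted normal model."""
    return [rng.gauss(mu_hat, sigma_hat) for _ in range(n)]


# Test statistic: the sample minimum, chosen because the normal model
# allows negative values and the observed data do not.
t_obs = min(observed)
t_rep = [min(replicate(len(observed))) for _ in range(500)]

# Share of replications whose minimum is at least as large as the observed one.
p_value = sum(t >= t_obs for t in t_rep) / len(t_rep)
print(f"observed min = {t_obs:.3f}, typical replicated min = {statistics.median(t_rep):.3f}")
print(f"share of replications with min >= observed: {p_value:.3f}")
```

The point of the exercise is not the p-value itself but what the comparison reveals: the fitted model keeps generating negative values that never occur in the data.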

Statisticians often talk about a bias-variance tradeoff, comparing a simple unbiased estimator (for example, a difference in differences) to something more efficient but possibly biased (for example, a regression). There's commonly the attitude that the unbiased estimate is a better or safer choice. My only point here is that, by using a less efficient estimate, we are generally choosing to estimate fewer parameters (for example, estimating an average incumbency effect over a 40-year period rather than estimating a separate effect for each year or each decade). Or estimating an overall effect of a treatment rather than separate estimates for men and women. If we do this--make the seemingly conservative choice to not estimate interactions--we are implicitly estimating these interactions at zero, which is not unbiased at all!

I'm not saying that there are any easy answers to this; for example, see here for one of my struggles with interactions in an applied problem--in this case (estimating the effect of incentives in sample surveys), we were particularly interested in certain interactions even though they could not be estimated precisely from data.
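A toy simulation can illustrate the point about implicitly setting interactions to zero. The numbers are invented: I assume a randomized treatment whose true effect is 1.0 for men and 3.0 for women, with equal group sizes. The pooled difference in means is a fine estimate of the average effect (about 2.0), but if it is used as the effect for each group, it is off by about 1.0 in each direction.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical true treatment effects; the interaction is NOT zero.
TRUE_EFFECT = {"men": 1.0, "women": 3.0}


def simulate(n_per_cell=5000):
    """Simulate (group, treated, outcome) rows from a randomized experiment."""
    rows = []
    for group, effect in TRUE_EFFECT.items():
        for treated in (0, 1):
            for _ in range(n_per_cell):
                y = effect * treated + rng.gauss(0.0, 1.0)
                rows.append((group, treated, y))
    return rows


def diff_in_means(rows):
    """Treated-minus-control difference in mean outcomes."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


rows = simulate()
pooled = diff_in_means(rows)
by_group = {g: diff_in_means([r for r in rows if r[0] == g]) for g in TRUE_EFFECT}

print(f"pooled estimate: {pooled:.2f}")
for g, est in by_group.items():
    print(f"{g}: separate estimate {est:.2f}, error if pooled is used {pooled - TRUE_EFFECT[g]:+.2f}")
```

With small samples the separate estimates would be much noisier than the pooled one, which is the other side of the tradeoff.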

(Also posted at Overcoming Bias.)

Lotteries: A Waste of Hope


Statisticians are always looking for ways to convince people not to play the lottery. Here's another reason (from Eliezer Yudkowsky).

Boris points us to this paper (with Christopher Berry and Nolan McCarty):

states_common_density.png

Boris writes:

Jesus update


Jonathan Falk points us to this:

Several prominent scholars who were interviewed in a bitterly contested documentary that suggests that Jesus and his family members were buried in a nondescript ancient Jerusalem burial cave have now revised their conclusions, including the statistician who claimed that the odds were 600:1 in favor of the tomb being the family burial cave of Jesus of Nazareth . . .

See here for Aleks's earlier thoughts on this.

Books on nutrition


Seth recommends:

The Queen of Fats, by Susan Allport

Nutrition and Physical Degeneration, by Weston Price

The first of these books is recent; the other is from 1930 or so.

Adjusted R-sq = 0.001


A correspondent writes:

Wanted to add to my comment on the Black Swan review... but didn't want to hang people in public.

You mentioned... (Mosteller and Wallace made a similar point in their Federalist Papers book about how they don't trust p-values less than 0.01 since there can always be unmodeled events. Saying p<0.01 is fine, but please please don't say p<0.00001 or whatever.) which is a terrific point!

I had a related experience just last week when attending a seminar. Some guys were modeling some marketing information and showed ranges of coefficients from the set of regressions and argued that everything was significant. At the bottom of the table, it read: "Adjusted R-sq = 0.001".

I had to check my glasses. I thought I was hallucinating. That line didn't seem to faze anyone else. The audience were asking modeling questions, why didn't you model it this way or that, etc. I turned around and asked my neighbor: were you bothered by R-sq of 0.1%? His answer was "I have seen 0.001 or lower for panel data".

Now I'm not an expert in panel data analysis. But I am shocked, shocked, that apparently such models are allowable in academia. Pray tell me not!

I don't know what to say. In theory, R^2 can be as low as you want, but I have to admit I've never seen something like 0.001.
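For what it's worth, it is easy to simulate a regression where the coefficient is clearly "significant" and the adjusted R^2 is still around 0.001: all it takes is a tiny true slope and a lot of data. The sketch below uses made-up numbers (a slope of 0.03, unit-variance noise, 100,000 observations) and fits a one-predictor least-squares regression by hand; it says nothing about the particular seminar or about panel data.

```python
import math
import random

rng = random.Random(0)
n = 100_000
slope_true = 0.03  # hypothetical tiny effect

x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [slope_true * xi + rng.gauss(0.0, 1.0) for xi in x]

# Ordinary least squares for y = a + b*x, computed from sums of squares.
mx = sum(x) / n
my = sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
syy = sum((yi - my) ** 2 for yi in y)

b = sxy / sxx
a = my - b * mx
rss = syy - b * sxy                      # residual sum of squares
r2 = 1.0 - rss / syy
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
se_b = math.sqrt(rss / (n - 2) / sxx)    # standard error of the slope
t_stat = b / se_b

print(f"slope = {b:.4f} (se {se_b:.4f}), t = {t_stat:.1f}")
print(f"R^2 = {r2:.5f}, adjusted R^2 = {adj_r2:.5f}")
```

So a tiny R^2 and statistical significance are not contradictory; whether an effect that explains a tenth of a percent of the variance is interesting is a separate question.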

Jen pointed me to Level-Headed: Economics Experiment Finds Taste for Equality. In brief, people are willing to pay their own money to take from the rich and give it to the poor. The underlying Nature article mentions that:


Emotions towards top earners become increasingly negative as inequality increases, and those who express these emotions spend more to reduce above-average earners’ incomes and to increase below-average earners’ incomes. The results suggest that egalitarian motives affect income-altering behaviours, and may therefore be an important factor underlying the evolution of strong reciprocity and, hence, cooperation in humans.

However, I can see other explanations that don't require altruism:


  • Utility arbitrage: Utility is nonlinear: taking $1 when you have $10 of daily income is worse than taking $10 when you have $100 of daily income. This is used as an argument for progressive taxation, which might be nonlinear in money, but could be linear in utility (taxes giving everyone the same amount of pain). Those who take from the rich and give to the poor might effectively be doing arbitrage: the amount of gratitude from the poor minus the anger from the rich minus the cost amounts to a positive profit for Robin Hood.

  • Insurance against slavery: There is an incentive for a commoner to prevent a powerful figure from gathering excessive power because letting this go on could lock the commoners into an under-caste.

  • Power asymmetry: The rich can become richer only by increasing the imbalance in the income distribution. But as they become richer, they actually become fewer. At some point, increasing their riches actually reduces their power, and they get "taken under" (in a revolution or a revolt). Since revolutions are costly, it's adaptive to "equalize" without breaking things up.

As an aside, it's interesting to notice James H. Fowler among the authors: he's behind a chain of very interesting papers over the past few years.

From Brian Witte of the Associated Press:

Maryland officially became the first state on Tuesday to approve a plan to give its electoral votes for president to the winner of the national popular vote instead of the candidate chosen by state voters.

Gov. Martin O'Malley, a Democrat, signed the measure into law, one day after the state's General Assembly adjourned.

The measure would award Maryland's 10 electoral votes to the national popular vote winner. The plan would only take effect if states representing a majority of the nation's 538 electoral votes decided to make the same change.

. . .

Other states are considering the change . . . National Popular Vote, a group that supports the change, said there are legislative sponsors for the idea in 47 states. . . . But not everyone is buying into the idea. North Dakota and Montana rejected it earlier this year. Opponents say the change would hurt small rural states, where the percentage of the national vote would be even smaller than the three electoral votes they each have in the overall Electoral College.

"Even smaller" . . . that's right. North Dakota has 640,000 people--that's 0.21% of the U.S. population. Their share of 538 electoral votes is 0.0021 x 538 = 1.15. Explain again why they should get more electoral votes than, say, the 679,000 people in Cobb County, Georgia, or the 668,000 people in Will County, Illinois?

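The population-share arithmetic above can be checked in a few lines. The U.S. total of 300 million is my assumption (a round figure for 2007, not a number from the post); the other populations are the ones quoted. With the unrounded share the North Dakota figure comes out at about 1.15, while multiplying the rounded 0.0021 by 538 gives about 1.13.

```python
US_POPULATION = 300_000_000   # assumption: round figure for 2007, not from the post
TOTAL_ELECTORAL_VOTES = 538


def proportional_votes(population, total_population=US_POPULATION):
    """Electoral votes a place would get if votes were split purely by population."""
    return population / total_population * TOTAL_ELECTORAL_VOTES


places = {
    "North Dakota": 640_000,
    "Cobb County, Georgia": 679_000,
    "Will County, Illinois": 668_000,
}

for name, pop in places.items():
    share = pop / US_POPULATION
    print(f"{name}: {share:.2%} of population, {proportional_votes(pop):.2f} proportional votes")

nd_votes = proportional_votes(places["North Dakota"])
nd_votes_rounded_share = 0.0021 * TOTAL_ELECTORAL_VOTES
```
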
Boris pointed me to this paper by Edward Glaeser, Giacomo Ponzetto, and Jesse Shapiro. Here's the abstract:

Party platforms differ sharply from one another, especially on issues with religious content, such as abortion or gay marriage. Religious extremism in the U.S. appears to be strategically targeted to win elections, since party platforms diverge significantly, while policy outcomes like abortion rates are not affected by changes in the governing party. Given the high returns from attracting the median voter, why do vote-maximizing politicians veer off into extremism? In this paper, we find that strategic extremism depends on an important intensive margin where politicians want to induce their core constituents to vote (or make donations) and the ability to target political messages towards those core constituents. Our model predicts that the political relevance of religious issues is highest when around one-half of the voting population attends church regularly. Using data from across the world and within the U.S., we indeed find a non-monotonic relationship between religious extremism and religious attendance.

And here are my thoughts:

Brendan Nyhan (who arranged my fun visit to Duke's quantitative social science center in Feb) sent a bunch of references. I'm commenting on them here for convenience (easier than storing in my inbox!).

1. Cooperative game theory (looks at combinations of coalitions): a paper of Brandenburger and a syllabus of a course of Gilboa and Scarf.

2. NetLogo, a popular automaton simulation environment. This looks cool. I want to use something like this to do simulations to extend the ideas of this paper: Forming voting blocs and coalitions as a prisoner's dilemma: a possible theoretical explanation for political instability. (This is why I'm interested in item 1 above also.)

3. Computational and Mathematical Modeling in the Social Sciences, by Scott De Marchi: I ordered it, will report back. Brendan said I should read it because Scott's views on statistics are completely different from mine.

4. Fearon and Laitin's paper on civil wars, which is a controversial example of political methodology because they try to interpret zillions of regression coefficients at once. Also these supplementary tables.

5. Arthur Brooks's survey data on civic engagement and inequality, and Brendan's comment on Brooks's writings in this area. (I'd earlier noticed some of Brooks's interesting work on fertility differences between Democrats and Republicans and charitable giving.)

OK, I finished reading it and transcribing my thoughts. They're the equivalent of about 20 blog entries (or one long unpublishable article) but it seemed more convenient to just put them in one place.

As I noted earlier, reading the book with pen in hand jogged loose various thoughts. . . . The book is about unexpected events ("black swans") and the problems with statistical models such as the normal distribution that don't allow for these rarities. From a statistical point of view, let me say that multilevel models (often built from Gaussian components) can model various black swan behavior. In particular, self-similar models can be constructed by combining scaled pieces (such as wavelets or image components) and then assigning a probability distribution over the scalings, sort of like what is done in classical spectrum analysis of 1/f noise in time series. For some interesting discussion in the context of "texture models" for images, see the chapter by Yingnian Wu in my book with Xiao-Li on applied Bayesian modeling and causal inference. (Actually, I recommend this book more generally; it has lots of great chapters in it.)
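As a very rough illustration of how a model built from Gaussian pieces can produce extreme events, here is a two-level sketch: each observation gets its own scale, and is then drawn from a Gaussian with that scale. The lognormal distribution for the scales is my own choice for the example, not something from the book or from the texture-model chapter. The simulation compares how often the result lands more than 4 sample standard deviations from the mean, against a plain Gaussian.

```python
import math
import random

rng = random.Random(3)
n = 200_000

# Plain Gaussian draws.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Two-level model: a random scale for each unit (lognormal, an assumption),
# then a Gaussian draw with that scale.
mixture = [math.exp(rng.gauss(0.0, 1.0)) * rng.gauss(0.0, 1.0) for _ in range(n)]


def tail_fraction(xs, k=4.0):
    """Fraction of values more than k sample standard deviations from the mean."""
    m = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))
    return sum(abs(x - m) > k * sd for x in xs) / len(xs)


tail_gaussian = tail_fraction(gaussian)
tail_mixture = tail_fraction(mixture)
print(f"beyond 4 sd, plain Gaussian:  {tail_gaussian:.5f}")
print(f"beyond 4 sd, scale mixture:   {tail_mixture:.5f}")
```

Every individual component here is Gaussian; the fat tails come entirely from the distribution over scales.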

That said, I admit that my two books on statistical methods are almost entirely devoted to modeling "white swans." My only defense here is that Bayesian methods allow us to fully explore the implications of a model, the better to improve it when we find discrepancies with data. Just as a chicken is an egg's way of making another egg, Bayesian inference is just a theory's way of uncovering problems that can lead to a better theory. I firmly believe that what makes Bayesian inference really work is a willingness (if not eagerness) to check fit with data and abandon and improve models often.

More on black and white

My own career is white-swan-like in that I've put out lots of little papers, rather than pausing for a few years like that Fermat's last theorem guy. Years ago I remarked to my friend Seth that he's followed the opposite pattern: by abandoning the research-grant, paper-writing treadmill and devoting himself to self-experimentation, he basically was rolling the dice and going for the big score--in Taleb's terminology, going for that black swan.

On the other hand, you could say that in my career I'm following Taleb's investment advice--my faculty job gives me a "floor" so that I can work on whatever I want, which sometimes seems like something little but maybe can have unlimited potential. (On page 297, Taleb talks about standing above the rat race and the pecking order; I've tried to do so in my own work by avoiding a treadmill of needing associates to do the research to get the funding, and needing funding to pay people.)

In any case, I've had a boring sort of white-swan life, growing up in the suburbs, being in school continuously since I was 4 years old (and still in school now!). In contrast, Taleb seems to have been exposed to lots of black swans, both positive and negative, in his personal life.

Chapter 2 of The Black Swan has a (fictional) description of a novelist who labors in obscurity and then has an unexpected success. This somehow reminds me of how lucky I feel that I went to college when and where I did. I started college during an economic recession, and in general all of us at MIT just had the goal of getting a good job. Not striking it rich, just getting a solid job. Nobody I knew had any thought that it might be possible to get rich. It was before stock options, and nobody knew that there was this thing called "Wall Street." Which was fine. I worry that if I had gone to college ten years later, I would've felt a certain pressure to go get rich. Maybe that would've been fine, but I'm happy that it wasn't really an option.

95% confidence intervals can be irrelevant, or, living in the present

On page xviii, Taleb discusses problems with social scientists' summaries of uncertainty. This reminds me of something I sometimes tell political scientists about why I don't trust 95% intervals: A 95% interval is wrong 1 time out of 20. If you're studying U.S. presidential elections, it takes 80 years to have 20 elections. Enough changes in 80 years that I wouldn't expect any particular model to fit for such a long period anyway. (Mosteller and Wallace made a similar point in their Federalist Papers book about how they don't trust p-values less than 0.01 since there can always be unmodeled events. Saying p<0.01 is fine, but please please don't say p<0.00001 or whatever.)

More generally, people (or, at least, political commentators) often live so much in the present that they forget that things can change. An instructive example here is Richard Rovere's book on Goldwater's 1964 campaign. Rovere, a respected political writer, wrote that the U.S. had a one-and-a-half-party system, with the Democrats being the full party and the Republicans the half party. Yes, Goldwater lost big and, yes, the Democrats did have twice the number of Senators and twice the number of Representatives in Congress then--but, actually, from 1950 through 1990, the Republicans won or tied every Presidential election (except 1964). Hardly the performance of a half-party.

Knowing what you don't know, and omniscience is not omnipotence

The quotes on page xix remind me of one of my favorites: "It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so" (Mark Twain?). I actually prefer the version that says, "It's what you don't know you don't know that gets you into trouble." Also Earl Weaver's "It's what you learn after you know it all that counts."

On page xx, Taleb writes, "What you know cannot really hurt you." This doesn't sound right to me. Sometimes you know something bad is coming but you can't dodge it. For example, consider certain diseases.

Creativity is not (yet) algorithmic

On page xxi, Taleb says how almost no great discovery came from design and planning. This reminds me about a biography of Mark Twain that I read several years ago. Apparently, Twain was always trying to create a procedure--essentially, an algorithm--to produce literature. He tried various strategies, collaborators, etc., but nothing really worked. He just had to wait for inspiration and write what came to mind.

Also on page xxi, Taleb writes "we don't learn rules, just facts, and only facts." This statement would surprise linguists. It's been well demonstrated that kids learn language through rules (as can be seen, for example, from overgeneralizations such as "feets" and "teached"). More generally, folk science is strongly based on categories and natural kinds--I think Taleb is aware of this since he cites my sister's work in his references. (A recent example of naive categorization in folk science is in the papers of Satoshi Kanazawa.)

Recognition, prevention, and saltatory growth

On page xxiii, Taleb writes that "recognition can be quite a pump." Yes, but recall all those scientists whose lives were shortened by two years (on average) from frustration at not receiving the Nobel Prize!

On page xxiv, "few reward acts of prevention": I'm reminded of our health plan in grad school, which paid for catastrophic coverage but not routine dental work. A friend of mine actually had to get root canal, and eventually got the plan to pay for it, but not without a struggle.

On page 10, Taleb writes, "history does not crawl, it jumps." This reminds me of the evidence on saltatory growth in infants (basically, babies grow length by a jump every few days; they don't grow the same amount every day).

Aha

I was also reminded of the fractal nature of scientific revolutions--basically, at all scales (minutes, hours, days, months, years, decades, centuries, . . .), science seems to proceed by being derailed by unexpected "aha" moments. (Or, to pick up on Taleb's themes, I can anticipate that "aha" moments will occur, I just can't predict exactly when they will happen or what they will be.)

Liberals and conservatives

On page 16, Taleb asks "why those who favor allowing the elimination of a fetus in the mother's womb also oppose capital punishment" and "why those who accept abortion are supposed to be favorable to high taxation but against a strong military," etc. First off, let me chide Taleb for deterministic thinking. From the General Social Survey cumulative file, here's the crosstab of the responses to "Abortion if woman wants for any reason" and "Favor or oppose death penalty for murder":

40% supported abortion for any reason. Of these, 76% supported the death penalty.

60% did not support abortion under all conditions. Of these, 74% supported the death penalty.

This was the cumulative file, and I'm sure things have changed in recent years, and maybe I even made some mistake in the tabulation, but, in any case, the relation between views on these two issues is far from deterministic!
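As a rough check on the numbers above, the two conditional percentages can be combined into an overall rate and a gap between the groups. This is only arithmetic on the rounded figures quoted here; the variable names are mine, not GSS variable names.

```python
# Rounded figures quoted above (not recomputed from the GSS file).
p_abortion_any_reason = 0.40          # share supporting abortion for any reason
p_dp_given_support = 0.76             # death-penalty support within that group
p_dp_given_no_support = 0.74          # death-penalty support within the other 60%

# Overall death-penalty support is the weighted average of the two groups.
overall_dp_support = (
    p_abortion_any_reason * p_dp_given_support
    + (1 - p_abortion_any_reason) * p_dp_given_no_support
)

# A deterministic relation would put this gap near -1; here it is tiny
# and even has the "wrong" sign for the stereotype.
gap = p_dp_given_support - p_dp_given_no_support

print(f"overall support: {overall_dp_support:.3f}, gap: {gap:+.2f}")
```

With a gap of two percentage points, knowing someone's abortion view tells you almost nothing about their death-penalty view in these data.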

But getting back to the main question: I don't think it's such a mystery that various leftist views (allowing abortion, opposing capital punishment, supporting a graduated income tax, and reducing the military) are supposed to go together--nor is it a surprise that the opposite positions go together in a rightist worldview. Abortion is related to women's rights, which has been a leftist position for a long time. Similarly, conservatives have favored harsher punishments and liberals (to use the U.S. term) have favored milder punishments for a long time also. The graduated income tax favors the have-nots rather than the have-mores, and the military is generally a conservative institution. Other combinations of views are out there, but I don't agree with Taleb's claim that the left-right distinction is arbitrary.

Picking pennies in front of a steamroller

On page 19, Taleb refers to the usual investment strategy (which I suppose I actually use myself) as "picking pennies in front of a steamroller." That's a cute phrase; did he come up with it? I'm also reminded of the famous Martingale betting system. Several years ago in a university library I came across a charming book by Maxim (of gun fame) where he went through chapter after chapter demolishing the Martingale system. (For those who don't know, the Martingale system is to bet $1, then if you lose, bet $2, then if you lose, bet $4, etc. You're then guaranteed to win exactly $1--or lose your entire fortune. A sort of lottery in reverse, but an eternally popular "system.")
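Here is a minimal simulation sketch of the Martingale system as described. It assumes a fair even-money bet and a starting fortune of $1,023; both are my assumptions, chosen so that ten straight losses wipe out the bankroll exactly. The function name and parameters are made up for illustration.

```python
import random


def martingale_round(bankroll, p_win=0.5, rng=random):
    """Play one Martingale sequence: bet $1, double the stake after each loss.

    Returns the net change in wealth: +1 if a win arrives before the money
    runs out, otherwise minus everything lost so far.
    """
    stake = 1
    lost = 0
    while stake <= bankroll - lost:       # can we still cover the next bet?
        if rng.random() < p_win:
            return stake - lost           # always exactly +1
        lost += stake
        stake *= 2
    return -lost                          # could not cover the next doubling


if __name__ == "__main__":
    rng = random.Random(1)
    results = [martingale_round(1023, rng=rng) for _ in range(100_000)]
    busts = sum(r < 0 for r in results)
    print(f"busts: {busts} of {len(results)}, net: {sum(results)}")
```

Under these assumptions the bust probability per sequence is 1/1024, and the expected gain is (1023/1024)(1) - (1/1024)(1023) = 0: lots of pennies, one steamroller.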

Throughout, Taleb talks about forecasters who aren't so good at forecasting, picking pennies in front of steamrollers, etc. I imagine much of this can be explained by incentives. For example, those Long-Term Capital guys made tons of money, then when their system failed, I assume they didn't actually go broke. They have an incentive to ignore those black swans, since others will pick up the tab when they fail (sort of like FEMA pays for those beachfront houses in Florida). It reminds me of the saying that I heard once (referring to Donald Trump, I believe) that what matters is not your net worth (assets minus liabilities), but the absolute value of your net worth. Being in debt for $10 million and thus being "too big to fail" is (almost) equivalent to having $10 million in the bank.

The discussion on page 112 of how Ralph Nader saved lives (mostly via seat belts in cars) reminds me of his car-bumper campaign in the 1970s. My dad subscribed to Consumer Reports then (he still does, actually, and I think reads it for pleasure--it must be one of those Depression-mentality things), and at one point they were pushing heavily for the 5-mph bumpers. Apparently there was some federal regulation about how strong car bumpers had to be, to withstand a crash of 2.5 miles per hour, or 5 miles per hour, or whatever--the standard had been 2.5 (I think), then got raised to 5, then lowered back to 2.5, and Consumers Union calculated (reasonably correctly, no doubt) that the 5 mph standard would, in the net, save drivers money. I naively assumed that CU was right on this. But, looking at it now, I would strongly oppose the 5 mph standard. In fact, I'd support a law forbidding such sturdy bumpers. Why? Because, as a pedestrian and cyclist, I don't want drivers to have that sense of security. I'd rather they be scared of fender-benders and, as a consequence, stay away from me! Anyway, the point here is not to debate auto safety; it's just an interesting example of how my own views have changed. Another example of incentives.

Three levels of conversation, or, why lunch at the faculty club might (sometimes) be more interesting than hanging out with chair-throwing traders

On page 21, Taleb compares the excitement of chair-throwing stock traders to "lunches in a drab university cafeteria with gentle-minded professors discussing the latest departmental intrigue." This reminds me of a distinction I came up with once when talking with Dave Krantz, the idea of three levels of conversation. Level 1 is personal: spouse, kids, favorite foods, friends, gossip, etc. Level 2 is "departmental intrigue," who's doing what job, getting person X to do thing Y, how to get money for Z--basically, level 2 is all about money. Level 3 is impersonal things: politics, sports, research, deep thoughts, etc. When talking with Dave, I resolved to minimize level 2 conversation and focus on the far more important (and interesting) levels 1 and 3. Level 2 topics have an immediacy which puts them on the top of the conversational stack, which is why I made the special effort to put them aside. Anyway, it struck me in reading page 21 of Taleb's book that chair-throwing stock traders have much more interesting level 2 conversations (compared with professors or even grad students), and quite possibly they have better level 1 conversations also--but I'd hope that the level 3 conversations at the university are more interesting. Being on campus, I'm used to having all sorts of good level 3 conversations, but I find these harder to come by in other settings. Probably it's nothing to do with the depth of these other people, just that I find it easier to get into a good conversational groove with people at the university. In any case, I try (not always successfully) to keep conversations away from "the latest departmental intrigue."

Riding the escalator to the stairmaster

The story on page 54 about the people who ride the escalator to the Stairmasters reminds me that, where I used to work, there was a guy who carried his bike up the stairs to the 4th floor. This always irritated me because it set an unfollowable example. For instance, one day I was on the elevator (taking my bike to the 3rd floor) and some guy asked me, "You ride your bike for the exercise. Why don't you take the stairs?" (I replied that I don't ride my bike for the exercise.)

Confirmation bias, or, shouldn't I be reading an astrology book?

Around pages 58-59, Taleb talks about confirmation bias and recommends that we look for counterexamples to our theories. I certainly agree with this and do it all the time in my research. But what about other aspects of life? For example, I was reading The Black Swan, which I knew ahead of time would contain lots of information that I already agreed with. Should I instead read a book on astrology? In practice, I'm sure this would just confirm my (true) suspicion that astrology is false, so I'm kinda stuck.

Rare events and selection bias

The footnote on 61 reminded me of a talk I saw a couple years ago where it was said that NYC is expected to have a devastating earthquake some time in the next 2000 years.

On page 77, Taleb says that lottery players treat odds of one in a thousand and one in a million almost the same way. But . . . when they try making lottery odds lower (for example, changing from "pick 6 out of 42" to "pick 6 out of 48"), people do respond by playing less (unless the payoffs are appropriately increased). I attribute this not to savvy probability reasoning but to a human desire not to be ripped off.
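To see how big that change in odds is, here is a quick count of the possible tickets in each game, assuming the jackpot requires matching all 6 numbers and order doesn't matter (the usual rule, but an assumption on my part).

```python
from math import comb  # available in Python 3.8+

# Number of distinct unordered 6-number tickets in each game.
tickets_42 = comb(42, 6)   # 5,245,786
tickets_48 = comb(48, 6)   # 12,271,512

# How many times less likely a single ticket is to hit the jackpot.
odds_ratio = tickets_48 / tickets_42

print(f"6 of 42: 1 in {tickets_42:,}")
print(f"6 of 48: 1 in {tickets_48:,}")
print(f"jackpot is about {odds_ratio:.2f} times harder to hit")
```

So adding just six numbers makes the jackpot more than twice as hard to win, which is a big enough change for players to feel cheated if the payoff stays the same.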

On page 102 and following, Taleb discusses selection bias. I also recommend the article by Howard Wainer et al. (A Selection of Selection Anomalies); Deb Nolan and I also have a few in our Teaching Statistics book.

Then, on page 126, Taleb describes a conference he attended where his "first surprise was to discover that the military people there thought, behaved, and acted like philosophers [in the good sense of the word] . . . They thought out of the box, like traders, except much better and without fear of introspection." He goes on to discuss why military officers are such good skeptical thinkers. But this seems like a clear case of selection bias! The military officers who come to an academic symposium are probably an unusual bunch.

Losers lie

On pages 118-119, there's a discussion of how someone with a winning streak in life can think it's skill, even if it's just luck and selection (that the losers don't get observed). I'd like to add another explanation, which is that people lie. Someone who tells you he won ten straight times probably actually won ten times out of fifteen. Someone who tells you he broke even probably is a big loser. Etc.

On page 125, Taleb explains why the Fat Tonys get more Nobel Prizes in medicine than the Dr. Johns. I don't know if this is really true, but if it is, I might attribute it to the Tonys' better social skills (i.e., helping others be happy and getting people to do what they want) more than their better ability to assess uncertainty.

Of fights and coin flips

On pages 127-128, Taleb discusses the distinction between uncertainty and randomness (in my terminology, the boxer, the wrestler, and the coin flip). I'd only point out that coins and dice, while maybe not realistic representations of many sources of real-world uncertainty, do provide useful calibration. Similarly, actual objects rarely resemble "the meter" (that famous metal bar that sits, or used to sit, in Paris), but it's helpful to have an agreed-upon length scale. We have some examples in Chapter 1 of Bayesian Data Analysis of assigning probabilities empirically (for football scores and record linkage).

Also, as discussed in our Teaching Statistics book, when teaching probability I prefer to use actual random events (e.g., sex of births) rather than artificial examples such as craps, roulette, etc., which are full of technical details (e.g., what's the probability of spinning a "00") that are dead-ends with no connection to any other areas of inquiry. In contrast, thinking about sex of births leads to lots of interesting probabilistic, biological, combinatorial, and evolutionary directions.

Overconfidence as the side effect of communication goals

On page 14, Taleb discusses overconfidence (as in the pathbreaking Alpert and Raiffa study). As we teach in decision theory, there's actually an easy way to make sure that your 95% intervals are calibrated. Just apply the following rule: Every time someone asks you to make a decision, spin a spinner that has a 95% chance of returning the interval (-infinity, infinity), and a 5% chance of returning the empty set. You will be perfectly calibrated (on average). The intervals are useless, however, which points toward the fact that when people ask you for an interval, you're inclined (for Gricean reasons if no other) to provide some information. According to Dave Krantz, much of overconfidence of probability statements can be explained by this tension between the goals of informativeness and calibration.
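The spinner rule is easy to simulate. In this sketch the "true values" are arbitrary random numbers (an assumption purely for illustration; the rule never looks at them), and the empty set is represented by None.

```python
import math
import random


def spinner_interval(rng):
    """Return (-inf, inf) with probability 0.95, else None for the empty set."""
    if rng.random() < 0.95:
        return (-math.inf, math.inf)
    return None


def covers(interval, truth):
    """True if the interval is non-empty and contains the true value."""
    return interval is not None and interval[0] < truth < interval[1]


def coverage(n_questions, seed=0):
    """Fraction of questions whose spinner interval contains the truth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_questions):
        truth = rng.gauss(0, 100)         # arbitrary unknown quantity
        hits += covers(spinner_interval(rng), truth)
    return hits / n_questions


if __name__ == "__main__":
    print(f"empirical coverage: {coverage(100_000):.3f}")
```

The coverage comes out near 95% no matter what the questions are, which is exactly why calibration alone is not a sufficient criterion for a good interval.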

On page 145, Taleb discusses the fallacy of assuming that "more is better." A lot depends here on the statistical model you're using (or implicitly using). With least squares, overfitting is a real concern. Less so in Bayesian inference, but still it comes up with noninformative prior distributions. An important--the important--topic in Bayesian statistics is the construction of structured prior distributions that let the data speak but at the same time don't get overwhelmed by a flood of data.

Of taxonomies and lynx

In the discussion of Mandelbrot's work on page 269, I'd also mention his models for taxonomies, which have a simple self-similar structure without the complexities of the more familiar spatial examples. Also, the story about the problems of Gaussian models reminds me of Cavan Reilly's chapter in this book, where he fits a simple predator-prey model with about 3 parameters to the famous Canadian lynx data and gets much better predictions than the standard 11-parameter Gaussian time series models that are usually fit to those data.

Buzzwords

On page 278, Taleb rants against statistical buzzwords such as standard deviation and correlation, and financial buzzwords such as risk. This reminds me of my rant against the misunderstood concept of "risk aversion." I have to write this up fully sometime, but some of my rant is here.

It's all over but the compartmentalizin'

On page 288, Taleb discusses people who compartmentalize their intellectual lives, for example the philosopher who was a trader but didn't use his trading experiences to inform his philosophy. I noticed a similar thing about some of my colleagues where I used to teach in the statistics department at Berkeley. On the one hand, they were extremely theoretical, using advanced mathematics to prove very subtle things in probability theory, often things (such as the strong law of large numbers) that had little if any practical import. But when they did applied work, they threw all this out the window--they were so afraid of using probability models that they would often resort to very crude statistical methods.

I'm only a statistician from 9 to 5

I try (and mostly succeed, I think) to have some unity in my professional life, developing theory that is relevant to my applied work. I have to admit, however, that after hours I'm like every other citizen. I trust my doctor and dentist completely, and I'll invest my money wherever the conventional wisdom tells me to (just like the people whom Taleb disparages on page 290 of his book).

Miscellaneous sociological thoughts

Taleb's comment on page 155 about economics being the most insular of fields reminds me of this story of the economist who said that economists are different than "anthropologists, sociologists, and public health officials" because economists believe that "everyone is fundamentally alike" [except, of course, for anthropologists, etc.]. Economists often do seem pretty credulous of arguments presented by other economists!

The reference on page 158 to dentists reminded me of the dentists named Dennis.

On page 166, Taleb disparages plans. But plans can be helpful, no? Even if they don't work out. It usually seems to me that even a poor plan (if recognized as tentative) is better than no plan at all.

The discussion on page 171 of predicting predictions reminds me of the paradox, of sorts, that opinion polls shift predictably during presidential nominating conventions (for evidence, see here, for example), even though conventions are very conventional events, and so one's shift in views should be (on average) anticipated.

On pages 174-175, Taleb commends Poincare for not wasting time finding typos. For me, though, typo-finding is pleasant. Although I am reminded of the expression, "there's no end to the amount of work you can put into a project after it's done."

The graphs on pages 186-187 have that ugly Excel look, with unnecessary horizontal lines and weirdly labeled y-axes. In any case, they remind me of the game of "scatterplot charades" that I sometimes enjoy playing with a statistics class. The game goes as follows: someone displays a scatterplot--just the points, nothing more--and everyone tries to guess what's being plotted. Then more and more of the graph is revealed--first the axis numbers, then the axis labels--until people figure it out.

I'm a little puzzled by Taleb's claim, at the end of page 193, that "to these people amused by the apes, the idea of a being who would look down on them the way they look down on the apes cannot immediately come to their minds." I'm amused by apes but can imagine such a superior being who would be amused by me. Why not?

On page 196, Taleb writes, "a single butterfly flapping its wings in New Delhi may be the certain cause of a hurricane in North Carolina . . ." No--there is no "the cause" (let alone, "the certain cause"). Presumably another butterfly somewhere else could've moved the hurricane away.

Page 198: the chance of a girl birth is 48.5%, not 50%.

On page 209, Taleb writes, "work hard, not in grunt work . . .". I have mixed feelings here. On one hand, yes, grunt work can distract from the big projects. For example, I'm blogging and writing lots of little papers each year instead of attacking the big questions. On the other hand, these little projects are the way I get insight into the big questions. Getting in down and dirty, playing with the data and writing code, is a way that I learn.

The mention on page 210 of Pascal's wager reminds me of the fallacy of the one-sided bet. I'm hoping that now that this fallacy has been named, people will notice it and avoid it on occasion.

The discussion on page 222 of capitalism, socialism, and attribution errors reminds me of the saying that everybody wants socialism for themselves and capitalism for everybody else (and there's nothing more fun than spending other people's money).

The discussion on the following page of the long tail reminds me of the conjecture about the "fat head" of mega-consumers.

The footnote on page 224 about book reviews reminds me of a general phenomenon which is that different reviews of the same book tend to have almost the exact same information. This becomes really clear if you look up a bunch of reviews on Nexis, for example. It can be frustrating, because for a book I like, I'd be interested in seeing lots of different perspectives. In contrast, on the web the implicit rules haven't been defined yet, so there's more diversity (as in this non-review right here, or in these comments on Indecision).

The comments on page 231 on the Gaussian distribution remind me of this story where even Galton got confused about the tails of the distribution as applied to human height.

On page 240, Taleb writes that Gauss, in using the normal distribution, "was a mathematician dealing with a theoretical point, not making claims about the structure of reality like statistical-minded scientists." I don't have my Stigler right here, but I'd always understood that Gauss developed least squares and the normal distribution in the context of fitting curves to astronomical observations. Sure he did lots of pure math, but he (and Laplace) were doing empirical science too.

I like Galileo's quote on page 257, "The great book of Nature lies ever open before our eyes and the true philosophy is written in it. . . . But we cannot read it unless we have first learned the language and the characters in which it is written. . . . It is written in mathematical language and the characters are triangles, circles and other geometric figures." As Taleb writes, "Was Galileo legally blind?" Actual nature is not full of triangles etc., it's full of clouds, mountains, trees, and other fractal shapes. But these shapes not having names or formulas, Galileo couldn't think of them. He chose the natural kind that was closest to hand. En el pais de los ciegos, etc.

On page 261, Taleb writes that in the past 44 years, "nothing has happened in economics and social science statistics except for some cosmetic fiddling." I'd disagree with that. True, I'm sure you could find antecedents of any current method in papers that were written before 1963, but I think that developing methods that work on complex problems is a contribution in itself. There's certainly a lot we can do now that couldn't be done very easily 44 years ago.

Reading with pen in hand

To conclude: it's fun (but work) to read a book manuscript with pen in hand. Also liberating that the book is already coming out, so instead of scanning for typos or whatever, I can just write down whatever ideas pop up.

P.S. Here are my thoughts on Taleb's previous book.

Networks Course Blog

| 1 Comment

This Cornell Info 204 - Networks class blog is exemplary. It seems that the students scour around for information related to the class and post entries - that are then reviewed by (and commented by) the class and the rest of the world. This is a much better model for an ideas-history-and-paradigms class than handing in homework and essays. I've browsed around it, and the postings describe the myriad applications of network-oriented thinking we can see around.

Here is an interactive visualization of Election & Public Opinion by PIIM. It's an interactive display of Red / Blue states. Election data goes all the way back to 1789, the first presidential election.
PIIM_Vote.png

This application will familiarize you with the voting process of the United States. Explore how public opinion and "creative democracy" has such a persuasive effect on the country; and how just a handful of votes may cause significant impact.
Historical background, the current voting process, and informative visualization of every major election are available. The Issue and policy tools permit some creative "What if" experiments in redrawing an election based on subtle alternations to historical outcomes.

mcmcsamp() and mcsamp()

| No Comments

Wildlife biologist Wayne Hallstrom writes,

Boris pointed me to this paper by Matthew Gentzkow and Jesse Shapiro. Here's the abstract:

We [Gentzkow and Shapiro] construct a new index of media slant that measures whether a news outlet's language is more similar to a congressional Republican or Democrat. We apply the measure to study the market forces that determine political content in the news. We estimate a model of newspaper demand that incorporates slant explicitly, estimate the slant that would be chosen if newspapers independently maximized their own profits, and compare these ideal points with firms' actual choices. Our analysis confirms an economically significant demand for news slanted toward one's own political ideology. Firms respond strongly to consumer preferences, which account for roughly 20 percent of the variation in measured slant in our sample. By contrast, the identity of a newspaper's owner explains far less of the variation in slant, and we find little evidence that media conglomerates homogenize news to minimize fixed costs in the production of content.

It appears that newspapers are more liberal in liberal cities and more conservative in conservative cities.

Wolfram Schlenker of our economics department is presenting this paper by himself and Michael Roberts on the effects of climate change. The talk is this Thursday, 11:30-1, in 717 IAB. Here's the abstract:

Taylor Branch has a fascinating article in the New York Review of Books on the Bonus Army (the gathering of WW1 veterans in Washington in 1932) and the G.I. Bill, which paid for millions of college educations and mortgages for WW2 veterans. I knew about Herbert Hoover and the Bonus Army but I didn't realize that Roosevelt later said no to them too or that "'Opposition to the bonus,' Arthur Schlesinger Jr. recalled, 'was one of the virtuous issues of the day.'" Or that the press referred to work camps for veterans as "playgrounds for derelicts" who were "shell-shocked, whisky-shocked and depression-shocked." Or that a major motivation for the G.I. Bill was to avoid similar political controversies, or that Martin Luther King was modeling his last campaign on the Bonus Army. There are also some political issues that Branch touches briefly upon, such as the ambiguous role of the American Legion in the politics of the time, and the current status of soldiers and veterans in U.S. politics.

Seth writes,

One of the first managing editors of The New Yorker had a slogan: “Don’t get it right, get it written”. My philosophy with regard to the Shangri-La Diet was similar: “don’t get it exactly right, get it written, and get feedback.”

n = 35

| 2 Comments

Ronggui Huang from the Department of Sociology at Fudan University writes,

Recently, my mentor and I have collected data in about 35 neighborhoods, and we survey 30 residents in each neighborhood. I would like to study the effects of neighborhood-level characteristics, so after data collection, I aggregate the data to neighborhood-level. In other words, I have just 35 sample points. With such a small sample size (35 neighborhoods), what statistical methods can I use to analyse the data? It seems that most of the statistical methods are based on large sample theory.

My quick answer is that, from the standpoint of classical statistical theory, 35 is a large sample! You could also do a multilevel model if you want. But I'd be careful about the causal interpretations (you wrote "effects" above)--you're probably limited on what you can learn causally unless you can frame what you're doing as a "natural experiment" (for a start, see chapters 9 and 10 of our new book).

P.S. I imagine things have changed quite a bit at Fudan in the years since Xiao-Li was there.

Multiple-authored papers

| 6 Comments

A quick search on Google Scholar found that all ten of my most cited papers have multiple authors. Looking up the top ten most cited papers from some of the other tenured faculty in our statistics department: Shaw-Hwa Lo (9/10 have multiple authors), Zhiliang Ying (9/10), Daniel Rabinowitz (9/10), Ioannis Karatzas (8/10), Victor de la Pena (7/10). (I tried to look up Chris Heyde also, but Google Scholar kept coming up with articles referring to him rather than articles by him.) Victor and Ioannis are probabilists--their work is closer to pure math so perhaps it makes sense that their single-authored papers are (relatively) more prominent.

Anyway, I think it's an important point, since it's easy to undervalue multiple-authored work by diluting the credit among all the authors.

Here's an interesting problem involving the time interval between cougar "kills"...meaning cougars killing prey, not cougars being killed. (By the way, "cougar" is synonymous with "mountain lion", "catamount", and "puma". Same animal.) The data I'll discuss below were collected by Polly Buotte and other researchers guided by Toni Ruth of the Selway Institute, funded by the Hornocker Wildlife Institute and Wildlife Conservation Society.

Cougars in and around Yellowstone National Park are monitored in two ways. Researchers try to put a radio collar on every adult cougar; there are typically about a dozen adult cougars in the park.

Most of the collars used, now and historically, are old-style radiotelemetry collars. These emit a periodic signal that can be used, through triangulation, to determine the approximate location of the animal (spatial error less than 100m). More recently, some of the collars are GPS collars that report the exact location of the animal every three hours. The GPS collars, a new technology, are expensive, relatively short-lived, and somewhat failure-prone.

One of the issues of interest to researchers is the statistical distribution of intervals between kills, called the "inter-kill interval" or IKI. A specific question of interest is the extent to which the IKI distribution has changed due to the reintroduction of wolves to Yellowstone. Some change might be expected because (1) wolves sometimes steal a cougar's kill before the cougar is done with it, so the cougar might have to kill more frequently to make up for the lost meat, and (2) prey availability might change, as prey change their behavior to try to avoid areas favored by wolves, thus possibly changing the types of prey available to cougars or their density in cougar habitat.

In addition to the statistical distribution of IKI overall and its change since the reintroduction of wolves, a related question of interest is how the IKI differs for different "social classes" of cougars, where "social class" distinguishes adult female, adult male, or maternal female (i.e. female with cubs).

Based on the radio collar data, 121 IKIs were determined for 11 cougars over 8 years. The following figure shows the IKI data for the three social classes, as determined by the two different methods (GPS and "ground").

IKIhists.png

With the help of the radio collars, researchers have tried to characterize every cougar kill made by certain cougars during certain time periods. "Characterizing" the kill means determining the date, time, and location of the kill and the type of animal killed: a large bighorn sheep, a young elk, and so on. For the standard telemetry collars, this involves using the collar to track the cougar's movements; a researcher essentially tracks the cougar every day (without disturbing its behavior), searches locations the day after the cat leaves, and locates the carcass from each kill. This method, which we refer to below as the "ground" method, is very labor-intensive. By contrast, with the GPS collars, the researcher compiles a list of the locations where the cougar spent a substantial amount of time, and visits each of those locations to characterize the kill. (Cougars usually stay on or near a kill for at least 3 days, unless driven off, and are rarely stationary for that long unless they have made a kill.) This method (the "GPS" method) is much less time-intensive because the researcher can proceed from kill location to kill location rather than following the cougar.

Range voting

| 7 Comments

I have come across the Range Voting website. The basic idea is to allow voters to express their preferences by scoring candidates on a scale from 0 to 100. The winner is then the candidate with the least Bayesian regret, or the highest average score.

rangevoting.png

I guess this system would make it harder for radical candidates to win, and it would give an edge to those that would try to address everyone (although polarized voters would still only use 100 or 0). It might even have a lower cognitive cost in voting, as it doesn't require the voter to make the choice, but merely to assign grades to those candidates one is familiar with.
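
The highest-average-score rule is easy to sketch. The ballots below are invented, the function name is my own, and I am assuming that a candidate a voter leaves unscored is simply skipped for that voter; other conventions for blanks are possible.

```python
def range_voting_winner(ballots):
    """ballots: list of dicts mapping candidate -> score in 0..100.

    Candidates missing from a ballot are not counted for that voter
    (an assumption made for this sketch).
    """
    totals, counts = {}, {}
    for ballot in ballots:
        for candidate, score in ballot.items():
            if not 0 <= score <= 100:
                raise ValueError(f"score out of range: {score}")
            totals[candidate] = totals.get(candidate, 0) + score
            counts[candidate] = counts.get(candidate, 0) + 1
    averages = {c: totals[c] / counts[c] for c in totals}
    winner = max(averages, key=averages.get)
    return winner, averages

# Hypothetical ballots: A and C are polarizing, B is broadly acceptable.
ballots = [
    {"A": 100, "B": 60, "C": 0},
    {"A": 0, "B": 70, "C": 100},
    {"A": 100, "B": 65},
    {"B": 80, "C": 100},
]
winner, averages = range_voting_winner(ballots)
print(winner, averages)
```

In this toy example B wins on average score even though no voter gives B a 100, which is the "edge to those that would try to address everyone" effect.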

The website has a good collection of descriptions of other voting models; I've enjoyed this voting-with-money scheme. For those that want to dig deeper, there is a very good system of pages on Wikipedia.

Pseudo-failures to replicate

| 1 Comment

Prakash Gorroochurn from our biostat dept wrote this paper discussing the fact that, even if a study finds statistical significance, its replication might not be statistically significant--even if the underlying effect is real.

This is an important point, which can also be understood using the usual rule of thumb that to have 80% power at the 95% significance level, your true effect size needs to be 2.8 se's from zero. Thus, if you have a result that's barely statistically significant (2 se's from zero), it's likely that the true effect is less than 2.8 se's from zero, and so you shouldn't be so sure you'll see a statistically significant replication. As Kahneman and Tversky found, however, our intuitions lead us to (wrongly) expect replication of statistical significance.
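
Here is a small sketch of the arithmetic behind the 2.8 rule, assuming a normally distributed estimate with known standard error and a two-sided test (the function name is my own). The 2.8 is just 1.96, the significance threshold, plus 0.84, the 80th percentile of the standard normal.

```python
from statistics import NormalDist

def power(effect_in_se: float, alpha: float = 0.05) -> float:
    """Probability that a two-sided z-test at level alpha comes out significant,
    when the true effect is `effect_in_se` standard errors from zero."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    # Significant if the estimate lands above +crit or below -crit.
    return (1 - z.cdf(crit - effect_in_se)) + z.cdf(-crit - effect_in_se)

# 1.96 (threshold) + 0.84 (80th percentile of the standard normal)
rule_of_thumb = NormalDist().inv_cdf(0.975) + NormalDist().inv_cdf(0.80)

print(round(rule_of_thumb, 2))  # the "2.8" in the rule
print(round(power(2.8), 2))     # power when the true effect is 2.8 se's
print(round(power(2.0), 2))     # power when the true effect is only 2 se's
```

If the true effect were exactly 2 se's from zero, the chance of a statistically significant replication with the same standard error would be only about one half.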

Prakash's paper is also related to our point about the difference between significance and non-significance.

Galin Jones sent me this paper (by James Flegal, Murali Haran, and himself) which he said started with a suggestion I once made to him long ago. That's pretty cool! Here's the abstract:

Current reporting of results based on Markov chain Monte Carlo computations could be improved. In particular, a measure of the accuracy of the resulting estimates is rarely reported in the literature. Thus the reader has little ability to objectively assess the quality of the reported estimates. This paper is an attempt to address this issue in that we discuss why Monte Carlo standard errors are important, how they can be easily calculated in Markov chain Monte Carlo and how they can be used to decide when to stop the simulation. We compare their use to a popular alternative in the context of two examples.

This is a clear paper with some interesting results. My main suggestion is to distinguish two goals: estimating a parameter in a model and estimating an expectation. To use Bayesian notation, if we have simulations theta_1,...,theta_L from a posterior distribution p(theta|y), the two goals are estimating theta or estimating E(theta|y). (Assume for simplicity here that theta is a scalar, or a scalar summary of a vector parameter.)

Inference for theta or inference for E(theta)

When the goal is to estimate theta, then all you really need is to estimate theta to more accuracy than its standard error (in Bayesian terms, its posterior standard deviation). For example, if a parameter is estimated at 3.5 +/- 1.2, that's fine. There's no point in knowing that the posterior mean is 3.538. To put it another way, as we draw more simulations, we can estimate that "3.538" more precisely--our standard error on E(theta|y) will approach zero--but that 1.2 ain't going down much. The standard error on theta (that is, sd(theta|y)) is what it is.
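
A quick simulation shows the distinction. This is only a sketch: I am assuming independent draws from a normal distribution with mean 3.5 and sd 1.2 as a stand-in for the posterior, so the Monte Carlo standard error of the mean is sd/sqrt(L). Real MCMC draws are autocorrelated, so an honest Monte Carlo standard error would be larger and would need something like batch means or an effective sample size.

```python
import random
import statistics

def summarize(draws):
    """Return (estimate of E(theta|y), estimate of sd(theta|y), Monte Carlo se).

    Assumes independent draws, so the Monte Carlo se is sd / sqrt(L).
    """
    L = len(draws)
    mean = statistics.fmean(draws)
    sd = statistics.stdev(draws)
    mcse = sd / L ** 0.5
    return mean, sd, mcse

rng = random.Random(0)  # fixed seed so the run is reproducible
results = {}
for L in (100, 10_000, 100_000):
    # Stand-in for posterior simulations theta_1, ..., theta_L
    draws = [rng.gauss(3.5, 1.2) for _ in range(L)]
    results[L] = summarize(draws)
    mean, sd, mcse = results[L]
    print(f"L={L:>7}  mean={mean:.3f}  sd={sd:.2f}  mcse={mcse:.4f}")
```

As L grows, the Monte Carlo se column shrinks toward zero while the sd column stays near 1.2.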
