6 links


The Browser asked me to recommend 6 articles for their readers. Here's what I came up with. I really wanted to link to this one but it wouldn't mean much to people who don't know New York. I also recommended this (if you'll forgive my reference to bowling), but I think it was too much of a primary source for their taste.

Dave Backus points me to this review by anthropologist Mike McGovern of two books by economist Paul Collier on the politics of economic development in Africa. My first reaction was that this was interesting but non-statistical so I'd have to either post it on the sister blog or wait until the 30 days of statistics was over. But then I looked more carefully and realized that this discussion is very relevant to applied statistics.

Here's McGovern's substantive critique:

Much of the fundamental intellectual work in Collier's analyses is, in fact, ethnographic. Because it is not done very self-consciously and takes place within a larger econometric rhetoric in which such forms of knowledge are dismissed as "subjective" or worse still biased by the political (read "leftist") agendas of the academics who create them, it is often ethnography of a low quality. . . .

Despite the adoption of a Naipaulian unsentimental-dispatches-from-the-trenches rhetoric, the story told in Collier's two books is in the end a morality tale. The tale is about those countries and individuals with the gumption to pull themselves up by their bootstraps or the courage to speak truth to power, and those power-drunk bottom billion elites, toadying sycophants, and soft-hearted academics too blinded by misplaced utopian dreams to recognize the real causes of economic stagnation and civil war. By insisting on the credo of "just the facts, ma'am," the books introduce many of their key analytical moves on the sly, or via anecdote. . . . This is one explanation of how he comes to the point of effectively arguing for an international regime that would chastise undemocratic leaders by inviting their armies to oust them--a proposal that overestimates the virtuousness of rich countries (and poor countries' armies) while it ignores many other potential sources of political change . . .

My [McGovern's] aim in this essay is not to demolish Collier's important work, nor to call into question development economics or the use of statistics. . . . But the rhetorical tics of Collier's books deserve some attention. . . . if his European and North American audiences are so deeply (and, it would seem, so easily) misled, why is he quick to presume that the "bottom billion" are rational actors? Mightn't they, too, be resistant to the good sense purveyed by economists and other demystifiers?

Now to the statistical modeling, causal inference, and social science. McGovern writes of Collier (and other quantitatively-minded researchers):

Portions of the two books draw on Collier's academic articles to show one or several intriguing correlations. Having run a series of regressions, he identifies counterintuitive findings . . . However, his analysis is typically a two-step process. First, he states the correlation, and then, he suggests an explanation of what the causal process might be. . . . Much of the intellectual heavy lifting in these books is in fact done at the level of implication or commonsense guessing.

This pattern (of which McGovern gives several convincing examples) is what statistician Kaiser Fung calls story time--that pivot from the quantitative finding to the speculative explanation. My favorite recent example remains the claim that "a raise won't make you work harder." As with McGovern's example, the "story time" hypothesis there may very well be true (under some circumstances), but the statistical evidence doesn't come close to proving the claim or even convincing me of its basic truth.

The story of story time

But story time can't be avoided. On one hand, there are real questions to be answered and real decisions to be made in development economics (and elsewhere), and researchers and policymakers can't simply sit still and say they can't do anything because the data aren't fully persuasive. (Remember the first principle of decision analysis: Not making a decision is itself a decision.)

From the other direction, once you have an interesting quantitative finding, of course you want to understand it, and it makes sense to use all your storytelling skills here. The challenge is to go back and forth between the storytelling and the data. You find some interesting result (perhaps an observational data summary, perhaps an analysis of an experiment or natural experiment); this motivates a story, which in turn suggests some new hypotheses to be studied. Yu-Sung and I were just talking about this today in regard to our article on public opinion about school vouchers.

The question is: How do quantitative analysis and story time fit into the big picture? Mike McGovern writes that he wishes Paul Collier had been more modest in his causal claims, presenting his quantitative findings as "intriguing and counterintuitive correlations" and frankly recognizing that exploration of these correlations requires real-world understanding, not just the rhetoric of hard-headed empiricism.

I agree completely with McGovern--and I endeavor to follow this sort of modesty in presenting the implications of my own applied work--and I think it's a starting point for Collier and others. Once they recognize that, indeed, they are in story time, they can think harder about the empirical implications of their stories.

The trap of "identifiability"

As Ole Rogeberg writes (following up on ideas of James Heckman and others), the search for clean identification strategies in social research can be a trap, in that it can result in precise but irrelevant findings tied to broad but unsupported claims. Rogeberg has a theoretical model explaining how economists can be so rigorous in parts of their analysis and so unrigorous in others. Rogeberg sounds very much like McGovern when he writes:

The puzzle that we try to explain is this frequent disconnect between high-quality, sophisticated work in some dimensions, and almost incompetently argued claims about the real world on the other.

The virtue of description

Descriptive statistics is not just for losers. There is value in revealing patterns in observational data, correlations or predictions that were not known before. For example, political scientists were able to forecast presidential election outcomes using information available months ahead of time. This has implications about political campaigns--and no causal identification strategy was needed. Countries with United Nations peacekeeping take longer, on average, to revert to civil war, compared to similarly-situated countries without peacekeeping. A fact worth knowing, even before the storytelling starts. (Here's the link, which happens to also include another swipe at Paul Collier, this time from Bill Easterly.)

I'm not convinced by every correlation I see. For example, there was this claim that warming increases the risk of civil war in Africa. As I wrote at the time, I wanted to see the time series and the scatterplot. A key principle in applied statistics is that you should be able to connect the raw data, your model, your methods, and your conclusions.

The role of models

In a discussion of McGovern's article, Chris Blattman writes:

Economists often take their models too seriously, and too far. Unfortunately, no one else takes them seriously enough. In social science, models are like maps; they are useful precisely because they don't explain the world exactly as it is, in all its gory detail. Economic theory and statistical evidence don't try to fit every case, but rather find systematic tendencies. We go wrong to ignore these regularities, but we also go wrong to ignore the other forces at work--especially the ones not so easily modeled with the mathematical tools at hand.

I generally agree with what Chris writes, but here I think he's a bit off by taking statistical evidence and throwing it in the same category as economic theory and models. My take-away from McGovern is that the statistical evidence of Collier et al. is fine; the problem is with the economic models which are used to extrapolate from the evidence to the policy recommendations. I'm sure Chris is right that economic models can be useful in forming and testing statistical hypotheses, but I think the evidence can commonly be assessed on its own terms. (This is related to my trick of understanding instrumental variables by directly summarizing the effect of the instrument on the treatment and the outcome without taking the next step and dividing the coefficients.)
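That trick can be sketched with simulated data: estimate the instrument's effect on the treatment and on the outcome separately, each as a simple difference in means, and only at the end take the ratio. Everything below--variable names, effect sizes, the data-generating process--is hypothetical, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical setup: a binary instrument z nudges take-up of a
# treatment t, and t has a true effect of 2.0 on the outcome y.
z = rng.integers(0, 2, n)
t = (rng.random(n) < 0.2 + 0.5 * z).astype(float)  # first stage
y = 2.0 * t + rng.normal(0, 1, n)                  # outcome

# Effect of the instrument on the treatment (first stage) . . .
first_stage = t[z == 1].mean() - t[z == 0].mean()
# . . . and on the outcome (the intent-to-treat effect).
itt = y[z == 1].mean() - y[z == 0].mean()

# Each of these is interpretable on its own; the usual instrumental
# variables estimate is just their ratio.
wald = itt / first_stage
print(first_stage, itt, wald)
```

Summarizing `first_stage` (about 0.5 here) and `itt` (about 1.0) directly, before dividing, keeps the evidence closer to the data than reporting only the ratio.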

To put it another way: I would separate the conceptually simple statistical models that are crucial to understanding evidence in any complex-data setting, from the economics (or, more generally, social science) models that are needed to apply empirical correlations to real-world decisions.

My new writing strategy


In high school and college I would write long assignments using a series of outlines. I'd start with a single sheet where I'd write down the key phrases, connect them with lines, and then write more and more phrases until the page was filled up. Then I'd write a series of outlines, culminating in a sentence-level outline that was roughly one line per sentence of the paper. Then I'd write. It worked pretty well. Or horribly, depending on how you look at it. I was able to produce 10-page papers etc. on time. But I think it crippled my writing style for years. It's taken me a long time to learn how to write directly--to explain clearly what I've done and why. And I'm still working on the "why" part. There's a thin line between verbosity and terseness.

I went to MIT and my roommate was a computer science major. He wrote me a word processor on his Atari 800, which did the job pretty well. For my senior thesis I broke down and used the computers on campus. I formatted it in troff, which worked out just fine.

In grad school I moved toward the Latex approach of starting with the template and an outline (starting with the Introduction and ending with Discussion and References), then putting in paragraphs here and there until the paper was done. I followed the same approach for my first few books.

Blogging was different. When I blog I tend to start at the beginning and just keep writing until I'm done. I've learned that it's best to write an entry all at once--it's hard to come back a day or a week later to fill in any gaps. I think this has helped my writing style and my writing efficiency. The only trouble is that my entries tend to be story-like rather than article-like. In a story you begin with the motivation and then gradually reveal what's happening. When I'm blogging I commonly start at one place but then, once I'm halfway through, I realize I want to go somewhere else. In contrast, in a proper article you jump right in and say the key point right away, and everything gets structured from there. I've tried to improve my blog-writing by contracting introductory paragraphs into introductory sentences.

I've been blogging for over six years, and it's affected my writing. More and more I write articles from beginning to end. It's worked for me to use Word rather than Latex. Somehow in Word, as in the blogging window, it's easy for me to just get started and write, whereas in Latex everything's just too structured. Really what's relevant here, though, is the style not the software.

Sometimes, though, I have a complicated argument to make and it helps to outline it first. In that case I'll write the outline and then use it as the basis for an article.

But recently I came up with a new strategy--the best of both worlds, perhaps. I write the outline but then set it aside and write the article from scratch, from the beginning, not worrying about the outline. The purpose of the outline is to get everything down so I don't forget any key ideas. Having the outline gives me the freedom to write the article without worrying that I might be missing something--I can always check the outline at the end.

Tyler Cowen asks what is the ultimate left-wing novel? He comes up with John Steinbeck and refers us to this list by Josh Leach that includes social-realist novels from around 1900. But Cowen is looking for something more "analytically or philosophically comprehensive."

My vote for the ultimate left-wing novel is 1984. The story and the political philosophy fit together well, and it's also widely read (which is an important part of being the "ultimate" novel of any sort, I think; it wouldn't do to choose something too obscure). Or maybe Gulliver's Travels, but I've never actually read that, so I don't know if it qualifies as being left-wing. Certainly you can't get much more political than 1984, and I don't think you can get much more left-wing either. (If you get any more left-wing than that, you start to loop around the circle and become right-wing. For example, I don't think that a novel extolling the brilliance of Stalin or Mao would be considered left-wing in a modern context.)

Native Son (also on Leach's list) seems like another good choice to me, but I'm sticking with 1984 as being more purely political. For something more recent you could consider something such as What a Carve Up by Jonathan Coe.

P.S. Cowen's correspondent wrote that "the book needs to do two things: justify the welfare state and argue the limitations of the invisible hand." But I don't see either of these as particularly left-wing. Unless you want to argue that Bismarck was a left-winger.

P.P.S. Commenters suggest Uncle Tom's Cabin and Les Miserables. Good choices: they're big novels, politically influential, and left-wing. There's probably stuff by Zola etc. too. I still stand by 1984. Orwell was left-wing and 1984 was his novel. I think the case for 1984 as a left-wing novel is pretty iron-clad.

Unfinished business


This blog by J. Robert Lennon on abandoned novels made me think of the more general topic of abandoned projects. I seem to recall George V. Higgins writing that he'd written and discarded 14 novels or so before publishing The Friends of Eddie Coyle.

I haven't abandoned any novels but I've abandoned lots of research projects (and also have started various projects that there's no way I'll finish). If you think about the decisions involved, it really has to be that way. You learn while you're working on a project whether it's worth continuing. Sometimes I've put in the hard work and pushed a project to completion, published the article, and then I think . . . what was the point? The modal number of citations of our articles is zero, etc.

Online James?


Eric Tassone writes:

On the back of my yellowing pocket book of "The Mask of Dimitrios" is the following blurb:

'Eric Ambler is the best living writer of thrillers.' -- News Chronicle

What I'm wondering is, why the qualifier "living"? Did the News Chronicle think there was a better writer of thrillers than Ambler who was no longer alive? I can't imagine who that could be, considering that Ambler pretty much defined the modern thriller.

English-to-English translation


It's not just for Chaucer (or Mad Max) anymore. Peter Frase writes:

It's a shame that we neglect to re-translate older works into English merely because they were originally written in English. Languages change, and our reactions to words and formulations change. This is obvious when you read something like Chaucer, but it's true to a more subtle degree of more recent writings. There is a pretty good chance that something written in the 19th century won't mean the same thing to us that it meant to its contemporary readers. Thus it would make sense to re-translate Huckleberry Finn into modern language, in the same way we periodically get new translations of Homer or Dante or Thomas Mann. This is a point that applies equally well to non-fiction and social theory: in some ways, English-speaking sociologists are lucky that our canonical trio of classical theorists--Marx, Weber, and Durkheim--all wrote in another language. The most recent translation of Capital is eminently more readable than the older ones--and I know I could have used a modern English translation of Talcott Parsons when I was studying contemporary theory.

Now, one might respond to this by saying that writing loses much in translation, and that some things just aren't the same unless you read them in the original un-translated form. And that's probably true. But it would still be good to establish the "English-to-English translation" as a legitimate category . . .

Good point. I'm hoping someone will translate Bayesian Data Analysis into English. Half the time, I can't be sure what they're trying to say.

Tyler Cowen discusses his and Bryan Caplan's reaction to that notorious book by Amy Chua, the Yale law professor who boasts of screaming at her children, calling them "garbage," not letting them go to the bathroom when they were studying piano, etc. Caplan thinks Chua is deluded (in the sense of not being aware of research showing minimal effects of parenting on children's intelligence and personality), foolish (in writing a book and making recommendations without trying to learn about the abundant research on child-rearing), and cruel. Cowen takes a middle view in that he doesn't subscribe to Chua's parenting strategies but he does think that his friends' kids will do well (partly because of his friends' parenting styles, not just from their genes).

Do you view yourself as special?

I have a somewhat different take on the matter, an idea that's been stewing in my mind for a while, ever since I heard about the Wall Street Journal article that started this all. My story is that attitudes toward parenting are to some extent derived from attitudes about one's own experiences as a child.

Bayes in China update


Some clarification on the Bayes-in-China issue raised last week:

1. We heard that the Chinese publisher cited the following pages that might contain politically objectionable materials: 3, 5, 21, 73, 112, 201.

2. It appears that, as some commenters suggested, the objection was to some of the applications, not to the Bayesian methods.

3. Our book is not censored in China. In fact, as some commenters mentioned, it is possible to buy it there, and it is also available in university libraries there. The edition of the book which was canceled was intended to be a low-cost reprint of the book. The original book is still available. I used the phrase "Banned in China" as a joke and I apologize if it was misinterpreted.

4. I have no quarrel with the Chinese government or with any Chinese publishers. They can publish whatever books they would like. I found this episode amusing only because I do not think my book on regression and multilevel models has any strong political content. I suspect the publisher was being unnecessarily sensitive to potentially objectionable material, but this is their call. I thought this was an interesting story (which is why I posted the original email on the blog) but I did not, and do not, intend it as any sort of comment on the Chinese government, Chinese society, etc. China is a big country and this is one person at one publisher making one decision. That's all it is; it's not a statement about China in general.

I did not write the above out of any fear of legal action etc. I just think it's important to be fair and clear, and it is possible that some of what I wrote could have been misinterpreted in translation. If anyone has further questions on this, feel free to ask in the comments and I will clarify as best as I can.

I'll be on Radio 4 at 8.40am, on the BBC show "Today," talking about The Honest Rainmaker. I have no idea how the interview went (it was about 5 minutes), but I'm kicking myself because I was planning to tell the hookah story, but I forgot. Here it is:

I was at a panel for the National Institutes of Health evaluating grants. One of the proposals had to do with the study of the effect of water-pipe smoking, the hookah. There was a discussion around the table. The NIH is a United States government organisation; not many people in the US really smoke hookahs; so should we fund it? Someone said, 'Well actually it's becoming more popular among the young.' And if younger people smoke it, they have a longer lifetime exposure, and apparently there is some evidence that the dose you get of carcinogens from hookah smoking might be 20 times the dose of smoking a cigarette. I don't know the details of the math, but it was a lot. So even if not many people do it, if you multiply the risk, you get a lot of lung cancer.

Then someone at the table - and I couldn't believe this - said, 'My uncle smoked a hookah pipe all his life, and he lived until he was 90 years old.' And I had a sudden flash of insight, which was this. Suppose you have something that actually kills half the people. Even if you're a heavy smoker, your chance of dying of lung cancer is not 50 per cent, so therefore, even with something as extreme as smoking and lung cancer, you still have lots of cases where people don't die of the disease. The evidence is certainly all around you pointing in the wrong direction - if you're willing to accept anecdotal evidence - there's always going to be an unlimited amount of evidence which won't tell you anything. That's why the psychology is so fascinating, because even well-trained people make mistakes. It makes you realise that we need institutions that protect us from ourselves.

I think the key is that last bit: "if you're willing to accept anecdotal evidence, there's always going to be an unlimited amount of evidence which won't tell you anything." Of course, what makes this story work so well is that it's backed up by a personal anecdote!
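The arithmetic behind that insight is worth spelling out. The numbers here are round hypotheticals chosen for illustration (a 1% baseline risk and a 15x relative risk), not figures from the NIH discussion:

```python
# Hypothetical round numbers: even a 15x relative risk leaves the
# large majority of the exposed group unharmed.
baseline_risk = 0.01   # lifetime risk of the disease if unexposed
relative_risk = 15
exposed_risk = baseline_risk * relative_risk   # 0.15

n_exposed = 1_000_000
survivors = n_exposed * (1 - exposed_risk)
print(f"risk among the exposed: {exposed_risk:.0%}")
print(f"'my uncle smoked his whole life' anecdotes: {survivors:,.0f}")
```

Per million heavy users, that's 850,000 people who never get the disease: an effectively unlimited supply of counterexamples, exactly as in the story.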

Damn. I was planning to tell this story but I forgot. Next time I do radio, I'm gonna bring an index card with my key point. Not my 5 key points, not my 3 key points, but my 1 key point. Actually, I'm gonna be on the radio (in Seattle) next Monday afternoon, so I'll have a chance to try this plan then.

5 books


Sophie Roell, an editor at The Browser--where every day they ask an expert in a field to recommend the top five books (not their own) in their subject--asked me to recommend five books on how Americans vote.

The trouble is that I'm really pretty unfamiliar with the academic literature of political science, but it seemed sort of inappropriate for a political scientist such as myself to recommend non-scholarly books that I like (for example, "Style vs. Substance" by George V. Higgins, "Lies My Teacher Told Me," by James Loewen, "The Rascal King" by Jack Beatty, "Republican Party Reptile" by P. J. O'Rourke, and, of course, "All the King's Men," by Robert Penn Warren). I mean, what's the point of that? Nobody needs me to recommend books like that.

Instead, I moved sideways and asked if I could discuss five books on statistics instead. Roell said that would be fine, so I sent her a quick description, which appears below.

The actual interview turned out much better. Readable and conversational. I give Roell credit for this, keeping me from rambling too much. The interview includes the notorious hookah story, which should provoke a wince of recognition from anyone who's ever served on an NIH panel.

Below is my original email; the full interview appears here.

Brow inflation


In an article headlined, "Hollywood moves away from middlebrow," Brooks Barnes writes:

As Hollywood plowed into 2010, there was plenty of clinging to the tried and true: humdrum remakes like "The Wolfman" and "The A-Team"; star vehicles like "Killers" with Ashton Kutcher and "The Tourist" with Angelina Jolie and Johnny Depp; and shoddy sequels like "Sex and the City 2." All arrived at theaters with marketing thunder intended to fill multiplexes on opening weekend, no matter the quality of the film. . . .

But the audience pushed back. One by one, these expensive yet middle-of-the-road pictures delivered disappointing results or flat-out flopped. Meanwhile, gambles on original concepts paid off. "Inception," a complicated thriller about dream invaders, racked up more than $825 million in global ticket sales; "The Social Network" has so far delivered $192 million, a stellar result for a highbrow drama. . . . the message that the year sent about quality and originality is real enough that studios are tweaking their operating strategies. . . . To reboot its "Spider-Man" franchise, for instance, Sony hired Marc Webb, whose only previous film was the indie comedy "(500) Days of Summer." The studio has also entrusted a big-screen remake of "21 Jump Street" to Phil Lord and Chris Miller, a pair whose only previous film was the animated "Cloudy With a Chance of Meatballs." . . . Guillermo del Toro, the "Pan's Labyrinth" auteur, is developing a new movie around Disneyland's Haunted Mansion ride. . . .

"In years past," said Sean Bailey, Disney's president for production, "most live-action films seemed like they had to be either one thing or the other: commercial or quality. The industry had little expectation of a film being both. Our view is the opposite."

Huh? Standards have certainly changed when a Spider-Man reboot, and a 21 Jump Street remake, and a ride at Disneyland are defined as "highbrow."

The cultural products described in the article--big-money popular entertainments that are well-reviewed and have some association with quality--are classic middlebrow. Back around 1950, Russell Lynes and Dwight Macdonald were all over this.

Of course, Lynes and Macdonald would've identified the New York Times as Middlebrow Central and so wouldn't have been surprised at all to see uber-middlebrow items labeled as highbrow. That's the whole essence of middlebrow: to want the "quality" label without putting in the work. 21 Jump Street, indeed.

P.S. I agree with (the ghosts of) Lynes and Macdonald that these middlebrow movies are just fine if that's what people want. It's just funny to see them labeled as "highbrow," in what almost seems like a parody of middlebrow aspiration. So "edgy."

I've become increasingly uncomfortable with the term "confidence interval," for several reasons:

- The well-known difficulties in interpretation (officially the confidence statement can be interpreted only on average, but people typically implicitly give the Bayesian interpretation to each case),

- The ambiguity between confidence intervals and predictive intervals. (See the footnote in BDA where we discuss the difference between "inference" and "prediction" in the classical framework.)

- The awkwardness of explaining that confidence intervals are big in noisy situations where you have less confidence, and confidence intervals are small when you have more confidence.

So here's my proposal. Let's use the term "uncertainty interval" instead. The uncertainty interval tells you how much uncertainty you have. That works pretty well, I think.
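To make the first difficulty concrete, here's a quick simulation (numbers are arbitrary, for illustration) of what the 95% statement does and doesn't say: any single interval either covers the true value or it doesn't; the "95%" describes only the long-run average over repeated experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative setup: normal data with known sd.
true_mu, sd, n, n_sims = 5.0, 2.0, 50, 10_000

covered = 0
for _ in range(n_sims):
    x = rng.normal(true_mu, sd, n)
    se = sd / np.sqrt(n)
    lo, hi = x.mean() - 1.96 * se, x.mean() + 1.96 * se
    covered += (lo <= true_mu <= hi)  # this one either covers or not

print(covered / n_sims)  # long-run coverage, close to 0.95
```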

P.S. As of this writing, "confidence interval" outGoogles "uncertainty interval" by the huge margin of 9.5 million to 54,000. So we have a ways to go.

Blogging: Is it "fair use"?


Dave Kane writes:

I [Kane] am involved in a dispute relating to whether or not a blog can be considered part of one's academic writing. Williams College restricts the use of undergraduate theses as follows:
Non-commercial, academic use within the scope of "Fair Use" standards is acceptable. Otherwise, you may not copy or distribute any content without the permission of the copyright holder.

Seems obvious enough. Yet some folks think that my use of thesis material in a blog post fails this test because it is not "academic." See this post for the gory details.

In the annals of hack literature, it is sometimes said that if you aim to write best-selling crap, all you'll end up with is crap. To truly produce best-selling crap, you have to have a conviction, perhaps misplaced, that your writing has integrity. Whether or not this is a good generalization about writing, I have seen an analogous phenomenon in statistics: If you try to do nothing but model the data, you can be in for a wild and unpleasant ride: real data always seem to have one more twist beyond our ability to model (von Neumann's elephant's trunk notwithstanding). But if you model the underlying process, sometimes your model can fit surprisingly well, while also suggesting openings for future research progress.

In defense of jargon


Daniel Drezner takes on Bill James.

Taleb + 3.5 years


I recently had the occasion to reread my review of The Black Swan, from April 2007.

It was fun reading my review (and also this pre-review; "nothing useful escapes from a blackbody," indeed). It was like a greatest hits of all my pet ideas that I've never published.

Looking back, I realize that Taleb really was right about a lot of things. Now that the financial crisis has happened, we tend to forget that the experts who Taleb bashes were not always reasonable at all. Here's what I wrote in my review, three and a half years ago:

On page 19, Taleb refers to the usual investment strategy (which I suppose I actually use myself) as "picking pennies in front of a steamroller." That's a cute phrase; did he come up with it? I'm also reminded of the famous Martingale betting system. Several years ago in a university library I came across a charming book by Maxim (of gun fame) where he went through chapter after chapter demolishing the Martingale system. (For those who don't know, the Martingale system is to bet $1, then if you lose, bet $2, then if you lose, bet $4, etc. You're then guaranteed to win exactly $1--or lose your entire fortune. A sort of lottery in reverse, but an eternally popular "system.")

Throughout, Taleb talks about forecasters who aren't so good at forecasting, picking pennies in front of steamrollers, etc. I imagine much of this can be explained by incentives. For example, those Long-Term Capital guys made tons of money, then when their system failed, I assume they didn't actually go broke. They have an incentive to ignore those black swans, since others will pick up the tab when they fail (sort of like FEMA pays for those beachfront houses in Florida). It reminds me of the saying that I heard once (referring to Donald Trump, I believe) that what matters is not your net worth (assets minus liabilities), but the absolute value of your net worth. Being in debt for $10 million and thus being "too big to fail" is (almost) equivalent to having $10 million in the bank.

So, yeah, "too big to fail" is not a new concept. But as late as 2007, it was still a bit of an underground theory. People such as Taleb screamed about it, but the authorities weren't listening.
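For readers who haven't seen it, the Martingale system described in the quote above is easy to simulate. The $1000 bankroll and the even-money European-roulette bet (win probability 18/37) are my own illustrative assumptions:

```python
import random

def martingale(bankroll, p_win=18/37):
    """Bet 1, double after every loss; stop at one net win or at ruin."""
    bet = 1
    while bankroll >= bet:          # can we cover the next bet?
        if random.random() < p_win:
            return bankroll + bet   # a win always nets +1 overall
        bankroll -= bet
        bet *= 2
    return bankroll                 # busted: only the leftover stake

random.seed(0)
results = [martingale(1000) for _ in range(100_000)]
wins = sum(r == 1001 for r in results)
print(f"walked away up $1: {wins / len(results):.2%}")
print(f"average result:    {sum(results) / len(results):.2f}")
```

You nearly always win your dollar; the rare failure costs most of the bankroll, so the expected value is (slightly) negative--Maxim's point, chapter after chapter.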

And then there are parts of the review that make me really uncomfortable. As noted in the above quote, I was using the much-derided "picking pennies in front of a steamroller" investment strategy myself--and I knew it! Here's some more, again from 2007:

I'm only a statistician from 9 to 5

I try (and mostly succeed, I think) to have some unity in my professional life, developing theory that is relevant to my applied work. I have to admit, however, that after hours I'm like every other citizen. I trust my doctor and dentist completely, and I'll invest my money wherever the conventional wisdom tells me to (just like the people whom Taleb disparages on page 290 of his book).

Not long after, there was a stock market crash and I lost half my money. OK, maybe it was only 40%. Still, what was I thinking--I read Taleb's book and still didn't get the point!

Actually, there was a day in 2007 or 2008 when I had the plan to shift my money to a safer place. I recall going on the computer to access my investment account but I couldn't remember the password, was too busy to call and get it, and then forgot about it. A few weeks later the market crashed.

If only I'd followed through that day. Oooohhh, I'd be so smug right now. I'd be going around saying, yeah, I'm a statistician, I read Taleb's book and I thought it through, blah blah blah. All in all, it was probably better for me to just lose the money and maintain a healthy humility about my investment expertise.

But the part of the review that I really want everyone to read is this:

On page 16, Taleb asks "why those who favor allowing the elimination of a fetus in the mother's womb also oppose capital punishment" and "why those who accept abortion are supposed to be favorable to high taxation but against a strong military," etc. First off, let me chide Taleb for deterministic thinking. From the General Social Survey cumulative file, here's the crosstab of the responses to "Abortion if woman wants for any reason" and "Favor or oppose death penalty for murder":

40% supported abortion for any reason. Of these, 76% supported the death penalty.

60% did not support abortion under all conditions. Of these, 74% supported the death penalty.

This was the cumulative file, and I'm sure things have changed in recent years, and maybe I even made some mistake in the tabulation, but, in any case, the relation between views on these two issues is far from deterministic!
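The tabulation above is just a crosstab with row percentages: within each abortion-attitude group, what share favors the death penalty? As a sketch (the responses below are invented for illustration, not the actual GSS microdata), the computation looks like this:

```python
from collections import Counter

# Hypothetical, made-up responses -- NOT the actual GSS file. Each tuple is
# (supports abortion for any reason, favors death penalty) for one respondent.
responses = [
    (True, True), (True, True), (True, False), (True, True),
    (False, True), (False, True), (False, False), (False, True),
    (False, True), (False, False),
]

counts = Counter(responses)

def pct_favoring_death_penalty(abortion_view):
    """Within one abortion-attitude group, the share favoring the death penalty."""
    favor = counts[(abortion_view, True)]
    total = favor + counts[(abortion_view, False)]
    return favor / total

for view in (True, False):
    print(view, round(pct_favoring_death_penalty(view), 2))
```

With real survey data you'd also want survey weights, but the point stands either way: the two row percentages being close together is exactly what a far-from-deterministic relation looks like.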

Finally, a lot of people bash Taleb, partly for his idiosyncratic writing style, but I have fond memories of both his books, for their own sake and because they inspired me to write down some of my pet ideas. Also, he deserves full credit for getting things right several years ago, back when the Larry Summerses of the world were still floating on air, buoyed by the heads-I-win, tails-you-lose system that kept the bubble inflated for so long.

I studied math and physics at MIT. To be more precise, I started in math as default--ever since I was two years old, I've thought of myself as a mathematician, and I always did well in math class, so it seemed like a natural fit.

But I was concerned. In high school I'd been in the U.S. Mathematical Olympiad training program, and there I'd met kids who were clearly much much better at math than I was. In retrospect, I don't think I was as bad as I'd thought at the time: there were 24 kids in the program, and I was probably around #20, if that, but I think a lot of the other kids had more practice working on "math olympiad"-type problems. Maybe I was really something like the tenth-best in the group.

Tenth-best or twentieth-best, whatever it was, I reached a crisis of confidence around my sophomore or junior year in college. At MIT, I started right off taking advanced math classes, and somewhere along the way I realized I wasn't seeing the big picture. I was able to do the homework problems and do fine on the exams, but something was missing. Ultimately I decided the problem was that, in the world of theoretical math, there were the Cauchys, the Riemanns, etc., and there was everybody else. I didn't want to be one of the "everybody else." Unfortunately I didn't know about applied math at the time--at MIT, as elsewhere, I imagine, the best math students did the theory track.

I was also majoring in physics, which struck me as much more important than math, but which I felt I had even less of an understanding of. I did well in my classes--it was MIT, I didn't have a lot of friends and I didn't go on dates, so that gave me lots of time to do my problem sets each week--and reached the stage of applying to physics grad schools. In fact it was only at the very last second in April of my senior year that I decided to go for a Ph.D. in statistics rather than physics.

I had some good experiences in physics, most notably taking the famous Introduction to Design course at MIT--actually, that was a required course in the mechanical engineering department but many physics students took it too--and working for two summers doing solid-state physics research at Bell Labs. We were working on zone-melt recrystallization of silicon and, just as a byproduct of our research, discovered a new result (or, at least it was new to us) that solid silicon could superheat to something like 20 degrees (I think it was that, I don't remember the details) above its melting point before actually melting. This wouldn't normally happen, but we had a set-up in which the silicon wafer was heated in such a way that the center got hotter than the edges, and at the center there were no defects in the crystal pattern where melting could easily begin. So the center had to get really hot before it started to melt.

Figuring this out wasn't so easy--it's not like we had a thermometer in the inside of our wafer. (If we did, the crystalline structure wouldn't have been pure, and there wouldn't have been any superheating.) We knew the positions and energies of our heat sources, we had radiation thermometers to measure the exterior temperature from various positions, we knew the geometry of the silicon wafer (which was encased in silicon dioxide), and we could observe the width of the molten zone.

So what did we do? What did I do, actually? I set up a finite-element model on the computer and played around with its parameters until I matched the observations, then looked inside to see what our model said was the temperature at the hottest part of the wafer. Statistical inference, really, although I didn't know it at the time. When I came to Bell Labs for my second summer, I told my boss that I'd decided to go to grad school in statistics. He was disappointed and said that this was beneath me, that statistics was a step down from physics. I think he was right (about statistics being simpler than physics), but I really wasn't a natural physicist, and I think statistics was the right field for me.
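For readers curious what "played around with its parameters until I matched the observations" looks like in code, here is a minimal sketch in the same spirit. Everything below is invented: a toy one-dimensional heat equation stands in for the real finite-element model, and all the numbers are made up. But the move is the same--tune an unknown input until the model reproduces an exterior measurement, then read the unmeasurable interior peak temperature off the fitted model.

```python
def temperature_profile(source_strength, n=101):
    """Steady 1D heat equation -T'' = q(x) on [0, 1], rod ends held at 300 K,
    solved by finite differences with the Thomas algorithm. A toy stand-in
    for the finite-element wafer model; all numbers are illustrative."""
    h = 1.0 / (n - 1)
    # heat source concentrated near the center of the rod (a tent function)
    q = [source_strength * max(0.0, 1.0 - abs(i * h - 0.5) / 0.2) for i in range(n)]

    m = n - 2                              # interior unknowns; T[0] = T[n-1] = 300
    d = [h * h * q[i + 1] for i in range(m)]
    d[0] += 300.0
    d[-1] += 300.0
    # Thomas algorithm for the tridiagonal system with stencil (-1, 2, -1)
    cp, dp = [0.0] * m, [0.0] * m
    cp[0], dp[0] = -0.5, d[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + cp[i - 1]
        cp[i] = -1.0 / denom
        dp[i] = (d[i] + dp[i - 1]) / denom
    T = [0.0] * m
    T[-1] = dp[-1]
    for i in range(m - 2, -1, -1):
        T[i] = dp[i] - cp[i] * T[i + 1]
    return [300.0] + T + [300.0]

def calibrate(observed_temp, obs_index=25, lo=0.0, hi=1e5):
    """Bisect on the unknown source strength until the model reproduces the
    temperature 'measured' at one accessible point, then read off the
    unmeasurable peak temperature -- the same move as in the story."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if temperature_profile(mid)[obs_index] < observed_temp:
            lo = mid
        else:
            hi = mid
    return max(temperature_profile(0.5 * (lo + hi)))

print(round(calibrate(350.0), 1))   # fitted peak interior temperature, in K
```

Seen this way, it really is statistical inference: an unobserved quantity estimated by fitting a forward model to the data you can actually measure.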

Why did I study statistics? I've been trained not to try to answer Why questions but rather to focus on potential interventions. The intervention that happened to me was that I took a data analysis course from Don Rubin when I was a senior in college. MIT had very few statistics classes. I'd taken one of them and liked it, and when I went to a math professor to ask what to take next, he suggested I go over to Harvard and see what they had to offer.

I sat in on two classes: one was deadly dull and the other was Rubin's, which was exciting from Day 1. The course just sparkled with open problems, and the quality of the ten or so students in the class was amazing. I remember spending many hours laboriously working out every homework problem using the Neyman-Pearson theory we'd been taught in my theoretical statistics course. It's only by really taking this stuff seriously that I realized how hopeless it all is. When, two years later, I took a class on Bayesian statistics from John Carlin, I was certainly ready to move to a model-based philosophy.

Anyway, to answer the question posed at the beginning of the paragraph, Don's course was great. I was worried that statistics was just too easy to be interesting, but Don assured me that, no, the field has many open problems and that I'd be free to work on them. As indeed I have.

Why did I start a blog? I realize I'm skipping a few steps here, considering that I started my Ph.D. studies in 1986 and didn't start blogging until nearly two decades later. I started my casual internet reading with Slate and Salon and at some point had followed some links and been reading some blogs. In late 2004 my students, postdocs, and I decided to set up a blog and a wiki to improve communication in our group and to reach out to others. The idea was that we would pass documents around on the wiki and post our thoughts on each others' ideas on the blog.

I figured we'd never run out of material because, if we ever needed to, I could always post links and abstracts of my old papers. (I expect I'm far from unique among researchers in having a fondness for many of my long-forgotten publications.)

What happened? For one thing, after a couple months, the blog and wiki got hacked (apparently by some foreign student with no connection to statistics who had some time on his hands). Our system manager told us the wiki wasn't safe so we abandoned it and switched account names for the blog. Meanwhile, I'd been doing most of the blog posting. For a while, I'd assign my students and postdocs to post while I was on vacation, but then I heard they were spending hours and hours on each entry so I decided to make it optional, which means that most of my cobloggers rarely post on the blog. Which is too bad but I guess is understandable.

Probably the #1 thing I get from posting on the blog is an opportunity to set down my ideas in a semi-permanent form. Ideas in my head aren't as good as the same ideas on paper (or on the screen). To put it another way, the process of writing forces me to make hard choices and clarify my thoughts. The weakness of my blogging is that it's all in words, not in symbols, so quite possibly the time I spend blogging distracts me from thinking more deeply on mathematical and computational issues. On the other hand, sometimes blogging has motivated me to do some data analyses which have motivated me to do new statistical research.

There's a lot more that I could say about my blogging experiences, but really it all fits in a continuum with the writing of books and articles, meetings with colleagues, and all stages of teaching (from preparation of materials to meetings with students). One thing that blogging has in common with book-writing and article-writing is that I don't really know who my audience is. I can tell you, though, that the different blogs have very different sets of readers. My main blog has an excellent group of commenters who often point out things of which I'd been unaware. At the other blogs where I post, the commenters don't always understand where I'm coming from, and all I can really do is get my ideas out there and let people use them how they may. In that way it's similar to the frustrating experience of writing for journals and realizing that sometimes I just can't get my message across. In my own blog I can go back and continue modifying my ideas in the light of audience feedback. My model is George Orwell, who wrote on the same (but not identical) topics over and over again, trying to get things just right. (I know that citing Orwell is a notorious sign of grandiosity in an author, but in my defense all I'm saying is that Orwell is my model, not that I have a hope of reaching that level.)

New Sentences For The Testing Of Typewriters (from J. Robert Lennon):

Fetching killjoy Mavis Wax was probed on the quay.

"Yo, never mix Zoloft with Quik," gabs Doc Jasper.

One zany quaff is vodka mixed with grape juice and blood.

Zitty Vicki smugly quipped in her journal, "Fay waxes her butt."

Hot Wendy gave me quasi-Kreutzfeld-Jacob pox.

Jack's pervy moxie quashed Bob's new Liszt fugue.

I backed Zevy's qualms over Janet's wig of phlox.

Tipsy Bangkok panjandrums fix elections with quivering zeal.

Mexican juntas, viewed in fog, piqued Zachary, killed Rob.

Jaywalking Zulu chieftains vex probate judge Marcy Quinn.

Twenty-six Excedrin helped give Jocko quite a firm buzz.

Racy pics of bed hijinx with glam queen sunk Val.

Why Paxil? Jim's Bodega stocked no quince-flavor Pez.

Wavy-haired quints of El Paz mock Jorge by fax.

Two phony quacks of God bi-exorcize evil mojo.

Dan Goldstein sends along this bit of research, distinguishing terms used in two different subfields of psychology. Dan writes:

Intuitive calls included not listing words that don't occur 3 or more times in both programs. I [Dan] did this because when I looked at the results, those cases tended to be proper names or arbitrary things like header or footer text. It also narrowed down the space of words to inspect, which means I could actually get the thing done in my copious free time.

I think the bar graphs are kinda ugly, maybe there's a better way to do it based on classifying the words according to content? Also the whole exercise would gain a new dimension by comparing several areas instead of just two. Maybe that's coming next.
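Dan's filtering rule is easy to sketch. Assuming nothing about his actual code--the helper below is a hypothetical reconstruction--one way to compare word use across two conference programs is to keep only words that occur at least three times in both, then rank them by their relative frequency in one corpus versus the other:

```python
import re
from collections import Counter

def word_counts(text):
    """Lowercased word frequencies for one corpus (e.g., the full text of
    one subfield's conference program)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def distinctive_words(text_a, text_b, min_count=3):
    """Keep only words occurring at least min_count times in BOTH corpora
    (the filter Dan describes for dropping proper names, headers, and
    footers), then rank them by how much more often they appear in corpus
    A than in corpus B."""
    a, b = word_counts(text_a), word_counts(text_b)
    shared = [w for w in a if a[w] >= min_count and b[w] >= min_count]
    total_a, total_b = sum(a.values()), sum(b.values())
    ratio = {w: (a[w] / total_a) / (b[w] / total_b) for w in shared}
    return sorted(ratio, key=ratio.get, reverse=True)

program_a = "memory memory memory recall recall recall judgment judgment judgment"
program_b = "judgment judgment judgment judgment memory memory memory recall recall recall"
print(distinctive_words(program_a, program_b))
```

The word classification and multi-area comparison suggested above would layer naturally on top of this: group the surviving words by content category, or compute the same ratios pairwise across several subfields.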

Tyler Cowen links approvingly to this review by B. R. Myers of a book that I haven't read. Unlike Cowen, I haven't seen the book in question--so far, I've only read the excerpt that appeared in the New Yorker--but I can say that I found Myers's review very annoying. Myers writes:

At the sister blog, David Frum writes, of a book by historian Laura Kalman about the politics of the 1970s:

Some recent blog comments made me rethink how to express disagreement when I write. I'll first display my original blog entry, then some of the comments, then my reaction, and finally my new approach. As usual with negative comments, my first reaction was to be upset, but once the hurt feelings went away, I realized that these dudes had a point.

Act 1

A few days ago, I posted the following on 538:

Self-described "political operative" Les Francis, with real-world experience as former executive director of the Democratic National Committee:
I don't need any polls to tell me that Republicans will do well in November. The "out" party almost always shows significant gains in the first midterm election of a new President.

Political scientists Joe Bafumi, Bob Erikson, and Chris Wlezien, from elite, out-of-touch, ivory-tower institutions Dartmouth, Columbia, and Temple Universities:


Game over.

Act 2

After posting, I typically check the comments because sometimes I have some typos or obscure points that I need to fix or explain. The comments at 538 aren't always so polite, but this time they went over the top:

Otto said...

Maybe I'm just a poor country lawyer but I don't understand this post or what those graphs are supposed to say. Is the point that the President's party doesn't actually do poorly in midterms?

Also, I think the author is being sarcastic when he refers to those colleges as being ivory towers but it's tough to tell.

benh57 said...

I concur with Otto. Huh?

bigbadbutt said...

Yeah, as far as i can tell, both articles agree. Not sure what you're trying to imply here..

Bram Reichbaum said...

If the contest is whether political consultants or political operatives are more intelligible, game over indeed. [insert Marvin the Martian voice] This post makes me very angry. [/MMv]

And so on. You get the point. The commenters really, really hated it, and nobody got the point of the graphs.

Act 3

John McPhee, the Anti-Malcolm


This blog is threatening to turn into Statistical Modeling, Causal Inference, Social Science, and Literature Criticism, but I'm just going to go with the conversational flow, so here's another post about an essayist.

I'm not a big fan of Janet Malcolm's essays--and I don't mean I don't like her politics or her pro-murderer attitude, I mean I don't like them all that much as writing. They're fine, I read them, they don't bore me, but I certainly don't think she's "our" best essayist. But that's not a debate I want to have right now, and if I did, I'm quite sure most of you wouldn't want to read it anyway. So instead, I'll just say something about John McPhee.

As all right-thinking people agree, in McPhee's long career he has written two kinds of books: good, short books, and bad, long books. (He has also written many New Yorker essays, and perhaps other essays for other magazines too; most of these are good, although I haven't seen any really good recent work from him, and some of it has been really bad, by his standards). But...

Via J. Robert Lennon, I discovered this amusing blog by Anis Shivani on "The 15 Most Overrated Contemporary American Writers."

Lennon found it so annoying that he refused to even link to it, but I actually enjoyed Shivani's bit of performance art. The literary criticism I see is so focused on individual books that it's refreshing to see someone take on an entire author's career in a single paragraph. I agree with Lennon that Shivani's blog doesn't have much content--it's full of terms such as "vacuity" and "pap," compared to which "trendy" and "fashionable" are precision instruments--but Shivani covers a lot of ground and it's fun to see this all in one place.

My main complaint with Shivani, beyond his sloppy writing (but, hey, it's just a blog; I'm sure he saves the good stuff for his paid gigs) is his implicit assumption that everyone should agree with him. I'm as big a Kazin fan as anyone, but I still think he completely undervalued Marquand.

The other thing I noticed was that, apart from Amy Tan and Jhumpa Lahiri, none of the writers on Shivani's list were people whom I would consider bigshots. To me, they seemed like a mix of obscure poets (even a famous poet within the poetry-and-NPR world is still obscure compared to other kinds of writers), obscure critics (ditto), and some Manhattan-insider types. And Junot Diaz, who I like, even if maybe Shivani is right that he's just riffing on old Philip Roth shtick.

P.S. Following the links from Shivani, I came across this. I still think Andrea did it better, though.

P.P.S. Shivani mentioned "Antonya Nelson." The name rang a bell, so I searched the blog and found this. She's the one who wrote the John Updike story! ("Not angry enough to be a John Cheever story, not clipped enough to be a Raymond Carver story, not smooth enough to be a Richard Ford story.") I'm surprised Shivani didn't mention that one.

P.P.P.S. Thinking a bit more about Lennon's reaction . . . I guess I'd be pretty annoyed to see an article on "the 15 most overrated American statisticians." I know two or three people who'd probably put me high on such a list. It's a good thing they don't have blogs!

The last great essayist?


I recently read a bizarre article by Janet Malcolm on a murder trial in NYC. What threw me about the article was that the story was utterly commonplace (by the standards of today's headlines): divorced mom kills ex-husband in a custody dispute over their four-year-old daughter. The only interesting features were (a) the wife was a doctor and the husband was a dentist, the sort of people you'd expect to sue rather than slay, and (b) the wife hired a hitman from within the insular immigrant community that she (and her husband) belonged to. But, really, neither of these was much of a twist.

To add to the non-storyness of it all, there were no other suspects, the evidence against the wife and the hitman was overwhelming, and even the high-paid defense lawyers didn't seem to be making much of an effort to convince anyone of their clients' innocence. (One of the closing arguments was that one aspect of the wife's story was so ridiculous that it had to be true. In the lawyer's words:

If she was guilty, why would she say that? . . . It's actually the strongest evidence of her truthfulness, because if she was a liar she would say something that made no sense. I mean, it makes no sense.

If that's the "strongest evidence" in your favor, then you're in trouble.)

And indeed, the two defendants were in trouble. They ended up with life in prison.

The only real evidence in favor of the defendants was that the woman was devoted to her daughter and that she and her husband (and their extended families as well) were involved in bitter legal proceedings, which included at one point a court order against the husband and, at a later date, a ruling that he should have custody of the daughter. Unfortunately, all of this understandable frustration did nothing to reduce the evidence that she hired her friend to kill her ex-husband. If anything, this all just makes the motivation for the murder that much more plausible.

The strange thing about the Janet Malcolm article is that Malcolm is so sympathetic to the killers (or, more specifically, to the ex-wife who made the call; Malcolm doesn't say much about the actual shooter). Malcolm never goes so far as to do a Michael Moore (who notoriously described himself as the only white man in America who thought O. J. Simpson didn't do it), but she definitely seemed to be rooting for the woman to get off, to the extent that she (Malcolm) called up one of the lawyers in the middle of the trial in what looks like an attempt to force a mistrial.

Malcolm's main argument seems to be that the woman who ordered the killing was represented by a very good lawyer, but, because of difficulties having to do with other lawyers involved in the case, this super-defender didn't have a chance to really do his thing and win the case.

Why do I care?

Why am I writing hundreds of words about a months-old magazine article on a year-old court case that wasn't so remarkable in the first place?

The key is the author: Janet Malcolm, who's arguably the best--perhaps only--pure essayist writing in English today.

What do I mean by a "pure essayist"? Someone who writes about one topic but is using it as a hook to talk about . . . anything and everything. Classic examples from the previous century include Rebecca West and the George Orwell of "Inside the Whale." Who else besides Janet Malcolm does this nowadays?

To understand any article by Malcolm, you have to go on several levels.

1. On the surface, it's the story of a woman in a difficult situation (an immigrant woman, a doctor, divorced with a young child and trapped in a family feud) who's been accused of murder, and, as the story goes on, it becomes pretty clear that she actually did it. Along with this, it's an interesting look into the peculiarities of the court system, from jury selection through cross-examination to opening arguments.

2. At the next level, it's a New Journalism-style bit of court reporting, where we are told not just about the facts of the case but also, "Boys on the Bus"-style, about the other reporters and about the journalist's personal sympathies.

3. Finally, amidst the colorful details of the court case, Malcolm occasionally offers a thoughtful reflection on the court system. Sometimes I think she's flat-out wrong, but even then she's interesting. (For example, at one point she reports an awkward bit of cross-examination and says that witnesses often get tangled up in trying to match wits with opposing lawyers. My impression is that the story is simpler than that, it's just that when you're on the stand, you're scared of saying the wrong thing, so your words come out all hesitant, defensive, and triple-checked. How can anyone avoid it?) Anyway, the point is that I found Malcolm's article to be thought-provoking throughout, much more than one would expect from the pretty basic story of the crime.

OK, here's my theory . . .

Now let's try to put it all together. What we have is an open-and-shut case, a story with no suspense. The killer is named on page 1 and the evidence mounts from there. No courtroom surprises, the case ends as it begins, and the killers get life sentences. I don't think that any major magazine--other than the New Yorker--would've published Malcolm's article.

So why did Malcolm take on this case? One possibility is that it seemed more ambiguous at the beginning than the end. Perhaps before the trial started, the evidence didn't look as strong as it eventually did. (Once Malcolm sat in the trial for several weeks and interviewed all the participants, it's no surprise that she turned it into an article (and maybe, soon, into a book); the real question is why she thought this particular case was worth the effort in the first place.)

I have a theory. I doubt I'll ever have the chance to meet Janet Malcolm and ask her, so I'll just put it out here. Malcolm refers to the well-known idea that all sorts of lawyers can successfully defend an innocent person; what makes a great defense lawyer is the ability to get a guilty person off. And, indeed, if there's a hero in Malcolm's story, it's the lawyer for the ex-wife, a man who perhaps can work miracles.

So, my theory is that Malcolm sees herself in the same role. Any journalist can wring sympathy and a critique of the legal system out of a wrongly-accused person, but a great journalist can extract sympathy from someone who's manifestly guilty.

Thus, rather than writing an article slamming the criminal courts on the basis of some dramatic miscarriage of justice, Malcolm is attempting the much more impressive feat of basing her critique on an open-and-shut case.

What I'm talking about here is not just a degree-of-difficulty thing. My guess is that Malcolm feels that her deeper arguments are strong enough that they should hold even in the weakest-possible legal case. Perhaps Malcolm would feel it would be cheating to make her argument in the context of an innocent person wrongly accused, or even in an ambiguous case.

This reminds me of Malcolm's most famous article, The Journalist and the Murderer, in which--incredibly (to me)--she was angry at a journalist for deceiving a man who, it turned out, had murdered his entire family! Again, I think Malcolm saw the challenge in taking the side of a murderer, and, again, she felt strongly enough about her point (that journalists should not be deceivers) that she wanted to make that point in the starkest possible setting. The Journalist and the Murderer: you can't get much starker than that.

In any case, I wasn't convinced by Malcolm's article. Just because there are some awesome lawyers out there, no, I don't think every killer has the right to Johnnie-Cochran-level representation. And, no matter how much someone loves their child, I don't see that it's such a great idea to arrange for the child to see her father being shot. It's hard for me to get around this one. I also don't approve of a journalist trying to use her influence to throw a monkey wrench into a murder trial.

But Malcolm's our only great essayist, so I'll read through to see her thoughts. I can appreciate a writer's artistry without agreeing with her politics.

Tyler Cowen links to an interesting article by Terry Teachout on David Mamet's political conservatism. I don't think of playwrights as gurus, but I do find it interesting to consider the political orientations of authors and celebrities.

I have only one problem with Teachout's thought-provoking article. He writes:

As early as 2002 . . . Arguing that "the Western press [had] embraced antisemitism as the new black," Mamet drew a sharp contrast between that trendy distaste for Jews and the harsh realities of daily life in Israel . . .

In 2006, Mamet published a collection of essays called The Wicked Son: Anti-Semitism, Jewish Self-Hatred and the Jews that made the point even more bluntly. "The Jewish State," he wrote, "has offered the Arab world peace since 1948; it has received war, and slaughter, and the rhetoric of annihilation." He went on to argue that secularized Jews who "reject their birthright of 'connection to the Divine'" succumb in time to a self-hatred that renders them incapable of effectively opposing the murderous anti-Semitism of their enemies--and, by extension, the enemies of Israel.

It is hard to imagine a less fashionable way of framing the debate over Israel, and even the most sympathetic reviewers of The Wicked Son frequently responded with sniffish dismay to Mamet's line of argument. . . .

I added the boldface above.

Setting aside the specific claims being made here (it would be hard for me to evaluate, for example, whether antisemitism was indeed "trendy" in 2002), I think Teachout made a mistake in his use of "trendy" and "fashionable." What do those words mean, really? As far as I can tell, they are used to refer to a position that you disagree with but that you fear is becoming more popular. Or, to put it another way, everything's trendy until it jumps the shark.

So far, it might sound like I'm making a picky comment about English usage, along the lines of my recommendations to academic writers to avoid unnecessary phrases such as "Note that" and "obviously." "Trendy" and "fashionable" are convenient negative words that don't add much meaning--they're a way to express contempt without taking the trouble to make an actual argument.

But it goes beyond that. My real trouble with the use of "fashionable" and "trendy" (and the accompanying implicit reasoning that goes along with them) is that they can get a writer tangled up in contradictions, without him even realizing it.

Let's return to the article under discussion. After praising him for his "less fashionable" framing of the debate over Israel, Teachout turns to Mamet's latest play, which he does not actually like very much: "Alas, his first post-conversion play does not suggest that this new point of view has as yet borne interesting artistic fruit." Teachout concludes his mini-review with the following sentence (parentheses in the original):

(The play has, interestingly, proved to be a major success at the box office.)

Here's my problem. At this point, everything in Teachout's (and, perhaps, Mamet's) world is suffused with political implication. It's the good guys versus the bad guys. I can think of a few ways to interpret the parenthetical remark above:

- Theatergoers--unlike playwrights--are sensible Americans, middle-of-the-road politically, and welcome a fresh new play that does not take a politically-correct view of the world. This is a Hollywood-versus-America sort of argument and is consistent with the idea that we should trust the judgment of the market over that of a narrow spectrum of playwrights and critics.

- On the other hand, Teachout didn't actually like the play, so maybe he's making an opposite argument, that theatergoers are, essentially, nothing but sheep who flock to anything by a big-name playwright. This is consistent with the idea that theaters can get away with all sorts of politically-correct crap and the audience won't know the difference, thus a self-perpetuating mechanism that isolates theater away from the real life of America.

- Or maybe Teachout is simply saying that American theater has degenerated to such an extent that even "heavy-handed lectures"--this is how he describes Mamet's latest--can be major successes.

The judgment of the market is always a double-edged sword, and it's never clear whether popularity should be taken as a sign of virtue ("the free-market economy," in Mamet's words) or as a deplorable sign of conformity (recall those oh-so-trendy words, "trendy" and "fashionable").

Why did I write this?

I'm a Mamet fan--who isn't?--but I don't recall ever having seen any of his plays performed. And I'm not really familiar with Terry Teachout (although I recognized the distinctive name, and I think I've read a few other things by him, over the years).

So why did I write this? Because I think a lot about writing, and I find myself sensitive to turns of phrase that facilitate confusion in a writer. Sometimes confusions can be remedied by statistics (for example, the notorious claim that "people vote against their interests"), but in other cases, such as in Teachout's article, I think the problem is in a division of the world into good guys and bad guys. Here's another example from a few months ago, an article about demographic trends that, again, got tangled from a need (as I see it) to be both popular and unpopular at the same time--to have the perceived virtues of mass support while being an embattled underdog.

In the short term, perhaps we can all avoid the words "fashionable" and "trendy." Or, if they must be used, please explain how to distinguish a positive trend in opinion (a "trend," as it were) from a negative trend (which, of course, is just "trendy").

P.S. I'm not trying to pick on Teachout. It's only because I found his article interesting that I took the trouble to comment on this one bit. I just think his article (and similar musings on art and politics) could be improved by subtracting these two words, which are so easy to write but so destructive of clear thinking.

P.P.S. After posting the above, I corresponded briefly with Teachout. He was polite but refused to back down. Oh well, life wasn't meant to be easy, I guess.

Literature and life


What can we learn about an author from his or her fiction? This is an old, old question, I know. But I still can't help thinking about it when I read a book.

John Updike's stories are full of male characters whom women find irresistibly attractive. I can only assume that this reflects Updike's own experiences, to some extent. If he had not been, in reality, catnip to women, I imagine he'd have made more of a big deal about the episodes in his books where women kept falling into his protagonists' laps.

Same for John D. Macdonald, although there I suppose it's possible he was just throwing in the sex to sell books.

And even more so for Richard Ford. This guy's male characters are so smooth, there's no way that Ford isn't/wasn't like that too.

What about Lorrie Moore? I think she must have had a very frustrating life (so far). I say this because her stories always seem to be centered around a female character who is witty, thoughtful, and refined, and surrounded by really piggy guys.

And I can only assume that Franz Kafka had some frustrating experiences in his life as well.

Other writers seem tougher to characterize. For example, Jane Smiley's characters are all over the place, as are J. Robert Lennon's. I can't see how I could draw many conclusions about their personal experiences from their books. Lots and lots of writers are like this: you get a sense of their sensibilities but not really of their experiences. Maybe that's why cases such as John Updike and Lorrie Moore are so interesting: they seem to be revealing, perhaps unintentionally, bits of themselves.

Editing and clutch hitting

| 1 Comment

Regarding editing: The only serious editing I've ever received has been for my New York Times op-eds and my article in the American Scientist. My book editors have all been nice people, and they've helped me with many things (including suggestions of what my priorities should be in communicating with readers)--they've been great--but they've not given (nor have I expected or asked for) serious editing. Maybe I should've asked for it, I don't know. I've had time-wasting experiences with copy editors and a particularly annoying experience with a production editor (who was so difficult that my coauthors and I actually contacted our agent and a lawyer about the possibility of getting out of our contract), but that's another story.

Regarding clutch hitting, Bill James once noted that it's great when a Bucky Dent hits an unexpected home run, but what's really special is being able to get the big hit when it's expected of you. The best players can do their best every time they come to the plate. That's why Bill James says that the lack of evidence for clutch hitting makes sense; it's not a paradox at all: one characteristic of pros is that they can do it over and over.

Faithful readers will know that my ideal alternative career is to be an editor in the Max Perkins mold. If not that, I think I'd enjoy being a literary essayist, someone like Alfred Kazin or Edmund Wilson or Louis Menand, who could write about my favorite authors and books in a forum where others would read and discuss what I wrote. I could occasionally collect my articles into books, and so on. On the other hand, if I actually had such a career, I wouldn't have much of an option to do statistical research in my spare time, so I think for my own broader goals, I've gotten hold of the right end of the stick.

As it is, I enjoy writing about literary matters but it never quite seems worth spending the time to do it right. (And, stepping outside myself, I realize that I have a lot more to offer the world as a statistician than as a literary critic. Criticism is like musicianship--it can be hard to do, and it's impressive when done well, but a lot of people can do it. Literary criticism is not like statistics: the supply of qualified critics vastly exceeds demand. Nobody is going to pay me $x/hour to be a literary consultant (for good reason, I'm sure), for any positive value of x.)

So you get it for free.

Anyway, this is all preamble to a comment on Clive James, who I just love--yes, I realize this marks me as a middlebrow American Anglophile. Deal with it. In any case, I came across this footnote in his verse collection:

Noam Chomsky gave furiously sleep ideas green colorless as an example of a random sequence of words which could have no meaning.

No, no, NO!!! This is so wrong that I'm wondering if James was making some sort of joke. But I can't see what that would be. So, to straighten things out:

1. Chomsky actually used two examples: "colorless green ideas sleep furiously" and "furiously sleep ideas green colorless." The former sounds like a sentence (even though it makes no sense); the latter neither sounds like a sentence nor makes sense.

2. It's not a random sequence of words. It's a very deliberate sequence of words, the reverse of a phrase that makes perfect grammatical (or syntactic, I can never get these straight) sense. To use an analogy that James must be familiar with, "colorless green ideas" is like Jabberwocky--it sounds like English--all the parts of speech are in the right place. The difference, what makes the Chomsky sentence special, is that, first, the sentence makes no sense. But, beyond that, any two successive words of the sentence make no sense: something green cannot be colorless, an idea cannot be green, ideas do not sleep, and you cannot sleep furiously. Chomsky's sentence is a work of beauty, and it was disappointing to see Clive James partly miss this point--in a book of poetry, no less!

Just a couple words about Clive James. One thing that I find appealing about him is that he's a writer in the David Owen mode (that is, David Owen the American journalist, not David Owen the English politician): serious, earnest, somewhat intelligent but a bit of a blockhead. Which I mean in a good way. Not clever-clever or even clever, but he wants to get things right.

P.S. Thanks to commenters for pointing out that my original blog was mistaken: Chomsky actually had two strings of words, not just one, in his famous article. So James did not get the phrase wrong (although he was in error in calling it "random").

P.P.S. Yes, I realize that James is originally from Australia. Nonetheless, I think my enjoyment of his writing is more a sign of Anglophilia than Australiophilia on my part.

A very short story


A few years ago we went to a nearby fried chicken place that the Village Voice had raved about. While we were waiting to place our order, someone from the local Chinese takeout place came in with a delivery, which the employees of the chicken place proceeded to eat. This should've been our signal to leave. Instead, we bought some chicken. It was terrible.

From "Judge Savage," by Tim Parks:

That evening, Daniel called Hilary's parents. These people always disliked me, he knew. He had never understood if it was a racial thing, or whether they would have disliked any partner of Hilary's.

Very clever. Parks demonstrates Daniel's blind spot--he can't imagine that maybe Hilary's parents hate him because of his unpleasant personality--but does it entirely from Daniel's perspective. I wonder if this just came naturally to Parks, or whether he figured it out as a puzzle to solve--how to convey a blind spot from the perspective of the person looking and not noticing it--or whether Parks wasn't thinking at all about this and it just happened.

As for the character Daniel's psychology, I'd consider the above an example of the so-called fundamental attribution error, in that he's attributing Hilary's parents' dislike of him to situational factors rather than to his own personality.

I'll have more on "Judge Savage" later (on the topic of "fighting the last war").

You know that expression, "Not from the Onion"? What did we say, all those years before the Onion existed?

I was thinking about this after encountering (amidst a Google search for something else) this article on a website called "College News":

DANVILLE, KY., March 8, 2007--Two Centre College professors spent the past six years reading and analyzing 200 children's books to discover a disturbing trend: gender bias still exists in much of modern children's literature.

Dr. David Anderson, professor of economics, and Dr. Mykol Hamilton, professor of psychology, have documented that gender bias is common today in many children's books in their research published recently in Sex Roles: A Journal of Research titled "Gender Stereotyping and Under-Representation of Female Characters in 200 Popular Children's Picture Books: A 21st Century Update." . . .

"Centre College," huh? That's where Area Man is studying, right?

According to the materials on its website, Centre College is ranked very high on some measures, and I wouldn't be surprised if it's an excellent place to get an education. Still, there's something Onion-like about all of this.

I was reading this article by Ariel Levy in the New Yorker and noticed something suspicious. Levy was writing about an event in 1979 and then continued:

One year later, Ronald Reagan won the Presidency, with overwhelming support from evangelicals. The evangelical vote has been a serious consideration in every election since.

From Chapter 6 of Red State, Blue State:


According to the National Election Study, Reagan did quite a bit worse among evangelical Protestants than among voters as a whole--no surprise, really, given that Reagan was not particularly religious and Carter was an evangelical himself.

It was 1992, not 1980, when evangelicals really started to vote Republican.

What's it all about?

I wouldn't really blame Ariel Levy for this mistake; a glance at her website reveals a lot of experience as a writer and culture reporter but not much on statistics or politics. That's fine by me: there's a reason I subscribe to the New Yorker and not the American Political Science Review!

On the other hand, I do think that the numbers are important, and I worry about misconceptions of American politics--for example, the idea that Reagan won "overwhelming support from evangelicals." A big reason we wrote Red State, Blue State was to show people how all sorts of things they "knew" about politics were actually false.

Perhaps the New Yorker and other similar publications should hire a statistical fact checker or copy editor? Maybe this is the worst time to suggest such a thing, with the collapsing economics of journalism and all that. Still, I think the New Yorker could hire someone at a reasonable rate who could fact check their articles. This would free up their writers to focus on the storytelling that they are good at without having to worry about getting the numbers wrong.

Another option would be to write a letter to the editor, but I don't think the New Yorker publishes graphs.

P.S. I've written before about the need for statistical copy editors (see also here, here, and, of course, the notorious "But viewed in retrospect, it is clear that it has been quite predictable").

P.P.S. I think one of my collaborators made this graph, maybe by combining the National Election Study questions on religious denomination and whether the respondent describes him/herself as born again.

P.P.P.S. Somebody pointed out that Reagan did do well among white evangelicals, so maybe that's what Levy was talking about.

Interesting mini-memoir from John Podhoretz about the Upper West Side, in his words, "the most affluent shtetl the world has ever seen."

The only part I can't quite follow is his offhand remark, "It is an expensive place to live, but then it always was." I always thought that, before 1985 or so, the Upper West Side wasn't so upscale. People at Columbia tell all sorts of stories about how things used to be in the bad old days.

I have one other comment. Before giving it, let me emphasize that I enjoyed reading Podhoretz's article and, by making the comment below, I'm not trying to shoot Podhoretz down; rather, I'm trying to help out by pointing out a habit in his writing that might be getting in the way of his larger messages.

Podhoretz writes the following about slum clearance:

A comment by Mark Palko reminded me that, while I'm a huge Marquand fan, I think The Late George Apley is way overrated. My theory is that Marquand's best books don't fit into the modernist way of looking at literature, and that the gatekeepers of the 1930s and 1940s, when judging Marquand by these standards, conveniently labeled Apley as his best book because it had a form--Edith-Wharton-style satire--that they could handle. In contrast, Point of No Return and all the other classics are a mixture of seriousness and satire that left critics uneasy.

Perhaps there's a way to study this sort of thing more systematically?

I was stunned by this from Jenny Davidson about mystery writers:

The crime fiction community is smart and adult and welcoming, and so many good books are being written (Lee Child was mentioning his peer group - i.e. they were the new kids around the same time - being Michael Connelly, Robert Crais, Dennis Lehane, Laura Lippman - the list speaks for itself) . . .

Why was I stunned? Because just a few days earlier I had a look at a book by Robert Crais. It just happened that Phil, when he was visiting, had finished this book (which he described as "pretty good") and left it with me so he wouldn't have to take it back with him. I'd never heard of Crais, but it had pretty amazing blurbs on the cover and Phil recommended it, so I took a look.

It was bad. From page 1 it was bad. It was like a bad cop show. I could see the seams where the sentences were stitched together. I could see how somebody might like this sort of book, but I certainly can't understand the blurbs or the idea that it's a "good book"!

Here's an article that I believe is flat-out entertaining to read. It's about philosophy, so it's supposed to be entertaining, in any case.

Here's the abstract:

A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science.

Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework.

Here's the background:

Helen DeWitt links to an interview with Seth Godin, who makes some commonplace but useful observations on jobs and careers. It's fine, but whenever I read this sort of thing, I get annoyed by the super-aggressive writing style. These internet guys--Seth Godin, Clay Shirky, Philip Greenspun, Jeff Jarvis, and so on--are always getting in your face, telling you how everything you thought was true was wrong. Some of the things these guys say are just silly (for example, Godin's implication that Bob Dylan is more of a success than the Monkees because Dylan sells more tickets), other times they have interesting insights, but reading any of them for a while just sets me on edge. I can't take being shouted at, and I get a little tired of hearing over and over again that various people, industries, etc., are dinosaurs.

Where does this aggressive style come from? My guess is that it's coming from the vast supply of "business books" out there. These are books that are supposed to grab you by the lapel and never let go.

I'm certainly not trying to tell Seth Godin and the others to change--a lot of people (for example, Helen DeWitt!) seem to like their style, and it seems to work for them. I just wanted to comment on this because I suddenly realized I was seeing this in-your-face, change-or-die style all over the place and it was getting annoying.

Perhaps I'll write a blog entry in this style at some point, just to see how it comes out.

P.S. I went to the library the other day and saw their shelves and shelves of business books. Scary stuff.

Classics of statistics


Christian Robert is planning a graduate seminar in which students read 15 classic articles of statistics. (See here for more details and a slightly different list.)

Actually, he just writes "classics," but based on his list, I assume he only wants articles, not books. If he wanted to include classic books, I'd nominate the following, just for starters:
- Fisher's Statistical Methods for Research Workers
- Snedecor and Cochran's Statistical Methods
- Kish's Survey Sampling
- Box, Hunter, and Hunter's Statistics for Experimenters
- Tukey's Exploratory Data Analysis
- Cleveland's The Elements of Graphing Data
- Mosteller and Wallace's book on the Federalist Papers.
Probably Cox and Hinkley, too. That's a book that I don't think has aged well, but it seems to have had a big influence.

I think there's a lot more good and accessible material in these classic books than in the equivalent volume of classic articles. Journal articles can be difficult to read and are typically filled with irrelevant theoretical material, the kind of stuff you need to include to impress the referees. I find books to be more focused and thoughtful.

Accepting Christian's decision to choose articles rather than books, what would be my own nominations for "classics of statistics"? To start with, there must be some tradeoff between quality and historical importance.

One thing that struck me about the list supplied by Christian is how many of these articles I would definitely not include in such a course. For example, the paper by Durbin and Watson (1950) does not seem at all interesting to me. Yes, it's been influential, in that a lot of people use that statistical test, but as an article, it hardly seems classic. Similarly, I can't see the point of including the paper by Hastings (1970). Sure, the method is important, but Christian's students will already know it, and I don't see much to be gained by reading the paper. I'd recommend Metropolis et al. (1953) instead. And Casella and Strawderman (1981)? A paper about minimax estimation of a normal mean? What's that doing on the list??? The paper is fine--I'd be proud to have written it, in fact I'd gladly admit that it's better than 90% of anything I've ever published--but it seems more of a minor note than a "classic." Or maybe there's some influence here of which I'm unaware. And I don't see how Dempster, Laird, and Rubin (1977) belongs on this list. It's a fine article and the EM algorithm has been tremendously useful, but, still, I think it's more about computation than statistics. As to Berger and Sellke (1987), well, yes, this paper has had an immense influence, at least among theoretical statisticians--but I think the paper is basically wrong! I don't want to label a paper as a classic if it sent much of the field in the wrong direction.

For other papers on Christian's list, I can see the virtue of including in a seminar. For example, Hartigan and Wong (1979), "Algorithm AS 136: A K-Means Clustering Algorithm." The algorithm is no big deal, and the idea of k-means clustering is nothing special. But it's cool to see how people thought about such things back then.

And Christian also does include some true classics, such as Neyman and Pearson's 1933 paper on hypothesis testing, Plackett and Burman's 1946 paper on experimental design, Pitman's 1939 paper on inference (I don't know if that's the best Pitman paper to include, but that's a minor issue), Cox's hugely influential 1972 paper on hazard regression, Efron's bootstrap paper, and classics by Whittle and Yates. Others I don't really feel so competent to judge (for example, Huber (1985) on projection pursuit), but it seems reasonable enough to include them on the list.

OK, what papers would I add? I'll list them in order of time of publication. (Christian used alphabetical order, which, as we all know, violates principles of statistical graphics.)

Neyman (1935). Statistical problems in agricultural experimentation (with discussion). JRSS. This one's hard to read, but it's certainly a classic, especially when paired with Fisher's comments in the lively discussion.

Tukey (1972). Some graphic and semigraphic displays. This article, which appears in a volume of papers dedicated to George Snedecor, is a lot of fun (even if in many ways unsound).

Akaike (1973). Information theory and an extension of the maximum likelihood principle. From a conference proceedings. I prefer this slightly to Mallows's paper on Cp, written at about the same time (but I like the Mallows paper too).

Lindley and Smith (1972). Bayes estimates for the linear model (with discussion). JRSS-B. The methods in the paper are mostly out of date, but it's worth it for the discussion (especially the (inadvertently) hilarious contribution of Kempthorne).

Rubin (1976). Inference and missing data. Biometrika. "Missing at random" and all the rest.

Wahba (1978). Improper priors, spline smoothing and the problem of guarding against model errors in regression. JRSS-B. This stuff all looks pretty straightforward now, but maybe not so much so back in 1978, back when people were still talking about silly ideas such as "ridge regression." And it's good to have all these concepts in one place.

Rubin (1980). Using empirical Bayes techniques in the law school validity studies (with discussion). JASA. Great, great stuff, also many interesting points come up in the discussion. If you only want to include one Rubin article, keep this one and leave "Inference and missing data" for students to discover on their own.

Hmm . . . why are so many of these from the 1970s? I'm probably showing my age. Perhaps there's some general principle that papers published in year X have the most influence on graduate students in year X+15. Anything earlier seems simply out of date (that's how I feel about Stein's classic papers, for example; sure, they're fine, but I don't see their relevance to anything I'm doing today, in contrast to the above-noted works by Tukey, Akaike, etc., which speak to my current problems), whereas anything much more recent doesn't feel like such a "classic" at all.

OK, so here's a more recent classic:

Imbens and Angrist (1994). Identification and estimation of local average treatment effects. Econometrica.

Finally, there are some famous papers that I'm glad Christian didn't consider. I'm thinking of influential papers by Wilcoxon, Box and Cox, and zillions of papers that introduced particular hypothesis tests (the sort that have names that they tell you in a biostatistics class). Individually, these papers are fine, but I don't see that students would get much out of reading them. If I was going to pick any paper of that genre, I'd pick Deming and Stephan's 1940 article on iterative proportional fitting. I also like a bunch of my own articles, but there's no point in mentioning them here!

Any other classics you'd like to nominate (or places where you disagree with me)?

The commenting feature doesn't work for me on Helen DeWitt's blog so I'm forced to comment on her entries here.

1. She discusses whether it's fair to characterize The Last Samurai as a beach read. I have a slightly different perspective on this: I've never really understood the idea that a "beach read" should be something light and fluffy. On the beach, you can relax, you have the time to get into anything. I could see wanting something light on the subway--you have to be able to get into it right away and follow it amid all the jostles. I guess the point is that when you're at the beach, you're far from the library. So what you really want for the beach is not necessarily something relaxing or easy to read, but rather a sure thing, a known quantity that you'll definitely enjoy. No point sitting on the beach reading a book that you hate.

2. In an interesting discussion of translation, DeWitt recommends learning a language by reading great literature in the original tongue. Seems fine to me . . . but, I gotta admit that I think of pleasure reading (even serious stuff) as a way to relax. For me, reading a novel (or even the newspaper) in French or Spanish is a substitute for work, not a substitute for bedtime reading.

3. Apparently many people use tax-preparation software, even though tax forms aren't really hard to fill out. One thing I'll say in encouragement of doing your own taxes is that I used to do my own taxes and every once in a while I'd make a mistake, but . . . it wasn't so horrible! I'd just send the government a check for what I owed and there'd be a little fine. No big deal. Sometimes I'd overpay by mistake and they'd send me a refund check. I can't guarantee this will happen to you, but those were my experiences.

Mark Palko writes:

Roth and Amsterdam

| 1 Comment

I used to think that fiction is about making up stories, but in recent years I've decided that fiction is really more of a method of telling true stories. One thing fiction allows you to do is explore what-if scenarios. I recently read two books that made me think about this: The Counterlife by Philip Roth and Things We Didn't See Coming by Steven Amsterdam. Both books are explicitly about contingencies and possibilities: Roth's tells a sequence of related but contradictory stories involving his Philip Roth-like (of course) protagonist, and Amsterdam's is based on an alternative present/future. (I picture Amsterdam's book as being set in Australia, but maybe I'm just imagining this based on my knowledge that the book was written and published in that country.) I found both books fascinating, partly because of the characters' voices but especially because they both seemed to exemplify George Box's dictum that to understand a system you have to perturb it.

So, yes, literature and statistics are fundamentally intertwined (as Dick De Veaux has also said, but for slightly different reasons).

Intellectual property


Somebody should warn Doris Kearns Goodwin not to take any of this guy's material. . . .



Rajiv Sethi quotes Bentley University economics professor Scott Sumner writing on the first anniversary of his blog:

Be careful what you wish for. Last February 2nd I [Sumner] started this blog with very low expectations... I knew I wasn't a good writer . . . And I was also pretty sure that the content was not of much interest to anyone.

Now my biggest problem is time--I spend 6 to 10 hours a day on the blog, seven days a week. Several hours are spent responding to reader comments and the rest is spent writing long-winded posts and checking other economics blogs. . . .

I [Sumner] don't think much of the official methodology in macroeconomics. Many of my fellow economists seem to have a Popperian view of the social sciences. You develop a model. You go out and get some data. And then you try to refute the model with some sort of regression analysis. . . .

My problem with this view is that it doesn't reflect the way macro and finance actually work. Instead the models are often data-driven. Journals want to publish positive results, not negative. So thousands of macroeconomists keep running tests until they find a "statistically significant" VAR model, or a statistically significant "anomaly" in the EMH. Unfortunately, because the statistical testing is often used to generate the models, and determine which get published, the tests of statistical significance are meaningless.

I'm not trying to be a nihilist here, or a Luddite who wants to go back to the era before computers. I do regressions in my research, and find them very useful. But I don't consider the results of a statistical regression to be a test of a model, rather they represent a piece of descriptive statistics, like a graph, which may or may not usefully supplement a more complex argument that relies on many different methods . . .

I [Sumner] like Rorty's pragmatism; his view that scientific models don't literally correspond to reality, or mirror reality. Rorty says that one should look for models that are "coherent," that help us to make sense of a wide variety of facts. . . .

Interesting, especially given my own veneration of Popper (or, at least the ideal version of Popper as defined in Lakatos's writings). Sumner is writing about macroeconomics, which I know nothing about. In any case, I should probably read something by Rorty. (I've read the name "Rorty" before--I'm pretty sure he's a philosopher and I think his first name is "Richard," but that's all I know about him.)

Sumner also writes:

I suppose it wasn't a smart career move to spend so much time on the blog. If I had ignored my commenters I could have had my manuscript revised by now. . . . And I really don't get any support from Bentley, as far as I know the higher ups don't even know I have a blog. So I just did 2500 hours of uncompensated labor.

I agree with Sethi that Sumner's post is interesting and captures much of the blogging experience. But I don't agree with that last bit about it being a bad career move. Or perhaps Sumner was kidding? (It's notoriously difficult to convey intonation in typed speech.) What exactly is the marginal value of his having a manuscript revised? It's not like Bentley would be compensating him for that either, right? For someone like Sumner (or, for that matter, Alex Tabarrok or Tyler Cowen or my Columbia colleague Peter Woit), blogging would seem to be an excellent career move, both by giving them and their ideas much wider exposure than they otherwise would've had, and also (as Sumner himself notes) by being a convenient way to generate many thousands of words that can be later reworked into a book. This is particularly true of Sumner (more than Tabarrok or Cowen or, for that matter, me) because he tends to write long posts on common themes. (Rajiv Sethi, too, might be able to put together a book or some coherent articles by tying together his recent blog entries.)

Blogging and careers, blogging and careers . . . is blogging ever really bad for an academic career? I don't know. I imagine that some academics spend lots of time on blogs that nobody reads, and that could definitely be bad for their careers in an opportunity-cost sort of way. Others such as Steven Levitt or Dan Ariely blog in an often-interesting but sometimes careless sort of way. This might be bad for their careers, but quite possibly they've reached a level of fame in which this sort of thing can't really hurt them anymore. And this is fine; such researchers can make useful contributions with their speculations and let the Gelmans and Fungs of the world clean up after them. We each have our role in this food web. (Personally I think I'm as careful in everything I blog as in my published research--take this one however you want!--and I welcome blogging as a way to put ideas out there and often get useful criticism. My impression is that Sumner and Sethi feel the same way, but authors who have reached the bestseller level probably just don't have the time to read their blog comments.)

And then of course there are the many many bloggers, academic and otherwise, whose work I assume I would've encountered much more rarely were they not blogging.

The other issue that Sethi touches on is the role of blogging in economic discourse. Which brings us to the ("reverse causal") question of why there are so many prominent academic bloggers from economics (also sociology and law, it appears) but not so many in political science or psychology or, for that matter, statistics.

I guess the last one of these is easy enough to answer: there aren't so many statisticians out there, most of them don't seem to really enjoy writing, and statistics isn't particularly newsworthy. I had a conversation about this the other day after writing something for Physics Today. Physics Today is the monthly magazine of the American Physical Society, and it's fun to read. It was a pleasure to write for it. But could there be Statistics Today? It wouldn't be so easy! In physics there's news every month, exciting new experiments, potential path-breaking theories, and the like. Somebody somewhere is building a microscope that can look inside a quark, and somebody else is figuring out how to generalize Heisenberg's uncertainty principle to account for this. Meanwhile, in statistics, there's . . . a new efficient estimator for Poisson regression? News about the Census? No, when statisticians try to be entertaining, they typically end up writing about statistical errors made by non-statisticians. (Oops, I've done that too!). This can be fun now and then, but you can't make a monthly magazine out of it.

J. Robert Lennon writes:

At the moment I [Lennon] am simultaneously working on two magazine articles, each requiring me to assess not just a book, but (briefly) a writer's entire career. The writers in question are both prominent, both widely published, read, and appreciated. And yet neither, I think, enjoys a full appreciation of their career--its real scope, with all its twists and turns, its eccentricities intact.

In one case, the writer had one smash hit, and one notorious book everyone hates. In the other, the writer has somehow become known as the author of one really serious book that gets taught a lot in college classes, and a bunch of other stuff generally thought to be a little bit frivolous. But close readings of each (hell, not even that close) reveal these reputations to be woefully inadequate. Both writers are much more interesting than their hits and bombs would suggest.

This naturally got me thinking about statisticians. Some statisticians are famous (within the statistics world) without having made any real contributions (as far as I can tell). And then there are the unappreciated (such as the psychometrician T. L. Kelley) and the one-hit wonders (Wilcoxon?).

I was also curious who Lennon's subjects are. Feel free to place your guesses below. I'll send a free book to the first person who guesses both authors correctly (if you do it before Lennon announces it himself).

Around the time I was finishing up my Ph.D. thesis, I was trying to come up with a good title--something more grabby than "Topics in Image Reconstruction for Emission Tomography"--and one of the other students said: How about something like, Female Mass Murderers: Babes Behind Bars? That sounded good to me, and I was all set to use it. I had a plan: I'd first submit the boring title--that's how it would be recorded in all the official paperwork--but then at the last minute I'd substitute in the new title page before submitting to the library. (This was in the days of hard copies.) Nobody would notice at the time; then later on, if anyone went into the library to find my thesis, they'd have a pleasant surprise. Anyway, as I said, I was all set to do this, but a friend warned me off. He said that at some point, someone might find it, and the rumor would spread that I'm a sexist pig. So I didn't.

I was thinking about this after hearing this report based on a reading of Supreme Court nominee Elena Kagan's undergraduate thesis. Although in this case, I suppose the title was unobjectionable, it was the content that bothered people.

The official announcement:

The Excellence in Statistical Reporting Award for 2010 is presented to Felix Salmon for his body of work, which exemplifies the highest standards of scientific reporting. His insightful use of statistics as a tool to understanding the world of business and economics, areas that are critical in today's economy, sets a new standard in statistical investigative reporting.

Here are some examples:

Tiger Woods

Nigerian spammers

How the government fudges job statistics

This one is important to me. The idea is that "statistical reporting" is not just traditional science reporting (journalist talks with scientists and tries to understand the consensus) or science popularization or silly feature stories about the lottery. Salmon is doing investigative reporting using statistical thinking.

Also, from a political angle, Salmon's smart and quantitatively sophisticated work (as well as that of others such as Nate Silver) is an important counterweight to the high-tech mystification that surrounds so many topics in economics.

Jenny writes:

The Possessed made me [Jenny] think about an interesting workshop-style class I'd like to teach, which would be an undergraduate seminar for students who wanted to find out non-academic ways of writing seriously about literature. The syllabus would include some essays from this book, Geoff Dyer's Out of Sheer Rage, Jonathan Coe's Like a Fiery Elephant - and what else?

I agree with the commenters that this would be a great class, but . . . I'm confused on the premise. Isn't there just a huge, huge amount of excellent serious non-academic writing about literature? George Orwell, Mark Twain, Bernard Shaw, T. S. Eliot (if you like that sort of thing), Anthony Burgess, Mary McCarthy (I think you'd call her nonacademic even though she taught the occasional college course), G. K. Chesterton, etc etc etc? Teaching a course about academic ways of writing seriously about literature would seem much tougher to me.

Trips to Cleveland


Helen DeWitt writes about The Ask, the new book by Sam Lipsyte, author of a hilarious book I read a couple years ago about a loser guy who goes to his high school reunion. I haven't read Lipsyte's new book but was interested to see that he teaches at Columbia. Perhaps I can take him to lunch (either before or after I work up the courage to call Gary Shteyngart and ask him about my theory that the main character of that book is a symbol of modern-day America).

In any case, in the grand tradition of reviewing the review, I have some thoughts inspired by DeWitt, who quotes from this interview:

LRS: I was studying writing at college and then this professor showed up, a disciple of Gordon Lish, and we operated according to the Lish method. You start reading your work and then as soon as you hit a false note she made you stop.

Lipsyte: Yeah, Lish would say, "That's bullshit!"

If they did this for statistics articles, I think they'd rarely get past the abstract. The methods are so poorly motivated. You're doing a so-called "exact test" because . . . why? And that "uniformly most powerful test" is a good idea because . . . why again? Because "power" is good? And that "Bayes factor"? Etc.

The #1 example of motivation I've ever seen was in the movie The Grifters. In a very early scene, John Cusack gets punched in the stomach and is seriously injured, and that drives everything else in the plot.

DeWitt quotes Gerald Howard:

Lish's influence can be seen in Sam's obvious concentration on the crafting of his sentences and his single-minded focus on style, a quality less prevalent in the work of younger American writers than it should be. (Savor the perfectly pitched ear required to turn a simple phrase like "a dumpling, some knurled pouch of gristle.") Sam replies that "Gordon said many things that I will never forget, but the one thing that I always think about is that he said once, 'There is no getting to the good part. It all has to be the good part.' And so I think that when people are writing their novels they are just thinking about the story, about what has to happen so their character can get to Cleveland. . . ."

The way I put it (from the perspective of nonfiction writing) is "Tell 'em what they don't know." And, ever since having read The Princess Bride many years ago, I've tried to put in only the "good parts" in all my books. That was one thing that was fun about writing Teaching Statistics: A Bag of Tricks. We felt no obligation to be complete or to include boring stuff just because we were supposed to. Most textbooks I've seen have way too many trips to Cleveland.

One thing I say about statistics is: I always try to fit the dumbest, simplest possible model for any problem I'm working on. But, unfortunately, the simplest method that is even plausibly appropriate for any problem is typically just a little bit more complicated than the most complicated thing I know how to fit.

I guess there's a similar principle in writing: You restrict yourself to the good stuff, but there's just a bit too much good stuff to fit in whatever container you have in mind. And then you must, as the saying goes, kill your darlings.

P.S. To connect to another of our common themes: Ed Tufte's mother, of all people, wrote a good book about the construction of sentences. Sentences are important and, to the best of my knowledge, nonalgorithmic. That is, I have no clean method for constructing clear sentences. I often have to rephrase to avoid the notorious garden-path phenomenon. I wonder how Vin Scully did it. Was it just years of practice?

P.P.S. One thing I love about Marquand are his chapter titles. I can't usually hope to match him, but he's my inspiration for blog entry titles such as this one.

For "humanity, devotion to truth and inspiring leadership" at Columbia College. Reading Jenny's remarks ("my hugest and most helpful pool of colleagues was to be found not among the ranks of my fellow faculty but in the classroom. . . . we shared a sense of the excitement of the enterprise on which we were all embarked") reminds me of the comment Seth made once, that the usual goal of university teaching is to make the students into carbon copies of the instructor, and that he found it to be much better to make use of the students' unique strengths. This can't always be true--for example, in learning to speak a foreign language, I just want to be able to do it, and my own experiences in other domains are not so relevant. But for a worldly subject such as literature or statistics or political science, then, yes, I do think it would be good for students to get involved and use their own knowledge and experiences.

One other statement of Jenny's caught my eye. She wrote:

Burgess on Kipling


This is my last entry derived from Anthony Burgess's book reviews, and it'll be short. His review of Angus Wilson's "The Strange Ride of Rudyard Kipling: His Life and Works" is a wonderfully balanced little thing. Nothing incredibly deep--like most items in the collection, the review is only two pages long--but I give it credit for being a rare piece of Kipling criticism I've seen that (a) seriously engages with the politics, without (b) congratulating itself on bravely going against the fashions of the politically incorrect chattering classes by celebrating Kipling's magnificent achievement blah blah blah. Instead, Burgess shows respect for Kipling's work and puts it in historical, biographical, and literary context.

Burgess concludes that Wilson's book "reminds us, in John Gross's words, that Kipling 'remains a haunting, unsettling presence, with whom we still have to come to terms.' Still." Well put, and generous of Burgess to end his review with another's quote.

Other critics, most notably George Orwell and Edmund Wilson, have also written interestingly on Kipling.



From Anthony Burgess's review of "The Batsford Companion to Popular Literature," by Victor Neuburg:

Arthur J. Burks (1898-1974) was no gentleman. During the 1930s, when he would sometimes have nearly two million words in current publication, he aimed at producing 18,000 words a day.
Editors would call me up and ask me to do a novelette by the next afternoon, and I would, but it nearly killed me. . . . I once appeared on the covers of eleven magazines the same month, and then almost killed myself for years trying to make it twelve. I never did.

[Masanao: I think you know where I'm heading with that story.]

Ursula Bloom, born 1892 and still with us [this was written sometime between 1978 and 1985], is clearly no lady. Writing also under the pseudonyms of Lozania Prole (there's an honest name for you), Sheila Burnes and Mary Essex, she has produced 486 books, beginning with Tiger at the age of seven. . . .

Was Richard Horatio Edgar Wallace (1875-1932) a gentleman? . . . . In the 1920s and 1930s, Mr Neuburg tells us, one in four of all books was the work of Wallace. [How did Neuburg estimate this? I guess I'll have to track down his book and find out.] Everybody, especially the now unreadable Sir Hugh Walpole, looked down on this perpetually dressing-gowned king of the churners, who gave the public what it wanted.

Burgess continues:

What the public wanted, and still wants, is an unflowery style woven out of cliches, convincing dialogue, loads of action. Is the public wrong.

The added emphasis is my own.

I'll have more to say at some point about the popular literature of the past, but for now let me just note the commonplace that once-bestselling melodramas often seem unreadable to present-day audiences. I'm guessing that it has something to do with the cliches not working any more and the dialogue no longer being convincing. I'm sure there will even be a day when Eddie Coyle's words no longer sound natural. (Not that that book was ever extremely easy to read, nor was it a major bestseller.)

Inappropriate parallelism


I've been teaching at elite colleges for over twenty years, and one thing that persistently frustrates me is the students' generally fluent ability to manipulate symbols without seeming to engage with the underlying context. Colorless green ideas sleep furiously, and all that. My theory is that in high school these students were trained to be able to write a five-paragraph essay on anything at all.

I was reminded of this when reading an article on the recent airline disruptions in Europe, where Washington Post columnist Anne Applebaum writes:

A friend with no previous interest in airline mechanics explained over the phone how two planes had already been affected. Another proffered a detailed description of the scientific process by which the ash enters the engine, melts from the heat, and turns back into stone--not what one wants inside one's airplane engines, really.

Others have become mystics. A British friend sees this as "judgment for the bad things we have done to the Earth." . . .

So far, so good. But then:

Though it is uncanny, I [Applebaum] do understand why some want science to explain this odd event and why others see the revenge of the volcano gods.

Huh? It seems a bit . . . anthropological of her to put science and "the volcano gods" in this sort of parallelism. It's no big deal, really, it just reminds me of a remark I once read that newspapers were better in the old days: Back when "journalists" were "reporters" and didn't have college educations, they just reported the facts and had neither the obligation to understand the world nor the inclination to smooth out reality to fit the contours of their well-rounded sentences. As a college teacher, I'm the last person to endorse such an idea, but it does have its appeal.

P.S. No, I doubt that Applebaum herself thinks of scientific and volcano-god explanations as equivalent. But that's my point: she wrote something that she (probably) doesn't really believe and, I suspect, she didn't really think clearly about, just because it fit the flow of her article. It was symbol-manipulation without full context.

David Blackbourn writes in the London Review of Books about the German writer Hans Magnus Enzensberger:

But there are several preoccupations to which Enzensberger has returned. One is science and technology. Like left-wing intellectuals of an earlier period, but unlike most contemporary intellectuals of any political stamp, he follows scientific thinking and puts it to use in his work. There are already references to systems theory in his media writings of the 1960s, while essays from the 1980s onwards bear the traces of his reading in chaos theory.

For some inexplicable reason, catastrophe theory has been left off the list. Blackbourn continues:

One of these takes the topological figure of the 'baker's transformation' (a one-to-one mapping of a square onto itself) discussed by mathematicians such as Stephen Smale and applies the model to historical time as the starting point for a series of reflections on the idea of progress, the invention of tradition and the importance of anachronism.

Pseuds corner indeed. I can hardly blame a European intellectual who was born in 1929 for getting excited about systems theory, chaos theory, and the rest. Poets and novelists of all sorts have taken inspiration from scientific theories, and the point is not whether John Updike truly understood modern computer science or whether Philip K. Dick had any idea what the minimax strategy was really about--these were just ideas, hooks on which to hang their stories. All is fodder for the creative imagination.

I have less tolerance, however, when someone writing in the London Review of Books describes this sort of thing as an indication that Enzensberger "follows scientific thinking and puts it to use in his work." Perhaps "riffs on scientific ideas" would be a better way of putting it.

P.S. See Sebastian's comment below. Maybe I was being too quick to judge.

Nate asks, "Why is writing a 100,000-word book about 10 times harder than writing 100 1,000-word blog posts?" I don't know if this is true at all. Writing the book might be less fun than writing the blog posts, but is it really harder? If you really want to write the book, maybe the trick is to make blogging feel like work and book-writing feel like a diversion.

From Anthony Burgess's review of "Best Sellers: Popular Fiction of the 1970s," by John Sutherland, I learn that The Godfather sold 300,000 hardcover and 13 million paperbacks during that decade, and Fear of Flying, the book at the bottom of the New York Times's bestselling list (#10? #15?), sold 100,000 hardcover and nearly 6 million in softback. Comparing to the list from 1965, we see that absolute sales increased rapidly. And the numbers have shot up still more if it's true what they're saying about James Patterson.

What I want to know is, what happened to all those copies of "Alive!" Back in the 1970s, you used to see copies of that book everywhere. Did everyone who owned that book throw it out? Or was it printed on some sort of special disintegrating paper guaranteed to fall apart after twenty years? I guess I should also ask what happened to those two hundred million Perry Mason books. Are they all in Grandma's attic somewhere?

From Anthony Burgess's review of John Updike's The Coup:

This emboldens me [Burgess] to set some of Mr Updike's prose as verse, making such small emendations as are necessary to regularize the rhythm:
The piste diminished to a winding track,

Treacherously pitted, strewn with flinty scrabble
That challenged well the mettle of our Michelin
Steel-belted radials. Distances grew bluish;
As we rode higher, clots of vegetation,
Thorny and leafless, troubled with grasping roots
The rocks. In the declivities that broke
Our grinding, twisting ascent, there were signs
Of pasturage: clay trampled to a hardened
Slurry by hooves, and also excrement
Distinguishable still from mineral matter,
Some toppled skeletons of beehive huts,
Consumed, their thatches, as a desperate fodder.
Aristida, which thrives on overgrazed
Lands, tinged with green this edge of desolation.

I see Burgess's point. More than this, I'm reminded of the very low standards of contemporary poetry. If you want to write a novel or even a short story, it better be interesting in some way. For a poem, though, it just has to be . . . not too embarrassingly bad. The above passage, as poetry, would fit just fine in the New Yorker or elsewhere--it follows all the rules (imaginary gardens with real toads and all the other life-affirming b.s.). And, in fact, if I saw it there and troubled to read it in that format, I'd probably think of it as pretty good, in the sense that I could actually figure out what it is talking about. Personally, I trace this back to what we were taught by our high school English teachers about poetry being intense, poetry being a puzzle, and of course good poetry being something you're supposed to like. Now, I fully admit that T. S. Eliot has been admired by many people whose literary skills, tastes, and judgments I respect much more than my own--still, I think of him not so much as a great poet but as somebody who was well connected and got off a few good lines. I'm down on the whole poetry-as-puzzle thing. It worked OK for Michael Stipe, but he had that music thing going on.

OK, now on to Topic #1

The above is all set-up to my main point. My real goal here is not to bash poetry but to reflect upon a related item from Burgess's review, where he writes:

[Updike] is committed also to a kind of poetic unit, a verse paragraph that, in certain contexts of action or even speech, seems excessively long. And there is a basic melody which seems to require otiose adjectives:
. . . a poster of Elvis Presley in full sequinned regalia, Marilyn Monroe from a bed of polar bear skins making upwards at the lens the crimson O of a kiss whose mock emotion led her to close her greasy eyelids, and a page torn from that magazine whose hearty name of Life did not save it from dying. . . .

That hearty [continues Burgess] is surely unnecessary . . . yet without the adjective the prosody falters.

I have a few comments:

1. I think Burgess is right. "Hearty" does not really make sense there, but it wouldn't work to simply remove it. For one thing, "whose name of Life" would sound too much like "the game of life." The word "hearty" nudges the reader and keeps the sense of the sentence moving along.

2. Updike's rhythm really works, allowing the reader (that is, me) to follow a complex sentence straight through the first time. And I know, from my own experience, that it's not easy to get that sort of flow.

Just for example, why did Updike put "upwards at the lens" before "the crimson O"? In spoken English it would be natural to set out the object of the verb right away (that is, to say, "Making the crimson O of a kiss" and go from there). But, no, that wouldn't work: if you put the "upwards at the lens" phrase after the kiss, it won't be clear that it should be modifying Marilyn, not her kiss.

I'm aware of this particular rearrangement trick because I do it all the time in my own writing. My point is that it takes a lot of work to get the sentences to just flow on the printed page, where you don't have timing, intonation, facial expressions, and gestures to help clarify your intentions.

3. There are other ways to keep the logical flow, for example the sort of frank discursiveness most impressively (to me) done by Nicholson Baker in The Mezzanine.

4. My main reaction, though, is the thought that Burgess's argument applies to me as well! At a much lower level than Updike, sure, but still. I put so much effort into the flow of my sentences (even, I'm embarrassed to admit, when blogging), and in writing about technical matters (as I usually do), I have the further constraint of not wanting to get anything wrong. In Bayesian Data Analysis in particular, I went carefully over everything to make sure I was not saying anything sloppy. I'd noticed that a lot of the statistics literature had such sloppiness, and I wanted to be careful to label rigorous results, conjectures, and opinions as such. Along with all of this, I try to avoid cliches, especially those sloppy expressions such as "the lion's share" (one of my pet peeves, for some reason--and, no, I don't really think of "pet peeve" as a cliche even though, yes, I know that it is) which are vaguely--but only vaguely--quantitative.

Anyway, in making sure my sentences are readable, I sometimes lapse into too rhythmic a style. I hadn't really thought of this except in special cases, but reading this Burgess review of Updike, it all suddenly makes sense. There can be a tradeoff between rhythm and meaning. And not just in a simple way that you can choose between words that are true and words that are beautiful. Rather, the same rhythm that I rely on to make my words clear on the page (or the screen) has the effect of requiring me to add unnecessary words. And there's no simple solution, because if I just remove those extra words, all of a sudden it can be hard for people to follow what I'm saying.

As they say in the stagecoach business, remove the padding from the seats and you get a bumpy ride.

P.S. This all reminds me . . . I'm a big Updike fan (despite being unimpressed by his book titles); I love Rabbit, Run and like its sequels, many of the short stories are just amazing . . . a few years ago I picked up a copy of Roger's Version, which I recalled as having received good reviews. I read a few pages but just couldn't continue--nothing about the writing seemed remotely plausible. It didn't seem anything like how a real person would talk (nor was it interesting enough to read for other reasons). Rabbit, Run, though, that was great. I like how it was written and also its ideas. I completely disagree with the notion that Updike was merely a painter of pretty word-pictures with nothing to say.

P.P.S. Speaking of Updike, here's my favorite poetry-related item from recent years (from the New Yorker in 1994):


In a comment on a note on Anthony Burgess, Steve writes:

Burgess famously was told once by his doctor that he had a year to live. So, to provide for his family after his death, he wrote five novels in one year, all published. He turned out to be perfectly healthy and went on to publish countless words.

I just read a biography--The Real Life of Anthony Burgess, by Andrew Biswell--which suggests that it didn't quite happen like that.

The mystery of mysteries


Commenter Maxine on Jenny's blog writes:

I find that the endings are the worst things about crime fiction. Harlan Coben for example writes great posers, but then the end.....hmmm. Sophie Hannah seems to be doing something similar (just read her most recent, A Room Swept White). . . .

One quite frustrating thing about reviewing crime fiction is that one cannot criticise these silly endings as you would then give away the whole point.

This all sounds reasonable, but . . . why is it that revealing the ending of a crime novel would "give away the whole point" more than revealing the end of a non-crime novel?

Is there something inherent in crime that makes it more suspenseful than other endeavors, or just some sort of literary convention? I don't see it. There can be just as much suspense in a story of love, or illness, or animals, or war, or all sorts of other human endeavors, no? So maybe it's just a tradition, that crime stories are expected to be wrapped around a "whodunit?" or "how will he get caught?" structure, whereas other sorts of stories are expected to work off of a known outcome.

I have this image in my mind of the central thread of a crime story tied down at one end (usually, although not necessarily, the chronological beginning), and the central thread of a traditional non-crime story being tied down at two ends, so you know where it will start and where it will end.

I was thinking about this general topic the other day while reading snippets of a sci-fi novel that I've been carrying around in my jacket pocket. The novel is from 1950--reprinted in the 70s, I think, but still old enough that it's satisfyingly short and easy to carry. It's fun to see all the conventions being followed, one of which is that it will have a happy ending, in the sense that the hero will be alive and victorious at the story's end. Well, maybe not--I still have another 10 or so pages to go, will need to wait for a few more long lines at the market--but I'm pretty sure it'll go that way. This novel is like a string tied at both ends. It has points of suspense--as in Dr. Jekyll and Mr. Hyde, I don't know how it's gonna get to where it's going--but the basic structure is clear.

One other thing--reading these oldstyle books often diminishes my appreciation of newer works in the same genre. The original can be so much fresher.

I encountered this book in the library, "Homage to Qwert Yuiop: Selected Journalism 1978-1985," by Anthony Burgess. It's just great. I think somebody should collect and print the rest of Burgess's journalism too. (There do seem to be one or two other collections out there but I doubt they have the density of this overflowing volume of 589 pages of small print. I have a feeling they're leaving out a lot of good stuff.) I'd also gladly buy an edition of the uncollected book reviews of Alfred Kazin or of Anthony Boucher. I'm sure a lot of editing would be required, but I think it would be worth it.

It would be fun to spend some months going through old book reviews from the 1950s-1980s and collecting the most interesting parts for a collection. But I suspect that almost nobody but me would be interested in such a thing.

P.S. The library had another book of Burgess's previously uncollected essays--a collection prepared posthumously. I was surprised to find that I did not like these essays at all! The problem, I think, is that the editors tried to collect what seemed to them to be the most important of his works, which led to the inclusion of a lot of pretentious crap. I think Burgess's workaday book reviews were much more interesting and judicious than these pieces where he had more freedom to say what he wanted.

A huge ad in the subway . . .


. . . for Paul Auster's new book.

Only in Paris, huh?

Deep in a long discussion, Phil writes, in evident frustration:

I don't like argument by innuendo. Say what you mean; how hard is it, for cryin' out loud?

Actually, it is hard! I've spent years trying to write directly, and I've often noticed that others have difficulty doing so. I always tell students to simply write what they did, in simple declarative sentences. (It can be choppy, that's fine: "I downloaded the data. I cleaned the data. I ran a regression" etc.) But it's really hard to do. As George Orwell put it, good prose is like a windowpane, but sometimes it needs a bit of Windex and a clean rag to fully do its job.

P.S. I feel similarly about statistical graphics. (See also here.)

The Triumph of the Thriller


Patrick Anderson is a sometime novelist, speechwriter, and book reviewer who wrote a book, The Triumph of the Thriller, a couple of years ago--I just recently encountered it in the library. His topic: how crime thrillers have taken over the bestseller list.

Top-selling crime novels are not new. Between 1947 and 1952, Mickey Spillane, amazingly, came out with 7 of the 28 bestselling books in the history of U.S. publishing. The titles of his books: "I, the Jury," "The Big Kill," "My Gun is Quick," "One Lonely Night," "The Long Wait," "Kiss Me, Deadly," and "Vengeance is Mine." I think you get the picture. Meanwhile, from the 1930s onward, Erle Stanley Gardner published over 100 crime novels, among which an incredible 91 sold over a million copies each. So, not new at all.

What's new is not the presence of the thriller but its triumph. James Patterson's books sold 14 million copies in a single year, more than Grisham, King, and Brown combined. (And, of course, each of these benchmarks is himself a writer of thrillers.)

The 1870 census


Elissa Brown points us to this reproduction of the 56 beautiful pages of the Statistical Atlas of the Ninth Census, published in 1874. Takes forever to download it but it's worth it.

Here's a bit from page 38 (chosen for its humor value, but also because it's pretty):


Page 46 is nice too.

Some of the designs are pretty good, some not so much. I'm not going to give lots of examples--hey, these guys were working back in 1870 and had to do everything by hand! I just wanted to point out that, no, these old graphs are not perfect--many of them could be improved upon in obvious ways--just to avoid the implication that these represent some sort of perfection. They represent an impressive level of effort and also remind us how far we've come.

Wittgenstein would be amused


When writing this comment, I learned that it isn't so easy to spell "Wittgenstein." I had to try several times. Luckily, it's in the spell-checker so I eventually got it by trial and error. Quine's in the spell-checker (but, oddly enough, not "Quine's"), but Tarski isn't.

Some others:

wittgenstein (lower-case): fails the spell-checker.
Wittgenstein's is ok, though. As it should be. So I don't know why Quine works but Quine's doesn't.
Knuth. Yes.
Russell. Yes.
Whitehead. Yes.
Gelman. No.
Meng. No.
Rubin. Yes.

Hey, that's not fair!

Regular readers will know the importance I attach to model checking: the statistical paradigm in which we take a model seriously, follow its implications, and then look carefully for places where those implications don't make sense. Such failures reveal problems with the model, which we can then trace backwards to understand where the assumptions went wrong.
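For readers who like to see this in code, here's a toy numerical version of the same reasoning (my own made-up example, not drawn from any of the analyses discussed here): fit a normal model to skewed data, then check whether replications under the fitted model can reproduce an observed feature, in this case the sample maximum.

```python
import random
import statistics

random.seed(1)
data = [random.expovariate(1.0) for _ in range(100)]   # skewed "true" data

# Fit a (deliberately wrong) normal model.
mu = statistics.mean(data)
sigma = statistics.stdev(data)

# Follow the model's implications: simulate replicate datasets and see
# whether they can reproduce the observed sample maximum.
rep_maxima = [max(random.gauss(mu, sigma) for _ in range(100))
              for _ in range(1000)]
p = sum(m >= max(data) for m in rep_maxima) / len(rep_maxima)
print(f"P(replicated max >= observed max) = {p:.3f}")
# A tiny p flags the model's thin right tail, which we can trace back
# to the normality assumption.
```

The test statistic (the maximum) and the normal-vs-exponential setup are arbitrary choices; the point is the workflow of simulating from the model and comparing to the data.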

This sort of reasoning can be done qualitatively also. From Daniel Drezner, here's a fun example, an analysis of a recent political bestseller:

I [Drezner] hereby retract any and all enthusiasm for Game Change-- because I don't know which parts of it are true and which parts are not. . . . It was on page 89 that I began to wonder just how much Game Change's authors double-checked their sources. This section of the book recounts entertainment mogul David Geffen's "break" with Hillary Clinton's presidential campaign:
The reaction to the column stunned Geffen. Besieged by interview requests, he put out a statement saying Dowd had quoted him accurately. Some of Geffen's friends in Hollywood expressed disbelief. Warren Beatty told him, She's going to be president of the United States--you must be nuts to have done this. But many more congratulated Geffen for having the courage to say what everyone else was thinking but was too afraid to put on the record. They said he'd made them feel safer openly supporting or donating to Obama. Soon after, when Geffen visited New York, people in cars on Madison Avenue beeped their horns and gave him the thumbs-up as he walked down the street (emphasis added [by Drezner]).

A self-refuting sentence indeed. Don't these guys have an editor? This reminds me of our recent discussion of the economics of fact checking.

Another hypothesis is that John Heilemann and Mark Halperin--the authors of Game Change--realized all along that the thumbs-up-on-Madison-Avenue story was implausible, but they felt that it was a good quote to include in order to give a sense of where Geffen was coming from. From this perspective, it should be obvious to the reader that the sentence beginning "Soon after, when Geffen visited New York" was a Geffen quote, nothing more and nothing less. In a book based on interviews, it would just be too awkward to explicitly identify each quotation as, for example, writing, "Geffen told us that soon after he visited New York, people in cars . . ." Sure, that latter version would be more accurate but would disrupt the flow.

Similar reasoning might explain or excuse David Halberstam's notorious errors in his baseball book that were noted by Bill James: Halberstam's goal was not to convey what happened but rather to convey the memories of key participants. Similarly, maybe the point of Game Change is to tell us what people recall, not what was actually happening. An oral history presented in narrative form.

P.S. For more on model checking from a Bayesian statistical perspective, see chapter 6 of Bayesian Data Analysis or this article. Or, if you prefer it in French, this.

In his forthcoming book, Albert-László Barabási writes, "There is a theorem in publishing that each graph halves a book's audience." If only someone had told me this two years ago!

More seriously, this tongue-in-cheek theorem, if true, defines an upsetting paradox. As we discussed at the beginning of the Notes section of Red State, Blue State, we structured the book around graphs because that seemed to be the best way to communicate our findings. Tables are not a serious way of conveying numerical information on the scale that we're interested in, and, sure, we could've done it all in words (even saying things like "We ran a regression and it was statistically significant"), but we felt that this would not fully involve readers in our reasoning. The paradox--or maybe it's not such a paradox at all--is that graphs are grabby, they engage the reader, but this makes reading the book a slower, more intense, and more difficult endeavor.

P.S. Barabási apparently believes the theorem himself. His research publications are full of graphs, but his book has none at all (and no tables either). Well, it has one diagram, I guess. He may very well be making the right call on this one. People who want to see the graphs can follow the references and look up the scientific research articles that underlie the work described in the book.

Kaiser goes through the first chapter of Freakonomics 2 with a statistician's reading, picking out potentially interesting claims and tracking where they come from. It's actually the kind of review I might write--although in this particular instance I chose not to actually read the book, instead speculating on its authors' motivations (see here, here, and here).

Here's Kaiser:

p.20 -- was surprised to learn that women used to have shorter life expectancy than men. I have always thought women live longer. This factoid is used to show that throughout history, "women have had it rougher than men" but "women have finally overtaken men in life expectancy". I'm immediately intrigued by when this overtaking occurred. L&D do not give a date so I googled "female longevity": first hit said "it appears that women have out survived men at least since the 1500s, when the first reliable mortality data were kept."; the most recent hit cited CDC data which showed that U.S. females outlived males since 1900, the first year of reporting. In the Notes, L&D cite a 1980 article in the journal Speculum, published by the Medieval Academy. In any case, the cross-over probably occurred prior to any systematic collection of data so I find this minor section less than convincing.

. . .

p.29 -- They cite statistics about "the typical prostitute in Chicago." In what ways are the subjects of the study "typical" and in what ways are they not typical? The sample size was 160. They don't say much about the selection process of the subjects, except that they all came from three South Side neighborhoods. Would like to know more about the selection.

p.30 -- After much buildup, we get to their surprise: "Why has the prostitute's wage fallen so far?" I'm looking for the data, what does it mean by "so far"? All we have is the assertion "the women's wage premium pales in comparison to the one enjoyed by even the low-rent prostitutes from a hundred years ago." On the previous page, we learn that modern "street prostitutes" earn $350 per week. On p.24, we learn that in the past, Chicago prostitutes took in $25 a week, "the modern equivalent of more than $25,000 a year". Unfortunately, neither of these two numbers is comparable to $350. Dividing $25,000 by 50 weeks (approx.) gives $500 per week. So the drop is $150 off $500, or 30%. But... this is a comparison of wages from prostitution, not of "wage premium". On p.29, the modern study found "prostitution paid about four times more than [non-prostitution] jobs." On p.23, they say "a tempted girl who receives only $6 per week working with her hands sells her body for $25 per week" so we can compute the historical ratio as $25/$6 = 4.17 times. So, I must have gotten the wrong data.

. . .

p.46 -- Some of the language is overdone. They say the men "blew away" the women in a version of an SAT-style math test with twenty questions. What does "blowing away" mean? Scoring 2 more correct questions out of 20.

. . .

The rest of the chapter -- They discuss Allie, a high-end prostitute. This section has little interest for a statistician since it is a sample of one.

This last bit reminded me of my dictum that the activity we call "statistics" exists in the middle of the Venn diagram formed by measurement, comparison, and variability. No two of the three is enough.
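As an arithmetic footnote, Kaiser's wage comparison can be reproduced in a few lines, using only the numbers quoted above (the 50-week year is Kaiser's own approximation):

```python
# Reproducing Kaiser's back-of-the-envelope wage comparison.
# All figures are the ones quoted from the book.

historical_annual = 25_000            # "more than $25,000 a year" (p.24)
weeks_per_year = 50                   # Kaiser's approximation
historical_weekly = historical_annual / weeks_per_year   # $500/week
modern_weekly = 350                   # modern street prostitutes (p.29)

drop = (historical_weekly - modern_weekly) / historical_weekly
print(f"drop in weekly wage: {drop:.0%}")                # 30%

# The books' claim, though, concerns the wage *premium* over other work:
modern_premium = 4.0                  # "about four times more" (p.29)
historical_premium = 25 / 6           # $25 vs. $6 per week (p.23)
print(f"premium, then vs. now: {historical_premium:.2f} vs. {modern_premium:.1f}")
```

The two print statements restate Kaiser's point: the weekly wage fell about 30%, but the premium relative to other jobs is essentially unchanged, so it's unclear which comparison the book's "fallen so far" refers to.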

Not a debunking

Kaiser's comments do not represent a trashing, or debunking, of the much-criticized new Freakonomics book; rather, they represent a careful reading of the sort that someone might do if he was interested in taking its claims seriously.

My first thought was that it's too bad that Levitt and Dubner didn't send a draft of their book to a careful reader like Kaiser for comments. (It's hard to get people to comment; I routinely send draft copies of my books to zillions of friends and colleagues but usually only get a few responses. Which is understandable; people are busy.)

But then I thought, Wait a minute! One person who I'd think is eminently qualified to examine the numbers in Freakonomics is . . . Levitt himself! Did he just not notice the issues that Kaiser mentioned, or was it a communication problem, that Levitt and Dubner were just too close to the material and didn't realize that their readers might not share their knowledge base? Or perhaps they're focused more on the concepts than the details--they like their theories and are not so concerned about the quantitative details. This can work if you're Arnold Toynbee or Susan Sontag but maybe is riskier if part of your reputation is that of supplying your readers with interesting-but-true facts. It's the nature of interesting-but-true facts that they're most interesting if true, and even more interesting if they're convincingly true.

I just finished reading The Aesthetics of Junk Fiction, by Thomas Roberts, and it's the most thought-provoking book I've encountered since Taleb. (By "thought-provoking," I mean just that: these books provoked more thoughts from me than any other books I've read recently.)

It's a book all about literary genres, and what people get out of reading detective stories, romances, science fiction, and westerns.

With genres on my mind, my reaction to receiving Kaiser's new book, Numbers Rule Your World, was that this is the latest in the increasingly popular genre of pop-statistics books.

And then this got me thinking about different sorts of genres. Roberts discusses how a reader will go through detective stories, say, like potato chips--actually, he criticizes the food analogy, but you get the picture, with some people reading book after book by the same author or even the same series, others reading more broadly within a genre, and others dipping into a genre from time to time.

Books are different from T.V., where it's so easy to just flip the channels and encounter something new. With books, it's easier to stay within your preferred genre or genres.

Anyway, here's the thing. People who love mysteries will read one after another. People who love science-fiction will read libraries of the stuff. But, even if you looove pop-statistics books, you probably won't read more than one or two. Unlike mysteries, romances, westerns, etc., pop-statistics books are designed not for addicts but for people who aren't already familiar with the area.

Because of my familiarity with applied statistics, I'm in some ways the worst possible reviewer for Kaiser's book. It's hard for me to judge it, because these ideas are already familiar to me, and I don't really know what would work to make the point to readers who are less statistically aware. (Christian Robert had a similar reaction.)

The book has a blurb on the back from someone from SAS Institute, but I looked at it anyway. And I'm glad I did.

My favorite part was the bit about "How steroid tests miss ten dopers for each one caught." I liked how he showed it with integers rather than probabilities--and I think there's some research that this sort of presentation is helpful. And, more than that, I liked that Kaiser is sending the message that this all makes sense: rather than trying to portray probability as counterintuitive and puzzle-like, he's saying that if you think about things in the right way, they will become clear. The story of the college admissions test questions was interesting too, in the same way.
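To show what the integer presentation buys you, here is a minimal sketch in that spirit. The rates below are hypothetical stand-ins chosen only to match the 10:1 headline ratio; they are not the figures from Kaiser's book.

```python
# Natural-frequency presentation of test accuracy: integers, not
# conditional probabilities. Numbers are illustrative stand-ins.

athletes = 1000
dopers = 110                      # suppose 11% of athletes dope
caught = 10                       # suppose testing flags only 10 of them
missed = dopers - caught          # 100 dopers slip through

print(f"{missed} dopers missed for every {caught} caught")   # 10:1

# The same fact as a conditional probability is much less vivid:
p_caught = caught / dopers
print(f"P(caught | doping) = {p_caught:.3f}")
```

The counts and the probability encode identical information; the counts just make the base rates impossible to lose track of, which I take to be Kaiser's pedagogical point.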

There is an inherent tension in all these pop-statistics books, which send two messages:

1. The statistician as hero, doing clever things, solving problems, and explaining mysteries.

2. The method as hero, allowing ordinary people (just plain statisticians) to do amazing things.

Superman or Iron Man, if you will.

As a statistician myself, I prefer the Iron Man story: I like the idea of developing methods that can help ordinary people solve real problems. My impression is that Kaiser, professional statistician that he is, also prefers the Iron Man frame, although it can be hard to convey this, because stories work better when the heroes are humans, not methods. The next book to write, I guess, should be called, not Amazing Numberrunchers or Fabulous Stat-economists, but rather something like Statistics as Your Very Own Iron Man Suit.

P.S. I didn't understand Kaiser's description of how they handle the waiting lines at Disneyland. When I went there, you'd buy a packet of tickets, ranging from A (lame rides like It's a Small World that nobody ever wanted to go on), through intermediate rides like the teacups, up to the E tickets for the always-crowded rides like Space Mountain. Apparently they changed the system at some point and now have something called a Fast Pass, which sounds like a take-a-number sort of system, with a beeper that tells you when it's your turn to go on your ride. Kaiser describes this as a brilliant innovation, which I guess it is--it seems like an obvious idea, but they certainly don't do it in most doctors' waiting rooms!--but he also describes it as more of a psychological trick in crowd management than an efficiency gain. That's where he loses me. Sure, I accept the point that the rides have a finite capacity, so in that sense you can't really shorten waiting times very much, but if you can wander around while waiting for your ride instead of standing on line, that's a positive gain, no? Standing on line is generally pretty unpleasant.

P.P.S. Do youall like this kind of rambling blog that goes through several ideas, or would it be better for me to split this sort of thing into multiple entries (for example, a review of Kaiser's book, a question about Disneyland, the discussion of genres, and the Superman/Iron Man issue)? I kinda feel that multiple entries would work better on the blog; on the other hand, the sort of single wide-ranging discussion you see here is more interesting in a published review. Maybe I can send this to the American Statistician or some other such publication.

Elissa Brown sends these in. They're actually pretty good, with a quite reasonable Ogden-Nash-style rhythm and a certain amount of statistical content. It's good to know that the kids today are learning useful skills in their graduate programs.

You are perfect; I'd make no substitutions
You remind me of my favorite distributions
With a shape and a scale that I find reliable
You're as comforting as a two parameter Weibull
When I ask you a question and hope you answer truly
You speak as clearly as a draw from a Bernoulli
Your love of adventure is most influential
Just like the constant hazard of an exponential.
With so many moments, all full of fun,
You always integrate perfectly to one.
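Incidentally, the closing couplet checks out. A quick numerical check of the three named distributions (the parameter values are arbitrary choices of mine):

```python
import math

def weibull_pdf(x, shape, scale):
    # two-parameter Weibull density
    return (shape / scale) * (x / scale) ** (shape - 1) * math.exp(-(x / scale) ** shape)

def expon_pdf(x, rate):
    # constant-hazard (exponential) density
    return rate * math.exp(-rate * x)

def midpoint_integral(f, a, b, n=200_000):
    # simple midpoint rule; the upper limit 50 stands in for infinity,
    # since both densities are negligible beyond it for these parameters
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

print(round(midpoint_integral(lambda x: weibull_pdf(x, 2.0, 1.5), 0, 50), 4))  # 1.0
print(round(midpoint_integral(lambda x: expon_pdf(x, 0.5), 0, 50), 4))         # 1.0
print(round(0.7 + 0.3, 4))   # Bernoulli(p=0.3) pmf sums to one
```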

And here are a bunch more:

David Johnstone writes:

A coauthor and I just recently submitted a revision of our manuscript to a journal. If we'd known it was going to be so much work, we probably never would've written the paper in the first place. . . . It's a surprising amount of work between idea and execution (even forgetting about issues such as writing the letter in response to the referee reports). And, actually, this particular review process was very easy, as such things go. Still a lot of effort, though. It reminds me that being able to do something once is a lot less than describing a method clearly and in appropriate generality.

Blog style


I followed this link from Tyler Cowen to "Ben Casnocha on Chile" and found . . . a long blog entry that was exactly in the style of Tyler Cowen! I wonder if Cowen realized this when he linked to it. Probably not: just as we don't notice our own strong smells (or so I've been told), it's probably also hard for anyone to notice an imitation of one's own style. I do wonder whether Casnocha was imitating Cowen on purpose--not such a bad idea when blogging to imitate a master, just as short-story writers continue to imitate John Updike. Personally, I'm sick and tired of book and movie reviewers imitating Pauline Kael--I didn't even like her own writing and I don't enjoy seeing her stylistic tics repeated by others--but, hey, that's their choice.

P.S. In case you're wondering, here are a few Cowenisms in Casnocha's blog:

I heard a rumour that Doris Kearns Goodwin is still being interviewed on TV, and . . . yes, it's true!

My first thought was: What, they couldn't find an equally appealing talking head who wasn't also a plagiarist? I'm sure there are lots of well-spoken historians who'd love the chance to go on Johnny Carson or whatever it's called nowadays.

But then I looked around on her website, and now I'm not sure. Her books have received all sorts of praise as exemplary popular history, and that sounds like as good a qualification as any for explaining history on TV. Who cares if she's a plagiarist? She's not on the tube for her creative writing talent or, for that matter, for her ability to learn from the primary sources.

The other dimension is that plagiarism is a moral offense. At the very least, I think it might help if Goodwin's TV interviewers every once in a while brought up the plagiarism issue in some relevant way. For example, "Since we're on the topic of authenticity in political candidates, what do you think of the accusation that candidate X is ripping off the ideas of politician Y? As a plagiarist yourself, you must have some thoughts on this?" Or, "The relations between senators and their staff are complicated, no? You must have some insights into this, having delegated the writing of your book to research assistants who copied whole chunks from others' work. How many of the 100 members of the U.S. Senate do you think actually read more of the health care bill than you've read of your own publications?"

Patterson update

| 1 Comment

I went to the library and took a look at a book by James Patterson. It was pretty much the literary equivalent of a TV cop show. I couldn't really see myself reading it all the way through, but it was better-written than I'd expected. It's hard for me to see why Patterson wants to keep doing it (even if his coauthors are doing most of the work at this point). But I suppose that, once you're on the bestseller list, it's a bit addictive and you want to stay up there.


About this Archive

This page is an archive of recent entries in the Literature category.

Economics is the previous category.

Miscellaneous Science is the next category.
