The blog is now here. Time to update your feed (if it didn't happen automatically).

OK, the 30 days of statistics are over. I'll still be posting regularly on statistical topics, but now it will be mixed in with everything else, as before.

Here's what I put on the sister blogs in the past month:

1. How to write an entire report with fake data.

2. "Life getting shorter for women in hundreds of U.S. counties": I'd like to see a graph of relative change in death rates, with age on the x-axis.

3. "Not a choice" != "genetic".

4. Remember when I said I'd never again post on albedo? I was lying.

5. Update on Arrow's theorem. It's a Swiss thing, you wouldn't understand.

6. Dan Ariely can't read, but don't blame Johnson and Goldstein.

7. My co-blogger endorses college scholarships for bowling. Which reminds me that my friends and I did "intramural bowling" in high school to get out of going to gym class. Nobody paid us. We even had to rent the shoes!

8. The quest for

9. "For some reason, the commenters got all worked up about the dude with the two kids and completely ignored the lady who had to sell her summer home in the Hamptons."

10. The most outrageous parts of a story are the parts that don't even attract attention.

11. Do academic economists really not talk about economic factors when they talk about academic jobs in economics?

12. The fallacy of composition in brownstone Brooklyn.

13. No, the federal budget is not funded by taking money from poor people.

14. Leading recipient of U.S. foreign aid says that foreign aid is bad.

15. Jim Davis has some pretty controversial opinions.

16. Political scientist links to political scientist linking to political scientist claiming political science is irrelevant.

17. "Approximately one in 11.8 quadrillion." (I love that "approximately." The exact number is 11.8324589480035 quadrillion but they did us the favor of rounding.)

Blog in motion


In the next few days we'll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries and comments should be reappearing in the reconstituted blog.)

Why I blog?

There is sometimes a line of news, a thought or an article sufficiently aligned with the general topics on this blog that it is worth sharing.

I could have emailed it to a few friends who are interested. Or I could have gone through the relative hassle of opening up the blog administration interface, cleaning it up a little, adding some thoughts and making it pretty to post on the blog. And then it's poring through hundreds of spam messages, just to find two or three false positives in a thousand spams. Or finding the links, ideas and comments reproduced on another blog without attribution or credit. Or even finding the whole blog mirrored on another website.

It might seem all work and no fun, but what keeps me coming back is your comments. The discussions, the additional links, the information and insights you provide: this is what makes it all worthwhile.

Thanks, those of you who are commenters! And let us know what would make your life easier.

RSS mess


Apparently some of our new blog entries are appearing as old entries on the RSS feed, meaning that those of you who read the blog using RSS may be missing a lot of good stuff. We're working on this. But, in the meantime, I recommend you click on the blog itself to see what's been posted in the last few weeks. Enjoy.


We're having some problems with the blog, where we get comments but they don't show up on the blog. We're trying to figure out what's going on. In the meantime, feel free to post your comments; they'll show up soon, I hope.

Jeff and John were bugging me about this so I thought I should give a quick summary:

Statistical Modeling, Causal Inference, and Social Science: Most of my stuff goes here. We also have several other contributors who unfortunately don't blog very often. When they do blog, they usually have something good to say. I used to ask my students and postdocs to blog here when I was on vacation, but then I found out they were spending hours writing these blog entries, which seems to be contrary to the spirit of the thing.

I've recently started posting political things on Nate Silver's site. Nate's super-cool--recall that one of the reasons I became a statistician was from reading Bill James, and Nate is definitely of that breed--and also this allows me to reach a different audience than I'm reaching here.

David Frum's conservative site. This started with an article I wrote a few months ago on the lessons for conservatives from the 2008 election. I am occasionally posting here when I have something relevant for this audience, whether it be a bit of number crunching specifically relevant to something newsworthy, or a more general point about political polarization or whatever. I do not see my research as inherently liberal or conservative (or moderate, for that matter), and I (naively, I'm sure) feel that politics would improve if all sides have a clearer view of public opinion.

The Monkey Cage: A blog run by John Sides, Lee Sigelman, and others, focusing on political science research. I post there sometimes (generally crossposting on this blog) and also participate in discussions there.

Overcoming Bias: I post here sometimes because it's fun (albeit frustrating) to try to communicate with a bunch of people with whom I probably disagree on 95% of all issues. Robin Hanson is an interesting guy and it seems worth keeping these communication lines open, even though I feel I'm speaking a different language from most of the participants there.

Red State, Blue State, Rich State, Poor State: Boris set up this website for our book and we kept a blog going here, with frequent posts in the month leading up to the election and the month or so following. Since then I've been putting all my political posts here at Statistical Modeling (or at the other sites mentioned above), so there's no need to keep up with that one.

Rachel and I also set up a blog for my course last semester; that worked pretty well for communicating to students, and I think I'll do it again, but that's not relevant to discussions of public blogs here.

I think I can handle the first 5 blogs above. I can't really see reducing beyond that, given the goal of reaching diverse audiences. Of course what I really want is for everyone to read the Statistical Modeling blog--but I recognize that not everyone is fascinated by discussions of statistical graphics, R code, causal inference, and so forth. I'll tell you one thing: this is the only blog where you'll get my musings about literature!

A potential applicant writes:

I am considering applying to the postdoc positions at the Applied Statistics Center advertised at From the description in that page, it is not clear to me whether the selected postdocs are expected to use part of their time in their own projects. There is no information, either, on what the postdoc positions offer, in terms of salary and benefits. Lastly, there is no advertised application deadline. I would appreciate it a lot if you could give me information on these issues.

My reply:

1. Yes, postdocs are expected to spend time on their own projects, possibly in collaboration with others here.

2. Salary and benefits are competitive with salary and benefits for statistics postdocs elsewhere.

3. There is no application deadline. We consider the applicants as they come in.

4. I'll be on sabbatical next year so I'm not sure if I'll hire someone here--it depends on what grants are funded. But the Applied Statistics Center at Columbia includes other researchers, and when you apply for our postdoc, faculty in various departments might take a look at your application.

Blog upgrade from MT 3.3 to MT 4.2


We have upgraded the blog software from MT 3.3 to MT 4.2. There might be some hiccups, but we hope to have it operational as quickly as possible. Let us know if there are any problems!

Postdoctoral opportunity with the Earth Institute

The Earth Institute is looking for applicants for its postdoctoral fellows program.

My talks in Toronto

Just to let you know . . .

We're busy finishing our book during these next couple of weeks. So if I'm slow in responding to messages, just wait. You might hear from me in mid-March.



I am sometimes asked how to get an RSS feed from this blog. I've been told you can do it here.

Andrew Gelman has a blog

Comments are working again

The blog is fully working, so your comments will be processed again. And have a fun 4th-of-July weekend!

Blog problems

The comment file got corrupted, so we're trying to figure out how to fix it. In the meantime, the blog is not currently displaying comments. It appears to be storing the comments, however, so I hope we'll get it fixed within a few days.

Seth Roberts's work on self-experimentation is the subject of the Freakonomics column in this Sunday's New York Times. Regular readers of this blog will recall discussions of Seth's work here and here. Also a related study here.

The publicizing of Seth's work also is an interesting example of information transmission. Seth published a paper in Behavioral and Brain Sciences--a top journal, but not enough to get the work much publicity. I posted a link to it on our blog (circulation 200/day), it was picked up by Alex at Marginal Revolution (circulation 10,000/day) and from there was noticed by a columnist for the New York Times (circulation ~ 2 million/day). But I think the high quality of Seth's article in BBS, with all its experimental data and scientific context, was crucial in convincing the two levels of gatekeeper--Alex and Stephen--that the work could be taken seriously.
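To put rough numbers on that chain, here is a small back-of-the-envelope sketch in Python. It uses only the daily circulation figures quoted above (the New York Times one is approximate); the variable names are just for illustration.

```python
# Daily circulation figures quoted above (the New York Times figure is approximate).
stages = [
    ("our blog", 200),
    ("Marginal Revolution", 10_000),
    ("New York Times", 2_000_000),
]

# Amplification factor at each hand-off, and over the whole chain.
factors = [later[1] / earlier[1] for earlier, later in zip(stages, stages[1:])]
overall = stages[-1][1] / stages[0][1]

for (src, _), (dst, _), factor in zip(stages, stages[1:], factors):
    print(f"{src} -> {dst}: {factor:.0f}x")
print(f"overall: {overall:.0f}x")
```

Each hand-off multiplies the potential audience by a factor of 50 or more, which is why getting past each gatekeeper matters so much.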

Postdoctoral position available

Postdoctoral research opportunity: Columbia University, Departments of Epidemiology and Statistics

Supervisors: Ezra Susser (epidemiology) and Andrew Gelman (statistics)

We have an NIH-funded postdoctoral position (1 or 2 years) available for what is essentially statistical research as applied to some important problems in psychiatric epidemiology. One project on which we are working is the Jerusalem Perinatal Study of Schizophrenia, a birth cohort of about 90,000 (born 1966-1974) followed for schizophrenia in adulthood. Another project is a California birth cohort study of schizophrenia--this is a cohort of 20,000 collected in 1959-1966 for which we have ascertained/diagnosed 71 cases of schizophrenia spectrum disorders. The data set already exists and has produced several important findings. The statistical methods involve fitting and understanding multilevel models; see below. The position can also involve some teaching in the Statistics Department if desired.

Statistical Project 1: Tools for understanding and display of regressions and multilevel models

Modern statistical packages allow us to fit ever-more-complicated models, but there is a lag in the ability of applied researchers (and of statisticians!) to understand these models and check their fit to data. We are in the midst of developing several tools for summarizing regressions, generalized linear models, and multilevel models—these tools include graphical summaries of predictive comparisons, numerical summaries of average predictive comparisons, measures of explained variance (R-squared) and partial pooling, and analysis of variance. To move this work to the next stage we need to program the methods for general use (writing them as packages in the popular open-source statistical language R) and further develop them in the context of ongoing applied research projects.

Statistical Project 2: Deep interactions in multilevel regression

In regressions and generalized linear models, factors with large effects commonly have large interactions. But in a multilevel context in which factors can have many levels, this can imply many, many potential interaction coefficients. How can these be estimated in a stable manner? We are exploring a doubly-hierarchical Bayes approach, in which the first level of the hierarchy is the usual units-within-groups (for example, patients within hospitals) in which coefficients are partially pooled and the second level is a hierarchical model of the variance components (so that the different amounts of partial pooling are themselves modeled). The goal is to be able to include a large number of predictors and interactions without the worry that lack-of-statistical-significance will make the estimates too noisy to be useful. We plan to develop these methods in the context of ongoing applied research projects.
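To give a feel for what "partially pooled" means at the first level of that hierarchy, here is a minimal sketch in Python. It is not the doubly-hierarchical method described above, just the textbook normal-model case with the variance components treated as known. The function name, the hospital numbers, and the values of sigma_y, sigma_alpha, and mu are all made up for illustration.

```python
def partial_pool(group_means, group_sizes, sigma_y, sigma_alpha, mu):
    """Precision-weighted compromise between each group's mean and the overall mean.

    sigma_y:     within-group standard deviation (assumed known here)
    sigma_alpha: between-group standard deviation (assumed known here)
    mu:          overall mean across groups (assumed known here)
    """
    pooled = []
    for ybar, n in zip(group_means, group_sizes):
        w_data = n / sigma_y**2        # precision of the group's own mean
        w_group = 1 / sigma_alpha**2   # precision of the group-level distribution
        pooled.append((w_data * ybar + w_group * mu) / (w_data + w_group))
    return pooled


# Hypothetical hospitals: raw mean outcome and number of patients in each.
group_means = [10.0, 20.0, 30.0]
group_sizes = [1, 4, 100]

estimates = partial_pool(group_means, group_sizes,
                         sigma_y=2.0, sigma_alpha=1.0, mu=20.0)
print(estimates)
```

The hospital with one patient is pulled most of the way toward the overall mean, while the hospital with 100 patients barely moves. In the approach described above, the variance components that set the amount of pooling would themselves be modeled rather than fixed in advance.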

If you are interested . . .

Please send a letter to Prof. Andrew Gelman (Dept of Statistics, Columbia University, New York, N.Y. 10027), along with a c.v., copies of any relevant papers of yours, and three letters of recommendation.

Blog, take two

I'm sure most of you noticed that our blog disappeared for a while last week.

Some f&*^ing kid in Michigan of all places hacked into my account through the Wiki. I think the Wiki security problems are now fixed, and I have also learned the hard way not to rely on anyone else to back things up!

Anyway, I just wanted to let everyone know that the blog is now functional again, and getting close to being back to its old glory. Most if not all of the entries are back up. I'll work on comments tomorrow. The uploaded files and links were lost. I'll replace the ones I have access to, but you might want to check your own entries and update links and pictures (same goes for the Wikis). All authors have the same user names as before, and passwords have been set back to the original ones I made up (let me know if you don't know your password).

I apologize for the interruption!


