March 31, 2005
More thoughts on self-experimentation
Susan writes:
I've started reading the piece you sent me on Seth. Very interesting stuff. I generally tend to think that one can get useful evidence from a wide variety of sources -- as long as one keeps in mind the nature of the limitations (and every data source has some kind of limitation!). Even anecdotes can generate important hypotheses. (Piaget's observations of his own babies are great examples of real insights obtained from close attention paid to a small number of children over time. Not that I agree with everything he says.) I understand the concerns about single-subject, non-blind, and/or uncontrolled studies, and wouldn't want to initiate a large-scale intervention on the basis of these data. But from the little bit I've read so far, it does sound like Seth's method might elicit really useful demonstrations, as well as generating hypotheses that are testable with more standard methods. But I also think it matters what type of evidence one is talking about -- e.g., one can fairly directly assess one's own mood or weight or sleep patterns, but one cannot introspect about speed of processing or effects of one's childhood on present behavior, or other such things.
My thoughts: that's an interesting distinction between aspects of oneself that can be measured directly, as compared to data that are more difficult to measure.
I remember that Dave Krantz once told me that many of the best ideas in the psychology of decision making had come from researchers' introspection. That sounds plausible to me. Certainly, speculative axioms such as "minimax risk" and similar ideas discussed in the Luce and Raiffa book always seemed to me to be justified by introspection or by demonstrations of the Socratic-dialogue type (such as in Section 5 of this paper, where we demonstrate why you can't use a curving utility function to explain so-called "risk averse" attitudes).
One of the discussants of Seth's paper in Behavioral and Brain Sciences compared introspection to self-experimentation. Just as self-experimentation is a cheaper, more flexible, but limited version of controlled experiments on others, introspection is a cheaper etc. version of self-experimentation.
Back to Susan's comments: she appears to agree with Seth that it's not a good idea to jump from the self-experiments to the big study. So there should be some intermediate stage . . . pilot-testing with volunteers? How much of this needs to be done before he's ready for the big study? More generally, this seems to be an important experimental design question not addressed by the usual statistical theory of design of experiments.
Posted by Andrew at 12:12 AM | Comments (3)
March 30, 2005
Surfing the web, or From the 10th floor to the 7th floor in four steps
So I clicked on the link on our webpage to Decision Science News, flipped through there and then on to his links . . . hmmm, a link to the psychologist Jon Baron, who studies thinking and decision making. . .
Baron's blog is pretty cool too. Sort of halfway between a science blog (like ours and Decision Science News) and an opinion blog (like the 3 million other blogs out there). It's Baron's opinions, but backed by his perspectives as a leading decision scientist. (In this post, he briefly discusses treatments for obesity. I should forward him the reference to Seth's article on self-experimentation, or maybe the link about the psychology professor who told us to take drugs.)
Well, Jon has his own links (including Decision Science News) . . . I clicked through, and the only other one that was interesting was the blog of Deb Frisch, another psychology professor and decision scientist. Her blog is definitely more of the "personal commentary on issues and current events" style, but the issues and current events she discusses are of interest to me too, so I enjoyed reading it. She has a confrontational style, which shows up in her comment to this entry. It would probably be fun to be a student in one of her classes.
Frisch's blog had interesting stuff. Right near the top there was a link to an implementation of Eliza, which I of course had heard about but had never tried out. That same entry has a link to a blog called Econlog, by Arnold Kling and Bryan Caplan. Frisch links to Econlog only to mock them, but actually it had some interesting stuff. (Although I'm not inclined to agree with them when they write, "Cato is right to want to topple Social Security. If you don't have the common sense to save for your own retirement, you shouldn't come crying to the taxpayers when your hair turns gray." Seems a little harsh, especially given the many cognitive illusions that decision scientists have discovered over the past 40 years!)
But I don't have to agree with Kling and Caplan to read their blog. Actually, their most recent posting referred to a study on effects of pre-kindergarten education--a topic of great interest to me right now. The funny thing is, two of the three authors of the study are at the Columbia School of Social Work, and one of the authors is Jane Waldfogel, who I know--she works in my building--and is in fact a co-organizer of this seminar series.
I'll have to read the paper more carefully before commenting on it, but, hey, it only took me 4 links to find out what's being done on the 7th floor of my building! As well as learning some other stuff on the way.
Posted by Andrew at 12:48 AM | Comments (2)
March 29, 2005
Theoretical work on legislative districting
Stephen Coate (Dept. of Economics, Cornell) and Brian Knight (Dept. of Economics, Brown) wrote a paper, "Socially Optimal Redistricting," with a theoretical derivation of seats-votes curves. The paper cites some of my work with Gary King on empirically estimating seats-votes curves. Coate and Knight sent the paper to Gary, who forwarded it to me. It's an interesting paper but has a slight misrepresentation of what Gary and I did in studying seats-votes curves and redistricting.
Coate and Knight's paper centers on a derivation of optimal curves under a theoretical model of voters. I don't have much to say on their model and theory (see below for a couple of comments); the main reason I'm writing this is to update their characterization of empirical work on seats-votes curves (on pages 3-4 of their paper). They cite some of the work of Gary and myself but incorrectly state that we work with the so-called "bilogit" curve, which is a model that Gary created in the 1980s and then abandoned for our more flexible approach. We don't actually assume any functional form for the curve--it is not restricted to be linear, or cubic, or bilogit, or any parametric family of curves. We fit the model using JudgeIt.
To put it another way, we model district-level votes. We don't model the seats-votes curve directly. The seats-votes curve is a consequence of the votes, not something that is specified on its own. As a result, the seats-votes curve is not restricted to any particular form. (See our 1994 AJPS paper for details of our model and fitting procedure, and our 1990 JASA paper for an earlier version of the model.)
What are seats-votes curves?
On page 4 of their paper, Coate and Knight write that "the underlying foundations of the [empirical] analysis are opaque. While the seat-vote curve is an undeniably elegant construct, the relationship between seat-vote curves and districting is not clear." I think our 1994 AJPS paper should clarify this for them. Basically, the fundamental concept is the vector of vote proportions for each of the parties in each district. This vector has a probability distribution (representing what could happen for a future election, or what could have happened for a historical election), and the seats-votes curve (as we define it) is the function of expected seats, given votes. This relates to districting because districting affects the vector of vote proportions (mostly by moving votes from one district to another, also by affecting the decisions of candidates to run, campaign strategies, etc.).
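To make the definition concrete, here is a rough sketch in Python of a seats-votes curve computed as "expected seats, given votes." This is not the JudgeIt model or the model in our papers. The district vote shares, the uniform shift, the noise level, and the function name expected_seats are all assumptions made up for illustration.

```python
import random

# Hypothetical vote shares for party A in ten districts (invented numbers).
base_shares = [0.25, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75]

def expected_seats(shares, target_avg, n_sims=2000, noise_sd=0.05, seed=0):
    """Monte Carlo estimate of party A's expected seat share when its
    average district vote share equals target_avg.

    Assumptions (illustrative only): every district moves by the same
    uniform shift, plus independent normal noise that is recentered so
    the average vote share stays exactly at target_avg.
    """
    rng = random.Random(seed)
    n = len(shares)
    shift = target_avg - sum(shares) / n
    total = 0.0
    for _ in range(n_sims):
        noise = [rng.gauss(0, noise_sd) for _ in range(n)]
        mean_noise = sum(noise) / n
        votes = [v + shift + e - mean_noise for v, e in zip(shares, noise)]
        total += sum(1 for v in votes if v > 0.5) / n
    return total / n_sims

# The curve is whatever comes out of the district-level simulation;
# no functional form (linear, cubic, bilogit) is imposed on it.
curve = [(v, expected_seats(base_shares, v)) for v in (0.3, 0.4, 0.5, 0.6, 0.7)]
for avg_vote, seats in curve:
    print(f"average vote {avg_vote:.2f} -> expected seat share {seats:.2f}")
```

A redistricting plan would enter this sketch by changing base_shares, which is the sense in which districting affects the curve through the vector of district votes.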
A couple of minor comments
I question some of the assumptions of the model (in particular, the claim that "every citizen votes sincerely for the representative whose ideology is closest to its own"--this would seem to be contradicted by the fact of a large and consistent incumbency advantage) but I certainly respect the idea that theoretical work must begin with a highly stylized model.
I also think it's funny that they refer to lower-case "democrats" and "republicans." A political scientist would never do that! It would be like a minister referring to "jesus" and "god."
In summary
I'm not trying to knock the Coate and Knight paper. As they point out, empirical and theoretical projects are judged by different standards. An empirical model is supposed to be realistic and fit the data; a theoretical model is supposed to be conceptually compelling and have as few "exogenous" factors as possible. Two different ways of understanding social phenomena. I just wanted to clarify that the existing empirical methods for estimating seats-votes curves are pretty sophisticated and go far beyond fitting two- or three-parameter models. So, developing theoretical models is great, but they won't have the descriptive power of our empirical models. (Or, to look at it the other way, the empirical models will never have the simplicity of the theoretical models they're developing.)
Posted by Andrew at 12:49 AM | Comments (0)
March 28, 2005
Potential outcomes, causal inference, and virtual history
A few years ago I picked up the book Virtual History: Alternatives and Counterfactuals, edited by Niall Ferguson. It's a book of essays by historians on possible alternative courses of history (what if Charles I had avoided the English civil war, what if there had been no American Revolution, what if Irish home rule had been established in 1912, ...).
There have been and continue to be other books of this sort (for example, What If: Eminent Historians Imagine What Might Have Been, edited by Robert Cowley), but what makes the Ferguson book different is that he (and most of the other authors in his book) are fairly rigorous in only considering possible actions that the relevant historical personalities were actually considering. In the words of Ferguson's introduction: "We shall consider as plausible or probable only those alternatives which we can show on the basis of contemporary evidence that contemporaries actually considered."
I like this idea because it is a potentially rigorous extension of the now-standard "Rubin model" of causal inference.
As Ferguson puts it,
Firstly, it is a logical necessity when asking questions about causality to pose 'but for' questions, and to try to imagine what would have happened if our supposed cause had been absent.
And the extension to historical reasoning is not trivial, because it requires examination of actual historical records in order to assess which alternatives are historically reasonable.
Here's Ferguson making the case that potential outcomes (in statistical terminology, the "Rubin causal model") are particularly relevant to the study of historical causation:
What we call the past was once the future; and the people of the past no more knew what their future would be than we can know our own. All they could do was consider the likely future, the plausible outcome. . . . Now, if all history is the history of (recorded) thought, surely we must attach equal significance to all the outcomes thought about. The historian who allows his knowledge as to which of these outcomes subsequently happened to obliterate the other outcomes people regarded as plausible cannot hope to recapture the past 'as it actually was'. . .
Thus, to the best of their abilities, Ferguson et al. are not just telling stories; they are going through the documents and considering the possible other courses of action that had been considered during the historical events being considered. In addition to being cool, this is a rediscovery and extension of statistical ideas of causal inference to a new field of inquiry.
I don't know how much this aspect of Virtual History has been followed up since the book's publication. My impression is that these are treated as purely speculative games (as in the What If? book) without a sense of the constraints of considering options that were considered at the time.
P.S. I looked up Niall Ferguson on the web and he seems to simultaneously be a professor at NYU and Harvard, a senior fellow at the Hoover Institution (Stanford), and a visiting professor at Oxford. Perhaps he is living some virtual history himself with these 4 jobs!
Posted by Andrew at 12:38 AM | Comments (0)
March 25, 2005
Postdoctoral position available
Postdoctoral research opportunity: Columbia University, Departments of Epidemiology and Statistics
Supervisors: Ezra Susser (epidemiology) and Andrew Gelman (statistics)
We have an NIH-funded postdoctoral position (1 or 2 years) available for what is essentially statistical research as applied to some important problems in psychiatric epidemiology. One project on which we are working is the Jerusalem Perinatal Study of Schizophrenia, a birth cohort of about 90,000 (born 1966-1974) followed for schizophrenia in adulthood. Another project is a California birth cohort study of schizophrenia--this is a cohort of 20,000 collected in 1959-1966 for which we have ascertained/diagnosed 71 cases of schizophrenia spectrum disorders. The data set already exists and has produced several important findings. The statistical methods involve fitting and understanding multilevel models; see below. The position can also involve some teaching in the Statistics Department if desired.
Statistical Project 1: Tools for understanding and display of regressions and multilevel models
Modern statistical packages allow us to fit ever-more-complicated models, but there is a lag in the ability of applied researchers (and of statisticians!) to understand these models and check their fit to data. We are in the midst of developing several tools for summarizing regressions, generalized linear models, and multilevel models—these tools include graphical summaries of predictive comparisons, numerical summaries of average predictive comparisons, measures of explained variance (R-squared) and partial pooling, and analysis of variance. To move this work to the next stage we need to program the methods for general use (writing them as packages in the popular open-source statistical language R) and further develop them in the context of ongoing applied research projects.
Statistical Project 2: Deep interactions in multilevel regression
In regressions and generalized linear models, factors with large effects commonly have large interactions. But in a multilevel context in which factors can have many levels, this can imply many many potential interaction coefficients. How can these be estimated in a stable manner? We are exploring a doubly-hierarchical Bayes approach, in which the first level of the hierarchy is the usual units-within-groups (for example, patients within hospitals) in which coefficients are partially pooled and the second level is a hierarchical model of the variance components (so that the different amounts of partial pooling are themselves modeled). The goal is to be able to include a large number of predictors and interactions without the worry that lack-of-statistical-significance will make the estimates too noisy to be useful. We plan to develop these methods in the context of ongoing applied research projects.
If you are interested . . .
Please send a letter to Prof. Andrew Gelman (Dept of Statistics, Columbia University, New York, N.Y. 10027, gelman@stat.columbia.edu), along with c.v., copies of any relevant papers of yours, and three letters of recommendation.
Posted by Andrew at 12:03 AM | Comments (1)
March 24, 2005
Research, Google-style
In my correspondence with Boris about Barone's column about rich Democrats, I expressed surprise at Barone's statement that "Patriotism is equated with Hitlerism" (among leftists). Boris referred me to this article by Victor Davis Hanson which indeed has examples of leftists (and even moderate Democrats like John Glenn) comparing Bush to the Nazis.
But aren't the Democrats just following the lead of the Clinton-haters in the 1990s? Hanson says no:
The flood of the Hitler similes is also a sign of the extremism of the times. If there was an era when the extreme Right was more likely to slander a liberal as a communist than a leftist was to smear a conservative as a fascist, those days are long past. True, Bill Clinton brought the deductive haters out of the woodwork, but for all their cruel caricature, few compared him to a mass-murdering Mao or Stalin for his embrace of tax hikes and more government. “Slick Willie” was not quite “Adolf Hitler” or “Joseph Stalin.”
Hmmm . . . this got me curious, so I followed Hanson's tip and did some Google searches:
bush hitler: 1.5 million
clinton hitler: 0.7 million
What about some other comparisons?
bush god: 8.6 million
clinton god: 3.4 million
So Bush is both more loved and hated than Clinton, perhaps. But then again, there's been a huge growth in the internet in the past few years, so maybe more Bush than Clinton for purely topical reasons?
bush: 83 million
clinton: 25 million
Hmm, let's try something completely unrelated to politics:
bush giraffe: 180,000
clinton giraffe: 23,000
OK, maybe not a good comparison, since giraffes live in the bush. Let's try something that's associated with Clinton but not with Bush:
bush mcdonalds: 440,000
clinton mcdonalds: 200,000
At this point, I'm getting the clear impression that Bush is getting more hits than Clinton on just about everything! So no evidence here that he's being Hitlerized more than Clinton was. It looks like the big number for "bush hitler" is more of an artifact of the spread of the web. [Place disclaimers here about the use of Google as a very crude research tool!]
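The comparison above amounts to dividing each Bush/Clinton ratio by the baseline ratio for the names alone. Here is a small Python sketch of that arithmetic, using only the hit counts reported in this post; the dictionary layout and the function name bush_clinton_ratio are just my choices for illustration, and all the caveats about Google counts as data still apply.

```python
# Hit counts as reported above: (bush, clinton) for each search term.
hits = {
    "(name alone)": (83_000_000, 25_000_000),
    "hitler": (1_500_000, 700_000),
    "god": (8_600_000, 3_400_000),
    "giraffe": (180_000, 23_000),
    "mcdonalds": (440_000, 200_000),
}

def bush_clinton_ratio(term):
    """Ratio of Bush hits to Clinton hits for one search term."""
    bush, clinton = hits[term]
    return bush / clinton

# The names-alone ratio is the baseline for the size of the web presence.
baseline = bush_clinton_ratio("(name alone)")

for term in hits:
    ratio = bush_clinton_ratio(term)
    print(f"{term:>12}: ratio {ratio:4.1f}, relative to baseline {ratio / baseline:4.2f}")
```

A relative value below 1 means the term is attached to Bush less often than his overall web presence would suggest. The "hitler" ratio (about 2.1) is below the names-alone ratio (about 3.3).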
This certainly doesn't invalidate, or even argue against, Hanson's main points. It just suggests that we should be similarly concerned about haters on the other side.
OK, I guess that's enough on this topic . . . maybe a good example for statistics teaching, though? Googlefighting as data analysis? Perhaps Cynthia Dwork, David Madigan, or some other student of web rankings can come up with more sophisticated analyses.
P.S. Update with discussion here.
P.S. Much much more on this general topic here, and here.
Posted by Andrew at 8:33 AM | Comments (10)
The "trustfunder left" and the personification of states and counties
Boris forwarded to me this article by Michael Barone on "the trustfunder left." Some excerpts:
Who are the trustfunders? People with enough money not to have to work for a living, or not to have to work very hard. . . . These people tend to be very liberal politically. Aware that they have done nothing to earn their money, they feel a certain sense of guilt. . . . they are citizens of the world with contempt for those who feel chills up their spines when they hear "The Star Spangled Banner." . . . Where can you find trustfunders? Not scattered randomly around the country, but heavily concentrated in certain areas. . . . Trustfunders stand out even more vividly when you look at the political map of the Rocky Mountain states. In Idaho and Wyoming, each state's wealthiest county was also the only county to vote for John Kerry . . . Massachusetts Catholics gave their fellow Massachusetts Catholic Kerry only 51 percent of their votes, but he won 77 percent in Boston, 85 percent in Cambridge, and 69 percent and 73 percent in trustfunder-heavy Hampshire and Berkshire Counties in the western mountains. . . .
Rich states and counties mostly support the Democrats, but rich voters mostly support the Republicans
This is vivid writing but, I think, incorrect electoral analysis. Barone is making the common error of "personifying" states and counties. Since 1996, and especially since 2000, rich states and rich counties have tended to support the Democrats--but rich voters have continued to support the Republicans.
For example, as David Park found looking through the exit polls, the 2004 election showed a consistent correlation between income and support for the Republicans, with Bush getting the support of 36% of voters with incomes below $15,000, 41% of those with incomes between $15,000 and $30,000, . . . and 62% of those with incomes above $200,000.
Given these statistics, I strongly doubt that trustfunders--in Barone's words, "people with enough money not to have to work for a living, or not to have to work very hard"--are mostly liberal, as he claims. Of course it's possible, but the data strongly support the statements that (a) richer people tend to support the Republicans, but (b) voters in richer states (and, to some extent, counties) tend to support Democrats. There definitely are differences between richer and poorer states--but the evidence is that, within any state, the richer voters tend to go for the Republicans. See here for more.
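Statements (a) and (b) can both hold at once, which is easy to check with a toy example. The Python sketch below uses two made-up states; every number in it is hypothetical and chosen only to show the pattern, not taken from the exit polls or from any election return.

```python
# Hypothetical numbers, invented for illustration (not exit-poll data).
# For each state: the share of voters who are rich, and Republican
# support among its rich and poor voters.
states = {
    "richer state": {"rich_share": 0.40, "rep_rich": 0.50, "rep_poor": 0.35},
    "poorer state": {"rich_share": 0.20, "rep_rich": 0.70, "rep_poor": 0.55},
}

def state_rep_share(s):
    """Overall Republican share in one state."""
    return s["rich_share"] * s["rep_rich"] + (1 - s["rich_share"]) * s["rep_poor"]

def pooled_rep_share(group):
    """Republican share among all rich (or all poor) voters,
    assuming the two states have the same number of voters."""
    votes = people = 0.0
    for s in states.values():
        size = s["rich_share"] if group == "rich" else 1 - s["rich_share"]
        people += size
        votes += size * s["rep_" + group]
    return votes / people

for name, s in states.items():
    print(f"{name}: Republican share {state_rep_share(s):.2f}")
for group in ("rich", "poor"):
    print(f"{group} voters overall: Republican share {pooled_rep_share(group):.2f}")
```

In this toy setup the richer state votes Democratic and the poorer state votes Republican, yet rich voters are more Republican than poor voters within each state and in the two states combined. Reading individual preferences off the state totals would get the direction wrong.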
Confusion of the columnists
My first thought on seeing Barone's article was disappointment that the author of the Almanac of American Politics would write something so misinformed. However, other columnists have made the same mistake. For example, here's Nicholas Kristof in the New York Times.
The interesting thing is that the conceptual confusion between patterns among states and among individuals (sometimes called the "ecological fallacy" or "Simpson's paradox" in statistics) led Barone to confusion even at the state and county level. For example, he writes,
Where Democrats had a good year in 2004 they owed much to trustfunders. In Colorado, they captured a Senate and a House seat and both houses of the legislature. Their political base in that state is increasingly not the oppressed proletariat of Denver, but the trustfunder-heavy counties that contain Aspen (68 percent for Kerry), Telluride (72 percent) and Boulder (66 percent). . . .
I went and looked it up. Actually, Kerry got 70% of the vote in Denver.
What's going on?
How can Barone, an experienced observer who knows a lot more about voting patterns than I do, make this mistake--not recognizing that rich people are voting for Republicans and not even noticing that Kerry got 70% of the vote in Denver? I think the fundamental problem, both of conservatives like Barone and liberals on the other side, is not coming to grips with the basic fact that both parties have close to 50% support.
Perhaps the Democrats are the party of trustfunders, welfare cheats, drug addicts, communists, and whatever other categories of people you don't like. Perhaps the Republicans are the party of rich CEO's, bigots, fascists, and so forth. No matter how you slice it, both sides have to add up to 50%, so you either have to throw in a lot of "normal" voters on both sides or else you have to marginalize large chunks of the population.
For example, Barone notes that Kerry won only 51% of the Catholic votes in Massachusetts. That looks pretty bad--he's so unpopular that he barely got the support of voters of his own state and religion. But, hey, he got 48% of the national vote, so somebody was voting for him. And considering that Bush got 62% of the voters with incomes over $200,000, Kerry's voters can't all be trustfunders!
Barone might be right, however, when he cites the trustfunders as a new source of money for the Democrats (as they of course also are for the Republicans). And, as a political matter, it might very well be a bad thing if both political parties are being funded by people from the top of the income distribution. This would be an interesting thing to look at. There's a wide spectrum of political participation, ranging from voting, to campaign contributions, to activism (see Verba, Schlozman, and Brady), and the demographics of these contributors and activists are potentially important. But you're not going to find it by looking at state-level or county-level vote returns.
Reasoning by analogy?
I clicked through to the link on Barone's page to his book, "Hard America, Soft America." This looks much more reasonable. I wonder if he caught on to something real with "Hard America, Soft America" and then too quickly generalized it to imply, "anyone I agree with is part of Hard America, which I like" and "anyone I disagree with is part of Soft America, which I dislike."
It wouldn't be the first time that a smart person was led by ideology to overgeneralize.
P.S. See also here, here, here, and here for various takes on Barone's article.
Posted by Andrew at 12:09 AM | Comments (5) | TrackBack
March 23, 2005
Question about causal inference
Judea Pearl (Dept of Computer Science, UCLA) spoke here Tuesday on "Inference with cause and effect." I think I understood the method he was describing but it left me with some questions about what the method's hidden assumptions are. Perhaps someone familiar with this approach can help me out here.
I'll work with a specific example from one of my current research projects.
A treatment is being considered of giving zinc supplements to some HIV-positive children in South Africa. The treatment, call it Z, would be randomly assigned: Z=1 for half the kids and Z=0 for the others. The outcome of interest is CD4 percentage after a year of treatment (or control); call this Y. High values of Y are good, low values are bad. There's also an intermediate outcome, the amount of diarrhea during the year; call this D. Zinc supplements are known to reduce diarrhea, and diarrhea can have bad consequences for CD4 (something to do with the immune system).
The causal path diagram looks like this:
Z has arrows going to D and to Y, and D has an arrow going to Y. Thus, Z can affect Y directly and also through D.
Now suppose I want to estimate the effect of D--that is, the effect of some treatment that would reduce diarrhea directly (and not by giving zinc). From Pearl's talk, I gather that this would be the operation of "fixing" D to a specified value, and he would do this using a "mutilated model" [Pearl's term] that would remove all the arrows to D (in this case, that would just be the arrow from Z to D).
OK, so suppose I actually supplied some data on (Z,D,Y) for 200 kids, and also suppose that, in these data, (D,Y) had a joint normal distribution given Z. (Here, Z is binary, so I'm saying that (D,Y) have one joint normal distribution for the treated kids, and another joint normal distribution for the untreated kids.)
So now, can we apply Pearl's method, and actually get an estimate for the direct effect of D on Y? His talk led me to believe that we could. But what would that estimate mean? In real life, we can't really estimate this direct effect without having direct manipulation of D. Or, to put it another way, we can only estimate this effect if we make some additional assumptions. What assumptions is Pearl's model making? I'm willing to swallow distributional assumptions, and I'm happy with the arrows in the path diagram, but there's gotta be something else going on here.
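To see why some extra assumption seems to be needed, here is a small simulation sketch in Python. It is not Pearl's procedure, and I am not claiming this is the assumption his method makes. It only illustrates one kind of thing that could go wrong: an unrecorded factor U (say, underlying frailty) that affects both D and Y. The linear model, all the coefficients, and the function names are hypothetical, and I use many more than 200 kids to keep simulation noise down.

```python
import random

def simulate(n, confounding, seed=1):
    """Generate (Z, D, Y) rows from a hypothetical linear model.

    Z is randomized. The true effect of D on Y is -2.0 (made up).
    An unobserved U always affects D, and affects Y with strength
    `confounding`. U is never recorded in the returned data.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.randint(0, 1)
        u = rng.gauss(0, 1)
        d = 5.0 - 1.5 * z + u + rng.gauss(0, 1)
        y = 20.0 + 1.0 * z - 2.0 * d + confounding * u + rng.gauss(0, 1)
        rows.append((z, d, y))
    return rows

def within_z_slope(rows):
    """Least-squares slope of Y on D, computed separately for Z=0 and
    Z=1 and then averaged over the two groups."""
    slopes = []
    for z_val in (0, 1):
        ds = [d for z, d, y in rows if z == z_val]
        ys = [y for z, d, y in rows if z == z_val]
        md = sum(ds) / len(ds)
        my = sum(ys) / len(ys)
        cov = sum((d - md) * (y - my) for d, y in zip(ds, ys))
        var = sum((d - md) ** 2 for d in ds)
        slopes.append(cov / var)
    return sum(slopes) / len(slopes)

for c in (0.0, -3.0):
    rows = simulate(20000, confounding=c)
    print(f"confounding strength {c:+.1f}: estimated slope {within_z_slope(rows):.2f} "
          f"(true effect of D is -2.00)")
```

Both simulated data sets are jointly normal in (D,Y) given Z and are consistent with the same path diagram on the observed variables, but only the one with no U-to-Y link recovers the true effect of D. Randomizing Z does nothing to protect the D-to-Y comparison.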
Posted by Andrew at 12:16 AM | Comments (5)
March 22, 2005
Decision Science News
Dan Goldstein, who runs the Center for Decision Sciences seminar at Columbia (along with Dave Krantz and Elke Weber) has a blog called Decision Science News.
I've learned a lot from some of the presentations at the decision science seminar (see here) and even spoke at it myself once (on this topic), and am generally interested in the topic, so I was curious to see the blog. It presents short descriptions of interesting recent work in decision science, especially in marketing. Reading this blog is a good way to get a sense of what the decision researchers are thinking about nowadays.
As a former student of Gigerenzer, Dan is perhaps sympathetic to my views on institutional decision analysis (probabilities represent an agreed-upon hypothesized model, used for convenience, rather than subjective states of knowledge). (See here for more.)
Posted by Andrew at 5:59 AM | Comments (0) | TrackBack
My favorite examples all in one place
I got a call from Joe Ax, a reporter at the (Westchester) Journal News because there had recently been two different tied elections in the county. (See here for some links.) He wanted my estimate of the probability of a tied election. Well, there were actually only about 1000 votes in each election, so the probability of a tie wasn't so low. . . . (For an expected-to-be-close election with n voters, I estimate Pr(tie) roughly as 5/n. This is based on, first, the assumption that there is a 1/2 probability of an even number of votes for the 2 candidates (otherwise you can't have a tie), and then on the assumption that the outcome is roughly equally likely to be anywhere between 45% and 55% for either candidate, a range of about n/10 possible vote counts, so that an exact 50/50 split has probability about 10/n. Thus 1/2 x 10/n = 5/n.)
I also mentioned that some people would calculate the probability based on coin flipping, but I don't like that because it assumes that everyone's probability is 1/2 and that voters are independent, neither of which is true (and also the coin-flipping model doesn't come close to fitting actual election data).
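For anyone who wants to see the two calculations side by side, here is a short Python sketch. The function names are mine, and the 45%-55% range is the assumption stated above, not something estimated from data. The same 1/2 factor for an even number of votes is applied to both models so that the comparison is like for like.

```python
from math import comb

def tie_given_even_uniform(n, low=0.45, high=0.55):
    """Pr(exact tie | even n) if the vote share is equally likely to
    fall anywhere between low and high: about 1 / ((high - low) * n)."""
    return 1.0 / ((high - low) * n)

def tie_given_even_coin_flip(n):
    """Pr(exact tie | even n) if each voter independently flips a fair
    coin: the binomial probability of exactly n/2 heads."""
    if n % 2 != 0:
        raise ValueError("n must be even for a tie to be possible")
    return comb(n, n // 2) / 2 ** n

def tie_prob(n, model):
    """Multiply by the 1/2 chance that the number of votes is even."""
    return 0.5 * model(n)

n = 1000
print(f"uniform 45-55% model: {tie_prob(n, tie_given_even_uniform):.4f}")
print(f"coin-flipping model:  {tie_prob(n, tie_given_even_coin_flip):.4f}")
```

The first number reproduces the 5/n rule. The second is larger because the coin-flipping model forces the vote share to sit very close to 50%, which is exactly the assumption I don't believe.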
Coin flips and babies
An hour or so later Joe called me back and said that he'd mentioned this to some people, and someone told him that he'd heard that actually heads are slightly more common than tails. What did I think of this? I replied that heads and tails are equally likely when a coin is flipped (although not necessarily when spun), but maybe his colleague was remembering the fact that births are more likely to be boys than girls.
P.S. Here's the Journal News article (featuring my probability calculations).
Posted by Andrew at 12:08 AM | Comments (5) | TrackBack
March 21, 2005
Bayes in China
Xiao-Li confirmed that they didn't like Bayes in China (or at least in Shanghai) when he was a student. He writes:
Yes, I do [remember], and it's no laughing matter then! What happened was that the notion of "prior" contradicted one of Mao's quotation "truth comes out of empirical/practical evidence" (my translation is not perfect, but you can get the essence) -- and anything contradicts what Mao said was banned!
Do any other Chinese statisticians have stories like this?
Posted by Andrew at 12:28 AM | Comments (1)
March 18, 2005
Lowess is great
One of the discussions in Behavioral and Brain Sciences of Seth Roberts's article on self-experimentation was by Martin Voracek and Maryanne Fisher. They had a bunch of negative things to say about self-experimentation, but as a statistician, I was struck by their concern about "the overuse of the loess procedure." I think lowess (or loess) is just wonderful, and I don't know that I've ever seen it overused.
Curious, I looked up "Martin Voracek" on the web and found an article about body measurements from the British Medical Journal. The title of the article promised "trend analysis" and I was wondering what statistical methods they used--something more sophisticated than lowess, perhaps?
They did have one figure, and here it is:

Voracek and Fisher, the critics of lowess, fit straight lines to clearly nonlinear data! It's most obvious in their leftmost graph. Voracek and Fisher get full credit for showing scatterplots, but hey . . . they should try lowess next time! What's really funny in the graph are the little dotted lines indicating inferential uncertainty in the regression lines--all under the assumption of linearity, of course. (You can see enlarged versions of their graphs at this link.)
As usual, my own house has some glass-based construction and so it's probably not so wise of me to throw stones, but really! Not knowing about lowess is one thing, but knowing about it, then fitting a straight line to nonlinear data, then criticizing someone else for doing it right--that's a bit much.
Not just lowess
Just to be clear, when I say "lowess is great," I really mean "smoothing regression is great"--lowess, also splines, generalized additive models, and all the other things that Cleveland, Hastie, Tibshirani, etc., have developed. (One of the current challenges in Bayesian data analysis is to integrate such methods. Maybe David Dunson will figure it all out.)
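To make the contrast concrete, here is a minimal sketch of a lowess-style smoother next to a straight-line fit. It is not the code behind any figure mentioned above: the function names (lowess_sketch, straight_line_fit, rmse) and the curved data are made up for illustration, and the smoother is a simplified version (tricube-weighted local linear fits, without the robustness iterations of Cleveland's full procedure).

```python
import math

def lowess_sketch(x, y, frac=0.3):
    """Tricube-weighted local linear fit at each x (no robustness iterations)."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # number of neighbors used in each local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def straight_line_fit(x, y):
    """Ordinary least-squares line, evaluated at each x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [my + slope * (xi - mx) for xi in x]

def rmse(y, yhat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

# Made-up, clearly nonlinear data (a parabola).
x = [i / 10 for i in range(-20, 21)]
y = [xi ** 2 for xi in x]

smooth = lowess_sketch(x, y)
line = straight_line_fit(x, y)
smooth_rmse = rmse(y, smooth)
line_rmse = rmse(y, line)
print(f"straight line RMSE: {line_rmse:.3f}; local fit RMSE: {smooth_rmse:.3f}")
```

On curved data like this, the straight line misses the shape entirely while the local fit tracks it. In practice one would use an established implementation rather than this sketch.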
Posted by Andrew at 12:37 AM | Comments (4) | TrackBack
March 17, 2005
bugs.R question
This one's just for the bugs.R users out there . . .
Hui Xie asks,
I have a question on the bugs function you wrote. It is an extremely useful functions for statistician to implement Baysian. Now I am using it for microarray bayesian analysis. One thing that bothered us is that for each analysis Winbugs will be invoked. As you know, in microarray analysis there are usually thousands of genes for seperate analysis which impies the winbugs window will open and close for thousands times. Therefore it is very disirable to turn it off in this situation. I would like to ask you if there is a way in bugs() that can make Winbugs run in the background.
My reply:
You can do this in Openbugs (run bugs() with the version=2 option); however, Openbugs is still pretty buggy and is not nearly as reliable as regular Winbugs. Another option is to reconfigure your Bugs model so that it does hundreds or thousands of analyses at once. This should be easy to do just by looping. Finally, I'd like to say you could do it in Umacs (our universal Markov chain sampler) but we're still working on that program!
Posted by Andrew at 12:16 AM | Comments (3)
March 16, 2005
Learning from self-experimentation
Seth Roberts is a professor of psychology at Berkeley who has used self-experimentation to generate and study hypotheses about sleep, mood, and nutrition. He wrote an article in Behavioral and Brain Sciences describing ten of his self-experiments. Some of his findings:
Seeing faces in the morning on television decreased mood in the evening and improved mood the next day . . . Standing 8 hours per day reduced early awakening and made sleep more restorative . . . Drinking unflavored fructose water caused a large weight loss that has lasted more than 1 year . . .
As Seth describes it, self-experimentation generates new hypotheses and is also an inexpensive way to test and modify them. One of the commenters, Sigrid Glenn, points out that this is particularly true with long-term series of measurements that might be difficult to carry out on experimental volunteers.
Heated discussion
Behavioral and Brain Sciences is a journal of discussion papers, and this one had 13 commenters and a response by Roberts. About half the commenters love the paper and half hate it. My favorite "hate it" comment is by David Booth, who writes, "Roberts can swap anecdotes with his readers for a very long time, but scientific understanding is not advanced until a literature-informed hypothesis is tested between or within groups in a fully controlled design shown to be double-blind." Tough talk, and controlled experiments are great (recall the example of the effects of estrogen therapy), but Booth is being far too restrictive. Useful hypotheses are not always "literature-informed," and lots has been learned scientifically by experiments without controls and blindness. This "NIH" model of science is fine but certainly is not all-encompassing (a point made in Cabanac's discussion of the Roberts paper).
The negative commenters were mostly upset by the lack of controls and blinding in self-experiments, whereas the positive commenters focused on individual variation, and the possibility of self-monitoring to establish effective treatments (for example, for smoking cessation) for individuals.
In his response, Roberts discusses the various ways in which self-experimentation fits into the landscape of scientific methods.
My comments
I liked the paper. I followed the usual strategy with discussion papers and read the commentary and the response first. This was all interesting, but then when I went back to read the paper I was really impressed, first by all the data (over 50 (that's right, 50) scatterplots of different data he had gathered), and second by the discussion and interpretation of his findings in the context of the literature in psychology, biology, and medicine.
The article has as much information as is in many books, and it could easily be expanded into a book ("Self-experimentation as a Way of Life"?). Anyway, reading the article and discussions led me to a few thoughts which maybe Seth or someone else could answer.
First, Seth's 10 experiments were pretty cool. But they took ten years to do. It seems that little happened for the first five years or so, but then there were some big successes. It would be helpful to know if he started doing something in the last five years that made his methods more effective. If someone else wants to start self-experimenting, is there a way to skip over those five slow years?
Second, his results on depression and weight control, if they turn out to generalize to many others, are huge. What's the next step? Might there be a justification for relatively large controlled studies (for example, on 100 or 200 volunteers, randomly assigned to different treatments)? Even if the treatments are not yet perfected, I'd think that a successful controlled trial would be a big convincer which could lead to greater happiness for many people.
Third, as some of the commenters pointed out, good self-experimentation includes manipulations (that is, experimentation) but also careful and dense measurements--"self-surveillance". If I were to start self-experimentation, I might start with self-surveillance, partly because the results of passive measurements might themselves suggest ideas. All of us do some self-experimentation now and then (trying different diets, exercise regimens, work strategies, and so on). Where I suspect that we fall short is in the discipline of regular measurements for a long enough period of time.
Finally, what does this all say about how we should do science? How can self-experimentation and related semi-formal methods of scientific inquiry be integrated into the larger scientific enterprise? What is the point where researchers should jump to a larger controlled trial? Seth talks about the benefits of proceeding slowly and learning in detail, but if you have an idea that something might really work, there are benefits in learning more about it sooner.
P.S. Some of Seth's follow-up studies on volunteers are described here (for some reason, this document is not linked to from Seth's webpage, but it's referred to in his Behavioral and Brain Sciences article).
Posted by Andrew at 12:35 AM | Comments (11) | TrackBack
March 15, 2005
Don't waste your time reading this one
One of the major figures in Segerstrale's book is John Maynard Smith, who she refers to as "Maynard Smith." Shouldn't it be just "Smith"? Perhaps it's a British thing? When reading about 20th century English history, I always wondered why David Lloyd George was called "Lloyd George" rather than simply "George," but I figured that was just to avoid confusing him with the king of that name.
Posted by Andrew at 8:39 PM | Comments (4)
Still more on science and ideology
In the comments to this entry, Aleks points out that the correlations between scientific views and political ideology are not 100%, even at any particular point in time. (In my earlier entry, I had discussed how these political alignments have shifted over time.)
The question then arises: why care about this at all? Why not just evaluate the science on scientific grounds and ignore the ideology?
I'd like to ignore ideology--actually, I personally feel that I can evaluate scientific claims dispassionately--but maybe it's not so easy. One interesting point made by Ullica Segerstrale in Defenders of the Truth is that, by attacking sociobiology on political grounds, the "anti's" (Lewontin, Gould, Chorover, etc.) reduced the credibility of their scientific criticisms.
In fact, my impression from her book is that Segerstrale herself was somewhat affected in this way, reflexively considering criticisms of sociobiology to be politically-motivated even when they could have just been motivated by scientific skepticism.
For example, one of the heroes of Defenders of the Truth is Bill Hamilton, a British geneticist who came up with some creative and sophisticated ideas in the 1960s on kin selection. In 1973, he presented a conference paper including the following passage (reprinted on page 147 of Segerstrale's book):
The incursions of barbaric pastoralists seem to do civilizations less harm in the long run than one might expect. Indeed, two dark ages and renaissances in Europe suggest a recurring pattern in which a renaissance follows an incursion by about 800 years. It may even be suggested that certain genes or traditions of pastoralists revitalize the conquered people with an ingredient of progress which tends to die out in a large panmictic population for the reasons already discussed. I have in mind altruism itself, or the part of the altruism which is perhaps better described as self-sacrificial daring. By the time of the renaissance it may be that the mixing of genes and cultures (or of cultures alone if these are the only vehicles, which I doubt) has continued long enough to bring the old mercantile thoughtfulness and the infused daring into conjunction in a few individuals who then find courage for all kinds of inventive innovation against the resistance of established thought and practice.
Segerstrale discusses how this was attacked as "racist"--which seems like overkill--but perhaps does not give enough attention to the possibility that this idea of Hamilton's is just silly. I mean, the idea that the pastoral life is so mellow that the genes for "daring" disappear, and then they get an infusion of fresh new blood . . . one can certainly see the connection to fascism, but to me it just seems more like overreach: Hamilton did amazing work explaining the existence of altruism under natural selection, and then maybe he got overconfident and thought he had found the key to human history. I'm not surprised that people would find this a bit ridiculous.
I'm not knocking Segerstrale's book, which had all this information there--I'm just suggesting that maybe the objections to some of the more extreme claims of sociobiology (such as the passage above) could have been scientific as much as ideological.
In summary . . .
I'm saying that ideological views of science are important, not because we should use our ideology to decide what to believe, but because understanding others' ideologies might help us understand the motivations underlying their beliefs.
(One reason that working on radon was interesting is that there's no "pro-radon" lobby, and many of the reactions that people have to radon seem purely ideological, depending on how people feel about government regulation, environmental risk, etc.)
Posted by Andrew at 12:11 AM | Comments (3)
March 14, 2005
Output assessment for Monte Carlo simulations via the score statistic
Yanan Fan, Steve Brooks, and I wrote a paper on using the score statistic to assess convergence of simulation output, which will appear in Journal of Computational and Graphical Statistics. The idea of the paper is to make use of certain identities involving the derivative of the logarithm of the target density. The paper introduces two convergence diagnostics. The first method uses the identity that the expected value of this derivative should be zero (if one is indeed drawing from the target distribution). The second method compares marginal densities estimated empirically from simulation draws to those estimated using path sampling. For both methods, multiple chains can be used to assess convergence, as we illustrate using some examples.
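As a rough illustration of the first identity only (not the diagnostic as implemented in the paper), take a standard normal target, for which the derivative of the log density at x is simply -x. Draws from the target should give an average score near zero, while draws from the wrong distribution should not. The shifted "unconverged" draws, the sample size, and the function name below are assumptions made for this sketch, and the draws are independent rather than coming from a real Markov chain, so the simple standard error shown here would need adjusting for autocorrelated output.

```python
import math
import random

def score_standard_normal(x):
    """d/dx of log N(x | 0, 1), which equals -x."""
    return -x

def mean_and_se(values):
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(var / n)

rng = random.Random(0)
n_draws = 20000

# Draws from the target distribution itself.
converged = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
# Hypothetical draws from a sampler stuck away from the target.
unconverged = [rng.gauss(1.5, 1.0) for _ in range(n_draws)]

good_mean, good_se = mean_and_se([score_standard_normal(x) for x in converged])
bad_mean, bad_se = mean_and_se([score_standard_normal(x) for x in unconverged])

print(f"draws from target: mean score {good_mean:.3f} (se {good_se:.3f})")
print(f"draws off target:  mean score {bad_mean:.3f} (se {bad_se:.3f})")
```

A mean score many standard errors from zero is evidence that the draws are not from the target; a mean near zero is necessary but not sufficient for convergence.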
Posted by Andrew at 12:37 AM | Comments (0)
March 11, 2005
"I don't recommend that you take drugs, but . . . "
Well, now that I'm telling stories . . . When reading "Defenders of the Truth", I came across the name of Stephen Chorover--he was one of the left-wing anti-sociobiology people. As a freshman at MIT, I took introductory psychology (9.00, I believe it was), and Chorover was one of the two professors. He would give these really vague lectures--the only thing I remember was when he told us about his experiences with mescaline. He said something like, "I don't recommend that you take drugs, but the only way you'll know what it's like is to try it." Seemed like a real burned-out 60's type. (The course was co-taught, and the other prof was a young guy named Jeremy Wolfe, who was a dynamic lecturer but unfortunately spent all his time talking about perception, mostly vision, which might be interesting but certainly wasn't why a college freshman takes psychology.) The course also had a weekly evening meeting that was in a room too small for us all to fit in, because, they told us, "we know you won't show up anyway." Another great message to send to the freshmen . . .
(I really shouldn't go around mocking college instructors since I know I have my own flaws. In the first semester of teaching, one of the students came up to me at the end of the semester and said, "Don't worry, Prof. Gelman. You'll do a better job teaching next time.")
Anyway, it was just funny to see Chorover's name in print after so many years. Also, Steven Pinker gave a guest lecture in that intro psych class of ours, but that was before he became political.
Posted by Andrew at 11:44 AM | Comments (2)
Science and ideology
Writing about the changing nature of science and ideology (see also here) reminds me that in grad school, Joe Schafer used to talk about the "left-wing Bayesians" and the "right-wing frequentists," which might even have been true although I can't see any scientific reason for such an alignment. I mean, I can see a lot of rationalizations (for example, Bayesian inference was more of a new, maybe risky, approach, hence perhaps would be more popular with radicals than with conservatives), but they don't seem so convincing to me.
I also remember that Xiao-Li Meng told me that in China they didn't teach Bayesian statistics because the idea of a prior distribution was contrary to Communism (since the "prior" represented the overthrown traditions, I suppose). Or maybe he was pulling my leg, I dunno.
Posted by Andrew at 11:12 AM | Comments (4)
March 10, 2005
Contingency and ideology
Following Bob O'Hara's recommendation, I read Defenders of the Truth: The Battle for Science in the Sociobiology Debate and Beyond, by Ullica Segerstrale. As Bob noted in his comment, this is a story of a bunch of scientists who managed to have a highly ideological debate about evolutionary theory despite all being on the left side of the political spectrum (sort of like that famous scene from The Life of Brian with the Judean People's Front).
Nature vs. nurture, right vs. left
Anyway, I wanted to use this to continue the discussion of science and political ideology.
As Segerstrale describes in detail for the debates on sociobiology and related genetics issues, it was the conservatives (like the IQ testers and the Bell Curve guys) who generally favored ideas of genetic determinism: for individuals, you are your (inherited) genes, and for the species as a whole, we are what we are programmed to be. The liberals (Margaret Mead, etc.) would allow individuals, societies, and humanity in general to be changed more by environment. Segerstrale also points out that the resulting scientific debates were a mess because what is true scientifically is not necessarily what you want to be true for political reasons.
But now step back a minute. Is it always true that the conservatives favor determinism ("nature") and the liberals favor "nurture"? Is it always the IQ guys vs. Margaret Mead? As Segerstrale notes, you can't get much more left-wing than Noam Chomsky, and he certainly favors innate explanations for human behavior. (But, interestingly, Steven Pinker is a Chomskyan linguist whose deterministic scientific views seem to have pulled him rightward politically--"being mugged by reality," perhaps?)
Chomsky is no anomaly. In the field of evolutionary theory, J. B. S. "I'd lay down my life for two brothers or eight cousins" Haldane was a communist, which he didn't think conflicted with a belief in kin selection!
Historians
Nowadays, an emphasis on genetic determinism of human behavior is somewhat associated with a conservative political stance. This correlation makes sense, in that it might be difficult to change the world in a "liberal" direction if human nature is always keeping things as they are. (Here I'm ignoring the Biblical fundamentalist branch of conservatism that disputes evolution entirely.)
But in historical writing, the correlation goes the other way, or at least that is how it appears to me. Left-wing historians, following Marx, tend to see systematic patterns in history and favor impersonal demographic forces, whereas conservative historians favor explanations based on contingency. These issues are discussed in depth in the admirably even-handed In Defense of History, by Richard Evans. (Once again, there are exceptions, such as the left-wing contingencist A. J. P. Taylor, but the general pattern seems to hold.)
Once again, this alignment of political ideology with historical models makes some sense: the view based on contingency ("Cleopatra's nose") aligns with the "great man" approach to history, which is conservative in the sense of being a traditional way to view history and also in that it favors the stories of the powerful. Big systems are more about the masses of people and fit in better with a leftist attitude.
But it's funny all the same, that scholars in these different fields can be so sure that their scientific approach happens to align with their ideology. Stephen J. Gould's "Cleopatra's nose" view of biology is liberal, whereas the corresponding view held by a historian (for example, Daniel Boorstin) is conservative.
Posted by Andrew at 12:29 AM | Comments (8)
March 9, 2005
p (A|B) != p (B|A)
A common mistake in conditional probability is to confuse the conditioning (that is, to mistake p(A|B) for p(B|A)). One complication here is that our language for probability can be ambiguous. For example, I have done a classroom demo replicating the experiment of Kahneman and Tversky in which students guess "the percentage of African countries in the United Nations." I always thought this meant
100*(# African countries in U.N.)/(# countries in U.N.).
But some students thought this meant
100*(# African countries in U.N.)/(# countries in Africa).
So, to even ask the question clearly, I need to ask for "the percentage of countries in the U.N. that are in Africa," or something like that.
Anyway, I recently went to a talk by Maryanne Schretzman (Dept of Homeless Services, NYC), where an interesting example arose of the difference between p(A|B) and p(B|A). They're looking at new admissions to the shelter system, and a lot of them are people who have been released from jail. But the jail administrators aren't so interested in talking about this, because, of all the people released from jail, only a small percentage go to homeless shelters. Taking A to be "released from jail" and B to be "newly admitted to a shelter," p(A|B) is high, but p(B|A) is small. Same numerator, but the denominator is much bigger in the latter case.
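The arithmetic can be sketched with made-up counts (these numbers are purely hypothetical and are not from the talk): the two conditional probabilities share a numerator, the number of people who were both released from jail and admitted to a shelter, and differ only in what they divide by.

```python
# Hypothetical counts for one time period; none of these come from real data.
released_and_admitted = 300   # released from jail AND newly admitted to a shelter
shelter_admissions = 1_000    # all new shelter admissions
jail_releases = 10_000        # all people released from jail

# Share of new shelter admissions who came from jail.
p_jail_given_shelter = released_and_admitted / shelter_admissions

# Share of people released from jail who entered a shelter.
p_shelter_given_jail = released_and_admitted / jail_releases

print(f"p(jail | shelter) = {p_jail_given_shelter:.2f}")
print(f"p(shelter | jail) = {p_shelter_given_jail:.2f}")
```

With these invented counts, the shelter system sees jail releases as a large share of its intake, while the jail system sees shelter entry as a rare outcome, and both are describing the same 300 people.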
Posted by Andrew at 7:05 AM | Comments (4) | TrackBack
March 8, 2005
Still more on R software for matching for causal inference
Following up on this and this and this, Dan Ho sent me the following discussion of the differences between his, Jasjeet Sekhon's, and Ben Hansen's matching programs:
Hi Andrew,
On the matching software issues there are a few other differences as well. The main difference between the approaches is that Jas' program contemplates substituting conventional parametric models with an estimator that simultaneously conducts matching and a bias adjustment. Our alternative theory (as outlined by Cochran and Rubin in the 1970s in a specific linear context and generalized to all parametric models in our paper at http://gking.harvard.edu/files/matchp.pdf) is that matching is best used as preprocessing. Following our approach, users can employ all the knowledge about parametric models that they have developed and merely add a preprocessing step. The result is greatly reduced model dependence and increased accuracy of parametric estimates.
Other differences include that:
(1) MatchIt enables analysis of any outcome model (OLS, logit, ordered probit, etc.) and is integrated with Zelig. The AI code appears to assume linearity for the applied bias-adjustment.
(2) MatchIt incorporates optimal matching and full matching code by Hansen as suggested by Rosenbaum and others.
(3) MatchIt also permits subclassification, exact restrictions, Mahalanobis-distances, etc., as documented at http://gking.harvard.edu/matchit/
Lastly, one clarification to Jake's point: the default in MatchIt is not to perform exact matching with replacement. Instead, the MatchIt default for exact matching simply assigns subclasses to all units with the same pretreatment covariates. Matching with replacement is a separate option incorporated into MatchIt.
Dan
Once again, I'll refer to this paper by Rubin for an overview of propensity score matching.
Posted by Andrew at 12:23 AM | Comments (0)
March 7, 2005
The secret weapon
An incredibly useful method is to fit a statistical model repeatedly on several different datasets and then display all these estimates together. For example, running a regression on data on each of 50 states (see here as discussed here), or running a regression on data for several years and plotting the estimated coefficients over time.
Here's another example:
The idea is to fit a separate model for each year, or whatever, and then to look at all these estimates together to see trends. This can be considered as an approximation to multilevel modeling, with the partial pooling done by eye on the graphs rather than using a full statistical model.
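A minimal sketch of the procedure, assuming simulated data: the years, the drifting "true" slope, and the helper name ols_slope are all invented for illustration. It fits the same simple regression separately for each year and lists the estimates with standard errors side by side; in practice these would be plotted against year.

```python
import math
import random

def ols_slope(x, y):
    """Least-squares slope of y on x, with its standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se

rng = random.Random(1)
years = list(range(1980, 2001, 4))  # hypothetical survey years

estimates = {}
for t, year in enumerate(years):
    true_slope = 0.5 + 0.1 * t  # made-up coefficient that drifts over time
    x = [rng.uniform(0, 10) for _ in range(200)]
    y = [1.0 + true_slope * xi + rng.gauss(0, 1) for xi in x]
    estimates[year] = ols_slope(x, y)  # one separate fit per year

# Display all the estimates together so the time trend is visible.
for year, (slope, se) in estimates.items():
    print(f"{year}: slope = {slope:.2f} +/- {se:.2f}")
```

Pooling all the years into one regression would return a single averaged slope and hide the drift that the year-by-year display makes obvious.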
One reason the secret weapon is so great can be seen in various analyses of repeated cross-sectional data, with estimates every two or four years (for example, in studying Congressional or Presidential elections). The horrible alternative often involves people pooling data over decades in order to get stable estimates, but as a result it is then difficult to see time trends, and models get oversimplified.
We call this technique the "secret weapon" because it seems to be done much less often than it could be. I suspect the technique is not used more because people are fixated on point estimates and don't realize that a graph can tell a clearer story. Another failure of classical statistical estimation!
(For some examples of the secret weapon with repeated cross-sectional data, see Figures 2, 4, 9, and 10 of this paper.)
Well, I guess it's not a secret anymore...
Posted by Andrew at 12:52 AM | Comments (2)
March 3, 2005
Meritocracy won't happen: the problem's with the "ocracy"
I was reading something the other day that referred in an offhand way to "meritocracy", which reminded me of a wide-ranging and interesting article by James Flynn (the discoverer of the "Flynn effect", the steady increase in average IQ scores over the past sixty years or so). Flynn's article talks about how we can understand variation in IQ within populations, between populations, and changes over time.
At the end of his article, Flynn gives a convincing argument that a meritocratic future is not going to happen and in fact is not really possible.
He first summarizes some data showing that America has not been getting more meritocratic over time. He then presents the killer theoretical argument:
The case against meritocracy can be put psychologically: (a) The abolition of materialist-elitist values is a prerequisite for the abolition of inequality and privilege; (b) the persistence of materialist-elitist values is a prerequisite for class stratification based on wealth and status; (c) therefore, a class-stratified meritocracy is impossible.
Basically, "meritocracy" means that individuals with more merit get the goodies. From the American Heritage dictionary: "A system in which advancement is based on individual ability or achievement." As Flynn points out, this leads to a contradiction: to the extent that people with merit get higher status, one would expect they would use that status to help their friends, children, etc., giving them a leg up beyond what would be expected based on their merit alone.
Flynn also points out that the promotion and celebration of the concept of "meritocracy" is also, by the way, a promotion and celebration of wealth and status--these are the goodies that the people with more merit get. That is, the problem with meritocracy is that it's an "ocracy". As Flynn puts it:
People must care about that hierarchy for it to be socially significant or even for it to exist. . . . The case against meritocracy can also be put sociologically: (a) Allocating rewards irrespective of merit is a prerequisite for meritocracy, otherwise environments cannot be equalized; (b) allocating rewards according to merit is a prerequisite for meritocracy, otherwise people cannot be stratified by wealth and status; (c) therefore, a class-stratified meritocracy is impossible.
He also has some normative arguments which you could take or leave, but the social-science analysis is convincing to me.
Posted by Andrew at 8:15 AM | Comments (3) | TrackBack
March 2, 2005
EDA for HLM
Jake Bowers sent me a paper he and Katherine Drake wrote on exploratory data analysis for multilevel models.
My comments
The paper begins with a gradual justification of the use of multilevel models and then discusses a specific example (a regression of voting participation on education) where it could be interesting to allow slopes to vary by state, with state-level predictors.
The paper is interesting, with a variety of pretty pictures. I have a couple of technical comments. First, I like that they pull out the state-level predictors and put them in Table 2. The individual-level dataset in Table 1 can then have state indicators without the state-level predictors. (This is an issue we discuss further in Chapter 7 of our forthcoming book on regression and multilevel models.)
Second, the paper has some useful discussion of correlations in varying intercepts and slopes. But much of the correlation in this example is a statistical artifact arising from the fact that the education predictor is far from 0. I'd suggest pre-processing the "years of education" by subtracting 12 before you start. This also gives you a direct interpretation of the intercepts.
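The artifact can be seen in the simplest case. For a least-squares regression on a single predictor x, the sampling correlation between the estimated intercept and slope is -mean(x) / sqrt(mean(x^2)), which approaches -1 when all the x values are far from zero. The sketch below evaluates that formula for a made-up set of education values, before and after subtracting 12; the data and the function name are illustrative assumptions and are not taken from the Bowers and Drake paper.

```python
import math

def intercept_slope_corr(x):
    """Sampling correlation of the OLS intercept and slope for predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_x2 = sum(xi * xi for xi in x) / n
    return -mean_x / math.sqrt(mean_x2)

# Made-up years of education for ten respondents.
years_of_education = [8, 10, 12, 12, 12, 13, 14, 16, 16, 18]

raw_corr = intercept_slope_corr(years_of_education)
centered_corr = intercept_slope_corr([e - 12 for e in years_of_education])

print(f"raw predictor:        corr(intercept, slope) = {raw_corr:.2f}")
print(f"predictor minus 12:   corr(intercept, slope) = {centered_corr:.2f}")
```

With the raw predictor, the intercept is an extrapolation to zero years of education, so any error in the slope is carried straight into it; shifting the origin to 12 years removes most of that induced correlation.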
The pictures are interesting, but some tell more than others. In particular, Figure 2 seems pretty useless. For one thing, the intercepts don't mean much since there's almost nobody in the data with 0 education. (Actually, maybe those people with education less than 8 years should be moved up to 8--putting them all in the "no high school" category.)
Figure 3 is nice. But what is the ordering of the states? Perhaps it's stated in the text, but it should be in the figure caption. Figure 4 is nice, and at this point I'd say: just fit the multilevel model and start displaying some inferences from that. Why bother with the noisy least-squares estimates? Similarly, Figure 5 is a mess because the super-noisy data set the scale and obscure all the interesting patterns.
Figure 7 just seems really silly to me. At this point, you're developing a lot of theory to work with these noisy least-squares regression coefficients. I'd fit the multilevel model first, and then see to what extent the model is not fitting. Or maybe I'm misunderstanding what they're doing (if so, I apologize).
In summary, I like the idea of making graphs but in this case I think I'd rather start by fitting a reasonable multilevel model and then making some graphs from there--first some graphs to summarize the estimates, then some EDA graphs to check model fit and learn more.
Finally, I really like how they use informative x-axes on the plots. I hate it when people plot things in alphabetical order or using id numbers.
(In case you missed it above, here's a link to the Bowers and Drake paper.)
Posted by Andrew at 12:43 PM | Comments (2)
March 1, 2005
Matching and matching
Daniel Ho, Kosuke Imai, Gary King, and Liz Stuart have a computer program in R to do matching for observational studies. Jasjeet Sekhon has a computer program in R to do matching for observational studies. Matching, followed by regression, has been suggested for decades by William Cochran and Donald Rubin as a way to reduce bias in observational studies.
I asked Jasjeet how his software differed from that of Liz Stuart et al. He replied:
There are a number of important differences. The three most important are:
1) My [Jasjeet's] package includes a function "GenMatch()" which finds optimal balance using multivariate matching where a genetic search algorithm (called "genoud") determines the weight each covariate is given. The function never consults the outcome and is able to find amazingly good balance in datasets where human researchers have failed to do so. The use of GenMatch resolves the Dehejia and Wahba vs. Todd and Smith debate. I'm writing a paper on this approach to matching right now.
For more information on R-GENOUD (R-GENetic Optimization Using Derivatives) see http://jsekhon.fas.harvard.edu/rgenoud/
2) The core matching function, Match(), implements the standard errors and bias correction of Abadie and Imbens (forthcoming). The standard errors are principled when matching is done directly with covariates (as done by "GenMatch") or when one uses a known propensity score. Ties are handled in a deterministic and coherent fashion.
3) The MatchBalance() function provides an array of rigorous and innovative balance tests including bootstrap Kolmogorov-Smirnov (KS) tests which provide consistent test levels even when the variable under consideration is not continuous. There is also a version of the KS test which uses both the bootstrap and a Monte Carlo step to provide a consistent multivariate balance test.
I [Andrew] can't vouch for any of these programs myself because (I'm embarrassed to say) I've never actually used matching. I've analyzed data from observational studies but in the examples I've happened to work with, the treatment and control groups have been reasonably balanced. Matching is worth thinking about, though. Ultimately I'd like to put it all in a Bayesian poststratification framework.
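For readers who have not seen matching at all, here is a toy sketch of the basic idea, using simulated data. It is not the algorithm or the interface of either R package discussed above: it does plain nearest-neighbor matching with replacement on one covariate, followed by a difference in means rather than a regression, and every name and number in it is made up.

```python
import random

rng = random.Random(2)
TRUE_EFFECT = 1.0  # made-up treatment effect used to simulate the data

treated, control = [], []
for _ in range(500):
    x = rng.random()               # a single pretreatment covariate
    is_treated = rng.random() < x  # units with high x are treated more often
    y = 3 * x + (TRUE_EFFECT if is_treated else 0.0) + rng.gauss(0, 0.1)
    (treated if is_treated else control).append((x, y))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Naive comparison: confounded, because the treated units have higher x.
naive = mean(y for _, y in treated) - mean(y for _, y in control)

def nearest_control(x0):
    """Control unit whose covariate is closest to x0 (reuse allowed)."""
    return min(control, key=lambda unit: abs(unit[0] - x0))

# Matched comparison: each treated unit against its closest control.
matched = mean(y - nearest_control(x)[1] for x, y in treated)

print(f"naive difference:   {naive:.2f}")
print(f"matched difference: {matched:.2f}")
```

In this simulation the naive difference mixes the treatment effect with the effect of the covariate, while the matched difference compares units that are similar on x. Real applications involve many covariates, balance checks, and appropriate standard errors, which is what the packages above are for.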
Posted by Andrew at 11:52 AM | Comments (2)