
My new email policy

I cleaned out my inbox again. This time I mean business. I'm gonna read my email every day at 4pm (approximately) and deal with every email immediately, right then. No more of this e-mail-all-day-and-all-night nonsense!

Reagan's words

Edo sent along this paper describing a Bayesian approach to determining the authorship of Reagan's speeches.

Juan Carlos writes,

I just have some simple questions regarding the 50 cows example you used on p. 222 of the BDA book, and I was wondering if you could give me a hint.

What would be your reaction if you read a study like the 50 cows experiment? More specifically, how would the decision to re-randomize affect the evaluation? I see this kind of design pretty often in education research, and I don't have a definitive take on it. For example, I've recently read a paper where the authors randomly assigned students to treatment and control by gender and within schools. But then, they checked to see if they had equal representation between treatment and control by disability, ethnicity, and lunch status - and if not, they randomized again until they got a "good balance" in terms of these three covariates.

I see in your book (p. 222) that you said "the treatment assignment is ignorable, but unknown". How "bad" is this? Would it have any impact on the evaluation's results? Would it be better to use a randomized block design (including disability, ethnicity, and lunch status)? Why would this be better?

I've talked with other statisticians around here, but none of them could give me a definitive answer. Then, the 50 cows example came to my mind, and I knew you'd be the right person to comment on this. Maybe you or your colleagues could discuss this issue in your blog; other people might be interested in the same topic ...

My response: Don Rubin told us about the cow experiment in the class I took from him in 1985--I think he may have got the data during his visit to the University of Wisconsin, but I'm not sure. Anyway, the correct analysis is basically to do a regression of outcome on treatment, also controlling for the variables used in the treatment assignment. In this case these are simply the pre-treatment variables given in the data table. It's best if these variables are at least roughly balanced among treatment groups, as this makes the inferences more robust to assumptions about the regression model. (This is an issue we also discuss in our regression chapter and in our incumbency advantage paper, and it has subsequently been labeled "double robustness" in some of the statistical literature.) Anyway, once you control for the variables used in the treatment assignment, it really doesn't matter at all that the cows were re-randomized, or re-re-randomized, or whatever. The key is that the assignment was based on just that information and nothing else (not, for example, whether the cows looked healthy, if that information was not recorded). According to Rubin, the cow experiment was pretty clean in that the randomizers really only had that information available.

I think the same analysis would be ok in the school study you describe. Again, the sloppy randomization doesn't really matter. A randomized block design would be fine too.
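In R, the recommended analysis is just a regression of the outcome on treatment plus the pre-treatment variables used in the assignment. Here's a minimal sketch with simulated data standing in for the cow table (all variable names and numbers are hypothetical):

    # Simulated stand-in for the 50-cows data
    set.seed(1)
    cows <- data.frame(
      treatment      = rep(0:1, each = 25),
      initial_weight = rnorm(50, 1000, 100),
      age            = rnorm(50, 48, 12),
      lactation      = rpois(50, 2)
    )
    cows$milk_yield <- with(cows,
      3000 + 200 * treatment + 2 * initial_weight + rnorm(50, 0, 300))

    # Regress outcome on treatment, controlling for the pre-treatment
    # variables used in the (re-)randomized assignment
    fit <- lm(milk_yield ~ treatment + initial_weight + age + lactation,
              data = cows)
    summary(fit)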

Margin of error

Daniel Lippman sent me this news article which would be an excellent thing to give your statistics students to read if you're covering confidence intervals and sampling.

Do suicide barriers save lives?

Garrett Glasgow sent along this study on the effectiveness of suicide barriers on bridges:

With support from mental health workers, elected officials, the California Highway Patrol, and the local community, Caltrans has announced their intention to install a suicide prevention barrier on the Cold Spring Bridge by 2010 at a cost of $605,000. During the course of the debate a number of people have claimed that such a barrier would not only deter suicides at the Cold Spring Bridge, but actually prevent suicides and thus save lives. This claim is unfounded. A review of the evidence presented in favor of building the barrier and my own research reveals that there is no evidence that installing a suicide prevention barrier on the Cold Spring Bridge would save lives.

As Garrett writes, "there is a distinction between preventing suicides and preventing suicides at a particular location."

Jeff pointed me to this interesting paper by David Primo, Matthew Jacobsmeier, and Jeffrey Milyo comparing multilevel models and clustered standard errors as tools for estimating regression models with two-level data.
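If you want to try the comparison yourself, here's a minimal sketch of the two approaches on simulated two-level data (assuming the lme4, sandwich, and lmtest packages are installed; the paper's own analysis is of course more involved):

    library(lme4)      # multilevel (varying-intercept) model
    library(sandwich)  # cluster-robust covariance
    library(lmtest)    # coeftest()

    set.seed(1)
    d <- data.frame(group = rep(1:30, each = 20), x = rnorm(600))
    d$y <- 0.5 * d$x + rnorm(30, sd = 0.7)[d$group] + rnorm(600)

    # Multilevel model:
    summary(lmer(y ~ x + (1 | group), data = d))

    # OLS with clustered standard errors:
    ols <- lm(y ~ x, data = d)
    coeftest(ols, vcov = vcovCL(ols, cluster = d$group))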

From the Nelson and Simmons paper:

Across more than 90 years of professional baseball, batters whose names began with K struck out at a higher rate (in 18.8% of their plate appearances) than the remaining batters (17.2%), . . . players with the initial K struck out more often than other players even when we controlled for the average year in which each athlete played (p < .015). In fact, when we controlled for average year of play (and excluded initials associated with fewer than 5 Major League players—e.g., U as a first initial), K was both the first initial and the last initial associated with the highest strikeout rate. Furthermore, ethnic confounds are unlikely to account for the effect, as an analysis controlling for whether players were American or foreign born also showed that batters with the initial K were reliably more likely to strike out than other players were.

Their explanation is psychological:

Despite a universal desire to avoid striking out, players whose first or last names began with the letter K struck out more often than other players. For players with this initial, the explicitly negative performance outcome may feel implicitly less aversive. Even Karl ‘‘Koley’’ Kolseth would find a strikeout aversive, but he might find it a little less aversive than players who do not share his initials, and therefore he might be less motivated to avoid striking out.

This probably explains Dave Kingman pretty well. Not to mention Vince Coleman. I don't know if I believe this, or, maybe more to the point, what it would take for me to believe this. Somehow it's easier for me to accept the positive aspects of liking one's own name (dentists named Dennis, lawyers named Laura, etc.) than these sorts of negative aspects. Logically, they do go together, I guess. There's lots more of this in the Nelson and Simmons paper.

Susan sent me this paper by Leif Nelson and Joseph Simmons:

In five studies, we [Nelson and Simmons] found that people like their names enough to unconsciously pursue consciously avoided outcomes that resemble their names. Baseball players avoid strikeouts, but players whose names begin with the strikeout-signifying letter K strike out more than others (Study 1). All students want As, but students whose names begin with letters associated with poorer performance (C and D) achieve lower grade point averages (GPAs) than do students whose names begin with A and B (Study 2), especially if they like their initials (Study 3). Because lower GPAs lead to lesser graduate schools, students whose names begin with the letters C and D attend lower-ranked law schools than students whose names begin with A and B (Study 4). Finally, in an experimental study, we manipulated congruence between participants’ initials and the labels of prizes and found that participants solve fewer anagrams when a consolation prize shares their first initial than when it does not (Study 5). These findings provide striking evidence that unconsciously desiring negative name-resembling performance outcomes can insidiously undermine the more conscious pursuit of positive outcomes.

I just love this kind of stuff. Here's the data on grade point averages for students whose names begin with A, B, C, D, or other letters:

initials.png

I don't have anything to add here, beyond my comments on the paper by Pelham, Mirenberg, and Jones on dentists named Dennis (see here, here, and here). On one hand, it seems pretty implausible to me that kids whose names begin with C and D are really sabotaging themselves like this. On the other hand, hey, there are the data. An effect of 0.02 in GPA is pretty tiny; then again, if it were much larger I wouldn't believe it . . . It would be interesting to see the average GPAs for all 26 letters, also looking at both first and last names.

P.S. In a comment below, Derek posted a hypothetical improved graph with all 26 letters. I had it up, but I removed it since it's not actually real data!

Conference on Hypocrisy and Sincerity

I got the following odd announcement in the email:

Robin Hanson writes:

It seems obvious - in the vast space of interesting topics, academics clump around a few familiar themes, neglecting vast territories between the currently fashionable clumps. This is sure how it seems to outsiders and students, at least for fields like social science or literature, fields which must cover a vast territory. For example, economists have thousands of papers on auctions, and hardly any papers on romance, even though most people think romance far more interesting and important than auctions.

This is an interesting question. I don't see it as special to academia--I'm not quite sure how to measure trendiness or clumpiness, but by any measure I assume it would be higher in art, literature, journalism, business, and lots of other fields. Back in the 50s and 60s there were lots of westerns on TV; now, not so much. And check out the Museum of Modern Art if you want to see trendiness. Or look at cars over the years.

However, academia is what I know best, and also there is some sense that academia should be held to higher standards, so Robin's original question seems worth addressing. I have three main thoughts here:

1. The scientific landscape is fractal. When an area is studied in depth, it commonly spawns related research. From this perspective, I think clumpiness is inevitable. I think it's misleading to think of research topics as being situated on a smooth Euclidean-type space.

2. I think there's a feedback-and-overcorrection mechanism. Overstudied fields often seem to be things that were recently the hot new thing. For example, for the past 15 years or so, there have been a zillion statistics Ph.D. theses in genetics. Much of this comes from funding, I'm sure, but I think a lot comes because people (students and faculty alike) think of genetics as exciting and new, a lot more exciting than seemingly boring topics such as sample surveys (which aren't actually boring at all!).

3. Three words: Division of labor. Economists study auctions, psychologists study romance. Gains can be made by people working outside their fields (psychologists studying auctions, economists studying romance), but, by and large, it makes sense for people to work on their topics of expertise.

Gamma-Poisson mixtures

Laura Hatfield writes,

A political science research blog

David Park, John Sides, and Lee Sigelman have a blog on political science research, with a focus on (but not limited to) American politics. There's some interesting stuff here, mostly descriptions of recent papers they've read, for example this on the importance of endorsements in primary elections and this on the economic costs of wars. Compared to our blog, theirs is going to have fewer pretty pictures and less on multilevel models, but on the upside they may have more political content and fewer comments of the "they should turn their tables into graphs" sort.

Also, I don't know if this new blog means that David won't be posting here anymore . . .

Dan Luu writes,

The short version of my question is: A lot of the papers I've been reading that sound really interesting don't seem to involve economics per se (e.g., http://home.uchicago.edu/~eoster/hivbehavior.pdf), but they usually seem to come out of econ (as opposed to statistics) departments. Why is that? Is it a matter of culture? Or just because there are more economists? Or something else?

If you don't mind getting random unsolicited email just because you write a blog, then there's the longer version of my question.

Frank Di Traglia writes,

I'm going to be teaching a three-week, introductory statistics course for local high school students next summer, and wanted to ask for your advice. I have two questions in particular.

First, I doubt that three weeks will be enough time to teach the usual Statistics 101 course. If you had only three weeks, what would you skip and what would you emphasize?

Second, since next year is an election year, I thought it might be fun to build the course around substantive examples from political science. Although I've enjoyed many of your poli-sci papers, my own background is not in this area (I did my master's in Statistics, and am currently pursuing a PhD in Economics). What would you consider to be the ten most interesting and accessible quantitative papers in this field?

My reply:

An interesting look at topology

Though this is more mathematics than statistics, I thought it would be relevant on the heels of the entry about the Krampf science experiment videos.

The blog 3 quarks daily recently posted a video showing how to turn a sphere inside out. See the post here.

Like the poster on that blog, my interest in topology is amateurish. I found this a good explanation of a topic that I imagine loses many people in its discussion. It was one of those (rare?) Internet videos after which I felt that I had truly gained some knowledge.

Assistance in picking colors and charts

A few years ago I started using Cindy Brewer's ColorBrewer system, which is based on experience from cartography, for picking the right color scheme for graphics. Recently, it has also been made into an R library. In particular, ColorBrewer distinguishes three types of color schemes: diverging, sequential, and qualitative:

palettes.png

A diverging scheme de-emphasizes values near the mean (appropriate for visualizing normally distributed real variables, or correlations, with color), a sequential scheme de-emphasizes values near zero (appropriate for visualizing exponentially distributed positive variables with color), and a qualitative scheme makes it easy to distinguish adjacent values (appropriate for categorical variables).

ColorBrewer provides guidelines on the appropriateness of a color scheme for print, monitors, laptops, projectors, photocopying, and even color blindness (I once had someone complain about my color schemes after a lecture, only to find out that he was color blind).

I've been frustrated by people who do not use continuous palettes for visualizing data, and RColorBrewer exacerbates the problem by not being able to create palettes of arbitrary size. But, using simple linear interpolation between colors, one can create very appealing continuous palettes that maintain the approximate perceptual linearity (meaning that a change in our perception of color strength is proportional to the change in value across the scale) of ColorBrewer's palettes:

infinite_palette.png
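In R this kind of interpolation is a one-liner; a minimal sketch using colorRampPalette from base R together with RColorBrewer (palette name and number of colors are just examples):

    library(RColorBrewer)

    # Start from an 11-color diverging ColorBrewer palette
    base_colors <- brewer.pal(11, "RdBu")

    # Linearly interpolate to as many colors as you like
    pal <- colorRampPalette(base_colors)(256)

    # Example: a continuous field rendered with the interpolated palette
    image(volcano, col = pal)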

While some people might argue that 11 bins are enough, I'd respond that binning is an act of pure and inexcusable laziness when you can easily visualize a continuum.

Just a few days ago, I came across another tool: the Juice Analytics Chart Chooser. It lists a number of chart types, both as pictures and as PowerPoint and Excel templates:

charts.png

Each chart may have some of the following features:


  • Trend involves a variable indicating time

  • Composition involves a set of variables that add up to 1

  • Distribution exhibits an occurrence count for different values of some variable

  • Comparison pairs up two or more variables

  • Relationship visualizes a complex relationship between two or more variables

Here you can see how visualizations carry many parallels to models: picking a visualization is very much like picking a good model. I have discussed this before.

Finally, here is a chart I use for assigning dimensions to graphical elements. A full circle indicates a good choice, an empty one an acceptable choice, and the absence of a circle means that the element isn't useful for presenting that particular aspect of the data.

graphics_elements.png

Note that nominal corresponds to the qualitative scale above, quantitative to diverging, and ordinal approximately to sequential.

Happy Thanksgiving!

I recently had some thoughts about negotiating trades in the NBA. Specifically, I heard that the Lakers and the Bulls were having daily discussions about a trade involving Kobe Bryant, for at least a week; that seemed like a long time to me. Was this week-long series of conversations productive and/or necessary? Are there no quantitative methods for structuring trade negotiations that could have been used to save these teams some time and energy? I've outlined a potential solution, which can probably be improved using methods from the literature on (1) statistical models for rankings and (2) bargaining and negotiating.

Red State Blue State Article Published

Our (Andrew Gelman, Boris Shor, Joseph Bafumi, and David Park) "Rich State, Poor State, Red State, Blue State: What's the Matter with Connecticut?" paper has finally been published in the November 2007 Quarterly Journal of Political Science. You can access the paper here.

Here is the abstract:

For decades, the Democrats have been viewed as the party of the poor, with the Republicans representing the rich. Recent presidential elections, however, have shown a reverse pattern, with Democrats performing well in the richer blue states in the northeast and coasts, and Republicans dominating in the red states in the middle of the country and the south. Through multilevel modeling of individual-level survey data and county- and state-level demographic and electoral data, we reconcile these patterns. Furthermore, we find that income matters more in red America than in blue America. In poor states, rich people are much more likely than poor people to vote for the Republican presidential candidate, but in rich states (such as Connecticut), income has a very low correlation with vote preference.

Here are the blog posts relating to this project.

Here are some radio interviews and external blog posts on preliminary versions of the paper.

. . . could you please ask them to stop saying, "We know that voting doesn't make good economic sense." I recognize that "Freakonomics" is intended to be entertainment, not scholarship, but I don't think these dudes are doing economics any favors by spreading this kind of misconception. Voting isn't a good way to make money, but that doesn't mean it "doesn't make good economic sense." The full story is here (based on an article by an economist and two political scientists). I'll repeat here for convenience, but I recommend going to the original entry to see some of the give-and-take in the comments:

Political Neuroscience

A piece by Brandon Keim in Wired points out some issues in the fMRI brain-politics study on reactions to presidential candidates discussed in a recent NYT op-ed. For example,

Let's look closer, though, at the response to Edwards. When looking at still pictures of him, "subjects who had rated him low on the thermometer scale showed activity in the insula, an area associated with disgust and other negative feelings." How many people started out with a low regard for Edwards? We aren't told. Maybe it was everybody, in which case the findings might conceivably be extrapolated to the swing voter population of the United States. But maybe it was just five or ten voters, of whom one or two had such strong feelings of disgust that it skewed the average. What about the photographs? Was he sweating and caught in flashbulb glare that would make anyone's picture look disgusting? How did the disgust felt towards Edwards compare to that felt towards other candidates? How well do scientists understand the insula's role in disgust -- better, I hope, than they understand the Romney-activated amygdala, which is indeed associated with anxiety, but also with reward and general feelings of arousal?

(And don't forget "Baby-faced politicians lose" on this blog.)

When is a bad graphic a good graphic?

This graphic (from SolarPowerRocks.com, which also gives references for the numbers, which I have not checked) is pretty neat. It compares annual U.S. energy R&D expenditures with the cost of the war in Iraq (one might well question whether it is reasonable or even meaningful to compare those, but that's not what this post is about). The graphic is neat precisely because it is so useless: it makes the point that the costs that it compares are so wildly different in magnitude that you can't even plot them on the same graph. Of course any of us could think of ways that you could plot them on the same graph, but that would make the graphic more informative at the expense of making it useless for its intended purpose.


SolarPowerRocks.com says, "These figures are in millions. The source for energy R&D expenditures is from the National Council for Science and the Environment."

Block models for network data

Aleks writes,

Studies of anti-gay prejudice

I was looking at the Archives of Sexual Behavior and saw this review by Grace Epstein of the strikingly titled, "God Hates Fags: The Rhetorics of Religious Violence." Epstein writes,

Hey, a statistician did that!

videospan.jpg

Oddly enough, I didn't know until clicking on the Flowing Data blog that Mark Hansen's installation at the New York Times building is up. Mark is a statistician at UCLA who I know from back when he was a grad student at Berkeley. (He didn't take my class, but he still did ok.) Here's the New York Times story on it.

The funny thing is, I was talking with some statisticians a few years ago about this project, back around when it was at the Whitney Museum and one guy said, "That doesn't seem like art to me." (But nobody said, "My kid could do that.")

I don't know if it's art or not, but it's pretty cool. I'd like it even more if it had a bit more statistical content, for example dynamic histograms or scatterplots of word frequencies, sentence lengths, etc etc.

Somebody (I can't remember who) pointed me to this paper by Claudio Castellano and Santo Fortunato. Here's the abstract:

A most debated topic of the last years is whether simple statistical physics models can explain collective features of social dynamics. A necessary step in this line of endeavor is to find regularities in data referring to large-scale social phenomena, such as scaling and universality. We show that, in proportional elections, the distribution of the number of votes received by candidates is a universal scaling function, identical in different countries and years. This finding reveals the existence in the voting process of a general microscopic dynamics that does not depend on the historical, political, and/or economical context where voters operate. A simple dynamical model for the behavior of voters, similar to a branching process, reproduces the universal distribution.

Hadley Wickham writes to Jouni and me:

I have just been reading your Stat Computing paper on your rv package. It's a nice paper, but you seem to be unaware of the very related ideas coming out of the functional programming community - google for "probability monads" to get started. The notation and target uses are quite different, but the ideas are interesting, and there are some hints on how to compile statements to be more computationally efficient.

I also wonder if you have thought about applying these techniques to classical statistical procedures.

Gait analysis and evo psych

Dave Garbutt writes,

Perhaps you will find this interesting. It is a deconstruction rather than the original, but how to analyse the data might be an interesting challenge worthy of your readers....

I don't have anything to add about this, except that there's a long tradition of overinterpreting data on menstrual cycles. I remember an example from an old and highly recommended statistics textbook (Say it with Figures, by Hans Zeisel) that I used in a class once: he had an interesting example along with a graph and story, but when I took a look at the original article being cited, I couldn't find anything like what was being claimed in the textbook. (This case is a little different, because here it's the scientific article itself that's being called into question, but it still reminded me.)

Josh Menke writes,

I saw that you had commented on adjusted plus/minus statistics for basketball in a few of your blog entries [see also here]. I've been working on a Bayesian version of the model used by Dan Rosenbaum, and wondered if I could ask you a question.

I wanted to be able to update the posterior after each sequence of game play between substitutions, so I decided to use the standard exact inference update for a normal-normal Bayesian linear regression model. If you're familiar with Chris Bishop's recent book, Pattern Recognition and Machine Learning, the updating equations for this are 3.50 and 3.51 on page 153. I felt OK with using a normal prior based on some past research I did in multiplayer game match-making with Shane Reese at BYU. The tricky part comes with using exact inference for updating the posterior. The updating method is very sensitive to the prior covariance matrix. I start with a diagonal covariance matrix, and if the initial player variances I choose are too high, the +/- estimates can go to infinity after several updates. I thought this was related to the data sparsity causing an ill-conditioned update matrix, but I thought I'd ask in case you'd had any experience with this type of problem.

Have you dealt with an issue like this before? If I set the prior variances low enough, I get reasonable results, and the ordering of the final ranking is fairly robust to changes in the prior. It's just the estimation process itself that doesn't "feel" as robust as I'd prefer, so I don't know that I trust the adjusted values (final coefficients) to be meaningful.

I don't think I can use MCMC in this situation either because trying to get 100,000 samples using 38,000+ data points and 400+ parameters feels intractable to me. I could be wrong there as well since I suppose I only need to include the current players in each match-up within the log likelihood. But it would still take quite a bit of time.

It would also be nice to go with the sequential updating version if possible since I could provide adjusted +/- values instantly after each game, if not after each match-up.

My reply:

1. I'd try the scaled inverse Wishart prior distribution as described in my book with Hill. This allows the correlations to be estimated from data in a way that still allows you to provide a reasonable amount of information about the scale parameters.

2. I'd go with the estimation procedure that gives reasonable estimates, then do some posterior predictive checks, as described in chapter 6 of Bayesian Data Analysis. (Sorry for always referencing myself; it's just the most accessible reference for me!) This should give you some sense of the aspects of the data that are not captured well by the model.

3. Finally, you can simulate some fake data from your model and check that your inferential procedure gives reasonable estimates. Cook, Rubin, and I discussed a formal way of doing this, but you can probably do it informally and still build some confidence in your method.
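To illustrate point 3, here's a minimal sketch of the sequential normal-normal update Josh describes, plus a fake-data check (this is the textbook conjugate update with made-up dimensions and numbers, not his actual basketball model):

    # Conjugate update for Bayesian linear regression:
    # prior beta ~ N(m0, S0), likelihood y ~ N(X beta, sigma2 * I)
    update_posterior <- function(m0, S0, X, y, sigma2) {
      S0_inv <- solve(S0)
      Sn_inv <- S0_inv + crossprod(X) / sigma2        # posterior precision
      Sn <- solve(Sn_inv)
      mn <- Sn %*% (S0_inv %*% m0 + crossprod(X, y) / sigma2)
      list(mean = mn, cov = Sn)
    }

    # Fake-data check: simulate data with known coefficients and verify
    # that repeated updates recover them
    set.seed(1)
    p <- 5; n <- 200; sigma2 <- 1
    beta_true <- rnorm(p)
    post <- list(mean = rep(0, p), cov = diag(10, p))
    for (batch in split(1:n, rep(1:10, each = n / 10))) {
      X <- matrix(rnorm(length(batch) * p), ncol = p)
      y <- X %*% beta_true + rnorm(length(batch), sd = sqrt(sigma2))
      post <- update_posterior(post$mean, post$cov, X, y, sigma2)
    }
    round(cbind(true = beta_true, estimate = post$mean), 2)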

Happy conservatives and gloomy liberals

This post by Tyler Cowen recounts a debate in which he and another conservative argued that Americans are happy, versus two liberals who argued that Americans are not so happy, which makes me wonder how this happened. It reminds me of

Survey weighting is a mess

Dave Judkins writes, regarding my Struggles with Survey Weighting and Regression Modeling paper,

I am hoping you might be able to clarify a point in your approach. How does a variable like number of phone lines in the house get used in equation 5? (Given that N.pop and X.pop are not available.) Does your work in Section 3 apply only to X variables with known population distributions?

My reply:

Dan Schrage writes with a question about how to model group-level variation:

I [Dan] am trying to better understand the recommendation in your new book to always use random effects (pg. 246) in modeling. (I'm following your definition #5 here of fixed and random effects, as is standard in econometrics.) In econometrics, as I'm sure you know, the classical advice (dating from at least Mundlak (1978)) is this: If unobserved heterogeneity is correlated with regressors in your model, use fixed effects; otherwise, use random effects since they're more efficient. The idea is that random effects lumps this unobserved heterogeneity into the composite error term, and so if the unobserved heterogeneity is correlated with the regressors, then the regressors are correlated with the error term, and this is bad news: estimators will be biased and inconsistent. So I'm trying to understand why your advice is essentially not to worry about this issue.

Is this just an argument about the bias/variance tradeoff, or is there something deeper here? Perhaps all of this just speaks to one of the common gaps between statistics and econometrics--econometricians tend to care a lot about asymptotic properties like consistency, and statisticians seem to care much less (correct me if I'm off here--I'm still trying to get a good sense of the divide). Econometricians are almost never willing to use an inconsistent estimator, no matter what the gain in efficiency.
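For concreteness, the econometric contrast Dan describes looks like this in R (a stylized sketch with simulated data; "fixed effects" here means group indicator variables, and the multilevel model is fit with lme4):

    library(lme4)
    set.seed(1)
    d <- data.frame(group = factor(rep(1:20, each = 10)), x = rnorm(200))
    u <- rnorm(20)                     # unobserved group-level heterogeneity
    d$y <- 1 + 0.5 * d$x + u[d$group] + rnorm(200)

    # "Fixed effects": group indicators soak up the heterogeneity
    coef(lm(y ~ x + group, data = d))["x"]

    # "Random effects": partial pooling of the group intercepts
    fixef(lmer(y ~ x + (1 | group), data = d))["x"]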

My reply:

Being Overweight Isn't All Bad, Study Says

Dan Goldstein sent me this link, with the note, "possibly interesting to you / Seth."

What I'm wondering is, will Seth be happy because it shows how conventional medical research has failed, or will he be unhappy because it finds that losing weight is not such a great thing?

I started to post this item on posterior predictive checks and then realized I had already posted it several months ago! Memories (including my own) are short, though, so here it is again:

A researcher writes,

I have made use of the material in Ch. 6 of your Bayesian Data Analysis book to help select among candidate models for inference in risk analysis. In doing so, I have received some criticism from an anonymous reviewer that I don't quite understand, and was wondering if you have perhaps run into this criticism. Here's the setting. I have observable events occurring in time, and I need to choose between a homogeneous Poisson process and a nonhomogeneous Poisson process in which the rate is a function of time (e.g., a loglinear model for the rate, which I'll call lambda).

I could use DIC to select between a model with constant lambda and one where the log of lambda is a linear function of time. However, I decided to try to come up with an approach that would appeal to my frequentist friends, who are more familiar with a chi-square test against the null hypothesis of constant lambda. So, following your approach in Ch. 6, I had WinBUGS compute two posterior distributions. The first, which I call the observed chi-square, subtracts the posterior mean (mu[i] = lambda[i]*t[i]) from each observed value, squares this, and divides by the mean. I then add all of these values up, getting a distribution for the total. I then do the same thing, but with draws from the posterior predictive distribution of X. I call this the replicated chi-square statistic.

If my putative model has good predictive validity, it seems that the observed and replicated distributions should have substantial overlap. I called this overlap (calculated with the step function in WinBUGS) a "Bayesian p-value." The model with the larger p-value is a better fit, just like my frequentist friends are used to.

Now to the criticism. An anonymous reviewer suggests this approach is weakened by "using the observed data twice." Well, yes, I do use the observed data to estimate the posterior distribution of mu, and then I use it again to calculate a statistic. However, I don't see how this is a problem, in the sense that empirical Bayes is problematic to some because it uses the data first to estimate a prior distribution, then again to update that prior. I am also not interested in "degrees of freedom" in the usual sense associated with MLEs either.

I am tempted to just write this off as a confused reviewer, but I am not an expert in this area, so I thought I would see if I am missing something. I appreciate any light you can shed on this problem.
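For what it's worth, the procedure described above can be sketched directly in R (a minimal version with a conjugate gamma posterior standing in for the WinBUGS fit; the data and prior are made up):

    # Observed event counts x over exposure times (made-up data)
    set.seed(1)
    x <- c(4, 7, 3, 6, 9, 5)
    expos <- c(10, 12, 8, 11, 14, 9)

    # Posterior draws of lambda under a diffuse gamma prior (conjugate)
    n_sims <- 10000
    lambda <- rgamma(n_sims, 0.5 + sum(x), 0.0001 + sum(expos))

    chisq <- function(y, mu) sum((y - mu)^2 / mu)
    disc_obs <- disc_rep <- numeric(n_sims)
    for (s in 1:n_sims) {
      mu <- lambda[s] * expos
      x_rep <- rpois(length(expos), mu)   # replicated data
      disc_obs[s] <- chisq(x, mu)
      disc_rep[s] <- chisq(x_rep, mu)
    }

    # Posterior predictive p-value: values near 0 or 1 indicate misfit
    mean(disc_rep >= disc_obs)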

My thoughts:

To our loyal readers . . .

Sorry for all the red-state, blue-state stuff. We'll be giving you more statistics-as-usual and miscellaneous social science soon . . . In the meantime, you can read these papers:

Using redundant parameterizations to fit hierarchical models (with Zaiying Huang, David van Dyk, and John Boscardin; to appear in JCGS)

Weight loss, self-experimentation, and web trials: a conversation (with Seth Roberts; to appear in Chance)

Manipulating and summarizing posterior simulations using random variable objects (with Jouni Kerman; to appear in Statistics and Computing)

Do the Democrats represent the rich?

Boris points to this post by Megan McArdle which discusses some of the political implications of the Democrats doing best among lower-income voters but winning the states and congressional districts where more of the higher-income Americans live. As McArdle puts it,

[Michael Franc] is actually making a good point: the constituency of the Democrats will force many of them to support the interests of the rich, even where they might ideologically prefer to oppose, because doing so is good for their district. Voters, especially poor voters, are highly influenced by local economic conditions. It is thus in Chuck Schumer's strong political interest to keep the financial services industry happy, whether or not they vote for him. Ditto Nancy Pelosi and Silicon Valley.

This does seem like a real tension: I'd only add that it's not just what the Democrats ideologically prefer, but also what their voters want. I haven't looked at the data from Nancy Pelosi's district, or for that matter at votes for Chuck Schumer, but where we have looked at voting (thanks to Henry Farrell for linking to our paper on this), the Democrats do better among the poor and the Republicans do better among the rich. It's not a perfect correlation by any means, but to the extent that Nancy and Chuck are listening to their party's supporters, they'd be listening to teachers, nurses, students, and, for that matter, unemployed people, not hedge fund managers.

McArdle writes,

Democrats indisputably represent more rich voters than Republicans; their constituency is the people in their district, not the people in their district who voted for them.

I'd respond to this with a Yes and No. In terms of the Constitution, I agree; for that matter, congressmembers also represent nonvoters and people such as children and noncitizens who are ineligible to vote, just as back in 1789 they were said to represent women, non-property-owners, and 3/5 of the slaves. On the other hand, the two parties are different, and voters generally have enough information about candidates to vote for the one who is closer to their preferences. So in that sense, congressmembers do actually represent the people who vote for them. After all, Democrats are much more liberal than Republicans in otherwise-comparable districts. (There's lots of evidence on this; see, for example, the graph on page 213 of our recent book.)

Another way to look at this is to flip it around and consider the Republicans, who represent richer voters but poorer states. A simple geographically based analysis would suggest that the Republicans would be trying to raise taxes on the rich and raise benefits for the poor. But they're not. Arguably the Republicans' pro-business, low-tax policies are ultimately what's best for the poor (and also the rich), but they certainly don't seem like the kind of populist notions that would make people in poorer districts happy.

Conflicts

The sweet spot for either party is spending that benefits its favored voters: rich people in poor states if you're a Republican, or poor people in rich states if you're a Democrat. This could be military contracts in the South (if you're a Republican) or mass transit in the Northeast (if you're a Democrat).

Now consider slopes. In the "red states," the slope is steep so there's not much of a motivation for R's to support measures that help poor people in poor states (if there is such a program). But in the "blue states," the slope is flat enough that there is a motivation for Chuck Schumer et al. to want to help out hedge fund managers (as well as to support programs such as Amtrak that benefit upper-middle-class types in the Northeast).

To put it another way, recall that the "red state, blue state" divide occurs among the rich, not among the poor. So you might expect the parties to have pretty consistent national policies on targeted benefits to the poor, but to be much more localized when considering benefits to the rich, with the Democrats favoring the financial and high-tech industries, and Republicans favoring agribusiness and small-business owners, for example. It would be interesting to see this studied more systematically.

The man-bites-dog factor

Another aspect of this is the idea that the Democrats are expected to be fighting the rich, and it's a surprise for things to go the other way. If it's the Republicans supporting the rich, this would not be news. To put it another way, Michael Franc's article is titled, "Democrats wake up to being the party of the rich," but on his terms (unwillingness to impose taxes on "mega-millionaires"), I assume the Republicans would be "the party of the rich" also.

Similarly, Fred Siegel writes that the Democrats are the party of the rich and that this is a trend since 1972. This may be the case but it doesn't show up in the polling data. Consider this graph (from this paper by David Park and myself) showing the difference in proportion of Republican vote, comparing voters in the upper third of income to voters in the lower third:

voting.png

The Republicans have been doing consistently best at the higher end of the income scale. Again, I'm not disagreeing with Franc and Siegel about the geographic and fundraising bases for the Democrats' support; I'm just pointing out that with data such as shown above, it's no surprise that the Republicans, too, could be called the party of the rich.

Based on survey data, most voters tend to place themselves to the right of the Democrats on economic and on social issues, and most voters tend to place themselves to the left of the Republicans in both dimensions; see, for example, this graph based on National Election Study data from 2004. Each dot represents where a survey respondent places him or herself on economic and social issues: positive numbers are conservative and negative numbers are liberal, and "B" and "K" represent the voters' average placements of Bush and Kerry on these scales:

views.png

Despite all the push by rich funders on both parties, the Hollywood parties and the oil money, this is where things stand. Larry Bartels has a story of why this is, but in any case, it doesn't quite fit into a simple story of the Democrats being the party of the rich. At least not yet. They have to go a ways before they catch up to the Republicans on that one.

With only a year to the next election, and with the publicity starting up already, now is a good time to ask, is it rational for you to vote? And, by extension, is it worth your while to pay attention to what Hillary, Rudy, and all the others will be saying for the next year or so? With a chance of casting a decisive vote that is comparable to the chance of winning the lottery, what is the gain from being a good citizen and casting your vote?

The short answer is, quite a lot. First the bad news. With 100 million voters, your chance that your vote will be decisive--even if the national election is predicted to be reasonably close--is, at best, 1 in a million in a battleground state such as Ohio, and 1 in 10 million or less in a less closely fought state such as New York. (The calculation is based on the chance that your state's vote will be exactly tied, along with the chance that your state's electoral vote is necessary for one candidate or the other to win the Electoral College. Both these conditions are necessary for your vote to be decisive.) So voting doesn't seem like such a good investment.

But here's the good news. If your vote is decisive, it will make a difference for 300 million people. If you think your preferred candidate could bring the equivalent of a $50 improvement in the quality of life to the average American--not an implausible hope, given the size of the Federal budget and the impact of decisions in foreign policy, health, the courts, and other areas--you're now buying a $15 billion lottery ticket. With this payoff, a 1 in 10 million chance of being decisive isn't bad odds.
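The arithmetic, spelled out in R (treating the post's numbers as rough assumptions):

    # Back-of-envelope expected social return from voting
    p_decisive <- 1e-7        # ~1 in 10 million (non-battleground state)
    benefit    <- 50          # assumed $ improvement per American
    n_people   <- 3e8
    p_decisive * benefit * n_people   # about $1,500 in expectation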

And many people do see it that way. Surveys show that voters choose based on who they think will do better for the country as a whole, rather than their personal betterment. Indeed, when it comes to voting, it is irrational to be selfish, but if you care how others are affected, it's a smart calculation to cast your ballot, because the returns to voting are so high for everyone if you are decisive. Voting and vote choice (including related actions such as the decision to gather information in order to make an informed vote) are rational in large elections only to the extent that voters are not selfish.

That's also the reason for contributing money to a candidate: Large contributions, or contributions to local elections, could conceivably be justified as providing access or the opportunity to directly influence policy. But small-dollar contributions to national elections, like voting, can be better motivated by the possibility of large social benefit than by any direct benefit to you. Such civically motivated behavior is consistent with both small and large anonymous contributions to charity.

The social benefit from voting also explains the declining response rates in opinion polls. In the 1950s, when mass opinion polling was rare, we would argue that it was more rational to respond to a survey than to vote in an election: for example, as one of 1000 respondents to a Gallup poll, there was a real chance that your response could noticeably affect the poll numbers (for example, changing a poll result from 49% to 50%). Nowadays, polls are so common that a telephone poll was done recently to estimate how often individuals are surveyed (the answer was about once per year). It is thus unlikely that a response to a single survey will have much impact.

So, yes, Virginia--and Ohio, and Florida, and Pennsylvania, and New Jersey--it is rational to vote. Utah, Wyoming, and Massachusetts: maybe it's not worth your time. On the other hand, there's a chance you could swing the national popular vote (which can affect the perception of a mandate) and in any case you're likely to have close local races that can ultimately affect policies from schools to taxes to crime and punishment, so if you have any preferences there, it might very well be worth your time to cast your ballot and have a small chance of making a big difference.

Here's our research article in the journal Rationality and Society spelling out the reasoning and evidence in more detail.

P.S. No, I didn't vote today. I don't think any of the local elections were close. I doubt there was serious opposition to any of the candidates.

P.P.S. I was motivated to post this by this Freakonomics article, which brings up the old argument that it's irrational to vote because the probability of your vote being decisive is so low. Stephen Dubner writes, "The irony is that the typical voter is more likely to have an impact in a smaller election than in a larger one, but it’s the bigger elections that draw far more voters." Actually, it's not such an irony at all: bigger elections have bigger effects, thus more motivation to vote. It all becomes clear when you realize that most people vote because of what they think will be good for the country, not for their own personal benefit. It's fine for individual people to disagree with this--maybe Dubner and Levitt vote for other reasons--but it's an unfortunate blind spot for them to identify rationality with selfishness.

An updated version of my (Boris Shor, Harris School, University of Chicago) paper with Nolan McCarty (Princeton) and Christopher Berry (Harris School, University of Chicago) on state legislative ideology is available here. A prior version of the paper was featured in an April 2007 post on this blog.

Basically, the idea is to try to understand the ideology of state legislators along some common scale. We have good estimation techniques (like NOMINATE and item response models) that exploit roll call votes to line people up along ideological dimensions.

There have been some attempts to apply this at the state level, and researchers have come up with ideological scores for legislators within individual states. But a major problem remains: how can we make sure scores are comparable across states? We know that legislative agendas differ. We also know that the meaning of Democrat and Republican differs across the states. Illinois Republicans probably aren't the same thing as Florida Republicans. But while we may suspect that Democrats and Republicans in the various states are more or less liberal or conservative with respect to each other, we have never been able to find out if this was actually true. Nor have we been able to tell if the divisions within states are comparable in size across states.

We address this well-known problem with a trick: it turns out that quite a few former state legislators have gone on to serve in Congress. So we use the well-understood congressional ideology of these people to rescale the ideology of all state legislators who didn't go on to Congress. So, in essence, we have rescaled all the ideological placements of thousands of state legislators over a decade to be on the "Congressional scale." We use this approach for California, Colorado, Florida, Illinois, Michigan, New York, Ohio, Pennsylvania, and Texas. We start with these large states because their congressional delegations are larger, which means there is more chance for former state legislators to become our institutional "bridges."
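As a stylized illustration of the bridging idea (this is not the paper's actual estimation procedure, which is considerably more careful; all numbers here are invented):

    # Legislators who served in both a state legislature and Congress
    bridges <- data.frame(
      state_score = c(-1.2, -0.4, 0.3, 0.9),  # scores from state roll calls
      cong_score  = c(-0.9, -0.2, 0.4, 1.1)   # same people, congressional scale
    )
    map <- lm(cong_score ~ state_score, data = bridges)

    # Project every state legislator (bridge or not) onto the
    # congressional scale
    all_legislators <- data.frame(state_score = rnorm(100))
    rescaled <- predict(map, newdata = all_legislators)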

Therefore, we can now make valid comparisons of various measures of state legislative ideology across states. Read the paper for the full details, but here's a little preview of the result.

Below is a boxplot of the two parties in each state, but pooled over a decade and across upper and lower chambers. So this gives you a basic flavor of the ideological orientation of these states, but doesn't show you all the juicy details of what is going on in each chamber over time (see the paper).

Scores are on the vertical axis. Positive numbers indicate conservatism and negative numbers liberalism, and 0 is "moderate." The dark lines are the party medians.

What is interesting is how different the states are from each other (and from Congress). California is highly polarized (the parties are far apart and there's no overlap), while New York and Pennsylvania are far less so. State parties often move in tandem in ideological directions: Illinois Democrats and Republicans are more liberal than, say, Michigan Democrats and Republicans.

boxplot_parties_mcmc.png

This paper is still in development, so we welcome your comments here or by email.

PS: Corrections made in the text of the post, thanks to Lee Sigelman (GW).

John Carlin had some comments on my paper with Weakliem:

My immediate reaction is that we won't get people away from these mistakes as long as we talk in terms of "statistical significance" and even power, since these concepts are just too subtle for most people to understand, and they distract from the real issues. Somewhat influenced by others, I spend quite a bit of time eradicating the term "statistical significance" from colleagues' papers. I suspect that as long as the world sees statistical analysis as dividing "findings" into positives and negatives then the nonsense will keep flowing, so an important step in dealing with this is to change the terminology. In your example you seem to be arguing too much on his ground by focussing on the fact that although he data-dredged a significant p-value, your p-value is not significant. (So the ignorant editor or reader may see it as technical squabbling between statisticians rather than being forced to deal with the real issues about precision of estimation or lack of information.)

I agree entirely that the problem is with the framework of effects as true/false, but this is the very framework that "statistical significance" is built around and your article makes that concept very central by continually referring to "what if the effect is not statistically significant?" etc. I think the focus should be on how dangerous it is to overinterpret small studies with vast imprecision, and I'm not sure why this can't be clarified by sticking to the precision (or information) concept. I still haven't looked again at your Type S and Type M but on the face of it wonder if they may just confuse by adding more layers. Statistical significance gets it wrong because it focuses on null hypotheses (usually artificial), but when you say Type S it almost sounds similar in that you are thinking of truth/falsity with respect to the sign, rather than uncertainty about effects...?

My big point in considering Type S errors is to move beyond the idea of hypotheses being true or false (that is, to move beyond the idea of comparisons being exactly zero), but John has a point, that I still have to decide how to think about statistical significance. The problem is that, from the Bayesian perspective, you can simply ignore statistical significance entirely and just make posterior statements like Pr (theta_1 > theta_2 | data) = 0.8 or whatever, but such statements seem silly given that you can easily get impressive-seeming probabilities like 80% by chance.
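To see how easily such probabilities arise, here's a two-line simulation (made-up posterior draws for two noisily estimated effects that are truly identical):

    # Two effects, both truly zero, each estimated with much noise
    set.seed(1)
    theta1_hat <- rnorm(1, 0, 0.1); theta2_hat <- rnorm(1, 0, 0.1)
    theta1 <- rnorm(4000, theta1_hat, 0.1)   # posterior draws
    theta2 <- rnorm(4000, theta2_hat, 0.1)
    mean(theta1 > theta2)   # often far from 1/2, despite no true difference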

Test variables that don't depend on data

Ying Yuan writes,

Taleb and military officers

A reader writes, regarding my review of The Black Swan,

. . . and your password is . . .

I received an email from a journal asking me to review a paper. Near the bottom of the email, it says,

Your User Name is xxx and your password: xxx.

I x-ed these out for obvious reasons--I don't want my passwords spread over the net. I can't believe that an online system would send passwords in plaintext over email, conveniently flagged with the word "password"!

Counting churchgoers

When studying religious attendance and voting, it's worth remembering this measurement issue.

I can't say I have much of an explanation for this, but it's interesting:

religion_small.png

Church attendance is a strong predictor of how high-income people vote, not such a good predictor for low-income voters.

There's lots of talk about religion and income and voting, but people don't always know that interactions are important.

Here are some time trends (from this paper with David Park). The graphs below show the difference in Republican vote between rich and poor, religious and non-religious, and their interaction (that is, the difference in differences), computed separately for each presidential election year:

interactiontimetrends_small.png
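The computation behind this kind of graph is simple; here's a sketch on simulated poll data (all variable names and numbers are invented):

    # Difference in differences by election year, on simulated data
    set.seed(1)
    polls <- data.frame(
      year       = rep(seq(1972, 2004, by = 4), each = 500),
      rich       = rbinom(4500, 1, 0.33),
      churchgoer = rbinom(4500, 1, 0.40)
    )
    polls$rep_vote <- rbinom(4500, 1,
      plogis(-0.2 + 0.4 * polls$rich + 0.3 * polls$churchgoer +
             0.4 * polls$rich * polls$churchgoer * (polls$year >= 1992)))

    did <- sapply(split(polls, polls$year), function(d) {
      m <- with(d, tapply(rep_vote, list(rich, churchgoer), mean))
      (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])
    })
    plot(as.numeric(names(did)), did, type = "b",
         xlab = "election year", ylab = "difference in differences")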

As others have noted (although not, as far as I know, looking at interactions), it all started in 1992. We heard a lot about the Moral Majority back in 1980, but it doesn't seem to have started showing up in voting patterns until Clinton.

You can read more about interactions in the linked article. The key points are that (a) higher-income voters support the Republicans and have done so for a while; (b) more recently, churchgoers have supported the Republicans; and (c) the difference between churchgoers and non-churchgoers is much greater for the rich than for the poor.

P.P.S. I posted the top graph several months ago but the recent interest in these religiosity/income graphs and these state voting maps motivated me to repost.

Religiosity and income in the U.S.

David noticed this article by Dan Mitchell reporting the well-known fact that people in richer countries tend to be less religious. What about states in the U.S.? We (that is, David Park, Joe Bafumi, Boris Shor, and I) look at it two ways.

First, here's a scatterplot of the 50 states, plotting average religious attendance vs. average income. (Religious attendance is on a -2 to 2 scale, from "never" to "more than once a week," and average income was originally in dollars but has been rescaled to be centered at zero):


st.rel.inc0004_small.png

States that voted for Bush in 2004 are in red and the Kerry-supporting states are blue. You can see that people in richer states tend to be less religious, although the relation is far from a straight line. There is also some regional variation (more religious attendance in the south, less in the northeast and west).

Second, here's a plot showing the correlation of religious attendance and individual income within each state. We get a separate correlation for each state, and so we can plot these. Here we plot the correlations vs. state income, using the same color scheme:


corr.st.rel.inc0004_small.png

Again, there's quite a bit of variation from state to state, but overall we see a positive correlation between income and religiosity in poor states and a negative correlation in rich states: To put it another way, in Mississippi, the richer people attend church more. In Connecticut, the richer people attend church less.
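The second calculation is easy to replicate in outline; here's a sketch with simulated data standing in for the survey (the real analysis is more involved):

    # Within-state correlation of income and religious attendance,
    # plotted against average state income (simulated data)
    set.seed(1)
    d <- data.frame(
      state  = rep(state.name, each = 100),
      income = rnorm(5000),
      attend = rnorm(5000)
    )
    within_corr <- sapply(split(d, d$state),
                          function(s) cor(s$income, s$attend))
    state_income <- tapply(d$income, d$state, mean)
    plot(state_income, within_corr[names(state_income)],
         xlab = "average state income (rescaled)",
         ylab = "correlation of income and attendance within state")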

(See also here for more on income and voting by state, and here for more on income, voting, and church attendance.)

P.S. Typos fixed (thanks to commenters Derek and Sandapanda).

P.P.S. Colors of Iowa and New Mexico fixed (thanks to commenter David).

"Like upscale areas everywhere, from Silicon Valley to Chicago's North Shore to suburban Connecticut, Montgomery County [Maryland] supported the Democratic ticket in last year's presidential election, by a margin of 63 percent to 34 percent." -- David Brooks, 2001.

Some of the discussions of our red-state, blue-state maps picked up on the differences between where national journalists live (the mid-Atlantic states and California) and other parts of America. The income-voting pattern is different in the red states and the blue states. We have some more thoughts on this (scroll down for the pretty graphs).

David Brooks, the New York Times op-ed columnist and author of Bobos in Paradise and On Paradise Drive, explored the differences between Red and Blue America in an influential article, "One Nation, Slightly Divisible," in the Atlantic Monthly shortly after the 2000 election. Sometimes described as the liberals' favorite conservative, Brooks embodies the red-blue division within himself. He has liberal leanings on social issues but understands the enduring appeal of traditional values--"today's young people seem happy with the frankness of the left and the wholesomeness of the right"--and his economic views are conservative, but he sees the need for social cohesion among rich and poor. His Atlantic article compared Montgomery County, Maryland, the liberal, upper-middle-class suburb where he and his friends live, to rural, conservative Franklin County, Pennsylvania, a short drive away but distant in attitudes and values, with "no Starbucks, no Pottery Barn, no Borders or Barnes & Noble," plenty of churches but not so many Thai restaurants, "a lot fewer sun-dried-tomato concoctions on restaurant menus and a lot more meatloaf platters."

Brooks lives in a liberal, well-off area. It is characteristic of the east and west coasts that the richer areas tend to be more liberal, but in other parts of the country, notably the south, the correlation goes the other way. A comparable journey in Texas would go from Collin County, a suburb of Dallas where George W. Bush received 71% of the vote, to rural Zavala County in the southwest, where Bush received only 25%.

scatterplot_texas.png

The graph above shows the pattern: Collin and Zavala (the dark circles on the scatterplot) are the richest and poorest counties in Texas, and there is a clear pattern that poor counties supported the Democrats while the Republicans won in middle-class and rich counties.

When we showed this to a political scientist, he asked about the state capital, noted for its liberal attitudes, vibrant alternative rock scene, and the University of Texas: "What about Austin? It must be rich and liberal.'' We looked it up. Austin is in Travis County and makes up almost all of its population. Travis County has a median household income of $45,000 and gave George W. Bush 53% of the vote, putting it about midway between Collin and Zavala counties in the graph.

By comparison, the next graph shows the counties of Brooks's home state of Maryland: here there is no clear pattern of county income and Republican vote. We have indicated Montgomery County, the prototypical wealthy slice of Blue America, in bold, and it is not difficult to find poorer, more Republican-supporting counties nearby as comparisons. Rich and poor counties look different in Blue America than in Red America.

scatterplot_maryland.png

We can also look at income and voting for individual voters in each state. In Texas, there is a strong relation between income and voting:

texas.png

In Maryland, the pattern is much weaker:

maryland.png

And here, by popular demand, is the notorious Kansas:

kansas.png

P.S. Just to be clear, I think Brooks's observations about cultural differences between red and blue America are interesting and important; you just have to be careful when aligning these with income or wealth.

Mike Alvarez, Delia Bailey, and Jonathan Katz just completed this paper:

Since the passage of the “Help America Vote Act” in 2002, nearly half of the states have adopted a variety of new identification requirements for voter registration and participation by the 2006 general election. . . . In this paper we document the effect of voter identification requirements on registered voters as they were imposed in states in the 2000 and 2004 presidential elections, and in the 2002 and 2006 midterm elections. Looking first at trends in the aggregate data, we find no evidence that voter identification requirements reduce participation. Using individual-level data from the Current Population Survey across these elections, however, we find that the strictest forms of voter identification requirements — combination requirements of presenting an identification card and positively matching one’s signature with a signature either on file or on the identification card, as well as requirements to show picture identification — have a negative impact on the participation of registered voters relative to the weakest requirement, stating one’s name. . . .

It looks interesting to me. It's a hard problem to study because the number of states with changes is not large. Also, I'm still a little baffled by Figures 5 and 6, where it says that Pr(voting) > 90% for "an average registered voter". Turnout isn't really that high! I thought it was closer to 70%? Perhaps something was set to zero rather than the middle of the distribution when computing the average? It shouldn't make a difference for the comparison but it would be good to get that intercept settled to a reasonable value. One thing that would also help would be to plot the raw data on top of some of these graphs to make sure the model is doing what it's supposed to be doing.

I was also suspicious that in Figure 6, the confidence intervals have the same width for nonwhite as for white respondents. There are fewer nonwhites, so I'd think their confidence intervals should be wider. But since the model has no interactions, I guess it makes sense for the confidence intervals to have the same width here. Also, I'd take Figures 7-9 and make them one figure with 8 columns and 3 rows; this would allow easier comparisons.

A statistician does web analytics

I sometimes play with Google Analytics to see the number of daily visitors on our blog and where they are coming from. The charts of daily visits look a bit like this:

googanal.png

Clearly, there is an upwards trend, but the influence of the day of the week messes everything up. I exported the data into a text file, and typed a line into R:


# decompose daily visits into weekly-seasonal, trend, and remainder components
plot(stl(ts(read.table("visitors"), frequency = 7), s.window = "periodic"))


decompose.png

The trend component shows what I am really interested in: the trough of summer, followed by a relatively consistent rising trend. Every now and then another site will refer to our blog, temporarily increasing the traffic, and Andrew's cool voting plots are responsible for the latest spike.

Setting the stl function's t.window parameter to 14, 21 or more will smooth the trend a bit more. The model is imperfect because new visitors do come in bursts, but leave more slowly. Perhaps we should do a better Bayesian model for time series decomposition, unless someone else has already done this.
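For example, a smoother version might look like this (a sketch assuming the same one-column 'visitors' file as above; scan() keeps the series univariate, and an odd span such as 21 is the safe choice for t.window):

visits <- ts(scan("visitors"), frequency = 7)
fit <- stl(visits, s.window = "periodic", t.window = 21)   # wider loess window for the trend
plot(fit)   # compare the trend panel with the default-window plot above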

The theoretical statistician uses x, the applied statistician uses y (because we reserve x for predictors).

This is a long one, but it's good stuff (if you like this sort of thing). Dana Kelly writes,

We've corresponded on this issue in the past, and I mentioned that I had been taken to task by a referee who claimed that model checks that rely on the posterior predictive distribution are invalid because they use the data twice.

Skepticism about empirical studies?

Nick Firoozye writes,

I [Firoozye] wanted to point your attention to the following podcast by Ian Ayres on Supercrunchers, where he shows himself an enthusiastic (if perhaps a bit naïve) proponent of the statistical method. Entertaining, definitely. One thing though that I thought you might be interested in is Russ Roberts' (the interviewer's) own skepticism over the econometric method, which I think probably warrants a response. It may be that Roberts' own view is due to his now-Austrian economics slant (i.e., somewhat anti-formalist approach) or perhaps to the fact that mainstream econometrics is a frequentist pursuit and one might question the honesty of the results as a consequence.

I don't really have much to add here, except that the problem noted by Roberts (it's hard to know whether to believe a statistical study) is even more of a problem with non-statistical empirical studies (i.e., anecdotes). I think Roberts might be overstating the problem because he is focusing on issues where he already had a strong personal opinion even before seeing data analyses. (He mentions the examples of concealed handguns and anti-theft devices on cars.) But there are a lot of areas where we have only weak opinions which can indeed be swayed by data (see here for some examples). These cases are important in their own right and also can serve as benchmarks for the success of statistical analysis, so that we can trust good analyses more when they're applied to tougher problems. This is one way that applied statistics proceeds, by exemplary analyses of problems that might not be hugely important on their own terms but serve as useful templates. Consider, for example, the book by Snedecor and Cochran: it's full of examples on agricultural field trials. Sure, these are important, but these methods have been useful in so many other fields. This is a great example, actually: Snedecor and his colleagues worked on agricultural trials because they cared about the results--these were not "toy examples" or thought experiments--and the resulting methods endured.

Paul Krugman writes,

The news media seem determined to destroy the republic:
In all, 63% of the campaign stories focused on political and tactical aspects of the campaign. That is nearly four times the number of stories about the personal backgrounds of the candidates (17%) or the candidates’ ideas and policy proposals (15%). And just 1% of stories examined the candidates’ records or past public performance, the study found.

And:

The press’ focus on fundraising, tactics and polling is even more evident if one looks at how stories were framed rather than the topic of the story. Just 12% of stories examined were presented in a way that explained how citizens might be affected by the election, while nearly nine-out-of-ten stories (86%) focused on matters that largely impacted only the parties and the candidates.

This has always bothered me too. One reason Gary and I did our research on why American presidential election campaign polls are so variable when votes are so predictable was that we wanted to convince the news media to do more substantive stories and less polling. Our point was that general elections for president are generally determined by fundamental variables, not short-term news or bandwagon effects--things are different for primary elections, which have multiple candidates and are inherently unstable--and so this horse-race coverage was a waste of time.

Why, then?

Nonetheless, horse-race coverage persists. I don't know whether it's worse than before--the site linked to by Krugman does not have comparative time series data--but it's still there. I'd also include the ridiculously frequent polling as an example of this problem. Anyway, why is it still happening?

My theory, at least for the general election, is that most of the voters have already decided who they're going to vote for--and even the ones who haven't decided are often more predictable than they realize. Suppose, for example, that 40% have pretty much already decided they'll vote for the Democrat, 40% will vote for the Republican, and the fight is over the remaining 20%--most of whom do not follow politics closely in any case. Now think of the audience for political news. 80% of the people don't need to know the candidates' positions--they've already decided their votes--but they're intensely interested in the horse race: are "we" going to win or lose? The substantive coverage that Krugman and I might want is really just for 20% of the audience. So, from that perspective, it makes sense for the media to give people the horse race. (Yes, survey respondents say they want more about the candidates' positions on the issues and less on which candidate is leading in the polls--but I don't know that I believe people when they say this.)

That said, when talking about the primary elections, yeah, I think it would make sense for the media to report more on where the candidates stand on issues.

Finally, one thing that does surprise me is that they don't run more stories on the sources of the candidates' money. Y'know, this sort of thing. It's fun and also could be informative about the candidates.

P.S. The histogram here should be a horizontal dotplot. Trying to read a colored plot with a key on the side--that's bad news.

Seth posts this account by a college student who went back to her high school to give a guest lecture on depression to "Mr. Tinloy’s 3rd period psychology class." Her feelings in preparing and delivering her lecture were pretty similar to my own feelings before doing this sort of thing, and I've been doing it for over 20 years!

The college student's presentation seemed to go well--the students were polite, got involved in discussion a bit, and clapped at the end, and the teacher was helpful in keeping things focused--but when she talked with some friends afterward, one said "she was fighting to stay awake, because the topic did not interest her one bit," and another said that "it was boring because she wasn’t all that interested in what I was talking about, but it got more interesting toward the end when other students started to talk. `Nobody likes guest speakers, so it’s okay.'"

I have a few thoughts:

1. I suspect the student's presentation to the high school kids would've gone even better if she'd had them working in pairs to discuss the material. When students are working in pairs, they seem less likely to drift off, and with two students there is more of a chance that one of them is interested in the topic.

2. It's interesting but perhaps not so surprising that depression is not an interesting topic to the high school students. Maybe they'd be more interested if it were framed in terms of being happy or sad, or good moods and bad moods? Even those of us who feel far from "depression" get sad or demoralized on occasion.

3. My own lectures to outside audiences seem to go well (in that people say nice things to me afterwards about the presentations) but I usually have difficulty getting people actively involved. It often seems that my talks don't have "hooks" to grab the audience and motivate them to ask questions and think hard. They more often sit there passively, enjoying it (I hope) but not actively engaged. Maybe I should have them work in pairs. I do this for college students and even grad students--it always surprises them, but they like it--but I've rarely had the nerve to try it with nonstudents.

4. In the continuing theme of not practicing what we preach, I should point out that my comments above (including the title of this blog entry) are not based on any systematic research, just on my informal observations of what seems to have worked and not worked for me in the past. (Although it does seem consistent with the literature on active learning, as I've absorbed it by reading a few books on the topic.) What I'm missing is (a) careful experimentation (assigning treatments--different teaching methods--unconfounded with important variables such as characteristics of the class), and (b) outcome measures such as surveys of student satisfaction and performance on standardized tests.

More precision on income and voting?

Jeff writes,

I was pointed to Distinguishing Association from Causation: A Background for Journalists (there is also a PDF version). Here is my summary of their executive summary:


  • Scientific studies that show an association between a factor and a health effect do not necessarily imply that the factor causes the health effect.

  • Randomized trials are studies in which human volunteers are randomly assigned to receive either the agent being studied or an inactive placebo, usually under double-blind conditions.

  • The findings of animal experiments may not be directly applicable to the human situation because of genetic, anatomic, and physiologic differences between species and/or because of the use of unrealistically high doses.

  • In vitro experiments are useful for defining and isolating biologic mechanisms but are not directly applicable to humans.

  • The findings from observational epidemiologic studies are directly applicable to humans, but the associations detected in such studies are not necessarily causal.

  • Useful, time-tested criteria for determining whether an association is causal include:
    • Temporality. For an association to be causal, the cause must precede the effect.

    • Strength. Scientists can be more confident in the causality of strong associations than weak ones.

    • Dose-response. Responses that increase in frequency as exposure increases are more convincingly supportive of causality than those that do not show this pattern.

    • Consistency. Relationships that are repeatedly observed by different investigators, in different places, circumstances, and times, are more likely to be causal.

    • Biological plausibility. Associations that are consistent with the scientific understanding of the biology of the disease or health effect under investigation are more likely to be causal.

  • Studies that include appropriate statistical analysis and that have been published in peer-reviewed journals carry greater weight than those that lack statistical analysis and/or have been announced in other ways.

  • Claims of causation should never be made lightly.

But all this isn't about causation vs association, it's about better studies or worse studies. Association and causation are not binary categories. Instead, there is a continuum from simple models on observational data (correlation between two variables), through more sophisticated models on observational data that include covariates (regression, structural equation models), through still more sophisticated models on observational data that take sample selection bias into consideration (Rubin's propensity score approach), to often simple models on controlled data (randomized experiments). But the mysterious causal "truth" is still out there. If you talk to philosophers these days, they're not even happy that the notion of causality is powerful enough as a model of reality.

In the past, I've often unfairly complained about studies after having read misleading journalistic reports, so this report is a timely one. But since the report was paid for by large pharma corporations, people may wonder if there is bias or some sort of an agenda in it.

My quick impression is that they're promoting the best practices in statistical methodology, practices that all these companies subscribe to. But there could be greater use of cheaper observational studies with better modeling (such as the propensity score approach, or even just better regression modeling) in place of expensive randomized experiments, and society might be better off as a result. Moreover, there is the issue of statistical versus practical significance. What do you think?

Anova

Cari Kaufman writes,

I am writing a paper on using Gaussian processes for Bayesian functional ANOVA, and I'd like to draw some connections to your 2005 Annals paper. In my own work I've chosen to use a 1-1 reparameterization of the cell means, that is, to constrain the levels within each factor. But I am intrigued by your use of exchangeable levels for all factors, and I'm hoping you can take a few minutes to help me clarify your motivation for this decision. Since not all parameters are estimable under the unconstrained model, don't you encounter problems with mixing when the sums of the levels trade off with the grand mean? It seems in many situations it's advantageous to have an orthogonal design matrix, especially when the observed levels correspond to all possible levels in the population. Do you have any thoughts on this you can share?

I should say I found the paper very useful, especially your graphical representation of the variance components. I also like your distinction between the superpopulation and finite population variances, which helped me clarify what happens when generalizing to functional responses. Basically, we can share information across the domain to estimate the superpopulation variances by having a stationary Gaussian process prior, but the finite population variances can differ over the domain, which gives some nice insight into where various sources of variability are important. (At the moment I'm working with climate modellers, who can really use maps of where various sources of variability show up in their output.)

My reply: I'm not quite sure what the question is, but I think you're pointing out the redundant parameterization issue, that if we specify all levels of a factor, and then have other crosscutting or nested factors (or even just a constant term), then the linear parameters are not all identifiable. I would deal with this issue by fitting the large, nonidentified model and then summarizing using the relevant finite-population summaries. We discuss this a bit in Sections 19.4-19.5 and Chapters 21-22 of our new book.

A couple notes on this:

1. Mixing of the Gibbs sampler can be slow on the original, redundant parameter space but fast on the transformed space, which is what we really care about. Also, things work better with proper priors. My new thing is weakly informative priors, which don't include all your prior information but act to regularize your inferences and keep the algorithms in a reasonable space where they can converge faster. The orthogonality that you want can come in this lower-dimensional summary (see the sketch after these notes).

2. The redundant-parameter model is identified, if only weakly, as long as we use proper prior distributions on the variance parameters. In Bayesian Data Analysis and in my 2005 Anova paper, I was using flat prior distributions on these "sigma" parameters. But since then I've moved to proper priors, or, in the Anova context, hierarchical priors. See this paper for more information, including an example in Section 6 of the hierarchical model for the variance parameters.
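To make point 1 concrete, here is a minimal sketch of summarizing on the transformed space, with hypothetical posterior draws rather than output from a real fit: the recentered group effects and the absorbed grand mean are the identified quantities, and they behave well even when mu and the alphas separately do not.

n.sims <- 1000; J <- 5
mu    <- rnorm(n.sims, 0, 10)     # poorly identified grand mean (fake draws)
alpha <- matrix(rnorm(n.sims*J, 2, 1), n.sims, J)   # redundant group effects
alpha.adj <- alpha - rowMeans(alpha)   # identified: centered effects
mu.adj    <- mu + rowMeans(alpha)      # identified: absorbed mean
apply(alpha.adj, 2, quantile, probs = c(.025, .5, .975))
quantile(mu.adj, probs = c(.025, .5, .975))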

More on differences in differences

Bob Erikson writes,

I was trolling the internet and came across your debate with Jens H. from Feb 15 07 on your blog about differences in differences.

You might find the attached document of interest. It is a once-influential, currently-obscure article from half a century ago on this topic. The language is not contemporary. But note Campbell's example of two ways to analyze the substantive problem, with two very different interpretations. Presumably Campbell is correct, using a difference-of-differences approach.

Bob's office is just across from mine in the political science department, but of course we communicate via blogs and emails. Anyway, I'll have to read the paper carefully. Also it will be interesting to see if they noticed that before-after correlations are higher for controls than for treated units.

Bayes pays again

In addition to this, Frederic Bois has another position available:

INERIS Research position in statistics/biostatistics applied to toxicology

Frederic Bois (winner of the Outstanding Statistical Application Award from the American Statistical Association, among other accomplishments) told me about the following job opportunities for modeling in toxicology, decision analysis, and risk assessment. Frederic is great, and so I assume the job is also. Here's the announcement:

Venn Diagram Challenge Summary 1.5

A few people have pointed us to some more of the Venn Diagram Challenge diagrams in response to the Venn Diagram Challenge Summary 1:

I'll take advantage of Paul Krugman's recent link to our paper on income and voting by putting up some cool scatterplots that we made recently. It started with our maps of which states Bush and Kerry would've won if only the votes of the poor, middle-income, and rich were counted:

Like Dave Krantz, I'm down on the decision-theoretic concept of "utility" because it doesn't really exist.

The utility function doesn't exist

You cannot, in general, measure utility directly, and attempts to derive it based on preferences (based on the von Neumann-Morgenstern theory) won't always work either because:

1. Actual preferences aren't necessarily coherent, meaning that there is no utility function that can produce all these preferences.

2. Preferences themselves don't in general exist until you ask people (or, to be even more rigorous, place them in a decision setting).

So, yeah, utility theory is cool, but I don't see utility as something that's Platonically "out there" in the sense that I can talk about Joe's utility function for money, or whatever.

Call it value, not utility

The above is commonplace (although perhaps not as well known as it should be). But my point here is something different, a point about terminology. I would prefer to follow the lead of some decision analysis books and switch from talking about "utility" to talking about "value." To the extent the utility function has any meaning, it's about preferences, or how you value things. I don't think it's about utility, or how useful things are. (Yes, I understand the idea of utility in social choice theory, where you're talking about what's useful to society in general, but even there I'd say you're really talking about what society values, or what you value for society.)

Just play around with the words for a minute. Instead of "my utility function for money" or "my utility for a washer and a dryer, compared to my utility for two washers or two dryers" (to take a standard example of a nonadditive utility function) or "my utility for a Picasso or for an SUV," try out "my value function for money" or "the value I assign to a washer and a dryer, compared to the value I assign to two washers or two dryers" or "the value I assign to a Picasso or to an SUV." This terminology sounds much better to me.

P.S. See Dave's comments here.

Here's the abstract for my talk tomorrow for the 50th anniversary conference of the Harvard statistics department.

Some Open Problems in Hierarchical Models

Dave Krantz on utility and value

Dave had these comments on my recent thoughts on utility and value functions:

I [Dave] agree with the negatives about "utility" as a word and as a Platonic function (attached to each individual).

In teaching, I tend to discuss "subjective value." In my decision making course for undergrads I talk about optimization with respect to "objective" values, including physical, biological, and economic indices (e.g., maximum area, maximum sustainable yield, maximum profit), and with respect to subjective value, measured in a variety of ways; then I emphasize that many decision rules do not maximize anything -- because the weighting or even the existence of many goals is context dependent, and because some goals are converted into constraints. Optimization is thus subject to constraint and performed with context-dependent weights.

A standard use for "value function" in behavioral economics derives from Tversky & Kahneman's Prospect Theory; one of the blog contributors complains about that. And the emphasis on choice of words leads another contributor to treat the issue as one of words, rather than concepts and facts, no more important than "degrees of freedom" (which, of course, is a venerable term used relatedly in physics and in statistics).

I don't think there is an easy cure via terminology, though I feel you are on the right track here.

Those numbered Congresses

Since I'm layin' down the law on terminology . . .

Could the entire subfield of American politics please stop talking about the 77th Congress or the 103rd Congress and start talking about the 1941-42 Congress and the 1993-94 Congress and so forth? This would just make everybody's life easier.

Thank you. I'll stop bugging you now.

Xue Li and Dick Campbell write,

We are doing research on racial and SES disparities in stage at breast cancer diagnosis which involves using tract-level poverty estimates to impute person-level SES variables. We are trying to do this using Bayesian methods.

Markus Loecher asks some questions that come up on occasion:

I am in the middle of going through your book on multilevel/hierarchical models . . . hope to apply them to some multi-scale spatial problems that I am working on. I am still struggling with some of the more subtle implications of the mixed effects model and the "partial pooling" approach, which even when formulated in more frequentist terminology seems to have a distinct Bayesian flavor to it? One particular source of confusion to me is Figure 12.1. While I understand the problems with estimates that rely on small sample sizes and also find the shrunken estimates in Fig. 12.1b appealing, I cannot overcome my feeling that this "dilemma" should be taken care of and expressed in the respective wider confidence intervals. Pooling across counties seems like quite a strong assumption on exchangeability? A more mundane question is: how did you get the confidence intervals for sample sizes of 1?

My response:

1. Frequentist, Bayesian: they're just words. What matters is what you're doing. That's why we talk about partial pooling, 'cos that's what we're doing to the data. If you push me on it, though, yeah, the entire book is Bayesian.

2. You'd like the intervals in Figure 12.1b to be wider. Here's a way to think of it: suppose the data we saw were a random sample from a huge dataset, with thousands of measurements per county. Now suppose you wanted to use the no-pooling or the multilevel estimates as a way of making a prediction for the average of the thousands of measurements in each county. Finally, suppose that you want to give each estimate a confidence interval, so that in each case there's a 68% chance that the interval contains the true value. Then you'd find (assuming the model is correct) that, indeed, the no-pooling estimates would require those really wide confidence intervals (as in Fig 12.1a), but the multilevel estimates would need only the narrower intervals (as in Fig 12.1b).

3. "Exchangeability" refers to statistical methods (or, more generally, mathematical formulas) that treat the different groups (in this case, counties) symmetrically. The no-pooling, multilevel, and complete-pooling procedures are all exchangeable. So if you don't like exchangeability, there's no reason to pick on the multilevel model. More to the point, the multilevel model allows you to easily go beyond exchangeability by adding group-level predictors as illustrated later on in the chapter.

4. Even when sample size is 1 (or 0), you can get confidence intervals--the info comes from the group-level model.
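To make points 2 and 4 concrete, here is a sketch of partial pooling with made-up numbers (a normal model with known variances, which is simpler than the examples in the book):

# county effects modeled as N(mu, tau^2); county j has n[j] observations
# with residual sd sigma and sample mean ybar[j]
sigma <- 0.8; mu <- 1.5; tau <- 0.3
n    <- c(1, 3, 10, 50)          # note the county with a single observation
ybar <- c(2.5, 1.0, 1.8, 1.4)
prec <- n/sigma^2 + 1/tau^2      # precision from data plus group-level model
theta.hat <- (n/sigma^2*ybar + mu/tau^2)/prec
se <- sqrt(1/prec)               # finite even when n = 1
cbind(n, ybar, theta.hat, se)    # small-n counties are pulled toward mu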

What makes a face attractive?

Susan sent me this link and asked for my thoughts about some related question which, unfortunately, I've forgotten. That's what happens when you wait over a month to answer an email. Anyway, the website is cute, much cuter than ours. We clearly have a lot of work to do.

The work looks interesting. I wonder about time trends. It's my impression that characters in old TV shows were often pretty ugly (for example, consider the guy in Mr. Ed), but now they all seem pretty attractive. But maybe some of that is technology--cameras are better so they don't have to slap on all the greasepaint or whatever.

This one's pretty funny. TV personality Tucker Carlson writes,

OK, but here’s the fact that nobody ever, ever mentions — Democrats win rich people. Over 100,000 in income, you are likely more than not to vote for Democrats. People never point that out. Rich people vote liberal. I don’t know what that’s all about.

Well, yeah, nobody ever points that out, because it's actually false! Here are the 2004 data, and Krugman links to the 2006 polls. (I followed the link originally from Mark Thoma.)

Income certainly doesn't determine everything--the Republicans only beat the Democrats 52-47 among the over-$100,000 group. Nonetheless, there's something funny about someone complaining that people aren't repeating a false fact. You'd think he would've thought something like, Huh, this is an interesting fact . . . nobody ever mentions it . . . maybe it's actually false.

Survey weighting and regression modeling

Mike Larsen asks,

The probability coverage demonstration

Kaiser writes,

Been leafing through the "Super Crunchers" book over the weekend. . . . Halfway through it, I am still trying to figure out if "super crunching" means traditional statistics or data mining. It is not without irony that the author seems to equate the two. Regardless, it's still good publicity for our field.

One example that seemed to have caught on [comes] from a book called "Decision Traps" by Russo and Schoemaker (who I think are business consultants). The idea is a catchy one, which is to illustrate the "over-confidence" of decision makers. The trick they used is to ask people to provide interval estimates at 90% confidence to a list of 10 questions such as "What was Martin Luther King Jr's age at death?", and "In what year was Mozart born?". Out of 1000+ respondents, they found that "less than 1 percent of the people gave ranges that included the right answer nine or ten times. Ninety-nine percent of people were overconfident." (pp.112-114 in the book). . . . Have you done anything similar with your students?

As far as I know the original idea came from an example of Alpert and Raiffa. I've had lots of success doing an adaptation of the Alpert and Raiffa demo in class; see Section 13.2.2 of Teaching Statistics, my book with Nolan, or Section 4 of this paper.

There's also a more standard confidence coverage demo in Section 8.4 of Teaching Statistics. That one works well in class too.
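The flavor of that coverage demo is easy to simulate, by the way. Here's a quick version (my own sketch, not the one from the book): draw many samples, form mean +/- 1 standard error intervals, and count how often they cover the truth.

n.sims <- 10000; n <- 20; mu <- 0
covered <- replicate(n.sims, {
  y <- rnorm(n, mu, 1)
  abs(mean(y) - mu) < sd(y)/sqrt(n)   # +/- 1 s.e. is roughly a 68% interval
})
mean(covered)   # should come out near 0.68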

Psychologists vs. economists

This is fun because, as an outsider to both fields, I can just stand back and watch. Dan Goldstein writes:

MCMC starting values

Jun Xiang writes,

I am using your R2WinBUGS to estimate some hierarchical logistic regression and have a question about how to set initial values. First, when I use some uninformative initial values the model has some convergence problem (Rhat indicates non-convergence after 10,000 iterations). Next, I use MLE as the initial values and the model has better convergence results. I want to know if the latter way is right since each chain in this case gets the same initial values.

My reply: Yes, you can have problems if your starting values are too wacky. Basically, the usual noninformative prior distributions contain parameter values that are so extreme (e.g., theta=10^4) that there can be convergence issues. In principle it would be best to use more reasonable prior distributions, but in practice it often works fine if you pick starting values that are less extreme. This is discussed a bit in the Gelman and Hill book in the discussion of using Bugs. In addition, it makes sense to parameterize reasonably (for example, scaling predictors) so that you won't get coefficients such as 10^4 or 10^-4. We discuss the scaling issue in the book also.
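For what it's worth, a common compromise is to jitter the point estimates so that each chain starts in a reasonable region but not at the identical point. A sketch (the parameter names, data object, and model file here are hypothetical, not from any particular example):

library(R2WinBUGS)
mle <- c(a = 0.3, b = -1.2)     # say, from a glm() fit
inits <- function()             # bugs() calls this once per chain
  list(a = rnorm(1, mle["a"], 0.1),
       b = rnorm(1, mle["b"], 0.1))
fit <- bugs(data = my.data, inits = inits,
            parameters.to.save = c("a", "b"),
            model.file = "mymodel.bug", n.chains = 3, n.iter = 10000)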

I received the following email:

Web 2.0

Aleks writes: "worth watching this movie on information in the age of the internet, and the prequel."

My impression: they were fine, but I'm just too impatient to watch a video. I can scroll through text much faster. One thing I did learn, though, is that I should be using XML rather than HTML, which is how I write my webpages now.

I just got this letter in the email. Oddly enough, it appears to be serious:

The Arete Initiative at the University of Chicago is pleased to announce a new $2 million research program on the nature and benefits of Wisdom. Although it has been neglected in the past, a new scientific and scholarly study of Wisdom has the potential to raise new questions, challenge assumptions, and develop new theoretical and empirical models which will enliven debate within and across disciplines. We are looking for new ideas and approaches from the brightest young scholars. To this end, in 2008 we will award up to twenty (20), two-year research grants to scholars from institutions around the world who have received their Ph.D. within the past ten years.

We are looking for highly original, methodologically rigorous projects from a broad range of disciplines: neuroscience, psychology, genetics, evolutionary biology, game theory, computer science, sociology, anthropology, economics, philosophy, ethics, education, human development, history, theology, and religion. Although individual projects will likely take root in a particular area or in two related areas, award recipients will participate in annual research meetings . . .

I was relieved to see that statistics was not on this list; unfortunately, political science is listed on the webpage. Not to be too skeptical or anything--I'm sure many of my own research projects can be easily and usefully mocked--but I'm doubtful about what sort of "highly original, methodologically rigorous projects" can be done on the "nature and benefits of Wisdom." It reminds me a bit of the research project in The Tin Men where they were designing ethical machines. Or maybe it's the capital-letter thing with Wisdom that draws suspicion. Couldn't they have thrown in a few more buzzwords, such as "evidence-based" and "feng shui"? Maybe "six-sigma tough"? On the other hand, hey, it's only $2 million; that's not a lot of money.

Then We Came to the End

This book, by Joshua Ferris, is brilliant, hilarious. Sort of a cross between Geoffrey O'Brien and Don DeLillo, only funnier. Or like a fast-forward Richard Ford without the smugness. It's impressive to me the way in which novel-writing technique has improved in recent decades. I mean, John Updike was pretty slick, but Ferris (and, for that matter, Ford) really seems in total control of his material, even in comparison to the masters of the previous generations. Sure, there was Nabokov (and, in his own way, James Jones), but that's about it from back then. Now there seem to be a lot of novelists who really know what they're doing in this way. (I think Jonathan Coe could have total control of his material too, if he really felt like it. He seems like Mailer or (Martin) Amis in his desire to shatter his own smooth surfaces.)

P.S. Sorry for only mentioning white males. I'll try to do better next time. Veronica Geng? Alex Haley? Joy Luck Club?

P.P.S. I recommended "Then We Came to the End" to a friend the other day. She hadn't read it, and I'm not sure if she knew what it was about, but, oddly enough, she knew the author's name. I'm not likely to remember an author's name if I haven't read the book (especially since it was his first). She did, however, return the favor by recommending I read something by Jane Austen. I don't think I've read any pre-Dickens novels, and she teaches a whole classful of this stuff. Mark Twain is the earliest writer whose prose reads to me as if it could have been written today, but she said that Austen is like that too.

Make R beep

Yu Sung says: Just type alarm() or cat("\a").
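For instance, to get pinged when a long job finishes:

Sys.sleep(5)   # stand-in for a long-running computation
alarm()        # beep; cat("\a") does the same thing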

John Huber and Piero Stanig are speaking on this paper:

We [Huber and Stanig] analyze how institutions that establish the level of separation of church and state should influence the political economy of redistribution. Our formal model describes how incentives for charitable giving, coupled with church-state institutions, create opportunities for the rich to form coalitions with the religious poor, at the expense of the secular poor. In our analysis, religion can limit redistribution — not because of the particular faith, belief or risk attitudes of religious individuals (as emphasized by others) — but rather because of simple material greed among the rich and the religious poor. We explore how church-state separation will mediate efforts by the rich to form electoral coalitions with the religious poor, as well as the implications for the size of government, charitable giving, and the welfare of various social groups.

I don't have any specific comments on the paper, but I do wonder if the model can help explain the pattern that, in recent U.S. elections, income predicts vote choice among the religious but not among non-church-attenders:

religion.png

A new kind of spam?

I received the following two emails, from two different senders, nine minutes apart today:

Half-lives of verbs

Richard Morey sends along this link. It looks pretty cool; the only thing that bugs me is that they keep using the word "mathematical" when they really mean "statistical."
