Results matching “R”

Still more on dolphin fetuses

Eric Archer sent in a question on Bayesian logistic regression. I don't really have much to say on this one, but since it's about dolphins I'll give it a try anyway. First Eric's question, then my reply.

Our new book on regression and multilevel models is written using R and Bugs. But we'd also like, in an appendix, to quickly show how to fit multilevel models using other software, including Stata, SAS, SPSS, MLwiN, and HLM (and others?). We'd really appreciate your help in getting sample code for these packages.

We've set up five simple examples. I'll give them below, along with the calls in R that would be used to fit the models. We're looking for the comparable commands in the other languages. We don't actually need anyone to fit the models, just to give the code. This should be helpful for students who will be learning from our book using other software platforms.
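
For concreteness, a varying-intercept model is the simplest example of this kind; here is a minimal sketch of the sort of R call involved, using lmer from the lme4 package (the data frame and variable names below are hypothetical, not the book's actual examples):

library(lme4)
# Varying-intercept multilevel model: outcome y, predictor x, grouping factor "group"
fit <- lmer(y ~ x + (1 | group), data = mydata)
summary(fit)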

In a comment here, Aleks points us to Gary C. Ramseyer's Archives of Statistics Fun. In addition to various amusing items, he has the following fun-looking probability demo. For over a decade now, I've been collecting class-participation demonstrations for probability and statistics, and it's not every day or even every week that I hear of a new one. Here it is:

In a comment to this entry, Antony Unwin writes,

It is great to see someone emphasising the use of graphics for exploration as well as for presentation, but if you want to do EDA you need interactive graphics. It's a mystery to me why more people do not use them. Perhaps it's because you have to have fast and flexible software, perhaps it's because of the subjective component in exploratory work, perhaps it's because using interactive graphics is fun as well as productive and statisticians are serious people. I'd be grateful for explanations.

I'm in a good position to answer this question, since I probably should be using interactive graphics but I actually don't. Why not? I don't know how to, and I guess I never was forced to learn. But I'm teaching a seminar course on statistical graphics this semester, so this would be a great time for me (and my students) to learn.

Resources for getting started with dynamic graphics? (preferably in R)

The class will be structured as a student presentation each week. So, one week I'd like a pair of students to present on dynamic graphics. During the week before, they'd learn a dynamic graphics system, then during the class they'd do a demo, then there'd be some homework for all the class for next week.

Any suggestions on what they should do? Ideally it should be in R (since that's what we're using for the course). Antony Unwin suggested iPlots in R and Martin Theus's package Mondrian. The tricky thing is to get a student ready to do a presentation on this, given that I can't already do it myself. The student would need a good example, not just a link to the package.

26-Jan Sanders Korenman, Professor of Public Affairs, Baruch. “Improvements in Health among Black Infants in Washington DC.” (with Danielle Ferry)

2-Feb Kristin Mammen, Assistant Professor of Economics, Barnard. “Fathers' Time Investments in Children: Do Sons Get More?”

9-Feb Sally Findley, Clinical Professor of Population and Family Health, Mailman School of Public Health, CU. “Cycles of Vulnerability in Mali: The interplay of migration with seasonal fluctuations.”

16-Feb FFWG: Jeanne Brooks-Gunn, Virginia and Leonard Marx Professor of Child Development and Education at Teachers College, CU. “Child Care at Three: Preliminary Analyses of Fragile Families.”

2-Mar FFWG: Brad Wilcox, Assistant Professor of Sociology, University of Virginia. Visiting Scholar, CRCW, Princeton University. “Domesticating Men: Religion, Norms, & Relationship Quality Among Fragile Families”

9-Mar Pierre Chiappori, E. Rowan and Barbara Steinschneider Professor of Economics, CU. “Birth Control and Female Empowerment: An Equilibrium Analysis.” NOTE: this seminar will begin at 12:30.

16-Mar FFWG: Mary Clare Lennon, Associate Professor of Clinical Sociomedical Sciences, Mailman School of Public Health, CU. Visiting Scholar, CRCW, Princeton University. “Trajectories of Childhood Poverty: Tools for Analyzing Duration, Timing, and Sequencing Effects.”

23-Mar Andrew Gelman, Professor of Statistics and Political Science, CU. “Rich State, Poor State, Red State, Blue State: What’s the Matter with Connecticut? A Demonstration of Multilevel Modeling.”

13-Apr Howard Bloom, Chief Research Scientist, MDRC. “Using a Regression Discontinuity Design to Measure the Impacts of Reading First.”

20-Apr FFWG: Barbara Heyns, Professor of Sociology, NYU. Visiting Scholar, CRCW, Princeton University. “The Mandarins of Childhood.”

27-Apr Leanna Stiefel, Professor of Economics, Wagner School, NYU. “Can Public Schools Close the Race Gap? Probing the Evidence in a Large Urban School District.”

4-May FFWG: Cay Bradley, School of Social Policy & Practice, University of Pennsylvania. Title: TBA

11-May Jane Waldfogel, Professor of Social Work and Public Affairs, CUSSW. "What Children Need."

18-May FFWG: Chris Paxson, Professor of Economics and Public Affairs, Princeton. “Income and Child Development.”

I think I noticed this because I've been thinking recently about crime and punishment . . . anyway, Gary Wills in this article in the New York Review of Books makes a basic statistical error. Wills writes:

In the most recent year for which figures are available, these are the numbers for firearms homicides:

Ireland 54
Japan 83
Sweden 183
Great Britain 197
Australia 334
Canada 1,034
United States 30,419

But, as we always tell our students, what about the denominator? Ireland only has 4 million people (it had more in the 1800s, incidentally). Yeah, 30,000/(300 million) is still greater than 54/(4 million), but still . . . this is basic stuff. I'm not trying to slam Wills here--numbers are not his job--but shouldn't the copy editor catch something like that? Flipping it around, if a scientist had written an article for the magazine and had messed up on grammar, I assume the copy editor would've fixed it.
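
To make the denominator point concrete, here's a quick calculation of rough per-capita rates from the numbers above (the population figures are my own approximations, in millions, circa the mid-2000s):

# Firearm homicide counts from the article, with approximate populations in millions
homicides <- c(Ireland = 54, Japan = 83, Sweden = 183, Britain = 197,
               Australia = 334, Canada = 1034, US = 30419)
pop_millions <- c(Ireland = 4, Japan = 128, Sweden = 9, Britain = 60,
                  Australia = 20, Canada = 32, US = 300)
round(homicides / (pop_millions * 10), 1)   # rate per 100,000 people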

The problem, I guess, is that there are a lot more people qualified to be grammar-copy-editors than to be statistics-copy-editors. But still, I think that magazines would do well to hire such people.

P.S. I also noticed the Wills article linked here, although they didn't seem to notice the table.

More press for Bayes

Gabor Grothendieck forwarded the following article by Derek Lowe from Medical Progress Today:

FDA Shows Interest in 18th Century Presbyterian Minister
Bayesian statistics may help improve drug development

Not many ideas of 18th-century Presbyterian ministers attract the interest of the pharmaceutical industry. But the works of Rev. Thomas Bayes have improved greatly with age. . . . For decades, no one heard very much about Bayesian statistics at all. One reason for this was they're much more computationally demanding, which was a real handicap until fairly recently. . . But things are changing. . . . Bayesian and standard "frequentist" statistics are in many ways mirror images of each other, and there are mistakes to be made each way. . . . Bayesian statistics, though, don't address the likelihood that your observed results might have come out by random chance, but rather give you a likelihood of whether your initial hypothesis is true. . . . As far as I know, no pharma company has yet taken a fully Bayesian clinical package to the FDA for a drug approval. There have been a few dose-finding trials in the cancer area, and Pfizer's research arm in England ran a large trial of a novel stroke therapy under Bayesian protocols. . .

It's an interesting article in that it's presenting a pharmaceutical-industry perspective on something that I usually think of from an academic direction. I suspect that there's more Bayesian stuff going on in pharmaceuticals than people realize, though: I'd be curious what Amy Racine-Poon or Don Berry or Don Rubin could add to this article. For example, Bayesian data analysis has been used in toxicology for a while.

Also, as I discussed in response to another recent news article about Bayesian inference, I think the differences between Bayesian and other statistical methods can be overstated. In particular, so-called frequentist methods still require "estimates" and "predictions" which can be (and often are) obtained using Bayesian inference. Also, I don't think it's quite right to say that Bayesian methods "give you a likelihood of whether your initial hypothesis is true." It would be more accurate to say that they allow you to express your uncertainty probabilistically, for example, giving some distribution of the effectiveness of a new drug, as compared to an existing treatment. The idea of a point "hypothesis" is, I think, a holdover from classical statistics that is a hindrance, not a help, in Bayesian inference. Finally, the article has a bit of discussion about where the prior comes from. In many, many examples, the prior distribution is estimated from data using hierarchical modeling. There's not any need in this framework to specify a numerical prior distribution in the way described in the article.

In conclusion, Lowe's article gives an interesting look at Bayesian inference from another perspective, and also reveals that some of the recent (i.e., since 1980) developments in Bayesian data analysis still have not trickled through to the practitioners. I think that hierarchical modeling is much more powerful than the traditional "hypothesis, prior distribution, posterior distribution" approach to Bayesian statistics.

Things I feel bad about

In a comment on this entry, Jim Lebeau links to Scigen, a site by some MIT grad students describing a computer program they wrote to generate context-free simulacra of scientific papers that they submitted to a well-known fake scientific conference called SCI. (As far as I can tell, SCI and similar conferences are scams that make money off of conference fees (and maybe hotel reservations?).) SCI is well-known in that lots of people get their spam "invitations."

Anyway, the Scigen folks went to the "conference" and returned with this amusing report. Basically, the organizers were shifty and the conference attendees were clueless. It gave me a queasy feeling, though: it reminded me of when Joe Schafer and I were grad students and went to a session of cranks at the Joint Statistical Meetings. (Until a few years ago, the JSM used to accept all submissions, and they would schedule the obvious cranks, the people who could prove that pi=3 and so forth, for the last session of the conference.) Anyway, Joe and I showed up and sat at the back of the room. It was basically us and the speakers, maybe one or two other people. We attended a couple of talks. The speakers really were clueless. They were people with real jobs, I think, but with delusions of mathematical grandeur. At some point, Joe and I couldn't keep from cracking up. It was just too much trouble to swallow our laughs, and we had to leave. Then we felt terrible, of course. It's a good thing that the JSM rejects these talks now.

The funny thing is, in the last few years I've been to a couple of real sessions at the JSM in the last time slot on the last day. And the attendance for these real sessions has been almost as low as for the fake one that Joe and I attended.

Children vs. chimps

Phil writes,

I [Phil] recently read a book called "Becoming a Tiger," about animal learning. It has lots and lots of short pieces, arranged by theme; light reading, but very informative; includes references if one wanted to follow up.

Perhaps my favorite story is about otters. Back in the 60s (I think it was...might have even been 50s) the received wisdom among animal researchers was that animals learn only by operant conditioning: they do what they're rewarded for, they don't do what gets them punished, and that's it. Anyway, this woman is working with otters. She wants to train it to climb onto a box. She puts a box in its cage, it eventually climbs on, and she gives it a treat. It gets very excited, jumps down, runs around the cage, and gets back up on the box. She gives it another treat. Such thrills! It jumps down, and jumps back on the box...and this time it stands on only three legs. She gives it a treat anyway. It jumps down, jumps back up, and lies on its back. And so on. It tries variations: putting just its front legs on the box, putting just its back legs on the box, etc.

Later, the researcher tells this story to some visiting scientists. They don't believe her: "an animal won't do something that might not get it a reward, if it can do something else that will get it a reward." She takes them to see her otter. She holds a hoop under water in the otter's pool. It swims around, looks at it, touches it...and eventually swims through it. She gives it a treat. Excitement! It swims through it again...another treat. It swims through it again, and grabs it out of her hand. No treat. It drops the hoop. She holds it up again. The otter swims through upside down. Treat. Try again, the otter swims through backwards. No treat. And so on. The otter kept on experimenting. One of the visiting researchers said "it takes years for me to train my grad students to be this creative."

Anyway, the book is full of stories of animals doing amazing things, as you might expect. But it also is full of stories of animals doing mental feats that humans can't do, which I did not expect. For example, supposedly a common element on human IQ tests is to look at the sketch of an object and pick which of several choices represents the same object viewed from a different angle. Trained pigeons can do this about as well as trained college students, but the pigeons can do it faster.

I think there could be a fun "reality TV" game show based on this. You would select a human team: an artist, an infant, a brainiac college student, and so on. You would have a series of tasks for them to do, and they would get to choose someone to compete in the task. Some of the tasks would be right up their alley...like, you'd have a basketweaver, and presumably the team would pick him to compete in the basketweaving contest, where he has 3 hours to make an artistic basket out of a pile of straw. He can probably do something pretty cool. but the bowerbird is going to kick his ass. You could find someone who is famous for having a great memory, and they might do pretty well burying 200 small objects in a field and coming back a week later to retrieve them...but some kinds of birds can bury literally thousands of nuts and seeds and find them months later, so this is not going to be a contest either. And in the pigeon vs the college kid in the "object rotation" test, the pigeon will have a slight edge.

The deterrent effect of the death penalty

John Donohue sent me this paper reviewing evidence about the deterrent effect of the death penalty. I recommend the Donohue and Wolfers paper highly--at least on the technical side, it's the paper I would like to have written on this topic, if I had been able to go through all the details on the many recent papers on the topic. I can't really comment on their policy recommendations but have some further thoughts on the death-penalty studies from a statistical perspective.

I'll first give the key paragraph of the conclusion of their paper, then give my comments:

We [Donohue and Wolfers] have surveyed data on the time series of executions and homicides in the United States, compared the United States with Canada, compared non-death penalty states with executing states, analyzed the effects of the judicial experiments provided by the Furman and Gregg decisions comparing affected states with unaffected states, surveyed the state panel data since 1934, assessed a range of instrumental variables approaches, and analyzed two recent state-specific execution moratoria. None of these approaches suggested that the death penalty has large effects on the murder rate. Year-to-year movements in homicide rates are large, and the effects of even major changes in execution policy are barely detectable. Inferences of substantial deterrent effects made by authors examining specific samples appear not to be robust in larger samples; inferences based on specific functional forms appear not to be robust to alternative functional forms; inferences made without reference to a comparison group appear only to reflect broader societal trends and do not hold up when compared with appropriate control groups; inferences based on specific sets of controls turn out not to be robust to alternative sets of controls; and inferences of robust effects based on either faulty instruments or underestimated standard errors are also found wanting.

My thoughts

My first comment is that death-penalty deterrence is a difficult topic to study. The treatment is observational, the data and the effect itself are aggregate, and changes in death-penalty policies are associated with other policy changes. (In contrast, my work with Jim Liebman and others on death-penalty reversals was much cleaner--our analysis was descriptive rather than causal, and the problem was much more clearly defined. This does not make our study any better, but it certainly made it easier.) Much of the discussion of the deterrence studies reminds me of a little-known statistical principle, which is that statisticians (or, more generally, data analysts) look best when they are studying large, clear effects. This is a messy problem, and nobody is going to come out of it looking so great.

My second comment is that a quick analysis of the data, at least since 1960, will find that homicide rates went up when the death penalty went away, and then homicide rates declined when the death penalty was re-instituted (see Figure 1 of the Donohue and Wolfers paper), and similar patterns have happened within states. So it's not a surprise that regression analyses have found a deterrent effect. But, as noted, the difficulties arise because of the observational nature of the treatment, and the fact that other policies are changed along with the death penalty. There are also various technical issues that arise, which Donohue and Wolfers discussed.

Also some quick comments:

- Figure 1 should be two graphs, one for homicide rate and one for execution rate.

- Figure 2 should be 2 graphs, one on top of the other (so the timeline is clear), one for the U.S. and one for Canada. Actually, once Canada is in the picture, maybe consider breaking U.S. into regions, since the northern U.S. is perhaps a more reasonable comparison to Canada.

- Figure 8 would be much clearer as 12 separate time series in a grid (with common x- and y-axes). It's hopeless to try to read all these lines. With care, you could fit these 12 little plots in the exact same space as this single graph.

Another view

Here's an article by Joanna Shepherd stating the position that the death penalty deters homicides in some states but not others. Her story is possible, but it seems to me (basically, for the reasons discussed in the Donohue and Wolfers paper) not a credible extrapolation from the data.

The death penalty as a decision-analysis problem?

Policy questions about the death penalty have sometimes been expressed in terms of the number of lives lost or saved by a given sentencing policy. But I think this direction of thinking might be a dead end. First off, as noted above, it may very well be essentially impossible to statistically estimate the net deterrent effect of death sentencing--what seem like the "hard numbers" (in Richard Posner's words, "careful econometric analysis") aren't so clear at all.

More generally, though, I'm not sure how you balance out the chance of deterring murders with the chance of executing an innocent person. What if each death sentence deterred 0.1 murder, and 5% of people executed were actually innocent? That's still a 2:1 ratio, 0.1 murders deterred versus 0.05 innocents executed per execution (assuming that it's OK to execute the guilty people). Then again, maybe these innocent people who were executed weren't so innocent after all. But then again, not every murder victim is innocent either. Conversely, suppose that executing an innocent person were to deter 2 murders (or, equivalently, that freeing an innocently-convicted man were to un-deter 2 murders). Then the utility calculus would suggest that it's actually OK to do it. In general I'm a big fan of probabilistic cost-benefit analyses (see, for example, chapter 22 of my book), but here I don't see it working out. The main concerns--on the one hand, worry about out-of-control crime, and on the other hand, worry about executing innocents--just seem difficult to put on the same scale.

Finally, regarding decision analysis, incentives, and so forth: much of the discussion (not in the Donohue and Wolfers paper, but elsewhere) seems to go to the incentives of potential murderers. But the death penalty also affects the incentives of judges, juries, prosecutors, and so forth. One of the arguments in favor of the death penalty is that it sends a message that the justice system is serious about prosecuting murders. This message is sent to the population at large, I think, not just to deter potential murderers but to make clear that the system works. Conversely, one argument against the death penalty is that it motivates prosecutors to go after innocent people, and to hide or deny exculpatory evidence. Lots of incentives out there.

P.S. See also here and here for more comments on the topic.

Fake cancer study in Norway

Kjetil Halvorsen linked to this story, "Research cheats may be jailed," from a Norwegian newspaper:

State officials have been considering imposing jail terms on researchers who fake their material, but the proposal hasn't seen any action for more than a year. More is expected now, after a Norwegian doctor at the country's most prestigious hospital was caught cheating in a major publication. . . . The survey allegedly involved falsification of the death- and birthdates and illnesses of 454 "patients." The researcher wrote that his survey indicated that use of pain medication such as Ibuprofen had a positive effect on cancers of the mouth. He's now admitted the survey was fabricated and reportedly is cooperating with an investigation into all his research.

And here's more from BBC News:

Norwegian daily newspaper Dagbladet reported that of the 908 people in Sudbo's study, 250 shared the same birthday.

Slap a p-value on that one, pal.

Speaking as a citizen, now, not as a statistician, I don't see why they have to put these people in jail for a year. Couldn't they just give them a big fat fine and make them clean bedpans every Saturday for the next 5 years, or something like that? I would think the punishment could be appropriately calibrated to be both a deterrent and a service, rather than a cost, to society.

P.S. Update here.

Mr. Biv, call the office right away

Latest update from the front lines of the crisis in scientific education:

Curious George's Big Book of Curiosities has a picture of a rainbow, with the following colors, in order: red, yellow, blue, green. What the . . . ?

They also have a page of Shapes, which includes a circle, a triangle, a rectangle, an oval, and two different squares, one of which is labeled "square"--the other is labeled "diamond." So much for rotation-invariance!

Unethical behavior by a science journalist

Duncan Watts is a colleague of mine in the sociology department at Columbia. He is a very active researcher in the area of social networks and runs a fascinating seminar. Anyway, he and his student, Gueorgi Kossinets, recently published a paper in Science, entitled "Empirical analysis of an evolving social network." He then had an annoying interaction with Helen Pearson, a science reporter who writes a column for Nature online. Duncan writes:

I [Duncan Watts] was particularly pleased when Ms. Pearson called me last week, expressing her interest in writing a story for Nature's online news site. Having read Philip Ball's careful and insightful reports for years, I imagined that Nature News would be a great opportunity for us to have a substantive but accessible news story written about our work. And after speaking with Ms. Pearson for about two hours on the phone, over two consecutive days, sending her some additional reading material, and recommending (at her request) a number of other social network researchers she could talk to, I felt pretty confident that we would have exactly that. She asked lots of questions, seemed intent on understanding my responses, and generally acted like a real science journalist.

So imagine my surprise when Monday morning I saw that our work had been characterized as "bizarre" and "pointless" in a derisive fluff piece by a fictional columnist.

Well, any publicity is good publicity, and all that. As I told Duncan, having a paper in Science is much much more of a plus than this silly article is a minus. If anything, it might get people to read the article and see that it has some cool stuff inside. I think the people who we respect will have much more respect for a peer-reviewed article in a top journal than for a silly pseudo-populist column.

On the other hand, what a waste of space--well, it's an online publication, so there are no real space restrictions. But what a waste of Duncan's time and energy. Basically, Pearson is trading on the reputation of Nature, and her own reputation as a science journalist. To pretend to be writing a serious story and then to mock is simply unethical. Why not simply be honest--if she wants to mock, just say so upfront, or just write the article without wasting Duncan's time?

What about my own experiences in this area? I generally want my work to be publicized--why do the work if nobody will hear about it?--and so I'm happy to talk to reporters. I've tried to avoid the mockers, however. When Seth and I taught our freshman seminar on left-handedness several years ago at Berkeley, we were contacted by various news organizations in the San Francisco area. At one point a TV station wanted to send a crew into the class, but they decided not to after I explained that the class mostly involved readings from the scientific literature, not original research. Also, one publication that contacted us talked with Seth first, and he got the sense that they were just fishing for quotes for a story they wanted to write on stupid college classes, or something like that. So I avoided talking with them.

Also, I was mocked in the House of Commons once! Well, not by name, but in 1994, I think it was, I spoke at a conference in England on redistricting, to tell them of our findings about U.S. redistricting. Someone told me that a parliamentarian with the Dickensian name of Jack Straw had mocked our paper, which was called Enhancing Democracy Through Legislative Redistricting. But then someone else told me that being mocked by Jack Straw was kind of a badge of honor, so I don't know.

Anyway, it's too bad to see the reputation of Nature Online diminished in this way, and I hope Duncan has better luck in his future dealings with the press.

P.S.

See here for a funnier version of Pearson's article with bits such as "needed to hammer home the 'duh' point a bit here I think. Ok?". I also like that in this version she describes the journal Science as "one of my very favourite reads." I mean, I like research papers as much as the next guy, but I have to admit that I wouldn't usually curl up in my favorite (whoops, I mean "favourite") chair with a science journal.

Graphics suggest new ideas

Seth wrote an article, "Three Things Statistics Textbooks Don’t Tell You." I don't completely agree with the title (see below) but I pretty much agree with the contents. As with other articles by Seth, there are lots of interesting pictures. My picky comments are below.

Gregor Gorjanc writes,

Missing-data imputation question

Joost van Ginkel writes,

Gary King, along with Bernard Grofman, Jonathan Katz, and myself, wrote an amicus brief for a Supreme Court case involving the Texas redistricting. Without making any statement on the redistricting itself (as noted in the brief, I've never even looked at a map of the Texas redistricting, let alone studied it in any way, quantitative or otherwise), we make the case that districting plans can be evaluated with regard to partisan bias, and that such evaluation is uncontroversial in social science and does not require any knowledge or speculation about the intent of the redistricters.

The key concept, as laid out in the brief, is to identify partisan bias with deviation from symmetry. I'll quote briefly from the brief (which is here; see also Rick Hasen's election law blog for more links) and then mention a couple of additional points which we didn't have space there to elaborate on. I'm interested in this topic for its own sake and also because Gary and I put a lot of effort in the early 90s into figuring this stuff out.

Here's what we wrote on partisan symmetry in the amicus brief:

The symmetry standard measures fairness in election systems, and is not specific to evaluating gerrymanders. The symmetry standard requires that the electoral system treat similarly-situated political parties equally, so that each receives the same fraction of legislative seats for a particular vote percentage as the other party would receive if it had received the same percentage. In other words, it compares how both parties would fare hypothetically if they each (in turn) had received a given percentage of the vote. The difference in how parties would fare is the “partisan bias” of the electoral system. Symmetry, however, does not require proportionality.

For example, suppose the Democratic Party receives an average of 55% of the vote total across a state’s district elections and, because of the way the district lines were drawn, it wins 70% of the legislative seats in that state. Is that fair? It depends on a comparison with the opposite hypothetical outcome: it would be fair only if the Republican Party would have received 70% of the seats in an election where it had received an average of 55% of the vote totals in district elections. This electoral system would be biased against the Republican Party if it garners anything fewer than 70% of the seats and biased against the Democratic Party if the Republicans receive any more than 70%.

A couple of other things we didn't have space to include

1. To measure symmetry, both parties need to be in contention. For example, there's no way to determine if the districting system is "fair" to a third party that gets 20% of the vote, since it's unrealistic to extrapolate to what would happen if the Democrats and Republicans get 20% of the vote. This connects to the idea that the "ifs" needed to evaluate partisan bias are based on extrapolation rules that are extensions of uniform partisan swing. If one party is always getting 70% of the statewide vote, our symmetry measure isn't so relevant.

2. Our measure (and related measures) are not about "gerrymandering" at all (in the sense that "gerrymandering" refers to deliberate creation of wacky districts) but rather are measures of symmetry (and, by implication, fairness) of the electoral system.

Thus, one could crudely imagine a 2 x 2 grid, defined by "gerrymandering / no-gerrymandering" and "symmetric / not-symmetric". It's easy to imagine districting plans in all four of these quadrants. We're saying that the social-science standard is to look at symmetry, not at gerrymandering--thus, at outcomes, not intentions, and in fact, at state-level outcomes, not district-level intentions. This is a huge point, I think.
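
For readers who want to see the symmetry standard in computational form, here is a crude sketch of the basic calculation (the district vote shares are made up), using the simplest extrapolation rule of uniform partisan swing: shift every district by the same amount so the statewide two-party vote is 50%, then ask what share of seats each party would win.

# Hypothetical district-level Democratic shares of the two-party vote
dem_share <- c(0.62, 0.58, 0.55, 0.53, 0.48, 0.45, 0.41, 0.36)
swing <- 0.5 - mean(dem_share)            # uniform swing to a 50/50 statewide vote
dem_at_50 <- dem_share + swing
seat_share_dem <- mean(dem_at_50 > 0.5)   # seat share for the Democrats at 50% of the vote
partisan_bias <- seat_share_dem - 0.5     # 0 means symmetric; positive favors the Democrats
partisan_bias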

99:1

I received the following one-sentence email from a Ph.D. statistician who works in finance:

For every one time I use stochastic calculus I use statistics 99 times.

Not that stochastic calculus isn't important...after all, I go to the bathroom more often than I use statistics, but that doesn't mean we need Ph.D. courses in pooping...but still, it says something, I think.

Chris Paulse pointed me to this magazine article entitled "Bayes rules: a once-neglected statistical technique may help to explain how the mind works," featuring a paper by Thomas Griffiths and Joshua Tenenbaum (see quick review here by Michael Stastny). The paper examines the ability of people to use partial information to "make predictions about the duration or extent of everyday phenomena such as human life spans and the box-office take of movies." Griffiths and Tenenbaum find that many people's predictions can be modeled as Bayesian inferences.

This is all fine and interesting. Based on my knowledge of other experiments of this sort, I'm a bit skeptical--I suspect that different people use different sorts of reasoning to make these predictive estimates--but of course this is why psychologists such as Griffiths and Tenenbaum do research in this area.
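
For readers curious what such a prediction looks like when written out, here is a toy version of the calculation (my own sketch with an assumed prior, not Griffiths and Tenenbaum's actual model): you observe that something has lasted t units so far, treat the observation point as uniform over the total duration T, and combine that with a prior on T.

t_obs  <- 18                                      # it has lasted 18 units so far (made up)
T_grid <- 1:1000                                  # possible total durations
prior  <- dgamma(T_grid, shape = 2, rate = 0.05)  # assumed prior on the total duration
lik    <- ifelse(T_grid >= t_obs, 1 / T_grid, 0)  # uniform observation point within [0, T]
post   <- prior * lik / sum(prior * lik)
T_grid[min(which(cumsum(post) >= 0.5))]           # posterior median prediction of T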

Historical background

But I do have a problem with the article about this study that appeared in the Economist magazine. The article makes a big deal about the differences between the "Bayesian" and "frequentist" schools of statistics. First off, I'm surprised that they think that frequentist methods "dominate the field and are used to predict things as diverse as the outcomes of elections and preferences for chocolate bars." They should take a look at some recent issues of JASA or at the new book by Peter Rossi, Greg Allenby, and Rob McCulloch on Bayesian statistics and marketing. More generally, they don't seem to realize that psychologists have been modeling decision making as Bayesians for decades--that's the basis of the "expected utility model" of von Neumann etc.

The statistical distinction between "estimation" and "prediction"

But I have a more important point to make--or, at least, a more statistical point. Classical statistics distinguishes between estimation and prediction. Basically, you estimate parameters and you predict observables. This is not just a semantic distinction. Parameters are those things that generalize to future studies, observables are ends in themselves. If the joint distribution of all the knowns and unknowns is written as a directed acyclic graph, the arrows go from parameters to observables and not the other way around. Or, to put it another way, one instance of a parameter can correspond to many observables.

Anyway, frequentist statistics treats estimation and prediction differently. For example, in frequentist statistics, theta.hat is an unbiased estimate if E(theta.hat|theta) = theta, for any theta. But y.hat is an unbiased prediction if E(y.hat|theta) = E(y|theta) for any theta. Note the difference: the frequentist averages over y, but not over theta. (See pages 258-249, and the footnote on page 411, of Bayesian Data Analysis (second edition) for more on this.) Frequentist inference thus makes this technical distinction between estimation and prediction (for example, consider the terms "best linear unbiased estimation" and "best linear unbiased prediction").

Unfair to frequentists

Another way of putting this is: everybody, frequentists and Bayesians alike, agree that it's appropriate to be Bayesian for predictions. (This is particularly clear in time series analysis, for example.) The debate arises over what to do with estimation: whether or not to average over a distribution for those unknown thetas. So the Economist is misleading in describing the Griffiths/Tenenbaum result as a poke in the eye to frequentists. A good frequentist will treat these problems as predictions and apply Bayesian inference.

(Although, I have to admit, there is some really silly frequentist reasoning out there, such as the "doomsday argument"; see here, here, and here.)

1. Figure out your educational goals--what you want your students to be able to do once the semester is over. Write the final exam--something that you think the students should be able to do when the course is over, and that you think the students will be able to do.

2. Show the exam to some colleagues and former students to get a sense of whether it is too hard (or, less likely, too easy). Alter it accordingly.

3. Structure the course around the textbook. Even if it's not such a great textbook, do not cover topics out of order, and avoid skipping chapters or adding material except where absolutely necessary.

4. Set up the homework assignments for each week. Make sure that these include lots of practice in the skills needed to pass the exam.

5. Figure out what the students will need to learn each week to do the homeworks. Don't "cover" anything in class that you will not be having them do in hwks and exam. (You can do whatever you want in class, but if the students don't practice it, you're not really "covering" it.)

6. Write exams and homeworks so they are easy to grade. I also recommend having a short easy quiz every week in class. The students need lots of practice. This is at all levels, not just intro classes.

7. Every lecture, reinforce the message of what skills the students need to master. (To get an idea of how the students are thinking, picture yourself studying something you're not very good at but need to learn (swimming? Spanish? saxophone?). There are some skills you want to learn, you want to practice, but it's hard, and it helps if you are cajoled and otherwise motivated to continue.)

This is not the only way to go, of course; it's just my suggestion for a relatively easy way to get started and not get overwhelmed. Especially remember point 1 about the goals.

Some helpful resources:

First Day to Final Grade, by Curzan and Damour. This is nominally a book for graduate students but actually is relevant for all levels of teaching. Especially since I think that "lectures" should be taught more like "sections" anyway.

Teaching Statistics: A Bag of Tricks, by Gelman and Nolan

Pro bono statistical consulting?

Something I read about legal consulting made me think about statistical consulting.

Don Boudreaux (an economist) points to an op-ed by Josh Sheptow (a law student) who argues that "pro bono work by elite lawyers is a staggeringly inefficient way to provide legal services to low-income clients." Basically, he says that, instead of doing 10 hours of pro-bono work for a legal-aid society, a $500/hr corporate lawyer would be better off just working 10 more hours at the law firm and then donating some fraction of the resulting $5000 to the legal aid society. Sheptow writes, "For lawyers who have done pro bono work, cutting a check might not seem as glamorous as getting out in the trenches and helping low-income clients face to face, but it's a much more efficient way to deliver legal services to those in need." The discussants at Boudreaux's site make some interesting points, and I don't have much to add to the argument one way or another. But it did make me think about statistical consulting.

Statistical consulting

I have two consulting rates: a high rate for people who can afford it (and are willing to pay) and zero otherwise. Also, I'll give free consulting to just about anybody who walks in the door. My main criteria for the free consulting are that they're doing something which seems socially useful (i.e., it's something that I'd rather see done well than done poorly) and that they're planning to listen to my advice. (It's soooo frustrating to give lots of suggestions and then be ignored. One nice thing about being paid is that then they usually don't ignore you.) For paid consulting, I like the project to be interesting, and then, to be honest, I'm less stringent on the "socially useful" part--I just don't want to work for somebody who seems to be doing something really bad. (That's all from my perspective. I have no objection if other statisticians consult for organizations that I don't like.)

Oh yeah, and once I declined to consult for a waste-management company located in Brooklyn. Lost out on a chance for some interesting life experience there, I think.

Anyway, I don't really do any "pro bono" work that is comparable to providing legal services for indigents. I'm not quite sure what the equivalent would be, in statistical work. Regression modeling and forecasting for nonprofits? Statistical consulting in some legal cases? I don't know what tradition there is of public-service work in statistics. Of course I'd like to believe that almost all my work is public service, in some sense--giving students tools to work more effectively, participating in scientific research projects, and so forth--but that's not quite the same thing. And yes, we give to charity (thus following the "Sheptow" strategy), but I do think that, ideally, there'd be some more direct public-service aspect to what we do.

I blogged on this awhile ago and then recently Hal Stern and I wrote a little more on the topic. Here's the paper, and here's the abstract:

A common error in statistical analyses is to summarize comparisons by declarations of statistical significance or non-significance. There are a number of difficulties with this approach. First is the oft-cited dictum that statistical significance is not the same as practical significance. Another difficulty is that this dichotomization into significant and non-significant results encourages the dismissal of observed differences in favor of the usually less interesting null hypothesis of no difference.

Here, we focus on a less commonly noted problem, namely that changes in statistical significance are not themselves significant. By this, we are not merely making the commonplace observation that any particular threshold is arbitrary--for example, only a small change is required to move an estimate from a 5.1% significance level to 4.9%, thus moving it into statistical significance. Rather, we are pointing out that even large changes in significance levels can correspond to small, non-significant changes in the underlying variables. We illustrate with a theoretical and an applied example.

The paper has a nice little multiple-comparisons-type example from the biological literature--an example where the scientists ran several experiments and (mistakenly) characterized them by their levels of statistical significance.

Here's the bad approach:

[Figure: blackman1.png]

Here's something better, which simply displays estimates and standard errors, letting the scientists draw their own conclusions from the overall pattern:

[Figure: blackman2.png]

Here's something slightly better, using multilevel models to get a better estimate for each separate experiment:

[Figure: blackman3.png]

The original paper where these data appeared was really amazing in the extreme and unsupported claims it made based on non-significant differences in statistical significance.
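
To see the point of the abstract numerically, here is a toy example (the numbers are invented for illustration): one estimate is "statistically significant," the other is not, yet the difference between them is nowhere near significant.

est1 <- 25; se1 <- 10                 # z = 2.5: "significant"
est2 <- 10; se2 <- 10                 # z = 1.0: "not significant"
se_diff <- sqrt(se1^2 + se2^2)        # standard error of the difference, assuming independence
c(z1 = est1 / se1, z2 = est2 / se2, z_diff = (est1 - est2) / se_diff)
# the z-score for the difference is about 1.1, far from conventional significance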

As part of a "carbon trading" program, a program is being instituted to reduce energy use for streetlights in a developing country. Here's how it works: (1) "baseline" energy use is established for the existing street light system, (2) some of the lights will be replaced with new lights that are more energy efficient and will thus consume less energy, and (3) the company that does the installation will be reimbursed based on the reduction in consumption. (No reduction, no money).

Simple enough on paper, but we live in a messy world. For example, the electricity provided by the grid is often substantially below the nominal voltage, so the existing lamps (which do not include voltage regulators) often put out much less light than they should, but also consume less electricity than they should. The new lights include voltage regulators so they always operate at their nominal power consumption. It's entirely possible that replacing the old lights with the new ones will increase the light output but generate no energy savings (or even negative savings) and thus no reduction in carbon dioxide production.

One possibility would be to use new lamps that have the same light output as the current lamps, rather than the same nominal energy consumption. But it's not clear that the municipalities involved will agree to that, for one thing. (For instance, the voltage that is provided varies with time, so even though the existing lamps often operate well below their nominal light output, they sometimes do achieve it). Also, lamps only come in discrete steps of light output, so there may be no way to provide the same amount of light as is currently provided.

Another problem --- the one that prompted this blog entry --- is how to establish the baseline energy use, and determine the energy savings of the replacement lamps. Lamps are not individually metered, although meters can be installed temporarily (at some expense). The actual energy consumption of an existing lamp, and its light output, depend on the lamp's age and on the voltage that it gets. As mentioned above, the voltage varies with time...but it does so differently for different lamps, depending on the distance from the power plant and on the local electric loads. There seem to be no existing records on voltage-vs-time for any locations, much less for the large number of towns that might participate in this program.

2006

My New Year's resolutions:

The economics of prize-giving

Louis Menand writes in the New Yorker about literary prizes:

The Nobel Prize in Literature was the first of the major modern cultural prizes. It was soon followed by the Prix Goncourt (first awarded in 1903) and the Pulitzer Prizes (conceived in 1904, first awarded in 1917). The Academy of Motion Picture Arts and Sciences started handing out its prizes in 1929; the Emmys began in 1949, the Grammys in 1959. Since the nineteen-seventies, English says, there has been an explosion of new cultural prizes and awards. There are now more movie awards given out every year—about nine thousand—than there are new movies, and the number of literary prizes is climbing much faster than the number of books published. . . .

[James] English [author of "The Economy of Prestige"] interprets the rise of the prize as part of the “struggle for power to produce value, which means power to confer value on that which does not intrinsically possess it.” In an information, or "symbolic," economy, in other words, the goods themselves are physically worthless: they are mere print on a page or code on a disk. What makes them valuable is the recognition that they are valuable. This recognition is not automatic and intuitive; it has to be constructed.

That's all fine, but I wonder if there's something else going on, analogous to grade inflation: awards benefit the award-giver and the award-receiver. For example, the statistics department at Columbia instituted a graduate teaching award a few years ago. This is a win-win situation: the student who gets the award is happy about the recognition and can put it on his or her resume, but also it's good for the department--it can motivate all, or at least several, of our teaching assistants to try harder (to the extent that the award is transparently and fairly administered), and it can help our best graduate students get top jobs, which in turn is good for the department also. Similarly, awards within an academic discipline can motivate and recognize good work and also provide publicity and book sales for the field as a whole. And, unlike the Oscars, Nobel Prizes, etc., these little awards can be cheap to administer.

So, as with grade inflation, the immediate motivation of all parties is to increase the number of awards. But it can get out of control! For example, here is Princeton University Press's list of its books from 2005 that received awards. It's a looong list, including an amazing 13 different political science books that received a total of 18 awards. This many awards all in one place devalues all of them a bit, I think. (Not that I'm too proud to claim what awards I receive, of course...)

The real question?

Thus maybe the real question is not, Why so many awards?, but rather, Why is the number of awards increasing? If giving awards has always been such a great idea, why haven't there always been so many awards? I don't know, just as I don't know why there isn't more grade inflation, or why there aren't more campaign contributions.

P.S.

See here for other comments on this topic by Mark Thoma (link from New Economist). Thoma talks about awards as a signal to consumers, whereas I'm thinking more about the motivation of the organizations that give the awards.

Thoma's comments are interesting although I don't really know what he gets out of saying "A good is valuable because it yields utility." Isn't that just circular reasoning? Or is this just a truism that economists like to say, just as statisticians have our own cliches ("correlation is not causation," "statistical significance is not the same as practical significance," and so forth)?

I've been learning more about the lmer function (part of Doug Bates's lme4 package) in R--it's just so convenient, I think I'll be using it more and more as a starting point in my analyses (which I can then check using Bugs and, as necessary, Umacs). Anyway, I came across this interesting recent discussion in the R help archive:

I just finished Bernard Crick's wonderful 1980 biography of George Orwell. Lots of great stuff. Here's something from page 434:

Just before they moved to Mortimer Crescent [in 1942], Eileen [Orwell's wife] changed jobs. She now worked in the Ministry of Food preparing recipes and scripts for 'The Kitchen Front', which the BBC broadcast each morning. These short programmes were prepared in the Ministry because it was a matter of Government policy to urge or restrain the people from eating unrationed foods according to their official estimates, often wrong, of availability. It was the time of the famous 'Potatoes are Good For You' campaign, with its attendant Potato Pie recipes, which was so successful that another campaign had to follow immediately: 'Potatoes are Fattening'.

Crick continues,

Several studies have been performed in the last few years looking at the economic decisions of parents of sons, as compared to parents of daughters. For example, Tyler Cowen links to a report of a study by Andrew Oswald and Nattavudh Powdthavee that "provides evidence that daughters make people more left wing. Having sons, by contrast, makes them more right wing":

Professor Oswald and Dr Powdthavee drew their data from the British Household Panel Survey, which has monitored 10,000 adults in 5,500 households each year since 1991 and is regarded as an accurate tracker of social and economic change. Among parents with two children who voted for the Left (Labour or Lib Dem), the mean number of daughters was higher than the mean number of sons. The same applied to parents with three or four children. Of those parents with three sons and no daughters, 67 per cent voted Left. In households with three daughters and no sons, the figure was 77 per cent.

I've seen some other studies recently with similar findings--a few years ago, a couple of economists found that having daughters, as compared to sons, was associated with the probability of divorce, I think it was, and recently a study by Ebonya Washington found that for Congressmembers, those with daughters (as compared to sons) were more likely to have liberal voting records on women's issues.

Controlling for the number of children: an intermediate outcome

A common feature of all these studies is that they control for the total number of children. This can be seen in the quote above, for example: they compare different sorts of families with 2 kids, then make a separate comparison of different sorts of families with 3 kids.

At first sight, controlling for the total number of children seems reasonable. There is a difficulty, however, in that the total number of kids is an intermediate outcome, and controlling for it (whether by subsetting the data based on #kids or using #kids as a control variable in a regression model) can bias the estimate of the causal effect of having a son (or daughter).

To see this, suppose (hypothetically) that politically conservative parents are more likely to want sons, and if they have two daughters, they are (hypothetically) more likely to try for a third kid. In comparison, liberals are more likely to stop at two daughters. In this case, if you look at data on families with 2 daughters, the conservatives will be underrepresented, and the data could show a correlation of daughters with political liberalism--even if having the daughters has no effect at all!

A solution

A solution is to apply the standard conservative (in the statistical sense!) approach to causal inference, which is to regress on your treatment variable (sex of kid) but controlling only for things that happen before the kid is born. For example, one could compare parents whose first child is a girl to parents whose first child is a boy. One can also look at the second birth, comparing parents whose second child is a girl to those whose second child is a boy--controlling for the sex of the first child. And so on for third child, etc.

The modeling could get interesting here, since there is a sort of pyramid of coefficients (one for the first-kid model, two for the second-kid model (controlling for first kid), and so forth). It might be reasonable to expect coefficients to gradually decline (I assume the effect of the first kid would be the biggest), and one could estimate that with some sort of hierarchical model.
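
To make the hypothetical bias concrete, here is a small simulation sketch (all the numbers are invented): the kids' sexes have no effect on politics, but conservatives with two daughters are more likely to have a third child. Conditioning on families who stopped at two kids then manufactures a daughters-and-liberalism association, while the first-child comparison described above does not.

set.seed(1)
n <- 100000
left  <- rbinom(n, 1, 0.5)          # votes left; determined before any kids are born
girl1 <- rbinom(n, 1, 0.49)         # sex of first child
girl2 <- rbinom(n, 1, 0.49)         # sex of second child
p_third <- ifelse(girl1 + girl2 == 2 & left == 0, 0.6, 0.3)
third <- rbinom(n, 1, p_third)      # hypothetical stopping rule
two_kid <- third == 0

# Biased analysis: condition on the intermediate outcome (stopping at two kids)
tapply(left[two_kid], girl1[two_kid] + girl2[two_kid], mean)
# two-daughter families look more left-wing, even though there is no effect

# Cleaner analysis: condition only on what comes before the "treatment"
summary(lm(left ~ girl1))           # coefficient on girl1 is essentially zero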

Summary

I'm not saying that all these researchers are wrong; merely that, by controlling for an intermediate outcome, they're subject to a potential bias. Also they could redo their analyses without much effort, I think, to fix the biases and address this concern. I hope they do so (and inform me of their results).

It's an interesting example because we all know not to control for intermediate outcomes, but the total # of kids somehow doesn't look like that, at first.

P.S.

See here for more discussion of the U.K. voting example.

Sketching realistic graphs

I was trying to draw Bert and Ernie the other day, and it was really difficult. I had pictures of them right next to me, but my drawings were just incredibly crude, more "linguistic" than "visual" in the sense that I was portraying key aspects of Bert and Ernie but in pictures that didn't look anything like them. I knew that drawing was difficult--every once in a while, I sit for an hour to draw a scene, and it's always a lot of work to get it to look anything like what I'm seeing--but I didn't realize it would be so hard to draw cartoon characters!

This got me to thinking about the students in my statistics classes. When I ask them to sketch a scatterplot of data, or to plot some function, they can never draw a realistic-looking picture. Their density functions don't go to zero in the tails, the scatter in their scatterplots does not match their standard deviations, E(y|x) does not equal their regression line, and so forth. For example, when asked to draw a potential scatterplot of earnings vs. height, they have difficulty with the x-axis (most people are between 60 and 75 inches in height) and having the data consistent with the regression line, while having all earnings be nonnegative. (Yes, it's better to model on the log scale or whatever, but that's not the point of this exercise.)

Anyway, the students just can't make these graphs look right, which has always frustrated me. But my Bert and Ernie experience suggests that I'm thinking of it the wrong way. Maybe they need lots and lots of practice before they can draw realistic functions and scatterplots.

More on historical evidence

I commented a few weeks ago on Andy Nathan's review of a recent biography of Mao. Beyond the inherent interest of the subject matter, Nathan's review was interesting to me because it explored questions of the reliability of historical evidence.

Those of you who are interested in this sort of thing might be interested in the following exchange of letters in a recent issue of the London Review of Books. It's an entertaining shootout between Jung Chang and Jon Halliday on one side, and Andy Nathan on the other.

Recent studies by police departments and researchers confirm that police stop racial and ethnic minority citizens more often than whites, relative to their proportions in the population. However, it has been argued that stop rates more accurately reflect rates of crimes committed by each ethnic group, or that stop rates reflect elevated rates in specific social areas such as neighborhoods or precincts. Most of the research on stop rates and police-citizen interactions has focused on traffic stops, and analyses of pedestrian stops are rare. In this paper, we analyze data from 125,000 pedestrian stops by the New York Police Department over a fifteen-month period. We disaggregate stops by police precinct, and compare stop rates by racial and ethnic group controlling for previous race-specific arrest rates. We use hierarchical multilevel models to adjust for precinct-level variability, thus directly addressing the question of geographic heterogeneity that arises in the analysis of pedestrian stops. We find that persons of African and Hispanic descent were stopped more frequently than whites, even after controlling for precinct variability and race-specific estimates of crime participation.
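
For readers who like to see the model written out, the multilevel adjustment described in the abstract might look roughly like the following call (a sketch with hypothetical data and variable names, not the paper's actual code): stop counts by ethnic group, with precinct-level variation and the previous period's race-specific arrests entering as an offset.

library(lme4)
# stops: number of stops; eth: ethnic group; precinct: precinct ID;
# past_arrests: prior race-specific arrests (all hypothetical names)
fit <- glmer(stops ~ eth + (1 | precinct),
             offset = log(past_arrests),
             family = poisson, data = frisk)
summary(fit)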

Here's the paper (by Jeff Fagan, Alex Kiss, and myself) with the details. The work came out of a project we did with the New York State Attorney General's Office a few years ago.

If you're interested in this topic, you might also take a look at Nicola Persico's page on police stops data. (Our dataset had confidentiality restrictions so we couldn't place it on Nicola's site.)

Applications of belief functions?

Mauro Caputo writes,

I was searching on-line for some useful tutorials on Dempster-Shafer theory. I need to get up to speed quickly on this area to see if I can apply it to a particular problem. I found your paper on "The boxer, the wrestler, and the coin flip", and kinda got stuck on the belief functions section. Is there an undergrad textbook or tutorial paper on Dempster-Shafer that I could use to teach myself enough to understand and apply DST? I haven't been able to find one.

I referred him to the references in my paper, but perhaps someone knows of something more recent and more applied than those references?

Yingnian writes,

I have accumulated quite a number of intuitions about Monte Carlo in particular, and statistics in general, for teaching purposes. I think they can be made obvious to elementary school children.

MCMC can be visualized by a population of say 1 million people immigrating from state to state (e.g., 51 states on US territory) where transition prob is like the fraction of people in California who will move to Texas the next day. So p^{(t)} is a movie of population distribution over time. If all 1 million people start from California, the population will spread over all 51 states, and stabilize at a distribution which is the stationary distribution.

Metropolis-Hastings is just a treaty between the visa offices of any two countries x and y. It can be any c(x, y) = c(y, x), and the fraction of people who get the visa is c(x, y)/pi(x)T(x, y), since pi(x)T(x, y) is the number of people who go to apply for visas. To choose c(x, y) = min(pi(x)T(x, y), pi(y)T(y, x)) is just to maximize the immigration flow. Here the detailed balance means there is a balance between any two states, like China and US. There are more Chinese people applying for visas to the US than vice versa, so we can let all US people go to China, but only accept a fraction of Chinese people to go to the US (I just use this example for the benefit of intuition, without any disrespect for US or Chinese people). This is the acceptance probability.
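
For anyone who wants to see the visa-office analogy in running code, here's a toy Metropolis sampler in R (my own sketch, not Yingnian's; the target distribution over the five "states" is made up, and the proposal is symmetric so the T(x, y) terms cancel):

# Toy Metropolis sampler over 5 discrete "states": propose a move,
# grant the "visa" with probability min(1, pi(y)/pi(x)).
set.seed(1)
pi_target <- c(0.4, 0.3, 0.15, 0.1, 0.05)   # assumed stationary distribution
n_iter <- 1e5
x <- 1                                      # everyone starts in "California"
draws <- integer(n_iter)
for (t in 1:n_iter) {
  y <- sample(1:5, 1)                                   # symmetric proposal: pick any state
  if (runif(1) < pi_target[y] / pi_target[x]) x <- y    # visa granted
  draws[t] <- x
}
round(table(draws) / n_iter, 2)             # should be close to pi_target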

Politically Incorrect Statistics

Adi Wyner, Dean Foster, Shane Jensen, and Dylan Small at the University of Pennsylvania have started a new statistics blog, Politically Incorrect Statistics. My favorite entry so far compares string theory to intelligent design. While I'm linking, the Journal of Obnoxious Statistics looks like fun, although I haven't read many of the 101 pages yet.

Numbers

When I tell people about my work, by far the most common response is "Oh, I hated statistics in college." We've been over that before. Sometimes someone will ask me to explain the Monty Hall problem. Anyway, another one I've been getting a lot lately is whether I watch the show Numbers. I've never seen it (I don't have cable), but I'm a little curious about it--does anyone out there watch it? Is it any good? Just wondering....

Statisfaction

My friend Mark Glickman (I call him Glickman; Andrew calls him Smiley) has some fun statistical song parodies. When I was taking Bayesian Data Analysis in graduate school, he came in as a guest lecturer one day and sang them for our class. It was really fun--I don't think there's anywhere near enough silliness in most statistics classes. Click here for some music and lyrics.

More on Matching

Dan Ho, Kosuke Imai, Gary King, and Liz Stuart have a new paper on matching methods for causal inference. It has lots of practical advice and interesting examples, and I predict that it will be widely read and cited. Check it out here.

...and on a completely unrelated note, Happy Birthday, Mom!!

Drunken rooks, etc.

I was telling my class the other day about the Gibbs sampler and Metropolis algorithm and was describing them as random walks--for example, with the Gibbs sampler you consider a drunk who is walking at random on the Manhattan street grid (in the hypothetical world in which the Manhattan street grid is rectilinear). But then I realized that's not right, because a Gibbs jump can go as far as necessary in any direction--it is not limited to one step. It's actually the path of a drunken rook! From there, it's natural to think of the Metropolis algorithm as a drunken knight. Or maybe a drunken king, if the jumps are small.

To take this analogy further: parameter expansion (i.e., redundant parameterization) allows rotations of parameter space, thus allowing the rooks to move diagonally as well as rectilinearly--thus, a drunken queen! I don't know if these analogies helped the students, but I like them.
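
In case the chess metaphor is too cute, here's a minimal R sketch of the drunken rook in action: a Gibbs sampler for a bivariate normal (a standard textbook example, not anything specific to my class), where each move goes as far as it likes along one coordinate, drawn from the full conditional:

# Gibbs sampler for a bivariate normal with correlation rho: each step
# is a "rook move" along one axis, drawn from the full conditional.
set.seed(1)
rho <- 0.8
n_iter <- 5000
x <- matrix(NA, n_iter, 2)
x1 <- 0; x2 <- 0
for (t in 1:n_iter) {
  x1 <- rnorm(1, rho * x2, sqrt(1 - rho^2))   # jump anywhere along axis 1
  x2 <- rnorm(1, rho * x1, sqrt(1 - rho^2))   # jump anywhere along axis 2
  x[t, ] <- c(x1, x2)
}
cor(x[, 1], x[, 2])    # should be near 0.8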

Slam Dunks and No-Brainers

Encouraged by Carrie's plug, I read Leslie Savan's book, "Slam Dunks and No Brainers":

slamdunks.jpg

It's an entertaining and thought-provoking look at "pop language," a particular kind of enjoyable and powerful cliche that we use in speech (and sometimes in writing) to convey "attitude" (that is, 'tude). I'm not quite sure where the boundary falls between rote phrases (e.g., "entertaining and thought-provoking"), cliches (e.g., "it was raining like hell out there"), pop language (e.g., "chill out, dude"), and jokes (e.g., "he's not the sharpest tool in the shed"). But I think the idea is that pop phrases are sometimes just fun to say (sort of like the linguistic equivalent of putting on a fancy outfit or driving a sports car) or powerful ("chill out" is hard to respond to!). Reading the book was fun, sort of the way it's fun to see a movie that was shot in one's hometown, or the way it's fun (and disturbing) to see logos for McDonalds etc. in foreign countries.

Savan argues that we've gone overboard with these phrases (I think that "going overboard" is a cliche but not a pop phrase) and that they make us dumber, substituting pre-cooked thoughts for original thoughts, substituting scripted exchanges for spontaneous interactions, and so forth. I don't have anything to say about this claim--except that I'm not sure exactly how it could be studied, and I would also consider making the opposite claim, which is that cliched thought-modules actually allow us to express more sophisticated concepts in plainer language (see here and here). She makes her argument from a politically-liberal perspective (by using pop phrases, we're acting more like passive consumers than like politically-involved citizens), but I think a similar case could be made from the conservative direction in terms of loss of traditional values.

As always, the important question is, How does this relate to statistics teaching?

But my main point of reference when reading this book was my experience as a statistics teacher. I use these phrases in class all the time, and I enjoy when my students use them too. (The first time I ever heard "Woo-hoo" uttered by anyone other than Homer Simpson was in 1994, when a student used the phrase in my decision analysis class.) Why do I use pop phrases? They're fun to say, they help project my image as a cool, down-to-earth dude ("down-to-earth" is a cliche, I think; "dude" is pop), and they make the students laugh, thus relaxing their muscles so I can cram that extra bit of statistics into them (just kidding on that one, but it is pleasant to hear them laugh; more on this below).

When pop phrases don't pop

One of the themes of Savan's book is that pop talk is so fun and powerful, it's no surprise that we do more and more of it. But it can backfire. Maybe simply by ticking people off, but also because many of my graduate students and colleagues are not from the United States, and they "just don't get it" sometimes. Just as our kickball references fall flat to people who were not kids in American schoolyards, similarly, those without our backgrounds (or just of different age groups) won't appreciate references to "identical cousins" or that episode on the Brady Bunch where Peter's voice changed. Or even many of the phrases in Savan's book. I've found even Canadians to be baffled by what we would consider fundamental pop phrases (and I'm sure I'd be baffled by theirs, too).

This really comes up when I give talks to foreign audiences. Even setting aside language difficulties and the need to speak slowly, I have to tone down my popisms. Also in other settings . . . for example, the profs in the stat dept here are almost all from other countries. Once in a faculty meeting, I responded to someone's statement with a fast, high-pitched, Eddie-Murphy-style, "Get the fuck outta here." My colleagues were offended. They didn't catch the Murphy reference and thought I was saying "Fuck" to this guy. Which, of course, I wasn't.

For a teacher, the other drawback to pop phrases is that they can be more memorable than the points they are used to illustrate. This probably happens all the time with me. My tendency is to go for the easy laugh--and the bar is set so low in a statistics class that just about any pop reference will get a laugh--without always keeping my eyes on the prize (cliche, not pop phrase, right?), which is to give the students the tools they need to be able to solve problems. (That last phrase sounds like a cliche, I know it does, but it's not, I swear!)

Anyway, that's my point, my one suggested addition to Savan's book: pop language can actually impede conversation, get in the way of conveying meaning, when we speak to people outside the in-group. I think this will always limit the extent of pop phrases, at least for those of us who need to communicate to people from other countries.

Really, really unrelated to statistics

I just found this (at Daniel Radosh's webpage) to be hilarious. Also all the links, like this and this. Actually, my favorite was this, which for convenience I'll copy in its entirety, first the picture, then the words.

A10836.jpg

Stef writes,

In the preparation of my inaugural lecture I am searching for literature on what I would, for the moment, call "modular statistics". It is widely accepted that everything that varies in the model should be part of the loss function/likelihood, etc. For some reasonably complex models, this may lead to estimation errors and, worse, to problems in the interpretation, or to possibilities to tamper with the model.

It is often possible to analyse the data sequentially, as a kind of poor man's data analysis. For example, first FA then regression (instead of LISREL), first impute, then complete-data analysis (instead of EM/Gibbs), first quantify then anything (instead of gifi-techniques), first match on propensity, then t-test (instead of correction for confounding by introducing covariates), and so on. This is simpler and sometimes conceptually more defensible, but of course at the expense of fit to the data.

It depends on the situation whether the latter is a real problem. There must be statisticians out there that have written on the factors to take into account when deciding between these two strategies, but until now, I have been unable to locate them. Any idea where to look?

My reply:

One paper you can look at is by Xiao-Li Meng in Statistical Science in 1994, on the "congeniality" of imputations, which addresses some of these issues.

More generally, it is a common feature of applied Bayesian statistics that different pieces of information come from different sources and are then combined. For example, you get the "data" from one study and the "prior" from a literature review. Full Bayes would imply analyzing all the data at once, but in practice we don't always do that.

Even more generally, I've long thought that the bootstrap and similar methods have this modular feature. (I've called it the two-model approach, but I prefer your term "modular statistics"). The bootstrap literature is all about what bootstrap replication to use, what's the real sampling distribution, should you do parametric or nonparametric bootstrap, how to bootstrap with time series and spatial data, etc. But the elephant in the room that never gets mentioned is the theta-hat, the estimate that's getting "bootstrapped." I've always thought of bootstrap as being "modular" (to use your term) because the model (or implicit model) used to construct theta-hat is not necessarily the model used for the bootstrapping.

Sometimes bootstrapping (or similar ideas) can give the wrong answer but other two-level models can work well. For example, in our 1990 JASA paper, Gary King and I were estimating counterfactuals about elections for Congress. We needed to set up a model for what could have happened had the election gone differently. A natural approach would have been to "bootstrap" the 400 or so elections in a year, to get different versions of what would have happened. But that would have been wrong, because the districts were fixed and not a sample from a larger population. We did something better, which was to use a hierarchical model. It was challenging because we had only 1 observation per district (it was important for our analysis to do a separate analysis for each year), so we estimated the crucial hierarchical variance parameter using a separate analysis of data from several years. Thus, a modular model. (We also did some missing data imputation.) So this is an example where we really used this approach, and we really needed to. Another reference on it is here.

Even more generally, I've started to use the phrase "secret weapon" to describe the policy of fitting a separate model to each of several data sets (e.g., data from surveys at different time points) and then plotting the sequence of estimates. This is a form of modular inference that has worked well for me. See this paper (almost all of the graphs, also the footnote on page 27) and also this blog entry. In these cases, the second analysis (combining all the inferences) is implicit, but it's still there. Sometimes I also call this "poor man's Gibbs".
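
For concreteness, the secret weapon might look something like this in R (a sketch only; "surveys", "y", and "x" are hypothetical names, and the real examples are in the linked paper):

# "Secret weapon" sketch: fit the same model separately to each year's
# data, then plot the sequence of coefficient estimates +/- 1 s.e.
years <- sort(unique(surveys$year))
est <- se <- rep(NA, length(years))
for (i in seq_along(years)) {
  fit <- lm(y ~ x, data = subset(surveys, year == years[i]))
  est[i] <- coef(fit)["x"]
  se[i]  <- summary(fit)$coefficients["x", "Std. Error"]
}
plot(years, est, ylim = range(est - se, est + se), pch = 20,
     xlab = "year", ylab = "coefficient of x")
segments(years, est - se, years, est + se)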

Stef replied,

Eric Tassone sent me this graph from a U.S. Treasury Department press release. The graph is so ugly I put it below the fold.

Anyway, it got me to thinking about a famous point that Orwell made, that one reason that propaganda is often poorly written is that the propagandist wants to give a particular impression while being vague on the details. But we should all be aware of how we write:

[The English language] becomes ugly and inaccurate because our thoughts are foolish, but the slovenliness of our language makes it easier for us to have foolish thoughts.

Tufte made a similar point about graphs for learning vs. graphs for propaganda.

Here are (some) of the errors in the Treasury Department graph:

1. Displaying only 3 years gives minimal historical perspective. This is particularly glaring given that the average rate of unemployment displayed is computed from 1960 to 2005.
2. Jobs are not normalized to population. Actually, since unemployment rate is being displayed, it's not clear that anything is gained by showing jobs also.
3. Double-y-axis graph is very hard to read and emphasizes a meaningless point on the graph where the lines cross. Use 2 separate graphs (or, probably better, just get rid of the "jobs" line entirely).
4. Axes are crowded. x-axis should be labeled every year, not every 2 months. y-axis could just have labels at 4%, 5%, 6%. And if you want to display jobs, display them in millions, not thousands! I mean, what were they thinking??
5. Horizontal lines at 129, 130, etc., add nothing but clutter.

To return to the Orwell article, I think there are two things going on. First, there's an obvious political motivation for starting the graph in 2003. And also for not dividing jobs by population. But the other errors are just generic poor practice. And, as Junk Charts has illustrated many times, it happens all the time that people use too short a time scale for their graphs, even when they get no benefit from doing so. So, my take on it: there are a lot of people out there who make basic graphical mistakes because they don't know better. But when you're trying to make a questionable political point, there's an extra motivation for being sloppy. This sort of graph is comparable to the paragraphs that Orwell quoted in "Politics and the English Language": the general message is clear, but when you try to pin down the exact meanings of the words, the logic becomes less convincing.
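
For what it's worth, points 3 and 4 above translate into something like the following R sketch (with hypothetical year and unemp vectors; I haven't re-plotted the actual Treasury numbers):

# Rough sketch of the suggested fix: drop the "jobs" line, show the
# unemployment series on its own, label the x-axis by year, and keep
# the y-axis labels sparse.
plot(year, unemp, type = "l", xaxt = "n", yaxt = "n",
     xlab = "", ylab = "unemployment rate (%)")
axis(1, at = seq(min(year), max(year), by = 1))   # one tick per year, not every 2 months
axis(2, at = 4:6)                                 # just 4%, 5%, 6%
abline(h = mean(unemp), lty = 2)                  # long-run average, for context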

OK, here's the graph:

Stef van Buuren, Jaap Brand, C.G.M. Groothuis, and Don Rubin wrote a paper evaluating the "chained equations" method of multiple imputation--that is, the method of imputing each variable using a regression model conditional on all the others, iteratively cycling through all the variables that contain missing data. Versions of this "algorithm" are implemented as MICE (which can be downloaded directly from R) and IVEware (a SAS package). (I put "algorithm" in quotes because you still have to decide what model to use--typically, what variables to include as predictors--in each of the imputation steps.)
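
For anyone who wants to try it, here's a minimal example using the mice package in R (this sketch uses the nhanes example data set that ships with the package and mice's default conditional models; in a real application you'd still think about which predictors belong in each imputation step):

# Minimal chained-equations example: impute each incomplete variable
# conditional on the others, fit the analysis model to each completed
# data set, then combine the results.
library(mice)
imp <- mice(nhanes, m = 5, printFlag = FALSE)   # 5 imputed data sets
fit <- with(imp, lm(bmi ~ age + hyp))           # analysis model on each completed data set
summary(pool(fit))                              # combine using Rubin's rules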

Here's the paper, and here's the abstract:

The use of the Gibbs sampler with fully conditionally specified models, where the distribution of each variable given the other variables is the starting point, has become a popular method to create imputations in incomplete multivariate data. The theoretical weakness of this approach is that the specified conditional densities can be incompatible, and therefore the stationary distribution to which the Gibbs sampler attempts to converge may not exist. This study investigates practical consequences of this problem by means of simulation. Missing data are created under four different missing data mechanisms. Attention is given to the statistical behavior under compatible and incompatible models. The results indicate that multiple imputation produces essentially unbiased estimates with appropriate coverage in the simple cases investigated, even for the incompatible models. Of particular interest is that these results were produced using only five Gibbs iterations starting from a simple draw from observed marginal distributions. It thus appears that, despite the theoretical weaknesses, the actual performance of conditional model specification for multivariate imputation can be quite good, and therefore deserves further study.

Here are Stef's webpages on multiple imputation. Multiple imputation was invented by Don Rubin in 1977.

mipub.GIF

Another one from Junk Charts

Bad graph:

ecbarroombrawl.gif

Good graph:

redoalcohol.png

As is often the case in these situations, the good graph takes up less space, is easier to understand, and is easier to construct.

P.S. I think the "good graph" could be made even better by labeling the y-axis using round numbers. I don't think the exact numbers as displayed there are so helpful. Also, I'd convert to a more recognizable scale. Instead of liters per year, perhaps ounces per day, or the equivalent of glasses of wine per week?

Multilevel model

Gregor Gorjanc writes,

My colleague has biological data measuring the degree of DNA damage in cells (Olive Tail Moment - OTM). These data are gathered via the so-called comet assay test and used to detect genotoxicity of various chemicals, environmental waters, soil, ... The test (this is a very imprecise description, but should show the point) is conducted in such a way that, say, we take blood samples from 10 animals, where the first 5 animals are under the treatment of interest and the other five are used as controls. A specified type of cell is extracted from each blood sample, "processed", and finally a set of those cells (usually around 100) is scored for OTM.
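
The natural starting point for data like these is a multilevel model with cells nested within animals and the treatment defined at the animal level. Here's a minimal sketch in R using lmer (a hypothetical data frame otm_data with one row per cell; whether and how to transform the skewed, nonnegative OTM scores is a separate modeling question):

# Multilevel sketch for the comet-assay data: ~100 cells per animal,
# animals assigned to treatment or control.
library(lme4)
fit <- lmer(otm ~ treatment + (1 | animal), data = otm_data)
summary(fit)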

Rules of historical evidence

Our Columbia colleague Andy Nathan recently wrote a review of a new biography of Mao. My interest here lies not so much in the subject matter (important though it is) but rather in Nathan's comments about historical research methods, in particular the key issue of what can or should be believed.

We finally finished the paper (and here's the earlier blog posting with link to a powerpoint).

superplot_var_slopes_annen_2000.png

SAFE

I don't need art to be work-related. In fact, I generally prefer that it's not. But there's an exhibition at MOMA called SAFE: Design Takes On Risk, that looks pretty cool. Items range from practical (chairs with well-placed hooks to hide a purse) to pseudo-practical (suitcase-like containers to keep bananas from getting bruised) to borderline neurotic (slip-on fork covers). And earplugs. Lots of earplugs. For those who don't live in the city or just don't want to shell out the $20 entrance fee, there's an online exhibition: http://moma.org/exhibitions/2005/safe/.

I recently met Carlos Davidson, a prof at Cal State University. He studies amphibians, with a special interest in why frogs in California are disappearing. He said that he can "predict quite well whether a site will have frogs, based on the pesticide use upwind" and that he thinks that pesticides are a big part of the problem. But he also said that others in his field are far from convinced. What should it take to be convincing? Is there a "statistical" answer to questions like, which is more important: lab work, more field work, more analysis of existing field data (perhaps with more covariates included)?

Geek Alert

Last week I substitute professed a mathematical statistics course for a friend who was out of town. I was sort of dreading it: interpretation of confidence intervals, Fisher information, AND hypothesis tests, all in one class, less than 24 hours before the start of Thanksgiving break. I didn't have high hopes for the enthusiasm level in the room. BUT it was actually pretty fun. The Cramer-Rao inequality? It's really cool that there's a derivable bound on the variance of an unbiased estimator, and even cooler that that bound happens to be the inverse of the Fisher information. It's not the kind of stuff that comes up much in my own work or that I'd want to do research on myself, but I got a kick out of teaching it.
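
If you want to see the bound in action rather than just prove it, a two-minute simulation does it. Here's a sketch in R for the normal mean with known sigma, where the Fisher information is n/sigma^2 and the sample mean attains the bound exactly (my own toy example, not from the course):

# Simulation check of the Cramer-Rao bound for a normal mean with known
# sigma: no unbiased estimator can have variance below sigma^2/n, and
# the sample mean hits that bound.
set.seed(1)
n <- 25; sigma <- 2; n_sims <- 1e5
xbar <- replicate(n_sims, mean(rnorm(n, mean = 0, sd = sigma)))
var(xbar)          # simulated variance of the sample mean
sigma^2 / n        # Cramer-Rao lower bound: 0.16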

Economics and voter irrationality

During my visit to George Mason University, Bryan Caplan gave me a draft of his forthcoming book, "The logic of collective belief: the political economy of voter irrationality." The basic argument of the book goes as follows:

(1) It is rational for people to vote and to make their preferences based on their views of what is best for the country as a whole, not necessarily what they think will be best for themselves individually.
(2) The feedback between voting, policy, and economic outcomes is weak enough that there is no reason to suppose that voters will be motivated to have "correct" views on the economy (in the sense of agreeing with the economics profession).
(3) As a result, democracy can lead to suboptimal outcomes--foolish policies resulting from foolish preferences of voters.
(4) In comparison, people have more motivation to be rational in their economic decisions (when acting as consumers, producers, employers, etc.). Thus it would be better to reduce the role of democracy and increase the role of the market in economic decision-making.

Caplan says a lot of things that make sense and puts them together in an interesting way. Poorly-informed voters are a big problem in democracy, and Caplan makes the compelling argument that this is not necessarily a problem that can be easily fixed--it may be fundamental to the system. His argument differs from that of Samuel Huntington and others who claimed in the 1970s that democracy was failing because there was too much political participation. As I recall, the "too much democracy" theorists of the 1970s saw a problem with expectations: basically, there is just no way for "City Hall" to be accountable to everyone, thus they preferred limiting things to a more manageable population of elites. Caplan thinks that voting itself (not just more elaborate demands for governmental attention) is the problem.

Bounding the arguments

I have a bunch of specific comments on the book but first want to bound its arguments a bit. First, Caplan focuses on economics, and specifically on economic issues that economists agree on. To the extent the economists disagree, the recommendations are less clear. For example, some economists prefer a strongly graduated income tax, others prefer a flat tax. Caplan would argue, I think, that tax rates in general should be lowered (since that would reduce the role of democratic government in the economic sphere) but it would still be up to Congress to decide the relative rates. This isn't a weakness of Caplan's argument; I'm just pointing out a limitation of its applicability.

More generally, non-economic issues--on which there is no general agreement by experts--spread into the economic sphere. Consider policies regarding national security, racial discrimination, and health care. Once again, I'm not saying that Caplan is wrong in his analysis of economic issues, just that democratic governments do a lot of other things. (At one place he points out that the evidence shows that voters typically decide whom to vote for based on economic considerations. But, even though the economy might be decisive on the margin, that doesn't mean these other issues don't matter.)

Finally, Caplan generally considers democracy as if it were direct. But I think representative democracy is much different from direct democracy. Caplan makes some mention of this, the idea that politicians have some "slack" in decision-making, but I suspect he is understating the importance of the role of politicians in the decision-making process.

Specific comments

Objects of the class "Weekend at Bernie's"

"Weekend at Bernie's" is a low-quality movie that nobody's seen but everybody knows what it's about. Are there many other examples of this sort of cultural artifact? Another is Woody Allen's movie Zelig, which didn't get great reviews or great box office, but once again, its theme is well known. It's tough for me to think of lots of examples of this sort. The key is to have all 3 features:
(1) Generally acknowledged to be of low quality
(2) Not particularly popular or successful
(3) Basic storyline or theme is well known.

For example, James Joyce's Ulysses satisfies (2) and (3) but not (1); similarly (at a lower level) with Edward Scissorhands. The Bridges of Madison County satisfies (1) and (3) but not (2), and of course lots of things satisfy (1) and (2) but not (3). I also don't want to include "so-bad-it's-good" kinds of artifacts like "Plan 9 from Outer Space" which are famous because of their crappiness.

I'm really thinking of things like Weekend at Bernie's, which sought, and found, low-to-moderate success but had a storyline with such a good "hook" that most of the people who know about it didn't actually see the movie (or read the book, or whatever). It's not such a mystery that lots of people know what Ulysses is about--it's a Great Book, so we've heard about it. And it's not such a mystery that lots of people know what The Bridges of Madison County is about--lots of people bought the book. But Weekend at Bernie's (or Zelig)--they must have really good gimmicks to be so well known.

Questions about Futarchy

One of the people I met in my visit to George Mason University was Robin Hanson. At lunch we had a lively conversation about democracy--Hanson thinks it's overrated! When I (innocently) told him that representative democracy seemed better than the alternatives, he pointed out that there are some successful alternatives out there. For example, Microsoft is a (de-facto) dictatorship, and it does pretty well. Did I think Microsoft would work better as a democracy, he asked? Well, hmm, I don't know much about Microsoft, but yeah, I suppose that if some element of representative democracy were included, where employees, customers, and other "stakeholders" could vote for representatives that would have some vote in how things were run, maybe that would work well. At this point, someone else at lunch (sorry, I don't remember his name) objected and said that only shareholders should have the vote. I said maybe that's true about "should," but in terms of actual outcomes I wouldn't be surprised if things could be improved by including employees (and suppliers, customers, etc) in the decision-making. Bringing it back to the main discussion, Robin pointed out that I had retreated from the claim that "democracy is best" to the claim that "some democracy could make things a little better." He said that his point about democracy's problems leads him to want to restrict the range of powers given to a democratic government and make the private sphere larger.

I don't really know what to say about that--the distinction I was making was between "pure democracy" and "representative democracy." My impression from the work in social and cognitive psychology on information aggregation is that representative democracy will work better than dictatorship or pure democracy.

Anyway, Robin gave me a copy of his paper proposing decision-making using betting markets. It's an interesting paper, sort of a mix of a policy proposal and a criticism of our current political system. Here's the abstract:

Democracies often fail to aggregate information, while speculative markets excel at this task. We consider a new form of governance, wherein voters would say what we want, but speculators would say how to get it. Elected representatives would oversee the after-the-fact measurement of national welfare, while market speculators would say which policies they expect to raise national welfare. Those who recommend policies that regressions suggest will raise GDP should be willing to endorse similar market advice. Using a qualitative engineering-style approach, we present three scenarios, consider thirty design issues, and then present a more specific design responding to those concerns.

It's an interesting paper and I have a few comments and questions:
- On page 8, Hanson notes that "betting markets beat major opinion polls." I think betting markets are great, but comparing to opinion polls is a little misleading--a poll is a snapshot, not a forecast (see here for elaboration on this point).
- On page 10, Hanson proposes using GDP as a measure of policy success. When I read this, I thought, why not just use some measure of "happiness," as measured in a poll, for example? One problem with a measure from a survey is that then the survey response itself becomes a political statement, so if, for example, you oppose the current government, you might be more likely to declare yourself "unhappy" for the purpose of such a poll. Joe Bafumi has found such patterns in self-reports of personal financial situations. GDP, on the other hand, can't be so easily manipulated. For the purposes of Robin's paper, I guess my point is that these properties of the "success measure" are potentially crucial.
- On page 11, Hanson says, "an engineer [as compared to a scientist] is happy to work on a concept with a five percent chance of success, if the payoff from success would be thirty times the cost of trying." I would hope that a scientist would think that way too!
- On pages 11-12, he says that "scientists usually have little use for prototypes and their tests..." This may be true of some scientists, but "prototypes" (in the sense of data analyses that illustrate new or untested methods) play a huge role in statistics. In fact, this may characterize most of my own published papers!
- On page 12, Hanson writes that "most corporations are in effect small democratic governments." However, the vote of stockholders is not representative democracy as in U.S. politics, with defined districts, regularly scheduled elections for representatives, and so forth. I think this makes a big difference.

Now for my larger questions, which I think reflect my confusion about how this proposal would actually be implemented.

- Choice of policies to evaluate. I don't quite see how you would decide which potential policies get a chance of being evaluated in the prediction market. There could be potentially thousands or millions of policies to compare, right? On page 26, Hanson suggests limiting these via a $100,000 fee, but this would seem to shut a lot of people out of the system. (Of course, Hanson might reply that the current system, in which politicians from Bloomberg to Bush can parlay money into votes, also has this problem. And I would agree. I'm just trying to understand how the proposed system would work.) In practice, would there need to be a system of "primary elections" or "satellite tournaments" to winnow the proposals?

- Picking which proposal to implement. Suppose two or more conflicting proposals are judged (by the prediction markets) to improve expected GDP. Which one would be implemented? This sort of problem would just get worse if there were thousands of proposals to compare.

In some ways, this reminds me of my idea of "institutional decision analysis," which is that formal decision rules are appropriate for "institutional" settings where there is agreement on goals and also the need for careful justification of decisions. Similarly, Hanson's "futarchy" technocratically formalizes decisions that otherwise would have been made politically (through bargaining, persuasion, maneuvering, manipulation of rules, and so forth).

Clear writing

Sometimes people ask me to help them write more clearly. A common difficulty is that the way that first seems natural to write something is not always the best way.

For example, I just (in an email) wrote the sentence, "The ultimate goal is to understand the causal effect on asthma of traveling to/from Puerto Rico." I started to write it as, "The ultimate goal is to understand the causal effect of traveling to/from Puerto Rico on asthma." This made more sense (the causal effect of X on Y), but it's more confusing to read (I think) because "on asthma" is at the end of the sentence, so when you get there you have to figure out where it goes. The sentence I actually wrote has the form, "the causal effect on Y of X," which is more awkward logically but easier to read.

P.S. Yes, I'm sure I've made at least one writing mistake in the above paragraph (otherwise I'd be violating Bierce's Law, and I wouldn't want to do that). But I think my main point is valid.

More on Teaching

It's College Week at Slate: Click here for the thoughts of several prominent academics on improving undergraduate education, sometimes with the aid of a magic wand. I of course first read "Learn Statistics. Go Abroad" by K. Anthony Appiah. I completely agree with Dr. Appiah's view that many college graduates can't evaluate statistical arguments, leaving them unequipped to make informed decisions in areas such as public policy. He writes "So I favor making sure that someone teaches a bunch of really exciting courses, aimed at non-majors in the natural and social sciences, which display how mathematical modeling and statistical techniques can be used and abused in science and in discussions of public policy." Again, I agree completely. But (as we've discussed here and here) teaching those kinds of courses is really hard, and probably requires that magic wand.

Accuracy of prediction markets

Ben Cowling asks,

What do you think of the growing area of 'expert trading markets' using expert opinion for predicting future events (as compared to, say, formal mathematical or statistical models incorporating past data in the forecasting process)? From what I can gather the markets produce a form of informative prior so perhaps the whole process might be considered as a kind of simple mathematical model(?)

I'm motivated by the recent article in the economist:

Science and Technology: Trading in flu-tures; Predicting influenza
The Economist: 377 (8448) p. 108. Oct 15, 2005.

But I know these expert markets have been used in other areas; the Iowa Electronic Market is claimed to be good at predicting all sorts of things successfully including elections, which is why I thought you and readers of your blog might be interested.

My response: I first heard of the Iowa markets nearly 15 years ago, when Gary King and I were writing our paper about why pre-election polls vary so much when elections are so predictable. For this paper, all we needed to establish was that elections are predictable, which indeed they are, using state-by-state regression forecasting models (as was done in the 1980s and 1990s by Rosenstone and Campbell, and more recently by Erikson, among others). The Iowa markets also give good forecasts, which isn't a surprise given that the investors in these markets can use the regression forecasts that are out there.

Basically, my impression is that the prediction markets do a good job at making use of the information and analyses that are already out there--for elections, this includes polls and also the information such as economic indicators and past election results, which are used in good forecasting models. The market doesn't produce the forecast so much as it motivates investors to find the good forecasts that are already out there.

As an aside, people sometimes talk about a forecasting model, or a prediction market, "outperforming the polls." This is misleading, because a poll is a snapshot, not a forecast. It makes sense to use polls, even early polls, as an ingredient in a forecast (weighted appropriately, as estimated using linear regression, for example) but not to just use them raw.
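
To be concrete about "weighted appropriately," the kind of regression I have in mind looks something like this sketch (hypothetical variable names; the point is only that the early poll enters as one predictor among several rather than as the forecast itself):

# Sketch of using an early poll as one ingredient in a forecast.
# 'past' is a hypothetical data frame of previous elections with the
# incumbent party's vote share, an early poll reading, and an economic indicator.
fit <- lm(vote_share ~ early_poll + gdp_growth, data = past)
predict(fit, newdata = data.frame(early_poll = 52, gdp_growth = 2.5))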

P.S. In the comments, Chris points out this interesting article on prediction markets by Wolfers and Zitzewitz.

Creeping alphabetism

Here's an example of how the principles of statistical graphics can be relevant for displays that, at first glance, do not appear to be statistical. Below is a table, from a Language Log entry by Benjamin Zimmer, of instances of phrases of the form "He eats, drinks, sleeps X" (where the three verbs, along with X, can be altered). I'll present Zimmer's table and then give my comment.

Here's the table:

I agree completely

I agree completely with this Junk Charts entry, which presents two examples (from the Wall Street Journal on 9 Nov 2005) of bar graphs that become much much more readable when presented as line graphs. The trends are clearer, the comparisons are clearer, and the graphs themselves need much less explaining. Here's the first graph (and its improvement):

redocolumns2_1.png

And here's the second:

redocolumns1.png

My only (minor) comments are:

- First graph should be inflation-adjusted.
- Both graphs could cover a longer time span.
- Axis labels on second graph could be sparer (especially the x-axis, which could be labeled every 5 years).
- I'd think seriously about having the second graph go from 0 to 100% with shading for the three categories (as in Figure 10 on page 451 of this paper, which is one of my favorites, because we tell our story entirely through statistical graphics).
- The graphs could be black-and-white. I mean, color's fine but b&w is nice because it reproduces in all media. The lines are so clearly separated in each case that no shading or dotting would be needed.

P.S. See here for a link to some really pretty graphs.

Per Pettersson-Lidbom is presenting a paper (in the Political Economy Seminar) that claims that an increase in size of local-government legislatures decreases the size of local government. First I'll give his abstract, then my comments. The abstract:

This paper addresses the question of whether the size of the legislature matters for the size of government. Previous empirical studies have found a positive relationship between the number of legislators and government spending but those studies do not adequately address the concerns of endogeneity. In contrast, this paper uses variation in council size induced by statutory council size laws to estimate the causal effect of legislature size on government size. These laws create discontinuities in council size at certain known thresholds of an underlying continuous variable, which make it possible to generate “near experimental” causal estimates of the effect of council size on government size. In contrast to previous findings, I [Pettersson-Lidbom] find a negative relationship between council size and government size: on average, spending and revenues are decreased by roughly 0.5 percent for each additional council member.

It's cool how he uses a natural experiment based on the laws of Finland and Sweden. As he writes: "In Finland, the council size of local governments is determined solely by population size. For example, if a local government has a population between 4,001 and 8,000, the council must consist of 27 members, but if its population is between 8,001 and 15,000 the council must have 35 members. Thus, the law creates a discontinuity in council size at the threshold of 8,001 inhabitants." And regression-discontinuity analysis certainly seems appropriate here.

The actual result is surprising to me--not that I'm any expert on local government, it's just surprising to see a negative effect here--and so I'd like to see some presentation of the data. If the effect is as clear as is claimed, it should show up in some basic analyses--here I'm thinking of scatterplots and matching analyses. This is somewhat a matter of taste--as a statistician, I like graphs, but economists seem to prefer tables. But I just find it difficult to be convinced by results such as Tables 9-14.

To flip it around: this is a pretty clean dataset, right? You have a natural experiment and some points near the boundary. So a scatterplot, and a simple regression could be pretty convincing. Tables 2 and 3 are promising (well, I'd prefer graphs, but still...) but they only have data on "x", not on "y". As things stand, I really just have to take the results on trust. Not that I have any reason to disbelieve them, but I'd like to be a little more confident in the results--especially given that much of the paper discusses why these results differ from the rest of the literature on the topic.
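
Here's the sort of simple look at the data I mean, sketched in R with hypothetical variable names (municipalities near the 8,001 threshold, spending as the outcome):

# Basic display and analysis near the cutoff: a scatterplot of spending
# against population, plus a simple regression with a jump at 8,001.
# 'muni' is a hypothetical data frame.
near <- subset(muni, abs(population - 8001) < 2000)
near$above <- as.numeric(near$population >= 8001)   # larger statutory council size
plot(near$population, near$spending,
     xlab = "population", ylab = "spending per capita")
abline(v = 8001, lty = 2)
fit <- lm(spending ~ above + I(population - 8001), data = near)
summary(fit)   # the coefficient on 'above' is the discontinuity estimate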

Why do Supreme Court justices drift toward the center? This seems to have occurred with some Republican-appointed justices over the past few decades and with some Democrat-appointed justices in earlier years. The natural comparison here is to Congress, where I don't know of any evidence of center-drifting or leftward-drifting of Congressmembers. My law-professor colleague explains the difference as coming from the environment of the court. In particular, each case gets discussed and argued (rather than simply voted upon, as with many bills in Congress).

He also points out that judges, by the nature of their job, are exposed to two sides of every issue. Over the years, this could tend to lead to moderation. Unfortunately, most of us do not generally have to seriously consider two sides of every issue. Once again, the comparison to Congress is instructive: Congressmembers are exposed to a lot of lobbyists, who are certainly not divided evenly on issues. (Just pick your favorite issue here: drugs, guns, Israel, . . .) My argument here is not that lobbyists have too much (or not enough) influence but rather that as a judge, you get exposed to arguments in a more structured and balanced way, which might lead to moderation.

Looking at the data

Aleks sent along the following graph of Supreme Court justices, for each year, plotting the proportion of cases for which each judge voted on the conservative side (as coded by Spaeth):

alex2.png

(See here for the bigger version.)

This is a pretty picture--I particularly like the careful use of colors (the original version had some dotted lines but I talked Aleks into just using solid lines)--but one has to be careful in its interpretation. In particular, the graph is telling us about the relative position of justices in any given year, but I wouldn't trust its implicit claims about changes from year to year, or its long-term trends. The difficulty is that the results shown in this graph depend on the case mix in any given year.

Much of the year-to-year variation in the graph can be attributed to variation in the docket. It's not clear how to make sense of long-term trends given that the docket is changing over time. In particular, I wouldn't be surprised if the docket has become more conservative in recent years--as the court has shifted, I'd expect the cases to shift also. Another example is Marshall and Brennan from 1970 to 1990. Do I really believe that they both got more liberal, then both got more conservative, then both more liberal? Well, maybe, but it seems more plausible to me that the docket was changing during these times. This is related to the problem in epidemiology of simultaneously estimating age, period, and cohort effects.

Here's Kevin Quinn's estimate of ideal points of Supreme Court justices:

pic2000.gif

Clearly we need to combine Kevin's modeling tools with Aleks's graphics! (Joe, David, Noah, and I fit our own Supreme Court model, but I'm embarrassed to say it didn't allow judges' ideologies to move over time. And here's Simon Jackman's overview of ideal-point models.)

Doomsday and Bayes

Tyler Cowen links to an article from 1999 by Mark Greenberg that discusses the so-called "doomsday argument," which holds that there is a high probability that humanity will be extinct (or drastically reduced in population) soon, because if this were not true--if, for example, humanity were to continue with 10 billion people or so for the next few thousand years--then each of us would be among the first people to exist, and that's highly unlikely.

Anyway, the (sociologically) interesting thing about this argument is that it's been presented as Bayesian (see here, for example) but it's actually not a Bayesian analysis at all! The "doomsday argument" is actually a classical frequentist confidence interval. Averaging over all members of the group under consideration, 95% of these confidence intervals will contain the true value. Thus, if we go back and apply the doomsday argument to thousands of past data sets, its 95% intervals should indeed have 95% coverage. If you look carefully at classical statistical theory, you'll see that it makes claims about averages, not about particular cases.

However, this does not mean that there is a 95% chance that any particular interval will contain the true value. Especially not in this situation, where we have additional subject-matter knowledge. That's where Bayesian statistics (or, short of that, some humility about applying frequentist inferences to particular cases) comes in. The doomsday argument is pretty silly (and also, it's not Bayesian). Although maybe it's a good thing that Bayesian inference has such high prestige now that it's being misapplied in silly ways. That's a true sign of acceptance of a scientific method.
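
If you want to check the coverage claim for yourself, a small simulation does it. The sketch below (my own toy version, with arbitrary "true" population sizes) confirms that if your rank is uniform among the N people who will ever live, the interval [rank, 20 x rank] covers N about 95% of the time, whatever N is:

# Coverage check for the "doomsday" interval: a statement about
# averages over ranks, not about any particular case.
set.seed(1)
coverage <- function(N, n_sims = 1e5) {
  r <- sample.int(N, n_sims, replace = TRUE)   # your rank among the N people who will ever live
  mean(20 * r >= N)                            # does the interval [r, 20*r] cover N?
}
sapply(c(1e3, 1e6, 1e9), coverage)             # all close to 0.95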

Sy Spilerman writes,

I am interested in the effect of log(family wealth) on some dependent variable, but I have negative and zero wealth values. I could add a constant to family wealth so that all values are positive. But I think that families with zero and negative values may behave differently from positive wealth families. Suppose I do the following: Decompose family wealth into three variables: positive wealth, zero wealth, and negative wealth, as follows:

- positive wealth, coded as ln(wealth) where family wealth is positive, and 0 otherwise;
- zero wealth, coded 1 if the family has zero wealth, and 0 otherwise;
- negative wealth, coded as ln(absolute value of wealth) if family wealth is negative, and 0 otherwise,

and then use this coding as right side variables in a regression. It seems to me that this coding would permit me to obtain the separate effects of these three household statuses on my dependent variable (e.g., educational attainment of offspring). Do you see a problem with this coding? A better suggestion?

My reply:

Yes, you could do it this way. I think then you'd want to include values very close to zero (for example, anything less than $100 or maybe $1000 in absolute value) as zero. But yes, this should work ok. Another option is to just completely discretize it, into 10 categories, say.

Any other suggestions out there? This problem arises occasionally, and I've seen some methods that seem very silly to me (for example, adding a constant to all the data and then taking logs). Obviously the best choice of method will depend on details of the application, but it is good to have some general advice too.
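
For concreteness, here's a minimal R sketch of Sy's coding, with the near-zero threshold mentioned above (the data frame and variable names are hypothetical):

# Three-variable coding for family wealth, treating anything within
# $1000 of zero as "zero wealth".  'd' is a hypothetical data frame
# with columns wealth and educ.
d$pos_wealth  <- ifelse(d$wealth >  1000, log(d$wealth), 0)
d$zero_wealth <- as.numeric(abs(d$wealth) <= 1000)
d$neg_wealth  <- ifelse(d$wealth < -1000, log(abs(d$wealth)), 0)
fit <- lm(educ ~ pos_wealth + zero_wealth + neg_wealth, data = d)
summary(fit)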

Mark Liberman replied to this entry (see also here):

I [Mark Liberman] was mostly trying to see whether a new database search program was working. I knew that men have been said to use filled pauses like "uh" more than women, and it made sense to me that disfluency would increase with age, so I generated the data for the first plot and took a look. I think you're right that I should have started the plot from 0, but I wasn't sure what I'd see, and thought that the qualitative effects if any would be clearer with a narrower range of values plotted.

Then I wondered about "um", and still had a few minutes, so I ginned up the data for the second plot and took a look at it. I was quite surprised to see the opposite age effect, and somewhat surprised to see the inverted sex effect, so I quickly looked up the standard papers on the subject and banged out a post.

Actually what I did was to add a bit of verbiage around the .html notes (with embedded graphs) that I'd been making for myself.

I've attached the first plot that I made in that session, showing the female/male ratio for a number of words that I thought might show a difference. The X axis is the (log) count of the word (mean of counts for male and female speakers), and the y axis is the (log) ratio of female/male counts. The plotted words are too small, but I wasn't sure how much they would overlap...

If I can find another spare hour or two, I'm going to check out whether southerners really talk slower than northerners.

And here's Mark's new plot:

FisherSexData1.png

Here's the full version. (I don't know how to fit it all on the blog page.)

P.S. In his new plots (see here), Mark uses a 2x2 grid and extends the y-axis to 0. To be really picky, I'd suggest making 0 a "hard boundary." In R you can do this using 'yaxs="i"' in the plot() call, but then the top boundary will be "hard" also, so that you have to use ylim to extend the range (e.g., ylim=c(0,1.05*max(y))). What I should really do is write a few R functions to encode my default graphing preferences so that I don't need to do this crap every time I make a graph.
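
Something along these lines would do it (a minimal sketch, not a polished function):

# A wrapper that encodes the "hard zero baseline, soft upper boundary"
# preference once and for all.
plot0 <- function(x, y, ...) {
  plot(x, y, yaxs = "i", ylim = c(0, 1.05 * max(y)), ...)
}
# plot0(age, um_rate, type = "l")   # usage, with whatever variables are at hand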

Uh . . . um . . .

Mark Liberman posted some interesting summaries of telephone speech records from the Linguistic Data Consortium. He writes:

I [Mark Liberman] took a quick look at demographic variation in the frequency of the filled pauses conventionally written as "uh" and "um". For technical reasons that I won't go into here, I used the frequency of the definite article "the" as the basis for comparison. Thus I selected a group of speakers (e.g. men aged 60-69), counted how often they were transcribed as saying "uh", and to normalize that count (since the number of people in each category was different) I divided by the number of times the same speakers were transcribed as saying "the".

AgeSexFluency1.png

He also did "Um":

AgeSexFluency3.png

My comments

And now, some contentless comments about graphical presentation:

1. I like the clear axis labels and titles, and even more importantly, that the lines are labeled directly (rather than using different dotted lines and a key). Good labeling is important--I do it even for the little graphs I'm making in my own research when exploring data or model fits.
2. I would've used blue for boys and pink for girls--easier to remember--although perhaps Mark was purposely trying to be non-stereotypical.
3. My biggest change would have been to (a) put the 2 graphs on a common scale, and (b) make them smaller and put them next to each other. Smaller graphs allow us to see more at once, and see patterns that can be more obscure when we are forced to scroll back and forth between multiple plots. In R, I do par(mfrow=c(2,2)) as a default.
4. I would have the bottom of each graph go to 0, since that's a natural baseline (the zero-uh and zero-um level that we might all like to try to reach!). There's been some debate about the "start-at-zero rule" but I usually favor it in a situation such as this, where it doesn't require much extension of the axis.

Anyway, Mark's blog entry has much more on this interesting data source.

P.S.

Caroline says "emmm" instead of "ummm." Is this standard among native Spanish speakers?

P.P.S.

See here and here for more.

There's been some debate in the media and among social scientists about the relation between income and voting. On one hand, the states that support the Democrats--the so-called "blue states"--are richer, on average, than the Republican-leaning "red states." On the other hand, richer voters continue to support the Republicans--not so much as an economic determinist might suspect (even in the lowest income category, Bush in 2004 still got 36% of the vote), but the correlation is there.

A while ago, we made this plot, which shows how the Republicans can simultaneously have the support of poor states and richer voters within states:

superplot_var_intercepts_annen_2000.png

Mississippi is the poorest state, Ohio is in the middle, and Connecticut is the richest state. Within each state, the line shows the probability of supporting Bush for President for each of five income categories, and the five open circles represent the relative proportion of adults in that state in each category. The black circles show the average income and probability of supporting Bush for each state.

The above plot was fit with a model (a varying-intercept logistic regression) that restricted the slopes in the states to be essentially parallel. We then expanded the model to allow the slopes to vary also, so that the coefficient for income could differ in richer or poorer states. The figure below shows the result:

superplot_var_slopes_annen_2000.png

Income clearly matters much more in "red states" like Mississippi than in "blue states" like Connecticut. We also see this pattern in 2004, and somewhat in 1992 and 1996, but not really before the 90s.
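
For readers who want to fit something like these two models themselves, here's a rough sketch using lme4-style syntax (this is not the exact code or data behind the figures; "poll" is a hypothetical data frame with a 0/1 Bush-support variable, a numeric income score, and a state identifier):

# Varying-intercept logistic regression, then varying intercepts and slopes,
# so the income coefficient can differ between richer and poorer states.
library(lme4)
fit_1 <- glmer(bush ~ income + (1 | state),          data = poll, family = binomial)
fit_2 <- glmer(bush ~ income + (1 + income | state), data = poll, family = binomial)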

Why vote?

There's an article by Stephen Dubner and Steven Levitt in the New York Times today on why it's rational to vote. They correctly point out that the probability of casting a decisive vote is very small. Unfortunately they don't seem to be aware of the social-benefit motivation for voting, which is why voting is, in fact, rational behavior. See here for our paper on the topic. For convenience, I'll repeat our earlier blog entry below. The short version is that you can be rational without being selfish.

The chance that your vote will be decisive in the Presidential election is, at best, about 1 in 10 million. So why vote?
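
The arithmetic behind the social-benefit argument fits in a few lines (the numbers below are made up purely for illustration; the paper has the careful version):

# Back-of-the-envelope version of the social-benefit argument,
# with illustrative numbers only.
p_decisive <- 1e-7    # roughly 1 in 10 million chance your vote decides the election
n_affected <- 3e8     # people affected by the outcome
benefit    <- 100     # assumed average benefit per person, in dollars, if your side is right
p_decisive * n_affected * benefit   # expected social benefit: $3000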

In another example of the paradox of importance, a colleague writes:

In other news, I am about to use the "hot deck" method to do some imputation. I considered using one of the more sophisticated and generally better methods instead, but hey, I'm on a deadline, plus there are many other sources of error that will be larger than the ones I'm introducing. It's the same old story/justification for using linear models, normal models, assuming iid errors, etc.

At least he feels bad about it. That's a start.

Crash options

My colleague Jan Vecer in the statistics department at Columbia gave a talk the other day on "Crash options." His claim was that the introduction of such options could have a socially beneficial effect by allowing investors to plan more effectively in the context of market instabilities. I'm in no position to evaluate this one way or another, but it sounded like a cool idea, so I'm passing it along.

Here's Jan's abstract:

In this paper, we introduce new types of options which do not yet exist in the market, but they have some very desirable properties. These proposed contracts can directly insure events such as a market crash or a market rally. Although the currently traded options can to some extent address situations of extreme market movements, there is no contract whose payoff would be directly linked to the market crash and priced and hedged accordingly as an option.

Here's the paper, and here are the slides from a talk he gave on the topic.

Unfortunately, his paper has no cool graphs. I've suggested to Jan that he make a graph to show how the crash option could work to stabilize the market. I know he has the ability to make cool graphs; see his paper on tiebreakers in tennis and here for an article about his tennis predictor.

Expert statistical modeler needed

Statistics is fundamental to pharmacology and drug development. Billy Amzal at Novartis forwarded me this job announcement for a statistician or mathematician who wants to do statistical modeling in pharmacokinetics/pharmacodynamics. "Knowledge of Bayesian statistics and its application is a strong plus." It's a long way from Berkeley, where one of my colleagues told me that "we don't believe in models" and another characterized a nonlinear differential equation model (in pharmacokinetics) as a "hierarchical linear model." Anyway, it looks like an interesting job opportunity.

Identity theft

Somebody saw this entry and called, saying that he heard that I was going to Barcelona.

¿Cómo se dice "I hate statistics"?

As every statistician knows, many people hate our field. How many times have we all heard "You do statistics? I HATED that class in college!" (I remember one of my college professors complaining indignantly that no one would presume to tell an artist that he hated art.) There are all sorts of factors that probably contribute to the unpopularity of statistics: it's often one of the few quantitative courses required for social science majors, who may be less into mathy subjects to begin with; it's not always well-taught (although what subject is?); the logic of hypothesis testing isn't terribly intuitive. Lately I've been wondering if statistics' bad reputation is confined to the US or if it's more universal. My own experiences really don't help to answer that question: Sure, most of the "I hate statistics" comments I hear come from Americans, but that's not surprising given that I live in the US. And many of the international people I know are people I know through work, so of course they tend not to hate statistics. Anyway, my wondering about this is self-serving. Starting in January I'll be teaching intro statistics at Pompeu Fabra University in Barcelona, and I've been wondering what the students will be like. I'm hoping not to hear "me disgusta la estadística" too much....

Jim Greiner has an interesting note on the use of statistics in racial discrimination cases. As both a lawyer and a statistician, Jim has a more complete perspective on these issues than most people have. I won't comment on the substance of Jim's comments (basically, he claims that the statistical analyses in these cases, on both sides, are so crude that judges can pretty much ignore the quantitative evidence when making their decisions) since I know nothing about the case in question. But I do have a technical point, which in fact has nothing really to do with racial discrimination and everything to do with statistical hypothesis testing.

Jim writes,

The facts of the specific case, which concerned the potential use of race in peremptory challenges in a death penalty trial, are less important than Judge Alito's approach to statistics and the burden of proof.

Schematically, the facts of the case follow this pattern: Party A has the burden of proof on an issue concerning race. Party A produces some numbers that look funny, meaning instinctively unlikely in a race-neutral world, but conducts no significance test or other formal statistical analysis. The opposing side, Party B, doesn't respond at all, or if it does respond, it simply points out that a million different factors could explain the funny-looking numbers. Party B does not attempt to show that such innocent factors actually do explain the observed numbers, just that they could, and that Party A has failed to eliminate all such alternative explanations.

. . .

Is there a middle way? Perhaps. In the above situation, what about requiring some sort of significance test from Party A, but not one that eliminates alternative explanations? In the specific facts of Riley, the number-crunching necessary for "some sort of significance test" is the statistical equivalent of riding a tricycle: a two-by-two hypergeometric with row totals of 71 whites and 8 blacks, column totals of 31 strikes and 48 non-strikes, and an observed value of 8 black strikes yields a p-value of 0.

OK, now my little technical comment. I don't think the hypergeometric distribution is appropriate, since it conditions on both margins. The relevant margin to condition on is the number of whites and blacks, since that was determined before the lawyers got to the problem. In a hypothesis-testing framework in which p-values represent probabilities over hypothetical replications (this is the framework I like; it can be interpreted classically or Bayesianly), those replications should hold the racial composition fixed but let the number of strikes vary, rather than fixing both margins. To put it another way, the so-called Fisher exact test isn't really "exact" at all.

This is just a rant I go on occasionally; it really has nothing to do with Jim's note except that it reminded me of the issue. For the fuller version of this argument, see Section 3.3 of my paper on Bayesian goodness-of-fit testing in the International Statistical Review. Also, Jasjeet Sekhon recently wrote a paper on the same topic.

For Jim's specific example, I'd be happy just doing a chi-squared test with 1 degree of freedom. His calculation is fine too--the hypergeometric is a reasonable approximation to a Bayesian posterior p-value with a noninformative prior distribution.
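
For concreteness, here's a minimal R sketch of both calculations, using the counts from Jim's quote (the 2x2 table layout is my reconstruction from those margins):

# 8 blacks (all struck) and 71 whites (23 struck, 48 not), 31 strikes in total.
strikes <- matrix(c(8, 0, 23, 48), nrow = 2, byrow = TRUE,
                  dimnames = list(race = c("black", "white"),
                                  decision = c("struck", "not struck")))

# Fisher's "exact" test conditions on both margins (the hypergeometric calculation):
fisher.test(strikes, alternative = "greater")

# The simpler comparison I mention above: a chi-squared test with 1 degree of freedom.
chisq.test(strikes)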

P.S.

See also this item in Chance News.

Special Halloween edition

Here's the abstract to today's brown bag seminar in the Marketing Department (331 Uris Hall, 1:30pm, for you locals). If you read the abstract you'll see the Halloween connection.

On the Consumption of Negative Feelings
(Eduardo B. Andrade, UC Berkeley & Joel B. Cohen, University of Florida)

Abstract:

If the hedonistic assumption (i.e., people’s willingness to pursue pleasure and avoid pain) holds, why do individuals expose themselves to events known to elicit negative feelings? In this article, we assess how (1) the intensity of the negative feelings, (2) the positive feelings in the aftermath, and (3) the coactivation of positive and negative feelings contribute to our understanding of the phenomenon. In a series of 4 studies, horror and non-horror movie watchers are asked to report their positive and negative feelings either after (experiment 1) or while (experiments 2A, 2B, and 3) they are exposed to a horror movie. The results converge with a coactivation-based model and highlight the importance of a protective frame.

The "white male effect"

Dave Krantz pointed me to a paper by Kahan, Braman, Gastil, Slovic, and Mertz on "Gender, race, and risk perception: the influence of cultural status anxiety," which explores the "white male effect": the "tendency of white males to fear all manner of risk less than women and minorities," a pattern first noted by Slovic and others in the early 1990s. Finucane and Slovic (1999) wrote that "the white-male effect seemed to be caused by about 30 percent of the white male sample that judged risks to be extremely low."

Here's the abstract of the new paper:

Why do white men fear various risks less than women and minorities? Known as the “white male effect,” this pattern is well documented but poorly understood. This paper proposes a new explanation: cultural status anxiety. The cultural theory of risk posits that individuals selectively credit and dismiss asserted dangers in a manner supportive of their preferred form of social organization. This dynamic, it is hypothesized, drives the white male effect, which reflects the risk skepticism that hierarchical and individualistic white males display when activities integral to their status are challenged as harmful. The paper presents the results of an 1800-person survey that confirmed that cultural worldviews moderate the impact of sex and race on risk perception in patterns consistent with status anxieties. It also discusses the implication of these findings for risk regulation and communication.

The paper is interesting, and I'm sympathetic to its general arguments--it certainly makes sense to me that risk perceptions, and perceptions about uncertainties in general, will be influenced by cultural values. But I have a couple of concerns relating to how the data were collected and analyzed.

The findings of the article come from regression analyses of responses to a national survey. They asked people about their perceptions of the risks of environmental danger, guns, and abortion. They also asked some cultural worldview and personality questions, along with demographics. They found that the cultural worldview questions were predictive of risk attitudes.

I'm just a little worried that they may be measuring political views as much as risk attitudes. For example, one of the agree/disagree statements is "Women who get abortions are putting their health in danger." Statistically, my impression is that the health risk from abortion itself is low, but a person who opposes abortion might answer Yes to the question, on the grounds that a lifestyle associated with frequent abortions is risky. My point here is that the answer to the question itself could have a political twist to it. Although the question is nominally about risks, I don't know how much it's really telling us about risk perception.

I'm not saying that this is a devastating critique. Understanding the "white male effect" is a challenge, and cultural worldview, etc., has got to be relevant. But this particular study could perhaps be interpreted in other ways.

Fred Mosteller's advice

I was lucky enough to be a T.A. for Fred Mosteller in his final year of teaching introductory statistics at Harvard. He had taught for 30 years and told us that in different years he emphasized different material--he never knew which aspects of the course the students would learn the most from, so each year he focused on what interested him the most.

Anyway, every week he would take his three T.A.'s to lunch to talk about how the course was going and just to get us talking about things. One day he asked us what we thought about some issue of education policy--I don't remember what it was, but I remember that we each gave our opinions. Fred then told us that people come to us as statisticians for our statistical expertise, not for our opinions. So in a professional context we should be giving answers about sampling, measurement, experimentation, data analysis, and so forth--not our off-the-cuff policy opinions, which are not what people come to us for.

I was thinking of this after reading David Kane's comment on Sam's link to an article about the book, The Bell Curve. David asked me (or Sam) to say what we really think about the Bell Curve. I can't speak for Sam, but I wouldn't venture to give an opinion considering that I haven't read the book. I'd like to think I'm qualified to make judgments about it, if I were to spend the effort to follow all the arguments--but it would take a lot of time, and my impression is that a bunch of scientists have already done so (and have come to various conclusions on the topic). I would imagine that I might be inclined to study the issue further if I were involved in a study evaluating educational policies, for example, but it hasn't really come up in any of my own research. (I did think that James Flynn's article on a related topic was interesting, but I don't really know what the key points of The Bell Curve are, so I wouldn't presume to comment.)

Over the years, I've been distressed to see statisticians and other academic researchers quoted as "experts" in the news media, even on subjects way outside their areas of expertise. It takes work to become an expert on a topic. Teaching classes in probability and statistics isn't always enough. As a reaction to this, I've several times said no to media requests on things that I'm not an expert on. (For example, when asked to go on TV to comment on something about the state lottery, I forwarded them to Clotfelter and Cook, two economists at Duke who wrote an excellent book on the topic.) Standards for blogs are lower than for TV, but still . . .

Swarthmore

I spoke at Swarthmore College last week. Here are the abstracts and here are the talks: Mathematical vs. statistical models in social science (for the general audience) and Coalitions, voting power, and political instability (for math and stat majors).

Visiting the Swarthmore math dept was lots of fun. It's great to be at a place where teaching is taken seriously. The classrooms, student common areas, faculty common areas, and faculty offices were all near each other, and students were always walking by and dropping in to faculty offices. After my talk on the first day, there was a dinner, at which I sat at a table with several students. I was impressed at the level of discussion--it seemed to be a really intellectual place. (It also was fun, in a way, to eat dining hall food--I hadn't done that for many years!)

Class-participation activities

On the second day, I did a probability demo for the students in the math-stat course, and a statistics demo for the students in the intro stat course. The probability demo involved a jar of coins and culminated in an expected-monetary-value calculation that can be done by differentiating xp(x) (where p is the normal density function with a specified mean and variance); setting the derivative to zero reduces to solving a quadratic equation. The instructor for both classes was Walter Stromquist, who did mathematics in the "real world" for many years as a consultant before coming to teach. While I was differentiating and solving the equation on the board, he quickly programmed the formula into a spreadsheet and computed the optimal solution before I could finish. For the second class, I did the real- and fake-coin-flips demo: while I was out of the room, each pair of students created either a sequence of 100 actual coin flips or a made-up sequence of 100 1's and 0's intended to look like real flips. Then I returned and was shown the sequences in pairs, with my task being to tell the real and fake sequences apart. I'm embarrassed to say that I only got 4 out of the 5 pairs correct.
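
For the record, here's what the jar-of-coins calculation looks like in R, done both ways (the mean and standard deviation below are made-up placeholders; the real numbers depend on the jar):

mu <- 5      # hypothetical mean
sigma <- 2   # hypothetical standard deviation

# Setting d/dx [x * dnorm(x, mu, sigma)] = 0 gives x^2 - mu*x - sigma^2 = 0,
# whose positive root is:
x_quad <- (mu + sqrt(mu^2 + 4 * sigma^2)) / 2

# Walter's spreadsheet-style shortcut, done numerically:
x_num <- optimize(function(x) x * dnorm(x, mu, sigma),
                  interval = c(0, mu + 10 * sigma), maximum = TRUE)$maximum

c(quadratic_root = x_quad, numerical = x_num)  # the two should agree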

Mathtalk

Overheard in the hallway of the math department:

Teacher 1: When a proof has a gap, use L'Hopital's Rule.

Teacher 2: Every time my lectures have a gap, I tell a joke.

Teacher 1: What happens if the joke has a gap?

Hardly Statistical At All

I'm sorry. You come to this blog seeking deep thoughts and insight, and I give you links and rants. Or gratuitous plugs for things that appeal to me, which is what today's post contains. There's a new-ish magazine/literary journal called n+1. It's full of deep thoughts and insight on various topics, from travel to domestic violence to the vicious cycle that is dating. And it's called n+1--how cool is that?

Stat/Biostat Departments

I wish there were more connections between statistics departments and biostatistics departments. I've been working with survival data recently, and it's made me realize another gaping hole in my statistical knowledge base. It's also made me realize that I wish I knew more biostatisticians. And I'm one of the lucky ones, really, because Columbia has a biostatistics department and I do know some people there. Often when statistics and biostatistics departments don't have close connections, it's for understandable reasons. When I was in graduate school at Harvard, for example, the statistics and biostatistics departments were (still are, I guess) separated by the Charles River, and it took a 45-minute bus ride to travel between the two. I almost never made that trip. Still, there are some great people in the Harvard Biostatistics Department and I'm sure I could have benefited from working with or taking classes from them. Here at Columbia, the biostatistics department is a subway ride away from the statistics department, and if you take the 1 train then there's that awful subway elevator to contend with (how on earth is that not a fire hazard?). Lots of universities don't have both statistics and biostatistics departments; of the ones that do, some have close connections. I just wish that were the rule rather than the exception.


The Bell Curve

I spent too much of one day last week reading this article and everything it links to. Charles Murray, one of the authors of The Bell Curve, also has a piece in the August 2005 issue of Statistical Science called "How to Accuse the Other Guy of Lying with Statistics" (part of a special section "celebrating" the 50th anniversary of "How to Lie with Statistics"--it's a fun issue).

I haven't read The Bell Curve myself, so I better stop now.
