September 2009 Archives

Bill Ricker points me to this blog from Mark Liberman on whether (and how much) managers are more likely to use management jargon. Or, to be more precise, whether knowing that someone uses management jargon in their speech gives you information on how likely they are to be a manager. The motivation was this quote from Peter Taylor:

I [Peter Taylor] argue that the first question to ask is whether hearing someone use the phrase "At the end of the day" conveys information on whether they are likely to be a manager...

Much Bayesian inference follows. My only comment here is not on the Bayesian inference but rather on the idea that "managers" are dweeby Dilbert characters who talk using management jargon. I was thinking about it, and I realized that I'm a manager. I manage projects, hire people, etc. But of course I don't usually think of myself as a "manager" since that's considered a bad thing to be.

For another example, Liberman considers a "spokesperson for a manufacturer of sex toys" as a manager. I don't know what this person does, but I wouldn't usually think of a spokesperson as a manager at all.

To me, the most interesting linguistic phenomenon here is the floating definition of "manager."

P.S. Lots and lots and lots of discussion here. Somehow I think that Mark Liberman gets a lot more readers on his blog than I do on mine!

I just assumed they already were doing this. Did they really used to charge the same price for flights on every day of the year? That would be silly, no? It doesn't make sense to me for people to be angry about differential pricing.

Comments on the linked blog suggest that the problem is a lack of information in the communication of ticket prices. Consumers (such as myself) don't really have any idea what a ticket will cost--we either have to just buy something blind or else do informal statistical inference by running a lot of queries on Expedia or whatever. As a result of this ignorance, airlines have an incentive to advertise super-low fares, which then leads to surcharges etc. What a mess.

On the other hand, I never feel comfortable complaining about airport/airline experiences. I fly a lot and as a result am a big polluter. So, really, anything that makes flying more of a pain in the ass is probably a net benefit to the world.

Find out on Thurs 1 Oct at 11:15 am in Kimmel 900 at NYU: Dr. Michael Foster from UNC will present the 4th Statistics in Society lecture, entitled: "Does Special Education Actually Work?" This talk will explore the efficacy of current special education policies while highlighting the role of new methods in causal inference in helping to answer it. It is jointly sponsored by the Departments of Teaching and Learning and Applied Psychology, and by the Institute for Human Development and Social Change.

I'd definitely go to this if I were in town.

Yuck!


I'd much much rather have the Washington Post have a competition for America's Next Great Reporters. We have enough Next Great Pundits as it is.

"The cab driver - who witnesses said was talking on his cell phone and appeared distracted - slowed briefly but then tried to speed away . . ." (link from Streetsblog). (And this other source says he "has several traffic violations on his driving record.")

Something similar happened to us not long ago (although, luckily, nobody was hurt). This time the cops told the driver (who, again, had to be stopped by people on the street so she couldn't drive off) not to worry about it.

P.S. I'm not saying the drivers should go to prison. The more appropriate sanction would be to get them out of the driver's seat of a car. For the guy who killed the kid and tried to escape, perhaps forbidding him to drive for 20 years would be appropriate. But really it should never have reached that point: if each of the "several traffic violations" had resulted in his car being taken away and him being forbidden to drive for some period of time, then it's likely he wouldn't have been on the road that day and the kid would still be alive.

On the other hand, if hitting a kid and driving away is considered OK, then of course you'll still see people driving that way.

Fivethirtyeight commenter TGGP links to a news article about zillionaire financier Peter Thiel, who "predicted which firms would be bailed out based on whether they leaned Republican or Democratic." In the words of reporter Peter Robinson, Thiel "possesses a preternatural ability to spot patterns that others miss."

I'll repeat a bunch of Thiel's reasoning, because on one level it's interesting while on another level I find it hard to take completely seriously as it stands.

Macartan Humphreys sends along this article where he proves that the requirement of "compactness" in districting, if interpreted as requiring districts to be convex, does not by itself stop a majority party from gerrymandering:

Gerrymandering--the manipulation of electoral boundaries to maximize constituency wins--is often seen as a pathology of democratic systems. A commonly cited cure is to require that electoral constituencies have a 'compact' shape. But how much of a constraint does compactness in fact place on would-be gerrymanderers? By applying a theorem of Kaneko, Kano, and Suzuki (2004) to the two party situation we show that a gerrymanderer can always create equal sized convex constituencies that translate a margin of k voters into a margin of at least k constituency wins. Thus even with a small margin a majority party can win all constituencies. Moreover there always exists some population distribution such that all divisions into equal sized convex constituencies translate a margin of k voters into a margin of exactly k constituencies. Thus a convexity constraint can sometimes prevent a gerrymanderer from generating any wins for a minority party.

Chris Blattman reports on a study by Seema Jayachandran and Ilyana Kuziemko that makes the following argument:

Medical research indicates that breastfeeding suppresses post-natal fertility. We [Jayachandran and Kuziemko] model the implications for breastfeeding decisions and test the model's predictions using survey data from India. . . . mothers with no or few sons want to conceive again and thus limit their breastfeeding. . . . Because breastfeeding protects against water- and food-borne disease, our model also makes predictions regarding health outcomes. We find that child-mortality patterns mirror those of breastfeeding with respect to gender and its interactions with birth order and ideal family size. Our results suggest that the gender gap in breastfeeding explains 14 percent of excess female child mortality in India, or about 22,000 "missing girls" each year.

Interesting. I wonder what Monica Das Gupta would say about this study--she seems to be the expert in this area.

Huh?

The only thing that really puzzles me about Jayachandran and Kuziemko's article is that, on one hand, they produce an estimate of 14%, but on the other, they write:

In contrast to conventional explanations, excess female mortality due to differential breastfeeding is largely an unintended consequence of parents' desire to have more sons rather than an explicit decision to allocate fewer resources to daughters.

But they just said their explanation only explains 14%. Doesn't that suggest that the other 86% arises from infanticide and other "explicit decisions"? The difference between "14%" and "largely" is so big that I think I must be missing something here. Perhaps someone can explain? Thanks.

John reports on an article by Oeindrila Dube and Suresh Naidu, who ran some regressions on observational data and wrote:

This paper examines the effect of U.S. military aid on political violence and democracy in Colombia. We take advantage of the fact that U.S. military aid is channeled to Colombian army brigades operating out of military bases, and compare how changes in aid affect outcomes in municipalities with and without bases. Using detailed data on violence perpetuated by illegal armed groups, we find that U.S. military aid leads to differential increases in attacks by paramilitaries . . .

It's an interesting analysis, but I wish they'd restrained themselves and replaced all their causal language with "is associated with" and the like.

From a statistical point of view, what Dube and Naidu are doing is estimating the effects of military aid in two ways: first, by comparing outcomes in years in which the U.S. spends more or less in military aid; second, by comparing outcomes in cities in Colombia with and without military bases.

Interactions and Bayesian Anova


Gregor Gorjanc writes:

Colin Gillespie writes:

A couple of weeks ago I did your suggested exercise (from Teaching Statistics: A Bag of Tricks) on 'Guessing the age', with the additional twist that the people were actors/actresses out of CSI. As well as discussing the data in class, I used it for their first R lab, where they generated simple scatterplots, boxplots and histograms.

In case you're interested, the main results were:

1. Watching CSI didn't seem to affect your guess

2. Females guessed better than males.

3. The vast majority of guesses were too low (unsurprising for actors), except for the youngest actor.

If you are interested you can find my slides/handouts here.

Cool!

My friend Seth wrote:

A few months ago, because of this blog, I got a free heart scan from HeartScan in Walnut Creek. It's a multi-level X-ray of your heart and is scored to indicate your heart disease risk. . . . What's impressive about these scans is three-fold:

1. The derived scores are strongly correlated with risk of heart disease death. . . . Here is an example of the predictive power. . . .

2. You can improve the score. Via lifestyle changes.

3. The scans provided by HeartScan are low enough in radiation that they can be repeated every year, which is crucial if you want to measure improvement. In contrast, a higher-tech type of scan (64 slice) is so high in radiation that it can't be safely repeated. . . .

Heart scans, like the sort of self-experimentation I've done, is a way to wrest control of your health away from the medical establishment. No matter what your doctor says, no matter what anyone says, you can do whatever you want to try to improve your score. . . .

This looked pretty good. Heart attacks are the #1 killer, maybe I should be getting a heart scan. On the other hand, Seth's references are to a journal article from 2000 and a news article from Life Extension magazine, hardly a trustworthy source. So I didn't know what to think.

I contacted another friend who works in medical statistics, who wrote:

I don't know any of this literature but the fact that his source publication dates back to 2000 while the screening method has clearly not gained widespread traction is an indicator that the cost/benefit ratio is not very favorable (though it's no doubt very favorable to HeartScan who make money out of doing the scanning).

I found this more recent (though skimpy) review, "CT-Based Calcium Scoring to Screen for Coronary Artery Disease: Why Aren't We There Yet?" which casts doubt on the whole idea (and given that it's written by radiologists it has some credibility because they would normally be the first to promote a radiology-based screening technique). There were also some links to reviews of the potential dangers (carcinogenic) of repeated CT scans.

From this information, I wouldn't try to talk Seth out of getting heart scans, but I won't rush out to get one of my own.

Tom Holbrook writes:

I just saw your post on the generic ballot and thought you might be interested in something I posted just the other day. My post was stimulated by something Charlie Cook had written a couple of weeks ago, and I hadn't yet seen the Bafumi, Erikson, and Wlezien article. Anyway, I find that there isn't much connection between generic ballots and midterm results this far (14 months) out from the election. My analysis doesn't break down by in-party and out-party, and it uses data much farther out than Bafumi et al. used, so it is not directly comparable to their work; but I thought you might find it interesting.

This came in the email:

Who has babies when?


Sheril Kirshenbaum links to this graph from economists Kasey Buckles and Daniel Hungerman showing differences in who conceives babies in the fall (older, better-educated people) and the spring (younger, less well-educated people):

NA-BA643_BIRTH_NS_20090921192123.gif

Pretty stunning. And a nice graph. The repeating pattern over the years is super-clear. I'd also like to see a version that just shows the averages for the 12 months, so I could see the pattern in more detail. Also I'd like to subtract 40 weeks so it shows the data by (approximate) month/date of conception.
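Here's a rough sketch of the two transformations I have in mind: shift each birth date back 40 weeks and then average by calendar month. The 40-week offset is only the approximation mentioned above, and the function names and the toy records are made up for illustration; none of this is from Buckles and Hungerman's analysis.

```python
from collections import defaultdict
from datetime import date, timedelta

# Assumption from the text: conception is approximated as 40 weeks before birth.
GESTATION = timedelta(weeks=40)


def approx_conception_date(birth: date) -> date:
    """Shift a birth date back by 40 weeks to get an approximate conception date."""
    return birth - GESTATION


def monthly_means(records):
    """Average a value by approximate month of conception.

    records: iterable of (birth_date, value) pairs, e.g. value = 1 if the
    mother is a teenager and 0 otherwise (hypothetical coding).
    Returns a dict mapping month number (1-12) to the mean value.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for birth, value in records:
        month = approx_conception_date(birth).month
        totals[month] += value
        counts[month] += 1
    return {m: totals[m] / counts[m] for m in sorted(totals)}


# Toy records, invented for illustration only.
toy = [(date(2001, 1, 15), 0.2), (date(2001, 1, 20), 0.4)]
means = monthly_means(toy)
```

Collapsing over years like this throws away the year-to-year pattern, so it would complement the original time-series graph rather than replace it.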

P.S. This news article by Justin Lahart is excellent. But I did notice one funny thing (to a statistician):

The two economists examined birth-certificate data from the Centers for Disease Control and Prevention for 52 million children born between 1989 and 2001 . . . 13.2% of January births were to teen mothers, compared with 12% in May--a small but statistically significant difference, they say.

Well, yeah, with n=52,000,000, I'd think that a 1 percentage point difference would be statistically significant! More seriously, with that many cases, it sounds like the next step (if the researchers haven't already done this) is to break things down by subgroups of the population. I wonder what data are available from the birth certificate records. To start with, there's geographic information.
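To see just how unsurprising the significance is, here's a back-of-the-envelope two-proportion z-score. The article doesn't give the number of January or May births, so I'm assuming (purely for illustration) that births are spread evenly over the 12 months; the function name is mine too.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z-score for the difference of two independent sample proportions
    (unpooled standard error)."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


# Assumption, not from the article: 52 million births split evenly by month.
births_per_month = 52_000_000 / 12

# 13.2% of January births vs. 12% of May births were to teen mothers.
z = two_proportion_z(0.132, births_per_month, 0.12, births_per_month)
print(f"z is roughly {z:.0f}")
```

Under that assumption the difference is dozens of standard errors from zero, so "statistically significant" tells us almost nothing here; the interesting question is the size and pattern of the difference.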

Under the heading, "Republicans not in a position to retake the House (yet)," Chris Bowers estimates that the Democrats have a 41.2%-37.7% lead in recent generic House polling. Bowers writes, "Democrats are, after all, still winning."

But it's not so simple. In research published a couple years ago, Joe Bafumi, Bob Erikson, and Chris Wlezien found that, yes, generic party ballots are highly predictive of House voting--especially in the month or two before the election--but that early polling can be improved by adjusting for political conditions. In particular, the out-party consistently outperforms the generic polls.

congpolls2.jpg

The paper accompanying this graph was among the first public predictions of a Democratic takeover in 2006.

Bafumi, Erikson, and Wlezien's analysis doesn't go back before 300 days before the election, but if we take the liberty of extrapolating . . . The current state of the generic polls gives the Democrats .412/(.412+.377) = 52% of the two-party vote. Going to the graph, we see, first, that 52% for the Democrats is near historic lows (comparable to 1946, 1994, and 1998) and that the expected Democratic vote--given that their party holds the White House--is around -3%, or a 53-47 popular vote win for the Republicans.
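The two-party calculation above is just a renormalization that drops undecideds and third-party responses. A minimal sketch (the function name is mine):

```python
def two_party_share(dem: float, rep: float) -> float:
    """Democratic share of the two-party vote, ignoring undecided and other responses."""
    return dem / (dem + rep)


share = two_party_share(0.412, 0.377)
print(f"{share:.1%}")  # prints 52.2%
```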

Would 53% of the popular vote be enough for the Republicans to win a House majority? A quick look, based on my analysis with John Kastellec and Jamie Chandler of seats and votes in Congress, suggests yes.

It's still early--and there's a lot of scatter in those scatterplots--but if the generic polls remain this close, the Republican Party looks to be in good shape in 2010.

P.S. Is there any hope for the Democrats? Sure. Beyond the general uncertainty in prediction, there is the general unpopularity of Republicans; also, it will be year 2 of the presidential term, not year 6 which is historically the really bad year for the incumbent party. Still and all, the numbers now definitely do not look good for the Democrats.

"Everything's coming up Tarslaw!"


I just finished three novels that got me thinking about the nature of fiction.

First, How I Became a Famous Novelist, by Steve Hely. Just from seeing the back cover of the book, with its hilarious parody of a New York Times bestseller list, I was pretty sure I'd like it. And, indeed, after reading the first six pages or so, I was laughing so hard that I put the book aside to keep myself from reading it too fast--I wanted to savor the pleasure. How I Became... really is a great book--in some ways, the Airplane! of books, in that it is overflowing with jokes. Hely threw in everything he had, clearly confident he could come up with more funny stuff whenever, as needed. (I don't know if you remember, but movie comedies used to be stingy with the humor. They'd space out their jokes with boring bits of drama and elaborate set-pieces. Airplane! was special because it had more jokes than it knew what to do with.) Anyway, Hely's gimmick in How I Became... is to invent dozens of hypothetical books, newspapers, locations, etc. There are bits of pseudo-bestsellers from all sorts of genres. The main character ends up writing a Snow-Falling-on-Cedars-type piece of overwritten historical crap. I have to admit I felt a little envy when he recounts the over-the-top, yet plausible sequence of events that puts him on the bestseller list--I still think this could've been possible with Red State, Blue State if we had had some professional public relations help--but I guess that added to the bittersweet pleasure of reading the book.

The other thing I appreciated about How I Became... was its forthrightness about the effort required to write all of a book and put it all together. I know what he's talking about. It really is a pain in the ass to get a book into good shape. More so on my end: Nick Tarslaw had an editor, a luxury I don't have for my books. (I mean, sure, I have an editor and a copy editor, but the role of the former is mostly to acquire my book and maybe make a few comments and suggestions; we're not talking Maxwell Perkins here. And copy editors will catch some mistakes (and introduce about an equal number of their own), but, again, I'm the one (along with my coauthors) who is doing all the work.)

Finally, I should say that, the minute I started reading How I Became..., I happily recognized it as part of what might be called "The Office" genre of comic novels, along with, for example, Slab Rat, Then We Came to the End, and Personal Days. To me, Then We Came to the End was deeper, and left me with a better aftertaste, than How I Became..., but How I Became... had more flat-out funny moments, especially in its first half. (Set-ups are almost always better than resolutions.)

The next book I read recently was The Finder, by Colin Harrison, a very well-written and (I assume) well-researched piece of crap about a mix of lowlifes, killers, and big shots. The plot kept it moving, and I enjoyed the NYC local color. But, jeez, is it really necessary that the hero be, not only a good guy in every respect, but also happen to have rugged good looks, much-talked-about upper-body strength, and of course be gentle yet authoritative in the sack? Oh, and did I forget to mention, he's also the strong silent type? Does the main female character really have to be labeled by everybody as "pretty" or, occasionally "gorgeous"? Is it really required that the rich guy be a billionaire? Wouldn't a few million suffice? Etc.

Still, even though it insulted my intelligence and moral sensibilities a bit, The Finder was fun to read. One advantage of having no email for a week is that it freed up time to relax and read a couple of books.

Anyway, before reading How I Became..., I would've just taken the above as Harrison's choices in writing his book, but now I'm wondering . . . Did Harrison do it on purpose? Did he think to himself, Hey, I wanna write a big bestseller this time, let me take what worked before and amp it up? I guess what I'm saying is, Hely's book has ruined the enjoyment I can get from trash fiction. At least for a while.

Most recently, I was in the library and checked out The Dwarves of Death, an early novel (from 1990) of Jonathan Coe, author a few years ago of the instant-classic, The Rotters' Club. The Dwarves of Death isn't perfect--for one thing, it has plot holes you could thread the Spruce Goose through, and without needing any careful piloting--but it's just great. It's real in a way that How I Became... is not. This is not a slam on Hely's book, which is an excellent confection; it's more of a distinction between a dessert and a main course.

The Dwarves of Death had so many funny lines I forgot all of them. That said, it wasn't laugh-out-loud funny the way How I Became... was (especially in its first half). Then again, it didn't need to be.

Chris Wiggins sent me a link to this article by Caroline Savage and Andrew Vickers, which, as he puts it, "takes an empirical approach to revealing the community's publishing practices." Here's the abstract:

Many journals now require authors share their data with other investigators, either by depositing the data in a public repository or making it freely available upon request. These policies are explicit, but remain largely untested. We sought to determine how well authors comply with such policies by requesting data from authors who had published in one of two journals with clear data sharing policies. . . .

We received only one of ten raw data sets requested. This suggests that journal policies requiring data sharing do not lead to authors making their data sets available to independent investigators.

Not good. Personally, I hate it when people don't share their data. I've found researchers in biomedical sciences to be particularly bad about this, possibly because (a) these are big-money fields where the investigators are just too damn busy to reply to requests, and (b) pain-in-the-butt Institutional Review Boards make it difficult to share data. Bad stuff all around, and maybe Savage and Vickers's paper will be a valuable wake-up call.

Why are veep nominees so lame?


See here and here.

I received the following email the other day:

I read the abstract for your paper What is the probability your vote will make a difference? with Nate Silver, Aaron Edlin [to appear in Economic Inquiry]. I'd note that the abstract prima facie contains an error. Your sentence in the abstract, "On average, a voter in America had a 1 in 60 million chance of being decisive in the presidential election." can not be correct. If we assume that this sentence is correct that means that given the actual turnout of 132,618,580 people the sum total probability of voters being decisive is larger than one. This of course [sic] is impossible. The total amount of decisiveness must be at most one (although obviously the sum total can be lower than one if the voters are not equally disposed to both candidates). . . .

The above argument is at first appealing but is not actually correct. Actually the total probability can exceed 1. For a simple mathematical example, see p.425 of this paper. The reason the total probability can exceed 1 is that it is possible for many voters to be decisive at the same time.
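To make this concrete, here's a toy enumeration (my own illustration, not the example from the paper): an odd number of voters each vote for one of two candidates with probability 1/2, independently. A voter is decisive exactly when the other voters split evenly, and in that case every voter is decisive at once, which is why the probabilities can sum to more than 1.

```python
from itertools import product


def total_decisiveness(n_voters: int) -> float:
    """Sum over voters of Pr(voter is decisive), for an odd number of voters
    who each vote 0 or 1 with probability 1/2, independently (toy model)."""
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters so there are no ties")
    n_others = n_voters - 1
    # Enumerate every way the other voters could vote.
    outcomes = list(product([0, 1], repeat=n_others))
    # A given voter is decisive exactly when the others are tied.
    ties = sum(1 for votes in outcomes if 2 * sum(votes) == n_others)
    p_decisive = ties / len(outcomes)
    # By symmetry every voter has the same probability.
    return n_voters * p_decisive


print(total_decisiveness(3))  # 1.5
print(total_decisiveness(5))  # 1.875

# The correspondent's arithmetic: turnout times 1-in-60-million is about 2.2,
# which is allowed for the same reason.
print(132_618_580 / 60_000_000)
```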

In the "weakly informative priors" article, we propose a Cauchy (0, 2.5) default prior distribution for logistic regression coefficients, motivating it from applied concerns and also as a regularizer.
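Here's a minimal sketch of the regularization point, under assumptions of my own: a one-coefficient logistic regression with no intercept, four made-up data points that are completely separated, and a crude grid search instead of a real fitting algorithm. With separation the maximum likelihood estimate runs off to infinity (here, to the edge of the grid), while adding the Cauchy(0, 2.5) log prior gives a finite posterior mode. This is an illustration only; it skips the rescaling of inputs and is not the fitting procedure from the article.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_likelihood(beta: float, xs, ys) -> float:
    """Logistic regression log likelihood with a single slope and no intercept."""
    return sum(log_sigmoid(beta * x if y == 1 else -beta * x)
               for x, y in zip(xs, ys))


def log_cauchy_prior(beta: float, scale: float = 2.5) -> float:
    """Log density of a Cauchy(0, scale) distribution."""
    return -math.log(math.pi * scale * (1 + (beta / scale) ** 2))


# Made-up, completely separated data: y = 1 exactly when x > 0.
xs = [-1.0, -0.5, 0.5, 1.0]
ys = [0, 0, 1, 1]

# Crude grid search over beta in [0, 50] in steps of 0.01.
grid = [i / 100 for i in range(0, 5001)]
mle = max(grid, key=lambda b: log_likelihood(b, xs, ys))
mode = max(grid, key=lambda b: log_likelihood(b, xs, ys) + log_cauchy_prior(b))

print("maximum likelihood on the grid:", mle)   # hits the upper end of the grid
print("posterior mode with Cauchy prior:", mode)  # stays finite
```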

Recently, Gregor Gorjanc pointed me to an article by Jairo Fuquen, John Cook, and Luis Pericchi, also recommending Cauchy prior distributions but this time using a robustness argument. Their article is a bit more mathematical than ours, and with a different focus, more concerned with improvements in specific applications than in the construction of a generic default prior distribution. But we have similar messages, and in that sense our papers are complementary.

Mark Thoma links to this article by Bill Easterly about the history of economic development in the mid-twentieth century. Easterly writes:

Why does this history matter today . . . I [Easterly] do NOT mean to imply guilt by association for development as imperialist and racist; there are many theories of development and many who work on development (including many from developing countries themselves) that have nothing to do with imperialism and racism.

But I [Easterly] think the origin of development as cover for imperialism and racism did have toxic legacies for some. First, it meant that the concept of development was determined to fit a propaganda imperative; it was NOT a breakthrough in thought by economists. Second, it followed that development from the beginning would stress the central role of Western aid to help the helpless natives. . . And this history also seems strangely relevant with today's "humanitarian" nouveau-imperialism to invade and fix "failed states" like Iraq and Afghanistan.

I defer to Easterly both on the history and the economics of international development, but I do have one criticism of his argument. It is my impression that a lot of ideas in economic development are not just about the interaction between "first world" and "third world" countries (Easterly's focus) but also relate to struggles within individual third world countries. In some countries, the international development people were opposing white elites. This doesn't mean that either side was necessarily correct in its economic assumptions, but it seems a bit extreme to think of economic development experts as supporting white superiority.

Of all the first-world institutions that were influencing poorer countries during those times fifty years ago, I'd think that the international development community was one of the less racist.

Rorschach's on the loose


According to Josh Millet, the notorious Rorschach inkblots have been posted on the web, leading to much teeth-gnashing among psychologists, who worry that they can't use the test anymore now that civilians can get their hands on the images ahead of time.

For example, here's a hint for Card IV (see below): "The human or animal content seen in the card is almost invariably classified as male rather than female, and the qualities expressed by the subject may indicate attitudes toward men and authority."

160px-Rorschach_blot_04.jpg

So, if they show you this one on a pre-employment test, better play it safe and say that the big figure looks trustworthy and that you'd never, ever steal paperclips from it.

Oh, and when Card II comes up, maybe you should just play it safe and not mention blood at all.

More general concerns

I'm not particularly worried about the Rorschach test since it's pretty much a joke--you can read into it whatever you want--but, as Millet points out, similar issues would arise, for example, if someone stole a bunch of SAT questions and posted them. It would compromise the test's integrity. Millet points out that this problem could be solved if you were to release thousands and thousands of potential SAT questions: nobody could memorize all of these; it would be easier to just learn the material.

I've had the plan for many years to do this for introductory statistics classes: to have, say, 200 questions for the final exam, give out the questions to all the students, and explain ahead of time that the actual exam will be a stratified sample from the list. This would encourage students to study the material but not in a way that they could usefully "game the system." I haven't done this yet--it's a lot of work!--but I'm still planning to do so.
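For what it's worth, the sampling step itself is the easy part. Here's a sketch assuming the 200 questions are grouped into four topics of 50 each and the exam takes 5 from each topic; the topic names, the counts, and the function name are all hypothetical.

```python
import random


def stratified_exam(bank, per_topic, seed=None):
    """Draw a stratified sample of questions without replacement.

    bank: dict mapping topic -> list of questions (the published list)
    per_topic: dict mapping topic -> number of questions to draw from it
    seed: optional seed so the draw can be reproduced
    """
    rng = random.Random(seed)
    exam = []
    for topic, n in per_topic.items():
        exam.extend(rng.sample(bank[topic], n))
    return exam


# Hypothetical question bank: 4 topics x 50 questions = 200 questions.
topics = ["probability", "inference", "regression", "design"]
bank = {t: [f"{t}-Q{i + 1}" for i in range(50)] for t in topics}

exam = stratified_exam(bank, {t: 5 for t in topics}, seed=1)
print(len(exam), "questions on the exam")
```

Stratifying by topic guarantees that every part of the course shows up on the exam, which a simple random sample of 20 from 200 would not.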

Lazy ways of modeling proportions


Andrew Therriault writes:

I'm creating a model of issue emphasis in political campaigns as a product of public opinion (so candidates choose what to discuss strategically based on which issue will help them most), and the data I'm using combines candidates' ad spending (coded by issue) with the public's issue positions in the candidates' districts. Thus far, I've used percentage of ad spending per issue for each candidate as my DV in OLS and tobit models. I know that this specification is not optimal, though, because of the correlation between each candidate's observations (since they are constrained to sum to 100).

Peter Loewen sends along this article that follows up on some work of James Fowler, Aaron Edlin, Noah Kaplan, and myself on rational voter turnout. He goes with the rational-but-not-entirely-self-interested model that we all use (and which first appeared, as far as we know, in a book by Derek Parfit in 1984), and tests it empirically:

How to make graphs that work


Aleks pointed me to this good advice from Seth Godin on preparing graphs. Some snippets:

1. Don't let popular spreadsheets be in charge of the way you look . . . when you show me something exactly like something I've seen a hundred times before, what do you expect me to do? Here's a hint: Zzzzzz.

2. Tell a story

3. Follow some simple rules

- Time goes on the bottom, and goes from left to right

- Good results should go up on the Y axis. This means that if you're charting weight loss, don't chart "how much I weigh" because good results would go down. Instead, chart "percentage of goal" or "how much I lost."

4. Break some other rules . . .

Lynn Vavreck writes:

I just heard the Carter interview about Obama and racism. Simon Jackman and I have a bit of evidence on this from the Cooperative Campaign Analysis Project. These are data from white, registered voters nationwide about stereotypes of different groups. You can see, roughly a third of the people think blacks are inferior to whites on lazy v. hardworking and similarly on intelligent/unintelligent:

lynn.png


This, of course, doesn't answer the question about whether Carter is right -- but, it does provide some systematic evidence for his claim that many Americans don't think African Americans are "qualified" (his words) to lead this country.

Some people would take these data as evidence of racism, but I have a more positive spin. The table gives the average rating for whites, and for southern whites, and from these you can back out the implied rating for non-southern whites. And we're lookin good. We're as intelligent as Asians and almost as hardworking! In the words of a famous non-hardworking, non-southern white: Woo-hoo!

Of course I don't buy this--no way are non-southern whites as hard-working and intelligent as Asian Americans--I mean, c'mon. But it's good to know that white people, at least, still think this. Now I want to see Lynn and Simon break down their respondents by where they live. Do southern whites themselves think they are lazier and dumber than non-southern whites??
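The backing-out step is just a weighted average solved for the unknown piece. A sketch, with the caveat that the share of southern respondents is not something I have from the table, so the numbers below are placeholders and not the survey's values:

```python
def implied_subgroup_mean(overall: float, known: float, known_share: float) -> float:
    """Solve overall = share * known + (1 - share) * unknown for the unknown mean.

    overall: mean rating for the whole group (e.g., all whites)
    known: mean rating for the known subgroup (e.g., southern whites)
    known_share: fraction of the whole group in the known subgroup
    """
    if not 0 <= known_share < 1:
        raise ValueError("known_share must be in [0, 1)")
    return (overall - known_share * known) / (1 - known_share)


# Placeholder inputs, NOT taken from the table.
overall, south, share_south = 4.5, 4.2, 0.3
non_south = implied_subgroup_mean(overall, south, share_south)
print(round(non_south, 2))
```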

P.S. The question stem reads:

Now, some questions about different groups in our society. Rate each group on the following scale, where "1" means you think almost all of the people in that group are "lazy"; and "7" means that you think almost everyone in the group is "hardworking."

A correspondent who wishes to remain anonymous sends this in:

From Pat Robertson's Regent University course catalog:

GOV 601 Quantitative Analysis (3) Skills for quantitative data gathering, measurement, policy analysis and program evaluation. Research and sampling design, surveys, data collection and data reduction and display. Review of basic statistics through multivariate analysis, z-scores, regression through the use of statistical computer package (SPSS), and a Judeo-Christian perspective on the use of statistics.

I wonder if they teach the principle that God is in every leaf of every tree.

(I looked for the course description online but couldn't find it. But the description seems consistent with others in the catalog.)

I'm in Paris through Aug, 2010


Pour l'année sabbatique, à Sciences Po.

See here for a link to a statistical consultant. I don't know the guy, but his credentials seem strong and he has a nice-looking website and what seems like a good philosophy.

A very specialized sort of spam


I received the following in my inbox, from emily@nextadvance.com:

When I visited AT&T Labs last month, I saw some beautiful poster-size maps from Yifan Hu, visualizing structures in large data sets. Here's TV Land:

uverse_1000_country_labels.png

The research is by Emden Gansner, Yifan Hu, Stephen Kobourov, Chris Volinsky, and you can download their article and more maps at the above link.

NYC datasets

Abhishek Joshi of the Columbia Population Research Center (CPRC) writes:

CPRC is pleased to offer an easy way to locate New York City related databases. The New York City Dataset link includes data such as the NYC Community Health Survey, NYC Youth Risk Behavior Survey, NYC HANES, MapPLUTO and much much more. The website provides easy access for downloading data sets, code books, data dictionaries, online data extraction tools and other relevant documentation.

I did a quick check and it seems that you can access a lot of information here without needing a Columbia University password.

An Anova sort of question

Suresh Krishna writes:

One of my favorite cartoons, by Charles Barsotti, shows a hot dog excitedly saying, "Hey everybody, we've been invited to a cookout!" I share this with my classes when I teach decision analysis to emphasize that different people (or, more generally, "agents") have different goals.

I was thinking about this point recently after a discussion here a couple weeks ago about the first-player advantage in Risk. Commenter Ken Williams suggested solving the problem by alternating games. In considering why this seems like a bad idea to me (beyond the impracticality of playing several games of Risk back to back), I realized that the relevant issue here is not fairness but rather fun, or playability. After all, for fairness alone you only need to randomize who starts first and that solves the problem. But, if there's a huge first-player advantage, the game still might not be so playable. It's not always a lot of fun to play a game if you know to start with that you're gonna lose.

[Have to leave office before 4 to pick up kids] + [No internet yet at home] = [Can't read email]

Response to two-slit discussion

Thanks for all the comments. I responded here. To summarize briefly:

1. Many people commented that the laws of probability work just fine in quantum mechanics, you just have to include the act of measurement in your model: there is no latent joint distribution that exists out there to be passively measured.

I agree, but my point was that when we apply probability theory to analyze surveys, experiments, observational studies, etc., we typically do assume a joint distribution and we typically do treat the act of measurement as a direct application of conditional probability. Classical probability theory (which we use all the time in poli sci, econ, psychometrics, astronomy, etc.) needs to be generalized to apply to quantum mechanics, which makes me wonder if it should be generalized for other applications too.

2. Some commenters discussed work in political science and psychometrics in which researchers are working on generalized probability models, inspired by quantum probability, to do statistical data analysis. Looks like it could be interesting.

P.S. Just to clarify further: I know more physics than most statisticians do, but that's not a lot, and I certainly don't think I have anything useful to say about quantum mechanics beyond what Richard Feynman (or, for that matter, Bill Jefferys) has written already. Where I do have expertise is in the application of probability models to diverse applied fields. And what I'm wondering is whether it would be appropriate to generalize the usual probability models there, just as it is necessary to do for quantum mechanics.

This is all standard physics. Consider the two-slit experiment--a light beam, two slits, and a screen--with y being the place on the screen that lights up. For simplicity, think of the screen as one-dimensional. So y is a continuous random variable.

Consider four experiments:

1. Slit 1 is open, slit 2 is closed. Shine light through the slit and observe where the screen lights up. Or shoot photons through one at a time, it doesn't matter. Either way you get a distribution, which we can call p1(y).

2. Slit 1 is closed, slit 2 is open. Same thing. Now we get p2(y).

3. Both slits are open. Now we get p3(y).

4. Now run experiment 3 with detectors at the slits. You'll find out which slit each photon goes through. Call the slit x. So x is a discrete random variable taking on two possible values, 1 or 2. Assuming the experiment has been set up symmetrically, you'll find that Pr(x=1) = Pr(x=2) = 1/2.

You can also record y, thus you can get p4(y), and you can also observe the conditional distributions, p4(y|x=1) and p4(y|x=2). You'll find that p4(y|x=1) = p1(y) and p4(y|x=2) = p2(y). You'll also find that p4(y) = (1/2) p1(y) + (1/2) p2(y). So far, so good.

The problem is that p4 is not the same as p3. Heisenberg's uncertainty principle: putting detectors at the slits changes the distribution of the hits on the screen.

This violates the laws of conditional probability, in which you have random variables x and y, and in which p(x|y) is the distribution of x if you observe y, p(y|x) is the distribution of y if you observe x, and so forth.
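To see the mismatch between p3 and p4 numerically, here is a minimal sketch in Python. It is a toy far-field model, not a physical simulation: the Gaussian envelope, the phase constant 1.5, and the grid for y are all assumptions chosen for illustration, and in this idealization p1 and p2 happen to coincide. The only point is that adding amplitudes and then squaring (experiment 3) gives a different distribution than mixing the two single-slit distributions (experiment 4).

```python
import numpy as np

# Screen positions y on a one-dimensional grid (hypothetical units).
y = np.linspace(-10.0, 10.0, 2001)
dy = y[1] - y[0]

def normalize(density):
    """Rescale a nonnegative curve so it sums to 1 on the grid (times dy)."""
    return density / (density.sum() * dy)

# Assumed toy amplitudes: a shared Gaussian envelope, with opposite phases
# standing in for the different path lengths from the two slits.
envelope = np.exp(-y**2 / 18.0)
phase = 1.5 * y
psi1 = envelope * np.exp(+0.5j * phase)
psi2 = envelope * np.exp(-0.5j * phase)

p1 = normalize(np.abs(psi1) ** 2)          # experiment 1: only slit 1 open
p2 = normalize(np.abs(psi2) ** 2)          # experiment 2: only slit 2 open
p3 = normalize(np.abs(psi1 + psi2) ** 2)   # experiment 3: both open, no detectors
p4 = 0.5 * p1 + 0.5 * p2                   # experiment 4: mixture over x = 1, 2

# Largest pointwise difference between the two "both slits open" distributions.
max_gap = np.max(np.abs(p3 - p4))
print(f"max |p3(y) - p4(y)| on the grid: {max_gap:.3f}")
```

In this sketch p3 has interference fringes and p4 does not, so no single joint distribution p(x, y) can have p4(y|x) as its conditionals and p3(y) as its marginal.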

A dissenting argument (that doesn't convince me)

To complicate matters, Bill Jefferys writes:

As to the two slit experiment, it all depends on how you look at it. Leslie Ballentine wrote an article a number of years ago in The American Journal of Physics, in which he showed that conditional probability can indeed be used to analyze the two slit experiment. You just have to do it the right way.

I looked at the Ballentine article and I'm not convinced. Basically he's saying that the reasoning above isn't a correct application of probability theory because you should really be conditioning on all information, which in this case includes the fact that you measured or did not measure a slit. I don't buy this argument. If the probability distribution changes when you condition on a measurement, this doesn't really seem to be classical "Boltzmannian" probability to me.

In standard probability theory, the whole idea of conditioning is that you have a single joint distribution sitting out there--possibly there are parts that are unobserved or even unobservable (as in much of psychometrics)--but you can treat it as a fixed object that you can observe through conditioning (the six blind men and the elephant). Once you abandon the idea of a single joint distribution, I think you've moved beyond conditional probability as we usually know it.

And so I think I'm justified in pointing out that the laws of conditional probability are false. This is not a new point with me--I learned it in college, and obviously the ideas go back to the founders of quantum mechanics. But not everyone in statistics knows about this example, so I thought it would be useful to lay it out.

What I don't know is whether there are any practical uses to this idea in statistics, outside of quantum physics. For example, would it make sense to use "two-slit-type" models in psychometrics, to capture the idea that asking one question affects the response to others? I just don't know.

On many occasions it's handy to have a list of conjugate prior distributions. Several books have it, but if you're typing away on a beach somewhere, let me provide some links:

John Cook's summary of univariate conjugate prior relationships:

conjugate.png

John links to two other good sources: Wikipedia and Daniel Fink's "A Compendium of Conjugate Priors".

John Cook also has a clickable diagram of distribution relationships, a subset of a much larger one by Leemis and McQueston (click to enlarge):

univ16.png

(Material found via LingPipe's introduction to Bayesian statistics, thanks Bob.)

Zbicyclist writes:

Doug Rivers's discussion about weighting and bias reminds me of a common trick I don't see very much, and that is using effective sample size as a planning/tracking tool. ESS is (Sum of the weights)^2 / Sum of (weights^2).

If all n cases are weighted equally, ESS is n. Otherwise, ESS < n.

Because I've instituted this as a tracking signal and as a planning tool, I've sometimes been asked to justify it; the clearest explication is Kish, Leslie (1992) Weighting for unequal Pi. Journal of Official Statistics, 8, 183-200, which makes sense, since I pulled the ESS formula from my class notes from when I took sampling from Kish.

My question is this: given that this is such a simple tracking and planning metric, why does it seem so hard to find in the literature?

My reply: I dunno. I use this formula all the time, but I usually just derive it myself from scratch when I need it. In general, survey sampling books are weak on this stuff. The other thing is that this formula in general overestimates the sampling variance, I think. It's a formula for variance with unequal-probability sampling (as indicated by the title of the Kish article). When weights are constructed using poststratification (as is standard, see for example any of my many articles on the topic), the sampling variance will be lower.
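For anyone who wants to use the effective sample size formula quoted above as a tracking number, here is a minimal sketch in Python. The function name and the example weights are made up for illustration; the formula itself is the one given in the question, (sum of the weights)^2 / sum of (weights^2).

```python
import numpy as np

def effective_sample_size(weights):
    """Kish-style effective sample size: (sum of w)^2 / sum of (w^2)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

# Hypothetical weights, for illustration only.
equal = effective_sample_size([1, 1, 1, 1])    # equal weights: ESS equals n
unequal = effective_sample_size([1, 1, 1, 3])  # one heavy weight: ESS drops below n

print(equal, unequal)
```

With four equally weighted cases the function returns 4; giving one case a weight of 3 pulls it down to 36/12 = 3. As noted in the reply, this should be read as a rough guide when the weights come from poststratification.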

This note by Robin Hanson (in which he expresses his irritation with state highway departments that leave cones on the highways too long, thus unnecessarily restricting traffic lanes) reminded me of an idea I had when I moved to Berkeley, California, many years ago. I lived on a residential street that was only a few blocks long. But boy was it wide. Really wide. Here's a recent picture from Google maps:

spaulding.png

My thought was: why not narrow the street by about 50% and give the extra land to the owners of the property? The lots are pretty small there and property values are high--higher now than in 1990, I'm sure. So it's basically free money. As a renter, I didn't think too much about this, but I really don't see why nobody's done this. They don't even have to do the whole city, they could do it one street at a time.

Is $98/hour a high rate of pay?

John Sides and Joshua Tucker link to a news article by Jeremy Peters that reports that former New York State governor Eliot Spitzer is teaching a course at the City University of New York for "$98.43 an hour, or about $4,500 for the semester." This comes out to 45 or 46 hours--let's say 3 classroom hours a week for 15 weeks.

I noticed a few interesting things in the article.

1. I think it's ridiculous to consider $4500 for a course to be a rate of $98/hour. Teaching isn't just lecturing. You also have to prepare the classes, meet with students, write exams, and grade homeworks. $98/hour sounds like a lot, but it's based on a low denominator.

(There are exceptions, though. I know of a professor who paid the T.A. $100 per lecture to teach the class when the prof was out of town. It happened several times during the semester.)

2. I thought it was interesting that the commenters identified in the news article seemed to think that $4500 was a high rate of pay. I mean, suppose you teach 8 courses a year at $4500 each. That's $36,000. Hardly Richie Rich territory. This point is made at the very end of the article ("The point is not that Spitzer is paid too much, but rather that most adjuncts are paid too little") but it didn't really come through at first.

3. Sides writes that "This isn't pretty." I don't see what's so bad about Spitzer teaching a class. He knows a lot about politics and would seem to be well qualified to be an adjunct professor. I thought that was the ideal, to have adjuncts who are working professionals who take time off to teach a class.
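The arithmetic behind points 1 and 2 can be laid out in a few lines of Python. The dollar figures are the ones from the news article; the two hours of outside work per classroom hour is purely a hypothetical number that I plugged in to show how sensitive the hourly rate is to the denominator.

```python
stated_rate = 98.43      # dollars per hour, as reported
semester_pay = 4500.0    # dollars per course, as reported

# Hours implied by the reported figures (classroom time only).
classroom_hours = semester_pay / stated_rate

# ASSUMPTION (hypothetical): 2 hours of preparation, grading, and meetings
# for every classroom hour.
outside_hours_per_classroom_hour = 2.0
total_hours = classroom_hours * (1.0 + outside_hours_per_classroom_hour)
effective_rate = semester_pay / total_hours

# Point 2: a full load of 8 such courses per year.
annual_pay = 8 * semester_pay

print(f"implied classroom hours: {classroom_hours:.1f}")
print(f"effective hourly rate under the assumption: ${effective_rate:.2f}")
print(f"annual pay for 8 courses: ${annual_pay:,.0f}")
```

Under that assumed workload the effective rate is a third of the stated one, which is the sense in which the $98 figure rests on a low denominator.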

Pete Lindstrom writes:

I was wondering if you could blog on the points discussed in the WSJ at this link. Apparently, there is a controversy over ways to use clinical data to calculate risks - one method adjusting for time and another using absolute numbers for the entire length of the study.

My (wholly inadequate) reply: This is interesting, but I have to say, I find the article pretty confusing. It's written in the standard journalistic style of going forward and backward in time, rather than in the scientific-journal style of presenting the data and models all in one place. If this was something I had to do, I'd puzzle through what's happening here. Luckily for me, I'm blogging just for fun and so I'll just let the question sit for others to worry about.

Listen to me on the radio

Apparently the BBC has a radio show all about statistics. It's broadcast at 1330 on Friday afternoons and repeated at 2000 on Sundays on Radio 4. I taped the interview a few weeks ago; I wonder how much of it they'll use. The interviewer, Tim Harford, was excellent. I just hope they cut some of the part near the end when I got too relaxed and started to babble.

The topic was my article (with David Weakliem), Of Beauty, Sex, and Power.

P.S. I took a listen. They did a pretty good job of cutting and pasting my rambling into coherent sequences of sentences. I still sound pretty dry and professorial, I'm afraid, but at least it's to the point.

Regarding the request for a "good graphical way of showing changes in the distribution of a population among quantile categories," Antony Unwin sends in this:

MultBarsFatherSonIncome.png

He writes:

Matt Fox writes:

I teach various Epidemiology courses in Boston and in South Africa and have been reading your blog for the past year or so and used several of your examples in class . . . I am curious to know why you are skeptical of structural models. Much of my training has been in how essential these models are and I rarely hear the other side of the debate.

I've never used structural models myself. They just seem to require so many assumptions that I don't know how to interpret them. (Of course the same could be said of Bayesian methods, but that doesn't seem to bother me at all.) One thing I like to say is that in observational settings I feel I can interpret at most one variable causally. The difficulty is that it's hard to control for things that happen after the variable that you're thinking of as the "treatment."

To put it another way, there's a research paradigm in which you fit a model--maybe a regression, maybe a structural equations model, maybe a multilevel model, whatever--and then you read off the coefficients, with each coefficient telling you something. You gather these together and those are your conclusions.

My paradigm is a bit different. I sometimes say that each causal inference requires its own analysis and maybe its own experiment. I find it difficult to causally interpret several different coefficients from the same model.

It's all about the salamanders

Jacob Felson writes:

I have a statistics question that may lend itself to multilevel modeling and incomplete data.

I get so irritated when economists and political scientists try to explain every sort of irrational behavior in life as being part of some utility function.

That's one reason I love this paper by Erik Snowberg and Justin Wolfers, "Explaining the Favorite-Longshot Bias: Is it Risk-Love or Misperceptions?" They conclude that, yes, it's misperceptions:

Doug writes:

Probability sampling is a great invention, but rhetoric has overtaken reality here. Both of the probability samples in this study had large amounts of nonresponse, so that the real selection probability--i.e., the probability of being selected by the surveyor and the respondent choosing to participate--is not known. Usually a fairly simple nonresponse model is adequate, but the accuracy of the estimates depends on the validity of the model, as it does for non-probability samples. Nonresponse is a form of self-selection. All of us who work with non-probability samples should spend our efforts trying to improve the modeling and methods for dealing with the problem, instead of pretending it doesn't exist.

Good stuff. Read the whole thing. Doug was, along with me and several others, an advisor on the recent report on National Election Study weighting.

The 4pm rule

People keep asking me about this, so let me explain . . . "4pm" refers to local time, wherever I happen to be.

Graphing a transition matrix

Andy Baxter writes:

I wondered if you had a suggestion for a good graphical way of showing changes in the distribution of a population among quantile categories from one time period to another. I'm working on a project in which I need to show our district leadership the stability of various value-added estimates of a given teacher's effectiveness from year to year. For example, how many teachers in the first quintile remain in the first quintile 1 year later? I know I could probably just do it with a table, but I wondered if there was a better way to do it with a graph. Any ideas or links to good examples?

My reply:

I imagine there's been a lot of work on this general task: it's basically the same problem as summarizing transition matrices, which is a big issue in sociology. Anyway, here's my quick suggestion.

Label i as the starting quintile and j as the quintile one year later. You then have 25 data points, corresponding to the percentage of teachers that start in quintile i and end up in quintile j. Call these p_ij. The sum of the 25 p_ij's will be 100% (by definition).

The natural next step would be to make a scatterplot showing these 25 values, perhaps a circle in each grid point with the size of the circle at (i,j) being proportional to p_ij. But I have a slightly different idea which takes up a bit more space but might be more helpful in showing what you're looking for.

I'm thinking of a display with 5 narrow plots, side by side. Plots i=1,2,3,4,5 correspond to starting quintiles 1,2,3,4,5. Plot i has 5 arrows, each starting at position (0,i) and going to positions (1,j), j=1,2,3,4,5. The width of the j-th arrow here is proportional to p_ij. The separate plots can be pretty narrow because they are only going from 0 to 1 on the x-axis.

My suggestion is to give this a try. If it works out, please let me know--I can post the graph on the blog.
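As a starting point, here is a rough sketch in Python of the bookkeeping behind the display: computing the 25 values p_ij and the arrow specifications for the five narrow plots. The teacher data are simulated and entirely hypothetical, the function name is my own, and I am assuming each arrow runs from x = 0 to x = 1. The drawing itself is left to whatever graphics package you prefer.

```python
import numpy as np

def transition_percentages(start, end, n_groups=5):
    """Percentage of all cases starting in group i and ending in group j.

    start and end hold group labels 1..n_groups; the returned matrix has
    entry [i-1, j-1] equal to p_ij, and all entries sum to 100.
    """
    counts = np.zeros((n_groups, n_groups))
    for i, j in zip(start, end):
        counts[i - 1, j - 1] += 1
    return 100.0 * counts / counts.sum()

# FAKE DATA (assumption): 500 teachers who move at most one quintile per year.
rng = np.random.default_rng(0)
start = rng.integers(1, 6, size=500)
end = np.clip(start + rng.integers(-1, 2, size=500), 1, 5)

p = transition_percentages(start, end)

# One entry per arrow: plot index i, tail (0, i), head (1, j), and a line
# width proportional to p_ij.
arrows = [
    {"plot": i, "tail": (0, i), "head": (1, j), "width": p[i - 1, j - 1]}
    for i in range(1, 6)
    for j in range(1, 6)
]

print(np.round(p, 1))
```

Each of the five plots then draws the five arrows whose "plot" value matches its index. An arrow of width zero simply doesn't appear, which is itself informative about which transitions never happen.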

I'm thinking of a graph with 25 lines, where the width of line ij is proportional to p_ij. The positions of the lines

How about a set of five graphs, one for each of the five "before" quintiles. Each graph has five lines showing the number of cases starting in

Steve Farmer writes:

I am working on a hierarchical logistic regression model in SAS using the GLIMMIX function. I really liked your average predictive comparisons Figure 21.7 from your book, "Data Analysis Using Regression and Multilevel/Hierarchical Models" (2007, p 473). I wondered if you could direct me to the easiest way to accomplish this analysis short of doing it manually. Is there a command or macro you could direct me to in SAS or STATA?

My reply: Some version of this may actually be available in Stata, although my guess is that it would be based on a single central point rather than averaging over all the data as we do in our book (and, before that, in my article with Pardoe). I've been planning to put it in an R package sometime but have never gotten around to it.

JoAnn Kuchera-Morin's Allosphere. See also here.

Mapping sin

David Shor writes:

You did some initial analysis of Nate's election forecasting work over on FiveThirtyEight.

Forgive me for plugging in my own [Shor's] work, but I did pre-election poll aggregation using a different methodology, and performed about on par with Nate. (Slightly better with state level presidential races, worse with senate races, and quite a bit better with the popular vote).

A comparison of our results (and a spreadsheet with quite a bit of data) is available here.

In terms of possible improvements for next cycle:

Troels Ring writes:

You undoubtedly know this site and the idea behind it, also presented in the book by Spiegelhalter et al. (2004). A recent reason for wondering about this is a paper in the American Journal of Kidney Diseases (2009; 53: 208-217) claiming that protein restriction kills people, with a hazard ratio of 1.15 to 3.20. To "believe" this, if I understand it, the prior would have to have weights above 1.9, which is strange since the anticipated effect would be beneficial. I have found few references to this method (a paper by Greenland mentions it briefly), and I'd like to hear your view of it.

My reply: I actually hadn't heard of this research before. It looks like it could be useful. I have no time to think more about this now, but my quick thought is that there's something a bit wacky about making decisions based on the endpoint of a 95% interval. It doesn't seem so Bayesian to me. On the other hand, I do this myself to some extent often enough, and in ARM we have a whole chapter on power calculations, so I don't quite follow a consistent line on this myself.

Ring adds:

Tyler Cowen links to an article by economist Ed Glaeser on urban political activists Jane Jacobs and Robert Moses. Moses, who ran various NYC government commissions in the mid-twentieth century, is famous for organizing the construction of bridges and structuring the financing so that he controlled the flow of money from the tolls. This independent source of funding gave him a huge amount of power within the government to do almost whatever he wanted--for a while, until Jacobs and others mustered the popular support to stop him. Given my experiences at Columbia University, I can appreciate Moses's bureaucratic acumen: in any organization I've been involved in, there aren't so many sources of free money--that is, funds that haven't already been allocated to some expense. Free money is a source of power. I imagine this is true within corporations as well.

That's all tactics, though. What's relevant for Glaeser's article is what Robert Moses did with his money and power, which was to build some highways and attempt to build others that, on the plus side, would make it faster for people to go through New York City on the way to or from other places and, on the minus side, would destroy some neighborhoods and make many of the un-destroyed neighborhoods less pleasant to be in (by being next to a highway, disconnected from the rest of the city, etc).

What about the specifics? Glaeser agrees that Moses's proposed lower Manhattan expressway was a bad idea, as was his highway that destroyed a neighborhood in the Bronx. On the plus side, Glaeser supports Moses's parks and swimming pools and describes his roads and bridges as "not all bad."

One thing that interests me about Glaeser's discussion is that, implicitly, there are two levels of liberal-conservative dispute here.

I received the following email:

I was hoping you could take a moment to counsel me on a problem that I'm having trying to calculate correct confidence intervals (I'm actually using a bootstrap method to simulate 95% CIs). . . . [What follows is a one-page description of where the data came from and the method that was used.]

My reply:

Without following all the details, let me make a quick suggestion which is that you try simulating your entire procedure on a fake dataset in which you know the "true" answer. You can then run your procedure and see if it works there. This won't prove anything but it will be a way of catching big problems, and it should also be helpful as a convincer to others.

If you want to carry this idea further, try to "break" your method by coming up with fake data that causes your procedure to give bad answers. This sort of simulation-and-exploration can be the first step in a deeper understanding of your method.
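To make the fake-data idea concrete, here is a minimal sketch in Python. I don't know the correspondent's actual procedure, so a percentile bootstrap interval for a simple mean stands in for "your entire procedure"; the normal data, the true value of 2.0, the sample size, and the function names are all assumptions for illustration. The structure is what matters: simulate data where you know the truth, run the procedure, and count how often the interval covers.

```python
import numpy as np

def bootstrap_ci(data, rng, n_boot=500, level=0.95):
    """Percentile bootstrap interval for the mean of a 1-d array."""
    n = len(data)
    # Resample row indices with replacement: one row per bootstrap replicate.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = data[idx].mean(axis=1)
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(boot_means, [alpha, 1.0 - alpha])
    return lower, upper

def coverage(true_mean=2.0, n=50, n_sims=200, seed=1):
    """Share of fake datasets whose interval contains the known truth."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        fake = rng.normal(true_mean, 1.0, size=n)  # fake data, truth known
        lower, upper = bootstrap_ci(fake, rng)
        hits += int(lower <= true_mean <= upper)
    return hits / n_sims

rate = coverage()
print(f"share of nominal 95% intervals covering the truth: {rate:.2f}")
```

If the share comes out far from 95%, that's a sign of a big problem somewhere. To try to "break" the method, swap the normal draws for something nastier (heavy tails, tiny n, clustered observations) and see where the coverage falls apart.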

And then I got another, unrelated email from somebody else:

I am working on a mixed treatment comparison of treatments for non-small cell lung cancer. I am doing the analysis in two parts in order to estimate treatment effects (i.e. log hazard ratios) and absolute effects (by projecting the log hazard ratios onto a baseline treatment scale parameter; the baseline treatment times to event are assumed to arise from a Weibull distribution. . . . [What follows is a one-page description of the model, which was somewhat complicated by constraints on some of the variance parameters] . . . I can get my analysis to run with constraints imposed on the treatment specific prior distributions for PFS and OS, and on the population log hazard ratios for PFS and OS. However, my problem is that the constraint does not appear to be doing anything and the results are similar to what I obtain without imposing the constraint. This is not what I expect . . .

My reply:

Sometimes the data are strong enough that essentially no information is supplied by external constraints. You can, to some extent, check how important this is for your problem by simulating some fake data from a setting similar to yours and then seeing whether your method comes close to reproducing the known truth. You can look at point estimates and also the coverage of posterior intervals.

More on fitting multilevel models

Eric Schwartz writes:

Thanks for the blog post. I have three follow up questions:

The National Election Study is hugely important in political science, but, as with just about all surveys, it has problems of coverage and nonresponse. Hence, some adjustment is needed to generalize from sample to population.

Matthew DeBell and Jon Krosnick wrote this report summarizing some of the choices that have to be made when considering adjustments for future editions of the survey. The report was put together in consultation with several statisticians and political scientists: Doug Rivers, Martin Frankel, Colm O'Muircheartaigh, Charles Franklin, and me. Survey weighting isn't easy, and this sort of report is just about impossible to write--you can't help leaving things out. They did a good job, though, and it's great to have this stuff put down in an official way, so that people can work off of it going forward.

It's a lot harder to write a procedure for general use than to do a single analysis oneself.

Some corrections

I have a few corrections to add to the report that unfortunately didn't make it into the final version (no doubt because of space limitations):

R Flashmob

Dan Goldstein, Michael E. Driscoll, and JD Long write:

We're organizing an "R Flashmob" to get students and researchers asking and answering R questions on the very useful questions site "stackoverflow.com" during one fun-filled hour.

We were wondering if you might help out in promoting this event by blogging something about it and/or emailing it to students or researchers who would benefit by having a friendly, efficient resource for asking and answering R questions. It would be great if you could!

Reality meets the DeLilloverse

Eric Schwartz writes a long question (complete with R code). My reply is at the end. Schwartz writes:

How do you recommend one tests whether fixed effects are different from zero in a generalized linear mixed model?

Douglas Bates argues that it is inappropriate to use p-values from t-statistics reported in lmer(). I've followed all of his and your subsequent postings on the topic that I have found, but still have some questions:

Bill Harris writes:

I'm not a professional statistician, but I do use statistics in my work, and I'm increasingly attracted to Bayesian approaches.

Several colleagues have asked me to describe the difference between Bayesian analysis and classical statistics. I think I've not yet succeeded well, and so I was about to start a blog entry to clear that up. Then I decided to look around.

Things haven't changed much since the 8-schools experiment, apparently. (See also this article by Ben Hansen.) Howard Wainer once told me that SAT coaching is effective--it's about as effective as the equivalent number of hours in your math or English class at school.
