Matching entries from Statistical Modeling, Causal Inference, and Social Science

The two incompatible options of empirical microeconomics

I enjoy reading the Freakonomics blog, but as I've noted previously, I remain puzzled by the presence of two appealing but, to my mind, incompatible forms of reasoning that seem to be used more generally in the world of "freakonomics"...

Evidence that, in India, girls get less mother's milk than boys, leading to higher infant mortality among girls (leaving me confused about whether this is explaining 14% or most of the differences observed in data)

Chris Blattman reports on a study by Seema Jayachandran and Ilyana Kuziemko that makes the following argument: Medical research indicates that breastfeeding suppresses post-natal fertility. We [Jayachandran and Kuziemko] model the implications for breastfeeding decisions and test the model's predictions...

Interactions and Bayesian Anova

Gregor Gorjanc writes:...

Why don't I do more explicit modeling of spatial or temporal patterns?

Paul Cross writes: In reading your book and papers on multilevel modeling I've noticed that you do not do much explicit modeling of spatial or temporal effects. I'm wondering if this is philosophically driven, perhaps because you prefer to get...

Coding ordinal input variables in a regression

Denis Cote writes: I am reviewing a paper using logistic regression and I am uncertain about the way they coded their inputs. They have different ordinal variables coming from self-report questions. For example, "self-perceived health" with its answer choices: excellent,...

Econometrics reaches The Economist

Hal Varian pointed me to this article in The Economist: Instrumental variables help to isolate causal relationships. But they can be taken too far "Like elaborately plumed birds...we preen and strut and display our t-values." That was Edward Leamer's uncharitable...

New book on Bayesian nonparametrics

Nils Hjort, Chris Holmes, Peter Muller, and Stephen Walker have come out with a new book on Bayesian Nonparametrics. It's great stuff, makes me realize how ignorant I am of this important area of statistics. Here are the chapters: 0....

Estimating treatment effects that vary, with application to voter mobilization experiments in political science

Avi Feller and Chris Holmes sent me a new article on estimating varying treatment effects. Their article begins: Randomized experiments have become increasingly important for political scientists and campaign professionals. With few exceptions, these experiments have addressed the overall causal...

When to standardize regression inputs and when to leave them alone

Daniel Egan sent me a link to an article, "Standardized or simple effect size: What should be reported?" by Thom Baguley, that recently appeared in the British Journal of Psychology. Here's the abstract: It is regarded as best practice for...

Work with me in Paris on a postdoc!

Among other things, while on sabbatical in Paris next year I'll be working with my longtime collaborator Frederic Bois, a toxicologist who uses hierarchical Bayes models extensively. We have a project in toxicology that necessarily also involves research in Bayesian...

That modeling feeling

It goes like this: there's something you want to estimate and you have some data. Maybe, to take my favorite recent example, you want to break down support for school vouchers by religion, ethnicity, income, and state (or maybe you'd...

More on Pearl's and Rubin's frameworks for causal inference

To follow up on yesterday's discussion, I wanted to go through a bunch of different issues involving graphical modeling and causal inference. Contents: - A practical issue: poststratification - 3 kinds of graphs - Minimal Pearl and Minimal Rubin -...

My talks in Seattle and Vancouver

1. Coalitions, voting power, and political instability. Thurs 4 Jun, 3:30pm, Kane Hall 210 at the University of Washington. Part of the Math Across Campus series. We shall consider two topics involving coalitions and voting. Each topic involves open questions...

Having daughters rather than sons makes you more liberal: a link followed by a plea

Andrew J. Oswald and Nattavudh Powdthavee write: In remarkable research, the sociologist Rebecca Warner and the economist Ebonya Washington have shown that the gender of a person's children seems to influence the attitudes and actions of the parent. Warner (1991)...

Advice on statistical model building

Andrew Grogan-Kaylor writes:...

Bayes, Jeffreys, prior distributions, and the philosophy of statistics

Christian Robert, Nicolas Chopin, and Judith Rousseau wrote this article that will appear in Statistical Science with various discussions, including mine. I hope those of you who are interested in the foundations of statistics will read this. Sometimes I feel...

Discussion of Red State, Blue State

The political website Talking Points Memo is featuring a discussion of Red State, Blue State this week. The discussants so far have included software developer / political activist Aaron Swartz, historian Eric Rauchway, political scientist Nolan McCarty, journalist Steve Sailer,...

Main effects and interactions

We all know to look at main effects first and then look for interactions. But a former student pointed me to some disturbing advice from some statistics textbooks. I'll give his quotes and then my reactions:...

Reactions to "Planning the Optimal Get-out-the-vote Campaign Using Randomized Field Experiments," including a bunch of comments that should certainly be of interest to quantitative political scientists

Aaron Strauss spoke today on his work with Kosuke Imai on estimating the optimal order of priority and the optimal approach for contacting voters in a political campaign. They use inferences from field experiments on voter turnout and persuasion and...

McCloskey et al. on significance testing in economics

Now that we're on the topic of econometrics . . . somebody recommended to me a book by Deirdre McCloskey. I can't remember who gave me this recommendation, but the name did ring a bell, and then I remembered I...

Mostly Harmless Econometrics

I just read the new book, "Mostly Harmless Econometrics: An Empiricist's Companion," by Joshua Angrist and Jorn-Steffen Pischke. It's an excellent book and, I think, well worth your $35. I recommend that all of you buy it. I also have...

Official journal publication of our article on weakly informative priors

By Aleks, Grazia, Yu-Sung and myself. Here's the article, and here's the abstract: We propose a new prior distribution for classical (nonhierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5,...

Exogenous tax cuts and treatment interactions

Nate Silver and Greg Mankiw have an interesting exchange about the use of exogenous instruments to estimate causal effects. Unfortunately, the subject is macroeconomics, a topic on which I know next to nothing beyond what I learned in Mr. Cutlip's...

An idea for a course on statistical communication

This semester I'm teaching my "how to teach" class: The Teaching of Statistics at the University Level. (Stat 6600, for those of you here at Columbia.) I'll post more on that in a bit. Here I want to talk about...

Rich voter, poor voter, white voter, black voter

A couple weeks ago I posted an analysis of rich and poor voters in rich and poor states from exit polls in 2008, and a commenter ("Audacious Epigone") picked up on Larry Bartels's observation that, among whites, the Republican advantage...

Education, uncertainty and interactions

AT writes: A Facebook friend pointed me to this Gladwell piece discussing how you can('t) predict whether a teacher will be successful, but more importantly, on the range of advancement of a class depending on a teacher's ability. The claim...

Fitting a model with constraints

Chris Chatham writes: I am using multilevel logistic regression to model individuals' abilities to 'stop' a planned motor movement (my binary outcome), based on the delay between the beginning of the trial and the occurrence of the stop signal (my...

Evaluating multi-site interventions

Rajeev sends a link to this paper on hierarchical modeling for evaluating multi-site interventions: This article discusses the evaluation of programs implemented at multiple sites. Two frequently used methods are pooling the data or using fixed effects (an extreme version...

More on interactions

Bruce McCullough writes: Don't know if you're aware of this, but if you need more evidence for the primacy of interaction effects, data mining is a great place to look. My degree is in economics. I was taught to use...

My talks this week in D.C.: today (Wed.) at George Washington University, Thurs. at the Cato Institute

If you're in D.C., you should stop by. . . . I'm speaking in the statistics department at George Washington University on the topic of interactions. Here's the powerpoint and here's the abstract: As statisticians and practitioners, we all know...

Case-mix adjustment in non-randomised observational evaluations: the constant risk fallacy

Mohammed Mohammed points me to this article by John Nichols, which begins: Observational studies comparing groups or populations to evaluate services or interventions usually require case-mix adjustment to account for imbalances between the groups being compared. Simulation studies have, however,...

Engineers think about the method, statisticians think about psychology

See here for Jeremy's comments to my comments. I agree with what he writes. The whole discussion reminds me of a comment made to me once by a statistician who generally works with engineers. He said that when he talks...

When voting on Supreme Court nominees, senators respond to public opinion

John Kastellec sent me this attractive paper: We [Kastellec et al.] study the relationship between state-level public opinion and the roll call votes of senators on Supreme Court nominees. Applying recent advances in multilevel modeling, we use national polls on...

Interactions

I have mixed feelings about this picture and accompanying note of Jeremy Freese, who writes: Key findings in quantitative social science are often interaction effects in which the estimated “effect” of a continuous variable on an outcome for one group...

Multilevel models with interactions

John Kastellec writes: Let's say you wanted to estimate a multilevel model with an interaction in the individual-level model, say: Pr(y=1) = logit^-1(B0 + B1X + B2Z + B3XZ) and you wanted to allow the interaction effect to vary by...

Teaching Bayesian applied statistics to graduate students in political science, sociology, public health, education, economics, . . .

Here are my thoughts, to appear in the American Statistician: 1. Introduction 2. Teaching Bayesian statistics to social scientists, including a discussion of what is Bayesian about making graphs to get a better understanding of the deterministic part of a...

Two-stage and multilevel regressions

Robert Rohrschneider writes: I [Rohrschneider] am trying to gain an understanding of the pitfalls of multi-level analyses in my work which typically requires that I merge country data with surveys of individuals, usually in Europe. I wonder whether you could...

Bayesian prediction with high-order interactions!!

Longhai Li did a really cool Ph.D. thesis (under the supervision of Radford Neal) on computing for models with deep interactions. The website containing everything about this software, including the R packages, documentation, and references, is here and here....

My talk at MIT on Monday

I'm speaking Monday 14 April at 4:30 on weakly informative prior distributions and models with interactions. I'll try to make things accessible to a general audience of people who might not know much about statistics in general or Bayesian methods...

Disaster aid as vote buying?

Jowei Chen sent along this paper: In the aftermath of the summer 2004 Florida hurricane season, the Federal Emergency Management Agency (FEMA) distributed $1.2 billion in disaster aid to Florida residents. This research presents two empirical findings that collectively suggest...

Instrumental variables analysis with interactions

Boliang writes,...

Random restriction as an alternative to random assignment? A mini-seminar from the experts

Robin Hanson suggested here an experimental design in which patients, instead of randomly assigned to particular treatments, are randomly given restrictions (so that each patient would have only n-1 options to consider, with the one option removed at random). I...

Notation for crossover designs

David Afshartous writes,...

My class this spring on applied Bayesian statistical computing

I had various course titles floating around: my course at Columbia this spring is officially called Applied Statistics, and I had promised people that it would cover Bayesian statistics. At Harvard they asked me to teach Statistical Computing, but I...

A question about causal inference and a question about variable selection

Lingzhou Michael Xue writes in with two questions:...

Questions about transformations

Manuel Spínola writes,...

Clustered standard errors vs. multilevel modeling

Jeff pointed me to this interesting paper by David Primo, Matthew Jacobsmeier, and Jeffrey Milyo comparing multilevel models and clustered standard errors as tools for estimating regression models with two-level data....

Fixed and random effects: what do statistics and econometrics say?

Dan Schrage writes with a question about how to model group-level variation: I [Dan] am trying to better understand the recommendation in your new book to always use random effects (pg. 246) in modeling. (I'm following your definition #5 here...

Income, religious attendance, and voting: recent patterns and trends since 1992

I can't say I have much of an explanation for this, but it's interesting: Church attendance is a strong predictor of how high-income people vote, not such a good predictor for low-income voters. There's lots of talk about religion...

The effect of voter identification laws on turnout

Mike Alvarez, Delia Bailey, and Jonathan Katz just completed this paper: Since the passage of the “Help America Vote Act” in 2002, nearly half of the states have adopted a variety of new identification requirements for voter registration and participation...

Survey weighting and regression modeling

Mike Larsen asks,...

How Bayesian am I?

I was reminded of the varieties of Bayesians after reading this article by Robin Hanson: [I]n our standard framework systems out there have many possible states and our minds can have many possible belief states, and interactions between minds and...

My talks at Dartmouth this Friday

The political science talk: Culture wars, voting, and polarization: divisions and unities in modern American politics. (Here's the higher-resolution powerpoint version.) Here's the handout that goes with the talk. The statistics talk: Interactions are important....

Significance testing in economics: McCloskey, Ziliak, Hoover, and Siegler

Scott Cunningham writes, Today I was rereading Deirdre McCloskey and Ziliak's JEL paper on statistical significance, and then reading for the first time their detailed response to a critic who challenged their original paper. I was wondering what opinion you...

Playroom and lab meetings

We've been trying to figure out how to set up a weekly lab meeting--something where people take turns giving updates on their research, along with a "Hill Street Blues" sort of summary of progress on ongoing projects. My impression is...

Context is important: a question I don't know how to answer, leading to general thoughts about consulting

Someone writes in with a question that I can't answer but which reminds me of a general point about interactions between statisticians and others....

Poststratification on variables that are not fully observed

Seth Wayland writes, In Chapter 14.1 of your new book, the example uses only predictors for which you have census data at the state level. In the poststratification step, you just plug the values of those covariates into the model,...

Average predictive comparisons for models with nonlinearity, interactions, and variance components

How do you summarize logistic regressions and other nonlinear models? The coefficients are only interpretable on a transformed scale. One quick approach is to divide logistic regression coefficients by 4 to convert to the probability scale--that works for probabilities...
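As a rough illustration of the divide-by-4 shortcut mentioned in this excerpt, here is a minimal Python sketch. The function names (`inv_logit`, `max_prob_change`, `actual_prob_change`) and the example coefficient are hypothetical and are not taken from the post. The sketch relies on the fact that the logistic curve has its steepest slope, 1/4, at probability 0.5, so a coefficient divided by 4 is an upper bound on the change in probability per unit change in the input.

```python
import math


def inv_logit(x):
    """Inverse logit: maps a value on the log-odds scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def max_prob_change(beta):
    """Divide-by-4 rule: upper bound on the change in Pr(y=1) per unit of x."""
    return beta / 4.0


def actual_prob_change(beta, intercept=0.0, x=0.0, dx=1.0):
    """Exact change in Pr(y=1) as the input moves from x - dx/2 to x + dx/2."""
    low = inv_logit(intercept + beta * (x - dx / 2.0))
    high = inv_logit(intercept + beta * (x + dx / 2.0))
    return high - low


if __name__ == "__main__":
    beta = 0.4  # hypothetical coefficient, chosen only for illustration
    print("divide-by-4 bound:", max_prob_change(beta))
    print("exact change near p = 0.5:", actual_prob_change(beta))
    print("exact change far from 0.5:", actual_prob_change(beta, intercept=3.0))
```

With these assumed inputs the exact change is close to the bound when the baseline probability is near 0.5 and much smaller when the baseline is far from 0.5.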

A new default prior distribution for logistic and other regression coefficients

This (by Aleks, Grazia, Yu-Sung, and myself) is really cool. Here's the abstract: We propose a new prior distribution for classical (non-hierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5,...

Bayesian Anova

Song Qian sent me this paper, to appear in the journal Ecology, on ecological applications of multilevel analysis of variance. Here's the abstract: A Bayesian representation of the analysis of variance by Gelman (2005) is introduced with ecological examples. These...

Functional is not optimal: thoughts from a structural engineer

Robin Hanson points out that biological systems that have a useful function are not necessarily optimal when put in new environments. This reminds me of an interesting article by Witold Rybczynski where I learned that the structural engineer Ove...

The best nonfiction books ever

How to talk so kids will listen and listen so kids will talk, by Adele Faber and Elaine Mazlish. I read this book long before I had kids--it's incredibly helpful for interactions with adults as well. It's definitely #1 on...

Multiple predictors

Jarrett Byrnes writes, A group of us are working through your Multilevel book, and a question has come up regarding models incorporating multiple predictors. We were working some of the chapters on using simulation to draw inference, but have been...

Happiness, children, and the difficulties of trying to answer Why-type questions

Wil Wilkinson points to an interesting article by Nicholas Eberstadt (and adds some comments of his own) on the topic of the high birth rates in the United States compared to Europe. Wilkinson attributes the difference to Americans' higher average...

Racial bias in basketball fouls

Yu-Sung and Jeff pointed me to a study by Joseph Price and Justin Wolfers on racial discrimination among NBA referees. Basically, black refs call more fouls on white players and vice-versa. Here's a news article (by Alan Schwarz), here's the...

Statistical inefficiency = bias, or, Increasing efficiency will reduce bias (on average), or, There is no bias-variance tradeoff

Statisticians often talk about a bias-variance tradeoff, comparing a simple unbiased estimator (for example, a difference in differences) to something more efficient but possibly biased (for example, a regression). There's commonly the attitude that the unbiased estimate is a better...

How to summarize a multilevel model fit?

Michael Kubovy writes, Can you point me to a model report of empirical research (preferably of a designed experiment) using mixed models? As you know, the pattern in psychology is to have a stultifying paragraph listing which effects and interactions...

Comparing the results from multilevel models fit to two groups

I received the following email:...

Interactions, new variables, and market segmentation

Most academic research employs basic variables that are then correlated with, or used as regression predictors for, outcomes of interest. These basic variables are, for example, income, state, and the like. Using such variables we can claim that, on average, urban dwellers vote for...

High-dimensional data analysis

I came across this talk by David Donoho (see also here for more detail) from 2000. I was disappointed to see that he scooped me on the phrase "blessing of dimensionality" but I guess this is not such an obscure...

Estimating individual effects in group-level experiments

Michael Weiksner writes, I [Weiksner] do research on deliberation, where the treatment itself is defined as the interaction with other people (who are inevitably also randomly assigned to the treatment group). Because all the treated individuals interact, I know that...

Treatment interactions in experiments and observational studies

I've become increasingly convinced of the importance of treatment interactions--that is, models (or analyses) in which a treatment effect is measurably different for different units. Here's a quick example (from my 1994 paper with Gary King): But there are lots...

Nomograms

Regression coefficients are not very pleasant to look at when listed in a table. Moreover, the value of the coefficient is not what really matters. What matters is the value of the coefficient multiplied by the value of the corresponding variable: this is the actual "effect" that contributes to the value of the outcome or, with logistic regression, to the log-odds. With this approach, it is no longer necessary to scale variables prior to regression. A nomogram is the visualization method based on this idea.
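The quantity a nomogram displays can be sketched in a few lines of Python. This is only an illustration of the coefficient-times-value idea described above; the function names, the predictor names, and all numbers are made up for the example and do not come from the post.

```python
def effect_contributions(coefs, values):
    """Per-predictor contribution: coefficient times the predictor's value."""
    return {name: coefs[name] * values[name] for name in coefs}


def log_odds(intercept, coefs, values):
    """Linear predictor of a logistic regression: intercept plus contributions."""
    return intercept + sum(effect_contributions(coefs, values).values())


if __name__ == "__main__":
    # Hypothetical fitted coefficients and one hypothetical observation.
    coefs = {"age": 0.02, "income": -0.5}
    values = {"age": 50, "income": 2}
    for name, contribution in effect_contributions(coefs, values).items():
        print(name, contribution)
    print("log-odds:", log_odds(0.3, coefs, values))
```

Because each contribution is already on the scale of the outcome (here, log-odds), the contributions of different predictors can be compared directly, whatever units the predictors are measured in.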

Multilevel model with small numbers of observations per group

Holly writes, I am interested in where children live when their parent is incarcerated. It turns out that there is a major gender difference in that when the father is incarcerated the child tends to live with the other parent,...

Richard Berk's book on regression analysis

I just finished reading Dick Berk's book, "Regression analysis: a constructive critique" (2004). It was a pleasure to read, and I'm glad to be able to refer to it in our forthcoming book. Berk's book has a conversational format and...

Tables of regression coefficients

Andrew Sutter writes,...

Confusion about altruism

Many scientists of the "selfish gene" persuasion get bothered by instances of altruistic behavior by humans and other animals. For example, Damon Centola forwarded these links: Human beings routinely help others to achieve their goals, even when the helper receives...

Interesting Cases, Support Vectors, and Ape Art

What makes an observation interesting? Through the example of devious quizzes that ask you to distinguish ape art from modern art, we will investigate the fundamental idea of support vector machines: a SVM is a classifier specified in terms of...

2006

My New Year's resolutions:...

An analysis of the NYPD's stop-and-frisk policy in the context of claims of racial bias

Recent studies by police departments and researchers confirm that police stop racial and ethnic minority citizens more often than whites, relative to their proportions in the population. However, it has been argued that stop rates more accurately reflect rates of...

Slam Dunks and No-Brainers

Encouraged by Carrie's plug, I read Leslie Savan's book, "Slam Dunks and No Brainers": It's an entertaining and thought-provoking look at "pop language," which is a particular kind of enjoyable and powerful cliche that we use in speech (and sometimes...

Statisticians are foxes

In a recent article in the New York Review of Books (see also here), Freeman Dyson writes, Great scientists come in two varieties, which Isaiah Berlin, quoting the seventh-century-BC poet Archilochus, called foxes and hedgehogs. Foxes know many tricks, hedgehogs...

Smoothed Anova

Jim Hodges, Yue Cui, Daniel Sargent, and Brad Carlin completed their paper on "smoothed Anova". The abstract begins: "We present an approach to smoothing balanced, single-term analysis of variance (ANOVA) that emphasizes smoothing interactions, the premise being that for a...

Class-participation activities in a regression class

I'm trying to integrate class-participation activities into the Applied Regression and Multilevel Modeling course I'm teaching this semester. We have a whole bunch of these activities for introductory statistics (in my intro class I have at least one demo and...

Is dimensionality a blessing or a curse?

Scott de Marchi writes, regarding the "blessing of dimensionality": One of my students forwarded your blog, and I think you've got it wrong on this topic. More data does not always help and this has been shown in numerous applications...

Interactions are important

Here's the talk I gave last week on interactions in multilevel models (work in collaboration with Samantha Cook and Shouhao Zhou). The short version: (1) interactions are important, (2) more work is needed on how to reasonably model complex structures...

Reasons for randomization

I was at the UCLA statistics preprint site, which is full of interesting papers--we should do something like that here at Columbia--and came across this paper by Richard Berk on randomized experiments. From the abstract to Berk's paper:...

Using propensity scores to estimate the effects of seeing gun violence

Jeff Fagan forwarded this article on gun violence by Jeffrey Bingenheimer, Robert Brennan, and Felton Earls. The research looks at children in Chicago who were exposed to gun violence, and uses propensity score matching to find a similar group who...

Overestimates of immigrants

In reference to the recent entry on misperception of minorities, John Sides sent me the following data on the estimated, and actual, percentage of foreign-born residents in each of 20 European countries: The estimates are average survey responses in each...

A question about multilevel modeling

Someone sent me a question about whether it makes sense to use multilevel modeling in a study of polls from many countries. I'll give the question and my response. The topic has been on my mind because I just wrote...

Columbia Causal Inference Meeting

On June 20, we had a miniconference on causal inference at the Columbia University Statistics Department. The conference consisted of six talks and lots of discussion. One topic of discussion was the use of propensity scores in causal inference, specifically,...

Propensity scores and Bayesian inference

Zhiqiang Tan (Biostatistics, Johns Hopkins) writes, regarding my blog entry on regression and matching. I wrote: I'm imagining a unification of matching and regression methods, following the Cochran and Rubin approach: (1) matching, (2) keeping the treated and control units...

Regression modeling and meta-analysis for decision making; or, We thank Kevin Brancato and Hailin Lou for research assistance . . .

I noticed the blog of Kevin Brancato. I've been enjoying reading the blog entries, especially since Kevin is a former student of ours at Columbia! His paper on macroeconomic statistics is also interesting (and relevant to some of my work)....

Some questions (and a few answers) about multilevel models

Here are the slides of the talk I gave at the CDC last week. And here's the abstract: Multilevel (hierarchical) models are increasingly popular for data with hierarchical, longitudinal, and cross-classified structures. We consider several questions that arise in the...

(Towards) a solution to a 40-year-old problem: Prior distributions for variance parameters in hierarchical models

Fully Bayesian analyses of hierarchical linear models have been considered for at least forty years. A persistent challenge has been choosing a prior distribution for the hierarchical variance parameters. Proposed models include uniform distributions (on various scales), inverse-gamma distributions, and...

Regression and matching for causal inference: questions about Guido Imbens's article

We would like to incorporate matching methods into a Bayesian regression framework for causal inference, with the ultimate goal of being able to do more effective inference using hierarchical modeling. The founding works here are papers by Cochran and Rubin...

Postdoctoral position available

Postdoctoral research opportunity: Columbia University, Departments of Epidemiology and Statistics Supervisors: Ezra Susser (epidemiology) and Andrew Gelman (statistics) We have a NIH-funded postdoctoral position (1 or 2 years) available for what is essentially statistical research as applied to some important...

Physicists modeling social phenomena; social scientists invoking physics

Tim Halpin-Healy (Physics, Barnard College) spoke today at the Collective Dynamics Group on "The Dynamics of Conformity and Dissent". Unfortunately I wasn't able to attend his talk--it looked interesting--but I have to say, speaking curmudgeonly and parochially as a political...

Ranking colleges

Christopher Avery, Mark Glickman, Caroline Hoxby, and Andrew Metrick wrote a paper recently ranking colleges and universities based on the "revealed preferences" of the students making decisions about where to attend. They apply, to data on 3000 high-school students, statistical...

Estimating spatial interactions in forest clearing

Juan Robalino and Alex Pfaff have written a paper on estimating the factors that influence the decision of Costa Rican farmers to clear forest land. This is an important question because, as they note in the article, Rural areas of...

Matching, regression, interactions, and robustness

Daniel Ho, Kosuke Imai, Gary King, and Liz Stuart recently wrote a paper on matching, followed by regression, as a tool for causal inference. They apply the methods developed by Don Rubin in 1970 and 1973 to some political science...

Reference for variable selection

This is the reference to the work of Chipman on including interactions: Chipman, H. (1996), ``Bayesian Variable Selection with Related Predictors'', Canadian Journal of Statistics , 24, 17--36....

Partial pooling of interactions

In a multi-way analysis of variance setting, the number of possible predictors can be huge. For example, consider a 10x19x50 array of continuous measurements: there is a grand mean, 10+19+50 main effects, 10x19+19x50+10x50 two-way interactions, and 10x19x50 three-way interactions. Multilevel...
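The term counts in this example can be checked with a short Python sketch. The function name `count_anova_terms` is hypothetical; the array dimensions are the ones given in the excerpt.

```python
from itertools import combinations
from math import prod


def count_anova_terms(dims):
    """Count the terms of each interaction order in a full factorial layout.

    Order 0 is the grand mean, order 1 the main effects, order 2 the
    two-way interactions, and so on.
    """
    return {
        order: sum(prod(combo) for combo in combinations(dims, order))
        for order in range(len(dims) + 1)
    }


if __name__ == "__main__":
    counts = count_anova_terms([10, 19, 50])
    for order, n_terms in counts.items():
        print("order", order, "terms:", n_terms)
    print("total:", sum(counts.values()))
```

For the 10x19x50 array this gives 1 grand mean, 79 main effects, 1640 two-way interactions, and 9500 three-way interactions, 11220 terms in all.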
