Multiple Imputation with Diagnostics (mi) in R: Opening Windows into the Black Box

Our article (by Yu-Sung, Jennifer, Masanao, and myself, and based also on work with Kobi, Grazia, and Peter Messeri) will be appearing in the Journal of Statistical Software, in a special issue on missing-data imputation. Here's the abstract:

Our mi package in R has several features that allow the user to get inside the imputation process and evaluate the reasonableness of the resulting models and imputations. These features include: flexible choice of predictors, models, and transformations for chained imputation models; binned residual plots for checking the fit of the conditional distributions used for imputation; and plots for comparing the distributions of observed and imputed data in one and two dimensions. In addition, we use Bayesian models and weakly informative prior distributions to construct more stable estimates of imputation models. Our goal is to have a demonstration package that (a) avoids many of the practical problems that arise with existing multivariate imputation programs, and (b) demonstrates state-of-the-art diagnostics that can be applied more generally and can be incorporated into the software of others.
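To make the binned-residual idea concrete, here is a minimal base-R sketch. It does not use mi's own plotting functions. The simulated data, the logistic model, and all variable names are assumptions made for illustration. The idea is to order the observations by fitted value, split them into bins, and check whether the average residual in each bin stays within roughly two standard errors of zero.

```r
# Illustrative sketch only: simulated data and hypothetical names,
# not code from the mi package.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))

# Stand-in for one conditional imputation model (binary outcome).
fit <- glm(y ~ x, family = binomial)
fitted_p <- fitted(fit)
resid_raw <- y - fitted_p

# Order observations by fitted value and split them into roughly equal bins.
n_bins <- 20
bin <- cut(rank(fitted_p, ties.method = "first"), breaks = n_bins, labels = FALSE)

# Average fitted value and average residual within each bin.
binned <- data.frame(
  mean_fitted = as.vector(tapply(fitted_p, bin, mean)),
  mean_resid  = as.vector(tapply(resid_raw, bin, mean)),
  n           = as.vector(table(bin))
)

# Approximate +/- 2 standard-error bounds for each bin's average residual.
binned$bound <- 2 * as.vector(tapply(resid_raw, bin, sd)) / sqrt(binned$n)

plot(binned$mean_fitted, binned$mean_resid,
     xlab = "Average fitted value", ylab = "Average residual",
     ylim = range(c(binned$bound, -binned$bound, binned$mean_resid)))
lines(binned$mean_fitted, binned$bound, lty = 2)
lines(binned$mean_fitted, -binned$bound, lty = 2)
abline(h = 0, col = "gray")
```

If the model fits reasonably well, most of the binned averages should fall between the dashed lines, with no obvious trend across the fitted values.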

We've made lots of improvements since listing the package last year (here). There's still a lot more work to do, in many different directions (including multilevel models, nonignorable models, the self-cleaning oven, and making the program run faster in all sorts of ways), and we keep improving it. But it's good to have something out there.

To actually get the R package, just open your R window, click on Packages, Install packages, and grab mi.
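If you prefer the console to the menus, something like the following should do the same thing. It is a sketch that assumes you have a working internet connection and that the package is still distributed on CRAN under the name mi. The mirror URL is just one common choice.

```r
# Install mi from a CRAN mirror if it is not already installed, then load it.
# The mirror URL below is an assumption; any CRAN mirror should work.
if (!requireNamespace("mi", quietly = TRUE)) {
  install.packages("mi", repos = "https://cloud.r-project.org")
}
library(mi)
```

Once the package is loaded, help(package = "mi") lists its help pages.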

2 Comments

> ... multilevel models

That would be nice to have in the mi-package...

I am trying to use mi with some Likert data (41 variables, n=167), and I'm finding myself stumped. Trying to apply the principles of the examples in your online paper to my own dataset, I generate errors (see below). Are there help files available, or a wiki or somesuch?

Thanks.

My Error, FWIW:
Beginning Multiple Imputation ( Sat Sep 26 09:17:18 2009 ):
Iteration 1
Imputation 1 : SI1*
Error while imputing variable: SI1 , model: mi.polr
Error in parse(text = x) :
unexpected numeric constant in "ordered(SI1) ~ SI2 + SI3 + SI4 + SI5 + SI6 + SI7 + SI8 + SI9 + ordered(SI1)0"

The command that generated it:
imp

-note that siOnly contains 41 variables, named "SI1" through "SI41"
