Multiple imputation using chained equations

Stef van Buuren, Jaap Brand, C.G.M. Groothuis, and Don Rubin wrote a paper evaluating the “chained equations” method of multiple imputation: that is, the method of imputing each variable using a regression model conditional on all the others, iteratively cycling through all the variables that contain missing data. Versions of this “algorithm” are implemented as MICE (which can be downloaded directly from R) and IVEware (a SAS package). (I put “algorithm” in quotes because you still have to decide what model to use, typically what variables to include as predictors, in each of the imputation steps.)
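
To make the cycling idea concrete, here is a minimal Python sketch. It is not MICE or IVEware, and every name in it (`chained_equations_impute`, `n_iter`, the toy data) is made up for illustration. It assumes all variables are continuous, uses a plain linear regression on all the other variables at every step, and starts from random draws of the observed values in each column. It also skips a step a proper implementation would include: the regression coefficients are held at their least-squares estimates rather than drawn from their posterior distribution, so treat it as a sketch of the mechanics only.

```python
import numpy as np

def chained_equations_impute(X, n_iter=5, rng=None):
    """Return one completed copy of X (2-D float array, NaN marks missing).

    Hypothetical helper for illustration; assumes every column has at
    least one observed value.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    n, p = X.shape

    # Starting values: random draws from the observed values of each column.
    for j in range(p):
        observed = X[~missing[:, j], j]
        filled[missing[:, j], j] = rng.choice(observed, size=missing[:, j].sum())

    for _ in range(n_iter):
        for j in range(p):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Predictors: intercept plus all other columns, current fill-ins included.
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([np.ones(n), others])
            # Fit the regression on the rows where column j was actually observed.
            beta, *_ = np.linalg.lstsq(A[~miss_j], X[~miss_j, j], rcond=None)
            resid = X[~miss_j, j] - A[~miss_j] @ beta
            dof = max((~miss_j).sum() - A.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Impute as prediction plus residual noise, not the bare prediction.
            filled[miss_j, j] = A[miss_j] @ beta + rng.normal(0.0, sigma, miss_j.sum())
    return filled

# Toy data: three correlated variables with about 20% of entries set to missing.
demo_rng = np.random.default_rng(0)
data = demo_rng.normal(size=(200, 3))
data[:, 2] += data[:, 0]
data[demo_rng.random(data.shape) < 0.2] = np.nan

# Five completed datasets, one per seed.
completed = [chained_equations_impute(data, rng=seed) for seed in range(5)]
```

Each completed dataset would then be analyzed separately and the results combined. The modeling decisions mentioned above all live in the line that builds `A`: here it simply uses every other variable as a predictor.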

Here’s the paper, and here’s the abstract:

The use of the Gibbs sampler with fully conditionally specified models, where the distribution of each variable given the other variables is the starting point, has become a popular method to create imputations in incomplete multivariate data. The theoretical weakness of this approach is that the specified conditional densities can be incompatible, and therefore the stationary distribution to which the Gibbs sampler attempts to converge may not exist. This study investigates practical consequences of this problem by means of simulation. Missing data are created under four different missing data mechanisms. Attention is given to the statistical behavior under compatible and incompatible models. The results indicate that multiple imputation produces essentially unbiased estimates with appropriate coverage in the simple cases investigated, even for the incompatible models. Of particular interest is that these results were produced using only five Gibbs iterations starting from a simple draw from observed marginal distributions. It thus appears that, despite the theoretical weaknesses, the actual performance of conditional model specification for multivariate imputation can be quite good, and therefore deserves further study.

Here are Stef’s webpages on multiple imputation. Multiple imputation was invented by Don Rubin in 1977.

