Model checking in the presence of missing data

I received this question in the mail:

Your Biometrics article, Multiple imputation for model checking: completed-data plots with missing and latent data, suggests diagnostics when the missing values of a dataset are filled in by multiple imputation. But suppose we have two equivalent files–File A with variable y left-censored at a known threshold and File B with y fully observed. We draw multiple imputations of censored y in File A.

(1) Can we validate our imputation model by setting y in File B as left-censored according to the inclusion indicator from A, performing multiple imputation of these “censored” data, and comparing imputed to observed values?

(2) In particular, what diagnostic measure(s) would tell us whether the imputed and observed values fit closely enough to validate our imputation model?

My reply: I’m a little confused: if you already have File B, what do you need File A for? Do the two files have different data, or are you just using this to validate your imputation model? If the latter, then, yes, you can see whether the observations in File B are consistent with the predictive distributions obtained from your multiple imputations on File A. You wouldn’t expect the imputations to be perfect, but you’d like the imputed 50% intervals to have approximately 50% coverage, and you’d like the true values to equal the predictions from the imputations on average, conditional on any information in the observed data in File A. (But the imputations don’t have to, and in general shouldn’t, be correct on average conditional on the hidden true values.)
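To make these two checks concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration rather than anything from the question or the article: the data are simulated as y = x + noise, the threshold of -0.5 is hypothetical, and the imputations are drawn from a truncated normal with known parameters (a real multiple imputation would also propagate uncertainty in the fitted parameters). It compares an imputation model that uses the covariate x with one that ignores it.

```python
import random
from statistics import NormalDist, mean, median, quantiles

rng = random.Random(1)
THRESHOLD = -0.5   # hypothetical known censoring point
N, M = 2000, 200   # number of cases, imputations per censored case

# "File B": fully observed y, simulated here as y = x + noise.
x = [rng.gauss(0, 1) for _ in range(N)]
y = [xi + rng.gauss(0, 1) for xi in x]
# "File A": the same cases, with y hidden wherever it falls below the threshold.
censored = [i for i in range(N) if y[i] < THRESHOLD]

def impute(mu, sigma, m):
    """Return m draws from Normal(mu, sigma) truncated to lie below THRESHOLD."""
    dist = NormalDist(mu, sigma)
    p_max = dist.cdf(THRESHOLD)
    # Inverse-CDF sampling; 1 - random() lies in (0, 1], so u stays positive.
    return [dist.inv_cdf((1.0 - rng.random()) * p_max) for _ in range(m)]

def check(predict):
    """Return 50% interval coverage and mean residuals for low-x and high-x cases."""
    covered, resid = [], {}
    x_mid = median(x[i] for i in censored)
    for i in censored:
        mu, sigma = predict(x[i])
        draws = impute(mu, sigma, M)
        q25, _, q75 = quantiles(draws, n=4)
        covered.append(q25 <= y[i] <= q75)
        group = "low_x" if x[i] < x_mid else "high_x"
        resid.setdefault(group, []).append(y[i] - mean(draws))
    coverage = sum(covered) / len(covered)
    return coverage, {g: mean(r) for g, r in resid.items()}

# Imputation model 1: uses x (here, the true data-generating model).
cov_good, resid_good = check(lambda xi: (xi, 1.0))
# Imputation model 2: ignores x and uses only the marginal distribution of y.
cov_bad, resid_bad = check(lambda xi: (0.0, 2 ** 0.5))

print(f"uses x:    coverage {cov_good:.2f}, mean residuals {resid_good}")
print(f"ignores x: coverage {cov_bad:.2f}, mean residuals {resid_bad}")
```

In this setup the model that uses x should show roughly 50% coverage and residuals near zero in both groups. The model that ignores x can still look fine on overall coverage, but its residuals should be negative for low x and positive for high x, which is why the comparison needs to be made conditional on the observed data and not just in aggregate.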

You may also be interested in my 2004 article, Exploratory data analysis for complex models, which actually has an example on death-penalty sentencing, with censored data.