MCMC starting values

Jun Xiang writes,

I am using your R2WinBUGS to estimate some hierarchical logistic regressions and have a question about how to set initial values. First, when I use some uninformative initial values the model has convergence problems (Rhat indicates non-convergence after 10,000 iterations). Next, I use the MLE as the initial values and the model has better convergence results. I want to know if the latter way is right, since each chain in this case gets the same initial values.

My reply: Yes, you can have problems if your starting values are too wacky. The usual noninformative prior distributions put mass on parameter values that are extreme (e.g., theta=10^4), and chains that start out there can have convergence issues. In principle it would be best to use more reasonable prior distributions, but in practice it often works fine if you pick starting values that are less extreme. This is covered a bit in the Gelman and Hill book, in the discussion of using Bugs. In addition, it makes sense to parameterize reasonably (for example, by scaling predictors) so that you won’t get coefficients such as 10^4 or 10^-4. We discuss the scaling issue in the book also.
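
To make the two suggestions concrete, here is a minimal sketch in plain Python (not the R2WinBUGS code from the question). It rescales a predictor so its coefficient lands on a moderate scale, and it draws a different but non-extreme starting point for each chain by jittering around a rough point estimate. The function names `standardize` and `make_inits`, the income data, the estimates, and the jitter size are all made up for illustration. Jittering around an estimate is just one way to get "less extreme" starting values that still differ across chains, which is what lets Rhat compare chains that began in different places.

```python
import random
import statistics


def standardize(values):
    """Center a predictor and divide by its sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def make_inits(point_estimates, n_chains=3, spread=0.5, seed=0):
    """Return one dict of starting values per chain.

    Each chain starts at the point estimate plus normal noise with
    standard deviation `spread`, so the chains start at different
    values but none of them is wildly extreme.
    """
    rng = random.Random(seed)
    return [
        {name: est + rng.gauss(0.0, spread)
         for name, est in point_estimates.items()}
        for _ in range(n_chains)
    ]


# Hypothetical predictor on a large raw scale (say, income in dollars).
income = [21000.0, 35000.0, 48000.0, 52000.0, 90000.0]
income_z = standardize(income)  # now has mean 0 and sd 1

# Hypothetical rough estimates, e.g. from a simple non-hierarchical fit.
estimates = {"alpha": -0.4, "beta": 0.8}
inits = make_inits(estimates, n_chains=3, spread=0.5, seed=1)

for chain, init in enumerate(inits, start=1):
    print(chain, init)
```

In R2WinBUGS the analogous move would be to give `bugs()` an `inits` function that returns random draws of this sort rather than one fixed list. How much jitter is sensible depends on the scale of the parameters, which is another reason to rescale the predictors first.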