Uncertainty estimates for hierarchical variance parameters

This whole exchange is interesting and is closely related to our current research on prior distributions and Bayesian inference for varying-intercept, varying-slope models.

It all started when Sean Zhang asked,

I am using your book to teach myself multilevel modeling in R.

One question I have is why lmer cannot provide the var-cov matrix of the estimates of the random components. You mentioned in your book that lmer can only provide point estimates for the variance components, and that this is one of the reasons to go Bayesian and use Bugs.

I am running a simple random-intercept model using SAS glimmix and found that glimmix provides a standard error for the variance estimate of the random intercept. I looked into the glimmix documentation (see attachment, page 121; theta contains the random-effect parameters) and can imagine that SAS may use the Hessian or the outer product of the gradient (see the page with their likelihood function) to get them.

My question is then: why can't lmer do something similar to SAS and report the var-cov matrix of the estimates of the random components?
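For concreteness, here is a rough R sketch of the Hessian idea Sean describes, applied to lmer rather than glimmix. The sleepstudy example, the devFunOnly trick, and the numDeriv package are my own choices for illustration; this is not how SAS computes its standard errors, and, as discussed below, the resulting numbers should not be taken too seriously.

    library(lme4)
    library(numDeriv)

    # Example fit (the sleepstudy data ship with lme4)
    fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

    # The profiled deviance as a function of the covariance parameters theta
    devfun <- update(fit, devFunOnly = TRUE)

    # Numeric Hessian at the optimum; deviance = -2 log-likelihood, so the
    # approximate var-cov matrix of theta-hat is 2 * solve(H)
    H <- hessian(devfun, getME(fit, "theta"))
    vcov_theta <- 2 * solve(H)
    sqrt(diag(vcov_theta))  # "standard errors," but for theta, not the
                            # variances, and meaningless when theta-hat
                            # sits on the boundary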

I replied,

I’m not sure, but lmer is part of R and, as such, is developed by volunteers. I would not trust the SAS estimates, given that parameter estimates are often on the boundary of the parameter space. Another way to get uncertainty estimates, if you really want them, would be to use the bootstrap or the jackknife.
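As a sketch of the bootstrap route: recent versions of lme4 provide bootMer, which refits the model to simulated data and collects whatever statistic you ask for. Continuing with the sleepstudy fit from the sketch above (the 200 simulations and the percentile intervals are arbitrary choices for illustration):

    # Parametric bootstrap of the random-effect standard deviations
    get_sds <- function(m) as.data.frame(VarCorr(m))$sdcor

    booted <- bootMer(fit, get_sds, nsim = 200)
    t(apply(booted$t, 2, quantile, probs = c(0.025, 0.5, 0.975)))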

Sean then contacted Doug Bates (creator of lmer), who gave the following more detailed answer:

The answer to your general question of why there are no standard errors for the estimates of variance components from the lme4 package is:

a) because they are awkward to calculate

and

b) because I don’t think they make sense

I don’t mind spending time working to calculate a value that would make sense but I’m not about to spend time working out how to calculate something I don’t really want.

Let me address b) first. Quoting an estimate and a standard error of the estimate for a parameter is useful if you think that the distribution of the estimator is more-or-less symmetric. [emphasis added -- AG] If you are estimating a mean it makes sense to assume symmetry so we would calculate a confidence interval, say, as estimate +/- some multiple of the standard error. Coefficients in linear models behave somewhat the same way as means and we would expect symmetric confidence intervals to be adequate.

The distributions of estimators of variances are not symmetric. Consider the simplest case of estimating the variance of a normal distribution given an independent, identically distributed sample. We use critical values from a chi-squared distribution to calculate a confidence interval and that confidence interval is definitely not symmetric about the estimate. It is incredibly optimistic to expect that for very complex models the distribution of the estimators of variances will suddenly become symmetric.
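To put numbers on the asymmetry Doug describes (my illustration, not his): for an iid normal sample, (n-1)s^2/sigma^2 follows a chi-squared distribution with n-1 degrees of freedom, so a 95% interval for the variance runs from (n-1)s^2/qchisq(.975, n-1) to (n-1)s^2/qchisq(.025, n-1). With made-up values:

    n  <- 20
    s2 <- 4   # sample variance
    c(lower    = (n - 1) * s2 / qchisq(0.975, df = n - 1),
      estimate = s2,
      upper    = (n - 1) * s2 / qchisq(0.025, df = n - 1))
    # roughly (2.3, 4, 8.5): far from symmetric about the estimate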

Even if I wanted to calculate standard errors for estimates of variances, it would be difficult because the model is actually fit in terms of another parameterization of the covariance matrix of the random effects. Determining maximum likelihood estimates of generalized linear mixed models is an intricate optimization problem, and the obvious parameterization in terms of variances and covariances is not a good way to go about it. The parameters are constrained, and we need to consider the constraints carefully. Also, we want a good quadratic approximation to the log-likelihood near the optimum in the parameters that we have chosen. This all adds up to the variances and covariances being indirectly derived from the parameters used in the optimization.
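To make that last point concrete (my sketch, not Doug's): in recent versions of lme4 the optimizer works on a vector theta holding the elements of a lower-triangular Cholesky factor, scaled relative to the residual standard deviation, and the reported variances and covariances are derived from it afterwards. Continuing with the earlier fit:

    theta <- getME(fit, "theta")  # what the optimizer actually sees
    L <- matrix(c(theta[1], theta[2], 0, theta[3]), 2, 2)  # lower triangular
    sigma(fit)^2 * L %*% t(L)  # the derived var-cov of the random effects
    VarCorr(fit)               # reports the same quantities as sd's and a corr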

Sean then followed up by asking me:

I am curious whether Bates's asymmetry argument and the boundary-condition argument you mentioned are the same.

My reply:

Yes, the boundary issue is a special case of asymmetry. When the estimate is on the boundary, it cannot be completely trusted. (Sigma.hat = 0 does not mean Sigma = 0; it means that Sigma is statistically indistinguishable from 0, which is a different thing.) Ultimately I think that uncertainties for variance parameters are useful, but I agree with Doug that the “estimate +/- standard error” framework isn't so helpful here.
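Here is a quick simulated example of the boundary issue (the data-generating numbers are made up). The true group standard deviation is small but nonzero, yet the REML estimate will often come out exactly at zero:

    # Continuing in the same R session (lme4 already loaded)
    set.seed(1)
    g <- gl(10, 5)                            # 10 groups of 5 observations
    y <- rnorm(50) + rnorm(10, sd = 0.1)[g]   # true group sd = 0.1
    m <- lmer(y ~ 1 + (1 | g))
    VarCorr(m)  # the group sd is likely estimated as exactly 0: it is
                # indistinguishable from 0, not known to be 0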

1 thought on “Uncertainty estimates for hierarchical variance parameters”

  1. I'm wondering if you can clarify this part:
    "The optimization is intricate when determining maximum likelihood estimates of generalized linear mixed models and the obvious parameterization of the variances and covariances is not a good way to go about determining these estimates."

    Is the "obvious parameterization" of the variances and covariances here an inverse-gamma/inverse-Wishart distribution? If the goal is a distribution over the variances, then what is wrong with assuming a conjugate parametric form and trying to come up with maximum likelihood estimates of the hyperparameters?
