Generalized Method of Moments, whatever that is

Xuequn Hu writes:

I am an econ doctoral student, trying to do some empirical work using Bayesian methods. Recently I read a paper (and its discussion) that pitches Bayesian methods against GMM (Generalized Method of Moments), which is quite popular among frequentist econometricians. I am wondering if you can, here or on your blog, give some insights about these two methods, from the perspective of a Bayesian statistician. I know GMM does not conform to the likelihood principle, but Bayesians are often charged with making strong distributional assumptions.

I can’t actually help on this, since I don’t know what GMM is. My guess is that, like other methods that don’t explicitly use prior information, it will work well if sufficient information is included in the data. Which would imply a hierarchical structure.

19 thoughts on “Generalized Method of Moments, whatever that is”

  1. Andrew, sorry to have to tell you this, but the econometricians are going to tease you mercilessly when Lars Hansen wins the Nobel for proposing GMM :-)

  2. My crude understanding from the Hayashi and Kennedy books is that GMM is useful as a tool for relating and understanding the various econometric methods, but as an estimation strategy it is often inferior to simply using a given method directly.

  3. Estimating equations. GEE is pitched as making valid inferences on the mean-model parameters (which are often the most interesting) even if the higher moments are misspecified, and as getting the variance parameters right if you have the first four moments correctly specified.

    Many people think of likelihood methods as requiring that the structure be correctly specified; the pitch was probably that with flexible hierarchical priors you actually get inferences that are robust to some kinds of misspecification.

  4. Search "Bayesian analysis of random coefficient logit models using aggregate data" in Google Scholar. It's a Bayesian treatment of the BLP model, which is still the benchmark model for differentiated products in empirical IO. Thanks.

  5. Too many posts on this blog are variations on the theme of "I don't know the answer to this". If you don't know, that's fine, but it certainly doesn't warrant a blog post.

  6. Like Xuequn, I am a PhD student in economics. I am familiar with GMM but much less so with Bayesian methods. I would love to see some discussion on this too!

    I once heard that GMM wasn't really new and was known in the statistics community under another name (some 1970s Annals of Statistics article). I have searched for it but couldn't find the proper reference (if it exists).

  7. Js:

    When I post statistical questions to which I don't know the answer, two purposes are served:

    1. People can answer in the comments, thus helping the original questioner and anyone else who is interested in the topic.

    2. People are reminded that there is a lot in statistics that I don't know. I think it's good to dispel some of the aura of omniscience that surrounds many textbooks and blogs.

    But if you don't like these posts, feel free to skip them!

  8. Louis: Perhaps V. P. Godambe (one of Fisher's graduate students).

    For instance, "Conditional likelihood and optimal estimating equations," Biometrika, 1976, or an earlier 1960 Annals of Mathematical Statistics paper.

    and a very colourful character infused with brilliance.

    K?

  9. Louis is right. It was invented by Karl Pearson in the late 19th century. He called it the method of moments, but I think it's also called the method of estimating equations. The main idea is that every stochastic model entails moment conditions on the observable data. The parameters of the model are therefore estimated by choosing the parameter values that come closest to satisfying the moment conditions in the sample (a minimal worked sketch follows this comment). It is probably the most popular approach in econometrics. Andrew is right that it works best for large data sets, because the estimation and inference are valid only asymptotically.

    But I'm very surprised at how wide the gulf is between statistics and econometrics. It's quite unfortunate. We have a lot to learn from each other.
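
    A minimal worked sketch of that idea, assuming Python with NumPy (the code and the Gamma example are mine, not from the thread): for a Gamma(shape k, scale s) model, E[X] = k*s and Var[X] = k*s^2, so matching the first two sample moments pins down both parameters.

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.gamma(shape=2.0, scale=3.0, size=5000)   # simulated data with known parameters

        m1 = x.mean()    # sample analogue of E[X]   = k * s
        m2 = x.var()     # sample analogue of Var[X] = k * s^2

        s_hat = m2 / m1        # scale: Var[X] / E[X]
        k_hat = m1 / s_hat     # shape: E[X] / scale
        print(k_hat, s_hat)    # should be close to (2.0, 3.0)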

  10. @Louis: "I once heard that GMM wasn't really new and known in the statistics community by another name"

    From Wikipedia:
    "The term GMM is very popular among econometricians but is hardly used at all outside of economics, where the slightly more general term estimating equations is preferred."

    http://en.wikipedia.org/wiki/Generalized_method_o

    I don't know enough about this method to be useful.

  11. As the name suggests, GMM is a generalization of the method of moments; in fact, MLE is also a special case of GMM. In practice, the only use GMM gets (as opposed to the vanilla method of moments) is when you have more moment conditions than unknown parameters. You can throw away the extra moments until the number of moments equals the number of unknown parameters and then do the ordinary method of moments… or you can do GMM.

    But here lies the puzzle for most statisticians: in what problems do the lower-order moments outnumber the parameters you're trying to estimate? The truth is, most such problems exist mainly in finance and economics, because the models economists use, such as rational expectations models and simultaneous equations models, imply exactly these kinds of moment conditions.

    One very common example of GMM (and one most economists don't seem to bother to use) is to generalize the 3SLS method to obtain efficient estimators for the parameters of a two-equation model when the errors are not homoscedastic. To summarize, this method is of use primarily to economists, and is only an interesting curiosity to statisticians.
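
    A small numerical illustration of that overidentified case, assuming Python with NumPy (the exponential example is mine, not the commenter's): for an Exponential distribution with rate lam, E[X] = 1/lam and E[X^2] = 2/lam^2 give two moment conditions for one parameter, and in a finite sample the two "throw away the extra moment" estimates disagree.

        import numpy as np

        rng = np.random.default_rng(1)
        lam_true = 0.5
        x = rng.exponential(scale=1 / lam_true, size=200)   # NumPy parameterizes by scale = 1/rate

        lam_from_m1 = 1 / x.mean()                  # solves the condition E[X]   = 1/lam
        lam_from_m2 = np.sqrt(2 / (x ** 2).mean())  # solves the condition E[X^2] = 2/lam^2
        print(lam_from_m1, lam_from_m2)             # two different answers from the same data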

  12. Suppose E[ g(X,theta) ] = 0, where g( ) is a known function whose dimension is larger than that of the unknown parameter theta. (The dimension of the random variable X doesn't matter.) Then this is a system with more equations than unknowns. The ordinary method of moments says: replace the expectation with the sample average to get

    (1/n) sum over i of g(X_i, theta) = 0

    and then solve for theta to get an estimator. When the dimension of g equals that of theta, no big deal. But when the dimension of g is larger than that of theta, this system may not have a solution. This is where GMM comes in. GMM essentially says: form a weighted quadratic form in the sample moments, which is a scalar, and pick theta to minimize it (a sketch in code follows this comment).
    —————————————————–
    I don't know if this has a direct counterpart in statistics. The definition of an "estimating equations estimator" doesn't seem to be agreed upon; the sources I've seen (Cameron and Trivedi (2005), Wikipedia) give slightly different definitions. Does anyone have an authoritative stat textbook reference for that?

    Anyway, perhaps we can all agree: statisticians and econometricians could benefit from closing the gap in methods and jargon.
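
    A minimal sketch of the recipe described in this comment, assuming Python with NumPy/SciPy and reusing the exponential example from above (my construction, not the commenter's): stack the sample moments, form the quadratic form gbar' W gbar with an identity weight matrix W, and minimize over theta. Efficient (two-step) GMM would instead estimate W from the data.

        import numpy as np
        from scipy.optimize import minimize_scalar

        rng = np.random.default_rng(1)
        x = rng.exponential(scale=1 / 0.5, size=200)   # true rate lam = 0.5

        def gbar(lam):
            # sample averages of the two moment functions g(X_i, lam)
            return np.array([x.mean() - 1 / lam,
                             (x ** 2).mean() - 2 / lam ** 2])

        def objective(lam):
            g = gbar(lam)
            return g @ np.eye(2) @ g   # the scalar g' W g, here with W = I

        res = minimize_scalar(objective, bounds=(0.01, 10.0), method="bounded")
        print(res.x)                   # GMM estimate of lam, combining both moments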

  13. GMM is a variation of the M-estimator (http://en.wikipedia.org/wiki/M-estimator). The M-estimator generalizes maximum likelihood, and it is not hard to show that the maximum likelihood estimator is actually a GMM estimator (its moment conditions are the score equations). All these estimators are shown to be consistent and root-n asymptotically normal. GMM is popular in econometrics, since it allows one to treat the endogenous-variable problem quite elegantly. GMM does not need any distributional assumptions, so it relies on asymptotic results to justify its usefulness.
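
    A hedged sketch of that endogeneity point (this code and its simulated data are mine, not the commenter's): with an instrument z that is correlated with the regressor x but not with the error u, the single moment condition E[z (y - beta*x)] = 0 is exactly identified, and the sample version solves in closed form.

        import numpy as np

        rng = np.random.default_rng(2)
        n = 5000
        z = rng.normal(size=n)                        # instrument
        u = rng.normal(size=n)                        # structural error
        x = 0.8 * z + 0.5 * u + rng.normal(size=n)    # x is endogenous: corr(x, u) > 0
        y = 2.0 * x + u                               # true beta = 2.0

        beta_ols = (x @ y) / (x @ x)   # biased upward, since OLS ignores corr(x, u)
        beta_iv = (z @ y) / (z @ x)    # solves (1/n) * sum z_i * (y_i - beta * x_i) = 0
        print(beta_ols, beta_iv)       # the IV/GMM estimate should be near 2.0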

  14. See the work of Zellner on the Bayesian Method of Moments, arrived at via the principle of maximum entropy. Inoue has worked on similar Bayesian estimators without the use of entropy.

    Some background reading on GMM and entropy with notes on Bayesian estimation:

    Bera and Bilias, "The MM, ME, ML, EL, EF and GMM Approaches to Estimation: A Synthesis"

    Golan, "Information and entropy econometrics: a review and synthesis"
