Modeling growth

Charles Williams writes,

In a number of your examples in the multilevel modeling book you use growth as an outcome. I’m doing this in a study of firm growth in the cellular industry. In this setting, we need to control for firm size since a firm’s propensity to grow is definitely affected by its size. Someone suggested to me that I may have correlation between the size variable and the error term, since size is effectively in the denominator of the growth variable. They suggested using just the numerator of the growth term (subscribers added) as the outcome, since the denominator will be controlled for in the regression.

Have you run into this? Do you agree that there is a potential for bias in using size as a regressor for growth?

My reply: Yes, it makes sense to control for size (at the beginning of the study) in your regressions, probably on the log scale. I’d still use the ratio as an outcome because I think it would help the coefficients be more directly interpretable (which is a virtue in itself and also helps with efficiency if you have a hierarchical or Bayesian model).
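To make the suggestion concrete, here is a minimal sketch of regressing the growth ratio on log initial size. Everything in it is an assumption for illustration: the data are simulated, the helper name `fit_growth_on_log_size` is hypothetical, and plain least squares stands in for whatever hierarchical or Bayesian model one would actually fit.

```python
import numpy as np

def fit_growth_on_log_size(size, added):
    """Least-squares fit of growth (added / size) on log(size).

    Returns (intercept, slope). Hypothetical helper, not from the original post.
    """
    size = np.asarray(size, dtype=float)
    added = np.asarray(added, dtype=float)
    growth = added / size                       # ratio outcome
    X = np.column_stack([np.ones_like(size), np.log(size)])
    coef, *_ = np.linalg.lstsq(X, growth, rcond=None)
    return coef[0], coef[1]

# Simulated firms (assumed numbers): smaller firms grow faster on average.
rng = np.random.default_rng(0)
n_firms = 500
size = np.exp(rng.uniform(np.log(1e3), np.log(1e6), n_firms))  # initial subscribers
true_growth = 0.5 - 0.03 * np.log(size)
added = size * (true_growth + rng.normal(0.0, 0.05, n_firms))  # subscribers added

intercept, slope = fit_growth_on_log_size(size, added)
print(f"intercept={intercept:.3f}, slope on log(size)={slope:.3f}")
```

With the ratio as the outcome, the slope reads directly as the change in growth rate per unit of log size, which is the interpretability argument made above.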

4 thoughts on “Modeling growth”

  1. Andrew, I'm not sure it makes sense to control for size *and* use the ratio as an outcome. Isn't the following argument valid?

    We're interested in fitting this equation:

    (1) z = Bx + e.

    But z is a growth rate, with x as the denominator:

    (2) z = y/x.

    So:

    (3) y/x = Bx + e.

    Because it is a regressor in our model, x is fixed and known. It's not random and is not determined by the error e. So in this framework the outcome/response variable of the regression can't genuinely be the growth ratio, z. The response variable actually has to be the numerator y, as putting things the right way round we've really got:

    (4) y = Bx^2 + ex.

    Now, (4) is a pretty strange model. But if you control for a fixed denominator, then you can't be studying a random ratio as your outcome. In fact, you have to be studying the numerator, because that's the only place left for randomness. You're just doing it in a rather strange way, with a more complicated model than you think.

    I don't see anything wrong with (4) per se, so long as the model fits. But Charles might want to try other simpler models to study y first.

  2. Isn't it more like:

    (5') log y – log x = Bx + e

    This model makes sense to me, at least as a linear approximation; small firms grow faster, perhaps, so B is negative.

    It seems to me you also need to take into account firms going out of business or being acquired; no fair looking just at the survivors.

  3. As the commenters have indicated, there are lots of possible models that could make sense here. That was why I said it was a good idea to include size as a regression predictor, even if it already seemed to be included by being in the denominator somewhere.
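One way to see why including size as a predictor is harmless on the log scale: if log x is a predictor, then regressing log y − log x on it and regressing log y on it give the same fit, with the slope on log x shifted by exactly 1. The sketch below checks this numerically. It is an illustration under assumed, simulated data (the generating coefficients and the helper `ols` are hypothetical), and it uses log x as the predictor rather than x itself as in (5').

```python
import numpy as np

def ols(X, y):
    """Return least-squares coefficients and residuals (hypothetical helper)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

# Simulated data (assumed numbers, for illustration only).
rng = np.random.default_rng(1)
n = 200
log_x = rng.uniform(7.0, 14.0, n)                        # log of initial size
log_y = 0.2 + 0.9 * log_x + rng.normal(0.0, 0.3, n)      # log of the numerator
X = np.column_stack([np.ones(n), log_x])

coef_ratio, resid_ratio = ols(X, log_y - log_x)  # outcome: log ratio
coef_level, resid_level = ols(X, log_y)          # outcome: log numerator

print("slope, ratio outcome:    ", coef_ratio[1])
print("slope, numerator outcome:", coef_level[1])
print("residuals identical:", np.allclose(resid_ratio, resid_level))
```

The two regressions differ only in how the coefficient on log size is read, so under this log-scale setup the choice between the ratio and the numerator comes down to interpretability rather than bias.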
