Time-series regression question

Iain Pardoe writes,

I was wondering if you might have any thoughts on the following …

Suppose I have data collected over a period of years, with a response y and some predictors x. I want to predict y for the following year based on data collected up to that year. One approach is to model the data each year using ALL data collected up to that year. But what if you expect the relationship between y and x to change over time, i.e., you want to down-weight data from further in the past when fitting a model each year? You could ignore any data that is, say, more than 10 years old, but this seems a little ad hoc. What might be a reasonable approach to doing this that isn’t so ad hoc?

Any thoughts?
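For concreteness, here is a minimal sketch (Python with numpy) of the baseline approach described in the question: each period, refit ordinary least squares on everything observed so far and use it to predict the next value. The synthetic data and variable names are illustrative assumptions, not Iain's data.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 30
x = rng.normal(size=T)
slope = 1.0 + 0.05 * np.arange(T)             # a slowly drifting "true" relationship (toy data)
y = 2.0 + slope * x + rng.normal(scale=0.5, size=T)

def ols_fit(x, y):
    """Ordinary least squares for a one-predictor linear model."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Expanding window: at each period t, fit on everything up to t-1, then predict period t.
preds = []
for t in range(10, T):                        # wait for a few observations before the first fit
    b0, b1 = ols_fit(x[:t], y[:t])
    preds.append(b0 + b1 * x[t])

errors = y[10:] - np.array(preds)
print("expanding-window one-step-ahead RMSE:", np.sqrt(np.mean(errors**2)))
```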

9 thoughts on “Time-series regression question”

  1. "Moving window" regressions are useful. Most of

    the time, I think that our models are pitifully

    simple approximations of a complex world, so that

    we suffer from constantly shifting parameters. So

    it helps to be skeptical, and use moving window

    estimation rather than a single model for a long

    period.

  2. Weighted least squares regression might work for you. Rather than minimizing sum((y – yhat)^2), create an auxiliary variable w equal to (total number of observed periods – this period) + 1, so that w = 1 for the most recent period and grows as you go back in time. Then minimize sum(((y – yhat)/w)^2). This will result in the estimators keeping errors small with more recent data and tolerating larger errors with older periods. Note that the method of constructing the auxiliary variable w can be changed to suit your needs and still have the same effect, so long as w decreases as you move toward more recent data. (A small weighted least squares sketch appears after the comments.)

  3. Thanks for the suggestions – I'd like to pursue this further if I may. To simplify things, suppose I have a single predictor and I'm just doing OLS. Suppose I have 10 data points (collected over time) – I can use the fitted model of y(1-10) on x(1-10) to predict y(11) for given x(11). Then suppose I get an actual y(11) – I can repeat the process and use the fitted model of y(1-11) on x(1-11) to predict y(12) for given x(12). And so on.

    The WLS suggestion could use weights (1, 2, …, 10) for the first model, then weights (1, 2, …, 11) for the second, and so on. This certainly downweights older data, but might there be a way to let the data tell me what the weights should be (rather than simply using 1, …, n)?

    The moving window suggestion could regress y(1-10) on x(1-10) for the first model, then y(2-11) on x(2-11) for the second model, and so on. This also downweights older data (by not considering data older than 10 time periods ago in this case), but might there be a way to let the data tell me how long the moving window should be? (A sketch that picks the window length by one-step-ahead forecast error appears after the comments.)

    Or maybe some hybrid approach might work. Or perhaps a Bayesian approach with informative priors based on models from previous years …?

  4. You might want to try exponential weights (w[t]=1, w[t-i]=r**i, |r| < 1), similar to what is done with an exponentially weighted moving average. Then use a general least squares algorithm, in something like Excel Solver, to find the value of r that gives the best fit. You can also kludge up a version of this with lagged regression, using the previous response, y[t-1], as a predictor for the current response y[t]. Check out Abraham and Ledolter's Statistical Methods for Forecasting for details. (An exponential-weighting sketch, with r chosen by one-step-ahead forecast error, appears after the comments.)

  5. One issue that you might want to take into account is the nature of the change in the relationship between y and x. If you believe it to be a gradual process, then some form of WLS might be the way to go. If you believe it to be less "smooth," or even jumpy, then there is a strong case for using a smaller but more recent data set as older data would only lead you to model the wrong relationship.

    A slightly less ad hoc method of letting your data tell you what weights to use in a WLS approach would be to use the auxiliary variable described above raised to some exponent k, so you'd be minimizing sum(((y – yhat)/w^k)^2), where k is some positive real number. For larger values of k, you'd be giving greater weight to more recent observations, and vice versa. From there you could use the resulting model to predict the value of y in the periods in your data set and decide which value of k generates estimators that best predict y. This test of predictive power still requires choosing a set of periods over which to evaluate the predictions, so you haven't entirely escaped the selection problem.

  6. I still think a hierarchical Bayes approach would be best, with the regression coefficients (or, equivalently, "weights") having a prior distribution with a functional dependence on the lag, and with that functional dependence having hyperparameters estimated from the data. Basically this is similar to these various weighting approaches except that I think it would be better at handling uncertainty, be more easily generalizable to larger structured data sets, and be easier to check (using posterior predictive checks) and improve.

  7. M Hashem Pesaran has a number of papers on his website on this issue, which is dear to my heart. Much depends on how you think that the parameters will change over time.

    If you have an a priori reason to believe that the parameters will slowly drift around but will basically be stationary, then some form of Kalman filter would be the way to go. I would bet decent money that Andrew's hierarchical Bayesian suggestion above would have some representation as a state space model equivalent to the Kalman filter. (A minimal random-walk-coefficient Kalman filter sketch appears after the comments.)

    If you think that the parameters will drift over time in a nonstationary way, but that the process will be continuous, then the best thing to do is to estimate the full-dataset model, estimate a model on (say) the most recent 60 observations and then take a weighted average. The issue of what weighting scheme to use, and how to decide on a window for the rolling estimates, is the sort of thing that Pesaran goes into in depth. For what it is worth, I have had reasonable luck with cointegrated financial time series using a 60 observation window and a weight of 33/67 on the rolling/recursive models. (A short sketch of this combination appears after the comments.)

    However, if, as is often the case with data of interest, the parameters move around because of sudden regime shifts (thus making *all* data from before the shift more or less irrelevant to the prediction problem), then you have to allow for this. The search term to use is "stochastic permanent breaks" or "STOPBREAKS", which is Robert Engle's model for this sort of situation and is AFAIK state of the art. I think that the best way to estimate this model too would be some kind of hierarchical Bayesian but this would be way over my head.

    David Hendry reckons in his book on econometric forecasts that if you are forecasting series like GDP growth one step ahead, you can pick up almost all of the benefit of more sophisticated methods simply by using "intercept correction": adjusting your model's intercept by last period's residual. This is probably domain-specific, but it certainly involves less brainache than other methods. (A tiny intercept-correction sketch appears after the comments.)

  8. (And finally, whatever model you use, a neat trick is to reverse the order of your residuals, square them, and perform a CUSUM test to see whether parameter instability and structural breaks are actually something you need to worry about.) (A sketch of this check appears after the comments.)

  9. Suppose I have data collected over a period of 10 years, with a response y (by year and month) and some predictors x. I want to predict y for the following year based on data collected up to that year. To fit a model, can I use factor analysis, or can you suggest a model that would suit best? Please reply as soon as possible.
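Comment 1's moving-window idea, sketched under the same kind of toy setup as above (the data, the window length of 10, and the helper names are illustrative assumptions): refit on only the most recent observations before each one-step-ahead prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 30
x = rng.normal(size=T)
y = 2.0 + (1.0 + 0.05 * np.arange(T)) * x + rng.normal(scale=0.5, size=T)   # toy data

def ols_fit(x, y):
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

window = 10                                   # arbitrary; see the window-choice sketch below
preds = []
for t in range(window, T):
    b0, b1 = ols_fit(x[t - window:t], y[t - window:t])   # only the last `window` points
    preds.append(b0 + b1 * x[t])                          # forecast made before seeing y[t]

errors = y[window:] - np.array(preds)
print("moving-window one-step-ahead RMSE:", np.sqrt(np.mean(errors**2)))
```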
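Comment 2's weighted least squares scheme, sketched with toy data (illustrative, not the commenter's code). Dividing each residual by the auxiliary variable w is the same as weighting each squared residual by 1/w^2, which is how it is written here.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 30
x = rng.normal(size=T)
y = 2.0 + (1.0 + 0.05 * np.arange(T)) * x + rng.normal(scale=0.5, size=T)   # toy data

def wls_fit(x, y, w):
    """Minimize sum(w * (y - yhat)**2) for a one-predictor linear model."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Comment 2's auxiliary variable: 1 for the most recent period, larger further back.
periods = np.arange(1, T + 1)
aux = (T - periods) + 1
weights = 1.0 / aux**2        # dividing each residual by aux = weighting by 1/aux^2

b0, b1 = wls_fit(x, y, weights)
x_next = 0.3                  # hypothetical next-period predictor value
print("recency-weighted coefficients:", b0, b1)
print("prediction for the next period:", b0 + b1 * x_next)
```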
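Comment 3 asks whether the data can choose the moving-window length. One plain, if rough, answer is to score each candidate window by its historical one-step-ahead forecast error and keep the best; this sketch does that with toy data and an arbitrary candidate grid (both are assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)
T = 60
x = rng.normal(size=T)
y = 2.0 + (1.0 + 0.05 * np.arange(T)) * x + rng.normal(scale=0.5, size=T)   # toy data

def ols_fit(x, y):
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def one_step_rmse(x, y, window, start):
    """One-step-ahead forecast RMSE when fitting on the last `window` points."""
    errs = []
    for t in range(start, len(y)):
        b0, b1 = ols_fit(x[t - window:t], y[t - window:t])
        errs.append(y[t] - (b0 + b1 * x[t]))
    return np.sqrt(np.mean(np.square(errs)))

candidates = [5, 10, 15, 20, 30, 40]          # arbitrary grid of window lengths
start = max(candidates)                       # score every window on the same periods
scores = {w: one_step_rmse(x, y, w, start) for w in candidates}
best = min(scores, key=scores.get)
print({w: round(s, 3) for w, s in scores.items()})
print("window length with the smallest one-step-ahead RMSE:", best)
```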
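Comment 4's exponential weights, with the discount factor r chosen by out-of-sample fit. Instead of a Solver-style optimizer, this sketch uses a simple grid search over r and scores each value by one-step-ahead forecast error (data and grid are illustrative assumptions). The same loop covers comment 5's power weights if you replace r ** lags with 1 / (lags + 1) ** k.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 60
x = rng.normal(size=T)
y = 2.0 + (1.0 + 0.05 * np.arange(T)) * x + rng.normal(scale=0.5, size=T)   # toy data

def wls_fit(x, y, w):
    """Minimize sum(w * (y - yhat)**2) for a one-predictor linear model."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def one_step_rmse(x, y, r, start=20):
    """Score a discount factor r by its one-step-ahead forecast errors."""
    errs = []
    for t in range(start, len(y)):
        lags = np.arange(t - 1, -1, -1)       # lag 0 for the newest point in the fit
        b0, b1 = wls_fit(x[:t], y[:t], r ** lags)
        errs.append(y[t] - (b0 + b1 * x[t]))
    return np.sqrt(np.mean(np.square(errs)))

# Grid search stands in for a Solver-style optimizer; r = 1.0 is the
# equal-weight (expanding window) benchmark.
grid = [0.80, 0.85, 0.90, 0.95, 0.99, 1.00]
scores = {r: one_step_rmse(x, y, r) for r in grid}
best_r = min(scores, key=scores.get)
print({r: round(s, 3) for r, s in scores.items()})
print("discount factor with the smallest one-step-ahead RMSE:", best_r)
```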
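Comment 7 points to the Kalman filter for slowly drifting parameters, and notes that the hierarchical Bayes suggestion has a state-space representation of this kind. Here is a minimal random-walk-coefficient Kalman filter on toy data; the drift covariance Q and noise variance R are hand-set assumptions (in practice you would estimate them, e.g. by maximum likelihood). Q plays the role that the window length or discount factor plays in the other approaches: it controls how quickly old information is forgotten.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 60
x = rng.normal(size=T)
true_slope = 1.0 + 0.05 * np.arange(T)        # drifting slope, as in the discussion (toy data)
y = 2.0 + true_slope * x + rng.normal(scale=0.5, size=T)

# Random-walk regression coefficients:
#   theta_t = theta_{t-1} + eta_t,    eta_t ~ N(0, Q)
#   y_t     = [1, x_t] theta_t + e_t, e_t  ~ N(0, R)
Q = 0.01 * np.eye(2)        # how fast the coefficients may drift (assumed, not estimated)
R = 0.25                    # observation noise variance (assumed, not estimated)

theta = np.zeros(2)         # prior mean for [intercept, slope]
P = 10.0 * np.eye(2)        # vague prior covariance

one_step_preds = []
for t in range(T):
    # Predict step: coefficients drift, so uncertainty grows.
    P = P + Q
    H = np.array([1.0, x[t]])
    one_step_preds.append(H @ theta)          # forecast made before seeing y[t]

    # Update step: fold in the new observation.
    S = H @ P @ H + R                         # forecast variance (scalar)
    K = P @ H / S                             # Kalman gain
    theta = theta + K * (y[t] - H @ theta)
    P = P - np.outer(K, H) @ P

print("final coefficient estimates (intercept, slope):", theta)
errors = y[10:] - np.array(one_step_preds)[10:]
print("one-step-ahead RMSE after burn-in:", np.sqrt(np.mean(errors**2)))
```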
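Comment 7's rolling/recursive combination, sketched literally: fit on all the data, fit on the most recent 60 observations, and combine the two forecasts with the 33/67 weights mentioned there (the data and x_new are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
x = rng.normal(size=T)
y = 2.0 + (1.0 + 0.01 * np.arange(T)) * x + rng.normal(scale=0.5, size=T)   # toy data

def ols_fit(x, y):
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

x_new = 0.3                                   # hypothetical next-period predictor value
window = 60                                   # the commenter's rolling window

b_recursive = ols_fit(x, y)                   # all observations
b_rolling = ols_fit(x[-window:], y[-window:]) # most recent 60 observations only

pred_recursive = b_recursive[0] + b_recursive[1] * x_new
pred_rolling = b_rolling[0] + b_rolling[1] * x_new

# The commenter's 33/67 weighting of the rolling and recursive forecasts.
pred = 0.33 * pred_rolling + 0.67 * pred_recursive
print("combined forecast:", pred)
```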
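Comment 7's closing suggestion, intercept correction, in its simplest form: shift the model forecast by last period's residual. The data and x_new are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 60
x = rng.normal(size=T)
y = 2.0 + (1.0 + 0.05 * np.arange(T)) * x + rng.normal(scale=0.5, size=T)   # toy data

X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

last_residual = y[-1] - X[-1] @ beta          # last period's in-sample error

x_new = 0.3                                   # hypothetical next-period predictor value
plain_forecast = beta[0] + beta[1] * x_new
corrected_forecast = plain_forecast + last_residual   # "intercept correction"

print("plain OLS forecast:  ", plain_forecast)
print("intercept-corrected: ", corrected_forecast)
```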
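Comment 8's check, taken literally: reverse the OLS residuals, square them, and look at the cumulative-sum path. Under stable parameters the normalized path should track the diagonal; the sketch only reports the maximum deviation, which you would compare against tabulated CUSUM-of-squares critical values rather than any threshold invented here. This is a rough diagnostic on toy data, not a packaged test.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 60
x = rng.normal(size=T)
y = 2.0 + (1.0 + 0.05 * np.arange(T)) * x + rng.normal(scale=0.5, size=T)   # toy data

X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

rev_sq = resid[::-1] ** 2                     # reversed, squared residuals
cusum_sq = np.cumsum(rev_sq) / np.sum(rev_sq) # normalized cumulative sum, ends at 1

# Under stable parameters the path should hug the diagonal t/T; large excursions
# point to instability.  Compare the maximum deviation against tabulated
# CUSUM-of-squares critical values rather than any fixed cutoff here.
expected = np.arange(1, T + 1) / T
max_dev = np.max(np.abs(cusum_sq - expected))
print("max deviation of reversed CUSUM-of-squares path from the diagonal:", round(max_dev, 3))
```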
