"A stationary time series is predictable; a non-stationary one is deceptive." Understanding this difference is the first step to reliable forecasting and avoiding the trap of spurious regression in quantitative analysis.
In quantitative methods and econometrics, we analyze data collected over time, called time series. The most critical property of a time series is whether it is stationary or non-stationary. This distinction determines which statistical tools we can use and how we interpret results.
What is a Stationary Time Series?
A stationary time series has statistical properties that do not change over time. Specifically, its mean and variance are constant, and the covariance between two observations depends only on how far apart they are, not on when they occur. This stability makes the series predictable and suitable for standard regression models.
Consider the daily average temperature in a climate-controlled room, measured over a year. The temperature fluctuates around a fixed set point (e.g., 22°C). The mean is constant, the variance (range of fluctuation) is stable, and today's temperature is not heavily influenced by yesterday's extreme value.
A simple example is a sequence of random numbers, where each value is independent and drawn from the same distribution (e.g., a normal distribution with mean 0 and variance 1). The series has no trend, no seasonality, and no persistent patterns.
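As a rough illustration of this example, the sketch below draws such a sequence with NumPy and compares the two halves of the sample. The seed, sample size, and variable names are arbitrary choices made for the illustration.

```python
import numpy as np

# Seed and sample size are arbitrary choices for this illustration.
rng = np.random.default_rng(seed=0)
white_noise = rng.normal(loc=0.0, scale=1.0, size=10_000)

# Split the sample in two: a stationary series should look alike in both halves.
first_half, second_half = white_noise[:5_000], white_noise[5_000:]
print(f"mean:     {first_half.mean():+.3f} vs {second_half.mean():+.3f}")
print(f"variance: {first_half.var():.3f} vs {second_half.var():.3f}")

# Correlation between each value and the one before it (lag-1 autocorrelation).
lag1_autocorr = np.corrcoef(white_noise[:-1], white_noise[1:])[0, 1]
print(f"lag-1 autocorrelation: {lag1_autocorr:+.3f}")
```

Both halves should report a mean near 0 and a variance near 1, and the lag-1 autocorrelation should be close to 0, which is what "no persistent patterns" looks like in numbers.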
What is a Non-Stationary Time Series?
A non-stationary time series has statistical properties that change over time. The most common type has a unit root, meaning it contains a stochastic trend. Shocks to this series have a permanent effect, and it does not revert to a long-term mean.
The value of a major stock index generally increases over the long term (a trend). A major market crash (a shock) permanently lowers the level from which future growth continues. The mean value of the index is not constant over different decades.
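To make "permanent" versus "temporary" concrete, the sketch below assumes a simple autoregressive model, Y_t = phi * Y_(t-1) + noise, which is introduced here only for illustration. With phi below 1 the series is mean-reverting; with phi equal to 1 it has a unit root. The function name `shock_path` and the parameter values are hypothetical.

```python
import numpy as np

def shock_path(phi, shock=1.0, horizon=20):
    """Remaining effect of a one-time shock after 0..horizon periods
    in the assumed model Y_t = phi * Y_(t-1) + noise."""
    return shock * phi ** np.arange(horizon + 1)

stationary = shock_path(phi=0.5)  # mean-reverting: the effect halves each period
unit_root = shock_path(phi=1.0)   # unit root: the effect never decays

for h in (0, 1, 5, 20):
    print(f"h={h:2d}  stationary={stationary[h]:.6f}  unit root={unit_root[h]:.1f}")
```

In the mean-reverting case the shock has all but vanished after 20 periods, while in the unit root case the full shock is still present at every horizon.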
A classic model is the random walk: Today's value = Yesterday's value + Random noise. If you start at 100 and add a random number each day (e.g., +2, -1, +3), the path wanders without a fixed anchor. Its variance grows in proportion to the time elapsed, so it increases without bound.
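The growth in variance can be checked by simulation. The sketch below assumes independent normal shocks with variance 1, in which case the variance of the value after t days is t. The number of paths, the horizon, and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed
n_paths, n_steps = 5_000, 400

# Each row is one simulated path: start at 100, then add a N(0, 1) shock each day.
shocks = rng.normal(loc=0.0, scale=1.0, size=(n_paths, n_steps))
paths = 100.0 + np.cumsum(shocks, axis=1)

# Spread across paths after 100 and after 400 days (column t-1 holds day t).
var_day_100 = paths[:, 99].var()
var_day_400 = paths[:, 399].var()
print(f"variance after 100 days: {var_day_100:.1f}")
print(f"variance after 400 days: {var_day_400:.1f}")
print(f"ratio: {var_day_400 / var_day_100:.2f}")
```

The variance across paths should be roughly 100 after 100 days and roughly 400 after 400 days. Quadrupling the horizon quadruples the spread of possible outcomes.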
The Danger: Spurious Regression
Using standard regression on non-stationary series can produce spurious results. You might find a statistically significant relationship between two unrelated trending series, like ice cream sales and the national debt, simply because both are growing over time.
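A small Monte Carlo experiment shows how often this happens. The sketch below regresses one simulated series on another, independent one and counts how often the slope looks significant at the usual 5% level (|t| > 1.96). The helper names, sample sizes, and seed are assumptions made for the illustration, and the OLS t-statistic is computed by hand with NumPy to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(seed=2)  # arbitrary seed

def slope_t_stat(x, y):
    """t-statistic of the slope from an OLS regression of y on x plus an intercept."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    sigma2 = residuals @ residuals / (n - 2)
    slope_var = sigma2 * np.linalg.inv(X.T @ X)[1, 1]
    return beta[1] / np.sqrt(slope_var)

def random_walk(n):
    return np.cumsum(rng.normal(size=n))

def white_noise(n):
    return rng.normal(size=n)

def false_positive_rate(make_series, n_sims=500, n_obs=200, critical=1.96):
    """Share of regressions between two INDEPENDENT series with |t| > critical."""
    hits = sum(
        abs(slope_t_stat(make_series(n_obs), make_series(n_obs))) > critical
        for _ in range(n_sims)
    )
    return hits / n_sims

rate_walks = false_positive_rate(random_walk)
rate_noise = false_positive_rate(white_noise)
print(f"independent random walks: {rate_walks:.0%} 'significant'")
print(f"independent white noise:  {rate_noise:.0%} 'significant'")
```

For white noise the rate should sit near the nominal 5%. For independent random walks it is typically far higher, even though the two series are unrelated by construction.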
⚠️ Common Pitfall: Ignoring Non-Stationarity
- False Relationships: Regression may show high R-squared and significant t-statistics between two independent non-stationary series, suggesting a causal link that doesn't exist.
- Unreliable Forecasts: Models built on non-stationary data will have poor out-of-sample forecasting performance because the underlying structure is changing.
- Solution: Always test for stationarity before modeling (e.g., using the Augmented Dickey-Fuller test, whose null hypothesis is that the series has a unit root). If the test fails to reject a unit root, apply differencing to make the series stationary.
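The idea behind the test can be sketched by hand. The code below is a simplified Dickey-Fuller regression with an intercept and no lagged differences, so it is not the full augmented test. In practice a library routine such as `adfuller` in `statsmodels.tsa.stattools` would normally be used instead. The function name `dickey_fuller_stat` and the simulated series are assumptions. The value -2.86 is the asymptotic 5% critical value for the intercept-only case.

```python
import numpy as np

def dickey_fuller_stat(y):
    """t-statistic on gamma in: diff(y)_t = alpha + gamma * y_(t-1) + error.
    Simplified: no lagged differences, so this is NOT the full augmented test."""
    y = np.asarray(y, dtype=float)
    dy, lagged = np.diff(y), y[:-1]
    X = np.column_stack([np.ones(len(lagged)), lagged])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    residuals = dy - X @ coef
    sigma2 = residuals @ residuals / (len(dy) - 2)
    gamma_se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / gamma_se

CRITICAL_5PCT = -2.86  # asymptotic 5% Dickey-Fuller value, intercept-only case

rng = np.random.default_rng(seed=3)  # arbitrary seed
walk = 100.0 + np.cumsum(rng.normal(size=500))
series = {
    "white noise": rng.normal(size=500),
    "random walk": walk,
    "differenced walk": np.diff(walk),
}
stats = {name: dickey_fuller_stat(values) for name, values in series.items()}
for name, stat in stats.items():
    verdict = "reject unit root" if stat < CRITICAL_5PCT else "cannot reject unit root"
    print(f"{name:17s} statistic = {stat:7.2f} -> {verdict}")
```

A strongly negative statistic is evidence against a unit root. The white noise and the differenced walk should both fall far below the critical value, while a random walk usually does not.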
How to Handle Non-Stationary Data: Differencing
The standard fix is differencing. Instead of analyzing the raw series (Y_t), analyze the changes from one period to the next: ΔY_t = Y_t - Y_(t-1). This often removes the stochastic trend and creates a stationary series.
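A minimal sketch of first differencing with NumPy follows, applied to a simulated random walk. The series, seed, and the helper `half_means` are hypothetical and exist only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=4)  # arbitrary seed
levels = 100.0 + np.cumsum(rng.normal(size=2_000))  # Y_t: a simulated random walk

# First difference, Y_t - Y_(t-1); the result is one observation shorter.
changes = np.diff(levels)

def half_means(x):
    """Mean of the first and second half of a series."""
    mid = len(x) // 2
    return x[:mid].mean(), x[mid:].mean()

print("levels, mean by half:  %.2f, %.2f" % half_means(levels))
print("changes, mean by half: %.2f, %.2f" % half_means(changes))

# Differencing loses nothing except the starting level:
rebuilt = levels[0] + np.cumsum(changes)
print("levels recovered:", np.allclose(rebuilt, levels[1:]))
```

The two halves of the differenced series should have nearly the same mean, while the halves of the level series can differ noticeably. The last line shows that the original levels can be rebuilt from the first value and the cumulative sum of the changes.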
Nominal GDP is non-stationary (it grows over time). The first difference of GDP (this year's GDP minus last year's GDP) is the absolute change in output; the growth rate is approximated by the first difference of the logarithm of GDP. This growth rate series is typically stationary, fluctuating around the long-term average growth rate.
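The sketch below separates two quantities that are easy to confuse: the plain first difference, which is an absolute change in currency units, and the first difference of the logarithm, which approximates the growth rate. The GDP path is simulated, and the starting level, average growth of 4%, and volatility of 2% are hypothetical values, not real data.

```python
import numpy as np

rng = np.random.default_rng(seed=5)  # arbitrary seed
n_years, mean_growth, growth_sd = 60, 0.04, 0.02  # hypothetical values

# Hypothetical GDP path: each year's level is last year's level times exp(growth).
log_growth = rng.normal(loc=mean_growth, scale=growth_sd, size=n_years)
gdp = 1_000.0 * np.exp(np.cumsum(log_growth))

absolute_change = np.diff(gdp)      # currency units; scales with the level of GDP
growth_rate = np.diff(np.log(gdp))  # approximately the percentage growth rate

print(f"avg absolute change, first 10 years: {absolute_change[:10].mean():.1f}")
print(f"avg absolute change, last 10 years:  {absolute_change[-10:].mean():.1f}")
print(f"avg growth rate, first 10 years: {growth_rate[:10].mean():.3f}")
print(f"avg growth rate, last 10 years:  {growth_rate[-10:].mean():.3f}")
```

In this simulation the absolute changes get larger as the economy grows, while the log differences stay close to the assumed 4% throughout. This is one reason growth rates are often computed from logs before testing for stationarity.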
| Property | Stationary Series | Non-Stationary Series (Unit Root) |
|---|---|---|
| Mean | Constant over time | Changes over time (has a trend) |
| Variance | Constant over time | Often increases over time |
| Effect of Shock | Temporary, fades away | Permanent, persists forever |
| Long-run Behavior | Reverts to mean | Wanders without bound |
| Forecastability | Good, based on stable structure | Poor, best guess is last value |
| Regression Safety | Safe for standard models | Risky, causes spurious regression |
| Common Example | Temperature in a stable room | Stock prices, GDP level |