Understanding Long Run Regressions using the Wave Function

1. Introduction

In this post, I show how long run predictive regressions like the ones studied in Fama and French (1988) or Campbell (2003) can be understood using the wave function, the solution to a second order partial differential equation known as the wave equation, rather than sums of correlations as in Cochrane (2005). First, in Section 2 I introduce the idea of a long run regression and relate the coefficient of interest, $\beta(h)$, where $h$ is the horizon, to the auto-correlations of the returns. The beta of a regression of the returns over the next $h$ periods on today’s returns is a weighted sum of auto-correlations, so long run regressions are able to identify minute amounts of return predictability if this predictability is persistent. Then, in Section 3 I show how to model this same phenomenon using the wave function from physics.

2. Long Run Regressions

What is a long run regression? Suppose that we are looking at annual data with years indexed by $t = 0, 1, 2, \ldots$. Fama and French (1988) then run the regression below, where $h \geq 1$ is the investment horizon:

(1) $\begin{align*} r_{t \to (t+h)} &= \alpha(h) + \beta(h) \cdot r_{(t-1) \to t} + \varepsilon_{t \to (t+h)} \end{align*}$

This regression captures the relationship between the realized returns over the past year and the expected returns over the next $h$ years. A $\beta(h) > 0$ would mean that high returns in the past year predict high returns over the next $h$ years and vice versa. I want to map this $\beta(h)$ estimate into a formula composed of auto-correlations rather than just variances and covariances, as I want to understand how varying the time horizon changes the $\beta(h)$ estimate. To do this, I use results from Poterba and Summers (1988), who studied the variance ratio $\mathtt{VR}(h)$ of returns at horizon $h$. If returns are iid, then the variance of the $h$ period return grows linearly in $h$:

(2) $\begin{align*} \mathbb{V} \left[ r_{t \to (t+h)} \right] &= \mathbb{V} \left[ r_{t \to (t+1)} + r_{(t+1) \to (t+2)} + \dots + r_{(t+h-1) \to (t+h)}\right] \\ &= h \cdot \mathbb{V} \left[ r_{t \to (t+1)}\right] \end{align*}$

When returns are not iid, I can relate the variance ratio to a weighted sum of auto-correlations using the rules for variances of sums:

(3) $\begin{align*} \mathtt{VR}(h) &= \frac{1}{h} \cdot \frac{\mathbb{V} \left[ r_{t \to (t+h)} \right]}{\mathbb{V} \left[ r_{t \to (t+1)} \right]} \\ &= \frac{1}{h} \cdot \frac{\mathbb{V} \left[ r_{t \to (t+1)} + r_{(t+1) \to (t+2)} + \dots + r_{(t+h-1) \to (t+h)}\right]}{\mathbb{V} \left[ r_{t \to (t+1)} \right]} \\ &= 1 + \frac{2}{h} \cdot \sum_{i=1}^{h-1} \left( h - i \right) \cdot \rho_i \end{align*}$

Given that the variance terms drop out, the variance ratio is just $1$ (its value for iid returns) plus a weighted sum of auto-correlations. The $\left( h-i \right)$ term (i.e., the weights) comes from the fact that there are $h-1$ pairs of returns that are $1$ period apart, $h-2$ pairs that are $2$ periods apart, and so on, with each pair entering the variance of the sum twice. I use the variable $\rho_i$ to capture the $i$ period ahead auto-correlation:

(4) $\begin{align*} \rho_i &= \mathtt{cor}\left[ r_{(t+i-1) \to (t+i)}, r_{(t-1) \to t} \right] \end{align*}$
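As a quick numerical check on the link between the variance ratio and the auto-correlations, here is a minimal Python sketch. It assumes, purely for illustration, that returns follow an AR(1) with a hypothetical coefficient `a`, so that $\rho_i = a^i$; the function names are mine and not from any of the papers cited above. It computes the variance ratio two ways: directly from the $h \times h$ correlation matrix of the $1$ period returns, and from the standard variance-of-sums expression $1 + 2 \cdot \sum_{i=1}^{h-1} (1 - i/h) \cdot \rho_i$.

```python
import numpy as np

def vr_from_matrix(rho, h):
    """Variance ratio from the full h x h correlation matrix; rho[i-1] holds rho_i."""
    rho = np.asarray(rho, dtype=float)
    padded = np.concatenate(([1.0], rho))  # padded[i] = rho_i, with rho_0 = 1
    lags = np.abs(np.subtract.outer(np.arange(h), np.arange(h)))
    corr = padded[lags]                    # entry (j, k) is rho_{|j-k|}
    # V[sum of h returns] / V[one return] is the sum of all entries; then divide by h.
    return corr.sum() / h

def vr_from_formula(rho, h):
    """Variance ratio as 1 plus a weighted sum of auto-correlations."""
    rho = np.asarray(rho, dtype=float)
    i = np.arange(1, h)
    return 1.0 + 2.0 * np.sum((1.0 - i / h) * rho[: h - 1])

a = 0.1                          # hypothetical AR(1) coefficient (an assumption)
rho = a ** np.arange(1, 11)      # rho_i = a^i for i = 1..10

for h in (1, 2, 5, 10):
    print(h, vr_from_matrix(rho, h), vr_from_formula(rho, h))
```

With all auto-correlations set to zero (the iid case), both functions return $1$ at every horizon.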

Now, using the fact that the $\beta(h)$ term is just the ratio of a covariance term to a variance term, I can solve for $\beta(h)$ as a sum of auto-correlations using the same method:

(5) $\begin{align*} \beta(h) &= \frac{\mathbb{C} \left[ r_{t \to (t+h)}, r_{(t-1) \to t} \right]}{\mathbb{V}\left[ r_{(t-1) \to t} \right]} \\ &= \sum_{i=1}^h \frac{\mathbb{C} \left[ r_{(t+i-1) \to (t+i)}, r_{(t-1) \to t} \right]}{\mathbb{V}\left[ r_{(t-1) \to t} \right]} \\ &= \sum_{i=1}^h \rho_i \end{align*}$

This formulation tells us that if returns are an auto-regressive process, then the long run regression coefficient $\beta(h)$ is the sum of its auto-correlations at lags $1$ through $h$, so a small but persistent auto-correlation accumulates as the horizon grows.
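To see how the long run slope lines up with the auto-correlations in practice, here is a minimal simulation sketch. It assumes an AR(1) return process with a hypothetical coefficient `a = 0.2` and standard normal shocks; neither is a value taken from the papers cited above, and the helper names are mine. Because the $h$ period return is the sum of $h$ one period returns, its covariance with the lagged return is the sum of the $h$ individual covariances, so the sketch compares the OLS slope with the running sum $\rho_1 + \dots + \rho_h$ of sample auto-correlations.

```python
import numpy as np

def simulate_ar1(a, n_obs, seed=0):
    """Simulate r_t = a * r_{t-1} + e_t with standard normal shocks (an assumption)."""
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal(n_obs)
    r = np.empty(n_obs)
    r[0] = shocks[0]
    for t in range(1, n_obs):
        r[t] = a * r[t - 1] + shocks[t]
    return r

def long_run_beta(r, h):
    """OLS slope from regressing the next h period return on the latest 1 period return."""
    n = len(r)
    csum = np.concatenate(([0.0], np.cumsum(r)))
    x = r[: n - h]                            # return over (t-1) -> t
    y = csum[h + 1 :] - csum[1 : n - h + 1]   # sum of the next h one period returns
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def sample_autocorr(r, i):
    """Sample auto-correlation of r at lag i >= 1."""
    return np.corrcoef(r[:-i], r[i:])[0, 1]

a = 0.2                                       # hypothetical AR(1) coefficient
r = simulate_ar1(a, n_obs=200_000)

for h in (1, 2, 5, 10):
    slope = long_run_beta(r, h)
    rho_sum = sum(sample_autocorr(r, i) for i in range(1, h + 1))
    print(h, round(slope, 3), round(rho_sum, 3))
```

The two printed columns should track each other closely at every horizon, up to sampling noise and small end-of-sample effects.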

3. The Wave Function

It turns out that we can model this same behavior using wave functions (i.e., a well chosen combination of sine and cosine functions) rather than auto-correlation coefficients. As suggestive evidence of why this approach might be plausible, consider Table 20.5 from Cochrane (2005) which finds that the predictability of log returns varies cyclically with the time horizon using annual data from 1926-1996 with coefficients ranging over the interval $\pm 0.30$ with a period of about $1$ cycle every $10$ years:

(6) $\begin{align*} \begin{array}{l|cccccc} & 1 & 2 & 3 & 5 & 7 & 10 \\ \hline \hline \beta(h) & 0.08 & -0.15 & -0.22 & -0.04 & 0.24 & 0.08 \end{array} \end{align*}$

Using this new formulation is helpful as it makes clear how combinations of return processes vibrating at different frequencies might be added or subtracted from one another via an analogy to Fourier analysis. Waves lie in frequency space which is indexed by horizon $h$ and time $t$ . Instead of thinking about an auto-regression equation, consider the following second order differential equation:

(7) $\begin{align*} \frac{\partial^2 r}{(\partial h)^2} &= \frac{1}{\phi^2} \cdot \frac{\partial^2 r}{(\partial t)^2} \end{align*}$

This equation says that the acceleration of returns with respect to the time horizon (i.e., the rate at which returns are increasing with respect to how long you hold onto an asset) is equal to the acceleration of returns with respect to time (i.e., how quickly the properties of the asset you are holding onto change with respect to time) scaled by a constant term $1/\phi^2$. The constant $\phi$ captures the mean reversion of the return process. Put differently, if you found an asset whose return was increasing at an increasing rate in the holding period, then you would want to hold onto that asset as long as possible. However, this wave equation says that the properties of the return are changing over time and the higher this acceleration of returns with respect to the time horizon, the faster the properties of the return have to be changing. The constant that regulates this relationship is $\phi$, the rate at which asset properties change in order to maintain the standing wave.

To solve this partial differential equation, I use the separation of variables below:

(8) $\begin{align*} r = H(h) \cdot T(t) \end{align*}$

This yields $2$ separate equations connected via a negative constant $-\theta^2$ as dictated by the physical properties of the problem:

(9) $\begin{align*} - \theta^2 &= \frac{1}{H} \cdot \frac{d^2 H}{(dh)^2} \\ &= \frac{1}{\phi^2} \cdot \frac{1}{T} \cdot \frac{d^2 T}{(dt)^2} \end{align*}$
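For readers who want the intermediate step, here is a short sketch of the algebra under the separation in equation (8). Substituting $r = H(h) \cdot T(t)$ into equation (7) and dividing through by $H \cdot T$ leaves one side that depends only on $h$ and another that depends only on $t$, so both must equal the same constant. Writing that constant as $-\theta^2$ gives one ordinary differential equation per variable:

$\begin{align*} T \cdot \frac{d^2 H}{(dh)^2} &= \frac{1}{\phi^2} \cdot H \cdot \frac{d^2 T}{(dt)^2} \\ \frac{d^2 H}{(dh)^2} + \theta^2 \cdot H &= 0 \\ \frac{d^2 T}{(dt)^2} + \theta^2 \cdot \phi^2 \cdot T &= 0 \end{align*}$

Each of these is a harmonic oscillator equation, so $H$ is a combination of $\sin(\theta \cdot h)$ and $\cos(\theta \cdot h)$, and $T$ is a combination of $\sin(\theta \cdot \phi \cdot t)$ and $\cos(\theta \cdot \phi \cdot t)$.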

From college math courses (e.g., see Boas (2006), Ch. 13, Sec. 4), we know that differential equations of this type are going to have solutions of the form of either $\sin \times \cos$ or $\sin \times \sin$; however, I know that $r$ has to be $0$ at $h=0$ as the property of no arbitrage dictates that I should never be able to earn excess returns without holding onto risk for some positive increment of time (even if that increment is really small), and I take $\partial r / \partial t$ to be $0$ at $t=0$ so that the return profile starts at rest. Thus, I get the functional form:

(10) $\begin{align*} r_n &= \sin \left[ \frac{n \cdot \pi \cdot h}{\overline{h}} \right] \cdot \cos \left[ \frac{n \cdot \pi \cdot \phi \cdot t}{\overline{h}} \right] \\ r &= \sum_{n=1}^\infty f_n \cdot \sin \left[ \frac{n \cdot \pi \cdot h}{\overline{h}} \right] \cdot \cos \left[ \frac{n \cdot \pi \cdot \phi \cdot t}{\overline{h}} \right] \\ &= \frac{8 \cdot \sigma_r}{\pi^2} \cdot \left( \sin\left[ \frac{\pi \cdot h}{\overline{h}} \right] \cdot \cos \left[ \frac{\pi \cdot \phi \cdot t}{\overline{h}} \right] \right. \\ &\qquad \qquad \left. - \frac{1}{9} \cdot \sin \left[ \frac{3 \cdot \pi \cdot h}{\overline{h}} \right] \cdot \cos \left[ \frac{3 \cdot \pi \cdot \phi \cdot t}{\overline{h}} \right] + \dots \right) \end{align*}$

Figure: 1 period ahead returns at different time horizons from $0$ to $\overline{h}$.

This solution gives the dynamics of the short rate process at each time horizon given an initial pluck of length $\sigma_r$ at the horizon $\overline{h}/2$. Put differently, if you thought about attaching the return process of length $\overline{h}$ to $2$ fixed end-points, then this solution relates where the short rate process is at each and every horizon $h \in [0,\overline{h}]$, given any observation of the short rate process on the interval.
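The series in equation (10) can be checked numerically. The sketch below is a minimal illustration, not a definitive implementation: it borrows $\overline{h} = 10$ and $\phi = 0.12$ from the numbers that appear in equation (11), sets $\sigma_r = 1$ as a hypothetical scale, and uses function names of my own. It evaluates a truncated version of the series (odd modes with alternating $1/n^2$ weights), uses central finite differences to measure how far the result is from satisfying equation (7), and confirms that at $t=0$ the profile at the midpoint $h = \overline{h}/2$ approaches $\sigma_r$ as more terms are added.

```python
import numpy as np

def wave_return(h, t, h_bar=10.0, phi=0.12, sigma_r=1.0, n_modes=3):
    """Truncated series from equation (10), keeping the first n_modes odd modes."""
    total = 0.0
    for k in range(n_modes):
        n = 2 * k + 1                                   # n = 1, 3, 5, ...
        f_n = 8.0 * sigma_r / np.pi**2 * (-1.0) ** k / n**2
        total = total + (f_n * np.sin(n * np.pi * h / h_bar)
                             * np.cos(n * np.pi * phi * t / h_bar))
    return total

def pde_gap(h, t, phi=0.12, dh=1e-3, dt=1e-2):
    """d2r/dh2 minus (1/phi^2) * d2r/dt2, both by central finite differences."""
    d2h = (wave_return(h + dh, t, phi=phi) - 2.0 * wave_return(h, t, phi=phi)
           + wave_return(h - dh, t, phi=phi)) / dh**2
    d2t = (wave_return(h, t + dt, phi=phi) - 2.0 * wave_return(h, t, phi=phi)
           + wave_return(h, t - dt, phi=phi)) / dt**2
    return d2h - d2t / phi**2

print(pde_gap(3.0, 7.0))                      # close to zero
print(wave_return(0.0, 4.0))                  # fixed end-point at h = 0
print(wave_return(5.0, 0.0, n_modes=2000))    # approaches sigma_r at the midpoint
```

Only a few modes are used for the finite-difference check because the full series has a kink at the pluck point, where second derivatives are not well behaved.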

Thus, keeping only the first ($n=1$) term of the series with $\overline{h} = 10$ and $\phi = 0.12$, the predictive regression coefficient will be:

(11) $\begin{align*} \beta(h) &= \frac{\mathbb{C} \left[ r_{t \to (t+h)}, r_{(t-1) \to t} \right]}{\mathbb{V} \left[ r_{(t-1) \to t} \right]} \\ &= \frac{\mathbb{C} \left[ \sum_{i=1}^h \sin \left\{ \frac{\pi \cdot i}{10} \right\} \cdot \cos \left\{\frac{\pi \cdot 0.12 \cdot t}{10} \right\}, \sin \left\{\frac{\pi \cdot (-1)}{10} \right\} \cdot \cos \left\{ \frac{\pi \cdot 0.12 \cdot t}{10} \right\} \right]}{\mathbb{V} \left[ \sin \left\{ \frac{\pi \cdot (-1)}{10} \right\} \cdot \cos \left\{\frac{\pi \cdot 0.12 \cdot t}{10} \right\} \right]} \end{align*}$

Note that for both the covariance and variance operators, any terms without a $t$ in them are constants. Thus, we can see again that the $\cos$ terms will drop out and we will get a sum of $\sin$ terms just as before.
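The cancellation of the $\cos$ terms can be confirmed with a few lines of Python. This is a sketch under stated assumptions: it uses the numbers $10$ and $0.12$ from equation (11), an arbitrary (hypothetical) grid of dates `dates`, and helper names of my own. It computes the covariance-to-variance ratio over the dates and compares it with the ratio of $\sin$ terms that is left once the common $\cos$ factor drops out.

```python
import numpy as np

H_BAR, PHI = 10.0, 0.12        # the numbers that appear in equation (11)
dates = np.arange(200.0)       # hypothetical sample of dates t

def mode(h, t):
    """Single sin x cos term evaluated at horizon h and date(s) t."""
    return np.sin(np.pi * h / H_BAR) * np.cos(np.pi * PHI * t / H_BAR)

def beta_sample(h, t=dates):
    """Covariance over variance across dates, as written in equation (11)."""
    y = sum(mode(i, t) for i in range(1, h + 1))
    x = mode(-1, t)
    return np.cov(y, x)[0, 1] / np.var(x, ddof=1)

def beta_sines(h):
    """What remains after the cos terms cancel: a ratio of sin terms."""
    i = np.arange(1, h + 1)
    return np.sin(np.pi * i / H_BAR).sum() / np.sin(-np.pi / H_BAR)

for h in (1, 2, 3, 5, 7, 10):
    print(h, beta_sample(h), beta_sines(h))
```

The two columns agree up to floating point error for any choice of dates on which the $\cos$ term is not constant, which is the sense in which only the $\sin$ terms matter for $\beta(h)$.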