Research Notebook

Understanding Long Run Regressions using the Wave Function

August 29, 2011 by Alex

1. Introduction

In this post, I show how long run predictive regressions like the ones studied in Fama and French (1988) or Campbell (2003) can be understood using the wave function, a second order partial differential equation, rather than sums of correlations as in Cochrane (2005). First, in Section 2 I introduce the idea of a long run regression and relate the coefficient of interest, \beta(h), where h is the horizon to the auto-correlation of the returns. The beta of a regression of the returns over the next h periods on today’s returns is a weighted sum of auto-correlations, so long long run regressions are able to identify minute amounts of return predictability if this predictability is persistent. Then, in Section 3 I show how to model this same phenomenon using the wave function from statistical mechanics.

2. Long Run Regressions

What is a long run regression? Suppose that we are looking at annual data with years indexed by t = 0, 1, 2\ldots Fama and French (1988) then run the regression below where h \geq 1 is the investment horizon:

(1)   \begin{align*} r_{t \to (t+h)} &= \alpha(h) + \beta(h) \cdot r_{(t-1) \to t} + \varepsilon_{t \to (t+h)} \end{align*}

This regression captures the relationship between the realized returns over the past year and the expected returns over the next h years. A \beta(h) > 0 would mean that high returns in the past year would predict high returns over the next h years and vice versa. I want to map this \beta(h) estimate into a formula composed of autocorrelations rather than just variances and covariances as I want to understand how varying the time horizon changes the \beta(h) estimate. To do this, I use results from  Poterba and Summers (1988) who studied the variance ratio \mathtt{VR}(h) of returns at horizon h. Suppose that returns are iid, then we can write:

(2)   \begin{align*} \mathbb{V} \left[ r_{t \to (t+h)} \right] &= \mathbb{V} \left[ r_{t \to (t+1)} +  r_{(t+1) \to (t+2)} + \dots + r_{(t+h-1) \to (t+h)}\right] \\ &= h \cdot \mathbb{V} \left[  r_{t \to (t+1)}\right] \end{align*}

Then, I know that I can relate the variance ratio to a weighted sum of auto-correlations as follows using the rules of variances of sums:

(3)   \begin{align*} \mathtt{VR}(h) &= \frac{1}{h} \cdot \frac{\mathbb{V} \left[ r_{t \to (t+h)} \right]}{\mathbb{V} \left[ r_{t \to (t+1)} \right]} \\ &= \frac{1}{h} \cdot \frac{\mathbb{V} \left[ r_{t \to (t+1)} +  r_{(t+1) \to (t+2)} + \dots + r_{(t+h-1) \to (t+h)}\right]}{\mathbb{V} \left[ r_{t \to (t+1)} \right]} \\ &= \frac{1}{h} \cdot \left( \sum_{i=1}^h \left| h - i \right| \cdot \rho_i \right) \end{align*}

Given that the variance terms drop out, the variance ratio is essentially just a weighted sum of auto-correlations. The \left| h-i \right| term (i.e., the weights) comes from the fact that there are h-1 of the 1 period ahead auto-correlations, h-2 of the 2 period ahead auto-correlations… I use the variable \rho_i to capture the i period ahead auto-correlation:

(4)   \begin{align*} \rho_i &= \mathtt{cor}\left[ r_{(t+i-1) \to (t+i)}, r_{(t-1) \to t} \right] \end{align*}

Now, using the fact the \beta(h) term is just the ratio a covariance and variance term, I can solve for \beta(h) as a sum of auto-correlations using the same method:

(5)   \begin{align*} \beta(h) &= \frac{\mathbb{C} \left[ r_{t \to (t+h)}, r_{(t-1) \to t} \right]}{\mathbb{V}\left[ r_{t \to (t+h)} \right]} \\ &= \frac{h \cdot \mathbb{V} \left[ r_{t \to (t+1)} \right]}{\mathbb{V}\left[ r_{t \to (t+1)} \right]} \cdot \frac{1}{h} \cdot \sum_{i=1}^h \left| h - i \right| \cdot \rho_i \\ &= \sum_{i=1}^h \left| h - i \right| \cdot \rho_i \end{align*}

This formulation tells us that if returns are an auto-regressive process, then the long run regression coefficient \beta(h) is a sum of its auto-correlations at different horizons where the short horizon auto-correlations get the most weight.

3. The Wave Function

It turns out that we can model this same behavior using wave functions (i.e., a well chosen combination of sine and cosine functions) rather than auto-correlation coefficients. As suggestive evidence of why this approach might be plausible, consider Table 20.5 from Cochrane (2005) which finds that the predictability of log returns varies cyclically with the time horizon using annual data from 1926-1996 with coefficients ranging over the interval \pm 0.30 with a period of about 1 cycle every 10 years:

(6)   \begin{align*} \begin{array}{l|cccccc} & 1 & 2 & 3 & 5 & 7 & 10 \\ \hline \hline \beta(h) & 0.08 & -0.15 & -0.22 & -0.04 & 0.24 & 0.08 \end{array} \end{align*}

Using this new formulation is helpful as it makes clear how combinations of return processes vibrating at different frequencies might be added or subtracted from one another via an analogy to Fourier analysis. Waves lie in frequency space which is indexed by horizon h and time t. Instead of thinking about an auto-regression equation, consider the following second order differential equation:

(7)   \begin{align*} \frac{\partial^2 r}{(\partial h)^2} &= \frac{1}{\phi^2} \cdot \frac{\partial^2 r}{(\partial t)^2} \end{align*}

This equation says that the acceleration of returns with respect to the time horizon (i.e., the rate at which returns are increasing with respect to how long you hold onto an asset) is equal to the acceleration of returns with respect to time (i.e., how quickly do the properties of the asset you are holding onto change with respect to time) scaled by a constant term \phi^2. This constant term \phi captures the mean reversion of the return process. Put differently, if you found an asset whose return was increasing at an increasing rate in the holding period, then you would want to hold onto that asset as long as possible. However, this wave equation says that the properties of the return are changing over time and the higher this acceleration of returns with respect to the time horizon, the faster the properties of the return have to be changing. The constant that regulates this relationship is \phi—the rate at which asset properties change in order to maintain the standing wave.

To solve this partial differential equation, I perform the change of variables below:

(8)   \begin{align*} r = H(h) \cdot T(t) \end{align*}

This yields 2 separate equations connected via a negative constant -\theta^2 as dictated by the physical properties of the problem:

(9)   \begin{align*} - \theta^2 &= \frac{1}{H} \cdot \frac{d^2 X}{(dx)^2} \\ &= \frac{1}{\phi^2} \cdot \frac{1}{T} \cdot \frac{d^2 T}{(dt)^2} \end{align*}

From college math courses (e.g., see Boas (2006) Ch: 13, Sec: 4.), we know that differential equations of this type are going to have solutions of a form of either \sin \times \cos or \sin \times \sin; however, I know that \partial r / \partial t has to be 0 at h=0 as the property of no arbitrage dictates that I should never be able to earn excess returns without holding onto risk for some positive increment of time (…even if that increment is really small). Thus, I get the functional form:

(10)   \begin{align*} r &= \sin \left[ \frac{n \cdot \pi \cdot h}{\overline{h}} \right] \cdot \cos \left[ \frac{n \cdot \pi \cdot \phi \cdot t}{\overline{h}} \right] \\ &= \sum_{n=1}^\infty f_n \cdot \sin \left[ \frac{n \cdot \pi \cdot h}{\overline{h}} \right] \cdot \cos \left[ \frac{n \cdot \pi \cdot \phi \cdot t}{\overline{h}} \right] \\ &= \frac{8 \cdot \sigma_r}{\pi^2} \cdot \left( \sin\left[ \frac{\pi \cdot h}{\overline{h}} \right] \cdot \cos \left[ \frac{\pi \cdot \phi \cdot t}{\overline{h}} \right] \right. \\ &\qquad \qquad \left. - \frac{1}{9} \cdot \sin \left[ \frac{3 \cdot \pi \cdot h}{\overline{h}} \right] \cdot \cos \left[ \frac{3 \cdot \pi \cdot \phi \cdot t}{\overline{h}} \right] + \dots \right) \end{align*}

1 period ahead returns at different time horizons from 0 to ‾h.

This solution gives the dynamics of the short rate process process at each time horizon given an initial pluck of length \sigma_r at the horizon \overline{h}/n. Put differently, if you thought about attaching the return process of length \overline{h} to 2 fixed end-points, then this solution relates where the short rate process at each and every horizon h \in [0,\overline{h}] is, given any observation of the short rate process on the interval.

Thus, the predictive regression coefficient will be:

(11)   \begin{align*} \beta(h) &= \frac{\mathbb{C} \left[ r_{t \to (t+h)} \right]}{\mathbb{V} \left[ r_{(t-1) \to t} \right]} \\ &= \frac{\mathbb{C} \left[ \sum_{i=1}^h \sin \left\{ \frac{\pi \cdot i}{10} \right\} \cdot \cos \left\{\frac{\pi \cdot 0.12 \cdot t}{10} \right\}, \sin \left\{\frac{\pi \cdot -1}{10} \right\} \cdot \cos \left\{ \frac{\pi \cdot 0.12 \cdot t}{10} \right\} \right]}{\mathbb{V} \left[ \sin \left\{ \frac{\pi \cdot -1}{10} \right\} \cdot \cos \left\{\frac{\pi \cdot 0.12 \cdot t}{10} \right\} \right]} \end{align*}

Note that for both the covariance and variance operators, any terms without a t in them are constants. Thus, we can see again that the \cos terms will drop out and we will get a sum of \sin terms just as before.

Filed Under: Uncategorized

Geometric Interpretation of Noisy Rational Expectations Equilibrium

August 27, 2011 by Alex

1. Introduction

In this post, I solve a simple noisy rational expectations equilibrium model from Grossman and Stiglitz (1980) and then give a geometric interpretation of their result. First, in Section 2 I set up and solve a noisy rational expectations model. Then, in Section 3 I show how to display the 2 linear projections embedded in the model on a 3D figure. I found the figure to be a useful way of keeping track of the assumptions in more complicated settings.

2. Solution

Consider a world with a single period and only 1 asset with price p and aggregate demand x:

(1)   \begin{align*} p &= a + b \cdot x \\ &= a + b \cdot \left( z + \varepsilon \right) \end{align*}

The coefficient a denotes the average price while the coefficient b represents the price’s responsiveness to aggregate demand changes. i.e., b represents the amount by which a restaurant would change its prices if all of the sudden twice as many people started showing up each evening. Suppose that there are some traders with knowledge of the true value of the asset v, but also other traders who trade randomly and demand an amount \varepsilon \sim \mathtt{N}(0,\sigma_{\varepsilon}^2). Suppose that the value of the asset is drawn from a distribution \mathtt{N}(\mu_v,\sigma_v^2). The coefficients a and b in the equation above are equilibrium objects which I will solve for below as is the value z which is the amount demanded by the informed agents which is also linear in the asset value with coefficients c and d:

(2)   \begin{align*} z = c + d \cdot v \end{align*}

There is a market maker who sets the equilibrium price in order to break even. Let \Pi denote the informed agent’s utility from trading:

(3)   \begin{align*} \Pi &= \max_{z} \left\{ \mathbb{E} \left[ (v - p) \cdot z \mid v \right] \right\} \\ &= \max_{z} \left\{ \mathbb{E} \left[ (v - a - b \cdot z - b \cdot \varepsilon) \cdot z \mid v \right] \right\} \end{align*}

Differentiating yields an expression for the optimal holdings of the informed traders given the true asset value:

(4)   \begin{align*} z &= - \frac{a}{2 \cdot b} + \left( \frac{1}{2 \cdot b} \right) \cdot v \end{align*}

Next, to solve for the coefficient values (a, b, c, d) as functions of the model primitives (\mu_v, \sigma_v, \sigma_{\varepsilon}), I enforce the break even condition for the market maker which demands that the price of the asset be equal to the expected value of the asset conditional on observing the aggregate asset demand:

(5)   \begin{align*} p &= \mathbb{E} \left[ v \mid x \right] \\ &= \mathbb{E} \left[ v \right] + \frac{\mathbb{C} \left[ x, v\right]}{\mathbb{V}\left[ x \right]} \cdot \left( x - \mathbb{E}[x] \right) \\ &= \mu_v + \frac{\mathbb{C} \left[ - \frac{a}{2 \cdot b} + \left( \frac{1}{2 \cdot b} \right) \cdot v + \varepsilon, v\right]}{\mathbb{V}\left[ - \frac{a}{2 \cdot b} + \left( \frac{1}{2 \cdot b} \right) \cdot v + \varepsilon \right]} \cdot \left( x - \left[ - \frac{a}{2 \cdot b} + \frac{\mu_v}{2 \cdot b} \right] \right) \\ &= \mu_v + \left( \frac{\left( \frac{1}{2 \cdot b} \right) \cdot \sigma_v^2}{ \left( \frac{1}{2 \cdot b} \right)^2 \cdot \sigma_v^2 + \sigma_{\varepsilon}^2} \right) \cdot \left( \frac{a}{2 \cdot b} - \frac{\mu_v}{2 \cdot b} \right) \\ &\qquad \qquad + \left( \frac{\left( \frac{1}{2 \cdot b} \right) \cdot \sigma_v^2}{ \left( \frac{1}{2 \cdot b} \right)^2 \cdot \sigma_v^2 + \sigma_{\varepsilon}^2} \right) \cdot x \end{align*}

Enforcing this condition leads to an expression for p that is linear in the aggregate asset demand x. Thus, I observe the b must be equal to the coefficient on x from the equation above and solve for b:

(6)   \begin{align*} b &= \frac{\left( \frac{1}{2 \cdot b} \right) \cdot \sigma_v^2}{ \left( \frac{1}{2 \cdot b} \right)^2 \cdot \sigma_v^2 + \sigma_{\varepsilon}^2} \\ &= \frac{\sigma_v}{2 \cdot \sigma_{\varepsilon}} \end{align*}

We see that the market maker tends to change prices more in response to an aggregate demand shock when the asset value is more volatile and when the noise trader demand is less volatile–i.e., when asset value changes aer more likely and when there is less demand noise for the informed traders to hide behind.

3. Geometric Interpretation

The plot below captures the essence of the intuition embedded in the noisy rational expectations model. The core idea is that the market maker cannot precisely distinguish a random fluctuation in demand from a shift in demand due to a shift in the asset value. Thus, when the market maker sets the price, he looks at the aggregate demand and shades the price a bit higher if he observes a high demand or a bit lower if he observes a surprisingly low demand schedule, but does not do so on a one for one schedule: 0 < b < 1. By using normally distributed random variables as well as linear pricing and informed demand rules, we can then get nice expressions for the coefficients of interest.

To read this plot, step into the shoes of the market maker and have a look at the blue side of the figure which shows the relationship between the price (i.e., the market maker’s expectation of the value on the y-axis) and the aggregate demand (x-axis). The line p = a + b \cdot x shows the price you will set if you observe an aggregate demand of x. Note that is x = 0, you will set the price of a–the y-intercept.

Where does this pricing rule come from? When you observe the aggregate demand of x, your best guess for the informed demand schedule is z as \mathbb{E} [ \varepsilon ] = 0. This best guess is displayed in the plot by the mapping through the z = u + v \cdot x line on the green floor of the figure over to the z-axis. On the red wall of the figure, we see that this choice of z has to map over to a realized value v that is a linear function of z and is equal to your choice of p. This is the double projection and fixed point problem that pins down the equilibrium values. For instance, note that at z=0, it must be the case that both the v and p functionals cross the y-axis at the same place as \mathbb{E}[\varepsilon]=0.

Filed Under: Uncategorized

Storing CRSP-COMPUSTAT Data Using MongoDB

August 22, 2011 by Alex

In this note, I show how to set up a local Mongo DB database to house CRSP-COMPUSTAT data. I grew tired of having to use SAS to access these data on the WRDS server. The coding language is difficult to use and the server is not particularly responsive. By using Mongo DB, I can access the data in a variety of languages such as Python and R as well as parallelize the queries to speed up any computations.

I downloaded monthly CRSP data and annual COMPUSTAT data from WRDS over the time period from 1950-2010. To set up the new database system, I downloaded MongoDB, PyMongo and Python. I am running Ubuntu 11.04 (“Natty Narwhal”) so just selected “mongodb” and “python-pymongo” from the Synaptic Package Manager. Downloading all of the software took less than 5 minutes. I then used the short piece of Python code located here to populate a new database.

Filed Under: Uncategorized

Standard Error Estimation with Overlapping Samples

August 13, 2011 by Alex

1. Introduction

In this post, I show how to compute corrected standard errors for a predictive regression with overlapping samples as in Hodrick (1992). First, in Section 2, I walk through a simple example which outlines the general empirical setting and illustrates why we would need to correct the standard errors on the coefficient estimates when faced with overlapping samples. Then, in Section 3 I compute the estimator for the standard errors proposed in Hodrick (1992). I conclude in Section 4 with a numerical simulation to verify that the mathematics below in fact computes a sensible estimate of the standard deviation of \beta.

2. An Illustrative Example

Suppose that you are a mutual fund manager who has to allocate capital amongst stocks and you want to know which stocks will earn the highest returns over the next H months where H stands for your investment horizon. To start with, you might consider H=1 and run a bunch of regressions with the form below where r_{t \to (t+1)} is the log 1 month excess return, z_t is a current state variable and \varepsilon_{t \to (t+1)} is the residual:

(1)   \begin{align*} r_{t \to (t+1)} &= \theta_1 + \theta_z \cdot z_t + \varepsilon_{t \to (t + 1)} \end{align*}

For example, Fama and French (1988) pick z_t to be the log price to dividend ratio while Jegadeesh and Titman (1993) pick z_t to be a dummy variable for a stock’s inclusion or exclusion from a momentum portfolio. We can vectorize the expression above to clean up the algebra and obtain the regression equation below:

(2)   \begin{align*} \underbrace{\begin{bmatrix} r_{1 \to 2} \\ r_{2 \to 3} \\ r_{3 \to 4} \\ \vdots \\ r_{(T-1) \to T} \end{bmatrix}}_{Y_{T-1}(1)} &= \underbrace{\begin{bmatrix} 1 & z_1 \\  1 & z_2 \\ 1 & z_3 \\ \vdots & \vdots \\ 1 & z_{T-1} \end{bmatrix}}_{X_{T-1}} \underbrace{\begin{pmatrix} \theta_1 \\ \theta_z \end{pmatrix}}_{\Theta(1)} + \underbrace{\begin{bmatrix} \varepsilon_{1 \to 2} \\ \varepsilon_{2 \to 3} \\ \varepsilon_{3 \to 4} \\ \vdots \\ \varepsilon_{(T-1) \to T} \end{bmatrix}}_{\mathcal{E}_{T-1}(1)} \end{align*}

However, just as Fama and French (1988) and Jegadeesh and Titman (1993) are interested in investment horizons of H>1, you could also set up the regression from above with H>1 by making the adjustments:

(3)   \begin{align*} r_{t \to (t+H)} &= \sum_{h=1}^H r_{(t+h-1) \to (t+h)} \\ \varepsilon_{t \to (t+H)} &= \sum_{h=1}^H \varepsilon_{(t+h-1) \to (t+h)} \end{align*}

Here, expression for \varepsilon_{t \to (t+H)} comes from the null hypothesis that z_t has no predictive power assuming that each of the \varepsilon_{t \to (t+1)} terms are \mathtt{iid} white noise. Thus, if you run a new set of regressions at the H=2 month investment horizon, you would have the vectorized regression equation:

(4)   \begin{align*} Y_{T-2}(2) &= X_{T-2} \ \Theta(2) + \mathcal{E}_{T-2}(2) \end{align*}

However, while estimating this equation and trying to compute the standard errors for \theta_z, you notice something troubling: even though each of the \varepsilon_{t \to (t+1)} terms is distributed \mathtt{iid} and act as white noise, the \varepsilon_{t \to (t+2)} and \varepsilon_{(t+1) \to (t+3)} terms each contain the \varepsilon_{(t+1) \to (t+2)} shock. Thus, while the step by step shocks are white noise, the regression residuals are autocorrelated in a non trivial way:

(5)   \begin{align*} r_{t \to (t+2)} &= 2 \cdot \theta_1 + \varepsilon_{t \to (t + 1)}  + \varepsilon_{(t+1) \to (t + 2)} \end{align*}

Thus, in order to properly account for the variability of your estimate of \theta_z, you will have to compute standard errors for the regression that take this autocorrelation and conditional heteroskedasticity into account.

3. Hodrick (1992) Solution

Standard econometric theory tells us that we can estimate \Theta(H) using GMM yielding the distributional result:

(6)   \begin{align*} \hat{\Theta}(H) - \Theta(H) &\sim \mathtt{N}\left( 0, V(H) \right) \end{align*}

with the variance covariance matrix given by the expression:

(7)   \begin{align*} V(H) &= \frac{1}{T-H} \cdot \mathbb{E}[X_{T-H} X_{T-H}^{\top}]^{-1} S_{T-H} \mathbb{E}[X_{T-H} X_{T-H}^{\top}]^{-1} \end{align*}

Thus, the autocovariance of the regression residuals will be captured by the S_{T-H} or spectral density term. A natural way to account for this persistence in errors would be to compute the S_{T-H} would be to compute something like the average of the autocovariances:

(8)   \begin{align*} S_{T-H} &= \sum_{j=-H+1}^{H-1} \mathbb{E} \left[ \left( \varepsilon_{t+H} \cdot \begin{bmatrix} 1 & z_t \end{bmatrix}^{\top} \right) \left( \varepsilon_{t+H-j} \cdot \begin{bmatrix} 1 & z_t \end{bmatrix}^{\top} \right)^{\top} \right] \end{align*}

However, this estimator for the spectral density has bad small sample properties as autocovariance matrices are only garraunteed to be positive semi-definite leading to large amounts of noise as your computer attempts to invert a nearly singular matrix. The insight in Hodrick (1992) is to use stationarity of the time series Y_{T-H}(H) and X_{T-H} to switch from summing autocovariances to variances:

(9)   \begin{align*} &= \mathbb{E} \left[ \varepsilon_{t+1}^2 \cdot \left( \sum_{h=0}^{H-1} \begin{bmatrix} 1 & z_{t - h} \end{bmatrix}^{\top} \right) \left( \sum_{h=0}^{H-1} \begin{bmatrix} 1 & z_{t - h} \end{bmatrix}^{\top} \right)^{\top} \right] \end{align*}

4. Simulation Results

In this section, I conclude by verifying my derivations with a simulation (code). First, I compute a data set of 1 month returns using a discretized version of an Ornstein-Uhlenbeck process with \Delta t = 1/12:

(10)   \begin{align*} r_{t \to (t+1)} &= \theta \cdot (\mu - r_{(t-1) \to t}) \cdot \Delta t + \sigma \cdot \sqrt{\Delta t} \cdot \varsigma_{t \to (t+1)} \end{align*}

with \varsigma_{t \to (t+1)} an \mathtt{iid} standard normal variable. I use the annualized moments below taken from Cochrane (2005):

(11)   \begin{align*} \begin{array}{l|c} & \textit{Value} \\ \hline \hline \mu & 0.08 \\ \theta & 0.70 \\ \sigma & 0.16 \end{array} \end{align*}

I also simulate a completely unrelated process z_t which represents T \mathtt{iid} draws from a standard normal distribution. Thus, I check my computations under the null hypothesis that z_t has no predictive power. I then run 500 simulations in which I compute the data series above, estimate the regression:

(12)   \begin{align*} Y_{T-6}(6) &= X_{T-6} \ \Theta(6) + \mathcal{E}_{T-6}(6) \end{align*}

and then report the distribution of \theta_z, as well as the naive and Hodrick (1992) implied standard errors:

Estimated coefficients for 500 simulated draws.

Estimated standard errors for 500 simulated draws using both the naive and Hodrick (1992) approaches.

I report the mean values from the simulations below:

(13)   \begin{align*} \begin{array}{l|c} \hat{\sigma}_{\mathtt{Naive}} & 0.00312 \\ \hline \hat{\sigma}_{\mathtt{H1992}} & 0.00326 \end{array} \end{align*}

Filed Under: Uncategorized

Co-Movement Between Bond and Stock Risk Premia

August 2, 2011 by Alex

1. Introduction

I compare the covariance between the bond risk premium as captured by the Cochrane and Piazzesi (2005) factor and the stock risk premium as captured by the logarithm of the price to dividend ratio as used in, say, Shiller (2006). This covariance measure gives a rough view of whether or not the same risk factors are driving the excess returns in each market.

In section 2, I recreate the Cochrane and Piazzesi (2005) factor and verify its properties. In section 3, I compute the log price to dividend ratio from Robert Shiller’s online data. Finally, in section 4, I conclude by computing the covariances between the Cochrane and Piazzesi (2005) factor, the log price to dividend ratio and the excess returns on the S&P 500.

2. Bond Risk Premium

In this section, I recreate the Cochrane and Piazzesi (2005) bond risk factor. The raw data are 1 through 5 year zero coupon bond prices which come from the CRSP Fama-Bliss risk free bond prices hosted on WRDS with run from January 1964 to December 2003. From these raw series, I convert the price as p_t = \ln \left( P_t/100 \right) and then compute bond yields, forward rates and returns as described below where the t subscript represents the current date and the \tau argument represents the bond maturity:

(1)   \begin{align*} y_t(\tau) \ &= \ - \dfrac{p_t(\tau)}{\tau} \\ f_t(\tau) \ &= \ p_t(\tau - 1) \ - \ p_t(\tau) \\ r_t(\tau) \ &= \ p_t(\tau - 1) \ - \ p_{t-12}(\tau) \end{align*}

Next, I run a regression of the average excess returns for the 2 through 5 year maturity bonds over the 1 year spot rate:

(2)   \begin{align*} \overline{r}^x_{t+1} &= \begin{bmatrix} \gamma_0 \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \\ \gamma_5 \end{bmatrix}' \begin{pmatrix} 1 \\ y_t(1) \\ f_t(2) \\ f_t(3) \\ f_t(4) \\ f_t(5) \end{pmatrix} + \varepsilon_{t+1} \\ &= \Gamma' F_t + \varepsilon_{t+1} \end{align*}

Using monthly data from 1964-2003, I find the regression results below which match up closely with those reported in Table 1 of Cochrane and Piazzesi (2005):

    \begin{align*} \begin{array}{l|cc} & \textit{est} & \textit{s.e.} \\ \hline \hline \gamma_0 & -0.03 & 0.0056 \\ \gamma_1 & -2.06 & 0.255 \\ \gamma_2 & 0.59 & 0.523 \\ \gamma_3 & 2.92 & 0.440 \\ \gamma_4 & 0.90 & 0.323 \\ \gamma_5 & -1.95 & 0.267 \\ \hline R^2 & 0.34 \end{array} \end{align*}

These results say that, for instance, a 1\% increase in the 3 year forward rate predicts an increase in the average excess return over maturities from 2 to 5 years by 3\%, while a 1\% increase in the 5 year forward rate predicts a 2\% decrease in the same measure. The figure below plots the coefficients above along with a 95\% confidence interval.

This figure reports the estimates of gamma from the regression depicted above of average excess returns on bonds of maturities 2 through 5 on forward rates at maturities 1 to 5 as well as a constant corresponding to Panel A of Figure 1 in Cochrane and Piazzesi (2005).

I then use this regression estimate to compute the Cochrane and Piazzesi (2005) factor as the predicted value of the average excess returns:

(3)   \begin{align*} \textit{CP}_t &= \Gamma' F_t \end{align*}

Don't click on me!!!

This figure plots the level and first difference of the Cochrane and Piazzesi (2005) bond risk premium measure using monthly observations of annual returns from January 1964-December 2003 as well as the analogous bond risk premium measure implied in Gabaix (2011).

I also compute the analogous measure of the bond risk premium implied by Gabaix (2011):

(4)   \begin{align*} \textit{CP}_t^G &= \frac{-y_t(1) + 2 \cdot f_t(3) - f_t(5)}{4} \end{align*}

Below I report the summary statistics of the both bond risk premium measures. This table reads that the average excess return on bonds at maturities 2–5 years is 0.84\% per year in this monthly sample from January 1964 to December 2003 with a standard deviation of 2.4\%.

    \begin{align*} \begin{array}{l|cc} & \mathbb{E}[ \cdot ] & \mathbb{S}[ \cdot ] \\ \hline \hline \textit{CP}_t & 0.0084  & 0.024 \\ \textit{CP}_t^G & 0.0014 & 0.0027 \end{array} \end{align*}

3. Stock Risk Premium

In this section, I compute the logarithm of the price to dividend ratio on the S&P 500 as a proxy for the risk premium in the US stock market. I use data from Robert Shiller’s website which reports the real price level, real dividends and real earnings on the S&P 500 on a monthly frequency dating back to the 1800’s. Using this data, I compute the annual cum dividend excess return on the S&P 500 as:

(5)   \begin{align*} r_{t+12}^e &= \frac{P_{t+12} + D_{t+12}}{P_t} - 1 \end{align*}

This data delivers the summary statistics below for the monthly sample running from January 1964 to December 2003 corresponding to the Cochrane and Piazzesi (2005) interal. The excess returns are annualized while the log price to dividend and log price to earnings ratios, p_t - d_t and p_t - y_t respectively, are on a month by month basis. This table reads that the average excess return on the stock market over the next year was 6.6\% per year with a volatility of 16\%, while in any given month the average of the logarithm of the price to dividend ratio is 3.496 with an average month to month increase of 0.0013.

    \begin{align*} \begin{array}{l|cc} & \mathbb{E}[ \cdot ] & \mathbb{S}[ \cdot ] \\ \hline \hline r_{t+12}^e & 0.066  & 0.16 \\ p_t - d_t & 3.496 & 0.413 \\ p_t - y_t & 2.759 & 0.423 \\ \Delta (p_t - d_t) & 0.0013 & 0.0359 \\ \Delta (p_t - y_t) & 0.00039 & 0.041 \end{array} \end{align*}

Below I plot the annualized excess return on the S&P 500 along with the log price to dividend and log price to earnings ratios in both levels and changes.

This figure shows monthly observations of the annual excess returns on the S&P 500 along with the log price to dividend and price to earnings ratio from January 1964 to December 2003 using data from Shiller (2006).

This figure shows monthly observations of the annual excess returns on the S&P 500 along with the first difference of the log price to dividend and price to earnings ratio from February 1964 to December 2003 using data from Shiller (2006).

I regress the excess return on the S&P 500 over the next year on the current log price to dividend ratio using monthly observations over the period from January 1964 to December 2003. I report these regression results below. The standard errors have not been corrected for overlapping samples. The point estimates indicate that a 1\% increase in the price to dividend ratio in the current month is associated with a 0.06\% or 6\textit{b.p.} decrease in the returns on the S&P 500 over the next year.

    \begin{align*} \begin{array}{l|cc} & \textit{est} & \textit{s.e.} \\ \hline \hline \alpha_{p_t - d_t} & 0.24 & 0.049 \\ \beta_{p_t - d_t} & -0.063 & 0.018 \\ \hline R^2 & 0.026 \end{array} \end{align*}

I also regress the level and the change in the log price to dividend ratio on the level and the change in the annual real interest rate using monthly observations over the period from January 1964 to December 2003 and report the regression coefficients below. The standard errors have not been corrected for overlapping samples. The annual real interest rate each month is the difference between the Fama-Bliss riskless rate over that year and the log change in inflation using the data from the CPI located here.

    \begin{align*} \begin{array}{l|cc} \mathtt{LHS:} \ p_t - d_t & \textit{est} & \textit{s.e.} \\ \hline \hline \alpha & 3.60 & 0.024 \\ \beta & -2.52 & 0.78 \\ \hline R^2 & 0.42 \end{array} \end{align*}

    \begin{align*} \begin{array}{l|cc} \mathtt{LHS:} \ \Delta_1 (p_t - d_t) & \textit{est} & \textit{s.e.} \\ \hline \hline \alpha_{\Delta_1 r_{t+12}^f} & 0.00065 & 0.0015 \\ \beta_{\Delta_1 r_{t+12}^f} & -1.70 & 0.26 \\ \hline R^2 & 0.034 \end{array} \end{align*}

    \begin{align*} \begin{array}{l|cc} \mathtt{LHS:} \ \Delta_{12} (p_t - d_t) & \textit{est} & \textit{s.e.} \\ \hline \hline \alpha_{\Delta_{12} r_{t+12}^f} & 0.012 & 0.02 \\ \beta_{\Delta_{12} r_{t+12}^f} & -2.08 & 1.07 \\ \hline R^2 & 0.13 \end{array} \end{align*}

The raw correlations here are given in the tables below where the \tilde{r}^f denotes the real riskless rate over the next year computed using an AR(2) implied inflation rate.

    \begin{align*} \begin{array}{l||cccc} & p_t - d_t & p_t - y_t &  r_{t+12}^f &  \tilde{r}_{t+12}^f \\ \hline \hline p_t - d_t & 1 & 0.88 & -0.14 & -0.14 \\ p_t - y_t & & 1 & -0.15 & -0.15 \\ r_{t+12}^f & &  & 1 & 0.995 \\ \tilde{r}_{t+12}^f & & & & 1 \end{array} \end{align*}

    \begin{align*} \begin{array}{l||cccc} & \Delta (p_t - d_t) & \Delta (p_t - y_t) &  \Delta r_{t+12}^f  & \Delta \tilde{r}_{t+12}^f \\ \hline \hline \Delta (p_t - d_t) & 1 & 0.88 & -0.28 & -0.25 \\ \Delta (p_t - y_t) & & 1 & -0.29 & -0.26 \\ \Delta r_{t+12}^f & &  & 1 & 0.820 \\ \Delta \tilde{r}_{t+12}^f  & & & & 1 \end{array} \end{align*}

We can see from the plots below that this negative correlation between the log price to dividend ratio and the real riskless rate comes entirely from a spike during the 1970’s.

Level and month over month changes in the log price to dividend ratio and annual real riskless rate.

4. Co-Movement

Finally, to get a sense of how much the risk premia in the bond and stock markets co-move, I report the correlation between the Cochrane and Piazzesi (2005) bond risk factor, the Gabaix (2011) implied bond risk premium analogue, the log price to dividend and priec to earnings ratios and the excess returns on the S&P 500 in both levels and changes below:

    \begin{align*} \begin{array}{l||cccccc} & \textit{CP}_t & \textit{CP}_t^G & p_t - d_t & p_t - y_t & r_{t+12}^e & r_{t+12}^f \\ \hline \hline \textit{CP}_t & 1 & 0.885 &  - 0.155 & - 0.033 & 0.311 & 0.39 \\ \textit{CP}_t^G &  &  1 &  0.135 & 0.286 & 0.246 & 0.19 \\ p_t - d_t & & & 1 & 0.908 & -0.169 & -0.14 \\ p_t - y_t & & & & 1 & -0.163 & -0.15 \\ r_{t+12}^e & & & & & 1 & 0.44 \\ r_{t+12}^f & & & & & & 1 \end{array} \end{align*}

    \begin{align*} \begin{array}{l||ccccc} & \Delta \textit{CP}_t & \Delta \textit{CP}_t^G & \Delta (p_t - d_t)  & r_{t+12}^e & r_{t+12}^f \\ \hline \hline \Delta \textit{CP}_t & 1 & 0.930 & 0.018 & -0.029 & - 0.00035 \\ \Delta \textit{CP}_t^G & & 1 & 0.011 & -0.0121 & -0.00053 \\ \Delta (p_t - d_t) & & & 1  & 0.081 & 0.044 \\ r_{t+12}^e & &  & & 1 & 0.44 \\ r_{t+12}^f & & & & & 1 \end{array} \end{align*}

Filed Under: Uncategorized

« Previous Page
Next Page »

Pages

  • Publications
  • Working Papers
  • Curriculum Vitae
  • Notebook
  • Courses

Copyright © 2026 · eleven40 Pro Theme on Genesis Framework · WordPress · Log in

Loading Comments...

You must be logged in to post a comment.