Research Notebook

Digesting the Hansen and Scheinkman Multiplicative Decomposition of the SDF

July 12, 2011 by Alex

Introduction [1]

I give some intuition behind the multiplicative decomposition of the stochastic discount factor M_{t \to t+h} introduced in Hansen and Scheinkman (2009). The economics underlying the original Hansen and Scheinkman (2009) results was not clear to me during my initial readings. This post collects my efforts to interpret these mathematical ideas in a sensible way.

Below I formally state the decomposition.

Theorem (Hansen and Scheinkman Decomposition): Suppose that \phi_M is a principal eigenfunction with eigenvalue \lambda_M for the extended generator of the stochastic discount factor M. Then this multiplicative functional can be decomposed as:

    \begin{align*} M_{t \to t+h} \ &= \ e^{\lambda_M \cdot h} \cdot \left( \frac{\phi_M(X_t)}{\phi_M(X_{t+h})} \right) \cdot \hat{M}_{t \to t+h} \end{align*}

where \hat{M}_{t \to t+h} is a local martingale.

 

The stochastic discount factor M_{t \to t+h} dictates how to discount cashflows occurring h periods in the future in state X_{t+h}. Roughly speaking, Hansen and Scheinkman (2009) factors M_{t \to t+h} into 3 different pieces: a state independent component e^{\lambda_M \cdot h}, an investment horizon independent component \phi_M(X_t)/\phi_M(X_{t+h}), and a white noise component \hat{M}_{t \to t+h}.

Thus, you should think about \lambda_M as a generalized time preference parameter. \lambda_M will generally be negative, so e^{\lambda_M \cdot h} is the continuous time representation of the state independent discount rate dictated by an asset pricing model. The ratio \phi_M(X_t) / \phi_M(X_{t+h}) captures the rate at which I discount payments at time t+h given the state today at time t and the state at time t+h. This ratio is independent of h: if X_{t+h} = X_{t+h'} for two horizons h and h', then we have:

    \begin{align*} \frac{\phi_M(X_t)}{\phi_M(X_{t+h})} \ &= \ \frac{\phi_M(X_t)}{\phi_M(X_{t+h'})} \end{align*}

Finally, \hat{M}_{t \to t+h} represents a random noise component with \mathbb{E}\hat{M}_{t \to t+h} = 1 and independent increments.

 

Motivation

The Hansen and Scheinkman decomposition generalizes the binomial options pricing framework for use in standard asset pricing applications by allowing for more complicated state space features like jumps and time averaging. [2] The main advantages of casting the stochastic discount factor as a multiplicative functional are a) the use of the binomial pricing intuition to understand more complicated asset pricing models and b) the streamlining of the econometrics needed to compare excess returns at different horizons. [3]

To illustrate the basic intuition behind this analogy, I work through the Black, Derman and Toy (1990) model.

Example (Binomial Model): Consider a discrete time, binomial world with states X_t \in \{d,u\}, \ \forall t \geq 0 in which traders have an independent probability \pi(x) of entering state x in the next period regardless of the current state. In this world, the price P_{t \to t+1} at time t of a risk free bond that pays out $1 at time t+1 is given by the expression:

    \begin{align*} P_{t \to t+1} \ &= \ \frac{\pi(u) \cdot 1 + \pi(d) \cdot 1}{1 + r^f_{t+1}} \end{align*}

This 1 step ahead pricing rule applies at each and every starting date t. All pricing computations at longer horizons are built up from this local relationship based on the prevailing short rate r_{t+1}^f.

To solve the model, I need to assume that the short rate r_{t+1}^f process has independent log-normal increments. I could then use the volatility of this process to pin down the values of the short rate for the entire binomial tree.
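To get a feel for how the 1-step pricing rule builds up longer-horizon prices, here is a minimal sketch of a 2-period binomial bond pricing exercise with log-normal short-rate moves. The parameter values (r0, sigma, pi_u) are hypothetical and chosen purely for illustration.

```python
import numpy as np

# Hypothetical parameters for illustration (not calibrated to anything)
r0 = 0.05      # current short rate
sigma = 0.20   # volatility of log short-rate increments
pi_u = 0.5     # probability of the up state

# One period ahead, the short rate moves log-normally
r_u = r0 * np.exp(sigma)
r_d = r0 * np.exp(-sigma)

# Price a 2-period zero-coupon bond by backward induction,
# applying the 1-step pricing rule at every node of the tree
P1_u = 1.0 / (1.0 + r_u)   # time-1 bond value in the up state
P1_d = 1.0 / (1.0 + r_d)   # time-1 bond value in the down state
P0 = (pi_u * P1_u + (1 - pi_u) * P1_d) / (1.0 + r0)
```

The same backward induction, repeated node by node, extends to trees of any depth; every long-horizon price is a nested application of the local 1-step rule.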

 

In general, models of this sort are easy to solve analytically if the short rate process has log-normal increments. The recent papers Lettau and Wachter (2007), Van Binsbergen, Brandt and Koijen (2010) and Backus, Chernov and Zin (2011) adopt similar approaches and try to extend these insights to equity markets.

Nevertheless, most asset pricing models are not log-normal and do not admit pen and paper analysis of their term structure using existing methods. Thus, in order to use cross-horizon predictions to discriminate between alternative models, we must adopt new mathematical tools.

Example (Binomial Model, Ctd…): We use operator methods to factor the discount factor process M_{t \to t+h}, which deflates payments in state X_{t+h} at time horizon t+h back to time t, into 3 pieces: e^{\lambda_M \cdot h}, \tilde{\phi}_M(X_{t+h},X_t) and \hat{M}_{t \to t+h}. The first factor depends only on the investment horizon h, the second factor depends only on the realized states, and the third factor is noise, so that M_{t \to t+h} = e^{\lambda_M \cdot h} \cdot \tilde{\phi}_M(X_{t+h},X_t) \cdot \hat{M}_{t \to t+h}.

By analogy with the Black, Derman and Toy (1990) model, in a binomial world we can use this decomposition to rewrite the h=1 Euler equation as follows, where the dependence on X_t is left implicit:

    \begin{align*} 1 \ &= \ \mathbb{E}_t \left[ \ M_{t \to t+1} \cdot R_{t \to t+1} \ \right] \\ &= \ \frac{\pi(u) \cdot \tilde{\phi}_M(u) \cdot \varepsilon(u) \cdot R(u) + \pi(d) \cdot \tilde{\phi}_M(d) \cdot \varepsilon(d) \cdot R(d)}{1 - \lambda_M} \end{align*}

 

Thus, in the Hansen and Scheinkman (2009) decomposition, - \lambda_M serves as a synthetic risk free rate and the \pi(x) \cdot \tilde{\phi}_M(x) serve as the twisted martingale measure.

In my work with Anmol Bhandari [4] we look at a class of models for which \ln \tilde{\phi}_M(x) is affine [5] and show how to use this decomposition to compute a cross-horizon analogue of the Hansen and Jagannathan (1991) volatility bound. This new bound can be used to discriminate between different models which make identical predictions at a particular horizon. This exponentially affine structure is useful as it permits closed form solutions for the moments of M_{t \to t+h}:

    \begin{align*} \mathbb{E}_t[M_{t \to t+h}] \ &\approx \ e^{\lambda_M \cdot h} \cdot \mathbb{E}_t \left[ \frac{\phi_M(X_t)}{\phi_M(X_{t+h})} \right] \cdot 1 \\ \mathbb{E}_t[M_{t \to t+h}^2] \ &\approx \ e^{\lambda_{M^2} \cdot h} \cdot \mathbb{E}_t \left[ \frac{\phi_{M^2}(X_t)}{\phi_{M^2}(X_{t+h})} \right] \cdot 1 \end{align*}

In the next 2 sections, I walk through the economics governing the \lambda_M and \phi_M terms.

 

Time Preference

Where does \lambda_M come from? In the original article, the authors refer to \lambda_M as the principal eigenvalue of the extended generator of M; however, \lambda_M has a well defined meaning without ever subscribing to Perron-Frobenius theory. \lambda_M is a generalization of the time preference parameter dictated by an asset pricing model.

Consider the following thought experiment, which casts the \lambda_M term as the time preference parameter plus an extra Jensen's inequality term.

Example (Generalized Time Preference): Suppose that an agent has preferences over a stream of consumption C_1, C_2, C_3, \ldots and that in each period t, C_t = 100 with probability 0.95, while the remaining 5\% of the time C_t = 50 or C_t = 150 with equal probability. While \mathbb{E}_t[C_{t+1}] = 100, the certainty-equivalent utility satisfies \mathbb{E}_t^{c.e.}[C_{t+1}] < 100^{1-\gamma}, the utility of receiving the mean payout for sure.

In fact, with probability 0.05 the agent faces a lottery whose utility value is:

    \begin{align*} \mathbb{E}_t^{c.e.}[C_{t+1} \mid C_{t+1} \neq 100 ] \ &= \ \frac{50^{1-\gamma}}{2} + \frac{150^{1-\gamma}}{2} \end{align*}

Let’s call this certainty equivalent gap \delta:

    \begin{align*} \delta \ &= \ \mathbb{E}_t^{c.e.}[C_{t+1} \mid C_{t+1} \neq 100 ] \ - \ 100^{1-\gamma} \end{align*}

\lambda_M should then include both the time preference parameter, \rho, and the expected Jensen’s inequality loss:

    \begin{align*} \lambda_M \ &= \ \rho \ + \ 0.05 \cdot \delta \end{align*}
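The certainty-equivalent gap in the example above can be computed directly. The sketch below uses the standard CRRA normalization u(c) = c^{1-\gamma}/(1-\gamma), with \gamma = 2 as a hypothetical risk aversion value; it confirms that the 50/50 lottery over 50 and 150 is worth less in utility terms than a sure payout of 100, even though both have mean 100.

```python
# A minimal check of the Jensen's inequality loss in the consumption lottery
# above, using the standard CRRA normalization u(c) = c^(1-gamma)/(1-gamma).
# gamma = 2 is a hypothetical choice made purely for illustration.
gamma = 2.0

def u(c):
    return c ** (1.0 - gamma) / (1.0 - gamma)

# Conditional on not receiving 100, the agent faces a 50/50 lottery over 50 and 150
lottery_utility = 0.5 * u(50.0) + 0.5 * u(150.0)

# The lottery's expected consumption is 100, but its utility value is lower,
# so the certainty-equivalent gap is negative
delta = lottery_utility - u(100.0)
```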

 

Thus, in a more general framework, we should expect \lambda_M to have roughly the following form:

    \begin{align*} \lambda_M \ &= \ \rho \ + \ f(\sigma_M^2, \sigma_X^2, \sigma_{M \times X}) \end{align*}

where f is an affine function. Heuristically, the \sigma_X component will capture how volatile the state space is while the \sigma_M component will capture how badly I need to discount this consumption stream due to Jensen’s inequality.

 

State Dependence

Next, in order to capture the dependence of the discount factor M_{t \to t+h} on the current and future state (X_t,X_{t+h}), Hansen and Scheinkman (2009) move to continuous time and apply the Perron-Frobenius theorem to the infinitesimal generator of the discount factor. When applied to transition probability matrices, Perron-Frobenius theory implies that the largest eigen-pair dominates the behavior of a stochastic process as h \to \infty. Hansen and Scheinkman use this h \to \infty limiting result to argue that the ratio \phi_M(X_t)/\phi_M(X_{t+h}), where \phi_M is the principal eigenfunction of the generator of the discount factor M, is a good choice for the state dependent component of M_{t \to t+h}.

It is important to note that Perron-Frobenius theory is only a modeling tool in the Hansen and Scheinkman (2009) construction, not a critical feature of their results. There may well be other reasonable choices for the state dependent component of M_{t \to t+h}. In its simplest form [6], the result can be written as:

Theorem (Perron-Frobenius): The largest eigenvalue \lambda of a positive square matrix A is simple and positive, and its associated eigenvector \phi can be chosen strictly positive. All other eigenvalues are smaller in absolute value. [7]
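The theorem is easy to verify numerically. The sketch below extracts the Perron root and its strictly positive eigenvector from a positive 2 \times 2 matrix; the matrix entries are made up purely for illustration.

```python
import numpy as np

# A hypothetical positive 2x2 matrix standing in for a one-period pricing operator
A = np.array([[0.50, 0.45],
              [0.40, 0.55]])

vals, vecs = np.linalg.eig(A)
i = np.argmax(vals.real)     # the Perron root is real and largest in modulus
lam = vals.real[i]
phi = vecs[:, i].real
phi = phi / phi.sum()        # rescale so the eigenvector is strictly positive
```

For this matrix the eigenvalues are 0.95 and 0.10, so the Perron root is 0.95 with eigenvector proportional to (1, 1); the rescaling step fixes the arbitrary sign that a numerical eigensolver may return.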

 

In order to use this theorem, I need to have a positive square matrix to operate on. While strictly positive, M_{t \to t+h} is not a square matrix; however, its infinitesimal generator is. Heuristically, you can think about the infinitesimal generator as encoding the transition probability matrix under the equivalent martingale measure deflated by the time preference parameter.

Definition (Infinitesimal Generator): The infinitesimal generator \mathbb{A} of an Ito diffusion \{ X_t \} in \mathbb{R}^n is defined by:

    \begin{align*} \mathbb{A}[ f(x)] \ &= \ \lim_{h \searrow 0} \ \frac{\mathbb{E}_x[ f(X_h) ] - f(x)}{h}, \end{align*}

where \mathbb{E}_x denotes the expectation conditional on X_0 = x, and the set of functions f: \mathbb{R}^n \mapsto \mathbb{R} such that the limit exists at x is denoted by \mathcal{D}_A(x).

 

In words, the infinitesimal generator of the discount factor M_{t \to t+h} captures the rate at which my valuation of a $1 payment in, say, the up state u changes as I push the payment slightly further out into the future. To get a feel for what the infinitesimal generator captures, consider the following short example using a 2 state Markov chain. First, I define the physical transition intensity matrix for the Markov process X_t.

Example (Markov Process w/ 2 States): Consider a 2 state Markov chain with states X_t \in \{u,d\}. The physical evolution of the stochastic process X_t is governed by a 2 \times 2 intensity matrix \mathbb{T}, which encodes all of the transition probabilities. The matrix e^{h \cdot \mathbb{T}} is the matrix of transition probabilities over a horizon h. Since each row of the transition probability matrix e^{h \cdot \mathbb{T}} must sum to 1, each row of the transition intensity matrix \mathbb{T} must sum to 0.

    \begin{align*} \mathbb{T} \ &= \ \begin{bmatrix} \tau(u \mid u) & \tau(d \mid u) \\ \tau(u \mid d) & \tau(d \mid d) \end{bmatrix} \end{align*}

The diagonal entries are nonpositive and represent minus the intensity of jumping from the current state to a new one. The remaining row entries, appropriately scaled, represent the conditional probabilities of jumping to the respective states. For concreteness, the following parameter values would suffice:

    \begin{align*} \mathbb{T} \ &= \ \begin{bmatrix} -0.10 & 0.10 \\ 0.05 & -0.05 \end{bmatrix} \end{align*}
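A quick numerical sketch of the link between the intensity matrix and the transition probabilities, using the parameter values above. The matrix exponential is computed with a truncated Taylor series, which is adequate for a small, well-scaled matrix, and the generator is recovered as the short-horizon limit (e^{h \cdot \mathbb{T}} - I)/h, mirroring the definition of the infinitesimal generator.

```python
import numpy as np

# The 2-state intensity matrix from the example above
T = np.array([[-0.10,  0.10],
              [ 0.05, -0.05]])

def expm(A, n_terms=60):
    """Matrix exponential via a truncated Taylor series (fine for small matrices)."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, n_terms):
        term = term @ A / k
        out = out + term
    return out

P = expm(1.0 * T)          # transition probabilities over horizon h = 1

# Each row of exp(h*T) sums to 1 because each row of T sums to 0
row_sums = P.sum(axis=1)

# Recover the generator as the short-horizon limit (exp(h*T) - I) / h
h = 1e-6
T_approx = (expm(h * T) - np.eye(2)) / h
```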

 

Next, I want to show how to modify this transition intensity matrix \mathbb{T} to describe the local evolution of the discount factor process M_t. To do this, I first need to have an asset pricing model in mind, and I use a standard CRRA power utility model with risk aversion parameter \gamma as in Breeden (1979) where X_t is the log of the expected consumption growth.

Example (Markov Process w/ 2 States, Ctd…): Intuitively, I know that every period I push the payment out into the future, I will end up discounting the payment by an additional e^{\lambda_M}. However, I know that I will also have to twist \mathbb{T} from the physical measure over to the risk neutral measure. Thus, the resulting generator will look something like:

    \begin{align*} \mathbb{A} \ &= \ \begin{bmatrix} \tau(u \mid u) \cdot \tilde{\phi}_M(u \mid u) & \tau(d \mid u) \cdot \tilde{\phi}_M(d \mid u) \\ \tau(u \mid d) \cdot \tilde{\phi}_M(u \mid d) & \tau(d \mid d) \cdot \tilde{\phi}_M(d \mid d) \end{bmatrix} \ - \ \lambda_M \cdot \mathbb{I} \end{align*}

If we assume that \tilde{\phi}_M(s \mid s) = 1, so that there is no distortion when the chain stays in its current state, then the entries of \mathbb{A} are:

    \begin{align*} \alpha(s' \mid s) \ &= \ \begin{cases} \tau(s' \mid s) - \lambda_M &\text{ if } s' = s \\ \tau(s' \mid s) \cdot \tilde{\phi}_M(s' \mid s) &\text{ if } s' \neq s \end{cases} \end{align*}

Note that the rows of \mathbb{A} will in general not sum to 0 as in the physical transition intensity matrix \mathbb{T}.
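To see the claim about the row sums, here is a minimal sketch that twists the intensity matrix above entry by entry and discounts along the diagonal. The \tilde{\phi}_M values and \lambda_M below are made-up placeholders for illustration, not calibrated quantities.

```python
import numpy as np

# Physical intensity matrix from the example above
T = np.array([[-0.10,  0.10],
              [ 0.05, -0.05]])

# Hypothetical twisting terms and eigenvalue, chosen purely for illustration
phi_tilde = np.array([[1.0, 1.2],
                      [0.9, 1.0]])
lam_M = -0.03

# Twist the intensities entry by entry, then discount along the diagonal
A = T * phi_tilde - lam_M * np.eye(2)

# Unlike the rows of T, the rows of A generally do not sum to 0
row_sums = A.sum(axis=1)
```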

 

An Example

I conclude by working through an extended example showing how to solve for each of the terms in a simple model. Think of this as a Vasicek (1977) interest rate model. Let X_t be a risk factor that follows the scalar Ito diffusion below. I choose this model so that I can verify all of my solutions by hand using existing techniques.

    \begin{align*} dX_t \ &= \ \beta_X(X_t) \cdot dt \ + \ \sigma_X(X_t) \cdot dB_t \\ \beta_X(x) \ &= \ \bar{\beta}_X \ - \ \beta_X \cdot x \\ \sigma_X(x) \ &= \ \sigma_X \end{align*}

Let M_t = \exp\{A_t\}, where A_t solves the following Ito diffusion.

    \begin{align*} dA_t \ &= \ \beta_A(X_t) \cdot dt \ + \ \sigma_A(X_t) \cdot dB_t \\ \beta_A(x) \ &= \ \bar{\beta}_A \ - \ \beta_A \cdot x \\ \sigma_A(x) \ &= \ \sigma_A \end{align*}

Thus (X_t,M_t) are described by parameter vector \Theta:

    \begin{align*} \Theta \ &= \ \begin{bmatrix} \beta_X & \beta_A & \bar{\beta}_X & \bar{\beta}_A & \sigma_X & \sigma_A \end{bmatrix} \end{align*}
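The mean-reverting dynamics of X_t can be checked with a quick Euler-Maruyama simulation; the parameter values below are hypothetical, chosen only so that \beta_X > 0 and the process mean-reverts toward its stationary mean \bar{\beta}_X / \beta_X.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values chosen purely for illustration (beta_X > 0)
beta_X_bar, beta_X, sigma_X = 0.02, 0.50, 0.10

# Euler-Maruyama simulation of dX = (beta_X_bar - beta_X * X) dt + sigma_X dB
dt, n_steps, n_paths = 0.01, 2000, 10000
X = np.zeros(n_paths)
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X = X + (beta_X_bar - beta_X * X) * dt + sigma_X * dB

# After T = 20 the cross-sectional mean is near the stationary mean
stationary_mean = beta_X_bar / beta_X   # = 0.04
```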

We need to restrict \Theta to ensure stationarity (\beta_X > 0). Guessing an exponential-affine eigenfunction \phi_M(x) = e^{\kappa_M \cdot x} and matching coefficients so that \lambda_M does not move with x yields the following characterization of \kappa_M.

    \begin{align*} \kappa_M \ &= \ - \ \frac{\beta_A}{\beta_X} \end{align*}

Substituting \kappa_M back into the formula for \lambda_M yields:

    \begin{align*} \begin{split} \lambda_M \ &= \ \left( \ \bar{\beta}_A \ + \ \frac{\sigma_A^2}{2} \ \right) \\ &\qquad \qquad + \ \left( \ \bar{\beta}_X \ + \ \sigma_A \cdot \sigma_X \ \right) \cdot \kappa_M \\ &\qquad \qquad \qquad  + \ \left( \ \frac{\sigma_X^2}{2} \ \right) \cdot \kappa_M^2 \end{split} \end{align*}

Since M_t^2 = \exp\{2 \cdot A_t\}, repeating the same steps for M^2 yields:

    \begin{align*} \kappa_{M^2} \ &= \ - \ \frac{2 \cdot \beta_A}{\beta_X} \\ \lambda_{M^2} \ &= \ 2 \cdot \lambda_M \ + \ \sigma_A^2 \ + \ \left( \ \sigma_A \cdot \sigma_X \ \right) \cdot \kappa_{M^2} \ + \ \left( \ \frac{\sigma_X^2}{4} \ \right) \cdot \kappa_{M^2}^2 \end{align*}
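As a sanity check on the closed forms above, the sketch below computes (\kappa_M, \lambda_M) and (\kappa_{M^2}, \lambda_{M^2}) from a hypothetical parameter vector \Theta using the eigenvalue equation for the exponential-affine guess \phi_M(x) = e^{\kappa_M \cdot x}, and verifies the identity relating \lambda_{M^2} to \lambda_M. The parameter values are made up purely for illustration.

```python
# The eigenvalue equation for the exponential-affine guess reduces to
#   lambda = beta_A_bar + kappa * beta_X_bar + 0.5 * (sigma_A + kappa * sigma_X)**2
# with kappa chosen so that the x-terms cancel: kappa = -beta_A / beta_X.

def kappa_lambda(beta_A, beta_A_bar, beta_X, beta_X_bar, sigma_A, sigma_X):
    kappa = -beta_A / beta_X
    lam = beta_A_bar + kappa * beta_X_bar + 0.5 * (sigma_A + kappa * sigma_X) ** 2
    return kappa, lam

# Hypothetical parameter vector Theta
beta_X, beta_A = 0.50, 0.10
beta_X_bar, beta_A_bar = 0.02, -0.03
sigma_X, sigma_A = 0.10, 0.20

kappa_M, lam_M = kappa_lambda(beta_A, beta_A_bar, beta_X, beta_X_bar,
                              sigma_A, sigma_X)

# M^2 = exp(2*A), so double the A-coefficients and recompute
kappa_M2, lam_M2 = kappa_lambda(2 * beta_A, 2 * beta_A_bar, beta_X, beta_X_bar,
                                2 * sigma_A, sigma_X)

# Check the identity:
# lam_M2 = 2*lam_M + sigma_A^2 + sigma_A*sigma_X*kappa_M2 + (sigma_X^2/4)*kappa_M2^2
check = (2 * lam_M + sigma_A ** 2 + sigma_A * sigma_X * kappa_M2
         + (sigma_X ** 2 / 4.0) * kappa_M2 ** 2)
```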

Exercise (Offsetting Shocks): If \rho is the standard time preference parameter, when would \lambda_M = \rho?

Exercise (Stochastic Volatility): Add a Feller square-root term to allow for stochastic volatility à la the Cox, Ingersoll and Ross (1985) interest rate model.

    \begin{align*} dX_t \ &= \ \beta_X(X_t) \cdot dt \ + \ \sigma_X(X_t) \cdot dB_t \\ \beta_X(x) \ &= \ \bar{\beta}_X \ - \ \beta_X \cdot x \\ \sigma_X(x) \ &= \ \sigma_X \cdot \sqrt{x} \end{align*}

    \begin{align*} dA_t \ &= \ \beta_A(X_t) \cdot dt \ + \ \sigma_A(X_t) \cdot dB_t \\ \beta_A(x) \ &= \ \bar{\beta}_A \ - \ \beta_A \cdot x \\ \sigma_A(x) \ &= \ \sigma_A \cdot \sqrt{x} \end{align*}

What are \kappa_M and \lambda_M?

  1. Note: The results in this post stem from joint work I am conducting with Anmol Bhandari for our paper “Model Selection Using the Term Structure of Risk”. In this paper, we characterize the maximum Sharpe ratio allowed by an asset pricing model at each and every investment horizon. Using this cross-horizon bound, we develop a macro-finance model identification toolkit.
  2. E.g., think of the state space needed in the Campbell and Cochrane (1999) habit model.
  3. Investment horizon symmetry is an unexplored prediction of many asset pricing theories. Asset pricing models characterize how much a trader needs to be compensated in order to hold 1 unit of risk for 1 unit of time. The standard approach to testing these models is to fix the unit of time and then look for incorrectly priced packets of risk. For example, Roll (1981) looked at the spread in 1 month holding period returns on 10 portfolios of NYSE firms sorted by market cap and found that small firms earned abnormal excess returns relative to the CAPM. Yet, I could just as easily ask: Given a model, how much more does a trader need to be compensated to hold the same 1 unit of risk for an extra 1 unit of time? This inversion is well defined because asset pricing models possess investment horizon symmetry. Models hold at each and every investment horizon running from 1 second to 1 year to 1 century and everywhere in between. To illustrate this point via an absurd case, John Cochrane writes in his textbook (Asset Pricing (2005), Section 9.3) that according to the consumption CAPM ‘…if stocks go up between 12:00 and 1:00, it must be because (on average) we all decided to have a big lunch.’
  4. See Model Selection Using the Term Structure of Risk.
  5. This class of models allows for features such as rare disasters, recursive preferences and habit formation, among others.
  6. Really, this is just the Oskar Perron version of the theorem.
  7. For an introduction to Perron-Frobenius theory, see MacCluer (2000).
