
Pearson-Wong Diffusions

July 30, 2011 by Alex

1. Introduction

I introduce the concept of Pearson-Wong diffusions and then show how this mathematical object can be put to use in macro-finance.

Roughly speaking, Pearson-Wong diffusions link properties of stochastic processes to properties of cross-sectional distributions in the resulting population. For example, suppose you have in mind a stochastic process that governs the total sales of each firm in the US. If this stochastic process is a Pearson-Wong diffusion you would also know what the steady state cross-sectional distribution of firm sales would be. Conversely, if you observed a particular cross-sectional distribution of firm sales, then if you assumed all firms had a similar sales growth process and that the economy was in a steady state, you could then back out which Pearson-Wong diffusion was governing sales growth in the economy up to an affine transformation.

First, in Section 2, I define the Pearson (1895) system of distributions. Next, in Section 3, I elaborate on work by Wong (1964) and show that a broad class of diffusion processes with polynomial volatility, called Pearson-Wong diffusions, leads to steady state distributions in the Pearson system. I show that these distributions are uniquely defined by their polynomial coefficients. In Section 4, I show how a broad set of common continuous time processes in macro-finance, such as Ornstein-Uhlenbeck processes and Feller square root processes, sit in this class of Pearson-Wong diffusions. Finally, I conclude in Section 5 by returning to the sales volume example above, taken from Gabaix (2011), and showing that variation in the cross-sectional distribution of firm sales volume implies variation in the functional form of the stochastic process governing each firm’s aggregate sales.

2. The Pearson System of Distributions

In this section I motivate and define the Pearson system of distributions. Karl Pearson developed the Pearson system of distributions as a taxonomy for understanding the skewed distributions he was finding in the biological data he was studying. For instance, Pearson had access to data on dimensions of crabs caught off the coast of Naples as illustrated in the figure below¹. When studying the ratio of the length of the crabs to their breadth, he found a distribution that was non-normal, and almost seemed to be a mixture of $2$ normal distributions.

The data give the ratio of "forehead" breadth to body length for 1000 crabs sampled at Naples. Source: R mixdist package (http://goo.gl/CaGkt).

In order to manipulate these data analytically, Pearson then searched out a simple functional form that would capture the main features of these skewed distributions with only a handful of parameters. In particular, he was after a formulation that fit continuous, single peaked distributions over various ranges with varying levels of skewness and kurtosis. Through guess and check, he settled on the definition below²:

Definition (Pearson System): A continuous, univariate, probability distribution $p(x)$ over $x \in (\underline{x}, \overline{x})$ is a member of the Pearson System if it satisfies the differential equation below with constants $r$ , $a$ , $b$ and $c$ :

(1) $\begin{align*} \frac{d}{dx} p(x) &= - \frac{x-r}{a \cdot x^2 + b \cdot x + c} \cdot p(x) \end{align*}$

What are the features of this formulation? First, if $r$ is not a root of the polynomial $a \cdot x^2 + b \cdot x + c$ then $p(x)$ is finite. Next, we see that $r$ characterizes the single peak of the distribution as $dp(x)/dx = 0$ when $x = r$ . What’s more, we know that $p(x)$ has to be single peaked as $p(x) \geq 0$ and $\int_{\underline{x}}^{\overline{x}} p(x) dx = 1$ , so $p(x)$ and $d p(x)/dx$ must tend towards $0$ as $x$ goes to $\pm \infty$ .

Heuristically speaking, we can think about $r$ as parameterizing the peak of the distribution and the quadratic polynomial as characterizing the rate of descent from this peak in either direction as a function of $x$ . Importantly, the solution to this differential equation will depend on the character of the roots of the quadratic polynomial:

(2) $\begin{align*} 0 &= a \cdot x^2 + b \cdot x + c \end{align*}$

In his original 1895 paper, Pearson spent most of his time actually classifying different types of distributions based on the nature of their respective polynomials. Below is a short list of distributions that fall into the Pearson class:

$\begin{align*} \begin{array}{|l|} \text{Distribution} \\ \hline \hline \textit{Normal} \\ \textit{Gamma} \\ \textit{Inverse-Gamma} \\ \textit{Student t} \\ \textit{Beta} \end{array} \end{align*}$

Below, I walk through an example showing how the normal distribution fits into the Pearson system:

Example (Gaussian Distribution): When $a = b = 0$ we get the Gaussian PDF. First, note that given these assumptions, the differential equation above can be written as:

(3) $\begin{align*} \frac{d}{dx} \ln p(x) &= -\frac{x-r}{c} \end{align*}$

Thus, by integrating up we see that the solution has the form:

(4) $\begin{align*} p(x) &= k \cdot e^{-\frac{(x-r)^2}{2 \cdot c}} \end{align*}$

If we choose $k$ such that the probability mass over the real line is $1$ , we get $k = 1/\sqrt{2 \cdot \pi \cdot c}$ , so $p(x)$ is the Gaussian density with mean $r$ and variance $c > 0$ .
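As a quick numerical sanity check on the Gaussian example, the sketch below integrates the log-derivative in Equation (3) on a grid and compares the result with the closed form in Equation (4), normalized so that the density integrates to $1$ (which requires $k = 1/\sqrt{2 \cdot \pi \cdot c}$ ). The function names, the grid and the values $r = 0.5$ and $c = 2$ are illustrative assumptions of mine.

```python
import math

def gaussian_pearson_pdf(x, r, c):
    """Closed-form solution of d/dx ln p(x) = -(x - r)/c, normalized to integrate to 1."""
    k = 1.0 / math.sqrt(2.0 * math.pi * c)
    return k * math.exp(-(x - r) ** 2 / (2.0 * c))

def integrate_log_derivative(r, c, lo, hi, n):
    """Solve the ODE numerically: accumulate ln p on a grid, then normalize."""
    h = (hi - lo) / n
    xs = [lo + i * h for i in range(n + 1)]
    log_p = [0.0]
    for i in range(n):
        # Trapezoid rule applied to the log-derivative -(x - r)/c.
        slope = -0.5 * ((xs[i] - r) + (xs[i + 1] - r)) / c
        log_p.append(log_p[-1] + slope * h)
    shift = max(log_p)  # subtract the max before exponentiating for stability
    unnorm = [math.exp(v - shift) for v in log_p]
    mass = h * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    return xs, [u / mass for u in unnorm]

r, c = 0.5, 2.0  # illustrative values, not from the text
xs, ps = integrate_log_derivative(r, c, lo=r - 12.0, hi=r + 12.0, n=4000)
max_gap = max(abs(p - gaussian_pearson_pdf(x, r, c)) for x, p in zip(xs, ps))
print(max_gap)  # should be very close to 0
```

The same routine works for any member of the Pearson system once the log-derivative is swapped out, provided the grid stays away from the roots of the quadratic.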

3. Main Results

In this section I define the class of Pearson-Wong diffusions and outline the mapping between the coefficients of the stochastic process and the parameters of the cross-sectional distribution. In the analysis below, I consider time homogeneous diffusion processes; i.e., the coefficients of the stochastic process can only depend on time $t$ through the value of $X_t$ :

Definition (Time Homogeneous Diffusion): Let $m(x)$ and $\sigma(x)$ be real valued functions that are Lipschitz on the interval $(\underline{x}, \overline{x})$ with $\sigma(x) > 0$ . Then a diffusion $X_t$ is a time-homogeneous diffusion if there exists a unique solution to the equation:

(5) $\begin{align*} X_t &= x_0 + \int_0^t m(X_s) \cdot ds + \int_0^t \sigma(X_s) \cdot dB_s \end{align*}$

Next, I define the class of Pearson-Wong diffusions:

Definition (Pearson-Wong Diffusion): A Pearson-Wong polynomial diffusion is a stationary, time-homogeneous solution to a stochastic differential equation of the form below, where $\theta > 0$ , $B_t$ is a Brownian motion and the triple of coefficients $(\alpha, \beta, \gamma)$ is such that the square root is well defined when $X_t$ is in the state space $(\underline{x}, \overline{x})$ :

(6) $\begin{align*} dX_t &= \theta \cdot (\mu - X_t) \cdot dt + \sqrt{2 \cdot \theta \cdot (\alpha \cdot X_t^2 + \beta \cdot X_t + \gamma)} \cdot dB_t \end{align*}$

What sorts of processes fit inside this class of diffusions? For one example, consider an Ornstein-Uhlenbeck process, which arises if we set $\alpha = \beta = 0$ , $\gamma = 1$ and $\theta \in (0,1)$ . In this setting, the instantaneous variance is $\sigma^2 = 2 \cdot \theta$ . In the next section, I show how more exotic processes also fit into this box of Pearson-Wong diffusions.
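To get a feel for Equation (6), here is a minimal Euler-Maruyama sketch that simulates one long path and compares its time-series moments with the stationary moments of the Ornstein-Uhlenbeck case, which should be close to mean $\mu$ and variance $\gamma$ . The function name, step size, seed and parameter values are my own illustrative choices, and the Euler scheme is only a first-order approximation.

```python
import math
import random

def simulate_pearson_wong(x0, theta, mu, alpha, beta, gamma, dt, n_steps, seed=0):
    """Euler-Maruyama path of
    dX = theta*(mu - X)*dt + sqrt(2*theta*(alpha*X^2 + beta*X + gamma))*dB."""
    rng = random.Random(seed)
    x = x0
    path = []
    for _ in range(n_steps):
        variance_rate = 2.0 * theta * (alpha * x * x + beta * x + gamma)
        # Guard against small negative values caused by discretization error.
        diffusion = math.sqrt(max(variance_rate, 0.0))
        x += theta * (mu - x) * dt + diffusion * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

# Ornstein-Uhlenbeck special case: alpha = beta = 0, gamma = 1.
path = simulate_pearson_wong(x0=0.0, theta=0.5, mu=0.0, alpha=0.0, beta=0.0,
                             gamma=1.0, dt=0.01, n_steps=200_000)
burned = path[20_000:]  # drop the start so the sample is close to stationary
mean = sum(burned) / len(burned)
var = sum((v - mean) ** 2 for v in burned) / len(burned)
print(mean, var)  # should be near mu = 0 and gamma = 1, up to sampling error
```

Setting `alpha` or `beta` to non-zero values gives the state-dependent volatility cases, although the state space restrictions in the definition then matter for whether the square root stays well defined.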

Now, given this definition, I need to derive a mapping between the values of the polynomial coefficients $\begin{bmatrix} \alpha & \beta & \gamma \end{bmatrix}$ and the form of the resulting cross-sectional distribution $p(x)$ . I do this in $3$ steps. First, I characterize the scale function $U(x)$ and speed density $P(x)$ of the diffusion $X_t$ . Next, I link the infinitesimal generator of the diffusion process $X_t$ to the scale function and speed density of $X$ . Finally, I show that given the mapping between the infinitesimal generator and stochastic processes in the class of Pearson diffusions, if $X_t$ is an ergodic process then this mapping is unique.

Below, I define the scale function for a stochastic process $X_t$ which captures how much the probability of reaching different points $y$ and $w$ in the domain of $X_t$ varies with the starting point $x$ :

Definition (Scale Function): Let $X_t$ be a $1$ dimensional diffusion on the open interval $(\underline{x}, \overline{x})$ . A scale function for $X_t$ is an increasing function $U(x): (\underline{x}, \overline{x}) \mapsto \mathbb{R}$ such that for all $w < x < y$ with $w,y \in (\underline{x}, \overline{x})$ , we have that:

(7) $\begin{align*} \mathtt{Pr}[\tau(y) < \tau(w) \mid X_0 = x] &= \frac{U(x) - U(w)}{U(y) - U(w)} \end{align*}$

where $\tau(w) = \inf\left\{ t > 0 \mid X_t = w \right\}$ .

For instance, if $U(x) = x$ is a scale function for $X_t$ , then we say that $X_t$ is in its natural scale. By construction, $U(X_t)$ is a local martingale, so $U(x)$ satisfies the equation:

(8) $\begin{align*} 0 &= m(x) \cdot U'(x) + \frac{\sigma(x)^2}{2} \cdot U''(x) \end{align*}$

This is a linear first order differential equation in $u(x) = U'(x)$ with variable coefficients. Substituting the drift and volatility from Equation (6) leads to a standard solution:

(9) $\begin{align*} \ln u(x) &= \int_{x_0}^{x} \frac{s - \mu}{\alpha \cdot s^2 + \beta \cdot s + \gamma} \cdot ds \end{align*}$

where $x_0$ is a fixed point such that $\alpha \cdot x_0^2 + \beta \cdot x_0 + \gamma > 0$ . Next, I define a speed measure $P(x)$ , which captures how much time the diffusion spends in each region of the state space:

Definition (Speed Measure): The speed measure $P(x)$ is the measure such that the infinitesimal generator of $X_t$ can be written as:

(10) $\begin{align*} \mathbb{A} f(x) &= \frac{d^2}{dP \cdot dU} f(x) \end{align*}$

where we have that:

(11) $\begin{align*} \frac{d}{dU} f(x) &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{U(x+h) - U(x)} \\ \frac{d}{dP} g(x) &= \lim_{h \to 0} \frac{g(x+h) - g(x)}{P(x,x+h)} \end{align*}$

Once the speed density is normalized to integrate to $1$ , it is in fact the $\mathtt{PDF}$ of the cross-sectional distribution as $t \to \infty$ . This measure has a particularly nice functional form which allows for easy analytical computations in the case of Pearson-Wong diffusions. The lemma below characterizes this formulation:

Lemma (Speed Measure): The speed density of a Pearson-Wong diffusion is given by the formula below where $x_0$ is a fixed point such that $\alpha \cdot x_0^2 + \beta \cdot x_0 + \gamma > 0$ :

(12) $\begin{align*} p(x) &= \frac{1}{u(x) \cdot (\alpha \cdot x^2 + \beta \cdot x + \gamma)} \end{align*}$

with $P'(x) = p(x)$ and $U'(x) = u(x)$ .
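The lemma is easy to evaluate numerically. The sketch below computes the scale density by Simpson's rule and then the (unnormalized) speed density, and checks two cases with known answers: the Ornstein-Uhlenbeck case, where $p(x) \propto e^{-x^2/2}$ , and a square root case with $\alpha = \gamma = 0$ , $\beta = 1$ and $\mu = 2$ , where $p(x) \propto x \cdot e^{-x}$ . The function names, the quadrature rule and the parameter values are illustrative assumptions.

```python
import math

def scale_density(x, mu, alpha, beta, gamma, x0=0.0, n=2000):
    """u(x) = exp( integral from x0 to x of (s - mu)/(alpha*s^2 + beta*s + gamma) ds ).

    The integral uses Simpson's rule with n (even) subintervals; the step is
    negative when x < x0, which gives the correctly signed integral.
    """
    def integrand(s):
        return (s - mu) / (alpha * s * s + beta * s + gamma)
    h = (x - x0) / n
    total = integrand(x0) + integrand(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(x0 + i * h)
    return math.exp(total * h / 3.0)

def speed_density(x, mu, alpha, beta, gamma, x0=0.0):
    """Unnormalized p(x) = 1 / (u(x) * (alpha*x^2 + beta*x + gamma))."""
    q = alpha * x * x + beta * x + gamma
    return 1.0 / (scale_density(x, mu, alpha, beta, gamma, x0) * q)

# Ornstein-Uhlenbeck case: alpha = beta = 0, gamma = 1, mu = 0.
ou_ratio = speed_density(1.5, 0.0, 0.0, 0.0, 1.0) / speed_density(0.0, 0.0, 0.0, 0.0, 1.0)
print(ou_ratio, math.exp(-1.5 ** 2 / 2.0))

# Square root case: alpha = gamma = 0, beta = 1, mu = 2, with x0 = 1 inside (0, infinity).
sqrt_ratio = speed_density(2.0, 2.0, 0.0, 1.0, 0.0, x0=1.0) / speed_density(1.0, 2.0, 0.0, 1.0, 0.0, x0=1.0)
print(sqrt_ratio, 2.0 * math.exp(-1.0))
```

Only ratios are compared because the speed density is defined up to a constant that depends on the choice of $x_0$ .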

The proof of this lemma stems from the definition of the infinitesimal generator:

Proof (Speed Measure): On one hand, from the definition of the speed measure, we have that:

(13) $\begin{align*} \mathbb{A} f(x) &= \frac{d^2}{dP(x) \cdot dU(x)} f(x) \\ &= \frac{1}{P'(x)} \cdot \frac{d}{dx} \left( \frac{1}{U'(x)} \cdot \frac{d}{dx} f(x) \right) \\ &= \frac{1}{u(x) \cdot p(x)} \cdot f''(x) - k(x) \cdot f'(x) \end{align*}$

where $k(x)$ is some well behaved function of $x$ . On the other hand, from the definition of an infinitesimal generator, we have that:

(14) $\begin{align*} \mathbb{A} f(x) &= \frac{1}{2} \cdot \sigma(x)^2 \cdot f''(x) + m(x) \cdot f'(x) \end{align*}$

Thus, we have that $\left[ u(x) \cdot p(x) \right]^{-1} \propto \sigma(x)^2/2$ .

Thus, we have now marched through the first $2$ steps of the construction of the link between stochastic processes in the class of Pearson-Wong diffusions and their corresponding cross-sectional distributions. All I need to do now is flesh out the requirements for uniqueness. In order to attain this property, I need an additional assumption on the class of Pearson-Wong diffusions: ergodicity. Below, I give a formal definition of this additional assumption:

Definition (Ergodic Pearson-Wong Diffusion): If $(\underline{x}, \overline{x})$ is an interval such that $\alpha \cdot x^2 + \beta \cdot x + \gamma > 0$ for all $x \in (\underline{x}, \overline{x})$ for a Pearson-Wong diffusion $X_t$ , then $X_t$ is ergodic if:

(15) $\begin{align*} \infty &= \int_{x_0}^{\overline{x}} u(x) \cdot dx = \int_{\underline{x}}^{x_0} u(x) \cdot dx \\ \infty &> \int_{\underline{x}}^{\overline{x}} p(x) \cdot dx \end{align*}$

If instead $\int_{x_0}^{\overline{x}} u(x) \cdot dx < \infty$ , then the boundary $\overline{x}$ is attracting; i.e., the diffusion converges to it with positive probability. The same holds for $\underline{x}$ when $\int_{\underline{x}}^{x_0} u(x) \cdot dx < \infty$ .

Proposition (Pearson-Wong Mapping): For all ergodic diffusions in the Pearson-Wong class parameterized by the coefficient vector $\begin{bmatrix} \alpha & \beta & \gamma \end{bmatrix}$ , there exists a unique invariant distribution in the Pearson system.

Ergodicity ensures that there are no eddies in the state space where multiple diffusions can get trapped yielding observationally equivalent cross-sectional distributions $p(x)$ for different diffusion processes.

Proof (Pearson-Wong Mapping): From the lemma above, we know that the scale measure has the density:

(16) $\begin{align*} u(x) &= e^{\int_{x_0}^x \frac{s - \mu}{\alpha \cdot s^2 + \beta \cdot s + \gamma} \cdot ds} \end{align*}$

where $x_0 \in (\underline{x}, \overline{x})$ is a point such that $\alpha \cdot x_0^2 + \beta \cdot x_0 + \gamma > 0$ . What’s more, we know that:

(17) $\begin{align*} p(x) &\propto \frac{1}{u(x) \cdot (\alpha \cdot x^2 + \beta \cdot x + \gamma)} \end{align*}$

Differentiating $p(x)$ yields:

(18) $\begin{align*} \frac{d}{dx} p(x) &= - \frac{(2 \cdot \alpha + 1) \cdot x - \mu + \beta}{\alpha \cdot x^2 + \beta \cdot x + \gamma} \cdot p(x) \end{align*}$
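To spell out the differentiation step and the resulting mapping, take logs of the speed density, $\ln p(x) = - \ln u(x) - \ln(\alpha \cdot x^2 + \beta \cdot x + \gamma)$ up to a constant, and use $\frac{d}{dx} \ln u(x) = \frac{x - \mu}{\alpha \cdot x^2 + \beta \cdot x + \gamma}$ :

$\begin{align*} \frac{d}{dx} \ln p(x) &= - \frac{x - \mu}{\alpha \cdot x^2 + \beta \cdot x + \gamma} - \frac{2 \cdot \alpha \cdot x + \beta}{\alpha \cdot x^2 + \beta \cdot x + \gamma} \\ &= - \frac{(2 \cdot \alpha + 1) \cdot x - \mu + \beta}{\alpha \cdot x^2 + \beta \cdot x + \gamma} \end{align*}$

This is a linear function of $x$ over a quadratic, which is the defining form of the Pearson system. Assuming $2 \cdot \alpha + 1 \neq 0$ , dividing the numerator and denominator by $2 \cdot \alpha + 1$ gives the Pearson parameters in terms of the diffusion coefficients:

$\begin{align*} r &= \frac{\mu - \beta}{2 \cdot \alpha + 1}, & \begin{bmatrix} a & b & c \end{bmatrix} &= \frac{1}{2 \cdot \alpha + 1} \cdot \begin{bmatrix} \alpha & \beta & \gamma \end{bmatrix} \end{align*}$

So the peak of the invariant density sits at $(\mu - \beta)/(2 \cdot \alpha + 1)$ , which collapses to $\mu$ in the Ornstein-Uhlenbeck case $\alpha = \beta = 0$ .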

4. Examples

In this section I work through $2$ examples which illustrate how to fit particular processes into the Pearson-Wong framework: the Vasicek process and a reflecting process that generates a cross-sectional distribution satisfying Zipf’s law.

In a Vasicek model the short-term interest rate follows an Ornstein-Uhlenbeck process:

(19) $\begin{align*} dX_t &= \theta \cdot (\mu - X_t) \cdot dt + \sigma \cdot dB_t \end{align*}$

with $\theta, \mu, \sigma > 0$ . Matching the volatility term to Equation (6) requires $2 \cdot \theta \cdot \gamma = \sigma^2$ . Thus, in the functional notation of the Pearson-Wong diffusion, we have that $\alpha = \beta = 0$ and $\gamma = \sigma^2 / (2 \cdot \theta)$ ; the normalization $\gamma = 1$ corresponds to $\sigma^2 = 2 \cdot \theta$ . Using Equation (18), we see that:

(20) $\begin{align*} \frac{d}{dx} \ln p(x) &= - \frac{x - \mu}{\gamma} = - \frac{2 \cdot \theta \cdot (x - \mu)}{\sigma^2} \end{align*}$

This is the exact same formulation as the Gaussian example from Section 2 with $r = \mu$ and $c = \gamma$ . Thus, we have that:

(21) $\begin{align*} p(x) &= \frac{1}{\sqrt{2 \cdot \pi \cdot \gamma}} \cdot e^{-\frac{(x - \mu)^2}{2 \cdot \gamma}}, \qquad \gamma = \frac{\sigma^2}{2 \cdot \theta} \end{align*}$

by solving for the constant via the boundary condition that $\int_{-\infty}^\infty p(x) \cdot dx = 1$ .

Next, consider a more complicated process that is defined only on the positive half-line, with a reflecting boundary at $\underline{x}>0$ , whose cross-sectional distribution follows a power law. Specifically, suppose that you have a cross-sectional probability density $p(x)$ defined as:

(22) $\begin{align*} p(x) &\propto \frac{1}{x^2} \end{align*}$

which is defined on $(\underline{x},\infty)$ . We see that the counter-cumulative distribution $\mathtt{Pr}[X > x]$ is proportional to $1/x$ so that Zipf’s law³ holds. However, note that there is no $x$ term in the numerator of the differential equation defining $p(x)$ :

(23) $\begin{align*} \frac{d}{dx} p(x) &= - \frac{2}{x} \cdot p(x) \end{align*}$

Thus, the power law cross-sectional distribution acts as a limiting case of the class of Pearson-Wong diffusions with $\beta = \gamma = 0$ and $\alpha \gg \mu$ :

(24) $\begin{align*} \frac{d}{dx} \ln p(x) &= - \frac{\overbrace{(2 \cdot \alpha + 1)}^{\approx 2 \cdot \alpha} \cdot x}{\alpha \cdot x^2} + \underbrace{\left(\frac{\mu}{\alpha \cdot x^2}\right)}_{\approx 0} \\ &\approx - \frac{2}{x} \end{align*}$

This solution works given the reflecting boundary $\underline{x} > 0$ as, for $\alpha$ large enough, the second term on the right hand side will be roughly $0$ and the $+1$ in the first term will be negligible.
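The limiting argument can be checked directly. The sketch below evaluates the log-derivative from Equation (18) with $\beta = \gamma = 0$ for increasing values of $\alpha$ and reports the gap relative to the Zipf target $-2/x$ . The evaluation point $x = 5$ and the value $\mu = 1$ are arbitrary illustrative choices.

```python
def pearson_wong_log_derivative(x, mu, alpha, beta, gamma):
    """d/dx ln p(x) implied by Equation (18)."""
    return -((2.0 * alpha + 1.0) * x - mu + beta) / (alpha * x * x + beta * x + gamma)

x, mu = 5.0, 1.0  # illustrative evaluation point and mean-reversion level
for alpha in (1.0, 10.0, 1000.0):
    gap = pearson_wong_log_derivative(x, mu, alpha, 0.0, 0.0) - (-2.0 / x)
    print(alpha, gap)  # the gap shrinks towards 0 as alpha grows

small_alpha_gap = abs(pearson_wong_log_derivative(x, mu, 1.0, 0.0, 0.0) + 2.0 / x)
large_alpha_gap = abs(pearson_wong_log_derivative(x, mu, 1000.0, 0.0, 0.0) + 2.0 / x)
```

For finite $\alpha$ the implied density behaves like $x^{-(2 \cdot \alpha + 1)/\alpha}$ far from the boundary, so the tail exponent of the counter-cumulative distribution is $(\alpha + 1)/\alpha$ , which approaches the Zipf value of $1$ from above.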

5. Conclusions

In the text above, I outline the topic of Pearson-Wong diffusions and also relate these results in continuous time mathematics to topics in macro-finance.

I conclude by looking at a final application in a recent Econometrica article, Gabaix (2011), on the granular origins of aggregate macroeconomic fluctuations. The core idea of this paper is that, when the cross-sectional distribution of firm production, $S_{t,n}$ , is Gaussian or some other thin-tailed distribution, shocks to the largest firms won’t matter as the number of firms $N \to \infty$ . However, if firm production is distributed according to a fat tailed distribution, then shocks to the production of the largest firms will matter.⁴

Proposition 2 of Gabaix (2011) gives the central result. Namely, that if firm size is distributed according to a power law,

(25) $\begin{align*} \mathtt{Pr}[S > \bar{s}] &= a \cdot \bar{s}^{-\zeta}, \end{align*}$

then as $N \to \infty$ , if $\zeta \geq 2$ shocks to large firms won’t matter, while if $\zeta \in [1,2)$ shocks to large firms will matter.
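One way to see this numerically is through the Herfindahl index of firm sizes: as I read Gabaix (2011), with independent firm-level shocks of common volatility, aggregate volatility is proportional to $\sqrt{\sum_n S_n^2} / \sum_n S_n$ . The sketch below places $N$ firms at the quantiles of the power law in Equation (25) (with $a = 1$ ) rather than drawing them at random, and compares how fast this index decays when $N$ grows by a factor of $100$ . The function name, the quantile construction and the two values of $\zeta$ are my own assumptions.

```python
import math

def herfindahl(zeta, n_firms):
    """sqrt(sum S_i^2) / sum S_i for sizes at the quantiles of Pr[S > s] = s^(-zeta)."""
    total, total_sq = 0.0, 0.0
    for i in range(1, n_firms + 1):
        q = (i - 0.5) / n_firms    # upper-tail probability assigned to firm i
        size = q ** (-1.0 / zeta)  # inverts Pr[S > s] = s^(-zeta)
        total += size
        total_sq += size * size
    return math.sqrt(total_sq) / total

# Decay of the index when the number of firms rises from 1,000 to 100,000.
thin_decay = herfindahl(3.0, 100_000) / herfindahl(3.0, 1_000)
fat_decay = herfindahl(1.1, 100_000) / herfindahl(1.1, 1_000)
print(thin_decay)  # roughly 1/sqrt(100) = 0.1 when zeta >= 2
print(fat_decay)   # decays much more slowly when zeta is close to 1
```

With $\zeta = 3$ the index falls at close to the $1/\sqrt{N}$ rate, so idiosyncratic shocks wash out, whereas with $\zeta = 1.1$ the largest firms keep the index from shrinking nearly as fast.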

Interestingly, the Pearson-Wong diffusion mathematics above gives us a new result for the implications of switching from the limiting case of $\zeta = 1$ to the case of $\zeta \in (0,1)$ . With $\zeta \in (0,1)$ , there will now be a new parameter $\beta$ to estimate. Thus, variation in how dispersed firms are in terms of their output reveals meaningful information about the structure of the stochastic process to which each firm’s output adheres.

¹ Source: R mixdist package.
² Background info comes from Ord (1985).
³ Gabaix (1999) or Tao (2009).
⁴ In practice, shocks to the largest couple of firms do seem to have an impact on even large economies. For example, in December 2004, Microsoft issued a $\$24B$ one-time dividend which boosted the growth in average personal income that year from $0.6\%$ to $3.7\%$ in the United States.

The Predictability and Volatility of Returns in the Presence of Rare Disasters

July 27, 2011 by Alex

1. Introduction

I characterize the relationship between the variance premium at time $t$ and the excess returns over the next $h$ months in an economy with variable rare disasters as outlined in Gabaix (2011) using the parameter estimates given in Bollerslev, Tauchen and Zhou (2009). Hao Zhou has been very kind and posted his data¹ for me to use in this analysis.

First, in section 2, I relate the equity premium conditional on no disasters occurring to the value of a put option on the market given that a disaster does occur. Then, in section 3, I relate the variance premium–i.e., the difference between the Black-Scholes implied variance and the realized variance–to this equity premium over the next $h$ months. Finally, in section 4, I conclude by using the parameter estimates in Bollerslev et al. (2009) to compute the coefficient $\beta$ which links the equity premium to the variance premium predicted by Gabaix (2011).

2. The Put Option Premium

Intuitively, in an economy with rare disasters, the equity premium should be given by the sum of the premium due to normal times risk and the premium due to disaster risk:

(1) $\begin{align*} r_t - r_t^f &= \pi^{\mathtt{N}}_t + \pi^{\mathtt{D}}_t, \end{align*}$

where $\pi^{\mathtt{N}}_t$ is the premium due to Gaussian noise and $\pi^{\mathtt{D}}_t$ is the premium due to disasters. If we looked at an economy in which there was no Gaussian noise as in the main sections of Gabaix (2011), then the entire risk premium would be due to disaster risk. However, here I am going to study a world in which there exists normal times Gaussian noise. Specifically, consider an asset with price $P_t$ that evolves according to the rule,

(2) $\begin{align*} \frac{P_{t+1}}{P_t} &= e^{\mu_t} \times \begin{cases} e^{\sigma \cdot \varepsilon_{t+1} - \sigma^2/2} &\text{ w/o disaster } \\ F_{t+1} &\text{ else } \end{cases} \end{align*}$

where $\mu_t = g_D + \zeta(\hat{H}_t)$ as $P_t/D_t = a + b \cdot \hat{H}_t$ . The functional form of $\zeta$ is given in Gabaix (2011), but will not be important here.

Let $\hat{V}_t(K)$ be the value of a $1$ -period European put option on this asset with strike price $K$ ,

(3) $\begin{align*} \hat{V}_t(K) &= \mathbb{E}_t \left[ \frac{M_{t+1}}{M_t} \cdot \left( K - \frac{P_{t+1}}{P_t} \right)^+ \right] \end{align*}$

Proposition 3 in Gabaix (2011) tells us how to compute the value of this put option:

(4) $\begin{align*} \hat{V}_t(K) &= e^{- \delta + \mu_t} \cdot \left\{ \hat{V}^{\mathtt{N}}_t(K) + \hat{V}^{\mathtt{D}}_t(K) \right\} \\ \hat{V}^{\mathtt{N}}_t(K) &= (1 - p_t) \cdot V(K \cdot e^{- \mu_t}) \\ \hat{V}^{\mathtt{D}}_t(K) &= p_t \cdot \mathbb{E}^{\mathtt{D}}_t \left[ B_{t+1}^{-\gamma} \cdot \left( K \cdot e^{-\mu_t} - F_{t+1} \right)^+ \right] \end{align*}$

where $V(\cdot)$ is the Black-Scholes value of a put option with volatility $\sigma$ , initial price $1$ , maturity $1$ and interest rate $0$ . In the proposition below, I relate the equity premium due to disaster risk to the value of the disaster component of the put option:

Proposition (Disaster Risk Premium): For a put option with strike price $K = e^{\mu_t}$ , the disaster risk premium can be written as,

(5) $\begin{align*} \pi^{\mathtt{D}}_t &\approx \hat{V}^{\mathtt{D}}_t(e^{\mu_t}) \end{align*}$

This result reads that, given a sufficiently high strike price $K = e^{\mu_t}$ such that the fraction of dividend lost due to a disaster $F_{t+1}$ will always be less than the discounted strike price value of $1$ , the contribution of disaster risk to the put option value is equal to the disaster risk premium. More intuitively, the value of the disaster risk premium must equal the value of an asset that pays out $\$1$ in the event of a disaster; i.e., the value of $\hat{V}_t^{\mathtt{D}}(e^{\mu_t})$ .

Proof (Disaster Risk Premium): From Proposition 1 in Gabaix (2011), we have that in a world with no Gaussian noise,

(6) $\begin{align*} r_t - r_t^f &= p_t \cdot \mathbb{E}^{\mathtt{D}}_t \left[ B_{t+1}^{-\gamma} \cdot \left( 1 - F_{t+1} \right) \right] \end{align*}$

If $K \cdot e^{-\mu_t} \geq 1$ we can drop the $\max$ operator implicit in $(\cdot)^+$ . Thus, for $K = e^{\mu_t}$ we have that $\hat{V}^{\mathtt{D}}_t(K)$ is equal to the disaster risk premium.
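Written out, the substitution runs as follows. At $K = e^{\mu_t}$ the discounted strike is $K \cdot e^{-\mu_t} = 1$ , and assuming $F_{t+1} \leq 1$ in every disaster state, the payoff inside the expectation is never negative, so the $(\cdot)^+$ can be dropped:

$\begin{align*} \hat{V}^{\mathtt{D}}_t(e^{\mu_t}) &= p_t \cdot \mathbb{E}^{\mathtt{D}}_t \left[ B_{t+1}^{-\gamma} \cdot \left( 1 - F_{t+1} \right)^+ \right] \\ &= p_t \cdot \mathbb{E}^{\mathtt{D}}_t \left[ B_{t+1}^{-\gamma} \cdot \left( 1 - F_{t+1} \right) \right] \end{align*}$

The last line is the right-hand side of Equation (6). The relationship in Equation (5) is stated as an approximation because Equation (6) holds in a world with no Gaussian noise.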

3. What’s Vol Got to Do With It?

Next, I want to use the result above to link the difference between the implied variance given by the price of the $1$ period European put option $\hat{V}_t(e^{\mu_t})$ and the realized variance of the underlying asset to the excess rate of return on the market over the next $h$ months. I denote this variance premium as,

(7) $\begin{align*} \textit{vp}_t &= \tilde{\sigma}_t^2 - \sigma_t^2 \end{align*}$

I then want to be able to run the regression below where $r_{t \to (t+h)} - r^f_{t \to (t+h)}$ is the annualized excess return on the S&P 500 and $\eta_t(h)$ is an error term,

(8) $\begin{align*} r_{t \to (t+h)} - r^f_{t \to (t+h)} &= \alpha(h) + \beta(h) \cdot \textit{vp}_t + \eta_{t}(h) \end{align*}$

Proposition (Return Predictability): Let $\phi(\cdot)$ be the $\mathtt{pdf}$ of the standard Gaussian distribution:

(9) $\begin{align*} \phi(x) &= \frac{1}{\sqrt{2 \cdot \pi}} \cdot e^{-\frac{x^2}{2}} \end{align*}$

Then, given a strike price $K = e^{\mu_t}$ , we can relate the variance premium $\textit{vp}_t$ to the risk premium $r_{t \to (t+h)} - r^f_{t \to (t+h)}$ over the next $h$ months via the relationship below,

(10) $\begin{align*} r_{t \to (t+h)} - r_{t \to (t+h)}^f &\approx \alpha + \beta \cdot \textit{vp}_t \end{align*}$

with the coefficients,

(11) $\begin{align*} \alpha &= e^{\delta - \mu_t} \cdot \left\{ V(e^{\mu_t}) - e^{-\delta + \mu_t} \cdot (1 - p_t) \cdot V(1) \right\} + \pi^{\mathtt{N}}_t \\ \beta &\approx \frac{e^{\delta - \mu_t}}{2 \cdot \sigma} \cdot \phi \left( \frac{\delta + \frac{\sigma^2}{2}}{\sigma} \right) \end{align*}$

This result says that, conditional on the realized variance $\sigma_t^2$ , increasing an economy’s variance premium $\textit{vp}_t$ will increase the realized excess returns over the next $h$ months in a manner that is proportional to $e^{\delta - \mu_t} / (2 \cdot \sigma)$ scaled by the standard Gaussian density evaluated at $(\delta + \sigma^2/2)/\sigma$ . Note that the coefficient $\beta$ will be highly non-linear in the realized volatility $\sigma$ . On one hand, with regards to the $e^{\delta - \mu_t} / (2 \cdot \sigma)$ term, decreasing $\sigma$ will always increase $\beta$ . However, on the other hand, with regards to the Gaussian term, as $\sigma$ gets very small the input to the $\phi(\cdot)$ function gets large, which pushes the density, and hence $\beta$ , towards $0$ .

Proof (Return Predictability): Suppose that we observe $\hat{V}(K)$ in the data. Then, we can approximate the implied volatility using the Black-Scholes $\nu$ (i.e., “vega”) via the relationship:

(12) $\begin{align*} \hat{V}(K) &= V(K) + \nu \cdot \left[ \ \tilde{\sigma} - \sigma \ \right] \\ \nu &= S \cdot \phi \left( \frac{\ln(S/K) + \delta + \frac{\sigma^2}{2}}{\sigma} \right) \end{align*}$

By definition, the starting price is $S = 1$ . What’s more, since the option is near the money, $K = e^{\mu_t} \approx 1$ , we have that $\ln(S/K) \approx 0$ .
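The first-order expansion in Equation (12) relies on the standard Black-Scholes vega, $S \cdot \phi(d_1) \cdot \sqrt{T}$ with $T = 1$ here. As a sanity check, the sketch below compares that closed form with a central finite difference of the put value in $\sigma$ . The function names and the inputs are illustrative assumptions; in particular, the `rate` argument plays the role that $\delta$ plays inside $\phi(\cdot)$ in Equation (12).

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1(spot, strike, rate, sigma, maturity):
    return (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * math.sqrt(maturity))

def bs_put(spot, strike, rate, sigma, maturity=1.0):
    """Black-Scholes value of a European put."""
    d_plus = d1(spot, strike, rate, sigma, maturity)
    d_minus = d_plus - sigma * math.sqrt(maturity)
    return strike * math.exp(-rate * maturity) * norm_cdf(-d_minus) - spot * norm_cdf(-d_plus)

def bs_vega(spot, strike, rate, sigma, maturity=1.0):
    """Sensitivity of the option value to sigma: S * phi(d1) * sqrt(T)."""
    return spot * norm_pdf(d1(spot, strike, rate, sigma, maturity)) * math.sqrt(maturity)

spot, strike, rate, sigma = 1.0, 1.0, 0.05, 0.2  # illustrative at-the-money inputs
bump = 1e-5
fd_vega = (bs_put(spot, strike, rate, sigma + bump) - bs_put(spot, strike, rate, sigma - bump)) / (2.0 * bump)
closed_vega = bs_vega(spot, strike, rate, sigma)
print(closed_vega, fd_vega)  # the two values should agree closely
```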

Plugging in the expression for $\hat{V}(K)$ given in Proposition 3 of Gabaix (2011) and working backwards yields the desired result:

(13) $\begin{align*} \hat{V}(e^{\mu_t}) &= V(e^{\mu_t}) + \phi \left( \frac{\delta + \frac{\sigma^2}{2}}{\sigma} \right) \cdot \left[ \ \tilde{\sigma} - \sigma \ \right] \\ &= e^{- \delta + \mu_t} \cdot \left\{ \hat{V}^{\mathtt{N}}_t(e^{\mu_t}) + \pi_t^{\mathtt{D}} \right\} \\ &= e^{- \delta + \mu_t} \cdot \left\{ (1 - p_t) \cdot V(1) + \pi_t^{\mathtt{D}} \right\} \end{align*}$

I use the approximation:

(14) $\begin{align*} \frac{\tilde{\sigma}_t^2 - \sigma_t^2}{2 \cdot \sigma_t} &\approx \tilde{\sigma}_t - \sigma_t \end{align*}$
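How good is this approximation? Since $\tilde{\sigma}^2 - \sigma^2 = (\tilde{\sigma} - \sigma) \cdot (\tilde{\sigma} + \sigma)$ , the left-hand side equals $(\tilde{\sigma} - \sigma) \cdot [1 + (\tilde{\sigma} - \sigma)/(2 \cdot \sigma)]$ , so the relative error is $(\tilde{\sigma} - \sigma)/(2 \cdot \sigma)$ . The sketch below tabulates both sides for a few implied volatilities; the function names and the volatility levels are illustrative assumptions.

```python
def linearized_vol_gap(implied_var, realized_var):
    """Left side of the approximation: (implied variance - realized variance) / (2 * sigma)."""
    sigma = realized_var ** 0.5
    return (implied_var - realized_var) / (2.0 * sigma)

def exact_vol_gap(implied_var, realized_var):
    """Right side: implied volatility minus realized volatility."""
    return implied_var ** 0.5 - realized_var ** 0.5

realized_vol = 0.133  # illustrative level
rows = []
for implied_vol in (0.14, 0.16, 0.20):
    approx = linearized_vol_gap(implied_vol ** 2, realized_vol ** 2)
    exact = exact_vol_gap(implied_vol ** 2, realized_vol ** 2)
    rows.append((implied_vol, approx, exact))
    print(implied_vol, approx, exact)
```

The approximation is tight when implied and realized volatility are close, and it overstates the gap by roughly a quarter when implied volatility is $0.20$ against a realized volatility of $0.133$ .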

This proposition yields a corollary linking the variance premium to innovations in asset resilience $\hat{H}_t$ :

Corollary (Implied Volatility and Asset Resilience): From Proposition 1 in Gabaix (2011), we know that,

(15) $\begin{align*} r_{t \to (t+1)} - r_{t \to (t+1)}^f &= \delta - H_t \end{align*}$

Thus, the variance premium must be linearly related to innovations in asset resilience $\hat{H}_t$ with slope coefficient $\theta$ given below:

(16) $\begin{align*} \textit{vp}_t &= \zeta + \theta \cdot \hat{H}_t \\ \theta &= \frac{1}{\frac{e^{\delta - \mu_t}}{2 \cdot \sigma} \cdot \phi \left( \frac{\delta + \frac{\sigma^2}{2}}{\sigma} \right)} \end{align*}$

4. Matching Bollerslev et al. (2009)

In this section, I conclude by computing a model derived estimate of $\beta(1)$ using the parameter values given in Bollerslev et al. (2009) which estimates the regression given in Equation (8) above.

In their original regression results in Table 2, Bollerslev et al. (2009) use a somewhat unconventional choice of units. Namely, the excess returns are annualized while the variance premium is computed as a monthly estimate. To map the estimates from the original paper over to model implied values, I convert all of the data to annualized log values as outlined below,

(17) $\begin{align*} r_{t+1}^e &= r_{t+1}^{e,\mathtt{btz}} \times \frac{1}{10^2} \\ \textit{vp}_t &= \textit{vp}_t^{\mathtt{btz}} \times \frac{12}{10^4} \end{align*}$
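The conversion in Equation (17) is simple enough to write as a helper. In the sketch below I read the original series as annualized percent returns and a monthly variance premium in percent squared, which is what the scaling factors suggest; the function name and the input values are hypothetical and not taken from the data set.

```python
def to_annualized_decimals(excess_return_btz, variance_premium_btz):
    """Apply Equation (17) to one observation in the original units."""
    excess_return = excess_return_btz / 1e2
    variance_premium = variance_premium_btz * 12.0 / 1e4
    return excess_return, variance_premium

# Hypothetical inputs chosen only to illustrate the scaling.
r_e, vp = to_annualized_decimals(excess_return_btz=5.0, variance_premium_btz=20.0)
print(r_e, vp)

# Factor by which a regression slope on the original units is rescaled.
slope_scale = (1.0 / 1e2) / (12.0 / 1e4)
print(slope_scale)
```

One consequence is that a slope coefficient estimated on the original units is multiplied by $100/12 \approx 8.3$ after the conversion, which matters when comparing estimates across papers.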

After converting the variables to natural units, I find the summary statistics listed below where $\mathbb{S}$ is the standard deviation operator,

$\begin{align*} \begin{array}{l|c} & \text{Estimate} \\ \hline \hline \mathbb{E}[r^e_{t+1}] & 0.069 \\ \mathbb{S}[r^e_{t+1}] & 0.138 \\ \hline \mathbb{E}[\textit{vp}_t] & 0.022 \\ \mathbb{S}[\textit{vp}_t] & 0.0053 \end{array} \end{align*}$

The sample runs from Jan. 1990 to December 2007. These estimates read that the S&P 500 outperformed $3$ -month T-Bills by an average of $6.9\%$ a year. This gap had a volatility of $13.8\%$ a year. What’s more, the average distance between the VIX implied variance and the realized variance for the S&P 500 was $2.2\%^2$ .

Using these newly converted variables, I estimate a $\beta(1)$ of $3.205$ with a $t$ -stat of $1.781$ . Below I report the remainder of the regression results where the values in square brackets represent standard errors,

$\begin{align*} \begin{array}{l|ccccc} & h=1 & h=3 & h=6 & h=9 & h=12 \\ \hline \hline \alpha & -0.0024 & -0.023 & 0.0097 & 0.033 & 0.042 \\ \text{s.e.} & [\ldots] & [\ldots] & [\ldots] & [\ldots] & [\ldots] \\ \text{s.e.}_{H1992} & [\ldots] & [\ldots] & [\ldots] & [\ldots] & [\ldots] \\ \hline \beta & 3.21 & 4.00 & 2.54 & 1.53 & 1.17 \\ \text{s.e.} & [\ldots] & [\ldots] & [\ldots] & [\ldots] & [\ldots] \\ \text{s.e.}_{H1992} & [\ldots] & [\ldots] & [\ldots] & [\ldots] & [\ldots] \\ \hline \textit{Adj. } R^2 & 0.010 & 0.070 & 0.057 & 0.027 & 0.018 \end{array} \end{align*}$

Alternatively, using the data from Drechsler and Yaron (2011), I find the coefficient estimates below:

$\begin{align*} \begin{array}{l|ccccc} & h=1 & h=3 & h=6 & h=9 & h=12 \\ \hline \hline \alpha & -0.017 & -0.030 & 0.012 & 0.039 & 0.042 \\ \text{s.e.} & [\ldots] & [\ldots] & [\ldots] & [\ldots] & [\ldots] \\ \text{s.e.}_{H1992} & [\ldots] & [\ldots] & [\ldots] & [\ldots] & [\ldots] \\ \hline \beta & 6.30 & 7.18 & 4.09 & 2.17 & 1.31 \\ \text{s.e.} & [\ldots] & [\ldots] & [\ldots] & [\ldots] & [\ldots] \\ \text{s.e.}_{H1992} & [\ldots] & [\ldots] & [\ldots] & [\ldots] & [\ldots] \\ \hline \textit{Adj. } R^2 & 0.0098 & 0.054 & 0.035 & 0.011 & 0.018 \end{array} \end{align*}$

I want to compute the $\beta(1)$ estimate implied by the variable rare disasters model in Gabaix (2011), which requires estimates of $\sigma$, $\delta$ and $\mu_t$. I take the annualized values of $\delta = 0.05$ and $\mu_t = 0.025$ from Gabaix (2011). I take the estimate of the annualized realized variance from Bollerslev et al. (2009) of $\sigma_t^2 = 0.133$. Using these values, I find that,

(18) $\begin{align*} \beta &= \frac{e^{\delta - \mu_t}}{2 \cdot \sigma} \cdot \phi \left( \frac{\delta + \frac{\sigma^2}{2}}{\sigma} \right) \\ &= \frac{e^{0.05 - 0.025}}{2 \cdot 0.133} \cdot \phi \left( \frac{0.05 + \frac{0.133^2}{2}}{0.133} \right) \\ &= 1.394 \end{align*}$

This estimate implies that if the variance premium rises by $1\%^2$ per year, then the equity premium will rise by $1.394\%$ per year.
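The arithmetic in equation (18) can be reproduced with a few lines of Python. This is only a sketch: it assumes that $\phi$ denotes the standard normal density, and it plugs $0.133$ in as $\sigma$ exactly as equation (18) does, even though the text reports that figure as a variance. The function names are my own.

```python
import math


def std_normal_pdf(x: float) -> float:
    """Density of a standard normal random variable at x (assumed meaning of phi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def implied_beta(delta: float, mu: float, sigma: float) -> float:
    """Right-hand side of equation (18)."""
    scale = math.exp(delta - mu) / (2.0 * sigma)
    return scale * std_normal_pdf((delta + 0.5 * sigma ** 2) / sigma)


# Inputs as they appear in equation (18).
beta = implied_beta(delta=0.05, mu=0.025, sigma=0.133)
print(round(beta, 3))
```

Under these assumptions the printed value agrees with the $1.394$ reported above.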

Click here to download. ↩
052 ↩
029 ↩
020 ↩
018 ↩
016 ↩
85 ↩
50 ↩
59 ↩
72 ↩
83 ↩
80 ↩
99 ↩
70 ↩
60 ↩
55 ↩
91 ↩
68 ↩
19 ↩
94 ↩
30 ↩
059 ↩
033 ↩
024 ↩
020 ↩
019 ↩
86 ↩
52 ↩
61 ↩
72 ↩
85 ↩
62 ↩
01 ↩
43 ↩
21 ↩
12 ↩
90 ↩
86 ↩
43 ↩
03 ↩
34 ↩

Correct Prices Are Not Free

July 16, 2011 by Alex

1. Introduction

It takes hard work to maintain prices at their fundamental values. Accurate, responsive and informative prices do not occur by magic. Analysts have to diligently monitor firms’ prospects and security prices. Market makers have to sustain a trustworthy and relatively liquid trading environment. Firms have to issue and honor publicly traded shares.

Markets with publicly posted prices¹ are a form of institutional capital that needs to be operated and maintained just like any other form of capital. Given these properties, what is the optimal amount of this institutional capital to use in our production function?

2. Nickel and Dimed

A common misperception is that, as technology improves and the number of market participants increases, the cost of maintaining markets drops to a negligible level. For example, Eugene Fama and Ken French write on their blog² that:

“If some informed active investors turn passive, prices tend to become less efficient. But the effect can be small if there is sufficient competition among remaining informed active investors. The answer also depends on the costs of uncovering and evaluating relevant knowable information. If the costs are low, then not much active investing is needed to get efficient prices.” — Eugene Fama and Ken French

However, even if it only takes each analyst a couple of seconds to look at his portfolio every day, due to the size of modern financial markets this effort will still add up to a meaningful total. Consider the example below that illustrates this point:

Example (Google’s Pac-Man Homage): Analysts at the software firm Rescue Time studied the browsing habits and Google usage of roughly $11K$ users in May, 2010. The firm makes time tracking software that keeps an eye on what workers do and where they go online.

On a typical day, people in the sample conducted roughly $22$ searches on the Google page. Each of these searches lasted about $11$ seconds. Putting Pac-Man on the Google homepage increased the average time spent on the page by about $36$ seconds.³

Extrapolating this across the $504$ million unique users who visit the main Google page each day, this represents an increase of $4.8$ million hours, equal to about $549$ years! In dollar terms, assuming people are paid roughly $\$25/\mathtt{hr}$, this equates to about $\$120M$ in lost productivity.
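The back of the envelope calculation in this example can be retraced as follows. The sketch uses only the rounded figures quoted above, and the variable names are my own. Multiplying the rounded inputs gives slightly more than the quoted $4.8$ million hours, which may simply reflect rounding in the reported numbers.

```python
SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 24 * 365


def hours_lost(users: float, extra_seconds: float) -> float:
    """Total extra hours when each user spends extra_seconds more on the page."""
    return users * extra_seconds / SECONDS_PER_HOUR


# Straight multiplication of the rounded inputs from the example.
implied_hours = hours_lost(users=504e6, extra_seconds=36)

# Conversions applied to the 4.8 million hour figure quoted in the example.
quoted_hours = 4.8e6
years = quoted_hours / HOURS_PER_YEAR
dollars = quoted_hours * 25.0  # assumed wage of $25 per hour

print(f"implied hours: {implied_hours:,.0f}")
print(f"years: {years:,.0f}")
print(f"dollars: {dollars:,.0f}")
```

The year and dollar conversions land close to the figures quoted in the example, so the orders of magnitude hold up even though the inputs are rough.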

Screen shot of the playable version of Pac-Man posted on Google's front page on 5/21/10 to celebrate 30 years since the launch of Pac-Man in Japan.

3. The Brain Drain

There are a few papers like Abel, Eberly and Panageas (2007) which address this problem of optimal inattention, but I do not know of any papers which explicitly examine the welfare effects of too much or too little attention to asset markets.

To my knowledge, the closest paper to this line of analysis is Philippon (2007) which posits a simple general equilibrium model to digest the welfare effects of the recent growth in the output of the US financial sector. The plot below shows that by 2006, the financial sector was generating roughly $8\%$ of US GDP. However, this study focuses on expenditures in the corporate finance side of the financial industry.

Financial industry fraction of US GDP. Source: Philippon (2007). Estimation based on U.S. Annual Industry Accounts, Kuznets (1941), Martin (1939), U.S. Census, and Historical Statistics of the U.S. (2006).

I am interested in a slightly different question: how much time should we spend worrying about the markets? To see the importance of this trade-off, consider the example below. Lots of very smart physicists have left academic and industrial engineering positions to work on Wall Street over the course of the last $20$ years⁴:

Example (Physics Brain Drain): Suppose that all these skilled analysts were making markets more efficient by pinning asset prices closer to their “correct” values. How much better off are we due to this marginal improvement in asset pricing accuracy? How much more productive would our world be if these people had forgone their finance careers and focused on the material physics behind computer hardware or the complex programming problems that underpin parallel computing?

4. Why Do Public Markets Persist?

In spite of their operating and maintenance costs, markets with publicly posted prices remain a common method of allocating physical goods, control rights and risk. For instance, the total value of U.S. equity markets exceeds the total U.S. GDP by a factor of at least $1.5$ as shown below⁵:

Total market capitalization of the NYSE-NASDAQ as a % of U.S. GDP.

There must be some positive externality to having fairly accurate and publicly displayed asset prices, otherwise these types of exchanges wouldn’t be so popular.⁶ I think that this insight that accurate, public prices must confer a positive externality is the key to understanding how making asset prices more accurate will improve welfare.

Alternative methods include procedures such as over-the-counter exchanges or dark pools. ↩
This post relates to their 2007 paper: Disagreement, Tastes and Asset Prices. ↩
These figures will form the basis of a back of the envelope calculation which may be biased either up or down. On one hand, the figure could overestimate the number of Google searches done per day as the software is voluntarily used by technically savvy employees. On the other hand, the estimates could be biased downward as only a minority of users realised that the logo was playable. To play, people had to click on the “insert coin” button which replaced the more familiar “I’m Feeling Lucky” button. ↩
See Steve Hsu’s (2010) review of The Quants for Physics World. ↩
Source: The Big Picture. ↩
…even in prison! ↩

Dyson Brownian Motion

July 15, 2011 by Alex

1. Introduction

I outline the construction of Dyson Brownian motion which governs the evolution of the eigen-values of an $(N \times N)$ -dimensional stochastic process of Hermitian matrices. For instance, if $A(t)$ is such a process, then:

(1) $\begin{align*} A(t+dt) \ &= \ A(t) \ + \ (dt)^{1/2} \cdot G, \end{align*}$

where $G$ is an $(N \times N)$-dimensional random Hermitian matrix drawn from the Gaussian Unitary Ensemble ($\mathtt{GUE}(N)$).

Why study the eigen-values of a stream of Hermitian matrices? At first glance, this seems like a rather obscure mathematical object. Before I can answer this question, in Section 2 I first define what a Hermitian matrix is and discuss how I would select this random matrix $G$ . Then, in Section 3 I can give some practical examples in which Dyson Brownian motion would be a useful construction. I also give an alternative interpretation of Dyson Brownian motion related to non-intersecting Brownian processes and explain the use of complex matrices in an economic context. Finally, in Section 4, I construct Dyson Brownian motion.

The main source for the material in this post is Terry Tao’s set of lecture notes on Random Matrix Theory, though I also used Mehta (2004) and Anderson, Guionnet and Zeitouni (2009) as references.

2. Mathematical Foundation

First things first: “What is a Hermitian matrix?”

Definition (Hermitian Matrix): A square $N \times N$ matrix $A$ is called Hermitian if it is self-adjoint:

(2) $\begin{align*} A \ &= \ A^* \end{align*}$

Each element $(i,j)$ of a Hermitian matrix $A$ is the complex conjugate of element $(j,i)$ in $A$ . Thus, the diagonal elements have to be real. Consider an example Hermitian matrix $A'$ :

(3) $\begin{align*} A' \ &= \ \begin{bmatrix} 1 & - i \\ i & 2 \end{bmatrix} \end{align*}$

Hermitian matrices are just the complex extension of real symmetric matrices. For instance, the matrix $A''$ below is a real instance of a Hermitian matrix:

(4) $\begin{align*} A'' \ &= \ \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \end{align*}$
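The definition can be checked mechanically. Here is a minimal sketch in plain Python that tests the two example matrices from equations (3) and (4), plus one matrix that fails the test. The helper name `is_hermitian` and the tolerance are my own choices.

```python
def is_hermitian(a, tol=1e-12):
    """Return True if the square matrix a (a list of rows) equals its conjugate transpose."""
    n = len(a)
    if any(len(row) != n for row in a):
        return False
    return all(
        abs(a[i][j] - complex(a[j][i]).conjugate()) <= tol
        for i in range(n)
        for j in range(n)
    )


A_prime = [[1, -1j], [1j, 2]]        # equation (3)
A_double_prime = [[1, 0], [0, 2]]    # equation (4)
not_hermitian = [[1, 1j], [1j, 2]]   # symmetric, but not self-adjoint

print(is_hermitian(A_prime), is_hermitian(A_double_prime), is_hermitian(not_hermitian))
```

The third matrix is symmetric without being Hermitian, which is why the test conjugates as well as transposes.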

Next, when I defined the stochastic process $A(t)$ above, I characterized each lurch forward by the addition of a random matrix $G$ scaled by the square root of the time interval. I pull this random matrix from the Gaussian Unitary Ensemble.

Definition (Gaussian Unitary Ensemble): The Gaussian Unitary Ensemble $\mathtt{GUE}(N)$ is a probability space over the vector space of $N \times N$ dimensional Hermitian matrices governed by the measure $\mu$ defined as:

(5) $\begin{align*} \mu(A) \ &= \ \frac{1}{2^{\frac{N}{2}}\cdot \pi^{\frac{N^2}{2}}} \cdot e^{-\frac{N}{2} \cdot \mathtt{Tr}(A^2)} \end{align*}$

So, each upper triangular element $a_{i,j}$ will be drawn from $\mathcal{N}(0,1)_{\mathbb{C}}$ while each element $a_{i,i}$ along the diagonal will be drawn from $\mathcal{N}(0,1)_{\mathbb{R}}$. To get a feel for what this definition really means, consider a concrete example. Suppose that we want to know the density $\mu(A')$ at the example matrix $A'$ above. Setting $N = 2$, I find that:

(6) $\begin{align*} \begin{split} (A')^2 \ &= \ \begin{bmatrix} 2 & - 3 \cdot i \\ 3 \cdot i & 5 \end{bmatrix} \\ \mathtt{Tr}[(A')^2] \ &= \ 2 \ + \ 5 \ = \ 7 \\ \mu(A') \ &= \ \frac{1}{2^{\frac{2}{2}} \cdot \pi^{\frac{2^2}{2}}} \cdot e^{-\frac{2}{2} \cdot 7} \\ &= \ \frac{1}{2 \cdot \pi^2} \cdot e^{-7} \\ &= \ 0.0000462 \end{split} \end{align*}$
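The worked example can be verified numerically. The sketch below evaluates the measure in equation (5) exactly as written, using plain Python complex numbers. The helper names `matmul` and `gue_density` are my own. Squaring $A'$ puts $-3 \cdot i$ in the upper right entry and $3 \cdot i$ in the lower left. The trace, and hence the density, does not depend on those off-diagonal entries.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def gue_density(a):
    """Equation (5) evaluated at the Hermitian matrix a."""
    n = len(a)
    a_squared = matmul(a, a)
    trace = sum(a_squared[i][i] for i in range(n)).real
    norm = 2 ** (n / 2) * math.pi ** (n * n / 2)
    return math.exp(-(n / 2) * trace) / norm


A_prime = [[1, -1j], [1j, 2]]
A_squared = matmul(A_prime, A_prime)
density = gue_density(A_prime)

print(A_squared)
print(f"{density:.7f}")
```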

The analysis below also goes through if you instead add random matrices $G$ drawn from an ensemble of real symmetric matrices. This ensemble is known as the Gaussian Orthogonal Ensemble.
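To make the update rule in equation (1) concrete, here is a rough sketch of how one might draw $G$ entry by entry and iterate the matrix process. It follows the entry-wise description above (real standard normal diagonal, complex standard normal above the diagonal) and assumes that a complex standard normal has independent real and imaginary parts with variance $1/2$ each. The function names `sample_gue` and `step` are hypothetical.

```python
import math
import random


def sample_gue(n, rng):
    """Draw an n x n Hermitian matrix with Gaussian entries, entry by entry."""
    g = [[0j] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entries are real standard normals.
        g[i][i] = complex(rng.gauss(0.0, 1.0), 0.0)
        for j in range(i + 1, n):
            # Assumption: real and imaginary parts each have variance 1/2.
            z = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
            g[i][j] = z
            g[j][i] = z.conjugate()  # the lower triangle mirrors the upper one
    return g


def step(a, dt, rng):
    """One increment of equation (1): A(t + dt) = A(t) + sqrt(dt) * G."""
    n = len(a)
    g = sample_gue(n, rng)
    scale = math.sqrt(dt)
    return [[a[i][j] + scale * g[i][j] for j in range(n)] for i in range(n)]


rng = random.Random(0)
A = [[0j] * 3 for _ in range(3)]
for _ in range(100):
    A = step(A, 0.01, rng)
```

Because every increment is Hermitian, the running sum `A` stays Hermitian at each step, so its eigen-values remain real throughout.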

Note that the elements of the matrices $A(t)$ do not each follow their own independent Brownian motion process. In a loose sense, this would be “too little” structure. The main thrust of the Dyson Brownian motion construction is that the eigen-values of this process follow a Brownian motion plus a twist term. The eigen-values are an attractive summary statistic for $2$ reasons. First, we know that they have a simple spectrum due to the fact that at each point in time $A(t)$ is a Hermitian matrix. What’s more, the Ky Fan inequality tells us that the eigen-values should have a smooth transition function over time.

3. Motivating Examples

With these terms in hand, I can now ask: “Why worry about matrix valued stochastic processes?”

First, consider some financial applications. Financial theory is built around $\beta$-pricing models which measure the correlation between the returns of assets and various risk factors. Models which assume that these $\beta$ measures remain constant for long periods of time do poorly in empirical tests.¹ It would be nice to characterize the evolution of these variance-covariance matrices.² Alternatively, suppose that you were in the business of identifying the principal components of stock returns, either explicitly or by hunting for additional factors.³ Here, you might want to work out how likely it is that the largest principal component has moved by $d\lambda$ in order to test your model.

In an entirely different context, Dyson Brownian motion can also be thought of as characterizing the evolution of $N$ Brownian motions $\begin{bmatrix} \lambda_1(t) & \ldots & \lambda_N(t) \end{bmatrix}$ that have been restricted to never intersect. The problems of modelling the eigen-values of Hermitian matrices and of constructing non-intersecting Brownian processes are not linked a priori. However, constructing these non-intersecting processes is hard and an elegant solution emerges via solving the eigen-value process problem for Hermitian matrices which have a simple spectrum. For an example of the economic usefulness of such a trick, consider modelling the real option of a worker to switch jobs.⁴ At each point in time, he has a next best option but the exact nature of that next best option will change over time. Rather than keeping track of all possibilities, you could just model the evolution of the best option via Dyson Brownian motion.

Finally, I want to make a quick note about the use of complex valued rather than real matrices. Physicists declare that real symmetric matrices preserve time reversal symmetry while complex Hermitian matrices do not. For instance, in the financial application above where each of the entries in the variance-covariance matrix process have to be real, you can always undo the last step of the stochastic process by hitting $A(t+dt)$ by a well chosen inverse. However, when using complex valued matrices, this inverse is no longer possible as complex numbers are periodic; i.e.,

(7) $\begin{align*} \begin{split} i \ &= \ \sqrt{-1} \\ i^2 \ &= \ -1 \\ i^3 \ &= \ - i \\ i^4 \ &= \ 1 \end{split} \end{align*}$

To see the implications of this fact in a macroeconomic setting, consider a complex valued extension of a Leontief production model as follows. Suppose that prices are local⁵ and form the real part of each $a_{i,j}$ entry while the magnitude of the transaction forms the imaginary part. So, for instance, a transaction of $3$ tons of steel between a builder $i$ and a steel maker $j$ at a price of $5$ dollars per ton would manifest itself as $a_{i,j} = 5 + 3 \cdot i$ and $\bar{a}_{i,j} = a_{j,i} = 5 - 3 \cdot i$. Thus, by introducing an additional dimension to the Leontief matrix, the physical properties of the process it represents change dramatically.

4. Construction

Finally, I actually get around to defining Dyson Brownian motion. Below, I state the result:

Theorem (Dyson Brownian Motion): Let $t > 0$ , $dt > 0$ and $\begin{bmatrix} \lambda_1(t) & \ldots & \lambda_N(t) \end{bmatrix}$ be the spectrum of eigen-values of the $N \times N$ Hermitian matrix valued process $\{A(t)\}_{t \geq 0}$ . Then, we have:

(8) $\begin{align*} d \lambda_i(t) \ &= \ d B_i(t) \ + \ \sum_{1 \leq j \leq N: j \neq i} \ \frac{d t}{\lambda_i(t) - \lambda_j(t)} \end{align*}$

for all $1 \leq i \leq N$ , where $d \lambda_i(t) = \lambda_i(t + dt) - \lambda_i(t)$ and $\begin{bmatrix} dB_1 & \ldots & dB_N \end{bmatrix}$ are independent Brownian motion processes.

In words, this theorem says that the eigen-values of a stochastic process of Hermitian matrices behave like independent Brownian motions plus a repulsion force which is inversely proportional to the distance between any $2$ eigen-values. What’s more, this repulsive force is not localized. Each eigen-value $\lambda_i(t)$ is pushed and pulled by $\lambda_{i-1}(t)$ and $\lambda_{i+1}(t)$ but also by $\lambda_1(t)$ and $\lambda_N(t)$ as well as each and every eigen-value in between.
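The repulsion term in equation (8) is easy to compute for a given spectrum. The sketch below evaluates the drift for each eigen-value and takes a single crude Euler step. The function names are my own, and the Euler scheme is only an approximation that can misbehave when two eigen-values sit very close together, so it should not be read as an exact sampler.

```python
import math
import random


def repulsion(lams):
    """Drift in equation (8): for each i, the sum over j != i of 1 / (lam_i - lam_j)."""
    return [
        sum(1.0 / (li - lj) for j, lj in enumerate(lams) if j != i)
        for i, li in enumerate(lams)
    ]


def euler_step(lams, dt, rng):
    """One Euler-Maruyama step of equation (8) with independent Gaussian shocks."""
    drift = repulsion(lams)
    return [
        li + math.sqrt(dt) * rng.gauss(0.0, 1.0) + dt * di
        for li, di in zip(lams, drift)
    ]


spectrum = [-1.0, 0.0, 1.0]
drift = repulsion(spectrum)
next_spectrum = euler_step(spectrum, 1e-4, random.Random(0))
print(drift)
```

For this evenly spaced spectrum the outer eigen-values are pushed outward while the middle one feels no net force: its two neighbours cancel. The drifts always sum to zero because each pair contributes equal and opposite terms.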

To formulate the construction, I use a Lemma from Hadamard given below:

Lemma (Hadamard Operator): Let $u_i$ denote the unit eigen-vector associated with $\lambda_i$. The eigen-values of $A$ have the following first and second derivatives with respect to time $t$:

(9) $\begin{align*} \dot{\lambda}_i \ &= \ u_i^* \ \dot{A} \ u_i \\ \ddot{\lambda}_i \ &= \ u^*_i \ \ddot{A} \ u_i \ + \ 2 \cdot \sum_{j \neq i} \ \frac{\left\vert u_j^* \ \dot{A} \ u_i \right\vert^2}{\lambda_i - \lambda_j} \end{align*}$

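Proof (Hadamard Operator): Start from the eigen-value equation and the normalization of each eigen-vector:

(10) $\begin{align*} \begin{split} A \ u_i \ &= \lambda_i \ u_i \\ u_i^* \ u_i \ &= \ 1 \end{split} \end{align*}$

Differentiating both lines with respect to $t$ gives:

(11) $\begin{align*} \begin{split} \dot{A} \ u_i \ + \ A \ \dot{u}_i \ &= \dot{\lambda}_i \cdot u_i \ + \ \lambda_i \cdot \dot{u}_i \\ \dot{u}_i^* \ u_i \ + \ u_i^* \ \dot{u}_i \ &= \ 0 \end{split} \end{align*}$

Left-multiplying the first line of (11) by $u_i^*$ and using $u_i^* A = \lambda_i \cdot u_i^*$ together with $u_i^* u_i = 1$ yields the first derivative:

(12) $\begin{align*} \dot{\lambda}_i \ &= \ u_i^* \ \dot{A} \ u_i \end{align*}$

Differentiating (12) once more gives the first line below, while left-multiplying the first line of (11) by $u_j^*$ for $j \neq i$ gives the second:

(13) $\begin{align*} \begin{split} \ddot{\lambda}_i \ &= \ \dot{u}^*_i \ \dot{A} \ u_i \ + \ u^*_i \ \ddot{A} \ u_i \ + \ u^*_i \ \dot{A} \ \dot{u}_i \\ 0 \ &= \ u_j^* \ \dot{A} \ u_i \ + \ (\lambda_j - \lambda_i) \cdot u_j^* \ \dot{u}_i \end{split} \end{align*}$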
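One way to close the argument is sketched below. It relies on assumptions that are not spelled out above: the eigen-values are distinct, $\dot{A}$ is Hermitian, and the phase of each eigen-vector is chosen so that $u_i^* \ \dot{u}_i = 0$. Left-multiplying the first line of (11) by $u_j^*$ for $j \neq i$ pins down the coefficients of $\dot{u}_i$ in the eigen-basis, which can then be substituted into the cross terms of $\ddot{\lambda}_i$:

$\begin{align*} \begin{split} u_j^* \ \dot{u}_i \ &= \ \frac{u_j^* \ \dot{A} \ u_i}{\lambda_i - \lambda_j} \\ \dot{u}_i \ &= \ \sum_{j \neq i} \ \frac{u_j^* \ \dot{A} \ u_i}{\lambda_i - \lambda_j} \cdot u_j \\ u_i^* \ \dot{A} \ \dot{u}_i \ &= \ \sum_{j \neq i} \ \frac{\left\vert u_j^* \ \dot{A} \ u_i \right\vert^2}{\lambda_i - \lambda_j} \end{split} \end{align*}$

The last line uses the fact that $u_i^* \ \dot{A} \ u_j$ is the complex conjugate of $u_j^* \ \dot{A} \ u_i$ when $\dot{A}$ is Hermitian. The term $\dot{u}_i^* \ \dot{A} \ u_i$ is the conjugate of this real sum and so equals it, which is where the factor of $2$ in the second derivative comes from.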

Finally, I use this lemma together with the properties of Hermitian matrices to finish the construction of Dyson Brownian motion.

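Proof (Dyson Brownian Motion): Each step of the matrix process adds an independent draw from the Gaussian Unitary Ensemble:

(14) $\begin{align*} A(t + dt) \ &= \ A(t) \ + \ (dt)^{1/2} \cdot G \end{align*}$

Taylor expanding each eigen-value in the direction $G$ gives:

(15) $\begin{align*} \lambda_i(t + dt) \ &= \ \lambda_i(t) \ + \ (dt)^{1/2} \cdot \nabla_G \lambda_i(t) \ + \ \frac{dt}{2} \cdot \nabla_G^2 \lambda_i(t) \ + \ \ldots \end{align*}$

The lemma, applied with $\dot{A} = G$ and $\ddot{A} = 0$, supplies the two directional derivatives:

(16) $\begin{align*} \begin{split} \nabla_G \lambda_i(t) \ &= \ u_i^* \ G \ u_i \\ \nabla_G^2 \lambda_i(t) \ &= \ 2 \cdot \sum_{j \neq i} \ \frac{\left\vert u_j^* \ G \ u_i \right\vert^2}{\lambda_i(t) - \lambda_j(t)} \end{split} \end{align*}$

Substituting these into (15) and dropping the higher order terms leaves:

(17) $\begin{align*} d \lambda_i(t) \ &= \ (dt)^{1/2} \cdot \left\{ \ u_i^* \ G \ u_i \ \right\} \ + \ dt \cdot \left\{ \ \sum_{j \neq i} \ \frac{\left\vert u_j^* \ G \ u_i \right\vert^2}{\lambda_i(t) - \lambda_j(t)} \ \right\} \end{align*}$

Writing $\varepsilon_{i,j} = u_j^* \ G \ u_i$ for the entries of $G$ in the eigen-basis of $A(t)$, this becomes:

(18) $\begin{align*} d \lambda_i(t) \ &= \ (dt)^{1/2} \cdot \varepsilon_{i,i} \ + \ dt \cdot \left\{ \ \sum_{j \neq i} \ \frac{\left\vert \varepsilon_{i,j} \right\vert^2}{\lambda_i(t) - \lambda_j(t)} \ \right\} \end{align*}$

Because the Gaussian Unitary Ensemble is invariant under unitary changes of basis, the $\varepsilon_{i,j}$ have the same distribution as the entries of $G$ itself. The first term, $(dt)^{1/2} \cdot \varepsilon_{i,i}$, is therefore the increment $dB_i(t)$ of a Brownian motion, and replacing each $\left\vert \varepsilon_{i,j} \right\vert^2$ by its mean of $1$ under the entry-wise normalization above delivers equation (8).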
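A quick Monte Carlo check of the second moment that links equation (18) to the drift in equation (8): if the off-diagonal terms $\varepsilon_{i,j}$ are complex standard normals, then $\left\vert \varepsilon_{i,j} \right\vert^2$ should average to roughly $1$. The sketch assumes the convention that the real and imaginary parts are independent with variance $1/2$ each. The names, seed and sample size are arbitrary choices of mine.

```python
import math
import random


def complex_std_normal(rng):
    """One draw with independent real and imaginary parts of variance 1/2 (assumed convention)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


rng = random.Random(1)
n_draws = 20000

# Sample average of the squared modulus across many draws.
mean_sq = sum(abs(complex_std_normal(rng)) ** 2 for _ in range(n_draws)) / n_draws
print(round(mean_sq, 2))
```

Under that convention the sample average should sit close to $1$, which is what lets the squared terms in the drift be swapped for a unit coefficient on each $1 / (\lambda_i(t) - \lambda_j(t))$.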

Business Cycle Patterns

July 15, 2011 by Alex

The chart below contains macroeconomic and financial data on the US economy from 1775 to 1943. The chart is from the St. Louis Federal Reserve Fraser Archives. I’ve run across this chart several times over the course of the past year and I find new and interesting trends each time I look at it. The only thing I find a bit annoying about the chart is its size: it is hard to look at such a wide image online as an embedded PNG. To solve this problem, I’ve uploaded the image to Zoom.it and posted the resulting zoomable image here.

For me, the most striking part about this plot is the correlation between spikes in commodity prices and the start of wars. Most economic models of rare disasters focus on the stock and bond pricing implications of rare drops in GDP due to wars and depressions. However, this infographic shows that commodity prices for goods like energy, housing, oil and metals are the most dramatically impacted by these rare events.

Be sure to expand the image to full screen mode if you want a more detailed look.
