Alex – Page 20 – Research Notebook

Plotting Geographic Densities in R

July 11, 2011 by Alex

I show (here) how to create a heat map of the intensity of home purchases from 2000 to 2008 in Los Angeles County, CA using a random sample of 5000 observations from the county deeds records. I build on the code created by David Kahle for Hadley Wickham’s ggplot2 case study competition. I use the results of the geocoding procedure that I outline here as the input data.

How to Geocode Addresses Using the Yahoo! PlaceFinder API

July 11, 2011 by Alex

This post contains a link (here) to a Python program which geocodes a large number of addresses using the Yahoo! PlaceFinder API. This program tracks both the use of the API IDs and which files have been completed. The code can also be easily parallelized. The code makes use of earlier work I had done in R to accomplish the same task.

Random Effects Decomposition

June 27, 2011 by Alex

Motivation

I work through the error components econometric model outlined in Amemiya (1985). I use Hayashi (2000) as a reference text. I work through this example because I use this model in my working paper with Chris Mayer on bubble identification and I would like to work out the details as I didn’t spend much time on these sorts of models in my core econometrics courses.

In my paper with Chris, I develop a method of identifying relative mispricings between city specific markets in the US residential housing market using flows of speculative buyers between cities and assuming that city sizes are exogenous. Previously, analysts suspected that the housing bubble was due to credit supply factors. I use a random effects model to gauge the relative importance of $1)$ aggregate credit supply factors and $2)$ cross-city speculator flows in explaining mis-pricing in the housing market in our sample.

Econometric Framework

I characterize the random effects error components estimator outlined in Amemiya (1985, Ch. 6). Consider a balanced panel with $N$ panels and $T$ observations per panel. I study a regression specification of the following type:

(1) $\begin{align*} y_{n,t} \ &= \ \langle X_{n,t} \mid \beta \rangle \ + \ \mu_n \ + \ \lambda_t \ + \ \varepsilon_{n,t} \end{align*}$

I can vectorize this specification by stacking each of these $N \times T$ equations:

(2) $\begin{align*} \begin{split} \mathcal{U} \ &= \ \langle I_N \otimes 1_T \mid \mu \rangle \ + \ \langle 1_N \otimes I_T \mid \lambda \rangle \ + \ \mathcal{E} \\ Y \ &= \ \langle X \mid \beta \rangle \ + \ \mathcal{U} \end{split} \end{align*}$

Assumptions

I make the following assumptions about the shape of the errors:

Assumption: (Error Structure) I assume that:

1) Unbiased-ness: $\langle \mu_n \rangle = 0$ , $\langle \lambda_t \rangle = 0$ and $\langle \varepsilon_{n,t} \rangle = 0$

2) White-Noise: $\langle \mu_n \mid \lambda_t \rangle = 0$ , $\langle \lambda_t \mid \varepsilon_{n,t} \rangle = 0$ and $\langle \varepsilon_{n,t} \mid \mu_n \rangle = 0$

3) Homoskedasticity: $\vert \mu \rangle \langle \mu \vert = I_N \cdot \sigma^2_\mu$ , $\vert \lambda \rangle \langle \lambda \vert = I_T \cdot \sigma^2_\lambda$ and $\vert \varepsilon \rangle \langle \varepsilon \vert = I_{N \times T} \cdot \sigma^2_{\varepsilon}$

What are the key take-aways from these assumptions? First, assumption $1)$ is without loss of generality so long as there is a constant term in the explanatory $X$ variables. Assumption $2)$ is just the standard white noise assumption. Assumption $3)$ is the key restriction. This assumption says that the panel effects $\mu_n$ are independent across panels, that the time effects $\lambda_t$ are independent across time, and that each error component has a constant variance. The estimator I define below allows me to learn the values of $\sigma_\mu^2$ , $\sigma_\lambda^2$ and $\sigma_\varepsilon^2$ .

Estimation

How do I go about estimating these $3$ objects? First, I define some notation to make my life a bit easier and stave off carpal tunnel for a few more semesters:

(3) $\begin{align*} \begin{split} F \ &= \ \vert I_N \otimes 1_T \rangle \langle I_N \otimes 1_T \vert \\ G \ &= \ \vert 1_N \otimes I_T \rangle \langle 1_N \otimes I_T \vert \end{split} \end{align*}$

Also, let $H$ be an $(N \cdot T) \times (N \cdot T)$ matrix of ones. I name the error covariance matrix $\Omega$ , and then characterize it as a linear function of the $3$ variance terms of interest:

(4) $\begin{align*} \begin{split} \Omega \ &= \ \vert \mathcal{U} \rangle \langle \mathcal{U} \vert \\ &= \ \sigma_\mu^2 \cdot F \ + \ \sigma^2_\lambda \cdot G \ + \ \sigma_\varepsilon^2 \cdot I_{N \times T} \end{split} \end{align*}$

I can write out the inverse of the error covariance matrix $\Omega$ as follows:

(5) $\begin{align*} \begin{split} \Omega^{-1} \ &= \ \frac{1}{\sigma_\varepsilon^2} \cdot \left( I_{N \times T} - \gamma_1 \cdot F - \gamma_2 \cdot G + \gamma_3 \cdot H \right) \\ \gamma_1 \ &= \ \frac{\sigma_\mu^2}{\sigma_\varepsilon^2 + T \cdot \sigma_\mu^2} \\ \gamma_2 \ &= \ \frac{\sigma_\lambda^2}{\sigma_\varepsilon^2 + N \cdot \sigma_\lambda^2} \\ \gamma_3 \ &= \ \gamma_1 \cdot \gamma_2 \cdot \left( \ \frac{2 \cdot \sigma_\varepsilon^2 + T \cdot \sigma_\mu^2 + N \cdot \sigma_\lambda^2}{\sigma_\varepsilon^2 + T \cdot \sigma_\mu^2 + N \cdot \sigma_\lambda^2} \ \right) \end{split} \end{align*}$
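As a sanity check on the closed form for $\Omega^{-1}$ , here is a minimal Python sketch that builds $F$ , $G$ and $H$ for a small panel and multiplies $\Omega$ by the candidate inverse in exact rational arithmetic. The function name `check_inverse` and the panel-by-panel stacking of observations are assumptions made for this illustration. The sketch takes $H$ to be the $(N \cdot T) \times (N \cdot T)$ matrix of ones and uses minus signs on both $\gamma_1 \cdot F$ and $\gamma_2 \cdot G$ .

```python
from fractions import Fraction


def check_inverse(N, T, s_mu, s_lam, s_eps):
    """Return True if Omega times the closed-form inverse is exactly the identity.

    s_mu, s_lam and s_eps are the three variances (not standard deviations).
    """
    s_mu, s_lam, s_eps = Fraction(s_mu), Fraction(s_lam), Fraction(s_eps)
    n = N * T
    idx = range(n)
    # Assumption: row i stacks observations panel by panel,
    # so panel = i // T and period = i % T.
    I = [[Fraction(int(i == k)) for k in idx] for i in idx]
    F = [[Fraction(int(i // T == k // T)) for k in idx] for i in idx]  # same panel
    G = [[Fraction(int(i % T == k % T)) for k in idx] for i in idx]    # same period
    H = [[Fraction(1) for _ in idx] for _ in idx]                      # all ones

    g1 = s_mu / (s_eps + T * s_mu)
    g2 = s_lam / (s_eps + N * s_lam)
    g3 = g1 * g2 * (2 * s_eps + T * s_mu + N * s_lam) / (s_eps + T * s_mu + N * s_lam)

    omega = [[s_mu * F[i][k] + s_lam * G[i][k] + s_eps * I[i][k] for k in idx]
             for i in idx]
    omega_inv = [[(I[i][k] - g1 * F[i][k] - g2 * G[i][k] + g3 * H[i][k]) / s_eps
                  for k in idx] for i in idx]

    # Plain triple-loop matrix product; fine for the small panels used here.
    product = [[sum(omega[i][m] * omega_inv[m][k] for m in idx) for k in idx]
               for i in idx]
    return product == I


print(check_inverse(3, 4, Fraction(2), Fraction(1, 2), Fraction(3)))
```

Using `Fraction` rather than floats means the comparison with the identity matrix is exact, so a sign error in any of the $\gamma$ terms shows up as `False` rather than as a small numerical discrepancy.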

The formulation in equation (5) shows that the sample error covariance matrix will provide unbiased and consistent estimates if both $N \to \infty$ and $T \to \infty$ . In this note, I am not going to worry about which estimator of the parameters is the most efficient. Next, I want to decompose the error covariance matrix into within, between and idiosyncratic components. To do this I need $1$ last piece of notation:

(6) $\begin{align*} Q \ &= \ I \ - \ \frac{F}{T} \ - \ \frac{G}{N} \ + \ \frac{H}{N \cdot T} \end{align*}$

Think about this as an orthogonal decomposition of a unitary error covariance matrix into each of the $3$ components: within, between and idiosyncratic. Then, using this term, Amemiya (1971) proposes the following estimators for the parameter vector $\begin{bmatrix} \sigma_\mu^2 & \sigma_\lambda^2 & \sigma_\varepsilon^2 \end{bmatrix}$ :

(7) $\begin{align*} \begin{split} \hat{\mathcal{U}} \ &= \ Y \ - \ \langle X \mid \hat{\beta} \rangle \\ \hat{\sigma}_{\varepsilon}^2 \ &= \ \frac{\langle \hat{\mathcal{U}} \mid \langle Q \mid \hat{\mathcal{U}} \rangle \rangle}{(N-1) \cdot (T-1)} \\ \hat{\sigma}_{\mu}^2 \ &= \ \frac{\langle \hat{\mathcal{U}} \mid \langle \frac{T-1}{T} \cdot F - \frac{T-1}{N \cdot T} \cdot H - Q \mid \hat{\mathcal{U}} \rangle \rangle}{T \cdot (N-1) \cdot (T-1)} \\ \hat{\sigma}_{\lambda}^2 \ &= \ \frac{\langle \hat{\mathcal{U}} \mid \langle \frac{N-1}{N} \cdot G - \frac{N-1}{N \cdot T} \cdot H - Q \mid \hat{\mathcal{U}} \rangle \rangle}{N \cdot (N-1) \cdot (T-1)} \end{split} \end{align*}$
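One way to see why these quadratic forms recover the variance terms is to compute their expectations directly. If the residuals are replaced with the true errors, then the expectation of $\langle \mathcal{U} \mid \langle P \mid \mathcal{U} \rangle \rangle$ is the trace of $P \cdot \Omega$ . The sketch below evaluates these traces exactly for a small panel. The function name `expected_estimates` is hypothetical, $H$ is taken to be the $(N \cdot T) \times (N \cdot T)$ matrix of ones, and the time component is scaled by $N \cdot (N-1) \cdot (T-1)$ while the panel component is scaled by $T \cdot (N-1) \cdot (T-1)$ .

```python
from fractions import Fraction


def expected_estimates(N, T, s_mu, s_lam, s_eps):
    """Return (E[sigma_mu^2 hat], E[sigma_lambda^2 hat], E[sigma_eps^2 hat]).

    Assumes the true errors are used in place of residuals, so each
    expectation equals trace(P * Omega) over its scaling. Requires N, T >= 2.
    """
    s_mu, s_lam, s_eps = Fraction(s_mu), Fraction(s_lam), Fraction(s_eps)
    n = N * T
    idx = range(n)
    # Assumption: observations are stacked panel by panel.
    I = [[int(i == k) for k in idx] for i in idx]
    F = [[int(i // T == k // T) for k in idx] for i in idx]  # same panel
    G = [[int(i % T == k % T) for k in idx] for i in idx]    # same period

    omega = [[s_mu * F[i][k] + s_lam * G[i][k] + s_eps * I[i][k] for k in idx]
             for i in idx]
    # Q = I - F/T - G/N + H/(N*T), where every entry of H equals one.
    Q = [[I[i][k] - Fraction(F[i][k], T) - Fraction(G[i][k], N) + Fraction(1, N * T)
          for k in idx] for i in idx]
    P_mu = [[Fraction(T - 1, T) * F[i][k] - Fraction(T - 1, N * T) - Q[i][k]
             for k in idx] for i in idx]
    P_lam = [[Fraction(N - 1, N) * G[i][k] - Fraction(N - 1, N * T) - Q[i][k]
              for k in idx] for i in idx]

    def trace_with_omega(P):
        return sum(P[i][m] * omega[m][i] for i in idx for m in idx)

    dof = (N - 1) * (T - 1)
    return (trace_with_omega(P_mu) / (T * dof),
            trace_with_omega(P_lam) / (N * dof),
            trace_with_omega(Q) / dof)


print(expected_estimates(3, 4, Fraction(2), Fraction(1, 2), Fraction(3)))
```

With these scalings each expectation returns the variance it is meant to estimate, which is the sense in which the estimators are unbiased when applied to the true errors.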

Recurrence in 1D, 2D and 3D Brownian Motion

June 26, 2011 by Alex

Introduction

I show that Brownian motion is recurrent for dimensions $d=1$ and $d=2$ but transient for dimensions $d \geq 3$ . Below, I give the technical definition of a recurrent stochastic process:

Definition: (Recurrent Stochastic Process) Let $X(t)$ be a stochastic process. We say that $X(t)$ is recurrent if for any $\varepsilon > 0$ and any point $\bar{x} \in \mathtt{Dom}(X)$ we have that:

(1) $\begin{align*} \infty \ &= \ \int_0^\infty \ \mathtt{Pr} \left[ \ \left\Vert X(t) - \bar{x} \right\Vert < \varepsilon \mid X(0) = \bar{x} \ \right] \cdot dt \end{align*}$

In words, this definition says that if the stochastic process $X(t)$ starts out at a point $\bar{x}$ , then if we watch the process forever it will return again and again to within some tiny region of $\bar{x}$ an infinite number of times.

Motivating Example

Before I go about proving that Brownian motion is recurrent or transient in different dimensions, I first want to nail down the intuition of what it means for a stochastic process to be recurrent in a more physical sense. To do this, I use the standard real world example for random walks: a drunk leaving a bar.

Arnold’s lattice world for the case of 2 dimensions.

Example: (A Drunkard’s Flight) Suppose that Arnold is drunk and leaving his local bar. What’s more, Arnold is really inebriated and can only muster enough coordination to move $1$ step backwards or $1$ step forward each second. Because he is so drunk, he doesn’t have any control over which direction he stumbles, so you can think about him moving backwards and forwards each second with equal probability $\pi = 1/2$ . Thus, Arnold’s position relative to the door of the bar is a stochastic process with independent $\pm 1$ increments. This process is recurrent if Arnold returns to the bar an infinite number of times as we allow him to stumble around all night. Put differently, if Arnold ever has a last drink for the evening and exits the bar for good, then his stumbling process will be transient.

In the context of this toy example, I show that as I allow Arnold to stumble in more and more different directions (backwards vs. forwards, left vs. right, up vs. down, etc…), his probability of returning to the bar decreases. Namely, if Arnold can only move backwards and forwards, then his stumbling will lead him back to his bar an infinite number of times. If he can move backwards and forwards as well as left and right, he will still wander back to the bar an infinite number of times. However, if Arnold either suddenly grows wings (i.e., can move up or down) or happens to be the Terminator (i.e., can time travel to the future or past), at some point his wandering will lead him away from the bar forever.
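To get a feel for this claim before proving anything, here is a small Monte Carlo sketch of Arnold’s stumbling. The function name `fraction_returning`, the step and walk counts, and the seed are all assumptions chosen for illustration. A finite simulation can only show the fraction of walks that return within a fixed number of steps, so it illustrates the ordering across dimensions rather than proving recurrence.

```python
import random


def fraction_returning(d, n_steps, n_walks, seed=0):
    """Fraction of simulated d-dimensional lattice walks that revisit the origin
    within n_steps steps."""
    rng = random.Random(seed)
    returned = 0
    for _ in range(n_walks):
        pos = [0] * d
        for _ in range(n_steps):
            # Pick one axis, then stumble one step in either direction along it.
            axis = rng.randrange(d)
            pos[axis] += rng.choice((-1, 1))
            if not any(pos):  # every coordinate is zero: Arnold is back at the bar
                returned += 1
                break
    return returned / n_walks


for d in (1, 2, 3):
    print(d, fraction_returning(d, n_steps=200, n_walks=2000))
```

Raising `n_steps` should push the fractions for $d=1$ and $d=2$ towards $1$ (very slowly in the $2$ -dimensional case), while the fraction for $d=3$ should level off well below $1$ .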

Outline

First, I state and prove Polya’s Theorem which characterizes whether or not a random walk on a lattice is recurrent in each dimension $d=1,2,3\ldots$ . Then, I show how to extend this result to continuous time Brownian motion using the Central Limit Theorem. I attack this recurrence result for continuous time Brownian motion via Polya’s Recurrence Theorem because I think the intuition is much clearer along this route. I find the direct proof in continuous time which relies on Dynkin’s lemma a bit obscure; whereas, I have a very good feel for what it means to count paths (i.e., possible random walk trajectories) on a grid.

Polya’s Recurrence Theorem

Below, I formulate and prove Polya’s Recurrence Theorem for dimensions $d \in \{1,2,3\}$ .

Theorem: (Polya Recurrence Theorem) Let $p(d)$ be the probability that a random walk on a $d$ dimensional lattice ever returns to the origin. Then, we have that $p(1)=p(2) = 1$ while $p(3) < 1$ .

Intuition

Before I go any further into the maths, I walk through the physical intuition behind the result. First, imagine the case where drunk Arnold can only move forwards and backwards. In order for Arnold to return to the bar door in $2 \cdot s$ steps¹, he must take the exact same number of forward and backwards steps; i.e., he has to choose a sequence of $2 \cdot s$ steps such that exactly $s$ of them are forward. There are $2 \cdot s$ choose $s$ ways he could do this:

(2) $\begin{align*} \mathtt{\# \ returning \ paths} \ &= \ \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \end{align*}$

What’s more, I know that the probability of each of the paths Arnold could take is just $1$ divided by the total number of paths $2^{2 \cdot s}$ :

(3) $\begin{align*} \mathtt{Pr[each \ path]} \ &= \ \frac{1}{2^{2 \cdot s}} \end{align*}$

Now consider drunk Arnold’s situation in $2$ -dimensions. Here, he must take the exact same number of steps forward and backwards as well as the exact same number of steps left and right. Thus, there are $2 \cdot s$ choose $(k,k,s-k,s-k)$ ways for Arnold to return to the bar:

(4) $\begin{align*} \mathtt{\# \ returning \ paths} \ &= \ \sum_{k=0}^s \ \begin{pmatrix} 2 \cdot s \\ k,k,(s-k),(s-k) \end{pmatrix} \end{align*}$

What is this sum computing in words? First, suppose that Arnold takes no steps in the left or right directions, then set $k=0$ and the number of paths he could take back to the bar is equal to the number in the $1$ -dimensional case. Conversely, if Arnold takes no steps forwards or backwards, set $k=s$ and again you get the $1$ -dimensional case. Thus, the number of possible paths Arnold can take back to the bar in $2$ -dimensions is strictly larger than in $1$ -dimension. However, Arnold can also take paths which move along both axes. For each $k$ , the sum counts the number of ways he can arrange $k$ steps left and $k$ steps right so that he ends up back at his starting point in the left or right directions. Then, it takes the remaining $2 \cdot (s-k)$ steps, and counts the number of ways he can use those steps to return to the starting point in the forwards and backwards direction.
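The counting formulas in equations (2) and (4) can be checked by brute force for short walks. The sketch below enumerates every path of length $2 \cdot s$ on the lattice and counts the ones that end at the origin; the function names are hypothetical and the enumeration is only practical for small $s$ , since the number of paths grows like $(2 \cdot d)^{2 \cdot s}$ .

```python
from itertools import product
from math import comb, factorial


def brute_force_returning(d, s):
    """Count lattice paths of length 2*s in d dimensions that end at the origin."""
    steps = []
    for axis in range(d):
        for sign in (-1, 1):
            step = [0] * d
            step[axis] = sign
            steps.append(tuple(step))
    count = 0
    for path in product(steps, repeat=2 * s):
        # A path returns if its displacement along every axis nets out to zero.
        if all(sum(step[a] for step in path) == 0 for a in range(d)):
            count += 1
    return count


def formula_returning_1d(s):
    """Equation (2): 2*s choose s."""
    return comb(2 * s, s)


def formula_returning_2d(s):
    """Equation (4): sum over k of the multinomial coefficient (2s; k, k, s-k, s-k)."""
    return sum(factorial(2 * s) // (factorial(k) ** 2 * factorial(s - k) ** 2)
               for k in range(s + 1))


for s in (1, 2, 3):
    print(s, brute_force_returning(1, s), formula_returning_1d(s),
          brute_force_returning(2, s), formula_returning_2d(s))
```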

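Note that this process doesn’t add that many new returning paths for each new dimension relative to the total number of paths. With $m = 2 \cdot d$ possible directions and $n = 2 \cdot s$ steps, the multinomial theorem says that the total number of paths is $m^n$ , and the returning paths are only those terms in the sum with matched counts in opposite directions: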

(5) $\begin{align*} m^n \ &= \ \sum_{k_1 + k_2 + \ldots + k_m = n} \ \begin{pmatrix} n \\ k_1, k_2, \ldots, k_m \end{pmatrix} \end{align*}$

However, each path only happens with probability $4^{- 2 \cdot s}$ in $2$ dimensions. In general, the probability of realizing each possible path of length $2 \cdot s$ shrinks geometrically as I add dimensions:

(6) $\begin{align*} \mathtt{Pr[each \ path]}(d) \ &= \ \left(\frac{1}{2 \cdot d}\right)^{2 \cdot s} \end{align*}$

Thus, Polya’s Recurrence Theorem stems from the fact that the number of possible paths back to the origin is growing at a rate that is less than the growth rate of the number of all paths; i.e., the wilderness of paths that do not loop back to the origin is increasing faster than the set of paths which do loop back as we add dimensions.

Proof

Below, I prove this result $1$ dimension at a time:

Proof: ( $d=1$ ) The probability that Arnold will return to the origin in $2 \cdot s$ steps is the number of possible paths times the probability that each $1$ of those paths occurs:

(7) $\begin{align*} p_{2 \cdot s}(1) \ &= \ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \end{align*}$

Next, in order to derive an analytical characterization of this probability, I use Stirling’s approximation to handle the factorial terms in the binomial coefficient:

(8) $\begin{align*} s! \ &\approx \ \sqrt{2 \cdot \pi \cdot s} \cdot e^{-s} \cdot s^s \end{align*}$

Using this approximation and simplifying, I find that:

(9) $\begin{align*} \begin{split} p_{2 \cdot s}(1) \ &= \ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \frac{(2 \cdot s)!}{s! \cdot (2 \cdot s - s)!} \\ &\approx \ \frac{1}{(\pi \cdot s)^{1/2}} \end{split} \end{align*}$
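To see how good the Stirling approximation in equation (9) is, the following sketch compares the exact value of $p_{2 \cdot s}(1)$ with $1 / (\pi \cdot s)^{1/2}$ . The function names are hypothetical.

```python
from math import comb, pi, sqrt


def exact_return_prob_1d(s):
    """Exact p_{2s}(1) = (1/2)^(2s) * (2s choose s)."""
    # Python integers are arbitrary precision, so this ratio is formed exactly
    # before being rounded to a float.
    return comb(2 * s, s) / 4 ** s


def stirling_return_prob_1d(s):
    """Stirling approximation 1 / sqrt(pi * s)."""
    return 1 / sqrt(pi * s)


for s in (1, 10, 100, 1000):
    print(s, exact_return_prob_1d(s), stirling_return_prob_1d(s))
```

The ratio of the two values approaches $1$ as $s$ grows, which is all the divergence argument needs.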

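Thus, if I sum over all possible path lengths, I get the expected number of times that drunk Arnold will return to the bar for another night cap. The number of returns is geometrically distributed with mean $p(1)/(1 - p(1))$ , so the expected number of returns is infinite exactly when the return probability is $p(1) = 1$ . I find that this infinite sum diverges:

(10) $\begin{align*} \begin{split} \mathtt{E[\# \ returns]}(1) \ &= \ \sum_{s=1}^\infty \ p_{2 \cdot s}(1) \\ &\approx \ \sum_{s=1}^\infty \ \frac{1}{(\pi \cdot s)^{1/2}} \\ &= \ \infty \end{split} \end{align*}$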

Proof: ( $d=2$ ) Next, I follow all of the same steps through for the $d=2$ dimensional case:

(11) $\begin{align*} \begin{split} p_{2 \cdot s}(2) \ &= \ \left( \frac{1}{4} \right)^{2 \cdot s} \cdot \sum_{k=0}^s \ \begin{pmatrix} 2 \cdot s \\ k,k,(s-k),(s-k) \end{pmatrix} \\ &= \ \left( \frac{1}{4} \right)^{2 \cdot s} \cdot \sum_{k=0}^s \ \frac{(2 \cdot s)!}{k! \cdot k! \cdot (s - k)! \cdot (s - k)!} \\ &= \ \left( \frac{1}{4} \right)^{2 \cdot s} \cdot \sum_{k=0}^s \ \frac{(2 \cdot s)!}{s! \cdot s!} \cdot \frac{s! \cdot s!}{k! \cdot k! \cdot (s - k)! \cdot (s - k)!} \\ &= \ \left( \frac{1}{4} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \cdot \sum_{k=0}^s \ \begin{pmatrix} s \\ k \end{pmatrix}^2 \\ &= \ \left[ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \right]^2 \\ &= \ \left[ p_{2 \cdot s}(1) \right]^2 \end{split} \end{align*}$
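The last two steps of this derivation use the identity $\sum_{k=0}^s \binom{s}{k}^2 = \binom{2 \cdot s}{s}$ . As a quick check, the sketch below computes $p_{2 \cdot s}(2)$ from the multinomial sum and compares it with $[p_{2 \cdot s}(1)]^2$ in exact rational arithmetic. The function names `p1` and `p2` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def p1(s):
    """Exact p_{2s}(1) as a fraction."""
    return Fraction(comb(2 * s, s), 4 ** s)


def p2(s):
    """Exact p_{2s}(2) from the multinomial sum in the first line of the derivation."""
    paths = sum(factorial(2 * s) // (factorial(k) ** 2 * factorial(s - k) ** 2)
                for k in range(s + 1))
    return Fraction(paths, 16 ** s)  # (1/4)^(2s) = 1 / 16^s


print(all(p2(s) == p1(s) ** 2 for s in range(30)))
```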

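Summing over all possible path lengths again yields a divergent series for the expected number of returns, so Arnold returns to the bar with probability $p(2) = 1$ :

(12) $\begin{align*} \begin{split} \mathtt{E[\# \ returns]}(2) \ &= \sum_{s=1}^\infty \ p_{2 \cdot s}(2) \\ &\approx \ \sum_{s=1}^\infty \ \frac{1}{\pi \cdot s} \\ &= \ \infty \end{split} \end{align*}$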

Proof: ( $d=3$ ) The result for $d=3$ is a bit more complicated as there isn’t a nice closed form expression for each of the $p_{2 \cdot s}(3)$ terms. I start by simplifying as far as I can:

(13) $\begin{align*} \begin{split} p_{2 \cdot s}(3) \ &= \ \left( \frac{1}{6} \right)^{2 \cdot s} \cdot \sum_{j,k \mid j+k \leq s} \ \begin{pmatrix} 2 \cdot s \\ k,k, j,j, (s-k-j),(s-k-j) \end{pmatrix} \\ &= \ \left( \frac{1}{6} \right)^{2 \cdot s} \cdot \sum_{j,k \mid j+k \leq s} \ \frac{(2 \cdot s)!}{k! \cdot k! \cdot j! \cdot j! \cdot (s-j-k)! \cdot (s-j-k)!} \\ &= \ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \cdot \sum_{j,k \mid j+k \leq s} \ \left( \frac{1}{3^s} \cdot \frac{s!}{k! \cdot j! \cdot (s-j-k)!} \right)^2 \end{split} \end{align*}$

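Next, I apply the Multinomial Theorem, which implies that the terms $\frac{1}{3^s} \cdot \frac{s!}{k! \cdot j! \cdot (s-j-k)!}$ sum to $1$ , and note that the largest of these terms occurs when $j=k=s/3$ . Since a sum of squares of non-negative terms is at most the largest term times the sum of the terms, substituting in this value gives an upper bound on the probability $p_{2 \cdot s}(3)$ :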

(14) $\begin{align*} \begin{split} p_{2 \cdot s}(3) \ &\leq \ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \cdot \left( \frac{1}{3^s} \cdot \frac{s!}{\left[ \left( \frac{s}{3} \right)! \right]^3} \right) \\ &\leq \ \frac{C}{(\pi \cdot s)^{3/2}} \end{split} \end{align*}$

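Summing over all possible path lengths leads to a convergent series for the expected number of returns. A finite expected number of returns means that the return probability is $p(3) < 1$ , so I know that Arnold may have a final drink at some point during the evening:

(15) $\begin{align*} \begin{split} \mathtt{E[\# \ returns]}(3) \ &= \ \sum_{s=1}^\infty \ p_{2 \cdot s}(3) \\ &< \ \infty \end{split} \end{align*}$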
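Since there is no closed form for $p_{2 \cdot s}(3)$ , it helps to compute the terms directly. The sketch below evaluates $p_{2 \cdot s}(3)$ exactly from the multinomial sum and compares partial sums of the expected number of returns in $1$ and $3$ dimensions. The function names are hypothetical, and a partial sum can only suggest convergence rather than prove it.

```python
from math import comb, factorial


def p1(s):
    """Exact p_{2s}(1)."""
    return comb(2 * s, s) / 4 ** s


def p3(s):
    """Exact p_{2s}(3) from the last line of the simplification above."""
    total = 0
    for k in range(s + 1):
        for j in range(s - k + 1):
            m = factorial(s) // (factorial(k) * factorial(j) * factorial(s - j - k))
            total += m * m
    # (1/2)^(2s) * (1/3^s)^2 = 1 / 36^s
    return comb(2 * s, s) * total / 36 ** s


def expected_returns(prob, max_s):
    """Partial sum of the return probabilities over path lengths 2, 4, ..., 2*max_s."""
    return sum(prob(s) for s in range(1, max_s + 1))


for max_s in (10, 20, 40):
    print(max_s, expected_returns(p1, max_s), expected_returns(p3, max_s))
```

The $1$ -dimensional partial sums keep growing roughly like $\sqrt{s}$ , while the $3$ -dimensional partial sums flatten out, which is the numerical counterpart of the $s^{-3/2}$ bound.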

Extension to Brownian Motion

Below, I define Brownian motion in $d>1$ dimensions and then show how to extend the results from Polya’s Recurrence Theorem from random walks on a lattice to continuous time Brownian motion.

Brownian motion for $d>1$ dimensions is a natural extension of the $d=1$ dimensional case. I give the formal definition below:

Definition: (Multi-Dimensional Brownian Motion) Brownian motion in $\mathbb{R}^d$ is the vector valued process whose components $B_1(t), \ldots, B_d(t)$ are independent $1$ -dimensional Brownian motions:

(16) $\begin{align*} \mathbf{B}(t) \ &= \ \begin{bmatrix} B_1(t) & B_2(t) & \ldots & B_d(t) \end{bmatrix} \end{align*}$

To extend Polya’s Recurrence Theorem to continuous time Brownian motion, I just need to apply the Central Limit Theorem and then construct the Brownian motion from the resulting independent Gaussian increments:

Theorem: (de Moivre-Laplace) Let $k_s$ be the number of successful draws from a binomial distribution with success probability $\pi$ in $s$ tries, so that $\mathbb{E}(k_s) = s \cdot \pi$ . Then, we can approximate the binomial distribution with the Gaussian distribution, with the approximation becoming exact as $s \to \infty$ :

(17) $\begin{align*} \mathtt{Bin}(s,\pi) \ &\sim \ \mathtt{Norm}\left(s \cdot \pi, \sqrt{s \cdot \pi \cdot (1-\pi)}\right) \end{align*}$
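The quality of this approximation is easy to inspect numerically. The sketch below compares the binomial probability mass function with the Gaussian density that has mean $s \cdot \pi$ and standard deviation $\sqrt{s \cdot \pi \cdot (1-\pi)}$ . The function names are hypothetical, and the float arithmetic limits the sketch to moderate values of $s$ (up to roughly $1000$ ).

```python
from math import comb, exp, pi, sqrt


def binomial_pmf(k, s, prob):
    """Probability of exactly k successes in s tries."""
    return comb(s, k) * prob ** k * (1 - prob) ** (s - k)


def normal_density(x, mean, sd):
    """Gaussian density with the given mean and standard deviation."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def max_gap(s, prob):
    """Largest absolute gap between the binomial pmf and its Gaussian approximation."""
    mean, sd = s * prob, sqrt(s * prob * (1 - prob))
    return max(abs(binomial_pmf(k, s, prob) - normal_density(k, mean, sd))
               for k in range(s + 1))


for s in (25, 100, 400):
    print(s, max_gap(s, 0.5))
```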

Lemma: (Levy’s Selector) Suppose that $s<t$ and $X(s)$ and $X(t)$ are random variables defined on the same sample space such that $X(t) - X(s)$ has a distribution which is $\mathtt{Norm}(0,t-s)$ . Then there exists a random variable $X(\frac{t+s}{2})$ such that $X(\frac{t+s}{2}) - X(s)$ and $X(t) - X(\frac{t+s}{2})$ are independent with a common $\mathtt{Norm}(0,\frac{t-s}{2})$ distribution.
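The lemma suggests a constructive recipe: start with the endpoints of an interval and keep filling in midpoints. The sketch below applies this midpoint refinement on $[0,1]$ , drawing each midpoint as the average of its two neighbors plus independent Gaussian noise with variance $(t-s)/4$ , which makes the two half-increments independent with variance $(t-s)/2$ . The function name `levy_midpoint_path`, the unit time interval and the seed are assumptions for illustration.

```python
import random
from math import sqrt


def levy_midpoint_path(levels, seed=0):
    """Sample B(t) on the dyadic grid t = k / 2**levels in [0, 1]."""
    rng = random.Random(seed)
    path = [0.0, rng.gauss(0.0, 1.0)]  # B(0) = 0 and B(1) ~ Norm(0, 1)
    gap = 1.0                          # current spacing t - s between grid points
    for _ in range(levels):
        refined = [path[0]]
        for left, right in zip(path, path[1:]):
            # Midpoint = average of the endpoints plus Norm(0, gap / 4) noise.
            mid = 0.5 * (left + right) + rng.gauss(0.0, sqrt(gap / 4))
            refined.extend([mid, right])
        path, gap = refined, gap / 2
    return path


path = levy_midpoint_path(10)
quadratic_variation = sum((b - a) ** 2 for a, b in zip(path, path[1:]))
print(len(path), quadratic_variation)
```

The sum of squared increments should be close to the length of the time interval, which is a quick diagnostic that the increments have the right variance.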

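¹ Sanity Check: Why $2 \cdot s$ and not just $s$ here? Arnold can only be back at the bar door after an even number of steps, since every forward step has to be offset by a backward step.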

Hong and Stein (1999)

June 24, 2011 by Alex

1. Introduction

I replicate the main results from Hong and Stein (1999) which constructs an equilibrium model with under-reaction and momentum. First, I give a rough verbal explanation of the model’s results. Then, I outline the basic mathematical framework and work through the equilibrium concept. Finally, I simulate the equilibrium outcomes for different momentum trader horizons and information speeds.

This paper develops an interesting model which endogenizes the frequency and amplitude of price fluctuations. I work through this paper to better understand the nuts and bolts of this equilibrium concept. Perhaps I might be able to use these statistical wave-like properties to identify and discriminate between different mis-pricing generators.

2. Simple Example

The basic idea behind the model is as follows. Suppose that you have a bunch of traders that receive a demand shock, but only respond slowly. For example, imagine that a bunch of people earn a windfall payment (e.g., win the lottery or find out about a long lost rich uncle) and decide to buy new houses. It would take them a while to search for the appropriate house that fits their exact needs. For instance, perhaps one family needs to be in a nice school district, another needs to be near the airport for frequent trips, and so on… These guys represent slow moving information or demand. However, though it would take time for each of the people to purchase their new home, anyone who knew about the windfall payments would know that the demand for expensive houses was going to jump up in the future.

Now, suppose that no one knows that the windfall payments have already occurred; but there is, instead, a group of traders that know a windfall payment might occur at any time. It could have been today. It could have been yesterday. It might actually happen in a week. Yet, while this group doesn’t know when the payment has been made, each of the agents can infer the likelihood from the price movements. If the price drifts up, then it is more likely that helicopters have dumped the cash. A trader who acts on these price movements is a momentum trader.

There are 2 additional quirks: 1) The informed traders don’t realize that other people have also received windfall payments. 2) Momentum traders enter sequentially and are very simple minded. They don’t know how many of their own kind there are. They don’t meet anyone for lunch to discuss the best ways to back out whether or not there has been a windfall payment. All they do is make their best guess based on the price growth over the past 6 months. That’s it.

What happens now? The first couple of momentum traders that walk into the market will see a price jump after the windfall actually occurs and trade into it. However, the next few momentum traders will see the price growth induced by the earlier momentum traders and get very excited and trade into the asset even further. These momentum traders are responding to a price movement that was solely generated by other momentum traders. This pattern repeats itself until the bottom falls out. So, it is as if the later momentum traders pay a tax for being late. Early price movements accelerate, then overshoot their fundamental value and collapse.

3. Economic Framework

First, let’s consider a world with only a unit mass of naive, but informed traders. Agents live in a discrete time $T$ period world where $T$ is large. There is a riskless asset with a $0$ return as well as a risky asset in positive net supply $q$ which pays out a dividend $d_T$ at time $T$ . The asset has an expected dividend $d_t$ and price $p_t$ at each interim period $t$ . These traders all start out with the same information and the same endowment of the riskless asset $m_0$ and risky asset $q_0$ . Each trader $i$ has CARA utility over consumption at time $T$ :

(1) $\begin{equation*} U(C^i) \ = \ - \mathbb{E} \left[ \ e^{-\rho C^i} \ \right] \end{equation*}$

The traders are informed because they receive a series of signals $\varepsilon_t$ about the size of the dividend $d_T$ at each point in time $t$ . However, these signals move slowly throughout the population. Specifically, suppose that there are $z$ different flavors of traders of equal size. Each flavor of traders sees a different, independent component of the signal $\varepsilon_t$ at each point in time. So, for example, at time $t$ , traders of type $z=1$ see the component $\varepsilon^1_{t+z-1}$ of the shock $\varepsilon_{t+z-1}$ . At time $t+1$ , type $z=1$ traders see the second component of the shock $\varepsilon_{t+z-1}$ as well as the first component of the shock $\varepsilon_{t+z}$ . Likewise, at time $t$ , traders of type $z=2$ see the component $\varepsilon^2_{t+z-1}$ of the shock $\varepsilon_{t+z-1}$ and so on.

Thus, the traders of each flavor rotate which component of the shock they see until they have seen all $z$ independent components and know the full shock. This information rotation structure means that, after $\tau \leq z$ periods since time $t$ , agents have seen a fraction $\tau / z$ of the total signals available for the shock $\varepsilon_{t+z-1}$ .

Information rotation structure in Hong and Stein (1999).

Traders are naive because they do not condition on the observed price when they formulate their expectations. Traders see their components of the shock at each point in time, update their beliefs about the future value of the dividend, and then place their order believing that they will adopt a buy and hold until date $T$ strategy, but they do not impound the market clearing price into their information set.

Equilibrium Concept: Walrasian equilibrium with private valuations.

4. No Momentum Traders

To solve the model, I follow the same general strategy as in a Grossman and Stiglitz (1980) equilibrium, but I give the traders naive rather than rational expectations. First, I write out the optimization problem for each naive, informed trader $i$ . I assume that $z=2$ and solve a $1$ period problem, but the solutions below easily generalize to $z>2$ and multiple periods.

Each trader maximizes his consumption utility by choosing his asset holdings subject to a budget constraint where $m^i$ represents his riskless asset holdings and $q^i$ represents his risky asset holdings:

(2) $\begin{align*} V^i &= \max_{m^i, \, q^i} \left\{ -\mathbb{E} \left[ e^{-\rho w^i} \right] \right\} \\ &\textit{subject to} \\ m^i + \tilde{p} q^i &\leq m_0 + \tilde{p}_0 q_0 \end{align*}$

I assume that each trader has a unit mass of wealth. After substituting in the budget constraint, I get:

(3) $\begin{align*} V^i &= \max_{q^i} \left\{ -\mathbb{E}^i \left[ e^{-\rho \left( (d-\tilde{p}) q^i \right)} \right] \right\} \end{align*}$

The first order condition with respect to $q^i$ characterizes the risky holdings as follows:

(4) $\begin{align*} q^i &= \frac{\mathbb{E}^i\left[ d-\tilde{p} \right]}{\rho\mathbb{V}^i\left[ d-\tilde{p} \right]} \end{align*}$
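This first order condition is the usual mean-variance demand. As a rough check, the sketch below assumes that the payoff $d - \tilde{p}$ is normally distributed (an assumption of this illustration), so that the expected CARA utility has a closed form via the normal moment generating function, and then confirms by grid search that the maximizer matches $\mathbb{E}^i[d-\tilde{p}] / (\rho \cdot \mathbb{V}^i[d-\tilde{p}])$ . The function names and the grid bounds are hypothetical.

```python
from math import exp


def cara_expected_utility(q, mean, var, rho):
    """E[-exp(-rho * q * x)] for x ~ Norm(mean, var), via the normal MGF."""
    return -exp(-rho * q * mean + 0.5 * rho ** 2 * q ** 2 * var)


def grid_argmax(mean, var, rho, lo=-5.0, hi=5.0, n=10001):
    """Holding q on an evenly spaced grid that maximizes expected utility."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return max(grid, key=lambda q: cara_expected_utility(q, mean, var, rho))


mean, var, rho = 0.4, 2.0, 0.5
print(grid_argmax(mean, var, rho), mean / (rho * var))
```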

Next, I guess that price is linear in the public information, the private signal about tomorrow’s information and the total quantity. $\varepsilon_0$ denotes the public signal available to traders of both flavors. $\varepsilon_1$ denotes the sum of the private signals for each type of trader.

(5) $\begin{align*} \tilde{p} &= \alpha + \beta \varepsilon_0 + \gamma \varepsilon_1 - \delta Q \end{align*}$

I solve for $\tilde{p}$ by substituting the function for $q^i$ into the market clearing condition. Since both flavors of agents are symmetric and ignorant of the information in the prices themselves, the price functional simplifies to:

(6) $\begin{align*} \tilde{p} &= \left( \varepsilon_0 + \dfrac{1}{2} \varepsilon_1 \right) - \delta Q \end{align*}$

$\delta$ is a function of the risk aversion parameter $\rho$ as well as the variance $\sigma_\varepsilon^2$ . I pick the risk aversion parameter in order to set $\delta=1$ for simplicity.

Price, dividend, return and signal series from Hong and Stein (1999) with no momentum traders.

5. With Momentum Traders

Now, I add in momentum traders. In order to do this, I allow the naive, informed traders to believe that the risky asset supply is a random variable. Thus, they remain blissfully unaware that there are momentum traders at all. Momentum traders also have CARA utility but, rather than living until date $T$ , these traders have shorter term horizons and die out at date $t+j$ if they enter at date $t$ . Momentum traders earn their name because, rather than observing the sequence of dividend shocks like the naive informed traders above, momentum traders update their beliefs solely on past price movements: $\Delta_k p_{t-1}=p_{t-1} - p_{t-k-1}$ .

For simplicity, I pick $k=1$ below. I conjecture that the momentum traders’ demand is a linear function of past price growth:

(7) $\begin{align*} f_t &= \theta + \phi \Delta p_{t-1} \end{align*}$

I denote price in the momentum regime as $p_t$ rather than $\tilde{p}_t$ . Informed traders solve the exact same problem as before, since they see the supply shock as a random variable rather than an informative signal. Now, momentum traders affect the quantity available. I can rewrite the pricing equation from above as:

(8) $\begin{align*} \begin{split} p_t &= \dfrac{1}{z}\left[z\varepsilon_t+(z-1)\varepsilon_{t+1}+ (z-2)\varepsilon_{t+2}+...+\varepsilon_{t + z-1}\right] \\ &\qquad \qquad - \left(Q-\left[\theta j + \phi \sum_{i=1}^j \Delta p_{t-i} \right] \right) \end{split} \end{align*}$

Thus, the price today reflects both the current knowledge of all of the naive informed traders as well as the myopic response of the momentum traders. The summation over $j$ comes into play since there are $j$ generations of momentum traders in the market at any given time. As is standard in models with CARA agents, the momentum traders choose $\phi$ according to the rule below where $\rho$ represents the momentum traders’ risk aversion parameter:

(9) $\begin{align*} \phi &= \frac{1}{\rho} \cdot \left( \frac{\mathbb{C}\left[ \Delta_j p_{t+j}, \Delta p_{t-1} \right]}{\mathbb{V}\left[\Delta p_{t-1}\right] \mathbb{V}\left[ \Delta_j p_{t+j} \right]} \right) \end{align*}$

An equilibrium is a price $p$ and a quantity demanded by the momentum traders $\phi$ such that both the pricing and mean variance equations above are satisfied. I solve for the equilibrium numerically in R.
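To make the pricing recursion in equation (8) concrete, here is a minimal Python sketch that iterates it forward for a given demand sensitivity $\phi$ . This is not the R code used for the plots. The function name `simulate_prices`, the treatment of price changes before the start of the sample as zero, and the choice to take $\phi$ as an input rather than solving the fixed point in equation (9) are all assumptions made for illustration.

```python
def simulate_prices(eps, z, j, phi, theta=0.0, Q=0.0):
    """Iterate equation (8) as written, with k = 1.

    eps   : list of shocks; eps[t + i] plays the role of epsilon_{t+i}
    z     : number of trader flavors (information diffusion horizon)
    j     : number of momentum trader generations in the market
    phi   : momentum demand sensitivity to past price changes
    theta : constant in the momentum demand function
    Q     : supply of the risky asset
    Assumption: price changes before the start of the sample are zero.
    """
    n = len(eps) - (z - 1)  # periods for which all z shocks are available
    prices = []
    for t in range(n):
        # Informed term: (1/z) * [z*eps_t + (z-1)*eps_{t+1} + ... + eps_{t+z-1}]
        informed = sum((z - i) * eps[t + i] for i in range(z)) / z
        # Momentum term: theta*j + phi * sum over i = 1..j of (p_{t-i} - p_{t-i-1})
        momentum = theta * j
        for i in range(1, j + 1):
            if t - i - 1 >= 0:
                momentum += phi * (prices[t - i] - prices[t - i - 1])
        prices.append(informed - (Q - momentum))
    return prices


shocks = [1.0, 2.0, 3.0, 4.0]
print(simulate_prices(shocks, z=2, j=1, phi=0.0))
print(simulate_prices(shocks, z=2, j=1, phi=0.5))
```

With $\phi = 0$ and $z = 2$ the recursion collapses to the no-momentum price in equation (6). A full equilibrium solver would wrap this recursion in a search over $\phi$ until equation (9) holds, which this sketch does not attempt.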

Price, dividend, return and signal series from Hong and Stein (1999) with momentum traders at the 20 period horizon.

6. Code

Click HERE to view the code used to create these plots.
