Research Notebook

Plotting Geographic Densities in R

July 11, 2011 by Alex

I show (here) how to create a heat map of the intensity of home purchases from 2000 to 2008 in Los Angeles County, CA using a random sample of 5,000 observations from the county deeds records. I build off of the code created by David Kahle for Hadley Wickham's ggplot2 case study competition. I use the results of the geocoding procedure that I outline here as the input data.
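The R code linked above does the actual plotting; as a rough sketch of the underlying computation, here is a Python analogue (mine, not the linked code) that bins geocoded points into a grid whose cell counts feed a heat map. The coordinates below are simulated stand-ins for the deeds records:

```python
import numpy as np

def purchase_density(lon, lat, bins=100):
    """Bin point coordinates into a 2D grid of purchase intensity.

    Returns the counts grid plus the bin edges, which is all a heat-map
    plotting layer (ggplot2's geom_tile, or matplotlib's imshow) needs.
    """
    counts, lon_edges, lat_edges = np.histogram2d(lon, lat, bins=bins)
    return counts, lon_edges, lat_edges

# Illustrative stand-in for the 5,000 geocoded deeds records.
rng = np.random.default_rng(0)
lon = rng.normal(-118.25, 0.25, size=5000)   # rough LA County longitudes
lat = rng.normal(34.05, 0.25, size=5000)     # rough LA County latitudes
counts, _, _ = purchase_density(lon, lat, bins=50)
```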

Filed Under: Uncategorized

How to Geocode Addresses Using the Yahoo! PlaceFinder API

July 11, 2011 by Alex

This post contains a link (here) to a Python program which geocodes a large number of addresses using the Yahoo! PlaceFinder API. This program manages both the use of the API IDs as well as which files have been completed. The code can also be easily parallelized. The code makes use of earlier work I had done in R to accomplish the same task.
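The linked program is not reproduced here, but the bookkeeping it describes, tracking which input files are finished so that several workers can run in parallel over disjoint file lists, can be sketched as follows. This is a hypothetical Python sketch: `geocode` is a stand-in for the actual Yahoo! PlaceFinder call, and the file layout is assumed:

```python
import json
import os

def geocode(address):
    # Stand-in for the real API call; the actual program queries the
    # Yahoo! PlaceFinder endpoint using an API ID.  Here we just fake a
    # (latitude, longitude) pair so the bookkeeping logic can run.
    return (0.0, 0.0)

def geocode_file(in_path, out_path, done_log="completed.txt"):
    """Geocode one file of addresses, skipping files already finished.

    Logging completed files is what lets several copies of the script
    run in parallel without repeating work.
    """
    done = set()
    if os.path.exists(done_log):
        with open(done_log) as f:
            done = set(line.strip() for line in f)
    if in_path in done:
        return False
    with open(in_path) as f_in, open(out_path, "w") as f_out:
        for line in f_in:
            address = line.strip()
            lat, lon = geocode(address)
            f_out.write(json.dumps({"address": address,
                                    "lat": lat, "lon": lon}) + "\n")
    with open(done_log, "a") as f:
        f.write(in_path + "\n")
    return True
```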


Random Effects Decomposition

June 27, 2011 by Alex

Motivation

I work through the error components econometric model outlined in Amemiya (1985). I use Hayashi (2000) as a reference text. I work through this example because I use this model in my working paper with Chris Mayer on bubble identification and I would like to  work out the details as I didn’t spend much time on these sorts of  models in my core econometrics courses.

In my paper with Chris, I develop a method of identifying relative  mispricings between city specific markets in the US residential  housing market using flows of speculative buyers between cities and  assuming that city sizes are exogenous. Previously, analysts  suspected that the housing bubble was due to credit supply  factors. I use a random effects model to gauge the relative  importance of 1) aggregate credit supply factors and 2)  cross-city speculator flows in explaining mis-pricing in the housing  market in our sample.

 

Econometric Framework

I characterize the random effects error components  estimator outlined in Amemiya (1985, Ch. 6). Consider a balanced panel with N panels and T observations per  panel. I study a regression specification of the following type:

(1)   \begin{align*} y_{n,t} \ &= \ \langle X_{n,t} \mid \beta \rangle \ + \ \mu_n \ + \ \lambda_t \ + \ \varepsilon_{n,t} \end{align*}

 

I can vectorize this specification by stacking each of these N  \times T equations:

(2)   \begin{align*} \begin{split} \mathcal{U} \ &= \ \langle I_N \otimes 1_T \mid \mu \rangle \ + \ \langle 1_N \otimes I_T \mid \lambda \rangle \ + \ \mathcal{E} \\ Y \ &= \ \langle X \mid \beta \rangle \ + \ \mathcal{U} \end{split} \end{align*}
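As a small numerical illustration of the stacking in (2), the two Kronecker-product dummy matrices can be built directly. This is a Python/NumPy sketch of my own, assuming observations are stacked panel-major (all T periods of panel 1, then panel 2, and so on):

```python
import numpy as np

N, T = 4, 3  # small balanced panel for illustration

# Dummy-variable matrices from the stacked specification (2): D_mu has
# a column per panel, D_lam a column per time period.
D_mu = np.kron(np.eye(N), np.ones((T, 1)))    # I_N (x) 1_T, shape (NT, N)
D_lam = np.kron(np.ones((N, 1)), np.eye(T))   # 1_N (x) I_T, shape (NT, T)

mu = np.arange(1, N + 1, dtype=float)     # panel effects
lam = np.arange(1, T + 1, dtype=float)    # time effects
u = D_mu @ mu + D_lam @ lam               # error components, no idiosyncratic part
```

Reshaping `u` back to an N-by-T array recovers exactly the additive structure of (1): entry (n, t) equals mu_n + lambda_t.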

 

Assumptions

I make the following assumptions about the shape of the errors:

Assumption: (Error Structure) I assume that:

1) Unbiased-ness: \langle \mu_n \rangle = 0, \langle \lambda_t \rangle = 0 and \langle \varepsilon_{n,t} \rangle = 0

2) White-Noise: \langle \mu_n \mid \lambda_t \rangle = 0,  \langle \lambda_t \mid \varepsilon_{n,t} \rangle = 0 and \langle \varepsilon_{n,t} \mid \mu_n \rangle = 0

3) Homoskedasticity: \vert \mu \rangle \langle \mu \vert = I_N \cdot \sigma^2_\mu, \vert \lambda \rangle \langle \lambda \vert = I_T \cdot \sigma^2_\lambda and \vert \varepsilon \rangle \langle \varepsilon \vert = I_{N \times T} \cdot \sigma^2_{\varepsilon}

 

What are the key take-aways from these assumptions? First,  assumption 1) means that there is a constant term in the  explanatory X variables. Assumption 2) is just the standard  white noise assumption. Assumption 3) is the key restriction. This  assumption says that the within and between effects are independent  across time and panels respectively. The estimator I define below  allows me to learn the values of \sigma_\mu^2, \sigma_\lambda^2  and \sigma_\varepsilon^2.

 

Estimation

How do I go about estimating these 3 objects? First, I define some notation to make my life a bit easier and stave off carpal tunnel for a few more semesters:

(3)   \begin{align*} \begin{split} F \ &= \ \vert I_N \otimes 1_T \rangle \langle I_N \otimes 1_T \vert \\ G \ &= \ \vert 1_N \otimes I_T \rangle \langle 1_N \otimes I_T \vert \end{split} \end{align*}

 

Also, let H \ = \ \vert 1_{N \times T} \rangle \langle 1_{N \times T} \vert be the (N \cdot T) \times (N \cdot T) matrix of ones. I name the error covariance matrix \Omega, and then characterize it as a linear function of the 3 variance terms of interest:

(4)   \begin{align*} \begin{split} \Omega \ &= \ \vert \mathcal{U} \rangle \langle \mathcal{U} \vert \\ &= \ \sigma_\mu^2 \cdot F \ + \ \sigma^2_\lambda \cdot G \ + \ \sigma_\varepsilon^2 \cdot I_{N \times T} \end{split} \end{align*}
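A quick Python/NumPy sketch (mine, not from the text) that builds F, G and \Omega for a small panel and illustrative variance values:

```python
import numpy as np

N, T = 4, 3
sigma_mu2, sigma_lam2, sigma_eps2 = 2.0, 1.5, 1.0  # illustrative variances

D_mu = np.kron(np.eye(N), np.ones((T, 1)))    # I_N (x) 1_T
D_lam = np.kron(np.ones((N, 1)), np.eye(T))   # 1_N (x) I_T

# F and G from (3) are the outer products of the dummy matrices, and
# Omega from (4) is a linear combination of F, G and the identity.
F = D_mu @ D_mu.T
G = D_lam @ D_lam.T
Omega = sigma_mu2 * F + sigma_lam2 * G + sigma_eps2 * np.eye(N * T)
```

The spectrum of \Omega makes the error-components structure visible: the smallest eigenvalue is \sigma_\varepsilon^2 and the largest (on the grand-mean direction) is \sigma_\varepsilon^2 + T \sigma_\mu^2 + N \sigma_\lambda^2.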

 

I can write out the inverse of the error covariance matrix \Omega as follows:

(5)   \begin{align*} \begin{split} \Omega^{-1} \ &= \ \frac{1}{\sigma_\varepsilon^2} \cdot \left( I_{N \times T} - \gamma_1 \cdot F - \gamma_2 \cdot G + \gamma_3 \cdot H \right) \\ \gamma_1 \ &= \ \frac{\sigma_\mu^2}{\sigma_\varepsilon^2 + T \cdot \sigma_\mu^2} \\ \gamma_2 \ &= \ \frac{\sigma_\lambda^2}{\sigma_\varepsilon^2 + N \cdot \sigma_\lambda^2} \\ \gamma_3 \ &= \ \gamma_1 \cdot \gamma_2 \cdot \left( \ \frac{2 \cdot \sigma_\varepsilon^2 + T \cdot \sigma_\mu^2 + N \cdot \sigma_\lambda^2}{\sigma_\varepsilon^2 + T \cdot \sigma_\mu^2 + N \cdot \sigma_\lambda^2} \ \right) \end{split} \end{align*}

 

This formulation shows that the sample error covariance matrix will provide unbiased estimates that are consistent as both N \to \infty and T \to \infty. In this note, I am not going to worry about finding the most efficient estimator for the parameters. Next, I want to decompose the error covariance matrix into within, between and idiosyncratic components. To do this I need 1 last piece of notation:

(6)   \begin{align*} Q \ &= \ I \ - \ \frac{F}{T} \ - \ \frac{G}{N} \ + \ \frac{H}{N \cdot T} \end{align*}

 

Think about this as an orthogonal decomposition of a unitary error covariance matrix into each of the 3 components: within, between and idiosyncratic. Then, using this term, Amemiya (1971) gives the following estimators for the parameter vector \begin{bmatrix} \sigma_\mu^2 &  \sigma_\lambda^2 & \sigma_\varepsilon^2 \end{bmatrix}:

(7)   \begin{align*} \begin{split} \hat{\mathcal{U}} \ &= \ Y \ - \ \langle X \mid \hat{\beta} \rangle \\ \hat{\sigma}_{\varepsilon}^2 \ &= \ \frac{\langle \hat{\mathcal{U}} \mid \langle Q \mid \hat{\mathcal{U}} \rangle \rangle}{(N-1) \cdot (T-1)} \\ \hat{\sigma}_{\mu}^2 \ &= \ \frac{\langle \hat{\mathcal{U}} \mid \langle \frac{T-1}{T} \cdot F - \frac{T-1}{N \cdot T} \cdot H - Q \mid \hat{\mathcal{U}} \rangle \rangle}{T \cdot (N-1) \cdot (T-1)} \\ \hat{\sigma}_{\lambda}^2 \ &= \ \frac{\langle \hat{\mathcal{U}} \mid \langle \frac{N-1}{N} \cdot G - \frac{N-1}{N \cdot T} \cdot H - Q \mid \hat{\mathcal{U}} \rangle \rangle}{T \cdot (N-1) \cdot (T-1)} \end{split} \end{align*}
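A Monte Carlo sketch of the \hat{\sigma}_\varepsilon^2 estimator in (7), in Python. Two simplifications are my own: H in (6) is taken to be the (N \cdot T) \times (N \cdot T) matrix of ones, and the true error components stand in for OLS residuals:

```python
import numpy as np

def q_matrix(N, T):
    """Two-way within projection Q from (6), taking H there to be the
    NT x NT matrix of ones (an assumption of this sketch)."""
    I = np.eye(N * T)
    F = np.kron(np.eye(N), np.ones((T, T)))
    G = np.kron(np.ones((N, N)), np.eye(T))
    J = np.ones((N * T, N * T))
    return I - F / T - G / N + J / (N * T)

# Simulate two-way error components and apply the estimator in (7),
# using the true errors in place of OLS residuals for simplicity.
rng = np.random.default_rng(42)
N, T = 40, 30
sigma_eps = 1.0
mu = rng.normal(0, 1.5, N)
lam = rng.normal(0, 1.2, T)
eps = rng.normal(0, sigma_eps, (N, T))
u = (mu[:, None] + lam[None, :] + eps).ravel()   # panel-major stacking

Q = q_matrix(N, T)
sigma_eps2_hat = (u @ Q @ u) / ((N - 1) * (T - 1))
```

Q annihilates both the panel and time effects, so the quadratic form only picks up the idiosyncratic variation; dividing by (N-1)(T-1), the trace of Q, makes the estimator unbiased.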


Recurrence in 1D, 2D and 3D Brownian Motion

June 26, 2011 by Alex

Introduction

I show that Brownian motion is recurrent for dimensions d=1 and  d=2 but transient for dimensions d \geq 3. Below, I give the  technical definition of a recurrent stochastic process:

Definition: (Recurrent Stochastic Process) Let X(t) be a stochastic process. We say that X(t) is recurrent  if for any \varepsilon > 0 and any point \bar{x} \in  \mathtt{Dom}(X) we have that:

(1)   \begin{align*} \infty \ &= \ \int_0^\infty \ \mathtt{Pr} \left[ \ \left\Vert X(t) - \bar{x} \right\Vert < \varepsilon  \mid X(0) = \bar{x} \ \right] \cdot dt \end{align*}

In words, this definition says that if the stochastic process X(t) starts out at a point \bar{x}, then if we watch the process forever it will return again and again to within some tiny region of \bar{x} an infinite number of times.

Motivating Example

Before I go about proving that Brownian motion is recurrent or   transient in different dimensions, I first want to nail down the   intuition of what it means for a stochastic process to be recurrent   in a more physical sense. To do this, I use the standard real world   example for random walks: a drunk leaving a bar.

Arnold’s lattice world for the case of 2 dimensions.

Example: (A Drunkard’s Flight) Suppose that Arnold is drunk and leaving his local bar. What’s more, Arnold is really inebriated and can only muster enough coordination to move 1 step backward or 1 step forward each second. Because he is so drunk, he doesn’t have any control over which direction he stumbles, so you can think about him moving backwards and forwards each second with equal probability \pi = 1/2. Thus, Arnold’s position relative to the door of the bar is a stochastic process with independent \pm 1 increments. This process is recurrent if Arnold returns to the bar an infinite number of times as we allow him to stumble around all night. Put differently, if Arnold ever has a last drink for the evening and exits the bar for good, then his stumbling process will be transient.

In the context of this toy example, I show that as I allow Arnold   to stumble in more and more different directions (backwards   vs. forwards, left vs. right, up vs. down, etc…), his   probability of returning to the bar decreases. Namely, if Arnold   can only move backwards and forwards, then his stumbling will lead   him back to his bar an infinite number of times. If he can move   backwards and forwards as well as left and right, he will still   wander back to the bar an infinite number of times. However, if   Arnold either suddenly grows wings (i.e., can move up or down) or   happens to be the Terminator (i.e., can time travel to the future   or past), at some point his wandering will lead him away from the   bar forever.

 

Outline

First, I state and prove   Polya’s Theorem which characterizes whether or not a random walk on   a lattice is recurrent in each dimension d=1,2,3\ldots. Then, I show how to extend this result to continuous time Brownian motion using the   Central Limit Theorem. I attack this recurrence result for continuous time Brownian motion   via Polya’s Recurrence Theorem because I think the intuition is   much clearer along this route. I find the direct proof in   continuous time which relies on Dynkin’s lemma a bit obscure;   whereas, I have a very good feel for what it means to count paths   (i.e., possible random walk trajectories) on a grid.

 

Polya’s Recurrence Theorem

Below, I formulate and prove Polya’s Recurrence Theorem for  dimensions d \in \{1,2,3\}.

Theorem: (Polya Recurrence Theorem) Let p(d) be the probability that a random walk on a d  dimensional lattice ever returns to the origin. Then, we have that  p(1)=p(2) = 1 while p(3) < 1.

 

Intuition

Before I go any further into the maths, I walk through the physical intuition behind the result. First, imagine the case where drunk Arnold can only move forwards and backwards. In order for Arnold to return to the bar door in 2 \cdot s steps[1], he must take the exact same number of forward and backward steps; i.e., he has to choose a sequence of 2 \cdot s steps such that exactly s of them are forward. There are 2 \cdot s choose s ways he could do this:

(2)   \begin{align*} \mathtt{\# \ returning \ paths} \ &= \ \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \end{align*}

What’s more, I know that the probability of each of the paths Arnold could take is just 1 divided by the total number of paths 2^{2   \cdot s}:

(3)   \begin{align*} \mathtt{Pr[each \ path]} \ &= \ \frac{1}{2^{2 \cdot s}} \end{align*}
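For tiny s these two counts can be checked by brute force (a Python sketch of mine, enumerating every possible step sequence):

```python
from itertools import product
from math import comb

def returning_paths_1d(s):
    """Count length-2s walks of +/-1 steps that end back at the origin,
    by brute-force enumeration of all 2^(2s) step sequences."""
    return sum(1 for steps in product((-1, 1), repeat=2 * s)
               if sum(steps) == 0)

s = 3
n_paths = returning_paths_1d(s)       # should match comb(2s, s)
p_return = n_paths / 2 ** (2 * s)     # each path has probability 2^(-2s)
```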

Now consider drunk Arnold’s situation in 2 dimensions. Here, he must take the exact same number of steps forward and backwards as well as the exact same number of steps left and right. Thus, summing over k, there are 2 \cdot s choose (k,k,s-k,s-k) ways for Arnold to return to the bar:

(4)   \begin{align*} \mathtt{\# \ returning \ paths} \ &= \ \sum_{k=0}^s \ \begin{pmatrix} 2 \cdot s \\ k,k,(s-k),(s-k) \end{pmatrix} \end{align*}

What is this sum computing in words? First, suppose that Arnold takes no steps in the left or right directions; then set k=0 and the number of paths he could take back to the bar is equal to the number in the 1-dimensional case. Conversely, if Arnold takes no steps forwards or backwards, set k=s and again you get the 1-dimensional case. Thus, the number of possible paths Arnold can take back to the bar in 2 dimensions is strictly larger than in 1 dimension. However, Arnold can also take paths which move along both axes. The sum first counts up the number of ways he can end up back at his starting point in the left and right directions. Then, it takes the remaining number of steps and counts the number of ways he can use those steps to return to the starting point in the forwards and backwards direction.

Note that this process doesn’t add that many new returning paths   for each new dimension. Every time I add a new dimension, I’m   certainly adding fewer than 2^s new paths as:

(5)   \begin{align*} m^n \ &= \ \sum_{k_1 + k_2 + \ldots + k_m = n} \ \begin{pmatrix} n \\ k_1, k_2, \ldots, k_m \end{pmatrix} \end{align*}

However, each path now only happens with probability 4^{-2 \cdot s}. In general, the probability of realizing each possible path decays geometrically in the number of available directions 2 \cdot d:

(6)   \begin{align*} \mathtt{Pr[each \ path]}(d) \ &= \ \left(\frac{1}{2 \cdot d}\right)^{2 \cdot s} \end{align*}

Thus, Polya’s Recurrence Theorem stems from the fact that the number of possible paths back to the origin is growing at a rate that is less than the number of all paths; i.e., the wilderness of paths that do not loop back to the origin is increasing faster than the set of paths which do loop back as we add dimensions.

 

Proof

Below, I prove this result 1 dimension at a time:

Proof: (d=1) The probability that Arnold will return to the origin in 2 \cdot s   steps is the number of possible paths times the probability that   each 1 of those paths occurs:

(7)   \begin{align*} p_{2 \cdot s}(1) \ &= \ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \end{align*}

Next, in order to derive an analytical characterization of this   probability, I use Stirling’s approximation to handle the factorial   terms in the binomial coefficient:

(8)   \begin{align*}   s! \ &\approx \ \sqrt{2 \cdot \pi \cdot s} \cdot e^{-s} \cdot s^s \end{align*}

Using this approximation and simplifying, I find that:

(9)   \begin{align*} \begin{split} p_{2 \cdot s}(1) \ &= \ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \frac{(2 \cdot s)!}{s! \cdot (2 \cdot s - s)!} \\ &\approx \ \frac{1}{(\pi \cdot s)^{1/2}} \end{split} \end{align*}
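The quality of this Stirling-based approximation is easy to check numerically (a Python sketch; the roughly 1/s decay of the relative error is my own observation, not a claim from the text):

```python
from math import comb, pi, sqrt

def p1_exact(s):
    # Exact return probability at step 2s for the 1-d walk, from (7)
    return comb(2 * s, s) / 4 ** s

def p1_stirling(s):
    # Stirling approximation from (9)
    return 1 / sqrt(pi * s)

# The relative error of the approximation shrinks as s grows
err_10 = abs(p1_exact(10) - p1_stirling(10)) / p1_exact(10)
err_1000 = abs(p1_exact(1000) - p1_stirling(1000)) / p1_exact(1000)
```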

Thus, if I sum over all possible periods, I get the expected number   of times that drunk Arnold will return to the bar for another night   cap. I find that this infinite sum diverges:

(10)   \begin{align*} \begin{split} p(1) \ &= \ \sum_{s=1}^\infty \ p_{2 \cdot s}(1) \\ &\approx \ \sum_{s=1}^\infty \ \frac{1}{(\pi \cdot s)^{1/2}} \\ &= \ \infty \end{split} \end{align*}

 

Proof: (d=2) Next, I follow all of the same steps through for the d=2   dimensional case:

(11)   \begin{align*} \begin{split} p_{2 \cdot s}(2) \ &= \ \left( \frac{1}{4} \right)^{2 \cdot s} \cdot \sum_{k=0}^s \ \begin{pmatrix} 2 \cdot s \\ k,k,(s-k),(s-k) \end{pmatrix} \\ &= \ \left( \frac{1}{4} \right)^{2 \cdot s} \cdot \sum_{k=0}^s \ \frac{(2 \cdot s)!}{k! \cdot k! \cdot (s - k)! \cdot (s - k)!} \\ &= \ \left( \frac{1}{4} \right)^{2 \cdot s} \cdot \sum_{k=0}^s \ \frac{(2 \cdot s)!}{s! \cdot s!} \cdot \frac{s! \cdot s!}{k! \cdot k! \cdot (s - k)! \cdot (s - k)!} \\ &= \ \left( \frac{1}{4} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \cdot \sum_{k=0}^s \ \begin{pmatrix} s \\ k \end{pmatrix}^2 \\ &= \ \left[ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \right]^2 \\ &= \ \left[ p_{2 \cdot s}(1) \right]^2 \end{split} \end{align*}
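The collapse from the second line to the fourth rests on the Vandermonde identity \sum_k \binom{s}{k}^2 = \binom{2s}{s}, which is cheap to verify (a Python sketch):

```python
from math import comb

# Vandermonde identity underlying the d=2 collapse in (11):
# sum_k comb(s, k)^2 = comb(2s, s)
s = 10
lhs = sum(comb(s, k) ** 2 for k in range(s + 1))
rhs = comb(2 * s, s)
```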

Summing over all possible path lengths yields a divergent series:

(12)   \begin{align*} \begin{split} p(2) \ &= \sum_{s=1}^\infty \ p_{2 \cdot s}(2) \\ &\approx \ \sum_{s=1}^\infty \ \frac{1}{\pi \cdot s} \\ &= \ \infty \end{split} \end{align*}

 

Proof: (d=3) The result for d=3 is a bit more complicated as there isn’t a   nice closed form expression for each of the p_{2 \cdot s}(3)   terms. I start by simplifying as far as I can:

(13)   \begin{align*} \begin{split} p_{2 \cdot s}(3) \ &= \ \left( \frac{1}{6} \right)^{2 \cdot s} \cdot \sum_{j,k \mid j+k \leq s} \ \begin{pmatrix} 2 \cdot s \\ k,k, j,j,  (s-k-j),(s-k-j) \end{pmatrix} \\ &= \ \left( \frac{1}{6} \right)^{2 \cdot s} \cdot \sum_{j,k \mid j+k \leq s} \ \frac{(2 \cdot s)!}{k! \cdot k! \cdot j! \cdot j! \cdot (s-j-k)! \cdot (s-j-k)!} \\ &= \ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \cdot \sum_{j,k \mid j+k \leq s} \ \left( \frac{1}{3^s} \cdot \frac{s!}{k! \cdot j! \cdot (s-j-k)!} \right)^2 \end{split} \end{align*}

Next, I apply the Multinomial Theorem and note that this   probability is maximized when j=k=s/3. Thus, if I substitute in   this value, I will have an upper bound on the probability p_{2   \cdot s}(3):

(14)   \begin{align*} \begin{split} p_{2 \cdot s}(3) \ &\leq \ \left( \frac{1}{2} \right)^{2 \cdot s} \cdot \begin{pmatrix} 2 \cdot s \\ s \end{pmatrix} \cdot \left( \frac{1}{3^s} \cdot \frac{s!}{\left[ \left( \frac{s}{3} \right)! \right]^3} \right) \\ &\leq \ \frac{C}{(\pi \cdot s)^{3/2}} \end{split} \end{align*}

Summing over all possible path lengths leads to a convergent   series, so I know that Arnold may have a final drink at some point   during the evening:

(15)   \begin{align*} \begin{split} p(3) \ &= \ \sum_{s=0}^\infty \ p_{2 \cdot s}(3) \\ &< \ \infty \end{split} \end{align*}
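The convergence can also be seen numerically by computing the exact terms from (13) and watching the partial sums stay bounded (a Python sketch of mine, using exact integer arithmetic for the multinomial sums):

```python
from math import factorial

def p3(s):
    """Exact probability that the 3-d lattice walk is back at the
    origin after 2s steps, from the multinomial sum in (13)."""
    total = 0
    for k in range(s + 1):
        for j in range(s - k + 1):
            # multinomial coefficient (2s)! / (k! k! j! j! (s-j-k)! (s-j-k)!)
            total += factorial(2 * s) // (
                factorial(k) ** 2 * factorial(j) ** 2
                * factorial(s - j - k) ** 2
            )
    return total / 6 ** (2 * s)

# Unlike d=1 and d=2, the terms decay fast enough (like s^(-3/2)) that
# the partial sums level off instead of diverging.
partial = sum(p3(s) for s in range(1, 41))
```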

 

Extension to Brownian Motion

Below, I define Brownian motion in d>1 dimensions and then show  how to extend the results from Polya’s Recurrence Theorem from  random walks on a lattice to continuous time Brownian  motion.

Brownian motion for d>1 dimensions is a natural extension of the  d=1 dimensional case. I give the formal definition below:

Definition: (Multi-Dimensional Brownian Motion) Brownian motion in \mathcal{R}^d is the vector valued process:

(16)   \begin{align*} \mathbf{B}(t) \ &= \ \begin{bmatrix} B_1(t) & B_2(t) & \ldots & B_d(t) \end{bmatrix} \end{align*} where each component B_i(t) is an independent 1-dimensional Brownian motion.

To extend Polya’s Recurrence Theorem to continuous time Brownian  motion, I just need to apply the Central Limit Theorem and then  construct the Brownian motion from the resulting independent  Gaussian increments:

Theorem: (deMoivre-Laplace) Let k_s be the number of successes in s independent draws, each succeeding with probability \pi. Then, for values of k_s near the mean s \cdot \pi, we can approximate the binomial distribution with the Gaussian distribution, with the approximation becoming exact as s \to \infty:

(17)   \begin{align*} \mathtt{Bin}(s,\pi) \ &\sim \ \mathtt{Norm}\left(s \cdot \pi, \sqrt{s \cdot \pi \cdot (1-\pi)}\right) \end{align*}
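A numerical check of the deMoivre-Laplace approximation (a Python sketch; the specific n = 1000, \pi = 1/2 values are illustrative):

```python
from math import comb, exp, pi, sqrt

def binom_pmf(n, k, p):
    # Exact binomial probability of k successes in n draws
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_pdf(x, mean, sd):
    # Gaussian density, the deMoivre-Laplace approximation of the pmf
    return exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * sqrt(2 * pi))

# Near the mean, Bin(n, 1/2) probabilities track the
# Norm(n*pi, sqrt(n*pi*(1-pi))) density ever more closely as n grows.
n, p = 1000, 0.5
mean, sd = n * p, sqrt(n * p * (1 - p))
rel_err = abs(binom_pmf(n, 500, p) - normal_pdf(500, mean, sd)) \
    / binom_pmf(n, 500, p)
```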

Lemma: (Levy’s Selector) Suppose that s<t and X(s) and X(t) are random variables  defined on the same sample space such that X(t) - X(s) has a  distribution which is \mathtt{Norm}(0,t-s). Then there exists a  random variable X(\frac{t+s}{2}) such that X(\frac{t+s}{2}) -  X(s) and X(t) - X(\frac{t+s}{2}) are independent with a common  \mathtt{Norm}(0,\frac{t-s}{2}) distribution.

[1] Sanity Check: Why 2 \cdot s and not just s here?


Hong and Stein (1999)

June 24, 2011 by Alex

1. Introduction

I replicate main results from Hong and Stein (1999) which constructs an equilibrium model with under-reaction and momentum. First, I give a rough verbal explanation of the model’s results. Then, I outline the basic mathematical framework and work through the equilibrium concept. Finally, I simulate the equilibrium outcomes for different momentum trader horizons and information speeds.

This paper develops an interesting model which endogenizes the frequency and amplitude of price fluctuations. I work through this paper to better understand the nuts and bolts of this equilibrium concept. Perhaps I might be able to use these statistical wave-like properties to identify and discriminate between different mis-pricing generators.

2. Simple Example

The basic idea behind the model is as follows. Suppose that you have a bunch of traders that receive a demand shock, but only respond slowly. For example, imagine that a bunch of people earn a windfall payment (i.e., win the lottery or find out about a long lost rich uncle) and decide to buy new houses. It would take them a while to search for the appropriate house that fits their exact needs. For instance, perhaps one family needs to be in a nice school district, another needs to be near the airport for frequent trips, and so on… These guys represent slow moving information or demand. However, though it would take time for each of the people to purchase their new home, anyone who knew about the windfall payments would know that the demand for expensive houses was going to jump up in the future.

Now, suppose that no one knows that the windfall payments have  already occurred; but there is, instead, a group of traders that  know a windfall payment might occur at anytime.  It could have been  today.  It could have been yesterday.  It might actually happen in a  week.  Yet, while this group doesn’t know when the payment has been  made, each of the agents can infer the likelihood from the price  movements.  If the price drifts up, then it is more likely that  helicopters have dumped the cash.  A trader who acts on these price  movements is a momentum trader.

There are 2 additional quirks: 1) The informed traders don’t realize that other people have also received windfall payments. 2) Momentum traders enter sequentially and are very simple minded. They don’t know how many of their own kind there are. They don’t meet anyone for lunch to discuss the best ways to back out whether or not there has been a windfall payment. All they do is make their best guess based on the price growth over the past 6 months. That’s it.

What happens now? The first couple of momentum traders that walk into the market will see a price jump after the windfall actually occurs and trade into it. However, the next few momentum traders will see the price growth induced by the earlier momentum traders and get very excited and trade into the asset even further. These momentum traders are responding to a price movement that was solely generated by other momentum traders. This pattern repeats itself until the bottom falls out. So, it is as if the later momentum traders pay a tax for being late. Early price movements accelerate, then overshoot their fundamental value and collapse.

3. Economic Framework

First, let’s consider a world with only a unit mass of naive, but informed traders.  Agents live in a discrete time T period world where T is large.  There is a riskless asset with a 0 return    as well as a risky asset in positive net supply q which pays out a dividend d_T at time T.  The asset has an expected dividend d_t and price p_t at each interim period t.  These traders all start out with the same information and the same endowment of  the riskless asset m_0 and risky asset q_0.  Each trader i has CARA utility over consumption at time T:

(1)   \begin{equation*} U(C^i) \ = \ - \mathbb{E} \left[ \ e^{-\rho C^i} \ \right] \end{equation*}

The traders are informed because they receive a series of signals \varepsilon_t about the size of the dividend d_T at each point in time t. However, these signals move slowly throughout the population. Specifically, suppose that there are z different flavors of traders of equal size. Each flavor of traders sees a different, independent component of the signal \varepsilon_t at each point in time. So, for example, at time t, traders of type z=1 see the component \varepsilon^1_{t+z-1} of the shock \varepsilon_{t+z-1}. At time t+1, type z=1 traders see the second component of the shock \varepsilon_{t+z-1} as well as the first component of the shock \varepsilon_{t+z}. Likewise, at time t, traders of type z=2 see the component \varepsilon^2_{t+z-1} of the shock \varepsilon_{t+z-1} and so on.

Thus, the traders of each flavor rotate which component of the shock they see until they have seen all z independent components and know the full shock. This information rotation structure means that, \tau periods after time t, agents have seen a fraction \tau / z of the total signals available for the shock \varepsilon_{t+z-1}.


Information rotation structure in Hong and Stein (1999).

Traders are naive because they do not condition on the observed price when they formulate their expectations. Traders see their components of the shock at each point in time, update their beliefs about the future value of the dividend, and then place their order believing that they will adopt a buy and hold until date T strategy, but they do not impound the market clearing price into their information set.

Equilibrium Concept: Walrasian equilibrium with private valuations.

4. No Momentum Traders

To solve the model, I follow the same general strategy as in a Grossman and Stiglitz (1980) equilibrium, but I give the traders naive rather than rational expectations. First, I write out the optimization problem for each naive, informed trader i. I assume that z=2 and solve a 1 period problem, but the solutions below easily generalize to z>2 and multiple periods.

Each trader maximizes his consumption utility by choosing his asset  holdings subject to a budget constraint where m^i represents his  riskless asset holdings and q^i represents his risky asset  holdings:

(2)   \begin{align*} V^i &= \max_{m^i, q^i} \left\{ -\mathbb{E} \left[ e^{-\rho w^i} \right] \right\} \\ &\textit{subject to} \\ m^i + \tilde{p} q^i &\leq m_0 + \tilde{p}_0 q_0 \end{align*}

I assume that each trader has a unit mass of wealth.  After  substituting in the budget constraint, I get:

(3)   \begin{align*} V^i &= \max_{q^i} \left\{ -\mathbb{E}^i \left[ e^{-\rho \left( (d-\tilde{p}) q^i \right)} \right] \right\} \end{align*}

The first order condition with respect to q^i characterizes the  risky holdings as follows:

(4)   \begin{align*} q^i &= \frac{\mathbb{E}^i\left[ d-\tilde{p} \right]}{\rho\mathbb{V}^i\left[ d-\tilde{p} \right]} \end{align*}

Next, I guess that price is linear in the public information, the private signal about tomorrow’s information and the total quantity. \varepsilon_0 denotes the public signal available to traders of both flavors. \varepsilon_1 denotes the sum of the private signals for each type of trader.

(5)   \begin{align*} \tilde{p} &= \alpha + \beta \varepsilon_0 + \gamma \varepsilon_1 - \delta Q \end{align*}

I solve for \tilde{p} by substituting the function for q^i into the  budget constraint.  Since both flavors of agents are symmetric and  ignorant of the information in the prices themselves, the price  functional simplifies to:

(6)   \begin{align*} \tilde{p} &= \left( \varepsilon_0 + \dfrac{1}{2} \varepsilon_1 \right) - \delta Q \end{align*}

\delta is a function of the risk aversion parameter \rho as well  as the variance \sigma_\varepsilon^2.  I pick the risk aversion  parameter in order to set \delta=1 for simplicity.

Price, dividend, return and signal series from Hong and Stein (1999) with no momentum traders.


5. With Momentum Traders

Now, I add in momentum traders.  In order to do this, I allow the  naive, informed traders to believe that the risky asset supply is a  random variable.  Thus, they remain blissfully unaware that there  are momentum traders at all.  Momentum traders also have CARA  utility but, rather than living until date T, these traders have  shorter term horizons and die out at date t+j if they enter at  date t.  Momentum traders earn their name because, rather than  observing the sequence of dividend shocks like the naive informed  traders above, momentum traders update their beliefs solely on past  price movements: \Delta_k p_{t-1}=p_{t-1} - p_{t-k-1}.

For simplicity, I pick k=1 below. I conjecture that momentum traders’ demand is a linear function of past price growth:

(7)   \begin{align*} f_t &= \theta + \phi \Delta p_{t-1} \end{align*}

I denote price in the momentum regime as p_t rather than  \tilde{p}_t.  Informed traders solve the exact same problem as before, since they see the supply shock as a random variable rather  than an informative signal.  Now, momentum traders affect the  quantity available.  I can rewrite the pricing equation from above  as:

(8)   \begin{align*} \begin{split} p_t &= \dfrac{1}{z}\left[z\varepsilon_t+(z-1)\varepsilon_{t+1}+ (z-2)\varepsilon_{t+2}+...+\varepsilon_{t + z-1}\right] \\ &\qquad \qquad - \left(Q-\left[\theta j + \phi \sum_{i=1}^j \Delta p_{t-i} \right] \right) \end{split} \end{align*}

So the price today reflects both the current knowledge of all of the naive informed traders as well as the myopic response of the momentum traders, where the summation over j comes into play since there are j generations of momentum traders in the market at any given time. As is standard in models with CARA agents, the momentum traders choose \phi according to the rule below, where \rho represents the momentum trader’s risk aversion parameter:

(9)   \begin{align*} \phi &= \rho \left( \frac{\mathbb{C}\left[ \Delta_j p_{t+j}, \Delta p_{t-1} \right]}{\mathbb{V}\left[\Delta p_{t-1}\right] \mathbb{V}\left[ \Delta_j p_{t+j} \right]} \right) \end{align*}

An equilibrium is a price p and a quantity demanded by the  momentum traders \phi such that both the pricing and mean variance  equations above are satisfied.  I solve for the equilibrium  numerically in R.
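The full R solution is linked below; as a toy illustration of the mechanism in (8), here is a minimal Python impulse-response sketch. All the simplifications are mine: z=2 so news is half-absorbed on impact, a single one-period momentum generation (j=1, \theta=0), net supply normalized to zero, and \phi fixed at an assumed value rather than solved from (9):

```python
# Impulse response of price to a unit dividend shock when naive
# informed traders absorb the news over two periods and momentum
# traders chase the lagged price change.
phi = 0.5        # assumed momentum response, not the equilibrium value
horizon = 60
# unit news about the dividend: half seen on impact, fully seen after
fundamental = [0.5] + [1.0] * (horizon - 1)

p = [0.0, 0.0]   # two pre-sample prices at the old fundamental of 0
for t in range(horizon):
    # informed demand prices in the partially-absorbed news, and
    # momentum traders add phi times the lagged price change
    p.append(fundamental[t] + phi * (p[-1] - p[-2]))
p = p[2:]
```

The path first overshoots the fundamental value of 1 (the later momentum generations' "tax" for being late) and then oscillates back, which is the under-reaction-then-overshoot pattern the paper emphasizes.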

Price, dividend, return and signal series from Hong and Stein (1999) with momentum traders at the 20 period horizon.

6. Code

Click HERE to view the code used to create these plots.




 
