When Can Arbitrageurs Identify a Sporadic Pricing Error?

1. Motivation

Imagine you’re an arbitrageur and you see a sequence of abnormal returns:

(1) $\begin{align*} \mathit{ra}_t \overset{\scriptscriptstyle \mathrm{iid}}{\sim} \begin{cases} +1 &\text{w/ prob } \sfrac{1}{2} \cdot (1 + \alpha) \\ -1 &\text{w/ prob } \sfrac{1}{2} \cdot (1 - \alpha) \end{cases} \qquad \text{with} \qquad \alpha \in [-1,1] \end{align*}$

Here, $\alpha$ denotes the stock’s average abnormal return, so the stock is mispriced if $\alpha \neq 0$ . Suppose you don’t initially know whether the stock is priced correctly (i.e., whether $\alpha = 0$ ), but as you see more and more data you refine your beliefs, $\widehat{\alpha}_T$ . In fact, your posterior variance vanishes as $T \to \infty$ :¹

(2) $\begin{align*} \mathrm{Var}(\widehat{\alpha}_T - \alpha) \asymp T^{-1} \end{align*}$
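To see the $T^{-1}$ rate in (2) concretely, here is a minimal sketch that computes the exact variance of a simple estimator. It assumes $\widehat{\alpha}_T$ is the sample mean of the first $T$ abnormal returns (a stand-in for the posterior mean, not necessarily the estimator used in the footnote), and the function name `sample_mean_variance` is hypothetical.

```python
from math import comb

def sample_mean_variance(alpha: float, T: int) -> float:
    """Exact variance of the sample mean of T iid draws from (1).

    Assumption: the estimator is the plain sample mean. Keep T modest
    (a few hundred) so the binomial coefficients fit in a float.
    """
    p_up = 0.5 * (1.0 + alpha)            # Pr[ra_t = +1]
    first_moment = 0.0
    second_moment = 0.0
    for k in range(T + 1):                # k = number of +1 draws
        prob = comb(T, k) * p_up**k * (1.0 - p_up) ** (T - k)
        alpha_hat = (2 * k - T) / T       # sample mean when k draws equal +1
        first_moment += prob * alpha_hat
        second_moment += prob * alpha_hat**2
    return second_moment - first_moment**2

# T * Var should stay close to 1 - alpha^2, i.e. Var shrinks like 1/T.
for T in (10, 50, 200):
    print(T, T * sample_mean_variance(0.10, T))
```

Under this assumption the variance is $(1 - \alpha^2)/T$ , since each $\mathit{ra}_t^2 = 1$ .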

So, such pricing errors can’t persist forever unless there’s some limit to arbitrage, like trading costs or short-sale constraints, to keep you from trading it away.

However, pricing errors often don’t persist period after period. Instead, they tend to arrive sporadically, affecting the first period’s returns, skipping the next two periods’ returns, affecting the fourth and fifth periods’ returns, and so on… What’s more, arbitrageurs don’t have access to an oracle. They don’t know ahead of time which periods are affected and which aren’t. They have to figure this information out on the fly, in real time. In this post, I show that arbitrageurs with an infinite time series may never be able to identify a sporadic pricing error because they don’t know ahead of time where to look.

2. Sporadic Errors

What does it mean to say that a pricing error is sporadic? Suppose that, in each trading period, there is a key state variable, $s_t \in \{0,1\}$ . If $s_t = 0$ , then the stock’s abnormal returns are drawn from a distribution with $\alpha = 0$ ; whereas, if $s_t = 1$ , then the stock’s abnormal returns are drawn from a distribution with $\alpha \neq 0$ . Let $\theta_t$ denote the probability that $s_t = 1$ in a given trading period:

(3) $\begin{align*} \theta_t &= \mathrm{Pr}\left[ \, s_t = 1 \, \middle| \, s_0 = 1 \, \right] \end{align*}$

So, $\theta_t = 1$ for every $t \geq 0$ in the example above where the stock always has a mean abnormal return of $\alpha$ . By contrast, if $\theta_t < 1$ in every trading period, then the pricing error is sporadic.

You can think about this state variable in a number of ways. For instance, perhaps the market conditions have to be just right for the arbitrage opportunity to exist. In the figure above, I model the distance to this Goldilocks zone as a reflected random walk, $x_t$ :

(4) $\begin{align*} x_t &= \begin{cases} x_{t-1} + 1 &\text{w/ prob } 52{\scriptstyle \%} \\ \max\{x_{t-1} - 1, 0\} &\text{w/ prob } 48{\scriptstyle \%} \end{cases} \end{align*}$

Then, every time $x_t$ hits the origin, the mispricing occurs. Thought of in this way, the state variable represents a renewal process.
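A minimal simulation sketch of (4) might look like the following. The function name `simulate_states`, the seed, and the choice to start the walk at the origin (so that $s_0 = 1$ ) are my assumptions for illustration.

```python
import random

def simulate_states(T, p_up=0.52, seed=0):
    """Simulate the reflected random walk in (4) for T steps.

    Returns (xs, ss): the walk x_t and the state s_t = 1 whenever x_t = 0.
    Assumption: the walk starts at the origin, so s_0 = 1.
    """
    rng = random.Random(seed)
    x = 0
    xs, ss = [x], [1]
    for _ in range(T):
        if rng.random() < p_up:
            x = x + 1                # step away from the Goldilocks zone
        else:
            x = max(x - 1, 0)        # step back, reflecting at the origin
        xs.append(x)
        ss.append(1 if x == 0 else 0)
    return xs, ss

xs, ss = simulate_states(10_000)
print("periods with a pricing error:", sum(ss))
```

Because the walk drifts upward, visits to the origin should thin out over time, which is what makes the pricing error sporadic in this example.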

3. Inference Problem

Suppose you see an infinitely long time series of abnormal returns. Let $f_0(\{ \mathit{ra}_t \})$ denote the distribution of abnormal returns when $\alpha = 0$ always, and let $f_a(\{ \mathit{ra}_t \})$ denote the distribution of abnormal returns when $\alpha \neq 0$ sometimes. So, if abnormal returns are drawn from $f_0$ , then there are no pricing errors; whereas, if abnormal returns are drawn from $f_a$ , then there are some pricing errors. Here’s the question: When can you conclusively tell whether the data was drawn from $f_0$ rather than $f_a$ ?

If a trader can perfectly distinguish between the pair of probability distributions, $f_0$ and $f_a$ , then, when you give him any randomly selected sequence of abnormal returns, $\{ \mathit{ra}_t \}$ , he will look at it and go, “That’s from distribution $f_0$ ,” or “That’s from distribution $f_a$ .” He will never be stumped. He will never need more information. Mutual singularity is the mathematical way of phrasing this simple idea. Let $\Omega$ denote the set of all possible infinite abnormal return sequences:

(5) $\begin{align*} \Omega &= \left\{ \, \{ \mathit{ra}_t \}, \, \{ \mathit{ra}_t' \}, \, \{ \mathit{ra}_t'' \}, \, \ldots \, \right\} \end{align*}$

We say that a pair of distributions, $f_0$ and $f_a$ , are mutually singular if there exist disjoint sets, $\Sigma_0$ and $\Sigma_a$ , whose union is $\Omega$ such that $f_0(\{ \mathit{ra}_t \}) = 0$ for all sequences $\{ \mathit{ra}_t \} \in \Sigma_a$ while $f_a(\{ \mathit{ra}_t \}) = 0$ for all sequences $\{ \mathit{ra}_t \} \in \Sigma_0$ . If $f_0$ and $f_a$ are mutually singular then we can write $f_0 \perp f_a$ . For example, if the alternative hypothesis is that $\alpha = 0.10$ every day, then $f_0 \perp f_a$ since we know that all abnormal return sequences drawn from $f_0$ have a mean of exactly $0$ as $T \to \infty$ while all abnormal return sequences drawn from $f_a$ have a mean of exactly $\alpha = 0.10$ as $T \to \infty$ .
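The $\alpha = 0.10$ example can be illustrated with a short simulation. This is only a finite-sample sketch of the limiting argument. The helper names `draw_returns` and `sample_mean`, the seeds, and the sample length are assumptions.

```python
import random

def draw_returns(alpha, T, seed):
    """Draw T iid abnormal returns from (1) with the given alpha."""
    rng = random.Random(seed)
    p_up = 0.5 * (1.0 + alpha)       # Pr[ra_t = +1]
    return [1 if rng.random() < p_up else -1 for _ in range(T)]

def sample_mean(returns):
    return sum(returns) / len(returns)

T = 200_000
mean_null = sample_mean(draw_returns(0.00, T, seed=1))   # drawn from f_0
mean_alt = sample_mean(draw_returns(0.10, T, seed=2))    # drawn from f_a
print(mean_null, mean_alt)
```

With a long enough sample the two means should sit near $0$ and $0.10$ respectively, so a trader who sorts sequences by their long-run mean would almost never confuse the two distributions.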

At the other extreme, you could imagine a trader being completely unable to tell a pair of distributions apart. It might be the case that any sequence of abnormal returns that is off limits in distribution $f_0$ is also off limits in distribution $f_a$ . That is, you might never be able to find a sequence of returns that you could use to reject the null hypothesis. Absolute continuity is the mathematical way of phrasing this idea. A distribution $f_a$ is absolutely continuous with respect to $f_0$ if $f_0(\{ \mathit{ra}_t \}) = 0$ implies that $f_a(\{ \mathit{ra}_t \})=0$ . This is written as $f_0 \gg f_a$ .

In this post I want to know: for what kind of sporadic pricing errors is $f_0 \gg f_a$ ? When can a trader never be completely sure that he’s seen a pricing error and not just an unlikely set of market events?

4. Main Results

Now for the two main results. Harris and Keane (1997) show that, i) if the pricing error happens frequently enough, then traders can identify it regardless of how small it is:

(6) $\begin{align*} \sum_{t=0}^{\infty} \theta_t^2 = \infty \qquad \Rightarrow \qquad f_0 \perp f_a \end{align*}$

For instance, suppose that a stock’s abnormal returns are drawn from a distribution with a really small $\alpha > 0$ every single period (i.e., $\theta_t = 1$ for all $t \geq 0$ ). Then, just like standard statistical intuition would suggest, traders with enough data will eventually identify this tiny pricing error since $\sum_{t=0}^\infty 1 = \infty$ .

By contrast, ii) if the pricing error is small and rare enough, then traders with an infinite amount of data will never be able to reject $f_0$ . They will never be able to conclusively know that there was a pricing error:

(7) $\begin{align*} \sum_{t=0}^{\infty} \theta_t^2 &< \sfrac{1}{\alpha^2} \qquad \Rightarrow \qquad f_0 \gg f_a \end{align*}$

Again, $f_0 \gg f_a$ means that traders can’t find a sequence of abnormal returns which would only have been possible under $f_a$ . This is a bit of a strange result. To illustrate, suppose that you and I both see a suspicious abnormal return at time $t=0$ . But, while you think it’s due to a sporadic arbitrage opportunity of size $\alpha$ , I think it was just a random market fluctuation. If the probability that this pricing error recurs shrinks quickly enough over time for the sum in (7) to stay below $\sfrac{1}{\alpha^2}$ , for instance when $\alpha$ is small and

(8) $\begin{align*} \theta_t \asymp \sfrac{1}{t} \end{align*}$

then no amount of additional data will enable us to conclusively settle our argument. There can be no smoking gun. We’ll just have to agree to disagree. This result runs against the standard Harsanyi doctrine which says that people who see the same information will end up with the same beliefs.
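Both conditions (6) and (7) turn on the size of $\sum_t \theta_t^2$ , so it can help to look at partial sums. The sketch below compares two hypothetical recurrence profiles that I picked for illustration: an error that is present every period, and one whose recurrence probability decays like $1/(t+1)$ .

```python
from math import pi

def partial_sum_theta_sq(theta, T):
    """Partial sum of theta_t^2 for t = 0, ..., T."""
    return sum(theta(t) ** 2 for t in range(T + 1))

def always_present(t):
    return 1.0                 # theta_t = 1: the error is there every period

def fast_decay(t):
    return 1.0 / (t + 1)       # hypothetical profile with theta_0 = 1

alpha = 0.5                    # illustrative size of the pricing error
for T in (10, 1_000, 100_000):
    print(T,
          partial_sum_theta_sq(always_present, T),   # grows without bound, as in (6)
          partial_sum_theta_sq(fast_decay, T),       # levels off near pi^2 / 6
          1.0 / alpha**2)                            # the threshold in (7)
```

For the decaying profile the sum settles near $\pi^2/6$ , which is below $\sfrac{1}{\alpha^2} = 4$ when $\alpha = 0.5$ , so this pair would fall under (7).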

5. Proof Sketch

I conclude by sketching the proof for part ii) of Harris and Keane (1997)’s main result: if the pricing error is sufficiently small and rare, then traders will never be able to reject $f_0$ . To do this, I need to be able to show that, if $\sum \theta_t^2 < \sfrac{1}{\alpha^2}$ , then every sequence of abnormal returns with the property that $f_0(\{ \mathit{ra}_t \}) = 0$ also has the property that $f_a(\{ \mathit{ra}_t \}) = 0$ . The easiest way to do this is to look at the behavior of the following integral:

(9) $\begin{align*} \int_\Omega \left( \, \frac{f_a(\{ \mathit{ra}_t \})}{f_0(\{ \mathit{ra}_t \})} \, \right)^2 \cdot dF_0(\{ \mathit{ra}_t \}) \end{align*}$

If it’s finite, then every time $f_0(\{ \mathit{ra}_t \}) = 0$ it must also be the case that $f_a(\{ \mathit{ra}_t \}) = 0$ . Otherwise, you’d be dividing a positive number, $f_a(\{ \mathit{ra}_t \})^2$ , by $0$ .

So, let’s examine this integral. At its core, this integral is just a weighted average of the number of times that a mispricing should occur under $f_a$ since $dF_0 = f_0$ :

(10) $\begin{align*} \int_\Omega \left( \, \frac{f_a(\{ \mathit{ra}_t \})}{f_0(\{ \mathit{ra}_t \})} \, \right)^2 \cdot dF_0(\{ \mathit{ra}_t \}) &= \int_{\{0,1\}^\infty} \prod_{t=1}^\infty (1 + \alpha^2 \cdot s_t^2 ) \cdot d\Theta(\mathbf{s}) \end{align*}$
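The per-period factor $1 + \alpha^2 \cdot s_t^2$ in (10) can be checked by brute force on a short sample. The sketch below makes a simplifying assumption: the state sequence $\mathbf{s}$ is held fixed and known, so the likelihood ratio is $\prod_t (1 + \alpha \cdot s_t \cdot \mathit{ra}_t)$ . The function name is hypothetical.

```python
from itertools import product

def second_moment_under_null(alpha, s):
    """E_0[(f_a / f_0)^2] for a fixed state sequence s, by enumeration.

    Under f_0 every +/-1 sequence of length T has probability 0.5**T.
    Under f_a (given s), Pr[ra_t] = 0.5 * (1 + alpha * s_t * ra_t).
    """
    T = len(s)
    total = 0.0
    for ra in product((-1, 1), repeat=T):
        ratio = 1.0
        for s_t, ra_t in zip(s, ra):
            ratio *= 1.0 + alpha * s_t * ra_t    # f_a / f_0 for one period
        total += ratio**2 * 0.5**T               # weight by the f_0 probability
    return total

alpha, s = 0.3, (1, 0, 1, 1)
print(second_moment_under_null(alpha, s), (1 + alpha**2) ** sum(s))
```

The two printed numbers should agree: each mispriced period multiplies the second moment by $1 + \alpha^2$ , and each correctly priced period leaves it unchanged.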

If we define $J$ as the number of periods in which there is a pricing error,

(11) $\begin{align*} J &= \sum_{t=1}^\infty s_t^2, \end{align*}$

then we can further bound this integral as follows since $s_t^2 \in \{0,1\}$ :

(12) $\begin{align*} \int_{\{0,1\}^\infty} \prod_{t=1}^\infty (1 + \alpha^2 \cdot s_t^2 ) \cdot d\Theta(\mathbf{s}) &\leq \sum_{j=1}^\infty (1 + \alpha^2)^j \cdot \mathrm{Pr}[J = j] &\leq \sum_{j=1}^\infty (1 + \alpha^2)^j \cdot \mathrm{Pr}[J > 1]^{j-1} \end{align*}$
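The right-hand side of (12) is a geometric series, so whether it is finite depends only on the product $(1 + \alpha^2) \cdot \mathrm{Pr}[J > 1]$ . Here is a small sketch of that check, where the argument `p` is a hypothetical stand-in for $\mathrm{Pr}[J > 1]$ and both function names are my own.

```python
def bound_partial_sum(alpha, p, n_terms):
    """First n_terms of sum_{j>=1} (1 + alpha^2)^j * p^(j-1)."""
    growth = 1.0 + alpha**2
    return sum(growth**j * p ** (j - 1) for j in range(1, n_terms + 1))

def bound_limit(alpha, p):
    """Limit of the series, or infinity if the common ratio is >= 1."""
    growth = 1.0 + alpha**2
    ratio = growth * p                   # common ratio of the series
    if ratio >= 1.0:
        return float("inf")
    return growth / (1.0 - ratio)        # closed form of the geometric sum

print(bound_partial_sum(0.5, 0.5, 200), bound_limit(0.5, 0.5))   # finite
print(bound_limit(0.5, 0.9))                                     # diverges
```

So the bound is only useful when recurrences are rare enough, relative to $\alpha$ , to keep the common ratio below one.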

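However, because the state variable is a renewal process, $J$ is geometrically distributed, so the probability that the pricing error never recurs, $1 - \mathrm{Pr}[J > 1]$ , is just the inverse of the expected number of periods in which a pricing error occurs:

(13) $\begin{align*} \frac{1}{1 - \mathrm{Pr}[J > 1]} = \sum_{t=0}^\infty \theta_t^2 \end{align*}$

So, the geometric series on the right-hand side of (12) converges whenever $(1 + \alpha^2) \cdot \mathrm{Pr}[J > 1] < 1$ , and we have our desired result. That is, $f_0 \gg f_a$ whenever:

(14) $\begin{align*} \sum_{t=0}^\infty \theta_t^2 < 1 + \sfrac{1}{\alpha^2} \end{align*}$

This inequality holds whenever the condition in (7) does.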

¹ E.g., see the computation for the beta-binomial model.