Empirical Bayes and Price Signals

Asset-pricing models are built upon the idea that traders learn from price signals. For example, suppose there are $N \geq 1$ actively managed mutual funds, and imagine a trader who observes the entire cross-section of these mutual funds’ returns in month $t$:

$\begin{equation*} R_{n,t} = \alpha_n + \epsilon_{n,t} \qquad \text{where} \qquad \epsilon_{n,t} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(0, \, \sigma_{\epsilon}^2) \end{equation*}$

In the equation above, $\alpha_n$ is the average performance of the $n$th active mutual-fund manager, while $\epsilon_{n,t}$ is measurement error in price signals. A skilled mutual-fund manager has $\alpha_n > 0$. So, after observing the cross-section of mutual funds’ returns in a given month, a trader can use Bayes’ rule

$\begin{equation*} \mathrm{Pr}[\alpha_n > 0|R_{n,t}] = {\textstyle \left( \frac{\mathrm{Pr}[R_{n,t}|\alpha_n > 0]}{\mathrm{Pr}[R_{n,t}]} \right)} \times \mathrm{Pr}[\alpha_n > 0] \end{equation*}$

to update his beliefs about whether or not the $n$th active mutual-fund manager is skilled.

At first glance, the logic above seems trivial. But, on closer inspection, there’s something a bit paradoxical about framing a trader’s inference problem this way. If a trader wants to use Bayes’ rule to learn about a particular fund manager’s skill level from realized returns, then it seems like the trader needs to know how skill is distributed across the population of fund managers. After all, the last term in the equation above is just one minus the cumulative-distribution function at zero, $\mathrm{Pr}[\alpha_n > 0] = 1 - \mathrm{CDF}_{\alpha}(0)$. And, financial economists disagree about basic properties of this distribution. For instance, there is debate about whether any active mutual-fund managers are skilled, i.e., about whether $\mathrm{Pr}[\alpha_n > 0] = 0$ or $\mathrm{Pr}[\alpha_n > 0] > 0$. If we can’t agree about these sorts of basic facts, then how are traders supposed to apply Bayes’ rule?

This is where the empirical-Bayes method makes an appearance. It turns out that, in the example above, a trader can learn about a particular mutual-fund manager’s skill from realized returns without having any ex ante knowledge about how skill is distributed across the population of fund managers. He can just estimate this prior distribution from the data. This post illustrates how, using a trick known as Tweedie’s formula.

Simple Example

Let’s start with a simple example to illustrate how the empirical-Bayes method works. Suppose that fund-manager skill is normally distributed across the population:

$\begin{equation*} \alpha_n \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(\mu_{\alpha}, \, \sigma_{\alpha}^2) \end{equation*}$

In this setup, it’s easy to compute a trader’s posterior beliefs about the skill of the $n$th mutual-fund manager after observing this manager’s month-$t$ returns:

(1) $\begin{equation*} \mathrm{E}[\alpha_n|R_{n,t}] = {\textstyle \left( \frac{\sigma_{\alpha}^2}{\sigma_{\alpha}^2 + \sigma_{\epsilon}^2}\right)} \times R_{n,t} + {\textstyle \left( \frac{\sigma_{\epsilon}^2}{\sigma_{\alpha}^2 + \sigma_{\epsilon}^2} \right)} \times \mu_\alpha \end{equation*}$

This is a completely standard Gaussian-learning problem.
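As a quick numerical illustration of Equation (1), take some made-up values that do not come from any data: $\sigma_{\alpha}^2 = 1$, $\sigma_{\epsilon}^2 = 4$, $\mu_{\alpha} = 0.5$, and an observed return of $R_{n,t} = 3$ (all in percent per month). Because the signal is noisy relative to the dispersion in skill, the trader puts a weight of only one fifth on the realized return:

$\begin{equation*} \mathrm{E}[\alpha_n|R_{n,t}] = {\textstyle \left( \frac{1}{1 + 4} \right)} \times 3 + {\textstyle \left( \frac{4}{1 + 4} \right)} \times 0.5 = 0.6 + 0.4 = 1.0 \end{equation*}$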

The formula in Equation (1) seems to require knowledge of the mean and variance of the skill distribution, $\mu_{\alpha}$ and $\sigma_{\alpha}^2$. But that is not so. Notice that when both skill and error are normally distributed, realized returns are also normally distributed, $R_{n,t} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(\mu_R, \, \sigma_R^2)$, with

$\begin{equation*} \begin{split} \mu_R &= \mu_{\alpha} \\ \sigma_R^2 &= \sigma_{\alpha}^2 + \sigma_{\epsilon}^2 \end{split} \end{equation*}$

So, a trader could form the correct ex-post beliefs about the skill of the $n$th mutual-fund manager by simply estimating the cross-sectional mean and variance of the realized returns (taking the noise variance $\sigma_{\epsilon}^2$ as known)

(2) $\begin{equation*} \mathrm{E}[\alpha_n|R_{n,t}] = {\textstyle \left( \frac{\sigma_R^2 - \sigma_{\epsilon}^2}{\sigma_R^2} \right)} \times R_{n,t} + {\textstyle \left( \frac{\sigma_{\epsilon}^2}{\sigma_R^2} \right)} \times \mu_R \end{equation*}$

since $\sfrac{\sigma_{\alpha}^2}{(\sigma_{\alpha}^2 + \sigma_{\epsilon}^2)} = \sfrac{(\sigma_R^2 - \sigma_{\epsilon}^2)}{\sigma_R^2}$ and $\sfrac{\sigma_{\epsilon}^2}{(\sigma_{\alpha}^2 + \sigma_{\epsilon}^2)} \times \mu_{\alpha} = \sfrac{\sigma_{\epsilon}^2}{\sigma_R^2} \times \mu_R$.

This is the essence of the empirical-Bayes method. If you’re interested in learning from a specific observation and you don’t know which priors to use, then just use the remaining data to estimate these priors. That is, replace $\mu_R$ and $\sigma_R^2$ with $\hat{\mu}_R = \frac{1}{N-1} \cdot \sum_{n' \neq n} R_{n',t}$ and $\hat{\sigma}_R^2 = \frac{1}{N-2} \cdot \sum_{n' \neq n} (R_{n',t} - \hat{\mu}_R)^2$ in Equation (2).
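A minimal Python sketch of this leave-one-out recipe is below. It is an illustration, not the code behind the figure: the function name `eb_posterior_mean`, the simulated parameter values, and the choice to clip the shrinkage weight to $[0, 1]$ when the estimated return variance falls below $\sigma_{\epsilon}^2$ are all assumptions made for the example. Like Equation (2), it takes $\sigma_{\epsilon}$ as known.

```python
import numpy as np


def eb_posterior_mean(returns, sigma_eps):
    """Leave-one-out empirical-Bayes version of Equation (2) for every fund."""
    returns = np.asarray(returns, dtype=float)
    n_funds = returns.size
    if n_funds < 3:
        raise ValueError("need at least 3 funds to form leave-one-out estimates")

    # Leave-one-out mean: average of the other N - 1 funds' returns.
    mu_hat = (returns.sum() - returns) / (n_funds - 1)

    # Leave-one-out variance: squared deviations of the other N - 1 returns
    # around mu_hat, divided by N - 2.
    sum_sq_others = np.sum(returns**2) - returns**2
    var_hat = (sum_sq_others - (n_funds - 1) * mu_hat**2) / (n_funds - 2)

    # Weight on the fund's own return, (var_R - var_eps) / var_R.
    # ASSUMPTION (not in the text): clip to [0, 1] so that a noisy variance
    # estimate below sigma_eps**2 cannot produce a negative weight.
    weight = np.clip(1.0 - sigma_eps**2 / var_hat, 0.0, 1.0)
    return weight * returns + (1.0 - weight) * mu_hat


# Hypothetical simulation: normally distributed skill plus normal noise.
rng = np.random.default_rng(0)
n_sim = 20_000
mu_alpha, sigma_alpha, sigma_eps = 0.5, 1.0, 2.0
alpha = rng.normal(mu_alpha, sigma_alpha, size=n_sim)
returns = alpha + rng.normal(0.0, sigma_eps, size=n_sim)

# Empirical-Bayes estimates, which never look at mu_alpha or sigma_alpha.
eb = eb_posterior_mean(returns, sigma_eps)

# Benchmark from Equation (1), which uses the true prior parameters.
w_true = sigma_alpha**2 / (sigma_alpha**2 + sigma_eps**2)
oracle = w_true * returns + (1.0 - w_true) * mu_alpha

print("mean |EB - oracle|:", np.mean(np.abs(eb - oracle)))
```

With a large cross-section, the two sets of estimates should be close, since $\hat{\mu}_R$ and $\hat{\sigma}_R^2$ converge to $\mu_R$ and $\sigma_R^2$.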

The figure above illustrates how this scheme works. The three left panels show the cross-sectional distributions of measurement error, manager skill, and realized returns under the assumption of normality. The right panel then shows a trader’s posterior beliefs about the skill of the $n$th mutual-fund manager ($y$-axis) after observing this fund’s realized returns in month $t$ ($x$-axis). The purple line shows $\mathrm{E}[\alpha_n|R_n]$ calculated using knowledge of both $\mu_{\alpha}$ and $\sigma_{\alpha}^2$. The dashed black line shows $\mathrm{E}[\alpha_n|R_n]$ calculated via the empirical-Bayes method using the estimates $\hat{\mu}_R$ and $\hat{\sigma}_R^2$ from the cross-sectional distribution of returns.

Tweedie’s Formula

Tweedie’s formula is a natural extension of this approach that doesn’t require manager skill (or whatever it is that traders are trying to learn about) to be normally distributed. Here’s the idea. Notice that in the normally distributed case, $R_{n,t} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(\mu_R, \, \sigma_R^2)$, the probability-density function (PDF) of realized returns is given by:

$\begin{equation*} \mathrm{f}(R) = {\textstyle \frac{1}{\sqrt{2 \cdot \pi \cdot \sigma_R^2}}} \cdot \exp {\textstyle \left\{- \, \frac{1}{2 \cdot \sigma_R^2} \cdot (R-\mu_R)^2 \right\}} \end{equation*}$

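And, if we define the log of this PDF, $\ell(R) = \log \mathrm{f}(R)$, then $\ell'(R) = - \, (\sfrac{1\!}{\sigma_R^2}) \cdot (R - \mu_R)$. Since the right-hand side of Equation (2) can be rearranged as $R_{n,t} - (\sfrac{\sigma_{\epsilon}^2}{\sigma_R^2}) \cdot (R_{n,t} - \mu_R)$, we can write the formula for a trader’s posterior beliefs as follows: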

(3) $\begin{equation*} \mathrm{E}[\alpha_n|R_n] = R_n + \sigma_{\epsilon}^2 \cdot \ell'(R_n) \end{equation*}$

This is Tweedie’s formula. And, in keeping with Stigler’s law, it was Robbins (1956) who showed that Tweedie’s formula holds approximately for any prior distribution on $\alpha_n$ that satisfies standard regularity conditions, such as being smooth and having a single peak. The formula in Equation (3) is interesting because it means that, if a trader can approximate the cross-sectional distribution of mutual-fund returns (i.e., estimate $\hat{\mathrm{f}}(R)$ ), then he can appropriately update his beliefs about any particular manager’s skill level (i.e., compute $\hat{\mathrm{E}}[\alpha|R_{n,t}]$ ). There’s no need for him to take a hard-line dogmatic stance about what the cross-sectional distribution of mutual-fund manager skill looks like.
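To make the recipe concrete, here is a rough Python sketch of Equation (3) with an estimated density. It is not the code behind the figure. As an assumption made for the example, it approximates $\ell(R)$ by fitting a $4$th-order polynomial to the log of histogram bin counts with weighted least squares; a Poisson regression on the bin counts would be the more careful choice. The function name `tweedie_posterior_mean`, the number of bins, and the simulated parameter values are likewise hypothetical.

```python
import numpy as np


def tweedie_posterior_mean(returns, sigma_eps, degree=4, n_bins=50):
    """Apply R + sigma_eps^2 * l'(R), with l'(R) taken from a polynomial fit."""
    returns = np.asarray(returns, dtype=float)

    # Histogram of the cross-section of returns.
    counts, edges = np.histogram(returns, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # ASSUMPTION: drop empty bins and fit log(counts) by weighted least squares.
    # log(counts) equals log f(R) plus a constant, and the constant vanishes
    # once we differentiate. Weights sqrt(counts) reflect that the variance of
    # a log count is roughly 1 / count.
    keep = counts > 0
    log_density = np.polynomial.Polynomial.fit(
        centers[keep],
        np.log(counts[keep]),
        deg=degree,
        w=np.sqrt(counts[keep]),
    )

    # Estimated score l'(R), evaluated at each fund's realized return.
    score = log_density.deriv()(returns)
    return returns + sigma_eps**2 * score


# Hypothetical simulation with a normal prior, so Equation (1) is the benchmark.
rng = np.random.default_rng(1)
n_sim = 20_000
mu_alpha, sigma_alpha, sigma_eps = 0.0, 1.0, 2.0
alpha = rng.normal(mu_alpha, sigma_alpha, size=n_sim)
returns = alpha + rng.normal(0.0, sigma_eps, size=n_sim)

tweedie = tweedie_posterior_mean(returns, sigma_eps)

w_true = sigma_alpha**2 / (sigma_alpha**2 + sigma_eps**2)
oracle = w_true * returns + (1.0 - w_true) * mu_alpha

# Compare in the bulk of the distribution, where the density fit is reliable.
sigma_r = np.sqrt(sigma_alpha**2 + sigma_eps**2)
central = np.abs(returns - mu_alpha) < 2.0 * sigma_r
print("mean |Tweedie - oracle|:", np.mean(np.abs(tweedie[central] - oracle[central])))
```

Nothing in `tweedie_posterior_mean` refers to the skill distribution. The same function can be pointed at returns generated under a different prior, although the polynomial fit is least trustworthy in the tails, where bin counts are small.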

To see this point in action, check out the figure above. First, click on the “Normal” button. This version of the figure replicates the earlier result by estimating $\mathrm{f}(R)$ with a $4$th-order polynomial rather than by directly estimating $\hat{\mu}_R$ and $\hat{\sigma}_R^2$. The interesting part, however, is that this result also holds when mutual-fund manager skill is not normally distributed. For example, suppose that skill obeys a Laplace distribution:

$\begin{equation*} \alpha_n \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{Lap}(\lambda_{\alpha}/\sigma_{\epsilon}) \qquad \text{where} \qquad \mathrm{Lap}(\theta) = (\sfrac{\theta\!}{2}) \cdot e^{- \theta \cdot |x|} \end{equation*}$

Under this assumption, the correct way for a trader to update his prior beliefs about the $n$th manager’s skill after observing the manager’s realized return in month $t$ is to use a threshold rule. When the manager’s return is sufficiently small in absolute value, $|R_{n,t}| \leq \sigma_{\epsilon} \cdot \lambda_{\alpha}$, a trader should not update his beliefs at all:

$\begin{equation*} \mathrm{E}[\alpha_n|R_n] = \begin{cases} \mathrm{Sgn}(R_n) \cdot (|R_n| - \sigma_{\epsilon} \cdot \lambda_{\alpha}) &\text{if } |R_n| > \sigma_{\epsilon} \cdot \lambda_{\alpha} \\ 0 &\text{otherwise} \end{cases} \end{equation*}$

This is the Bayesian LASSO. And, by clicking on the “Laplace” button in the figure above, you can see how Tweedie’s formula captures this non-responsiveness without having to directly assume that traders are using an $\ell_1$ penalty. Something resembling a soft-thresholding rule just emerges from the data.
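One way to see where the cutoff $\sigma_{\epsilon} \cdot \lambda_{\alpha}$ comes from: the threshold rule coincides with the maximizer of the posterior density (the posterior mode) when a Gaussian likelihood is combined with the Laplace prior above, which is the same as minimizing squared error plus an $\ell_1$ penalty. The Python sketch below checks this numerically by brute-force minimization on a grid. The function names, the grid, and the parameter values are assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(r, sigma_eps, lam_alpha):
    """Threshold rule from the text: shrink |r| by sigma_eps * lam_alpha, floor at 0."""
    r = np.asarray(r, dtype=float)
    cutoff = sigma_eps * lam_alpha
    return np.sign(r) * np.maximum(np.abs(r) - cutoff, 0.0)


def posterior_mode_on_grid(r, sigma_eps, lam_alpha, grid):
    """Minimize the negative log posterior (up to a constant) over a grid of alphas."""
    # Gaussian likelihood term plus Laplace-prior term with rate lam_alpha / sigma_eps.
    objective = (r - grid) ** 2 / (2.0 * sigma_eps**2) + (lam_alpha / sigma_eps) * np.abs(grid)
    return grid[np.argmin(objective)]


# Hypothetical parameter values; the implied cutoff is 2.0 * 0.75 = 1.5.
sigma_eps, lam_alpha = 2.0, 0.75
grid = np.linspace(-10.0, 10.0, 20001)  # candidate alphas, spaced 0.001 apart

for r in (-4.0, -1.0, 0.5, 3.0):
    mode = posterior_mode_on_grid(r, sigma_eps, lam_alpha, grid)
    rule = float(soft_threshold(r, sigma_eps, lam_alpha))
    print(f"R = {r:5.2f}   grid mode = {mode:6.3f}   threshold rule = {rule:6.3f}")
```

Returns inside the band $[-1.5, 1.5]$ map to zero under both calculations, while returns outside it are pulled toward zero by exactly the cutoff.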

Actual Data

Finally, click on the “???????” button. In the lower left-hand corner, you should now see the distribution of returns for all actively managed equity mutual funds in May 2012 (normalized to be on the same scale as the data from the earlier simulations). The data in this version of the figure comes from WRDS. I just picked one month at random. The dashed line is the estimated PDF for this return distribution, which I again computed using a $4$th-order polynomial (see CASI, Ch. 15). There is nothing in the middle box because I don’t know the skill level of each mutual-fund manager. In the upper-left box, I’ve plotted the PDF of the measurement error. But, I didn’t plot a histogram of realized errors because, again, I can’t tell skill from luck. I can only see the cross-section of returns.

There are two interesting things about the plot of the posterior beliefs on the right. The first is that, if you apply Tweedie’s formula to actual mutual-fund returns, then you get a picture that looks a lot like the picture that emerged using Laplace priors. In other words, it looks a lot like a world where the distribution of active mutual-fund manager skill has fat tails, i.e., a world where there are a couple of very skilled managers and a couple of utterly incompetent managers and everyone else is just sort of ‘meh’. The second interesting thing about this picture is that making prices less informative (i.e., increasing $\sigma_{\epsilon}^2$) affects traders’ posterior beliefs in a highly non-linear way. Put differently, when prices are less informative, traders don’t just react less to all price signals. They just stop reacting to small price changes. This is not a result that you could get in an information-based asset-pricing model with strictly normal shocks.
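The contrast in that last point can be sketched in a few lines of Python. The example compares the linear rule from Equation (1) with the threshold rule from the Laplace case as $\sigma_{\epsilon}$ doubles. All parameter values and function names are hypothetical, and the two rules are the stylized ones from the earlier sections, not estimates from the mutual-fund data.

```python
import numpy as np


def normal_rule(r, sigma_alpha, sigma_eps, mu_alpha=0.0):
    """Posterior mean under a normal prior, as in Equation (1)."""
    w = sigma_alpha**2 / (sigma_alpha**2 + sigma_eps**2)
    return w * r + (1.0 - w) * mu_alpha


def threshold_rule(r, sigma_eps, lam_alpha):
    """Threshold rule from the Laplace case, with cutoff sigma_eps * lam_alpha."""
    cutoff = sigma_eps * lam_alpha
    return np.sign(r) * np.maximum(np.abs(r) - cutoff, 0.0)


# Hypothetical price signals: two small ones and one large one.
signals = np.array([0.5, 1.0, 4.0])

for sigma_eps in (1.0, 2.0):
    linear = normal_rule(signals, sigma_alpha=1.0, sigma_eps=sigma_eps)
    thresh = threshold_rule(signals, sigma_eps=sigma_eps, lam_alpha=1.0)
    print(f"sigma_eps = {sigma_eps}: normal prior -> {linear}, threshold rule -> {thresh}")
```

Under the normal prior, raising $\sigma_{\epsilon}$ scales the response to every signal by the same factor. Under the threshold rule, the band of signals that get no response at all widens, while large signals still move beliefs.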