Research Notebook

Why do ‘as if’ critiques only apply to survey evidence?

November 10, 2020 by Alex

Milton Friedman laid out his methodological approach to doing economics in his 1953 essay, The Methodology of Positive Economics. This essay gives his answer to the question: What constitutes a good economic model? Or, put differently, how would you recognize a good economic model if you saw one?

According to Friedman, “the only relevant test of the validity of a hypothesis is the comparison of its predictions with experience. The hypothesis is rejected if its predictions are contradicted; it is accepted if its predictions are not contradicted.” All that matters is whether or not a model fits the data. Assumptions? Priors? Intuition? All that stuff is just moonshine. Empirical fit reigns supreme. This is an extreme view!

For example, in Friedman’s eyes, a good model of how leaves are distributed about the canopy of an oak tree is a model in which each leaf optimally chooses its position and orientation relative to its neighbors. Yes, we know that leaves don’t have brains. They can’t actually make decisions like this. But it is ‘as if’ they could. So a model in which each leaf strategically chooses where to grow is a good model of leaf placement.

A good model of how an expert billiards player makes difficult shots would be a model in which “he knew the complicated mathematical formulas that would give the optimum directions of travel, could estimate accurately by eye the angles, could make lightning calculations from the formulas, and could then make the balls travel in the direction indicated by the formulas.” So what if the player can’t do these things? We know he regularly makes difficult shots, so it’s ‘as if’ he can. Friedman tells us to just model him like that anyway.

In Friedman’s view, “a theory cannot be tested by comparing its ‘assumptions’ directly with ‘reality.’ Indeed, there is no meaningful way in which this can be done.” In fact, Friedman argues that insisting on reasonable assumptions can be misleading. “The more significant the theory, the more unrealistic the assumptions.”

Every economist knows about Friedman’s ‘as if’ approach to model evaluation. If asked, most economists will say that Friedman’s methodological approach is, if not correct, then at least reasonable. They will argue that it’s at least important to consider ‘as if’ justifications when evaluating a model.

But here’s the thing: no working economist actually evaluates models this way! Aside from one glaring exception, no economist actually thinks ‘as if’ models are helpful. Ask yourself: Is the factor zoo a problem for asset pricing? Yes. But what is a spurious factor? It’s a factor that fits the data for wrong reasons. It is ‘as if’ investors were using it to price assets even though they aren’t. And that’s precisely the problem!

The idea that we can’t test (or shouldn’t even bother testing) the assumptions behind our economic models is simply preposterous. It’s a claim that Steven Pinker would call a “conventional absurdity: a statement that goes against all common sense but that everyone believes because they dimly recall having heard it somewhere and because it is so pregnant with implications.” No economist does research this way!

Why not replace all economic models with uninterpretable machine-learning (ML) algorithms? ML algorithms can fit the data well precisely because they contain no economic assumptions. But TANSTAAFL! It is precisely the economic assumptions about what agents are trying to do that give us confidence a model’s predictions will hold up when conditions change. In other words, these assumptions are what allow economists to use the model for counterfactual analysis—i.e., to make predictions in new and as-yet-unseen environments. The right assumptions embedded in a good economic model are responsible for its robust predictions. If you’re going to ignore all such economic restrictions, then there’s no point in writing down an economic model in the first place. There are better ways to do pure prediction.

I’m by no means the first person to highlight these issues. They long predate the factor zoo and the popularity of ML algorithms. If I had to pick one person to judge the quality of an economic model, that person would be Paul Samuelson. And Samuelson strongly disagreed with Friedman’s ‘as if’ approach. Samuelson clearly recognized the importance of evaluating your assumptions, disparagingly referring to Friedman’s ‘as if’ methodology as the “F-Twist” in a 1963 discussion paper.

Moreover, in almost every context, economists approach research in a manner more consistent with Samuelson than with Friedman. They firmly believe it’s important to verify one’s assumptions. This is why we see papers with titles like Do Measures of Financial Constraints Measure Financial Constraints? getting hundreds of cites a year. This influential paper is entirely concerned with testing our working assumptions.

As far as I can tell, there is only one context in which economists actually use ‘as if’ reasoning to constrain the research process—namely, when interpreting survey data. Standard asset-pricing models assume investors are solving an optimization problem that looks something like

(1)   \begin{equation*} \begin{array}{rl} \text{maximize} & \Exp\left[ \, \sum_{t=0}^\infty \, \beta^t \cdot U_t \, \right] \\ \text{subject to} & \quad\,\;\; U_t = \mathrm{u}(C_t) \\ & \Delta W_{t+1} = \mathrm{f}(C_t, \, X_t; W_t) \\ & \qquad\,\, 0 \leq \mathrm{g}_n(C_t, \, X_t; W_t) \qquad \text{constraints } n=1, \ldots,\,N \end{array} \end{equation*}

Economists regularly test assumptions about investor preferences U = \mathrm{u}(C), the law of motion for wealth, \Delta W = \mathrm{f}(C, \, X; W), and various other kinds of economic constraints, 0 \leq \mathrm{g}_n(C, \, X; W). However, for some reason, it’s entirely taboo to ask investors whether they are actually trying to solve this problem in the first place.

Friedman directly calls out survey data in his 1953 essay, writing that “questionnaire studies of businessmen’s or others’ motives or beliefs about the forces affecting their behavior… seem to me almost entirely useless as a means of testing the validity of economic hypotheses.” However, he offers no concrete reasons why economists should think about the “maximize” part of investors’ optimization problem any differently than the “subject to” part. Both are assumptions. In Friedman’s eyes, both are untestable.

Yes, survey data can be misleading. Above I describe a situation where surveying economists about their views on ‘as if’ reasoning would yield specious evidence. But all data can be misleading. It’s not like NOT using survey data has resolved the factor zoo. Sometimes investors give uninformative answers, which might lead researchers down the wrong path. But this doesn’t mean that we can’t learn anything concrete about how investors price assets from a well-constructed survey. Not every regression result is informative. Some regression estimates can even be misleading. None of this implies that regression analysis is worthless.

Friedman’s 1953 essay outlines a bad approach to model evaluation. There’s more to a good model than R^2 = 100\%. Paul Samuelson knew this to be true. And, except when they’re looking at survey data, every other economist knows it to be true as well. There’s no reason for us to continue applying ‘as if’ reasoning only in this particular context. It’s just not a valid argument for dismissing survey evidence about a model.

Filed Under: Uncategorized

Consumption Risk In Modern Macro-Finance Models

October 8, 2020 by Alex

Stock returns are 8\% per year higher than bond returns on average. It’s hard to explain such a large equity premium using the standard consumption-based model because consumption growth isn’t risky enough. So, to fix this problem, modern macro-finance models introduce new state variables capturing other kinds of risk investors might care about, such as the surplus consumption ratio (habits; Campbell-Cochrane 1999) and news about long-run consumption growth (long-run risks; Bansal-Yaron 2004).

Because exposure to one of these new state variables typically explains most of the 8\% per year equity premium in modern macro-finance models, there’s a sense among researchers that it doesn’t matter whether investors are trying to insure themselves against shocks to consumption growth.

Not so!

This post shows why. The new state variables used in modern macro-finance models are not separate from consumption risk. They’re ways of amplifying the effects of consumption risk. Arguing that investors don’t care about consumption risk because exposure to the surplus consumption ratio explains most of the 8\% per year equity premium is like arguing that Hollywood doesn’t care about beauty because plastic surgery has a bigger effect on casting decisions than the face actors are born with. It’s nonsense.

Consumption CAPM

Investors in the consumption capital asset-pricing model (CCAPM; Lucas 1978) try to maximize their expected discounted utility by choosing how much to consume, C_t, how much to invest in the stock market, S_t, and how much to invest in riskless bonds, B_t:

(1)   \begin{equation*} \begin{array}{rl} \text{maximize} & \Exp\left[ \, \sum_{t=0}^\infty \, \beta^t \cdot U_t \, \right] \\ \text{subject to} & \quad U_t = C_t^{1-\gamma} / \, (1-\gamma) \\ & W_{t+1} = (W_t - C_t) \cdot (1 + R_f) + S_t \cdot (R_{t+1} - R_f) \\ & \,\,\,\,\,W_t = C_t + B_t + S_t \end{array} \end{equation*}

\beta \in (0, \, 1) is investors’ subjective time preference, U_t is their utility from consumption, W_t represents their wealth, and \gamma > 0 is their coefficient of risk aversion. (1+R_{t+1}) = (P_{t+1} + D_{t+1})/P_t is the gross return on the value-weighted stock market, and (1+R_f) is the return on a riskless bond.

The CCAPM predicts that the expected excess return on stocks, \Exp[R_{t+1}] - R_f, will be proportional to the covariance between consumption growth and stock returns:

(2)   \begin{equation*} \Exp[R_{t+1}] - R_f \approx \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}] \end{equation*}

The price of stocks is inversely related to the expected market return, \Exp[1+R_{t+1}] = \Exp[P_{t+1} + D_{t+1}]/P_t. So, the CCAPM says that investors pay more for stocks when stock returns tend to offset negative consumption shocks, \Cov[\Delta \log C_{t+1}, \, R_{t+1}] < 0. In other words, Equation (2) says CCAPM investors view the stock market as a way to insure future consumption shocks.

Equity Premium Puzzle

Unfortunately for the CCAPM, investors’ desire to hedge future consumption shocks can’t explain the entire \Exp[R_{t+1}] - R_f \approx 8\% per year equity premium that we observe in the data on its own. To see why, consider rewriting the right-hand side of Equation (2) using the definition of a covariance:

(3)   \begin{equation*} \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}]  =  \gamma \times \big( \, \rho \cdot \sigma_{\Delta \log C} \cdot \sigma_R \, \big) \end{equation*}

The parameter \rho = \Corr[\Delta \log C_{t+1}, \, R_{t+1}] is the correlation between consumption growth and stock returns, \sigma_{\Delta \log C} = \Sd[\Delta \log C_{t+1}] is the volatility of consumption growth, and \sigma_R = \Sd[R_{t+1}] is the volatility of stock returns. In the data, we observe values of roughly \rho \approx 0.20, \sigma_{\Delta \log C} \approx 1\% per year, and \sigma_R \approx 16\% per year. Thus, investors would need a risk aversion of \gamma = 250 for the CCAPM to explain an 8\% equity premium.

A risk aversion of \gamma = 250 seems too high. But, if we assume a lower risk aversion of merely \gamma = 10, the equity premium should only be 10 \times (0.20 \cdot 1\% \cdot 16\%) \approx 0.32\% per year according to the CCAPM. Given the power of compounding to increase investor wealth over long horizons, a difference of 8\% - 0.32\% = 7.68\% per year is a big deal. CCAPM investors with a \gamma = 10 should be putting much more money in stocks than they actually are, which would drive up stock prices and thereby lower the expected returns. So, if the basic logic of the CCAPM is correct, there must be something else about stock returns scaring investors away.
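The arithmetic behind the puzzle is short enough to verify directly. Here is a quick sketch, using the illustrative numbers quoted above (\rho \approx 0.20, \sigma_{\Delta \log C} \approx 1\%, \sigma_R \approx 16\%):

```python
# Back-of-the-envelope check of the equity-premium arithmetic above.
# All inputs are the illustrative values quoted in the text.

rho = 0.20        # Corr[consumption growth, stock returns]
sigma_c = 0.01    # Sd[consumption growth], per year
sigma_r = 0.16    # Sd[stock returns], per year
premium = 0.08    # observed equity premium, per year

# Risk aversion needed for the CCAPM (Eq. 2-3) to match the 8% premium
gamma_implied = premium / (rho * sigma_c * sigma_r)
print(round(gamma_implied))          # 250

# Premium implied by a more palatable gamma = 10
gamma = 10
implied_premium = gamma * rho * sigma_c * sigma_r
print(round(implied_premium, 4))     # 0.0032, i.e. 0.32% per year
```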

External Habit

Campbell-Cochrane (1999) argue that the something else is captured by a variable called the surplus consumption ratio. Their model starts with Problem (1) and plugs in a modified utility function:

(4)   \begin{equation*}  U_t = (C_t-X_t)^{1-\gamma} / \, (1-\gamma) \end{equation*}

In this specification, investors care about their consumption in excess of the level of consumption they have become accustomed to, X_t. This level corresponds to a weighted average of past consumption:

(5)   \begin{equation*} \log X_t  = \lambda \cdot {\textstyle \sum_{\ell=0}^{\infty}} \, \phi^{\ell} \cdot \log C_{t-\ell} \qquad \qquad \lambda > 0, \, \phi \in (0, \, 1) \end{equation*}

So, drops in consumption following prolonged periods of high consumption are extremely painful for investors. Conversely, an increase in consumption following a long hungry spell will be really enjoyable.
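To make Equation (5) concrete, here is a minimal sketch of computing the habit level from a consumption path. The path and the \lambda value are made up for illustration; \phi = 0.87 matches the calibration value quoted later in the post:

```python
# A minimal sketch of the habit level in Eq. (5): log X_t is an
# exponentially-weighted sum of past log consumption.
# The consumption path and lambda are made-up illustrative values.

phi, lam = 0.87, 0.13
log_C = [0.0, 0.01, 0.03, 0.02, 0.05]   # log consumption, oldest first

# log X_t = lam * sum_{l >= 0} phi^l * log C_{t-l}
log_X = lam * sum(phi**l * log_C[-1 - l] for l in range(len(log_C)))
print(round(log_X, 5))   # recent consumption gets the largest weight
```

Because recent consumption carries the most weight, the habit X_t ratchets up during a boom, which is exactly why a subsequent drop in C_t is so painful.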

The surplus consumption ratio is Z_t = (C_t - X_t) / C_t. The model says expected stock returns could be high either because stock returns covary with consumption growth or because they covary with growth in the surplus consumption ratio:

(6)   \begin{equation*} \Exp[R_{t+1}] - R_f  \approx  \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}]  +  \gamma \times \Cov[\Delta \log Z_{t+1}, \, R_{t+1}] \end{equation*}

Yes, the covariance between consumption growth and stock returns isn’t strong enough to explain the 8\% equity premium on its own. But, stock market crashes tend to sucker punch investors, occurring just when investors’ consumption falls following a prolonged boom. \Cov[\Delta \log Z_{t+1}, \, R_{t+1}] > 0 explains nearly all of the 8\% per year equity premium according to this habit-formation model.

Long-Run Risk

Bansal-Yaron (2004) add a different state variable to the CCAPM. This model also starts with Problem (1) and plugs in a new utility function based on Epstein-Zin (1989) recursive preferences rather than habit formation:

(7)   \begin{equation*} U_t = \left\{ \, (1 - \beta) \cdot C_t^{1-\alpha} + \beta \cdot \big(\Exp_t[U_{t+1}^{1-\gamma}]^{\frac{1}{1-\gamma}}\big)^{1-\alpha} \, \right\}^{\frac{1}{1-\alpha}} \end{equation*}

These preferences are recursive because they indicate that investors care not only about their consumption today, (1 - \beta) \cdot C_t^{1-\alpha}, but also the present value of their expected future consumption, \beta \cdot \big(\Exp_t[U_{t+1}^{1-\gamma}]^{\frac{1}{1-\gamma}}\big)^{1-\alpha}.

\beta and \gamma represent investors’ time preferences and risk aversion just like in the original power utility specification. The only new parameter is \alpha > 1. The ratio 1/\alpha represents investors’ elasticity of intertemporal substitution (EIS). This parameter captures how much investors want to resolve future uncertainty about consumption, not because they want to do something with the information but because resolving uncertainty as soon as possible makes them happy. The guy on the subway platform who’s leaning dangerously out onto the tracks staring down the tunnel so that he can be the first to spot the next train is someone with a very high EIS. He wants to know as soon as possible if the next train is imminent, not because it will allow him to board sooner (everyone boards at the same time) but because knowing the train is about to arrive makes him happy. For a long-run risk model to work, we need \gamma \cdot (1/\alpha) > 1.

Let P_t denote the current price of an asset whose payout is aggregate consumption in the following period. The key new state variable in the long-run risk model is \log Z_{t+1} = \log (P/C)_{t+1}. The model says that the equity premium will be determined as follows:

(8)   \begin{equation*} \Exp[R_{t+1}] - R_f  \approx \gamma \times \Cov[ \, \Delta \log C_{t+1}, \, R_{t+1} \, ] + \mathrm{f}(\gamma, \, \alpha) \times \Cov[ \, \log Z_{t+1}, \, R_{t+1} \, ] \end{equation*}

\mathrm{f}(\gamma, \, \alpha) \leq 0 is a function that stems from a Campbell-Shiller (1988) approximation of Z_t. Thus, the long-run risk model says expected stock returns could be high either because future stock returns tend to covary with consumption growth or because these stock returns tend to covary with the future price-to-dividend ratio of the aggregate consumption claim. This price-to-dividend ratio will partly reflect current changes in consumption, \Delta \log C_{t+1}. But, since consumption growth is persistent and investors have recursive preferences, it will also reflect future consumption shocks. In calibrations, most of the 8\% per year equity premium is explained by variation in \log Z_{t+1} coming from consumption shocks far off in the future.

Source Of Confusion

What would it mean for investors not to care about consumption risk in one of these models? \rho = \Corr[\Delta \log C_{t+1}, \, R_{t+1}] captures the stock market’s exposure to consumption risk. When \rho \approx 1, stock market booms always coincide with increases in consumption. When \rho = 0, knowing that the stock market is booming tells you nothing about whether aggregate consumption is increasing or decreasing. So, investors in a particular model would be indifferent to changes in consumption risk if

(9)   \begin{equation*} \partial_{\rho} (\Exp[R_{t+1}] - R_f) = 0 \end{equation*}

In other words, they wouldn’t care about consumption risk if increasing the amount of consumption risk had no effect on their demand and thus no effect on equilibrium prices.

We saw above that, if we assume a risk aversion coefficient of \gamma = 10, then the first term in Equations (6) and (8) is very small. \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}] \approx 0.3\%. And, as a result, the effect of an increase in consumption risk on asset prices coming from this first term is quite small as well:

(10)   \begin{equation*} \partial_{\rho} (\gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}]) = \gamma \times \sigma_{\Delta \log C} \cdot \sigma_R \approx 0.016 \end{equation*}

Judged only by the effect of this initial term, an increase in consumption risk from \rho = 0.00 to \rho = 0.40 would only increase the expected excess return on the stock market by 0.016 \times 0.40 = 0.64\% per year. We observe a correlation between stock returns and consumption growth of \rho = 0.20 in the data. So, these numbers imply that a 2\times swing around the mean \rho would explain less than a tenth of the total 8\% per year equity premium puzzle if consumption risk only affected asset prices via the \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}] term.
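The contribution of this first term is a one-line calculation, repeated here with the same illustrative numbers used throughout the post:

```python
# Marginal effect of consumption risk via the first term alone (Eq. 10),
# using the same illustrative numbers as earlier in the post.

gamma, sigma_c, sigma_r = 10, 0.01, 0.16

# d/d_rho of gamma * rho * sigma_c * sigma_r is just this constant
d_premium_d_rho = gamma * sigma_c * sigma_r
print(round(d_premium_d_rho, 3))     # 0.016

# Effect of moving rho from 0.00 to 0.40 through this term only
delta = d_premium_d_rho * 0.40
print(round(delta, 4))               # 0.0064, i.e. 0.64% per year
```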

However, consumption risk doesn’t only affect asset prices via the \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}] term in Equations (6) and (8). Therefore

(!!!)   \begin{equation*} \partial_{\rho} (\gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}]) \approx 0 \qquad \text{does \underline{\textbf{not}} imply} \qquad \partial_{\rho} (\Exp[R_{t+1}] - R_f) \approx 0 \end{equation*}

Such a conclusion would only be valid if the new state variables introduced in Campbell-Cochrane (1999) and Bansal-Yaron (2004) happened to be unrelated to consumption growth. This is absolutely not the case! Changes in the surplus consumption ratio are highly correlated with consumption growth. And, the long-run risk model assumes that consumption growth is very persistent, so price changes due to anticipated consumption shocks in the far distant future will be highly correlated with consumption growth today too.

Plugging In Numbers

How much does the Campbell-Cochrane (1999) model suggest expected excess returns should increase in response to a move from \rho = 0 to \rho = 0.40? Campbell-Cochrane (1999) describe habit formation as an “amplification mechanism for consumption risks in marginal utility” (page 240). Mathematically, this shows up as a scaling up of the risk-aversion coefficient from \gamma to \gamma / \Exp[Z_t]. The authors use \phi = 0.87. With \sigma_{\Delta \log C} = 1\% per year and \gamma = 10, the average surplus consumption ratio is \Exp[Z_t] = \sigma_{\Delta \log C} \cdot \sqrt{\frac{\gamma}{1 - \phi}} \approx 0.088. So, in the external habit model, the effect of consumption risk on asset prices will be:

(11)   \begin{equation*} \partial_{\rho} (\Exp[R_{t+1}] - R_f) = (\gamma / 0.088) \times \sigma_{\Delta \log C} \cdot \sigma_R \approx 0.18 \end{equation*}

Because increasing the stock market’s correlation with consumption growth must also increase its correlation with growth in the surplus consumption ratio, a \Delta \rho = 0.40 increase in consumption risk will increase the annual expected excess return on the stock market by 0.18 \times 0.40 \approx 7.3\% in a habit model.
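The habit-model calculation can be reproduced in a few lines, plugging in the calibration values quoted above:

```python
# The Campbell-Cochrane amplification, using the calibration values
# quoted in the text (phi = 0.87, gamma = 10, sigma_c = 1%, sigma_r = 16%).
import math

gamma, phi = 10, 0.87
sigma_c, sigma_r = 0.01, 0.16

# Steady-state surplus consumption ratio, E[Z] = sigma_c * sqrt(gamma/(1-phi))
E_Z = sigma_c * math.sqrt(gamma / (1 - phi))
print(round(E_Z, 3))                 # 0.088

# Effect of consumption risk with the amplified risk aversion gamma / E[Z]
d_premium_d_rho = (gamma / E_Z) * sigma_c * sigma_r
print(round(d_premium_d_rho, 2))     # 0.18

# A rho move of 0.40 then raises the premium by about 7.3% per year
print(round(d_premium_d_rho * 0.40, 3))
```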

How much does the Bansal-Yaron (2004) model suggest expected excess returns should increase in response to a \Delta \rho = 0.40 increase in consumption risk? Cochrane (2017) describes how this model “ties its extra state variables… to observables by the assumption of a time-series process in which short-run consumption growth is correlated with… long-run news.” When 1/\alpha \approx 1, the function \mathrm{f}(\gamma, \, \alpha) = 1 - \gamma and Equation (8) can be re-written as:

(12)   \begin{equation*} \Exp[R_{t+1}] - R_f  \approx \gamma \times \Cov[ \, \Delta \log C_{t+1}, \, R_{t+1} \, ] + (1 - \gamma) \times \Cov[ \, \log (P/C)_{t+1}, \, R_{t+1} \, ] \end{equation*}

Changing an asset’s correlation with consumption growth also changes its correlation with the future log price-to-consumption ratio. I estimate \log (P/C)_{t+1} = 3.61 - 30.26 \cdot \Delta \log C_{t+1} + \varepsilon_{t+1}, which would imply that \partial_{\rho} (\Exp[R_{t+1}] - R_f) = [\gamma - (1-\gamma) \cdot 30.26] \cdot \sigma_{\Delta \log C} \cdot \sigma_R \approx 0.45. Thus, a \Delta \rho = 0.40 increase in consumption risk will increase annual expected excess returns by 0.45 \times 0.40 \approx 18\% in the long-run risk model!
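The long-run-risk arithmetic works the same way, substituting the estimated regression slope into Equation (12):

```python
# The long-run-risk back-of-the-envelope, using the regression slope
# quoted in the text: log(P/C) loads on consumption growth with -30.26.

gamma = 10
slope = -30.26                   # estimated loading from the text
sigma_c, sigma_r = 0.01, 0.16

# Eq. (12): premium = gamma*Cov[dc, R] + (1 - gamma)*Cov[log(P/C), R],
# and Cov[log(P/C), R] = slope * Cov[dc, R], so the total loading is:
loading = gamma + (1 - gamma) * slope
print(round(loading, 2))             # 282.34

d_premium_d_rho = loading * sigma_c * sigma_r
print(round(d_premium_d_rho, 2))     # 0.45

# A rho move of 0.40 raises the premium by roughly 18% per year
print(round(d_premium_d_rho * 0.40, 2))   # 0.18
```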


Factor Models, Little Green Men, And Machine Learning

June 28, 2019 by Alex

Economists use machine learning (ML) to study asset prices in two different ways. Approach #1: use these techniques to predict the cross-section of expected returns—i.e., to predict which stocks are most likely to have high or low future returns. e.g., see here, here, or here. Approach #2: use them to try to uncover the “true asset-pricing model”—a.k.a., the “set of priced risk factors”.

Many economists dismiss approach #1, arguing that predicting future stock returns is a job for traders not academics. Instead, it’s much more common for researchers to adopt approach #2. The conventional wisdom is that we, as researchers, will learn something deep and fundamental about how financial markets work if one of these new ML techniques uncovers a factor model that perfectly explains the cross-section of expected returns. There’s a widely held view that doing empirical asset-pricing research means attributing differences in expected returns to some risk-return tradeoff with an intuitive story attached to it.

But… not so fast. There’s actually something paradoxical about the logic of approach #2, a problem with the conventional wisdom. And, the goal of this post is to explain what that paradox is.

Factor Models

But first: factor models. What are economists talking about when they say they’re trying to find the “true asset-pricing model” or the “set of priced risk factors”? To get a handle on this terminology, consider regressing the returns of each stock, R_{n,t}, on lagged values of some predictive variable, X_{n,t-1}:

    \begin{equation*} R_{n,t} = \hat{a} + \hat{b} \cdot X_{n,t-1} + \hat{e}_{n,t} \end{equation*}

The results of a predictive regression like this one can be interpreted as trading-strategy returns. You can read the estimated \hat{b} as the return to a zero-cost portfolio that’s long high-X stocks and short low-X stocks:

    \begin{equation*} \hat{b} \propto {\textstyle \frac{1}{N} \cdot \sum_n} \, (R_{n,t} - \bar{R}_t) \cdot (X_{n,t-1} - \bar{X}_{t-1}) \end{equation*}

Thus, \hat{b} > 0 implies both that stocks with high predictor values yesterday, (X_{n,t-1} - \bar{X}_{t-1}) > 0, tended to have high excess returns today, (R_{n,t} - \bar{R}_t) > 0, and also that it would have been profitable to trade on X today.
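This equivalence between a regression slope and a trading-strategy return is easy to demonstrate numerically. The following sketch uses made-up data; the point is only that \hat{b} and the zero-cost portfolio return differ by a fixed scaling factor:

```python
# A small numerical demonstration (with made-up data) that the OLS
# slope from the predictive regression is proportional to the return
# on a zero-cost portfolio with weights (X - mean(X)).
import numpy as np

rng = np.random.default_rng(0)
N = 500
X = rng.normal(size=N)                        # lagged predictor values
R = 0.02 * X + rng.normal(scale=0.1, size=N)  # returns, made-up DGP

# OLS slope of R on X
b_hat = np.cov(R, X, bias=True)[0, 1] / np.var(X)

# Return on the zero-cost portfolio long high-X, short low-X stocks
weights = X - X.mean()
portfolio_return = (weights * (R - R.mean())).mean()

# The two differ only by the scaling factor Var[X]
assert np.isclose(b_hat * np.var(X), portfolio_return)
print(b_hat, portfolio_return)
```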

It could be that an estimated \hat{b} > 0 represents arbitrage profits. But, maybe trading on X is only profitable because it requires investors to bear lots of non-diversifiable risk? Imagine that investors are all really worried about not having enough money during future market crashes, R_{\mathit{Mkt},t} \ll 0. Then, if the predictive variable X_{n,t-1} turned out to be capturing exposure to market risk,

    \begin{equation*} X_{n,t-1} = {\textstyle \frac{\mathrm{Cov}[R_{n,t}, \, R_{\mathit{Mkt},t}]}{\mathrm{Var}[R_{\mathit{Mkt},t}]}} \end{equation*}

the profits earned by trading on X would represent compensation for holding a portfolio that will deliver terrible returns during market crashes—i.e., at the worst possible time as far as investors are concerned. And, when economists think this is what’s going on, they typically write the predictive variable as \beta_{n,t-1}^{(\mathit{Mkt})} rather than X_{n,t-1}. This is what they’re talking about when they speak of “market beta”.

So far so good. Now, for the final step. Notice that this compensation-for-risk logic doesn’t just apply when the risk factor is market returns. You can replace R_{\mathit{Mkt},t} with any variable so long as the variable defines some sort of bad aggregate outcome in investors’ eyes. e.g., think about something like a drop in market liquidity. So, looking for the “true asset-pricing model” or the “set of priced risk factors” means looking for a collection of K \geq 1 variables \{R_{1,t}, \ldots,\,R_{K,t} \} such that, if we assume investors are worried about not having enough money when these risk factors are negative, then every difference in expected returns is perfectly explained by differences in exposure to these K priced risk factors:

    \begin{equation*} \mathrm{E}_{t-1}[R_{n,t}] = {\textstyle \sum_{k=1}^K} \, \lambda_t^{(k)} \cdot \beta_{n,t-1}^{(k)} \end{equation*}

Above, each \lambda_t^{(k)} > 0 is a market-wide constant called the price of risk associated with the kth factor.
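The pricing equation is just a matrix product of exposures and risk prices. Here is a toy illustration with made-up numbers (two hypothetical factors, three hypothetical stocks):

```python
# A toy illustration (made-up numbers) of the pricing equation:
# each stock's expected return is its betas times the prices of risk.
import numpy as np

lambdas = np.array([0.05, 0.02])       # prices of risk, K = 2 factors
betas = np.array([[1.2,  0.3],         # stock 1's factor exposures
                  [0.8, -0.5],         # stock 2
                  [0.0,  1.0]])        # stock 3

# E[R_n] = sum_k lambda_k * beta_{n,k}
expected_returns = betas @ lambdas
print(expected_returns)
```

Stock 1, with the largest exposure to the expensive first factor, earns the highest expected return; stock 3, exposed only to the cheap second factor, earns the lowest.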

I really want to emphasize the logic here. When an economist says a factor model explains the cross-section of expected returns, he’s saying that investors all have the same K \geq 1 risk factors in mind when making their respective portfolio choices. If one of these risk factors were to go negative, investors would consider it a bad state of the world; if all of them were to go negative, it’d be apocalyptic. The claim is that investors are all really worried about having enough money when these various kinds of bad outcomes occur. So, as a result, they’re willing to pay extra for assets whose returns are less correlated with these K risk factors—i.e., for assets that are more likely to have positive returns when risk factors are negative. Therefore, in equilibrium, these assets will have higher prices today and thus lower expected future returns.

Little Green Men

By now, researchers have proposed lots of different candidate factor models. Some might even say there’s a “factor zoo”. Each model makes its own claim about a specific set of risk factors that all investors are worried about. And yet, there’s no general consensus among researchers (let alone investors) about which is correct. This disagreement should already give you pause, but now ask yourself this: If you have to use an ML algorithm to identify the correct “set of priced risk factors” in investors’ “true asset-pricing model”, how did investors find these variables in the first place? A few investors certainly understand the ML toolkit today, but most certainly do not. And, no one was aware of these ideas twenty something years ago.

As a thought experiment, suppose that tomorrow while doing other research you encounter an ML algorithm, first discovered in 2010, that always outputs a factor model which perfectly explains the cross-section of expected returns. Does it make sense to claim that this ML algorithm is able to find the “true asset-pricing model” at work in, say, 1985? By assumption, when you feed data from 1985 into the algorithm, the output will be a “set of priced risk factors” that perfectly explains the cross-section of expected returns in 1985. But, could these risk factors possibly reflect how Madonna-loving 1985 investors were thinking about risk and return? No. Of course not. If the algorithm wasn’t discovered until 2010, 1985 investors couldn’t possibly have known about this “set of priced risk factors”.

Let’s make the thought experiment even more extreme. Suppose that little green men come to earth tomorrow and secretly give you an alien computer that operates based on principles never before seen by humans. There’s absolutely nothing like it here on earth. And, this advanced computer comes pre-programmed with correspondingly advanced ML algorithms. And, imagine that one of these algorithms works like the algorithm described above. It always outputs a set of K \geq 1 risk factors that perfectly explain the cross-section of expected returns. Do these risk factors tell us anything about how human investors view risk in earthly markets? Again: No. Of course not. To discover them you had to use an advanced alien technology with absolutely no analog here on earth. So, how could these risk factors capture earthly investors’ views about risk and return? The algorithm simply produces an excellent set of predictive variables that take the form of partial correlations with each asset’s returns—i.e., that take the form of \beta_{n,t-1}^{(k)}s.

Machine Learning

I’m quite bullish about the prospects of ML in asset pricing. I think researchers have barely scratched the surface. I just don’t think that approach #2—i.e., searching for the “true asset-pricing model”/”set of priced risk factors”—is a sensible way to apply the ML toolkit. Although academics tend to pooh-pooh approach #1 as lacking in economic content, that charge simply isn’t true. There are lots of situations where we’re perfectly happy to have good return predictions at the price of not understanding where this fit comes from. Traders are obviously OK with this Faustian bargain. But, so too are researchers. It’s not like the Fama-French 3-factor model is popular because we have an economic understanding of what the size and value factors represent.

Financial economists like to think about the market and its investors as something separate. But, it’s just not so. We are the investors in our asset-pricing models. There’s no separation. And, this fact should be reflected in our models. For me, this is the most interesting economic insight that comes with applying ML algorithms to study asset prices. If the tools that we use to find predictors change, then the predictors that our theoretical investors find should change, too. In his AFA presidential address, John Cochrane writes that, “to address these questions in the zoo of new variables, I suspect we will have to use different methods… For one variable, portfolio sorts and regressions both work. But we cannot chop portfolios 27 ways… so, I do not see how to do it by a high-dimensional portfolio sort.” Whatever those different methods end up being (ML or otherwise), we’d better not be modeling asset-pricing equilibria the same way after they get introduced.


Risk-Factor Identification: A Critique

May 26, 2019 by Alex

In standard cross-sectional asset-pricing models, expected returns are governed by exposure to aggregate risk factors in a market populated by fully rational investors. Here’s how these models work. Because investors are fully rational, they correctly anticipate which assets are most likely to have low returns in especially inconvenient future states of the world—i.e., returns that are highly correlated with aggregate risk factors. They won’t be willing to pay as much for the high risk-exposure assets today. So, the price of high risk-exposure assets will drop in equilibrium, giving these assets high expected returns going forward.

With this standard framework in mind, financial economists are constantly on the lookout for assets with similar risk exposures but different average returns. e.g., in a CAPM world, value and growth stocks would have similar average returns after adjusting for market beta; however, in the real world, there’s a 4%-per-year value premium. If investors are fully rational, this finding suggests that they’re worried about more than just aggregate market risk when pricing assets. It suggests they’re also paying attention to one or more as-yet-unknown risk factors. The central challenge in this literature is to figure out which one(s).

Unfortunately, after decades of work, there’s still no general consensus about which aggregate risk factors matter to real-world investors. Instead, the academic literature contains a zoo of candidate risk factors. Correlation with any of these factors will help predict an asset’s expected returns. But, it’s hard to believe that all of these aggregate risk factors actually matter to real-world investors, especially when they “have little in common economically with each other”.

Lax econometric standards are certainly one explanation for this factor zoo. The goal of this post is to suggest another: full rationality. Notice that full rationality plays two different roles in the discussion above. The first is to make sure that investors correctly anticipate the correlation between each asset’s future returns and the aggregate risk factors. If investors are fully rational, then changes in an asset’s risk exposure must be due to changes in fundamentals. The second role is to remove any logical limits on what these aggregate risk factors might be. If investors are fully rational, then they might potentially be worried about any future state of the world a researcher might dream up… and more! The whole premise of learning about the true risk factors requires real-world investors to know things that researchers haven’t yet noticed. And, if investors are fully rational, this additional knowledge might be arbitrarily subtle.

Below I show that, if researchers assume that investors are fully rational in both of the above senses, then identifying the true set of aggregate risk factors used by real-world investors is an impossible goal.

RCT Protocol

Economists think about randomized controlled trials (RCTs) as the gold standard for identification. Here’s how the RCT protocol works. Imagine you’re a medical researcher who’s just discovered a new cancer-treatment drug. You think your new discovery has promise, but the only way to know if it actually works is to give it to cancer patients and see whether they’re more likely to recover. But, how should you do this?

You could just distribute flyers advertising your new drug at the nearest hospital, give your drug to all the cancer patients who respond to the flyers, and then compare the recovery rate of the patients who took your drug to that of the remaining cancer patients. However, this is a bad idea. People try to make the best decision possible given all available knowledge about their current circumstances. So, we should expect that the cancer patients who respond to your flyer will be different from those who do not. We should expect them to be sicker, having exhausted all other treatment options. This means that any difference in recovery rates could be due to your new drug or to underlying differences in patient populations.

What’s more, if patients are optimizing based on information that’s unobservable to you (the researcher), then it doesn’t help to control for the differences in patient populations that you can see. Suppose you found two cancer patients, one who took your drug and one who decided not to, that looked identical in every conceivable way you could measure: both male, both white, both 43 years old, same height and weight, etc… If you really believed that these patients were making fully rational choices based on all the available information they had, then you must be missing something about each of their respective situations. Two identical fully rational people wouldn’t make two radically different life choices given the same information.

In short, to learn whether your new drug works, you have to break the link between drug treatment and patients’ optimal decisions based on (potentially) unobservable information. And, the RCT protocol does this by randomizing which cancer patients get your new drug and which get a sugar pill. You need to find a bunch of patients willing to participate in your study knowing that they have only a 50:50 chance of receiving the new experimental treatment. Then, with enough patients, the law of large numbers makes it very unlikely that the treated patient population will systematically differ from the untreated population. Thus, any difference in the recovery rates of these two groups must be due to your drug regimen.
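The logic above can be seen in a toy simulation (every number here is invented for illustration): an unobserved severity variable drives both who responds to the flyer and who recovers, so the flyer-based comparison is biased, while the coin-flip comparison recovers the true treatment effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Latent severity: unobserved by the researcher, known to each patient.
severity = rng.standard_normal(n)

# Assume the drug truly raises the recovery probability by 10 points.
true_effect = 0.10

def recovery(drug):
    # Sicker patients (higher severity) recover less often.
    p = np.clip(0.5 - 0.15 * severity + true_effect * drug, 0, 1)
    return rng.random(n) < p

# Self-selection: sicker patients are more likely to respond to the flyer.
took_flyer = rng.random(n) < np.clip(0.3 + 0.2 * severity, 0, 1)
rec_flyer = recovery(took_flyer.astype(float))
naive = rec_flyer[took_flyer].mean() - rec_flyer[~took_flyer].mean()

# RCT: a coin flip decides treatment, independent of severity.
assigned = rng.random(n) < 0.5
rec_rct = recovery(assigned.astype(float))
rct = rec_rct[assigned].mean() - rec_rct[~assigned].mean()

# The RCT estimate lands near the true 0.10; the flyer-based estimate
# is dragged down because the treated group is systematically sicker.
print(naive, rct)
```

The naive estimate here actually comes out negative: the drug looks harmful purely because of who chose to take it.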

Model Testing

Now, think about what’s going on when we test a cross-sectional asset-pricing model. A model is just a list of K \geq 1 aggregate risk factors. A fully rational investor will anticipate which assets have returns that are highly correlated with these K aggregate risk factors. So, if the model is correct, differences in expected returns across assets will be explained by differences in exposure to these K aggregate risk factors.

This logic suggests a straightforward empirical approach. To test a cross-sectional asset-pricing model, first separately regress the excess returns of each asset n = 1,\ldots,\,N on the K aggregate risk factors:

(1)   \begin{equation*} \mathit{rx}_{n,t} = \bar{a}_n + {\textstyle \sum_{k=1}^K} \, \bar{b}_{n,k} \cdot f_{k,t} + e_{n,t} \end{equation*}

Run a time-series regression involving t=1,\ldots,\,T observations for each asset. Then, take the estimated slope coefficients from these N regressions, which capture each asset’s exposure to the K aggregate risk factors, \bar{b}_{n,k} \overset{\scriptscriptstyle \text{def}}{=} \overline{\mathrm{Cov}}[\mathit{rx}_{n,t}, \, f_{k,t}] \, \big/ \, \overline{\mathrm{Var}}[f_{k,t}], and test whether differences in risk-factor exposure across assets explain differences in expected returns across assets:

(2)   \begin{equation*} \overline{\mathit{rx}}_n = \hat{\alpha} + {\textstyle \sum_{k=1}^K} \, \hat{\lambda}_k \cdot \bar{b}_{n,k} + \varepsilon_n \end{equation*}

Run one cross-sectional regression involving n=1,\ldots,\,N observations. If you’ve found the true factor model that real-world investors are using, then i) \hat{\lambda}_k > 0 for all k=1,\ldots,\,K, ii) \hat{\alpha} \approx 0, and iii) \widehat{\mathrm{Var}}[\varepsilon_n] \approx 0.
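Here’s a minimal numpy sketch of this two-pass procedure on simulated data where a K = 2 factor model holds exactly by construction (the exposures, prices of risk, and noise levels are all made-up illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, K = 100, 600, 2

lam_true = np.array([0.4, 0.2])               # prices of risk (illustrative)
b_true = rng.uniform(0.5, 1.5, size=(N, K))   # true factor exposures
f = rng.standard_normal((T, K))
f -= f.mean(axis=0)                           # demean so realized factor means are zero
e = 0.5 * rng.standard_normal((T, N))         # idiosyncratic shocks

# Zero-alpha returns: E[rx_n] = sum_k lam_k * b_{n,k} by construction.
rx = b_true @ lam_true + f @ b_true.T + e     # shape (T, N)

# Pass 1: one time-series regression per asset recovers the exposures.
X = np.column_stack([np.ones(T), f])
b_hat = np.linalg.lstsq(X, rx, rcond=None)[0][1:].T   # (N, K) slopes

# Pass 2: one cross-sectional regression of mean returns on exposures.
Z = np.column_stack([np.ones(N), b_hat])
coef = np.linalg.lstsq(Z, rx.mean(axis=0), rcond=None)[0]
alpha_hat, lam_hat = coef[0], coef[1:]

# alpha_hat should be near zero and lam_hat near lam_true.
print(alpha_hat, lam_hat)
```

Since the model is true in this simulation, all three criteria are satisfied: positive risk prices, a near-zero intercept, and tiny cross-sectional residuals.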

But, satisfying these three criteria is only a necessary condition. It’s not sufficient for proving you’ve got the right model. Even if a cross-sectional asset-pricing model passes these hurdles, real-world investors might not be using those K aggregate risk factors to price assets. Exposure to the K aggregate risk factors could be the result of correlations with other omitted variables that real-world investors really care about.

This is a question about identification. And, the RCT protocol suggests we can solve it by looking for random variation in an asset’s exposure to each of the K risk factors that has nothing to do with changes in fundamentals. The whole point of using an RCT is to make sure that patient decisions based on unobserved information aren’t causing a spurious link between drug treatment and recovery. And, we want to make sure that investor decisions based on unobserved fundamentals aren’t causing a spurious link between risk exposure and expected returns. We need to block any possibility of an unobserved link between risk-factor exposure and asset fundamentals.

So, imagine that investors perceive a noisy version of each asset’s exposure to the kth risk factor:

(3)   \begin{equation*} \bar{b}_{n,k} = \bar{b}_{n,k}^{\star} + \tilde{b}_{n,k} \end{equation*}

Above, \bar{b}_{n,k}^{\star} denotes the nth asset’s true risk exposure and \tilde{b}_{n,k} denotes noise that’s unrelated to fundamentals. The only way to know that investors are using a particular set of K aggregate risk factors and not some other correlated set of factors is to study how \tilde{b}_{n,k} predicts expected returns. After all, differences in expected returns that are associated with estimation errors, \tilde{b}_{n,k}, can’t be attributed to investors acting strategically based on unobserved information about asset fundamentals.
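A toy simulation makes the identification argument concrete (all parameter values are invented). Expected returns are generated under two hypotheses—investors price the perceived exposure, or they price a different factor that’s merely correlated with the true exposure—and regressing expected returns on the non-fundamental component separates the two:

```python
import numpy as np

rng = np.random.default_rng(0)
N, lam = 20_000, 0.4                        # assets, price of risk (illustrative)

b_star = rng.standard_normal(N)             # true exposure, tied to fundamentals
b_tilde = 0.3 * rng.standard_normal(N)      # non-fundamental perception error
b_perceived = b_star + b_tilde

# Hypothesis A: investors actually price this factor's perceived exposure.
er_a = lam * b_perceived
# Hypothesis B: investors price some other factor correlated with b_star only.
other = 0.9 * b_star + np.sqrt(1 - 0.9**2) * rng.standard_normal(N)
er_b = lam * other

def slope_on_error(er):
    # Cross-sectional regression of expected returns on the error component.
    return np.cov(er, b_tilde)[0, 1] / np.var(b_tilde)

# Under A the error component predicts returns with slope near lam;
# under B it predicts nothing.
print(slope_on_error(er_a), slope_on_error(er_b))
```

Only variation in expected returns that tracks the error component can’t be explained away by investors reacting to unobserved fundamentals.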

Impossible Goal

By now, you probably see the logical trap that’s been laid. A fully rational investor might potentially be reacting to any piece of unobserved information about an asset’s fundamentals. So, non-fundamental variation in their perception of risk exposure is crucial to identifying the model they’re using. But, non-fundamental variation in perceived risk exposure would represent an error. And, fully rational investors don’t make errors. Thus, if we are adamant that real-world investors are fully rational, then we must give up any hope of identifying the cross-sectional asset-pricing model they’re using.

Note that this impossibility result doesn’t say that investors need to be completely irrational… far from it. The true \bar{b}_{n,k}^{\star} has to have some bearing on investors’ perceived \bar{b}_{n,k}. If investors aren’t strategically adjusting their demand today in response to actual future risks, then cross-sectional asset-pricing models have no content. Rather, the impossibility result says that, for researchers to identify the cross-sectional asset-pricing model that real-world investors are using, these perceptions can’t be perfectly accurate. For a useful analogy, think about every spy thriller with a canary trap that you’ve ever seen. In order for one spy to figure out what the other knows, he’s got to see how his adversary reacts to planted fake intel. If his foe always sees through the ploy (i.e., if his foe is “fully rational” in High Economyan), then there’s no hope of any success.

This impossibility result also suggests a new use for many of the cognitive errors documented by behavioral economists: as tools for testing whether or not real-world investors care about exposure to particular risk factors. The existing behavioral-finance literature contains a ready supply of \tilde{b}_{n,k}s.


The Basic Recipe For Rationalizing Errors In Belief

February 3, 2019 by Alex

Behavioral-finance models are often written down so that, although each individual trader holds incorrect beliefs, market events nevertheless unfold in such a way that traders can rationalize their own errors. e.g., consider the model in Scheinkman and Xiong (2003). In this model, each individual trader knows that every other trader is over-confident, and he knows that every other trader thinks that he himself is over-confident. He just doesn’t think that they’re correct. He pig-headedly insists that he’s the only unbiased trader. And yet, in spite of this error, the model is set up so that he can interpret the realized price path in his own internally consistent way. Each trader thinks the price distortion caused by his own over-confidence is actually coming from the value of the option to resell at a later date to some other over-confident trader.

There’s a good reason why researchers write down models this way. The idea is to write down a model that’s exactly one step away from a rational benchmark. That way, any new predictions made by the model can be attributed to the behavioral bias. In this post, I first outline the basic recipe for rationalizing traders’ errors in beliefs. Then, I point out something slightly paradoxical about this recipe—namely, it requires fine-tuning the model parameters. And, while a researcher can do this fine-tuning in a theoretical model, it’s not clear who can turn the appropriate knobs in the real world. These models are like stage magic. And, while we can learn about which cognitive biases people suffer from by studying a good magician’s sleight of hand, most missing coins wind up between the couch cushions rather than in The Amazing Randi‘s pocket.

Errors In Belief

Here’s a simple framework for digesting errors in beliefs. To start with, consider a market where a trader has correct beliefs. i.e., suppose that a trader receives a noiseless signal:

    \begin{equation*} \mathit{Signal} = \mathit{News} \qquad \text{where} \qquad \mathit{News} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(0, \, 1) \end{equation*}

Then, given the trader’s optimal demand in response to this noiseless signal, suppose that the structural relationship between the trader’s noiseless signal and realized returns is given by:

    \begin{equation*} \mathit{Return} = \beta^{\star} \cdot \mathit{Signal} + \varepsilon^{\star} \qquad \text{where} \qquad \varepsilon^{\star} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(0, \, 1) \end{equation*}

We typically think that \beta^\star \in (0, \, 1) with larger values of \beta^\star indicating more informative prices. I’m using the term “structural relationship” for \beta^{\star} \overset{\scriptscriptstyle \mathrm{def}}{=} {\textstyle \frac{\partial \phantom{s}}{\partial s}} \mathrm{E}^{\star}[ \, \mathit{Return} \, | \, \mathrm{do}(\mathit{Signal} = s) \, ] because this parameter reflects the expected change in returns due to an exogenous shift in the trader’s signal. Note that this structural relationship could reflect other traders’ errors in belief, as was the case in Scheinkman and Xiong (2003).

But, in reality, suppose that the trader is over-confident about the precision of his signal. While he thinks it’s noiseless, his signal actually contains noise:

    \begin{equation*} \mathit{Signal} = \alpha \cdot \mathit{Noise} + \sqrt{1 - \alpha^2} \cdot \mathit{News} \qquad \text{where} \qquad \mathit{Noise} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(0, \, 1) \end{equation*}

And, the parameter \alpha \in [0, \, 1] governs the relative contribution of noise to the trader’s signal: \alpha = 0 corresponds to correct beliefs; whereas, \alpha = 1 corresponds to a signal that is pure noise. Then, given the trader’s optimal demand, suppose that the structural relationship between the trader’s noisy signal and realized returns is actually given by:

    \begin{equation*} \mathit{Return} = \beta \cdot \mathit{Signal} + \varepsilon \qquad \text{where} \qquad \varepsilon \sim \mathrm{N}(0, \, 1) \end{equation*}

Notice that, in reality, idiosyncratic-return shocks are no longer independent of the news in the trader’s signal. Let \rho \overset{\scriptscriptstyle \mathrm{def}}{=} \mathrm{Cor}[\mathit{News}, \, \varepsilon] denote the correlation between the news about fundamentals in the trader’s signal and idiosyncratic-return shocks. e.g., in a model of disagreement, you might expect \rho < 0 due to the existence of another trader whose disagreement stems from negatively correlated signals or negatively correlated mistakes.

The Basic Recipe

Suppose that the trader, who doesn’t realize that he’s getting a noisy signal, is still carefully monitoring price informativeness. i.e., he’s carefully monitoring the relationship between his signal and realized returns. Here’s what it would take for this trader to rationalize his error in beliefs. Notice that the covariance of the trader’s signal and market returns is given by:

    \begin{align*} \mathrm{Cov}[\mathit{Return}, \, \mathit{Signal}] &= \mathrm{Cov}[\beta \cdot \mathit{Signal} + \varepsilon, \, \mathit{Signal}] \\ &= \mathrm{Cov}[\beta \cdot \mathit{Signal}, \,\mathit{Signal}] + \mathrm{Cov}[\varepsilon, \, \mathit{Signal}] \\ &= \beta + \mathrm{Cov}\big[ \, \varepsilon, \, \alpha \cdot \mathit{Noise} + \sqrt{1-\alpha^2} \cdot \mathit{News} \, \big] \\ &= \beta + \rho \cdot \sqrt{1-\alpha^2} \end{align*}

So, since the variance of his signal is \mathrm{Var}[\mathit{Signal}] = 1, if the trader regresses realized returns on his signal, he’ll find a slope coefficient of

    \begin{equation*} \hat{\beta}^{\text{OLS}} \overset{\scriptscriptstyle \mathrm{def}}{=} {\textstyle \frac{\mathrm{Cov}[\mathit{Return}, \, \mathit{Signal}]}{\mathrm{Var}[\mathit{Signal}]}} = \beta + \rho \cdot \sqrt{1-\alpha^2} \end{equation*}

Thus, if a researcher chooses the values of \rho and \alpha so that \beta^{\star} = \hat{\beta}^{\text{OLS}} = \beta + \rho \cdot \sqrt{1-\alpha^2}, then the trader will see data that’s consistent with his erroneous belief about his signal being noiseless.

It’s important to emphasize that, when a researcher chooses \alpha and \rho so that \beta^{\star} - \beta = \rho \cdot \sqrt{1-\alpha^2}, he’s not giving the trader correct beliefs, though. Although price informativeness will look correct to the trader, his error in beliefs will still cause returns to respond to pure noise. The covariance of noise and returns will be:

    \begin{align*} \mathrm{Cov}[\mathit{Return}, \, \mathit{Noise}] &= \mathrm{Cov}[\beta \cdot \mathit{Signal} + \varepsilon, \, \mathit{Noise}] \\ &= \mathrm{Cov}\big[ \, \beta \cdot \big( \, \alpha \cdot \mathit{Noise} + \sqrt{1-\alpha^2} \cdot \mathit{News} \, \big), \, \mathit{Noise} \, \big] + \mathrm{Cov}[\varepsilon, \, \mathit{Noise}] \\ &= \beta \cdot \alpha \end{align*}

So, returns will react to pure noise whenever \alpha > 0. And, in principle, the trader could notice this fact if he cared to inspect \mathrm{Cov}[\mathit{Return}, \, \mathit{Noise}] rather than just \mathrm{Cov}[\mathit{Return}, \, \mathit{Signal}].
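A quick Monte Carlo (with made-up parameter values) confirms both moments: the trader’s OLS regression recovers \beta^{\star} even though his signal is noisy, while returns still covary with pure noise at rate \beta \cdot \alpha.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

beta_star, beta, alpha = 0.8, 0.5, 0.6            # illustrative values
rho = (beta_star - beta) / np.sqrt(1 - alpha**2)  # fine-tuned per the recipe

news, noise, u = rng.standard_normal((3, n))
signal = alpha * noise + np.sqrt(1 - alpha**2) * news
eps = rho * news + np.sqrt(1 - rho**2) * u        # Cor[news, eps] = rho
ret = beta * signal + eps

# The trader's regression of returns on his signal looks consistent
# with his erroneous belief that the signal is noiseless...
beta_ols = np.cov(ret, signal)[0, 1] / np.var(signal, ddof=1)

# ...but returns still respond to pure noise at rate beta * alpha.
cov_ret_noise = np.cov(ret, noise)[0, 1]

print(beta_ols, cov_ret_noise)
```

So the illusion survives the trader’s own diagnostic check, yet an observer who could see the noise directly would catch it immediately.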

Like Stage Magic

That’s how you write down a model where biased traders can rationalize their own errors in belief. The basic recipe is simple enough. Just introduce a hidden correlation into the information structure of the model (i.e., the parameter \rho) and then fine-tune this correlation so that it cancels out the effects of the trader’s behavioral bias (the parameter \alpha). There’s something really pretty about models that pull off this sort of cancellation, such as Scheinkman and Xiong (2003). But, this approach raises an obvious question: in the real world, why should we expect \alpha and \rho to take on the precise values needed to hide a trader’s error? Where does the required fine-tuning come from? What’s the underlying mechanism at work?

These models are like stage magic. They’re expertly scripted illusions that demonstrate how behavioral biases can go undetected… even by traders who are actively trying to detect them. And, this is not a slight. This is really informative in the same way that going to a good magic show is really informative. It teaches you something useful about the limits of human perception, about how your attention can be managed, about how you can be deceived. But, you don’t leave magic shows thinking that the next deck of cards you open will contain 52 copies of the 6\clubsuit because you happened to be thinking of that card when you opened the box. No one expects everyday situations to operate by the rules of stage magic. Most of the time, there’s no magician to carefully script the illusion. And, the same logic applies to financial markets. It’s useful to know that you can fine-tune parameters to hide an error, but we shouldn’t assume that markets typically operate with the parameters dialed in this way. Why should we? Who exactly would be the one turning the knobs?

