Research Notebook

Consumption Risk In Modern Macro-Finance Models

October 8, 2020 by Alex

Stock returns are 8\% per year higher than bond returns on average. It’s hard to explain such a large equity premium using the standard consumption-based model because consumption growth isn’t risky enough. So, to fix this problem, modern macro-finance models introduce new state variables capturing other kinds of risk investors might care about, such as the surplus consumption ratio (habits; Campbell-Cochrane 1999) and news about long-run consumption growth (long-run risks; Bansal-Yaron 2004).

Because exposure to one of these new state variables typically explains most of the 8\% per year equity premium in modern macro-finance models, there’s a sense among researchers that it doesn’t matter whether investors are trying to insure themselves against shocks to consumption growth.

Not so!

This post shows why. The new state variables used in modern macro-finance models are not separate from consumption risk. They’re ways of amplifying the effects of consumption risk. Arguing that investors don’t care about consumption risk because exposure to the surplus consumption ratio explains most of the 8\% per year equity premium is like arguing that Hollywood doesn’t care about beauty because plastic surgery has a bigger effect on casting decisions than the face actors are born with. It’s nonsense.

Consumption CAPM

Investors in the consumption capital asset-pricing model (CCAPM; Lucas 1978) try to maximize their expected discounted utility by choosing how much to consume, C_t, how much to invest in the stock market, S_t, and how much to invest in riskless bonds, B_t:

(1)   \begin{equation*} \begin{array}{rl} \text{maximize} & \Exp\left[ \, \sum_{t=0}^\infty \, \beta^t \cdot U_t \, \right] \\ \text{subject to} & \quad U_t = C_t^{1-\gamma} / \, (1-\gamma) \\ & W_{t+1} = (W_t - C_t) \cdot (1 + R_f) + S_t \cdot (R_{t+1} - R_f) \\ & \,\,\,\,\,W_t = C_t + B_t + S_t \end{array} \end{equation*}

\beta \in (0, \, 1) is investors’ subjective time preference, U_t is their utility from consumption, W_t represents their wealth, and \gamma > 0 is their coefficient of risk aversion. (1+R_{t+1}) = (P_{t+1} + D_{t+1})/P_t is the gross return on the value-weighted stock market, and (1+R_f) is the return on a riskless bond.

The CCAPM predicts that the expected excess return on stocks, \Exp[R_{t+1}] - R_f, will be proportional to the covariance between consumption growth and stock returns:

(2)   \begin{equation*} \Exp[R_{t+1}] - R_f \approx \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}] \end{equation*}

The price of stocks is inversely related to the expected market return, \Exp[1+R_{t+1}] = \Exp[P_{t+1} + D_{t+1}]/P_t. So, the CCAPM says that investors pay more for stocks when stock returns tend to offset negative consumption shocks, \Cov[\Delta \log C_{t+1}, \, R_{t+1}] < 0. In other words, Equation (2) says CCAPM investors view the stock market as a way to insure against future consumption shocks.

Equity Premium Puzzle

Unfortunately for the CCAPM, investors’ desire to hedge future consumption shocks can’t, on its own, explain the entire \Exp[R_{t+1}] - R_f \approx 8\% per year equity premium that we observe in the data. To see why, consider rewriting the right-hand side of Equation (2) using the definition of a covariance:

(3)   \begin{equation*} \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}]  =  \gamma \times \big( \, \rho \cdot \sigma_{\Delta \log C} \cdot \sigma_R \, \big) \end{equation*}

The parameter \rho = \Corr[\Delta \log C_{t+1}, \, R_{t+1}] is the correlation between consumption growth and stock returns, \sigma_{\Delta \log C} = \Sd[\Delta \log C_{t+1}] is the volatility of consumption growth, and \sigma_R = \Sd[R_{t+1}] is the volatility of stock returns. In the data, we observe values of roughly \rho \approx 0.20, \sigma_{\Delta \log C} \approx 1\% per year, and \sigma_R \approx 16\% per year. Thus, investors would need a risk aversion of \gamma = 250 for the CCAPM to explain an 8\% equity premium.

A risk aversion of \gamma = 250 seems too high. But, if we assume a lower risk aversion of merely \gamma = 10, the equity premium should only be 10 \times (0.20 \cdot 1\% \cdot 16\%) \approx 0.32\% per year according to the CCAPM. Given the power of compounding to increase investor wealth over long horizons, a difference of 8\% - 0.32\% = 7.68\% per year is a big deal. CCAPM investors with a \gamma = 10 should be putting much more money in stocks than they actually are, which would drive up stock prices and thereby lower the expected returns. So, if the basic logic of the CCAPM is correct, there must be something else about stock returns scaring investors away.
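Here’s a quick back-of-the-envelope check of that arithmetic. It’s just a minimal sketch in Python using the rough moments quoted above, nothing model-specific:

    # Rough moments quoted above (annual)
    rho     = 0.20   # Corr[consumption growth, stock returns]
    sigma_c = 0.01   # Sd[consumption growth]
    sigma_r = 0.16   # Sd[stock returns]
    premium = 0.08   # observed equity premium

    cov_cr = rho * sigma_c * sigma_r        # Cov[dlogC, R] = 0.00032

    implied_gamma = premium / cov_cr        # risk aversion needed to match 8%
    print(round(implied_gamma))             # -> 250

    implied_premium = 10 * cov_cr           # premium implied by gamma = 10
    print(f"{implied_premium:.2%}")         # -> 0.32%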

External Habit

Campbell-Cochrane (1999) argue that the something else is captured by a variable called the surplus consumption ratio. Their model starts with Problem (1) and plugs in a modified utility function:

(4)   \begin{equation*}  U_t = (C_t-X_t)^{1-\gamma} / \, (1-\gamma) \end{equation*}

In this specification, investors care about their consumption in excess of the level of consumption they have become accustomed to, X_t. This level corresponds to a weighted average of past consumption:

(5)   \begin{equation*} \log X_t  = \lambda \cdot {\textstyle \sum_{\ell=0}^{\infty}} \, \phi^{\ell} \cdot \log C_{t-\ell} \qquad \qquad \lambda > 0, \, \phi \in (0, \, 1) \end{equation*}

So, drops in consumption following prolonged periods of high consumption are extremely painful for investors. Conversely, an increase in consumption following a long hungry spell will be really enjoyable.

The surplus consumption ratio is Z_t = (C_t - X_t) / C_t. The model says expected stock returns could be high either because stock returns covary with consumption growth or because they covary with growth in the surplus consumption ratio:

(6)   \begin{equation*} \Exp[R_{t+1}] - R_f  \approx  \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}]  +  \gamma \times \Cov[\Delta \log Z_{t+1}, \, R_{t+1}] \end{equation*}

Yes, the covariance between consumption growth and stock returns isn’t strong enough to explain the 8\% equity premium on its own. But, stock market crashes tend to sucker-punch investors, occurring just when investors’ consumption falls following a prolonged boom. \Cov[\Delta \log Z_{t+1}, \, R_{t+1}] > 0 explains nearly all of the 8\% per year equity premium according to this habit-formation model.

Long-Run Risk

Bansal-Yaron (2004) add a different state variable to the CCAPM. This model also starts with Problem (1) and plugs in a new utility function based on Epstein-Zin (1989) recursive preferences rather than habit formation:

(7)   \begin{equation*} U_t = \left\{ \, (1 - \beta) \cdot C_t^{1-\alpha} + \beta \cdot \big(\Exp_t[U_{t+1}^{1-\gamma}]^{\frac{1}{1-\gamma}}\big)^{1-\alpha} \, \right\}^{\frac{1}{1-\alpha}} \end{equation*}

These preferences are recursive because they indicate that investors care not only about their consumption today, (1 - \beta) \cdot C_t^{1-\alpha}, but also the present value of their expected future consumption, \beta \cdot \big(\Exp_t[U_{t+1}^{1-\gamma}]^{\frac{1}{1-\gamma}}\big)^{1-\alpha}.

\beta and \gamma represent investors’ time preferences and risk aversion just like in the original power utility specification. The only new parameter is \alpha > 1. The ratio 1/\alpha represents investors’ elasticity of intertemporal substitution (EIS). This parameter captures how much investors want to resolve future uncertainty about consumption, not because they want to do something with the information but because resolving uncertainty as soon as possible makes them happy. The guy on the subway platform who’s leaning dangerously out onto the tracks staring down the tunnel so that he can be the first to spot the next train is someone with a very high EIS. He wants to know as soon as possible whether the next train is imminent, not because it will allow him to board sooner (everyone boards at the same time) but because knowing the train is about to arrive makes him happy. For a long-run risk model to work, we need \gamma \cdot (1/\alpha) > 1.

Let P_t denote the current price of an asset whose payout is aggregate consumption in the following period. The key new state variable in the long-run risk model is \log Z_{t+1} = \log (P/C)_{t+1}. The model says that the equity premium will be determined as follows:

(8)   \begin{equation*} \Exp[R_{t+1}] - R_f  \approx \gamma \times \Cov[ \, \Delta \log C_{t+1}, \, R_{t+1} \, ] + \mathrm{f}(\gamma, \, \alpha) \times \Cov[ \, \log Z_{t+1}, \, R_{t+1} \, ] \end{equation*}

\mathrm{f}(\gamma, \, \alpha) \leq 0 is a function that stems from a Campbell-Shiller (1988) approximation of Z_t. Thus, the long-run risk model says expected stock returns could be high either because future stock returns tend to covary with consumption growth or because these stock returns tend to covary with the future price-to-dividend ratio of the aggregate consumption claim. This price-to-dividend ratio will partly reflect current changes in consumption, \Delta \log C_{t+1}. But, since consumption growth is persistent and investors have recursive preferences, it will also reflect future consumption shocks. In calibrations, most of the 8\% per year equity premium is explained by variation in \log Z_{t+1} coming from consumption shocks far off in the future.

Source Of Confusion

What would it mean for investors not to care about consumption risk in one of these models? \rho = \Corr[\Delta \log C_{t+1}, \, R_{t+1}] captures the stock market’s exposure to consumption risk. When \rho \approx 1, stock market booms always coincide with increases in consumption. When \rho = 0, knowing that the stock market is booming tells you nothing about whether aggregate consumption is increasing or decreasing. So, investors in a particular model would be indifferent to changes in consumption risk if

(9)   \begin{equation*} \partial_{\rho} (\Exp[R_{t+1}] - R_f) = 0 \end{equation*}

In other words, they wouldn’t care about consumption risk if increasing the amount of consumption risk had no effect on their demand and thus no effect on equilibrium prices.

We saw above that, if we assume a risk aversion coefficient of \gamma = 10, then the first term in Equations (6) and (8) is very small. \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}] \approx 0.3\%. And, as a result, the effect of an increase in consumption risk on asset prices coming from this first term is quite small as well:

(10)   \begin{equation*} \partial_{\rho} (\gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}]) = \gamma \times \sigma_{\Delta \log C} \cdot \sigma_R \approx 0.016 \end{equation*}

Judged only by the effect of this initial term, an increase in consumption risk from \rho = 0.00 to \rho = 0.40 would only increase the expected excess return on the stock market by 0.016 \times 0.40 = 0.64\% per year. We observe a correlation between stock returns and consumption growth of \rho = 0.20 in the data. So, these numbers imply that a 2\times swing around the mean \rho would explain less than a tenth of the total 8\% per year equity premium puzzle if consumption risk only affected asset prices via the \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}] term.
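For concreteness, here is the same sort of sketch applied to the derivative in Equation (10), again with the rough moments from above:

    gamma, sigma_c, sigma_r = 10, 0.01, 0.16

    # Sensitivity of the premium to rho via the gamma * Cov[dlogC, R] term alone
    d_premium_d_rho = gamma * sigma_c * sigma_r
    print(round(d_premium_d_rho, 3))        # -> 0.016

    # Moving rho from 0.00 to 0.40 only adds:
    print(f"{d_premium_d_rho * 0.40:.2%}")  # -> 0.64%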

However, consumption risk doesn’t only affect asset prices via the \gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}] term in Equations (6) and (8). Therefore

(!!!)   \begin{equation*} \partial_{\rho} (\gamma \times \Cov[\Delta \log C_{t+1}, \, R_{t+1}]) \approx 0 \qquad \text{does \underline{\textbf{not}} imply} \qquad \partial_{\rho} (\Exp[R_{t+1}] - R_f) \approx 0 \end{equation*}

Such a conclusion would only be valid if the new state variables introduced in Campbell-Cochrane (1999) and Bansal-Yaron (2004) happened to be unrelated to consumption growth. This is absolutely not the case! Changes in the surplus consumption ratio are highly correlated with consumption growth. And, the long-run risk model assumes that consumption growth is very persistent, so price changes due to anticipated consumption shocks in the far distant future will be highly correlated with consumption growth today too.

Plugging In Numbers

How much does the Campbell-Cochrane (1999) model suggest expected excess returns should increase in response to a move from \rho = 0 to \rho = 0.40? Campbell-Cochrane (1999) talk about habit formation as an “amplification mechanism for consumption risks in marginal utility” (page 240). Mathematically, this shows up as a scaling up of the risk-aversion coefficient from \gamma to \gamma / \Exp[Z_t]. The authors use \phi = 0.87. With \sigma_{\Delta \log C} = 1\% per year and \gamma = 10, the average surplus consumption ratio is \Exp[Z_t] = \sigma_{\Delta \log C} \cdot \sqrt{\frac{\gamma}{1 - \phi}} \approx 0.088. So, in the external habit model, the effect of consumption risk on asset prices will be:

(11)   \begin{equation*} \partial_{\rho} (\Exp[R_{t+1}] - R_f) = (\gamma / 0.088) \times \sigma_{\Delta \log C} \cdot \sigma_R \approx 0.18 \end{equation*}

Because increasing the stock market’s correlation with consumption growth must also increase its correlation with growth in the surplus consumption ratio, a \Delta \rho = 0.40 increase in consumption risk will increase the annual expected excess return on the stock market by 0.18 \times 0.40 \approx 7.3\% in a habit model.
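Here’s a minimal sketch of that habit-model arithmetic in Python, using the calibration stated above, backing out the steady-state surplus consumption ratio and the implied sensitivity to \rho:

    import math

    gamma, sigma_c, sigma_r, phi = 10, 0.01, 0.16, 0.87

    # Steady-state surplus consumption ratio: E[Z] = sigma_c * sqrt(gamma / (1 - phi))
    E_Z = sigma_c * math.sqrt(gamma / (1 - phi))
    print(round(E_Z, 3))                    # -> 0.088

    # Habit formation scales effective risk aversion from gamma up to gamma / E[Z]
    d_premium_d_rho = (gamma / E_Z) * sigma_c * sigma_r
    print(round(d_premium_d_rho, 2))        # -> 0.18

    # Effect of a 0.40 increase in rho on the annual equity premium
    print(f"{d_premium_d_rho * 0.40:.1%}")  # -> 7.3%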

How much does the Bansal-Yaron (2004) model suggest expected excess returns should increase in response to a \Delta \rho = 0.40 increase in consumption risk? Cochrane (2017) describes how this model “ties its extra state variables… to observables by the assumption of a time-series process in which short-run consumption growth is correlated with… long-run news.” When 1/\alpha \approx 1, the function \mathrm{f}(\gamma, \, \alpha) = 1 - \gamma and Equation (8) can be re-written as:

(12)   \begin{equation*} \Exp[R_{t+1}] - R_f  \approx \gamma \times \Cov[ \, \Delta \log C_{t+1}, \, R_{t+1} \, ] + (1 - \gamma) \times \Cov[ \, \log (P/C)_{t+1}, \, R_{t+1} \, ] \end{equation*}

Changing an asset’s correlation with consumption growth also changes its correlation with the future log price-to-consumption ratio. I estimate \log (P/C)_{t+1} = 3.61 - 30.26 \cdot \Delta \log C_{t+1} + \varepsilon_{t+1}, which would imply that \partial_{\rho} (\Exp[R_{t+1}] - R_f) = [\gamma - (1-\gamma) \cdot 30.26] \cdot \sigma_{\Delta \log C} \cdot \sigma_R \approx 0.45. Thus, a \Delta \rho = 0.40 increase in consumption risk will increase annual expected excess returns by 0.45 \times 0.40 \approx 18\% in the long-run risk model!
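And the matching sketch for the long-run risk arithmetic, taking the estimated slope of -30.26 quoted above as given:

    gamma, sigma_c, sigma_r = 10, 0.01, 0.16
    slope = -30.26   # slope from regressing log(P/C)_{t+1} on consumption growth (quoted above)

    # With f(gamma, alpha) = 1 - gamma and Cov[log(P/C), R] = slope * Cov[dlogC, R]:
    d_premium_d_rho = (gamma + (1 - gamma) * slope) * sigma_c * sigma_r
    print(round(d_premium_d_rho, 2))        # -> 0.45

    # Effect of a 0.40 increase in rho on the annual equity premium
    print(f"{d_premium_d_rho * 0.40:.0%}")  # -> 18%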


Factor Models, Little Green Men, And Machine Learning

June 28, 2019 by Alex

Economists use machine learning (ML) to study asset prices in two different ways. Approach #1: use these techniques to predict the cross-section of expected returns—i.e., to predict which stocks are most likely to have high or low future returns. e.g., see here, here, or here. Approach #2: use them to try to uncover the “true asset-pricing model”—a.k.a., the “set of priced risk factors”.

Many economists dismiss approach #1, arguing that predicting future stock returns is a job for traders not academics. Instead, it’s much more common for researchers to adopt approach #2. The conventional wisdom is that we, as researchers, will learn something deep and fundamental about how financial markets work if one of these new ML techniques uncovers a factor model that perfectly explains the cross-section of expected returns. There’s a widely held view that doing empirical asset-pricing research means attributing differences in expected returns to some risk-return tradeoff with an intuitive story attached to it.

But… not so fast. There’s something paradoxical about the logic of approach #2, a problem with the conventional wisdom. The goal of this post is to explain what that something is.

Factor Models

But first: factor models. What are economists talking about when they say they’re trying to find the “true asset-pricing model” or the “set of priced risk factors”? To get a handle on this terminology, consider regressing the returns of each stock, R_{n,t}, on lagged values of some predictive variable, X_{n,t-1}:

    \begin{equation*} R_{n,t} = \hat{a} + \hat{b} \cdot X_{n,t-1} + \hat{e}_{n,t} \end{equation*}

The results of a predictive regression like this one can be interpreted as trading-strategy returns. You can read the estimated \hat{b} as the return to a zero-cost portfolio that’s long high-X stocks and short low-X stocks:

    \begin{equation*} \hat{b} \propto {\textstyle \frac{1}{N} \cdot \sum_n} \, (R_{n,t} - \bar{R}_t) \cdot (X_{n,t-1} - \bar{X}_{t-1}) \end{equation*}

Thus, \hat{b} > 0 implies both that stocks with high predictor values yesterday, (X_{n,t-1} - \bar{X}_{t-1}) > 0, tended to have high excess returns today, (R_{n,t} - \bar{R}_t) > 0, and also that it would have been profitable to trade on X today.
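To see the portfolio interpretation in action, here’s a minimal simulated sketch (Python/NumPy, with made-up data for a single cross-section) showing that the OLS slope and the demeaned-X portfolio return differ only by a scaling factor:

    import numpy as np

    rng = np.random.default_rng(0)
    N = 1_000

    # Hypothetical cross-section for one date t: a lagged predictor X and returns R
    X = rng.normal(size=N)
    R = 0.02 * X + rng.normal(scale=0.10, size=N)

    Xd, Rd = X - X.mean(), R - R.mean()

    # Cross-sectional OLS slope of R on lagged X
    b_hat = (Rd @ Xd) / (Xd @ Xd)

    # Return on a zero-cost portfolio with weights proportional to demeaned X
    # (long high-X stocks, short low-X stocks)
    port_ret = (Rd @ Xd) / N

    # The two differ only by the scaling factor Var[X], i.e., b_hat is proportional
    # to the portfolio return
    print(np.isclose(b_hat * (Xd @ Xd) / N, port_ret))   # -> True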

It could be that an estimated \hat{b} > 0 represents arbitrage profits. But, maybe trading on X is only profitable because it requires investors to bear lots of non-diversifiable risk? Imagine that investors are all really worried about not having enough money during future market crashes, R_{\mathit{Mkt},t} \ll 0. Then, if the predictive variable X_{n,t-1} turned out to be capturing exposure to market risk,

    \begin{equation*} X_{n,t-1} = {\textstyle \frac{\mathrm{Cov}[R_{n,t}, \, R_{\mathit{Mkt},t}]}{\mathrm{Var}[R_{\mathit{Mkt},t}]}} \end{equation*}

the profits earned by trading on X would represent compensation for holding a portfolio that will deliver terrible returns during market crashes—i.e., at the worst possible time as far as investors are concerned. And, when economists think this is what’s going on, they typically write the predictive variable as \beta_{n,t-1}^{(\mathit{Mkt})} rather than X_{n,t-1}. This is what they’re talking about when they speak of “market beta”.

So far so good. Now, for the final step. Notice that this compensation-for-risk logic doesn’t just apply when the risk factor is market returns. You can replace R_{\mathit{Mkt},t} with any variable so long as the variable defines some sort of bad aggregate outcome in investors’ eyes. e.g., think about something like a drop in market liquidity. So, looking for the “true asset-pricing model” or the “set of priced risk factors” means looking for a collection of K \geq 1 variables \{R_{1,t}, \ldots,\,R_{K,t} \} such that, if we assume investors are worried about not having enough money when these risk factors are negative, then every difference in expected returns is perfectly explained by differences in exposure to these K priced risk factors:

    \begin{equation*} \mathrm{E}_{t-1}[R_{n,t}] = {\textstyle \sum_{k=1}^K} \, \lambda_t^{(k)} \cdot \beta_{n,t-1}^{(k)} \end{equation*}

Above, each \lambda_t^{(k)} > 0 is a market-wide constant called the price of risk associated with the kth factor.

I really want to emphasize the logic here. When an economist says a factor model explains the cross-section of expected returns, he’s saying that investors all have the same K \geq 1 risk factors in mind when making their respective portfolio choices. If one of these risk factors were to go negative, investors would consider it a bad state of the world; if all of them were to go negative, it’d be apocalyptic. The claim is that investors are all really worried about having enough money when these various kinds of bad outcomes occur. So, as a result, they’re willing to pay extra for assets whose returns are less correlated with these K risk factors—i.e., for assets that are more likely to have positive returns when risk factors are negative. Therefore, in equilibrium, these assets will have higher prices today and thus lower expected future returns.

Little Green Men

By now, researchers have proposed lots of different candidate factor models. Some might even say there’s a “factor zoo”. Each model makes its own claim about a specific set of risk factors that all investors are worried about. And yet, there’s no general consensus among researchers (let alone investors) about which is correct. This disagreement should already give you pause, but now ask yourself this: If you have to use an ML algorithm to identify the correct “set of priced risk factors” in investors’ “true asset-pricing model”, how did investors find these variables in the first place? A few investors certainly understand the ML toolkit today, but most do not. And, no one was aware of these ideas twenty-something years ago.

As a thought experiment, suppose that tomorrow while doing other research you encounter an ML algorithm, which was first discovered in 2010, that always outputs a factor model which perfectly explains the cross-section of expected returns. Does it make sense to claim that this ML algorithm is able to find the “true asset-pricing model” at work in, say, 1985? By assumption, when you feed data from 1985 into the algorithm, the output will be a “set of priced risk factors” that perfectly explains the cross-section of expected returns in 1985. But, could these risk factors possibly reflect how Madonna-loving 1985 investors were thinking about risk and return? No. Of course not. If the algorithm wasn’t discovered until 2010, could 1985 investors have known about this “set of priced risk factors”?

Let’s make the thought experiment even more extreme. Suppose that little green men come to earth tomorrow and secretly give you an alien computer that operates based on principles never before seen by humans. There’s absolutely nothing like it here on earth. And, this advanced computer comes pre-programmed with correspondingly advanced ML algorithms. And, imagine that one of these algorithms works like the algorithm described above. It always outputs a set of K \geq 1 risk factors that perfectly explain the cross-section of expected returns. Do these risk factors tell us anything about how human investors view risk in earthly markets? Again: No. Of course not. To discover them you had to use an advanced alien technology with absolutely no analog here on earth. So, how could these risk factors be capturing earthly investors’ views about risk and return? The algorithm simply produces an excellent set of predictive variables that take the form of partial correlations with each asset’s returns—i.e., that take the form of \beta_{n,t-1}^{(k)}s.

Machine Learning

I’m quite bullish about the prospects of ML in asset pricing. I think researchers have barely scratched the surface. I just don’t think that approach #2—i.e., searching for the “true asset-pricing model”/”set of priced risk factors”—is a sensible way to apply the ML toolkit. Academics tend to pooh-pooh approach #1 as lacking in economic content, but that charge simply isn’t true. There are lots of situations where we’re perfectly happy to have good return predictions at the price of not understanding where this fit comes from. Traders are obviously OK with this Faustian bargain. But, so too are researchers. It’s not like the Fama-French 3-factor model is popular because we have an economic understanding of what the size and value factors represent.

Financial economists like to think about the market and its investors as something separate. But, it’s just not so. We are the investors in our asset-pricing models. There’s no separation. And, this fact should be reflected in our models. For me, this is the most interesting economic insight that comes with applying ML algorithms to study asset prices. If the tools that we use to find predictors change, then the predictors that our theoretical investors find should change, too. In his AFA presidential address, John Cochrane writes that, “to address these questions in the zoo of new variables, I suspect we will have to use different methods… For one variable, portfolio sorts and regressions both work. But we cannot chop portfolios 27 ways… so, I do not see how to do it by a high-dimensional portfolio sort.” Whatever those different methods end up being (ML or otherwise), we’d better not be modeling asset-pricing equilibria the same way after they get introduced.


Risk-Factor Identification: A Critique

May 26, 2019 by Alex

In standard cross-sectional asset-pricing models, expected returns are governed by exposure to aggregate risk factors in a market populated by fully rational investors. Here’s how these models work. Because investors are fully rational, they correctly anticipate which assets are most likely to have low returns in especially inconvenient future states of the world—i.e., returns that are highly correlated with aggregate risk factors. They won’t be willing to pay as much for the high risk-exposure assets today. So, the price of high risk-exposure assets will drop in equilibrium, giving these assets high expected returns going forward.

With this standard framework in mind, financial economists are constantly on the lookout for assets with similar risk exposures but different average returns. e.g., in a CAPM world, value and growth stocks would have similar average returns after adjusting for market beta; however, in the real world, there’s a 4%-per-year value premium. If investors are fully rational, this finding suggests that they’re worried about more than just aggregate market risk when pricing assets. It suggests they’re also paying attention to another as-yet-unknown risk factor(s). The central challenge in this literature is to figure out which one(s).

Unfortunately, after decades of work, there’s still no general consensus about which aggregate risk factors matter to real-world investors. Instead, the academic literature contains a zoo of candidate risk factors. Correlation with any of these factors will help predict an asset’s expected returns. But, it’s hard to believe that all of these aggregate risk factors actually matter to real-world investors, especially when they “have little in common economically with each other”.

Lax econometric standards are certainly one explanation for this factor zoo. The goal of this post is to suggest another: full rationality. Notice that full rationality plays two different roles in the discussion above. The first is to make sure that investors correctly anticipate the correlation between each asset’s future returns and the aggregate risk factors. If investors are fully rational, then changes in an asset’s risk exposure must be due to changes in fundamentals. The second role is to remove any logical limits on what these aggregate risk factors might be. If investors are fully rational, then they might potentially be worried about any future state of the world a researcher might dream up… and more! The whole premise of learning about the true risk factors requires real-world investors to know things that researchers haven’t yet noticed. And, if investors are fully rational, this additional knowledge might be arbitrarily subtle.

Below I show that, if researchers assume that investors are fully rational in both of the above senses, then identifying the true set of aggregate risk factors used by real-world investors is an impossible goal.

RCT Protocol

Economists think about randomized controlled trials (RCTs) as the gold standard for identification. Here’s how the RCT protocol works. Imagine you’re a medical researcher who’s just discovered a new cancer-treatment drug. You think your new discovery has promise, but the only way to know if it actually works is to give it to cancer patients and see whether they’re more likely to recover. But, how should you do this?

You could just distribute flyers advertising your new drug at the nearest hospital, give your drug to all the cancer patients who respond to the flyers, and then compare the recovery rate of the patients who took your drug to that of the remaining cancer patients. However, this is a bad idea. People try to make the best decision possible given all available knowledge about their current circumstances. So, we should expect that the cancer patients who respond to your flyer will be different from those who do not. We should expect them to be sicker, having exhausted all other treatment options. This means that any difference in recovery rates could be due to your new drug or to underlying differences in patient populations.

What’s more, if patients are optimizing based on information that’s unobservable to you (the researcher), then it doesn’t help to control for the differences in patient populations that you can see. Suppose you found two cancer patients, one who took your drug and one who decided not to, that looked identical in every conceivable way you could measure: both male, both white, both 43 years old, same height and weight, etc… If you really believed that these patients were making fully rational choices based on all the available information they had, then you must be missing something about each of their respective situations. Two identical fully rational people wouldn’t make two radically different life choices given the same information.

In short, to learn whether your new drug works, you have to break the link between drug treatment and patients’ optimal decisions based on (potentially) unobservable information. And, the RCT protocol does this by randomizing which cancer patients get your new drug and which get a sugar pill. You need to find a bunch of patients willing to participate in your study knowing that they have only a 50:50 chance of receiving the new experimental treatment. Then, with enough patients, the law of large numbers makes it very unlikely that the treated patient population will systematically differ from the untreated population. Thus, any difference in the recovery rates of these two groups must be due to your drug regimen.

Model Testing

Now, think about what’s going on when we test a cross-sectional asset-pricing model. A model is just a list of K \geq 1 aggregate risk factors. A fully rational investor will anticipate which assets have returns that are highly correlated with these K aggregate risk factors. So, if the model is correct, differences in expected returns across assets will be explained by differences in exposure to these K aggregate risk factors.

This logic suggests a straightforward empirical approach. To test a cross-sectional asset-pricing model, first separately regress the excess returns of each asset n = 1,\ldots,\,N on the K aggregate risk factors:

(1)   \begin{equation*} \mathit{rx}_{n,t} = \bar{a}_n + {\textstyle \sum_{k=1}^K} \, \bar{b}_{n,k} \cdot f_{k,t} + e_{n,t} \end{equation*}

Run a time-series regression involving t=1,\ldots,\,T observations for each asset. Then, take the estimated slope coefficients from these N regressions, which capture each asset’s exposure to the K aggregate risk factors, \bar{b}_{n,k} \overset{\scriptscriptstyle \text{def}}{=} \overline{\mathrm{Cov}}[\mathit{rx}_{n,t}, \, f_{k,t}] \, \big/ \, \overline{\mathrm{Var}}[f_{k,t}], and test whether differences in risk-factor exposure across assets explain differences in expected returns across assets:

(2)   \begin{equation*} \overline{\mathit{rx}}_n = \hat{\alpha} + {\textstyle \sum_{k=1}^K} \, \hat{\lambda}_k \cdot \bar{b}_{n,k} + \varepsilon_n \end{equation*}

Run one cross-sectional regression involving n=1,\ldots,\,N observations. If you’ve found the true factor model that real-world investors are using, then i) \hat{\lambda}_k > 0 for all k=1,\ldots,\,K, ii) \hat{\alpha} \approx 0, and iii) \widehat{\mathrm{Var}}[\varepsilon_n] \approx 0.
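Here’s a minimal simulated sketch of that two-pass procedure (Python/NumPy, with made-up factors, betas, and prices of risk), just to make the mechanics concrete:

    import numpy as np

    rng = np.random.default_rng(0)
    T, N, K = 2_400, 25, 2

    # Made-up factor realizations, betas, and per-period prices of risk
    f = rng.normal(scale=0.04, size=(T, K))
    f -= f.mean(axis=0)                         # zero in-sample factor means
    beta_true = rng.uniform(0.5, 1.5, size=(N, K))
    lam_true = np.array([0.005, 0.003])

    # Excess returns generated by the factor model plus idiosyncratic noise
    rx = beta_true @ lam_true + f @ beta_true.T + rng.normal(scale=0.02, size=(T, N))

    # Pass 1: time-series regression of each asset on the K factors -> estimated betas
    F = np.column_stack([np.ones(T), f])
    b_hat = np.linalg.lstsq(F, rx, rcond=None)[0][1:].T     # N x K slopes

    # Pass 2: cross-sectional regression of average excess returns on estimated betas
    B = np.column_stack([np.ones(N), b_hat])
    coef = np.linalg.lstsq(B, rx.mean(axis=0), rcond=None)[0]
    alpha_hat, lam_hat = coef[0], coef[1:]

    print(np.round(lam_hat, 4))   # roughly the true prices of risk, [0.005, 0.003]
    print(round(alpha_hat, 4))    # close to zero when the model prices the cross-section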

But, satisfying these three criteria is only a necessary condition. It’s not sufficient for proving you’ve got the right model. Even if a cross-sectional asset-pricing model passes these hurdles, real-world investors might not be using those K aggregate risk factors to price assets. Exposure to the K aggregate risk factors could be the result of correlations with other omitted variables that real-world investors really care about.

This is a question about identification. And, the RCT protocol suggests we can solve it by looking for random variation in an asset’s exposure to each of the K risk factors that has nothing to do with changes in fundamentals. The whole point of using an RCT is to make sure that patient decisions based on unobserved information aren’t causing a spurious link between drug treatment and recovery. And, we want to make sure that investor decisions based on unobserved fundamentals aren’t causing a spurious link between risk exposure and expected returns. We need to block any possibility of an unobserved link between risk-factor exposure and asset fundamentals.

So, imagine that investors perceive a noisy version of each asset’s exposure to the kth risk factor:

(3)   \begin{equation*} \bar{b}_{n,k} = \bar{b}_{n,k}^{\star} + \tilde{b}_{n,k} \end{equation*}

Above, \bar{b}_{n,k}^{\star} denotes the nth asset’s true risk exposure and \tilde{b}_{n,k} denotes noise that’s unrelated to fundamentals. The only way to know that investors are using a particular set of K aggregate risk factors and not some other correlated set of factors is to study how \tilde{b}_{n,k} predicts expected returns. After all, differences in expected returns that are associated with estimation errors, \tilde{b}_{n,k}, can’t be attributed to investors acting strategically based on unobserved information about asset fundamentals.

Impossible Goal

By now, you probably see the logical trap that’s been laid. A fully rational investor might potentially be reacting to any piece of unobserved information about an asset’s fundamentals. So, non-fundamental variation in their perception of risk exposure is crucial to identifying the model they’re using. But, non-fundamental variation in perceived risk exposure would represent an error. And, fully rational investors don’t make errors. Thus, if we are adamant that real-world investors are fully rational, then we must give up any hope of identifying the cross-sectional asset-pricing model they’re using.

Note that this impossibility result doesn’t say that investors need to be completely irrational… far from it. The true \bar{b}_{n,k}^{\star} has to have some bearing on investors’ perceived \bar{b}_{n,k}. If investors aren’t strategically adjusting their demand today in response to actual future risks, then cross-sectional asset-pricing models have no content. Rather, the impossibility result says that, for researchers to identify the cross-sectional asset-pricing model that real-world investors are using, these perceptions can’t be perfectly accurate. For a useful analogy, think about every spy thriller with a canary trap that you’ve ever seen. In order for one spy to figure out what the other knows, he’s got to see how his adversary reacts to planted fake intel. If his foe always sees through the ploy (i.e., if his foe is “fully rational” in High Economyan), then there’s no hope of any success.

This impossibility result also suggests a new use for many of the cognitive errors documented by behavioral economists: as tools for testing whether or not real-world investors care about exposure to particular risk factors. The existing behavioral-finance literature contains a ready supply of \tilde{b}_{n,k}s.


The Basic Recipe For Rationalizing Errors In Belief

February 3, 2019 by Alex

Behavioral-finance models are often written down so that, although each individual trader holds incorrect beliefs, market events nevertheless unfold in such a way that traders can rationalize their own errors. e.g., consider the model in Scheinkman and Xiong (2003). In this model, each individual trader knows that every other trader is over-confident, and he knows that every other trader thinks that he himself is over-confident. He just doesn’t think that they’re correct. He pig-headedly insists that he’s the only unbiased trader. And yet, in spite of this error, the model is set up so that he can interpret the realized price path in his own internally consistent way. Each trader thinks the price distortion caused by his own over-confidence is actually coming from the value of the option to resell at a later date to some other over-confident trader.

There’s a good reason why researchers write down models this way. The idea is to write down a model that’s exactly one step away from a rational benchmark. That way, any new predictions made by the model can be attributed to the behavioral bias. In this post, I first outline the basic recipe for rationalizing traders’ errors in beliefs. Then, I point out something slightly paradoxical about this recipe—namely, it requires fine-tuning the model parameters. And, while a researcher can do this fine-tuning in a theoretical model, it’s not clear who can turn the appropriate knobs in the real world. These models are like stage magic. And, while we can learn about which cognitive biases people suffer from by studying a good magician’s sleight of hand, most missing coins wind up between the couch cushions rather than in The Amazing Randi‘s pocket.

Errors In Belief

Here’s a simple framework for digesting errors in beliefs. To start with, consider a market where a trader has correct beliefs. i.e., suppose that a trader receives a noiseless signal:

    \begin{equation*} \mathit{Signal} = \mathit{News} \qquad \text{where} \qquad \mathit{News} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(0, \, 1) \end{equation*}

Then, given the trader’s optimal demand in response to this noiseless signal, suppose that the structural relationship between the trader’s noiseless signal and realized returns is given by:

    \begin{equation*} \mathit{Return} = \beta^{\star} \cdot \mathit{Signal} + \varepsilon^{\star} \qquad \text{where} \qquad \varepsilon^{\star} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(0, \, 1) \end{equation*}

We typically think that \beta^\star \in (0, \, 1) with larger values of \beta^\star indicating more informative prices. I’m using the term “structural relationship” for \beta^{\star} \overset{\scriptscriptstyle \mathrm{def}}{=} {\textstyle \frac{\partial \phantom{s}}{\partial s}} \mathrm{E}^{\star}[ \, \mathit{Return} \, | \, \mathrm{do}(\mathit{Signal} = s) \, ] because this parameter reflects the expected change in returns due to an exogenous shift in the trader’s signal. Note that this structural relationship could reflect other traders’ errors in belief, as was the case in Scheinkman and Xiong (2003).

But, in reality, suppose that the trader is over-confident about the precision of his signal. While he thinks it’s noiseless, his signal actually contains noise:

    \begin{equation*} \mathit{Signal} = \alpha \cdot \mathit{Noise} + \sqrt{1 - \alpha^2} \cdot \mathit{News} \qquad \text{where} \qquad \mathit{Noise} \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{N}(0, \, 1) \end{equation*}

And, the parameter \alpha \in [0, \, 1] governs the relative contribution of noise to the trader’s signal: \alpha = 0 corresponds to correct beliefs; whereas, \alpha = 1 corresponds to a signal that is pure noise. Then, given the trader’s optimal demand, suppose that the structural relationship between the trader’s noisy signal and realized returns is actually given by:

    \begin{equation*} \mathit{Return} = \beta \cdot \mathit{Signal} + \varepsilon \qquad \text{where} \qquad \varepsilon \sim \mathrm{N}(0, \, 1) \end{equation*}

Notice that, in reality, idiosyncratic-return shocks are no longer drawn IID. Let \rho \overset{\scriptscriptstyle \mathrm{def}}{=} \mathrm{Cor}[\mathit{News}, \, \varepsilon] denote the correlation between the news about fundamentals in the trader’s signal and idiosyncratic-return shocks. e.g., in a model of disagreement, you might think about \rho < 0 due to the existence of another trader whose disagreement stems from negatively correlated signals or negatively correlated mistakes.

The Basic Recipe

Suppose that the trader, who doesn’t realize that he’s getting a noisy signal, is still carefully monitoring price informativeness. i.e., he’s carefully monitoring the relationship between his signal and realized returns. Here’s what it would take for this trader to rationalize his error in beliefs. Notice that the covariance of the trader’s signal and market returns is given by:

    \begin{align*} \mathrm{Cov}[\mathit{Return}, \, \mathit{Signal}] &= \mathrm{Cov}[\beta \cdot \mathit{Signal} + \varepsilon, \, \mathit{Signal}] \\ &= \mathrm{Cov}[\beta \cdot \mathit{Signal}, \,\mathit{Signal}] + \mathrm{Cov}[\varepsilon, \, \mathit{Signal}] \\ &= \beta + \mathrm{Cov}\big[ \, \varepsilon, \, \alpha \cdot \mathit{Noise} + \sqrt{1-\alpha^2} \cdot \mathit{News} \, \big] \\ &= \beta + \rho \cdot \sqrt{1-\alpha^2} \end{align*}

So, since the variance of his signal is \mathrm{Var}[\mathit{Signal}] = 1, if the trader regresses realized returns on his signal, he’ll find a slope coefficient of

    \begin{equation*} \hat{\beta}^{\text{OLS}} \overset{\scriptscriptstyle \mathrm{def}}{=} {\textstyle \frac{\mathrm{Cov}[\mathit{Return}, \, \mathit{Signal}]}{\mathrm{Var}[\mathit{Signal}]}} = \beta + \rho \cdot \sqrt{1-\alpha^2} \end{equation*}

Thus, if a researcher chooses the values of \rho and \alpha so that \beta^{\star} = \hat{\beta}^{\text{OLS}} = \beta + \rho \cdot \sqrt{1-\alpha^2}, then the trader will see data that’s consistent with his erroneous belief about his signal being noiseless.

It’s important to emphasize that, when a researcher chooses \alpha and \rho so that \beta^{\star} - \beta = \rho \cdot \sqrt{1-\alpha^2}, he’s not giving the trader correct beliefs, though. Although price informativeness will look correct to the trader, his error in beliefs will still cause returns to respond to pure noise. The covariance of noise and returns will be:

    \begin{align*} \mathrm{Cov}[\mathit{Return}, \, \mathit{Noise}] &= \mathrm{Cov}[\beta \cdot \mathit{Signal} + \varepsilon, \, \mathit{Noise}] \\ &= \mathrm{Cov}\big[ \, \beta \cdot \big( \, \alpha \cdot \mathit{Noise} + \sqrt{1-\alpha^2} \cdot \mathit{News} \, \big), \, \mathit{Noise} \, \big] + \mathrm{Cov}[\varepsilon, \, \mathit{Noise}] \\ &= \beta \cdot \alpha \end{align*}

So, returns will react to pure noise whenever \alpha > 0. And, in principle, the trader could notice this fact if he cared to inspect \mathrm{Cov}[\mathit{Return}, \, \mathit{Noise}] rather than just \mathrm{Cov}[\mathit{Return}, \, \mathit{Signal}].
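Here’s a minimal simulation of the recipe (Python/NumPy, with made-up parameter values). It shows both facts at once: the trader’s regression of returns on his signal recovers \beta^{\star}, even though returns still load on pure noise:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # Made-up parameter values chosen so that beta_star = beta + rho * sqrt(1 - alpha^2)
    alpha, beta, beta_star = 0.6, 0.5, 0.9
    rho = (beta_star - beta) / np.sqrt(1 - alpha**2)    # = 0.5, the required fine-tuning

    news  = rng.normal(size=n)
    noise = rng.normal(size=n)
    eps   = rho * news + np.sqrt(1 - rho**2) * rng.normal(size=n)   # Cor[news, eps] = rho

    signal  = alpha * noise + np.sqrt(1 - alpha**2) * news
    returns = beta * signal + eps

    # The trader's regression of returns on his signal recovers beta_star, not beta,
    # so the data look consistent with his belief that the signal is noiseless...
    print(np.cov(returns, signal)[0, 1] / np.var(signal))   # ~ 0.9 = beta_star

    # ...even though returns still respond to pure noise (Cov = beta * alpha)
    print(np.cov(returns, noise)[0, 1])                     # ~ 0.3 = beta * alpha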

Like Stage Magic

That’s how you write down a model where biased traders can rationalize their own errors in belief. The basic recipe is simple enough. Just introduce a hidden correlation into the information structure of the model (i.e., the parameter \rho) and then fine-tune this correlation so that it cancels out the effects of the trader’s behavioral bias (the parameter \alpha). It’s really pretty when this sort of cancellation takes place. Models that manage to use this basic recipe, such as Scheinkman and Xiong (2003), are really beautiful. But, this approach raises an obvious question: in the real world, why should we expect \alpha and \rho to take on the precise values needed to hide a trader’s error? Where does the required fine-tuning come from? What’s the underlying mechanism at work?

These models are like stage magic. They’re expertly scripted illusions that demonstrate how behavioral biases can go undetected… even by traders who are actively trying to detect them. And, this is not a slight. This is really informative in the same way that going to a good magic show is really informative. It teaches you something useful about the limits of human perception, about how your attention can be managed, about how you can be deceived. But, you don’t leave magic shows thinking that the next deck of cards you open will contain 52 copies of the 6\clubsuit because you happened to be thinking of that card when you opened the box. No one expects everyday situations to operate by the rules of stage magic. Most of the time, there’s no magician to carefully script the illusion. And, the same logic applies to financial markets. It’s useful to know that you can fine-tune parameters to hide an error, but we shouldn’t assume that markets typically operate with the parameters dialed in this way. Why should we? Who exactly would be the one turning the knobs?


The Existence Of A Bubble vs. The Timing Of Its Crash

October 31, 2018 by Alex

Journalists love to talk about bubbles. The Wall Street Journal has hinted at bubbles in both the Chinese stock market and the market for Bitcoin during the past month alone. But, financial economists are much more reluctant to call something a bubble. There’s debate about whether bubbles even exist. And, much of this debate revolves around whether it’s possible to predict the timing of the resulting crash. If bubbles really do exist, then it seems like there should be some theory of when they’re going to pop (Fama, 2016).

There’s a good reason why financial economists care so much about the timing of the crash: this is what matters most to traders. Put yourself in the shoes of a trader in mid August 1987. The DJIA has risen roughly 12% since the beginning of July. What you want to know is, “When’s this party going to end?” Should you get out now? Or, should you keep dancing for another few months? It would be really useful to have a theory that answered these questions. And, it would be super useful to know which variables help predict the timing of the crash.

But, here’s the thing: traders aren’t the only people who care about bubbles. And, the crash ain’t the only thing worth modeling. To illustrate, put yourself in the shoes of a market regulator in August 1987. You’re staring at the same data as before. But now, what you really want to know is, “Is this party going to end? Does it represent a bubble?” If it does, then it doesn’t matter to you when the crash happens. Black Monday or Black Thursday; October 1987 or May 1988; it’s all the same to you. The same number of people will be harmed in all cases.

There are good reasons to worry about the existence of a bubble even if you can’t predict the timing of the crash. What are these reasons? That’s the topic of today’s post.

A Simple Model

One way to think about a speculative bubble is as a kind of Ponzi scheme (e.g., see here). Here’s a simple model. Imagine a group of traders that tend to build enormous long positions whenever they enter the market—i.e., these traders each have excess demand. And, whenever they build a position, suppose that they hold this position for a limited period of time–e.g., six months or a year. Once this time is up, they cash out. A bubble starts when one of these traders enters the market and drives up the price a little bit with his excess demand. If the market doesn’t crash before this first trader’s time is up, then the resulting price increase attracts additional traders. So, when the first trader finally cashes out, two more traders take his place. If the market doesn’t crash before these two traders exit the market, then the further price increase caused by these two traders’ excess demand attracts four additional traders, and so on… Thus, the Nth round of a bubble contains 2^{N-1} traders.

Let \theta \in (0, \, 1] denote the probability that the market crashes during any given round. The probability that a bubble lasts exactly one round—i.e., that the bubble ends immediately—is \theta. The probability that a bubble lasts exactly two rounds is (1-\theta) \times \theta. The probability that a bubble lasts exactly three rounds is (1-\theta)^2 \times \theta. In general, we have that \mathrm{Pr}[N=n|\theta] = (1 - \theta)^{n-1} \times \theta. Thus, larger values of \theta imply that a bubble will crash sooner:

    \begin{equation*} \mathrm{E}[N|\theta] = {\textstyle \sum_{n=1}^\infty} \, n \cdot (1-\theta)^{n-1} \times \theta = 1/\theta \end{equation*}

The number of rounds in any given bubble episode represents a draw from a geometric distribution.
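A quick simulation (NumPy, with a made-up \theta) confirms the expected duration:

    import numpy as np

    rng = np.random.default_rng(0)
    theta = 0.25   # made-up per-round crash probability

    # Number of rounds a bubble lasts: geometric draws on {1, 2, 3, ...}
    n_rounds = rng.geometric(theta, size=1_000_000)
    print(n_rounds.mean())   # ~ 4.0 = 1 / theta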

Regulator’s Viewpoint

Here’s the important thing from a regulator’s point of view: once it crashes, most traders involved in a bubble will have lost money. If the bubble immediately crashes, then the first trader will be its last. 1/1=100\% of all traders involved will have lost money. If the bubble collapses during the second wave of traders, then there will be three traders in total. And, after the crash, two of them will have lost money, which corresponds to 2/(1+2) \approx 67\% of all traders involved. If the bubble collapses during the third round, then there will be seven traders in total. Four of them will have lost money after all is said and done. So, the casualty rate will be 4/(1+2+4) \approx 57\%. In general, if the bubble lasts N rounds, then after the bubble bursts a fraction

    \begin{equation*} \mathrm{F}[N] = 2^{N-1} \big/ \, \big({\textstyle \sum_{n=1}^N} \, 2^{n-1} \big) \end{equation*}

of all traders will have lost money.

It’s easy to see that \mathrm{F}[N] > 50\% for all N \in \{1, \, 2,\, 3, \ldots\}. The figure to the right presents one way of looking at it. It shows the fraction of traders that will have lost money after a bubble ends, \mathrm{F}[N] (y-axis), as a function of how long the bubble lasts, N (x-axis). The fraction of traders who wind up suffering losses asymptotes towards 50\% from above as N gets larger and larger, but it never quite gets there. Here’s another way to look at it. Notice that \sum_{n=1}^N 2^{n-1} = 2^N-1. In other words, we have that 1 + 2 = 4-1 and 1 + 2 + 4 = 8-1 and so on… Thus, the number of traders who will have lost money after a bubble pops, 2^{N-1}, is always a little more than half of all traders involved in the bubble, 2^N -1, regardless of how long the bubble episode lasts.

What I love about this example is that \mathrm{F}[N] > 50\% is true for all choices of N \geq 1. It’s a statement that holds pointwise. More than half of all traders will have been harmed by a bubble no matter how likely it is that the bubble ends next period. Changing \theta doesn’t matter. This is the sense in which a regulator cares about the likelihood of a bubble taking place but not the timing of the crash. If the regulator is trying to maximize overall well-being, then he wants to make sure this doubling process doesn’t get started in the first place. Why does he care when it ends? It’s going to be socially harmful no matter the timing.
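Here’s a minimal sketch of \mathrm{F}[N] in plain Python, which makes the pointwise claim easy to eyeball:

    # Fraction of traders who will have lost money if a bubble lasts N rounds:
    # the final round's 2^(N-1) traders out of the 2^N - 1 traders involved in total.
    def loss_fraction(n_rounds: int) -> float:
        return 2 ** (n_rounds - 1) / (2 ** n_rounds - 1)

    for n in (1, 2, 3, 5, 10):
        print(n, round(loss_fraction(n), 4))
    # 1 -> 1.0, 2 -> 0.6667, 3 -> 0.5714, 5 -> 0.5161, 10 -> 0.5005
    # always strictly above 1/2, approaching it from above as N grows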

Trader’s Perspective

In addition, this regulator-indifference result is perfectly consistent with the idea that traders care a lot about the timing of the crash. To make things simple, suppose that a trader will lose \mathdollar 1m if he belongs to the Nth and final round that experiences the crash. But, he will profit by \mathdollar 1m if he’s in one of the (N-1) earlier rounds, which cash out before then. In this setting, each trader has expected profit given by:

    \begin{equation*} (1 - 2 \cdot \theta) \times \mathdollar 1m = (1-\theta) \times \mathdollar 1m + \theta \times (-\mathdollar 1m) \end{equation*}

Thus, entering the market and trying to ride the bubble only makes sense for a trader if \theta \in (0, \, 0.5)—i.e., if the probability of a market crash in the next round is less than 50\%. In order to profit from a bubble, you have to get out before the crash. So, traders clearly care about the timing of the crash. Whether \theta = 0.49 or \theta = 0.51 makes a big difference to them. It’s just that this difference doesn’t matter to a regulator.

