Imagine you’re an asset-pricing researcher. You’ve just thought up a new variable, $x$, that might predict the cross-section of returns. And you’ve regressed returns on $x$ in a market environment $e$ of your choosing (i.e., using data on some specific time period, country, asset class, set of test assets, etc.):
$$r_n = \hat{a}(e) + \hat{b}(e) \cdot x_n + \hat{u}_n \qquad \text{for each asset } n \text{ in environment } e \tag{1}$$
If differences in $x$ predict differences in returns in your chosen market environment $e$, the estimated slope coefficient will be large, $|\hat{b}(e)| \gg 0$. It would’ve been profitable to trade on the predictor in sample.
Suppose you find $\hat{b}(e) > 0$. Assets with higher $x$ values today tend to have higher returns tomorrow. You now face a choice about whether to publish this finding. If you do, then other researchers will read your paper and try to replicate it in other market environments you haven’t yet looked at, $e' \neq e$. Let $\mathcal{E}$ denote the collection of all out-of-sample market environments that your colleagues might examine.
Obviously, you shouldn’t publish if $x$ isn’t a good cross-sectional predictor in most of these out-of-sample environments (i.e., you shouldn’t publish if $\mathrm{Avg}_{e' \in \mathcal{E}}[\hat{b}(e')] \leq 0$). But, even if $x$ is a good predictor on average, you still worry about worst-case scenarios. If there’s one market environment where $\hat{b}(e') < 0$, then one of your colleagues will surely discover it and you’ll look utterly foolish when he tells the world. You only want to publish if $x$ robustly predicts returns out-of-sample:
$$(1 - \lambda) \cdot \mathrm{Avg}_{e' \in \mathcal{E}}[\hat{b}(e')] + \lambda \cdot \min_{e' \in \mathcal{E}} \hat{b}(e') > 0 \tag{2}$$
$\lambda \in [0, 1]$ captures the relative importance of these two considerations to your publication decision. The larger the $\lambda$, the more you care about saving face by not publishing any really bad predictions.
Importantly, let’s assume that all you care about when doing research is solving this robust out-of-sample prediction problem. You don’t care at all about whether investors actually price assets based on $x$. All that matters is whether $x$ reliably predicts returns out-of-sample. You’re completely drunk on Friedman’s “as if” Kool-Aid. Before deciding whether to publish, you have a choice as to which market environment to examine. What sort of environment should you choose? What should your empirical strategy be?
The key insight in this post is that, even if all you care about is robust out-of-sample performance, causal inference still turns out to be a useful tool for achieving this goal. If investors always use the same model to price assets, then understanding this model will allow you to always make good predictions. Your empirical strategy should be to choose an empirical environment that identifies the causal effect of $x$ on returns.
Investors’ model
I begin by defining investors’ model. Suppose that, in every market environment, investors price each asset so that its returns are governed by the following linear structural model:
$$r_n = \beta \cdot x_n + \gamma \cdot z_n + \sigma \cdot \varepsilon_n \tag{3}$$
Moreover, assume that the parameters, $(\beta, \gamma, \sigma)$, are the same in every market environment. $x_n$ is the cross-sectional predictor that you’re working on, and $z_n$ is an omitted variable. This is a variable that investors might be using to price assets but researchers have yet to discover. If it’s 1981 and $x_n$ is firm size, then $z_n$ might be liquidity. $\varepsilon_n$ is a noise term and $\sigma$ captures its effect on returns.
Crucially, either $x$ affects returns, $\beta \neq 0$, or $z$ affects returns, $\gamma \neq 0$, but not both in investors’ model. If $\beta \neq 0$, then $x$ reliably predicts the cross-section of returns since $\beta$ is the same in every environment (i.e., in every time period, country, etc.). If $\beta = 0$, then any predictability associated with $x$ is spurious. Let
$$\Theta = \{ (\beta, \gamma) : \beta \neq 0, \, \gamma = 0 \} \cup \{ (\beta, \gamma) : \beta = 0, \, \gamma \neq 0 \} \tag{4}$$
denote the entire range of possible values that $\beta$ and $\gamma$ might take on.
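To keep things simple, suppose that the realized values of $x_n$, $z_n$, and $\varepsilon_n$ for each asset in a given market environment $e$ are drawn IID normal:
$$\begin{pmatrix} x_n \\ z_n \\ \varepsilon_n \end{pmatrix} \overset{\text{IID}}{\sim} \mathrm{N}\left( \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \, \begin{pmatrix} 1 & \rho(e) & 0 \\ \rho(e) & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right) \tag{5}$$
$x_n$, $z_n$, and $\varepsilon_n$ all have mean zero and unit variance. The noise term $\varepsilon_n$ is uncorrelated with both $x_n$ and $z_n$ in every market environment, $\mathrm{Corr}[x_n, \varepsilon_n] = \mathrm{Corr}[z_n, \varepsilon_n] = 0$. However, $x_n$ and $z_n$ may be correlated across stocks, $\mathrm{Corr}[x_n, z_n] = \rho(e)$. Moreover, this correlation can differ across market environments. In other words, $x_n$ and $z_n$ may be highly correlated in one time period/country/asset class/etc. but not in another.
Note that Equations (3) and (5) imply asset returns are zero on average, $\mathrm{E}[r_n] = 0$, in every market environment. I’m making this assumption to keep the math simple. If it really bothers you, just think about $r_n$ as an asset’s residual return that is unexplained by other trading signals. Let’s also assume that $\mathrm{Var}[r_n] = 1$ in every market environment for the same reasons. Because only one of $\beta$ and $\gamma$ is non-zero, this assumption implies that $\beta^2 + \gamma^2 + \sigma^2 = 1$.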
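To make the setup concrete, here’s a minimal simulation sketch of one market environment. The names are my own assumptions for illustration: `beta`, `gamma`, and `rho` stand for the loadings on the predictor and the omitted variable and for the correlation between the two, and `simulate_environment` and `ols_slope` are hypothetical helpers. The parameter values are made up. The point is only that a causal environment and a spurious one can produce nearly the same in-sample slope.

```python
import math
import random


def simulate_environment(beta, gamma, rho, n_assets=20_000, seed=0):
    """Simulate one cross-section: draws follow Equation (5), returns follow Equation (3)."""
    if beta != 0.0 and gamma != 0.0:
        raise ValueError("the model allows only one of beta and gamma to be non-zero")
    # Unit return variance pins down the noise loading: beta^2 + gamma^2 + sigma^2 = 1.
    sigma = math.sqrt(1.0 - beta**2 - gamma**2)
    rng = random.Random(seed)
    xs, rs = [], []
    for _ in range(n_assets):
        x = rng.gauss(0.0, 1.0)
        # Build z with unit variance and correlation rho with x.
        z = rho * x + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        xs.append(x)
        rs.append(beta * x + gamma * z + sigma * eps)
    return xs, rs


def ols_slope(xs, rs):
    """Slope from a univariate cross-sectional regression of returns on x (with intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_r = sum(rs) / len(rs)
    cov_xr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, rs))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xr / var_x


if __name__ == "__main__":
    # Two hypothetical environments whose large-sample slope is 0.3 in both cases.
    causal = ols_slope(*simulate_environment(beta=0.3, gamma=0.0, rho=0.8))
    spurious = ols_slope(*simulate_environment(beta=0.0, gamma=0.6, rho=0.5))
    print(f"slope when x is priced:       {causal:.3f}")
    print(f"slope when only z is priced:  {spurious:.3f}")
```

Looking at the regression output alone, there’s no way to tell these two environments apart.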
Two explanations
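When you regressed the cross-section of returns on $x$ in your chosen market environment $e$, you found that $\hat{b}(e) > 0$. Given the structure of investors’ model, we know that either $x$ predicts the cross-section of returns or $z$ predicts the cross-section of returns but not both:
$$\hat{b}(e) = \begin{cases} \beta & \text{if } \beta \neq 0 \text{ and } \gamma = 0 \\ \gamma \cdot \rho(e) & \text{if } \beta = 0 \text{ and } \gamma \neq 0 \end{cases} \tag{6}$$
It might be that you estimated $\hat{b}(e) > 0$ in market environment $e$ because $\beta > 0$ in every environment. Or it might be that you estimated $\hat{b}(e) > 0$ in market environment $e$ because $x$ happened to be correlated with an omitted variable in that environment, $\gamma \cdot \rho(e) > 0$. These are the two possible explanations.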
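Where do these two explanations come from? Here’s a short derivation, written as a sketch in notation that I’m assuming for illustration: $\beta$, $\gamma$, and $\sigma$ are the loadings on the predictor $x_n$, the omitted variable $z_n$, and the noise term $\varepsilon_n$ in Equation (3); $\rho(e)$ is the correlation between $x_n$ and $z_n$ in environment $e$; and $\hat{b}(e)$ is the regression slope. It also assumes there are enough assets that sampling error can be ignored.

```latex
\begin{align*}
\hat{b}(e)
  &= \frac{\mathrm{Cov}[r_n, x_n]}{\mathrm{Var}[x_n]} \\
  &= \beta \cdot \mathrm{Var}[x_n]
     + \gamma \cdot \mathrm{Cov}[z_n, x_n]
     + \sigma \cdot \mathrm{Cov}[\varepsilon_n, x_n]
     && \text{since } \mathrm{Var}[x_n] = 1 \\
  &= \beta + \gamma \cdot \rho(e)
     && \text{since } \mathrm{Cov}[\varepsilon_n, x_n] = 0
\end{align*}
```

Because only one of $\beta$ and $\gamma$ is non-zero, the slope is either $\beta$, which is the same in every environment, or $\gamma \cdot \rho(e)$, which moves with the environment-specific correlation.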
Since you’re focused on robust out-of-sample performance in other market environments $e' \in \mathcal{E}$, the reason why $\hat{b}(e) > 0$ in-sample is very important. If $\hat{b}(e) > 0$ merely because $\gamma \cdot \rho(e) > 0$, then $x$ will only be a good cross-sectional predictor in other market environments where $x$ and $z$ are similarly correlated, $\rho(e') \approx \rho(e)$. Under this explanation, it’s possible to imagine market environments where $x$ is an abysmal predictor. Just look for environments where $\gamma \cdot \rho(e') < 0$.
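Here’s a small numerical sketch of why the reason matters. It assumes the large-sample slope takes the form `beta + gamma * rho`, where `beta` and `gamma` are my labels for the loadings on the predictor and the omitted variable and `rho` is their correlation in a given environment. The function name and all parameter values are hypothetical.

```python
def population_slope(beta, gamma, rho):
    """Large-sample slope from regressing returns on the predictor: beta + gamma * rho."""
    return beta + gamma * rho


# Hypothetical values chosen so both explanations give the same in-sample slope.
explanations = {
    "causal   (beta=0.3, gamma=0.0)": (0.3, 0.0),
    "spurious (beta=0.0, gamma=0.6)": (0.0, 0.6),
}
rho_in_sample = 0.5
out_of_sample_rhos = [0.5, 0.0, -0.5]

for label, (beta, gamma) in explanations.items():
    in_sample = population_slope(beta, gamma, rho_in_sample)
    out_of_sample = [population_slope(beta, gamma, rho) for rho in out_of_sample_rhos]
    print(label, "| in-sample:", round(in_sample, 2),
          "| out-of-sample:", [round(b, 2) for b in out_of_sample])
```

Under the causal explanation the slope doesn’t move when the correlation changes. Under the spurious explanation it shrinks to zero and then flips sign.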
Causal inference
What needs to be true about market environment $e$ if you want to be able to distinguish between these two explanations? The answer boils down to an identifying assumption about the range of values that $\rho(e)$ might take on:
$$e = \big( (\beta, \gamma, \sigma), \, \mathcal{R}(e), \, \rho(e) \big) \qquad \text{with } \rho(e) \in \mathcal{R}(e) \subseteq [-1, 1] \tag{7}$$
A market environment, $e$, consists of a set of structural parameters, $(\beta, \gamma, \sigma)$; a range of possible values for the correlation between $x$ and $z$, $\mathcal{R}(e)$; and a particular choice for this value, $\rho(e)$.
If market environments $e$ and $e'$ have the same structural parameters and the same realized correlation, $\rho(e) = \rho(e')$, then the cross-sectional slope coefficient will be the same in both environments, $\hat{b}(e) = \hat{b}(e')$. Yet, you will interpret the slope coefficient differently in each environment if $\mathcal{R}(e) \neq \mathcal{R}(e')$. By analogy, medical researchers will draw different conclusions about a drug’s efficacy from an RCT than from an observational study even if the joint distribution of patient outcomes and observable characteristics is the same in both datasets. If $\mathcal{R}(e) = \{0\}$, then $\hat{b}(e) > 0$ identifies $\beta > 0$ as the correct explanation. There’s no way to have $\gamma \cdot \rho(e) > 0$ in such an environment. By contrast, if $\mathcal{R}(e) = [-1, 1]$, then $\hat{b}(e) > 0$ could be explained either by $\beta > 0$ or by $\gamma \cdot \rho(e) > 0$.
Note that it isn’t possible to choose a market environment where $\mathcal{R}(e)$ consists of an arbitrarily small neighborhood around zero. The omitted variable can explain no more than $100\%$ of the variation in returns across assets. That would occur if $|\gamma| = 1$ since we are assuming $\mathrm{Var}[r_n] = 1$. Hence, if $\hat{b}(e) > 0$ is due to a spurious correlation, then this correlation must be bounded away from zero:
$$|\rho(e)| = \frac{\hat{b}(e)}{|\gamma|} \geq \hat{b}(e) > 0 \tag{8}$$
This digital zero/non-zero distinction is why it’s possible to map out causal effects using path diagrams. A path between two variables must be contemplated whenever they could have a non-zero correlation.
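The bound on the spurious correlation is easy to check numerically. The sketch below assumes a purely spurious slope equals `gamma * rho`, where `gamma` is my label for the omitted variable’s loading (at most one in absolute value, given unit-variance returns) and `rho` is its correlation with the predictor. `required_correlation` is a hypothetical helper and the numbers are illustrative.

```python
def required_correlation(b_hat, gamma):
    """Correlation needed for a purely spurious slope of b_hat, given loading gamma."""
    if gamma == 0.0 or abs(gamma) > 1.0:
        raise ValueError("a spurious explanation needs 0 < |gamma| <= 1")
    rho = b_hat / gamma  # the spurious slope is gamma * rho
    if abs(rho) > 1.0:
        raise ValueError("|gamma| is too small to generate a slope this large")
    return rho


b_hat = 0.3  # hypothetical in-sample slope
for gamma in [1.0, 0.6, 0.3, -0.6]:
    rho = required_correlation(b_hat, gamma)
    print(f"gamma = {gamma:+.1f} requires rho = {rho:+.3f}")
```

Whatever admissible `gamma` you pick, the required correlation is never smaller than `b_hat` in absolute value.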
Out-of-sample environments
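When you regressed the cross-section of returns on $x$ in market environment $e$, you found $\hat{b}(e) > 0$. We can now give a precise definition for the set of all out-of-sample market environments that other researchers might try to replicate this finding in. Let
$$\hat{\Theta}(e) = \big\{ (\beta, \gamma) \in \Theta : \beta + \gamma \cdot \rho = \hat{b}(e) \text{ for some } \rho \in \mathcal{R}(e) \big\} \tag{9}$$
denote the range of possible values for $\beta$ and $\gamma$ that are consistent with your initial estimate $\hat{b}(e)$ given $\mathcal{R}(e)$. If $x$ is guaranteed to be uncorrelated with the omitted variable, $\mathcal{R}(e) = \{0\}$, then $\hat{\Theta}(e) = \{ (\hat{b}(e), 0) \}$ and we say that market environment $e$ identifies changes in $x$ as having a causal effect on the cross-section of returns.
Given the set of all $\beta$ and $\gamma$ values that are consistent with your initial result $\hat{b}(e) > 0$, the range of potential out-of-sample market environments is defined as follows:
$$\mathcal{E} = \big\{ e' : (\beta, \gamma) \in \hat{\Theta}(e) \text{ and } \rho(e') \in [-1, 1] \big\} \tag{10}$$
This collection of market environments consists of any environment which could be generated by some $\beta$ and $\gamma$ that’s consistent with your initial result combined with any possible value of $\rho(e') \in [-1, 1]$.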
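The set of explanations that survive your initial regression can be enumerated on a grid. This is a rough sketch under my own assumptions: `b_hat` is the in-sample slope, `rho_range` is the interval of correlations the environment allows between the predictor and the omitted variable, and `gamma_grid` is an arbitrary discretization of the omitted variable’s loading. `consistent_explanations` is a hypothetical helper, not something from the setup above.

```python
def consistent_explanations(b_hat, rho_range, gamma_grid):
    """Grid version of Equation (9): (beta, gamma) pairs able to generate slope b_hat."""
    rho_lo, rho_hi = rho_range
    # The causal explanation (beta = b_hat, gamma = 0) is always consistent.
    explanations = [(b_hat, 0.0)]
    for gamma in gamma_grid:
        if gamma == 0.0:
            continue
        rho_needed = b_hat / gamma  # a spurious slope requires gamma * rho = b_hat
        if rho_lo <= rho_needed <= rho_hi:
            explanations.append((0.0, gamma))
    return explanations


gamma_grid = [g / 10 for g in range(-10, 11)]  # hypothetical grid on [-1, 1]

identified = consistent_explanations(0.3, (0.0, 0.0), gamma_grid)
unidentified = consistent_explanations(0.3, (-1.0, 1.0), gamma_grid)

print("identified environment:  ", identified)
print("unidentified environment:", unidentified)
```

When the allowed correlation range collapses to zero, only the causal explanation is left. When any correlation is allowed, every loading at least as large as the slope in absolute value survives as a spurious alternative.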
Research strategy
If you chose a market environment $e$ that identified the causal effect of $x$ on the cross-section of returns for your initial test, then your estimate of $\hat{b}(e) > 0$ would imply that $\hat{b}(e') = \beta = \hat{b}(e) > 0$ in every out-of-sample market environment. The left-hand side of Equation (2) would reduce to:
$$(1 - \lambda) \cdot \beta + \lambda \cdot \beta = \beta > 0 \tag{11}$$
The finding would be robust out-of-sample, and you should publish it.
By contrast, if you chose a market environment $e$ that did not identify the causal effect of $x$ on the cross-section of returns, then your estimate of $\hat{b}(e) > 0$ would be harder to interpret. It could be that $\beta > 0$ or it could be that $\gamma \cdot \rho(e) > 0$. If the latter is true, then we would say that $\hat{b}(e) > 0$ reflects a spurious correlation. And to make this spurious correlation look as bad as possible out-of-sample, other researchers should look for a market environment where $\gamma \cdot \rho(e')$ is as negative as possible:
$$\min_{\rho(e') \in [-1, 1]} \gamma \cdot \rho(e') = -|\gamma| \leq -\hat{b}(e) < 0 \tag{12}$$
Absent identification, you have to entertain this possibility. So you may refuse to publish strong results with good average-case out-of-sample performance for fear of being embarrassed by worst-case predictions.
Thus, as outlined at the beginning, even if all you care about as a researcher is publishing results that have robust out-of-sample performance, causal inference still turns out to be relevant. It’s a very useful tool for achieving this goal. If investors are always using the same model to price assets, then understanding this model will allow you to always make good predictions. So you should consider adopting a research strategy whereby you insist on testing each new predictor in an identified market environment $e$.
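Here’s a minimal sketch of the publication rule in Equation (2) applied to both cases. I’m assuming the rule is a weighted average of the mean and the worst-case out-of-sample slope, with a weight `lam` between zero and one on the worst case. The function names and the slope values are hypothetical.

```python
def robustness_score(slopes, lam):
    """Weighted average of the mean slope and the worst-case slope across environments."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    average = sum(slopes) / len(slopes)
    worst = min(slopes)
    return (1.0 - lam) * average + lam * worst


def should_publish(slopes, lam):
    """Publish only when the robustness score is strictly positive."""
    return robustness_score(slopes, lam) > 0.0


# Identified case: the slope equals the same causal loading in all ten environments.
identified_slopes = [0.3] * 10

# Unidentified, spurious case: a loading of 0.6 on the omitted variable, with a
# correlation of 0.5 in nine environments and -1.0 in the tenth.
gamma = 0.6
spurious_slopes = [gamma * rho for rho in [0.5] * 9 + [-1.0]]

for lam in [0.0, 0.5]:
    print(f"lam = {lam}: identified -> {should_publish(identified_slopes, lam)}, "
          f"spurious -> {should_publish(spurious_slopes, lam)}")
```

With no weight on the worst case, both findings clear the bar because both have positive average slopes. Once the worst case gets enough weight, the single bad environment is enough to sink the spurious finding, while the identified one is unaffected.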
No free lunch
I recognize that identifying causal effects is hard. Running RCTs is hard. Finding valid instrumental variables is hard. It’s hard to find a market environment that identifies the causal effect of a change in $x$ on returns (i.e., to find a market environment $e$ where it’s reasonable to assume that $\mathcal{R}(e) = \{0\}$).
So you might be thinking: “Can’t I just get around the problem by checking lots of different market environments before publishing? If $\hat{b}(e_k) > 0$ in lots of different market environments $e_1, e_2, \ldots$, then shouldn’t I be more confident in $x$’s out-of-sample performance? After all, in real life, no researcher would (or could!) publish a result about cross-sectional predictability based on one regression.”
It’s absolutely true that you do learn something about $x$’s out-of-sample performance when you verify that $\hat{b}(e_k) > 0$ in many different market environments $e_1, e_2, \ldots$. Unfortunately, the something that you learn only applies to $x$’s average-case performance, $\mathrm{Avg}_{e' \in \mathcal{E}}[\hat{b}(e')]$. For example, if in more than half of all possible out-of-sample environments, then there’s no way for since we know that in every remaining market environment as we saw in Equation (12).
Yet, until you check every imaginable out-of-sample environment, you can say nothing new about the worst-case outcome. No matter how many environments you check in $\mathcal{E}$, you can never rule out that $\hat{b}(e') < 0$ in one of the remaining environments in $\mathcal{E}$ that you haven’t checked. Thus, if you care about never publishing a result that makes an embarrassingly bad out-of-sample prediction in some situation, $\lambda > 0$, then simply doing lots of in-sample checks isn’t a viable research strategy on its own. It certainly doesn’t hurt. But it DOES NOT tell you anything about out-of-sample robustness in the setup thus far.
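A back-of-the-envelope sketch shows how little reassurance a batch of checks provides. It rests on assumptions that aren’t part of the setup above: a finite pool of `n_environments` possible environments, `n_bad` of which have a negative slope, and `n_checked` distinct environments drawn uniformly at random. The function name is hypothetical.

```python
import math


def prob_all_checks_look_fine(n_environments, n_bad, n_checked):
    """Chance that n_checked distinct, randomly drawn environments all avoid the bad ones."""
    if n_checked > n_environments - n_bad:
        return 0.0  # more checks than good environments, so a bad one must be found
    good_draws = math.comb(n_environments - n_bad, n_checked)
    all_draws = math.comb(n_environments, n_checked)
    return good_draws / all_draws


# Hypothetical pool: 1,000 environments, exactly one of which is embarrassing.
for n_checked in [10, 50, 500, 900]:
    p = prob_all_checks_look_fine(1000, 1, n_checked)
    print(f"{n_checked:>4} checks: probability of missing the bad environment = {p:.3f}")
```

Even under these friendly assumptions, the chance of missing the one bad environment only reaches zero when the checks cover the whole pool.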
The thing that makes causal inference difficult is that it requires making a strong assumption about the joint distribution of $x$ and an unobserved/unknown/omitted variable $z$ in a particular market environment $e$. You have to assume that $\mathcal{R}(e) = \{0\}$. Such identifying assumptions can be hard to stomach. However, the assumption that you would need to make for in-sample robustness to guarantee out-of-sample robustness is even less palatable. TANSTAAFL. Instead of requiring that $\rho(e) = 0$ in some specific market environment, you would need to assume that $\hat{b}(e') > 0$ in all out-of-sample environments $e' \in \mathcal{E}$. Such an assumption is tantamount to simply assuming the result you’re after: namely, out-of-sample robustness.