Econometrics and the bridge between model and reality
from Lars Syll
Trygve Haavelmo, the “father” of modern probabilistic econometrics, wrote that he and other econometricians could not “build a complete bridge between our models and reality” by logical operations alone, but finally had to make “a non-logical jump” [‘Statistical testing of business-cycle theories,’ 1943:15]. Part of that jump lay in the fact that econometricians “like to believe … that the various a priori possible sequences would somehow cluster around some typical time shapes, which if we knew them, could be used for prediction” [1943:16]. But since we do not know the true distribution, one has to look for the mechanisms (processes) that “might rule the data” and that hopefully persist so that predictions may be made. Of the possible hypotheses about different time sequences (“samples” in Haavelmo’s somewhat idiosyncratic vocabulary), most had to be ruled out a priori “by economic theory”, although “one shall always remain in doubt as to the possibility of some … outside hypothesis being the true one” [1943:18].
The explanations we can give of economic relations and structures based on econometric models are, according to Haavelmo, “not hidden truths to be discovered” but rather our own “artificial inventions”. Models are consequently perceived not as true representations of the Data Generating Process, but rather instrumentally conceived “as if”-constructs. Their “intrinsic closure” is realized by searching for parameters showing “a great degree of invariance” or relative autonomy and the “extrinsic closure” by hoping that the “practically decisive” explanatory variables are relatively few, so that one may proceed “as if … natural limitations of the number of relevant factors exist” [‘The probability approach in econometrics,’ 1944:29].
But why the “logically conceivable” really should turn out to be the case is difficult to see, at least if we are not satisfied by sheer hope. In real economies it is unlikely that we find many “autonomous” relations and events. And one could of course also raise the objection that invoking a probabilistic approach to econometrics presupposes, e.g., that we are able to describe the world in terms of risk rather than genuine uncertainty.
And that is exactly what Haavelmo [1944:48] does: “To make this a rational problem of statistical inference we have to start out by an axiom, postulating that every set of observable variables has associated with it one particular ‘true’, but unknown, probability law.”
But to use this “trick of our own” and just assign “a certain probability law to a system of observable variables” cannot, any more than hoping can, build a firm bridge between model and reality. Treating phenomena as if they essentially were stochastic processes is not the same as showing that they essentially are stochastic processes. As Hicks so neatly puts it in Causality in Economics [1979:120-21]:
Things become more difficult when we turn to time-series … The econometrist, who works in that field, may claim that he is not treading on very shaky ground. But if one asks him what he is really doing, he will not find it easy, even here, to give a convincing answer … [H]e must be treating the observations known to him as a sample of a larger “population”; but what population? … I am bold enough to conclude, from these considerations that the usefulness of “statistical” or “stochastic” methods in economics is a good deal less than is now conventionally supposed. We have no business to turn to them automatically; we should always ask ourselves, before we apply them, whether they are appropriate to the problem in hand.
And as if this wasn’t enough, one could also seriously wonder what kind of “populations” these statistical and econometric models ultimately are based on. Why should we as social scientists, and not as pure mathematicians working with formal-axiomatic systems without the urge to confront our models with real target systems, unquestioningly accept Haavelmo’s “infinite population”, Fisher’s “hypothetical infinite population”, von Mises’s “collective” or Gibbs’s “ensemble”?
Of course one could treat our observational or experimental data as random samples from real populations. I have no problem with that. But modern (probabilistic) econometrics does not content itself with that kind of population. Instead it creates imaginary populations of “parallel universes” and assumes that our data are random samples from populations of that kind.
But this is actually nothing but hand-waving! And it is inadequate for real science. As David Freedman writes in Statistical Models and Causal Inference [2010:105-111]:
With this approach, the investigator does not explicitly define a population that could in principle be studied, with unlimited resources of time and money. The investigator merely assumes that such a population exists in some ill-defined sense. And there is a further assumption, that the data set being analyzed can be treated as if it were based on a random sample from the assumed population. These are convenient fictions … Nevertheless, reliance on imaginary populations is widespread. Indeed regression models are commonly used to analyze convenience samples … The rhetoric of imaginary populations is seductive because it seems to free the investigator from the necessity of understanding how data were generated.
Econometricians should know better than to treat random variables, probabilities and expected values as anything other than things that, strictly speaking, pertain only to statistical models. If they want us to take the leap of faith from mathematics into the empirical world in applying the theory, they have to really argue for and justify this leap by showing that those neat mathematical assumptions (that, to make things worse, often are left implicit, as e.g. independence and additivity) do not collide with the ugly reality. The set of mathematical assumptions is no validation in itself of the adequacy of the application.
A crucial ingredient of any economic theory that wants to use probabilistic models should be a convincing argument for the view that “there can be no harm in considering economic variables as stochastic variables” [Haavelmo 1943:13]. In most cases no such arguments are given.
Of course you are entitled, like Haavelmo and his modern probabilistic followers, to express a hope “at a metaphysical level” that there are invariant features of reality to uncover, and that these also show up at the empirical level of observations as some kind of regularities.
But is it a justifiable hope? I have serious doubts. The kind of regularities you may hope to find in society are not to be found in the domain of surface phenomena, but rather at the level of causal mechanisms, powers and capacities. Persistence and generality have to be looked for at an underlying deep level. Most econometricians do not want to visit that playground. They are content with setting up theoretical models that give us correlations and eventually “mimic” existing causal properties.
We have to accept that reality has no “correct” representation in an economic or econometric model. There is no such thing as a “true” model that can capture an open, complex and contextual system in a set of equations with parameters stable over space and time, and exhibiting invariant regularities. To just “believe”, “hope” or “assume” that such a model possibly could exist is not enough. It has to be justified in relation to the ontological conditions of social reality.
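What is at stake in assuming parameters “stable over space and time” can be illustrated with a toy simulation. The sketch below is only an illustration under loudly stated assumptions: the data are synthetic, the linear relation, the two slope values, the noise level and the break point are all invented for the example, and the helper names `ols_slope` and `simulate` are hypothetical. It generates a series whose slope changes halfway through, and then fits one regression to the pooled data as if a single invariant parameter existed.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x (with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=400, slope_before=0.5, slope_after=2.0, noise_sd=0.1, seed=0):
    """Synthetic data whose slope shifts at the midpoint.

    All numbers here are assumptions made for illustration only;
    they are not taken from any real economy or from the text above.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = []
    for t, x in enumerate(xs):
        slope = slope_before if t < n // 2 else slope_after
        ys.append(slope * x + rng.gauss(0.0, noise_sd))
    return xs, ys


xs, ys = simulate()
half = len(xs) // 2

slope_first = ols_slope(xs[:half], ys[:half])    # regime 1 only
slope_second = ols_slope(xs[half:], ys[half:])   # regime 2 only
slope_pooled = ols_slope(xs, ys)                 # "as if" one stable parameter

print(f"first half:  {slope_first:.2f}")
print(f"second half: {slope_second:.2f}")
print(f"pooled:      {slope_pooled:.2f}")
```

The pooled estimate is a perfectly well-defined number, yet it describes neither regime. Nothing in the estimation procedure itself signals that the assumed invariance has failed; that has to be argued for on other grounds.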
In contrast to those who want to give up on (fallible, transient and transformable) “truth” as a relation between theory and reality and content themselves with “truth” as a relation between a model and a probability distribution, I think it is better to really scrutinize whether this latter attitude is feasible. To abandon the quest for truth and replace it with sheer positivism would indeed be a sad fate for econometrics.