## Econometrics — science built on beliefs and untestable assumptions

from **Lars Syll**

What is distinctive about structural models, in contrast to forecasting models, is that they are supposed to be – when successfully supported by observation – informative about the impact of interventions in the economy. As such, they carry causal content about the structure of the economy. Structural models therefore do not model mere functional relations supported by correlations; their functional relations have causal content that supports counterfactuals about what would happen under certain changes or interventions.

This suggests an important question: just what is the causal content attributed to structural models in econometrics? And, from the more restricted perspective of this paper, what does this imply with respect to the interpretation of the error term? What does the error term represent causally in structural equation models in econometrics? And finally, what constraints are imposed on the error term for successful causal inference? …

I now consider briefly a key constraint that may be necessary for the error term to meet for using the model for causal inference. To keep the discussion simple, I look only at the simplest model

y = αx + u

The obvious experiment that comes to mind is to vary x, to see by how much y changes as a result. This sounds straightforward: one changes x, y changes, and one calculates α as follows.

α = ∆y/∆x

Everything seems straightforward. However, there is a concern, since u is unobservable: how does one know that u has not also changed in changing x? Suppose that u does change, so that hidden in the change in y there is a change in u; that is, the change in y is incorrectly measured by

∆y_false = ∆y + ∆u

And thus that α is falsely measured as

α_false = ∆y_false/∆x = ∆y/∆x + ∆u/∆x = α + ∆u/∆x

Therefore, in order for the experiment to give the correct measurement for α, one needs either to know that u has not also changed or to know by how much it has changed (if it has). Since u is unobservable, it is not known by how much u has changed. This leaves as the only option the need to know that, in changing x, u has not also been unwittingly changed. Intuitively, this requires knowing that whatever causes of x are used to change x are not also causes of any of the factors hidden in u …
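The arithmetic of this argument can be sketched in a few lines of Python. This is only an illustration, not part of the quoted paper: the function name `measured_alpha` and the values chosen for α, ∆x and ∆u are hypothetical. It shows that the slope computed from observed changes equals α only when ∆u is zero, and is otherwise off by exactly ∆u/∆x.

```python
def measured_alpha(alpha, dx, du):
    """Slope an experimenter computes from observed changes.

    Because y = alpha*x + u, the observed change in y is alpha*dx + du.
    The experimenter cannot see du, so it is silently folded into dy.
    """
    dy_observed = alpha * dx + du
    return dy_observed / dx


ALPHA = 2.0  # hypothetical true causal effect of x on y
DX = 1.0     # hypothetical size of the intervention on x

# Case 1: the intervention leaves u untouched.
clean = measured_alpha(ALPHA, DX, du=0.0)

# Case 2: whatever moved x also moved a factor hidden in u.
contaminated = measured_alpha(ALPHA, DX, du=0.5)

print(clean)         # 2.0, equal to the true alpha
print(contaminated)  # 2.5, i.e. alpha + du/dx
```

Nothing in the observed pair (∆x, ∆y) distinguishes the two cases; only outside knowledge about ∆u does.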

More generally, the example above shows a need to constrain the error term in the equation in a non-simultaneous structural equation model as follows. It requires that each right-hand variable have a cause that causes y but not via any factor hidden in the error term. This imposes a limit on the common causes the factors in the error term can have with those factors explicitly modelled …

Consider briefly the testability of the two key assumptions brought to light in this section: (i) that the error term denotes the net impact of a set of omitted causal factors and (ii) that each right-hand variable have at least one cause which does not cause the error term. Given that these assumptions directly involve the factors omitted in the error term, testing them empirically seems impossible without information about what is hidden in the error term. This places the modeller in a difficult situation: how to know that something important has not been hidden. In practice, there will always be an element of faith in the assumptions about the error term, assuming that assumptions like (i) and (ii) have been met, even if it is impossible to test these conclusively.
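One way to see why such assumptions resist testing is a small simulation. This is a sketch under stated assumptions, not something taken from the quoted paper: it assumes α is estimated by least squares through the origin, and the data-generating process, the variable `z` and the function names are all hypothetical. In both scenarios the fitted residuals are orthogonal to x by construction, so the residuals cannot reveal whether the true error term is correlated with x.

```python
import random


def ols_slope_through_origin(x, y):
    """Least-squares estimate of alpha in y = alpha*x + u (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def simulate(confounding, n=5000, alpha=1.0, seed=0):
    """Simulate y = alpha*x + u and fit it.

    z is a cause of x. When confounding != 0, z is also a cause of u,
    which violates the constraint discussed in the text.
    """
    rng = random.Random(seed)
    x, y, u = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        ui = rng.gauss(0, 1) + confounding * z  # hidden error term
        xi = z + rng.gauss(0, 1)
        x.append(xi)
        u.append(ui)
        y.append(alpha * xi + ui)

    alpha_hat = ols_slope_through_origin(x, y)
    residuals = [yi - alpha_hat * xi for xi, yi in zip(x, y)]
    # What the modeller can compute: mean of residual * x.
    resid_x = sum(r * xi for r, xi in zip(residuals, x)) / n
    # What the modeller cannot compute: mean of true u * x.
    true_u_x = sum(ui * xi for ui, xi in zip(u, x)) / n
    return alpha_hat, resid_x, true_u_x


clean = simulate(confounding=0.0)
confounded = simulate(confounding=1.0)

for label, (a_hat, resid_x, true_u_x) in [("clean", clean), ("confounded", confounded)]:
    print(label, "alpha_hat:", round(a_hat, 3),
          "residual-x moment:", round(resid_x, 12),
          "true u-x moment:", round(true_u_x, 3))
```

In the confounded run the estimate of α is visibly biased, yet the observable residual moment is zero up to rounding in both runs. Only the unobservable moment of the true u with x differs.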

In econometrics textbooks it is often said that the error term in the regression models used represents the effect of the variables that were omitted from the model. The error term is somehow thought to be a ‘cover-all’ term representing omitted content in the model and necessary to include to ‘save’ the assumed deterministic relation between the other random variables included in the model. Error terms are usually assumed to be orthogonal (uncorrelated) to the explanatory variables. But since error terms are unobservable, this assumption is also impossible to test empirically. And without justification of the orthogonality assumption, there is as a rule nothing to ensure identifiability:

With enough math, an author can be confident that most readers will never figure out where a FWUTV (facts with unknown truth value) is buried. A discussant or referee cannot say that an identification assumption is not credible if they cannot figure out what it is and are too embarrassed to ask.

Distributional assumptions about error terms are a good place to bury things because hardly anyone pays attention to them. Moreover, if a critic does see that this is the identifying assumption, how can she win an argument about the true expected value of the level of aether? If the author can make up an imaginary variable, “because I say so” seems like a pretty convincing answer to any question about its properties.

I read this stuff and I read this stuff, and then I re-read this stuff, and then I think: Myron Ebell probably agrees. If he would not, somebody please explain why not.

Thank you for this, Lars. It enables me to explain very simply what Keynes was all about.

The issue in y = ax + u is not whether changes in u are detectable but whether the causes being modelled (and hence the type of number required) are discrete (things) or continuous in time (flows). Unless otherwise stated, the fact that the symbol ‘u’ is fixed gives the impression that the number u is fixed, but if the cause of change x is continuous (as in economics or the power of an engine driving a ship) then any effect on y and u is also continuous, so that y and u measure speed and not change of position, and u (insofar as it is itself a discrete quantity) is actually the integral of a differential over a period of time; this applies to side effects as well as in the direction of travel. In other words, even if originally u is not observable (and it is not even clear how it might be observed), over time it may well become obvious, as both non-growth and increased unemployment became obvious to all but the deniers in the 1930s and 2000s. The deniers, of course, don’t look at what their models represent in reality, but sit in their ivory towers misinterpreting them by taking GDP and average wages as things rather than as, respectively, a time integral and a spatial variable.

How horrible it is having to try and explain this without being able to include mathematical symbols in a comment! The point, anyway, is that the logic of it has been embodied in PID servos since around 1964, and Keynes had discovered it (without being able to adequately formalise it) in 1936. The sick thing is the concepts have been around in navigation for several centuries, which was why it became so important to know the local time so as to fix longitude and thus position relative to the observable motion of the sun.

I’ve recently realised there is another crucial (and similar) issue arising from this, which is the difference between mechanical controls, which affect only the machine they are built into, and information-based servos, which can not only assist control mechanisms from afar but – because sharing information does not affect its content – can use the same information to assist many mechanisms. Hence the value of appropriate theory and the evidence of successful methods.

Why this is so important is that mechanical thinking leads to the idea of a global economy needing global government, with all the ships of state combined in a United Nations or in any case in “united states” like those of the US and EU, with an Admiral banker directing their operations via their purse strings. But the reality is that control is not effected by force but by our acting on information affecting our behaviour, which in the shipping example applies as much to every little boat sailing towards its own destination on the open seas as it does to the largest destroyers. The mechanically-minded Wallies still seeing themselves at the centre of their own little worlds see that the bigger they get, the less easily they can be pushed off course. What they don’t see is that being pushed off course doesn’t matter when it is possible to regain it. There is no reason for their ship not to give way to boats heading for other destinations, or to avoid dangers rapidly approaching like the end of our world.

The practical import of this is not that governments give way to traders, but that governments ensure that the necessary communications channels are in place so that (in particular) local populations can intelligently control themselves. This is not so much a question of mechanically how to, but of the need to.

For the model to be understood, “u” is the equivalent of ceteris paribus.

You are missing the point, Paul. For a continuous function your “ceteris paribus” grows, like the total cost of even a non-compounded interest rate.