
## Econometrics — the danger of calling your pet cat dog

from Lars Syll

The assumption of additivity and linearity means that the outcome variable is, in reality, linearly related to any predictors … and that if you have several predictors then their combined effect is best described by adding their effects together …

This assumption is the most important because if it is not true then even if all other assumptions are met, your model is invalid because you have described it incorrectly. It’s a bit like calling your pet cat a dog: you can try to get it to go in a kennel, or to fetch sticks, or to sit when you tell it to, but don’t be surprised when its behaviour isn’t what you expect because even though you’ve called it a dog, it is in fact a cat. Similarly, if you have described your statistical model inaccurately it won’t behave itself and there’s no point in interpreting its parameter estimates or worrying about significance tests or confidence intervals: the model is wrong.

Andy Field

Econometrics fails miserably over and over again — and not only because of the additivity and linearity assumption.
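Field’s cat-as-dog point can be illustrated with a few lines of simulation (all data hypothetical): fit a linear model to a relationship that is in fact quadratic, and the parameter estimates are perfectly computable — but they describe a model that is simply wrong, and the residuals still carry the structure the model left out.

```python
# A minimal sketch (hypothetical data) of misspecification: the true
# relation is quadratic, but we insist on fitting a straight line.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = x**2 + rng.normal(0, 0.5, size=x.size)  # true relation is nonlinear

# OLS fit of the misspecified linear model y = a + b*x
b, a = np.polyfit(x, y, 1)
residuals = y - (a + b * x)

# The fit "works" mechanically, but the residuals are not noise:
# they remain strongly correlated with the omitted quadratic term.
curvature = np.corrcoef(residuals, x**2)[0, 1]
print(round(b, 2), round(curvature, 2))
```

The slope estimate is near zero here (the data are symmetric), yet nothing in the estimate itself announces that the model is a cat wearing a dog’s name tag; only looking past the fitted parameters reveals it.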

Another reason why it does is that the error term in the regression models used is thought of as representing the effect of the variables that were omitted from the models. The error term is somehow thought to be a ‘cover-all’ term representing omitted content in the model and necessary to include to ‘save’ the assumed deterministic relation between the other random variables included in the model. Error terms are usually assumed to be orthogonal (uncorrelated) to the explanatory variables. But since they are unobservable, they are also impossible to test empirically. And without justification of the orthogonality assumption, there is, as a rule, nothing to ensure identifiability.
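The orthogonality problem can be sketched in a short simulation (all numbers hypothetical): fold an omitted variable that is correlated with the included regressor into the error term, and OLS delivers a biased slope — with nothing in the observed data to flag the violation.

```python
# A minimal simulation of a failed orthogonality assumption: the "error"
# absorbs an omitted variable z that is correlated with the regressor x.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
z = rng.normal(size=n)                      # omitted variable
x = 0.8 * z + rng.normal(size=n)            # x correlated with z
y = 2.0 * x + 1.5 * z + rng.normal(size=n)  # true slope on x is 2.0

# Regress y on x alone; z is silently folded into the error term
slope = np.cov(x, y)[0, 1] / np.var(x)
print(round(slope, 2))
```

The estimated slope comes out well above the true value of 2.0, and because the omitted variable is unobservable from inside the model, no within-sample diagnostic reveals the bias — which is the identifiability point above.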

Nowadays it has almost become a self-evident truism among economists that you cannot expect people to take your arguments seriously unless they are based on or backed up by advanced econometric modelling. So legions of mathematical-statistical theorems are proved — and heaps of fiction are produced, masquerading as science. The rigour of the econometric models, and the far-reaching assumptions on which they are built, is frequently not supported by data.

1. August 20, 2022 at 9:29 pm

What’s seen the morning after a bank crisis is described as risky (the cat), while what those huge, dangerous bank exposures were built up with before were assets deemed very safe (the dog).
That’s what regulators missed with their risk-weighted bank capital requirements.
https://subprimeregulations.blogspot.com/2018/08/risk-weighted-capital-requirements-for.html

2. August 21, 2022 at 2:57 am

Surely the euphoria generated by the econometrics revolution has long evaporated. The strictures in the post apply to regression analysis. (How can the relation between random variables be deterministic?!). The error term is self-evidently a confession of modesty. It is the assumptions made about it that cause discomfort. What about time series econometrics? We have nonlinearity, structural breaks, and all that.
Applied econometric work is careful, qualified, and strong on description. Other forms of evidentiary analysis like narrative economics are gaining respect. What became of cliometrics? The finest historians today work assiduously with hand-collected data. Stories are evoked by the skill and insight brought into the arrangement of the material.

3. August 28, 2022 at 10:38 pm

Lars makes the mistake of supposing econometrics treats reality as a “sample”. He therefore believes “errors” are real but unobservable. But “errors” do not exist. What exist are residuals – the quantities indicating how far a hypothesized explanatory equation fails to match a set of data. These residuals are eminently observable and can be tested for correlation with explanatory variables as well as for randomness. The data is not a sample of anything, so Lars’ concern that we are comparing reality with parallel universes is misplaced. The data set is what it is. The hypothesized explanation for the data set can be just that – an attempt to explain a particular set of data – or it can be a general hypothesis. If it fails tests on the data it cannot be truly general, though it may work somewhere else. It is true that the tests are statistical, but they depend on the characteristics of randomness, not on sampling theory per se.
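The commenter’s distinction can be made concrete with a sketch (hypothetical data): residuals, unlike “errors”, are computed directly from the data and the fitted equation. One caveat worth noting: OLS makes residuals orthogonal to the included regressors by construction, so the informative checks are those for structure the model left out, such as autocorrelation.

```python
# A minimal sketch: residuals are observable quantities that can be
# computed and checked, unlike the unobservable "error term".
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = 1.0 + 3.0 * x + rng.normal(size=500)

b, a = np.polyfit(x, y, 1)
resid = y - (a + b * x)

# Correlation with the regressor: zero by construction under OLS,
# so this check cannot detect an omitted confounder.
corr_x = np.corrcoef(resid, x)[0, 1]

# A simple randomness check: lag-1 autocorrelation of the residuals.
lag1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(round(corr_x, 6), round(lag1, 2))
```

Here both diagnostics come out near zero, as they should for a correctly specified model; systematic departures in such checks are what the comment means by residuals being “eminently observable” and testable.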