Why econometric models by necessity are endlessly misspecified

from Lars Syll

The impossibility of proper specification holds generally for regression analyses across the social sciences, whether we are looking at the factors affecting occupational status, voting behavior, or anything else. The problem is that, as implied by the three conditions for regression analysis to yield accurate, unbiased estimates, you need to investigate a phenomenon that has underlying mathematical regularities – and, moreover, you need to know what they are. Neither seems true. I have no reason to believe that the way in which multiple factors affect earnings, student achievement, and GNP has some underlying mathematical regularity across individuals or countries. More likely, each individual or country has a different function, one that changes over time. Even if there were some constancy, the processes are so complex that we have no idea what the function looks like.

Researchers recognize that they do not know the true function and seem to treat, usually implicitly, their results as a good-enough approximation. But there is no basis for the belief that the results of what is run in practice are anything close to the underlying phenomenon, even if there is an underlying phenomenon. This just seems to be wishful thinking. Most regression analysis research doesn’t even pay lip service to theoretical regularities. But you can’t just regress anything you want and expect the results to approximate reality. And even when researchers take somewhat seriously the need for an underlying theoretical framework – as they have, at least to some extent, in the studies of earnings, educational achievement, and GNP that I have used to illustrate my argument – they are so far from the conditions necessary for proper specification that one can have no confidence in the validity of the results.

Steven J. Klees

Most work in econometrics and regression analysis is made on the assumption that the researcher has a theoretical model that is ‘true.’ Based on this belief in having a correct specification for an econometric model or regression, one proceeds as if the only problems remaining to solve have to do with measurement and observation.

The problem is that there is precious little to support the perfect-specification assumption. Looking around in social science and economics, we don’t find a single regression or econometric model that lives up to the standards set by the ‘true’ theoretical model — and there is nothing that gives us reason to believe things will be different in the future.

To think that we are able to construct a model in which all relevant variables are included and the functional relationships between them are correctly specified is not only a belief with little support, but one that is impossible to support.

The theories we work with when building our econometric regression models are insufficient. No matter what we study, there are always some variables missing, and we don’t know how to correctly specify the functional relationships between the variables that are included.

Every regression model constructed is misspecified. There is always an endless list of possible variables to include, and endless possible ways to specify the relationships between them. So every applied econometrician comes up with his own specification and ‘parameter’ estimates. The econometric Holy Grail of consistent and stable parameter values is nothing but a dream.
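The omitted-variables point can be made concrete with a short simulation. The sketch below is purely illustrative and not from the post: suppose earnings depend on schooling and on an unobserved ‘ability’ variable that is correlated with schooling. Regressing earnings on schooling alone then converges on the wrong coefficient, and no amount of extra data fixes it — the misspecified model is biased, not merely noisy.

```python
import numpy as np

# Hypothetical data-generating process (all names and coefficients
# are assumptions for illustration): earnings depend on schooling
# AND on unobserved ability, and the two regressors are correlated.
rng = np.random.default_rng(0)
n = 100_000
ability = rng.normal(size=n)
schooling = 0.8 * ability + rng.normal(size=n)      # correlated with ability
earnings = 1.0 * schooling + 2.0 * ability + rng.normal(size=n)

# Short regression: earnings on schooling only (ability omitted).
X_short = np.column_stack([np.ones(n), schooling])
beta_short, *_ = np.linalg.lstsq(X_short, earnings, rcond=None)

# Full regression: the specification we would need to know in advance.
X_full = np.column_stack([np.ones(n), schooling, ability])
beta_full, *_ = np.linalg.lstsq(X_full, earnings, rcond=None)

print(beta_short[1])   # close to 2, far from the true slope of 1.0
print(beta_full[1])    # close to 1.0, but only because we knew the model
```

The biased estimate is stable and precisely estimated — exactly the trap: nothing in the short regression’s own output signals that it is misspecified.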

The theoretical conditions that have to be fulfilled for regression analysis and econometrics to really work are nowhere even close to being met in reality. Making outlandish statistical assumptions does not provide a solid ground for doing relevant social science and economics. Although regression analysis and econometrics have become the most used quantitative methods in social sciences and economics today, the inferences made from them remain of highly questionable validity.

The econometric art as it is practiced at the computer … involves fitting many, perhaps thousands, of statistical models….There can be no doubt that such a specification search invalidates the traditional theories of inference … All the concepts of traditional theory utterly lose their meaning by the time an applied researcher pulls from the bramble of computer output the one thorn of a model he likes best, the one he chooses to portray as a rose.

Ed Leamer
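Leamer’s point about specification search can also be illustrated with a toy simulation (a hypothetical sketch, not from the post): generate pure-noise data, try many candidate regressors, and report only the one with the largest |t|-statistic. The ‘rose’ pulled from the bramble looks significant far more often than the nominal 5% level promises.

```python
import numpy as np

# Illustrative assumption: an outcome unrelated to any regressor, and a
# researcher who screens 20 candidate variables and keeps the 'best' one.
rng = np.random.default_rng(1)
n, k, trials = 100, 20, 500
false_hits = 0
for _ in range(trials):
    y = rng.normal(size=n)                    # noise: no true relationship
    X = rng.normal(size=(n, k))               # 20 candidate regressors
    # Simple no-intercept regression of y on each candidate regressor.
    sxx = (X**2).sum(axis=0)
    b = X.T @ y / sxx
    resid = y[None, :] - b[:, None] * X.T
    se = np.sqrt((resid**2).sum(axis=1) / (n - 1) / sxx)
    t = b / se
    if np.abs(t).max() > 1.96:                # report the 'rose' if any passes
        false_hits += 1
print(false_hits / trials)   # typically well above the nominal 0.05
```

With 20 roughly independent candidate tests, the chance that at least one clears the 5% threshold is about 1 − 0.95²⁰ ≈ 0.64 — which is why, as Leamer says, the traditional theory of inference loses its meaning after a specification search.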

  1. Ken Zimmerman
    June 1, 2021 at 12:01 am

    Lars, based on the experience of multiple researchers, it seems clear that this is correct. Experience from on-the-job researchers indicates that multiple factors affect earnings, student achievement, GDP, etc. in uncertain and episodic ways, showing some but clearly limited underlying mathematical regularity across individuals and societies. More commonly, it seems, each individual or society has a different set of relationships, and these change over time. In practical terms, picking out regularities in these relationships is impossible – even if all the relationships were or could be made quantitative, which they are not and cannot be.

    While ‘expert’ opinions must always be questioned, the opinions of communities of highly experienced researchers should be given great weight in assessing the strength and forms of relationships in communities where their experiences lie.

  2. Gerald Holtham
    June 1, 2021 at 11:55 am

    There is no point in entering into another dialogue of the deaf. The reader should be aware however that the statement: “Most work in econometrics and regression analysis is made on the assumption that the researcher has a theoretical model that is ‘true.’” does not describe best practice in contemporary econometrics. A model has to be shown to be compatible with the data via a battery of tests before estimation of parameter values has any significance. The aim is to start as generally as data allow and to find the simplest model of the data generating process at work for the data set examined, one that is truly data compatible. The hope is that characteristics of the model will apply to other data sets but that can only be tested by experience. As KZ remarks social situations are too variable for general laws to apply with any quantitative regularity. Surprisingly enough most competent econometricians are perfectly aware of that. Theoreticians fall in love with their constructs; people who try to apply them generally do not.
