
How to lie with statistics and econometrics

from Lars Syll


As social scientists – and economists – we have to confront the all-important question of how to handle uncertainty and randomness. Should we define randomness in terms of probability? If we do, we have to accept that to speak of randomness we also have to presuppose the existence of nomological probability machines, since probabilities cannot be spoken of – and, strictly speaking, do not exist at all – without specifying such system-contexts. Accepting Haavelmo’s domain of probability theory and sample space of infinite populations – just like Fisher’s “hypothetical infinite population, of which the actual data are regarded as constituting a random sample”, von Mises’s “collective” or Gibbs’s “ensemble” – also implies that judgments are made on the basis of observations that are actually never made!

Infinitely repeated trials or samplings never take place in the real world, so they cannot provide a sound inductive basis for a science with aspirations of explaining real-world socio-economic processes, structures or events. It’s not tenable.

As David Salsburg once noted – in his lovely The Lady Tasting Tea – on probability theory:

[W]e assume there is an abstract space of elementary things called ‘events’ … If a measure on the abstract space of events fulfills certain axioms, then it is a probability. To use probability in real life, we have to identify this space of events and do so with sufficient specificity to allow us to actually calculate probability measurements on that space … Unless we can identify [this] abstract space, the probability statements that emerge from statistical analyses will have many different and sometimes contrary meanings.

Like, e.g., John Maynard Keynes and Nicholas Georgescu-Roegen, Salsburg is very critical of the way social scientists – including economists and econometricians – have come, uncritically and without argument, to simply assume that probability distributions from statistical theory can be applied to their own areas of research:

Probability is a measure of sets in an abstract space of events. All the mathematical properties of probability can be derived from this definition. When we wish to apply probability to real life, we need to identify that abstract space of events for the particular problem at hand … It is not well established when statistical methods are used for observational studies … If we cannot identify the space of events that generate the probabilities being calculated, then one model is no more valid than another … As statistical models are used more and more for observational studies to assist in social decisions by government and advocacy groups, this fundamental failure to be able to derive probabilities without ambiguity will cast doubt on the usefulness of these methods.

Importantly, this also means that if you cannot show that your data satisfy all the conditions of the probabilistic nomological machine – including, e.g., that the distribution of the deviations corresponds to a normal curve – then the statistical inferences used lack sound foundations.
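As an illustration of what "showing that the data satisfy the conditions" would even involve, here is a minimal sketch (my own, not from the post; assumes Python with NumPy and SciPy) that applies a Shapiro–Wilk test to two sets of deviations – one genuinely drawn from a normal curve, one heavy-tailed. Passing such a test is of course only a necessary check, never proof that the nomological machine exists:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Deviations actually drawn from a normal curve ...
normal_devs = rng.normal(size=500)
# ... and heavy-tailed deviations (Student's t, 2 df) that a casual
# glance at a histogram might mistake for normal.
heavy_devs = rng.standard_t(df=2, size=500)

# Shapiro-Wilk test: a low p-value means the data are
# incompatible with the normality condition.
p_normal = stats.shapiro(normal_devs).pvalue
p_heavy = stats.shapiro(heavy_devs).pvalue

print(f"normal deviations:       p = {p_normal:.3f}")
print(f"heavy-tailed deviations: p = {p_heavy:.3g}")
```

Even here, a non-rejection only says the data are *compatible* with the assumed machine – exactly the gap between mathematical convenience and real-world warrant that the post is about.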

In his great book Statistical Models and Causal Inference: A Dialogue with the Social Sciences, David Freedman also touched on these fundamental problems, which arise when you try to apply statistical models outside overly simple nomological machines like coin tossing and roulette wheels (emphasis added):

Of course, statistical models are applied not only to coin tossing but also to more complex systems. For example, “regression models” are widely used in the social sciences, as indicated below; such applications raise serious epistemological questions  …

A case study would take us too far afield, but a stylized example – regression analysis used to demonstrate sex discrimination in salaries – may give the idea. We use a regression model to predict salaries (dollars per year) of employees in a firm from: education (years of schooling completed), experience (years with the firm), and the dummy variable “man,” which takes the value 1 for men and 0 for women. Employees are indexed by the subscript i; for example, salary_i is the salary of the ith employee. The equation is

(3) salary_i = a + b*education_i + c*experience_i + d*man_i + ε_i.

Equation (3) is a statistical model for the data, with unknown parameters a, b, c, d; here, a is the “intercept” and the others are “regression coefficients”; εi is an unobservable error term. … In other words, an employee’s salary is determined as if by computing

(4) a + b*education + c*experience + d*man,

then adding an error drawn at random from a box of tickets. The display (4) is the expected value for salary given the explanatory variables (education, experience, man); the error term in (3) represents deviations from the expected.

The parameters in (3) are estimated from the data using least squares. If the estimated coefficient d for the dummy variable turns out to be positive and “statistically significant” (by a “t-test”), that would be taken as evidence of disparate impact: men earn more than women, even after adjusting for differences in background factors that might affect productivity. Education and experience are entered into equation (3) as “statistical controls,” precisely in order to claim that adjustment has been made for differences in backgrounds …

The story about the error term – that the ε’s are independent and identically distributed from person to person in the data set – turns out to be critical for computing statistical significance. Discrimination cannot be proved by regression modeling unless statistical significance can be established, and statistical significance cannot be established unless conventional presuppositions are made about unobservable error terms.

Lurking behind the typical regression model will be found a host of such assumptions; without them, legitimate inferences cannot be drawn from the model. There are statistical procedures for testing some of these assumptions. However, the tests often lack the power to detect substantial failures. Furthermore, model testing may become circular; breakdowns in assumptions are detected, and the model is redefined to accommodate. In short, hiding the problems can become a major goal of model building.

Using models to make predictions of the future, or the results of interventions, would be a valuable corrective. Testing the model on a variety of data sets – rather than fitting refinements over and over again to the same data set – might be a good second-best … Built into the equation is a model for non-discriminatory behavior: the coefficient d vanishes. If the company discriminates, that part of the model cannot be validated at all.

Regression models like (3) are widely used by social scientists to make causal inferences; such models are now almost a routine way of demonstrating counterfactuals. However, the “demonstrations” generally turn out to depend on a series of untested, even unarticulated, technical assumptions. Under the circumstances, reliance on model outputs may be quite unjustified. Making the ideas of validation somewhat more precise is a serious problem in the philosophy of science. That models should correspond to reality is, after all, a useful but not totally straightforward idea – with some history to it. Developing appropriate models is a serious problem in statistics; testing the connection to the phenomena is even more serious …

In our days, serious arguments have been made from data. Beautiful, delicate theorems have been proved, although the connection with data analysis often remains to be established. And an enormous amount of fiction has been produced, masquerading as rigorous science.
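To make Freedman's stylized example concrete, here is a minimal sketch (my own, not Freedman's; assumes Python with NumPy) that fits equation (3) by least squares on synthetic data in which the true d is zero, and computes the t-statistic for the dummy coefficient. The point to keep in mind is that the "significance" calculation at the end is only meaningful if the iid-error story about the ε's holds – which the computation itself can never establish:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic employees: schooling, tenure, and the 0/1 "man" dummy.
education = rng.integers(10, 21, n).astype(float)
experience = rng.integers(0, 30, n).astype(float)
man = rng.integers(0, 2, n).astype(float)

# Generate salaries from equation (3) with d = 0 (no discrimination)
# and iid normal errors -- i.e., exactly the story the model assumes.
eps = rng.normal(0.0, 5000.0, n)
salary = 20000 + 2000 * education + 800 * experience + 0.0 * man + eps

# Least-squares fit of (3): columns are intercept, education, experience, man.
X = np.column_stack([np.ones(n), education, experience, man])
beta, *_ = np.linalg.lstsq(X, salary, rcond=None)

# Conventional t-statistic for d (the "man" coefficient), which is only
# valid under the iid-error presupposition.
resid = salary - X @ beta
dof = n - X.shape[1]
s2 = resid @ resid / dof
cov = s2 * np.linalg.inv(X.T @ X)
t_d = beta[3] / np.sqrt(cov[3, 3])

print(f"estimated d = {beta[3]:.1f}, t-statistic = {t_d:.2f}")
```

Here the machinery behaves well because the data were manufactured to satisfy it. With real salary data nothing guarantees that the ε's are independent and identically distributed, and – as Freedman stresses – that unverifiable story is precisely what the claim of "statistical significance" rests on.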

And as if this wasn’t enough, one could also seriously wonder what kind of “populations” these statistical and econometric models ultimately are based on. Why should we as social scientists – and not as pure mathematicians working with formal-axiomatic systems without the urge to confront our models with real target systems – unquestioningly accept Haavelmo’s “infinite population”, Fisher’s “hypothetical infinite population”, von Mises’s “collective” or Gibbs’s “ensemble”?

Of course one could treat our observational or experimental data as random samples from real populations. I have no problem with that. But probabilistic econometrics does not content itself with that kind of population. Instead it creates imaginary populations of “parallel universes” and assumes that our data are random samples from those populations.

But this is nothing but hand-waving! And it is inadequate for real science. As David Freedman writes in Statistical Models and Causal Inference (emphasis added):

With this approach, the investigator does not explicitly define a population that could in principle be studied, with unlimited resources of time and money. The investigator merely assumes that such a population exists in some ill-defined sense. And there is a further assumption, that the data set being analyzed can be treated as if it were based on a random sample from the assumed population. These are convenient fictions … Nevertheless, reliance on imaginary populations is widespread. Indeed regression models are commonly used to analyze convenience samples … The rhetoric of imaginary populations is seductive because it seems to free the investigator from the necessity of understanding how data were generated.

  1. Helge Nome
    October 17, 2013 at 4:05 pm

    Face up to it. You can’t turn the Universe into a mechanical clockwork orange.
    Rather, it is non conformist and rebellious.

  2. BFWR
    October 17, 2013 at 6:21 pm

    Well, yes and no. The economy is in a continuous state of flux, but by utilizing the best and most relevant statistics available and an equilibrating mathematical formula on a routine basis…an equilibrium could, for all practical purposes, be maintained. Maintained is the operative word here because, after all, the economy as a whole has no mind and so requires the equivalent of a mind…and that is policy. Policy should reflect both the needs of individuals and correct the actual problems which are inherent to economics/an economy.

    Seeing as how the physics (not the abstraction or theory) of economics, that is cost accounting, tells us that individual incomes are scarce in ratio to prices simultaneously created and needed to merely enable any and all enterprises to break even…..a continuous supplement to individual incomes…..seems to be in order….that is if we actually want both economic equilibrium AND equal consideration of the individual in the system.

  3. October 17, 2013 at 7:15 pm

    I have not read David Freedman. But I would say (and have said on this blog) pretty much the same things, not about social sciences in general (because of insufficient sampling), but about applied econometrics in particular. From a vast majority (99 percent?) of economics papers I read, “an enormous amount of fiction has been produced, masquerading as rigorous science”.

    For example, despite the recent bagging given by many commentators and the critical paper by Herndon et al., as well as the authors’ own admission of errors, the paper by Reinhart and Rogoff has continued to be referenced in central bank research and the media as though its conclusions were valid and widely accepted. This is an example of how economists either do not read, or do not understand what they read, but are apparently willing to accept without question “authority” based on affiliation. This is not science.

    From the empirical econometrics in finance, which has vastly more data to play with than economics, also “an enormous amount of fiction has been produced, masquerading as rigorous science”, including Fama’s own empirical evidence of “efficient capital markets”.

  4. October 18, 2013 at 12:23 am

    By the way, Bernanke’s reputation as an expert on “The Great Depression” is based on his collection of econometric essays, from which “an enormous amount of fiction has been produced, masquerading as rigorous science”.

    Apart from the statistical insignificance of his studies, Bernanke has often confused causes and effects, because linear regressions do not really say anything about causality. In the real-time experiment of the past several years, extreme monetary stimulus has not produced commensurate increases (if any) in economic growth, despite endorsed market price manipulations to influence expectations.

  5. Jeff Z.
    October 18, 2013 at 3:22 pm

    Let us not forget the practical consequences. The intersection of regression models and legal requirements comes to mind. This is partially captured by BFWR’s comments on policy.

    Suppose that I do want to show that the possibility of discrimination exists within a company. Freedman’s exemplar would be the first thing that I would try. The company does not even have to be doing this intentionally. For legal purposes, all that is needed to make a prima facie case of discrimination is for a woman with the same credentials, experience, and job title as a man in the same company to show that she is either paid less or received a smaller raise. That is a sample of TWO, and the only thing the woman is required to show is the correlation. At that point, the burden of proof, legally speaking in the U.S., shifts to the employer. Discrimination is PRESUMED to be the CAUSE of the disparity. The company has to show that they DIDN’T discriminate on the basis of sex.

    If I want to show a pattern, then I might argue that I do have a real ‘universal’ population to examine – say, all the Vice Presidents in the company. For this case, I might argue that the use of econometrics is entirely appropriate – because we already know certain things – the hierarchical structure of the company, the job duties (presumably), the important characteristics of the workers, employment law and the standards of proof, etc. These things tell me A LOT about how the data were generated IN THIS CASE. Even so, the error terms remain unobservable, and postulating different distributions for these (normal, mean zero etc) and not normal / other does not change the basic fact that the error terms remain unobservable.

    If I believe that ‘defense wins championships’ I might be able to do this statistically, because I already know the rules of the game I am investigating – soccer in the U.S., football elsewhere, U.S. football and baseball, hockey, basketball etc. Thus, I already know a great deal about how that data are generated, both within the sport and in the math used to build the statistics.

    In these two cases – one company, one sport – would I be justified in making the conventional presuppositions about the error terms? I am reservedly claiming that yes, I can, but I realize that I may not be able to do this elsewhere.

    Is this right? I am troubled by the feeling that I am missing something, yet I cannot pin it down.

  6. paolo
    October 21, 2013 at 5:21 am

    BFWR says that the economy as a whole does not have a mind; this is not entirely true. Its mind, although ex post, is the government or, better, the modern State, which, unlike individuals, understands the existence of macroeconomics as separate from micro – provided its culture (not Hayek’s) admits it. Ex post, however, because government intervention may be wrong, but the government can intervene to correct mistakes. Econometrics derives macroeconomics from micro behavior and from ex post data, which include macroevents: if it gets a decent result, it is generally by chance.

  7. BFWR
    October 23, 2013 at 6:22 am


    You are correct that the government can be wrong, especially if it relies upon flawed economic orthodoxy, which is what it does if it takes the current theory of the velocity of money seriously. This theory claims that individual purchasing power can be increased by money’s circulation within the economy. Only problem is…it can’t. Why? Because cost accounting data tell us that labor costs (individual incomes) are only a fraction of total costs, and because all costs must go into price, every enterprise must garner at least all of its costs in prices, or go bankrupt. But if only a fraction of total incomes is ever created to liquidate total costs/prices, and this applies to every ongoing enterprise and hence each and every economy…then how can we think that there is not a scarcity of total individual incomes in ratio to total prices, or the possibility of economic equilibrium…unless of course we equilibrate that most relevant metric (total individual incomes and total prices) in the only way that does not immediately incur an additional cost…that is, a direct and periodically perpetual supplement to individual incomes…which again, would eliminate the troublesome scarcity and enable free market theory to be free in fact.
