## Confusing statistics and research

from **Lars Syll**

Coupled with downright incompetence in statistics, we often find the syndrome that I have come to call statisticism: the notion that computing is synonymous with doing research, the naïve faith that statistics is a complete or sufficient basis for scientific methodology, the superstition that statistical formulas exist for evaluating such things as the relative merits of different substantive theories or the “importance” of the causes of a “dependent variable”; and the delusion that decomposing the covariations of some arbitrary and haphazardly assembled collection of variables can somehow justify not only a “causal model” but also, praise a mark, a “measurement model.” There would be no point in deploring such caricatures of the scientific enterprise if there were a clearly identifiable sector of social science research wherein such fallacies were clearly recognized and emphatically out of bounds.

Wise words well worth pondering on.

As long as economists and statisticians cannot really identify their statistical theories with real-world phenomena there is no real warrant for taking their statistical inferences seriously.

Just as there is no such thing as a ‘free lunch,’ there is no such thing as a ‘free probability.’ To be able to talk about probabilities at all, you have to specify a model. If there is no chance set-up or model that generates the probabilistic outcomes or events, there is, strictly speaking, no event at all. (In statistics, any process in which you observe or measure is called an experiment, e.g. rolling a die, and the results obtained are the outcomes or events of that experiment, e.g. rolling a 3 or a 5.)

Probability is a relational thing. It always must come with a specification of the model from which it is calculated. And then to be of any empirical scientific value it has to be shown to coincide with (or at least converge to or approximate) real data generating processes or structures — something seldom or never done!
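The relational point can be made concrete with the die example. The sketch below is only an illustration: both "models" are hypothetical chance set-ups invented here (the loaded die's weights are an assumption, not data), and the helper name `probability` is likewise made up. It shows that the same event, rolling a 3 or a 5, has no probability until a model is named, and gets a different one under each model.

```python
from fractions import Fraction

# Two hypothetical chance set-ups for the same act of rolling a die.
FAIR_DIE = {face: Fraction(1, 6) for face in range(1, 7)}

# Assumed loaded die: face 6 carries half the mass, faces 1-5 share the rest.
LOADED_DIE = {face: Fraction(1, 10) for face in range(1, 6)}
LOADED_DIE[6] = Fraction(1, 2)


def probability(event, model):
    """Probability of an event (a set of outcomes) relative to a given model."""
    if sum(model.values()) != 1:
        raise ValueError("model probabilities must sum to 1")
    return sum(p for outcome, p in model.items() if outcome in event)


three_or_five = {3, 5}
print(probability(three_or_five, FAIR_DIE))    # 1/3
print(probability(three_or_five, LOADED_DIE))  # 1/5
```

Nothing in the code says which of the two set-ups, if either, describes a real die. That is exactly the step that has to be argued for empirically.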

And this is the basic problem with economic data. If you have a fair roulette-wheel, you can arguably specify probabilities and probability density distributions. But how do you conceive of the analogous ‘nomological machines’ for prices, gross domestic product, income distribution etc? Only by a leap of faith. And that does not suffice. You have to come up with some really good arguments if you want to persuade people into believing in the existence of socio-economic structures that generate data with characteristics conceivable as stochastic events portrayed by probabilistic density distributions!

Economic knowledge must be a well-organized system composed of statistics, history, observations, policy experiments and theory. The main task of theory is to construct a conceptual model (or models) of socio-economic structures. In his paper The Theory of Complex Phenomena, Hayek argued that algebraic theorizing is important for this work (he credits J. W. N. Watkins with suggesting the term).

Here are some (parts of) paragraphs from Section 5, Pattern Prediction with Incomplete Data, in which Hayek explains why algebraic theories are necessary:

‘The multiplicity of even the minimum of distinct elements required to produce (and therefore also of the minimum number of data required to explain) a complex phenomenon of a certain kind creates problems which dominate the disciplines concerned with such phenomena and gives them an appearance very different from that of those concerned with simpler phenomena.

‘There is, however, no justification for the belief that it must always be possible to discover such simple regularities and that physics is more advanced because it has succeeded in doing this while other sciences have not yet done so. It is rather the other way round: physics has succeeded because it deals with phenomena which, in our sense, are simple.

‘We are, however, interested not only in individual events, and it is also not only predictions of individual events which can be empirically tested. We are equally interested in the recurrence of abstract patterns as such; and the prediction that a pattern of a certain kind will appear in defined circumstances is a falsifiable (and therefore empirical) statement. Knowledge of the conditions in which a pattern of a certain kind will appear, and of what depends on its preservation, may be of great practical importance. The circumstances or conditions in which the pattern described by the theory will appear are defined by the range of values which may be inserted for the variables of the formula. All we need to know in order to make such a theory applicable to a situation is, therefore, that the data possess certain general properties (or belong to the class defined by the scope of the variables). Beyond this we need to know nothing about their individual attributes so long as we are content to derive merely the sort of pattern that will appear and not its particular manifestation.

‘Such a theory destined to remain ‘algebraic’, because we are in fact unable to substitute particular values for the variables, ceases then to be a mere tool and becomes the final result of our theoretical efforts.’

The economy is an essentially complex system. Hayek explains why a good theory of complex phenomena must be algebraic. As everybody knows, algebra is a field of mathematics. Because Lars Syll has a strong tendency to exclude mathematics as a form of theoretical study (whatever it may be), he inevitably finds no positive route to building good theory in economics. Syll knows that the economy is a complex phenomenon but, because he excludes algebraic theories, he has fallen into a dilemma of his own making.

Statistics is a legitimate tool for research into human ways of life. As a tool, however, statistical examination has massive limitations. Statistics grasps almost nothing of human societies and cultures. Metaphorically speaking, as a tool in the social and behavioral sciences, statistics has the same breadth and reach as learning about an ocean by studying only its surface. In studying the ocean's surface, we examine only what is within arm’s length, what is measurable (sometimes only what is easily measurable), and what presents no risk to us, the scientists. With this in mind, any use of statistics in the social and behavioral sciences must be diligently and abundantly footnoted with each of these considerable limitations.