## On the applicability of statistics in social sciences

from **Lars Syll**

Eminent statistician David Salsburg is rightfully very critical of the way social scientists (including economists and econometricians) have come, uncritically and without argument, to simply assume that they can apply probability distributions from statistical theory to their own areas of research:

We assume there is an abstract space of elementary things called ‘events’ … If a measure on the abstract space of events fulfills certain axioms, then it is a probability. To use probability in real life, we have to identify this space of events and do so with sufficient specificity to allow us to actually calculate probability measurements on that space … Unless we can identify [this] abstract space, the probability statements that emerge from statistical analyses will have many different and sometimes contrary meanings …

Kolmogorov established the mathematical meaning of probability: Probability is a measure of sets in an abstract space of events. All the mathematical properties of probability can be derived from this definition. When we wish to apply probability to real life, we need to identify that abstract space of events for the particular problem at hand … It is not well established when statistical methods are used for observational studies … If we cannot identify the space of events that generate the probabilities being calculated, then one model is no more valid than another … As statistical models are used more and more for observational studies to assist in social decisions by government and advocacy groups, this fundamental failure to be able to derive probabilities without ambiguity will cast doubt on the usefulness of these methods.

Wise words, well worth pondering.

As long as economists and statisticians cannot really identify their statistical theories with real-world phenomena, there is no real warrant for taking their statistical inferences seriously.

Just as there is no such thing as a ‘free lunch,’ there is no such thing as a ‘free probability.’ To be able to talk about probabilities at all, you have to specify a model. If there is no chance set-up or model that generates the probabilistic outcomes or events – in statistics one refers to any process where you observe or measure as an experiment (rolling a die) and the results obtained as the outcomes or events (number of points rolled with the die, being e.g. 3 or 5) of the experiment – there is, strictly speaking, no event at all.
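To make the die example concrete, here is a minimal sketch of what "specifying a model" amounts to in the simplest possible case. It assumes a fair six-sided die (an assumption made purely for illustration), and the names `SAMPLE_SPACE`, `WEIGHTS`, `probability`, `all_events` and `satisfies_kolmogorov_axioms` are hypothetical, not taken from Salsburg or from any library. The check covers the finite versions of Kolmogorov's axioms: non-negativity, total probability one, and additivity over disjoint events.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumption for illustration: a fair six-sided die, so each face has weight 1/6.
SAMPLE_SPACE = frozenset(range(1, 7))
WEIGHTS = {face: Fraction(1, 6) for face in SAMPLE_SPACE}


def probability(event):
    """Measure of an event, i.e. of a subset of SAMPLE_SPACE."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("not an event: outcome outside the sample space")
    return sum((WEIGHTS[face] for face in event), Fraction(0))


def all_events():
    """Every subset of the sample space (its power set)."""
    faces = sorted(SAMPLE_SPACE)
    subsets = chain.from_iterable(
        combinations(faces, size) for size in range(len(faces) + 1)
    )
    return [frozenset(subset) for subset in subsets]


def satisfies_kolmogorov_axioms():
    """Check non-negativity, unit total mass, and additivity on disjoint events."""
    events = all_events()
    non_negative = all(probability(e) >= 0 for e in events)
    unit_mass = probability(SAMPLE_SPACE) == 1
    additive = all(
        probability(a | b) == probability(a) + probability(b)
        for a in events
        for b in events
        if not (a & b)  # only disjoint pairs are required to add up
    )
    return non_negative and unit_mass and additive


print(probability({3, 5}))  # chance of rolling a 3 or a 5
print(satisfies_kolmogorov_axioms())
```

Everything here rests on being able to write down `SAMPLE_SPACE` and `WEIGHTS` explicitly. An "outcome" that is not in the specified space simply has no probability in this sketch: `probability({7})` raises an error rather than returning a number.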

Probability is — as strongly argued by Keynes — a relational thing. It always must come with a specification of the model from which it is calculated. And then to be of any empirical scientific value it has to be shown to coincide with (or at least converge to or approximate) real data generating processes or structures — something seldom or never done!

And this is the basic problem with economic data. If you have a fair roulette wheel, you can arguably specify probabilities and probability distributions. But how do you conceive of the analogous ‘nomological machines’ for prices, gross domestic product, income distribution, etc.? Only by a leap of faith. And that does not suffice. You have to come up with some really good arguments if you want to persuade people to believe in the existence of socio-economic structures that generate data with characteristics conceivable as stochastic events portrayed by probability distributions!

I have commented at https://wordpress.com/post/djmarsay.wordpress.com/4181 . Basically, while – as Keynes noted – a lot of applied statistics is pseudomathematics, you are missing some nuances that sometimes make a difference. Statistics can be useful, provided they are interpreted appropriately.

“If we cannot identify the space of events that generate the probabilities being calculated, then one model is no more valid than another … As statistical models are used more and more for observational studies to assist in social decisions by government and advocacy groups, this fundamental failure to be able to derive probabilities without ambiguity will cast doubt on the usefulness of these methods.”

This is either well-written or is validating my pre-conceived biases. I cannot decide which.

Humans invented parametric statistics, as the name implies, to be applied to data that, to the best of our understanding, have certain parametric characteristics. Quite frequently it is impossible to determine the parametric characteristics of data arising from one or more observations of people in the wild (that is, outside the laboratory). In these instances parametric statistics should never even be attempted. Sometimes the parametric characteristics of the data can be estimated. Here parametric statistics can be applied, if fully caveated. That leaves a small, very small, portion of the data to which parametric statistics can be applied with no cautions, perhaps as little as 1-2% of the original data. Thus, for all practical purposes, parametric statistics is not applicable in economics. Other social sciences, e.g. sociology, draw data from sources with both a longer history and a broader reach than economics. Thus, in these social sciences parametric statistics is more widely applicable. In some instances, such as demographics, perhaps as much as 50%.

Ken, How do you know when parametric statistics are actually appropriate? In my experience such methods are always ‘useful’, as long as the results are ‘fully caveated’. But what do we mean by this? It seems to me where there are problems with the (mis)use of statistics it is because the caveats have got lost or misunderstood. How could we make sure they are taken seriously, except by some reforms including improved mathematical education?

Dave, there’s a long list of assumptions you’ll find in most textbooks for parametric statistics. As I indicated, in most research we’re unable to verify whether they are met. That’s a long list of caveats. Working backwards from whatever data we have, we must ask: what do we want from the data? We want it to help us understand the people the data points show us. What is the data showing us about those people? In most instances we can do a good job of that either without mathematics or with basic mathematical procedures, e.g., addition, and tables and graphs. Better to avoid statistics if we cannot verify that the data meets the basic assumptions for parametric statistics. Less chance of claiming a finding that is incorrect.

Ken, mathematics is always of the form ‘If A then B’. You seem to think it useless because we can never know that A is the case. But why not follow Keynes, Turing et al in observing that if B is not the case (e.g., economies are not stable) then A must not be the case. If we previously believed in A, this is surely an advance? (E.g., if mathematics had been applied to mainstream economic theories prior to 2007, we might have seen the problems emerging.)

Dave, mathematics takes the form “mathematical operations performed” on A, B, D, etc. = Z. Determining that B is not the case, in your example, requires something no one, it seems, wants to discuss: judgment. And judgment doesn’t require mathematics but information gained via experience, some of which may be mathematical. Mathematics cannot substitute for human judgment. If I tell you I respect you and your opinions, but slug you in the face each time you speak your opinions, what is learned from the experience? 1+1=2. Or, in this case, 1+1=3?

Ken, Your ideas about mathematics seem to me to be ‘sociologically valid’ in some sense. But the kind of mathematics that economists have tended to ignore is different. For example, it is a theorem of ‘model theory’ that there is no possible mathematical model of what economists habitually call ‘mathematical models’. The only ‘judgement’ required is whether you want to trust the logicians or the economists on the nature of mathematics. Since Keynes, mathematicians have used the term ‘pseudo-mathematics’ to distinguish between sound mathematics and statements masquerading as mathematics which have no underpinning logic (and may even be invalid).

Of course, the above is controversial and perhaps impolite, but it seems to me that if economics is to be reformed the issues that Keynes raised need to be revisited. (You make some excellent points – I merely wish to identify your target better!)

Dave, what you call ‘model theory’ is a valid approach for any science. As a scientist friend says, it is a way to shake off the cobwebs. And the models often aren’t mathematical. There is little doubt in my mind that many social scientists, including economists, misapply mathematics. In using statistics, for example, social scientists often ignore many of the axioms on which statistics is based. This makes any results mathematically meaningless. Social scientists can certainly invent, and have invented, their own versions of mathematics, with different axioms and proof structures, which puts them at odds with traditional mathematicians. I would not call these social science mathematics pseudo (false, spurious). I would call them “alternative” mathematics. You would not be the only person to challenge these alternatives and the axioms upon which they are based. That’s a public fight. You can get involved. On the mathematicians’ side, about 25% of mathematicians accept them, at least in part. Another 25% reject them in total.

Ken, Your notion of ‘alternative’ mathematics seems more reasonable than Keynes’ terminology, since it doesn’t require people to take sides. Using your language, I would suggest that rather than critique ‘mathematics’ as if it were homogeneous you critique ‘mathematics as used in mainstream economics’ and at least provide a footnote noting that there are alternatives that survive your critique. We might then move on to the question of which alternative?

(Compare my https://djmarsay.wordpress.com/debates/which-type-of-mathematics-in-finance/ )

Mathematics is a cultural tool. Humans created it originally in a dozen or more forms, with more being created all the time. I concluded long ago that discussing other forms of mathematics with economists is pointless. They appear to be locked into the one form, the form their theories supposedly express. And, like genuine paleoconservatives, they refuse to even consider alternatives.

Thanks for the paper. I like it very much. You seem to understand what economists do not about mathematics: that it is a language. Languages are created and re-created in conversation (spoken or written). And the purpose of any language is to communicate something from one human(s) to another human(s). It’s not reality that’s created, but the ongoing debate among humans (going on for 30,000 years or so) about what is to be assumed, for the present time, to be factual: facts sufficiently firm to base a culture and a society upon. The issue then is, what form of mathematics best fits the kind of conversation we’re entering into?

Ken, I’m glad you liked my paper. Actually, I think I have a very different view of mathematics from you and most social scientists, and the differences may, practically, be irreconcilable. The gulf between us is, I think, huge. But there is hope!

Having worked as the lone mathematician among social scientists (and similar) it seems to me that when we focus on a particular issue it is possible – with a lot of hard work on both sides – to come to forms of words that will at least be interpreted consistently with respect to the issues at hand by the differing ‘cultures’. My own view, by now firmly held, is that this is ‘as good as it gets’. (Cf Whitehead et al.)

Now I come to a bit of ‘shifting the burden’: how could the social sciences facilitate the kind of debate I envisage, and what guidance could they provide, e.g. to mathematicians? (We need it!)

Yes, there is always hope. Sometimes forlorn, but hope still.