
## How statistics can be misleading


from Lars Syll

From a theoretical perspective, Simpson’s paradox importantly shows that causality can never be reduced to a question of statistics or probabilities.

To understand causality we always have to relate it to a specific causal structure. Statistical correlations are never enough. No structure, no causality.
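A minimal numerical sketch of the paradox may help here: one treatment can look better within every subgroup yet worse in the aggregate, purely because the subgroups differ in size. The Python snippet below uses the familiar kidney-stone figures (Charig et al., 1986) often quoted in discussions of Simpson's paradox; it is an illustration, not part of the original post.

```python
# Simpson's paradox: treatment A wins within every subgroup,
# yet loses in the aggregate, because the groups differ in composition.
# Figures are the well-known kidney-stone example (Charig et al., 1986).

data = {
    "A": {"small": (81, 87), "large": (192, 263)},   # (successes, patients)
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

for group in ("small", "large"):
    a = rate(*data["A"][group])
    b = rate(*data["B"][group])
    print(f"{group} stones: A {a:.0%} vs B {b:.0%}")   # A wins in both groups

a_all = rate(*map(sum, zip(*data["A"].values())))
b_all = rate(*map(sum, zip(*data["B"].values())))
print(f"overall:      A {a_all:.0%} vs B {b_all:.0%}")  # yet B wins overall
```

The reversal happens because treatment A was mostly given to the hard (large-stone) cases — exactly the kind of compositional fact that the bare correlations do not reveal and that a causal structure must supply.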

Econometric patterns should never be seen as anything other than possible clues to follow. From a critical realist perspective, it is obvious that behind observable data there are real structures and mechanisms operating — things that, if we really want to understand, explain and (possibly) predict events in the real world, are more important to get hold of than simply correlating and regressing observable variables.

Math cannot establish the truth value of a fact. Never has. Never will.

Paul Romer

1. June 5, 2021 at 3:22 pm

When cigarette smokers were observed to have a 10x risk of lung cancer (1950), there was no “structure”. There was also, at that time, no *known* mechanism. The statistical evidence was correct in indicating a causal relationship, and ensuing research confirmed this and found the mechanism(s). Lesson: statistical associations, if robust (replicated and not due to chance), represent some sort of causal structure – I say “some sort of” because it can be causation in the “wrong” direction or a common cause (omitted variable bias; confounding). That can be informative in leading to the discovery of the causal mechanism. Don’t just dismiss things (in this case statistical or econometric methods) just because you don’t like them.

Obviously (hopefully, for this audience), to get from statistical association to causation you have to go through a process of causal inference. In that epidemiological example, it was done using Bradford Hill’s “aspects”, which was a list of things to look for, to try and find out what causal relationships underlie any observed robust statistical association. Mechanism (plausibility) was part of that, but many items on the list were purely statistical criteria such as a dose-response relationship, and time order. Also, what is causally plausible can change when new evidence is obtained, as happened with cigarettes and lung cancer: until then, cancer was thought to arise endogenously rather than being caused, at least partly, by outside exposures.

And sometimes, econometric methods can be used to make an important discovery: I am thinking of the considerable statistical evidence on the employment effects of the minimum wage. The literature is currently uncertain, in that it is divided between those who find no association and those who find a weak association. In a broader perspective, what this has shown is that pessimistic conclusions drawn from a priori economic theory are wrong. The practical benefit of that has been huge.

By the way, I don’t see the relevance of Simpson’s paradox in this discussion. It is concerned with the composition of the different groups, not with causal mechanisms and the like.

• June 6, 2021 at 9:31 am

You write: “statistical associations … can be informative in leading to the discovery of the causal mechanism. Don’t just dismiss things (in this case statistical or econometric methods) just because you don’t like them.” On this, I think we surely agree. When teaching my statistics and econometrics classes, I emphasise — again and again — that data and statistics can help us on the way to detecting causally interesting relations, processes and mechanisms. My students certainly do not dismiss statistics. But they do understand that statistics usually do not give the answers to the most interesting social science questions. When we’ve got our statistics right — both descriptively and inferentially — we have only just started on the scientific journey. Statistics is the start, not the end, of scientific endeavours to explain things.

