
Fooled by randomness

from Lars Syll

A non-trivial part of teaching statistics to social science students is made up of learning them to perform significance testing. A problem yours truly has noticed repeatedly over the years, however, is that no matter how careful you try to be in explicating what the probabilities generated by these statistical tests — p-values — really are, still most students misinterpret them.

A couple of years ago I gave a statistics course for the Swedish National Research School in History, and at the exam I asked the students to explain how one should correctly interpret p-values. Although the correct definition is p(data|null hypothesis), a majority of the students either misinterpreted the p-value as the likelihood of a sampling error (which of course is wrong, since the very computation of the p-value is based on the assumption that sampling errors are what cause the sample statistics not to coincide with the null hypothesis) or as the probability of the null hypothesis being true, given the data (which of course is also wrong, since that is p(null hypothesis|data) rather than the correct p(data|null hypothesis)).
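The distinction can be made concrete with a small simulation (a hypothetical sketch of my own, not from the post; the sample size and observed mean are invented for illustration). The p-value is computed by *assuming the null hypothesis is true* and asking how often data at least as extreme as ours would then turn up:

```python
import random
import statistics

random.seed(42)

# Invented example: a sample of n = 30 observations with sample mean 0.4.
# Null hypothesis: the population mean is 0 (with known sd 1).
n, observed_mean = 30, 0.4

# The p-value is P(data at least this extreme | null hypothesis), so every
# simulated sample below is drawn under the assumption that the null is true.
reps = 100_000
extreme = 0
for _ in range(reps):
    sample_mean = statistics.fmean(random.gauss(0, 1) for _ in range(n))
    if abs(sample_mean) >= observed_mean:  # two-sided comparison
        extreme += 1

p_value = extreme / reps
print(f"simulated p-value: {p_value:.3f}")
```

Nothing in this calculation involves the probability that the null hypothesis itself is true; getting p(null hypothesis|data) would additionally require a prior over hypotheses, which is exactly the piece the students' misreading silently assumes away.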

This is not to be blamed on the students' ignorance, but rather on significance testing not being particularly transparent (conditional probability inference is difficult even for those of us who teach and practise it). A lot of researchers fall prey to the same mistakes. So, given that it is anyway very unlikely that any population parameter is exactly zero, and that, contrary to assumption, most samples in social science and economics are not random and do not have the right distributional shape, why continue to press students and researchers to do null hypothesis significance testing, testing that relies on a weird backward logic that students and researchers usually don't understand?

Let me just give a simple example to illustrate how slippery it is to deal with p-values – and how easy it is to impute causality to things that really are nothing but chance occurrences.

Say you have collected cross-country data on austerity policies and growth (and let's assume that you have been able to "control" for possible confounders). You find that countries that have implemented austerity policies have on average increased their growth by, say, 2% more than the other countries. To really feel sure about the efficacy of the austerity policies you run a significance test, thereby actually assuming without argument that all the values you have come from the same probability distribution, and you get a p-value of less than 0.05. Eureka! You've got a statistically significant value. The probability is less than 1/20 that you got this value out of pure stochastic randomness.

But wait a minute. There is, as you may have guessed, a snag. If you test austerity policies in enough countries, you will get a statistically 'significant' result out of pure chance 5% of the time. So, really, there is nothing to get so excited about!
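The 5%-by-chance point is easy to check directly. Below is a hypothetical simulation of my own (the country count, sample size, and test are invented, not from the post): every country's measured "growth effect" is pure noise, yet roughly one test in twenty still comes out 'significant':

```python
import random
from math import erfc, sqrt

random.seed(1)

# Suppose austerity has NO effect anywhere: each country's measured
# "growth difference" is pure noise. Run the same test in every country
# and count how often p < 0.05 turns up anyway.
n_countries = 1000
n_obs = 50          # observations per country
alpha = 0.05

def p_value_two_sided(sample):
    # Simple z-test of the sample mean against 0, with known sd 1.
    mean = sum(sample) / len(sample)
    z = abs(mean) * sqrt(len(sample))
    return erfc(z / sqrt(2))  # two-sided normal tail probability

false_positives = sum(
    p_value_two_sided([random.gauss(0, 1) for _ in range(n_obs)]) < alpha
    for _ in range(n_countries)
)

print(f"'significant' results out of {n_countries}: {false_positives}")
```

With 1,000 countries and a true effect of exactly zero everywhere, the count of 'significant' findings lands near 50, which is just the false-positive rate alpha doing what it was defined to do.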

Statistical significance doesn't say that something is important or true. And since there already are far better and more relevant tests that can be done (see e.g. here and here), it is high time to give up on this statistical fetish and stop being fooled by randomness.

  1. January 16, 2021 at 6:37 pm

    Minor grammar error. We do not learn others. We teach them. Learning is passive in English.

  2. January 16, 2021 at 6:50 pm

    I was asked by my supervisor to consult with a student on a research project in Oakville. He was hired by the agency (where I was placed as a MSW student in second year 1974) funded through the social planning council to examine the attitudes of parents and students in the city.

    In the end he concluded that there was no difference. I challenged that because my clinical work said otherwise. He said he had vetted his results past the supervising Stats researcher at the University of Waterloo and had it confirmed by him and the senior agency personnel. I showed the results to some profs in my programme and they also confirmed his results.

So I got serious and dove in deep. I discovered that he had accepted the null hypothesis (namely that there was no difference in the attitudes of kids and parents) because he got significance at p = .05. I explained that in that case he has to reject the null hypothesis and not accept it.

    It saved the day because there was a $50k grant for kids programmes riding on the results. The agency got the grant and he presented me with a copy of the research writing “To Herb ‘Stats’ Wiseman” as a dedication. The best part of the story is that I had failed stats — twice — because I did not do the work in the course —lazy? But when I needed stats to get into the masters programme, I got serious and passed an evening course with a B+ while working full time and training to compete at the national level canoeing.

  3. deshoebox
    January 16, 2021 at 7:15 pm

Also, of course, if austerity policies tend to lead to significantly greater economic growth as compared with economies without austerity policies, we all know that the benefits of that austerity will disproportionately accrue to the wealthiest five percent of the population. Do we need a theory to understand this or to grasp its social significance? Is regression analysis the right tool to work out how angry, depressed, and desperate the people who typically suffer from such austerity policies will be? Hey, Economist Guys. How about a theory that supports policies where austerity disproportionately benefits poor people. That would really be a valuable contribution.

  4. January 19, 2021 at 11:44 pm

deshoebox: "Do we need a theory to understand this or to grasp its social significance?"

At the macro level, we certainly do need a theory of the obvious. The assumptions used in modern economics conceal what is obvious by constructing models based upon the hypothesized actual existence of 'Offer-Demand Curves' wherein Offer Prices = f(x). Since there is no such thing as such a 'Demand Curve', and since equilibrium depends upon the existence of such curves intersecting with Supply Curves, we must begin by reconstructing both the Theory of the Consumer and the theory of Aggregate Demand overall. In both, the distribution of incomes matters and, because of this, the distribution of money never has neutral impacts on the path economies take, particularly market-based economies.

  5. Ken Zimmerman
    January 31, 2021 at 4:32 pm

    Much statistical theory – especially 20th century statistical theory – is concerned with the problem of inference. Sloppily put, (and here sloppiness is an advantage in that precision would be explicitly theory-laden) the problem is one of the kind of statements statisticians can make based on their analyses. Typically, they will have data on only a subset of the cases in which they are interested and will wish to say something about all of them. They may want to make a prediction, based on experience, as to what will happen in the future. They may wish to change their estimates of the plausibility of a hypothesis in the light of an experiment. They may wish to say something about a population based on examining a sample of it chosen at random. Generally, they want to infer from the known and examined to the unknown and unexamined.

    In the contemporary world, problems of statistical inference tend to be intricately linked with technical prediction and control: for example, in the techniques of quality control. However, the historical roots of statistical inference lie elsewhere. Much of the framework of inference was developed in the context of problems of belief in a general sense, and especially of theological belief. The problem of inference was this: given our limited knowledge, ought we to believe in God? Or, given our lack of knowledge of God, what is the rational decision to take regarding Christianity? The concept of probability was used to interpret and give meaning to decisions about religion. By metaphoric extension, a concept from games of luck (that of chance, hazard) was linked to the old, non-quantitative concept of probability used by the schoolmen of the Middle Ages to discuss certain doctrines of Christianity that were disputed (Ian Hacking, The Emergence of Probability 1975).

In the same way, in Britain, Newtonian natural theology provided much of the framework for the 18th-century development of probability theory (Karl Pearson, 'Historical Note on the Origin of the Normal Curve of Errors', Biometrika, 16, 402-4, 1924; P. S. Buck, 'Seventeenth-Century Political Arithmetic: Civil Strife and Vital Statistics', Isis, 68, 67-84, 1977). The Reverend Thomas Bayes (see G. A. Barnard, 'Thomas Bayes: A Biographical Note', Biometrika, 45, 1958) worked within this tradition of 'social Newtonianism'. The problem that made Bayes famous was, in the words of his friend Richard Price, to "give a clear account of the strength of analogical or inductive reasoning" (Bayes 1764, 135; Price's emphasis); induction had, of course, been under attack by skeptics. De Moivre (The Doctrine of Chances, second edition, London: Woodfall, 1738) and others had not, according to Price, fully achieved the main purpose of the doctrine of chances, namely:
    … to shew what reason we have for believing that there are in the constitution of things fixt laws according to which events happen, and that, therefore, the frame of the world must be the effect of the wisdom and power of an intelligent cause; and thus to confirm the argument taken from final causes for the existence of the Deity. (Bayes 1764, 135) (Mackenzie, Statistics in Britain, 1981)

  6. January 31, 2021 at 8:17 pm

It seems Lars has missed the point of my saying that neither the digital calculations on throwing the dice nor the analogue null hypothesis based on the shape of the dice is true; what may be true is that either is a good estimate of the other, a statement you cannot make without the representation being complex, i.e. comparing estimates made in different ways.

As I haven't had access to Bayes, I am grateful to Ken for the reference he gives, though I dispute his history. (Probability theory is dated from 1654. Aquinas in the thirteenth century specifically argued that Aristotle's first-cause argument was not a proof of the existence of God but a way of coming to believe in it, when weighed against the historical evidence for the life, death and resurrection of Christ.) I see Bayes [1764] was concerned (as I suggested) to judge the validity of analogical reasoning, but the "must believe" in his argument concerns the existence of fixed laws, and the inference to an intelligent cause does not follow, being derived from Aristotle's "family tree" logic, to the effect that if one has to have a father then there must have been a first father.

As a scientist and Christian, I would like to offer a more up-to-date quote from J. H. Newman's Grammar of Assent (1870): "Sciences are only so many distinct aspects of nature; sometimes suggested by nature itself, sometimes created by the mind. (1) One of the simplest and broadest aspects under which to view the natural world is that of a system of final causes, or, on the other hand, of initial or effective causes. Bacon [1604], having it in view to extend our power over nature, adopted the latter. He took firm hold of the idea of causation (in the common sense of the word) as contrasted with that of design, refusing to mix up the two ideas in one inquiry, and denouncing such traditional interpretation of facts as did but obscure the simplicity of the aspect necessary for his purpose. He saw what others before him might have seen in what they saw, but did not see as he saw it. In this achievement of intellect, which has been so fruitful in results, lie his genius and his fame."

    • Ken Zimmerman
      February 5, 2021 at 4:18 am

Dave, it seems clear that the history of science and the history of religion are intertwined. Each is an element of culture. So the form and justifying beliefs of each are based on the fundamental configuration of the culture. Based on our comparative understanding of cultures, science is in each the fundamental sense of knowledge of nature. So it should not be surprising to find that it originated with the people closest to nature: hunter-gatherers, peasant farmers, sailors, miners, blacksmiths, folk healers, and others forced by the conditions of their lives to daily wrest the means of their survival from encounters with nature. This science is knowledge of the makeup of nature, which includes humans, as they are directly a part of nature. But humans are also learning creatures and cultural constructors. The work to identify a sense of knowledge of these is also science. Social science.

      But science is near impossible to define, according to J.D. Bernal. Bernal begins ‘Science in History,’ with “Science throughout is taken in a very broad sense and nowhere do I attempt to cramp it into a definition.” This nondogmatic approach is necessary because, “in the last resort it is the people who are the ultimate judges of the meaning and value of science. Where science has been kept a mystery in the hands of a selected few, it is inevitably linked with the interests of the ruling classes and is cut off from the understanding and inspiration that arise from the needs and capacities of the people.” (vol. 1, pp. 3, 34.) Even more reason to expect and as necessary explore the connections of science and religion. Although some may be difficult to trace.

      Science is not, nor should it ever be primarily a theoretical endeavor. Particularly, theory with a religious basis. Those who portray science as “pure theory” do so to place it beyond criticism. That view of science is frequently an adjunct to reactionary political views because it supposedly offers a source of unchallengeable authority, like religion, and thereby serves as a support for authoritarianism. But many scholars, feminists, environmental activists, and others reject that notion and refuse to bow down before a deified Science. I suggest we do the same.

      Considering science historically the primary focus in the practice of science is empirical as opposed to theoretical. My contention is that the foundations of scientific knowledge owe far more to experiment and “hands on” trial-and-error procedures than to abstract thought. Benjamin Farrington makes the point well:

      In its origin science is not in fact so divorced from practical ends as histories have sometimes made out. Textbooks, right down from Greek times, have tended to obscure the empirical element in the growth of knowledge by their ambition of presenting their subjects in a logical orderly development. This is, perhaps, the best method of exposition; but it is a mistake to confuse it with a record of the genesis of theory. Behind Euclid’s definition of a straight line as “one that lies evenly between the points on it” one divines the mason with his level. (Science in Antiquity, 3)

      We have confirmation of this insight in the development of AI. In a recent essay, Jacob Browning traces how “mindless learning” through distributed experiences of trial and error — instead of the “minded learning” of conscious forethought — is driving the far-reaching advances of AI in much the same way as physical and social evolution itself takes place. “Mindless learning is more natural and commonplace than the minded variety we value so highly.”…“The history of human tools and technologies … reveals that conscious deliberation plays a much less prominent role than trial and error.” (Learning Without Thinking, Noema Magazine, Dec. 29, 2020)

      This means science is a practical craft. It is more hands than brains. (V. Gordon Childe, Man Makes Himself, 171) Crafts are also interactive, cumulative, and carried out over many generations. No ‘great heroes’ of science in such undertakings. Only working people. No matter the lab coats, extravagant degrees, and titles, science is not theory. It is craftwork.

