
## Statistical inference and sampling assumptions

from Lars Syll

Real probability samples have two great benefits: (i) they allow unbiased extrapolation from the sample; (ii) with data internal to the sample, it is possible to estimate how much results are likely to change if another sample is taken. These benefits, of course, have a price: drawing probability samples is hard work. An investigator who assumes that a convenience sample is like a random sample seeks to obtain the benefits without the costs—just on the basis of assumptions. If scrutinized, few convenience samples would pass muster as the equivalent of probability samples. Indeed, probability sampling is a technique whose use is justified because it is so unlikely that social processes will generate representative samples. Decades of survey research have demonstrated that when a probability sample is desired, probability sampling must be done. Assumptions do not suffice. Hence, our first recommendation for research practice: whenever possible, use probability sampling.

If the data-generation mechanism is unexamined, statistical inference with convenience samples risks substantial error. Bias is to be expected and independence is problematic. When independence is lacking, the p-values produced by conventional formulas can be grossly misleading. In general, we think that reported p-values will be too small; in the social world, proximity seems to breed similarity. Thus, many research results are held to be statistically significant when they are the mere product of chance variation.

Richard Berk & David Freedman
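Berk & Freedman’s first point, that extrapolation from a convenience sample is biased while a probability sample is not, can be made concrete with a toy simulation. This is my own illustrative sketch, not their example: the “accessibility” variable and all numbers are invented assumptions.

```python
# Toy population in which the outcome varies with how "accessible" a unit is.
# A probability sample is unbiased for the population mean; a convenience
# sample that over-represents accessible units is not, and nothing internal
# to the convenience sample reveals the bias.
import numpy as np

rng = np.random.default_rng(42)

N = 100_000
accessibility = rng.uniform(0, 1, N)                      # how easy a unit is to reach
outcome = 2.0 + 3.0 * accessibility + rng.normal(0, 1, N)
true_mean = outcome.mean()                                # roughly 3.5

# (i) probability sample: equal, known inclusion probability for every unit
prob_sample = rng.choice(outcome, size=1000, replace=False)

# (ii) convenience sample: inclusion probability proportional to accessibility
p = accessibility / accessibility.sum()
conv_sample = rng.choice(outcome, size=1000, replace=False, p=p)

print(true_mean, prob_sample.mean(), conv_sample.mean())
```

With these assumed numbers the probability-sample mean lands near the true mean, while the convenience-sample mean is systematically too high, because selection is correlated with the outcome.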
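Berk & Freedman’s warning that conventional p-values are “grossly misleading” when independence fails can also be simulated directly. A hedged sketch: the cluster structure and the intra-cluster correlation of 0.3 are my own assumptions, chosen only to make the effect visible.

```python
# When "proximity breeds similarity" (clustered observations), a one-sample
# t-test that assumes independence rejects a TRUE null far more often than
# its nominal 5% level, i.e. its p-values are too small.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def false_positive_rate(cluster_corr, n_clusters=20, cluster_size=10, reps=2000):
    """Fraction of simulations in which a t-test (nominal 5% level) rejects
    a true null mean of zero, given within-cluster correlation."""
    rejections = 0
    for _ in range(reps):
        # a shared per-cluster effect induces within-cluster dependence
        cluster_effect = rng.normal(0.0, np.sqrt(cluster_corr), n_clusters)
        noise = rng.normal(0.0, np.sqrt(1.0 - cluster_corr), (n_clusters, cluster_size))
        x = (cluster_effect[:, None] + noise).ravel()
        _, p = stats.ttest_1samp(x, 0.0)
        if p < 0.05:
            rejections += 1
    return rejections / reps

independent = false_positive_rate(0.0)  # iid data: close to the nominal 5%
clustered = false_positive_rate(0.3)    # dependent data: far above 5%
print(independent, clustered)
```

The mechanism is the design effect: with dependence, the usual formula understates the variance of the sample mean, so “statistically significant” results appear by chance far more often than the nominal level suggests.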

In econometrics one often gets the feeling that many of its practitioners think of it as a kind of automatic inferential machine: input data and out comes causal knowledge. This is like pulling a rabbit from a hat. Great — but first you have to put the rabbit in the hat. And this is where assumptions come into the picture.

The assumption of imaginary ‘super populations’ is one of many dubious assumptions used in modern econometrics and statistical analyses to handle uncertainty. As social scientists — and economists — we have to confront the all-important question of how to handle uncertainty and randomness. Should we define randomness with probability? If we do, we have to accept that to speak of randomness we also have to presuppose the existence of nomological probability machines, since probabilities cannot be spoken of – and actually, to be strict, do not at all exist – without specifying such system-contexts. Accepting a domain of probability theory and sample space of infinite populations also implies that judgments are made on the basis of observations that are actually never made!

Infinitely repeated trials or samplings never take place in the real world. So that cannot be a sound inductive basis for a science with aspirations of explaining real-world socio-economic processes, structures or events. It’s not tenable.

And as if this wasn’t enough, one could — as we’ve seen — also seriously wonder what kind of ‘populations’ these statistical and econometric models ultimately are based on. Why should we as social scientists — and not as pure mathematicians working with formal-axiomatic systems without the urge to confront our models with real target systems — unquestioningly accept models based on concepts like the ‘infinite super populations’ used in e.g. the ‘potential outcome’ framework that has become so popular lately in social sciences?

One could, of course, treat observational or experimental data as random samples from real populations. I have no problem with that (although it has to be noted that most ‘natural experiments’ are not based on random sampling from some underlying population — which, of course, means that the effect-estimators, strictly seen, are only unbiased for the specific groups studied). But probabilistic econometrics does not content itself with that kind of population. Instead, it creates imaginary populations of ‘parallel universes’ and assumes that our data are random samples from that kind of ‘infinite super populations.’

But this is actually nothing else but hand-waving! And it is inadequate for real science. As David Freedman writes:

With this approach, the investigator does not explicitly define a population that could in principle be studied, with unlimited resources of time and money. The investigator merely assumes that such a population exists in some ill-defined sense. And there is a further assumption, that the data set being analyzed can be treated as if it were based on a random sample from the assumed population. These are convenient fictions … Nevertheless, reliance on imaginary populations is widespread. Indeed regression models are commonly used to analyze convenience samples … The rhetoric of imaginary populations is seductive because it seems to free the investigator from the necessity of understanding how data were generated.

In social sciences — including economics — it’s always wise to ponder C. S. Peirce’s remark that universes are not as common as peanuts …

1. May 1, 2022 at 7:17 pm

Reality only happens once, as far as we know. Does that mean probabilities are meaningless? Lars asserts that assigning probabilities to events implies we think real-world outcomes are a random drawing from a stable distribution of events in some metaphysical meta-universe. Because that is a weird if not nonsensical assumption, Lars has claimed that statistical analysis of economic phenomena is of little (or no?) use.
In my conception, however, that is a complete misunderstanding of what is going on when we conduct econometrics. Errors or residuals (I use the terms interchangeably) are not a property of reality. Reality may be stochastic or entirely deterministic for all we know. (Philosophers have argued over that for millennia in the context of free will). The residuals are a property of the model in its relation to reality. We look at an open, evolving system and make a hypothesis about what has caused certain phenomena over a certain time period. We would not expect our hypothesis to fit each data observation perfectly. The system has apparently stochastic elements for two reasons: the agglomeration of idiosyncratic effects unaccounted for at the level of aggregation we are considering and the effect of variations in variables that can generally be ignored as unimportant but an occasional or extreme movement in which will disturb the system. Note we cannot generate a causal hypothesis from the data. The causal hypothesis is prior but if we are to continue to believe it, it must be compatible with the data. To determine that compatibility we require statistics.
We are not looking at reality as a sample of a larger population; we are comparing the implications of a causal hypothesis with data and seeing how well they fit. We are not using the central limit theorem to reassure ourselves that the sample (reality) is representative of a population of realities (!?) but in the knowledge that it means our errors will be normal if they are owing to a host of unrelated, idiosyncratic elements and no important systematic influence has been missed. A good fit proves nothing, but if the model does not account for a substantial proportion of the variation in the variables of interest it is not a useful model.
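The normality claim in the comment above can be illustrated with a small simulation. This is my own sketch, not Holtham’s; the shock distribution and sizes are arbitrary assumptions.

```python
# If a model's error is the sum of many small, unrelated idiosyncratic
# effects, the central limit theorem makes it approximately normal, even
# when each individual effect is far from normal (here: skewed exponentials).
import numpy as np

rng = np.random.default_rng(7)

# each observation's error = sum of 200 mean-zero, heavily skewed shocks
shocks = rng.exponential(1.0, size=(20_000, 200)) - 1.0
errors = shocks.sum(axis=1)

# standardized moments: near 0 for a normal distribution
standardized = (errors - errors.mean()) / errors.std()
skew = (standardized**3).mean()
excess_kurt = (standardized**4).mean() - 3.0
print(skew, excess_kurt)
```

Each individual shock has skewness 2, but the aggregated error’s skewness and excess kurtosis come out close to zero, which is what licenses treating the model’s errors as approximately normal when no systematic influence has been missed.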
If we knew our model was correct as far as it goes, the errors or residuals in its fit would come from the two sources cited. But there is always the risk that we have got it wrong and the model is mis-specified even with respect to the variables it covers. Some insight into that possibility is afforded by the nature of the errors themselves. If relevant systematic factors have been omitted, or if the functional form of our equations is wrong, there will be tell-tale signature patterns in the residuals. If the residuals look like white noise and pass tests for being such there is a better chance the model is an adequate representation of reality for the purposes for which it was designed. Note that there is still no guarantee. Moreover, applying the model outside the sample for which it was fitted and tested means we are assuming that reality has been sufficiently stable for the model to remain useful. That is not something that has been proven in any sense. It is a hypothesis we retain faute de mieux until it is rejected by events. As Popper observed, any scientific generalisation remains permanently conjectural, corroborated by successes but always open to the possibility of refutation in other circumstances.
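The “tell-tale signature patterns” point lends itself to a minimal sketch. The quadratic data-generating process, the seed, and the thresholds below are my own illustrative assumptions, not anything from the thread.

```python
# A mis-specified functional form (a line fitted to a quadratic relationship)
# leaves structure in the residuals; the correctly specified model leaves
# residuals that look like white noise. The structure is detected here as
# lag-1 autocorrelation of residuals ordered by the regressor.
import numpy as np

rng = np.random.default_rng(1)

x = np.sort(rng.uniform(-2, 2, 500))
y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(0, 0.3, 500)  # true form: quadratic

def lag1_autocorr(r):
    """Correlation between consecutive residuals (white noise: near zero)."""
    r = r - r.mean()
    return (r[:-1] @ r[1:]) / (r @ r)

# mis-specified: straight line; residuals inherit the unmodelled curvature
bad_resid = y - np.polyval(np.polyfit(x, y, 1), x)
# correctly specified: quadratic; residuals are just the noise
good_resid = y - np.polyval(np.polyfit(x, y, 2), x)

print(lag1_autocorr(bad_resid), lag1_autocorr(good_resid))
```

The linear fit’s residuals are strongly autocorrelated (neighbouring residuals share the omitted curvature), while the quadratic fit’s residuals show no such pattern; in practice formal checks such as a Ljung-Box test play the same role as this crude statistic.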
When we quote probabilities around future events we are saying the following. We think we understand this system to a useful extent; we have a representation of it which accounts for x per cent of the variations we see. If (note the necessary condition) our model is indeed – and continues to be – a serviceable picture of reality, these variables over the coming period will fall in the following ranges – probabilities of different outcomes can then be quoted. Those outcomes also require that extraneous forces behave as they have in our sample period, i.e. no volcanic eruptions and no asteroid strikes. Forecasts are necessarily conditional on that assumption. That’s what the probabilities mean whether we are talking of economics or weather forecasting. They will not survive unexpected exogenous shocks. Probabilities express degrees of confidence in different outcomes on the retained hypothesis that the model is roughly correct. The warrant for its correctness is (a) that it makes some sort of intuitive sense and (b) it fits the historical sample on which it was tested in terms of the size and characteristics of the observed errors in fit. As John Kay has remarked, the real probability of any event is the probability that our best model assigns to it times the probability that our model is correct. Especially out of sample there is no way at all to measure the latter quantity. If that is what Lars’ assertions come down to, we can agree. It is an important warning but does not invalidate the citation of probabilities, properly understood.
Lars and I are certainly in agreement that much contemporary economic theorising and modelling fails completely on criterion (a). It defies common sense and common observation. But it also usually fails on (b) as well to a greater or lesser extent.
The nature of reality imposes limits on what we can know for sure but we must do the best we can. Conjecture, hypothesis and the careful examination of data to test the hypothesis is the best we can do. The biggest fault in contemporary economics is a refusal to jettison pretty models that repeatedly fail empirical tests. By denigrating statistical testing, Lars, no doubt unwittingly, serves the forces of prejudice.

• May 2, 2022 at 3:05 am

Gerald, after lamenting how mainstream economics allows “one … to start from assumptions that are flagrantly unrealistic” and how ME flouts experimental results (e.g., that humans are not always rational utility-calculating machines, aka “rationality,” the “money illusion,” and the “aggregation issues” used to “construct representative agents”), goes on to profess to being all about empirical testing:

“Conclusions can only be as certain as their premises. That also applies to econometrics.” Absolutely. It applies to everything so far as I can see. … [M]ost of the obligatory assumptions are not logical axioms but are empirical; they could be subjected to testing. Laboratory exercises with people have found they are not “rational” in the sense proposed and the term is not even well defined in the presence of uncertainty. (…) I would prefer to start from observed empirical regularities (like people die) even if we do not understand their full cause. Furthermore I would prefer that theories specify their conditions of application so that they can be tested on data and amended or abandoned if the data are inconsistent with the theory. The only antidote to rampant a priorism in economics is empiricism. (Gerald Holtham, RWER, Espousing Empirical Testing)

Further:

An investigator who assumes that a convenience sample is like a random sample seeks to obtain the benefits without the costs—just on the basis of assumptions. If scrutinized, few convenience samples would pass muster as the equivalent of probability samples. (Lars, On fictional “convenience sample” and assumed “imaginary populations”, the topic of his post)

Lars asserts that assigning probabilities to events implies we think real-world outcomes are a random drawing from a stable distribution of events in some metaphysical meta-universe. (Gerald Holtham, RWER, Eliding the real point, i.e., using “convenience samples” and imaginary populations in place of the hard work of legitimate random samples from real empirical populations)

Lars’s post is about “convenience samples” based upon “imaginary populations.” Hardly something one would call “empirical testing.” What is weird is how Gerald, right out of the gate, elides the core context and meaning of Lars’s post above and chooses to misrepresent the context and content of the post. Lars must have hit a hot button (perhaps Gerald likes using “convenience samples” and “imaginary populations”?), given the length of his verbose reply with irrelevant rhetoric, never once addressing the question Lars raises regarding “convenience samples” and “imaginary populations” and the dubious fictional nature of their substitution for the hard work of statistical sampling based upon empirical data.

What is weird AND nonsensical is the way Gerald elides the context and content of Lars’s post to engage in a long irrelevant rant that never once addresses the question of “convenience samples” and “imaginary populations” and whether these are valid assumptions in light of trying to make econometrics more empirical.

2. May 16, 2022 at 11:59 am

I don’t think Meta understands what I am saying. But then it is mutual. I don’t understand him either. When I do time-series econometrics I use data that is not a sample of anything. If I claim to have found an explanation for that data it applies to the data not to anything else. Lars seems to think that I am treating the data as a sample of something else and claiming to have found a universal law or relationship but that is not so. I may use my finding to forecast but that is to make an auxiliary assumption – that I do not claim to have tested – namely that the system is sufficiently stable that the data-generating process will remain similar out of sample. If someone can come up with a better hypothesis it ought to be considered and used.
Incidentally I deplore Meta’s habit of injecting unnecessary heat into discussions. I was attempting to elucidate a difference in the way Lars appears to view econometric testing from my conception of it. Far from “eliding” a discussion of imaginary populations I am denying the existence of imaginary populations and explaining why I think so. I hope I did so politely and there is no reason to describe my note as a “rant”.

• May 16, 2022 at 1:22 pm

2.4 An imaginary population and imaginary sampling mechanism
.
Another way to treat uncertainty is to create an imaginary population from which the data are assumed to be a random sample. Consider the shelter story. The population might be taken as the set of all shelter residents that could have been produced by the social processes creating victims who seek shelter. These processes might include family violence, as well as more particular factors affecting possible victims and external forces shaping the availability of shelter space. (Freedman 2010, 27)
.
With this approach, the investigator does not explicitly define a population that could in principle be studied, with unlimited resources of time and money. The investigator merely assumes that such a population exists in some ill-defined sense. And there is a further assumption, that the data set being analyzed can be treated as if it were based on a random sample from the assumed population. These are convenient fictions. Convenience will not be denied; the source of the fiction is twofold: (i) the population does not have any empirical existence of its own; and (ii) the sample was not in fact drawn at random. (Freedman 2010, 27)
.
In order to use the imaginary-population approach, it would seem necessary for the investigators to demonstrate that the data can be treated as a random sample. It would be necessary to specify the social processes that are involved, how they work, and why they would produce the statistical equivalent of a random sample. Handwaving is inadequate. We doubt the case could be made for the shelter example or any similar illustration. Nevertheless, reliance on imaginary populations is widespread. Indeed, regression models are commonly used to analyze convenience samples: As we show later, such analyses are often predicated on random sampling from imaginary populations. The rhetoric of imaginary populations is seductive precisely because it seems to free the investigator from the necessity of understanding how the data are generated. (Freedman 2010, 27) (Freedman, David A. Statistical Models and Causal Inference [A Dialogue With the Social Sciences]. New York: Cambridge University Press; 2010. )
.
As John Kay has remarked, the real probability of any event is the probability that our best model assigns to it times the probability that our model is correct. Especially out of sample there is no way at all to measure the latter quantity. If that is what Lars’ assertions come down to, we can agree. (Gerald, Eliding the Topic of Lars’s Post: The Rhetoric of Imaginary Populations, and Dazzling (aka more eliding) with an Irrelevant Statistics Lecture)

.
The point of Lars’s post is that for economics to be a real science it must draw its experimental data as random samples from real populations (I learned this in statistics 101), not from some investigator’s assumptions about imaginary “populations of ‘parallel universes’” with the assumption “that our data are random samples from that kind of ‘infinite super populations.’” If that is not “out of sample” (which is exactly what imaginary population samples are, and which, by the way, Gerald feigns agreement with), then nothing is “out of sample” and statistics becomes anything anyone wants it to be. To merely assume some imaginary population exists in some “ill-defined” sense is pseudoscience. Gerald’s pedantic rant elides this simple fact (which he agrees with, sort of), only to later engage in innuendo about Lars’s intentions (which is hardly something Gerald can know, after all), asserting that by “denigrating statistical testing, Lars, no doubt unwittingly, serves the forces of prejudice.” The only prejudice I see here comes from Gerald’s sophistry and his attempt to read into Lars’s intentions rather than simply address the topic of the post (which out of one side of his mouth he agrees with).

I understand perfectly well your style of rhetoric, Gerald. Elide the core topic of Lars’s post, insinuate meanings (aka put words in his mouth he didn’t say), and then rant on with an irrelevant statistics lecture that conflates, confuses, and misdirects away from what you already agreed regarding imaginary “out of sample” samples from imaginary populations.

• May 16, 2022 at 1:29 pm

BTW, I could play back (but I won’t, unless I need to) a boatload of quotes from you asserting that to make economics more “empirical” it needs to be more “experimental” and that it is all in the empirical data. Now you are conflating imaginary populations with empirical data, eh? Consistency counts.

• May 16, 2022 at 11:16 pm

I would like to establish a consensus. I do not think people on this blog are as far apart as the rhetoric would imply. Let me try some propositions and see if Lars and others assent.
.
(….) [E]conomic phenomena are the result of a complex, evolving system which is partly self-referential and adapts in response to what is discovered about it. It exceeds our powers to develop fully general models of such a system….
.
The response has to be a certain eclecticism whereby we develop different partial and incomplete models in an attempt to understand particular situations. Understand here means get some intuitive understanding of some of the principal forces at work. It is seldom if ever appropriate to apply such a partial model without adaptation in a given situation and you have to allow for plenty of fudge factors – as any engineer would do when using abstract physical theories in practice. One must not expect too much, certainly not a general truth. Yet a familiarity with the suite of models available and past failures and successes is useful when forecasting or developing policy advice. What to use is a matter of judgement and it will at best produce conclusions specific to a place and time. The limits of extrapolation are learned from experience.

I do not wish to defend all the contents of most economics text books …. [A] certain methodological conformity results from the sociology of the academic economics trade. Such models generally and unsurprisingly fail empirical tests. They would not have achieved popularity if economists were more concerned with empirical corroboration and less with producing deductive theorems from attractive but unverified axioms. We agree on that.
.
[Do imaginary samples from imaginary populations fall into the category of “empirical tests” and “empirical corroboration”?]
.
Surely, then, we should give some respect to efforts at empirical testing, be they statistical or experimental. [What is empirical testing in statistics? Is it based on real world populations and samples therefrom, or imaginary populations and imaginary samples?]
.
It is somewhat perverse, having damned economists for anti-empirical scholasticism, to then damn root and branch all efforts to bring evidence to bear systematically.
.
[This is Gerald engaging in fallacious argument. He puts his own interpretation of Lars’s critique into Lars’s mouth, using the innuendo that it was Lars who intends to “damn root and branch all efforts to bring evidence to bear” on economic questions using real-world populations and real-world samples, rather than the unwarranted extrapolations based upon imaginary populations. This is a fallacious form of argument used repeatedly by Holtham and Shiozawa on this forum. It is a form of argument rife with red herrings, straw men, and forms of ad hominem, the purpose of which is to put statements into the mouth of Lars that he never made, then to knock them down and claim how reasonable they are after all. Very transparent indeed.]
.
Of course any method can be misused, data can be mined, failures can be hidden, samples used selectively [or simply imagined outright: fictional samples from fictional populations]. All of these crimes have been committed…. It invites an appropriate scepticism….

If economics is to make any progress, sensible use of econometrics is one of the ways it will happen. Indeed if econometricians had higher status relative to economic theorists a lot of errors might have been avoided. Bad practice is rife and should be criticized when it occurs.
(Gerald Holtham, RWER, Desiring Consensus, 11/25/2019)