Observational data and causal inference

from Lars Syll

Distinguished Professor of social psychology Richard E. Nisbett takes on the idea of intelligence and IQ testing in his Intelligence and How to Get It (Norton 2011). He also has some interesting thoughts on multiple-regression analysis and writes:

Researchers often determine the individual’s contemporary IQ or IQ earlier in life, socioeconomic status of the family of origin, living circumstances when the individual was a child, number of siblings, whether the family had a library card, educational attainment of the individual, and other variables, and put all of them into a multiple-regression equation predicting adult socioeconomic status or income or social pathology or whatever. Researchers then report the magnitude of the contribution of each of the variables in the regression equation, net of all the others (that is, holding constant all the others). It always turns out that IQ, net of all the other variables, is important to outcomes. But … the independent variables pose a tangle of causality – with some causing others in goodness-knows-what ways and some being caused by unknown variables that have not even been measured. Higher socioeconomic status of parents is related to educational attainment of the child, but higher-socioeconomic-status parents have higher IQs, and this affects both the genes that the child has and the emphasis that the parents are likely to place on education and the quality of the parenting with respect to encouragement of intellectual skills and so on. So statements such as “IQ accounts for X percent of the variation in occupational attainment” are built on the shakiest of statistical foundations. What nature hath joined together, multiple regressions cannot put asunder.
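Nisbett's point about unmeasured common causes can be illustrated with a small simulation. What follows is only a sketch under invented assumptions: the variable names, the coefficients (0.8, 0.5, 1.0) and the single unmeasured factor `u` are all hypothetical and are not taken from Nisbett's book or from any real data set. In the simulated world IQ has no effect at all on the outcome, yet a regression of the outcome on IQ and measured SES still gives IQ a clearly positive coefficient "net of" SES.

```python
import random


def simulate(n=20000, seed=0):
    """Generate (iq, ses, y) rows from a made-up causal structure."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0, 1)                    # unmeasured common cause (hypothetical)
        iq = 0.8 * u + rng.gauss(0, 1)         # IQ is partly driven by u
        ses = 0.5 * u + rng.gauss(0, 1)        # measured SES is partly driven by u
        # The outcome depends on u and SES only: the true effect of IQ is zero.
        y = 1.0 * u + 0.5 * ses + rng.gauss(0, 1)
        rows.append((iq, ses, y))
    return rows


def cov(a, b):
    """Population covariance of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def ols_two_regressors(x1, x2, y):
    """Slopes of a least-squares fit of y on x1 and x2 (with intercept)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


iq, ses, y = zip(*simulate())
b_iq, b_ses = ols_two_regressors(iq, ses, y)
print(f"IQ coefficient 'net of SES': {b_iq:.2f} (true effect: 0)")
print(f"SES coefficient 'net of IQ': {b_ses:.2f} (true effect: 0.5)")
```

Both coefficients are distorted because the regression cannot hold constant a variable that was never measured. Which variables play the role of `u` in real data is exactly what the regression itself cannot tell us.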

Now, I think this is right as far as it goes, although it would certainly have strengthened Nisbett’s argumentation if he had elaborated more on the methodological question around causality, or at least had given some mathematical-statistical-econometric references. Unfortunately, his alternative approach is not more convincing than regression analysis. Like so many other contemporary social scientists, Nisbett seems to think that randomization may solve the empirical problem. By randomizing we are getting different “populations” that are homogeneous with regard to all variables except the one we think is a genuine cause. In that way, we are supposed to be able to do without actually knowing what all these other factors are.

If you succeed in performing an ideal randomization with different treatment groups and control groups, that is attainable. But it presupposes that you really have been able to establish – and not just assume – that all other causes but the putative one have the same probability distribution in the treatment and control groups, and that the probability of assignment to treatment or control groups is independent of all other possible causal variables.

Unfortunately, real experiments and real randomizations seldom or never achieve this. So, yes, we may do without knowing all causes, but it takes ideal experiments and ideal randomizations to do that, not real ones.
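One narrow aspect of this gap between ideal and real randomization, chance imbalance in finite samples, can be sketched in a few lines. The set-up is purely hypothetical: a single standard-normal background variable, a 50/50 random split, and an arbitrary imbalance threshold of 0.25 standard deviations. It is not meant to capture everything that can go wrong in a real experiment, only to show that a single actual randomization does not by itself guarantee that "all other causes" are equally distributed across groups.

```python
import random


def imbalance_once(n, rng):
    """Randomly split n units in two and return the absolute difference
    in the group means of one background variable."""
    x = [rng.gauss(0, 1) for _ in range(n)]   # hypothetical other cause
    idx = list(range(n))
    rng.shuffle(idx)                          # random assignment
    treated, control = idx[: n // 2], idx[n // 2:]
    mean_t = sum(x[i] for i in treated) / len(treated)
    mean_c = sum(x[i] for i in control) / len(control)
    return abs(mean_t - mean_c)


def share_imbalanced(n, threshold=0.25, reps=1000, seed=1):
    """Share of repeated randomizations whose groups differ by more
    than `threshold` on the background variable."""
    rng = random.Random(seed)
    hits = sum(imbalance_once(n, rng) > threshold for _ in range(reps))
    return hits / reps


small = share_imbalanced(20)     # a small trial
large = share_imbalanced(1000)   # a much larger one
print(f"n=20:   share of imbalanced randomizations = {small:.2f}")
print(f"n=1000: share of imbalanced randomizations = {large:.3f}")
```

With only 20 units a sizeable share of randomizations leaves the groups visibly different on the background variable, while with 1000 units that share is close to zero. And this is for one variable that we happen to observe in the simulation; in a real study the relevant causes are many and mostly unknown.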

As yours truly has argued more than once on this blog, that means that in practice we do have to have sufficient background knowledge to deduce causal knowledge. Without old knowledge, we can’t get new knowledge: no causes in, no causes out.

  1. Helen Sakho
    August 9, 2019 at 1:36 am

    Should we be proposing a new IQ test for Economists? They seem to have buried their numerous repeated past mistakes and lost their way forward in the rubble, the smoke of which alone is condemning billions to all kinds of diseases and miseries.

  2. Ken Zimmerman
    August 12, 2019 at 2:37 pm

    Lars, I agree all social science work needs to elaborate more on the methodological questions around causality. Causation is, first, generally nonlinear. That is, one set of events or things counted as “causes” in one time and situation may not be the “causes” in another time and situation. The story changes with changes in the circumstances of interactions, which are not the same in every time and situation. Also, causation is a process, a process of interlinked events and actions that can change in their sequence and in the relative emphasis on one link rather than another. Finally, this causation story changes with feedback. That is, if the story of the cause of a burglary is about poverty, starvation, unemployment, family needs and responsibilities, that cause may change based on the feedback (responses from victim, police, neighbors, church, alarm merchants, etc.) to the burglary, and then on the responses of “burglars” to these responses. This is a rather simple example. These causation networks can and do become complex in more subtle and varied ways. So much so that sometimes the results are impossible to predict or model.
