
The randomistas revolution

from Lars Syll

In his new history of experimental social science — Randomistas: How radical researchers are changing our world — Andrew Leigh gives an introduction to the RCT (randomized controlled trial) method for conducting experiments in medicine, psychology, development economics, and policy evaluation. Although the book mentions critiques that can be levelled against the method, the author does not let them cloud his overwhelmingly enthusiastic view of RCTs.

Among mainstream economists, this uncritical attitude towards RCTs has become standard. Nowadays many mainstream economists maintain that ‘imaginative empirical methods’ — such as natural experiments, field experiments, lab experiments, and RCTs — can help us answer questions concerning the external validity of economic models. In their view, such methods are more or less tests of ‘an underlying economic model’ and enable economists to make the right selection from the ever-expanding ‘collection of potentially applicable models.’

When looked at carefully, however, there are in fact few real reasons to share this optimism about the alleged ‘empirical turn’ in economics.

If we see experiments or field studies as theory tests or models that ultimately aspire to say something about the real ‘target system,’ then the problem of external validity is central (and was for a long time also a key reason why behavioural economists had trouble getting their research results published).

Assume that you have examined how the performance of a group of people (A) is affected by a specific ‘treatment’ (B). How can we extrapolate or generalize to new samples outside the original population? How do we know that a replication attempt ‘succeeds’? How do we know when replicated experimental results can be said to justify inferences made about samples from the original population? If, for example, P(A|B) is the conditional density function for the original sample, and we are interested in making an extrapolative prediction of E[P(A|B)], how can we know that the new sample’s density function is identical to the original? Unless we can give some really good argument that this is the case, inferences built on P(A|B) say nothing about the target system’s P′(A|B).

External validity/extrapolation/generalization is founded on the assumption that inferences based on P(A|B) are exportable to other populations for which P′(A|B) applies. Sure, if one can convincingly show that P and P′ are similar enough, the problems are perhaps surmountable. But arbitrarily introducing functional specification restrictions of the type invariance/stability/homogeneity is, at least for an epistemological realist, far from satisfactory. And often it is – unfortunately – exactly this that I see when I look at mainstream economists’ RCTs and ‘experiments.’

Many ‘experimentalists’ claim that it is easy to replicate experiments under different conditions, and therefore a fortiori easy to test the robustness of experimental results. But is it really that easy? Population selection is almost never simple. Had the problem of external validity only been about inference from sample to population, this would be no critical problem. But the really interesting inferences are those we try to make from specific labs/experiments/fields to the specific real-world situations/institutions/structures that we are interested in understanding or (causally) explaining. And then the population problem is more difficult to tackle.

In randomized trials the researchers try to find out the causal effects that different variables of interest may have by changing circumstances randomly — a procedure somewhat (‘on average’) equivalent to the usual ceteris paribus assumption.

Besides the fact that ‘on average’ is not always ‘good enough,’ it amounts to nothing but hand-waving to simpliciter assume, without argumentation, that it is tenable to treat social agents and relations as homogeneous and interchangeable entities.

Randomization basically allows the econometrician to treat the population as consisting of interchangeable and homogeneous groups (‘treatment’ and ‘control’). The regression models one arrives at by using randomized trials tell us the average effect that variations in variable X have on the outcome variable Y, without having to explicitly control for the effects of other explanatory variables R, S, T, etc. Everything is assumed to be essentially equal except the values taken by variable X.

In a usual regression context one would apply an ordinary least squares (OLS) estimator in trying to get an unbiased and consistent estimate:

Y = α + βX + ε,

where α is a constant intercept, β a constant ‘structural’ causal effect and ε an error term.

The problem here is that although we may get an estimate of the ‘true’ average causal effect, this may ‘mask’ important heterogeneous effects of a causal nature. Although we get the right answer that the average causal effect is 0, those who are ‘treated’ (X = 1) may have causal effects equal to −100 and those ‘not treated’ (X = 0) may have causal effects equal to 100. Contemplating being treated or not, most people would probably be interested in knowing about this underlying heterogeneity and would not consider the OLS average effect particularly enlightening.
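The masking described above is easy to simulate (a minimal sketch in Python; the ±100 effects come from the example in the text, while the sample size, baseline, and noise level are purely illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two equally sized types with opposite treatment effects (-100 and +100),
# so the true average causal effect is 0.
effect = np.where(rng.random(n) < 0.5, -100.0, 100.0)

# Randomized assignment: X = 1 (treated), X = 0 (control).
X = rng.integers(0, 2, n)
Y = 50.0 + effect * X + rng.normal(0.0, 1.0, n)  # baseline 50, small noise

# OLS slope of Y on X (equivalent to the difference in group means).
beta_hat = np.cov(X, Y)[0, 1] / np.var(X)

print(round(beta_hat, 2))    # close to 0: the 'average' answer
print(Y[X == 1].std())       # but treated outcomes spread by roughly 100
```

The estimated β is statistically indistinguishable from zero, while the standard deviation of outcomes among the treated is around 100: exactly the heterogeneity that the single average coefficient hides.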

Most ‘randomistas’ underestimate the heterogeneity problem. It does not just turn up as an external validity problem when trying to ‘export’ regression results to different times or different target populations. It is also often an internal problem to the millions of regression estimates that economists produce every year.

‘Ideally controlled experiments’ tell us with certainty what causes what effects — but only given the right ‘closures.’ Making appropriate extrapolations from (ideal, accidental, natural or quasi) experiments to different settings, populations or target systems, is not easy. “It works there” is no evidence for “it will work here”. Causes deduced in an experimental setting still have to show that they come with an export-warrant to the target population/system. The causal background assumptions made have to be justified, and without licenses to export, the value of ‘rigorous’ and ‘precise’ methods — and ‘on-average-knowledge’ — is despairingly small.

RCTs have very little reach beyond giving descriptions of what has happened in the past. From the perspective of the future and for policy purposes they are as a rule of limited value since they cannot tell us what background factors were held constant when the trial intervention was being made.

RCTs usually do not provide evidence that their results are exportable to other target systems. RCTs cannot be taken for granted to give generalizable results. That something works somewhere for someone is no warrant for believing it will work for us here, or even that it works generally. RCTs are simply not the best method for all questions and in all circumstances. And insisting on using only one tool often means using the wrong tool.

  1. July 10, 2018 at 3:46 am

    We physicists are way up this learning curve of detecting signals in noise. Signal averaging is the first go-to method, but signal-to-noise improves only as the square root of sample size. If this isn’t working early on, it’s only going to get worse on the path forward. The only thing that works first time, every time, is to identify a robust signal variable that jumps off the page. For example, Steve Keen keeps pointing out that the correlation over the trailing three decades between the slope of private debt (aka credit) and unemployment is −0.93. This is not a noisy signal buried in noise. When you find a true controlling variable you don’t need lots of ANOVA and signal averaging. The data grabs one by both ears and screams: Hey. Look here!
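    The square-root law the comment invokes is easy to verify numerically (a minimal sketch with synthetic data; the signal and noise levels are illustrative assumptions, not taken from any economic series):

```python
import numpy as np

rng = np.random.default_rng(1)
signal = 1.0        # constant 'true' signal
noise_sd = 10.0     # noise ten times larger than the signal

# Averaging n noisy observations shrinks the standard error as 1/sqrt(n),
# so a tenfold SNR gain requires a hundredfold increase in sample size.
for n in (100, 10_000, 1_000_000):
    samples = signal + rng.normal(0.0, noise_sd, n)
    est = samples.mean()
    se = noise_sd / np.sqrt(n)
    print(n, round(est, 3), round(se, 3))
```

    With a million samples the standard error is down to 0.01, but getting there from n = 100 cost four orders of magnitude in data: the comment’s point that averaging is a slow road when the signal is weak.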

    • Craig
      July 10, 2018 at 6:35 am

      Precisely. And if debt build-up is continual and austerity is not the macro answer… how can anything other than a new paradigm of COSTLESS monetary gifting be the answer? Keen and other heterodox economists even acknowledge and advocate monetary gifting as a policy; it’s just that
      1) they don’t know how to implement it without causing inflation, although they mouth the flimsy liberal orthodoxy that limited gifting with a UBI would not create inflation. Money, of course, is NOT the primary and operant factor in “monetary” inflation… it’s the complete freedom that business decision-makers have to raise their prices, because who or what is going to stop them from doing so? Especially in a macro-economically monetarily austere system, when they perceive more money than normal is forthcoming?

      2) they haven’t looked at and thought economically about where the terminal expression point for any and all price inflation is (retail sale)

      3) they don’t have the knowledge of the digital nature of the accounting and pricing systems and so they miss how a high percentage discount/rebate monetary gifting policy at retail sale could be utilized to break up the idiotic and contradictory monopoly paradigm of Finance, implement the new paradigm of Direct and Reciprocal Monetary Gifting and put monetary and purchasing power scarcity forever in the dust bin of history.

      Ah, not looking everywhere for answers (being orthodox) and not modeling temporal realities in their theories and potential policies….just what everyone here is lamenting. It’s way past time that heterodox economists stopped being extremely erudite policy and paradigm dunces.

  2. Trond Andresen
    July 10, 2018 at 10:10 am

    The mainstream remains uncomfortable after the advent of the debt-induced “great recession”, which demonstrated that the academic economics emperor is nearly naked.

    So they feel the need to come up with something “new”, and to increase credibility, “experiments” and “laboratories” sound cool. The old “physics envy” — nothing new under the sun.

  3. July 10, 2018 at 6:03 pm

    It is scary to think that economists think they can solve problems with models based on individual experiences that are then introduced into large populations. I think that economists are using numbers and models in place of an empathic understanding that using people in experiments can mean real harm for those people (for example, introducing austerity after banks commit fraud and are then bailed out, while those defrauded are foreclosed upon). If economists would think about how people benefit, or not, from policies or actions, then the policies would make sense. Experimentation for the sake of experimentation is a fool’s errand.

    • Craig
      July 10, 2018 at 6:41 pm


      Exactly. Now if economists, politicians, the lords of finance and large political constituencies would cognite on the fact that that empathic understanding was the natural personal experience of grace as in love in action/love in application… they’d also cognite on wisdom, which is the best integration of the practical and the ideal. Wisdom isn’t wisdom unless it is imminently practical, and as I have shown the policies of economic and monetary grace are best accomplished within commerce’s infrastructure of double entry bookkeeping, using its digital nature and an understanding of where and when it could best be APPLIED in the economic process. Play the policies out with the above criteria in your mind, or better yet on a clay table to make them more temporally real to yourself… and the elegant simplicity and yet thorough applicability of them will become apparent to you.

      • Craig
        July 10, 2018 at 9:01 pm

        The mental and temporal reflectivities and alignments of logic and wisdom after all DO count.

  4. July 24, 2018 at 11:44 am

    A few words on how science works are helpful. Consider Pasteur and anthrax. Anthrax is a disease deadly to farm animals, and thus feared by farmers. Pasteur prepared a vaccine for anthrax based on the anthrax identified in his laboratory. The vaccine stopped anthrax on farms, even though we now know the laboratory anthrax and the farm anthrax are not the same chemically or structurally. In other words, the vaccine was close enough to the farm anthrax to get the job done. In most of science, “close enough” is all we ever have. And we only know if our experiment is close enough after the experiment.

