Randomized control trials and the problem of spillovers (wonkish)

from Lars Syll

When it comes to questions of causality, randomized control trials (RCTs) are nowadays considered some kind of “gold standard” in social sciences and policies. Everything has to be “evidence based,” and the evidence preferably has to come from randomized experiments.

But randomization is basically, just like econometrics for example, a deductive method. Given warranted assumptions (manipulability, transitivity, separability, additivity, linearity, etc.), this method delivers deductive inferences. The problem, of course, is that we will never completely know when the assumptions are warranted, and a fortiori we will never be fully able to justify our causal conclusions. Although randomization may contribute to controlling for “confounding,” it does not guarantee it, since genuine randomness presupposes infinite experimentation and we know all real experimentation is finite. Even if randomization may help to establish average causal effects, it says nothing of individual effects unless homogeneity is added to the list of assumptions. Real target systems are seldom epistemically isomorphic to our axiomatic-deductive models/systems, and even if they were, we still have to argue for the external validity of the conclusions reached from within these epistemically convenient models/systems. Causal evidence generated by randomization procedures may be valid in “closed” models, but what we are usually interested in is causal evidence in the real target system we happen to live in.
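The point about average versus individual effects can be illustrated with a minimal sketch. Everything in it is a made-up assumption for illustration (the population size, the baseline outcome of 5.0, and the ±1 individual effects); it is not taken from any study. Half of the hypothetical population gains one unit from treatment and half loses one unit, so the average effect is zero although nobody has a zero effect.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the illustration is reproducible
n = 10_000

# Hypothetical heterogeneous effects: +1 for even-numbered units, -1 for odd ones.
individual_effects = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
baseline_outcome = 5.0  # assumed identical untreated outcome for everyone

# Randomize treatment with probability 1/2.
treated_flags = [rng.random() < 0.5 for _ in range(n)]
observed = [
    baseline_outcome + (effect if treated else 0.0)
    for effect, treated in zip(individual_effects, treated_flags)
]

treated_mean = mean(y for y, t in zip(observed, treated_flags) if t)
control_mean = mean(y for y, t in zip(observed, treated_flags) if not t)

estimated_ate = treated_mean - control_mean  # what the experiment recovers
true_ate = mean(individual_effects)          # zero by construction

print(f"true average effect:      {true_ate:.3f}")
print(f"estimated average effect: {estimated_ate:.3f}")
print(f"smallest |individual effect|: {min(abs(e) for e in individual_effects)}")
```

In this toy setting the difference in means lands close to the true average of zero, which is all the experiment can identify. Reading that as “the treatment does nothing for anyone” would require the homogeneity assumption, which is false here by construction.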

An interesting example that illustrates one of the problems with RCTs – spillovers and the bridging of the micro-macro gap – was recently presented in the article “Job search assistance: Micro success does not guarantee macro success” by Pieter Gautier et al.:

The average unemployment rate in the EU increased to 11.2% in June 2012 compared with 10.0% a year earlier. This reopened the debate on the desirability of providing assistance to unemployed workers in their search for work. Many countries now offer mandatory job search assistance programmes to unemployment benefits recipients – but the question is, does this help? …

In new research, we study a Danish job search assistance programme which, according to a randomised experiment, leads to large positive effects on exit rates to work … We show, however, that because of spillover effects, a large-scale implementation will only marginally reduce unemployment without increasing welfare …

The simple comparison of outcomes of participants and nonparticipants only gives a consistent estimate about the programme’s effectiveness in case of a large-scale roll out when there are no spillovers between participants and nonparticipants. This assumption is unlikely to hold in case of job search assistance programmes. Participants and nonparticipants are competing for the same jobs, so when participants increase their job search effort, nonparticipants may suffer from this. At the same time, if firms respond by opening more vacancies, nonparticipants benefit …

The empirical results suggest that considering both negative and positive spillover effects is important when evaluating the job search assistance programme. The Danish programme essentially increases the job search effort of participants by requiring them to make more job applications. The effect on vacancy supply is modest, so when participants send out more applications, this reduces the probability that a specific job application gets selected …

In the past two decades there has been more focus on evidence-based policy. Policymakers, therefore, have an increasing interest in evaluating the effectiveness of specific programmes. It is often argued that randomised experiments are the golden standard for such evaluations. However, it is well known that a randomised experiment only provides a policy-relevant treatment effect when there are no spillovers between individuals. In the study discussed above, we have shown that spillovers can be substantial. Despite the success of a small-scale implementation of the programme at the micro level, we find it to be ineffective at the macro level. The results of our study are no exception …
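The spillover mechanism described in the quoted passage can be made concrete with a deliberately crude sketch. This is not the model used by Gautier et al.; it is a toy with assumed numbers (1,000 job seekers, 300 vacancies, participants sending twice as many applications) in which every vacancy is filled by one applicant chosen in proportion to applications sent. The hypothetical parameter `vacancy_response` controls how strongly firms open new vacancies when more applications arrive.

```python
N_SEEKERS = 1000       # assumed number of unemployed job seekers
BASE_VACANCIES = 300   # assumed vacancies without any programme
EFFORT_BOOST = 2.0     # assumed: participants send twice as many applications


def job_finding_rates(share_treated, vacancy_response):
    """Return (treated_rate, control_rate, overall_rate) in the toy market."""
    n_treated = share_treated * N_SEEKERS
    n_control = N_SEEKERS - n_treated
    total_applications = n_treated * EFFORT_BOOST + n_control * 1.0
    # Firms may open more vacancies when applications per seeker rise.
    extra_applications = total_applications / N_SEEKERS - 1.0
    vacancies = BASE_VACANCIES * (1.0 + vacancy_response * extra_applications)
    # Each application has the same chance of landing a job.
    rate_per_application = vacancies / total_applications
    return (
        EFFORT_BOOST * rate_per_application,  # participants
        rate_per_application,                 # nonparticipants
        vacancies / N_SEEKERS,                # whole population
    )


def evaluate(vacancy_response):
    """Compare a 5% experiment with a full roll-out, against no programme."""
    _, _, baseline = job_finding_rates(0.0, vacancy_response)
    treated, control, _ = job_finding_rates(0.05, vacancy_response)
    _, _, rollout = job_finding_rates(1.0, vacancy_response)
    return {
        "micro_estimate": treated - control,      # participant vs nonparticipant gap
        "control_spillover": control - baseline,  # harm to nonparticipants
        "macro_effect": rollout - baseline,       # effect of treating everyone
    }


fixed = evaluate(vacancy_response=0.0)   # vacancies do not respond at all
modest = evaluate(vacancy_response=0.1)  # vacancies respond only modestly

for label, result in (("fixed vacancies", fixed), ("modest response", modest)):
    print(label, {key: round(value, 3) for key, value in result.items()})
```

Under these assumptions the small experiment shows a gap of roughly 0.29 in job-finding rates between participants and nonparticipants, yet treating everyone raises the overall rate by nothing when vacancies are fixed and by only 0.03 when vacancies respond modestly. The experimental gap is large partly because nonparticipants are pushed below their no-programme rate, which is exactly the kind of spillover the simple comparison ignores.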

So this example explains pretty well one reason why the randomized controlled trial (RCT) is not at all the “gold standard” that it has lately often been portrayed as. As yours truly has repeatedly argued on this blog, RCTs usually do not provide evidence that their results are exportable to other target systems. The almost religious belief with which their propagators portray them cannot hide the fact that RCTs cannot be taken for granted to give generalizable results. That something works somewhere is no warranty that it will work for us, or even that it works generally.

  1. Bruce E. Woych
    January 13, 2013 at 5:20 pm | #1

    The phrase “evidence base” has come a long way from its origins. The general idea of “best practice” is a guideline to “outcome based” consensus from authoritative (professional) sources. “Evidence” became obscured around 1995 and thereafter as it took on the “reality” base of cost effectiveness and authoritative opinions (from primary sources of research in progress in lieu of hard proof). The “middle management” that was displaced by a more science based research soon got the upper hand from Universities that began to adopt the phrase…but define (…or as they might say “refine”…) the “methods” as well. What got lost was a precise and hardened definition of “EVIDENCE,” and the result is that information technology spits out “best practice” for a conference call equivalent of outcome based protocol which may or may not include “hard” (causal) evidence as research driven science. All too often these “randomized” trials are simply mass quantities of opinion creating a statistic of “research” associated with an outcome that may well be an institutional bias.

    Evidence base is a catch phrase that should be scrutinized carefully for REAL EVIDENCE, and for the definition of that evidence along with the very precise methods of drawing conclusions from each study, other than “…1200 RCT studies all agree….” The procedures permit an authentic demonstration of a reliable “causal” question to be ignored because the politics of selection is skewed towards validation and not proof.
