## The experimental dilemma

from **Lars Syll**

We can either let theory guide us in our attempt to estimate causal relationships from data … or we don’t let theory guide us. If we let theory guide us, our causal inferences will be ‘incredible’ because our theoretical knowledge is itself not certain … If we do not let theory guide us, we have no good reason to believe that our causal conclusions are true either of the experimental population or of other populations because we have no understanding of the mechanisms that are responsible for a causal relationship to hold in the first place, and it is difficult to see how we could generalize an experimental result to other settings if this understanding doesn’t exist. Either way, then, causal inference seems to be a cul-de-sac.

Nowadays many mainstream economists maintain that ‘imaginative empirical methods’ — especially randomized experiments (RCTs) — can help us to answer questions concerning the external validity of economic models. In their view, such experiments are, more or less, tests of ‘an underlying economic model’ and enable economists to make the right selection from the ever-expanding ‘collection of potentially applicable models.’

It is widely believed among economists that the scientific value of randomization — contrary to other methods — is totally uncontroversial and that randomized experiments are free from bias. When looked at carefully, however, there are in fact few real reasons to share this optimism on the alleged ’experimental turn’ in economics. Strictly seen, randomization does not guarantee anything.
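A small simulation (my own sketch, not part of the original post) illustrates one sense in which randomization guarantees nothing in any single trial: the treatment and control groups can differ markedly on a background covariate purely by chance, since randomization balances covariates only in expectation.

```python
import random
import statistics

# Hypothetical illustration: randomly assign 20 subjects to treatment
# and control, and record how far apart the groups' mean ages end up.
# Randomization balances such covariates only on average, over many
# hypothetical repetitions -- not in the one trial we actually run.

random.seed(1)

def one_trial(n=20):
    """Return the treatment-minus-control gap in mean age for one random split."""
    ages = [random.gauss(40, 10) for _ in range(n)]
    random.shuffle(ages)
    treated, control = ages[: n // 2], ages[n // 2 :]
    return statistics.mean(treated) - statistics.mean(control)

gaps = [abs(one_trial()) for _ in range(2000)]
share_badly_off = sum(g > 5 for g in gaps) / len(gaps)
print(f"worst single-trial imbalance: {max(gaps):.1f} years")
print(f"share of trials with groups > 5 years apart: {share_badly_off:.0%}")
```

In a sizeable fraction of these simulated trials the two arms differ by more than five years of mean age, exactly the kind of imbalance that can masquerade as a treatment effect in the one experiment actually run.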

‘Ideally controlled experiments’ tell us with certainty what causes what effects — but only given the right ‘closures.’ Making appropriate extrapolations from (ideal, accidental, natural or quasi) experiments to different settings, populations or target systems, is not easy. ‘It works there’ is no evidence for ‘it will work here’. Causes deduced in an experimental setting still have to show that they come with an export-warrant to the target population. The causal background assumptions made have to be justified, and without licenses to export, the value of ‘rigorous’ and ‘precise’ methods — and ‘on-average-knowledge’ — is despairingly small.

The almost religious belief with which its propagators — this year’s ‘Nobel prize’ winners Duflo, Banerjee and Kremer included — portray it cannot hide the fact that RCTs cannot be taken for granted to give generalizable results. That something works *somewhere* is no warrant for believing it will work for us *here*, or even that it works *generally*.

The present RCT idolatry is dangerous. Believing there is only one really good evidence-based method on the market — and that randomization is the only way to achieve scientific validity — blinds people to searching for and using other methods that in many contexts are better. RCTs are simply not the best method for all questions and in all circumstances. Insisting on using only *one* tool often means using the *wrong* tool.

So who is the philosopher of economics you are quoting? “We can either let theory guide us in our attempt to estimate causal relationships from data … or we don’t let theory guide us”?

Guidance is needed for the future: theory cannot dictate what we will see, but it can helpfully advise us on which way to look. The theory of electric circuits involves only the simplest series and parallel circuits, but it can be applied to all networks by showing how to convert from one to the other, so the parallel circuits in any complex network can readily be reduced to the resistance of the whole system, e.g. the load on a battery or generator. If we want to know what the network is, we have to look, but it helps to know what to do with what we find.
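The circuit analogy can be made concrete with a sketch of my own (not the commenter’s): the two reduction rules for the simplest series and parallel pairs are enough to collapse any series-parallel resistor network to a single equivalent load.

```python
# Reduction rules for resistor networks: any network built from series
# and parallel combinations collapses to one equivalent resistance.

def series(*rs):
    """Series combination: resistances simply add."""
    return sum(rs)

def parallel(*rs):
    """Parallel combination: conductances (reciprocals) add."""
    return 1 / sum(1 / r for r in rs)

# Example: a battery driving a 6-ohm resistor in parallel with a branch
# made of 1-ohm and 2-ohm resistors in series.
load = parallel(6, series(1, 2))
print(f"equivalent load on the battery: {load:.2f} ohms")
```

Knowing the reduction rules does not tell us what the network is; for that we have to look. But once we have looked, the rules tell us what to do with what we find.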

Conclusions


Although this last section has adopted a critical tone toward RCTs, I do not maintain here that randomization is always a bad idea or that there is no virtue in evidence-based policy. Quite to the contrary. Evidence-based movements have brought to the fore a number of methodological issues that are important for practice—for using evidence to learn about causal relationships—and that have been neglected in the literature. The problem of external validity is one among several examples of an obstacle to learning about policy from experience, and it is quite an embarrassment that philosophers of science and other professional methodologists only recently discovered it. Francesco Guala aptly refers to it as a “minor scandal in the philosophy of science” (Guala 2010: 1070).


I want to end with a dilemma and call for future research. The dilemma is the following. We can either let theory guide us in our attempt to estimate causal relationships from data (and other causal inferences such as generalizing a known causal relationship to a new setting) or we don’t let theory guide us. If we let theory guide us, our causal inferences will be “incredible” because our theoretical knowledge is itself not certain. It is often easy to build a number of theoretical models with conflicting conclusions, and quite generally theoreticians do not really trust one another’s assumptions. If we do not let theory guide us, we have no good reason to believe that our causal conclusions are true either of the experimental population or of other populations because we have no understanding of the mechanisms that are responsible for a causal relationship to hold in the first place, and it is difficult to see how we could generalize an experimental result to other settings if this understanding doesn’t exist. (Reiss 2013, 206)


Either way, then, causal inference seems to be a cul-de-sac. More methodological research is needed on the question of how precisely to integrate theoretical knowledge with empirical studies for successful policy applications, of what nature the theoretical knowledge should be and what to do about the problem that theoretical conclusions are often sensitive to changes in the assumptions made but different sets of such assumptions are equally plausible. This pretty much mirrors Angus Deaton’s conclusion of the matter:


It is certainly not always obvious how to combine theory with experiments. Indeed, much of the interest in RCTs—and in instrumental variables and other econometric techniques that mimic random allocation—comes from deep skepticism of economic theory, and impatience with its ability to deliver structures that seem at all helpful in interpreting reality. Applied and theoretical economists seem to be further apart now than at any period in the last quarter century. Yet failure to reintegrate is hardly an option because without it there is no chance of long-run scientific progress or of maintaining and extending the results of experimentation. RCTs that are not theoretically guided are unlikely to have more than local validity, a warning that applies to nonexperimental work. (Deaton 2010a: 450)


My only qualification would be that RCTs that are not theoretically guided are unlikely to have even local validity. (Reiss 2013, 206-207)


— Julian Reiss (2013)

*Philosophy of Economics: A Contemporary Introduction*. Routledge.

Randomisation should be replaced with victimisation of millions of people who have been misled for decades. The system has blown up like half of the world that is on fire. Will these economists ever give up?

I have repeatedly pointed out that there is NO orthodox and NO heterodox quantitative theory at all. All conventional analysis is rejected by the quantity calculus. So the present blog presents a category error. This is a statement of fact.

I would like to know if you, the reader, accept this as the fact it is. If you do, please let me know. If you think I am wrong, please let me know. At the moment this blog is stuck marking time. By not discussing the “nature of the category error”, a disservice is done to economics. Please comment, whatever you think. It is only by engaging with this subject that progress will be made.

The silence is deafening. If you have nothing to say, why do you read these blogs?

In hope of learning something new?

Dave: I believe that you might learn new things, but unless economic analysts practise the scientific method, what you learn will continue to be nonsense.

Frank, as a mathematician it does seem to me that what passes as ‘science’ in many fields (not just economics) looks like scientism, in that it applies methods (not just RCT) without any apparent concern for their applicability.

I have a blog ( djmarsay.wordpress.com ) where I consider claims about analytic techniques from a mathematical perspective. Maybe I should comment on RCTs?

Wikipedia ( https://en.wikipedia.org/wiki/Randomized_controlled_trial#Criticism ) refers to Hacking ( https://www.semanticscholar.org/paper/Telepathy%3A-Origins-of-Randomization-in-Experimental-Hacking/dfdbd650a0223c5cff427b3b5214661cf09c75a7 ), which is very critical of RCTs.

As far as I can see, the most that is claimed about RCTs is that they eliminate some sources of error. But if anyone is aware of anything that looks like a logical argument that RCTs are fit for the purpose of justifying ‘causal’ claims, I would be glad to critique it.

As far as I can see we are in total agreement. RCTs are a technique which has its place in eliminating investigator bias.

As a mathematician you are ideally placed to comment on my paper, Transient Development, RWER-81, pp. 135–167. This analysis presents abstract production theory. The mathematical solutions explain all the relationships found to be present in the empirical relationships of macroeconomics. Further, it predicts that the empirical values of the Verdoorn coefficient and the intercept of the aggregate production function are the same parameter, and therefore that it is a Lakatosian progressive research programme. No empirical results contradict the mathematical solutions, which means it passes the Popperian test for validity, and of course it is consistent with the quantity calculus. I would appreciate your comments.

That’s why we have paradigms. That’s why we have paradigm shifts.

If Economics is accepted as a science (?), it would go through stages of growth as any other.

The validation of any economic model is not a one-time validation that freezes and enshrines that model forever. Rather, it offers a working model until improvement comes along.

Then: shift.

Frank, if you insist, I’ll say it: you are wrong. Labour time cannot explain prices, therefore it cannot explain aggregates that are arrived at by summing values, not physical quantities. Equations that relate one set of monetary values to another set are dimensionally homogeneous and therefore consistent with the quantity calculus. They might be unstable and practically useless, and frequently are, but that is a different point.

Your point seems to be that economics cannot be reduced to a branch of physics and is therefore illegitimate. Well, it can’t be reduced to physics. If you can show any economic proposition is actually inconsistent with a physical law, you are entitled to reject it. You are not entitled to reject it because it cannot be derived from a physical law. It has to be judged on its own terms, like propositions in biology or psychology. You cannot derive Darwin’s theory of natural selection from Newton’s equations of motion. If you can prove they are inconsistent, then you have a powerful point. If you cannot derive one from the other, it is not because Darwin is failing the quantity calculus.

I think you have an idée fixe, but ask yourself why you are the only person in the world banging on about it.

Thank you for your reply. Your dismissal of a relationship between labour-time and macroeconomic data is only valid if there is not an affine transformation between the two. My figure 5 shows that for the Solow data there is an affine transformation. I therefore believe your point might be true in other circumstances but it is invalid in my analysis of the Solow data. I agree that equations involving monetary values are dimensionally homogeneous but my analysis is not based on this in any way. I agree with you that monetary aggregation might be “unstable and practically useless”.

I do not understand why you think I do not believe that the physical manifestations of the economy are subject to the laws of physics. I am certain that they are. However, there are two separate aspects of economies. One is the decisions people make, which might be advantageous or damaging. Whatever decisions they make, when converted into action, will be subject to the laws of physics. The Solow data covers periods including two world wars, periods of normality and the Great Depression. All are consistent with my mathematical analysis.

I wonder why you, as a mathematician ideally placed to comment on mathematics, do not examine my mathematical analysis but instead discuss things which are irrelevant to it.

Yes, I have an “idée fixe”: that my mathematical analysis is correct and that it describes abstract production theory. You fail to comment on the mathematics, which you are ideally placed to do, and you do not explain why, when it is consistent with every mathematical regularity in the economy, you choose not to comment on this important fact.

Possibly, I should put it very simply. Is my mathematical analysis correct? If you think it incorrect, please explain what is wrong.

Frank, Coincidentally I have made some notes on ‘mathematization’ at https://djmarsay.wordpress.com/mathematics/rouxs-forms-of-mathematization/ .

If you want to check your equations and numerical results, you can put them into something like MathCAD, and some would regard this as demonstrating ‘correctness’. But the real issue is: by what criteria should we judge your theory? More generally, how should we judge any empirical theory? What are the ‘first principles’?

Perhaps more importantly, what are its implications? Supposing that your conclusion is correct: does this represent some fundamental constraint on how economies could work, or a feature of historical capitalism that might be changed by policy?


Dave, I have read the page you refer to. My first reaction is that you seem to have ignored the use of mathematics in physics and engineering.

I personally do not wish to check my equations. Further confirming that I am correct is not my objective, and it moves nothing forward. What is necessary is for the truth to be recognised by others. That is why I would hope that you would confirm that my manipulations are correct.

Originally I did not have a position on production. The theory arose from two definitions, one of productivity and the other of technical progress and an assumption of returns to scale. Everything else followed.

I expect that accepted empirical theories should not be invalidated by the empirical evidence. Much of my paper demonstrates that this is true. It passes Popper’s validation tests and is a Lakatosian progressive research programme. It would appear that economists do not understand what these are or their significance.

Yes, it does show fundamental constraints on how economies work. When I publish a paper on abstract growth theory, the very surprising policy implications will become visible.

As you are the only mathematician that I know of who has seen my paper, I would ask you to confirm that the mathematics presented are correct. This would appear to be an important first step.

Frank,

My problem is that people seem to have varying ideas about what ‘mathematically correct’ means, and mathematicians have been blamed because of this misunderstanding. I have some hopes of resolving this problem, but I am not there yet.

Dave,

I was asking you to confirm that my mathematical manipulations are not in error. That is all. There are many who do not have the mathematical understanding to determine that, so it would do them a service.

I’ll read your article again. One thing I did not understand on a quick reading is that you seem to insist on constant returns to scale. Adam Smith’s pin-factory example shows how you get more pins with the same labour time through a process of specialisation, which is dependent on scale. Reorganisation produces more output consistent with the laws of physics.

Meanwhile you might read Nelson and Winter’s criticism of the aggregate production function and the Solow theory. The relevant point is that the power of aggregate data, such as Solow used, to discriminate among theories is very low. In their book on evolutionary economics, and in an earlier Economic Journal article, Nelson and Winter built an evolutionary model of US industrial production which posited a substantial number of different firms following a satisficing approach, sticking with a production method until it failed to make a minimum profit and then searching in prescribed ways for an alternative improved technique. Some firms failed, so there was population churn. They based the routines, including the search routine, on empirical microeconomic research. When they simulated the model they could generate data series for output, capital stock and employment very similar to the Solow data. When they fitted Cobb-Douglas type equations to the data their model had generated, the fit was good. The data alone could not reject a model that was actually wrong by construction. That reinforces your point, which I have always accepted, that curve fitting on limited aggregate data does not validate a theory. The fit is a necessary but not sufficient condition for putting confidence in it. Moreover there will be more than one abstract theory consistent with the data at any time.
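Nelson and Winter’s point, that aggregate data have little power to discriminate among theories, can be illustrated with a toy simulation of my own (not their evolutionary model, and with made-up numbers): generate output from a technology that is deliberately not Cobb-Douglas, fit a Cobb-Douglas form anyway, and observe how well it appears to fit.

```python
import numpy as np

# Toy illustration: the 'true' technology here is linear in capital K
# and labour L -- not Cobb-Douglas by construction -- yet a fitted
# Cobb-Douglas Y = A * K^a * L^b appears to explain the data well.

rng = np.random.default_rng(0)
n = 60
K = np.exp(rng.normal(3.0, 0.3, n))            # capital series
L = np.exp(rng.normal(2.0, 0.3, n))            # labour series
Y = 0.5 * K + 1.5 * L + rng.normal(0, 0.5, n)  # linear technology + noise

# OLS fit of log Y = log A + a log K + b log L.
X = np.column_stack([np.ones(n), np.log(K), np.log(L)])
coef, *_ = np.linalg.lstsq(X, np.log(Y), rcond=None)
resid = np.log(Y) - X @ coef
r2 = 1 - resid.var() / np.log(Y).var()
print(f"R^2 of Cobb-Douglas fit to non-Cobb-Douglas data: {r2:.3f}")
```

The high R² is adventitious: the data were generated by a different technology altogether, so a good fit, on its own, validates nothing.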

Btw I did not say you didn’t believe economic processes were subject to the laws of physics. Evidently they must be; we both believe that. My point is that economic hypotheses cannot be derived just from the laws of physics. A pretty good generalisation is that people prefer more material comfort to less (Diogenes excepted); that means when income, measured in currency units, goes up, consumption of goods and services, also measured in currency units, goes up too. A stochastic equation relating the two monetary values of income and consumption is not derivable from physics, and is not inconsistent with the quantity calculus. It is not a “law” like a physical law – things can happen to disrupt it. Nevertheless it can be calibrated for a given place and time and can be useful in forecasting or estimating the effect of a policy change. Useful, but not infallible.

ghh, you say ” The data alone could not reject a model that was actually wrong by construction.” Could you explain? When I read Frank’s paper I thought of something very like Nelson and Winter.

On returns to scale: economists misunderstand how scale changes. Engineers are very familiar with scaling laws, arrived at by applying dimensional analysis. From appropriate groups of dimension one, a smaller-scale model can be used to predict how the full-scale object will behave. I would interpret the pin example as changes in productivity when the process is changed. You will see in my analysis that there is a maximum output produced when the proportion of tool makers to production workers changes. Solow interpreted his data as suggesting decreasing returns to scale; I show that this is the effect of the maintenance requirement. I would like to remind you that I have no model: first-principles analysis precludes that. I am aware of all the criticisms of aggregate production functions. They are always a description of a single data set. My equation (31) is only a description of Solow’s data. Please note that the left-hand side should read q-dot, not q. What is significant is that my equation (17) describes every data set along the whole range of aggregate production, whereas Solow’s relationships fit only in the limited range of the data. The reason why there are so many forms of production function (see Humphrey (1997) and Mishra (2010)) is that none are capable of aligning with the many published data sets used in various attempts to find the abstract theoretical form.
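The engineering use of dimension-one groups can be sketched with a standard textbook example of my own choosing (illustrative numbers, not from the paper): matching the Reynolds number between a scale model and the full-scale object makes the two flows dynamically similar, so model measurements predict full-scale behaviour.

```python
# Dimensional-analysis sketch: the Reynolds number Re = rho*v*L/mu is a
# dimension-one group; matching it between model and full scale gives
# dynamic similarity.

def reynolds(rho, v, length, mu):
    """Reynolds number: ratio of inertial to viscous forces (dimensionless)."""
    return rho * v * length / mu

rho, mu = 1.225, 1.81e-5             # air at sea level: density (kg/m^3), viscosity (Pa*s)
full_length, full_speed = 10.0, 5.0  # full-scale object: length (m), speed (m/s)
model_length = full_length / 10      # 1:10 scale model

# To keep Re equal in the same fluid, the model must be tested at a
# speed scaled up by the same factor the length was scaled down.
model_speed = full_speed * (full_length / model_length)
print(f"model test speed for dynamic similarity: {model_speed:.1f} m/s")
print(f"Re (model) = {reynolds(rho, model_speed, model_length, mu):.3e}")
print(f"Re (full)  = {reynolds(rho, full_speed, full_length, mu):.3e}")
```

With the Reynolds numbers matched, drag and flow measurements on the model carry over to the full-scale object, which is the sense in which dimension-one groups let a small model stand in for the real thing.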

So when Nelson and Winter fitted a Cobb-Douglas equation to their data, they committed a category error as the equation fails the test of the quantity calculus.

I agree with you that a fit is a necessary but not a sufficient condition for confidence in a relationship. Please note that my equation (17) fits all the data over the full range of possibilities. I am in total agreement that there will be other abstract relationships consistent with the data. I have developed abstract growth theory which is consistent with the data and also with abstract production theory. I intend to publish this in the near future.

Sorry if I misunderstood what you said about my thinking on the economy and the laws of physics. When I am talking about my analysis I am talking only about production theory, not even about growth theory.

By the way my email address is on the bottom of my paper. I would be very happy to discuss any aspects of it with you as I am sure that they will not always be relevant to the contents of these blogs.

References:

Humphrey, T. M. (1997). ‘Algebraic production functions and their uses before Cobb-Douglas’. In: Economic Quarterly-Federal Reserve Bank of Richmond 83, pp. 51–83.

Mishra, S. K. (2010). ‘A brief history of production functions’. In: The IUP Journal of Managerial Economics VIII(4), pp. 6–34.

Dave Marsay, I meant to say that the data generated by Nelson and Winter came from an evolutionary model that was never in any sort of equilibrium, and was therefore not generated according to the “theory” underlying the aggregate production function or the Solow growth model. When they fitted the Cobb-Douglas econometrically to their data, they knew that any fit would be adventitious. Their point was that the function appeared to fit data generated in quite a different way and in fact explained nothing. How much weight should we put on it, therefore, when people find it fits real data? The data have low power to discriminate, and one has to pay attention to the a priori plausibility of the model being tested.

I agree with N&W and with critics on this blog who argue that it is necessary to pay attention to the plausibility of assumptions when constructing a model. Fitting data is not enough. Milton Friedman was simply wrong in his instrumentalism. But though fitting data is not sufficient it is surely necessary. Why would you entertain any proposition that is incompatible with the data we have? Econometrics, honestly applied, therefore has an important role.

Gerald, Not being a genuine economist, I look at things this way:

In any empirical subject one searches for invariants. Analysing situations that are out of equilibrium tends to be difficult, so a common procedure is to assume equilibrium, identify some invariants, and then look at the data to see which of these invariants survive substantive changes. Thus while the fit may well be adventitious, it seems to be just the sort of thing that one ought to be looking for.

In terms of your second paragraph, I think it enough that assumptions are plausible for some ‘epoch’. The problem, surely, is when one takes an implausible assumption (such as permanent equilibria) and presents it as a conclusion?

Frank’s results look highly implausible, for the reasons you give. But doesn’t that make them worthy of informed scrutiny?

Or am I missing something? (Or misunderstanding you?)

My “results look highly implausible”: I find this to be a strange assertion as I presented evidence showing that my analysis conforms to every macroeconomic relationship described in the literature. If you say what you think is implausible, then I will deal with your comments either here or by email.

Frank. Sorry, I should have anticipated your interpretation of what I said. If you regard something as ‘highly probable’ and do an experiment that doesn’t falsify it, you have learnt very little. For example, it could be that your assessment of high probability is due to some implicit assumption that has also ‘infected’ your experimental design.

On the other hand, if (given certain assumptions) your results are highly implausible then you have learnt something, have every reason to question the assumptions, implicit and explicit, and might make progress.

From an information-theoretic perspective then, it is the apparently highly implausible things that we should be paying attention to, as distinct from arguments that explain things in terms of other things that seem reasonable, but which – technically – ought to be regarded as at least as doubtful. (Which is what we all tend to do.)

It seems to me that there are serious methodological issues here; we would need to establish a common language before it would be meaningful to say that something was ‘mathematically correct’ and expect to be understood.

(But there may also be more important things one could sort out without getting so pedantic.)

I agree with everything you have said but it does not advance economics.

As a mathematician you should at least be able to confirm whether my mathematical manipulations are correct. I do not think you would find it a problem to see whether they are. I believe it is of great importance for you to do this; once you have done so, it will be possible to move the discussion on. There are many people who do not have the mathematical skills to get to that point. Could you at least do that, please?

As I have said elsewhere, I had no preconceptions of what to expect when I started. Then I discovered that the solutions I developed described all the empirical relationships related to production. Then showing that the Verdoorn relationship shared the same variable as the intercept of the aggregate production function, and that the empirical values were the same, meant that the analysis produced a Lakatosian progressive research programme. Furthermore, my analysis is parsimonious, unlike all of conventional and heterodox analysis.

Gerald,

Arbitrary equations will fit data perfectly well. The Cobb-Douglas equation is simply another arbitrary equation.

The only way to determine real theory is from a first principles analysis. This is the major lesson economists need to understand and accept.