Keynes on statistics and evidential weight
from Lars Syll
Almost a hundred years after John Maynard Keynes wrote his seminal A Treatise on Probability (1921), it is still very difficult to find statistics textbooks that seriously try to incorporate his far-reaching and incisive analysis of induction and evidential weight.
The standard view in statistics – and the axiomatic probability theory underlying it – is to a large extent based on the rather simplistic idea that “more is better.” But as Keynes argues – “more of the same” is not what is important when making inductive inferences. It’s rather a question of “more but different.”
Variation, not replication, is at the core of induction. Finding that p(x|y) = p(x|y & w) doesn’t make w “irrelevant.” Knowing that the probability is unchanged when w is present gives p(x|y & w) another evidential weight (“weight of argument”). Running 10 replicative experiments does not make you as “sure” of your inductions as running 10 000 varied experiments – even if the probability values happen to be the same.
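A minimal sketch of this point in Python. The trial records, the field names y, w and x, and the helper cond_freq are all hypothetical, invented for illustration; relative frequencies stand in for probabilities, which is a simplifying assumption and not Keynes's own construction. Two bodies of evidence give the same value for p(x|y), but only the varied one says anything about what happens when w is present.

```python
from fractions import Fraction

def cond_freq(trials, **given):
    """Relative frequency of x among trials matching `given`; None if no trial matches."""
    matching = [t for t in trials if all(t[k] == v for k, v in given.items())]
    if not matching:
        return None  # the evidence is silent about this condition
    return Fraction(sum(t["x"] for t in matching), len(matching))

def make_trials(n, w, hits):
    """Hypothetical data: n trials with y present, w fixed, x observed in the first `hits`."""
    return [{"y": True, "w": w, "x": i < hits} for i in range(n)]

# "More of the same": ten trials, all run without w.
replicated = make_trials(10, w=False, hits=7)
# "More but different": the same ten trials plus ten run with w present.
varied = make_trials(10, w=False, hits=7) + make_trials(10, w=True, hits=7)

print(cond_freq(replicated, y=True))          # 7/10
print(cond_freq(varied, y=True))              # 7/10
print(cond_freq(varied, y=True, w=True))      # 7/10
print(cond_freq(replicated, y=True, w=True))  # None
```

The numerical value is 7/10 in every case where one exists, so a single probability number cannot distinguish the two bodies of evidence. The difference shows up only in which questions the evidence can answer at all.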
According to Keynes we live in a world permeated by unmeasurable uncertainty – not quantifiable stochastic risk – which often forces us to make decisions based on anything but “rational expectations.” Keynes rather thinks that we base our expectations on the confidence or “weight” we put on different events and alternatives.

To Keynes expectations are a question of weighing probabilities by “degrees of belief,” beliefs that often have precious little to do with the kind of stochastic probabilistic calculations made by the rational agents as modeled by “modern” social sciences. And often we “simply do not know.” As Keynes writes in the Treatise:
The kind of fundamental assumption about the character of material laws, on which scientists appear commonly to act, seems to me to be [that] the system of the material universe must consist of bodies … such that each of them exercises its own separate, independent, and invariable effect, a change of the total state being compounded of a number of separate changes each of which is solely due to a separate portion of the preceding state … Yet there might well be quite different laws for wholes of different degrees of complexity, and laws of connection between complexes which could not be stated in terms of laws connecting individual parts … If different wholes were subject to different laws qua wholes and not simply on account of and in proportion to the differences of their parts, knowledge of a part could not lead, it would seem, even to presumptive or probable knowledge as to its association with other parts … These considerations do not show us a way by which we can justify induction … /427 No one supposes that a good induction can be arrived at merely by counting cases. The business of strengthening the argument chiefly consists in determining whether the alleged association is stable, when accompanying conditions are varied … /468 In my judgment, the practical usefulness of those modes of inference … on which the boasted knowledge of modern science depends, can only exist … if the universe of phenomena does in fact present those peculiar characteristics of atomism and limited variety which appears more and more clearly as the ultimate result to which material science is tending.
Science according to Keynes should help us penetrate to “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts.” Models can never be more than a starting point in that endeavour. He further argued that it was inadmissible to project history on the future. Consequently we cannot presuppose that what has worked before, will continue to do so in the future. That statistical models can get hold of correlations between different “variables” is not enough. If they cannot get at the causal structure that generated the data, they are not really “identified.”
How strange that writers of statistics textbooks as a rule do not even touch upon these aspects of scientific methodology that seem to be so fundamental and important for anyone trying to understand how we learn and orient ourselves in an uncertain world. An educated guess on why this is so would be that Keynes’s concepts are not possible to squeeze into a single calculable numerical “probability.” In the quest for quantities one turns a blind eye to qualities and looks the other way – but Keynes’s ideas keep creeping out from under the statistics carpet.
It’s high time that statistics textbooks give Keynes his due.
It is not surprising that Karl Popper studied Keynes’ Treatise intensively. See several references in The Logic of Scientific Discovery. See esp. Section 83 on Corroborability, Testability, and Logical Probability:
“That Keynes (nevertheless) intends by his ‘probability’ the same as I do by my ‘corroboration’ may be seen from the fact that his ‘probability’ rises with the number of corroborating instances, and also (most important) with the increase of diversity among them.”
Interesting, although I don’t agree with Popper’s first remark, since Keynes’s “probability” can actually go in any direction when increasing “instances”.
The author of the statistics textbook I used in graduate school, almost a half century ago, always warned the students that “figures do not lie, but liars can figure” and that therefore the mindless pursuit of R-squares was a fool’s game.
Unfortunately modern day econometricians have no concept of this reality. And, even worse, the establishment of the economics profession, who long to be thought of as the physicists of the social sciences, reward those who publish these pursuits of econometric R-squares. Accordingly, if a young economics Ph.D. wants to be considered for the tenure track at a “good” economics department, one must follow the rules of this fool’s game.
Similar to Ellsberg’s discussion of ambiguity.
Very perceptive, and I agree on this too! Looking at Keynes’s rightly famous chapter 5 – Other methods of determining probabilities – you find an argumentation in substance similar to Ellsberg’s.
Chapter 5 of Treatise (1921) that is.
On the relationship between Keynes’s conception of evidential weight and the Ellsberg paradox
Alberto Feduzi, Journal of Economic Psychology, vol. 28, no. 5, pp. 545-565, 2007
Thanks for the reference. Very interesting indeed! Although I wrote on Ellsberg’s paradox in my first PhD dissertation, in 1991, I hadn’t read his dissertation.
I was at Cornell in 1987 and came across a copy of Ellsberg’s dissertation in the Catherwood industrial labor relations library of all places. It changed my life.
Have been looking for Keynes’ Treatise on Probability for forty years since our local British library sold it off: never finding it second-hand or via Amazon and only seeing it again in an Australian university library. Was therefore delighted just now to find it has become available via the Gutenberg project.
http://www.gutenberg.org/files/32625/32625-pdf.pdf
thanks dave, just browsing the treatise has been stimulating for this old statistician/ecologist. i hope to read it in full … though it may be just a little over my head. the anticipation i feel is surprising; maybe it will answer some of my long-standing issues in stats/probability.
i have read little of keynes (tho i know some of his ideas, such as ‘sticky prices’, interdependent utility, etc., from other sources, and also his essay saying something like in 50 years people would stop building ‘the wealth of nations’ and instead go for some more ‘moral sentiments’).
i was interested in that formula which i have never seen: p(x|y) = p(x|y & w). in other words you can keep adding on more information w, such as p(x|y & w & z …), but you won’t get to know any more.
this looks similar to a first order markov process—CKS equation (and i guess in some cases a martingale), or in other form a simple recursion (like a unit root process).
or maybe not: here you already know everything you ever did and will know. didn’t some guru from the 60’s write a book ‘be here now’? (physicist julian barbour also suggests time is an illusion. and then using recursions, you can get the slutsky-yule effect to get something from nothing).
ps i just re-loaded my memory and that equation is the markov property suitably interpreted which does lead to the CKS equation, and then fokker-planck, etc. (for diffusive processes or quantum mechanics via a wick transformation). i.e. if x= x(t), y=x(t-1), w=x(t-2), etc. where t is time
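A small sketch of that reading, in Python, under stated assumptions: the two-state chain, its transition matrix P and the starting distribution are made up for illustration, and "CKS" is taken here to mean the Chapman-Kolmogorov(-Smoluchowski) equation. With x = x(t), y = x(t-1), w = x(t-2), the code checks exactly (using fractions) that p(x|y) = p(x|y & w), and that the two-step transition probabilities obtained from the joint distribution match the matrix product P·P.

```python
from fractions import Fraction
from itertools import product

# Hypothetical two-state chain: P[i][j] = p(x(t) = j | x(t-1) = i).
P = [[Fraction(9, 10), Fraction(1, 10)],
     [Fraction(1, 2), Fraction(1, 2)]]
start = [Fraction(1, 3), Fraction(2, 3)]  # assumed distribution of w = x(t-2)
states = range(2)

# Joint distribution over (w, y, x) = (x(t-2), x(t-1), x(t)).
joint = {(w, y, x): start[w] * P[w][y] * P[y][x]
         for w, y, x in product(states, repeat=3)}

def p_x_given_y(x, y):
    """p(x | y), marginalising over w."""
    num = sum(joint[(w, y, x)] for w in states)
    den = sum(joint[(w, y, x_)] for w in states for x_ in states)
    return num / den

def p_x_given_yw(x, y, w):
    """p(x | y & w)."""
    return joint[(w, y, x)] / sum(joint[(w, y, x_)] for x_ in states)

# Markov property: conditioning on w as well as y changes nothing.
markov_holds = all(p_x_given_y(x, y) == p_x_given_yw(x, y, w)
                   for w, y, x in product(states, repeat=3))

# Chapman-Kolmogorov: p(x | w) = sum over y of p(y | w) * p(x | y), i.e. (P.P)[w][x].
P2 = [[sum(P[i][k] * P[k][j] for k in states) for j in states] for i in states]
two_step = [[sum(joint[(w, y, x)] for y in states) / start[w] for x in states]
            for w in states]
ck_holds = two_step == P2

print(markov_holds, ck_holds)
```

Note that this only shows the formula holding by construction in a process built to be Markov. Whether any real series has that property is a separate, empirical question.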
Alberto Feduzi’s effort is commendable, but there are other works on Keynes’s probability theory that do a better job at explaining how it works. (Hint: To get a better understanding, go beyond the first few chapters of the TP.)
http://www.sciencedirect.com/science/article/pii/S0167487008001104
http://bjps.oxfordjournals.org/content/44/2/357.extract
http://www.tandfonline.com/doi/abs/10.1080/02698599408573487
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1546726
As for Daniel Ellsberg’s doctoral dissertation, Risk, Ambiguity and Decision, it deserves the acclaim it gets. See the following review.
http://www.amazon.com/review/R2GDKWB98KVBDD/
Bearing in mind that in 1968 I had to think this out for myself in a different context, with Keynes’s position in mind but before I discovered Popper, my conclusion has long been that where Popper is talking about the probability of an event, Keynes is talking about the probability that a dynamic physical system of a given logical type will be in a given state at an indeterminate time. Effectively Popper and “least squares” methodology are aiming at establishing the hypothesis of the norm of a sample of repeated measurements, where the repetition effectively increases the accuracy of the prediction by narrowing the probable standard deviation of the distribution, whereas for Keynes the hypothesis is in effect the logical structure of a real system (tautologically defining particular types of event), and the probabilities he is feeling for are concerned with establishing the reliability of the expected results through time, i.e. the probability that the logic and its results will change in the ways expected. Evidence of how results change in logically similar applications thus confirms the probability that the choice of logical type is right and thus (as Popper suggests) adds all the instances of their reliability and failure to those under direct consideration.
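The narrowing through repetition mentioned above can be sketched with the textbook formula for the standard error of a mean, sigma / sqrt(n). This is only an illustration: the function name standard_error is hypothetical, and the formula assumes independent measurements drawn from one unchanging distribution, which is exactly the “more of the same” assumption under discussion.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean of n independent measurements with common std dev sigma."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)

# Repetition shrinks the spread of the estimate only as 1/sqrt(n).
for n in (10, 100, 10_000):
    print(n, standard_error(1.0, n))
```

The formula says nothing about whether the system generating the measurements keeps the same structure when conditions vary, which is the question the Keynesian reading is concerned with.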
Another approach to this is suggested by reflection on library scientist S R Ranganathan’s 1965 book “The Colon Classification”. SR argued that generalisation and abstraction go in opposite directions, the one abstracting evidence and the other logical space and thus ALL the evidence which would have appeared in that view (as in seeing a piece of paper sideways on). Popper (in the place cited by Henk at #1) specifically refers to generalisation. In effect Keynes was arguing from logical abstraction, almost seeing “the invisible hand” as the logic of cybernetics (steering), and in ‘The General Theory’ advancing from continuous redirection (-dE/dt) by adding periodic repositioning (-∫E.dt) to allow for sideways drift. I’m arguing that the logical type of economics is not a single circuit but, at a minimum when all the content is abstracted away, four circuits interconnected. Modelled simply as electrical battery circuits, these change character if someone fails to recharge some or all of the batteries. The present assumption is that one battery (finance) recharges the system, but in fact, when another (industry) goes flat, experiment shows one can predict that family batteries will go flat and the flow from distributors back to finance will reverse (i.e. from net credit to net debit).
Lars, you say: ‘Science according to Keynes should help us penetrate to “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts.” Models can never be more than a starting point in that endeavour’.
I agree with Keynes’s comment, but not with yours here. Science is a cyclic process and what is one’s beginning can also be one’s end. Observation can suggest a model, the model (but not the observations) may be communicated to others, and further observations by many in light of the original model may lead them to change it.
The initial penetration to causes involves what Peirce and Roy Bhaskar call “retroductive” logic; the resultant chains of increasingly abstract models are reversed to permit the “deduction” of appropriately concrete applications and effects.