The Reinhart and Rogoff defense of their work is riddled with mistakes, too

Economists are proud of their advanced mathematical and quantitative methods. The already classic Herndon, Ash and Pollin paper, however, pointed out major flaws and mistakes in basic arithmetic in the influential work of Reinhart and Rogoff on the relation between debt and growth, mistakes with ‘non-trivial’ implications for their conclusions and for present economic policy. Reinhart and Rogoff responded to this, but to an extent this response only made matters worse (see also this Krugman blogpost). This is also pointed out in a new response by Ash (which we can publish here courtesy of the German Handelsblatt), who shows that the Reinhart and Rogoff defense (see below) crucially misquotes and misrepresents their own work.

I appreciate Reinhart and Rogoff’s frankness in acknowledging the errors in their paper.

I disagree with their assertion that our results are not dramatically different from theirs. Where Reinhart and Rogoff report average GDP growth of -0.1 percent above the 90 percent public debt/GDP ratio, we find average GDP growth of 2.2 percent. This is an enormous difference.

We also refute the RR evidence for an “historical boundary” around public debt/GDP of 90 percent, above which growth is substantively and non-linearly reduced. There is no indication of an important threshold in the data and this was an important part of the argument.

Their weighting to calculate mean growth is nonstandard; for example, it gives a single year of data for New Zealand as much weight as 19 years of data for the UK or 19 years of data for Greece. We discuss the considerations around weighting in our paper, and the Reinhart and Rogoff response does not address that the 1951 GDP growth for New Zealand accounts for a substantial share of their result.

In the discussion of the method on p. 7 of their 2010 NBER working paper, they imply that the mean results are an average of country-year values by the high number of observations: “Note that of the 1186 annual observations, there are a significant number in each category, including 96 above 90 percent.” Similarly in the notes to Figure 2 in the AER article, the description implies that country-year observations are being averaged: “The number of observations for the four debt groups are: 443 for debt/GDP below 30%; 442 for debt/GDP 30 to 60%; 199 observations for debt/GDP 60 to 90%; and 96 for debt/GDP above 90%. There are 1,180 observations.” However, means were taken over countries, and hence RR used only 10 observations (countries), not 96 observations (country-years), in calculating the mean GDP growth in the above-90-percent public debt/GDP category.
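The difference between the two averaging schemes can be made concrete with a minimal sketch. The country labels and growth figures below are hypothetical, chosen only to show the mechanics; they are not the RR or HAP data, and the function names are illustrative.

```python
from statistics import mean

# Hypothetical growth figures (percent) for one debt category.
# These are NOT the RR or HAP data; they only illustrate the mechanics.
growth_by_country = {
    "A": [2.5] * 19,  # 19 country-years in the category
    "B": [-7.0],      # a single country-year in the category
}


def country_weighted_mean(data):
    """Average each country's years first, then average the country means.

    Every country counts once, however many years it contributes.
    """
    return mean(mean(years) for years in data.values())


def country_year_weighted_mean(data):
    """Pool all country-years and average them, so every year counts once."""
    return mean(g for years in data.values() for g in years)


print(country_weighted_mean(growth_by_country))
print(country_year_weighted_mean(growth_by_country))
```

With these made-up numbers the single year of country B carries half the weight under the first scheme and one twentieth under the second, which is why the two means can differ so sharply.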

In their 2010 NBER working paper RR note the early postwar experience of some of the countries for which they ultimately exclude observations: “Of course, there is considerable variation across the countries, with some countries such as Australia and New Zealand experiencing no growth deterioration at very high debt levels. It is noteworthy, however, that those high-growth high-debt observations are clustered in the years following World War II.” These years are also the source of the only low-growth high-debt observations for the United States, which in contrast to Australia and New Zealand are included in the RR paper. Irons and Bivens have an excellent discussion of the US experience.

In passing I note that the results that their response attributes to Table 1 of their 2010 AER article are in fact from their 2010 NBER paper. Table 1 of the AER article does not report results for the postwar experience of the advanced economies. In the AER article, the postwar averages for the advanced economies are reported only in the bar chart in Figure 2. The bar chart values in AER 2010, except the -0.1 percent GDP growth for the highest public debt/GDP category, differ somewhat from the values reported in Appendix Table 1 of the 2010 NBER paper and the Reinhart and Rogoff response.

Finally I reject entirely the implication that we “insinuate manipulation.” Our paper is an objective assessment of the methods and errors in the RR 2010 publications and attributes no intention or motive.

Here is the defense of Reinhart and Rogoff, to which Ash responds above:

Response to Herndon, Ash and Pollin by Carmen Reinhart and Kenneth Rogoff, April 17, 2013

We are grateful to Herndon et al. for the careful attention to our original Growth in a Time of Debt AER paper and for pointing out an important correction to Figure 2 of that paper. It is sobering that such an error slipped into one of our papers despite our best efforts to be consistently careful. We will redouble our efforts to avoid such errors in the future. We do not, however, believe this regrettable slip affects in any significant way the central message of the paper or that in our subsequent work. But first let us consider the specific points raised by Herndon, Ash and Pollin (HAP) in their comment that we were sent yesterday.

The authors point out that there are three problems with our 1945-2009 averages and the paper itself: (i) a coding error that causes the first five countries in the alphabet to be omitted in forming averages for the 1946-2009 period in one figure, (ii) “selective exclusion” of 1946-1950 for New Zealand, and (iii) “unconventional weighting of summary statistics”—the implication being that these omissions are intentionally used to bias the results. They argue that the interaction of three problems magnifies their effects and leads to completely different conclusions, especially when they choose a different weighting scheme.

On the first point, we reiterate that Herndon, Ash and Pollin accurately point out the coding error that omits several countries from the averages in figure 2. Full stop. HAP are on point. The authors show our accidental omission has a fairly marginal effect on the 0-90% buckets in figure 2. However, it leads to a notable change in the average growth rate for the over 90% debt group. The median growth rate we report is the right order of magnitude.

Our interpretation of the errant data point in figure 2 was fortunately tempered somewhat by the parallel weight given to the median GDP growth rate for the various levels of debt in our discussion, an issue HAP selectively ignore. To quote from our opening paragraph:

“median growth rates for countries with public debt over roughly 90 percent of GDP are about one percent lower than otherwise”

There is also Table 1, which immediately follows figure 2 and does not have the same issues. Table 1 gives all the individual country estimates for all the buckets and over a much longer time period than figure 2, and of course figures importantly in our analysis. We are fortunate that we chose to present our results in several different ways, including means, medians, and individual country averages, in no small part as standard robustness checks. Nevertheless, the mistake in figure 2 resulting from the coding error is a significant lapse on our part.

HAP go on to note some other missing debt data points, which they describe as “selective omissions”. This charge, which permeates their paper, is one we object to in the strongest terms. The “gaps” are explained by the fact there were still gaps in our public debt data set at the time of this paper, a data set no one else had ever been able to construct before and which we now have filled in much more completely. Many readers of our work have followed this evolution on our data website. For example, at the time the 2010 AER paper was written, there were gaps in the French data for the 1970s that we only filled in later. Other data, including data for New Zealand for the years around WWII, had just been incorporated and we had not vetted the comparability and quality of the data against data for the more recent period. In effect, HAP only knew we had these data as we sent them the file we had used at that time.

We have no issue with Herndon, Ash and Pollin bringing attention to any data question regarding our work. In this regard, we note that we have long since fully integrated the New Zealand data back to the early 1800s, once we were able to process it. Every major high debt episode for advanced countries since 1800 and the underlying data is included in our 2012 Journal of Economic Perspectives paper co-authored with Vincent Reinhart.

But surely the authors do not mean to insinuate that we manipulated the data to exaggerate our results. To what purpose would we “manipulate” the average growth rate for debt above 90% and show an average of -0.1% when in the same AER paper we report the median for 1946-2009 at 1.6%, and over the longer sample 1790-2009, we report an average of 1.7% and a median of 1.9%? (See table below.) Why, for that matter, would we provide all the data that we have gathered and used in our research over the years, documenting in detail multiple sources, to the public domain?

This brings us to the core conceptual issue, which Herndon, Ash and Pollin argue greatly biases our results. They argue that we use an “unconventional weighting of summary statistics.” In particular, for each bucket, we take average growth rates for each country and then take an average of the result. This seems perfectly natural to us, and hardly unconventional. We do not want to excessively weight Greece, for example, which has debt over 90% for 19 years in the 1946-2009 sample. The post-war Advanced Economy experience would quickly reduce to the experiences of Greece and Japan. Our approach has been followed in many other settings where one does not want to overly weight a small number of countries that may have their own peculiarities. Our approach is quite clear from table 1, which also gives the averages for each individual country.

Our 2012 Journal of Economic Perspectives paper, based on a much longer time period (1800-2011 versus the 1946-2009 Herndon et al. focus on), gives episode-by-episode data for each country, including growth rates and number of years, so the results are quite transparent. The problems with weighting long episodes much more heavily than short episodes, as Herndon et al. suggest, become much more apparent in the longer time series (an earlier version of which was also used in Table 1 of our original AER paper). As we noted in our initial comment yesterday upon just receiving the paper, our JEP paper anticipates most of the aggregation debate and defuses it by using a case-study approach. That is where this literature is now expanding.

So where does this leave matters on debt and growth? Do Herndon et al. get dramatically different results on the relatively short postwar sample they focus on? Not really. They, too, find lower growth associated with periods when debt is over 90% (they find 0-30 debt/GDP, 4.2% growth; 30-60, 3.1%; 60-90, 3.2%; over 90, 2.2%). Put differently, growth at high debt levels is a little more than half of the growth rate at the lowest levels of debt. They ignore the fact that these results are close to what we get in Table 1 of the AER paper they critique, and not far from the median results in Figure 2 despite its coding error. And they are not very different from what we report in our 2012 Journal of Economic Perspectives paper with Vincent Reinhart, where the average is 2.4% for high debt versus 3.5% for below 90%. The table below makes the similarity of all these comparisons clear:


                           RR AER (2010)       HAP (2013)
Debt/GDP                   Mean    Median      Mean    Median
0 to 30                    4.1     4.2         4.2     NA
30 to 60                   2.8     3.9         3.1     NA
60 to 90                   2.8     2.9         3.2     NA
Above 90                   -0.1    1.6         2.2     NA

RR AER (2010) (Table 1)
0 to 30                    3.7     3.9         NA      NA
30 to 60                   3.0     3.1         NA      NA
60 to 90                   3.4     2.8         NA      NA
Above 90                   1.7     1.9         NA      NA

RRR JEP (2012), 1800-2011  Mean
Below 90                   3.5
Above 90                   2.4

NA = not available

There is also the question of whether these growth effects can be economically large. Here it is very misleading to think of 1% growth differences without recognizing that the typical high debt episode lasts well over a decade (23 years on average in the full sample.)

It is utterly misleading to speak of a 1% growth differential that lasts 10-25 years as small. If a country grows at 1% below trend for 23 years, output will be roughly 25% below trend at the end of the period, with massive cumulative effects.
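The cumulative arithmetic behind this claim can be checked with a short sketch. It assumes the gap compounds annually and that output is compared with a trend path growing at a constant rate; the function name and the 3% trend rate are illustrative assumptions, not figures from either paper.

```python
def shortfall_vs_trend(gap, years, trend=0.0):
    """Fraction by which output ends up below trend after `years`
    of growing `gap` (e.g. 0.01 for one percentage point) slower than trend.

    Assumes annual compounding and a constant trend growth rate.
    """
    ratio = ((1 + trend - gap) / (1 + trend)) ** years
    return 1 - ratio


# One percentage point below trend for 23 years, with a flat trend
# and with a hypothetical 3% trend growth rate.
print(round(shortfall_vs_trend(0.01, 23), 3))
print(round(shortfall_vs_trend(0.01, 23, trend=0.03), 3))
```

Under these assumptions the compounded shortfall comes out at roughly one fifth of trend output, somewhat below the simple sum of 23 annual one-point gaps; the exact figure depends on the trend rate assumed.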

Looking to the reaction to this comment in the blogosphere, we note that this is not the first time our academic work is seen as pandering to a political view. What is quite remarkable is that this claim has spanned polar opposites! This time, we are charged with misconstruing analysis to support austerity. Only a few months ago, our findings on slow recoveries from financial crises were accused of providing a rationale for the deep recession and weak economy the Obama administration has faced since 2007.

Herndon, Ash and Pollin have written a useful paper, finding a significant mistake in one of our figures, and helped reconcile why one result is out of line with all the other results in our original paper as well as ones presented in our later research, not to mention those they present in their helpful comment. Clearly more research is needed on debt and growth and we welcome all efforts; it is a very exciting area. We now have debt data for a larger number of countries than the original sample and long time periods that allow this research to press forward.

  1. Matt
    April 18, 2013 at 12:53 pm

    The main problem is not in whether the two variables are related or not. Even if there is a correlation, the main question is which factor causes which. Does high debt cause low growth? Does low growth cause high debt? Is there another factor or are there many other confounding factors, causing both low growth AND high debt? Without some proof or at least a plausible explanation for a causal relation, these (very weak) correlations are totally meaningless.

    Also, it seems pretty silly to lump together so many completely different countries trying to find a general relationship.

  2. April 18, 2013 at 1:32 pm

    Applied econometric analysis needs to prove statistical significance, by examining the distribution of errors between actual values and model values. Just looking at the diagrams in the papers, I suspect the purported empirical relationships are just noise. This impression is supported by the dramatic change in conclusion resulting from adding or subtracting a small subset of data points. It should be obligatory for authors to prove in their papers that they are not just presenting noise.

  3. charlie
    April 19, 2013 at 12:53 am

    As a statistician who reviewed scientists' study plans and papers, I often found they had used statistical techniques which were advanced, but really did not apply to the issue they were supposed to be addressing … I was always suspicious of economists' choice of tests. Ecologists were less likely to screw up their choice of test …

  4. mil
    April 19, 2013 at 9:33 pm

    Merijn, have you read this comment on RR defense?
    It’s most damning:

    But those statements are pretty misleading themselves. That’s because HAP’s key point isn’t that the decline from 3.2% growth to 2.2% growth for countries over 90% debt-to-GDP is small. It isn’t small. If your country grew at 1% less a year, it would indeed, as R+R say, be a good deal poorer two decades hence.

    No, HAP’s point is that 1% is statistically insignificant. That, given the statistical margins of error that arise when you do a calculation like this with a limited set of data, “[d]ifferences in average GDP growth in the categories 30-60 percent, 60-90 percent, and 90-120 percent cannot be statistically distinguished.”

    If that’s true, you can’t even really speak of a 1% growth difference for high and low debt countries. Why? Because statistically speaking there is not enough evidence to suggest it really exists, even with average growth data that the economists collected.

    • Nell
      April 20, 2013 at 8:57 pm

      Good point. Essentially, working from the data Reinhart and Rogoff used, it is impossible to say whether or not high or low government debt has an impact on growth. One would hope that we could all stop focusing on public debt now. My sense is that it goes up and it goes down depending on how well the private sector is doing, particularly in modern economies with commitments to support the unemployed during a downturn. What economists and politicians should be focusing on is how to get people back to work, how to restore some equity in the distribution of income and how to put a great big massive leash on the financial sector.

  5. Matt
    April 21, 2013 at 6:15 am

    Do people even understand the difference between correlation and causation? Even in the responses here, people keep on discussing the (possible) errors being made in the statistics. But the point is: it does not matter! Even if there was a significant, very strong correlation, it would prove nothing.

    If there is a high correlation, it could mean very different things:
    1) High debt causes low growth
    2) Low growth causes high debt
    3) Another factor, or other factors, cause both low growth and high debt
    4) Coincidence

    Again, correlation is completely meaningless on its own. Pick any random data set and you will find correlations. I could ask housekeepers about their political views and at the same time measure seasonal variation in breeding patterns of the birds, compare the data sets and find correlations. Did I just prove that breeding patterns of birds influence political views of housekeepers?

  6. April 21, 2013 at 12:05 pm

    Among the economists I met in business and government, at least 9 out of 10 use Excel spreadsheets and statistical packages. Excel coding is error-prone, because it is difficult to run tests to verify every aspect of the program logic. Any modification such as adding more data could potentially introduce errors and be hazardous to the program. Excel was originally meant for doing simple things such as tax returns and was never meant to replace real computer programs, which are required for serious work.

    Yet the senior managers in many large organisations are so ignorant that they allow serious calculations to be done on Excel spreadsheets. One of National Australia Bank’s home mortgage books in the UK was run on an Excel spreadsheet. Errors in some cells for interest rates eventually caused the closure of that operation due to heavy losses sustained.

    When I was in school many of the students who were relatively weak in mathematics went to the C classes and studied business, commerce and economics. If this is a general reflection, this may explain the low level of quantitative skills in professional and academic economists. Such people appear to be running the world.

  7. davetaylor1
    April 21, 2013 at 3:59 pm

    Glad to see this making the news:


    • April 22, 2013 at 1:22 am

      Apart from the Homeside debacle in the US (mentioned above) with NAB losing $4billion, there are many other examples:


      Obviously, many, many more such errors would be hushed up in business and government, simply to protect reputation and they can give reasons of “commercial in confidence”, “security” etc. for not revealing their calculations.

      The same would probably be true of most academic research in applied econometrics, where replication, checking, confirmation or falsification are relatively rare. Direct disputes are avoided and error-pinpointing papers like HAP are very rare, because HAP would not have been able to publish in a “top” journal like AER. Having a good reason to go public helps.

      There is empirical research showing systematic curtailment and decline of critical commentary in economic journals for the past 50 years. The result of suppression is there is no real consensus on what the economic data are actually revealing, particularly in relation to theory.
