De gustibus non computandum: or, economics needs a divorce
Most people believe that there is something called “economics.” But this is just not so.
When we use the word “economics,” we are conflating two completely distinct disciplines. Worse, at most one of these disciplines is right—each despises and condemns the other. It’s as if English had one word stellatry, which meant both astronomy and astrology.
Our first discipline is literary economics. Literary economics is what the word economics meant in English until the 1870s or so. It is Carlyle’s dismal science. It was also practiced in the 20th century, under the name Austrian economics, by figures such as Mises, Rothbard and Hazlitt. Our second is quantitative economics. Quantitative economics was invented in the late 19th century and early 20th century, by figures such as Walras, Marshall, Fisher, Keynes and Friedman. It is also practiced today, under the name economics.
Observe, for a moment, the suspicious evolution of this terminology. Astrology and astronomy have a similar temporal relationship—as do alchemy and chemistry. I.e.: astronomy replaced astrology, and made it clear that its predecessor was nonsense. Chemistry replaced alchemy, and made it clear that its predecessor was nonsense.
But when Robert Boyle replaced alchemy with chemistry, he chose a new name to make it clear that he was separating the sheep from the goats and classifying himself among the former. Astronomy is separate from astrology for much the same reason, and in much the same way.
Whereas in economics it’s the other way around. The new name has replaced the old one in situ, forcing its predecessor to decamp to a label which, like all labels, was originally pejorative. It’s as if chemistry had decided that it was the only true alchemy, and forced the original alchemists to rebrand their field as, I don’t know, Swedish alchemy.
Of course, this doesn’t prove anything at all. But isn’t it slightly weird? You’d think that if you discover that Field A, which has been taught in all the best schools and universities since Jesus was a little boy, was so misguided in its methodology that it is useless to continue its work, and instead people should study the far superior Field B, you’d call your glorious new B a B, rather than insisting that you had discovered the one true A.
I’d say this anomaly is, if nothing else, a reason to investigate the obvious alternative that this question suggests. Which is that it’s actually the new field, Field B, which is a crock. And which has chosen to hitch a ride on the good name of Field A, devouring it in classic parasitic style. In other words, it is actually the Swedish alchemists who are the real chemists, and whose field has been invaded and annexed by a horde of canting, zodiac-wielding transmutationists. Oops.
You may or may not agree with this proposition. But it is surely prudent to consider it fairly. And the only way to do so is to hold the disputed marital property, economics, in escrow, leaving the respondents with their own separate and equal names. Ideally, the noun would be estopped from both parties, giving us not literary economics but something like econography, and not quantitative economics but something like economodeling. (Or perhaps, if you want to be nasty, econogy—practitioners, econogers.) However, some may be too conservative for these bold linguistic innovations.
Let’s briefly establish the distinction between these fields. It should be obvious that whatever their respective merits, they are different things and should not be conflated under one name. To indulge in a little Procrustean generalization:
The method of quantitative economics—including both econometrics and neoclassical macroeconomics—is to construct mathematical models of economic systems, i.e., systems of independent, utility-maximizing agents. The purpose of quantitative economics is to predict the behavior of these systems, so that central planners can manage them intelligently.
The method of literary economics is to reason clearly and deductively in English about the behavior of economic agents. The purpose of literary economics is to construct and convey an intuitive understanding of causal relationships in economic systems.
Clearly, these fields have nothing in common, either in methodology or purpose. It is true that some quantitative models can be explained in literary terms. However, they cannot be justified in literary terms. And if they can, no quantitative methods are necessary. Indeed, successful quantitative methods often hold up quite poorly when judged by literary standards. Two good examples of this phenomenon are Henry Hazlitt’s Failure of the New Economics (a line-by-line response to Keynes’ General Theory) and Murray Rothbard’s abusive treatment of Irving Fisher’s equation of exchange.
And by the standards of quantitative economics, which considers itself a predictive, falsifiable, inductive science, literary economics is simply a nothing. At best, a popularization. It makes no testable predictions. Why anyone would study it in the 21st century is a mystery.
Ergo: there is no possibility of reconciliation. Papers should issue. Custody of that little brat, economics, should be delayed for further consideration. Perhaps he can be sent to Antarctica and eaten by a leopard seal.
Consideration of the merits and demerits of contemporary literary, or Austrian, economics is off-topic for this brief, off-week UR. (I know I promised not to post again until the 28th. But your host is not really known for keeping his promises.) Rather, I want to use the tools of literary economics to explain why quantitative macroeconomics is, to put it quite bluntly, fraudulent. The argument that economics needs a divorce is not dependent on this point, but it certainly ought to help seal the deal.
The point is unusually accessible because of an article this week in the NYT Magazine, on the economist Nouriel Roubini. Brad Setser, Roubini’s old sidekick and now at the CFR, expands. (I used to post a lot of comments on Setser’s blog, but that was before my daughter showed up.)
Roubini and Setser are certainly not Austrians. Which makes their reinvention of literary economics all the more interesting. The great lie, after all, is always one cohesive monolith, whereas great truths seep in from every pore. Roubini is trained in modeling, whereas Setser is not. Personally, I prefer Setser. But Roubini is more flamboyant and more credentialed, and he gets the press. As the Times flack puts it:
Roubini’s work was distinguished not only by his conclusions but also by his approach. By making extensive use of transnational comparisons and historical analogies, he was employing a subjective, nontechnical framework, the sort embraced by popular economists like the Times Op-Ed columnist Paul Krugman and Joseph Stiglitz in order to reach a nonacademic audience. Roubini takes pains to note that he remains a rigorous scholarly economist—“When I weigh evidence,” he told me, “I’m drawing on 20 years of accumulated experience using models” — but his approach is not the contemporary scholarly ideal in which an economist builds a model in order to constrain his subjective impressions and abide by a discrete set of data.
Indeed. Setser puts it this way:
But the outcome of most such models seem determined by the assumptions used to create the model more than anything else. […] I am biased, but I think it is possible to be analytically rigorous (and to use real data to inform your conclusions) even in the absence of a formal model.
And the commenters, as usual, repeat and reinforce the point in various interesting ways. (If you want to participate in a productive discussion about international monetary economics, Setser’s blog is still the place to go.)
Moreover, even leading quantitative economists are happy to admit that in the century or so of its existence, quantitative economics has displayed little if any predictive power. For example, consider this cri de coeur from Greg Mankiw, or this European reminder that none of the DSGE models in use every day at the highest policy level includes anything like a plausible theory of what is surely the most salient issue in economic prediction—the business cycle. And Axel Leijonhufvud is brutally direct:
I conclude that dynamic stochastic general equilibrium theory has proven itself an intellectually bankrupt enterprise.
Of course, I’m sure many other economodelers disagree with this epitaph. If nothing else, few scholars display such Diogenesian honesty as to pronounce the death of their own field. Nor are Mankiw and Leijonhufvud really saying that it’s time for all the economists in the world to clean out their cubicles and consider new careers in the lawn-care industry. I’m sure they are quite confident that, with redoubled effort, the profession can redeem itself. But, considering the age of the endeavor, the rest of us should feel free to be skeptical.
One of the most interesting intellectual phenomena of today is the fact that we can read the words of professionals whose job it is to actually predict macroeconomic systems. I.e.: fund managers. My favorites are three blogs I’ve mentioned before: Macro Man, Cassandra Does Tokyo, and CT Oysters. Frankly, these guys are just mensches; anyone who reads their blogs for five minutes can see it, and if I had any money I would find a way to give it to them.
The first thing you notice about financial professionals is that they have almost no interest in the work of academic quantitative economists. (In fact there is a species of fund manager called a quant, but quantitative financial engineering has essentially nothing to do with the quantitative economics of Marshall, Keynes, Fisher and the like.) I discovered this world through Brad Setser’s blog, and the likes of Setser and Roubini do interact with the pros—giving them considerable respect, indeed, which says a lot for Setser and Roubini.
But as we’ve seen, they are exceptions. In general, the relationship between the financial industry and the academic field of quantitative economics strikes me as quite comparable to the relationship between the software industry and the academic field of computer science. I.e.: what relationship?
When you look at the methodology of the world’s Macro Men—this post is a fine example—they seem to operate mainly on the basis of their noses. Their specialty is integrating a wide variety of perspectives and arguments and judging the soup by intuitive gut feel. Nothing farther from the work of Keynes and Fisher, or even from the methodology of the 21st-century central banker (who is often looking at the same problem from the other side) could be imagined. They are not practicing strict Austrian economics, although they are often influenced by it. But their approach is certainly in the literary category. Yet another strikeout for the econogers.
What I want to explain today is why quantitative economics doesn’t work. My refutation is certainly not the only possible one—you can find a whole bushel full here. However, since only one refutation is required to refute anything, I have chosen one I find particularly ripe.
The task of econogy is actually remarkably similar to that of astrology. The astrologer maintains a model whose inputs are past measurements (star and planet positions, comets, eclipses), and whose outputs are future events (the death of a prince, the winner of a battle, the sex of a baby). Using signs, symbols and simulations, he maps the one to the other. His results are dispatched directly to the king, who uses them to set monetary policy.
Why do we know astrology doesn’t work? We know it in two ways: inductively and deductively. Inductively, we see no past track record of successful astrological prediction—just as Professor Mankiw sees no past track record of successful econogical prediction. But this tells us very little, really. It could just mean that past astrologers were bad astrologers. However, radical new theories of astrology may indeed allow us to predict the deaths of princes, etc.
The deductive argument is far more brutal. We know that astrology doesn’t work because we know that astrology cannot work. We know that astrology cannot work because we know physics, and we know that there is no plausible mechanism by which comets, planets, etc., can cause the deaths of princes. (Unless the comet actually hits the prince—a contingency which boring, zodiacless astronomers are perfectly competent to predict.)
Similarly, if you are an American, you’ve probably had the experience of watching a baseball game and hearing the commentator produce some egregious abuse of statistics, such as “Rodriguez bats .586 against Clemens on prime-numbered Tuesdays.” You know that this number is (a) not statistically significant, and even if it were it would still be (b) a product of data dredging, i.e., overfitting. And you know this not just because sports commentators are nincompoops, but because you know that the numerical properties of the calendar cannot possibly affect the batting of Rodriguez.
Is there an equivalent for econogy? As it so happens, there is.
Here is a question: how much better a car is a (new) 2008 Mustang than a (new) 1988 Mustang? Is it twice as good? Three times as good? 5.7 times as good? 1.32 times as good? How much better a computer is the MacBook Pro than the Mac Plus? 75 times as good? 1082.1 times as good? And who was a better Bond, Sean Connery or Daniel Craig?
Surely the form of the question is nonsensical. There is no objective way to quantify the quality of a car, or a computer, or a Bond. We might as well ask: what temperature is love? What is the weight of March? Which is better, a cat or a dog? We can express all of these as numbers. We can even produce formulas which output these numbers. But we cannot describe them as measurements.
Actually, however, econogers (econometricians, to be precise) solve these problems on a regular basis. They do so when computing the mysterious quantity known to econogers—and to readers of fine newspapers everywhere—as inflation.
The implicit proposition assumed by the word inflation, at least in its 20th-century meaning, is that changes in the value of money over time can be measured quantitatively. The 20th-century concept of inflation is largely due to one Irving Fisher, who is surely one of the century’s most distinguished charlatans. Professor Fisher’s work is online, and certainly worth reading, if only from a forensic point of view.
Consider the phrase “value of money.” What is the “value” of money? Money, at least your modern fiat currency, has no particular direct utility, unless you need to snort a line of coke or find yourself “caught short.” We desire money because we desire the things that money can be exchanged for, such as 2008 Mustangs, plasma TVs, crude oil, etc. Thus we speak of the “purchasing power” of money.
To be more concrete, let’s look at a specific currency—the dollar. At any moment, we observe a vast set of prices, which are exchange rates between the dollar and other goods—2008 Mustangs, euros, tickets to Bond films, and so on.
These prices are measurements—hard numbers. For any time T at which the market is open, we have a precise measurement of the dollar-to-euro ratio. We also have a precise measurement of the dollar-to-crude-oil ratio, and even a fairly precise one of the dollar-to-2008-Mustang ratio. (Though the bid-ask spread on this last is quite wide.)
If we want to speak clearly, however, we cannot speak of the “value of the dollar.” We cannot say “the dollar is strong” or “the dollar is weak” or even “the dollar rose today.” We can say all these things about the US Dollar Index. But we could construct another index which was a weighted average of the exchange rate from dollars to zinc, dollars to Malaysian ringgit, and dollars to Bond-film tickets, and this number would have just as much (and just as little) right to be described as “the dollar” as does the aforementioned index (which is usually what people mean). But our two numbers might well be quite different.
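For the numerically inclined, here is a minimal Python sketch of the point. Every price relative and weight in it is invented for illustration, and `weighted_index` is a hypothetical helper using a plain weighted arithmetic mean; it is not the formula of any published dollar index.

```python
def weighted_index(relatives, weights):
    """Weighted arithmetic mean of price relatives.

    A relative of 1.10 means one dollar buys 10% more of that good
    than it did on some (hypothetical) base date.
    """
    total = sum(weights.values())
    return sum(relatives[name] * weights[name] for name in weights) / total


# Invented numbers, for illustration only.
relatives = {
    "euro": 0.90,
    "yen": 0.95,
    "zinc": 1.30,
    "ringgit": 1.05,
    "bond_ticket": 0.80,
}

# Two baskets, each with weights somebody simply chose.
currency_basket = {"euro": 0.6, "yen": 0.4}
odd_basket = {"zinc": 0.5, "ringgit": 0.3, "bond_ticket": 0.2}

index_a = weighted_index(relatives, currency_basket)
index_b = weighted_index(relatives, odd_basket)

print(f"currency basket:          {index_a:.3f}")
print(f"zinc/ringgit/Bond basket: {index_b:.3f}")
```

With these made-up inputs the first basket says "the dollar" fell 8% and the second says it rose 12.5%. Neither answer is wrong, because the only thing that differs is the basket, and the basket is a choice.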
When we speak of the dollar, we have surrendered our souls to the fallacy of objective value. We have constructed a unitless number out of a set of measurements in incompatible units. The result is a mathematical solecism, like saying “my car is 60 miles fast.” There is no such thing as value, only price, and every price has both numerator and denominator units.
But saying “the dollar is strong” is a peccadillo next to the flagitious concept of inflation. The Irving Fishers of the world claim, quite seriously, to be able to measure the “value” of 2008 dollars in 1988 dollars. Let’s look a little more closely at how this feat is accomplished.
Clearly, this number is not a price. There is no exchange rate between 1988 dollars and 2008 dollars, as there is between 2008 dollars and 2008 euros. Due to our lack of reliable and effective time machines, there is no market on which 1988 dollars and 2008 dollars can be exchanged. (We can measure the 20-year interest rate in 1988, but by definition this cannot be affected by any events after 1988, unless 1988 has a time machine which can observe the future.)
Time is confusing, so we can think of the problem in another way. Suppose we build a powerful telescope by which we observe economic activity on Alpha Centauri. It so happens that the Centaurians call their currency the “dollar.” What is the exchange rate between Earth dollars and Centauri dollars? There is none, because Earth and Centauri cannot exchange any goods, monetary or otherwise.
However, we note that some goods are exactly the same on Earth and on Centauri—the laws of physics being equal. Earth zinc and Centauri zinc, for example, are both zinc. Earth water and Centauri water are both water. So if we measure the prices of these invariant goods in Earth dollars and Centauri dollars, we obtain—presto—Professor Fisher’s “commodity dollar.”
But there are two problems with this approach. One, we have no reason to expect the Earth-Centauri price ratio in zinc to be the same as the ratio in water, or methanol, or gold, or any other interstellar commodity standard. Two, while we can average them, we have no objective procedure with which to define the weight of each commodity in our average. By mass? By energy content? By number of protons? By total price of annual product on Earth? By total price of annual product on Centauri? Each method produces a different result.
And three, we have no reason at all to measure goods that are identical on both Earth and Centauri (“commodities”) and ignore items that vary, such as cars, movies, pizza, etc. An Earth Mustang costs E$30,000; a Centauri Mustang costs C$10,000. This suggests that 1 C$ equals 3 E$. On the other hand, Centaurians weigh 500 pounds and have eight legs, which requires a much heavier grade of suspension and a completely different style of seat…
Human anatomy in 1988 is quite similar to its descendant in 2008. Otherwise, the relationship between 1988 and 2008, at least as far as prices are concerned, is precisely identical to the relationship between Earth and Centauri: 1988 and 2008 cannot trade, and consumers in each have a completely different set of preferences and a completely different set of goods they can purchase to satisfy those preferences.
The diligent gnomes at the BLS, of course, have thought of all these things. For example, they have no problem in telling you how much better a car a 2008 Mustang is than a 1988 Mustang. Every year, for a month or two, both the new year’s model and the old’s are on sale at the same time. In the fall of 1988, you could buy either a 1988 Mustang or a 1989 Mustang. The latter was more expensive—say, 5% more expensive. Therefore, it must be 5% better, dollars to dollars. Compute this number every year, multiply it out, and we have our ratio. Here is a fine illustration of the sort of numerology involved. (Yes, “Fisher” is our dear old Irving, he of the “permanently high plateau,” the eugenics, the prohibition and the intestinal butchery.)
This gives us an objective procedure to calculate the quality improvement in Mustangs. Here is another objective procedure: subtract 1900 from the year, and divide. This tells us that the 2008 Mustang is 108/88 times as good as the 1988 Mustang, i.e., about 23% better. Which of these procedures produces a number closer to the actual ratio of the value of a 2008 Mustang to the value of a 1988 Mustang? Obviously, to evaluate them, we would have to know said ratio. Back to square one.
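The two procedures are easy to put side by side. The sketch below is only an illustration: the flat 5% premium for each of the 20 model-year transitions is an assumption borrowed from the "say, 5%" above, the function names are hypothetical, and nothing here claims to reproduce the BLS's actual method.

```python
from math import prod


def chained_quality_ratio(overlap_premiums):
    """Chain the price premiums observed while consecutive model years
    were on sale together: (1 + p1) * (1 + p2) * ..."""
    return prod(1.0 + p for p in overlap_premiums)


def year_ratio(new_year, old_year, base=1900):
    """The deliberately arbitrary rival: subtract 1900 and divide."""
    return (new_year - base) / (old_year - base)


# Assumption: the newer model cost 5% more in every one of the
# 20 transitions from the 1988 model to the 2008 model.
premiums = [0.05] * 20

chained = chained_quality_ratio(premiums)
arbitrary = year_ratio(2008, 1988)

print(f"chained overlap premiums: {chained:.2f}x")
print(f"subtract 1900 and divide: {arbitrary:.2f}x")
```

Under that assumption the chained procedure makes the 2008 car about 2.65 times the 1988 car, while the year-arithmetic procedure says about 1.23 times. Both are perfectly repeatable, and nothing in either calculation tells you which one to believe.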
Do you still believe that the ratio between 1988 Mustangs and 2008 Mustangs, or the ratio between 1988 dollars and 2008 dollars, or the ratio between Earth dollars and Centauri dollars can be calculated objectively, and should be assigned to the same category as actual numerical measurements, such as the ratio between 2008 dollars and 2008 euros?
If so, I recommend a visit to John Williams’ Shadow Government Statistics. Williams combines the formulas used in the Carter era with the data sets of 2008 to compute a rather different set of numbers. A look at the graph tells all. As Williams puts it:
Inflation, as reported by the Consumer Price Index (CPI) is understated by roughly 7% per year. This is due to recent redefinitions of the series as well as to flawed methodologies, particularly adjustments to price measures for quality changes.
Needless to say, 7% is a rather huge number in the world of inflation statistics.
Is Williams right? Is he wrong? I think it goes beyond this. The problem is that Williams is operating under exactly the same assumption as those he criticizes: that there is an objectively calculable number called inflation. No doubt for perfectly justifiable reasons, he favors one (old) method of calculating this number, whereas the government’s experts (also for perfectly justifiable reasons) favor another (new) method. Presumably there are many other methods which did not find favor either with the 1978 gnomes or the 2008 gnomes.
All of these figures are stuffed to the gills with subjective fudge. Whether the gnomes evaluate the products and weight the averages using their own personal taste (don’t forget the Bond films—entertainment is a significant percentage of consumer spending), or whether their subjective taste is kept at the level of algorithmic choice, the problem is the same: different gnomes, different numbers.
Of course, there is nothing wrong with subjective estimation for purposes of illustration. If an auto reviewer says something like “the 2008 Mustang is three times the car of its 1988 ancestor,” one might find it a little odd. If the number is not three but 3.47, a troubling Aspergery feel begins to emerge. But this is the reviewer’s judgment, and judgment is why we read reviews. As a reader one may prefer adjectives to numbers, but the numbers are certainly no less expressive.
And this even holds true for “inflation.” A historian may find it useful, in explaining the economics of the Roman Empire, to postulate an exchange rate between modern dollars and the denarius of Diocletian. Wheat prices, or labor costs, or precious metals, or any commodity or basket thereof, so long as the same good is obtainable in both 308 and 2008, may serve.
Illustrations, however, are one thing. Equations are another.
Quantitative economics can be refuted simply by observing that almost all its models rely on inflation-adjusted numbers. Figures such as “real interest rates”—i.e., actual or “nominal” interest rates minus “inflation”—are everywhere. Other headline statistics, such as “growth,” are calculated using the same kinds of spurious pseudo-prices. (And others, such as “unemployment,” are consequences of different subjective choices.) Needless to say, an inflation discrepancy on the order of 7% is enough to throw most of your macroeconomic models into a completely different galaxy.
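A toy calculation shows the size of the problem. The nominal rate and the official inflation figure below are hypothetical, and I am reading the Williams gap as seven percentage points per year, which is an assumption about his claim rather than a number taken from his data.

```python
def real_rate(nominal, inflation):
    """'Real' rate as defined above: nominal rate minus 'inflation'."""
    return nominal - inflation


nominal = 0.05                    # hypothetical nominal interest rate
official = 0.03                   # hypothetical official inflation figure
alternative = official + 0.07     # assumed seven-point upward revision

r_official = real_rate(nominal, official)
r_alternative = real_rate(nominal, alternative)

print(f"real rate, official gnomes: {r_official:+.2%}")
print(f"real rate, rival gnomes:    {r_alternative:+.2%}")
```

With these inputs the same nominal rate comes out as a real rate of +2% under one set of gnomes and -5% under the other. Any model that takes the real rate as an input is now answering two different questions, depending on who did the deflating.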
The key observation is that the relationship between the identity of the gnomes who produce the “inflation” number, and the prediction of the model (such as “unemployment”), is in exactly the same class as the relationship between the position of Saturn and the batting of Rodriguez. The latter cannot possibly depend on the former.
In other words, there is no conceivable causal mechanism by which the subjective taste of the gnomes can affect the variable which the model is designed to predict. It is just as erroneous to introduce the BLS’s opinion of antilock brakes (whether produced by the judicious conclusion of grizzled auto reviewers, or by an algorithm which is simply encoding the details of Ford’s model-year transition strategy) into a model designed to predict unemployment, as it would be to introduce the phase of the moon. Neither the phase of the moon nor the stopping power of antilock brakes can either cause or cure unemployment, and any formula which implies otherwise can be discarded without any empirical evidence whatsoever.
Thus we can say: de gustibus non computandum. In matters of taste there is no computing. It is not necessary for us to examine the empirical results of models which purport to predict future events from a mixture of price measurements and subjective quality assessments. We know deductively that no such model can be accurate. We cannot be surprised to learn inductively that they do not, in fact, appear to work. Deduction trumps induction, though it’s always reassuring when the two agree.
This analysis leaves two questions open.
One, we have demonstrated the worthlessness of economodeling, but we have not demonstrated the worth of econography. Perhaps both are worthless. In that case, do they need a divorce? Maybe they’re more like one of those couples which deserve each other.
Two, we have failed to explain why, if economodeling is such obvious nonsense, it became and has remained so popular.
Perhaps these questions call for a sequel. The next UR post will appear on August 28, 2008. It will probably be about something completely different.