Laurence J. Peter quotation on economists

Quotation #1233 from Michael Moncur’s (Cynical) Quotations:

An economist is an expert who will know tomorrow why the things he predicted yesterday didn’t happen today.
Laurence J. Peter
US educator & writer (1919 – 1988)

As I say all too often, economists cannot predict. Beware of economists who say they can; they have a bridge to sell you, a beautiful bridge in Brooklyn. Economics needs several more centuries of research before it can develop decent predictive capabilities, and I wouldn’t bet it will even then.

(Once in a while I am really glad I follow “quotes of the day”.)

More on Type M errors in statistical analyses

A bit earlier, I was intrigued by a blog post by Columbia Statistics and Political Science professor Andrew Gelman about “Type M” errors in statistical analyses (link).  A Type M error is an overestimation of the strength of the relationship between two variables and such an error is caused by having too small a sample to draw upon.

I can try to explain this to you now because I have read “Of Beauty, Sex and Power” by Andrew Gelman and David Weakliem (American Scientist, Volume 97, 310–316, 2009). I found the text of the article by following a link in the Gelman post I quoted earlier. I think I now understand a little of what’s going on here, and I really enjoyed reading the article.

Suppose there are two variables I care to study with an eye to whether they are related. Perhaps I have a theory, based on a hypothesis from evolutionary psychology, that “Beautiful parents have more daughters”. (In fact, Gelman and Weakliem wrote their article after being prompted by a paper with this very title, and some other papers by the same author, published in the prestigious Journal of Theoretical Biology.) Let’s call these variables X and Y (behold the poverty of my imagination).

Let’s also suppose that there is in fact a relationship between these variables, but one very small in magnitude. As a researcher, I do not know this relationship, but I want to discover it and make my name based on the discovery. What do I do then? I go after data sets that contain variables X and Y and try some statistical estimation techniques, looking for a number to indicate how strongly the variables are related. Classical statistical methodology tells me to estimate not only that number, but also an interval around my estimate that gives an idea of the error of my estimation. This is called a “confidence interval”. (Gelman and Weakliem also explain how this argument goes if I were to use Bayesian estimation techniques, for those in my vast* readership who know what these are.) Roughly speaking, if I have done my stats well, and do the same estimation work with 100 different data sets, then the true value of the number I am after will be in 95 of the 100 confidence intervals that I will find.
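To make the 95-of-100 claim concrete, here is a small simulation sketch (all the specific numbers, such as the true coefficient, the noise level, and the sample size, are made up for illustration): we run 100 hypothetical studies and count how many of the resulting 95 percent confidence intervals actually contain the true value.

```python
import random
import statistics

random.seed(1)
TRUE_COEF = 0.3    # the (small) true relationship, made up for illustration
N_STUDIES = 100    # independent hypothetical data sets
N_OBS = 500        # observations per data set

covered = 0
for _ in range(N_STUDIES):
    # each "study" estimates the coefficient from noisy observations
    sample = [random.gauss(TRUE_COEF, 5.0) for _ in range(N_OBS)]
    est = statistics.mean(sample)
    se = statistics.stdev(sample) / N_OBS ** 0.5
    # the classical 95 percent confidence interval
    low, high = est - 1.96 * se, est + 1.96 * se
    if low <= TRUE_COEF <= high:
        covered += 1

print(covered, "of", N_STUDIES, "intervals contain the true coefficient")
```

On a typical run this prints a number close to 95, which is exactly the coverage property described above.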

But here’s the rub. What I really am testing, if I am doing classical statistics, is whether the number I want to estimate can be shown (with 95 percent confidence) to be different from some a priori estimate (the “null hypothesis”). For a relationship that is very small, presumably any previous evidence will have shown it is small, and perhaps would have shown conflicting results about the sign of the relationship: some studies would have found it negative, some positive. So I should have as my null hypothesis that X and Y are unrelated.

Now let’s say that the relationship coefficient I am trying to estimate is in fact equal to 0. I do not know this, of course. If I do 100 independent studies to estimate this coefficient, then I can expect 5 of them to indicate that the coefficient is statistically significantly different from zero; all 5 would be misleading. But concluding that the correlation I want to find is in fact not there is not exciting, and will get me no fame. If I find one of the erroneous “significant” results, on the other hand, I will send my study to a prestigious journal, talk to some reporters, and maybe even write a book about it. All the noise thus generated would be good for my name recognition. But I would still be wrong, having infinitely overestimated the coefficient of interest.
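Here is a quick sketch of that 5-in-100 arithmetic (the standard error is an illustrative number of my own): with a true coefficient of exactly zero, about 5 percent of studies will still clear the 95 percent significance bar.

```python
import random

random.seed(42)
SE = 4.3            # illustrative standard error of each study's estimate
Z = 1.96            # the 5 percent two-sided significance cutoff
N_STUDIES = 10_000  # many hypothetical studies of a truly zero coefficient

false_hits = sum(
    1 for _ in range(N_STUDIES)
    if abs(random.gauss(0.0, SE)) > Z * SE  # looks "statistically significant"
)
print(f"{false_hits / N_STUDIES:.1%} of the studies look significant")
```

The printed share hovers around 5 percent, and every single one of those “findings” is spurious.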

The same kind of error could arise if the true relationship were in fact positive. Say the coefficient was not 0 but instead 0.3, and my data allowed me an estimate with a standard error of 4.3 percent. Then I would have a 3 percent probability of estimating a positive coefficient that would appear statistically significant and, perhaps worse, a 2 percent probability of estimating a _negative_ coefficient that would appear statistically significant. I could even be strongly convinced, then, of the wrong sign of my coefficient! Whichever of these two errors I fall into, the estimated coefficient will be more than an order of magnitude larger in absolute value than the true coefficient. This is why we talk about Type M errors; the M stands for magnitude. (We also saw a Type S error in this example, when the sign of the estimated coefficient came out wrong; the S stands for sign.)
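The 3 percent and 2 percent figures can be checked by simulation. This sketch uses the same illustrative numbers (true coefficient 0.3, standard error 4.3) and also shows how exaggerated any “significant” estimate must be:

```python
import random

random.seed(0)
TRUE, SE, Z = 0.3, 4.3, 1.96  # the illustrative numbers from the paragraph above
TRIALS = 1_000_000

sig_pos = sig_neg = 0
exaggeration = []
for _ in range(TRIALS):
    est = random.gauss(TRUE, SE)      # one study's estimate
    if est > Z * SE:                  # significant with the right sign (Type M)
        sig_pos += 1
        exaggeration.append(est / TRUE)
    elif est < -Z * SE:               # significant with the wrong sign (Type S)
        sig_neg += 1

print(f"significant and positive: {sig_pos / TRIALS:.1%}")  # about 3 percent
print(f"significant and negative: {sig_neg / TRIALS:.1%}")  # about 2 percent
print(f"smallest exaggeration among the 'positive' findings: "
      f"{min(exaggeration):.0f}x the true value")
```

Note that even the *luckiest* significant estimate overstates the true coefficient by a factor of nearly thirty: the significance cutoff, 1.96 × 4.3 ≈ 8.4, is itself an order of magnitude larger than 0.3.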

Is there an escape from this trap? More data would help expose my error. The more data I base my estimation on, the greater the so-called “statistical power” of my testing procedure, and the less likely I am to fall into error. For variables with small but real correlations, as often happens in the medical literature, the data sets frequently contain millions of observations. Sophisticated scientists understand that you need a lot of power (a lot of data) to tease out small effects.
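As a sketch of how power grows with data, here is a standard two-sided z-test power calculation (the effect size and noise level are made-up illustrative numbers, not from any of the studies discussed):

```python
from statistics import NormalDist

def power(effect, sd, n, alpha=0.05):
    """Power of a two-sided z-test for a mean effect of the given size."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    se = sd / n ** 0.5
    # probability the estimate lands beyond either significance cutoff
    return nd.cdf(-z - effect / se) + 1 - nd.cdf(z - effect / se)

# a small effect drowned in a lot of noise needs a lot of observations
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: power = {power(0.3, 10.0, n):.1%}")
```

With 100 observations the test almost never detects this effect; with a million observations it almost always does, which is why teasing out small effects requires the huge data sets of the medical literature.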

What can we conclude from this? Besides the obvious value of skepticism when assessing any statistical finding, we should also realize that not all studies that use statistics are created equal. Some have more power than others, and we should trust their results more. And that’s why “more research is needed” is such a refrain in discussions of studies on medical or social questions. I know “more research is needed” is also a plea for funds, and should always be met with the aforementioned skepticism, but bigger data sets do give us the power of more secure conclusions.

*This poor attempt at irony is also an example of a particular Type M error, this one about the correlation of the variables “the size of the set of readers of my blog” and “vast, for not ridiculously small values of ‘vast’”. I hope you’ve heard some variation of the joke that goes something like “It is true that I have made only two mistakes in my life, for very large values of ‘two’”.

Deciding the conclusion ahead of time : Applied Statistics

The more serious issue is that this predetermined-conclusions thing happens all the time. (Or, as they say on the Internet, All. The. Time.) I’ve worked on many projects, often for pay, with organizations where I have a pretty clear sense ahead of time of what they’re looking for. I always give the straight story of what I’ve found, but these organizations are free to select and use just the findings of mine that they like.


Once I started reading the Applied Statistics blog for my previous post, I just had to read one more item, and guess what: I found this one, which is motivated by an article by the economist blogger Mark Thoma. Thoma points out an ad by the Chamber of Commerce that blatantly says they are looking for an economist to write a “study” to support what the Chamber wants to appear to be true. Reading the full post is highly recommended.

Are Wine Ratings Essentially Coin Tosses?

In his first study, each year, for four years, Mr. Hodgson served actual panels of California State Fair Wine Competition judges—some 70 judges each year—about 100 wines over a two-day period. He employed the same blind tasting process as the actual competition. In Mr. Hodgson’s study, however, every wine was presented to each judge three different times, each time drawn from the same bottle.

The results astonished Mr. Hodgson. The judges’ wine ratings typically varied by ±4 points on a standard ratings scale running from 80 to 100. A wine rated 91 on one tasting would often be rated an 87 or 95 on the next. Some of the judges did much worse, and only about one in 10 regularly rated the same wine within a range of ±2 points.

The article was published in the January issue of the Journal of Wine Economics.  The Wall Street Journal has a fun writeup.  The same researcher showed that the distribution of medal winners in a sample of wine competitions matched what you would get if the medal was awarded by a fair lottery.
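A toy simulation along the lines of that last finding (my own illustrative sketch, not Hodgson’s actual model): if 100 *identical* wines are scored with pure judge noise of roughly the magnitude he measured, the gold medals spread out just as a lottery would spread them.

```python
import random

random.seed(7)
JUDGE_SD = 4.0        # judge noise roughly matching the +/- 4-point spread
N_WINES = 100         # identical wines: any score difference is pure noise
N_COMPETITIONS = 50
N_GOLDS = 10          # the top 10 scores win "gold" in each competition

golds = [0] * N_WINES
for _ in range(N_COMPETITIONS):
    scores = [90 + random.gauss(0, JUDGE_SD) for _ in range(N_WINES)]
    cutoff = sorted(scores, reverse=True)[N_GOLDS - 1]
    for wine, score in enumerate(scores):
        if score >= cutoff:
            golds[wine] += 1

# under a fair lottery each wine expects N_COMPETITIONS * N_GOLDS / N_WINES golds
print("average golds per wine:", sum(golds) / N_WINES)
print("spread of gold counts:", min(golds), "to", max(golds))
```

Every wine is the same, yet some collect many more medals than others, purely by the luck of the draw.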

Ha! I love that (i) this blog post is written by a game theorist who does mechanism design (Jeff Ely at Northwestern), (ii) it is about wine, and (iii) there is such a thing as the Journal of Wine Economics!

Cooperating bacteria are vulnerable to slackers : Not Exactly Rocket Science

Game theory applies to all living organisms. I was recently saying this to a surprised undergraduate. Yet it is true, as this blog post from Not Exactly Rocket Science illustrates. It tells the story of a kind of bacterial colony in which some members freeload on the efforts of the others to make the environment more nourishing for all members of the colony. In a range of colony population sizes, the freeloading bacteria do so well that they multiply faster than the rest. This advantage dissipates, however, when they become so preponderant in the population that the whole colony is weakened. It seems like these bacteria have figured out how to deal with the “tragedy of the commons”, where people (or living creatures of any kind) overexploit a common resource because it is in each individual’s interest to do so, even if it harms the group.
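Here is a minimal replicator-dynamics sketch of the freeloading story (a toy model of my own, with made-up cost and benefit numbers, not the actual bacterial system):

```python
def step(p, cost=0.2, benefit=1.0):
    """One generation of replicator dynamics; p is the freeloader share."""
    good = benefit * (1 - p)  # the public good is produced only by cooperators
    w_free = 1 + good         # freeloaders enjoy the good at no cost
    w_coop = 1 + good - cost  # cooperators also pay the production cost
    mean_fitness = p * w_free + (1 - p) * w_coop
    # freeloaders grow in proportion to their fitness advantage
    return p * w_free / mean_fitness, mean_fitness

p = 0.01  # start with 1 percent freeloaders
for _ in range(200):
    p, mean_fitness = step(p)

print(f"freeloader share after 200 generations: {p:.2f}")
print(f"colony mean fitness at the end: {mean_fitness:.2f}")
```

In this toy version the freeloaders take over and the colony’s mean fitness falls from roughly 1.8 to roughly 1.0: everyone ends up worse off, the tragedy of the commons in miniature. The real bacteria are more interesting precisely because the freeloaders’ advantage fades before they take over completely.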

Dubner’s response to the “superfreakonomics” accusations

Read it here. I note it does not discuss Krugman’s criticism, which I deem serious, and which aired in a NYT blog, just as the Freakonomics blog is a NYT blog. I am curious to see what their response to Krugman will be, if any. Dubner and Levitt can hardly say Krugman is spreading smears about them; they either misread the Weitzman article, or they did not. It looks like they misread it; it’s up to them to convince me otherwise.

I still have no intention of buying Superfreakonomics. I’ll be damned if I reward the authors and the publisher of such stuff that passes for science writing. Again, I will keep an eye open for any adequate answers by Levitt and Dubner; I have not seen any yet.

More on Superfreakonomics and its early terrible reviews

Mark Thoma has a blog entry that quotes Paul Krugman, Brad DeLong and some responses from Levitt and Dubner. My curiosity continues (are Levitt and Dubner really this blinded by what they want to believe in?) but I most definitely will not buy or recommend this book.