Deciding the conclusion ahead of time : Applied Statistics

The more serious issue is that this predetermined-conclusions thing happens all the time. (Or, as they say on the Internet, All. The. Time.) I’ve worked on many projects, often for pay, with organizations where I have a pretty clear sense ahead of time of what they’re looking for. I always give the straight story of what I’ve found, but these organizations are free to select and use just the findings of mine that they like.


Once I started reading the Applied Statistics blog, for my previous post, I just had to read one more item and guess what: I found this one, which is motivated by an article by the economist blogger Mark Thoma. Thoma points out an ad by the Chamber of Commerce that blatantly says they are looking for an economist to write a “study” to support what the Chamber wants to appear to be true. Reading the full post is highly recommended.

Why most discovered true associations are inflated: Type M errors are all over the place

Posted on: November 21, 2009 3:22 PM, by Andrew Gelman

Jimmy points me to this article, “Why most discovered true associations are inflated,” by J. P. Ioannidis. As Jimmy pointed out, this is exactly what we call type M (for magnitude) errors. I completely agree with Ioannidis’s point, which he seems to be making more systematically than David Weakliem and I did in our recent article on the topic.

My only suggestion beyond what Ioannidis wrote has to do with potential solutions to the problem. His ideas include: “being cautious about newly discovered effect sizes, considering some rational down-adjustment, using analytical methods that correct for the anticipated inflation, ignoring the magnitude of the effect (if not necessary), conducting large studies in the discovery phase, using strict protocols for analyses, pursuing complete and transparent reporting of all results, placing emphasis on replication, and being fair with interpretation of results.”

These are all good ideas. Here are two more suggestions:

1. Retrospective power calculations. See page 312 of our article for the classical version or page 313 for the Bayesian version. I think these can be considered as implementations of Ioannidis’s ideas of caution, adjustment, and correction.

2. Hierarchical modeling, which partially pools estimated effects and reduces Type M errors as well as handling many multiple comparisons issues. Fuller discussion here (or see here for the soon-to-go-viral video version).



If you have studied statistics, you may remember Type I and Type II errors. This blog post by Andrew Gelman, from Scienceblogs > Applied Statistics, brings to my attention (probably shamefully late) the prevalence of Type M errors. I am very intrigued and have printed out the article found under “our recent article” in the above quotation. When I manage to wrap my head around this idea a little more, I will post a follow-up. (I am posting on my “general interest” blog as this should interest everyone, not just scientists. Sorry I don’t have a plain language explanation ready yet…)
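
For what it is worth, here is a rough simulation sketch of the idea as I currently understand it, not the calculation from the article. It assumes a small true effect (0.2) measured with a much larger standard error (1.0); both numbers, and the function name simulate_type_m, are made up for illustration. Many noisy studies are simulated, only the “statistically significant” ones are kept, and the surviving estimates are compared with the true effect.

```python
import random
import statistics


def simulate_type_m(true_effect=0.2, std_error=1.0, n_studies=100_000, seed=1):
    """Simulate noisy studies of one small true effect (assumed values).

    Returns (power, exaggeration):
      power        = share of studies reaching |estimate| > 1.96 * std_error
      exaggeration = mean |estimate| among those studies / true effect
    """
    rng = random.Random(seed)
    threshold = 1.96 * std_error  # conventional two-sided 5% cutoff

    # Keep only the estimates that would be reported as "significant".
    significant = []
    for _ in range(n_studies):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > threshold:
            significant.append(estimate)

    power = len(significant) / n_studies
    if not significant:
        return power, float("nan")
    exaggeration = statistics.mean(abs(e) for e in significant) / true_effect
    return power, exaggeration


if __name__ == "__main__":
    power, exaggeration = simulate_type_m()
    print(f"share of studies that are significant: {power:.3f}")
    print(f"average overstatement among them: {exaggeration:.1f}x")
```

Because an estimate has to exceed 1.96 standard errors to count as significant, every surviving estimate here is at least 1.96 in magnitude, that is, nearly ten times the assumed true effect of 0.2. That built-in exaggeration is the Type M error.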

Are Wine Ratings Essentially Coin Tosses?

In his first study, each year, for four years, Mr. Hodgson served actual panels of California State Fair Wine Competition judges—some 70 judges each year—about 100 wines over a two-day period. He employed the same blind tasting process as the actual competition. In Mr. Hodgson’s study, however, every wine was presented to each judge three different times, each time drawn from the same bottle.

The results astonished Mr. Hodgson. The judges’ wine ratings typically varied by ±4 points on a standard ratings scale running from 80 to 100. A wine rated 91 on one tasting would often be rated an 87 or 95 on the next. Some of the judges did much worse, and only about one in 10 regularly rated the same wine within a range of ±2 points.

The article was published in the January issue of the Journal of Wine Economics. The Wall Street Journal has a fun writeup. The same researcher showed that the distribution of medal winners in a sample of wine competitions matched what you would get if the medals were awarded by a fair lottery.

Ha! I love that (i) this blog post is written by a game theorist who does mechanism design (Jeff Ely at Northwestern), (ii) it is about wine, and (iii) there is such a thing as the Journal of Wine Economics!
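
To make the “fair lottery” comparison concrete, here is a minimal sketch, under assumptions of my own rather than the study's method: if every wine entered in n competitions wins a medal in each one independently with the same probability p, regardless of quality, then the number of medals a wine collects follows a binomial distribution. The function name and the example values are hypothetical.

```python
from math import comb


def lottery_medal_distribution(n_competitions, p_medal):
    """Probability of winning exactly k medals, for k = 0..n_competitions,
    if each competition hands out medals by a fair lottery (binomial model)."""
    return [
        comb(n_competitions, k)
        * p_medal ** k
        * (1 - p_medal) ** (n_competitions - k)
        for k in range(n_competitions + 1)
    ]


if __name__ == "__main__":
    # Hypothetical example: 5 competitions, 30% of entries medal in each.
    for k, prob in enumerate(lottery_medal_distribution(5, 0.3)):
        print(f"{k} medals: {prob:.3f}")
```

Comparing such a distribution with the observed counts of medals per wine is one way a claim like the one above could be checked.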

Tim O’Reilly on the future web wars @ david ascher

I agree with Tim that “If you don’t want a repeat of the PC era, place your bets now on open systems. Don’t wait till it’s too late.”  I think he’d also agree that we need to think beyond code and copyright.  That’s like going to war with trucks but no tanks.  For the open, distributed, heterogeneous web to thrive, we need to incorporate thinking from a host of other fields, such as contract law, design, psychology, consumer behavior, brand marketing, and more.  Figuring out how to engage thinkers and leaders in those fields is likely one of the critical, still missing steps.


I can’t resist pointing to this nice follow-up to the Tim O’Reilly post I talked about earlier this evening. I suggest following the link to David Ascher’s post to read all of it.

The War For the Web – O’Reilly Radar

One of the points I’ve made repeatedly about Web 2.0 is that it is the design of systems that get better the more people use them, and that over time, such systems have a natural tendency towards monopoly.

And so we’ve grown used to a world with one dominant search engine, one dominant online encyclopedia, one dominant online retailer, one dominant auction site, one dominant online classified site, and we’ve been readying ourselves for one dominant social network.

But what happens when a company with one of these natural monopolies uses it to gain dominance in other, adjacent areas? I’ve been watching with a mixture of admiration and alarm as Google has taken their dominance in search and used it to take control of other, adjacent data-driven applications. I noted this first with speech recognition, but it’s had the biggest business impact so far in location-based service

Tim O’Reilly offers a good analysis of the coming wars for control of the Web. Scary but unavoidable, I fear. I echo his call, at the end of his blog post, for everyone to support open standards on the Web before it’s too late.

Cooperating bacteria are vulnerable to slackers : Not Exactly Rocket Science

Game theory applies to all living organisms. I was recently saying this to a surprised undergraduate. Yet it is true, as this blog post from Not Exactly Rocket Science illustrates. It tells the story of a kind of bacterial colony in which some members freeload on the efforts of the others to make the environment more nourishing for all members of the colony. In a range of population sizes of the colony, the freeloading bacteria do so well that they multiply faster than the rest. This advantage dissipates, however, when they become so preponderant in the population of the colony that the whole colony is weakened. It seems like these bacteria have figured out how to deal with the “tragedy of the commons”, where people (or living creatures of any kind) overexploit a common resource because it is in the interest of each individual to do so, even if it harms the group.
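
A toy model makes the dynamics easier to see. The sketch below is the textbook “snowdrift” game with discrete replicator dynamics, not the model or data from the study described in the post; the benefit, cost and baseline values are arbitrary assumptions. Freeloaders out-reproduce cooperators while cooperators are common, but lose that edge once cooperators become scarce, so the mix settles at an intermediate share.

```python
def cooperator_share(x0, benefit=1.0, cost=0.5, base=1.0, steps=500):
    """Iterate discrete replicator dynamics for a snowdrift game.

    x0 is the starting share of cooperators (0 < x0 < 1). Assumed payoffs:
      cooperator meets cooperator: benefit - cost / 2 (the cost is shared)
      cooperator meets freeloader: benefit - cost
      freeloader meets cooperator: benefit
      freeloader meets freeloader: 0
    """
    x = x0
    for _ in range(steps):
        # Expected fitness of each type against a randomly met neighbour.
        w_coop = base + x * (benefit - cost / 2) + (1 - x) * (benefit - cost)
        w_free = base + x * benefit
        w_mean = x * w_coop + (1 - x) * w_free
        # Each type's share grows in proportion to its relative fitness.
        x = x * w_coop / w_mean
    return x


if __name__ == "__main__":
    for start in (0.01, 0.5, 0.99):
        print(f"start {start:.2f} -> long-run share {cooperator_share(start):.3f}")
```

With these assumed values the share of cooperators heads toward (benefit - cost) / (benefit - cost / 2) = 2/3 whether the colony starts out almost all cooperators or almost all freeloaders.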

Dubner’s response to the “Superfreakonomics” accusations

Read it here. I note it does not discuss Krugman’s criticism, which I deem serious, and which aired in a NYT blog, just as the Freakonomics blog is a NYT blog. I am curious to see what their response will be to Krugman, if any. Dubner and Levitt can hardly say Krugman is spreading smears about them; they either misread the Weitzman article, or they did not. It looks like they misread it; it’s up to them to convince me otherwise.

I still have no intention of buying Superfreakonomics. I’ll be damned if I reward the authors and the publisher of such stuff that passes for science writing. Again, I will keep an eye open for any adequate answers by Levitt and Dubner; I have not seen any yet.