Monday, December 19, 2005

Correlation and causation

There is a nice example of the failure to distinguish between correlation and causation on the badscience.net website. The Daily Mail (which I'll decline to link to) says that moderate drinkers are thinner than non-drinkers, and concludes that drinking makes you thin.

It's possible to think of a very large number of reasons why this relationship might exist. My top choice would be age: older people tend to drink less (drinking peaks at around age 21 - you'll have to take my word that I have read that somewhere reputable), and older people are more likely to be overweight than younger people. Or maybe the Mail has the causation the wrong way round - fatter people are less likely to drink than thinner people, because they are trying to lose weight.
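To make the age explanation concrete, here is a small simulation sketch. Everything in it is invented for illustration: the age range, the slopes, the noise levels and the names (`pearson`, `simulate`, `within_band_correlations`) are my assumptions, not figures from the Daily Mail story or from any study. In this made-up world, age lowers drinking and raises BMI, and drinking has no effect on BMI at all.

```python
import random
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n=20000, seed=1):
    """Generate (age, drinks_per_week, bmi) tuples.

    All numbers are hypothetical. Age drives both variables;
    drinking never enters the formula for BMI.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        age = rng.uniform(18, 80)
        # Older people drink less (assumed slope), never below zero.
        drinks = max(0.0, 10 - 0.1 * (age - 18) + rng.gauss(0, 2))
        # Older people are heavier (assumed slope); no drinking term.
        bmi = 22 + 0.1 * (age - 18) + rng.gauss(0, 2.5)
        people.append((age, drinks, bmi))
    return people


def within_band_correlations(people, band_width=5):
    """Correlation of drinks and BMI inside each age band."""
    bands = defaultdict(list)
    for age, drinks, bmi in people:
        bands[int((age - 18) // band_width)].append((drinks, bmi))
    return [
        pearson([d for d, _ in group], [b for _, b in group])
        for group in bands.values()
    ]


people = simulate()
overall = pearson([d for _, d, _ in people], [b for _, _, b in people])
within = within_band_correlations(people)
mean_within = sum(within) / len(within)

print(f"Overall correlation, drinks vs BMI: {overall:.2f}")
print(f"Mean correlation within 5-year age bands: {mean_within:.2f}")
```

The overall correlation should come out clearly negative (drinkers look thinner), while the correlation among people of roughly the same age should sit close to zero. The relationship is produced entirely by age, which is exactly the kind of explanation the newspaper story does not rule out.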

It's also possible to think of one very good reason that this is not a causal relationship. Take two identical people and make them do exactly the same things and eat the same foods. Make one of them drink water and the other drink wine. The one who drinks wine will become heavier than the one who doesn't, because the wine adds calories and the water adds none - it's the laws of physics.

However, when someone makes a causal statement based on a correlation, it's not for us, the readers, to find reasons it may not be true. It is for them to present a convincing argument. Saying "smoking causes cancer, because more smokers get cancer than non-smokers" is not a convincing argument until a lot more evidence has been collected. This is discussed in Statistics as Principled Argument, by Robert Abelson, and also in Applying Regression and Correlation, by me and Mark Shevlin.