I'm a big fan of Bad Science - both as a web page and a column in The Guardian. This week's article
is about evaluating a series of studies - in particular, it talks about multiple testing and about one tailed tests. In its description of one tailed statistical tests, though, it doesn't go far enough. If you carry out a one tailed test, you are saying that an effect in the opposite direction is meaningless and uninteresting - no matter how large it is, or how small its p-value. If I think that drug X will make you better, I might be tempted to carry out a one tailed test (after all, it gives me more power). However, this means that if drug X makes you worse - and it doesn't matter how much worse; it could even kill you - the test was one tailed, and the null hypothesis therefore cannot be rejected.
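To make this concrete, here is a minimal sketch (hypothetical numbers, and a normal approximation rather than a full t-test) comparing the one tailed and two tailed p-values you would get from the same data:

```python
import math

def z_test_pvalues(effect, se):
    """Return (one_tailed_p, two_tailed_p) for H1: effect > 0,
    using a normal approximation to the test statistic."""
    z = effect / se
    # one tailed: P(Z >= z) under the null hypothesis
    one_tailed = 0.5 * math.erfc(z / math.sqrt(2))
    # two tailed: P(|Z| >= |z|) under the null hypothesis
    two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return one_tailed, two_tailed

# The drug appears to help (z = 1.8): the one tailed test
# is "significant" at 5% while the two tailed test is not.
p1, p2 = z_test_pvalues(effect=1.8, se=1.0)
# p1 is about 0.036, p2 about 0.072

# The drug appears to HARM (z = -3, a large effect in the
# opposite direction): the one tailed p-value is near 1,
# so the null hypothesis can never be rejected, however
# harmful the drug looks.
ph1, ph2 = z_test_pvalues(effect=-3.0, se=1.0)
# ph1 is about 0.999; the two tailed test would have
# flagged the harm (ph2 is about 0.003)
```

The extra power in the hypothesized direction is bought by throwing away any ability to detect an effect in the other direction.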
I have a hard time believing that many researchers would often do this. An interesting result is, after all, an interesting result. Using a one tailed test just looks like cheating, because you couldn't get a significant result by using a two tailed test. Bland and Altman discuss this in a BMJ article (not Bland and Bland, as it says in the HTML). They write: "In general a one sided test is appropriate when a large difference in one direction would lead to the same action as no difference at all." Martin Bland has told me that in his 30 (or so) years as a practising medical statistician, with somewhere over 300 papers to his name, he has used a one tailed test once.
That paper is here. The question asked was whether heart transplantation was associated with an increased risk of death in a cohort study. If it is associated with a decrease, or with no change, what are we going to do? More heart transplants? We can't do more heart transplants; we already do as many as we can. In this case, the rule that an effect in the opposite direction would have the same consequences as no effect is satisfied. But it's pretty rare that that is the case.