Wednesday, December 20, 2006

Horse racing deaths

One reason that we do statistical analysis, rather than relying on our brains, is that our brains are not to be trusted a great deal of the time. Brains were designed by evolution to keep us alive in a risky environment, something like the African savannah. One thing that might get you into a lot of trouble is missing some important pattern (there's always a lion here at dusk; 3 days after rain, there is water over there; 3 weeks after rain, it's over here; etc.). So our brains make a lot of type I errors when it comes to spotting patterns - a type I error (seeing a pattern when there wasn't one) is a lot less serious than a type II error (not seeing a pattern when there was one). A type I error means you avoid a certain area at dusk, which is a slight inconvenience. A type II error means you get eaten.

The Guardian has a story today about horse racing, and specifically about a run of deaths of horses at Wolverhampton racetrack. This obviously isn't a good thing, but one needs to ask whether it means that there is something wrong with the track which is causing the problem, or whether it's just a statistical fluke - an artefact, which our brains spot as a pattern.

Doing the analysis to find out is tricky, for two reasons - our friends the type I and type II errors. There are quite a lot of racetracks (in England? Britain? The UK? British Isles? Europe? The world?) - what type I error rate do we want to accept, and over what area? With that many tracks, a run of deaths is likely to turn up at one of them by chance alone. And if we control the type I error rate across all of those tracks, we lose power, and so risk failing to detect an effect that really is there.
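
To put some rough numbers on that trade-off, here is a small Python sketch. The figures in it are assumptions made up for illustration, not data from the Guardian story: 60 tracks, each tested independently at the 5% level, and a real problem at one track that is worth 2.5 standard errors. It shows how the chance of at least one false alarm grows with the number of tracks, and how much power a Bonferroni correction (just one simple way of controlling the overall type I error rate) costs.

```python
from statistics import NormalDist


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false alarm across n_tests independent tests,
    when every null hypothesis is true and each test uses level alpha."""
    return 1 - (1 - alpha) ** n_tests


def power_one_sided_z(alpha: float, effect_in_se: float) -> float:
    """Power of a one-sided z-test at level alpha when the true effect
    is effect_in_se standard errors above the null value."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z_crit - effect_in_se)


# All three values below are hypothetical, chosen only for illustration.
n_tracks = 60   # assumed number of racetracks being monitored
alpha = 0.05    # per-track type I error rate
effect = 2.5    # assumed size of a real effect, in standard errors

# No correction: each track is tested at the 5% level.
fwer_uncorrected = familywise_error_rate(alpha, n_tracks)
power_uncorrected = power_one_sided_z(alpha, effect)

# Bonferroni correction: divide alpha by the number of tracks.
alpha_bonferroni = alpha / n_tracks
fwer_corrected = familywise_error_rate(alpha_bonferroni, n_tracks)
power_corrected = power_one_sided_z(alpha_bonferroni, effect)

print(f"Uncorrected: P(at least one false alarm) = {fwer_uncorrected:.3f}, "
      f"power = {power_uncorrected:.3f}")
print(f"Bonferroni:  P(at least one false alarm) = {fwer_corrected:.3f}, "
      f"power = {power_corrected:.3f}")
```

With these made-up numbers, the uncorrected analysis is very likely to flag at least one innocent track, while the corrected analysis keeps the false alarms under control but is much less likely to catch a track with a real problem. Change the assumed number of tracks (England? The world?) and both answers move.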

This isn't just a problem for horses - the same problem arose in the Harold Shipman case. There, a type I error means you wrongly accuse a doctor of killing their patients. A type II error means that you miss a murderous GP.

(If you're interested, you can read about appropriate tests here, or watch a video here.)

Example of a Bayesian Analysis

I don't know what a facilitated system is, but there's a nice example of a Bayesian conditional probability calculation on the Facilitated Systems blog, which you can find here: