Monday, August 04, 2008

Retiring

I've decided to retire this blog. Well, I've decided to accept that I've almost run out of things to say on it. I've decided to merge it with my other blog, into a more general set of ramblings, and start a new blog which you can find here (if you so care).

Monday, April 07, 2008

Measurement vs Statistics

Bad Science has an interesting article about issues of measurement, before you even get to issues of analysis. Here's the first paragraph:

"There's this vague idea - which has been going around for the past few centuries - that statistics is quite difficult. But in reality the maths is often the least of your problems: the tricky bit comes way before the number crunching, when you are deciding what to measure, how to measure it, and what those measurements mean."


Saturday, March 29, 2008

Zehn Mark I


Originally uploaded by Arenamontanus
Here's an old German 10 mark note, with a picture of Carl Gauss on it (as in Gaussian, as in normal distribution). To the left of the picture of Gauss is a normal distribution, with the formula.

You can read more about the distribution (and see a larger version of the formula) at Wikipedia http://en.wikipedia.org/wiki/Gaussian_distribution.

Monday, January 21, 2008

Another LOLScience Picture


Originally uploaded by increpare
It's Florence Nightingale, who invented the pie chart (or something very similar: hers were polar area diagrams, but they led to pie charts) and was the first female member of the Royal Statistical Society.

Friday, December 21, 2007

Swanz


Originally uploaded by mr lynch
Flickr has a group called LOL Science; here's one of the pictures. It's a combination of a hideous number of in-jokes. If you get it, you can feel smug now.

If you don't get it, here's some explanation. We'll take them in order of relevance to this blog.

  1. It's about Popper, and his philosophy of science. Popper is important, because his philosophy defined the way that we think about science, and statistics. Popper's important idea was that you can't prove a theory; you can only disprove it. If you believe that all swans are white, then you can never prove this by finding white swans. You have to go out and try to disprove your theory - you have to try to find a black swan. If you try really hard to find more white swans, that doesn't tell us anything. If you try really hard to find a black swan, and fail, that supports your theory (but doesn't prove it). If you succeed, you've disproved your theory. That's why we focus on the null hypothesis in statistics. The old approach, of collecting data to try to prove your theory, is the inductive method; the new approach, of trying hard to disprove your theory, is called the hypothetico-deductive method. Now we know that black swans live in Australia (or the antipodes).
    Note for the enthusiastic - more recently it's been suggested that an abductive approach should be employed, rather than a hypothetico-deductive approach.
  2. "I'm in ur (X), (Y)ing ur Zs" is a snowclone. The term snowclone originated on the Language Log blog. A snowclone (originally) takes the form "If Eskimos have 94 words for snow, then X must have Y words for Z" (Eskimos, of course, don't have 94 words for snow, or 16 or 138). The original phrase was "I'm in ur base, killing ur d00dz".
  3. The third link is about LOLcats - this one's not about psychology or statistics, but it is about pictures of cats (or other animals) with amusing captions, usually with grammatical and spelling errors (think how your cat would write if it was aged 14 and texting its friends). The I'm in ur X... meme is a very popular one to use for LOLcats pictures. Here's an article that explains more. (That last one didn't have anything to do with statistics or psychology).

Saturday, August 04, 2007

New email subscribe thing

If you want to get an email when a new post is added to this page, enter your email address in the box on the right. (It's organized by Feedburner.)

Wednesday, August 01, 2007

e

New Scientist recently had an article about e [subscription of some sort required for full article], everybody's second favourite mathematical constant. (Pi is everyone's favourite mathematical constant, obviously.) It talked, in a relatively non-technical manner, about the reasons why e is exciting. It didn't talk about its role in statistics, in, for example, the normal distribution, but it's probably too difficult to do that without assuming some prior knowledge. [Did someone say too boring? Out of the room! NOW!]

Language Log on Odds Ratios

There was an interesting post on the presentation of odds ratios on the Language Log Blog, the other day. They give some examples of odds ratios being deceptive, confusing and misunderstood. It's been said plenty of times, by plenty of people (including me) but it's interesting that linguists are saying it too.

Saturday, June 30, 2007

You're a Bayesian!

I've written a bit before about Bayesian statistics, here, here and here (that last one where I stole a line from Brad Efron, who said "We can all be Bayesians when we need to be"), and also in a recently published book. I'm kind of sympathetic towards Bayesian analysis, but I very rarely do it. The basis for Bayesian analysis is that we incorporate the prior probability of a result into our analysis. Some people are positively antagonistic towards Bayesian thinking - denying that there is ever a use for it - the selection of the prior probability being something of a sticking point. (Actually, Bayesian analysis is lots more complex than that, and doesn't always require what are called 'informative priors', but we won't worry about that for a minute.)

However, the most recent issue of Significance had a very interesting article by Stephen Senn, in which he wrote about the TeGenero TGN1412 drug trial catastrophe, which occurred in March 2006, when six volunteers received the drug and two received a placebo. The six volunteers who received the drug almost immediately had massive immune system reactions - specifically a cytokine storm - and were hospitalised for at least a month.

What we have here is the potential for a statistical analysis. We've got a 2x2 table, so let's do the stats.
                       Placebo    Drug
Cytokine storm: Yes       0         6
Cytokine storm: No        2         0


A 2x2 table. We obviously can't do a chi-square test, as the sample is too small. But we can do a Fisher's exact test. If we do that, we get a one-tailed p of 0.036. It's a one-tailed test, so our p-value cutoff is 0.025, which means we don't have evidence that the drug caused the cytokine storm, and all the subsequent ills.
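If you want to check that p-value yourself, here is a minimal sketch in Python. It assumes the table above (0 placebo and 6 drug volunteers with a storm, 2 placebo and 0 drug without) and computes the one-tailed Fisher probability directly from the hypergeometric formula. The function name and argument order are my own choices for illustration, not anything from Senn's article.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose top-left cell is as small as, or smaller than, a.
    """
    row1 = a + b          # total in the first row (storm: yes)
    row2 = c + d          # total in the second row (storm: no)
    col1 = a + c          # total in the first column (placebo)
    n = row1 + row2
    denom = comb(n, col1)  # ways to choose who ends up in the placebo column
    smallest = max(0, col1 - row2)  # smallest top-left cell the margins allow
    p = 0.0
    for k in range(smallest, a + 1):
        p += comb(row1, k) * comb(row2, col1 - k) / denom
    return p

# Rows: storm yes / storm no. Columns: placebo / drug.
p_value = fisher_one_tailed(0, 6, 2, 0)
print(round(p_value, 3))
```

With only eight people there are 28 ways of picking the two placebo volunteers, and just one of them puts both in the "no storm" row, so the p-value is 1/28.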

But that's got to be a silly thing to say. It's obvious that the drug did cause the cytokine storm. It's not just barely significant; it's really, really obvious. Why is it so obvious? It's obvious because people don't have cytokine storms every day. In fact, if you haven't got the Spanish Flu, we're pretty safe saying that you will never have a cytokine storm. In other words, it's not just the data that we have obtained here that we need to take into account. We also need to take into account that the probability of ever having a cytokine storm is very low. In other words, we need to take into account the prior probability. And so we have just done a Bayesian analysis.
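To put some rough numbers on that intuition, here is a sketch in Python. Every number in it is an assumption I have made up for illustration (the background rate of cytokine storms, how reliably a harmful drug would cause one, and how sceptical we were beforehand); none of them come from Senn's article or from the trial. It compares how likely the observed data are if the drug is harmless with how likely they are if the drug is harmful, and then combines that with a deliberately sceptical prior.

```python
# All three numbers below are hypothetical, chosen only for illustration.
background_rate = 1e-4   # assumed chance a healthy volunteer has a storm anyway
drug_rate = 0.9          # assumed chance of a storm if the drug really is harmful
prior_odds = 1 / 1000    # assumed sceptical prior odds that the drug is harmful

def likelihood(rate_drug, rate_placebo):
    """Probability of the observed data: 6 of 6 drug volunteers had a storm,
    0 of 2 placebo volunteers did."""
    return (rate_drug ** 6) * ((1 - rate_placebo) ** 2)

# Harmless drug: everyone is at the background rate.
like_harmless = likelihood(background_rate, background_rate)
# Harmful drug: drug volunteers are at drug_rate, placebo at the background rate.
like_harmful = likelihood(drug_rate, background_rate)

bayes_factor = like_harmful / like_harmless
posterior_odds = bayes_factor * prior_odds
posterior_prob = posterior_odds / (1 + posterior_odds)

print(f"Bayes factor: {bayes_factor:.3g}")
print(f"Posterior probability the drug is harmful: {posterior_prob:.6f}")
```

The exact figures don't matter much. As long as the assumed background rate is tiny, six storms out of six is so improbable by chance that even a very sceptical prior gets overwhelmed, which is the informal reasoning in the paragraph above.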