Calculate the probability of replication?

This can be done in SPSS, but it requires a bit of fiddling, and a bit of knowledge of some functions. Let’s say that, for our study, we do a χ² test.

We know that for our first study, we got a p value of exactly 0.05, and let’s say that we had 1 df. We can use SPSS to tell us what the χ² value must have been.

We select transform, compute, and then in the Numeric Expression box we type:

IDF.CHISQ(0.95, 1)

IDF.CHISQ is the inverse distribution function for χ². With a probability of 0.95 and 1 df, it returns 3.84.

We have to use 0.95 rather than 0.05, because IDF.CHISQ takes the cumulative (lower-tail) probability, and we are interested in the upper end of the distribution: 1 – 0.05 = 0.95.
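As a cross-check outside SPSS, the same value can be sketched in Python. This is only an illustration, not part of the SPSS procedure: it relies on the fact that a χ² variable with 1 df is the square of a standard normal variable, so it works for 1 df only, and the function name chisq_critical_1df is made up for this example.

```python
from statistics import NormalDist

def chisq_critical_1df(p_value):
    """Chi-square value (1 df only) whose upper-tail probability is p_value."""
    # With 1 df, chi-square is z squared, so an upper-tail probability p for
    # chi-square corresponds to p/2 in each tail of the standard normal.
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return z ** 2

print(round(chisq_critical_1df(0.05), 2))  # 3.84
```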

We then want to find out what the probability of getting a value as high or higher than 3.84 is, given a certain population value for χ². We usually assume that the population value is 0 – that is, the null hypothesis is true, and ask: what is the probability of getting a value this high or higher? This is called a central χ² distribution.

In this case, we aren’t interested in the usual null hypothesis. We have already rejected that. Our best guess at what the true value for χ² is, is the value that we found in the previous study – it’s 3.84. So, we instead ask what the probability of getting a value of χ² as high as 3.84 is, if the population value is 3.84. This requires that we use a different sort of distribution, called a non-central distribution.

Non-central distributions have an additional parameter, called (and this might surprise you) the non-centrality parameter (ncp for short). However, because that’s a bit easy, it’s also called λ (which is the Greek letter lambda). Non-centrality parameters can be a bit tricky to work out for some tests, but for the χ² test they are easy: it’s the expected population value of χ² – (df – 1) (that’s why we’ve used the χ² test).

So the ncp = 3.84 – (1 – 1) = 3.84.

So, we ask, what’s the probability of getting a value of χ² as high as 3.84, when we have a non-centrality parameter of 3.84, and 1 df?

This is a really, really hard question, if we haven’t got a computer that can do it for us. Luckily, we have. If we called the previous variable chi1, then we use transform, compute, and type:

NCDF.CHISQ(chi1,1, chi1)

This comes to 0.5, which isn’t surprising. If the population value is 3.84, we’ve got a 50% chance of getting a value higher than 3.84, and a 50% chance of getting one lower.
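For 1 df we can check this figure without SPSS. A non-central χ² variable with 1 df and non-centrality parameter λ behaves like (Z + √λ)², where Z is a standard normal variable, so the probability of falling at or below x is Φ(√x – √λ) – Φ(–√x – √λ). The Python sketch below uses that shortcut; it is an illustration for the 1 df case only, and the function name ncdf_chisq_1df is made up for this example rather than taken from SPSS.

```python
from math import sqrt
from statistics import NormalDist

def ncdf_chisq_1df(x, ncp):
    """P(X <= x) for a non-central chi-square with 1 df (1 df only)."""
    phi = NormalDist().cdf
    # X = (Z + sqrt(ncp))**2 <= x  means  -sqrt(x) <= Z + sqrt(ncp) <= sqrt(x)
    return phi(sqrt(x) - sqrt(ncp)) - phi(-sqrt(x) - sqrt(ncp))

# About 3.84, the value IDF.CHISQ(0.95, 1) gives in SPSS.
chi1 = NormalDist().inv_cdf(0.975) ** 2

print(round(ncdf_chisq_1df(chi1, chi1), 2))  # 0.5
```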

If we do the same thing again, but change the numbers so that the first experiment gave a p-value of 0.01, we get the result 0.27 – that is, we have a 0.27 chance of not getting a significant result in the replication.
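To try other first-study p-values without retyping the numbers, the whole calculation can be wrapped in one function. The Python sketch below is an illustration under the same assumptions as above (1 df, and a population value equal to the χ² found in the first study); the function name and its alpha argument are made up for this example.

```python
from statistics import NormalDist

def prob_nonsignificant_replication(p_first, alpha=0.05):
    """Chance that a replication is not significant at alpha (1 df only).

    Assumes the population value equals the chi-square found in the
    first study, which had a p-value of p_first.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)     # square root of the critical chi-square
    z_first = norm.inv_cdf(1 - p_first / 2)  # square root of the first study's chi-square
    # Non-central chi-square CDF with 1 df, evaluated at the critical value,
    # with the first study's chi-square as the non-centrality parameter.
    return norm.cdf(z_crit - z_first) - norm.cdf(-z_crit - z_first)

for p in (0.05, 0.01):
    print(p, round(prob_nonsignificant_replication(p), 2))
# prints:
# 0.05 0.5
# 0.01 0.27
```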

Using syntax in SPSS makes life much easier – open a new syntax window, paste the following in, and run it.

* First study with p = 0.05 and 1 df.
COMPUTE chi1 = IDF.CHISQ(0.95, 1) .
COMPUTE p1 = NCDF.CHISQ(chi1, 1, chi1) .

* First study with p = 0.01 and 1 df.
COMPUTE chi2 = IDF.CHISQ(0.99, 1) .
COMPUTE p2 = NCDF.CHISQ(chi1, 1, chi2) .
EXECUTE .