Wednesday 29 May 2013

Sign tests in R

The sign test is a non-parametric test that can be used to test whether there are more negative or positive values in a sample of data. The null hypothesis is that there are an equal number of negative and positive values.
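Because the test statistic is just the count of + signs, the sign test amounts to a binomial test on that count. A minimal sketch in base R, using made-up counts purely for illustration:

```r
# Made-up example counts: 12 positive and 4 negative values.
# Under the null hypothesis the number of + signs follows B(16, 0.5),
# so the sign test reduces to a binomial test on the count of + signs.
n.pos <- 12
n.neg <- 4
binom.test(n.pos, n.pos + n.neg, p = 0.5)$p.value  # two-sided P-value, approx. 0.0768
```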

Sign test for paired data:
For example, if you have paired data (e.g. 'before' and 'after' measurements for the same individuals), you can use a sign test to test whether the differences between the 'before' and 'after' values tend to be positive or negative (the null hypothesis is that there are equal numbers of negative and positive differences).

For example:
> x1 <- c(488, 478, 480, 426, 440, 410, 458, 460)
> x2 <- c(484, 478, 492, 444, 436, 398, 464, 476)
> difference <- x1 - x2
> difference
[1]   4   0 -12 -18   4  12  -6 -16
> library("BSDA")
> SIGN.test(difference, md=0, alternative = "less")
s = 3, p-value = 0.5

Here three of the differences are positive (4, 4, 12), so the test statistic is 3. There are 7 non-zero differences. Assuming that + and - signs have equal probability, the distribution of the number of + signs out of 7 is B(7, 0.5). We can calculate the P-value for a one-sided test as the probability of observing 3 or fewer positive signs out of 7:
> pbinom(3, size=7, p=0.5)
[1] 0.5
This agrees with the result from SIGN.test().
For this one-sided test, the null hypothesis is that the numbers of + and - signs are equal, and the alternative hypothesis is that there are fewer + signs than - signs.
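The same one-sided P-value can also be obtained from base R's binom.test(), without installing BSDA, by passing it the number of + signs and the number of non-zero differences:

```r
x1 <- c(488, 478, 480, 426, 440, 410, 458, 460)
x2 <- c(484, 478, 492, 444, 436, 398, 464, 476)
difference <- x1 - x2
s <- sum(difference > 0)   # number of + signs: 3
n <- sum(difference != 0)  # number of non-zero differences: 7
# one-sided binomial test: are there fewer + signs than expected under B(n, 0.5)?
binom.test(s, n, p = 0.5, alternative = "less")$p.value  # 0.5
```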

Sign test for non-paired data:
We can also use a sign test for non-paired data, to test whether the population median is equal to some specified value m. To do this, we subtract m from all the values in the sample, and then carry out a sign test on the remainders.
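This recipe can be wrapped in a small helper function (sign.test.median() is a hypothetical name, not part of BSDA or base R) — a sketch that simply counts the non-zero remainders and applies pbinom() as described above:

```r
# Hypothetical helper: one-sided sign test for the population median.
# Subtracts m from the data, drops zero remainders (ties with the median),
# counts the + signs, and computes the tail probability under B(n, 0.5).
sign.test.median <- function(x, m, alternative = "greater") {
  remainders <- x - m
  remainders <- remainders[remainders != 0]  # drop ties with the median
  s <- sum(remainders > 0)                   # test statistic: number of + signs
  n <- length(remainders)
  if (alternative == "greater") {
    p <- 1 - pbinom(s - 1, size = n, prob = 0.5)  # P(X >= s)
  } else {
    p <- pbinom(s, size = n, prob = 0.5)          # P(X <= s)
  }
  list(s = s, p.value = p)
}
```

For instance, sign.test.median(x, 0.618, "greater") applies these same steps to the example data below.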

For example, say we want to test whether the population median is equal to m=0.618:
> x <- c(0.693, 0.654, 0.662, 0.615, 0.690, 0.601, 0.570, 0.576, 0.749, 0.670, 0.672, 0.606, 0.628, 0.611, 0.609, 0.553, 0.844, 0.933)
> remainders <- x - 0.618
> remainders
 [1]  0.075  0.036  0.044 -0.003  0.072 -0.017 -0.048 -0.042  0.131  0.052  0.054 -0.012  0.010 -0.007 -0.009
[16] -0.065  0.226  0.315
Here there are 10 remainders (out of 18) that are greater than 0, so the test statistic is 10.
> library("BSDA")
> SIGN.test(remainders, md=0, alternative = "greater")
s = 10, p-value = 0.4073
For this one-sided test, the null hypothesis is that the numbers of + and - signs are equal, and the alternative hypothesis is that there are more + signs than - signs.

If we wanted to carry out a two-sided test, we would type:
> SIGN.test(remainders, md=0, alternative = "two.sided")
s = 10, p-value = 0.8145
For this two-sided test, the null hypothesis is that the numbers of + and - signs are equal, and the alternative hypothesis is that the numbers of + and - signs are not equal.

Assuming that + and - signs have equal probability, the distribution of the number of + signs out of 18 is B(18, 0.5). We can calculate the P-value for a one-sided test as the probability of observing 10 or more positive signs out of 18:
> 1 - pbinom(9, size=18, p=0.5)
[1] 0.4072645
This agrees with the result from SIGN.test().
The P-value for a two-sided test is twice the P-value from the one-sided test:
> 0.4072645*2
[1] 0.8145290
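As a cross-check, the two-sided calculation can also be done with base R's binom.test(), computing the sign counts directly from the data:

```r
x <- c(0.693, 0.654, 0.662, 0.615, 0.690, 0.601, 0.570, 0.576, 0.749,
       0.670, 0.672, 0.606, 0.628, 0.611, 0.609, 0.553, 0.844, 0.933)
remainders <- x - 0.618
s <- sum(remainders > 0)   # number of + signs
n <- sum(remainders != 0)  # number of non-zero remainders
# two-sided binomial test on the number of + signs
binom.test(s, n, p = 0.5, alternative = "two.sided")$p.value
```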
