If you have two independent samples from two Normal distributions, you can test the hypothesis that the variances are Normal in R using the var.test() function. For example:

> x1 <- c(0.06, 0.08, 0.17, 0.12, -0.08, 0.11, 0.09, 0.03, 0.20, 0.08, 0.05, 0.02)

> var(x1)

[1] 0.005275

> x2 <- c(0.04, 0.26, 0.11, 0.08, 0.36, 0.04, -0.05, -0.01, -0.10, 0.00, 0.01, 0.11)

> var(x2)

[1] 0.01668106

> 0.005275/0.01668106

[1] 0.3162269

> var.test(x1, x2)

F test to compare two variances

data: diff1 and diff2

F = 0.31623, num df = 11, denom df = 11, p-value = 0.06884

alternative hypothesis: true ratio of variances is not equal to 1

95 percent confidence interval:

0.09103463 1.09847706

sample estimates:

ratio of variances

0.3162269

Here the sample variances are not too different, 0.005 and 0.017, with a ratio of 0.32.

The var.test() function performs an F test to test the hypothesis that the variances are not equal, and gives a p-value of 0.06884, which gives weak evidence against the null hypothesis of equal variances. So I wouldn't reject the null hypothesis based on that.

If you like, just for fun, you can use the F test yourself, to test the hypothesis. The F test takes the ratio of sample variances (=0.3162269) as the F statistic value, and uses the F-distribution with n1-1=11 and n2-1=11 degrees of freedom, as the two samples have sizes 12 (ie. n1=12, n2=12):

> pf(0.3162269, df1=11, df2=11)

[1] 0.03441958

This gives us a one-tailed p-value, which we need to multiply by 2 to get the two-tailed p-value:

> pf(0.3162269, df1=11, df2=11) * 2

[1] 0.06883916

This gives us the same p-value as var.test(), but using home-made R, hurray!

Note that if you want to convince yourself that var.test() does the same thing as my 'pf' lines above, you can peak at the var.test() code using:

> methods(var.test)

[1] var.test.default* var.test.formula*

see '?methods' for accessing help and source code

This tells us that there is a function var.test.default, let's look at it:

> getAnywhere(var.test.default)

A single object matching ‘var.test.default’ was found

It was found in the following places

registered S3 method for var.test from namespace stats

namespace:stats

with value

function (x, y, ratio = 1, alternative = c("two.sided", "less",

"greater"), conf.level = 0.95, ...)

{

if (!((length(ratio) == 1L) && is.finite(ratio) && (ratio >

0)))

stop("'ratio' must be a single positive number")

alternative <- match.arg(alternative)

if (!((length(conf.level) == 1L) && is.finite(conf.level) &&

(conf.level > 0) && (conf.level < 1)))

stop("'conf.level' must be a single number between 0 and 1")

DNAME <- paste(deparse(substitute(x)), "and", deparse(substitute(y)))

if (inherits(x, "lm") && inherits(y, "lm")) {

DF.x <- x$df.residual

DF.y <- y$df.residual

V.x <- sum(x$residuals^2)/DF.x

V.y <- sum(y$residuals^2)/DF.y

}

else {

x <- x[is.finite(x)]

DF.x <- length(x) - 1L

if (DF.x < 1L)

stop("not enough 'x' observations")

y <- y[is.finite(y)]

DF.y <- length(y) - 1L

if (DF.y < 1L)

stop("not enough 'y' observations")

V.x <- var(x)

V.y <- var(y)

}

ESTIMATE <- V.x/V.y

STATISTIC <- ESTIMATE/ratio

PARAMETER <- c(`num df` = DF.x, `denom df` = DF.y)

PVAL <- pf(STATISTIC, DF.x, DF.y)

if (alternative == "two.sided") {

PVAL <- 2 * min(PVAL, 1 - PVAL)

BETA <- (1 - conf.level)/2

CINT <- c(ESTIMATE/qf(1 - BETA, DF.x, DF.y), ESTIMATE/qf(BETA,

DF.x, DF.y))

}

else if (alternative == "greater") {

PVAL <- 1 - PVAL

CINT <- c(ESTIMATE/qf(conf.level, DF.x, DF.y), Inf)

}

else CINT <- c(0, ESTIMATE/qf(1 - conf.level, DF.x, DF.y))

names(STATISTIC) <- "F"

names(ESTIMATE) <- names(ratio) <- "ratio of variances"

attr(CINT, "conf.level") <- conf.level

RVAL <- list(statistic = STATISTIC, parameter = PARAMETER,

p.value = PVAL, conf.int = CINT, estimate = ESTIMATE,

null.value = ratio, alternative = alternative, method = "F test to compare two variances",

data.name = DNAME)

attr(RVAL, "class") <- "htest"

return(RVAL)

}

<bytecode: 0x7f7fde050840>

<environment: namespace:stats>

You can see that in this code they are using the pf function to get the p-value:

PVAL <- pf(STATISTIC, DF.x, DF.y)

## No comments:

Post a Comment