> mydata <- rnorm(100)

As it's a standard Normal distribution, the population variance is 1.

> var(mydata)

[1] 1.091236

The sample variance is s^2 = 1.091236.

**1. Using the R 'lavaan' package:**

I found out about this from stackexchange.com. This uses the lavaan R package.

> install.packages("lavaan")

> library(lavaan)

> mydataframe <- as.data.frame(mydata)

> names(mydataframe) <- "x"

> mymodel <- 'x ~~ x'

> fit <- sem(mymodel, data=mydataframe, likelihood = "wishart")

> parameterEstimates(fit)

lhs op rhs est se z pvalue ci.lower ci.upper

1 x ~~ x 1.091 0.155 7.036 0 0.787 1.395

This gives us the confidence interval (0.787, 1.395) for the population variance. I assume this is a 95% confidence interval.

__I asked on the lavaan googlegroup, and they said that lavaan is not using the chi-squared distribution to calculate a confidence interval, it is using a Z-statistic. They said that as a result it does not calculate a very reliable confidence interval, and that this is discussed further here .__

*Note added some days later:***2. Using the chi-squared distribution**

We can use the chi-squared distribution to provide a confidence interval for the variance of a Normal distribution. I'm not sure if this is what the 'lavaan' package does already, I don't think so however, as my results don't match those of the lavaan package (see below). As we have 100 data points (n=100), we use a chi-squared distribution with n-1=99 degrees of freedom. As we want a 95% confidence interval (alpha=0.05), we want the 0.025 (=0.05/2) and 0.975 (=1 - (0.05/2)) quantiles of the chi-squared distribution with 99 degrees of freedom:

> qchisq(0.025, df=99)

[1] 73.36108

> qchisq(.975, df=99)

[1] 128.422

Then our confidence interval is given by ( (n-1)s^2 / 128.422, (n-1)s^2 / 73.36108), where n is our sample size of 100, and s^2 is our sample variance 1.091236:

> (99*1.091236)/128.422

[1] 0.8412294

> (99*1.091236)/73.36108

[1] 1.472611

That is, using the chi-squared distribution I get a 95% confidence interval of (0.841, 1.473) for the population variance.

Let's get a 99% confidence interval:

> (99*1.091236)/qchisq(.995, df=99)

[1] 0.7772852

> (99*1.091236)/qchisq(.005, df=99)

[1] 1.6243

That is, a 99% confidence interval is (0.777, 1.624).

## No comments:

Post a Comment