Univariate Statistics Part 2, measuring scale, and the standard deviation

Often, we want to compare a sample variance with a hypothesized population variance, for example to check that it does not exceed a certain value. Back to our sandwiches: perhaps Subway is okay with footlongs varying in length by no more than 1”. The test statistic, $\chi^2 = (n-1)s^2/\sigma_0^2$, follows a chi-squared distribution with $n-1$ degrees of freedom when the null hypothesis is true and the data are normally distributed.

In this test we hypothesize $H_0: \sigma^2 = \sigma_0^2$, that is, the variance of the population our sample came from equals the hypothesized value. Our alternative hypothesis could be $H_a: \sigma^2 > \sigma_0^2$ for a one-tailed test. In that case $\chi^2$ grows larger as $s^2$ increases relative to the hypothesized $\sigma_0^2$, so large values are evidence against the null hypothesis. For the opposite alternative, $H_a: \sigma^2 < \sigma_0^2$, small values of $\chi^2$ are evidence against the null hypothesis. With a two-tailed test, we reject the null for either very large or very small values of $\chi^2$.

As an example, let’s imagine a quality control experiment to ensure that the variance of 500 gram sand samples taken in the field does not exceed 50 grams². Our hypotheses could be:

$$H_0: \sigma^2 = 50, \qquad H_a: \sigma^2 > 50$$

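We take 9 samples, with weights of 455, 460, 473, 496, 503, 512, 523, 527, and 540 grams. The sample variance, computed with the $n-1$ divisor, is $s^2 = 920.9444$ (dividing by $n$ instead would give 818.6172). Now we calculate our $\chi^2$:

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{8 \times 920.9444}{50} \approx 147.35$$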

Using R, or a chi-squared table, we find that the p-value for our chi-squared score is < 0.00001. Although we haven’t stated an acceptable $\alpha$, there is strong evidence (p-value < 0.00001) that the true variance of our sample weights is greater than 50 grams².
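The calculation for the sand-sample example can be sketched in Python using only the standard library. This is a minimal sketch, not a general-purpose routine: `chi2_sf_even_df` is a hypothetical helper name, and the closed-form tail probability it uses holds only when the degrees of freedom are even (here $n-1 = 8$). For arbitrary degrees of freedom a statistics package would be the usual choice. Note that `statistics.variance` divides by $n-1$, which gives about 920.94 for these weights, whereas dividing by $n$ gives 818.6172.

```python
import math
import statistics


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable.

    Assumption: df is an even positive integer, so the closed form
    exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j! applies.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires an even, positive df")
    half = x / 2.0
    term, total = 1.0, 1.0          # j = 0 term
    for j in range(1, df // 2):     # j = 1 .. df/2 - 1
        term *= half / j
        total += term
    return math.exp(-half) * total


weights = [455, 460, 473, 496, 503, 512, 523, 527, 540]  # grams, from the text
sigma0_sq = 50.0                                          # hypothesized variance, grams^2

n = len(weights)
s2 = statistics.variance(weights)        # sample variance (n - 1 divisor)
chi2 = (n - 1) * s2 / sigma0_sq          # test statistic
p_value = chi2_sf_even_df(chi2, n - 1)   # one-tailed: Ha is sigma^2 > sigma0^2

print(f"s^2 = {s2:.4f}, chi^2 = {chi2:.2f}, p = {p_value:.3g}")
```

Whichever divisor is used for the variance, the statistic is far out in the upper tail, so the conclusion that the p-value is below 0.00001 does not change.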

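If we are comparing two sample variances, we use the F distribution. In this case we have a sample variance $s_1^2$ from $n_1$ observations drawn from a population with variance $\sigma_1^2$, and another sample variance $s_2^2$ from $n_2$ observations drawn from a population with variance $\sigma_2^2$. Both samples consist of independent observations, and both come from normally distributed populations. As usual we test $H_0: \sigma_1^2 = \sigma_2^2$ against either a one-tailed alternative, $H_a: \sigma_1^2 > \sigma_2^2$ (or $H_a: \sigma_1^2 < \sigma_2^2$), or the two-tailed alternative, $H_a: \sigma_1^2 \neq \sigma_2^2$. The test statistic is $F = s_1^2/s_2^2$, with $n_1-1$ and $n_2-1$ degrees of freedom.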

Let’s do another example. Does a sampling method that collects 2000 gram samples have greater variability than one that collects 500 gram samples?

Using a computer, we find that the p-value for $F = 2.785$ with 10 and 6 degrees of freedom is 0.111225. Our p-value is greater than our $\alpha$ (0.05), so the result is not significant and we fail to reject our null hypothesis.
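The p-value quoted for this F statistic can be checked with a short standard-library sketch. It relies on the fact that, when both degrees of freedom are even, the upper tail of the F distribution reduces to a binomial tail sum; `f_sf_even_df` is a hypothetical helper name and is not meant as a general replacement for a statistics library. Only the statistic (2.785) and the degrees of freedom (10, 6) are taken from the text.

```python
import math


def f_sf_even_df(f, d1, d2):
    """Upper-tail probability P(F > f) for an F(d1, d2) variable.

    Assumption: d1 and d2 are both even positive integers. Then
    P(F > f) = I_x(d2/2, d1/2) with x = d2 / (d2 + d1 * f), and the
    regularized incomplete beta function with integer arguments equals
    a binomial tail probability.
    """
    if d1 <= 0 or d2 <= 0 or d1 % 2 != 0 or d2 % 2 != 0:
        raise ValueError("this closed form requires even, positive df")
    a, b = d2 // 2, d1 // 2
    x = d2 / (d2 + d1 * f)
    m = a + b - 1
    # I_x(a, b) = sum_{j=a}^{m} C(m, j) * x^j * (1 - x)^(m - j)
    return sum(math.comb(m, j) * x**j * (1.0 - x) ** (m - j)
               for j in range(a, m + 1))


F = 2.785            # ratio of sample variances, from the text
d1, d2 = 10, 6       # numerator and denominator degrees of freedom
alpha = 0.05

p_value = f_sf_even_df(F, d1, d2)   # one-tailed p-value
print(f"p = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The sketch gives a p-value of roughly 0.11, in line with the value above, so the null hypothesis is not rejected at $\alpha = 0.05$.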

We must mention a few caveats. First, some of the sample sizes used here are very small, and larger samples would be ideal. Second, normality is very important for both the chi-squared and F tests. If normality is uncertain, a non-parametric test should be used instead.

The most powerful non-parametric test for this purpose (Rock, 1988a) is the Klotz test, which is based on the squares of normal scores. The normal scores are $A_i = \phi^{-1}(R_i/(N+1))$, where $\phi^{-1}$ is the percent point function (the inverse of the cumulative distribution function) of the standard normal distribution, $R_i$ is the rank of the $i$-th observation, and $N$ is the sample size. The test statistic is calculated as follows:
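One commonly cited two-sample form of the Klotz statistic, assumed here and not necessarily the exact formulation Rock uses, sums the squared normal scores of the first sample, $T = \sum A_i^2$, and standardizes it with its mean and variance under the null hypothesis. The Python sketch below follows that form. The function name `klotz_z` is hypothetical, the second sample is made-up illustrative data, and the two samples are assumed to share roughly the same center.

```python
import math
from statistics import NormalDist


def klotz_z(sample1, sample2):
    """Standardized Klotz statistic (assumed textbook form).

    Ranks are taken over the pooled data (average ranks for ties),
    converted to normal scores A_i = inv_cdf(R_i / (N + 1)), squared,
    and summed over the first sample.
    """
    pooled = list(sample1) + list(sample2)
    n1, n2 = len(sample1), len(sample2)
    N = n1 + n2

    # Rank the pooled observations, averaging ranks within ties.
    order = sorted(range(N), key=lambda i: pooled[i])
    ranks = [0.0] * N
    i = 0
    while i < N:
        j = i
        while j + 1 < N and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    inv_cdf = NormalDist().inv_cdf
    sq = [inv_cdf(r / (N + 1)) ** 2 for r in ranks]   # squared normal scores

    T = sum(sq[:n1])                                   # first-sample sum
    mean_T = n1 / N * sum(sq)
    var_T = n1 * n2 / (N * (N - 1)) * (sum(s * s for s in sq) - sum(sq) ** 2 / N)
    return (T - mean_T) / math.sqrt(var_T)


field = [455, 460, 473, 496, 503, 512, 523, 527, 540]   # weights from the text
lab = [492, 495, 498, 499, 501, 502, 504, 506, 509]     # hypothetical second sample

z = klotz_z(field, lab)
p_one_tailed = 1.0 - NormalDist().cdf(z)   # Ha: first sample is more variable
print(f"z = {z:.3f}, one-tailed p = {p_one_tailed:.4f}")
```

A large positive value indicates that the first sample holds more of the extreme ranks, which is what greater variability looks like under this statistic.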

As usual, if the calculated value is below the critical value from the standard normal distribution, we fail to reject the null hypothesis; otherwise we reject it in favor of the alternative. That is, where $\alpha$ is the significance level: