
navidi_monk_essential_statistics_1e_ch1_3

Chapter 3 Numerical Summaries of Data

Solution

The calculations are shown in Table 3.5.

Step 1: Compute the sample mean:

$$\bar{x} = \frac{3 + 4 + 6 + 5 + 4 + 2}{6} = 4$$

CAUTION: Don't round off the value of $\bar{x}$ when computing the sample variance.

Step 2: Subtract $\bar{x}$ from each value to obtain the deviations $x - \bar{x}$. These calculations are shown in the second column of Table 3.5.

Step 3: Square the deviations. These calculations are shown in the third column of Table 3.5.

Step 4: Sum the squared deviations to obtain

$$\sum (x - \bar{x})^2 = 10$$

Step 5: The sample size is $n = 6$. Divide the sum obtained in Step 4 by $n - 1$ to obtain the sample variance $s^2$.

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{10}{6 - 1} = 2$$

Table 3.5 Calculations for the Sample Variance in Example 3.12

| $x$ | $x - \bar{x}$ | $(x - \bar{x})^2$ |
|-----|---------------|-------------------|
| 3   | $-1$          | $(-1)^2 = 1$      |
| 4   | $0$           | $0^2 = 0$         |
| 6   | $2$           | $2^2 = 4$         |
| 5   | $1$           | $1^2 = 1$         |
| 4   | $0$           | $0^2 = 0$         |
| 2   | $-2$          | $(-2)^2 = 4$      |

$\bar{x} = 4$, $\sum (x - \bar{x})^2 = 10$, $s^2 = \frac{10}{6 - 1} = 2$

Explain It Again
Degrees of freedom and sample size: The number of degrees of freedom for the sample variance is one less than the sample size.

Why do we divide by $n - 1$ rather than $n$?

It is natural to wonder why we divide by $n - 1$ rather than $n$ when computing the sample variance. When computing the sample variance, we use the sample mean to compute the deviations $x - \bar{x}$. For the population variance, we use the population mean for the deviations $x - \mu$. Now it turns out that the deviations using the sample mean tend to be a bit smaller than the deviations using the population mean. If we were to divide by $n$ when computing a sample variance, the value would tend to be a bit smaller than the population variance. It can be shown mathematically that the appropriate correction is to divide the sum of the squared deviations by $n - 1$ rather than $n$.

The quantity $n - 1$ is sometimes called the degrees of freedom for the sample standard deviation. The reason is that the deviations $x - \bar{x}$ will always sum to 0. Thus, if we know the first $n - 1$ deviations, we can compute the $n$th one.
For example, if our sample consists of the four numbers 2, 4, 9, and 13, the sample mean is

$$\bar{x} = \frac{2 + 4 + 9 + 13}{4} = 7$$

The first three deviations are

$$2 - 7 = -5, \quad 4 - 7 = -3, \quad 9 - 7 = 2$$

The sum of the first three deviations is

$$-5 + (-3) + 2 = -6$$
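The arithmetic in this section can be checked with a short Python sketch. It is only an illustration, not part of the text: the function name `sample_variance` and the variable names are assumptions chosen for this example. The first part follows the five steps of Example 3.12. The second part uses the four-number sample above to show that, because the deviations sum to 0, the last deviation can be recovered from the first three.

```python
from statistics import variance  # standard library; divides by n - 1


def sample_variance(values):
    """Sample variance computed with the five steps of Example 3.12."""
    n = len(values)
    mean = sum(values) / n                     # Step 1: sample mean, not rounded
    deviations = [x - mean for x in values]    # Step 2: deviations x - mean
    squared = [d ** 2 for d in deviations]     # Step 3: squared deviations
    total = sum(squared)                       # Step 4: sum of squared deviations
    return total / (n - 1)                     # Step 5: divide by n - 1


# Data from Example 3.12
example_3_12 = [3, 4, 6, 5, 4, 2]
s2 = sample_variance(example_3_12)
print("sample variance:", s2)

# Degrees of freedom: the deviations from the sample mean sum to 0,
# so the first n - 1 deviations determine the nth one.
sample = [2, 4, 9, 13]
mean = sum(sample) / len(sample)
first_three = [x - mean for x in sample[:3]]
last_from_constraint = -sum(first_three)   # forced by the sum-to-zero rule
last_direct = sample[-1] - mean            # computed directly as 13 - mean
print("first three deviations:", first_three)
print("fourth deviation:", last_from_constraint, "(direct:", last_direct, ")")
```

The standard library function `statistics.variance` also divides by $n - 1$, so `variance(example_3_12)` should agree with the step-by-step result.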

