navidi_monk_essential_statistics_1e_ch1_3

Section 3.2 Measures of Spread

Another way to say this is that the temperatures in St. Louis are more spread out than the temperatures in San Francisco. The dotplots in Figure 3.5 illustrate the difference in spread.

[Dotplots of the monthly temperatures for San Francisco and St. Louis, each on an axis running from 30 to 80]

Figure 3.5 The monthly temperatures for St. Louis are more spread out than those for San Francisco.

The mean does not tell us anything about how spread out the data are; it only gives us a measure of the center. It is clear that the mean by itself is not adequate to describe a data set. We must also have a way to describe the amount of spread. Dotplots allow us to visualize the spread, but we need a numerical summary to measure it precisely.

Objective 1 Compute the range of a data set

The Range

The simplest measure of the spread of a data set is the range.

DEFINITION
The range of a data set is the difference between the largest value and the smallest value.

Range = Largest value − Smallest value

EXAMPLE 3.10 Compute the range of a data set

Compute the range of the temperature data for San Francisco and for St. Louis, and interpret the results.

Solution
The largest value for San Francisco is 63 and the smallest is 51. The range for San Francisco is 63 − 51 = 12. The largest value for St. Louis is 79 and the smallest is 30. The range for St. Louis is 79 − 30 = 49. The range is much larger for St. Louis, which indicates that the spread in the temperatures is much greater there.

Although the range is easy to compute, it is not often used in practice. The reason is that the range involves only two values from the data set: the largest and the smallest. The measures of spread that are most often used are the variance and the standard deviation, which use every value in the data set.

Objective 2 Compute the variance of a population and a sample

The Variance

When a data set has a small amount of spread, like the San Francisco temperatures, most of the values will be close to the mean.
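
As a quick numerical check of Example 3.10, the range calculation can be sketched in a few lines of Python. The function name data_range and the two short lists are assumptions made for illustration: each list holds only the largest and smallest values quoted in the solution, not the full set of monthly temperatures.

```python
def data_range(values):
    """Return the range: largest value minus smallest value."""
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


# Assumption: only the extremes quoted in Example 3.10 are listed here,
# not the full monthly temperature data.
san_francisco_extremes = [51, 63]
st_louis_extremes = [30, 79]

print(data_range(san_francisco_extremes))  # 63 - 51 = 12
print(data_range(st_louis_extremes))       # 79 - 30 = 49
```

Because the function looks only at the maximum and the minimum, adding any values that lie between the extremes leaves the result unchanged, which is the limitation of the range noted above.
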
When a data set has a larger amount of spread, more of the data values will be far from the mean. The variance is a measure of how far the values in a data set are from the mean, on the average.

We will describe how to compute the variance of a population. The difference between a population value, x, and the population mean, μ, is x − μ. This difference is called a deviation. Values less than the mean will have negative deviations, and values greater than the mean will have positive deviations. If we were simply to add the deviations, the positive and the negative ones would cancel out. So we square the deviations to make them all positive. Data sets with a lot of spread will have many large squared

