
navidi_monk_essential_statistics_1e_ch1_3

Section 3.3 Measures of Position

of standard deviations from the mean, we can easily interpret the z-score for bell-shaped populations.

z-Scores and the Empirical Rule

When a population has a histogram that is approximately bell-shaped, then
• Approximately 68% of the data will have z-scores between −1 and 1.
• Approximately 95% of the data will have z-scores between −2 and 2.
• All, or almost all, of the data will have z-scores between −3 and 3.

The z-score is less useful for populations that are not bell-shaped. For example, in some skewed populations there will be no values with z-scores greater than 1, while in others, values with z-scores greater than 1 occur frequently. We can’t be sure how to interpret z-scores when the population is skewed. It is best, therefore, to use z-scores only for populations that are approximately bell-shaped. See Exercise 39 for an illustration.

Objective 2 Compute the percentiles of a data set

Percentiles

The weather in Los Angeles is dry most of the time, but it can be quite rainy in the winter. The rainiest month of the year is February. Table 3.9 presents the annual rainfall in Los Angeles, in inches, for each February from 1965 to 2006.

Table 3.9 Annual Rainfall in Los Angeles During February

Year  Rainfall    Year  Rainfall    Year  Rainfall    Year  Rainfall
1965    0.23      1976    3.71      1987    1.22      1998   13.68
1966    1.51      1977    0.17      1988    1.72      1999    0.56
1967    0.11      1978    8.91      1989    1.90      2000    5.54
1968    0.49      1979    3.06      1990    3.12      2001    8.87
1969    8.03      1980   12.75      1991    4.13      2002    0.29
1970    2.58      1981    1.48      1992    7.96      2003    4.64
1971    0.67      1982    0.70      1993    6.61      2004    4.89
1972    0.13      1983    4.37      1994    3.21      2005   11.02
1973    7.89      1984    0.00      1995    1.30      2006    2.37
1974    0.14      1985    2.84      1996    4.94
1975    3.54      1986    6.10      1997    0.08

There is a lot of spread in the amount of rainfall in Los Angeles in February. For example, in 1984 there was no measurable rain at all, while in 1998 it rained more than 13 inches.
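As a rough illustration of the z-score percentages quoted for the Empirical Rule, the following Python sketch computes z-scores for the Table 3.9 rainfall values and counts how many fall between −1 and 1, −2 and 2, and −3 and 3. The names `rainfall`, `z_scores`, and `fraction_within` are our own, and treating the 42 values as a population (so that `statistics.pstdev` applies) is an assumption rather than something the text specifies. These data are not obviously bell-shaped, so the percentages need not match the 68%, 95%, and "almost all" figures.

```python
import statistics

# February rainfall in Los Angeles (inches), 1965-2006, transcribed from Table 3.9.
rainfall = [
    0.23, 1.51, 0.11, 0.49, 8.03, 2.58, 0.67, 0.13, 7.89, 0.14, 3.54,    # 1965-1975
    3.71, 0.17, 8.91, 3.06, 12.75, 1.48, 0.70, 4.37, 0.00, 2.84, 6.10,   # 1976-1986
    1.22, 1.72, 1.90, 3.12, 4.13, 7.96, 6.61, 3.21, 1.30, 4.94, 0.08,    # 1987-1997
    13.68, 0.56, 5.54, 8.87, 0.29, 4.64, 4.89, 11.02, 2.37,              # 1998-2006
]

mean = statistics.fmean(rainfall)
# Assumption: the 42 values are treated as a whole population,
# so the population standard deviation is used.
sd = statistics.pstdev(rainfall)

# z-score: how many standard deviations a value lies from the mean.
z_scores = [(x - mean) / sd for x in rainfall]


def fraction_within(zs, k):
    """Return the fraction of z-scores lying between -k and k (inclusive)."""
    return sum(1 for z in zs if -k <= z <= k) / len(zs)


for k in (1, 2, 3):
    print(f"z-scores between -{k} and {k}: {fraction_within(z_scores, k):.0%}")

print(f"smallest z-score: {min(z_scores):.2f}, largest z-score: {max(z_scores):.2f}")
```

Comparing the smallest and largest z-scores printed at the end gives a quick sense of whether the data are symmetric about the mean, which is one informal check on whether the Empirical Rule is likely to apply.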
In Section 3.1, we learned how to compute the mean and median of a data set, which describe the center of a distribution. For data sets like the Los Angeles rainfall data, which exhibit a lot of spread, it is useful to compute measures of positions other than the center, to get a more detailed description of the distribution. Percentiles provide a way to do this. Percentiles divide a data set into hundredths.

DEFINITION
For a number p between 1 and 99, the pth percentile separates the lowest p% of the data from the highest (100 − p)%.

There are several methods for computing percentiles, all of which give similar results. We present a fairly straightforward method here.
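The definition can be tried out on the Table 3.9 rainfall values with a short Python sketch. It relies on `statistics.quantiles` from the Python standard library, which is one of the several available methods and is not necessarily the method this text goes on to present, so results may differ slightly. The helper names `percentile` and `percent_at_or_below` are our own, introduced only for illustration.

```python
import statistics

# February rainfall in Los Angeles (inches), 1965-2006, transcribed from Table 3.9.
rainfall = [
    0.23, 1.51, 0.11, 0.49, 8.03, 2.58, 0.67, 0.13, 7.89, 0.14, 3.54,    # 1965-1975
    3.71, 0.17, 8.91, 3.06, 12.75, 1.48, 0.70, 4.37, 0.00, 2.84, 6.10,   # 1976-1986
    1.22, 1.72, 1.90, 3.12, 4.13, 7.96, 6.61, 3.21, 1.30, 4.94, 0.08,    # 1987-1997
    13.68, 0.56, 5.54, 8.87, 0.29, 4.64, 4.89, 11.02, 2.37,              # 1998-2006
]

# With n=100, quantiles() returns the 99 cut points that divide the data
# into hundredths; cuts[p - 1] is the cut point for the pth percentile.
# Assumption: Python's default interpolation method stands in for the
# textbook's own procedure.
cuts = statistics.quantiles(rainfall, n=100)


def percentile(p):
    """Return the pth percentile cut point for an integer p from 1 to 99."""
    if not 1 <= p <= 99:
        raise ValueError("p must be between 1 and 99")
    return cuts[p - 1]


def percent_at_or_below(data, value):
    """Return the percentage of data values less than or equal to value."""
    return 100 * sum(1 for x in data if x <= value) / len(data)


for p in (25, 50, 90):
    value = percentile(p)
    share = percent_at_or_below(rainfall, value)
    print(f"percentile {p}: {value:.2f} in. ({share:.1f}% of values at or below)")
```

The second number printed on each line checks the definition directly: the share of values at or below the pth percentile should be close to p%, with small discrepancies expected because the data set has only 42 values.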

