SPSS Online Guide
Chi-Square with SPSS
The chi-square goodness of fit test and test for independence are both available in SPSS. Recall that chi-square is useful for analyzing whether a frequency distribution for a categorical or nominal variable is consistent with expectations (a goodness of fit test), or whether two categorical or nominal variables are related or associated with each other (a test for independence). Categorical or nominal variables assign values by virtue of membership in a category. Sex is a nominal variable. It can take on two values, male and female, which are usually coded numerically as 1 or 2. These numerical codes do not give any information about how much of some characteristic the individual possesses. Instead, the numbers merely provide information about the category to which the individual belongs. Other examples of nominal or categorical variables include hair color, race, diagnosis (e.g., ADHD vs. anxiety vs. depression vs. chemically dependent), and type of treatment (e.g., medication vs. behavior management vs. none). Note that these are the same type of variables that can be used as independent variables in a t-test or ANOVA. In the latter analyses, the researcher is interested in the means of another variable measured on an interval or ratio scale. In chi-square, the interest is in the frequency with which individuals fall into each category or combination of categories.
Chi-Square Test for Goodness of Fit
A chi-square test for goodness of fit can be requested by clicking Statistics > Nonparametric Tests > Chi-square. This opens a dialog very similar to those for other tests. Enter the variable to be tested into the Test Variable box. Next, decide on the expected values against which the observed frequencies will be tested. The most common choice is "All categories equal." However, it is also possible to enter specific expected values by checking the other circle and entering the expected values in order. The expected values used in computing the chi-square will be proportional to these values. The Options... button provides access to missing value options and descriptive statistics for each variable. To submit the analysis, click the OK button. Results for a goodness of fit chi-square are shown below.
[The data were taken from the previous ANOVA example.]
[The "residual" is just the difference between the observed and expected frequency.]
[Warning: Using the Chi-Square statistic is questionable here because all four cells have expected frequencies less than 5. See your statistics textbook for advice if you are in this situation.]
a. 4 cells (100.0%) have expected frequencies less than 5. The minimum expected cell frequency is 3.0.
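The statistic SPSS reports here can be reproduced by hand. The sketch below uses hypothetical observed counts (not the actual data) summing to 12 across four categories, so that, as in the output above, every expected frequency is 3 and the low-expected-frequency warning applies.

```python
# Chi-square goodness-of-fit statistic computed by hand.
# Hypothetical observed counts for four categories (not the actual data).
observed = [5, 2, 1, 4]

# "All categories equal": each expected frequency is n / k.
n = sum(observed)        # total number of cases (12)
k = len(observed)        # number of categories (4)
expected = [n / k] * k   # [3.0, 3.0, 3.0, 3.0]

# Chi-square = sum over cells of (observed - expected)^2 / expected.
# The (observed - expected) difference is the "residual" in the output.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

print(f"chi-square = {chi_square:.3f} with df = {df}")
```

With df = 3, the .05 critical value of chi-square is 7.815, so these hypothetical counts would not yield a statistically significant result.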
Chi-Square Test for Independence
The chi-square test for independence is a test of whether two categorical variables are associated with each other. For example, imagine that a survey of approximately 200 individuals has been conducted and that 120 of these people are females and 80 are males. Now, assume that the survey includes information about each individual's major in college. To keep the example simple, assume that each person is either a psychology or a biology major. It might be asked whether males and females choose these two majors at about the same rate, or whether one major has a different proportion of each sex than the other. The table below shows the case where males and females tend to be about equally represented in the two majors. In this case college major is independent of sex. Note that the percentage of females in psychology and biology is 59.8 and 60.2, respectively. Another way to characterize these data is to say that sex and major are independent of each other because the proportion of males and females remains the same for both majors.
The next example shows the same problem with a different result. In this example, the proportion of males and females depends upon the major. Females make up 79.6 percent of psychology majors but only 39.2 percent of biology majors. Clearly, the proportion of each sex is different for each major. Another way to state this is to say that choice of major is strongly related to sex, assuming that the example represents a statistically significant finding. It is possible to represent the strength of this relationship with a coefficient of association such as the contingency coefficient or Phi. These coefficients are similar to the Pearson correlation and are interpreted in roughly the same way.
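The chi-square statistic and both association coefficients can be computed directly from a crosstabulation. The sketch below uses hypothetical cell counts chosen to reproduce the percentages in the text (females are 79.6 percent of psychology majors and 39.2 percent of biology majors); the counts themselves are assumptions, not data from the example tables.

```python
from math import sqrt

# Hypothetical 2 x 2 crosstab (counts chosen to match the percentages
# in the text, not taken from the actual example tables).
#            psychology  biology
table = [[78, 40],   # female
         [20, 62]]   # male

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (observed - expected) ** 2 / expected

# Strength-of-association coefficients:
phi = sqrt(chi_square / n)                         # Phi (2 x 2 tables)
contingency = sqrt(chi_square / (chi_square + n))  # contingency coefficient

print(f"chi-square = {chi_square:.2f}")
print(f"Phi = {phi:.2f}, contingency coefficient = {contingency:.2f}")
```

As with a Pearson correlation, values of Phi closer to 1 indicate a stronger relationship; here the hypothetical counts give a Phi of about .41, a substantial association between sex and major.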
The method for obtaining a chi-square test for independence is a little tricky. Begin by clicking Statistics > Summarize > Crosstabs.... Transfer the variables to be analyzed to the Row(s) and Column(s) boxes. Then go to the Statistics... button and check the Chi-square box and anything that looks interesting in the Nominal Data box, followed by the Continue button. Next, click the Cells... button and check any needed descriptive information. Percentages are particularly useful for interpreting the data. Finally, click OK and the output will quickly appear.
Sample results are shown below. These data are from the ANOVA example so the number of observations in each cell is only two. This is a problematic situation for chi-square analysis and, should this be encountered in an actual analysis, consulting a textbook is recommended. Furthermore, the results are far from significant because the distribution of sex across class remains constant.
Case Processing Summary
The "Case Processing Summary" provides some basic information about the analysis. In studies with large numbers of participants, this information can be very useful.
Sex * Class Crosstabulation
Note: The above results can be obtained by requesting all the available percentages in the cross-tabulation. In this simple example, the percentages are not very useful. However, when large numbers of participants are in the design, the percentages help greatly in understanding the pattern of the results. Also, when the analysis is presented in a research report, the percentages within one of the variables will help the reader interpret the results.
a. 8 cells (100.0%) have expected count less than 5. The minimum expected count is 2.00.
The values for "Sig" are probabilities. A statistically significant result has a probability of less than .05.
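The low-expected-count footnote can be checked directly from the marginal totals. This sketch assumes a hypothetical 2 x 4 sex-by-class crosstab with two observations per cell, as in the ANOVA-based example above, and flags every cell whose expected count falls below 5.

```python
# Hypothetical sex (2) x class (4) crosstab with two cases per cell,
# mirroring the small ANOVA-based example above.
table = [[2, 2, 2, 2],   # female
         [2, 2, 2, 2]]   # male

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

low = sum(1 for row in expected for e in row if e < 5)
cells = len(table) * len(table[0])
minimum = min(min(row) for row in expected)

print(f"{low} cells ({100 * low / cells:.1f}%) have expected count "
      f"less than 5. The minimum expected count is {minimum:.2f}.")
```

This reproduces footnote a above: all 8 cells have expected counts below 5, with a minimum of 2.00, which is why the chi-square result should be interpreted with caution.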
Other Helpful Features of SPSS
There are a number of additional features available in SPSS that can be extremely helpful for the beginning researcher. These features will be described briefly.