Factorial ANOVA with SPSS

 

Factorial analysis of variance is an extension of the one-way analysis. The difference is that a factorial analysis has more than one independent (grouping) variable. For example, a study could be designed to simultaneously assess the relationship of sex (male vs. female) and class (first-year, sophomore, junior, senior) to current college GPA. College GPA would be the dependent variable, and sex and class would be the two independent or grouping variables. Factorial ANOVA procedures can become problematic because of the complexity of the assumptions involved and the variety of methods available for computing the analysis. As a general rule, beginning researchers should try to have the same number of participants in each cell of the design. When the cells have unequal numbers of participants, the danger of violating assumptions is greatest and the available methods for computing the analysis differ the most. However, even when the cells contain unequal numbers of participants, the default options (those automatically selected by SPSS) will provide good solutions.

As an example, consider the following data, which represent the results of a fictional study of the relationship of sex (male vs. female) and class (first-year, sophomore, junior, senior) to cumulative college GPA. Although these data are fictional, the means are representative of prior studies. The minimum information needed to analyze such data is three pieces of information about each participant: GPA, sex, and class. In preparing the data for analysis by hand, they might be arranged as shown below. Note that each cell contains the data for two individuals, which keeps the analysis straightforward and easy to follow. This presentation makes it easy to see to which group each participant's GPA belongs, but it is not the way the data need to be entered into the SPSS data input window.

 

                                      Class

             First-year (1)   Sophomore (2)   Junior (3)   Senior (4)
Female (1)   3.2; 3.1         3.3; 3.4        3.2; 3.3     3.3; 3.2
Male (2)     2.8; 2.9         3.3; 3.0        3.1; 3.2     3.2; 3.1

 

To prepare these data for analysis with SPSS, they need to be arranged differently. The information about each participant (GPA, sex, class) needs to be arranged horizontally. The result will be sixteen rows of data, one row for each participant. Reserving the first column for a participant identification number may also be helpful, especially when it is necessary to look up data on the original data forms. The data below have been arranged in the same order in which they appear in the above table so it is easy to see how the coding was done. This ordering is not necessary because SPSS (or any statistical package, for that matter) will place each participant in the correct group based upon the coding of the Sex and Class variables. For example, Participant #1 will be classified as a first-year female based upon the values for Sex (1) and Class (1). Similarly, Participant #4 will be classified as a sophomore female based upon the values for Sex (1) and Class (2). In more advanced analyses, each classification could be based upon several independent variables instead of only two.

To begin the analysis with SPSS, the data would be entered into a Newdata spreadsheet in the manner shown below. Adding variable names and variable labels will make the output much easier to interpret. These tasks can be accomplished using the procedures described at the beginning of the handout. The next task is to request that the analysis of variance be performed.

 

Participant #   GPA   Sex   Class
#1              3.2   1     1
#2              3.1   1     1
#3              3.3   1     2
#4              3.4   1     2
#5              3.2   1     3
#6              3.3   1     3
#7              3.3   1     4
#8              3.2   1     4
#9              2.8   2     1
#10             2.9   2     1
#11             3.3   2     2
#12             3.0   2     2
#13             3.1   2     3
#14             3.2   2     3
#15             3.2   2     4
#16             3.1   2     4
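As a rough illustration of how a statistics package uses the Sex and Class codes to place each participant in a cell, the sketch below stores the same sixteen rows in Python and groups the GPAs by their (Sex, Class) codes. Python is used here only for illustration and is not part of SPSS; the names rows and cells are assumptions made for this example.

```python
from collections import defaultdict

# One row per participant: (participant number, GPA, sex code, class code).
# Sex: 1 = female, 2 = male. Class: 1 = first-year, 2 = sophomore,
# 3 = junior, 4 = senior.
rows = [
    (1, 3.2, 1, 1), (2, 3.1, 1, 1), (3, 3.3, 1, 2), (4, 3.4, 1, 2),
    (5, 3.2, 1, 3), (6, 3.3, 1, 3), (7, 3.3, 1, 4), (8, 3.2, 1, 4),
    (9, 2.8, 2, 1), (10, 2.9, 2, 1), (11, 3.3, 2, 2), (12, 3.0, 2, 2),
    (13, 3.1, 2, 3), (14, 3.2, 2, 3), (15, 3.2, 2, 4), (16, 3.1, 2, 4),
]

# Group the GPAs by cell. The order of the rows does not matter;
# only the Sex and Class codes decide which cell a GPA belongs to.
cells = defaultdict(list)
for participant, gpa, sex, klass in rows:
    cells[(sex, klass)].append(gpa)

for (sex, klass), gpas in sorted(cells.items()):
    print(f"Sex={sex} Class={klass}: {gpas}")
```

Shuffling the rows before grouping would give the same eight cells, which is why the row order in the data window is only a convenience.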

 

To do the analysis, select Statistics > General Linear Model > GLM - General Factorial. An input window will appear. Highlight the dependent variable (GPA, in this case) and transfer it to the "Dependent" box, then highlight the independent or grouping variables and transfer them to the "Fixed Factor(s)" box. At this point the OK button will be available. Clicking it will produce the desired analysis using the default settings that the SPSS program provides.

At this point, an interesting choice needs to be made. Clicking the Options... button reveals that four Methods (Type I, Type II, Type III, and Type IV) are available for calculating the sums of squares for the analysis of variance. The default option is Type III, which should not be changed. The Enter Covariates and Maximum Interactions boxes can be ignored, although these are useful options for complex studies. The Options... window may also be used to request means and frequency counts. As you become more skilled in data analysis, the Help button or a click of the right mouse button can be used to obtain more information about a procedure and what it means. Annotated results for the above example are shown below.


Univariate Analysis Of Variance

Between-Subjects Factors

 

                 N
Sex     1.00     8
        2.00     8
Class   1.00     4
        2.00     4
        3.00     4
        4.00     4

The above table simply shows the number of individuals in each of the conditions.

 

                                Descriptive Statistics

SEX     CLASS   Mean     Std. Dev.   N
1.00    1.00    3.1500   7.071E-02    2
        2.00    3.3500   7.071E-02    2
        3.00    3.2500   7.071E-02    2
        4.00    3.2500   7.071E-02    2
        Total   3.2500   9.258E-02    8
2.00    1.00    2.8500   7.071E-02    2
        2.00    3.1500   .2121        2
        3.00    3.1500   7.071E-02    2
        4.00    3.1500   7.071E-02    2
        Total   3.0750   .1669        8
Total   1.00    3.0000   .1826        4
        2.00    3.2500   .1732        4
        3.00    3.2000   8.165E-02    4
        4.00    3.2000   8.165E-02    4
        Total   3.1625   .1586       16

Although this table may appear a bit complicated at first, it is easy to understand. The two columns on the left indicate the condition or group for each row of data. For example, the first mean in the table (3.15) is the mean for first-year females because the data were coded by having a "1" for Sex indicate females and a "1" for Class indicate first-year students. Of course, unless variable labels are used, the researcher must remember how the data were coded in order to interpret the table. The second mean (3.35) is for female sophomores. The third mean (3.25) is for female juniors and the fourth mean (3.25) is for female seniors. The fifth or Total mean (3.25) is the mean for all females in the study. The same interpretation applies to the five following means except that they are for males. The designation "Total" in the column labeled "Sex" marks the means for all individuals in each Class. For example, the mean for Sex = "Total" and Class = "1.00" is the mean of all four first-year students. Finally, the mean for Sex = "Total" and Class = "Total" is the mean of all 16 individuals in the study.
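The cell, marginal ("Total"), and grand means described above can be checked by hand or with a few lines of code. The following is a minimal sketch in Python, not SPSS output; the helper group_mean is a hypothetical name introduced for this illustration, and passing None for a code plays the role of "Total" in the table.

```python
from statistics import mean

# (GPA, sex code, class code) for the sixteen fictional participants.
data = [
    (3.2, 1, 1), (3.1, 1, 1), (3.3, 1, 2), (3.4, 1, 2),
    (3.2, 1, 3), (3.3, 1, 3), (3.3, 1, 4), (3.2, 1, 4),
    (2.8, 2, 1), (2.9, 2, 1), (3.3, 2, 2), (3.0, 2, 2),
    (3.1, 2, 3), (3.2, 2, 3), (3.2, 2, 4), (3.1, 2, 4),
]

def group_mean(sex=None, klass=None):
    """Mean GPA of the rows matching the given codes (None means 'Total')."""
    values = [gpa for gpa, s, c in data
              if (sex is None or s == sex) and (klass is None or c == klass)]
    return mean(values)

print("First-year females:", round(group_mean(sex=1, klass=1), 4))
print("All females:       ", round(group_mean(sex=1), 4))
print("All first-years:   ", round(group_mean(klass=1), 4))
print("Everyone:          ", round(group_mean(), 4))
```

The four printed values correspond to the table rows Sex 1.00 / Class 1.00, Sex 1.00 / Total, Total / Class 1.00, and Total / Total.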

The standard deviations are interpreted in the same way as the means. The unusual notation for some standard deviation values is standard scientific notation. The "E-02" that follows some values indicates that the decimal point should be shifted two places to the left to read the number. For example, the number 8.165E-02 stands for .08165.
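Most programming languages read this notation directly, which offers a quick way to check a conversion. A small Python illustration (not part of SPSS):

```python
# "E-02" means "times 10 to the power -2", i.e. shift the decimal
# point two places to the left.
value = float("8.165E-02")
print(f"{value:.5f}")

# The same rule applied by hand gives the same number.
by_hand = 8.165 * 10 ** -2
print(f"{by_hand:.5f}")
```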

 

                                            Tests of Between-Subjects Effects

Source            Type III Sum of Squares   DF   Mean Square   F           Sig of F   Eta Squared
Corrected Model   .298                       7   4.250E-02     4.250       .030       .788
Intercept         160.023                    1   160.023       16002.250   .000       1.000
SEX               .122                       1   .122          12.250      .008       .605
CLASS             .147                       3   4.917E-02     4.917       .032       .648
SEX * CLASS       2.750E-02                  3   9.167E-03     .917        .475       .256
Error             8.000E-02                  8   1.000E-02
Total             160.400                   16   .010
Corrected Total   .377                      15   .025

 

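The Source, Type III Sum of Squares, DF, Mean Square, F, and Sig[nificance] of F columns provide information that can be interpreted as described in your textbook. Eta Squared is a measure of effect size, or the magnitude of the effect. It is a squared measure of association and has an interpretation similar to a squared correlation coefficient: it describes the degree of association between the independent and dependent variables. The values printed in this table are partial eta squared, that is, each effect's sum of squares divided by the sum of that effect's and the Error sum of squares. For SEX, for example, .1225 / (.1225 + .0800) = .605.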
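The Eta Squared column can be reproduced from the sums of squares if each effect is compared with the Error term, which is the partial form of eta squared. The sketch below is an illustration in Python under that assumption. The four-decimal sums of squares are recomputed from the example data rather than copied from the rounded SPSS output, and the function name partial_eta_squared is hypothetical.

```python
# Sums of squares carried to four decimals to limit rounding error
# (recomputed from the sixteen GPAs, not read from the SPSS table).
ss_error = 0.0800
ss_effects = {"SEX": 0.1225, "CLASS": 0.1475, "SEX * CLASS": 0.0275}

def partial_eta_squared(ss_effect, ss_err):
    """Effect sum of squares as a share of effect plus error sums of squares."""
    return ss_effect / (ss_effect + ss_err)

for name, ss in ss_effects.items():
    print(f"{name}: {partial_eta_squared(ss, ss_error):.3f}")
```

Because each effect is compared only with the error term, these values need not add up to 1 across the effects.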

When the number of individuals per cell or condition is equal (also called a "balanced" design), as in this example, the Type III Sum of Squares for SEX, CLASS, the SEX * CLASS interaction, and the Corrected Total will correspond to the "classic" computational method described in most introductory textbooks. When the cells or conditions contain different numbers of individuals, the Type III sums of squares will differ from the "classic" computations. These differences are necessary adjustments for the unbalanced nature of the design. Another issue that arises in an unbalanced design (unequal numbers of participants in the cells) is that even the main effect (Total) or marginal means may be distorted unless adjustments are made. The adjusted marginal means may be requested by clicking the Options button and checking the appropriate boxes. These issues will be covered in advanced statistics courses.
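For readers who want to see the "classic" balanced-design computation, the following is a minimal sketch in Python using only the standard library. It is not how SPSS computes Type III sums of squares internally; it is the textbook method, which agrees with the SPSS results only because every cell here holds the same number of participants. The helper m and the variable names are assumptions made for this illustration.

```python
from statistics import mean

# (GPA, sex code, class code) for the sixteen fictional participants.
data = [
    (3.2, 1, 1), (3.1, 1, 1), (3.3, 1, 2), (3.4, 1, 2),
    (3.2, 1, 3), (3.3, 1, 3), (3.3, 1, 4), (3.2, 1, 4),
    (2.8, 2, 1), (2.9, 2, 1), (3.3, 2, 2), (3.0, 2, 2),
    (3.1, 2, 3), (3.2, 2, 3), (3.2, 2, 4), (3.1, 2, 4),
]
sexes, classes, n_cell = [1, 2], [1, 2, 3, 4], 2

def m(sex=None, klass=None):
    """Mean GPA of the rows matching the codes (None = all levels)."""
    return mean(g for g, s, c in data
                if sex in (None, s) and klass in (None, c))

grand = m()

# Main effects: squared deviation of each marginal mean from the grand
# mean, weighted by the number of participants behind that mean.
ss_sex = sum(len(classes) * n_cell * (m(sex=s) - grand) ** 2 for s in sexes)
ss_class = sum(len(sexes) * n_cell * (m(klass=c) - grand) ** 2 for c in classes)

# Interaction: between-cell variation left after removing both main effects.
ss_cells = sum(n_cell * (m(s, c) - grand) ** 2 for s in sexes for c in classes)
ss_inter = ss_cells - ss_sex - ss_class

# Error: variation of each GPA around its own cell mean.
ss_error = sum((g - m(s, c)) ** 2 for g, s, c in data)

df_sex, df_class = len(sexes) - 1, len(classes) - 1
df_inter = df_sex * df_class
df_error = len(data) - len(sexes) * len(classes)
ms_error = ss_error / df_error

effects = [("SEX", ss_sex, df_sex), ("CLASS", ss_class, df_class),
           ("SEX * CLASS", ss_inter, df_inter)]
f_values = {name: (ss / df) / ms_error for name, ss, df in effects}

for name, ss, df in effects:
    print(f"{name:12s} SS={ss:.4f}  df={df}  F={f_values[name]:.3f}")
print(f"{'Error':12s} SS={ss_error:.4f}  df={df_error}")
```

The printed sums of squares and F ratios can be compared line by line with the SEX, CLASS, SEX * CLASS, and Error rows of the Tests of Between-Subjects Effects table. With unequal cell sizes this simple subtraction for the interaction is no longer appropriate, which is where the Type I through Type IV methods begin to differ.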

 

 

 


Copyright 2000 The McGraw-Hill Companies. All rights reserved. Any use is subject to the Terms of Use and Privacy Policy.