So far, you have been reading about statistics that describe sets of data. In many research studies, psychologists want to know the extent to which two variables are related. Correlational statistics do just that. They yield a number called the coefficient of correlation, which may be positive or negative and may vary from -1.00 to +1.00.
In a positive correlation, scores on two different variables increase and decrease together. For example, there is a positive correlation between high school GPA and freshman GPA in college. A student who has an above-average high school GPA is likely to have an above-average college freshman GPA. Likewise, a student who has a below-average high school GPA is likely to have a below-average college freshman GPA. Such relationships are referred to as positive correlations or direct relationships.
In a negative correlation, as scores on one variable increase, scores on the other variable decrease. For example, there is a negative correlation between absenteeism and course performance. If a student has more absences than average, we would expect their course performance to be below average, and vice versa. This type of relationship is referred to as inverse.
The sign (positive or negative) of the correlation coefficient describes the nature or direction of the relationship that exists between two variables (direct or inverse). The strength of a correlation is determined by how close its absolute value comes to 1. For example, a correlation of -.72 indicates an inverse relationship between two variables and shows a stronger relationship than a correlation of +.53, because .72 is closer to 1 than .53 is.
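To make the sign and strength of a coefficient concrete, here is a minimal sketch of how one might be computed. It assumes the coefficient in question is the Pearson product-moment coefficient (the text does not name a specific formula), and the scores below are made up for illustration, not taken from any study.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient for two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations: positive when scores rise and fall together.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable has no variability")
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical scores for five students (illustrative only).
hs_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
college_gpa = [1.8, 2.6, 2.7, 3.4, 3.6]
absences = [1, 3, 5, 7, 9]
course_score = [92, 85, 80, 70, 66]

r_gpa = pearson_r(hs_gpa, college_gpa)     # direct relationship: positive sign
r_abs = pearson_r(absences, course_score)  # inverse relationship: negative sign

# Strength is judged by absolute value, not by sign.
stronger = "absences" if abs(r_abs) > abs(r_gpa) else "gpa"
print(f"r (GPA) = {r_gpa:+.2f}, r (absences) = {r_abs:+.2f}, stronger: {stronger}")
```

In this made-up data set the negative coefficient is the stronger of the two, which mirrors the point that direction and strength are separate properties of a correlation.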
Correlational statistics are important because they permit us to determine the strength and direction of the relationship between different sets of data or to predict scores on one variable based on our knowledge of scores on another. If the correlation between two sets of data were a perfect 1.00 (or -1.00), we could predict a score on one variable from a score on the other variable with complete accuracy. But because correlations are almost always less than perfect, we predict scores on one variable from scores on another variable only with a particular probability of being correct: the closer the correlation is to 1.00 or -1.00, the higher the probability.
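One common way to turn a correlation into a prediction is simple linear regression, sketched below. This is an assumption on our part, since the text does not say which prediction method it has in mind. The function names and all numbers are hypothetical.

```python
def predict_z(r, z_x):
    """Predicted standard (z) score on the criterion, given r and the predictor's z score."""
    # In standardized form the regression slope equals the correlation coefficient.
    return r * z_x

def predict_score(r, x, mean_x, sd_x, mean_y, sd_y):
    """Predicted raw score on the criterion from a raw score on the predictor."""
    z_x = (x - mean_x) / sd_x
    return mean_y + predict_z(r, z_x) * sd_y

def variance_explained(r):
    """Proportion of criterion variance accounted for by the predictor (r squared)."""
    return r ** 2

# With a perfect correlation, the prediction sits exactly as far from the mean
# as the predictor score does. With a weaker one, it is pulled toward the mean.
print(predict_z(1.00, 2.0))
print(predict_z(0.50, 2.0))

# Hypothetical example: predicting freshman GPA from a high school GPA of 3.5.
print(predict_score(r=0.6, x=3.5, mean_x=3.0, sd_x=0.5, mean_y=2.8, sd_y=0.6))
print(variance_explained(0.6))
```

The weaker the correlation, the closer every prediction falls to the criterion's mean, which is one way of seeing why predictions from imperfect correlations carry uncertainty.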
It cannot be stressed strongly enough that correlation does not mean causation. For example, years ago, authorities presumed that autism, a condition marked by poor social and communication skills, was caused by "refrigerator mothers." Mothers of autistic children were observed to be aloof from them, and this was taken as a sign that the children suffered from mothers who were emotionally cold. Knowing that this is simply a correlation, you might wonder whether causality ran in the opposite direction. Perhaps autistic children, who do not respond to their mothers, cause their mothers to become aloof from them. Moreover, why would a mother have several non-autistic children, then an autistic child, and then several more non-autistic ones? It would be difficult to believe she was a warm parent to all but one. Today, evidence indicates that autism is a neurological condition that has nothing to do with the mother’s emotionality.
As another example, although there is a positive correlation between smoking and cancer in human beings, this correlation by itself is not scientifically acceptable evidence that smoking causes cancer. Perhaps another factor (such as a level of stress tolerance) might make someone prone to both smoking and cancer, without smoking necessarily causing cancer. Of course, correlation does not imply the absence of causation. For example, there may indeed be a causal relationship between smoking and cancer. The point is that if two variables measured in a correlational study are strongly correlated, one of the variables may cause the other, or there may not be a causal link. We just cannot tell for sure based on data from a correlational study. But remember that knowing that two variables are related is still an important piece of information. It allows us to make predictions more accurately than would otherwise be the case.
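The third-factor possibility can be illustrated with a small simulation. The sketch below uses entirely artificial random numbers, not real health data. Two variables are each built from a shared third factor plus independent noise, so neither one influences the other, yet they still come out correlated. The function names, sample size, and seed are arbitrary choices made for this illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def simulate_third_factor(n=2000, seed=0):
    """Build two variables that both depend on a third factor but not on each other."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    third = [rng.gauss(0, 1) for _ in range(n)]
    var_a = [t + rng.gauss(0, 1) for t in third]  # third factor + independent noise
    var_b = [t + rng.gauss(0, 1) for t in third]  # third factor + independent noise
    return third, var_a, var_b

third, var_a, var_b = simulate_third_factor()

# var_a and var_b are correlated even though neither causes the other.
r_ab = pearson_r(var_a, var_b)

# Subtracting the third factor's contribution leaves only the independent noise,
# and the correlation largely disappears.
resid_a = [a - t for a, t in zip(var_a, third)]
resid_b = [b - t for b, t in zip(var_b, third)]
r_resid = pearson_r(resid_a, resid_b)

print(f"r between A and B: {r_ab:.2f}")
print(f"r after removing the third factor: {r_resid:.2f}")
```

A correlational study that measured only the two variables would see the first coefficient and have no way of knowing whether a hidden factor like this one was responsible.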
Many studies that use the correlational technique to analyze their data are not true experiments (that is, the researcher does not control the assignment of participants to different values of the independent variable). To reduce the temptation to infer causality between the variables under study, the independent variable is often referred to as the predictor and the dependent variable is referred to as the criterion when a correlational design is used.
Learning Check #15:
Many studies have determined that there is a positive correlation between viewing
violence on television and violent behavioral patterns. What does this mean?
Learning Check #16:
Given that there is a positive correlation between viewing violence on television
and violent behavior, can we conclude from these data that watching violence
on television causes children to behave violently?
Learning Check #17:
Researchers used to believe that there was a negative correlation between age
and IQ. Recently, this correlation has turned out to be much weaker than we
originally thought. Describe what is meant by a negative correlation between
age and IQ.