Title: Chapter Eight
1 Chapter Eight
- Correlation and Prediction
PowerPoint Presentation created by Dr. Susan R. Burns, Morningside College
2 Correlation
- Correlation is the extent to which two variables are related.
- If the two variables are highly related, then knowing the value of one of them will allow you to predict the other variable with considerable accuracy.
- The less highly related the variables, the less accurately you can predict one when you know the other.
3 The Nature of Correlation
- Often used as a means for prediction, correlation tells us how related two variables are.
- However, note that even though two variables may be highly correlated, you should not assume that one variable causes the other.
- CORRELATION DOES NOT IMPLY CAUSATION.
- For example, there is the third-variable possibility (i.e., there may be one or more additional variables causing the two things you are investigating to be related to each other).
There's a significant NEGATIVE correlation between the number of mules and the number of academics in a state, but remember, correlation is not causation.
4 The Scatterplot: Graphing Correlations
- Also known as the scatter diagram, the scatterplot allows us to see the relation between two variables.
- One variable is plotted on the ordinate (the vertical, Y axis) and the other on the abscissa (the horizontal, X axis).
- Although you can place either variable on either axis, it is common to place the variable you are attempting to predict on the ordinate.
- Positive correlations occur when both variables move in the same direction (e.g., as SAT scores increase, so too do GPAs).
- Negative correlations occur when, as one variable increases, the other decreases (e.g., as age increases, the number of speeding tickets decreases).
5 The Scatterplot: Graphing Correlations
6 The Pearson Product-Moment Correlation Coefficient
- The correlation coefficient is the single number that represents the degree of relation between two variables.
- The Pearson Product-Moment Correlation Coefficient (symbolized by r) is the most common measure of correlation; researchers calculate it when both the X variable and the Y variable are interval or ratio scale measurements. Mathematically, it can be defined as the average of the cross-products of z-scores.
- The raw score formula for r is:
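The raw-score formula is missing from this transcript. As a hedged sketch, the Python below computes r two ways: from the definition given on this slide (the mean of the cross-products of z-scores, using standard deviations that divide by N) and from one standard textbook raw-score form, r = (NΣXY - ΣXΣY) / sqrt([NΣX² - (ΣX)²][NΣY² - (ΣY)²]). The function names and the data are made up for illustration, and the slide's own notation may have differed.

```python
import math

def pearson_r_zscores(xs, ys):
    """r as the mean of the cross-products of z-scores (SDs divide by N)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / n

def pearson_r_raw(xs, ys):
    """r from raw scores: sums of X, Y, XY, X squared, and Y squared."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical scores, invented for this example only.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [2, 4, 5, 4, 5]

r_from_z = pearson_r_zscores(hours_studied, exam_score)
r_from_raw = pearson_r_raw(hours_studied, exam_score)
print(round(r_from_z, 4), round(r_from_raw, 4))
```

Both functions should agree up to floating-point rounding, which is a quick check that the raw-score form is only an algebraic rearrangement of the z-score definition.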
8 The Range of r Values
- Correlation coefficients (r) can range in value from -1.00 to +1.00.
- A correlation of -1.00 indicates a perfect negative correlation between the two variables of interest. That is, whenever there is an increase of one unit in one variable, there is always the same proportional decrease in the other variable.
9 The Range of r Values
- Correlation coefficients (r) can range in value from -1.00 to +1.00.
- A zero correlation means there is little or no linear relation between the two variables. That is, as scores on one variable increase, scores on the other variable may increase, decrease, or not change at all.
10 The Range of r Values
- Correlation coefficients (r) can range in value from -1.00 to +1.00.
- A perfect positive correlation has a value of +1.00: whenever we see an increase of one unit in one variable, we always see the same proportional increase in the other variable.
- The existence of a perfect correlation indicates there are no other factors present that influence the relation we are measuring. This situation rarely occurs in real life.
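The three landmark values of r described on these slides can be checked numerically. The sketch below assumes a standard raw-score formula for r and uses invented data sets: one perfectly increasing line, one perfectly decreasing line, and one pattern that rises and then falls. The function and variable names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson r via a standard raw-score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

x = [1, 2, 3, 4, 5]
perfect_positive = [3, 5, 7, 9, 11]   # y = 2x + 1: every unit of x adds 2 to y
perfect_negative = [7, 4, 1, -2, -5]  # y = 10 - 3x: every unit of x removes 3 from y
rises_then_falls = [1, 2, 3, 2, 1]    # increases, then decreases

r_pos = pearson_r(x, perfect_positive)
r_neg = pearson_r(x, perfect_negative)
r_zero = pearson_r(x, rises_then_falls)
print(r_pos, r_neg, r_zero)
```

The third data set illustrates the zero-correlation slide: as x increases, y first increases and then decreases, so there is no consistent linear trend for r to pick up.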
12 Interpreting Correlation Coefficients
- A statistically significant result is one that would occur rarely by chance.
- If the correlation you calculate is sufficiently large that it would occur rarely by chance, then you have reason to believe that these two variables are related.
- The standard by which significance is determined in psychology is the .05 level.
- That is, a result is significant when it would occur by chance 5 or fewer times out of a hundred.
- Researchers who are more cautious may choose to adopt a .01 level of significance.
13 Effect Size
- Even though statistical significance is an important component of psychological research, it may not tell us very much about the magnitude of our results.
- Effect size refers to the size or magnitude of the effect an independent variable (IV) produced in an experiment or the size or magnitude of a correlation.
- Effect size calculation is important because, unfortunately, a research result can be significant and yet the effect size may be quite small.
- This situation can arise because, as sample size gets larger, the critical value needed to achieve significance becomes smaller.
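One way to see the sample-size point, sketched under an assumption the slides do not state: a common significance test for r converts it to t = r * sqrt(N - 2) / sqrt(1 - r²), with N - 2 degrees of freedom. The slides do not name this test, and the function name and values below are illustrative only. Holding a small r fixed, the test statistic grows with N while the squared correlation does not change.

```python
import math

def t_for_r(r, n):
    """t statistic for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.10  # hypothetical small correlation
for n in (30, 100, 1000):
    # r squared stays at 0.01 in every row; only t changes with N.
    print(f"N={n:5d}  r^2={r**2:.2f}  t={t_for_r(r, n):.2f}")
```

Under these assumptions the same weak correlation produces a larger and larger t as N increases, which is why a large sample can yield a significant result even though the effect size stays small.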
14 Effect Size
- To calculate the effect size for the Pearson product-moment correlation, all you have to do is square the correlation coefficient.
- r² is known as the coefficient of determination.
- Multiply the coefficient of determination by 100 and you will see what percentage of the variance is accounted for by the correlation.
- The higher r² becomes, the more variance is accounted for by the relation between the two variables under study.
- Lower r² values indicate that factors other than the two variables of interest are influencing the relation in which we are interested.
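The arithmetic on this slide is short enough to sketch directly. The helper name and the example r values below are hypothetical; the calculation is just the squaring and multiplying by 100 described above.

```python
def percent_variance_explained(r):
    """Coefficient of determination (r squared) expressed as a percentage."""
    return (r ** 2) * 100

for r in (0.10, 0.50, -0.80):
    pct = percent_variance_explained(r)
    print(f"r = {r:+.2f} -> {pct:.0f}% of the variance accounted for")
```

Because r is squared, the sign drops out: a correlation of -.80 accounts for the same share of variance as a correlation of +.80.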
15 Prediction
- Generally speaking, regression refers to the prediction of one variable from our knowledge of another variable.
- We label the variable that is being predicted as the Y variable and refer to it as the criterion variable.
- We label the variable that we are predicting from as the X variable and refer to it as the predictor variable.
- In other words, we use X to predict Y.
16 Prediction
- The Regression Equation
- The regression equation is the statistical basis of prediction:
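The equation on this slide is missing from the transcript. Assuming the usual simple linear regression form, and using the a (Y intercept) and b (slope) that the following slides calculate, it can be written as:

```latex
\hat{Y} = a + bX
```

Here \hat{Y} is the predicted value of the criterion variable, X is the score on the predictor variable, b is the slope of the regression line, and a is the Y intercept. The hat notation is an assumption; some textbooks write Y' for the predicted value instead.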
17 The Regression Equation
- The regression line is a graphical display of the relation between the values on the predictor variable and predicted values on the criterion variable. It is similar to the scatterplot used to display correlations.
- The calculation of b can be done a couple of ways:
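The first formula is missing from this transcript, so which form the slide showed is unknown. One widely used way to obtain the slope, offered here as an assumption, starts from the correlation coefficient and the two standard deviations:

```latex
b = r \, \frac{s_Y}{s_X}
```

In this form s_Y and s_X are the standard deviations of the criterion and predictor variables. Both must be computed the same way (both dividing by N, or both by N - 1) for the ratio to be correct.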
18 The Regression Equation
- A second way to calculate b involves the following formula:
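This formula is also missing from the transcript. A standard raw-score form for the slope, given here as an assumption about what the slide may have shown, works directly from the sums of the scores:

```latex
b = \frac{N \sum XY - \sum X \sum Y}{N \sum X^{2} - \left( \sum X \right)^{2}}
```

N is the number of paired scores. The numerator matches the numerator of the common raw-score formula for r, so this form is convenient when those sums have already been computed.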
19 The Regression Equation
- The calculation of a is as follows:
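The formula for a is missing from this transcript. The standard least-squares intercept, written with the means My and Mx that the next slide refers to, is a = My - b * Mx. The sketch below assumes that form, computes b from deviation scores as b = Σ(X - Mx)(Y - My) / Σ(X - Mx)², and then uses the regression equation to predict Y. The function names and the data are hypothetical.

```python
def slope_and_intercept(xs, ys):
    """Return (a, b): the Y intercept and slope of the least-squares line."""
    n = len(xs)
    m_x = sum(xs) / n
    m_y = sum(ys) / n
    sum_cross = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - m_x) ** 2 for x in xs)
    b = sum_cross / sum_sq_x  # slope
    a = m_y - b * m_x         # intercept: a = My - b * Mx
    return a, b

def predict(x, a, b):
    """Predicted criterion score for predictor score x."""
    return a + b * x

# Hypothetical scores, invented for this example only.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [2, 4, 5, 4, 5]

a, b = slope_and_intercept(hours_studied, exam_score)
predicted = predict(6, a, b)
print(round(a, 2), round(b, 2), round(predicted, 2))
```

For these made-up scores the means are Mx = 3 and My = 4, so once b is known the intercept follows from a single subtraction.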
20 Constructing the Regression Line
- There are two ways to construct the regression line:
- First, you could calculate several predicted values, plot these values, connect the points, and then extend the line to the Y intercept. This procedure works, but carries with it the potential for calculation errors and inaccurate points.
- The second procedure is less likely to have calculation errors:
- Locate the Y intercept (a).
- Plot the point (Mx, My), the means of X and Y.
- The line that passes through these two points is the regression line.
- Remember, the steepness of the regression line is known as the slope, whereas the point at which this line crosses the vertical axis is called the Y intercept.
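The second procedure works because the least-squares line always passes through both (0, a) and (Mx, My). A minimal sketch, assuming hypothetical values a = 2.2, b = 0.6, Mx = 3, and My = 4 (not taken from the slides), recovers the slope from those two points alone:

```python
def slope_through_points(point_1, point_2):
    """Slope of the straight line through two (x, y) points with different x values."""
    (x1, y1), (x2, y2) = point_1, point_2
    return (y2 - y1) / (x2 - x1)

# Hypothetical regression results, invented for this example only.
a = 2.2           # Y intercept
b = 0.6           # slope from the regression calculation
m_x, m_y = 3, 4   # means of X and Y

intercept_point = (0, a)   # where the line crosses the vertical axis
means_point = (m_x, m_y)   # the point plotted from Mx and My

recovered_slope = slope_through_points(intercept_point, means_point)
on_line = abs((a + b * m_x) - m_y) < 1e-9  # does (Mx, My) satisfy Y = a + bX?
print(round(recovered_slope, 2), on_line)
```

The recovered slope matches b, so drawing the line through these two points gives the same line as plotting many predicted values. If Mx happens to be 0 the two points share the same X value, and a different second point would be needed.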
21 Constructing the Regression Line