Chapter 7 Part 1 - PowerPoint PPT Presentation

Slides: 65
Provided by: john77
Transcript and Presenter's Notes

Title: Chapter 7 Part 1


1
Chapter 7 -Part 1
  • Correlation

2
Correlation Topics
  • Correlational research: what is it, and how do
    you do it?
  • The three questions:
  • Is it a linear or curvilinear correlation?
  • Is it a positive or negative relationship?
  • How strong is the relationship?
  • Answering these questions with t scores and r,
    the estimated correlation coefficient derived
    from the tX and tY scores of individuals in a
    random sample.

3
Correlational research: how to start.
  • To begin a correlational study, we select a
    population or, far more frequently, select a
    random sample from a population.
  • (Since we use samples most of the time, for the
    most part, we will use the formulae and symbols
    for computing a correlation from a sample.)
  • We then obtain two scores from each individual,
    one score on each of two variables. (These are
    usually variables that we think might be related
    to each other for interesting reasons.) We call
    one variable X and the other Y.

4
Correlational research: comparing tX and tY scores
  • We translate the raw scores on the X variable to
    t scores (called tX scores) and raw scores on the
    Y variable to tY scores.
  • So each individual has a pair of scores, a tX
    score and a tY score.
  • You determine how similar or different the tX and
    tY scores in the pairs are, on the average, by
    subtracting tY from tX in each pair, squaring
    the differences, summing them, and averaging.

5
The estimated correlation coefficient, Pearson's r
  • With a simple formula, you transform the average
    squared difference between the t scores into
    Pearson's correlation coefficient, r.
  • Pearson's r indicates (with a single number)
    both the direction and strength of the
    relationship between the two variables in your
    sample.
  • r also estimates the correlation in the
    population from which the sample was drawn.
  • In Ch. 8, you will learn when you can use r that
    way.

6
Going from pairs of raw scores to r: Linearity,
a preliminary question.
  • Once you have scores on two variables, you
    ask, "Is this a linear or curvilinear
    relationship?"
  • Psychology is a relatively new science and this
    is an intro stat course.
  • For both reasons, you will only learn how to deal
    with linear relationships between two variables
    and save correlation with three or more variables
    and curvilinear relationships for grad school.
  • BUT YOU MUST KNOW WHAT A LINEAR RELATIONSHIP IS,
    AND HOW TO RECOGNIZE A NONLINEAR (CURVILINEAR)
    CORRELATION.

7
Linearity vs. Curvilinearity
  • In a linear relationship, as scores on one
    variable go from low to high, scores on the
    other variable either generally increase or
    generally decrease.
  • In a curvilinear relationship, as scores on one
    variable go from low to high, scores on the
    other variable change direction. They can go
    1.) down and then up, 2.) up and then down, 3.)
    up and down and then up again, 4.) up or down
    and then flat, etc.

8
Examples of linear relationships.
  • For example, think of the relationship of the
    size of a pleasure boat (X) and its cost (Y).
  • As one variable (boat size) increases, scores
    on the other variable (cost) also increase.
  • Another example of a linear relationship: the
    relationship between the size of a car and the
    number of miles per gallon it gets.
  • In general, as cars get gradually larger (X),
    they tend to get fewer miles per gallon (Y).

9
A curvilinear relationship
  • In a curvilinear relationship, as scores on the X
    variable go gradually from low to high, the Y
    variable changes direction.
  • For example, think of the relationship between
    age (X) and height (Y).
  • As age increases from 0-14 or so, height
    increases also.
  • But then people stop growing. As age increases,
    height stays the same.
  • Thus the Y variable, height, changes direction.
    It goes from gradually rising to flat.
  • If you graph age and height, the best fitting
    line is a curved line.

10
Correlation Characteristics: Which line best
shows the relationship between age (X) and height
(Y)?
Linear vs. Curvilinear
11
Another nonlinear relationship: shortstops and
linemen. Great shortstops may be too small to be
great football linemen.
Football potential (in the order listed on the
slide): Terrible, Average, Average, Very Good,
Excellent, Good, Poor
Is this a linear relationship?
12
Plot the dots!
  • To check whether a relationship is linear, make a
    graph and place the scores on it.
  • That's what I mean by "Plot the dots."
  • If you really want to know what is going on with
    data, Plot the dots!
  • Here is a graph for the baseball skills and
    football potential data.
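
As a rough illustration of plotting the dots without any graphing software, the sketch below draws a text scatterplot. The function name `text_scatter` and the two lists of scores are made up for this example (they are not the values from the slide); the scores are simply chosen so that Y rises and then falls.

```python
def text_scatter(xs, ys, width=21, height=11):
    """Return a list of text rows with '*' marking each (x, y) pair."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each score to a column/row index; "or 1" guards a zero range.
        col = round((x - x_min) / ((x_max - x_min) or 1) * (width - 1))
        row = round((y - y_min) / ((y_max - y_min) or 1) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return ["".join(r) for r in grid]


# Hypothetical ratings for seven players (illustrative only).
baseball = [1, 2, 3, 4, 5, 6, 7]
football = [1, 3, 5, 6, 5, 3, 1]

rows = text_scatter(baseball, football)
print("\n".join(rows))
```

The printed dots climb toward the middle and then drop, which is the kind of pattern a straight line cannot describe.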

13
When you plot the dots, is this linear?
[Scatterplot of Baseball Skill and Football Skill,
with one labeled dot each for Al, Ben, Chuck,
David, Ed, Frank, and George.]
NO! It is best described by a curved line. It is
a curvilinear relationship!
14
After you know a correlation is linear, there are
two other questions: the direction and strength of
the correlation. But first, a definition of high
and low scores.
  • Definition of high and low scores
  • High scores are scores above the mean of a
    variable. They are represented by positive t
    scores.
  • Low scores are scores below the mean of a
    variable. They are represented by negative t
    scores.

15
Positive relationships
  • In a positive relationship, as X scores gradually
    increase, Y scores tend to increase as well.
    Example: The longer a sailboat is, the more it
    tends to cost. As length goes up, price tends to
    go up.
  • In a positive correlation, X and Y scores tend to
    be on the same side of their respective means.
  • As a result, the tX and tY scores tend to be
    similar and the difference between them
    (tX - tY) tends to be small.
  • Since (tX - tY) is small, the squared difference
    between them, (tX - tY)², also tends to be small.

16
(No Transcript)
17
Graphing a positive relationship.
  • In a positive correlation high scores on X tend
    to go with high scores on Y. On a graph, as the
    line runs from left to right, scores increase on
    the X axis. At the same time, Y scores also
    generally get higher. So, the line will tend to
    rise as it runs.
  • Remember from math, slope equals how far a line
    rises on the Y axis for each unit it moves from
    left to right or runs along the X axis.
  • If a line rises from left to right, rise is
    positive. Run is always positive. So a positive
    rise divided by an (always) positive run results
    in a positive slope. (That's why we call it a
    positive correlation.)

18
Positive vs Negative scatterplot
19
Graphic display of a strong POSITIVE correlation.
20
Negative relationships
  • In a negative relationship, as X scores gradually
    increase, Y scores tend to decrease. Example:
    The more years a sailboat is used, the less it
    tends to cost. As use goes up, price tends to go
    down.
  • In a negative correlation, X and Y scores tend to
    be on opposite sides of their respective means.
  • As a result, the tX and tY scores tend to be
    dissimilar and the difference between them
    (tX - tY) tends to be large.
  • Since (tX - tY) is large, the squared difference
    between them, (tX - tY)², also tends to be large.

21
(No Transcript)
22
Graphing a negative relationship
  • In a negative correlation, high scores on X tend
    to go with low scores on Y. On a graph, as the
    line runs from left to right, scores increase on
    the X axis. At the same time, Y scores get lower.
    So, the line will tend to fall as it runs.
  • Remember from math, slope equals how far a line
    rises on the Y axis for each unit it moves from
    left to right or runs along the X axis.
  • If a line falls from left to right, rise is
    negative. Run is always positive. So a negative
    rise divided by an (always) positive run results
    in a negative slope. (That's why we call it a
    negative correlation.)

23
Positive vs Negative scatterplot
24
Summary
  • When t scores are consistently more similar than
    different, we have a positive correlation. On a
    graph the dots will rise from your left to your
    right.
  • When t scores are consistently more different
    than similar, we have a negative correlation. On
    a graph the dots will fall from your left to your
    right.

25
Positive vs Negative scatterplot
26
How strong is the relationship between the tX and
tY scores?
  • Here the question is about the consistency with
    which tX and tY scores are either similar or
    dissimilar.

27
t scores: sign and size
  • There are two aspects to the consistency of the
    relationship between tX and tY scores.
  • First, are the t scores consistently of the same
    sign (positive correlation) or opposite signs
    (negative correlation).
  • If they are almost always one way or the other,
    you have at least a moderately strong
    relationship.
  • On the other hand, if you sometimes see t scores
    on the same side of the mean and sometimes on
    opposite sides, you have a relatively weak
    correlation.
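
The sign check described above can be sketched as a simple count. The function name `same_sign_fraction` and the five pairs of t scores are hypothetical, chosen only to illustrate the idea; skipping pairs with a score exactly at the mean is also an assumption of this sketch.

```python
def same_sign_fraction(t_x, t_y):
    """Fraction of pairs whose t scores fall on the same side of the mean.

    Pairs in which either score is exactly 0 (at the mean) are skipped.
    """
    same = opposite = 0
    for a, b in zip(t_x, t_y):
        if a == 0 or b == 0:
            continue
        if (a > 0) == (b > 0):
            same += 1
        else:
            opposite += 1
    counted = same + opposite
    return same / counted if counted else float("nan")


# Illustrative t scores for five individuals (not from the slides).
t_x = [-1.2, -0.5, 0.3, 0.6, 0.8]
t_y = [-0.9, -0.7, 0.4, -0.2, 1.4]

fraction = same_sign_fraction(t_x, t_y)
print(fraction)
```

A fraction near 1 points toward a positive correlation, a fraction near 0 toward a negative one, and a fraction near one half toward a weak or absent relationship.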

28
t scores: sign and size
  • If there is a consistent pattern of same signed t
    scores (positive correlation) or a consistent
    pattern of opposite signed t scores (negative
    correlation), then whether the tX and tY scores
    are about the same distance from the mean comes
    into play.
  • The large majority of t scores (usually well over
    95%) range from -2.50 to +2.50.
  • Given a consistent positive or negative
    correlation, the more similar in size the t
    scores, the stronger the correlation.

29
Positive correlations
  • Perfect: tX and tY scores are all the same sign
    and are identical in size.
  • Strong: tX and tY scores are almost all the same
    sign and are fairly similar in size.
  • Moderate: tX and tY scores are predominantly the
    same sign. This is especially true for pairs in
    which one of the values is one or more standard
    deviations from the mean. Size may be fairly
    dissimilar.
  • Weak: tX and tY scores are a little more often
    the same sign than opposite in sign. Nothing can
    be said about size.

30
Negative correlations
  • Perfect: tX and tY scores are all of opposite
    sign and are identical in size.
  • Strong: tX and tY scores are almost all of
    opposite sign and are fairly similar in size.
  • Moderate: tX and tY scores are predominantly
    opposite in sign. This is especially true for
    pairs in which one of the values is one or more
    standard deviations from the mean. Size may be
    fairly dissimilar.
  • Weak: tX and tY scores are a little more often of
    opposite signs than the same sign. Nothing can
    be said about size.

31
Unrelated (independent) variables
  • When the size and sign of the tX scores bear no
    relationship to the size and sign of the tY
    scores, the variables are unrelated.
  • We also can call the variables "independent of"
    or "orthogonal to" each other. The three terms,
    unrelated, independent, and orthogonal, are
    synonymous in this context.

32
Graphing it on t axes: The strength of a
relationship tells us approximately how the dots
representing pairs of t scores will fall around a
best fitting line.
  • Perfect - scores fall exactly on a straight line
    whose slope will be +1.00 or -1.00.
  • Strong - most scores fall near the line, whose
    slope will be close to .750 or -.750.
  • Moderate - some are near the line, some not. The
    slope of the line will be close to .500 or -.500.

33
Graphing it on t axes: The strength of a
relationship tells us approximately how the dots
representing pairs of t scores will fall around a
best fitting line.
  • Weak - some scores fall fairly close to the line,
    but others fall quite far from it. The slope of
    the line will be close to .250 or -.250.
  • Independent - the scores are not close to the
    line and form a circular or square pattern. The
    best fitting line will be the X axis, a line with
    a slope of 0.000.
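
One way to see where these slope values come from: when both variables are expressed as t scores, the least-squares line through the dots has a slope equal to r. The sketch below checks this numerically. The helper names `t_scores` and `slope_on_t_axes` are illustrative, and the raw scores are the X and Y data from the worked example at the end of this chapter.

```python
from math import sqrt


def t_scores(scores):
    """Convert raw scores to t scores using the sample mean and s (n - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    s = sqrt(sum((v - mean) ** 2 for v in scores) / (n - 1))
    return [(v - mean) / s for v in scores]


def slope_on_t_axes(xs, ys):
    """Least-squares slope of tY on tX (the line passes through the origin)."""
    t_x, t_y = t_scores(xs), t_scores(ys)
    return sum(a * b for a, b in zip(t_x, t_y)) / sum(a * a for a in t_x)


x = [2, 4, 6, 8, 10]
y = [9, 11, 10, 12, 13]

slope = slope_on_t_axes(x, y)
print(round(slope, 3))
```

For these data the slope comes out at about .900, the same value the worked example reports for r.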

34
Strength of a relationship
35
Strength of a relationship
36
Strength of a relationship
Moderate
37
Strength of a relationship
38
What is this relationship?
39
What is this?
40
What is this?
41
What is this?
42
Computing the correlation coefficient.
43
Comparing apples to oranges? Use Z or t scores!
  • You can use correlation to look for the
    relationship between ANY two values that you can
    measure on a single subject.
  • However, there may not be any relationship (the
    variables may be independent).
  • A correlation tells us if scores are consistently
    similar on two measures, consistently different
    from each other, or have no real pattern.

44
Comparing apples to oranges? Use t scores!
  • To compare scores on two different variables, you
    transform them into ZX and ZY scores if you are
    studying a population or tX and tY scores if you
    have a sample.
  • ZX and ZY scores (or tX and tY scores) can be
    directly compared to each other to see whether
    they are consistently similar, consistently quite
    different, or show no consistent pattern of
    similarity or difference

45
Comparing variables
  • Anxiety symptoms, e.g., heartbeat, with number of
    hours driving to class.
  • Hat size with drawing ability.
  • Math ability with verbal ability.
  • Number of children with IQ.
  • Turn them all into Z or t scores

46
Pearson's Correlation Coefficient
  • coefficient - noun, a number that serves as a
    measure of some property.
  • The correlation coefficient indexes BOTH the
    consistency and direction of a correlation with a
    single number

47
Pearson's rho
  • Pearson's rho (ρ) is the parameter that
    characterizes the strength and direction of a
    linear relationship (and only a linear
    relationship) between two variables. To compute
    rho, you must have the entire population. Then
    you can compute sigma, mu, Z scores and rho.
  • The formula: rho = 1 - (1/2)[Σ(ZX - ZY)² / NP],
    where NP is the number of pairs of Z scores in
    the population.
  • In English: The correlation coefficient equals 1
    minus half the average squared distance between
    the Z scores.
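
A minimal sketch of the population formula, assuming the whole population is available as two equal-length lists. The function names `z_scores` and `rho` and the tiny four-member "populations" are illustrative only.

```python
from math import sqrt


def z_scores(population):
    """Z scores using mu and sigma (sigma divides by N, not N - 1)."""
    n = len(population)
    mu = sum(population) / n
    sigma = sqrt(sum((v - mu) ** 2 for v in population) / n)
    return [(v - mu) / sigma for v in population]


def rho(xs, ys):
    """rho = 1 - (1/2) * (average squared distance between ZX and ZY)."""
    z_x, z_y = z_scores(xs), z_scores(ys)
    avg_sq_dist = sum((a - b) ** 2 for a, b in zip(z_x, z_y)) / len(z_x)
    return 1 - 0.5 * avg_sq_dist


# Made-up populations: Y rises with X in one, falls with X in the other.
rising = rho([1, 2, 3, 4], [2, 4, 6, 8])
falling = rho([1, 2, 3, 4], [8, 6, 4, 2])
print(round(rising, 3), round(falling, 3))
```

The two calls give values at the two ends of the scale, one for a perfectly rising pattern and one for a perfectly falling pattern.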

48
Pearson's rho
  • When you have a perfect positive correlation, the
    Z scores will be identical in size and sign. So
    the average squared distance will be zero and
    rho = 1.000 - (1/2)(0.000) = 1.000.
  • When you have a perfect negative correlation, the
    Z scores will be identical in size and opposite
    in sign. It can be proven algebraically that the
    average squared distance in that case will be
    4.000, so rho = 1.000 - (1/2)(4.000) = -1.000.
  • When you have two totally independent variables,
    the average squared distance will be 2.000
    (halfway between 0.000 and 4.000). Thus,
    rho = 1.000 - (1/2)(2.000) = 0.000.
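
The values 4.000 and 2.000 can be checked with a short derivation. This is a sketch in the population notation used on this slide; it relies only on the fact that the squared Z scores on any variable average to 1.

```latex
\begin{align*}
\frac{\sum Z_X^2}{N_P} &= \frac{\sum Z_Y^2}{N_P} = 1
  && \text{(Z scores have an average square of 1)} \\[4pt]
\frac{\sum (Z_X - Z_Y)^2}{N_P}
  &= \frac{\sum Z_X^2}{N_P} + \frac{\sum Z_Y^2}{N_P}
     - 2\,\frac{\sum Z_X Z_Y}{N_P}
   = 2 - 2\,\frac{\sum Z_X Z_Y}{N_P} \\[4pt]
Z_Y = Z_X &\;\Rightarrow\; \frac{\sum Z_X Z_Y}{N_P} = 1
  \;\Rightarrow\; \text{average squared distance} = 0 \\
Z_Y = -Z_X &\;\Rightarrow\; \frac{\sum Z_X Z_Y}{N_P} = -1
  \;\Rightarrow\; \text{average squared distance} = 4 \\
\frac{\sum Z_X Z_Y}{N_P} = 0 &\;\Rightarrow\; \text{average squared distance} = 2
\end{align*}
```

The last line is the unrelated case: the positive and negative cross-products cancel, so their average is zero.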

49
Pearson's Correlation Coefficient
  • Thus, rho varies from -1.000 (perfect negative
    correlation) to 0.000 (independent variables) to
    1.000 (perfect positive correlation).
  • A negative value indicates a negative
    relationship; a positive value indicates a
    positive relationship.
  • Values of rho close to 1.000 or -1.000 indicate a
    strong (consistent) relationship; values close
    to 0.000 indicate a weak (inconsistent) or
    independent relationship.

50
Estimating rho with r
  • Computing rho involves finding the actual average
    squared distance between the ZX and ZY scores in
    the whole population.
  • In computing r, we are estimating rho.

51
The formula for r
  • Pearson's r is a least squares, unbiased estimate
    of rho, based on the relationships found between
    tX and tY scores in a random sample.
  • r = 1 - (1/2)[Σ(tX - tY)² / (nP - 1)], where
    nP - 1 equals one less than the number of pairs
    of t scores in the sample.
  • In English: Pearson's r equals 1.000 minus half
    the estimated average squared difference between
    the Z scores in the population, based on squared
    differences between the t scores in the sample.

52
Look at those formulae again.
  • rho = 1 - (1/2)[Σ(ZX - ZY)² / NP], where NP is
    the number of pairs of Z scores in the population.
  • Σ(ZX - ZY)² / NP is the average squared
    distance between the Z scores.
  • The rest of the formula simply transforms the
    average squared distance between the Z scores
    into a variable that goes from -1.000 to 1.000.

53
Look at those formulae again.
  • r = 1 - (1/2)[Σ(tX - tY)² / (nP - 1)], where
    nP - 1 equals one less than the number of pairs
    of t scores in the sample.
  • REMEMBER, t scores are estimated Z scores.
  • Σ(tX - tY)² / (nP - 1) is a least squares,
    unbiased estimate of the average squared
    difference between the Z scores in the population
    based on the differences between the tX and tY
    scores in a random sample.
  • The rest of the formula simply transforms the
    estimated average squared distance between the Z
    scores into a variable that goes from -1.000 to
    1.000.

54
Thus, r, the least squares, unbiased estimate of
rho, is basically an estimate of the average
squared difference between the ZX and ZY scores
in the population, transformed into a variable
that goes from -1.00 to 1.00.
55
Similarities of r and rho
  • r and rho vary from -1.000 to 1.000.
  • For both r and rho, a negative value indicates a
    negative relationship; a positive value indicates
    a positive relationship.
  • Values of r or rho close to 1.000 or -1.000
    indicate a strong (consistent) relationship;
    values close to 0.000 indicate a weak
    (inconsistent) or independent relationship.

56
Since we almost always are studying random
samples, not populations, we almost always
compute Pearson's r, not Pearson's rho.
57
r, strength and direction
Perfect, positive: 1.00
Strong, positive: .75
Moderate, positive: .50
Weak, positive: .25
Independent: .00
Weak, negative: -.25
Moderate, negative: -.50
Strong, negative: -.75
Perfect, negative: -1.00
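
As a rough aid for reading the benchmark values on this slide, the sketch below attaches a verbal label to a value of r. The function name `describe_r` and the "nearest benchmark" rule are assumptions made for this example; the slides give only the benchmark values themselves, and "perfect" is reserved here for exactly 1.00 or -1.00.

```python
# Benchmarks from the slide, excluding "perfect" (handled separately below).
BENCHMARKS = [
    (0.75, "strong"),
    (0.50, "moderate"),
    (0.25, "weak"),
    (0.00, "independent"),
]


def describe_r(r):
    """Label r by the nearest benchmark (an assumed rule of thumb)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    direction = "positive" if r > 0 else "negative"
    if abs(r) == 1.0:
        return f"perfect, {direction}"
    _, label = min(BENCHMARKS, key=lambda b: abs(abs(r) - b[0]))
    if label == "independent":
        return label
    return f"{label}, {direction}"


print(describe_r(0.90))
print(describe_r(-0.45))
```

Labels like these are only a convenience; the value of r itself carries the information.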
58
Calculating Pearson's r
  • Select a random sample from a population; obtain
    scores on two variables, which we will call X and
    Y.
  • Convert all the scores into t scores.

59
Calculating Pearson's r
  • First, subtract the tY score from the tX score in
    each pair.
  • Then square all of the differences and add them
    up, that is, Σ(tX - tY)².

60
Calculating Pearson's r
  • Estimate the average squared distance between ZX
    and ZY by dividing the sum of squared
    differences between the t scores by (nP - 1):
    Σ(tX - tY)² / (nP - 1)
  • To turn this estimate into Pearson's r, use the
    formula r = 1 - (1/2)[Σ(tX - tY)² / (nP - 1)]

61
Example: Calculate t scores for X
DATA: 2, 4, 6, 8, 10
MSW = 40.00/(5-1) = 10.00
sX = 3.16
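
The arithmetic behind this slide can be sketched step by step. The variable names are illustrative; the data and the divisor (n - 1) are the ones shown on the slide.

```python
from math import sqrt

x = [2, 4, 6, 8, 10]
n = len(x)

mean_x = sum(x) / n                        # mean of the X scores
ss_x = sum((v - mean_x) ** 2 for v in x)   # sum of squared deviations
msw_x = ss_x / (n - 1)                     # estimated variance (MSW)
s_x = sqrt(msw_x)                          # estimated standard deviation
t_x = [(v - mean_x) / s_x for v in x]      # one t score per raw score

print(ss_x, msw_x, round(s_x, 2))
print([round(t, 2) for t in t_x])
```

The same steps applied to the Y scores give the tY values used in the next part of the example.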
62
Calculate t scores for Y
DATA: 9, 11, 10, 12, 13
MSW = 10.00/(5-1) = 2.50
sY = 1.58
63
Calculate r
tX: -1.26 -0.63 0.00 0.63 1.26
tY: -1.26 0.00 -0.63 0.63 1.26
tX - tY: 0.00 -0.63 0.63 0.00 0.00
(tX - tY)²: 0.00 0.40 0.40 0.00 0.00
Σ(tX - tY)² / (nP - 1) = 0.80 / 4 = 0.200
r = 1.000 - (1/2)[Σ(tX - tY)² / (nP - 1)]
r = 1.000 - (1/2)(0.200)
r = 1.000 - 0.100 = 0.900
This is a very strong, positive relationship.
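
The whole calculation can be sketched in a few lines, using the X and Y data from this example. The function names are illustrative. As a cross-check, the sketch also computes r with the conventional deviation-product formula, which does not appear in these slides but should give the same value.

```python
from math import sqrt


def t_scores(scores):
    """Convert raw scores to t scores using the sample mean and s (n - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    s = sqrt(sum((v - mean) ** 2 for v in scores) / (n - 1))
    return [(v - mean) / s for v in scores]


def pearson_r(xs, ys):
    """r = 1 - (1/2) * sum((tX - tY)^2) / (nP - 1), as in the slides."""
    t_x, t_y = t_scores(xs), t_scores(ys)
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(t_x, t_y))
    return 1 - 0.5 * sum_sq_diff / (len(xs) - 1)


def pearson_r_products(xs, ys):
    """Cross-check: sum of deviation products over sqrt(SSx * SSy)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    ss_x = sum((a - mx) ** 2 for a in xs)
    ss_y = sum((b - my) ** 2 for b in ys)
    return sp / sqrt(ss_x * ss_y)


x = [2, 4, 6, 8, 10]
y = [9, 11, 10, 12, 13]

r = pearson_r(x, y)
r_check = pearson_r_products(x, y)
print(round(r, 3), round(r_check, 3))
```

Both routes agree at about .900, matching the hand calculation above.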
64
By the way - True graphs.
  • Ch.7 has true graphs, displays in which each dot
    stands for a score on two (in this case) or more
    (in more advanced cases) variables.
  • In Ch. 1 through Ch. 6, most of the figures have
    represented the frequency of scores on a single
    variable.
  • Formally, displays of frequencies are figures,
    but they are not graphs.