Relations Between Two Variables - PowerPoint PPT Presentation

Provided by: Douglas

Transcript and Presenter's Notes



1
Relations Between Two Variables
Regression and Correlation
In both cases, y is a random variable beyond the
control of the experimenter. In the case of
correlation, x is also a random variable. In the
case of regression, x is treated as a fixed
variable. (As if there is no sampling error in
x.)
Regression: you wish to predict the value of y on the basis of the value of x.
Correlation: you wish to express the degree of the relation between x and y.
2
Scatter Diagram or Scatter Plot
X axis (abscissa): predictor variable
Y axis (ordinate): criterion variable
[Scatter plot panels: Positive, Negative, Perfect, None]
3
Covariance
is a number reflecting the degree to which two
variables vary or change in value together.
$\mathrm{cov}_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1}$
n = the number of xy pairs.
Consider an example of collecting RT (x) and error (y) scores, where a d score is a score's deviation from its mean.
If a subject is slow (high x) and accurate (low y), then the d score for x will be positive and the d score for y will be negative; their product will be negative.
If a subject is slow (high x) and inaccurate (high y), then the d score for x will be positive and the d score for y will be positive; their product will be positive.
If a subject is fast (low x) and accurate (low y), then the d score for x will be negative and the d score for y will be negative; their product will be positive.
If a subject is fast (low x) and inaccurate (high y), then the d score for x will be negative and the d score for y will be positive; their product will be negative.
4
Illustrative Trends
Sub.     x    d_x     y    d_y   d_x*d_y
1      100   -200    20     10     -2000
2      200   -100    15      5      -500
3      300      0    10      0         0
4      400    100     5     -5      -500
5      500    200     0    -10     -2000
Total                              -5000
Those subjects who are fast make more errors.

Sub.     x    d_x     y    d_y   d_x*d_y
1      100   -200     0    -10      2000
2      200   -100     5     -5       500
3      300      0    10      0         0
4      400    100    15      5       500
5      500    200    20     10      2000
Total                               5000
Those subjects who are fast make fewer errors.

Sub.     x    d_x     y    d_y   d_x*d_y
1      100   -200    10      0         0
2      200   -100     5     -5       500
3      300      0    20     10         0
4      400    100     5     -5      -500
5      500    200    10      0         0
Total                                  0
There is no trend.
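The three totals above can be checked with a short Python sketch. The function name `sum_cross_products` and the list names are illustrative choices, not part of the slides; the data are the x (RT) and y (error) columns from the tables.

```python
from statistics import mean

def sum_cross_products(xs, ys):
    """Sum of d_x * d_y, where each d score is a deviation from its own mean."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

rt = [100, 200, 300, 400, 500]          # x: reaction time
errors_more = [20, 15, 10, 5, 0]        # fast subjects make more errors
errors_fewer = [0, 5, 10, 15, 20]       # fast subjects make fewer errors
errors_flat = [10, 5, 20, 5, 10]        # no linear trend

totals = [sum_cross_products(rt, ys)
          for ys in (errors_more, errors_fewer, errors_flat)]
print(totals)  # the three slide totals: -5000, 5000, 0
```

A total of zero only says that the positive and negative cross-products cancel; it does not rule out a non-linear pattern in the data.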
5
Scatter plots of data from previous page.
We can see a trend after all.
[Scatter plots of the three data sets; x axis from 100 to 500]
6
Scale Issues
y in minutes (Min.):
x    d_x     y    d_y   d_x*d_y
1     -4     5     -8        32
3     -2    13      0         0
5      0     9     -4         0
7      2    17      4         8
9      4    21      8        32
Total                        72

The same y values in seconds (Sec.):
x    d_x      y    d_y   d_x*d_y
1     -4    300   -480      1920
3     -2    780      0         0
5      0    540   -240         0
7      2   1020    240       480
9      4   1260    480      1920
Total                       4320
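A minimal sketch of the scale issue, assuming the second set of y values is the first multiplied by 60 (minutes converted to seconds). The names `y_minutes`, `y_seconds`, and `sum_cross_products` are illustrative.

```python
from statistics import mean

def sum_cross_products(xs, ys):
    """Sum of the products of deviation scores for paired xs and ys."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

x = [1, 3, 5, 7, 9]
y_minutes = [5, 13, 9, 17, 21]
y_seconds = [60 * y for y in y_minutes]   # same measurements, different unit

total_minutes = sum_cross_products(x, y_minutes)
total_seconds = sum_cross_products(x, y_seconds)
print(total_minutes, total_seconds)  # 72 and 4320, a factor of 60 apart
```

The relation between x and y is identical in both cases; only the unit of y changed, yet the total (and so the covariance) grew sixtyfold.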
7
Sub    X     Y
1      2    10
2      3    12
3      2    12
4      4    15
5      4    12

What is the covariance?
The absolute value of the covariance is a
function of the variance of x and the variance
of y. Thus, a given covariance could reflect a
strong relation when the two variances are small,
but only a weak relation when the variances are
large.
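One way to work the exercise is sketched below in Python. It assumes the sample covariance uses an n - 1 divisor, and it adds the standard Pearson r (covariance divided by the product of the two standard deviations) to show one way of removing the dependence on the variances; r is not defined at this point in the slides.

```python
from math import sqrt
from statistics import mean

def covariance(xs, ys):
    """Sample covariance; the n - 1 divisor is an assumption."""
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)

def pearson_r(xs, ys):
    """Covariance rescaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

x = [2, 3, 2, 4, 4]
y = [10, 12, 12, 15, 12]
y_rescaled = [60 * v for v in y]   # hypothetical change of unit

cov_xy = covariance(x, y)
r_xy = pearson_r(x, y)
print(f"cov = {cov_xy:.2f}, r = {r_xy:.3f}")
print(f"rescaled: cov = {covariance(x, y_rescaled):.2f}, "
      f"r = {pearson_r(x, y_rescaled):.3f}")
```

Rescaling y changes the covariance but leaves r the same, which is why a covariance on its own says little about the strength of a relation.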
8
A linear relation is one in which the relation can
be most accurately represented by a straight
line. Remember a linear transformation.
The general equation for a straight line:
$y = bx + a$
(a is the y intercept and b is the slope of the
line.)
Example: a = 1.5 and b = .5
If x = 8, then y = .5(8) + 1.5 = 5.5
9
When the relation is imperfect
(not all points fall on a straight line).
Why are the points not on the line?
We draw the best fit using what is called the
least-squares criterion.
Why squares?
See optional link on simultaneous equations for a
closer look at the idea of least-squares.
10
Regression Line Example
Subject   Stat. Score (x)   GPA (y)
1         110               1.0
2         112               1.6
3         118               1.2
4         119               2.1
5         122               2.6
6         125               1.8
7         127               2.6
8         130               2.0
9         132               3.2
10        134               2.6
11        136               3.0
12        138               3.6
[Scatter plot: GPA (y axis, 1 to 4) against Statistics Score (x axis, 110 to 140)]
11
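We wish to minimize $\sum (y - \hat{y})^2$
The predicted value of y for a given value of x: $\hat{y} = bx + a$
The slope minimizing the errors predicting y: $b = \frac{\mathrm{cov}_{xy}}{s_x^2}$ (the covariance of x and y divided by the variance of x)
The y-axis intercept minimizing the errors predicting y: $a = \bar{y} - b\bar{x}$
For our example: $b = 0.074$
What does this mean?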
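As a check on the example, the least-squares slope and intercept can be computed directly from the Statistics Score and GPA data. This is a minimal Python sketch; the function name `slope_and_intercept` is illustrative, and the slope is written as the sum of cross-products over the sum of squared x deviations, which equals covariance over variance because the divisors cancel.

```python
from statistics import mean

stat_score = [110, 112, 118, 119, 122, 125, 127, 130, 132, 134, 136, 138]  # x
gpa = [1.0, 1.6, 1.2, 2.1, 2.6, 1.8, 2.6, 2.0, 3.2, 2.6, 3.0, 3.6]         # y

def slope_and_intercept(xs, ys):
    """Least-squares b and a for the line y_hat = b * x + a."""
    x_bar, y_bar = mean(xs), mean(ys)
    sp_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    b = sp_xy / ss_x          # same value as cov_xy / s_x^2
    a = y_bar - b * x_bar
    return b, a

b, a = slope_and_intercept(stat_score, gpa)
print(f"b = {b:.3f}, a = {a:.3f}")
```

Read this way, the slope says that each additional point on the statistics score goes with a predicted GPA about 0.074 higher.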
12
Our working example
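$a = 2.275 - 0.074(125.25) = -7.006$ (computed with the unrounded slope)
The regression line for our data: $\hat{y} = 0.074x - 7.006$
Using the regression formula to predict, e.g., for x = 124: $\hat{y} = 0.074(124) - 7.006 = 2.17$
Note: If the x value you are inserting is
beyond the range of the values used to construct
the formula, caution must be used.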
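The prediction step and the caution about range can be sketched together. The function `predict_gpa` and its extrapolation flag are hypothetical helpers for illustration; the coefficients are the rounded values from this slide, and the range limits are the smallest and largest statistics scores in the example data.

```python
B, A = 0.074, -7.006      # rounded slope and intercept from the slide
X_MIN, X_MAX = 110, 138   # range of x used to fit the line

def predict_gpa(x):
    """Return (predicted GPA, True if x lies outside the fitted range)."""
    y_hat = B * x + A
    extrapolating = not (X_MIN <= x <= X_MAX)
    return y_hat, extrapolating

for score in (124, 160):
    y_hat, extrapolating = predict_gpa(score)
    note = " (outside fitted range, use with caution)" if extrapolating else ""
    print(f"x = {score}: predicted GPA = {y_hat:.2f}{note}")
```

The line itself will happily produce a number for any x; the flag is only a reminder that nothing in the data supports the line beyond 110 to 138.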
13
Remember: To minimize the sum of the squared
deviations about a point, the mean is best.
Note: Using our GPA and Statistics Score data
GPA (y)   $(y - \bar{y})^2$ (with the mean rounded to 2.3)
1.0       1.69
1.6       .49
1.2       1.21
2.1       .04
2.6       .09
1.8       .25
2.6       .09
2.0       .09
3.2       .81
2.6       .09
3.0       .49
3.6       1.69
.79
We could call this a type of Standard Error of y.
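The claim that the mean minimizes the sum of squared deviations can be checked numerically. The sketch below compares the mean against a few arbitrary candidate points and then computes the standard deviation of y, assuming an n - 1 divisor. With the unrounded mean this comes out near 0.80, which the slide lists as .79.

```python
from math import sqrt
from statistics import mean

gpa = [1.0, 1.6, 1.2, 2.1, 2.6, 1.8, 2.6, 2.0, 3.2, 2.6, 3.0, 3.6]

def sum_sq_dev(values, point):
    """Sum of squared deviations of the values about a chosen point."""
    return sum((v - point) ** 2 for v in values)

y_bar = mean(gpa)
candidates = [2.0, 2.2, y_bar, 2.4, 2.6]   # arbitrary points to compare
best = min(candidates, key=lambda p: sum_sq_dev(gpa, p))

ss_y = sum_sq_dev(gpa, y_bar)
s_y = sqrt(ss_y / (len(gpa) - 1))          # assumption: n - 1 divisor

for p in candidates:
    print(f"point {p:.3f}: sum of squared deviations = {sum_sq_dev(gpa, p):.4f}")
print(f"best point = {best:.3f}, s_y = {s_y:.3f}")
```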
14
Using only the mean of y to predict y, all predicted y
values would be the mean. Using x, the predicted values
fall on the regression line.
Which MODEL is superior? Why? Is there a reliable
difference?
Standard Error of the Estimate: similar to a
standard deviation.
Where the relation is imperfect, there will be
prediction error, whether one uses the mean or the
regression line.
Transformed.
What is r?
Residual Variance: What might create residual
variance?
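The two models can be compared on the example data with a short Python sketch. It assumes the standard error of the estimate uses an n - 2 divisor and that r is the Pearson correlation; neither formula survives in the transcript, so both are labeled assumptions here.

```python
from math import sqrt
from statistics import mean

stat_score = [110, 112, 118, 119, 122, 125, 127, 130, 132, 134, 136, 138]  # x
gpa = [1.0, 1.6, 1.2, 2.1, 2.6, 1.8, 2.6, 2.0, 3.2, 2.6, 3.0, 3.6]         # y

n = len(gpa)
x_bar, y_bar = mean(stat_score), mean(gpa)
sp_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(stat_score, gpa))
ss_x = sum((x - x_bar) ** 2 for x in stat_score)
ss_y = sum((y - y_bar) ** 2 for y in gpa)   # squared error using the mean only

b = sp_xy / ss_x
a = y_bar - b * x_bar
# Squared error left over when predicting with the regression line
ss_residual = sum((y - (b * x + a)) ** 2 for x, y in zip(stat_score, gpa))

s_y = sqrt(ss_y / (n - 1))             # spread about the mean
s_est = sqrt(ss_residual / (n - 2))    # assumption: n - 2 divisor
r = sp_xy / sqrt(ss_x * ss_y)          # assumption: Pearson r

print(f"error about the mean: {ss_y:.3f}, about the line: {ss_residual:.3f}")
print(f"s_y = {s_y:.3f}, s_est = {s_est:.3f}, r = {r:.3f}")
```

On these data the regression line leaves much less squared error than the mean does, and the proportion of the error it removes, 1 - ss_residual / ss_y, equals r squared.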