Regression and Correlation - PowerPoint PPT Presentation

Slides: 22
Provided by: tri5203

Transcript and Presenter's Notes

Title: Regression and Correlation


1
Regression and Correlation
2
Regression vs. Correlation
(1) If the relationship is one of functional dependence, i.e. the magnitude of one variable (the dependent variable) is determined by (is a function of) a second variable (the independent variable): REGRESSION, e.g. blood pressure and age.
(2) If the relationship between two variables is not one of dependence, i.e. the magnitude of one variable may change as the magnitude of the second variable changes, but there is no reason to consider either of them dependent or independent: CORRELATION, e.g. in birds, wing length and tail length.
3
(No Transcript)
4
Strength of the relationship: use a correlation coefficient
Pearson Product Moment Correlation
for a sample: r
for a population: ρ (rho)
r can range from 1 to -1
Strong positive: r approaches 1
Strong negative: r approaches -1
No relationship: r approaches 0
5
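$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}$$
In our blood pressure example,
Subject  Age (X)  BP (Y)  XY     X^2   Y^2
A        43       128     5504   1849  16384
B        48       120     5760   2304  14400
C        56       135     7560   3136  18225
D        61       143     8723   3721  20449
E        67       141     9447   4489  19881
F        70       152     10640  4900  23104
$\sum X = 345$, $\sum Y = 819$, $\sum XY = 47634$, $\sum X^2 = 20399$, $\sum Y^2 = 112443$
r = .897
Strong positive relationship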
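As a quick check of the arithmetic on this slide, the raw-sum formula for r can be sketched in a few lines of Python. The function name `pearson_r` and the variable names are illustrative choices, not part of the original slides; the data are the six age and blood pressure pairs from the table.

```python
import math

# Age (X) and blood pressure (Y) for subjects A to F, taken from the slide.
ages = [43, 48, 56, 61, 67, 70]
bps = [128, 120, 135, 143, 141, 152]


def pearson_r(xs, ys):
    """Pearson r computed from raw sums (hypothetical helper name)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(ages, bps)
print(round(r, 3))  # 0.897
```

The raw-sum form is used here only because it mirrors the slide; for larger or noisier data a mean-centred calculation is usually less prone to rounding error.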
6
From previous slide: r = .897
Indicates a strong positive relationship
BUT is this relationship due to chance, or is it significant?
7
Hypotheses for determining significance of a correlation
H0: $\rho = 0$    H1: $\rho \neq 0$
$t = r\sqrt{\frac{N - 2}{1 - r^2}} = 4.059$
$t_{crit}(.05,\ df = N - 2 = 4) = 2.776$
Since 4.059 > 2.776, we reject H0. Therefore, there is a significant relationship between age and blood pressure
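The significance test on this slide can be reproduced with a short sketch. The function name `correlation_t` is an assumption made for illustration; the critical value 2.776 is copied from the slide rather than computed, and r is the rounded value .897.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r = 0.897        # rounded correlation from the blood pressure example
n = 6            # subjects A to F
t_crit = 2.776   # critical value quoted on the slide (alpha = .05, df = 4)

t = correlation_t(r, n)
significant = abs(t) > t_crit

print(round(t, 2), significant)  # 4.06 True
```

Comparing the absolute value of t with the critical value treats the test as two-tailed, which matches the hypothesis H1: ρ ≠ 0.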
8
  • Regression
  • Here we are trying to determine the nature and strength of the relationship

Start with a line of best fit
Any straight line will have the equation $Y = a + bX$ (for a sample) and $Y = \alpha + \beta X$ (for a population)
9
To determine the line that best fits the data, use the least squares method
10
Determine the difference between every point and the line
[Figure: scatter plot with a fitted line; the differences between each point and the line are labelled d1 to d8]
11
$d_1^2 + d_2^2 + d_3^2 + d_4^2 + d_5^2 + \dots + d_8^2$, OR $\sum (Y - \hat{Y})^2$, should be a minimum
The sum of all the $d^2$s is the residual or error sum of squares
[Figure: the same scatter plot with differences d1 to d8]
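To make the "minimum" idea concrete, here is a minimal sketch that fits a line to the blood pressure data and shows that moving the line away from the fitted one increases the sum of squared differences. The closed-form expressions for the slope and intercept are the standard least squares solution, which these slides do not spell out; the helper names `fit_line` and `residual_ss` are illustrative.

```python
# Age (X) and blood pressure (Y) from the earlier example.
ages = [43, 48, 56, 61, 67, 70]
bps = [128, 120, 135, 143, 141, 152]


def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products and sum of squares of X, both about the means.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x
    a = mean_y - b * mean_x
    return a, b


def residual_ss(xs, ys, a, b):
    """Sum of squared differences d between each point and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a, b = fit_line(ages, bps)
best = residual_ss(ages, bps, a, b)

# Any other line gives a larger sum of squared differences.
worse_slope = residual_ss(ages, bps, a, b + 0.1)
worse_intercept = residual_ss(ages, bps, a + 1.0, b)

print(round(a, 2), round(b, 3), round(best, 2))
print(best < worse_slope, best < worse_intercept)
```

The two perturbed lines are arbitrary examples; the point is only that the fitted a and b give the smallest residual sum of squares.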
12
Note: for $Y = a + bX$, a is the Y intercept and b is the slope or regression coefficient
Same b - different a
Same a - different b
13
A least squares line can be fit to any set of data
Imagine an entire population of points where $\beta = 0$
14
What if you took a random sample and got the points shown in red?
You could generate a least squares line where $Y = a + bX$ and $b \neq 0$
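The situation on this slide can be imitated with a small simulation. Everything here is a hypothetical setup chosen for illustration: the population is invented (Y is drawn independently of X, so the population slope is zero), the sample size of 8 and the seed are arbitrary, and `slope` is an illustrative helper name.

```python
import random

random.seed(1)  # arbitrary seed so the run is repeatable

# Hypothetical population: Y does not depend on X, so beta = 0.
population = [
    (random.uniform(0, 10), random.gauss(5, 2)) for _ in range(10_000)
]


def slope(points):
    """Least squares slope b for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in points)
    ss_x = sum((x - mean_x) ** 2 for x, _ in points)
    return sp / ss_x


sample = random.sample(population, 8)

b_population = slope(population)  # close to 0
b_sample = slope(sample)          # generally not 0, purely by chance

print(b_population, b_sample)
```

A small sample will almost never give a slope of exactly zero, which is why a formal test is needed before concluding that the population slope differs from zero.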
15
What is the probability that the set of points in red is from the same population and accurately describes the relationship between X and Y?
OR
H0: $\beta = 0$    H1: $\beta \neq 0$
USE AN ANOVA
16
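  • Steps
  • 1) Compute the total sum of squares, i.e. the total variability of the dependent (Y) variable
  •    Total SS $= \sum (Y_i - \bar{Y})^2$ (squared differences like the d's we had earlier, but measured from the mean $\bar{Y}$ instead of from the line)
  • 2) Compute regression (or model) SS
  •    Regression SS $= \frac{\left(\sum X_iY_i - \frac{\sum X_i \sum Y_i}{n}\right)^2}{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}$
  • 3) Compute residual (error) SS
  •    Residual SS = total SS - regression SS
  • 4) Compute the mean squares for regression and residual
  •    MS = SS/df, where df total = n - 1, df regression = 1, and df residual = df total - df regression
  • 5) Determine F
  •    F = regression MS / residual MS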
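The five steps can be followed through for the earlier age and blood pressure data with a short sketch. The function name `regression_anova` and the dictionary keys are illustrative; the slides do not give a worked ANOVA for these data, so the numbers printed here come from this calculation only. Step 5 uses the usual ratio of regression mean square to residual mean square.

```python
# Age (X) and blood pressure (Y) from the earlier example.
ages = [43, 48, 56, 61, 67, 70]
bps = [128, 120, 135, 143, 141, 152]


def regression_anova(xs, ys):
    """Follow the five ANOVA steps for a simple linear regression."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    mean_y = sum_y / n

    # 1) Total SS: variability of Y about its mean.
    total_ss = sum((y - mean_y) ** 2 for y in ys)
    # 2) Regression (model) SS.
    regression_ss = (sum_xy - sum_x * sum_y / n) ** 2 / (sum_x2 - sum_x ** 2 / n)
    # 3) Residual (error) SS.
    residual_ss = total_ss - regression_ss
    # 4) Mean squares: MS = SS / df.
    df_total = n - 1
    df_regression = 1
    df_residual = df_total - df_regression
    regression_ms = regression_ss / df_regression
    residual_ms = residual_ss / df_residual
    # 5) F ratio.
    f_ratio = regression_ms / residual_ms

    return {
        "total_ss": total_ss,
        "regression_ss": regression_ss,
        "residual_ss": residual_ss,
        "df_residual": df_residual,
        "f": f_ratio,
    }


table = regression_anova(ages, bps)
for name, value in table.items():
    print(name, round(value, 2))
```

The resulting F would then be compared with a critical F value for 1 and n - 2 degrees of freedom, taken from a table or a statistics library.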
17
Spearman Rank Correlation: used for ordinal data
Experiment: measuring how well birds build nests after a period of practice.
18
Might expect: relationship is not linear
[Figure: Quality of nest (Y) plotted against Amount of practice (X)]
19
With some invented data:
Bird  Amount of Practice  Quality of nest
A     4                   9
B     2                   2
C     10                  10
D     3                   8
[Figure: Nest quality plotted against Practice for birds A to D]
20
Now look at the ranks for the same data:
Bird  Amount of Practice  Quality of nest
A     3                   3
B     1                   1
C     4                   4
D     2                   2
[Figure: rank of Nest quality plotted against rank of Practice]
21
  • A Spearman Rank Correlation: calculate $r_s$
  • 1) Rank X and Y values
  • 2) Compute a Pearson (SP) for the ranks
Bird  Amount of Practice (X)  Quality of nest (Y)  XY
A     3                       3                    9
B     1                       1                    1
C     4                       4                    16
D     2                       2                    4
$\sum X = 10$, $\sum X^2 = 30$, $\sum XY = 30$
$SS_X = \sum X^2 - \frac{(\sum X)^2}{n} = 30 - \frac{10^2}{4} = 5$