Regression and Correlation - PowerPoint PPT Presentation

Slides: 26
Provided by: lax3

1
Regression and Correlation
  • GTECH 201
  • Lecture 18

2
ANOVA
  • Analysis of Variance
  • Continuation of the two-sample difference of
    means tests, but now for 3 or more samples
  • We still check whether samples come from one or
    more distinct populations
  • Variance is a descriptive parameter
  • ANOVA compares group means and tests whether they
    differ sufficiently to reject H0

3
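ANOVA H0 and HA
  • H0: $\mu_1 = \mu_2 = \dots = \mu_k$ (all samples come
    from populations with the same mean)
  • HA: at least one population mean differs from the
    others

4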
ANOVA Test Statistic
  • $F = MSB / MSW$
  • MSB = between-group mean squares
  • MSW = within-group mean squares
  • Between-group variability is calculated in three
    steps
  1. Calculate overall mean as weighted average of
    sample means
  2. Calculate between-group sum of squares
  3. Calculate between-group mean squares (MSB)

5
Between-group Variability
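  1. Total or overall mean: $\bar{X}_T = \frac{1}{N} \sum_{i=1}^{k} n_i \bar{X}_i$
  2. Between-group sum of squares: $SSB = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X}_T)^2$
  3. Between-group mean squares: $MSB = SSB / (k - 1)$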

6
Within-group Variability
  1. Within-group sum of squares: $SSW = \sum_{i=1}^{k} (n_i - 1) s_i^2$
  2. Within-group mean squares: $MSW = SSW / (N - k)$

7
Kruskal-Wallis Test
  • Nonparametric equivalent of ANOVA
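  • Extension of Wilcoxon rank sum W test to 3 or
    more samples
  • Average rank is $R_i / n_i$
  • Then the Kruskal-Wallis H test statistic is
    $H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
  • With $N = n_1 + n_2 + \dots + n_k$ = total number of
    observations, and
  • $R_i$ = sum of ranks in sample i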
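
As a minimal sketch of the H statistic on this slide, the Python below pools the samples, ranks them (tied values share the average of their ranks), and sums the ranks per sample. It assumes the standard form H = 12 / (N(N+1)) · Σ Ri²/ni − 3(N+1) with no tie correction. The function names and the three small samples are made up for illustration.

```python
def average_ranks(values):
    """Rank values 1..N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(samples):
    """Kruskal-Wallis H for a list of samples (no tie correction applied)."""
    pooled = [v for sample in samples for v in sample]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    offset = 0
    for sample in samples:
        n_i = len(sample)
        r_i = sum(ranks[offset:offset + n_i])  # Ri: sum of ranks in sample i
        total += r_i ** 2 / n_i
        offset += n_i
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical data: three samples of three observations each.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2))
```

H is then compared with a chi-square critical value with k − 1 degrees of freedom, which is the usual large-sample approximation.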

8
ANOVA Example
House prices by neighborhood, in thousands of dollars

  A     B     C     D
  175   151   127   174
  147   183   142   182
  138   174   124   210
  156   181   150   191
  184   193   180
  148   205
        196

9
ANOVA Example, continued
Sample statistics

          n    X̄        s
  A       6    158.00   17.83
  B       7    183.29   17.61
  C       5    144.60   22.49
  D       4    189.25   15.48
  Total   22   168.68   24.85

  • Now fill in the six steps of the ANOVA calculation

10
The Six Steps
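
The content of this slide did not survive in the transcript. As a hedged sketch, the Python below assumes the six steps are the three between-group steps, the two within-group steps, and the final F ratio, and applies them to the house price example. The variable names are illustrative.

```python
from statistics import mean

# House prices by neighborhood, in thousands of dollars (ANOVA example data).
groups = {
    "A": [175, 147, 138, 156, 184, 148],
    "B": [151, 183, 174, 181, 193, 205, 196],
    "C": [127, 142, 124, 150, 180],
    "D": [174, 182, 210, 191],
}

k = len(groups)                                  # number of samples
n_total = sum(len(g) for g in groups.values())   # N, total observations

# Step 1: overall mean as the weighted average of the sample means.
overall_mean = sum(len(g) * mean(g) for g in groups.values()) / n_total
# Step 2: between-group sum of squares.
ssb = sum(len(g) * (mean(g) - overall_mean) ** 2 for g in groups.values())
# Step 3: between-group mean squares.
msb = ssb / (k - 1)
# Step 4: within-group sum of squares.
ssw = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups.values())
# Step 5: within-group mean squares.
msw = ssw / (n_total - k)
# Step 6: F ratio of between-group to within-group mean squares.
f_stat = msb / msw

print(f"overall mean = {overall_mean:.2f}")
print(f"MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f_stat:.2f}")
```

The resulting F (about 6.56 here) is compared with the critical F value for k − 1 = 3 and N − k = 18 degrees of freedom.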
11
Correlation
  • Co-relatedness between 2 variables
  • As the values of one variable go up, those of the
    other change proportionally
  • Two-step approach
  1. Graphically: scatterplot
  2. Numerically: correlation coefficients

12
Is There a Correlation?

13
Scatterplots
  • Exploratory analysis

14
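Pearson's Correlation Index
  • Based on the concept of covariance
  • Covariation between X and Y: $S_{xy} = \sum xy$
  • Deviation of X from its mean: $x = X - \bar{X}$
  • Deviation of Y from its mean: $y = Y - \bar{Y}$
  • Pearson's correlation coefficient:
    $r = S_{xy} / \sqrt{S_x^2 S_y^2}$, with $S_x^2 = \sum x^2$ and $S_y^2 = \sum y^2$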
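
A minimal sketch of this calculation in Python: r is the covariation of X and Y divided by the square root of the product of their variations, all built from deviations about the means. The function name and the five data points are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the deviations of X and Y about their means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviation of X from its mean
    dy = [y - mean_y for y in ys]  # deviation of Y from its mean
    s_xy = sum(a * b for a, b in zip(dx, dy))  # covariation between X and Y
    s_xx = sum(a * a for a in dx)              # variation of X
    s_yy = sum(b * b for b in dy)              # variation of Y
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

Because the same factor 1/n (or 1/(n − 1)) would appear in the numerator and the denominator, r can be computed from the raw sums of deviations without dividing by the sample size.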

15
Sample and Population
  • r is the sample correlation coefficient
  • Applying the t distribution, we can infer the
    correlation for the whole population
  • Test statistic for Pearson's r:
    $t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$, with $n - 2$ degrees of freedom
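
As a small sketch, the usual t statistic for testing H0: no correlation in the population is t = r·√(n − 2) / √(1 − r²), with n − 2 degrees of freedom. The function name and the input values (r = 0.60 from n = 27 pairs) are made up for illustration.

```python
from math import sqrt


def t_statistic_for_r(r, n):
    """t statistic for H0: population correlation is zero (df = n - 2)."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical sample: r = 0.60 observed on n = 27 pairs.
t = t_statistic_for_r(0.60, 27)
df = 27 - 2
print(round(t, 2), df)
```

The value of t is then compared with the critical t for the chosen significance level and n − 2 degrees of freedom.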

16
Correlation Example
  • Lake effect snow

17
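Spearman's Rank Correlation
  • Non-parametric alternative to Pearson
  • Logic similar to Kruskal-Wallis and Wilcoxon:
    works on ranks instead of raw values
  • Spearman's rank correlation coefficient:
    $r_s = 1 - \frac{6 \sum d^2}{n (n^2 - 1)}$, where d is the difference
    between the two ranks of each observation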
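
A minimal sketch, assuming the common shortcut formula r_s = 1 − 6·Σd² / (n(n² − 1)), which is valid only when there are no tied values. The helper names and the data are hypothetical.

```python
def simple_ranks(values):
    """Ranks 1..n for values that are assumed to contain no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Spearman's rank correlation via the sum of squared rank differences."""
    n = len(xs)
    d_squared = sum(
        (rx - ry) ** 2 for rx, ry in zip(simple_ranks(xs), simple_ranks(ys))
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data without ties.
rs = spearman_rs([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(round(rs, 2))
```

With tied values, the ranks would need to be averaged and Pearson's formula applied to the ranks instead of this shortcut.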

18
Regression
  • In correlation we observe degrees of association
    but no causal or functional relationship
  • In regression analysis, we distinguish an
    independent from a dependent variable
  • Many forms of functional relationships
  • bivariate
  • linear
  • multivariate
  • non-linear (curvi-linear)

19
Graphical Representation
  • In correlation analysis either variable could be
    depicted on either axis
  • In regression analysis, the independent variable
    is always on the X axis
  • Bivariate relationship is described by a
    best-fitting line through the scatterplot

20
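Least-Squares Regression
  • Objective: minimize the sum of squared vertical
    deviations of the observed Y values from the line,
    $\sum (Y_i - \hat{Y}_i)^2$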

21
Regression Equation
  • $Y = a + bX$
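
A minimal sketch of fitting the line Y = a + bX by least squares, assuming the standard closed-form solution: the slope b is the covariation of X and Y divided by the variation of X, and the intercept a makes the line pass through the point of means. The function name and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return intercept a and slope b of the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx            # slope
    a = mean_y - b * mean_x    # intercept: line passes through (mean_x, mean_y)
    return a, b


# Hypothetical data, for illustration only.
a, b = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"Y = {a:.2f} + {b:.2f} X")
```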

22
Strength of Relationship
  • How much is explained by the regression equation?

23
Coefficient of Determination
  • Total variation of Y (all the bucket water)
  • Large Y = dependent variable
  • Small y = deviation of each value of Y from
    its mean
  • e = explained, u = unexplained

24
Explained Variation
  • Ratio of square of covariation between X and Y
    to the variation in X: $\sum y_e^2 = S_{xy}^2 / S_x^2$
  • where $S_{xy}$ = covariation between X and Y
  • $S_x^2$ = total variation of X
  • Coefficient of determination: $r^2 = \sum y_e^2 / \sum y^2$
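
As a sketch of the ratio described above: the explained variation is the squared covariation of X and Y divided by the variation of X, and the coefficient of determination is the explained variation divided by the total variation of Y. The function name and the data are hypothetical.

```python
def coefficient_of_determination(xs, ys):
    """r^2 as explained variation of Y divided by total variation of Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)  # total variation of X
    s_yy = sum((y - mean_y) ** 2 for y in ys)  # total variation of Y
    explained = s_xy ** 2 / s_xx               # explained variation of Y
    return explained / s_yy


# Hypothetical data, for illustration only.
r_squared = coefficient_of_determination([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r_squared, 2))
```

For a bivariate linear regression this value equals the square of Pearson's r for the same data.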

25
Error Analysis
  • $r^2$ tells us what proportion of the variation in
    Y is accounted for by the independent variable
  • This then allows us to infer the standard error
    of our estimate, which tells us, on average,
    how far off our prediction would be in
    measurement units
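
A minimal sketch of the standard error of the estimate: fit the least-squares line, sum the squared residuals (the unexplained variation), divide by n − 2, and take the square root. Dividing by n − 2 is an assumption here; some textbooks divide by n. The function name and the data are hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(xs, ys):
    """Typical prediction error of the least-squares line, in units of Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    # Unexplained variation: squared residuals about the fitted line.
    unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(unexplained / (n - 2))  # assumes the n - 2 divisor


# Hypothetical data, for illustration only.
se = standard_error_of_estimate([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(se, 3))
```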