Chapter 4 Describing the Relation Between Two Variables - PowerPoint PPT Presentation

About This Presentation
Title:

Chapter 4 Describing the Relation Between Two Variables

Description:

Compute and interpret the linear correlation coefficient. ... A lurking variable is one that is related to the response and/or predictor ... – PowerPoint PPT presentation

Slides: 31
Provided by: michae1239

Transcript and Presenter's Notes

Title: Chapter 4 Describing the Relation Between Two Variables


1
Chapter 4 Describing the Relation Between Two
Variables
  • 4.1 Scatter Diagrams and Correlation

2
Objectives
  • Draw scatter diagrams
  • Interpret scatter diagrams
  • Understand the properties of the linear
    correlation coefficient
  • Compute and interpret the linear correlation
    coefficient.

3
Bivariate Data
Bivariate data is data in which two variables are
measured on an individual. Do you want to use
the value of one variable to predict the value of
the other variable? If so, how will we do this?
The response variable is the variable whose value
can be explained or determined based upon the
value of the predictor variable.
4
Bivariate Data
  • Remember that if the data we collect are
    observational, we cannot conclude a causal
    relation.
  • Also, it is sometimes not clear which is the
    predictor variable and which is the response
    variable.
  • A lurking variable is one that is related to the
    response and/or predictor variable, but is
    excluded from the analysis.

5
Scatter Diagrams
  • A scatter diagram shows the relationship between
    two quantitative variables measured on the same
    individual.
  • Each individual in the data set is represented by
    a point in the scatter diagram.
  • The predictor variable is plotted on the
    horizontal axis (x) and the response variable is
    plotted on the vertical axis (y).
  • Do not connect the points when drawing a scatter
    diagram.
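
The plotting rules above can be sketched in a few lines of Python. This is a minimal text-only sketch, not the deck's own method: the function name ascii_scatter and the data are hypothetical (invented for illustration, not the drilling data), and it assumes small non-negative integer values so that each pair maps directly to a grid cell.

```python
# Hypothetical (x, y) pairs, invented for illustration only.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 7), (6, 8)]

def ascii_scatter(points, width=6, height=8):
    """Return a text scatter diagram: x runs left to right, y runs bottom to top."""
    # grid[row][col]; row 0 is the top of the plot (largest y value).
    grid = [[" "] * (width + 1) for _ in range(height + 1)]
    for x, y in points:
        # One mark per individual; the marks are never joined by lines.
        grid[height - y][x] = "*"
    return "\n".join("".join(row).rstrip() for row in grid)

plot = ascii_scatter(points)
print(plot)
```

Each individual contributes exactly one mark, with the predictor value setting the column and the response value setting the row.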

6
EXAMPLE Drawing a Scatter Diagram
The following data are based on a study of
rock drilling. The researchers wanted to
determine whether the time it takes to dry drill
a distance of 5 feet in rock increases with the
depth at which the drilling begins. So the depth
at which drilling begins is the predictor
variable, x, and the time (in minutes) to drill
5 feet is the response variable, y. Draw a
scatter diagram of the data. Source: Penner, R.,
and Watts, D.G. "Mining Information." The
American Statistician, Vol. 45, No. 1, Feb. 1991,
p. 6.
7
(No Transcript)
8
(No Transcript)
9
Interpreting Scatter Diagrams
  • Since scatter diagrams show the type of relation
    that exists between two variables, our goal is to
    determine whether there exists:
  • a linear relation,
  • a non-linear relation, or
  • no relation.
  • The next 2 slides show the various scatter
    diagrams and the type of relation implied.

10
(No Transcript)
11
Interpreting Scatter Diagrams
12
Interpreting Scatter Diagrams
  • Two variables that are linearly related can be
    positively associated or negatively associated.
  • They are positively associated when above-average
    values of one variable are associated with
    above-average values of the other variable.
  • That is, two variables are positively associated
    when, as the values of the predictor variable
    increase, the values of the response variable
    also increase.

13
Interpreting Scatter Diagrams
  • Two variables that are linearly related are said
    to be negatively associated when above-average
    values of one variable are associated with
    below-average values of the other variable.
  • That is, two variables are negatively associated
    when, as the values of the predictor variable
    increase, the values of the response variable
    decrease.
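
The "above average / below average" description of positive and negative association can be checked numerically. The sketch below is one possible way to do it, not something taken from the slides: it sums the products of each pair's deviations from the two means and reads the direction from the sign. The function name association_direction and the data are hypothetical.

```python
from statistics import mean

def association_direction(xs, ys):
    """Classify the direction of association from products of deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    # A product is positive when a pair is above both means or below both,
    # and negative when it is above one mean and below the other.
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "none"

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]
falling = [9, 7, 6, 4, 1]
print(association_direction(xs, rising))   # positive
print(association_direction(xs, falling))  # negative
```

The sign says nothing about how strong the relation is, or whether it is linear; that is what the correlation coefficient is for.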

14
Correlation
The linear correlation coefficient, or Pearson
product moment correlation coefficient, is a
measure of the strength of the linear relation
between two quantitative variables. We use the
Greek letter ρ (rho) to represent the population
correlation coefficient and r to represent the
sample correlation coefficient. We will work only
with the sample correlation coefficient.
15
Correlation
\[ r = \frac{\sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)}{n - 1} \]
  • where r is the sample correlation coefficient,
  • x_i is the data value for the predictor variable,
  • x̄ (x bar) is the sample mean for the predictor
    variable,
  • s_x is the sample standard deviation for the
    predictor variable,
  • y_i is the data value for the response variable,
  • ȳ (y bar) is the sample mean for the response
    variable,
  • s_y is the sample standard deviation for the
    response variable, and
  • n is the number of individuals in the sample.
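
The quantities listed above are enough to compute r. The sketch below assumes the standard sample formula: standardize each x and y value with its sample mean and sample standard deviation, multiply the pairs, sum the products, and divide by n - 1. The function name sample_correlation and the data are hypothetical.

```python
from statistics import mean, stdev

def sample_correlation(xs, ys):
    """Sample r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data lying exactly on a line, so r should be 1 or -1
# up to floating-point rounding.
r_up = sample_correlation([1, 2, 3, 4], [2, 4, 6, 8])
r_down = sample_correlation([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)
```

Python 3.10 and later also provide statistics.correlation, which should give the same value; the explicit version is shown here because it mirrors the terms defined on this slide.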

16
Properties of the Linear Correlation Coefficient
1. The linear correlation coefficient is always
   between -1 and 1, inclusive. That is,
   -1 ≤ r ≤ 1.
2. If r = 1, there is a perfect positive
   linear relation between the two variables.
3. If r = -1, there is a perfect negative linear
   relation between the two variables.
4. The closer r is to 1, the stronger the evidence
   of positive association between the two
   variables.
5. The closer r is to -1, the stronger the evidence
   of negative association between the two
   variables.
17
Properties of the Linear Correlation Coefficient
6. If r is close to 0, there is evidence of no
   linear relation between the two variables.
   Because the linear correlation coefficient is a
   measure of the strength of linear relation, r
   close to 0 does not imply no relation, just no
   linear relation.
7. It is a unitless measure of association. So the
   unit of measure for x and y plays no role in the
   interpretation of r.
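
Two of these properties are easy to check numerically. The sketch below assumes the standard sample formula for r (sum of products of standardized values, divided by n - 1). The helper name sample_correlation and all data are hypothetical, invented for illustration.

```python
from statistics import mean, stdev

def sample_correlation(xs, ys):
    """Sample r: sum of products of standardized values, divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Hypothetical data, not the drilling data from the example.
depth_ft = [10, 20, 30, 40, 50]
time_min = [5.1, 5.8, 6.0, 6.9, 7.4]

# Unitless: converting depth from feet to meters leaves r unchanged.
depth_m = [d * 0.3048 for d in depth_ft]
r_ft = sample_correlation(depth_ft, time_min)
r_m = sample_correlation(depth_m, time_min)

# r near 0 means no LINEAR relation: here y is exactly x squared,
# a perfect non-linear relation, yet r comes out at (or very near) 0.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
r_quadratic = sample_correlation(xs, ys)

print(round(r_ft, 3), round(r_m, 3), round(r_quadratic, 3))
```

The first two printed values should agree, and the third should be essentially zero even though y is completely determined by x.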
18
Correlation Coefficient
19
Correlation Coefficient
20
Correlation Coefficient
21
Correlation Coefficient
22
Correlation Coefficient
23
Correlation Coefficient
24
Correlation Coefficient
25
Correlation Coefficient
26
Correlation Coefficient
  • So the correlation coefficient describes the
    strength and the direction of the linear
    relationship between a predictor variable and a
    response variable.

27
EXAMPLE Determining the Linear Correlation
Coefficient
Determine the linear correlation coefficient of
the drilling data.
28
(No Transcript)
29
Sum = 8.501037; r = 8.501037 / 11 ≈ 0.773
30
EXAMPLE Determining the Linear Correlation
Coefficient
r = 0.773. This is a linear correlation coefficient
that implies a POSITIVE association. Note that
it implies only an association, not causation. A
linear correlation coefficient computed using
observational data does not imply causation among
the variables.