Title: Examining Relationships
1Chapter 2
8Scatterplot
9Scatterplot
10Degree days: the number of degrees by which the average daily
temperature fell below 65 °F, accumulated over all the
days in the month.
12Days with the solar panel installed are shown in blue
13Saturday sales are marked with a blue cross
14Not all relationships are linear (pg. 93, ex. 2.7)
15Which of these graphs is more dispersed?
16r is the average of the products of the
z-scores of X and Y
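As a rough illustration of this definition, the sketch below standardizes each variable and averages the products of the z-scores. It assumes the "average" uses an n - 1 divisor, the usual convention for sample data; the function name `correlation` and the five data points are made up for illustration and do not come from the slides.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson r as the (n - 1)-averaged product of z-scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (len(xs) - 1)

# Illustrative, made-up data (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 2, 3, 4, 4]
r = correlation(x, y)
print(f"r = {r:.4f}")
```

Because z-scores are unit-free, rescaling either variable (say, dollars to thousands of dollars) leaves r unchanged.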
17Example
18- The correlation coefficient ranges between -1 and 1.
- The sign tells whether the correlation is positive or negative.
- The size tells the strength of the correlation.
19Assessing Linear Relationships with r
20How would you describe this relationship?
26Calculation of Slope and Intercept for Regression
Equation
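Only the slide title survives in the transcript, so the sketch below falls back on the standard least-squares formulas: slope = S_xy / S_xx and intercept = mean(y) - slope * mean(x). This is not necessarily the calculation layout the original slide used. The helper name `fit_line` and the data are made up for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Illustrative, made-up data (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 2, 3, 4, 4]
slope, intercept = fit_line(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
# prints: slope = 0.60, intercept = 1.20
```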
27Coefficient of Determination
- When there is only 1 independent variable, (correlation coefficient)^2 = coefficient of determination.
- Here (in our example), r^2 = (.9487)^2 = .90.
- Suppose y = sales and x = ad expenditure.
- This would mean that 90% of the variation in sales is explained by ad expenditures.
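To make "variation explained" concrete, the sketch below computes the coefficient of determination as 1 - SSE/SST, where SSE is the variation left over around the fitted line and SST is the total variation in y. The deck's actual data are not in the transcript, so the five points here are made up, chosen so that the result comes out near the slide's .90; `r_squared` is a hypothetical helper name.

```python
from statistics import mean

def r_squared(xs, ys):
    """Share of the variation in y explained by the least-squares line on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    # Unexplained variation: squared distances from the fitted line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared distances from the mean of y
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

# Illustrative, made-up data (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 2, 3, 4, 4]
print(round(r_squared(x, y), 2))  # 0.9
```

With a single explanatory variable this value equals the square of the correlation coefficient, which is the identity the slide states.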
28Compare with Excel Scatterplot
29Output from Excel Regression Data Analysis Tool
30Interpretation of Slope and Intercept for Simple Regression
Slope: the change in the predicted mean value of Y for each unit increase in X.
Intercept: (theoretically) the predicted mean value of Y when X equals zero. It may not have a practical meaning, and it is valid only if the dataset includes observations with X = 0.
Suppose Y = sales (in $10,000s) and X = ad expenditure (in $1,000s) in our example.
Slope: for each $1,000 spent on ads, sales are expected to increase by $6,000.
Intercept: when no funds are spent on ads, sales are expected to be $12,000.
31Predicting with Simple Regression Equation
From the equation: y = 0.6(4) + 1.2 = 2.4 + 1.2 = 3.6
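The same substitution can be sketched in code. The slope 0.6 and intercept 1.2 are read off the slide's arithmetic; the units (sales in $10,000s, ad spending in $1,000s) are assumed from the interpretation slide. `predict_sales` is a hypothetical helper name.

```python
SLOPE = 0.6      # change in sales (in $10,000s) per extra $1,000 of ad spending
INTERCEPT = 1.2  # predicted sales (in $10,000s) when ad spending is zero

def predict_sales(ad_thousands):
    """Predicted mean sales, in units of $10,000, for ad spending in units of $1,000."""
    return INTERCEPT + SLOPE * ad_thousands

y_hat = predict_sales(4)
print(f"predicted sales: {y_hat:.1f} (about ${y_hat * 10_000:,.0f})")
# prints: predicted sales: 3.6 (about $36,000)
```

Predictions of this kind are safest for X values inside the range the data actually cover.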
32Not all relationships are linear (pg. 93, ex. 2.7)
37A Desirable Residual Plot: No Observable Pattern
38Residual Plot Suggesting a Model with a Curvilinear
Relationship
39Increasing spread as X increases: prediction will
be less accurate
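The plots themselves did not survive in the transcript, so here is a small sketch of how the residuals behind such a plot are obtained: each residual is the observed y minus the value predicted by the fitted line. The data are made up and deliberately quadratic, so the residuals show the curved pattern that signals a poor linear fit; `residuals` is a hypothetical helper name.

```python
from statistics import mean

def residuals(xs, ys):
    """Fit the least-squares line and return one residual (observed - predicted) per point."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Illustrative, made-up data following y = x^2 exactly (not from the slides)
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 36]
for xi, ei in zip(x, residuals(x, y)):
    print(f"x = {xi}: residual = {ei:+.2f}")
```

The residuals are positive at both ends and negative in the middle. Plotted against x, that U shape is the kind of pattern the curvilinear slide warns about, whereas a desirable plot scatters around zero with no visible structure.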
47Multiple Regression
- One independent variable may not be sufficient to adequately explain the variation in our dependent variable.
- We may have to include more than one independent variable in the model.
- There is a separate slope coefficient for each independent variable.
- We can use the new multiple regression model to make predictions of the dependent variable, Y.
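As a minimal sketch of what "a separate slope coefficient for each independent variable" means in practice, the code below fits y = b0 + b1*x1 + b2*x2 by solving the least-squares normal equations in plain Python. The deck itself uses Excel; the function names `solve` and `fit_multiple` and the data are made up for illustration, with the data generated from a known equation so the recovered coefficients can be checked.

```python
def solve(a, b):
    """Solve the square linear system a * coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: use the largest remaining entry in this column
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; coefficients are not unique")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_multiple(rows, ys):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk; returns [b0, b1, ..., bk]."""
    design = [[1.0] + list(row) for row in rows]  # the leading 1 carries the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data generated exactly from y = 1 + 2*x1 - 3*x2 (not from the slides)
rows = [(1, 0), (0, 1), (2, 1), (1, 2), (3, 3)]
ys = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]
b0, b1, b2 = fit_multiple(rows, ys)
print([round(b, 6) for b in (b0, b1, b2)])
```

Solving the normal equations directly is fine for a small teaching example; statistical software generally uses more numerically stable methods.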
48NFL Data
- Suppose we wish to build a regression model to predict the number of games won by teams in the NFL.
- It seems logical that two primary characteristics related to games won are:
  - Yards gained
  - Yards allowed
- Let us use Excel to analyze the data from a randomly selected season on all 31 NFL teams.
- Begin with two separate regressions, one for each explanatory variable.
49Simple Regression: Games Won vs. Yards Gained
50Simple Regression: Games Won vs. Yards Gained
36.2% of the variation in games won is explained
by yards gained.
51Simple Regression: Games Won vs. Yards Allowed
52Simple Regression: Games Won vs. Yards Allowed
26.4% of the variation in games won is explained
by yards allowed.
53Multiple Regression with Both Yards Gained and
Yards Allowed
Now 57.3% of the variation in games won is
explained by yards gained and yards allowed.
54Predicting with a Multiple Regression Model
First, write down the estimated regression
equation from the coefficients column of the
regression output.
Then substitute specific values of each
explanatory variable into the equation. In this
example we have yards gained and yards allowed.
55Predicting with a Multiple Regression Model
Suppose we want to predict the number of games
won by a team that gains 5200 yards and allows
5000 yards for the season.
So the predicted mean number of games won by a
team with this record in yards gained and allowed
is between 8 and 9.
56Interpreting Slope Estimates in a Multiple
Regression Model
Similar to simple regression, but we must consider
the other explanatory variables in the model as
being held constant.
Yards gained: For a given number of yards
allowed, the number of games won is expected to
increase by 2.17 (on average) for each 1000 yards
gained in the season.
57Interpreting Slope Estimates in a Multiple
Regression Model
Yards allowed: For a given number of yards
gained, the number of games won is expected to
decrease by 2.61 (on average) for each 1000 yards
allowed in the season.
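The "held constant" reading can be checked with a short sketch. The two slopes are the ones quoted on the slides (2.17 and -2.61 games per 1000 yards). The intercept does not appear in the transcript, so `predicted_wins` takes it as an argument and the value used below is only a placeholder. Both function names are hypothetical.

```python
B_GAINED = 2.17 / 1000    # games won per yard gained (slides: 2.17 per 1000 yards)
B_ALLOWED = -2.61 / 1000  # games won per yard allowed (slides: -2.61 per 1000 yards)

def predicted_wins(intercept, gained, allowed):
    """Predicted mean games won; the intercept must come from the regression output."""
    return intercept + B_GAINED * gained + B_ALLOWED * allowed

def effect_of_change(d_gained=0, d_allowed=0):
    """Change in predicted wins when yards change by the given amounts.

    The intercept cancels out of the difference, so it is not needed here.
    """
    return B_GAINED * d_gained + B_ALLOWED * d_allowed

# Two teams that allow the same 5000 yards but differ by 1000 yards gained.
placeholder_intercept = 0.0  # NOT from the slides; it cancels in the difference
diff = (predicted_wins(placeholder_intercept, 6200, 5000)
        - predicted_wins(placeholder_intercept, 5200, 5000))
print(round(diff, 2))                               # 2.17
print(round(effect_of_change(d_allowed=1000), 2))   # -2.61
```

Whatever the true intercept is, the gap between the two predictions depends only on the slope of the variable that changed, which is what interpreting a slope with the other variable held constant amounts to.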