Title: Marketing Research
1Marketing Research
- Aaker, Kumar, Day and Leone
- Tenth Edition
- Instructor's Presentation Slides
2Chapter Nineteen
Correlation Analysis and Regression Analysis
3Definitions
- Correlation analysis
- Measures the strength of the relationship between two variables
- Correlation coefficient
- Provides a measure of the degree to which there is an association between two variables (X and Y)
4Regression Analysis
- Statistical technique that is used to relate two or more variables
- Objective is to build a regression model or a prediction equation relating the dependent variable to one or more independent variables
- The model can then be used to describe, predict, and control the variable of interest on the basis of the independent variables
- Multiple regression analysis: regression analysis that involves more than one independent variable
5Correlation Analysis
- Pearson correlation coefficient
- Measures the degree to which there is a linear association between two interval-scaled variables
- A positive correlation reflects a tendency for a high value in one variable to be associated with a high value in the second
- A negative correlation reflects an association between a high value in one variable and a low value in the second variable
6Correlation Analysis (Contd.)
- Population correlation (ρ) - If the database includes an entire population
- Sample correlation (r) - If the measure is based on a sample
r lies between -1 and +1: $-1 \le r \le 1$
$r = 0$ indicates the absence of linear association
7Scatter Plots
8Scatter Plots (Contd.)
9Correlation Coefficient
Simple Correlation Coefficient
Pearson Product-moment Correlation Coefficient
10Determining Sample Correlation Coefficient
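A minimal sketch of computing the sample correlation coefficient, assuming the standard product-moment definition (sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations). The function name `pearson_r` and the data are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Sample Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and the two sums of squared deviations
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data only (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.4f}")  # r = 0.7746
```

A perfectly linear increasing relationship gives r = 1 and a perfectly linear decreasing one gives r = -1, which matches the bounds stated for r.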
11Testing the Significance of the Correlation
Coefficient
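- Null hypothesis $H_0: \rho = 0$
- Alternative hypothesis $H_a: \rho \neq 0$
- Test statistic $t = r\sqrt{\dfrac{n-2}{1-r^2}}$
- Example: $n = 6$ and $r = .70$, so $t = .70\sqrt{4/.51} \approx 1.96$
- At $\alpha = .05$, $n - 2 = 4$ degrees of freedom
- Critical value of $t = 2.78$
- Since $1.96 < 2.78$, we fail to reject the null hypothesis.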
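The example can be checked numerically. This sketch assumes the usual test statistic for a correlation, t = r·sqrt((n − 2)/(1 − r²)), which reproduces the 1.96 quoted above; the critical value 2.78 is taken from the slide rather than computed.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r, n = 0.70, 6
t = correlation_t_statistic(r, n)
critical_t = 2.78  # two-tailed, alpha = .05, 4 d.f. (value quoted on the slide)
reject_null = abs(t) > critical_t
print(f"t = {t:.2f}, reject H0: {reject_null}")  # t = 1.96, reject H0: False
```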
12Partial Correlation Coefficient
- Measure of association between two variables
after controlling for the effects of one or more
additional variables
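For a single control variable Z, a partial correlation can be obtained from the three pairwise correlations. The sketch below assumes the standard first-order formula; the function name and the input correlations are hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations, chosen only for illustration
r_xy_given_z = partial_correlation(r_xy=0.60, r_xz=0.50, r_yz=0.40)
print(f"{r_xy_given_z:.3f}")  # 0.504
```

When Z is uncorrelated with both X and Y, the partial correlation equals the simple correlation.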
13Regression Analysis
- Simple Linear Regression Model
- $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
- Where
- $Y$ = Dependent variable
- $X$ = Independent variable
- $\beta_0$ = Model parameter that represents the mean value of the dependent variable (Y) when the independent variable (X) is zero
- $\beta_1$ = Model parameter that represents the slope, which measures the change in the mean value of the dependent variable associated with a one-unit increase in the independent variable
- $\varepsilon_i$ = Error term that describes the effects on $Y_i$ of all factors other than the value of $X_i$
14Simple Linear Regression Model
15Simple Linear Regression Model A Graphical
Illustration
16Assumptions of the Simple Linear Regression Model
- Error term is normally distributed (normality assumption)
- Mean of error term is zero: $E(\varepsilon_i) = 0$
- Variance of error term is constant and is independent of the values of X (constant variance assumption)
- Error terms are independent of each other (independence assumption)
- Values of the independent variable X are fixed (non-stochastic X)
17Estimating the Model Parameters
- Calculate point estimates $b_0$ and $b_1$ of the unknown parameters $\beta_0$ and $\beta_1$
- Obtain a random sample and use the information from the sample to estimate $\beta_0$ and $\beta_1$
- Obtain a line of best "fit" for the sample data points: the least squares line
18Residual Value
- Difference between the actual and predicted values
- Estimate of the error in the population
$e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i)$
- $b_0$ and $b_1$ minimize the residual or error sum of squares (SSE)
- $SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2 = \sum [y_i - (b_0 + b_1 x_i)]^2$
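A short sketch of the least squares line and its residuals, assuming the usual closed-form estimates (slope = sum of cross-products of deviations over the sum of squared x deviations; intercept = mean of y minus slope times mean of x). Function names and data are illustrative.

```python
def least_squares_line(x, y):
    """Return (b0, b1), the intercept and slope that minimize SSE."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def residuals(x, y, b0, b1):
    """Residual e_i = actual y_i minus predicted value b0 + b1 * x_i."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Illustrative data only (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(x, y)
e = residuals(x, y, b0, b1)
sse = sum(ei ** 2 for ei in e)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SSE = {sse:.2f}")  # b0 = 2.20, b1 = 0.60, SSE = 2.40
```

The residuals of a least squares line with an intercept sum to zero, which is a quick sanity check on any fit.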
19Standard Error
- Mean Square Error
- Standard Error of b1
- Standard Error of b0
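The three quantities can be sketched as follows, assuming the standard simple-regression formulas: MSE = SSE/(n − 2), standard error of b1 = s/sqrt(Σ(x − x̄)²), and standard error of b0 = s·sqrt(1/n + x̄²/Σ(x − x̄)²), where s = sqrt(MSE). The function name and data are illustrative.

```python
import math

def regression_standard_errors(x, y):
    """Return MSE and the standard errors of b0 and b1 for a simple regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)          # two parameters estimated, so n - 2 d.f.
    s = math.sqrt(mse)           # standard error of the estimate
    se_b1 = s / math.sqrt(sxx)
    se_b0 = s * math.sqrt(1 / n + mean_x ** 2 / sxx)
    return {"mse": mse, "se_b0": se_b0, "se_b1": se_b1}

# Illustrative data only (not from the slides)
errors = regression_standard_errors([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print({k: round(v, 4) for k, v in errors.items()})
```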
20Testing the Significance of Independent Variables
- Null Hypothesis
- There is no linear relationship between the independent and dependent variables
- $H_0: \beta_1 = 0$
- Alternative Hypothesis
- There is a linear relationship between the independent and dependent variables
- $H_a: \beta_1 \neq 0$
21Testing the Significance of Independent Variables
(Contd.)
- Test Statistic $t = \dfrac{b_1 - \beta_1}{s_{b_1}}$
- Degrees of Freedom $\nu = n - 2$
- Testing for a Type II Error
- $H_0: \beta_1 = 0$
- $H_a: \beta_1 \neq 0$
- Decision Rule
Reject $H_0: \beta_1 = 0$ if $\alpha >$ p-value
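A small sketch of the slope test. It compares |t| with a two-tailed critical value from a t table, which leads to the same decision as comparing α with the p-value. The inputs (b1 = 0.6, standard error 0.2828, n = 5) are hypothetical, and 3.182 is the tabled two-tailed critical value for α = .05 with 3 degrees of freedom.

```python
def slope_t_test(b1, se_b1, critical_t, beta1_null=0.0):
    """Return (t, reject) for H0: beta1 = beta1_null against a two-sided alternative."""
    t = (b1 - beta1_null) / se_b1
    return t, abs(t) > critical_t

# Hypothetical estimates from a sample of n = 5, so n - 2 = 3 degrees of freedom
t_stat, reject = slope_t_test(b1=0.6, se_b1=0.2828, critical_t=3.182)
print(f"t = {t_stat:.2f}, reject H0: {reject}")  # t = 2.12, reject H0: False
```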
22Sum of Squares
- SST = Sum of squared prediction errors that would be obtained if we do not use x to predict y
- SSE = Sum of squared prediction errors that is obtained when we use x to predict y
- SSM = Reduction in the sum of squared prediction errors that has been accomplished by using x in predicting y
23Predicting the Dependent Variable
- Predicted value of the dependent variable, $\hat{y}_i = b_0 + b_1 x_i$
- Error of prediction is $y_i - \hat{y}_i$
- Total variation (SST) = Explained variation (SSM) + Unexplained variation (SSE)
Coefficient of Determination ($r^2$)
- Measure of the regression model's ability to predict
$r^2 = (SST - SSE) / SST = SSM / SST$ = Explained Variation / Total Variation
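The decomposition can be verified on a small example. This sketch assumes predictions from a least squares line (here ŷ = 2.2 + 0.6x, which is the least squares fit for this made-up data); the function name is illustrative.

```python
def variation_decomposition(y, y_hat):
    """Return SST, SSE, SSM and r^2 from actual and predicted values."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)               # total variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained variation
    ssm = sum((fi - mean_y) ** 2 for fi in y_hat)           # explained variation
    return {"sst": sst, "sse": sse, "ssm": ssm, "r2": (sst - sse) / sst}

# Illustrative data only; y_hat comes from the least squares line for this data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]
parts = variation_decomposition(y, y_hat)
print({k: round(v, 2) for k, v in parts.items()})
# {'sst': 6.0, 'sse': 2.4, 'ssm': 3.6, 'r2': 0.6}
```

The identity SST = SSM + SSE holds only when the predictions come from a least squares fit that includes an intercept.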
24Multiple Regression
- A linear combination of predictor factors is used to predict the outcome or response factors
- The general form of the multiple regression model is
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$
where $\beta_1, \beta_2, \ldots, \beta_k$ are regression coefficients associated with the independent variables $X_1, X_2, \ldots, X_k$ and $\varepsilon$ is the error or residual.
25Multiple Regression (Contd.)
- The prediction equation in multiple regression analysis is
$\hat{Y} = a + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$
where $\hat{Y}$ is the predicted Y score and $b_1, \ldots, b_k$ are the partial regression coefficients.
26Partial Regression Coefficients
$Y = a + b_1 X_1 + b_2 X_2 + \text{error}$
- $b_1$ is the expected change in Y when $X_1$ is changed by one unit, keeping $X_2$ constant or controlling for its effects.
- $b_2$ is the expected change in Y for a unit change in $X_2$, when $X_1$ is held constant.
- If $X_1$ and $X_2$ are each changed by one unit, the expected change in Y will be $(b_1 + b_2)$
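A minimal sketch of estimating a, b1 and b2 and reading them as partial coefficients. It assumes ordinary least squares via the normal equations (X'X)b = X'y, solved with a small hand-written Gaussian elimination so that no external library is needed; all function names and the noise-free data are made up for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_multiple_regression(X, y):
    """Return [a, b1, ..., bk] from the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # leading 1 for the intercept a
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefs, x_row):
    """Predicted Y = a + b1*X1 + ... + bk*Xk."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], x_row))

# Noise-free illustrative data generated from Y = 1 + 2*X1 + 3*X2
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
coefs = fit_multiple_regression(X, y)

# Raising X1 and X2 by one unit each shifts the prediction by b1 + b2
joint_change = predict(coefs, (3, 3)) - predict(coefs, (2, 2))
```

With this data the fit recovers a ≈ 1, b1 ≈ 2 and b2 ≈ 3, so the joint one-unit change is about 5, the sum of the two partial coefficients.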
27Evaluating the Importance of Independent Variables
- Consider the t-values for the $\beta_i$'s
- Use beta coefficients when independent variables are in different units of measurement
- Standardized $\beta_i = b_i \times \dfrac{\text{Standard deviation of } x_i}{\text{Standard deviation of } Y}$
- Check for multicollinearity
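The standardization can be sketched with the standard library. The example assumes sample standard deviations (`statistics.stdev`) and made-up data; the second call rescales x by 100 to show that the beta coefficient does not depend on the unit of measurement.

```python
import statistics

def beta_coefficient(b, x_values, y_values):
    """Standardized coefficient: b times sd(x) divided by sd(y)."""
    return b * statistics.stdev(x_values) / statistics.stdev(y_values)

# Illustrative data; 0.6 is the least squares slope of y on x for this data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
beta = beta_coefficient(0.6, x, y)

# Same variable measured in units 100 times larger: the slope shrinks by 100,
# but the standardized coefficient is unchanged
x_rescaled = [100 * xi for xi in x]
beta_rescaled = beta_coefficient(0.6 / 100, x_rescaled, y)
print(f"{beta:.4f} {beta_rescaled:.4f}")  # 0.7746 0.7746
```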
28Stepwise Regression
- Predictor variables enter or are removed from the regression equation one at a time
- Forward Addition
- Start with no predictor variables in the regression equation
- i.e. $y = \beta_0 + \varepsilon$
- Add variables if they meet certain criteria in terms of F-ratio
29Stepwise Regression (Contd.)
- Backward Elimination
- Start with the full regression equation
- i.e. $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_r x_r + \varepsilon$
- Remove predictors based on F-ratio
- Stepwise Method
- Forward addition method is combined with removal of predictors that no longer meet specified criteria at each step
30Residual Plots
Random distribution of residuals
Nonlinear pattern of residuals
Heteroskedasticity
Autocorrelation
31Predictive Validity
- Examines whether any model estimated with one set of data continues to hold good on comparable data not used in the estimation.
- Estimation Methods
- The data are split into the estimation sample (with more than half of the total sample) and the validation sample, and the coefficients from the two samples are compared.
- The coefficients from the estimated model are applied to the data in the validation sample to predict the values of the dependent variable $Y_i$ in the validation sample, and then the model fit is assessed.
- The sample is split into halves (estimation sample and validation sample) for conducting cross-validation. The roles of the estimation and validation halves are then reversed, and the cross-validation is repeated.
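The last method can be sketched for a simple regression as follows. The sketch assumes the rows are already in random order, so splitting by position yields two comparable halves, and it scores fit on the validation half as 1 − SSE/SST, which can be negative when the model predicts worse than the validation mean. Function names and data are made up.

```python
def fit_line(x, y):
    """Least squares intercept and slope for a simple regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def validation_r2(y, y_hat):
    """1 - SSE/SST computed on data not used in the estimation."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - sse / sst

def double_cross_validation(x, y):
    """Estimate on one half, validate on the other, then swap the roles."""
    half = len(x) // 2
    first = (x[:half], y[:half])
    second = (x[half:], y[half:])
    scores = []
    for (x_est, y_est), (x_val, y_val) in [(first, second), (second, first)]:
        b0, b1 = fit_line(x_est, y_est)
        y_hat = [b0 + b1 * xi for xi in x_val]
        scores.append(validation_r2(y_val, y_hat))
    return scores

# Illustrative data only, listed in an arbitrary (assumed random) order
x = [1, 5, 3, 7, 2, 6, 4, 8]
y = [2.1, 10.1, 6.2, 13.8, 3.9, 12.2, 7.8, 16.1]
scores = double_cross_validation(x, y)
```

Similar validation scores in both directions suggest the estimated relationship holds on comparable data; a large drop relative to the estimation-sample fit is a warning sign.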
32Regression with Dummy Variables
$Y_i = a + b_1 D_1 + b_2 D_2 + b_3 D_3 + \text{error}$
- For rational buyers, $Y_i = a$
- For brand-loyal consumers, $Y_i = a + b_1$
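A brief sketch of how dummy coding produces these predictions. The reference category (rational buyers) has all dummies equal to zero. The segment labels other than `brand_loyal`, and all coefficient values, are hypothetical placeholders, since the slide does not name the remaining segments.

```python
def dummy_code(category, non_reference_categories):
    """Return [D1, D2, ...]: 1 for the matching category, 0 otherwise."""
    return [1 if category == c else 0 for c in non_reference_categories]

def predict_with_dummies(a, b, dummies):
    """Predicted Y = a + b1*D1 + b2*D2 + ..."""
    return a + sum(bj * dj for bj, dj in zip(b, dummies))

# D1, D2, D3; "rational" is the reference category and gets no dummy.
# "segment_3" and "segment_4" are placeholder names, not from the slides.
segments = ["brand_loyal", "segment_3", "segment_4"]
a, b = 10.0, [2.0, -1.0, 0.5]  # hypothetical coefficient values

y_rational = predict_with_dummies(a, b, dummy_code("rational", segments))
y_brand_loyal = predict_with_dummies(a, b, dummy_code("brand_loyal", segments))
print(y_rational, y_brand_loyal)  # 10.0 12.0
```

Each coefficient is therefore the difference between that segment's mean and the mean of the reference segment.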