Title: Multiple Regression Analysis (MRA)
1. Multiple Regression Analysis (MRA)
- Design requirements
- Multiple regression model
- R²
- Comparing standardized regression coefficients
2. Steps in data analysis
- Look first at each variable separately
- Then at relationships among the variables
- Examine the distribution of each variable to be
used in multiple regression to determine if there
are any unusual patterns that may be important in
building our regression analysis.
3. Distribution of variables
4. Correlation Analysis
- If interested only in determining whether a
relationship exists, use correlation analysis.
- Example: students' height and weight.
5. Correlation Analysis
- Correlation coefficient close to 1: strong
positive relationship.
- Correlation coefficient close to -1: strong
negative relationship.
- Correlation coefficient close to 0: no linear
relationship.
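As a rough illustration of how a correlation coefficient is computed, the sketch below calculates Pearson's r with NumPy. The heights and weights are made-up values for six hypothetical students, and `pearson_r` is a helper name chosen here, not something from the slides.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()          # deviations from the mean of x
    dy = y - y.mean()          # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical heights (cm) and weights (kg) for six students.
heights = [150, 160, 165, 170, 180, 185]
weights = [50, 56, 61, 66, 75, 80]

r = pearson_r(heights, weights)
print(f"r = {r:.3f}")  # a value close to 1 indicates a strong positive relationship
```

A value of r near -1 would arise if one variable fell steadily as the other rose, for example `pearson_r([1, 2, 3], [3, 2, 1])`.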
6. Example: Self Concept and Academic Achievement
(N = 103), Correlation
7. Multiple Regression Analysis (MRA)
- Method for studying the relationship between a
dependent variable and two or more independent
variables.
- Purposes:
- Prediction
- Explanation
- Theory building
8. Design Requirements
- One dependent variable (criterion)
- Two or more independent variables (predictor
variables).
- Sample size > 50 (at least 10 times as many
cases as independent variables)
9. Assumptions
- Independence: The scores of any particular
subject are independent of the scores of all
other subjects.
- Normality: In the population, the scores on the
dependent variable are normally distributed for
each of the possible combinations of the levels of
the X variables; each of the variables is
normally distributed.
10. Assumptions
- Homoscedasticity: In the population, the
variances of the dependent variable for each of
the possible combinations of the levels of the X
variables are equal.
- Linearity: In the population, the relation
between the dependent variable and the
independent variable is linear when all the other
independent variables are held constant.
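One informal way to probe the homoscedasticity and linearity assumptions is to inspect the residuals of a fitted model. The sketch below uses synthetic data generated to satisfy both assumptions; the coefficients, sample size, and the two diagnostics (a variance ratio and a correlation with a squared predictor) are illustrative choices made here, not formal tests from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): two predictors, constant error variance.
n = 1000
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=1.0, size=n)

# Ordinary least squares fit with an intercept column.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ coef
resid = y - fitted

# Homoscedasticity: residual variance should be similar for the lower and
# upper halves of the fitted values (ratio near 1).
order = np.argsort(fitted)
low = resid[order[: n // 2]]
high = resid[order[n // 2:]]
variance_ratio = high.var(ddof=1) / low.var(ddof=1)

# Linearity: residuals should show no leftover curvature, so their
# correlation with a squared predictor should be near 0.
curvature_corr = np.corrcoef(resid, X[:, 0] ** 2)[0, 1]

print(f"variance ratio (high/low fitted): {variance_ratio:.2f}")
print(f"corr(residuals, X1^2): {curvature_corr:.3f}")
```

In practice these checks are usually done by plotting residuals against fitted values and against each predictor; the numbers above are only a compact stand-in for such plots.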
11. Homoscedasticity (Homogeneity of variance)
12. Linear regression
- Simple linear regression models the relationship
between one explanatory variable (IV) and one
response variable (DV).
- In multiple regression, several explanatory
variables work together to explain the dependent
variable.
13. Models
14. What is a Model?
Representation of Some Phenomenon (Non-Math/Stats
Model)
15. What is a Math/Stats Model?
- Describe Relationship between Variables
- Types
- Deterministic Models
- (no randomness)
- Probabilistic Models
- (with randomness)
16. Deterministic Models
- Hypothesize Exact Relationships
- Suitable When Prediction Error is Negligible
- Example: Body mass index (BMI) is a measure of body
fat based on this formula.
- Non-metric formula: BMI = weight (pounds) × 703 / (height in inches)²
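The BMI formula is deterministic: the same inputs always give the same output, with no error term. A minimal sketch follows; the function name and the example person (150 lb, 65 in) are hypothetical.

```python
def bmi_from_pounds_inches(weight_lb: float, height_in: float) -> float:
    """Body mass index from weight in pounds and height in inches."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return weight_lb * 703 / height_in ** 2

# Hypothetical person: 150 pounds, 65 inches tall.
example_bmi = bmi_from_pounds_inches(150, 65)
print(round(example_bmi, 1))
```

Calling the function twice with the same arguments returns exactly the same value, which is what distinguishes it from a probabilistic model.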
17. Probabilistic Models
- Hypothesize 2 Components
- Deterministic
- Random Error
- Example: Systolic blood pressure (SBP) of
newborns is 6 times the age in days plus random
error.
- SBP = 6 × age(d) + ε
- Random error may be due to factors other than age
in days (e.g., birth weight)
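To make the two components concrete, the sketch below simulates the SBP example as a deterministic part (6 times age in days) plus a random error. The normal error distribution and its standard deviation of 3 are assumptions made purely for illustration; the slides do not specify them.

```python
import numpy as np

rng = np.random.default_rng(42)

age_days = np.arange(1, 8)        # newborn ages: days 1 through 7
deterministic = 6 * age_days      # deterministic component: 6 x age in days

# Random error component (assumed normal, mean 0, SD 3 for illustration).
error = rng.normal(loc=0.0, scale=3.0, size=age_days.size)
sbp = deterministic + error

for d, det, obs in zip(age_days, deterministic, sbp):
    print(f"day {d}: deterministic = {det}, simulated SBP = {obs:.1f}")
```

Two newborns of the same age get the same deterministic value but generally different simulated SBP values, because the error term stands in for everything the model leaves out.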
18. Types of Probabilistic Models
19. Regression Models
20. Types of Probabilistic Models
21. Regression Models
- Relationship between one dependent variable and
explanatory variable(s)
- Use equation to set up relationship
- Numerical Dependent (Response) Variable
- 1 or More Numerical or Categorical Independent
(Explanatory) Variables
- Used Mainly for Prediction and Estimation
22. Regression Modeling Steps
- 1. Hypothesize Deterministic Component
- Estimate Unknown Parameters
- 2. Specify Probability Distribution of Random
Error Term
- Estimate Standard Deviation of Error
- 3. Evaluate the Fitted Model
- 4. Use Model for Prediction and Estimation
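The four steps above can be sketched for a simple one-predictor case. The data are hypothetical, the straight-line form and normal errors are assumptions, and least squares via `numpy.linalg.lstsq` is one common way to estimate the parameters, not necessarily the procedure the lecture had in mind.

```python
import numpy as np

# Hypothetical data: x is the explanatory variable, y the response.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n = x.size

# Step 1: hypothesize y = b0 + b1*x and estimate b0, b1 by least squares.
A = np.column_stack([np.ones(n), x])
(b0, b1), *_ = np.linalg.lstsq(A, y, rcond=None)

# Step 2: assume errors with mean 0 and estimate their standard deviation.
resid = y - (b0 + b1 * x)
s = np.sqrt((resid ** 2).sum() / (n - 2))   # n - 2: two parameters estimated

# Step 3: evaluate the fitted model with r^2.
r2 = 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()

# Step 4: use the model to predict y at a new x value.
y_new = b0 + b1 * 7.0

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, s = {s:.3f}, r2 = {r2:.4f}")
print(f"predicted y at x = 7: {y_new:.2f}")
```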
23. Multiple Regression
- Very popular among social scientists.
- Most social phenomena have more than one cause.
- Very difficult to manipulate just one social
variable through experimentation.
- Social scientists must attempt to model complex
social realities to explain them.
24. Multiple Regression
- Allows us to:
- Use several variables at once to explain the
variation in a continuous dependent variable.
- Isolate the unique effect of one variable on the
continuous dependent variable while taking into
consideration that other variables are affecting
it too.
- Write a mathematical equation that tells us the
overall effects of several variables together and
the unique effects of each on a continuous
dependent variable.
- Control for other variables to demonstrate
whether bivariate relationships are spurious.
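The idea of controlling for other variables to expose a spurious bivariate relationship can be sketched with synthetic data. Here `z` is a hypothetical common cause of both `x` and `y`, while `x` has no direct effect on `y`; all names and coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

z = rng.normal(size=n)                # common cause
x = z + rng.normal(size=n)            # x depends on z
y = 2.0 * z + rng.normal(size=n)      # y depends on z only, not on x

def ols(predictors, response):
    """Least-squares coefficients; index 0 is the intercept."""
    A = np.column_stack([np.ones(len(response))] + list(predictors))
    coef, *_ = np.linalg.lstsq(A, response, rcond=None)
    return coef

bivariate_slope = ols([x], y)[1]       # y regressed on x alone
controlled_slope = ols([x, z], y)[1]   # y regressed on x, controlling for z

print(f"slope of x without control: {bivariate_slope:.2f}")
print(f"slope of x controlling for z: {controlled_slope:.2f}")
```

The slope on `x` is clearly nonzero when `x` is the only predictor and shrinks toward zero once `z` enters the equation, which is the pattern expected when a bivariate relationship is spurious.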
25. Multiple Regression
- For example
- A researcher may be interested in how
Education and Income relate to
Number of Children in a family.
Independent Variables: Education, Family Income
Dependent Variable: Number of Children
26. Multiple Regression
- For example
- Research Hypothesis: As education of respondents
increases, the number of children in families
will decline (negative relationship).
- Research Hypothesis: As family income of
respondents increases, the number of children in
families will decline (negative relationship).
Independent Variables: Education, Family Income
Dependent Variable: Number of Children
27. Multiple Regression
- For example
- Null Hypothesis: There is no relationship
between education of respondents and the number
of children in families.
- Null Hypothesis: There is no relationship
between family income and the number of children
in families.
Independent Variables: Education, Family Income
Dependent Variable: Number of Children
28. Multiple Regression
57% of the variation in number of children is
explained by education and income!
29. Explaining Variation: How Much?
Total variation in Y:
- Predictable variation (by the combination of
independent variables)
- Unpredictable variation
30. Proportion of Predictable and Unpredictable
Variation
- R²: predictable (explained) variation in Y
- (1 - R²): unpredictable (unexplained) variation in Y
Where Y = Children, X1 = Education, X2 = Income
[Figure: diagram with regions labeled Y, X1, and X2]
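The split of total variation into predictable and unpredictable parts can be computed directly. The sketch below uses a small invented data set for children, education, and income (it does not reproduce the figure reported in the lecture) and fits the model by least squares with NumPy.

```python
import numpy as np

# Hypothetical data for eight families (illustrative values only).
education = np.array([8, 10, 12, 12, 14, 16, 16, 18], dtype=float)  # years
income = np.array([20, 25, 30, 45, 40, 60, 55, 80], dtype=float)    # thousands
children = np.array([5, 4, 3, 3, 2, 2, 1, 1], dtype=float)

# Fit children = b0 + b1*education + b2*income by least squares.
A = np.column_stack([np.ones(children.size), education, income])
coef, *_ = np.linalg.lstsq(A, children, rcond=None)
fitted = A @ coef

ss_total = ((children - children.mean()) ** 2).sum()      # total variation in Y
ss_explained = ((fitted - children.mean()) ** 2).sum()    # predictable part
ss_unexplained = ((children - fitted) ** 2).sum()         # unpredictable part

r2 = ss_explained / ss_total
print(f"R2 = {r2:.3f}, 1 - R2 = {1 - r2:.3f}")
```

Because the model includes an intercept, the explained and unexplained sums of squares add up to the total, so R² and 1 - R² are the two proportions shown on the slide.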
31. Multiple Regression
- Now More Variables!
- The social world is very complex.
- What happens when you have even more variables?
- For example
- A researcher may be interested in the effects of
Education, Income, Sex, and Gender Attitudes on
Number of Children in a family.
Dependent Variable: Number of Children
Independent Variables: Education, Family
Income, Sex, Gender Attitudes
32. Simple vs. Multiple Regression
Multiple regression:
- One dependent variable Y predicted from a set of
independent variables (X1, X2, ..., Xk)
- One regression coefficient for each independent
variable
- R²: proportion of variation in dependent variable
Y predictable by the set of independent variables
(Xs)
Simple regression:
- One dependent variable Y predicted from one
independent variable X
- One regression coefficient
- r²: proportion of variation in dependent variable
Y predictable from X
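Written as equations, the contrast might look like the following. The symbols b_0 (intercept), b_1 through b_k (regression coefficients), and e (random error) are notation assumed here; the slides name only Y and the Xs.

```latex
% Simple regression: one predictor, one regression coefficient
Y = b_0 + b_1 X + e

% Multiple regression: k predictors, one regression coefficient each
Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k + e

% In both cases the proportion of predictable variation is
R^2 = 1 - \frac{\sum_i (Y_i - \hat{Y}_i)^2}{\sum_i (Y_i - \bar{Y})^2}
```

With a single predictor, R² reduces to r², the square of the correlation between X and Y.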
33. Different Ways of Building Regression Models
- Simultaneous (Enter): All independent variables
entered together
- Stepwise: Independent variables entered according
to some order (determined by researcher)
- By size of correlation with dependent variable
- In order of significance (theory)
- Hierarchical (Forward, Backward): Independent
variables entered in stages
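Entering variables in stages is often summarized by how much R² changes at each stage. The sketch below uses synthetic data in which `x1` and `x2` affect `y` and `x3` does not; the variable names, effect sizes, and entry order are all hypothetical.

```python
import numpy as np

def r_squared(predictors, y):
    """R^2 of a least-squares fit of y on the given predictors plus intercept."""
    A = np.column_stack([np.ones(len(y))] + list(predictors))
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()

rng = np.random.default_rng(7)
n = 500
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)   # x3 is unrelated to y

# Enter predictors in stages and record R^2 after each stage.
stages = [[x1], [x1, x2], [x1, x2, x3]]
r2_by_stage = [r_squared(stage, y) for stage in stages]

for i, r2 in enumerate(r2_by_stage, start=1):
    change = r2 - (r2_by_stage[i - 2] if i > 1 else 0.0)
    print(f"stage {i}: R2 = {r2:.3f}, change = {change:.3f}")
```

R² never decreases when a predictor is added, so the size of the change, not its sign, indicates whether a stage contributes anything; here the final stage adds almost nothing.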
34. Multiple Regression: BLUE Criteria
- Regression forces a best-fitting model onto data.
If the model is appropriate for the data,
regression should be used.
- How do we know that our model is appropriate for
the data?
- Criteria for determining whether a regression
model is appropriate for the data are nicknamed
BLUE, for best linear unbiased estimate.
35. Multiple Regression: BLUE Criteria
- Violating the BLUE assumptions may result in
biased estimates or incorrect significance tests.
(However, OLS is robust to most violations.)
- Data (constellation) should meet these criteria:
- The relationship between the dependent variable
and its predictors is linear.
- No relevant variables are omitted from, and no
irrelevant variables are included in, the equation.
(Good luck!)
- All variables are measured without error. (Good
luck!)