Sociology 680 - PowerPoint PPT Presentation

About This Presentation

Title:

Sociology 680

Description:

Sociology 680 Multivariate Analysis Logistic Regression The Analysis of Categories What is Logistic Regression Logistic regression is typically used as an extension ... – PowerPoint PPT presentation

Slides: 12

Provided by: Jera153

Learn more at: http://www.csun.edu

Transcript and Presenter's Notes

1
Sociology 680

Multivariate Analysis
Logistic Regression

2
The Analysis of Categories
Linear Models (DV: Quantity)
IV Quantity: 2) Structural Equation Models (SEM)
IV Category: 1) Analysis of Variance Models (ANOVA)
Category Models (DV: Category)
IV Quantity: 4) Logistic Regression Models (LRM)
IV Category: 3) Log Linear Models (LLM)
3
What is Logistic Regression?

Logistic regression is typically used as an
extension of multiple regression, particularly
adapted to situations where the DV is categorical
and IVs are continuous.
It is not, however, restricted to quantitative
IVs. In fact, to the extent the IVs are
categorical themselves, logistic regression can
be thought of as an extension of log linear
modeling, where we are interested in
differentiating the IVs and DV.
If the categorical DV is dichotomous
(two outcomes), it is called Binomial Logistic
Regression. If the DV has more than two
attributes, it is called Multinomial Logistic
Regression. If the DV categories can be ranked,
it is called Ordinal Logistic Regression.

4
The Premise of Logistic Regression

Logistic Regression is similar to OLS regression,
with the exception that it is based on the IVs'
prediction of probabilities, odds, and the
logarithm of the odds for a categorical DV,
rather than the prediction of specific values of
a quantitative DV.
For example, age and income become predictors of
the likelihood of a dichotomous DV
like union membership, rather than of some
quantitative variable, such as occupational
prestige, which would be predicted in a multiple
regression / path analysis.

5
Probability and Odds

Consider the following distribution of union
membership for 650 respondents: members = 212,
non-members = 438.
The probability of being a member is
simply the number of members (the outcome of
interest) expressed as a proportion of the total
possible, e.g. P(M) = 212 / 650 = .326.
The odds are the ratio of the probability
P(x) to its complement, 1 - P(x). Using the
example above, the odds of being a member are
P(M) / (1 - P(M)) = .326 / .674 = .484.
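
The arithmetic on this slide can be checked with a short Python sketch. The counts come from the slide; the variable names are only illustrative.

```python
# Counts taken from the slide: 212 members, 438 non-members.
members = 212
non_members = 438
total = members + non_members  # 650 respondents

# Probability: the outcome of interest as a proportion of the total.
p_member = members / total

# Odds: the probability divided by its complement.
odds_member = p_member / (1 - p_member)

print(f"P(M) = {p_member:.3f}")     # P(M) = 0.326
print(f"odds = {odds_member:.3f}")  # odds = 0.484
```

Note that the odds can also be read straight from the counts as 212 / 438, since the total cancels out.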

6
The Logit of Logistic Regression

The index analyzed by logistic regression is the
log of the odds. In our example, the odds were
.484, and the log of the odds is ln(.484) =
-.726. This is called a logit and is simply
the natural logarithm of the odds of being in
that category.
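
As a rough check, the logit for the union example can be computed in Python and then converted back to a probability. The sketch below uses the unrounded odds 212 / 438, for which the logit comes out near -0.726; the variable names are illustrative only.

```python
import math

# Odds of membership from the slide's counts (about 0.484).
odds_member = 212 / 438

# The logit is the natural logarithm of the odds.
logit_member = math.log(odds_member)

# Going back: odds = e^logit, probability = odds / (1 + odds).
odds_back = math.exp(logit_member)
p_back = odds_back / (1 + odds_back)

print(f"logit = {logit_member:.3f}")
print(f"P(M)  = {p_back:.3f}")
```

A negative logit simply means the odds are below 1, i.e. the outcome is less likely than not.
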
In our union membership example, we might want to
know the effect that a one-unit change in the value
of age or income has on predicting union
membership. In logistic regression this effect is
expressed as an odds ratio, reported as Exp(B) in
SPSS. It is defined as the ratio of the odds of being
classified in one category of the DV for two
different values of the IV.
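
To make the definition concrete, here is a minimal Python sketch. The coefficient for age (0.05) and the starting logit are made-up values for illustration, not results from the course data. It compares the odds at two values of the IV one unit apart and shows that their ratio equals e raised to the coefficient.

```python
import math

# Hypothetical logit coefficient for age (not from the slides).
b_age = 0.05
# Hypothetical logit for a respondent at some reference age.
logit_at_x = -0.726

odds_at_x = math.exp(logit_at_x)
odds_at_x_plus_1 = math.exp(logit_at_x + b_age)  # age one unit higher

# Odds ratio: odds at the higher value divided by odds at the lower value.
odds_ratio = odds_at_x_plus_1 / odds_at_x

print(f"odds ratio = {odds_ratio:.4f}")
print(f"exp(B)     = {math.exp(b_age):.4f}")
```

The ratio does not depend on the starting logit, which is why a single Exp(B) value can summarize the effect of a one-unit change in an IV.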

7
The Formulae of Logistic Regression

Taking the definitions of probability, odds and
logits into account, we produce a formula that is
equivalent to a regression equation and is
characterized by the value of the logit, where
logit = B1X1 + B2X2 + ... + BkXk = ln[P / (1 - P)].
Estimating the logit is a somewhat involved
calculation based on
the expected values of the odds ratios, but for
us, let's look at it as a number that gets us to
where we can comment on the probability of
observing one outcome or another on the DV,
given the best linear combination of IV
predictors.
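
One way to see how the linear combination of IVs "gets us to" a probability is the sketch below. The B weights and IV values are hypothetical, chosen only for illustration, and the function name is our own. A fitted model would normally also include a constant term; it is left out here to match the formula as given on the slide.

```python
import math

def predicted_probability(coefficients, values):
    """Turn a linear combination of IVs (a logit) into a probability."""
    logit = sum(b * x for b, x in zip(coefficients, values))
    # Inverse of ln[P / (1 - P)]: P = 1 / (1 + e^(-logit)).
    return 1 / (1 + math.exp(-logit))

# Hypothetical B weights and IV values, for illustration only.
b = [0.02, -0.00001]  # B1 (age), B2 (income)
x = [40, 50000]       # X1 = 40 years, X2 = 50,000

p = predicted_probability(b, x)
print(f"P = {p:.3f}")
```

Taking ln[P / (1 - P)] of the result returns the original linear combination, so the two forms carry the same information.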

8
Example of Logistic Regression

Suppose we had a dichotomous dependent variable
such as job satisfaction (satisfied vs. not
satisfied), and wanted to know the ability of age
and hours worked (as IVs) to predict the
likelihood of being satisfied with one's job, or
not (i.e. to predict the likelihood of being in
one category or the other).
This would be equivalent to a multiple regression
analysis if job satisfaction were a continuous
dependent variable. But it is not. Therefore,
we use the binary logistic regression procedure
to identify the equivalent of beta weights, the
multiple R and residuals.
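
For readers without SPSS, the idea of estimating the weights can be sketched in plain Python. Everything here is an assumption made for illustration: the twelve data rows are invented, the IVs are standardized first, and the weights are found by simple gradient ascent on the log-likelihood. This is not the SPSS procedure, only a small demonstration of the same kind of model.

```python
import math

# Hypothetical data: (age, hours worked, satisfied 1/0). Not from the slides.
rows = [
    (23, 45, 0), (25, 30, 1), (31, 50, 0), (35, 38, 1),
    (38, 42, 1), (41, 55, 0), (44, 36, 1), (47, 40, 0),
    (52, 35, 1), (55, 48, 0), (58, 32, 1), (60, 44, 1),
]

def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd for v in column]

age = standardize([r[0] for r in rows])
hours = standardize([r[1] for r in rows])
y = [r[2] for r in rows]
X = [(1.0, a, h) for a, h in zip(age, hours)]  # leading 1.0 is the constant

def prob(weights, x):
    """Predicted probability of being satisfied for one respondent."""
    logit = sum(w * v for w, v in zip(weights, x))
    return 1 / (1 + math.exp(-logit))

def log_likelihood(weights):
    """Sum of log predicted probabilities of the observed outcomes."""
    return sum(
        math.log(prob(weights, x)) if yi == 1 else math.log(1 - prob(weights, x))
        for x, yi in zip(X, y)
    )

weights = [0.0, 0.0, 0.0]  # constant, age, hours
ll_start = log_likelihood(weights)
for _ in range(2000):
    # Gradient of the mean log-likelihood: average of (y - p) * x.
    grad = [
        sum((yi - prob(weights, x)) * x[j] for x, yi in zip(X, y)) / len(y)
        for j in range(3)
    ]
    weights = [w + 0.5 * g for w, g in zip(weights, grad)]
ll_end = log_likelihood(weights)

print("weights (constant, age, hours):", [round(w, 3) for w in weights])
print("log-likelihood:", round(ll_start, 3), "->", round(ll_end, 3))
```

Because the IVs are standardized, the resulting weights are in standard-deviation units rather than years or hours, so they are not directly comparable to unstandardized B values in SPSS output.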

9
SPSS Input for Logistic Regression

In SPSS, this procedure is accessed through the
menus Analyze > Regression > Binary Logistic.

10
Output 1 for Logistic Regression
There are two important pieces of output to
review in assessing the effect of the IVs. The
first is a classification table that uses the
values of the logit to generate predicted frequencies
for each category of the DV. When these are compared to
the observed frequencies, we can determine the
percentage correct in using our IVs to
predict DV outcomes.
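
The logic of a classification table can be sketched as follows. The observed outcomes and predicted probabilities are invented for illustration, and the 0.5 cut value is an assumption of this sketch.

```python
# Hypothetical observed outcomes (1 = satisfied) and predicted probabilities.
observed = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_p = [0.81, 0.35, 0.62, 0.44, 0.58, 0.20, 0.73, 0.49]

CUTOFF = 0.5  # assumed cut value for assigning a predicted category

# table[(observed, predicted)] -> number of cases in that cell
table = {(o, p): 0 for o in (0, 1) for p in (0, 1)}
for obs, prob in zip(observed, predicted_p):
    predicted = 1 if prob >= CUTOFF else 0
    table[(obs, predicted)] += 1

# Cases on the diagonal are the ones the model classified correctly.
correct = table[(0, 0)] + table[(1, 1)]
percent_correct = 100 * correct / len(observed)

print(table)
print(f"percentage correct = {percent_correct:.1f}")  # percentage correct = 75.0
```

The off-diagonal cells, (1, 0) and (0, 1), count the two kinds of misclassification.
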
11
Output 2 for Logistic Regression
The second output to be looked at is the table of
coefficients. Here, it would show the beta
weights for each variable and demonstrate that the
likelihood of job satisfaction is
marginally lower for each unit increase in age and
marginally higher for each unit increase in hours
worked the previous week. However, because its
coefficient is not significant, age is a weak
predictor IV.
