Sociology 680 - PowerPoint PPT Presentation

Slides: 12
Provided by: Jera153
Learn more at: http://www.csun.edu

1
Sociology 680
  • Multivariate Analysis
  • Logistic Regression

2
The Analysis of Categories
                IV: Quantity                           IV: Category
DV: Quantity    2) Structural Equation Models (SEM)    1) Analysis of Variance Models (ANOVA)    [Linear Models]
DV: Category    4) Logistic Regression Models (LRM)    3) Log Linear Models (LLM)                [Category Models]
3
What is Logistic Regression?
  • Logistic regression is typically used as an
    extension of multiple regression, particularly
    adapted to situations where the DV is categorical
    and IVs are continuous.
  • It is not, however, restricted to quantitative
    IVs. When the IVs are themselves categorical,
    logistic regression can be thought of as an
    extension of log linear modeling in which we
    distinguish the IVs from the DV.
  • If the categorical DV is dichotomous
    (two outcomes), it is called binomial logistic
    regression. If the DV has more than two
    categories, it is called multinomial logistic
    regression. If the DV categories can be ranked,
    it is called ordinal logistic regression.

4
The Premise of Logistic Regression
  • Logistic regression is similar to OLS regression,
    except that it uses the IVs to predict
    probabilities, odds, and the logarithm of the
    odds for a categorical DV, rather than
    specific values of a quantitative DV.
  • For example, age and income become predictors of
    the likelihood of a dichotomous DV
    like union membership, rather than of some
    quantitative variable, such as occupational
    prestige, which would be predicted in a multiple
    regression / path analysis.

5
Probability and Odds
  • Consider the following distribution of union
    membership for 650 respondents: members = 212,
    non-members = 438.
  • The probability of being a member would be
    simply the number of members (outcome of
    interest) expressed as a proportion of the total
    possible (e.g. P(M) = 212 / 650 = .326).
  • The odds would be the ratio of the probability
    P(x) to its complement (1 - P(x)). Using the
    example above, the odds of being a member would
    be P(M) / (1 - P(M)) = .326 / .674 = .484.
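
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and do not come from the original presentation.

```python
# Union membership counts from the slide
members = 212
non_members = 438
total = members + non_members  # 650 respondents

# Probability: outcome of interest as a proportion of the total
p_member = members / total

# Odds: probability divided by its complement
odds_member = p_member / (1 - p_member)

print(round(p_member, 3))     # 0.326
print(round(odds_member, 3))  # 0.484
```

Note that the odds are the same as the simple ratio of members to non-members (212 / 438).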

6
The Logit of Logistic Regression
  • The index analyzed by logistic regression is the
    log of the odds. In our example, the odds were
    .484, and the log of the odds is ln(.484) =
    -.726. This is called a logit and is simply
    the natural logarithm of the odds of being in
    that category.
  • In our union membership example, we might want to
    know the effect that a one-unit change in the value
    of age or income has on predicting union
    membership. In logistic regression this effect is
    expressed as an odds ratio (reported as Exp(B) in
    SPSS). It is defined as the ratio of the odds of
    being classified in one category of the DV for two
    different values of the IV.
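
A short Python sketch of the two ideas on this slide follows: the logit computed from the union membership counts, and an odds ratio for a one-unit change in an IV. The coefficient `b_age` and the helper `odds_at` are hypothetical, chosen only for illustration; they are not estimates from the presentation.

```python
import math

# Odds of union membership from the slide's counts (212 members, 438 non-members)
odds_member = 212 / 438
logit_member = math.log(odds_member)  # natural log of the odds
print(round(logit_member, 3))  # -0.726

# Hypothetical coefficient B for age (illustrative only, not from the slides)
b_age = 0.05

def odds_at(age):
    """Odds implied by a logit of b_age * age (no other predictors)."""
    return math.exp(b_age * age)

# Ratio of the odds for two values of the IV that are one unit apart
odds_ratio = odds_at(41) / odds_at(40)
print(math.isclose(odds_ratio, math.exp(b_age)))  # True
```

The last line illustrates why the odds ratio for a one-unit change equals the exponentiated coefficient: the ratio does not depend on which two adjacent ages are compared.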

7
The Formulae of Logistic Regression
  • Taking the definitions of probability, odds, and
    logits into account, we produce a formula that is
    equivalent to a regression equation and is
    characterized by the logit, where
    logit = B1X1 + B2X2 + ... + BkXk = ln[P / (1 - P)]
    and P is the probability of the outcome of interest.
  • Obtaining the logit is a somewhat involved
    calculation based on the expected values of the
    odds ratios, but for us, let's look at it as a
    number that gets us to where we can comment on
    the probability of observing one outcome or
    another on the DV, given the best linear
    combination of IV predictors.
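
To show how a logit "gets us to" a probability, the relationship ln[P / (1 - P)] = logit can be inverted to give P = 1 / (1 + e^(-logit)). The sketch below assumes that form; the function names are illustrative, not part of the presentation.

```python
import math

def logit_from_predictors(coefs, values):
    """Linear combination B1*X1 + ... + Bk*Xk (the logit)."""
    return sum(b * x for b, x in zip(coefs, values))

def probability_from_logit(logit):
    """Invert ln(P / (1 - P)) = logit to recover P."""
    return 1.0 / (1.0 + math.exp(-logit))

# Check against the union membership example: the logit of the odds 212/438
# should map back to the probability 212/650.
p = probability_from_logit(math.log(212 / 438))
print(round(p, 3))  # 0.326
```

A logit of 0 corresponds to even odds, so `probability_from_logit(0)` returns 0.5.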

8
Example of Logistic Regression
  • Suppose we had a dichotomous dependent variable
    such as job satisfaction (satisfied vs. not
    satisfied), and wanted to know the ability of age
    and hours worked (as IVs) to predict the
    likelihood of being satisfied with one's job, or
    not (i.e. to predict the likelihood of being in
    one category or the other).
  • This would be equivalent to a multiple regression
    analysis if job satisfaction were a continuous
    dependent variable. But it is not. Therefore,
    we use the binary logistic regression procedure
    to identify the equivalent of beta weights, the
    multiple R, and residuals.
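
As a rough illustration of what such a procedure estimates, the sketch below fits a binary logistic regression by plain gradient ascent on the log-likelihood. Several things here are assumptions and not from the presentation: the eight data rows are made up, a constant term is included, the predictors are standardized for numerical stability, and gradient ascent stands in for whatever estimation algorithm SPSS actually uses.

```python
import math

# Hypothetical illustration data (NOT from the slides):
# (age, hours worked, satisfied: 1 = yes, 0 = no)
rows = [
    (25, 50, 0), (32, 45, 0), (41, 40, 1), (38, 38, 1),
    (52, 35, 1), (29, 47, 1), (47, 36, 0), (60, 30, 1),
]

def standardize(values):
    """Rescale to mean 0 and standard deviation 1."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]

ages = standardize([r[0] for r in rows])
hours = standardize([r[1] for r in rows])
y = [r[2] for r in rows]
X = [(1.0, a, h) for a, h in zip(ages, hours)]  # leading 1.0 is an assumed constant term

def predict(b, x):
    """Predicted probability of being satisfied for one case."""
    z = sum(bj * xj for bj, xj in zip(b, x))
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b):
    """Log-likelihood of the observed outcomes under coefficients b."""
    return sum(
        yi * math.log(predict(b, xi)) + (1 - yi) * math.log(1 - predict(b, xi))
        for xi, yi in zip(X, y)
    )

def fit(steps=2000, rate=0.1):
    """Gradient ascent on the mean log-likelihood, starting from all zeros."""
    b = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [
            sum((yi - predict(b, xi)) * xi[j] for xi, yi in zip(X, y)) / len(X)
            for j in range(3)
        ]
        b = [bj + rate * gj for bj, gj in zip(b, grad)]
    return b

coefs = fit()  # [constant, B for age, B for hours], in standardized units
print(coefs)
print(log_likelihood([0.0, 0.0, 0.0]), log_likelihood(coefs))
```

Because the predictors are standardized, each coefficient here describes the change in the logit per standard deviation, not per year of age or per hour worked.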

9
SPSS Input for Logistic Regression
  • In SPSS, this procedure is accessed through the
    menus Analyze > Regression > Binary Logistic.

10
Output 1 for Logistic Regression
There are two important pieces of output to
review in assessing the effect of the IVs. The
first is a classification table that uses the
values of the logit to generate predicted frequencies
for each category of the DV. When these are compared to
the observed frequencies, we can determine the
percentage correct in using our IVs to
predict DV outcomes.
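
A classification table of this kind can be sketched as follows. The observed outcomes and predicted probabilities are made up for illustration, and the 0.5 cutoff for assigning a case to the predicted category is an assumption of this sketch.

```python
def classification_table(observed, probabilities, cutoff=0.5):
    """Cross-tabulate observed vs. predicted categories; return table and percent correct."""
    # Keys are (observed, predicted) pairs, values are case counts
    table = {(o, p): 0 for o in (0, 1) for p in (0, 1)}
    for obs, prob in zip(observed, probabilities):
        predicted = 1 if prob >= cutoff else 0
        table[(obs, predicted)] += 1
    correct = table[(0, 0)] + table[(1, 1)]
    return table, 100.0 * correct / len(observed)

# Hypothetical observed outcomes and predicted probabilities (NOT from the slides)
observed = [1, 0, 1, 1, 0, 0]
probabilities = [0.8, 0.3, 0.4, 0.9, 0.6, 0.2]

table, pct_correct = classification_table(observed, probabilities)
print(table)
print(round(pct_correct, 1))  # 66.7
```

Here four of the six cases fall on the diagonal (observed and predicted category agree), which gives the percentage correct.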
11
Output 2 for Logistic Regression
The second output to be looked at is the table of
coefficients. Here, it would show the beta
weights for each variable and demonstrate that the
log odds of being satisfied with one's job are
marginally lower for each unit increase in age and
marginally higher for each unit increase in hours
worked the previous week. However, because its
coefficient is not statistically significant, age
is a weak predictor.