Title: Discriminant Function Analysis (DFA)
1. Discriminant Function Analysis (DFA)
- Goal
- describe or predict group membership from a set of predictors
- group membership: discrete/binary variable
- predictors: typically continuous variables
- For example,
- group membership variable: psychiatric diagnosis
- three groups: schizophrenic, bipolar, multiple personality
- predictor variables
- IQ, self-concept, positive affect, negative affect
2. General Purpose and Description
- To determine which attributes contribute most to group separation
- DFA is MANOVA turned around
- MANOVA: group membership associated with group differences on multiple DVs
- they are mathematically identical
- Major differences
- DFA can classify (predict)
- DFA generally looks only at main effects
3. General Description
- Predictors are combined to predict group membership
- weighted combinations of predictors are called linear discriminant functions (LDFs or canonical variables)
- LDF = a + b1X1 + b2X2 + b3X3 + ...
- linear = linear combinations of linearly weighted variables
- discriminant = weights chosen to maximally separate groups
- function = constructed from other variables
- so each LDF defines a "new" variable
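As a minimal sketch of how one discriminant score is formed, the snippet below evaluates a + b1X1 + b2X2 + ... for a single case. The intercept, weights, and predictor values are made-up numbers for illustration only, and `ldf_score` is a hypothetical helper name rather than part of any statistics package.

```python
def ldf_score(intercept, weights, values):
    """Return a + b1*X1 + b2*X2 + ... for one case."""
    if len(weights) != len(values):
        raise ValueError("need exactly one weight per predictor")
    return intercept + sum(b * x for b, x in zip(weights, values))


# Hypothetical intercept, weights, and predictor scores (illustration only).
score = ldf_score(1.0, [0.5, -0.25, 2.0], [4.0, 8.0, 1.5])
print(score)
```

Every case gets one such score per LDF, which is why each LDF can be treated as a "new" variable.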
4. General Description
- Separating groups
- as with canonical correlation, you can have multiple LDFs
- i.e., groups can be separated in multiple ways, for example
- first LDF might separate bipolars from schizophrenic and multiple personality
- second LDF might separate schizophrenic from multiple personality
- What separates these groups?
- differences on the predictors
- 2 groups: b1X1 + b2X2 + b3X3 + ...
5. General Description
[Figure: plot of the multiple personality group and the bipolar group on the Self-Concept and IQ axes]
6. General Description
[Figure: plot of the multiple personality group and the bipolar group on the Self-Concept and IQ axes]
7. Types of DFA
- Direct
- throw all the predictors in at once
- typically used
- Sequential
- predictors entered based on priority
- similar to hierarchical multiple regression
- Stepwise
- the statistically best predictors fight it out
8. Limitations/Rules of Thumb
- Typically only look at main effects of continuous predictors
- Non-normality, nonlinearity, and multicollinearity are key concerns
- Equality of within-group covariances
- Box's M tests this (use p < .001 as your rule)
- Typically minimum sample size is 10 × the number of predictors
- Number in the smallest group > number of predictors
9. The Process
- Overall tests of LDFs
- Wilks' Lambda (Λ) is used to determine if there are significant LDFs
- this statistic is evaluated against an approximate χ² distribution
- we want this to be significant
- effect size?
- Next, statistically determine how many LDFs
- number of potential LDFs is the smaller of
- number of predictors
- number of groups - 1
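The upper bound on the number of LDFs stated above can be sketched in a couple of lines. `max_ldfs` is a hypothetical helper name; the example calls use the 4 predictors and the 3-group and 2-group diagnosis examples from these slides.

```python
def max_ldfs(n_predictors, n_groups):
    """Upper bound on the number of LDFs: min(predictors, groups - 1)."""
    if n_predictors < 1 or n_groups < 2:
        raise ValueError("need at least 1 predictor and 2 groups")
    return min(n_predictors, n_groups - 1)


print(max_ldfs(4, 3))  # 4 predictors, 3 groups -> 2
print(max_ldfs(4, 2))  # 4 predictors, 2 groups -> 1
```

This is only the number of possible functions; how many are retained still depends on the significance tests.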
10. Evaluating individual LDFs
- There will be a Λ value for each possible LDF
- along with an associated significance test
- As with canonical correlation, these are examined hierarchically
- first LDF: if significant, test a second
- first LDF always separates groups the best
- accounts for the most explained variance
- second LDF, orthogonal to first
- tested the same way
11. Evaluating individual LDFs
- Once the number of LDFs is determined, examine the functions at the group centroids
- describes the group means for the LDF(s)
- these means are standardized
12. Evaluating individual LDFs
- Group centroids are calculated by
- calculating/creating a discriminant score for every participant
- calculating the mean for each group
- LDF_1 = .05(Z_IQ) + .36(Z_SC) + .40(Z_PA) + .60(Z_NA)
- for the schizophrenic group
- substitute the standardized values for each of the 4 predictors for each member of this group
- calculate the mean LDF for this group
- for the bipolar group
- substitute the standardized values for each of the 4 predictors for each member of this group
- calculate the mean LDF for this group
- this could be done with raw scores as well
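The centroid procedure above can be sketched as follows. The coefficients are the LDF_1 weights from the slide; the z-scores for the two small groups are invented purely for illustration, and `ldf1` and `centroid` are hypothetical helper names. Centroids reported by statistical software may be scaled differently, so treat this as an illustration of the steps rather than a reproduction of any output.

```python
from statistics import mean

# Standardized coefficients for LDF_1 as given on the slide.
COEFS = {"IQ": 0.05, "SC": 0.36, "PA": 0.40, "NA": 0.60}


def ldf1(z):
    """Discriminant score for one participant from standardized predictors."""
    return sum(COEFS[name] * z[name] for name in COEFS)


def centroid(group):
    """Group centroid: the mean discriminant score of the group's members."""
    return mean(ldf1(z) for z in group)


# Hypothetical z-scores (illustration only, not real data).
schizophrenic = [
    {"IQ": 0.0, "SC": -1.0, "PA": -0.5, "NA": -0.5},
    {"IQ": 0.2, "SC": -0.5, "PA": -1.0, "NA": 0.0},
]
bipolar = [
    {"IQ": 0.0, "SC": 1.0, "PA": 0.5, "NA": 0.5},
    {"IQ": -0.2, "SC": 0.5, "PA": 1.0, "NA": 0.0},
]

print(round(centroid(schizophrenic), 3))
print(round(centroid(bipolar), 3))
```

The same two steps (score every participant, then average within each group) apply unchanged when raw-score coefficients are used instead of standardized ones.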
13. Evaluating group centroids and individual predictors
- assume two groups (1 = schizophrenic, 2 = bipolar)
- group centroids will be generated
- Simply tells us that bipolars have higher LDF scores
- Must evaluate individual predictors to determine what variable(s) is/are responsible for these differences

Group            Function
schizophrenic    -.72
bipolar           .72
14. Evaluating group centroids and individual predictors
- A brief detour for 2 LDFs
- assume we find two LDFs for our 3 groups
- 1 = schizophrenic, 2 = bipolar, 3 = multiple personality
- create an LDF plot based on these centroids to examine group separation

Group                   LDF_1    LDF_2
schizophrenic           -.72      0
bipolar                  .72      1.30
multiple personality    -.71     -1.35
15. LDF Plot
[Figure: LDF plot showing the bipolar, schizophrenic, and multiple personality group centroids]
16. Relations between LDF(s) and individual variables
- Standardized function coefficients
- unique contribution of each predictor to an LDF
- highest coefficients are those that will show the largest group difference
- use .30 as an indicator of practical significance
- LDF_1 = .05(IQ) + .36(SC) + .40(PA) + .60(NA)

Predictors         LDF_1
IQ                 .05
Self-Concept       .36
Positive Affect    .40
Negative Affect    .60
17. Relations between LDF(s) and individual variables
- remember that bipolars have a positive group centroid and schizophrenics a negative one
- positive coefficients tell us bipolar individuals have better self-concepts and higher PA and NA than schizophrenic individuals

Predictors         LDF_1
IQ                 .05
Self-Concept       .36
Positive Affect    .40
Negative Affect    .60
18. Relations between LDF(s) and individual variables
- Correlations between predictors and LDF are called loadings
- presented in the structure matrix
- interpret as was done with standardized coefficients

Predictors         LDF
IQ                 .25
Self-Concept       .50
Positive Affect    .70
Negative Affect    .80
19. Classification
- Knowing what we know, how well can we predict group membership?
- internal classification vs. external classification
- Simply compare predicted to actual classification for each group
- Predicted classification is based on the LDF for each group
- Cases whose predicted values are closest to a group's centroid are classified into that group
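The "closest centroid" rule can be sketched with the two-LDF centroids from the earlier three-group table. This is a minimal sketch assuming plain Euclidean distance in LDF space and equal prior probabilities; statistical packages typically use posterior probabilities that also account for priors, so `classify` is a hypothetical simplification, and the example scores are invented.

```python
import math

# Group centroids (LDF_1, LDF_2) taken from the three-group table in the slides.
CENTROIDS = {
    "schizophrenic": (-0.72, 0.0),
    "bipolar": (0.72, 1.30),
    "multiple personality": (-0.71, -1.35),
}


def classify(scores):
    """Assign a case to the group whose centroid is nearest (Euclidean)."""
    return min(CENTROIDS, key=lambda g: math.dist(scores, CENTROIDS[g]))


# Hypothetical (LDF_1, LDF_2) scores for two cases.
print(classify((0.9, 1.0)))
print(classify((-0.6, -1.2)))
```

Comparing these predicted labels with the actual diagnoses, case by case, yields the classification table discussed next.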
20. Classification
- When group sizes are equal, this is easy
- e.g., for 3 groups, by chance we would expect 33.33% in each group
- values greater than this indicate the model works (i.e., the predictors separate groups)
- What about when the group sizes are unequal?
- see the next slide for the process
21. Classification continued
- calculate prior probabilities
- e.g., n = 10, 20, 20 (divide by 50) → .20, .40, .40
- multiply prior probabilities by number in each group
- e.g., group 1 → (10)(.20) = 2
- so, we would expect two people to be categorized in this group by chance
- add up the total number of cases by chance
- e.g., (10)(.20) + (20)(.40) + (20)(.40) = 18
- convert this to get percentage classified correctly by chance
- e.g., chance = 18/50 = 36%
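The chance-rate arithmetic above can be sketched as a small function. `chance_hit_rate` is a hypothetical helper name; the group sizes 10, 20, 20 are the ones used in the slide's example.

```python
def chance_hit_rate(group_sizes):
    """Proportion of cases expected to be classified correctly by chance."""
    total = sum(group_sizes)
    # Prior probability of each group = its share of the sample.
    priors = [n / total for n in group_sizes]
    # Expected number of chance hits per group = group size * prior.
    expected_hits = sum(n * p for n, p in zip(group_sizes, priors))
    return expected_hits / total


rate = chance_hit_rate([10, 20, 20])
print(f"{rate:.0%}")  # 36%
```

With equal group sizes the same function reduces to 1 divided by the number of groups, which matches the 33.33% figure for three equal groups.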
22. Classification continued

                        Predicted Group Membership
Actual Group             1     2     3
Schizophrenic           80    10    10
Bipolar                  0    90    10
Multiple Personality    10    20    70

Overall prediction rate is 80% (referred to as the hit ratio)
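The hit ratio can be recovered from the classification table as the share of cases on the diagonal. This sketch assumes the cell entries can be treated as counts (or as row percentages from equal-sized groups), which the slide does not state explicitly; `hit_ratio` is a hypothetical helper name.

```python
# Rows = actual group, columns = predicted group (values from the slide).
TABLE = [
    [80, 10, 10],  # schizophrenic
    [0, 90, 10],   # bipolar
    [10, 20, 70],  # multiple personality
]


def hit_ratio(table):
    """Proportion of cases whose predicted group equals the actual group."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return correct / total


print(f"{hit_ratio(TABLE):.0%}")  # 80%
```

This figure is the one to compare against the chance rate: the model is useful only to the extent that the hit ratio exceeds what chance alone would produce.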
23. Other things
- How does DFA differ from logistic regression (LR)?
- DFA more limited in that
- LR does not require the assumptions of DFA
- LR can handle both qualitative and quantitative predictors (and their interactions)
- LR more limited in that you generally need a larger sample size
- maximum likelihood estimation requires it
- No provision for repeated measures
- use hierarchical linear modeling
24. On Your Own
- Read the chapter for information on Predictive DFA if you are interested (pp. 296-313)