Title: Multivariate Descriptive Research
1. Multivariate Descriptive Research
- In the previous lecture, we discussed ways to
quantify the relationship between two variables
when those variables are continuous.
- What do we do when one or more of the variables
is categorical?
2. Categorical Variables
- Fortunately, this situation is much easier to
deal with because we can use the same techniques
that we've discussed already.
- Let's consider a situation in which we are
interested in how one continuous variable varies
as a function of a categorical variable.
- Example: How does mood vary as a function of sex
(male vs. female)?
3.
- In this case, we want to know how the average
woman's score compares to the average
man's score.
- Each group (e.g., male or female) is a level of the categorical variable.
4.
Participant   Mood score
Males
A             4
B             3
C             4
D             3
M = 3.5, SD = .5
Females
A             5
B             4
C             5
D             4
M = 4.5, SD = .5
First, find the average score for each level of
the categorical variable separately. (Also find
the SD.)
Second, find the difference between the
means of the two groups. This is called a mean
difference. (4.5 - 3.5 = 1.0)
Third, express this mean difference relative to
the SD. This is called a standardized mean
difference. (1.0 / .5 = 2)
In this example, women score 2 SDs higher than
the men.
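The three steps above can be sketched in Python. This is only an illustration, not part of the original slides: the variable names are made up, and it assumes the SDs on the slide are population SDs (dividing by N), since that is what reproduces the .5 shown.

```python
from statistics import mean, pstdev

# Mood scores from the slide.
males = [4, 3, 4, 3]
females = [5, 4, 5, 4]

# Step 1: mean and SD for each level of the categorical variable.
# pstdev divides by N (an assumption that matches the slide's SD of .5).
m_males, sd_males = mean(males), pstdev(males)
m_females, sd_females = mean(females), pstdev(females)

# Step 2: the mean difference.
mean_diff = m_females - m_males

# Step 3: the standardized mean difference.
# Both groups have the same SD here, so either one can be the divisor.
std_mean_diff = mean_diff / sd_males

print("Males:   M =", m_males, "SD =", sd_males)
print("Females: M =", m_females, "SD =", sd_females)
print("Mean difference:", mean_diff)
print("Standardized mean difference:", std_mean_diff)
```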
5.
Participant   Mood score
Males
A             4
B             3
C             4
D             3
M = 3.5, SD = .5
Females
A             5
B             4
C             5
D             3
M = 4.25, SD = .83
Note: If the SDs for the two groups are
different, you can simply average the two
SDs. Here, the two SDs are .5 and .83.
Averaged, these are (.5 + .83)/2 = .66. The
standardized mean difference is (4.25 - 3.5)/.66
= .75/.66 = 1.13. Thus, on average, women score
1.13 SDs higher than men on this mood variable.
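The same calculation with unequal SDs might look like the sketch below. The function name `standardized_mean_difference` is hypothetical, and the code again assumes population SDs (dividing by N), which is what matches the .5 and .83 on the slide.

```python
from statistics import mean, pstdev

def standardized_mean_difference(group_a, group_b):
    """Mean of group_b minus mean of group_a, divided by the
    simple average of the two groups' SDs (as described in the note)."""
    avg_sd = (pstdev(group_a) + pstdev(group_b)) / 2
    return (mean(group_b) - mean(group_a)) / avg_sd

# Mood scores from the slide.
males = [4, 3, 4, 3]
females = [5, 4, 5, 3]

d = standardized_mean_difference(males, females)
print(round(d, 2))  # 1.13
```

A positive value means the second group (here, women) scores higher on average.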
6. Cohen's d
- If we divide the mean difference by the average
SD of the two groups, we obtain a standardized
mean difference, or Cohen's d.
Pooled standard deviation
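The formula that accompanied this label did not survive in these notes. As a hedged sketch: one common definition of the pooled SD, for two groups of equal size, is the square root of the average of the two variances rather than the plain average of the two SDs. The function name `cohens_d_pooled` is hypothetical, and the code assumes equal group sizes and population variances (dividing by N) to stay consistent with the SDs shown on the earlier slides.

```python
from math import sqrt
from statistics import mean, pvariance

def cohens_d_pooled(group_a, group_b):
    """Cohen's d using a pooled SD.

    Assumes equal group sizes, so the pooled SD is the square root
    of the mean of the two (population) variances.
    """
    pooled_sd = sqrt((pvariance(group_a) + pvariance(group_b)) / 2)
    return (mean(group_b) - mean(group_a)) / pooled_sd

# Mood scores from the unequal-SD example.
males = [4, 3, 4, 3]
females = [5, 4, 5, 3]

d_pooled = cohens_d_pooled(males, females)
print(round(d_pooled, 2))
```

When the two SDs are similar, this gives nearly the same answer as averaging the SDs; the two approaches drift apart as the SDs become more different.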
7. Bar graph
8. Bar graph: More than two categorical variables
9. Both variables are categorical
- When two variables are categorical, it is
sometimes most useful to express the data as
percentages.
- Example: Let's assume that depression is a
categorical variable, such that some people are
depressed and others are not.
- What is the relationship between biological sex
and depression?
10.
                 Depression status
Sex              Not Depressed   Depressed   Row total
Male             600             60          660
Female           40              300         340
Column total     640             360         1000
11.
                 Depression status
Sex              Not Depressed   Depressed   Row total
Male             .60             .06         .66
Female           .04             .30         .34
Column total     .64             .36         1.00
In this table, we've expressed each cell as a
proportion of the total.
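Converting the counts to proportions of the total is a single division per cell. A minimal sketch, using a plain dictionary whose layout is just one convenient choice:

```python
# Cell counts from the frequency table.
counts = {
    ("Male", "Not Depressed"): 600,
    ("Male", "Depressed"): 60,
    ("Female", "Not Depressed"): 40,
    ("Female", "Depressed"): 300,
}

# Grand total across all four cells.
total = sum(counts.values())

# Each cell as a proportion of the grand total.
proportions = {cell: n / total for cell, n in counts.items()}

for (sex, status), p in proportions.items():
    print(f"{sex:<8}{status:<15}{p:.2f}")
```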
12.
                 Depression status
Sex              Not Depressed   Depressed   Row total
Male             .60             .06         .66
Female           .04             .30         .34
Column total     .64             .36         1.00
                 .60/.64 = .94   .06/.36 = .17
Here, we've expressed the association with
respect to sex. For example, we can see here
that 17% of people who are depressed are male.
Moreover, 94% of people who are not depressed are
male.
13.
                 Depression status
Sex              Not Depressed   Depressed   Row total
Male             .60             .06         .66         .06/.66 = .09
Female           .04             .30         .34         .30/.34 = .88
Column total     .64             .36         1.00
Here, we've expressed the association with
respect to depression status. For example, we
can see here that 9% of men are depressed and 88%
of women are depressed.
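Both ways of slicing the table (within each depression status, and within each sex) can be computed from the raw counts. The sketch below uses a nested dictionary and made-up variable names; it is an illustration, not a prescribed method.

```python
# Counts from the frequency table, organized by sex.
counts = {
    "Male": {"Not Depressed": 600, "Depressed": 60},
    "Female": {"Not Depressed": 40, "Depressed": 300},
}

# Within each sex: what proportion is depressed? (divide by the row total)
depressed_given_sex = {
    sex: row["Depressed"] / sum(row.values())
    for sex, row in counts.items()
}

# Within each depression status: what proportion is male?
# (divide by the column total)
male_given_status = {
    status: counts["Male"][status] / sum(counts[sex][status] for sex in counts)
    for status in ("Not Depressed", "Depressed")
}

for sex, p in depressed_given_sex.items():
    print(f"Proportion of {sex.lower()}s who are depressed: {p:.2f}")
for status, p in male_given_status.items():
    print(f"Proportion of '{status}' who are male: {p:.2f}")
```

Dividing by the row total answers a different question than dividing by the column total, so it helps to state which one is being reported.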
14. Phi
- It is possible to quantify the association between
these variables using a correlation coefficient
when the two variables are binary.
- This statistic is sometimes referred to as phi.
- (Phi is .78 in this example.)
15.
                 Variable 1
Variable 2       0       1       Row total
0                a       b       n3
1                c       d       n4
Column total     n1      n2
Phi = (ad - bc) / sqrt(n1 * n2 * n3 * n4)
Online calculator at http://www.quantitativeskills.com/sisa/statistics/twoby2.htm
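The phi formula can also be computed directly. Here is a minimal sketch applied to the sex-by-depression counts from the earlier table; the function name `phi_coefficient` is hypothetical, and the cell labels a, b, c, d and totals n1 to n4 follow the layout of the 2 x 2 table above.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table with cells
           col 0  col 1
    row 0    a      b
    row 1    c      d
    """
    n3, n4 = a + b, c + d  # row totals
    n1, n2 = a + c, b + d  # column totals
    return (a * d - b * c) / sqrt(n1 * n2 * n3 * n4)

# Male: 600 not depressed, 60 depressed; Female: 40 not depressed, 300 depressed.
phi = phi_coefficient(600, 60, 40, 300)
print(round(phi, 2))  # 0.78
```

The same value results if the cells are entered as proportions of the total instead of counts, because the scaling cancels between the numerator and the denominator.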