Displaying and Describing Categorical Data

Description:

Open your bag of M&Ms. DO NOT EAT THEM YET! ... Do we need to list black, white, chartreuse? 5. Brwn. 5. Blue. 4. Grn. 2. Yllw. 3. Orng. 6. Red ...

Slides: 26
Provided by: catheri80
Transcript and Presenter's Notes


1
Chapter 3
  • Displaying and Describing Categorical Data

2
Before we begin
  • Finish Chapter 2
  • ActivStats
  • DataDesk
  • Copy DD to computer

3
M&Ms
  • Open your bag of M&Ms.
  • DO NOT EAT THEM YET!!
  • Construct a question that needs a categorical
    variable.
  • Think about your Who when you do this.
  • Construct a question that uses a quantitative
    variable.
  • Gather that data from your bag.
  • Share with class.

4
More M&Ms
  • Class:
  • Choose a color.
  • Count the number of each color in your bag.
  • Count the total number of M&Ms in your bag.
  • Eat your M&Ms.

5
Representing Data
  • How do you organize things?
  • One typical method: make piles!

More like this.
Not like this, though!
6
Make a picture!
  • Without some way to organize and summarize our
    data, it is hard to see what is going on.
  • Making a picture is one very important way to
    help us:
  • Think about patterns and relationships.
  • See important features.
  • Tell others about our data.

7
Organizing M&Ms
  • In the class activity, you collected information
    on the colors of the M&Ms in your bag.
  • If you wanted to convey that information to
    someone else, how would you?
  • Ideas?

8
Options: Frequency Table
  • Frequency Table: a data table showing category
    names and pile counts for each category.
  • Note: We are counting the number of cases falling
    into each category of a categorical variable.
  • Category labels are the What.
  • Individuals counted are the Who.
  • For your M&M packet?
  • Categorical variable: Color (what was measured).
  • Individuals measured: each M&M in the packet.
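
The counting step behind a frequency table can be sketched in a few lines of Python. This is only an illustration: the bag contents below are made up, not taken from the class data, and the names `bag` and `frequency_table` are hypothetical.

```python
from collections import Counter

# Hypothetical bag of M&Ms: each list entry is one candy (the Who),
# and its value is that candy's color (the What).
bag = ["red", "blue", "red", "green", "yellow", "red", "blue", "orange"]

# Counter tallies how many cases fall into each category.
frequency_table = Counter(bag)

# Print category names and pile counts, largest pile first.
for color, count in frequency_table.most_common():
    print(f"{color:<8}{count}")
```

Colors that never appear in the bag simply have no entry, which is one way to think about whether black, white, or chartreuse need a row.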

9
Frequency Table: M&Ms
Do we need to list black, white, chartreuse?
10
Relative Frequency
  • Counts are useful, but what if Jenny has 25
    candies, and Jimmy has only 18? Is a straight
    comparison of the number of reds the best method?
  • Relative frequencies, or pile percentages, are
    sometimes a better option.
  • Percentage, proportion, and relative frequency all
    refer to the share of the whole that a
    particular category makes up.
  • If asked for a proportion, give a decimal. If
    asked for a percent, give a %!
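
A small Python sketch of the Jenny and Jimmy comparison. The bag sizes (25 and 18) come from the slide; the red counts (5 and 4) are invented for illustration, and `relative_frequency` is a hypothetical helper name.

```python
def relative_frequency(count, total):
    """Return the proportion of the whole, as a decimal between 0 and 1."""
    return count / total

# Bag sizes are from the slide; the number of reds is an assumption.
jenny_reds, jenny_total = 5, 25
jimmy_reds, jimmy_total = 4, 18

jenny_prop = relative_frequency(jenny_reds, jenny_total)
jimmy_prop = relative_frequency(jimmy_reds, jimmy_total)

# The same quantity reported as a proportion (decimal) and as a percent.
print(f"Jenny: proportion {jenny_prop:.2f}, percent {jenny_prop:.0%}")
print(f"Jimmy: proportion {jimmy_prop:.2f}, percent {jimmy_prop:.0%}")
```

With these made-up counts Jenny has more reds, but Jimmy has the larger share of reds, which is why a straight comparison of counts can mislead.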

11
Relative Frequency: M&Ms
Both of these tables tell us the distribution of
a categorical variable, i.e., all the possible
categories and how frequently each occurs.
12
Other ways to represent Cat. Vars
  • Graphs!
  • Bar Charts
  • Give either a count for each category, or the %.
  • Important to read axis labels!
  • Easy comparison
  • Pie Charts
  • Show whole group of cases as circle
  • Each category shown as a slice of the pie, size
    proportional to the fraction of the whole in each
    category.
  • Note: These graphs are only for categorical
    variables! They won't work for quantitative
    variables.

13
Bar and Pie Charts DD
  • Usually you'll do bar and pie charts by hand,
    unless you have the original data.
  • Remember, frequency tables are summaries of
    categorical data.
  • Must observe the Area Principle:
  • Equal amount of area for each increase in count.
  • Let's look at some distributions of Cat. Vars. in
    DataDesk.
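
One way to see the Area Principle without any plotting software is a plain-text bar chart in which every unit of count gets exactly one symbol. This is a sketch with invented counts; `text_bar_chart` is a hypothetical helper, not a DataDesk feature.

```python
def text_bar_chart(counts, symbol="#"):
    """Return one line per category; each unit of count adds one symbol,
    so bar length (area) grows by the same amount for each increase."""
    width = max(len(label) for label in counts)
    return [f"{label:<{width}} | {symbol * n} {n}" for label, n in counts.items()]

# Hypothetical color counts for one bag.
counts = {"Red": 6, "Blue": 4, "Green": 2}

lines = text_bar_chart(counts)
for line in lines:
    print(line)
```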

14
Contingency Tables
  • What if we want to look at 2 Cat. Vars together?
  • Use a Contingency Table.
  • Shows counts or % of individuals falling into
    categories on two (or more) variables.
  • Can reveal patterns in one variable that may
    depend on the category of the other.
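
As a rough sketch, a contingency table can be built by counting (category, category) pairs. The records below are made up for illustration and are not the census data; the names `records` and `cell_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical individuals, each described by two categorical variables:
# (age group, marital status).
records = [
    ("18-24", "Never married"),
    ("18-24", "Married"),
    ("25-64", "Married"),
    ("25-64", "Married"),
    ("25-64", "Widowed"),
    ("65+", "Widowed"),
    ("65+", "Married"),
]

# Each cell of the table is the count of one combination of categories.
cell_counts = Counter(records)

ages = sorted({age for age, _ in records})
statuses = sorted({status for _, status in records})

# Print marital status down the side and age group across the top.
print(f"{'':<15}" + "".join(f"{age:>8}" for age in ages))
for status in statuses:
    row = "".join(f"{cell_counts[(age, status)]:>8}" for age in ages)
    print(f"{status:<15}{row}")
```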

15
Contingency Table Example
Cells in the table give the count of cases for
every combination of the 2 Cat. Vars.
From 1990 Census Data: Women age 18+ by age and
current marital status (in thousands)
16
Contingency Tables
How many women are included, total, in the
population?
99,585 thousand
How many women were Widowed AND aged 25-64?
2,425 thousand
How many women were Widowed?
11,080 thousand
How many women were aged 25-64?
68,709 thousand
17
Marginal Distributions
Marginal Distribution of Marital Status
Marginal Distribution of Age
What if we are only interested in one of the
categorical variables?
We want the marginal distribution: the frequency
distribution of only one variable in a contingency
table.
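
A marginal distribution is just the row totals or the column totals of the table. The sketch below uses a small invented table (not the census counts), stored as a hypothetical nested dictionary keyed by marital status and then age group.

```python
# Hypothetical contingency table: rows are marital status, columns are age.
table = {
    "Married": {"18-24": 2, "25-64": 10, "65+": 4},
    "Widowed": {"18-24": 0, "25-64": 1, "65+": 3},
}

# Marginal distribution of marital status: add across each row.
status_marginal = {status: sum(row.values()) for status, row in table.items()}

# Marginal distribution of age: add down each column.
age_marginal = {}
for row in table.values():
    for age, count in row.items():
        age_marginal[age] = age_marginal.get(age, 0) + count

print(status_marginal)
print(age_marginal)
```

Both sets of totals add up to the same grand total, which is a quick check that no cell was missed.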
18
Percents and Contingency Tables
  • Often, we'll want to see percentages rather than
    counts.
  • What number to divide each count by depends on
    the question you need answered.

19
Contingency Tables
If we want to know the percentage of the whole
that each combination of Categories represents,
simply divide by the size of the sample.
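
As a worked example, here is that division using two figures quoted on the earlier slides: 2,425 thousand widowed women aged 25-64, out of 99,585 thousand women in total. The variable names are hypothetical.

```python
# Counts in thousands, as quoted on the earlier Contingency Tables slide.
total_women = 99_585
widowed_and_25_to_64 = 2_425

# Percent of the whole: divide the cell count by the overall total.
share_of_whole = widowed_and_25_to_64 / total_women

print(f"Widowed and aged 25-64: {share_of_whole:.2%} of all women in the table")
```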
20
Conditional distributions
  • Often, we'll want to restrict our interest to
    only one group (widows, perhaps), and want to
    look at the breakdown of the second variable
    (age) for that group.
  • This is known as a conditional distribution:
  • The distribution of one variable for only those
    cases satisfying some condition on the other
    variable.

21
Conditional Distributions
  • What is the distribution of Age for Widowed?
  • Divide each count in the Widowed row by the total
    number of Widowed (the group of interest).
  • Ex: What percent of Widowed women are 18-24?
  • 19 / 11,080 ≈ 0.0017, or about 0.17% (count
    meeting criteria / total in group of interest)

Look for clues in the wording to determine the
group of interest (widowed women) and the
criteria (aged 18-24). Typical wording: "What
proportion OF (group) ARE (criteria)?"
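
The same division can be sketched in Python using only the Widowed figures quoted in the slides (19 thousand aged 18-24, 2,425 thousand aged 25-64, 11,080 thousand in total). The rest of the Widowed row is not reproduced in this transcript, so the dictionary below is deliberately partial; the names are hypothetical.

```python
# Counts in thousands, taken from the slides. Only two cells of the
# Widowed row are quoted, so this is not the full row.
widowed_total = 11_080
widowed_by_age = {"18-24": 19, "25-64": 2_425}

# Conditional distribution of age GIVEN widowed:
# divide by the group of interest, not by the whole table.
conditional = {age: count / widowed_total for age, count in widowed_by_age.items()}

for age, proportion in conditional.items():
    print(f"Of widowed women, {proportion:.2%} are aged {age}")
```

Note that the denominator is the Widowed total, not the 99,585 thousand women in the whole table; the wording "OF widowed women" is what signals that choice.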
22
Conditional Distributions
  • What is the distribution of marital status for
    18-24 yr olds?
  • Divide each count in the 18-24 yr old column by
    the total number of 18-24 yr olds (the
    group of interest).
  • Ex: What percent of 18-24 yr olds are married?

23
Independence
  • Does your marital status depend on your age?
  • Common sense may tell you one thing, but don't
    trust it: verify with numbers!!
  • So how do we do that?
  • Look at the conditional distributions.
  • If the distribution of one variable is the same
    for all categories of another, then the variables
    are independent (not associated).
  • So let's look at the conditional distributions of
    marital status for each age group.
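
The comparison of conditional distributions can be sketched as below. The two tables are invented for illustration, the helper names are hypothetical, and the `tolerance` cutoff is an arbitrary assumption: real counts rarely match exactly, and the slides do not specify how close is "the same".

```python
def conditional_distributions(table):
    """For each group (outer key), return its distribution over the
    categories of the other variable (inner keys)."""
    result = {}
    for group, counts in table.items():
        total = sum(counts.values())
        result[group] = {category: n / total for category, n in counts.items()}
    return result

def look_independent(table, tolerance=0.05):
    """True if every group's conditional distribution is within
    `tolerance` of the first group's, category by category."""
    distributions = list(conditional_distributions(table).values())
    first = distributions[0]
    return all(
        abs(dist[category] - first[category]) <= tolerance
        for dist in distributions[1:]
        for category in first
    )

# Hypothetical counts: marital status within each age group.
by_age = {
    "18-24": {"Married": 20, "Not married": 80},
    "25-64": {"Married": 70, "Not married": 30},
}
# Hypothetical counts where every group has the same breakdown.
same_breakdown = {
    "Group A": {"x": 10, "y": 30},
    "Group B": {"x": 20, "y": 60},
}

print(look_independent(by_age))
print(look_independent(same_breakdown))
```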

24
Independence
Are the conditional distributions of marital
status the same for each age group?
No, so we conclude that Age and Marital Status
ARE related, so they are NOT independent.
We could have also looked at the conditional
distribution of age for each marital status.
Patterns??
25
Example 24
  • What percent of grads joined the military?
  • What percent of 1970 grads joined the military?
  • Of the students who joined the military, what
    percent graduated in 1970?
  • Is what people did after HS independent of the
    year?