Crosstabulation - PowerPoint PPT Presentation

Provided by: josephs84
Transcript and Presenter's Notes

Title: Crosstabulation


1
Crosstabulation & Plotting Data
  • EDLD 6333 Statistical Reasoning

2
Crosstabulation
  • For use with 2 nominal variables, or variables
    with a SMALL # of possible values
  • To look at the RELATIONSHIP b/w two variables w/
    few possible values
  • A Crosstabulation is MORE than simply two
    frequency tables
  • Examines combinations of values

3
Income and Job Satisfaction
  • Gssft.sav data file
  • The variable income4 divides income into
    quartiles (only 4 groups)
  • The variable jobsat has four categories
  • Viewing two separate frequency tables tells you
    nothing about the relationship b/w the two variables

4
A Crosstabulation
ANALYZE > DESCRIPTIVE STATISTICS > CROSSTABS
In this case, satjob is the DV and income is the
IV. When selecting variables for a Crosstabs,
remember they must both be CATEGORICAL.
5
Cell Display Subdialog Box
You should ask for percentages in the direction
of the IV (if you can tell which is IV and DV).
In this case, we need the Column %, b/c income is
the IV.
IV: the variable thought to influence another variable.
DV: the variable influenced.
6
Crosstabs Output
Read down columns: the %s add to 100. The %
expresses the number of cases in each cell as a %
of the column total. The TOTAL row and TOTAL column
display the same values as a frequency table would.
It appears the lowest-income people are less
likely than avg. to be very satisfied.
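The column-percentage idea can be sketched outside SPSS. This is a minimal Python illustration, not the Crosstabs procedure itself; the cases listed are made up for the example and are NOT values from gssft.sav, and the function name column_percentages is hypothetical.

```python
from collections import Counter

# Hypothetical (income4, satjob) pairs; NOT values from gssft.sav.
cases = [
    ("lowest", "very satisfied"), ("lowest", "dissatisfied"),
    ("lowest", "dissatisfied"), ("lowest", "very satisfied"),
    ("highest", "very satisfied"), ("highest", "very satisfied"),
    ("highest", "very satisfied"), ("highest", "dissatisfied"),
]


def column_percentages(pairs):
    """Return {(iv_value, dv_value): percent of that IV column's total}."""
    cell_counts = Counter(pairs)                    # cases in each cell
    column_totals = Counter(iv for iv, _ in pairs)  # cases in each IV column
    return {
        (iv, dv): 100.0 * n / column_totals[iv]
        for (iv, dv), n in cell_counts.items()
    }


pct = column_percentages(cases)
for (iv, dv), value in sorted(pct.items()):
    print(f"{iv:8s} {dv:15s} {value:5.1f}%")
```

Within each income column the percentages add to 100, so columns with different numbers of cases can be compared directly.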
7
Graphical Displays
Compare bar lengths within cluster and see if
patterns are same across clusters.
8
Stacked Bar with Scale
In chart editor window select GALLERY > BAR > STACKED
Then under OPTIONS, make changes as shown below.
Comparisons are now made easier.
9
Control Variables
  • Adding another variable into the analysis might
    change the relationship you find
  • Control variable: removing the effect of another
    variable
  • In this example, we are controlling out gender

10
Jobsat by income4 by gender
Sex is the layer variable. Income4 is still
the IV, so ask for column percentages. Here we
are looking at IV income and DV satjob,
controlling for gender.
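A layer variable amounts to building one crosstab per value of the control variable. The sketch below assumes made-up records (they are NOT from gssft.sav) and a hypothetical helper name, layered_column_percentages; column percentages are computed separately inside each layer.

```python
from collections import Counter, defaultdict

# Hypothetical (sex, income4, satjob) records; NOT values from gssft.sav.
records = [
    ("female", "lowest", "very satisfied"),
    ("female", "lowest", "dissatisfied"),
    ("female", "highest", "very satisfied"),
    ("female", "highest", "very satisfied"),
    ("male", "lowest", "very satisfied"),
    ("male", "lowest", "dissatisfied"),
    ("male", "highest", "very satisfied"),
    ("male", "highest", "dissatisfied"),
]


def layered_column_percentages(rows):
    """Return {layer: {(iv, dv): percent of the IV column within that layer}}."""
    cells = Counter(rows)
    totals = Counter((layer, iv) for layer, iv, _ in rows)
    tables = defaultdict(dict)
    for (layer, iv, dv), n in cells.items():
        tables[layer][(iv, dv)] = 100.0 * n / totals[(layer, iv)]
    return dict(tables)


tables = layered_column_percentages(records)
for layer, table in tables.items():
    print(layer, table)
```

Comparing the same cell across layers shows whether the income/satisfaction pattern differs by gender.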
11
What did we find?
For women more than men, jobsat increases with
income.
12
Crosstabs Summary
  • Looking at relationships b/w variables with small
    number of possible values (categorical)
  • Number of cases in a cell can be expressed as a
    % (in direction of IV)
  • The variable that is influenced is DV
  • Layer variables control out the effects of other
    variables
  • Bar charts are useful displays for categorical
    variables
  • Later in the semester, we'll do tests of
    significance for such relationships

13
Plotting Data
  • To look at relationships between TWO numeric
    (interval/ratio) variables
  • Graphical display shows values of two numeric
    variables
  • Scatterplot, Sunflower Plots
  • Scatterplot Matrix, Overlay Plots
  • 3-D Scatterplot
  • Bar charts, histograms, etc. displayed single
    variables only (across groups)

14
Why You Plot Data
  • Before you do any statistical tests (like the
    ones we'll learn later) you should always plot
    the data first
  • Scatterplots look for relationships and patterns
    between TWO variables
  • For the following examples, we'll use the
    country.sav data file

15
Life Expectancy & Birthrate
For a Simple scatterplot: Y AXIS (vertical)
should be the DV; X AXIS (horizontal) should be
the IV. You can give the scatterplot a title by
selecting Title.
16
Resulting Scatterplot
PATTERN: As birthrate increases (see values
along X Axis), life expectancy decreases (values
along Y Axis). NEGATIVE RELATIONSHIP. We don't know
if it is sig. yet, and it is not necessarily causal!
17
Scatterplot Points
  • We are looking for patterns
  • Can visually see linear relationships
  • From upper left to bottom right = NEG
  • From bottom left to upper right = POS
  • We are looking at COMBINATIONS of values
    (stem-and-leaf/histogram only look at individual
    values)
  • You can label points by country
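The NEG/POS reading of a scatterplot is made by eye in these slides. As a rough numeric cross-check (a swapped-in technique, not something the slides use), the sign of a least-squares slope gives the same direction. The data values and the function names below are hypothetical.

```python
def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def direction(xs, ys):
    """Label the linear trend the way the slide does: NEG or POS."""
    b = slope(xs, ys)
    if b < 0:
        return "NEG"  # points run from upper left to bottom right
    if b > 0:
        return "POS"  # points run from bottom left to upper right
    return "none"


# Hypothetical values; NOT taken from country.sav.
birthrate = [10, 20, 30, 40, 50]
life_exp = [78, 72, 65, 58, 50]
print(direction(birthrate, life_exp))
```

This only captures a straight-line trend; the plot itself is still needed to spot curves, clusters, and unusual points.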

18
Controlling Out Development
Here we are controlling out the effects of some
variable. You get a plot of 2 variables with a
3rd variable used to classify. Each country still
only appears once.
19
Scatterplot Controlled for Dev.
Most of the developed nations cluster in the
upper left corner of the plot. You can see a
clear difference in the pattern for developing
vs. developed countries.
20
Sunflower Plots
These are used to show the density of points:
whether there are overlapping or nearly
overlapping points. A visual representation of
how points cluster.
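The density idea behind a sunflower plot can be sketched by dividing the plot area into square cells and counting the points in each one. This is an illustration of the counting step only, not SPSS's drawing procedure; the points, the cell size, and the name sunflower_counts are assumptions.

```python
from collections import Counter


def sunflower_counts(points, cell_size):
    """Count how many points fall in each square cell of width cell_size."""
    return Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )


# Hypothetical points: three nearly overlap, two stand alone.
points = [(1.0, 1.2), (1.1, 1.3), (1.4, 1.0), (7.5, 3.2), (9.9, 9.9)]
counts = sunflower_counts(points, cell_size=2.0)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

A cell with a higher count would be drawn with more petals, which is how clusters of overlapping points become visible.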
21
Scatterplot Matrices
To see how several variables relate to one
another: scatterplots for all possible pairs of
variables.
22
Resulting Matrix
You must look across, and up/down to find the
variable pairing.
Each cell is a scatterplot of a pair of
variables. Split this display diagonally, and the
same plots (w/ variables flipped) are on each
side. The strongest relationship is the tightest
grouping: birthrate/life exp.
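The layout of the matrix can be checked by listing its cells. In this sketch the variable names are hypothetical stand-ins (the actual names in country.sav may differ); each off-diagonal cell is an ordered pair, and each unordered pair shows up twice, once on each side of the diagonal.

```python
from itertools import permutations

# Hypothetical variable names; the real names in country.sav may differ.
variables = ["birthrate", "lifeexp", "urban"]

# Each off-diagonal cell: (row variable on vertical axis,
#                          column variable on horizontal axis).
cells = list(permutations(variables, 2))

# Ignoring which axis each variable is on leaves the distinct pairings.
unique_pairs = {frozenset(cell) for cell in cells}

print(len(cells), "cells,", len(unique_pairs), "distinct pairs")
for row_var, col_var in cells:
    print(f"row={row_var:10s} column={col_var}")
```

With k variables there are k*(k-1) off-diagonal cells but only half as many distinct pairings, so it is enough to read one side of the diagonal.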
23
Reading the Results
  • Scan across a row or down an entire column
  • Look up/down: variable on horizontal axis
  • Look right/left: variable on vertical axis
  • The strongest relationship was negative
  • As birthrate decreases, the life expectancy
    increases. The birthrate also decreases w/
    increasing urbanization, but not as strongly
  • Life exp. & urbanization are positively related
  • As urbanization increases, so does life exp.

24
Overlay Plot
To see 2 pairs of variables. Each country
represented twice.
Select pairs of variables to create overlay
scatterplots.
25
Resulting Overlay Plot
W/ Lowess smooths
Variables in an overlay plot must be measured on
the same scale to make sense. Here you can see
that both death rates and birthrates decrease as
urbanization increases. Birthrates decline more
steeply.
26
3-D Scatterplot
A 3-D Plot will show the values of three
variables simultaneously. There are X, Y, and Z
Axes. Points are positioned w/in a 3-dimensional
box.
27
Resulting 3-D Scatterplot
You have to read point values off of three planes
now, so it's tricky. The relationship b/w three
variables is presented.
You could also insert spikes, or control by
status again.
28
Identifying Unusual Points
  • The data value for Bhutan stands out far from the
    rest
  • Using the Point Selection Tool, select the point
    and view it in the Data Editor
  • You realize that urban MUST be wrong
  • Norusis gives directions at the end of chapter 9
    (p. 179) for how to change the data value for
    Urban from 95 to 5. Make sure to change this
    in your data file and save your changes.
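The same correction can be sketched in code for anyone working outside the SPSS Data Editor. Only the Bhutan change from 95 to 5 comes from the slides; the other row, the field names, and the helper correct_value are hypothetical.

```python
# Hypothetical rows standing in for country.sav; only Bhutan's 95 -> 5
# correction comes from the slides.
rows = [
    {"country": "Bhutan", "urban": 95},
    {"country": "Country A", "urban": 40},
]


def correct_value(rows, country, field, old, new):
    """Replace old with new in one country's field; return rows changed."""
    changed = 0
    for row in rows:
        # Only touch the row if it still holds the known-bad value.
        if row["country"] == country and row[field] == old:
            row[field] = new
            changed += 1
    return changed


changed = correct_value(rows, "Bhutan", "urban", old=95, new=5)
print(changed, rows[0])
```

Checking the old value before overwriting means running the fix twice does no harm, and a return value of 0 signals that the record was not in the expected state.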

29
Where We Are
  • We've still just been looking at data
  • We've looked at categorical variables and
    continuous numerical variables
  • We will learn tests that will quantify the
    significance and strength of relationships b/w
    variables in future chapters
  • How can we use this w/ education data?

30
Homework
  • Check the website