Title: Understanding Statistics
1Understanding Statistics
2Reasons for Analyzing Data
- Describe data
- Determine if two or more groups differ on some variable
- Determine if two or more variables are related
- Reduce data
3Types of Data
- Nominal
- categories
- race
- hair color
- Ordinal
- rank order
- baseball standings
- waiting list placements
- Interval
- equality of intervals
- performance ratings
- temperature
- Ratio
- true zero
- equality of ratios
- salary
- height
4The Concept of Significance
- Interocular Significance
- Statistical Significance
- Practical Significance
5Significance Levels
- Indicate the probability that results occurred by chance
- Standard is .05, but others can be used
- Type I error: concludes there is a difference when in fact there is none
- Type II error: concludes there is no difference when there is one
6Statistical Significance
- When deviating from the .05 level, consider
- the common sense of your finding
- previous research
- the quality of your data
- the cost of being wrong
- Probability level is influenced by
- sample size
- differences between groups
- within group variability
7Significance Levels in Journal Articles
The job satisfaction level of female employees
(M = 4.21) was significantly higher than that of
male employees (M = 3.50), t(60) = 2.39, p < .02.

                     Academy Score   Commendations
Cognitive ability    .43             .03
Education            .28             .24
p < .05, p < .01, p < .001
8Statistics That Describe Data
9Raw Data Are Not Usually Meaningful
Client IQ
Caffey 98
Doherty 104
Yokas 110
Boscarelli 93
Sullivan 121
Parker 114
Davis 98
10Statistics That Describe Data
- Sample Size
- overall (N)
- subgroups (n)
- Frequencies
- Central Tendency
- mean (statistical average)
- median (midpoint)
- mode (most common)
- Dispersion
- range
- variance and standard deviation
11Measures of Central Tendency
12The Mean
Client IQ
Boscarelli 93
Caffey 98
Davis 98
Doherty 104
Yokas 110
Parker 114
Sullivan 121
Sum 738
N 7
Mean 105.4
13The Median
- Median is the point at which 50% of your data fall above and 50% fall below
- Odd number of scores: the median is the middle score
- Even number of scores: the median is the average of the two middle scores
- 93 98 98 100 104 110 114 121
- Median = (100 + 104) / 2 = 102
14The Median
Client IQ
Boscarelli 93
Caffey 98
Davis 98
Doherty 104
Yokas 110
Parker 114
Sullivan 121
15The Mode The Most Frequently Occurring Score
Client IQ
Boscarelli 93
Caffey 98
Davis 98
Doherty 104
Yokas 110
Parker 114
Sullivan 121
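The three measures can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative; the scores are the seven client IQs from the tables above, plus the eight-score list from the median slide.

```python
import statistics

# IQ scores from the client table above
iq_scores = [93, 98, 98, 104, 110, 114, 121]

mean_iq = statistics.mean(iq_scores)      # sum of scores divided by N
median_iq = statistics.median(iq_scores)  # middle score of the sorted list
mode_iq = statistics.mode(iq_scores)      # most frequently occurring score

print(f"Mean:   {mean_iq:.1f}")
print(f"Median: {median_iq}")
print(f"Mode:   {mode_iq}")

# With an even number of scores, the median averages the two middle scores
even_scores = [93, 98, 98, 100, 104, 110, 114, 121]
median_even = statistics.median(even_scores)
print(f"Median (even N): {median_even}")
```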
16Which Measure of Central Tendency Should I Use?
- Mode
- nominal data (categories)
- Mean
- interval data
- ratio data
- Median
- ordinal (ranked) data
- interval or ratio data if
- outliers
- skewed distribution
17Therapist Therapist
Newhart Brothers
17 17
18 18
19 19
20 20
21 21
22 22
23 51
Mean 20 24
Median 20 20
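A minimal sketch of why the median is preferred when there is an outlier, using the two therapists' columns from the table above. Only the last value differs between the lists, yet the mean moves and the median does not.

```python
import statistics

newhart = [17, 18, 19, 20, 21, 22, 23]
brothers = [17, 18, 19, 20, 21, 22, 51]  # 51 is the outlier

newhart_mean = statistics.mean(newhart)
brothers_mean = statistics.mean(brothers)
newhart_median = statistics.median(newhart)
brothers_median = statistics.median(brothers)

# The outlier pulls the mean upward but leaves the median unchanged
print("Newhart:  mean", newhart_mean, "median", newhart_median)
print("Brothers: mean", brothers_mean, "median", brothers_median)
```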
18Measures of Dispersion
- Range
- Minimum
- Maximum
- Spread
- Variance (s²)
- Standard deviation (s)
- Square root of the variance
- Within 1 SD of the mean: about 68% of scores
- Within 2 SD of the mean: about 95% of scores
19Number of Days Absent
Day Shift Evening Shift Night Shift
4 2 3
4 3 3
4 4 4
4 5 5
4 6 5
Mean 4 4 4
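The three shifts share a mean of 4 but differ in spread. The sketch below computes variance and standard deviation for each shift. It assumes the sample (n - 1) formulas, which is what `statistics.variance` and `statistics.stdev` use; the slides do not say which formula is intended.

```python
import statistics

# Days absent per employee, by shift (from the table above)
days_absent = {
    "Day": [4, 4, 4, 4, 4],
    "Evening": [2, 3, 4, 5, 6],
    "Night": [3, 3, 4, 5, 5],
}

summary = {}
for shift, days in days_absent.items():
    summary[shift] = {
        "mean": statistics.mean(days),
        "variance": statistics.variance(days),  # sample variance (n - 1)
        "sd": statistics.stdev(days),           # square root of the variance
    }
    print(shift, summary[shift])
```

Identical means can hide very different amounts of variability, which is why a measure of dispersion is reported alongside the mean.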
20Performance Ratings
Supervisor Supervisor
Tribbiani Geller
3 2
3 2
3 3
3 3
3 4
3 4
Mean 3 3
21IQ Scores for Two Training Groups
Training Group   Mean IQ   SD   1 SD Range   2 SD Range
Morning          100       3    97 - 103     94 - 106
Afternoon        100       15   85 - 115     70 - 130
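The ranges in the table follow directly from the mean and SD. A small sketch, with `sd_range` as a hypothetical helper name:

```python
def sd_range(mean, sd, k=1):
    """Return the (low, high) interval k standard deviations around the mean."""
    return (mean - k * sd, mean + k * sd)

# (mean IQ, SD) for each training group, from the table above
groups = {"Morning": (100, 3), "Afternoon": (100, 15)}

for name, (mean, sd) in groups.items():
    print(name, "1 SD:", sd_range(mean, sd, 1), "2 SD:", sd_range(mean, sd, 2))
```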
22Salary Survey Example
- Salary Survey Data
- Mean for police officer is 25,000
- SD 3,000
- Our Department Salary
- 24,000
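One way to read the salary example is to express the department's salary in standard deviation units. The sketch below does that and, assuming salaries are roughly normally distributed (an assumption the slides do not state for this survey), converts the result to an approximate percentile.

```python
from statistics import NormalDist

survey_mean = 25_000
survey_sd = 3_000
our_salary = 24_000

# Distance from the survey mean in standard deviation units
z = (our_salary - survey_mean) / survey_sd

# Approximate percentile, valid only if salaries are normally distributed
percentile = NormalDist().cdf(z) * 100

print(f"z = {z:.2f}, approximate percentile = {percentile:.0f}")
```

A salary one third of a standard deviation below the mean is well inside the typical range, so the department is not far out of line with the market.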
23The Normal Curve
24We know that a student's GPA is one standard
deviation above the mean
Standard Deviation   Cumulative %
-3.0                 0.14
-2.0                 2.28
-1.5                 6.68
-1.0                 15.87
-0.5                 30.85
0.0                  50.00
0.5                  69.15
1.0                  84.13
1.5                  93.32
2.0                  97.72
3.0                  99.86
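The cumulative percentages come from the standard normal curve and can be regenerated with `statistics.NormalDist`. This is a sketch; the computed values should agree with the table to within rounding in the last digit.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)
z_values = [-3.0, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0]

# Percentage of scores falling at or below each z value
cumulative = {z: standard_normal.cdf(z) * 100 for z in z_values}

for z, pct in cumulative.items():
    print(f"{z:5.1f}  {pct:6.2f}")
```

A student whose GPA is one standard deviation above the mean therefore scores higher than roughly 84% of students, provided GPAs are normally distributed.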
25Caution About Inferences From Standard Deviations
- Inferences can be made only when
- Data are normally distributed
- Sample size is large
- If conditions are not met, using percentiles
based on actual data is best
26Officer Elmwood PD Oakdale PD
A 1 1
B 2 1
C 2 1
D 3 1
E 3 1
F 3 1
G 4 1
H 4 1
I 4 1
J 4 1
K 5 1
L 5 1
M 5 1
N 5 9
O 5 9
P 5 9
Q 6 9
R 6 9
S 6 9
T 6 9
U 7 9
V 7 9
W 7 9
X 8 9
Y 8 9
Z 9 9
Number of tickets written at two police
departments
27Elmwood PD Oakdale PD
Mean 5.00 5.00
SD 2.00 4.08
1 SD Range  3 to 7  0.92 to 9.08
2 SD Range  1 to 9  -3.16 to 13.16
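The summary figures can be recomputed from the officer-level table. The sketch below assumes the sample (n - 1) standard deviation, which reproduces the SDs of 2.00 and 4.08, and also counts how many officers actually fall within one SD of the mean.

```python
import statistics

# Tickets per officer (A through Z), transcribed from the table above
elmwood = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5,
           6, 6, 6, 6, 7, 7, 7, 8, 8, 9]
oakdale = [1] * 13 + [9] * 13

results = {}
for name, tickets in (("Elmwood", elmwood), ("Oakdale", oakdale)):
    mean = statistics.mean(tickets)
    sd = statistics.stdev(tickets)  # sample SD (n - 1)
    low, high = mean - sd, mean + sd
    share_within_1sd = sum(low <= t <= high for t in tickets) / len(tickets)
    results[name] = (mean, sd, mean - 2 * sd, mean + 2 * sd, share_within_1sd)
    print(f"{name}: mean={mean:.2f} sd={sd:.2f} "
          f"2 SD range={mean - 2 * sd:.2f} to {mean + 2 * sd:.2f} "
          f"within 1 SD={share_within_1sd:.0%}")
```

For Oakdale the 2 SD range runs below zero tickets and every officer falls within 1 SD, so the usual 68% and 95% rules do not describe these data. This is the situation the caution about non-normal data is warning against.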
28Measures of Comparison and Explanation
- Percent
- Percentile
- Q1
- Q2
- Q3
- Standard Score (Z)
- mean of zero
- standard deviation of 1
- T-Score
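A brief sketch of the two standard-score conversions. The function names and the example score are hypothetical. The T-score scaling used here (mean 50, SD 10) is the conventional definition and is not spelled out on the slide.

```python
def z_score(score, mean, sd):
    """Standard score: distance from the mean in SD units (mean 0, SD 1)."""
    return (score - mean) / sd

def t_score(z):
    """T-score: a z score rescaled to a mean of 50 and an SD of 10."""
    return 50 + 10 * z

# Hypothetical example: a score of 115 on a test with mean 100 and SD 15
z = z_score(115, mean=100, sd=15)
t = t_score(z)
print(f"z = {z:.2f}, T = {t:.0f}")
```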
29Using Descriptive Statistics to Ensure Data Integrity
- Reasons for Errors
- Inaccurate source data
- Copied incorrectly from source data
- Input error
- misread
- keystroke error
- conversion error
- Input statement error
- Methods to Check
- Proofread raw data
- "Sure thing" analysis that didn't work
- Use descriptive statistics to
- check for values outside the possible range
- check for values that don't make sense
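A range check of this kind takes only a few lines. The ratings below are hypothetical and assume a 1 to 5 rating scale; both the data and the scale limits are illustrative and do not come from the slides.

```python
# Hypothetical performance ratings keyed by employee; valid scale assumed to be 1-5
ratings = {"Caffey": 3, "Doherty": 4, "Yokas": 33, "Parker": 0, "Davis": 5}
VALID_MIN, VALID_MAX = 1, 5

# The minimum and maximum alone reveal whether any entry is impossible
print("min:", min(ratings.values()), "max:", max(ratings.values()))

# List the specific entries that need to be checked against the source data
out_of_range = {name: value for name, value in ratings.items()
                if not VALID_MIN <= value <= VALID_MAX}
print("check these entries:", out_of_range)
```

A value such as 33 is a likely keystroke error (a doubled 3), which is the kind of input error the list above describes.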
30Statistics That Test Differences Between Groups
31What Statistic to Use
- Frequencies
- Chi Square
- Means
- two groups: t-test
- Analysis of Variance
- more than two groups
- more than one independent variable
- Multivariate Analysis of Variance (MANOVA)
- more than one dependent variable
- Analysis of Covariance
- controlling for other variables
32Number of Independent Variables   Number of Dependent Variables
                                    One      2 or More
One
  Two levels                        t-test   MANOVA
  2 or more levels                  ANOVA    MANOVA
Two or More                         ANOVA    MANOVA
33Differences in Frequencies: Chi-Square
- Goodness of Fit
- Does the observed frequency differ from the expected frequency?
- Example
- Secretary 92 80
- Welder 20 25
- Supervisor 40 50
- Tests of Independence
- Does the distribution for one group differ from that of another?
- Example (columns: Hired, Not)
- Male 32 16
- Female 10 20
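The test of independence for the hiring example can be worked by hand. The sketch below assumes the two columns are "hired" and "not hired", uses the plain Pearson chi-square with no continuity correction, and obtains the p value from a shortcut that holds only for 1 degree of freedom.

```python
import math

# Rows: Male, Female. Columns: hired, not hired (column meaning assumed).
observed = [[32, 16],
            [10, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Count expected in this cell if hiring were unrelated to sex
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only, chi-square is a squared z score, so p = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square({df}) = {chi_square:.2f}, p = {p_value:.4f}")
```

Under these assumptions the hiring rates of men and women differ by more than chance alone would be expected to produce at the .05 level.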
34The t-test Tests Differences in Means Between Two Groups
Sex Sex
Male Female
Salary 46,000 43,000
Race Race
Nonminority Minority
Interview score 52.6 47.3
35Differences Between Two Means: The t-test
- Assumptions
- Normal distribution
- Equal variances in each group
- Size and Significance
- Differences in means
- Amount of within group variance
- Sample size
- Journal Listing
- t(45) = 2.31, p < .05
36t-value Needed for Significance
Degrees of Freedom   Significance Level (2-tailed)
                     .05      .01
10                   2.228    3.169
15                   2.131    2.947
20                   2.086    2.845
30                   2.042    2.750
40                   2.021    2.704
60                   2.000    2.660
120                  1.980    2.617
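The table can be used as a lookup. The sketch below encodes it and reports the strictest tabled level a t value reaches. The helper names are hypothetical, and when the exact degrees of freedom are not tabled it falls back to the next lower row, which is a conservative convention assumed here.

```python
# Two-tailed critical t values from the table above: df -> (.05 level, .01 level)
CRITICAL_T = {
    10: (2.228, 3.169), 15: (2.131, 2.947), 20: (2.086, 2.845),
    30: (2.042, 2.750), 40: (2.021, 2.704), 60: (2.000, 2.660),
    120: (1.980, 2.617),
}

def critical_values(df):
    """Return critical values for the largest tabled df not exceeding df."""
    usable = [tabled for tabled in CRITICAL_T if tabled <= df]
    if not usable:
        raise ValueError("df is below the smallest tabled value")
    return CRITICAL_T[max(usable)]

def significance(t, df):
    """Report the strictest tabled level that |t| meets or exceeds."""
    crit_05, crit_01 = critical_values(df)
    if abs(t) >= crit_01:
        return "p < .01"
    if abs(t) >= crit_05:
        return "p < .05"
    return "not significant"

print(significance(2.39, 60))  # the job satisfaction example
print(significance(2.31, 45))  # the journal listing example
```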
37Analysis of Variance
- Tests differences in means when there
- Are more than two groups
- White 23,121
- African-American 20,243
- Hispanic 21,176
- West Virginian 18,543
- Is more than one independent variable
- White Black Total
- Male 28,100 21,900 25,000
- Female 24,000 22,000 23,000
- Total 26,050 21,950 24,000
- Is an interaction between the two independent variables
38Interpreting the Results of an ANOVA
Source        DF    SS            MS           F       p <
Sex           1     382106006     382106006    13.16   .0004
Race          1     42857538      42857538     1.48    .2260
Race x Sex    1     14079430      14079430     0.48    .4871
Error         174   5051526673    29031762
Total         177   5490569647

              White     Black     Total
Male          45,008    43,349    44,621
Female        41,556    41,330    41,505
Total         43,874    42,708
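Each F in the table is the effect's mean square divided by the error mean square, and each mean square is the sum of squares divided by its degrees of freedom. A short sketch that recomputes the F column from the SS and DF columns above:

```python
# (degrees of freedom, sum of squares) for each effect, from the table above
effects = {
    "Sex": (1, 382106006),
    "Race": (1, 42857538),
    "Race x Sex": (1, 14079430),
}
error_df, error_ss = 174, 5051526673

ms_error = error_ss / error_df  # within-group (error) variance

f_ratios = {}
for name, (df, ss) in effects.items():
    ms_effect = ss / df                    # between-group variance for this effect
    f_ratios[name] = ms_effect / ms_error  # F = MS effect / MS error
    print(f"{name}: F({df}, {error_df}) = {f_ratios[name]:.2f}")
```

Only the F for sex is large. The race and interaction effects are close to or below 1.0, meaning the between-group variance is about the same size as the within-group variance.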
39Interpreting an ANOVA
- What is an F Ratio?
- The between group variance divided by the within group variance
- An F of 1.0 indicates that there are equal amounts of within and between groups variance
- With two groups, t is the square root of F
- Significance is determined by the size of F and sample size
- Sample Size Cautions
- Sample size in each cell should be reasonable (at least 10)
- Sample size in each cell should be about equal or at least proportional to the marginal totals
40Multiple Comparisons: Example
Employee Education        Performance Rating
GED                       3.13
High school diploma       3.41
Associate's degree        4.26
Bachelor's degree         4.35
Master's degree           4.37
41Multiple Comparisons: Considerations
- Planned vs. post hoc comparisons
- Planned contrasts
- Post hoc contrasts
- Scheffé
- Tukey HSD
- Newman-Keuls
- Duncan
- Fisher's least significant difference test
- Number of comparisons made
- Bonferroni Adjustment
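The Bonferroni adjustment divides the significance level by the number of comparisons made. As a sketch, assume every pair of the five education groups in the example is compared and the overall level is .05; both are assumptions, since the slides do not specify them.

```python
from math import comb

n_groups = 5  # GED, high school, associate's, bachelor's, master's
alpha = 0.05  # overall significance level (assumed)

# Number of distinct pairs of groups that can be compared
n_comparisons = comb(n_groups, 2)

# Each individual comparison must meet this stricter level
adjusted_alpha = alpha / n_comparisons

print(f"{n_comparisons} comparisons, each tested at p < {adjusted_alpha:.3f}")
```

The more comparisons are made, the stricter each individual test becomes, which is why the number of comparisons matters when choosing between planned and post hoc contrasts.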
42Analysis of Covariance
Source             DF   SS          MS          F      p <
Covariates
Education          1    2036063     2036063     0.10   .76
Years in company   1    132707859   132707859   6.33   .02
Years in grade     1    83553431    83553431    3.99   .06
Years experience   1    16708479    16708479    0.80   .38
Sex                1    12096720    12096720    0.58   .46

             Uncorrected   Corrected
Male         41,399        38,236
Female       37,859        36,682
Difference   3,540         1,554
43Interpreting Correlations
- Direction
- Positive
- Negative
- Magnitude
- Distance from zero
- Comparison to norms
- Utility analysis
- Type of Relationship
- Linear
- Curvilinear
44Interpreting Correlations
- Types of Correlation
- Pearson
- Spearman rank order
- Point biserial
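A minimal sketch of the Pearson correlation, computed from its definition so that direction and magnitude can be read off the result. The scores are hypothetical and `pearson_r` is an illustrative helper name.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical scores for five employees (not from the slides)
interview = [1, 2, 3, 4, 5]
performance = [2, 4, 5, 4, 5]

r = pearson_r(interview, performance)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"

print(f"r = {r:.2f}; direction: {direction}; magnitude: {abs(r):.2f}")
```

Pearson's r captures only linear relationships, so a strong curvilinear relationship can still produce an r near zero.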
45Regression
- Enables prediction
- Allows combinations of small correlations
- Accounts for overlap of variables
- Two main types
- Stepwise
- Hierarchical
46Regression Formula
- Y = a + (b1)(x1) + (b2)(x2)
- Y = predicted criterion score
- a = constant (intercept)
- b = weight (slope)
- x = score on the predictor
47Regression
- Things to watch for
- Total number of subjects
- Subject-to-variable ratio
- Multicollinearity
- Inclusion of nonsignificant variables
- Missing variables
- Types of equations
- Raw score
- Standard score
- Types of regressions
- Stepwise
- Hierarchical
48Interpreting Regression Results
Variable     Regression Weight   r²     R²     F      p <
Constant     3.67
IQ           0.10                .151   .151   3.69   .05
Interview    0.59                .036   .187   3.69   .05
Model                                          9.57   .001
Performance = 3.67 + (.10)(IQ) + (.59)(Interview)
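The equation can be applied directly to make a prediction. In the sketch below the applicant's scores (an IQ of 100 and an interview score of 4) are hypothetical, and `predict_performance` is an illustrative name; the constant and weights are the ones in the table above.

```python
def predict_performance(iq, interview):
    """Predicted performance = constant + (weight)(IQ) + (weight)(Interview)."""
    constant = 3.67
    iq_weight = 0.10
    interview_weight = 0.59
    return constant + iq_weight * iq + interview_weight * interview

# Hypothetical applicant: IQ of 100 and an interview score of 4
predicted = predict_performance(iq=100, interview=4)
print(f"Predicted performance: {predicted:.2f}")
```

These are raw-score weights, so their sizes depend on the scale of each predictor and do not by themselves show which predictor matters more.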