Understanding Statistics - PowerPoint PPT Presentation

Provided by: maam1
1
Understanding Statistics
2
Reasons for Analyzing Data
  • Describe data
  • Determine if two or more groups differ on some
    variable
  • Determine if two or more variables are related
  • Reduce data

3
Types of Data
  • Nominal
  • categories
  • race
  • hair color
  • Ordinal
  • rank order
  • baseball standings
  • waiting list placements
  • Interval
  • equality of intervals
  • performance ratings
  • temperature
  • Ratio
  • true zero
  • equality of ratios
  • salary
  • height

4
The Concept of Significance
  • Interocular Significance
  • Statistical Significance
  • Practical Significance

5
Significance Levels
  • Indicate the probability of obtaining results at least this
    extreme by chance alone if there is no true effect
  • Standard is .05, but others can be used
  • Type I error: concluding there is a difference
    when in fact there is none
  • Type II error: concluding there is no difference
    when there is one

6
Statistical Significance
  • When deviating from the .05 level, consider
    • the common sense of your finding
    • previous research
    • the quality of your data
    • the cost of being wrong
  • Probability level is influenced by
    • sample size
    • differences between groups
    • within-group variability

7
Significance Levels in Journal Articles
The job satisfaction level of female employees
(M = 4.21) was significantly higher than that of
male employees (M = 3.50), t(60) = 2.39, p < .02.

                     Academy Score   Commendations
Cognitive ability    .43             .03
Education            .28             .24
* p < .05, ** p < .01, *** p < .001
8
Statistics That Describe Data
9
Raw Data Are Not Usually Meaningful
Client IQ
Caffey 98
Doherty 104
Yokas 110
Boscarelli 93
Sullivan 121
Parker 114
Davis 98
10
Statistics That Describe Data
  • Sample Size
    • overall (N)
    • subgroups (n)
  • Frequencies
  • Central Tendency
    • mean (statistical average)
    • median (midpoint)
    • mode (most common)
  • Dispersion
    • range
    • variance and standard deviation

11
Measures of Central Tendency
  • Mean
  • Median
  • Mode

12
The Mean
Client IQ
Boscarelli 93
Caffey 98
Davis 98
Doherty 104
Yokas 110
Parker 114
Sullivan 121
Sum 738
N 7
Mean 105.4
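The arithmetic on this slide can be reproduced in a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not part of the original presentation.

```python
# IQ scores from the slide, in the order listed.
iq_scores = [93, 98, 98, 104, 110, 114, 121]

total = sum(iq_scores)   # sum of all scores
n = len(iq_scores)       # number of clients (N)
mean_iq = total / n      # arithmetic mean

print(f"Sum = {total}, N = {n}, Mean = {mean_iq:.1f}")
```

Run as written, this prints a sum of 738, an N of 7, and a mean of 105.4, matching the slide.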
13
The Median
  • The median is the point at which 50% of your data
    fall above and 50% fall below
  • With an odd number of scores, the median is the middle
    score
  • With an even number of scores, the median is the average
    of the two middle scores
  • Example: 93 98 98 100 104 110 114 121
  • Median = (100 + 104) / 2 = 102
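The odd and even cases can be checked with Python's standard `statistics` module. This is a small sketch: the eight-score list is the example from this slide, the seven-score list is the client IQ data used elsewhere in the deck, and the variable names are illustrative.

```python
import statistics

odd_scores = [93, 98, 98, 104, 110, 114, 121]         # 7 scores
even_scores = [93, 98, 98, 100, 104, 110, 114, 121]   # 8 scores

# Odd count: the middle (4th) score is returned.
median_odd = statistics.median(odd_scores)

# Even count: the two middle scores (100 and 104) are averaged.
median_even = statistics.median(even_scores)

print(median_odd, median_even)
```

The first result is 104 and the second is 102, the value shown on the slide.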

14
The Median
Client IQ
Boscarelli 93
Caffey 98
Davis 98
Doherty 104
Yokas 110
Parker 114
Sullivan 121
15
The Mode: The Most Frequently Occurring Score
Client IQ
Boscarelli 93
Caffey 98
Davis 98
Doherty 104
Yokas 110
Parker 114
Sullivan 121
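A short sketch of finding the mode for these scores with the standard `statistics` module. The variable names are illustrative; `multimode` is included because a data set can have more than one most-frequent value.

```python
import statistics

iq_scores = [93, 98, 98, 104, 110, 114, 121]

# Single most frequent score.
mode_iq = statistics.mode(iq_scores)

# All scores tied for most frequent (a list, in case of ties).
all_modes = statistics.multimode(iq_scores)

print(mode_iq, all_modes)
```

Here 98 is the only score that occurs twice, so it is the mode.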
16
Which Measure of Central Tendency Should I Use?
  • Mode
    • nominal data (categories)
  • Mean
    • interval data
    • ratio data
  • Median
    • ordinal (ranked) data
    • interval or ratio data if there are
      • outliers
      • a skewed distribution

17
         Therapist
         Newhart   Brothers
         17        17
         18        18
         19        19
         20        20
         21        21
         22        22
         23        51
Mean     20        24
Median   20        20
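The table shows how a single outlier (51) pulls the mean but not the median. A minimal sketch that reproduces the two summary rows follows; the dictionary layout and names are illustrative.

```python
import statistics

values = {
    "Newhart": [17, 18, 19, 20, 21, 22, 23],
    "Brothers": [17, 18, 19, 20, 21, 22, 51],   # 51 is the outlier
}

summary = {}
for therapist, scores in values.items():
    summary[therapist] = (statistics.mean(scores), statistics.median(scores))
    print(therapist, "mean:", summary[therapist][0],
          "median:", summary[therapist][1])
```

Both medians are 20, while the means are 20 and 24, which is why the median is preferred when outliers are present.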
18
Measures of Dispersion
  • Range
    • Minimum
    • Maximum
    • Spread
  • Variance (s²)
  • Standard deviation (s)
    • Square root of the variance
    • ±1 SD contains about 68% of scores in a normal
      distribution
    • ±2 SD contains about 95% of scores in a normal
      distribution

19
Number of Days Absent
       Day Shift   Evening Shift   Night Shift
       4           2               3
       4           3               3
       4           4               4
       4           5               5
       4           6               5
Mean   4           4               4
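All three shifts have the same mean, so the difference between them only shows up in measures of dispersion. The sketch below computes the range and the sample standard deviation (n - 1 in the denominator) for each shift; the choice of the sample formula and the variable names are assumptions, not something stated on the slide.

```python
import statistics

absences = {
    "Day": [4, 4, 4, 4, 4],
    "Evening": [2, 3, 4, 5, 6],
    "Night": [3, 3, 4, 5, 5],
}

sd_by_shift = {}
for shift, days in absences.items():
    mean = statistics.mean(days)
    spread = max(days) - min(days)          # range
    sd_by_shift[shift] = statistics.stdev(days)   # sample SD
    print(f"{shift}: mean={mean}, range={spread}, SD={sd_by_shift[shift]:.2f}")
```

The day shift has no variability at all, while the evening shift is the most spread out even though its mean is identical.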
20
Performance Ratings
       Supervisor
       Tribbiani   Geller
       3           2
       3           2
       3           3
       3           3
       3           4
       3           4
Mean   3           3
21
IQ Scores for Two Training Groups
Training Group   Mean IQ   SD   1 SD Range   2 SD Range
Morning          100       3    97 to 103    94 to 106
Afternoon        100       15   85 to 115    70 to 130
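The 1 SD and 2 SD ranges in this table are simply the mean plus or minus one or two standard deviations. A small sketch, with a hypothetical helper name `sd_range`:

```python
def sd_range(mean, sd, k):
    """Return the (low, high) interval k standard deviations around the mean."""
    return (mean - k * sd, mean + k * sd)

groups = {"Morning": (100, 3), "Afternoon": (100, 15)}   # (mean IQ, SD)

for name, (mean, sd) in groups.items():
    print(name, "1 SD:", sd_range(mean, sd, 1), "2 SD:", sd_range(mean, sd, 2))
```

Both groups share a mean of 100, but the afternoon group's 2 SD range (70 to 130) is five times as wide as the morning group's (94 to 106).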
22
Salary Survey Example
  • Salary Survey Data
    • Mean for police officers is 25,000
    • SD = 3,000
  • Our Department Salary
    • 24,000
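One way to interpret these numbers is to express the department salary as a standard score. The sketch below does that and then converts it to an approximate percentile, which is valid only under the assumption that salaries in the survey are normally distributed; that assumption is not stated on the slide.

```python
from statistics import NormalDist

survey_mean = 25_000
survey_sd = 3_000
our_salary = 24_000

# Standard score: how many SDs our salary sits from the survey mean.
z = (our_salary - survey_mean) / survey_sd

# Approximate percentile, ASSUMING a normal distribution of salaries.
percentile = NormalDist().cdf(z) * 100

print(f"z = {z:.2f}, roughly the {percentile:.0f}th percentile")
```

The salary is one third of a standard deviation below the survey mean, so it is below average but well within the typical range.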

23
The Normal Curve
24
We know that a student's GPA is one standard
deviation above the mean
Standard Deviation   Cumulative %
-3.0                  0.14
-2.0                  2.28
-1.5                  6.68
-1.0                 15.87
-0.5                 30.85
 0.0                 50.00
 0.5                 69.15
 1.0                 84.13
 1.5                 93.32
 2.0                 97.72
 3.0                 99.86
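The cumulative percentages in this table come from the standard normal distribution. As a hedged sketch, they can be regenerated with `statistics.NormalDist` from the Python standard library; the list of z-values simply mirrors the rows above.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1
z_values = [-3.0, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0]

# Percentage of scores falling at or below each z-value.
cumulative = {z: std_normal.cdf(z) * 100 for z in z_values}

for z, pct in cumulative.items():
    print(f"{z:+.1f}  {pct:6.2f}")
```

A GPA one standard deviation above the mean therefore exceeds about 84% of GPAs. The computed values at -3.0 and 3.0 may differ from the table in the last digit because of rounding.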
25
Caution About Inferences From Standard Deviations
  • Inferences can be made only when
  • Data are normally distributed
  • Sample size is large
  • If conditions are not met, using percentiles
    based on actual data is best

26
Officer Elmwood PD Oakdale PD
A 1 1
B 2 1
C 2 1
D 3 1
E 3 1
F 3 1
G 4 1
H 4 1
I 4 1
J 4 1
K 5 1
L 5 1
M 5 1
N 5 9
O 5 9
P 5 9
Q 6 9
R 6 9
S 6 9
T 6 9
U 7 9
V 7 9
W 7 9
X 8 9
Y 8 9
Z 9 9
Number of tickets written at two police
departments
27
             Elmwood PD   Oakdale PD
Mean         5.00         5.00
SD           2.00         4.08
1 SD Range   3 to 7       0.92 to 9.08
2 SD Range   1 to 9       -3.16 to 13.16
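The ticket data illustrate why the 68% and 95% rules of thumb depend on the shape of the distribution. The sketch below rebuilds both departments' data from the officer table, computes the sample standard deviation, and counts how many officers actually fall within 1 SD of the mean. The helper name `share_within` is hypothetical.

```python
import statistics

# Tickets per officer, rebuilt from the table (officers A through Z).
elmwood = [1] + [2] * 2 + [3] * 3 + [4] * 4 + [5] * 6 + [6] * 4 + [7] * 3 + [8] * 2 + [9]
oakdale = [1] * 13 + [9] * 13

def share_within(values, k):
    """Fraction of values lying within k sample SDs of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    low, high = mean - k * sd, mean + k * sd
    return sum(low <= v <= high for v in values) / len(values)

for name, data in (("Elmwood", elmwood), ("Oakdale", oakdale)):
    print(name,
          "mean:", statistics.mean(data),
          "SD:", round(statistics.stdev(data), 2),
          "within 1 SD:", round(share_within(data, 1), 2))
```

Both departments average 5 tickets, but in Oakdale nobody writes anywhere near 5, and every officer falls inside the 1 SD range rather than the roughly 68% a normal distribution would predict.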
28
Measures of Comparison and Explanation
  • Percent
  • Percentile
    • Q1 (25th percentile)
    • Q2 (50th percentile, the median)
    • Q3 (75th percentile)
  • Standard Score (Z)
    • mean of zero
    • standard deviation of 1
  • T-Score
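As an illustration of standard scores, the sketch below converts the client IQ scores used earlier in the deck to z-scores and then to T-scores. The T-score scaling (mean 50, standard deviation 10) is the usual convention and is an assumption here, since the slide does not define it.

```python
import statistics

iq_scores = [93, 98, 98, 104, 110, 114, 121]

mean = statistics.mean(iq_scores)
sd = statistics.stdev(iq_scores)    # sample standard deviation

# z-score: distance from the mean in SD units (mean 0, SD 1).
z_scores = [(x - mean) / sd for x in iq_scores]

# T-score: z rescaled to mean 50 and SD 10 (conventional definition).
t_scores = [50 + 10 * z for z in z_scores]

for raw, z, t in zip(iq_scores, z_scores, t_scores):
    print(f"{raw:>3}  z = {z:+.2f}  T = {t:.1f}")
```

Standardizing changes the scale but not the ordering of the scores, which makes scores from different measures directly comparable.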

29
Using Descriptive Statistics to Ensure Data
Integrity
  • Reasons for Errors
    • Inaccurate source data
    • Copied incorrectly from source data
    • Input error
      • misread
      • keystroke error
      • conversion error
    • Input statement error
  • Methods to Check
    • Proofread raw data
    • A "sure thing" analysis that didn't work
    • Use descriptive statistics to
      • check for values outside the possible range
      • check for values that don't make sense
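A range check of the kind described above can be sketched in a few lines. The data and the function name `find_out_of_range` are hypothetical: the ratings are assumed to be on a 1 to 5 scale, with two deliberate entry errors.

```python
def find_out_of_range(values, low, high):
    """Return (position, value) pairs for entries outside [low, high]."""
    return [(i, v) for i, v in enumerate(values) if not (low <= v <= high)]

# Hypothetical performance ratings on a 1-5 scale; 33 and 0 are entry errors.
ratings = [3, 4, 2, 5, 33, 1, 0, 4]

problems = find_out_of_range(ratings, 1, 5)
print("min:", min(ratings), "max:", max(ratings))
print("suspect entries:", problems)
```

Simply printing the minimum and maximum is often enough to reveal impossible values before any further analysis is run.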

30
Statistics That Test Differences Between Groups
31
What Statistic to Use
  • Frequencies
  • Chi Square
  • Means
  • two groups: t-test
  • Analysis of Variance
  • more than two groups
  • more than one independent variable
  • Analysis of Covariance
  • more than one dependent variable
  • controlling for other variables

32
Number of Independent Variables   Number of Dependent Variables
                                  One       2 or More
One
  Two levels                      t-test    MANOVA
  2 or more levels                ANOVA     MANOVA
Two or More                       ANOVA     MANOVA
33
Differences in Frequencies: Chi-Square
  • Goodness of Fit
    • Does the observed frequency differ from the
      expected frequency?
    • Example
        Secretary    92   80
        Welder       20   25
        Supervisor   40   50
  • Tests of Independence
    • Does the distribution for one group differ from
      that of another?
    • Example
                 Hired   Not hired
        Male     32      16
        Female   10      20
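Both examples can be worked through with the chi-square formula, the sum of (observed - expected)² / expected. The sketch below is illustrative only. It assumes that the first number in each goodness-of-fit row is the observed count and the second the expected count, which the slide does not label. The p-value line uses the closed form for 1 degree of freedom.

```python
import math

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Goodness of fit (ASSUMPTION: first column observed, second expected).
goodness_of_fit = chi_square([92, 20, 40], [80, 25, 50])

# Test of independence for the 2 x 2 hiring table.
table = [[32, 16],    # Male:   hired, not hired
         [10, 20]]    # Female: hired, not hired
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

observed = [cell for row in table for cell in row]
# Expected count for each cell = row total * column total / N.
expected = [r * c / n for r in row_totals for c in col_totals]
independence = chi_square(observed, expected)

# With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(independence / 2))

print(f"goodness of fit: {goodness_of_fit:.2f}")
print(f"independence: {independence:.2f}, p = {p_value:.4f}")
```

For the hiring table the statistic is about 8.25 with 1 degree of freedom, so the hiring rates for men and women differ by more than chance would readily explain.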

34
The t-test Tests Differences in Means Between Two
Groups
                  Sex
                  Male          Female
Salary            46,000        43,000

                  Race
                  Nonminority   Minority
Interview score   52.6          47.3
35
Differences Between Two Means: The t-test
  • Assumptions
    • Normal distribution
    • Equal variances in each group
  • Size and Significance
    • Differences in means
    • Amount of within-group variance
    • Sample size
  • Journal Listing
    • t(45) = 2.31, p < .05

36
t-value Needed for Significance
Degrees of Freedom   Significance Level (2-tailed)
                     .05 .01
10 2.228 3.169
15 2.131 2.947
20 2.086 2.845
30 2.042 2.750
40 2.021 2.704
60 2.000 2.660
120 1.980 2.617
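To show how the table is used, the sketch below computes an independent-samples t-value with a pooled variance (which relies on the equal-variance assumption) and compares it with the tabled critical values. The two groups of scores are hypothetical and invented for illustration, as are the function names. When the exact degrees of freedom are not in the table, the lookup falls back to the next lower row, which is the conservative choice.

```python
import math
import statistics

# Two-tailed critical t-values from the table: df -> (.05 level, .01 level)
CRITICAL_T = {
    10: (2.228, 3.169), 15: (2.131, 2.947), 20: (2.086, 2.845),
    30: (2.042, 2.750), 40: (2.021, 2.704), 60: (2.000, 2.660),
    120: (1.980, 2.617),
}

def pooled_t(group_a, group_b):
    """Independent-samples t with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t, df

def critical_values(df):
    """Critical values for the largest tabled df not exceeding df."""
    usable = [d for d in CRITICAL_T if d <= df]
    if not usable:
        raise ValueError("the table starts at 10 degrees of freedom")
    return CRITICAL_T[max(usable)]

# HYPOTHETICAL scores for two groups of six people each.
group_a = [4, 5, 6, 5, 4, 6]
group_b = [3, 2, 4, 3, 2, 4]

t_value, df = pooled_t(group_a, group_b)
crit_05, crit_01 = critical_values(df)
print(f"t({df}) = {t_value:.2f}; needs {crit_05} at .05 and {crit_01} at .01")
```

The same lookup applied to the journal example earlier in the deck, t(60) = 2.39, gives critical values of 2.000 and 2.660, so that result is significant at the .05 level but not at the .01 level.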
37
Analysis of Variance
  • Tests differences in means when there
    • Are more than two groups
        White              23,121
        African-American   20,243
        Hispanic           21,176
        West Virginian     18,543
    • Is more than one independent variable
                 White    Black    Total
        Male     28,100   21,900   25,000
        Female   24,000   22,000   23,000
        Total    26,050   21,950   24,000
    • Is an interaction between the two independent
      variables

38
Interpreting the Results of an ANOVA
Source       DF    SS           MS          F       p <
Sex          1     382106006    382106006   13.16   .0004
Race         1     42857538     42857538    1.48    .2260
Race x Sex   1     14079430     14079430    0.48    .4871
Error        174   5051526673   29031762
Total        177   5490569647

         White    Black    Total
Male     45,008   43,349   44,621
Female   41,556   41,330   41,505
Total    43,874   42,708

39
Interpreting an ANOVA
  • What is an F Ratio?
    • The between-group variance divided by the within-group
      variance
    • An F of 1.0 indicates that there are equal
      amounts of within- and between-group variance
    • With two groups, t is the square root of F
    • Significance is determined by the size of F and the
      sample size
  • Sample Size Cautions
    • Sample size in each cell should be reasonable (at
      least 10)
    • Sample size in each cell should be about equal or
      at least proportional to the marginal totals
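The F ratios in the ANOVA table two slides back can be recomputed by dividing each effect's mean square by the error mean square. In the sketch below the mean squares are taken from that table; because each effect has 1 degree of freedom, its mean square equals its sum of squares. The variable names are illustrative.

```python
import math

ms_error = 29031762   # within-group (error) mean square

# Each effect has 1 df, so its mean square equals its sum of squares.
mean_squares = {
    "Sex": 382106006,
    "Race": 42857538,
    "Race x Sex": 14079430,
}

# F = between-group variance / within-group variance
f_ratios = {effect: ms / ms_error for effect, ms in mean_squares.items()}

for effect, f in f_ratios.items():
    # With 1 df in the numerator, the equivalent t is the square root of F.
    print(f"{effect}: F = {f:.2f}, equivalent t = {math.sqrt(f):.2f}")
```

Only the F for sex is far above 1.0, which is why it is the only effect with a small p-value in the table.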

40
Multiple Comparisons: Example
Employee Education     Performance Rating
GED                    3.13
High school diploma    3.41
Associate's degree     4.26
Bachelor's degree      4.35
Master's degree        4.37

41
Multiple Comparisons: Considerations
  • Planned vs. post hoc comparisons
    • Planned contrasts
    • Post hoc contrasts
      • Scheffé
      • Tukey HSD
      • Newman-Keuls
      • Duncan
      • Fisher's least significant difference test
  • Number of comparisons made
    • Bonferroni Adjustment
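The Bonferroni adjustment divides the significance level by the number of comparisons made. As a sketch, the code below counts every pairwise comparison among the five education groups from the previous slide and adjusts a .05 level accordingly; comparing all pairs is an assumption made for illustration.

```python
from itertools import combinations

groups = ["GED", "High school diploma", "Associate's degree",
          "Bachelor's degree", "Master's degree"]

# Every possible pair of groups: 5 groups give 10 comparisons.
pairs = list(combinations(groups, 2))

alpha = 0.05
adjusted_alpha = alpha / len(pairs)   # Bonferroni-adjusted level per test

print(f"{len(pairs)} comparisons, per-comparison alpha = {adjusted_alpha:.3f}")
```

With ten comparisons, each individual test must reach the .005 level to keep the overall Type I error rate at about .05.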

42
Analysis of Covariance
Source               DF   SS          MS          F      p <
Covariates
  Education          1    2036063     2036063     0.10   .76
  Years in company   1    132707859   132707859   6.33   .02
  Years in grade     1    83553431    83553431    3.99   .06
  Years experience   1    16708479    16708479    0.80   .38
Sex                  1    12096720    12096720    0.58   .46

             Uncorrected   Corrected
Male         41,399        38,236
Female       37,859        36,682
Difference   3,540         1,554

43
Interpreting Correlations
  • Direction
    • Positive
    • Negative
  • Magnitude
    • Distance from zero
    • Comparison to norms
    • Utility analysis
  • Type of Relationship
    • Linear
    • Curvilinear

44
Interpreting Correlations
  • Types of Correlation
    • Pearson
    • Spearman rank order
    • Point biserial
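A Pearson correlation can be computed directly from its definition, the covariation of two variables divided by the product of their variations. The sketch below uses hypothetical interview and performance scores invented for illustration; the function name `pearson_r` is also illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# HYPOTHETICAL data: interview scores and later performance ratings.
interview = [1, 2, 3, 4, 5]
performance = [2, 4, 5, 4, 5]

r = pearson_r(interview, performance)
print(f"r = {r:.2f}")   # the sign gives direction, the size gives magnitude
```

The sign of r shows the direction of the relationship and its distance from zero shows the magnitude. Pearson's r only captures linear relationships, so a curvilinear pattern can produce a value near zero.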

45
Regression
  • Enables prediction
  • Allows combinations of small correlations
  • Accounts for overlap of variables
  • Two main types
    • Stepwise
    • Hierarchical

46
Regression Formula
  • Y = a + (b1)(x1) + (b2)(x2)
    • Y = predicted criterion score
    • a = constant (intercept)
    • b = weight (slope)
    • x = score on the predictor

47
Regression
  • Things to watch for
    • Total number of subjects
    • Subject-to-variable ratio
    • Multicollinearity
    • Inclusion of nonsignificant variables
    • Missing variables
  • Types of equations
    • Raw score
    • Standard score
  • Types of regressions
    • Stepwise
    • Hierarchical

48
Interpreting Regression Results
Variable    Regression Weight   r2     R2     F      p <
Constant    3.67
IQ          0.10                .151   .151   3.69   .05
Interview   0.59                .036   .187   3.69   .05
Model                                         9.57   .001
Performance = 3.67 + (.10)(IQ) + (.59)(Interview)
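The regression equation can be applied to an individual by plugging in that person's predictor scores. The sketch below uses the constant and weights from the table; the applicant's IQ and interview score are hypothetical values chosen for illustration, as is the function name.

```python
def predict_performance(iq, interview):
    """Predicted performance = constant + weighted predictor scores."""
    constant = 3.67          # intercept (a)
    weight_iq = 0.10         # regression weight for IQ (b1)
    weight_interview = 0.59  # regression weight for interview score (b2)
    return constant + weight_iq * iq + weight_interview * interview

# HYPOTHETICAL applicant: IQ of 100 and an interview score of 4.
predicted = predict_performance(iq=100, interview=4)
print(f"Predicted performance: {predicted:.2f}")
```

For this hypothetical applicant the prediction is 3.67 + 10.00 + 2.36 = 16.03.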