Title: Chapter 2 Statistical Tools in Evaluation
1. Chapter 2: Statistical Tools in Evaluation
2. Chapter 2 Outline
- Types of Scores
- Organizing and Graphing Test Scores
- Descriptive Statistics
- Percentile Rank
- Standard Scores
- Normal Curve
- Determining Relationships between Scores
- Regression Analysis
- Additional Statistical Tests
3. Types of Scores
- Continuous scores: scores with a potentially infinite number of values.
- Discrete scores: scores limited to a specific number of values.
4. Levels of Measurement
- Nominal
- Ordinal
- Interval
- Ratio
5. Scales of Measurement
- Nominal
- Set of mutually exclusive categories.
- Classify or categorize subjects.
- No meaningful order to categorization.
6. Scales of Measurement
- Ordinal
- Scores have an order, so one score can be classified as higher or lower than another.
- No common unit of measurement between numbers.
- Numbers cannot be averaged or used in any way except to indicate better than or worse than.
7. Scales of Measurement
- Interval
- Meaningful order and a common unit of measurement between scores.
- Arbitrary zero point.
8. Scales of Measurement
- Ratio
- Common unit of measurement and an absolute zero point.
- A score of zero indicates a complete absence of the quantity being measured.
9. Organizing and Graphing
- Simple frequency distribution: a listing of a distribution of scores in order.
- Easy to construct using a data analysis program (e.g., SPSS).
10. Frequency Distribution
- Helps organize and interpret data.
11. Graphing
- Frequency Polygon
- Histogram
12. For a Frequency Polygon or Histogram
- Similar scores are grouped together in an interval.
- The midpoint of each interval is plotted on the X-axis.
- Frequency is plotted on the Y-axis.
13. SPSS Sample Frequency Polygon
14. SPSS Sample Histogram
15. Skewness
- An asymmetrical distribution.
- Normal curve: no skewness.
- Positive skew: tail of the curve on the right; few high scores.
- Negative skew: tail of the curve on the left; few low scores.
16. Measurement
- Measurement: the process of obtaining test scores.
- Statistics: the methodology for analyzing the scores to enhance interpretation.
17. In this course, we use statistics
- To describe a set of scores.
- To standardize scores.
- To estimate validity and reliability.
18. Descriptive Statistics
- Central Tendency
- (how data cluster around the center)
- Variability
- (how data spread around the center)
19. Mode
- Most frequently occurring score.
20. Median
- 50th percentile
- Middle score
- Need to order scores in a frequency distribution
- Found from cumulative percent column
21. Mean
- The arithmetic average: the sum of the scores divided by the number of scores (X̄ = ΣX / n).
22. Calculate the Mean, Median, and Mode for Three Distributions

            Dist. 1   Dist. 2   Dist. 3
   Scores      100        75        51
                50        50        50
                50        50        50
                 0        25        49
   Mean          ?         ?         ?
   Median        ?         ?         ?
   Mode          ?         ?         ?

23. Calculate the Mean, Median, and Mode for Three Distributions
   Answers for Distribution 1: Mean = 50, Median = 50, Mode = 50.

24. Calculate the Mean, Median, and Mode for Three Distributions
   Answers for Distributions 1 and 2: Mean = 50, Median = 50, Mode = 50 for each.

25. Calculate the Mean, Median, and Mode for Three Distributions

            Dist. 1   Dist. 2   Dist. 3
   Scores      100        75        51
                50        50        50
                50        50        50
                 0        25        49
   Mean         50        50        50
   Median       50        50        50
   Mode         50        50        50
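As a quick computational check of the three distributions above, here is a minimal Python sketch (standard library only) that reproduces the answers in the table; the scores are taken directly from the slide.

```python
from statistics import mean, median, mode

# The three distributions from the slide (four scores each).
distributions = {
    "Distribution 1": [100, 50, 50, 0],
    "Distribution 2": [75, 50, 50, 25],
    "Distribution 3": [51, 50, 50, 49],
}

for name, scores in distributions.items():
    # Each distribution prints mean=50, median=50, mode=50.
    print(f"{name}: mean={mean(scores)}, median={median(scores)}, mode={mode(scores)}")
```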
26. So these three distributions are all the same, right?
No. What makes them different? A measure of variability.
27. Range = High score - Low score

            Dist. 1   Dist. 2   Dist. 3
   Scores      100        75        51
                50        50        50
                50        50        50
                 0        25        49
   Range       100        50         2
28. Variability
- A second type of descriptive statistic.
- Describes spread or heterogeneity of scores.
29. Measures of Variability
- Range
- Standard Deviation
- Variance
30. Range
- Range = high score - low score.
- Unstable because it depends on only two scores.
31. Standard Deviation (s)
- Average deviation of each score from the mean.
- Minimum value of s = 0.
- The larger s is, the more heterogeneous the group.
- σ = standard deviation of a population
- s = standard deviation of a sample
32. Standard Deviation (s)
- Definitional formula:
- s = √[ Σ(X - X̄)² / (n - 1) ]
33. Calculate the Standard Deviation

   s = √[ Σ(X - X̄)² / (n - 1) ]

      X    (X - X̄)   (X - X̄)²
      5        0          0
      4       -1          1
      2       -3          9
      9        4         16

34. Standard Deviation

      X    (X - X̄)   (X - X̄)²
      5        0          0
      4       -1          1
      2       -3          9
      9        4         16
   ΣX = 20   Σ(X - X̄) = 0   Σ(X - X̄)² = 26

   X̄ = 5
   s = √[ 26 / (4 - 1) ] = √8.67 = 2.94

35. Standard Deviation
- Calculational formula:
- s = √[ (ΣX² - (ΣX)² / n) / (n - 1) ]

      X     X²
      5     25
      4     16
      2      4
      9     81
   ΣX = 20   ΣX² = 126

36. Standard Deviation

      X     X²
      5     25
      4     16
      2      4
      9     81
   ΣX = 20   ΣX² = 126

   s = √[ (126 - (20)²/4) / (4 - 1) ]
   s = √[ (126 - 100) / 3 ]
   s = √8.667
   s = 2.94
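A minimal Python sketch of the two standard-deviation formulas above, applied to the scores 5, 4, 2, 9 from the worked example (standard library only); both forms should give s ≈ 2.94.

```python
from math import sqrt

scores = [5, 4, 2, 9]            # scores from the worked example
n = len(scores)
x_bar = sum(scores) / n          # mean = 20 / 4 = 5

# Definitional formula: s = sqrt( sum((X - mean)^2) / (n - 1) )
ss_dev = sum((x - x_bar) ** 2 for x in scores)            # 26
s_definitional = sqrt(ss_dev / (n - 1))

# Calculational formula: s = sqrt( (sum(X^2) - (sum X)^2 / n) / (n - 1) )
sum_x = sum(scores)                                       # 20
sum_x2 = sum(x ** 2 for x in scores)                      # 126
s_calculational = sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(round(s_definitional, 2), round(s_calculational, 2))  # 2.94 2.94
```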
37. Variance (s²)
- Average squared deviation from the mean.
- Standard deviation squared.
- Not used for description.
- Used with higher-level statistics such as regression analysis or analysis of variance.
- s² = Σ(X - X̄)² / (n - 1)
- s² = (ΣX² - (ΣX)² / n) / (n - 1)
38. Percentile Rank
- Percentage of subjects that scored below a given score.
- Read from the cumulative percent column in a simple frequency distribution.
- Percentile ranks are ordinal data.
39. Standard Scores
- Convert variables to a constant mean and standard deviation.
- Different units of measurement are converted to the same unit (standardized) and can then be averaged.
40. Z-score
- A standard score with a mean of 0 and a standard deviation of 1.
- Z = (X - X̄) / s
41. T-score
- A standard score with a mean of 50 and a standard deviation of 10.
- T = 10(Z) + 50
42. Z-scores
- Provide descriptions of relative performance on one or more tests.
43. Example use of Z-scores

   Student A
   Subject    Raw Score
   Math           30
   English        70
   Science       120

   On which test did Student A perform best?
44. Don't know.
45. Example use of Z-scores

   Student A
   Subject    Raw Score   Mean
   Math           30       25
   English        70       65
   Science       120      140

   On which test did Student A perform best?
46. Still don't know.
47. Example use of Z-scores

   Student A
   Subject    Raw Score   Mean   SD
   Math           30       25     5
   English        70       65    10
   Science       120      140    10

   On which test did Student A perform best?
48. Now we know.
49. Example use of Z-scores

   Student A
   Subject    Raw Score   Mean   SD   Z-score
   Math           30       25     5     1.00
   English        70       65    10     0.50
   Science       120      140    10    -2.00

   On which test did Student A perform best? Math, the test with the highest standard score.
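A small Python sketch of the Z-score conversion for the Student A example above (with the T-score conversion from the earlier slide included for comparison); the raw scores, means, and SDs come from the table.

```python
def z_score(raw, mean, sd):
    """Z = (X - mean) / SD"""
    return (raw - mean) / sd

def t_score(z):
    """T = 10(Z) + 50"""
    return 10 * z + 50

# Student A's results: (raw score, test mean, test SD)
tests = {"Math": (30, 25, 5), "English": (70, 65, 10), "Science": (120, 140, 10)}

for subject, (raw, mean, sd) in tests.items():
    z = z_score(raw, mean, sd)
    print(f"{subject}: Z = {z:.2f}, T = {t_score(z):.0f}")

# Math: Z = 1.00 (T = 60)  <- best relative performance
# English: Z = 0.50 (T = 55)
# Science: Z = -2.00 (T = 30)
```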
50. Why use standard scores?
- To combine different units of measurement.
- To assign different weights to each score.
51. Characteristics of the Normal Curve
- Symmetric
- Asymptotic
- Unimodal
- Area
52. Using the Normal Curve to Determine a Meaningful Test Score
- X = mean + Z(standard deviation)
- If the mean = 500 and SD = 100, what is the score above which 10% of scores would fall?
- X = 500 + 1.28(100)
- Z = 1.28 comes from the normal curve for the 90th percentile.
- X = 628
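A short sketch of the normal-curve calculation above, assuming SciPy is available; norm.ppf(0.90) returns the Z-value for the 90th percentile (about 1.28), which is then converted back to the test-score scale.

```python
from scipy.stats import norm

mean, sd = 500, 100
z_90 = norm.ppf(0.90)          # Z below which 90% of scores fall, about 1.2816
cutoff = mean + z_90 * sd      # score above which 10% of scores fall
print(round(z_90, 2), round(cutoff))   # 1.28, approximately 628
```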
53. Determining Relationships between Scores
54. Graphing
- Each subject must have a score on two variables: an X score and a Y score.
- Coordinates of X and Y are plotted.
- Coordinate: the paired X and Y scores for a subject.
- X scores are placed on the horizontal axis (abscissa).
- Y scores are placed on the vertical axis (ordinate).
55. Regression Line
- Line of Best Fit
- Straight line drawn through the data points.
- Represents the trend in the data.
56. Characteristics of Correlations
- Direction
- Magnitude (size)
57. Direction of r
Positive (+) or Negative (-)
58. Positive Relationship
- When high scores on one measure are associated with high scores on the other measure.
59. Negative Relationship
- When high scores on one measure are associated with low scores on the other measure.
60.
- The closer the data points fall to the line of best fit, the higher the relationship.
- Examine the sample graphs on the following slides.
61-64. (Sample graphs; figures only, no transcript.)
65. r = ?
66. r = .80
67. r = ?
68. r = -.24
69. r = ?
70. r = -.42
71. Correlation (r)
- A mathematical technique to determine the relationship between two sets of scores.
72. Pearson Product-moment Correlation (r)
- Estimates the linear relationship between variables.
73. Magnitude (strength) of r
- How close r is to 1.00 or -1.00.
- The higher the absolute value of r, the stronger the correlation.
- r = 1.00: perfect positive correlation.
- r = -1.00: perfect negative correlation.
74. Factors that influence magnitude of r
- Linearity
- If the relationship between two variables is curvilinear, Pearson r will underestimate the true relationship.
75. Factors that influence magnitude of r
- Reliability
- Low reliability on one or both variables will decrease the correlation.
76. Factors that influence magnitude of r
- Range of Scores
- A restricted range of scores on one or both variables will decrease the correlation.
- r will be smaller for a homogeneous group than for a heterogeneous group.
77. Effect of Restricted Range of Scores on r
78. A high r does not necessarily indicate a cause-and-effect (causal) relationship.
79. Calculation of r

   r = [ nΣXY - (ΣX)(ΣY) ] ÷ √{ [ nΣX² - (ΣX)² ] [ nΣY² - (ΣY)² ] }
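A minimal Python sketch of the computational formula for r shown above, using small made-up paired scores (illustrative values, not from the slides), with a NumPy cross-check.

```python
from math import sqrt
import numpy as np

# Illustrative paired scores (hypothetical data, not from the slides).
x = [2, 4, 5, 7, 9]
y = [3, 5, 4, 8, 10]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v ** 2 for v in x)
sum_y2 = sum(v ** 2 for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# r = [n*sum(XY) - sum(X)*sum(Y)] / sqrt{[n*sum(X^2) - (sum X)^2][n*sum(Y^2) - (sum Y)^2]}
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(round(r, 3))
print(round(np.corrcoef(x, y)[0, 1], 3))   # same value, as a cross-check
```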
80. Interpretation of r
- Direction?
- Magnitude?
- Varies under certain circumstances.
- Only the relationship you expect determines the quality of a given r.
81. Coefficient of Determination (r²)
- Square of the correlation coefficient.
- Proportion of variance in one measure that is explained by the other measure.
- If r = .60, r² = .36
- 36% of the variance in Y can be explained by X.
- If r = .90, r² = .81
- 81% of the variance in Y can be explained by X.
82.
- r² = proportion of variance in Y explained by X.
- 1 - r² = proportion of variance in Y not explained by X (coefficient of non-determination).
- (Figure labels: Variance of Y, 1 - r²; Variance of X, r².)
83. Prediction-Regression Analysis
- Regression: a statistical model used to predict performance on one variable from another.
- Simple regression: estimating a score on one variable (Y) from one other variable (X).
- Multiple regression: estimating a score on one variable (Y) from two or more other variables (X1, X2, etc.).
84. General form of the prediction equation

   Y' = bX + c

   b = slope of the regression line
   b = rate of change in Y per unit change in X
   b = rxy(Sy / Sx)
   c = Y-intercept or constant
   c = mean of Y - b(mean of X)
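A small sketch of the prediction-equation terms above (b = rxy(Sy/Sx), c = mean of Y - b(mean of X)), computed from the same illustrative paired scores used earlier; this only demonstrates the formulas and does not use data from the chapter.

```python
import statistics as st
import numpy as np

# Illustrative paired scores (hypothetical data).
x = [2, 4, 5, 7, 9]
y = [3, 5, 4, 8, 10]

r = np.corrcoef(x, y)[0, 1]
sx, sy = st.stdev(x), st.stdev(y)          # sample standard deviations
mean_x, mean_y = st.mean(x), st.mean(y)

b = r * (sy / sx)                          # slope: change in Y per unit change in X
c = mean_y - b * mean_x                    # intercept: c = mean(Y) - b * mean(X)

def predict(x_value):
    """Y' = bX + c"""
    return b * x_value + c

print(f"Y' = {b:.3f}X + {c:.3f}; prediction for X = 6: {predict(6):.2f}")
```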
85. Sample Regression Equations

   Boys: % fat = 0.735(Σ skinfolds) + 1
   Girls: % fat = 0.61(Σ skinfolds) + 5
86. Prediction Equation
- Y
- Dependent Variable
- Criterion
- X
- Independent Variable
- Predictor
87. Standard Error of Estimate (SEE)
- Predicted score = Y'
- Actual score = Y
- Y' will not equal Y unless rxy = 1
- When rxy ≠ 1, there is prediction error
- The standard deviation of this error = SEE
- SEE = Sy √(1 - r²)
88. Standard Error of Estimate (SEE)
- Expect to find the subject's actual score (Y) within the boundaries:
- Y' ± Z(SEE)
- Y' ± 1.00(SEE) 68% of the time
- Y' ± 1.96(SEE) 95% of the time
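A brief sketch of SEE = Sy·√(1 - r²) and the prediction boundaries described above, continuing the same illustrative (hypothetical) paired scores; the predicted score used for the boundaries is an example value.

```python
import statistics as st
import numpy as np

# Illustrative paired scores (hypothetical data).
x = [2, 4, 5, 7, 9]
y = [3, 5, 4, 8, 10]

r = np.corrcoef(x, y)[0, 1]
sy = st.stdev(y)

see = sy * (1 - r ** 2) ** 0.5             # SEE = Sy * sqrt(1 - r^2)

y_pred = 7.0                                # some predicted score Y' (example value)
low_68, high_68 = y_pred - 1.00 * see, y_pred + 1.00 * see   # ~68% boundaries
low_95, high_95 = y_pred - 1.96 * see, y_pred + 1.96 * see   # ~95% boundaries

print(f"SEE = {see:.2f}")
print(f"68% of actual scores expected between {low_68:.2f} and {high_68:.2f}")
print(f"95% of actual scores expected between {low_95:.2f} and {high_95:.2f}")
```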
89. Standard Error of Estimate (SEE)
- Our best index of prediction accuracy.
- The equation with the lowest SEE is the most accurate.
90. Other important measures
- R = correlation between Y and Y'
- Ranges between 0 and 1.00
- An index of prediction accuracy
- R² = coefficient of determination
- Proportion of variance in the criterion (Y scores) explained by the predictor (X scores)
- An index of prediction accuracy
91. Cross-validation
- Testing the prediction equation on a second group of subjects similar to the first group.
- When cross-validating, use the following formula to find SEE:
- SEE = √[ Σ(Y - Y')² / N ]
- This is also called Total Error.
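A short sketch of the total-error (cross-validation SEE) formula above, √[ Σ(Y - Y')² / N ], applied to hypothetical actual and predicted scores for a second group.

```python
from math import sqrt

# Hypothetical actual (Y) and predicted (Y') scores for a cross-validation group.
actual    = [12.0, 15.5, 9.8, 20.1, 14.2]
predicted = [11.4, 16.0, 10.5, 18.9, 15.0]

n = len(actual)
total_error = sqrt(sum((y - yp) ** 2 for y, yp in zip(actual, predicted)) / n)
print(f"Total error (cross-validation SEE) = {total_error:.2f}")
```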
92. Multiple Regression
- Predict the criterion (Y) using several predictors (X1, X2, X3, etc.).
- The basic multiple regression equation has one intercept (c) and several b's (one for each predictor variable).
- Y' = b1X1 + b2X2 + b3X3 + c
- Important measures: R, R², SEE
93. Sample Multiple Regression Equation

   VO2max = 56.363 + 1.921(SRPA) - 0.381(age) - 0.754(BMI) + 10.987(sex)

   SRPA = self-reported physical activity
   BMI = weight (kg) ÷ height (m²)
   sex: F = 0, M = 1
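A small Python function wrapping the sample multiple regression equation above, with the coefficients exactly as given on the slide; the function name and the example input values are my own illustrations.

```python
def predicted_vo2max(srpa, age, bmi, sex):
    """VO2max = 56.363 + 1.921(SRPA) - 0.381(age) - 0.754(BMI) + 10.987(sex)

    srpa: self-reported physical activity rating
    age:  years
    bmi:  weight (kg) / height (m)^2
    sex:  0 = female, 1 = male
    """
    return 56.363 + 1.921 * srpa - 0.381 * age - 0.754 * bmi + 10.987 * sex

# Example (made-up inputs): a 30-year-old male with SRPA = 5 and BMI = 24.
print(round(predicted_vo2max(srpa=5, age=30, bmi=24, sex=1), 1))
```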
94. Sample SPSS Regression Output
95. Additional Statistics
- t-Tests
- Used to compare two means.
- Is one mean significantly higher than another mean?
- This is sometimes used to demonstrate known-groups evidence of validity.
96. t-Tests
- t-test for one group
- t-test for two independent groups
- t-test for two dependent groups
97. t-Test for one group
- Used to compare one sample mean to a hypothesized population mean.
- t = (mean - µ) ÷ (s / √n)
- The denominator (s / √n) is called the standard error of the mean.
- Degrees of freedom (df) = n - 1
98. t-Test Interpretation
- If the calculated t-statistic is ≥ the critical value from a table, reject the null hypothesis, or
- If the p-value is ≤ .05 on the computer printout, then reject the null hypothesis.
- If the null hypothesis is rejected, the means are considered to be significantly different.
99. t-Test for One Group Example
- Skinfolds: n = 81, sample mean = 27, s = 14
- Hypothesized population mean (µ) = 32
- Standard error of the mean = s / √n
- SEmean = 14 / √81 = 1.56
- t = (mean - µ) ÷ (s / √n)
- t = (27 - 32) ÷ 1.56 = -3.21
- tcritical(.05) = 2.00, so reject the null hypothesis, or
- p < .05, so reject the null hypothesis.
100. t-Test Interpretation
- If the calculated t-statistic is ≥ the critical value (in absolute value), reject the null hypothesis.
- |-3.21| ≥ 2.00, so reject the null hypothesis.
- If the null hypothesis is rejected, the means are considered to be significantly different.
101. t-Test for Two Independent Groups
- Independent groups means the subjects in one group are not related to (independent of) the subjects in the other group.
- t = (mean1 - mean2) ÷ √(SEmean1² + SEmean2²)
- SEmean1 = s1 / √n1, SEmean2 = s2 / √n2
- df = n1 + n2 - 2
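A brief sketch of an independent-groups t-test on hypothetical scores, assuming SciPy is available; scipy.stats.ttest_ind returns the t statistic and p-value for two unrelated groups.

```python
from scipy import stats

# Hypothetical scores for two independent groups.
group1 = [12, 15, 14, 10, 13, 16, 11]
group2 = [18, 17, 15, 20, 16, 19, 14]

t_stat, p_value = stats.ttest_ind(group1, group2)
df = len(group1) + len(group2) - 2

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
# If p <= .05, the two group means are considered significantly different.
```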
102. Sample SPSS Independent Groups t-Test
103. t-Test for Two Dependent Groups
- Dependent groups means the groups are correlated, paired, or matched in some fashion, or that you have the same subjects in both groups (e.g., pretest vs. posttest).
- t = (mean1 - mean2) ÷ √(SEmean1² + SEmean2² - 2(r)(SEmean1)(SEmean2))
- df = n - 1
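A matching sketch for dependent (paired) groups on hypothetical pretest/posttest scores, assuming SciPy; scipy.stats.ttest_rel handles the paired case (df = n - 1).

```python
from scipy import stats

# Hypothetical pretest and posttest scores for the same subjects.
pretest  = [22, 25, 19, 30, 27, 24]
posttest = [26, 27, 22, 33, 29, 27]

t_stat, p_value = stats.ttest_rel(pretest, posttest)
df = len(pretest) - 1

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
# If p <= .05, the pretest and posttest means are considered significantly different.
```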
104. Sample SPSS Dependent Groups t-Test
105. One-way ANOVA
- Used to compare means when there are two or more groups.
106. Two-way Repeated Measures ANOVA
- Used to compare means when two or more measures are taken on each person.
107. Formative Evaluation of Chapter Objectives
- Select the statistical technique that is correct for a given situation.
- Calculate accurately with the formulas presented.
- Interpret calculated statistical values.
- Make decisions based on available information about a given situation.
- Use a personal computer to analyze data.
108. Chapter 2: Statistical Tools in Evaluation