Title: EDUCATIONAL STATISTICS
1EDUCATIONAL STATISTICS
EDU5950
Consider the following data sets: A = 5, 7, 9 and
B = 3, 7, 11. Both have a mean of 7, but the
scores in B are more spread out. There are other
situations in which knowing the average leaves
you without enough information. You're an
elementary school teacher who has been assigned a
class of fifth graders whose mean IQ is 115, well
above the IQ of 100 that is the average for the
general population. Because children with IQs of
115 can handle more complex, abstract material,
you plan many sophisticated projects for the
year. But consider variability: if the
variability of the IQs in the class is small,
your projects will probably succeed. If the
variability is large, the projects will be too
complicated for some, but for others, even these
projects will not be challenging enough.
WK03
The range of a quantitative variable is the
highest score minus the lowest score. Because the
range depends on only the two extreme scores, it
is a quickly calculated, easy-to-understand
statistic that is just fine in some situations.
The next measure of variability, the
interquartile range, tells the range of scores
that make up the middle 50 percent of the
distribution. To find the interquartile range,
you need the 25th percentile score and the 75th
percentile score; the interquartile range is the
75th percentile minus the 25th percentile.
Score   f
31      1
30      3
29      5
28      4
27      5
26      4
25      6
24      7
23      3
22      2
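The range and interquartile range for the frequency table above can be worked out in a few lines of Python. This is a minimal sketch, assuming the simple counting method (count up from the lowest score until 25 or 75 percent of the N scores have been reached); other percentile conventions interpolate between scores and may give slightly different values. The helper name `percentile_score` is hypothetical.

```python
# Frequency distribution from the slide: score -> frequency
freq = {31: 1, 30: 3, 29: 5, 28: 4, 27: 5, 26: 4, 25: 6, 24: 7, 23: 3, 22: 2}


def percentile_score(freq, proportion):
    """Return the score at which the cumulative count first reaches proportion * N.

    Assumption: simple counting method, no interpolation between scores.
    """
    n = sum(freq.values())
    target = proportion * n
    cumulative = 0
    for score in sorted(freq):  # count up from the lowest score
        cumulative += freq[score]
        if cumulative >= target:
            return score
    raise ValueError("proportion must be between 0 and 1")


p25 = percentile_score(freq, 0.25)  # 10th of 40 scores from the bottom
p75 = percentile_score(freq, 0.75)  # 30th of 40 scores from the bottom
iqr = p75 - p25
value_range = max(freq) - min(freq)

print(p25, p75, iqr, value_range)  # 24 28 4 9
```

The range (9) uses only the two extreme scores, while the interquartile range (4) describes the spread of the middle 50 percent.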
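The standard deviation is the most widely used
measure of variability because it serves both to
describe the variability of a set of scores and
to estimate the variability of a population.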
Both $\sigma$ and $S$ are used to describe the
variability of a set of data, and both can be
calculated with two similar formulas: the
deviation-score formula and the raw-score
formula. A deviation score is a raw score minus
the mean, either $X - \bar{X}$ or $X - \mu$. Raw
scores that are greater than the mean have
positive deviation scores, raw scores that are
less than the mean have negative deviation
scores, and raw scores that are equal to the mean
have a deviation score of zero.
A deviation score tells you the number of points
that a particular score deviates from, or differs
from, the mean. In the table in slide 2, the
$X - \bar{X}$ value for Azlan tells you that
Azlan scored six points above the mean. Mariam,
0, scored at the mean, and Fauzi, -5, scored five
points below the mean. The deviation-score
formula for computing the standard deviation as a
descriptive index is

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} \qquad \text{or} \qquad S = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

where
$\sigma$ = standard deviation of a population
$S$ = standard deviation of a sample
$N$ = number of scores (same as the number of
deviations)
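The deviation-score calculation can be traced step by step in Python. This is a minimal sketch that applies the standard deviation-score formula (divide the sum of squared deviations by N, then take the square root) to data sets A and B from the first slide; the function name `population_sd` is hypothetical.

```python
import math


def population_sd(scores):
    """Deviation-score formula: sqrt(sum of squared deviations / N)."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]  # each raw score minus the mean
    return math.sqrt(sum(d ** 2 for d in deviations) / n)


data_a = [5, 7, 9]   # data set A from the first slide
data_b = [3, 7, 11]  # data set B from the first slide

sd_a = population_sd(data_a)
sd_b = population_sd(data_b)

print(round(sd_a, 2), round(sd_b, 2))  # 1.63 3.27
```

Both data sets have a mean of 7, but the standard deviation of B is twice that of A, which is exactly the information the mean alone leaves out.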
Table 2 presents data on boxes of cookies sold by
six Girl Scouts. Let's define these six sales
reports as a population; thus, the standard
deviation is $\sigma$.
The result is $\sigma = 8.66$ boxes. What does
$\sigma = 8.66$ mean? How does it help your
understanding? The 8.66 boxes is a measure of the
variability in the number of boxes the six Girl
Scouts sold. If $\sigma$ were zero, you would
know that each girl sold the same number of
boxes. The closer $\sigma$ is to zero, the more
confidence you can have in predicting that the
number of boxes any girl sold was equal to the
mean of the group. Conversely, the further
$\sigma$ is from zero, the less confidence you
have. With $\sigma = 8.66$ and $\mu = 10$, you
know that the girls varied a great deal in cookie
sales. The deviation-score formula helps you
understand what is actually going on when you
calculate a standard deviation.
Unfortunately, the deviation-score formula
almost always has you working with decimal
values. The raw-score formula involves far fewer
decimals. The raw-score formula is

$$\sigma \text{ or } S = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$$

where
$\sum X^2$ = sum of the squared scores
$(\sum X)^2$ = square of the sum of the raw scores
$N$ = number of scores
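The raw-score formula can be checked against the deviation-score formula in Python. This is a minimal sketch. The six sales figures below are an assumption: the slides' Table 2 is not reproduced in this text, so illustrative values were chosen that happen to give a mean of 10 and a standard deviation of 8.66, matching the results reported for the Girl Scout data.

```python
import math

# ASSUMPTION: illustrative values, not copied from Table 2.
boxes = [28, 11, 10, 5, 4, 2]

n = len(boxes)
sum_x = sum(boxes)                          # sum of the raw scores
sum_x_squared = sum(x ** 2 for x in boxes)  # sum of the squared scores

# Raw-score formula: no deviation scores needed.
sigma_raw = math.sqrt((sum_x_squared - sum_x ** 2 / n) / n)

# Deviation-score formula, for comparison.
mean = sum_x / n
sigma_dev = math.sqrt(sum((x - mean) ** 2 for x in boxes) / n)

print(mean, round(sigma_raw, 2), round(sigma_dev, 2))  # 10.0 8.66 8.66
```

The two formulas are algebraically equivalent, so they agree; the raw-score version simply avoids computing each deviation.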
As we discussed earlier, the purpose of using a
sample is usually to find out something about a
population; that is, a statistic from a sample is
used to estimate a parameter of the population.
Here, $\hat{s}$ serves as an estimate of
$\sigma$. If you have sample data and you want to
calculate an estimate of $\sigma$, you should use
the statistic $\hat{s}$. The raw-score formula
for calculating $\hat{s}$ is

$$\hat{s} = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N - 1}}$$
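The effect of estimating from a sample can be seen in a short Python sketch. It assumes the usual estimator, which divides by N - 1 rather than N, and treats data set B from the first slide (3, 7, 11) as a sample. Python's `statistics.stdev` uses the same N - 1 divisor, so it serves as a cross-check; the variable names are hypothetical.

```python
import math
import statistics

sample = [3, 7, 11]  # data set B from the first slide, treated as a sample

n = len(sample)
sum_x = sum(sample)                          # 21
sum_x_squared = sum(x ** 2 for x in sample)  # 179

numerator = sum_x_squared - sum_x ** 2 / n   # same numerator in both formulas

s_hat = math.sqrt(numerator / (n - 1))       # estimate of the population value
s_descriptive = math.sqrt(numerator / n)     # describes the sample itself

print(s_hat, round(s_descriptive, 2))        # 4.0 3.27
print(statistics.stdev(sample))              # 4.0
```

Dividing by N - 1 always gives a slightly larger value than dividing by N, and the gap shrinks as N grows.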
The last step in calculating a standard deviation
is to find a square root. The number you take the
square root of is the variance. The symbols for
the variance are $\sigma^2$ (population variance)
and $\hat{s}^2$ (sample variance used to estimate
the population variance).
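The relationship between variance and standard deviation can be confirmed with Python's standard `statistics` module. This is a small sketch using data set A from the first slide (5, 7, 9); `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by N - 1.

```python
import statistics

scores = [5, 7, 9]  # data set A from the first slide

population_variance = statistics.pvariance(scores)  # divides by N
estimated_variance = statistics.variance(scores)    # divides by N - 1

population_sd = statistics.pstdev(scores)
estimated_sd = statistics.stdev(scores)

# In both cases the standard deviation is the square root of the variance.
print(round(population_variance, 2), round(population_sd, 2))  # 2.67 1.63
print(estimated_variance, estimated_sd)                        # 4 2.0
```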
The z score combines a raw score with the mean
and standard deviation of a distribution in a way
that tells you the relative standing of the raw
score in the distribution. A z-score description
works regardless of the kind of scores you are
using or the shape of the distribution. Suppose
Fauzi says she got a 95 on a statistics exam.
What does that tell you about her statistical
ability? A score gets its meaning from its
relation to the mean and the variability of the
other scores in the distribution. A z score is a
mathematical way to modify an individual raw
score so that the result conveys the score's
relationship to the mean and standard deviation
of its fellow scores. The formula is

$$z = \frac{X - \bar{X}}{S}$$
Remember that $X - \bar{X}$ is an acquaintance of
yours, the deviation score. A z score describes
the relation of $X$ to $\bar{X}$ with respect to
the variability of the distribution. For
instance, if you know that a score ($X$) is 5
units above the mean ($X - \bar{X} = 5$), you
know only that the score is better than average;
you have no idea how far above average it is
relative to the other scores. If the distribution
has a range of 10 units and $\bar{X} = 50$, then
an $X$ of 55 is a very high score. On the other
hand, if the distribution has a range of 100
units, an $X$ of 55 is barely above average.
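The same point can be made with z scores in Python. This is a minimal sketch: the score of 55 and the mean of 50 come from the example above, but the two standard deviations (2 and 20) are hypothetical values chosen to stand for a compact and a widely spread distribution.

```python
def z_score(x, mean, sd):
    """Deviation score expressed in standard deviation units."""
    return (x - mean) / sd


# ASSUMPTION: hypothetical standard deviations for the two distributions.
z_compact = z_score(55, mean=50, sd=2)   # little variability
z_spread = z_score(55, mean=50, sd=20)   # a lot of variability

print(z_compact, z_spread)  # 2.5 0.25
```

The deviation score is 5 in both cases, yet the score sits 2.5 standard deviations above the mean in the compact distribution and only a quarter of a standard deviation above it in the spread-out one.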
To know the score's position in a distribution,
the variability of the distribution must be taken
into account (divide $X - \bar{X}$ by a unit that
measures the variability, the standard
deviation). The z score is sometimes referred to
as a standard score because it is a deviation
score expressed in standard deviation units.
Any distribution of raw scores can be converted
into a distribution of z scores. For each raw
score, there is one z score. Positive z scores
represent raw scores that are greater than the
mean; negative z scores go with raw scores that
are less than the mean. In both cases, the
absolute value of the z score tells the number of
standard deviations the score is from the mean.
Converting a raw score to a z score gives you a
number that indicates the raw score's relative
position in the distribution. If two raw scores
are converted to z scores, you will know their
positions relative to each other as well as to
the distribution.
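Converting a whole distribution to z scores can be sketched in Python. The example uses data sets A and B from the first slide and the descriptive standard deviation (dividing by N, via `statistics.pstdev`); the function name `to_z_scores` is hypothetical.

```python
import statistics


def to_z_scores(scores):
    """Convert every raw score to a z score: (X - mean) / S."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # descriptive S, divides by N
    return [(x - mean) / sd for x in scores]


data_a = [5, 7, 9]   # data set A from the first slide
data_b = [3, 7, 11]  # data set B from the first slide

z_a = to_z_scores(data_a)
z_b = to_z_scores(data_b)

print([round(z, 2) for z in z_a])  # [-1.22, 0.0, 1.22]
print([round(z, 2) for z in z_b])  # [-1.22, 0.0, 1.22]
```

Although B is twice as spread out as A in raw-score units, the two sets of z scores are identical, because each score is described only by its position relative to its own mean and standard deviation. A distribution of z scores always has a mean of 0 and a standard deviation of 1.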
Z scores are also used to compare two scores from
different distributions, even when the scores are
measuring different things. Let's say one of
your lecturers returned tests with a z score
rather than a percentage score. This z score was
the key to figuring out your grade: a z score of
1.50 or higher was an A, and -1.50 or lower was
an F. Table 1 lists the raw scores (percentage
correct) and z scores for four of the many
students who took two of the tests in that class.
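The grading rule can be sketched in Python to show why the same raw score may earn different grades on different tests. This is an illustration only: the A and F cutoffs come from the slide, but the test means and standard deviations are hypothetical (Table 1 is not reproduced in this text), and the slide does not say how z scores between the cutoffs were graded.

```python
def z_score(x, mean, sd):
    """Deviation score expressed in standard deviation units."""
    return (x - mean) / sd


def letter_for(z):
    """Cutoffs from the slide; only the A and F boundaries are specified."""
    if z >= 1.50:
        return "A"
    if z <= -1.50:
        return "F"
    return "between A and F"


# ASSUMPTION: hypothetical statistics for two tests in the same class.
test_1 = {"mean": 70, "sd": 10}
test_2 = {"mean": 85, "sd": 4}

z_1 = z_score(85, **test_1)  # 85 percent correct on a hard test
z_2 = z_score(85, **test_2)  # 85 percent correct on an easy test

print(z_1, letter_for(z_1))  # 1.5 A
print(z_2, letter_for(z_2))  # 0.0 between A and F
```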
Z scores give you a way to compare raw scores.
The basis of the comparison is the distribution
itself rather than some external standard (such
as a grading scale of 90-80-70-60 percent for
A's, B's, and so on).

A boxplot (box-and-whisker plot) is a graphic
with information on one variable, much like a
frequency polygon. A boxplot gives you the mean,
median, range, interquartile range, and skew of
the distribution with just one picture.
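Several of the values a boxplot displays can be computed directly before any drawing is done. This is a minimal sketch using the score frequency distribution from the interquartile range slide; it computes only the mean, median, and extremes, and the reading of skew from the mean and median is a rough rule of thumb rather than a formal test.

```python
import statistics

# Frequency distribution from the interquartile range slide: score -> frequency
freq = {31: 1, 30: 3, 29: 5, 28: 4, 27: 5, 26: 4, 25: 6, 24: 7, 23: 3, 22: 2}

# Expand the table into the 40 individual scores.
scores = [score for score, count in freq.items() for _ in range(count)]

mean = statistics.mean(scores)
median = statistics.median(scores)
lowest, highest = min(scores), max(scores)

print(len(scores), mean, median, lowest, highest)  # 40 26.2 26.0 22 31
```

The whiskers would end at 22 and 31, and the mean sits just above the median, which hints at a slight positive skew.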
- Let's examine the boxplots of four distributions.
- Questions
- Which distribution has positive skew?
- Which distribution is the most compact?
- Which distribution has a mean closest to 40?
- Which distribution is most symmetrical?
- Which distribution has a median closest to 50?
- Which distribution is most negatively skewed?
- Which distribution has the largest range?
The effect size index is symbolized by d. An
effect size index gives you a way to describe the
size of the difference between two distributions:
if you know that an independent variable makes a
difference in the scores on the dependent
variable, then d tells you how much of a
difference the independent variable makes. Does
the statement "On average, women's verbal scores
are higher than men's" leave you with a question?
An effect size index is the statistician's way
to answer the question "How much difference is
there?" The effect size index is

$$d = \frac{\mu_1 - \mu_2}{\sigma}$$

where $\mu_1$ and $\mu_2$ are the means of the
two distributions and $\sigma$ is the standard
deviation of the scores.
- A descriptive statistics report gives you a fairly
complete story of the data because it helps you
1) understand the data better, 2) communicate with
others who use statistics, and 3) persuade others.
- The most interesting descriptive statistics
reports are those that compare two or more
distributions of scores.
- A better way to complete a report on two groups:
- Construct boxplots
- Find the effect size
- Tell the story that the data reveal
- As for telling the story, these points are suggested:
- Form of the distributions
- Central tendency
- Overlap of the two distributions
- Interpretation of the effect size index
- Of course, you should arrange the points so that
your story is told well.
A descriptive statistics report on the heights of
women and men: The graph shows boxplots of the
heights of women and men. The difference in means
produces an effect size index of 1.86.
The mean and median height of women is
64.6 inches; the mean and median height of men is
69.8 inches. The difference is about 5 inches.
Although there is some overlap of the two
distributions, the middle 50 percent of the men
are all taller than the middle 50 percent of the
women. This difference between the two
distributions is reflected by an effect size
index of 1.86, a value that is very large (more
than twice 0.80, a value that is traditionally
considered large). In each distribution, the
heights are distributed symmetrically.
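The reported effect size can be roughly reproduced in Python, taking d as the difference between the two means divided by a standard deviation. This is a sketch under a stated assumption: the report gives the two means and d but not the standard deviation, so the value of 2.8 inches below is an assumption, chosen because it is consistent with the reported d of 1.86. The function name `effect_size` is hypothetical.

```python
def effect_size(mean_1, mean_2, sd):
    """Effect size index d: difference between means in standard deviation units."""
    return (mean_1 - mean_2) / sd


mean_men = 69.8    # inches, from the report
mean_women = 64.6  # inches, from the report

# ASSUMPTION: the report does not state the standard deviation.
assumed_sd = 2.8   # inches

d = effect_size(mean_men, mean_women, assumed_sd)

print(round(d, 2))   # 1.86
print(d > 2 * 0.80)  # True: more than twice the conventional "large" value
```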