Title: Scales, Transformations, Norms
1- Scales, Transformations, Norms
2- Norms
- Norm-Referenced Test: one of the most useful ways of describing a person's performance on a test is to compare his/her test score to the test scores of some other person or group of people.
- Norms are average scores computed for a large representative sample of the population.
- The arithmetic average (mean) is used to judge whether a score on the scale is above or below the average relative to the population of interest.
- A representative sample is required to ensure meaningful comparisons are made.
3- Norms (cont.)
- No single population can be regarded as the normative group.
- A representative sample is required to ensure meaningful comparisons are made.
- When norms are collected from the test performance of groups of people, these reference groups are labeled normative or standardized samples.
4- Norms (cont.)
- The sample selected as the normative group depends on the particular research question.
- The normative sample selected must be representative of the examinee and of the research question to be answered in order for meaningful comparisons to be made.
- For example, a test measuring attitudes towards federalism whose norm group consists only of students in the province of Quebec might be very useful for regional interpretation in Quebec; however, its generalizability to other parts of the country (Yukon, Toronto, Ontario) would be suspect.
5- Sample Groups
Although the three terms below are often used interchangeably, they are different.
Standardized Sample - the group of individuals on whom the test is standardized in terms of scoring procedures, administration procedures, and development of the test's norms (e.g., the sample used in the technical manual).
Normative Sample - can refer to any group from which norms are gathered (e.g., norms collected after the test is published).
Reference Group - any group of people against which test scores are compared (e.g., a designated group such as students in 3090.03 or World Champions).
6- Types of Norms
- Norms can be Developed
- Locally
- Regionally
- Nationally
- Normative Data Can be Expressed By
- Percentile Ranks
- Age Norms
- Grade Norms
7- Local Norms
- Test users may wish to evaluate scores on the basis of reference groups drawn from a specific geographic or institutional setting.
- For example, norms can be created for the employees of a particular company or the students of a certain university.
- Regional and national norms examine much broader groups.
8- Subgroup Norms
- When large samples are gathered to represent broadly defined populations, norms can be reported in aggregate or can be separated into subgroup norms.
- Provided that subgroups are of sufficient size and fairly representative of their categories, they can be formed in terms of:
- - Age
- - Sex
- - Occupation
- - Education Level
- - Any other variable that may have a significant impact on test scores or yield comparisons of interest.
9- Percentile Ranks
- Percentile ranks are the most common form of norms and the simplest method of presenting test data for comparative purposes.
- The percentile rank represents the percentage of the norm group that earned a raw score less than or equal to the score of that particular individual.
- For example, a score at the 50th percentile indicates that the individual did as well as or better than 50% of the norm group.
- When a test score is compared to several different norm groups, percentile ranks may change.
- For example, a percentile rank on a mathematical reasoning test may be lower when the score is compared to math graduate students than when it is compared to music students.
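The percentile rank definition on this slide (percentage of the norm group scoring at or below a given score) can be sketched in a few lines of Python. The function name `percentile_rank` and both norm groups are hypothetical; the scores are invented purely to show how the same raw score can rank differently against different groups.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group with a raw score <= `score`."""
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


# Hypothetical norm groups (invented scores, for illustration only)
math_students = [70, 75, 80, 85, 90, 92, 95, 97, 98, 99]
music_students = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

score = 80
print("vs. math students: ", percentile_rank(score, math_students))
print("vs. music students:", percentile_rank(score, music_students))
```

With these made-up groups, the same raw score of 80 earns a much lower percentile rank against the stronger comparison group.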
10- Age Norms
- A method of describing scores in terms of the average or typical age of the respondents achieving a specific test score.
- Age norms can be developed for any characteristic that changes systematically with age.
- In establishing age norms, we need to obtain a representative sample at each of several ages and measure the particular age-related characteristic in each of these samples.
- It is important to remember that there is considerable variability within the same age, which means that some children at one age will perform similarly to children at other ages.
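As a rough sketch of how an age-norm table might be built and used, the snippet below averages the scores of a sample at each age and then looks up the age whose typical score is closest to a given raw score. The data, the name `age_equivalent`, and the "closest mean" lookup rule are all assumptions made for illustration; published tests typically use more careful interpolation.

```python
from statistics import mean

# Hypothetical raw scores from a sample at each age (invented data)
samples_by_age = {
    6: [10, 12, 14],
    7: [15, 17, 19],
    8: [20, 22, 24],
}

# Age-norm table: the typical (mean) score at each age
age_norms = {age: mean(scores) for age, scores in samples_by_age.items()}


def age_equivalent(score, norms):
    """Return the age whose mean score is closest to `score`."""
    return min(norms, key=lambda age: abs(norms[age] - score))


print(age_norms)
print(age_equivalent(18, age_norms))
```

Note that the invented samples overlap in spirit with the caution above: an individual child can easily score nearer to the mean of an adjacent age group than to the mean of his/her own.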
11- Grade Norms
- Most commonly used in school settings.
- Similar to age norms except the baseline is
grade level rather than age.
- It is important to remember that there is considerable variability among individuals within the same grade, which means that some children in one grade will perform similarly to or below children in other grades.
- One needs to be extremely careful when interpreting grade norms not to fall into the trap of saying that, just because a child obtains a certain grade equivalent on a particular test, he/she is at that grade level in all areas.
12- Evaluating Suitability of a Normative Sample
- How large is the normative sample?
- When was the sample gathered?
- Where was the sample gathered?
- How were individuals identified and selected?
- What was the composition of the normative sample?
- - age, sex, ethnicity, education level, socioeconomic status
13- Caution When Interpreting Norms
- Norms may not be based on samples that adequately represent the type of population to which the examinee's scores are compared.
- Normative data can become outdated very quickly.
- The size of the sample taken may be inadequate.
14- Setting Standards/Cutoffs
- Rather than finding out how you stand compared to others, it might be useful to compare your performance on a test to some external standard.
For Example - if most people in class get an F on a test and you get a D, your performance in comparison to the normative group is good. However, overall your score is not good.
Criterion-Referenced Tests - assess your performance against some set of standards (e.g., school tests, Olympics).
Cutoff Scores - 1 SD?, 2 SD?
15- Raw Scores
Raw scores are computed for instruments using Likert scales (interval or ordinal) by assigning scores to responses and totaling the scores of the items.
- For positively phrased items, e.g., "I think things will turn out right": 5 = Always, 4 = Often, 3 = Sometimes, 2 = Seldom, 1 = Never
- For negatively phrased items, the scoring is reversed: 1 = Always, 2 = Often, 3 = Sometimes, 4 = Seldom, 5 = Never
The raw score is the sum of the scores for the pertinent items. The problem with raw scores is that they are fairly meaningless without some sort of benchmark with which to make a comparison (e.g., what would a raw score of 30 on an Optimism scale mean?).
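The scoring procedure can be sketched as follows, assuming the second key on the slide (1 = Always through 5 = Never) is the reverse scoring used for negatively phrased items. The item identifiers (`q1`, `q2`, `q3`), the choice of which item is reverse-keyed, and the function name `raw_score` are hypothetical.

```python
# Forward key from the slide: 5 = Always ... 1 = Never
FORWARD_KEY = {"Always": 5, "Often": 4, "Sometimes": 3, "Seldom": 2, "Never": 1}
# Reverse key (assumed to apply to negatively phrased items): 1 = Always ... 5 = Never
REVERSE_KEY = {label: 6 - value for label, value in FORWARD_KEY.items()}


def raw_score(responses, reverse_items=()):
    """Sum item scores, using the reverse key for items in `reverse_items`."""
    total = 0
    for item, answer in responses.items():
        key = REVERSE_KEY if item in reverse_items else FORWARD_KEY
        total += key[answer]
    return total


# Hypothetical answers to a three-item scale; q3 is treated as reverse-keyed
responses = {"q1": "Always", "q2": "Often", "q3": "Never"}
total = raw_score(responses, reverse_items={"q3"})
print(total)
```

The resulting total is still only a raw score; it says nothing about standing until it is compared against some benchmark.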
16- Transformations
- Raw scores (i.e., simple counts of behaviour sampled by a measuring procedure) do not always provide useful information.
- It is often necessary to re-express, or transform, raw scores into some more informative scale.
- The simplest form of transformation is changing raw scores to percentages.
- For example, if a student answers 35 questions out of 50 correctly on a test, that student's score could be re-expressed as a score of 70%.
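Written out, the percentage transformation for this example is simply the number correct divided by the number of items, times 100:

```latex
\text{percentage} = \frac{\text{number correct}}{\text{number of items}} \times 100
                  = \frac{35}{50} \times 100 = 70\%
```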
17- Linear Transformations
- A linear transformation changes the units of measurement while leaving the interrelationships among scores unaltered.
- An advantage of this procedure is that the normally distributed scores of tests with different means and score ranges can be meaningfully compared and averaged.
- The most familiar linear transformation is the z score.
18- Standard Scores
Standard scores allow each obtained score to be compared to the same reference value. This facilitates comparison between obtained scores and the scores of other individuals (i.e., the normative sample), as well as comparison among the various scales and instruments. Standard scores are calculated from raw scores such that each scale and subscale will have the same mean (or average) score and standard deviation. For example, IQ scores are transformed so that the average score is 100, with an SD of 15.
19- Z Scores
- A z-score tells how many standard deviations someone is above or below the mean. Simply put, the mean of the distribution is given the z value of zero (0), and each standard deviation is counted in units of one.
- A z-score of -1.4 indicates that someone is 1.4 standard deviations below the mean. In a normal distribution, someone in that position would have done as well as or better than about 8% of the students who took the test.
- To calculate a z-score, subtract the mean from the raw score and divide the result by the standard deviation (e.g., raw score 15, mean 10, standard deviation 4: 15 minus 10 equals 5, and 5 divided by 4 equals 1.25, so the z-score is 1.25).
- Z scores can have negative values, which can be difficult to explain to test users. How can you explain to an examinee that his z score is -1.5? For this reason it is often convenient to perform a linear transformation on z-scores to convert them to values that are easier to record or explain. The general form of such a transformation is: transformed score = constant + (weight × z score), where the constant becomes the new mean and the weight becomes the new standard deviation.
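The z-score arithmetic on this slide can be checked with a short Python sketch. The function name `z_score` is hypothetical; `statistics.NormalDist` from the standard library is used to look up the area below z = -1.4, on the assumption that the scores are normally distributed.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Number of standard deviations that `raw` lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Worked example from the slide: raw score 15, mean 10, standard deviation 4
z = z_score(15, 10, 4)
print(z)

# Percentage of a normal distribution falling at or below z = -1.4
pct_below = NormalDist().cdf(-1.4) * 100
print(round(pct_below, 1))
```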
20- T Scores
- T-scores (or standardized scores) are a conversion (transformation) of raw individual scores into a standard form, where the conversion is made without knowledge of the population's mean and standard deviation.
- The scale has a mean set at 50 and a standard deviation of 10.
- T = 50 + (10 × z score)
- An advantage of using T-scores is that, in practice, none of the scores are negative (a negative T-score would require a z score below -5).
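A minimal sketch of the T-score conversion, written as a general rescaling of z scores so that the same function also covers the IQ-style scale (mean 100, SD 15) mentioned earlier in the deck. The function name `rescale_z` and its parameter names are hypothetical.

```python
def rescale_z(z, new_mean=50, new_sd=10):
    """Linear transformation of a z score; the defaults give a T-score."""
    return new_mean + new_sd * z


print(rescale_z(1.25))            # T-score for z = 1.25
print(rescale_z(-1.5))            # T-score for z = -1.5
print(rescale_z(1.25, 100, 15))   # same z on a mean-100, SD-15 scale
```

Because the transformation is linear, the ordering of examinees and the relative distances between them are the same on every one of these scales; only the units change.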
21- Area Transformations
- Area transformations do more than simply put scores on a new and more convenient scale; they change the point of reference.
- Area transformations adjust the mean and standard deviation of the distribution into convenient units.
- The advantages of area transformations are clear. Out of the infinite number of possible empirical distributions of test scores, the normal distribution is the most frequently assumed and approximated. It is also the most frequently studied, in considerably greater detail than other possible test score distributions.
- Normalization thus allows the application of knowledge concerning the properties of the standard normal distribution toward the interpretation of the obtained scores.
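One common way to carry out such a normalization, sketched here under stated assumptions, is to convert each score to its cumulative proportion in the norm group and then look up the z value that cuts off the same area under the standard normal curve. The function name `normalized_z` is hypothetical, and the mid-rank convention (counting half of the tied scores) is an assumption chosen so that the proportion never reaches exactly 0 or 1 for a score in the norm group; it is not the only possible convention.

```python
from statistics import NormalDist


def normalized_z(score, norm_scores):
    """Map a score to the z value with the same cumulative area under the normal curve."""
    n = len(norm_scores)
    if n == 0:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    p = (below + 0.5 * equal) / n  # mid-rank cumulative proportion (assumed convention)
    if not 0 < p < 1:
        raise ValueError("score lies outside the range of the norm group")
    return NormalDist().inv_cdf(p)


# Hypothetical, strongly skewed norm group (invented data)
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
print(round(normalized_z(50, scores), 3))
print(round(normalized_z(5, scores), 3))
```

Unlike a linear transformation, this changes the shape of the distribution: the extreme score of 50 is pulled in to a z value that reflects only its rank in the group, not its raw distance from the other scores.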
22- Normal Distribution Curve
- Many human variables fall on a normal or close to normal curve, including IQ, height, weight, lifespan, and shoe size.
- Theoretically, the normal curve is bell shaped with the highest point at its center. The curve is perfectly symmetrical, with no skewness (i.e., no absence of symmetry). If you fold it in half at the mean, both sides are exactly the same.
- From the center, the curve tapers on both sides, approaching the X axis. However, it never touches the X axis. In theory, the distribution of the normal curve ranges from negative infinity to positive infinity.
- Because of these properties, we can estimate how people will compare on specific variables. This is done by knowing the mean and standard deviation.
24- Normal Distribution
The bell-shaped curve has the following properties:
1. bilaterally symmetrical (right and left halves are mirror images)
3. the limits of the curve are plus and minus infinity, so the tails of the curve will never quite touch the baseline
4. about 68% of the total area of the curve lies between one standard deviation below the mean and one standard deviation above the mean
5. about 95% of the total area of the curve lies between two standard deviations below the mean and two standard deviations above the mean
6. about 99.7% of the total area of the curve lies between three standard deviations below the mean and three standard deviations above the mean.
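The areas quoted above can be checked against the standard normal distribution with Python's standard library. The helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Proportion of the normal curve lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {100 * area_within(k):.1f}%")
```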
25- Skewness
- Skewness is the nature and extent to which symmetry is absent.
- Positive Skewness - when relatively few of the scores fall at the high end of the distribution.
- For Example - positively skewed examination results may indicate that a test was too difficult.
- Negative Skewness - when relatively few of the scores fall at the low end of the distribution.
- For Example - negatively skewed examination results may indicate that a test was too easy.
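The direction of skew can be estimated from a set of scores. The sketch below uses one common measure, the average cubed z score (the moment coefficient of skewness); the slides do not say which measure is intended, so this choice is an assumption, and both score sets are invented for illustration.

```python
from statistics import mean, pstdev


def skewness(scores):
    """Moment coefficient of skewness: the mean of the cubed z scores."""
    m = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores must not all be identical")
    return sum(((x - m) / sd) ** 3 for x in scores) / len(scores)


# Hypothetical exam results (percent correct), invented data
hard_test = [20, 25, 25, 30, 30, 30, 35, 40, 60, 90]  # few high scores
easy_test = [100 - x for x in hard_test]               # mirror image: few low scores

print(skewness(hard_test))  # positive value: positively skewed
print(skewness(easy_test))  # negative value: negatively skewed
```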
26- Standard Deviations
The standard deviation represents, roughly, the average distance of each score from the mean.
Use of Standard Deviations with Norms
Knowing the average of a population allows for a determination as to whether a particular respondent scored above or below that average, but does not indicate how far above or below average the score falls. The standard deviation plays a role in this. Scores within 1 SD of the average are pretty much in the middle cluster of the population. Scores between 1 and 2 SDs from the average are moderately above or below the average, and scores more than 2 SDs from the average are markedly above or below the average.
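The rule of thumb above can be sketched as a small classifier. The function name `describe_score` is hypothetical, and how scores falling exactly on the 1 SD and 2 SD boundaries are handled is an assumption; the example values use the mean-100, SD-15 scale mentioned for IQ scores.

```python
def describe_score(raw, mean, sd):
    """Describe a score by its distance from the mean in SD units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (raw - mean) / sd
    if abs(z) <= 1:  # boundary handling is an assumption
        return "middle cluster"
    direction = "above" if z > 0 else "below"
    if abs(z) <= 2:
        return f"moderately {direction} average"
    return f"markedly {direction} average"


for raw in (110, 125, 65):
    print(raw, describe_score(raw, mean=100, sd=15))
```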