Title: UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT
1 UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT
- CHAP 14 ITEM ANALYSIS
- CHAP 15 INTRODUCTION TO ITEM RESPONSE THEORY
- CHAP 16 DETECTING ITEM BIAS
2 CHAPTER 14 ITEM ANALYSIS
- The goal of test construction is to create a test of minimum length with good reliability and validity.
- Item analysis is the computation and examination of any statistical property of an item response distribution.
- Item analysis is a process we go through when constructing a new test, or subtests, from a pool of items, so that the result has good reliability and validity.
3 CHAPTER 14 ITEM ANALYSIS
- Categories of Item Parameters
- Item parameters fall into 3 categories of indices:
- 1. Indices that describe the distribution of responses to a single item (e.g., the mean and variance of item responses).
- 2. Indices that describe the degree of relationship between responses to the item and some criterion of interest.
4 CHAPTER 14 ITEM ANALYSIS
- Ex. The relationship between the questions (items) and the criterion of interest (e.g., depression) in factor analysis.
- 3. Indices that are a function of both the item variance (or mean) and the relationship to a criterion of interest.
- Ex. First, find the variance and mean of your items; then calculate the relationship between the item variances and the criterion of interest (e.g., depression) for two groups.
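The three categories of indices above can be sketched in a few lines of Python. This is only an illustration: the response data, the criterion scores, and the function names are hypothetical, and using "item standard deviation times item-criterion correlation" as the combined index is an assumption chosen for the example, not something stated on the slides.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def variance(values):
    # Population variance (divides by N), adequate for a descriptive item index
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def pearson_r(x, y):
    # Pearson correlation between item responses and criterion scores
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / sqrt(variance(x) * variance(y))

# Hypothetical data: 1 = correct, 0 = incorrect, plus a criterion score per examinee
item = [1, 0, 1, 1, 0, 1, 0, 1]
criterion = [22, 10, 18, 25, 12, 20, 9, 16]

item_mean = mean(item)                # category 1: distribution of responses
item_var = variance(item)             # category 1: distribution of responses
r_ic = pearson_r(item, criterion)     # category 2: relationship to a criterion
combined = sqrt(item_var) * r_ic      # category 3: function of both (assumed form)

print(item_mean, item_var, round(r_ic, 3), round(combined, 3))
```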
5 CHAPTER 14 ITEM ANALYSIS
- Item Difficulty (P)
- P = f/N, where f is the number of examinees who answered the item correctly and N is the total number of examinees (see your midterm item analysis and Chap 5).
- The higher the P value, the easier the item.
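As a minimal sketch of the P = f/N formula, the snippet below computes item difficulty from 0/1 scored responses. The function name and the response data are hypothetical.

```python
def item_difficulty(responses):
    """Return P = f/N for one item, where responses are scored 1 (correct) or 0."""
    if not responses:
        raise ValueError("need at least one response")
    f = sum(responses)   # number of examinees who answered correctly
    n = len(responses)   # total number of examinees
    return f / n

# Hypothetical scored responses for two items from five examinees
scores = {
    "item_1": [1, 1, 1, 0, 1],
    "item_2": [0, 0, 1, 0, 0],
}
p_values = {name: item_difficulty(r) for name, r in scores.items()}
print(p_values)  # item_1 has the higher P, so it is the easier item
```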
7 CHAPTER 14 ITEM ANALYSIS
- Steps in Item Analysis
- In a typical item analysis, the test developer takes 7 steps (they are similar to the process of test construction in Chapter 4).
8 FYI: Process of Test Construction (Chap IV)
- 1. Identifying the purposes of test score use
- 2. Identifying behaviors to represent the construct
- 3. Preparing test specifications (e.g., Bloom's Taxonomy)
- 4. Item construction
- 5. Item review
9 Process of Test Construction
- 6. Preliminary item tryouts
- 7. Field test
- 8. Statistical analysis
- 9. Reliability and validity
- 10. Guidelines
10 CHAPTER 14 ITEM ANALYSIS
- 7 Steps in Item Analysis
- 1. Decide which properties of the test scores are of greatest importance.
- Ex. When I select questions for your midterm/final exam, I look for similarity between the questions and those of the qualifying/comprehensive exam or the EPPP.
11 CHAPTER 14 ITEM ANALYSIS
- Steps in Item Analysis
- 2. Identify the item parameters (e.g., mean, variance) most relevant to these properties.
- 3. Administer the items to a sample of examinees representative of those for whom the test is intended.
- Ex. An IQ test for children or a depression test for adults.
12 CHAPTER 14 ITEM ANALYSIS
- Steps in Item Analysis
- 4. Estimate, for each item, the parameters identified in step 2 (e.g., variance).
- 5. Establish a plan for item selection.
- Ex. Use item difficulties (P), as in your item analysis, to select the items.
13 CHAPTER 14 ITEM ANALYSIS
- Steps in Item Analysis
- 6. Select the final subset of items, or use the data (the items in your item analysis) for test revision.
- Ex. Take out all questions with very high or very low item difficulties.
- 7. Conduct a cross-validation (validity) study.
- Ex. Use SPSS to compare the results of 2 tests or 2 classes (e.g., this year's class and last year's class), for example with confirmatory factor analysis.
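Steps 5 and 6 can be illustrated with a short Python sketch that drops items whose difficulty is very high or very low. The cutoffs of 0.20 and 0.80, the function name, and the P values are assumptions for illustration; the slides do not specify thresholds.

```python
def select_items(p_values, low=0.20, high=0.80):
    """Keep items whose difficulty P lies within [low, high].

    The default cutoffs are hypothetical; a real selection plan would set
    them according to the purpose of the test.
    """
    return [name for name, p in p_values.items() if low <= p <= high]

# Hypothetical item difficulties from an item analysis
p_values = {"q1": 0.95, "q2": 0.55, "q3": 0.10, "q4": 0.72}

kept = select_items(p_values)
print(kept)  # q1 (too easy) and q3 (too hard) are taken out
```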
14 UNIT V TEST SCORING AND INTERPRETATION
- CHAP 17 CORRECTING FOR GUESSING AND OTHER SCORING METHODS
- CHAP 18 SETTING STANDARDS
- CHAP 19 NORMS AND STANDARD SCORES
- CHAP 20 EQUATING SCORES FROM DIFFERENT TESTS
15 UNIT V TEST SCORING AND INTERPRETATION
- CHAP 19 NORMS AND STANDARD SCORES
16 CHAPTER 19 NORMS AND STANDARD SCORES
- Alfred Binet (1910): Ratio IQ = MA/CA (mental age divided by chronological age).
- Lewis Terman: Ratio IQ = (MA/CA) × 100; he standardized it.
- Deviation IQ: uses norms to estimate IQ.
- We use norms when we want to compare an examinee's raw score on a test to the distribution of scores (scaled or standard scores) for a sample from a well-defined population.
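The ratio IQ formula above, (MA/CA) × 100, can be written as a one-line function. The function name and the ages used are hypothetical examples.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (MA / CA) x 100, with both ages in the same unit (e.g., years)."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100

# Hypothetical child: mental age 10, chronological age 8
print(ratio_iq(10, 8))  # a mental age above the chronological age gives an IQ above 100
```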
17 CHAPTER 19 NORMS AND STANDARD SCORES
- Ex. When we want to estimate the IQ of a 20-year-old person, we compare his/her raw score on a subtest of an IQ test with the scores of people of his/her age, who form his/her norm group (standard scores). This technique tells us where the person stands among people of his/her age.
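A rough sketch of this comparison in Python: the raw score is converted to a z score against the same-age norm group and then placed on an IQ-style scale. The mean of 100 and standard deviation of 15 are a common convention assumed here, and the norm-group scores and function name are hypothetical.

```python
from statistics import mean, pstdev

def deviation_iq(raw, norm_scores, scale_mean=100, scale_sd=15):
    """Place a raw score on a standard-score scale relative to a norm group.

    scale_mean and scale_sd are assumed values (a common IQ convention).
    """
    m = mean(norm_scores)
    sd = pstdev(norm_scores)     # treats the norm group as the reference population
    z = (raw - m) / sd           # where the examinee stands within the norm group
    return scale_mean + scale_sd * z

# Hypothetical raw scores of a norm group of 20-year-olds
norm_group = [40, 45, 50, 55, 60]

print(deviation_iq(50, norm_group))  # a raw score at the norm-group mean
print(deviation_iq(60, norm_group))  # a raw score above the norm-group mean
```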
18 NORMS AND STANDARD SCORES: Nine Basic Steps in Conducting a Norming Study (p. 432)
- 1. Identify the population of interest.
- Ex. Students, employees of a company, inmates, patients, etc.
- 2. Identify the most critical statistics that will be computed for the sample data.
- Ex. Standard deviation (s), variance (s²), mean (M), sum of squares (SS), p
19 NORMS AND STANDARD SCORES: Nine Basic Steps in Conducting a Norming Study (p. 432)
- 3. Decide on the tolerable amount of sampling error.
- That is, the discrepancy between the sample statistic (M) and the population parameter (µ): M − µ.
- The Central Limit Theorem describes 3 characteristics of the distribution of sample means:
- 1. Central tendency (M = µ). 2. The shape of the distribution (normal). 3. Variability, or the standard error of the mean (σM).
20 Nine Basic Steps in Conducting a Norming Study (p. 432)
- 4. Devise a procedure for drawing a sample from the population of interest.
- There are 4 types of probability sampling:
- I. Simple Random Sampling
- Give everyone in the population an equal chance of being selected. Ex. Draw names from a hat.
- II. Systematic Sampling (k = N/n)
- Select every kth name on the list. Ex. CAU population N = 1500 and your sample size n = 150.
- N/n = 1500/150 = 10. Select every 10th student.
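The first two sampling procedures can be sketched as follows, using the slide's numbers (N = 1500, n = 150, so k = 10). The function names and the student list are hypothetical, and starting the systematic sample at a random position within the first k names is an assumption about how the list is entered.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Every member has an equal chance of selection (like drawing names from a hat)."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Select every kth member, where k = N // n, from a random starting point."""
    if not 0 < n <= len(population):
        raise ValueError("n must be between 1 and the population size")
    k = len(population) // n
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: random start within the first k names
    return [population[start + i * k] for i in range(n)]

# Hypothetical list of N = 1500 student ID numbers
students = list(range(1, 1501))

srs = simple_random_sample(students, 150, seed=1)
sys_sample = systematic_sample(students, 150, seed=1)
print(len(srs), len(sys_sample), sys_sample[:3])
```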
21 Nine Basic Steps in Conducting a Norming Study (p. 432): Sampling, cont.
- III. Stratified Sampling: strata means different layers. We use stratified sampling when we want to compare 2 different groups (e.g., male and female CAU doctoral students).
- First we randomly select males, then we randomly select females.
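A minimal sketch of stratified sampling, assuming the same number of people is drawn at random from each stratum. The strata, their sizes, and the function name are hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of n_per_stratum members from each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical strata: male and female doctoral students
strata = {
    "male": [f"M{i}" for i in range(40)],
    "female": [f"F{i}" for i in range(60)],
}

sample = stratified_sample(strata, 10, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
```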
22 Nine Basic Steps in Conducting a Norming Study (p. 432): Sampling, cont.
- IV. Cluster Sampling: we use cluster sampling when the population consists of units, such as classes, rather than individuals. Ex. Miami Dade School District. If we want to conduct research with Miami Dade 2nd graders (1000 2nd-grade classes), we'll randomly select about 10 of these 1000 classes to be in our sample, then conduct the research.
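Cluster sampling differs from the other procedures in that whole units are drawn, not individuals. The sketch below mirrors the slide's example (10 classes out of 1000); the class rosters, the class size of 20, and the function name are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters; every member of a chosen cluster is included."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical population: 1000 second-grade classes of 20 students each
classes = {
    f"class_{i:04d}": [f"class_{i:04d}_student_{j}" for j in range(20)]
    for i in range(1000)
}

sample = cluster_sample(classes, 10, seed=1)
print(len(sample), sum(len(roster) for roster in sample.values()))
```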
23 Nine Basic Steps in Conducting a Norming Study (p. 432)
- 5. Estimate the minimum sample size (n) required to hold the sampling error within the specified limits.
- There are different statistical procedures for estimating n. n should be at least 30.
- 1. n = (σ/d)²
- d = effect size; d = (M − µ)/σ
- 2. n = (σ/σM)²
- σM = σ/√n: standard error of the mean for a population. Ex. z score
- sM = s/√n: estimated standard error of the mean for a sample. Ex. t distribution
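The second formula, n = (σ/σM)², follows from rearranging σM = σ/√n. A small sketch, with hypothetical values for σ and the tolerable standard error, and with the result rounded up to a whole examinee:

```python
from math import ceil

def required_sample_size(sigma, tolerable_se):
    """Smallest n for which sigma / sqrt(n) does not exceed tolerable_se.

    Implements n = (sigma / sigma_M)^2, rounded up to a whole number.
    """
    if sigma <= 0 or tolerable_se <= 0:
        raise ValueError("sigma and tolerable_se must be positive")
    return ceil((sigma / tolerable_se) ** 2)

# Hypothetical: population SD of 15 points
print(required_sample_size(15, 1.5))  # tolerate a standard error of 1.5 points
print(required_sample_size(15, 2))    # a looser tolerance needs fewer examinees
```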
24 NORMS AND STANDARD SCORES
25 The Effect Size. Ex. Two Independent Samples t-test
26 NORMS AND STANDARD SCORES
27 Nine Basic Steps in Conducting a Norming Study (p. 432)
- 6. Draw the sample and collect the data.
- 7. Compute the values of the group statistics of interest and their standard errors: sM = s/√n or σM = σ/√n.
- Calculate the standard error of the mean, which estimates the typical size of the difference between M and µ (the sampling error).
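Step 7 can be sketched for a single statistic, the mean, using the estimated standard error sM = s/√n. The scores and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_standard_error(scores):
    """Return the sample mean M and the estimated standard error s / sqrt(n)."""
    n = len(scores)
    m = mean(scores)
    s = stdev(scores)   # sample standard deviation (divides by n - 1)
    return m, s / sqrt(n)

# Hypothetical scores from a norming sample
scores = [2, 4, 4, 4, 5, 5, 7, 9]

m, se = mean_and_standard_error(scores)
print(m, round(se, 3))
```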
28 Nine Basic Steps in Conducting a Norming Study (p. 432)
- 8. Identify the types of normative scores that will be needed, and prepare the normative score conversion table (see the following slides).
- 9. Prepare written documentation of the normative scores.
29 NORMS AND STANDARD SCORES
- Types of Normative Scores
- Raw Score: score on a subtest or a test.
- Scaled Score: normative score for a specific age.
30 Normative Scores (Wechsler)
31 Normative Scores
32 NORMS AND STANDARD SCORES
- Usefulness of Scaled Scores
- Scaled scores are useful for two purposes:
- 1. Scaled scores relate the examinee's performance to the percentile rank scores of the norm group and to their grade level.
- 2. In evaluation and research, the mean scaled score is a better estimate of average group performance than the mean raw score.
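To make the link between a score and the norm group's percentile ranks concrete, here is a small sketch. It assumes one common definition of percentile rank (the percentage of the norm group scoring below the score, plus half of those scoring exactly at it); the norm-group scores and the function name are hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group below the score, counting ties as half."""
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(norm_scores)

# Hypothetical scaled scores of a norm group
norm_group = [5, 7, 8, 10, 10, 12, 13, 15, 16, 19]

print(percentile_rank(10, norm_group))
```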
35 Normative Scores
- Multiply by 5 to convert to a percentile. This means neither the USA nor Iran uses a normal distribution in its grading system: the USA's distribution is negatively skewed and Iran's is positively skewed.
36 CHAPTER 19 NORMS AND STANDARD SCORES
- Echternacht (1971): 3-Step Process for Grade- and Age-Equivalent Scores
- 1. Convert the raw scores to scaled scores.
- 2. Calculate the median scaled score for each grade level, and plot the medians on a bivariate scatter plot.
- 3. Connect the points and draw a smooth curve.
- It is similar to the Deviation IQ, i.e., a child's performance is compared with that of others at a particular age or grade level.
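The three steps can be sketched as below, starting from scores that are already scaled (step 1). Straight-line interpolation between the grade medians stands in for the hand-drawn smooth curve, which is a simplification; the data and function names are hypothetical, and the medians are assumed to increase from grade to grade.

```python
from statistics import median

def grade_medians(scaled_by_grade):
    """Step 2: median scaled score for each grade, as (grade, median) points."""
    return sorted((grade, median(scores)) for grade, scores in scaled_by_grade.items())

def grade_equivalent(scaled_score, points):
    """Step 3 (simplified): read a grade off straight lines joining the medians.

    Assumes the medians strictly increase with grade.
    """
    for (g0, s0), (g1, s1) in zip(points, points[1:]):
        if s0 <= scaled_score <= s1:
            return g0 + (g1 - g0) * (scaled_score - s0) / (s1 - s0)
    raise ValueError("scaled score is outside the range of the grade medians")

# Hypothetical scaled scores for grades 2 to 4
scaled_by_grade = {2: [38, 40, 42], 3: [48, 50, 52], 4: [58, 60, 63]}

points = grade_medians(scaled_by_grade)
print(points)
print(grade_equivalent(55, points))  # halfway between the grade 3 and grade 4 medians
```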
37 CHAPTER 19 NORMS AND STANDARD SCORES