Title: Designing an assessment system
2 Designing an assessment system
- Presentation to the Scottish Qualifications Authority, August 2007
- Dylan Wiliam
- Institute of Education, University of London
- www.dylanwiliam.net
3 Overview
- The purposes of assessment
- The structure of the assessment system
- The locus of assessment
- The extensiveness of the assessment
- Assessment format
- Scoring models
- Quality issues
- The role of teachers
- Contextual issues
4 Functions of assessment
- Three functions of assessment
- For evaluating institutions (evaluative)
- For describing individuals (summative)
- For supporting learning
- Monitoring learning: whether learning is taking place
- Diagnosing (informing) learning: what is not being learnt
- Forming learning: what to do about it
- No system can easily support all three functions
- Traditionally, we have grouped the first two, and ignored the third
- Learning is sidelined; summative and evaluative functions are weakened
- Instead, we need to separate the first (evaluative) from the other two
5 The Lake Wobegon effect
"All the women are strong, all the men are good-looking, and all the children are above average." (Garrison Keillor)
[Figure: scores plotted against time]
6 Goodhart's law
- All performance indicators lose their usefulness when used as objects of policy
- Privatization of British Rail
- Targets in the Health Service
- Bubble students in high-stakes settings
7 Reconciling different pressures
- The high-stakes genie is out of the bottle, and we cannot put it back
- The clearer you are about what you want, the more likely you are to get it, but the less likely it is to mean anything
- The only thing left to us is to try to develop tests worth teaching to
- This is fundamentally an issue of validity
8 Validity
- Validity is a property of inferences, not of assessments
- "One validates, not a test, but an interpretation of data arising from a specified procedure" (Cronbach, 1971; emphasis in original)
- No such thing as a valid (or indeed invalid) assessment
- No such thing as a biased assessment
- A pons asinorum for thinking about assessment
9 Threats to validity
- Inadequate reliability
- Construct-irrelevant variance
- The assessment includes aspects that are irrelevant to the construct of interest: the assessment is too big
- Construct under-representation
- The assessment fails to include important aspects of the construct of interest: the assessment is too small
- With clear construct definition all of these are technical, not value, issues
10 Two key challenges
- Construct-irrelevant variance
- Sensitivity to instruction
- Construct under-representation
- Extensiveness of assessment
11 Sensitivity to instruction
[Figure, labelled "1 year": distribution of attainment on an item highly sensitive to instruction]
12 Sensitivity to instruction (2)
[Figure, labelled "1 year": distribution of attainment on an item moderately sensitive to instruction]
13 Sensitivity to instruction (3)
[Figure, labelled "1 year": distribution of attainment on an item relatively insensitive to instruction]
14 Sensitivity to instruction (4)
[Figure, labelled "1 year": distribution of attainment on an item completely insensitive to instruction]
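The four figures contrast items on which a year of instruction moves the attainment distribution a long way with items on which it barely moves at all. A rough way to put numbers on that contrast is sketched below. It is an illustration only, not the index used in these slides: it assumes both cohorts are normally distributed with equal spread, and the growth values (in standard-deviation units) are hypothetical, chosen only to span the four cases.

```python
from statistics import NormalDist


def probability_older_outscores_younger(growth_in_sd: float) -> float:
    """Chance that a randomly chosen student with one more year of
    instruction outscores a randomly chosen student without it.

    Assumption (not from the slides): both cohorts are normal with the
    same standard deviation, and one year of instruction shifts the mean
    by `growth_in_sd` standard deviations.
    """
    # The difference of two independent unit-variance normals has
    # standard deviation sqrt(2).
    return NormalDist().cdf(growth_in_sd / 2 ** 0.5)


# Hypothetical growth values, one per figure; not taken from the source.
illustrative_growth = {
    "completely insensitive": 0.0,
    "relatively insensitive": 0.3,
    "moderately sensitive": 1.0,
    "highly sensitive": 3.0,
}

for label, growth in illustrative_growth.items():
    p = probability_older_outscores_younger(growth)
    print(f"{label:24s} growth={growth:.1f} SD  P(older > younger)={p:.2f}")
```

With zero growth the two cohorts are indistinguishable and the probability is one half; as growth increases the distributions separate and the probability approaches one.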
15 Consequences (1)
16 Consequences (2)
17 Consequences (3)
18 Insensitivity to instruction
- Primarily attributable to the fact that learning is slower than assumed
- Exacerbated by the normal mechanisms of test development
- Leads to erroneous attributions about the effects of schooling
19 A sensitivity to instruction index
Test                          Sensitivity index
IQ-type test (insensitive)    0
NAEP                          6
TIMSS                         8
ETS STEP tests (1957)         8
ITBS                          10
Completely sensitive test     100
20 Extensiveness of assessment
- Using teacher assessment in certification is attractive
- Increases reliability (increased test time)
- Increases validity (addresses aspects of construct under-representation)
- But problematic
- Lack of trust (Fox guarding the hen house)
- Problems of biased inferences (construct-irrelevant variance)
- Can introduce new kinds of construct under-representation
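The claim that more test time increases reliability can be made concrete with the Spearman-Brown formula from classical test theory. The slides do not name this formula; it is supplied here as one standard way to express the relationship. It assumes the added assessment time behaves like more of the same test (parallel material), which teacher assessment may not fully satisfy. The function name and the example reliability of 0.80 are illustrative.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`.

    Spearman-Brown prophecy formula:
        r_new = k * r / (1 + (k - 1) * r)
    Assumes the added material is parallel to the existing test.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Illustrative starting reliability; not a figure from the slides.
base = 0.80
for k in (1, 2, 4):
    print(f"{k}x assessment time -> reliability {spearman_brown(base, k):.3f}")
```

Under these assumptions, doubling the assessment time raises a reliability of 0.80 to about 0.889, and quadrupling it to about 0.941, so the gains diminish as evidence accumulates.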
21 The challenge
- To design an assessment system that is
- Distributed
- So that evidence collection is not undertaken entirely at the end
- Synoptic
- So that learning has to accumulate
22 A possible model
- All students are assessed at test time
- Different students in the same class are assigned different tasks
- The performance of the class defines an envelope of scores, e.g.
- Advanced: 5 students
- Proficient: 8 students
- Basic: 10 students
- Below basic: 2 students
- Teacher allocates levels on the basis of whole-year performance
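One reading of the model above can be sketched as follows: the test fixes how many students in the class receive each level, and the teacher decides which students those are. The slide does not say how the teacher makes that decision; the sketch assumes a simple ranking of students by whole-year performance, strongest first. The function name, student identifiers, and the ranking itself are hypothetical.

```python
from collections import Counter

# Levels from the slide, highest first.
LEVELS = ["Advanced", "Proficient", "Basic", "Below basic"]


def allocate_levels(envelope: dict, teacher_ranking: list) -> dict:
    """Distribute the class envelope over students.

    `envelope` maps each level to the number of students the class earned
    at that level on the test. `teacher_ranking` lists students strongest
    first, according to the teacher's judgment of whole-year performance
    (an assumption; the slide does not specify the mechanism).
    """
    if sum(envelope.get(level, 0) for level in LEVELS) != len(teacher_ranking):
        raise ValueError("envelope size must match the number of students")
    allocation = {}
    ranked = iter(teacher_ranking)
    for level in LEVELS:
        for _ in range(envelope.get(level, 0)):
            allocation[next(ranked)] = level
    return allocation


# Envelope from the slide: 5 + 8 + 10 + 2 = 25 students.
envelope = {"Advanced": 5, "Proficient": 8, "Basic": 10, "Below basic": 2}
# Hypothetical class list, already ordered by the teacher.
ranking = [f"student_{i:02d}" for i in range(1, 26)]

allocation = allocate_levels(envelope, ranking)
print(Counter(allocation.values()))
```

Because the envelope is set by the performance of the whole class across all tasks, a teacher can only raise the levels available to allocate by raising the performance of the class as a whole.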
23 Benefits and problems
- Benefits
- The only way to teach to the test is to improve everyone's performance on everything (which is what we want!)
- Validity and reliability are enhanced
- Problems
- Students' scores are not inspectable
- Assumes student motivation
24 The effects of context
- Beliefs about what constitutes learning
- Beliefs in the reliability and validity of the results of various tools
- A preference for and trust in numerical data, with bias towards a single number
- Trust in the judgments and integrity of the teaching profession
- Belief in the value of competition between students
- Belief in the value of competition between schools
- Belief that test results measure school effectiveness
- Fear of national economic decline and education's role in this
- Belief that the key to schools' effectiveness is strong top-down management
25 Conclusion
- There is no perfect assessment system anywhere. Each nation's assessment system is exquisitely tuned to local constraints and affordances.
- Assessment practices have impacts on teaching and learning which may be strongly amplified or attenuated by the national context.
- The overall impact of particular assessment practices and initiatives is determined at least as much by culture and politics as it is by educational evidence and values.
26 Conclusion (2)
- It is probably idle to draw up maps for the ideal assessment policy for a country, even although the principles and the evidence to support such an ideal might be clearly agreed within the expert community.
- Instead, focus on those arguments and initiatives which are least offensive to existing assumptions and beliefs, and which will nevertheless serve to catalyze a shift in them while at the same time improving some aspects of present practice.
27 Questions? Comments?
Institute of Education, University of London, 20 Bedford Way, London WC1H 0AL
Tel +44 (0)20 7612 6000, Fax +44 (0)20 7612 6126, Email info_at_ioe.ac.uk