Title: Language Testing and Assessment
1Language Testing and Assessment
- Dr. Gerard Sharpling
- Centre for Applied Linguistics
- University of Warwick
- Coventry CV4 7AL
2Introduction
- Individualisation-Standardisation
3Why test language?
- Measure
- Compare
- Monitor
- Gatekeep
- Maintain standards
- Motivate
- Give feedback
- Make people employable
4Individualising
- Focus on the individual
- Knowledge is socially constructed
- Difference
- Learning style
- Not only writing
- Flexible formats
- Stress
- Awareness of SEM (Standard Error of Measurement)
- Validity is driving force
5Standardising
- Focus on the group
- Knowledge is given and objective
- Everyone treated equally
- Candidates expected to adapt
- Format reflects standards
- Stress is common to all individuals and can be overcome or cured (medical model)
- Driving force: reliability/objectivity
6Statistical overviews
Statistics: Score

Statistic                   Value
N (valid)                   200
N (missing)                 0
Mean                        21.020
Std. Error of Mean          0.227
Median                      21.000
Mode                        23.000
Std. Deviation              3.214
Variance                    10.331
Skewness                    -0.556
Std. Error of Skewness      0.172
Kurtosis                    0.707
Std. Error of Kurtosis      0.342
Range                       19.000
Minimum                     9.000
Maximum                     28.000
Sum                         4204.000
25th percentile             19.000
50th percentile             21.000
75th percentile             23.000
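Several of the figures above are linked by standard formulas, so they can be cross-checked. The following is a minimal sketch that uses only the reported summary values (the variable names are illustrative, not part of the original output): it recomputes the mean from the sum and N, the variance from the standard deviation, and the standard error of the mean as SD divided by the square root of N.

```python
import math

# Summary values as reported in the statistics output above
n = 200
total = 4204.0
std_dev = 3.214
minimum, maximum = 9.0, 28.0

mean = total / n                  # sum of scores divided by number of candidates
variance = std_dev ** 2           # variance is the square of the standard deviation
se_mean = std_dev / math.sqrt(n)  # standard error of the mean
score_range = maximum - minimum   # highest score minus lowest score

print(f"mean={mean:.3f} variance={variance:.2f} "
      f"se_mean={se_mean:.3f} range={score_range:.0f}")
```

The recomputed variance differs from the reported 10.331 only in the third decimal place, which is what one would expect when starting from a standard deviation already rounded to 3.214.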
7Traditional view of Testing
- Reliability and objectivity
- Psychometric
- Standardisation
- Comparison of individual with the group
- Normal distribution curves
- No relationship between assessor and assessed
8Who might the traditional system fail?
- It might fail children (or students)
- who have special needs
- who are gifted and talented
- who have individual learning differences
- who have a non-traditional learning style
- who experience social deprivation
- who are looked after and/or within the care system
- from particular ethnicities
9Part 2
- Test purposes and test outcomes
10What are the different types of test?
- Diagnostic
- Proficiency
- Placement
- Achievement
- Selection
- Summative
- Formative
- High stakes
- Low stakes
11How performance is represented
- Raw mark
- Mark compared to broad standard
- Mark or grade linked to verbal descriptor
- Criteria checklist
- Norm-referencing
- Criterion-referencing
12Representing language ability
- A grade or percentage for English
- A grade or percentage plus a breakdown into different skills areas
- A grade or percentage, a breakdown into skills and a written report
- All of the above plus a portfolio of evidence that the student has compiled
13Part 3
- The qualities of good language tests
14Reliability
- Restrict choice
- Write unambiguous items
- Provide uniform test conditions
- Have fewer marking bands
- Ensure familiarity with format
- Give clear criteria
- Provide good training for raters
- Ensure anonymous marking
- (Hughes, 1989: 36-43)
15Item facility
16More facility values
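Item facility is conventionally the proportion of candidates who answer an item correctly, so values near 1 indicate an easy item and values near 0 a hard one. The sketch below illustrates the calculation only; the response data and the helper name `item_facility` are hypothetical and are not taken from the slides.

```python
def item_facility(responses):
    """Return the proportion of correct responses (1 = correct, 0 = incorrect)."""
    if not responses:
        raise ValueError("no responses recorded for this item")
    return sum(responses) / len(responses)

# Hypothetical scored responses from ten candidates to three items
items = {
    "item_1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "item_2": [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],
    "item_3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
}

facility = {name: item_facility(scores) for name, scores in items.items()}
for name, value in facility.items():
    print(f"{name}: facility = {value:.2f}")
```

In this made-up data, item_1 is easy (most candidates answer it correctly), item_3 is hard, and item_2 sits in the middle.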
17Split half analysis
- First test (part)
- Mean 34.96 St.Dev. 4.61
- Second test (part)
- Mean 33.08 St.Dev. 4.88
- The t-test for difference gives a value of 6.297 with 195 df
- The correlation between the two sets of scores is 0.612
- Assuming form means are exactly the same over the whole population
- Reliability 0.552, st. error of measurement 3.24
- Within forms analysis
- Reliability 0.611, st. error of measurement 2.96
- Split parts analysis
- Cronbach Alpha 0.767 and Spearman Brown Coefficient 0.759
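Two of the figures above can be reproduced from the others with textbook formulas. The Spearman-Brown coefficient steps the correlation between the two parts up to full test length, 2r / (1 + r), and the standard error of measurement is SD × sqrt(1 − reliability). The sketch below is a check under stated assumptions, not the original analysis: in particular, using the pooled standard deviation of the two parts for the within-forms figure is an assumption, although it does match the reported 2.96.

```python
import math

def spearman_brown(r_half):
    """Step a part-to-part correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Values reported on the slide
r_parts = 0.612                    # correlation between the two sets of scores
sd_first, sd_second = 4.61, 4.88   # standard deviations of the two parts
within_forms_reliability = 0.611

sb = spearman_brown(r_parts)

# Assumption: the within-forms SEM is based on the pooled SD of the two parts
pooled_sd = math.sqrt((sd_first ** 2 + sd_second ** 2) / 2)
sem = standard_error_of_measurement(pooled_sd, within_forms_reliability)

print(f"Spearman-Brown = {sb:.3f}, SEM (within forms) = {sem:.2f}")
```

The same SEM formula applied to the reliability of 0.552 would need a larger standard deviation to give 3.24, which is consistent with that analysis using a different variance estimate.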
18Sample unreliability issues (Essays)
- Describe a blizzard that you have experienced. How did you make your way home?
- Describe the legal system in your country.
- Discuss your favorite sport and why you enjoy it.
- Discuss the importance of Christmas in our
contemporary society.
19Sample unreliability issues (MCQ 1)
- 1. Why hasn't your mother come?
- Well, she said that she ____ leave the baby.
- can't
- won't
- couldn't
- mayn't
20Sample unreliability issues (MCQ 2)
- 2. Which word or phrase, a, b, c, or d, means the same as the word underlined.
- I'll see you soon.
- next year
- in a few days
- tomorrow
- in a couple of hours
21Validity
- Ensure good visual presentation of test
- Test only the skill(s) you plan to test
- Avoid introducing any unnecessary complexities that don't matter
- Avoid testing something else instead, by default.
- Ignore any mistakes outside the skills area being tested (?)
- Keep focussed.
22Sample validity issues
- Listen to the radio recording and write a short summary of 150-200 words on the content.
- Outline the extent to which economic factors have impinged on individuals' abilities to manage their lives effectively.
- Read the following text and answer the questions which follow. Copy your answers correctly from the text as marks will be deducted for incorrect spelling.
23Part 4
- Language testing philosophy
24An example of a bell curve
25Example of test histogram
26Further histogram
27Inter-test correlations