Title: International Assessments and Quality of Education
1International Assessments and Quality of Education
- Eugenio Gonzalez
- September 2007
2Purpose of the Presentation
- Review the three largest international
comparative studies of education undertaken to
date
- These studies are TIMSS, PISA and PIRLS
- Develop understanding (and appreciation...) of...
- Value provided by these studies
- Technical complexities
- Limitations imposed by reality
3The Studies
- IEA - TIMSS
- Originally Third International Mathematics and
Science Study
- Now Trends in International Mathematics and
Science Study
- IEA - PIRLS
- Progress in International Reading Literacy Study
- OECD - PISA
- Programme for International Student Assessment
4The Studies
- Two major organizations
- International Association for the Evaluation of
Educational Achievement (IEA)
- Conducts TIMSS and PIRLS
- Focus on comparative studies, research and
training
- Over 60 country members
- Organisation for Economic Co-operation and
Development (OECD)
- Conducts PISA
- Data collection in education and other fields
- Collects indicators to inform governments
- Focuses on needs of OECD country members
5Why International Assessments?
- "I'm not a fan of facts. You see, the facts can
change, but my opinion will never change, no
matter what the facts are." (Stephen Colbert)
- "When the president decides something on Monday,
he still believes it on Wednesday... no matter
what happened Tuesday." (Stephen Colbert)
- If you think data collection is expensive, you
should try to see the cost of ignorance
6Why International Assessments?
- Benchmarking function
- Provide comparable indicators on student
performance and schooling practices across
countries
- Analytic function
- Suggest hypotheses about
- Relationship between student performance and
factors that may influence performance
- Areas where students have particular strengths or
weaknesses
7Why International Assessments?
- Development function
- Data on strengths and weaknesses of the system
- Data on what the present looks like, and what the
future might look like
8How Do Countries Benefit?
- Understanding
- Variation of student performance across content
and cognitive domains
- National performance in an international context
- Performance in comparison with international
benchmarks
- Instructional practices
- Effects of resources in school
- Relationship between context and student outcomes
- Monitoring change over time and policy impact
- Developing local capacity
9Scope of TIMSS
- Largest comparative educational study
- Achievement and the context in which learning
occurs
- Two subjects: Mathematics and Science
- Three Educational Levels in 1994/95
- Grades 3 and 4, Grades 7 and 8, Grade 12
- Grade 8 in 1998/99
- Grade 4 and 8 in 2002/03 and 2006/07
- Grade 12 in 2008
- Next assessment in 2010/11
10The Questions
- What should students learn?
- Who provides the instruction?
- How is the instruction organized?
- Where does instruction take place?
- When does instruction take place?
- What have students learned?
- What is the change over time?
11The Components
- Achievement Booklets
- Mathematics and Science tests
- Questionnaires
- Students, School Principals, Mathematics and
Science Teachers, Curriculum
12TIMSS Assessment Design
- 90 minute assessment at grade 8
- Rotated block assessment design
- No student answers all items
- Students encountered booklets with both
mathematics and science items
- Each booklet contains trend items
- Some items are released after every assessment
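
The idea of a rotated block design can be illustrated with a small sketch. The block names, the number of blocks, and the pairing rule below are hypothetical and chosen only to show the principle; they are not the actual TIMSS booklet layout.

```python
def rotated_booklets(blocks, blocks_per_booklet=2):
    """Build one booklet per block by taking consecutive blocks cyclically.

    Hypothetical scheme: booklet i holds blocks i, i+1, ... (wrapping around),
    so neighbouring booklets share a block and the whole pool stays linked.
    """
    n = len(blocks)
    return [
        [blocks[(start + offset) % n] for offset in range(blocks_per_booklet)]
        for start in range(n)
    ]


# Made-up pool: three mathematics and three science blocks, interleaved so
# that every booklet mixes both subjects.
blocks = ["M1", "S1", "M2", "S2", "M3", "S3"]
booklets = rotated_booklets(blocks)

for number, content in enumerate(booklets, start=1):
    print(f"Booklet {number}: {content}")
```

In this sketch every block appears in two booklets, every booklet holds one mathematics and one science block, and no booklet contains the whole pool.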
13TIMSS Achievement Scales
- Overall Mathematics
- At Grade 8: Number, Algebra, Measurement,
Geometry, Data
- At Grade 4: Number; Patterns, Equations, and
Relationships; Measurement; Geometry; Data
- Overall Science
- At Grade 8: Life Science, Chemistry, Physics,
Earth Science, Environmental Science
- At Grade 4: Life Science, Physical Science, Earth
Science
- Also cognitive domain scales
14The TIMSS Test Design in 2003
15What is PIRLS?
- International study of reading literacy at fourth
grade
- Assessment of student proficiency in reading
comprehension
- Extensive collection of data on context for
learning to read
- Designed to measure trends on a 5-year cycle
- Next assessment in 2011
16Definition of Reading Literacy
- "The ability to understand and use those written
language forms required by society and/or valued
by the individual. Young readers can construct
meaning from a variety of texts. They read to
learn, to participate in communities of readers,
and for enjoyment." (Campbell, Kelly, Mullis,
Martin, & Sainsbury, 2001)
- Administered at the time when students transition
from learning to read into reading to learn.
17Assessment Booklet Design
- Total reading assessment time
- 5 hours 20 minutes
- Available student testing time
- 1 hour 20 minutes
- Matrix sampling approach
- 10 interlinked booklets
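
The arithmetic behind the matrix sampling approach follows from the two times above. The sketch below assumes, purely for illustration, that the pool is split into equal 40-minute blocks; the block length is not stated on the slide.

```python
TOTAL_POOL_MINUTES = 5 * 60 + 20   # 5 hours 20 minutes, from the slide
STUDENT_MINUTES = 1 * 60 + 20      # 1 hour 20 minutes, from the slide
BLOCK_MINUTES = 40                 # assumption, for illustration only

# Share of the whole item pool that any one student works through.
share_seen = STUDENT_MINUTES / TOTAL_POOL_MINUTES

# Under the assumed block length: blocks in the pool and per student.
blocks_in_pool = TOTAL_POOL_MINUTES // BLOCK_MINUTES
blocks_per_student = STUDENT_MINUTES // BLOCK_MINUTES

print(f"Each student sees {share_seen:.0%} of the item pool")
print(f"{blocks_in_pool} blocks in the pool, {blocks_per_student} per student")
```

Each student therefore answers a quarter of the pool, which is why results can be reported for groups of students but not as individual scores.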
18Contexts for Literacy
- Student and Home Questionnaire
- Early literacy activities
- Home educational resources
- Language in the home
- Out-of-school literacy activities
- Teacher and School Questionnaires
- Environment and resources
- Instructional strategies and activities
- Instructional materials and technology
- Teacher training and preparation
19What is PISA?
- Programme for International Student Assessment
- The main questions are
- Are students well prepared to meet the challenges
of the future?
- Are they able to analyze, reason, and communicate
their ideas effectively?
- Do they have the capacity to continue learning
throughout life?
20What is PISA?
- Need to monitor student learning and raise
aspirations
- Provide directions for national policy
- Measure skills relevant to adult life
- Lead to better understanding of causes and
consequences of observed skill shortage
21What is PISA?
- Measure how well 15-year-olds approaching the end
of compulsory schooling are prepared to meet the
challenges of today's knowledge societies
- Does not focus on mastery of school curriculum,
but on the ability to use knowledge and skills
(acquired in schools) to meet real-life
challenges
22Scope of PISA 2003
- 41 participating countries
- Over 250,000 students participated
- Students answered a 30-minute questionnaire in
addition to a 120-minute test
- School principals completed a questionnaire about
their school
- Main domain in 2003 was Mathematics; tests in
Science, Reading and Problem Solving were also
administered
23PISA Assessment Design
- Administered every 3 years
- Originally 3 major domains: Mathematics, Science
and Reading
- In addition, problem solving in 2003
- Each domain receives major emphasis every 9 years
- In the cycles in between, it receives minor emphasis
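
The rotation of the major domain can be sketched as a small lookup. The sketch assumes a first cycle in 2000 with Reading as the major domain and a strict three-year rhythm; only "Mathematics in 2003" comes from the slides, and real administration years may deviate from the plan.

```python
FIRST_CYCLE_YEAR = 2000      # assumption: first cycle of the schedule
CYCLE_LENGTH_YEARS = 3       # from the slide: administered every 3 years
# Assumed rotation order; 2003 = Mathematics matches the slide.
MAJOR_DOMAIN_ROTATION = ["Reading", "Mathematics", "Science"]


def major_domain(year):
    """Return the major domain for a cycle year under the planned rotation."""
    elapsed = year - FIRST_CYCLE_YEAR
    if elapsed < 0 or elapsed % CYCLE_LENGTH_YEARS != 0:
        raise ValueError(f"{year} is not a cycle year in this schedule")
    cycle_index = elapsed // CYCLE_LENGTH_YEARS
    return MAJOR_DOMAIN_ROTATION[cycle_index % len(MAJOR_DOMAIN_ROTATION)]


for year in (2000, 2003, 2006, 2009):
    print(year, major_domain(year))
```

With three domains and a three-year cycle, each domain returns as the major domain every nine years, and is a minor domain in the two cycles in between.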
24PISA Assessment Cycle
25Summary
- TIMSS assessed 4th and 8th graders in mathematics
and science
- Students, their teachers, and school principals
answered a background questionnaire
- PIRLS assessed 4th graders in reading
- Students, their teachers, school principals and
parents/guardians answered a background
questionnaire
- PISA assessed 15-year-old students in mathematics,
science, reading and problem solving
- Students and school principals answered a
background questionnaire
26Other International Studies
- Information Technology
- IEA - SITES - Second Information Technology in
Education Study
- Administered in 2006
- Grade 8
- Civics and Citizenship Education
- IEA - ICCS - International Civic and Citizenship
Education Study
- Administered in 1999 and 2008
- Grade 8
- Teacher Education
- IEA - TEDS - Teacher Education and Development Study
- Administered in 2008
- Focus on the training of Mathematics teachers
27About International Assessments
- International Assessments are
- NOT a test, they are a survey
- NOT meant to give individual scores
- NOT to be used to reward or punish schools,
teachers or students
- NOT to be used to dictate curriculum or teaching
methods
- This is by design, not by defect!
28Basic Goals of the Designs
- Inform on performance in broad areas of knowledge
and skills
- Not tied to the specific curriculum of any one country
- Content defined by frameworks developed by the
consensus of participating countries
- Reports of performance of groups of students, not
individuals
- There is a limited assessment time
- Assessment time is generally limited to no more
than 2 hours
- The entire pool of items generally takes over 4 hours
29Basic Goals of the Designs
- Sampling of Students
- Selected randomly throughout the country
- Sampling of Items
- Selected according to a design
- It's necessary to take error into account
- Error variance, or uncertainty of estimates
30How Does Sampling Help?
- Impossible to test everyone on everything
- Too many people
- Too many items
- Too expensive
- Not necessary to test everyone on everything
- Analogy: a blood sample (a small amount is enough to describe the whole)
- Some students are tested on some things
- Calculations need to be adjusted using sampling
information
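
A minimal sketch of why calculations need to be adjusted with sampling information. The population, scores, and selection probabilities are invented for illustration; the weighting rule shown (weight = 1 / selection probability) is the standard inverse-probability weight, not a description of any specific study's weighting procedure.

```python
# Made-up population with two regions of very different size.
# Each sampled student is a (score, selection_probability) pair.
# Region A: 10,000 students, 100 sampled -> probability 0.01
# Region B:  1,000 students, 100 sampled -> probability 0.10
sample = (
    [(520.0, 0.01)] * 100    # Region A, all scoring 520 for simplicity
    + [(460.0, 0.10)] * 100  # Region B, all scoring 460 for simplicity
)


def unweighted_mean(records):
    """Treat every sampled student as representing the same number of peers."""
    return sum(score for score, _ in records) / len(records)


def weighted_mean(records):
    """Weight each student by 1 / selection probability."""
    weights = [1.0 / prob for _, prob in records]
    total = sum(w * score for w, (score, _) in zip(weights, records))
    return total / sum(weights)


print(f"Unweighted: {unweighted_mean(sample):.1f}")  # 490.0
print(f"Weighted:   {weighted_mean(sample):.1f}")    # 514.5
```

The unweighted mean gives the small region the same influence as the large one; the weighted mean recovers the population mean of this made-up example, (10,000 × 520 + 1,000 × 460) / 11,000.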
31What are Standard Errors?
- The estimates are not precise
- Not all students are tested
- No student is administered all items
- Standard errors have two components
- Sampling Error
- Resulting from sampling students from a
population
- Complex sample design requires special
computation
- Imputation Error
- Resulting from sampling items from a universe and
using statistical models to obtain estimates
- IRT and conditioning models estimate these
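
One common way to combine the two components is Rubin's combination rule. It assumes the imputation step yields several estimates of the same statistic, one per set of imputed ("plausible") values, a term not used on the slide. The numbers are invented, and the sampling variance is taken as given because computing it depends on the replication method of the specific sample design.

```python
import math
import statistics


def total_standard_error(estimates, sampling_variance):
    """Combine sampling and imputation variance (Rubin's rule).

    estimates: the same statistic computed once per set of imputed values.
    sampling_variance: variance due to sampling students, assumed to be
        already computed with a method suited to the sample design.
    """
    m = len(estimates)
    # Variance between the M estimates (divides by M - 1).
    imputation_variance = statistics.variance(estimates)
    total_variance = sampling_variance + (1 + 1 / m) * imputation_variance
    return math.sqrt(total_variance)


# Invented numbers, for illustration only.
pv_means = [501.0, 503.0, 499.0, 502.0, 500.0]
point_estimate = statistics.mean(pv_means)
se = total_standard_error(pv_means, sampling_variance=9.0)

print(f"Estimate: {point_estimate:.1f}, standard error: {se:.2f}")
```

Here the between-estimate variance is 2.5, so the total variance is 9 + 1.2 × 2.5 = 12 and the standard error is about 3.46, slightly larger than the sampling component alone (3.0).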
32Interpreting Background Variables
- TIMSS/PIRLS/PISA are surveys
- We know what students know and can do now, and
the context in which this occurs
- We mostly have current background information,
while the learning or effect might have occurred a
while back
- Cross-sectional, with repeated and independent
measures over time
- Can make statements about correlations, not
causation
- Have invaluable descriptive power
33Interpreting Background Variables
- TIMSS/PIRLS/PISA are not experiments
- We do not control assignment of students to
treatment groups
- We cannot establish causality, or direct effect
- Events have already happened and all we do is
record what has happened
34Interpreting Background Variables
- Important to know how to word statements about
contextual variables correctly
- For example
- "About how many books are there in your home?"
- Few (0-10)
- Enough to fill one shelf (11-25)
- Enough to fill one bookcase (26-100)
- Enough to fill several bookcases (more than 100)
35Interpreting Background Variables
- We could ask
- Is there a (statistical) relationship between the
number of books in the home and reading
achievement at grade 4?
- Are students who report having more books in the
home more likely to do better in reading than
those who do not?
- We should not ask
- Does having more books in the home have an effect
(increase/decrease) on reading achievement?
36Interpreting Background Variables
- We could answer
- Grade 4 students who come from homes where there
are more books tend to do better in reading than
those from homes with fewer books.
- Grade 4 students who do well in reading are more
likely to have come from homes where there are
more books
- We should not answer
- Students do better at reading because there are
more books in the home
- High reading achievement tends to have an effect
on the number of books found in the home
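
A descriptive comparison of this kind can be sketched as follows. The records are invented for illustration and are not results from any study; a real analysis would also apply sampling weights and report standard errors.

```python
from collections import defaultdict

# Answer categories of the books-in-the-home questionnaire item.
CATEGORIES = ["0-10", "11-25", "26-100", "more than 100"]

# Invented (category, reading score) records, for illustration only.
records = [
    ("0-10", 470), ("0-10", 490),
    ("11-25", 495), ("11-25", 505),
    ("26-100", 510), ("26-100", 530),
    ("more than 100", 525), ("more than 100", 545),
]


def mean_score_by_category(rows):
    """Average reading score for each books-in-the-home category."""
    scores = defaultdict(list)
    for category, score in rows:
        scores[category].append(score)
    return {c: sum(v) / len(v) for c, v in scores.items()}


means = mean_score_by_category(records)
for category in CATEGORIES:
    print(f"{category:>14}: {means[category]:.1f}")
```

A table like this supports only a "tend to" statement about groups. It describes an association and says nothing about which way, if any, an effect runs.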
37About Measuring School Quality
- What is it?
- Quality of teachers? Schools? Instruction?
Context? Interactions? Which is it?
- How can we measure it?
- Differential input
- No random assignment
- Ceiling and floor effect of instruments
38Contact Information
- By e-mail: IERInstitute_at_iea-dpc.de
- On the web: http://www.IERInstitute.org