Title: Addressing the Evaluation Gap
- Responding to the paper by William D. Savedoff and Ruth Levine, When Will We Ever Learn? Closing the Evaluation Gap, Center for Global Development (www.cgdev.org)
- There have been, and continue to be, multiple discussions concerning the evaluation of international development. They include some commonly agreed frames of reference (as we hope to discover here in Sussex), but they also include forces pulling in many divergent directions, or at least different interpretations of what form of impact evaluation is called for.
- Some attempt to address the complexities of increasingly integrated, multi-intervention, multi-donor national development assistance, including programs promoting human rights and advocating for policy change.
- Others call for a form of impact evaluation focused on rigorous research into more specific cause-effect relationships, whose findings can then inform subsequent project design.
- There are those who propose using randomized 'scientific' experimental research designs to evaluate 'the real impact' of development projects. Among such proponents is the MIT Poverty Action Lab (http://www.povertyactionlab.com/).
- Another is the Center for Global Development's "Evaluation Gap" Working Group. Their recently released report (http://www.cgdev.org/section/initiatives/_active/evalgap) is receiving high-profile attention, not only in the US but also in Europe, including a multi-national, multi-agency conference held in June at the Rockefeller Foundation center in Bellagio, Italy.
- There are many aspects of CGD's initiative that I believe we should applaud and support. These include (among others):
  - Pointing out that an evaluation gap exists because there are too few incentives to conduct good impact evaluations and too many obstacles
  - Calling for more financial and technical support for more rigorous evaluation
  - Advocating for more collaborative evaluations
- CGD's two main suggested solutions are:
  - The formation of an International Council to Catalyze Independent Impact Evaluations of Social Sector Interventions
  - The conducting of more rigorous impact evaluations (implying randomized experimental trials) [1]
- [1] In fairness, their proposals are more comprehensive than what I am highlighting here. But this points to an important methodological challenge.
- I suggest that those of us gathered here in Sussex consider responses to both of these:
  - Do we agree that there is a need for the proposed CGD-organized International Council?
  - If so, in what ways are we and the institutions we represent willing to collaborate with it?
  - Or are its proposed purposes (see the list under 'The International Council' below) already being adequately met by existing institutions or networks?
  - What is the role of randomized experimental trials among other evaluation designs?
The International Council
- Establish quality standards for rigorous evaluations
- Organize and disseminate information
- Identify priority topics
- Review proposals rapidly
- Build capacity to produce, interpret and use knowledge
- Create a directory of researchers
- Provide grants for impact evaluation design
- Create and administer a pooled impact evaluation fund
- Signal quality with a Seal of Approval
- Communicate with policymakers
Evaluation Designs
- Though I humbly acknowledge that this is a room full of experts, permit me to share the introduction to evaluation design that participants in my training workshops have found helpful. [1] It may help clarify the role of more rigorous evaluations (even randomized trials): when they are needed, and when they may be inappropriate or not feasible.
- Notation in the designs that follow: P = an observation of project participants, C = an observation of the comparison or control group, X = the intervention; numeric subscripts mark successive observation points (baseline, midterm, end of project, post project).
- [1] These designs are included in the book RealWorld Evaluation by Bamberger, Rugh and Mabry, published by Sage, February 2006.
Design 1: Post-test only of project participants

    X  P

- Project participants are observed only at the end-of-project evaluation.
Design 2: Pre+post of project, no comparison group

    P1  X  P2

- Project participants are observed at baseline (P1) and at the end-of-project evaluation (P2).
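To make the limitation of Design 2 concrete, here is a minimal Python sketch, using invented numbers purely for illustration: with no comparison group, the before-after difference bundles project impact together with everything else that changed over the same period.

    # Invented numbers, purely for illustration: mean values of some outcome
    # indicator for project participants at baseline (P1) and endline (P2).
    p1 = 42.0  # baseline observation, before the intervention (X)
    p2 = 55.0  # end-of-project observation

    # Design 2 can only report the before-after change. With no comparison
    # group, it cannot separate project impact from secular trends (e.g. a
    # good harvest year that would have raised the indicator anyway).
    change = p2 - p1
    print(f"Pre-post change (impact + everything else): {change:+.1f}")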
Design 3: Pre+post of project, post-only comparison group

    P1  X  P2
            C

- Project participants are observed at baseline (P1) and at the end-of-project evaluation (P2); the comparison group (C) is observed only at the end of the project.
Design 4: Quasi-experimental (pre+post, with comparison group)

    P1  X  P2
    C1      C2

- Both project participants (P) and a non-randomized comparison group (C) are observed at baseline and at the end-of-project evaluation.
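Design 4 supports a difference-in-differences estimate: the project group's change net of the comparison group's change. A minimal sketch, again with invented numbers used only for illustration:

    # Invented group means, for illustration: the same indicator measured in
    # the project group (P) and a non-randomized comparison group (C) at
    # baseline (subscript 1) and end of project (subscript 2).
    p1, p2 = 42.0, 55.0  # project group: baseline, endline
    c1, c2 = 41.0, 48.0  # comparison group: baseline, endline

    # Difference-in-differences: the project group's change net of the change
    # the comparison group experienced over the same period.
    impact = (p2 - p1) - (c2 - c1)
    print(f"Project change:    {p2 - p1:+.1f}")
    print(f"Comparison change: {c2 - c1:+.1f}")
    print(f"Estimated impact:  {impact:+.1f}")
    # Caveat: without random assignment this estimate is only credible if the
    # two groups would have followed parallel trends absent the project.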
Design 5: Randomized experimental (pre+post, with control group)

    P1  X  P2
    C1      C2

- Same observation points as Design 4, but research subjects are randomly assigned either to the project group or to the control group.
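What changes in Design 5 is the assignment mechanism, not the arithmetic. A minimal Python sketch of random assignment; the site names and group sizes are invented for illustration:

    import random

    random.seed(1)  # fixed seed so the illustration is reproducible

    # Hypothetical pool of eligible sites. Random assignment is what
    # distinguishes Design 5 from the quasi-experimental Design 4.
    sites = [f"village_{i:02d}" for i in range(20)]
    random.shuffle(sites)
    project_group = sites[:10]  # receive the intervention (X)
    control_group = sites[10:]  # serve as the control (C)

    print("Project group:", sorted(project_group))
    print("Control group:", sorted(control_group))

    # Because assignment is random, any baseline difference between groups
    # is due to chance alone, so the endline difference in means (or the
    # same difference-in-differences as in Design 4) estimates impact
    # without selection bias, subject to sampling error.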
Design 6: Longitudinal quasi-experimental

    P1  X  P2  X  P3  P4
    C1      C2     C3  C4

- Project participants and a comparison group are observed at baseline, midterm, end of project, and in a post-project evaluation.
Design 7: Randomized longitudinal experimental

    P1  X  P2  X  P3  P4
    C1      C2     C3  C4

- Same observation points as Design 6, but research subjects are randomly assigned either to the project group or to the control group.
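One payoff of the longitudinal designs is an impact estimate at each follow-up point, which shows whether effects emerge by midterm and whether they persist after the project ends. A small sketch with invented group means:

    # Invented group means at the four observation points of Designs 6 and 7:
    # baseline (1), midterm (2), end of project (3), post-project (4).
    project = {"P1": 42.0, "P2": 49.0, "P3": 55.0, "P4": 57.0}
    control = {"C1": 41.0, "C2": 44.0, "C3": 48.0, "C4": 50.0}

    # A longitudinal design yields a difference-in-differences estimate at
    # each follow-up point, tracing the trajectory of impact over time.
    for p, c in [("P2", "C2"), ("P3", "C3"), ("P4", "C4")]:
        impact = (project[p] - project["P1"]) - (control[c] - control["C1"])
        print(f"Estimated impact at {p}: {impact:+.1f}")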
How often are more rigorous evaluation designs actually used?
- Of the 67 projects included in the last bi-annual meta-evaluation conducted by CARE International, 50 (75%) used a posttest-only design without a comparison group (Design 1), and 12 used a pre+posttest of the project group only (Design 2). We suspect these figures are fairly typical of the evaluation designs actually used by INGOs and other development agencies.
- Baseline studies had actually been conducted for 19 of the projects that nevertheless received posttest-only evaluations. Reasons the baselines went unused included lack of access to the baseline data by the evaluators, lack of comparability (in terms of indicators and methodologies), questions regarding the quality of the baseline studies, and/or simple oversight by those conducting the evaluations.
We need to be clear on what we are defining as impact, and on the contributing causes required to achieve that impact.
- We do need tested hypotheses about which interventions and outputs have been shown to lead to which outcomes.
- But such research needs to be clear about the relevant conditions and about what other contributing factors were at work.
[Diagram: a cause-effect hierarchy of problems leading to high infant mortality rate]
- Children are malnourished
- Diarrheal disease
- Insufficient food
- Poor quality of food
- Need for improved health policies
- Need for strengthened capacity of health institutions
- Unsanitary practices
- Flies and rodents
- Facilities not used correctly
- People do not wash hands before eating
[Diagram: the mirrored hierarchy of objectives, leading to a lower infant mortality rate]
- More children are well nourished
- Less diarrheal disease
- Sufficient food
- Good quality of food
- Improved health policies
- Strengthened capacity of health institutions
- Sanitary practices
- Fewer flies and rodents
- Facilities used correctly
- People wash hands before eating
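The two diagrams pair each problem with a mirrored objective. As a sketch of how such a hierarchy can be made explicit before deciding what to measure, here is one possible nesting of the problem tree in Python; the parent-child links are my inference from the diagram, not a structure given in the slides:

    # One possible nesting of the problem hierarchy; the parent-child links
    # below are inferred for illustration, not given in the slides.
    problem_tree = {
        "High infant mortality rate": {
            "Children are malnourished": {
                "Insufficient food": {},
                "Poor quality of food": {},
                "Diarrheal disease": {
                    "Unsanitary practices": {
                        "People do not wash hands before eating": {},
                        "Facilities not used correctly": {},
                    },
                    "Flies and rodents": {},
                },
            },
            "Need for improved health policies": {},
            "Need for strengthened capacity of health institutions": {},
        }
    }

    def show(node: dict, depth: int = 0) -> None:
        """Print the hierarchy, indenting each contributing cause under its effect."""
        for effect, causes in node.items():
            print("  " * depth + effect)
            show(causes, depth + 1)

    show(problem_tree)

Making the hierarchy explicit in this way forces us to state which branches a given intervention actually addresses, which is exactly the clarity about contributing factors argued for above.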
What is the role of randomized experimental trials?
- I believe there are examples where they should be used to test interventions and to determine clear cause-effect relationships, which can then inform subsequent project design and evaluation.
- I solicit your suggestions of examples where they have been, or should be, used.