Title: Generalization from empirical studies
1 Generalization from empirical studies
- Tore Dybå: Session introduction (20 min.)
- Erik Arisholm: Generalizing results through a series of replicated experiments on software maintainability (20 min.)
- Jeff Carver: Methods and tools for supporting generalization (20 min.)
- Mini-group discussions (10 min.)
- Plenary discussion (20 min.)
- ISERN Meeting, Noosa Heads, Queensland, Australia
- 14-15 November, 2005
2 Generalization from Empirical Studies in SE
Session Introduction
- Tore Dybå
- SINTEF ICT
- tore.dyba_at_sintef.no
3 (Some of) the problem
- Empirical SE research often generalizes about software organizations as if they were all alike, or refrains from generalizing at all, as if they were all unique.
  - In the first case, it is never really clear that findings about organizations actually sampled apply to organizations not sampled.
  - With respect to the second, is there really any point in studying software organizations if one does not believe that common denominators exist among relatively large classes of organizations?
- We must become more concerned about the conditions under which our research findings are valid if our work is to be applied more widely.
4 Generalization is closely related to construct validity and external validity
- Construct validity
  - the degree to which inferences are warranted from the observed persons, settings, and cause and effect operations included in a study to the constructs that these instances might represent.
- External validity
  - the validity of inferences about whether the causal relationship holds over variations in persons, settings, treatment variables, and measurement variables.
W.R. Shadish, T.D. Cook, and D.T. Campbell
(2002) Experimental and Quasi-Experimental
Designs for Generalized Causal Inference,
Houghton Mifflin Company.
5 Statistical, sampling-based generalization
- The statistician's traditional two-step ideal of
  - the random selection of units for enhancing generalization, and
  - the random assignment of those units to different treatments for promoting causal inference
- is often advocated as the gold standard for empirical studies.
- However, this model is of limited utility for generalized causal inference in empirical SE because
  - it assumes that random selection and its goals do not conflict with random assignment and its goals
  - it is rarely relevant for making generalizations about systems, tasks, settings, treatments, and outcome variables
  - ethical, political, logistical, and economic constraints often limit random selection to less meaningful populations.
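The two-step ideal described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 200 developers, the sample size of 20, the treatment names, and the helper functions `random_selection` and `random_assignment` are all hypothetical and do not come from the presentation.

```python
import random


def random_selection(population, sample_size, rng):
    """Step 1: draw a simple random sample without replacement."""
    return rng.sample(population, sample_size)


def random_assignment(units, treatments, rng):
    """Step 2: shuffle the sampled units and deal them out round-robin,
    so that group sizes differ by at most one."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for index, unit in enumerate(shuffled):
        groups[treatments[index % len(treatments)]].append(unit)
    return groups


# Hypothetical sampling frame; a real study rarely has such a list.
population = [f"developer_{i:03d}" for i in range(200)]

rng = random.Random(2005)  # fixed seed only to make the sketch repeatable
sample = random_selection(population, 20, rng)
groups = random_assignment(sample, ["treatment_A", "treatment_B"], rng)

for treatment, members in groups.items():
    print(treatment, len(members))
```

Note that the sketch randomizes only over persons. Nothing in it samples tasks, systems, settings, or outcome measures, which is one of the limitations the slide points out.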
6 The painful problem of induction
- Hume's truism
  - In past experience, all tests have confirmed Theory 1.
  - Therefore, the next test will confirm Theory 1, or all tests will confirm Theory 1.
- induction or generalization is never fully justified logically. Whereas the problems of internal validity are solvable within the limits of the logic of probability of statistics, the problems of external validity are not logically solvable in any neat, conclusive way. Generalization always turns out to involve extrapolation into a realm not represented in one's sample. Such extrapolation is made by assuming one knows the relevant laws.
D.T. Campbell and J.C. Stanley (1963)
Experimental and Quasi-Experimental Designs for
Research, Houghton Mifflin Company, p. 17.
7 Yin's conception of generalization
[Figure: diagram of two levels of inference; recoverable labels follow]
- theory; rival theory
- Level-2 inference (analytical)
- case study findings; population characteristics; experimental findings
- Level-1 inference (statistical)
- sample; subjects
R.K. Yin (2003) Case Study Research: Design and Methods, Third Edition, Sage Publications.
8 Lee and Baskerville's framework
- Generalizing from empirical statements
  - to empirical statements (EE): generalizing from data to description
  - to theoretical statements (ET): generalizing from description to theory
- Generalizing from theoretical statements
  - to empirical statements (TE): generalizing from theory to description
  - to theoretical statements (TT): generalizing from concepts to theory
A.S. Lee and R.L. Baskerville (2003) Generalizing Generalizability in Information Systems Research, Information Systems Research, 14(3):221-243.
9 Shadish, Cook, and Campbell: Five principles of generalized causal inference
- Surface similarity: judging the apparent similarities between what was studied and the targets of generalization.
- Ruling out irrelevancy: identifying those attributes of persons, settings, treatments, and outcome measures that are irrelevant because they do not change a generalization.
- Making discriminations: making discriminations that limit generalization (e.g., from the lab to the field).
- Interpolation and extrapolation: interpolating to unsampled values within the range of the sampled persons, settings, treatments, and outcomes, and extrapolating beyond the sampled range.
- Causal explanation: developing and testing explanatory theories about the target of generalization.
W.R. Shadish, T.D. Cook, and D.T. Campbell
(2002) Experimental and Quasi-Experimental
Designs for Generalized Causal Inference,
Houghton Mifflin Company.
10 Summary
- Formal sampling-based methods are of limited use for generalizing from empirical SE studies.
  - specifically so for tasks, settings, treatments, and outcome measures
- Additionally, there's a dilemma between scientific validity (complying with Hume's truism) and practical impact (applying a theory in a new organizational setting).
- Although we should advocate the two-step model of random sampling followed by random assignment when it is feasible, we cannot advocate it as the model for generalized causal inference in SE.
- So, SE researchers must use other concepts and methods to explore generalization from empirical SE studies.
  - In fact, most SE researchers routinely make such generalizations without using formal sampling theory.
  - In the rest of this session we will attempt to make explicit the concepts and methods used in such work.
- We turn to examples of such alternative methods now.
11 Mini-group and plenary discussions
- Form mini-groups of three persons without leaving your chairs (first three, next three, etc.)
- Discuss the following two questions in the mini-groups for 10 minutes:
  - How do you generalize the results from YOUR studies?
  - How can you improve the validity of these generalizations?
- Plenary discussion based on viewpoints from the groups