Title: Week 2
1Week 2
2PART 1
- Scientific Theory: Its Nature and Utility
- Its Elements: Concepts and Definitions
3Naive Science and Theory
- People regularly observe events around them and
speculate about their causes.
- Personal observations frequently form the basis
of people's explanations of these events.
- In these instances, people are behaving like
scientists, in part. They are trying to
understand and explain events and predict
outcomes.
- But they are doing so without awareness of the
rules of science, hence the term naïve science.
4Naive Science and Theory
- As naïve scientists, we try to understand some
interesting situation in a way that will predict
or explain its operation.
- A Definition of Theory
- A set of interrelated constructs (concepts),
definitions, and propositions that present a
systematic view of phenomena by specifying
relations among variables, with the purpose of
predicting and explaining the phenomena
(Kerlinger, Foundations of Behavioral Research,
1986).
- Naïve science/understanding is a kind of
theory, but it could be considered mere
speculation. We'll use the term theory to mean
a simplified explanation of reality.
5Theory: Its Purpose and Components
- The goal: to predict and explain events.
- Important practical ramifications.
- A theory achieves prediction and explanation by
stating relationships between concepts, when they
are operationalized as variables.
- Variables: things that vary (take on different
intensities, values, or states).
- Concepts (or constructs): the mental image of
the thing which varies.
- Example: Fire is the concept; size, heat, or
other details about the fire are the variables
based on the concept.
6Naive Theory Building: An Example
- Jill decides to vacation at an ocean resort. The
first day at the beach, the water is warm and
great for swimming. The second day, it is very
cold. The next day, the water is again very warm.
This phenomenon (variability of the water
temperature) interests her, because she likes to
swim, but doesn't like cold water.
What is the cause of this day-to-day variation?
7Potential Contributing Factors
- The sun has been out each day. Jill reasons that
the sun can't be the cause of differing water
temperatures. She therefore doesn't include the
sun in her naïve theory.
- She observes the water very carefully each day
and notices the water is clearer on the days it
is cold, and murkier on the days that are better
for swimming.
- Jill can now predict whether swimming will be
good by observing the clarity of the water. We
can stop now if the goal is merely to pick the
best days to swim!
- But the identification of the pattern doesn't
explain why the water temperature should shift.
8Additional Factors to Consider
- Jill next notices a relationship between
variations in the prevailing wind direction on
the previous day and the water temperature the
next day. Days with winds out of the Northeast
are followed by days with cold water. Days with
winds from another direction are followed by warm
water.
- Why should wind direction affect water
temperature? She consults a map and improves her
naïve theory by adding some process or mechanism
to explain these events.
- Open ocean lies to the Northeast, while the bay
she swims in is protected on all other sides by
land. Thus, one possibility is that the Northeast
winds may blow colder deep ocean waters (which
are clearer, as less algae grow in cold
temperatures) into the bay.
9The Beginnings of Theory Development
- Jill has identified variables (bay water
temperature, bay water clarity, and wind
direction) and specified relationships among
them.
- She is likely to call one of these variables the
cause, and the other two variables the effects.
- Jill now has an intuitive idea of what
constitutes a causal relationship. It is:
- A specific condition of a variable (Northeast
wind) which occurs earlier in time than a
corresponding condition of another variable (cold
water), combined with some reasonable explanation
for the relationship between these two variables
(the nature of the geography of the region).
10Is Jill finished?
- Given the data thus far, it is too early to
accept the proposed causal solution. So what's
next?
- Jill should collect more data, so she extends her
vacation for a month and continues her
observations.
- If the pattern continues, we can be increasingly
certain that the relationship accurately reflects
reality. More evidence can improve the
probability that Jill's theory is true.
- Naïve scientists will consider their personal
observations to be sufficient to construct a
completed theory. For the true scientist,
personal observations are only the beginning.
- The scientific method: a highly formalized,
systematic, and controlled approach to theory
development and testing.
11Testing Theories: Naïve Science vs. Science
- Naive scientists are likely to be satisfied with
Jill's evidence because it is self-evident,
is common sense, or is what any reasonable person
would conclude.
- It is important to rule out alternative
explanations (competing causes of the phenomena)
by building controls into the experimental
design.
- There are many procedures to guard against biased
testing of theories:
- Randomization (random selection and random
assignment).
- Appropriate research design and methodology.
- Valid and reliable instrumentation.
- Statistical procedures.
12Methods of Knowing (fixing belief)
- Method of Tenacity: Least sophisticated, but
commonly used. Establishes explanations by
asserting that something is true because it is
commonly known to be true. Occurs entirely within
a given individual and is therefore subject to
that individual's beliefs, values, and idiosyncrasies.
Surprisingly resistant to contrary evidence.
- Method of Authority: Truth is established when
something or someone held in high regard states
the truth. Relies on the actual truth of the
expert or source. Widespread in marketing.
Potentially dangerous.
- Method of Reasonable Men (a priori method): Relies
on the idea that the propositions are
self-evident or reasonable. Criterion for fixing
belief lies in the reasonableness of the
argument and how reasonable is defined. May
agree with reason but not the observable facts.
- Scientific Method: The critical shift is that all three
previous methods are focused inward. Science
shifts the locus of truth from single individuals
to groups, by establishing mutually agreed-upon
rules for establishing truth.
13Basic Requirements of the Scientific Method
- The Use and Selection of Concepts
- Linking Concepts by Propositions
- Testing Theories with Observable Evidence
- Defining Concepts
- Publication of Definitions and Procedures
- Control of Alternative Explanations
- Unbiased Selection of Evidence
- Reconciliation of Theory and Observation
- Limitations of the Scientific Method
14- The Use and Selection of Concepts: First,
develop a verbal (conceptual) description or name
for the events. Here we seek to explain events
by linking two concepts: a cause to an
effect. Scientists arrive at causally related
concepts through a thorough review of previous
research, by using logical deduction, and by
insight and personal observations.
- Linking Concepts by Propositions: To explain a
phenomenon, we must specify the functional
mechanism whereby changes in variable A (a
cause) should lead to changes in some variable
B (an effect). Such a functional statement
distinguishes between causal relationships (that
have such an explanation) and covariance
relationships (that do not).
- Testing Theories with Observable Evidence: No
theory is regarded as probable truth until it has
been empirically tested against some observable
reality.
15- Defining Concepts: Testing theory with some
observable evidence generates this requirement: we
must bridge the gap between theory (stated at a
high level of abstraction) and observation (which
occurs at a very concrete level).
- The gap is bridged by defining both the meanings
of concepts and the indicators or measures used
to capture those meanings, a process that
produces an operational definition.
- An operational definition adds three things to
the theoretical definition:
- - Describes the unit of measurement
- - Specifies the level of measurement
- - Provides a mathematical or logical statement
that clearly states how measurements are to be
made and combined to create a single value for
the abstract concept.
17- Publication of Definitions/Procedures: The
scientific method is public. All other
researchers need to have the ability to carry out
the same procedures to arrive at the same
conclusions. This requires that we be as explicit and
objective as possible in stating and publicizing
definitions/procedures.
- Control of Alternative Explanations: Scientific
studies must be designed to rule out alternative
causes. Isolating a true causal variable means
that these other confounding variables have to be
identified and their effects eliminated or
controlled.
- Unbiased Selection of Evidence: The decision to
accept a theory as probably true or probably
false will be based on observations of limited
evidence (e.g., a few hundred college students).
Generalizing results beyond the (limited) study
sample requires the evidence to be selected so as
to eliminate biases and be representative of some
broader population.
18- Reconciliation of Theory and Observation: The degree
of agreement between what theory predicts we
should observe and what we actually do observe is the
basis of the self-correcting nature of this
iterative approach.
- Limitations of the Scientific Method: The scientific
method cannot be used when objective observation
is not possible (e.g., determining whether a
social policy is good or bad, if objective
measurement of good and bad is not possible).
Basic beliefs or assumptions are not testable
propositions, as they can never be disproved, and
thus cannot be investigated scientifically.
19PART 2
- Types of Relationships
- Testing Hypotheses: Confounds and Controls
20Types of Relationships: Null, Covariance, Causal
- Null relationship: No relationship at all.
Concepts operate independently of each other.
- Covariance relationship: Concepts vary together
(directly or inversely).
- Causal relationship: Concepts covary (are
related), changes in one concept precede changes
in the other concept, and a causal relationship
between the two (a cause and an effect) can be
justified logically.
22- Covariance relationships can provide prediction,
but not (necessarily) a valid explanation of the
relationship.
- Accepting covariance relationships as true
without empirical testing fails to identify
spurious relationships. Two variables may covary
because they are both the effects of a common
cause.
23The unobserved, but real, causal variable (Amount
of Education) is termed a confounding variable,
since it may mislead us by creating the
appearance of a relationship between the observed
variables.
24Covariation vs. Causality: Key Differences
- Covariance alone does not imply causality.
- Covariance merely means that a change in one
variable is associated with a change in the other
variable.
- Causality requires that a change in one variable
(IV) creates the change in the other (DV).
- Covariance is 1 of 4 conditions that must be met.
The other three are:
- Spatial Contiguity (connected in the same time
and space).
- Temporal Ordering (change in the IV occurs before
the change in the DV).
- Necessary Connection (statement specifying why
the cause can bring about a change in the
effect).
25Covariation vs. Causality: An Example
- Consider the example of an observed relationship
between the first letter of a person's last name and
the person's exam grade.
- Spatial Contiguity requirement? Yes.
- The name and the exam score both exist within the
same person.
- Covariance requirement? Yes.
- Last names A through M scored lower than the
others.
- Temporal Ordering requirement? Yes.
- A person's last name was established before exams
were taken.
- Necessary Connection requirement? Not so fast.
- Is there a sensible reason why a person's last
name should create different levels of
performance on an exam?
26Covariation vs. Causality: An Example
We expect persons with higher incomes to read
more newspapers (covariation) because the income
provides the purchasing power and leisure time
for such readership (necessary connection).
We expect older persons will read more newspapers
(covariation) for two reasons: they have fewer
children at home and thus more leisure time, and
they developed the habit of reading before the
dominance of TV and the Internet (necessary
connection).
27Spurious Relationships
A city's ice cream sales are found to be highest
when the rate of drownings in the city's swimming
pools is highest. To allege that ice cream sales
cause drowning, or vice-versa, would be to imply
a spurious relationship between the two. In
reality, a third variable, in this instance a
heat wave, more likely caused both.
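
A minimal simulation sketch of the ice cream example, using invented numbers: temperature is made to drive both ice cream sales and drownings, and neither outcome influences the other. The helper names (`pearson_r`, `residuals`) and all coefficients are assumptions for illustration only. The raw correlation between the two outcomes comes out large, but it largely disappears once the linear effect of temperature is removed from each (a simple form of statistical control).

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical data: temperature (the confound) causes both outcomes.
temperature = [rng.uniform(15, 35) for _ in range(2000)]
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1) for t in temperature]

raw_r = pearson_r(ice_cream_sales, drownings)
controlled_r = pearson_r(residuals(ice_cream_sales, temperature),
                         residuals(drownings, temperature))

print(f"raw r = {raw_r:.2f}")
print(f"r after removing temperature = {controlled_r:.2f}")
```

The second number is a partial correlation; it is only as good as the assumption that temperature is the one common cause and that its effect is linear.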
28Testing Hypotheses: Confounds and Controls
- Life would be simpler if every effect variable
(DV) had only one cause!
- This is hardly ever the case. It becomes difficult to sort
out how variables affect each other.
- An observed covariance relationship between two
variables could occur because of some real
relationship or due to the spurious effect of a
third confounding variable.
- Suppose we are interested in determining whether
there is a real relationship between Exposure to
Movie Violence and the Number of Violent Acts
committed by adolescents.
29- If we ignore, or are unaware of, the confounding
variable (Predisposition to Violence) we may
erroneously conclude that all change in the
number of Acts of Violence is due to the direct
action of level of Exposure to Movie Violence.
30Controlling for Confounding Variables
- Identifying Control Variables
- Internal Validity, External Validity, and
Information
- Methods for Controlling Confounding Variables
31Identifying Control Variables
32Internal Validity, External Validity, and
Information
- Internal Validity: the extent to which we can be
sure that no confounding variables have obscured
the true relationship between the variables in
the hypothesis test, that is, that a change in the IV
causes a change in the DV.
- External Validity: the ability to generalize from
results of a study to the real world.
- Information: pertains to the amount of
information we can obtain about any confounding
variable and its relationship with the relevant
variables.
33Methods of Controlling Confounding Variables
- Manipulated Control: we eliminate the effect of a
confounding variable by not allowing it to vary
(e.g., selecting and/or matching subjects on
potentially important confounding variables).
- Statistical Control: we build the confounding
variable(s) into the research design as
additional measured variables.
- Randomization: we randomly assign study participants
to the experimental groups or conditions so that
the potential effects of confounding variables
are distributed equally among the groups.
34Manipulated Control: Eliminating effects of
confounding variables through research design and
sampling decisions
- Example
- A researcher investigating the effects of seeing
justified violence in video games on children
knows that young children cannot interpret the
motives of characters accurately. She decides to
limit her study to older children only, to
eliminate random responses or unresponsiveness of
younger children.
35Statistical Control: Confounding variables are
measured; mathematical procedures are used to remove
their effects
- Example
- A political communication researcher interested
in studying emotional appeals versus rational
appeals in political commercials suspects that
the effects vary with the age of the viewer. She
measures age, and uses it as an independent
predictor to isolate, describe, and remove its
effect.
36Randomization: Unknown sources of error are
equalized by randomly assigning subjects to
research conditions
- Example
- Many different factors are known to affect the
amount of use of Internet social networking
sites. A researcher wants to test two different
site designs. He randomly assigns subjects to
work with each of the two designs. This approach
aims to distribute the amount of confounding
error from unknown factors equally across groups.
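
A minimal sketch of random assignment for the two-design example. The participant data are invented, and `weekly_hours_online` stands in for a confounding factor the researcher never measures; the function name `randomly_assign` is likewise hypothetical. The point is only that shuffling before splitting tends to give the two groups similar averages on that factor, without the researcher having to know what it is.

```python
import random

def randomly_assign(participants, rng):
    """Shuffle a copy of the participant list and split it into two halves."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def mean_hours(group):
    """Average of the (unmeasured) confound within one group."""
    return sum(hours for _, hours in group) / len(group)

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical participants: (participant_id, weekly_hours_online).
participants = [(i, rng.uniform(0, 40)) for i in range(1000)]

design_a, design_b = randomly_assign(participants, rng)

print(f"design A: n={len(design_a)}, mean hours={mean_hours(design_a):.1f}")
print(f"design B: n={len(design_b)}, mean hours={mean_hours(design_b):.1f}")
```

With small samples the two means can still differ noticeably by chance, which is the "disproportionate outcomes occasionally" caveat that applies to any random process.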
37Methods of Controlling Confounding Variables: A
summary
- Manipulated and statistical control give high
internal validity, while randomization is a bit
weaker.
- Statistical control and randomization give high
external validity, while manipulated control is
weaker.
- The key difference between randomization and the
other techniques is that randomization doesn't
involve identifying/measuring the confounding
variables.
- A major advantage of randomization is that we can
assume that all confounding variables have been
controlled to a certain extent, but any random
process will result in disproportionate outcomes
occasionally. Randomization also provides little
information about the action of any confounding
variables.
38PART 3
- Classes of Research Variables
- Measurement: The Foundation of Scientific
Inquiry
- Essential Elements of Research: Reliability,
Validity, Control and Importance
39Classes of Research Variables: Variables
defined by their use in research
40Classes of Research Variables: Levels of
Measurement
Depending on our operational definition, a
measurement can give us differing kinds of
information about a theoretical concept.
- Nominal. A variable made up of discrete,
unordered categories. Each category is either
present or absent, and categories are mutually
exclusive and exhaustive (e.g., gender).
- Ordinal. A variable for which different values
indicate a difference in the relative amount of
the characteristic being measured. It is not always
possible to determine the absolute distance
between adjacent categories.
- Interval. A variable for which equal intervals
between variable values indicate equal
differences in amount of the characteristic being
measured.
- Ratio. Ratios between measurements as well as
intervals are meaningful because there is a
true starting point (zero).
41Nominal Measurement: An Example
A nominal measurement makes a simple distinction
between the presence or absence of the
theoretical concept within the unit of analysis.
Theoretical concepts can have more than two
nominal response categories (nominal factors) as
in the example below.
42Ordinal Measurement: An Example
Categories of a nominal level variable cannot be
arranged in any order of magnitude. By adding
ordering by quantity to the definition of the
categories, the sensitivity of our observations
is improved.
Example: Subjects in a study are asked to sort a
stack of photographs according to their physical
attractiveness so that the most attractive photo
is on top and the least attractive photo is on
the bottom. This introduces the general idea of
comparative similarity in observations. We can
now say that the 2nd photo in the stack is more
attractive to the subject than all the photos
below it, but less attractive than the photo on
top of the pile. We can assign an
attractiveness score to each photo by
numbering, starting at the top of the pile
(1 = most attractive, 2 = second most attractive,
etc.). This is called a rank order measurement.
43With ordinal measurement, we cannot determine the
absolute distance between adjacent categories.
Suppose we knew the real attractiveness scores
of the photos for two subjects. Although their
real evaluations of the photos are quite
different, they rank the comparative
attractiveness identically.
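
A small sketch of the point above, with made-up "real" attractiveness scores for two hypothetical subjects. The scores and the helper `rank_order` are assumptions for illustration. Converting scores to ranks discards the distances between photos, so two quite different sets of evaluations produce the same ordinal data.

```python
def rank_order(scores):
    """Rank 1 for the highest score, 2 for the next, and so on (assumes no ties)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {photo: position for position, photo in enumerate(ordered, start=1)}

# Hypothetical underlying evaluations on a 0-100 scale.
subject_1 = {"photo_a": 95, "photo_b": 90, "photo_c": 20}
subject_2 = {"photo_a": 60, "photo_b": 35, "photo_c": 30}

ranks_1 = rank_order(subject_1)
ranks_2 = rank_order(subject_2)

print(ranks_1)
print(ranks_2)
print("identical rankings:", ranks_1 == ranks_2)
```

Subject 1 sees photos a and b as nearly equal and c as far behind; subject 2 sees a as far ahead and b and c as nearly equal. The rank order measurement cannot tell these apart.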
44Interval Measurement: An Example
- If we can rank order observations and assign them
numerical scores that register the degree of
distance between observations or points on the
measurement scale, we have improved the level of
measurement to interval-level.
- Interval scales are numerical scales in which
intervals have the same interpretation
throughout. As an example, consider the
Fahrenheit scale of temperature. The difference
between 30 degrees and 40 degrees represents the
same temperature difference as the difference
between 80 degrees and 90 degrees. This is
because each 10-degree interval has the same
physical meaning (in terms of the kinetic energy
of molecules).
- Interval scales are not perfect, however. In
particular, they do not have a true zero point.
45Scales of Measurement
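Levels of Measurement: Nominal, Ordinal, Interval, Ratio
- Nominal. Examples: diagnostic categories, brand
names, political or religious affiliation.
Properties: identity. Mathematical operations:
none. Type of data: nominal. Typical statistics:
chi-square.
- Ordinal. Examples: socioeconomic class, ranks.
Properties: identity, magnitude. Mathematical
operations: rank order. Type of data: ordered.
Typical statistics: Mann-Whitney U-test.
- Interval. Examples: test scores, personality and
attitude scales. Properties: identity, magnitude,
equal intervals. Mathematical operations: add,
subtract. Type of data: score. Typical statistics:
t-test, ANOVA.
- Ratio. Examples: weight, length, reaction time,
number of responses. Properties: identity,
magnitude, equal intervals, true zero point.
Mathematical operations: add, subtract, multiply,
divide. Type of data: score. Typical statistics:
t-test, ANOVA.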
46Evaluating Measures: Effective
Range
47Essential Elements of Measurement: Reliability,
Validity, Control and Importance
48Types of Reliability
- Test-retest Reliability
- Consistency of measurement over time
- Internal Consistency
- Inter-item correlation
- Interrater Reliability
- Level of agreement between independent
observers of behavior(s). Assessed via
correlation or the percent agreement
procedure:
Percent agreement = Agreements / (Agreements + Disagreements) x 100
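
A minimal sketch of the percent agreement calculation for interrater reliability, reading the procedure as agreements divided by agreements plus disagreements, times 100. The behavior codes and the two raters' records are invented for illustration, and `percent_agreement` is a hypothetical helper name.

```python
def percent_agreement(rater_a, rater_b):
    """Percent of observations on which two raters recorded the same code."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must code the same observations")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    disagreements = len(rater_a) - agreements
    return 100 * agreements / (agreements + disagreements)

# Hypothetical codes for ten observation intervals.
rater_a = ["hit", "push", "none", "hit", "none", "push", "none", "hit", "none", "none"]
rater_b = ["hit", "push", "none", "none", "none", "push", "hit", "hit", "none", "none"]

agreement = percent_agreement(rater_a, rater_b)
print(f"percent agreement = {agreement:.0f}%")
```

Percent agreement does not correct for agreement that would occur by chance, which is one reason correlation-based measures are listed as the alternative.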
49Types of Validity
- Face validity. The (non-empirical) degree to
which a test appears to be a sensible
measure.
- Content validity. The extent to which a
test adequately samples the domain of
information, knowledge, or skill that it purports
to measure.
- Criterion validity. Now (concurrent)
and later (predictive). Involves determining the
relationship (correlation) between the predictor
(IV) and the criterion (DV).
- Construct validity.
The degree to which the theory or theories behind
the research study provide(s) the best
explanation for the results observed.
50Internal vs. External Validity
- Internal Validity
- Extent to which the causal/independent variable(s),
and no other extraneous factors, caused the change
being measured.
- External Validity (generalizability)
- Degree to which the results and conclusions of
your study would hold for other persons, in other
places, and at other times.
51Threats to Internal Validity: Factors that reduce
our ability to draw valid conclusions
- Selection
- History
- Maturation
- Repeated Testing
- Instrumentation
- Regression to the mean
- Subject mortality
- Selection-interactions
- Experimenter bias
52Reducing Threats to Internal Validity
The role of Control: Behavior is influenced by
many factors, termed confounding variables, that
tend to distort the results of a study, thereby
making it impossible for the researcher to draw
meaningful conclusions. Some of these may be
unknown to the researcher. Control refers to the
systematic methods (e.g., research designs)
employed to reduce threats to the validity of the
study posed by extraneous influences on both the
participants and the observer (researcher).
53Group/Selection threat
- Occurs when nonrandom procedures are used to
assign subjects to conditions or when random
assignment fails to balance out differences among
subjects across the different conditions of the
experiment.
- Example
- A researcher is interested in determining the
factors most likely to elicit aggressive behavior
in male college students. He exposes subjects in
the experimental group to stimuli thought to
provoke aggression and subjects in the control
group to stimuli thought to reduce aggression and
then measures aggressive behaviors of the
students. How would the selection threat operate
in this instance?
54History threat
- Events that happen to participants during the
research which affect results but are not linked
to the independent variable.
- Example
- The reported effects of a program designed to
improve medical residents' prescription writing
practices by the medical school may have been
confounded by a self-directed continuing
education series on medication errors provided to
the residents by a pharmaceutical firm's medical
education liaison.
55Maturation threat
- Can operate when naturally occurring biological
or psychological changes occur within subjects
and these changes may account in part or in total
for effects discerned in the study.
- Example
- A reported decrease in emergency room visits in
a long-term study of pediatric patients with
asthma may be due to subjects outgrowing
childhood asthma rather than to any treatment
regimen introduced to treat the asthma.
56Repeated testing threat
- May occur when changes in test scores occur not
because of the intervention but rather because of
repeated testing. This is of particular concern
when researchers administer identical pretests
and posttests.
- Example
- A reported improvement in medical resident
prescribing behaviors and order-writing practices
in the study previously described may have been
due to repeated administration of the same short
quiz. That is, the residents simply learned to
provide the right answers rather than truly
achieving improved prescribing habits.
57Instrumentation threat
- When study results are due to changes in
instrument calibration or observer changes rather
than to a true treatment effect, the
instrumentation threat is in operation.
- Example
- In Kalsher's Experimental Methods and Statistics
course, he evaluates students' progress in
understanding principles of research design at
week 3 of the semester. A graduate T.A.
evaluates the students at the conclusion of the
course. If the evaluators are dissimilar enough
in their approach, perhaps because of lack of
training, this difference may contribute to
measurement error in trying to determine how much
learning occurred over the semester.
58Statistical Regression threat
- The regression threat can occur when subjects
have been selected on the basis of extreme
scores, because extreme (low and high) scores in
a distribution tend to move closer to the mean
(i.e., regress) in repeated testing.
- Example
- If a group of subjects is recruited on the basis
of extremely high stress scores and an
educational intervention is then implemented, any
improvement seen could be due partly, if not
entirely, to regression to the mean rather than
to the coping techniques presented in the
educational program.
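
A small simulation sketch of regression to the mean, under assumed numbers: each person's observed stress score is a stable true level plus independent measurement noise on each testing occasion, and no intervention of any kind is applied. All distribution parameters are invented. Subjects selected for extremely high first-test scores still score lower, on average, on the second test.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical model: observed score = true stress level + random noise.
true_stress = [rng.gauss(50, 10) for _ in range(5000)]
test_1 = [t + rng.gauss(0, 10) for t in true_stress]
test_2 = [t + rng.gauss(0, 10) for t in true_stress]  # no treatment in between

# Recruit only the top 10% of scorers on the first test.
cutoff = sorted(test_1)[len(test_1) * 9 // 10]
selected = [i for i, score in enumerate(test_1) if score >= cutoff]

mean_1 = sum(test_1[i] for i in selected) / len(selected)
mean_2 = sum(test_2[i] for i in selected) / len(selected)

print(f"selected subjects: {len(selected)}")
print(f"mean at test 1 = {mean_1:.1f}")
print(f"mean at test 2 = {mean_2:.1f} (with no intervention)")
```

The drop happens because extreme first scores are partly extreme noise, and the noise does not repeat. A control group selected the same way would show the same drop, which is how a real study could separate it from a treatment effect.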
59Experimental Mortality threat
- Experimental mortality (also known as attrition,
withdrawals, or dropouts) is problematic when
there is a differential loss of subjects from
comparison groups subsequent to randomization,
resulting in unequal groups at the end of a
study.
- Example
- Suppose a researcher conducts a study to compare
the effects of a corticosteroid nasal spray with
a saline nasal spray in alleviating symptoms of
allergic rhinitis (irritation and inflammation of
the nasal passages). If subjects with the most
severe symptoms preferentially drop out of the
active treatment group, the treatment may appear
more effective than it really is.
60Selection Interaction threats
- A family of threats to internal validity
produced when a selection threat combines with
one or more of the other threats to internal
validity. When a selection threat is already
present, other threats can affect some
experimental groups, but not others.
- Example
- If one group is dominated by members of one
fraternity (selection threat), and that
fraternity has a party the night before the
experiment (history threat), the results may be
altered for that group.
61Threats to External Validity: Ways you might be
wrong in making generalizations
- People, Places, and Times
- Demand Characteristics
- Hawthorne Effects
- Order Effects (or carryover effects)
62People threat: Are the results due to the unusual
type of people in the study?
Example: You learn that the grant you submitted
to assess average drinking rates among college
students in the U.S. has been funded. In late
November, you post an announcement about the
study on campus to get subjects for the study.
100 students sign up for the study. Of these, 78
are members of campus fraternities; the other 22
are members of the school's football team.
63Places threat: Did the study work because of the
unusual place you did the study in?
Example: Suppose that you conduct an
educational study in a college town with lots
of high-achieving, educationally oriented kids.
64Time threat: Was the study conducted at a
peculiar time?
Example: Suppose that you conducted a smoking
cessation study the week after the U.S. Surgeon
General issued the well publicized results of the
latest smoking and cancer studies. In this
instance, you might get different results than if
you had conducted the study the week before.
65Demand Characteristics
- Participants are often provided with cues to the
anticipated results of a study.
- Example
- When asked a series of questions about
depression, participants may become wise to the
hypothesis that certain treatments may work
better in treating mental illness than others.
When participants become wise to anticipated
results (termed a placebo effect), they may begin
to exhibit performance that they believe is
expected of them.
- Making sure that subjects are not aware of
anticipated outcomes (termed a blind study)
reduces the possibility of this threat.
66Hawthorne Effects
- This is similar to a placebo effect: research has found that
the mere presence of others watching a person's
performance causes a change in that
performance. If this change is significant, can
we be reasonably sure that it will also occur
when no one is watching?
- Addressing this issue can be tricky, but
employing a control group to measure the
Hawthorne effect in those not receiving any
treatment can be very helpful. In this sense,
the control group is also being observed and will
exhibit changes in behavior similar to those of the
experimental group, thereby negating the
Hawthorne effect.
67Order Effects (carryover effects)
- Order effects refer to the order in which
treatment is administered and can be a major
threat to external validity if multiple
treatments are used.
- Example
- If subjects are given medication for two months,
therapy for another two months, and no treatment
for another two months, it would be possible, and
even likely, that the level of depression would
be least after the final no treatment phase.
Does this mean that no treatment is better than
the other two treatments? It likely means that
the benefits of the first two treatments have
carried over to the last phase, artificially
elevating the no treatment success rates.
68PART 4
- Describing data: Measures of Central Tendency and
Dispersion
- The Role of Variance
69Describing Data
Measures of Central Tendency
- Mean (the average)
- Median (the middle number)
- Mode (the most frequently occurring number)
Measures of Dispersion
- Range
- Standard Deviation (square root of the variance)
- Variance (the average squared deviation from the mean)
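
A short sketch of these measures using Python's standard `statistics` module on a made-up set of scores. Because the slide defines variance as the average squared deviation from the mean, the sketch uses the population versions (`pvariance`, `pstdev`); the sample versions (`variance`, `stdev`), which divide by n - 1, would give slightly larger values.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion
value_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)  # average squared deviation from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, variance={variance}, standard deviation={std_dev}")
```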
70The Role of Variance
- In an experiment, IV(s) are manipulated to
cause variation between experimental and control
conditions.
- Experimental design helps control
extraneous variation: the variance due to factors
other than the manipulated variable(s).
Sources of Variance
- Systematic between-subjects variance
- - Experimental variance due to manipulation of
the IV(s) (The Good Stuff)
- - Extraneous variance due to confounding
variables
- - Natural variability due to sampling error
- Non-systematic within-groups variance
- - Error variance due to chance factors
(individual differences) that affect some
participants more than others within a group
(The Not-So-Good Stuff)
71Separating Out The Variance
- SST: Sums of Squares Total
- SSM: Sums of Squares Model
- SSR: Sums of Squares Residual (Error)
SST = SSM + SSR
72Controlling Variance in Experiments
- In experimentation, each study is designed to:
- Maximize experimental variance.
- Control extraneous variance.
- Minimize error variance.
- Good measurement
- Manipulated and Statistical control
73Test Statistics
Essentially, most test statistics are of the
following form:
Test statistic = Systematic variance /
Unsystematic variance
Test statistics are used to estimate the
likelihood that an observed difference is real
(not due to chance), and are usually accompanied
by a p value (e.g., p < .05, p < .01, etc.)
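One familiar statistic of this form is the F ratio from a one-way analysis of variance, which divides between-groups (systematic) variance by within-groups (unsystematic) variance. The sketch below is an illustration on hypothetical scores; the function name `f_ratio` is an assumption, and it deliberately stops short of a p value, which would require the F distribution from a statistics library.

```python
def f_ratio(groups):
    """Between-groups mean square divided by within-groups mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # Systematic variation: group means around the grand mean.
    ss_model = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Unsystematic variation: scores around their own group mean.
    ss_resid = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_model = len(groups) - 1
    df_resid = len(all_scores) - len(groups)
    return (ss_model / df_model) / (ss_resid / df_resid)


# Hypothetical data for a control and a treatment condition.
control = [3, 4, 5]
treatment = [6, 7, 8]
f_value = f_ratio([control, treatment])
print(f_value)
```

A larger ratio means the differences between conditions are large relative to the spread within conditions.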
74A Very Simple Statistical Model
- outcome_i = (model) + error_i
- model: an equation made up of variables and
parameters
- variables: measurements from our research (X)
- parameters: estimates based on our data (b)
- outcome_i = (b X_i) + error_i
- outcome_i = (b_1 X_1i + b_2 X_2i + b_3 X_3i) + error_i
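To make the outcome = model + error idea concrete, here is a minimal sketch that estimates the parameters of a one-predictor model by ordinary least squares. The data are made up, and the intercept `b0` is an added assumption: the slide's single-predictor form shows only the slope term b.

```python
# Hypothetical measurements (X) and outcomes, for illustration only.
x = [1, 2, 3, 4, 5]
outcome = [2.1, 3.9, 6.2, 7.8, 10.1]

mean_x = sum(x) / len(x)
mean_y = sum(outcome) / len(outcome)

# Least-squares estimates of the parameters.
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, outcome)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
b0 = mean_y - b1 * mean_x  # intercept (an assumption beyond the slide's form)

# Each outcome splits into a model prediction plus an error term.
predicted = [b0 + b1 * xi for xi in x]
errors = [yi - pi for yi, pi in zip(outcome, predicted)]

print(b0, b1, errors)
```

Adding more predictors extends the model part in the same way, with one estimated b per variable.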
75Types of Mistakes
                       True state of null hypothesis
Statistical decision   H0 true         H0 false
Reject H0              Type I error    Correct
Don't reject H0        Correct         Type II error
76Statistical Power
- A measure of how well Type II errors have been
avoided (i.e., how well a test is able to find an
effect)
- Power = 1 - Type II error rate (1 - β)
- Power should be 0.8 or higher, so the Type II
error rate should not exceed .20.
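The relationship between power and the Type II error rate can be sketched with a normal approximation for a two-sided comparison of two equal-sized groups. Everything specific here is an assumption for illustration: the function name `approx_power`, the standardized effect size d = 0.5, and the group sizes. A t-based calculation would give slightly lower power at small sample sizes.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z test.

    d is the standardized mean difference; the normal approximation
    is an assumption and is slightly optimistic for small samples.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * (n_per_group / 2) ** 0.5  # expected value of the z statistic
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power_small_n = approx_power(d=0.5, n_per_group=20)
power_large_n = approx_power(d=0.5, n_per_group=64)

for label, power in [("n=20", power_small_n), ("n=64", power_large_n)]:
    print(label, "power:", round(power, 3), "Type II error rate:", round(1 - power, 3))
```

Under these assumptions, the larger sample reaches the 0.8 guideline while the smaller one falls well short of it.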
77Effect Sizes The Correlation coefficient
The statistical test only tells us whether it is
safe to conclude that the means come from
different populations. It doesn't tell us
anything about how strong these differences are.
So, we need a standard metric to gauge the
strength of the effects. The correlation
coefficient (r) is one metric for gauging effect
size.
- Ranges from 0 to 1 (no effect to perfect effect)
- Rough cutoffs (nonlinear; that is, twice the r
value doesn't necessarily mean twice the effect)
- 0.10: small effect (explains 1% of the variance)
- 0.30: medium effect (explains 9% of the
variance)
- 0.50: large effect (explains 25% of the variance)
78Effect Sizes The coefficient of determination
The statistical test only tells us whether it is
safe to conclude that the means come from
different populations. It doesn't tell us
anything about how strong these differences are.
So, we need a standard metric to gauge the
strength of the effects. r2 (r-Square), or the
Coefficient of Determination, is one metric for
gauging effect size.
Rules of Thumb regarding effect sizes
- Small effect: 1-3% of the total variance
- Medium effect: 10% of the total variance
- Large effect: 25% of the variance
r2 = SSM / SST
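The link between r, r2, and the sums of squares can be checked numerically. The sketch below uses made-up paired scores and a straight-line model fitted by least squares (both assumptions for illustration); for such a model, squaring the correlation gives the same number as SSM divided by SST.

```python
# Hypothetical paired measurements, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

# Pearson correlation coefficient.
r = sxy / (sxx * syy) ** 0.5

# Sums of squares for the least-squares line through the same data.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]
ssm = sum((p - mean_y) ** 2 for p in predicted)  # variation explained by the model
sst = syy                                        # total variation

r_squared = ssm / sst
print(r, r_squared)
```

Here r is 0.8 but the proportion of variance explained is 0.64, which illustrates why the r cutoffs are described as nonlinear.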
79Reporting Statistical Models
- APA recommends exact p-values for all reported
results; best to include an effect size, too
- Effect x was not statistically significant in
condition y, p = .24, d = .21
- Report a mean and the upper and lower boundaries
of the confidence interval as M = 30, 95% CI
[20, 40]
- If all confidence intervals you are reporting are
95%, it's acceptable to say so and then later say
something like: In this condition, effect x
increased, M = 30 [20, 40].
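When many results have to be reported, small helper functions can keep the formatting consistent. The helpers below are hypothetical (the names `format_p` and `format_mean_ci` are assumptions, not part of any APA tool). They drop the leading zero from p, fall back to "p < .001" for very small values, and switch to three decimals below .01, which is a formatting choice rather than something the slide specifies.

```python
def format_p(p):
    """Format an exact p value without a leading zero, e.g. 'p = .24'."""
    if p < 0.001:
        return "p < .001"
    digits = 2 if p >= 0.01 else 3
    return "p = " + f"{p:.{digits}f}".lstrip("0")


def format_mean_ci(mean, lower, upper, level=95):
    """Format a mean with its confidence interval boundaries."""
    return f"M = {mean:g}, {level}% CI [{lower:g}, {upper:g}]"


print(format_p(0.24))
print(format_mean_ci(30, 20, 40))
```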
80A Model of the Research Process: Levels of
Constraint
(Model used to illustrate the continuum of demands
placed on the adequacy of the information used in
research and on the nature of the processing of
that information.)
Demand (high to low):
- High: Experimental Research
- Differential Research
- Correlational Research
- Case-study Research
- Low: Naturalistic Observation
- Exploratory Research
Toward the high end: Research plan becomes
increasingly detailed (e.g., precise hypotheses
and analyses) but less flexible.
Toward the low end: Research plan may be general;
ideas, questions, and procedures relatively
unrefined.
81Observational Methods
- No direct manipulation of variables by the
researcher. Behavior is merely recorded, but
systematically and objectively so that the
observations are potentially replicable.
- Advantages
- Reveals how people normally behave.
- Experimentation without prior careful observation
can lead to a distorted or incomplete picture.
- Disadvantages
- Generally more time-consuming.
- Doesn't allow identification of cause and effect.
82Quasi-Experimental Design
- In a quasi-experimental study, the experimenter
does not have complete control over manipulation
of the independent variable or how participants
are assigned to the different conditions of the
study.
- Advantages
- Natural setting
- Higher face validity (from practitioner
viewpoint)
- Disadvantages
- Not possible to isolate cause and effect as
conclusively as with a true experiment.
83Types of Quasi-Experimental Designs
84One Group Post-Test Design
Treatment -> Measurement (time runs left to right)
Change in participants' behavior may or may not
be due to the intervention. Prone to time
effects, and lacks a baseline against which to
measure the strength of the intervention.
85One Group Pre-test Post-test Design
Measurement -> Treatment -> Measurement (time runs
left to right)
Comparison of pre- and post-intervention scores
allows assessment of the magnitude of the
treatment's effects. Prone to time effects, and
it is not possible to determine whether
performance would have changed without the
intervention.
86Interrupted Time-Series Design
Measurement -> Measurement -> Measurement ->
Treatment -> Measurement -> Measurement ->
Measurement (time runs left to right)
The researcher doesn't have full control over
manipulation of the
IV. No way of ruling out other factors.
Potential changes in measurement.
87Static Group Comparison Design
Group A (experimental group): Treatment ->
Measurement
Group B (control group): No Treatment ->
Measurement
(time runs left to right)
Participants are not assigned to the conditions
randomly. Observed differences may be due to
other factors. Strength of conclusions depends
on the extent to which we can identify and
eliminate alternative explanations.
88Experimental Research: Between-Groups and
Within-Groups Designs
89Between-Groups Designs
- Separate groups of participants are used for each
condition of the experiment.
- Within-Groups (Repeated Measures) Designs: Each
participant is exposed to each condition of the
experiment (requires fewer participants than a
between-groups design).
90Between-Groups Designs
- Advantages
- Simplicity
- Less chance of practice and fatigue effects
- Useful when it is not possible for an individual
to participate in all of the experimental
conditions
- Disadvantages
- Can be expensive in terms of time, effort, and
number of participants
- Less sensitive to experimental manipulations
91Examples of Between-Groups Designs
92Post-test Only / Control Group Design
Random allocation to groups:
Group A (experimental group): Treatment ->
Measurement
Group B (control group): No Treatment ->
Measurement
(time runs left to right)
If randomization fails to produce equivalence,
there is no way of knowing that it has failed.
Experimenter cannot be certain that the two
groups were comparable before the treatment.
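Random allocation itself is simple to carry out. The sketch below shuffles a list of hypothetical participant IDs and splits it into two equal groups; the function name `randomly_allocate` and the fixed seed are assumptions, the seed being there only so the example is repeatable.

```python
import random


def randomly_allocate(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant IDs.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_allocate(participants, seed=42)
print(groups["experimental"])
print(groups["control"])
```

As the slide notes, nothing in this procedure guarantees the two groups are equivalent on any given characteristic; it only makes systematic differences unlikely.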
93Pre-test / Post-test Control Group Design
Random allocation to groups:
Group A: Measurement -> Treatment -> Measurement
Group B: Measurement -> No Treatment ->
Measurement
(time runs left to right)
Pre-testing allows the experimenter to determine
equivalence of the groups prior to the
intervention. However, pre-testing may affect
participants' subsequent performance.
94Solomon Four-Group Design
Random allocation to groups:
Group A: Measurement -> Treatment -> Measurement
Group B: Measurement -> No Treatment ->
Measurement
Group C: Treatment -> Measurement
Group D: No Treatment -> Measurement
(time runs left to right)
95Within-Groups Designs Repeated Measures
Advantages
Disadvantages
- Carry-over effects from one condition to another
- The need for conditions to be reversible
96Repeated-Measures Design
Random allocation to an order of conditions:
Order 1: Treatment -> Measurement -> No Treatment
-> Measurement
Order 2: No Treatment -> Measurement -> Treatment
-> Measurement
(time runs left to right)
Potential for carryover effects can be avoided by
randomizing the order of presentation of the
different conditions or counterbalancing the
order in which participants experience them.
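Full counterbalancing can be sketched by listing every possible order of the conditions and cycling participants through them. The function name `counterbalance` and the participant IDs are hypothetical; in practice the participant list would usually be shuffled first so that who gets which order is also random.

```python
from itertools import permutations


def counterbalance(participants, conditions):
    """Assign each participant one of the possible condition orders in turn."""
    orders = list(permutations(conditions))
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}


# Hypothetical participants and the two conditions from the design above.
assignment = counterbalance(["P1", "P2", "P3", "P4"], ["Treatment", "No Treatment"])
for participant, order in assignment.items():
    print(participant, "->", " then ".join(order))
```

Note that the number of possible orders grows quickly (6 for three conditions, 24 for four), which is one reason to consider a Latin square instead.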
97Latin Squares Design
Three Conditions or Trials
Participants                        Order of conditions or trials
One group of participants           A B C
Another group of participants       B C A
Yet another group of participants   C A B
Order of presentation of conditions in a
within-subjects design can be counterbalanced so
that each condition occurs just once in each
ordinal position. The problem is not completely
eliminated, because A precedes B twice but B
precedes A only once. The same is true of C and A.
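The square above can be generated by rotating the list of conditions one step per row, and the remaining imbalance can be counted directly. This is a sketch with hypothetical function names (`latin_square`, `times_preceding`), not a general-purpose design tool.

```python
def latin_square(conditions):
    """Build a Latin square by cyclically shifting the conditions per row."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def times_preceding(square, first, second):
    """Count the rows in which `first` comes somewhere before `second`."""
    return sum(row.index(first) < row.index(second) for row in square)


square = latin_square(["A", "B", "C"])
for row in square:
    print(" ".join(row))

print("A before B:", times_preceding(square, "A", "B"))
print("B before A:", times_preceding(square, "B", "A"))
```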
98Balanced Latin Squares Design
Four Conditions or Trials
Participants                            Order of conditions or trials
One group of participants               A B C D
Another group of participants           B D A C
Yet another group of participants       D C B A
And yet another group of participants   C A D B
Note: This approach works only for experiments
with an even number of conditions. For
additional help with more complex multi-factorial
designs, see http://www.jic.bbsrc.ac.uk
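One common way to build a balanced square for an even number of conditions is to start from the first row 1, 2, n, 3, n-1, ... and add one to every entry for each later row. The sketch below follows that construction; it is not necessarily how the square on this slide was produced, so the rows differ, but it has the same property that each condition immediately follows every other condition exactly once. The function name is hypothetical.

```python
def balanced_latin_square(conditions):
    """Balanced Latin square for an even number of conditions."""
    n = len(conditions)
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")

    # First row as positions: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1

    # Every later row shifts each position by one more step (mod n).
    return [[conditions[(k + shift) % n] for k in first] for shift in range(n)]


square4 = balanced_latin_square(["A", "B", "C", "D"])
for row in square4:
    print(" ".join(row))

# Every ordered pair of adjacent conditions should appear exactly once.
adjacent_pairs = [(row[i], row[i + 1]) for row in square4 for i in range(len(row) - 1)]
print(len(adjacent_pairs), len(set(adjacent_pairs)))
```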
99Factorial Designs
- include multiple independent variables
- allow for analysis of interactions between
variables
- facilitate increased generalizability
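The conditions of a factorial design are every combination of the levels of its independent variables. The sketch below enumerates the cells of a 2 x 3 design; the factor names and levels are invented purely for illustration.

```python
from itertools import product

# Hypothetical independent variables and their levels.
factors = {
    "therapy": ["therapy", "no therapy"],
    "session_length": ["short", "medium", "long"],
}

# Each cell of the design is one combination of levels.
cells = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for cell in cells:
    print(cell)
print("number of conditions:", len(cells))
```

The number of cells is the product of the number of levels of each factor, so adding factors increases the required conditions quickly.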