Title: Why do we Need Randomised Controlled Trials?
1Why do we Need Randomised Controlled Trials?
- David Torgerson
- Director, York Trials Unit
2What works?
- In most areas (education, health, criminal
justice, etc.) we want to know WHAT or WHETHER
something works.
- Do bootcamps reduce criminal behaviour?
- Are teaching volunteers effective?
- Are computers effective at improving literacy and
numeracy?
- Of secondary importance is HOW.
3The WHAT question
- The ONLY way we can find out whether something
works or not is by using a RANDOMISED CONTROLLED
TRIAL.
- All other evaluative methods are INFERIOR ways of
answering the WHAT question and some cannot
answer it at all (e.g., qualitative research).
4Structure of Session
- Randomised Controlled Trials ARE the
gold-standard evaluation method.
- What is wrong with other research methods?
- Why should we do trials?
5Before and After Methods
6Clinical Practice in the 18th Century
- "It is incident to physicians, I am afraid,
beyond all other men, to mistake subsequence for
consequence." - Samuel Johnson, 1734
7Background
- Traditionally most interventions have been
evaluated using a pre-test post-test or before
and after design.
- Participants are tested, treated and then tested
again; any improvements are attributed to the
intervention.
- Currently this is probably the most POPULAR
evaluative method in most fields.
8Who uses before and after?
- Policy makers
- Teachers assessing individual children.
- Action researchers.
- Parents
- Lecturers
- We all do.
9Problems
- Problems include
- Temporal changes
- Regression to the mean.
10Temporal Change
- Self-learning irrespective of teaching occurs.
- As children mature they will become better at
learning.
- Any intervention or treatment is mixed up with
these temporal changes, and the two are difficult
to disentangle.
11Changes in Outcomes
- If we measure outcome using public examination
results we will see an improvement. Is this
because the intervention has worked? Or is it
because exams have got easier? Or have children
become more intelligent?
- Without a control group we CANNOT know.
12Regression to the Mean
- As well as temporal changes, before and after
studies are confounded by a statistical
phenomenon known as regression to (or towards) the
mean.
13Regression to the mean
- This is a GROUP phenomenon and occurs when the
group are measured with an inexact measurement
tool and then remeasured. Those individuals with
extreme values will have a high probability of
regressing towards the mean on the second
measurement.
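The phenomenon can be illustrated with a small simulation. This is only a sketch under assumed numbers (a true score with mean 100, and measurement error as large as the true spread); the function name and all parameters are hypothetical and do not come from any study in this talk. Nobody is treated between the two tests, yet the lowest scorers "improve".

```python
import random
import statistics


def simulate_rtm(n=10_000, true_sd=10.0, noise_sd=10.0, fraction=0.1, seed=1):
    """Measure a group twice with an inexact tool and follow the lowest scorers.

    All numbers are illustrative assumptions, not data.
    """
    rng = random.Random(seed)
    # Each person has a fixed true score; nothing changes between tests.
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n)]
    first = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    second = [t + rng.gauss(0.0, noise_sd) for t in true_scores]

    # Select the most extreme (lowest) scorers on the FIRST measurement.
    k = int(n * fraction)
    lowest = sorted(range(n), key=lambda i: first[i])[:k]

    mean_first = statistics.mean(first[i] for i in lowest)
    mean_second = statistics.mean(second[i] for i in lowest)
    return mean_first, mean_second


before, after = simulate_rtm()
print(f"lowest scorers at first test: {before:.1f}")
print(f"same people when retested:    {after:.1f}")
```

The retest mean moves back towards 100 although no intervention took place, which is exactly the gain a before and after study would credit to the treatment.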
14History of RTM
- Galton's work from 1869 started to provide the
understanding of the phenomenon.
- By 1886 Galton had described the phenomenon among
the heights of children and their parents
(children of tall parents tend to be shorter than
their parents, and vice versa: "regression towards
mediocrity").
15Economists and RTM
- "I suspect that the regression fallacy is the
most common fallacy in the statistical analysis
of economic data" (Milton Friedman, 1992)
16Marking Exam Scripts
- For the MSc in Health Sciences there is a system
of double marking: markers are blind to student
identity and to the other marker's mark.
- There is a tendency to disagree with marks at the
extremes of the distribution.
- Explanation: regression to the mean.
17RTM and exam scripts
18Annual Increase in offences with firearms
Amnesty
19Did the Amnesty work?
- Unclear. The year preceding the amnesty had a
large, unexpected increase in offences, so we would
expect, through regression to the mean, that in the
following year the rate of increase would
regress back towards the average annual
increase.
20Education intervention
- Wheldall selected 40 pupils whose reading was at
least 2 years behind their peers.
- Half were exposed to an intervention.
Wheldall, Educational Review 2000; 52: 29.
21Before and after reading programme
Difference highly statistically significant
(p < 0.001)
22Before and after reading programme
Differences between groups NOT statistically
significant
23RTM misunderstanding
- "the mean gain scores translated to impressive
effect sizes of 0.6."
- "It could be argued that it is asking too much of
any program to demonstrate enhanced efficacy on
top of such high existing efficacy"
- "control group gains were largely attributable
to pre-existing literacy programme."
- Perhaps, BUT much of the gain will be due to RTM.
24RTM and School Exclusions
- A qualitative and before and after evaluation of
an intervention to reduce school exclusions said:
- "an RCT would not have been able to adequately
address fundamental problems concerning the
reliability and validity of quantitative data in
relation to exclusions"
25Flawed Methods
- Selected schools with HIGH exclusion rates on
which to intervene. Therefore we would EXPECT
exclusions to fall.
- They did, by 15%.
- BUT schools with the fewest exclusions INCREASED
exclusions by 55% whilst schools with the highest
exclusions had a fall of 32%.
26Mentoring
- In England, part of the KS3 Strategy
- Backed by Government and private funding
- Mentoring means a lot of different things
- Research evidence is
- Case studies
- Feelings and perceptions of participants
- Completely inadequate to infer impact
27Neil Appleby's Experiment
- A randomised controlled trial involving 20
underachieving Y8 (12-13 year-old) students
- Matched in pairs on ability and gender
- Randomly allocated in each pair, one mentored,
the other not
- Mentored group had 20 mins individually every two
weeks (11 sessions)
- "It nearly killed me"
- Cost estimated at between £170 and £410 per
mentored pupil, which represents between 8% and
19% of the school's annual per-pupil funding for
the whole of their education
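The allocation step of a matched-pairs design like this one can be sketched in a few lines. This is an illustration only, not the procedure actually used in the experiment: the function name, the pupil labels and the seed are all hypothetical, and the pairs are assumed to have been formed already (here, on ability and gender).

```python
import random


def allocate_within_pairs(pairs, seed=None):
    """Randomly pick one member of each matched pair for mentoring.

    `pairs` is a list of (pupil_a, pupil_b) tuples that have already been
    matched; the coin flip decides who is mentored and who is the control.
    """
    rng = random.Random(seed)
    allocation = []
    for a, b in pairs:
        # A fair coin flip per pair, so neither pupils nor teachers choose.
        mentored, control = (a, b) if rng.random() < 0.5 else (b, a)
        allocation.append({"mentored": mentored, "control": control})
    return allocation


# Hypothetical example: 20 pupils already matched into 10 pairs.
pairs = [(f"pupil_{2 * i + 1}", f"pupil_{2 * i + 2}") for i in range(10)]
allocation = allocate_within_pairs(pairs, seed=2024)
for row in allocation:
    print(row)
```

Because chance alone decides within each pair, the mentored and control groups stay balanced on the matching variables and differ only by random variation on everything else.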
28What the teachers said about the mentored
students
- [Pupil] is a changed person this year; she has
progressed greatly and is a superb, helpful
student.
- Better now, has achieved more, more confident.
- Generally a great improvement recently.
- [Pupil]'s attitude and effort have improved over
the year. He is a lot pleasanter and more willing
to participate in lessons, particularly oral work;
he responds well to praise.
29What they said about the control group
- Has improved overall this term.
- [Pupil]'s attitude and effort have improved over
the last few months; she is now trying very hard
to achieve her target. Great effort.
- Commended for attitude and progress.
- [Pupil] has settled since the beginning of the
year.
- [Pupil] has undergone quite a transformation since
September. Her attitude towards the teacher and
her learning have improved drastically and she
should be congratulated.
30Change in Teachers' Ratings of progress, effort
and attitude (English, maths and science
combined)
31What this proves
- If you identify a group of underachieving pupils
at a particular time and then come back to them
after a few months, many of them will have
improved, whatever you did.
- Others (the hard cases) will not have improved,
whether mentored or left alone.
- The interpretation of this would have been very
different without a control group.
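A simulation makes the same point. The sketch below is built on assumptions, not on the data from the mentoring experiment: pupils are picked because a noisy first rating was low, half are randomly "mentored", and the mentoring is given a true effect of exactly zero. The function name and every number are hypothetical.

```python
import random
import statistics


def mentoring_simulation(n_pupils=20_000, true_effect=0.0, seed=7):
    """Compare before/after gains in randomised mentored and control groups.

    Illustrative assumptions: ratings = stable ability + random noise,
    and the intervention adds `true_effect` (zero by default).
    """
    rng = random.Random(seed)
    ability = [rng.gauss(0.0, 1.0) for _ in range(n_pupils)]
    rating_before = [a + rng.gauss(0.0, 1.0) for a in ability]

    # "Underachievers": the lowest-rated fifth at the first assessment.
    k = n_pupils // 5
    selected = sorted(range(n_pupils), key=lambda i: rating_before[i])[:k]

    # Random allocation to mentoring or control.
    rng.shuffle(selected)
    half = k // 2
    groups = {"mentored": selected[:half], "control": selected[half:]}

    gains = {}
    for name, members in groups.items():
        effect = true_effect if name == "mentored" else 0.0
        # Gain = second noisy rating (plus any effect) minus first rating.
        gains[name] = statistics.mean(
            ability[i] + effect + rng.gauss(0.0, 1.0) - rating_before[i]
            for i in members
        )
    return gains


gains = mentoring_simulation()
print(f"mean gain, mentored: {gains['mentored']:.2f}")
print(f"mean gain, control:  {gains['control']:.2f}")
print(f"difference:          {gains['mentored'] - gains['control']:.2f}")
```

Looked at alone, the mentored group shows a clear before and after gain. Only the comparison with the control group reveals that the gain owes nothing to the mentoring.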
32RTM and League Tables
- RTM is GREAT for Governments: it helps to fool the
credulous into believing that what they do works.
- In any league table those at the bottom will tend
to regress upwards to the mean whilst those at
the top regress down. This appears to lend support
to naming and shaming or extra financial help for
those at the bottom.
33Dealing with RTM
- The only way to reliably deal with the problem is
through randomised trials.
- Which is why before and after data are generally
regarded, by the cognoscenti, as almost USELESS.
34History of Controlled Trials
- Because of temporal and regression to mean
effects we MUST have a control group.
35Background
- Many researchers over the centuries have seen the
need for a control group to avoid the inherent
biases in the before and after study.
- Controlled trials have been conducted for several
hundred years, probably occasionally using
randomisation.
36Scurvy
- Scurvy was a very prevalent condition among
sailors before the 19th Century.
- A controlled trial in the middle of the 18th
Century of 12 sailors showed that the two sailors
allocated to receive lime or orange juice
recovered and were able to care for their
shipmates allocated to vinegar or salt water.
37Lack of Dissemination
- An even earlier trial in scurvy prevention used a
cluster design whereby a whole ship's crew was
allocated citrus fruit and compared with two
ships' crews who were not.
- The treatment worked but the lesson was forgotten.
- After the second trial it took the Navy 50 years
to implement the results.
38Agriculture
- Fisher is usually thought of as the originator of
randomisation in the 1920s in agricultural
experiments.
- He was concerned with the statistical properties
of randomness as well as the formation of
unbiased groups.
39Cambridge-Somerville
- In 1937 a classic experiment, the
Cambridge-Somerville trial, was launched.
- The aim was to show that social worker
intervention among delinquent boys would reduce
criminality.
40Design
- 650 boys were identified by their teachers as
having delinquent behaviour that put them at
later risk of criminal activity.
- 325 pairs were formed and one from each pair was
allocated a social worker supported by
psychiatrists.
41Results: early follow-up of boys indulging in
crime.
Green bar indicates intervention group
42Results later follow-up
- In 1975 the boys were followed up again as
middle-aged men.
- 58% of the intervention group had NOT had a
criminal conviction
- BUT 68% of the control group had NOT had a
conviction.
- If a control group had not been used, the success
of the intervention would have seemed assured.
43Consequences of the Trial
- The social work profession largely ABANDONED the
RCT as a method of evaluation as it failed to
give the RIGHT results.
44RCTs and education
- Lindquist, writing about experimental methods in
1940, argued that in advanced textbooks "all
illustrations given are in the field of
agricultural experimentation and are concerned
with plots, blocks, yields, treatments, etc.,
rather than with schools, classes, scores,
methods, pupils, etc."
Lindquist, Statistical Analysis in Educational
Research, 1940.
45The Importance of Design in Educational
Experiments (Lindquist)
- In 1940, in his book on statistics in educational
research, Lindquist quite clearly describes
appropriate RCTs for educational research.
- His book is also the first description of the
appropriate techniques to be used in analysing
pupils' scores in classes (i.e., cluster analysis),
which was an advance on Fisher's Design of
Experiments.
46Cluster analysis
- In health statistics Lindquist's statistical
methods were largely ignored until the late 1980s,
when it became accepted to use the methods he
advocated to analyse clustered data, although even
now most cluster trials are badly analysed.
- But 64 years on, what about his descriptions of
how to rigorously evaluate educational
interventions?
47Educational Trials UK
- Not many trials in education have been undertaken
in the UK.
- Most educational trials are from the USA.
- WHY? (my personal view)
- Futility of the paradigm war
- Failure to understand their importance
- Trials often give the wrong answer
- Lack of funding.
48Opposition to Trials is widespread
- In health care many doctors will refuse to
believe the results of a trial and argue the
trial was faulty or poorly conducted if the
result was wrong.
- Recent example: the WHI study of hormone
replacement. Many doctors REFUSE to accept the
finding of this study that it INCREASES the risk
of heart disease.
49Opposition to Polio Trial
- "I found but one person who rigidly adhered to
the idea of a placebo control and he is a
bio-statistician who, if he did not adhere to
this view, would have had to admit his own
purposelessness in life" (Jonas Salk).
501950s to 1970s
- The use of trials expanded rapidly within and
beyond medicine.
- In the social sciences experiments included:
- Negative income tax
- Adoption
- Busing
- Public vs private schools
- Prevention of spousal abuse.
51Health Care Trials
- Although ALL new medicines have to be evaluated
using RCTs, many medical treatments do not.
- HOWEVER, health care is fortunate: because we
bury our disasters, we KNOW how important trials
are as a protection for patients.
52Health Care Disasters
- Opposition to RCTs has declined over the years,
partly due to a number of catastrophes from
unevaluated treatments.
- Harmful treatments are still in widespread use
today; we just don't know which ones.
53Disasters among babies
- Routine practice in the 40s and 50s to give
premature infants pure oxygen. At the same time it
was noted that there was an epidemic of blindness
among babies. Linked to oxygen use.
- Routine practice in the 50s to give prophylactic
antibiotics to premature infants; this caused brain
damage and death.
- BOTH of these problems were only discovered AFTER
an RCT was undertaken.
54Trial sabotage
- Interestingly, an early trial of pure oxygen for
neonates was sabotaged by nurses who secretly
gave oxygen to some of the controls because they
KNEW that it was effective.
- Because of this ARROGANCE they contributed to the
blinding of healthy babies.
55Educational Disaster?
- On the basis of before and after studies and
anecdote, driver education was widely implemented
among older pupils in the USA.
- It was thought that this would reduce car
accidents.
- Did it? Fortunately, some sceptics undertook a
series of trials in the USA.
56Driver Education - Results
- Roberts and colleagues (see Campbell
Collaboration) reviewed these trials and
undertook a meta-analysis.
- They found that driver education INCREASED the
likelihood of deaths in car accidents as it
increased the prevalence of young motorists.
57UK Policy makers
- Have IGNORED these results and implemented driver
education in some schools.
- This will directly increase deaths among young
drivers.
58Computers in Schools
- Introduction of computers into schools has not
been preceded by large RCTs.
- The best evidence we have is from a
quasi-experiment from Israel, which showed that
introduction of computers into half the state
schools led to no change in Hebrew literacy but a
DECLINE in maths.
- The Israeli Government has since introduced
computers into all schools!!!
59Volunteers in Schools
- The use of volunteers to help children learn to
read is widespread, but are they effective?
- In a systematic review of RCTs only 7 trials
could be identified, the largest with ONLY 99
children.
- The effect of volunteering was very slight (0.19,
-0.31 to 0.68) and not statistically significant.
Torgerson et al. 2002, Educational Studies, 28, No 4.
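One way to read figures like these is sketched below. It ASSUMES that the range quoted is a 95% confidence interval and that a normal approximation is reasonable; neither is stated on the slide, and the function name is hypothetical. Under those assumptions it recovers an approximate standard error and checks whether the interval includes zero.

```python
from statistics import NormalDist


def summarise_interval(estimate, lower, upper, level=0.95):
    """Recover an approximate standard error and p-value from an interval.

    Assumes a symmetric, normal-theory confidence interval at `level`.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"se": se, "z": z, "p": p, "includes_zero": lower <= 0 <= upper}


# Figures from the slide; the 95% level is an assumption.
result = summarise_interval(0.19, -0.31, 0.68)
print(result)
```

An interval that spans zero is compatible with no effect, a small benefit or a small harm, which is why the result is described as not statistically significant.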
60Conclusions
- Virtually all new interventions need to be
evaluated using RCTs.
- Unlike in health care, children are compelled to
have education. Therefore it is even more urgent
that they should not be exposed to ineffective
educational interventions.
61We need more trials