Sample size and analytical issues for cluster trials



1
Sample size and analytical issues for cluster
trials
  • David Torgerson
  • Director, York Trials Unit
  • djt6_at_york.ac.uk
  • www.rcts.org

2
Background
  • For any trial we want to make it sufficiently
    large that if there were a true difference
    between the groups that this difference would be
    statistically significant.
  • A Type II error occurs when we wrongly conclude
    there is no difference when there actually is.

3
Sample size calculations
  • "Most hand calculations diabolically strain human
    limits, even for the easiest formula..." (Schulz
    and Grimes, Lancet 2005)

4
Sample size formulae
  • Usually need a computer to calculate. However, a
    simple approximation for a two-armed randomised
    trial with a 1:1 ratio and a continuous variable
    (e.g., blood pressure) is as follows:
    total N = 32 / d^2 (80% power, 5% significance),
    where d = effect size (difference/standard
    deviation).

5
Example
  • We want to investigate a treatment for back pain.
    The measure is the Roland and Morris back pain
    scale with a standard deviation of 4. If we want
    to detect a 2 point difference how many do we
    need?
  • 2/4 = 0.5 = effect size (d). 0.5 x 0.5 = 0.25.
  • 32/0.25 = 128 in total for 80% power, 5%
    significance (use 42 instead of 32 for 90% power).
  • NB: using computer software the answer is 126.
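
The arithmetic above can be checked with a short Python sketch. The function names are illustrative and not from the slides. The second function uses the standard normal-approximation formula, which gives 126 for this example; whether the software quoted on the slide used exactly this formula is an assumption.

```python
import math
from statistics import NormalDist

def approx_total_n(d, power=0.80):
    """Rule-of-thumb total N for a 1:1 two-arm trial at 5% two-sided significance."""
    constant = {0.80: 32, 0.90: 42}[power]  # constants quoted on the slides
    return constant / d ** 2

def normal_total_n(d, power=0.80, alpha=0.05):
    """Normal-approximation total N: 4 * (z_(1-alpha/2) + z_power)^2 / d^2."""
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2

d = 2 / 4  # 2-point difference, standard deviation of 4
print(approx_total_n(d))              # 128.0
print(math.ceil(normal_total_n(d)))   # 126
```

The rule of thumb rounds the constant 4 x (1.96 + 0.84)^2 = 31.4 up to 32, which is why it slightly overestimates.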

6
Binary variables
  • For a dichotomous variable (cured/not cured) the
    following is useful:
    total N = 32 x (a - a^2) / d^2,
    where a = average proportion and d = difference
    in proportions.

7
Example
  • Breast feeding rates are only 50% and we have an
    educational intervention that we think will
    increase this to 60%. How many do we need?
  • d^2 = (0.6 - 0.5)^2 = 0.1^2 = 0.01
  • a = (0.6 + 0.5)/2 = 0.55
  • a^2 = 0.55^2 = 0.3025
  • 0.01/(0.55 - 0.3025) = 0.040
  • 32/0.040 = 792 (using the unrounded 0.0404)
  • Need 792 to have 80% power to show a 10
    percentage point difference in breast feeding
    rates if it were present (use 42 for 90% power).
  • NB: using computer software the answer is 774.
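
The same steps can be written as a small Python function. This is a sketch of the rule of thumb only; the function name is illustrative, and it does not attempt to reproduce the software answer of 774.

```python
def approx_total_n_binary(p1, p2, power=0.80):
    """Rule-of-thumb total N for comparing two proportions (1:1 allocation)."""
    constant = {0.80: 32, 0.90: 42}[power]  # constants quoted on the slides
    d = p2 - p1                # difference in proportions
    a = (p1 + p2) / 2          # average proportion
    return constant * (a - a ** 2) / d ** 2

# Breast feeding example: 50% rising to 60%
print(round(approx_total_n_binary(0.5, 0.6)))  # 792
```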

8
Approximations
  • The formulae slightly overestimate the true
    sample size needed. But they can be done on a
    hand calculator and you can impress the
    statisticians.
  • What about cluster trials?

9
Cluster Sample Size
  • Usual sample size estimates assume independence
    of observations. When people are members of the
    same cluster (e.g., classroom, GP surgery) their
    outcomes are more alike than we would expect by
    chance.
  • This similarity is measured by the intra-cluster
    correlation coefficient (ICC).

10
ICC
  • The ICC needs to be incorporated into the sample
    size calculations. The formula is as follows:
    Design effect = 1 + (m - 1) x ICC.
    The design effect is the factor by which the
    sample size needs to be inflated; m is the number
    of people in each cluster.

11
Sample size example.
  • Let's assume that for an individually randomised
    trial we need 128 people to detect an effect size
    of 0.5 with 80% power (2p = 0.05). Now assume we
    have 24 groups with 7 members. The ICC is 0.05,
    which is quite high.
  • 1 + (7 - 1) x 0.05 = 1.3, so we need to increase
    the sample size by 30%. Therefore, we will need
    166 participants (128 x 1.3).

12
What happens if cluster gets bigger?
  • If our cluster size is twice as big (14), things
    begin to get really interesting.
  • 1 + (14 - 1) x 0.05 = 1.65.
  • What about 30? 1 + (30 - 1) x 0.05 = 2.45 (i.e.,
    314 participants).
  • Say we randomise a larger cluster, such as a
    school (n = 500): 1 + (500 - 1) x 0.05 = 25.95
    (i.e., 3,322 participants).
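
A minimal Python sketch of the design effect calculation, applied to the cluster sizes above. The function names are illustrative. Results are rounded to the nearest whole person to match the slides; in practice one would round up.

```python
def design_effect(m, icc):
    """Inflation factor for clusters of size m: 1 + (m - 1) x ICC."""
    return 1 + (m - 1) * icc

def inflated_n(n_individual, m, icc):
    """Sample size for a cluster trial, given the individually randomised N."""
    return n_individual * design_effect(m, icc)

for m in (7, 14, 30, 500):
    de = design_effect(m, 0.05)
    print(f"cluster size {m}: design effect {de:.2f}, "
          f"N = {round(inflated_n(128, m, 0.05))}")
```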

13
ICC size
  • ICCs can be large for some things. ICCs for
    educational outcomes, for example, are often
    around 0.4 to 0.5.
  • A class-based RCT with n = 30 and an ICC of 0.4
    (design effect 1 + (30 - 1) x 0.4 = 12.6) would
    need 1,612 participants, or 54 classes with
    n = 30 in each class.

14
What makes the ICC large?
  • If the treatment is applied to the health care
    provider (e.g., guidelines will increase ICCs for
    patients).
  • If cluster relates to outcome variable (e.g.,
    smoking cessation and schools)
  • If members of cluster are expected to influence
    each other (e.g., households).

15
Reviews of Cluster Trials
Authors | Source | Years | Clustering allowed for in sample size | Clustering allowed for in analysis
Donner et al. (1990) | 16 non-therapeutic intervention trials | 1979-1989 | <20% | <50%
Simpson et al. (1995) | 21 trials from American Journal of Public Health and Preventive Medicine | 1990-1993 | 19% | 57%
Isaakidis and Ioannidis (2003) | 51 trials in Sub-Saharan Africa | 1973-2001 (half post 1995) | 20% | 37%
Puffer et al. (2003) | 36 trials in British Medical Journal, Lancet, and New England Journal of Medicine | 1997-2002 | 56% | 92%
Eldridge et al. (Clinical Trials 2004) | 152 trials in primary health care | 1997-2000 | 20% | 59%
16
Sample Size Problems
Cluster Trials Demand Larger Sample Sizes
17
Conditional ICC
  • The key ICC is the conditional ICC, but usually
    we only have access to estimates of the
    unconditional ICC.
  • If we know, and can measure, characteristics that
    cause the ICC, we can adjust for these and lower
    the ICC.
  • Cook claims that using covariates allows a
    school-based RCT to reduce the number of schools
    from about 50 to around 22.

18
Summary of sample size
  • The KEY thing is the size of the cluster. It is
    nearly always better to have lots of small
    clusters than a few large ones (e.g., a trial with
    small hospital wards, GP practices or classrooms
    will, ceteris paribus, be better than one with
    large clusters).
  • BUT if the ICC is tiny, cluster size may not
    affect the sample size too much.
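
To illustrate the point, the sketch below compares two hypothetical designs with the same 600 participants, using the standard idea of an effective sample size (total N divided by the design effect). The designs and ICC values are assumptions chosen for illustration, not figures from the slides.

```python
def design_effect(cluster_size, icc):
    """Inflation factor: 1 + (m - 1) x ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)

for icc in (0.05, 0.001):
    many_small = effective_sample_size(60, 10, icc)   # 60 clusters of 10
    few_large = effective_sample_size(6, 100, icc)    # 6 clusters of 100
    print(f"ICC {icc}: 60 x 10 is worth {many_small:.0f}, "
          f"6 x 100 is worth {few_large:.0f}")
```

With an ICC of 0.05 the design with many small clusters is worth roughly four times as many independent observations; with an ICC of 0.001 the two designs are much closer.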

19
Cluster Trials: Should I do one?
  • If possible, avoid like the plague. BUT although
    they are difficult to do properly, when done
    properly they WILL give more robust answers than
    other methods (e.g., observational data).
  • Is it possible to avoid doing them and do an
    individually randomised trial?

20
Contamination
  • An important justification for their use is
    SUPPOSED contamination between participants
    allocated to the intervention and those
    allocated to the control.

21
Spurious Contamination?
  • Trial proposal to cluster randomise practices for
    a breast feeding study: new mothers might talk
    to each other!
  • Trial for reducing cardiac risk factors: patients
    again might talk to each other.
  • Trial for removing allergens from homes of
    asthmatic children.

22
Contamination
  • Contamination occurs when some of the control
    patients receive the novel intervention.
  • It is a problem because it reduces the effect
    size, which increases the risk of a Type II error
    (concluding there is no effect when there
    actually is).
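
A rough Python sketch of how contamination dilutes the effect. It rests on two simplifying assumptions that are not stated on the slides: contaminated controls receive the full benefit of the intervention, and the required sample size is proportional to 1/d^2. The function names are illustrative.

```python
def diluted_effect(true_effect, contamination):
    """Expected observed difference when a fraction of controls are contaminated."""
    return true_effect * (1 - contamination)

def inflation_factor(contamination):
    """Sample size multiplier needed to keep the same power (N proportional to 1/d^2)."""
    return 1 / (1 - contamination) ** 2

# Hypothetical example: true effect size 0.5, 20% of controls contaminated
print(diluted_effect(0.5, 0.2))           # 0.4
print(round(inflation_factor(0.2), 2))    # 1.56
```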

23
Patient level contamination
  • In a trial of counselling adults to reduce their
    risk of cardiovascular disease general practices
    were randomised to avoid contamination of control
    participants by intervention patients.

Steptoe. BMJ 1999;319:943.
24
Accepting Contamination
  • We should accept some contamination and deal with
    it through individual randomisation and by
    boosting the sample size rather than going for
    cluster randomisation

Torgerson. BMJ 2001;322:355.
25
Counselling Trial
  • Steptoe et al. wanted to detect a 9% reduction in
    smoking prevalence with a health promotion
    intervention. They needed 2000 participants
    (rather than 1282) because of clustering.
  • If they had randomised 2000 individuals, the
    trial would have been able to detect a 7%
    reduction, allowing for 20% CONTAMINATION.

Steptoe. BMJ 1999;319:943.
26
Comparison of Sample Sizes
NB Assuming an ICC of 0.02.
27
Misplaced contamination
  • The ONLY health study I'm aware of to date that
    directly compared an individually randomised
    design with a cluster design showed no evidence
    of contamination.
  • In an RCT of nurse-led cardiovascular risk factor
    screening, some intervention clusters had
    participants allocated to no treatment. NO
    contamination was observed.

28
What about dilution bias?
  • If, in the presence of contamination, we use
    individual allocation we might observe a
    difference that is statistically significant but
    is not clinically or economically significant.
  • Dilution has biased the estimate towards the null
    (no difference).

29
Dealing with contamination
  • Sometimes there may be substantial contamination,
    and this will dilute the treatment effects. It
    may, however, still be best to individually
    randomise if you can measure contamination.

30
Per-protocol analysis?
  • We cannot adjust for contamination using either
    per-protocol or on-treatment analysis. These
    popular analytical methods are plainly wrong, as
    they violate the random allocation.

31
CACE analysis: a solution?
  • If we can measure contamination we can use a
    statistical approach known as Complier Average
    Causal Effect (CACE) analysis.

32
Assumptions of CACE
  • Assumption 1: if the control group had been
    offered treatment, the same proportion would
    have complied with treatment. This must be true,
    as random allocation ensures that it is.
  • Assumption 2: merely being offered treatment has
    no effect on outcomes.
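
The two assumptions translate directly into a calculation. The Python sketch below uses hypothetical risks (not data from any trial in these slides) to show how the ITT, per-protocol and CACE estimates differ. Function and parameter names are illustrative.

```python
def itt_rr(risk_control, risk_compliers, risk_noncompliers, compliance):
    """Intention-to-treat relative risk: everyone offered treatment vs control."""
    risk_offered = (compliance * risk_compliers
                    + (1 - compliance) * risk_noncompliers)
    return risk_offered / risk_control

def per_protocol_rr(risk_control, risk_compliers):
    """Per-protocol relative risk: compliers vs ALL controls (biased comparison)."""
    return risk_compliers / risk_control

def cace_rr(risk_control, risk_compliers, risk_noncompliers, compliance):
    """CACE relative risk: compliers vs controls who WOULD have complied."""
    # Assumption 1: the control arm contains the same proportion of would-be compliers.
    # Assumption 2: non-compliers have the same risk in both arms.
    risk_control_compliers = (
        risk_control - (1 - compliance) * risk_noncompliers
    ) / compliance
    return risk_compliers / risk_control_compliers

# Hypothetical risks: control 1.0%, compliers 0.6%, non-compliers 1.2%, 50% comply
args = (0.010, 0.006, 0.012, 0.5)
print(round(itt_rr(*args), 2))                  # 0.9
print(round(per_protocol_rr(0.010, 0.006), 2))  # 0.6
print(round(cace_rr(*args), 2))                 # 0.75
```

In this made-up example the per-protocol estimate exaggerates the benefit because non-compliers are at higher risk than compliers, so comparing compliers with all controls is not a like-for-like comparison.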

33
Example: CRC screening
  • In an RCT of bowel cancer screening only 53% of
    people invited for screening attended.
  • ITT relative risk = 0.85. BUT what happened to
    those who were screened? The per-protocol RR was
    0.62. THIS IS WRONG.
  • What is the true estimate?

34
       
35
True differences
  • For ITT (the policy of offering screening to the
    whole community) the RR = 0.85, that is, a 15%
    reduction in CRC deaths.
  • For those who accepted screening the RR was
    0.68, a 32% reduction in deaths, NOT a 38%
    reduction.

36
Individuals are best
  • Using CACE we can get the best of both worlds:
    retain individual randomisation and get unbiased
    estimates.

37
Sample size simulation
  • CACE analysis generally produces wider confidence
    intervals as there are two sources of variance.
  • Therefore, it is possible that cluster allocation
    may actually have a lower standard error in some
    circumstances.
  • To assess whether this is true we undertook a
    simulation exercise.

38
 
Sample size trade-off between cluster and individual allocation

Cluster size | Cluster trial (ICC 0.04) | Contamination (%) | Individual RCT with CACE | Contamination effect
10 | 1080 | 0 | 630 | 1
30 | 1740 | 10 | 756 | 1.20
50 | 2400 | 20 | 890 | 1.41
100 | 4000 | 30 | 1090 | 1.73

NB: 80% power to detect an effect size of 0.2. Source: Hewitt PhD thesis.
 
39
Sample size
  • CACE performs better than cluster allocation in a
    range of sample size scenarios
  • Because of the difficulties of doing a cluster
    trial, an individual trial design with CACE
    analysis might be best.

40
Limitations
  • The assumption that being offered treatment has
    no effect is a weakness, as some who appear not
    to comply may actually access some of the
    treatment.

41
Still need to do a cluster trial?
  • If a cluster trial is to be undertaken it is
    important, once the trial has been completed, that
    it is analysed correctly and that the effect of
    the clustering is accounted for. This has been
    known since 1940, when Lindquist advocated that
    educational trials should use the class as the
    natural unit of allocation.

42
What did Lindquist propose?
  • Each class should be treated both as the unit of
    allocation and the unit of analysis.
  • Put simply a trial with 20 classes of 30 children
    is NOT a trial of 600 children it is a trial of
    20 classes.
  • The simplest approach is to calculate the mean
    score of each cluster and do a t-test comparing
    the cluster means in the two groups.
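
A minimal Python sketch of this cluster-level analysis, using only the standard library. The scores are hypothetical and the function names are illustrative. It computes the equal-variance t statistic on the cluster means, with degrees of freedom based on the number of clusters rather than the number of pupils.

```python
import math
from statistics import mean, variance

def cluster_means(scores_by_cluster):
    """Reduce each cluster to a single summary value: its mean score."""
    return [mean(scores) for scores in scores_by_cluster]

def pooled_t_test(group_a, group_b):
    """Two-sample t statistic with equal variances; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se, df

# Hypothetical data: each inner list holds the scores of one class.
intervention_classes = [[7, 8, 6], [5, 6, 7, 6], [9, 8], [6, 7, 8]]
control_classes = [[5, 4, 6], [6, 5], [4, 5, 3, 4], [5, 6, 7]]

t, df = pooled_t_test(cluster_means(intervention_classes),
                      cluster_means(control_classes))
print(f"t = {t:.2f} on {df} degrees of freedom")
```

Note that the degrees of freedom are 6 (8 classes minus 2), not the number of pupils minus 2.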

43
Example
  • A randomised trial of 28 adult literacy classes
    sought to ascertain whether or not paying
    participants an incentive to attend would improve
    adherence.
  • 14 classes were randomised for students to get an
    incentive; 14 were controls.
  • Students were paid £5 per class attended.
  • There were 150 students in total; the ICC was
    0.39.

See Martin Bland's website http://www-users.york.ac.uk/~mb55/ for a worked example
44
Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
 Group X |      70    6.685714    .4177941    3.495516    5.852238    7.519191
 Group Y |      82    5.280488    .2991881    2.709263    4.685197    5.875778
---------+--------------------------------------------------------------------
combined |     152    5.927632    .2566817    3.164585     5.42048    6.434783
---------+--------------------------------------------------------------------
    diff |            1.405226    .5037841                .4097968    2.400656
------------------------------------------------------------------------------
    diff = mean(Group X) - mean(Group Y)                          t =   2.7893
Ho: diff = 0                                     degrees of freedom =      150

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9970         Pr(|T| > |t|) = 0.0060          Pr(T > t) = 0.0030
45
Wrong
  • This analysis is wrong: it treats all of the
    students as independent individuals and ignores
    the clustering of outcomes within classes.
  • Let us try Lindquist's approach to the analysis.

46
Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       1 |      14     6.69932    .7457716    2.790422    5.088178    8.310461
       2 |      14    5.189229    .3974616    1.487165    4.330565    6.047893
---------+--------------------------------------------------------------------
combined |      28    5.944274     .439363     2.32489    5.042776    6.845773
---------+--------------------------------------------------------------------
    diff |            1.510091    .8450746                -.226985    3.247166
------------------------------------------------------------------------------
    diff = mean(1) - mean(2)                                      t =   1.7869
Ho: diff = 0                                     degrees of freedom =       26

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9572         Pr(|T| > |t|) = 0.0856          Pr(T > t) = 0.0428
47
T-test method
  • This is correct in the sense that it takes
    clustering into account. However, it does not
    take into account chance differences in cluster
    size or powerful predictors of outcome.
  • We have information on cluster size and pre-test
    literacy score that we can use to improve the
    precision of our estimate (i.e., reduce the width
    of the confidence intervals). We can use summary
    statistics in a regression approach.
48
      Source |       SS       df       MS              Number of obs =      28
-------------+------------------------------           F(  2,    25) =   22.97
       Model |  88.6762362     2  44.3381181           Prob > F      =  0.0000
    Residual |   48.252853    25  1.93011412           R-squared     =  0.6476
-------------+------------------------------           Adj R-squared =  0.6194
       Total |  136.929089    27  5.07144775           Root MSE      =  1.3893

------------------------------------------------------------------------------
    sessions |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -1.778653   .5301429    -3.36   0.003    -2.870503   -.6868038
      midscl |  -.0945941    .015181    -6.23   0.000    -.1258598   -.0633283
       _cons |   13.13811   1.175841    11.17   0.000     10.71642     15.5598
------------------------------------------------------------------------------
49
Other methods
  • There are other, more complex statistical methods
    that may yield slightly different results.
    However, simple methods are approximately correct
    and easier to do.

50
Summary
  • Cluster trials need larger sample sizes than
    individually randomised studies.
  • Clustering needs to be taken into account both in
    the sample size and the analysis.
  • There are simple methods that can do this.