Diapositiva 1 - PowerPoint PPT Presentation

About This Presentation
Title:

Diapositiva 1

Description:

Sampling and power analysis in the High Resolution studies. Pamela Minicozzi, Descriptive Studies and Health Planning Unit, Department of Preventive and Predictive ...


Transcript and Presenter's Notes


1
Sampling and power analysis in the High
Resolution studies
Pamela Minicozzi
2
High Resolution studies
collected detailed data from patients' clinical
records, so that the influence of non-routinely
collected factors (tumour molecular
characteristics, diagnostic investigations,
treatment, relapse) on survival and differences
in standard care could be analysed
3
Problem
  • In each country, the population of incident
    cases for a particular cancer consists of N
    subjects
  • N is large (so rare cancers are not considered
    here)
  • Since N is large, not all cases can be
    investigated

Solution
  • Use a representative sample to derive valid
    conclusions that are applicable to the entire
    original population
4
Two questions
  1. What kind of probability sampling should we use?
  2. What sample size should we use?

5
Sampling
6
Previous High Resolution studies
Samples were representative of
  • 1-year incidence
  • a time interval (e.g. 6 months) within the study
    period, provided that incidence was complete
  • an administratively defined area covered by
    cancer registration

7
Present High Resolution studies
We want to eliminate variations in types of
sampling between countries and within a single
country
This implies more sophisticated sampling
Main types of probability sampling
8
Simple random sampling
  • assign a unique number to each element of the
    study population
  • determine the sample size
  • randomly select the population elements using
      • a table of random numbers, or
      • a list of numbers generated randomly by a
        computer

Advantage - auxiliary information on subjects is
not required
Disadvantage - if subgroups of the population are
of particular interest, they may not be included
in sufficient numbers in the sample
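
The steps above can be sketched in a few lines of Python. This is only an illustration: the population size, the sample size and the function name `simple_random_sample` are hypothetical and are not taken from the High Resolution protocol.

```python
import random

def simple_random_sample(population_ids, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seeded so the draw can be reproduced
    return rng.sample(list(population_ids), n)

# Hypothetical population: incident cases numbered 1..N
N = 5000
case_ids = range(1, N + 1)
sample = simple_random_sample(case_ids, n=500, seed=42)
```

Fixing the seed plays the role of the "list of numbers generated randomly by a computer": the same list can be regenerated later for checking.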
9
Stratified sampling
  • identify stratification variable(s) and
    determine the number of strata to be used
    (e.g. day and month of birth, year of
    diagnosis, cancer registry, etc.)
  • divide the population into strata and determine
    the sample size of each stratum
  • randomly select the population elements in each
    stratum

Advantage - a more representative sample is
obtained
Disadvantage - requires information on the
proportion of the total population belonging to
each stratum
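
A minimal sketch of the stratified procedure, assuming proportional allocation (each stratum contributes in proportion to its share of the population). The slides do not say which allocation rule is intended, so this choice, the record layout and the registry labels "A" and "B" are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, n, seed=None):
    """Proportional-allocation stratified sample.

    stratum_of maps a record to its stratum label. Because of rounding,
    the stratum sizes may not add up to exactly n.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)
    total = len(records)
    sample = {}
    for name, members in strata.items():
        # stratum sample size proportional to the stratum's population share
        n_h = min(len(members), round(n * len(members) / total))
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical population: 600 cases in registry "A", 400 in registry "B"
records = [{"id": k, "registry": "A" if k < 600 else "B"} for k in range(1000)]
by_registry = stratified_sample(records, lambda r: r["registry"], n=100, seed=1)
```

The proportions used in the allocation are exactly the auxiliary information that the slide lists as the disadvantage of this design.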
10
Systematic sampling
  • determine the sample size (n); the sampling
    interval is then i = N/n
  • randomly select a number r from 1 to i
  • select all the other subjects in the following
    positions: r, r + i, r + 2i, etc., until the
    sample is exhausted

Advantage - eliminates the possibility of
autocorrelation
Disadvantage - only the first element is selected
on a probability basis → pseudo-random sampling
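
The selection rule can be sketched as follows. This is an illustrative sketch only: the interval is taken as N/n rounded down (an assumption for the case where N is not a multiple of n), and the function name and the numbered case list are hypothetical.

```python
import random

def systematic_sample(ordered_population, n, seed=None):
    """Select positions r, r + i, r + 2i, ... with interval i = N // n."""
    N = len(ordered_population)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and N")
    i = N // n                       # sampling interval (N/n, rounded down)
    r = random.Random(seed).randint(1, i)  # random start in 1..i (1-based)
    # positions are 1-based in the slide, list indices are 0-based
    return [ordered_population[r - 1 + k * i] for k in range(n)]

# Hypothetical ordered list of 1000 cases, sample of 100 (interval 10)
cases = list(range(1, 1001))
picked = systematic_sample(cases, n=100, seed=7)
```

Only `r` is random here; every later position is fixed once `r` is drawn, which is why the slide calls the method pseudo-random.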
11
How many subjects do we need?
12
The main elements
the probability that the difference will be
detected (power, e.g. 80%, 90%)
the probability that a positive finding is due to
chance alone (significance level, e.g. 1%, 5%)
they explored whether some variables can be
measured with sufficient precision (or are
available) and checked the study vision
13
Previous High Resolution studies
Number of patients was defined based on
  • observed differences in survival and risk of
    death
  • incidence of the cancer under study
  • difficulties in collecting clinical information
  • available economic resources

Notwithstanding that ...
  • we were able to identify statistically
    significant relative excess risks of death
      • up to 1.60 among European countries
      • up to 1.40 among Italian areas
    for breast cancer, for which differences in
    survival are small.
  • → Applicable to other cancers for which survival
    differences are larger

14
Example for breast cancer (diagnosis 95-99)
Plot power as a function of hazard ratio for
a 5% two-sided log-rank test with 80% power
over sample sizes ranging from 100 to 1000
Assume 75% survival as reference (the overall
survival in Europe, range 65-90%)
15
Example for colorectal cancer (diagnosis 95-99)
Plot power as a function of hazard ratio for
a 5% two-sided log-rank test with 80% power
over sample sizes ranging from 100 to 1000
Assume 50% survival as reference (the overall
survival in Europe, range 30-70%)
16
Example for lung cancer (diagnosis 95-99)
Plot power as a function of hazard ratio for
a 5% two-sided log-rank test with 80% power
over sample sizes ranging from 100 to 1000
Assume 10% survival as reference (the overall
survival in Europe, range 5-20%)
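
The slides do not state how the power curves in the three examples above were computed. One common approximation is Schoenfeld's formula for the log-rank test, sketched below under loudly stated assumptions: two groups of equal size, proportional hazards (so survival in the comparison group is the reference survival raised to the hazard ratio), and no censoring before the end of follow-up. The function names and the hazard ratios in the loop are illustrative only.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def logrank_power(n_total, hazard_ratio, ref_survival, alpha=0.05):
    """Approximate power of a two-sided log-rank test (Schoenfeld).

    Assumes equal group sizes, proportional hazards and no early censoring.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    cmp_survival = ref_survival ** hazard_ratio  # proportional hazards
    # expected deaths in the two groups of n_total / 2 subjects each
    events = n_total / 2 * ((1 - ref_survival) + (1 - cmp_survival))
    return _STD_NORMAL.cdf(sqrt(events) * abs(log(hazard_ratio)) / 2 - z_alpha)

def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Number of deaths needed to detect hazard_ratio (equal allocation)."""
    z_sum = _STD_NORMAL.inv_cdf(1 - alpha / 2) + _STD_NORMAL.inv_cdf(power)
    return ceil(4 * z_sum ** 2 / log(hazard_ratio) ** 2)

# Reference survival from the three examples: breast 75%, colorectal 50%, lung 10%
for label, s0 in [("breast", 0.75), ("colorectal", 0.50), ("lung", 0.10)]:
    for n in (100, 500, 1000):
        print(label, n, round(logrank_power(n, hazard_ratio=1.4, ref_survival=s0), 2))
```

Under these assumptions power depends on the number of deaths rather than on the number of patients, so for a given sample size a cancer with low reference survival yields more events than one with high survival.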
17
Present High Resolution studies
We want to analyse both differences in survival
and adherence to standard care
This requires power analysis for both logistic
regression (to analyse the odds of receiving one
type of care, typically standard care) and
relative survival analysis (to analyse
differences in relative survival and relative
excess risks of death)
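
For the logistic regression aim, a rough first approximation is the classical sample-size formula for comparing two proportions, with the second proportion derived from the odds ratio to be detected. This is a sketch under assumptions, not the method used in the studies: it treats the model as having a single binary predictor (for example, two areas), and the function name and the example values (50% receiving standard care in the reference group, odds ratio 2) are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_for_odds_ratio(p_ref, odds_ratio, alpha=0.05, power=0.80):
    """Subjects per group needed to detect odds_ratio (must differ from 1).

    Normal-approximation formula for a two-sided test of two proportions.
    """
    # convert the odds ratio into the proportion in the comparison group
    odds_cmp = odds_ratio * p_ref / (1 - p_ref)
    p_cmp = odds_cmp / (1 + odds_cmp)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p_ref + p_cmp) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p_ref * (1 - p_ref) + p_cmp * (1 - p_cmp))) ** 2
    return ceil(numerator / (p_ref - p_cmp) ** 2)

# Hypothetical scenario: 50% standard care in the reference group, odds ratio 2
n_needed = n_per_group_for_odds_ratio(p_ref=0.50, odds_ratio=2.0)
```

Adjusting for covariates, as a real logistic model would, generally increases the required sample size, so a figure obtained this way is best read as a lower bound.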
18
Conclusions
  • Taking into account
      • existing sampling and power methodology
      • experience from previous studies
      • different coverage of Cancer Registries
      • available economic resources
  • We want to
      • standardize the selection of data
      • include a minimum number of cases that
        satisfies statistical considerations related
        to all aims of our studies

Prof. J.S. Long¹ (Regression Models for Categorical
and Limited Dependent Variables, 1997) suggests that
sample sizes of less than 100 cases should be
avoided and that 500 observations should be
adequate for almost any situation.
¹ Professor of Sociology and Statistics at Indiana
University
19
Thank you for your attention
And what about your experience?