Title: Sampling Techniques
1Sampling Techniques
- Dr. Shaik Shaffi Ahamed Ph.D.,
- Assistant Professor
- Department of Family Community Medicine
- College of Medicine
- King Saud University
2Why should we take a sample? Can't we study the whole?
- It is possible
- depends on the objective
- to know how many people live in a country
- - age and sex categories
- - changing pattern of the age structure
- - when planning for the country
- CENSUS
- - deaths in a hospital
- record all the deaths
- It is not possible
- - to test the life of bulbs: burning every bulb until it fails leaves none to use
- - to count the blood cells in blood: drawing all the blood to count them
- - to count the stars in the sky
- It is not necessary
- - to estimate Hb in blood: a drop of blood is enough, because blood from any part of the body will give the same result
3Populations and Sampling: Reasons for using samples
- There are many good reasons for studying a sample instead of an entire population:
- Samples can be studied more quickly than populations. Speed can be important if a physician needs to determine something quickly, such as a vaccine or treatment for a new disease.
- A study of a sample is less expensive than a study of an entire population because a smaller number of items or subjects is examined. This consideration is especially important in the design of large studies that require a long follow-up.
- A study of the entire population is impossible in most situations.
- Sample results are often more accurate than results based on a population.
4Sampling in Epidemiology
- Why Sample?
- Unable to study all members of a population
- Reduce bias
- Save time and money
- Measurements may be better in a sample than in the entire population
- Feasibility
5Sampling
- Sampling is the process or technique of selecting
a sample of appropriate characteristics and
adequate size.
6Terminology
- Study Population
- A population may be defined as an aggregate of all things / units possessing a common trait or characteristic.
- The whole collection of units (the universe).
7Terminology Cont.
- Target (Study) Population
- The population that possesses a characteristic (parameter) which we wish to estimate or about which we wish to draw conclusions.
- The population to which you expect the eventual results of the research to apply (target of inference).
- It may be real or hypothetical.
8Terminology Cont.
- Sample
- A selected subset of the study population.
- Chosen by some process (e.g. sampling) with the objective of investigating a particular characteristic (parameter) of the study population.
- Sampling
- Process of obtaining a sample from the target population.
9Terminology Cont.
- Sampling Frame
- This is the complete list of sampling units in the target population to be subjected to the sampling procedure.
- Completeness and accuracy of this list are essential for the success of the study.
- Sampling Units
- These are the individual units / entities that make up the frame, just as elements are the entities that make up the population.
10Terminology Cont.
- Study Participants
- Subjects who are actually participating in the study.
- Subset of the study population who were contactable and consented / agreed to participate.
11Study Participants - Cont.
- Study participants may still not be representative of the target population, even with random sampling, because of:
- Sampling frame is out of date.
- Failure to recruit eligible subjects.
- Non-consent or non-response.
- Drop out / withdrawal.
12Terminology Cont.
- Sampling Error
- This arises out of random sampling and is the discrepancy between the sample values and the population value.
- Sampling Variation
- Due to infinite variations among individuals and their surrounding conditions.
- Produces differences among samples from the same population and is due to chance.
13If we repeat the same study under exactly similar conditions, we will not necessarily get identical results.
- Example: In a clinical trial of 200 patients we find that the efficacy of a particular drug is 75%.
- If we repeat the study using the same drug in another group of 200 similar patients, we will not necessarily get the same efficacy of 75%. It could be 78% or 71%.
- Different trials give different results even though all of them are conducted under the same conditions.
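The behavior described on this slide can be illustrated with a small simulation. This is a minimal sketch and not part of the original slides: it assumes that each of the 200 patients responds independently and that the true efficacy is 75%. The function name and the seed value are chosen only for illustration.

```python
import random

def simulate_trial(n_patients, true_efficacy, rng):
    """Simulate one trial and return the observed efficacy in percent."""
    cured = sum(1 for _ in range(n_patients) if rng.random() < true_efficacy)
    return 100.0 * cured / n_patients

rng = random.Random(12345)  # arbitrary seed so the run can be repeated
observed = [simulate_trial(200, 0.75, rng) for _ in range(5)]
print(observed)  # five observed efficacies from five "identical" trials
```

Each call uses the same drug, the same trial size and the same true efficacy, so any difference between the printed values is sampling variation alone.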
14- Example
- If two drugs have the same efficacy, then the difference between the cure rates of these two drugs should be zero.
- But in practice we may not get a difference of zero.
- If we find the difference is small, say 2%, 3%, or 5%, we may accept the hypothesis that the two drugs are equally effective.
- On the other hand, if we find the difference to be large, say 25%, we would infer that the difference is too large to be due to chance and conclude that the drugs are not of equal efficacy.
15- Example
- Suppose we are testing the claim of a pharmaceutical company that the efficacy of a particular drug is 80%.
- We may accept the company's claim if we observe the efficacy in the trial to be 78%, 81%, 83% or 77%.
- But if the efficacy in the trial happens to be 50%, we would have good cause to feel that the true efficacy cannot be 80%.
- If the true efficacy were 80%, the chance of such a result would be very low. We then tend to dismiss the claim that the efficacy of the drug is 80%.
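How low "very low" is can be worked out with the binomial distribution. The sketch below is an illustration under stated assumptions: the slide does not give the trial size, so 200 patients is assumed here, and patients are assumed to respond independently. The helper name `prob_at_most` is hypothetical.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 200          # assumed trial size, not given on the slide
claimed = 0.80   # the company's claimed efficacy

print(prob_at_most(156, n, claimed))  # chance of observing 78% or less
print(prob_at_most(100, n, claimed))  # chance of observing 50% or less
```

Under these assumptions an observed efficacy of 78% is unremarkable, while 50% would be so improbable that the claim of 80% is hard to sustain.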
16- THEREFORE
- WHILE TAKING DECISIONS BASED ON EXPERIMENTAL DATA WE MUST GIVE SOME ALLOWANCE FOR SAMPLING VARIATION.
- VARIATION BETWEEN ONE SAMPLE AND ANOTHER SAMPLE IS KNOWN AS SAMPLING VARIATION.
17Decisions Required for selecting sample
- 1. Specify the target population. This is entirely determined by the research objective.
- 2. Specify the study population (e.g. who is eligible for inclusion in the study).
- 3. Select a sampling design for obtaining a sample for study.
- 4. Plan a strategy to ensure a high response or participation rate; otherwise inference must take account of non-response.
- These decisions have considerable impact on study validity (soundness of the conclusions or inferences made).
18Study populations and sampling summarized schematically
Target population (real or hypothetical)
-> select based on judgment and accessibility ->
Study population
-> probability sampling ->
Sample
-> consent or respond ->
Participants in study
19How to sample ?
- In general, there are 2 requirements:
- A sampling frame must be available; otherwise construct one or use special sampling techniques. Frame construction may not be easy.
- Choose an appropriate sampling method to draw a sample from the frame.
20The Sampling Design Process
Fig. 11.1
21Classification of Sampling Techniques
Fig. 11.2
Probability Sampling Techniques
Other Sampling Techniques
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Cluster Sampling
22Simple Random Sampling
- A sample may be defined as random if every sampling unit in the study population has an equal chance of being selected.
- Selection of an SRS may be done by:
- Drawing the number or name from a hat or box.
- Using a random number table.
- Using a computer to generate the numbers.
24SRS Methods
- Lottery Method
- Random Number Table method
25Example
- A Tattslotto draw is a good example of simple
random sampling. A sample of 6 numbers is
randomly generated from a population of 45, with
each number having an equal chance of being
selected.
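A draw of this kind can be mimicked on a computer, which is the third selection method listed for SRS. The sketch below uses Python's standard `random` module; the function name `draw_numbers` is made up for this illustration and this is not a description of how any real lottery is run.

```python
import random

def draw_numbers(population_size=45, sample_size=6, rng=None):
    """Draw sample_size distinct numbers from 1..population_size, each equally likely."""
    rng = rng or random.Random()
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

print(draw_numbers())  # six distinct numbers between 1 and 45
```

Because `sample` draws without replacement, no number can appear twice, just as a ball cannot be drawn twice.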
26Tables of random numbers
- are used after numbers have been assigned to members of the study population. Use the random number table to select subjects. Start anywhere. Continue selecting until the desired sample size is reached.
27 Random Number table
28How to select a simple random sample
- Define the population
- Determine the desired sample size
- List all members of the population or the potential subjects
- For example:
- 4th grade boys who have demonstrated problem behaviors
- Let's select 10
29Potential Subject Pool
30So our selected subjects are numbers 10, 22, 24,
15, 6, 1, 25, 11, 13, 16.
31- Simple random sampling
- Estimate hemoglobin levels in patients with sickle cell anemia
- Determine sample size
- Obtain a list of all patients with sickle cell anemia in a hospital or clinic
- Patient is the sampling unit
- Use the lottery method or a table of random numbers to select units from the sampling frame
- Measure hemoglobin in all selected patients
- Calculate mean and standard deviation of the sample
32- Simple random sampling
- Advantages
- Simple process and easy to understand
- Easy calculation of means and variance
- Disadvantages
- Not the most efficient method, that is, not the most precise estimate for the cost
- Requires knowledge of the complete sampling frame
- Cannot always be certain that there is an equal chance of selection
- Non-respondents or refusals
33Sampling in Epidemiology
- Systematic sampling
- The sampling units are spaced regularly throughout the sampling frame, e.g., every 3rd unit would be selected
- May be used as either a probability sample or not
- Not a probability sample unless the starting point is randomly selected
- Non-random sample if the starting point is determined by some mechanism other than chance
34Systematic Sampling
- The sample is chosen by selecting a random starting point and then picking every i-th element in succession from the sampling frame.
- The sampling interval, i, is determined by dividing the population size N by the sample size n and rounding to the nearest integer.
- For example, there are 100,000 elements in the population and a sample of 1,000 is desired. In this case the sampling interval, i, is 100. A random number between 1 and 100 is selected. If, for example, this number is 23, the sample consists of elements 23, 123, 223, 323, 423, 523, and so on.
35Example
- If a systematic sample of 500 students were to be carried out in a university with an enrolled population of 10,000, the sampling interval would be:
- i = N/n = 10,000/500 = 20
- All students would be assigned sequential numbers. The starting point would be chosen by selecting a random number between 1 and 20. If this number was 9, then the 9th student on the list of students would be selected along with every following 20th student. The sample of students would be those corresponding to student numbers 9, 29, 49, 69, ..., 9929, 9949, 9969 and 9989.
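The procedure in this example can be sketched in a few lines. The function below is illustrative only (its name and parameters are not from the slides); it computes the interval as N/n rounded to the nearest integer and, when no starting point is supplied, picks one at random between 1 and the interval.

```python
import random

def systematic_sample(population_size, sample_size, start=None, rng=None):
    """Return the 1-based list positions chosen by systematic sampling."""
    interval = round(population_size / sample_size)  # i = N / n, rounded
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, interval)  # random start between 1 and i
    return list(range(start, population_size + 1, interval))[:sample_size]

chosen = systematic_sample(10_000, 500, start=9)
print(chosen[:4], chosen[-4:])  # [9, 29, 49, 69] [9929, 9949, 9969, 9989]
```

Passing `start=9` reproduces the student numbers in the example; leaving `start` unset gives a proper probability sample with a random starting point.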
36Systematic Sampling
- Decide on sample size n
- Divide the population of N individuals into groups of k individuals: k = N/n
- Randomly select one individual from the 1st group.
- Select every k-th individual thereafter.
Example: N = 64, n = 8, k = 8
First Group
38- Systematic sampling
- Advantages
- Sampling frame does not need to be defined in advance
- Easier to implement in the field
- If there are unrecognized trends in the sampling frame, a systematic sample ensures coverage of the spectrum of units
- Disadvantages
- Variance cannot be estimated unless assumptions are made
39Stratified Sampling
- A two-step process in which the population is first partitioned into subpopulations, or strata.
- The strata should be mutually exclusive and collectively exhaustive, in that every population element should be assigned to one and only one stratum and no population elements should be omitted.
- Next, elements are selected from each stratum by a random procedure, usually SRS.
- A major objective of stratified sampling is to increase precision without increasing cost.
40- Stratified random sample
- The sampling frame comprises groups, or strata, with certain characteristics
- A sample of units is selected from each group or stratum
41Sampling in Epidemiology
- Stratified random sample
- Assess dietary intake in adolescents
- Define three age groups 11-13, 14-16, 17-19
- Stratify age groups by sex
- Obtain a list of children in this age range from schools
- Randomly select children from each of the 6 strata until the sample size is obtained
- Measure dietary intake
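The selection step of this example can be sketched as follows. The roster, its field names, the stratum size of 20 and the choice of 5 children per stratum are all hypothetical and exist only to make the sketch runnable; a real study would use the school lists and the calculated sample size.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, rng=None):
    """Draw a simple random sample of per_stratum units from every stratum."""
    rng = rng or random.Random()
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # each unit falls in exactly one stratum
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical roster: 20 children in each of the 6 age-by-sex strata.
roster = [{"id": f"{age}-{sex}-{i}", "age_group": age, "sex": sex}
          for age in ("11-13", "14-16", "17-19")
          for sex in ("F", "M")
          for i in range(20)]

sample = stratified_sample(roster, lambda c: (c["age_group"], c["sex"]), per_stratum=5)
for stratum, children in sorted(sample.items()):
    print(stratum, [c["id"] for c in children])
```

Sampling within each stratum separately is what guarantees that all six age-by-sex groups appear in the final sample.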
42Stratified random selection for drug trial in hypertension
Severe
Mild
Moderate
43- Stratified random sample
- Advantages
- Assures that certain subgroups are represented in a sample
- Allows the investigator to estimate parameters in different strata
- More precise estimates of the parameters because strata are more homogeneous, e.g., smaller variance within strata
- Strata of interest can be sampled more intensively, e.g., groups with the greatest variance
- Administrative advantages
- Disadvantages
- Loss of precision if a small number of units is sampled from strata
44Cluster Sampling
- The population is first divided into mutually exclusive groups of elements called clusters.
- Ideally, each cluster is a representative small-scale version of the population (i.e. a heterogeneous group).
- A simple random sample of the clusters is then taken.
- All elements within each sampled (chosen) cluster form the sample.
- Elements within a cluster should be as heterogeneous as possible, but the clusters themselves should be as similar to one another (homogeneous) as possible.
45- Cluster sampling
- Estimate the prevalence of dental caries in school children
- Among the schools in the catchment area, list all of the classrooms in each school
- Take a simple random sample of classrooms, or clusters of children
- Examine all children in a cluster for dental caries
- Estimate the prevalence of caries within clusters, then combine into an overall estimate, with variance
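The steps of this example can be sketched as below. The classroom data are simulated and purely hypothetical (1 means caries present, 0 means absent), and the sketch stops at the point estimate: it does not compute the variance that the last step calls for, which needs a cluster-aware formula.

```python
import random

def cluster_sample_prevalence(classrooms, n_clusters, rng=None):
    """Sample whole classrooms, examine every child in them, and pool the results."""
    rng = rng or random.Random()
    chosen = rng.sample(sorted(classrooms), n_clusters)  # SRS of clusters
    per_cluster = {name: sum(classrooms[name]) / len(classrooms[name])
                   for name in chosen}
    examined = [child for name in chosen for child in classrooms[name]]
    overall = sum(examined) / len(examined)
    return per_cluster, overall

# Hypothetical data: one list of 0/1 caries indicators per classroom.
data_rng = random.Random(7)
classrooms = {f"class-{k}": [int(data_rng.random() < 0.3) for _ in range(25)]
              for k in range(12)}

per_cluster, overall = cluster_sample_prevalence(classrooms, n_clusters=4,
                                                 rng=random.Random(1))
print(per_cluster)
print(overall)
```

Only the four selected classrooms need to be enumerated and examined, which is the practical appeal of the method.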
46- Cluster sampling
- Advantages
- The entire sampling frame need not be enumerated in advance, only the clusters once they are identified
- More economical in terms of resources than simple random sampling
- Disadvantages
- Loss of precision, i.e., larger variance, but this can be offset by sampling a larger number of clusters
47Multistage Sampling
- Similar to cluster sampling except that there are two sampling events instead of one
- Primary units are randomly selected
- Individual units within the primary units are randomly selected for measurement
48MultiStage Sampling
- This sampling method is actually a combination of the basic sampling methods carried out in stages.
- The aim is to subdivide the population into progressively smaller units by random sampling at each stage.
49Sampling in Epidemiology
- Multistage sampling
- Estimate the prevalence of dental caries in school children
- Among the schools in the catchment area, list all of the classrooms in each school
- Take a simple random sample of classrooms, or clusters of children
- Enumerate the children in each classroom
- Take a simple random sample of children within the classroom
- Examine the sampled children in each cluster for dental caries
- Estimate the prevalence of caries within clusters, then combine into an overall estimate, with variance
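The two sampling events in this example can be sketched as follows. The classroom names, the 30 children per classroom and the sample sizes are hypothetical values chosen for illustration.

```python
import random

def two_stage_sample(classrooms, n_classrooms, n_children, rng=None):
    """Stage 1: sample classrooms. Stage 2: sample children within each chosen classroom."""
    rng = rng or random.Random()
    chosen_rooms = rng.sample(sorted(classrooms), n_classrooms)
    return {room: rng.sample(classrooms[room], n_children) for room in chosen_rooms}

# Hypothetical enumeration of the children in each classroom.
classrooms = {f"class-{k}": [f"child-{k}-{j}" for j in range(30)] for k in range(10)}

selected = two_stage_sample(classrooms, n_classrooms=3, n_children=8,
                            rng=random.Random(3))
for room, children in selected.items():
    print(room, children)
```

The only difference from one-stage cluster sampling is the second `sample` call: a subset of children is examined in each chosen classroom rather than all of them.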
50Classification of Sampling Techniques
Fig. 11.2
Probability Sampling Techniques
Other Sampling Techniques
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Cluster Sampling
51Sampling Methods Non-probability samples
52Convenience Sampling
- Convenience sampling attempts to obtain a sample of convenient elements. Often, respondents are selected because they happen to be in the right place at the right time.
- use of students and members of social organizations
- mall intercept interviews without qualifying the respondents
- department stores using charge account lists
- "people on the street" interviews
53- Convenience sample
- Case series of patients with a particular condition at a certain hospital
- Normal graduate students walking down the hall are asked to donate blood for a study
- Children with febrile seizures reporting to an emergency room
- Investigator decides who is enrolled in a study
54Judgmental Sampling
- Judgmental sampling is a form of convenience sampling in which the population elements are selected based on the judgment of the researcher.
- It involves hand-picking from the accessible population those individuals judged most appropriate for the study.
56Quota Sampling
- Quota sampling may be viewed as two-stage restricted judgmental sampling.
- The first stage consists of developing control categories, or quotas, of population elements.
- In the second stage, sample elements are selected based on convenience or judgment.

Control characteristic   Population composition (%)   Sample composition (%)   Sample composition (number)
Sex: Male                48                           48                       480
Sex: Female              52                           52                       520
Total                    100                          100                      1000
57QUOTA SAMPLING
58Snowball Sampling
- In snowball sampling, an initial group of respondents is selected, usually at random.
- After being interviewed, these respondents are asked to identify others who belong to the target population of interest.
- Subsequent respondents are selected based on the referrals.
59Consecutive sample
- Consecutive sample
- A case series of consecutive patients with a condition of interest
- Consecutive series means ALL patients with the condition within the hospital or clinic, not just the patients the investigators happen to know about
60- Consecutive sample
- Outcome of 1000 consecutive patients presenting to the emergency room with chest pain
- Natural history of all 125 patients with HIV-associated TB during a 5-year period
- Explicit efforts must be made to identify and recruit ALL persons with the condition of interest
61Sampling Methods Non-probability samples
- Depends on expert opinion.
- Probabilities of selection are not considered.
- Advantages include convenience, speed, and lower cost.
- Disadvantages
- Lack of accuracy,
- lack of generalizability of results.
62Summary of sampling methods
- Availability sampling: selecting on the basis of convenience.
- Random sampling: every combination of a given size has an equal chance of being chosen.
- Cluster sampling: dividing the population into clusters, typically on the basis of geography, and taking a sample of the clusters.
- Snowball sampling: asking individuals studied to provide references to others.
- Multi-stage sampling: sampling subunits within sampled units.
- Stratified sampling: dividing the population into groups on the basis of some characteristic and then sampling each group.
- Quota sampling: selecting fixed numbers of units in each of a number of categories.
- Systematic sampling: choosing every nth item from a list, beginning at a random point.
63Strengths and Weaknesses of Basic Sampling Techniques
Table 11.3
Nonprobability sampling
- Convenience sampling. Strengths: least expensive, least time-consuming, most convenient. Weaknesses: selection bias, sample not representative, not recommended for descriptive or causal research.
- Judgmental sampling. Strengths: low cost, convenient, not time-consuming. Weaknesses: does not allow generalization, subjective.
- Quota sampling. Strengths: sample can be controlled for certain characteristics. Weaknesses: selection bias, no assurance of representativeness.
- Snowball sampling. Strengths: can estimate rare characteristics. Weaknesses: time-consuming.
Probability sampling
- Simple random sampling (SRS). Strengths: easily understood, results projectable. Weaknesses: difficult to construct sampling frame, expensive, lower precision, no assurance of representativeness.
- Systematic sampling. Strengths: can increase representativeness, easier to implement than SRS, sampling frame not necessary. Weaknesses: can decrease representativeness.
- Stratified sampling. Strengths: includes all important subpopulations, precision. Weaknesses: difficult to select relevant stratification variables, not feasible to stratify on many variables, expensive.
- Cluster sampling. Strengths: easy to implement, cost effective. Weaknesses: imprecise, difficult to compute and interpret results.
64Random . . .
- Random Selection vs. Random Assignment
- Random Selection: every member of the population has an equal chance of being selected for the sample.
- Random Assignment: every member of the sample (however chosen) has an equal chance of being placed in the experimental group or the control group.
- Random assignment allows for individual differences among test participants to be averaged out.
65Subject Selection (Random Selection)
Choosing which potential subjects will actually
participate in the study
66Subject Assignment (Random Assignment)
Deciding which group or condition each subject
will be part of
Group B
Group A
67Population: 200 8th graders
- Strata: 40 high-IQ students, 120 average-IQ students, 40 low-IQ students
- Random selection: 30 students from each stratum
- Random assignment: each set of 30 students is split into Group A (15 students) and Group B (15 students)
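The two steps on this slide, random selection followed by random assignment, can be sketched together. The student identifiers below are hypothetical placeholders sized to match the three strata on the slide; the function name is chosen for illustration.

```python
import random

def select_and_assign(strata, n_select, rng=None):
    """Randomly select n_select students per stratum, then split them evenly into A and B."""
    rng = rng or random.Random()
    groups = {}
    for name, students in strata.items():
        selected = rng.sample(students, n_select)  # random selection
        rng.shuffle(selected)                      # random assignment
        half = n_select // 2
        groups[name] = {"A": selected[:half], "B": selected[half:]}
    return groups

# Hypothetical identifiers matching the stratum sizes on the slide.
strata = {
    "high": [f"H{i}" for i in range(40)],
    "average": [f"M{i}" for i in range(120)],
    "low": [f"L{i}" for i in range(40)],
}
groups = select_and_assign(strata, n_select=30, rng=random.Random(0))
print({name: (len(g["A"]), len(g["B"])) for name, g in groups.items()})
# {'high': (15, 15), 'average': (15, 15), 'low': (15, 15)}
```

Selection decides who enters the study; assignment decides which group each selected student joins. The two steps use chance for different purposes.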
68Randomization (Random assignment to two
treatments)
- Randomization tends to produce study groups comparable with respect to known and unknown risk factors,
- removes investigator bias in the allocation of participants,
- and guarantees that statistical tests will have valid significance levels.
- The trialist's most powerful weapon against bias
69Randomization (Cont)
- Simple randomization Toss a Coin
- AAABBAAAAABABABBAAAABAA
- Random permuted blocks (Block Randomization)
- AABB-ABBA-BBAA-BAAB-ABAB-AABB-
70Block Randomization
- Each block contains all conditions of the
experiment in a randomized order.
E, C, C, E
C, E, C, E
E, E, C, C
Control Group: N = 6
Experimental Group: N = 6
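Block randomization of this kind can be sketched as below. The block composition of two E and two C per block of four is read off the slide; the function name and seed are illustrative assumptions.

```python
import random

def permuted_blocks(n_blocks, block=("E", "E", "C", "C"), rng=None):
    """Return n_blocks blocks, each a random ordering of the same conditions."""
    rng = rng or random.Random()
    blocks = []
    for _ in range(n_blocks):
        current = list(block)
        rng.shuffle(current)  # randomize the order within this block
        blocks.append(current)
    return blocks

blocks = permuted_blocks(3, rng=random.Random(42))
allocation = [arm for b in blocks for arm in b]
print(blocks)
print(allocation.count("E"), allocation.count("C"))  # 6 6
```

Unlike tossing a coin for each participant, this keeps the two groups the same size after every completed block, which is why three blocks of four give exactly six participants per group.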
71Prevalence and risk factors of HIV 1 and HIV 2 infection in urban and rural areas in TN. Int. J. of STD AIDS 1998; 9: 98-103
Objective: Find prevalence and risk factors.
Setting: Centers in metropolitan city municipality.
Subjects: Individuals in Tamil Nadu.
Sampling procedure: Health camps were organized in 5 urban and 5 rural centers to cover the entire state geographically. Every third person screened, in the active reproductive age group, was recruited as a subject. At each camp the inclusion of subjects continued until 200 persons were recruited.
72Sex differences in the use of asthma drugs: cross-sectional study. BMJ 1998; 317: 1434-7
Objective: To assess the use of asthma drugs.
Design: Cross-sectional study.
Setting: Six general practices in East Anglia.
Subjects: Adults aged 20-54 with asthma.
Sampling method: Cases with asthma who had received drugs in the previous year were identified through the database of each participating practice. The sample was stratified into three categories of severity corresponding to the prescribed drugs:
- Bronchodilator alone (mild): 38%
- Steroids (moderate): 57%
- Nebulizer treatment (severe): 5%
SRS was used to select subjects in each practice based on the proportion of use of each type of drug within the practice.
73Genital ulcer disease and acquisition of HIV infection. Indian J Med Microbiol 1992; 10(4): 265-269
Objective: To find out the association of HIV infection with genital ulcer disease.
Setting: Dept. of STD, GGH, Chennai.
Subjects: Individuals attending the STD dept.
Sampling procedure: Once a week for 40 weeks, blood samples from the first 20 patients were taken for analysis.
74Prevalence of serious eye disease and visual impairment in a north London population: population based, cross sectional study. BMJ 1998; 316: 1643-48.
Objective: To estimate the prevalence of eye disorders and of visual impairment.
Design: Cross-sectional survey.
Setting: General practices in metropolitan England.
Subjects: People aged 65 or older registered with the general practices.
75Sampling Procedure
- 17 general practice groups
- Random sampling: 7 were selected
- People aged 65 or older were registered with the general practices (total 750-850 in each general practice)
- SRS was used to select eligible people in each practice
- One third of those in each practice were selected to form the survey sample
76A die is rolled to decide which one of the six volunteers will get a new, experimental vaccine
- A. Simple random sampling
- B. Stratified random sampling
- C. Cluster sampling
- D. Systematic random sampling
77A sample of students in a school is chosen as follows: two students are selected from each batch by picking roll numbers at random from the attendance registers
- A. Simple random sampling
- B. Stratified random sampling
- C. Cluster sampling
- D. Systematic random sampling
78A target population for a telephone survey is picked by selecting 10 pages from a total of 100 pages of a telephone directory, using a table of random numbers. On each of the selected pages, all listed persons are called for interview
- A. Simple random sampling
- B. Stratified random sampling
- C. Cluster sampling
- D. Systematic random sampling
79The number 35 is a two-digit random number generated by a calculator. A sample of two-wheelers in a state is selected by picking all those vehicles that have registration numbers ending with 35
- A. Simple random sampling
- B. Stratified random sampling
- C. Cluster sampling
- D. Systematic random sampling
80Example
- A medical student in a city in South Africa conducted a survey to measure the prevalence of HIV in his village. He used simple random sampling to select the subjects. At the end of his study, he was able to estimate the prevalence in the general population of the village. However, he was not able to calculate the prevalence of HIV in some subgroups, such as homosexuals, due to the absence of this subgroup from his sample. So, to guarantee the presence of such a rare group, what kind of sampling should he have used?
- A. Systematic random sample.
- B. Cluster sample.
- C. Multistage sample.
- D. Stratified random sample.
- E. None of the above.
81Example
- A post-graduate trainee in family medicine was assigned a project to evaluate the effect of teachers' smoking on students' behavior. He presented the following scenario as an explanation of his method of subject selection:
- Out of 400 schools in Riyadh, 30 schools were selected randomly, and then all subjects (teachers) in each selected school will be included in the study
- The type of sampling method is
- A. Multi-staged sample
- B. Cluster sample
- C. Simple random sample
- D. Stratified random sample
- E. None of the above
82Example
- Stratified random sample
- A. Make use of random number tables
- B. Is one type of non-random sample
- C. Divide the population into groups or clusters according to characteristic of interest
- D. Take all units in some clusters
- E. Increase precision