Title: Stratified sampling technique.
1welcome
2MAJOR CREDIT SEMINAR
ON
- STRATIFIED RANDOM SAMPLING
- By Shashank kshandakar
M.V.Sc(1st year)
Division
of lesit
3Content....
- Definitions and concepts in sampling
- Principles of sampling
- Advantages of sampling
- Types of sampling
- Stratified random sampling and its properties
4Important definitions
- Population: An aggregate of the objects under study.
- Sample: A finite subset of the statistical individuals in a population.
- Sampling unit: A set of units considered for selection at some stage of sampling.
- Sampling scheme: The method of selecting sampling units from the population.
5Important definitions.
- Census: The process of complete enumeration of all the elements or units in the population.
- Sample size: The number of elements or units in a sample.
- Accuracy: How close the estimate is to the true value, measured by the deviation of the estimate from the true value.
- Precision: How close the estimate is to its average value over repeated samples, measured by the reciprocal of the variance.
6Sampling Frame
- Sampling frame: A complete list, map or other acceptable material that serves as a guide to the population to be covered.
- A sampling frame should let us identify every single element and include any of them in the sample.
- The sampling frame must be representative of the population.
- A sampling frame must be up to date.
7Important definitions
- Sampling fraction: The ratio of the sample size (n) to the population size (N), i.e. n/N.
- Finite population correction: 1 - (sampling fraction), i.e. 1 - n/N.
- Sample space: The set of all possible samples.
- Relation between cost (per unit of sample selection) and precision.
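As a small illustration of the sampling fraction and the finite population correction, here is a minimal Python sketch. The function names and the figures n = 50 and N = 200 are assumptions chosen for the example, not values from the seminar.

```python
def sampling_fraction(n: int, N: int) -> float:
    """Sampling fraction f = n / N for a sample of n units from a population of N."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return n / N


def finite_population_correction(n: int, N: int) -> float:
    """Finite population correction: 1 - n/N."""
    return 1 - sampling_fraction(n, N)


# Hypothetical figures: 50 units sampled from a population of 200.
f = sampling_fraction(50, 200)
fpc = finite_population_correction(50, 200)
print(f, fpc)
```

When n is small relative to N, the correction is close to 1 and is often ignored.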
8Important definitions
- Sampling distribution: The values of the statistic under consideration, one from each possible sample, can be grouped into a frequency distribution, which is known as the sampling distribution of the statistic.
- Standard error: The standard deviation of the sampling distribution of a statistic.
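The two definitions above can be checked on a toy case by listing every possible sample. The sketch below assumes a made-up population of four values and samples of size 2 drawn without replacement. It builds the sampling distribution of the sample mean and compares its standard deviation with the usual textbook expression (1 - n/N) S²/n for the variance of the mean.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev, variance

population = [2, 4, 6, 8]   # hypothetical population, N = 4
n = 2                       # sample size

# Sample space: every possible sample of size n drawn without replacement.
sample_space = list(combinations(population, n))

# Sampling distribution of the sample mean: one value per possible sample.
sample_means = [mean(s) for s in sample_space]

# Standard error = standard deviation of the sampling distribution.
standard_error = pstdev(sample_means)

# Textbook value for SRS without replacement: sqrt((1 - n/N) * S^2 / n),
# where S^2 is the population variance with divisor N - 1.
N = len(population)
theoretical = sqrt((1 - n / N) * variance(population) / n)

print(len(sample_space), standard_error, theoretical)
```

For a population of realistic size the sample space is far too large to enumerate, which is why the standard error is normally obtained from a formula or estimated from a single sample.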
9Important definitions
- Parameter: A descriptive measure computed from the data of a population.
- Statistic: A descriptive measure computed from sample data.
- Estimator: A rule, function or formula of the variates used for estimating the population parameter.
- Estimate: A particular value of an estimator, obtained from a fixed set of values of a random sample.
10It is very easy and convenient to draw a sample from a homogeneous population
11When the population shows significant variation (heterogeneous), multiple individuals must be observed to find all the characteristics that may exist
12When is sampling not necessary?
- When the population is very small.
- When we have extensive resources.
- When we don't expect a very high response.
13Sampling
- Sampling is the process of selecting units or
elements from the study population in such a way
that the units or elements selected represent the
whole population
14Principles of Sampling Theory
- Principle of Statistical Regularity.
- Principle of Inertia of Large Numbers.
- Principle of Validity.
- Principle of Optimization.
15Principal Steps in a Sample Survey
- Statement of objectives.
- Definition of population to be studied.
- Determination of sampling frame and sampling units.
- Selection of proper sampling design.
- Organization of field work.
- Summary and analysis of data.
16The Sampling Process
- 1. Define the target population
- 2. Select a sampling frame
- 3. Determine if a probability or non-probability sampling method will be chosen
- 4. Plan procedure for selecting sampling units
- 5. Determine sample size
- 6. Select actual sampling units
- 7. Conduct fieldwork
17Advantages of sampling
- 1. Reduced cost of survey.
- 2. Greater speed of getting results.
- 3. Greater accuracy of results.
- 4. Greater scope (when it is impossible to study the whole population).
- 5. Adaptability.
18Types of sampling methods
- Probability sampling
- Non-probability sampling.
19Probability sampling
- The term probability sampling is used when the selection of the sample is based purely on chance.
- Every unit of the population has a known, non-zero probability of being selected for the sample. The probabilities of selection may be equal or unequal, but they must be non-zero and known.
- Probability sampling is also called random sampling.
20Types of probability sampling
- Probability sampling includes
- Simple Random Sampling,
- Systematic Sampling,
- Stratified Random Sampling,
- Cluster Sampling
- Multistage Sampling.
- Multiphase sampling
21Simple random sampling
- Applicable when the population is small, homogeneous and readily available, but there is no guarantee that all segments of the population will be represented in an SRS.
- All units within the frame have an equal probability of selection.
- It provides for the greatest number of possible samples.
- The simplest and most common method of sampling.
- The sample is drawn unit by unit, with equal probability of selection at each draw. Depending on how the units are drawn, SRS is divided into two types: with replacement and without replacement.
22Simple random sampling with replacement
- If a unit is selected, noted and then returned to the population before the next selection is made, and this procedure is repeated n times, it is known as simple random sampling with replacement (SRSWR).
- Number of samples required: $n_0 = Z_\alpha^2\, p q / (\mathrm{S.E.})^2$
- Variance of the sample mean: $\mathrm{Var}(\bar{y}_n) = \sigma^2 / n$, where $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$
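The with-replacement procedure and the sample size formula on this slide can be sketched in Python as follows. The function names, the seed and the input values (Z = 1.96, p = 0.5, S.E. = 0.05) are assumptions made for illustration only.

```python
import random


def srswr(units, n, seed=None):
    """Draw n units one at a time, returning each unit before the next draw."""
    rng = random.Random(seed)
    return rng.choices(units, k=n)   # choices() samples with replacement


def required_sample_size(z, p, se):
    """n0 = Z^2 * p * q / (S.E.)^2, with q = 1 - p."""
    q = 1 - p
    return z ** 2 * p * q / se ** 2


# Hypothetical frame of 10 units; with replacement, n may even exceed N.
frame = list(range(1, 11))
sample = srswr(frame, n=15, seed=1)
n0 = required_sample_size(z=1.96, p=0.5, se=0.05)
print(sample, n0)
```

Because units are returned after each draw, the same unit can appear more than once in the sample.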
23Simple random sampling without replacement
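- If a unit is selected and noted but not returned to the population (the selected unit is not available for further selection), the procedure is known as simple random sampling without replacement (SRSWOR).
- Number of samples required: $n = n_0 / \left(1 + (n_0 - 1)/N\right)$
- Variance of the sample mean: $\mathrm{Var}(\bar{y}_n) = \left(1 - \frac{n}{N}\right) S^2 / n$, where $S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$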
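A minimal Python sketch of drawing without replacement and of the finite-population adjustment to the sample size is given below. The function names, the seed and the example figures (n0 = 384.16, N = 5,346) are assumptions chosen for illustration.

```python
import random


def srswor(units, n, seed=None):
    """Draw n distinct units; a selected unit is not available again."""
    rng = random.Random(seed)
    return rng.sample(units, n)      # sample() draws without replacement


def corrected_sample_size(n0, N):
    """n = n0 / (1 + (n0 - 1) / N), where n0 is the with-replacement size."""
    return n0 / (1 + (n0 - 1) / N)


# Hypothetical frame of 100 units.
frame = list(range(1, 101))
sample = srswor(frame, n=10, seed=0)
n = corrected_sample_size(n0=384.16, N=5346)
print(sample, n)
```

The adjusted size is always smaller than n0, and the reduction becomes noticeable only when n0 is an appreciable fraction of N.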
24 Advantages of SRS
- Minimal knowledge of the population is needed
- Economical and requires less time
- Data are easy to analyze
- Disadvantages of SRS
- High cost and low frequency of use
- Requires a sampling frame
- Does not use the researcher's expertise
- Larger risk of random error than stratified sampling
25STEPS IN COMPUTING THE SIZE OF A SAMPLE
- 1. Determine the size of the target population.
- 2. Decide on the margin of error. As far as possible, the margin of error should not be higher than 5%.
- 3. Use the formula $n = \dfrac{N}{1 + N e^2}$ (Pagoso, et al.), where n = sample size, N = the size of the population, and e = the margin of error.
- 4. Compute the sample proportion by dividing the result of step 3 by the population size.
26STEPS IN COMPUTING THE SIZE OF A SAMPLE
- Population is 5,346
- Margin of error is 3%
- Using the formula: $n = \dfrac{5346}{1 + 5346 \times (0.03)^2} = \dfrac{5346}{5.8114} \approx 920$
- Sample proportion (%) = 920 / 5346 ≈ 17%
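The worked example above can be reproduced with a few lines of Python. This is only a sketch: the function name is hypothetical, and rounding up to the next whole unit is an assumption about how the fractional result is handled. The formula n = N / (1 + N e²) is often referred to as Slovin's formula.

```python
import math


def sample_size(N, e):
    """n = N / (1 + N * e^2), with e given as a fraction (3% -> 0.03)."""
    return N / (1 + N * e ** 2)


N = 5346                          # population size from the example
e = 0.03                          # margin of error of 3%
n = math.ceil(sample_size(N, e))  # round up to a whole number of units
proportion = n / N                # share of the population that is sampled

print(n, round(100 * proportion))
```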
27Procedure of selection of a random sample
- Lottery method
- Use of a random number table:
- Tippett's random number table
- Fisher and Yates random number table
- Kendall and Babington Smith random number table
- A Million Random Digits (RAND Corporation) table
28Tests for randomness of selected random numbers
- Frequency test
- Serial test
- Gap test
- Poker test.
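As an illustration of the first test in the list, the frequency test checks whether the ten digits 0 to 9 occur about equally often. The Python sketch below computes the usual chi-square statistic for that comparison. The function name and the two digit sequences are assumptions made for the example.

```python
from collections import Counter


def frequency_test_statistic(digits):
    """Chi-square statistic comparing observed digit counts with equal counts."""
    counts = Counter(digits)
    expected = len(digits) / 10          # each digit expected equally often
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


# Hypothetical sequences: perfectly balanced digits, and a badly skewed one.
balanced = list(range(10)) * 5
skewed = [0] * 100

print(frequency_test_statistic(balanced), frequency_test_statistic(skewed))
```

The statistic is compared with the chi-square critical value on 9 degrees of freedom (16.919 at the 5% level). A larger value suggests that the digits are not uniformly distributed.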
29Stratified random sampling
- A very commonly used sampling technique.
- Stratum (plural: strata): a clear division into which something is separated.
- The population of N units is divided into K groups or sub-populations called strata.
- The sample is drawn by first dividing the population into homogeneous sub-populations (strata) and then drawing a sample from each sub-population (stratum).
- It ensures proportionate representation of the character under study when drawing a sample from a heterogeneous population, and hence increases the quantity of information for a given cost.
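The two-step procedure described above (divide the population into strata, then sample within each stratum) can be sketched in Python as follows. The herd, the stratifying variable and the per-stratum sample sizes are hypothetical, and simple random sampling without replacement inside each stratum is an assumption of this sketch.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then draw an SRS (without replacement) in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # each unit falls in exactly one stratum
    return {h: rng.sample(members, n_per_stratum[h]) for h, members in strata.items()}


# Hypothetical herd of 100 animals, stratified by purpose.
herd = (
    [(i, "milch") for i in range(60)]
    + [(i, "draught") for i in range(60, 90)]
    + [(i, "dual") for i in range(90, 100)]
)
sample = stratified_sample(
    herd,
    stratum_of=lambda animal: animal[1],
    n_per_stratum={"milch": 6, "draught": 3, "dual": 1},
    seed=42,
)
print({h: len(chosen) for h, chosen in sample.items()})
```

Every stratum contributes to the sample, which is what guarantees representation of each group.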
30How to Stratify and How Many Strata?
- Strata should be internally homogeneous to increase efficiency.
- Usually 5-10 strata are taken. With too many strata, the sample size within each stratum becomes too small.
31Principle of stratification
- Strata should be non-overlapping and should together comprise the whole population.
- Strata should be internally homogeneous with respect to the characters under study.
- Populations are divided into strata on the basis of sex, age, purpose (milch, draught, dual), breed, land holding capacity of livestock owners, or any other auxiliary information.
32Auxiliary information
- Past data or other information related to the character under study, on the basis of which we divide the population into strata such that:
- Within strata, the variable is homogeneous.
- Between strata, the variable is heterogeneous.

Sampling technique                 Use of auxiliary information
Stratified sampling                Construction of strata
PPS sampling                       Sample selection, to get a more efficient estimator of the population parameter
Ratio and regression estimation    For estimation purposes
33Advantage of stratification
- Administrative convenience
- Improving sampling design
- Stratification makes it possible to use different sampling designs in different strata
- Provides a better cross-section of the population, i.e. adequate representation of the various groups of the population
- Gain in precision in the estimation of population parameters
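To make the gain in precision concrete, the sketch below computes the standard stratified estimate of the population mean, ȳ_st = Σ W_h ȳ_h with W_h = N_h / N, and its variance Σ W_h² (1 - n_h/N_h) S_h² / n_h, assuming simple random sampling without replacement within each stratum. The stratum sizes, sample sizes, means and variances are invented for the example.

```python
def stratified_mean(N_h, ybar_h):
    """Weighted mean of stratum sample means, with weights W_h = N_h / N."""
    N = sum(N_h)
    return sum(Nh / N * yb for Nh, yb in zip(N_h, ybar_h))


def stratified_variance(N_h, n_h, S2_h):
    """Variance of the stratified mean under SRS without replacement per stratum."""
    N = sum(N_h)
    return sum(
        (Nh / N) ** 2 * (1 - nh / Nh) * s2 / nh
        for Nh, nh, s2 in zip(N_h, n_h, S2_h)
    )


# Hypothetical figures for three strata.
N_h = [500, 300, 200]        # stratum sizes
n_h = [50, 30, 20]           # stratum sample sizes
ybar_h = [10.0, 20.0, 30.0]  # stratum sample means
S2_h = [100.0, 400.0, 900.0] # stratum variances

print(stratified_mean(N_h, ybar_h), stratified_variance(N_h, n_h, S2_h))
```

Only the within-stratum variances enter the formula, so the more homogeneous the strata are, the smaller the variance of the estimate.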
34Main problem in stratification
- What should the number of strata be?
- How should the sample of size n be allocated among the different strata?
- How should the population be distributed into the various strata?
- How should the boundaries of the strata be determined?
35Allocation of sample size in different strata
- Allocation of the sample to the different strata is done by considering 3 factors:
- Total number of units in the stratum
- Variability within the stratum
- The cost of taking an observation per sampling unit in the stratum
- (N.B. A good allocation is one where maximum precision is obtained with minimum resources.)
36Method of allocation
- Equal allocation
- Proportional allocation
- Neyman allocation
- Optimum allocation.
37Equal allocation
- Most convenient
- An equal number of sample units is drawn from each stratum.
- In terms of the precision of the overall estimate, this is not usually sensible unless all strata are of equal size. However, it may be a good choice if comparing stratum means is the primary focus.
38Proportional allocation
- Given by Bowley (1926)
- Very common in practice
- The sampling fraction is the same in every stratum
39Proportional allocation
Allocate the sample in proportion to the size of each stratum; very widely used
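Proportional allocation can be written as n_h = n × N_h / N, which makes the sampling fraction n_h / N_h equal to n / N in every stratum. A minimal Python sketch follows. The function name and the stratum sizes are assumptions made for the example.

```python
def proportional_allocation(n, N_h):
    """Allocate a total sample of n in proportion to the stratum sizes N_h."""
    N = sum(N_h)
    return [n * Nh / N for Nh in N_h]


# Hypothetical strata of 500, 300 and 200 units and a total sample of 100.
N_h = [500, 300, 200]
n_h = proportional_allocation(100, N_h)
print(n_h)
```

In practice the results are rarely whole numbers and have to be rounded, with a check that the rounded sizes still add up to n.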
40Neyman allocation
- Also known as minimum variance allocation
- First discovered by Tschuprow (1923), but rediscovered by J. Neyman (1934)
- Assumptions: the sampling cost per unit is the same in all strata, and the total size of the sample is fixed
- Allocation of the sample among the different strata is based on joint consideration of stratum size and stratum variation
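Under Neyman allocation the stratum sample size is proportional to the product of stratum size and stratum standard deviation: n_h = n × N_h S_h / Σ N_h S_h. The Python sketch below illustrates this. The function name and all figures are hypothetical, and the stratum standard deviations S_h are assumed to be known or estimated from past data.

```python
def neyman_allocation(n, N_h, S_h):
    """Allocate n across strata in proportion to N_h * S_h."""
    products = [Nh * Sh for Nh, Sh in zip(N_h, S_h)]
    total = sum(products)
    return [n * p / total for p in products]


# Hypothetical strata: sizes and standard deviations.
N_h = [500, 300, 200]
S_h = [10, 20, 30]
n_h = neyman_allocation(170, N_h, S_h)
print(n_h)
```

Compared with proportional allocation, the smaller but more variable strata receive a larger share of the sample.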
41Optimum allocation
- Optimum allocation means choosing the sample sizes in the strata so as to do one of the following:
- Minimise the variance (maximise the precision) of the estimate for a fixed sample size (n).
- Minimise the variance (maximise the precision) of the estimate at a fixed cost.
- Minimise the total cost for a fixed desired precision.
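For the fixed-cost case, the standard result is that n_h is proportional to N_h S_h / √c_h, where c_h is the cost per unit in stratum h. The sketch below assumes a linear cost function C = c_0 + Σ c_h n_h. The function name, the budget and the stratum figures are invented for illustration.

```python
from math import sqrt


def optimum_allocation(budget, N_h, S_h, c_h, overhead=0.0):
    """Minimum-variance allocation for a fixed budget C = overhead + sum(c_h * n_h)."""
    shares = [Nh * Sh / sqrt(ch) for Nh, Sh, ch in zip(N_h, S_h, c_h)]
    scale = sum(Nh * Sh * sqrt(ch) for Nh, Sh, ch in zip(N_h, S_h, c_h))
    return [(budget - overhead) * s / scale for s in shares]


# Hypothetical strata: sizes, standard deviations and per-unit costs.
N_h = [500, 300, 200]
S_h = [10, 20, 30]
c_h = [1, 4, 9]
n_h = optimum_allocation(350, N_h, S_h, c_h)
total_cost = sum(c * n for c, n in zip(c_h, n_h))
print(n_h, total_cost)
```

A stratum gets a larger sample when it is larger, more variable, or cheaper to observe. When all unit costs are equal this reduces to Neyman allocation.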
42Cluster Sample
- A cluster sample is obtained by first grouping the elements of the population into clusters and then using simple random sampling, or another type of sampling, to select the clusters.
- This type of sampling is used when a sampling frame cannot be prepared for individual units in the population but can be prepared for clusters of them, or when substantial time or expense can be saved by collecting data from a modest number of clusters.
43.
- When drawing a cluster sample, the first task is
to specify appropriate clusters. In doing so,
consideration has to be made about the level of
heterogeneity of elements within clusters.
- If clusters are generally heterogeneous, then
few large clusters may be selected to constitute
the sample but if they are homogeneous, then many
small-sized clusters should be used.
44Difference between strata and clusters
- Although strata and clusters are both non-overlapping subsets of the population, they differ in several ways.
- All strata are represented in the sample, but only a subset of clusters is in the sample.
- With stratified sampling, the best survey results occur when elements within strata are internally homogeneous. However, with cluster sampling, the best results occur when elements within clusters are internally heterogeneous.
45Difference Between Cluster and Stratified Sampling
Stratified sampling: population of L strata, stratum l contains n_l units. Take a simple random sample in every stratum.
Cluster sampling: population of C clusters. Take an SRS of clusters and sample every unit in the chosen clusters.
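The cluster side of this comparison can be sketched in Python as below: a simple random sample of clusters is drawn and then every unit in each chosen cluster is taken. The village names and unit identifiers are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Draw an SRS of clusters, then keep every unit in each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # sample the cluster labels
    return {label: list(clusters[label]) for label in chosen}


# Hypothetical population: four villages (clusters) and the animals in each.
villages = {
    "A": [1, 2, 3],
    "B": [4, 5],
    "C": [6],
    "D": [7, 8, 9, 10],
}
sample = cluster_sample(villages, n_clusters=2, seed=3)
print(sample)
```

Unlike a stratified sample, some clusters contribute nothing at all, and the number of units obtained depends on which clusters happen to be chosen.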
46Drawbacks
- A sampling frame of the entire population has to be prepared separately for each stratum.
- When examining multiple criteria, stratifying variables may be related to some, but not to others, further complicating the design and potentially reducing the utility of the strata.
- In some cases (such as designs with a large number of strata, or those with a specified minimum sample size per group), stratified sampling can potentially require a larger sample than other methods would.
47Post-stratification
- Stratification is sometimes introduced after the sampling phase in a process called "poststratification".
- This approach is typically implemented due to a lack of prior knowledge of an appropriate stratifying variable, or when the experimenter lacks the necessary information to create a stratifying variable during the sampling phase.
- Although the method is susceptible to the pitfalls of post hoc approaches, it can provide several benefits in the right situation.
- Implementation usually follows a simple random sample.
- In addition to allowing for stratification on an ancillary variable, poststratification can be used to implement weighting, which can improve the precision of a sample's estimates.
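The weighting step can be illustrated with a short Python sketch: units from a simple random sample are grouped by a stratifying variable after collection, and the stratum means are combined using known population shares. The data, the stratum labels and the shares are all hypothetical, and the population shares are assumed to be known from an outside source such as a census.

```python
from collections import defaultdict
from statistics import mean


def poststratified_mean(sample, population_shares):
    """Combine stratum means of an SRS using known population shares W_h."""
    groups = defaultdict(list)
    for stratum, value in sample:
        groups[stratum].append(value)
    missing = set(population_shares) - set(groups)
    if missing:
        raise ValueError(f"no sampled units in strata: {sorted(missing)}")
    return sum(population_shares[h] * mean(values) for h, values in groups.items())


# Hypothetical SRS in which males happen to be over-represented.
sample = [("male", 10.0), ("male", 12.0), ("female", 20.0)]
shares = {"male": 0.5, "female": 0.5}

unweighted = mean(value for _, value in sample)
weighted = poststratified_mean(sample, shares)
print(unweighted, weighted)
```

The weighted estimate corrects for the chance imbalance between groups in the sample. The sketch raises an error if a stratum has no sampled units, because the weights would then no longer add up to one.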
48Oversampling
- Choice-based sampling is one of the stratified sampling strategies. The data are stratified on the target, and a sample is taken from each stratum so that the rare target class is more strongly represented in the sample. The model is then built on this biased sample. The effects of the input variables on the target are often estimated with more precision from the choice-based sample than from a random sample, even when a smaller overall sample size is taken. The results usually must be adjusted to correct for the oversampling.
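One common way to make the adjustment is to weight each sampled unit by the ratio of its class's share in the population to its share in the sample. The Python sketch below illustrates the idea. The 2% population rate, the 50/50 sample and the function name are assumptions, and this is only one of several possible corrections.

```python
def oversampling_weights(population_share, sample_share):
    """Weight per class = share in the population / share in the sample."""
    return {c: population_share[c] / sample_share[c] for c in population_share}


# Hypothetical rare target: 2% of the population, but half of the sample.
population_share = {1: 0.02, 0: 0.98}
sample_share = {1: 0.5, 0: 0.5}
weights = oversampling_weights(population_share, sample_share)

# Choice-based sample: 50 units of the rare class and 50 of the common class.
labels = [1] * 50 + [0] * 50
raw_rate = sum(labels) / len(labels)
weighted_rate = sum(weights[y] for y in labels if y == 1) / sum(weights[y] for y in labels)
print(raw_rate, weighted_rate)
```

The raw rate in the sample reflects the design rather than the population, while the weighted rate recovers the population share of the rare class.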
50 Thank you