Title: Sampling and Statistical Analysis for Decision Making
1Sampling and Statistical Analysis for Decision Making
A. A. Elimam, College of Business, San Francisco State University
2Chapter Topics
- Sampling Design and Methods
- Estimation
- Confidence Interval Estimation for the Mean (σ Known)
- Confidence Interval Estimation for the Mean (σ Unknown)
- Confidence Interval Estimation for the Proportion
3Chapter Topics
- The Situation of Finite Populations
- Student's t distribution
- Sample Size Estimation
- Hypothesis Testing
- Significance Levels
- ANOVA
4Statistical Sampling
- Sampling: a valuable tool
- Population
- Too large to deal with effectively or practically
- Impossible or too expensive to obtain all data
- Collect sample data to draw conclusions about the unknown population
5Sample design
- Representative Samples of the population
- Sampling Plan: approach to obtain samples
- Sampling Plan states:
- Objectives
- Target population
- Population frame
- Method of sampling
- Data collection procedure
- Statistical analysis tools
6Objectives
- Estimate population parameters such as a mean, proportion or standard deviation
- Identify if a significant difference exists between two populations
- Population Frame: list of all members of the target population
7Sampling Methods
- Subjective Sampling
- Judgment: select the sample by judgment (e.g., best customers)
- Convenience: ease of sampling
- Probabilistic Sampling
- Simple Random Sampling
- With Replacement
- Without Replacement
8Sampling Methods
- Systematic Sampling
- Selects items periodically from the population
- First item is randomly selected; may produce bias
- Example: pick one sample every 7 days
- Stratified Sampling
- Population divided into natural strata
- Allocates a proper proportion of samples to each stratum
- Each stratum is weighted by its size; cost or significance of certain strata might suggest a different allocation
- Example: sampling of political districts (wards)
9Sampling Methods
- Cluster Sampling
- Population divided into clusters, then a random sample of clusters is drawn
- Items within each selected cluster become members of the sample
- Example: segment customers for each geographical location
- Sampling Using Excel
- Population listed in a spreadsheet
- Periodic
- Random
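The probabilistic methods above can be sketched in a few lines of Python as an alternative to a spreadsheet. This is a minimal illustration under stated assumptions, not the deck's own procedure: the function names, the population of 100 IDs, and the "north"/"south" strata are all hypothetical, and the stratified version assumes proportional allocation by stratum size.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; every item is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, step, seed=None):
    """Pick a random start within the first `step` items, then every step-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n, seed=None):
    """Split n across strata in proportion to stratum size (assumed allocation rule).

    Rounding means the stratum counts may not always add up to exactly n.
    """
    rng = random.Random(seed)
    total = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        k = round(n * len(items) / total)
        sample[name] = rng.sample(items, k)
    return sample

# Hypothetical population frame: 100 member IDs
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=1)
sys_s = systematic_sample(population, 7, seed=1)

# Hypothetical strata of unequal size (60 and 40 members)
strata = {"north": list(range(60)), "south": list(range(60, 100))}
strat = stratified_sample(strata, 10, seed=1)
```

Passing a seed makes the draws repeatable, which helps when a sampling plan has to be documented or audited.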
10Sampling Methods Selection
- Systematic Sampling
- Population is large; considerable effort to randomly select
- Stratified Sampling
- Items in each stratum are homogeneous (low variances)
- Relatively smaller sample size than simple random sampling
- Cluster Sampling
- Items in each cluster are heterogeneous
- Clusters are representative of the entire population
- Requires a larger sample
11Sampling Errors
- Sample does not represent target population
- (e.g., selecting an inappropriate sampling method)
- Inherent error: samples are only a subset of the population
- Depends on size of sample relative to population
- Accuracy of estimates
- Trade-off cost/time versus accuracy
12Sampling From Finite Populations
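- Finite populations are sampled without replacement (R)
- Statistical theory assumes samples are selected with R
- When n < 0.05 N, the difference is insignificant
- Otherwise, a correction factor is needed
- Standard error of the mean: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$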
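As a rough sketch of the rule above, the function below applies the standard finite population correction factor, sqrt((N - n) / (N - 1)), only when the sample is at least 5% of the population. The function name and the numbers in the usage lines are hypothetical.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean, with the finite population correction
    applied when the sample is 5% or more of the population."""
    se = sigma / math.sqrt(n)
    if N is not None and n >= 0.05 * N:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Hypothetical values: sigma = 45, sample of 100 from a population of 1,000
se_infinite = standard_error(45, 100)          # no correction
se_finite = standard_error(45, 100, N=1000)    # n / N = 0.10, so corrected
```

The corrected standard error is always smaller than the uncorrected one, so intervals built from it are narrower.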
13Statistical Analysis of Sample Data
- Estimation of population parameters (PP)
- Development of confidence intervals for PP
- Probability that the interval correctly estimates the true population parameter
- Means to compare alternative decisions/processes
- (e.g., comparing transmission production processes)
- Hypothesis testing: validate differences among PP
14Estimation Process
- Population: mean, μ, is unknown
- Random sample: mean X̄ = 50
- Conclusion: "I am 95% confident that μ is between 40 and 60."
15Population Parameters Estimated
Population Parameter and its Point Estimate
- Mean: μ, estimated by X̄
- Proportion: p, estimated by p_s
- Variance: σ², estimated by s²
- Std. Dev.: σ, estimated by s
16Confidence Interval Estimation
- Provides Range of Values
- Based on Observations from Sample
- Gives Information about Closeness to Unknown Population Parameter
- Stated in terms of Probability
- Never 100% Sure
17Elements of Confidence Interval Estimation
A Probability That the Population Parameter Falls
Somewhere Within the Interval.
Sample Statistic
Confidence Interval
Confidence Limit (Lower)
Confidence Limit (Upper)
18Example of Confidence Interval Estimation
Example: a 90% CI for the mean is 10 ± 2.
- Point Estimate = 10
- Margin of Error = 2
- CI = [8, 12]
- Level of Confidence = 1 - α = 0.9
- Probability that the true PP is not in this CI = α = 0.1
19Confidence Limits for Population Mean
Parameter = Statistic ± Its Error
$\mu = \bar{X} \pm \text{Error}$, where $\text{Error} = Z \sigma_{\bar{X}}$
20Confidence Intervals
[Figure: intervals around X̄ covering 90%, 95%, and 99% of samples]
21Level of Confidence
- Probability that the unknown population parameter falls within the interval
- Denoted (1 - α)% = level of confidence, e.g. 90%, 95%, 99%
- α is the probability that the parameter is not within the interval
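22Intervals and Level of Confidence
[Figure: sampling distribution of the mean, centered at μ with standard error σ_X̄, central area 1 - α and area α/2 in each tail, with confidence intervals drawn beneath it]
- Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$
- (1 - α)% of intervals contain μ; α% do not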
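The claim that (1 - α)% of intervals contain μ can be checked with a small simulation. This is a sketch under assumed values: the population (normal, μ = 50, σ = 10), the sample size, and the number of trials are all hypothetical choices, and σ is treated as known.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, trials=2000, seed=0):
    """Fraction of simulated intervals X-bar +/- Z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        xbar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

rate = coverage()  # should land close to the nominal 0.95
```

Any single interval either contains μ or it does not; the confidence level describes the long-run share of intervals that do.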
23Factors Affecting Interval Width
- Data Variation
- Measured by σ
- Sample Size: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
- Level of Confidence (1 - α)
Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$
24Confidence Interval Estimates
Confidence intervals covered:
- Mean, σ Known
- Mean, σ Unknown
- Proportion
- Finite Population
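25Confidence Intervals (σ Known)
- Assumptions
- Population Standard Deviation is Known
- Population is Normally Distributed
- If Not Normal, use large samples
- Confidence Interval Estimate: $\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$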
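A minimal sketch of the σ-known interval, using the standard library's normal distribution to look up Z. The function name and the input values (X̄ = 50, σ = 10, n = 25) are hypothetical and are not taken from the slides.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, level):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # e.g. about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=50, sigma=10, n=25, level=0.95)
low99, high99 = z_interval(xbar=50, sigma=10, n=25, level=0.99)
```

Raising the confidence level from 95% to 99% widens the interval, which is the trade-off between confidence and precision.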
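27Confidence Intervals (σ Unknown)
- Assumptions
- Population Standard Deviation is Unknown
- Population Must Be Normally Distributed
- Use Student's t Distribution
- Confidence Interval Estimate: $\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$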
28Student's t Distribution
- Shape similar to the Normal distribution
- Different t distributions based on df
- Has a larger variance than the Normal
- With larger sample size, t approaches the Normal
- At n = 120, virtually the same
- For any sample size, the true distribution of the sample mean is the Student's t
- For unknown σ and when in doubt, use t
29Student's t Distribution
[Figure: standard normal curve (Z) compared with t (df = 13) and t (df = 5), all centered at 0; the t curves are bell-shaped and symmetric with fatter tails]
30Degrees of Freedom (df)
- Number of observations that are free to vary after the sample mean has been calculated
- Example: the mean of 3 numbers is 2
- X1 = 1 (or any number)
- X2 = 2 (or any number)
- X3 = 3 (cannot vary)
- Degrees of freedom = n - 1 = 3 - 1 = 2
31Student's t Table
Assume: n = 3, df = n - 1 = 2, α = .10, α/2 = .05

Upper Tail Area
df     .25     .10     .05
1    1.000   3.078   6.314
2    0.817   1.886   2.920
3    0.765   1.638   2.353

t value for df = 2 with upper tail area .05: t = 2.920
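32Example Interval Estimation σ Unknown
- A random sample of n = 25 has X̄ = 50 and S = 8. Set up a 95% confidence interval estimate for μ.
- $\bar{X} \pm t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} = 50 \pm 2.0639 \cdot \frac{8}{\sqrt{25}}$
- 46.69 ≤ μ ≤ 53.30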
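The arithmetic in this example can be reproduced with a short sketch. The critical value 2.0639 is the standard t-table entry for df = 24 and upper tail area .025; it is hard-coded here because the Python standard library has no t distribution. The function and constant names are hypothetical.

```python
import math

# t-table value for df = 24, upper tail area 0.025 (entered by hand from a table)
T_CRIT_DF24_025 = 2.0639

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

low, high = t_interval(xbar=50, s=8, n=25, t_crit=T_CRIT_DF24_025)
```

The result agrees with the interval on the slide up to rounding in the second decimal place.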
33Example Tracway Transmission
- Sample of n = 30, S = 45.4
- Find a 99% CI for μ, the mean of each transmission system process
- Therefore α = .01 and α/2 = .005
- 266.75 ≤ μ ≤ 312.45 (an interval centered on a sample mean of 289.60)
35Estimation for Finite Populations
- Assumptions
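- Sample Is Large Relative to Population
- n / N > .05
- Use Finite Population Correction Factor
- Confidence Interval (Mean, σ Unknown):
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$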
37Confidence Interval Estimate Proportion
- Assumptions
- Two Categorical Outcomes
- Population Follows Binomial Distribution
- Normal Approximation Can Be Used
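- np ≥ 5 and n(1 - p) ≥ 5
- Confidence Interval Estimate: $p_s - Z_{\alpha/2} \sqrt{\frac{p_s(1-p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \sqrt{\frac{p_s(1-p_s)}{n}}$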
38Example Estimating Proportion
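- A random sample of 1000 voters showed 51% voted for Candidate A. Set up a 90% confidence interval estimate for p.
- $p_s \pm Z_{\alpha/2} \sqrt{\frac{p_s(1-p_s)}{n}} = 0.51 \pm 1.645 \sqrt{\frac{(0.51)(0.49)}{1000}}$
- .484 ≤ p ≤ .536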
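The voter example can be checked with a short sketch based on the normal approximation. The function name is hypothetical; Z is computed from the standard library rather than read from a table.

```python
import math
from statistics import NormalDist

def proportion_interval(p_s, n, level):
    """Normal-approximation confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.645 for 90%
    margin = z * math.sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin

low, high = proportion_interval(p_s=0.51, n=1000, level=0.90)
```

Because the interval includes values below 0.50, the sample alone does not establish that Candidate A holds a majority at this confidence level.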
39Sample Size
- Too big: requires too many resources
- Too small: won't do the job
40 Example Sample Size for Mean
- What sample size is needed to be 90% confident of being correct within ± 5? A pilot study suggested that the standard deviation is 45.
- $n = \frac{Z^2 \sigma^2}{\text{Error}^2} = \frac{1.645^2 \cdot 45^2}{5^2} = 219.2 \cong 220$ (round up)
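The sample size calculation for a mean can be sketched as below. The function name is hypothetical; Z = 1.645 is the table value used on the slide for 90% confidence.

```python
import math

def sample_size_for_mean(z, sigma, error):
    """Smallest n with margin of error z * sigma / sqrt(n) no larger than `error`."""
    return math.ceil((z * sigma / error) ** 2)

n_mean = sample_size_for_mean(z=1.645, sigma=45, error=5)
```

Rounding up rather than to the nearest integer guarantees the margin of error does not exceed the target.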
41Example Sample Size for Proportion
- What sample size is needed to be within ± 5% with 90% confidence? Out of a population of 1,000, we randomly selected 100, of which 30 were defective.
- $n = \frac{Z^2 p(1-p)}{\text{Error}^2} = \frac{1.645^2 (0.30)(0.70)}{0.05^2} = 227.3 \cong 228$ (round up)
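A matching sketch for the proportion case follows, with p = 0.30 taken from the pilot sample (30 defective out of 100). The function name is hypothetical, and the sketch ignores any finite population adjustment.

```python
import math

def sample_size_for_proportion(z, p, error):
    """Smallest n with margin of error z * sqrt(p(1-p)/n) no larger than `error`."""
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

n_prop = sample_size_for_proportion(z=1.645, p=0.30, error=0.05)

# With no pilot estimate, p = 0.5 gives the most conservative (largest) n
n_worst = sample_size_for_proportion(z=1.645, p=0.5, error=0.05)
```

The pilot estimate matters: assuming p = 0.5 instead of 0.30 raises the required sample size.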
42Hypothesis Testing
- Draw inferences about two contrasting propositions (hypotheses)
- e.g., determine whether two means are equal
- Formulate the hypothesis to test
- Select a level of significance
- Determine a decision rule as a basis for the conclusion
- Collect data and calculate a test statistic
- Apply the decision rule to draw a conclusion
43Hypothesis Formulation
- Null hypothesis H0: represents the status quo
- Alternative hypothesis H1
- Testing assumes that H0 is true
- Sample evidence is obtained to determine whether H1 is more likely to be true
44Significance Level
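Test result versus the true state of H0:
- Accept H0 when H0 is True: correct decision
- Accept H0 when H0 is False: Type II Error
- Reject H0 when H0 is True: Type I Error
- Reject H0 when H0 is False: correct decision
- Probability of making a Type I error = α = level of significance
- Confidence coefficient = 1 - α
- Probability of making a Type II error = β
- Power of the test = 1 - β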
45Decision Rules
- Sampling Distribution Normal or t distribution
- Rejection Region
- Non Rejection Region
- Two-tailed test: α/2 in each tail
- One-tailed test: α in one tail
- P-Values
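The decision rule for a two-tailed test can be sketched with a one-sample Z test. This is an illustration only: the slides do not give a worked test, so the function name and the data (X̄ = 52.5, hypothesized mean 50, σ = 10, n = 64) are hypothetical, and σ is assumed known so the normal distribution applies.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Return the test statistic, the p-value, and whether H0: mu = mu0 is rejected."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # alpha/2 in each tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject = abs(z) > z_crit                          # rejection region test
    return z, p_value, reject

z_stat, p_value, reject = two_tailed_z_test(xbar=52.5, mu0=50, sigma=10, n=64)
```

Comparing |Z| with the critical value and comparing the p-value with α are equivalent rules and always lead to the same decision.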
46Hypothesis Testing Cases
- Two-Sample Means
- F-Test for Variances
- Proportions
- ANOVA: differences of several means
- Chi-square for independence
47Chapter Summary
- Sampling Design and Methods
- Estimation
- Confidence Interval Estimation for Mean (σ Known)
- Confidence Interval Estimation for Mean (σ Unknown)
- Confidence Interval Estimation for Proportion
48Chapter Summary
- Finite Populations
- Student's t distribution
- Sample Size Estimation
- Hypothesis Testing
- Significance Levels: Type I/II errors
- ANOVA