Title: Estimation of Means and Proportions
3 Concepts
- Estimator: a rule that tells us how to estimate a value for a population parameter using sample data
- Estimate: a specific value of an estimator for particular sample data
4 Concepts
- A point estimator is a rule that tells us how to calculate a single number from sample data to estimate a population parameter
- An interval estimator is a rule that tells us how to calculate two numbers from sample data, forming a confidence interval within which the parameter is expected to lie
5 Properties of a Good Estimator
- Unbiasedness: the mean of the sampling distribution of the estimator equals the true value of the parameter
- Efficiency: the most efficient estimator among a group of unbiased estimators is the one with the smallest variance
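The two properties above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal, the sample median is brought in as a second unbiased estimator purely for comparison (it is not mentioned in the slides), and the function name and parameter values are hypothetical.

```python
import random
import statistics


def compare_estimators(n=25, trials=4000, mu=10.0, sigma=2.0, seed=0):
    """Simulate the sampling distributions of the sample mean and sample median.

    Assumes a normal population with mean `mu` and standard deviation `sigma`.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return {
        # Unbiasedness: both averages should sit close to mu.
        "mean_of_means": statistics.fmean(means),
        "mean_of_medians": statistics.fmean(medians),
        # Efficiency: the estimator with the smaller variance is more efficient.
        "var_of_means": statistics.pvariance(means),
        "var_of_medians": statistics.pvariance(medians),
    }


result = compare_estimators()
print(result)
```

For a normal population, both estimators centre on the true mean, and the sample mean should show the smaller variance, which is what makes it the more efficient of the two.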
7 Estimation of a Population Mean
- The CLT suggests that the sample mean may be a good estimator for the population mean. The CLT says that
- The sampling distribution of the sample mean will be approximately normal regardless of the distribution of the sampled population, if n is large
- The sample mean $\bar{x}$ is an unbiased estimator of the population mean $\mu$
- The standard error of the sample mean is $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
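As a rough check of these claims, the sketch below draws repeated samples from a skewed population and compares the spread of the sample means with $\sigma/\sqrt{n}$. The exponential population (mean 1, standard deviation 1), the sample size, and the function name are assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate_sample_means(n, trials=5000, seed=1):
    """Return `trials` sample means, each from n draws of an exponential(1) population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n = 40
means = simulate_sample_means(n)
print("average of sample means:", statistics.fmean(means))   # should be near mu = 1
print("std. dev. of sample means:", statistics.pstdev(means))
print("sigma / sqrt(n):", 1.0 / math.sqrt(n))
```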
8 Estimation of a Population Mean
- A point estimator of the population mean is $\bar{x}$
- An interval estimator of the population mean is a $(1-\alpha)100\%$ confidence interval, $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, meaning that in repeated sampling the true population parameter lies within intervals constructed this way $(1-\alpha)100\%$ of the time, where $z_{\alpha/2}$ is the z value corresponding to an area $\alpha/2$ in the upper tail of a standard normal distribution
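A minimal sketch of the large-sample interval for a mean, assuming the standard form $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. The function name and the example numbers are hypothetical; `NormalDist` comes from the Python standard library.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Large-sample (1 - alpha)100% confidence interval for a population mean."""
    alpha = 1.0 - confidence
    # z value with area alpha/2 in the upper tail of the standard normal
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical data: sample mean 50, sigma 10, n = 100
low, high = mean_confidence_interval(50.0, 10.0, 100)
print(low, high)
```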
9 Estimation of a Population Mean
- Usually $\sigma$ (the population standard deviation) is unknown.
- If n is large enough ($n \ge 30$), we can approximate it with the sample standard deviation s.
10 One-Sided Confidence Intervals
- In some cases we are interested only in a bound above or below which the population parameter is expected to fall
- Lower One-Sided Confidence Interval (LCL)
- LCL = (point estimate) - $z_{\alpha}$ × (standard error of the estimator)
- Upper One-Sided Confidence Interval (UCL)
- UCL = (point estimate) + $z_{\alpha}$ × (standard error of the estimator)
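A short sketch of one-sided bounds, assuming the usual large-sample form in which the point estimate is shifted by $z_{\alpha}$ standard errors (the whole area $\alpha$ is placed in one tail). The function name and inputs are hypothetical.

```python
from statistics import NormalDist


def one_sided_bounds(point_estimate, standard_error, confidence=0.95):
    """Return (LCL, UCL) for one-sided (1 - alpha)100% confidence bounds."""
    # z value with area alpha = 1 - confidence in the upper tail
    z_alpha = NormalDist().inv_cdf(confidence)
    lcl = point_estimate - z_alpha * standard_error
    ucl = point_estimate + z_alpha * standard_error
    return lcl, ucl


# Hypothetical data: point estimate 50 with standard error 1
lcl, ucl = one_sided_bounds(50.0, 1.0)
print(lcl, ucl)
```

Only one of the two bounds is reported in a given problem, depending on whether a lower or an upper limit is of interest.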
11 Small Sample Estimation of a Population Mean
- If n is large, we can use the sample standard deviation s as a reliable estimator of the population standard deviation $\sigma$
- If n is large, then no matter what distribution the population has, the sampling distribution of the sample mean is approximately normal
- As the sample size n decreases, s becomes a less reliable estimator of $\sigma$ (because we are using less information from the underlying distribution to compute s)
- How do we deal with this issue?
12 t Distribution
- Assume
- (1) The underlying population is normally distributed
- (2) The sample is small and $\sigma$ is unknown
- Using the sample standard deviation s to replace $\sigma$, the t statistic $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ follows the t distribution
13 Properties of the t Distribution
- mound-shaped
- perfectly symmetric about $t = 0$
- more variable than z (the standard normal distribution)
- affected by the sample size n (as n increases, s becomes a better approximation for $\sigma$)
- $n - 1$ is the degrees of freedom (d.f.) associated with the t statistic
14 More on the t Distribution
- Remember that the t-distribution is based on the assumption that the sampled population has a normal probability distribution.
- This is a very restrictive assumption.
- Fortunately, it can be shown that for non-normal but mound-shaped distributions, the distribution of the t statistic is nearly the same shape as the theoretical t-distribution for a normal population.
- Therefore the t distribution is still useful for small-sample estimation of a population mean even if the underlying distribution of x is not known to be normal
15 How to use the t-distribution table
- The t-distribution table is in the book (Appendix II, Table 4, p. 611). $t_{\alpha}$ is the value of t such that an area $\alpha$ lies to its right.
- To use the table
- Determine the degrees of freedom
- Determine the appropriate value of $\alpha$
- Look up the value for $t_{\alpha}$
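Once a critical value has been read from the table, a small-sample interval for a mean can be computed as sketched below. This assumes the standard form $\bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}$ with $n - 1$ degrees of freedom, which the slides do not state explicitly. The function name and the sample data are hypothetical, and the critical value is passed in by hand rather than computed.

```python
import math
import statistics


def t_confidence_interval(sample, t_crit):
    """Small-sample confidence interval for a mean.

    `t_crit` is t_{alpha/2} looked up in a t table for len(sample) - 1 d.f.
    Returns (lower, upper, degrees_of_freedom).
    """
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin, n - 1


# Hypothetical sample of n = 5 measurements; 4 d.f., so t_{.025} = 2.776
data = [10.0, 12.0, 11.0, 13.0, 14.0]
low, high, df = t_confidence_interval(data, t_crit=2.776)
print(low, high, df)
```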
16 Table t Distribution
17 The Difference Between Two Means
- Suppose independent samples of $n_1$ and $n_2$ observations have been selected from populations with means $\mu_1$, $\mu_2$ and variances $\sigma_1^2$, $\sigma_2^2$
- The sampling distribution of the difference in sample means $(\bar{x}_1 - \bar{x}_2)$ will have the following properties
18 The Difference Between Two Means
- The mean and standard deviation of $(\bar{x}_1 - \bar{x}_2)$ are $\mu_1 - \mu_2$ and $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
- If the sampled populations are normally distributed, the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is exactly normal regardless of $n_1$ and $n_2$
- If the sampled populations are not normally distributed, the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is approximately normal when $n_1$ and $n_2$ are large
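19 Point Estimation of the Difference Between Two Means
- Point Estimator: $\bar{x}_1 - \bar{x}_2$
- A $(1-\alpha)100\%$ confidence interval for $(\mu_1 - \mu_2)$ is $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$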
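A minimal sketch of the large-sample interval for a difference of two means, assuming the standard form $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$, with sample variances standing in for the population variances when the samples are large. The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def diff_means_ci(xbar1, var1, n1, xbar2, var2, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2 from independent samples."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Standard error of (xbar1 - xbar2) for independent samples
    se = math.sqrt(var1 / n1 + var2 / n2)
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se


# Hypothetical summary statistics for two independent samples
low, high = diff_means_ci(20.0, 16.0, 64, 18.0, 9.0, 36)
print(low, high)
```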
20 Difference Between Two Means (small sample)
- If $n_1$ and $n_2$ are small, then the t statistic $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s\sqrt{1/n_1 + 1/n_2}}$ is distributed according to the t distribution with $n_1 + n_2 - 2$ degrees of freedom if the following assumptions are satisfied
- 1. Both samples are drawn from populations with a normal distribution
- 2. Both populations have equal variances
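21 Difference Between Two Means (small sample)
- In practice, the t statistic is still appropriate even if the underlying distributions are not exactly normally distributed.
- To compute s, we can pool the information from both samples
- $s^2 = \dfrac{\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2 + \sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}$
- or $s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$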
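22 Difference Between Two Means (small sample)
- Point Estimate: $\bar{x}_1 - \bar{x}_2$
- Interval Estimate: a $(1-\alpha)100\%$ confidence interval for $(\mu_1 - \mu_2)$ is $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s\sqrt{1/n_1 + 1/n_2}$
- where s is computed using the pooled estimate described earlier and $t_{\alpha/2}$ is based on $n_1 + n_2 - 2$ degrees of freedom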
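The pooled small-sample interval can be sketched as follows, assuming the standard pooled variance $s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ and a critical value read from a t table with $n_1 + n_2 - 2$ degrees of freedom. Function names and inputs are hypothetical.

```python
import math


def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled estimate of the common variance from two sample variances."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def pooled_t_ci(xbar1, s1_sq, n1, xbar2, s2_sq, n2, t_crit):
    """Small-sample confidence interval for mu1 - mu2, assuming equal variances.

    `t_crit` is t_{alpha/2} from a t table with n1 + n2 - 2 d.f.
    """
    s = math.sqrt(pooled_variance(s1_sq, n1, s2_sq, n2))
    margin = t_crit * s * math.sqrt(1.0 / n1 + 1.0 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


# Hypothetical summaries: n1 = 10, n2 = 12, so 20 d.f. and t_{.025} = 2.086
low, high = pooled_t_ci(15.0, 4.0, 10, 12.0, 6.0, 12, t_crit=2.086)
print(pooled_variance(4.0, 10, 6.0, 12), low, high)
```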
23 Sampling Distribution of Sample Proportions
- Recall from Chapter 6
- If a random sample of n objects is selected from the population and x of these possess a characteristic of interest, the sample proportion is $\hat{p} = x/n$
- The sampling distribution of $\hat{p}$ will have mean $p$ and standard deviation $\sqrt{pq/n}$, where $q = 1 - p$
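24 Estimators for p
- Assuming n is sufficiently large and the interval lies between 0 and 1, the
- Point Estimator for p: $\hat{p} = x/n$
- Interval Estimator for p: a $(1-\alpha)100\%$ confidence interval for p is $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$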
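A minimal sketch of the large-sample interval for a proportion, assuming the standard form $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$. The function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(x, n, confidence=0.95):
    """Large-sample confidence interval for a binomial proportion p."""
    p_hat = x / n          # point estimate of p
    q_hat = 1.0 - p_hat
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 60 of 100 sampled objects have the characteristic
low, high = proportion_ci(60, 100)
print(low, high)
```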
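25 Estimating the Difference Between Two Binomial Proportions
- Point estimate: $\hat{p}_1 - \hat{p}_2$
- Confidence interval for the difference: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2}$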
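The same pattern extends to two independent binomial samples. The sketch below assumes the standard large-sample form $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2}$. The function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def diff_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2 from independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Standard error of the difference between the two sample proportions
    se = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Hypothetical data: 60 of 100 in sample 1, 45 of 100 in sample 2
low, high = diff_proportions_ci(60, 100, 45, 100)
print(low, high)
```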
26 Choosing Sample Size
- How many measurements should be included in the sample?
- Increasing n increases the precision of the estimate, but increasing n is costly
- The answer depends on
- What level of confidence do you want to have (i.e., the value of $(1-\alpha)100\%$)?
- What is the maximum difference (B) you want to permit between the estimate of the population parameter and the true population parameter?
27 Choosing Sample Size
- Once you have chosen B and $\alpha$, you can solve the following equation for sample size n: $z_{\alpha/2} \times (\text{standard error of the estimator}) = B$
- If the resulting value of n is less than 30 and an estimate
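As an illustration of solving for n, the sketch below assumes the bound is set by $z_{\alpha/2} \times (\text{standard error}) = B$, which gives $n = (z_{\alpha/2}\,\sigma/B)^2$ for a mean and $n = z_{\alpha/2}^2\,pq/B^2$ for a proportion, rounded up. The function names, the planning value of $\sigma$, and the default guess $p = 0.5$ are assumptions.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, bound, confidence=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= bound."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((z * sigma / bound) ** 2)


def sample_size_for_proportion(bound, p_guess=0.5, confidence=0.95):
    """Smallest n with z_{alpha/2} * sqrt(p q / n) <= bound.

    p_guess = 0.5 is the most conservative planning value (it maximizes p * q).
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil(z ** 2 * p_guess * (1.0 - p_guess) / bound ** 2)


# Hypothetical planning values: sigma = 10 with B = 2, and B = 0.05 for a proportion
print(sample_size_for_mean(10.0, 2.0))
print(sample_size_for_proportion(0.05))
```

Rounding up rather than to the nearest integer keeps the achieved bound at or below B.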