Title: Sampling Distributions
1Sampling Distributions
2Overview
- To frame our discussion, consider
3Outline
4Population
Parameter: the measurement of a characteristic
of an entire population
Population: the complete set of objects that you
want to study
5Sample
Sample: the subset of objects that are the focus of
one's interest
Statistic: a number calculated on sample data,
quantifying a characteristic of the sample
6Sampling (1)
- Random Sampling
- Subjects are chosen from the population at
random.
- Stratified Random Sampling
- The population is divided into groups (strata),
then random sampling is applied within each group.
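The two schemes above can be sketched in a few lines of Python. This is only an illustration: the population of junior and senior developers, the stratum names, and the helper functions are hypothetical, not part of the original slides.

```python
import random

def simple_random_sample(population, k, rng):
    """Choose k subjects from the whole population at random, without replacement."""
    return rng.sample(population, k)

def stratified_random_sample(strata, k_per_stratum, rng):
    """Apply simple random sampling separately within each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 junior and 40 senior developers.
strata = {
    "junior": [f"junior_{i}" for i in range(60)],
    "senior": [f"senior_{i}" for i in range(40)],
}
population = strata["junior"] + strata["senior"]

srs = simple_random_sample(population, 10, rng)
stratified = stratified_random_sample(strata, 5, rng)

print("simple random sample:", srs)
print("stratified sample:   ", stratified)
```

The simple random sample may over- or under-represent a group by chance, while the stratified sample always contains exactly five subjects from each stratum.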
7Sampling (2)
- Convenience Sampling
- The most convenient persons are chosen.
- Quota Sampling
- Subjects from various portions of the population
are chosen.
8Randomization
- Statistical methods require observations from
independent random variables. Randomization is
used in an attempt to meet this requirement.
- Randomization applies to the allocation of
objects, subjects, and the order of treatments.
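As a minimal sketch of what random allocation might look like in practice, the snippet below shuffles a list of subjects and deals them out to two treatments, then draws a random treatment order for each subject. The subject IDs and treatment labels are hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

subjects = [f"subject_{i}" for i in range(1, 9)]  # hypothetical subject IDs
treatments = ["A", "B"]                           # hypothetical treatment labels

# Random allocation: shuffle a copy of the subjects, then deal them out in turn.
shuffled = subjects[:]
rng.shuffle(shuffled)
allocation = {t: shuffled[i::len(treatments)] for i, t in enumerate(treatments)}

# Random order of treatments for each subject (relevant when every subject
# receives every treatment).
order = {s: rng.sample(treatments, len(treatments)) for s in subjects}

print(allocation)
print(order)
```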
9Why Randomization?
- By random assignment you try to keep the results
from being biased by sources of variation over
which you have no control.
10Sample Size
- The larger the variability in the population, the
larger the sample needed.
- The size of the sample impacts our ability to
generalize, since larger samples reduce sampling
error.
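Both points can be checked with a small simulation. The sketch below assumes a normal population with an arbitrary mean of 100 and two hypothetical standard deviations; it measures how much the sample mean varies from sample to sample for several sample sizes.

```python
import random
import statistics

def spread_of_sample_means(pop_sd, n, rng, reps=2000):
    """Standard deviation of `reps` sample means, each from n normal draws."""
    means = [statistics.fmean(rng.gauss(100.0, pop_sd) for _ in range(n))
             for _ in range(reps)]
    return statistics.stdev(means)

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical settings: two population sds, three sample sizes.
results = {(sd, n): spread_of_sample_means(sd, n, rng)
           for sd in (5.0, 20.0) for n in (10, 40, 160)}

for (sd, n), spread in results.items():
    print(f"population sd={sd:5.1f}  n={n:4d}  spread of sample means={spread:6.3f}")
```

For a fixed population, the spread shrinks as n grows; for a fixed n, the more variable population gives the larger spread, so it needs a larger sample to reach the same precision.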
11Context
- Take a random sample of n observations from a
population P. Compute the mean for the sample.
How well does the sample mean estimate the
population mean?
- Notice we generate statistics as estimates of
parameters.
Demonstration
12Sampling Distribution - Mean (σ known)
- If a random sample of size n is taken from a
population having mean $\mu$ and variance $\sigma^2$,
then the sample mean $\bar{X}$ is a random variable
whose distribution has mean $\mu$ and variance
$\sigma^2/n$.
13Using the Sample Mean
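Let $X_1, \ldots, X_n$ be a random sample from a
distribution with mean value $\mu$ and standard
deviation $\sigma$. Then
$E(\bar{X}) = \mu_{\bar{X}} = \mu$ and $V(\bar{X}) = \sigma_{\bar{X}}^2 = \sigma^2/n$.
In addition, with $T_o = X_1 + \cdots + X_n$,
$E(T_o) = n\mu$ and $V(T_o) = n\sigma^2$.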
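The standard results for the sample mean (mean µ, variance σ²/n) and the sample total (mean nµ, variance nσ²) can be checked empirically. The sketch below assumes a fair six-sided die as the population, chosen only because its mean and variance are easy to compute exactly; the sample size and number of repetitions are arbitrary.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the sketch is repeatable

faces = [1, 2, 3, 4, 5, 6]            # assumed population: a fair die
mu = statistics.fmean(faces)          # population mean, 3.5
sigma2 = statistics.pvariance(faces)  # population variance, 35/12
n, reps = 25, 20000                   # arbitrary sample size and repetitions

means, totals = [], []
for _ in range(reps):
    sample = [rng.choice(faces) for _ in range(n)]
    totals.append(sum(sample))
    means.append(sum(sample) / n)

print(f"mean of sample means  {statistics.fmean(means):8.4f}  theory {mu:8.4f}")
print(f"var of sample means   {statistics.pvariance(means):8.4f}  theory {sigma2 / n:8.4f}")
print(f"mean of sample totals {statistics.fmean(totals):8.4f}  theory {n * mu:8.4f}")
print(f"var of sample totals  {statistics.pvariance(totals):8.4f}  theory {n * sigma2:8.4f}")
```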
14Normal Population Distribution
Let $X_1, \ldots, X_n$ be a random sample from a normal
distribution with mean value $\mu$ and standard
deviation $\sigma$. Then for any $n > 0$, $\bar{X}$ is
normally distributed, as is $T_o$.
15The Central Limit Theorem
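Let $X_1, \ldots, X_n$ be a random sample from a
distribution with mean value $\mu$ and variance
$\sigma^2$. Then if n is sufficiently large, $\bar{X}$ has
approximately a normal distribution with
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}}^2 = \sigma^2/n$, and $T_o$ also has
approximately a normal distribution with
$\mu_{T_o} = n\mu$ and $\sigma_{T_o}^2 = n\sigma^2$. The larger the value of
n, the better the approximation.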
Demonstration
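One way to run such a demonstration is to sample from a strongly skewed population and watch the skewness of the sample mean shrink toward zero, the value for a normal distribution, as n grows. The sketch below assumes an exponential population with rate 1; the population, the sample sizes, and the `skewness` helper are illustrative choices, not taken from the slides.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential(1) population."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

rng = random.Random(4)  # fixed seed so the sketch is repeatable

skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 5, 30, 100)}

for n, skew in skew_by_n.items():
    print(f"n={n:4d}  skewness of sample means={skew:6.3f}")
```

A skewness near zero does not prove normality on its own, but the steady decline is consistent with the approximation improving as n increases.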
16Rule of Thumb
If n > 30, the Central Limit Theorem can be used.
17Central Limit Theorem (2)
- If $\bar{X}$ is the mean of a sample of size n taken
from a population having mean $\mu$ and variance $\sigma^2$,
then $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
is a random variable whose distribution function
approaches that of the standard normal as $n \to \infty$.
18Notes
- The Central Limit Theorem holds regardless of the
shape of the population distribution, provided the
population variance is finite.
- The sampling distribution of the mean is
approximately normal when n > 30.
- If the population from which you are sampling has
a normal distribution, then the sampling
distribution of the mean is a normal distribution
for any n.
http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html
19Problem 1
- Company records indicate that the time spent
preparing for a code inspection is normally
distributed with a mean of 55 minutes and a
standard deviation of 15 minutes.
- What is the probability an employee spends more
than 75 minutes preparing for a review?
20Solution - Problem 1
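One possible way to work Problem 1, sketched with Python's standard-library `NormalDist`. Since a single employee's time is normally distributed with mean 55 and standard deviation 15, the answer is the upper-tail area beyond 75 minutes. The variable names are illustrative.

```python
from statistics import NormalDist

# Preparation time for one employee, as stated in the problem.
prep_time = NormalDist(mu=55, sigma=15)

z = (75 - 55) / 15                       # standardized value of 75 minutes
p_more_than_75 = 1 - prep_time.cdf(75)   # upper-tail probability

print(f"z = {z:.3f}, P(X > 75) = {p_more_than_75:.4f}")
```

The result is roughly 0.09, so about one employee in eleven would be expected to spend more than 75 minutes.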
21Problem 2
- Company records indicate that the time spent
preparing for a code inspection is normally
distributed with a mean of 55 minutes and a
standard deviation of 15 minutes.
- What is the probability that the average time for
the review team of 6 people exceeds 75 minutes?
22Solution - Problem 2
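A sketch of one way to work Problem 2. Because the population is normal, the mean of 6 preparation times is also normal, with the same mean of 55 and a standard deviation of 15/√6. The sketch assumes the 6 team members behave like a random sample from that population.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 55, 15, 6

standard_error = sigma / sqrt(n)             # sd of the sample mean
team_mean = NormalDist(mu=mu, sigma=standard_error)

z = (75 - mu) / standard_error
p_mean_exceeds_75 = 1 - team_mean.cdf(75)    # upper-tail probability

print(f"standard error = {standard_error:.3f}")
print(f"z = {z:.3f}, P(mean > 75) = {p_mean_exceeds_75:.5f}")
```

The probability is far smaller than for a single employee, because averaging over six people shrinks the spread of the sampling distribution.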
23Problem 3
- A group of women project leaders for CompuCorp is
considering filing a sex-discrimination suit
against the corporation. A recent report stated
that the average salary for project leads at the
company is 128,000 with a standard deviation of
8,500. A random sample of 65 women taken from
the 350 female project leads at the company had
an average income of 125,000. If the population
of female project leads is assumed to have the
same mean and standard deviation as all project
leads, what is the probability of observing a
sample average of 125,000 or less?
24Solution - Problem 3
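A sketch of one way to approach Problem 3, reading the question as the probability of a sample average of 125,000 or less. With n = 65, the Central Limit Theorem suggests treating the sample mean as approximately normal. Whether to apply a finite population correction is an assumption on my part: the sample is 65 out of only 350 women, so the sketch reports the result both with and without it.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 128_000, 8_500   # assumed mean and sd for female project leads
n, N = 65, 350               # sample size and size of the finite population
sample_mean = 125_000

# Without a finite population correction.
se_plain = sigma / sqrt(n)
p_plain = NormalDist(mu, se_plain).cdf(sample_mean)

# With the finite population correction sqrt((N - n) / (N - 1)),
# an assumption motivated by sampling 65 of only 350 without replacement.
fpc = sqrt((N - n) / (N - 1))
se_fpc = se_plain * fpc
p_fpc = NormalDist(mu, se_fpc).cdf(sample_mean)

print(f"no correction:   se = {se_plain:8.2f}, P(mean <= 125000) = {p_plain:.5f}")
print(f"with correction: se = {se_fpc:8.2f}, P(mean <= 125000) = {p_fpc:.5f}")
```

Either way the probability is well below 1%, so a sample average this low would be unusual if women's salaries really had the same mean and standard deviation as those of all project leads.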
25Sampling Distribution - Mean (σ unknown)
- If $\bar{X}$ is the mean of a random sample of size n
taken from a normal population having a mean
of $\mu$ and variance $\sigma^2$, and $s^2$ is the variance of
the sample, then $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$
is a random variable having the t distribution
with the parameter $\nu = n - 1$.
26Notes
- The parameter $\nu$ is referred to as the degrees of
freedom.
- The t distribution is similar to the normal.
- Notice the requirement of sampling from a normal
population.
- N(0,1) is a good approximation for the t
distribution when $n \ge 30$.
27Problem 4
- The CEO submitted a white paper indicating a few
changes in the software development process are
in order. His statements include a claim that
the average effort devoted to unit testing on
projects is 7.8 person-months. You collect a
random sample of 75 effort-logs from projects
and determine the average effort for unit testing
was 7.5 person-months with a standard deviation
of 1.75 person-months. Does the data you
collected support or refute the CEO's claim?
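A sketch of one way to approach Problem 4. The population standard deviation is unknown, so the statistic is t = (x̄ − µ)/(s/√n) with n − 1 degrees of freedom. Python's standard library has no t distribution, so this sketch leans on the N(0,1) approximation, which is reasonable for a sample of 75. The two-sided reading of "support or refute" is my assumption.

```python
from math import sqrt
from statistics import NormalDist

claimed_mean = 7.8   # CEO's claim, person-months
n = 75
sample_mean = 7.5
sample_sd = 1.75

standard_error = sample_sd / sqrt(n)
t_stat = (sample_mean - claimed_mean) / standard_error
df = n - 1           # degrees of freedom

# Assumption: with a sample this large, N(0,1) approximates the t distribution,
# so the two-sided p-value is taken from the standard normal.
p_two_sided = 2 * NormalDist().cdf(-abs(t_stat))

print(f"t = {t_stat:.3f} on {df} degrees of freedom")
print(f"approximate two-sided p-value = {p_two_sided:.3f}")
```

The statistic is about 1.5 standard errors below the claimed mean, with an approximate two-sided p-value near 0.14. At the conventional 5% level, itself a choice rather than something the problem specifies, the sample would not refute the CEO's claim.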