Title: Chapter 7, Part A Sampling and Sampling Distributions
1 Chapter 7, Part A: Sampling and Sampling Distributions
- Introduction to Sampling Distributions
2 Statistical Inference
The purpose of statistical inference is to
obtain information about a population from
information contained in a sample.
A population is the set of all the elements of
interest.
A sample is a subset of the population.
3 Statistical Inference
The sample results provide only estimates of
the values of the population characteristics.
With proper sampling methods, the sample
results can provide good estimates of the
population characteristics.
A parameter is a numerical characteristic of a
population.
4 Simple Random Sampling: Finite Population
- Finite populations are often defined by lists such as:
  - Organization membership roster
  - Credit card account numbers
  - Inventory product numbers
- A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.
5 Simple Random Sampling: Finite Population
- Replacing each sampled element before selecting subsequent elements is called sampling with replacement.
- Sampling without replacement is the procedure used most often.
- In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.
6 Simple Random Sampling: Infinite Population
- Infinite populations are often defined by an ongoing process whereby the elements of the population consist of items generated as though the process would operate indefinitely.
- A simple random sample from an infinite population is a sample selected such that the following conditions are satisfied:
  - Each element selected comes from the same population.
  - Each element is selected independently.
7 Simple Random Sampling: Infinite Population
- In the case of infinite populations, it is impossible to obtain a list of all elements in the population.
- The random number selection procedure cannot be used for infinite populations.
8 Point Estimation
In point estimation we use the data from the
sample to compute a value of a sample statistic
that serves as an estimate of a population
parameter.
s is the point estimator of the population
standard deviation σ.
9 Sampling Error
- When the expected value of a point estimator is equal to the population parameter, the point estimator is said to be unbiased.
- The absolute value of the difference between an unbiased point estimate and the corresponding population parameter is called the sampling error.
- Sampling error is the result of using a subset of the population (the sample), and not the entire population.
- Statistical methods can be used to make probability statements about the size of the sampling error.
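10 Sampling Error
- Following the definition of sampling error, the sampling errors are:
  - |x̄ - μ| for the sample mean
  - |s - σ| for the sample standard deviation
  - |p̄ - p| for the sample proportion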
11 Example: St. Andrews
- St. Andrews College receives 900 applications annually from prospective students. The application form contains a variety of information including the individual's scholastic aptitude test (SAT) score and whether or not the individual desires on-campus housing.
12 Example: St. Andrews
- The director of admissions would like to know the following information:
  - the average SAT score for the 900 applicants, and
  - the proportion of applicants that want to live on campus.
13 Example: St. Andrews
- We will now look at three alternatives for obtaining the desired information:
  - Conducting a census of the entire 900 applicants
  - Selecting a sample of 30 applicants, using a random number table
  - Selecting a sample of 30 applicants, using Excel
14 Conducting a Census
- If the relevant data for the entire 900 applicants were in the college's database, the population parameters of interest could be calculated using the formulas presented in Chapter 3.
- We will assume for the moment that conducting a census is practical in this example.
15 Conducting a Census
- Population Mean SAT Score: μ = 990
- Population Standard Deviation for SAT Score: σ = 80
- Population Proportion Wanting On-Campus Housing: p = .72
16 Simple Random Sampling
- Now suppose that the necessary data on the current year's applicants were not yet entered in the college's database.
- Furthermore, the Director of Admissions must obtain estimates of the population parameters of interest for a meeting taking place in a few hours.
- She decides a sample of 30 applicants will be used.
- The applicants were numbered, from 1 to 900, as their applications arrived.
17 Simple Random Sampling: Using a Random Number Table
- Taking a Sample of 30 Applicants
  - Because the finite population has 900 elements, we will need 3-digit random numbers to randomly select applicants numbered from 1 to 900.
  - We will use the last three digits of the 5-digit random numbers in the third column of the textbook's random number table, and continue into the fourth column as needed.
18 Simple Random Sampling: Using a Random Number Table
- Taking a Sample of 30 Applicants
  - The numbers we draw will be the numbers of the applicants we will sample unless
    - the random number is greater than 900 or
    - the random number has already been used.
  - We will continue to draw random numbers until we have selected 30 applicants for our sample.
  - (We will go through all of column 3 and part of column 4 of the random number table, encountering in the process five numbers greater than 900 and one duplicate, 835.)
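The selection rule above can be sketched in Python. This is only an illustration under stated assumptions: the function name `select_sample` is hypothetical, the printed random number table is not available here, so a seeded pseudo-random stream stands in for everything after the eight values quoted in the slides, and the value 000 is also skipped because no applicant has that number.

```python
import random
from itertools import chain

def select_sample(three_digit_numbers, population_size=900, sample_size=30):
    """Walk a stream of 3-digit random numbers and keep the usable ones.

    A number is skipped if it falls outside 1..population_size or if it
    has already been selected (sampling without replacement).
    """
    selected = []
    seen = set()
    for number in three_digit_numbers:
        if number < 1 or number > population_size:
            continue  # e.g. 902 exceeds 900
        if number in seen:
            continue  # duplicate, e.g. a second 835
        seen.add(number)
        selected.append(number)
        if len(selected) == sample_size:
            break
    return selected

# The first eight values are the ones quoted in the slides.
from_slides = [744, 436, 865, 790, 835, 902, 190, 836]

# Assumption: a seeded pseudo-random stream replaces the rest of the table.
rng = random.Random(7)
stand_in = (rng.randint(0, 999) for _ in range(10_000))

sample = select_sample(chain(from_slides, stand_in))
print(sample[:7])  # [744, 436, 865, 790, 835, 190, 836] (902 is skipped)
```

Tracking the already-used numbers in a set keeps the duplicate check cheap, and stopping as soon as 30 applicants are selected mirrors drawing from the table only as far as needed.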
19 Simple Random Sampling: Using a Random Number Table
- Use of Random Numbers for Sampling

3-Digit Random Number    Applicant Included in Sample
744                      No. 744
436                      No. 436
865                      No. 865
790                      No. 790
835                      No. 835
902                      Number exceeds 900
190                      No. 190
836                      No. 836
. . . and so on
20 Simple Random Sampling: Using a Random Number Table

No.   Random Number   Applicant        SAT Score   Live On-Campus
1     744             Conrad Harris    1025        Yes
2     436             Enrique Romero   950         Yes
3     865             Fabian Avante    1090        No
4     790             Lucila Cruz      1120        Yes
5     835             Chan Chiang      930         No
.     .               .                .           .
.     .               .                .           .
30    498             Emily Morse      1010        No
21 Simple Random Sampling: Using a Computer
- Taking a Sample of 30 Applicants
  - Computers can be used to generate random numbers for selecting random samples.
  - For example, Excel's function RANDBETWEEN(1,900) can be used to generate random numbers between 1 and 900.
  - With one random number generated for each applicant, we then choose the 30 applicants corresponding to the 30 smallest random numbers as our sample.
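One way to read the step above is: attach a random number to every applicant, sort, and keep the 30 applicants with the smallest numbers. The following is a minimal Python sketch of that idea, not a reproduction of the Excel worksheet. It assumes `random.random()` as the key generator (so ties are practically impossible), and the seed and variable names are chosen only for illustration.

```python
import random

N = 900   # applicants, numbered 1..900
n = 30    # desired sample size

rng = random.Random(2024)  # arbitrary seed so the sketch is repeatable

# Attach one random key to every applicant number.
keyed = [(rng.random(), applicant) for applicant in range(1, N + 1)]

# Sort by key and keep the applicants with the n smallest keys.
keyed.sort()
sample = [applicant for _, applicant in keyed[:n]]

print(sorted(sample))
```

Because every applicant receives exactly one key, no applicant can be chosen twice, so this is sampling without replacement. Python's standard library also offers `random.sample(range(1, N + 1), n)`, which draws such a sample directly.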
22 Point Estimation
- s as Point Estimator of σ
Note: Different random numbers would have identified a different sample, which would have resulted in different point estimates.
23 Summary of Point Estimates Obtained from a Simple Random Sample

Population Parameter                               Parameter Value   Point Estimator                                Point Estimate
μ = Population mean SAT score                      990               x̄ = Sample mean SAT score                      997
σ = Population std. deviation for SAT score        80                s = Sample std. deviation for SAT score        75.2
p = Population proportion wanting campus housing   .72               p̄ = Sample proportion wanting campus housing   .68
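Using the values in the summary above (parameter values 990, 80, and .72; point estimates 997, 75.2, and .68), the sampling error of each estimate is the absolute difference between the estimate and the parameter. A short Python sketch follows; the dictionary keys are labels chosen here for illustration.

```python
# (parameter value, point estimate) pairs from the summary of point estimates
results = {
    "mean SAT score": (990, 997),
    "std. deviation of SAT score": (80, 75.2),
    "proportion wanting campus housing": (0.72, 0.68),
}

# Sampling error = |point estimate - population parameter|
sampling_errors = {
    name: abs(estimate - parameter)
    for name, (parameter, estimate) in results.items()
}

for name, error in sampling_errors.items():
    print(f"{name}: {error:.2f}")
```

For this particular sample the sampling errors come to 7 points for the mean, 4.8 points for the standard deviation, and .04 for the proportion. A different sample would give different errors.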
24 Process of Statistical Inference
Population with mean μ = ?
A simple random sample of n elements is selected from the population.
25 where μ = the population mean
26 Finite Population
Infinite Population
- A finite population is treated as being infinite if n/N < .05.
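For reference, the standard textbook expressions for the standard deviation of the sample mean are σ/√n for an infinite population and sqrt((N - n)/(N - 1)) · σ/√n for a finite one. The Python sketch below applies the n/N rule under that assumption. The function name `standard_error` and the `threshold` parameter are hypothetical, and the N = 200 case is an invented illustration, not part of the St. Andrews example.

```python
import math

def standard_error(sigma, n, N=None, threshold=0.05):
    """Standard deviation of the sample mean.

    If N is None, or n/N is below the threshold, the population is
    treated as infinite and sigma / sqrt(n) is returned. Otherwise the
    finite population correction factor sqrt((N - n) / (N - 1)) is applied.
    """
    base = sigma / math.sqrt(n)
    if N is None or n / N < threshold:
        return base
    return math.sqrt((N - n) / (N - 1)) * base

# St. Andrews: sigma = 80, n = 30, N = 900, so n/N is about .033 < .05
se = standard_error(80, 30, N=900)
print(round(se, 1))  # 14.6

# Hypothetical smaller list (N = 200): n/N = .15, so the correction applies
se_small_list = standard_error(80, 30, N=200)
print(se_small_list < se)  # True: the correction factor is below 1
```

The value 14.6 is the divisor that appears in the z calculations for the sample of 30.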
30 Step 1: Calculate the z-value at the upper endpoint of the interval.
z = (1000 - 990)/14.6 = .68
Step 2: Find the area under the curve to the left of the upper endpoint.
P(z < .68) = .7517
31 Cumulative Probabilities for the Standard Normal Distribution
32 [Figure: normal curve centered at 990; area to the left of 1000 = .7517]
33 Step 3: Calculate the z-value at the lower endpoint of the interval.
z = (980 - 990)/14.6 = -.68
Step 4: Find the area under the curve to the left of the lower endpoint.
P(z < -.68) = P(z > .68)
            = 1 - P(z < .68)
            = 1 - .7517
            = .2483
34 [Figure: normal curve centered at 990; area to the left of 980 = .2483]
35 Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.
P(-.68 < z < .68) = P(z < .68) - P(z < -.68)
                  = .7517 - .2483
                  = .5034
The probability that the sample mean SAT score will be between 980 and 1000 is .5034.
36 [Figure: normal curve centered at 990; area between 980 and 1000 = .5034]
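Steps 1 through 5 can be checked numerically. This minimal Python sketch assumes the standard normal cumulative probability is computed from `math.erf` instead of being read from a table. The z-values are rounded to two decimals to mimic the table lookup, so the final figure can differ from the table-based result in the fourth decimal place.

```python
import math

def standard_normal_cdf(z):
    """P(Z < z) for a standard normal random variable."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu = 990           # population mean SAT score
std_error = 14.6   # standard deviation of the sample mean for n = 30

# Steps 1-2: upper endpoint
z_upper = round((1000 - mu) / std_error, 2)   # .68
area_upper = standard_normal_cdf(z_upper)     # about .7517

# Steps 3-4: lower endpoint
z_lower = round((980 - mu) / std_error, 2)    # -.68
area_lower = standard_normal_cdf(z_lower)     # about .2483

# Step 5: area between the two endpoints
probability = area_upper - area_lower
print(f"P(980 < sample mean < 1000) is about {probability:.4f}")
```

Keeping the two cumulative areas as separate variables follows the same left-tail logic as the table lookup: each step is an area to the left of an endpoint, and the answer is their difference.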
37 - Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.
40 [Figure: normal curve centered at 990; area between 980 and 1000 = .7888]
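The n = 100 case can be sketched with Python's `statistics.NormalDist`, working directly on the SAT scale without converting to z first. The sketch assumes the standard error is σ/√n = 80/√100 = 8, which is consistent with the area .7888 shown above. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 990      # population mean SAT score
sigma = 80    # population standard deviation
n = 100       # larger sample size

std_error = sigma / sqrt(n)   # 8.0, down from 14.6 with n = 30

# Assumption: the sample mean is modeled as normal with this mean and spread.
sampling_dist = NormalDist(mu, std_error)

probability = sampling_dist.cdf(1000) - sampling_dist.cdf(980)
print(f"{probability:.4f}")  # close to the table-based .7888
```

With the larger sample the standard error shrinks, so more of the sampling distribution falls within 10 points of the population mean than it did with n = 30.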