Title: Random Sampling and Sampling Distributions
1Random Sampling and Sampling Distributions
2Statistical Inference
- The purpose of statistical inference is to obtain information about a population from information contained in a sample.
- A population is the set of all the elements of interest.
- A sample is a subset of the population.
- The sample results provide only estimates of the values of the population characteristics.
- With proper sampling methods, the sample results will provide good estimates of the population characteristics.
3Population and Sample
[Figure: a sample shown as a subset of the population]
4Parameter
A number describing a population
5Statistic
A number describing a sample
6Random Sampling from a Population
- To make an inference about a population parameter (characteristic), we draw a random sample from the population.
- Suppose we select a sample of size n from a population of size N.
- A random sampling procedure is one in which every possible sample of n observations from the population is equally likely to occur.
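The definition above can be sketched in Python. This is a minimal illustration, assuming a hypothetical population of N = 5 labelled elements; the labels and the helper name `draw_simple_random_sample` are not from the slides. It lists every possible sample of size n = 2 and then draws one without replacement using the standard library's `random.sample`.

```python
import itertools
import math
import random

# Hypothetical population of N = 5 elements (labels are illustrative only).
population = ["A", "B", "C", "D", "E"]
n = 2

# Every possible sample of n observations (order ignored).
all_samples = list(itertools.combinations(population, n))
num_samples = math.comb(len(population), n)  # "N choose n"

def draw_simple_random_sample(pop, size, seed=None):
    """Draw one sample without replacement; every subset of `size`
    elements has the same chance of being selected."""
    rng = random.Random(seed)
    return rng.sample(pop, size)

sample = draw_simple_random_sample(population, n, seed=0)
print(num_samples, sample)
```

With N = 5 and n = 2 there are 10 possible samples, so under simple random sampling each one has probability 1/10.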
7Note-I
- A random sample should represent the population well, so sample statistics from a random sample should provide reasonable estimates of population parameters.
- All sample statistics have some error in estimating population parameters.
8Note-II
- If repeated samples are taken from a population and the same statistic (e.g. the mean) is calculated from each sample, the statistics will vary; that is, they will have a distribution.
- A larger sample provides more information than a smaller sample, so a statistic from a large sample should have less error than a statistic from a small sample.
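A small simulation can make both points concrete. The sketch below assumes a hypothetical population of 10,000 uniform values and a fixed seed; the helper name `sample_means` and all the numbers are illustrative choices, not part of the slides.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values drawn uniformly from [0, 100].
population = [rng.uniform(0, 100) for _ in range(10_000)]

def sample_means(pop, n, repeats=2_000):
    """Return the means of `repeats` simple random samples of size n."""
    return [statistics.fmean(rng.sample(pop, n)) for _ in range(repeats)]

means_small = sample_means(population, n=5)
means_large = sample_means(population, n=100)

# The statistic varies from sample to sample, and its spread is
# smaller when each sample is larger.
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)
print(round(spread_small, 2), round(spread_large, 2))
```

The spread of the sample means for n = 100 comes out well below the spread for n = 5, which is the "less error" the second bullet describes.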
9Point Estimation
- In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.
- We refer to $\bar{x}$ as the point estimator of the population mean $\mu$.
- $s$ is the point estimator of the population standard deviation $\sigma$.
- $\bar{p}$ is the point estimator of the population proportion $p$.
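As a minimal sketch, the three point estimates can be computed with Python's standard library. The sample data below are made up purely for illustration, and the variable names `x_bar`, `s`, and `p_bar` are assumptions chosen to mirror the usual symbols.

```python
import statistics

# Hypothetical sample of n = 5 elements (values invented for illustration).
values = [12.0, 15.0, 11.0, 14.0, 18.0]
# Whether each sampled element counts as a "success".
successes = [True, False, True, True, False]

x_bar = statistics.fmean(values)         # point estimate of the population mean
s = statistics.stdev(values)             # point estimate of the population sd (divisor n - 1)
p_bar = sum(successes) / len(successes)  # point estimate of the population proportion

print(x_bar, round(s, 4), p_bar)
```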
10Sampling Error
- The absolute difference between an unbiased point estimate and the corresponding population parameter is called the sampling error.
- Sampling error is the result of using a subset of the population (the sample), and not the entire population, to develop estimates.
- The sampling errors are:
- $|\bar{x} - \mu|$ for the sample mean
- $|s - \sigma|$ for the sample standard deviation
- $|\bar{p} - p|$ for the sample proportion
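The sketch below computes the three sampling errors for one sample. It assumes a tiny hypothetical population whose parameters are fully known, which is never the case in practice; it is done here only so the errors can be evaluated. Treating "value greater than 4" as a success is an arbitrary illustrative choice.

```python
import statistics

# Hypothetical population (known in full only for the sake of the example).
population = [2, 4, 4, 4, 5, 5, 7, 9]
sample = [4, 5, 9]  # one hypothetical sample drawn from it

def is_success(x):
    """Illustrative definition of a 'success': value greater than 4."""
    return x > 4

# Population parameters.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # divisor N
p = sum(is_success(x) for x in population) / len(population)

# Point estimates from the sample.
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)           # divisor n - 1
p_bar = sum(is_success(x) for x in sample) / len(sample)

# Sampling errors: absolute differences between estimate and parameter.
error_mean = abs(x_bar - mu)
error_sd = abs(s - sigma)
error_prop = abs(p_bar - p)
print(error_mean, round(error_sd, 4), round(error_prop, 4))
```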
11Sampling Distribution
- A statistic is any function of the observations in a random sample.
- A statistic is a random variable with a probability distribution.
- The probability distribution of a statistic is called its sampling distribution. Its standard deviation is called the standard error of the statistic.
12Sampling Distribution of the Sample Mean
- Suppose we attempt to make an inference about the population mean by drawing a sample from the population and calculating the sample mean.
- The sample mean of a random sample of size n from a population is given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
13Sampling Distribution of the Sample Mean
- Making Inferences about a Population Mean
Population with mean $\mu$ = ?
A simple random sample of n elements is
selected from the population.
14Sampling Distribution of the Sample Mean
- The sampling distribution of $\bar{x}$ is the probability distribution of all possible values of the sample mean $\bar{x}$.
- Expected Value of $\bar{x}$:
- $E(\bar{x}) = \mu$
- where $\mu$ = the population mean
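For a small population, the expected value of the sample mean can be checked exactly by listing every possible sample. The sketch below assumes a hypothetical population of N = 5 values and samples of size n = 3 drawn without replacement; exact fractions are used so no rounding is involved.

```python
from fractions import Fraction
from itertools import combinations

population = [3, 7, 8, 12, 20]  # hypothetical population, N = 5
n = 3

# Population mean as an exact fraction.
mu = Fraction(sum(population), len(population))

# Sample mean of every possible sample of size n; under simple random
# sampling each of these samples is equally likely.
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]

# Expected value of the sample mean = average over all equally likely samples.
expected_x_bar = sum(sample_means) / len(sample_means)
print(expected_x_bar == mu)
```

The individual sample means differ from one another, but their average over all possible samples equals the population mean.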
15Sampling Distribution of the Sample Mean
- Central Limit Theorem
- Suppose $X_1, X_2, \ldots, X_n$ are n independent random variables from a population with mean $\mu$ and finite variance $\sigma^2$. Then the average of those variables will be approximately normal with mean $\mu$ and variance $\sigma^2/n$ as the sample size becomes large. (Their sum is likewise approximately normal, with mean $n\mu$ and variance $n\sigma^2$.)
- Implication
- If we view each member of a random sample as an independent random variable, then the mean of those random variables, meaning the sample mean, will be approximately normally distributed as the sample size gets large.
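A simulation gives a rough feel for the theorem. The sketch below assumes a strongly skewed hypothetical population (exponential with rate 1, so its mean and standard deviation are both 1) and checks that means of samples of size n = 50 behave as a normal distribution with mean 1 and variance 1/n would predict. The seed, n, and number of repetitions are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed for repeatability

mu, sigma = 1.0, 1.0    # mean and sd of an Exponential(rate=1) population
n, repeats = 50, 5_000

# Mean of n independent draws, repeated many times.
means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(repeats)
]

# If the sample mean is approximately normal with mean mu and variance
# sigma^2 / n, about 95% of the means should lie within 1.96 standard
# errors of mu.
se = sigma / math.sqrt(n)
within = sum(abs(m - mu) <= 1.96 * se for m in means) / repeats

print(round(statistics.fmean(means), 3),
      round(statistics.variance(means), 4),
      round(within, 3))
```

The printed mean and variance of the simulated sample means should land close to 1 and 1/50 = 0.02, and the share within 1.96 standard errors close to 0.95, even though the population itself is far from normal.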
16Sampling Distribution of the Sample Mean
- Implication: The variance of the sampling distribution of the sample mean decreases as the sample size n increases.
- The larger the sample drawn from a population, the more certain the inference made about the population mean based on sample information, such as the sample mean.
17Sampling Distribution of the Sample Mean
- As a rule of thumb, the CLT approximation is considered adequate when the sample size is greater than or equal to 30.
- Note: In most applications with financial data, the sample size will be significantly greater than 30.
- Using the results of the CLT, the sampling distribution of the sample mean will have a mean equal to $\mu$ and a variance equal to $\sigma^2/n$.
- The corresponding standard deviation of the sample mean, called the standard error of the sample mean, will be $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
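The standard error of the sample mean is the square root of the variance $\sigma^2/n$, that is, $\sigma/\sqrt{n}$. A minimal sketch follows; the function name `standard_error_of_mean` and the numbers are illustrative assumptions.

```python
import math

def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 12.
se_36 = standard_error_of_mean(12.0, 36)    # 12 / 6
se_144 = standard_error_of_mean(12.0, 144)  # 12 / 12
print(se_36, se_144)
```

Quadrupling the sample size from 36 to 144 halves the standard error, because n enters through its square root.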
18Does $\bar{x}$ have a normal distribution?
- Is the population normal?
  - Yes: $\bar{x}$ is normal.
  - No: Is $n \geq 30$?
    - Yes: $\bar{x}$ is considered to be normal.
    - No: $\bar{x}$ may or may not be considered normal (we need more info).
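The decision flow on this slide can be written as a small function. This is only a sketch of the rule of thumb: the function name `x_bar_normality`, its return strings, and the default threshold of 30 (taken from the preceding slide) are illustrative assumptions.

```python
def x_bar_normality(population_is_normal, n, threshold=30):
    """Classify the sampling distribution of the sample mean using the
    rule of thumb: normal population, or n at least `threshold`."""
    if population_is_normal:
        return "normal"
    if n >= threshold:
        return "considered normal"
    return "need more information"

print(x_bar_normality(True, 5))
print(x_bar_normality(False, 50))
print(x_bar_normality(False, 10))
```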
19Sampling Distribution of a Sample Proportion
- If X follows a binomial distribution, then to find the probability of a certain number of successes in n trials, we need to know the probability of a success, p.
- To make inferences about the population proportion p (the probability of a success as described above), we use the sample proportion $\bar{p}$.
- The sample proportion is the ratio of the number of successes (X) to the sample size n: $\bar{p} = X/n$.
20Sampling Distribution of a Sample Proportion
- The sampling distribution of $\bar{p}$ is the probability distribution of all possible values of the sample proportion $\bar{p}$.
- Expected Value of $\bar{p}$: $E(\bar{p}) = p$
- where
- p = the population proportion
- Thus, $\bar{p}$ is an unbiased estimator of the population proportion, p.
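The expected value of the sample proportion can be checked directly from the binomial distribution of the number of successes X. The sketch below assumes hypothetical values n = 20 and p = 0.3, and the helper name `binomial_pmf` is an illustrative choice; it sums (x/n) times P(X = x) over all possible x.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 20, 0.3  # hypothetical sample size and population proportion

# Expected value of the sample proportion X / n, weighting each possible
# value by its binomial probability.
expected_p_bar = sum((x / n) * binomial_pmf(x, n, p) for x in range(n + 1))
print(round(expected_p_bar, 10))
```

Up to floating-point rounding, the result equals p, which is what unbiasedness of the sample proportion means.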
21Sampling Distribution of a Sample Proportion
- Making Inferences about a Population Proportion
22Sampling Distribution of the Sample Variance
- Suppose we draw a random sample of size n from a population and want to make an inference about the population variance.
- This inference can be based on the sample variance, defined as $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$.
- The mean of the sampling distribution of the sample variance is equal to the population variance: $E(s^2) = \sigma^2$.
23Population Variance
- $\sigma^2$ is the population variance, $\sigma$ is the population standard deviation.
- Note that the divisor of the sample variance is the sample size minus one (n-1), while for the population variance it is the population size N.
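The difference between the two divisors can be seen by applying both to the same numbers. In the sketch below the data and the helper name `sum_squared_deviations` are hypothetical; the results are cross-checked against `statistics.variance` (divisor n-1) and `statistics.pvariance` (divisor N) from Python's standard library.

```python
import statistics

data = [4, 8, 6, 5, 7]  # hypothetical values

def sum_squared_deviations(xs):
    """Sum of squared deviations from the mean of xs."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

ss = sum_squared_deviations(data)

# Treating the data as a sample: divide by n - 1.
sample_var = ss / (len(data) - 1)
# Treating the data as the entire population: divide by N.
population_var = ss / len(data)

print(sample_var, statistics.variance(data))
print(population_var, statistics.pvariance(data))
```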
24Statistical Inference
- Procedures for making inferences:
- 1. Estimation
- There are 2 types of estimators: point estimator and interval estimator
- 2. Confidence intervals
- 3. Hypothesis testing
25Point Estimation
- The functional form of the pdf is known.
- The distribution depends on an unknown parameter, say $\theta$, that may have any value in the parameter space $\Omega$.
- Point estimation is to choose a member from the family by guessing a value of $\theta$.
26Point Estimator
- A point estimator draws inference about a population by estimating the value of an unknown parameter using a single value or a point.
[Figure: population distribution with an unknown parameter; sample distribution with the point estimator]
27Point Estimate
- The statistic $\hat{\theta}$ is the point estimator of $\theta$.
29 Point Estimation
The point estimator $\hat{\theta}$ is an unbiased estimator for the parameter $\theta$ if $E(\hat{\theta}) = \theta$. If the estimator is not unbiased, then the difference $E(\hat{\theta}) - \theta$ is called the bias of the estimator.
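Bias, the expected value of an estimator minus the parameter it estimates, can be computed exactly for a small case. The sketch below compares two estimators of the population variance, one dividing by n-1 and one dividing by n. It assumes a hypothetical population of four equally likely values and n = 2 independent draws (sampling with replacement), so that every ordered sample is equally likely; exact fractions avoid rounding.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]  # hypothetical values, each equally likely
n = 2

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def variance_estimate(sample, divisor):
    """Sum of squared deviations from the sample mean, over `divisor`."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / divisor

# All equally likely samples of n independent draws.
samples = list(product(population, repeat=n))

def expected_estimate(divisor):
    """Average of the estimator over every equally likely sample."""
    return sum(variance_estimate(s, divisor) for s in samples) / len(samples)

bias_n_minus_1 = expected_estimate(n - 1) - sigma2  # divisor n - 1
bias_n = expected_estimate(n) - sigma2              # divisor n
print(bias_n_minus_1, bias_n)
```

In this setup the estimator with divisor n-1 has zero bias, while the one with divisor n underestimates the population variance on average by $\sigma^2/n$.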