Continuous Probability Distributions - PowerPoint PPT Presentation

Slides: 32

Provided by: stephani214

Transcript and Presenter's Notes

1
Continuous Probability Distributions

Uniform Probability Distribution
Normal Probability Distribution
Exponential Probability Distribution
Other continuous probability distributions

2
Continuous Probability Distributions

A continuous random variable can assume any value
in an interval on the real line or in a
collection of intervals.
It is not possible to talk about the probability
of the random variable assuming a particular
value.
Instead, we talk about the probability of the
random variable assuming a value within a given
interval.
The probability of the random variable assuming a
value within some given interval from x1 to x2 is
defined to be the area under the graph of the
probability density function between x1 and x2.
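
In symbols, the last two statements can be written as follows. This is a sketch in standard notation; the symbols X (the random variable), f (its probability density function) and c (a single value) are labels introduced here and are not taken from the slides.

```latex
P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx ,
\qquad
P(X = c) = \int_{c}^{c} f(x)\,dx = 0 .
```

Because a single point has zero width, the area above it is zero, which is why only interval probabilities are meaningful for a continuous random variable.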

3
Uniform Probability Distribution

A random variable is uniformly distributed
whenever the probability is proportional to the
interval's length.
Uniform Probability Density Function
$f(x) = \dfrac{1}{b - a}$ for $a \le x \le b$
$f(x) = 0$ elsewhere
where
a = smallest value the variable can assume
b = largest value the variable can assume

4
Uniform Probability Distribution

Expected Value of x
$E(x) = \dfrac{a + b}{2}$
Variance of x
$\mathrm{Var}(x) = \dfrac{(b - a)^2}{12}$
where
a = smallest value the variable can assume
b = largest value the variable can assume
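
The uniform formulas can be checked with a few lines of Python. This is a minimal sketch: the function names and the example endpoints a = 5 and b = 15 are assumptions chosen for illustration and do not come from the slides.

```python
def uniform_pdf(x, a, b):
    """Density of the uniform distribution: 1/(b - a) inside [a, b], 0 elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2): area under the flat density, clipped to [a, b]."""
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0.0) / (b - a)


def uniform_mean(a, b):
    """Expected value (a + b) / 2."""
    return (a + b) / 2


def uniform_var(a, b):
    """Variance (b - a)^2 / 12."""
    return (b - a) ** 2 / 12


# Hypothetical example: X is uniform between a = 5 and b = 15.
a, b = 5, 15
print(uniform_pdf(10, a, b))       # height of the density: 0.1
print(uniform_prob(12, 15, a, b))  # interval of length 3 out of 10
print(uniform_mean(a, b))          # midpoint of the interval
print(uniform_var(a, b))           # 100 / 12
```

The probability of an interval depends only on its length relative to b - a, which is the "proportional to the interval's length" property stated above.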

5
Graph of the Normal Probability Density Function

6
Characteristics of the Normal Probability
Distribution

The shape of the normal curve is often
illustrated as a bell-shaped curve.
Two parameters, μ (mean) and σ (standard
deviation), determine the location and shape of
the distribution.
The highest point on the normal curve is at the
mean, which is also the median and mode.
The mean can be any numerical value: negative,
zero, or positive.
The normal curve is symmetric.
The standard deviation determines the width of
the curve; larger values result in wider, flatter
curves.
The total area under the curve is 1 (.5 to the
left of the mean and .5 to the right).
Probabilities for the normal random variable are
given by areas under the curve.

7
Normal Probability Density Function

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$
where
μ = mean
σ = standard deviation
π = 3.14159
e = 2.71828

8
Standard Normal Probability Distribution

A random variable that has a normal distribution
with a mean of zero and a standard deviation of
one is said to have a standard normal probability
distribution.
The letter z is commonly used to designate this
normal random variable.
Converting to the Standard Normal Distribution
$z = \dfrac{x - \mu}{\sigma}$
We can think of z as a measure of the number of
standard deviations x is from μ.
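
The conversion to z can be illustrated with Python's standard library. This is a sketch only: the mean of 100, the standard deviation of 15 and the value 130 are hypothetical numbers, and `z_score` is a helper name introduced here.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Hypothetical normal variable with mean 100 and standard deviation 15.
mu, sigma = 100.0, 15.0
z = z_score(130.0, mu, sigma)              # 130 is 2 standard deviations above 100
p_below = std_normal.cdf(z)                # area to the left of z under the standard normal
p_same = NormalDist(mu, sigma).cdf(130.0)  # same area computed on the original scale

print(z, round(p_below, 4), round(p_same, 4))
```

Both probabilities agree, which is why a single standard normal table is enough for any normal distribution once x has been converted to z.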

9
Exponential Probability Density Function

$f(x) = \dfrac{1}{\mu}\, e^{-x/\mu}$ for $x > 0$
where μ = mean and e = 2.71828
The exponential distribution is commonly used to
model the time between occurrences of events.
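
A short sketch of how exponential probabilities are computed. It assumes the usual mean-parameterised density (1/μ)·e^(-x/μ) and its cumulative form P(X ≤ x) = 1 - e^(-x/μ); the mean of 3 minutes between events is a hypothetical value, and the function names are introduced here for illustration.

```python
import math


def exp_pdf(x, mu):
    """Exponential density with mean mu (zero for negative x)."""
    return math.exp(-x / mu) / mu if x >= 0 else 0.0


def exp_cdf(x, mu):
    """P(X <= x) = 1 - e^(-x/mu): area under the density from 0 to x."""
    return 1.0 - math.exp(-x / mu) if x >= 0 else 0.0


# Hypothetical example: the mean time between events is 3 minutes.
mu = 3.0
p_within_2 = exp_cdf(2.0, mu)                   # next event within 2 minutes
p_between = exp_cdf(4.0, mu) - exp_cdf(2.0, mu)  # next event between 2 and 4 minutes

print(round(p_within_2, 4), round(p_between, 4))
```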

10
The Gamma Distribution

The gamma distribution is an extension of the
exponential distribution
where x > 0, α > 0, β > 0
Γ(α) = (α − 1)! for α = 1, 2, ...
The chi-squared distribution, $\chi^2_\alpha$, is closely
related to the gamma distribution.

11
Student t distribution

The Student's t distribution is closely related to
the normal and gamma distributions and plays an
important role in certain statistical testing
procedures.

12
Sampling and Sampling Distributions
Simple Random Sampling
Point Estimation
Introduction to Sampling Distributions
Sampling Distribution of x̄
Sampling Distribution of p
Interval Estimation
Interval estimation of a population mean: large
and small sample cases
Determining the sample size
Interval estimation of a population proportion
Hypothesis Testing
Tests about a population mean: large and small
sample cases
Tests about a population proportion

13
Statistical Inference

The purpose of statistical inference is to obtain
information about a population from information
contained in a sample.
A population is the set of all the elements of
interest.
A sample is a subset of the population.
The sample results provide only estimates of the
values of the population characteristics.
A parameter is a numerical characteristic of a
population.
With proper sampling methods, the sample results
will provide good estimates of the population
characteristics.

14
Point Estimation

In point estimation we use the data from the
sample to compute a value of a sample statistic
that serves as an estimate of a population
parameter.
We refer to x̄ as the point estimator of the
population mean μ.
s is the point estimator of the population
standard deviation σ.
p is the point estimator of the population
proportion π.

15
Sampling Distribution of p

The sampling distribution of p is the probability
distribution of all possible values of the sample
proportion.
Expected Value of p
$E(p) = \pi$
where π = the population proportion
Standard Deviation of p
$\sigma_p = \sqrt{\dfrac{\pi(1 - \pi)}{n}}$
$\sigma_p$ is referred to as the standard error of the
proportion.

16
Sampling Distribution of p

The sampling distribution of p can be
approximated by a normal distribution whenever
the sample size is large.
A sample can be considered large when both np ≥ 5
and n(1 − p) ≥ 5.
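
The large-sample rule and the standard error of the proportion can be sketched as follows. The code assumes the usual standard error formula sqrt(p(1 - p)/n) with no finite population correction; the proportion 0.6 and the sample size 100 are hypothetical, and both function names are introduced here.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of the sample proportion, assuming sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)


def is_large_sample(p, n):
    """Rule of thumb from the slide: both n*p and n*(1 - p) are at least 5."""
    return n * p >= 5 and n * (1 - p) >= 5


# Hypothetical example: proportion 0.6 and a sample of 100 elements.
p, n = 0.6, 100
se = proportion_standard_error(p, n)

print(round(se, 4))               # spread of the sample proportion
print(is_large_sample(p, n))      # n*p = 60 and n*(1 - p) = 40
print(is_large_sample(0.02, n))   # n*p = 2, too small for the normal approximation
```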

17
Interval Estimation

A point estimate gives only a single estimate
of a population parameter and does not take into
account the variability in the data or the sample
size.
The standard error of the sampling distribution
of x̄ is a measure of the reliability or precision
of x̄ as an estimate of μ.
The standard error can be used to construct a
confidence interval for the population mean, μ.
Confidence: the level of confidence that the
interval will contain μ.
The confidence level is normally set at 95%.

18
95% Confidence Interval for a Population Mean,
μ: Large Sample Case (n > 30)
With σ unknown:
$\bar{x} \pm 1.96\,\dfrac{s}{\sqrt{n}}$
where x̄ is the sample mean
s is the sample standard deviation
n is the sample size
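
A minimal sketch of the large-sample interval, assuming the usual form x̄ ± 1.96·s/√n with s as the sample standard deviation. The data are made up for illustration (35 hypothetical measurements), and `large_sample_ci95` is a name introduced here.

```python
import math
from statistics import mean, stdev


def large_sample_ci95(sample):
    """95% confidence interval for the mean: x_bar +/- 1.96 * s / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    margin = 1.96 * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical data: 35 measurements cycling through the values 50..56.
sample = [50 + (i % 7) for i in range(35)]
lower, upper = large_sample_ci95(sample)

print(round(lower, 3), round(upper, 3))
```

The interval is centred on the sample mean, and its half-width shrinks in proportion to 1/√n as the sample grows.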

19
95% Confidence Interval for a Population Mean,
μ: Small Sample Case (n < 30)

Population is Not Normally Distributed
The only option is to increase the sample size to
n gt 30 and use the large sample interval
estimation
procedures.
Population is Normally Distributed and σ is
Unknown
The appropriate interval estimate is based on the
t distribution.

20
t Distribution

The t distribution is a family of similar
probability distributions.
A specific t distribution depends on a parameter
known as the degrees of freedom.
(e.g., for a problem with 20 elements, degrees of
freedom = df = 20 − 1 = 19)
As the number of degrees of freedom increases,
the difference between the t distribution and
the standard normal probability distribution
becomes smaller and smaller.
A t distribution with more degrees of freedom
has less dispersion.
The mean of the t distribution is zero.

21
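95% Confidence Interval for a Population Mean,
μ: Small Sample Case (n < 30)

$\bar{x} \pm t_{2.5\%,\,n-1}\,\dfrac{s}{\sqrt{n}}$
where x̄ is the sample mean
s is the sample standard deviation
n is the sample size
$t_{2.5\%,\,n-1}$ is the upper 2.5% point of the t distribution with n − 1 degrees of freedom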
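
A sketch of the small-sample interval, assuming the usual form x̄ ± t·s/√n. Python's standard library has no t distribution, so the critical value is passed in from a t table: 2.093 is the tabulated upper 2.5% point for 19 degrees of freedom, matching the 20-element example mentioned earlier. The data and the function name are hypothetical.

```python
import math
from statistics import mean, stdev


def small_sample_ci95(sample, t_crit):
    """95% confidence interval for the mean: x_bar +/- t_crit * s / sqrt(n).

    t_crit is the upper 2.5% point of the t distribution with n - 1 degrees
    of freedom, looked up in a t table by the caller.
    """
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical data: 20 measurements, so df = 20 - 1 = 19.
sample = [10 + (i % 5) for i in range(20)]
T_CRIT_DF19 = 2.093  # tabulated value for 19 degrees of freedom

lower, upper = small_sample_ci95(sample, T_CRIT_DF19)
print(round(lower, 3), round(upper, 3))
```

Because 2.093 is larger than 1.96, the t-based interval is wider than a normal-based interval would be for the same data, reflecting the extra uncertainty in a small sample.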

22
Hypothesis Testing

Hypothesis testing can be used to determine
whether a statement about the value of a
population parameter should or should not be
rejected.
The null hypothesis, denoted by H0, is a
tentative assumption about a population
parameter.
The alternative hypothesis, denoted by H1, is the
opposite of what is stated in the null hypothesis
and is often referred to as the hypothesis of
interest.

23
Null and Alternative Hypotheses about a
Population Mean

The equality part of the hypotheses always
appears in the null hypothesis.
A hypothesis test about the value of a population
mean μ usually takes the following form (where
μ0 is the hypothesised value of the population
mean).
H0: μ = μ0
H1: μ ≠ μ0

24
The Steps of Hypothesis Testing

Determine the appropriate hypotheses.
Select the test statistic for deciding whether or
not to reject the null hypothesis.
Collect the sample data.
Use a statistical package to compute the test
statistic and the p-value.
If the p-value < 0.05, reject H0.
Draw sensible conclusions based on the decision
to reject H0 or not.

25
Tests about a Population Mean: Large-Sample Case
(n > 30)

Test Statistic
$Z = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
Rejection rule
Reject H0 if Z > 1.96 or Z < −1.96
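
The large-sample test can be sketched in Python. The code assumes the usual statistic Z = (x̄ - μ0)/(s/√n) and a two-sided p-value from the standard normal distribution; the summary numbers (sample mean 52, standard deviation 6, n = 36, hypothesised mean 50) are hypothetical, and `z_test` is a name introduced here.

```python
import math
from statistics import NormalDist


def z_test(x_bar, s, n, mu0):
    """Two-sided large-sample test of H0: mu = mu0.

    Returns the Z statistic and the two-sided p-value.
    """
    z = (x_bar - mu0) / (s / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical summary data.
z, p_value = z_test(x_bar=52.0, s=6.0, n=36, mu0=50.0)

reject_by_z = z > 1.96 or z < -1.96  # rejection rule from the slide
reject_by_p = p_value < 0.05         # equivalent p-value rule

print(round(z, 2), round(p_value, 4), reject_by_z, reject_by_p)
```

The two rules always agree for a two-sided test at the 5% level, because |Z| > 1.96 exactly when the two-sided p-value is below 0.05.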

26
Tests about a Population Mean: Small-Sample Case
(n < 30)

Assuming that the population is normally
distributed
Test Statistic
$T = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
Rejection rule
Reject H0 if $T > t_{2.5\%,\,n-1}$ or $T < -t_{2.5\%,\,n-1}$

27
The Use of p-Values

The p-value is the probability, computed assuming
H0 is true, of obtaining a sample result that is
at least as unlikely as what is observed.
The p-value can be used to make the decision in a
hypothesis test.
Reject H0 if the p-value < 0.05

28
Null and Alternative Hypotheses for a Population
Proportion

The equality part of the hypotheses always
appears in the null hypothesis.
In general, a hypothesis test about the value of
a population proportion p takes the
following form (where p0 is the hypothesized
value of the population proportion).
H0: p = p0
H1: p ≠ p0

29
Example

The following nucleotide distribution was
observed:
A: 2000, C: 2100, G: 1500, T: 2500 (total 8100)
Question 1: Compute estimates of the nucleotide
probabilities.
P(A) = 2000/8100 = 0.247
P(C) = 2100/8100 = 0.259
P(G) = 1500/8100 = 0.185
P(T) = 2500/8100 = 0.309
Question 2: Test the null hypothesis that the
nucleotide probabilities are equal, that is,
H0: p1 = p2 = p3 = p4 = 1/4,
using a goodness-of-fit test based on the
chi-square distribution.

Given that
H0: p1 = p2 = p3 = p4 = 1/4,
the expected counts for A, C, G, and T are each
8100/4 = 2025.
For the chi-square test, the following expression is
used to compute the test statistic:
$X^2 = \sum \dfrac{(o - e)^2}{e}$
e = expected value (e.g., mean value)
o = actual (observed) value
$X^2 = \dfrac{(2000-2025)^2}{2025} + \dfrac{(2100-2025)^2}{2025} + \dfrac{(1500-2025)^2}{2025} + \dfrac{(2500-2025)^2}{2025}$
$X^2 = 250.62$
As we have 4 categories,
degrees of freedom (df) = 4 − 1 = 3.
From the chi-square distribution table (see next
page), the critical value at the 95% confidence
level (5% significance level, α = 0.05) is
7.8147. (Note that if a confidence
level is not specified in the question, you may
take it to be 95%.)
Conclusion: Reject H0 at the 5% significance level, as
X² exceeds the critical value. There is
evidence that the probabilities are not all
equal. There are more C and T, and fewer A and G,
than expected, as can be seen from Question 1.
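
The arithmetic of this example can be reproduced with a short script. This is a sketch using only the counts given in the example and the tabulated critical value 7.8147 quoted above; the variable names are introduced here for illustration.

```python
# Observed nucleotide counts from the example.
observed = {"A": 2000, "C": 2100, "G": 1500, "T": 2500}

total = sum(observed.values())    # total number of nucleotides
expected = total / len(observed)  # equal probabilities of 1/4 under H0

# Question 1: estimated probabilities are the observed relative frequencies.
estimates = {base: count / total for base, count in observed.items()}

# Question 2: goodness-of-fit statistic, sum of (o - e)^2 / e over the categories.
chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())

df = len(observed) - 1  # degrees of freedom: 4 - 1 = 3
CRITICAL_5PCT = 7.8147  # tabulated chi-square critical value quoted in the example

reject_h0 = chi_sq > CRITICAL_5PCT

print({base: round(p, 3) for base, p in estimates.items()})
print(round(chi_sq, 2), df, reject_h0)

# Direction of each deviation from the expected count.
for base, o in observed.items():
    print(base, "above expected" if o > expected else "below expected")
```

The G and T terms dominate the statistic, since their counts are furthest from the expected 2025.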