Title: Probability and statistics
1 Probability and statistics
- Dr. K.W. Chow
- Mechanical Engineering
2 Contents
- Review of basic concepts
  - permutations
  - combinations
  - random variables
  - conditional probability
- Binomial distribution
3 Contents
- Poisson distribution
- Normal distribution
- Hypothesis testing
4 Basics
- Principle of counting
- With m women and n men, there are m × n different marriage combinations (for each lady there are n possible partners, hence m × n in total).
5 Basics
- Permutation (order important)
  - e.g. form a 3-digit number from the digits (1, 2, ..., 9).
- Combination (order unimportant)
  - e.g. "Mary marries John" is the same as "John marries Mary".
6 Permutations
- Permutations of n things taken r at a time (assuming no repetitions).
- For the first slot / vacancy, there are n choices.
- For the second slot / vacancy, there are (n − 1) choices.
- Thus there are n(n − 1)...(n − r + 1) = n!/(n − r)! ways.
7 Combinations
- Combinations of n things taken r at a time (assuming order unimportant).
- Permutations: n(n − 1)...(n − r + 1) = n!/(n − r)! ways.
- Every r! permutations are equivalent to a single combination.
- Hence the number of combinations is n!/((n − r)! r!).
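A quick numerical check of these two counting formulas, as a minimal sketch using only Python's standard math module (an addition, not part of the original slides):

```python
import math

n, r = 9, 3  # e.g. forming a 3-digit number from the digits 1-9

# Permutations: n!/(n - r)!  (order matters)
n_perm = math.perm(n, r)                                     # 9 * 8 * 7 = 504
assert n_perm == math.factorial(n) // math.factorial(n - r)

# Combinations: n!/((n - r)! r!)  (order does not matter)
n_comb = math.comb(n, r)                                     # 504 / 3! = 84
assert n_comb == n_perm // math.factorial(r)

print(n_perm, n_comb)  # 504 84
```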
8 Conditional Probability
- The probability that an event B occurs, given that another event A has happened.
- Definition: P(B | A) = P(A ∩ B) / P(A).
- Note that when B and A are independent, P(B | A) = P(B).
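A small enumeration sketch of the definition (an illustration added here, with events chosen only for this example): for two fair dice, take A = "the sum is at least 10" and B = "the first die shows a 6".

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))    # 36 equally likely rolls

A = [o for o in outcomes if sum(o) >= 10]          # sum is at least 10
A_and_B = [o for o in A if o[0] == 6]              # ... and the first die is 6

P_A = Fraction(len(A), len(outcomes))              # 6/36
P_A_and_B = Fraction(len(A_and_B), len(outcomes))  # 3/36

P_B_given_A = P_A_and_B / P_A                      # definition of conditional probability
print(P_B_given_A)                                 # 1/2
```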
9 Random variables
- (Intuitive) Random variables are quantities whose values are random and to which a probability distribution is assigned.
- They are either discrete or continuous.
10 Random variables
- Example of a random variable: the outcome of rolling a fair die.
11 Random variables
- All possible outcomes belong to the set {1, 2, 3, 4, 5, 6}.
- The outcome is random.
- The probabilities of all outcomes are the same, i.e. the outcomes follow the uniform distribution.
- Hence the outcome is a random variable.
12 Random variables
- (Rigorous definition) A random variable is a MAPPING from elements of the sample space to a set of real numbers (or an interval on the real line).
- e.g. for a fair die, each of the outcomes 1, 2, 3, 4, 5, 6 is assigned probability 1/6.
13 Probability density function
- In physics, the mass of an object is the integral of density over the volume of that object.
- The probability density function (pdf) f(x) is defined such that the probability of a random variable X falling between a and b equals the integral of f between a and b, i.e. P(a ≤ X ≤ b) = ∫ f(x) dx taken from a to b.
14 Probability density function
- Defining properties:
  - The probability density function is non-negative.
  - The integral over the whole sample space (e.g. the whole real axis) must be unity.
15 Probability density function
- The probability is not defined at a single point: it does not make sense to ask for the chance that x = 1.23 for a continuous random variable, as that chance is zero (there are infinitely many points).
16 Probability density function
- For discrete random variables, the probability at a point is equal to the probability density function evaluated at that point: P(X = x) = f(x).
- The probability between two points a and b (inclusive) is the sum of f(x) over all values x with a ≤ x ≤ b.
17 Cumulative density function
- The cumulative density function (cdf) F is related to the pdf by F(x) = P(X ≤ x), the integral of f from the lower end of the range up to x.
- Note the lower limit is the smallest value that X can take, not necessarily −∞.
18 Cumulative density function
- For discrete random variables, F(x) is the sum of f over all values not exceeding x.
- cdfs for discrete random variables are discontinuous (step functions).
19 Cumulative density function
(Figures: the cdf of a discrete random variable and the cdf of a continuous random variable.)
20 Expectation and variance of random variables
- Expectation (or mean): the integral or sum of the probability of an outcome multiplied by that outcome.
- For continuous variables, the probability of X falling in the interval (x, x + dx) is f(x) dx.
21 Expectation and variance of random variables
- The expectation is E(X) = ∫ x f(x) dx.
- The integral is taken over the whole sample space.
- Not all distributions have an expectation, since the integral may not exist, e.g. the Cauchy distribution.
22 Expectation and variance of random variables
- For discrete variables, the probability of an outcome x is f(x).
- The expectation is E(X) = Σ x f(x), summed over all possible outcomes.
23 Expectation and variance of random variables
- The expectation represents the average amount one "expects" as the outcome of the random trial when identical experiments are repeated many times.
24 Expectation and variance of random variables
- Example: the expectation of rolling a fair die is (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.
- Note that this expected value is never achieved in any single roll!
25 Expectation and variance of random variables
- Standard deviation: a measure of how a distribution is spread out relative to the mean.
- Definition: σ = √(E[(X − µ)²]), where µ is the mean.
26 Expectation and variance of random variables
- The variance is defined as the square of the standard deviation: Var(X) = σ² = E[(X − µ)²].
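A short numerical sketch (an addition) computing the mean, variance and standard deviation of a fair die directly from these definitions:

```python
from fractions import Fraction

outcomes = range(1, 7)
p = Fraction(1, 6)                                  # uniform probability for a fair die

mean = sum(x * p for x in outcomes)                 # E(X) = 7/2 = 3.5
var = sum((x - mean) ** 2 * p for x in outcomes)    # E[(X - mu)^2] = 35/12
sd = float(var) ** 0.5                              # standard deviation

print(mean, var, round(sd, 4))                      # 7/2 35/12 1.7078
```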
27 Binomial distribution
- Bernoulli experiment: the outcome is either success or failure.
- The number of successes in n independent Bernoulli experiments is governed by the binomial distribution.
- This is a distribution for a discrete random variable.
28 Binomial distribution
- Suppose we perform an experiment 4 times. What is the chance of getting three successes? (Chance of success p, chance of failure q, with p + q = 1.)
29 Binomial distribution
- Scenario
- p, p, p, q
- p, p, q, p
- p, q, p, p
- q, p, p, p
- There are 4C3 ways of placing the failure case.
30 Binomial distribution
- Thus the chance is 4 p³ q.
- For a simpler case, getting 2 heads in throwing a fair coin 3 times:
- H, H, T
- H, T, H
- T, H, H.
31 Binomial distribution
- Example: the chance of getting exactly 2 heads when a fair coin is tossed 3 times is 3C2 (1/2)² (1/2) = 3/8.
32 Binomial distribution
- The probability density function for r successes in a fixed number n of trials is P(X = r) = nCr p^r q^(n − r), for r = 0, 1, 2, ..., n,
- where r is the number of successes, p is the probability of success in each trial, and q = 1 − p.
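A minimal sketch of this formula in Python (math.comb is from the standard library; the scipy cross-check is optional and assumed to be available):

```python
import math

def binom_pmf(r, n, p):
    """P(X = r) = nCr p^r (1 - p)^(n - r)."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

# The coin example from the slides: exactly 2 heads in 3 tosses
print(binom_pmf(2, 3, 0.5))        # 0.375 = 3/8

# Optional cross-check against scipy, if it is installed
try:
    from scipy.stats import binom
    print(binom.pmf(2, 3, 0.5))    # also 0.375
except ImportError:
    pass
```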
33 Binomial distribution
34 Binomial distribution
- Methods to derive the formula E(X) = np for the binomial distribution:
- (1) Direct argument: an expected gain of p at each trial, hence a total expected gain of np in n trials.
- (2) Direct summation of the series.
- (3) Differentiate the series expansion of the binomial theorem.
35 Binomial distribution
(Figure: the probability density function.)
36 Binomial distribution
(Figure: the cumulative density function.)
37 Poisson distribution
- The Poisson distribution is a special limiting case of the binomial distribution obtained by taking n → ∞ and p → 0, while keeping the product λ = np finite.
- The probability density function is P(X = r) = e^(−λ) λ^r / r!, for r = 0, 1, 2, ...
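To see this limit numerically, the sketch below (an addition, using scipy.stats) compares a binomial with large n and small p against a Poisson with λ = np:

```python
from scipy.stats import binom, poisson

n, p = 10_000, 0.0003        # large n, small p
lam = n * p                  # lambda = np = 3, kept finite

for r in range(7):
    b = binom.pmf(r, n, p)   # exact binomial probability
    po = poisson.pmf(r, lam) # Poisson limit e^(-lam) lam^r / r!
    print(r, round(b, 6), round(po, 6))
# The two columns should agree closely.
```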
38 Poisson distribution
- Expectation of the Poisson distribution: E(X) = λ.
- Variance of the Poisson distribution: Var(X) = λ.
39 The Poisson distribution
- Physical meaning: a large number of trials (n going to infinity), with the probability of the event occurring in any single trial being quite small (p approaching zero).
- BUT (!!) the combined effect is finite (np being finite).
40 The Poisson distribution
- Examples
- (a) The number of incorrectly dialed telephone calls if you have to dial a huge number of calls.
- (b) The number of misprints in a book.
- (c) The number of accidents on a highway in a given period of time.
41 Poisson distribution
(Figure: the probability density function, which usually shows a single maximum.)
42 Poisson distribution
(Figure: the cumulative density function, which must start from zero and end up at one.)
43 Normal distribution
- The normal distribution for a continuous random variable is a bell-shaped curve with a maximum at the mean value.
- It is a special limit of the binomial distribution when the number of data points is large (i.e. n going to infinity, but without special conditions on p).
44 Normal distribution
- As such, the normal distribution is applicable to many physical problems and phenomena.
- The Central Limit Theorem in the theory of probability asserts the usefulness of the normal distribution.
45 Normal distribution
- The probability density function is f(x) = exp(−(x − µ)² / (2σ²)) / (σ √(2π)),
- where µ is the mean and σ is the standard deviation.
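A quick check of this density (a sketch added here; scipy is assumed for the numerical integration): evaluate f(x) and verify that it integrates to one.

```python
import math
from scipy.integrate import quad

mu, sigma = 2.0, 1.5

def normal_pdf(x):
    """f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

area, _ = quad(normal_pdf, -math.inf, math.inf)
print(round(area, 6))      # 1.0 -- total probability is unity
print(normal_pdf(mu))      # the maximum value, attained at the mean
```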
46 Normal distribution
- The curve is symmetric about the mean µ.
(Figure: the probability density function.)
47 Normal distribution
- For a small standard deviation, the curve is tall, sharply peaked and narrow.
- For a large standard deviation, the curve is short and widely spread out.
- (The area under the curve must sum up to one for it to be a probability density function.)
48 Normal distribution
(Figure: the cumulative density function.)
49 Normal distribution
- The cumulative density function, or the probability of a normally distributed random variable falling within the interval (a, b), is the integral of the pdf over that interval.
- Values of this integral can be found from standard tables.
50 Simple tutorial examples for the normal distribution
- It is obviously not possible to tabulate the normal distribution pdf for all values of the mean and standard deviation. In practice, we reduce, by simple scaling arguments, every normal distribution problem to one with mean zero and standard deviation one. (Notation: N(µ, σ²); the standard case is N(0, 1).)
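A small sketch of this scaling (an addition; norm.cdf from scipy.stats is assumed): probabilities for a general N(µ, σ²) variable are read off the standard N(0, 1) table via Z = (X − µ)/σ.

```python
from scipy.stats import norm

mu, sigma = 50.0, 5.0              # an N(mu, sigma^2) random variable
x = 59.5

# Reduce to the standard normal N(0, 1)
z = (x - mu) / sigma                            # 1.9
p_standard = norm.cdf(z)                        # value from the standard table
p_direct = norm.cdf(x, loc=mu, scale=sigma)     # same probability, computed directly

print(round(z, 2), round(p_standard, 4), round(p_direct, 4))
```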
51 The normal approximation to the binomial distribution
- In many situations, the binomial distribution formulation is impractical, as the computation of the factorial terms is problematic.
- The normal distribution provides a good approximation to the binomial distribution.
52 The normal approximation to the binomial distribution
- Example: the chance of getting exactly 59 heads in tossing a fair coin 100 times.
- The exact formulation is 100C59 (1/2)^59 (1/2)^41,
- but it is difficult to calculate 100!.
53 Normal distribution
- Instead we use the normal distribution (a continuous random variable (rv)) to approximate the binomial distribution (a discrete rv).
54 The normal approximation to the binomial distribution
- We use the mean (np) and variance (npq) of the binomial distribution as the corresponding parameters of the normal distribution.
- We use an interval of length one to cover every integer, e.g. to cover the integer 59, we use the interval (58.5, 59.5).
55 Normal distribution
- Set µ = np = 50 and σ = √(npq) = 5.
- Form the standard variable Z = (X − µ)/σ; the interval (58.5, 59.5) then corresponds to Z between 1.7 and 1.9.
56 Normal distribution
- Find the probability of this range of Z from tables.
- The value obtained from the binomial formulation is 0.0159 (the two agree to three decimal places).
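The comparison can be reproduced with the short sketch below (an addition, scipy.stats assumed): the exact binomial probability of exactly 59 heads versus the normal approximation over (58.5, 59.5).

```python
from math import sqrt
from scipy.stats import binom, norm

n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))                        # 50 and 5

exact = binom.pmf(59, n, p)                                     # exact binomial value
approx = norm.cdf(59.5, mu, sigma) - norm.cdf(58.5, mu, sigma)  # with continuity correction

print(round(exact, 4), round(approx, 4))                        # both about 0.016
```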
57 Normal / binomial distributions
- (For your information) Class example on university admission.
- Yield rate = (number of students who actually attend) / (number of offers or admission letters sent to students).
- It varies from year to year. Even Harvard has a yield ratio of only about 0.6 to 0.8.
58 Normal distribution
- Consider a large state university with a yield ratio of, say, 0.3.
- It will send out 450 offers or letters of admission.
- What is the chance of more than 150 students actually coming to campus (i.e. it cannot accommodate beyond this limit of 150)?
59 Normal distribution
- The exact binomial formulation: sum 450Cr (0.3)^r (0.7)^(450 − r) over r from 151 to 450.
- But (a) 450! is too large and (b) it is a sum of 300 terms!
60 Normal distribution
- Use (150.5, 151.5) for r = 151,
- (151.5, 152.5) for r = 152,
- (152.5, 153.5) for r = 153, and so on.
- With n = 450, p = 0.3:
- z = (150.5 − (450)(0.3)) / √((450)(0.3)(0.7))
- ≈ 1.59.
61 The normal approximation to the binomial distribution
- The upper limit of 450.5 can effectively be taken as positive infinity. Thus we need to find the area of the normal curve between 1.59 and infinity. From tables this area is 0.0559. Hence the chance of 151 or more admitted students actually coming to campus is 0.0559.
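This calculation is easy to verify with a short sketch (an addition; scipy.stats assumed): the exact binomial tail P(X ≥ 151) versus the normal tail beyond z = 1.59.

```python
from math import sqrt
from scipy.stats import binom, norm

n, p = 450, 0.3
mu, sigma = n * p, sqrt(n * p * (1 - p))   # 135 and about 9.72

exact = binom.sf(150, n, p)                # exact P(X >= 151)
z = (150.5 - mu) / sigma                   # about 1.59, with continuity correction
approx = norm.sf(z)                        # upper-tail area of the standard normal

print(round(exact, 4), round(approx, 4))   # both close to 0.056
```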
62 Chi-squared distribution
- The chi-squared distribution is a distribution for continuous random variables.
- It is commonly used in statistical significance tests.
63 Chi-squared distribution
- If Z1, Z2, ..., Zk are independent and identically distributed random variables which follow the standard normal distribution, then
- Q = Z1² + Z2² + ... + Zk²
- has a chi-squared distribution with k degrees of freedom.
64 Chi-squared distribution
- The probability density function is f(x) = x^(k/2 − 1) e^(−x/2) / (2^(k/2) Γ(k/2)) for x > 0,
- where Γ is the gamma function.
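A simulation sketch of the definition (an addition; numpy and scipy assumed): sums of k squared standard normals have mean k and variance 2k, matching the chi-squared distribution.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
k = 4                                          # degrees of freedom

# Sum of squares of k independent N(0, 1) variables, repeated many times
samples = (rng.standard_normal((100_000, k)) ** 2).sum(axis=1)

print(round(samples.mean(), 2), round(samples.var(), 2))   # near 4 and 8
print(chi2.mean(k), chi2.var(k))                           # exact values: 4.0 and 8.0
```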
65 Chi-squared distribution
(Figure: the pdf.)
66 Chi-squared distribution
(Figure: the cdf.)
67 Sum of random variables
- Consider the problem of throwing a die twice. What is the chance that the sum of the two outcomes is 7? The favourable outcomes are (1,6), (2,5), (3,4), (4,3), (5,2), (6,1), i.e. 6 outcomes out of 36 possible ones, giving a chance of 6/36 = 1/6.
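A tiny enumeration check of this count (an addition):

```python
from itertools import product

rolls = list(product(range(1, 7), repeat=2))            # all 36 outcomes
favourable = sum(1 for a, b in rolls if a + b == 7)     # 6 of them
print(favourable, len(rolls), favourable / len(rolls))  # 6 36 0.1666...
```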
68 Sum of continuous r. v.
- Now consider a more complicated problem: finding the probability density function of the sum of two continuous random variables.
69 Sum of normal r. v.
- Suppose Z = X + Y and each of X, Y is N(µ, σ²). We consider the simpler case of N(0, 1) first. Suppose Z is to attain a value of z; if X takes the value u, then Y MUST take the value z − u, and we then integrate over u from negative infinity to plus infinity (a convolution).
70 Sum of normal r. v.
- On calculating the integrals, Z is found to follow N(0, 2). In general, if
- X ~ N(µ1, σ1²) and
- Y ~ N(µ2, σ2²) are independent, then
- X + Y ~ N(µ1 + µ2, σ1² + σ2²).
71 Linearity of normal r. v.
- Suppose Z = aX + b, where X is N(µ, σ²) and a, b are scalars. Then
- (a) Mean of Z = aµ + b
- (b) Variance of Z = a²σ²
72 Sum of normal r. v.
- (a) The mean is just shifted according to this linear scaling.
- (b) b does NOT affect the variance of Z. This makes sense, as b is just a translation of the data and should not affect how the data are spread out. Note also that a² (not a) is involved.
73 A sequence of random variables
- Now consider the problem of doing a series of experiments, and assume the outcome of each experiment is random. Alternatively, we are collecting a large number of data points, and we assume each data point can be considered as the outcome of a random experiment (e.g. asking for information in a census).
74 Sequence of random variables
- Now consider a sequence of n random variables (e.g. throwing a die n times, doing the experiment n times, or asking for the age of n residents in a census, etc.). Each outcome is a random variable Xr, r = 1, 2, 3, ..., n.
75 The Sample Mean (Careful!!)
- The sample mean is defined by X̄ = (X1 + X2 + ... + Xn)/n.
- The sample mean is a random variable itself!!!
76 The Sample Variance (Careful)
- The sample variance is defined by S² = Σ (Xr − X̄)² / (n − 1), with the sum taken over r = 1, ..., n.
- Note the denominator is n − 1, to obtain an unbiased estimate.
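A small numpy sketch of both quantities (an addition; the data values are hypothetical, and ddof=1 gives the n − 1 denominator used above):

```python
import numpy as np

data = np.array([71.2, 69.8, 72.5, 70.1, 73.0, 68.9])   # hypothetical sample

xbar = data.mean()          # sample mean, itself a random variable
s2 = data.var(ddof=1)       # sample variance with the n - 1 denominator
s = data.std(ddof=1)        # sample standard deviation S

print(round(xbar, 3), round(s2, 3), round(s, 3))
```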
77 Unbiased Estimator
- A function or expression of random variables is an UNBIASED ESTIMATOR of a quantity if its expectation (mean) gives the true value of that quantity, e.g. the sample mean is an unbiased estimator of the population mean.
78 Mean and S.D. of the Sample Mean
- Since all the Xr are normally distributed as N(µ, σ²), the mean and variance of the sample mean are µ and σ²/n respectively, i.e. X̄ ~ N(µ, σ²/n).
79 t-distribution
- Arises in the problem of estimating the mean of a normally distributed population when the standard deviation is unknown.
- The random variable T = (X̄ − µ) / (S/√n)
- follows a t-distribution with n − 1 degrees of freedom.
80 t-distribution
- The probability density function is f(t) = [Γ((k + 1)/2) / (√(kπ) Γ(k/2))] (1 + t²/k)^(−(k + 1)/2),
- with k as the degrees of freedom.
81 t-distribution
(Figure: the pdf.)
82 t-distribution
(Figure: the cdf.)
83 Hypothesis testing
- Example 1
- Sample space: all cars in America.
- Statement (hypothesis): 30% of them are trucks.
84 Hypothesis testing
- It is impossible (impractical) to examine all cars in the country.
- Test a sample of cars, e.g. find 500 cars in a random manner. If close to 30% of them are trucks, accept the claim.
85 Hypothesis testing
- Example 2
- Sample space: all students at HKU.
- Statement (hypothesis): the average balance of their bank accounts is 100 dollars.
86 Hypothesis testing
- There is not enough time and money to ask all students. They might not tell you the truth anyway.
- Test a sample of students, e.g. find 50 students in a random manner. If the statement holds for the sample, accept the claim.
87 Hypothesis testing
- The original hypothesis is also known as the null hypothesis, denoted by H0.
- Null hypothesis, H0: µ = a given value.
- Alternative hypothesis, H1: µ ≠ the given value.
88 Hypothesis testing
- Type I error
- The probability that we reject the null hypothesis when it is true.
- Type II error
- The probability that we accept the null hypothesis when it is false (other alternatives are true).
89 Hypothesis testing
- Class Example A. Claim: 60% of all households in a city buy milk from company A. Choose a random sample of 10 families; if 3 or fewer families buy milk from company A, reject the claim.
- H0: p = 0.6 versus H1: p < 0.6
90 Hypothesis testing
- One-sided test (µ0 a given value):
- H0: µ = µ0 versus H1: µ < µ0
- H0: µ = µ0 versus H1: µ > µ0
- Two-sided test:
- H0: µ = µ0 versus H1: µ ≠ µ0
91 Hypothesis testing
- Implication in terms of finding the area under the normal curve:
- For a 1-sided test, find the area in one tail only.
- For a 2-sided test, the area in both tails must be accounted for.
92 Hypothesis testing
- Probability model: binomial distribution.
- Type I error: rejecting the null hypothesis even though it is true, i.e. (we are so unfortunate in picking the data that) 3 or fewer families buy milk from company A, even though p is actually 0.6.
93 Hypothesis testing
- That very small chance of picking such unfortunate, far-from-the-mean data is called the LEVEL OF SIGNIFICANCE.
94 Hypothesis testing
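A minimal sketch of this Type I error calculation for Class Example A (an addition, using scipy.stats):

```python
from scipy.stats import binom

# Class Example A: reject H0 (p = 0.6) if 3 or fewer of the 10 families buy milk
n, p0 = 10, 0.6
type_I = binom.cdf(3, n, p0)   # P(X <= 3) when H0 is actually true

print(round(type_I, 4))        # about 0.055 -- the level of significance
```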
95 Hypothesis testing
- Type II error: accepting the null hypothesis when the alternative is true. Usually we cannot do much here, as we need to fix a value of p before we can compute a binomial distribution.
96 Hypothesis testing
- A simple case of p = 0.3 is illustrated here.
- Hence the chance that the alternative is rejected (hence accepting the null hypothesis) is the probability of observing 4 or more such families when p = 0.3.
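A companion sketch for this Type II error when the true p is 0.3 (an addition, same assumptions):

```python
from scipy.stats import binom

n, p_true = 10, 0.3
# H0 is accepted when 4 or more families buy milk, even though p is really 0.3
type_II = binom.sf(3, n, p_true)   # P(X >= 4) when p = 0.3

print(round(type_II, 4))           # about 0.35
```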
97 Hypothesis testing
- The previous example utilizes the binomial distribution. Let us consider one where we need to use the normal approximation to the binomial.
98 Hypothesis testing
- Class Example B. A drug is only 25% effective. For a trial with 100 patients, the doctors will believe that the drug is more than 25% effective if 33 or more patients show improvement.
99 Hypothesis testing
- What is the chance that the doctors will (falsely) endorse the drug even though it is really only 25% effective? i.e. what is the chance that we get such a "good" group of patients that many of them improve on their own?
100 Hypothesis testing
- For the binomial distribution, we sum
- 100Cr (0.25)^r (0.75)^(100 − r)
- over r = 33 to 100.
101 Hypothesis testing
- We use the normal approximation and consider
- z = (32.5 − 100(0.25)) / √(100(0.25)(0.75))
- ≈ 1.732.
102 Hypothesis testing
- We then find the area of the normal curve to the right of 1.732 (as the upper limit of 100.5 is effectively infinity). That will be the Type I error.
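A sketch of this tail-area computation (an addition; scipy.stats assumed), together with the exact binomial value for comparison:

```python
from math import sqrt
from scipy.stats import binom, norm

n, p0 = 100, 0.25
mu, sigma = n * p0, sqrt(n * p0 * (1 - p0))   # 25 and about 4.33

z = (32.5 - mu) / sigma                       # about 1.732
type_I_normal = norm.sf(z)                    # area to the right of 1.732
type_I_exact = binom.sf(32, n, p0)            # exact P(X >= 33) under H0

print(round(z, 3), round(type_I_normal, 4), round(type_I_exact, 4))
```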
103 Hypothesis testing
- In practice we work in reverse. We fix the magnitude of the Type I error, i.e. the level of significance, and then determine the threshold number of patients required for endorsing the drug.
104 Hypothesis testing
- Probably the most important application is to test hypotheses involving the sample mean. The standard deviation may or may not be known (the more realistic case is that it is unknown).
105 Hypothesis testing
- If the standard deviation of the whole population is known, then the standard variable is z = (X̄ − µ) / (σ/√n).
106 Hypothesis testing
- This is neither practical nor reasonable, as the standard deviation of the whole population is usually unknown.
- The SAMPLE standard deviation is used in the standard variable in this case: T = (X̄ − µ) / (S/√n).
107 Hypothesis testing
- S is the sample standard deviation, obtained by taking the square root of the sample variance.
- Use the t-distribution instead of normal distribution tables.
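A sketch of such a t-based test (an addition; the data values are hypothetical and scipy.stats is assumed): compare the t statistic with the one-sided 5% critical value for n − 1 degrees of freedom.

```python
import numpy as np
from scipy.stats import t

# Hypothetical sample: test H0: mu = 70 versus H1: mu > 70
data = np.array([72.1, 69.4, 74.0, 71.8, 68.9, 73.2, 70.5, 72.6])
n = len(data)

xbar = data.mean()
s = data.std(ddof=1)                       # sample standard deviation S
t_stat = (xbar - 70) / (s / np.sqrt(n))    # uses S, so this is a t statistic

t_crit = t.ppf(0.95, df=n - 1)             # one-sided 5% critical value
print(round(t_stat, 3), round(t_crit, 3), t_stat > t_crit)
```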
108 Hypothesis testing
- Class Example C
- CLAIM: the life expectancy is 70 years in a metropolitan area.
- In a city, from an examination of the death records of 100 persons, the average life span is 71.8 years.
109 Hypothesis testing
- i.e. you have actually recorded the 100 data points, added them together and divided by 100 to get the sample mean of 71.8.
110 Hypothesis testing
- H0: µ = 70 versus
- H1: µ > 70
- Using a level of significance of 0.05, i.e.
- z = (X̄ − µ) / (σ/√n)
- must be compared with 1.645.
111 Hypothesis testing
- For the present example, assume σ is known to be 8.9. Then
- z = (71.8 − 70) / (8.9/√100)
- ≈ 2.02.
- As 2.02 > 1.645,
- reject H0: the life span is greater than 70 years.
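A sketch reproducing this z-test (an addition; scipy is used only to look up the normal-table values):

```python
from math import sqrt
from scipy.stats import norm

xbar, mu0, sigma, n = 71.8, 70.0, 8.9, 100

z = (xbar - mu0) / (sigma / sqrt(n))   # about 2.02
z_crit = norm.ppf(0.95)                # 1.645 for a one-sided test at the 5% level
p_value = norm.sf(z)                   # area to the right of z

print(round(z, 2), round(z_crit, 3), round(p_value, 4))
# z exceeds 1.645, so H0 is rejected at the 5% level.
```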
112 Hypothesis testing
- Testing a hypothesis is DIFFERENT from solving a differential equation, e.g. solving
- dy/dx = y, y(0) = 1.
- Once you identify y = exp(x), that is the exact solution beyond all doubt.
113 Hypothesis testing
- Nobody can argue with you regarding the true solution of the differential equation.
- In hypothesis testing, we do NOT prove that the mean is a certain value. We just assert that the data are CONSISTENT with that claim.