Transcript and Presenter's Notes

Title: Basic Probability and Statistics


1
Basic Probability and Statistics
  • Random variables
  • Distribution functions
  • Various probability distributions

2
Definitions
  • An experiment is a process whose output is not
    known with certainty.
  • The set of all possible outcomes of an experiment
    is called the sample space (S).
  • The outcomes are called sample points in S.
  • A random variable is a function that assigns a
    real number to each point in S.
  • A distribution function F(x) of the random
    variable X is defined for each real number x as
    F(x) = Pr(X ≤ x), for -∞ < x < ∞.

3
Properties of distribution function
  • 0 ≤ F(x) ≤ 1 for all x.
  • F(x) is nondecreasing: if x1 < x2, then F(x1) ≤ F(x2).
  • F(x) → 0 as x → -∞, and F(x) → 1 as x → ∞.
4
Random Variables
  • A random variable (r.v.) X is discrete if it can
    take on at most a countable number of values x1,
    x2, x3, …
  • The probability that the discrete r.v. X takes on
    a value xi is given by p(xi) = Pr(X = xi).
  • p(x) is called the probability mass function.

5
Random Variables
  • A r.v. is said to be continuous if there exists a
    nonnegative function f(x) such that for any set
    of real numbers B,
    Pr(X ∈ B) = ∫B f(x) dx, with ∫ f(x) dx over all x equal to 1.
  • f(x) is called the probability density function.

6
Random Variables
  • The mean or expected value of a r.v. X is denoted by
    E[X] or µ, and is given by
    E[X] = Σi xi p(xi) if X is discrete, or E[X] = ∫ x f(x) dx if X is continuous.
  • The variance of a r.v. X is denoted by Var(X) or σ²,
    and is given by
    Var(X) = E[(X - µ)²] = E[X²] - µ² (a small numerical sketch follows).
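
A minimal Python sketch (not part of the original slides) that computes E[X] and Var(X) for a discrete r.v. directly from a pmf; the pmf values are purely illustrative.

```python
# Illustrative pmf for a discrete r.v. X taking the values 1, 2, 3.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

mean = sum(x * p for x, p in pmf.items())                # E[X] = sum of x * p(x)
var = sum((x - mean) ** 2 * p for x, p in pmf.items())   # Var(X) = E[(X - mu)^2]
print(mean, var)
```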

7
Properties of mean
  • If X is a discrete random variable having pmf
    p(x), then
  • If X is continuous with pdf f(x), then
  • Hence, for constants a and b, E[aX + b] = aE[X] + b.

8
Property of variance
  • For constants a and b, Var(aX + b) = a² Var(X).

9
Joint Distribution
  • If X and Y are discrete r.v.s, then
    p(x, y) = Pr(X = x, Y = y)
    is called the joint probability mass function of X and Y.
  • Marginal probability mass functions of X and Y:
    pX(x) = Σy p(x, y) and pY(y) = Σx p(x, y).
  • X and Y are independent if p(x, y) = pX(x) pY(y) for all x, y.

10
Conditional probability
  • Let A and B be two events.
  • Pr(A|B) is the conditional probability of event A
    happening given that B has already occurred.
  • Bayes' theorem: Pr(A|B) = Pr(A ∩ B) / Pr(B) = Pr(B|A) Pr(A) / Pr(B).
  • If events A and B are independent, then Pr(A|B) = Pr(A).
  • Hence, from Bayes' theorem, Pr(A ∩ B) = Pr(A) Pr(B) for
    independent events.

11
Dependency
  • Covariance is a measure of linear dependence and
    is denoted by Cij or Cov(Xi, Xj):
    Cov(Xi, Xj) = E[(Xi - µi)(Xj - µj)] = E[Xi Xj] - µi µj.
  • Another measure of linear dependency is the
    correlation factor ρij = Cov(Xi, Xj) / (σi σj).
  • The correlation factor is dimensionless, but the
    covariance is not (see the sketch below).
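
An illustrative Python computation of sample covariance and the dimensionless correlation between two sequences, using statistics.covariance and statistics.correlation (available in Python 3.10+); the data are arbitrary example values.

```python
from statistics import covariance, correlation

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]      # roughly 2 * x, so correlation is close to 1

print(covariance(x, y))            # carries the units of x times the units of y
print(correlation(x, y))           # dimensionless, always lies in [-1, 1]
```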

12
Two random numbers in simulation experiment
  • Let X and Y be two random variates in a given
    simulation experiment that are not independent.
  • Our performance parameter is XY.
  • However, if the two r.v.s are independent

13
Bernoulli trial
  • An experiment with only two outcomes, "success"
    and "failure", where the chance of each outcome is
    known a priori.
  • It is characterized by the chance of success p (this is the
    parameter of the distribution).
  • Example: tossing a fair coin.
  • Let us define a variable Xi such that Xi = 1 if the
    trial is a success and Xi = 0 if it is a failure.
  • Then E[Xi] = p and Var(Xi) = p(1 - p) (see the sketch below).
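
A minimal Python sketch (not in the original slides) that simulates Bernoulli trials and compares the sample mean and variance with p and p(1 - p); the value p = 0.3 and the number of trials are arbitrary illustrations.

```python
import random

p, n = 0.3, 100_000
xs = [1 if random.random() < p else 0 for _ in range(n)]   # n Bernoulli(p) trials

mean = sum(xs) / n                                   # should be close to p
var = sum((x - mean) ** 2 for x in xs) / (n - 1)     # should be close to p * (1 - p)
print(f"sample mean {mean:.4f} vs p = {p}")
print(f"sample variance {var:.4f} vs p(1-p) = {p * (1 - p):.4f}")
```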

14
Binomial r.v.
  • A series of n independent Bernoulli trials.
  • If X is the number of successes that occur in the
    n trials, then X is said to be a Binomial r.v. with
    parameters (n, p). Its probability mass function is
    p(x) = C(n, x) p^x (1 - p)^(n - x),  x = 0, 1, …, n,
    where C(n, x) = n! / (x!(n - x)!) (see the sketch below).
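
As a sketch of the pmf above, the following Python snippet evaluates Pr(X = x) for a Binomial(n, p) r.v. with math.comb; the parameter values are illustrative.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Pr(X = x) for X ~ Binomial(n, p): C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Example: probability of exactly 3 successes in 10 trials with p = 0.5.
print(binomial_pmf(3, 10, 0.5))   # 0.1171875
```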

15
Binomial r.v.
16
Poisson r.v.
  • A r.v. X which can take the values 0, 1, 2, … is said
    to have a Poisson distribution with parameter λ
    (λ > 0) if the pmf is given by
    p(x) = e^(-λ) λ^x / x!,  x = 0, 1, 2, …
  • For a Poisson r.v., E[X] = Var(X) = λ.
  • The probabilities can be found recursively:
    p(0) = e^(-λ) and p(x + 1) = λ p(x) / (x + 1), as sketched below.
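
A short Python sketch of that recursion, starting from p(0) = e^(-λ); λ = 2.5 is an arbitrary example value.

```python
from math import exp

lam = 2.5                      # illustrative Poisson rate
probs = [exp(-lam)]            # p(0) = e^(-lambda)
for x in range(10):            # p(1)..p(10) via p(x + 1) = lam * p(x) / (x + 1)
    probs.append(probs[-1] * lam / (x + 1))

print(probs[:4])               # p(0), p(1), p(2), p(3)
print(sum(probs))              # partial sum; approaches 1 as more terms are added
```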

17
Uniform r.v.
  • A r.v. X is said to be uniformly distributed over
    the interval (a, b) when its pdf is
    f(x) = 1 / (b - a) for a < x < b, and 0 otherwise.
  • Expected value: E[X] = (a + b) / 2.

18
Uniform r.v.
  • Variance: Var(X) = (b - a)² / 12.
  • The distribution function F(x) for a given x, a < x < b, is
    F(x) = (x - a) / (b - a).

19
Normal r.v.
  • pdf: f(x) = (1 / (σ√(2π))) e^(-(x - µ)² / (2σ²)),  -∞ < x < ∞.
  • The normal density is a bell-shaped curve that is
    symmetric about µ.
  • It can be shown that for a normal r.v. X with
    parameters (µ, σ²), E[X] = µ and Var(X) = σ².

20
Normal r.v.
  • If X ~ N(µ, σ²), then Z = (X - µ) / σ is N(0, 1).
  • The probability distribution function of the standard
    normal is given as Φ(z) = Pr(Z ≤ z).
  • If X ~ N(µ, σ²), then Pr(X ≤ x) = Φ((x - µ) / σ)
    (see the sketch below).
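
A small Python check of the standardization step using statistics.NormalDist: Pr(X ≤ x) for X ~ N(µ, σ²) equals Φ((x - µ)/σ). The parameter values are illustrative.

```python
from statistics import NormalDist

mu, sigma = 10.0, 2.0              # illustrative mean and standard deviation
x = 13.0

z = (x - mu) / sigma               # standardize: Z = (X - mu) / sigma ~ N(0, 1)
print(NormalDist().cdf(z))             # Pr(X <= x) via the standard normal CDF
print(NormalDist(mu, sigma).cdf(x))    # the same probability computed directly
```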

21
Central Limit Theorem
  • Let X1, X2, …, Xn be a sequence of IID random
    variables having a finite mean µ and finite
    variance σ². Then the distribution of
    (X̄(n) - µ) / √(σ²/n) converges to the standard
    normal N(0, 1) as n → ∞ (illustrated below).
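
An illustrative Python simulation of the theorem: standardized sample means of IID exponential draws (a skewed distribution) look approximately N(0, 1) once n is moderately large. All numbers are assumptions chosen for the demo.

```python
import random
from statistics import NormalDist, fmean

lam, n, reps = 1.0, 50, 20_000          # Exp(1) has mu = 1 and sigma^2 = 1
mu, sigma2 = 1 / lam, 1 / lam**2

# Standardized sample means Z_n = (Xbar(n) - mu) / sqrt(sigma^2 / n)
zs = []
for _ in range(reps):
    xbar = fmean(random.expovariate(lam) for _ in range(n))
    zs.append((xbar - mu) / (sigma2 / n) ** 0.5)

# Empirical proportion below 1.0 versus the standard normal CDF at 1.0.
print(sum(z <= 1.0 for z in zs) / reps)   # roughly 0.84
print(NormalDist().cdf(1.0))              # 0.8413...
```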

22
Exponential r.v.
  • pdf: f(x) = λ e^(-λx) for x ≥ 0 (rate λ > 0), and 0 otherwise.
  • cdf: F(x) = 1 - e^(-λx) for x ≥ 0.

23
Exponential r.v.
  • When multiplied by a positive constant, it still remains
    an exponential r.v. (with a different rate).
  • Most useful property: it is memoryless, i.e.,
    Pr(X > s + t | X > s) = Pr(X > t) (see the check below).
  • Analytical simplicity.
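
A tiny Python check of the memoryless property via the survival function Pr(X > x) = e^(-λx); the rate and times are illustrative.

```python
from math import exp

lam, s, t = 0.5, 2.0, 3.0          # illustrative rate and time points

def survival(x: float) -> float:
    """Pr(X > x) for X ~ Exponential(lam)."""
    return exp(-lam * x)

# Memorylessness: Pr(X > s + t | X > s) equals Pr(X > t).
print(survival(s + t) / survival(s))   # conditional survival probability
print(survival(t))                     # same value, e^(-lam * t)
```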

24
Poisson process
25
Useful property of Poisson process
  • Let S11 denote the time of the first event of the
    first Poisson process (with rate λ1), and S12
    denote the time of the first event of the second
    Poisson process (with rate λ2). Then
    Pr(S11 < S12) = λ1 / (λ1 + λ2) (simulated below).
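
A Monte Carlo sketch of this property: for independent processes, the first event times are exponential with rates λ1 and λ2, and the simulated frequency of S11 < S12 should approach λ1/(λ1 + λ2). The rates are illustrative.

```python
import random

lam1, lam2, reps = 2.0, 3.0, 100_000      # illustrative rates
hits = sum(
    random.expovariate(lam1) < random.expovariate(lam2)   # is S11 < S12 ?
    for _ in range(reps)
)
print(hits / reps)                 # simulated probability
print(lam1 / (lam1 + lam2))        # theoretical value, 0.4
```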

26
Covariance stationary processes
  • The covariance between two observations Xi and Xi+j
    depends only on the lag j and not on i.
  • Let Cj be the covariance at lag j for this process.
  • So the correlation factor is given by ρj = Cj / C0,
    where C0 is the common variance of the observations.

27
Point Estimation
  • Let X1, X2, …, Xn be a sequence of IID random
    variables (observations) having a finite
    population mean µ and finite population variance
    σ².
  • We are interested in estimating these population
    parameters from the sample values.
  • The sample mean X̄(n) = (X1 + X2 + … + Xn) / n is an
    unbiased point estimator of µ.
  • That is to say, E[X̄(n)] = µ.

28
Point Estimation
  • The sample variance
    S²(n) = Σi (Xi - X̄(n))² / (n - 1)
    is an unbiased point estimator of σ².
  • Variance of the mean: Var(X̄(n)) = σ² / n.
  • We can estimate this variance of the mean by S²(n) / n
    (see the sketch below).
  • This is true only if X1, X2, …, Xn are IID.
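
A small Python sketch of these estimators: the sample mean, the sample variance S²(n) (which divides by n - 1), and S²(n)/n as an estimate of Var(X̄(n)). The data are just an illustrative IID sample.

```python
import random
from statistics import fmean, variance

random.seed(1)
xs = [random.gauss(5.0, 2.0) for _ in range(200)]   # illustrative IID N(5, 4) sample

n = len(xs)
xbar = fmean(xs)            # sample mean, unbiased estimator of mu
s2 = variance(xs)           # sample variance S^2(n), uses the n - 1 denominator
var_of_mean = s2 / n        # estimate of Var(Xbar(n)) = sigma^2 / n (valid for IID data)

print(xbar, s2, var_of_mean)
```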

29
Point Estimation
  • However, most often in a simulation experiment, the
    data are correlated.
  • In that case, estimation using the sample variance is
    dangerous, because it underestimates the actual
    population variance.

30
Interval Estimation
  • Let X1, X2, …, Xn be a sequence of IID random
    variables (observations) having a finite
    population mean µ and finite population variance
    σ² (> 0).
  • We want to construct a confidence interval for the
    mean µ.
  • Let Zn = (X̄(n) - µ) / √(σ²/n) be a random variable with
    probability distribution Fn(z).

31
Interval Estimation
  • The Central Limit Theorem states that Fn(z) → Φ(z) as
    n → ∞, where Φ(z) is the distribution function of the
    standard normal with mean 0 and variance 1.
  • Often, we don't know the population variance σ².
  • It can be shown that the CLT still applies if we replace
    σ² by the sample variance S²(n).
  • The resulting variable tn = (X̄(n) - µ) / √(S²(n)/n) is
    approximately standard normal as n increases.

32
Standard Normal distribution
  • The standard normal distribution is N(0, 1).
  • The cumulative distribution function (CDF) at any
    given value z, Φ(z), can be found using standard
    statistical tables.
  • Conversely, if we know the probability, we can
    compute the corresponding value of z such that
    Φ(z) = 1 - α/2.
  • This value is z_(1-α/2) and is called the critical
    point for N(0, 1).
  • Similarly, the other critical point is z_(α/2) =
    -z_(1-α/2), such that Φ(z_(α/2)) = α/2 (see the sketch below).
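
A quick Python illustration of reading off the critical point: z_(1-α/2) is the value whose standard normal CDF equals 1 - α/2, obtainable with NormalDist().inv_cdf instead of a table. α = 0.05 is an example level.

```python
from statistics import NormalDist

alpha = 0.05
z_hi = NormalDist().inv_cdf(1 - alpha / 2)   # z_(1-alpha/2), about 1.96
z_lo = -z_hi                                  # the other critical point, z_(alpha/2)

print(z_hi, z_lo)
print(NormalDist().cdf(z_hi))                 # recovers 1 - alpha/2 = 0.975
```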

33
Interval Estimation
  • It follows that, for large n,
    Pr(-z_(1-α/2) ≤ tn ≤ z_(1-α/2)) ≈ 1 - α.

34
Interval Estimation
  • Therefore, if n is sufficiently large, an
    approximate 100(1 - α) percent confidence interval
    for µ is given by
    X̄(n) ± z_(1-α/2) √(S²(n)/n)  (computed in the sketch below).
  • If we construct a large number of independent
    100(1 - α) percent confidence intervals, each based
    on n different observations (n sufficiently
    large), the proportion of these confidence
    intervals that contain µ should be 1 - α.
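
A sketch in Python of the large-sample interval X̄(n) ± z_(1-α/2)·√(S²(n)/n); the data and confidence level are illustrative.

```python
import random
from statistics import NormalDist, fmean, variance

random.seed(2)
xs = [random.expovariate(0.2) for _ in range(100)]   # illustrative sample, true mean 5

n, alpha = len(xs), 0.05
xbar, s2 = fmean(xs), variance(xs)
half_width = NormalDist().inv_cdf(1 - alpha / 2) * (s2 / n) ** 0.5

print(f"approximate 95% CI for mu: ({xbar - half_width:.3f}, {xbar + half_width:.3f})")
```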

35
Interval Estimation
  • What if n is not sufficiently large?
  • If the Xi's are normal random variables, the random
    variable tn has a t-distribution with n - 1 degrees
    of freedom.
  • In this case, the 100(1 - α) percent confidence
    interval for µ is given by
    X̄(n) ± t_(n-1, 1-α/2) √(S²(n)/n)  (see the sketch below).
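
The same construction with the t critical point t_(n-1, 1-α/2); this sketch assumes scipy is available (scipy.stats.t.ppf), and the data are illustrative.

```python
import random
from statistics import fmean, variance
from scipy.stats import t

random.seed(3)
xs = [random.gauss(5.0, 2.0) for _ in range(15)]     # small illustrative sample

n, alpha = len(xs), 0.05
xbar, s2 = fmean(xs), variance(xs)
half_width = t.ppf(1 - alpha / 2, df=n - 1) * (s2 / n) ** 0.5   # t-based half width

print(f"95% t confidence interval: ({xbar - half_width:.3f}, {xbar + half_width:.3f})")
```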

36
Interval Estimation
  • In practice, the distribution of the Xi's is rarely
    normal, and the confidence interval (with the
    t-distribution) will be approximate.
  • Also, the CI built with t is larger than the one
    built with z.
  • Hence, it is recommended that we use the CI with
    t. Why?
  • However, as n grows, the t-based and z-based intervals
    converge, since t_(n-1, 1-α/2) approaches z_(1-α/2).

37
Interval Estimation
  • The confidence level has a long-run relative
    frequency interpretation.
  • The unknown population mean µ is a fixed number.
  • A confidence interval constructed from any
    particular sample either does or does not contain
    µ.
  • However, if we repeatedly select random samples
    of that size and each time construct a
    confidence interval with, say, 95% confidence,
    then in the long run 95% of the CIs would
    contain µ.
  • This happens because 95% of the time the sample
    mean falls close enough to µ for the interval to cover it.
  • So 95% of the time, the inference about µ is
    correct.

38
Interval Estimation
  • Every time we take a new sample of the same size,
    the confidence interval is going to be a little
    different from the previous one.
  • This is because the sample mean varies from
    sample to sample.
  • In practice, however, we select just one sample
    of fixed size n and construct one confidence
    interval using the observations in that sample.
  • We do not know whether any particular CI truly
    contains µ.
  • Our 95% confidence in that interval is based on the
    long-term properties of the procedure.

39
Hypotheses testing
  • Assume that X1, X2, …, Xn are normally
    distributed (or approximately normal) and that
    we would like to test whether µ = µ0, where µ0 is
    a fixed hypothesized value of µ.
  • If |X̄(n) - µ0| is large, then our hypothesis µ = µ0
    is probably not true.
  • To conduct such a test (of whether the hypothesis is
    true or not), we need a statistic whose
    distribution is known when the hypothesis
    is true.
  • It turns out that if our hypothesis is true (µ = µ0),
    then the statistic tn = (X̄(n) - µ0) / √(S²(n)/n) has a
    t-distribution with n - 1 degrees of freedom (df).

40
Hypotheses testing
  • We form our two-tailed test of the hypothesis H0: µ = µ0
    as follows: reject H0 if |tn| > t_(n-1, 1-α/2), and fail to
    reject H0 otherwise (see the sketch below).
  • The portion of the real line that corresponds to the
    rejection of H0 is called the critical region for
    the test.
  • The probability that the statistic tn falls in
    the critical region given that H0 is true, which
    is clearly equal to α, is called the level of the
    test.
  • Typically, if tn doesn't fall in the rejection
    region, we do not reject H0.
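
A sketch of the statistic tn = (X̄(n) - µ0)/√(S²(n)/n) and the rejection rule |tn| > t_(n-1, 1-α/2), with scipy's one-sample t-test as a cross-check (assuming scipy is available); the data, µ0, and α are illustrative.

```python
import random
from statistics import fmean, variance
from scipy.stats import t, ttest_1samp

random.seed(4)
xs = [random.gauss(5.5, 2.0) for _ in range(20)]      # illustrative sample
mu0, alpha = 5.0, 0.05                                 # hypothesized mean, test level

n = len(xs)
tn = (fmean(xs) - mu0) / (variance(xs) / n) ** 0.5     # test statistic
critical = t.ppf(1 - alpha / 2, df=n - 1)              # rejection threshold

print("reject H0" if abs(tn) > critical else "do not reject H0", tn)
print(ttest_1samp(xs, popmean=mu0))                    # scipy's statistic and p-value
```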

41
Hypotheses testing
  • Type I error: if one rejects H0 when it is true,
    this is called a Type I error; its probability is again
    equal to α. This error is under the experimenter's
    control.
  • Type II error: if one accepts H0 when it is
    false, it is a Type II error. Its probability is denoted by β.
  • We call δ = 1 - β the power of the test, which is the
    probability of rejecting H0 when it is false.
  • For a fixed α, the power of the test can only be
    increased by increasing n.