Lecture series two - PowerPoint PPT Presentation
1
Lecture series two
  • Probability and Statistics

2
Recapitulation
  • In the previous set of lectures we explored the
    link between sets of numbers and functions
  • Today we shall re-examine this link from the
    perspective of the concepts of uncertainty and
    probability theory

3
Modelling uncertainty
  • Probability theory is a mathematical model of
    uncertainty. Consider flipping a fair coin
    repeatedly. There are two possible outcomes of
    each flip, head or tail, but we do not know
    which one will occur on a given flip. The
    outcomes are uncertain.

4
Modelling uncertainty contd.
  • It can be argued that the uncertainty in the
    world is fully contained in the selection of some
    hidden variable, say ω. If this variable is
    known, then nothing is uncertain anymore.
  • Many choices are possible, but only one was made
    and everything derives from it. In other words,
    everything that is uncertain is a function, say
    X(ω), of the hidden variable.

5
Modelling uncertainty contd.
  • In order to explain the link between the hidden
    variable and the function(s) defined over it, we
    need to introduce some associated concepts.

6
Some useful definitions
  • It is customary in statistics to refer to any
    process of observation or measurement as an
    experiment. The results that one obtains from an
    experiment are called outcomes. The set S of all
    possible outcomes of an experiment is called the
    sample space. An event A is a set of outcomes,
    i.e. a subset of the sample space S.

7
Examples
  • Example one: Toss a die and observe the number
    (of dots) that appears on the top face. Let A be
    the event that an even number occurs, B that an
    odd number occurs and C that a number greater
    than 3 occurs. Find the events of A and C, and of
    A and B, occurring simultaneously.
  • Example two: Toss a coin 3 times and observe the
    sequence of heads H and tails T that appears. Let
    A be the event that two or more heads appear
    consecutively, and B that all the tosses are the
    same. Find the event of A and B occurring
    simultaneously.
  • Example three: A card is drawn from an ordinary
    deck of 52 cards. Let E be the event that a
    picture card was drawn and F the event of a
    heart. Find the event of E and F occurring
    simultaneously.
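Events are just sets, so Example one can be checked directly with Python set operations. This is an illustrative sketch, not part of the original slides:

```python
# Example one as Python sets: the sample space of a single die toss.
S = {1, 2, 3, 4, 5, 6}
A = {n for n in S if n % 2 == 0}   # even number
B = {n for n in S if n % 2 == 1}   # odd number
C = {n for n in S if n > 3}        # greater than 3

print(A & C)   # A and C occur simultaneously: {4, 6}
print(A & B)   # A and B are mutually exclusive: set()
```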

8
The probability function
  • In all these cases, we found the probability of
    an event. Probabilities are values of a set
    function, which assigns real numbers to various
    subsets of a sample space S. When the sample
    space is discrete:
  • P1: For any event A, P(A) ≥ 0
  • P2: For the certain event S, P(S) = 1
  • P3: If A, B, C, … is a finite or infinite sequence
    of mutually exclusive events, P(A ∪ B ∪ C ∪ …) =
    P(A) + P(B) + P(C) + …

9
The probability function contd.
  • Theorems on probability spaces
  • T1: The impossible event or, in other words, the
    empty set ∅ has probability zero, that is P(∅) = 0
  • T2: For any event A, P(Aᶜ) = 1 − P(A)
  • T3: For any event A, 0 ≤ P(A) ≤ 1
  • T4: For any two events A and B,
    P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
  • T5: If an experiment can result in any one of N
    different equally likely outcomes, and if n of
    these outcomes together constitute event A, then
    the probability of event A is P(A) = n/N.

10
Examples
  • Example one: A card is selected at random from an
    ordinary deck of 52 playing cards. Let A be the
    event of a heart, and B the event of a face card.
    Find P(A), P(B), P(A ∩ B), P(A ∪ B).
  • Example two: A student is selected at random from
    80 students, where 30 are taking maths, 20
    chemistry and 10 both maths and chemistry. Find
    the probability that the student is taking either
    maths or chemistry.
  • Example three: Let three coins be tossed and the
    number of heads observed. Find the probability of
    the event that at least one head appears. Find
    the probability of the event that all heads or
    all tails appear.
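Example one can be verified by enumerating the deck and applying theorem T4. A sketch; the rank and suit labels below are illustrative choices, not from the slides:

```python
from fractions import Fraction

ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = [(r, s) for r in ranks for s in suits]     # 52 cards

A = {c for c in deck if c[1] == 'hearts'}         # event: a heart
B = {c for c in deck if c[0] in ('J', 'Q', 'K')}  # event: a face card

def P(E):
    """Equally likely outcomes: P(E) = |E| / |S| (theorem T5)."""
    return Fraction(len(E), len(deck))

print(P(A), P(B), P(A & B), P(A | B))  # 1/4 3/13 3/52 11/26
```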

11
Conditional probability
  • Example: Auto insurance rates usually depend on
    the probability that a random person is involved
    in an accident. It is well known that male
    drivers under 25 years of age get into accidents
    more often than the general public. That is,
    letting A denote the event of an accident and E
    the event that the driver is a male younger than
    25, the data tell us that P(A) < P(A|E).

12
Conditional probability contd.
  • Suppose E is an event in sample space S with
    P(E) > 0. The probability that an event A occurs
    once event E has occurred, or the conditional
    probability of A given E, is defined as
  • P(A|E) = P(A ∩ E) / P(E)

13
Conditional probability contd.
  • Example one: A pair of dice is tossed. Find the
    probability that one of the dice is 2 if the sum
    is 6.
  • Example two: A couple has two children. Find the
    probability that both children are boys if it is
    known that at least one is a boy.
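Example one can be checked by brute-force enumeration, since all 36 dice pairs are equally likely. A minimal sketch:

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # 36 equally likely dice pairs
E = [p for p in S if sum(p) == 6]          # condition: the sum is 6
A_and_E = [p for p in E if 2 in p]         # ... and one of the dice is 2

print(Fraction(len(A_and_E), len(E)))      # P(A|E) = 2/5
```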

14
Conditional probability contd.
  • The Multiplication Theorem for Conditional
    Probability is a direct consequence of the
    definition of conditional probability:
  • P(A ∩ E) = P(E) P(A|E)

15
Conditional probability contd.
  • Example one: A lot contains 12 items of which 4
    are defective. Three items are drawn at random
    from the lot one after the other. Find the
    probability that all 3 are non-defective.
  • Example two: Suppose the following three boxes
    are given: box X has 10 light bulbs of which 4
    are defective, box Y has 6 light bulbs of which
    1 is defective, and box Z has 8 light bulbs of
    which 3 are defective. A box is chosen at random
    and then a bulb is randomly selected from the
    box. Find the probability that the bulb is
    non-defective. If the bulb is non-defective, find
    the probability that it came from box Z.

16
Random variable
  • We are now ready to define properly the concept
    of a random variable.
  • If S is a sample space with a probability measure
    and X is a real-valued function defined over the
    elements of S, then X is called a random variable.

17
Random variable contd.
  • Example one: A pair of fair dice is tossed. The
    sample space S consists of 36 ordered pairs (a,b),
    where a and b can be any integers between 1 and
    6:
  • S = {(1,1), (1,2), …, (6,6)}. Let X assign to each
    point the maximum of its numbers:
    X(a,b) = max(a,b). Then X is a random variable with
    range Rx = {1, 2, 3, 4, 5, 6}.
  • Example two: A coin is tossed until a head
    appears. The sample space is S = {H, TH, TTH,
    TTTH, TTTTH, …}. Let X denote the number of times
    the coin is tossed. Then X is a random variable
    with range space Rx = {1, 2, 3, 4, …}

18
Probability distribution of a discrete random
variable
  • If X is a discrete random variable, the function
    given by f(x) = P(X = x) for each x within the range
    of X is called the probability distribution of X.
    The set of ordered pairs is usually given in the
    form of a table as follows

19
Probability distribution of a finite random
variable contd.
  • This function f is called the probability
    distribution or distribution of the random
    variable X. It satisfies the following two
    conditions: f(xk) ≥ 0 and Σk f(xk) = 1

20
Probability distribution of a finite random
variable contd.
  • Example: Let S be the sample space when a pair of
    fair dice is tossed. Then S is a finite
    equiprobable space consisting of the 36 ordered
    pairs (a,b). Let X and Y be random variables such
    that X = max(a,b) and Y = a + b. Find the
    distributions of X and Y.
  • (a) One outcome, (1,1), has a maximum value of 1,
    hence f(1) = 1/36.
  • Three outcomes, (1,2), (2,2), (2,1), have a max
    value of 2, hence f(2) = 3/36. Five outcomes, (1,3),
    (2,3), (3,3), (3,1), (3,2), have a max value of 3:
    f(3) = 5/36, and so on.
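The counting argument for the distribution of X = max(a,b) can be reproduced mechanically. A sketch:

```python
from fractions import Fraction
from collections import Counter
from itertools import product

S = list(product(range(1, 7), repeat=2))    # the 36 ordered pairs (a, b)
counts = Counter(max(a, b) for a, b in S)   # how many pairs share each maximum
f = {x: Fraction(n, len(S)) for x, n in sorted(counts.items())}

for x, p in f.items():
    print(x, p)     # f(x) = (2x - 1)/36 for x = 1..6
```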

21
Probability distribution of a finite random
variable contd.
22
Probability distribution of a finite random
variable contd.
  • Similarly, the distribution of Y is

23
More examples
  • Example two: A fair coin is tossed three times.
    Let X be the random variable that assigns to each
    point in S the number of heads. Find the
    distribution of X.
  • Example three: Suppose a coin is tossed three
    times, but now it is weighted so that P(H) = 2/3
    and P(T) = 1/3. Find the distribution of X.

24
Expectation of a finite random variable
  • Let X be a finite random variable and suppose the
    following is its distribution.
  • Then the mean, or expectation (expected value), of
    X, denoted by E(X), is defined as
    E(X) = x1 f(x1) + x2 f(x2) + … + xn f(xn)
  • Exercise: Find the expected value of the random
    variable in example 1 above.
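For the dice example above (X = max(a,b), with f(x) = (2x − 1)/36), the exercise works out as follows. A sketch:

```python
from fractions import Fraction

# distribution of X = max(a, b) from the dice example
f = {x: Fraction(2 * x - 1, 36) for x in range(1, 7)}
E = sum(x * p for x, p in f.items())
print(E, float(E))   # 161/36, about 4.47
```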

25
Variance and standard deviation of a discrete
random variable
  • The expectation of a random variable is a measure
    of its mean, or average, value. Suppose that X is
    a random variable with n distinct values and
    suppose that each value occurs with the same
    probability pi = 1/n. Then E(X) = x1·(1/n) +
    x2·(1/n) + … + xn·(1/n), which is the ordinary
    arithmetic average.
  • The variance and the standard deviation, on the
    other hand, are measures of the spread or
    dispersion of the random variable.

26
Variance and standard deviation of a discrete
random variable contd.
  • Let X be a random variable with mean μ = E(X) and
    the following probability distribution.
  • The variance of X is defined by
  • var(X) = (x1 − μ)² f(x1) + (x2 − μ)² f(x2) + … +
    (xn − μ)² f(xn)
  • T: var(X) = E(X²) − μ²
  • Exercise: Compute the variance in example 1
    above.
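Using the shortcut var(X) = E(X²) − μ², the exercise for the dice example can be sketched as:

```python
from fractions import Fraction

# X = max(a, b) from the dice example
f = {x: Fraction(2 * x - 1, 36) for x in range(1, 7)}
mu = sum(x * p for x, p in f.items())                  # 161/36
var = sum(x**2 * p for x, p in f.items()) - mu**2      # E(X^2) - mu^2
print(var, float(var))   # 2555/1296, about 1.97
```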

27
Joint distribution of a random variable
  • Let X and Y be random variables on the same
    sample space S with respective range spaces
    Rx = {x1, x2, …, xn} and Ry = {y1, y2, …, ym}. The
    joint distribution or joint probability function
    of X and Y is the function h on the product space
  • h(xi, yj) = P(X = xi, Y = yj) = P({s ∈ S : X(s) = xi,
    Y(s) = yj})
  • It has the properties (i) h(xi, yj) ≥ 0, (ii) Σi Σj
    h(xi, yj) = 1

28
Expectation of a finite random variable contd.
  • T1: Let X be a random variable and let k be a
    real number. Then E(kX) = kE(X) and E(X + k) = E(X) + k
  • Thus, for any real numbers a and b,
    E(aX + b) = aE(X) + b.
  • T2: Let X and Y be random variables on the same
    sample space S. Then E(X + Y) = E(X) + E(Y).

29
Joint distribution of a random variable contd.
f(xi) = Σj h(xi, yj) and g(yj) = Σi h(xi, yj) are the
marginal distributions
30
Covariance and correlation
  • Let X and Y be random variables with joint
    distribution h(x,y) and respective means μx and
    μy. The covariance of X and Y, denoted by
    cov(X,Y), is defined by
  • cov(X,Y) = Σi,j (xi − μx)(yj − μy) h(xi, yj) =
    E[(X − μx)(Y − μy)]
  • = E(XY) − μx μy
  • The correlation ρ = cov(X,Y) / (σx σy), with
  • −1 ≤ ρ ≤ 1
  • Exercise: Calculate the joint and marginal
    distributions and the correlation coefficient
    for the random variables in example one.

31
Independent random variables
  • Let X, Y, …, Z be random variables over space S.
    They are said to be independent if
  • P(X = xi, Y = yj, …, Z = zk) = P(X = xi) P(Y = yj) …
    P(Z = zk)
  • T: If X and Y are independent random variables,
  • E(XY) = E(X) E(Y)
  • var(X + Y) = var(X) + var(Y)
  • cov(X,Y) = 0

32
Continuous random variables
  • Suppose that X is a random variable on a sample
    space S whose range space Rx is a continuum of
    numbers such as an interval. From the definition
    of a random variable, the set {a ≤ X ≤ b} is an
    event in S and therefore the probability
    P(a ≤ X ≤ b) is well defined. In calculus terms,
  • P(a ≤ X ≤ b) = ∫_a^b f(x) dx
  • The function f is called the distribution or
    continuous probability (density) function of X
    and satisfies the conditions f(x) ≥ 0 and
    ∫_{−∞}^{∞} f(x) dx = 1

33
Continuous random variables contd.
  • The expectation E(X) for a continuous random
    variable X is defined by the integral
    E(X) = ∫_{−∞}^{∞} x f(x) dx,
  • while the variance is
    var(X) = ∫_{−∞}^{∞} (x − μ)² f(x) dx = E(X²) − μ²
  • Example: Find the expectation and variance of a
    random variable X with the following distribution
    function

34
Continuous random variables contd.
  • A bivariate function with values f(x,y), defined
    over the xy-plane, is called the joint
    probability density function of the continuous
    random variables X and Y iff f(x,y) ≥ 0 and
    P[(X,Y) ∈ A] = ∫∫_A f(x,y) dx dy for any region A
    of the xy-plane.

35
Continuous random variables contd.
  • Example: Given the joint probability density
    function,
  • find the probability P[(X,Y) ∈ A], where A is the
    region
  • {(x,y) : 0 < x < 1/2, 1 < y < 2}.

36
Continuous random variables contd.
  • If X and Y are continuous random variables and
    f(x,y) is the value of their joint probability
    density at (x,y), the function given by
  • g(x) = ∫_{−∞}^{∞} f(x,y) dy
    is called the marginal density of X, while
    the function
  • h(y) = ∫_{−∞}^{∞} f(x,y) dx
    is called the marginal density of Y.

37
Continuous random variables contd.
  • Example: Given the joint probability density,
  • find the marginal densities of X and Y.

38
Conditional expectation
  • A concept that has special importance in
    econometrics is the concept of conditional
    expectation.
  • In the discrete case it is given by
    E(X|Y = y) = Σx x f(x|y)
  • In the continuous case it is given by
    E(X|Y = y) = ∫_{−∞}^{∞} x f(x|y) dx

39
Conditional expectation contd.
  • While we shall spend more time on it tomorrow,
    here are a couple of examples.
  • Example one: Find the conditional mean of X when
    y = 1 using the following joint distribution.
  • Example two: Find the conditional mean of f(x,y)
    for y = 1/2.

40
Recapitulation
  • It has been observed that one can discuss X and
    f(x) without referring to the original
    probability space S. In fact, there are many
    applications of probability theory which give
    rise to the same probability distribution.
  • Some of the probability distribution and density
    functions widely used in finance are the
    Bernoulli/Binomial, the Uniform, etc.
  • Overall, the material that we learnt during the
    past two series of lectures (in particular, the
    concept of optimization and those of expectation,
    variance, correlation coefficient, etc.) is
    sufficient to understand models as complicated as
    the portfolio theory model. Examples are given
    in your course pack.
  • For the rest of today's class we will have a
    brief look at some useful characteristics of the
    normal distribution and will define some sampling
    distributions and the principles of hypothesis
    testing. This will enable us to understand well
    what comes in the rest of the class.

41
The normal distribution
  • A random variable X has a normal distribution,
    and is referred to as a normal random variable,
    if and only if its probability density is given
    by
  • f(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²)),
    −∞ < x < ∞

42
The normal distribution contd.
  • Suppose that X is any normal random variable,
    X ~ N(μ, σ²). One of the most useful representations
    of the normal distribution is the standardized
    random variable corresponding to X, defined as
    Z = (X − μ) / σ
  • Z is also normally distributed, with μ = 0 and
    σ = 1, i.e. Z ~ N(0,1). Its density function is
    φ(z) = (1 / √(2π)) e^(−z²/2)

43
The normal distribution contd.
  • One of the properties of the standard normal
    distribution most helpful for hypothesis testing
    is the so-called 68-95-99.7 rule, which gives the
    percentage of area under the standardized normal
    curve as follows:
  • 68.2% for −1 ≤ z ≤ 1, i.e. for μ − σ ≤ x ≤ μ + σ
  • 95.4% for −2 ≤ z ≤ 2, i.e. for μ − 2σ ≤ x ≤ μ + 2σ
  • 99.7% for −3 ≤ z ≤ 3, i.e. for μ − 3σ ≤ x ≤ μ + 3σ
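The rule can be checked numerically from Φ, which the Python standard library expresses through the error function, Φ(z) = (1 + erf(z/√2))/2. A sketch:

```python
import math

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for k in (1, 2, 3):
    print(k, round(Phi(k) - Phi(-k), 4))   # 0.6827, 0.9545, 0.9973
```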

44
The normal distribution contd.
  • Evaluating Standard Normal Probabilities
  • Example one: Find (a) Φ(1.72) (b) Φ(0.34) (c)
    Φ(2.3) (d) Φ(4.3)
  • Example two: Evaluate the probabilities
  • P(−0.5 ≤ Z ≤ 1.1), P(0.2 ≤ Z ≤ 1.4), P(−1.5 ≤ Z
    ≤ −0.7)
  • Example three: Evaluate the probabilities
    P(Z ≥ 0.75), P(Z ≥ −1.2), P(Z ≤ 0.60), P(Z ≤ −0.45)
  • Example four: Evaluate the following
    probabilities for X ~ N(70, 4):
  • P(68 ≤ X ≤ 74), P(72 ≤ X ≤ 75), P(63 ≤ X ≤ 68),
    P(X ≥ 73)

45
Sampling distributions
  • Definition: If X1, X2, …, Xn are independent and
    identically distributed random variables, we say
    that they constitute a random sample from the
    infinite population given by their common
    distribution.
  • Statistical inferences are typically based on
    statistics, i.e. on random variables that are
    functions of a set of random variables X1, X2, …, Xn.
    Typically these statistics are the sample mean
    and the sample variance.

46
Sampling distributions contd.
  • Definition one: If X1, X2, …, Xn constitute a
    random sample, then the sample mean is defined
    as
    X̄ = (1/n) Σi Xi
  • Definition two: If X1, X2, …, Xn constitute a
    random sample, then the sample variance is
    defined as
    S² = (1/(n − 1)) Σi (Xi − X̄)²

47
Sampling distributions contd.
  • The central limit theorem: If X1, X2, …, Xn
    constitute a random sample from an infinite
    population with mean μ and variance σ², then the
    limiting distribution of Z = (X̄ − μ) / (σ/√n)
    as n → ∞ is the standard normal distribution.

48
Sampling distributions contd.
  • If X1, X2, …, Xn are independent random variables
    having the standard normal distribution, then
  • χ² = Σi Xi²
  • has the chi-square distribution with n degrees
    of freedom.

49
Sampling distributions and hypothesis testing
contd.
  • If Y and Z are independent random variables, Y
    has a chi-square distribution with ν degrees of
    freedom, and Z has the standard normal
    distribution, then the distribution of
  • T = Z / √(Y/ν)
  • is called the t distribution with ν degrees
    of freedom.

50
Sampling distributions contd.
  • If U and V are independent random variables
    having chi-square distributions with ν1 and ν2
    degrees of freedom, then
  • F = (U/ν1) / (V/ν2)
  • is a random variable with an F distribution.

51
Introduction to hypothesis testing
  • Estimators such as the sample mean and variance
    of a distribution are point estimates, as they
    provide only a single (point) estimate of the
    unknown parameter that we are interested in.
  • Instead of basing our inference about the true
    unknown parameters of interest on these single
    estimates, we can obtain two different estimates
    and argue with some confidence (probability) that
    the interval between these two values contains
    the true parameter.
  • This is the logic behind interval estimation and
    hypothesis testing.

52
Introduction to hypothesis testing contd.
  • The key concept underlying interval estimation is
    the notion of the sampling, or probability,
    distribution of an estimator. For instance, it
    can be shown that if a variable X is normally
    distributed, then the sample mean is also
    normally distributed, with mean μ and variance
    σ²/n. In other words, the sampling or
    probability distribution of the estimator is
    X̄ ~ N(μ, σ²/n)
  • As a result, we can construct the interval
    X̄ − 1.96 σ/√n ≤ μ ≤ X̄ + 1.96 σ/√n
  • and discuss the probability that an interval
    like this contains the true μ.

53
Introduction to hypothesis testing contd.
  • More generally, in interval estimation we
    construct two estimators θ̂L and θ̂U, both
    functions of the sample X values, such that
    P(θ̂L ≤ θ ≤ θ̂U) = 1 − α
  • That is, we can say that the probability is 1 − α
    that the above interval, called the confidence
    interval of size 1 − α, contains the true value of
    our unknown parameter. If 1 − α is 0.95, we can
    argue that 95 out of 100 such intervals will
    contain the true parameter.

54
Introduction to hypothesis testing contd.
  • Example: Suppose that the distribution of the
    height of men in a population is normally
    distributed with mean μ inches and standard
    deviation σ = 2.5 inches. A sample of 100 men
    drawn randomly from this population had an
    average height of 67 inches. Establish a 95%
    confidence interval for the mean height in the
    population.
  • Solution: Since X̄ ~ N(μ, σ²/n),
  • from the normal distribution table we see
    that P(−1.96 ≤ Z ≤ 1.96) = 0.95.
  • Plugging in the relevant values, X̄ ± 1.96 σ/√n =
    67 ± 1.96 × 0.25, we obtain the 95% confidence
    interval
  • 66.51 ≤ μ ≤ 67.49.

55
Introduction to hypothesis testing contd.
  • The problem of hypothesis testing can be stated
    as follows. Assume that we have a random variable
    X with a known PDF f(x; θ), where θ is the
    parameter of the distribution. Having obtained a
    random sample of size n and a point estimator
    θ̂, we can raise the question: is this estimator
    compatible with some hypothesized value of θ, say
    θ = θ0?
  • In the language of statistics θ = θ0 is called the
    null hypothesis. It is tested against an
    alternative hypothesis, say θ ≠ θ0.

56
Introduction to hypothesis testing contd.
  • The null hypothesis and the alternative
    hypothesis can each be simple or composite. A
    hypothesis is simple if it specifies the values
    of the parameters; otherwise it is composite.
  • Example: H0: μ = 15 and σ = 2 is a simple
    hypothesis.
  • H0: μ = 15 and σ > 2 is a composite hypothesis.

57
Introduction to hypothesis testing contd.
  • To test the null hypothesis we use the sample
    information to obtain what is known as a test
    statistic. Very often this is a point estimator
    of the unknown parameter. Then we find the
    sampling, or probability, distribution of the
    test statistic and use the confidence-interval
    or test-of-significance approach to test the null
    hypothesis.

58
The confidence interval approach
  • In the previous example, we established that the
    95% confidence interval for the average male
    height in the population is 66.51 ≤ μ ≤ 67.49.
  • Now let us test the null hypothesis H0: μ = 69
    against the alternative hypothesis H1: μ ≠ 69.
  • Clearly, 69 does not belong to the above interval,
    and we reject the null hypothesis.

59
The confidence interval approach contd.
  • In the language of hypothesis testing, the
    confidence interval that we established is called
    the acceptance region. The area(s) outside this
    region is (are) called critical region(s). The
    lower and upper limits of the acceptance region
    are called critical values. If the hypothesized
    value falls within the acceptance region we
    cannot reject the null hypothesis; otherwise we
    reject it.

60
The confidence interval approach contd.
  • In rejecting or accepting the null hypothesis we
    are liable to commit two types of errors: a Type I
    error (rejecting the null hypothesis when it is
    true) and a Type II error (accepting the null
    hypothesis when it is false).
  • As it is difficult to minimize both errors, in
    practice we fix the probability of a Type I error
    at 0.01 or 0.05 and try to minimize the
    probability of a Type II error.

61
The confidence interval approach contd.
  • In the language of statistics, the probability of
    a Type I error, α, is called the level of
    significance. The probability of committing a
    Type II error is designated β, and 1 − β is
    called the power of the test.

62
Test of significance approach
  • Instead of constructing a confidence interval, we
    can substitute the given values in
    Z = (X̄ − μ) / (σ/√n).
  • For the hypothesis H0: μ = 69 versus H1: μ ≠ 69,
  • we obtain Z = (67 − 69) / 0.25 = −8.

63
Test of significance approach contd.
  • From the normal distribution table we observe
    that the probability of the Z value exceeding 3
    (or falling below −3) is about 0.001. The
    probability of the Z value falling outside −1.96
    to 1.96 is 0.05, and so on. We therefore conclude
    that the computed value Z = −8 is statistically
    significant, and reject the null hypothesis of
    μ = 69 at any acceptable significance level.

64
Examples
  • Example one: Suppose that it is known from
    experience that the standard deviation of the
    weight of an 8-ounce package of cookies made by a
    certain bakery is 0.16 ounce. To check whether
    its production is under control on a given day,
    namely, to check whether the true average weight
    of the packages is 8 ounces, employees select a
    random sample of 25 packages and find that their
    mean weight is 8.091 ounces. Since the bakery
    will lose money when the average package exceeds
    8 ounces and the customer loses money when it is
    smaller than 8, test the hypothesis that the
    average size of a package is 8 ounces.
  • Example two: Suppose that 100 tires made by a
    manufacturer lasted on average 21,819 miles with
    standard deviation 1,295 miles. Test the null
    hypothesis of μ = 22,000 versus the alternative
    hypothesis of μ < 22,000.
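Both examples reduce to computing a Z statistic. A sketch; the 5% critical values quoted in the comments are standard table values:

```python
import math

def z_stat(xbar, mu0, sigma, n):
    """Z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

z1 = z_stat(8.091, 8.0, 0.16, 25)       # cookies: about 2.84
z2 = z_stat(21819, 22000, 1295, 100)    # tires: about -1.40
print(round(z1, 2), round(z2, 2))
# |2.84| exceeds the two-sided 5% cutoff 1.96: reject mu = 8
# -1.40 is above the one-sided 5% cutoff -1.645: cannot reject mu = 22000
```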

65
The t-test
  • When the sample size is relatively small and σ²
    is unknown, the appropriate test statistic is
    t = (X̄ − μ) / (s/√n), with n − 1 degrees of
    freedom.
  • Example: The specification of a certain kind of
    ribbon calls for a mean breaking strength of 185
    pounds. If five pieces randomly selected from
    different rolls have breaking strengths of 171.6,
    191.8, 178.3, 184.9 and 189.1 pounds, test the
    null hypothesis that μ = 185 pounds against the
    alternative hypothesis of μ < 185 pounds.
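The t statistic for the ribbon example can be computed with the standard library (the critical value quoted in the comment is the standard one-sided table value for 4 degrees of freedom):

```python
import math
import statistics

data = [171.6, 191.8, 178.3, 184.9, 189.1]
xbar = statistics.mean(data)     # 183.14
s = statistics.stdev(data)       # sample sd (n - 1 denominator)
t = (xbar - 185.0) / (s / math.sqrt(len(data)))
print(round(t, 3))   # about -0.506; |t| < t_{0.05,4} = 2.132, so do not reject mu = 185
```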

66
The Chi-square test
  • Sometimes it is essential to test hypotheses
    related to the variance of a variable. The
    relevant test statistic in this case is
    χ² = (n − 1)s² / σ0², with n − 1 degrees of
    freedom.

67
Example
  • Suppose that the thickness of a part used in a
    semiconductor is its critical dimension and that
    measurements of the thickness of a random sample
    of 18 such parts have the variance s² = 0.68
    (thousandths of an inch squared). The process is
    considered to be under control if the variation
    of the thickness is given by a variance not
    greater than 0.36. Assuming that the measurements
    are a random sample from a normal distribution,
    test the null hypothesis σ² = 0.36 against the
    alternative σ² > 0.36.
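The test statistic for this example works out as follows (the critical value quoted in the comment is the standard table value for 17 degrees of freedom at the 5% level):

```python
n, s2, sigma0_sq = 18, 0.68, 0.36
chi2 = (n - 1) * s2 / sigma0_sq   # (n - 1) s^2 / sigma_0^2
print(round(chi2, 2))             # 32.11, above chi^2_{0.05,17} = 27.587: reject
```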