Review 3401 - PowerPoint PPT Presentation

Provided by: tarekbu


1
Review 3401
2
STATISTICS
  • Art and Science of Collecting and Understanding
    DATA

3
Data and Data Sets
  • Data are the facts and figures that are
    collected, summarized, analyzed, and interpreted.
  • Why? Because you want To Predict, Estimate and
    Ultimately make business decisions.
  • The data collected in a particular study are
    referred to as the data set.

4
Elements, Variables, and Observations
  • The elements are the entities on which data are
    collected.
  • A variable is a characteristic of interest for
    the elements.
  • The set of measurements collected for a
    particular element is called an observation.

5
Data, Data Sets, Elements, Variables, and
Observations
Company        Stock Exchange   Annual Sales ($M)   Earn/Sh. ($)
Dataram        AMEX                   73.10             0.86
EnergySouth    OTC                    74.00             1.67
Keystone       NYSE                  365.70             0.86
LandCare       NYSE                  111.40             0.33
Psychemedics   AMEX                   17.60             0.13
Figure labels: Elements (the companies), Variables (the columns), Observation (one row), Datum (one cell), Data Set (the whole table)
6
Scales of Measurement
  • Scales of measurement include
  • Nominal
  • Ordinal
  • Interval
  • Ratio
  • The scale determines the amount of information
    contained in the data.
  • The scale indicates the data summarization and
    statistical analyses that are most appropriate.

7
Scales of Measurement
  • Nominal
  • Data are labels or names used to identify an
    attribute of the element.
  • A nonnumeric label or a numeric code may be used.
    Example
  • Students of a university are classified by the
    school in which they are enrolled using a
    nonnumeric label such as Business, Humanities,
    Education, and so on.
  • Alternatively, a numeric code could be used
    for the school variable (e.g. 1 denotes Business,
    2 denotes Humanities, 3 denotes Education, and so
    on).

8
Scales of Measurement
  • Ordinal
  • The data have the properties of nominal data and
    the order or rank of the data is meaningful.
  • A nonnumeric label or a numeric code may be used.
    Example
  • Students of a university are classified by
    their class standing using a nonnumeric label
    such as Freshman, Sophomore, Junior, or Senior.
  • Alternatively, a numeric code could be used for
    the class standing variable (e.g. 1 denotes
    Freshman, 2 denotes Sophomore, and so on).

9
Scales of Measurement
  • Interval and Ratio
  • Interval and ratio data are always numeric.

10
Qualitative and Quantitative Data
  • Data can be further classified as being
    qualitative or quantitative.
  • The statistical analysis that is appropriate
    depends on whether the data for the variable are
    qualitative or quantitative.
  • In general, there are more alternatives for
    statistical analysis when the data are
    quantitative.

11
Qualitative Data (Categorical Variables)
  • Qualitative data are labels or names used to
    identify an attribute of each element.
  • Qualitative data use either the nominal (cannot be
    ordered meaningfully) or ordinal (values can be
    meaningfully ordered) scale of measurement.
  • Qualitative data can be either numeric or
    nonnumeric.
  • The statistical analyses available for qualitative
    data are rather limited.

12
Quantitative Data
  • Quantitative data indicate either how many or how
    much.
  • Quantitative data that measure how many are
    discrete (e.g., number of cars owned by a family,
    number of accidents on I-4 in a day).
  • Quantitative data that measure how much are
    continuous because there is no separation between
    the possible values for the data (they take values
    in intervals).
  • Quantitative data are always numeric.
  • Ordinary arithmetic operations (adding and
    averaging) are meaningful only with quantitative
    data.

13
Cross-Sectional and Time Series Data
  • Cross-sectional data are collected at the same or
    approximately the same point in time.
  • Example: data detailing the number of building
    permits issued in June 2000 in each of the
    counties of Florida
  • Time series data are collected over several time
    periods.
  • Example: data detailing the number of building
    permits issued in Volusia County in each of the
    last 36 months

14
Example Time-Series
Year   Small Business Administration Budget ($ Millions)
1991     464
1992   1,891
1993   1,177
1994   2,058
1995     798
1996     749
15
Example
Time series (same table as on the previous slide)
Elementary unit defined by year
Quantitative data
16
Example Cross-Sectional
Firm    Sales    Industry Group       S&P Rating
IBM     66,346   Office Equipment     A
Exxon   59,023   Fuel                 A-
GE      40,482   Conglomerates        A
AT&T    34,357   Telecommunications   A-
17
Example
Cross-Sectional (same table as on the previous slide)
Multivariate Data (3 variables)
First Observation: IBM
Elementary units: the firms
Quantitative variable: Sales
Nominal Qualitative variable: Industry Group
Ordinal Qualitative variable: S&P Rating
18
Sources of Data
  • Primary Data
  • When you control the design of the
    data-collection plan
  • More Control, Exactly what you want
  • More Expensive, Time consuming
  • Secondary Data
  • You use data previously collected by others for
    their purposes. (US Government-INTERNET)

19
Experimental VS Observational
  • When data are not available from existing
    sources (secondary data), they can be obtained by
    conducting a statistical study. Such studies are
    classified as experimental or observational.
  • Experimental study. A variable of interest is
    first identified. Then one or more other
    variables are identified and controlled so that
    data can be obtained about how they influence the
    variable of interest (DATA MINING FDA). Only an
    experiment allows conclusions to be drawn about
    cause and effect.
  • Observational study. No attempt is made to
    control the variable of interest (SURVEY).

20
Descriptive Statistics
  • Descriptive statistics are the tabular,
    graphical, and numerical methods used to
    summarize data.
  • The most common numerical descriptive statistic
    is the average (or mean).

21
Statistical Inference
  • Statistical inference is the process of using
    data obtained from a small group of elements (the
    sample) to make estimates and test hypotheses
    about the characteristics of a larger group of
    elements (the population).
  • The objective of inferential statistics is to
    make predictions and decisions about certain
    characteristics of the population based on
    information contained in a sample.
  • A population is the set representing all
    observations of interest.
  • A sample is a subset of measurements selected
    from the population of interest.

22
Descriptive Statistics Numerical Methods
  • Measures of Location
  • Measures of Variability
  • Measures of Relative Location and Detecting
    Outliers
  • Exploratory Data Analysis
  • Measures of Association Between Two Variables

23
Measures of Location
  • Mean
  • Median
  • Mode
  • Quartiles

24
Example
  • Given below is a sample of monthly rent values
    ($) for one-bedroom apartments. The data are a
    sample of 70 apartments in a particular city and
    are presented in ascending order.

25
Excel
  • Go to tools
  • Select Data analysis
  • Select Descriptive Statistics
  • Select Summary Statistics box
  • Select Confidence Level for Mean box
  • Select Ok

30
Mean
  • The mean of a data set is the average of all the
    data values.
  • If the data are from a sample, the mean is
    denoted by x̄ (x-bar).
  • If the data are from a population, the mean is
    denoted by µ (mu).

31
Example Apartment Rents
  • Mean

32
Median
  • The median is the measure of location most often
    reported for annual income and property value
    data.
  • A few extremely large incomes or property values
    can inflate the mean.
  • The median of a data set is the value in the
    middle when the data items are arranged in
    ascending order.
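  • The median is the 50th percentile. To find the
    p th percentile of n data values, compute the
    index i = (p/100)n.
  • If i is not an integer, round up. The p th
    percentile is the value in the i th position.
  • If i is an integer, the p th percentile is the
    average of the values in positions i and i + 1.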

33
Example Apartment Rents
  • Median = 50th percentile
  • i = (p/100)n = (50/100)70 = 35
    Averaging the 35th and 36th data values:
  • Median = (475 + 475)/2 = 475
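
The index rule used for the median and other percentiles can be sketched in Python. This is only an illustration: the function name `percentile` and the short `rents` list are hypothetical (the 70 rent values are not reproduced in this transcript), and the sketch assumes p is a whole number between 1 and 99.

```python
def percentile(sorted_values, p):
    """p-th percentile via the index rule i = (p/100) * n (1-based positions)."""
    n = len(sorted_values)
    # Integer arithmetic avoids floating-point surprises when testing
    # whether i = p * n / 100 is a whole number.
    i, remainder = divmod(p * n, 100)
    if remainder != 0:
        # i is not an integer: round up and take the value in that position.
        return sorted_values[i]          # 1-based position i + 1
    # i is an integer: average the values in positions i and i + 1.
    return (sorted_values[i - 1] + sorted_values[i]) / 2


rents = [425, 440, 450, 465, 480, 510, 525, 575]  # hypothetical, ascending
median = percentile(rents, 50)   # i = 4: average of the 4th and 5th values
p90 = percentile(rents, 90)      # i = 7.2: round up to the 8th value
```

With 70 sorted values and p = 50, the same function averages the 35th and 36th values, as in the slide's calculation.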

34
Mode
  • The mode of a data set is the value that occurs
    with greatest frequency.
  • The greatest frequency can occur at two or more
    different values.
  • If the data have exactly two modes, the data are
    bimodal.
  • If the data have more than two modes, the data
    are multimodal.

35
Example Apartment Rents
  • Mode
  • 450 occurred most frequently (7 times)
  • Mode = 450

36
Percentiles
  • A percentile provides information about how the
    data are spread over the interval from the
    smallest value to the largest value.
  • Admission test scores for colleges and
    universities are frequently reported in terms of
    percentiles.

37
Example Apartment Rents
  • 90th Percentile

38
Using Excel
  • Go to Insert
  • Select Function
  • Select Statistical
  • Under Function Name Select Percentile
  • Select OK
  • Under Array Select A2:A71
  • Under K Enter .90
  • Select Ok

44
Quartiles
  • Quartiles are specific percentiles:
  • First Quartile = 25th Percentile
  • Second Quartile = 50th Percentile = Median
  • Third Quartile = 75th Percentile

45
Excel
  • Go to Insert
  • Select Function
  • Select Statistical
  • Under Function Name Select Quartile
  • Select OK
  • Under Array Select A2:A71
  • Under Quart Select 3 for the third quartile
  • Select Ok

50
Measures of Variability
  • Range
  • Variance
  • Standard Deviation

51
Range
  • The range of a data set is the difference between
    the largest and smallest data values.
  • It is the simplest measure of variability.
  • It is very sensitive to the smallest and largest
    data values.

53
Example Apartment Rents
  • Range
  • Range = largest value - smallest value
  • Range = 615 - 425 = 190

54
Variance
  • The variance is a measure of variability that
    utilizes all the data.
  • It is based on the difference between the value
    of each observation (xi) and the mean (x̄ for a
    sample, µ for a population).

55
Variance
  • The variance is the average of the squared
    differences between each data value and the mean.
  • If the data set is a sample, the variance is
    denoted by s².
  • If the data set is a population, the variance is
    denoted by σ².

56
Standard Deviation
  • The standard deviation of a data set is the
    positive square root of the variance.
  • It is measured in the same units as the data,
    which makes it easier than the variance to
    compare with the mean.
  • If the data set is a sample, the standard
    deviation is denoted s.
  • If the data set is a population, the standard
    deviation is denoted σ (sigma).
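
A minimal Python sketch of these definitions follows. The five-value `sample` is hypothetical, not the apartment rent data, and the sketch assumes the usual convention that a sample variance divides the sum of squared deviations by n - 1 while a population variance divides by the number of values.

```python
import math
import statistics

sample = [425, 440, 450, 465, 480]   # hypothetical values, not the slide data

mean = sum(sample) / len(sample)
squared_devs = [(x - mean) ** 2 for x in sample]

sample_var = sum(squared_devs) / (len(sample) - 1)   # s^2 (divide by n - 1)
pop_var = sum(squared_devs) / len(sample)            # sigma^2 (divide by N)
sample_sd = math.sqrt(sample_var)                    # s, same units as the data

# Cross-check against the standard library.
lib_sample_var = statistics.variance(sample)
lib_pop_var = statistics.pvariance(sample)
```

The standard deviation comes out in the same units as the data, which is why it is easier to compare with the mean than the variance is.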

57
Example Apartment Rents
  • Variance
  • Standard Deviation

58
Sample Variance and Standard Deviation Using Excel
59
Sample Variance and Standard Deviation Using Excel
  • Go to Insert
  • Select Function
  • Select Statistical
  • Under Function Name Select either VARA (for
    Sample Variance) or STDEVA (for Sample Standard
    Deviation)
  • Select OK
  • Under Value 1 Select A2:A71
  • Select Ok

62
Measures of Relative Location and Detecting
Outliers
  • Empirical Rule
  • Detecting Outliers

63
Empirical Rule
  • For data having a bell-shaped distribution
  • Approximately 68% of the data values will be
    within one standard deviation of the mean.

64
Empirical Rule
  • For data having a bell-shaped distribution
  • Approximately 95% of the data values will be
    within two standard deviations of the mean.

65
Empirical Rule
  • For data having a bell-shaped distribution
  • Almost all (99.7%) of the items will be within
    three standard deviations of the mean.

66
Detecting Outliers
  • An outlier is an unusually small or unusually
    large value in a data set.
  • It might be
  • an incorrectly recorded data value
  • a data value that was incorrectly included in
    the data set
  • a correctly recorded data value that belongs in
    the data set

67
Detecting Outliers Using IQR
  • IQR = 3rd Quartile - 1st Quartile
  • The lower limit is located 1.5(IQR) below Q1.
  • The upper limit is located 1.5(IQR) above Q3.
  • Data outside these limits are considered outliers.

68
Example Apartment for Rent
  • IQR = 522.5 - 446.25 = 76.25
  • Lower Limit = Q1 - 1.5(IQR) = 446.25 -
    1.5(76.25) = 331.875
  • Upper Limit = Q3 + 1.5(IQR) = 522.5 + 1.5(76.25)
    = 636.875
  • There are no outliers (values less than 332 or
    greater than 637) in the apartment rent data.
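
The limit calculation can be checked with a few lines of Python. The quartiles are the ones quoted on the slide; the helper name `is_outlier` is a hypothetical choice for this sketch.

```python
q1, q3 = 446.25, 522.5          # quartiles quoted on the slide

iqr = q3 - q1                   # interquartile range
lower_limit = q1 - 1.5 * iqr    # 1.5(IQR) below Q1
upper_limit = q3 + 1.5 * iqr    # 1.5(IQR) above Q3


def is_outlier(value):
    """True when a value falls outside the IQR-based limits."""
    return value < lower_limit or value > upper_limit


# The smallest and largest rents in the example are 425 and 615.
extremes_are_outliers = [is_outlier(425), is_outlier(615)]
```

Because the smallest and largest rents both fall inside the limits, no other value in the sorted sample can be an outlier.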

69
Exploratory Data Analysis
  • Five-Number Summary

70
Five-Number Summary
  • Smallest Value
  • First Quartile
  • Median
  • Third Quartile
  • Largest Value

71
Example Apartment Rents
  • Five-Number Summary
  • Lowest Value = 425, First Quartile = 445
  • Median = 475
  • Third Quartile = 525, Largest Value = 615

72
Descriptive Statistics to Summarize a Variable
  • Variable Name
  • Number of Observations
  • Lowest Value
  • Mean
  • Median
  • Standard Deviation
  • Standard Error
  • Maximum Value
  • 1st Quartile
  • 3rd Quartile.

73
Rent Example
74
Measures of Association Between Two Variables
  • Covariance
  • Correlation Coefficient

75
Covariance
  • The covariance is a measure of the linear
    association between two variables.
  • Positive values indicate a positive relationship.
  • Negative values indicate a negative relationship.

76
Covariance
  • If the data sets are samples, the covariance is
    denoted by s_xy.
  • If the data sets are populations, the covariance
    is denoted by σ_xy.

77
Using Excel
  • Go to Tools
  • Select Data Analysis
  • Select Covariance
  • Select OK
  • Input Range
  • Select Ok

81
COVARIANCE
  • EX

82
Correlation Coefficient
  • The coefficient can take on values between -1 and
    +1.
  • Values near -1 indicate a strong negative linear
    relationship.
  • Values near +1 indicate a strong positive linear
    relationship.
  • If the data sets are samples, the coefficient is
    r_xy.
  • If the data sets are populations, the coefficient
    is ρ_xy.
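
The slides' covariance and correlation formulas did not survive in this transcript, so the sketch below assumes the standard sample definitions: the covariance divides the sum of cross-deviations by n - 1, and the correlation divides the covariance by the product of the two sample standard deviations. The data and function names are hypothetical.

```python
import math


def sample_covariance(x, y):
    """s_xy: sum of (x_i - x_bar)(y_i - y_bar), divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)


def sample_correlation(x, y):
    """r_xy = s_xy / (s_x * s_y); always between -1 and +1."""
    s_x = math.sqrt(sample_covariance(x, x))   # covariance of x with itself = s_x^2
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)


x = [1, 2, 3, 4, 5]   # hypothetical paired observations
y = [2, 4, 5, 4, 5]
s_xy = sample_covariance(x, y)
r_xy = sample_correlation(x, y)
```

The covariance depends on the units of x and y, whereas the correlation coefficient is unit-free, which is why only the latter has a fixed range.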

83
Using Excel
  • Go to Tools
  • Select Data Analysis
  • Select Correlation
  • Select OK
  • Input Range
  • Select Ok


87
Chapter 4 Introduction to Probability
  • Experiments, Counting Rules, and Assigning
    Probabilities
  • Events and Their Probability
  • Some Basic Relationships of Probability

88
Probability
  • Probability is a numerical measure of the
    likelihood that an event will occur.
  • Probability values are always assigned on a scale
    from 0 to 1.
  • A probability near 0 indicates an event is very
    unlikely to occur.
  • A probability near 1 indicates an event is almost
    certain to occur.
  • A probability of 0.5 indicates the occurrence of
    the event is just as likely as it is unlikely.

89
Probability as a Numerical Measure of the
Likelihood of Occurrence
[Figure: probability scale from 0 to 1, with likelihood of occurrence increasing from left to right. At .5 the occurrence of the event is just as likely as it is unlikely.]
90
An Experiment and Its Sample Space
  • An experiment is any process that generates
    well-defined outcomes (e.g., toss a coin, roll a
    die, select a part for inspection).
  • A procedure that produces an outcome
  • Not perfectly predictable in advance
  • Outcomes: Head or Tail; 1, 2, 3, 4, 5, 6;
    Defective or not defective
  • The sample space for an experiment is the set of
    all experimental outcomes.
  • A sample point is an element of the sample space,
    any one particular experimental outcome.

91
Random Experiment
  • Event
  • Happens or not, each time random experiment is
    run
  • Formally a collection of outcomes from sample
    space
  • A yes or no situation: if the outcome is in the
    list, the event happens
  • Each random experiment has many different events
    of interest
  • Example: tossing a coin, the event "Head"
  • Probability of an Event
  • A number between 0 ( NEVER HAPPENS ) and 1 (
    ALWAYS HAPPENS )
  • The likelihood of occurrence of an event

92
Combinations Vs Permutations
  • List all combinations and all permutations of the
    4 letters A, B, C, and D when they are taken 3 at
    a time.
  • Combinations: ABC ABD ACD BCD (4 combinations)
  • Permutations: ABC ABD ACD BCD
  •               ACB ADB ADC BDC
  •               BAC BAD CAD CBD
  •               BCA BDA CDA CDB
  •               CAB DAB DAC DBC
  •               CBA DBA DCA DCB
    (24 permutations)
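
The two listings can be reproduced with Python's standard `itertools` module. This is a small illustration of the order-matters distinction, not part of the original deck.

```python
from itertools import combinations, permutations

letters = "ABCD"

# Combinations ignore order: ABC and CBA count as the same selection.
combos = ["".join(c) for c in combinations(letters, 3)]

# Permutations respect order: ABC and CBA are different arrangements.
perms = ["".join(p) for p in permutations(letters, 3)]

# Each combination of 3 letters can be arranged in 3! = 6 orders.
orders_per_combination = len(perms) // len(combos)
```

The ratio of 6 is 3!, the number of ways to order each 3-letter selection.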

93
COUNTING RULE
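  • Combinations: the number of ways to select n
    objects from a set of N objects WITHOUT REGARD TO
    ORDER
  • $C^N_n = \binom{N}{n} = \frac{N!}{n!\,(N-n)!}$
  • Where
  • N! = N(N-1)(N-2)...(2)(1)
  • n! = n(n-1)(n-2)...(2)(1)
  • And 0! = 1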

94
  • The odds of winning the lottery in Florida are

95
Permutations
  • Suppose coach XYZ has not settled on who the
    starters are and has a total of 10 team members.
    How many different lineups of 5 can he form now,
    if the order of selection matters?
  • 10!/(10-5)! = 30,240
  • Permutations are the possible ordered selections
    of r objects out of a total of n objects. The
    number of permutations of n objects taken r at a
    time is $P^n_r = \frac{n!}{(n-r)!}$
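
Both counting rules reduce to factorials, so they are easy to check numerically. The function names below are hypothetical; `math.comb` and `math.perm` are the standard-library equivalents (Python 3.8 or later).

```python
import math


def n_combinations(N, n):
    """Selections of n objects from N when order does not matter: N! / (n!(N-n)!)."""
    return math.factorial(N) // (math.factorial(n) * math.factorial(N - n))


def n_permutations(n, r):
    """Ordered selections of r objects from n: n! / (n-r)!."""
    return math.factorial(n) // math.factorial(n - r)


lineups = n_permutations(10, 5)          # the coach's ordered 5-player lineups
unordered_lineups = n_combinations(10, 5)  # same players, order ignored
```

If the order of the five starters did not matter, the count would drop from 30,240 to 252, since each group of five can be ordered in 5! = 120 ways.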

96
Assigning Probabilities
  • Classical Method
  • Assigning probabilities based on the assumption
    of equally likely outcomes.
  • Relative Frequency Method
  • Assigning probabilities based on experimentation
    or historical data.
  • Subjective Method
  • Assigning probabilities based on the assignor's
    judgment.

97
Chapter 5
98
Random Variables
  • A random variable is a specification or
    description of a numerical result from a random
    experiment.
  • A random variable is the idea or abstraction of
    the values that are collected in a random
    experiment. A number is what is seen when we
    observe a random variable. For example,
    tomorrow's Dow Jones closing value would be a
    random variable, while the number (hopefully)
    12,000 would be the observation of this random
    variable.

99
Random Variables (Continued)
  • A random variable can be classified as being
    either discrete or continuous depending on the
    numerical values it assumes.
  • A discrete random variable may assume either a
    finite number of values or an infinite sequence
    of values.
  • A continuous random variable may assume any
    numerical value in an interval or collection of
    intervals.

100
Example JSL Appliances
  • Discrete random variable with a finite number of
    values
  • Let x = number of TV sets sold at the store in
    one day
  • where x can take on 5 values (0, 1, 2, 3, 4)
  • Discrete random variable with an infinite
    sequence of values
  • Let x = number of customers arriving in one day
  • where x can take on the values 0, 1, 2, . . .
  • We can count the customers arriving, but there
    is no finite upper limit on the number that might
    arrive.

101
Discrete Probability Distributions
  • The probability distribution for a random
    variable describes how probabilities are
    distributed over the values of the random
    variable.
  • The probability distribution is defined by a
    probability function, denoted by f(x), which
    provides the probability for each value of the
    random variable.
  • The required conditions for a discrete
    probability function are
  • f(x) ≥ 0
  • Σ f(x) = 1
  • We can describe a discrete probability
    distribution with a table, graph, or equation.

102
Example JSL Appliances
  • Using past data on TV sales (below left), a
    tabular representation of the probability
    distribution for TV sales (below right) was
    developed.
    Units Sold   Number of Days        x    f(x)
        0             80               0    .40
        1             50               1    .25
        2             40               2    .20
        3             10               3    .05
        4             20               4    .10
      Total          200                   1.00

103
Example JSL Appliances
  • Graphical Representation of the Probability
    Distribution
[Figure: bar chart of probability (vertical axis, .10 to .50) against values of random variable x (TV sales), 0 to 4, on the horizontal axis.]
104
Continued
  • Table
  • X = 0: P(X = 0) = 2/5 = .40
  • X = 1: P(X = 1) = 1/4 = .25
  • X = 2: P(X = 2) = 1/5 = .20
  • X = 3: P(X = 3) = 1/20 = .05
  • X = 4: P(X = 4) = 1/10 = .10

105
Example
  • Determine the probability of
  • P(2 ≤ X ≤ 3)
  • Determine the probability of
  • P(X > 1)

106
Example
  • Determine the probability of
  • P(X ≤ 1)
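
Event probabilities such as these are sums of f(x) over the values in the event. The sketch below uses the TV sales distribution from the table. The helper name `prob` is hypothetical, and reading the last example as "at most 1" is an assumption, because the comparison symbol is garbled in the transcript.

```python
# Probability distribution for daily TV sales (from the table above).
f = {0: 0.40, 1: 0.25, 2: 0.20, 3: 0.05, 4: 0.10}


def prob(event):
    """Sum f(x) over every value x for which event(x) is true."""
    return sum(p for x, p in f.items() if event(x))


p_2_to_3 = prob(lambda x: 2 <= x <= 3)   # f(2) + f(3)
p_gt_1 = prob(lambda x: x > 1)           # f(2) + f(3) + f(4)
p_le_1 = prob(lambda x: x <= 1)          # f(0) + f(1), the complement of X > 1
```

The last two events are complements, so their probabilities add to 1.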

107
Expected Value and Variance
  • The expected value, or mean, of a random variable
    is a measure of its central location.
  • Expected value of a discrete random variable
  • E(x) = µ = Σ x f(x)
  • The variance summarizes the variability in the
    values of a random variable.
  • Variance of a discrete random variable
  • Var(x) = σ² = Σ (x - µ)² f(x)
  • The standard deviation, σ, is defined as the
    positive square root of the variance.

108
Example JSL Appliances
  • Expected Value of a Discrete Random Variable
      x    f(x)   xf(x)
      0    .40     .00
      1    .25     .25
      2    .20     .40
      3    .05     .15
      4    .10     .40
           E(x) = 1.20
  • The expected number of TV sets sold in a day is
    1.2

109
Example JSL Appliances
  • Variance and Standard Deviation
  • of a Discrete Random Variable
      x    x - µ   (x - µ)²   f(x)   (x - µ)² f(x)
      0    -1.2      1.44     .40        .576
      1    -0.2      0.04     .25        .010
      2     0.8      0.64     .20        .128
      3     1.8      3.24     .05        .162
      4     2.8      7.84     .10        .784
                                  σ² =  1.660
  • The variance of daily sales is 1.66 TV sets
    squared.
  • The standard deviation of sales is 1.2884 TV
    sets.
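
The expected value, variance, and standard deviation for the TV sales example can be recomputed directly from the definitions. This is a minimal sketch that uses only the distribution given in the slides.

```python
import math

# Probability distribution for daily TV sales (from the slides).
f = {0: 0.40, 1: 0.25, 2: 0.20, 3: 0.05, 4: 0.10}

mu = sum(x * p for x, p in f.items())                    # E(x) = sum of x f(x)
variance = sum((x - mu) ** 2 * p for x, p in f.items())  # sum of (x - mu)^2 f(x)
sigma = math.sqrt(variance)                              # positive square root
```

Note that the expected value 1.2 is not itself a possible outcome; it is the long-run average number of sets sold per day.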

110
Binomial Probability Distribution (Special
Discrete)
  • Properties of a Binomial Experiment
  • The experiment consists of a sequence of n
    identical trials.
  • Two outcomes, success and failure, are possible
    on each trial.
  • The probability of a success, denoted by p, does
    not change from trial to trial.
  • The trials are independent.

111
Binomial Probability Distribution
  • Binomial Probability Function
  • $f(x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}$
  • where
  • f(x) = the probability of x successes in n
    trials
  • n = the number of trials
  • p = the probability of success on any one
    trial
  • x = the number of successes

112
Example UCF
  • Binomial Probability Distribution
  • At UCF, 75% of students live in the
    dormitories. A random sample of 5 students is
    selected. Use the binomial probability formula to
    answer the following questions.
  • a. What is the probability that the sample
    contains exactly three students who live in the
    dormitories?
  • b. What is the probability that the sample
    contains more than three students who live in the
    dormitories?

113
Example UCF
  • c. What is the probability that the sample
    contains at least 4 students who do not live in
    the dormitories?
  • d. What is the probability that the sample
    contains fewer than 2 students who live in the
    dormitories?
  • e. What is the expected number of students (in
    the sample) who do live in the dormitories?

114
Example UCF
  • Using the Binomial Probability Function
  • a. What is the probability that the sample
    contains exactly three students who live in the
    dormitories?
  • (a) f(3) = [5!/(3!2!)] (.75)^3 (.25)^2 = .2637

115
Example UCF
  • Using the Binomial Probability Function
  • b. What is the probability that the sample
    contains more than three students who live in the
    dormitories?
  • (b) More than three means 4 or 5:
    f(4) + f(5) = .3955 + .2373 = .6328

116
Example UCF
  • (c) Using the Binomial Probability Function: at
    least 4 means 4 or 5. Here a success is a student
    who does not live in the dormitories, so the
    probability of success is .25.

117
Example UCF
  • (d) Using the Binomial Probability Function: fewer
    than 2 means 0 or 1. The probability of success is
    .75.

118
Example UCF
119
Binomial Probability Distribution
  • Expected Value
  • E(x) = µ = np
  • Variance
  • Var(x) = σ² = np(1 - p)
  • Standard Deviation
  • σ = √(np(1 - p))

120
Example UCF
  • Binomial Probability Distribution (e)
  • Expected Value
  • E(x) = µ = 5(.75) = 3.75 students out of 5
  • Variance
  • Var(x) = σ² = 5(.75)(.25) = .9375
  • Standard Deviation
  • σ = √.9375 = .9682
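
The UCF questions can be worked through with a short Python sketch of the binomial probability function. The function name `binom_pmf` and the variable names are hypothetical; the inputs (n = 5, p = .75) are the ones stated in the example.

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.75   # 5 students sampled; 75% live in the dormitories

part_a = binom_pmf(3, n, p)                                 # exactly 3 live in dorms
part_b = binom_pmf(4, n, p) + binom_pmf(5, n, p)            # more than 3 live in dorms
part_c = sum(binom_pmf(x, n, 1 - p) for x in (4, 5))        # at least 4 do NOT (success = .25)
part_d = binom_pmf(0, n, p) + binom_pmf(1, n, p)            # fewer than 2 live in dorms

expected = n * p                  # E(x) = np
variance = n * p * (1 - p)        # Var(x) = np(1 - p)
std_dev = sqrt(variance)
```

Parts (c) and (d) describe the same event from two sides (at least 4 students outside the dormitories is the same as at most 1 inside), so their probabilities agree.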

121
Chapter 6 Continuous Probability Distributions
  • Normal Probability Distribution
  • Standard Normal Distribution

122
Continuous Random Variables
  • A continuous random variable can assume any value
    in an interval on the real line or in a
    collection of intervals.
  • It is not possible to talk about the probability
    of the random variable assuming a particular
    value.
  • Instead, we talk about the probability of the
    random variable assuming a value within a given
    interval.
  • The probability of the random variable assuming a
    value within some given interval from x1 to x2 is
    defined to be the area under the graph of the
    probability density function between x1 and x2.

123
Continuous Probability Distributions
  • Normal Probability Distribution
  • The Standard Normal Probability Distribution

124
Normal Probability Distribution
  • Graph of the Normal Probability Density Function

f(x)
x
?
125
The Characteristics of the Normal Distribution
  • Characteristics
  • 1. f(x) approaches 0 as x approaches ±∞
    (infinity)
  • 2. symmetric around vertical line at x = µ
  • 3. area to right of mean is ½ of total area under
    curve; area to left of mean is ½ of total area
    under curve
  • 4. different values for µ (mean) and σ² (variance)
    determine different curves: µ determines where
    the curve is centered; σ² determines how spread
    out the curve is (see Figure 4-2 in text)

126
The Normal Distribution
1. f(x) approaches 0 as x approaches infinity
127
The Normal Distribution
2. symmetric around vertical line at x = µ
128
The Normal Distribution
3. area to left of mean is 1/2 of total area;
area to right of mean is 1/2 of total area
129
Normal Probability Distribution
  • The shape of the normal curve is often
    illustrated as a bell-shaped curve.
  • The highest point on the normal curve is at the
    mean, which is also the median and mode.
  • The mean can be any numerical value: negative,
    zero, or positive.

130
Normal Probability Distribution
  • % of Values in Some Commonly Used Intervals
  • 68.26% of values of a normal random variable are
    within +/- 1 standard deviation of its mean.
  • 95.44% of values of a normal random variable are
    within +/- 2 standard deviations of its mean.
  • 99.72% of values of a normal random variable are
    within +/- 3 standard deviations of its mean.

131
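Areas Under Normal Curve
[Figure: normal curve f(x); about 68% of the area lies between µ - σ and µ + σ.]
132
The Normal Distribution
[Figure: normal curve f(x); about 95% of the area lies between µ - 2σ and µ + 2σ.]
133
Areas Under Normal Curve
[Figure: normal curve f(x); about 99.7% of the area lies between µ - 3σ and µ + 3σ.]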
134
Example Analysis of Test Scores
  • Let X be a random variable whose values are the
    test scores obtained on a nationwide test given
    to high school seniors. Suppose that X is
    normally distributed with a mean (µ) of 600 and a
    standard deviation (σ) of 65.

135
Given that µ = 600 and σ = 65
  • The probability that X lies within 2σ = 2(65) =
    130 points of 600 is 95%. In other words, 95% of
    all test scores lie between 470 and 730.
  • Similarly, 99.7% of the scores are within 3σ =
    3(65) = 195 points of 600. That is, between 405
    and 795.
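
The test score intervals can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch; the variable names are hypothetical.

```python
from statistics import NormalDist

scores = NormalDist(mu=600, sigma=65)   # test scores from the example

# Probability of a score within 2 and within 3 standard deviations of the mean.
within_2_sd = scores.cdf(730) - scores.cdf(470)
within_3_sd = scores.cdf(795) - scores.cdf(405)
```

The computed values are slightly more precise than the rounded 95% and 99.7% figures of the empirical rule.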

136
Normal Probability Distribution
  • Normal Probability Density Function
  • $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$
  • where
  • µ = mean
  • σ = standard deviation
  • π = 3.14159
  • e = 2.71828
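
As an illustration, the density function can be written out in Python and compared with the standard library's implementation. The function name `normal_pdf` is hypothetical, and the formula coded here is the standard normal density, assumed to match the one on the original slide.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    exponent = -((x - mu) ** 2) / (2 * sigma**2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


peak = normal_pdf(600, 600, 65)        # the curve is highest at the mean
left = normal_pdf(535, 600, 65)        # one standard deviation below the mean
right = normal_pdf(665, 600, 65)       # one standard deviation above the mean
library_value = NormalDist(600, 65).pdf(665)
```

The equal heights at 535 and 665 reflect the symmetry of the curve around the mean.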

137
What is Standard Normal Random Variable?
  • The standard normal random variable is a normal
    variable with mean (µ) = 0 and standard deviation
    (σ) = 1. See next slide.
  • A continuous random variable Z (Z is a special
    designation usually reserved for this type of
    variable) is a standard normal random variable if
    its density function is as shown on the next
    slide.

138
NOTICE Z RATHER THAN X.
mean = 0 and standard deviation = 1
139
Reading the Z Table
  • Example: Find the area under the Standard Normal
    Curve between z = 0 and z = 2.54 using the
    Standard Normal Table of areas.
  • For the row value 2.5 and the column under 0.04
    (meaning z = 2.54), the table value is 0.4945.
    This means the area under the standard normal
    curve between z = 0 and z = 2.54 is 0.4945
    (49.45% of the total area under the curve).
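
A table lookup like this one can be reproduced in Python. The helper `table_area` is a hypothetical name for the quantity the Z table reports: the area between 0 and z, which is the cumulative probability minus the 0.5 that lies to the left of 0.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1


def table_area(z):
    """Area under the standard normal curve between 0 and z (as in the Z table)."""
    # abs() handles negative z: the curve is symmetric around 0.
    return standard_normal.cdf(abs(z)) - 0.5


area_2_54 = round(table_area(2.54), 4)
area_neg_2_54 = round(table_area(-2.54), 4)
```

Rounding to four decimals matches the precision of the printed table.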

140
Z Table
For the row value 2.5 and the column under 0.04
(meaning z = 2.54), the value is 0.4945.
141
Using the Z Table for Negative z-Values (Values Less
than the Mean)
  • The area under the standard normal curve between
    z = 0 and z = -2.54 (note minus 2.54) is also
    0.4945 (49.45% of the total area under the curve).
  • The area under the curve between z = 0 and some
    z0 is the same to the right and left of z = 0.
  • Remember that the Normal Curve is symmetrical.
    Each half is a mirror image of the other.

142
Find P(Z > 1.5)
  • This probability is the area to the right of z =
    1.5. This area is equal to the difference between
    the total area to the right of z = 0, which equals
    ____, and the area between z = 0 and z = 1.5. The
    value for z = 1.5 from Table 2 is ____.

143
Z Table
[Table image not transcribed: rows z = 0.0 to 0.9 and 1.5; rows 1.0 - 1.4 skipped.]
144
Find P(Z > 1.5)
  • This probability is the area to the right of
    z = 1.5. Area = .5000 - .4332 = .0668
145
Find P(0.5 < Z < 2)
  • This probability is the area between z = 0.5 and
    z = 2 (draw figure).

146
Z Table
  z     .00     .01     .02     .03     .04
 0.0   .0000   .0040   .0080   .0120   .0160
 0.1   .0398   .0438   .0478   .0517   .0557
 0.2   .0793   .0832   .0871   .0910   .0948
 0.3   .1179   .1217   .1255   .1293   .1331
 0.4   .1554   .1591   .1628   .1664   .1700
 0.5   .1915   .1950   .1985   .2019   .2054
 0.6   .2258   .2291   .2324   .2357   .2389
 0.7   .2580   .2612   .2642   .2673   .2704
 0.8   .2881   .2910   .2939   .2967   .2996
 0.9   .3159   .3186   .3212   .3238   .3264
 (rows 1.0 - 1.4 skipped)
 1.5   .4332   .4345   .4357   .4370   .4382
147
Z Table
  z     .00     .01     .02     .03     .04
 2.0   .4772   .4778   .4783   .4788   .4793
 2.1   .4821   .4826   .4830   .4834   .4838
 2.2   .4861   .4864   .4868   .4871   .4875
 2.3   .4893   .4896   .4898   .4901   .4904
 2.4   .4918   .4920   .4922   .4925   .4927
 2.5   .4938   .4940   .4941   .4943   .4945
 2.6   .4953   .4955   .4956   .4957   .4959
 2.7   .4965   .4966   .4967   .4968   .4969
 2.8   .4974   .4975   .4976   .4977   .4977
 2.9   .4981   .4982   .4982   .4983   .4984
148
Find P(0.5 < Z < 2)
  • P(0.5 < Z < 2)
  • = 0.4772 - 0.1915 = 0.2857

149
Find P(Z < 2)
  • This probability is the area to the left of z = 2
    (draw figure).

150
Find P(Z < 2)
  • This probability is the area to the left of z = 2
    (see figure). This area is equal to the sum of
    the area to the left of z = 0 and the area
    between z = 0 and z = 2.
  • P(Z < 2) = 0.5 + 0.4772 = 0.9772

151
Find P(-2 < Z < -0.5)
  • This probability is the area between z = -2 and
    z = -0.5 (draw figure).

152
Find P(-2 < Z < -0.5)
  • This probability is the area between z = -2 and
    z = -0.5 (see figure). This area is equal to the
    area between z = 0.5 and z = 2. Why?

153
Find P(-2 < Z < -0.5)
  • This probability is the area between z = -2 and
    z = -0.5. This area is equal to the area between
    z = 0.5 and z = 2. Why? The values for z = 2
    (0.4772) and z = 0.5 (0.1915) are found in Table
    4-2.

154
Find P(-2 < Z < -0.5)
  • P(-2 < Z < -0.5) = 0.4772 - 0.1915 = 0.2857
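
The four standard normal probabilities worked out on these slides can be cross-checked with `statistics.NormalDist`. The variable names are hypothetical, and the computed values can differ from the table-based answers in the fourth decimal place because the table entries are rounded.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

p_right_of_1_5 = 1 - Z.cdf(1.5)              # P(Z > 1.5)
p_between = Z.cdf(2) - Z.cdf(0.5)            # P(0.5 < Z < 2)
p_left_of_2 = Z.cdf(2)                       # P(Z < 2)
p_negative_side = Z.cdf(-0.5) - Z.cdf(-2)    # P(-2 < Z < -0.5)
```

The last two interval probabilities agree because the curve is symmetric around z = 0, which answers the "Why?" posed on the slide.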

155
Standard Normal Probability Distribution
  • Transforming x to z
  • Converting to the Standard Normal Distribution

156
Standardizing x to Z
  • Example: Suppose X is normally distributed with
    µ = 4 and σ = 2.
  • Find P(0 < X < 6). Convert X = 0 to a Z-value:
  • Z = (X - µ) / σ, so z1 = (0 - 4) / 2 = -2
  • Convert X = 6 to a Z-value:
  • Z = (X - µ) / σ, so z2 = (6 - 4) / 2 = 1

157
P(0 < X < 6) = P(-2 < Z < 1)
[Figure: normal curve with x values 0, 4, 6 (X's sigma = 2) aligned with z values -2, 0, 1 (Z's sigma = 1); area = .4772 between z = -2 and z = 0, area = .3413 between z = 0 and z = 1.]
158
P(0 < X < 6) = P(-2 < Z < 1)
[Figure: the same curve and axes, annotated with Z = (X - µ) / σ.]
159
The Normal Distribution (cont.)
  • It is true that P(0 < X < 6) = P(-2 < Z < 1).
  • P(-2 < Z < 1) is the area between z = -2 and
    z = 1. This area is the sum of the area between
    z = -2 and z = 0 (area A1) and the area between
    z = 0 and z = 1 (area A2).
  • P(-2 < Z < 1) = area A1 + area A2 = 0.4772 +
    0.3413 = 0.8185, which means there is an 81.85%
    probability that X will be between 0 and 6 (or
    81.85% of all X values fall between 0 and 6).
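
The equivalence between the X-scale and Z-scale probabilities can be verified numerically. This is a sketch with hypothetical variable names; the computed probability may differ from the table-based 0.8185 in the fourth decimal place.

```python
from statistics import NormalDist

mu, sigma = 4, 2
X = NormalDist(mu, sigma)
Z = NormalDist()              # standard normal

# Standardize the two endpoints: z = (x - mu) / sigma.
z1 = (0 - mu) / sigma
z2 = (6 - mu) / sigma

p_on_x_scale = X.cdf(6) - X.cdf(0)      # P(0 < X < 6)
p_on_z_scale = Z.cdf(z2) - Z.cdf(z1)    # P(-2 < Z < 1)
```

Standardizing only relabels the horizontal axis, so the area between the two endpoints is unchanged.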

160
Example Survey
  • Standard Normal Probability Distribution
  • According to a survey, subscribers to The WSJ
    interactive edition spend on average 15 hours
    per week using the computer at work. Assume the
    distribution is normal and that the standard
    deviation is 6 hrs. What is the probability that
    a randomly selected subscriber spends more than
    20 hrs using the computer at work, P(x > 20)?

161
Example Survey
  • Standard Normal Probability Distribution
  • The Standard Normal table shows an area of .2967
    for the region between z = 0 and z = .83. The
    shaded tail area is .5 - .2967 = .2033. The
    probability of more than 20 hrs is .2033.
  • z = (x - µ)/σ
  • = (20 - 15)/6
  • = .83
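
The survey calculation can be reproduced in Python in two ways: rounding z to two decimals as a printed table requires, and using the unrounded value. The variable names are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 15, 6                    # hours per week, from the survey example
z = (20 - mu) / sigma                # 0.8333...
z_rounded = round(z, 2)              # the table only has two-decimal z values

# Tail area using the rounded z, as in the table lookup.
p_table_style = 1 - NormalDist().cdf(z_rounded)

# Tail area without rounding z.
p_unrounded = 1 - NormalDist(mu, sigma).cdf(20)
```

The two answers differ by about .001, which is the cost of rounding z to fit the table.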

162
Example Survey
  • Using the Standard Normal Probability Table

163
Final Exam
  • The time needed to complete a final examination
    in a particular college course is normally
    distributed with a mean of 80 minutes and a
    standard deviation of 10 minutes. Answer the
    following questions
  • A. What is the probability of completing the exam
    in one hour or less ?
  • B. Assume that the class has 60 students and that
    the examination period is 90 minutes in length.
    How many students do you expect will be unable to
    complete the exam in the allotted time ?
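
One way to check answers to both parts is sketched below with `statistics.NormalDist`. The variable names are hypothetical, and the sketch assumes that "unable to complete" means needing more than the 90 minutes available.

```python
from statistics import NormalDist

exam_time = NormalDist(mu=80, sigma=10)   # minutes needed to finish the exam

# Part A: probability of finishing in one hour (60 minutes) or less.
p_one_hour_or_less = exam_time.cdf(60)    # z = (60 - 80) / 10 = -2

# Part B: expected number of the 60 students who need more than 90 minutes.
p_over_90 = 1 - exam_time.cdf(90)         # z = (90 - 80) / 10 = 1
expected_unfinished = 60 * p_over_90
```

Part A corresponds to the area to the left of z = -2, and Part B uses the area to the right of z = 1 multiplied by the class size.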