Title: Chapter 3, Part A
1 ECO 3411
- Review Material
- Descriptive Statistics
- The Basics of Probability
- The Normal Probability Distribution
2Descriptive Statistics - review
- Measures of Location
- Measures of Variability
- Measures of Relative Location
- Measures of Association Between Two Variables
3Measures of Location
4Example Apartment Rents
- Given below is a sample of monthly rent values ($) for one-bedroom apartments. The data are a sample of 70 apartments in a particular city, presented in ascending order.
5Mean
- The mean of a data set is the average of all the data values.
- If the data are from a sample, the mean is denoted by x̄ (x-bar).
- If the data are from a population, the mean is denoted by µ (mu).
6Example Apartment Rents
7Median
- The median of a data set is the value in the middle when the data items are arranged in ascending order.
- If there is an odd number of items, the median is the value of the middle item.
- If there is an even number of items, the median is the average of the values for the middle two items.
8Example Apartment Rents
- Median
- Median = 50th percentile
- i = (p/100)n = (50/100)70 = 35
- Because i is an integer, average the 35th and 36th data values.
- Median = (475 + 475)/2 = 475
9Mode
- The mode of a data set is the value that occurs
with greatest frequency.
10Example Apartment Rents
- Mode
- 450 occurred most frequently (7 times)
- Mode = 450
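The three measures of location above can be sketched in a few lines of Python. The `rents` list is a small hypothetical sample, not the 70-apartment data set from the slides (which is not reproduced here), so the results differ from the slide values.

```python
from statistics import mean, median, multimode

# Hypothetical monthly rents, already in ascending order.
# This is NOT the 70-apartment sample from the slides.
rents = [425, 430, 450, 450, 450, 475, 480, 500, 525, 615]

sample_mean = mean(rents)         # sum of the values divided by n
sample_median = median(rents)     # n is even, so the two middle values are averaged
sample_modes = multimode(rents)   # list of the most frequent value(s)

print(sample_mean, sample_median, sample_modes)
```

`multimode` returns a list because a data set can have more than one mode.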
11Using Excel to Compute the Mean, Median, and Mode
Note: Rows 7-71 are not shown.
12Using Excel to Compute the Mean, Median, and Mode
Note: Rows 7-71 are not shown.
13Measures of Variability
- Range
- Variance
- Standard Deviation
14Range
- The range of a data set is the difference between the largest and smallest data values.
- It is the simplest measure of variability.
- It is very sensitive to the smallest and largest data values.
15Example Apartment Rents
- Range
- Range = largest value - smallest value
- Range = 615 - 425 = 190
16Variance
- The variance is the average of the squared differences between each data value and the mean.
- If the data set is a sample, the variance is denoted by s².
- If the data set is a population, the variance is denoted by σ².
17Example Apartment Rents
18Standard Deviation
- The standard deviation of a data set is the positive square root of the variance.
- It is measured in the same units as the data, making it easier than the variance to compare with the mean.
- If the data set is a sample, the standard deviation is denoted s.
- If the data set is a population, the standard deviation is denoted σ (sigma).
19Example Apartment Rents
- Variance
- Standard Deviation
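A minimal sketch of the range, sample variance, and sample standard deviation, again on a hypothetical `rents` list rather than the slide data. Note that Python's `statistics.variance` divides the sum of squared deviations by n - 1, the usual convention for a sample.

```python
from statistics import stdev, variance

# Hypothetical monthly rents; NOT the 70-apartment sample from the slides.
rents = [425, 430, 450, 450, 450, 475, 480, 500, 525, 615]

data_range = max(rents) - min(rents)   # largest value minus smallest value
s2 = variance(rents)                   # sample variance (divides by n - 1)
s = stdev(rents)                       # positive square root of s2

print(data_range, round(s2, 2), round(s, 2))
```

For population data, `statistics.pvariance` and `statistics.pstdev` divide by n instead.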
20Using Excel to Compute the Sample Variance and Standard Deviation
Note: Rows 8-71 are not shown.
21Using Excel to Compute the Sample Variance and Standard Deviation
Note: Rows 8-71 are not shown.
22Using Excel's Descriptive Statistics Tool
- Step 1 Select the Tools pull-down menu
- Step 2 Choose the Data Analysis option
- Step 3 Choose Descriptive Statistics from the list of Analysis Tools
- continued
23Using Excel's Descriptive Statistics Tool
- Step 4 When the Descriptive Statistics dialog box appears:
- Enter B1:B71 in the Input Range box
- Select Grouped By Columns
- Select Labels in First Row
- Select Output Range
- Enter D1 in the Output Range box
- Select Summary Statistics
- Select OK
24Using Excel's Descriptive Statistics Tool
- Value Worksheet (Partial)
25Using Excel's Descriptive Statistics Tool
- Value Worksheet (Partial)
26Measures of Relative Location and Detecting Outliers
- z-Scores
- The Empirical Rule
- Detecting Outliers
27z-Scores
- The z-score is often called the standardized value.
- It denotes the number of standard deviations a data value x_i is from the mean.
- A data value less than the sample mean will have a z-score less than zero.
- A data value greater than the sample mean will have a z-score greater than zero.
- A data value equal to the sample mean will have a z-score of zero.
28Example Apartment Rents
- z-Score of Smallest Value (425)
29Example Apartment Rents
- z-Score of Smallest Value (425)
- Standardized Values for Apartment Rents
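The z-score calculation can be sketched with the standard formula z = (x - x̄)/s. The sample mean 490.80 and standard deviation 54.74 used below are not stated directly in this transcript; they are recovered from the "within 1s" interval (436.06 to 545.54) on the Empirical Rule slide, so treat them as an assumption.

```python
# Assumed sample statistics, inferred from the interval 436.06 to 545.54.
x_bar = 490.80   # midpoint of the interval
s = 54.74        # half the width of the interval


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


z_smallest = z_score(425, x_bar, s)   # smallest rent in the sample
z_largest = z_score(615, x_bar, s)    # largest rent in the sample

print(round(z_smallest, 2), round(z_largest, 2))
```

Rounded to two decimals, these agree with the extreme z-scores of -1.20 and 2.27 quoted on the outlier slide.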
30The Empirical Rule
- For data having a bell-shaped distribution:
- 68.26% of the data values will be within one standard deviation of the mean.
- 95.44% of the data values will be within two standard deviations of the mean.
- 99.72% will be within three standard deviations of the mean.
31Example Apartment Rents
- The Empirical Rule
  Interval         Range               % in Interval
  Within +/- 1s    436.06 to 545.54    48/70 = 69%
  Within +/- 2s    381.32 to 600.28    68/70 = 97%
  Within +/- 3s    326.58 to 655.02    70/70 = 100%
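As a rough check of the table above, the three intervals can be rebuilt from a sample mean and standard deviation. The values 490.80 and 54.74 are inferred from the first interval rather than quoted from the slides, and the counts 48, 68, and 70 are copied from the table because the raw data are not available here.

```python
# Assumed sample statistics, inferred from the interval 436.06 to 545.54.
x_bar, s = 490.80, 54.74
n = 70

# Counts of rents inside each interval, as reported in the table.
counts_in_interval = {1: 48, 2: 68, 3: 70}

intervals = {}
percentages = {}
for k, count in counts_in_interval.items():
    # Interval of k standard deviations on either side of the mean.
    intervals[k] = (round(x_bar - k * s, 2), round(x_bar + k * s, 2))
    percentages[k] = round(100 * count / n)

for k in intervals:
    print(k, intervals[k], percentages[k])
```

The observed 69%, 97%, and 100% are close to the 68.26%, 95.44%, and 99.72% that the empirical rule predicts for bell-shaped data.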
32Detecting Outliers
- An outlier is an unusually small or unusually large value in a data set.
- A data value with a z-score less than -3 or greater than 3 might be considered an outlier.
- It might be an incorrectly recorded data value.
- It might be a data value that was incorrectly included in the data set.
- It might be a correctly recorded data value that belongs in the data set!
33Example Apartment Rents
- Detecting Outliers
- The most extreme z-scores are -1.20 and 2.27.
- Using |z| > 3 as the criterion for an outlier, there are no outliers in this data set.
- Standardized Values for Apartment Rents
34Example
- Suppose annual salaries for sales associates from a particular store have a mean of $32,500 and a standard deviation of $2,500.
- Calculate and interpret the z-score for a sales associate who makes $36,000.
- Suppose that the distribution of annual salaries for sales associates at this store is bell-shaped. Use the empirical rule to calculate the percentage of sales associates with salaries between $27,500 and $37,500.
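One way to work this example is sketched below. It uses only the numbers given in the exercise plus the empirical-rule percentages from the earlier slide; the variable names are illustrative.

```python
mean_salary = 32_500
sd_salary = 2_500


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Part 1: z-score for an associate earning 36,000.
z_36000 = z_score(36_000, mean_salary, sd_salary)

# Part 2: express both salary bounds in standard deviations.
z_low = z_score(27_500, mean_salary, sd_salary)
z_high = z_score(37_500, mean_salary, sd_salary)

# Empirical-rule percentages for 1, 2, and 3 standard deviations.
EMPIRICAL_RULE = {1: 68.26, 2: 95.44, 3: 99.72}
pct_between = EMPIRICAL_RULE[int(z_high)]   # valid because z_low == -z_high

print(z_36000, z_low, z_high, pct_between)
```

A salary of 36,000 is 1.4 standard deviations above the mean, and the bounds sit two standard deviations either side of it, so about 95.44% of associates fall in that range if the distribution is bell-shaped.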
35Measures of Association Between Two Variables
- Covariance
- Correlation Coefficient
36Covariance
- The covariance is a measure of the linear association between two variables.
- Positive values indicate a positive relationship.
- Negative values indicate a negative relationship.
37Covariance
- If the data sets are samples, the covariance is denoted by s_xy.
- If the data sets are populations, the covariance is denoted by σ_xy.
38Covariance
A high school guidance counselor collected the
following data about the grade point averages
(GPA) and the SAT mathematics test scores for six
seniors.
- Compute and interpret the sample covariance for
the data
39Correlation Coefficient
- The coefficient can take on values between -1 and +1.
- Values near -1 indicate a strong negative linear relationship.
- Values near +1 indicate a strong positive linear relationship.
- If the data sets are samples, the coefficient is r_xy.
- If the data sets are populations, the coefficient is ρ_xy.
40Correlation Coefficient
A high school guidance counselor collected the
following data about the grade point averages
(GPA) and the SAT mathematics test scores for six
seniors.
- Compute and interpret the sample covariance for the data.
- Compute and interpret the correlation coefficient (s_x = 0.385, s_y = 85.323).
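The counselor's data table is not reproduced in this transcript, so the sketch below uses four hypothetical (GPA, SAT) pairs to show the mechanics. It assumes the standard definitions s_xy = Σ(x_i - x̄)(y_i - ȳ)/(n - 1) and r_xy = s_xy/(s_x s_y); the results will not match the slide values s_x = 0.385 and s_y = 85.323.

```python
from math import sqrt

# Hypothetical data, NOT the six seniors from the slides.
gpa = [2.0, 2.5, 3.0, 3.5]
sat = [400, 500, 500, 600]

n = len(gpa)
x_bar = sum(gpa) / n
y_bar = sum(sat) / n

# Sample covariance: average cross-product of deviations, dividing by n - 1.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(gpa, sat)) / (n - 1)

# Sample standard deviations of each variable.
s_x = sqrt(sum((x - x_bar) ** 2 for x in gpa) / (n - 1))
s_y = sqrt(sum((y - y_bar) ** 2 for y in sat) / (n - 1))

# Correlation coefficient: covariance scaled by both standard deviations.
r_xy = s_xy / (s_x * s_y)

print(round(s_xy, 2), round(r_xy, 4))
```

Unlike the covariance, the correlation coefficient has no units, which is why it is easier to interpret as a measure of strength.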
41Using Excel to Compute the Covariance and Correlation Coefficient
42Using Excel to Compute the Covariance and Correlation Coefficient
43Introduction to Probability
- Probability is a numerical measure of the likelihood that an event will occur.
- Probability values are always assigned on a scale from 0 to 1.
- A probability near 0 indicates an event is very unlikely to occur.
- A probability near 1 indicates an event is almost certain to occur.
- A probability of 0.5 indicates the occurrence of the event is just as likely as it is unlikely.
44An Experiment and Its Sample Space
- An experiment is any process that generates well-defined outcomes.
- The sample space for an experiment is the set of all experimental outcomes.
- A sample point is an element of the sample space, any one particular experimental outcome.
45Example Bradley Investments
- Bradley has invested in two stocks, Markley Oil and Collins Mining. Bradley has determined that the possible outcomes of these investments three months from now are as follows.
  Investment Gain or Loss in 3 Months (in $000)
  Markley Oil    Collins Mining
       10               8
        5              -2
        0
      -20
Sample Point
Sample Space
46Assigning Probabilities
- Classical Method: assigning probabilities based on the assumption of equally likely outcomes.
- Relative Frequency Method: assigning probabilities based on experimentation or historical data.
- Subjective Method: assigning probabilities based on the assignor's judgment.
47Classical Method
- If an experiment has n possible outcomes, this method would assign a probability of 1/n to each outcome.
- Example
- Experiment: Rolling a die
- Sample Space: S = {1, 2, 3, 4, 5, 6}
- Probabilities: Each sample point has a 1/6 chance of occurring.
48Relative Frequency Method
- Example: Lucas Tool Rental
- Lucas would like to assign probabilities to the number of floor polishers it rents per day. Office records show the following frequencies of daily rentals for the last 40 days.
  Number of Polishers Rented    Number of Days
  0                             4
  1                             6
  2                             18
  3                             10
  4                             2
49Relative Frequency Method
- Example: Lucas Tool Rental
- The probability assignments are given by dividing the number-of-days frequencies by the total frequency (total number of days).
  Number of Polishers Rented    Number of Days    Probability
  0                             4                 .10 (= 4/40)
  1                             6                 .15 (= 6/40)
  2                             18                .45 (etc.)
  3                             10                .25
  4                             2                 .05
  Total                         40                1.00
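The relative frequency method amounts to one division per outcome. The sketch below uses the Lucas Tool Rental frequencies from the table; the variable names are illustrative.

```python
# Number of days (out of the last 40) on which each rental count occurred.
days_by_rentals = {0: 4, 1: 6, 2: 18, 3: 10, 4: 2}

total_days = sum(days_by_rentals.values())

# Relative frequency: each frequency divided by the total frequency.
probabilities = {
    rented: days / total_days for rented, days in days_by_rentals.items()
}

for rented, p in probabilities.items():
    print(rented, f"{p:.2f}")
```

The assigned probabilities necessarily sum to 1 because every one of the 40 days is counted exactly once.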
50Subjective Method
- When economic conditions and a company's circumstances change rapidly, it might be inappropriate to assign probabilities based solely on historical data.
- We can use any data available as well as our experience and intuition, but ultimately a probability value should express our degree of belief that the experimental outcome will occur.
- The best probability estimates often are obtained by combining the estimates from the classical or relative frequency approach with the subjective estimates.
51Example Bradley Investments
- Applying the subjective method, an analyst made the following probability assignments.
  Exper. Outcome (Markley, Collins)    Net Gain/Loss    Probability
  ( 10,  8)                            $18,000 Gain     .20
  ( 10, -2)                            $8,000 Gain      .08
  (  5,  8)                            $13,000 Gain     .16
  (  5, -2)                            $3,000 Gain      .26
  (  0,  8)                            $8,000 Gain      .10
  (  0, -2)                            $2,000 Loss      .12
  (-20,  8)                            $12,000 Loss     .02
  (-20, -2)                            $22,000 Loss     .06
52Events and Their Probability
- An event is a collection of sample points.
- The probability of any event is equal to the sum
of the probabilities of the sample points in the
event.
53Example Bradley Investments
- Events and Their Probabilities
- Event M = Markley Oil Profitable
- M = {(10, 8), (10, -2), (5, 8), (5, -2)}
- P(M) = P(10, 8) + P(10, -2) + P(5, 8) + P(5, -2)
- = .20 + .08 + .16 + .26
- = .70
- Event C = Collins Mining Profitable
- C = {(10, 8), (5, 8), (0, 8), (-20, 8)}
- P(C) = .48 (found using the same logic)
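Summing sample-point probabilities over an event can be sketched as follows. The dictionary holds the analyst's subjective assignments from the earlier Bradley Investments slide, keyed by (Markley, Collins) gain in $000; the helper name `event_probability` is illustrative.

```python
# Subjective probabilities for each (Markley, Collins) outcome, in $000.
probabilities = {
    (10, 8): 0.20, (10, -2): 0.08,
    (5, 8): 0.16, (5, -2): 0.26,
    (0, 8): 0.10, (0, -2): 0.12,
    (-20, 8): 0.02, (-20, -2): 0.06,
}


def event_probability(event):
    """Probability of an event: the sum over its sample points."""
    return sum(probabilities[point] for point in event)


# Event M: Markley Oil profitable (first component positive).
M = {point for point in probabilities if point[0] > 0}
# Event C: Collins Mining profitable (second component positive).
C = {point for point in probabilities if point[1] > 0}

print(round(event_probability(M), 2), round(event_probability(C), 2))
```

This reproduces P(M) = .70 and confirms the slide's P(C) = .48 as .20 + .16 + .10 + .02.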
54Basic Concepts of Probability
- Complement of an Event
- Union of Two Events
- Intersection of Two Events
- Mutually Exclusive Events
55Complement of an Event
- The complement of event A is defined to be the event consisting of all sample points that are not in A.
- The complement of A is denoted by A^c.
- The Venn diagram below illustrates the concept of a complement.
[Venn diagram: sample space S containing event A and its complement A^c]
56Union of Two Events
- The union of events A and B is the event containing all sample points that are in A or B or both.
- The union is denoted by A ∪ B.
- The union of A and B is illustrated below.
- P(A ∪ B) = the probability of the occurrence of Event A or Event B.
[Venn diagram: events A and B within sample space S]
57Example Bradley Investments
- Union of Two Events
- Event M = Markley Oil Profitable
- Event C = Collins Mining Profitable
- M ∪ C = Markley Oil Profitable or Collins Mining Profitable
- M ∪ C = {(10, 8), (10, -2), (5, 8), (5, -2), (0, 8), (-20, 8)}
- P(M ∪ C) = P(10, 8) + P(10, -2) + P(5, 8) + P(5, -2) + P(0, 8) + P(-20, 8)
- = .20 + .08 + .16 + .26 + .10 + .02
- = .82
58Intersection of Two Events
- The intersection of events A and B is the set of all sample points that are in both A and B.
- The intersection is denoted by A ∩ B.
- The intersection of A and B is the area of overlap in the illustration below.
- P(A ∩ B) = the probability of the occurrence of Event A and Event B.
[Venn diagram: events A and B within sample space S, with the overlap labeled Intersection]
59Example Bradley Investments
- Intersection of Two Events
- Event M = Markley Oil Profitable
- Event C = Collins Mining Profitable
- M ∩ C = Markley Oil Profitable and Collins Mining Profitable
- M ∩ C = {(10, 8), (5, 8)}
- P(M ∩ C) = P(10, 8) + P(5, 8)
- = .20 + .16
- = .36
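Python sets map directly onto unions and intersections of events, so the two Bradley Investments results can be checked as below. The final line also checks the general addition law P(M ∪ C) = P(M) + P(C) - P(M ∩ C); that law is a standard result that these slides do not state, so it is included here only as a cross-check.

```python
# Subjective probabilities for each (Markley, Collins) outcome, in $000.
probabilities = {
    (10, 8): 0.20, (10, -2): 0.08,
    (5, 8): 0.16, (5, -2): 0.26,
    (0, 8): 0.10, (0, -2): 0.12,
    (-20, 8): 0.02, (-20, -2): 0.06,
}


def event_probability(event):
    """Probability of an event: the sum over its sample points."""
    return sum(probabilities[point] for point in event)


M = {point for point in probabilities if point[0] > 0}   # Markley profitable
C = {point for point in probabilities if point[1] > 0}   # Collins profitable

p_union = event_probability(M | C)          # M or C (or both)
p_intersection = event_probability(M & C)   # M and C

# Cross-check with the general addition law (not stated on the slides).
p_union_by_law = event_probability(M) + event_probability(C) - p_intersection

print(round(p_union, 2), round(p_intersection, 2), round(p_union_by_law, 2))
```

Subtracting the intersection avoids counting the points (10, 8) and (5, 8) twice.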
60Mutually Exclusive Events
- Two events are said to be mutually exclusive if the events have no sample points in common. That is, two events are mutually exclusive if, when one event occurs, the other cannot occur.
- Addition Law for Mutually Exclusive Events
- P(A ∪ B) = P(A) + P(B)
[Venn diagram: non-overlapping events A and B within sample space S]
61Roll the Dice
- If you roll 2 dice, what's the probability of rolling a 7 or 11?
62[Table: Die 1 by Die 2]
63[Table: Die 1 by Die 2]
- P(7) = 6/36 = .167
- P(11) = 2/36 = .056
64Roll the Dice
- If you roll 2 dice, what's the probability of rolling a 7 or 11?
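The dice question can be answered by enumerating all 36 equally likely outcomes (the classical method) and then applying the addition law for mutually exclusive events, since a single roll cannot total both 7 and 11. A minimal sketch:

```python
from fractions import Fraction
from itertools import product

# All ordered (die 1, die 2) outcomes; each is equally likely.
outcomes = list(product(range(1, 7), repeat=2))


def probability_of_sum(target):
    """Classical probability that the two dice total `target`."""
    favorable = sum(1 for d1, d2 in outcomes if d1 + d2 == target)
    return Fraction(favorable, len(outcomes))


p_7 = probability_of_sum(7)     # 6/36
p_11 = probability_of_sum(11)   # 2/36

# The two events share no outcomes, so their probabilities simply add.
p_7_or_11 = p_7 + p_11

print(p_7_or_11, round(float(p_7_or_11), 3))
```

The result is 8/36, or about .222.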
65 Continuous Probability Distributions
- Normal Probability Distribution
[Figure: normal density curve f(x) versus x, centered at µ]
66Continuous Probability Distributions
- A continuous random variable can assume any value in an interval on the real line or in a collection of intervals.
- It is not possible to talk about the probability of the random variable assuming a particular value.
- Instead, we talk about the probability of the random variable assuming a value within a given interval.
- The probability of the random variable assuming a value within some given interval from x1 to x2 is defined to be the area under the graph of the probability density function between x1 and x2.
67Normal Probability Distribution
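- Graph of the Normal Probability Density Function
[Figure: normal density curve with x1 and x2 marked on the x-axis]
68Normal Probability Distribution
- Graph of the Normal Probability Density Function
[Figure: normal density curve with x1 and x2 marked on the x-axis]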
69Normal Probability Distribution
- Characteristics of the Normal Probability Distribution
- The shape of the normal curve is often illustrated as a bell-shaped curve.
- Two parameters, µ (mean) and σ (standard deviation), determine the location and shape of the distribution.
- The highest point on the normal curve is at the mean, which is also the median and mode.
- The mean can be any numerical value: negative, zero, or positive.
- continued
70Normal Probability Distribution
- Characteristics of the Normal Probability Distribution
- The normal curve is symmetric.
- The standard deviation determines the width of the curve: larger values result in wider, flatter curves.
- The total area under the curve is 1 (.5 to the left of the mean and .5 to the right).
- Probabilities for the normal random variable are given by areas under the curve.
71Normal Probability Distribution
- % of Values in Some Commonly Used Intervals
- 68.26% of values of a normal random variable are within +/- 1 standard deviation of its mean.
- 95.44% of values of a normal random variable are within +/- 2 standard deviations of its mean.
- 99.72% of values of a normal random variable are within +/- 3 standard deviations of its mean.
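72[Figure: normal curve with points x1 to x6 and z-axis ticks from -3 to 3; area = .6826]
73[Figure: normal curve with points x1 to x6 and z-axis ticks from -3 to 3; area = .9544]
74[Figure: normal curve with points x1 to x6 and z-axis ticks from -3 to 3; area = .9972]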
75Standard Normal Probability Distribution
- A random variable that has a normal distribution with a mean of zero and a standard deviation of one is said to have a standard normal probability distribution.
- The letter z is commonly used to designate this normal random variable.
- Converting to the Standard Normal Distribution: z = (x - µ)/σ
- We can think of z as a measure of the number of standard deviations x is from µ.
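76Standard Normal Probability Distribution
[Figure: standard normal curve, µ = 0, σ = 1; total area P = 1.0]
77Standard Normal Probability Distribution
[Figure: standard normal curve, µ = 0, σ = 1; P = .5 on each side of the mean]
78Standard Normal Probability Distribution
[Figure: standard normal density f(z) versus z; µ = 0, σ = 1]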
79- Using the Standard Normal Probability Table
(Table 1)
80- Using the Standard Normal Probability Table
(Table 1)
81- Using the Standard Normal Probability Table
(Table 1)
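82Standard Normal Probability Distribution
[Figure: standard normal density f(z) versus z; µ = 0, σ = 1]
83Standard Normal Probability Distribution
[Figure: standard normal density f(z) versus z; µ = 0, σ = 1]
84Standard Normal Probability Distribution
[Figure: standard normal density f(z) versus z; µ = 0, σ = 1]
85Standard Normal Probability Distribution
[Figure: standard normal density f(z) versus z; µ = 0, σ = 1]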
86Using the Standard Normal Probability Table
87Example Pep Zone
- Standard Normal Probability Distribution
- Pep Zone sells auto parts and supplies including a popular multi-grade motor oil. When the stock of this oil drops to 20 gallons, a replenishment order is placed.
- The store manager is concerned that sales are being lost due to stockouts while waiting for an order. It has been determined that leadtime demand is normally distributed with a mean of 15 gallons and a standard deviation of 6 gallons.
- The manager would like to know the probability of a stockout, P(x > 20).
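88Example Pep Zone
[Figure: normal curve, µ = 15, σ = 6; P(x > 20) marked at x = 20]
89Example Pep Zone
[Figure: normal curve, µ = 15, σ = 6; x = 20 shown above a z-axis with z = 0 at the mean]
90Example Pep Zone
[Figure: normal curve, µ = 15, σ = 6; x = 20 corresponds to z = .83]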
91Example Pep Zone
- Standard Normal Probability Distribution
- The Standard Normal table shows an area of .7967 for the region below z = .83. The shaded tail area is 1.00 - .7967 = .2033. The probability of a stockout is .2033.
[Figure: standard normal curve; area = .7967 below z = .83, shaded tail area = 1.00 - .7967 = .2033]
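The stockout probability can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The sketch first rounds z to two decimals to mimic a table lookup, then computes the tail without rounding; the small difference between the two answers comes only from that rounding.

```python
from statistics import NormalDist

mu, sigma = 15, 6      # leadtime demand: mean and standard deviation (gallons)
reorder_point = 20

# Convert x = 20 to a z value: z = (x - mu) / sigma.
z = (reorder_point - mu) / sigma
z_table = round(z, 2)                    # .83, as looked up in the table

# Table-style answer: area below z = .83, then the upper tail.
area_below = NormalDist().cdf(z_table)
p_stockout_table = 1 - area_below

# Same probability without rounding z.
p_stockout_exact = 1 - NormalDist(mu, sigma).cdf(reorder_point)

print(round(area_below, 4), round(p_stockout_table, 4),
      round(p_stockout_exact, 4))
```

The table-style route gives roughly .7967 and .2033, matching the slide; the unrounded tail is about .2023.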
92Example Pep Zone
- Standard Normal Probability Distribution
- If the manager of Pep Zone wants the probability of a stockout to be no more than .05, what should the reorder point be?
- Let z.05 represent the z value cutting the .05 tail area.
[Figure: standard normal curve; area = .95 below z.05, tail area = .05 above it]
93Example Pep Zone
- Using the Standard Normal Probability Table
- We now look up the .9500 area in the Standard Normal Probability table to find the corresponding z.05 value.
- z.05 = 1.645 is a reasonable estimate.
94Example Pep Zone
- Standard Normal Probability Distribution
- If the manager of Pep Zone wants the probability of a stockout to be no more than .05, what should the reorder point be?
[Figure: standard normal curve; area = .95 below z = 1.645, tail area = .05 above it]
95Example Pep Zone
- Standard Normal Probability Distribution
- The corresponding value of x is given by x = µ + z.05σ = 15 + 1.645(6) = 24.87
- A reorder point of 24.87 gallons will place the probability of a stockout during leadtime at .05. Perhaps Pep Zone should set the reorder point at 25 gallons to keep the probability under .05.
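The reorder-point calculation runs the table lookup in reverse: find the z value with .95 of the area below it, then convert back to gallons with x = µ + zσ. A minimal sketch using `statistics.NormalDist` (Python 3.8 or later), with illustrative variable names:

```python
from statistics import NormalDist

mu, sigma = 15, 6          # leadtime demand: mean and standard deviation (gallons)
max_stockout_prob = 0.05

# z value that leaves .05 in the upper tail (about 1.645).
z_05 = NormalDist().inv_cdf(1 - max_stockout_prob)

# Convert back from z to gallons: x = mu + z * sigma.
reorder_point = mu + z_05 * sigma

# Stockout probability if the reorder point is rounded up to 25 gallons.
p_stockout_at_25 = 1 - NormalDist(mu, sigma).cdf(25)

print(round(z_05, 3), round(reorder_point, 2), round(p_stockout_at_25, 4))
```

Rounding the reorder point up to 25 gallons pushes the stockout probability slightly below .05, which is the reason for the slide's suggestion.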
96End of Review