Title: Business 260: Managerial Decision Analysis
1. Business 260 Managerial Decision Analysis
Professor David Mease
Lecture 1 Agenda
1) Course web page
2) Greensheet
3) Numerical Descriptive Measures (Stats Book P. 107)
4) Simple Linear Regression (Stats Book P. 387)
2. Course web page
http://www.cob.sjsu.edu/mease_d/bus260
It is linked from my San Jose State web page:
http://www.cob.sjsu.edu/mease_d/
It is also linked from my personal page, which is easily found by querying "David Mease" or simply "Mease" on Google
3. Greensheet
You should have a hard copy of the greensheet. It can also be found on the course web page:
http://www.cob.sjsu.edu/mease_d/bus260
4. Statistics for Managers Using Microsoft Excel, 4th Edition
Numerical Descriptive Measures (P. 107)
5. Chapter Topics
- Measures of central tendency, variation, and shape
  - Mean, median, mode, geometric mean
  - Quartiles
  - Range, interquartile range, variance and standard deviation, coefficient of variation
  - Symmetric and skewed distributions
- Population summary measures
  - Mean, variance, and standard deviation
  - The empirical rule
- Five number summary and box-and-whisker plots
- Coefficient of correlation
6. Summary Measures
Describing Data Numerically
- Central Tendency: Arithmetic Mean, Median, Mode, Geometric Mean
- Quartiles
- Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
- Shape: Skewness
7. In class exercise 1
A sample of n = 9 runners were asked how many miles they ran last week. Here is the data:
43 17 21 3 32 37 10 26 28
Describe the center of this data. (What are the mean, median and mode?)
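The three measures of center can also be checked outside Excel. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only, and the "no mode" check is an assumption about how to treat data in which no value repeats.

```python
from statistics import mean, median, multimode

# Miles run last week by the n = 9 runners (data from the exercise).
miles = [43, 17, 21, 3, 32, 37, 10, 26, 28]

sample_mean = mean(miles)      # sum of the values divided by n
sample_median = median(miles)  # middle value of the ranked data
modes = multimode(miles)       # every value tied for most frequent

# Assumption: if all distinct values are tied, we report no mode.
has_mode = len(modes) < len(set(miles))

print(f"mean   = {sample_mean:.2f}")
print(f"median = {sample_median}")
print(f"mode   = {modes if has_mode else 'none (no value repeats)'}")
```

With an odd number of observations the median is simply the middle ranked value, which is why no averaging is needed here.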
8. In class exercise 2
How would your answer change for ICE 1 if the first runner actually ran 143 miles instead of 43? Here is the data:
143 17 21 3 32 37 10 26 28
10. In class exercise 3
How would your answer change for ICE 1 if there were a 10th runner who also ran 17 miles? Here is the data:
43 17 21 3 32 37 10 26 28 17
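To see how each measure of center reacts in ICE 2 and ICE 3, the three data sets can be compared side by side. This is a rough sketch with Python's `statistics` module; the list names are made up for illustration. Note that `multimode` returns every value when nothing repeats, which here should be read as "no mode".

```python
from statistics import mean, median, multimode

original = [43, 17, 21, 3, 32, 37, 10, 26, 28]        # ICE 1
with_outlier = [143, 17, 21, 3, 32, 37, 10, 26, 28]   # ICE 2: 43 becomes 143
with_tenth = original + [17]                          # ICE 3: extra runner at 17

cases = [("ICE 1", original), ("ICE 2", with_outlier), ("ICE 3", with_tenth)]
for label, data in cases:
    print(
        f"{label}: mean={mean(data):.2f} "
        f"median={median(data)} most frequent={multimode(data)}"
    )
```

The comparison suggests the usual lesson: a single extreme value pulls the mean but leaves the median where it was, and a repeated value is what creates a mode.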
12. Quartiles
- Quartiles split the ranked data into 4 segments with an equal number of values per segment
[Figure: ranked data divided into four segments of 25% each by Q1, Q2 and Q3]
- The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
- Q2 is the same as the median (50% are smaller, 50% are larger)
- Only 25% of the observations are greater than the third quartile
13. In class exercise 4
Compute the quartiles for the n = 9 runners.
43 17 21 3 32 37 10 26 28
14. In class exercise 5
Compute the quartiles for the n = 10 runners.
43 17 21 3 32 37 10 26 28 17
15. Quartile Formulas
Find a quartile by determining the value in the appropriate position in the ranked data, where
First quartile position: Q1 at (n + 1)/4
Second quartile position: Q2 at (n + 1)/2 (the median)
Third quartile position: Q3 at 3(n + 1)/4
where n is the number of observed values
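The position formulas can be sketched in code as follows. The slide does not say what to do when the position is not a whole number, so this sketch assumes one common convention: a position ending in .5 averages the two neighbouring ranked values, and any other fractional position is rounded to the nearest whole position. The function name `quartile` is hypothetical.

```python
def quartile(values, k):
    """Return Q_k (k = 1, 2 or 3) from the position k(n + 1)/4 in the ranked data.

    Assumed convention for fractional positions (not stated on the slide):
    .5 averages the two neighbours; .25 and .75 round to the nearest position.
    """
    if k not in (1, 2, 3):
        raise ValueError("k must be 1, 2 or 3")
    ranked = sorted(values)
    n = len(ranked)
    if n < 2:
        raise ValueError("need at least two observations")

    # position = whole + remainder / 4, kept in integer arithmetic
    whole, remainder = divmod(k * (n + 1), 4)
    if remainder == 0:      # position is a whole number
        return ranked[whole - 1]
    if remainder == 2:      # position ends in .5: average the neighbours
        return (ranked[whole - 1] + ranked[whole]) / 2
    # remainder 1 (.25) rounds down, remainder 3 (.75) rounds up
    position = whole if remainder == 1 else whole + 1
    return ranked[position - 1]


nine = [43, 17, 21, 3, 32, 37, 10, 26, 28]
ten = nine + [17]
print([quartile(nine, k) for k in (1, 2, 3)])
print([quartile(ten, k) for k in (1, 2, 3)])
```

Other software may interpolate instead of rounding, so small differences in Q1 and Q3 between tools are to be expected.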
16. In class exercise 6
Redo ICE 4 and ICE 5 using these formulas and check that the answers are the same.
17. Five Number Summary
- 1) Minimum
- 2) Q1
- 3) Q2 (median)
- 4) Q3
- 5) Maximum
- A plot of the 5 number summary like the one below is called a box-and-whisker plot
[Figure: box-and-whisker plot marking the minimum, Q1, median (Q2), Q3 and maximum, with 25% of the values in each of the four segments]
18. In class exercise 7
Compute the five number summary for the sample of n = 9 runners and draw the box-and-whisker plot.
43 17 21 3 32 37 10 26 28
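As a cross-check on the hand calculation, the five number summary can be assembled with Python's `statistics.quantiles`. This is only a sketch: the `"exclusive"` method places the quartiles at positions k(n + 1)/4 and interpolates linearly for fractional positions, which may differ slightly from a rounding convention for other sample sizes.

```python
from statistics import quantiles

miles = [43, 17, 21, 3, 32, 37, 10, 26, 28]

# Quartiles at positions k(n + 1)/4, with linear interpolation in between.
q1, q2, q3 = quantiles(miles, n=4, method="exclusive")

# (minimum, Q1, median, Q3, maximum)
five_number_summary = (min(miles), q1, q2, q3, max(miles))
print(five_number_summary)
```

These five values are exactly what the box-and-whisker plot draws: the box runs from Q1 to Q3 with a line at the median, and the whiskers reach the minimum and maximum.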
20. Measures of Variation
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
- Measures of variation give information on the spread or variability of the data values.
[Figure: two distributions with the same center but different variation]
21. Range
- Simplest measure of variation
- Difference between the largest and the smallest observations
- Disadvantages: ignores the distribution of the data and is sensitive to outliers
Range = Xlargest - Xsmallest
22. In class exercise 8
Compute the range for the sample of n = 9 runners.
43 17 21 3 32 37 10 26 28
23. Interquartile Range
- Can eliminate some outlier problems by using the interquartile range
- Eliminate some high- and low-valued observations and calculate the range from the remaining values
- Interquartile range = 3rd quartile - 1st quartile = Q3 - Q1
24. In class exercise 9
Compute the IQR (interquartile range) for the sample of n = 9 runners.
43 17 21 3 32 37 10 26 28
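The range and the interquartile range can be compared on the runner data with and without an extreme value. The sketch below assumes quartiles at positions k(n + 1)/4 (Python's `"exclusive"` method); the helper name `spread` is hypothetical.

```python
from statistics import quantiles

def spread(data):
    """Return (range, interquartile range) for a list of numbers."""
    q1, _, q3 = quantiles(data, n=4, method="exclusive")
    return max(data) - min(data), q3 - q1   # Xlargest - Xsmallest, Q3 - Q1

miles = [43, 17, 21, 3, 32, 37, 10, 26, 28]
with_outlier = [143, 17, 21, 3, 32, 37, 10, 26, 28]   # 43 replaced by 143

print("range, IQR:", spread(miles))
print("range, IQR with outlier:", spread(with_outlier))
```

The outlier changes the range a great deal but leaves the IQR alone, which illustrates why the IQR is described as less sensitive to outliers.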
25. Variance
- Average (approximately) of squared deviations of values from the mean
- Advantages: each value in the data set is used in the calculation, and values far from the mean are given extra weight (because they're squared)
- Sample variance:
  $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
  where $\bar{X}$ = arithmetic mean, $n$ = sample size, $X_i$ = ith value of the variable X
26. In class exercise 10
Compute the variance for this sample of n = 5.
10 20 30 40 50
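The hand calculation can be followed step by step in code. This sketch assumes the usual sample variance, which divides the sum of squared deviations by n - 1 rather than n (the "approximately an average" on the variance slide); the variable names are illustrative.

```python
values = [10, 20, 30, 40, 50]

n = len(values)
x_bar = sum(values) / n                                   # arithmetic mean
squared_deviations = [(x - x_bar) ** 2 for x in values]   # (Xi - mean)^2
sample_variance = sum(squared_deviations) / (n - 1)       # divide by n - 1

print("mean:", x_bar)
print("squared deviations:", squared_deviations)
print("sample variance:", sample_variance)
```

The values farthest from the mean (10 and 50) contribute the most to the total, which is the "extra weight" that squaring gives them.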
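27. Standard Deviation
- Most commonly used measure of variation in business applications
- It is simply the square root of the variance
- Shows variation about the mean
- Has the same units as the original data
- Sample standard deviation:
  $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
  where $\bar{X}$ is the arithmetic mean and $n$ is the sample size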
28. In class exercise 11
Compute the standard deviation for this sample of n = 5.
10 20 30 40 50
29. Coefficient of Variation
- Measures relative variation
- Always in percentage (%)
- Shows variation relative to mean
- Can be used to compare two or more sets of data measured in different units
30. In class exercise 12
Compute the coefficient of variation for this sample of n = 5.
10 20 30 40 50
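The standard deviation and coefficient of variation for this sample can be sketched as below. The formula itself is not in the slide text, so this assumes the usual definition CV = (S / mean) x 100%, where S is the sample standard deviation.

```python
from statistics import mean, stdev

values = [10, 20, 30, 40, 50]

s = stdev(values)              # sample standard deviation (n - 1 divisor)
cv = s / mean(values) * 100    # assumed definition: (S / mean) * 100, in percent

print(f"standard deviation = {s:.2f}")
print(f"coefficient of variation = {cv:.1f}%")
```

Because the units cancel in S / mean, the result is a pure percentage, which is what allows data sets measured in different units to be compared.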
32. Using Excel
Many of these are available by doing Insert > Function > Statistical
Examples:
sample mean: AVERAGE
minimum: MIN
maximum: MAX
median: MEDIAN
sample standard deviation: STDEV
sample variance: VAR
quartiles: QUARTILE (doesn't really work)
mode: MODE (doesn't really work)
33. In class exercise 13
Check the mean and median for the sample of n = 9 runners using Excel.
43 17 21 3 32 37 10 26 28
34. In class exercise 14
Check the variance and standard deviation for the sample of n = 5 using Excel.
10 20 30 40 50
36. Shape of a Distribution
- Describes how data are distributed
- Measures of shape
  - Symmetric or skewed
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
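The mean versus median comparison can be turned into a rough check. This is a sketch of the rule of thumb only, not a formal test of skewness; the function name `describe_shape` and its `tolerance` parameter are hypothetical.

```python
from statistics import mean, median

def describe_shape(data, tolerance=0.0):
    """Classify shape by comparing the mean with the median.

    `tolerance` is an illustrative knob: gaps no larger than it count as
    symmetric, since real data rarely has mean exactly equal to median.
    """
    gap = mean(data) - median(data)
    if gap > tolerance:
        return "right-skewed"   # median < mean
    if gap < -tolerance:
        return "left-skewed"    # mean < median
    return "symmetric"          # mean = median

print(describe_shape([143, 17, 21, 3, 32, 37, 10, 26, 28]))
print(describe_shape([10, 20, 30, 40, 50]))
```

A histogram or box-and-whisker plot remains the better way to judge shape; the comparison above is a quick numerical confirmation.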
37. Distribution Shape and Box-and-Whisker Plot
[Figure: box-and-whisker plots, each marked with Q1, Q2 and Q3, for left-skewed, symmetric and right-skewed distributions]
38. In class exercise 15
Below is the histogram for the 1500 California house prices at
http://www.cob.sjsu.edu/mease_d/bus260/houses.xls
[Figure: histogram of the house prices]
How would you describe the shape of the data based on the histogram? Confirm this by
A) comparing the mean and median
B) making the box-and-whisker plot
39. The Empirical Rule
- If the data distribution is close to being bell-shaped, then the interval
  - $\mu \pm 1\sigma$ contains about 68% of the values in the population or the sample
40. The Empirical Rule
  - $\mu \pm 2\sigma$ contains about 95% of the values in the population or the sample
  - $\mu \pm 3\sigma$ contains about 99.7% of the values in the population or the sample
41. In class exercise 16
Give the empirical rule for a population for which the mean is 100 and the standard deviation is 10.
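The empirical rule intervals are the mean plus or minus 1, 2 and 3 standard deviations. A minimal sketch for any mean and standard deviation follows; the function name `empirical_rule` is hypothetical.

```python
def empirical_rule(mu, sigma):
    """Map each approximate percentage to its interval (mu - k*sigma, mu + k*sigma)."""
    coverage = {1: 68, 2: 95, 3: 99.7}   # k standard deviations -> approx. percent
    return {pct: (mu - k * sigma, mu + k * sigma) for k, pct in coverage.items()}

for pct, (low, high) in empirical_rule(100, 10).items():
    print(f"about {pct}% of the values lie between {low} and {high}")
```

The percentages are approximations that apply only when the distribution is close to bell-shaped.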
42. In class exercise 17
Compare the empirical rule to the observed percentages for the 1500 house prices (houses.xls).
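Observed percentages can be computed by counting how many values fall within k standard deviations of the mean. The sketch below uses the nine runners from the earlier exercises as stand-in data, not the house prices, which would need to be loaded from houses.xls; the function name `share_within` is hypothetical.

```python
from statistics import mean, stdev

def share_within(data, k):
    """Fraction of values within k sample standard deviations of the mean."""
    centre, sd = mean(data), stdev(data)
    low, high = centre - k * sd, centre + k * sd
    return sum(low <= x <= high for x in data) / len(data)

# Stand-in data only: the runner sample, NOT the 1500 house prices.
sample = [43, 17, 21, 3, 32, 37, 10, 26, 28]
for k, expected in [(1, 68), (2, 95), (3, 99.7)]:
    observed = share_within(sample, k)
    print(f"within {k} sd: {observed:.1%} observed vs about {expected}% expected")
```

Large gaps between the observed and expected percentages suggest the data are not close to bell-shaped, which is the point of the comparison.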
43. Coefficient of Correlation (r)
[Figure: four scatter diagrams of Y against X, with r = -1, r = -.6, r = 1 and r = .3]
To understand the coefficient of correlation, it helps to start with scatter diagrams
44. Scatter Diagrams
- Scatter Diagrams are used for bivariate numerical data
  - Bivariate data consists of paired observations taken from two numerical variables
- The Scatter Diagram:
  - one variable is measured on the vertical axis and the other variable is measured on the horizontal axis
45. Scatter Diagram Example
Volume per day    Cost per day
23                125
26                140
29                146
33                160
38                167
42                170
50                188
55                195
60                200
46. Scatter Diagrams in Excel
1) [Figure: Excel screenshot]
2) Select the XY (Scatter) option, then click Next
3) The data range is the y values, and the x values go under the Series tab. Important: Don't include column names
47. In class exercise 18
The file http://www.cob.sjsu.edu/mease_d/football.xls gives the total number of wins for each of the 117 Division 1A college football teams for the 2003 and 2004 seasons. Use Excel to make a scatter diagram for this data. Put 2003 wins on the x-axis.
49. Coefficient of Correlation (r)
- Measures the relative strength and direction of the linear relationship between two variables
- Is equal to the square root of R-squared but will be negative if the relationship is negative
- I will NOT make you compute this by hand, but this is the formula if you are curious:
  $r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$
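For the curious, the computation can be sketched in a few lines. This assumes the standard sample definition of r (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations) and uses the volume and cost data from the scatter diagram example; the function name `correlation` is hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample coefficient of correlation r for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sxy / sqrt(sxx * syy)

volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]          # volume per day
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]   # cost per day
r = correlation(volume, cost)
print(f"r = {r:.3f}")
```

Because the deviations are divided by quantities in the same units, r is unit free, and a value near 1 here reflects the strong upward pattern in the example data.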
50. Features of Correlation Coefficient, r
- Unit free
- Ranges between -1 and 1
- The closer to -1, the stronger the negative linear relationship
- The closer to 1, the stronger the positive linear relationship
- The closer to 0, the weaker any linear relationship
51. In class exercise 19
Match each plot with its correct coefficient of correlation.
Choices: r = -3.20, r = -0.98, r = 0.86, r = 0.95, r = 1.20, r = -0.96, r = -0.40
[Figure: five scatter diagrams labeled A) through E)]
52. Statistics for Managers Using Microsoft Excel, 4th Edition
Simple Linear Regression (P. 387)
53. The Least Squares Regression Line
- Described on pages 387-398
- It is the line that fits the data the best as determined by minimizing squared vertical differences
- The coefficient of correlation (r) measures the strength and direction of the linear relationship (positive = up, negative = down)
- R-squared also measures the strength of the linear relationship, but not the direction
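Outside Excel, the least squares line can be sketched with the standard closed-form solution: slope = Sxy / Sxx and intercept = mean(y) - slope x mean(x), which is what minimizing the squared vertical differences leads to. The example uses the volume and cost data from the scatter diagram example; the function name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept, r_squared) of the least squares regression line."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx                    # change in y per one-unit change in x
    intercept = y_bar - slope * x_bar    # the line passes through (x_bar, y_bar)
    r_squared = sxy ** 2 / (sxx * syy)   # square of r, so the sign is lost
    return slope, intercept, r_squared

volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]          # x: volume per day
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]   # y: cost per day
slope, intercept, r_squared = least_squares_line(volume, cost)
print(f"cost = {slope:.3f} * volume + {intercept:.3f}, R-squared = {r_squared:.3f}")
```

A prediction is then just `slope * x + intercept` for the x value of interest, and the sign of the slope supplies the direction that R-squared alone cannot.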
54. Adding the Least Squares Regression Line Using Excel
1) [Figure: Excel screenshot]
2) From the Chart menu select Add Trendline
3) Choose the first choice (Linear) and press OK
4) The line should now appear on your scatter diagram. Double click on the line, then under the Options tab check the last two boxes.
58. In class exercise 20
A) Graph the least squares regression line for the football data on the scatter diagram using Excel.
B) Give the equation of the least squares regression line using Excel.
C) What is the slope of the least squares regression line?
D) Interpret the slope of the least squares regression line.
E) What is the coefficient of correlation?
F) What is the value of R-squared?
G) Use the least squares regression line to predict the number of 2004 wins for a team that won 12 games in 2003.