Title: Chapter Three
1Chapter Three
- Numerical Descriptive Measures
4Commonly used Descriptive Measures
- Measures of Central Tendency
- Measures of Variation
- Measures of Position
- Measures of Shape
5Measures of Central Tendency
- Purpose: to determine the centre of the data values.
6Measures of Central Tendency Answer questions
- Where is the middle of my data?
- Mean, Median, Midrange
- Which data value occurs most often?
- Mode
7The Mean
- The sample mean is denoted by x-bar
- The population mean is denoted by µ (mu)
- x = individual data values
- x-bar = Σx / n, where n is the sample size
- µ = Σx / N, where N is the population size
8Example
- The following are accident data for a 5 month
period - 6, 9, 7, 23, 5
9To calculate the average number of accidents per
month
- x-bar = Σx / n
- x-bar = (6 + 9 + 7 + 23 + 5) / 5 = 50 / 5
- x-bar = 10.0
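The calculation above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and `statistics.mean` is the standard-library equivalent of the manual formula.

```python
from statistics import mean

# Accident counts for the five-month period from the example above.
accidents = [6, 9, 7, 23, 5]

# x-bar = (sum of x) / n, written out by hand.
x_bar_manual = sum(accidents) / len(accidents)

# The same quantity from the standard library.
x_bar_library = mean(accidents)

print(x_bar_manual, x_bar_library)
```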
10Statistic
- What is the average person's monetary value to society?
11The Median
- is the centre value in a data set when the data
are arranged from smallest to
largest.
12What do we call this ordering process?
13By arranging the data in an Ordered Array
- 5, 6, 7, 9, 23
- With an odd number of observations, the value that has an equal number of items to the right and to the left is the median.
- Md = 7
14To calculate the median with an even number of
observations,
- average the two centre values of the ordered set.
- Example: with the ordered array 5, 6, 7, 9
- Md = (6 + 7) / 2 = 6.5
15If there is an odd number of observations
- the median is the value at position (n + 1) / 2 of the ordered array
- where n = number of observations
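The two median rules (middle value for an odd count, average of the two centre values for an even count) can be sketched as follows. The function name `median_by_position` is made up for illustration; `statistics.median` from the standard library gives the same results.

```python
from statistics import median


def median_by_position(values):
    """Median via the ordered array, following the two rules in the slides."""
    ordered = sorted(values)          # the ordered array
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the value at position (n + 1) / 2 (1-based).
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the two centre values.
    upper = n // 2
    return (ordered[upper - 1] + ordered[upper]) / 2


odd_case = [6, 9, 7, 23, 5]   # ordered: 5, 6, 7, 9, 23
even_case = [5, 6, 7, 9]

print(median_by_position(odd_case), median_by_position(even_case))
```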
16Remember
- Median describes the centrally placed location of
a value relative to the rest of the data.
17Question
- Is the mean or the median more sensitive to extreme values (outliers)?
- Explain.
18The mean is affected by every value.
- The median is unaffected by extreme values.
19The mean is pulled toward extreme values.
- The median does not use all data information
available.
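A small experiment makes the contrast concrete. The sketch below takes the accident data and replaces the largest value, 23, with 230. The value 230 is hypothetical and chosen only to exaggerate the outlier.

```python
from statistics import mean, median

original = [6, 9, 7, 23, 5]
# Hypothetical variant: the outlier 23 is made ten times larger.
exaggerated = [6, 9, 7, 230, 5]

mean_before, mean_after = mean(original), mean(exaggerated)
median_before, median_after = median(original), median(exaggerated)

# The mean follows the outlier; the median stays where it was.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```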
20Question
- When dealing with data that are likely to contain
outliers (personal income, ages, or prices of
houses), would the Mean or Median be preferred as
the measure of central tendency? Why?
21Think of the Median
- as providing a more
- typical or representative
- value of the situation.
22The Mode (Mo)
- The value that occurs most frequently.
23Questions?
- Can there be more than one mode?
- Is the mode affected by extreme values?
- For continuous variables, is it possible that a mode does not exist? Explain.
- Is the mode always a measure of central tendency?
24Give an example of when the mode may provide more
useful information than the mean or the median.
25Example
- From a purchaser's standpoint, the most common hat or jeans size is what you would like to know, not the average hat or jeans size.
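As a rough illustration of the mode questions above, the sketch below uses `statistics.multimode`, which returns every value tied for the highest frequency. The jeans sizes and the other sample lists are invented for this example and are not data from the slides.

```python
from statistics import mean, multimode

# Hypothetical waist sizes sold in one day (illustrative data only).
jeans_sizes = [32, 34, 34, 36, 34, 30, 32]

most_common_sizes = multimode(jeans_sizes)   # the size to keep in stock
average_size = mean(jeans_sizes)             # may not be a size anyone wears

# More than one mode is possible when values tie for the top frequency.
two_modes = multimode([1, 1, 2, 2, 3])

# With no repeated values (typical for continuous measurements) every
# value ties at frequency 1, so no single mode stands out.
no_repeats = multimode([1.2, 3.4, 5.6])

print(most_common_sizes, average_size, two_modes, no_repeats)
```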
26Measures of Central Tendency are useful.
- Means, Medians, Modes
28The use of any single statistic to describe a
complete distribution fails to reveal important
facts.
29Dig Deeper!
30Measures of Variation
- Answers the question
- How spread out are my data values?
31Consider Two Scenarios
- Scenario 1
- Jack buys a car and pays 1,000.
- Jill buys a car and pays 21,000.
- Average price = 11,000
32Scenario 2
- Bob buys a car and pays 10,000.
- Mary buys a car and pays 12,000.
- Average price = 11,000
33Based on the data, both scenarios report the same
average price.
34What's the difference?
35Quiz
- Suppose you are a purchasing agent for a large manufacturing company. Your two suppliers each fill your orders in an average of 10 days.
- The following histograms plot the delivery times of the two suppliers.
36Do the two suppliers have the same reliability in
terms of making deliveries on time?
37Homogeneity: the degree of similarity within a set of data values.
- The mean of a homogeneous data set is far more
representative of the typical value than a mean
of a heterogeneous data set.
38If all the data values in a sample are identical,
then the mean provides perfect information, the
variation is zero, and the data are perfectly
homogeneous.
39Variation
- the tendency of data values to scatter about the mean, x-bar.
41Commonly used Measures of Variation
- Range
- Variance
- Standard Deviation
- Coefficient of Variation (CV)
42The Range
- Range = H - L (highest value minus lowest value)
- The value of the range is strongly influenced by
an outlier in the sample data.
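Applied to the two car-buying scenarios from earlier in the chapter, the range separates what the mean could not. A minimal sketch, with an illustrative helper name `value_range`:

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)


scenario_1 = [1000, 21000]    # Jack and Jill
scenario_2 = [10000, 12000]   # Bob and Mary

mean_1 = sum(scenario_1) / len(scenario_1)
mean_2 = sum(scenario_2) / len(scenario_2)

# Same average price, very different spread.
print(mean_1, value_range(scenario_1))
print(mean_2, value_range(scenario_2))
```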
43Variance & Standard Deviation
- During a five-week production period, a small company produced 5, 9, 16, 17, and 18 computers, respectively.
- The average = 13 computers/week
- Describe the variability in these five weeks of production.
44Variance & Standard Deviation
45Formulas for Variance & Standard Deviation
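The formulas on these slides did not survive in the transcript. The sketch below uses the standard textbook definitions, which may differ in notation from the original slides: sample variance s² = Σ(x - x-bar)² / (n - 1), population variance σ² = Σ(x - µ)² / N, and the standard deviation as the square root of the variance. It applies them to the five weeks of production data.

```python
from math import sqrt
from statistics import pstdev, stdev

production = [5, 9, 16, 17, 18]   # computers produced per week
n = len(production)

x_bar = sum(production) / n                                   # 13.0
sum_sq_dev = sum((x - x_bar) ** 2 for x in production)         # 130.0

sample_variance = sum_sq_dev / (n - 1)       # divides by n - 1
population_variance = sum_sq_dev / n         # divides by N

sample_sd = sqrt(sample_variance)
population_sd = sqrt(population_variance)

print(sample_variance, round(sample_sd, 2))
print(population_variance, round(population_sd, 2))
```

Whether the five weeks should be treated as a sample or as the whole population is a modelling choice; the standard-library functions `stdev` and `pstdev` correspond to the two cases.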
47Empirical Rule
48Normally Distributed Data w/ Empirical Rule
49Example Empirical Rule
- A company produces a lightweight valve that is specified to weigh 1365 g.
- Unfortunately, because of imperfections in the manufacturing process, not all of the valves produced weigh exactly 1365 grams.
- In fact, the weights of the valves produced are normally distributed with a mean weight of 1365 grams and a standard deviation of 294 grams.
50Question?
- 1) Within what range of weights would approximately 95% of the valve weights fall?
- 2) Approximately 16% of the weights would be more than what value?
- 3) Approximately 0.15% of the weights would be less than what value?
51Answers
- 1) 1365 ± 2σ = 1365 ± 2(294) = 777 to 1953
- 2) 1365 + 1σ = 1365 + 294 = 1659
- 3) 1365 - 3σ = 1365 - 3(294) = 483
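The three answers are all instances of "mean plus or minus k standard deviations". A minimal sketch of that arithmetic, with an illustrative helper name `empirical_interval`:

```python
def empirical_interval(mean, sd, k):
    """Interval of k standard deviations on either side of the mean."""
    return (mean - k * sd, mean + k * sd)


mu, sigma = 1365, 294   # valve weights in grams

# 1) About 95% of weights lie within 2 standard deviations.
low_95, high_95 = empirical_interval(mu, sigma, 2)

# 2) About 68% lie within 1 sd, so about 16% lie above mu + 1 sd.
upper_16_cutoff = empirical_interval(mu, sigma, 1)[1]

# 3) About 99.7% lie within 3 sd, so about 0.15% lie below mu - 3 sd.
lower_015_cutoff = empirical_interval(mu, sigma, 3)[0]

print(low_95, high_95, upper_16_cutoff, lower_015_cutoff)
```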
52Example 2: Standard Deviation & the Empirical Rule
- A recent report states that for California the average statewide price of a gallon of regular gasoline is $1.52.
- Suppose regular gas prices vary across the state with a standard deviation of $0.08 and are normally distributed.
53With x-bar = $1.52 and s = $0.08
- 1) Nearly all gas prices (99.7%) should fall between what prices?
- 2) Approximately 16% of the gas prices should be less than what price?
- 3) Approximately 2.5% of the gas prices should be more than what price?
54Answers
- 1) 1.52 ± 3(0.08): between $1.28 and $1.76
- 2) $1.44 (Since 68% of the prices lie within 1 standard deviation of the mean, 32% lie outside this range, 16% in each tail.)
- 3) $1.68 (Since 95% of the prices lie within 2 standard deviations of the mean, 5% lie outside this range, 2.5% in each tail.)
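The empirical rule's percentages are rounded. As a cross-check, and assuming the prices really are normally distributed as the example states, the standard library's `statistics.NormalDist` gives the exact normal probabilities for the same cutoffs:

```python
from statistics import NormalDist

# Gas prices modelled as normal with mean 1.52 and standard deviation 0.08.
prices = NormalDist(mu=1.52, sigma=0.08)

# 1) Share of prices between 1.28 and 1.76 (mean +/- 3 sd).
within_3_sd = prices.cdf(1.76) - prices.cdf(1.28)

# 2) Share of prices below 1.44 (mean - 1 sd).
below_1_sd = prices.cdf(1.44)

# 3) Share of prices above 1.68 (mean + 2 sd).
above_2_sd = 1 - prices.cdf(1.68)

print(round(within_3_sd, 4), round(below_1_sd, 4), round(above_2_sd, 4))
```

The exact values are close to, but not identical with, the rule-of-thumb figures: roughly 99.7%, 15.9%, and 2.3% against the rule's 99.7%, 16%, and 2.5%.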
55Coefficient of Variation
- Compares the variation between two data sets with different means and different standard deviations, and measures the variation in relative terms.
56Coefficient of Variation (CV) formula
- CV = (σ / µ) × 100
57CV Example 1
- Spot, the dog, weighs 65 pounds. Spot's weight fluctuates 5 pounds depending on Spot's exercise level.
- Sea Biscuit, the horse, weighs 1200 pounds. Sea Biscuit's weight fluctuates 125 pounds depending on the number of rides Sea Biscuit goes on.
58Question?
- Relatively speaking, which animal's weight, Spot's or Sea Biscuit's, varies the most?
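One way to answer the question is to compute each animal's coefficient of variation. The sketch assumes that the stated fluctuation (5 and 125 pounds) plays the role of the standard deviation and the stated weight plays the role of the mean; the slides do not say so explicitly.

```python
def coefficient_of_variation(sd, mean):
    """CV = (standard deviation / mean) * 100, as a percentage."""
    return sd / mean * 100


# Assumption: fluctuation is treated as the standard deviation.
cv_spot = coefficient_of_variation(sd=5, mean=65)
cv_sea_biscuit = coefficient_of_variation(sd=125, mean=1200)

more_variable = "Sea Biscuit" if cv_sea_biscuit > cv_spot else "Spot"
print(round(cv_spot, 2), round(cv_sea_biscuit, 2), more_variable)
```

Under that assumption the horse's weight varies more in relative terms (about 10.4% against about 7.7%), as well as in absolute pounds.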
59Coefficient of Variation vs. Standard Deviation
- Some financial investors use the coefficient of
variation or the standard deviation or both as
measures of risk.
60What does the Coefficient of Variation tell us
about the risk of a stock that the standard
deviation does not?
61Relative to the amount invested in a stock, the coefficient of variation expresses the risk of the stock as the size of the standard deviation relative to the size of the mean (as a percentage).
62CV Example 2
- SUPPOSE
- Five weeks of average prices for stock A are 57, 68, 64, 71, and 62.
- Five weeks of average prices for stock B are 12, 17, 8, 15, and 13.
63QUESTION
- Relative to the amount of money invested in the
stock, which stock, A or B, is riskier?
64Stock A vs. Stock B in terms of Risk
- Stock B
- µ = 13
- σ = 3.03
- CV = (σ / µ) × 100 = 23.3%
- Stock A
- µ = 64.40
- σ = 4.84
- CV = (σ / µ) × 100 = 7.5%
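These figures can be reproduced from the price lists. The slide values match the population standard deviation (dividing by N), so the sketch uses `statistics.pstdev`; using the sample formula instead would give somewhat larger numbers.

```python
from statistics import mean, pstdev

stock_a = [57, 68, 64, 71, 62]
stock_b = [12, 17, 8, 15, 13]


def cv_percent(prices):
    """Coefficient of variation: population sd over mean, as a percentage."""
    return pstdev(prices) / mean(prices) * 100


cv_a = cv_percent(stock_a)
cv_b = cv_percent(stock_b)

print(round(pstdev(stock_a), 2), round(cv_a, 1))
print(round(pstdev(stock_b), 2), round(cv_b, 1))
```

Stock A has the larger standard deviation in absolute terms, yet stock B has the larger coefficient of variation, so relative to its price stock B is the riskier of the two.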
65Measures of Position
- Indicate how a particular value fits in with all the other data values.
- Commonly used measures of position are:
- Percentiles
- Quartiles
- Z-scores
66TO FIND THE LOCATION OF THE Pth PERCENTILE
- Determine n · P / 100 (n = number of observations) and use one of the following two location rules.
- Location rule 1. If n · P / 100 is NOT a counting number, round up, and the Pth percentile will be the value in this position of the ordered data.
- Location rule 2. If n · P / 100 is a counting number, the Pth percentile is the average of the value in this location (of the ordered data) and the value in the next larger location.
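The two location rules translate directly into code. The sketch below is one possible reading of them; the function name `percentile` is illustrative, P is assumed to be a whole number strictly between 0 and 100, and integer arithmetic is used so that the "is it a counting number?" test does not depend on floating-point rounding. Note that other software (spreadsheets, statistics libraries) often uses different percentile conventions and may give slightly different answers.

```python
def percentile(values, p):
    """Pth percentile using the two location rules from the slides."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    if (n * p) % 100 != 0:
        # Rule 1: n*P/100 is not a counting number, so round up.
        position = (n * p) // 100 + 1
        return ordered[position - 1]          # positions are 1-based
    # Rule 2: n*P/100 is a counting number, so average this value
    # with the value in the next larger location.
    position = (n * p) // 100
    return (ordered[position - 1] + ordered[position]) / 2


# Check against the median examples from earlier in the chapter.
print(percentile([6, 9, 7, 23, 5], 50))   # odd n: rule 1
print(percentile([5, 6, 7, 9], 50))       # even n: rule 2
```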
67Use the two rules of percentiles and the
following data to determine both the 85th and the
50th percentile for starting salary.
68Step 1: Arrange the data in ascending order
69Step 2
- Use the formula for percentiles
- n · P / 100
- Identify the 85th percentile given 12 observations
- i = n (P / 100) = 12 (85 / 100) = 10.2
70Because i is not an integer, round up. The
position of the 85th percentile is the next
integer greater than 10.2, the 11th position.
71- From the data, the 85th percentile is the value
in the 11th position, or 3130.
72To calculate the 50th percentile, apply step 2
- n · P / 100
- i = 12 (50 / 100) = 6
- Because i is an integer, the 50th percentile is the average of the sixth and seventh values
- (2890 + 2920) / 2 = 2905.
73Quartiles
- Quartiles are merely particular percentiles that divide the data into quarters
- Q1 = 1st quartile = 25th percentile (P25)
- Q2 = 2nd quartile = 50th percentile (P50)
- Q3 = 3rd quartile = 75th percentile (P75)
- Quartiles are used as benchmarks, much like the use of A, B, C, D, and F on exam grades.
74Z-Scores
- A z-score determines the relative position of any
particular data value x, and is expressed in
terms of the number of standard deviations above
or below the mean.
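The transcript does not preserve a formula for the z-score, so the sketch below uses the standard definition, z = (x - mean) / standard deviation, and applies it to the valve weights from the empirical-rule example. The function name `z_score` is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


mu, sigma = 1365, 294   # valve weights in grams

z_heavy = z_score(1953, mu, sigma)    # upper end of the 95% interval
z_typical = z_score(1365, mu, sigma)  # exactly at the mean
z_light = z_score(483, mu, sigma)     # the 0.15% cutoff

print(z_heavy, z_typical, z_light)
```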
75Measures of Shape
- Measures of shape address skewness and kurtosis.
76Skewness
- Symmetric data: sample mean = sample median
- Right-skewed (positive): mean > median
- Left-skewed (negative): mean < median
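The mean-versus-median comparison can be expressed as a small helper. This is a rule-of-thumb check only, not a formal skewness statistic, and the function name `skew_direction` is made up for illustration.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify skew by comparing the mean with the median (rule of thumb)."""
    m, md = mean(values), median(values)
    if m > md:
        return "right-skewed"
    if m < md:
        return "left-skewed"
    return "symmetric"


# Accident data from earlier: mean 10 is pulled above median 7 by the value 23.
print(skew_direction([6, 9, 7, 23, 5]))
print(skew_direction([1, 2, 3]))
```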
77Closing Example
- The numbers of defects in 10 rolls of carpet are
- 3, 2, 6, 0, 1, 3, 2, 1, 0, 4
- What are the 75th percentile and the 50th percentile?
- What are the mean, standard deviation, and coefficient of variation?
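One way to work the closing example is sketched below. It applies the chapter's two percentile location rules and treats the 10 rolls as a sample, so the standard deviation divides by n - 1; both the sample assumption and the helper name `percentile` are choices made for this illustration.

```python
from statistics import mean, stdev

defects = [3, 2, 6, 0, 1, 3, 2, 1, 0, 4]


def percentile(values, p):
    """Pth percentile via the two location rules (p a whole number, 0 < p < 100)."""
    ordered = sorted(values)
    n = len(ordered)
    if (n * p) % 100 != 0:
        # Not a counting number: round up and take that position.
        return ordered[(n * p) // 100]        # 0-based index of the rounded-up position
    # Counting number: average this position and the next larger one.
    position = (n * p) // 100
    return (ordered[position - 1] + ordered[position]) / 2


p75 = percentile(defects, 75)   # 10 * 75 / 100 = 7.5, round up to the 8th value
p50 = percentile(defects, 50)   # 10 * 50 / 100 = 5, average the 5th and 6th values

x_bar = mean(defects)
s = stdev(defects)              # sample standard deviation (n - 1)
cv = s / x_bar * 100

print(p75, p50, x_bar, round(s, 2), round(cv, 1))
```

Under these assumptions the 75th percentile is 3, the 50th percentile is 2, the mean is 2.2 defects per roll, the standard deviation is about 1.87, and the coefficient of variation is about 85%.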