Title: S.A.B. Help Desk
Slide 1: S.A.B. Help Desk
- City West, Yungondi Building, level 3, room 73
  - Mondays 5:00pm to 6:30pm
  - Wednesdays 12:30pm to 2:30pm
  - Thursdays 2:00pm to 4:00pm
  - Fridays 2:00pm to 4:00pm
- Mawson Lakes, Maths Help Centre, OC1-60
  - SAB tutor on Wednesdays 12 noon to 2:00pm
Slide 2: Week 3 Objectives
- 1. Measures of location and dispersion
- 2. Attributes of distribution shape
- 3. Boxplots
- 4. Assessing skewness and bi-modality
- 5. Weighted averages
- 6. Index numbers
Slide 3: Measures of location and dispersion
Slide 4: Measures of location and dispersion: revision
- Location: mean and median (which is robust against outliers?)
- Dispersion: standard deviation and IQR (which is robust against outliers?)
- What is the definition of the coefficient of variation (CV)?
- If data units are in minutes, what are the units of
  - the median?
  - the CV?
  - the variance?
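The revision questions above can be explored numerically. The following is a minimal sketch, not part of the lecture: the waiting times are hypothetical, and the quartile convention is whatever Python's `statistics.quantiles` uses by default, which may differ from the textbook's or Minitab's. It computes each measure with and without one extreme value so the robust measures can be spotted, and the comments note the units of each result.

```python
import statistics

def summarize(values):
    """Return location and dispersion measures for a list of numbers."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    # Quartile convention is an assumption (Python's default method).
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": mean,            # same units as the data (minutes)
        "median": median,        # minutes
        "stdev": stdev,          # minutes
        "iqr": q3 - q1,          # minutes
        "variance": stdev ** 2,  # minutes squared
        "cv": stdev / mean,      # a ratio, so it has no units
    }

# Hypothetical waiting times in minutes (not from the textbook).
times = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 9]
# Same data with the largest value replaced by an extreme one.
times_with_outlier = times[:-1] + [90]

clean = summarize(times)
contaminated = summarize(times_with_outlier)
for name in clean:
    print(f"{name:9s} {clean[name]:10.3f} {contaminated[name]:10.3f}")
```

In this sketch the median and IQR are unchanged by the extreme value, while the mean and standard deviation both increase.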
Slide 5: Distribution shapes (a): symmetric or skewed?
Slide 6: Distribution shapes (b): uni- or bi-modal?
Note: for discrete observations, the mode is the most frequent value. What does modality mean for histograms?
- Uni-modal
  - one peak only, so the mode is at the position of the peak, and is the most frequently occurring interval
- Bi-modal
  - two peaks. Is this caused by genuine bimodality, i.e. two component distributions, or is it because of random fluctuations?
Slide 7: Can uni-modality be distinguished from bi-modality?
- Be cautious
  - Multiple peaks may be caused by random fluctuations in the data
  - Very large sample sizes are needed to accurately assess modality
- But bi-modality is of interest, where it indicates two underlying components
Slide 8: This histogram is of 100 values sampled from a
uni-modal symmetric distribution
Is the multi-modality real?
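The point of this slide can be reproduced with a small simulation. This is only a sketch under stated assumptions: the normal distribution, the seed, the number of bins and the helper names `histogram_counts` and `count_peaks` are all illustrative choices, not taken from the lecture. It draws 100 values from a uni-modal symmetric distribution, bins them, and counts how many bins stand higher than both neighbours.

```python
import random

def histogram_counts(values, bins):
    """Count values in equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

def count_peaks(counts):
    """Count bins strictly higher than both neighbours (edges compare to 0)."""
    padded = [0] + list(counts) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )

rng = random.Random(1)  # arbitrary seed, chosen only for repeatability
sample = [rng.gauss(0, 1) for _ in range(100)]  # uni-modal, symmetric
counts = histogram_counts(sample, bins=15)

print(counts)
print("apparent peaks:", count_peaks(counts))
```

Re-running with different seeds or bin counts shows how much the apparent number of peaks can vary for a sample of this size.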
Slide 9: Lecture exercise 1
Questions of distribution shape
- (i) Skew or symmetric?
- (ii) Modality?
Slide 10: 3. Box-and-whisker plot
- A box-and-whisker plot is constructed from the 5-number summary: min, Q1, median, Q3, max.
- Two box plots for (2.10.1) of the textbook, with and without the outlier.
Slide 11: Uses of box-plots
- Signal possible outliers
- Good visual presentation of data set
- Vertically stacked box-plots (on the same scale) enable easy comparison of several data sets
- Easy in Minitab: Graph > Boxplot, or as an option under Graphs after Stat > Basic Statistics > Display Descriptive Statistics
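For readers without Minitab, the numbers behind a box plot can be sketched in a few lines. The data below are hypothetical, and the 1.5 × IQR fence used to flag possible outliers is a common convention assumed here; the slides do not state which rule is used. The quartiles come from Python's `statistics.quantiles` default method, which may not match the textbook's convention exactly.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

def possible_outliers(values, k=1.5):
    """Flag values beyond k * IQR from the box (k = 1.5 is an assumed convention)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical data set with one suspiciously large value.
data = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 30]

print("five-number summary:", five_number_summary(data))
print("possible outliers:", possible_outliers(data))
```

A flagged value is only a signal to look more closely at the observation, not proof that it is an error.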
Slide 12: Indicators of skewness
- Position of the median relative to Q1, Q3 on a boxplot, i.e. two unequal halves of the box
  - Median closer to Q1 for right skewed
  - Median closer to Q3 for left skewed
- Different lengths of the whiskers
  - Longer right (left) whisker for right (left) skewed
- Different values for mean and median
  - Mean > Median for right-skewed
  - Mean < Median for left-skewed
Slide 13: Lecture example 1
From the boxplot:
- Median is closer to Q3 (bigger left half of the box)
- Left whisker is longer
From the summary measures:
- Mean < Median
Conclusion: Distribution of weight gain is skewed-to-the-left

Descriptive Statistics
Variable     N    Mean    Median   StDev    Q1       Q3
rat weight   17   0.055   0.26     0.9478   -0.689   0.609
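The two numerical indicators used in this example can be checked directly from the summary table. The sketch below uses the values reported above (mean 0.055, median 0.26, Q1 -0.689, Q3 0.609); the function names are illustrative only, and the whisker comparison is left out because the minimum and maximum are not in the table.

```python
def compare(a, b, if_less, if_greater):
    """Return one of two labels depending on how a compares with b."""
    if a < b:
        return if_less
    if a > b:
        return if_greater
    return "symmetric"

def skew_indicators(mean, median, q1, q3):
    """Report the skew direction suggested by each summary-based indicator."""
    return {
        # Median closer to Q3 (narrower upper half of the box) suggests left skew.
        "box_halves": compare(q3 - median, median - q1, "left", "right"),
        # Mean below the median suggests left skew.
        "mean_vs_median": compare(mean, median, "left", "right"),
    }

# Summary measures from the Descriptive Statistics table in Lecture example 1.
result = skew_indicators(mean=0.055, median=0.26, q1=-0.689, q3=0.609)
print(result)
```

Both indicators point the same way here, which agrees with the slide's conclusion; when they disagree, a histogram is worth inspecting before drawing a conclusion.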
Slide 14: Can skewness be assessed from the boxplot and the summary measures?
Lecture exercise 2
Slide 15: Can skewness be assessed from the boxplot and the summary measures?
Lecture exercise 3
Slide 16: Histogram of the Depth data in Lecture Exercise 3
Slide 17: Which location measure to use?
It depends on the context.
- For example, if the purpose of gathering the data is to find out about an average (say, average consumption of a product), then use the mean.
- Otherwise, if a general measure of centrality is needed, the mean could be used for nearly symmetric data sets without outliers, and the median could be used for skewed distributions.
Slide 18: Lecture exercise 4
- Describe the following distribution in terms of shape (skewness and modality), location and dispersion.
- Which location and dispersion measures should you use?
Slide 19: Solution to Lecture exercise 4
Slide 20: 5. Weighted averages
Every Sunday a social club has a pizza night,
choosing a different pizza supplier each time.
Costs and quantities for 3 weeks are as follows.
What is the average unit cost?
Which is the more appropriate, and why?
Slide 21: A small school clothing business trades Monday to Friday
Textbook example (3.7.1)
What is its average daily profit over the first
four months?
Slide 22: Weighted or unweighted average?
Does it make a difference?
Which is the more appropriate?
Slide 23: Weighted averages: general formula
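The formula itself did not survive in this transcript. The standard definition of a weighted average is the sum of each value times its weight, divided by the sum of the weights, and the sketch below assumes that definition. The unit costs and quantities are hypothetical stand-ins in the spirit of the pizza-night example, not the figures from the slide.

```python
def weighted_average(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical data: cost per pizza and number of pizzas bought each week.
unit_costs = [10.0, 12.0, 8.0]
quantities = [5, 10, 25]

unweighted = sum(unit_costs) / len(unit_costs)       # treats every week equally
weighted = weighted_average(unit_costs, quantities)  # total spend / total pizzas

print("unweighted average unit cost:", unweighted)
print("weighted average unit cost:  ", weighted)
```

The weighted figure equals total spending divided by total pizzas bought, so it is pulled towards the weeks in which more pizzas were ordered.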
Slide 24: Index numbers
- A series of index numbers is a sequence of figures which keeps track of the relative percentage value of some quantity of interest (say price)
- Familiar indexes are the CPI, All Ords, Dow Jones, etc.
Slide 25: Number of motor vehicles (millions) in the USA
An example of an index number
(Textbook example (3.9.1))
Slide 26: Classification of Index Numbers
Slide 27: Fixed and chain-based indexes
- A fixed-base index is calculated as
  - (current value / base value) × 100
- A chain-based index is calculated as
  - (current value / previous value) × 100
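The two calculations can be sketched as follows. The series is hypothetical (it is not the motor-vehicle data from the textbook), and the function names and the choice to return `None` for the first chain-based entry are illustrative assumptions.

```python
def fixed_base_index(series, base_position=0):
    """Index each value against one fixed base period: value / base * 100."""
    base = series[base_position]
    return [value * 100 / base for value in series]

def chain_based_index(series):
    """Index each value against the previous period: value / previous * 100."""
    # The first period has no predecessor, so no index is defined for it.
    return [None] + [
        current * 100 / previous
        for previous, current in zip(series, series[1:])
    ]

# Hypothetical yearly figures (not the textbook data).
values = [50, 55, 66]

print("fixed base (first year = 100):", fixed_base_index(values))
print("chain based:                  ", chain_based_index(values))
```

The fixed-base series shows cumulative change since the base period, while the chain-based series shows the change from one period to the next.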
Slide 28: Number of motor vehicles (millions) in the USA
Textbook example (3.9.1)
Slide 29: Simple and composite indexes
- Simple index: a series of percentages relating to a single commodity
- Composite index: based upon a collection of commodities, called a basket of commodities. Examples: CPI, All Ordinaries (a composite index from ASX stocks)
Slide 30: How to make a composite index?
- Unweighted: when weights are unavailable. Based on simple averaging of prices or of price relatives. The types are aggregative and relative; see the text for details.
- What is the major deficiency of unweighted indexes?
- Why are the most commonly used composite indexes of weighted type?
Slide 31: Weighted, composite indexes
Usually weights = quantities
Slide 32: Laspeyres, Paasche formulas
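The formulas on this slide are missing from the transcript. The sketch below assumes the standard definitions: the Laspeyres index weights prices by base-period quantities, and the Paasche index weights prices by current-period quantities. The two-item basket is hypothetical and is not the data of textbook example (3.9.2).

```python
def basket_cost(prices, quantities):
    """Total cost of buying the given quantities at the given prices."""
    return sum(p * q for p, q in zip(prices, quantities))

def laspeyres_index(base_prices, base_quantities, current_prices):
    """Cost of the BASE-period basket at current prices, relative to base prices."""
    return (basket_cost(current_prices, base_quantities) * 100
            / basket_cost(base_prices, base_quantities))

def paasche_index(base_prices, current_prices, current_quantities):
    """Cost of the CURRENT-period basket at current prices, relative to base prices."""
    return (basket_cost(current_prices, current_quantities) * 100
            / basket_cost(base_prices, current_quantities))

# Hypothetical two-commodity basket.
base_prices = [2, 5]
base_quantities = [10, 4]
current_prices = [3, 6]
current_quantities = [8, 5]

laspeyres = laspeyres_index(base_prices, base_quantities, current_prices)
paasche = paasche_index(base_prices, current_prices, current_quantities)

print(f"Laspeyres index: {laspeyres:.1f}")
print(f"Paasche index:   {paasche:.1f}")
```

The only difference between the two functions is which period's quantities serve as weights, which is why the two indexes diverge when buying patterns shift between the base and current periods.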
Slide 33: Find the Laspeyres index for 1998 relative to base year 1994
Data from Textbook example (3.9.2)
Slide 34: Solution
Slide 35: Find the Paasche index for 1998 relative to base year 1994
Data from Textbook example (3.9.2)
Slide 36: Solution
Lecture exercise 5
What interpretation can be given to the fact that the Paasche index is higher than the Laspeyres index?
Slide 37: Solution to Lecture exercise 5
Slide 38: Find the Laspeyres and Paasche indexes for 1997 relative to base year 1995
Lecture exercise 6
Slide 39: Solution to Lecture Exercise 6