Title: CPE 619 The Art of Data Presentation
1CPE 619: The Art of Data Presentation
- Aleksandar Milenkovic
- The LaCASA Laboratory
- Electrical and Computer Engineering Department
- The University of Alabama in Huntsville
- http://www.ece.uah.edu/milenka
- http://www.ece.uah.edu/lacasa
2Overview
- Types of Variables
- Guidelines for Preparing Good Charts
- Common Mistakes in Preparing Charts
- Pictorial Games
- Special Charts for Computer Performance
- Gantt Charts
- Kiviat Graphs
- Schumacher Charts
- Decision Maker's Games
3Types of Variables
- Type of computer: supercomputer, minicomputer, microcomputer
- Type of workload: scientific, engineering, educational
- Number of processors
- Response time of system
4Guidelines for Preparing Good Charts
- 1) Require minimum effort from the reader
- Direct labeling vs. legend box
- 2) Maximize Information
- Words in place of symbols; clearly label the axes
5Guidelines (contd)
- 3) Minimize ink
- No grid lines, more details
- 4) Use commonly accepted practices
- Origin at (0,0); independent variable (cause) along the x axis; dependent variable (effect) along the y axis; linear scales; increasing scales; equal divisions
- 5) Avoid ambiguity
- Show coordinate axes, scale divisions, and origin
- Identify individual curves and bars
6Checklist for Good Graphics
- Are both coordinate axes shown and labeled?
- Are the axes labels self-explanatory and concise?
- Are the scales and divisions shown on both axes?
- Are the minimum and maximum of the ranges shown on the axes appropriate to present maximum information?
- Is the number of curves reasonably small?
- Do all graphs use the same scale?
- Is there no curve that can be removed without reducing information?
- Are the curves on a line chart individually labeled?
- Are the cells in a bar chart individually labeled?
- Are all symbols on the graph accompanied by appropriate textual explanations?
- If the curves cross, are the line patterns different to avoid confusion?
- Are the units of measurement indicated?
- Is the horizontal scale increasing from left to right?
- Is the vertical scale increasing from bottom to top?
- Are the grid lines aiding in reading the curves?
- Does this whole chart add to information available to the reader?
- Are the scales contiguous?
- Is the order of bars in a bar chart systematic?
- If the vertical axis represents a random quantity, are confidence intervals shown?
7Common Mistakes in Preparing Charts
- Presenting too many alternatives on a single chart
- Max 5 to 7 messages => max 6 curves in a line chart, no more than 10 bars in a bar chart, max 8 components in a pie chart
- Presenting many y variables on a single chart
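The limits quoted on this slide can be turned into a quick sanity check before a chart is drawn. The following is a minimal sketch; the names CHART_LIMITS and too_many_items are illustrative and do not come from the slides.

```python
# Rule-of-thumb limits quoted on the slide (curves, bars, pie components).
# The dictionary and function names are hypothetical, chosen for illustration.
CHART_LIMITS = {"line": 6, "bar": 10, "pie": 8}


def too_many_items(chart_type: str, n_items: int) -> bool:
    """Return True if the chart shows more items than the rule of thumb allows."""
    return n_items > CHART_LIMITS[chart_type]


print(too_many_items("line", 8))  # 8 curves on one line chart
print(too_many_items("pie", 8))   # 8 components in a pie chart
```

A chart that fails the check is usually better split into several charts than thinned out arbitrarily.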
8Common Mistakes in Charts (contd)
- Using symbols in place of text
- Placing extraneous information on the chart
- E.g., grid lines, granularity of the grid lines
- Selecting scale ranges improperly
- Automatic selection by programs may not be
appropriate
9Common Mistakes in Charts (contd)
- Using a line chart in place of a column chart
- Line => continuity
(Figure: MIPS plotted against CPU type)
10Pictorial Games
- Using non-zero origins to emphasize the difference
(Figure: two plots of the same data. "Mine is much better than yours" emphasizes the difference; "Mine and yours are almost the same" conceals the difference.)
- Three-quarter-high rule => height/width >= 3/4
- Height of the highest point should be at least ¾ of the horizontal offset of the rightmost point
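The three-quarter-high rule can be checked mechanically. This is a minimal sketch that assumes the points are given in plotted units (for example, centimeters on the page, measured from the origin) and not in data units; the function name is illustrative.

```python
def satisfies_three_quarter_rule(points):
    """points: (x, y) positions as drawn, measured from the origin at (0, 0).

    The rule holds when the highest point is at least 3/4 as high as the
    rightmost point is far from the origin.
    """
    width = max(x for x, _ in points)
    height = max(y for _, y in points)
    return height >= 0.75 * width


print(satisfies_three_quarter_rule([(0, 0), (4, 3), (8, 6)]))   # height 6, width 8
print(satisfies_three_quarter_rule([(0, 0), (5, 2), (10, 3)]))  # height 3, width 10
```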
11Pictorial Games (contd)
- Using double-whammy graph for dramatization
- Using related metrics
12Pictorial Games (contd)
- Plotting random quantities without showing
confidence intervals
(Figure: means of two random variables)
Means are not enough. Overlapping confidence intervals usually mean that the two random quantities are not statistically different.
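The overlap test mentioned on this slide can be sketched as follows. The sketch assumes a normal-approximation confidence interval for the mean; for small samples a t quantile would be more appropriate. The sample values and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence=0.90):
    """Normal-approximation confidence interval for the mean of a sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    m = mean(sample)
    half_width = z * stdev(sample) / sqrt(len(sample))
    return (m - half_width, m + half_width)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical measurements of two systems.
mine = [10.0, 12.0, 11.0, 13.0, 9.0, 11.0]
yours = [11.0, 13.0, 12.0, 10.0, 14.0, 12.0]
ci_mine = confidence_interval(mine)
ci_yours = confidence_interval(yours)
print(ci_mine, ci_yours, intervals_overlap(ci_mine, ci_yours))
```

If the intervals overlap, plotting only the two means would suggest a difference that the data do not support.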
13Pictorial Games (contd)
- Pictograms scaled by height
- Wrong scaling => Area(MINE) = 4 × Area(YOURS)??
(Figure: two pictograms labeled "Mine, Performance = 2" and "Yours, Performance = 1")
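The arithmetic behind the pictogram game is short: scaling a picture's height by the performance ratio also scales its width, so the drawn area grows with the square of the ratio. A minimal sketch, with illustrative function names:

```python
from math import sqrt


def pictogram_scale(value_ratio, scale_by="area"):
    """Linear factor applied to both height and width of the pictogram."""
    if scale_by == "area":
        return sqrt(value_ratio)   # drawn area grows in proportion to the value
    if scale_by == "height":
        return value_ratio         # drawn area grows with the square of the value
    raise ValueError("scale_by must be 'area' or 'height'")


def drawn_area_ratio(linear_scale):
    """Ratio of drawn areas when both dimensions are scaled by linear_scale."""
    return linear_scale ** 2


# Performance ratio of 2, as on the slide.
print(drawn_area_ratio(pictogram_scale(2, "height")))  # area ratio when scaled by height
print(drawn_area_ratio(pictogram_scale(2, "area")))    # area ratio when scaled by area
```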
14Pictorial Games (contd)
- Using inappropriate cell size in histograms
(Figure: two histograms of frequency versus response time. The first panel, "Normal distribution", uses cells [0,2), [2,4), [4,6), [6,8), [8,10), [10,12). The second panel, "Exponential distribution", uses cells [0,6), [6,12).)
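The effect of cell size can be reproduced with a few lines of code. The sketch below bins one set of hypothetical response times twice, once with cells of width 2 and once with cells of width 6; the data and the function name are made up for illustration.

```python
def histogram(data, cell_width, lo, hi):
    """Count observations in half-open cells [lo, lo+w), [lo+w, lo+2w), ... below hi."""
    n_cells = int((hi - lo) / cell_width)
    counts = [0] * n_cells
    for x in data:
        if lo <= x < hi:
            counts[int((x - lo) // cell_width)] += 1
    return counts


# Hypothetical response times, chosen only for illustration.
times = [1.0, 2.5, 3.0, 4.2, 4.8, 5.1, 5.5, 5.9, 6.3, 6.8, 7.4, 9.0, 11.0]

print(histogram(times, 2, 0, 12))  # [1, 2, 5, 3, 1, 1]: peak in the middle
print(histogram(times, 6, 0, 12))  # [8, 5]: monotonically decreasing
```

The same data look bell-shaped with narrow cells and decreasing with wide cells, which is the game the slide warns about.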
15Pictorial Games (contd)
- Using broken scales in column charts
- Amplify differences
(Figure: two column charts of response time for systems A through F, one drawn with a broken vertical scale)
16Special Charts for Computer Performance
- Gantt charts
- Kiviat Graphs
- Schumacher's charts
17Gantt Charts
- Shows relative duration of a number of conditions
(Figure: Gantt chart with horizontal axis "Utilization" from 0 to 100 and rows CPU (segment 60), IO Channel (segments 20, 20), and Network (segments 10, 30, 5, 15))
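A Gantt chart of this kind can be derived from the fraction of time spent in each combination of busy resources. The following is a minimal text-mode sketch; the state percentages are hypothetical and are not the data from the slide's figure.

```python
# Each entry: (set of resources busy in this state, percent of time in this state).
# Hypothetical data; the percentages must add up to 100.
states = [
    ({"CPU", "IO"}, 20),
    ({"CPU"}, 40),
    ({"IO"}, 20),
    (set(), 20),
]


def utilization(states, resource):
    """Total percent of time during which the resource is busy."""
    return sum(pct for busy, pct in states if resource in busy)


def gantt_row(states, resource, percent_per_char=5):
    """One text row of the chart: '#' while the resource is busy, '.' while idle."""
    return "".join(
        ("#" if resource in busy else ".") * (pct // percent_per_char)
        for busy, pct in states
    )


for name in ("CPU", "IO"):
    print(f"{name:>3} {gantt_row(states, name)} {utilization(states, name)}%")
```

Where the '#' runs of two rows line up, the two resources are busy at the same time, which is the overlap information a Gantt chart is meant to show.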
18Example Data for Gantt Chart
19Draft of the Gantt Chart
20Final Gantt Chart
21Kiviat Graphs
- Radial chart with even number of metrics
- HB and LB metrics alternate
- Ideal shape: star
22Kiviat Graph for a Balanced System
- Problem: inter-related metrics
- CPU busy = problem state + supervisor state
- CPU wait = 100% - CPU busy
- Channel only = any channel - CPU/channel overlap
- CPU only = CPU busy - CPU/channel overlap
23Shapes of Kiviat Graphs
- CPU keelboat: CPU-bound system
- I/O wedge: I/O-bound system
- I/O arrow: CPU- and I/O-bound system
24Merrill's Figure of Merit (FoM)
- Performance: x_1, x_2, x_3, ..., x_{2n}
- Odd values are HB and even values are LB
- x_{2n+1} is the same as x_1
- Average FoM = 50%
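The FoM formula itself did not survive in the slide text. The sketch below assumes the commonly quoted form FoM = sqrt((1/(2n)) * sum over i = 1..n of (x_{2i-1} + x_{2i+1}) * (100 - x_{2i})), which is consistent with the slide's statement that an average system scores 50%; it should be checked against the course textbook before use.

```python
from math import sqrt


def figure_of_merit(x):
    """Figure of merit for a Kiviat graph (assumed formula, see lead-in).

    x holds 2n percentages in axis order: x[0], x[2], ... are HB metrics and
    x[1], x[3], ... are LB metrics. The list wraps around, so the value
    after the last one is x[0] again.
    """
    if not x or len(x) % 2:
        raise ValueError("need an even, non-zero number of metrics")
    n = len(x) // 2
    total = 0.0
    for i in range(n):
        hb_before = x[2 * i]
        lb = x[2 * i + 1]
        hb_after = x[(2 * i + 2) % (2 * n)]
        total += (hb_before + hb_after) * (100 - lb)
    return sqrt(total / (2 * n))


print(figure_of_merit([50] * 8))      # every metric at 50%
print(figure_of_merit([100, 0] * 4))  # ideal star: HB at 100%, LB at 0%
```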
25Example FoM
26FoM Example (Cont)
- System B has a higher figure of merit, so it is considered better.
27Figure of Merit Known Problems
- All axes are considered equal
- Extreme values are assumed to be better
- Utility is not a linear function of FoM
- Two systems with the same FoM are not equally good
- System with slightly lower FoM may be better
28Kiviat Graphs For Other Systems
- Use Kiviat graphs for networks
(Figure: Kiviat graph with axes Application Throughput, Packets With Error, Link Overhead, Implicit Acknowledgements, Link Utilization, and Duplicate Packets)
29Schumacher Charts
- Performance metrics are plotted in a tabular manner
- Values are normalized with respect to long-term means and standard deviations
- Any observations that are beyond mean ± one standard deviation need to be explained
- See Figure 10.25 in the book
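The normalization behind a Schumacher chart can be sketched in a few lines. The long-term mean and standard deviation below are hypothetical, and the function names are illustrative.

```python
def normalize(value, long_term_mean, long_term_std):
    """Distance of an observation from the long-term mean, in standard deviations."""
    return (value - long_term_mean) / long_term_std


def needs_explanation(value, long_term_mean, long_term_std, threshold=1.0):
    """True if the observation lies beyond mean +/- threshold standard deviations."""
    return abs(normalize(value, long_term_mean, long_term_std)) > threshold


# Hypothetical metric with long-term mean 50 and standard deviation 10.
for observed in (55, 65, 38):
    print(observed, normalize(observed, 50, 10), needs_explanation(observed, 50, 10))
```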
30Performance Analysis Rat Holes
Configuration
Workload
Metrics
Details
31Reasons for not Accepting an Analysis
- This needs more analysis.
- You need a better understanding of the workload.
- It improves performance only for long IOs/packets/jobs/files, and most of the IOs/packets/jobs/files are short.
- It improves performance only for short IOs/packets/jobs/files, but who cares for the performance of short IOs/packets/jobs/files; it's the long ones that impact the system.
- It needs too much memory/CPU/bandwidth and memory/CPU/bandwidth isn't free.
- It only saves us memory/CPU/bandwidth and memory/CPU/bandwidth is cheap.
- See Box 10.2 on page 162 of the book for a complete list
32Examples
33Summary
- Qualitative/quantitative, ordered/unordered, discrete/continuous variables
- Good charts should require minimum effort from the reader and provide maximum information with minimum ink
- Use no more than 5-6 curves, select ranges properly, and follow the three-quarter-high rule
- Gantt charts show utilizations of various components
- Kiviat graphs show HB and LB metrics alternately on a circular graph
- Schumacher charts show means and standard deviations
- Workload, metrics, configuration, and details can always be challenged, so they should be carefully selected.
34Exercise 10.1
- What type of chart (line or bar) would you use to plot
- CPU usage for 12 months of the year
- CPU usage as a function of time in months
- Number of I/O's to three disk drives A, B, and C
- Number of I/O's as a function of number of disk drives in a system
35Exercise 10.2
- List the problems with the following charts
36Exercise 10.3
- A system consists of 3 resources, called A, B, and C. The measured utilizations are shown in the following table. A zero in a column indicates that the resource is not utilized. Draw a Gantt chart showing utilization profiles.
37Exercise 10.4
- The measured values of the eight performance metrics listed in Example 10.2 for a system are 70%, 10%, 60%, 20%, 80%, 30%, 50%, and 20%. Draw the Kiviat graph and compute its figure of merit.
38Exercise 10.5
- For a computer system of your choice, list a
number of HB and LB metrics and draw a typical
Kiviat graph using data values of your choice.
39Ratio Games
40Overview
- Ratio Game Examples
- Using an Appropriate Ratio Metric
- Using Relative Performance Enhancement
- Ratio Games with Percentages
- Ratio Games Guidelines
- Numerical Conditions for Ratio Games
41Case Study 11.1: 6502 vs. 8080
1. Ratio of Totals
- Conclusion: 6502 is worse. It takes 4.7% more time than 8080.
42 6502 vs. 8080 (Cont)
2. 6502 as the base
3. 8080 as the base
- Ratio of totals: 6502 is worse. It takes 4.7% more time than 8080.
- With 6502 as the base: 6502 is better. It takes 1% less time than 8080.
- With 8080 as the base: 6502 is worse. It takes 6% more time.
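The way the conclusion flips with the choice of base can be reproduced with a small example. The execution times below are hypothetical (they are not the 6502/8080 measurements, which did not survive in the slide text), and the system names X and Y are placeholders.

```python
# Hypothetical execution times of two benchmarks on two systems (LB metric).
times = {"X": [10.0, 40.0], "Y": [20.0, 20.0]}


def ratio_of_totals(times, system, base):
    """Total time of `system` divided by total time of `base`."""
    return sum(times[system]) / sum(times[base])


def mean_normalized(times, system, base):
    """Average of the per-benchmark ratios system/base."""
    ratios = [t / b for t, b in zip(times[system], times[base])]
    return sum(ratios) / len(ratios)


print(ratio_of_totals(times, "X", "Y"))  # 1.25: X takes 25% more time in total
print(mean_normalized(times, "Y", "X"))  # 1.25: with X as the base, Y looks 25% slower
print(mean_normalized(times, "X", "Y"))  # 1.25: with Y as the base, X looks 25% slower
```

With X as the base, X scores 1.0 on both benchmarks and appears better; with Y as the base, the same data make X appear worse.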
43Case Study 11.2 RISC vs. CISC
- Conclusion: RISC-I has the largest code size. The second processor, Z8002, requires 9% less code than RISC-I.
44RISC vs. CISC (Cont)
8.00
13.00
11.00
10.50
8.50
- Conclusion: Z8002 has the largest code size; it takes 18% more code than RISC-I.
Source: Patterson and Sequin (1982)
45Using an Appropriate Ratio Metric
Example
- Throughput: A is better
- Response time: A is worse
- Power: A is better
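The three conclusions above can coexist because power is itself a ratio. The sketch below assumes the common definition power = throughput / response time and uses hypothetical numbers for the two systems.

```python
def power(throughput, response_time):
    """Power metric, assumed here to be throughput divided by response time."""
    return throughput / response_time


# Hypothetical (throughput, response time) pairs for two systems.
systems = {"A": (10.0, 2.0), "B": (4.0, 1.0)}

for name, (throughput, response_time) in systems.items():
    print(name, throughput, response_time, power(throughput, response_time))
```

With these numbers A wins on throughput (10 vs. 4), loses on response time (2 vs. 1), and wins on power (5 vs. 4), so the choice of metric decides the winner.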
46Using Relative Performance Enhancement
- Example: two floating point accelerators
- Problem: incomparable bases. Need to try both on the same machine
47Ratio Games with Percentages
- Example: tests on two systems
- 1. System B is better on both tests
- 2. System A is better overall
(Tables: test results for System A and System B)
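Both statements can be true at once when the two systems run very different numbers of each test. The sketch below uses hypothetical run and pass counts; the slide's own tables did not survive in the text.

```python
# Hypothetical results: for each system, a list of (total runs, passes) per test.
results = {
    "A": [(300, 60), (50, 2)],
    "B": [(32, 8), (500, 40)],
}


def pass_rate(runs, passes):
    """Percentage of runs that passed."""
    return 100.0 * passes / runs


def overall_pass_rate(tests):
    """Pass percentage computed over all tests combined."""
    total_runs = sum(runs for runs, _ in tests)
    total_passes = sum(passes for _, passes in tests)
    return pass_rate(total_runs, total_passes)


for name, tests in results.items():
    per_test = [round(pass_rate(r, p), 1) for r, p in tests]
    print(name, per_test, round(overall_pass_rate(tests), 1))
```

B has the higher pass rate on each test taken separately, yet A has the higher combined rate, because each system's total is dominated by a different test.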
48Percentages (Cont)
- Other Misuses of Percentages
- Other misuses of percentages
- 1000% sounds more impressive than 11-fold, particularly if the performance before and after the improvement are both small
- Small sample sizes disguised in percentages
- Base = initial. A 400% reduction in prices => base = final
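The dependence of a percentage on its base is easy to verify numerically. A minimal sketch with hypothetical values and an illustrative function name:

```python
def percent_change(initial, final, base="initial"):
    """Percent change from initial to final, relative to the chosen base."""
    denominator = initial if base == "initial" else final
    return 100.0 * (final - initial) / denominator


# An 11-fold improvement is a 1000% increase relative to the initial value.
print(percent_change(1.0, 11.0))           # 1000.0

# A price cut from 5 to 1 (hypothetical): 80% of the initial price,
# but 400% when the final price is used as the base.
print(percent_change(5.0, 1.0, "initial"))  # -80.0
print(percent_change(5.0, 1.0, "final"))    # -400.0
```

A reduction of more than 100% is only possible when the final value is used as the base, which is why the base should always be stated.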
49Ratio Games Guidelines
- If one system is better on all benchmarks, contradicting conclusions cannot be drawn by any ratio game technique
50Guidelines (cont)
- Even if one system is better than the other on all benchmarks, a better relative performance can be shown by selecting an appropriate base.
- In the previous example, System A is 40% better than System B using raw data, 43% better using System A as the base, and 42% better using System B as the base.
- If a system is better on some benchmarks and worse on others, contradicting conclusions can be drawn in some cases, but not in all cases.
- If the performance metric is an LB metric, it is better to use your system as the base
- If the performance metric is an HB metric, it is better to use your opponent as the base
- Those benchmarks that perform better on your system should be elongated and those that perform worse should be shortened
51Numerical Conditions for Ratio Games
52Numerical Conditions (Cont)
53Numerical Conditions (Cont)
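(Figure: plane of the ratio of B/A response on benchmark i (horizontal axis) against the ratio of B/A response on benchmark j (vertical axis). The Raw Data, Base A, and Base B boundaries separate a region where A is better using all 3 from a region where B is better using all 3.)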
54Summary
- Ratio games arise from use of incomparable bases
- Ratios may be part of the metric
- Relative performance enhancements
- Percentages are ratios
- For HB metrics, it is better to use opponent as
the base
55Exercise 11.1
- The following table shows execution times of
three benchmarks I, J, and K on three systems A,
B, and C. Use ratio game techniques to show the
superiority of various systems.
56Exercise 11.2
- Derive conditions necessary for you to be able to
use the technique of combined percentages to your
advantage.
57Homework