Title: General Performance Metrics
1 General Performance Metrics
2 Today's Agenda
- Understanding Meaningful Performance Metrics
- IPS (Instructions Per Second)
- CPI (Clocks Per Instruction)
- Clock Frequency/Switching Speeds
- Understanding What Metrics Mean
- Benchmarking
- Design Issues
- Targeting What Will Get the Biggest Bang for the Buck
- How Design Affects Metrics
- Technology Issues
- Technology Maturation Gives More Transistors, Faster Clock
- Speedups
- Amdahl's Law
- Postpone Requirements Lecture
3 Homework 1
- From Book: 1.1, 1.4, 1.6, 1.12
- Due Date: Next Tuesday (1 week from today)
- Now
4 Performance
- "My machine is faster than yours."
- But what are you basing this claim on?
- Benchmark Programs: Interesting Programs That Can Be Used For Comparison Purposes
- Benchmarks can be your programs or other programs. The community created many benchmark programs, starting with scientific applications
- Types of Benchmarks
- Your Programs
- Code Fragments: Interesting kernels such as matrix multiply, sorting, etc.
- Synthetic: Statistical mixes of instructions that represent real applications
- Are you studying for the exam, or for the material covered in the exam?
- Designers Take Care
- Buyers Beware!! Many subtle factors can affect reported results
- Total Execution Time
- Absolute Measure. Let's break it down
5 Performance Thoughts
- hawk.c, rock.c, jayhawk.c: benchmark programs run on two machines (called A and B).
- We watch our Casio watches and record the following
- What can we deduce concerning the designs of the two machines from these numbers?
- Each Program has a set number of instructions
- Execution Time Varies. Why? Computer Architect's Concern
- Can Form Average
6 Performance Thoughts
- Need to think about the questions that are of interest to us
- Abstract machine model (ISA)
- Number, Types of Instructions per program
- Organization (memory, CPU, bus, I/O)
- Organization of Memory for Instruction/Data Fetching
- Internal CPU Organization (Bottlenecks, Functional Blocks)
- Implementation (switching speeds, densities)
- Clock Speed, Implementation/Timing of Instructions
What affects each of these parameters?
7 Aspects of (CPU) Performance
               Inst Count   CPI    Clock Rate
Program            X
Compiler           X        (X)
Inst. Set          X         X
Organization                 X         X
Technology                             X
8 Understanding Performance
- Quick glance at exercise
- Constants
- Source C Instructions
- Rock.c: 250,000
- Chawk.c: 325,000
- Jayhawk.c: 500,000
- Machine Clocks
- Machine A: 5 MHz
- Machine B: 7.5 MHz
- Variables
- Assembler Instructions
- Generated/Executed
- ISA Efficiency
- Compiler
- Clocks Per Instruction
- Mem <-> CPU
- Memory Hierarchy
- CPU Organization
- Let's look at some numbers
9 Performance Numbers Example
- Machine B is faster (7.5 MHz vs. 5 MHz)
- Machine B executes fewer assembler instructions in all cases
- Machine B is slower
- ??????
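One way to see how a machine with a faster clock and fewer instructions can still lose: execution time depends on instruction count, CPI, and clock rate together. The Python sketch below is illustrative only. The 5 MHz and 7.5 MHz clocks come from the exercise, but the instruction counts and CPI values are hypothetical numbers picked to show the effect; they are not the measured results.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """Execution time in seconds = instructions * cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_hz

# Clock rates are from the exercise; instruction counts and CPI are made up.
time_a = execution_time(instruction_count=1_000_000, cpi=2.0, clock_hz=5.0e6)
time_b = execution_time(instruction_count=900_000, cpi=4.0, clock_hz=7.5e6)

print(f"Machine A: {time_a:.2f} s")
print(f"Machine B: {time_b:.2f} s")
```

With these assumed values, Machine B needs more cycles per instruction, which outweighs both its faster clock and its smaller instruction count.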
10 Performance Example Continued
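11 Cycles Per Instruction
Average Cycles per Instruction
$\text{CPI} = (\text{CPU Time} \times \text{Clock Rate}) / \text{Instruction Count} = \text{Cycles} / \text{Instruction Count}$
$\text{CPU time} = \text{CycleTime} \times \sum_{i=1}^{n} \text{CPI}_i \times I_i$
Instruction Frequency
$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times F_i$, where $F_i = I_i / \text{Instruction Count}$
- Invest Resources where time is Spent!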
12 Example Calculating CPI
Base Machine (Reg / Reg)
Op       Freq   Cycles   CPI(i)   (% Time)
ALU      50%    1        .5       (33%)
Load     20%    2        .4       (27%)
Store    10%    2        .2       (13%)
Branch   20%    2        .4       (27%)
Total                    1.5
Typical Mix
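The table's arithmetic can be checked with a short Python sketch. The frequencies and cycle counts are the ones from the slide; the function names `average_cpi` and `time_share` are just illustrative.

```python
# (frequency, cycles) per instruction class, taken from the slide's table
mix = {
    "ALU":    (0.50, 1),
    "Load":   (0.20, 2),
    "Store":  (0.10, 2),
    "Branch": (0.20, 2),
}

def average_cpi(mix):
    """Weighted CPI: sum of frequency * cycles over all instruction classes."""
    return sum(freq * cycles for freq, cycles in mix.values())

def time_share(mix):
    """Fraction of total cycles spent in each instruction class."""
    total = average_cpi(mix)
    return {op: freq * cycles / total for op, (freq, cycles) in mix.items()}

cpi = average_cpi(mix)
shares = time_share(mix)

print(f"CPI = {cpi:.2f}")
for op, share in shares.items():
    print(f"{op:6s} {share:.0%}")
```

The time shares show where the cycles go, which is what "invest resources where time is spent" asks you to look at.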
13 Performance Thoughts
- Instead of Computers, Let's Talk Cars
- Porsche: "There is no substitute"
- 0-150 mph in 6.7 seconds (I am making this up!)
- Carries 2 people
- Gets 10 miles per gallon
- $75,000 (making this up also!)
- Ugh! Minivan: Mom's Taxi
- 0-60 mph (most of the time)
- Carries 6 people
- Gets 30 miles per gallon
- $20,000 (again, kinda guessing)
- Which Is the Better Choice?
14 Performance Thoughts
- Most young students probably don't like where this is leading.
- Suppose you want to transport 12 people 50 miles.
- Porsche
- Even though much faster, needs 6 round trips for 600 miles
- If average 80 miles per hour
- Single round trip (100 miles): 1.25 hours (call this turnaround time)
- Total time: 7.5 hours
- Average: 12 people / 7.5 hrs = 1.6 persons/hr
- Minivan
- Even though much slower, needs 2 round trips for 200 miles
- If average 60 miles per hour
- Single round trip (100 miles): 1.67 hours (slower than the Porsche for sure!)
- Total time: 3.3 hours (much better than the Porsche)
- Average: 12 people / 3.3 hrs ≈ 3.6 persons/hr
- The minivan is better on average than the Porsche, not to mention initial cost!
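The comparison can be reproduced with a small Python sketch. It follows the slide's own accounting, which is an assumption worth noting: every seat carries a passenger, and every trip, including the last, is counted as a full round trip. The function name `transport` is illustrative.

```python
import math

def transport(people, distance_miles, seats, avg_mph):
    """Return (round-trip hours, total hours, people moved per hour).

    Follows the slide's accounting: each trip is a full round trip
    (out and back), and all seats are filled with passengers.
    """
    trips = math.ceil(people / seats)
    round_trip_hours = 2 * distance_miles / avg_mph
    total_hours = trips * round_trip_hours
    return round_trip_hours, total_hours, people / total_hours

porsche = transport(people=12, distance_miles=50, seats=2, avg_mph=80)
minivan = transport(people=12, distance_miles=50, seats=6, avg_mph=60)

for name, (turnaround, total, rate) in [("Porsche", porsche), ("Minivan", minivan)]:
    print(f"{name}: turnaround {turnaround:.2f} h, total {total:.2f} h, {rate:.2f} persons/h")
```

The Porsche wins on turnaround time (latency), while the minivan wins on persons per hour (throughput).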
15 The Bottom Line: Performance (and Cost)
- "X is n times faster than Y" means
$\frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = n$
- Speed of Concorde vs. Boeing 747
- Throughput of Boeing 747 vs. Concorde
16 Performance Enhancements
- Technology Enhancements
- Clock Speeds continue to increase
- Gives Overall better performance
- Speedup: a General Ratio here, as clock applies to all
- Architecture Enhancements
- Can redesign parts of circuits
- Look for Biggest Bang for the Buck!!
- Speedup for only particular Instructions, functions, etc.
- Suppose we speed up ALU ops only
- New Equation for Calculating Speedup
17 Amdahl's Law
- Speedup due to enhancement E
$\text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E}$
- Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected
- Law helps us focus on what is important.
- Let's see why
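18 Amdahl's Law
$\text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right]$
$\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$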
19 Amdahl's Law
- Floating point instructions improved to run 2X, but only 10% of actual instructions are FP
$\text{ExTime}_{new} = \ ?$
$\text{Speedup}_{overall} = \ ?$
20 Amdahl's Law
- Floating point instructions improved to run 2X, but only 10% of actual instructions are FP
$\text{ExTime}_{new} = \text{ExTime}_{old} \times (0.9 + 0.1/2) = 0.95 \times \text{ExTime}_{old}$
$\text{Speedup}_{overall} = \frac{1}{0.95} = 1.053$
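The same calculation as a minimal Python sketch. As on the slide, the 10% is treated as the fraction of execution time affected by the enhancement; the function name `amdahl_speedup` is illustrative.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of execution time is sped up by a given factor."""
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time

# Floating-point example from the slide: 10% of the time, made 2x faster.
fp_speedup = amdahl_speedup(0.10, 2)
print(f"{fp_speedup:.3f}")

# Even an enormously faster FP unit stays below 1 / (1 - 0.10).
fp_limit = amdahl_speedup(0.10, 1e9)
print(f"{fp_limit:.3f}")
```

The second call shows why the unenhanced 90% dominates: no matter how large the FP speedup, the overall gain stays under about 1.11.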
21 Amdahl's Law
- Helps us understand what we will get in return for our investment.
- If the optimization is expensive, balance its cost against the performance returns you will get.
- Optimize for the biggest return
22 Summary, 1
- Designing to Last through Trends
         Capacity         Speed
Logic    2x in 3 years    2x in 3 years
DRAM     4x in 3 years    2x in 10 years
Disk     4x in 3 years    2x in 10 years
- 6 yrs to graduate => 16X CPU speed, DRAM/Disk size
- Time to run the task
- Execution time, response time, latency
- Tasks per day, hour, week, sec, ns
- Throughput, bandwidth
- "X is n times faster than Y" means
$\frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = n$
23 Summary, 2
- Amdahl's Law
- CPI Law
- Execution time is the REAL measure of computer performance!
- Good products are created when we have
- Good benchmarks, good ways to summarize performance
- Die Cost goes roughly with (die area)^4
- Can the PC industry support engineering/research investment?
24 Review, 3: Price vs. Cost
25 Next Lecture
- Instruction Set Architectures (ISAs)
- Read from Chapter 2: 2.1-2.5, 2.8