Title: Fundamentals of Computer Design
1. Fundamentals of Computer Design
2. Outline
- Performance Evolution
- The Task of a Computer Designer
- Technology and Computer Usage Trends
- Cost and Trends in Cost
- Measuring and Reporting Performance
- Quantitative Principles of Computer Design
3. Computer Architecture Is
- "The attributes of a computing system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation." (Amdahl, Blaauw, and Brooks, 1964)
4. Computer Architecture's Changing Definition
- 1950s to 1960s: Computer Architecture Course
  - Computer arithmetic
- 1970s to mid-1980s: Computer Architecture Course
  - Instruction set design, especially ISAs appropriate for compilers
- 1990s to 2000s: Computer Architecture Course
  - Design of the CPU, memory system, I/O system, and multiprocessors
5. Performance Evolution
- $1K today buys a gizmo better than $1M could buy in 1965
- 1970s
  - Mainframes dominated; performance improved 25-30%/yr
  - Mostly due to improved architecture, with some technology aids
- 1980s
  - The VLSI microprocessor became the foundation
  - Technology improves at 35%/yr
  - The death of machine-language programming created an opportunity
    - Mostly with UNIX and C in the mid-80s
    - Even most system programmers gave up assembly language
    - With this came the need for efficient compilers
6. Performance Evolution (Cont.)
- 1980s (cont.)
  - The compiler focus brought on the great CISC vs. RISC debate
  - With the exception of Intel, RISC won the argument
  - RISC performance improved by 50%/year initially
  - Of course, RISC is not as simple anymore, and the compiler is a key part of the game
  - It does not matter how fast your computer is if the compiler wastes most of it through an inability to generate efficient code
  - With the exploitation of instruction-level parallelism (pipelining, superscalar execution) and the use of caches, performance is further enhanced
- CISC: Complex Instruction Set Computing
- RISC: Reduced Instruction Set Computing (jokingly, "Relegate Important Stuff to the Compiler")
7. Growth in Performance (Figure 1.1)
[Figure: performance growth over time; early gains are technology driven, later gains are mainly due to advanced architectural ideas]
8. The Task of a Computer Designer
9. Aspects of Computer Design
- The changing face of computing brings different system design issues:
  - Desktop computing
  - Servers
  - Embedded computers
- Bottom line: it is a complex game
  - Determine the important attributes (perhaps a market issue): the functional requirements
  - THEN maximize performance
  - WHILE staying within the cost and power constraints
  - A classic conflicting-constraints problem
10. A Summary of the Three Computing Classes and Their System Characteristics
11. Functional Requirements
12. Functional Requirements (Cont.)
[Table: Functional Requirement vs. Typical Features Required or Supported]
13. Aspects of Computer Design
[Figure: the design stack spans software and hardware; our focus is architecture and implementation, which rest on VLSI, logic, power, and packaging]
14. The Task of a Computer Designer
15. The Task of a Computer Designer (Cont.)
[Figure: multiprocessors, networks, and interconnections. Processor (P) and memory (M) pairs attach through network interfaces and a switch (S) to an interconnection network, in a processor-memory-switch organization. Programming models include shared memory, message passing, and data parallelism; interconnect design issues include topologies, routing, bandwidth, latency, and reliability]
16. Optimizing the Design
- Usually the functional requirements are set by the company/marketplace
- Which design is optimal depends on the choice of metric:
  - Cost minimized → simple design
  - Performance maximized → complex design or better technology
  - Time to market minimized → also favors simplicity
- Oh, and you only get one shot
  - Requires heaps of simulation, and you must quantify everything
  - Inherent requirement for deep infrastructure and support
  - Plus, you must predict the trends
17. Key Trends That Must Always Be Tracked
- Usage patterns and the market
- Technology
- Cost and performance
18. Technology and Computer Usage Trends
19. Usage Trends
- Memory usage
  - The average program's memory needs grow by 50% to 100% per year
  - Impact: add an address bit each year (an instruction-set issue)
- Assembly language replaced by HLLs
  - Increasingly important role of compilers
  - Compiler and architecture people MUST now work together
- Whacked on pictures (even TV)
  - Graphics and multimedia capability
- Whacked on communications
  - I/O subsystems become a higher priority
20. Technology Trends
- Integrated circuits
  - Density increases at 35%/yr
  - Die size increases 10-20%/yr
  - The combination is a chip complexity growth rate of 55%/yr
  - Transistor speed increase is similar, but signal propagation does not track this curve, so clock rates don't go up as fast
- Semiconductor DRAM
  - Density quadruples every 3 years (approx. 60%/yr), in 4x steps
  - Cycle time decreases slowly: 33% in 10 years
  - Interface changes have improved bandwidth
21. Technology Trends (Cont.)
- Magnetic disk
  - Currently density improves at 100%/yr
  - Access time has improved by 33% in 10 years
- Network technology
  - Depends on the performance of both switches and the transmission system
  - 1 Gbit Ethernet became available about 5 years after 100 Mbit
  - Bandwidth doubles every year
- Scaling of transistor performance, wires, and power in ICs
22. Effects of Rapid Technology Trends
- Consider today's product cycle:
  - Concept to production: 2 years
  - AND the market requires something new every 6-12 months
- Implications:
  - Pipelined design efforts using multiple design teams
  - You have to design for a complexity target that can't be implemented until the end of the cycle (design for the next technology)
  - You can't afford to miss the best technology, so you have to chase the trends
23. Cost, Price, and Their Trends
24. Cost
- Clearly a marketplace issue: profit as a function of volume
- Let's focus on hardware costs
- Factors impacting cost:
  - Learning curve: manufacturing costs decrease over time
  - Yield: the percentage of manufactured devices that survives the testing procedure
  - Volume is also a key factor in determining cost
  - Commodities: products that are sold by multiple vendors in large volumes and are essentially identical
25. Learning Curve at Work
26. Integrated Circuit Costs
27. Remember This Comic?
28. Cost of an Integrated Circuit
- The cost of a packaged integrated circuit is:

  Cost of IC = (Cost of die + Cost of testing die + Cost of packaging and final test) / Final test yield

  Cost of die = Cost of wafer / (Dies per wafer × Die yield)

  Dies per wafer = [π × (Wafer diameter / 2)^2] / Die area - [π × Wafer diameter] / (2 × Die area)^0.5
29. Cost of an Integrated Circuit (Cont.)
- The fraction (or percentage) of good dies on a wafer is the die yield:

  Die yield = Wafer yield × (1 + (Defects per unit area × Die area) / α)^(-α)

- Where α is a parameter that corresponds roughly to the number of masking levels, a measure of manufacturing complexity, critical to die yield (α ≈ 4.0 is a good estimate).
- Die cost grows roughly as (Die area)^5
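A minimal Python sketch of these cost formulas. The die size, wafer size, and defect density come from the worked examples on the next two slides; the $5000 wafer cost is a hypothetical value for illustration only:

    import math

    def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
        # First term: usable wafer area / die area.
        # Second term: correction for partial dies lost along the wafer edge.
        return (math.pi * (wafer_diameter_cm / 2) ** 2 / die_area_cm2
                - math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2))

    def die_yield(defects_per_cm2, die_area_cm2, alpha=4.0, wafer_yield=1.0):
        # Yield model from the previous slide, with alpha ~ 4.0.
        return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** -alpha

    def die_cost(wafer_cost, wafer_diameter_cm, die_area_cm2, defects_per_cm2):
        good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                     * die_yield(defects_per_cm2, die_area_cm2))
        return wafer_cost / good_dies

    # Numbers from the worked examples: 30-cm wafer, 0.7 cm x 0.7 cm die,
    # 0.6 defects/cm^2. The wafer cost is an assumed figure.
    print(dies_per_wafer(30, 0.49))          # ~1347 dies
    print(die_yield(0.6, 0.49))              # ~0.75
    print(die_cost(5000.0, 30, 0.49, 0.6))   # ~$4.93 per good die (assumed wafer cost)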
30. Example: Finding the Number of Dies
- Find the number of dies per 30-cm wafer for a die that is 0.7 cm on a side.
- Answer: The die area is 0.49 cm². Thus

  Dies per wafer = [π × (30/2)^2] / 0.49 - [π × 30] / (2 × 0.49)^0.5 ≈ 1442 - 95 = 1347
31. Example: Finding the Die Yield
- Find the die yield for dies that are 1 cm on a side and 0.7 cm on a side, assuming a defect density of 0.6 per cm².
- Answer: The die areas are 1 cm² and 0.49 cm².
- For the larger die, the yield is

  Die yield = (1 + (0.6 × 1) / 4)^(-4) = 0.57

- For the smaller die, it is

  Die yield = (1 + (0.6 × 0.49) / 4)^(-4) = 0.75
32. Computer Designers and Chip Costs
- The computer designer affects die size, and hence cost, both by what functions are included on or excluded from the die and by the number of I/O pins.
33. Cost/Price
- Component costs
- Direct costs (add 10% to 30%): costs directly related to making a product
  - Labor, purchasing, scrap, warranty
- Gross margin (add 10% to 45%): the company's overhead that cannot be billed directly to one product
  - R&D, marketing, sales, equipment maintenance, rental, financing cost, pretax profits, taxes
- Average discount to get the list price (add 33% to 66%)
  - Volume discounts and/or retailer markup
34. Cost/Price Illustration
35. Cost/Price for Different Kinds of Systems
36. Measuring and Reporting Performance
37. Performance
- 2 key aspects (making one faster may slow the other):
  - Execution time (single task)
  - Throughput (multiple tasks)
- Comparing performance
  - The key measurement is the execution time of real programs
  - MIPS? MFLOPS?
  - Performance = 1 / execution time
  - If X is N times faster than Y:
    N = Execution time of Y / Execution time of X = Performance of X / Performance of Y
  - Similar for throughput comparisons
- Improved performance means decreasing execution time
38. Measuring Performance
- Several kinds of time:
  - Wall-clock time: response time, or elapsed time
    - Includes load, I/O delays, and OS overhead
  - CPU time: time spent computing your program
    - Factors out time spent waiting for I/O
    - But includes the OS as well as your program
    - Hence: system CPU time and user CPU time
39. OS Time
- The Unix time command reports:
  - User CPU time
  - System CPU time
  - Total elapsed time
  - The percentage of elapsed time that is user + system CPU time
  - This also tells you how much of the time you spent waiting
- BEWARE: OSes have a way of under-measuring themselves
- Example output: 90.7u 12.9s 2:39 65%
  - 90.7 s user CPU, 12.9 s system CPU, 2 min 39 s elapsed; (90.7 + 12.9) / 159 = 65%
40. Choosing Programs to Evaluate Performance
- Real applications: clearly the right choice
  - Issues: porting, and eliminating system-dependent activities
  - User burden: knowing which of your programs you really care about
- Modified (or scripted) applications
  - Enhance portability, or focus on particular aspects of system performance
- Kernels: small, key pieces of real programs
  - Best used to isolate the performance of individual features and to explain the reasons for differences in the performance of real programs
  - Livermore Loops and Linpack are examples
  - Not real programs, however: no user really runs them
41. Choosing Programs to Evaluate Performance (Cont.)
- Toy benchmarks: quicksort, puzzle
  - Beginning programming assignments
- Synthetic benchmarks
  - Try to match the average frequency of operations and operands of a large set of programs
  - No user really runs them; they are not even pieces of real programs
  - They typically reside in cache, so they don't test memory performance
- At the very least, you must understand what the benchmark code is in order to understand what it might be measuring
- Companies thrive or bust on benchmark performance
  - Hence they optimize for the benchmark
  - BEWARE, ALWAYS!!
42. Benchmark Suites
- SPEC (Standard Performance Evaluation Corporation)
  - http://www.spec.org
- Desktop benchmarks
  - CPU-intensive: SPEC CPU2000
  - Graphics-intensive: SPECviewperf
- Server benchmarks
  - CPU throughput-oriented: SPECrate
  - I/O activity: SPECSFS (NFS), SPECWeb
  - Transaction processing: TPC (Transaction Processing Council)
- Embedded benchmarks
  - EEMBC (EDN Embedded Microprocessor Benchmark Consortium)
43. Some PC Benchmarks
44. SPEC CPU2000 Benchmark Suites: Integer
45. SPEC CPU2000 Benchmark Suites: Floating Point
46. Reporting Performance Results
- Claim: "Spice takes X seconds on machine Y"
- Missing:
  - Spice version and input: what was the circuit?
  - Operational parameters: time step, duration
  - Compiler and version, optimization settings
  - Machine configuration: disk, memory, etc.
  - Source code modifications or hand-generated assembly language
- Reproducibility is a must
  - List everything another experimenter would need to duplicate the results
47. Benchmark Reporting
48. Other Problems
- Let's assume we can get the test jig specified properly
- See the following example:
  - Which is better?
  - By how much?
  - Are the programs equally important?
49. Some Aggregate Job-Mix Options
- Arithmetic mean: provides a simple average
  - Does not account for weight; all programs are treated as equal
- Weighted arithmetic mean
  - The weight is the frequency of use
  - Better, but beware of a dominant program time
  - Depends on the reference machine
50. Weighted Arithmetic Mean
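(The slide's figure is not reproduced; the standard definition it presents is:)

  Weighted arithmetic mean = Σ_i (w_i × Time_i)

where Time_i is the execution time of program i and the weights w_i, the frequencies of use, sum to 1.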
51. Normalized Time Metrics
- Geometric mean
  - Has the nice property that the ratio of the means equals the mean of the ratios
  - Consistent no matter which machine is the reference
- Better than arithmetic means, but:
  - Doesn't form an accurate prediction model; doesn't predict execution time
  - Still have to remain cautious (more drawbacks: pp. 37-39)
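In symbols, for n normalized execution times Ratio_i:

  Geometric mean = (Π_i Ratio_i)^(1/n)

and the consistency property above is GM(X_i) / GM(Y_i) = GM(X_i / Y_i), so the verdict does not depend on the reference machine.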
52. Normalized Time Metrics (Cont.)
- The arithmetic mean should not be used to average normalized execution times
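A small Python sketch of the pitfall, using made-up times for two programs on two machines: the arithmetic mean of normalized times changes its verdict with the reference machine, while the geometric mean does not.

    from math import prod

    # Hypothetical execution times (seconds) for two programs on two machines.
    times_A = [1.0, 1000.0]
    times_B = [10.0, 100.0]

    def arith_mean(xs):
        return sum(xs) / len(xs)

    def geo_mean(xs):
        return prod(xs) ** (1 / len(xs))

    for ref_name, ref in (("A", times_A), ("B", times_B)):
        norm_A = [a / r for a, r in zip(times_A, ref)]
        norm_B = [b / r for b, r in zip(times_B, ref)]
        print(f"reference {ref_name}: "
              f"AM A={arith_mean(norm_A):.2f} B={arith_mean(norm_B):.2f}  "
              f"GM A={geo_mean(norm_A):.2f} B={geo_mean(norm_B):.2f}")

    # With reference A, the arithmetic mean makes B look 5x worse; with
    # reference B, it makes A look 5x worse. The geometric mean says the
    # two machines tie under either reference.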
53. Quantitative Principles of Computer Design
54. Make the Common Case Fast
- The most pervasive principle of design
- Need to validate that the case really is common (or uncommon)
- Often, common cases are simpler than uncommon cases
  - e.g., than exceptions like overflow, interrupts, ...
  - Truly simple is usually both cheap and fast: the best of both worlds
- The trick is to quantify the advantage of a proposed enhancement
55. Amdahl's Law
A quantification of the diminishing-returns principle
- Defines the speedup gained from a particular feature
- Depends on 2 factors:
  - The fraction of the original computation time that can take advantage of the enhancement, i.e., the commonality of the feature
  - The level of improvement gained by the feature
- Amdahl's Law (next slide)
56. Amdahl's Law (Cont.)
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.
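Restated as the usual equations:

  ExTime_new = ExTime_old × ((1 - F) + F / S)

  Speedup_overall = ExTime_old / ExTime_new = 1 / ((1 - F) + F / S)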
57. Simple Example
Amdahl's Law says nothing about cost
- An important application:
  - FPSQRT: 20% of execution time
  - FP instructions: 50%
  - Other: 30%
- Designers say it costs the same to speed up:
  - FPSQRT by 40x
  - FP by 2x
  - Other by 8x
- Which one should you invest in?
- Straightforward: plug in the numbers and compare. BUT what's your guess?
58. And the Winner Is?
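Plugging the slide's numbers into Amdahl's Law, a short Python check (the original slide's answer table is not reproduced here):

    # Fraction of execution time affected, and the speedup offered, per option.
    options = {
        "FPSQRT x40": (0.20, 40),
        "FP x2":      (0.50, 2),
        "Other x8":   (0.30, 8),
    }

    def amdahl(fraction, speedup):
        # Overall speedup when `fraction` of the time is sped up by `speedup`.
        return 1 / ((1 - fraction) + fraction / speedup)

    for name, (f, s) in options.items():
        print(f"{name}: overall speedup = {amdahl(f, s):.3f}")

    # FPSQRT x40: 1.242, FP x2: 1.333, Other x8: 1.356.
    # "Other" wins: a modest 8x speedup on 30% of the time beats
    # a 40x speedup on only 20% of the time.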
59. Calculating CPU Performance
- All commercial machines are synchronous
  - Implies there is a clock which ticks once per cycle
- Hence there are 2 useful basic metrics:
  - Clock rate: today in MHz
  - Clock cycle time
- Clock cycle time = 1 / clock rate
  - e.g., a 250 MHz rate corresponds to a 4 ns cycle time
60. Calculating CPU Performance (Cont.)
- We tend to count instructions executed: IC (instruction count)
  - Note: looking at the object code is just a start
  - What we care about is the dynamic count; e.g., don't forget loops, recursion, branches, etc.
- CPI (clock cycles per instruction) is a figure of merit
61. Calculating CPU Performance (Cont.)
- 3 focus factors: cycle time, CPI, and IC
  - Sadly, they are interdependent, and making one better often makes another worse (but with small or predictable impacts)
- Cycle time depends on HW technology and organization
- CPI depends on organization (pipelining, caching, ...) and the ISA
- IC depends on the ISA and compiler technology
- Often CPIs are easier to deal with on a per-instruction-class basis
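The classic equation tying the three factors together:

  CPU time = IC × CPI × Clock cycle time = (IC × CPI) / Clock rate

and, with per-class instruction counts IC_i and per-class CPI_i:

  CPI = Σ_i (IC_i / IC) × CPI_i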
62. Simple Example
- Suppose we have made the following measurements:
  - Frequency of FP operations (other than FPSQR): 25%
  - Average CPI of FP operations: 4.0
  - Average CPI of other instructions: 1.33
  - Frequency of FPSQR: 2%
  - CPI of FPSQR: 20
- Two design alternatives:
  - Reduce the CPI of FPSQR to 2
  - Reduce the average CPI of all FP operations to 2
63. And the Winner Is?
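A short Python check using the per-class CPI equation above. It follows the textbook-style accounting for this example (FP operations taken as 25% of instructions at an average CPI of 4.0, the remaining 75% at 1.33, with FPSQR's cycles folded into the FP average); since the slide's own arithmetic is not shown, that interpretation is an assumption:

    # Base CPI from the measurements on the previous slide.
    cpi_original = 0.25 * 4.0 + 0.75 * 1.33          # ~2.0

    # Alternative 1: cut the CPI of FPSQR (2% of instructions) from 20 to 2.
    cpi_new_fpsqr = cpi_original - 0.02 * (20 - 2)   # ~1.64

    # Alternative 2: cut the average CPI of all FP operations to 2.
    cpi_new_fp = 0.75 * 1.33 + 0.25 * 2.0            # ~1.50

    for name, cpi in (("FPSQR -> 2", cpi_new_fpsqr), ("all FP -> 2", cpi_new_fp)):
        print(f"{name}: CPI = {cpi:.3f}, speedup = {cpi_original / cpi:.3f}")

    # FPSQR -> 2: speedup ~1.22; all FP -> 2: speedup ~1.33.
    # Winner: reducing the average CPI of all FP operations.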