Title: Computer Architecture CSE 3322
2. Computer Architecture CSE 3322
Send email to Anwar Hussani (sah6469_at_omega.uta.edu) with the names and emails of your four project team members by Mon., Sept. 20. If you are not on a team, send your email address to Anwar.
Web site: crystal.uta.edu/jpatters/cse3322
3. Arrays versus Pointers
Array approach
  proc1 ( int v[ ], int n )
  {
    int i;
    for ( i = 0; i < n; i = i + 1 )
      Some function of v[i]
  }
Pointer approach
  &v[0] means the address of v[0]
  *p means the object pointed to by p
  proc2 ( int *v, int n )
  {
    int *p;
    for ( p = &v[0]; p < &v[n]; p = p + 1 )
      Some function of *p
  }
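4. Arrays versus Pointers
Pointer approach
  &v[0] means the address of v[0]
  *p means the object pointed to by p
  proc2 ( int *v, int n )
  {
    int *p;
    for ( p = &v[0]; p < &v[n]; p = p + 1 )
      Some function of *p
  }
Figure: memory cells labeled v[n], v[n-1], v[j], v[2], v[1], v[0], with the pointer p moving from &v[0] toward &v[n].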
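As a minimal sketch, the two loop styles can be written as compilable C. It assumes that "some function" is a running sum; the names sum_array and sum_pointer are illustrative and do not come from the slides.

```c
/* Array approach: the index i is turned into an address on every access. */
long sum_array(const int v[], int n)
{
    long total = 0;
    for (int i = 0; i < n; i = i + 1)
        total += v[i];              /* stands in for "some function of v[i]" */
    return total;
}

/* Pointer approach: p walks from &v[0] up to, but not including, &v[n]. */
long sum_pointer(const int *v, int n)
{
    long total = 0;
    for (const int *p = &v[0]; p < &v[n]; p = p + 1)
        total += *p;                /* stands in for "some function of *p" */
    return total;
}
```

Both functions visit the same elements in the same order, so for any array they should return the same value. Only the way the element address is produced differs.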
8. Array approach - Ignore linkage code
  proc1 ( int v[ ], int n )
  {
    int i;
    for ( i = 0; i < n; i = i + 1 )
      Some function of v[i]
  }
  $a0 is v base, $a1 is n, $t0 is i

        add  $t0, $zero, $zero    # i = 0
  L1:   add  $t1, $t0, $t0        # $t1 = 2 * i
        add  $t1, $t1, $t1        # $t1 = 4 * i
        add  $t2, $a0, $t1        # $t2 = addr of v[i]
        lw   $s0, 0($t2)          # $s0 = v[i]
        Some function of v[i]
        addi $t0, $t0, 1          # i = i + 1
        slt  $t3, $t0, $a1        # $t3 = 1 if i < n
        bne  $t3, $zero, L1       # if $t3 = 1 goto L1
13. Pointer approach - Ignore linkage code
  proc2 ( int *v, int n )
  {
    int *p;
    for ( p = &v[0]; p < &v[n]; p = p + 1 )
      Some function of *p
  }
  $a0 is v base, $a1 is n, $t0 is p

        add  $t0, $a0, $zero      # p = addr of v[0]
        add  $t1, $a1, $a1        # $t1 = 2 * n
        add  $t1, $t1, $t1        # $t1 = 4 * n
        add  $t2, $a0, $t1        # $t2 = addr of v[n]
  L2:   lw   $s0, 0($t0)          # $s0 = *p
        Some function of *p
        addi $t0, $t0, 4          # p = p + 4
        slt  $t3, $t0, $t2        # $t3 = 1 if p < addr of v[n]
        bne  $t3, $zero, L2       # if $t3 = 1 goto L2
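The step "addi t0, t0, 4" corresponds to "p = p + 1" in C because pointer arithmetic is scaled by the element size. The sketch below assumes a 4-byte int, as on 32-bit MIPS; int_pointer_step and end_address are hypothetical helper names introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte distance between p + 1 and p for an int pointer (sizeof(int)). */
size_t int_pointer_step(void)
{
    int v[2] = {0, 0};
    const int *p = &v[0];
    return (size_t)((const char *)(p + 1) - (const char *)p);
}

/* Mirrors the setup before L2: addr of v[n] = base + 4*n, using adds only.
   Assumes 4-byte elements. */
uint32_t end_address(uint32_t base, uint32_t n)
{
    uint32_t t1 = n + n;    /* t1 = 2 * n */
    t1 = t1 + t1;           /* t1 = 4 * n */
    return base + t1;       /* t2 = addr of v[n] */
}
```

For example, with a made-up base address of 0x1000 and n = 10, end_address yields 0x1028, which is 40 bytes past the base.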
15.
- The pointer approach has 3 fewer instructions in the loop
- The pointer is incremented directly by 4
- This avoids multiplying the index by 4 on every pass
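Counting the instructions in the two listings, and leaving out the "some function" body, gives 1 setup instruction plus 7 per pass for the array version, and 4 setup instructions plus 4 per pass for the pointer version. The sketch below turns that count into code. The function names are illustrative, and n >= 1 is assumed because both loops test their condition at the bottom.

```c
/* Instructions executed by the array version: 1 before L1, 7 per pass. */
long array_instr_count(long n)
{
    return 1 + 7 * n;
}

/* Instructions executed by the pointer version: 4 before L2, 4 per pass. */
long pointer_instr_count(long n)
{
    return 4 + 4 * n;
}

/* Instructions saved by the pointer version for n passes. */
long instr_saved(long n)
{
    return array_instr_count(n) - pointer_instr_count(n);
}
```

Under this count the two versions tie at n = 1, and the pointer version saves 3 more instructions for every additional pass.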
16. Design Principles
- Simplicity favors regularity
  - All instructions the same size
  - Always 3 register operands in arithmetic instructions
  - Register fields in the same place in each format
17. Design Principles
- Simplicity favors regularity
- Smaller is faster
  - 32 registers
  - Reduced number of instructions
18. Design Principles
- Simplicity favors regularity
- Smaller is faster
- Good design demands good compromise
  - Word length vs. address and constant length
19. Design Principles
- Simplicity favors regularity
- Smaller is faster
- Good design demands good compromise
- Make the common case fast
  - Immediate addressing for constant operands
  - PC-relative addressing for branches
20. Design Principles
- Simplicity favors regularity
- Smaller is faster
- Good design demands good compromise
- Make the common case fast
- Evolution from CISC (Complex Instruction Set Computers) to RISC (Reduced Instruction Set Computers)
22. Different Architectures
Accumulator Architecture: one register and one operand
  Ex: A = B + C
    load  AddressB          Acc = B
    add   AddressC          Acc = B + C
    store AddressA          A = B + C
Register-Memory Architecture: a few registers and two operands
  Ex: A = B + C, assume B is in Reg2
    add   Reg2, AddressC    Reg1 = Reg2 + C
    store Reg1, AddressA    A = B + C
23. Different Architectures
Accumulator Architecture: one register and one operand
Register-Memory Architecture: a few registers and two operands
Load-Store (Register-Register) Architecture: many registers and three operands
  Ex: A = B + C
    add Reg1, Reg2, Reg3
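To compare the styles on the A = B + C example, the sketch below models memory as a small C array and writes each instruction as one statement. The accumulator sequence follows the slide. For the load-store machine the slide shows only the add, so the two loads and the store are an assumption about what surrounds it. All names (ADDR_A, run_accumulator, run_load_store) are hypothetical.

```c
/* Hypothetical memory cells holding A, B and C. */
enum { ADDR_A, ADDR_B, ADDR_C, MEM_WORDS };

/* Accumulator machine: one implicit register, one memory operand each.
   Returns the number of instructions executed. */
int run_accumulator(int mem[MEM_WORDS])
{
    int acc;
    acc = mem[ADDR_B];          /* load  AddressB   Acc = B     */
    acc = acc + mem[ADDR_C];    /* add   AddressC   Acc = B + C */
    mem[ADDR_A] = acc;          /* store AddressA   A = B + C   */
    return 3;
}

/* Load-store machine: arithmetic touches registers only, so operands
   are loaded first (assumed; the slide shows only the add). */
int run_load_store(int mem[MEM_WORDS])
{
    int reg1, reg2, reg3;
    reg2 = mem[ADDR_B];         /* load  Reg2, AddressB   (assumed) */
    reg3 = mem[ADDR_C];         /* load  Reg3, AddressC   (assumed) */
    reg1 = reg2 + reg3;         /* add   Reg1, Reg2, Reg3           */
    mem[ADDR_A] = reg1;         /* store Reg1, AddressA   (assumed) */
    return 4;
}
```

Both sequences leave B + C in A. Under these assumptions the load-store version executes more instructions when nothing is in a register yet, and every one of them uses the same simple format.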
24.
Year  Machine          Registers  Architecture          Instr. Length
1953  IBM 701          1          accumulator
1963  CDC 6600         8          load-store
1964  IBM 360          16         register-memory       2-6 bytes
1970  DEC PDP-11       8          register-memory
1972  Intel 8008       1          accumulator
1974  Motorola 6800    2          accumulator
1977  DEC VAX          16         register-memory       1-54 bytes
1978  Intel 8086       1          extended accumulator
1980  Motorola 68000   16         register-memory
1985  Intel 80386      8          register-memory       1-17 bytes
1985  MIPS             32         load-store            4 bytes
1987  SPARC            32         load-store            4 bytes
      Power PC         32         load-store            4 bytes
      DEC Alpha        32         load-store            4 bytes
2003  Intel Itanium    128        load-store            3 in 16 bytes
26. Performance of Computers
- Measure performance
- Compare measurements
- Understand what affects the measurements
- Uses
  - Selection of computers
  - Optimization of the design of
    - Architecture
    - Software
    - Hardware
27. Comparing Airplanes
Airplane          Capacity  Speed (mph)  Throughput (capacity x speed)
Boeing 777        375       610          228,750
Boeing 747        470       610          286,700
Concorde          132       1350         178,200
Douglas DC-8-50   146       544          79,424
Assess performance by specifications. Which is the highest performer: by speed, or by throughput?
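The throughput column is capacity multiplied by speed, and the "highest performer" changes with the metric chosen. The sketch below recomputes the column from the table's capacity and speed figures and picks the winner under each metric. The struct and function names are illustrative.

```c
#include <stddef.h>

struct airplane {
    const char *name;
    int capacity;       /* passengers */
    int speed_mph;      /* speed in miles per hour */
};

/* Figures copied from the table. */
static const struct airplane fleet[] = {
    {"Boeing 777",      375,  610},
    {"Boeing 747",      470,  610},
    {"Concorde",        132, 1350},
    {"Douglas DC-8-50", 146,  544},
};
static const size_t fleet_size = sizeof fleet / sizeof fleet[0];

/* Throughput in passenger-miles per hour. */
long throughput(const struct airplane *a)
{
    return (long)a->capacity * a->speed_mph;
}

/* Airplane with the highest speed. */
const struct airplane *fastest(void)
{
    const struct airplane *best = &fleet[0];
    for (size_t i = 1; i < fleet_size; i++)
        if (fleet[i].speed_mph > best->speed_mph)
            best = &fleet[i];
    return best;
}

/* Airplane with the highest throughput. */
const struct airplane *highest_throughput(void)
{
    const struct airplane *best = &fleet[0];
    for (size_t i = 1; i < fleet_size; i++)
        if (throughput(&fleet[i]) > throughput(best))
            best = &fleet[i];
    return best;
}
```

By speed the Concorde wins, while by throughput the Boeing 747 does, so the answer depends on what the user needs.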
29. Hamburger Stand
Figure (before): Customers -> Cashier -> Cook
Customer survey: "Takes too long"
Figure (after): Customers -> two Cashiers -> Cook
- Orders are taken faster (initial response time)
- Same time to burger (throughput)
32. Performance Measures
- Criteria depend on the user (application)
- Must examine the complete process (specs can mislead)
- Balance the performance of the parts (cook limited!)
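The "cook limited" remark can be illustrated with a two-stage model in which the slower stage sets the overall rate. This is only a sketch: the service times are made-up numbers, and stage_rate and stand_throughput are hypothetical names.

```c
/* Orders per minute one stage can sustain. */
double stage_rate(int workers, double minutes_per_order)
{
    return workers / minutes_per_order;
}

/* Throughput of the whole stand: limited by the slower of the two stages. */
double stand_throughput(int cashiers, double cashier_minutes,
                        int cooks, double cook_minutes)
{
    double cashier_rate = stage_rate(cashiers, cashier_minutes);
    double cook_rate = stage_rate(cooks, cook_minutes);
    return cashier_rate < cook_rate ? cashier_rate : cook_rate;
}
```

Assuming a cashier needs 1 minute per order and the cook needs 2, stand_throughput(1, 1.0, 1, 2.0) and stand_throughput(2, 1.0, 1, 2.0) both give 0.5 orders per minute. In this model, adding a cashier leaves throughput unchanged until the cook stage is sped up.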
35. Response Time: the time between the start and completion of a task. Also called Execution Time.
Another measure is Throughput: the number of tasks done in a given time. Usually they track one another, but not always.
Another consideration is the Initial Response Time: the time to get the first response to a user input for some task.
Note: the measure depends on the TASK!
36. Definition of Performance for Machine X for some task
\[ \text{Performance of X} = \frac{1}{\text{Execution Time of X}} \]
If the performance of X is greater than that of Y:
\[ \text{Performance of X} > \text{Performance of Y} \]
\[ \text{Execution Time of X} < \text{Execution Time of Y} \]
37. "Machine X is n times faster than Y" means
\[ \frac{\text{Performance of X}}{\text{Performance of Y}} = n \]
Then
\[ \frac{\text{Execution Time of Y}}{\text{Execution Time of X}} = n \]
38. Ex: Machine X runs a program in 0.15 sec and Machine Y takes 0.3 sec.
\[ n = \frac{\text{Execution Time of Y}}{\text{Execution Time of X}} = \frac{0.3}{0.15} = 2 \]
Machine X is 2 times faster than Y, or Machine Y is 2 times slower than X.
- To minimize confusion we will use
  - "Faster" to compare machines
  - "Improve" for performance and execution time
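The definitions of performance and "n times faster" can be checked numerically. In this sketch the function names are illustrative, and the inputs are the execution times from the example.

```c
/* Performance is the reciprocal of execution time. */
double performance(double exec_time_sec)
{
    return 1.0 / exec_time_sec;
}

/* How many times faster machine X is than machine Y:
   n = Execution Time of Y / Execution Time of X. */
double times_faster(double exec_time_x, double exec_time_y)
{
    return exec_time_y / exec_time_x;
}
```

Calling times_faster(0.15, 0.3) gives n = 2, and the ratio performance(0.15) / performance(0.3) gives the same value, as the definitions require.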
39. Execution Time is the total task time, including OS overhead, memory accesses, disk accesses, etc.
To relate to different specifications, another metric is CPU Execution Time: the time the CPU spends working on the task. It does not include waiting time and overhead.
For a program:
\[ \text{CPU Execution Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} \]
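A short sketch of this relation in C follows. The function names and the sample numbers are assumptions chosen for illustration. The second helper relies on the standard fact that clock cycle time is the reciprocal of clock rate.

```c
/* CPU execution time in seconds = clock cycles x clock cycle time. */
double cpu_execution_time(double clock_cycles, double cycle_time_sec)
{
    return clock_cycles * cycle_time_sec;
}

/* Clock cycle time in seconds for a clock rate given in Hz. */
double cycle_time_from_rate(double clock_rate_hz)
{
    return 1.0 / clock_rate_hz;
}
```

For instance, a hypothetical program that needs 2e9 cycles on a 1 GHz clock (1 ns cycle time) would take about 2 seconds of CPU time.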
43. Aspects of CPU Performance
                   instr. count   CPI   clock rate
Program                 X
Compiler                X
Instr. Set Arch         X          X
Implementation                     X        X
Technology                         X        X
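The three column headings are the factors of the standard CPU time relation: CPU time = instruction count x CPI / clock rate. The slide does not spell this formula out, so it is stated here as the usual textbook form. The sketch below uses an illustrative function name and made-up inputs.

```c
/* CPU time in seconds from the three factors named in the table.
   instr_count   : instructions executed by the program
   cpi           : average clock cycles per instruction
   clock_rate_hz : clock rate in cycles per second */
double cpu_time(double instr_count, double cpi, double clock_rate_hz)
{
    double clock_cycles = instr_count * cpi;    /* total cycles */
    return clock_cycles / clock_rate_hz;        /* cycles / (cycles per sec) */
}
```

With 1e9 instructions, a CPI of 2 and a 500 MHz clock, cpu_time(1e9, 2.0, 5e8) gives about 4 seconds. Halving any one factor while the others stay fixed changes the result by a factor of two, which is why each row of the table matters.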