Title: Lectures for 2nd Edition
1Lectures for 2nd Edition
- Note: these lectures are often supplemented with other materials and also problems from the text worked out on the blackboard. You'll want to customize these lectures for your class. The student audience for these lectures has had assembly language programming and exposure to logic design.
2Chapter 1
3Introduction
- Rapidly changing field:
- vacuum tube -> transistor -> IC -> VLSI (see section 1.4)
- doubling every 1.5 years: memory capacity, processor speed (due to advances in technology and organization)
- Things you'll be learning:
- how computers work, a basic foundation
- how to analyze their performance (or how not to!)
- issues affecting modern processors (caches, pipelines)
- Why learn this stuff?
- you want to call yourself a computer scientist
- you want to build software people use (need performance)
- you need to make a purchasing decision or offer expert advice
4What is a computer?
- Components
- input (mouse, keyboard)
- output (display, printer)
- memory (disk drives, DRAM, SRAM, CD)
- network
- Our primary focus: the processor (datapath and control)
- implemented using millions of transistors
- Impossible to understand by looking at each transistor
- We need...
5Abstraction
- Delving into the depths reveals more information
- An abstraction omits unneeded detail, helps us cope with complexity
What are some of the details that appear in these familiar abstractions?
6Instruction Set Architecture
- A very important abstraction
- interface between hardware and low-level software
- standardizes instructions, machine language bit patterns, etc.
- advantage: different implementations of the same architecture
- disadvantage: sometimes prevents using new innovations
True or False: Binary compatibility is extraordinarily important?
- Modern instruction set architectures:
- 80x86/Pentium/K6, PowerPC, DEC Alpha, MIPS, SPARC, HP
7Where we are headed
- Performance issues (Chapter 2): vocabulary and motivation
- A specific instruction set architecture (Chapter 3)
- Arithmetic and how to build an ALU (Chapter 4)
- Constructing a processor to execute our instructions (Chapter 5)
- Pipelining to improve performance (Chapter 6)
- Memory: caches and virtual memory (Chapter 7)
- I/O (Chapter 8)
Key to a good grade: reading the book!
8Chapter 3
9Basic Components (revisited)
[Figure: the basic components of a computer: compiler, interface, processor (control and datapath, including the ALU), memory (holding the program), input, and output]
11Instructions
- Language of the Machine
- More primitive than higher level languages, e.g., no sophisticated control flow
- Very restrictive, e.g., MIPS Arithmetic Instructions
- We'll be working with the MIPS instruction set architecture
- similar to other architectures developed since the 1980's
- used by NEC, Nintendo, Silicon Graphics, Sony
- Design goals: maximize performance and minimize cost, reduce design time
12MIPS arithmetic
- All instructions have 3 operands
- Operand order is fixed (destination first)
- Example:
  C code: A = B + C
  MIPS code: add $s0, $s1, $s2 (registers associated with variables by compiler)
[Figure: processor (control, registers $s0, $s1, $s2, $s3, $s4, ..., ALU) and memory (program)]
13MIPS arithmetic
- Design Principle: simplicity favors regularity.
- Of course this complicates some things...
  C code: A = B + C + D;
          E = F - A;
  MIPS code: add $t0, $s1, $s2
             add $s0, $t0, $s3
             sub $s4, $s5, $s0
- Operands must be registers, only 32 registers provided
- Design Principle: smaller is faster.
14Registers vs. Memory
- Arithmetic instruction operands must be registers, only 32 registers provided
- Compiler associates variables with registers
- What about programs with lots of variables?
15Memory Organization
- Viewed as a large, single-dimension array, with an address.
- A memory address is an index into the array
- "Byte addressing" means that the index points to a byte of memory.
[Figure: memory as an array of bytes; addresses 0, 1, 2, 3, 4, 5, 6, ... each hold 8 bits of data]
16Memory Organization
- Bytes are nice, but most data items use larger "words"
- For MIPS, a word is 32 bits or 4 bytes.
- 2^32 bytes with byte addresses from 0 to 2^32 - 1
- 2^30 words with byte addresses 0, 4, 8, ... 2^32 - 4
- Words are aligned, i.e., what are the least 2 significant bits of a word address?
- Tip:
  4 (decimal)  = 0000 0000 0000 0000 0000 0000 0000 0100 (binary)
  8 (decimal)  = 0000 0000 0000 0000 0000 0000 0000 1000 (binary)
  12 (decimal) = 0000 0000 0000 0000 0000 0000 0000 1100 (binary)
  16 (decimal) = 0000 0000 0000 0000 0000 0000 0001 0000 (binary)
- Registers hold 32 bits of data
[Figure: memory as an array of words; byte addresses 0, 4, 8, 12, ... each hold 32 bits of data]
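The alignment question above can be checked with a few lines of Python. This is only an illustrative sketch; the helper names `is_word_aligned` and `word_index` are made up for this example and are not part of MIPS or the lectures.

```python
WORD_BYTES = 4  # a MIPS word is 32 bits = 4 bytes


def is_word_aligned(address: int) -> bool:
    """A byte address is word aligned when its two low-order bits are 00."""
    return address & 0b11 == 0


def word_index(address: int) -> int:
    """Index (0, 1, 2, ...) of the word that contains this byte address."""
    return address // WORD_BYTES


# Print the addresses from the tip in binary, grouped as on the slide.
for addr in (4, 8, 12, 16):
    bits = f"{addr:032b}"
    grouped = " ".join(bits[i:i + 4] for i in range(0, 32, 4))
    print(f"{addr:>2} = {grouped}  aligned={is_word_aligned(addr)}")
```

Every multiple of 4 ends in the bit pattern 00, which is why the last word in a 2^32-byte memory starts at byte address 2^32 - 4.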
17Instructions to perform data transfers
[Figure: LW (Load Word) copies 32 bits of data from memory (byte addresses 0, 4, 8, ..., 4294967292) into one of the processor's registers (0 to 31); SW (Store Word) copies 32 bits of data from a register back to memory. The ALU works on the registers.]
18Instructions
- Load and store instructions
- Example:
  C code: A[8] = h + A[8];
  MIPS code: lw $t0, 32($s3)
             add $t0, $s2, $t0
             sw $t0, 32($s3)
- Store word has destination last
- Remember arithmetic operands are registers, not memory!
[Figure: memory layout by byte address: 4: h, 8: A[0], 12: A[1], 16: A[2], 20: A[3], 24: A[4], 28: A[5], 32: A[6], 36: A[7], 40: A[8], 44: A[9]]
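As a rough illustration of the load/add/store sequence for A[8] = h + A[8], the sketch below models memory as a Python dictionary keyed by byte address, using the layout from the figure (h at address 4, A[0] at address 8). The function name `run` and the sample values are assumptions for this example only.

```python
A_BASE = 8  # byte address of A[0], taken from the memory figure


def run(h: int, a8: int) -> int:
    """Execute lw / add / sw for A[8] = h + A[8]; return the new A[8]."""
    memory = {4: h}
    for i in range(10):
        memory[A_BASE + 4 * i] = 0      # A[0] .. A[9], one word each
    memory[A_BASE + 4 * 8] = a8         # A[8] lives at byte address 40

    # $s2 holds h and $s3 holds the base address of A, as in the slide.
    regs = {"$s2": h, "$s3": A_BASE, "$t0": 0}

    def lw(rt, offset, rs):             # rt = Memory[rs + offset]
        regs[rt] = memory[regs[rs] + offset]

    def sw(rt, offset, rs):             # Memory[rs + offset] = rt
        memory[regs[rs] + offset] = regs[rt]

    def add(rd, rs, rt):                # rd = rs + rt, kept to 32 bits
        regs[rd] = (regs[rs] + regs[rt]) & 0xFFFFFFFF

    lw("$t0", 32, "$s3")
    add("$t0", "$s2", "$t0")
    sw("$t0", 32, "$s3")
    return memory[40]


print(run(h=5, a8=7))
```

The offset 32 is 8 words times 4 bytes per word, which is why the C index 8 becomes 32 in the assembly.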
19Our First Example
- Can we figure out the code?
  swap(int v[], int k)
  { int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  }
  swap: muli $2, $5, 4
        add $2, $4, $2
        lw $15, 0($2)
        lw $16, 4($2)
        sw $16, 0($2)
        sw $15, 4($2)
        jr $31
20So far we've learned
- MIPS: loading words but addressing bytes; arithmetic on registers only
  Instruction          Meaning
  add $s1, $s2, $s3    $s1 = $s2 + $s3
  sub $s1, $s2, $s3    $s1 = $s2 - $s3
  lw $s1, 100($s2)     $s1 = Memory[$s2+100]
  sw $s1, 100($s2)     Memory[$s2+100] = $s1
21Machine Language
- Instructions, like registers and words of data, are also 32 bits long
- Example: add $t0, $s1, $s2
- registers have numbers: $t0 = 8, $s1 = 17, $s2 = 18
- Instruction Format:
  op      rs     rt     rd     shamt  funct
  000000  10001  10010  01000  00000  100000
- Can you guess what the field names stand for?
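The R-type layout above (6-bit op, three 5-bit register fields, 5-bit shamt, 6-bit funct) can be packed into a 32-bit word with shifts. The helper name `encode_r` is hypothetical; the field values are the ones given on the slide for add $t0, $s1, $s2.

```python
def encode_r(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six R-type fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2  with  $t0 = 8, $s1 = 17, $s2 = 18
word = encode_r(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)

bits = f"{word:032b}"
# Split back into the 6/5/5/5/5/6 field widths to compare with the slide.
fields = [bits[0:6], bits[6:11], bits[11:16], bits[16:21], bits[21:26], bits[26:32]]
print(" ".join(fields))
print(f"0x{word:08X}")
```

Note that the destination register (rd) sits in the fourth field of the machine word even though it is written first in assembly.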
22Machine Language
- Consider the load-word and store-word instructions
- What would the regularity principle have us do?
- New principle: Good design demands a compromise
- Introduce a new type of instruction format
- I-type for data transfer instructions
- other format was R-type for register
- Example: lw $t0, 32($s2)
  op   rs   rt   16 bit number
  35   18   8    32
- Where's the compromise?
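A similar sketch for the I-type format, with the same caveat that `encode_i` and `decode_i` are names invented for this example. It uses the register numbers stated earlier in the lectures ($t0 = 8, $s2 = 18) and the lw opcode 35.

```python
def encode_i(op: int, rs: int, rt: int, imm: int) -> int:
    """Pack op (6 bits), rs (5), rt (5) and a 16-bit immediate into 32 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i(word: int) -> tuple:
    """Recover (op, rs, rt, imm) from an I-type word; imm is left unsigned."""
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, word & 0xFFFF


# lw $t0, 32($s2): base register $s2 goes in rs, destination $t0 goes in rt.
lw_word = encode_i(op=35, rs=18, rt=8, imm=32)
print(f"0x{lw_word:08X}", decode_i(lw_word))
```

The first three fields keep the same positions and widths as in the R-type format; only the last 16 bits are reinterpreted, which is one way to read the compromise.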
23Stored Program Concept
- Instructions are bits
- Programs are stored in memory to be read or written just like data
- Fetch & Execute Cycle:
- Instructions are fetched and put into a special register
- Bits in the register "control" the subsequent actions
- Fetch the next instruction and continue
[Figure: memory for data, programs, compilers, editors, etc. Addresses 4 to 16 hold the instructions add $t0,$s1,$s2; add $t1,$s3,$s4; sub $s1,$t0,$t1; lw $t0,0($s2); further addresses (20, 24, 28, ..., 1024, 1028, 1032) hold other contents]
24Control
[Figure, four animation frames: the PC (program counter) holds the address of the instruction being fetched and steps 4 -> 8 -> 12 -> 16 through this code]
  Address  Instruction
  4        add $t0,$s1,$s2
  8        add $t1,$s3,$s4
  12       sub $s1,$t0,$t1
  16       lw $t0,0($s2)
28Control
- Decision making instructions
- alter the control flow,
- i.e., change the "next" instruction to be executed
- MIPS conditional branch instructions:
  bne $t0, $t1, Label
  beq $t0, $t1, Label
- Example: if (i==j) h = i + j;
         bne $s0, $s1, Label
         add $s3, $s0, $s1
  Label: ....
[Flowchart: i == j? YES: h = i + j; NO: skip the addition]
29Control
[Figure, five animation frames: PC trace for the if statement]
  Address  Instruction
  4        bne $s0, $s1, Label
  8        add $s3, $s0, $s1
  12       Label: ....
- i == j (YES): the PC steps 4 -> 8 -> 12, so h = i + j is executed
- i != j (NO): the PC steps 4 -> 12, skipping the add
34Control
- MIPS unconditional branch instructions: j label
- Example:
  if (i != j)            beq $s4, $s5, Lab1
      h = i + j;         add $s3, $s4, $s5
  else                   j Lab2
      h = i - j;   Lab1: sub $s3, $s4, $s5
                   Lab2: ...
[Flowchart: i != j? YES: h = i + j; NO: h = i - j]
35Control
[Figure, eight animation frames: PC trace for the if/else statement]
  Address  Instruction
  4        beq $s4, $s5, Lab1
  8        add $s3, $s4, $s5
  12       j Lab2
  16       Lab1: sub $s3, $s4, $s5
  20       Lab2: ...
- i != j (YES): the PC steps 4 -> 8 -> 12 -> 20, so h = i + j
- i == j (NO): the PC steps 4 -> 16 -> 20, so h = i - j
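The PC traces for the if/else example can be reproduced with a very small interpreter. This is a simplified sketch, not how the hardware works: labels are resolved to absolute addresses by hand (Lab1 = 16, Lab2 = 20), and the function name `run` and the tuple encoding of instructions are assumptions for this example.

```python
def run(i: int, j: int):
    """Trace the PC through the if/else code; return (trace, h)."""
    regs = {"$s3": 0, "$s4": i, "$s5": j}
    program = {
        4:  ("beq", "$s4", "$s5", 16),         # beq $s4, $s5, Lab1
        8:  ("add", "$s3", "$s4", "$s5"),      # h = i + j
        12: ("j", 20),                         # j Lab2
        16: ("sub", "$s3", "$s4", "$s5"),      # Lab1: h = i - j
    }
    pc, trace = 4, []
    while pc in program:
        trace.append(pc)
        instr = program[pc]
        op = instr[0]
        pc += 4                                # default: next instruction
        if op == "beq" and regs[instr[1]] == regs[instr[2]]:
            pc = instr[3]                      # branch taken
        elif op == "j":
            pc = instr[1]                      # unconditional jump
        elif op == "add":
            regs[instr[1]] = regs[instr[2]] + regs[instr[3]]
        elif op == "sub":
            regs[instr[1]] = regs[instr[2]] - regs[instr[3]]
    trace.append(pc)                           # address of Lab2, where we stop
    return trace, regs["$s3"]


print(run(1, 2))   # i != j
print(run(3, 3))   # i == j
```

Only branches and jumps overwrite the default PC + 4 update; every other instruction falls through to the next address.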
43So far
  Instruction          Meaning
  add $s1,$s2,$s3      $s1 = $s2 + $s3
  sub $s1,$s2,$s3      $s1 = $s2 - $s3
  lw $s1,100($s2)      $s1 = Memory[$s2+100]
  sw $s1,100($s2)      Memory[$s2+100] = $s1
  bne $s4,$s5,L        Next instr. is at Label if $s4 != $s5
  beq $s4,$s5,L        Next instr. is at Label if $s4 = $s5
  j Label              Next instr. is at Label
- Formats: R, I, J
44Control Flow
- We have beq, bne; what about Branch-if-less-than?
- New instruction: slt $t0, $s1, $s2
  if $s1 < $s2 then $t0 = 1 else $t0 = 0
- Can use this instruction to build "blt $s1, $s2, Label"; can now build general control structures
- Note that the assembler needs a register to do this; there are policy-of-use conventions for registers
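To make the "build blt from slt" idea concrete, here is a small sketch. It models the usual expansion (slt into a temporary register, then bne against zero); the function names are hypothetical, and plain Python integers stand in for signed 32-bit register values.

```python
def slt(rs: int, rt: int) -> int:
    """Set on less than: 1 if rs < rt (signed comparison), else 0."""
    return 1 if rs < rt else 0


def blt_taken(s1: int, s2: int) -> bool:
    """Would 'blt $s1, $s2, Label' branch?

    Modeled as two real instructions:
        slt  tmp, $s1, $s2     # tmp = 1 if $s1 < $s2
        bne  tmp, $zero, Label # branch if tmp != 0
    where tmp is the register the assembler keeps for itself.
    """
    tmp = slt(s1, s2)
    return tmp != 0


for a, b in [(1, 2), (2, 2), (3, 2), (-1, 0)]:
    print(a, b, blt_taken(a, b))
```

Because the expansion overwrites the temporary register, the assembler needs one that ordinary code agrees not to use, which is the reason for the register usage conventions.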
45Policy of Use Conventions
  Register  Name
  0         zero
  1         reserved
  2-3       v0-v1
  4-7       a0-a3
  8-15      t0-t7
  16-23     s0-s7
  24-25     t8-t9
  26-27     reserved
  28        gp
  29        sp
  30        fp
  31        ra
46Logic Operations
- Logic Operations (Format R)
- AND (op = 000000 binary, funct = 36 decimal = 100100 binary)
- 0 and 0 = 0
- 0 and 1 = 0
- 1 and 0 = 0
- 1 and 1 = 1
- OR (op = 000000 binary, funct = 37 decimal = 100101 binary)
- 0 or 0 = 0
- 0 or 1 = 1
- 1 or 0 = 1
- 1 or 1 = 1
  AND example:
      1000 1010 1010 1001 1011 1000 0001 1000
  and 0111 0011 0010 1001 0000 0000 0000 0000
    = 0000 0010 0010 1001 0000 0000 0000 0000
  OR example:
      1000 1010 1010 1001 1011 1000 0001 1000
  or  0111 0011 0010 1001 0000 0000 0000 0000
    = 1111 1011 1010 1001 1011 1000 0001 1000
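The two bit-pattern examples can be checked directly with Python's bitwise operators. The helper `groups` is only a formatting aid invented for this sketch.

```python
def groups(value: int) -> str:
    """Format a 32-bit value as eight space-separated 4-bit groups."""
    bits = f"{value & 0xFFFFFFFF:032b}"
    return " ".join(bits[i:i + 4] for i in range(0, 32, 4))


a = 0b1000_1010_1010_1001_1011_1000_0001_1000
b = 0b0111_0011_0010_1001_0000_0000_0000_0000

print("a       =", groups(a))
print("b       =", groups(b))
print("a and b =", groups(a & b))   # bitwise AND, one bit position at a time
print("a or  b =", groups(a | b))   # bitwise OR, one bit position at a time
```

AND keeps a bit only where both operands have a 1, so it is handy for masking; OR keeps a bit where either operand has a 1, so it is handy for combining fields.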
47Constants
- Small constants are used quite frequently (50% of operands), e.g.,
  A = A + 5;
  B = B + 1;
  C = C - 18;
- Solutions? Why not?
- put 'typical constants' in memory and load them.
- create hard-wired registers (like $zero) for constants like one.
- MIPS Instructions:
  addi $29, $29, 4
  slti $8, $18, 10
  andi $29, $29, 6
  ori $29, $29, 4
- How do we make this work?
48How about larger constants?
- We'd like to be able to load a 32 bit constant into a register
- Must use two instructions; new "load upper immediate" instruction:
  lui $t0, 1010101010101010
- Then must get the lower order bits right, i.e.,
  ori $t0, $t0, 1010101010101111
  Step                      Upper 16 bits      Lower 16 bits
  lui $t0, 43690            1010101010101010   0000000000000000
  ori operand (43695)       0000000000000000   1010101010101111
  $t0 after ori             1010101010101010   1010101010101111
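The lui/ori pair can be modeled in a few lines. This is an illustrative sketch with made-up helper names; it also shows how a 32-bit constant would be split into the two 16-bit immediates.

```python
def lui(imm16: int) -> int:
    """Load upper immediate: imm16 goes in the top 16 bits, low 16 bits are 0."""
    return (imm16 & 0xFFFF) << 16


def ori(value: int, imm16: int) -> int:
    """OR immediate: the 16-bit immediate is zero-extended, then ORed in."""
    return value | (imm16 & 0xFFFF)


def split_constant(c: int) -> tuple:
    """Split a 32-bit constant into (upper, lower) 16-bit halves."""
    return (c >> 16) & 0xFFFF, c & 0xFFFF


t0 = lui(43690)          # 1010101010101010 0000000000000000
t0 = ori(t0, 43695)      # 1010101010101010 1010101010101111
print(f"{t0:032b}", hex(t0))
print(split_constant(t0))
```

ori is a good fit for the second step because it zero-extends its immediate, so the upper half written by lui is left untouched.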
49Assembly Language vs. Machine Language
- Assembly provides convenient symbolic representation
- much easier than writing down numbers
- e.g., destination first
- Machine language is the underlying reality
- e.g., destination is no longer first
- Assembly can provide 'pseudoinstructions'
- e.g., "move $t0, $t1" exists only in Assembly
- would be implemented using "add $t0,$t1,$zero"
- When considering performance you should count real instructions
50Other Issues
- Things we are not going to cover:
  support for procedures
  linkers, loaders, memory layout
  stacks, frames, recursion
  manipulating strings and pointers
  interrupts and exceptions
  system calls and conventions
- Some of these we'll talk about later
- We've focused on architectural issues
- basics of MIPS assembly language and machine code
- we'll build a processor to execute these instructions.
51Overview of MIPS
- simple instructions, all 32 bits wide
- very structured, no unnecessary baggage
- only three instruction formats:
  R: op | rs | rt | rd | shamt | funct
  I: op | rs | rt | 16 bit address
  J: op | 26 bit address
- rely on compiler to achieve performance: what are the compiler's goals?
- help compiler where we can
52Addresses in Branches and Jumps
- Instructions:
  bne $t4,$t5,Label    Next instruction is at Label if $t4 != $t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5
  j Label              Next instruction is at Label
- Formats:
  I: op | rs | rt | 16 bit address
  J: op | 26 bit address
- Addresses are not 32 bits. How do we handle this with load and store instructions?
53Addresses in Branches
- Instructions:
  bne $t4,$t5,Label    Next instruction is at Label if $t4 != $t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5
- Formats:
  I: op | rs | rt | 16 bit address
- Could specify a register (like lw and sw) and add it to address
- use Instruction Address Register (PC = program counter)
- most branches are local (principle of locality)
- Jump instructions just use high order bits of PC
- address boundaries of 256 MB
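The sketch below shows how the 16-bit branch field and the 26-bit jump field turn into full 32-bit addresses. It follows the standard MIPS convention, in which both are taken relative to the address of the following instruction (PC + 4) and count words rather than bytes; the function names and sample addresses are assumptions for this example.

```python
def sign_extend16(x: int) -> int:
    """Interpret the low 16 bits of x as a two's-complement number."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def branch_target(pc: int, offset16: int) -> int:
    """PC-relative: (PC + 4) plus the signed word offset times 4."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def jump_target(pc: int, field26: int) -> int:
    """Jump: top 4 bits of (PC + 4) joined with the 26-bit word address times 4."""
    return ((pc + 4) & 0xF0000000) | ((field26 & 0x3FFFFFF) << 2)


print(hex(branch_target(0x00400000, 3)))        # forward branch
print(hex(branch_target(0x00400020, 0xFFFC)))   # backward branch, offset -4
print(hex(jump_target(0x00400000, 0x0100000)))
print((1 << 28) // 2**20, "MB reachable by a jump")
```

A 26-bit word address covers 2^28 bytes, which is where the 256 MB boundary on the slide comes from; the upper 4 bits can only change by other means, such as a jump through a register.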
54Addressing Modes
[Figure: base addressing. lw $t0, 36($s3) is encoded as op = 35, rs = 19, rt = 8, address = 36; $s3 is register 19]
57Addressing Modes
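[Figure: PC-relative addressing for branches. The code at addresses 04 to 32 is: bne, add, sub, Label: lw, add, sub, sw, beq. bne $t4,$t5,Label is shown with fields 5, 12, 13, 3 and beq $t4,$t5,Label with fields 4, 12, 13, -4; the 16-bit field is a word offset (multiplied by 4) relative to the PC]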
58Addressing Modes
[Figure: addressing for jumps. The same code at addresses 04 to 32, with j Label shown with fields 2, 4; the 26-bit field is a word address (multiplied by 4) combined with the 4 most significant bits of the PC]
59Alternative Architectures
- Design alternative:
- provide more powerful operations
- goal is to reduce number of instructions executed
- danger is a slower cycle time and/or a higher CPI
- Sometimes referred to as RISC vs. CISC
- virtually all new instruction sets since 1982 have been RISC
- VAX: minimize code size, make assembly language easy; instructions from 1 to 54 bytes long!
- We'll look at PowerPC and 80x86
60PowerPC
- Indexed addressing
- example: lw $t1,$a0+$s3    # $t1 = Memory[$a0+$s3]
- What do we have to do in MIPS?
- Update addressing
- update a register as part of load (for marching through arrays)
- example: lwu $t0,4($s3)    # $t0 = Memory[$s3+4]; $s3 = $s3+4
- What do we have to do in MIPS?
- Others:
- load multiple/store multiple
- a special counter register: bc Loop    # decrement counter, if not 0 goto loop
6180x86
- 1978: The Intel 8086 is announced (16 bit architecture)
- 1980: The 8087 floating point coprocessor is added
- 1982: The 80286 increases address space to 24 bits, adds instructions
- 1985: The 80386 extends to 32 bits, new addressing modes
- 1989-1995: The 80486, Pentium, Pentium Pro add a few instructions (mostly designed for higher performance)
- 1997: MMX is added
This history illustrates the impact of the "golden handcuffs" of compatibility: "adding new features as someone might add clothing to a packed bag"; "an architecture that is difficult to explain and impossible to love"
62A dominant architecture 80x86
- See your textbook for a more detailed description
- Complexity:
- Instructions from 1 to 17 bytes long
- one operand must act as both a source and destination
- one operand can come from memory
- complex addressing modes, e.g., base or scaled index with 8 or 32 bit displacement
- Saving grace:
- the most frequently used instructions are not too difficult to build
- compilers avoid the portions of the architecture that are slow
- "what the 80x86 lacks in style is made up in quantity, making it beautiful from the right perspective"
63Summary
- Instruction complexity is only one variable
- lower instruction count vs. higher CPI / lower clock rate
- Design Principles:
- simplicity favors regularity
- smaller is faster
- good design demands compromise
- make the common case fast
- Instruction set architecture
- a very important abstraction indeed!
64Appendix A
- Using SPIM
- Basic structure of a MIPS program to run in SPIM
# The default value for the beginning of the data segment is
# 0x10010000 = 268500992 (High = 0x1001 = 4097 and Low = 0x0000 = 0)
        .data
a:      .word 36, 20, 27, 15, 1, 62, 41
n:      .word 7
max:    .word 0
        .text
# Every code begins with the "main" label
main:   ori  $8, $0, 0        # i in $8
        ori  $16, $0, 0       # max in $16
        lui  $18, 4097
        lw   $17, 28($18)     # n in $s1
m1:     slt  $18, $8, $17
        beq  $18, $0, m3      # if i >= n then quit
        ori  $18, $0, 4
        mul  $9, $8, $18      # scale i
        lui  $18, 4097
        add  $18, $18, $9     # add 4*i to the base address of a
        lw   $10, 0($18)      # load a[i] into $10
        slt  $18, $16, $10    # check whether a[i] is bigger than the current max
        beq  $18, $0, m2      # skip "then part" if a[i] <= max
        add  $16, $0, $10     # "then part": max = a[i]
m2:     addi $8, $8, 1        # i++
        j    m1
m3:     nop
        ....
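For comparison, the same loop written in Python follows the control flow of the assembly closely. The function name `find_max` is made up for this sketch; the comments map each line to the registers and labels used in the SPIM program.

```python
def find_max(a):
    """Mirror of the SPIM program: scan a[0..n-1] keeping the largest value."""
    i = 0                 # $8  = i
    maximum = 0           # $16 = max (starts at 0, as in the assembly)
    n = len(a)            # $17 = n
    while i < n:          # m1: slt $18, $8, $17 / beq $18, $0, m3
        if maximum < a[i]:    # slt $18, $16, $10 / beq $18, $0, m2
            maximum = a[i]    # "then part": max = a[i]
        i += 1            # m2: addi $8, $8, 1 / j m1
    return maximum        # m3


print(find_max([36, 20, 27, 15, 1, 62, 41]))
```

Because max starts at 0, both versions return 0 for an array with no positive elements; that matches the assembly rather than Python's built-in max.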
65Appendix A
66Appendix B
- Instructions to cope with functions in Assembly
- Functions are small pieces of code that are used frequently.
- Suppose we have code that performs arithmetic on complex numbers; the conversions between polar and rectangular representations are frequently used in the code. Thus, one function to convert from polar to rectangular is desired, as well as one from rectangular to polar.
Code with a function:
  Rec2pol: add ....
           sub ....
           jr $31          # return
  Pol2rec: mul ....
           add ....
           jr $31          # return
  main:    lui ....
           lw ....
           jal Rec2pol     # call
           add ....
           sw ....
[Figure, animation frames: the original code side by side with the code with a function; arrows mark the call (jal) and the return (jr $31), and later frames highlight register $31, which holds the return address]