Title: Computer Architecture
1Computer Architecture
2Terminology
- Digital
- Discrete, well defined values/steps
- Opposite of analog
- Analogy: digital is to analog as int is to double
- Binary
- A system consisting of two states
- on/off, true/false, yes/no, high/low, 0/1
- Basis for modern computers
3Terminology
- Bit
- Binary-digit
- Smallest unit of storage in modern computers
- Nibble: 4 bits
- Byte: 8 bits
- Word: typically two bytes, but often refers to the native bit length of the machine
4Data Representation
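- 1000001
- one million, one (read as decimal)
- sixteen million, seven hundred seventy-seven thousand, two hundred seventeen (read as hexadecimal)
- two hundred sixty-two thousand, one hundred forty-five (read as octal)
- sixty-five (read as binary)
- A (read as an ASCII character code)
- AJMP (read as an assembly language instruction)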
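The point of this slide is that a bit pattern has no meaning until an interpretation is chosen. A minimal sketch using Python's built-in `int` and `chr`; the variable names are illustrative only:

```python
# The same digit string read under different conventions.
digits = "1000001"

as_decimal = int(digits, 10)  # read as base 10
as_hex = int(digits, 16)      # read as base 16
as_octal = int(digits, 8)     # read as base 8
as_binary = int(digits, 2)    # read as base 2
as_ascii = chr(as_binary)     # the binary value used as a character code

print(as_decimal, as_hex, as_octal, as_binary, as_ascii)
```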
5Boolean Algebra
- Boolean algebra is an algebra that deals with
binary variables and logic operations
6Boolean Algebra
- Boolean algebra consists of
- A set of symbols that represent variables
- Use letters just like regular algebra
- A, B, C, a, b, c
- Variables are binary (2-valued)
- 0, 1
- Three basic operators
- AND, OR, NOT
- Other symbols
- ( )
7Boolean Operators
- AND
- Notation: A · B, AB, (AB), A(B)
- Yields a value of 1 when both A and B are 1
- Yields a value of 0 when either A or B is 0
8Boolean Operators
- OR
- Notation: A + B
- Yields a value of 1 when either A or B is 1
- Yields a value of 0 when both A and B are 0
9Boolean Operators
- NOT
- Notation: A', Ā
- Yields a value of 1 when A is 0
- Yields a value of 0 when A is 1
10Boolean Expressions
- As in regular algebra, variables, operators, and symbols can be combined to form expressions or functions
- F(x, a, b) = x (a b)
- F is a boolean function of three variables
- Often written as
- F = x (a b)
11Boolean Functions
- Typically, we want to exhaustively evaluate a given boolean function
- That is, we want to know its functional value for every possible combination of inputs
- This leads us to Truth Tables
12Truth Tables
- List all possible combinations of input values in the left hand columns
- List expression result in the right hand column
AND
A B AB
0 0 0
0 1 0
1 0 0
1 1 1
OR
A B A+B
0 0 0
0 1 1
1 0 1
1 1 1
NOT
A A'
0 1
1 0
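Truth tables like the ones on this slide can be generated mechanically by enumerating every input combination. A minimal sketch; `truth_table` is a hypothetical helper name, not something from the course:

```python
from itertools import product

def truth_table(func, num_inputs):
    """Return (inputs, output) rows for every combination of 0/1 inputs."""
    return [(bits, func(*bits)) for bits in product((0, 1), repeat=num_inputs)]

and_rows = truth_table(lambda a, b: a & b, 2)
or_rows = truth_table(lambda a, b: a | b, 2)
not_rows = truth_table(lambda a: 1 - a, 1)

# Print the AND table: inputs on the left, result on the right.
for bits, out in and_rows:
    print(*bits, out)
```

The rows come out in the same order as on the slide (00, 01, 10, 11).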
13Axioms
- x + 0 = x
- x + 1 = 1
- x + x = x
- x + x' = 1
- x + y = y + x
- x + (y + z) = (x + y) + z
- x(y + z) = xy + xz
- (x + y)' = x'y'
- (x')' = x
- x · 1 = x
- x · 0 = 0
- x · x = x
- x · x' = 0
- xy = yx
- x(yz) = (xy)z
- x + yz = (x + y)(x + z)
- (xy)' = x' + y'
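Because every variable has only two values, any of these identities can be checked by brute force over all input combinations. The sketch below checks a handful of them, written with + for OR, juxtaposition for AND, and ' for NOT; the helper names are illustrative:

```python
from itertools import product

def NOT(x):
    """Complement of a single bit."""
    return 1 - x

# Label -> (left-hand side, right-hand side), each a function of bits x, y, z.
identities = {
    "x + x' = 1": (lambda x, y, z: x | NOT(x), lambda x, y, z: 1),
    "x(y + z) = xy + xz": (lambda x, y, z: x & (y | z),
                           lambda x, y, z: (x & y) | (x & z)),
    "x + yz = (x + y)(x + z)": (lambda x, y, z: x | (y & z),
                                lambda x, y, z: (x | y) & (x | z)),
    "(x + y)' = x'y'": (lambda x, y, z: NOT(x | y),
                        lambda x, y, z: NOT(x) & NOT(y)),
    "(xy)' = x' + y'": (lambda x, y, z: NOT(x & y),
                        lambda x, y, z: NOT(x) | NOT(y)),
}

def holds(lhs, rhs):
    """True when both sides agree on every combination of inputs."""
    return all(lhs(*bits) == rhs(*bits) for bits in product((0, 1), repeat=3))

results = {label: holds(lhs, rhs) for label, (lhs, rhs) in identities.items()}
for label, ok in results.items():
    print(label, ok)
```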
14Logic Circuits
Schematic Symbols
- These are the things computers (and other digital devices) are made of
- Circuit designers use Boolean algebra to design circuits drawn on schematic drawings
- Fabrication facilities use schematic drawings to produce silicon chips
15More Gates
- NAND
- Shortened form of "not and"
16More Gates
- NOR
- Shortened form of "not or"
17NAND/NOR
- So, what's so special about NAND and NOR?
- NAND and NOR are considered universal gates
- That is, anything that can be done with
AND/OR/NOT can be done with only NAND or NOR
gates (one or the other, not both)
18NAND/NOR
- The universality of NAND/NOR is important because it means you can make many copies of a single gate type on a single piece of silicon and then use it to create complex circuits on a single chip
- Exercise
- Show how to make AND, OR, and NOT gates using only
- NAND gates
- NOR gates
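One possible construction for the NAND half of the exercise, sketched in Python with gates modeled as functions on 0/1 values. The function names are illustrative, and the NOR version follows the same pattern by duality:

```python
def nand(a, b):
    """The only primitive gate used below."""
    return 1 - (a & b)

def not_(a):
    # Tie both NAND inputs together.
    return nand(a, a)

def and_(a, b):
    # Invert the NAND output.
    return not_(nand(a, b))

def or_(a, b):
    # De Morgan: a + b = (a' b')'
    return nand(not_(a), not_(b))
```

Checking the four input combinations against the AND, OR, and NOT truth tables confirms the construction.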
19XOR Operator
- Another specialty gate: the odd function
A B A⊕B
0 0 0
0 1 1
1 0 1
1 1 0
20K-Maps
- A K-Map is a grid (map) where each square
corresponds to a minterm
Note the ordering here is Gray code, not binary
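The Gray-code ordering used for K-map rows and columns changes exactly one bit between neighbors, which is what makes adjacent squares differ in a single variable. A minimal sketch using the standard binary-to-Gray conversion `i ^ (i >> 1)`; the function name is illustrative:

```python
def gray_code(n_bits):
    """Return the n-bit Gray-code sequence as bit strings."""
    return [format(i ^ (i >> 1), f"0{n_bits}b") for i in range(2 ** n_bits)]

print(gray_code(2))  # ['00', '01', '11', '10']
```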
21Sum-of-Products
- This is what we previously called the sum-of-minterms
- Form the largest power-of-two groupings of 1s on the K-map
- Create the schematic
22Product-Of-Sums
- Instead of forming large adjacent groups of 1s (on the K-map), form large adjacent groups of 0s
- What does this mean in terms of the original expression/truth-table?
- It means you have simplified F', not F
- To fix what you've done you need only negate the final result, then apply De Morgan's theorem
23Example Sum-of-Products
[Figure: sum-of-products schematic with inputs A, B, C, D and output F]
24Example Product-of-Sums
[Figure: product-of-sums schematic with inputs A, B, C, D and output F]
25So What?
- As it turns out, the sum-of-products can be easily implemented with NAND gates
- Similarly, the product-of-sums can be easily implemented with NOR gates
- This may greatly simplify the design, thus saving us money!
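To see why a sum-of-products maps directly onto NAND gates, replace each AND and the final OR with a NAND; the two inversions cancel. The sketch below checks this for a hypothetical function F = AB + CD, chosen only for illustration:

```python
from itertools import product

def nand(*inputs):
    """NAND of any number of 0/1 inputs."""
    return 0 if all(inputs) else 1

def f_and_or(a, b, c, d):
    # Two-level AND-OR form: F = AB + CD
    return (a & b) | (c & d)

def f_nand_nand(a, b, c, d):
    # Same function with every gate replaced by NAND.
    return nand(nand(a, b), nand(c, d))

equivalent = all(
    f_and_or(*bits) == f_nand_nand(*bits) for bits in product((0, 1), repeat=4)
)
print(equivalent)
```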
26NAND/NOR Implementations
27Combinational Circuits
- Definition: a connected arrangement of logic gates with a set of inputs and outputs
- Specifically, they have no memory and no clock!
28Combinational Circuit Design
- Design a Half-Adder
- A combinational circuit that adds 2 bits
- Input 1 is called the Augend
- Input 2 is called the Addend
- Output 1 is called the Sum
- Output 2 is called the Carry
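One common realization of the half-adder uses XOR for the sum and AND for the carry. The slide does not give the equations, so treat this sketch as one possible answer rather than the intended one:

```python
def half_adder(augend, addend):
    """Add two bits; return (sum, carry)."""
    total = augend ^ addend   # 1 when exactly one input is 1
    carry = augend & addend   # 1 only when both inputs are 1
    return total, carry

for a in (0, 1):
    for b in (0, 1):
        print(a, b, *half_adder(a, b))
```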
29Combinational Circuit Design
- Design a Full-Adder
- A combinational circuit that adds 3 bits
- Input 1 is called the Augend
- Input 2 is called the Addend
- Input 3 is called the Carry-in
- Output 1 is called the Sum
- Output 2 is called the Carry-out
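A full-adder can be sketched the same way. The equations below are one standard form (two XORs for the sum, AND/OR for the carry-out), offered as an assumption about the intended design:

```python
def full_adder(augend, addend, carry_in):
    """Add three bits; return (sum, carry_out)."""
    partial = augend ^ addend
    total = partial ^ carry_in
    carry_out = (augend & addend) | (partial & carry_in)
    return total, carry_out

print(full_adder(1, 1, 1))
```

For every input combination, `total + 2 * carry_out` equals the arithmetic sum of the three input bits.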
30Sequential Circuits
- Two primary differences between combinational circuits and sequential circuits
- Sequential circuits are synchronous (use a clock)
- Sequential circuits have memory (current state)
31Clock
- A series of pulses
- Sometimes referred to as a pulse train
- Basically, it's a digital signal that just oscillates between 0 and 1
32Clock
- Clock period is specified in units of time
- Seconds, milliseconds, microseconds
- Clock frequency is specified in units of frequency (1/period, pulses per time unit)
- Hertz, Megahertz, Gigahertz
33Memory Devices
- Flip flops
- Four basic types
- SR, D, JK, T
- Each type stores 1 bit (two states: 0/1)
- Each maintains its current state until a clock pulse arrives
- i.e., ignores input lines until a clock pulse reaches the clock input
34SR Flip-Flop
- Set/Reset
- Two input lines
- One clock input line
- Two output lines
Characteristic Table
S R Q(t+1) Comments
0 0 Q(t) No change
0 1 0 Reset to 0
1 0 1 Set to 1
1 1 ? Unspecified
- Q(t) refers to the current state
- Q(t+1) refers to the next state
35D Flip-Flop
- Delay
- One input line
- One clock input line
- Two output lines
Characteristic Table
D Q(t+1) Comments
0 0 Clear to 0
1 1 Set to 1
- Q(t+1) refers to the next state
36JK Flip-Flop
- JK
- Two input lines
- One clock input line
- Two output lines
Characteristic Table
J K Q(t+1) Comments
0 0 Q(t) No change
0 1 0 Reset to 0
1 0 1 Set to 1
1 1 Q'(t) Complement
- Q(t) refers to the current state
- Q(t+1) refers to the next state
37T Flip-Flop
- Toggle
- One input line
- One clock input line
- Two output lines
Characteristic Table
T Q(t+1) Comments
0 Q(t) No change
1 Q'(t) Complement
- Q(t+1) refers to the next state
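The four characteristic tables can be summarized as next-state functions, each evaluated once per clock pulse. A minimal sketch; the function names are illustrative, and raising an error for S = R = 1 is just one way to model the unspecified case:

```python
def sr_next(q, s, r):
    """SR flip-flop: set, reset, or hold."""
    if s and r:
        raise ValueError("S = R = 1 is unspecified")
    if s:
        return 1
    if r:
        return 0
    return q

def d_next(q, d):
    """D flip-flop: next state is simply the input."""
    return d

def jk_next(q, j, k):
    """JK flip-flop: Q(t+1) = J Q' + K' Q."""
    return (j & (1 - q)) | ((1 - k) & q)

def t_next(q, t):
    """T flip-flop: toggle when T = 1."""
    return q ^ t
```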
38Edge-Triggering
- Output of the flip-flop changes on the edge of a pulse
- Rising edge (0 to 1 transition)
- Falling edge (1 to 0 transition)
39Edge-Triggering
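[Figure: clock waveforms. Positive (rising) edge-triggered: the output changes on the positive clock transition and is frozen otherwise. Negative (falling) edge-triggered: the output changes on the negative clock transition and is frozen otherwise.]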
40Edge-Triggering
- Setup time
- This is the minimum time that the inputs must remain constant before the edge transition
- Hold time
- This is the amount of time in which the inputs must not change after the edge transition
- These values are not too interesting from a theoretical point of view but can make or break a circuit in practice
41Master-Slave Flip-Flops
- Two flip-flops of the same type wired together
- Master
- Rising edge triggered
- Receives inputs from outside world
- Sends outputs to the slave
- Slave
- Falling edge triggered
- Receives inputs from master
- Sends outputs to the outside world
- This set-up basically creates a more stable
flip-flop in terms of set-up and hold times
42Master-Slave JK Flip-Flop
43Sequential Circuits
- Combination of logic gates, flip-flops (memory elements), and a clock signal
- The circuit can be described in two parts
- The combinational part
- The sequential part
44The Combinational Part
- Describe outputs in terms of logic gates and
flip-flop outputs
45The Sequential Part
- Flip-flop input equations
- Describe the inputs to flip-flop elements in
terms of logic gates and other flip-flop elements
46Designing a Circuit
- The goal of circuit design is to convert a specification (a bunch of words) into a circuit
- No different than software development!
- Example
- Design a circuit that counts modulo 4 every time
it receives a 1 on the input line
47State Diagram
[Figure: state diagram for the modulo-4 counter; each state has one transition labeled X = 0 and one labeled X = 1]
48State Table
Columns: present state A B (time t), input x, next state A B (time t+1)
A B x A B
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 1 0
1 0 1 1 1
1 1 0 1 1
1 1 1 0 0
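The state table is just the specification "add the input to a 2-bit count, modulo 4" written out row by row. A short sketch that regenerates it, assuming A is the high bit and B the low bit of the count:

```python
def next_state(a, b, x):
    """Return (A, B) at time t+1 for the modulo-4 counter."""
    count = (2 * a + b + x) % 4
    return count >> 1, count & 1

# Print rows in the same order as the state table.
for a in (0, 1):
    for b in (0, 1):
        for x in (0, 1):
            print(a, b, x, *next_state(a, b, x))
```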
49Flip-Flop Usage
- We know we need 2 flip-flops since the counter must count modulo 4 (2 bits)
- We can choose any type we want
- D, T, SR, JK
- By looking at the excitation table for the chosen
type we can create flip-flop equations
50JK Flip-Flop Based Design
State table with JK flip-flop inputs
Columns: present state A B (time t), input x, next state A B (time t+1), flip-flop inputs JA KA JB KB
A B x A B JA KA JB KB
0 0 0 0 0 0 x 0 x
0 0 1 0 1 0 x 1 x
0 1 0 0 1 0 x x 0
0 1 1 1 0 1 x x 1
1 0 0 1 0 x 0 0 x
1 0 1 1 1 x 0 1 x
1 1 0 1 1 x 0 x 0
1 1 1 0 0 x 1 x 1
JK flip-flop excitation table
Q(t) Q(t+1) J K
0 0 0 x
0 1 1 x
1 0 x 1
1 1 x 0
51Simplification
- Need to specify combinational circuits for
- JA input
- KA input
- JB input
- KB input
- Use 3-variable K-maps
- One for each required input/state combination
- Map/simplify the Flip-Flop Input columns of the
table
52K-Map Simplification
JA (rows: A, columns: Bx)
A\Bx 00 01 11 10
0 0 0 1 0
1 x x x x
KA (rows: A, columns: Bx)
A\Bx 00 01 11 10
0 x x x x
1 0 0 1 0
JB (rows: A, columns: Bx)
A\Bx 00 01 11 10
0 0 1 x x
1 0 1 x x
KB (rows: A, columns: Bx)
A\Bx 00 01 11 10
0 x x 1 0
1 x x 1 0
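Grouping the 1s in the four maps, and treating don't-cares as 1 where that helps, suggests the input equations JA = KA = Bx and JB = KB = x. These equations are a reading of the maps rather than something stated on the slide, so the sketch below treats them as an assumption and checks them by simulating two JK flip-flops:

```python
def jk_next(q, j, k):
    """JK flip-flop characteristic equation: Q(t+1) = J Q' + K' Q."""
    return (j & (1 - q)) | ((1 - k) & q)

def step(a, b, x):
    """One clock pulse of the counter using the assumed input equations."""
    ja = ka = b & x   # assumed: JA = KA = Bx
    jb = kb = x       # assumed: JB = KB = x
    return jk_next(a, ja, ka), jk_next(b, jb, kb)

state = (0, 0)
trace = [state]
for x in (1, 1, 0, 1, 1):
    state = step(*state, x)
    trace.append(state)

print(trace)
```

The counter advances on each 1, holds on the 0, and wraps from 11 back to 00, matching the state table.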
53Draw The Logic Gates
[Figure: logic diagram of the counter with input x, flip-flops A and B, and a shared clock]
54Integrated Circuit (IC)
- A silicon crystal (chip) containing electronic components that create the logic gates we've been looking at
- SSI: Small Scale Integration
- MSI: Medium Scale Integration
- LSI: Large Scale Integration
- VLSI: Very Large Scale Integration
- These refer to the number of logic gates
contained on the chip
55Technologies
- TTL: Transistor-Transistor Logic
- ECL: Emitter-Coupled Logic
- MOS: Metal-Oxide Semiconductor
- CMOS: Complementary Metal-Oxide Semiconductor
- These refer to the underlying characteristics of
the process for turning silicon into gates
56Digital Components
- Decoder
- Encoder
- Multiplexer
- Register
- Shift Register
- Counter
- Memory
57Decoder
- Converts n input bits into a single active output line (one of 2^n)
- For example, converting binary to octal (3-to-8)
- What does the circuit look like?
- Start with a truth table
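A behavioral sketch of an n-to-2^n decoder: exactly one output line is 1, chosen by the binary value on the inputs. The function name and the choice of most-significant-bit-first input order are assumptions made for illustration:

```python
def decode(bits):
    """n-to-2^n decoder; bits is a tuple of 0/1 values, MSB first."""
    index = int("".join(map(str, bits)), 2)
    return [1 if line == index else 0 for line in range(2 ** len(bits))]

# 3-to-8 example: binary 101 activates output line 5.
print(decode((1, 0, 1)))
```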
58Encoder
- Inverse of a decoder
- Converts a single active input line (one of 2^n) into an n-bit output code
- For example, converting octal to binary (8-to-3)
59Multiplexer
- Routes one of 2^n input data lines to a single output line based on n selection lines
- How many inputs total?
- How big is the truth table?
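A behavioral sketch of the multiplexer, with the selection lines treated as a binary index into the data lines. The function name and most-significant-bit-first select order are assumptions:

```python
def multiplexer(data, select):
    """Route one of 2**n data inputs to the output; select is n bits, MSB first."""
    if len(data) != 2 ** len(select):
        raise ValueError("need 2**n data lines for n select lines")
    index = 0
    for bit in select:
        index = (index << 1) | bit
    return data[index]

# 4-to-1 example: select = 10 routes data line 2 to the output.
print(multiplexer([0, 1, 1, 0], (1, 0)))
```

Counting both kinds of input, a 4-to-1 multiplexer has 4 data lines plus 2 selection lines, so a complete truth table would need 2^6 = 64 rows.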
60Register
- A multi-bit storage element made up of a group of flip-flops
- Recall flip-flops store 1 bit each
61Register
- CLR is a clear input for asynchronous initialization
- Data can be read out at any time
- Data is input with the clock signal, referred to as loading
- Loading can be further controlled through the use of additional combinational circuitry
62Shift Register
- Like a normal register, except that bits can be shifted from one flip-flop to the next
63Counter
- A register that cycles through predetermined states based on an external input
- Like we just designed using flip-flops
- Parallel load/clear functionality is often added
via combinational circuitry
64Memory
- A group of storage cells and associated access circuits
- Bits are grouped into words
- Words are the smallest addressable unit
- Typically made up of 1 or more bytes (8 bits)
- Each word in memory is assigned a unique address
65Memory
66Memory
- How many address lines?
- How many data input lines?
- How many data output lines?
67Memory
- K: Kilo-bytes, 2^10 bytes
- M: Mega-bytes, 2^20 bytes
- G: Giga-bytes, 2^30 bytes
- May be specified in either bytes or words
- Micro-processors will often talk of Kilo-bits or Mega-bits
- Be careful
68Memory
- Two types
- RAM: Random Access Memory
- Operations we just looked at
- ROM: Read Only Memory
- Has no input data lines
- Has no write input
- Has no read input (doesn't need it; it just acts when a valid address is supplied)
69ROM
- Significantly cheaper than RAM since it lacks versatility
- How does the data get in there?
- Mask programming: data is programmed in at the time of silicon fabrication
- PROM: special programming devices allow the user to write data one time
- EPROM: data is erased under ultra-violet light or electronically, but must be entirely erased and rewritten (can't write single words)
70Architectural Functional Block Diagram
71So What?
- A digital computer is a fascinating thing in and of itself but somewhat useless
- It is the job of the programmer to make it do something useful
- The programmer's job is to supply specific, detailed instructions to move and manipulate binary data patterns within the architecture to accomplish a meaningful task
72Sounds Easy
- The problem is that we, the programmers, don't want to be burdened with the knowledge of gates, registers, flip-flops, etc.
- So, we describe the architecture in terms of various parameters useful to the programmer
- Note that the programmer may not be an applications programmer; it may be a language compiler writer, for example
73Descriptive Parameters
- The set of registers within the architecture
- The names and functions (uses) of the registers
- The set of operations available for moving data between registers and manipulating data contained within registers
- Microoperations
- The method of specifying the sequence of execution of the microoperations
74Microoperations
- To describe the operations we use a language called Register Transfer Language
- We, as programmers, assume that the logic circuits (combinational and/or sequential) are available to perform the transfers
75Register Transfer Language
- A system for expressing in symbolic form the microoperation sequences among the registers of a digital module
- Sound familiar?
- Note that this is not Assembly Language!!!
76Register Transfer Language(RTL)
- Registers are designated by capital letters
- MAR: Memory Address Register
- PC: Program Counter
- IR: Instruction Register
- Rx: General purpose register
- etc.
- Bits within registers are numbered 0 to n-1
(n-bit register) starting at the LSB (rightmost
bit)
77Register Representations(Pictorial)
R1
78Register Transfers
- Move the data from one register to another, written R2 ← R1
- The bit pattern that is in register R1 is copied into register R2
- Again, we are assured that the circuitry required to perform the transfer is available
- Implies a parallel load operation
79Conditional Transfer
- Conditionally move the data from one register to another, written if (P = 1) then (R2 ← R1)
- The bit pattern that is in register R1 is copied into register R2 if the control signal P is high (1)
- This isn't a Java if/then statement!
- Again, we are assured that the circuitry required to perform the transfer is available
- Implies a parallel load operation
80Control Function
- Conditionally move the data from one register to another, written P: R2 ← R1
- Same meaning as the if/then statement
- P may be a complex logic expression/combinational circuit
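A conditional transfer can be modeled as a function applied once per clock edge: R2 takes the value of R1 only when the control signal P is 1, and R1 is unchanged because the transfer is a copy. The dictionary-of-registers model and the function name are illustrative assumptions:

```python
def clock_edge(registers, p):
    """Apply a transfer of R1 into R2, gated by control signal p, at one clock edge."""
    updated = dict(registers)
    if p == 1:
        # All bits are copied at once (parallel load).
        updated["R2"] = registers["R1"]
    return updated

before = {"R1": 0b1010, "R2": 0b0000}
print(clock_edge(before, p=0))  # R2 keeps its old value
print(clock_edge(before, p=1))  # R2 now equals R1
```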
81Parallel Operations
- Some microoperations take multiple clock cycles (periods)
- Some microoperations can be performed simultaneously (in parallel)
- During the same clock edge transition
- It's this kind of operation that distinguishes between computers and super computers
82Data Paths
- Lots and lots of little wires
- Every register pair that can transfer data must
be wired together
- You would have to do this for all registers!
83Bus Implementation
[Figure: 4-bit bus system for four registers A, B, C, D. For each bit position 0 to 3, a multiplexer takes that bit from every register (for example A0, B0, C0, D0), and the shared lines select 1 and select 0 choose which register drives the bus.]
84Three-State Gates
- The most common tri-state gate is the buffer
- What's a buffer?
- What's a tri-state buffer?
Buffer
input output
0 0
1 1
Tri-state buffer
input control output
0 0 open
0 1 0
1 0 open
1 1 1
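The tri-state table can be modeled by letting `None` stand for the open (high-impedance) output. The second function is a hypothetical helper showing why this matters for a bus: at most one buffer may drive the line at a time.

```python
def tri_state_buffer(data, control):
    """Pass data through when control is 1; otherwise the output is open (None)."""
    return data if control == 1 else None

def bus_value(outputs):
    """Resolve a shared line driven by several tri-state outputs."""
    driving = [value for value in outputs if value is not None]
    if len(driving) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return driving[0] if driving else None

# Three buffers share one line; only the second is enabled.
line = bus_value([
    tri_state_buffer(0, 0),
    tri_state_buffer(1, 1),
    tri_state_buffer(0, 0),
])
print(line)
```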
85Register Transfer Language (RTL)
Addition
Subtraction
Increment
Decrement
Complement (invert bits)
Negate
Subtraction
- We need circuits to do all these operations
86Logic Microoperations
- Counter-part to the arithmetic microoperations we looked at last week
- Similar to Boolean functions (expressions) except that the operands are registers (bit strings) rather than individual Boolean variables
- This is exactly what you implemented in code
87ALU Block DiagramStage i
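- Combinational circuits that we looked at previously are inserted into the boxes
[Figure: stage i of the ALU. Arithmetic Circuit Stage i (inputs Ai, Bi, Cin, S1, S0; output Cout) and Logic Circuit Stage i feed inputs 0 and 1 of a 4x1 MUX. MUX inputs 2 and 3 carry the shifted bits, labeled shr Ai-1 and shl Ai+1. S3 and S2 drive the MUX select, and the MUX output is Fi.]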
88Table of Microoperations
Columns: operation select (S3 S2 S1 S0 Cin), resultant operation, function
S3 S2 S1 S0 Cin Operation Function
0 0 0 0 0 F = A Transfer
0 0 0 0 1 F = A + 1 Increment
0 0 0 1 0 F = A + B Add
0 0 0 1 1 F = A + B + 1 Add w/carry
0 0 1 0 0 F = A + B' Sub w/borrow
0 0 1 0 1 F = A + B' + 1 Subtract
0 0 1 1 0 F = A - 1 Decrement
0 0 1 1 1 F = A Transfer
0 1 0 0 x F = A ∧ B AND
0 1 0 1 x F = A ∨ B OR
0 1 1 0 x F = A ⊕ B XOR
0 1 1 1 x F = A' Complement
1 0 x x x F = shr A Shift right
1 1 x x x F = shl A Shift left
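The table can be read as a small function of the select lines. The sketch below is a behavioral model, not the gate-level circuit: it assumes a 4-bit word by default, treats the arithmetic rows as A + Y + Cin with Y chosen by S1 S0, and fills shifted-in bits with 0.

```python
def alu(a, b, s3, s2, s1, s0, cin=0, width=4):
    """Behavioral model of the microoperation table for width-bit operands."""
    mask = (1 << width) - 1
    not_b = ~b & mask
    if (s3, s2) == (0, 0):
        # Arithmetic rows: F = A + Y + Cin, where Y is 0, B, B', or all 1s.
        y = {(0, 0): 0, (0, 1): b, (1, 0): not_b, (1, 1): mask}[(s1, s0)]
        return (a + y + cin) & mask
    if (s3, s2) == (0, 1):
        # Logic rows: AND, OR, XOR, complement.
        return {(0, 0): a & b, (0, 1): a | b,
                (1, 0): a ^ b, (1, 1): ~a & mask}[(s1, s0)]
    if s2 == 0:
        return a >> 1            # S3 S2 = 10: shift right
    return (a << 1) & mask       # S3 S2 = 11: shift left

print(alu(5, 3, 0, 0, 0, 1))          # add
print(alu(5, 3, 0, 0, 1, 0, cin=1))   # subtract via A + B' + 1
```

Adding all 1s with Cin = 0 gives A - 1, and with Cin = 1 it wraps around to A, which is why the table lists Transfer twice.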
89Making a Computer
- Binary number system → Boolean functions
- Boolean functions → Combinational circuits
- Combinational circuits → Sequential circuits
- Sequential/Combinational circuits → Functional units
- Functional units → Computer architecture
90Defining a Computer
- Computer Architecture
- Bus
- Registers
- Register transfer language
- Microoperations
- Instruction set
- Timing and control
91Instruction Set
- Instruction (Assembly language statement)
- Binary code
- Consists of an operation code and operand(s)
- Specifies a sequence of microoperations to be executed
- One high level language (e.g. C/Java) statement specifies a sequence of instructions
- Stored in memory
- Note that we are talking about a level higher
than RTL but lower than Java, C, etc.
92Assembly Language
- Every computer architecture (or family of architectures) has its own unique assembly language
- Unlike Java, you should not learn assembly language syntax, data types, etc.
- You should learn to program/think at the assembly language level
- It's a way of thinking that requires intimate knowledge of the underlying hardware architecture
93Assembly Language Instructions
- Each instruction has two basic parts
- Operation code (opcode)
- What the instruction wants the processor to do
- Operand(s) (registers, memory addresses)
- Data location that the instruction wants the processor to manipulate
- Some operands will be explicit while others will be implicit (implied by the opcode)
94Assembly Language Instructions
- n-bit instruction format
- 2^(n-m-1) opcodes (opcode field: bits m+1 to n-1)
- 2^(m+1) addresses (address field: bits 0 to m)
- Example: 16-bit instruction
- 2^4 = 16 opcodes
- 2^12 = 4096 addresses
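The counts follow directly from the field widths. A small sketch, assuming the address field occupies bits 0 through m and the opcode field occupies the remaining bits m+1 through n-1 (the function name is illustrative):

```python
def field_counts(n_bits, m):
    """Return (number of opcodes, number of addresses) for an n-bit instruction."""
    opcode_bits = (n_bits - 1) - m   # bits m+1 .. n-1
    address_bits = m + 1             # bits 0 .. m
    return 2 ** opcode_bits, 2 ** address_bits

# 16-bit instruction with a 12-bit address field (m = 11).
print(field_counts(16, 11))
```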
95Assembly Language Instructions
- Instructions within the same Assembly language may be of differing lengths
- i.e., not all instructions utilize the same number of bits, as we saw with the Pentium
96Internal Operation
- To execute an assembly language instruction the processor goes through 4 steps
- Fetch an instruction from memory
- Decode the instruction
- Read the operands from memory/registers
- Execute the instruction
- This is often referred to as the Fetch-Execute cycle or the Instruction cycle
- To execute a program the processor repeats this cycle until a halt instruction is reached
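The four steps can be sketched as a loop over a toy machine. Everything here is hypothetical and chosen for brevity: a single accumulator, three made-up opcodes (LOAD, ADD, HALT), and instructions stored as (opcode, address) tuples instead of binary words.

```python
def run(memory):
    """Repeat fetch, decode, read operand, execute until HALT; return the accumulator."""
    pc, acc = 0, 0
    while True:
        opcode, address = memory[pc]      # fetch
        pc += 1
        if opcode == "HALT":              # decode
            return acc
        value = memory[address]           # read the operand from memory
        if opcode == "LOAD":              # execute
            acc = value
        elif opcode == "ADD":
            acc += value
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

# Instructions at addresses 0-2, data at addresses 4 and 5.
program = [("LOAD", 4), ("ADD", 5), ("HALT", None), None, 7, 35]
result = run(program)
print(result)
```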
97Internal Operation
- All this is under the control of the Control Unit
- This is the component that decodes the instruction and sends out microoperations to the rest of the hardware
- The control unit can be hardwired
- Made up entirely of sequential circuits designed to do precisely the fetch-execute steps (fixed instruction set)
- The control unit can be microprogrammed
- A small programmable processor within the processor (programmable instruction set)
- More on this later
98Addressing Modes
- In designing a computer architecture the designer must specify the instruction set
- Opcode/operand pairs
- In specifying operands there are a number of alternatives
- Immediate instructions
- Direct address operands
- Indirect address operands
99Addressing Modes
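[Figure: instruction format with mode bit I, opcode, and address fields, followed by three examples.
Immediate: I = 0, addc 3; the operand is the value 3 held in the instruction itself.
Direct: I = 0, add 0x33; the operand is the contents of memory location 0x33 (0x42).
Indirect: I = 1, add 0x33; location 0x33 holds 0x42, the address of the operand, and location 0x42 holds the operand 0x88.]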
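The three modes differ only in how many memory lookups separate the address field from the operand. A minimal sketch; the memory contents (0x33 holding 0x42, 0x42 holding 0x88) are an assumed reading of the values on the slide, and the function name is illustrative:

```python
# Assumed memory contents: address -> stored value.
memory = {0x33: 0x42, 0x42: 0x88}

def operand(mode, address_field):
    """Resolve the operand for one instruction under the given addressing mode."""
    if mode == "immediate":
        return address_field                    # the field is the operand
    if mode == "direct":
        return memory[address_field]            # one memory lookup
    if mode == "indirect":
        return memory[memory[address_field]]    # two memory lookups
    raise ValueError(f"unknown mode {mode!r}")

print(operand("immediate", 3))
print(hex(operand("direct", 0x33)))
print(hex(operand("indirect", 0x33)))
```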
100Registers
- In designing a computer architecture the designer must specify the register set
- There are essentially two categories
- Special purpose registers
- General purpose registers
101Special Purpose Registers
- Program Counter (PC)
- Holds the memory address of the next instruction of our program
- Memory Address Register (AR)
- Holds the address of a location in memory that we want to access (read/write)
- The size of (number of bits in) these two registers is determined by the number of memory addresses in our architecture
102Special Purpose Registers
- Instruction Register (IR)
- Holds the instruction (opcode/operand) we are about to execute
- Data Register (DR)
- Holds the operand read from memory to be sent to the ALU
- Accumulator (AC)
- Holds an input to the ALU and the output from the ALU
103Special Purpose Registers
- Input Register (INPR)
- Holds data received from a specified external device
- Output Register (OUTR)
- Holds data to be sent to a specified external device
104General Purpose Registers
- Temporary Register (TR)
- For general usage either by our program or the
architecture
105Bus
- In designing a computer architecture the designer must specify the bus layout
- The size of the bus (in bits)
- What is connected to the bus
- Access control to the bus
- Recall that a bus is an efficient alternative to
lots of wires when it comes to transferring data
between registers, control units, and memory
locations
106Bus Architecture
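[Figure: registers and memory connected to a common 16-bit bus. The access select lines S2 S1 S0 choose the bus source: 001 AR, 010 PC, 011 DR, 100 AC, 101 IR, 110 TR, 111 memory unit (4096x16). AR supplies the memory address; the ALU feeds AC and the E bit; INPR, OUTR, and the clock are also shown.]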
107Bus Architecture
- The three access select lines determine which register is allowed to write to the bus at a given time (recall that only one write at a time is allowed)
- Registers have load input signals (LD) that tell them to read from the bus
- If registers are smaller than the bus (fewer bits), then unused bits are set to 0
- Some registers have additional input signals
- Increment (INR) and Clear (CLR)
- See figure 5-4, page 130 of the textbook
108Bus Architecture
- Memory has read/write input signals that tell it when to take data from the bus and send data to the bus
- Memory addresses (for both read and write operations) are always specified via the Address Register (AR)
- An alternative (used in many architectures) is a two bus system
- One address bus
- One data bus
109Bus Architecture
- Results of all ALU operations (arithmetic, logic, and shift) are always sent to the Accumulator (AC) register
- The ALU is the only way to set values into the accumulator except for the clear (CLR) and increment (INR) control lines
- Inputs to the ALU come from
- The Accumulator (AC)
- The Data Register (DR)
- The Input Register (INPR)
- The E output from the ALU is the carry-out (Extended AC) bit
- Many architectures pack this into a register with other status bits such as overflow
110Bus Architecture
- Some pairs of microoperations can be performed in a single clock cycle
- The key is to make sure they don't both try to put data on the bus
- Consider the RTL statement
- DR ← AC, AC ← DR
- This is allowed since the DR ← AC microoperation uses the bus while the AC ← DR microoperation does not
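Because both transfers happen on the same clock edge, each register loads the value the other held before the edge, so the pair acts as an exchange. A minimal sketch of that timing, using a dictionary as a stand-in for the register file (the function name is illustrative):

```python
def exchange_dr_ac(registers):
    """Apply both transfers between DR and AC on a single clock edge."""
    old = dict(registers)  # values present just before the clock edge
    return {**old, "DR": old["AC"], "AC": old["DR"]}

print(exchange_dr_ac({"DR": 0x0012, "AC": 0x00FF}))
```

Writing the two transfers as sequential assignments would instead leave both registers holding the same value, which is not what the hardware does.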
111Instructions
- Three basic types
- Those that reference memory operands
- Those that reference register operands
- Those that reference I/O devices
- Again, this is only for the fictitious
architecture in the textbook but you will find
similar categorizations in real architectures
112Memory Instructions
- There are 14 instructions in this class
- 7 direct memory address forms
- 7 indirect memory address forms
Instruction format: bit 15 = I, bits 14-12 = opcode, bits 11-0 = address
I = 0 means direct memory address
I = 1 means indirect memory address
113Register Instructions
- There are 12 instructions in this class
- They can use the operand field to specify the
register and type of operation since no memory
address is required
Instruction format: bit 15 = 0, bits 14-12 = 111, bits 11-0 = register operation
114I/O Instructions
- There are 6 instructions in this class
- They can use the operand field to specify the
exact operation since no memory address is
required
Instruction format: bit 15 = 1, bits 14-12 = 111, bits 11-0 = I/O operation
115Instruction Decoding
- The control unit evaluates bits 15 to 12 to determine the instruction format
- At first glance it appears that there can be only 8 unique instructions since the opcode resides in 3 bits
- But, additional instructions are created through the use of the I bit and unused bits in the operand field
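The format check on bits 15 to 12 can be sketched with shifts and masks. This assumes the 16-bit layout described in the preceding slides (bit 15 is I, bits 14-12 are the opcode, bits 11-0 are the address or operation field); the function name and the returned labels are illustrative:

```python
def decode_instruction(word):
    """Classify a 16-bit instruction word and split out its fields."""
    i_bit = (word >> 15) & 1
    opcode = (word >> 12) & 0b111
    low12 = word & 0xFFF
    if opcode != 0b111:
        kind = "memory (indirect)" if i_bit else "memory (direct)"
    else:
        kind = "I/O" if i_bit else "register"
    return kind, opcode, low12

print(decode_instruction(0x1033))
print(decode_instruction(0x7800))
```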
116Instruction Set Design
- To be useful, an architecture's instruction set must contain enough instructions to allow all possible computations
- Four categories are necessary
- Arithmetic, logical, shift operations
- Moving data to/from memory from/to registers
- Control such as branch and conditional checks
- Input/output
117Instruction Set Design
- The set in the book is complete in that all the possible operations on binary numbers can be performed through combinations of instructions
- But, the set is very inefficient in that highly used operations require multiple instructions
- This is why the Pentium instruction set is so large and complicated: it makes for efficient programs
118Assembly Language
High Level Language Instruction
119Assembly Language
- Integral part of the computer architecture
- As we saw in previous lectures, the architecture is designed to perform assembly language instructions
- Often, the assembly language (or instruction set) is the first thing defined in designing a new computer