Title: 55:035 Computer Architecture and Organization
2Outline
- Adders
- Comparators
- Shifters
- Multipliers
- Dividers
- Floating Point Numbers
3Binary Representations of Numbers
- To find negative numbers
  - Sign and magnitude: msb = 1
  - 1s complement: complement each bit to change sign
  - 2s complement: $2^n$ minus the positive number
b2 b1 b0 | Unsigned | Sign and Magnitude | 1s Complement | 2s Complement
0  1  1  | 3        | 3                  | 3             | 3
0  1  0  | 2        | 2                  | 2             | 2
0  0  1  | 1        | 1                  | 1             | 1
0  0  0  | 0        | 0                  | 0             | 0
1  0  0  | 4        | -0                 | -3            | -4
1  0  1  | 5        | -1                 | -2            | -3
1  1  0  | 6        | -2                 | -1            | -2
1  1  1  | 7        | -3                 | -0            | -1
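The table can be reproduced with a short sketch. The function name `interpret` and the dictionary keys are illustrative choices, not from the slides. Python integers have no negative zero, so the "-0" entries come out as plain 0.

```python
def interpret(bits: str) -> dict:
    """Interpret a bit string (msb first) under the four representations."""
    n = len(bits)
    unsigned = int(bits, 2)
    negative = bits[0] == "1"
    # Magnitude field: every bit except the msb.
    magnitude = unsigned & ((1 << (n - 1)) - 1)
    return {
        "unsigned": unsigned,
        # Sign and magnitude: msb is the sign, the rest is the magnitude.
        "sign_magnitude": -magnitude if negative else magnitude,
        # 1s complement: a negative value is the bitwise complement of |x|.
        "ones_complement": unsigned - ((1 << n) - 1) if negative else unsigned,
        # 2s complement: a negative value is stored as 2^n - |x|.
        "twos_complement": unsigned - (1 << n) if negative else unsigned,
    }


for pattern in ("011", "100", "101", "111"):
    print(pattern, interpret(pattern))
```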
4Single-Bit Addition
A B Cout S
0 0
0 1
1 0
1 1
A B C Cout S
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
5Single-Bit Addition
A B Cout S
0 0 0 0
0 1 0 1
1 0 0 1
1 1 1 0
A B C Cout S
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
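Both truth tables follow from two Boolean expressions: the sum is the XOR of the inputs and the carry is their majority. A minimal sketch, with the function names `half_adder` and `full_adder` chosen here for illustration:

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (Cout, S) for two 1-bit inputs."""
    return a & b, a ^ b


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (Cout, S) for three 1-bit inputs."""
    s = a ^ b ^ c
    cout = (a & b) | (a & c) | (b & c)  # majority of the three inputs
    return cout, s


# Print the full-adder truth table in the same column order as the slide.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            cout, s = full_adder(a, b, c)
            print(a, b, c, cout, s)
```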
6Carry-Ripple Adder
- Simplest design: cascade full adders
- Critical path goes from Cin to Cout
- Design full adder to have fast carry delay
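The cascade can be modelled in software as a loop in which each stage waits for the carry of the previous one. This is a behavioural sketch only; `ripple_carry_add` and its parameters are hypothetical names, and it says nothing about gate delays.

```python
def ripple_carry_add(a: int, b: int, n: int, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit unsigned numbers one full adder at a time.

    Returns (sum, carry_out). The carry variable is the serial
    dependency that forms the critical path from Cin to Cout.
    """
    carry = cin
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        total |= s << i
    return total, carry


print(ripple_carry_add(0b1011, 0b0110, 4))
```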
7Carry Propagate Adders
- N-bit adder called CPA
- Each sum bit depends on all previous carries
- How do we compute all these carries quickly?
8Propagate and Generate - Define
- An n-bit adder is just a combinational circuit
- $s_i = a_i \oplus b_i \oplus c_i = \bar{a}_i \bar{b}_i c_i + \bar{a}_i b_i \bar{c}_i + a_i \bar{b}_i \bar{c}_i + a_i b_i c_i$
- $c_{i+1} = \mathrm{MAJ}(a_i, b_i, c_i) = a_i b_i + a_i c_i + b_i c_i$
- Want to write $s_i$ in sum of products form
- $c_{i+1} = g_i + p_i c_i$, where $g_i = a_i b_i$, $p_i = a_i + b_i$
  - if $g_i$ is true, then $c_{i+1}$ is true, thus carry is generated
  - if $p_i$ is true, and if $c_i$ is true, $c_i$ is propagated
- Note that the $p_i$ equation can also be written as $p_i = a_i \oplus b_i$, since $g_i = 1$ when $a_i = b_i = 1$ (i.e. generate occurs, so propagate is a don't care)
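The claim that either definition of $p_i$ gives the same carries can be checked directly. The sketch below applies $c_{i+1} = g_i + p_i c_i$ bit by bit; the function name `carries` and the `use_xor` flag are illustrative.

```python
def carries(a: int, b: int, n: int, c0: int = 0, use_xor: bool = False) -> list[int]:
    """Return [c0, c1, ..., cn] using c[i+1] = g[i] | (p[i] & c[i])."""
    c = [c0]
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi                              # generate
        p = (ai ^ bi) if use_xor else (ai | bi)  # propagate, either form
        c.append(g | (p & c[i]))
    return c


# The two propagate definitions differ only when g = 1, where p is a don't care.
print(carries(0b1011, 0b0110, 4))
print(carries(0b1011, 0b0110, 4, use_xor=True))
```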
9Propagate and Generate-Lookahead
- Recursively apply to eliminate carry terms
  - $c_{i+1} = g_i + p_i g_{i-1} + p_i p_{i-1} g_{i-2} + \cdots + p_i p_{i-1} \cdots p_1 g_0 + p_i p_{i-1} \cdots p_1 p_0 c_0$
- This is a carry-lookahead adder
  - Note large fan-in of OR gate and last AND gate
- Too big! Build p's and g's in steps
  - $c_1 = g_0 + p_0 c_0$
  - $c_2 = G_{01} + P_{01} c_0$
  - where $G_{01} = g_1 + p_1 g_0$, $P_{01} = p_1 p_0$
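To see that the flattened expression really equals the recursive one, each carry can be evaluated straight from the $g$ and $p$ bits and $c_0$, with no dependence on earlier carries. A sketch, with `lookahead_carry` as a hypothetical name and `g`, `p` given as lists indexed from bit 0:

```python
def lookahead_carry(g: list[int], p: list[int], c0: int, i: int) -> int:
    """Compute c[i+1] from the fully expanded sum of products."""
    result = 0
    # Terms p_i p_{i-1} ... p_{j+1} g_j for j = i down to 0.
    for j in range(i, -1, -1):
        term = g[j]
        for k in range(j + 1, i + 1):
            term &= p[k]
        result |= term
    # Last term: p_i p_{i-1} ... p_0 c_0 (the widest AND gate).
    term = c0
    for k in range(i + 1):
        term &= p[k]
    return result | term


g = [0, 1, 0, 0]  # example generate bits, bit 0 first
p = [1, 1, 1, 0]  # example propagate bits, bit 0 first
print([lookahead_carry(g, p, 1, i) for i in range(4)])
```

The number of AND inputs in the last term grows with i, which is the fan-in problem the slide points out.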
10Propagate and Generate - Blocks
- In the general case, for any j with $i < j$, $j+1 < k$
  - $c_{k+1} = G_{ik} + P_{ik} c_i$
  - $G_{ik} = G_{j+1,k} + P_{j+1,k} G_{ij}$
  - $P_{ik} = P_{ij} P_{j+1,k}$
- $G_{ik}$ equation in words
  - A carry is generated out of the block consisting of bits i through k inclusive if
    - it is generated in the high-order part of the block (j+1, k), or
    - it is generated in the low-order (i, j) part of the block and then propagated through the high part
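The block equations amount to one combining operator on (G, P) pairs. The sketch below builds $G_{ik}$, $P_{ik}$ from single-bit signals and shows that the split point j does not change the result. The names `combine` and `block_pg` are illustrative.

```python
def combine(high: tuple[int, int], low: tuple[int, int]) -> tuple[int, int]:
    """Merge a high-order block (j+1..k) with a low-order block (i..j)."""
    g_high, p_high = high
    g_low, p_low = low
    # G_ik = G_{j+1,k} + P_{j+1,k} G_ij ;  P_ik = P_ij P_{j+1,k}
    return g_high | (p_high & g_low), p_low & p_high


def block_pg(g: list[int], p: list[int], i: int, k: int) -> tuple[int, int]:
    """Return (G_ik, P_ik) for bits i through k inclusive."""
    block = (g[i], p[i])
    for m in range(i + 1, k + 1):
        block = combine((g[m], p[m]), block)
    return block


g = [0, 1, 0, 0]
p = [1, 1, 1, 0]
whole = block_pg(g, p, 0, 3)
split = combine(block_pg(g, p, 2, 3), block_pg(g, p, 0, 1))
print(whole, split)  # same pair either way
```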
11PG Logic
12Carry-Ripple Revisited
13Carry-Skip Adder
- Carry-ripple is slow through all N stages
- Carry-skip allows carry to skip over groups of n bits
  - Decision based on n-bit propagate signal
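A behavioural sketch of the skip decision, assuming the width is a multiple of the group size and using $p_i = a_i \oplus b_i$ for the group propagate. The function name and parameters are hypothetical. In software the skipped carry and the rippled carry are always equal; the benefit in hardware is only that the skip path is faster.

```python
def carry_skip_add(a: int, b: int, width: int = 16, group: int = 4, cin: int = 0) -> tuple[int, int]:
    """Add two unsigned numbers group by group; returns (sum, carry_out)."""
    mask = (1 << group) - 1
    result = 0
    carry = cin
    for lo in range(0, width, group):
        ga = (a >> lo) & mask
        gb = (b >> lo) & mask
        total = ga + gb + carry            # stands in for the group's ripple adder
        result |= (total & mask) << lo
        ripple_cout = total >> group
        # Group propagate: every bit position has a_i XOR b_i = 1.
        group_propagate = (ga ^ gb) == mask
        # Skip multiplexer: pass the incoming carry straight through if it propagates.
        carry = carry if group_propagate else ripple_cout
    return result, carry


print(carry_skip_add(0x0FFF, 0x0001))
```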
14Carry-Lookahead Adder
- Carry-lookahead adder computes $G_{0i}$ for many bits in parallel.
- Uses higher-valency cells with more than two inputs.
15Carry-Select Adder
- Trick for critical paths dependent on late input X
  - Precompute two possible outputs for X = 0, 1
  - Select proper output when X arrives
- Carry-select adder precomputes n-bit sums
  - For both possible carries into n-bit group
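The precompute-then-select idea can be sketched as follows, again assuming the width is a multiple of the group size. `carry_select_add` is a hypothetical name; the two group sums stand in for two parallel n-bit adders.

```python
def carry_select_add(a: int, b: int, width: int = 16, group: int = 4, cin: int = 0) -> tuple[int, int]:
    """Add two unsigned numbers with per-group carry select; returns (sum, carry_out)."""
    mask = (1 << group) - 1
    result = 0
    carry = cin
    for lo in range(0, width, group):
        ga = (a >> lo) & mask
        gb = (b >> lo) & mask
        sum0 = ga + gb        # precomputed assuming carry-in = 0
        sum1 = ga + gb + 1    # precomputed assuming carry-in = 1
        chosen = sum1 if carry else sum0   # multiplexer driven by the late carry
        result |= (chosen & mask) << lo
        carry = chosen >> group
    return result, carry


print(carry_select_add(0x00FF, 0x0001))
```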
16Comparators
- 0s detector: A = 00...000
- 1s detector: A = 11...111
- Equality comparator: A = B
- Magnitude comparator: A < B
17 1s & 0s Detectors
- 1s detector: N-input AND gate
- 0s detector: NOTs + 1s detector (N-input NOR)
18Equality Comparator
- Check if each bit is equal (XNOR, aka equality gate)
- 1s detect on bitwise equality
19Magnitude Comparator
- Compute B-A and look at sign
- B-A = B + ~A + 1
- For unsigned numbers, carry out is sign bit
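A sketch of the subtract-and-inspect comparison for unsigned n-bit operands. Since B + ~A + 1 equals B - A + 2^n, the carry out is 1 exactly when B >= A. The function name and return format are assumptions made for the example.

```python
def unsigned_compare(a: int, b: int, n: int = 8) -> dict:
    """Compare unsigned n-bit A and B by computing B - A = B + ~A + 1."""
    mask = (1 << n) - 1
    total = b + (~a & mask) + 1
    carry_out = total >> n
    diff = total & mask
    return {
        "carry_out": carry_out,
        "a_le_b": carry_out == 1,              # no borrow: B - A >= 0
        "a_lt_b": carry_out == 1 and diff != 0,
        "a_eq_b": diff == 0,
    }


print(unsigned_compare(3, 5, n=4))
print(unsigned_compare(5, 3, n=4))
```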
20Signed vs. Unsigned
- For signed numbers, comparison is harder
  - C: carry out
  - Z: zero (all bits of A-B are 0)
  - N: negative (MSB of result)
  - V: overflow (inputs had different signs, output sign ≠ B)
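A sketch of how the four flags could be derived and used for a signed comparison. It computes B - A, matching the magnitude-comparator slide, so the overflow test compares the result sign with B. The names `compare_flags` and `signed_less_than` are illustrative, and the condition "Z = 0 and N = V" is the usual rule for "result greater than zero", stated here as an addition to the slide.

```python
def compare_flags(a: int, b: int, n: int = 8) -> tuple[int, int, int, int]:
    """Return (C, Z, N, V) for the n-bit subtraction B - A."""
    mask = (1 << n) - 1
    a &= mask                      # also maps negative Python ints to 2s complement
    b &= mask
    total = b + (~a & mask) + 1
    result = total & mask
    c = total >> n
    z = int(result == 0)
    neg = result >> (n - 1)
    sign_a = a >> (n - 1)
    sign_b = b >> (n - 1)
    # Overflow: operands had different signs and the result sign differs from B.
    v = int(sign_a != sign_b and neg != sign_b)
    return c, z, neg, v


def signed_less_than(a: int, b: int, n: int = 8) -> bool:
    """A < B for signed n-bit values: B - A is nonzero and truly positive."""
    _, z, neg, v = compare_flags(a, b, n)
    return z == 0 and neg == v


print(signed_less_than(-3, 2, n=4), signed_less_than(7, -8, n=4))
```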
21Shifters
- Logical Shift
  - Shifts number left or right and fills with 0s
  - 1011 LSR 1 = ____    1011 LSL 1 = ____
- Arithmetic Shift
  - Shifts number left or right. Right shift sign extends
  - 1011 ASR 1 = ____    1011 ASL 1 = ____
- Rotate
  - Shifts number left or right and fills with lost bits
  - 1011 ROR 1 = ____    1011 ROL 1 = ____
22Shifters
- Logical Shift
  - Shifts number left or right and fills with 0s
  - 1011 LSR 1 = 0101    1011 LSL 1 = 0110
- Arithmetic Shift
  - Shifts number left or right. Right shift sign extends
  - 1011 ASR 1 = 1101    1011 ASL 1 = 0110
- Rotate
  - Shifts number left or right and fills with lost bits
  - 1011 ROR 1 = 1101    1011 ROL 1 = 0111
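The six one-bit shifts can be written out with masks. A minimal sketch; the function name `shift_by_one` and the dictionary layout are choices made for this example.

```python
def shift_by_one(x: int, n: int = 4) -> dict:
    """Apply each of the six shift types by one position to an n-bit value."""
    mask = (1 << n) - 1
    msb = (x >> (n - 1)) & 1
    lsb = x & 1
    return {
        "LSR": x >> 1,                          # fill with 0
        "LSL": (x << 1) & mask,                 # fill with 0
        "ASR": (x >> 1) | (msb << (n - 1)),     # sign extends
        "ASL": (x << 1) & mask,                 # same bits as LSL
        "ROR": (x >> 1) | (lsb << (n - 1)),     # lost bit re-enters at the top
        "ROL": ((x << 1) & mask) | msb,         # lost bit re-enters at the bottom
    }


for name, value in shift_by_one(0b1011).items():
    print(name, format(value, "04b"))
```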
23Funnel Shifter
- A funnel shifter can do all six types of shifts
- Selects N-bit field Y from 2N-bit input
- Shift by k bits (0 ≤ k < N)
24Funnel Shifter Operation
- Computing N-k requires an adder
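A behavioural sketch of a funnel shifter. The way the 2N-bit input is assembled for each shift type is not recoverable from the slide text, so the table inside the function is an assumption based on the usual arrangement: right shifts use offset k, left shifts use offset N-k, which is where the adder comes in. `funnel_shift` and the operation names are hypothetical.

```python
def funnel_shift(a: int, k: int, op: str, n: int = 8) -> int:
    """Select an n-bit field from a 2n-bit word built from a (0 <= k < n)."""
    mask = (1 << n) - 1
    a &= mask
    sign_fill = mask if (a >> (n - 1)) & 1 else 0
    # (high half, low half, offset) of the 2n-bit input for each shift type.
    high, low, offset = {
        "LSR": (0, a, k),
        "ASR": (sign_fill, a, k),
        "ROR": (a, a, k),
        "LSL": (a, 0, n - k),
        "ASL": (a, 0, n - k),
        "ROL": (a, a, n - k),
    }[op]
    z = (high << n) | low
    return (z >> offset) & mask


for op in ("LSR", "LSL", "ASR", "ASL", "ROR", "ROL"):
    print(op, format(funnel_shift(0b1011, 1, op, n=4), "04b"))
```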
29Simplified Funnel Shifter
- Optimize down to 2N-1 bit input
34Funnel Shifter Design 1
- N N-input multiplexers
- Use 1-of-N hot select signals for shift amount
35Funnel Shifter Design 2
- Log N stages of 2-input muxes
- No select decoding needed
36Multi-input Adders
- Suppose we want to add k N-bit words
- Ex: 0001 + 0111 + 1101 + 0010 = _____
38Multi-input Adders
- Suppose we want to add k N-bit words
- Ex: 0001 + 0111 + 1101 + 0010 = 10111
- Straightforward solution: k-1 N-input CPAs
  - Large and slow
39Carry Save Addition
- A full adder sums 3 inputs and produces 2 outputs
  - Carry output has twice weight of sum output
- N full adders in parallel are called a carry save adder
  - Produce N sums and N carry outs
40CSA Application
- Use k-2 stages of CSAs
- Keep result in carry-save redundant form
- Final CPA computes actual result
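A sketch of the reduction: each carry-save stage turns three words into two, so k words need k-2 stages before a single ordinary addition finishes the job. `carry_save_add` and `add_many` are illustrative names, and Python's unbounded integers stand in for wide enough hardware words.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """One CSA stage: three words in, (sum word, carry word) out."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1   # carries have twice the weight
    return s, c


def add_many(words: list[int]) -> tuple[int, int]:
    """Sum k words with CSA stages; returns (total, number of CSA stages)."""
    pending = list(words)
    stages = 0
    while len(pending) > 2:
        x, y, z = pending.pop(), pending.pop(), pending.pop()
        pending.extend(carry_save_add(x, y, z))   # result stays in redundant form
        stages += 1
    return sum(pending), stages                   # final carry-propagate addition


total, stages = add_many([0b0001, 0b0111, 0b1101, 0b0010])
print(format(total, "b"), stages)
```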
49Multiplication
- Example
- M x N-bit multiplication
  - Produce N M-bit partial products
  - Sum these to produce an (M+N)-bit product
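The partial-product view can be sketched in a few lines: each multiplier bit selects either 0 or a shifted copy of the multiplicand. The function names and the default widths are assumptions for the example.

```python
def partial_products(y: int, x: int, n: int = 4) -> list[int]:
    """Return the N partial products of multiplicand y and N-bit multiplier x."""
    return [(y << i) if (x >> i) & 1 else 0 for i in range(n)]


def multiply(y: int, x: int, n: int = 4) -> int:
    """Sum the partial products to obtain the product."""
    return sum(partial_products(y, x, n))


for pp in partial_products(0b1100, 0b0101):
    print(format(pp, "08b"))
print(format(multiply(0b1100, 0b0101), "08b"))
```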
50General Form
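- Multiplicand: $Y = (y_{M-1}, y_{M-2}, \ldots, y_1, y_0)$
- Multiplier: $X = (x_{N-1}, x_{N-2}, \ldots, x_1, x_0)$
- Product: $P = Y \cdot X = \sum_{j=0}^{M-1} \sum_{i=0}^{N-1} y_j x_i 2^{i+j}$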
51Dot Diagram
- Each dot represents a bit
52Array Multiplier
53Rectangular Array
- Squash array to fit rectangular floorplan
54Fewer Partial Products
- Array multiplier requires N partial products
- If we looked at groups of r bits, we could form N/r partial products.
  - Faster and smaller?
  - Called radix-$2^r$ encoding
- Ex: r = 2: look at pairs of bits
  - Form partial products of 0, Y, 2Y, 3Y
  - First three are easy, but 3Y requires an adder
55Booth Encoding
- Instead of 3Y, try -Y, then increment next partial product to add 4Y
- Similarly, for 2Y, try -2Y + 4Y in next partial product
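A sketch of the resulting radix-4 recoding for an unsigned N-bit multiplier with N even. Each pair of bits, plus the "add 4Y" carry coming from the pair below, becomes a digit in {-2, -1, 0, 1, 2}, so no 3Y multiple is ever needed. The function names are hypothetical, and the extra final digit is how this sketch handles the last carry of an unsigned multiplier.

```python
def booth_radix4_digits(x: int, n: int) -> list[int]:
    """Recode unsigned n-bit x (n even) into radix-4 digits, lowest weight first."""
    digits = []
    prev = 0  # msb of the pair below; acts as the deferred +4Y
    for i in range(0, n, 2):
        b0 = (x >> i) & 1
        b1 = (x >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    digits.append(prev)  # carry out of the top pair
    return digits


def booth_multiply(y: int, x: int, n: int) -> int:
    """Sum the partial products d * Y * 4^i selected by the recoded digits."""
    return sum(d * y * 4 ** i for i, d in enumerate(booth_radix4_digits(x, n)))


print(booth_radix4_digits(0b1110, 4))
print(booth_multiply(13, 0b1110, 4))
```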
56Booth Hardware
- Booth encoder generates control lines for each PP
- Booth selectors choose PP bits
57Sign Extension
- Partial products can be negative
- Require sign extension, which is cumbersome
- High fanout on most significant bit
58Simplified Sign Ext.
- Sign bits are either all 0s or all 1s
- Note that all 0s is all 1s + 1 in proper column
- Use this to reduce loading on MSB
59Even Simpler Sign Ext.
- No need to add all the 1s in hardware
- Precompute the answer!
60Division - Restoring
- Do the following n times
  - Shift A and Q left one bit
  - Subtract M from A, put answer in A
  - If the sign of A is 1
    - set q0 to 0
    - Add M back to A
  - If the sign of A is 0
    - set q0 to 1
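[Figure: restoring-division hardware. Shift-left register pair A ($a_n \ldots a_0$) and Q ($q_{n-1} \ldots q_0$, initially the dividend), divisor register M ($m_{n-1} \ldots m_0$, with a leading 0), (n+1)-bit adder with Add/Subtract control, quotient setting, control sequencer.]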
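The restoring steps can be sketched as a loop. This is a behavioural model for unsigned operands with a nonzero divisor: a signed Python integer stands in for the (n+1)-bit register A, so "sign of A is 1" becomes `a < 0`. The function name is hypothetical.

```python
def restoring_divide(dividend: int, divisor: int, n: int) -> tuple[int, int]:
    """Divide two unsigned n-bit numbers; returns (quotient, remainder)."""
    mask = (1 << n) - 1
    a, q, m = 0, dividend & mask, divisor
    for _ in range(n):
        # Shift A and Q left one bit as a single register pair.
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        a -= m                # subtract M from A
        if a < 0:             # sign of A is 1
            a += m            # restore A; q0 stays 0
        else:                 # sign of A is 0
            q |= 1            # set q0 to 1
    return q, a


print(restoring_divide(8, 3, 4))
```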
61Division Restoring Example
62Division - Nonrestoring
- Do the following n times
  - If the sign of A is 0
    - shift A and Q left
    - subtract M from A
  - Else
    - shift A and Q left
    - add M to A
  - Now if sign of A is 0
    - set q0 to 1
  - Else
    - set q0 to 0
- If the sign of A is 1
  - add M to A
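The nonrestoring steps differ from the restoring ones only in deferring the correction: a negative A is left as is and the next step adds instead of subtracting. A behavioural sketch under the same assumptions as before (unsigned operands, nonzero divisor, a signed Python integer standing in for register A); the function name is hypothetical.

```python
def nonrestoring_divide(dividend: int, divisor: int, n: int) -> tuple[int, int]:
    """Divide two unsigned n-bit numbers; returns (quotient, remainder)."""
    mask = (1 << n) - 1
    a, q, m = 0, dividend & mask, divisor
    for _ in range(n):
        was_negative = a < 0
        # Shift A and Q left in both cases.
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        # Add M if A was negative before the shift, otherwise subtract M.
        a = a + m if was_negative else a - m
        if a >= 0:
            q |= 1            # set q0 to 1; otherwise q0 stays 0
    if a < 0:                 # final correction of the remainder
        a += m
    return q, a


print(nonrestoring_divide(8, 3, 4))
```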
63Division Nonrestoring Example
64Floating Point Single Precision
- IEEE-754, 854
- Decimal point can move, hence it's "floating"
- Floating point is useful for scientific calculations
- Can represent
  - Very large integers and
  - Very small fractions
  - about $10^{\pm 38}$
65Floating Point Double Precision
- Double Precision can represent
  - about $10^{\pm 308}$
66Floating Point
- The IEEE Standard requires these operations, at a minimum
  - Add
  - Subtract
  - Multiply
  - Divide
  - Remainder
  - Square Root
  - Decimal/Binary Conversion
- Special Values
- Exceptions
  - Underflow, Overflow, divide by 0, inexact, invalid
E   | M   | Value
0   | 0   | ±0
255 | 0   | ±∞
0   | ≠ 0 | ±0.M × 2^-126
255 | ≠ 0 | Not a Number (NaN)
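The special-value table can be exercised by decoding single-precision bit patterns by hand. The sketch assumes the standard IEEE-754 single layout (1 sign bit, 8-bit exponent E, 23-bit fraction M) and adds the normal case 1.M × 2^(E-127), which the table itself does not list. `decode_single` and `reference` are names chosen for this example; `reference` uses the standard `struct` module as a cross-check.

```python
import struct


def decode_single(bits: int) -> float:
    """Decode a 32-bit IEEE-754 single-precision pattern into a Python float."""
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    e = (bits >> 23) & 0xFF
    m = bits & 0x7FFFFF
    if e == 255:
        # M = 0 is infinity, anything else is Not a Number.
        return sign * float("inf") if m == 0 else float("nan")
    if e == 0:
        # Zero (M = 0) or a denormal: 0.M x 2^-126.
        return sign * (m / 2**23) * 2.0**-126
    # Normal number: 1.M x 2^(E - 127).
    return sign * (1 + m / 2**23) * 2.0 ** (e - 127)


def reference(bits: int) -> float:
    """Let the struct module reinterpret the same 32 bits as a float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


for pattern in (0x3F800000, 0x00000001, 0x7F800000, 0x7FC00000):
    print(hex(pattern), decode_single(pattern), reference(pattern))
```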
67FP Arithmetic Operations
- Add/Subtract
  - Shift mantissa of smaller exponent number right by the difference in exponents
  - Set the exponent of the result to the larger exponent
  - Add/Sub Mantissas, get sign
  - Normalize
- Multiply/Divide
  - Add/Sub exponents, Subtract/Add 127
  - Multiply/Divide Mantissas, determine sign
  - Normalize
68FP Guard Bits and Truncation
- Guard bits
  - Extra bits during intermediate steps to yield maximum accuracy in the final result
  - They need to be removed when generating the final result
- Chopping
  - simply remove guard bits
- Von Neumann rounding
  - if all guard bits are 0, chop; else set the LSB of the result to 1
- Rounding
  - Add 1 to LSB if guard MSB = 1
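The three truncation methods can be compared on a mantissa held as an integer whose lowest `guard` bits are the guard bits. This is a sketch with hypothetical function names. It assumes `guard >= 1`, and the rounding function follows the simple rule on the slide rather than any tie-breaking refinement. A result that overflows the mantissa width after rounding would still need renormalizing, which is not shown.

```python
def chop(value: int, guard: int) -> int:
    """Chopping: simply drop the guard bits."""
    return value >> guard


def von_neumann_round(value: int, guard: int) -> int:
    """Chop if every guard bit is 0, otherwise force the retained LSB to 1."""
    kept = value >> guard
    removed = value & ((1 << guard) - 1)
    return kept if removed == 0 else kept | 1


def round_nearest(value: int, guard: int) -> int:
    """Add 1 to the retained LSB when the most significant guard bit is 1."""
    kept = value >> guard
    guard_msb = (value >> (guard - 1)) & 1
    return kept + guard_msb


for value in (0b010_000, 0b010_011, 0b010_100, 0b011_110):
    print(format(value, "06b"), chop(value, 3), von_neumann_round(value, 3), round_nearest(value, 3))
```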
69FP Add-Subtract Unit