Title: Chapter 3: Arithmetic for Computers
1 Chapter 3: Arithmetic for Computers
2 Taxonomy of Computer Information
- Information
  - Instructions
  - Addresses
  - Data
    - Numeric
      - Fixed-point
        - Unsigned (ordinal)
        - Signed
          - Sign-magnitude
          - 2's complement
      - Floating-point
        - Single-precision
        - Double-precision
    - Non-numeric
      - Character (ASCII)
      - Boolean
      - Other
3 Number Format Considerations
- Type of numbers (integer, fraction, real, complex)
- Range of values
  - between smallest and largest values
  - wider in floating-point formats
- Precision of values (max. accuracy)
  - usually related to the number of bits allocated
  - n bits can represent 2^n values/levels
- Value/weight of the least-significant bit
- Cost of hardware to store and process numbers (some formats are more difficult to add, multiply/divide, etc.)
4 Unsigned Integers
- Positional number system
  - A = a_{n-1} a_{n-2} ... a_2 a_1 a_0
  - value = a_{n-1}·2^(n-1) + a_{n-2}·2^(n-2) + ... + a_2·2^2 + a_1·2^1 + a_0·2^0
  - Range: 0 to 2^n - 1
  - Carry out of the MSB has weight 2^n
- Fixed-point fraction
  - 0.a_{-1} a_{-2} ... a_{-n} = a_{-1}·2^-1 + a_{-2}·2^-2 + ... + a_{-n}·2^-n
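A minimal C sketch of the positional evaluation above; the 8-bit width and the sample pattern are illustrative choices, not something fixed by the slides.

    /* Positional value of an unsigned bit pattern: sum of a_i * 2^i.            */
    /* The 8-bit pattern 1011 0110, stored LSB-first in the array, is an example. */
    #include <stdio.h>

    int main(void) {
        int a[8] = {0, 1, 1, 0, 1, 1, 0, 1};      /* a[i] holds bit a_i      */
        unsigned value = 0;
        for (int i = 0; i < 8; i++)
            value += (unsigned)a[i] << i;         /* weight of bit i is 2^i  */
        printf("%u\n", value);                    /* prints 182              */
        return 0;
    }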
5 Signed Integers
- Sign-magnitude format (n-bit values)
  - A = S a_{n-2} ... a_2 a_1 a_0  (S = sign bit)
  - Range: -(2^(n-1) - 1) to +(2^(n-1) - 1)
  - Addition/subtraction difficult (multiply easy)
  - Redundant representations of 0
- 2's complement format (n-bit values)
  - -A represented by 2^n - A
  - Range: -2^(n-1) to +(2^(n-1) - 1)
  - Addition/subtraction easier (multiply harder)
  - Single representation of 0
6 Computing the 2's Complement
- To compute the 2's complement of A
  - Let A = a_{n-1} a_{n-2} ... a_2 a_1 a_0
  - 2^n - A = (2^n - 1) + 1 - A = [(2^n - 1) - A] + 1
  - (2^n - 1) is the all-ones pattern, and subtracting A from it complements each bit:
        1       1      ... 1    1    1
      - a_{n-1} a_{n-2} ... a_2 a_1 a_0
      = ~a_{n-1} ~a_{n-2} ... ~a_2 ~a_1 ~a_0
  - So 2^n - A = (one's complement of A) + 1
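A small C sketch of this identity, using n = 8 as an illustrative width: the two's complement of A is the one's complement plus 1, i.e. 2^8 - A.

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint8_t a = 44;                          /* any 8-bit value            */
        uint8_t neg = (uint8_t)(~a + 1);         /* one's complement, plus 1   */
        printf("2^8 - %u = %u\n", (unsigned)a, (unsigned)neg);      /* 256 - 44 = 212 */
        printf("a + (-a) mod 2^8 = %u\n", (unsigned)(uint8_t)(a + neg));  /* 0  */
        return 0;
    }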
7 2's Complement Arithmetic
- Let (2^(n-1) - 1) >= A >= 0 and (2^(n-1) - 1) >= B >= 0
- Case 1: A + B
  - (2^n - 2) >= (A + B) >= 0
  - Since the result is < 2^n, there is no carry out of the MSB
  - Valid result if (A + B) < 2^(n-1)
    - MSB (sign bit) = 0
  - Overflow if (A + B) >= 2^(n-1)
    - MSB (sign bit) = 1 since the result is >= 2^(n-1)
    - Carry into the MSB
8 2's Complement Arithmetic
- Case 2: A - B
  - Compute by adding A + (-B)
  - Using the 2's complement of B: A + (2^n - B)
  - -2^(n-1) < result < 2^(n-1)  (no overflow possible)
  - If A >= B: 2^n + (A - B) >= 2^n
    - Weight of the adder carry output is 2^n
    - Discard the carry (2^n), keeping (A - B), which is >= 0
  - If A < B: 2^n + (A - B) < 2^n
    - Adder carry output = 0
    - Result is 2^n - (B - A)
    - 2's complement representation of -(B - A)
9 2's Complement Arithmetic
- Case 3: -A - B
  - Compute by adding (-A) + (-B)
  - In 2's complement: (2^n - A) + (2^n - B) = 2^n + 2^n - (A + B)
  - Discard the carry (2^n), making the result 2^n - (A + B)
    - 2's complement representation of -(A + B)
  - 0 >= result > -2^n
  - Overflow if -(A + B) < -2^(n-1)
    - MSB (sign bit) = 0 since 2^n - (A + B) < 2^(n-1)
    - No carry into the MSB
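A C sketch of the rule the three cases add up to: signed overflow can occur only when both operands have the same sign and the sum's sign differs. The 32-bit width is an assumption for illustration.

    #include <stdint.h>
    #include <stdio.h>

    /* Returns 1 when a + b overflows 32-bit two's complement. */
    static int add_overflows(int32_t a, int32_t b) {
        int32_t sum = (int32_t)((uint32_t)a + (uint32_t)b);    /* wraparound add */
        return ((a < 0) == (b < 0)) && ((sum < 0) != (a < 0));
    }

    int main(void) {
        printf("%d\n", add_overflows(2000000000, 2000000000)); /* 1: Case 1 overflow        */
        printf("%d\n", add_overflows(5, -7));                  /* 0: Case 2 never overflows */
        printf("%d\n", add_overflows(INT32_MIN, -1));          /* 1: Case 3 overflow        */
        return 0;
    }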
10 Relational Operators
- Compute A - B, then test the ALU flags to compare A vs. B
  - ZF = result is zero        OF = 2's complement overflow
  - SF = sign bit of result    CF = adder carry output

  Relation   Signed test               Unsigned test
  A = B      ZF = 1                    ZF = 1
  A != B     ZF = 0                    ZF = 0
  A >= B     (SF XOR OF) = 0           CF = 1 (no borrow)
  A > B      (SF XOR OF) + ZF = 0      CF = 1 and ZF = 0
  A <= B     (SF XOR OF) + ZF = 1      CF = 0 or ZF = 1
  A < B      (SF XOR OF) = 1           CF = 0 (borrow)
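A C sketch that computes the four flags from A - B and derives the signed and unsigned "less than" tests from the table; the 32-bit width and the sample operands are assumptions for illustration.

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint32_t a = 3, b = 5;
        uint32_t diff = a - b;
        int ZF = (diff == 0);
        int SF = (diff >> 31) & 1;
        int CF = (a >= b);                            /* carry out = no borrow        */
        int OF = (((a ^ b) & (a ^ diff)) >> 31) & 1;  /* signed overflow of a - b     */
        int signed_lt   = SF ^ OF;                    /* A < B, signed                */
        int unsigned_lt = !CF;                        /* A < B, unsigned              */
        printf("ZF=%d SF=%d CF=%d OF=%d  signed<: %d  unsigned<: %d\n",
               ZF, SF, CF, OF, signed_lt, unsigned_lt);
        return 0;
    }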
11 MIPS Overflow Detection
- An exception (interrupt) occurs when overflow is detected for add, addi, sub
  - Control jumps to a predefined address for the exception
  - The interrupted address is saved for possible resumption
- Details depend on the software system / language
  - example: flight control vs. homework assignment
- We don't always want to detect overflow: new MIPS instructions addu, addiu, subu
  - note: addiu still sign-extends!
  - note: sltu, sltiu for unsigned comparisons
12 Designing the Arithmetic Logic Unit (ALU)
- Provide arithmetic and logical functions as needed by the instruction set
- Consider tradeoffs of area vs. performance
- (Material from Appendix B)
13 Different Implementations
- Not easy to decide the best way to build something
  - Don't want too many inputs to a single gate (fan-in)
  - Don't want to have to go through too many gates (delay)
  - For our purposes, ease of comprehension is important
- Let's look at a 1-bit ALU for addition
  - cout = a·b + a·cin + b·cin
  - sum = a XOR b XOR cin
- How could we build a 1-bit ALU for add, and, and or?
- How could we build a 32-bit ALU?
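One answer to the last question, sketched in C rather than in gates: replicate the 1-bit sum/carry equations above 32 times, chaining each carry out into the next carry in. This is a behavioral model of the ripple connection, not the hardware itself.

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t ripple_add(uint32_t a, uint32_t b, int cin, int *cout) {
        uint32_t sum = 0;
        int c = cin;
        for (int i = 0; i < 32; i++) {
            int ai = (a >> i) & 1, bi = (b >> i) & 1;
            sum |= (uint32_t)(ai ^ bi ^ c) << i;       /* sum_i = a XOR b XOR cin     */
            c = (ai & bi) | (ai & c) | (bi & c);       /* carry into the next bit     */
        }
        *cout = c;
        return sum;
    }

    int main(void) {
        int cout;
        uint32_t s = ripple_add(0xFFFFFFFFu, 1, 0, &cout);
        printf("sum=%u cout=%d\n", (unsigned)s, cout);  /* sum=0 cout=1 */
        return 0;
    }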
14 Building a 32-bit ALU
15 What about subtraction (a - b)?
- Two's complement approach: just negate b and add.
16 Adding a NOR Function
- Can also choose to invert a. How do we get a NOR b?
[Figure: 1-bit ALU with inputs a, b, CarryIn, Ainvert, Binvert, and a multi-bit Operation select; outputs Result and CarryOut.]
17 Tailoring the ALU to the MIPS
- Need to support the set-on-less-than instruction (slt)
  - remember: slt is an arithmetic instruction
  - produces a 1 if rs < rt and 0 otherwise
  - use subtraction: (a - b) < 0 implies a < b
- Need to support the test for equality (beq t5, t6, t7)
  - use subtraction: (a - b) = 0 implies a = b
18 Supporting slt
[Figure: 1-bit ALU for the most significant bit, with Ainvert, Binvert, operation select, a Set output, and overflow detection; all other bits use the ordinary 1-bit ALU. Use this ALU for the most significant bit.]
19 Supporting slt
[Figure: 32-bit ALU built from 1-bit ALUs producing Result0 ... Result31; the Set output of the bit-31 ALU feeds back to the Less input of bit 0, and Overflow is detected at bit 31.]
20 Test for Equality
- Notice the control lines:
    0000 = and
    0001 = or
    0010 = add
    0110 = subtract
    0111 = slt
    1100 = NOR
- Note: Zero is a 1 when the result is zero!
21 Conclusion
- We can build an ALU to support the MIPS instruction set
  - key idea: use a multiplexor to select the output we want
  - we can efficiently perform subtraction using two's complement
  - we can replicate a 1-bit ALU to produce a 32-bit ALU
- Important points about hardware
  - all of the gates are always working
  - the speed of a gate is affected by the number of inputs to the gate
  - the speed of a circuit is affected by the number of gates in series (on the critical path, or the deepest level of logic)
- Our primary focus is comprehension; however,
  - clever changes to organization can improve performance (similar to using better algorithms in software)
  - we saw this in multiplication; let's look at addition now
22 Problem: the ripple carry adder is slow
- Is a 32-bit ALU as fast as a 1-bit ALU?
- Is there more than one way to do addition?
  - two extremes: ripple carry and sum-of-products
- Can you see the ripple? How could you get rid of it?
  - c1 = b0·c0 + a0·c0 + a0·b0
  - c2 = b1·c1 + a1·c1 + a1·b1   (substitute the expression for c1 ...)
  - c3 = b2·c2 + a2·c2 + a2·b2   (substitute the expression for c2 ...)
  - c4 = b3·c3 + a3·c3 + a3·b3   (substitute the expression for c3 ...)  Not feasible! Why?
23 One-bit Full-Adder Circuit
[Figure: full adder FAi built from two XOR gates producing sum_i = a_i XOR b_i XOR c_i, plus two AND gates and an OR gate producing c_{i+1}.]
24 32-bit Ripple-Carry Adder
[Figure: full adders FA0 ... FA31 chained so that each carry out feeds the next stage's carry in; inputs a0/b0 ... a31/b31, initial carry c0, outputs sum0 ... sum31.]
25 How Fast is the Ripple-Carry Adder?
- The longest delay path (critical path) runs from cin to sum31.
- Suppose the delay of a full adder is 100 ps.
  - Critical path delay = 32 x 100 ps = 3,200 ps
  - Clock rate cannot be higher than 10^12 / 3,200 = 312 MHz.
- Must use more efficient ways to handle the carry.
26 Fast Adders
- In general, any output of a 32-bit adder can be evaluated as a logic expression in terms of all 65 inputs.
- Levels of logic in the circuit can be reduced to log2(N) for an N-bit adder; ripple-carry has N levels.
- More gates are needed, about log2(N) times that of the ripple-carry design.
- The fastest design is known as the carry-lookahead adder.
27 N-bit Adder Design Options
  Type of adder     Time complexity (delay)   Space complexity (size)
  Ripple-carry      O(N)                      O(N)
  Carry-lookahead   O(log2 N)                 O(N log2 N)
  Carry-skip        O(sqrt(N))                O(N)
  Carry-select      O(sqrt(N))                O(N)
Reference: J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, Second Edition, San Francisco, California, 1990.
28 Carry-Lookahead Adder
- An approach in between our two extremes
- Motivation
  - If we didn't know the value of carry-in, what could we do?
  - When would we always generate a carry?  g_i = a_i·b_i
  - When would we propagate the carry?      p_i = a_i + b_i
- Did we get rid of the ripple?
  - c1 = g0 + p0·c0
  - c2 = g1 + p1·c1 = g1 + p1·g0 + p1·p0·c0
  - c3 = g2 + p2·c2 = g2 + p2·g1 + p2·p1·g0 + p2·p1·p0·c0
  - c4 = g3 + p3·c3 = g3 + p3·g2 + p3·p2·g1 + p3·p2·p1·g0 + p3·p2·p1·p0·c0
- Feasible! Why?
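A C sketch of the 4-bit lookahead equations above, with g_i = a_i·b_i and p_i = a_i + b_i; the sample operands are arbitrary choices for illustration.

    #include <stdio.h>

    int main(void) {
        int a[4] = {1, 0, 1, 1}, b[4] = {1, 1, 0, 1};   /* a = 1101, b = 1011 (LSB first) */
        int g[4], p[4], c[5];
        c[0] = 0;
        for (int i = 0; i < 4; i++) { g[i] = a[i] & b[i]; p[i] = a[i] | b[i]; }
        /* every carry is a two-level expression of the inputs: */
        c[1] = g[0] | (p[0] & c[0]);
        c[2] = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c[0]);
        c[3] = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c[0]);
        c[4] = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
             | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c[0]);
        for (int i = 0; i < 4; i++)
            printf("sum%d = %d\n", i, a[i] ^ b[i] ^ c[i]);  /* 13 + 11 = 24 = 1 1000 */
        printf("carry out = %d\n", c[4]);
        return 0;
    }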
29 Use the Principle to Build Bigger Adders
- Can't build a 16-bit adder this way... (too big)
- Could use ripple carry between 4-bit CLA adders
- Better: use the CLA principle again!
30 Carry-Select Adder
[Figure: the low half produces sum0-sum15 and a carry; the high half (a16-a31, b16-b31) is computed twice by two 16-bit ripple carry adders, one with carry-in 0 and one with carry-in 1, and a multiplexer driven by the low half's carry selects sum16-sum31. This is known as a carry-select adder.]
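The same idea in a short C sketch: compute the upper 16 bits twice, once per possible carry-in, and let the lower half's carry act as the multiplexer select. The operand values are chosen only to force a carry across the halves.

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint32_t a = 0x0001FFFFu, b = 0x00000001u;
        uint32_t lo  = (a & 0xFFFF) + (b & 0xFFFF);    /* 17-bit lower result        */
        int carry    = (lo >> 16) & 1;                 /* carry into bit 16          */
        uint32_t hi0 = (a >> 16) + (b >> 16);          /* upper half assuming cin = 0 */
        uint32_t hi1 = hi0 + 1;                        /* upper half assuming cin = 1 */
        uint32_t sum = ((carry ? hi1 : hi0) << 16) | (lo & 0xFFFF);   /* the mux     */
        printf("0x%08X\n", (unsigned)sum);             /* prints 0x00020000          */
        return 0;
    }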
31 ALU Summary
- We can build an ALU to support MIPS addition
- Our focus is on comprehension, not performance
- Real processors use more sophisticated techniques for arithmetic
- Where performance is not critical, hardware description languages allow designers to completely automate the creation of hardware!
32 Multiplication
- More complicated than addition
  - accomplished via shifting and addition
  - More time and more area
- Let's look at 3 versions based on a grade-school algorithm:
      0010  (multiplicand)
    x 1011  (multiplier)
- Negative numbers: convert and multiply
  - there are better techniques; we won't look at them
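A C sketch of the grade-school algorithm above for unsigned operands: examine one multiplier bit per step and add the (shifted) multiplicand whenever that bit is 1. The 32-bit operand width is an assumption.

    #include <stdint.h>
    #include <stdio.h>

    static uint64_t shift_add_multiply(uint32_t multiplicand, uint32_t multiplier) {
        uint64_t product = 0;
        uint64_t mcand = multiplicand;              /* will be shifted left         */
        for (int i = 0; i < 32; i++) {
            if (multiplier & 1)                     /* low multiplier bit set?      */
                product += mcand;                   /* add the shifted multiplicand */
            mcand <<= 1;                            /* shift multiplicand left      */
            multiplier >>= 1;                       /* move to the next bit         */
        }
        return product;
    }

    int main(void) {
        /* 0010 x 1011 from the slide: 2 x 11 = 22 */
        printf("%llu\n", (unsigned long long)shift_add_multiply(2, 11));
        return 0;
    }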
33 Multiplication Implementation
[Figure: multiplication hardware, showing the datapath and its control.]
34 Final Version
- Multiplier starts in the right half of the product register
[Figure: refined multiply hardware, annotated "What goes here?"]
35 Multiplying Signed Numbers with Booth's Algorithm
- Consider A x B where A and B are signed integers (2's complement format)
- Decompose B into the sum B1 + B2 + ... + Bn
  - A x B = A x (B1 + B2 + ... + Bn)
          = (A x B1) + (A x B2) + ... + (A x Bn)
- Let each Bi be a single string of 1s embedded in 0s
  - e.g., 00111100
- Example
    0110010011100 = 0110000000000
                  + 0000010000000
                  + 0000000011100
36 Booth's Algorithm
- Scanning from right to left, bit number u is the first 1 bit of the string and bit v is the first 0 to the left of the string
        v   u
  Bi =  0 0 1 1 0 0
     =  0 0 1 1 1 1   (2^v - 1)
      - 0 0 0 0 1 1   (2^u - 1)
- So Bi = (2^v - 1) - (2^u - 1) = 2^v - 2^u
37 Booth's Algorithm
- Decomposing B:
  - A x B = A x (B1 + B2 + ...)
          = A x ((2^v1 - 2^u1) + (2^v2 - 2^u2) + ...)
          = (A x 2^v1) - (A x 2^u1) + (A x 2^v2) - (A x 2^u2) + ...
- A x B can be computed by adding and subtracting shifted values of A
- Scan the bits right to left, shifting A once per bit
  - When the bit string changes from 0 to 1, subtract the shifted A from the current product: P - (A x 2^u)
  - When the bit string changes from 1 to 0, add the shifted A to the current product: P + (A x 2^v)
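A C sketch of the recoding just described, on 8-bit two's-complement operands (the width and the test values are illustrative): a 0-to-1 transition subtracts A x 2^u, a 1-to-0 transition adds A x 2^v.

    #include <stdint.h>
    #include <stdio.h>

    static int booth_multiply(int8_t a, int8_t b) {
        int product = 0;
        int prev = 0;                              /* imaginary bit to the right of bit 0 */
        for (int i = 0; i < 8; i++) {
            int cur = (b >> i) & 1;
            if (cur == 1 && prev == 0)             /* 0 -> 1: start of a run of 1s */
                product -= (int)a * (1 << i);      /* subtract A x 2^u             */
            if (cur == 0 && prev == 1)             /* 1 -> 0: end of a run of 1s   */
                product += (int)a * (1 << i);      /* add A x 2^v                  */
            prev = cur;
        }
        /* a run that reaches the sign bit is left open, which is exactly right
           for two's-complement multipliers                                      */
        return product;
    }

    int main(void) {
        printf("%d\n", booth_multiply(-6, 60));    /* 60 = 00111100, one run: -360 */
        printf("%d\n", booth_multiply(7, -3));     /* -21                          */
        return 0;
    }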
38 Floating Point Numbers
- We need a way to represent a wide range of numbers
  - numbers with fractions, e.g., 3.1416
  - large numbers
    - 976,000,000,000,000 = 9.76 x 10^14
  - small numbers
    - 0.0000000000000976 = 9.76 x 10^-14
- Representation
  - sign, exponent, significand
  - (-1)^sign x significand x 2^exponent
  - more bits for the significand gives more accuracy
  - more bits for the exponent increases the range
39 Scientific Notation
- Scientific notation
  - 0.525 x 10^5 = 5.25 x 10^4 = 52.5 x 10^3
  - 5.25 x 10^4 is in normalized scientific notation
    - position of the decimal point is fixed
    - leading digit is non-zero
- Binary numbers
  - 5.25 = 101.01_2 = 1.0101_2 x 2^2
- Binary point
  - multiplication by 2 moves the point to the left
  - division by 2 moves the point to the right
- Known as floating-point format.
40 Binary to Decimal Conversion
Binary:  (-1)^S x (1.b1 b2 b3 b4) x 2^E
Decimal: (-1)^S x (1 + b1·2^-1 + b2·2^-2 + b3·2^-3 + b4·2^-4) x 2^E

Example: -1.1100 x 2^-2 (binary)
  = -(1 + 2^-1 + 2^-2) x 2^-2
  = -(1 + 0.5 + 0.25) / 4
  = -1.75 / 4
  = -0.4375 (decimal)
41 IEEE Std. 754 Floating-Point Format
Single precision (one 32-bit word):
  bit 31      S (sign)
  bits 23-30  E (8-bit exponent)
  bits 0-22   F (23-bit fraction)
Double precision (two 32-bit words):
  bit 31      S (sign)
  bits 20-30  E (11-bit exponent)
  bits 0-19   F (upper 20 bits of the 52-bit fraction)
  second word, bits 0-31: continuation of the 52-bit fraction
42 IEEE 754 Floating-Point Standard
- Represented value = (-1)^sign x (1 + F) x 2^(exponent - bias)
- Exponent is biased (excess-K format) to make sorting easier
  - bias of 127 for single precision and 1023 for double precision
  - stored E values range over 1 .. 254 (0 and 255 are reserved)
  - Range: 2^-126 to 2^127 (about 10^-38 to 10^38)
- Significand is in sign-magnitude, normalized form
  - Significand = (1 + F) = 1.b_{-1} b_{-2} ... b_{-23}
  - Storage of the leading 1 is suppressed
- Overflow: exponent too large to fit in 8 bits. The number can be positive or negative.
- Underflow: exponent too small (too negative) to fit in 8 bits. The number can be positive or negative.
43 IEEE 754 Floating-Point Standard
- Example
  - Decimal: -5.75 = -(4 + 1 + 1/2 + 1/4)
  - Binary: -101.11 = -1.0111 x 2^2
  - Floating-point exponent = 127 + 2 = 129 = 1000 0001
  - IEEE single precision: 1 10000001 01110000000000000000000
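A C sketch that pulls the example apart again: extract S, E, and F from the 32-bit pattern and rebuild (-1)^S x (1 + F) x 2^(E - 127). Special encodings (zero, infinity, NaN, denormals) are ignored here.

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>
    #include <math.h>

    int main(void) {
        float x = -5.75f;
        uint32_t bits;
        memcpy(&bits, &x, sizeof bits);            /* view the 32-bit pattern    */
        uint32_t S = bits >> 31;
        uint32_t E = (bits >> 23) & 0xFF;          /* biased exponent            */
        uint32_t F = bits & 0x7FFFFF;              /* 23-bit fraction            */
        double value = (S ? -1.0 : 1.0) *
                       ldexp(1.0 + F / 8388608.0,  /* significand = 1 + F / 2^23 */
                             (int)E - 127);        /* remove the bias            */
        printf("S=%u E=%u F=0x%06X -> %g\n",
               (unsigned)S, (unsigned)E, (unsigned)F, value);  /* S=1 E=129 ... -5.75 */
        return 0;
    }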
44 Examples
Biased exponent (0-255); the bias of 127 (0111 1111) is subtracted.

   1.1010001 x 2^10100   ->  0 10010011 10100010000000000000000  =  1.6328125 x 2^20
  -1.1010001 x 2^10100   ->  1 10010011 10100010000000000000000  = -1.6328125 x 2^20
   1.1010001 x 2^-10100  ->  0 01101011 10100010000000000000000  =  1.6328125 x 2^-20
  -1.1010001 x 2^-10100  ->  1 01101011 10100010000000000000000  = -1.6328125 x 2^-20

Fraction value: 0.5 + 0.125 + 0.0078125 = 0.6328125
45 Numbers in 32-bit Formats
- Two's complement integers: expressible numbers run from -2^31 to 2^31 - 1, with 0 in between.
- Floating point numbers: expressible negative numbers run from -(2 - 2^-23) x 2^127 to -2^-127, expressible positive numbers from 2^-127 to (2 - 2^-23) x 2^127; magnitudes beyond these are negative/positive overflow, and nonzero magnitudes below 2^-127 are negative/positive underflow.
- Ref: W. Stallings, Computer Organization and Architecture, Sixth Edition, Upper Saddle River, NJ: Prentice-Hall.
46 IEEE 754 Special Codes
Zero
  S 00000000 00000000000000000000000
  - Would otherwise be 1.0 x 2^-127, the smallest positive number in the single-precision format
  - Interpreted as positive/negative zero
  - An exponent less than -127 is positive underflow (regard as zero)
Infinity
  S 11111111 00000000000000000000000
  - Would otherwise be 1.0 x 2^128, the largest positive number in the single-precision format
  - Interpreted as positive/negative infinity
  - If the true exponent is 128 and the fraction is nonzero, the pattern would be "greater than infinity"; it is called Not a Number, or NaN (not interpreted as infinity).
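A C sketch that recognizes the special encodings above purely from the exponent and fraction fields (denormals are lumped in with ordinary numbers here).

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    static const char *classify(float x) {
        uint32_t bits;
        memcpy(&bits, &x, sizeof bits);
        uint32_t E = (bits >> 23) & 0xFF, F = bits & 0x7FFFFF;
        if (E == 0 && F == 0)   return "zero";       /* E all 0s, F all 0s  */
        if (E == 255 && F == 0) return "infinity";   /* E all 1s, F all 0s  */
        if (E == 255)           return "NaN";        /* E all 1s, F nonzero */
        return "ordinary number";
    }

    int main(void) {
        float zero = 0.0f;
        printf("%s %s %s\n",
               classify(-zero), classify(1.0f / zero), classify(zero / zero));
        return 0;                                    /* zero infinity NaN   */
    }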
47 Addition and Subtraction
- Addition/subtraction of two floating-point numbers
- Example (align the mantissas first):
      2 x 10^3        0.2 x 10^4
    + 3 x 10^4   ->  + 3   x 10^4
                     ------------
                       3.2 x 10^4
- General case:
    m1 x 2^e1 + m2 x 2^e2 = (m1 + m2 x 2^(e2-e1)) x 2^e1   for e1 > e2
                          = (m1 x 2^(e1-e2) + m2) x 2^e2   for e2 > e1
- Shift the smaller mantissa right by |e1 - e2| bits to align the mantissas.
48 Addition/Subtraction Algorithm
- 0. Zero check
  - Change the sign of the subtrahend
  - If either operand is 0, the other is the result
- 1. Significand alignment: right-shift the smaller significand until the two exponents are identical
- 2. Addition: add the significands and report an exception if overflow occurs
- 3. Normalization
  - Shift significand bits to normalize
  - Report overflow or underflow if the exponent goes out of range
- 4. Rounding
49 FP Add/Subtract (P-H text, Figs. 3.16/3.17)
50 Example
- Subtraction: 0.5_ten - 0.4375_ten
- Step 0: floating-point numbers to be added
  - 1.000_two x 2^-1 and -1.110_two x 2^-2
- Step 1: the significand with the lesser exponent is shifted right until the exponents match
  - -1.110_two x 2^-2  ->  -0.111_two x 2^-1
- Step 2: add significands, 1.000_two + (-0.111_two)
  - Result is 0.001_two x 2^-1
51 Example (Continued)
- Step 3: normalize: 1.000_two x 2^-4
  - No overflow/underflow since 127 >= exponent >= -126
- Step 4: rounding: no change, since the sum fits in 4 bits
- 1.000_two x 2^-4 = 1/16 = 0.0625_ten
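A C sketch that replays the four steps on the example above, keeping each number as a small integer significand (in units of 2^-3) plus an exponent; guard/round bits and general rounding are omitted.

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        int m1 = 8,   e1 = -1;    /*  1.000_2 x 2^-1 = 0.5 (8 units of 2^-3) */
        int m2 = -14, e2 = -2;    /* -1.110_2 x 2^-2 = -0.4375               */

        /* Step 1: align - halve the significand with the smaller exponent  */
        while (e2 < e1) { m2 /= 2; e2++; }      /* -14 -> -7: -0.111_2 x 2^-1 */
        while (e1 < e2) { m1 /= 2; e1++; }

        /* Step 2: add significands                                          */
        int m = m1 + m2, e = e1;                /* 8 + (-7) = 1: 0.001_2 x 2^-1 */

        /* Step 3: normalize so the leading 1 sits in the 2^3 place of m     */
        while (m != 0 && -8 < m && m < 8) { m *= 2; e--; }   /* 1.000_2 x 2^-4 */

        /* Step 4: rounding - nothing to do, the result fits in 4 bits       */
        printf("%g\n", ldexp(m / 8.0, e));      /* prints 0.0625              */
        return 0;
    }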
52 FP Multiplication: Basic Idea
- (m1 x 2^e1) x (m2 x 2^e2) = (m1 x m2) x 2^(e1+e2)
- Separate signs
- Add exponents
- Multiply significands
- Normalize, round, check overflow
- Replace sign
53 FP Multiplication Algorithm
- P-H Figure 3.18
54 FP Mult. Illustration
- Multiply 0.5_ten and -0.4375_ten (answer: -0.21875_ten), i.e.,
- Multiply 1.000_two x 2^-1 and -1.110_two x 2^-2
- Step 1: add exponents
  - -1 + (-2) = -3
- Step 2: multiply significands
        1.000
      x 1.110
      -------
         0000
        1000
       1000
      1000
      -------
      1110000      Product is 1.110000
55 FP Mult. Illustration (Cont.)
- Step 3
  - Normalization: if necessary, shift the significand right and increment the exponent
  - Normalized product is 1.110000 x 2^-3
  - Check overflow/underflow: 127 >= exponent >= -126
- Step 4: rounding: 1.110 x 2^-3
- Step 5: sign: the operands have opposite signs, so
  - the product is -1.110 x 2^-3
  - Decimal value: -(1 + 0.5 + 0.25)/8 = -0.21875_ten
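The multiplication steps above in the same toy C representation (4-bit significands held as integers in units of 2^-3); rounding is trivial here because the product already fits.

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        int s1 = 0, m1 = 8,  e1 = -1;   /*  1.000_2 x 2^-1                      */
        int s2 = 1, m2 = 14, e2 = -2;   /* -1.110_2 x 2^-2 (sign kept separate) */

        int e = e1 + e2;                /* Step 1: add exponents: -3            */
        int m = m1 * m2;                /* Step 2: 8 x 14 = 112 = 1.110000_2 in units of 2^-6 */
        while (m >= 128) { m /= 2; e++; }   /* Step 3: normalize (not needed here) */
        int s = s1 ^ s2;                /* Step 5: sign of the product          */

        printf("%g\n", (s ? -1.0 : 1.0) * ldexp(m / 64.0, e));   /* -0.21875    */
        return 0;
    }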
56 FP Division: Basic Idea
- Separate sign
- Check for zeros and infinity
- Subtract exponents
- Divide significands
- Normalize, check overflow/underflow
- Round
- Replace sign
57 MIPS Floating Point
- 32 floating-point registers: $f0, ..., $f31
- FP registers are used in pairs for double precision: $f0 denotes the double-precision contents of $f0,$f1
- Data transfer instructions
  - lwc1 $f1, 100($s2)    # $f1 <- Mem[$s2 + 100]
  - swc1 $f1, 100($s2)    # Mem[$s2 + 100] <- $f1
- Arithmetic instructions (xxx = add, sub, mul, div)
  - xxx.s  single precision
  - xxx.d  double precision
58 Floating Point Complexities
- Operations are somewhat more complicated (see text)
- In addition to overflow we can have underflow
- Accuracy can be a big problem
  - IEEE 754 keeps two extra bits, guard and round
  - four rounding modes
  - positive divided by zero yields infinity
  - zero divided by zero yields "not a number"
  - other complexities
- Implementing the standard can be tricky
- Not using the standard can be even worse
  - see the text for a description of the 80x86 and the Pentium bug!
59 Chapter Three Summary
- Computer arithmetic is constrained by limited precision
- Bit patterns have no inherent meaning, but standards do exist
  - two's complement
  - IEEE 754 floating point
- Computer instructions determine the meaning of the bit patterns
- Performance and accuracy are important, so there are many complexities in real machines
- Algorithm choice is important and may lead to hardware optimizations for both space and time (e.g., multiplication)
- You may want to look back (Section 3.10 is great reading!)