Title: COMPUTER ARITHMETIC
1COMPUTER ARITHMETIC
- Jehan-François Pâris
- jparis_at_uh.edu
2Chapter Organization
- Representing negative numbers
- Integer addition and subtraction
- Integer multiplication and division
- Floating point operations
- Examples of implementation
- IBM 360, RISC, x86
3A warning
- Binary addition, subtraction, multiplication and
division are very easy
4ADDITION AND SUBTRACTION
5General concept
- Decimal addition
- (carry)    1_
-            19
-       +     7
-       =    26
- Binary addition
- (carry)  111_
-         10011
-       +   111
-       = 11010
- 16 + 8 + 2 = 26
6Realization
- Simplest solution is a battery of full adders
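[Figure: four full adders chained by their carries, with inputs x3 y3, x2 y2, x1 y1, x0 y0, sum outputs s3 s2 s1 s0, and overflow output o]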
7Observations
- Adder adds four-bit values
- Output o indicates if there is an overflow
- A result that cannot be represented using 4 bits
- Happens when x + y > 15
- Operation is slowed down by carry propagation
- Faster solutions (not discussed here)
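A battery of full adders can be sketched in a few lines. This is only an illustration: the names `full_adder` and `ripple_add` are hypothetical, and the loop models the carry rippling from bit 0 up to bit 3.

```python
def full_adder(x, y, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    total = x + y + carry_in
    return total & 1, total >> 1


def ripple_add(x, y, n=4):
    """Add two n-bit values with a chain of n full adders.

    Returns (sum, o) where o is the carry out of the last adder.
    """
    carry = 0
    result = 0
    for i in range(n):
        # Each adder must wait for the carry of the previous one.
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


print(ripple_add(0b1011, 0b0011))  # 11 + 3 fits in 4 bits
print(ripple_add(0b1110, 0b0011))  # 14 + 3 does not
```

The second call sets o because 14 + 3 is greater than 15.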
8Signed and unsigned additions
- Unsigned addition in 4-bit arithmetic
- (carry)   11_
-          1011
-        + 0011
-        = 1110
- 11 + 3 = 14 (8 + 4 + 2)
- Signed addition in 4-bit arithmetic
- (carry)   11_
-          1011
-        + 0011
-        = 1110
- -5 + 3 = -2
9Signed and unsigned additions
- Same rules apply even though bit strings represent different values
- Sole difference is overflow handling
10Overflow handling (I)
- No overflow in signed arithmetic
- (carry) 111__
-          1110
-        + 0011
-        = 0001
- -2 + 3 = 1 (correct)
- Overflow in signed 4-bit arithmetic
- (carry)  11__
-          0110
-        + 0011
-        = 1001
- 6 + 3 = ?? -7 (false)
11Overflow handling (II)
- In signed arithmetic an overflow happens when
- The sum of two positive numbers exceeds the maximum positive value that can be represented using n bits: $2^{n-1} - 1$
- The sum of two negative numbers falls below the minimum negative value that can be represented using n bits: $-2^{n-1}$
12Example
- Four-bit arithmetic
- Sixteen possible values
- Positive overflow happens when result > 7
- Negative overflow happens when result < -8
- Eight-bit arithmetic
- 256 possible values
- Positive overflow happens when result > 127
- Negative overflow happens when result < -128
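The overflow rules above can be checked with a small sketch. The function name `add_signed` is hypothetical; it wraps the sum to n bits, reinterprets it as a two's complement value, and reports whether the true sum fell outside the representable range.

```python
def add_signed(x, y, n=4):
    """Add two n-bit signed values; returns (wrapped_result, overflow)."""
    lowest = -(1 << (n - 1))       # -2^(n-1)
    highest = (1 << (n - 1)) - 1   # 2^(n-1) - 1
    raw = (x + y) & ((1 << n) - 1)  # keep only the n low-order bits
    # Reinterpret the bit string as a two's complement number.
    result = raw - (1 << n) if raw & (1 << (n - 1)) else raw
    overflow = not (lowest <= x + y <= highest)
    return result, overflow


print(add_signed(-2, 3))        # no overflow
print(add_signed(6, 3))         # positive overflow in 4 bits
print(add_signed(-100, -100, 8))  # negative overflow in 8 bits
```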
13Overflow handling (III)
- MIPS architecture handles signed and unsigned overflows in very different fashions
- Ignores unsigned overflows
- Implements modulo $2^{n}$ arithmetic
- Generates an interrupt whenever it detects a signed overflow
- Lets the OS handle the condition
14Why?
- To keep the CPU as simple and regular as possible
15An interesting consequence
- Most C compilers ignore overflows
- C compilers must use unsigned arithmetic for their integer operations
- Fortran compilers expect overflow conditions to be detected
- Fortran compilers must use signed arithmetic for their integer operations
16Subtraction
- Can be implemented by
- Specific hardware
- Negating the subtrahend
17Negating a number
- Toggle all bits then add one
18In 4-bit arithmetic (I)
x      value   toggled   toggled + 1   value
0000    0      1111      0000           0
0001    1      1110      1111          -1
0010    2      1101      1110          -2
0011    3      1100      1101          -3
0100    4      1011      1100          -4
0101    5      1010      1011          -5
0110    6      1001      1010          -6
0111    7      1000      1001          -7
19In 4-bit arithmetic (II)
x      value   toggled   toggled + 1   value
1000   -8      0111      1000           ?
1001   -7      0110      0111           7
1010   -6      0101      0110           6
1011   -5      0100      0101           5
1100   -4      0011      0100           4
1101   -3      0010      0011           3
1110   -2      0001      0010           2
1111   -1      0000      0001           1
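The "toggle all bits then add one" rule is short enough to try out directly. A minimal sketch, with `negate` as a hypothetical name and values handled as n-bit patterns:

```python
def negate(bits, n=4):
    """Two's complement negation of an n-bit pattern."""
    mask = (1 << n) - 1
    toggled = bits ^ mask        # toggle all bits
    return (toggled + 1) & mask  # add one, keep n bits


for pattern in (0b0000, 0b0101, 0b1000):
    print(format(pattern, "04b"), "->", format(negate(pattern), "04b"))
```

The pattern 1000 negates to itself, which is the "?" entry in the table: +8 cannot be represented in 4-bit signed arithmetic.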
20MULTIPLICATION
21Decimal multiplication
- What are the rules?
- Successively multiply the multiplicand by each digit of the multiplier, starting at the right and shifting the result left by an extra position each time but the first
- Sum all partial results
- (carry)    1_
-            37
-       x    12
-            74
-       +   370
-       =   444
22Binary multiplication
- What are the rules?
- Successively multiply the multiplicand by each digit of the multiplier, starting at the right and shifting the result left by an extra position each time but the first
- Sum all partial results
- Binary multiplication is easy!
- (carry)  1111___
-             1101
-         x    101
-             1101
-            00000
-         + 110100
-       = 1000001
23Binary multiplication table
X 0 1
0 0 0
1 0 1
24Algorithm
- Clear contents of 64-bit product register
- for (i = 0; i < 32; i++) {
- if (LSB of multiplier register == 1)
- Add contents of multiplicand register to product register
- Shift right one position multiplier register
- Shift left one position multiplicand register
- } // end of for loop
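The loop above translates almost line for line into Python. This is a sketch for unsigned operands only; `multiply_v1` is a hypothetical name and the register width n is a parameter so that the 4-bit examples can be replayed.

```python
def multiply_v1(multiplicand, multiplier, n=32):
    """Shift-and-add multiplication of two unsigned n-bit values."""
    product = 0                      # clear the 2n-bit product register
    for _ in range(n):
        if multiplier & 1:           # LSB of multiplier register is 1
            product += multiplicand
        multiplier >>= 1             # expose the next multiplier bit
        multiplicand <<= 1           # line up the next partial product
    return product & ((1 << (2 * n)) - 1)


print(multiply_v1(0b0011, 0b0011, n=4))  # 3 x 3
print(multiply_v1(0b1101, 0b101))        # 13 x 5
```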
25Multiplier First version
[Figure: multiplier datapath with a multiplier register, a 64-bit multiplicand register, a 64-bit ALU, a control unit, and a 64-bit product register]
26Multiplier First version
[Figure: same datapath with two annotations: "As we learned in grade school" and "To get next bit (LSB to MSB)"]
27Explanations
- Multiplicand register must be 64-bit wide because 32-bit multiplicand will be shifted 32 times to the left
- Requires a 64-bit ALU
- Product register must be 64-bit wide to accommodate the result
- Contents of multiplier register are shifted 32 times to the right so that each bit successively becomes its least significant bit (LSB)
28Example (I)
- Multiply 0011 by 0011
- Start: Multiplicand 0011, Multiplier 0011, Product 0000
- First addition: Multiplicand 0011, Multiplier 0011, Product 0011
29Example (II)
- Shift right and left: Multiplicand 0110, Multiplier 0001, Product 0011
- Second addition: Multiplicand 0110, Multiplier 0001, Product 1001
- 0110 + 0011 = 1001
30Example (III)
- Shift right and left: Multiplicand 1100, Multiplier 0000, Product 1001
- Multiplier is all zeroes: we are done
31First Optimization
- Must have a 64-bit ALU
- More complex than a 32-bit ALU
- Solution is not to shift the multiplicand
- After each cycle, the LSB being added remains unchanged
- Will save that bit elsewhere and shift the product register one position to the right after each iteration
32Binary multiplication
-             1101
-         x    101
-             1101
-            00000
-         + 110100
-       = 1000001
- Observe that the least significant bit added during each cycle remains unchanged
33Algorithm
- Clear contents of 64-bit product register
- for (i = 0; i < 32; i++) {
- if (LSB of multiplier register == 1)
- Add contents of multiplicand register to product register
- Save LSB of product register
- Shift right one position both multiplier register and product register
- } // end of for loop
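The second version can be sketched the same way. Here `product` plays the part of the product register that is added to and shifted right, and `result` collects the saved low-order bits; both names, like `multiply_v2`, are assumptions made for this illustration. Python integers absorb the carry that real hardware would have to keep in an extra bit.

```python
def multiply_v2(multiplicand, multiplier, n=32):
    """Shift-and-add multiplication that never shifts the multiplicand."""
    product = 0   # upper part of the answer, shifted right every cycle
    result = 0    # saved low-order bits, filled from bit 0 upward
    for i in range(n):
        if multiplier & 1:
            product += multiplicand
        result |= (product & 1) << i   # save LSB of product register
        product >>= 1
        multiplier >>= 1
    # Concatenate contents of product and result registers.
    return (product << n) | result


print(multiply_v2(0b0011, 0b0011, n=4))  # 3 x 3
print(multiply_v2(27, 12))
```

Unlike the worked examples, this sketch always runs all n cycles instead of stopping when the multiplier reaches zero; the concatenated value is the same.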
34Multiplier Second version
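[Figure: second multiplier datapath with a multiplier register, a multiplicand register, a 32-bit ALU, a control and test unit, and a 64-bit product register with "Shift Right and Save"]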
35Decimal Example (I)
- Multiply 27 by 12
- Start: Multiplicand 27, Multiplier 12, Product --, Result --
- First digit: Multiplicand 27, Multiplier 12, Product 54, Result --
36Decimal Example (II)
- Shift right multiplier and product: Multiplicand 27, Multiplier 1, Product 5, Result 4
- Second digit: Multiplicand 27, Multiplier 1, Product 32, Result 4
37Decimal Example (III)
- Shift right multiplier and product: Multiplicand 27, Multiplier 0, Product 3, Result 24
- Multiplier equals zero: result is obtained by concatenating contents of product and result registers
- 324
38How did it work?
- We learned
- 27 × 12 = 27 × 10 + 27 × 2 = 27 × 10 + 54 = 270 + 54
- Algorithm uses another decomposition
- 27 × 12 = 27 × 10 + 27 × 2 = 27 × 10 + 50 + 4 = (27 × 10 + 50) + 4 = 320 + 4
39Example (I)
- Multiply 0011 by 0011
- Start: Multiplicand 0011, Multiplier 0011, Product --, Result --
- First bit: Multiplicand 0011, Multiplier 0011, Product 0011, Result --
40Example (II)
- Shift right multiplier and product: Multiplicand 0011, Multiplier 0001, Product 0001, Result 1-
- Second bit: Multiplicand 0011, Multiplier 0001, Product 0100, Result 1-
- Product register contains 0011 + 0001 = 0100
41Example (III)
- Shift right multiplier and product: Multiplicand 0011, Multiplier 0000, Product 010, Result 01-
- Multiplier equals zero: result is obtained by concatenating contents of product and result registers
- 1001 = 9
42Second Optimization
- Both multiplier and product must be shifted one position to the right after each iteration
- Both are now 32-bit quantities
- Can store both quantities in the product register
43Multiplier Third version
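[Figure: third multiplier datapath with a multiplicand register, a control and test unit, a 32-bit ALU, and a single register holding both multiplier and product with "Shift Right and Save"]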
44Third Optimization
- Multiplication requires 32 additions and 32 shift operations
- Can have two or more partial multiplications
- One using bits 0-15 of multiplier
- A second using bits 16-31
- Then add together the partial results
45Multiplying negative numbers
- Can use the same algorithm as before but we must
extend the sign bit of the product
46 Related MIPS instructions (I)
- Integer multiplication uses a separate pair of registers (hi and lo)
- mult s0, s1
- multiply contents of register s0 by contents of register s1 and store results in register pair hi-lo
- multu s0, s1
- same but unsigned
47 Related MIPS instructions (II)
- mflo s0
- Move contents of register lo to register s0
- mfhi s0
- Move contents of register hi to register s0
48DIVISION
49Division
- Implemented by successive subtractions
- Result must verify the equality
- Dividend = Divisor × Quotient + Remainder
50Decimal division (long division)
- What are the rules?
- Repeatedly try to subtract a smaller multiple of the divisor from the dividend
- Record multiple (or zero)
- At each step, repeat with a lower power of ten
- Stop when remainder is smaller than divisor
      303
7 ) 2126
   -21
     026
     -21
       5
51Binary division
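- Divide 1011 by 11
- Try 1011 - 1100: fails, mark 0
- Try 1011 - 110 = 101: succeeds, mark 1
- Try 101 - 11 = 10: succeeds, mark 1
- Quotient is 011 and remainder is 10
- What are the rules?
- Repeatedly try to subtract power-of-two multiples of the divisor from the dividend
- Mark 1 for success, 0 for failure
- At each step, shift divisor one position to the right
- Stop when remainder is smaller than divisor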
52Same division in decimal
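- Divide 11 by 3
- Try 11 - 12: fails, mark 0
- Try 11 - 6 = 5: succeeds, mark 1
- Try 5 - 3 = 2: succeeds, mark 1
- Quotient is 2 + 1 = 3 and remainder is 2
- What are the rules?
- Repeatedly try to subtract power-of-two multiples of the divisor from the dividend
- Mark 1 for success, 0 for failure
- At each step, shift divisor one position to the right
- Stop when remainder is smaller than divisor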
53Observations
- Binary division is actually simpler
- We start with a left-shifted version of divisor
- We try to subtract it from dividend
- No need to find out which multiple to subtract
- We mark 1 for success, 0 for failure
- We shift divisor one position right after every attempt
54How to start the division
- One 64-bit register for successive remainders
- Initialized with dividend
- One 64-bit register for divisor
- Start with divisor in upper half
- One 32-bit register for the quotient
- All zeroes
55How we proceed (I)
- After each step we shift the divisor to the right one position at a time
56How we proceed (II)
- After each step we shift the contents of the quotient register one position to the left
- To make space for the new 0 or 1 being inserted
57 Division Algorithm
- For i in range(0, 33), that is, for i from 0 to 32
- Subtract contents of divisor register from remainder register
- If remainder ≥ 0
- Shift quotient register to the left
- Set new rightmost bit to 1
- Else
- Undo subtraction
- Shift quotient register to the left
- Set new rightmost bit to 0
- Shift right one position contents of divisor register
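The division algorithm can be replayed with the sketch below. It assumes unsigned operands and a non-zero divisor; `divide` is a hypothetical name and the width n is a parameter so that the 4-bit example can be checked.

```python
def divide(dividend, divisor, n=32):
    """Restoring division of two unsigned n-bit values.

    Returns (quotient, remainder).
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    remainder = dividend          # remainder register starts with dividend
    divisor_reg = divisor << n    # divisor starts in the upper half
    quotient = 0
    for _ in range(n + 1):        # n + 1 steps, as in range(0, 33)
        remainder -= divisor_reg
        if remainder >= 0:
            quotient = (quotient << 1) | 1   # success: insert a 1
        else:
            remainder += divisor_reg         # undo subtraction
            quotient = quotient << 1         # failure: insert a 0
        divisor_reg >>= 1
    return quotient, remainder


print(divide(0b1011, 0b11, n=4))  # 11 / 3
print(divide(2126, 7))
```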
58A simple divider
[Figure: divider datapath with a quotient register, a 64-bit divisor register, a 64-bit ALU, a control and test unit, and a 64-bit remainder register]
59Signed division
- Easiest solution is to remember the sign of the operands and adjust the sign of the quotient and remainder accordingly
- A little problem
- 5 ÷ 2 = 2 and the remainder is 1
- -5 ÷ 2 = -2 and the remainder is -1
- The sign of the remainder must match the sign of the dividend
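A sketch of this sign handling follows. It divides the magnitudes and then fixes the signs; `signed_divide` is a hypothetical name. Python's own `//` and `%` operators round toward negative infinity, so they are applied to the absolute values only.

```python
def signed_divide(dividend, divisor):
    """Signed division that truncates toward zero.

    Returns (quotient, remainder) such that
    dividend == divisor * quotient + remainder.
    """
    quotient = abs(dividend) // abs(divisor)
    remainder = abs(dividend) % abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient      # operands of opposite signs
    if dividend < 0:
        remainder = -remainder    # remainder takes the sign of the dividend
    return quotient, remainder


for a, b in ((5, 2), (-5, 2), (5, -2), (-5, -2)):
    print(a, b, signed_divide(a, b))
```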
60Related MIPS instructions
- Integer division uses the same pair of registers (hi and lo) as integer multiplication
- div s0, s1
- divide contents of register s0 by contents of register s1, leave the quotient in register lo and the remainder in register hi
- divu s0, s1
- same but unsigned
61TRANSITION SLIDE
- Here end the materials that were on the first fall 2012 midterm
- Here start the materials that will be on the fall 2012 midterm
To be moved to the right place
62FLOATING POINT OPERATIONS
63Floating point numbers
- Used to represent real numbers
- Very similar to scientific notation
- $3.5 \times 10^{6}$, $0.82 \times 10^{-5}$, $75 \times 10^{6}$, ...
- Both decimal numbers in scientific notation and floating point numbers can be normalized
- $3.5 \times 10^{6}$, $8.2 \times 10^{-6}$, $7.5 \times 10^{7}$, ...
64Fractional binary numbers
- 0.1 is ½ or $0.5_{\text{ten}}$
- 0.01 is ¼ or $0.25_{\text{ten}}$
- 0.11 is ½ + ¼ = ¾ or $0.75_{\text{ten}}$
- 1.1 is 1½ or $1.5_{\text{ten}}$
- 10.01 is 2¼ or $2.25_{\text{ten}}$
- 11.11 is ______ or _____
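To check conversions like these, each bit after the binary point can be weighted by a negative power of two. A small sketch, with `binary_to_decimal` as a hypothetical helper that takes the number as a string:

```python
def binary_to_decimal(text):
    """Convert a binary string such as '10.01' to its decimal value."""
    whole, _, fraction = text.partition(".")
    value = int(whole, 2) if whole else 0
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) / 2 ** position   # bit weight is 2^(-position)
    return value


for text in ("0.1", "0.01", "0.11", "1.1", "10.01"):
    print(text, "=", binary_to_decimal(text))
```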
65Normalizing binary numbers
- 0.1 becomes $1.0 \times 2^{-1}$
- 0.01 becomes $1.0 \times 2^{-2}$
- 0.11 becomes $1.1 \times 2^{-1}$
- 1.1 is already normalized and equal to $1.1 \times 2^{0}$
- 10.01 becomes $1.001 \times 2^{1}$
- 11.11 becomes 1.______ × 2^_____
66Representation
- Sign, exponent, coefficient
- IEEE Standard 754
- 1 + 8 + 23 = 32 bits (single precision)
- 1 + 11 + 52 = 64 bits (double precision)
67The sign bit
- 0 indicates a positive number
- 1 a negative number
68The exponent (I)
- 8 bits for single precision
- 11 bits for double precision
- With 8 bits, we can represent exponents between -126 and 127
- All-zeroes value is reserved for the zeroes and denormalized numbers
- All-ones value is reserved for the infinities and NaNs (Not a Number)
69The exponent (II)
- Exponents are represented using a biased notation
- Stored value = actual exponent + bias
- For 8-bit exponents, bias is 127
- Stored value of 1 corresponds to -126
- Stored value of 254 corresponds to 127
- 0 and 255 are reserved for special values
70The exponent (III)
- Biased notation simplifies comparisons
- If two normalized floating point numbers have
different exponents, the one with the bigger
exponent is the bigger of the two
71Special values (I)
- Signed zeroes
- IEEE 754 distinguishes between +0 and -0
- Represented by
- Sign bit 0 or 1
- Biased exponent all zeroes
- Coefficient all zeroes
72Special values (II)
- Denormalized numbers
- Numbers whose coefficient cannot be normalized
- Smaller than $2^{-126}$
- Will have a coefficient with leading zeroes and exponent field equal to zero
- Reduces the number of significant digits
- Lowers accuracy
73Special values (III)
- Infinities
- +∞ and -∞
- Represented by
- Sign bit 0 or 1
- Biased exponent all ones
- Coefficient all zeroes
74Special values (IV)
- NaN
- For Not a Number
- Often result from illegal divisions: 0/0, ∞/∞, ∞/-∞, -∞/∞, and -∞/-∞
- Represented by
- Sign bit 0 or 1
- Biased exponent all ones
- Coefficient non zero
75The coefficient
- Also known as fraction or significand
- Most significant bit is always one
- Implicit and not represented
- Biased exponent is 127ten
- True coefficient is implicit one followed by all
zeroes
76Decoding a floating point number
- Sign indicated by first bit
- Subtract 127 from biased exponent <be> to obtain power of two: <be> - 127
- Use coefficient to construct a normalized binary value with a binary point: 1.<coefficient>
- Number being represented is $1.\langle\text{coefficient}\rangle \times 2^{\langle\text{be}\rangle - 127}$
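The decoding steps can be sketched as follows for normalized single-precision numbers. The function names are hypothetical; `hardware_decode` uses Python's standard `struct` module only as a cross-check. Special values (biased exponent 0 or 255) are deliberately rejected rather than handled.

```python
import struct


def decode_single(bits):
    """Decode a 32-bit pattern holding a normalized IEEE 754 single."""
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    coefficient = bits & 0x7FFFFF
    if biased_exponent in (0, 255):
        raise ValueError("zero, denormalized, infinity or NaN")
    significand = 1 + coefficient / 2 ** 23        # 1.<coefficient>
    value = significand * 2.0 ** (biased_exponent - 127)
    return -value if sign else value


def hardware_decode(bits):
    """Same conversion done by the struct module, for comparison."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]


for pattern in (0x3F800000, 0x40400000, 0xBF600000):
    print(hex(pattern), decode_single(pattern), hardware_decode(pattern))
```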
77 First example
- Sign bit is zero: number is positive
- Biased exponent is 127: power of two is zero
- Normalized binary value is 1.0000000
- Number is $1 \times 2^{0} = 1$
78 Second example
- Sign bit is zero: number is positive
- Biased exponent is 128: power of two is 1
- Normalized binary value is 1.1000000
- Number is $1.1_{\text{two}} \times 2^{1} = 11_{\text{two}} = 3_{\text{ten}}$
79 Third example
- Sign bit is 1: number is negative
- Biased exponent is 126: power of two is -1
- Normalized binary value is 1.1100000
- Number is $-1.11_{\text{two}} \times 2^{-1} = -0.111_{\text{two}} = -7/8_{\text{ten}}$
80 Can we do it now?
- Sign bit is 0 Number is ___________
- Biased exponent is 129 Power of two is _______
- Normalized binary value is 1.__________
- Number is _________________________
81Encoding a floating point number
- Use sign to pick sign bit
- Normalize the number: convert it to form $1.\langle\text{more bits}\rangle \times 2^{\langle\text{exp}\rangle}$
- Add 127 to exponent <exp> to obtain biased exponent <be>
- Coefficient <coeff> is equal to fractional part <more bits> of number
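The encoding steps can be sketched as below for numbers in the normalized single-precision range. `encode_single` is a hypothetical name; it relies on `math.frexp`, which returns a mantissa m with 0.5 ≤ m < 1, so the exponent is adjusted by one to obtain the 1.<more bits> form. Zero, denormalized numbers, infinities and NaNs are rejected.

```python
import math


def encode_single(x):
    """Return (sign, biased_exponent, coefficient) for a normalized single."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("special values are not handled in this sketch")
    sign = 1 if x < 0 else 0
    mantissa, exp = math.frexp(abs(x))   # abs(x) == mantissa * 2**exp
    exponent = exp - 1                   # so that abs(x) == 1.f * 2**exponent
    coefficient = round((mantissa * 2 - 1) * 2 ** 23)
    if coefficient == 2 ** 23:           # rounding carried into the next power
        coefficient = 0
        exponent += 1
    biased_exponent = exponent + 127
    if not 1 <= biased_exponent <= 254:
        raise ValueError("outside the normalized single-precision range")
    return sign, biased_exponent, coefficient


for x in (7.0, 0.5, -2.0, 2.25):
    s, be, coeff = encode_single(x)
    print(x, s, format(be, "08b"), format(coeff, "023b"))
```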
82 First example
- Represent 7
- Convert to binary 111
- Normalize: $1.11 \times 2^{2}$
- Sign bit is 0
- Biased exponent is $127 + 2 = 10000001_{\text{two}}$
- Coefficient is 110...0
83 Second example
- Represent 1/2
- Convert to binary 0.1
- Normalize: $1.0 \times 2^{-1}$
- Sign bit is 0
- Biased exponent is $127 - 1 = 01111110_{\text{two}}$
- Coefficient is 00...0
84 Third example
- Represent -2
- Convert to binary: -10
- Normalize: $-1.0 \times 2^{1}$
- Sign bit is 1
- Biased exponent is $127 + 1 = 10000000_{\text{two}}$
- Coefficient is 00...0
85Fourth example
- Represent 9/4
- Convert to binary: $1001 \times 2^{-2}$
- Normalize: $1.001 \times 2^{1}$
- Sign bit is 0
- Biased exponent is $127 + 1 = 10000000_{\text{two}}$
- Coefficient is 0010...0
86Can we do it now?
- Represent 6.25
- Convert to binary ________
- Normalize: 1.______ × 2^_______
- Sign bit is _____
- Biased exponent is 127 + ___ = ______ten
- Coefficient is_________
87Range
- Can represent numbers between $1.000\ldots \times 2^{-126}$ and $1.111\ldots \times 2^{127}$
- Say between $2^{-126}$ and $2^{128}$
- Observing that $2^{10} \approx 10^{3}$, we divide the exponents by 10 and multiply them by 3 to obtain the interval expressed in powers of 10
- Approximate range is $10^{-38}$ to $10^{38}$
88Accuracy
- We have 24 significant bits
- Theoretical precision of $1/2^{24}$, that is, roughly $1/10^{7}$
- Cannot add correctly billions or trillions
- Actual situation is worse if we do too many computations
- 1,000,000 999,999.4875 ???
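The effect of having only 24 significant bits can be observed by rounding values to single precision. The helper `to_single` below is hypothetical; it uses the standard `struct` module to store a Python float (a double) in four bytes and read it back.

```python
import struct


def to_single(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]


print(to_single(2.0 ** 24 + 1))   # the added 1 is lost
print(to_single(1e9 + 1))         # same problem with billions
print(to_single(999999.4875))     # stored as a nearby value
```

Values beyond $2^{24}$ can no longer be counted one by one, and 999,999.4875 is stored as 999,999.5.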
89Guard bits
- Do all arithmetic operations with two additional
bits to reduce rounding errors
90Double precision arithmetic (I)
- Use 64-bit double words
- Allows us to have
- One bit for sign
- Eleven bits for exponent
- 2,048 possible values
- Fifty-two bits for coefficient
- Plus the implicit leading bit
91Double precision arithmetic (II)
- Exponents are still represented using a biased notation
- Stored value = actual exponent + bias
- For 11-bit exponents, bias is 1023
- Stored value of 1 corresponds to -1,022
- Stored value of 2,046 corresponds to 1,023
- Stored values of 0 and 2,047 are reserved for special cases
92Double precision arithmetic (III)
- Can now represent numbers between $1.000\ldots \times 2^{-1022}$ and $1.111\ldots \times 2^{1023}$
- Say between $2^{-1022}$ and $2^{1024}$
- Approximate range is $10^{-307}$ to $10^{307}$
- In reality, more like $10^{-308}$ to $10^{308}$
93Double precision arithmetic (IV)
- We now have 53 significant bits
- Theoretical precision of $1/2^{53}$, that is, roughly $1/10^{16}$
- Can now add correctly billions or trillions
94If that is not enough,
- Can use 128-bit quad words
- Allows us to have
- One bit for sign
- Fifteen bits for exponent
- From -16382 to 16383
- One hundred twelve bits for coefficient
- Plus the implicit leading bit
95Decimal floating point addition (I)
- $5.25 \times 10^{3} + 1.22 \times 10^{2} = ?$
- Denormalize number with smaller exponent: $5.25 \times 10^{3} + 0.122 \times 10^{3}$
- Add the numbers: $5.25 \times 10^{3} + 0.122 \times 10^{3} = 5.372 \times 10^{3}$
- Result is normalized
96Decimal floating point addition (II)
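- $9.25 \times 10^{3} + 8.22 \times 10^{2} = ?$
- Denormalize number with smaller exponent: $9.25 \times 10^{3} + 0.822 \times 10^{3}$
- Add the numbers: $9.25 \times 10^{3} + 0.822 \times 10^{3} = 10.072 \times 10^{3}$
- Normalize the result: $10.072 \times 10^{3} = 1.0072 \times 10^{4}$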
97Binary floating point addition (I)
- Say 1001 + 10 or $1.001 \times 2^{3} + 1.0 \times 2^{1}$
- Denormalize number with smaller exponent: $1.001 \times 2^{3} + 0.01 \times 2^{3}$
- Add the numbers: $1.001 \times 2^{3} + 0.01 \times 2^{3} = 1.011 \times 2^{3}$
- Result is normalized
98Binary floating point addition (II)
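- Say 101 + 11 or $1.01 \times 2^{2} + 1.1 \times 2^{1}$
- Denormalize number with smaller exponent: $1.01 \times 2^{2} + 0.11 \times 2^{2}$
- Add the numbers: $1.01 \times 2^{2} + 0.11 \times 2^{2} = 10.00 \times 2^{2}$
- Normalize the result: $10.00 \times 2^{2} = 1.000 \times 2^{3}$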
99Binary floating point subtraction
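- Say 101 - 11 or $1.01 \times 2^{2} - 1.1 \times 2^{1}$
- Denormalize number with smaller exponent: $1.01 \times 2^{2} - 0.11 \times 2^{2}$
- Perform the subtraction: $1.01 \times 2^{2} - 0.11 \times 2^{2} = 0.10 \times 2^{2}$
- Normalize the result: $0.10 \times 2^{2} = 1.0 \times 2^{1}$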
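The align, add, normalize sequence can be sketched with exact fractions. In this illustration a number is a hypothetical (coefficient, exponent) pair standing for coefficient × 2^exponent; rounding to a fixed number of bits is left out, so only the alignment and normalization steps are shown. Subtraction is addition of a negated coefficient.

```python
from fractions import Fraction


def fp_add(a, b):
    """Add two binary floating point numbers given as (coefficient, exponent)."""
    (ca, ea), (cb, eb) = a, b
    if ea < eb:                        # make a the number with larger exponent
        (ca, ea), (cb, eb) = (cb, eb), (ca, ea)
    cb = Fraction(cb) / 2 ** (ea - eb)  # denormalize the smaller number
    coefficient = Fraction(ca) + cb
    exponent = ea
    if coefficient == 0:
        return Fraction(0), 0
    while abs(coefficient) >= 2:       # renormalize: 10.00 -> 1.000
        coefficient /= 2
        exponent += 1
    while abs(coefficient) < 1:        # renormalize: 0.10 -> 1.0
        coefficient *= 2
        exponent -= 1
    return coefficient, exponent


five = (Fraction(5, 4), 2)    # 1.01 x 2^2
three = (Fraction(3, 2), 1)   # 1.1 x 2^1
print(fp_add(five, three))                 # 101 + 11
print(fp_add(five, (-three[0], three[1])))  # 101 - 11
```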
100Decimal floating point multiplication
- Exponent of product is the sum of the exponents of multiplicand and multiplier
- Coefficient of product is the product of the coefficients of multiplicand and multiplier
- Compute sign using usual rules of arithmetic
- May have to renormalize the product
101Decimal floating point multiplication
- $6 \times 10^{3} \times 2.5 \times 10^{2} = ?$
- Exponent of product is 3 + 2 = 5
- Multiply the coefficients: 6 × 2.5 = 15
- Result will be positive
- Normalize the result: $15 \times 10^{5} = 1.5 \times 10^{6}$
102Binary floating point multiplication
- Exponent of product is the sum of the exponents of multiplicand and multiplier
- Coefficient of product is the product of the coefficients of multiplicand and multiplier
- Compute sign using usual rules of arithmetic
- May have to renormalize the product
103Binary floating point multiplication
- Say 110 × 11 or $1.1 \times 2^{2} \times 1.1 \times 2^{1}$
- Exponent of product is 2 + 1 = 3
- Multiply the coefficients: 1.1 × 1.1 = 10.01
- Result will be positive
- Normalize the result: $10.01 \times 2^{3} = 1.001 \times 2^{4}$
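The multiplication rules can be sketched in the same spirit. A number is again a hypothetical (coefficient, exponent) pair with a coefficient in [1, 2); exact fractions are used, so rounding of the product is not modeled.

```python
from fractions import Fraction


def fp_mul(a, b):
    """Multiply two binary floating point numbers given as (coefficient, exponent)."""
    (ca, ea), (cb, eb) = a, b
    exponent = ea + eb                       # sum of the exponents
    coefficient = Fraction(ca) * Fraction(cb)  # product of the coefficients
    if coefficient == 0:
        return Fraction(0), 0
    while abs(coefficient) >= 2:             # renormalize: 10.01 -> 1.001
        coefficient /= 2
        exponent += 1
    return coefficient, exponent


six = (Fraction(3, 2), 2)     # 1.1 x 2^2
three = (Fraction(3, 2), 1)   # 1.1 x 2^1
print(fp_mul(six, three))     # 110 x 11
```

The coefficient 9/8 printed for this call is 1.001 in binary, with exponent 4.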
104FP division
- Very tricky
- One good solution is to multiply the dividend by
the inverse of the divisor
105 A trap
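- Addition is not necessarily associative
- Take $-9 \times 10^{37}$, $9 \times 10^{37}$ and $4 \times 10^{-37}$
- Observe that
- $(-9 \times 10^{37} + 9 \times 10^{37}) + 4 \times 10^{-37} = 4 \times 10^{-37}$
- while
- $-9 \times 10^{37} + (9 \times 10^{37} + 4 \times 10^{-37}) = 0$
- due to the limited accuracy of FP numbers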
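This trap is easy to reproduce. The sketch below uses ordinary Python floats, which are double precision; the gap between the two magnitudes is wide enough that the small term is lost in double precision as well.

```python
big = 9e37
small = 4e-37

left = (-big + big) + small    # the big terms cancel first
right = -big + (big + small)   # the small term is absorbed first

print(left)
print(right)
print(left == right)
```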
106 IMPLEMENTATIONS
107The floating-point unit (I)
- Floating-point instructions were an optional feature
- User had to buy a separate floating-point unit, aka floating-point coprocessor
- Before Intel 80486, all Intel x86 architectures had the option to install a separate floating-point chip (8087, 80287, 80387)
108The floating-point unit (II)
- Default solution was to simulate the missing floating-point instructions through assembly routines
- As a result, many processor architectures use separate banks of registers for integer arithmetic and floating-point arithmetic
109The floating-point unit (III)
- Some older architectures implemented
- Single-precision operations in hardware through the FPU
- Double-precision operations by software
- Made double-precision operations much costlier than single-precision operations
110IBM 360 FP INSTRUCTIONS
111Overview
- FPU offers a very familiar user interface
- Eight general purpose FP registers
- Distinct from the integer registers
- Two-operand instructions in both RR and RX formats
- Includes single-precision and double-precision versions of addition, subtraction, multiplication and division
112Examples of RR instructions
- AFR f1, f2: add contents of floating-point register f2 into f1
- ADR f1, f2: add contents of double-precision register f2 into f1
- LFR f1, f2: load contents of floating-point register f2 into f1
- Also had load positive, load negative, load complement instructions for floating-point and double-precision operands
113Examples of RX instructions
- AF r1, d(r2): add contents of word at address d + contents(r2) into register r1
- AD r1, d(r2)
114MIPS FP INSTRUCTIONS
115Overview
- Thirty-two specialized single-precision registers: f0, f1, ..., f31
- Each pair of single-precision registers forms a double-precision register
- .s instructions apply to single precision format
- .d instructions apply to double precision format
- Most instructions are in the R format
116R-format instructions (I)
- add.s f1, f2, f3: f1 = f2 + f3 (single precision)
- add.d f2, f4, f6: (f2, f2+1) = (f4, f4+1) + (f6, f6+1) (double precision applies to register pairs)
- sub.s f1, f2, f3: f1 = f2 - f3 (single precision)
- sub.d f2, f4, f6 (double precision)
- mul.s f1, f2, f3: f1 = f2 × f3 (single precision)
- mul.d f2, f4, f6 (double precision)
117R-format instructions (II)
- div.s f1, f2, f3: f1 = f2 / f3 (single precision)
- div.d f2, f4, f6 (double precision)
- c.x.s f1, f2: FP condition = (f1 x f2) ? 1 : 0, where x can be equal, not equal, less than, less than or equal, greater than, greater than or equal
- c.x.d f2, f4 (double precision)
118I-format instructions (I)
- bc1t a: jump to address computed by adding 4 × a to the current value of the PC if the FP condition is true
- bc1f a: jump to address computed by adding 4 × a to the current value of the PC if the FP condition is false
119I-format instructions (II)
- lwc1 f1, a(r1): load floating-point word at address a + contents(r1) into f1
- ldc1 f2, a(r1) (double precision)
- swc1 f1, a(r1): store floating-point value in f1 into word at address a + contents(r1)
- sdc1 f2, a(r1) (double precision)
The "c" in the opcodes stands for coprocessor!
120x86 FP INSTRUCTIONS
121Overview
- Original x86 FP coprocessor had a stack architecture
- Stack registers were 80-bit wide as well as all internal registers
- Better accuracy
- Provided single and double precision operations
122Stack operations (I)
- Three types of operations
- Loads store an operand on the top of the stack
- Arithmetic and comparison operations find two operands on the top of the stack and replace them by the result of the operation
- Stores move the top of stack register into memory
123Example
- a = b + c
- Load b on top of stack
- Load c on top of stack
- Add c to b
- Store result into a
[Figure: contents of the register stack after each step]
124Stack operations (II)
- Instruction set also allowed
- Operations on top of stack register and the ith register below
- Immediate operands
- Operations on top of stack register and a memory location
- Poor performance of FP unit architecture motivated an extension to the x86 instruction set
125Intel SSE2 FP Architecture (I)
- SSE2 Extension (2001) provided 8 floating point registers
- Could hold either single precision or double precision values
- Number extended to 16 by AMD, followed by Intel
126Intel SSE2 FP Architecture (II)
- Registers are now 128-bit wide
- Can hold
- One quad precision value
- Two double precision values
- Four single precision values
- Can perform same operation in parallel on all
single/double precision values stored in the
same register
Wow!
127REVIEW QUESTIONS
128Review questions
- How would you represent 0.5 in double precision?
- How would you convert this double-precision value into a single precision format?
- When doing accounting, we could do all the computations in cents using integer arithmetic. What would we win? What would we lose?
129Solutions
- How would you represent 0.5 in double precision?
- Normalized representation: $1.0 \times 2^{-1}$
- Sign: 0
- Biased exponent: 1023 - 1 = 1022
- Coefficient All zeroes
- Because the 1 is implicit
130Solutions
- How would you convert this double-precision value into a single precision format?
- Same normalized representation: $1.0 \times 2^{-1}$
- Same sign: 0
- New biased exponent: 127 - 1 = 126
- Same coefficient All zeroes
- Because the 1 is implicit
131Solutions
- When doing accounting, we could do all the computations in cents using integer arithmetic. What would we win? What would we lose?
- Big plus
- The results would be exact
- Big minus
- Could not handle numbers bigger than 20,000,000 in 32-bit signed arithmetic
132Why 20,000,000?
- 32-bit unsigned arithmetic can represent numbers from 0 to $2^{32} - 1$
- 32-bit signed arithmetic can represent numbers from $-2^{31}$ to $2^{31} - 1$
- Roughly from -2,000,000,000 to 2,000,000,000
- Must divide by 100 as we were using cents!
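The exact limit behind the rounded figure of 20,000,000 can be computed directly. A short sketch, with variable names chosen for this illustration:

```python
INT32_MAX = 2 ** 31 - 1   # largest 32-bit signed value

# Amounts are stored in cents, so divide by 100 to get whole units.
max_units, leftover_cents = divmod(INT32_MAX, 100)

print(INT32_MAX)
print(max_units, leftover_cents)
```

The largest amount is a little over 21 million, which the slides round down to 20,000,000.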
133TRANSITION SLIDE
- Here end the materials that were on the first
fall 2012 midterm