Title: EE 319K Introduction to Embedded Systems
1 EE 319K Introduction to Embedded Systems
- Lecture 14: Gaming Engines, Coding Style, Floating Point
2 Agenda
- Recap
  - Software design
  - 2-D arrays, structs
  - Bitmaps, sprites
  - Lab 10
- Agenda
  - Gaming engine design
  - Coding style
  - Floating point
3 Numbers
- Integers (Z): universe is infinite but discrete
  - No fractions
  - No numbers between 5 and 6
  - A countable (finite) number of items in a finite range
- Real numbers (R): universe is infinite and continuous
  - Fractions represented by decimal notation
  - Rational numbers, e.g., 5/2 = 2.5
  - Irrational numbers, e.g., π = 3.14159265 . . . (approximately 22/7)
  - An infinity of numbers exists even in the smallest range
(Adapted from V. Agrawal)
4 Number Representation
- Integers
  - Fixed-width integer number
- Reals
  - Fixed-point number ≈ I × Δ
    - Store I, but Δ is fixed
    - Decimal fixed-point (Δ = 10^m): I × 10^m
    - Binary fixed-point (Δ = 2^m): I × 2^m
  - Floating-point number ≈ I × B^E
    - Store both I and E (only B is fixed)
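The fixed-point idea above (store only the integer I, keep the resolution fixed in the software) can be sketched in C. This is a minimal illustration, assuming a decimal resolution of 10^-2; the type name `fix2_t` and the functions `Fix2_Add` and `Fix2_Mul` are hypothetical names chosen for this example.

```c
#include <stdint.h>

/* Decimal fixed-point with assumed resolution 10^-2: value = I * 0.01.
   Only the integer I is stored; the resolution lives in the code. */
typedef int32_t fix2_t;   /* hypothetical type name */

/* Same resolution on both operands, so adding the integers is enough. */
fix2_t Fix2_Add(fix2_t a, fix2_t b) {
  return a + b;
}

/* (a*0.01)*(b*0.01) = (a*b)*0.0001, so divide by 100 to get back
   to a resolution of 0.01. The 64-bit product avoids overflow. */
fix2_t Fix2_Mul(fix2_t a, fix2_t b) {
  return (fix2_t)(((int64_t)a * b) / 100);
}
```

For example, 2.50 is stored as 250 and 1.50 as 150; `Fix2_Mul(250, 150)` gives 375, which stands for 3.75.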
5 Wide Range of Real Numbers
- A large number
  - 976,000,000,000,000 = 9.76 × 10^14
- A small number
  - 0.0000000000000976 = 9.76 × 10^-14
- No fixed Δ can represent both
  - Not representable in a single fixed-point format
(Adapted from V. Agrawal)
6 Floating Point Numbers
- Decimal scientific notation
  - 0.513 × 10^5, 5.13 × 10^4 and 51.3 × 10^3
  - 5.13 × 10^4 is in normalized scientific notation
- Binary floating point numbers
  - Base B = 2
  - Binary point
    - Multiplication by 2 moves the point one place to the right
  - Normalized scientific notation, e.g., 1.0₂ × 2^-1
  - Known as floating point numbers
(Adapted from V. Agrawal)
7 Normalizing Numbers
- In scientific notation, we generally choose one digit to the left of the decimal point
  - 13.25 × 10^10 becomes 1.325 × 10^11
- Normalizing means
  - Shifting the decimal point until we have the right number of digits to its left
    - Normally one
  - Adding to or subtracting from the exponent to reflect the shift
(Adapted from V. Agrawal)
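The same normalization applies in base 2. As a rough sketch, the standard C function `frexp` splits a number into a fraction in [0.5, 1) and a power of two; shifting that by one place gives the "one digit to the left of the point" form. The wrapper name `Normalize2` is hypothetical, and the sketch assumes a nonzero, finite input.

```c
#include <math.h>

/* Split x into a significand in [1, 2) and a power-of-two exponent,
   so that x = significand * 2^(*exp). Assumes x is nonzero and finite. */
double Normalize2(double x, int *exp) {
  int e;
  double m = frexp(x, &e);  /* m in [0.5, 1), x = m * 2^e */
  *exp = e - 1;             /* moving the point one place right ...   */
  return m * 2.0;           /* ... means subtracting 1 from exponent  */
}
```

With x = 100.0 this yields a significand of 1.5625 and an exponent of 6, since 100 = 1.5625 × 2^6.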
8 Floating Point Numbers
- General format
  - ±1.bbbbb₂ × 2^eeee
  - or (-1)^S × (1 + F) × 2^E
- Where
  - S = sign, 0 for positive, 1 for negative
  - F = fraction (or mantissa) as a binary integer; 1 + F is called the significand
  - E = exponent as a binary integer, positive or negative (two's complement)
(Adapted from V. Agrawal)
9 ANSI/IEEE Std 754-1985
- Single-precision float format
  - Bit 31: mantissa sign, s = 0 for positive, s = 1 for negative
  - Bits 30 to 23: 8-bit biased binary exponent, 0 ≤ e ≤ 255
  - Bits 22 to 0: 24-bit mantissa, m, expressed as a binary fraction. A binary 1 as the most significant bit is implied: m = 1.m1m2m3...m23
(Adapted from V. Agrawal)
10 IEEE 754 Floating Point Standard
- Biased exponent: true exponent range [-127, 128] changed to [0, 255]
  - Biased exponent is an 8-bit unsigned binary integer
  - True exponent obtained by subtracting 127₁₀ or 01111111₂
  - 255 is a special case
- First bit of significand is always 1
  - ±1.bbbb . . . b × 2^E
  - The 1 before the binary point is implicitly assumed
  - So we don't need to include it; just assume it's there!
  - Significand field is a 23-bit fraction after the binary point
  - Significand range is [1, 2)
- Standard formats
  - Single precision: 8 (E) + 23 (F) + 1 (S) = 32 bits (float)
  - Double precision: 11 (E) + 52 (F) + 1 (S) = 64 bits (double)
(Adapted from V. Agrawal)
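To make the single-precision layout concrete, the three fields can be pulled out of a `float` with shifts and masks. This is a sketch that assumes the compiler's `float` is IEEE 754 single precision (true on most platforms, but not guaranteed by the C standard); `FloatFields` and `Float_Unpack` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
  uint32_t sign;      /* bit 31 */
  uint32_t exponent;  /* bits 30 to 23, biased by 127 */
  uint32_t fraction;  /* bits 22 to 0, implied leading 1 not stored */
} FloatFields;        /* hypothetical type name */

/* Copy the bits of x into an integer, then split out S, E and F.
   memcpy is used because casting a float pointer to an integer
   pointer is not portable C. */
FloatFields Float_Unpack(float x) {
  uint32_t bits;
  FloatFields f;
  memcpy(&bits, &x, sizeof bits);
  f.sign     = bits >> 31;
  f.exponent = (bits >> 23) & 0xFFu;
  f.fraction = bits & 0x7FFFFFu;
  return f;
}
```

For -0.4375 = -1.11₂ × 2^-2, the fields come out as sign 1, biased exponent 125 (that is, -2 + 127), and fraction 0x600000.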
11 Numbers in 32-bit Formats
- Two's complement integers
  - Expressible numbers: -2^31 to 2^31 - 1
- Floating point numbers
  - Expressible negative numbers: -(2 - 2^-23) × 2^127 to -2^-127
  - Expressible positive numbers: 2^-127 to (2 - 2^-23) × 2^127
  - Negative underflow: between -2^-127 and 0
  - Positive underflow: between 0 and 2^-127
  - Negative overflow: below -(2 - 2^-23) × 2^127
  - Positive overflow: above (2 - 2^-23) × 2^127
- The range is larger, but the number of numbers per unit interval is less than that for a comparable fixed point range
(Adapted from V. Agrawal)
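One way to see the "fewer numbers per unit interval" effect is to add 1 to a large float. The sketch below assumes IEEE 754 single precision with the default round-to-nearest mode; `Float_AbsorbsOne` is a hypothetical helper name.

```c
/* Returns 1 if x + 1 rounds back to x, meaning the gap between
   neighboring floats near x is larger than 1. The volatile store
   forces the sum to be rounded to single precision. */
int Float_AbsorbsOne(float x) {
  volatile float y = x + 1.0f;
  return y == x;
}
```

Near 1000 the neighbors are much closer than 1 apart, so the function returns 0. At 16777216 (2^24) the 24-bit significand can no longer hold the extra 1, so the function returns 1, whereas a 32-bit integer would still count every value in that range.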
12 Binary to Decimal Conversion
Binary: (-1)^S × (1.b1b2b3b4) × 2^E
Represents: (-1)^S × (1 + b1×2^-1 + b2×2^-2 + b3×2^-3 + b4×2^-4) × 2^E
Example: -1.1100₂ × 2^-2 = -(1 + 2^-1 + 2^-2) × 2^-2 = -(1 + 0.5 + 0.25)/4 = -1.75/4 = -0.4375 (decimal)
(Adapted from V. Agrawal)
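The conversion formula above can be checked with a few lines of C. This is only a sketch: `Bin_ToValue` is a hypothetical helper that takes the fraction bits b1b2...bn packed into an integer, and it uses the standard `ldexp` function to scale by powers of two.

```c
#include <math.h>

/* Evaluate (-1)^s * (1.b1b2...bn) * 2^e.
   frac holds the n fraction bits as an integer (b1 is the MSB). */
double Bin_ToValue(int s, unsigned frac, int nbits, int e) {
  double sig = 1.0 + ldexp((double)frac, -nbits); /* 1 + frac / 2^n */
  double v = ldexp(sig, e);                       /* sig * 2^e      */
  return s ? -v : v;
}
```

For the example on this slide, `Bin_ToValue(1, 0xC, 4, -2)` evaluates -1.1100₂ × 2^-2 and returns -0.4375.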
13 Decimal to Binary Conversion
- Converting from base 10 to the representation
- Single precision example
  - Convert 100₁₀
  - Step 1: convert to binary: 0110 0100
  - In a binary representation of the form 1.xxx we have
    - 0110 0100 = 1.100100 × 2^6
14 Decimal to Binary Conversion (cont'd)
- 1.1001 × 2^6 is binary for 100
- Thus the exponent is 6
- Biased exponent will be 6 + 127 = 133 = 1000 0101
- Sign will be 0 for positive
- Stored fractional part f will be 1001
- Thus we have
  - S E F
  - 0 10000101 10010000000000000000000
  - 4 2 C 8 0 0 0 0 in hexadecimal
- 42C8 0000 is the representation for 100
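The hand conversion of 100 can be cross-checked in C by assembling the three fields and comparing against the bits the compiler produces. A sketch, assuming `float` is IEEE 754 single precision; `Float_Pack` and `Float_Bits` are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

/* Assemble a single-precision bit pattern from the sign, the true
   (unbiased) exponent and the 23-bit stored fraction. */
uint32_t Float_Pack(uint32_t s, int32_t trueExp, uint32_t frac23) {
  uint32_t biased = (uint32_t)(trueExp + 127);   /* add the bias */
  return (s << 31) | (biased << 23) | (frac23 & 0x7FFFFFu);
}

/* Return the raw bits of a float as stored by the compiler. */
uint32_t Float_Bits(float x) {
  uint32_t bits;
  memcpy(&bits, &x, sizeof bits);
  return bits;
}
```

The fraction 1001 followed by 19 zeros is 0x480000, so `Float_Pack(0, 6, 0x480000)` gives 0x42C80000, which matches `Float_Bits(100.0f)`.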
15 Positive Zero in IEEE 754
0 00000000 00000000000000000000000
(fields: sign, biased exponent, fraction)
- +1.0 × 2^-127
- Smallest positive number in single-precision IEEE 754 standard.
  - Interpreted as positive zero.
- True exponent less than -127 is positive underflow; can be regarded as zero.
(Adapted from V. Agrawal)
16 Negative Zero in IEEE 754
1 00000000 00000000000000000000000
(fields: sign, biased exponent, fraction)
- -1.0 × 2^-127
- Smallest-magnitude negative number in single-precision IEEE 754 standard.
  - Interpreted as negative zero.
- True exponent less than -127 is negative underflow; may be regarded as 0.
(Adapted from V. Agrawal)
17 Positive Infinity in IEEE 754
0 11111111 00000000000000000000000
(fields: sign, biased exponent, fraction)
- +1.0 × 2^128
- Largest positive number in single-precision IEEE 754 standard.
  - Interpreted as +∞
- If true exponent = 128 and fraction ≠ 0, then the number is greater than ∞.
  - It is called not a number or NaN and may be interpreted as ∞.
(Adapted from V. Agrawal)
18 Negative Infinity in IEEE 754
1 11111111 00000000000000000000000
(fields: sign, biased exponent, fraction)
- -1.0 × 2^128
- Most negative number in single-precision IEEE 754 standard.
  - Interpreted as -∞
- If true exponent = 128 and fraction ≠ 0, then the number is less than -∞.
  - It is called not a number or NaN and may be interpreted as -∞.
(Adapted from V. Agrawal)
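The special bit patterns on the last few slides can be inspected from C. The sketch below assumes `float` is IEEE 754 single precision and uses the standard `isinf` and `isnan` macros from `<math.h>`; `Float_FromBits` and `Float_Special` are hypothetical names.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float. */
float Float_FromBits(uint32_t bits) {
  float x;
  memcpy(&x, &bits, sizeof x);
  return x;
}

/* Classify a pattern: 1 for +infinity, -1 for -infinity,
   2 for NaN, 0 for any finite value (including both zeros). */
int Float_Special(uint32_t bits) {
  float x = Float_FromBits(bits);
  if (isnan(x)) return 2;
  if (isinf(x)) return (x > 0.0f) ? 1 : -1;
  return 0;
}
```

Under these assumptions, 0x7F800000 classifies as +infinity, 0xFF800000 as -infinity, and any pattern with an all-ones exponent and a nonzero fraction (for example 0x7F800001) as NaN. The two zero patterns 0x00000000 and 0x80000000 are both finite and compare equal as floats.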
19 IEEE Representation Values
- If E = 255 and F is nonzero, then V = NaN ("Not a number")
- If E = 255 and F is zero and S is 1, then V = -Infinity
- If E = 255 and F is zero and S is 0, then V = Infinity
- If 0 < E < 255, then V = (-1)^S × 2^(E-127) × (1.F), where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.
- If E = 0 and F is nonzero, then V = (-1)^S × 2^(-126) × (0.F)
  - These are "unnormalized" values.
- If E = 0 and F is zero and S is 1, then V = -0
- If E = 0 and F is zero and S is 0, then V = 0
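The case list above maps directly onto a small decoder. This is a sketch rather than a library routine: `Float_Decode` is a hypothetical name, the result is returned as a `double` so that every single-precision value is represented exactly, and `NAN` and `INFINITY` are the C99 macros from `<math.h>`.

```c
#include <math.h>
#include <stdint.h>

/* Compute the value V encoded by a 32-bit single-precision pattern,
   following the E/F/S cases one by one. */
double Float_Decode(uint32_t bits) {
  uint32_t s = bits >> 31;
  uint32_t e = (bits >> 23) & 0xFFu;
  uint32_t f = bits & 0x7FFFFFu;
  double v;
  if (e == 255) {
    v = (f != 0) ? NAN : INFINITY;           /* NaN or infinity      */
  } else if (e == 0) {
    /* 0.F * 2^-126: F is a 23-bit integer, so scale by 2^-23 too.
       F == 0 gives zero. */
    v = ldexp((double)f, -126 - 23);
  } else {
    /* 1.F * 2^(E-127): put the implicit 1 back in bit 23. */
    v = ldexp((double)(f | 0x800000u), (int)e - 127 - 23);
  }
  return s ? -v : v;                         /* apply (-1)^S         */
}
```

For instance, 0x42C80000 decodes to 100.0 and 0xBEE00000 decodes to -0.4375.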
20 Addition and Subtraction
- 0. Zero check
  - Change the sign of subtrahend
  - If either operand is 0, the other is the result
- 1. Significand alignment: right shift the significand of the operand with the smaller exponent until the two exponents are identical.
- 2. Addition: add significands and report exception if overflow occurs.
- 3. Normalization
  - Shift significand bits to normalize.
  - Report overflow or underflow if exponent goes out of range.
- 4. Rounding
(Adapted from V. Agrawal)
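The steps above can be sketched in C on an unpacked representation. This is a simplified illustration, not how any particular FPU is built: the `Fp` struct and the functions `Fp_Add` and `Fp_Sub` are hypothetical, bits shifted out during alignment are simply dropped (so step 4 is effectively round toward zero), and the overflow and underflow checks are left out.

```c
#include <stdint.h>

/* Hypothetical unpacked format:
   value = (-1)^sign * (sig / 2^23) * 2^exp.
   A normalized nonzero value has bit 23 of sig set; sig == 0 is zero. */
typedef struct {
  int sign;       /* 0 positive, 1 negative */
  int exp;        /* true (unbiased) exponent */
  uint32_t sig;   /* 24-bit significand, 1.F scaled by 2^23 */
} Fp;

Fp Fp_Add(Fp a, Fp b) {
  Fp r;
  int shift;
  int64_t sum;
  uint64_t mag;
  /* Step 0: if either operand is zero, the other is the result. */
  if (a.sig == 0) return b;
  if (b.sig == 0) return a;
  /* Step 1: let a hold the larger exponent, then right shift b. */
  if (a.exp < b.exp) { Fp t = a; a = b; b = t; }
  shift = a.exp - b.exp;
  b.sig = (shift < 24) ? (b.sig >> shift) : 0;
  /* Step 2: add the signed significands. */
  sum = (a.sign ? -(int64_t)a.sig : (int64_t)a.sig)
      + (b.sign ? -(int64_t)b.sig : (int64_t)b.sig);
  if (sum == 0) { r.sign = 0; r.exp = 0; r.sig = 0; return r; }
  r.sign = (sum < 0);
  mag = (uint64_t)(sum < 0 ? -sum : sum);
  r.exp = a.exp;
  /* Step 3: shift until the leading 1 sits in bit 23. */
  while (mag >= (1u << 24)) { mag >>= 1; r.exp++; }
  while (mag <  (1u << 23)) { mag <<= 1; r.exp--; }
  r.sig = (uint32_t)mag;
  return r;
}

/* Subtraction: change the sign of the subtrahend, then add. */
Fp Fp_Sub(Fp a, Fp b) {
  b.sign ^= 1;
  return Fp_Add(a, b);
}
```

As a usage example, subtracting 1.11₂ × 2^-2 (sig 0xE00000, exp -2) from 1.0₂ × 2^-1 (sig 0x800000, exp -1) returns sign 0, exp -4, sig 0x800000, that is 1.0₂ × 2^-4 = 0.0625.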
21 Rounding
- Adjusting significands before addition will produce results that exceed 24 bits
- Round toward infinity
  - select next largest normalized result
- Round toward minus infinity
  - select next smallest normalized result
- Round toward zero
  - truncate result
- Round to nearest
  - select closest normalized result
  - the default rounding mode in IEEE 754
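The four modes can be compared on a plain integer by discarding its low bits, which is what happens to a significand that is too wide. The sketch below is an illustration only: `RoundMode` and `Round_Shift` are hypothetical names, and ties in round-to-nearest are broken toward the even result, which is the IEEE 754 default behavior.

```c
#include <stdint.h>

typedef enum {
  ROUND_UP,       /* toward +infinity  */
  ROUND_DOWN,     /* toward -infinity  */
  ROUND_ZERO,     /* truncate          */
  ROUND_NEAREST   /* nearest, ties to even */
} RoundMode;      /* hypothetical type name */

/* Return v / 2^drop rounded according to mode; drop must be 0..30. */
int32_t Round_Shift(int32_t v, unsigned drop, RoundMode mode) {
  int32_t unit = (int32_t)1 << drop;
  int32_t q = v / unit;             /* C division truncates toward zero */
  int32_t r = v % unit;             /* remainder has the sign of v      */
  int32_t half = unit / 2;
  int32_t ar = (r < 0) ? -r : r;
  if (r == 0) return q;             /* exact, nothing to round */
  switch (mode) {
    case ROUND_UP:   return (r > 0) ? q + 1 : q;
    case ROUND_DOWN: return (r < 0) ? q - 1 : q;
    case ROUND_ZERO: return q;
    default:         /* ROUND_NEAREST */
      if (ar > half || (ar == half && (q % 2) != 0)) {
        return (v < 0) ? q - 1 : q + 1;
      }
      return q;
  }
}
```

Dropping two bits from 22 (10110₂, which is 5.5 in units of 4) gives 6, 5, 5 and 6 for the four modes in the order listed in the enum; for -22 the results are -5, -6, -5 and -6.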
22 Example
- Subtraction: 0.5₁₀ - 0.4375₁₀
- Step 0: Floating point numbers to be added
  - 1.000₂ × 2^-1 and -1.110₂ × 2^-2
- Step 1: Significand of lesser exponent is shifted right until exponents match
  - -1.110₂ × 2^-2 → -0.111₂ × 2^-1
- Step 2: Add significands, 1.000₂ + (-0.111₂)
  - Result is 0.001₂ × 2^-1
- Step 3: Normalize, 1.000₂ × 2^-4
  - No overflow/underflow since 127 ≥ exponent ≥ -126
- Step 4: Rounding, no change since the sum fits in 4 bits.
  - 1.000₂ × 2^-4 = (1 + 0)/16 = 0.0625₁₀
(Adapted from V. Agrawal)
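The hand-worked result can be compared with what the machine computes. A small sketch, assuming `float` is IEEE 754 single precision; `Sub_Bits` is a hypothetical helper that returns the raw bits of a - b.

```c
#include <stdint.h>
#include <string.h>

/* Subtract in single precision and return the result's bit pattern.
   The volatile store keeps the difference rounded to a float. */
uint32_t Sub_Bits(float a, float b) {
  volatile float d = a - b;
  float r = d;
  uint32_t bits;
  memcpy(&bits, &r, sizeof bits);
  return bits;
}
```

`Sub_Bits(0.5f, 0.4375f)` returns 0x3D800000: sign 0, biased exponent 123 (true exponent -4), fraction 0, which is 1.000₂ × 2^-4 = 0.0625.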
23 FP Multiplication: Basic Idea
- Separate sign
- Add exponents
- Multiply significands
- Normalize, round, check overflow
- Replace sign
(Adapted from V. Agrawal)
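A minimal sketch of these steps in C, working on an unpacked value. The `Fp` struct and `Fp_Mul` are hypothetical names; the low product bits are truncated rather than rounded, and exponent overflow and underflow checks are omitted.

```c
#include <stdint.h>

/* Hypothetical unpacked format:
   value = (-1)^sign * (sig / 2^23) * 2^exp, with bit 23 of sig set
   for a normalized nonzero value and sig == 0 meaning zero. */
typedef struct {
  int sign;
  int exp;
  uint32_t sig;
} Fp;

Fp Fp_Mul(Fp a, Fp b) {
  Fp r;
  uint64_t p;
  r.sign = a.sign ^ b.sign;           /* sign handled separately      */
  if (a.sig == 0 || b.sig == 0) {     /* anything times zero is zero  */
    r.exp = 0;
    r.sig = 0;
    return r;
  }
  r.exp = a.exp + b.exp;              /* add exponents                */
  p = (uint64_t)a.sig * b.sig;        /* 48-bit product, scaled 2^46  */
  p >>= 23;                           /* back to 23 fraction bits     */
  if (p >= (1u << 24)) {              /* product in [2, 4): normalize */
    p >>= 1;
    r.exp++;
  }
  r.sig = (uint32_t)p;
  return r;
}
```

For example, 1.1₂ × 2^1 (3.0) times -1.1₂ × 2^2 (-6.0) yields sign 1, exp 4, sig 0x900000, that is -1.001₂ × 2^4 = -18.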
24 FP Division: Basic Idea
- Separate sign.
- Check for zeros and infinity.
- Subtract exponents.
- Divide significands.
- Normalize/overflow/underflow.
- Rounding.
- Replace sign.
(Adapted from V. Agrawal)
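Division follows the same pattern and can be sketched the same way. The `Fp` struct and `Fp_Div` are hypothetical names; this toy format has no encoding for infinity, so a zero divisor is reported through the return value instead. The quotient is truncated rather than rounded, and exponent range checks are omitted.

```c
#include <stdint.h>

/* Hypothetical unpacked format:
   value = (-1)^sign * (sig / 2^23) * 2^exp, with bit 23 of sig set
   for a normalized nonzero value and sig == 0 meaning zero. */
typedef struct {
  int sign;
  int exp;
  uint32_t sig;
} Fp;

/* Compute *r = a / b. Returns 0 on success, -1 if b is zero. */
int Fp_Div(Fp a, Fp b, Fp *r) {
  uint64_t q;
  r->sign = a.sign ^ b.sign;          /* sign handled separately     */
  r->exp = 0;
  r->sig = 0;
  if (b.sig == 0) return -1;          /* check for zero divisor      */
  if (a.sig == 0) return 0;           /* zero divided by nonzero     */
  r->exp = a.exp - b.exp;             /* subtract exponents          */
  /* Divide significands, keeping 24 fraction bits in the quotient.
     The ratio lies in (0.5, 2), so q lies in [2^23, 2^25). */
  q = ((uint64_t)a.sig << 24) / b.sig;
  if (q >= (1u << 24)) {
    q >>= 1;                          /* ratio in [1, 2)             */
  } else {
    r->exp--;                         /* ratio in (0.5, 1): normalize */
  }
  r->sig = (uint32_t)q;
  return 0;
}
```

Dividing 1.001₂ × 2^4 (18) by -1.1₂ × 2^2 (-6) gives sign 1, exp 1, sig 0xC00000, that is -1.1₂ × 2^1 = -3.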