Floating Point Numbers - PowerPoint PPT Presentation

About This Presentation
Title:

Floating Point Numbers

Description:

Representation of Floating Point Numbers. IEEE 32-bit floating point number. ... of significant figures used to represent the number. ... Representation of Zero ... – PowerPoint PPT presentation

Slides: 23
Provided by: TimothyCL4

Transcript and Presenter's Notes

Title: Floating Point Numbers


1
Floating Point Numbers
  • Fixed point Numbers
  • Representation of Floating Point Numbers
  • IEEE 32-bit floating point number.
  • Floating point Arithmetic

2
Fixed Point Numbers
  • The binary (or decimal) point is assumed to be in
    a fixed position
  • Base 10 fixed point arithmetic
  •   7632135 → 763.2135
  • + 1794821 → 179.4821
  • = 9426956 → 942.6956

3
Fixed Point (Binary) Numbers
  • Example: Add 3.625 and 6.5
  • Convert the numbers to 8-bit form (4-bit integer,
    4-bit fraction)
  • 3.625 → 11.101 → 0011.1010
  • 6.500 → 110.10 → 0110.1000
  • Treat the numbers as having an imaginary binary
    point and add them in the normal way
  • 00111010 + 01101000 = 10100010
  • The integer part of the result (1010) converts to
    10, and the fractional part (.0010) is interpreted
    as .125. Therefore, the result is 10.125.
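
The addition above can be reproduced with ordinary integers. The following is a minimal sketch, assuming an unsigned 8-bit layout with 4 integer bits and 4 fraction bits; the helper names `to_fixed` and `from_fixed` are illustrative and do not come from the slides.

```python
# Unsigned 8-bit fixed point: 4 integer bits, 4 fraction bits.
FRACTION_BITS = 4
SCALE = 1 << FRACTION_BITS  # 16: one unit in the last place is 1/16


def to_fixed(value: float) -> int:
    """Encode a non-negative value as an 8-bit fixed point integer."""
    raw = round(value * SCALE)
    if not 0 <= raw <= 0xFF:
        raise OverflowError(f"{value} does not fit in 8 bits")
    return raw


def from_fixed(raw: int) -> float:
    """Interpret the integer with the binary point 4 places from the right."""
    return raw / SCALE


a = to_fixed(3.625)  # 0011.1010
b = to_fixed(6.5)    # 0110.1000
total = a + b        # plain integer addition, binary point is implied
print(f"{a:08b} + {b:08b} = {total:08b} -> {from_fixed(total)}")
```

The hardware only ever sees the integers; the position of the binary point is a convention applied when the result is interpreted.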

4
Problem with Fixed Point (Binary) Numbers
  • Some systems require a large range of numbers
  • Mass of sun 1990000000000000000000000000000000
    grams
  • Requires about 14 bytes
  • Mass of electron 0.000000000000000000000000000910956
    grams
  • Requires about 12 bytes

5
Floating Point Numbers: Definitions
  • Range
  • How small and how large the numbers can be.
  • Precision
  • The number of significant figures used to
    represent the number.
  • A measure of a number's exactness.
  • π = 3.141592 is more precise than π = 3.14
  • Accuracy
  • A measure of the correctness of a number.
  • π = 3.241592 is more precise than π = 3.14, but
    π = 3.14 is more accurate.

6
IEEE Floating Point Numbers: Single Precision Format
  • (-1)^S x 2^(E-B) x 1.F
  • B = 127
  • S is the sign bit, E is the 8-bit stored exponent,
    and F is the 23-bit fraction.
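
The format can be evaluated directly from a 32-bit pattern. This is a sketch under the assumption that the input is a normalized number (stored exponent neither 0 nor 255); the function name `decode_single` is hypothetical, and `struct` is used only as a cross-check.

```python
import struct

BIAS = 127


def decode_single(bits: int) -> float:
    """Evaluate (-1)^S x 2^(E-B) x 1.F for a normalized 32-bit pattern."""
    s = bits >> 31            # 1 sign bit
    e = (bits >> 23) & 0xFF   # 8-bit stored (biased) exponent
    f = bits & 0x7FFFFF       # 23-bit fraction
    if e in (0, 255):
        raise ValueError("special exponent: the formula does not apply")
    significand = 1 + f / (1 << 23)  # restore the hidden leading 1
    return (-1) ** s * 2.0 ** (e - BIAS) * significand


pattern = 0xC5129200  # 1 10001010 00100101001001000000000
value = decode_single(pattern)
reference = struct.unpack(">f", pattern.to_bytes(4, "big"))[0]
print(value, reference)
```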

7
IEEE Floating Point Numbers: Range of Mantissa
  • A floating point mantissa is limited to one of
    the three ranges
  • -2 < x ≤ -1
  • x = 0
  • 1 ≤ x < 2

8
IEEE Floating Point Numbers: Exponent
Binary Value   True Exponent   Biased Exponent   Special Numbers
0000 0000      -127            0                 zero
0000 0001      -126            1
0000 0010      -125            2
0000 0011      -124            3
. . .
0111 1111      0               127
. . .
1111 1100      125             252
1111 1101      126             253
1111 1110      127             254
1111 1111      128             255               ± infinity

9
IEEE Floating Point Numbers: Excess-n
  • The stored exponent is also called excess-n, or
    excess-127 for the IEEE single precision
    format.
  • The stored exponent exceeds the true exponent by
    127, the bias.
  • b' = b + 127
  • where b' is the biased exponent, and b is the
    true exponent.
  • Examples
  • If the true exponent is 2, the exponent is stored
    in biased form as 2 + 127 = 129 = 1000 0001.
  • If the stored exponent is 0000 0001, the true
    exponent is 1 - 127 = -126.
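
Both examples can be checked in a few lines. This is a minimal sketch; the function names `to_biased` and `to_true` are illustrative only.

```python
BIAS = 127  # excess-127 for IEEE single precision


def to_biased(true_exponent: int) -> int:
    """Stored exponent = true exponent + bias; must fit in 8 bits."""
    biased = true_exponent + BIAS
    if not 0 <= biased <= 255:
        raise OverflowError("exponent does not fit in the 8-bit field")
    return biased


def to_true(biased_exponent: int) -> int:
    """True exponent = stored exponent - bias."""
    return biased_exponent - BIAS


stored = to_biased(2)
print(f"{stored:08b}")        # 10000001
print(to_true(0b00000001))    # -126
```

Storing the exponent with a bias keeps the field unsigned, so exponents can be compared as plain unsigned integers.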

10
IEEE Floating Point Numbers: Representation of Zero
  • The smallest stored exponent, 0000 0000 (in biased
    form), corresponding to a true exponent of -127,
    is used together with an all-zero fraction to
    represent zero.

11
IEEE Floating Point Numbers: Infinity and Not a Number (NaN)
  • Exponent 1111 1111 and mantissa = 0 → used as ± infinity.
  • Exponent 1111 1111 and mantissa != 0 → used as NaN.
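
The two rules can be tested against the bit patterns Python produces for infinity and NaN. This is a sketch; `float_to_bits` and `classify` are hypothetical helper names, and the "subnormal" label for a zero exponent with a nonzero fraction goes beyond what the slides cover.

```python
import struct


def float_to_bits(value: float) -> int:
    """Return the 32-bit single precision pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def classify(bits: int) -> str:
    """Classify a 32-bit pattern by its exponent and fraction fields."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "NaN"
    if exponent == 0:
        # Assumption beyond the slides: a nonzero fraction here is a subnormal.
        return "zero" if fraction == 0 else "subnormal"
    return "normal"


for sample in (float("inf"), float("-inf"), float("nan"), 0.0, 1.0):
    print(sample, classify(float_to_bits(sample)))
```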

12
IEEE Floating Point Numbers: Example Representation
  • Represent -2345.125 as a single precision IEEE
    floating point number.
  • -2345.125₁₀ = -100100101001.001₂
  • -2345.125₁₀ = -1.00100101001001₂ x 2^11
  • S = 1 (negative)
  • The biased exponent is 11 + 127 = 138 = 10001010₂
  • The fractional part of the mantissa is
    .00100101001001000000000
  • Therefore, -2345.125₁₀ = 1 10001010
    00100101001001000000000
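
The hand conversion can be checked by letting Python pack the value as a 32-bit float and splitting the bits into the three fields. This is a minimal sketch; `single_fields` is an illustrative name.

```python
import struct


def single_fields(value: float) -> str:
    """Return 'sign exponent fraction' for the single precision encoding."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    text = f"{bits:032b}"
    return f"{text[0]} {text[1:9]} {text[9:]}"


fields = single_fields(-2345.125)
print(fields)
```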

13
IEEE Floating Point Numbers: Addition and Subtraction Flowchart

14
IEEE Floating Point Numbers: Arithmetic Example 1
  • Convert the decimal numbers 123.5 and 100.25 into
    the IEEE 32-bit floating point number
    representation. Then carry out the subtraction
    123.5 - 100.25 and express the result as a
    normalized 32-bit floating point number.
  • 123.5₁₀ = 1111011.1₂ = 1.1110111 x 2^6
  • The mantissa is positive, and so S = 0.
  • The exponent is 6, which is stored in biased
    form as 6 + 127 = 133₁₀ = 10000101₂.
  • The mantissa is 1.1110111, which is stored in
    23 bits, with the leading 1 suppressed.
  • Therefore, 123.5₁₀ is stored as
  • 0 10000101 11101110000000000000000 (IEEE)

15
IEEE Floating Point Numbers: Arithmetic Example 1 (Continued)
  • 100.25₁₀ = 1100100.01₂ = 1.10010001 x 2^6
  • The mantissa is positive, and so S = 0.
  • The exponent is 6, which is stored in biased
    form as 6 + 127 = 133₁₀ = 10000101₂.
  • The mantissa is 1.10010001, which is stored in
    23 bits, with the leading 1 suppressed.
  • Therefore, 100.25₁₀ is stored as
  • 0 10000101 10010001000000000000000 (IEEE)

16
IEEE Floating Point Numbers: Arithmetic Example 1 (Continued)
  • The two IEEE numbers are first unpacked: the
    sign, exponent, and mantissa (with its leading 1)
    must be reconstituted.
  • The two exponents are compared. If they are the
    same, the mantissas are added. If they are not,
    the number with the smaller exponent is
    denormalized by shifting its mantissa right
    (i.e., dividing by 2) and incrementing its
    exponent (i.e., multiplying by 2) until the two
    exponents are equal. Then the numbers are added.
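
The alignment step can be sketched as follows, assuming mantissas are held as 24-bit integers with the leading 1 restored and that bits shifted off the right are simply discarded (a real implementation keeps extra bits for rounding). The function name `align` is hypothetical.

```python
def align(exp_a: int, man_a: int, exp_b: int, man_b: int):
    """Shift the mantissa of the smaller-exponent operand right until
    both operands share the larger exponent.

    Returns (common_exponent, man_a, man_b).
    """
    if exp_a < exp_b:
        man_a >>= exp_b - exp_a   # each right shift divides by 2 ...
        exp_a = exp_b             # ... and the exponent grows to compensate
    elif exp_b < exp_a:
        man_b >>= exp_a - exp_b
        exp_b = exp_a
    return exp_a, man_a, man_b


# 1.010101011 x 2^5 (stored exponent 132) and 1.1 x 2^-4 (stored exponent 123)
exponent, m1, m2 = align(132, 0b1010_1010_1100_0000_0000_0000,
                         123, 0b1100_0000_0000_0000_0000_0000)
print(exponent, f"{m1:024b}", f"{m2:024b}")
```

When the exponents already match, both mantissas are returned unchanged and can be added or subtracted directly.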

17
IEEE Floating Point Numbers: Arithmetic Example 1 (Continued)
  • After unpacking, insert the leading 1 and
    perform the subtraction.
  •   1.11101110000000000000000
  • - 1.10010001000000000000000
  • = 0.01011101000000000000000
  • Normalize the result by shifting the mantissa
    left 2 places
  • 1.01110100000000000000000

18
IEEE Floating Point Numbers: Arithmetic Example 1 (Continued)
  • The exponent must be decreased by 2 to compensate
    for the shift.
  • 10000101₂ - 2₁₀ = 10000011₂
  • The result expressed in IEEE format is
  • 0 10000011 01110100000000000000000
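
The subtract-then-normalize steps of Example 1 can be replayed with integer mantissas. This is a sketch under the assumption that the difference is positive and smaller than 2 (no carry out of the 24-bit mantissa); `normalize` and `pack` are illustrative names, and `struct` serves only as an independent check.

```python
import struct

HIDDEN_ONE = 1 << 23  # position of the leading 1 in a 24-bit mantissa


def normalize(exponent: int, mantissa: int):
    """Shift a nonzero mantissa left until the leading 1 is in bit 23,
    decreasing the stored exponent once per shift."""
    if mantissa == 0:
        raise ValueError("zero has no normalized form")
    while mantissa < HIDDEN_ONE:
        mantissa <<= 1
        exponent -= 1
    return exponent, mantissa


def pack(sign: int, exponent: int, mantissa: int) -> int:
    """Assemble the 32-bit pattern, dropping the hidden leading 1."""
    return (sign << 31) | (exponent << 23) | (mantissa & (HIDDEN_ONE - 1))


# 123.5 -> 1.1110111 x 2^6, 100.25 -> 1.10010001 x 2^6 (stored exponent 133)
difference = 0b1111_0111_0000_0000_0000_0000 - 0b1100_1000_1000_0000_0000_0000
exponent, mantissa = normalize(133, difference)
bits = pack(0, exponent, mantissa)
expected = struct.unpack(">I", struct.pack(">f", 123.5 - 100.25))[0]
print(f"{bits:032b}", bits == expected)
```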

19
IEEE Floating Point Numbers: Arithmetic Example 2
  • Convert the decimal numbers 42.6875 and -0.09375
    into the IEEE 32-bit floating point number
    representation. Then carry out the addition of
    42.6875 and -0.09375 and express the result as a
    normalized 32-bit floating point number.
  • 42.6875₁₀ = 101010.1011₂ = 1.010101011 x 2^5
  • The mantissa is positive, and so S = 0.
  • The exponent is 5, which is stored in biased
    form as 5 + 127 = 132₁₀ = 10000100₂.
  • The mantissa is 1.010101011, which is stored in
    23 bits, with the leading 1 suppressed.
  • Therefore, 42.6875₁₀ is stored as
  • 0 10000100 01010101100000000000000 (IEEE)

20
IEEE Floating Point Numbers: Arithmetic Example 2 (Continued)
  • -0.09375₁₀ = -0.00011₂ = -1.1 x 2^-4
  • The mantissa is negative, and so S = 1.
  • The exponent is -4, which is stored in biased
    form as -4 + 127 = 123₁₀ = 01111011₂.
  • The mantissa is 1.1, which is stored in 23 bits,
    with the leading 1 suppressed.
  • Therefore, -0.09375₁₀ is stored as
  • 1 01111011 10000000000000000000000 (IEEE)

21
IEEE Floating Point Numbers: Arithmetic Example 2 (Continued)
  • 42.6875₁₀ = 0 10000100 101010101100000000000000
  • -0.09375₁₀ = 1 01111011 110000000000000000000000
  • (mantissas shown with the leading 1 restored)
  • In order to perform the addition, the exponents
    must be the same.
  • Increase the second exponent by 9 and shift the
    mantissa right 9 times to get
  • 42.6875₁₀ = 0 10000100 101010101100000000000000
  • -0.09375₁₀ = 1 10000100 000000000110000000000000000000000

22
IEEE Floating Point Numbers: Arithmetic Example 2 (Continued)
  • 42.6875₁₀ = 0 10000100 101010101100000000000000
  • -0.09375₁₀ = 1 10000100 000000000110000000000000000000000
  • Adding the signed mantissas (the signs differ, so
    the smaller magnitude is subtracted from the
    larger), we get
  • 101010100110000000000000
  • The result is positive with a biased exponent of
    10000100.
  • Therefore, the result is stored as
  • 0 10000100 01010100110000000000000
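
Example 2 can be replayed end to end with integers. This is a sketch that assumes the shifted-out bits are kept during the subtraction and truncated afterwards (no rounding is modeled); the variable names are illustrative, and `struct` is used only to confirm the final pattern.

```python
import struct

HIDDEN_ONE = 1 << 23

# 24-bit mantissas with the leading 1 restored, plus stored exponents.
exp_a, man_a = 132, 0b1010_1010_1100_0000_0000_0000  # +42.6875
exp_b, man_b = 123, 0b1100_0000_0000_0000_0000_0000  # magnitude of -0.09375

shift = exp_a - exp_b         # 9: difference between the exponents
wide_a = man_a << shift       # express both operands at the finer scale
wide_sum = wide_a - man_b     # signs differ, so subtract the magnitudes
mantissa = wide_sum >> shift  # back to 24 bits at stored exponent 132

# In this example the result is already normalized (leading 1 in bit 23).
assert HIDDEN_ONE <= mantissa < (HIDDEN_ONE << 1)

bits = (0 << 31) | (exp_a << 23) | (mantissa & (HIDDEN_ONE - 1))
text = f"{bits:032b}"
result_fields = f"{text[0]} {text[1:9]} {text[9:]}"
print(result_fields)

expected = struct.unpack(">I", struct.pack(">f", 42.6875 - 0.09375))[0]
print(bits == expected)
```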