Title: Round-Off and Truncation Errors
1 CHAPTER 4
- Round-Off and Truncation Errors
2 Numerical Accuracy
- Truncation error: method dependent
- Errors that result from using an approximation rather than an exact procedure
- Round-off error: machine dependent
- Errors that result from not being able to adequately represent the true value
- They result from using an approximate number to represent an exact number
3 Taylor Series Expansion
- Construction of finite-difference formulas
- Numerical accuracy: discretization error
[Figure: expansion about the base point x = a]
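As a rough illustration of how a finite-difference formula inherits a discretization error from the truncated Taylor series, the sketch below applies a forward difference to exp(x) at x = 0. The test function, base point, and step sizes are assumptions chosen for illustration, not values from the slides.

```matlab
% Forward-difference approximation of f'(x0), obtained by truncating the
% Taylor expansion f(x0 + h) = f(x0) + f'(x0)*h + ... after the linear term.
f  = @(x) exp(x);          % test function (illustrative assumption)
x0 = 0;                    % base point; the exact derivative is exp(0) = 1
h  = [1e-1 1e-2 1e-3];     % step sizes
approx = (f(x0 + h) - f(x0)) ./ h;   % forward difference for each h
err = abs(approx - 1);               % discretization (truncation) error
disp([h' approx' err'])              % error shrinks roughly in proportion to h
```

Reducing h by a factor of 10 reduces the error by about the same factor, which is the first-order behavior expected from dropping the second-order term.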
4 Taylor Series Expansions
5 Taylor Series and Remainder
- Taylor series (base point x = a)
- Remainder
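The formulas for this slide did not survive in the transcript. For reference, the standard Taylor series about the base point x = a and its Lagrange remainder are sketched here; the symbols n and \xi are the usual textbook notation rather than something recovered from the slide.

```latex
f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots
       + \frac{f^{(n)}(a)}{n!}(x-a)^n + R_n

R_n = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1},
\qquad \xi \text{ between } a \text{ and } x
```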
6 Truncation Error
- Taylor series expansion
- Example (higher-order terms truncated)
($x_i = 0$, $h = x \Rightarrow x_{i+1} = x$)
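The example formula itself is missing from the transcript. Assuming it is the exponential series used in the MATLAB scripts of this chapter, expanding about the base point 0 with step h = x and truncating after the n-th term would read:

```latex
e^{x} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + R_n,
\qquad
R_n = \frac{e^{\xi}}{(n+1)!}\,x^{n+1},
\quad \xi \text{ between } 0 \text{ and } x
```

Here R_n is the truncation error: the part of the exact value carried by the higher-order terms that were dropped.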
7 Power Series Polynomials
The function becomes more nonlinear as m increases
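8 A MATLAB Script
    % Evaluate the exponential function exp(x) by Taylor series expansion
    % f(x) = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!
    clear all
    x = input('enter the value of x = ');
    n = input('enter the order n = ');
    term = 1;       % current term x^i/i!, starting with x^0/0!
    total = term;   % running sum (named total so the built-in sum is not shadowed)
    for i = 1:n
        term = term*x/i;
        total = total + term;
    end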
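9 MATLAB For Loops
    % Evaluate the exponential function exp(x) by Taylor series expansion
    % f(x) = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!
    x = input('enter the value of x = ');
    n = input('enter the order n = ');
    term = zeros(1, n+1);    % term(i+1) holds x^i/i!
    total = zeros(1, n+1);   % partial sums (total avoids shadowing the built-in sum)
    term(1) = 1; total(1) = term(1);
    for i = 1:n
        term(i+1) = term(i)*x/i;
        total(i+1) = total(i) + term(i+1);
    end
    % Display the results
    disp('    i    term(i)    sum(i)')
    a = 1:n+1; disp([a' term' total'])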
10 Truncation Error
[Table: columns n, term, sum; values not recovered]
11 Truncation Error
[Table: columns n, term, sum; values not recovered]
How to reduce error?
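One way to see how the truncation error behaves is to compare each partial sum of the series with MATLAB's built-in exp. The sketch below assumes x = 0.5 and six terms beyond the constant; both values are illustrative choices, not the ones tabulated on the slides.

```matlab
% Truncation error of the Taylor series for exp(x) as the order n grows
x = 0.5;                 % evaluation point (illustrative assumption)
n = 6;                   % highest order kept (illustrative assumption)
term = 1; total = 1;     % zeroth-order term and partial sum
err = zeros(1, n);       % true error after each added term
for i = 1:n
    term = term*x/i;                 % x^i/i! from the previous term
    total = total + term;            % partial sum of order i
    err(i) = abs(exp(x) - total);    % distance from the built-in value
end
disp([(1:n)' err'])      % error decreases as more terms are included
```

For this x, every added term lowers the error, so including more terms (or using a smaller x) is the direct way to reduce truncation error.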
12 Round-off Errors
- Computers can represent numbers only to a finite precision
- Most important for real numbers; integer math can be exact, but its range is limited
- How do computers represent numbers?
- Binary representation of the integers and real numbers in computer memory
13 Single Precision: 32 bits (23, 8, 1)
$2^8 = 256$
MATLAB uses double precision
14 Order of Operation
Addition problem
exact result
with 3-digit arithmetic
Round-off error
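The slide's 3-digit addition example was lost in extraction. The same effect can be sketched in double precision: when a small number is added to a much larger one first, it is rounded away, so the order of the additions changes the result. The values of a, b, and c below are assumptions chosen to trigger this.

```matlab
% Floating-point addition is not associative
a = 1e16; b = -1e16; c = 1;
left  = (a + b) + c;     % large values cancel first, then c is added
right = a + (b + c);     % c is lost when added to b, then the large values cancel
fprintf('(a + b) + c = %g\n', left);
fprintf('a + (b + c) = %g\n', right);
```

Near 1e16 adjacent doubles are 2 apart, so b + c rounds back to b and the second ordering returns 0 instead of 1.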
15 Cancellation Error
If b is large, r is close to b
Difference of two numbers very close to each other: potential for greater error!
Rationalize
16 Try b = 97
(r = 96.9794)
$x_2$ (3 sig. figs.)
exact: 0.01031; standard: 0.01050; rationalized: 0.01031
Corresponding to cancellation, critical arithmetic
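The equation behind these numbers is missing from the transcript. The value r = 96.9794 for b = 97 matches r = sqrt(b^2 - 4), so the sketch below assumes the quadratic x^2 - b*x + 1 = 0 with small root x2 = (b - r)/2, and compares that standard formula with the rationalized form 2/(b + r). It uses b = 1e8 in double precision instead of 3-digit arithmetic so that the cancellation is visible without emulating short word lengths.

```matlab
% Cancellation in the small root of x^2 - b*x + 1 = 0 (assumed form)
b = 1e8;                        % large b, so r is very close to b
r = sqrt(b^2 - 4);              % discriminant root
x_standard     = (b - r)/2;     % subtracts two nearly equal numbers
x_rationalized = 2/(b + r);     % algebraically identical, no subtraction
x_reference    = 1e-8;          % the small root is 1/b to within about 1e-16 relative
fprintf('standard:     %.6e\n', x_standard);
fprintf('rationalized: %.6e\n', x_rationalized);
```

The rationalized form keeps essentially full precision, while the standard form loses most of its significant digits to the subtraction.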
17 Significant Figures
48.9 mph? 48.95 mph?
18 Significant Digits
- The places which can be used with confidence
- 32-bit machine (single precision): about 7 significant digits
- 64-bit machine (double precision): about 15 to 16 significant digits
- Double precision reduces round-off error but increases CPU time
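A quick way to check these digit counts is to add a small increment to 1 and see whether the stored result changes. The sketch below uses MATLAB's single and eps functions; the particular increments are illustrative assumptions.

```matlab
% How many digits survive in single and double precision
fprintf('eps(single) = %g\n', eps('single'));   % relative spacing, about 1.2e-7
fprintf('eps(double) = %g\n', eps);             % relative spacing, about 2.2e-16

s_same = (single(1) + single(1e-8) == single(1));  % 8th digit is lost in single
d_diff = (1 + 1e-8 ~= 1);                          % but kept in double
d_same = (1 + 1e-17 == 1);                         % 17th digit is lost in double
disp([s_same d_diff d_same])
```

All three comparisons are true, which is consistent with roughly 7 reliable digits in single precision and just under 16 in double precision.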
19 False Significant Figures
3.25/1.96 = 1.65816326530612... (from MATLAB)
But in practice only report 1.65 (chopping) or 1.66 (rounding)! Why?
Because we don't know what is beyond the second decimal place
21 Accuracy and Precision
- Accuracy: how closely a measured or computed value agrees with the true value
- Precision: how closely individual measured or computed values agree with each other
- Accuracy is getting all your shots near the target.
- Precision is getting them close together.
[Figure: targets arranged along two axes, More Accurate and More Precise]
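22 Numerical Errors
The difference between the true value and the approximation
Approximation = true value - true error
$E_t = \text{true value} - \text{approximation}$
or in percent: $\varepsilon_t = \dfrac{\text{true value} - \text{approximation}}{\text{true value}} \times 100\%$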
23 Approximate Error
- But the true value is not known
- If we knew it, we wouldn't have a problem
- Use approximate error
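The approximate-error formula was lost from this slide. A common definition, assumed here, is the percent change between successive approximations: (current - previous)/current × 100%. The sketch below uses it as a stopping test for the exponential series; x = 0.5 and the 0.05% tolerance are illustrative assumptions.

```matlab
% Stop adding Taylor terms for exp(x) once the approximate error is small
x  = 0.5;        % evaluation point (illustrative assumption)
es = 0.05;       % stopping tolerance in percent (illustrative assumption)
term = 1; total = 1; n = 0; ea = 100;
while ea > es
    n = n + 1;
    term = term*x/n;                          % next term x^n/n!
    previous = total;
    total = previous + term;                  % current approximation
    ea = abs((total - previous)/total)*100;   % approximate percent error
end
et = abs((exp(x) - total)/exp(x))*100;        % true percent error, for comparison
fprintf('order n = %d, value = %.9f, ea = %.4f%%, et = %.4f%%\n', n, total, ea, et);
```

The loop never needs the true value; et is computed only to show that the approximate error is a conservative guide in this case.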
24 Number Systems
- Base-10 (Decimal): 0,1,2,3,4,5,6,7,8,9
- Base-8 (Octal): 0,1,2,3,4,5,6,7
- Base-2 (Binary): 0,1 (off/on, close/open, negative/positive charge)
- Other non-decimal systems
- 1 lb = 16 oz, 1 ft = 12 in, ½, ¼, ...
25 Decimal System (base 10)
Binary System (base 2)
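The worked examples for this slide are missing. As a small sketch of positional notation in base 2, the code below converts an integer with MATLAB's dec2bin and bin2dec and then rebuilds it by hand from powers of 2. The number 173 is an illustrative assumption.

```matlab
% Positional notation: the same integer in base 10 and base 2
value = 173;                         % illustrative choice
bits  = dec2bin(value);              % character vector '10101101'
back  = bin2dec(bits);               % convert back to base 10
digits = bits - '0';                 % numeric 0/1 digits
powers = 2.^(numel(digits)-1:-1:0);  % 128 64 32 16 8 4 2 1
manual = sum(digits .* powers);      % 128 + 32 + 8 + 4 + 1
fprintf('%d (base 10) = %s (base 2) = %d\n', value, bits, manual);
```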
26 Integer Representation
- Signed magnitude method
- Use the first bit of a word to indicate the sign: 0 positive (off), 1 negative (on)
- Remaining bits are used to store a number
[Figure: the word 1 0 1 0 0 1 0 1 1 0, with the first bit labeled Sign and the remaining bits labeled Number]
27 Integer Representation
- 8-bit word
- +/- 0000000 are the same, therefore we may use -0 to represent -128
- Total numbers: $2^8 = 256$ (-128 to 127)
[Figure: 8-bit word with the first bit labeled Sign and the remaining bits labeled Number]
28 Integer Representation
- 16-bit word
- Range: -32,768 to 32,767
- Overflow: > 32,767 (cannot represent 43,000 A&M students)
- Underflow: < -32,768 (magnitude too large)
- 32-bit word
- Range: -2,147,483,648 to 2,147,483,647
- 9 significant digits
- Overflow: world population, about 6 billion
- Underflow: budget deficit, -100 billion
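These ranges follow from the word length: with one bit for the sign, an n-bit integer covers -2^(n-1) to 2^(n-1) - 1. The sketch below checks this against MATLAB's intmin and intmax, and shows what MATLAB does with the out-of-range examples. Note that MATLAB integer types saturate at the limits rather than raising an error, which is a property of MATLAB and not necessarily of the machine described on the slide.

```matlab
% Integer range from the word length, and MATLAB's saturation at the limits
nbits = 16;
lo = -2^(nbits-1);                 % -32768
hi =  2^(nbits-1) - 1;             %  32767
fprintf('%d-bit range: %d to %d\n', nbits, lo, hi);

students   = int16(43000);         % too large for 16 bits: stored as 32767
population = int32(6e9);           % too large for 32 bits: stored as 2147483647
deficit    = int32(-100e9);        % too negative: stored as -2147483648
disp([double(students) double(population) double(deficit)])
```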
29 Integer Operations
- Integer arithmetic can be exact as long as you don't get remainders in division
- 7/2 = 3 in integer math
- or overflow the maximum integer
- For an 8-bit computer, max = 127 (or min = -128)
- So 123 + 45 overflows
- and -74 × 2 underflows
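The same operations can be tried with MATLAB's int8 and int32 types. This is only a sketch of MATLAB's behavior: its integer arithmetic saturates on overflow, and its / operator rounds to the nearest integer, so idivide is used to get the truncating division described above.

```matlab
% Integer arithmetic with fixed word lengths in MATLAB
q_trunc = idivide(int32(7), int32(2));   % 3: truncating integer division
q_round = int32(7) / int32(2);           % 4: MATLAB's / rounds to nearest
over  = int8(123) + int8(45);            % 168 does not fit: saturates at 127
under = int8(-74) * int8(2);             % -148 does not fit: saturates at -128
disp([double(q_trunc) double(q_round) double(over) double(under)])
```

Other languages wrap around or raise an error instead of saturating, so the stored value after an overflow is language dependent even though the overflow itself is not.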
30 Floating-Point Representation
- Real numbers (also called floating-point numbers) are represented differently
- For fractions or very large numbers
- Store as sign, signed exponent, mantissa
- sign is 1 or 0 for negative or positive
- exponent is the (positive or negative) power of the base
- mantissa contains the significant digits
[Figure: word layout: sign | signed exponent | mantissa]
31Floating-Point Representation
sign of number
signed exponent mantissa
- m mantissa
- B Base of the number system
- e signed exponent
- Note the mantissa is usually normalized if the
leading digit is zero
32 Integer Representation
Floating-point number representation
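33 Decimal Representation
Word layout: sign | signed exponent | number
1 0 95 1467 (base B = 10)
mantissa $m = -(1 \times 10^{-1} + 4 \times 10^{-2} + 6 \times 10^{-3} + 7 \times 10^{-4}) = -0.1467$
signed exponent $e = +(9 \times 10^{1} + 5 \times 10^{0}) = 95$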
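34 Floating-Point Representation
- 8-bit word (without normalization)
Word layout: sign | signed exponent | number
0 1 11 0101 (base B = 2)
mantissa $m = +(0 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3} + 1 \times 2^{-4}) = 5/16$
signed exponent $e = -(1 \times 2^{1} + 1 \times 2^{0}) = -3$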
35 Normalization
[Figure: the same number stored without normalization (less accurate) and with normalization]
- Remove the leading zero by lowering the exponent ($d_1 = 1$ for all numbers)
- If m < 1/2, multiply by 2 to remove the leading 0
- Floating-point representation allows fractions and very large numbers to be represented, but takes up more memory and CPU time
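36 Binary Representation
- 8-bit word (with normalization)
Word layout: sign | signed exponent | number
1 0 11 1001 (base B = 2)
mantissa $m = -(1 \times 2^{-1} + 0 \times 2^{-2} + 0 \times 2^{-3} + 1 \times 2^{-4}) = -9/16$
signed exponent $e = +(1 \times 2^{1} + 1 \times 2^{0}) = 3$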
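The two 8-bit words in this chapter can be decoded mechanically. The sketch below assumes the layout read off the slides: bit 1 is the sign of the number, bit 2 the sign of the exponent, bits 3 and 4 the exponent magnitude, and bits 5 to 8 the fractional mantissa. The helper names bits2frac, bits2int, and decode are hypothetical, introduced only for this illustration.

```matlab
% Decode the 8-bit toy floating-point word: sign | exp sign | exp (2) | mantissa (4)
bits2frac = @(b) sum((b - '0') .* 2.^(-(1:numel(b))));     % 0.d1d2d3d4 in base 2
bits2int  = @(b) sum((b - '0') .* 2.^(numel(b)-1:-1:0));   % unsigned integer
decode = @(w) (1 - 2*(w(1) - '0')) ...                     % sign of the number
            * bits2frac(w(5:8)) ...                        % mantissa magnitude
            * 2^((1 - 2*(w(2) - '0')) * bits2int(w(3:4))); % signed exponent

v1 = decode('01110101');   % +5/16 * 2^-3 = 5/128
v2 = decode('10111001');   % -9/16 * 2^3  = -4.5
fprintf('%g  %g\n', v1, v2);
```

The second word has a leading mantissa bit of 1, so it is already normalized; the first is not, and it spends one of its four mantissa bits on a leading zero.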
37 Single Precision
- A real variable (number) is stored in four 8-bit words (bytes), or 32 bits (64 bits for supercomputers)
- bit (binary digit): 0 or 1
- half byte: 4 bits, $2^4 = 16$ possible values
- byte (8-bit word): 8 bits, $2^8 = 256$ possible values
32 bits = 23 for the digits + 8 for the signed exponent + 1 for the sign
38 Double Precision
- A real variable is stored in eight 8-bit words (bytes), or 64 bits
- 16 words, 128 bits for supercomputers
- signed exponent: $2^{10} = 1024$
64 bits = 52 for the digits + 11 for the signed exponent + 1 for the sign
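The limits implied by this bit budget can be inspected with MATLAB's realmax and realmin. This is a sketch for orientation: IEEE 754, which MATLAB follows, stores the 11-bit exponent with a bias rather than a separate sign bit, so the slide's split is a simplification, but the resulting exponent range of about ±1024 is the same.

```matlab
% Range of double and single precision
fprintf('double: realmax = %g, realmin = %g\n', realmax, realmin);
fprintf('single: realmax = %g, realmin = %g\n', realmax('single'), realmin('single'));
fprintf('log2(realmax) = %g\n', log2(realmax));   % close to 1024 = 2^10
too_big = 2*realmax;                              % beyond the range: Inf
disp(too_big)
```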
39 Round-off Errors
- Floating-point characteristics contribute to round-off error (limited bits for storage)
- A limited range of quantities can be represented
- A finite number of quantities can be represented
- The interval between numbers increases as the numbers grow
- Example: three significant digits
0.0100, 0.0101, 0.0102, ..., 0.0999 (0.0001 increment)
0.100, 0.101, 0.102, ..., 0.999 (0.001 increment)
1.00, 1.01, 1.02, ..., 9.99 (0.01 increment)
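The same growth of the interval can be observed in double precision with eps(x), which returns the distance from x to the next larger representable number. The sample values of x are illustrative assumptions.

```matlab
% Spacing between adjacent doubles grows with the magnitude of the number
x = [1 1000 1e6];
spacing = eps(x);                     % gap to the next representable value
for k = 1:numel(x)
    fprintf('x = %-8g  spacing = %g\n', x(k), spacing(k));
end
```

The relative spacing stays near 2.2e-16 throughout; it is the absolute spacing that grows, just as the increment does in the three-digit example.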
40 MATLAB
- A finite number of quantities (integers, real numbers, or text) can be represented
- For 8-bit, $2^8 = 256$ quantities
- For 16-bit, $2^{16} = 65536$ quantities
- MATLAB uses double precision
- 8 bytes = 64 bits
- more than $10^{19}$ ($2^{64}$) quantities