Title: Comp 380 Computers and Society
1 Comp 380: Computers and Society
- Software Reliability Part 2
- May 21, 2007
- Russell Gayle
- Summer Session I - 2007
2 Administrative
Administrative Chef Robot Last time Cases Term
Project
- Stop me about 15 minutes early so that we can
talk about the term project!
- Class listserv
- Set up over the weekend
- Check your inbox this afternoon for a message!
3 Chef Robot
- An exercise in reliability
- Your task
Instruct a robot to make a peanut butter and jelly sandwich.
- Read the entire handout before proceeding
- Follow the instructions on the second page
- Try not to waste time!
4 Let's thank Chef Robot!
- If you'd like a peanut butter and jelly sandwich,
feel free to make one!
5 Chef Robot - Wrapping up
- What surprised you?
- Did you think your program would work?
- Windows XP has 40,000,000 lines of instructions;
could details be missed?
- What if
- Lessons?
6 Software Reliability
- An introduction
- What are bugs?
- How did they originate?
- What are some causes?
- From the reading?
7 Cases
- Last time
- Electric company: outrageous bill
- Football player banned for chewing gum and being
late
- Many cases are much more costly, and sometimes
more dangerous
8 Mars Climate Orbiter
1999: Failure to convert English measures to
metric is the root cause of the loss of the Mars
Climate Orbiter.
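The kind of unit mismatch behind this loss can be sketched in a few lines. The example is only an illustration, not the actual flight or ground software: it assumes one program reports thruster impulse in pound-force seconds while another expects newton-seconds (the mismatch widely reported for the orbiter), and the function and constant names are hypothetical.

```python
# Illustrative sketch only; the names are hypothetical, not NASA code.
LBF_S_TO_N_S = 4.4482216152605  # newton-seconds in one pound-force second


def impulse_in_newton_seconds(value: float, unit: str) -> float:
    """Convert a thruster impulse to SI units, refusing unknown units."""
    if unit == "N*s":
        return value
    if unit == "lbf*s":
        return value * LBF_S_TO_N_S
    raise ValueError(f"unknown impulse unit: {unit!r}")


reported = 100.0  # assumed to be produced in lbf*s by one team's software

# Bug: the consumer treats the bare number as if it were already metric.
assumed_metric = reported

# Fix: carry the unit along with the value and convert explicitly.
converted = impulse_in_newton_seconds(reported, "lbf*s")

print(f"assumed {assumed_metric:.1f} N*s, actual {converted:.1f} N*s")
```

The bare number looks valid either way, which is why nothing fails at the hand-off; the error only shows up as a trajectory that drifts from the prediction.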
9 More cases
Mar 2000: Sea Launch malfunction blamed on
software glitch
Oct 2005: Russian rocket falls into ocean
shortly after liftoff.
10 And more cases...
- Software horror stories
- Famous IT failures (slideshow)
11 Denver Airport Baggage (1995)
- 4 years in development at a cost of $193M
- The promise
- delivered in < 10 minutes to any part of the airport!
- Massively complex system
- 4000 cars
- 21 miles of track
- scanners
- photocells
- 300 computers
12 Denver Airport Baggage
- Any guesses?
- What happened
- Carts misrouted and crashed; baggage lost and damaged
- Delayed opening cost $1.1M/day
- When the airport opened a year late, only one airline
used the system
13 Denver Airport Baggage
- Examples of bugs
- Photocell could not detect bags on the belt and
therefore didn't stop the system
- System had lost track of the state of carts during
jams
- Timing between conveyor belts and carts not
properly synchronized
- Overall
- Not just software glitches
- Very complex, poorly engineered system
14 Ariane 5 (1996)
Software error
Integer overflow
15 Ariane 5
- Only about 40 seconds after initiation of the
flight sequence, at an altitude of about 3700 m,
the launcher veered off its flight path, broke up
and exploded
16 Ariane 5
Development cost: $7 billion
Delay of more than one year
One set of four identical, uninsured scientific satellites
One rocket: $500,000,000
17 Ariane 5
- Overflow: tried to put too big a number into too
small a space
- What does this mean?
- Even worse: the feature that caused the problem
wasn't needed! It was only needed to set up the
launch!
- archive.eiffel.com/doc/manuals/technology/contract/ariane/page.html
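What "too big a number into too small a space" means can be sketched as follows. This is an illustration in Python, not the Ariane flight code (which was written in Ada); it assumes a 16-bit signed target, the size commonly cited for this failure, and the function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_wrapped(x: float) -> int:
    """Keep only the low 16 bits, as an unchecked conversion might."""
    low_bits = int(x) & 0xFFFF
    # Reinterpret the 16 bits as a signed two's-complement value.
    return low_bits - 0x10000 if low_bits >= 0x8000 else low_bits


def to_int16_checked(x: float) -> int:
    """Refuse values that do not fit in a signed 16-bit integer."""
    n = int(x)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in 16 bits")
    return n


print(to_int16_wrapped(1234.0))   # fits: the value is preserved
print(to_int16_wrapped(40000.0))  # too big: the result comes out negative
```

Detecting the overflow is only half the job: the caller still has to decide what the system should do when the check fails.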
18 Bank of New York (1985)
- BoNY: the nation's largest clearer of government
securities.
- Software to track Federal securities transactions
wrote new information on top of old.
- Fed debited the bank for each transaction, but the
bank did not know who owed it how much.
- 90 minutes => $32 billion overdraft!
19 Cost of bug
- Bank had to borrow $24 billion from the Federal
Reserve. Interest paid: $5 million for 1 day.
(Annual earnings of bank: $120 million)
- BoNY share prices dropped by 25
- Federal funds rate dropped from 8.4% to 5.5%
- System down for 28 hours.
- Fear of financial crisis caused increase in price
of platinum!
20 Cause of bug
- Message buffer counter at BoNY system was 16 bits
long.
- Counters at Fed (and other banks): 32 bits.
- More than 32,000 transactions that morning!
=> Counter overflow
- Securities database corrupted.
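The counter mismatch can be sketched as below. This is not the bank's code; it assumes signed two's-complement counters that wrap silently (the "more than 32,000" figure is consistent with a signed 16-bit limit of 32,767), and `increment` is a hypothetical helper.

```python
def increment(counter: int, bits: int) -> int:
    """Add 1 to a signed two's-complement counter of the given width."""
    value = (counter + 1) & ((1 << bits) - 1)  # keep only `bits` bits
    if value & (1 << (bits - 1)):              # sign bit set: negative
        value -= 1 << bits
    return value


bank_counter = 0  # 16 bits wide, as at BoNY
fed_counter = 0   # 32 bits wide, as at the Fed

for _ in range(32_768):  # one message more than 16 signed bits can count
    bank_counter = increment(bank_counter, 16)
    fed_counter = increment(fed_counter, 32)

print(bank_counter, fed_counter)
```

After the same number of messages the two sides no longer agree on the count, even though neither one has reported an error.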
21 Therac-25
- Landmark case of how things can go terribly wrong
- Medical linear accelerator: radiation therapy for
cancer patients
- Used to zap tumors with high energy beams
- Electron beams for shallow tissue
- X-ray photons for deeper tissue
- Eleven Therac-25s were installed
- Six in Canada
- Five in the United States
- Developed by Atomic Energy of Canada Limited
(AECL).
22 Therac-25
- Improvements over Therac-20
- Uses new double-pass technique to accelerate
electrons.
- Machine itself takes up less space.
- Other differences from the Therac-20
- Software now coupled to the rest of the system
and responsible for safety checks.
- Hardware safety interlocks removed.
- Easier to use.
23 Therac-25
24 Therac-25
25 1985-1987: Six known accidents
- June 1985: Patient at Marietta, GA received
overdose
- July 1985: Hamilton, Ontario patient severely
burned, died that November.
- December 1985: Patient in Yakima, WA: overdose
26 Vernon Kidd
- Early March 1986, Tyler, TX
- Receives dose > 100 times too high
- Complained he felt burned.
- Engineer: "It's not possible for Therac-25 to give
an overdose."
- Engineering firm: "Machine does not appear capable
of giving a patient an electrical shock..."
- Died 5 months later
- Machine put back in use in late March
27 What went wrong?
- User Interface
- Operator entered code for high energy rather than
low energy
- Malfunction message
- Operator entered Proceed because system was
known to give quirky errors
- Result
- Turntable was in the wrong position
28 3 weeks later: Ryan Cox
- Second accident in Tyler, TX
- Same operator
- Patient died 1 month later
- This time they were able to reproduce the error
29 What would cause it to happen?
- Race conditions.
- Several different race condition bugs.
- Overflow error.
- The turntable position was not checked every
256th time the Class3 variable was incremented.
- No hardware safety interlocks.
- Wrong information on the console.
- Non-descriptive error messages.
- Malfunction 54
- H-tilt
- User-override-able error modes.
30 Source of the bug
- Incompetent engineering.
- Safety analysis excluded the software!
- No usability testing.
31 Integer overflow
- A known software design error
- Software included a set-up test before each
treatment.
- Tested various components.
- Variable incremented with each part of test:
X = X + 1
- 8 bits.
- Can store values from 0 thru 255
32 Integer overflow
256 = 1 0000 0000 (binary: nine bits, so only 0000 0000 fits in 8)
- IF X = 0 then PROCEED with treatment
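The 8-bit overflow described here can be sketched as follows. This is a simplified Python illustration, not the Therac-25 code: it assumes the variable is incremented once per set-up pass and that a value of 0 is read as "nothing left to check"; the function name is hypothetical.

```python
def count_skipped_checks(setup_passes: int) -> int:
    """Count how often an 8-bit flag wraps to 0 and the check is skipped."""
    x = 0
    skipped = 0
    for _ in range(setup_passes):
        x = (x + 1) & 0xFF  # X = X + 1, kept to 8 bits (0 thru 255)
        if x == 0:          # IF X = 0 then PROCEED with treatment
            skipped += 1    # the safety check did not run on this pass
    return skipped


print(count_skipped_checks(255))  # 0: the flag has not wrapped yet
print(count_skipped_checks(256))  # 1: the 256th increment wraps to 0
```

One way to avoid the wrap is to set the variable to a fixed nonzero value instead of incrementing it, so that it can never return to 0 by accident.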
33 Sources
- Leveson, N., Turner, C. S., "An Investigation of
the Therac-25 Accidents." IEEE Computer, Vol. 26,
No. 7, July 1993, pp. 18-41.
http://courses.cs.vt.edu/cs3604/lib/Therac_25/Therac_1.html
- "Information for this article was largely obtained
from primary sources including official FDA
documents and internal memos, lawsuit
depositions, letters, and various other sources
that are not publicly available."
The authors
34 Therac-25
- For you to consider: What did each group do
wrong? Where do ethics come in? What might be
said in their defense?
- AECL (the manufacturer)
- Programmers
- Operators (technicians)
- Hospitals
35 Ethical use?
- We'll get more into this tomorrow, but for a
brief introduction:
The ethical dimensions of computer reliability
are bound up with the nature of software, and
the complexity of such systems.
36 Ethics and complexity
I'm happy to work on games... critical systems
are scary. But I would like to make a
difference.
37 The nature of software
- Programs are large LOSC
- Programs are buggy
- Chef Robot?
- Incorrect algorithm can result in unexpected
output! (logic errors)
- Imagine far more complex systems.
38 The nature of software
- Fixing one bug can introduce others.
- Changing one part of program can affect many
other parts.
39 The nature of software
- No one person understands the whole system
- Programming teams
- Windows 2000: > 4,000 programmers
- Market pressures! Meaning what?
- Is it riskier not to have that system available?
40 The nature of software
- Unreasonable to demand zero risk from
anything?
- Doesn't hardware fail?
- How risky is too risky?
- Hopefully you can explain why the nature of
software is different in fundamental ways.
- What troubled the FBI Virtual Case File?
41 Many, many more stories
- Links will be added to the readings section of our
webpage
- http://www5.in.tum.de/huckle/bugse.html
- http://www.baddesigns.com/
42 Final Discussion
- Should Microsoft be held responsible for the
business problems and viruses caused by security
holes in their software?
43 Term Project
- Many parts, details, dates, etc.: see the course
website!
44 Next time
- Reading 8
- Assigned: All portions of the term project
- Respond to listserv message
- Tomorrow: Bring your Initial Topic Preference
worksheet to class!