Title: Recovery-Oriented Computing User Study
1. Recovery-Oriented Computing User Study
- Training Materials
- October 2003
2. Overview
- Informed consent and introduction
- User study scenario and your role
- Training (20 minutes)
- Two study sessions (30 minutes each)
- Wrap-up and questionnaire
3. Informed Consent
- Please read the overview of the study and the informed consent form
  - please feel free to ask any questions you have about the experiment, its goals, its procedures, etc.
- If you agree to participate in the experiment, please sign the informed consent form
4. Introduction
- This study is evaluating new recovery tools
  - the tools are designed to help system administrators recover from problems affecting server systems
- You will be playing the role of a system administrator
  - in each of two sessions, you will be trying to recover an e-mail server system from a pre-existing problem
5. Introduction (2)
- In each session, you may (or may not) be given an experimental recovery tool to use
- We are trying to understand when the tool is useful for you and when it is not
  - so if you are given the tool, please think carefully about whether or not to use it when you are attempting to recover from a problem
  - at the end of the session, you will be asked to explain why you chose to use (or not use) the tool
6. The Scenario
7. User Study Scenario
- You are one of several system administrators of an electronic mail (e-mail) service
  - the administrators work in shifts
  - the study starts when you arrive for your shift
- You arrive to find users complaining that the e-mail service is not working
  - you will be provided with details of the complaint
  - the e-mail failure may be caused by
    - failure of the e-mail software, or
    - an error made by the administrator on the previous shift
8. User Study Scenario: Your Role
- Your responsibilities and goals
  - restore the e-mail service to normal operation as quickly as possible
  - minimize the amount of lost e-mail and user work
- Note
  - you should prioritize restoring service over preserving changes made by other administrators
9. User Study Scenario: Resources
- Resources you will have
  - a log of all actions performed by administrators in previous shifts
  - a day-old backup of the server's file systems
  - the Internet
  - a test e-mail account
  - a guru
    - during each session, you may make up to one request for help to the guru
- Plus any experimental recovery tool that we provide (described later)
10. Training: E-mail Server
11. E-mail Overview
- This study concerns e-mail store servers
  - e-mail stores receive and store e-mail for their users
    - users' mailboxes live on the e-mail store
  - they do not handle sending or routing of outgoing mail
- E-mail stores use two protocols
  - SMTP: used to deliver incoming e-mail to a mailbox
    - SMTP is spoken between a remote server that sends the message and the local recipient e-mail store server
  - IMAP: used to retrieve and manipulate mail in a mailbox
    - IMAP is spoken between a user's e-mail client and their local e-mail store server
12. E-mail Server Configuration
[Diagram: the e-mail server (Linux), undovmN.cs.berkeley.edu with N = 1, 2, 3. Incoming e-mail arrives from the Internet over SMTP at the SMTP server process (sendmail). Users read e-mail over IMAP through the IMAP server process (imapd). Mailboxes are stored as /var/mail/userNNN.]
- Mailboxes are text files in /var/mail, e.g. /var/mail/user173
- sendmail: process that receives and delivers incoming e-mail
- imapd: process that provides remote access to mailboxes
- Mail store configuration files can be found in /etc/mail
13. Simple Familiarization Task
- Take some time to get familiar with the console and the e-mail system
  - by performing a basic task as described below
- Goals
  - ensure sendmail is running
  - reconfigure server to recognize mail sent to user@roc.cs.berkeley.edu
  - restart sendmail to activate reconfiguration
- First step
  - connect to undovm3.cs.berkeley.edu with ssh
- continues...
14. Simple Familiarization Task (2)
- Next, check if sendmail is running
  - execute the command: ps ax | grep sendmail
- Reconfigure server to accept new host name
  - edit /etc/mail/local-host-names to add the line: roc.cs.berkeley.edu
- Finally, restart sendmail
  - run: /etc/init.d/sendmail restart
- Try this task now!
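The reconfiguration step above amounts to adding one line to a text file, and it is easy to add the same line twice when repeating the exercise. The following is a minimal Python sketch of an idempotent version of that edit. The helper name add_local_host_name is hypothetical (it is not part of sendmail or of the study materials), and the sketch assumes the file lists one host name per line with optional "#" comments. It takes the path as a parameter so it can be tried on a copy of the file instead of the live /etc/mail/local-host-names.

```python
from pathlib import Path


def add_local_host_name(path, host):
    """Append `host` to a local-host-names style file unless it is already listed.

    Hypothetical helper for illustration only. Returns True if the file was
    changed and False if the host name was already present.
    """
    file = Path(path)
    text = file.read_text() if file.exists() else ""
    # Ignore blank lines and comment lines when looking for an existing entry.
    entries = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    if host in entries:
        return False
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    file.write_text(text + host + "\n")
    return True
```

As in the manual procedure, sendmail still has to be restarted afterwards for the change to take effect.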
15. Training: Experimental Recovery Tool
16. Recovery Tool: an Undo System
- The undo system can undo administrative changes to the e-mail store, including
  - changes to configuration files
  - software upgrades
  - deleted or altered files
- It can be used to restore the e-mail server to a previously known-good state
  - by rewinding to a date when the system worked OK
- The undo system preserves incoming e-mail and user mailbox changes
17. When Can the Undo System Help?
- The undo system is useful
  - when you cannot tell what is causing a problem
    - but you know that the system was working at some point in the past
  - when a problem affects system state
    - typically, the same cases where restoring a backup would fix the problem
- It does not help when the problem does not affect state
  - for example, if a server process (e.g., sendmail) has crashed cleanly without corrupting state
18. Why Use the Undo System?
- Unlike using a backup, the undo system also repairs the side effects of problems
  - example: if a problem caused e-mail to be lost, using undo to fix the problem will restore the lost e-mail
  - the undo system does this by recording incoming e-mail and users' mailbox edits, then restoring them during recovery
- Undo is also useful when you cannot diagnose a problem
  - simply undo the system to a point in time when it was known to be working
19. Undo System Operation
- An undo cycle has two stages
  - rewind: the e-mail system's state is reverted to the way it appeared at a past time (the rewind point)
    - all changes to the system made since the rewind point are undone, including
      - changes made by administrators
      - changes due to software bugs
      - incoming e-mail delivery and user mailbox edits
  - commit: makes the rewind permanent but restores incoming e-mail and user mailbox edits to present time
- Net effect: an undo cycle undoes all changes except incoming e-mail and mailbox edits
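The two-stage cycle can be made concrete with a toy model. The Python sketch below is not the undo system's implementation. It only mirrors the semantics described on this slide, under the assumption that the system's history can be treated as a list of timestamped events tagged as "admin", "bug", or "user". The names Event, rewind, and commit are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: int   # minutes since an arbitrary starting point
    kind: str   # "admin", "bug", or "user" (incoming e-mail, mailbox edit)
    name: str


def rewind(history, rewind_point):
    """Split history into events kept and events undone by the rewind.

    Everything after the rewind point is undone, including user events.
    """
    kept = [e for e in history if e.time <= rewind_point]
    undone = [e for e in history if e.time > rewind_point]
    return kept, undone


def commit(kept, undone):
    """Make the rewind permanent, restoring only the undone user events."""
    restored = [e for e in undone if e.kind == "user"]
    return kept + restored


history = [
    Event(0, "user", "mail A"),
    Event(5, "admin", "edit config"),
    Event(7, "user", "mail B"),
    Event(9, "bug", "corrupt file"),
]
kept, undone = rewind(history, rewind_point=2)
final = commit(kept, undone)
names = [e.name for e in final]  # ['mail A', 'mail B']
```

In this example the admin change and the bug's effect stay undone, while both pieces of mail survive, which matches the net effect stated above.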
20. Illustration of Undo Cycle
[Figure: three timelines, each plotting user events (incoming e-mail, mailbox edits) and admin changes against time.
 1. Before undo: user events and admin changes accumulate over time.
 2. After rewind: all changes after the rewind point are undone.
 3. After commit: the user events are restored; note that admin changes remain undone.]
21. Controls for the Undo System
- Rewind: begins an undo cycle
  - defines a rewind point and undoes all later changes
  - may cause e-mail server to automatically reboot
  - takes 4 to 5 minutes to execute
- Commit: completes the undo cycle
  - makes the rewind permanent
  - restores incoming e-mail and mailbox edits to present time
  - takes about 5 minutes to execute
- Cancel: aborts the undo cycle
  - restores e-mail server to the state it was in before rewinding
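The three controls can be read as transitions in a small state machine. The Python sketch below is a toy model, not the tool's actual behavior. It assumes that Commit and Cancel are only meaningful after a Rewind, and that a second Rewind is not allowed while a cycle is in progress. The slides do not state either rule explicitly, and the class name UndoCycle is hypothetical.

```python
class UndoCycle:
    """Toy model: 'normal' -> rewind -> 'rewound' -> commit or cancel -> 'normal'."""

    def __init__(self):
        self.state = "normal"
        self.rewind_point = None

    def rewind(self, point):
        # Assumption: only one undo cycle can be in progress at a time.
        if self.state != "normal":
            raise RuntimeError("an undo cycle is already in progress")
        self.state = "rewound"
        self.rewind_point = point

    def _finish(self, outcome):
        # Assumption: commit and cancel require a preceding rewind.
        if self.state != "rewound":
            raise RuntimeError("no undo cycle in progress; rewind first")
        self.state = "normal"
        self.rewind_point = None
        return outcome

    def commit(self):
        """Complete the cycle: the rewind becomes permanent."""
        return self._finish("committed")

    def cancel(self):
        """Abort the cycle: the pre-rewind state comes back."""
        return self._finish("cancelled")
```

Given the execution times listed above, a full Rewind followed by Commit costs roughly ten minutes of a 30-minute session, so the choice of rewind point is worth making carefully before starting a cycle.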
22. Undo System Interface
- time is divided into 5-minute intervals
- each interval contains user events like incoming mail
- it's fastest to rewind to a checkpoint
[Screenshot of the main window, with callouts: intervals; intervals containing checkpoints; timeline (color indicates relative load); checkpoints; current time; current undo status.]
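To make the interval and checkpoint ideas concrete, here is a small Python sketch. It assumes that the 5-minute intervals are aligned to the clock (e.g., 14:35 to 14:40). The slides do not say how the tool aligns its intervals, so this is a guess, and both helper names are hypothetical.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=5)


def interval_bounds(t):
    """Return (start, end) of the clock-aligned 5-minute interval containing t."""
    start = t.replace(minute=t.minute - t.minute % 5, second=0, microsecond=0)
    return start, start + INTERVAL


def latest_checkpoint_before(t, checkpoints):
    """Return the most recent checkpoint at or before t, or None if there is none."""
    earlier = [c for c in checkpoints if c <= t]
    return max(earlier) if earlier else None
```

For example, a problem first noticed at 14:37:12 falls in the interval that starts at 14:35, and the fastest rewind target would be the latest checkpoint at or before the time the system was last known to work.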
23. Undo System Interface (2)
- Main window: rewound state
[Screenshot callouts: current time (in the past) indicates undo point; current undo status; history of undo operations; Commit and Cancel buttons.]
24. Undo System Interface (3)
- Event window
  - used to initiate rewind
  - to view, double-click on an interval in main window
[Screenshot callouts: click to invoke undo cycle; selected event (rewind point); current time; description of event (here, user170 is examining their mailbox); event sequence.]
25. Familiarization, Part II
- Try out the undo system interface
  - note: actually performing an undo cycle may take 10 or more minutes to complete
- Familiarize yourself with the various resources available to you during the study
  - Outlook Express e-mail client
  - the test e-mail account: user250@undovmN.cs.berkeley.edu (N = 1, 2, 3)
  - the system backup: /backup
  - books, documentation, the Internet
  - guru advice: at most one question per session
26. Resources for More Information
- E-mail in general
  - About Internet email protocols: http://perl.about.com/library/weekly/aa020600a.htm
  - E-mail references: http://www.newt.com/email/references.html
- Sendmail
  - O'Reilly Sendmail book (next to your workstation)
  - Sendmail home page: http://www.sendmail.org
  - SMTP RFC: http://www.isi.edu/in-notes/rfc2821.txt
- IMAPd
  - IMAP general info: http://www.imap.org/
  - UW-IMAP home page: http://www.washington.edu/imap/
  - IMAP RFC: http://www.isi.edu/in-notes/rfc3501.txt