Title: Fine-Grained Failover Using Connection Migration
1. Fine-Grained Failover Using Connection Migration
- Alex C. Snoeren, David G. Andersen, Hari Balakrishnan
- MIT Laboratory for Computer Science
2. The Problem
[Diagram: a client's connection to a content server fails; this happens more often than users want to know.]
3. Solution: Server Redundancy
Use a healthy one at all times.
4. Failover Components
- Health Monitoring (see the sketch after this list)
- Connection Resumption
- Server Selection
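As a rough illustration of the first and third components, here is a minimal health-monitoring and server-selection loop. Everything in it (server names, probe port, intervals) is an assumption made for the sketch, not part of the Migrate system.

```python
# Hypothetical sketch: periodically probe replica servers and pick a healthy
# one for new or resumed connections.
import socket
import time

SERVERS = ["serverA.example.com", "serverB.example.com"]  # assumed replicas
PROBE_PORT = 80
PROBE_INTERVAL = 2.0   # seconds between probe rounds (assumed)
PROBE_TIMEOUT = 0.5    # seconds before a probe counts as failed (assumed)

healthy = set()

def probe(host):
    """Return True if a TCP connection to host:PROBE_PORT succeeds quickly."""
    try:
        with socket.create_connection((host, PROBE_PORT), timeout=PROBE_TIMEOUT):
            return True
    except OSError:
        return False

def monitor_once():
    """One probe round: update the set of healthy servers."""
    for host in SERVERS:
        if probe(host):
            healthy.add(host)
        else:
            healthy.discard(host)

def select_server():
    """Server selection: any healthy replica, or None if all appear down."""
    return next(iter(healthy), None)

if __name__ == "__main__":
    while True:
        monitor_once()
        print("healthy:", sorted(healthy), "selected:", select_server())
        time.sleep(PROBE_INTERVAL)
```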
5. Today's Replication Technology
- DNS/Content Routing
  - Wide-area replication
  - Needs client awareness
- Layer 4/Web Switches
  - Transparent, possibly mid-stream failover
  - Requires co-location
[Diagram: a Web Switch fronting co-located servers.]
6. Ideal Technology
- Wide-area replication
  - Yet somehow synchronize replica servers
- Transparent failover
  - Enable other servers to continue connections
7. Migrate Architecture
- Stream Mapping
  - Infer application state from transport-layer information
- Connection Migration
  - Transparently hand off sessions between servers
[Diagram: replica servers, each running a Stream Mapper.]
8. Stream Mapping
[Diagram: client request and server response.]
Server response:
  TCP ISS 083521
  HTTP/1.1 200 OK
  Content-Length: 328987
  ...
  Content-Type: video/mpeg
  TCP SeqNo 083346

Stream Map (see the sketch below):
  Client          | Object (URL)          | Offset (TCP SeqNo)
  128.89.3.244234 | /StreamingContent.mpg | 083346
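A minimal sketch of the stream map as a data structure, assuming it is keyed by client and that progress is inferred by subtracting the recorded starting sequence number from the current one; names and example values below are illustrative, not taken from the implementation.

```python
# Hypothetical stream-map sketch: relate each connection's requested object to
# the TCP sequence number at which its data begins, so a support server can
# resume the byte stream mid-object.
from dataclasses import dataclass

@dataclass
class StreamMapEntry:
    client: str        # client address, e.g. "ip:port"
    url: str           # requested object, e.g. "/StreamingContent.mpg"
    start_seqno: int   # TCP sequence number where the object's data starts

stream_map: dict[str, StreamMapEntry] = {}

def record_request(client: str, url: str, start_seqno: int) -> None:
    """Install an entry when the mapper observes a request/response pair."""
    stream_map[client] = StreamMapEntry(client, url, start_seqno)

def bytes_already_sent(client: str, current_seqno: int) -> int:
    """Infer progress from transport-layer state alone: distance from the
    object's starting sequence number, modulo 2**32 for wraparound."""
    entry = stream_map[client]
    return (current_seqno - entry.start_seqno) % (1 << 32)

# Illustrative usage with made-up numbers:
record_request("10.0.0.1:4321", "/StreamingContent.mpg", start_seqno=83346)
print(bytes_already_sent("10.0.0.1:4321", current_seqno=150000))  # -> 66654
```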
9. Anatomy of Failover
[Diagram: a client's connection fails over to another server in the support group.]
10. Support Groups
- Set of partially mirrored servers
  - All servers able to provide the same content
  - Can be topologically diverse
- Synchronize on a per-connection basis (see the sketch below)
  - Servers need not be complete mirrors
  - Connections from a failed server can be handled by a different support server
  - Connections may have distinct support groups
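A toy sketch of per-connection support-group selection, under the assumption that a connection's group is simply every other server able to serve the requested object; the catalog and server names are invented for illustration.

```python
# Hypothetical sketch: partially mirrored servers yield distinct, per-connection
# support groups, since only servers holding the requested object can take over.
CONTENT = {  # assumed catalog of which server hosts which objects
    "serverA": {"/StreamingContent.mpg", "/index.html"},
    "serverB": {"/StreamingContent.mpg"},
    "serverC": {"/index.html"},
}

def support_group(url: str, primary: str) -> set[str]:
    """Servers (other than the primary) able to continue a connection for url."""
    return {s for s, objs in CONTENT.items() if url in objs and s != primary}

# Two connections served by serverA end up with distinct support groups:
assert support_group("/StreamingContent.mpg", "serverA") == {"serverB"}
assert support_group("/index.html", "serverA") == {"serverC"}
```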
11. Soft-State Synchronization
- Synchronize within support groups
  - Periodic advertisements (see the sketch below)
  - Advertise client application object requests
  - Communicate initial transport-layer state
- Only initial state need be communicated
  - Current info inferred from transport layer
  - Clients will reject redundant migrates from stale support servers
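A sketch of what the soft-state advertisements might carry and how a support server could age them out; the message fields, intervals, and class names are assumptions, not the system's actual wire format.

```python
# Hypothetical soft-state sketch: each advertisement names the client, the
# requested object, and the initial transport-layer state (the server ISS);
# entries expire unless refreshed, so stale servers fall out of the group.
import time
from dataclasses import dataclass

ADVERT_INTERVAL = 5.0   # seconds between advertisements (assumed)
SOFT_STATE_TTL = 15.0   # entries not refreshed within this window are dropped

@dataclass
class Advertisement:
    client: str      # client address
    url: str         # object being served
    server_iss: int  # server's initial sequence number for the connection

class SoftState:
    """Per-support-server cache of advertised connections."""
    def __init__(self):
        self._entries: dict[str, tuple[Advertisement, float]] = {}

    def absorb(self, ad: Advertisement) -> None:
        # Refresh (or install) the entry; only initial state is carried,
        # since current progress can be inferred from the live TCP stream.
        self._entries[ad.client] = (ad, time.monotonic())

    def expire(self) -> None:
        now = time.monotonic()
        self._entries = {c: (ad, t) for c, (ad, t) in self._entries.items()
                         if now - t < SOFT_STATE_TTL}

    def lookup(self, client: str):
        ad_t = self._entries.get(client)
        return ad_t[0] if ad_t else None
```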
12. TCP Connection Migration
Client-server message exchange (sketched below):
1. Initial SYN
2. SYN/ACK
3. ACK (with data)
4. Normal data transfer
5. Migrate SYN
6. Migrate SYN/ACK
7. ACK (with data)
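The following toy simulation just walks the seven-step exchange above as data; in the real system migration support is negotiated during the initial handshake and the Migrate SYN carries a token identifying the original connection, and none of the names or values here come from that implementation.

```python
# Hypothetical walk-through of the slide's ladder diagram, not real TCP.
from dataclasses import dataclass

@dataclass
class Segment:
    kind: str             # "SYN", "SYN/ACK", "ACK", "DATA", "MIGRATE_SYN", ...
    token: int | None = None

def connection_sequence(token: int) -> list[Segment]:
    """The seven steps of the slide's diagram, as a flat list."""
    return [
        Segment("SYN"),                  # 1. initial SYN (migration negotiated here)
        Segment("SYN/ACK"),              # 2. server's SYN/ACK
        Segment("ACK"),                  # 3. ACK, possibly carrying data
        Segment("DATA"),                 # 4. normal data transfer
        Segment("MIGRATE_SYN", token),   # 5. SYN naming the original connection
        Segment("MIGRATE_SYN/ACK"),      # 6. acceptance; transfer can resume
        Segment("ACK"),                  # 7. ACK, possibly carrying data
    ]

for step, seg in enumerate(connection_sequence(token=0xBEEF), start=1):
    print(step, seg.kind, "" if seg.token is None else hex(seg.token))
```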
15. Implementation
- Software Wedge (see the sketch below)
- Stream Mapping
- Synchronization
[Diagram: stream-mapping Wedges, one between each Server App and the Client.]
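A rough stand-in for the software wedge, sketched as a userland TCP relay that forwards bytes between the client and the server application while recording the requested object and the byte count already delivered; the addresses, ports, and request parsing are assumptions made for the sketch, not the actual implementation.

```python
# Hypothetical wedge sketch: relay client <-> server app and keep a tiny
# per-connection stream-map record as a side effect.
import re
import socket
import threading

LISTEN_ADDR = ("0.0.0.0", 8080)        # assumed wedge listening address
SERVER_APP_ADDR = ("127.0.0.1", 8000)  # assumed local server application

def relay(src, dst, on_bytes=None):
    """Copy bytes src -> dst until EOF, calling on_bytes for accounting."""
    while True:
        chunk = src.recv(4096)
        if not chunk:
            break
        if on_bytes:
            on_bytes(chunk)
        dst.sendall(chunk)

def handle(client):
    upstream = socket.create_connection(SERVER_APP_ADDR)
    state = {"url": None, "sent": 0}   # one-entry stream map for this connection

    def on_request(chunk):
        m = re.match(rb"GET (\S+) HTTP/1\.[01]", chunk)
        if m:
            state["url"] = m.group(1).decode()   # object being served

    def on_response(chunk):
        state["sent"] += len(chunk)              # bytes already delivered

    threading.Thread(target=relay, args=(client, upstream, on_request),
                     daemon=True).start()
    relay(upstream, client, on_response)         # server -> client in this thread
    print("stream map entry:", state)
    upstream.close()
    client.close()

def main():
    with socket.socket() as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(LISTEN_ADDR)
        srv.listen()
        while True:
            conn, _ = srv.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

if __name__ == "__main__":
    main()
```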
16. Wedge Overhead
[Plot: microseconds per request vs. request size (Kbytes) on log-log axes, comparing transfers through the Wedge with direct transfers.]
17. Experimental Topology
[Diagram: a client connected over 128 Kb/s links to two Linux/Apache 1.3 servers, A and B. The client initiates a transfer to A, then migrates to B, and back to A.]
18. Varying Oscillation Rates
[Plot: goodput (bytes) vs. time (secs) over 60 seconds, comparing no oscillations with oscillation periods of 2, 5, 10, and 12 seconds.]
19. Benefits and Limitations
- Enable wide-area server replication
- Low server synchronization overhead
  - Infer current state from transport layer
- Robust even under adverse loads
  - Health monitors can be overly reactive
- Gracefully handle cascaded failures
- Leverages connection migration
  - Requires a modern transport stack
20. Networks and Mobile Systems
- Software available on the web
- http://nms.lcs.mit.edu/software/migrate