1
Secure Fail-Over Protocol

Sonja Tideman
Advisor: Dr. Indrajit Ray,
Colorado State University
August 9, 2002

2
The Dream

Grid Computing on a massive, global scale
Using a truly distributed network operating
system where even high-end multi-processor
supercomputers are treated as simply another
simple resource
Anyone can connect anytime, anywhere, on any
device
The cost of entering the system is no more than
the cost of the device: a user puts the OS on
the device and is given full access to the system

3
The Reality

Need to make it an attractive option for
corporations, research labs, and educational
institutions
Give them what they want, but cheaper and easier
Means security, reliability, performance,
storage, and other Quality of Service (QoS)
guarantees
Not there yet

4
Fail-Over Protocol: how this fits into the dream

Business needs to provide clients with secure,
reliable service
Typically utilizes a network of servers scattered
geographically to provide services
Users connect to one machine in the network
(round-robin DNS)
But what happens when a server goes down or is
compromised?
Clients usually have to have existing connections
terminated and restarted
This can cause problems for both the server and
the client
The goal is to get the client to be simply
switched over to another server on the network in
a transparent (to the application) manner

5
Related Work

Significant amount of related work
Main contribution is that protocol does not rely
on either the server or the client being trusted
Protocol also includes server-to-server
monitoring method

6
Related Work, Cont

Migratory TCP done at Rutgers
Process migration work
Computing Communities research done at New York
University and Arizona State University
Similar effect as fail-over protocol, different
approach
TCP Migration

7
TCP Migration Work

Protocol described in Internet Draft
draft-snoeren-tcp-migrate-00
Offers a secure cryptographic key to be sent with
the TCP connection
System similar to the fail-over protocol
described in "Fine-Grained Failover Using
Connection Migration",
Proceedings of the 3rd USENIX Symposium on Internet
Technologies and Systems

8
Differences from TCP Migration Work

Do not want to allow server that is initially
accessed to know cryptographic key
So client encrypts it with public key of a
trusted coordinator
Protocol specifies format for health monitoring
Allows monitoring applications to be added

9
Protocol Overview: Server

Protocol to allow servers to monitor each other
using many different monitoring applications
At the moment, all have to conform to a Java
interface, but that could change
Protocol to allow servers to join the monitoring
network in a secure fashion
Also allows servers to be able to get enough
information about another server's client
interactions to be able to take over the
interactions
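
The slides say that every monitoring application must conform to a Java interface but do not show it. The following is a minimal sketch of what such an interface might look like. The names `HealthMonitor`, `HealthStatus`, and `MonitorSuite`, and the rule of keeping the worst status, are assumptions made for illustration and are not taken from the presentation.

```java
import java.util.List;

/** Hypothetical result of one monitoring run; ordered from best to worst. */
enum HealthStatus { HEALTHY, SUSPECT, FAILED }

/** Hypothetical interface that each monitoring application would implement. */
interface HealthMonitor {
    /** Short name used when the monitor is registered. */
    String name();

    /** Runs one check against the given peer server and reports its status. */
    HealthStatus check(String peerAddress);
}

/** Runs a suite of monitors against one peer and keeps the worst status seen. */
final class MonitorSuite {
    private final List<HealthMonitor> monitors;

    MonitorSuite(List<HealthMonitor> monitors) {
        this.monitors = List.copyOf(monitors);
    }

    HealthStatus run(String peerAddress) {
        HealthStatus worst = HealthStatus.HEALTHY;
        for (HealthMonitor monitor : monitors) {
            HealthStatus status = monitor.check(peerAddress);
            if (status.compareTo(worst) > 0) {
                worst = status; // enum declaration order doubles as severity order
            }
        }
        return worst;
    }
}
```

New monitoring applications could then be added by implementing `HealthMonitor` and handing the instance to the suite, without changing the suite itself.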

10
Protocol Overview: Server

Peer-to-peer monitoring
Each server runs all of its monitoring programs
on the other servers
Depending on size of system, it may run on all
other servers or on a subset given to each server
by the coordinator
One central coordinator
Handles decision to terminate a server's client
connections, server join/remove requests, and
registration of new monitoring programs
Is either connected via an internal-only link (no
outside access) or may eventually be fully
distributed as well

11
Protocol Overview: Client

Should be mostly transparent to the client
Kernel will handle necessary overhead
Must recognize and set up migrate option fields in
TCP options field
Must verify security information
Client must have published public key
Could also be sent with the initial request

12
Sequence Overview

[Diagram: Client, Server A, Server B, and Coordinator. Messages 1A to 4A form the connection sequence; messages 1B and 2B form the fail-over sequence.]

13
Connection Sequence

Client requests service from server (1A)
Must set migrate-permitted option in TCP header
As outlined in TCP migration draft, would also
specify parameters for generation of the
cryptographic cookie
All parameters would be encrypted with public key
of coordinator
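
Encrypting the parameters under the coordinator's public key, so that the initially accessed server cannot read them, can be sketched as follows. The slides do not name a cipher or a parameter format; RSA with OAEP padding, the 2048-bit key size, and the class name `SealedParameters` are all assumptions made for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

/** Hypothetical helper: the client seals parameters, only the coordinator can open them. */
final class SealedParameters {
    // Assumed cipher choice; the presentation does not specify one.
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Client side: encrypt the cookie-generation parameters for the coordinator. */
    static byte[] seal(byte[] params, PublicKey coordinatorKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, coordinatorKey);
        return cipher.doFinal(params);
    }

    /** Coordinator side: recover the parameters forwarded by the server. */
    static byte[] open(byte[] sealed, PrivateKey coordinatorKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, coordinatorKey);
        return cipher.doFinal(sealed);
    }

    /** Generates a key pair for experiments; a real coordinator would load a stored key. */
    static KeyPair newCoordinatorKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }
}
```

Plain RSA only fits short payloads, so a larger parameter block would likely need a hybrid scheme; that choice is outside what the slides describe.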

14
Connection Sequence, Cont

Server sends coordinator parameter data (2A)
Coordinator hands back a ticket encrypted with
the client's public key (3A)
Contains the cryptographic cookie necessary for
TCP migration, a timestamp, and a unique
transaction ID
Server hands the ticket back to the client,
continues with transaction (4A)
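
The ticket contents listed above (cryptographic cookie, timestamp, unique transaction ID) could be carried in a structure like the one below. The byte layout, the field widths, and the class name `Ticket` are assumptions; the slides only name the three fields. Encryption under the client's public key would be applied to the encoded bytes and is left out here.

```java
import java.nio.ByteBuffer;

/** Hypothetical plaintext form of the ticket before it is encrypted for the client. */
final class Ticket {
    final byte[] cookie;          // cryptographic cookie used for TCP migration
    final long timestampMillis;   // when the coordinator issued the ticket
    final long transactionId;     // unique ID for this client connection

    Ticket(byte[] cookie, long timestampMillis, long transactionId) {
        this.cookie = cookie.clone();
        this.timestampMillis = timestampMillis;
        this.transactionId = transactionId;
    }

    /** Assumed layout: 4-byte cookie length, cookie, 8-byte timestamp, 8-byte ID. */
    byte[] encode() {
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + cookie.length + 2 * Long.BYTES);
        buffer.putInt(cookie.length).put(cookie).putLong(timestampMillis).putLong(transactionId);
        return buffer.array();
    }

    /** Parses the layout written by encode(), rejecting inconsistent lengths. */
    static Ticket decode(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        int length = buffer.getInt();
        if (length < 0 || length > buffer.remaining() - 2 * Long.BYTES) {
            throw new IllegalArgumentException("malformed ticket");
        }
        byte[] cookie = new byte[length];
        buffer.get(cookie);
        long timestampMillis = buffer.getLong();
        long transactionId = buffer.getLong();
        return new Ticket(cookie, timestampMillis, transactionId);
    }
}
```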

15
Fail Over Sequence

Network of machines detects failure
Coordinator selects one machine to take over
Selection could be based on system load, and/or
geographic proximity
Hands new server a ticket encrypted with public
key of client (1B)
Contains cryptographic cookie, timestamp, and the
unique transaction ID
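
One way the coordinator might pick the takeover machine from system load and geographic proximity is a weighted score, sketched below. The slides do not say how the two criteria are combined; the linear score, the weights, and the names `Candidate` and `TakeoverSelector` are assumptions.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical view of a healthy server as seen by the coordinator. */
final class Candidate {
    final String name;
    final double load;        // assumed scale: 0.0 (idle) to 1.0 (saturated)
    final double distanceKm;  // assumed distance from the client

    Candidate(String name, double load, double distanceKm) {
        this.name = name;
        this.load = load;
        this.distanceKm = distanceKm;
    }
}

final class TakeoverSelector {
    /** Returns the candidate with the lowest weighted score, or empty if none is available. */
    static Optional<Candidate> select(List<Candidate> healthy, double loadWeight, double distanceWeight) {
        return healthy.stream().min(Comparator.comparingDouble(
                (Candidate c) -> loadWeight * c.load + distanceWeight * c.distanceKm));
    }
}
```

Setting either weight to zero reduces the choice to the other criterion alone, which matches the "and/or" wording on the slide.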

16
Fail Over Sequence, Cont

New server sends the ticket to the client (2B) in a TCP
packet with the Migrate option set
Client gets ticket, decrypts it, and verifies
Transaction ID is the same
Cryptographic key is the same
Timestamp is reasonable (within accepted range)
Client accepts the new server's messages, rejects
any subsequent messages from the old server
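
The three client-side checks above can be sketched as a single predicate. The method signature, the symmetric skew window, and the class name `TicketCheck` are assumptions; the slides only state that the transaction ID and key must match and that the timestamp must fall within an accepted range.

```java
import java.security.MessageDigest;
import java.time.Duration;

/** Hypothetical client-side check applied to a decrypted fail-over ticket. */
final class TicketCheck {
    static boolean accept(long expectedTransactionId, byte[] expectedCookie,
                          long ticketTransactionId, byte[] ticketCookie,
                          long ticketTimestampMillis, long nowMillis, Duration maxSkew) {
        boolean sameTransaction = expectedTransactionId == ticketTransactionId;
        // Constant-time comparison avoids leaking how many cookie bytes matched.
        boolean sameCookie = MessageDigest.isEqual(expectedCookie, ticketCookie);
        // Assumed rule: the timestamp may differ from local time by at most maxSkew.
        boolean fresh = Math.abs(nowMillis - ticketTimestampMillis) <= maxSkew.toMillis();
        return sameTransaction && sameCookie && fresh;
    }
}
```

Only when all three checks pass would the client switch to the new server and start rejecting messages from the old one.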

17
Protocol to join the group

Server sends join request to coordinator
Coordinator asks network of machines to decide to
accept or reject the request
Network of machines each run suite of monitoring
applications on new machine
May be a subset of the entire network of machines,
based on physical location

18
Protocol to join the group

Each machine decides to accept or reject the
request based on status returned from monitoring
suite
Decision based on voting mechanism
Only machines that ran suite may vote
Each machine will report the result of running
the suite
Use algorithm for Byzantine Generals problem
proposed by Lamport et al. (1982)
Allows for m failures in a group of 3m + 1
Every machine stores the result from every other
machine
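
The fault bound and the final tally can be illustrated with a small sketch. This is not the full Lamport et al. algorithm, which relays messages recursively between machines; it only shows the arithmetic of the 3m + 1 bound and a simple majority over the stored results. The class and method names are assumptions.

```java
import java.util.List;

/** Hypothetical helpers for the join vote; not the complete Byzantine agreement protocol. */
final class JoinVote {
    /** Largest m such that the group size is at least 3m + 1. */
    static int maxFaulty(int voters) {
        if (voters < 1) {
            throw new IllegalArgumentException("need at least one voter");
        }
        return (voters - 1) / 3;
    }

    /** True when strictly more than half of the reported results accept the new server. */
    static boolean majorityAccepts(List<Boolean> reports) {
        long accepting = reports.stream().filter(accepted -> accepted).count();
        return accepting * 2 > reports.size();
    }
}
```

For example, a voting group of 4 machines tolerates 1 faulty member and a group of 7 tolerates 2, while a group of 3 tolerates none.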

19
Logging Mechanism

Each initial client connection generates a log
Log initially contains client IP address, type of
service requested, and TCP header (port numbers,
sequence number, etc.)
Log updated on status change
Based on type of service
May be pruned periodically
Ack and Seq numbers will be updated each time the log is
updated
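
A per-connection log entry with the fields listed above might look like the following sketch. The class name `ConnectionLog`, the field names, and the free-form `serviceState` string are assumptions; the slides list what is recorded but not how it is structured.

```java
/** Hypothetical per-connection log entry kept by the serving machine. */
final class ConnectionLog {
    final String clientIp;
    final String service;      // type of service requested, e.g. "ftp"
    final int clientPort;
    final int serverPort;
    private long seq;          // TCP sequence number (32-bit value held in a long)
    private long ack;          // TCP acknowledgment number
    private String serviceState = ""; // service-specific status, assumed to be free-form text

    ConnectionLog(String clientIp, String service, int clientPort, int serverPort, long seq, long ack) {
        this.clientIp = clientIp;
        this.service = service;
        this.clientPort = clientPort;
        this.serverPort = serverPort;
        this.seq = seq;
        this.ack = ack;
    }

    /** Called on a status change; the Seq and Ack numbers are refreshed on every update. */
    void update(long seq, long ack, String serviceState) {
        this.seq = seq;
        this.ack = ack;
        this.serviceState = serviceState;
    }

    long seq() { return seq; }
    long ack() { return ack; }
    String serviceState() { return serviceState; }
}
```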

20
Logging Mechanism

Each server sends out initial log to each other
server that it is monitoring
Encrypted with coordinator key
Will then send out encrypted updates periodically
Will send notification when client connection has
terminated
All servers will then remove the log
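
On the receiving side, a monitoring server only needs to hold the encrypted logs it is sent and drop them on termination, since it cannot read them without the coordinator key. A minimal sketch follows; keying the store by transaction ID and the class name `PeerLogStore` are assumptions.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical store a monitoring server keeps for the encrypted logs of its peers. */
final class PeerLogStore {
    // Assumed key: the unique transaction ID of the client connection.
    private final Map<Long, byte[]> sealedLogs = new ConcurrentHashMap<>();

    /** Handles both the initial log and later periodic updates; the newest copy wins. */
    void store(long transactionId, byte[] sealedLog) {
        sealedLogs.put(transactionId, sealedLog.clone());
    }

    /** Handles the termination notice by discarding the log. */
    void connectionTerminated(long transactionId) {
        sealedLogs.remove(transactionId);
    }

    /** Returns the still-encrypted log, for example to hand to the coordinator on fail-over. */
    Optional<byte[]> sealedLogFor(long transactionId) {
        return Optional.ofNullable(sealedLogs.get(transactionId)).map(log -> log.clone());
    }
}
```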

21
Logging Mechanism, Cont

When server is asked to take over for another
server, coordinator gets the log and decrypts it
Will send back a subset of log
Subset will contain most recent TCP header
information as well as information needed to
continue service to client

22
Status

Proof-of-concept in development
Using FTP as initial application
Information logged will be the client's directory
Initially assuming anonymous FTP (no
authentication necessary, no time-outs)
Right now all done in user space using Java, but
client subset will eventually be done in kernel
TCP migration code is a modified version of the MIT code
for TCP migration

23
Further work

Protocol would most likely extend past the client's
timeout window.
Should special measures be taken here?
Can we make the protocol more efficient?
What about scalability?
Procedure for choosing subset not yet determined
Probably will be done by taking average time for
a message reply
Fastest n servers will be in subset
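
The tentative rule above, picking the n servers with the lowest average reply time, can be sketched as follows. The slides mark this procedure as not yet determined, so this is only an illustration; the map of server name to average reply time in milliseconds and the class name `SubsetChooser` are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical chooser for the monitoring subset assigned to a server. */
final class SubsetChooser {
    /** Returns up to n server names ordered from fastest to slowest average reply. */
    static List<String> fastest(Map<String, Double> averageReplyMillis, int n) {
        return averageReplyMillis.entrySet().stream()
                .sorted(Map.Entry.comparingByValue())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

If fewer than n servers are known, the sketch simply returns all of them.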

24
Further Work, Cont

Coordinator is poor choice
Poor scalability
Single point of attack
Should be peer-to-peer
How does a server that has been determined to be
compromised get back into the group as a trusted
member?
Should it have to request to join the group again?

25
Further Work, Cont