1
Secure Fail-Over Protocol
  • Sonja Tideman
  • Advisor: Dr. Indrajit Ray
  • Colorado State University
  • August 9, 2002

2
The Dream
  • Grid Computing on a massive, global scale
  • Using a truly distributed network operating
    system where even high-end multi-processor
    supercomputers are treated as simply another
    simple resource
  • Anyone can connect anytime, anywhere, on any
    device
  • The cost of entering the system is no more than
    the cost of the device: a user puts the OS on
    the device and is given full access to the system

3
The Reality
  • Need to make it an attractive option for
    corporations, research labs, and educational
    institutions
  • Give them what they want, but cheaper and easier
  • Means security, reliability, performance,
    storage, and other Quality of Service (QoS)
    guarantees
  • Not there yet

4
Fail-Over Protocol: How This Fits into the Dream
  • Businesses need to provide clients with secure,
    reliable service
  • They typically use a network of geographically
    scattered servers to provide services
  • Users connect to one machine in the network
    (round-robin DNS)
  • But what happens when a server goes down or is
    compromised?
  • Clients' existing connections usually have to be
    terminated and restarted
  • This can cause problems for both the server and
    the client
  • The goal is to get the client to be simply
    switched over to another server on the network in
    a transparent (to the application) manner

5
Related Work
  • Significant amount of related work
  • Main contribution is that protocol does not rely
    on either the server or the client being trusted
  • Protocol also includes server-to-server
    monitoring method

6
Related Work, Cont
  • Migratory TCP done at Rutgers
  • Process migration work
  • Computing Communities research done at New York
    University and Arizona State University
  • Similar effect as fail-over protocol, different
    approach
  • TCP Migration

7
TCP Migration Work
  • Protocol described in Internet Draft
    draft-snoeren-tcp-migrate-00
  • Offers a secure cryptographic key to be sent with
    TCP connection
  • A system similar to the fail-over protocol is
    described in "Fine-Grained Failover Using
    Connection Migration"
  • Proceedings of the 3rd USENIX Symposium on
    Internet Technologies and Systems

8
Differences from TCP Migration Work
  • Do not want to allow server that is initially
    accessed to know cryptographic key
  • So client encrypts it with public key of a
    trusted coordinator
  • Protocol specifies format for health monitoring
  • Allows monitoring applications to be added

9
Protocol Overview: Server
  • Protocol to allow servers to monitor each other
    using many different monitoring applications
  • At the moment, all have to conform to a Java
    interface, but that could change
  • Protocol to allow servers to join the monitoring
    network in a secure fashion
  • Also allows a server to get enough information
    about another server's client interactions to
    be able to take over those interactions
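
The pluggable-monitor idea above can be sketched as a small contract that every monitoring application implements. The slides say the prototype uses a Java interface; the Python analogue below is only an illustration, and the names `Monitor`, `check`, `AlwaysHealthy`, `DenyList`, and `run_suite` are hypothetical, not taken from the protocol.

```python
from abc import ABC, abstractmethod


class Monitor(ABC):
    """Hypothetical plug-in contract for one monitoring application."""

    name: str

    @abstractmethod
    def check(self, host: str) -> bool:
        """Return True if `host` looks healthy to this monitor."""


class AlwaysHealthy(Monitor):
    """Trivial stand-in monitor, used only for illustration."""

    name = "always-healthy"

    def check(self, host: str) -> bool:
        return True


class DenyList(Monitor):
    """Stand-in monitor that flags hosts on a fixed list."""

    name = "deny-list"

    def __init__(self, bad_hosts):
        self.bad_hosts = set(bad_hosts)

    def check(self, host: str) -> bool:
        return host not in self.bad_hosts


def run_suite(monitors, host):
    """Run every registered monitor against `host`; map monitor name to result."""
    return {m.name: m.check(host) for m in monitors}
```

A server would call something like `run_suite(suite, "serverB")` for each peer it monitors and report the resulting map.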

10
Protocol Overview: Server, Cont
  • Peer-to-peer monitoring
  • Each server runs all of its monitoring programs
    on the other servers
  • Depending on size of system, it may run on all
    other servers or on a subset given to each server
    by the coordinator
  • One central coordinator
  • Handles decision to terminate a server's client
    connections, server join/remove requests, and
    registration of new monitoring programs
  • Is either connected via an internal-only link (no
    outside access) or may eventually be fully
    distributed as well

11
Protocol Overview: Client
  • Should be mostly transparent to the client
  • Kernel will handle necessary overhead
  • Must recognize and set up migrate option fields
    in TCP options field
  • Must verify security information
  • Client must have published public key
  • Could also be sent with the initial request

12
Sequence Overview
  • [Diagram: Client, Server A, Server B, and
    Coordinator, with connection messages 1A to 4A
    and fail-over messages 1B and 2B]

13
Connection Sequence
  • Client requests service from server (1A)
  • Must set migrate-permitted option in TCP header
  • As outlined in TCP migration draft, would also
    specify parameters for generation of the
    cryptographic cookie
  • All parameters would be encrypted with public key
    of coordinator

14
Connection Sequence, Cont
  • Server sends coordinator parameter data (2A)
  • Coordinator hands back a ticket encrypted with
    the client's public key (3A)
  • Contains the cryptographic cookie necessary for
    TCP migration, a timestamp, and a unique
    transaction ID
  • Server hands the ticket back to the client,
    continues with transaction (4A)
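
The ticket contents listed above can be sketched as a small record that the coordinator creates and remembers. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, the cookie is drawn at random here even though the slides say it is generated from client-supplied parameters, and the public-key encryption step is left out entirely.

```python
import secrets
import time
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    cookie: bytes        # cryptographic cookie used for TCP migration
    timestamp: float     # issue time, seconds since the epoch
    transaction_id: str  # unique per client connection


class Coordinator:
    """Hypothetical coordinator that issues and remembers tickets."""

    def __init__(self):
        self._issued = {}  # transaction_id -> Ticket

    def issue_ticket(self, now=None):
        """Create a fresh ticket for a new client connection (step 3A)."""
        ticket = Ticket(
            # Assumption: random cookie; the real cookie would be derived
            # from the parameters the client sent in step 1A.
            cookie=secrets.token_bytes(16),
            timestamp=time.time() if now is None else now,
            transaction_id=uuid.uuid4().hex,
        )
        # Kept so the same cookie and ID are available if a fail-over occurs.
        self._issued[ticket.transaction_id] = ticket
        return ticket
```

In a real deployment the ticket would be serialized and encrypted with the client's public key before being handed to the server, so the server never sees the cookie.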

15
Fail Over Sequence
  • Network of machines detects failure
  • Coordinator selects one machine to take over
  • Selection could be based on system load, and/or
    geographic proximity
  • Hands new server a ticket encrypted with public
    key of client (1B)
  • Contains cryptographic cookie, timestamp, and the
    unique transaction ID
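
One way the coordinator's selection step could combine system load and geographic proximity is a weighted score. The slides do not specify the selection rule, so everything here is an assumption: the `Candidate` fields, the weights, and the distance normalization are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    load: float         # 0.0 (idle) to 1.0 (saturated); hypothetical metric
    distance_km: float  # rough distance to the client; hypothetical metric


def select_replacement(candidates, failed, load_weight=0.7, max_distance_km=20000.0):
    """Pick the live server with the lowest weighted load/distance score."""
    live = [c for c in candidates if c.name != failed]
    if not live:
        raise ValueError("no server available to take over")

    def score(c):
        # Normalize distance into [0, 1] so it is comparable with load.
        proximity = min(c.distance_km / max_distance_km, 1.0)
        return load_weight * c.load + (1.0 - load_weight) * proximity

    return min(live, key=score)
```

Setting `load_weight=1.0` selects purely on load, and `load_weight=0.0` selects purely on proximity.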

16
Fail Over Sequence, Cont
  • New server sends client ticket (2B) in a TCP
    packet with the Migrate option set
  • Client gets ticket, decrypts it, and verifies
    that:
  • Transaction ID is the same
  • Cryptographic key is the same
  • Timestamp is reasonable (within accepted range)
  • Client accepts new server's messages, rejects
    any subsequent messages from old server
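
The three client-side checks above can be sketched as a single predicate over the original ticket and the one offered by the new server. This is a sketch only: decryption is assumed to have happened already, and the 30-second freshness window is a hypothetical value, since the slides do not state the accepted range.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    cookie: bytes
    timestamp: float
    transaction_id: str


def verify_migration(original, offered, now, max_age_s=30.0):
    """Return True if the client should accept the new server's ticket."""
    same_transaction = offered.transaction_id == original.transaction_id
    # Constant-time comparison avoids leaking how many cookie bytes matched.
    same_cookie = hmac.compare_digest(offered.cookie, original.cookie)
    # Assumption: tickets from the future or older than max_age_s are rejected.
    fresh = 0.0 <= now - offered.timestamp <= max_age_s
    return same_transaction and same_cookie and fresh
```

A real client would probably also allow a small clock-skew tolerance instead of rejecting every timestamp that is slightly ahead of its own clock.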

17
Protocol to join the group
  • Server sends join request to coordinator
  • Coordinator asks network of machines to decide
    to accept or reject the request
  • Network of machines each run suite of monitoring
    applications on new machine
  • May be a subset of the entire network of
    machines, based on physical location

18
Protocol to join the group
  • Each machine decides to accept or reject the
    request based on status returned from monitoring
    suite
  • Decision based on voting mechanism
  • Only machines that ran suite may vote
  • Each machine will report the result of running
    the suite
  • Uses the algorithm for the Byzantine Generals
    problem proposed by Lamport et al. (1982)
  • Tolerates m faulty machines in a group of at
    least 3m + 1
  • Every machine stores the result from every other
    machine
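
The fault bound and the final vote can be illustrated with two small helpers. This is not the full oral-messages algorithm from the 1982 paper; it only shows the arithmetic behind the 3m + 1 bound and the majority step each machine would apply to the results it has stored. The function names and the tie-breaking default (reject) are assumptions.

```python
from collections import Counter


def max_faulty(n):
    """Largest m such that a group of n machines satisfies n >= 3m + 1."""
    return (n - 1) // 3 if n > 0 else 0


def majority(reports, default=False):
    """Most common reported value, or `default` on a tie or empty input."""
    counts = Counter(reports)
    if not counts:
        return default
    (top, top_n), *rest = counts.most_common(2)
    if rest and rest[0][1] == top_n:
        return default  # tie: fall back to the safe answer
    return top
```

For example, a voting subset of seven machines can tolerate `max_faulty(7)` faulty members, which is two.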

19
Logging Mechanism
  • Each initial client connection generates a log
  • Log initially contains client IP address, type of
    service requested, and TCP header (port numbers,
    sequence number, etc.)
  • Log updated on status change
  • Based on type of service
  • May be pruned periodically
  • Ack and Seq numbers will be updated each time
    the log is updated
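
The log contents described above map naturally onto a small record type. The sketch below is illustrative only: the field names are hypothetical, and the free-form `state` dictionary stands in for whatever service-specific information a given application needs.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionLog:
    client_ip: str
    service: str   # type of service requested, e.g. "ftp"
    src_port: int
    dst_port: int
    seq: int       # most recent TCP sequence number
    ack: int       # most recent TCP acknowledgment number
    state: dict = field(default_factory=dict)  # service-specific data

    def update(self, seq, ack, **service_state):
        """Record a status change; Seq and Ack are refreshed on every update."""
        self.seq = seq
        self.ack = ack
        self.state.update(service_state)
```

For an FTP session, a status change might be recorded as `log.update(1460, 620, cwd="/pub")`.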

20
Logging Mechanism
  • Each server sends out initial log to each other
    server that it is monitoring
  • Encrypted with coordinator key
  • Will then send out encrypted updates periodically
  • Will send notification when client connection has
    terminated
  • All servers will then remove the log

21
Logging Mechanism, Cont
  • When server is asked to take over for another
    server, coordinator gets the log and decrypts it
  • Will send back a subset of log
  • Subset will contain most recent TCP header
    information as well as information needed to
    continue service to client
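
The subset handed to the replacement server could be produced by a simple projection of the decrypted log. Which fields belong in the subset is an assumption here; the slides only say it holds the most recent TCP header information plus what is needed to continue service. The key names are hypothetical.

```python
def takeover_subset(log):
    """Project a decrypted log (a dict) onto the fields a new server needs."""
    tcp_fields = ("client_ip", "src_port", "dst_port", "seq", "ack")
    subset = {key: log[key] for key in tcp_fields}
    subset["service"] = log["service"]
    # Copy the service state so the caller cannot mutate the stored log.
    subset["state"] = dict(log.get("state", {}))
    return subset
```

Anything else in the log, such as older history entries, stays with the coordinator.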

22
Status
  • Proof-of-concept in development
  • Using FTP as initial application
  • Information logged will be the client's directory
  • Initially assuming anonymous FTP (no
    authentication necessary, no time-outs)
  • Right now all done in user space using Java, but
    client subset will eventually be done in kernel
  • TCP migration code is a modified version of the
    MIT code for TCP migration

23
Further work
  • Protocol would most likely extend past the
    client's timeout window.
  • Should special measures be taken here?
  • Can we make the protocol more efficient?
  • What about scalability?
  • Procedure for choosing subset not yet determined
  • Probably will be done by taking average time for
    a message reply
  • Fastest n servers will be in subset
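
The proposed subset rule, averaging message reply times and keeping the fastest n servers, can be sketched in a few lines. The slides describe this only as the probable approach, so the function name, the input shape, and the choice to skip servers with no measurements are all assumptions.

```python
from statistics import mean


def fastest_servers(reply_times, n):
    """Return the n server names with the lowest average reply time.

    `reply_times` maps a server name to a list of measured reply times
    in seconds. Servers with no measurements are skipped.
    """
    averaged = {name: mean(times) for name, times in reply_times.items() if times}
    return sorted(averaged, key=averaged.get)[:n]
```

For instance, `fastest_servers({"A": [0.1, 0.3], "B": [0.05, 0.05], "C": [0.5]}, 2)` keeps B and A and drops C.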

24
Further Work, Cont
  • Coordinator is poor choice
  • Poor scalability
  • Single point of attack
  • Should be peer-to-peer
  • How does a server that has been determined to be
    compromised get back into group as a trusted
    member?
  • Should it have to request to join the group again?

25
Further Work, Cont
  • Coordinator's private key needs to be kept
    private
  • Lots of attacks possible to determine private key
  • Should key be changed periodically?
  • Client requires a published public key
  • How does this get set up initially?

26
Web Sites
  • TCP Migration
  • Code at http://nms.lcs.mit.edu/software/migrate/
  • Papers at http://nms.lcs.mit.edu/publications/
  • M-TCP (Migratory TCP)
  • http://discolab.rutgers.edu/mtcp/index.htm
  • Computing Communities Project
  • http://www.cs.nyu.edu/pdsg/projects/cc/cc.htm