Bypass: A tool for building distributed systems

1
Bypass: A tool for building distributed systems
2
Building distributed systems is hard.
3
Bypass makes building split execution systems
easy. Bypass is to split execution
systems as yacc is to compilers.
4
Problem: Unfriendly Machines
  • Many systems can distribute your jobs to
    available machines scattered around the world.
    (rsh, Condor, Globus, etc...)
  • But... the machines you have access to may not be
    properly equipped to run your job.

5
Problem: Unfriendly Machines
  • An unfriendly machine:
      • allows you to log in under some identity.
      • allows you to execute your program.
      • might not have your files or a shared file
        system!
      • might not have space for your output!
      • might be a different architecture or OS!
  • (If you want to use a lot of machines, you can't
    be picky!)

6
[Figure: a home machine and a foreign machine; the job reports "core dumped".]
7
Solution: Split Execution
  • General strategy:
      • An agent process traps some of the application's
        standard library calls.
      • Some of the calls can be executed at the foreign
        machine.
      • Some of the calls are sent via RPC back to the
        home machine.
      • A shadow process at the home machine executes the
        RPCs and sends the results back to the agent.
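
The strategy above can be sketched in plain C. This is a minimal single-process illustration under stated assumptions, not Bypass output: `shadow_rpc_write` and `local_write` are hypothetical stand-ins for the RPC to the home machine and for the foreign machine's own write, and the `fd < 3` routing rule is only one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: each "machine" is just a buffer in this process. */
static char home_buf[64];     /* data the shadow would receive via RPC */
static size_t home_len = 0;
static char foreign_buf[64];  /* data written locally at the foreign machine */
static size_t foreign_len = 0;

/* Append to a fixed-size buffer, truncating if it is full. */
static long append(char *buf, size_t cap, size_t *len,
                   const void *data, size_t n)
{
    if (n > cap - *len) n = cap - *len;
    memcpy(buf + *len, data, n);
    *len += n;
    return (long)n;
}

/* Stand-in for the RPC that the shadow executes at the home machine. */
static long shadow_rpc_write(const void *data, size_t n)
{
    return append(home_buf, sizeof home_buf, &home_len, data, n);
}

/* Stand-in for executing the call at the foreign machine. */
static long local_write(const void *data, size_t n)
{
    return append(foreign_buf, sizeof foreign_buf, &foreign_len, data, n);
}

/* The agent's decision: standard streams go home, the rest stays local. */
long agent_write(int fd, const void *data, size_t n)
{
    assert(data != NULL);
    if (fd < 3) return shadow_rpc_write(data, n);
    return local_write(data, n);
}
```

With these stand-ins, `agent_write(1, "hello", 5)` lands in `home_buf`, while `agent_write(7, "abc", 3)` stays in `foreign_buf`.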

8
Solution: Split Execution
[Figure: the foreign machine, with the application, the agent, and the kernel; the agent receives the trapped system calls.]
9
Split Execution is an Open Research Topic
  • We want to explore many possibilities:
      • The foreign machine could be partially friendly: it
        has some needed resources, but not all.
      • Data may be buffered and cached at both the agent
        and the shadow.
      • Which procedure calls to trap depends on the
        application and the services needed.
      • Some procedure calls could be routed to third
        parties such as file servers.

10
Problem: Split Execution is Hard
  • One example of many: trapping stat()
  • Different data types
      • struct stat, struct stat64
      • Depending on the system, integer elements are 2 to 8
        bytes
  • Multiple entry points
      • stat, _stat, __libc_stat
  • Surprises
      • #define stat(a,b) _fxstat(VERSION,a,b)

11
Solution: Bypass
  • Bypass takes a specification of a split execution
    system and produces a matched shadow and agent.
  • Bypass hides all of the ugly details of trapping,
    type conversion, and RPCs.
  • Bypass lets you:
      • split any dynamically-linked application.
      • transparently use heterogeneous systems.
      • trap calls with minimal overhead.
      • control execution paths with plain C.

12
[Figure: the foreign machine and the home machine.]
13
Bypass Language
  • Declare what procedures to trap in C.
  • Annotate pointer types with data flow.
      • Direction: in, out, or in out
      • Binary data: give an expression yielding the number
        of bytes to send/receive.
  • Give two function bodies:
      • agent_action
      • shadow_action

14
ssize_t write( int fd, in "length" const void *data, size_t length )
agent_action
{
    if( fd==1 ) return bypass_shadow_write(fd,data,length);
    else return write(fd,data,length);
}
shadow_action
{
    printf("remote data: %s", data );
}
15
Agent Action
  • Any arbitrary C code.
  • When the program invokes write(), the
    agent_action is executed at the foreign machine.
  • Within the agent_action:
      • write() - Invoke the original write() at the
        foreign machine.
      • bypass_shadow_write() - Invoke the shadow_action
        via RPC.

16
Shadow Action
  • Any arbitrary C code.
  • If the agent decides to invoke the RPC to the
    shadow, the shadow_action is executed at the home
    machine.
  • Within the shadow_action:
      • write() - Invoke write() at the home machine.

17
Using Bypass
  • Run "bypass" to read the specification and
    produce C source code:
      • bypass -agent -shadow simple.bypass
  • The shadow is compiled into a plain executable.
  • The agent is compiled into a shared library.

18
Using Bypass
  • The dynamic linker is used to force the agent
    into an executable at run time:
      • setenv LD_PRELOAD simple_agent.so
  • Procedure calls are trapped merely by putting
    the agent first in the link order.
  • This method can be used on any dynamically-linked
    program: tcsh, netscape, emacs
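
Interposition through the dynamic linker can be sketched by hand as follows. This is an assumption about how such an agent might be written, not the code Bypass generates: it defines its own write(), counts each call in a hypothetical `trapped_calls` counter, and forwards to the original definition through `dlsym(RTLD_NEXT, ...)`, an extension available on glibc and the BSDs.

```c
#define _GNU_SOURCE   /* needed for RTLD_NEXT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <unistd.h>

/* Hypothetical counter so that the effect of the trap is observable. */
static int trapped_calls = 0;

typedef ssize_t (*write_fn)(int, const void *, size_t);

/* Because this definition comes first in symbol lookup, callers reach it
   instead of the C library's write(). */
ssize_t write(int fd, const void *data, size_t length)
{
    static write_fn real_write = NULL;
    if (real_write == NULL) {
        /* Find the next definition of write() after this one. */
        real_write = (write_fn)dlsym(RTLD_NEXT, "write");
        assert(real_write != NULL);
    }
    trapped_calls++;
    return real_write(fd, data, length);
}
```

Built as a shared library (for example with `cc -shared -fPIC`, plus `-ldl` on older systems) and named in LD_PRELOAD, a file like this would sit ahead of the C library for every dynamically linked program started from that shell.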

19
Example Application: Complete Remote I/O
  • Trap all the standard I/O calls, and send them to
    the home machine unmodified:
      • int open( in string char *path, int flags, int mode )
      • int close( int fd )
      • int read( int fd, out "length" void *data, int
        length )
      • int write( int fd, in "length" void *data, int
        length )
      • int lseek( int fd, off_t offset, int whence )

20
Complete Remote I/O
[Figure: the application, agent, and kernel on the foreign machine; the shadow and kernel on the home machine; trapped system calls go from the agent to the shadow.]
21
Example Application: Remote Console
  • Trap only read and write, and send operations on
    standard files back to a single shadow process.
  • int read( int fd, out "length" void *data, int length )
    agent_action
    {
        if( fd<3 ) return bypass_remote_read( fd, data, length );
        else return read( fd, data, length );
    }

22
Remote Console
[Figure: standard I/O reads and writes arrive at the shadow on the home machine, above its kernel.]
23
Example Application: Attach New Filesystem
  • Trap standard I/O calls and replace them with
    calls to a user-level filesystem library, such as
    Globus GASS.
  • int open( in string const char *path, int flags,
    int mode )
    agent_action
    {
        return globus_gass_open( path, flags, mode );
    }
  • int close( int fd )
    agent_action
    {
        return globus_gass_close( fd );
    }

24
Attach New Filesystem
[Figure: on the foreign machine, the agent hands trapped open and close calls to the Globus library, which connects to THE GRID and makes more system calls; all other calls go from the application to the kernel.]
25
Bypass can be used by Real Users!
  • Bypass works on unmodified executables.
  • (Real Users are not willing/able to
    rewrite/recompile their programs.)
  • Bypass requires no special privileges.
  • (Real Users do not have the root password)
  • Thus, Bypass allows a Real User to make good use
    of a remote cluster without begging the
    administrator to configure it to his/her needs.

26
Performance
  • Overhead of trapping a system call is very small:
    1-4 µs
  • The "trapping mechanism" simply interposes a few
    extra function calls.
  • Small compared to the expense of a real system
    call (about 10-70 µs)
  • Remote procedure calls are, as expected, much
    slower: about 1 ms under the best conditions.

27
Related Work
  • Classic RPC and XDR
      • Define standard integer sizes, endianness, etc.
      • Start by defining an external protocol, then produce
        a programming interface, which is not always
        convenient:
      • struct read_results read_1( int fd, int length )

28
Related Work
  • Bypass
      • We are stuck with existing interfaces, so
        annotate them to produce a protocol:
      • int read( int fd, out "length" void *data, int
        length )
      • Do best-effort conversion to/from an external data
        format.
      • off_t is 4 bytes on some platforms, 8 bytes on
        others.
      • A conversion might fail!
      • Define canonical values for source-level symbols.
      • O_CREAT has different values on Linux and Solaris!

29
Related Work
  • Hunt and Brubacher, Detours
      • Trap library calls on NT using binary rewriting;
        can be applied to any executable.
      • Make the original procedure available through a
        special trampoline call.
  • Bypass leaves the original entry point intact, so
    subroutines need not be rewritten to use the
    trampoline.

30
Related Work
  • Alexandrov, et al., UFO
      • Use a kernel-level facility to trap all of a
        process's system calls and translate some of them
        into WWW operations.
      • The kernel mechanism is secure and can be applied
        to any process.
      • But it has a high (7x) trapping overhead and
        cannot be applied to procedures that are not true
        system calls.

31
Related Work
  • Bypass
  • Trapping overhead is very small and can be
    performed on procedures that are not necessarily
    system calls.
  • But can only be applied to dynamically-linked
    executables, and is not suitable as a security
    mechanism.

32
Related/Future Work
  • A complete remote execution system needs both
    methods:
      • The program owner provides a lightweight
        mechanism for creating a correct split execution
        environment.
      • The machine owner provides a heavyweight
        mechanism to defend itself from a (possibly)
        malicious program.

33
Complete System
[Figure: a sandbox, the application, the agent, and a kernel on the foreign machine; the shadow and a kernel on the home machine; labeled calls: open, close, read, write, lseek.]
34
Future Work
  • Multiple agents applied to one application
      • How to select and invoke the correct agent
        action?
  • Signal handling
      • Flow of control is backwards.
  • Other implementations
      • Binary rewriting.
      • Build a specialized linker that understands
        multiple definitions of symbols.

35
Further Questions?
  • Douglas Thain
      • thain@cs.wisc.edu
  • Miron Livny
      • miron@cs.wisc.edu
  • Bypass Web Page
      • http://www.cs.wisc.edu/condor/bypass
  • Questions now?