Title: Advanced Operating Systems
1. Advanced Operating Systems
Lecture 10: RPC (Remote Procedure Call)
- University of Tehran
- Dept. of EE and Computer Engineering
- By Dr. Nasser Yazdani
2. Communication in Distributed Systems
- How processes communicate in a DS.
- References:
- Chapter 2 of the textbook
- Andrew D. Birrell and Bruce J. Nelson, "Implementing Remote Procedure Calls"
- Brian Bershad, Thomas Anderson, et al., "Lightweight Remote Procedure Call"
- Computer Networks: A Systems Approach, Section 5.3
3. Outline
- Why RPC
- Local procedure calls
- Call semantics
- Different implementations
- Lightweight Remote Procedure Call (LRPC)
4. Communication Models
- Communication of processes in a distributed system environment.
- Communication is done at the higher levels.
- Remote procedure call (RPC)
- Remote object invocation
- Message-passing queues
- Support for continuous media or streams
5. Remote Procedure Call
- A remote procedure call makes a call to a remote service look like a local call
- RPC makes it transparent whether the server is local or remote
- RPC allows applications to become distributed transparently
- RPC makes the architecture of the remote machine transparent
- A well-known method to transfer control and data among processes running on different machines
6. Developing with RPC
- Define APIs between modules
- Split the application based on function, ease of development, and ease of maintenance
- Don't worry whether modules run locally or remotely
- Decide what runs locally and what runs remotely
- The decision may even be made at run time
- Make APIs bullet-proof
- Deal with partial failures
7. Goal of RPC
- Big goal: transparency.
- Make distributed computing look like centralized computing
- Allow remote services to be called as procedures
- Transparency with regard to location, implementation, and language
- Issues:
- How to pass parameters
- Bindings
- Semantics in the face of errors
- Two classes: integrated into the programming language, or separate
8. Benefits of RPC
- A clean and simple semantics for building distributed computing.
- Transparent distributed computing: existing programs don't need to be modified
- Efficient communication!
- Generality: enforces well-defined interfaces
- Allows portable interfaces: plug together separately written programs at RPC boundaries, e.g., NFS and X clients and servers
9. Conventional Procedure Call
- a) Parameter passing in a local procedure call: the stack before the call to read
- b) The stack while the called procedure is active
10. Parameter Passing
- Local procedure parameter passing:
- Call-by-value
- Call-by-reference: arrays, complex data structures
- Copy/restore
- Remote procedure calls simulate these through:
- Stubs: proxies
- Flattening: marshalling
- Related issue: global variables are not allowed in RPCs
11. Client and Server Stubs
- The principle of RPC between a client and a server program.
12. Stubs
- The client makes a procedure call (just like a local procedure call) to the client stub
- The server is written as a standard procedure
- Stubs take care of packaging arguments and sending messages
- Packaging is called marshalling
- A stub compiler generates the stubs automatically from specs in an Interface Definition Language (IDL)
- This simplifies the programmer's task
13. Steps of a Remote Procedure Call
- Client procedure calls client stub in normal way
- Client stub builds message, calls local OS
- Client's OS sends message to remote OS
- Remote OS gives message to server stub
- Server stub unpacks parameters, calls server
- Server does work, returns result to the stub
- Server stub packs it in message, calls local OS
- Server's OS sends message to client's OS
- Client's OS gives message to client stub
- Stub unpacks result, returns to client
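The ten steps above can be sketched in miniature. The following toy (names and the JSON wire format are my own, not from the lecture) collapses the two operating systems into a direct function call but keeps the stub/marshalling structure:

```python
import json

# Hypothetical in-process "network": the client's OS hands the message
# straight to the server's OS, so steps 3-4 and 8-9 collapse to a call.

def server_add(a, b):            # step 6: the server does the work
    return a + b

def server_stub(message):        # steps 5 and 7: unpack, call, pack result
    name, args = json.loads(message)
    result = {"add": server_add}[name](*args)
    return json.dumps(result)

def client_stub(name, *args):    # steps 1-2 and 10: pack, "send", unpack
    message = json.dumps([name, list(args)])   # marshal the call
    reply = server_stub(message)               # send over the "network"
    return json.loads(reply)

print(client_stub("add", 2, 3))  # the caller sees an ordinary call
```

The caller never touches `json` or the message format; that is exactly the transparency the stubs provide.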
14. Passing Value Parameters (1)
- Steps involved in doing a remote computation through RPC (Fig. 2-8).
15. Passing Value Parameters (2)
- a) The original message on the Pentium
- b) The message after receipt on the SPARC
- c) The message after being inverted. The small numbers in the boxes indicate the address of each byte.
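The byte-order problem in this figure can be reproduced directly. A small sketch using Python's `struct` module shows how the same 32-bit value looks in each byte order, and what the receiver sees if it misinterprets the bytes:

```python
import struct

# The integer 5 marshalled as a 32-bit value in each byte order.
big = struct.pack(">i", 5)      # SPARC-style, big endian
little = struct.pack("<i", 5)   # Pentium-style, little endian
assert big == b"\x00\x00\x00\x05"
assert little == b"\x05\x00\x00\x00"

# Reading little-endian bytes as big endian yields the "inverted" value:
wrong = struct.unpack(">i", little)[0]
print(wrong)  # 83886080, i.e. 5 << 24
```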
16. Parameter Specification and Stub Generation
- A procedure
- The corresponding message.
17. Doors
- The principle of using doors as an IPC mechanism.
18. Asynchronous RPC (1)
- a) The interconnection in a traditional RPC
- b) The interaction using asynchronous RPC (Fig. 2-12)
19. Asynchronous RPC (2)
- A client and server interacting through two asynchronous RPCs (Fig. 2-13)
20. DCE RPC
- Distributed Computing Environment, developed by the Open Software Foundation.
- It is middleware between network operating systems and distributed applications.
- Distributed file service
- Directory service
- Security service
- Distributed time service
21. Binding a Client to a Server
- Client-to-server binding in DCE (Fig. 2-15).
22. Writing a Client and a Server
- uuidgen: a program to generate a prototype IDL (Interface Definition Language) file (Fig. 2-14).
23. Marshalling
- Problem: different machines have different data formats
- Intel: little endian; SPARC: big endian
- Solution: use a standard representation
- Example: eXternal Data Representation (XDR)
- Problem: how do we pass pointers?
- If a pointer refers to a well-defined data structure, pass a copy, and the server stub passes a pointer to the local copy
- What about data structures containing pointers?
- Prohibit them
- Chase pointers over the network
- Marshalling: transforming parameters/results into a byte stream
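As a small illustration of flattening, the following sketch marshals an integer list into an XDR-like big-endian byte stream and rebuilds a local copy on the receiving side; the function names and the length-prefix layout are invented for this example:

```python
import struct

# Flatten ("marshal") a list of 32-bit integers into a canonical
# big-endian byte stream: a length prefix followed by each value.

def marshal_ints(values):
    return struct.pack(">i", len(values)) + b"".join(
        struct.pack(">i", v) for v in values)

def unmarshal_ints(stream):
    (n,) = struct.unpack_from(">i", stream, 0)
    return [struct.unpack_from(">i", stream, 4 + 4 * i)[0]
            for i in range(n)]

data = [1, 2, 3]
assert unmarshal_ints(marshal_ints(data)) == data  # server gets a local copy
```

The server-side stub would hand the rebuilt list to the procedure, which is exactly the "pass a copy, use a local pointer" strategy described above.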
24. Binding
- Problem: how does a client locate a server?
- Use bindings
- Server:
- Exports the server interface during initialization
- Sends name, version number, unique identifier, and handle (address) to the binder
- Client:
- On the first RPC, sends a message to the binder to import the server interface
- The binder checks whether the server has exported the interface
- Returns the handle and unique identifier to the client
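The export/import protocol above might look like the following toy binder; all names, version numbers, and the handle format are hypothetical:

```python
# Toy binder: servers export (name, version, id, handle);
# clients import by (name, version).

class Binder:
    def __init__(self):
        self.table = {}

    def export(self, name, version, unique_id, handle):
        # called by the server during initialization
        self.table[(name, version)] = (unique_id, handle)

    def import_(self, name, version):
        # called by the client stub on its first RPC
        if (name, version) not in self.table:
            raise LookupError("server has not exported this interface")
        return self.table[(name, version)]   # (unique id, handle)

binder = Binder()
binder.export("fileserver", 2, unique_id=42, handle=("10.0.0.7", 9000))
uid, handle = binder.import_("fileserver", 2)
print(uid, handle)
```

After the import, the client caches the handle and talks to the server directly; the binder is only on the path of the first call.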
25. Binding Comments
- Exporting and importing incur overhead
- The binder can be a bottleneck
- Use multiple binders
- The binder can do load balancing
26. Failure Semantics
- Client unable to locate the server: return an error
- Lost request messages: simple timeout mechanisms
- Lost replies: timeout mechanisms
- Make the operation idempotent
- Use sequence numbers, mark retransmissions
- Server failures: did the failure occur before or after the operation?
- At-least-once semantics (SUN RPC)
- At-most-once
- No guarantee
- Exactly-once: desirable but difficult to achieve
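A minimal sketch of the timeout mechanism, assuming a simulated lossy channel: the client retransmits until a reply arrives, which gives at-least-once semantics and is safe here only because the operation is an idempotent read. All names are invented:

```python
import random

random.seed(1)  # fixed seed so the simulated losses are reproducible

def lossy_server(request):
    # Simulate a lost request or lost reply half the time;
    # the caller observes the loss as a timeout (None).
    if random.random() < 0.5:
        return None
    return {"read": 99}[request]   # idempotent: safe to repeat

def call_at_least_once(request, max_retries=50):
    for attempt in range(max_retries):
        reply = lossy_server(request)
        if reply is not None:
            return reply           # may have executed more than once
    raise TimeoutError("server not responding")

print(call_at_least_once("read"))
```

For a non-idempotent operation, the retransmissions would also need sequence numbers so the server can filter duplicates, which is the at-most-once scheme.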
27. Failure Semantics (2)
- Client failure: what happens to the server computation?
- Such a computation is referred to as an orphan
- Extermination: log at the client stub and explicitly kill orphans
- Overhead of maintaining disk logs
- Reincarnation: divide time into epochs between failures and delete computations from old epochs
- Gentle reincarnation: upon a new-epoch broadcast, try to locate the owner first (delete only if there is no owner)
- Expiration: give each RPC a fixed quantum T; explicitly request extensions
- Periodic checks with the client during long computations
28. Implementation Issues
- The choice of protocol affects communication costs
- Use an existing protocol (UDP) or design from scratch
- Packet size restrictions
- Reliability in the case of multi-packet messages
- Flow control
- Using TCP: too much overhead
- Setup time
- Keeping the connection alive
- State information
29. Implementation Issues (2)
- Copying costs are the dominant overhead
- At least 2 copies per message are needed
- From the client to the NIC, and from the server's NIC to the server
- As many as 7 copies:
- stack in stub -> message buffer in stub -> kernel -> NIC -> medium -> NIC -> kernel -> stub -> server
- Scatter-gather operations can reduce the overhead
30. RPC vs. LPC
- Four properties of distributed computing make achieving transparency difficult:
- Partial failures
- Concurrency
- Latency
- Memory access
31. Case Study: SUN RPC
- One of the most widely used RPC systems
- Developed for use with NFS
- Built on top of UDP or TCP
- TCP: the stream is divided into records
- UDP: max packet size < 8912 bytes
- UDP: timeout plus a limited number of retransmissions
- TCP: returns an error if the connection is terminated by the server
- Multiple arguments are marshalled into a single structure
- At-least-once semantics if a reply is received; at-least-zero semantics if no reply. With UDP, tries at-most-once
- Uses SUN's eXternal Data Representation (XDR)
- Big-endian order for 32-bit integers; handles arbitrarily large data structures
32. Binder: Port Mapper
- Server start-up: create a port
- The server stub calls svc_register to register the program number and version with the local port mapper
- The port mapper stores the program number, version, and port
- Client start-up: call clnt_create to locate the server port
- Upon return, the client can call procedures at the server
33. rpcgen: Generating Stubs
- Q_xdr.c: does the XDR conversion
34. Partial Failures
- In local computing:
- if the machine fails, the application fails
- In distributed computing:
- if a machine fails, part of the application fails
- one cannot tell the difference between a machine failure and a network failure
- How can partial failures be made transparent to the client?
35. Strawman Solution
- Make remote behavior identical to local behavior:
- Every partial failure results in complete failure
- You abort and reboot the whole system
- You wait patiently until the system is repaired
- Problems with this solution:
- Many catastrophic failures
- Clients block for long periods
- The system might not be able to recover
36. Real Solution: Break Transparency
- Possible semantics for RPC:
- Exactly-once
- Impossible in practice
- More-than-once
- Only for idempotent operations
- At-most-once
- Zero, "don't know," or once
- Zero or once
- Transactional semantics
- At-most-once is the most practical
- But different from LPC
37. Where RPC Transparency Breaks
- True concurrency
- Clients run truly concurrently:
- client() { if (exists(file)) if (!remove(file)) abort("remove failed??"); }
- RPC latency is high
- Orders of magnitude larger than an LPC's
- Memory access
- Pointers are local to an address space
38. RPC Implementation
- Stub compiler
- Generates stubs for client and server
- Language dependent
- Compile into machine-independent format
- E.g., XDR
- Format describes types and values
- RPC protocol
- RPC transport
39. RPC Protocol
- Guarantees at-most-once semantics by tagging requests and responses with a nonce
- RPC request header:
- Request nonce
- Service identifier
- Call identifier
- Protocol:
- The client resends after a timeout
- The server maintains a table of nonces and replies
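The nonce table described above can be sketched as follows; the deposit operation and all names are hypothetical, but the pattern (replay the stored reply instead of re-executing) is the one the slide describes:

```python
import itertools

# At-most-once: each request carries a fresh nonce; the server keeps a
# table of answered nonces and replays the stored reply on a retransmission.

nonce_counter = itertools.count(1)   # the client's nonce generator

class AtMostOnceServer:
    def __init__(self):
        self.replies = {}    # nonce -> stored reply
        self.balance = 0

    def deposit(self, nonce, amount):     # non-idempotent operation
        if nonce in self.replies:
            return self.replies[nonce]    # duplicate: replay, don't re-run
        self.balance += amount
        self.replies[nonce] = self.balance
        return self.balance

server = AtMostOnceServer()
n = next(nonce_counter)
assert server.deposit(n, 100) == 100
assert server.deposit(n, 100) == 100   # client resent after a timeout
assert server.balance == 100           # executed at most once
```

In a real system the reply table must eventually be garbage-collected, e.g. once the client acknowledges the reply or the entry times out.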
40. RPC Transport
- Use reliable transport layer
- Flow control
- Congestion control
- Reliable message transfer
- Combine RPC and transport protocol
- Reduce number of messages
- RPC response can also function as acknowledgement
for message transport protocol
41. Implementing RPC (Birrell et al.)
- The primary purpose: make distributed computation easy; remove unnecessary difficulties.
- Two secondary aims:
- Efficiency (a factor of … beyond network transmission)
- Powerful semantics without loss of simplicity or efficiency
- Secure communication
42. Implementing RPC (Options)
- Shared address space among computers?
- Is it feasible?
- Integration of remote address space
- Acceptable efficiency?
- Keep RPC close to procedure call
- No timeout
43. Implementing RPC
- The first solid implementation, not the first mention
- Semantics in the face of failure
- Pointers
- Language issues
- Binding
- Communication issues (networking tangents)
- Did you see those HW specs?
- Very powerful for the time, yet 1000x slower and smaller
44. RPC Semantics 1
- Delivery guarantees.
- "Maybe" call:
- Clients cannot tell for sure whether the remote procedure was executed or not, due to message loss, server crash, etc.
- Usually not acceptable.
45. RPC Semantics 2
- At-least-once call:
- The remote procedure is executed at least once, but maybe more than once.
- Retransmissions, but no duplicate filtering.
- Idempotent operations are OK, e.g., reading data that is read-only.
46. RPC Semantics 3
- At-most-once call:
- Most appropriate for non-idempotent operations.
- The remote procedure is executed 0 or 1 times, i.e., exactly once or not at all.
- Uses retransmissions and duplicate filtering.
- Example: the Birrell et al. implementation.
- Uses probes to check whether the server has crashed.
47. RPC Implementation (Birrell et al.)
- Caller side: user -> user stub -> RPC runtime; callee side: RPC runtime -> server stub -> server.
- Call: the user makes the call, the user stub packs the arguments, the runtime transmits the call packet; the callee runtime receives it, the server stub unpacks it and calls the server, which does the work.
- Return: the server returns, the server stub packs the result, the runtime transmits the result packet; the caller runtime receives it, the user stub unpacks the result and returns to the user.
48. Implementing RPC
- An extra compiler pass on the code generates the stubs
- Client/server terminology and thinking
49. Implementing RPC (Rendezvous, Binding)
- How does the client find the appropriate server?
- Birrell and Nelson use a registry (Grapevine)
- The server publishes an interface (type, instance)
- The client names a service (instance)
- Fairly sophisticated then, common now
- Simpler schemes: portmap / IANA well-known names
- Still a source of complexity and insecurity
50. Implementing RPC (Marshalling)
- Representation
- Processor architecture dependence (big vs. little endian)
- Always canonical, or optimize for specific instances?
- Pointers
- No shared memory, so what about pointer arguments?
- Make an RPC read access for every server dereference?
- Slow!
- Copy the fully expanded data structure across?
- Slow, and incorrect if the data structure is dynamically written
- Disallow pointers? Common.
- Forces programmers to plan what is remote and what is local
51. Implementing RPC (Communication)
- Round-trip response time vs. data bandwidth
- One packet in each direction
- Connectionless (these days, sticky connections)
- RPC-specific reliability (a bad idea we haven't killed yet)
- Failure semantics and ordering:
- RPC sequence numbers
- Idempotent (repeatable) operations vs. exactly-once
- Lost requests vs. lost responses: repeat, or use a replay cache
- Heartbeats (probes) to delay failure handling
- Server thread (process) pools and affinity binding
52. Binding
- How to determine where the server is? Which procedure to call?
- A resource discovery problem
- A name service advertises servers and services.
- Example: Birrell et al. use Grapevine.
- Early versus late binding.
- Early: the server address and procedure name are hard-coded in the client.
- Late: go to the name service.
53. RPC Performance
- Sources of overhead:
- data copying
- scheduling and context switches
- Lightweight RPC:
- Shows that most invocations take place on a single machine.
- LW-RPC improves RPC performance for the local case.
- Optimizes data copying and thread scheduling for the local case.
54. LW-RPC 1
- Argument copying:
- RPC copies 4 times (2 on the call and 2 on the return) between kernel and user space.
- LW-RPC: a common data area (the A-stack), shared by client and server, is used to pass parameters and results; it is accessed by the client or the server, one at a time.
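A rough simulation of the A-stack idea, with a Python bytearray standing in for the shared argument area; the layout (two 32-bit arguments, one 32-bit result) is invented for illustration:

```python
import struct

# LRPC A-stack sketch: client and server share one argument area.
# The client writes the arguments into it once, and the server reads
# and writes the result in place, so no kernel/user copies are needed.

a_stack = bytearray(12)          # shared between the two domains

def server_add():
    # 4. the server reads the args and leaves the result in place
    a, b = struct.unpack_from(">ii", a_stack, 0)
    struct.pack_into(">i", a_stack, 8, a + b)

def client_call(a, b):
    # 1. copy the args into the A-stack (the only copy made)
    struct.pack_into(">ii", a_stack, 0, a, b)
    server_add()                 # 2-3. trap + kernel upcall, simulated
    return struct.unpack_from(">i", a_stack, 8)[0]

assert client_call(2, 3) == 5
```

The "one at a time" access rule from the slide is what makes sharing a single area safe: the client may not touch the A-stack while the server owns it.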
55. Lightweight RPC (LRPC)
- Combines the programming semantics and large-grained protection model of RPC with the control transfer and communication model of capability systems.
- Makes a protected procedure call.
- Usually designed and used with microkernels.
56. LW-RPC 2
- The A-stack avoids copying between kernel and user spaces.
- Client and server share the same thread: less context switching (like regular calls).
- Diagram: 1. the client copies the arguments into the A-stack; 2. the client traps to the kernel; 3. the kernel upcalls into the server; 4. the server executes and returns.
57. Context: Microkernels
- Diagram: applications (App1, App2, App3) run in user space alongside servers (file server, network server); each node's microkernel provides DD, PM, MM, and IPC; the microkernels on the multiprocessor communicate via IPC.
58. Guiding Principle
Optimize for the common case!
59. The Common Case
- A large percentage of cross-domain calls are to processes on the same machine (95-99%) (V, Taos, Unix+NFS).
- Large and complex parameters are rarely passed during these calls.
60. Unoptimized Cross-domain RPC
- Procedure call: can't be avoided
- Stub overhead:
- marshalling parameters
- translating the procedure call to the interface used by the RPC system
61. Unoptimized RPC
- Message buffer overhead:
- allocating memory
- memory copies (client, kernel, server)
- Access validation:
- validating the message sender
- Message transfer:
- enqueueing and dequeueing messages
62. Unoptimized RPC
- Scheduling overhead:
- block the client thread, schedule the server thread
- Kernel trap and context switch overhead
- Dispatch overhead:
- interpret the message
- maybe create a new thread
- dispatch the server thread to execute the call
63. Observations
- Unoptimized RPC is between 5 and 10 times more expensive than the theoretical minimum, e.g., Mach on a CVAX: 90 vs. 754 microseconds.
64. Optimizations Overview
- Map message memory into multiple domains
- Handoff scheduling
- Passing arguments in registers
- Trade safety for performance
- Avoid dynamic allocation of memory
- Avoid data copies
65. LRPC Optimizations
- Do as much as possible at server bind time:
- pre-allocate argument stacks (kernel)
- obtain the entry address into the server domain for each procedure in the interface
- allocate a linkage record to hold the client return address (kernel)
- return a non-forgeable Binding Object and an argument-stack list to the client
66. LRPC Optimizations
- Argument stacks are preallocated and shared (mapped into the client, kernel, and server domains)
- Use the client thread to execute the call
- Optimize validation of the client by using a capability (the Binding Object)
- Enforce security by using separate execution stacks
67. LRPC Optimizations
- Stubs are automatically generated in assembler and optimized for maximum efficiency
- Minimize the use of shared data structures
- On multiprocessors, if possible, switch to a processor idling in the server domain
- Minimize data copying (1 copy instead of 4)
68. Next Lecture
- Naming
- Read
- Chapter 4 of the book