Title: Distributed System Concepts and Architectures
1. Distributed System Concepts and Architectures
2. Outline
- Advantages and disadvantages of distributed OS
- Goals
- Transparency
- Services
- Architecture Models
- Communication Network Protocols
- Major Design Issues
- Distributed Computing Environment (DCE)
3. Distributed OS
- An integration of system services, presenting a transparent view of a multiple-computer system with distributed resources and control
- A collection of independent computers that appears to the users of the system as a single computer
- Examples
  - Personal workstations, a pool of processors, and a single file system
  - Robots on the assembly line and robots in the parts department
  - A large bank with hundreds of branch offices all over the world
4. Advantages of Distributed Systems Over Centralized Systems
- Economics: microprocessors offer better price/performance than mainframes
- Speed: a distributed system may have more total computing power than a mainframe
- Inherent distribution: some applications involve spatially separated machines
- Reliability: if one machine crashes, the system as a whole can still survive
- Incremental growth: computing power can be added in small increments
5. Advantages of Distributed Systems Over Isolated Computers
- Data sharing: allow many users access to a common database
- Device sharing: allow many users to share expensive peripherals such as color printers
- Communication: make human-to-human communication easier, for example by e-mail
- Flexibility: spread the workload over the available machines in the most cost-effective way
6. Disadvantages of Distributed Systems
- Software: the software is complex
- Networking: the network can saturate or cause other problems
- Security: easy access also applies to secret data
7. Goals (I)
- Provide a high-performance and robust computing environment with the least awareness of the management and control of distributed system resources
- Efficiency: difficult due to communication delays
  - Propagation delay: nothing can be done
  - Protocol overhead
  - Effective communication primitives, good protocols
  - Load distribution: bottlenecks or congestion in the network or software
  - Balance and overlap computation and communication
  - Distributed processing and load sharing
8. Goals (II)
- Flexibility
  - User view: friendly system and freedom in using the system
    - Friendliness: user interface, consistency, reliability -> use OO
    - Freedom
      - No unreasonable restrictions in using systems
      - Easy to build additional tools or services
  - System view
    - Ability to evolve and migrate
    - Modularity, scalability, portability, and interoperability
  - Difficult to achieve
    - Heterogeneous HW/SW components
9. Goals (III)
- Consistency: made difficult by lack of global information, replication and partitioning of data, component failures, and complexity of interaction among components
  - User: needs uniformity in using the system and predictable system behavior
  - System: needs proper concurrency control mechanisms and failure handling and recovery procedures
- Robustness: problems with failures in communication links, processing nodes, and client/server processes
  - System must reinitialize itself to a state where integrity is preserved, with only a small loss in performance
  - Handle exceptions and errors, changes to topology, long message delays, inability to locate a server
  - Security: reliability, protection, and access control
10. Transparency
- Transparency
  - Hide all irrelevant system-dependent details from users
  - Create an illusion of the model users are supposed to see
  - Trade-off between simplicity and effectiveness
- Objective
  - Provide a logical view of a physical system and at the same time reduce the effect and awareness of the physical system to a minimum
11. Types of Transparency (I)
- Access: access local and remote system objects in the same way
  - Phone (local) vs. letter (remote)
- Location (name): no awareness of object location; use logical names
  - Area code for other cities
- Migration: an object can be moved to a different location without changing its name
  - Local numbers are changed if one moves to another city
  - Need a universal name (symbolic or numerical)
- Concurrency: sharing of objects without interference
12. Types of Transparency (II)
- Relocation: a resource may be moved to another location while in use
- Replication: consistency of multiple instances of files and data
- Parallelism: permit parallel activities without users knowing how, where, and when these activities are carried out by the system
- Failure: fault tolerance, graceful performance degradation, minimum damage to the user
- Performance: consistent and predictable performance level even with changes in structure or load distribution
- Size: modularity and scalability; incremental growth in HW without user awareness
- Persistence: a (software) resource may be in memory or on disk
- Revision: SW revisions not visible (vertical growth)
13. Categorization of Transparency Based on System Goals
- Efficiency
  - Concurrency
  - Parallelism
  - Performance
- Flexibility
  - Access
  - Location
  - Relocation
  - Migration
  - Size
  - Revision
- Consistency
  - Access
  - Replication
  - Performance
  - Persistence
- Robustness
  - Failure
  - Replication
  - Size
  - Revision
14. Distributed System Issues and Transparencies
15. Services (I)
- Primitive services: most fundamental, in the kernel
  - Must be implemented in the kernel of each node in the system
  - Communication: message passing (send/receive primitives)
    - Synchronous or asynchronous
  - Inter-node, inter-process synchronization
    - Through the synchronous semantics of communication, or through a synchronization server
  - Processor multiplexing: process server (for transparency reasons)
    - Creation, deletion, and tracking for memory and processing time
16. Services (II)
- Services by system servers: fundamental, but need not be in the kernel
  - Provide fundamental services for managing processes, files, and process communication
  - Can be implemented anywhere in the system, and still perform functions basic to the operation of a distributed system
- Mapping logical names to physical addresses
  - Name server: locate processes, users, machines
  - Directory server: locate files, communication ports
- Translate addresses and locations into communication paths: network server
- Broadcast messages: broadcast or multicast servers
- Clocks for synchronization: impossible to agree on global clock information
  - Time server: physical clocks and logical clocks (for event ordering)
- File servers, print servers, migration server, authentication server
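The slide mentions logical clocks for event ordering without naming a scheme. One well-known scheme is Lamport's logical clock; the minimal sketch below assumes that scheme, and the class and method names are illustrative only.

```python
class LamportClock:
    """Logical clock for one process; orders events without a global physical clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Jump past the sender's timestamp so that send is ordered before receive.
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
p.tick()                      # local event on p
stamp = p.send()              # p sends a message carrying its timestamp
q.tick()                      # unrelated local event on q
arrival = q.receive(stamp)    # q's clock moves past the sender's timestamp
```

The receive timestamp is always larger than the send timestamp, which is what lets the two processes agree on an ordering of causally related events without agreeing on physical time.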
17. Services (III)
- Value-added services: not essential to the implementation of the system but useful; higher-level or special-purpose services (such as user applications)
  - Increase computational performance, enhance fault tolerance, support cooperative activities
  - An example is a Web server
- Groups of interacting processes
  - Group server: membership (add/remove), admission policies, privileges
  - Distributed conferencing server and concurrent editing server
18. System Architecture Models
- System architectures
  - Workstation-server model
    - Client workstations
      - Local processing capability and interface to the network
    - Server workstations
      - Dedicated to special services
  - Processor pool model: collect all processing power in one place; users use terminals only
    - Terminal: remote booting, remote file mounting, virtual terminal handling, packet assembling and disassembling (PAD)
    - File and processor allocation done by the system
  - Integrated hybrid model
19. Workstation-Server Model
[Figure: workstation-server model; labels include File Server and Printer Server]
20. Processor-Pool Model
21. Communication Network Architecture Models
- HW interconnection; inter-node and inter-process communication protocols
- Hardware interconnection
  - Point-to-point links: direct connections between pairs of nodes
  - Multipoint links: allow connection of nodes into clusters
    - Common bus: time shared
      - IEEE 802 LAN standards: Ethernet, Token Bus/Ring, FDDI
    - Switch: space/time multiplexing at higher HW cost/complexity
      - Private switches for multiprocessor systems: crossbar
      - Public switches: ISDN, SMDS, ATM
- LAN, MAN, WAN
  - Ratio of propagation delay to transmission delay
    - LAN: small. Components are close together; more suitable for distributed processing
    - MAN/WAN: large. More communication oriented
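The ratio of propagation delay to transmission delay can be made concrete with a small calculation. The sketch below assumes a signal speed of 2e8 m/s in the medium, and the link lengths, bandwidth, and frame size are hypothetical values chosen only to show the contrast.

```python
def delay_ratio(distance_m, bandwidth_bps, frame_bits, signal_speed_mps=2e8):
    """Return propagation delay divided by transmission delay."""
    propagation = distance_m / signal_speed_mps    # seconds for a bit to cross the link
    transmission = frame_bits / bandwidth_bps      # seconds to push the frame onto the link
    return propagation / transmission


# Hypothetical links, both 10 Mbit/s carrying 10,000-bit frames.
lan = delay_ratio(distance_m=1_000, bandwidth_bps=10e6, frame_bits=10_000)
wan = delay_ratio(distance_m=4_000_000, bandwidth_bps=10e6, frame_bits=10_000)
```

With these numbers the LAN ratio is about 0.005 and the WAN ratio is about 20. On the LAN the sender is still transmitting when the first bit arrives, while on the WAN many frames can be in flight before the first one is received.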
22. WAN, MAN, LAN
[Figure: WAN, MAN, and LAN interconnection; labels include point-to-point links]
23. Communication Network Protocols
- Communication protocol: a set of rules that regulate the exchange of messages to provide a reliable and orderly flow of information among communicating processes
- Connection-oriented communication service (analogy: telephone)
  - Needs explicit setup of a connection channel before communication
  - Messages are delivered reliably and in sequence
  - Virtual circuit (logical) or circuit switching (physical)
- Connectionless communication service (analogy: postal service)
  - No initial connection establishment is necessary
  - Messages are delivered on a best-effort basis in timing and route and may arrive in arbitrary order
  - Datagram (logical) or packet switching (physical)
24. OSI Protocol Suite
- Seven-layer protocol suite
- OSI focuses on interconnecting computers
- A process communicates with a remote process by passing data through the seven layers, then the physical network, and finally through the remote layers in reverse order
  - Segmenting/reassembling
- Transparency between layers: encapsulation
  - Add a header to the protocol data unit (PDU) from the upper layer
  - The corresponding remote layer strips off the header
- A gateway or intermediate node only stores and forwards messages at the three lower, network-dependent layers
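Encapsulation can be sketched as each layer wrapping the PDU it receives from the layer above. The bracketed text headers below are purely illustrative (real layers use binary headers), and the physical layer is left out on the assumption that it transmits bits rather than adding a header of its own.

```python
# Layers that add a header, from top to bottom.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data link"]


def send_down(data):
    """Sender side: each layer prepends its own header to the PDU from the layer above."""
    pdu = data
    for layer in LAYERS:
        pdu = f"[{layer}]" + pdu
    return pdu


def receive_up(pdu):
    """Receiver side: layers run in reverse order, each stripping its peer's header."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not pdu.startswith(header):
            raise ValueError(f"missing {layer} header")
        pdu = pdu[len(header):]
    return pdu


wire = send_down("hello")
restored = receive_up(wire)
```

The outermost header belongs to the lowest layer, so an intermediate node that only handles the lower layers never needs to look inside the headers of the layers above.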
25. OSI Protocol Suite (Cont.)
[Figure: peer-to-peer protocols. Each end system implements all seven layers (Application, Presentation, Session, Transport, Network, Data Link, Physical); the intermediate node implements only the Network, Data Link, and Physical layers and is joined to the end systems by communication links.]
26. OSI Protocol Suite (Cont.) -- Physical Layer
- Specify the electrical and mechanical characteristics of the physical communication link
  - Standardize: coding method, modulation technique, wire/connector specification
  - Sharing of a common bus needs interface standards for medium access control in the data link layer
- Reliable mapping of signals to bits needs bit synchronization
- Bit synchronization
  - Detection of the beginning of a bit and of a sequence of bits
  - Bit synchronous: large blocks of bits transmitted at a regular rate
    - Offers higher data transfer speed and better link utilization
  - Character asynchronous: small fixed-size bit sequences transmitted asynchronously
    - Low-speed character-oriented terminals
27. OSI Protocol Suite (Cont.) -- Data Link Control (DLC) Layer
- Ensure reliable data transfer of groups of bits (frames)
- Configuration setup
  - Establishment and termination of a connection
  - Full- or half-duplex, synchronous or asynchronous connection?
- Error control
  - Transmission errors and loss or duplication of data frames
  - Detected by checksum or time-out mechanisms
  - Recovered by retransmission or forward error correction
- Sequencing
  - Maintain an orderly delivery of frames by sequence numbers
  - Sequence numbers can assist error control and flow control of data frames
- Flow control of data frames
  - Permit the transmission of a frame only if it falls into an allowed window of buffers for the sender and the receiver
- Multipoint configuration: DLC sublayer, MAC sublayer, physical layer
  - Resolve the access contention of the multiple-access channel
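Checksums and sequence numbers work together in error control. In the sketch below, the frame layout (a one-byte sequence number, the payload, then a CRC-32 trailer) and all names are assumptions made for illustration, not a real DLC frame format.

```python
import zlib


def make_frame(seq, payload):
    """Build a frame: 1-byte sequence number, payload, 4-byte CRC-32 trailer."""
    body = bytes([seq % 256]) + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def check_frame(frame):
    """Return (seq, payload) if the checksum matches, else None."""
    body, trailer = frame[:-4], frame[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != trailer:
        return None                      # damaged frame: drop it, sender will time out
    return body[0], body[1:]


class Receiver:
    def __init__(self):
        self.expected = 0
        self.delivered = []

    def accept(self, frame):
        """Deliver in-sequence frames; return the next expected number as the ACK."""
        parsed = check_frame(frame)
        if parsed is None:
            return None
        seq, payload = parsed
        if seq == self.expected:         # new frame, in sequence
            self.delivered.append(payload)
            self.expected = (self.expected + 1) % 256
        return self.expected             # duplicates are acknowledged but not redelivered


rx = Receiver()
good = make_frame(0, b"hello")
bad = bytes([good[0]]) + b"jello" + good[-4:]     # payload corrupted in transit
acks = [rx.accept(bad), rx.accept(good), rx.accept(good)]
```

The corrupted frame is silently dropped, the good frame is delivered once, and the retransmitted copy is recognized as a duplicate by its sequence number.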
28. OSI Protocol Suite (Cont.) -- Network Layer
- Address issues of sending packets across the network through several link segments
- Routing function
  - Which link should be selected for forwarding a packet, based on its destination address
  - Static or dynamic routing; centralized or distributed
  - The routing decision can be made when a connection is requested and being established (connection-oriented) or on a packet-by-packet basis (connectionless, multiple-path routing)
- Error, sequencing, and flow control functions
  - Reassemble packets and discard duplicate ones
  - Congestion control for favorable routing nodes
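The routing function amounts to a table that maps each destination to the outgoing link to use. The slide does not name a routing algorithm; the sketch below assumes Dijkstra's shortest-path algorithm as one way to compute a static table, over a hypothetical four-node network.

```python
import heapq


def next_hops(graph, source):
    """Return {destination: first link to take from source} using Dijkstra's algorithm."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source, None)]       # (cost so far, node, first hop used to get here)
    done = set()
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if hop is not None:
            first_hop[node] = hop
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if neighbor not in done and new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                # When leaving the source, the first hop is the neighbor itself.
                via = hop if hop is not None else neighbor
                heapq.heappush(heap, (new_cost, neighbor, via))
    return first_hop


# Hypothetical network; the numbers are link costs.
graph = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
table = next_hops(graph, "A")
```

Node A forwards everything through B, even traffic for its direct neighbor C, because the path through B is cheaper. A dynamic scheme would recompute the table when link costs change.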
29. OSI Protocol Suite (Cont.) -- Transport Layer
- The most important layer from the OS view
  - The only interface between the communication sub-network layers and the network-independent layers
- Provide reliable end-to-end communication between peer processes
  - All network-dependent faults or problems are to be shielded from the communicating processes
  - Message <-> packets (breaking/reassembling)
  - Multiple sessions can be multiplexed on one transport connection
  - One session may occupy multiple transport connections
- Five classes (TP0 to TP4) of transport services to support sessions
  - Depend on application and network quality
  - TP4: multiplexing, error detection, and retransmission
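Breaking a message into packets and reassembling it at the far end can be sketched as follows. The packet tuple layout (sequence number, total count, chunk) is an assumption for illustration and does not correspond to any TP class format.

```python
import random


def segment(message, size):
    """Break a message into (sequence number, total, chunk) packets."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets):
    """Rebuild the message from packets in any order, ignoring duplicates."""
    by_seq = {}
    total = None
    for seq, count, chunk in packets:
        total = count
        by_seq.setdefault(seq, chunk)     # a repeated sequence number is a duplicate
    if total is None or len(by_seq) != total:
        raise ValueError("packets missing; a real transport would request retransmission")
    return b"".join(by_seq[seq] for seq in range(total))


message = b"All network-dependent faults are shielded from the process."
packets = segment(message, size=8)
received = packets + [packets[2]]         # the network duplicated one packet
random.Random(0).shuffle(received)        # and delivered everything out of order
rebuilt = reassemble(received)
```

The communicating process sees only the original message; reordering and duplication introduced by the network are absorbed below it.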
30. OSI Protocol Suite (Cont.) -- Session, Presentation, Application Layers
- Session layer: adds dialog and synchronization services to the transport layer
  - Dialog: establishment of sessions
  - Synchronization: allows processes to insert checkpoints for efficient recovery from system crashes
- Presentation layer: data encryption, compression, and code conversion for messages that use different coding schemes
- Application layer: the standard is left entirely to the designer of the application
31. TCP/IP Protocol Suite
- Addresses inter-process and inter-node communication
- How is communication between a pair of processes maintained?
  - Transport layer -> TCP (comparable to TP4 in OSI)
  - Connection-oriented (TCP) or connectionless (UDP)
- How are messages routed through the network nodes?
  - Network layer -> IP (a little more than the OSI network layer)
  - Virtual circuit or datagram
- TCP/IP focuses on interconnecting networks
  - Combinations: (TCP, UDP) over (virtual circuit, datagram IP)
  - Shifts the burden of maintaining reliable communication from the network to the OS
- Port and socket (more in Chapter 4)
  - Port: inter-process communication endpoint
  - Socket: interface to a port
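The relationship between ports and sockets can be seen in a small loopback exchange. This is a minimal sketch using Python's standard socket module; the upper-casing echo behavior is an arbitrary choice for illustration.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and send back whatever arrives, upper-cased."""
    conn, _addr = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024).upper())


# Server side: bind a socket to a port (0 asks the OS for any free port).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=echo_once, args=(server,))
worker.start()

# Client side: a TCP connection must be set up before any data is exchanged.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"ping")
    reply = client.recv(1024)

worker.join()
server.close()
```

The port number identifies the communication endpoint on the host, and the socket object is the interface the process uses to reach it. Swapping SOCK_STREAM for SOCK_DGRAM would select the connectionless UDP service, which needs no connection setup.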
32. TCP/IP Protocol Suite (Cont.)
[Figure: peer-to-peer protocols in TCP/IP. Application processes exchange messages, transport layers exchange packets, internet layers (including one at the gateway) exchange datagrams, and the data link and physical layers exchange frames as bits.]
33. Major Design Issues
- A distributed system consists of concurrent processes accessing distributed resources (which may be shared or replicated) through message passing in a network environment that may be unreliable and contain untrusted components
  - How to model and identify objects
  - How to coordinate the interaction among objects
  - How to achieve communication among objects
  - How to manage shared or replicated objects
  - How to protect objects and system security
  - How to support transparency
34. Major Design Issues -- Object Models and Naming Schemes
- Objects: processes, data files, memory, devices, processors, networks
- Assume all objects can be represented uniformly
  - An object is represented abstractly by its allowable operations
  - The physical details of the object are transparent to other objects
- To identify a server
  - By name: map name to logical address
  - By physical or logical address: done by a network service; a port for a logical address
  - By service: needed by CAS
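Identifying a server by name can be sketched as a two-step lookup: name to logical address, then logical address to physical address. The tables, names, and addresses below are hypothetical and stand in for a name server and a network service.

```python
# Hypothetical registry contents; none of these names or addresses come from the source.
NAME_TO_LOGICAL = {"print-service": ("printer-host", 515)}     # name -> (host, port)
LOGICAL_TO_PHYSICAL = {"printer-host": "10.0.0.7"}              # host -> network address


def resolve(name):
    """Two-step lookup: name server first, then the network service."""
    try:
        host, port = NAME_TO_LOGICAL[name]
    except KeyError:
        raise LookupError(f"unknown server name: {name}") from None
    return LOGICAL_TO_PHYSICAL[host], port


address = resolve("print-service")
```

Keeping the two mappings separate means that moving the server to another machine changes only the second table, so clients that use the name are unaffected.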
35. Major Design Issues -- Distributed Coordination
- Coordinate interacting concurrent processes to achieve synchronization
- Requirements
  - Barrier synchronization: a set of processes (or events) must reach a common synchronization point before they can continue
  - Condition coordination: a set of processes (or events) must wait for an asynchronous condition set by other processes to maintain some ordering of execution
  - Mutual exclusion: concurrent processes must have mutual exclusion when accessing a critical shared resource
- Need knowledge of state information about other processes
  - Through messages -> inaccurate or incomplete (unreliable network)
  - Centralized coordinator (leader election) or distributed resolution
- Deadlock handling: detect and recover
  - Assimilate partial global state information and use it for decision making
  - Exchange local knowledge among cooperating sites
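Barrier synchronization and mutual exclusion can be illustrated on a single machine with threads. This sketch relies on shared memory, which a distributed system does not have, so it shows only the semantics of the two requirements and not a message-based implementation.

```python
import threading

WORKERS = 4
barrier = threading.Barrier(WORKERS)    # common synchronization point
lock = threading.Lock()                 # guards the critical shared resource
arrived = []
counts_after_barrier = []


def worker(ident):
    with lock:                          # mutual exclusion while updating shared state
        arrived.append(ident)
    barrier.wait()                      # nobody continues until all workers get here
    with lock:
        counts_after_barrier.append(len(arrived))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every worker observes all four arrivals after the barrier, whatever order the threads were scheduled in. In a distributed setting the same guarantees have to be built from message exchanges, with either a central coordinator or a distributed protocol.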
36. Major Design Issues (Cont.)
- IPC: use high-level methods for transparency in communication
  - Message passing: low level and physical
  - Client/server model: system interactions through message exchanges (request/reply)
  - RPC: request/reply like a procedure call, built on top of the client/server model
  - RPC assumes point-to-point communication, but groups are also needed (multicast, broadcast)
- Distributed resources: data and processing capacity
  - Multiprocessor scheduling: static load distribution vs. dynamic load sharing
  - Process migration, real-time scheduling
  - Distributed file system and distributed shared memory
  - Sharing and replication of data
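The way RPC layers a procedure-call interface over request/reply messages can be sketched with a client stub and a server dispatcher. The JSON message format, the procedure table, and the in-process transport are all assumptions for illustration; a real system would send the bytes across the network.

```python
import json

# Server side: procedures that remote callers may invoke (hypothetical examples).
PROCEDURES = {"add": lambda a, b: a + b, "upper": lambda s: s.upper()}


def handle_request(request_bytes):
    """Server dispatcher: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(request_bytes)
    try:
        result = PROCEDURES[request["proc"]](*request["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:
        reply = {"ok": False, "error": repr(exc)}
    return json.dumps(reply).encode()


def rpc_call(transport, proc, *args):
    """Client stub: looks like a local call, but is a request/reply message exchange."""
    request = json.dumps({"proc": proc, "args": list(args)}).encode()
    reply = json.loads(transport(request))
    if not reply["ok"]:
        raise RuntimeError(reply["error"])
    return reply["result"]


# The transport here is a direct function call standing in for the network.
total = rpc_call(handle_request, "add", 2, 3)
```

The caller never sees the messages, which is the transparency the slide refers to. The stub is inherently point-to-point, which is why group communication needs a separate mechanism.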
37. Major Design Issues (Cont.)
- Fault tolerance and security
  - Failure: unintentional intrusion; redundancy alleviates it
  - Security violation: intentional intrusion; need secure communication processes and integrity of messages
  - Need to authenticate clients/servers and messages
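Message integrity and authentication can be sketched with a keyed hash. The slide does not prescribe a mechanism; the example below assumes HMAC-SHA256 with a key shared by client and server, and the key and message are made up for illustration.

```python
import hashlib
import hmac


def sign(key, message):
    """Return the HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """True only if the tag was produced with the same key over the same message."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"shared-secret-for-illustration"    # hypothetical key; never hard-code real keys
message = b"transfer 100 to account 42"
tag = sign(key, message)

accepted = verify(key, message, tag)
tampered = verify(key, b"transfer 900 to account 42", tag)
```

A receiver that holds the key can detect both a modified message and a message from a sender who does not know the key. Distributing the key securely is a separate problem that this sketch does not address.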
38. Distributed Computing Environment (DCE)
- Proposed by the Open Software Foundation (OSF)
  - Develop and standardize an open Unix environment that is free from the influence of AT&T and Sun
- DCE: an integrated package of software and tools for developing distributed applications on an existing OS
- Hierarchically layered architecture
39. DCE Architecture