Title: Introduction to Distributed Systems
1Introduction to Distributed Systems
- Distributed Computer Systems
2Introduction to Distributed Systems
- A distributed system provides an integrated computing facility and consists of
- a collection of autonomous computers linked by a network, and
- distributed system software that lets the computers communicate and coordinate their actions using message passing.
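A minimal sketch of coordination by message passing, assuming two threads stand in for two autonomous computers and in-memory queues stand in for the network. The names `worker`, `demo` and the message layout are hypothetical and chosen only for illustration.

```python
import queue
import threading


def worker(inbox, outbox):
    """Hypothetical 'remote' node: it shares no variables, only messages."""
    while True:
        msg = inbox.get()          # block until a message arrives
        if msg is None:            # None is this sketch's shutdown signal
            break
        outbox.put(("sum", sum(msg)))


def demo():
    to_worker = queue.Queue()      # stands in for the network link to the node
    from_worker = queue.Queue()    # stands in for the link back
    node = threading.Thread(target=worker, args=(to_worker, from_worker))
    node.start()
    to_worker.put([1, 2, 3])       # request message
    reply = from_worker.get()      # reply message
    to_worker.put(None)            # ask the node to stop
    node.join()
    return reply


if __name__ == "__main__":
    print(demo())
```

The two sides never touch each other's data directly; all coordination goes through the queues, which is the property that carries over to a real network.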
3 A simple Distributed System
[Figure: a LAN connecting PCs, workstations, file servers, login servers and print servers, with a gateway to a WAN]
4Examples of Distributed Systems
- The Internet
- Intranets
- Mobile and Ubiquitous Computing
5The Internet
- A very large distributed system enabling the use of open-ended services such as
- WWW, e-mail, file transfer
- Multimedia services
- but currently with limited capacity for special communication requirements
6Intranets
- A portion of the Internet
- Can be administered and configured locally to enforce local security policies
- Can be connected to the Internet via a router.
- Firewalls are used to protect an Intranet from
unauthorised messages into or out of the Intranet.
7Mobile and Ubiquitous Computing
- Integration of portable and small devices into distributed systems
- Typical example
- A wireless LAN consisting of laptops, mobile phones and cameras,
- with the mobile phone connected to the Internet using WAP via a gateway.
8Mobile Computing- System Issues
- Discovery of resources in a host environment
- No need for the user to reconfigure mobile devices while they move around
- Users coping with limited connectivity as they move around
- Security guarantees
9Advantages of distributed systems over
Centralized Systems
- Economic: better price/performance ratio
- Speed: more total computing power
- Inherent distribution: some applications involve spatially spread machines
- Reliability: if one machine crashes, the system as a whole may survive
- Incremental growth: computing power can be added in small increments
10Major Challenges
- Heterogeneity
- variety and difference
- Networking
- The network can saturate or cause other problems.
- Security
- Easy access also applies to secret data.
- hacking!
11Heterogeneity Applies to
- Computer Networks
- Machine Architectures
- Operating Systems
- Programming Languages
- Applications written by different programmers
12Middleware
- A software layer that masks heterogeneity
- Mostly implemented over the Internet protocols, which themselves mask the underlying networks.
- All middleware deals with differences in operating systems and architectures.
13Middleware
- Also provides a uniform computational model to implement servers and distributed applications.
- Possible models
- Remote Object invocation
- Remote SQL access and distributed transactions
14Examples of middleware
- CORBA - provides remote object invocation
- Supports different programming languages
- Java RMI - provides remote object invocation
- Supports only Java
15Heterogeneity and mobile code
- Mobile code moves from one computer to run at a destination.
- But the destination may not be able to run it! (e.g. from a PC to a Linux box)
- The virtual machine approach (e.g. JVM) may be a solution
- As long as both sides use the same language
16Key Characteristics
- Resource sharing: hardware and data
- Openness: published interfaces
- Concurrency
- Scalability
- the ability to provide a huge distributed system
- Reliability and Fault Tolerance
- Hardware redundancy
- Software recovery
- Transparency
17Concurrency
- Several processes may exist in a single computer. The computer may have only one, or several processors.
- With one processor, concurrency is achieved by interleaving the execution of processes.
[Figure: Interleaving. Processes P1, P2 and P3 plotted against time.]
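Interleaving on a single processor can be sketched with a toy round-robin scheduler. This is an illustrative assumption, not how any particular operating system schedules: each "process" is a Python generator that yields after one step, and the names `process` and `round_robin` are hypothetical.

```python
from collections import deque


def process(name, steps):
    """A toy process: each yield marks the end of one time slice."""
    for i in range(1, steps + 1):
        yield f"{name}.{i}"


def round_robin(processes):
    """Run one step of each ready process in turn until all have finished."""
    ready = deque(processes)
    trace = []
    while ready:
        proc = ready.popleft()
        try:
            trace.append(next(proc))   # give the process one time slice
        except StopIteration:
            continue                   # process finished; do not requeue it
        ready.append(proc)             # back of the queue for its next turn
    return trace


if __name__ == "__main__":
    print(round_robin([process("P1", 2), process("P2", 3), process("P3", 1)]))
```

Only one step runs at any instant, yet all three processes make progress, which is what the time axis in the figure conveys.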
18Transparency
- To the user it should appear that there is a single-processor timesharing system
- Types of transparency (ISO RM-ODP)
- Location transparency
- Access Transparency
- Migration Transparency
- Replication Transparency
- Concurrency Transparency
- Parallelism transparency
- Failure Transparency
- Scaling Transparency
19Location Transparency
- Users cannot tell where resources are located.
Example: sending e-mail to a user on the Internet (P.Saeidi_at_staffs.ac.uk)
The physical or network location is transparent
[Figure: DNS; the location of the recipient is shown as "?"]
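The role a name service plays in location transparency can be sketched with a toy lookup table. This is not the DNS protocol: the `NameService` class, the `send_mail` function, the domain `example.org` and the addresses are all hypothetical, chosen only to show that the sender uses a name and never an address.

```python
class NameService:
    """Toy stand-in for a name service: maps a domain name to an address."""

    def __init__(self):
        self._table = {}

    def register(self, name, address):
        self._table[name] = address

    def resolve(self, name):
        return self._table[name]


def send_mail(names, recipient, body, network):
    """Deliver to whatever address the recipient's domain currently maps to."""
    domain = recipient.split("@", 1)[1]
    address = names.resolve(domain)            # location is looked up, not known
    network.setdefault(address, []).append((recipient, body))
    return address


if __name__ == "__main__":
    names, network = NameService(), {}
    names.register("example.org", "192.0.2.10")
    print(send_mail(names, "user@example.org", "hello", network))
    names.register("example.org", "192.0.2.99")   # the mail server moves
    print(send_mail(names, "user@example.org", "hello again", network))
```

The sender's call is identical before and after the server changes address; only the name service entry changes.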
20Access Transparency
- Accessing local or remote objects using identical operations.
Example: clicking an icon on a graphical user interface
[Figure: local and remote e-mail; the same implementation of the e-mail software everywhere]
Access and location transparency together are referred to as network transparency.
21Migration Transparency
- Resources moving at will without changing names
or affecting the operation of application
programs.
Example 1: Database. When a database moves to another computer, the user workstations automatically adjust to the new location.
Example 2: Process migration
>> Processes may be moved even after they have started execution.
>> Better load balancing, but the design is more complex.
22Replication Transparency
- Users cannot tell how many copies of a resource exist.
[Figure: clients send requests and receive replies via front ends, which communicate with the replica managers (RM)]
An architectural model for the management of replicated data. The front ends implement replication transparency.
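The front-end idea can be sketched as follows. The write-to-all, read-from-any policy used here is an assumption made for the example, not a protocol stated on the slide, and the class names `ReplicaManager` and `FrontEnd` are hypothetical. A replica that misses a write while it is down becomes stale; this sketch does not repair that.

```python
class ReplicaManager:
    """Holds one copy of the data; `up` simulates whether it is reachable."""

    def __init__(self):
        self.data = {}
        self.up = True

    def write(self, key, value):
        if not self.up:
            raise ConnectionError("replica manager unreachable")
        self.data[key] = value

    def read(self, key):
        if not self.up:
            raise ConnectionError("replica manager unreachable")
        return self.data[key]


class FrontEnd:
    """Clients talk only to the front end and never see how many copies exist."""

    def __init__(self, replicas):
        self._replicas = list(replicas)

    def write(self, key, value):
        for rm in self._replicas:          # assumed policy: write to every copy
            try:
                rm.write(key, value)
            except ConnectionError:
                pass                       # skipped copy becomes stale (not handled)

    def read(self, key):
        for rm in self._replicas:          # assumed policy: first reachable copy
            try:
                return rm.read(key)
            except ConnectionError:
                continue
        raise ConnectionError("no replica manager reachable")


if __name__ == "__main__":
    rms = [ReplicaManager() for _ in range(3)]
    fe = FrontEnd(rms)
    fe.write("block", "v1")
    rms[0].up = False                      # one copy fails
    print(fe.read("block"))                # the client still gets its answer
```

The client code uses `fe.read` and `fe.write` exactly as it would with a single copy, which is the transparency the front ends are meant to provide.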
23Concurrency Transparency
- Several processes operating concurrently using
shared resources without interference between
them.
Example: changes to a file by one client should not interfere with the operation of other clients simultaneously accessing the same file
client 1: X = 1000; Y = 0
shared data: X, Y
client 2: Y = X * 2; print(Y)
24Concurrency Transparency
- Some possible results (with interference)
client 1: X = 1000; Y = 0
shared data: X, Y
client 2: Y = X * 2; print(Y)
c1 sets X = 1000; c2 sets Y = 1000 * 2; c2 prints 2000
c1 sets X = 1000; c2 sets Y = 1000 * 2; c1 sets Y = 0; c2 prints 0
c1, c2 are clients
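The interference can be reproduced by enumerating every interleaving of the two clients' steps. This is a small simulation under two assumptions not stated on the slide: X and Y both start at 0, and each statement executes atomically. The function names are hypothetical.

```python
from itertools import permutations

C1 = ("c1: X = 1000", "c1: Y = 0")
C2 = ("c2: Y = X * 2", "c2: print(Y)")


def run(schedule, initial_x=0, initial_y=0):
    """Execute the steps in the given order and return what c2 prints."""
    shared = {"X": initial_x, "Y": initial_y}   # assumed initial values
    printed = None
    for step in schedule:
        if step == "c1: X = 1000":
            shared["X"] = 1000
        elif step == "c1: Y = 0":
            shared["Y"] = 0
        elif step == "c2: Y = X * 2":
            shared["Y"] = shared["X"] * 2
        elif step == "c2: print(Y)":
            printed = shared["Y"]
    return printed


def interleavings():
    """All orders of the four steps that keep each client's own order."""
    for order in permutations(C1 + C2):
        if (order.index(C1[0]) < order.index(C1[1])
                and order.index(C2[0]) < order.index(C2[1])):
            yield order


def possible_results():
    return {run(order) for order in interleavings()}


if __name__ == "__main__":
    for order in interleavings():
        print(" ; ".join(order), "->", run(order))
```

Under these assumptions the six interleavings produce only the two outcomes named on the slide, 2000 and 0, depending on whether c1's `Y = 0` lands between c2's assignment and its print.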
25Parallelism Transparency
- Activities happening in parallel without users knowing.
- The most general model of parallelism is MIMD (Multiple Instruction, Multiple Data computing).
- There are two types of MIMD:
1. Tightly coupled multiprocessors (shared memory)
2. Loosely coupled multicomputers (private memory)
26Example of parallelism
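- max(v), v = (12, 22, 43, 3, 56, 4, 23)
- in two phases
1. Distribute sub-problems into the network
2. Collect results from the network
Distribution phase:
(12, 22, 43, 3, 56, 4, 23) splits into (12, 22, 43) and (3, 56, 4, 23)
(12, 22, 43) splits into (12, 22) and (43)
(3, 56, 4, 23) splits into (3, 56) and (4, 23)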
27Example of Parallelism..
max(12, 22) = 22
max(22, 43) = 43
max(3, 56) = 56
max(4, 23) = 23
max(56, 23) = 56
max(43, 56) = 56
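The two phases of this example can be sketched with a thread pool, assuming worker threads stand in for machines on the network. The function name `parallel_max` and the fixed chunk size are hypothetical, so the sub-problems are cut at different points than in the slide's tree; the input is assumed to be non-empty.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_max(values, chunk_size=2, workers=4):
    """Distribute chunks to workers, then collect and combine partial maxima."""
    # Phase 1: distribute sub-problems (here: fixed-size slices of the input).
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(max, chunks))   # each worker solves one chunk
    # Phase 2: collect the partial results and reduce them to one answer.
    return max(partial)


if __name__ == "__main__":
    print(parallel_max([12, 22, 43, 3, 56, 4, 23]))
```

The caller sees a single function call and a single result, with no sign that the work was split up, which is the point of parallelism transparency.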
28Failure Transparency
- Conceals faults
- Users can complete their tasks despite failure of
hardware or software components
29Scaling transparency
- System and application can expand in scale
- without change to system structure or application
algorithms
30Design Issues
- Some design issues that arise from key characteristics of distributed systems
- Software structure
- Workload allocation
- Naming
- Communication
- Consistency maintenance
31Software structure
- A conventional operating system such as UNIX has a monolithic structure
- A hierarchical layered abstraction of operating system kernel services
- The kernel services may be duplicated over a distributed system that also incorporates some communication services.
32Distributed software structure
- In practice a distributed system like UNIX offers the following services
- A distributed filing system, offering transparent access to remote files
- A distributed naming scheme.
- Interposes communication
33Software Structure
- This duplication of kernel services is undesirable.
- Microkernels are more flexible than monolithic kernels. Most distributed systems provide flexibility by keeping only minimal services in the kernel, such as
- an IPC mechanism
- some memory management
- some process management
- some low level I/O.
- All other services are implemented as user-level
services
34Consistency Maintenance
- Update consistency
- several processes may access and update shared data concurrently.
- Replication consistency
- Example: the Internet netnews system. Messages posted to newsgroups may appear in an inconsistent order at different sites; the answer to a question may appear before the question itself!
- Cache consistency
35Cache Consistency
- Clients may update a cached block of a file that
belongs to a file server.
[Figure: File server; three processors, each with memory and a cache; data block k]
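The cache consistency problem can be shown with a small simulation. It assumes write-through caching with explicit invalidation, which is one possible policy and not one named on the slide; `FileServer`, `CachingClient` and their methods are hypothetical.

```python
class FileServer:
    """Owns the authoritative copy of each block."""

    def __init__(self):
        self.blocks = {}

    def read_block(self, k):
        return self.blocks[k]

    def write_block(self, k, data):
        self.blocks[k] = data


class CachingClient:
    """A client that keeps local copies of blocks it has read."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, k):
        if k not in self.cache:                 # cache miss: fetch from server
            self.cache[k] = self.server.read_block(k)
        return self.cache[k]                    # cache hit: may be out of date

    def write(self, k, data):
        self.cache[k] = data
        self.server.write_block(k, data)        # assumed policy: write-through

    def invalidate(self, k):
        self.cache.pop(k, None)                 # forget the local copy


if __name__ == "__main__":
    server = FileServer()
    server.write_block("k", "v1")
    a, b = CachingClient(server), CachingClient(server)
    a.read("k"); b.read("k")                    # both clients now cache block k
    a.write("k", "v2")                          # client a updates the block
    print(b.read("k"))                          # b still sees its stale copy
    b.invalidate("k")
    print(b.read("k"))                          # after invalidation b refetches
```

Even with write-through, the second client keeps reading its old copy until something invalidates it, which is why a cache consistency mechanism is needed.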