Introduction to Distributed Systems

Slides: 64

Provided by: CIT788

1
Introduction to Distributed Systems

What is a Distributed System?
Why do we want to have distributed systems?
Different Types of Distributed Computer Systems
Middleware: DOS Vs. NOS
Some Example Applications
Resources Sharing in a Distributed System
Challenges and Problems
New Developments in Distributed Systems

What is a distributed system?
A definition based on what we want to have
From Single Processor System ->
Distributed System Middleware
What is a middleware?
A definition based on what we want to have and
the current situations

3
What is a Distributed System?

What is a distributed system?
A system which is distributed
What is a system? Multiple components, connected,
defined functions, etc
What is a centralized system? Standalone
computer???
Centralized Vs. distributed (performance and
structures)
What is the meaning of distributed? Separated
connected?
What are the implications (problems) of being
distributed? Many!
A distributed system is a collection of
independent computers that appear to the users of
the system as a single computer (physically
distributed but logically centralized)
(abstraction)
- Tanenbaum 2002
A distributed system is one in which components
located at networked computers communicate and
coordinate their actions only by passing messages
(components) (connection)
- Dollimore et al. 2005

4
Abstraction Vs. Connection

Which definition is better?
A. Tanenbaum (functionally single but physically
distributed)
B. Dollimore (physically distributed) (A or B?)
Why do we have these two definitions?
Any more? Any suggestions?
What are the differences between these two
definitions in
System development and implementation
Services provided
Performance

5
Why do we want to have distributed systems?

Two basic reasons for going distributed
Performance reasons
Reduce response time (better performance)
Distributed systems give better performance
(normally)
More processing units, larger memory, more data
for processing
Performance tradeoffs (security, reliability, ...)
Resource sharing
More resources (data, hardware, memory, computing
units, ...) for sharing across the networks
E.g., sharing of a printer, memory, disk
storage, CPUs, etc.

6
Different Types of Computer Systems

What is a Computer (Computing system)?
Hardware and software (system boundary)
Functionally
A machine that can perform computation
What is the meaning of computation or
compute?
Structurally
A specially designed machine: a CPU, memory
devices and I/O devices, etc.
Mostly, a computer can be used for multiple
(general) purposes (loading different programs to
execute for different purposes)
Do we have computers for specific purposes? Yes?
No?
Hardware
Single processor, multiple processors, multiple
computers, loosely coupled, tightly coupled
hardware
Software (supported by OS and applications)
Single process, multiple processes, concurrent
processes

7
Performance Issues

In the old days, a computer had a single CPU and
processed jobs sequentially, one by one, without
interleaving
How to improve the performance of a computer
system?
More and better (faster) hardware, operating
environment, efficient coding, ...
How to measure the performance of a computer
system?
E.g., response time, throughput (number of jobs
completed per unit time), utilization
Limitations of machines: adding more resources
(faster CPU, more CPUs, more memory, ...)
Performance is limited by the bottleneck resource
when resources are used sequentially
Limitations of operating environment: concurrent
execution of processes may not be allowed or
may be limited (only one job can be served at a
time although multiple jobs may be active)
What are other considerations in addition to
these performance measures? Reliability,
security, availability, ...

8
Single Thread and Single CPU
What is the meaning of a thread here? A single
active process
$U = S / A$, $Q = U / (1 - U)$, $R = Q S + S$
U: utilization, S: service time, A: inter-arrival
time, Q: queue length
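
The formulas on this slide can be checked numerically. The sketch below assumes the usual single-server reading of the symbols, and takes R to be the response time (the slide does not label it). The function name single_server_metrics and the sample numbers are made up for illustration.

```python
def single_server_metrics(service_time, inter_arrival_time):
    """Return (U, Q, R) for one CPU serving one job at a time.

    Assumes jobs arrive more slowly than they are served (U < 1).
    """
    u = service_time / inter_arrival_time      # U = S / A
    if u >= 1:
        raise ValueError("utilization must be below 1")
    q = u / (1 - u)                            # Q = U / (1 - U)
    r = q * service_time + service_time        # R = Q*S + S
    return u, q, r


# Assumed workload: each job needs 2 sec, one job arrives every 4 sec.
u, q, r = single_server_metrics(service_time=2.0, inter_arrival_time=4.0)
print(u, q, r)
```

With these numbers the CPU is busy half of the time, and a job's response time is twice its service time because it waits, on average, behind one other job.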
9
Multiple Threads and Single CPU
How to determine the switching order and the
number of active processes?
[Figure: one CPU and Processes A to E; the CPU is
executing while switching among Processes A, B
and C (multiple active processes)]
What is the main benefit of having multiple
threads? If Process A is suspended, e.g., while
waiting for input data, the CPU may execute
Process B. What are the overheads? Context
switching
10
Concurrent Processes: Multiple Threads and
Multiple CPUs
[Figure: CPU 1, CPU 2 and CPU 3 executing
Processes A, B and C; Processes D and E are also
shown]
Processes A, B and C are executed concurrently,
which shortens response time (waiting time)
What is ignored in this figure? Data structures
and algorithms (modeling of the application
environment into the computer's virtual
environment)
11
Different Types of Computer Systems

Centralized Computer Systems
Computing units are physically located at the
same site
Note: the users may be distributed
No network delay in processing
(communication/memory data access delay -> 0)
What are the implications? Timing and
synchronization are easier
Single processor or multiprocessors

12
Centralized Computer System
[Figure: a user submits a request over the network
through a simple user interface to the computing
units (which may be multiprocessor and
multithreaded) and receives the result]
13
Centralized Computer Systems

Performance problems of centralized computer
systems
All requests (jobs) are performed at a
centralized site
Workload at the site could be very heavy
(overloaded) (unpredictable performance)
$Q = U / (1 - U)$
Transmission delay of job requests and results
between the originating site and the centralized
site (requests may be from remote users)
Scalability problem
Management of a very large amount of data (e.g.,
a large database)
E.g., making a phone call requires location
management of millions of mobile users (searching
a tree)
Reliability problem: single point of failure
Price/performance: a powerful machine (mainframe)
(millions of HKD) Vs. several cheap machines (PCs)
(a few thousand HKD)

14
Multicomputer/Multiprocessor Systems

Multiprocessors aim to resolve the
performance problem (i.e., shorter response time
and higher throughput) in centralized computer
systems (CCS)
Note: that a single processor can complete a
program in 10 sec does not mean that two
processors can finish it in 5 sec (why?)
Different architectures of
multiprocessor/multicomputer systems
Different degrees of sharing of hardware
resources
Varieties in machine architecture and operating
environment of different machines (how to
organize the processors)
Multiprocessor system (tightly coupled)
All processors map to the same memory address
space
Multicomputer system (loosely coupled)
Each processor (computer) has its own private
memory
Heterogeneous and homogeneous
How to make the sharing?
Needs the redesign of the whole architecture of
the processors and computer, and the support of
the operating systems
What are the functions of an operating system?
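
One common answer to the "why?" about the 10-second program is Amdahl's law, which the slides do not name: only the part of a program that can run in parallel gets faster. The sketch below is an illustration under that assumption. The 80% parallel fraction is an assumed figure, and the function name run_time is made up.

```python
def run_time(single_cpu_time, parallel_fraction, processors):
    """Amdahl's law: the serial part does not get faster."""
    serial = single_cpu_time * (1 - parallel_fraction)
    parallel = single_cpu_time * parallel_fraction / processors
    return serial + parallel


# Assumed: 80% of the 10-second program can run in parallel.
t2 = run_time(10.0, 0.8, 2)
print(round(t2, 2))   # about 6 seconds, not 5
```

Communication and synchronization between the processors add further delay that this simple model leaves out.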

15
Shared-Memory Architecture
16
Shared-Disk Architecture
17
Shared-Nothing Architecture
Are they distributed systems? Structurally YES
18
[Figure labels: single thread, single-processor
system; multiple threads, single-processor system;
multiple threads, multiple-processor system;
tightly coupled system; loosely coupled system;
distributed computers]
19
What is distributed?

Hardware resources
Software resources (various types of services)
What is software? A specification of the functions
to be performed, normally in steps
How to divide a single software program into
several software programs to be executed by
different computing units?
How to implement an algorithm as distributed
processes? E.g., a searching algorithm becomes a
distributed searching algorithm
Data
E.g., a large database is partitioned into
several fragments to be maintained by different
local database systems
How to process the distributed data? E.g., a
SELECT statement accessing a distributed database.
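
A minimal sketch of the data example above: rows are split into fragments, and a query is sent to every fragment and the partial results are merged. Everything here is an assumption for illustration (the modulo placement rule, the row layout, the function names); a real distributed database would also ship the query over a network.

```python
def fragment_of(key, n_fragments):
    """Pick a fragment by key (a simple modulo rule, assumed here)."""
    return key % n_fragments


def insert(fragments, row):
    fragments[fragment_of(row["id"], len(fragments))].append(row)


def select(fragments, predicate):
    """Run the same query on every fragment and merge the results."""
    results = []
    for local_rows in fragments:        # one loop step per "site"
        results.extend(r for r in local_rows if predicate(r))
    return sorted(results, key=lambda r: r["id"])


fragments = [[], [], []]                # three local databases
for i, city in enumerate(["HK", "NY", "HK", "LA", "HK"]):
    insert(fragments, {"id": i, "city": city})

hk_rows = select(fragments, lambda r: r["city"] == "HK")
print([r["id"] for r in hk_rows])       # prints [0, 2, 4]
```

The caller of select does not need to know which fragment holds which row, which is the point of hiding the distribution of the data.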

20
Operating Systems

How to connect the different machines together?
What are the tasks of an OS?
Distributed operating systems (DOS)
Network operating systems (NOS)
Middleware

21
Distributed Systems Services

DOS
An operating system for distributed computers
Not intended for independent computers (which can
join and leave independently)
The computers have a high degree of coupling and
similarity in structure, architecture and
operating environment
NOS
An operating system for loosely connected
computers that could be very different in
structure, architecture and operating environment
Not intended to provide a view of a single
coherent system
Add an additional layer (middleware) to achieve
the two objectives
To hide the heterogeneity (differences) and
provide a high degree of transparency

Why does DOS have these limitations, such as a
high degree of coupling and no support for
independent or heterogeneous computers?
If you are asked to design a new OS, will you
choose to build a new OS which is a DOS or use a
NOS and add a new layer as middleware?

23
Network Operating Systems

In principle, there is NO distributed operating
system (DOS). Why?
A DOS is an operating system that produces a
single system image for all the resources in a
distributed system
The DOS has total control over all the nodes in
the system and it transparently locates new
processes and resources at whatever node suits
its scheduling policies
Examples of NOS: Unix and Windows
They provide networking capability and can
access remote resources
Each node in a NOS retains autonomy in managing
its own resources. A process that creates a
process on another machine has no control over
that child process

24
Middleware Positioning

A distributed system organized as middleware on
top of a network operating system to hide the
heterogeneity of the underlying platform from the
applications
The middleware layer extends over multiple
machines
Applications become operating system independent
but middleware dependent
The primary function provided by the
middleware is the various types of transparency
services (What is the meaning of transparency?
Transparent to whom? What are the benefits?)
To the user program, the machines are logically a
single machine (Why?)
Each local operating system forming a part of the
entire network operating system provides local
resource management

25
Middleware Positioning
NOS
26
Transparencies Provided by Middleware
Different forms of transparency in a distributed
system
27
Middleware Services
In an open middleware-based distributed system,
the protocols used by each middleware layer
should be the same, as well as the interfaces
they offer to applications.
28
A Comparison of Different Architectures
29
Middleware Services

Some common services from middleware
Distributed file systems (accessing a remote file
like accessing a local file)
Remote procedure calls (RPC) (calling a procedure
supported by a remote node is similar to calling
local procedure)
Distributed objects
Distributed documents
High-level communication facilities that hide
the low-level message passing
Naming services allow the search of remote
entities
Persistent storage of data
Distributed transaction management
Security
Note: many of them are resource management jobs
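
To make the RPC item in the list above concrete, here is a toy sketch of how a client-side stub can make a remote call look like a local one. It is not any particular middleware's API: the message format, the Stub class and the add procedure are all invented for illustration, and the "network" is just a function call inside one program.

```python
import json


def add(a, b):                      # the "remote" procedure
    return a + b


PROCEDURES = {"add": add}


def server_handle(request_bytes):
    """Server side: unmarshal the request, call, marshal the reply."""
    request = json.loads(request_bytes)
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()


class Stub:
    """Client side: makes a remote call look like a local call."""

    def __init__(self, transport):
        self._transport = transport  # any function: bytes -> bytes

    def __getattr__(self, name):
        def call(*args):
            message = json.dumps({"proc": name, "args": args}).encode()
            return json.loads(self._transport(message))["result"]
        return call


remote = Stub(server_handle)         # a real system would use a network
print(remote.add(2, 3))              # prints 5
```

The caller writes remote.add(2, 3) as if add were local; the packing and unpacking of messages is hidden in the stub and the server handler.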

What is a distributed system?
By connecting existing Computer Systems
A definition based on the existing
architecture/structure

31
Distributed Systems: Concepts of Networked
Computers

Components -> processes (communicating processes)
Networked computers -> connected (loosely
coupled) computers for sharing of resources
Networked computers
Similar to loosely coupled hardware
Spatially (physically) separated
Communication delays are long and unpredictable
-> when to decide on a time-out (in case of
network failure, worst-case estimation)
Concurrent execution of processes is common
(concurrency) -> performance
No global clocks
Coordinating processes at different networked
computers
What are the problems of lacking a global clock?
What is the main function of a global clock?
Event sequencing
Independent failures
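
One well-known way to sequence events without a global clock is a Lamport logical clock. The slides do not name this technique; it is brought in here as an example. The class and method names below are made up for illustration.

```python
class Process:
    """A process with a Lamport logical clock (no shared global clock)."""

    def __init__(self, name):
        self.name = name
        self.clock = 0

    def local_event(self):
        self.clock += 1
        return self.clock

    def send(self):
        self.clock += 1
        return self.clock               # timestamp travels with the message

    def receive(self, timestamp):
        # Jump ahead of the sender if needed, then count the receive event.
        self.clock = max(self.clock, timestamp) + 1
        return self.clock


p, q = Process("P"), Process("Q")
p.local_event()                          # P: 1
p.local_event()                          # P: 2
t_send = p.send()                        # P: 3
t_recv = q.receive(t_send)               # Q: max(0, 3) + 1 = 4
print(t_send, t_recv)                    # prints 3 4
```

The receive always gets a larger timestamp than the matching send, so the order of cause and effect is preserved even though P and Q never compare wall-clock times.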

32
Examples of Distributed Systems

The Internet
Variety: a large number of different types of
networked computers connected using a set of
standard communication protocols
Mostly for sharing information and resources
A lot of read requests
Use the same interfaces and protocols to access
remote resources
Intranets
A portion of the Internet that is separately
administered and has a boundary
Configured with a local security policy
Connected to the Internet through a router and
protected with a firewall
A firewall filters incoming and outgoing messages
Mobile computing and ubiquitous computing
Mobile computing provides computing services
while the application is moving (also called
nomadic computing)
Ubiquitous computing provides computing services
everywhere (smart spaces) (also called pervasive
computing)

33
Examples of Distributed Systems
A typical portion of the Internet
34
Examples of Distributed Systems
A typical intranet
35
Examples of Distributed Systems
Portable and handheld devices in a distributed
system
36
Examples of Distributed Systems

Note: computers are NOT just Internet computers
What are the applications of computer systems?
Personal, commercial, government and many others
Computers are not just for entertainment (e.g.,
playing games, chatting with people); there are
many other applications such as
controlling a flight, weather forecasting, stock
trading, ...
Many of these applications are distributed in
nature, e.g., stock trading systems and ticket
booking systems
Our real world -> virtual world in computers
Our world is distributed. We are the computing
units. Our brain is the memory unit and we have
communication facilities
They are better supported by a distributed
architecture than by a centralized
architecture
Distributed users, distributed data and
distributed resources
We used a single computer in the past mainly
because building distributed computer systems
was expensive

37
Some Benefits of Distributed Systems

Price/performance
Computers were expensive in the past
Easier to manage a centralized computer system
A cost-effective way to build a larger system
(higher performance) is to use several cheap CPUs
or to connect existing small computers to
form a large system
Reliability
If one machine crashes, the system as a whole can
still survive
What are the different types of failures?
Different degrees of reliability -> some
functions fail, multiple components provide
the same function
Nature of some applications
Some applications are inherently distributed
(e.g. banking and supermarket chain)
Some applications are moving (Examples? Why?)

38
Example
39
Some Benefits of Distributed Systems

Communication
It provides communication facilities (e.g., the
same communication protocol)
Sending emails and transmitting documents to
different users
Flexibility
Load balancing
It spreads the workload over the available
machines in the most cost-effective way
Dynamic workload management (performance Vs.
workload)
Performance -> response time
Given a workload, under what conditions is the
response time the smallest?
Different nodes have similar utilization ->
minimum response time
Note: these two are not benefits (Why?)
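
The claim that similar utilization gives the minimum response time can be tried out with a small sketch. It assumes two identical nodes and the same single-server model as the earlier single-CPU slide, where R = QS + S simplifies to S / (1 - U). The workload numbers and the function name are invented for illustration.

```python
def mean_response_time(share, total_rate, service_time):
    """Mean response time when `share` of the jobs go to node 1.

    Each node is modeled as R = S / (1 - U), with U = arrival rate * S.
    """
    total = 0.0
    for fraction in (share, 1 - share):
        u = fraction * total_rate * service_time
        total += fraction * service_time / (1 - u)
    return total


# Assumed workload: 1.2 jobs/sec in total, 1 sec of service per job.
shares = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
best = min(shares, key=lambda s: mean_response_time(s, 1.2, 1.0))
print(best)   # prints 0.5
```

Among the splits tried, the even split (both nodes at the same utilization) gives the smallest mean response time; pushing most jobs to one node drives that node close to overload.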

40
Resources Sharing in a Distributed System

Many physical resources are distributed in nature
(devices)
The sources for generating soft resources
(information/data) are also distributed in nature
E.g., weather, news, sport results, ticket
information, etc.
A natural trend to share resources
Data and software sharing
It allows many users to access a remote database
or even download a program for execution locally
Device sharing
It allows many users to share expensive
peripherals
E.g., printers and other peripherals
Computation power
Computation may be performed by remote computer
Incremental growth and scalability
Computing power can be added in small increments

41
Resources Sharing in a Distributed System

Note: resource sharing is NOT always good
Why would you want to share resources with
other users?
Although you can access other users' resources,
you also need to provide your resources for other
users to access
If you have all your required resources, what do
you want? Sharing? No sharing?
What are the problems associated with resource
sharing?
Security, management problems, access problems,
reliability, ...

42
Resources Sharing in a Distributed System

How to access remote resources? Through a
resource manager
What is a resource manager?
A program that offers a communication interface
enabling the resource to be accessed, manipulated
and updated reliably and consistently
What should the resource manager do?
Provide resource name (naming services)
Identify resource location (distributed directory
management)
Map resource name to communication address
(naming directory management)
Coordinate concurrent accesses to ensure
consistency (correctness)
Different scales in sharing
Internet and computer-supported cooperative
working (CSCW)
Resource encapsulation (security)
Only the resource manager can access the resource
Other users send requests to the resource manager
in a standard way, using a standard protocol
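
The duties listed on this slide (naming, mapping a name to an address, coordinating concurrent access, encapsulation) can be sketched in one small class. This is only an illustration: the class, its method names and the address string are all made up, and a real resource manager would receive requests over a network instead of from local threads.

```python
import threading


class ResourceManager:
    """Sole owner of its resources; clients go through request()."""

    def __init__(self):
        self._addresses = {}            # resource name -> address
        self._values = {}               # address -> resource state
        self._lock = threading.Lock()   # serializes concurrent accesses

    def register(self, name, address, initial):
        with self._lock:
            self._addresses[name] = address
            self._values[address] = initial

    def locate(self, name):
        """Naming service: map a resource name to its address."""
        return self._addresses[name]

    def request(self, name, update):
        """Apply `update` to the named resource atomically."""
        with self._lock:
            address = self._addresses[name]
            self._values[address] = update(self._values[address])
            return self._values[address]


manager = ResourceManager()
manager.register("counter", "node-7:4000", 0)   # address is made up

workers = [threading.Thread(target=manager.request,
                            args=("counter", lambda v: v + 1))
           for _ in range(50)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(manager.request("counter", lambda v: v))  # prints 50
```

Because every update goes through the manager's lock, all 50 increments are applied and none is lost, which is the consistency duty named on the slide.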

43
Example: Association (Group Management)

Multiple objects
Multiple objects co-exist in a distributed
environment. Some of them are service providers
and the others are users
Association: at least one of a given pair of
components communicates with another within the
system (cooperatively perform a task (provide
services))
After association -> interoperation: the
interaction during association
Association is spontaneous (without user
intervention)
Network bootstrapping
Communication takes place over a local network
within the system
The device acquires an address (ID and a name) on
the local network
Who determines the assignment and manages the
network?

44
Centralized Vs. Distributed Management

Management (Algorithm) -> Centralized OR
Distributed
Centralized approach: use a powerful server to
manage the space status and connection
information
Distributed approach: multiple devices (service
providers) manage the information
Comparisons
Problems in distributed computing
Perform operations at device level because of
limited bandwidth
Due to the dynamic properties of the objects, a
lot of updates need to be generated
A distributed approach can make the management of
objects localized and adaptive to the
changing system status (in-network processing).
But the communication overhead could be very
heavy
A hierarchical approach: multiple levels with
different levels of coordinators may be used

45
Example: Jini's Discovery System

Java-based system for mobile and pervasive
computing systems
Components: lookup services (discovery services),
Jini services and Jini clients
A Jini service provides services
The lookup service stores services
Jini clients request services
Lookup service allows Jini services to register
the services they offer
A Jini service may be registered with one or more
lookup services
Jini clients request services that match their
requirements
If a match is found, the Jini client downloads an
object that provides access to the service from
the lookup service

46
Example: Jini's Discovery System

When a Jini client or service starts up, it sends
a request to a well-known IP multicast address
Any lookup service that receives the request
sends its address, enabling the requester to
perform a remote invocation to look up or
register a service with it
The client requires a lookup service in the
finance group, so it multicasts a request with
that group name
Only one lookup service is bound to the group
name, and that service responds with its address
The client communicates directly using RMI to
locate all services of type printing
Only one printing service has registered with the
lookup service
The client then uses the printing service directly
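
The discovery steps above can be imitated with a toy registry. This is not the real Jini API: the class, the function names and the addresses are invented, the multicast request is replaced by a plain loop, and the downloaded service object is replaced by a local function.

```python
class LookupService:
    """A toy registry; not the real Jini API."""

    def __init__(self, address, groups):
        self.address = address
        self.groups = set(groups)
        self._services = {}                  # service type -> proxy object

    def register(self, service_type, proxy):
        self._services[service_type] = proxy

    def lookup(self, service_type):
        return self._services.get(service_type)


def discover(lookup_services, group):
    """Stand-in for the multicast request: matching lookups answer."""
    return [ls for ls in lookup_services if group in ls.groups]


admin = LookupService("10.0.0.1", ["admin"])              # made-up address
finance = LookupService("10.0.0.2", ["admin", "finance"])
finance.register("printing", lambda doc: f"printed {doc}")

found = discover([admin, finance], "finance")    # request by group name
printer = found[0].lookup("printing")            # ask for a printing service
print(printer("report.txt"))                     # prints: printed report.txt
```

Only the lookup service bound to the finance group answers, and the client then talks to the printing service directly, without going through the lookup service again.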

47
Service Discovery in Jini
[Figure: a client, lookup services, a printing
service and a corporate infoservice on a network,
labeled with the groups admin and finance. Steps:
1. finance lookup service  2. Here I am .....
3. Request printing  4. Use printing service]
48

How to satisfy the definitions (requirements) of
a distributed system?
Tanenbaum's requirements and others
Challenges???

49
Challenges of Distributed Systems

Heterogeneity
Openness
Security
Scalability
Failure handling
Concurrency
Transparency

50
Heterogeneity

One of the most important aims of the middleware
is to hide the differences in underlying systems
Applications access remote objects and resources
in a standard way (interface and protocols), as
if they were managed locally
Heterogeneity Vs. Transparency
Differences in
Networks (LAN, WAN, wireless LAN, GSM, etc.)
Computer hardware (different types of CPUs and
machines)
Operating systems (Unix, Windows, WinCE, etc.)
Programming languages (C, Java, C++, etc.)
Implementations by different developers
Standardization: although the Internet consists
of different types of networks, all the
communications use the same set of Internet
protocols

51
Openness

Expandability: the characteristic that
determines whether the system can be extended in
various ways and connected to other systems
(interoperability)
New users can join the Internet at any time
New resources can be added and be made available
for use
Portability: an existing application developed
for a specific distributed system can be moved
to work in another distributed system
Standardization of interfaces for accessing the
resources
Flexibility: a distributed system should be
easily configured (reconfigured) even when the
system components are from different developers
Need to provide definitions not only for the high
level interface but also definitions for
interfaces to internal parts of the system and
describe how those parts interact
Monolithic systems tend to be closed

52
Security

Security for information has three components
Confidentiality: protection against disclosure to
unauthorized individuals
Integrity: protection against alteration or
corruption (correctness)
Availability: protection against interference
with the means to access the resources
Specification of what services and resources are
provided to each user or each group of users
(levels of access and authority)
Methods (encryption and decryption) for encoding
the messages transmitted over the network
Identification of the right users
Denial of service
A user may wish to disrupt the service by
bombarding it with a large number of
pointless requests
Security of mobile code
E.g., receiving an executable program as an
attachment to an email

53
Concurrency -> Consistency

Processes access the same resources (or
different resources) at the same time
The server serves the processes concurrently
(why?)
Parallel executions occur for two reasons
Many users simultaneously invoke commands or
interact with application programs
Many server processes run concurrently, each
responding to different requests from client
processes
Higher concurrency in general implies better
performance (shorter waiting time for services)
In a distributed system with M computers, up to M
processes can execute in parallel
However, this may not be true in many cases
(why?)
Two processes may each alter resources that
will be used by the other
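
The last point, that one process may alter a resource the other is about to use, is the classic lost-update problem. The sketch below replays two fixed schedules step by step so the outcome is repeatable; the schedule format, the process names and the starting balance are invented for illustration.

```python
def run(schedule, balance=100):
    """Replay read/write steps of two processes on one shared value.

    Each process reads the balance, adds 10 locally, then writes back.
    """
    local = {}
    for process, step in schedule:
        if step == "read":
            local[process] = balance
        else:                               # "write"
            balance = local[process] + 10
    return balance


serial = [("P1", "read"), ("P1", "write"),
          ("P2", "read"), ("P2", "write")]
interleaved = [("P1", "read"), ("P2", "read"),
               ("P1", "write"), ("P2", "write")]

print(run(serial), run(interleaved))        # prints 120 110
```

In the interleaved schedule both processes read 100, so the second write overwrites the first and one update is lost. This is why concurrent access has to be coordinated to keep the data consistent.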

54
Example
[Figure: global data item X is duplicated into
several copies, which require data
synchronization]
55
Scalability

A distributed system is scalable if it
remains effective (providing similar quality of
service) when there is a significant increase in
the number of resources and users
There are several scales
The smallest: 2 workstations and 1 file server
Local area network (LAN): up to hundreds of
workstations and several file servers and print
servers (fax servers, etc.)
Internetworking: several interconnected LANs may
contain thousands of computers and share
resources
The Internet
What will be the consequence of doubling the
number of users?
Requesting the same set of data
Requesting to connect to the same server
Requesting to transmit data through the same
segment of network

56
Scalability

To resolve the performance problem, the system
configuration may need to be changed
Adding more servers to balance the workload
Duplicating data to resolve the problem in data
synchronization
Caching data to reduce the transmission workload
Note: mostly, a solution creates another problem
The applications should not be affected by
changes in the system configuration

57
Failure Handling

Failures are possible at any time (planning for
the worst) (unavoidable)
Mostly the failures are partial in a distributed
system and failures occur one by one
Failure handling consists of
Detection of failures
Masking failures
Recovery from failures
The design of fault-tolerant computer systems is
based on redundancy
Hardware redundancy: the use of redundant
components
Software redundancy and data redundancy
Software recovery: the design of programs to
tolerate (process group) or recover from faults
Availability: measures the proportion of time
that the system is available for service
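
The definition of availability and the effect of hardware redundancy can be put into numbers. The sketch assumes that replicas fail independently of each other, which the slides do not state and which is often only roughly true in practice. The uptime figures and function names are made up.

```python
def availability(uptime_hours, downtime_hours):
    """Proportion of time the system is available for service."""
    return uptime_hours / (uptime_hours + downtime_hours)


def replicated_availability(single, replicas):
    """Available if at least one replica is up (independent failures)."""
    return 1 - (1 - single) ** replicas


# Assumed: one machine is up 99 hours out of every 100.
one = availability(uptime_hours=99.0, downtime_hours=1.0)
two = replicated_availability(one, replicas=2)
print(one, round(two, 4))
```

Under the independence assumption, a second redundant machine cuts the expected unavailable time from 1% to about 0.01%.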

58
Transparency

Concealment from the user and the application
programmer of the separation of components
Achieve a single system image to make everyone
think that the collection of machines is
simply an old-fashioned time-sharing system
Using the same access method even when the system
configuration has been changed
Logical system design Vs. physical implementation
Layer structure to hide the details
Access transparency
Enable local and remote information to be
accessed using identical operations
Location transparency
Enable the information objects to be accessed
without knowledge of their location (users need
not tell where resources are located)
Who knows the locations?

59
Transparency

Concurrency transparency
Enable several processes to operate concurrently
using shared information objects without
interference (multiple users can share resources
automatically)
Replication transparency
Enable multiple replicas to be used to increase
reliability and performance without user
knowledge of how many replicas exist
Why do we need replicated data?
Failure transparency
Enable concealment of faults, allowing users to
complete their tasks despite the failure of
hardware or software components
Migration transparency
Allow information objects to move within a system
without changing their name or affecting users
Why do we need to migrate data objects?

60
Transparency

Performance transparency
Allow the system to be configured to improve
(maintain the guaranteed) performance as loads
vary
Scaling transparency
Allow the system and applications to expand in
scale without change to the system structure or
the application algorithms
Parallelism transparency
Allow the program to be executed in parallel
without the user's knowledge

61
Some Basic Techniques for Building a Distributed
System

Replicate to increase availability
Trade off availability against consistency
Exploit cache locality to reduce access delay
Use time-out for revocation
Use a standard remote invocation mechanism
Use encryption for authentication and data
security
Distributed Vs. centralized resource management
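
Two of the techniques above, caching to reduce access delay and time-outs for revocation, combine naturally in a cache whose entries expire. The sketch is one possible reading of those two bullets, with invented names; the clock is passed in so that the example gives the same result on every run.

```python
class TimeoutCache:
    """Cache whose entries are revoked after `ttl` seconds."""

    def __init__(self, fetch, ttl, clock):
        self._fetch = fetch          # function that does the remote access
        self._ttl = ttl
        self._clock = clock          # injected so the example is repeatable
        self._entries = {}           # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]                      # hit: no remote access
        value = self._fetch(key)                 # miss or timed out
        self._entries[key] = (value, now + self._ttl)
        return value


remote_calls = []


def fetch(key):
    """Stand-in for a slow remote access; records each call."""
    remote_calls.append(key)
    return key.upper()


now = [0.0]
cache = TimeoutCache(fetch, ttl=5.0, clock=lambda: now[0])
cache.get("x")
cache.get("x")                       # second call is served locally
now[0] = 6.0                         # the entry has timed out
cache.get("x")
print(len(remote_calls))             # prints 2
```

In real use the clock argument could be time.monotonic. A shorter time-out keeps cached data fresher at the cost of more remote accesses, which is the availability against consistency trade-off listed above.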

62
New Developments in Distributed Systems

Computing units are getting smaller and smaller
but with higher computation power and energy
supply
Extremely large memory storage units
Networks everywhere: both wired and wireless
networks
Performance of mobile networks has improved
greatly
Applications: both commercial and personal
(the personal computer has become one of our
essential units at home)
Computation everywhere (mobile games and mobile
phones)
New applications
Real-time systems: distributed real-time
multimedia systems
Many small computation units: peer-to-peer
systems
Interaction with the environment: sensor network
systems
Multiple information streams: information
integration and filtering
What will be the FUTURE???