Title: TCOM505 Networked MultiComputer Systems
1TCOM-505 Networked Multi-Computer Systems
- Instructor: Dr. Peter W. Pachowicz
- ST 2, R. 253
- OH: Tuesdays 2:00-4:00 pm, and by appointment
- phone 993-1552, email ppach_at_gmu.edu
2CONTENTS
- Introduction to multi-computer distributed systems
- Client/Server model
- Multi-server architectures
- Network Operating System
- Distributed File System and Principles of Replication
- Scaling Up
- Robust Computer Architectures
- Federated DB and Data Warehousing
- Replication architectures
- C/S Distributed System Management
- Other issues
3MOTIVATION
- Business-driven computing revolution
- Shift in computing paradigm
- Traditional computing paradigm vs. Internet-based computing paradigm
- Top-level decomposition
- Implications - Internet architect
- Time table
4DISTRIBUTED SYSTEMS
- Distributed system
- A system in which components located at networked computers communicate and coordinate their actions only by passing messages
- A collection of independent computers that appears to its users as a single coherent system
- Concurrency
- Concurrent program execution at different locations
- Coordination is needed
- No global clock
- No global synchronization by a clock
- Synchronization by messages
- Independent failures
- Each component can fail independently
- A system can fail in many new ways
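One common way to order events when there is no global clock is a Lamport logical clock, where timestamps travel inside the messages themselves. The slides do not name this technique; the sketch below is a minimal illustration of "synchronization by messages", and the class and method names are hypothetical.

```python
class LamportClock:
    """Logical clock: a counter advanced by local events and by messages."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """A local event happened on this node."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned timestamp goes in the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """On receipt, jump past both the local time and the sender's time."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

With this rule a receive event always carries a larger timestamp than the matching send, even though the two computers never compare physical clocks.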
5- Examples of Distributed Systems
- Internet (a global system)
- Intranets (a subsystem)
- Mobile computing (a system with a different type of dynamics)
[Figure: A distributed system organized as middleware. Note that the middleware layer extends over multiple machines.]
6Challenges
- Connecting users and resources
- Heterogeneity
- Integration of systems of different computer hardware, networks, operating systems, programming languages, and implementations by different developers
- Hardware, data representations, and software differ across computing platforms
- Middleware
- Provides a programming abstraction and masks the heterogeneity of the underlying networks, hardware, OSs, and programming languages
- Provides a uniform computational model for use by programmers
- Mobile code
- Can be sent from one computer to another and run at the destination
- Virtual machine concept - code executable on any machine
7- Openness
- Defines how far a system can be extended and re-implemented by adding new services or modifying existing ones
- Systems are designed to support resource sharing in such a way that these resources and services can be extended
- Extension can be achieved at
- Hardware level - by the addition of computers to the network
- Software level - by the introduction of new services and re-implementation of existing ones
- Open system
- Open system interfaces are published
- Open distributed systems are based on the provision of a uniform communication mechanism and published interfaces
- Open distributed systems can be constructed from heterogeneous hardware and software, possibly from different vendors
8- Security
- Confidentiality - protection against disclosure to unauthorized individuals
- Integrity - protection against alteration or corruption
- Availability - protection against interference with the means to access the resources
9- Scalability
- Scalability is a dominant theme in the development of distributed systems. Three dimensions: size, distance, administration
- A system is scalable if it remains effective when there is a significant increase in the number of resources and the number of users
- Controlling the cost of physical resources, e.g.,
- The need for new resources should be proportional to the number of users
- Controlling the performance loss, e.g.,
- Algorithms using hierarchical structures scale better than those using linear structures
- Preventing software resources from running out, e.g.,
- Design must consider the demand growth years ahead
- Avoiding performance bottlenecks, e.g.,
- Algorithms should be decentralized
- Preventing total failures
10- Failure handling
- Failures in a distributed system are partial
- Many failure scenarios exist (due to the combination of partial failures). It is difficult to predict them at the design phase.
- Dealing with failures
- Failure detection
- Failure masking
- Failure toleration
- Recovery from failures
- Redundancy
- Concurrency
- Concept of sharing resources - availability to others
- Multi-threading
11- Transparency
- Concealment from the user and the application
programmer --- they do not have to know about the
separation of components in a distributed system
12- Another look at transparency
- Access transparency - access to local and remote resources using identical operations
- Location transparency - no knowledge of the location is needed
- Concurrency transparency - several processes can operate concurrently and share resources without interference
- Replication transparency - multiple instances of resources can be used without knowledge of the replicas
- Failure transparency - the user does not need to know about faults
- Mobility transparency - allows movement of resources and system elements
- Performance transparency - allows for system reconfiguration to improve performance as loads vary
- Scaling transparency - allows the system and applications to expand in scale without change to the system structure or the application algorithms
13HARDWARE CONCEPTS
- Basic organizations of processors and memories (RAM) in distributed computer systems
- Multi-processor systems (have shared memories)
- Multi-computer systems (do not share memory)
14Bus-Based Multiprocessor
- Pros
- Simple configuration
- Improved speed by added caches (hit ratio may reach 90% for a larger cache)
- Cons
- Incoherent memory (data can be changed by another processor)
- Scalability
- A large number of processors cannot be added
- A bus becomes a system bottleneck
15Switch-Based Multiprocessor
- Crossbar
- Memory divided into modules and accessed through a crossbar switch
- N² crosspoint switches are needed (for N processors and N memory modules)
- Omega switching network
- Built from 2x2 switches, each with two inputs and two outputs; every processor can reach every memory module
- Fewer switches needed
- Switching is more time consuming
- Higher cost
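The "fewer switches" claim can be checked with a little arithmetic. The sketch below assumes the usual textbook counts, which the slides do not spell out: a crossbar for N processors and N memory modules needs N² crosspoints, while an omega network of 2x2 switches has log2(N) stages of N/2 switches each. The function names are hypothetical.

```python
def crossbar_switches(n: int) -> int:
    """Crosspoint switches for n processors and n memory modules."""
    return n * n


def omega_stages(n: int) -> int:
    """Switching stages a request crosses in an omega network (log2 n)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    return n.bit_length() - 1


def omega_switches(n: int) -> int:
    """2x2 switches in an omega network: n/2 switches in each stage."""
    return (n // 2) * omega_stages(n)
```

For 64 processors this gives 4096 crosspoints against 192 2x2 switches, but every omega request has to cross 6 stages, which is where the extra switching time comes from.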
16ARCHITECTURAL MODELS
- The architecture of a system is its structure in terms of separately specified components and their interaction capabilities
- Considerations
- Placement of system components across a network
- Patterns for the distribution of data and workload
- Interrelationships between components
- Component functional roles
- Patterns of communication
- Additional considerations for dynamic systems
- Moving code from one process (computer) to another
- Dynamic connection to and removal from a network, search for services
17CLIENT-SERVER ARCHITECTURE / MODEL
- The most important and basic architecture of a distributed system
- Communication: request/reply (invocation/result)
[Figure: a CLIENT sends a request to a SERVER and receives a reply; that server can in turn act as a client (SERVER/CLIENT) of another SERVER]
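The request/reply exchange can be sketched with plain TCP sockets. This is a minimal illustration, not a production server: it assumes a single short message per connection, and the "service" (upper-casing the text) and all function names are hypothetical.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server: wait passively, handle one request, send one reply."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        reply = request.upper()  # stand-in for the real service
        conn.sendall(reply.encode("utf-8"))


def call(host: str, port: int, request: str) -> str:
    """Client: initiate the dialog, send a request, block for the reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo() -> str:
    """Run one request/reply round trip on the loopback interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: the OS picks a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    try:
        return call("127.0.0.1", port, "hello")
    finally:
        server.join()
        listener.close()
```

Note the asymmetry: the server only ever reacts, and the client always opens the connection.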
18C/S Characteristics
- Service
- A relationship between two computers
- The server is a provider of services
- The client is a consumer of services
- Shared resources
- A server can serve many clients, providing the same resources
- Asymmetrical protocols
- Clients always initiate the dialog
- Servers passively await requests from clients
- Callback - a client can pass a reference to a callback object and the server can invoke it, so the client becomes a server
- Transparency of location
- Computers can be anywhere, as long as they are connected to the network
- Users do not need to know the server's location
19- Mix-and-match
- Independence from hardware and software platforms
- Message-based exchanges
- Clients and servers are loosely coupled systems
- Interactions through message-passing mechanism
- Encapsulation of services
- The server is a specialist and decides how to get the job done
- Servers can be upgraded without affecting clients as long as the published interface is not changed
- Scalability
- C/S systems can be scaled horizontally (more clients) or vertically (faster servers or distributed processes)
- Integrity
- Servers are centrally managed
- Clients remain personal and independent
20Basic C/S Systems
- File Servers
- File transfer across the network - file sharing
- The server functions as a storage of files
- Primitive form of C/S data service
- Generation of a lot of network traffic
[Figure: Application sending a request to the file server]
21- DB Servers
- The client passes SQL requests as messages (it looks like sending instructions one after another for execution on a remote computer)
- The server executes the SQL statements and returns the results (data)
- Application code on the client
- Data and data retrieval controlled by SQL statements on the server
- Efficient use of distributed processing power
- Vendor support for DBMS servers (Oracle, MS)
[Figure: Application sends an SQL statement to the DBMS Server, which holds the Data]
22- Transaction Servers
- The client invokes remote procedures that reside on the server (a procedure is a collection of SQL statements). The SQL statements either all succeed or all fail as a unit.
- Creation of a distributed application by writing the code for both the client and the server
- OLTP (On-Line Transaction Processing) - mission-critical applications requiring a 1-3 sec. response time 100% of the time
[Figure: Applications send an Invocation to TP Objects, which use the DBMS; a Result is returned]
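The all-or-nothing behaviour of a server-side procedure can be illustrated with an in-process SQLite database standing in for the DBMS. This is a sketch under that assumption; the table, the account names, and the `transfer` procedure are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a small in-memory database with two accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Run two UPDATEs as one unit: both commit or both roll back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A failed transfer leaves both balances untouched, even though the first UPDATE had already run when the error was detected.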
23- Groupware Servers
- Putting people in direct contact - group project management, access to email, etc.
- Groupware software is distributed through the network
- Vendor specific software
[Figure: Applications connected to a Groupware Server]
24- Object Application Servers
- Application software is written as a set of communicating objects
- Client objects communicate with server objects using an Object Request Broker (ORB)
- The client invokes a method on a remote object
- ORB locates an instance of that object server class, invokes the requested method, and returns the results to the client object
- Server objects must provide support for concurrency and sharing
- CORBA - an emerging technology for distributed systems
[Figure: client Application sends an Invocation through ORBs to server Objects; the Return travels back through the ORB]
25- Web Application Servers
- Thin, portable, universal client idea
- Superfat servers with stored documents
- Communication through the HTTP protocol (an RPC-like protocol)
- Clients have extended GUI capabilities
- Evolving model - bringing web and objects together
[Figure: client Application (HTML Forms, Java) communicates via HTTP over TCP/IP across the Internet with a server providing CGI and HTML Documents]
26Fat Servers or Fat Clients
- Fat client
- More traditional form of C/S configuration
- Main processing on the client
- Fat server
- Easier to manage and deploy on the network
- Main processing on the server
- Ultra-thin clients
- Mobile communication devices
- Groupware, transaction, and web servers are examples of fat servers
- Database and file servers are examples of fat clients
- Distributed objects can be either
27MULTISERVER ARCHITECTURES
- Splitting the C/S application into functional units and distributing them over multiple computers
- Functional units
- User interface - GUI
- Business logic - Application
- Shared data - DB
- Multi-tier architectures depend on
- The split of the application into functional units and their distribution
- The middleware used to communicate between the tiers
28Alternative Client-Server Organizations
29 1-Tier Architecture
- All modules on one computer - there is no distribution
- Advantage - simplicity, cost
- Problems relate to the old computing paradigm - for example - no scaling up
[Figure: Application, GUI, and DB on a single computer]
30 2-Tier Architecture
- Remote presentation architecture
- Distributed presentation architecture
31- Distributed programs architecture
- Remote data architecture
- Distributed data architecture
32- Advantages
- Simplicity
- Fast development of 2-tier application systems
- Problems
- Systems do not scale up
- Difficulty in managing fat clients
33 3-Tier Architecture
- Basic 3-tier architecture
- Thin-client 3-tier architecture
34- Thick-client 3-tier architecture
- Hybrid 3-tier architecture
35 3-Tier Architecture Summary
- The most popular and flexible configuration
- Configuration
- 1st tier - Presentation devices and software (GUI)
- 2nd tier - Mission critical server (Web server, Application server)
- Gateway to back-end application servers
- 3rd tier - Data, software applications, additional software
- Less complex system administration
- Good performance and excellent scaling up
- Excellent application reuse
- Legacy applications integration
- Excellent Internet support - clients on low-bandwidth connections
- Heterogeneous database support
- Excellent hardware and software flexibility
36n-Tier Architecture
- The most recent trend
- Architecture for serious application systems (large-scale systems) with many clients and real-time DBs
- Bridging several application systems into a large enterprise and inter-enterprise infrastructure
- Inter-enterprise transactions
- E-business: inventory, planning, ordering, accounting, banking, production, delivery
- E-bidding
- Challenge
- Bridging distributed software components together
- Synchronization in a distributed environment
- Fault-tolerant computing - fault detection and recovery
37- Architecture 1 - Middle-tier is a gateway/dispatcher/scheduler
- Architecture 2 - Middle-tier is a cluster of cooperating services
38- Benefits of n-tier architecture
- Embedded load balancing (gateway/dispatcher/scheduler)
- Easy scaling up
- Development of large systems / applications in small steps
- A cluster of small applications
- Easy gradual testing
- Gradual deployment of large systems
- Small development teams
- Incremental development
- Risk reduction of system development (53% of IT projects fail)
- Component reuse
- Can be sold separately
- Can be used by other applications
- Integration with off-the-shelf components
- Suite of applications
- Enterprise system (a win-win situation)
- Component environments don't get older - they only get better
39- Problems
- Providing robust communication between components
- Component integration through middleware
- Difficult plug-and-play capability
- Traffic increase
- Many fault scenarios
40Proxy Server and Caches
- Cache memory - analogy to microprocessor system design
- A storage of recently used data
- Small size storage
- Elimination of unnecessary bus transfer cycles
- Explain the process
[Figure: µP, CM (cache memory), C, and Memory connected by a Bus]
41- Cache distribution
- Within a client
- On a proxy server
- Caches are used extensively in practice
- Traffic reduction
- Web browser maintains a cache of recently visited web pages
- Web proxy servers
- They provide a shared cache of web resources for the client machines at a site or across several sites
- The purpose of proxy servers
- Increase availability and performance of the service by reducing the load on the wide-area network and web servers
- Access remote web servers through a firewall
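The traffic-saving effect of a shared proxy cache can be sketched as follows. This is a deliberately simplified model: it never expires or revalidates entries, which a real web proxy must do, and the class name and the `fetch_from_origin` callback are hypothetical.

```python
from typing import Callable, Dict


class CachingProxy:
    """Serve repeated requests from a shared cache instead of the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        if url in self._cache:
            self.hits += 1  # answered locally: no wide-area traffic
            return self._cache[url]
        self.misses += 1  # first request: go to the origin server
        body = self._fetch(url)
        self._cache[url] = body
        return body
```

Every hit is a request that never crosses the wide-area network, which is the load reduction the slide refers to.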
42Peer Architecture
- All processes play similar roles - no distinction between clients and servers
- Peers interact cooperatively to perform a distributed activity
- Peers are distributed components that monitor a common resource (a blackboard) to view and interactively modify the data posted and shared
- Major problem - coordination
43Other Concepts - Mobile Agents
- Mobile agent is a running program (code and data)
- Mobile agent travels from one computer to another over a network with a given mission
- May make many invocations to local resources
- May go back to the source point of its journey
- Problems
- They are a potential security threat - secret sniffers, silent viruses, etc.
- They may not accomplish their mission due to faults, data constraints, or mission specification
44Other Concepts - Mobile Devices
- A form of distribution - spontaneous networking or dynamic networking
- Connecting mobile and non-mobile devices to networks
- A form of a very flexible network architecture and inter-device communication possibilities
- Require a special type of middleware to build distributed applications
- Key features
- Easy connection to a local network - no cabling
- Easy integration with local services - no special configuration is needed - services are broadcast
- Serious problems
- Limited connectivity
- Security and privacy
- Handling dynamic change of location and IP address
- Robustness, etc.
45C/S BUILDING BLOCKS
- The three basic building blocks
- Figure II-3-6
- Server
- Server side of the application - typically
- SQL database server, TP Monitors, Groupware servers, Object servers, the Web
- Support for Distributed System Management (DSM)
- A simple agent
- A managed PC
[Figure: Client, Middleware, Server]
46- Client
- Client side of the application - for example, web browser
- Non-GUI clients
- Without multitasking - ATM machines, barcode readers, cellular phones, fax machines, etc.
- Requiring multitasking - robots, testers, etc.
- GUI clients
- Graphical windows with dialog boxes - Figure II-5-8
- Object Oriented User Interfaces (OOUI)
- More sophisticated environments, highly iconic, with interactive manipulation of graphical components
- A visual desktop metaphor - Figure II-5-9
- The GUI/OOUI evolution - Figure II-5-10
- From Web pages to Shippable Places (virtual world) - Figure II-5-11
47- Middleware
- The nervous system of a C/S infrastructure
- Three categories - Figure II-3-6
- Transport Stack
- Network Operating System
- Service Specific Middleware
- Also contains DSM components
- Middleware for n-tier architecture must provide a platform for
- running server-side components, balancing loads, managing the integrity of transactions, maintaining high availability, supporting security, and providing C/S communication pipes - Figure II-3-7
- Platforms
- The application servers that run the server-side components
- Used across different OSs to provide a unified view of the distributed environment
- Web server, Object Transaction server, TP Monitor
- Pipes
- Provide the intercomponent communication
48NETWORK OPERATING SYSTEM
- Task of an OS
- Provide problem-oriented abstractions of the underlying physical resources - the processors, memory, communications, and storage
- System layers - Figure I-6-1
- Kernels and processes are executable components (programs) that manage resources
- Encapsulation - provide a useful service interface to their resources
- Protection - protection from illegitimate access
- Concurrency - provide access by many clients
- One of the kernels processes executes application code
49Core OS functionality
- Figure I-6-2
- OS components
- Process manager
- A process is a unit of resource management, including an address space and one or more threads
- Thread manager
- Creation, synchronization and scheduling of threads
- Communication manager
- Communication between threads attached to different processes on the same computer. A kernel may support remote thread communication.
- Memory manager
- Management of physical and virtual memory of a computer
- Supervisor
- Dispatching of interrupts, system call traps and other exceptions
50Processes and Threads
- The division into processes and threads is driven by the multitasking and concurrency requirements placed on a single computer
- Process creation is expensive but secure
- Thread creation is fast, but threads share the same execution environment
- A process consists of
- An execution environment
- An address space
- Thread synchronization and communication resources such as semaphores and communication interfaces (sockets)
- Higher-level resources such as open files and windows
- One or more threads
- A thread is the operating system abstraction of an activity
- Threads can be created or destroyed dynamically as needed to maximize the degree of concurrency
- The future belongs to multi-threaded processes
51[Figure: processes and threads. Labels: Kernel (Application), Process (Computation), Process (In/Out), Execution Environment, Thread 1, Thread 2, Thread 3, Process]
52- Benefits of multi-threaded systems
- Creating a new thread within an existing process is cheaper than creating a process
- Switching to a different thread within the same process is cheaper than switching between threads belonging to different processes
- Process switching is executed by a system call to the OS
- Switching processes causes an extensive OS overhead for memory management, etc.
- Threads within a process may share data and other resources conveniently and efficiently compared with separate processes
- But by the same token, threads within a process are not protected from one another
53Multi-Thread Architectures
- Threads on a single processor server
- Help to maximize the throughput
- Architectures for multi-threaded servers
- The worker pool architecture - the simplest architecture
- Advantage: Prioritized queue
- Problems: Inflexibility
[Figure: Requests arrive at an In/Out thread, which places them on a queue served by a pool of Threads (workers)]
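A worker pool with a prioritized queue can be sketched with the standard `queue` and `threading` modules. To keep the example short, the requests are queued up front rather than by a separate I/O thread, and each worker only records the name of the request it picked up; the function name and the `(priority, name)` request format are hypothetical.

```python
import queue
import threading


def run_worker_pool(requests, num_workers=3):
    """Serve (priority, name) requests with a fixed pool of worker threads.

    A lower priority number means a more urgent request.
    Returns the request names in the order the workers picked them up.
    """
    pending = queue.PriorityQueue()
    for item in requests:
        pending.put(item)

    served = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                _priority, name = pending.get_nowait()
            except queue.Empty:
                return  # queue drained: this worker is done
            with lock:  # the result list is shared by all workers
                served.append(name)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return served
```

The pool size is fixed when the server starts, which is the inflexibility the slide mentions: it cannot grow when the request rate rises.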
54- Thread-per-request architecture
- Each request generates a thread
- Thread is destroyed after the execution is finished
- Advantages: Throughput is potentially maximized
- Problems: The overhead of thread creation and destruction
[Figure: Requests arrive at an I/O thread, which creates Workers that act on Remote Objects]
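For contrast with a fixed pool, a thread-per-request server creates one short-lived thread for every request. The sketch below assumes a trivial, hypothetical `handle` function (it just reverses the request text) in place of real work on remote objects.

```python
import threading


def handle(request: str) -> str:
    """Stand-in for the real request processing."""
    return request[::-1]


def serve_thread_per_request(requests):
    """Create one thread per request; each thread ends with its request."""
    results = [None] * len(requests)

    def run(index: int, request: str) -> None:
        results[index] = handle(request)

    threads = []
    for index, request in enumerate(requests):
        t = threading.Thread(target=run, args=(index, request))
        t.start()  # thread creation cost is paid on every request
        threads.append(t)
    for t in threads:
        t.join()
    return results
```

No request waits in a queue, but the creation and teardown cost is paid every time, which matches the trade-off listed on the slide.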
55- Thread-per-connection architecture
- Associates a thread with each connection - a new worker thread services a single client - the thread is destroyed when the connection is closed
- Advantages: Low thread management overhead
[Figure: Requests, per-connection Workers, Remote Objects]
56- Thread-per-object architecture
- A thread associated with each remote object
- Per-object queuing by I/O thread
- Advantages: Resource-driven design; very low thread management overhead
- Problems: Scaling up
[Figure: Requests, per-object Workers, Remote Objects]
57DISTRIBUTED FILE SYSTEMS
- Distributed file system supports the sharing of information in the form of files throughout the Internet
- To enable programs to store and access remote files exactly as they do local ones - allows users to access files from any computer in the Internet
- File transparency feature
- A part of NOS
- Sun Network File System - NFS - a case study
- Basic distributed file systems provide an essential support for organizational computing based on Intranets
- File caching concept - similar to memory caching
- Caching on the server and client side
- Caching on the client side is more important
58File System Requirements
- Transparency
- Usually the most heavily loaded service on the Internet
- Access transparency
- Clients should be unaware of file distribution
- The same file access operations are used for local and remote files
- Location transparency
- Files can be relocated without changes to the pathnames
- Mobility transparency
- Neither client programs nor system administration tables in client nodes need to be changed when files are moved to another location
- Scaling transparency
- The service can be expanded by incremental growth to deal with a wide range of loads and network sizes
59- Concurrent file updates
- Changes to a file by one client should not interfere with the operations of other clients simultaneously accessing or changing the same file
- File replication
- A file may be represented by several copies of its contents at different locations
- Hardware and OS heterogeneity
- Fault tolerance
- The service continues to operate in the face of client and server failures
- Stateless server (web server)
- Consistency
- Across multiple copies/replicas - a delay exists in replica update
60File Service Architecture
- Figure I-8-5
- Components
- Flat file service
- Operations on the contents of files
- Unique File Identifiers (UFIDs) are used to refer to files
- Directory service
- Provides mapping between file names and their UFIDs
- Provides hierarchical organization of a file system
- Client module
- Integrates and extends the operations of the flat file service and the directory service under a single application programming interface
- Holds information about network resources (locations of the flat file server and directory server processes)
61- Flat file service interface
- RPC service used by client modules - it is not used directly by the user-level programs
- Typical operations
- Read and Write
- Create and Delete
- GetAttributes and SetAttributes
- In comparison with the UNIX OS, it does not have Open and Close operations - files can be accessed immediately
- Access control
- Access rights check is implemented on the server using the user's ID
- Two approaches - both support stateless server implementation
- Access check when a file name is converted into a UFID
- Access check with every client request (any operation on a file)
- Directory service interface
- Translation of file names into UFIDs
- Hierarchic file system - a tree structure of file directories
- File groups - for moving files across servers, etc.
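The split between the flat file service and the directory service can be sketched as two small classes. This is an in-memory illustration only: real UFIDs are meant to be unique across the whole system, whereas here they are plain integers, and the class and method names are hypothetical. Attributes and access checks are left out.

```python
import itertools


class FlatFileService:
    """Operations on file contents, addressed by UFID only."""

    def __init__(self) -> None:
        self._files = {}  # UFID -> bytearray
        self._next_ufid = itertools.count(1)

    def create(self) -> int:
        ufid = next(self._next_ufid)
        self._files[ufid] = bytearray()
        return ufid

    def write(self, ufid: int, offset: int, data: bytes) -> None:
        buf = self._files[ufid]
        if offset > len(buf):  # pad any gap with zero bytes
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data

    def read(self, ufid: int, offset: int, count: int) -> bytes:
        return bytes(self._files[ufid][offset:offset + count])

    def delete(self, ufid: int) -> None:
        del self._files[ufid]


class DirectoryService:
    """Maps text names to UFIDs; knows nothing about file contents."""

    def __init__(self) -> None:
        self._names = {}

    def add_name(self, name: str, ufid: int) -> None:
        self._names[name] = ufid

    def lookup(self, name: str) -> int:
        return self._names[name]
```

Each read and write carries the UFID and the offset, so the server keeps no per-client state between calls. That is why no Open or Close operation is needed.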
62Sun NFS
- Architecture - Figure I-8-8
- Client integration
- User programs can access files via UNIX system calls without recompilation or reloading
- The encryption key is used to authenticate user IDs passed to the server
- Buffering and caching in/out data
- Virtual file system
- Provides access transparency
- Distinguishes between local and remote files
- Integration between UNIX and non-UNIX remote file servers
63- Mount service
- Allows for mounting the remote file system on a
given machine - Figure I-8-10
- Server caching
- Used for improved performance
- Read-ahead concept
- Delayed-write concept
- UNIX sync operation flushes altered cache pages every 30 seconds
- Client caching
- Used to reduce the number of requests across the network
64SCALING UP
- Scaling up is one of the strategic requirements
- Goals
- Serve more clients
- Reduce traffic
- Protect the system against a crash
- Grow the service
- Increase the productivity / decrease the costs
- Scaling up must target
- Infrastructure growth (architectural issues)
- Single computer hardware modifications
- Application software architecture, design and
implementation
65Server Scalability
- Goal: To extend the upper limits of a server
- Evolution - Figure II-5-3
- PC Server
- Asymmetric Multiprocessing Superserver
- Symmetric Multiprocessing Superserver
- Multiserver Cluster
- Multiprocessor servers
- Multiple processors in one box
- High-speed disk arrays for intensive I/O
- Fault-tolerant processing features
66- Asymmetric multiprocessing - Figure II-5-4 - Simple solution
- Only one processor (master processor) runs the OS
- Coordinates all processing
- Divides the tasks among specialized processors
- Other processors are used as workers (slaves)
- Problems
- Some processors can be temporarily overloaded
- Symmetric multiprocessing (SMP) - Advanced solution
- All processors are equal
- Applications are divided into threads that can run concurrently on any available processor
- Any processor in the pool can run the OS kernel
- OS support is needed: OS kernel, global scheduler, shared I/O structure
- Fully multiprocessor hardware is needed with shared memory and local instruction caches
- Applications must be written in a way that supports multithreaded processing
67- Clusters
- Made of a group of interconnected SMP machines behaving like a single system
- High-speed LAN is frequently used as the interconnection
- Types of clusters - Figure II-5-5
- Shared-disk cluster
- Shared-nothing cluster - provides a very high level of parallelism
- Clusters provide high availability because
- SMP machines do not share memory or synchronized caches, so failures are contained within a single node
- Some form of "I'm alive" mechanism is needed to monitor the health of cluster components
- Advanced OSs support server clusters - Figure II-6-2
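An "I'm alive" mechanism is often built as a heartbeat with a timeout. The sketch below assumes that approach; the class name is hypothetical, and the current time is passed in explicitly so that the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Flag a node as failed when its last heartbeat is older than timeout."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self._last_seen = {}  # node name -> time of its last heartbeat

    def beat(self, node: str, now: float) -> None:
        """Record an "I'm alive" message from a node."""
        self._last_seen[node] = now

    def failed_nodes(self, now: float):
        """Return the nodes that have been silent for longer than timeout."""
        return sorted(
            node
            for node, seen in self._last_seen.items()
            if now - seen > self.timeout
        )
```

A silent node is not necessarily a crashed one, since a slow network looks the same from the monitor's side, so the timeout is a trade-off between fast detection and false alarms.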
68Scaling-Up vs. Scaling-Out
[Figure: Scaling Up and Scaling Out]
69- Scaling up is the traditional approach - get a bigger and bigger server (cluster)
- A complex solution
- Time consuming solution to implement
- Best when applied to DB servers
- Scaling out - a new approach - use multiple small servers
- Simple and inexpensive solution ??? - not always
- Fast deployment
- Typical for web servers and e-commerce
- Temporary solution
70Other Solutions To Scalability
- Dynamic-caching (on a Proxy server)
- Applies updates to portions of a page that actually changed
- Condenser sits between Content Site and the ISP
- Condenser is an intelligent agent
- Understands page content (sections)
- Can self-adjust to network conditions - regulating frequency of updates, etc.
- Condenser can reduce the traffic by 90%
[Figure: Client, Proxy Server, Content Provider]
71- Replication of the same resources
- Replication of service
- Replication of data
- Mainly used for global systems with users spread over multiple and/or larger geographic regions
- US server (East coast, West coast), Europe server, Asia server
- Clear advantage for information providers
- Problems
- Update and synchronization of replicas for systems with frequent data updates
72New Architectures: Storage-Area-Network (SAN)
- Crossing boundaries of storage limits (size and access)
- Architecture
[Figure: Servers (each with Users on LAN) connected through a Fibre Channel Hub / Switch to Disk Arrays]
73- Provides very fast data access for widely distributed users
- High-speed (>1 Gbit/sec) dedicated subnetwork connecting storage disks or tapes with their associated servers
- Long-distance fibre connections (<10 km)
- Provides the most efficient use of storage devices (sharing over a number of LANs)
- Designed to support
- Disk mirroring
- Backup and restoration
- Archiving and retrieving
- Data migration among storage devices
- Great solution to data warehousing
- Problems
- Complex technology, high cost, experienced people needed
74FAULT-TOLERANT COMPUTING
- Fault tolerant computing describes an environment that provides continuous, uninterrupted service - access to data and application programs - even when a hardware, software or network component fails
- Fault tolerance is about true redundancy
- Provided by
- hardware
- software
- combination of hardware and software
- Typical users
- Financial institutions
- Airline institutions
- E-commerce
75Fault-Tolerance vs. High-Availability
- Both
- Designed to maximize application and server availability
- Use of backup resources - like mirrored servers and disks - for recovering from failure
- Goal of availability
- To recover from a crash quickly
- Goal of fault-tolerance
- To eliminate the recovery time completely
- Less than 5 min of downtime a year
- Fault-tolerant configurations feature a high degree of built-in hardware redundancy, serviceability and remote management capabilities
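The "less than 5 min of downtime a year" target can be related to an availability percentage with simple arithmetic. The sketch assumes a 365-day year; the function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime for an availability given as a fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def required_availability(max_downtime_minutes: float) -> float:
    """Availability needed to stay within a yearly downtime budget."""
    return 1.0 - max_downtime_minutes / MINUTES_PER_YEAR
```

An availability of 99.999% ("five nines") allows about 5.26 minutes of downtime a year, so a budget of under 5 minutes requires slightly more than five nines.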
76Architecture 1: Process pair
- A primary process and a backup process run on separate processors
- The backup process mirrors all the information in the primary process and can take over if the primary processor fails
- Comment
- This is not the best architecture for fault-tolerant computing, because the server itself may still fail
- Trend - multiserver architectures
77Architecture 2: A four-server architecture
[Figure: two Computational Processors (CPU + Memory) connected by High-Speed Links to two I/O Processors; each I/O Processor has Disk Storage (Mirrored Storage) and a 100 Base-T connection to the LAN]
78DB SERVER ARCHITECTURES
- SQL server
- Manages the control and execution of SQL commands
- Provides logical and physical views of the data and generates optimized access plans for executing the SQL commands
- Server administration features and utilities that help manage the data
- All other functions related to concurrency, security, and consistency
- Process-per-Client Architecture - Figure II-10-2
- Provides maximum separation - separate address spaces
- Advantages
- Users are directly protected from each other
- Processes can be assigned to separate processors (SMP machine)
- Problems
- Uses more memory and CPU - potentially slower solution
- Relies on OS supporting SMP
79- Multithreaded Architecture - Figure II-10-3
- Single multithreaded process
- Advantages
- Best performance by running all the user
connections, applications, and the database in
the same address space
- Does not rely on OS
- Conserves memory and CPU cycles - does not
require frequent context switching
- Problems
- A misbehaving application can more easily bring
the whole server down
- Hybrid Architecture - Figure II-10-3
- Consists of three components: multithreaded
network listeners, dispatcher tasks, and reusable
shared server worker processes
- Advantage
- Protected environment without requiring
significant memory
- Difficult to bring down
- Problems
- Queue latencies can be a problem
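The hybrid arrangement (listener, dispatcher queue, reusable workers) can be sketched as below. This is a simplified illustration under stated assumptions: threads stand in for the shared server worker processes, `payload.upper()` stands in for executing a SQL command, and `run_hybrid_server` is a hypothetical name.

```python
import queue
import threading


def run_hybrid_server(requests, num_workers=2):
    """Push client requests through a shared queue served by a pool of reusable workers."""
    request_queue = queue.Queue()  # dispatcher queue; waiting here is the "queue latency"
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            item = request_queue.get()
            if item is None:  # sentinel: no more work for this worker
                request_queue.task_done()
                break
            client_id, payload = item
            reply = payload.upper()  # placeholder for real request execution
            with results_lock:
                results[client_id] = reply
            request_queue.task_done()

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Listener role: accept requests and hand them to the dispatcher queue.
    for client_id, payload in requests:
        request_queue.put((client_id, payload))

    # One shutdown sentinel per worker, then wait for everything to finish.
    for _ in workers:
        request_queue.put(None)
    request_queue.join()
    for w in workers:
        w.join()
    return results
```

With more clients than workers, requests wait in the queue until a worker is free, which is the latency problem noted above. The worker pool stays small regardless of the number of clients, which is where the memory saving comes from.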
80DATA WAREHOUSES
- Motivation
- Explosion of data stored on computers and data
sources
- Business change - intelligent and fast decisions
based on advanced analysis of available data
- Data warehousing provides foundation technology
for creating intelligent clients
- Emerging technology with exceptional growth
- 2nd-tier (or 3rd-tier) of multi-computer
architecture taken by
- OLTP - On-line Transaction Processing
- Time critical systems
- Mission critical systems
- DSS - Decision Support Systems
- Analyzing and finding the right information and
presenting it to a decision maker
81Elements of Data Warehouse
- Data warehouse
- An active intelligent store of data that can
manage and aggregate information from many
sources, distribute it where needed, and activate
business policies
- Top level architecture - Figure II-12-1
- Operational Data
- Data Replication Manager
- Manages the copying and distribution of data
across databases
- Transformation, cleansing, replication
- Informational Database
- Goal specific subset of operational data
- Metadata - data about data - describes contents
of the DB
- Information Directory
- Information hound element - defines what kind of
data should be collected
- EIS/DSS Tool Support - intelligent data analysis
82Warehouse Hierarchies - The Datamarts
- Architecture - Figure II-12-2
- All data extracts from production databases are
first applied against an enterprise data
warehouse
- Data from the enterprise warehouse can be
distributed / replicated (as needed) to
departmental (goal-oriented) warehouses, also
known as datamarts
- Datamarts are organized by subject (sales data,
product data, etc.)
- Why such an organization?
- Data warehousing is a large project - carried out
in increments, from a single datamart to the
next (collectively called the data warehouse)
- A business case to order the data
83REPLICATION ARCHITECTURES
- Motivation
- A key to providing high availability and fault
tolerance in distributed systems
- Used to remove capacity, performance, and
organizational roadblocks of centralized data
access
- Data replication and transformation process
- Figure II-12-5
- Specialized middleware provides the glue in a
multi-server DB replication environment
- Replication transparency
- Clients should not have to be aware that multiple
physical copies of data exist
84Refresh and Updates
- Refresh
- Architecture Figure II-12-6
- Replace the entire target with data from the
source
- Update
- Architecture Figure II-12-7
- Send only the changed data to the target
- Synchronous update - high availability
applications
- Asynchronous update - data warehousing
applications
- Staging - Figure II-12-8
- Broadcasting
- Can be considered a form of replication
service
- The source gets no feedback on the update
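The difference between refresh and update can be sketched with plain dictionaries standing in for the source and target databases. The function names (`refresh`, `diff`, `apply_update`) are hypothetical and chosen only for this illustration.

```python
def refresh(target: dict, source: dict) -> None:
    """Refresh: replace the entire target with the data from the source."""
    target.clear()
    target.update(source)


def diff(old: dict, new: dict):
    """Compute the changed data only: new or modified entries, plus deleted keys."""
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    deleted = [k for k in old if k not in new]
    return changed, deleted


def apply_update(target: dict, changed: dict, deleted: list) -> None:
    """Update: apply only the changes to the target."""
    target.update(changed)
    for key in deleted:
        target.pop(key, None)
```

A refresh ships the whole source every time, while an update ships only what `diff` returns. Whether `apply_update` runs before the source transaction completes (synchronous) or some time later (asynchronous) is a scheduling decision outside this sketch.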
85Passive Replication Architecture
- Architecture
- Figure I-14-4
- Front ends for clients - communication function
only
- A single primary replica
- One or more secondary replicas (backups)
- Characteristics
- Clients communicate with the primary replica only
- Primary replica executes operations and sends
copies of updated data to the backups
- If the primary fails, one of the backups is
promoted to act as the primary
86- Communication sequence
- Request
- Made by a client to the primary replica with an
attached request identifier
- Coordination
- The primary checks whether the request has already
been executed. If it is a duplicate, the primary
re-sends the stored response
- Execution
- The primary executes the request and stores the
response
- Agreement
- If the request is an update, then the primary
sends the updated state, the response, and the
identifier to all backups
- The backups send an acknowledgment
- Response
- The primary responds to the client
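The five phases above can be sketched as a single-process simulation. This is an illustrative sketch only: the class names, the `handle` signature, the key-value state, and the `promote` helper are assumptions, and network messages are replaced by direct method calls.

```python
class Backup:
    """Secondary replica: stores state and responses pushed by the primary."""

    def __init__(self):
        self.state = {}
        self.responses = {}

    def apply(self, request_id, new_state, response):
        self.state = dict(new_state)
        self.responses[request_id] = response
        return True  # acknowledgment


class Primary:
    def __init__(self, backups):
        self.state = {}
        self.responses = {}  # request_id -> stored response
        self.backups = backups

    def handle(self, request_id, op, key, value=None):
        # Coordination: a duplicate request gets the stored response again.
        if request_id in self.responses:
            return self.responses[request_id]
        # Execution: run the operation and store the response.
        if op == "write":
            self.state[key] = value
            response = "ok"
        else:
            response = self.state.get(key)
        self.responses[request_id] = response
        # Agreement: for updates, send state, response, and id to all backups.
        if op == "write":
            acks = [b.apply(request_id, self.state, response) for b in self.backups]
            if not all(acks):
                raise RuntimeError("a backup did not acknowledge the update")
        # Response: reply to the client.
        return response


def promote(backup, remaining_backups):
    """Failover: build a new primary from a backup's copy of the state."""
    new_primary = Primary(remaining_backups)
    new_primary.state = dict(backup.state)
    new_primary.responses = dict(backup.responses)
    return new_primary
```

Because the backups also hold the stored responses, a promoted backup can still recognize a request that the failed primary had already executed and answer it without executing it twice.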
87Active Replication Architecture
- Architecture
- Figure I-14-5
- Characteristics
- Replicas are state machines that play equivalent
roles and are organized as a group
- Front-ends multicast their requests to the group
of replicas
- Replicas process the request independently and
reply
- Front-ends collect and compare replies
- This architecture can tolerate many failures
- More replicas are needed to support a voting
process at a front end
88- Communication sequence
- Request
- Multicast a request with its identifier to the
group of replicas
- Coordination
- The group communication system delivers requests
to all available replicas
- Execution
- Every replica executes the request independently
- Agreement
- No agreement phase is needed
- Response
- Each replica sends its response to the front end
- Front end collects responses, checks their
consistency, and replies to the client
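The active replication sequence, including voting at the front end, can be sketched as follows. This is a simplified simulation: the multicast is a loop that delivers each request to every replica in the same order, the `faulty` flag is a hypothetical way to inject a wrong reply, and majority voting is one possible way to "compare replies".

```python
from collections import Counter


class Replica:
    """State machine replica; all replicas play the same role."""

    def __init__(self, faulty=False):
        self.state = {}
        self.faulty = faulty  # hypothetical switch to simulate a bad replica

    def execute(self, request_id, op, key, value=None):
        if op == "write":
            self.state[key] = value
            result = "ok"
        else:
            result = self.state.get(key)
        return "garbage" if self.faulty else result


class FrontEnd:
    def __init__(self, replicas):
        self.replicas = replicas
        self.next_id = 0

    def request(self, op, key, value=None):
        self.next_id += 1  # request identifier
        # Request + coordination: deliver the request to every replica.
        replies = [r.execute(self.next_id, op, key, value) for r in self.replicas]
        # Response: collect replies and accept the value a majority agrees on.
        winner, votes = Counter(replies).most_common(1)[0]
        if votes <= len(replies) // 2:
            raise RuntimeError("no majority among replica replies")
        return winner
```

With three replicas, one of them faulty, the two correct replies outvote the wrong one. Masking more faulty replicas by voting requires a larger group, which is the cost noted in the characteristics above.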
89The Gossip Architecture
- Architecture
- Figure I-14-6
- Highly available service but of weaker
consistency
- Application: bulletin boards, where slightly
out-of-date information is acceptable
- Characteristics
- Replicated data is close to the points where
groups of clients need it
- Two basic types of operations
- Queries - read-only (read-only replicas)
- Updates - change without reading (update
replicas)
- Replicas exchange gossip messages periodically
in order to convey the updates they received
from clients
- All replicas eventually receive all updates -
convergence over time
- The architecture has weaker consistency, due to
the causal nature of updates, but such updates are
less costly
90- Scalability of the gossip architecture
- A problematic issue - if the number of replicas
grows, then the traffic of gossip messages grows
- Solution - increase the number of read-only
replicas and place them closer to the clients
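Convergence through gossip, and the growth of gossip traffic, can be sketched as below. This is a heavily simplified illustration: updates are identified by a hypothetical `(counter, replica_name)` tuple instead of the vector timestamps a real gossip service would use, and the traffic formula assumes every replica gossips to every other replica once per round.

```python
class GossipReplica:
    def __init__(self, name):
        self.name = name
        self.log = {}  # update_id -> (key, value)

    def update(self, update_id, key, value):
        """Update operation: record a change without reading."""
        self.log[update_id] = (key, value)

    def query(self, key):
        """Query operation: read-only; may return slightly out-of-date data."""
        value = None
        for update_id in sorted(self.log):  # later update ids override earlier ones
            k, v = self.log[update_id]
            if k == key:
                value = v
        return value

    def gossip_to(self, other):
        """Gossip message: pass along every update the other replica is missing."""
        for update_id, entry in self.log.items():
            other.log.setdefault(update_id, entry)


def gossip_messages_per_round(num_replicas: int) -> int:
    """Messages per round if every replica gossips to every other one (assumption)."""
    return num_replicas * (num_replicas - 1)
```

A replica that has not yet received a gossip message answers queries from its old log, which is the weaker consistency. Under the all-to-all assumption the message count grows quadratically with the number of update replicas, while read-only replicas only receive gossip and so add far less traffic.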