Title: Distributed Databases
1Distributed Databases
- Frank Chen
- Elise Edson
- Jian Gao
- Nathan Waldman
- Phil Wang
- Hui Zhang
2Why Distribute Data?
- Centralized DBMS vs. DDBMS
- Several factors have led to the development of DDBMSs
- Distributed nature of some database applications
- Increased reliability and availability
- Allowing data sharing while maintaining some measure of local control
- Improved performance
3The Voltron Analogy
4Additional Functionality
- Distribution leads to increased complexity in the system design and implementation
- A DDBMS must provide functions in addition to those of a centralized DBMS. Some of these are:
- To keep track of the data distribution and replication in the DDBMS catalog
- To devise execution strategies for queries and transactions that access data from more than one site
- To decide which copy of a replicated data item to access
- To maintain the global conceptual schema of the distributed database
5Date's 12 Rules of a DDBMS
- 1. Local autonomy
- 2. No reliance on a central site
- 3. Continuous operation (Failure independence)
- 4. Location independence
- 5. Fragmentation independence
- 6. Replication independence
6Date's 12 Rules of a DDBMS
- 7. Distributed query processing
- 8. Distributed transaction processing
- 9. Hardware independence
- 10. Operating system independence
- 11. Network independence
- 12. Database independence
7Today's Topics
- Architecture Design
- Transparency
- Functionality
8Architecture Design
9Introduction
- Distributed database
- A logically interrelated collection of shared data (and a description of this data) physically distributed over a computer network
- Distributed DBMS
- The software system that permits the management of the distributed database and makes the distribution transparent to users
10Overview
11Distributed DBMS vs. Distributed Processing
[Diagram: four sites connected by a computer network in each case. With distributed processing, a single database at one site is accessed from the others; with a distributed DBMS, databases reside at several of the sites.]
12Distributed DBMS vs. Parallel DBMS
- Parallel DBMS
- A DBMS running across multiple processors and disks that is designed to execute operations in parallel, whenever possible, in order to improve performance
- Parallel DB Architectures
- Shared Memory
- Shared Disk
- Shared Nothing
13Architectural Dimensions
- Three dimensions of distributed database systems
- Autonomy of the individual sites comprising the system
- Distribution of data throughout the system
- Heterogeneity of hardware, operating system, data model, and/or DBMS across the various sites
14Autonomy
- No Autonomy
- The various sites/machines are tightly integrated and cannot function independently
- Semi-Autonomous
- The various sites/machines are able to process independently, but their cooperation involves some modifications to their otherwise independent processing
- Fully Autonomous
- The various sites/machines function in complete isolation, knowing nothing of and being unaffected by other sites' processing
15Distribution
- No Distribution
- All of the data is located on one site/machine
- Semi-Distributed
- The data is spread across some sites but not all; there is a differentiation between sites, with some being servers and others clients
- Fully Distributed
- There is no distinction between sites/machines, at least in the sense that data is spread across all of them
16Heterogeneity
- Homogeneous
- All sites use the same DBMS product
- Sites are aware of each other and agree to cooperate in processing user requests
- Each site surrenders part of its autonomy in terms of the right to change schemas or software
- Appears to the user as a single system
17Heterogeneity
- Heterogeneous
- Different sites may use different DBMS products
- Sites need not be based on the same underlying data model
- May be composed of relational, network, hierarchical, and object-oriented DBMSs
- Sites may not be aware of each other and may provide only limited facilities for cooperation in transaction processing
18Interesting Architectures
- Client/server, with multiple servers
- No autonomy; semi-distributed; homogeneous
- Peer-to-peer distributed database
- Semi-autonomous; fully distributed; homogeneous
- Federated (multi-) database
- Full autonomy; fully distributed; heterogeneous
19Architectural Dimensions
20Reference Architecture for a DDBMS
- A DDBMS can be viewed as an extension of the ANSI/SPARC three-schema architecture
- The schema consists of:
- A set of global external schemas
- A global conceptual schema
- A fragmentation schema and an allocation schema
- A set of schemas for each local DBMS
21Reference Architecture for a DDBMS
[Diagram: global external schemas (GES1-GES3) map onto a global conceptual schema (GCS), which maps through the fragmentation schema (FS) and allocation schema (AS) onto each site's local mapping schema (LMS), local conceptual schema (LCS), and local internal schema (LIS) over that site's database.]
Legend: GES = Global External Schema; GCS = Global Conceptual Schema; FS = Fragmentation Schema; AS = Allocation Schema; LMS = Local Mapping Schema; LCS = Local Conceptual Schema; LIS = Local Internal Schema
22Reference Architecture for a Federated MDBS
- From an architectural viewpoint, multi-database distributed systems come in two basic varieties:
- Those that use a global conceptual schema (tightly coupled)
- Those that don't use a global conceptual schema (loosely coupled)
- The former case is a generalization, in a sense, of the peer-to-peer system
23Reference Architecture for a Federated MDBS
[Diagram, tightly coupled case: global external schemas (GES1-GES3) map onto a global conceptual schema (GCS), which maps onto each site's local external schemas (LES), local conceptual schema (LCS), and local internal schema (LIS); each site follows the ANSI/SPARC architecture.]
24Component Architecture for a DDBMS
- Major Components
- Local DBMS (LDBMS) component
- Data communications (DC) component
- Global system catalog (GSC)
- Distributed DBMS (DDBMS) component
- Comparison of strategies for data allocation
25Distributed Relational Database Design
- Fragmentation
- A relation may be divided into a number of sub-relations, called fragments, which are then distributed
- Allocation
- Each fragment is stored at the site with "optimal" distribution
- Replication
- The DDBMS may maintain a copy of a fragment at several different sites
26Distributed Relational DB Design (Data Allocation)
- Strategic Objectives
- Centralized
- Fragmented (or Partitioned)
- Complete Replication
- Selective Replication
27Distributed Relational DB Design (Fragmentation)
- Why Fragment?
- Usage
- Efficiency
- Parallelism
- Security
- Primary disadvantages
- Performance
- Integrity
- Rules to follow during fragmentation
- Completeness
- Reconstruction
- Disjointness
28Distributed Relational DB Design (Fragmentation)
- Types of fragmentation
- Horizontal fragmentation -- consists of a subset of the tuples of a relation
- Vertical fragmentation -- consists of a subset of the attributes of a relation
- Mixed fragmentation -- consists of a horizontal fragment that is subsequently vertically fragmented, or a vertical fragment that is then horizontally fragmented
- Derived horizontal fragmentation -- a horizontal fragment that is based on the horizontal fragmentation of a parent relation
29Distributed Relational DB Design (Methodology Summary)
- Design the global relations.
- Examine the topology of the system.
- Analyse the most important transactions and identify where horizontal or vertical fragmentation may be appropriate.
- Decide which relations are not to be fragmented (but are to be replicated).
- Examine the relations that are on the one-side of a relationship and decide on a suitable fragmentation schema.
- Check for situations where either vertical or mixed fragmentation would be appropriate during the previous step.
30Transparency
31Transparency
- Q: Why do we need transparency?
32Transparency
- Q: Why do we need transparency?
- A: Without DDBMS transparency, applications must be written using special knowledge about the stored data
33Transparency
- Distribution Transparency
- Transaction Transparency
- Performance Transparency
- Heterogeneity Transparency
34Distribution Transparency
- Idea: Users perceive the DB as a single, unified entity
- Types of Distribution Transparency
- Fragmentation
- Location
- Replication
- Local-mapping
- Naming
35Distribution Transparency: Fragmentation
- What is a fragment?
- Horizontal fragments
- Fragmented by SELECTION
- Composed by UNION
- Vertical fragments
- Fragmented by PROJECTION
- Composed by JOIN
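The composition rules above can be checked directly. Below is a minimal Python sketch, assuming a small hypothetical Staff relation, of horizontal fragments built by selection and recomposed by union, and vertical fragments built by projection (keeping the key) and recomposed by a join.

```python
# Hypothetical Staff relation, represented as a list of dicts.
staff = [
    {"staffNo": 1, "name": "Ann",  "salary": 45000, "branch": "NY"},
    {"staffNo": 2, "name": "Bob",  "salary": 28000, "branch": "LDN"},
    {"staffNo": 3, "name": "Cara", "salary": 52000, "branch": "NY"},
]

# Horizontal fragmentation: each fragment is a SELECTION on the relation.
ny_fragment  = [t for t in staff if t["branch"] == "NY"]
ldn_fragment = [t for t in staff if t["branch"] == "LDN"]

# Reconstruction rule for horizontal fragments: UNION of the fragments.
assert sorted(ny_fragment + ldn_fragment, key=lambda t: t["staffNo"]) == staff

# Vertical fragmentation: each fragment is a PROJECTION that keeps the key.
ids_and_names = [{"staffNo": t["staffNo"], "name": t["name"]} for t in staff]
ids_and_pay   = [
    {"staffNo": t["staffNo"], "salary": t["salary"], "branch": t["branch"]} for t in staff
]

# Reconstruction rule for vertical fragments: JOIN on the shared key.
by_key = {t["staffNo"]: dict(t) for t in ids_and_names}
for t in ids_and_pay:
    by_key[t["staffNo"]].update(t)
assert sorted(by_key.values(), key=lambda t: t["staffNo"]) == staff
```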
36Distribution Transparency: Fragmentation
- Mixed/Hybrid fragments
- Horizontal, then vertical fragmentation
- Vertical, then horizontal fragmentation
- Derived (Horizontal) fragments
- Fragment_i is derived from Fragment_j
- Fragmented by SEMIJOIN
37Fragment Types
38Distribution Transparency: Fragmentation
- Idea: Tables can be fragmented to different sites without user awareness
- Example
- SELECT name
- FROM tblStaff
- WHERE salary > 30000
39Distribution Transparency: Fragmentation Example
User perception -- the Employee relation:
  EMP  DEPT  SALARY
  E1   DX    45K
  E2   DY    40K
  E3   DZ    50K
  E4   DY    63K
  E5   DZ    40K
New York fragment:
  EMP  DEPT  SALARY
  E1   DX    45K
  E3   DZ    50K
  E5   DZ    40K
London fragment:
  EMP  DEPT  SALARY
  E2   DY    40K
  E4   DY    63K
40Distribution Transparency: Location
- Idea: Users don't know where the data is but must specify fragment names
- Example
- SELECT name
- FROM tblStaff21
- WHERE staffNo IN
-   (SELECT staffNo FROM tblStaff1
-    WHERE salary > 30000)
41Distribution Transparency: Replication
- Idea: Objects can be copied, and copies are maintained automatically and without user knowledge
- Location Transparency → Replication Transparency
42Distribution Transparency: Replication Example
User perception -- the Employee relation:
  EMP  DEPT  SALARY
  E1   DX    45K
  E2   DY    40K
  E3   DZ    50K
  E4   DY    63K
  E5   DZ    40K
New York fragment (and its replica):
  EMP  DEPT  SALARY
  E1   DX    45K
  E3   DZ    50K
  E5   DZ    40K
London fragment (and its replica):
  EMP  DEPT  SALARY
  E2   DY    40K
  E4   DY    63K
43Distribution Transparency: Local-mapping
- Idea: The user needs to specify fragment names and the location of data items
- Example
- SELECT name
- FROM tblStaff21 AT SITE 3
- WHERE staffNo IN
-   (SELECT staffNo FROM tblStaff1 AT SITE 9
-    WHERE salary > 30000)
44Distribution Transparency: Naming
- Idea: Each attribute in the DDB must have a unique name; the DDBMS ensures this by either
- Creating a central name server
- Prefixing the attribute with a site identifier
- Using aliases (the best solution)
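A minimal sketch of the naming options above, assuming hypothetical site and relation names: prefixing with a site identifier guarantees global uniqueness, while an alias table keeps that detail out of user queries.

```python
# Hypothetical example: globally unique names built by prefixing the
# creating site's identifier, plus an alias map that hides the prefix.
def global_name(site_id: str, local_name: str) -> str:
    """Prefix the creating site's identifier to make the name unique."""
    return f"{site_id}.{local_name}"

aliases = {}  # user-visible alias -> fully qualified global name

def register_alias(alias: str, site_id: str, local_name: str) -> None:
    aliases[alias] = global_name(site_id, local_name)

def resolve(alias: str) -> str:
    """Queries use the alias; the DDBMS resolves it to the real object."""
    return aliases[alias]

register_alias("Staff", site_id="S3", local_name="tblStaff")
print(resolve("Staff"))   # -> S3.tblStaff
```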
45Transaction Transparency
- Idea: Distributed transactions look like single-site transactions
- Example
46Transaction Transparency
- Concurrency Transparency
- Failure Transparency
47Transaction Transparency: Concurrency
- Idea: The DDBMS ensures non-interference among local and global transactions
- Replication complicates concurrency
- Solution 1: Establishing atomicity
- Solution 2: Limit updates
- Solution 3: Asynchronous updates
48Transaction Transparency: What if we goof up?
- Several problems can occur when concurrent transactions execute in an uncontrolled manner:
- The lost update problem (sketched below)
- The temporary update problem
- The incorrect summary problem
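The lost update problem is the easiest of these to reproduce. A minimal sketch with two hypothetical interleaved transactions on one balance:

```python
# Two transactions both read balance = 100, then write back their own
# result; T2's write silently overwrites (loses) T1's update.
balance = 100

t1_read = balance          # T1: read(balance)
t2_read = balance          # T2: read(balance)

balance = t1_read + 50     # T1: write(balance + 50)  -> 150
balance = t2_read - 30     # T2: write(balance - 30)  -> 70, T1's +50 is lost

print(balance)             # 70, but any serial execution would give 120
```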
49Transaction Transparency: Failure
- Idea: The recovery procedure must handle
- Loss of a message
- Failure of a communication link
- Failure of a site
- Network partitioning
- Must ensure all sub-transactions have completed successfully before committing the transaction
50Transaction Transparency: Commit Protocol
- Idea: Ensure atomicity of both local and global transactions by employing
- A Transaction Manager (Coordinator)
- Resource Managers (Participants)
- Two-Phase Commit (2PC)
- Three-Phase Commit (3PC)
51Transaction Transparency: Two-Phase Commit (2PC)
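The 2PC diagram from this slide is not reproduced here. The following is a minimal sketch (hypothetical classes, not any product's API) of the two phases: the coordinator collects votes, then broadcasts GLOBAL_COMMIT only if every participant voted READY_COMMIT.

```python
class Participant:
    """Hypothetical resource manager: votes in phase 1, obeys in phase 2."""
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit, self.state = name, can_commit, "INITIAL"

    def prepare(self) -> str:
        # Phase 1: force a prepare record to the local log, then vote.
        self.state = "READY" if self.can_commit else "ABORTED"
        return "READY_COMMIT" if self.can_commit else "VOTE_ABORT"

    def finish(self, decision: str) -> None:
        # Phase 2: apply the coordinator's global decision.
        self.state = "COMMITTED" if decision == "GLOBAL_COMMIT" else "ABORTED"

def two_phase_commit(participants) -> str:
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only on unanimous READY_COMMIT.
    decision = "GLOBAL_COMMIT" if all(v == "READY_COMMIT" for v in votes) else "GLOBAL_ABORT"
    for p in participants:
        p.finish(decision)
    return decision

print(two_phase_commit([Participant("P1"), Participant("P2")]))                    # GLOBAL_COMMIT
print(two_phase_commit([Participant("P1"), Participant("P2", can_commit=False)]))  # GLOBAL_ABORT
```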
52Transaction Transparency: 2PC Topologies
- Centralized 2PC
- Linear 2PC
- Distributed 2PC
53Transaction Transparency: 2PC Topology - Centralized
Legend: C = coordinator; Pi = participant; RC = READY_COMMIT; GC = GLOBAL_COMMIT; GA = GLOBAL_ABORT
54Transaction Transparency: 2PC Topology - Linear
Legend: C = coordinator; Pi = participant; RC = READY_COMMIT; GC = GLOBAL_COMMIT; GA = GLOBAL_ABORT
55Transaction Transparency: 2PC Topology - Distributed
Legend: C = coordinator; Pi = participant
56Transaction TransparencyThree-Phase Commit (3PC)
57Performance Transparency: Query Optimization
- Idea: A DDBMS performs as if it were a centralized DBMS
- Decisions
- Which fragment to access
- Which copy of a fragment to use, if the fragment is replicated
- Which location to use
- Costs
- Access time (I/O), communication time, CPU time
58Heterogeneity Transparency
- Idea: It shouldn't matter what DBMS is running at each site
[Diagram: sites 1 through N of a distributed database, one running Oracle and another running Sybase]
59But wait, there's a caveat
- It's generally not a good idea to implement full transparency!
60Functionality
61Overview
- Transaction Management
- Concurrency Control
- Deadlock Management
- Database Failure Recovery
- The X/Open Distributed Transaction Processing
Model and Replication - Query Optimization
- Mobile Databases
62Transaction Management
- Local Database
- Transaction Manager
- Scheduler
- Recovery Manager
- Buffer Manager
- Database Site
- Transaction Coordinator
- Data Communication Component
63Concurrency Control
- Objectives
- Serializability
- Locking Protocols
- Centralized 2PL
- Primary Copy 2PL
- Distributed 2PL
- Timestamp Protocols
64Concurrency Control
- Objectives
- Preserve consistency of data items
- Enable parallel processing for efficiency
- Minimize overhead
- Work efficiently over networks with communication delays
- Reduce constraints on the structure of atomic actions
65Concurrency Control
- Serializability
- Simple answer
- one transaction has full, locked control
- Better method?
- Want
- Parallel processing
- Maximization of concurrent operations
- How to implement in Distributed DBMS?
66Concurrency Control
- Methodology
- Locking
- Timestamping
- How do we implement locking mechanism?
67Concurrency Control
- Centralized 2PL (two-phase locking)
- A single site maintains all locking information
- One lock manager
- Advantages
- Straightforward implementation
- Disadvantages
- Bottleneck
- Lower reliability
68Concurrency Control
- Primary Copy 2PL
- Many sites have lock managers
- For replicated data, one copy is chosen as the primary copy, and the rest are slave copies
- On update, the transaction coordinator determines the location of the primary copy
- The update is propagated to the slave copies
- Advantages
- Fixes Centralized 2PL's disadvantages
- Disadvantages
- Deadlock management is more complex
- The primary copy's site can become a bottleneck
69Concurrency Control
- Distributed 2PL
- Every site has a lock manager
- If data is replicated, a read-one-write-all (ROWA) protocol is used: a read may lock any one copy, but a write must lock all copies (see the sketch below)
- Advantages
- Decentralized lock handling
- Disadvantages
- Again, deadlock management is more complex
- High communication costs (all copies must be locked before an update)
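A minimal sketch of the read-one-write-all rule under distributed 2PL, assuming hypothetical per-site lock managers for one replicated item: a read locks any single copy, while a write must lock every copy or back out.

```python
# Hypothetical per-site lock managers for one replicated data item.
class LockManager:
    def __init__(self, site):
        self.site, self.locked = site, False
    def lock(self):
        if self.locked:
            return False          # conflict: caller must wait or abort
        self.locked = True
        return True
    def unlock(self):
        self.locked = False

def read_one(replicas):
    """ROWA read: a lock on any single copy is sufficient."""
    return replicas[0].lock()

def write_all(replicas):
    """ROWA write: every copy must be locked before the update is applied."""
    acquired = []
    for lm in replicas:
        if lm.lock():
            acquired.append(lm)
        else:
            for held in acquired:   # back out on conflict
                held.unlock()
            return False
    return True

replicas = [LockManager("NY"), LockManager("LDN"), LockManager("HK")]
print(read_one(replicas))    # True: a single copy is enough for a read
replicas[0].unlock()         # release the read lock
print(write_all(replicas))   # True: all three copies locked for the write
```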
70Timestamp Protocols
- Objective
- Order transactions globally
- Give older transactions priority in conflict
- i.e. breaks ties
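A minimal sketch of the global ordering above, assuming each site keeps a local counter: pairing the counter with a site identifier gives a total order, and the site identifier breaks ties between transactions stamped at the same logical instant.

```python
import itertools

class Site:
    """Hypothetical site that stamps transactions with (counter, site_id)."""
    def __init__(self, site_id: int):
        self.site_id = site_id
        self.counter = itertools.count(1)

    def new_timestamp(self):
        # A smaller tuple means an older transaction, which wins conflicts.
        return (next(self.counter), self.site_id)

s1, s2 = Site(1), Site(2)
t_a = s1.new_timestamp()   # (1, 1)
t_b = s2.new_timestamp()   # (1, 2) -- same counter, so the site id breaks the tie
assert t_a < t_b           # t_a is "older" and gets priority in a conflict
```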
71Deadlock Management
- Centralized Deadlock Detection
- Hierarchical Deadlock Detection
- Distributed Deadlock Detection
72Deadlock Management
- Centralized Deadlock Detection
- A single site is appointed Deadlock Detection Coordinator (DDC)
- The DDC constructs an aggregate wait-for graph (WFG) and rolls back deadlocked transactions
- Advantages
- Sites need only send the delta of their local WFGs
- Disadvantages
- Single point of failure
73Wait for Graph
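The wait-for graph figure from this slide is not reproduced here. Below is a minimal sketch (hypothetical transaction names) of how a deadlock detector could find a cycle in an aggregated WFG.

```python
def find_cycle(wfg):
    """Depth-first search for a cycle in a wait-for graph.

    wfg maps a transaction to the set of transactions it is waiting for.
    Returns one transaction on a cycle (a victim candidate) or None.
    """
    visiting, done = set(), set()

    def visit(t):
        if t in visiting:
            return t                  # back edge -> cycle -> deadlock
        if t in done:
            return None
        visiting.add(t)
        for waited_on in wfg.get(t, ()):
            hit = visit(waited_on)
            if hit is not None:
                return hit
        visiting.discard(t)
        done.add(t)
        return None

    for start in wfg:
        hit = visit(start)
        if hit is not None:
            return hit
    return None

# T1 waits for T2, T2 waits for T3, T3 waits for T1: a global deadlock.
print(find_cycle({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}))   # e.g. "T1"
```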
74Deadlock Management
- Hierarchical Deadlock Detection
- Sites are organized into a hierarchy
- Each node is responsible for checking for deadlock in its child nodes
- Advantages
- Less network communication
- Disadvantages
- Much more complex to implement
75Deadlock Management
- Distributed Deadlock Detection
- Each site contains partial deadlock detection information
- Advantages
- Extremely robust
- Disadvantages
- High network traffic
76Obermarck Detail
- Transactions are ordered
- An "Ex" (external) node is used to represent remote sites in the local WFG
- Upon receiving a deadlock detection request:
- Create the local WFG
- Resolve local deadlocks
- For cycles that run from an "Ex" node through local nodes and back out to an "Ex" node, send the local portion of the graph to the other sites to resolve
77Database Failure Recovery
- Failures in a Distributed Environment
- How Failures Affect Recovery
- Distributed Recovery Protocols
- Network Partitioning
- Identifying Updates
- Maintaining Integrity
- Pessimistic Protocols
- Optimistic Protocols
78Failures in a Distributed Environment
- Distributed DBMS-specific failures
- Packet loss
- Communications link failure
- Site failure
- Network partitioning
- Sites split into multiple groups that can communicate within each group, but not outside it
- It is hard to tell a network failure from a busy site
79Failure and Recovery
- Goals
- Maintain globally atomic transactions
- Leave operational sites unblocked
- Upon failure
- Abort affected transactions
- Flag the site as failed
- Wait for the site to recover
- Initiate a recovery procedure to remove partial transactions
- Update its copy of the database to maintain consistency with the overall database
80Distributed Recovery Protocols
- Assumptions
- Every global transaction has a transaction manager (TM) site
- Two common protocols
- Two-phase commit (2PC)
- Three-phase commit (3PC)
81Network Partitioning
- Problem with replicated data
- The classic unsynchronized bank account
- Constraints on the account (e.g. the balance can't go below 0) may be violated
- It is generally not possible to design a non-blocking, atomic commit protocol for arbitrarily partitioned networks
82Network Partitioning
- Pessimistic protocols
- Choose consistency of the database over availability; block transactions in a partitioned state if consistency cannot be maintained
- Optimistic protocols
- Choose availability at the expense of consistency; updates are allowed to proceed independently in the various partitions
83The X/Open Distributed Transaction Processing Model
- X/Open is a vendor-neutral, international consortium of users, software vendors, and hardware vendors whose mission is to cause the creation of a viable, global information infrastructure
- Goals and Achievements (DTP Standard)
- Developed APIs for atomic, consistent, isolated, and durable transaction processing
- Created the idea of the Transaction Manager (TM) and Resource Manager (RM) discussed previously
84The X/Open Distributed Transaction Processing Model
- DDBMS Tasks
- Remote invocation mechanism
- Provided by the Remote Operations Service (ROSE) and Remote Procedure Call (RPC)
- Distributed transaction mechanism
- Uses the Open Systems Interconnection Transaction Processing (OSI-TP) protocol
- Supports flat, chained, and nested transactions
85Replication
- Replication is the process of generating and reproducing multiple copies of data at one or more sites
- Data Ownership
- Master/Slave Ownership
- Workflow Ownership
- Update-Anywhere Ownership
86Master-Slave Ownership
- Asynchronously replicated data is owned by one site and can only be updated by that site
- Other sites subscribe to the data owned by the publisher site (a publish-and-subscribe metaphor)
87Workflow Ownership
- Similar to Master-Slave Ownership, but allows the right to update to move from site to site
- Only one site can have the right to update at any given moment
88Update Anywhere Ownership
- P2P-style functionality
- Everyone can update
- Provides local autonomy even if the rest of the network is dead
89Replication Mechanisms
- Snapshots
- Database Triggers
- Conflict Detection and Resolution
90Snapshot
- Allows for asynchronous distribution of changes to individual relations, collections of relations, views, or fragments of relations according to a predefined schedule
91Database Triggers
- Shifts the responsibility to the user's code
- Executes whenever an appropriate event occurs
- Disadvantages
- Increases overhead
- Has no knowledge of the transactional nature of the modification
- Frequent updates stress the network and application
- Cannot be scheduled
- Refreshing related relations is complex
- Difficult to roll back
92Conflict Detection and Resolution
- Detection
- Before and after images
- Resolution
- Earliest and latest timestamps
- Site priority
- Additive and average updates
- Minimum and maximum values
- User-defined
- Hold for manual resolution
- Resolve Duplicate Primary Key or Unique Keys
- Append site name to duplicate value
- Append sequence to duplicate value
- Discard duplicate value
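A minimal sketch of two of the resolution rules listed above (latest timestamp and site priority), using hypothetical update records of the kind captured for detection:

```python
# Hypothetical conflicting updates to the same row.
updates = [
    {"site": "NY",  "ts": 1700000100, "after": {"salary": 46000}},
    {"site": "LDN", "ts": 1700000200, "after": {"salary": 47000}},
]

def resolve_latest_timestamp(conflicts):
    """Keep the update with the latest timestamp."""
    return max(conflicts, key=lambda u: u["ts"])

def resolve_site_priority(conflicts, priority):
    """Keep the update from the highest-priority site."""
    return min(conflicts, key=lambda u: priority.index(u["site"]))

print(resolve_latest_timestamp(updates)["site"])              # LDN
print(resolve_site_priority(updates, ["NY", "LDN"])["site"])  # NY
```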
93Current X/Open DTP and Replication Support
94Query Optimization
- Need for speed
- Distributed Query Transformation
- Technique Primary Horizontal Fragmentation
- Technique Distributed Joins
95Need for speed
- Why worry?
- For query optimization, you want to avoid disk reads, because they are slow
- WAN or even LAN speeds are much slower than disks
96Distributed Query Transformation
- Same idea as was presented by the query optimization group, but extended in the following ways
- Once you apply the heuristic processing, take data distribution into account
- Replace global relations at the leaves of the query tree with their reconstruction algorithms
97Technique Primary Horizontal Fragmentation
- (R1 UNION R2) JOIN R3 = (R1 JOIN R3) UNION (R2 JOIN R3)
- This allows a join of two relations to be implemented as a union of partial joins (which can be performed in parallel), as checked in the sketch below
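A minimal check of this equivalence on hypothetical fragments, joining on a shared attribute x:

```python
def join(r, s, attr):
    """Natural join of two relations (lists of dicts) on a single attribute."""
    return [dict(t, **u) for t in r for u in s if t[attr] == u[attr]]

r1 = [{"x": 1, "a": "r1"}]                        # horizontal fragment 1
r2 = [{"x": 2, "a": "r2"}]                        # horizontal fragment 2
r3 = [{"x": 1, "b": "p"}, {"x": 2, "b": "q"}]

whole   = join(r1 + r2, r3, "x")                  # (R1 UNION R2) JOIN R3
partial = join(r1, r3, "x") + join(r2, r3, "x")   # partial joins, run in parallel
assert sorted(map(str, whole)) == sorted(map(str, partial))
```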
98Technique Distributed Joins
- A join is very expensive
- To optimize, replace the join by a combination of semijoins; this reduces the size of the operand relations
- R1 JOIN(x) R2 = (R1 SEMIJOIN(x) R2) JOIN(x) R2
99Technique Distributed Joins
- Evaluation
- R' = PROJ(x)(R2) at Site 2
- Transfer R' to Site 1
- R'' = R1 SEMIJOIN(x) R' at Site 1
- Transfer R'' to Site 2
- Result = R'' JOIN(x) R2 at Site 2
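A minimal sketch of that evaluation on hypothetical relations: only the join column of R2 and the matching tuples of R1 ever cross the network.

```python
def project(rel, attr):
    """PROJECTION onto one attribute, with duplicates removed."""
    return [{attr: v} for v in sorted({t[attr] for t in rel})]

def semijoin(r, s, attr):
    """SEMIJOIN: tuples of r whose join value appears somewhere in s."""
    keys = {t[attr] for t in s}
    return [t for t in r if t[attr] in keys]

def join(r, s, attr):
    return [dict(t, **u) for t in r for u in s if t[attr] == u[attr]]

r1 = [{"x": 1, "a": "a1"}, {"x": 2, "a": "a2"}, {"x": 9, "a": "a9"}]   # at Site 1
r2 = [{"x": 1, "b": "b1"}, {"x": 2, "b": "b2"}]                        # at Site 2

r_prime  = project(r2, "x")              # step 1: R' = PROJ(x)(R2) at Site 2
#                                          step 2: ship R' (small) to Site 1
r_second = semijoin(r1, r_prime, "x")    # step 3: R'' = R1 SEMIJOIN(x) R' at Site 1
#                                          step 4: ship R'' (matching tuples only) to Site 2
result   = join(r_second, r2, "x")       # step 5: R'' JOIN(x) R2 at Site 2
print(result)   # same answer as R1 JOIN(x) R2, with less data shipped
```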
100Mobile Databases
- Problem
- Laptops and PDAs
- Want database functionality, but have limited bandwidth and storage
- Solution
- Mobile Databases
- Portable and physically separate from a centralized database server, but capable of communicating with that server from remote sites
101Mobile Databases
- Advantages
- Mobile!
- Can poll the centralized database for up-to-date data and statistics
- Can do a one-shot update of the centralized database
- Disadvantages
- Typically have pre-defined SQL queries
- This may change in future devices
102Questions?
103References Web
- http://www.csee.umbc.edu/kalpakis/Courses/621-fa03/Lectures/DistributedDeadlocks.pdf
- http://www.itee.adfa.edu.au/courses/ACSC7360/
- http://www.postgresql.org/
- http://www.oracle.com/
- http://www.microsoft.com/
- http://www.mysql.com/
- http://www.cis.usouthal.edu/share/allen_324/Presentr/CH10.ppt
- http://www.cs.rpi.edu/noel/distr_scaleup/distributed.html
- http://db.cs.berkeley.edu/jmh/cs262b/ddbms.html
- http://www.doc.mmu.ac.uk/STAFF/P.Quick/distribu.ppt
104References Text
- Database Systems: Design, Implementation, and Management, 5th Edition, Rob and Coronel
- Distributed and Parallel Database Systems, M. Tamer Ozsu and Patrick Valduriez
- Database System Concepts, Silberschatz, Korth, and Sudarshan, McGraw-Hill
105References Review Article
- Mariposa: A Wide-Area Distributed Database System, Michael Stonebraker, Paul M. Aoki, Avi Pfeffer, Adam Sah, Jeff Sidell, Carl Staelin, and Andrew Yu