Title: Federated Database Systems Part I
1. Federated Database Systems, Part I
- CSCI 8370 Advanced Database
- Meena Nagarajan
- http://lsdis.cs.uga.edu/meena
2. Multi database systems
- Multiple databases created for the same functionality
- Different operating systems, data formats, query languages, etc.
- Typically DBs managed by DBMSs running on heterogeneous computing platforms
- Information sharing across dissimilar platforms
- Interconnect previously isolated software systems (DBMSs)
- Not only invoke but also coordinate interactions
3. Interoperating with heterogeneous databases: requirements
- Distribution transparency: users must access a number of different databases in the same way as accessing a single database.
- Heterogeneity transparency: users must access other schemas in the same way they access their local database (using a familiar model and language).
- The existing database systems and applications must not be changed.
4. Interoperating with heterogeneous databases: requirements
- Addition of new databases must be easily accommodated into the system.
- The databases have to be accessed both for retrievals and updates.
- The performance of heterogeneous systems has to be comparable to that of homogeneous distributed systems.
5. Multi database systems
- Interconnection and cooperation of autonomous and heterogeneous databases must address
  - Distribution
  - Autonomy
  - Heterogeneity
- In order
  - Overlooked autonomy (intra-corporate, poor networking infrastructure)
  - More of autonomy and flexible bridging of heterogeneity (federated approach)
  - Autonomy over heterogeneity (multi database language approach)
6. More on heterogeneity
- Heterogeneity is independent of the location of data
- When is an information system homogeneous?
  - The software that creates and manipulates data is the same
  - All data follows the same structure and data model and is part of a single universe of discourse
- Different levels of heterogeneity
  - Different languages to write applications
  - Different query languages
  - Different models
  - Different DBMSs
  - Different file systems
  - Semantic heterogeneity, etc.
7. More on autonomy
- Databases are usually under separate and independent control
- Aspects of autonomy
  - Design autonomy: local DBs choose their own data model, query language, interpretation of data, etc.
  - Communication autonomy: local DBs decide when and how to respond to requests from other DBs
  - Execution autonomy: execution of local/external operations/transactions is not controlled by any external DBMS
  - Association autonomy: local DBs can decide how much of their data/functions/operations to share with other classes of users
8. Interoperability
- The ability to request and receive services between the interoperating systems and use each other's functionality.
- Systems are considered interoperable if
  - They can exchange messages and requests
  - They can receive services and operate as a unit in solving a common problem
9. Terminologies
- A DBS consists of software called a DBMS and one or more databases it manages.
- An FDBS is a collection of cooperating but autonomous component DBSs.
- The software that controls and coordinates the component DBSs is called an FDBMS.
10. Heterogeneous Distributed Databases
- Information systems that provide interoperation and varying degrees of integration among multiple DBs are called
  - Multi database systems, or
  - Federated systems, or
  - More generally, heterogeneous distributed database systems (HDDBSs)
11. Solutions to integrating HDDBSs
- Global Schema Integration
- Federated Database systems
- Multi database language approach
12. Global Schema Integration
- Based on complete integration to provide a single view
- Advantages
  - Consistent, uniform view of and access to data for users
  - Users are unaware of the existence of multiple DBs
13. Global Schema Integration
- Disadvantages
  - Hard to automate creation of a global schema: structural, semantic or behavioral conflicts
  - Autonomy, especially association autonomy, is sacrificed: all local data and operations have to be revealed
  - Loss of semantic information, depending on how the schema integration is performed
  - Correctness of the global schema is hard to prove because of context-dependent meanings
14. Global Schema Integration
- Error prone, time consuming
- Unsuitable for frequent dynamic changes to schemas
- Does not scale well with size of DB networks
15. Federated Database systems
- Aim: remove the need for static global schema integration
- Allows each local DB to have more control over the shareable information
- Control is decentralized
- Integration need not be complete but depends on needs of users
16. An FDBS and its components: cooperation among independent systems
[Figure: an FDBS and its component DBSs]
- A component DBS can continue local operations and participate in more than one federation.
- A component DBS can be centralized, decentralized, or another FDBMS.
17. FDBs
- Compromise between
  - No integration, in which users must explicitly interface with multiple autonomous DBs, and
  - Total integration, in which the autonomy of each component DBS is sacrificed so that users can access data through a single global interface but not as a local user
- Support local and global (federated) operations
18. Taxonomy: based on autonomy
- A DBS is either centralized or distributed
  - Centralized: a single DBMS managing a single DB
  - Distributed: a single distributed DBMS managing multiple DBs
- An MDBS supports operations on multiple DBs
19. Taxonomy
- Loosely coupled FDBS
  - It is the users' responsibility to create and maintain the federation. No control is enforced by the federation administrator.
- Tightly coupled FDBS
  - The federation administrators have responsibility for creating and maintaining the federation and for actively controlling access to the component DBSs.
- Association autonomy of the individual component DBs still exists
20. FDBS schemas
- Local schema
  - Conceptual schema of a component DB
- Component schema
  - Local schema translated to the common data model of the FDBS. Alleviates data model heterogeneity.
- Export schema
  - Specifies the objects shareable with other members or classes of members of the FDBS.
- Federated schema
  - A statically integrated schema or dynamic view of multiple export schemas. There can be multiple federated schemas.
- External schema
  - For customization when the federated schema is large and complicated. Another level of abstraction, for example for a class of users.
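As a rough illustration of how the five schema levels relate, the sketch below walks one hypothetical component DB through each level. The relation and attribute names, the dictionary-based "common data model", and the helper functions are all assumptions made for illustration; they are not taken from any particular FDBS.

```python
# Hypothetical example: every name below is an illustrative assumption.

# Local schema: the component DB's own conceptual schema (native names).
local_schema = {"EMP": ["ENO", "ENAME", "SAL", "SSN"]}

# Renaming table used to express the local schema in the common data model.
to_common = {"EMP": "employee", "ENO": "id", "ENAME": "name",
             "SAL": "salary", "SSN": "ssn"}


def to_component(local):
    """Component schema: local schema translated to the common data model."""
    return {to_common[rel]: [to_common[attr] for attr in attrs]
            for rel, attrs in local.items()}


def to_export(component, shareable):
    """Export schema: only the objects the component DB agrees to share."""
    return {rel: [a for a in attrs if a in shareable[rel]]
            for rel, attrs in component.items() if rel in shareable}


def to_federated(*exports):
    """Federated schema: integration of several export schemas."""
    federated = {}
    for export in exports:
        for rel, attrs in export.items():
            merged = federated.setdefault(rel, [])
            merged.extend(a for a in attrs if a not in merged)
    return federated


def to_external(federated, wanted):
    """External schema: a customized view for one class of users."""
    return {rel: [a for a in attrs if a in wanted[rel]]
            for rel, attrs in federated.items() if rel in wanted}


component_schema = to_component(local_schema)
export_a = to_export(component_schema, {"employee": {"id", "name", "salary"}})
export_b = {"employee": ["id", "dept"]}  # export schema of a second component DB
federated_schema = to_federated(export_a, export_b)
external_schema = to_external(federated_schema, {"employee": {"name", "dept"}})
```

Note that the `ssn` attribute never leaves the component DB: it is present in the component schema but dropped at the export level, which is where association autonomy is exercised in this sketch.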
21. Five-level schema architecture of an FDBS
22. Loosely coupled FDBSs
- The user creates and maintains the federated schema
- Creating the schema corresponds to creating a view against the relevant export schemas
- Therefore, each user must be aware of the information and structure of the export schemas
- It is hard to support view updates; therefore, assume highly autonomous read-only DBs
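The idea of a user-defined federated schema as a view over export schemas can be sketched with Python's built-in `sqlite3` module. This is only an analogy under stated assumptions: two attached in-memory SQLite databases stand in for autonomous component DBs, and the database, table and view names (`sales`, `support`, `all_customers`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two attached in-memory databases stand in for autonomous component DBs.
conn.execute("ATTACH DATABASE ':memory:' AS sales")
conn.execute("ATTACH DATABASE ':memory:' AS support")

# Each component DB exposes one relation (its "export schema" in this sketch).
conn.execute("CREATE TABLE sales.customer (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE support.client (id INTEGER, name TEXT)")
conn.execute("INSERT INTO sales.customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO support.client VALUES (2, 'Grace')")

# The user builds a personal federated schema as a view over both exports.
# A TEMP view is used because it may reference tables in attached databases.
conn.execute("""
    CREATE TEMP VIEW all_customers AS
    SELECT id, name, 'sales' AS source FROM sales.customer
    UNION ALL
    SELECT id, name, 'support' AS source FROM support.client
""")

rows = conn.execute(
    "SELECT id, name, source FROM all_customers ORDER BY id"
).fetchall()

# Updating through the view is rejected, mirroring the read-only assumption.
try:
    conn.execute("INSERT INTO all_customers VALUES (3, 'Alan', 'sales')")
    view_is_updatable = True
except sqlite3.Error:
    view_is_updatable = False
```

The user who writes the view has to know the names and structure of both relations, which is the burden the slide describes.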
23. Loosely coupled FDBSs: advantages
- Flexibility of different interpretations possible for the same federated schema
- Easier to cope with dynamic changes in schemas since it is easier to create views. Detection of changes is, however, expensive.
24. Loosely coupled FDBSs: disadvantages
- Duplicated effort in creation of similar federated schemas.
- Difficulty in understanding the semantics of schemas available to the user.
- Due to possible multiple view creations, view updating cannot be supported.
25. Tightly coupled FDBSs
- Aim: provide location, replication and distribution transparency
- Federation administrators have full control over creation and maintenance of federated schemas and access to other export schemas
- A single federated schema is the same as a global schema, but view updates are possible if administrators understand the mappings.
26. Tightly coupled FDBSs: disadvantages
- The FDBS administrator and component DBSs negotiate creation of export schemas, during which the administrator has complete read access to the component schema and/or data. This violates autonomy.
- A change in export/component schemas implies redoing federated schema creation.
27. Multi database language approach
- Aim: provide constructs that perform queries involving several DBs at the same time
- Has features not supported by traditional languages. Example: a global name can be used to identify a group of DBs
- DBs covering the same subject are grouped under a collective name. Inter-DB relationships are specified in the dependency schemas
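The notion of a collective name that identifies a group of DBs can be illustrated with a small sketch. It does not reproduce the syntax of any real multi database language; the registry, the group name `car_rentals`, the database names and the `query_group` helper are all hypothetical.

```python
# Hypothetical component DBs, each holding rows as plain dictionaries.
databases = {
    "rental_a": [{"car": "compact", "rate": 40}, {"car": "suv", "rate": 85}],
    "rental_b": [{"car": "compact", "rate": 38}],
    "library": [{"title": "Databases"}],
}

# A collective (global) name groups the DBs that cover the same subject.
groups = {"car_rentals": ["rental_a", "rental_b"]}


def query_group(group, predicate):
    """Apply one selection to every DB in the named group in a single call."""
    results = []
    for db_name in groups[group]:
        for row in databases[db_name]:
            if predicate(row):
                # Keep the originating DB visible: there is no location
                # transparency in this approach.
                results.append((db_name, row))
    return results


cheap = query_group("car_rentals", lambda row: row["rate"] < 50)
```

The caller still has to know that both rental DBs use a `rate` attribute with comparable meaning; the language construct only saves repeating the query per database.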
28. Multi database language approach: disadvantages
- Lack of distribution and location transparency for users.
- Users are responsible for
  - finding relevant DBs,
  - understanding schemas,
  - detecting and resolving semantic conflicts,
  - performing view integration
- Some support is offered by the language constructs
29. More on Federated Databases
- System architecture: core components combined in different ways to produce different data management architectures
  - Data: data are the basic facts and information managed by a DBS.
  - Database: a database is a repository of data structured according to a data model.
  - Commands: commands are requests for specific actions that are either entered by a user or generated by a processor.
  - Processors: processors are software modules that manipulate commands and data.
  - Schemas: schemas are descriptions of data managed by one or more DBMSs. A schema consists of schema objects and their interrelationships.
  - Mappings: mappings are functions that correlate the schema objects in one schema to the schema objects in another schema.
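To make the definition of a mapping concrete, the following sketch treats a mapping as a lookup from schema objects in one schema to schema objects in another, and uses it to translate a row of data in both directions. Representing a schema object as a `(relation, attribute)` pair, and the names `EMP`, `ENAME` and `SAL`, are assumptions for illustration only.

```python
# Hypothetical mapping between a local schema and a component schema.
# Each schema object is represented as a (relation, attribute) pair.
local_to_component = {
    ("EMP", "ENAME"): ("employee", "name"),
    ("EMP", "SAL"): ("employee", "salary"),
}


def invert(mapping):
    """Build the reverse mapping (valid here because it is one-to-one)."""
    return {target: source for source, target in mapping.items()}


def translate_row(mapping, relation, row):
    """Rename a row's relation and attributes according to the mapping."""
    target_relation = None
    translated = {}
    for attribute, value in row.items():
        target_relation, target_attribute = mapping[(relation, attribute)]
        translated[target_attribute] = value
    return target_relation, translated


component_row = translate_row(
    local_to_component, "EMP", {"ENAME": "Ada", "SAL": 100}
)
local_row = translate_row(invert(local_to_component), *component_row)
```

Real mappings are rarely pure renamings; they may also convert units, merge attributes or resolve semantic conflicts, which this sketch deliberately leaves out.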
31. Basic system components of the data management architecture
32. Processors in an FDBS
- Transforming processor: uses mappings to transform commands from the internal command language to the local query language, etc.
- Filtering processor: uses access control specified in the export schema to limit allowable operations submitted to the corresponding component schemas
- Constructing processor: performs query decomposition and merges data
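The three processor types can be pictured as stages in a small pipeline. The sketch below is a simplified illustration, not an FDBMS implementation: commands are plain dictionaries, only read commands are actually executed, and all names (`EXPORT_ACCESS`, `LOCAL_NAMES`, `LOCAL_DATA`, `run_federated`) are hypothetical.

```python
# Hypothetical federation of two component DBs.
EXPORT_ACCESS = {"hr": {"read"}, "payroll": {"read", "update"}}
LOCAL_NAMES = {"hr": {"employee": "EMP"}, "payroll": {"employee": "STAFF"}}
LOCAL_DATA = {
    "hr": {"EMP": [{"id": 1, "name": "Ada"}]},
    "payroll": {"STAFF": [{"id": 2, "name": "Grace"}]},
}


def decompose(command):
    """Constructing processor, part 1: one subcommand per component DB."""
    return [dict(command, db=db) for db in LOCAL_NAMES]


def filter_command(sub):
    """Filtering processor: enforce the access control of the export schema."""
    if sub["op"] not in EXPORT_ACCESS[sub["db"]]:
        raise PermissionError(f"{sub['op']} not allowed on {sub['db']}")
    return sub


def transform(sub):
    """Transforming processor: map the federated name to the local name."""
    return dict(sub, relation=LOCAL_NAMES[sub["db"]][sub["relation"]])


def execute_local(sub):
    """Stand-in for the component DBMS; this sketch only executes reads."""
    return list(LOCAL_DATA[sub["db"]][sub["relation"]])


def merge(partial_results):
    """Constructing processor, part 2: merge the data returned by each DB."""
    return [row for rows in partial_results for row in rows]


def run_federated(command):
    subcommands = decompose(command)
    partial_results = [
        execute_local(transform(filter_command(sub))) for sub in subcommands
    ]
    return merge(partial_results)


rows = run_federated({"op": "read", "relation": "employee"})
```

Submitting `{"op": "update", "relation": "employee"}` raises `PermissionError` at the filtering stage, because the hypothetical `hr` export schema allows only reads.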
33. System architecture of an FDBS: schemas and processors
34. Some research in multi database systems
- Schema and Language translation
- Schema integration
- Multi database consistency and dependencies
- Workflow management systems
- Transaction processing
35. Schema and language translation
36. Schema integration
37. Multi database consistency and dependencies
38. Workflow management systems
39. Transaction processing
40. Evolution of FDBS
41. Multi DBMS/FDBS efforts
42. Significant features