Databases and Information Systems 4

Slides: 96

Provided by: computin7

1
Databases and Information Systems 4

Richard Cooper (rich_at_dcs)
and
Tony Printezis (tony_at_dcs)

2
The Fundamental Problem

Database Systems have been very successful in
providing good support for managing data which is
fairly large and fairly complex
What happens when
the data gets very much larger
the data gets very much more complex

3
Contents of Course

Week 1 (Richard)
Introduction
Overview of RDB/ORDB/OODB
Week 2 (Richard)
Orthogonal Persistence
Object Oriented Database Systems

4
Contents of Course

Week 3 (Tony)
Java Object Serialization
The PJama API
Week 4 (Tony)
Object Caching and Object Faulting
Pointer Swizzling

5
Contents of Course 3

Week 5 (Tony)
Garbage Collection - Disk Behaviour
Object Promotion
Week 6 (Tony)
Object Eviction
Orthogonal Persistence for Java

6
Contents of Course 4

Week 7 (Tony)
Store Organisation
Garbage Collection
Week 8 (Richard)
Object Query Languages
Transaction Models

7
Contents of Course 5

Week 9 (Richard)
Transaction Models for Multi-Site Databases
Schema Evolution
Week 10
Specialised Indexing (Ela)
XML (Richard)

8
Assumptions about Database Use

As database systems evolved, it was assumed that
1. There was a central data store with lots of
distributed users.
2. The data was relatively simple (largely
alphanumeric).
3. The data was regular and complete.
4. There was a lot of data, but there was also
an implicit limit to the size.
5. The users were either consumers or
specialised creators

9
The Real World

Now we have
data all over the place
in all kinds of structures
much of it is text
even more of it is graphical or aural
vast amounts of it
some of it is missing or is structured
differently in different places
users with various kinds of interest/involvement

10
When Data is Small

You can get away with
non-linear algorithms
hand-crafted code and data
an ad hoc structure
implicit rules and informal conventions

11
When Data Gets Large

You must have
linear or (better still) incremental algorithms
systematic code and data management
regular structures, frameworks and tools to
support them
explicit, visible and interpretable rules

12
When Data is also Long Lived

We have the hardware to keep data for a very long
time
and there are often laws forcing us to do so
However, long-lived data tends to change
new data is added
it is restructured
the software expected to handle it evolves
Can you read a ten year old floppy???

13
When Data is also Heterogeneous

Information Systems increasingly must bring
together data produced
of different kinds (numeric and multi-media)
separately (e.g. in merged companies)
for different purposes
using different technologies
As though they were all designed to work together

14
Large, Long-lived, Heterogeneous and Unstoppable

Because the data supports continuous operations
utilities, banking, airlines, public service
You may not stop such systems if
you want to change the hardware or software
you want to change your database
you want to change the application
there are hardware or software failures
there are operations which require exclusive
access

15
This is the Reality we Live With

There are lots of examples
shared scientific data (e.g. genomic data)
e-business
governmental systems and health-care data
computer aided design and manufacturing
geographic information systems
etc., etc.

16
And There Are Many More Media for Data Access

Not just a private network, but also
the internet
digital television
mobile devices
etc., etc.

17
How To Cope 1

Software Re-use
not just small libraries such as Java APIs
but large components, such as
databases, payroll packages, GUI packages, etc.
Standardised Frameworks
CORBA, DCOM, EJB, .NET, XML

18
How to Cope 2

Generate code rather than write it
since much code is repetitious and can be
generated from
a high-level notation or by reflecting over data
Work incrementally
revolution is never affordable
plan and resource route for transition
remember the users!

19
The Fundamental Coping Device

Effective high-level and complex standards for
representing
data (relations not enough)
applications (regular, strict languages needed)
distributed systems (CORBA, etc.)
processes (UML, business processes, etc.)
etc., etc.

20
But also ...

It may be necessary to create new storage
techniques to fit new data structures
It will be necessary to invent new storage
structures to manage the new complexity
There is need for work at both
the implementation level and
the usability level

21
Lecture 2

New Requirements on DB Functions
Why Relations Won't Do
Extending Relations
Historical and Deductive Databases
Object Relational Databases
Oracle Objects, SQL3, etc.
Object Oriented Databases
intro only

22
New Applications with New Requirements

1. CAD, CAE, CIM
2. Computer Aided Software Engineering
3. Office Information Systems
4. Geographic Information Systems
5. Hypermedia Systems
Data is large, often graphical, multiple
versions required, data is complex

23
Requirements which carry over from Traditional
Applications

Efficient access to large amounts of data
Recovery mechanisms
Security mechanisms
Data independence
Distribution of data

24
Requirements Modified by the New Applications I

Transactions
in traditional applications, these are short -
milliseconds to book a seat
in novel applications, they may be long - hours
or days to edit a design
in traditional applications they are competitive
- don't book the same seat twice
in novel applications they may be co-operative
e.g. collaboration on design development

25
Requirements Modified by the New Applications II

Integrity Constraints are much more important
as the data is more semantically complex
some of the semantics is best expressed as
constraints
User Interfaces play a greater rôle
the data is manageable only if appropriately
visualised
complex operations must be made usable

26
Requirements Modified by the New Applications III

Data is organised differently
                     Trad. Apps   Novel Apps
Number of Objects    Large        Small
Number of Types      Small        Large
Object size          Small        Large/Huge

27
New Requirements Made by the New Applications I

Complex Data Structures
Just sets of records won't do
Object identity easier than primary keys
Implicit references easier than foreign keys
First Normal Form is a Killer!
Multimedia Data Types

28
New Requirements Made by the New Applications II

The Database must hold Code
to hold complex derived data
to hold "active values"
Multiple Versions
We only want one bank account record at any time
But many alternative designs
Building configurations becomes a problem

29
Can We Go On Using Relational DBMS?

Only with increased mapping problems
The RM only has two ways of relating two pieces
of data
They are in the same record.
They are in two records connected by a foreign
key.

30
The Semantic Poverty of the RM

The former is used for
grouping attributes
1-1 relationships
compound attributes
connecting keys of M-N relationships
The latter is used for
multi-valued attributes
sub-typing
one-many attributes

31
Other Problems with RDBs

You can't do recursive queries
e.g. "Return all the ancestors of X"
Nor much support for constraints
e.g. "All employees earn less than their boss"
You can't add new operations
e.g. "Return the volume of a building"
Impedance mismatch
if you have to use a PL, it has a different data
model from SQL

32
Three Approaches for Progress

Start with a traditional DBMS and extend its
modelling power (Object-Relational System)
or
Start with a rich data model and add DBMS
facilities (Object Oriented DBMS)
or
Start with a programming language and add DBMS
facilities (Persistent Programming Language)
Manifesto Wars

33
The Third-Generation Database System Manifesto I

Three tenets
Besides traditional data management services,
third generation DBMSs will provide support for
richer object structures and rules
Third generation DBMSs must subsume second
generation systems
Third generation DBMSs must be open to other
subsystems

34
The Third-Generation Database System Manifesto II

Thirteen Propositions
Rich type system
Inheritance
Functions/encapsulation
OIDs only if no primary key
Rules (triggers and constraints) are important
The query language should be central to all access
Manual/Automatic Collections
Update through views
Performance and data model should be kept separate
Multiple Prog. Languages
SQL is the de facto standard
Persistent extension of languages is good
Network communication through queries and results

35
The Object Oriented System Manifesto I

Mandatory Features
Complex Objects, Object Identity, Encapsulation,
Types and Classes, Inheritance, Late binding,
Ad hoc querying, Extensibility, Persistence,
Efficient storage, Concurrency, Recovery,
Computational completeness
Disagreement
Integrity constraints, DB Admin Tools, Views,
Schema Evolution Tools

36
The Object Oriented System Manifesto II

Optional Features
Multiple inheritance, Type checking,
Distribution, Design Transactions, Versions
Open Choices
Programming paradigm, Type system, Uniformity

37
The Third Manifesto

The relational model is still important and OO
features should be orthogonal
Like
relations, relational algebra up front,
integrity constraints, multiple and single
inheritance,
computational completeness, static type checking
Don't like
SQL, object IDs and null values

38
Two Extensions of RDBMS

Historical DBMS
keep all past states of the database
Deductive DBMS
derived data as well as base data
uses a language like Prolog to add the derived
data

39
Historical DBMS

Old records are kept when they are deleted to
answer queries like "give balance on 1/10/88?"
Records have two extra fields - creation and
deletion dates
delete sets the deletion field
insert sets the creation field
update sets the deletion field and creates a new
record
Two notions of time
when the data is valid and when it is entered
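
The creation/deletion-date scheme above can be sketched roughly as follows. This is only an illustration under assumptions: the Account table, its columns and the helper names are hypothetical, SQLite stands in for a historical DBMS, and only one of the two notions of time (when the data was entered) is modelled.

```python
import sqlite3

# Hypothetical schema: each Account row carries the two extra fields from the
# slide, `created` and `deleted` (NULL while the row is still current).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Account (accNo integer, balance integer, "
    "created text, deleted text)"
)

def insert(acc_no, balance, today):
    # insert sets the creation field
    conn.execute(
        "insert into Account values (?, ?, ?, NULL)", (acc_no, balance, today)
    )

def delete(acc_no, today):
    # delete only sets the deletion field; the old row is kept
    conn.execute(
        "update Account set deleted = ? where accNo = ? and deleted is null",
        (today, acc_no),
    )

def update(acc_no, balance, today):
    # update sets the deletion field and creates a new record
    delete(acc_no, today)
    insert(acc_no, balance, today)

def balance_on(acc_no, day):
    # The row current on `day` was created on or before it and not yet deleted.
    row = conn.execute(
        "select balance from Account where accNo = ? and created <= ? "
        "and (deleted is null or deleted > ?)",
        (acc_no, day, day),
    ).fetchone()
    return row[0] if row else None

insert(1, 100, "1988-09-01")
update(1, 250, "1988-10-15")
print(balance_on(1, "1988-10-01"))  # balance before the update
print(balance_on(1, "1988-11-01"))  # balance after the update
```

Dates are stored as ISO strings so that plain string comparison orders them correctly.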

40
Deductive DBMS (DDB)

A DDB is made up of two kinds of component
facts are simple base assertions - i.e. records
father( jane, john )   mother( jill, jane )
rules are ways of deriving more facts
grandfather( C, G ) :- parent( C, P ), father( P, G )
parent( C, P ) :- father( C, P ), etc.
Queries are rules with variables to be filled in
grandfather( X, john )? - who are john's
grandchildren
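
The facts, rules and query on this slide can be mimicked with plain sets, as a rough sketch. It assumes the "etc." in the parent rule stands for a matching mother clause, and the function names are illustrative only; a real deductive DBMS would evaluate rules generically rather than hard-code them.

```python
# Facts are (child, parent) pairs, matching the slide's argument order.
father = {("jane", "john")}
mother = {("jill", "jane")}

def parent():
    # parent(C, P) :- father(C, P).   parent(C, P) :- mother(C, P).  (assumed)
    return father | mother

def grandfather():
    # grandfather(C, G) :- parent(C, P), father(P, G).
    return {(c, g) for (c, p) in parent() for (p2, g) in father if p == p2}

def grandchildren_of(g):
    # Query: grandfather(X, g)?
    return sorted(c for (c, g2) in grandfather() if g2 == g)

print(grandchildren_of("john"))
```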

41
Object-Relational Databases

Also known as
Extended relational databases
Complex object databases
Main features
get rid of First Normal Form
add methods to tables
Main examples
Oracle 8/i onwards, SQL3, Informix

42
The Main Additions to RDBs

User defined abstract data types
Row types so that one value can include a nested
complex value
Collection types for domains
Inclusion of user-defined functions defined on
types
Inheritance
Multimedia data types and large objects

43
SQL3 (Evolving Standard)

This is a massive extension to SQL and has
computational completeness
row types
user-defined types
user-defined procedures, functions and operators
type constructors for arrays, sets, lists and
multisets
support for large objects - BLOBs and CLOBs
recursion

44
Row Types in SQL3

A row type is a sequence of field name/type pairs
- i.e. the type of a row of a table
In SQL3 it can also be the domain of a column
create table Branch( branchNo longInt,
address row( street varchar(20),
city varchar(20) ) )
Row types can be named
create row type EmpRT( Ename varchar(35), age
integer )
create table Employee of type EmpRT

45
User-Defined Types (UDTs) in SQL3

These are a means of defining new domain types in
SQL3, e.g.
create type StaffNumberType as varchar(5) final
More generally a UDT is an abstract data type
with
(non First Normal Form) fields
constructor methods
observer and mutator (get and set) methods
general methods

46
UDT Example

create type PersonType as
( private dateOfBirth Date,
public fname VARCHAR(15) not null,
public lname VARCHAR(15) not null,
function age(p PersonType) returns integer
return /* code to calculate age */
end )
ref is system generated // see later
instantiable // if not, only subtypes are
not final // can have sub-types

47
Subtypes and Supertypes

Given a type, we can create a subtype, e.g.
create type StaffType under PersonType as
( staffNo varchar(6), etc.
This works by creating an extra attribute which
refers to a PersonType value
This also works at the table level
create table Manager under Staff( MgrStartDate
Date)
This creates a table with all the columns of
Staff duplicated and all manager records in both
tables

48
References

In SQL3 it is possible to set up OID style
references.
On slide 46 we said that PersonType had
system-generated references, so we can do
create table Branch as
( branchNo integer,
address addressType,
manager ref(PersonType)
..... )
In this, the value is a system-generated OID

49
Collection Types

SQL3 supports four collection types
ARRAY - one dimensional fixed length array
LIST - ordered and allows duplicates
SET - unordered and does not allow duplicates
MULTISET - unordered and allows duplicates
E.g. if PersonType has an attribute
nextOfKin set(PersonType)
The following makes sense
select fName, lName, count(NextOfKin)

50
Triggers

Triggers are pieces of code which act when some
condition is met. Each trigger defines
the event and whether to act before or after it
occurs
whether to operate on each row or only once
what to do
create trigger MailNewStaffNextOfKin
after insert on Staff referencing new row as ST
begin
insert into StaffToMail values ( select P.name,
P.address
from Person P where ST.nextOFKin1 =
ST.staffNo )
end
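
A runnable approximation of a row-level "after insert" trigger is sketched below. It uses SQLite's trigger dialect rather than SQL3, and the table layouts (Person, Staff, StaffToMail and their columns) are assumptions made for the example, not the schema behind the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Person (personId integer primary key, name text, address text);
create table Staff (staffNo integer primary key, nextOfKin integer);
create table StaffToMail (name text, address text);

-- event: insert on Staff; timing: after; granularity: each row
create trigger MailNewStaffNextOfKin
after insert on Staff
for each row
begin
    insert into StaffToMail
    select P.name, P.address from Person P where P.personId = new.nextOfKin;
end;
""")

conn.execute("insert into Person values (1, 'Ann Smith', '1 High St')")
conn.execute("insert into Staff values (100, 1)")   # fires the trigger
mail_list = conn.execute("select name, address from StaffToMail").fetchall()
print(mail_list)
```

SQLite exposes the inserted row as `new`, which plays the role of the `referencing new row as ST` clause in the SQL3 version.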

51
Large Objects

Large objects are increasingly important and
there are two kinds
Binary Large Objects (BLOBs)
Character Large Objects (CLOBs)
You can
Concatenate them and do "substring" operations
Overlay and trim them
Return the length

52
Recursion

SQL3 permits linearly recursive queries, such as
with recursive AllManagers( staffNo,
managerStaffNo) as
(select staffNo, managerStaffNo
from Staff
union
select in.staffNo, out.managerStaffNo
from AllManagers in, Staff out
where in.managerStaffNo = out.staffNo )
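
The same linearly recursive query can be tried out in SQLite, which supports `with recursive`. The sketch below is an approximation: the three-row Staff table is made-up sample data, and the aliases are renamed to `a` and `s` because `in` is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Staff (staffNo integer, managerStaffNo integer)")
# Sample data (hypothetical): 3 reports to 2, 2 reports to 1, 1 has no manager.
conn.executemany("insert into Staff values (?, ?)", [(3, 2), (2, 1), (1, None)])

rows = conn.execute("""
with recursive AllManagers(staffNo, managerStaffNo) as (
    select staffNo, managerStaffNo from Staff
    union
    select a.staffNo, s.managerStaffNo
    from AllManagers a, Staff s
    where a.managerStaffNo = s.staffNo
)
select staffNo, managerStaffNo from AllManagers
where managerStaffNo is not null
order by staffNo, managerStaffNo
""").fetchall()
print(rows)
```

Each result row pairs a member of staff with one of their direct or indirect managers; `union` (rather than `union all`) removes duplicates and so also guarantees termination on cyclic data.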

53
Objects in Oracle

The object option in Oracle8 provides, among
other things
user-defined data types
the use of objects directly by use of the ref
keyword
collection types including variable length arrays
multimedia data types

54
User Defined Types

UDTs have a name, attributes and methods
create type Person as object
( name varchar2(30),
address varchar2(40),
member function getName return varchar2
)
Constructor methods - as usual
Comparison methods - to help order objects
General methods

55
Ref Types

Attributes with object types have their domains
declared using ref
create type Person as OBJECT
( name VARCHAR2(30),
spouse ref person )
For an object P of type Person, you can then do
P.spouse.name // to get the name of P's
spouse

56
Collection Types

There are two collection types
Arrays (called VARRAYs)
create type Prices as varray(10) of number(1,2)
Tables (called nested tables)
create type PersonTable as table of Person
Now we can have columns whose domains are either
of the above

57
Object Views

An object view is a virtual table of objects
useful to evolve relational applications into
object applications
create table Person (NINum varchar2(9),
Name varchar2(30), Age number)
create view OldView with object oid (NINum) as
select NINum, Name, Age from Person
where Age > 40
Update through views permitted where sensible

58
Comparing ORDBs and OODBs

ORDBs are better for
integrating a pre-existing RDB
traditional DBMS facilities (security, recovery
etc.)
OODBs are better for
advanced transactions, navigational queries
schema evolution
integrating a programming language

59
Lecture 3: Orthogonal Persistence

Why Orthogonal Persistence is important
What Orthogonal Persistence is
Principles of Orthogonal Persistence
How to achieve Orthogonal Persistence
Examples of Persistence Mechanisms

60
The Problem

Traditional data intensive programming requires
programmers to be distracted trying to arrange
storage for the data
Fortran programs   files
Cobol              Network Databases
C, etc.            Relations
This distraction slows productivity

61
Too Many Mappings!
62
Defining Persistence

Persistence is the length of time for which a
piece of data (including program) continues to
exist.
ranging from lasting only until the end of the
block it was declared in
to outliving the program which constructed it
Most systems provide different persistence
mechanisms for different data.
Often systems only permit some data long term
persistence - e.g. Java Object Serialization (JOS).

63
Orthogonal Persistence

is the automatic management of data so that it
may
outlive an individual program execution
automatically moving to and from backing store
be used concurrently by more than one program
not just storing a heap image - e.g. LISP,
Smalltalk
dynamic binding of names and types
be used by successive program versions
requires an evolution mechanism

64
Principles of OP

Data of any type (including multimedia and code
fragments) should have an equal right to all
levels of persistence
All of the data is stored completely
The data retains its structure when stored
The code is the same whatever the persistence of
its data

65
Why is this Important?

Every departure from these rules creates an
irregularity that the programmer has to work
around
data types which cannot be stored in the same way
as everything else
rebuilding incomplete structures
dealing with referential integrity problems
different code for transient and persistent data

66
Other Benefits of OP

Only one persistence technique to learn
Avoids extra code which obscures the application
logic
Permits code re-use
But how does the programmer assign different
persistence levels to different data?
Any data can persist but for this application
which should?

67
Mechanisms for Indicating Persistence

Explicit write statements - not in the spirit of
OP
Persistence indicated by class or type
ODMG supports this
The E language had "Shadow" classes - one for
each real class
Persistence indicated at object declaration or at
object creation
some OODBs do this
Persistence by reachability
this will be our favourite, you'll see!

68
Persistent Class Examples

Classes declared to be persistent
persistent class Person
early ODMG proposal
class Person implements Serializable
Java - native code can't play
class Person : public d_Object
ODMG proposal for C++

69
Persistent Object Examples

persistent Person P
Person P = new Person(MyDB)
Person P is created in the database
Person Q = new Person( P )
Person Q is created in the database "near to"
Person P.

70
Persistence by Reachability

Some objects are explicitly stored - persistent
roots
Any other object which is pointed to by a root is
automatically stored as well
Objects pointed to by those objects are also
stored
in fact, the transitive closure of references
from the roots is stored
This is similar to Garbage Collection
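
As a rough sketch of the idea, the set of objects to store is whatever a traversal from the persistent roots can reach, much like the mark phase of a garbage collector. The Node class and function name below are hypothetical and no real persistent store is involved.

```python
class Node:
    # Hypothetical in-memory object holding references to other objects.
    def __init__(self, name, *refs):
        self.name = name
        self.refs = list(refs)

def objects_to_store(roots):
    """Return every object reachable from the persistent roots."""
    reachable = {}              # id(obj) -> obj, so each object is visited once
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if id(obj) in reachable:
            continue            # already seen; this also guards against cycles
        reachable[id(obj)] = obj
        pending.extend(obj.refs)
    return list(reachable.values())

leaf_a, leaf_b = Node("a"), Node("b")
tree = Node("root", Node("inner", leaf_a), leaf_b)
orphan = Node("orphan", leaf_a)   # points into the tree, but no root points to it

stored = sorted(n.name for n in objects_to_store([tree]))
print(stored)
```

Only the tree root is named explicitly; the inner node and both leaves are stored because they are reachable from it, while `orphan` is not.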

71
Example of Reachability

[Figure: a tree in memory. The tree root is stored explicitly in the database, and the rest of the tree is dragged in as well.]

72
Using Persistence by Reachability

The data must be organised around the idea of
persistent roots and their transitive closures
Note this is not new
An RDB has each relation as a root whose
transitive closure is the set of records
ORDB and OODB databases can be organised the same
way
Except other structures may now be used - e.g. a
tree

73
History of OP

1978 - Identified by Atkinson
1978 - 1982 - Search for a suitable language
1983 - 1988 - PS-algol
1988 - 1995 - Napier88
1985 - present, ideas gradually appear in
commercial systems
1995 - 2000 - PJama, Persistent Java

74
What the Research has Entailed

Identification of language with suitable
properties
regularity, popularity
Identification of necessary techniques
store organisation, memory management, organising
the movement of data
Implementation of those techniques efficiently

75
Lecture 4

Persistent Programming Languages
What is a suitable language?
Some examples
Object Oriented Database Systems
Features
Examples
The Object Data Management Group Standard

76
A Suitable Language to Make Persistent

A persistent programming language is one which
accords with the principles of persistence (slide
64)
In building a persistent language other aspects
of a language are desirable
regularity and small number of constructs
since irregularities and more constructs increase
the number of aspects that the persistence layer
must cope with

77
PS-algol

This added persistence to S-algol, a simple and
regular form of Algol, at St Andrews
complex object structure, but object domains were
all of the same type
procedures are first-class objects, which means an
object can have a piece of code as a component,
there are variables which hold procedures, etc.
databases as objects in which you can enter
name/value pairs to be persistent roots
persistence by reachability from those
anything can persist

78
Napier88

More powerful version of PS-algol developed at St
Andrews and Glasgow
complex objects but now the domains are typed
single procedure to return the persistent store
as the sole persistent root object
databases inserted immediately below this
abstract data types and other type constructors
hyper-programming allows programming directly
against the database
image data type

79
Persistent Java

PJama was developed in Glasgow from 1995 onwards
Allows Java objects to be bound into the
persistent store and retrieved
Much more on this in subsequent lectures

80
Object Oriented Databases

An Object Oriented Database has the following
features
Objects can persist
Object identifiers and references
Encapsulation of data and methods
Inheritance
Dynamic binding of code to data

81
Example
82
Problems with OODBs

They are hard to implement
Adding concurrency, distribution, efficiency,
reliability and querying to an OO system is
difficult
They use different persistence mechanisms
They use different OO models
and different OO languages
They have been produced by small, unstable
companies

83
Differences in Object Models

Are scalars objects?
Can properties be public?
if not how is the optimiser going to work?
Are there other information hiding controls? -
e.g. friends
Multiple or single inheritance
What can be made persistent and how?

84
History of OODBMSs

First products in the field use Smalltalk/Own
Language
1986/7 - GemStone and Vbase
Big companies toy with the idea
1987 - DEC (Trellis/Owl) and Hewlett-Packard
(IRIS)
C++ Products in the Late Eighties
Ontos, Versant, Objectivity, ObjectStore
Other models 1990 onwards
O2, POET, UniSQL, Jasmine, etc.

85
Gemstone/J

Started as persistent Smalltalk
Switch now to Java
Distributed Java Beans and EJBs
Servlets and JSP
CORBA
etc.
OQL, Transactions, etc.

86
Jasmine

From Computer Associates (INGRES RDB)
Studio for application development
Java
Multimedia classes
Authoring tools
Web development facilities

87
POET

Java and C++
OQL
Targeted at small applications
Transactions and Locking
Schema versions
Event Notification
Security and Authorisation
Object factory - putting objects into RDBs

88
The Object Data Management Group (ODMG)

Set up by Rick Cattell at Sun and the main OODB
vendors
voting members - Sun, POET, Objectivity, Excelon
reviewer members - CERN, Versant, CA, NEC and
Micro Data Base Systems
academic members
membership always changing!

89
What are the ODMG Doing?

an architecture for OODBMS
a logical data model expressed as a class
hierarchy
a data definition language, ODL
a data interchange format, OIF
a query language, OQL
a number of Object Manipulation Languages (OMLs)
bindings to Java, C++ and Smalltalk

90
ODMG - OO Features Appropriate for Databases

Special Treatment of Literal Values
A DB cannot afford to make an integer an object
Separate Provision for Relationships
Most OO models are not very good at relationships
ODMG provides for automatically maintained
relationships - i.e. when one side changes so
does the other
Domain Types - date and time domains
Objects for Database Management
databases, transactions, locks, sessions,
schemata
Metadata Management

91
The ODMG Data Model

The data model is defined in terms of a number of
types which include
Interfaces - describe the abstract behaviour of
objects
Classes - describe the abstract behaviour and
state of objects
Collections - sets, bags, lists, arrays,
dictionaries
Constructed Types - enumerations, structures and
unions
Objects (with identity) and Literals (no identity)

92
The Type Hierarchy
93
Example (ODL)

struct Address { int house; String road; ... }
defines a complex literal (not an object)
interface Person { String name; int age; ... }
defines an uninstantiable object structure
class Employee : Person { int StaffNo; Dept d; ... }
defines an instantiable object structure
":" is inheritance which can be multiple

94
Relationships

Attributes and relationships are distinguished
class Employee : Person {
attribute int StaffNo;
relationship Dept d inverse Dept::Employees;
... }
class Dept {
relationship set<Employee> Employees inverse
Employee::d; ... }
Relationships can have automatically maintained
inverses
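
What "automatically maintained" means can be sketched in ordinary code: assigning one side of the relationship updates the other. The classes below are a hypothetical illustration, not the ODMG language binding, and for brevity only updates made through the Employee side are propagated.

```python
class Dept:
    def __init__(self, name):
        self.name = name
        self.employees = set()      # relationship set<Employee> Employees

class Employee:
    def __init__(self, name):
        self.name = name
        self._d = None              # relationship Dept d

    @property
    def d(self):
        return self._d

    @d.setter
    def d(self, dept):
        # Changing this side updates the inverse side as well.
        if self._d is not None:
            self._d.employees.discard(self)
        self._d = dept
        if dept is not None:
            dept.employees.add(self)

sales, research = Dept("Sales"), Dept("Research")
e = Employee("Jane")
e.d = sales       # Jane joins Sales
e.d = research    # moving her removes her from Sales automatically
print(sorted(x.name for x in research.employees))
```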

95
Extents

The extent of a type is the set of instances of
that type in the database
The extent of a subtype is a subset of the extent
of the supertype
The DB designer can request that the extent of a
class is maintained automatically
A particular implementation may include indexes
and keys
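
One way to picture an automatically maintained extent is a registry that every new instance joins, for its own class and for each superclass. This is a minimal sketch with hypothetical names (Persistent, extent); an ODMG system would maintain extents inside the database rather than in a Python dictionary.

```python
class Persistent:
    # Hypothetical base class that maintains one extent per class.
    _extents = {}

    def __init__(self):
        # Register the new instance in the extent of its class and of every
        # superclass, so a subtype's extent is a subset of its supertype's.
        for cls in type(self).__mro__:
            if cls is not object:
                Persistent._extents.setdefault(cls, []).append(self)

    @classmethod
    def extent(cls):
        return list(Persistent._extents.get(cls, []))

class Person(Persistent):
    def __init__(self, name):
        super().__init__()
        self.name = name

class Employee(Person):
    pass

p = Person("Ann")
e = Employee("Bob")
print([x.name for x in Person.extent()])
print([x.name for x in Employee.extent()])
```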
