Object-Relational Database Systems: - PowerPoint PPT Presentation

About This Presentation
Title:

Object-Relational Database Systems:

Description:

Why relational isn't enough, and some options. The object-oriented DBMS revolution ... GemStone, Objectivity, ObjectStore, Ontos, O2, Matisse, Poet, Versant, others ... – PowerPoint PPT presentation

Number of Views:169
Avg rating:3.0/5.0
Slides: 35
Provided by: MikeC159
Learn more at: https://dsf.berkeley.edu
Category:

less

Transcript and Presenter's Notes

Title: Object-Relational Database Systems:


1
  • Object-Relational Database Systems
  • Evolution Beats Revolution

Michael J. Carey IBM Almaden Research Center
Smalltalk
Java
New Data Types
Queries
Navi- gation
Appl. Dev. Tools
SQL
C
2
  • Plan for the Talk
  • The relational DBMS revolution
  • Relational model and query language
  • Why relational succeeded
  • Why relational isn't enough, and some options
  • The object-oriented DBMS revolution
  • Object-oriented model(s) and query language(s)
  • Why object-oriented "failed"
  • Why wrappers will fail as well
  • The object-relational DBMS evolution
  • The object-relational model and query language
  • Current products and examples
  • Performance and other challenges

3
  • The Relational DBMS Revolution
  • The pre-relational era (1970's)
  • Graph-based data models
  • Hierarchical model (e.g., IMS)
  • Network model (e.g., Codasyl)
  • Low-level, navigational interfaces
  • Labor-intensive and error-prone
  • The relational era (1980's)
  • Simple, abstract data model
  • Database set of relations ("tables")
  • 3 schema levels views, base tables, physical
    schema
  • Algebra of set-oriented operations
  • High-level, declarative interfaces
  • SQL, Quel, QBE
  • Embedded languages, 4GLs

4
  • The Relational Model (by example)
  • Employees and departments

Department
dno
name

10
Toy

20
Shoe
Employee
eno
name
salary
dept

1
Lou
10000000
10

7
Laura
150000
20

22
Mike
80000
20
?
select E.name, E.salary, D.no from Employee E,
Department D where E.salary lt 100000 and D.name
'Shoe' and E.dept D.dno
5
  • Relational DBMS "Goodies"
  • Relational query processing
  • Queries range over tables and/or views
  • Programmers use a declarative language (SQL)
  • Query optimizer picks the lowest-cost query plan
  • Alternative access paths, join orders, join
    methods, and so on (based on indices and database
    characteristics)
  • Result data independence
  • Support for (shared) business logic
  • Integrity constraints
  • Check constraints, referential integrity
    constraints
  • Triggers, stored procedures, views, authorization
  • Performance and robustness
  • Buffering, locking, crash recovery, replication,
    ...

6
  • We've Achieved Nirvana ... Right?
  • Relations are surely the answer!
  • Simple, high-level model for programmers
  • Easy to distribute data and parallelize queries
  • But what was the question?
  • Sometimes difficult to model "real world" data
  • Entities and relationships (versus tables)
  • Variance among entities (versus homogeneity)
  • Set-valued attributes (versus normalization)
  • Demanding new database applications
  • New applications bring new data types
  • Complex objects are problematic
  • "A relational database is like a garage which
    forces you to take your car apart and store the
    pieces in little drawers..."

7
  • What are the Options?
  • Throw in the towel
  • OOPL your favorite file system
  • Object-oriented DBMS
  • Tightly integrated OOPL w/built-in DBMS
  • Object-oriented client wrapper
  • Loosely integrated OOPL relational DBMS
  • Object-relational DBMS
  • Newly integrated Relational model OO features

Which solution is the "right" one...?
8
  • Let's Examine the Problem Space
  • Stonebraker's 4-quadrant model

Complex
Simple
Complex
9
  • The Object-Oriented DBMS Revolution
  • Motivated by new database applications, e.g.
  • Computer-aided engineering
  • Document management
  • Geographic data management
  • Engineering applications were early drivers
  • Complex data structures ("pointer spaghetti")
  • Navigational data access required
  • Tight coupling between applications and data
  • Version management support needed
  • Approach OOPL DBMS OO-DBMS
  • Commonly based on C or Smalltalk
  • Persistence, collections, versions, queries, ...

10
  • No OO "Ted Codd" Stepped Forward
  • Object-Oriented Database System Manifesto
  • Mandatory features
  • Complex objects, identity, encapsulation
  • Inheritance w/substitutability and late binding
  • Computationally complete methods
  • Extensible type system, persistence
  • Secondary storage, concurrency and recovery
  • Ad hoc queries
  • Optional features
  • Multiple inheritance, static type checking
  • Distribution, long transactions, versions
  • Individual choices
  • Programming paradigm/language
  • Details and uniformity of object model

11
  • OO-DBMS Technology Today
  • Lots of research results
  • Object data models and features
  • OO query languages and processing techniques
  • Client-server architectures and performance
  • Significant commercial progress
  • Important and innovative systems
  • E.g., O2, ObjectStore, ODE
  • Quite a few commercial product offerings
  • GemStone, Objectivity, ObjectStore, Ontos, O2,
    Matisse, Poet, Versant, others
  • The ODMG-93 standard (release 2.0)
  • Consortium of OO-DBMS startups
  • Three key parts ODL, OQL, C binding

12
  • But the Revolution "Failed" (0B)
  • Lingering OO-DBMS differences
  • Query power, API details, implementation twists
  • Piecewise ODMG standard conformance (ex OQL!)
  • Still behind R-DBMSs in important ways
  • Codasyl-like schema compilation cycle
  • Schema evolution painful, if supported
  • Typically missing many useful "goodies"
  • Support for multiple application languages
  • Query optimization, views, authorization,
    constraints, triggers, multi-user scalability and
    robustness, ...
  • Other factors (niche market)
  • SQL-based application building tools
  • Architecturally biased towards "fat clients"

13
  • OO Client Wrappers are the Answer...
  • Available from a number of vendors
  • Persistence Software, Ontologic, HP, Next, ...
  • Language-specific relational wrappers
  • Proxy classes for C or Smalltalk (or Java)
  • Mapping of row data into language objects
  • Client-side (or middle-tier) object caching and
    method execution
  • Why is this approach attractive?
  • Can develop OO applications today, against
    existing enterprise data, for "business objects"

14
  • ...Not!
  • Paradigm mismatch for querying
  • C or Smalltalk for simple business logic
    and navigation, against object-oriented schema
  • SQL for queries, against relational schema
  • Choice forced for business logic rules
  • Do on server, using DBMS facilities?
  • Check constraints, referential integrity
    constraints, triggers, stored procedures,
    authorization
  • Do on client, using OO wrapper facilities?
  • C or Smalltalk (or Java) programming
  • This had better be a stop-gap solution
  • R-DBMS could become a storage manager, throwing
    away 20 years of successful RD!

15
  • The Object-Relational DBMS Evolution
  • Third Generation Database System Manifesto
  • Support rich object structures and rules
  • Rich type system, inheritance, encapsulation
  • Functions, optional unique ids, rules/trigggers
  • Subsume second generation database systems
  • High-level query-oriented interface
  • Stored and virtual collections
  • Updatable views
  • Data model/performance feature separation
  • Open to other subsystems (tools, middleware)
  • Accessible from multiple languages
  • Layered persistence-oriented language bindings
  • SQL support ("intergalactic dataspeak")
  • Query-shipping architecture

16
  • "Not Your Father's Employee Type"
  • Beyond name, rank, and serial number
  • Several new attribute types
  • Location (2-d point), job description (text),
    photo (image), ...
  • Associated functions
  • Distance(point, point), contains(text, string),
    ...
  • Beyond your basic employee record
  • Employees come in different flavors
  • Emp, RSM, Programmer, Manager, Temp, ...
  • Employees have many known relationships
  • Manager, department, projects, ...
  • Employees have behavior
  • Age(Emp), qualified(Emp, Job), hire(Emp), ...

An Employee is a "business object"
17
  • Two Flavors of O-R Object Extensions
  • Object extension 1 Abstract data types (ADTs)
  • New column types and functions
  • E.g.,text, image, audio, video, time series,
    point, line, OLE...
  • For modeling new kinds of facts about enterprise
    entities
  • Object extension 2 Row types
  • Types and functions for rows of tables
  • Includes inheritance, references, set-valued
    attributes
  • For modeling business objects with relationships
    behavior
  • Impact on schemas and query language SQL3
  • Schemas tables at the top, OO richness within
  • Queries extensions to support the added richness
  • Structured types support both ADT and row type
    object modeling needs (unified type system)

18
  • ADTs (Black Box)
  • To define and use a "black box ADT", a user will
  • Implement its internal structure and functions in
    an external programming language (e.g., C/C,
    Java)
  • Use the DDL to register the type with the DBMS
  • Size of an instance of the type
  • Input (constructor) and output functions
  • Other functions and operators, including
    signatures and linkable implementations
  • Costs and other properties for query optimizer
  • Use the new type like a built-in data type
  • Now available for defining columns of tables
  • Functions and operators become available in
    queries

19
  • Example Illustra Black Box ADT
  • Point as a "black box ADT" (written in C)

create type Point ( internallength 16
-- typedef struct double x, double y
point input point_in -- for
reading in Point constants output point_out
-- for displaying Point
results ) create function point_in(Text)
returns Point as external name
'MI_HOME/functions/point.so' language C create
function point_out(Point) returns Text
as external name 'MI_HOME/functions/point.so' la
nguage C
20
  • Example Illustra Black Box ADT (cont.)
  • Now we can put an end to "Pointless" queries...!

create function further_west(Point, Point)
returns Boolean as external name
'MI_HOME/functions/pointfuns.so' language
C select E1.name, E1.location from Emp E1, Emp
E2 where further_west(E1.location, E2.location)
and E2.name 'Mike' create binary operator
binding to further_west select E1.name,
E1.location from Emp E1, Emp E2 where E1.location
gtgt E2.location and E2.name 'Mike'
21
  • ADTs (White Box)
  • To define and use a "white box ADT", a user will
  • Describe its internal structure using SQL3 DDL
  • Attribute definitions are column-like
  • Advantages heterogeneity, nulls, nesting,
    constraints, ...
  • Implement its functions either directly in SQL or
    in his/her favorite external programming language
  • Utilize system-generated accessors and mutators
  • Finish explaining the type to the DBMS using DDL
  • For query optimizer, as before
  • Use the new type like a built-in data type
  • In tables and queries, as before
  • Note this is just a SQL3 structured type
    definition that's primarily intended for use in
    columns

22
  • Example DB2 UDB/OSF White Box ADT
  • Point as a "white box ADT" (written in SQL3)

create type Point as ( x double, y
double, ) create function distance(p1 Point, p2
Point) returns Point language SQL inline not
variant return sqrt((p2..y-p1..y)(p2..y-p1..y)
(p2..x-p1..x)(p2..x-p1..x)) select
E.name from Emp E, City C where C.name 'San
Jose' and distance(E.location, C.center) lt 25
23
  • Of Extenders, Blades, and Cartridges
  • High performance demands "deep" integration
  • Optimizer must know about an ADT operator's...
  • Execution cost (especially for expensive
    functions)
  • Logical properties (e.g., transitivity, negator,
    ...)
  • Selectivity estimates (i.e., filtering/matching
    power)
  • Relationship to access methods (both old and new)
  • DBMS runtime must invoke functions efficiently
  • Static vs. dynamic loading, fenced vs. unfenced
    execution
  • Partnerships and third-party packages
  • E.g., DB2's text, image, and spatial extenders
  • Package contains types, functions, access
    methods, optimizer information, and SQL DDL
    statements for all of the above

24
  • Row Types
  • To define and use a "row type", a user will
  • Create the desired structured type using SQL3 DDL
  • Columns, plus (optional) specification of a
    supertype
  • Create functions/methods involving the type
  • Arguments of the new type, w/overloading in the
    case of methods
  • Create one or more tables of the indicated type
  • Type hierarchy (if any) yields corresponding
    table hierarchies

25
  • Example SQL3 Row Types (plus Sets...)
  • Employees are people, so ...

create type Emp_t under Person_t as ( salary
Float, job_description Varchar(100), department
ref(Dept), projects set(ref(Project) ) create
table IBM_Emps of Emp_t under IBM_People (...)
create type Person_t as( name Varchar(20), birth
date Date) method age( ) returns Integer
language SQL create method age( ) for
Person_t return year(current date) -
year(birthdate) create table IBM_People of
Person_t (ref is self) (Note this is
approximate SQL3 syntax)
26
  • Queries Over Row Types
  • SQL's query constructs, extended with the ability
    to access these features (a la SQL3 plus sets)
  • User-defined functions in queries (w/late method
    binding)
  • Dereferencing of references (path expressions)
  • Queries over nested collections (table
    expressions)
  • For example, find unexplainable discrepencies
    between employees' and managers' salaries

select E.name, E.manager-gtname,
display(E.photo) from IBM_Emps E where E.salary gt
E.department-gtmanager-gtsalary and
E.department-gtmanager-gtage( ) gt E.self-gtage(
) and not contains(E.job_description, "Java")
27
  • Other OR-Related Features
  • Support for large objects
  • Multimedia data types aren't small (e.g., video)
  • Special handling required for efficiency
  • Minimal copying, piecewise retrieval, optional
    logging, movement to/from files, separate storage
    area from other attributes
  • DB2 has blob, clob, and dbclob types (up to 2GB)
  • Support for active data (triggers and
    constraints)
  • Ex

create trigger me_too after insert on
IBM_Emps referencing new as newemp foreach row
mode db2sql when salary gt department-gtmanager-gtsal
ary begin atomic set newemp.department-gtmanager-gt
salary newemp.salary end
28
  • OR-DBMS Technology Status
  • Many OR-DBMS research results
  • Postgres, EXODUS, Starburst, ...
  • OODB query processing research
  • Commercial systems exist today
  • IBM DB2 CS (V2.1) and CA-Ingres
  • User-defined types functions, large objects,
    triggers
  • Illustra, UniSQL/X
  • Early providers of ADTs, row objects, inheritance
  • IBM DB2 UDB, Informix, Oracle
  • "Universal server" products contain subsets of
    all this stuff
  • Standards right around the corner
  • SQL3 is "hardening" and has an object part with
    structured types, table hierachies, user-defined
    functions and methods, object views, ....

29
  • Some OR-DBMS Performance Issues
  • Bucky OR-DBMS benchmark from UW-Madison
  • Based on a hypothetical university schema
  • Exercised a range of OR-DBMS features
  • Row types, inheritance, late binding, subtables
  • Queries involving path expressions and/or sets
  • ADTs (black or white box) and functions
  • In Proc. 1997 ACM SIGMOD Conference
  • Tested a first-generation OR-DBMS product
  • OR versus relational simulation, same DB engine
  • Showed benefits of (complex) ADTs, indexes on
    functions
  • Indicated areas where query optimization needs
    schema support scope for path expressions,
    inverse relationships
  • Turned up bugs and performance problems (e.g.,
    sets)

30
OR Enterprise Scenerio (w/Challenges)
  • Object-relational server managing the database
  • ADTs w/inheritance and multi-language support
  • Row types, integrated with all of SQL (OO views,
    authorization, triggers, constraints, etc.)
  • High-function, OO, caching front-ends
  • Support for desktop and middle-tier (web!)
    applications
  • OR object model at all levels, for queries and
    navigation
  • Clean bindings for OOPLs (Java, C, Smalltalk)
  • Methods/queries running on client or server
  • Likewise for triggers and constraints
  • Business rules specified implemented once!
  • In SQL ( OOPL), running where appropriate

31
  • Multi-Tier Integration Challenges
  • Good mappings and interfaces to provide
    object-relational objects to OOPLs
  • Java, C, Smalltalk, others
  • Full query support in addition to navigation
  • Challenges in querying and caching
  • Intelligent querying over cache database
  • Correct and efficient caching of view objects
  • Update-related challenges
  • Triggers and constraints of all types
  • View objects (both directions)
  • Method execution on client or server
  • Java should be very useful here

32
  • Legacy Data Access Challenges
  • Some data will live outside the OR-DBMS
  • Older DBMSs (both relational pre-relational)
  • Specialized data stores (documents, images, ...)
  • Applications (i.e., legacy transactions)
  • Object-relational middleware is the answer!
  • Table functions can handle simple cases now
  • Distributed OR query engine (a la DataJoiner) can
    mediate between new applications and legacy data
  • Resulting appearance is that of an integrated OR
    database, accessible via SQL3 APIs and OO tools

33
  • Front-End Integration Legacy Data Access

34
  • Conclusions
  • Relational DBMS era 1980's, early 1990's.
  • Significantly raised the levels of abstraction
    productivity
  • Only "real" parallel computing success story to
    date, too!
  • Object DBMS era Should have been early 1990's...
  • Never made it out of the (mainstream) starting
    gate
  • Object-relational DBMS era You are there!
  • Object enhancements to relational DBMSs
  • ADTs (white box, black box) and functions
  • Row types with inheritance, references, sets, ...
  • Vastly reduces the "impedence mismatch" w/OOPLs
  • Today's OO wrappers are an interim solution
  • Possibilities abound for nice OO/OR tools
  • Will have OR middleware as well as engines
Write a Comment
User Comments (0)
About PowerShow.com