Title: SBA (Stack-Based Approach) and SBQL (Stack-Based Query Language)
1SBA (Stack-Based Approach) and SBQL
(Stack-Based Query Language)
Presentation prepared for the OMG Object Database Technology Working Group
OMG Technical Meeting, Anaheim, CA, USA, September 25-29, 2006
by Prof. Kazimierz Subieta
Polish-Japanese Institute of Information Technology, Warsaw, Poland
subieta_at_pjwstk.edu.pl
http://www.ipipan.waw.pl/subieta
SBA/SBQL pages: http://www.sbql.pl
2Agenda
- Motivation for SBA and SBQL, short history
- Major topics and architectural issues
- Syntax, semantics and pragmatics of QLs
- Foundation of abstract implementation of SBQL
  - Abstract syntax, state, environment stack, query result stack, semantic rules
- M0 store model: objects and pointer links, SBQL for M0
- M1 store model: classes and static inheritance, SBQL for M1
- M2 store model: roles and dynamic inheritance, SBQL for M2
- Imperative constructs, procedures, functions and methods
- SBQL virtual updatable object views
- SBQL strong typing system
- Query optimization for SBQL
3What is SBA and SBQL?
- SBA is a conceptual framework for developing O-O database query/programming languages.
- SBQL is a model query language according to SBA.
  - It has the same role and meaning as object algebras, but it is formally sound and much more universal.
- SBA/SBQL deal with various data models and all imaginable and reasonable query constructs.
- Abstract implementation is the basic paradigm of formal specification of semantics.
4Why the Stack-Based Approach?
- Main motivations:
  - Inadequacy of the current database theories to practice
  - Lack of a clean and sound approach to QL semantics ⇒ too challenging query optimization, strong typing, database views, etc.
  - Chaotic design of current QLs, non-orthogonality, sticking independent operators into big syntactic constructs
  - Annoying false stereotypes (e.g. contradiction between encapsulation and query languages), wishful thinking on advantages or disadvantages of particular models
- Main conclusion: query languages are programming languages.
  - They should be developed according to the same methods and principles.
  - Attempts to establish a clear borderline between querying and programming have failed.
- We abandon database theories such as the relational algebra, relational calculus, their object-oriented counterparts, F-logic, etc.
5Short history of SBA/SBQL
- 1989: first implementation (NETUL, an expert system shell)
- 1990: advanced prototype implementation (LOQIS)
- 2002: SBQL for XML DOM
- 2003: YAOD, a prototype OODBMS
- 2004: SBQL for the European project ICONS (commercialized)
- 2004: SBQL prototype for Objectivity/DB
- 2004: Book on SBA/SBQL, 522 pages
- 2005: BPQL for OfficeObjects Workflows (commercial)
- 2006 (pending): SBQL for the European project eGov Bus
- 2006 (planned): SBQL for the European project VIDE
- 6 finished PhDs, 7 pending PhDs, many MSc, many papers
6Major topics that SBA deals with (1)
- General architecture of query processing
- Abstract models of object stores
- Syntax, semantics and pragmatics of query languages
- SBQL algebraic and non-algebraic operators: syntax and semantics
- Classes, methods and static inheritance in query languages
- Dynamic object roles and dynamic inheritance in query languages
- Processing of irregular data structures (semi-structured data)
- Transitive closures and fixed-point equations in SBQL
- Extension of SBQL with imperative (updating) constructs
- Procedures, functions and methods in SBQL
7Major topics that SBA deals with (2)
- Parameter passing for procedures / functions / methods
- Encapsulation in SBQL
- Virtual updatable views for SBQL
- Types, interfaces, schemas and metamodels
- Static (semi-) strong type checking of SBQL queries and programs
- Query optimization (rewriting, indices, caching, ...)
- Query processing and optimization in distributed systems
- Data-intensive grids and P2P networks: integration of distributed, heterogeneous, fragmented and redundant resources
- Aspect-oriented databases
- SBQL in OMG MDA and executable UML
8General architecture of query processing
- Actually, we do not fix the architecture.
  - It can be similar to SQL (server-side processing of queries, the ODBC, ADO or JDBC style).
  - It can be similar to the ODMG architecture (queries as strings embedded in a popular programming language, e.g. Java or C++).
  - It can be similar to Oracle PL/SQL (programs integrated with queries, performed on the client side).
- Our goal is to shift as much query processing and optimization as possible to the client side (in contrast to SQL).
  - Lower workload for the server ⇒ better overall performance
  - More flexible for query optimization (e.g. parallel execution on many servers, possibility to optimize queries referencing local objects)
9Internal architecture of the query processor
- Classical run-time architecture of popular programming languages, with necessary modifications and generalizations
- In contrast to PLs, we separate ENVS from the object store.

[Figure: query/program operators use the environment (call) stack ENVS for binding and the query result stack QRES for query evaluation. The stacks hold references (OIDs) and values of objects from the client object store (volatile, non-shared objects) and the server object store (persistent, shared objects).]
10Detailed client-server architecture
[Figure: detailed client-server architecture. Components shown: software development environment (editor, debugger, etc.); client; parser of queries and programs; syntactic tree of a query/program; optimization by rewriting; optimization by indices; interpreter of queries and programs; strong type checker; ENVS; static ENVS; QRES; static QRES; volatile (non-shared) objects; local metabase; network; register of indices; register of views; object manager; server; metabase of persistent objects; processing of persistent abstractions (views, stored procedures, triggers); administration; transactions; persistent (shared) objects.]
11Data model
- SBA and SBQL are neutral to data models.
  - This is a firm and basic assumption.
  - Query languages address data structures rather than data models.
- All imaginable data models are to be acceptable, starting from the relational model, through XML/RDF models, up to very advanced object-oriented models with classes, roles, static and dynamic inheritance, encapsulation, polymorphism, etc.
  - I am not aware of any data model that cannot be served by SBQL.
- In any data model we are interested not in its ideological assumptions, advantages and disadvantages, but only in the abstract formal properties of the data structures that it implies.
  - All such structures are to be formally addressed by SBQL.
12Syntax and semantics of formal languages
- We all know what syntax is, including formal syntax.
- What is semantics, especially formal semantics?
- For query languages the specification of formal semantics is a must.
  - It is a strong guideline for implementation and standardization.
  - It is required by query optimization and strong typing.
- There are many approaches, especially mathematical ones, but none covers all the issues of QLs integrated with PLs.
- SBA is a universal formal framework for specifying the semantics of QLs/PLs.
  - "Formal" does not mean "mathematical".
  - "Formal" means expressed through a precise abstract implementation.
13Pragmatics of formal languages
- Pragmatics determines when, what for, and how to use the language, and the consequences of its use.
- Syntax and semantics are servants of pragmatics.
- Pragmatics is the major subject of language manuals.
- Pragmatics is explained by examples of use, the intuitive meaning of queries, descriptions of query results, good patterns, bad patterns, etc.
- Popular manuals and standards explain semantics through syntax, intuitive description and pragmatics.
  - For instance, the ODMG standard.
- However, pragmatics is a bad way to specify semantics.
  - It is unable to explain and specify all recursive dependencies and relationships between various language constructs, operators, data structures and results of language statements.
14Object model and database schema
- ... are inevitable parts of the pragmatics of a query language.
  - The application programmer must be aware of what the database contains and how it is organized, frequently before the database is filled in.
- Usually, an object model and a database schema language are presented at the beginning of a given specification, cf. ODMG.
  - The model involves such concepts as types, classes, interfaces, joined into a coherent whole as a schema language, cf. ODL.
- However, the concepts are difficult, especially types.
  - Introducing them at the beginning, without understanding the semantics of a query language, usually results in inconsistencies.
- Hence, we must first understand the semantics of a query language on the basis of an abstract object store model.
15Abstract syntax and syntax-driven semantics
- Concrete syntax usually involves a lot of syntactic sugar.
- Abstract syntax is free of syntactic sugar and ambiguities.
  - SQL, concrete syntax:
      select Name, DeptName from Employee, Dept where some_predicate
  - Abstract syntax:
      ((Employee × Dept) σ some_predicate) π (Name, DeptName)
- In computer languages semantics is syntax-driven.
  - First, the designers of a language define its syntactic rules.
  - Then, each syntactic rule is associated with a semantic rule.
  - To simplify the association, the syntax should be abstract.
  - Syntactic rules are recursive ⇒ semantic rules must be recursive too.
16SBA semantics of QL-s general point of view
- Let Query be the set of all syntactically correct queries.
- Let State be the set of all states (database states, but not only).
- Let Result be the set of all possible query results.
- The semantics of any query q ∈ Query is a function that maps State → Result.
  - If q has side effects, then it maps State → Result and State → State.
- In some theories, a state is a set of objects and a result is a set of objects too. This is known as the "closure property".
  - In SBA a state contains not only objects, and a result never contains objects.
  - The closure property is inconsistent; it is conceptual nonsense.
  - It leads to further nonsense, such as subdividing queries into "object preserving" and "object generating".
17Summing up what we need to define semantics?
- Abstract syntax of queries, domain Query. It is to be defined by a set of context-free rules.
- A formal and abstract concept of all possible states, domain State.
  - It is missing from the ODMG standard, thus the standard is not prepared to specify the formal semantics of OQL.
- A formal and abstract concept of all possible query results, domain Result.
- A formal (recursive) mapping of each context-free rule into a semantic rule, which maps each state into a result.
18Abstract implementation
- Actually, all formal approaches to query languages propose some method of specifying semantics,
  - e.g. the relational algebra, calculus, object algebras, etc.
  - However, as a rule they are very limited or inconsistent.
- In programming languages the best known are denotational semantics and operational semantics.
- The best is a variant of operational semantics that is referred to as abstract implementation.
  - The method is simple (but not simpler than necessary) and universal.
- In operational semantics we have to define a machine or a procedure that executes all the semantic rules.
- An abstract implementation can be easily mapped into a language interpreter written in our favorite programming language.
19What is State?
- Usually the concept is understood as object state or database state. We extend this concept considerably.
- A state includes all data or programming features that can influence the result of some (any) query, in particular:
  - Database state on the server side.
  - Local state (local objects used in queries) on the client side.
  - Global and local computer and software environment (e.g. date, time).
  - Available libraries, procedures, functions, classes, views, etc.
- A state also includes structures (invisible to the programmer) that determine the run-time environment of computations.
  - In SBA there is one such structure: the environment stack (ENVS).
- In SBA a state consists of two elements:
  - state = object store + environment stack
20Is ENVS purely implementation notion?
- No. The environment stack is a conceptual notion.
  - Without ENVS, formal specification of the semantics of QLs and PLs would be impossible or limited.
  - Without ENVS it is impossible to explain formally and precisely the mechanisms of classes, roles, static and dynamic inheritance, etc.
  - ENVS makes it possible to explain precisely (recursive) procedures and methods, methods of parameter passing, database views, etc.
- In SBA we present ENVS on an abstract level. We are not interested in its physical implementation.
  - Implementations can differ and introduce many optimizations.
  - Usually ENVS is a client-side data structure stored in main memory.
- The main roles of ENVS: determining scopes for names and binding names occurring in queries.
21What is Result?
- Almost any value that can be stored in the object store or computed from other values can be the result of a query.
  - For instance, the query 2+2 returns 4.
- Multimedia values of many megabytes cannot be returned as query results.
  - In such cases a query returns a reference to such a value (e.g. a file name) rather than the value itself.
  - Reference is a fundamental concept of SBA.
- Queries can return complex data structures which include values, references, names, structure constructors and collection constructors.
- In SBA queries never return objects, but references to objects (OIDs), perhaps within some complex structures.
  - Objects are stored within the object store only.
22Query result stack, QRES
- For nested queries, temporary and final results are accumulated on the query result stack (abbreviated QRES).
  - This is a fairly simple notion, known from many student textbooks.
- QRES is a client-side structure usually stored in main memory.
- QRES must be prepared to store any query result (including nested collections, arrays, etc.) in a single section.
- QRES is not a component of State,
  - because the result of a new query does not depend on the previous QRES state.
- In denotational semantics this notion is not necessary (it is hidden within recursive function calls).
- In abstract implementation a precise specification of the QRES mechanism is fundamental.
  - Thus it is introduced in SBA explicitly, on the abstract level.
23Example of QRES state
top     bag{ struct{ n("Doe"), s(i9) },
             struct{ n("Poe"), s(i14) },
             struct{ n("Lee"), s(i18) } }      <- the only visible stack section
        struct{ x(i61), y(i93) }               <- invisible stack section
        i17                                    <- invisible stack section
bottom  15                                     <- invisible stack section
24What is a semantic rule?
- In the operational semantics method we have to define a machine that associates each syntactic rule with a semantic rule, in the form of actions of the machine.
- We define this machine as a recursive procedure eval taking as its argument a query q written in an abstract syntax.
- The procedure eval evaluates a query and returns its result.
- Inside the procedure we have syntactic cases and, for each case, actions of the machine.
  - A case is just a semantic rule subordinated to the corresponding syntactic rule.
- Next is a skeleton of the procedure eval in the form of self-explanatory pseudo-code.
25Procedure eval the idea of semantic rules
procedure eval( q : query )
  parse( q )    // the parser recognizes top-level subqueries in the query q
  case q is recognized as literal l:     // Rule 1: a query consisting of a single literal, e.g. 2
    push the value denoted by l on QRES
  case q is recognized as name n:        // Rule 2: a query being a single name n, e.g. Person
    bind the name n on ENVS
    push the result of the binding on QRES
  case q is recognized as Δ q1:          // Rule 3: q consists of a unary operator Δ applied to q1
    eval( q1 )
    apply the operator denoted by Δ to the result returned by q1 on QRES
    push the result of the application on QRES
  case q is recognized as q1 Δ q2:       // Rule 4: q involves a binary algebraic operator Δ
    eval( q1 )
    eval( q2 )
    apply the operator denoted by Δ to the results returned by q1 and q2 on QRES
    push the result of the application on QRES
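The four rules of the skeleton can be tried out with a small Python sketch. Everything here is an illustrative assumption rather than part of SBQL itself: the node classes `Literal`, `Name`, `Unary` and `Binary`, the operator tables, and the simplification that each ENVS section is a dictionary mapping a name directly to a value.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


# Hypothetical abstract syntax: one node class per syntactic rule.
@dataclass
class Literal:
    value: Any


@dataclass
class Name:
    name: str


@dataclass
class Unary:
    op: str
    q1: Any


@dataclass
class Binary:
    op: str
    q1: Any
    q2: Any


# Illustrative operator tables; any operator could be registered here.
UNARY: Dict[str, Callable] = {"-": lambda a: -a, "not": lambda a: not a}
BINARY: Dict[str, Callable] = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    ">": lambda a, b: a > b,
}


def eval_query(q: Any, envs: List[dict], qres: list) -> None:
    """Evaluate q and leave exactly one new result on top of QRES."""
    if isinstance(q, Literal):                # Rule 1
        qres.append(q.value)
    elif isinstance(q, Name):                 # Rule 2: search ENVS from the top
        for section in reversed(envs):
            if q.name in section:
                qres.append(section[q.name])
                return
        raise NameError(q.name)
    elif isinstance(q, Unary):                # Rule 3
        eval_query(q.q1, envs, qres)
        qres.append(UNARY[q.op](qres.pop()))
    elif isinstance(q, Binary):               # Rule 4
        eval_query(q.q1, envs, qres)
        eval_query(q.q2, envs, qres)
        right, left = qres.pop(), qres.pop()  # results of q2 and q1
        qres.append(BINARY[q.op](left, right))
    else:
        raise TypeError(f"unknown query node: {q!r}")


envs = [{"x": 5}]
qres: list = []
# 2 + (x * (-3)) with x bound to 5
eval_query(Binary("+", Literal(2), Binary("*", Name("x"), Unary("-", Literal(3)))), envs, qres)
print(qres)  # [-13]
```

The recursion of `eval_query` mirrors the recursion of the syntactic rules, and QRES holds only the intermediate results that are still needed.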
26The compositionality principle
- ... requires that the semantics of a compound statement is a function of the semantics of its components.
- For instance, if we have a compound query q = q1 Δ q2, then the semantics of q, ⟦q⟧, is the result of some function funΔ( ⟦q1⟧, ⟦q2⟧ ).
  - The function funΔ depends on the operator Δ.
- Compositionality allows for orthogonal combination of operators and unlimited recursive nesting of queries.
  - The semantics of a complex query is built recursively from the semantics of its components.
- Compositionality works best when syntactic rules are as short as possible.
- Good compositionality ⇒ easier implementation, shorter manuals.
  - SQL, OQL: poor compositionality (big, heavy syntactic monsters).
- In SBA compositionality concerns all constructs of queries, imperative statements, programming abstractions, etc.
27Total internal identification
- Each database or program entity that could be separately retrieved, updated, inserted, deleted, authorized, indexed, protected or locked should possess a unique internal identifier.
  - A unique internal identifier should be assigned not only to objects on the top hierarchy level, but to all sub-objects, including atomic ones.
  - We are not interested in the form, structure and meaning of internal identifiers.
- The principle makes it possible to create references and pointers to all possible entities, and thus to avoid conceptual problems with binding, scoping, updating, deleting, parameter passing, and other functionalities that require references as query primitives.
- ODMG does not follow the idea.
  - ODMG literals (components of objects) have no identifiers.
  - Thus, e.g., it is impossible to extend OQL with updating constructs.
28Object relativism
- If some object O1 can be defined, then an object O2 having O1 as a component can also be defined.
  - There are no limitations on the number of hierarchy levels of objects.
- Objects on any hierarchy level should be treated uniformly.
  - An atomic object (having no attributes) should be allowed as a regular data structure.
- Object relativism implies the relativism of corresponding query capabilities.
- There is no need for attributes, sub-attributes, etc.; all are objects too.
- The idea radically reduces the database model and cuts the size of the specification of query languages, the size of the implementation, and the size of the documentation.
- It strongly supports query optimization and strong typing.
29Abstract Object Store Models
- A component of State is an object store.
- To define the semantics of a query language we have to define an object store precisely, but on an abstract level.
- Because various object models introduce a lot of incompatible notions, SBA assumes a family of object store models, enumerated M0, M1, M2 and M3.
  - M0 covers relational, nested-relational and XML-oriented databases. M0 assumes hierarchical objects and binary links between objects.
  - Advanced store models introduce classes and static inheritance (M1), object roles and dynamic inheritance (M2), and encapsulation (M3).
- All the models are served by SBQL.
- These store models are pivots: they can be extended and modified, depending on the features that one would like to cover.
30Notions common to store models
- Internal object identifier (OID)
  - Uniquely identifies an object in the store.
  - Assigned automatically, no external meaning.
  - Used as a reference or a pointer to an object.
- External object name
  - Usually bears some external semantics of an object, e.g. Person, Customer.
  - Explicitly assigned by a database designer, programmer, etc.
  - It is usually not unique, e.g. many objects named Person.
- Atomic object value
  - Cannot be subdivided into smaller parts.
  - E.g. 2, 3.14, "Doe", "Hello, World!".
  - The size is not constrained: from 1 bit to gigabytes.
- So far we are not interested in types (I'll return to this issue later).
31M0 Complex Objects and Pointer Links
I - a set of internal identifiers
N - a set of external names
V - a set of atomic values

<i, n, v> - atomic object
<i1, n, i2> - pointer object
<i, n, T> - complex object, where T is a set of objects
R ⊆ I - start identifiers

An object is a triple <i, n, f>: i is the object ID, n is the object name, f is the object value.

- There are no record, tuple, array, set, etc. constructors in the model: essentially all of them are collections of objects.
- External names are not unique, which models collections (bags).
- Uniform treatment of relational, nested-relational, etc. databases.
32M0 object store - example
Objects:
<i9, Emp, { <i10, name, "Lee">,
            <i11, sal, 900>,
            <i12, address, { <i13, city, "Rome">,
                             <i14, street, "Boogie">,
                             <i15, house, 13> } >,
            <i16, worksIn, i22> } >
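As a rough illustration of how M0 triples might be held in memory, the following Python sketch encodes the Emp object above. The class names `Oid` and `Obj`, the helper `index`, and the use of a list (rather than a set) for sub-objects are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Oid:
    """Internal identifier; an Oid used as an object value marks a pointer object."""
    id: str


@dataclass
class Obj:
    """An M0 triple <i, n, f>: identifier, external name, value."""
    oid: Oid
    name: str
    value: Union[int, float, str, Oid, List["Obj"]]

    def kind(self) -> str:
        if isinstance(self.value, list):
            return "complex"
        if isinstance(self.value, Oid):
            return "pointer"
        return "atomic"


def index(obj: Obj, store: Dict[str, Obj]) -> Dict[str, Obj]:
    """Register obj and all its sub-objects, so every level has its own identifier."""
    store[obj.oid.id] = obj
    if isinstance(obj.value, list):
        for sub in obj.value:
            index(sub, store)
    return store


emp = Obj(Oid("i9"), "Emp", [
    Obj(Oid("i10"), "name", "Lee"),
    Obj(Oid("i11"), "sal", 900),
    Obj(Oid("i12"), "address", [
        Obj(Oid("i13"), "city", "Rome"),
        Obj(Oid("i14"), "street", "Boogie"),
        Obj(Oid("i15"), "house", 13),
    ]),
    Obj(Oid("i16"), "worksIn", Oid("i22")),
])
store = index(emp, {})
print(store["i12"].kind(), store["i16"].kind(), store["i15"].value)  # complex pointer 13
```

Because `index` registers sub-objects at every level, atomic sub-objects such as `i15` are addressable in exactly the same way as the top-level object.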
33M0 object store graphical view
[Figure: graphical view of the M0 store. Objects and sub-objects shown:]
i1 Emp:   i2 name "Doe", i3 sal 2500, i4 worksIn
i5 Emp:   i6 name "Poe", i7 sal 2000, i8 worksIn
i9 Emp:   i10 name "Lee", i11 sal 900, i12 address (i13 city "Rome", i14 street "Boogie", i15 house 13), i16 worksIn
i17 Dept: i18 dname "Trade", i19 loc "Paris", i20 loc "Rome", i21 employs
i22 Dept: i23 dname "Ads", i24 loc "Rome", i25 employs, i26 employs
34A relational database in M0
Relational schema: Emp( name, sal, worksIn )

Relation Emp in model M0:
Objects:
<i1, Emp, { <i2, name, "Doe">, <i3, sal, 2500>, <i4, worksIn, "Production"> } >,
<i5, Emp, { <i6, name, "Poe">, <i7, sal, 2000>, <i8, worksIn, "Sales"> } >,
<i9, Emp, { <i10, name, "Lee">, <i11, sal, 2000>, <i12, worksIn, "Sales"> } >
Start identifiers: { i1, i5, i9 }

- A similar mapping can be applied to hierarchical DBs, nested-relational DBs, XML, RDF, ...
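The mapping from a relation to M0 objects is mechanical, as the following Python sketch suggests. The function name `relation_to_m0`, the tuple encoding of objects and the sequential `iN` identifier scheme are assumptions chosen to reproduce the numbering used on this slide.

```python
from itertools import count


def relation_to_m0(relation_name, rows, first_id=1):
    """Map rows (dicts: column -> value) to M0 triples (oid, name, value).

    Returns the list of tuple objects and the list of start identifiers.
    """
    ids = count(first_id)
    objects, start = [], []
    for row in rows:
        oid = f"i{next(ids)}"  # identifier of the tuple object itself
        # Each column value becomes an atomic sub-object with its own identifier.
        subobjects = [(f"i{next(ids)}", column, value) for column, value in row.items()]
        objects.append((oid, relation_name, subobjects))
        start.append(oid)
    return objects, start


rows = [
    {"name": "Doe", "sal": 2500, "worksIn": "Production"},
    {"name": "Poe", "sal": 2000, "worksIn": "Sales"},
    {"name": "Lee", "sal": 2000, "worksIn": "Sales"},
]
objects, start = relation_to_m0("Emp", rows)
print(start)       # ['i1', 'i5', 'i9']
print(objects[1])  # ('i5', 'Emp', [('i6', 'name', 'Poe'), ('i7', 'sal', 2000), ('i8', 'worksIn', 'Sales')])
```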
35Environment Stack, ENVS
- ENVS is also known as the call stack.
- For query processing we have modified and generalized it:
  - ENVS is used for binding objects that are stored at a server, hence ENVS contains references to objects rather than object values.
  - The same object can be referenced from different stack sections.
  - For collections the binding is macroscopic: for instance, if Emp is bound, the binding returns many references.
- In PLs the stack usually has two incarnations: static (compile time) and dynamic (run time).
  - Because database objects are always dynamically bound, some properties of a static stack must be shifted to a dynamic stack.
  - We will return to the static stack when we consider strong typing.
- Besides the classical roles of the stack, in SBA it plays many new roles, in particular processing non-algebraic operators.
36PLs What ENVS is for?
- Abstraction and encapsulation: local properties of a procedure are hidden from the programmers who use it. The procedure is seen only through its interface (signature).
- Isolation: programmers writing different procedures need not know about each other.
- Semantic independence and reuse: a procedure can be invoked from many places in an application.
- Unlimited invocations of procedures from other procedures, including recursive calls.
- Management of name spaces used in programs: no naming conflicts between local procedure environments.
- Implementation of parameter passing methods: call-by-value, call-by-reference, strict-call-by-value, etc.
37Naming, scoping, binding
- SBA is based on the naming, scoping and binding paradigm:
  - Every name occurring in a query is bound to run-time program or database entities, according to the actual scope for the name.
  - Binding is substituting a name occurring in a query with a run-time program entity (or entities).
- This concerns all names, in particular:
  - Names of persistent or volatile objects, subobjects (attributes), sub-subobjects, pointers, etc.
  - Names of procedures, functions, methods, views, parameters.
  - Names of entities from the computer or software environment.
  - Any auxiliary names that are defined and used in queries.
- ENVS provides a universal scoping and binding mechanism.
  - No name occurring in a query can be bound otherwise.
38New, important concept binder
- A binder is an internal structure used to determine (dynamic) bindings.
- A binder consists of two parts: an external name and an internal run-time program entity.
- A binder is a pair (n, r), where n belongs to N and r belongs to the domain Result (e.g. a reference to an object).
  - Such a pair is written n(r).
- Binders are the basis for binding names occurring in queries.
  - Roughly, if n(r) is present on ENVS, then the binding of n returns r.
  - If no binder n(r) is present on ENVS, then the binding of n fails.
- Binders play other important roles.
- Binders can be nested, for instance Emp( name(i2), sal(2500) ).
39ENVS in SBA
- It consists of sections. Each section is a set of binders. The stack grows and shrinks according to the nesting of query operators.

top (the most local data)
  ......
  Binders to local entities of the currently executed method
  ......
  name(i2) sal(i3) worksIn(i4)                    <- the section of the currently processed object
  Binders to global entities of the user session
  Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)     <- the database section
  Binders to entities of the global environment
bottom (the most global data)
40Binding through ENVS function bind
The order of visiting stack sections: from the top to the bottom.

top     Emp(i1) G("Mary") X(i221)
        ......
        name(i2) sal(i3) worksIn(i4)
        ......
        Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)
bottom  ......

bind( G ) = "Mary"    bind( X ) = i221    bind( sal ) = i3
bind( Emp ) = i1      bind( Dept ) = { i17, i22 }

- Searching proceeds from the top section to the bottom section.
- If a proper binder is found, the search terminates.
- All binders with the given name from the final section are taken.
- Some sections are omitted due to static scoping (as usual in PLs); the figure marks one such omitted section.
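The search rules above can be sketched in a few lines of Python. The representation is an assumption for illustration: ENVS is a list of sections with the bottom section first, each section is a list of (name, value) binders, and sections hidden by static scoping are taken to have been filtered out before the call.

```python
def bind(name, envs):
    """Return all values bound to `name` in the topmost section containing such a binder."""
    for section in reversed(envs):  # visit sections from the top to the bottom
        hits = [value for binder_name, value in section if binder_name == name]
        if hits:                    # the search stops at the first matching section
            return hits
    return []                       # no binder found: in this sketch, an empty result


envs = [
    [("Emp", "i1"), ("Emp", "i5"), ("Emp", "i9"), ("Dept", "i17"), ("Dept", "i22")],  # bottom
    [("name", "i2"), ("sal", "i3"), ("worksIn", "i4")],
    [("Emp", "i1"), ("G", "Mary"), ("X", "i221")],                                    # top
]
print(bind("Emp", envs))   # ['i1']
print(bind("Dept", envs))  # ['i17', 'i22']
print(bind("sal", envs))   # ['i3']
```

Note that `bind("Emp", envs)` returns only `i1`: the top section already contains a binder named Emp, so the database section is never reached, whereas `Dept` is found only in the database section and therefore yields both references.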
41Opening a new section of ENVS (1)
- In PLs, opening a new scope on ENVS is caused by entering a new procedure (function, method) or a new block.
  - Correspondingly, the scope is removed when control leaves the body of the procedure/block.
- To these classical situations we add a new one.
- It is the essence of SBA. The idea is that some query operators (called non-algebraic) behave on the stack similarly to program blocks.
- For instance, in the SBQL query
    Emp where ( name = "Poe" and sal > 1000 )
  the part ( name = "Poe" and sal > 1000 ) behaves as a program block executed in an environment consisting of the interior of an Emp object.
- Binding also concerns the names name and sal.
- Hence, we push on ENVS a section with the interior of the currently processed Emp object (next slide).
42Opening a new section of ENVS (2)
Query: Emp where ( name = "Poe" and sal > 1000 ); the part in parentheses is the condition.

Initial ENVS state:
  Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)
  bind( Emp ) = { i1, i5, i9 }

ENVS during evaluation of the condition for the third Emp object:
  name(i10) sal(i11) address(i12) worksIn(i16)     <- interior of the third Emp object
  Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)
  bind( name ) = i10    bind( sal ) = i11
43Function nested computing objects interior
- Function nested acts on an object reference and returns its interior as a set of binders. For instance, for the object
    i9 Emp: i10 name "Lee", i11 sal 900, i12 address (i13 city "Rome", i14 street "Boogie", i15 house 13), i16 worksIn
  we have
    nested( i9 ) = { name( i10 ), sal( i11 ), address( i12 ), worksIn( i16 ) }
- The result of nested is then pushed onto ENVS.
44Generalization of function nested
- In general, it can be applied to any element of Result.
- For a complex object <i, n, { <i1, n1, ...>, <i2, n2, ...>, ..., <ik, nk, ...> }> it holds that nested( i ) = { n1(i1), n2(i2), ..., nk(ik) }.
  - This case is illustrated on the previous slide.
- If i is an identifier of a pointer object <i, n, i1>, and the object store contains the object <i1, n1, ...>, then nested( i ) = { n1(i1) }.
  - This accomplishes navigation along a pointer.
- For a binder n(x) it holds that nested( n(x) ) = { n(x) }.
  - As will be shown, this semantics is consistent with the typical understanding of auxiliary names introduced in queries.
- For a structure, nested returns the union of the results of nested applied to the elements of the structure:
    nested( struct{ x1, x2, ... } ) = nested( x1 ) ∪ nested( x2 ) ∪ ...
- For other arguments nested returns the empty set.
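The case analysis for nested translates almost directly into code. The Python sketch below is illustrative only: the classes `Ref`, `Binder` and `Struct`, the dictionary layout of `STORE`, and the reduced Emp object (its address sub-object is left out for brevity) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A reference (OID)."""
    oid: str


@dataclass(frozen=True)
class Binder:
    name: str
    value: object


@dataclass(frozen=True)
class Struct:
    items: tuple


# Hypothetical store: oid -> (name, value). A value is atomic, a Ref (pointer
# object), or a list of sub-object identifiers (complex object).
STORE = {
    "i9": ("Emp", ["i10", "i11", "i16"]),
    "i10": ("name", "Lee"),
    "i11": ("sal", 900),
    "i16": ("worksIn", Ref("i22")),
    "i22": ("Dept", ["i23"]),
    "i23": ("dname", "Ads"),
}


def nested(x, store=STORE):
    """Return the interior of x as a set of binders."""
    if isinstance(x, Ref):
        _, value = store[x.oid]
        if isinstance(value, list):   # complex object: one binder per sub-object
            return {Binder(store[sub][0], Ref(sub)) for sub in value}
        if isinstance(value, Ref):    # pointer object: binder to the pointed object
            return {Binder(store[value.oid][0], value)}
        return set()                  # atomic object: empty interior
    if isinstance(x, Binder):         # a binder is its own interior
        return {x}
    if isinstance(x, Struct):         # union over the elements of the structure
        result = set()
        for item in x.items:
            result |= nested(item, store)
        return result
    return set()                      # any other argument


print(sorted(b.name for b in nested(Ref("i9"))))  # ['name', 'sal', 'worksIn']
print(nested(Ref("i16")))                         # {Binder(name='Dept', value=Ref(oid='i22'))}
```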
45Definition of Result for SBQL
- Any atomic value belongs to Result.
- Any reference (OID) belongs to Result.
- If x belongs to Result, then any binder n(x) belongs to Result.
- If x1, x2, x3, ... belong to Result, then struct{ x1, x2, x3, ... } belongs to Result.
  - The order of elements in a structure can be significant.
  - In contrast to typical structures, we do not assume that all elements of a structure must be named (elements need not be binders).
  - Implicitly, we assume that for a single element struct{ x1 } = x1.
  - Empty structures are not allowed.
- If x1, x2, x3, ... belong to Result, then bag{ x1, x2, x3, ... } and sequence{ x1, x2, x3, ... } belong to Result.
  - bag and sequence are collection constructors.
- Reminder: so far we are not dealing with types.
46Summing up what we have defined so far?
- We know precisely what an object store, an atomic object, a complex object, a pointer object and a collection are.
- We know precisely how an environment stack ENVS is constructed, what it is for, what binding is, and how a new section on the stack is constructed (binders, function nested).
  - Hence, we know precisely what a state is and how it behaves.
- We know precisely what a query result and the result stack QRES are.
- We understand the idea of abstract implementation in the form of the recursive procedure eval (evaluation of a query).
- Now we have all the semantic equipment needed to define SBQL and its abstract implementation for the M0 store model.
47SBQL atomic queries
- Syntax: Any literal l is an SBQL query.
  - E.g. 2   3.14   "Doe"   true
  - A literal l is an external (source code) representation of a value vl.
- Any name n is an SBQL query.
  - E.g. Emp   sal   worksIn   e   d
- Semantics:

procedure eval( q : query )
  ..
  case q is recognized as literal l:
    push( vl, QRES )
  case q is recognized as name n:
    push( bind(n), QRES )
  ..
48SBQL algebraic operators
- Algebraic operators do not use ENVS.
- Syntax: If Δ is a symbol denoting a unary algebraic operator and q1 is a query, then Δ q1 is a query.
- If Δ is a symbol denoting a binary algebraic operator and q1 and q2 are queries, then q1 Δ q2 is a query.
- Semantics:

procedure eval( q : query )
  ..
  case q is recognized as Δ q1:
    eval( q1 )
    apply the operator denoted by Δ to the top of QRES
  case q is recognized as q1 Δ q2:
    eval( q1 )
    eval( q2 )
    apply the operator denoted by Δ to the two top elements of QRES,
    pop QRES two times, then push the result of Δ on QRES
  ..
49Examples of algebraic operators
- A lot of them. We assume that SBQL accepts any operator that a designer wants to introduce.
- Unary algebraic operators:
  - count, sum, avg, max, median, -, log, sqrt, not, ...
- Binary algebraic operators:
  - Operators and comparisons for primitive types: +, -, *, /, =, >, <, and, or, concatenation of strings, ...
  - Structure constructor
  - Operators and comparisons on collections: sum of bags, equality of bags, intersect, contains, in, concatenation of sequences, ...
- Coercions (changing types or representations) and dereferencing
- ...
- There are many discussions and semantic details concerning particular kinds of operators. In this presentation I skip them.
50Auxiliary naming operators
- Syntax: If q is a query, then q as n and q group as n are queries.
- Semantics: Both operators are considered unary algebraic operators parameterized by a name.
- Operator as (changing bag/sequence elements into binders):
  - Let q return bag{ a1, a2, a3, ... }.
  - Then q as n returns bag{ n(a1), n(a2), n(a3), ... }.
  - Similarly for sequences and individual elements.
- Operator group as (naming a query result):
  - Let q return some result r.
  - Then q group as n returns a single binder n(r).
- These simple operators cover all the naming contexts in QLs:
  - Iteration variables in SQL and OQL
  - A variable bound by a quantifier
  - Naming virtual attributes in views
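The difference between as and group as is easy to see in code. In this Python sketch a bag is modelled as a list and a sequence as a tuple; these choices, and the names `Binder`, `op_as` and `op_group_as`, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binder:
    name: str
    value: object


def op_as(result, name):
    """q as n: name each element of a bag (list) or sequence (tuple) separately."""
    if isinstance(result, list):
        return [Binder(name, element) for element in result]
    if isinstance(result, tuple):
        return tuple(Binder(name, element) for element in result)
    return Binder(name, result)  # an individual element becomes one binder


def op_group_as(result, name):
    """q group as n: name the whole result with a single binder."""
    return Binder(name, result)


emps = ["i1", "i5", "i9"]         # stands for the bag returned by the query Emp
print(op_as(emps, "e"))           # three binders e(i1), e(i5), e(i9)
print(op_group_as(emps, "all"))   # one binder all(bag{ i1, i5, i9 })
```

With as, a later non-algebraic operator sees one binder per element and can use the name as an iteration variable; with group as, the name denotes the whole collection at once.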
51Examples of auxiliary naming in SBQL
- Navigational (dependent) join:
    ((Student as x join (x.takes.Lecture) as y join
      (y.taught_by.Professor) as z)
      where z.rank = "full professor") . (x.name, z.name)
- Quantifier:
    ∃ Emp as e (e.sal > 10000)
- Structure constructor:
    ( "Lee" as name, 900 as sal, ("Rome" as city, "Boogie" as street, 13 as house) as address ) as Emp
- Iteration variable:
    for each Emp as e do e.sal := e.sal + 100
52SBQL non-algebraic operators
- Non-algebraic operators use ENVS.
  - They cannot be reduced to any algebra.
  - SBQL is based on different foundations than the relational algebra.
- Non-algebraic operators introduced in SBQL:
  - where (selection), dot (projection, navigation, path expressions), join (dependent or navigational join), quantifiers (universal and existential), order by (sorting) and transitive closures.
- All non-algebraic operators are binary.
- All have a common semantic core based on the ENVS mechanism.
- Syntax:
    q1 where q2
    q1 . q2
    q1 join q2
    ∀ q1 ( q2 )
    ∃ q1 ( q2 )
    q1 close by q2
    q1 order by q2
    q1 leaves by q2
53SBQL non-algebraic operators - semantics
- Consider a query q1 θ q2, where θ is a non-algebraic operator.
- Evaluate query q1.
- For each e ∈ result(q1) do the following steps:
  - Push nested(e) as the top section on ENVS.
  - Evaluate query q2 in this new environment.
  - Calculate a partial query result through some function partialResultOfθ( e, result(q2) ); the function depends on θ.
  - Pop (remove) the top section from ENVS.
- Merge all partial results into the final result.
  - This is done by some function mergePartialResultsθ( partialRes1, partialRes2, ..., partialResk ), which depends on θ.
54Evaluation of a non-algebraic operator
[Figure: evaluation of query q1 θ q2 over time, for result(q1) = bag{e1, e2, e3}. Starting from the previous state of ENVS, the sections nested(e1), nested(e2) and nested(e3) are pushed in turn on top of that previous state; q2 is evaluated in each of these environments, giving one result(q2) per element, and the section is popped after each step. The three partial results are combined into result(q1 θ q2).]
55Formal semantics (pseudocode)
procedure eval( q: query )
  ...
  case q is recognized as q1 θ q2:
    partialResults: bag of Result
    partialResult, finalResult, e: Result
    partialResults := ∅
    eval( q1 )
    for each e in top( QRES ) do
      push( ENVS, nested(e) )
      eval( q2 )
      partialResult := partialResultOfθ( e, top( QRES ) )
      partialResults := partialResults ∪ { partialResult }
      pop( QRES )
      pop( ENVS )
    finalResult := mergePartialResultsθ( partialResults )
    pop( QRES )   // removing result(q1) from QRES
    push( QRES, finalResult )
  ...
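The common core of the non-algebraic operators can be sketched in Python as one higher-order function, with selection obtained by plugging in two small functions. This is an illustration under simplifying assumptions: objects are plain dicts, Python return values stand in for QRES, dereferencing is implicit, and the employee names are invented for the example.

```python
from typing import Any, Callable, Dict, List

Section = Dict[str, List[Any]]                  # one ENVS section: name -> bound values
Query = Callable[[List[Section]], List[Any]]    # a query evaluated against ENVS

def name(n: str) -> Query:
    """Bind a name: search ENVS from the top and return the first hit."""
    def run(envs: List[Section]) -> List[Any]:
        for section in reversed(envs):
            if n in section:
                return list(section[n])
        return []
    return run

def nested(e: Any) -> Section:
    """Interior of an object as binders; objects are plain dicts in this sketch."""
    return {k: [v] for k, v in e.items()} if isinstance(e, dict) else {}

def non_algebraic(q1: Query, q2: Query, partial_result_of, merge_partial_results) -> Query:
    """Common core: iterate over result(q1), evaluating q2 in an extended ENVS."""
    def run(envs: List[Section]) -> List[Any]:
        partial_results = []
        for e in q1(envs):
            envs.append(nested(e))              # push nested(e) on ENVS
            try:
                partial_results.append(partial_result_of(e, q2(envs)))
            finally:
                envs.pop()                      # pop the section again
        return merge_partial_results(partial_results)
    return run

def union(parts: List[List[Any]]) -> List[Any]:
    return [x for part in parts for x in part]

def where(q1: Query, q2: Query) -> Query:
    # e contributes to the result only if q2 returned true for it
    return non_algebraic(q1, q2, lambda e, r: [e] if r == [True] else [], union)

emps = [
    {"name": "Doe", "sal": 2500},   # hypothetical data
    {"name": "Poe", "sal": 2000},
    {"name": "Lee", "sal": 900},
]
envs: List[Section] = [{"Emp": emps}]           # database section at the stack bottom
sal_gt_1000: Query = lambda envs: [name("sal")(envs)[0] > 1000]
selected = where(name("Emp"), sal_gt_1000)(envs)
print([e["name"] for e in selected])            # ['Doe', 'Poe']
```

Other operators would reuse non_algebraic unchanged and differ only in the two functions passed to it.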
56SBQL Selection q1 where q2
- For each element e returned by q1, query q2 is evaluated with ENVS augmented by nested( e ).
- e belongs to the final result iff q2 returns true for it.
[Figure: evaluation of Emp where ( sal > 1000 )]
ENVS before evaluation: Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)
Result returned by query Emp: i1, i5, i9
Iteration over elements of the previous result:

  element | result of sal | dereference forced by > | result of 1000 | result of sal > 1000
  i1      | i3            | 2500                    | 1000           | true
  i5      | i7            | 2000                    | 1000           | true
  i9      | i11           | 900                     | 1000           | false

Final result of the query: i1, i5
57SBQL other non-algebraic operators
- Projection, navigation: q1 . q2
  - For each element e returned by q1, query q2 is evaluated with ENVS augmented by nested( e ).
  - The result is the sum of all partial results returned by q2.
  - Path expressions are a side effect of the definitions.
- Dependent join: q1 join q2
  - For each element e returned by q1, query q2 is evaluated with ENVS augmented by nested( e ).
  - The result is the sum of struct{e, v}, where v is an element returned by q2.
- Quantifiers, order by, close by, leaves by, etc. are defined in exactly the same style.
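The dot and join rules can be sketched as follows, including binders and structures so that a join result can be navigated further. This is a simplified illustration: objects are dicts, pointer links are replaced by direct nesting (employs holds the employee objects themselves), structures are Python tuples, and all helper names and data are assumptions.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass(frozen=True)
class Binder:
    name: str
    value: Any

def nested(x: Any) -> Dict[str, List[Any]]:
    """Binders exposed by a value when it is pushed on ENVS."""
    if isinstance(x, Binder):
        return {x.name: [x.value]}
    if isinstance(x, dict):                      # object: one binder per subobject
        return {k: v if isinstance(v, list) else [v] for k, v in x.items()}
    if isinstance(x, tuple):                     # struct: union over its fields
        section: Dict[str, List[Any]] = {}
        for part in x:
            for n, values in nested(part).items():
                section.setdefault(n, []).extend(values)
        return section
    return {}                                    # atomic values expose nothing

@contextmanager
def pushed(envs, section):
    envs.append(section)
    try:
        yield
    finally:
        envs.pop()

def bind(n):
    def run(envs):
        for section in reversed(envs):
            if n in section:
                return list(section[n])
        return []
    return run

def dot(q1, q2):
    def run(envs):
        result = []
        for e in q1(envs):
            with pushed(envs, nested(e)):
                result.extend(q2(envs))          # sum of the partial results
        return result
    return run

def join(q1, q2):
    def run(envs):
        result = []
        for e in q1(envs):
            with pushed(envs, nested(e)):
                result.extend((e, v) for v in q2(envs))   # struct{e, v}
        return result
    return run

depts = [   # hypothetical data
    {"dname": "Toys", "employs": [{"sal": 1000}, {"sal": 3000}]},
    {"dname": "Ads", "employs": [{"sal": 1500}]},
]
envs = [{"Dept": depts}]

def avg_sal(envs):                               # avg(employs.sal) as avgsal
    sals = dot(bind("employs"), bind("sal"))(envs)
    return [Binder("avgsal", sum(sals) / len(sals))]

def dname_and_avgsal(envs):                      # (dname, avgsal)
    return [(bind("dname")(envs)[0], bind("avgsal")(envs)[0])]

query = dot(join(bind("Dept"), avg_sal), dname_and_avgsal)
print(query(envs))   # [('Toys', 2000.0), ('Ads', 1500.0)]
```

Because nested() of a structure exposes the binders of all its fields, the name avgsal introduced inside the join is visible to the query after the final dot, next to dname.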
58Object schema used in examples
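[Figure: object schema used in the examples. Labels recoverable from the diagram: Dept 0..* with attributes d, dname and loc 1..*; association ends worksIn, employs 1..*, manages 0..1 and boss.]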
59Examples of SBQL queries for M0
- Get references of departments for the employee named Doe:
    (Emp where name = "Doe").worksIn.Dept
- Get names of departments together with their average salaries:
    (Dept join avg(employs.Emp.sal) as avgsal) . (dname, avgsal)
- Names and cities for employees working in the department managed by Kim:
    (Dept where (boss.Emp.name) = "Kim").employs.Emp.
        (name, if exists(Address) then Address.city else "No address")
- Get departments employing a professional for any job in the company:
    Dept where ∀ distinct(Emp.job) as j (∃ employs.Emp (j = job))
- Names and salaries of employees earning more than their bosses:
    (Emp where sal > (worksIn.Dept.boss.Emp.sal)).(name, sal)
60M1 Classes and static inheritance
- Classes, methods and inheritance require an extension of M0.
- Classes have two incarnations: as pieces of source code and as run-time database entities.
- Usually programming languages deal with classes as second-class citizens, i.e. in the source code only.
  - In our model we are (so far) not interested in this point of view.
  - We deal with them when we consider static binding and strong typing.
- In the M1 store model classes are first-class entities storing invariant properties of their objects, i.e. methods (but not only).
- Hence in our model classes are objects too, connected with their member objects by a special relationship.
- Classes are also connected with other classes by another relationship, known as inheritance.
61Classes as objects in M1
[Figure: classes as objects in the M1 store]
i40 PersonClass
    i41 age (...code...)
    ...
i50 EmpClass (inherits from i40 PersonClass)
    i51 changeSal (...code...)
    i52 netSal (...code...)
    ...
i1 Person (member of i40 PersonClass)
    i2 name "Doe"
    ...
i5 Emp (member of i50 EmpClass)
    i6 name "Poe"
    i7 sal 2000
    i8 worksIn → i22
    ...
i9 Emp (member of i50 EmpClass)
    i10 name "Lee"
    i11 sal 900
    i16 worksIn → i33
    ...
62SBQL semantics for M1
- Changes concern only ENVS and non-algebraic operators.
- When a non-algebraic operator processes an object <i, ...> which is a member of a class <iC1, ...>, which inherits from a class <iC2, ...>, etc., then ENVS is augmented (starting from the top) by nested(i), nested(iC1), nested(iC2), and so on up to the most general class.
- When a non-algebraic operator finishes processing the object <i, ...>, all these sections are removed from ENVS.
[Figure: ENVS states]
Before processing the object <i, ...>: previous ENVS state
During processing the object <i, ...> (top first): nested(i), nested(iC1), nested(iC2), ..., previous ENVS state
After processing the object <i, ...>: previous ENVS state
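The M1 rule for augmenting ENVS can be sketched in Python as below. The classes ClassObject and DbObject and the helper names are assumptions of this illustration; binders are reduced to a mapping from a name to an identifier, using identifiers from the store example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class ClassObject:
    name: str
    invariants: Dict[str, Any]                  # e.g. methods stored in the class
    parent: Optional["ClassObject"] = None      # the "inherits from" link

@dataclass
class DbObject:
    attributes: Dict[str, Any]
    member_of: ClassObject                      # the "member of" link

def push_object(envs: List[Dict[str, Any]], obj: DbObject) -> int:
    """Push the sections of the most general class first and nested(obj) last,
    so that the object's own section ends up on top. Returns the section count."""
    chain = []
    cls: Optional[ClassObject] = obj.member_of
    while cls is not None:
        chain.append(cls.invariants)
        cls = cls.parent
    for section in reversed(chain):             # most general class goes deepest
        envs.append(section)
    envs.append(obj.attributes)
    return len(chain) + 1

def pop_object(envs: List[Dict[str, Any]], count: int) -> None:
    del envs[-count:]                           # remove all sections pushed for the object

def bind(envs: List[Dict[str, Any]], name: str) -> Any:
    for section in reversed(envs):              # search from the top of the stack
        if name in section:
            return section[name]
    raise NameError(name)

person_class = ClassObject("PersonClass", {"age": "i41"})
emp_class = ClassObject("EmpClass", {"changeSal": "i51", "netSal": "i52"}, person_class)
poe = DbObject({"name": "i6", "sal": "i7", "worksIn": "i8"}, emp_class)

envs: List[Dict[str, Any]] = [{"Person": ["i1"], "Emp": ["i5", "i9"]}]
count = push_object(envs, poe)
print(bind(envs, "name"), bind(envs, "netSal"), bind(envs, "age"))   # i6 i52 i41
pop_object(envs, count)
print(len(envs))                                                     # 1
```

Since binding searches from the top, a property defined in the object or in a more specific class hides a property of the same name in a more general class.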
63Example Processing an object in M1
- (Emp where name = "Poe") . (name, netSal, age)
- ENVS during processing of the subquery after the dot (top of the stack first):
    name(i6) sal(i7) worksIn(i8)          nested(i5): internals of the currently processed Poe's object
    changeSal(i51) netSal(i52) ...        nested(i50): internals of EmpClass
    age(i41) ...                          nested(i40): internals of PersonClass
    ... Person(i1) ... Emp(i5) Emp(i9) ...  binders to database objects
  The three top sections are the sections pushed by the dot.
64Some peculiarities of M1
- Binding and processing methods
  - Invocation of a method means that a new section (activation record) is additionally pushed on top of ENVS.
  - The section contains the parameters of the method (evaluated previously), its local environment and a return track.
  - Rather minor semantic peculiarities are connected with encapsulation.
- A problem: multiple inheritance
  - M1 allows for multiple inheritance, but in case of a name conflict there is no solution.
  - This is a general problem, not specific to M1.
- Next big problem: collections
  - They violate object-oriented principles such as substitutability and open-closed (reuse, conceptual continuation).
  - Possible solutions require specific extensions of M1.
65Examples of SBQL queries for M1 - schema
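[Figure: object schema for the M1 examples. Labels recoverable from the diagram: Dept 0..* with attributes d, dname, loc 1..* and the method budget(); association ends employs 1..*, worksIn, manages 0..1 and boss.]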
66Examples of SBQL queries for M1
- Get names of departments and the average age of their employees (inheritance of the method age):
    Dept . (dname, avg(employs.Emp.age))
- Get employees that for sure live in the cities where their departments are located (inheritance of Address):
    Emp where ? Address as a (? (worksIn.Dept.loc) as l (a.city = l))
- For each employee get the name and the percent of the annual budget of his/her department that is consumed by his/her sal:
    Emp . (name, (((if exists(sal) then sal else 0) as s) .
        ((s * 12 * 100)/(worksIn.Dept.budget))))
- For each person having no salary give the minimal salary in his/her department:
    for each (Emp where not exists(sal)) as e do
        e.changeSal( min(e.worksIn.Dept.employs.Emp.sal) )
67M2 Dynamic roles and dynamic inheritance
- The object model with dynamic object roles removes essential conceptual drawbacks of classical static inheritance.
- The idea is that an object can acquire and lose roles during its life without changing its identity.
- The business semantics of an object depends on the currently considered role.
- SBQL is the first (and only) QL dealing with dynamic roles.
- Dynamic object roles and dynamic inheritance require an extension of M1 and an extension of the semantics of non-algebraic operators.
[Figure: a Person object and its dynamic roles: Employee, Club-member, Patient, Student (twice), Dog-owner, Tax-payer.]
68Example of the M2 store model
[Figure: example of the M2 store model]
i1 Person: i2 name "Doe", i3 born 1948
i4 Person: i5 name "Poe", i6 born 1975
i7 Person: i8 name "Lee", i9 born 1951
Link kinds used in the figure: is member of; inherits from; dynamically inherits from.
69SBQL semantics for M2
- Changes concern only ENVS and non-algebraic operators.
- The order of sections of roles and classes on ENVS is determined by a simple rule (cf. the full description of SBA/SBQL).
- Some new operators deal with roles (dynamic cast, has role).
- (Emp where name = "Lee") . (sal, born, age)
- ENVS during processing of the subquery after the dot (top of the stack first):
    sal(i17) worksIn(i18)                 properties of the currently processed Emp role
    changeSal(i51) netSal(i52) ...        properties of the EmpClass
    name(i8) born(i9)                     properties of the Person super-role of the Emp role
    age(i41) ...                          properties of the PersonClass
    .........
    Person(i1) Person(i4) Person(i7) Emp(i13) Emp(i16) Student(i19) ...   database section
    .........
  The four top sections are the sections pushed by the dot.
70Examples of SBQL queries for M2 - schema
71Examples of SBQL queries for M2
- Get employees older than 60 who live in Warsaw (dynamic inheritance of the attribute Address and static inheritance of the method age):
    Emp where age > 60 and ∃ Address (city = "Warsaw")
- For each person get the name and the sum of all incomes (salary and scholarships):
    (Person as p) . (p.name, sum(bag(0, ((Student)p).scholarship, ((Emp)p).sal)))
- Get students who live in the same city as their school:
    Student where ∃ Address (city = (studiesAt.School.city))
- Get name, faculty and school name for each person studying at two or more faculties:
    (((Person as p) join ((((Student)p) group as s))) where count(s) ≥ 2) .
        (p.name, s.(faculty, (studiesAt.School.name)))
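The income query above relies on the dynamic cast, which selects the roles of a given name that an object currently has. A rough Python sketch of that idea follows; the classes Role and Person, the function cast and the data are assumptions for illustration and do not reproduce the M2 store or the ENVS rules.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class Role:
    name: str
    attributes: Dict[str, Any]

@dataclass
class Person:
    name: str
    roles: List[Role] = field(default_factory=list)

    def add_role(self, role: Role) -> None:
        self.roles.append(role)          # the object acquires a role

    def drop_role(self, role: Role) -> None:
        self.roles.remove(role)          # ... or loses it; the object itself stays

def cast(role_name: str, person: Person) -> List[Role]:
    """Sketch of the dynamic cast (RoleName)p: all roles of p with that name."""
    return [r for r in person.roles if r.name == role_name]

def total_income(p: Person) -> int:
    """Mirrors sum(bag(0, ((Student)p).scholarship, ((Emp)p).sal))."""
    parts = [0]
    parts += [r.attributes["scholarship"] for r in cast("Student", p)]
    parts += [r.attributes["sal"] for r in cast("Emp", p)]
    return sum(parts)

lee = Person("Lee")
emp_role = Role("Emp", {"sal": 900})
lee.add_role(emp_role)
lee.add_role(Role("Student", {"scholarship": 300, "faculty": "Math"}))
lee.add_role(Role("Student", {"scholarship": 200, "faculty": "Physics"}))

print(total_income(lee))             # 1400
lee.drop_role(emp_role)              # Lee stops being an employee
print(total_income(lee))             # 500
print(len(cast("Student", lee)))     # 2: two roles with the same name
```

The literal 0 in the bag plays the same part as in the query: a person with no Student and no Emp role still gets a defined sum.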
72Some qualities of dynamic object roles
- Multiple inheritance. Because roles are encapsulated, there is no name conflict even if the super classes have different properties with the same name.
- Repeating inheritance. An object can have two or more roles with the same name.
- Multiple-aspect inheritance. A class can be specialized according to many aspects. UML covers this feature, but it is neglected in tools.
- Object migration. An object can change its classes without changing its identity.
- Temporal properties. Roles can represent any past facts concerning objects.
- Overlapping collections. An object is included into as many collections as the types of roles it contains.
- Aspect-Oriented Programming. Dynamic object roles can be considered as a technical facility supporting AOP.
73Imperative constructs of SBQL
- Once the stack-based machine of SBQL is implemented, adding imperative constructs becomes quite an easy extension.
  - For instance: create, update, insert, delete, for each and other control statements.
- We accept the tradition of classical imperative and object-oriented languages, but provide queries as basic constructs.
- Obviously, there are a lot of choices and options.
  - There is the classical dilemma between built-in and added-on operators.
74Procedures, functions and methods
- A procedure call opens a new section on the environment stack.
  - The section contains binders to local procedure objects (transient) and binders related to the actual parameters of the procedure.
- Local procedure objects are invisible from outside.
- Scoping rules assume skipping irrelevant stack sections.
- Queries are used as actual parameters of procedures.
- A query determines the output from a functional procedure.
- A call of a functional procedure is considered a query.
Procedure p1 calls p2. Then procedure p2 calls p3. When p3 is executed, the sections of p1 and p2 are irrelevant for binding.
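The scoping rule for the p1, p2, p3 example can be sketched in Python as below. The tagging of sections with a kind, and the rule "search down to the current activation record, then skip to the global sections", are simplifying assumptions of this sketch rather than the full SBQL scoping rules.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class Section:
    binders: Dict[str, Any]
    kind: str = "local"      # "global", "activation" (a procedure call) or "local"

def bind(envs: List[Section], name: str) -> Any:
    """Search from the top; below the current activation record, skip
    everything that belongs to calling procedures."""
    skipping = False
    for section in reversed(envs):
        if skipping and section.kind != "global":
            continue                     # a caller's section: irrelevant for binding
        if name in section.binders:
            return section.binders[name]
        if section.kind == "activation":
            skipping = True              # everything below belongs to the callers
    raise NameError(name)

envs = [
    Section({"Emp": ["i5", "i9"]}, "global"),
    Section({"x": 1}, "activation"),             # p1
    Section({"x": 2, "y": 20}, "activation"),    # p2, called by p1
    Section({"z": 3}, "activation"),             # p3, called by p2 (currently running)
]

print(bind(envs, "z"))        # 3: local to p3
print(bind(envs, "Emp"))      # ['i5', 'i9']: global sections stay visible
try:
    bind(envs, "x")           # x exists only in p1 and p2
except NameError as err:
    print("not visible:", err)
```

Sections pushed above the current activation record, for example by a non-algebraic operator inside p3, remain visible because the search only starts skipping once that record has been passed.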
75SBQL Example of a procedure
- Procedure ChangeDept moves the specified employees to the specified department.
procedure ChangeDept( E: EmpType[0..*]; D: DeptType )
    delete ( Dept . employs ) where Emp in E
    for each E as e do
        create e as employs
        insert employs into D
        e . worksIn ?D? e . D D . D
- Let Kim become the manager of all designers working so far for Lee:
ChangeDept( Emp where job = "designer" and (worksIn.Dept.boss.Emp.name) = "Lee";
            Dept where (boss.Emp.name) = "Kim" )
76SBQL updatable object views
- 30 years of R&D on views have produced only minor results.
  - Very restricted view updating in Oracle and DB2.
  - Proposals concerning object-oriented views are limited and immature.
- Essentially, all previous solutions were based on the assumption that a view definition is a function determined by a single query.
  - View updating is done through side effects of the definition.
- The first good solution for RDBMSs: "instead of" trigger views.
  - Based on overloading an updating operation on a virtual table by a trigger whose code accomplishes the intention of the update.
- SBQL views are based on a similar idea, but it is incomparably more general and efficient.
  - The idea works for any data model, including XML and O-O ones.
  - It assumes that each operation on a virtual object is overloaded by a special procedure written by the view definer.
  - The procedure expresses the definer's intention of the operation.
77General scenario of view processing
[Figure: general scenario of view processing. Labels in the diagram:
 User program: a query invoking the view; a consumer of the query result (e.g. the operator update).
 View definition: virtual identifiers; the procedure in the view definition overloading the given consumer.
 Interpreter of queries and updating statements: a piece of the interpreter code implementing the given consumer.]
78Overloaded operations on virtual objects
- Dereference, i.e. taking the value of a virtual object (on_retrieve).
  - Unavailable in "instead of" trigger views.
- Assigning a new value to a virtual object (on_update).
- Deleting a virtual object (on_delete).
- Inserting a (material or virtual) object into a virtual object (on_insert).
- Creating a new virtual object (on_create).
- If any of the procedures on_retrieve, ..., on_create is not defined by the view definer, the corresponding operation on virtual objects is not allowed.
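The overloading idea can be sketched in Python as a virtual object that dispatches each operation to a procedure supplied by the view definer and rejects operations that have no procedure. The classes, the vprice helper and the exchange rates are hypothetical and serve only as an illustration.

```python
from typing import Any, Callable, Dict

class OperationNotAllowed(Exception):
    """Raised when the view definer did not supply the procedure."""

class VirtualObject:
    def __init__(self, seed: Any, procedures: Dict[str, Callable]):
        self.seed = seed                     # stored data the virtual object is built on
        self._procedures = procedures        # on_retrieve, on_update, ...

    def _call(self, operation: str, *args: Any) -> Any:
        procedure = self._procedures.get(operation)
        if procedure is None:
            raise OperationNotAllowed(operation)
        return procedure(self.seed, *args)

    def retrieve(self) -> Any:
        return self._call("on_retrieve")

    def update(self, new_value: Any) -> None:
        self._call("on_update", new_value)

    def delete(self) -> None:
        self._call("on_delete")

RATES_TO_EURO = {"EUR": 1.0, "PLN": 0.25}    # hypothetical exchange rates

def vprice(book: Dict[str, Any]) -> VirtualObject:
    """A virtual price that is always seen and updated in euro."""
    def on_retrieve(b):
        return b["price"] * RATES_TO_EURO[b["currency"]]
    def on_update(b, new_price):
        b["price"] = new_price / RATES_TO_EURO[b["currency"]]
    return VirtualObject(book, {"on_retrieve": on_retrieve, "on_update": on_update})

book = {"title": "MDA", "price": 200.0, "currency": "PLN"}
price = vprice(book)
print(price.retrieve())      # 50.0
price.update(60.0)
print(book["price"])         # 240.0
try:
    price.delete()           # no on_delete was defined for this view
except OperationNotAllowed as err:
    print("not allowed:", err)
```

The update reaches the stored object only through the definer's procedure, so the definer decides what an update of the virtual object means.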
79Example of a virtual updatable view
create view bestSellingBookDef
    virtual objects bestSellingBook
        return (Book where sold > 1000) as b
    on_delete do delete b
    create view vtitleDef
        virtual objects vtitle return (b.title) as t
        on_retrieve do return deref( t )
    create view vauthorDef
        virtual objects vauthor return (b.author) as a
        on_retrieve do return deref( a )
    create view vpriceDef
        virtual objects vprice return ( b.price ) as p
        on_retrieve do return convertToEuro( b.currency, p )
        on_update (newPrice) do p := convertFromEuro( b.currency, newPrice )
Stored objects: Book (title, author, price, currency, sold)
Virtual objects: bestSellingBook (vtitle, vauthor, vprice)
vprice is always in euro; an update of vprice is converted to the proper currency of the book.
for each (BestSellingBook where vtitle MDA )
do vpric