Title: Evaluating Relational Theory
1 Evaluating Relational Theory
- Delivering end-user programming
- General-purpose data modelling
2 Generalities of DBs: themes of the module
- Two views of impact of databases
- can view the DBMS
- as a program generator for the end-user
- cf. current research on end-user programming
- as a means to record persistent real-world state
- cf. current research on virtual reality
- Key issue: Is it possible to align paradigms for
programming and general-purpose data modelling?
3 Characteristics of electronic data 1970 (1)
- Abstract model of the entire corpus of
operational data
- Separation between persistent and transient data
sharper
- file vs executing program
- Isolation of persistent data more complete
- changes to persistent data initiated by human
action
- persistent data accessed through text interfaces
- Electronic data storage and management rare,
expensive
- intelligent interpretation of electronic state
by human
- no direct connection between environment and data
4 Modern context for general data modelling
Programs
5 Characteristics of electronic data 1970 (2)
- Abstract model of the entire corpus of
operational data
- Demands of the abstract model in 1970 quite low
- small volumes of data, modest performance
- limited levels of volatility and automation
tolerated
- Today is very different, BUT subject to viewing
human agency as a metaphor for any agency, the
key issues to be addressed by a classical
database are still vital
- Any DB modelling paradigm must handle 70s problems
6 Evaluating Relational Query Languages
- Relational theory: strengths and limitations
- From relational theory to practice
7 Perspectives on evaluating relational theory
- Briefly review original motivation for relational
theory
- with particular attention to relational query
languages and their suitability for end-user
programming
- general discussion of issues of principle
- review SQL to expose the relation to normal
practice
- review sqleddi to reveal issues in bad current
practice
8 Generalities of DBs: themes of the module
- Two views of impact of databases
- can view the DBMS
- as a program generator for the end-user
- cf. current research on end-user programming
- as a means to record persistent real-world state
- cf. current research on virtual reality
- Key issue: Is it possible to align paradigms for
programming and general-purpose data modelling?
9 Relational query languages
- Informally can identify the notion of a pure (as
distinct from a commercial) relational query
language (RQL)
- Queries in pure RQLs have precisely the abstract
mathematical functionality identified by Codd
- Languages such as ISBL and EDDI come closest to
this
- Commercial RQLs have the expressive power of pure
RQLs plus additional features for practical use
10 Strengths of pure relational query languages
- Pure relational query languages provide
- wide range of abstract queries for record-based
data
- excellent mathematical semantics
- physical and logical data independence
- relative simplicity for the human user (cf.
general PL)
- scope for automatic optimisation
- a declarative rather than a procedural emphasis
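The "abstract queries" and "mathematical semantics" claimed for a pure RQL can be made concrete with a small sketch. This is an illustration only, not EDDI or ISBL: it assumes relations are modelled as Python sets of rows, and the helper names (relation, select, project, natural_join) and the sample data are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

# A row is an immutable set of (attribute, value) pairs; a relation is a set
# of rows, so duplicate rows cannot exist (the pure relational convention).
Row = FrozenSet[Tuple[str, object]]
Relation = FrozenSet[Row]


def relation(rows: Iterable[Dict[str, object]]) -> Relation:
    """Build a relation from dictionaries (hypothetical helper)."""
    return frozenset(frozenset(r.items()) for r in rows)


def select(rel: Relation, pred: Callable[[Dict[str, object]], bool]) -> Relation:
    """Keep only the rows that satisfy the predicate."""
    return frozenset(r for r in rel if pred(dict(r)))


def project(rel: Relation, attrs: Iterable[str]) -> Relation:
    """Keep only the named attributes; set semantics removes duplicates."""
    keep = set(attrs)
    return frozenset(frozenset((a, v) for a, v in r if a in keep) for r in rel)


def natural_join(left: Relation, right: Relation) -> Relation:
    """Combine rows that agree on every attribute name they share."""
    out = set()
    for l in left:
        ld = dict(l)
        for r in right:
            rd = dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return frozenset(out)


# Hypothetical sample data.
births = relation([
    {"NAME": "ann", "MOTHER": "eve"},
    {"NAME": "bob", "MOTHER": "eve"},
    {"NAME": "cat", "MOTHER": "sue"},
])
years = relation([{"NAME": "ann", "YEAR": 1990}])

mothers = project(births, ["MOTHER"])          # two rows, not three
eves_children = select(births, lambda r: r["MOTHER"] == "eve")
ann_with_year = natural_join(births, years)    # one row with three attributes
```

Each operator takes relations and returns a relation, so queries compose freely and say what is wanted rather than how to compute it.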
11 Limitations of pure relational query languages 1
- Pure relational query languages DON'T provide
- computational completeness: can't compute the
transitive closure of a relation
- For instance, consider the table
- BIRTHS(NAME,MOTHER,FATHER,YEAR,GENDER)
- try to write a relational query to list ALL
ancestors
- an intrinsic procedural interpretation (cf. a
simple functional programming language)
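The ancestors exercise can be sketched outside the relational language to show where the difficulty lies. The following is a minimal illustration, assuming BIRTHS rows are Python tuples in the column order given above; the sample rows and the function names are hypothetical. One join-and-project step (compose) is an ordinary relational query, but listing ALL ancestors needs that step repeated until nothing new appears, and the number of repetitions depends on the data.

```python
# Hypothetical BIRTHS rows: (NAME, MOTHER, FATHER, YEAR, GENDER)
BIRTHS = [
    ("dan", "cat", "carl", 2000, "M"),
    ("cat", "bea", "bob", 1970, "F"),
    ("bea", "ann", "al", 1940, "F"),
]


def parent_pairs(births):
    """Project BIRTHS onto (child, parent) pairs for both parents."""
    pairs = set()
    for name, mother, father, _year, _gender in births:
        pairs.add((name, mother))
        pairs.add((name, father))
    return pairs


def compose(r, s):
    """Join r(a, b) with s(b, c) and project onto (a, c).

    This single step is expressible as a fixed relational query.
    """
    return {(a, c) for a, b in r for b2, c in s if b == b2}


def ancestors(births):
    """Transitive closure of the parent relation, by iterating to a fixpoint.

    The loop is the part with no counterpart in a single pure relational
    expression: how many times it runs is not known in advance.
    """
    parent = parent_pairs(births)
    closure = set(parent)
    while True:
        extended = closure | compose(closure, parent)
        if extended == closure:
            return closure
        closure = extended


grandparents_of_dan = {c for a, c in compose(parent_pairs(BIRTHS), parent_pairs(BIRTHS)) if a == "dan"}
ancestors_of_dan = {c for a, c in ancestors(BIRTHS) if a == "dan"}
```

A query built from a fixed number of compose steps reaches only a fixed number of generations, whereas the loop keeps going as deep as the family tree happens to be.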
12 Limitations of pure relational query languages 2
- Pure relational query languages DON'T provide
- procedural elements of the DDL (e.g. update,
create)
- aggregate functions
- syntactic sugar to appeal to a naïve user's
intuition
- presentation features (e.g. ordering, forms)
- access features (e.g. GUIs for users, PL
interfaces)
13 Exposing the limitations of RQLs
- Exploration of this theme is an exercise for the
reader
- Can appreciate the similarities and natural
discrepancies between a pure RQL and a commercial
RQL by reviewing the features of SQL (see
handout associated with sql.ppt)
- Can expose the pathological discrepancies between
SQL and a pure RQL by studying the
relationship between EDDI, SQL0 and TOYSQL (see
worksheets 5 and 6, and the slides BeyondSQL0.ppt
accessible via the CS233 website)
14B Pathologies in standard SQL 1
- To run the EDDI interpreter consult CS233
Worksheet 5
- 3 evaluation conventions in EDDI reflect a pure
RQL
- no duplicate rows
- strict type checking on domains and attributes
- use of natural join
- Can change these via the Uneddifying Interface
- See Worksheet 6 Questions 3-6 for illustration
15B Pathologies in standard SQL 2
- To run the SQL0 interpreter consult CS233
Worksheet 6
- SQL0 (? SQL) respects all three evaluation
conventions
- BUT Standard SQL violates all 3 evaluation
conventions
- allows duplicate rows: implements two types of
selection, SELECT DISTINCT and SELECT
- dispenses with type checking on attributes
- uses unnatural join
- Consequence: logical flaws, obscure semantics
HD
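The duplicate-row point can be checked against any SQL engine. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for a standard SQL system (the module itself uses EDDI and SQL0, not SQLite). The births table, its rows and the function name are hypothetical.

```python
import sqlite3


def select_mothers(distinct: bool) -> list:
    """Project a hypothetical births table onto its mother column.

    With distinct=False the result is a bag (duplicates kept);
    with distinct=True it is a set, as a pure RQL would return.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE births (name TEXT, mother TEXT)")
    conn.executemany(
        "INSERT INTO births VALUES (?, ?)",
        [("ann", "eve"), ("bob", "eve"), ("cat", "sue")],
    )
    keyword = "DISTINCT " if distinct else ""
    rows = conn.execute(
        f"SELECT {keyword}mother FROM births ORDER BY mother"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


bag_result = select_mothers(distinct=False)   # ['eve', 'eve', 'sue']
set_result = select_mothers(distinct=True)    # ['eve', 'sue']
```

The same projection gives two different answers depending on a keyword, which is one way to see why allowing duplicate rows forces SQL to carry two kinds of selection.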
16B Pathologies in standard SQL 3
- Issue: How to implement standard SQL using EDDI?
- Worksheet 6 provides the context for discussing
this
- BeyondSQL0.ppt explains in more detail how such
an implementation can be carried out and
highlights the problematic consequences of the
poor design of standard SQL where implementation
is concerned
17 Meta-agenda raised by RQLs
- RQLs went a long way to resolving the data
modelling challenges for end-user programming in
the 70s
- BUT (arguably) the solution they offer has its
limitations in respect of modern requirements
- Some limitations clearly stem from a failure to
be faithful to the principles of Codd's
relational theory
- Is it conceivable that some limitations stem from
the limitations of mathematical theory itself?
- Will return to these themes later in the module
18 Modelling real-world state
- Modelling state in computer science
- State as the current state of affairs
- Modelling state for and with state-change
19 Generalities of DBs: themes of the module
- Two views of impact of databases
- can view the DBMS
- as a program generator for the end-user
- cf. current research on end-user programming
- as a means to record persistent real-world state
- cf. current research on virtual reality
- Key issue: Is it possible to align paradigms for
programming and general-purpose data modelling?
20 Modelling state 1
- Modelling state is fundamental to computer
science
- Important and confusing distinction between
- modelling the current state of affairs
- modelling to support state changing activities
- File system or database content: persistent
storage
- represents a current state of affairs
- Object-oriented analysis from use-cases
- reflects the manner in which state is to be
changed
21 Modelling state 2
- Philosophical issues raised by this distinction
- Commonly argued that perception of state is
mediated by the goal of our interaction with it
- This is consistent with the dominant emphasis in
CS on
- state as specified only in the context of a
behaviour
- cf. state in a finite state machine, or
procedural program
- Objects, ADTs, functional programs etc are viewed
as
- abstractions to support the specification of
state-change
22 Modelling state 3
- Philosophical issues raised by this distinction
- NB description of state ≠ description of
state-change
- A key reason for this is that, assuming the
environment is stable, we can describe recipes
for action that make no explicit reference
to current state
- Consider e.g. running up familiar stairs
automatically
- Internal memory of the recipe ≠ memory of the
stairs
23 Modelling state 4
- Philosophical issues raised by this distinction
- Relational DBs definitely aspire to model state
itself
- e.g. with the aim of supporting open-ended
queries
- Relational queries are viewed as interrogating
state NOT as changing state
- Adding a view as making a new observation of a
state
- Not all OO traditions approve of the emphasis on
supporting state-change (e.g. Simula,
anti-use-case)
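The idea of a query as interrogation and a view as a new observation can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as a convenient SQL engine; the births table, the mothers view and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE births (name TEXT, mother TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO births VALUES (?, ?, ?)",
    [("ann", "eve", 1990), ("bob", "eve", 1993)],
)

# Defining a view adds a way of looking at the state, not new state.
conn.execute("CREATE VIEW mothers AS SELECT DISTINCT mother FROM births")

# Querying the view interrogates the state and leaves it as it was.
rows_before = conn.execute("SELECT COUNT(*) FROM births").fetchone()[0]
observed = [r[0] for r in conn.execute("SELECT mother FROM mothers ORDER BY mother")]
rows_after = conn.execute("SELECT COUNT(*) FROM births").fetchone()[0]

# When the underlying state changes, the same observation reports the change.
conn.execute("INSERT INTO births VALUES (?, ?, ?)", ("cat", "sue", 1995))
observed_later = [r[0] for r in conn.execute("SELECT mother FROM mothers ORDER BY mother")]
conn.close()
```

Here only the INSERT changes state; the view and the queries over it are observations whose results follow whatever the state currently is.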
24 Modelling state 5
- Philosophical issues raised by this distinction
- State bound up with agency and modes of
observation
- use of RDBs in a timetabling exercise
- spreadsheets
- persistence as presumed absence of agency
- entity-relationship modelling
25 Modelling state 6
- Entity-relationship modelling makes use of
diagrams
- Note that ER diagrams
- represent state not state change
- supply direct metaphorical representation of
state: entities, relationships, attributes
as nodes of graph
- Metaphor: there is a perceived correspondence
between the features of the diagram and the state
that it represents, an experiential not formal
issue
- In practice, the metaphor is hard to sustain
26 Modelling state 7
- Actual relationship between relation schemes and
real-world observables is very subtle
- by way of illustration, consider additional
constraints to which real-world relations may be
subject, not FDs
- relations may be subject to data dependencies
that motivate 4NF and 5NF, cf. The Connection
Trap
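The kind of subtlety behind the connection trap can be shown with a small worked example. This is a sketch under stated assumptions: the ternary relation SELLS(AGENT, COMPANY, PRODUCT) and its rows are hypothetical, and relations are modelled as Python sets of tuples. Splitting the relation into binary projections and joining them back produces rows that were never observed.

```python
# Hypothetical observed facts: which agent sells which product for which company.
SELLS = {
    ("smith", "ford", "car"),
    ("smith", "gm", "truck"),
    ("jones", "ford", "truck"),
}

# The three binary projections of the ternary relation.
agent_company = {(a, c) for a, c, p in SELLS}
company_product = {(c, p) for a, c, p in SELLS}
agent_product = {(a, p) for a, c, p in SELLS}

# Join two projections on COMPANY: "smith represents ford" and
# "ford makes trucks" combine into a claim nobody recorded.
two_way = {
    (a, c, p)
    for a, c in agent_company
    for c2, p in company_product
    if c == c2
}

# Joining in the third projection as well removes some spurious rows, but not all.
three_way = {(a, c, p) for a, c, p in two_way if (a, p) in agent_product}

spurious_two_way = two_way - SELLS
spurious_three_way = three_way - SELLS
```

With this data the two-way join invents two rows and the three-way join still invents one, so the binary projections do not determine the original relation. Whether such a decomposition is safe depends on a constraint about the real world (a join dependency) that no functional dependency expresses.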