Title: SQL-99
1WG3 Database Languages
Stephen Cannan Convenor 2002-05-06
2Agenda
- Working Group changes
- ISO/IEC 9075
- ISO/IEC 20606
- Recap of recent changes
- Sketch of planned changes
3Working Group changes
- 3 ex-WG5 subprojects absorbed
- 20606-1 Authorization and Audit
- 20606-2 Distribution Schema
- 20606-3 Encompassing Transaction
- Effect on WG3 internal structure
4ISO/IEC 9075
- Multi-part standard
- Part 1 SQL/Framework
- Part 2 SQL/Foundation
- Part 3 SQL/CLI
- Part 4 SQL/PSM
- Part 9 SQL/MED
- Part 10 SQL/OLB
- Part 11 SQL/Schemata
- Part 13 SQL/JRT
- Part 14 SQL/XML
5Part 1 SQL/Framework
- Common definitions concepts
- Structure of multi-part standard
- Basic conformance structure statement
- About 85 pages
6Part 2 SQL/Foundation
- The core of the standard
- Includes
- Traditional SQL
- Object-oriented SQL
- Module Language
- Host language bindings (except Java)
- Dynamic SQL
- Direct SQL
- Excludes
- Information Schema Definition Schema (in Part
11) - About 1300 pages
7Part 3 SQL/CLI
- Call-Level Interface
- Best-known implementation ODBC
- About 400 pages
8Part 4 SQL/PSM
- Persistent Stored Modules
- PSM-96 specified
- functions procedures
- SQL-server modules
- computational completeness
- In PSM1999, functions procedures moved to
Foundation (same in 2003) - Analogous to PL/SQL, Transact-SQL, etc.
- About 170 pages
9Part 9 SQL/MED
- Management of External Data
- Foreign Servers, Foreign-Data Wrappers, Foreign
Tables - SQL-aware vs non-SQL-aware
- Datalinks
- Merge with 20606-2 Distribution Schema
- About 500 pages
10Part 10 SQL/OLB
- Object Language Bindings
- Embedded SQL in Java
- SQLJ Part 0
- About 360 pages
11Part 11 SQL/Schemata
- Information Schema
- Definition Schema
- About 300 pages
12Part 13 SQL/JRT
- Java Routines and Types
- Java routines stored in an SQL database, and
invoked from SQL statements - Java classes used as data types of SQL columns
- About 200 pages
13Part 14 SQL/XML
- Under construction
- Using SQL and XML together
- About 150 pages, but still growing
14ISO/IEC 20206
- 20606-1 Authorization and Audit
- 20606-3 Encompassing Transaction
15Project 20206 1Authorization and Audit
- Addresses issues of
- Authentication
- Access Control
- Transfer Integrity
- Transfer Confidentiality
- Non-repudiation
- No Working Draft available
- CD expected 2002-06-01
- John Hadjioannou listed as Editor but not present
16Project 20206 3Encompassing Transaction
- Interacts with external Transaction Managers
- Support for xa_ interface
- 9075-6 XA Specialization (SQL/Transaction)
- Propagates Global Context information
- No Working Draft available
- CD expected 2002-06-01
- Bob Sunday listed as Editor but has withdrawn
17Recap of recent changes ISO/IEC 9075 SQL2003
- Since 1999 we have already added
- Part 9 SQL/MED
- Part 10 SQL/OLB
- Part 13 SQL/JRT
- Addendum 1 SQL/OLAP
- SQL2003 will be the 4th generation
- 1987, 1992, 1999
- Consists of
- Many error corrections(TC stands at 376 pages
and growing) - A few new features
- Part 14 SQL/XML
18New Features
- Data types
- Predicates
- Semantics
- Security
- Active Database
19New Data Types
20New Data Types BIGINT
- Exact numeric, scale 0
- Precision of BIGINT ? precision of INTEGER
- Must have same radix as SMALLINT and INTEGER
21New Data Types MULTISET
- Varying-length, unordered collections of element
having specified type - COL1 INTEGER MULTISET
- COL2 ROW( F1 BIGINT, F2 VARHCAR(4000)
) MULTISET - No (specified) maximum cardinality
- SQL2003 collection types include
- MULTISET
- ARRAY
22New Data Types MULTISET
- INTEGER MULTISET()
- Empty multiset with integer element type (not
null!) - INTEGER MULTISET(2, 3, 5, 7)
- Integer multiset with first few primes
- INTEGER MULTISET(SELECT COL1
FROM TBL1 WHERE COL2 gt 10) - Integer multiset populated from values in column
of table
23New Data Types MULTISET
- CARDINALITY (value1)
- Type of value1 must be multiset
- Returns number of elements in value
- SET (value1)
- Type of value1 must be multiset
- Returns value1 with duplicate elements removed
- ELEMENT (value1)
- Type of value1 must be multiset
- Cardinality of value1 must be 1
- Returns the single element in value1
24New Data Types MULTISET
- UNNEST(value1) AS corr-name
- Type of value1 must be multiset
- Un-nests value1 and turns the elements into
rows of a virtual table - UNNEST MULTISET (2, 3, 5, 7) AS P
P
2
3
5
7
25New Data Types MULTISET
- value1 MULTISET setop quantifier value2
- setop UNION or EXCEPT or INTERSECT
- quantifier ALL or DISTINCT
- SELECT col1 MULTISET INTERSECT DISTINCT
col2 FROM tbl1 WHERE CARDINALITY(col2) gt 50 - Close analogs to ordinary set operators
- UNION, EXCEPT, and INTERSECT
- However, ALL is the default quantifier
26New Data Types MULTISET
- New aggregates
- COLLECT Transform the values in a group into a
multiset - FUSION Form a union of the multisets in a group
number of duplicates of a given value in the
result is the sum of the number of duplicates in
the multisets in the rows of the group - INTERSECTION Form an intersection of the
multisets in a group number of duplicates of a
given value in the result is the minimum of the
number of duplicates in the multisets in the rows
of the group
27New Predicates
- NORMALIZED
- If character string argument is in Unicode
Normalization Form C (NFC), returns true - MEMBER
- If first argument is a member of multiset in
second argument, returns true - SUBMULTISET
- If first argument is a submultiset of second
argument, returns true - SET
- If argument is a set (no duplicate values),
returns true
28NORMALIZED Predicate
- string_value IS NOT NORMALIZED
- Character repertoire of string_value must be UCS
- Encourages, but does not force, normalization to
NFC
29MEMBER Predicate
- value NOT MEMBER OF multiset
- multiset must be a multiset
- Type of value comparable to element type of
multiset - If multiset is empty, returns false
- If value is equal to some element of multiset,
returns true - Else, if some element of multiset is null,
returns unknown
30SUBMULTISET Predicate
- multiset1 NOT SUBMULTISET OF multiset2
- Both multiset1 and multiset2 must be multisets
- and their element types must be comparable
- If the cardinalities of multiset1 and multiset2
are equal and if every value in multiset1 has a
corresponding value in multiset2, then returns
true
31SET Predicate
- multiset IS NOT A SET
- multiset must be a multiset
- If there are no duplicate values in multiset,
returns true - Maximum of 1 null value in a set
32New Semantics
- MERGE statement
- OLAP TABLESAMPLE
- Generated columns
- Identity columns and sequence generators
33New Semantics MERGE
- If some row in the target table matches some row
in the source table according to the specified
predicate, then the row in the subject table is
updated - If no row in the target table matches a given row
in the source table according to the specified
predicate, then the row from the source table is
inserted into the subject table
34New Semantics MERGE
- MERGE INTO table-name AS correlation USING
table-referenceON search-conditionWHEN MATCHED
THEN SET col value - MERGE INTO table-name AS correlation USING
table-referenceON search-conditionWHEN NOT
MATCHED THEN INSERT ( col-list ) VALUES (
val-list ) - Both MATCHED and NOT MATCHED clauses permitted
once each, in either order
35TABLESAMPLE
- New feature in the OLAP capability
- Permits evaluation of aggregates on samples
derived from database data - Permits faster debugging when database is huge
- Two forms of sampling BERNOULLI and SYSTEM
36TABLESAMPLE
- TABLESAMPLE method ( percentage ) repeatable
- method is either BERNOULLI or SYSTEM
- Bernoulli Sample table contains approximately
percentage of the rows of the original table
the probability of a given row of the original
table appearing in the sample table is percentage
, independently of every other row - System Sample table contains approximately
percentage of the rows of the original table
the probability of a given row of the original
table appearing in the sample table is percentage
- repeatable acts as a random number seed for
sampling
37Generated Columns
- Ordinary columns of base tables base columns
- Generated column value computed from values of
zero or more base columns in same row - CREATE TABLE EMPLOYEES ( EMP_ID INTEGER,
SALARY DECIMAL(7,2), BONUS
DECIMAL(7,2), TOTAL_COMP GENERATED ALWAYS AS
( SALARY BONUS ), HR_CLERK
GENERATED ALWAYS AS ( CURRENT_USER
))
38Identity columns Sequence generators
- Identity column Invented as mechanism to allow
automatic population of table keys - Sequence generator Invented to allow generation
of next value in a sequence - Together, they allow generation of keys for
unique columns (or combinations of columns) when
required - Application chooses select value for identity
column or let sequence generator make choice
39Sequence generators
- Sequence generators
- Data type (exact numeric with scale zero)
- Starting value
- Increment (positive ascending, negative
descending) - Minimum and maximum values
- Cycle option
- External (explicit schema object) or internal
(part of another schema object, such a column)
40External sequence generators
- CREATE SEQUENCE seqname AS type START WITH
value INCREMENT BY value MAXVALUE value
CYCLE - Variations
- NO CYCLE
- NO MAXVALUE, MINVALUE, NO MINVALUE
- Order of clauses can vary
Common sequence generator options
41Sequence generators
- Every sequence generator has a current base
value initially set to the start value - Generate next value of a sequence generator
NEXT VALUE FOR seqname - Returns current base value N increment for
some N ? 0 - If computed value gt MAXVALUE (or lt MINVALUE) and
NO CYCLE, then raise an exception - Otherwise, reset to MINVALUE (or MAXVALUE) and
compute new value for some N.
42Sequence generators
- INSERT INTO TBL ( COL1, COL2 ) VALUES ( 10,
NEXT VALUE FOR seqgen ) - CALL myproc ( NEXT VALUE FOR seqgen )
- SET J J NEXT VALUE FOR seqgen
43Sequence generators
- ALTER SEQUENCE seqname
- RESTART WITH newbase resets start of
computation - Common sequence generator options change start
value, maximum or minimum, increment, cycle
option - DROP SEQUENCE seqname dropoption
44Identity columns Sequence generators
- Base tables may optionally have a single identity
column - Data type exact numeric, scale 0 e.g., INTEGER
- Associated with internal sequence generator
- Start value
- Minimum and/or maximum value
- Increment value
- Cycle option
45Identity columns Sequence generators
- GENERATED ALWAYS or GENERATED BY DEFAULT
- ALWAYS Not allowed to UPDATE column INSERT
requires OVERRIDING SYSTEM VALUE - BY DEFAULT Allowed to INSERT or UPDATE column
if column not specified on INSERT, then value
automatically generated
46Identity columns Sequence generators
- CREATE TABLE employees ( EMP_ID INTEGER
GENERATED ALWAYS AS IDENTITY START WITH
100 INCREMENT 1 MINVALUE 10 NO
MAXVALUE NO CYCLE, SALARY
DECIMAL(7,2), ...,)
47New part SQL/XML
- Out for FCD ballot
- Known not to be complete
- Current contents comprise infrastructure
mappings of - SQL identifiers to from XML QNames
- SQL types values onto XML schema types values
- SQL tables onto XML schema types
- Publishing functions (SQL data expressed as XML)
- New built-in type for SQL XML
48To be addedInserting XML into an SQL database
- Transform character string containing XML text
into an XML value - INSERT INTO employees ( empxml )VALUES XMLPARSE
( 'ltemp id"15339"gtltnamegtGurney
Hallecklt/namegtlthiredategt3998-04-25lt/hiredategtltde
ptgtSecuritylt/deptgtlt/empgt'STRIP WHITESPACE ) - No concrete proposal yet (still working out
issues)
49To be addedInserting XML into an SQL database
- XMLPARSE does not validate
- But it does
- Syntax check all in-line DTDs
- Process entity references defined in such DTDs
- Apply default values defined in such DTDs
- Whitespace handling not finalized
- STRIP and
- PRESERVE are possible choices
50To be added Validating XML in an SQL context
- Validate an XML value against an XML schema
(possibly a DTD) - INSERT INTO employees ( empxml )VALUES
XMLVALIDATE ( XMLPARSE ( 'ltemp id"15339"gtltnamegtGu
rney Hallecklt/namegtlthiredategt3998-04-25lt/hiredate
gtltdeptgtSecuritylt/deptgtlt/empgt' STRIP WHITESPACE )
) - No concrete proposal yet (still working out
issues)
51To be added Validating XML in an SQL context
- Many remaining problems
- How to reference DTDs and schemas
- Security problems of referencing arbitrary DTDs
and schemas - Namespace considerations
- xsischemaLocation and xsinoNameSpaceSchemaLocati
on might identify schemas outside the database
a solution might involve caching schemas inside
the database
52To be added Retrieving XML from an SQL database
- Transform an XML value into a character string
containing XML text - SELECT XMLSERIALIZE ( e.empxml )INTO
hostvarFROM emp AS eWHERE ... - No concrete proposal yet (still working out
issues)
53To be addedRetrieving XML from an SQL database
- Numerous problems not yet resolved
- Precise SQL type of the function result
(CHARACTER, CHARACTER VARYING, CHARACTER LARGE
OBJECT) - Actual character encoding (UTF8, UTF16, other)
and SQLs specified character set - Specified character encoding in prologue
- DTD or schema inference? Creation?
54Summary
- ISO/IEC 9075
- Consolidation of changes since 1999
- Correction of errors
- New Part (SQL/XML)
- ISO/IEC 20206
- Area of concern
- Looking for
- National Body participation
- Editor(s)
55Questions?