Title: To ANSI or Not To ANSI
1To ANSI or Not To ANSI
- Gravenstein, Costello, Maurer
2Agenda
- Brief History
- Review Join Technologies
- Analysis
- Recommendation
3ANSI/Oracle Support History
- ANSI here refers to SQL/99 Join Syntax
- The standard to which all RDBMS vendors strive to
comply - SQL/99 support started with Oracle 9i in 2001
- Two years after the release of the standard,
Oracle supports it. - This presentation restricted to the Oracle
implementation of the ANSI standard
4Join Condition Types
- Equijoin
- Columns with the same name
- Columns with different names
- Outerjoin
- Left (left driving table)
- Right (right driving table)
- Full (both tables driving)
- Cross/Cartesian product
5Traditional Equijoin - Same Name
- Traditional Oracle Approach
- SELECT e.ename AS Employee_name,
- d.deptno,
- d.dname AS Department_name
- FROM emp e,
- dept d
- WHERE e.deptno d.deptno
Table prefix is required to remove ambiguity on
common columns
Join conditions must be listed
6ANSI Equijoin Natural Syntax
- ANSI SQL Natural Join
- SELECT ename AS employee_name,
- deptno,
- dname AS department_name
- FROM emp
- NATURAL JOIN dept
No table prefix if column is part of join
condition.
No commas between tables.
Join columns implied, based on columns that have
the same name
God forbid that you accidentally add a non-join
column to both tables(um.. Audit columns...)
7ANSI Equijoin Using Syntax
- ANSI SQL Join USING
- SELECT d.dname,
- e.ename
- FROM emp d
- JOIN dept d USING ( deptno )
Table prefix allowed on columns that are not part
of the using clause (join condition)
Several columns share same name, only joining on
some of them, in this case deptno
Add additional join columns using ( deptno, col2,
col3, )
8Traditional Equijoin Syntax
- Traditional Join, columns different
- SELECT d.department_name,
- l.city
- FROM departments d,
- locations l
- WHERE d.location_id l.id
Join column names are different
9ANSI On Equijoin Syntax
- ANSI SQL ON
- SELECT d.department_name,
- l.city
- FROM departments d
- JOIN locations l ON ( d.location_id l.id
)
Use ON when join column names are different
List join conditions here like traditional syntax
10ANSI Equijoin Syntax
- ANSI SQL Multi Table On
- SELECT e.empno,
- l.loc_id,
- d.dname,
- l.state_tx
- FROM locations l
- JOIN dept d ON ( d.location_id l.id )
- JOIN emp e ON ( d.deptno e.deptno )
Bring in first table join
No commas between tables
Bring in second table join
Any prior table column is visible joins from
left to right
11ANSI Equijoin Syntax
- ANSI SQL INNER
- SELECT e.emp_id,
- l.city,
- d.dept_name,
- c.city,
- d.deptno
- FROM locations l
- INNER JOIN dept d ON ( d.location_id
l.id ) - INNER JOIN emp e ON d.deptno e.deptno
ON clause allows reference to join columns by
table name
INNER an optional keyword stating this is an
equijoin (not an outer or cross join)
Parenthesis are optional, we like to include them
for clarity
12Traditional Outerjoin Syntax
- Traditional Outer Join
- SELECT e.ename,
- d.dname
- FROM emp e,
- dept d
- WHERE e.deptno () d.deptno
NULL in name if no employees in the department.
Traditional Outer Join Notation () indicator
denotes expand records on this side if needed
13ANSI Outerjoin Syntax
- Left Outer Join
- SELECT e.ename,
- d.dname
- FROM dept d
- LEFT OUTER JOIN emp e
- ON (e.deptno d.deptno)
NULL in last name if no employees in the
department.
OUTER keyword is optional.
LEFT denotes that the dominant table is to the
left (dept) and that all of its rows will be
returned. The right table is expanded with NULL
records
14ANSI Outerjoin Syntax
- Left Outer Join
- SELECT e.ename,
- d.dname
- FROM dept d
- NATURAL LEFT JOIN emp e
-
We dont recommend using it here either!
NATURAL can be used in an INNER and OUTER join.
15ANSI Outerjoin Syntax
- Right Outer Join
- SELECT e.ename,
- d.dname
- FROM emp e
- RIGHT OUTER JOIN dept d
- ON (e.deptno d.deptno)
NULL in last name if no employees in the
department.
RIGHT OUTER denotes that the dominant table is to
the right. The left table gets expanded with
NULLS.
16Traditional Outerjoin Syntax
- Full Outer Join
- Can only be represented with a UNION query.
- SELECT e.ename,
- d.dname
- FROM emp e, dept d
- WHERE e.deptno () d.deptno
- UNION
- SELECT e.ename,
- d.dname
- FROM emp e, dept d
- WHERE e.deptno d.deptno ()
Shouldnt see many of these(Weve never needed
one)
UNION ALL is incorrect as it results in duplicate
rows
UNION performs an implicit DISTINCT on both
queries possibly removing desired rows
17ANSI Outerjoin Syntax
- Full Outer Join
- SELECT e.ename,
- d.dname
- FROM emp e
- FULL OUTER JOIN dept d
- ON (e.deptno d.deptno)
NULL in last name if no employees in the
department.
NULL in department name if employee not in a
department.
OUTER is optional
FULL OUTER denotes that the table to the right
AND the table to the left will have all their
records returned
18Traditional Cross Join/Cross Product
- Cross Join
- SELECT emp_id,
- ename,
- dname
- FROM emp e,
- dept d
- WHERE d.deptno 10
No join condition between tables um, normally
not good.
19ANSI Cross Join/Cross Product
- Cross Join
- SELECT emp_id,
- ename,
- dname
- FROM emp e
- CROSS JOIN dept d
- WHERE d.deptno 10
Explicit CROSS condition, impossible to do this
inadvertently.
20ANSI Correlated Join
- Correlated Query Syntax Join
- SELECT empno,
- ename
- FROM emp e
- WHERE EXISTS
- ( SELECT NULL
- FROM dept d
- INNER JOIN locations l
- ON ( l.loc_id d.loc )
- WHERE d.deptno e.deptno
- )
ANSI doesnt allow join clause on first table
Only tables in current FROM clause visible to
ANSI join logic
Not really a mixed syntax join
21ANSI Correlated Subquery Issues
- SELECT
- FROM dept d
- INNER JOIN locations2 l
- USING ( loc )
- WHERE EXISTS
- (
- SELECT NULL
- FROM emp e
- WHERE e.loc l.loc
- )
- ORA-25154 column part of USING clause cannot
have qualifier
Dont be tempted to remove the table prefix l
Scope of reference dictates that the closest
column be used
NATURAL joins have the same issue
22ANSI Outer Join Subtleties
- SELECT d.deptno, e.ename, e.job FROM dept d
LEFT JOIN emp e ON ( e.deptno
d.deptno AND e.job 'SALESMAN') - DEPTNO ENAME JOB
- 30 ALLEN
SALESMAN30 WARD SALESMAN 30
MARTIN SALESMAN30 TURNER SALESMAN50
40 20 10
Filter is applied before join is executed
A number of rows returned that have OUTER
joined emp data
23ANSI Outer Join Subtleties
- SELECT d.deptno, e.ename, e.job FROM dept d
LEFT JOIN emp e ON ( e.deptno
d.deptno) WHERE e.job 'SALESMAN' - DEPTNO ENAME JOB
- 30 ALLEN
SALESMAN30 WARD SALESMAN30 MARTIN
SALESMAN30 TURNER SALESMAN
Filter is applied after join is executed
No outer joined data
24Mixed Traditional/ANSI Join
- Mixed Syntax Join
- SELECT emp_id,
- ename,
- dname
- FROM emp e
- INNER JOIN dept d USING (deptno),
- dual
- WHERE d.deptno 10
Uhg. Choose one or the other, but not both
please!
25ANSI vs Traditional Join Analysis
- Impact areas
- Flexibility
- Code Clarity
- Readability
- Ease of Use
- Developer Training
- DBA Training
- Legacy Code
- Standards
26Flexibility () Restrictions not present in
ANSI
- You cannot specify the () operator in a query
block that also contains FROM clause join syntax. - The () operator can appear only in the WHERE
clause or, in the context of left-correlation
(that is, when specifying the TABLE clause) in
the FROM clause, and can be applied only to a
column of a table or view. - If A and B are joined by multiple join
conditions, then you must use the () operator in
all of these conditions. If you do not, then
Oracle Database will return only the rows
resulting from a simple join, but without a
warning or error to advise you that you do not
have the results of an outer join. - The () operator does not produce an outer join
if you specify one table in the outer query and
the other table in an inner query. - You cannot use the () operator to outer-join a
table to itself, although self joins are valid.
For example, the following statement is not
valid - SELECT employee_id, manager_id
- FROM employees
- WHERE employees.manager_id()
employees.employee_id - However, the following self join is valid
- SELECT e1.employee_id, e1.manager_id,
e2.employee_id - FROM employees e1, employees e2
- WHERE e1.manager_id() e2.employee_id
- The () operator can be applied only to a column,
not to an arbitrary expression. However, an
arbitrary expression can contain one or more
columns marked with the () operator. - A WHERE condition containing the () operator
cannot be combined with another condition using
the OR logical operator. - A WHERE condition cannot use the IN comparison
condition to compare a column marked with the ()
operator with an expression.
Oracle recommends using ANSI OUTER JOIN syntax
27Flexibility - Performance
- In theory the optimizer should find the same
best execution plan for all logically
equivalent SQL statements - In practice, the more complex the statement, the
less likely logical equivalent statements result
in the same plan - Our preliminary research has shown ANSI OUTER
joins can result in more efficient execution
plans when compared with equivalent traditional
joins
28Code Clarity (Traditional)
- SELECT / qb_name(orig) /
- fdla.dim_borrower_v_id dim_borrower_v_id
- FROM dim_as_of_date_vw daod,
- dim_daily_loan_applctn_detl ddlad,
- dim_disbursement_date_vw dddv,
- dim_loan_originator dlo,
- fact_daily_loan_application fdla,
- dim_loan_applctn_status_vw dlasv
- WHERE daod.dim_as_of_date_v_id
ddlad.dim_as_of_date_v_id - AND daod.dim_as_of_date_v_id
fdla.dim_as_of_date_v_id - AND ddlad.dim_daily_loan_applctn_detl_id
fdla.dim_daily_loan_applctn_detl_id - AND ddlad.dim_as_of_date_v_id
fdla.dim_as_of_date_v_id - AND dddv.dim_disbursement_date_v_id
fdla.dim_disbursement_date_v_id - AND dlo.dim_loan_originator_id
fdla.dim_loan_originator_id - AND dlasv.DIM_LOAN_APPLCTN_STATUS_V_ID
fdla.DIM_LOAN_APPLCTN_STATUS_V_ID - AND NOT (dlasv.STATUS_CODE BETWEEN '700'
AND '740') - AND NOT (dlasv.status_code BETWEEN '000'
AND '429') - AND daod.as_of_calendar_date (CASE WHEN
in_DATE_SLICE IS NULL THEN -
LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1))
Can you quickly determine how tables are joined?
This is a real join weve implemented as part of
a recent project
Is a table join condition missing? How do you
know?
29Code Clarity (ANSI)
- SELECT / qb_name(orig) /
- fdla.dim_borrower_v_id dim_borrower_v_id
- FROM
- dim_as_of_date_vw daod
- INNER JOIN fact_daily_loan_application fdla
- ON (daod.dim_as_of_date_v_id
fdla.dim_as_of_date_v_id) - INNER JOIN dim_daily_loan_applctn_detl ddlad
- ON ( ddlad.dim_as_of_date_v_id
daod.dim_as_of_date_v_id - AND ddlad.dim_daily_loan_applctn_det
l_id - fdla.dim_daily_loan_applctn_detl
_id - AND ddlad.dim_as_of_date_v_id
fdla.dim_as_of_date_v_id ) - INNER JOIN dim_disbursement_date_vw dddv
- ON ( dddv.dim_disbursement_date_v_id
fdla.dim_disbursement_date_v_id) - INNER JOIN dim_loan_originator dlo
- ON (dlo.dim_loan_originator_id
fdla.dim_loan_originator_id) - INNER JOIN dim_loan_applctn_status_vw dlasv
- ON (dlasv.dim_loan_applctn_status_v_id
fdla.dim_loan_applctn_status_v_id) - WHERE NOT (dlasv.STATUS_CODE BETWEEN '700'
AND '740') - AND NOT (dlasv.status_code BETWEEN '000'
AND '429')
Can Not Add Table Without Join Condition
30Ease of Use/Developer DBA
Training
- Traditional Join
- Long time Oracle developers do nothing
- New Oracle Developers need to learn Oracle syntax
- Easier to make mistakes
- ANSI Join
- Works with SQL Server/Oracle/MySQL/
- Syntax is more readable/self documenting
- Natural join is un-natural
31Installed Code Base
- ANSI Joins not present in the Oracle installed
code base - ANSI Joins present in other RDBMS installed code
- More of these databases coming all the time
- Harm in having two join syntaxes
- Support personnel have to be comfortable with
both syntaxes - Additional training required
32Corporate Standards
- Does it make sense to allow both traditional and
ANSI? - Other existing examples of equivalent functions
- TRUNC vs ROUND
- Both allowed
- DECODE vs CASE statement
- Standards says no new DECODE
- CAST vs TO_CHAR, TO_NUMBER, TO_DATE,
- Both are allowed
- No documented standard
- ANSI Join
- ANSI has not been allowed
- No documented standard
DECODE is Traditional, CASE is ANSI
TO_ is Traditional, CAST is ANSI
33Stay with Traditional Joins
- Positives
- Long time Oracle professionals have familiarity
with the syntax - Installed code base is all this way
- Going forward lets not introduce additional
methods/styles which need to be supported - Additionally ANSI functionality, rarely, if ever
needed - No dangerous natural join possibility
34Allow ANSI Joins
- Positives
- Clearer, more readable code
- join syntax is segregated out of the predicate
- More flexible
- Oracle is adding enhancements to this syntax not
available with traditional join syntax - Apparent path of progress
- Standards based
- other DB vendors support ANSI SQL (MS, mysql,)
- Simplified multi-vendor RDBMS support
- Much harder to make join errors
35Fishbone Diagram
Only Allow Traditional Join Syntax
Only Support One Coding Style
Installed Oracle Code Base
No Natural Join Possibility
Error FreeSQL
Multi-Vendor RDBMS Support
Code Clarity
Fewer Join Condition Errors
Standards Based
Feature Rich
Allow ANSI Join Syntax
36Recommendations
- Allow both
- Provide training so that all are familiar with
both - Place some restrictions on ANSI syntax to prevent
problems - Do not allow natural joins
- Single SQL statements should use one or the other
but not both - New development should try to use same syntax
throughout - Long term goal, ANSI only
37