OCL3 Oracle 10g: SQL - PowerPoint PPT Presentation

1 / 110
About This Presentation
Title:

OCL3 Oracle 10g: SQL

Description:

RFID for cars: http://www.rfidanalysis.org/ 11. Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 ... Soln 2: escape any single quotes. Replace any ' with a '' or ... – PowerPoint PPT presentation

Number of Views:275
Avg rating:3.0/5.0
Slides: 111
Provided by: pagesSt
Category:
Tags: 10g | sql | ocl3 | oracle

less

Transcript and Presenter's Notes

Title: OCL3 Oracle 10g: SQL


1
OCL3 Oracle 10gSQL PL/SQLSession 10
  • Matthew P. Johnson
  • CISDD, CUNY
  • January, 2005

2
Agenda
  • Security web apps
  • RegEx support in 10g
  • Oracle XML
  • Data warehousing
  • More on the PL/SQL labs
  • Any more lab?

3
Review Why security is hard
  • Its a negative deliverable
  • Its an asymmetric threat
  • Tolstoy Happy families are all alike every
    unhappy family is unhappy in its own way.
  • Analogs homeland, jails, debugging,
    proof-reading, Popperian science, fishing, MC
    algs
  • So fix biggest problems first

4
DB users have privileges
  • SELECT read access to all columns
  • INSERT(col-name) can insert rows with
    non-default values in this column
  • INSERT can insert rows with non-default values
    in all columns
  • DELETE
  • REFERENCES(col-name) can define foreign keys
    that refer to (or other constraints that mention)
    this column
  • TRIGGER triggers can reference table
  • EXECUTE can run function/SP

5
Granting privileges (Oracle)
  • One method of setting access levels
  • Creator of object automatically gets all
    privileges to it
  • Possible objects tables, whole databases, stored
    functions/procedures, etc.
  • ltDB-namegt. - all tables in DB
  • A privileged user can grant privileges to other
    users or groups

GRANT privileges ON object TO user ltWITH GRANT
OPTIONgt
GRANT SELECT ON mytable TO someone WITH GRANT
OPTION
6
Granting and revoking
  • Privileged user has privileges
  • Privileged-WGO user can grant them, w/wo GO
  • Granter can revoke privileges or GO
  • Revocation cascades by default
  • To prevent, use RESTRICT (at end of cmd)
  • If would cascade, command fails
  • Can change owner

ALTER TABLE my-tbl OWNER TO new-owner
7
Granting and revoking
  • What we giveth, we may taketh away
  • mjohnson (effects?)
  • george (effects?)
  • mjohnson (effects?)

GRANT SELECT, INSERT ON my-table TO george WITH
GRANT OPTION
GRANT SELECT ON my-table TO laura
REVOKE SELECT ON my-table FROM laura
8
Role-based authorization
  • In SQL-1999, privileges assigned with roles
  • For example
  • Student role
  • Instructor role
  • Admin role
  • Each role gets to do same (sorts of) things
  • Privileges assigned by assigning role to users

GRANT SELECT ON my-table TO employee
GRANT employee TO billg
9
Passwords
  • DBMS recognizes your privileges because it
    recognizes you
  • how?
  • Storing passwords in the DB is a bad idea

10
Hashed or digested passwords
  • One-way hash function
  • computing f(x) is easy
  • Computing f-1(y) is hard/impossible
  • Finding some x2 s.t. f(x2) f(x) is hard/imposs
  • collisions
  • Intuitively seeing f(x) gives little (useful)
    info on x
  • x looks random
  • PRNGs
  • MD5, SHA-1
  • RFID for cars http//www.rfidanalysis.org/

11
Built-in accounts
  • Many DBMSs (and OSs) have built-in demo accounts
    by default
  • In some versions, must opt out
  • MySQL root/(blank) (closed on sales)
  • http//lists.seifried.org/pipermail/security/2004-
    February/001782.html
  • Oracle scott/tiger (was open on sales last year)
  • SQLServer sa/(blank/null)
  • http//support.microsoft.com/default.aspx?scidkb
    EN-US313418

12
Query-related Injection attacks
  • Heres a situation
  • Prompt for user/pass
  • Do lookup
  • If found, user gets in
  • test.user table in MySQL
  • http//pages.stern.nyu.edu/mjohnson/dbms/php/logi
    nphp.txt
  • http//pages.stern.nyu.edu/mjohnson/dbms/php/logi
    n.php
  • Apart from no hashing, is this safe?

SELECT FROM users WHERE useru AND passwordp
13
Injection attacks
  • We expect to get input of something like
  • user mjohnson
  • pass secret
  • ?

SELECT FROM users WHERE user u AND password
p
SELECT FROM users WHERE user 'mjohnson' AND
password 'secret'
14
Injection attacks MySQL/Perl/PHP
  • Consider another input
  • user ' OR 11 OR user '
  • pass ' OR 11 OR pass '
  • ?

SELECT FROM users WHERE user u AND password
p
SELECT FROM users WHERE user '' OR 11 OR
user '' AND password '' OR 11 OR pass
''
http//pages.stern.nyu.edu/mjohnson/dbms/php/logi
n.php http//pages.stern.nyu.edu/mjohnson/dbms/eg
/injection.txt
SELECT FROM users WHERE user '' OR 11 OR
user '' AND password '' OR 11 OR pass ''
15
Injection attacks MySQL/Perl/PHP
  • Consider this one
  • user your-boss' OR 11
  • pass abc
  • ?

SELECT FROM users WHERE user u AND password
p
http//pages.stern.nyu.edu/mjohnson/dbms/php/logi
n.php
SELECT FROM users WHERE user 'your-boss' OR
11 ' AND password 'abc'
SELECT FROM users WHERE user 'your-boss' OR
11 ' AND password 'abc'
16
Injection attacks MySQL/Perl/PHP
  • Consider another input
  • user your-boss
  • pass ' OR 11 OR pass '
  • ?

SELECT FROM users WHERE user u AND password
p
http//pages.stern.nyu.edu/mjohnson/dbms/php/logi
n.php
SELECT FROM users WHERE user 'your-boss' AND
password '' OR 11 OR pass ''
SELECT FROM users WHERE user 'your-boss' AND
password '' OR 11 OR pass ''
17
Multi-command inj. attacks (other DBs)
  • Consider another input
  • user ' DELETE FROM users WHERE user 'abc'
    SELECT FROM users WHERE password '
  • pass abc
  • ?

SELECT FROM users WHERE user u AND password
p
SELECT FROM users WHERE user '' DELETE FROM
users WHERE user 'abc' SELECT FROM users WHERE
password '' AND password 'abc'
SELECT FROM users WHERE user '' DELETE FROM
users WHERE user 'abc' SELECT FROM users WHERE
password '' AND password 'abc'
18
Multi-command inj. attacks (other DBs)
  • Consider another input
  • user ' DROP TABLE users SELECT FROM users
    WHERE password '
  • pass abc
  • ?

SELECT FROM users WHERE user u AND password
p
SELECT FROM users WHERE user '' DROP TABLE
users SELECT FROM users WHERE password '' AND
password 'abc'
SELECT FROM users WHERE user '' DROP TABLE
users SELECT FROM users WHERE password '' AND
password 'abc'
19
Multi-command inj. attacks (other DBs)
  • Consider another input
  • user ' SHUTDOWN WITH NOWAIT SELECT FROM users
    WHERE password '
  • pass abc
  • ?

SELECT FROM users WHERE user u AND password
p
SELECT FROM users WHERE user '' SHUTDOWN
WITH NOWAIT SELECT FROM users WHERE password
'' AND password 'abc'
SELECT FROM users WHERE user '' SHUTDOWN
WITH NOWAIT SELECT FROM users WHERE password
'' AND password 'abc'
20
Injection attacks MySQL/Perl/PHP
  • Consider another input
  • user your-boss
  • pass ' OR 11 AND user 'your-boss
  • ? Delete your boss!

DELETE FROM users WHERE user u AND password p
http//pages.stern.nyu.edu/mjohnson/dbms/php/user
s.php
DELETE FROM users WHERE user 'your-boss' AND
pass '' OR 11 AND user 'your-boss'
DELETE FROM users WHERE user 'your-boss' AND
pass ' ' OR 11 AND user 'your-boss'
21
Injection attacks MySQL/Perl/PHP
  • Consider another input
  • user ' OR 11 OR user '
  • pass ' OR 11 OR user '
  • ? Delete everyone!

DELETE FROM users WHERE user u AND pass p
http//pages.stern.nyu.edu/mjohnson/dbms/php/user
s.php
DELETE FROM users WHERE user '' OR 11 OR
user '' AND pass '' OR 11 OR user ''
DELETE FROM users WHERE user '' OR 11 OR user
'' AND pass '' OR 11 OR user ''
22
Preventing injection attacks
  • Ultimate source of problem quotes
  • Soln 1 dont allow quotes!
  • Reject any entered data containing single quotes
  • Q Is this satisfactory?
  • Does Amazon need to sell OReilly books?
  • Soln 2 escape any single quotes
  • Replace any ' with a '' or \'
  • In Perl, use taint mode wont show
  • In PHP, turn on magic_quotes_gpc flag in
    .htaccess
  • show both PHP versions

23
Preventing injection attacks
  • Soln 3 use prepare parameter-based queries
  • Supported in JDBC, Perl DBI, PHP ext/mysqli
  • http//pages.stern.nyu.edu/mjohnson/dbms/perl/log
    insafe.cgi
  • http//pages.stern.nyu.edu/mjohnson/dbms/perl/use
    rssafe.cgi
  • Very dangerous using tainted data to run
    commands at the Unix command prompt
  • Semi-colons, prime char, etc.
  • Safest define set if legal chars, not illegal
    ones

24
Preventing injection attacks
  • When to do security checking for quotes, etc.?
  • Natural choice in client-side data validation
  • But not enough!
  • As saw earlier can submit GET and POST params
    manually
  • ? Must do security checking on server

25
More Info
  • phpGB MySQL Injection Vulnerability
  • http//www.securiteam.com/unixfocus/6X00O1P5PY.htm
    l
  • "How I hacked PacketStorm
  • http//www.wiretrip.net/rfp/txt/rfp2k01.txt

26
SQLPlus settings
SQLgt SET RECSEP OFF SQLgt COLUMN text FORMAT A60
27
New topic Regular Expressions
  • In automata theory, Finite Automata are the
    simplest weakest of computer, Turing Machines the
    strongest
  • Chomskys Hierarchy
  • FA are equivalent to a regular expression
  • Expressions that specify a pattern
  • Can check whether a string matches the pattern

28
RegEx matching
  • Use REGEX_LIKE
  • Metachar for any char is .
  • First, get employee_comment table
  • http//pages.stern.nyu.edu/mjohnson/oracle/empcom
    m.sql
  • Now do search
  • So far, like LIKE

SELECT emp_id, text FROM employee_comment WHERE
REGEXP_LIKE(text,'...-....')
29
RegEx matching
  • Can also pull out the matching text with
    REGEXP_SUBSTR
  • If want only numbers, can specify a set of chars
    rather than a dot

SELECT emp_id, REGEXP_SUBSTR(text,'...-....')
text FROM employee_comment WHERE
REGEXP_LIKE(text,'...-....')
SELECT emp_id, REGEXP_SUBSTR(text,
'0123456789..-...0123456789') text FROM
employee_comment WHERE REGEXP_LIKE(text,
'0123456789..-...0123456789')
30
RegEx matching
  • Or can specify a range of chars
  • Or, finally, can state how many copies to match

SELECT emp_id, REGEXP_SUBSTR(text,
'0-9..-....') text FROM employee_comment WHERE
REGEXP_LIKE(text,'...-....')
SELECT emp_id, REGEXP_SUBSTR(text, '0-93-0-9
4') text FROM employee_comment WHERE
REGEXP_LIKE(text,'0-93-0-94')
31
RegExp matching
  • Other operators
  • - 0 or more matches
  • - 1 or more matches
  • ? - 0 or 1 match
  • Also, can OR options together with op
  • Here some phone nums have area codes, some not,
    so want to match both

SELECT emp_id, REGEXP_SUBSTR(text, '0-93-0-9
3-0-940-93-0-94') text FROM
employee_comment WHERE REGEXP_LIKE(text,'0-93-
0-93-0-940-93-0-94')
32
RegExp matching
  • Order of ORed together patterns matters
  • First matching pattern wins

SELECT emp_id, REGEXP_SUBSTR(text, '0-93-0-9
40-93-0-93-0-94') text FROM
employee_comment WHERE REGEXP_LIKE(text,'0-93-
0-940-93-0-93-0-94')
33
RegExp matching
  • Theres a shared structure between the two, tho
  • Area code is just optional
  • Can use ? op

SELECT emp_id, REGEXP_SUBSTR(text, '(0-93-)?
0-93-0-94') text FROM employee_comment WHER
E REGEXP_LIKE(text,'(0-93-)?0-93-0-94'
)
34
RegExp matching
  • Also, different kinds of separators
  • dash, dot, just blank
  • Can OR together whole number patterns
  • Better Just use set of choices of each sep.

SELECT emp_id, REGEXP_SUBSTR(text, '(0-93-.
)?0-93-. 0-94') text FROM
employee_comment WHERE REGEXP_LIKE(text,'(0-93
-. )?0-93-. 0-94')
35
RegExp matching
  • One other thing area codes in parentheses
  • Of course, area codes are still optional
  • Parentheses must be escaped - \( \)

SELECT emp_id, REGEXP_SUBSTR(text, '(0-93-.
\(0-93\) )?0-93-. 0-94')
text FROM employee_comment WHERE
REGEXP_LIKE(text,'(0-93-. \(0-93\)
)?0-93-. 0-94')
36
And now for something completely different XML
  • XML eXtensible Mark-up Language
  • Very popular language for semi-structured data
  • Mark-up language consists of elements composed
    of tags, like HTML
  • Emerging lingua franca of the Internet, Web
    Services, inter-vender comm

37
Unstructured data
  • At one end of continuum unstructured data
  • Text files
  • Stock market prices
  • CIA intelligence intercepts
  • Audio recordings
  • Just one damn bit after another
  • Henry Ford
  • No (intentional, formal) patterns to the data
  • Difficult to manage/make sense of
  • Why we need data-mining

38
Structured data
  • At the other end structured data
  • Tables in RDBMSs
  • Data organized into semantic chunks
  • entities
  • Similar/related entities grouped together
  • Relationships, classes
  • Entities in same group have same structure
  • Same fields/attributes/properties
  • Easy to make sense of
  • But sometimes too rigid a req.
  • Difficult to sendconvert to tab-delimited

39
Semi-structured data
  • Not too random
  • Data organized into entities
  • Similar/related grouped to form other entities
  • Not too structured
  • Some attributes may be missing
  • Size of attributes may vary
  • Support of lists/sets
  • Juuust Right
  • Data is self-describing

40
Semi-structured data
  • Predominant examples
  • HTML HyperText Mark-up Language
  • XML eXtensible Mark-up Language
  • NB both mark-up languages (use tags)
  • Mark-up lends self of semi-structured data
  • Demarcate boundaries for entities
  • But freely allow other entities inside

41
Data model for semi-structured data
  • Usually represented as directed graphs
  • Graph set of vertices (nodes) and edges
  • Dots connected by lines not nec. a tree!
  • In model,
  • Nodes entities or fields/attributes
  • Edges attribute-of/sub-entity-of
  • Example publisher publishes gt0 books
  • Each book has one title, one year, gt1 authors
  • Draw publishers graph

42
XML is a SSD language
  • Standard published by W3C
  • Officially announced/recommended in 1998
  • XML ! HTML
  • XML ! a replacement for HTML
  • Both are mark-up languages
  • Big diffs
  • XML doesnt use predefined tags (!)
  • But its extensible tags can be added
  • HTML is about presentation ltIgt, ltBgt, ltPgt
  • XML is about content ltbookgt, ltauthorgt

43
XML syntax
  • Like HTML in many respects but more strict
  • All tags must be closed
  • Cant have this is a lineltbrgt
  • Every start tag has an end tag
  • Although ltbr/gt style can replace both
  • IS case-sensitive
  • IS space-sensitive
  • XML doc has a unique root element

44
XML syntax
  • Tags must be properly nested
  • Not allowed ltbgtltigtIm not kiddinglt/bgtlt/igt
  • Intuition file folders
  • Elements may have quoted attributes
  • ltMyelm myattmyvalgtlt/Myelmgt
  • Comments same as in HTML
  • lt!-- Pay no attention --gt
  • Draw publishers XML

45
Escape chars in XML
  • Some chars must be escaped
  • Distinguish content from syntax
  • Can also declare value to be pure text

ltelmgt3 lt 5lt/elmgt
ltelmgtquotDonapost call me aposIshmaelapos!
quotlt/elmgt
ltaRealTaggt lt!CDATAltnotAtaggtjsdljsdltneitherAm
Ilt'gtltgtgtgtlt/aRealTaggt
46
XML Namespaces
  • Different schemas/DTDs may overlap
  • XHTML and MathML share some tags
  • Soln namespaces
  • as in Java/C/C

ltbook xmlnsisbn"www.isbn-org.org/def"gt lttitlegt.
..lt/titlegt ltnumbergt15lt/numbergt ltisbnnumbergt...lt
/isbnnumbergt lt/bookgt
47
From Relational Data to XML Data
persons
XML
persons
row
row
row
name
name
name
ssn
ssn
ssn
  • ltpersonsgt
  • ltrowgtltnamegtMichaellt/namegt
  • ltssngt123lt/ssngtlt/rowgt
  • ltrowgtltnamegtHilarylt/namegt
  • ltssngt456lt/ssngtlt/rowgt
  • ltrowgtltnamegtBilllt/namegt
  • ltssngt789lt/ssngtlt/rowgt
  • lt/personsgt

Michael
123
Hilary
Bill
456
789
48
Semi-structured Data Explained
  • List-valued attributes
  • XML is not 1NF!
  • Impossible in (single, BCNF) tables

ltpersonsgt ltrowgtltnamegtHilarylt/namegt
ltphonegt202-222-2222lt/phonegt
ltphonegt914-222-2222lt/phonegtlt/rowgt
ltrowgtltnamegtBilllt/namegt ltphonegt914-222-2222
lt/phonegt ltphonegt212-333-3333lt/phonegtlt/rowgt
lt/personsgt
? two phones!
???
49
Object ids and References
ltmovieinfogt ltmovie id"o111"gt lttitlegtLost in
Translationlt/titlegt ltyeargt2003lt/yeargt ltstars
idref"o333 o444"/gt lt/moviegt ltmovie id"o222"gt
lttitlegtHamletlt/titlegt ltyeargt1999lt/yeargt
ltstars idref"o333"/gt lt/moviegt ltperson
id"o456"gt ltperson id"o111"gt ltnamegtBill
Murraylt/namegt ltmovies idref"o111
o222"/gt lt/persongt lt/movieinfogt
  • SSD graph might not be trees!
  • But XML docs must be
  • Would cause much redundancy
  • Soln same concept as pointers in C/C/J
  • Object ids and references
  • Graph example
  • Movies Lost in Translation, Hamlet
  • Stars Bill Murray, Scarlet Johansson

50
What do we do with XML?
  • Things done with XML
  • Send to partners
  • Parse XML received
  • Convert to RDBMS rows
  • Query for particular data
  • Convert to other XML
  • Convert to formats other than XML
  • Lots of tools/standards for these

51
DTDs understanding XML
  • XML is extensible
  • Advantage when creating, we can use any tags we
    like
  • Disadv when reading, they can use any tags they
    like
  • Using XML docs a priori is very difficult
  • Solution impose some constraints

52
DTDs
  • DTD Document Type Definition
  • You and partners/vertical industry/academic
    discipline decide on a DTD/schema for your docs
  • Specify which entities you may use/must
    understand
  • Specify legal relationships
  • DTD specifies the grammar to be used
  • DTD set of rules for creating valid entities
  • DTD tells your software what to look for in doc

53
DTD examples
  • Well-formed XML v. valid XML
  • Simple example
  • http//pages.stern.nyu.edu/mjohnson/dbms/xml/note
    .xml
  • http//pages.stern.nyu.edu/mjohnson/dbms/xml/badn
    ote.xml
  • http//pages.stern.nyu.edu/mjohnson/dbms/xml/badn
    ote2.xml
  • Copy from http//pages.stern.nyu.edu/mjohnson/db
    ms/eg/xml.txt
  • Partial publisher example rules
  • Root ? publisher
  • Publisher ? name, book, author
  • Book ? title, date, author
  • Author ? firstname, middlename?, lastname

54
Partial DTD example (typos!)
  • lt?xml version1.0 encodingUTF-8 ?gt
  • lt!DOCTYPE PUBLISHER
  • lt!ELEMENT PUBLISHER (name, book, author)gt
  • lt!ELEMENT name (PCDATA)gt
  • lt!ELEMENT BOOK (title, date, author)gt
  • lt!ELEMENT AUTHOR (firstname, middlename?,
    lastnamegt
  • lt!ELEMENT firstname (PCDATA)gt
  • lt!ELEMENT lastname (PCDATA)gt
  • lt!ELEMENT middlename (PCDATA)gt
  • DTD is not XML, but can be embedded in or ref.ed
    from XML
  • Replacement for DTDs is XML Schema

55
XML Applications/dialects
  • MathML Mathematical Markup Language
  • http//wwwasdoc.web.cern.ch/wwwasdoc/WWW/publicati
    ons/ictp99/ictp99N8059.html
  • VoiceXML http//newmedia.purchase.edu/Jeanine/in
    terfaces/rps.xml
  • ChemML Chemical Markup Language
  • XHMTL HTML retrofitted as an XML application

56
XML Applications/dialects
  • VoiceXML
  • http//newmedia.purchase.edu/Jeanine/interfaces/r
    ps.xml
  • ATT Directory Assistance
  • http//phone.yahoo.com/

57
More XML Apps
  • FIXML
  • XML equiv. of FIX Financial Information eXchange
  • swiftML
  • XML equiv. of SWIFT Society for Worldwide
    Interbank Financial Telecommunications message
    format
  • Apaches Ant
  • Scripting language for Java build management
  • http//ant.apache.org/manual/using.html
  • Many more
  • http//www-106.ibm.com/developerworks/xml/library/
    x-stand4/

58
More XML Applications/Protocols
  • RSS Rich Site Summary/Really Simple Syndication
  • News sites, blogs
  • http//slate.msn.com/rss/
  • http//slashdot.org/index.rss
  • Screenshot
  • http//paulboutin.weblogger.com/pictures/viewer67
    3
  • More info http//slate.msn.com/id/2096660/

ltchannelgt lttitlegtmy channellt/titlegt ltitemgt
lttitlegtstory 1lt/titlegt ltlinkgtlt/linkgt lt/itemgt //
other items lt/channelgt
59
More XML Applications/Protocols
  • SOAP Simple Object Access Protocol
  • XML-based messaging format
  • Used by Google API http//www.google.com/apis/
  • Amazon API http//amazon.com/gp/aws/landing.html
  • Amazon light http//kokogiak.com/amazon/
  • Other examples http//www.wired.com/wired/archive
    /12.03/google.html?pg10topictopic_set
  • SOAP envelope with header and body
  • Request sales tax for total

ltSOAPEnvelope xmlnsSOAP"urnschemas-xmlsoap-or
gsoap.v1"gt ltSOAPHeadergtlt/SOAPHeadergt
ltSOAPBodygt ltGetSalesTaxgt
ltSalesTotalgt100lt/SalesTotalgt
ltGetSalesTaxgt lt/SOAPBodygt lt/SOAPEnvelopegt
60
More XML Applications/Protocols
  • lt?xml version"1.0" encoding"UTF-8"?gt
  • ltsoapEnvelope xmlnssoap"http//schemas.xmlsoap.
    org/soap/envelope/"gt
  • ltsoapBodygt
  • ltgsdoGoogleSearch xmlnsgs"urnGoogleSearch"
    gt
  • ltkeygt(key)slt/keygt
  • ltstartgt0lt/startgt
  • ltmaxResultsgt10lt/maxResultsgt
  • ltfiltergttruelt/filtergt
  • ltrestrict/gt
  • ltsafeSearchgtfalselt/safeSearchgt
  • ltlr/gt
  • lt/gsdoGoogleSearchgt
  • lt/soapBodygt
  • lt/soapEnvelopegt

61
New topic XML in Oracle - purchase-order e.g
lt?xml version"1.0"?gt ltpurchase_ordergt
ltcustomer_namegtAlpha Techlt/customer_namegt
ltpo_numbergt11257gtlt/po_numbergt
ltpo_dategt2004-01-20lt/po_dategt ltpo_itemsgt
ltitemgt ltpart_numbergtAI5-4557lt/part_numbergt
ltquantitygt20lt/quantitygt lt/itemgt
ltitemgt ltpart_numbergtEI-T5-001lt/part_numbergt
ltquantitygt12lt/quantitygt lt/itemgt
lt/po_itemsgt lt/purchase_ordergt
62
Storing XML data
  • As of 9i, has XMLType data type
  • By default, underlying storage is as CLOB

CREATE TABLE purchase_order( po_id number(5)
not null, customer_po_nbr varchar(20),
customer_inception_date date, order_nbr
number(5), purchase_order_doc xmltype,
constraint purchase_order_pk primary
key(po_id) )
63
Loading XML into Oracle
  • First, log in as sys
  • Now scott can import

connect sys/junk as sysdba create directory
xml_data as '/xml' grant read, write on
directory xml_data to scott
connect scott/tiger declare bf1 bfile begin bf1
bfilename('XML_DATA', 'purch_ord.xml') insert
into purchase_order(po_id, purchase_order_doc)
values(1000, xmltype(bf1, nls_charset_id('we8mswi
n1252'))) end
64
Loading XML into Oracle
  • Not just loading raw text
  • XMLType data must be well-formed
  • Parsable as XML
  • Try modifying customer_name open tag

65
Accessing XML in Oracle
  • Now can look at raw XML
  • Can also use XPath to extract particular nodes
    and values, with extract function

SQLgt SELECT purchase_order_doc FROM
purchase_order
SQLgt SELECT extract(purchase_order_doc,
'/purchase_order/customer_name') FROM
purchase_order
66
XPath in Oracle
  • Can also extract all nodes of one type,
    underneath some node, with double-slash //
  • All purchase order items
  • NB this is not valid XML
  • No unique root
  • Can request just one with bracket op
  • Numbering starts at 1, not 0
  • Wrong name/number ? no error, no results

SQLgt SELECT extract(purchase_order_doc,
'/purchase_order//item') FROM purchase_order
SQLgt SELECT extract(purchase_order_doc,
'/purchase_order/po_items/item2') FROM
purchase_order
67
extract v. extractvalue
  • extractvalue returns value, not whole node
  • vs.
  • extractvalue applies only to unique nodes

SQLgt SELECT extract(purchase_order_doc,
'/purchase_order/customer_name') FROM
purchase_order
SQLgt SELECT extractvalue(purchase_order_doc,
'/purchase_order/customer_name') FROM
purchase_order
SQLgt SELECT extractvalue(purchase_order_doc,
'/purchase_order/po_items') FROM purchase_order
68
existsnode function
  • Can check whether node/location exists with
    existnode function
  • Returns 1 or 0
  • Also applies to bracketed paths

SQLgt SELECT po_id FROM purchase_order WHERE
existsnode(purchase_order_doc, '/purchase_order/c
ustomer_name') 1
SQLgt SELECT po_id FROM purchase_order WHERE
existsnode(purchase_order_doc, '/purchase_order/p
o_items/item1') 1
69
Moving data from XML to relations
  • To move single values from XML to tables, can
    simply use extractvalue in UPDATE statements

SQLgt UPDATE purchase_order SET order_nbr
7101, customer_po_nbr extractvalue(purchase_ord
er_doc, '/purchase_order/po_number'), customer_in
ception_date to_date(extractvalue(purchase_orde
r_doc, '/purchase_order/po_date'), 'yyyy-mm-dd')
70
Moving data from XML to relations
  • What about moving set of nodes
  • The two item nodes
  • Use xmlsequence to get a varray of items
  • Use TABLE to convert to a relation

SQLgt SELECT extract(purchase_order_doc,
'/purchase_order//item') FROM purchase_order
SQLgt SELECT rownum, item. FROM TABLE( SELECT
xmlsequence(extract(purchase_order_doc,
'/purchase_order//item')) FROM purchase_order)
item
71
Moving data from XML to relations
  • Result is a two-row relation with XMLTypes
  • Can use extractvalue to extract this data
  • First, create destination table

CREATE TABLE LINE_ITEM( ORDER_NBR NUMBER(9)
NOT NULL, PART_NBR VARCHAR2(20) NOT NULL,
QTY NUMBER(5) NOT NULL, FILLED_QTY
NUMBER(5), CONSTRAINT line_item_pk PRIMARY
KEY (ORDER_NBR,PART_NBR))
72
Moving data from XML to relations
  • Then insert results

SQLgt INSERT INTO line_item(order_nbr,part_nbr,qty)
SELECT 7109, extractvalue(column_value,
'/item/part_number'), extractvalue(column_value,
'/item/quantity') FROM TABLE( SELECT
xmlsequence(extract(purchase_order_doc,
'/purchase_order//item')) FROM
purchase_order )
73
XML Schemas and Oracle
  • By default, XML must be well-formed to be read
    into the XMLType field
  • XML is valid if it conforms to a schema
  • To use a schema with Oracle, must first register
    it

declare bf1 bfile begin bf1
bfilename('XML_DATA', 'purch_ord.xsd') dbms_xmls
chema.registerschema( 'http//localhost8080/home/
xml/schemas/purch_ord.xsd', bf1) end
74
XML Schemas and Oracle
  • With schema registered, can apply it to an
    XMLType field

CREATE TABLE purchase_order2 (po_id NUMBER(5)
NOT NULL, customer_po_nbr VARCHAR2(20),
customer_inception_date DATE, order_nbr
NUMBER(5), purchase_order_doc XMLTYPE,
CONSTRAINT purchase_order2_pk PRIMARY KEY
(po_id)) XMLTYPE COLUMN purchase_order_doc
XMLSCHEMA "http//localhost8080/home/xml/schemas/
purch_ord.xsd" ELEMENT "purchase_order"
75
Importing to schema field
  • Try to import xml file, get error

declare bf1 bfile begin bf1
bfilename('XML_DATA', 'purch_ord.xml') insert
into purchase_order2(po_id, purchase_order_doc)
values (2000, XMLTYPE(bf1, nls_charset_id('WE8MS
WIN1252'))) end
76
Importing to schema field
  • Root node of XML must specify the schema
  • Change root to the following
  • Now can import
  • Also fails if extra or missing nodes
  • Modify company_name node
  • Add new comments node

ltpurchase_order xmlnsxsi"http//www.w3.org/2001
/XMLSchema-instance" xsinoNamespaceSchemaLocation
"http//localhost8080/home/xml/schemas/purch_ord
.xsd"gt
77
Can check to see whether schema is used
  • Can call isSchemaBased(), getSchemaURL() and
    isSchemaValid() on XMLType fields

SQLgt select po.purchase_order_doc.isSchemaBased(),
po.purchase_order_doc.getSchemaURL(), po.purcha
se_order_doc.isSchemaValid() from purchase_order2
po
78
Updating XMLType data
  • Can update XMLType data with ordinary UPDATE
    statements
  • Replaces whole XMLType object with new one

SQLgt UPDATE purchase_order po SET
po.purchase_order_doc XMLTYPE(BFILENAME('XML_
DATA', 'purch_ord_alt.xml'),
nls_charset_id('WE8MSWIN1252')) WHERE po.po_id
2000
79
Updating XMLType data
  • Can also modify the existing XMLType object
  • By writing node values
  • updateXML() function does search/replace
  • But searches for node, not value

SQLgt SELECT extract(po.purchase_order_doc, '/purch
ase_order/customer_name') FROM purchase_order
po WHERE po_id 1000 SQLgt UPDATE
purchase_order po SET po.purchase_order_doc
updateXML(po.purchase_order_doc, '/purchase_order/
customer_name/text()', 'some other
company') WHERE po.po_id 1000
80
Updating XMLType data
  • Can also write whole node, using XMLType
  • Validation/well-formedness is still checked

SQLgt UPDATE purchase_order po SET
po.purchase_order_doc updateXML(po.purchase_ord
er_doc, '/purchase_order/customer_name', XMLTYPE
('ltcustomer_namegtsome third companylt/customer_nam
egt')) WHERE po.po_id 1000 SQLgt SELECT
extract(po.purchase_order_doc, '/purchase_order/c
ustomer_name') FROM purchase_order po WHERE
po_id 1000
81
Updating XMLType data
  • And can update items in a collection

SQLgt SELECT extract(po.purchase_order_doc,
'/purchase_order//item') FROM purchase_order
po WHERE po.po_id 1000 SQLgt UPDATE
purchase_order po SET po.purchase_order_doc
updateXML(po.purchase_order_doc,
'/purchase_order/po_items/item1',
XMLTYPE('ltitemgtltpart_numbergtT-1000lt/part_numbergtltq
uantitygt33lt/quantitygtlt/itemgt')) WHERE po.po_id
1000
82
Converting relational data to XML
  • Saw how to put XML in a table
  • Conversely, can convert ordinary relational data
    to XML
  • XMLElement() generates an XML node
  • First, create supplier table

CREATE TABLE SUPPLIER( SUPPLIER_ID NUMBER(5)
NOT NULL, NAME VARCHAR2(30) NOT NULL,
PRIMARY KEY (SUPPLIER_ID)) insert into supplier
values(1, 'Acme') insert into supplier values(2,
'Tilton') insert into supplier values(3,
'Eastern')
83
Converting relational data to XML
  • Now can call XMLElement function to wrap values
    in tags
  • And can build it up
  • Dont concatenate! Turns to strings, escapes lt gt
  • Error in book

SELECT XMLElement("supplier_id", s.supplier_id)
XMLElement("name", s.name) xml_fragment FROM
supplier s
SELECT XMLElement("supplier", XMLElement("supplie
r_id", s.supplier_id), XMLElement("name",
s.name)) FROM supplier s
84
XMLForest()
  • More simply, can use XMLForest() function

SELECT XMLElement("supplier",
XMLForest(s.supplier_id, s.name)) FROM supplier s
85
XMLAgg()
  • Can use XMLAgg() to put nodes together inside
    another node

SELECT XMLElement("supplier_list",
XMLAgg(XMLElement("supplier",
XMLElement("supplier_id", s.supplier_id),
XMLElement("name", s.name) )))
xml_document FROM supplier s
86
New topic Data Warehousing
  • Physical warehouse stores different kinds of
    items
  • combined from different sources in supply chain
  • access items as a combined package
  • Synergy
  • DW is the sys containing the data from many DBs
  • OLAP is the system for easily querying the DW
  • Online analytical processing
  • front-end to DW stats

87
Integrating Data
  • Ad hoc combination of DBs from different sources
    can be problematic
  • Data may be spread across many systems
  • geographically
  • by division
  • different systems from before mergers

88
Conversion/scrubbing/merging
  • Lots of issues
  • different types of data
  • Varchar(255) v. char(30)
  • Different values for data
  • GREEN/GR/2
  • Semantic differences
  • Cars v. Automobiles
  • Missing values
  • Handle with nulls or XML

89
Federated DBs
  • Situ n different DBs must work together
  • One idea write programs for each to talk to each
    other one
  • How many programs required?
  • Like ambassadors for each country

90
Federated DBs
  • Better idea introduce another DB
  • write programs for it to talk to each other DB
  • Now how many programs?
  • English in business, French in diplomacy
  • ? Warehousing
  • Refreshed nightly

91
OLTP v. OLAP
  • DWs usually not updated in real-time
  • data is usually not live
  • but care about higher-level, longer-term patterns
  • For knowledge workers/decision-makers
  • Live data is in system used by OLTP
  • online transaction processing
  • E.g., airline reservations
  • OLTP data loaded into DW periodically, say nightly

92
Utilizing Data
  • Situ each time manager has hunch
  • ? requests custom reports
  • ? direct programmers to write/modify SQL app to
    produce these results
  • on higher or lower levels, for different
    specifics
  • Problem too difficult/expensive/slow
  • too great a time lag

93
EISs
  • Could just write queries at command-prompt
  • But decision makes arent (all) SQL programmers
  • Soln create an executive information system
  • provides friendly front-end to common, important
    queries
  • basically a simple DB front-end
  • your project part 5
  • GROUP BY queries are particularly applicable

94
EISs v. OLAP
  • Okay for fixed set of queries
  • But what if queries are open-ended?
  • Q Whats driving sales in the Northeast?
  • Whats the source cause?
  • Result from one query influences next query tried
  • OLAP systems are interactive
  • run query
  • analyze results
  • think of new query
  • repeat

95
Star Schemas
  • Popular schema for DW data
  • One central DB surrounded by specific DBs
  • Center fact table
  • Extremities data tables
  • Fields in fact table are foreign keys to data
    tables
  • Normalization ? Snowflake Schema
  • May not be worthwhile

96
Dates and star schemas
  • OLAP behaves as though you had a Days table, with
    every possible row
  • Dates(day, week, month, year, DID)
  • (5, 27, 7, 2000)
  • Can join on Days like any other table

97
Dates and star schemas
  • E.g. products x salesperson x region x date
  • Products sold by salespeople in regions on dates
  • Regular dim tables
  • Product(PID, name, color)
  • Emp(name, SSN, sal)
  • Region(name, RID)
  • Fact table
  • Sales(PID, DID, SSN, RID)
  • Interpret as a cube (cross product of all
    dimensions)
  • Can have both data and stats

98
Drill-down roll-up
  • Imagine notice some regions sales way up
  • Why? Good salesperson? Some popular product
    there?
  • Maybe need to search by month, or month and
    product, abstract back up to just product
  • slicing dicing

99
OLAP and data warehousing
  • Could write GROUP BY queries for each
  • OLAP systems provide simpler, non-SQL interface
    for this sort of thing
  • Vendors MicroStrategy, SAP, etc.
  • Otoh DW-style operators have been added to SQL
    and some DBMSs

100
DW extensions in SQL ROLLUP (Oracle)
  • Suppose have orders table (from two years), with
    region and date info
  • Can select total sales
  • Examples derived/from Mastering Oracle SQL, 2e
    (OReilly)
  • Get data here http//examples.oreilly.com/mastora
    sql2/mosql2_data.sql

SQLgt column month format a10 SQLgt
_at_mosql2_data SQLgt describe all_orders
SELECT sum(o.tot_sales) FROM all_orders o join
region r ON r.region_id o.region_id
101
DW extensions in SQL ROLLUP (Oracle)
  • Can write GROUP BY queries for year or region or
    both

SELECT r.name region, o.year, sum(o.tot_sales) FRO
M all_orders o join region r ON r.region_id
o.region_id GROUP BY (r.name, o.year)
102
DW extensions in SQL ROLLUP (Oracle)
  • ROLLUP operator
  • Extension of GROUP BY
  • Does GROUP BY on several levels, simultaneously
  • Order matters
  • Get sales totals for each region/year pair each
    region, and the grand total

SELECT r.name region, o.year, sum(o.tot_sales) FRO
M all_orders o join region r ON r.region_id
o.region_id GROUP BY ROLLUP (r.name, o.year)
103
DW extensions in SQL ROLLUP (Oracle)
  • Change the order of the group fields to get a
    different sequence of groups
  • To get totals for each year/region pair, each
    year, and the grand total, and just reverse
    group-by order

SELECT o.year, r.name region, sum(o.tot_sales) FRO
M all_orders o join region r ON r.region_id
o.region_id GROUP BY ROLLUP (o.year, r.name)
104
DW extensions in SQL ROLLUP (Oracle)
  • Adding more dimensions, like month, is easy
    (apart from formatting)
  • NB summing happens on each level

SELECT o.year, to_char(to_date(o.month,
'MM'),'Month') month, r.name region,
sum(o.tot_sales) FROM all_orders o join region
r ON r.region_id o.region_id GROUP BY ROLLUP
(o.year, o.month, r.name)
105
DW extensions in SQL ROLLUP (Oracle)
  • If desired, can combine fields for the sake of
    grouping

SELECT o.year, to_char(to_date(o.month,
'MM'),'Month') month, r.name region,
sum(o.tot_sales) FROM all_orders o join region
r ON r.region_id o.region_id GROUP BY ROLLUP
((o.year, o.month), r.name)
106
DW extensions in SQL CUBE (Oracle)
  • Another GROUP BY extension CUBE
  • Subtotals all possible combins of group-by fields
    (powerset)
  • Syntax ROLLUP ? CUBE
  • Order of fields doesnt matter (apart from
    ordering)
  • To get subtotals for each region/month pair, each
    region, each month, and the grand total

SELECT to_char(to_date(o.month, 'MM'),'Month')
month, r.name region, sum(o.tot_sales) FROM
all_orders o join region r ON r.region_id
o.region_id GROUP BY CUBE (o.month, r.name)
107
DW extensions in SQL CUBE (Oracle)
  • Again, can easily add more dimensions

SELECT o.year, to_char(to_date(o.month,
'MM'),'Month') month, r.name region,
sum(o.tot_sales) FROM all_orders o join region
r ON r.region_id o.region_id GROUP BY CUBE
(o.year, o.month, r.name)
108
DW SQL exts GROUPING SETS (Oracle)
  • Thats a lot of rows
  • Instead of a cube of all combinations, maybe we
    just want the totals for each individual field

SELECT o.year, to_char(to_date(o.month,
'MM'),'Month') month, r.name region,
sum(o.tot_sales) FROM all_orders o join region
r ON r.region_id o.region_id GROUP BY GROUPING
SETS (o.year, o.month, r.name)
109
Next
  • Final evals
  • More lab

110
Thats all, folks!
  • Selected solutions to exercises
  • sqlzoo Answers on sqlzoo.net
  • PL/SQL http//pages.stern.nyu.edu/mjohnson/orac
    le/archive/fall04/plsql/
  • mpjohnson-at-gmail.com
Write a Comment
User Comments (0)
About PowerShow.com