SQL: Queries, Programming, Triggers - PowerPoint PPT Presentation

Provided by: RaghuRamak244

1
Database Systems I: The Semistructured Data Model
2
The Web Today
  • HTML documents
  • generated by humans or by applications,
  • consumed by humans only,
  • easy access across platforms, across
    organizations.
  • ⇒ only layout, no semantic information
  • Limited application interoperability
  • HTML is not understood by applications; at most,
    some heuristic rules can be applied.
  • Database technology
  • SQL standard, but still lots of vendor specific
    aspects in implementations.

3
XML Data Exchange Format
  • A standard from the W3C (World Wide Web
    Consortium, http://www.w3.org).
  • The mission of the W3C:
  • ". . . developing common protocols that
    promote its evolution and ensure its
    interoperability. . . ."
  • Basic ideas
  • XML data
  • XML generated by applications
  • XML consumed by applications
  • Easy access across platforms, organizations.

4
Paradigm Shift on the Web
  • For web search engines
  • From documents (HTML) to data (XML)
  • From document management to document
    understanding (e.g., question answering)
  • From information retrieval to data management
  • For database systems
  • From relational (structured) model to
    semistructured data
  • From data processing to data/query translation
  • From storage to transport

5
The Semistructured Data Model
  • Developed by the DBS community to address the
    following emerging issues:
  • Data sets with non-rigid structure
  • Biological data: sequence data, 3D data, text
    data . . . and their relationships
  • Web data
  • Integration of heterogeneous sources: not only,
    but especially, for Web data and biological data.

6
The Semistructured Data Model
  • Data is self-describing, i.e. the data
    description is integrated with the data itself
    rather than in a separate schema.
  • Database is a collection of nodes and arcs
    (directed graph).
  • Leaf nodes represent data of some atomic type
    (atomic objects, such as numbers or strings).
  • Interior nodes represent complex objects
    consisting of components (child nodes), connected
    by arcs to this node.
  • Arcs are directed and connect two nodes.

7
The Semistructured Data Model
  • Arc labels indicate the relationship between the
    two corresponding nodes.
  • The root node is the only interior node without
    in-arcs, representing the entire database.
  • All database objects are children of the root
    node.
  • Every node must be reachable from the root.
  • A general graph structure is possible, i.e. the
    graph need not be a tree structure.
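
The graph conditions above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical in-memory encoding in which each interior node maps to a list of (arc label, target) pairs; the node ids are borrowed from the bibliography example on the following slides, and the contents of o12 and o24 are left empty for brevity.

```python
from collections import Counter, deque

# Hypothetical encoding: interior node id -> list of (arc_label, target id).
arcs = {
    "o1": [("paper", "o12"), ("book", "o24"), ("paper", "o29")],
    "o12": [],  # contents elided
    "o24": [],  # contents elided
    "o29": [("author", "o52"), ("references", "o12"), ("references", "o24")],
}
# Leaf nodes carry atomic values.
leaves = {"o52": "Abiteboul"}


def reachable(root):
    """Return the set of node ids reachable from root by following arcs."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for _label, target in arcs.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


all_nodes = set(arcs) | set(leaves)
in_arcs = Counter(target for out in arcs.values() for _label, target in out)

roots = all_nodes - set(in_arcs)            # nodes without in-arcs
unreachable = all_nodes - reachable("o1")   # must be empty
shared = sorted(n for n, count in in_arcs.items() if count > 1)
```

Here `roots` should contain only the root, `unreachable` should be empty, and `shared` lists the nodes with more than one in-arc, which is what makes this graph not a tree.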

8
Graphical Representation
[Figure: the bibliography database drawn as a directed graph. The root o1 (Bib) has arcs labeled paper, book, and paper to the nodes o12, o24, and o29. Further arcs are labeled author, title, year, publisher, http, page, and references, with first / last and firstname / lastname below them. Leaves hold atomic values such as Serge, Abiteboul, Victor, Vianu, 1997, 122, and 133.]
9
Textual Representation
  • Example
        Bib: o1 { paper: o12 { ... },
                  book:  o24 { ... },
                  paper: o29
                    { author: o52 "Abiteboul",
                      author: o96 { firstname: 243 "Victor",
                                    lastname: o206 "Vianu" },
                      title: o93 "Regular path queries with constraints",
                      references: o12,
                      references: o24,
                      pages: o25 { first: o64 122, last: o92 133 }
                    }
                }
  • Nested tuples, set-values, object identifiers
    (oids)

10
Textual Representation
  • Simplified textual representation.
  • Can omit oids.
        { paper: { author: "Abiteboul",
                   author: { firstname: "Victor",
                             lastname: "Vianu" },
                   title: "Regular path queries ...",
                   page: { first: 122, last: 133 } } }
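
One way to hold such oid-free data in a program is as nested lists of (label, value) pairs, which, unlike a dictionary, allow a label such as author to repeat. The sketch below is an illustration under that assumption; the function name `to_text` is hypothetical, and the title is left out for brevity.

```python
def to_text(value):
    """Serialize a nested value in the simplified, oid-free notation."""
    if isinstance(value, list):  # complex object: list of (label, value) pairs
        inner = ", ".join(f"{label}: {to_text(v)}" for label, v in value)
        return "{" + inner + "}"
    if isinstance(value, str):   # atomic string
        return f'"{value}"'
    return str(value)            # atomic number


paper = [
    ("author", "Abiteboul"),
    ("author", [("firstname", "Victor"), ("lastname", "Vianu")]),
    ("page", [("first", 122), ("last", 133)]),
]
text = to_text([("paper", paper)])
```

The labels travel with the values, so no separate schema is needed to interpret `text`.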

11
Comparison with Relational Model
  • Missing attributes
  • Additional attributes
  • Multiple attribute values (set-valued attributes)
  • Objects as attribute values
  • No global schema
  • ⇒ Only the first characteristic is supported by
    the relational model; the others are not.

12
Comparison with Relational Model
  • Semistructured data
  • Self-describing,
  • Irregular data,
  • No a-priori structure.
  • Relational DB
  • Separate schema,
  • Regular data,
  • A-priori structure.

13
Comparison with Relational Model
Example
    { row: { name: "John", phone: 3634 },
      row: { name: "Sue", phone: 6343 },
      row: { name: "Dick", phone: 6363 } }
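
The encoding of a relation as semistructured data can be sketched as follows. This is an illustration only: the helper `row_to_object` and the extra entry for "Ann" with an email label are hypothetical and do not come from the slides.

```python
columns = ("name", "phone")
rows = [("John", 3634), ("Sue", 6343), ("Dick", 6363)]


def row_to_object(row):
    """Turn one tuple into a list of (label, value) pairs, skipping NULLs."""
    return [(col, val) for col, val in zip(columns, row) if val is not None]


# Every tuple becomes a "row" object that carries its own attribute labels.
db = [("row", row_to_object(r)) for r in rows]

# Irregular data needs no schema change: a missing phone, an extra label.
db.append(("row", [("name", "Ann"), ("email", "ann@example.org")]))
```

The regular relation is just the special case in which every row object happens to have the same labels.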
14
XML
  • A W3C standard for an Extensible Markup Language.
  • Origins: structured text, SGML (Standard
    Generalized Markup Language).
  • Motivation
  • HTML describes presentation only, XML describes
    content and its meaning (semantics).
  • HTML is a fixed language; XML allows you to define
    your own markup languages.

15
From HTML to XML
⇒ HTML describes the presentation / layout
16
From HTML to XML
HTML example
    <h1> Bibliography </h1>
    <p> <i> Foundations of Databases </i>
        Abiteboul, Hull, Vianu <br>
        Addison Wesley, 1995
    <p> <i> Data on the Web </i>
        Abiteboul, Buneman, Suciu <br>
        Morgan Kaufmann, 1999
17
From HTML to XML
  • XML example
        <bibliography>
          <book> <title> Foundations </title>
            <author> Abiteboul </author>
            <author> Hull </author>
            <author> Vianu </author>
            <publisher> Addison Wesley </publisher>
            <year> 1995 </year>
          </book>
        </bibliography>
  • XML describes the content
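
Because the tags name the content, a program can pick out fields directly instead of guessing from layout. A minimal sketch using Python's standard xml.etree.ElementTree module (the choice of Python is an assumption of this example, not part of the slides):

```python
import xml.etree.ElementTree as ET

doc = """<bibliography>
  <book> <title> Foundations </title>
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Vianu </author>
    <publisher> Addison Wesley </publisher>
    <year> 1995 </year>
  </book>
</bibliography>"""

root = ET.fromstring(doc)
book = root.find("book")

# Element text keeps the surrounding blanks, so strip it before use.
title = book.findtext("title").strip()
authors = [a.text.strip() for a in book.findall("author")]
year = int(book.findtext("year"))
```

The same extraction from the HTML version would have to rely on heuristics such as "the italic text is the title".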

18
Elements
  • Tags: book, title, author, ...
  • start tag <book>, end tag </book>
  • defined by user / programmer (different from
    HTML!)
  • Elements: <book>...</book>, <author>...</author>
  • An element consists of a matching start and end
    tag and the enclosed content.
  • Elements can be nested, i.e. the content of one
    element can consist of a sequence of other elements.

19
Attributes
  • Attributes can be associated with any element.
  • Provide additional information about elements.
  • Attributes can have only one value.
  • Example
        <book price="55" currency="USD">
          <title> Foundations of Databases </title>
          <author> Abiteboul </author>
          <year> 1995 </year>
        </book>
  • Attributes can also be used to connect elements.
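
A short sketch of how attributes look to a program, again using Python's standard xml.etree.ElementTree module. The helper `parses` and the attribute name isbn are hypothetical. The last line illustrates the single-value rule: repeating an attribute in one start tag is rejected by the parser.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book price="55" currency="USD">'
    "<title>Foundations of Databases</title>"
    "</book>"
)

price = int(book.get("price"))   # attribute values arrive as strings
currency = book.get("currency")
missing = book.get("isbn")       # absent attribute -> None


def parses(text):
    """Return True if text parses as XML, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


duplicate_ok = parses('<book price="55" price="60"/>')
```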

20
Non-tree-like XML
  • So far only tree-like XML documents, i.e. each
    element is nested within at most one other
    element.
  • Attributes can also be used to create non-tree
    XML documents.
  • Attributes with a domain of ID serve as primary
    keys of elements.
  • Attributes with a domain of IDREF serve as
    foreign keys referencing the ID of another
    element.

21
Non-tree-like XML
  • Example of a non-tree structure
        <persons>
          <person personid="o555"> <name> Jane </name>
          </person>
          <person personid="o456">
            <name> Mary </name>
            <children refs="o123 o555"/>
          </person>
          <person personid="o123" mother="o456">
            <name>John</name>
          </person>
        </persons>
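
Following such references works much like a key lookup. The sketch below uses Python's standard xml.etree.ElementTree module, which does not interpret ID / IDREF itself when no DTD is processed, so the index `by_id` is built by hand; that index is an assumption of this example.

```python
import xml.etree.ElementTree as ET

doc = """<persons>
  <person personid="o555"><name>Jane</name></person>
  <person personid="o456">
    <name>Mary</name>
    <children refs="o123 o555"/>
  </person>
  <person personid="o123" mother="o456"><name>John</name></person>
</persons>"""

root = ET.fromstring(doc)

# Index playing the role of the ID "primary key".
by_id = {p.get("personid"): p for p in root.findall("person")}

# IDREFS: a whitespace-separated list of ids.
mary = by_id["o456"]
child_ids = mary.find("children").get("refs").split()
children = [by_id[ref].findtext("name") for ref in child_ids]

# IDREF: a single id, here pointing back from John to Mary.
john_mother = by_id[by_id["o123"].get("mother")].findtext("name")
```

John is nested only inside persons in the document tree, yet he is reachable from Mary as well, and Mary from John, which is what makes the structure a graph.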

22
Namespaces
  • An XML document can involve tags that come from
    multiple sources.
  • One and the same tag can appear in more than one
    source.
        <table> <tr>
          <td>Apples</td>
          <td>Bananas</td>
        </tr> </table>
        <table>
          <name>African Coffee Table</name>
          <width>80</width>
          <length>120</length>
        </table>

23
Namespaces
  • Name conflicts can be resolved by prefixing tag
    names according to their source.
        <h:table> <h:tr> <h:td>Apples</h:td>
                         <h:td>Bananas</h:td> </h:tr>
        </h:table>
        <f:table>
          <f:name>African Coffee Table</f:name>
          <f:width>80</f:width>
          <f:length>120</f:length>
        </f:table>
  • When using prefixes in XML, a namespace for the
    prefix must be defined.
  • The namespace must be referenced (via a URI) in
    the start tag of the element itself or of an
    enclosing element.

24
Namespaces
        <root>
        <h:table xmlns:h="http://www.w3.org/TR/html4/">
          <h:tr> . . .
          </h:tr> </h:table>
        <f:table xmlns:f="http://www.w3schools.com/furniture">
          . . .
        </f:table> </root>
  • Or alternatively
        <root xmlns:h="http://www.w3.org/TR/html4/"
              xmlns:f="http://www.w3schools.com/furniture">
          <h:table> . . .
          </h:table>
          <f:table>
            . . .
          </f:table>
        </root>

25
Namespaces
  • A URI is a Uniform Resource Identifier,
    typically a URL.
  • The document referenced by the URI describes the
    meaning of the tags in the namespace.
  • This description is informal and is not used by
    the XML parser.
  • The description can even be empty.
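
To a parser, a prefixed tag is simply the pair (namespace URI, local name); the prefix itself is only shorthand. A minimal sketch with Python's standard xml.etree.ElementTree module, where the mapping `ns` is defined by this example and could use any prefixes:

```python
import xml.etree.ElementTree as ET

doc = """<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.w3schools.com/furniture">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>"""

root = ET.fromstring(doc)

# Prefix-to-URI mapping used only for the queries below.
ns = {
    "h": "http://www.w3.org/TR/html4/",
    "f": "http://www.w3schools.com/furniture",
}

cells = [td.text for td in root.findall("h:table/h:tr/h:td", ns)]
furniture = root.find("f:table", ns)
furniture_name = furniture.findtext("f:name", namespaces=ns)

# Internally the tag is stored as {URI}localname.
furniture_tag = furniture.tag
```

The two table elements no longer clash, because their expanded names differ even though the local name is the same. The parser never fetches the URI.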

26
Well-Formed XML
  • A well-formed XML document satisfies the
    following conditions
  • Begins with a declaration that it is XML.
  • Has a single root element that encloses the whole
    document.
  • Consists of properly nested elements, i.e. start
    and end tag of an element are within the same
    enclosing element.
  • standalone="yes" states that the document does
    not depend on an external DTD.
  • In this mode, you can invent your own tags, like
    in semistructured data model.

27
Well-Formed XML
    <?xml version="1.0" standalone="yes"?>
    <bibliography>
      <book> <title> Foundations </title>
        <author> Abiteboul </author>
        <author> Hull </author>
        <author> Vianu </author>
        <publisher> Addison Wesley </publisher>
        <year> 1995 </year>
      </book>
      <book> <title> </title>
        . . . </book>
    </bibliography>
28
Well-Formed XML
  • HTML browsers will display documents with errors
    (like missing end tags).
  • The W3C XML specification states that a program
    should stop processing an XML document if it
    finds an error.
  • The main reason is that XML is consumed by
    programs rather than by humans (as HTML is).
  • W3C provides a validator that checks whether an
    XML document is well-formed.
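
The stop-on-error behaviour can be observed with any conforming parser. The sketch below uses Python's standard xml.etree.ElementTree module; the function name `is_well_formed` is an assumption of this example. Note that this parser accepts documents without the XML declaration, so it checks nesting and the single root only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text, False if it stops with an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


properly_nested = is_well_formed("<a><b></b></a>")
crossed_tags = is_well_formed("<a><b></a></b>")       # improper nesting
two_roots = is_well_formed("<a></a><b></b>")          # no single root
missing_end_tag = is_well_formed("<p>Some text")      # tolerated in HTML only
```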

29
Valid XML
  • The validator can also check whether an XML
    document is valid, i.e. conforms to a Document
    Type Definition (DTD).
  • A DTD specifies the allowable tags and how they
    can be nested.
  • XML with a DTD is no longer semistructured
    (self-describing).
  • However, a DTD is less rigid than the schema of a
    relational DB. E.g., a DTD allows missing and
    multiple attributes / elements.

30
Document Type Definitions
  • Document Type Definition (DTD): a set of rules
    (grammar) specifying elements, attributes and all
    other aspects of XML documents.
  • For each element, specify name and content type.
  • Content type can, e.g., be
  • #PCDATA (character string),
  • other elements,
  • regular expression made of the above content
    types:
        *  zero or more occurrences
        ?  zero or one occurrence
        +  one or more occurrences
        ,  sequence of elements.
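
Since a content type is a regular expression over child elements, checking it amounts to regular-expression matching on the sequence of child tags. The following is a sketch under stated assumptions: the rule (title, author+, publisher?, year) for book is hypothetical, it is translated to a Python regular expression by hand, and Python's standard library does not itself validate against DTDs.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DTD rule: <!ELEMENT book (title, author+, publisher?, year)>
# Each child tag is written as "tag," so that names cannot run together.
BOOK_MODEL = re.compile(r"title,(author,)+(publisher,)?year,")


def matches_model(element, model):
    """Check the sequence of child tags of element against a content model."""
    child_tags = "".join(child.tag + "," for child in element)
    return model.fullmatch(child_tags) is not None


ok = ET.fromstring("<book><title/><author/><author/><year/></book>")
wrong_order = ET.fromstring("<book><author/><title/><year/></book>")
no_author = ET.fromstring("<book><title/><year/></book>")

results = [matches_model(b, BOOK_MODEL) for b in (ok, wrong_order, no_author)]
```

The optional publisher and the repeated author show the flexibility that a fixed relational tuple lacks.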

31
Document Type Definitions
  • Specification of element type:
        <!ELEMENT <Name> <Content> >
  • Specification of attributes:
        <!ATTLIST <ElementName> <AttributeName> <Content> <Type> >
  • Attribute type: either #REQUIRED or #IMPLIED
    (optional).

32
Document Type Definitions
  • ID: domain with unique values within the given
    document.
  • IDREF: references one ID.
  • IDREFS: references a list of IDs.
  • Example
        <Book id="book1" pub="book5" . . .>
        . . .
        <Book id="book5" pub="book4" . . .>

33
Document Type Definitions
  • Document type contains all corresponding element
    types:
        <!DOCTYPE <Name> [ <ElementTypes> ]>
  • Use of DTD by some document:
  • reference DTD in document opening line,
  • standalone="no".
  • Example
        <?xml version="1.0" standalone="no"?>
        <!DOCTYPE Book SYSTEM "Book.dtd">

34
Example DTD Product Catalog
        <!DOCTYPE CATALOG [
        <!ELEMENT CATALOG (PRODUCT+)>
        <!ELEMENT PRODUCT (SPECIFICATIONS+,OPTIONS?,PRICE+,NOTES?)>
        <!ATTLIST PRODUCT NAME CDATA #IMPLIED
            CATEGORY (HandTool|Table|Shop-Professional) "HandTool"
            PARTNUM CDATA #IMPLIED
            PLANT (Pittsburgh|Milwaukee|Chicago) "Chicago"
            INVENTORY (InStock|Backordered|Discontinued) "InStock">
        <!ELEMENT SPECIFICATIONS (#PCDATA)>
        <!ATTLIST SPECIFICATIONS WEIGHT CDATA #IMPLIED
            POWER CDATA #IMPLIED>
        <!ELEMENT OPTIONS (#PCDATA)>
        <!ATTLIST OPTIONS FINISH (Metal|Polished|Matte) "Matte"
            ADAPTER (Included|Optional|NotApplicable) "Included"
            CASE (HardShell|Soft|NotApplicable) "HardShell">
        <!ELEMENT PRICE (#PCDATA)>
        <!ATTLIST PRICE MSRP CDATA #IMPLIED
            WHOLESALE CDATA #IMPLIED
            STREET CDATA #IMPLIED

35
XML Schema
  • The successor of DTDs to specify a schema for XML
    documents.
  • A W3C standard.
  • Includes and extends functionality of DTDs.
  • In particular, XML Schemas support data types.
    This makes it easier to validate the correctness
    of data and to work with data from a database.
  • XML Schemas are written in XML. You don't have to
    learn a new language and can use your XML parser
    to parse your Schema files.

36
Simple Elements
  • Simple elements contain only text.
  • They can have one of the built-in datatypes
  • xs:string, xs:decimal, xs:integer, xs:boolean,
    xs:date, xs:time.
  • Example
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="age" type="xs:integer"/>
        <xs:element name="dateborn" type="xs:date"/>

37
Simple Elements
  • Restrictions allow you to further constrain the
    content of simple elements.
        <xs:element name="age">
          <xs:simpleType>
            <xs:restriction base="xs:integer">
              <xs:minInclusive value="0"/>
              <xs:maxInclusive value="120"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
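
What this restriction asks a validator to check can be sketched in a few lines. This is only an approximation written for illustration: the function `check_age` is hypothetical, Python's standard library has no XML Schema validator, and Python's int() accepts a few spellings (such as digit-group underscores) that xs:integer does not.

```python
def check_age(text):
    """Approximate xs:integer with minInclusive=0 and maxInclusive=120."""
    try:
        value = int(text)          # base type: must be an integer literal
    except ValueError:
        return False
    return 0 <= value <= 120       # the two inclusive facets


accepted = [check_age(t) for t in ("0", "35", "120")]
rejected = [check_age(t) for t in ("-1", "121", "12.5", "old")]
```

A DTD could only say that age contains character data; the typed restriction is what lets the schema reject "121" or "old".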

38
Attributes
  • Attributes can be specified using the attribute
    element
        <xs:attribute name="xxx" type="yyy"/>
  • Attribute declarations are nested within the
    declaration of the element with which they are
    associated.
  • By default, attributes are optional.
  • To make an attribute mandatory, use
        <xs:attribute name="lang" type="xs:string" use="required"/>
  • Attributes can have the same built-in datatypes
    as simple elements.

39
Complex Elements
  • Complex elements can contain other elements and
    can have attributes.
  • Nested elements need to occur in the order
    specified.
  • The number of repetitions of an element is
    controlled by the attributes minOccurs and
    maxOccurs. The default is one repetition.
  • A complex element with an attribute
        <xs:element name="product">
          <xs:complexType>
            <xs:attribute name="prodid" type="xs:positiveInteger"/>
          </xs:complexType>
        </xs:element>

40
Complex Elements
  • A complex element containing a sequence of nested
    (simple) elements
        <xs:element name="employee">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="firstname" type="xs:string"/>
              <xs:element name="lastname" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>

41
Complex Elements
  • If you name the complex element, other elements
    can reference and include it
        <xs:complexType name="persontype">
          <xs:sequence>
            <xs:element name="firstname" type="xs:string"/>
            <xs:element name="lastname" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
        <xs:element name="person" type="persontype"/>

42
XML Document With Schema
  • An XML document that uses a schema has to
    reference the schema in the schemaLocation
    attribute of its root element
        <?xml version="1.0"?>
        <note xmlns="http://www.w3schools.com"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://www.w3schools.com note.xsd">
          <to>Tove</to>
          <from>Jani</from>
          <heading>Reminder</heading>
          <body>Don't forget me this weekend!</body>
        </note>
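
The schemaLocation value is a pair: a namespace URI followed by the location of the schema for that namespace. The sketch below reads it with Python's standard xml.etree.ElementTree module. This parser only exposes the attribute; it neither loads note.xsd nor validates against it.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

doc = """<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(doc)

# Namespaced attributes are addressed as {URI}localname.
namespace, location = root.get(f"{{{XSI}}}schemaLocation").split()

# The default namespace applies to the element names as well.
recipient = root.findtext("{http://www.w3schools.com}to")
```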

43
Example XML Schema
        <schema version="1.0" xmlns="http://www.w3.org/1999/XMLSchema">
          <element name="author" type="string"/>
          <element name="date" type="date"/>
          <element name="abstract">
            <type> </type>
          </element>
          <element name="paper">
            <type>
              <attribute name="keywords" type="string"/>
              <element ref="author" minOccurs="0" maxOccurs="*"/>
              <element ref="date"/>
              <element ref="abstract" minOccurs="0" maxOccurs="1"/>
              <element ref="body"/>
            </type>
          </element>
        </schema>

44
XML vs. Semistructured Data
  • Both described best by a graph.
  • Both are schema-less and self-describing (XML
    without DTD / XML schema).
  • XML is ordered, semistructured data is not.
  • XML can mix text and elements
        <talk> Making Java easier to type and easier
               to type
          <speaker> Phil Wadler </speaker>
        </talk>
  • XML has many other features: attributes,
    entities, processing instructions, comments.
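
Mixed content is visible in a parser's data model. In Python's standard xml.etree.ElementTree module, used here as an illustration, the text before the first child is stored on the parent, and text after a child is stored as that child's tail, so document order is preserved.

```python
import xml.etree.ElementTree as ET

talk = ET.fromstring(
    "<talk> Making Java easier to type and easier to type "
    "<speaker> Phil Wadler </speaker> </talk>"
)

# Text directly inside <talk>, before the first child element.
lead_text = talk.text.strip()

speaker = talk.find("speaker")
speaker_name = speaker.text.strip()

# All text pieces in document order, with whitespace-only pieces dropped.
pieces = [t.strip() for t in talk.itertext() if t.strip()]
```

A pure label / value graph has no natural place for the free text next to the speaker element, which is one way XML goes beyond the semistructured data model.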

45
Summary
  • Due to their variable and complex structure, Web
    documents cannot naturally be modeled using the
    relational model.
  • The Semistructured Data Model is a
    self-describing data model providing sufficient
    flexibility for representing Web documents.
  • One of the weaknesses of the Web is that (HTML)
    documents cannot be processed automatically.
  • The purpose of XML is to provide a way of
    recording the semantics of Web documents and
    their components. To this end, XML allows you
    to define your own application-specific tags.

46
Summary
  • XML documents are lists of elements and
    attributes. Elements can be nested to form
    tree-like structures.
  • Non-hierarchical structures are also possible.
  • Document type definitions (DTDs) are similar to
    but less restrictive than DB schemas, specifying
    rules that corresponding XML documents have to
    satisfy.
  • XML schemas are a more recent and more DB-like
    extension of DTDs.