Title: Indexing
1Indexing
- The syntax for creating an index is
- CREATE UNIQUE INDEX index_name ON table_name (column1, column2, ... column_n) COMPUTE STATISTICS
- Why do we need an index?
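One way to see why an index is useful is to compare a query plan before and after creating it. The following is a minimal sketch using Python's built-in sqlite3 module. The emp table, its columns, and the index name idx_emp_ssn are hypothetical, and the COMPUTE STATISTICS clause is left out because SQLite does not accept it.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ssn TEXT, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(f"{i:09d}", f"employee{i}", "sales") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the human-readable steps of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT name FROM emp WHERE ssn = ?"
plan_before = plan(query, ("000000042",))  # no index yet: the table is scanned

conn.execute("CREATE UNIQUE INDEX idx_emp_ssn ON emp (ssn)")
plan_after = plan(query, ("000000042",))   # the plan now names the index

# A UNIQUE index also rejects a second row with the same key value.
try:
    conn.execute("INSERT INTO emp VALUES ('000000042', 'duplicate', 'hr')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(plan_before)
print(plan_after)
```

The exact wording of the plan text differs between SQLite versions, so the sketch only prints it rather than relying on a fixed string.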
2Normalisation
- Normalisation is a process for reducing redundancy and eliminating insertion, deletion, and update anomalies
- What is wrong with EMP_DEPT?
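The slides do not show the EMP_DEPT schema, so the sketch below assumes a single table that repeats the department name on every employee row. The column names and data are made up. Under that assumption it reproduces an update anomaly and a deletion anomaly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: employee and department facts stored in the same row.
conn.execute(
    "CREATE TABLE emp_dept (ename TEXT, ssn TEXT PRIMARY KEY, "
    "dnumber INTEGER, dname TEXT)"
)
conn.executemany(
    "INSERT INTO emp_dept VALUES (?, ?, ?, ?)",
    [
        ("Smith", "111", 5, "Research"),
        ("Wong", "222", 5, "Research"),
        ("Zelaya", "333", 4, "Administration"),
    ],
)

# Update anomaly: renaming department 5 through one row leaves the other stale.
conn.execute("UPDATE emp_dept SET dname = 'R&D' WHERE ssn = '111'")
names_for_dept_5 = {
    row[0]
    for row in conn.execute("SELECT dname FROM emp_dept WHERE dnumber = 5")
}

# Deletion anomaly: removing the last employee of department 4
# also removes every trace of the department itself.
conn.execute("DELETE FROM emp_dept WHERE ssn = '333'")
dept_4_rows = conn.execute(
    "SELECT COUNT(*) FROM emp_dept WHERE dnumber = 4"
).fetchone()[0]

print(names_for_dept_5, dept_4_rows)
```

Splitting the table into one relation for employees and one for departments would store each department name once, which removes both problems.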
3XML
- The XML material is a modified version of Dr. Sagiv's slides
4XML vs. HTML
- HTML is the HyperText Markup Language
- Designed for a specific application, namely, presenting and linking hypertext documents
- XML describes structure and content (semantics)
- The presentation is defined separately from the structure and the content
5An Address Book as an XML document
    <addresses>
      <person>
        <name>Donald Duck</name>
        <tel>414-222-1234</tel>
        <email>donald@yahoo.com</email>
      </person>
      <person>
        <name>Miki Mouse</name>
        <tel>123-456-7890</tel>
        <email>miki@yahoo.com</email>
      </person>
    </addresses>
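To show how an application might read such a document, here is a minimal sketch that parses the address book with Python's standard xml.etree.ElementTree module and turns each person into a dictionary. The dictionary layout is just one possible choice.

```python
import xml.etree.ElementTree as ET

ADDRESS_BOOK = """
<addresses>
  <person>
    <name>Donald Duck</name>
    <tel>414-222-1234</tel>
    <email>donald@yahoo.com</email>
  </person>
  <person>
    <name>Miki Mouse</name>
    <tel>123-456-7890</tel>
    <email>miki@yahoo.com</email>
  </person>
</addresses>
"""

root = ET.fromstring(ADDRESS_BOOK)

# Turn each <person> element into a plain dictionary keyed by child tag.
people = [
    {child.tag: child.text.strip() for child in person}
    for person in root.findall("person")
]

for entry in people:
    print(entry["name"], entry["tel"], entry["email"])
```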
6Main Features of XML
- No fixed set of tags
- New tags can be added for new applications
- An agreed-upon set of tags can be used in many applications
- Namespaces facilitate uniform and coherent descriptions of data
- For example, a namespace for address books determines whether to use <tel> or <phone>
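The effect of a namespace can be sketched with ElementTree, which expands every prefixed tag to {namespace-URI}local-name. Both namespace URIs below are invented for the example. The point is only that elements from different vocabularies stay distinguishable even when their local names collide.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs are made up for this example.
DOC = """
<root xmlns:a="http://example.org/addressbook"
      xmlns:b="http://example.org/contacts">
  <a:person><a:tel>414-222-1234</a:tel></a:person>
  <b:person><b:phone>123-456-7890</b:phone></b:person>
</root>
"""

root = ET.fromstring(DOC)

# The prefix used in the query ("ab") need not match the one in the document;
# only the URI it is bound to matters.
ns = {"ab": "http://example.org/addressbook"}
tels = [t.text for t in root.findall(".//ab:tel", ns)]

# The two <person> elements have the same local name but different full names.
person_tags = [el.tag for el in root.iter() if el.tag.endswith("person")]

print(tels)
print(person_tags)
```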
7Main Features of XML (cont'd)
- XML has the concept of a schema
- DTD and the more expressive XML Schema
- XML is a data model
- Similar to the semistructured data model
- XML supports internationalization (Unicode) and
platform independence (an XML file is just a
character file)
8XML is the Standard for Data Exchange
- Web services (e.g., e-commerce) require exchanging data between various applications that run on different platforms
- XML (augmented with namespaces) is the preferred syntax for data exchange on the Web
9XML is not Alone
- XML Schemas strengthen the data-modeling capabilities of XML (in comparison to XML with only DTDs)
- XPath is a language for accessing parts of XML documents
- XLink and XPointer support cross-references
- XSLT is a language for transforming XML documents into other XML documents (including XHTML, for displaying XML files)
- Limited styling of XML can be done with CSS alone
- XQuery is a language for querying XML documents
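As a small illustration of XPath-style access, the sketch below uses ElementTree, which implements only a limited subset of XPath (simple path steps and a few predicates). A full XPath, XSLT, or XQuery processor would need a separate library. The data is the address book from the earlier slide.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<addresses>"
    "<person><name>Donald Duck</name><tel>414-222-1234</tel></person>"
    "<person><name>Miki Mouse</name><tel>123-456-7890</tel></person>"
    "</addresses>"
)

# Path steps: every <tel> that is a child of a <person> child of the root.
all_tels = [t.text for t in root.findall("person/tel")]

# Predicate: the <person> whose <name> child has exactly this text.
miki_tel = root.findtext("person[name='Miki Mouse']/tel")

print(all_tels, miki_tel)
```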
10The Two Facets of XML
- Some XML files are just text documents with tags that denote their structure and include some metadata (e.g., an attribute that gives the name of the person who did the proofreading)
- See an example on the next slide
- XML is a subset of SGML (Standard Generalized Markup Language)
- Other XML documents are similar to database files (e.g., an address book)
11XML can Describe the Structure of a Document
    <book year="1994">
      <title>TCP/IP Illustrated</title>
      <author>
        <last>Stevens</last>
        <first>W.</first>
      </author>
      <publisher>Addison-Wesley</publisher>
      <price>65.95</price>
    </book>
12XML Syntax
- W3Schools Resources on XML Syntax
13The Structure of XML
- XML consists of tags and text
- Tags come in pairs: <date> ... </date>
- They must be properly nested
- good:
- <date> ... <day> ... </day> ... </date>
- bad:
- <date> ... <day> ... </date> ... </day>
- (You can't do <i> ... <b> ... </i> ... </b> in HTML)
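The nesting rule is enforced by every XML parser. A minimal sketch with ElementTree, using the good and bad fragments from this slide; the helper name nesting_error is made up here.

```python
import xml.etree.ElementTree as ET


def nesting_error(fragment):
    """Return None if the fragment parses, else the parser's error message."""
    try:
        ET.fromstring(fragment)
        return None
    except ET.ParseError as err:
        return str(err)


good = "<date> ... <day> ... </day> ... </date>"
bad = "<date> ... <day> ... </date> ... </day>"

good_result = nesting_error(good)  # parses without complaint
bad_result = nesting_error(bad)    # rejected: </date> closes while <day> is open

print(good_result)
print(bad_result)
```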
14A Useful Abbreviation
- Abbreviating elements with empty contents
- <br/> for <br></br>
- <hr width="10"/> for <hr width="10"></hr>
- For example:
    <family>
      <person id="lisa">
        <name>Lisa Simpson</name>
        <mother idref="marge"/>
        <father idref="homer"/>
      </person>
      ...
    </family>
Note that a tag may have a set of attributes,
each consisting of a name and a value
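A short sketch of how these points look to a parser: the abbreviated and the long form of an empty element are equivalent, and attributes arrive as name/value pairs. The marge and homer entries are invented here only so that the idref values have something to point at; the slide itself elides them.

```python
import xml.etree.ElementTree as ET

# The abbreviated and the long form parse to the same empty element.
short = ET.fromstring('<hr width="10"/>')
long_form = ET.fromstring('<hr width="10"></hr>')
same = ET.tostring(short) == ET.tostring(long_form)

# Attributes are exposed as a name -> value mapping.
width = short.attrib["width"]

# Hypothetical completion of the family document (marge and homer are invented).
family = ET.fromstring("""
<family>
  <person id="lisa">
    <name>Lisa Simpson</name>
    <mother idref="marge"/>
    <father idref="homer"/>
  </person>
  <person id="marge"><name>Marge Simpson</name></person>
  <person id="homer"><name>Homer Simpson</name></person>
</family>
""")

# Resolve an idref by looking the value up among the id attributes.
by_id = {p.get("id"): p for p in family.findall("person")}
lisa = by_id["lisa"]
mother_name = by_id[lisa.find("mother").get("idref")].findtext("name")

print(same, width, mother_name)
```

A non-validating parser treats id and idref as ordinary attributes, so the lookup table is built by hand here.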
15XML Text
- XML has only one basic type: text
- It is bounded by tags, e.g.,
- <title> The Big Sleep </title>
- <year> 1935 </year> (1935 is still text)
- XML text is called PCDATA
- (for parsed character data)
- Its characters are Unicode; any character can be written as a numeric reference, e.g., &#x05DE; for the Hebrew letter Mem
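A minimal sketch of what "only text" means in practice: the parser hands back strings, and any conversion to a number is the application's job. It also shows a numeric character reference being replaced by the character it denotes.

```python
import unicodedata
import xml.etree.ElementTree as ET

year = ET.fromstring("<year> 1935 </year>")
raw = year.text               # the parser returns a string, spaces included
as_number = int(raw.strip())  # typing the value is up to the application

# A numeric character reference is replaced by the character it names.
letter = ET.fromstring("<c>&#x05DE;</c>").text
letter_name = unicodedata.name(letter)

print(repr(raw), as_number, letter_name)
```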
16XML Structure
- Nesting tags can be used to express various
structures, e.g., a tuple (record)
    <person>
      <name>Lisa Simpson</name>
      <tel>02-828-1234</tel>
      <tel>054-470-777</tel>
      <email>lisa@cs.huji.ac.il</email>
    </person>
17XML Structure (cont'd)
- We can represent a list by using the same tag repeatedly
    <addresses>
      <person> </person>
      <person> </person>
      <person> </person>
      <person> </person>
    </addresses>
18XML Structure (cont'd)
    <addresses>
      <person>
        <name>Donald Duck</name>
        <tel>04-828-1345</tel>
        <email>donald@cs.technion.ac.il</email>
      </person>
      <person>
        <name>Miki Mouse</name>
        <tel>03-426-1142</tel>
        <email>miki@yahoo.com</email>
      </person>
    </addresses>
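Going in the other direction, records and lists held by a program can be written out as nested and repeated tags. The sketch below builds the address book from Python tuples with ElementTree. The tuple layout (name, list of phone numbers, email) is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Each tuple is one record: (name, list of phone numbers, email).
records = [
    ("Donald Duck", ["04-828-1345"], "donald@cs.technion.ac.il"),
    ("Miki Mouse", ["03-426-1142"], "miki@yahoo.com"),
]

addresses = ET.Element("addresses")
for name, tels, email in records:
    person = ET.SubElement(addresses, "person")  # one element per record
    ET.SubElement(person, "name").text = name
    for tel in tels:                             # a repeated tag models a list
        ET.SubElement(person, "tel").text = tel
    ET.SubElement(person, "email").text = email

xml_text = ET.tostring(addresses, encoding="unicode")
print(xml_text)
```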
19Terminology
- The segment of an XML document between an opening
and a corresponding closing tag is called an
element
    <person>
      <name>Bart Simpson</name>
      <tel>02 444 7777</tel>
      <tel>051 011 022</tel>
      <email>bart@tau.ac.il</email>
    </person>
20An XML Document is a Tree
[Figure: the person element from the previous slide drawn as a tree; the leaf values are Bart Simpson, 02 444 7777, 051 011 022, and bart@tau.ac.il]
Leaves are either empty or contain PCDATA
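The tree view can be sketched as a recursive walk over the parsed document. The walk function below is a made-up helper that reports each node's depth, tag, text, and whether it is a leaf.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    "<person>"
    "<name>Bart Simpson</name>"
    "<tel>02 444 7777</tel>"
    "<tel>051 011 022</tel>"
    "<email>bart@tau.ac.il</email>"
    "</person>"
)


def walk(element, depth=0):
    """Yield (depth, tag, text, is_leaf) for the element and its descendants."""
    yield depth, element.tag, (element.text or "").strip(), len(element) == 0
    for child in element:
        yield from walk(child, depth + 1)


nodes = list(walk(person))
leaves = [(tag, text) for depth, tag, text, is_leaf in nodes if is_leaf]

# Print the tree with one level of indentation per depth.
for depth, tag, text, is_leaf in nodes:
    print("  " * depth + tag, text)
```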
21Mixed Content
- An element may contain a mixture of sub-elements and PCDATA
    <airline>
      <name>British Airways</name>
      <motto>
        World's <dubious>favorite</dubious> airline
      </motto>
    </airline>
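How a parser exposes mixed content depends on its API. In ElementTree, the text before the first sub-element is stored on the parent, and the text that follows a sub-element is stored as that sub-element's tail. A minimal sketch with the motto written on one line so the whitespace is easy to follow:

```python
import xml.etree.ElementTree as ET

motto = ET.fromstring(
    "<motto>World's <dubious>favorite</dubious> airline</motto>"
)
dubious = motto.find("dubious")

before = motto.text    # PCDATA before the first sub-element
inside = dubious.text  # PCDATA inside the sub-element
after = dubious.tail   # PCDATA following the sub-element's end tag

# itertext() walks all text pieces in document order.
full_text = "".join(motto.itertext())

print(repr(before), repr(inside), repr(after))
print(full_text)
```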
22The Header Tag
- <?xml version="1.0" encoding="UTF-8" standalone="yes/no"?>
- The order is fixed: version, then encoding, then standalone
- standalone="no" means that there is an external DTD
- You can leave out the encoding attribute and the processor will use the UTF-8 default
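The encoding default can be checked with a small sketch: the same text is serialized once as UTF-8 without an encoding declaration and once as ISO-8859-1 with one. The sample name is arbitrary.

```python
import xml.etree.ElementTree as ET

# No encoding declared: the parser reads the bytes as UTF-8.
utf8_doc = '<?xml version="1.0"?><name>Zo\u00eb</name>'.encode("utf-8")
utf8_name = ET.fromstring(utf8_doc).text

# Any other encoding has to be declared so the parser can decode the bytes.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>Zo\u00eb</name>'
).encode("latin-1")
latin1_name = ET.fromstring(latin1_doc).text

print(utf8_name, latin1_name)
```

The two byte strings differ, yet both parse to the same three-character text, because the declaration tells the parser how to decode the second one.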
23Processing Instructions
    <?xml version="1.0"?>
    <?xml-stylesheet href="doc.xsl" type="text/xsl"?>
    <!DOCTYPE doc SYSTEM "doc.dtd">
    <doc>Hello, world!<!-- Comment 1 --></doc>
    <?pi-without-data?>
    <!-- Comment 2 -->
    <!-- Comment 3 -->
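To see which parts of this document a parser reports, the sketch below loads it with Python's xml.dom.minidom and lists the top-level nodes. It assumes that minidom keeps processing instructions, the document type, and comments as child nodes of the document and does not try to fetch doc.dtd.

```python
from xml.dom import Node, minidom

DOC = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet href="doc.xsl" type="text/xsl"?>'
    '<!DOCTYPE doc SYSTEM "doc.dtd">'
    '<doc>Hello, world!<!-- Comment 1 --></doc>'
    '<?pi-without-data?>'
    '<!-- Comment 2 -->'
    '<!-- Comment 3 -->'
)

KIND = {
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.DOCUMENT_TYPE_NODE: "doctype",
    Node.ELEMENT_NODE: "element",
    Node.COMMENT_NODE: "comment",
}

dom = minidom.parseString(DOC)

# The XML declaration itself is not a node; everything after it is.
top_level = [(KIND[n.nodeType], n.nodeName) for n in dom.childNodes]

# Comment 1 sits inside the root element, after its text.
inner_comment = dom.documentElement.lastChild.data

for kind, name in top_level:
    print(kind, name)
```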
24Using CDATA
    <HEAD1> Entering a Kennel Club Member
    </HEAD1>
    <DESCRIPTION>Enter the member by the name on his or her papers. Use the NAME tag. The NAME tag has two attributes. Common (all in lowercase, please!) is the dog's call name. Breed (also in all lowercase) is the dog's breed. Please see the breed reference guide for acceptable breeds. Your entry should look something like this:
    </DESCRIPTION>
    <EXAMPLE><![CDATA[<NAME common="freddy" breed="springer-spaniel">Sir Fredrick of Ledyard's End</NAME>]]>
    </EXAMPLE>
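The effect of a CDATA section can be sketched as follows: the markup inside it reaches the application as ordinary characters, and no NAME element is created. The shortened second document is only there to show the alternative of escaping by hand.

```python
import xml.etree.ElementTree as ET

example = ET.fromstring(
    '<EXAMPLE><![CDATA[<NAME common="freddy" breed="springer-spaniel">'
    "Sir Fredrick of Ledyard's End</NAME>]]></EXAMPLE>"
)

literal = example.text      # the markup survives as ordinary characters
child_count = len(example)  # no <NAME> element was created

# Without CDATA, each special character has to be escaped by hand.
escaped = ET.fromstring(
    '<EXAMPLE>&lt;NAME common="freddy"&gt;</EXAMPLE>'
).text

print(literal)
print(child_count, escaped)
```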
25Well-Formed XML Documents
- An XML document (with or without a DTD) is well-formed if:
- Tags are syntactically correct
- Every tag has an end tag
- Tags are properly nested
- There is a root tag
- A start tag does not have two occurrences of the same attribute
An XML document must be well-formed
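A non-validating parser checks exactly these conditions, so a rough well-formedness test is simply "does it parse". The sketch below runs one small, invented document per rule through ElementTree; the helper name is_well_formed is made up.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if a non-validating parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# One tiny document per rule on the slide, plus one that satisfies them all.
cases = {
    "ok": "<a><b/></a>",
    "bad tag syntax": "<a><1b/></a>",
    "missing end tag": "<a><b></a>",
    "bad nesting": "<a><b></a></b>",
    "two roots": "<a/><b/>",
    "duplicate attribute": '<a x="1" x="2"/>',
}

results = {label: is_well_formed(doc) for label, doc in cases.items()}

for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```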
26Representing relational databases
A relational database for a school, with three tables: student, course, and enroll
27XML representation
    <school>
      <student id="001">
        <name> Joe </name>
        <gpa> 3.0 </gpa>
      </student>
      <student id="002">
        <name> Mary </name>
        <gpa> 4.0 </gpa>
      </student>
      <course cno="331">
        <title> DB </title>
        <credit> 3.0 </credit>
      </course>
      <course cno="350">
        <title> Web </title>
        <credit> 3.0 </credit>
      </course>
28XML representation
      <enroll>
        <id> 001 </id> <cno> 331 </cno>
      </enroll>
      <enroll>
        <id> 001 </id> <cno> 350 </cno>
      </enroll>
      <enroll>
        <id> 002 </id> <cno> 331 </cno>
      </enroll>
    </school>
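The mapping from tables to XML can be sketched end to end with sqlite3 and ElementTree. The column names follow the XML on these slides, while the SQL types and the rule "key columns become attributes, other columns become sub-elements" are assumptions read off the example rather than a stated method.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Column names follow the XML on the slides; the SQL types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT, gpa REAL);
    CREATE TABLE course  (cno TEXT PRIMARY KEY, title TEXT, credit REAL);
    CREATE TABLE enroll  (id TEXT, cno TEXT);
    INSERT INTO student VALUES ('001', 'Joe', 3.0), ('002', 'Mary', 4.0);
    INSERT INTO course  VALUES ('331', 'DB', 3.0), ('350', 'Web', 3.0);
    INSERT INTO enroll  VALUES ('001', '331'), ('001', '350'), ('002', '331');
""")

school = ET.Element("school")

# Key columns become attributes, the other columns become sub-elements.
for sid, name, gpa in conn.execute(
    "SELECT id, name, gpa FROM student ORDER BY id"
):
    student = ET.SubElement(school, "student", id=sid)
    ET.SubElement(student, "name").text = name
    ET.SubElement(student, "gpa").text = str(gpa)

for cno, title, credit in conn.execute(
    "SELECT cno, title, credit FROM course ORDER BY cno"
):
    course = ET.SubElement(school, "course", cno=cno)
    ET.SubElement(course, "title").text = title
    ET.SubElement(course, "credit").text = str(credit)

# Each enroll row becomes an element holding the two foreign keys.
for sid, cno in conn.execute("SELECT id, cno FROM enroll ORDER BY id, cno"):
    enroll = ET.SubElement(school, "enroll")
    ET.SubElement(enroll, "id").text = sid
    ET.SubElement(enroll, "cno").text = cno

xml_text = ET.tostring(school, encoding="unicode")
print(xml_text)
```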