Introduction to XML and RSS - PowerPoint PPT Presentation

Slides: 57
Provided by: mandarrmut
Transcript and Presenter's Notes

1
Introduction to XML and RSS
  • Data Management Issues

2
Types of data
  • Structured
  • Semi-structured

3
Structured Data
  • data is organized in entities
  • similar entities are grouped together (tables)
  • entities in the same group have the same
    descriptions (attributes)

4
Current Database World
  • Structure
  • Relational Database Management System (DBMS)
  • everything is a relation
  • Query language: SQL
  • Software: MS Access, Oracle

5
Example of a table (patients)
6
Example of a group of tables
7
(No Transcript)
8
World of Web Data
  • Easy document exchange
  • Unstructured (or poorly structured) data
  • Everything is a document
  • No standard for query languages

9
World of Web Data
  • Example:
  • An organization A publishes financial data on its
    web pages (HTML), generated from a DBMS.
  • A second organization B wants to run financial
    analyses but can access only the web data.

(Diagram: A publishes HTML generated from its RDBMS; B reads the HTML)
10
Semi-structured Data
  • data can be of any type
  • not necessarily following any format
  • does not follow any rules
  • is not predictable
  • examples include
  • text
  • video
  • sound
  • images

11
Characteristics of Semi-Structured Data
  • structure is irregular: missing or additional
    attributes (labels)
  • parts of data lack structure, e.g., images
  • some may yield little structure, e.g., plain text

12
Semi-structured Data (Contd.)
  • Data that is inherently self-describing and does
    not conform to an explicit and fixed schema is
    known as Semistructured Data
  • information is contained within data itself

13
Semi-structured Data (Contd.)
  • The structure of the data is rapidly and
    dynamically changing
  • It includes data as found in several application
    areas such as Web Information Systems and Digital
    Libraries

14
Example of Semi-Structured Data
  • name: Peter Wood
  • email: ptw@dcs.bbk.ac.uk, p.wood@bbk.ac.uk
  • --------------------------------------------------
  • name:
  •   first name: Mark
  •   last name: Levene
  • email: mark@dcs.bbk.ac.uk
  • --------------------------------------------------
  • name: Alex Smith
  • affiliation: StFX
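
The irregularity in these three records can be made concrete with a minimal sketch. It assumes the records are modelled as plain Python dictionaries, which is a choice made here for illustration and not something the slides prescribe; the helper name labels is likewise hypothetical.

```python
# Three records with irregular structure, modelled as plain dictionaries.
people = [
    {"name": "Peter Wood",
     "email": ["ptw@dcs.bbk.ac.uk", "p.wood@bbk.ac.uk"]},    # two e-mail addresses
    {"name": {"first name": "Mark", "last name": "Levene"},  # name is nested
     "email": "mark@dcs.bbk.ac.uk"},
    {"name": "Alex Smith", "affiliation": "StFX"},           # no e-mail at all
]

def labels(record):
    """Return the sorted attribute labels that a record carries."""
    return sorted(record)

for person in people:
    print(labels(person))
```

Each record describes itself through its own labels, so no single table schema fits all three without empty or repeated columns.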

15
IMDb: A Motivating Example
  • The Internet Movie Database is a classic
    example of a collection of semistructured data
  • Although the information pertaining to different
    movies may be essentially similar, their
    structure may be different!
  • Let us consider an example movie database

16
An Example Movie Database
17
Irregularity In Structure
  • Example: one movie may annotate information
    about the actors, choreographer, director and
    producer, while another movie may annotate
    additional information about the lyricist and the
    music director

18
Irregularity In Structure
  • The same kind of data may be typed differently
  • For example, an actor's name may be represented as
    a string or as a tuple (first_name, last_name)
  • Since data gets added to this database
    dynamically, the structure of the database as a
    whole also keeps changing dynamically

19
Traditional Data Management
Universe of Discourse
Model of the UoD
Database
Query
20
Post-Internet Data Management
Universe of Discourse
Retrieval?
Data
Query
21
XML: An Embodiment of Semistructured Data
  • XML can be used to represent semistructured data

22
What is XML?
  • XML stands for EXtensible Markup Language
  • XML is a markup language much like HTML
  • XML was designed to describe data
  • XML tags are not predefined. You must define your
    own tags

23
The main difference between XML and HTML
  • XML and HTML were designed with different goals
  • XML was designed to describe data and to focus on
    what data is.
  • HTML was designed to display data and to focus on
    how data looks.
  • It is important to understand that XML is not a
    replacement for HTML.

24
XML does not DO anything
  • It may be a little hard to understand, but XML
    does not DO anything. XML was created to
    structure, store and send information.
  • The note has a header and a message body. It also
    has sender and receiver information. But still,
    this XML document does not DO anything. It is
    just pure information wrapped in XML tags.
    Someone must write a piece of software to send,
    receive or display it.

John Mary
Reminder Don't forget
me this weekend!
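
The transcript kept only the text of the note, not its markup. The sketch below rebuilds a plausible note and shows the "piece of software" the slide mentions. The element name note is used later in the slides; the tag names to, from, heading and body, and the choice of John as receiver and Mary as sender, are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed markup: only the text content survives in the transcript.
NOTE = """<note>
  <to>John</to>
  <from>Mary</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

note = ET.fromstring(NOTE)
# The document itself does nothing; this program decides what to do with it.
message = {child.tag: child.text for child in note}
print(f"{message['from']} -> {message['to']}: {message['body']}")
```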
25
XML is free and extensible
  • XML tags are not predefined. You must "invent"
    your own tags.
  • The tags used to mark up HTML documents and the
    structure of HTML documents are predefined (like
    <p>, <h1>, etc.).
  • XML allows authors to define their own tags and
    their own document structure.
  • The tags in the example above are not defined in
    any XML standard. These tags are "invented" by
    the author of the XML document.

26
XML is used to Exchange Data
  • With XML, data can be exchanged between
    incompatible systems.
  • In the real world, computer systems and databases
    contain data in incompatible formats. One of the
    most time-consuming challenges for developers has
    been to exchange data between such systems over
    the Internet.
  • Since XML data is stored in plain text format,
    XML provides a software- and hardware-independent
    way of sharing data.

27
XML can be used to Create new Languages
  • XML is the basis of WML (Wireless Markup
    Language), the markup language of WAP (Wireless
    Application Protocol).
  • WML is used to mark up Internet applications for
    handheld devices like mobile phones.
  • MathML, for describing mathematical formulas, and
    CML (Chemical Markup Language) are also written
    in XML.

28
XML Syntax
  • The syntax rules of XML are very simple and very
    strict. The rules are very easy to learn, and
    very easy to use.
  • Because of this, creating software that can read
    and manipulate XML is very easy to do.

29
All XML elements must have a closing tag
  • With XML, it is illegal to omit the closing tag.
  • In HTML some elements do not have to have a
    closing tag. The following code is legal in HTML:
  • <p>This is a paragraph
  • In XML all elements must have a closing tag,
    like this:
  • <p>This is a paragraph</p>
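
One way to see the rule in action is to hand both versions to an XML parser. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<p>This is a paragraph"))      # closing tag omitted
print(is_well_formed("<p>This is a paragraph</p>"))  # closing tag present
```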

30
XML tags are case sensitive
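  • Unlike HTML, XML tags are case sensitive.
  • With XML, a tag name that starts with a capital
    letter is different from the same name written
    in lower case.
  • Opening and closing tags must therefore be
    written with the same case.
  • Incorrect: opening and closing tags whose case
    differs.
  • Correct: opening and closing tags with identical
    case.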
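
The tag names in the original example were lost in the transcript, so the sketch below uses the hypothetical names Message and message to illustrate case sensitivity with Python's standard XML parser.

```python
import xml.etree.ElementTree as ET

# <Message> and <message> are hypothetical tag names used only for illustration.
try:
    ET.fromstring("<Message>This is incorrect</message>")
    mismatch_accepted = True
except ET.ParseError:
    mismatch_accepted = False

doc = ET.fromstring("<note><message>This is correct</message></note>")
lower = doc.find("message")   # matches the element
upper = doc.find("Message")   # a different name, so nothing is found
print(mismatch_accepted, lower.text, upper)
```

Case matters both when the parser matches a closing tag to its opening tag and when a program later looks elements up by name.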

31
All XML elements must be properly nested
  • Improper nesting of tags makes no sense to XML.
  • In HTML some elements can be improperly nested
    within each other like this:
  • <b><i>This text is bold and italic</b></i>
  • In XML all elements must be properly nested
    within each other like this:
  • <b><i>This text is bold and italic</i></b>

32
All XML documents must have a root element (tag)
  • All XML documents must contain a single tag pair
    to define a root element.
  • All other elements must be within this root
    element.
  • All elements can have sub elements (child
    elements). Sub elements must be correctly nested
    within their parent element
  • <root> <child> <subchild>.....</subchild> </child> </root>
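
A short sketch of the single-root rule, again with Python's standard parser. The names root, child and subchild are placeholders taken from the wording of the bullets, and parses is a helper invented for this example.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the text is accepted as one XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

one_root = "<root><child><subchild>text</subchild></child></root>"
two_roots = "<child>first</child><child>second</child>"

print(parses(one_root))
print(parses(two_roots))   # a second top-level element is rejected

# Walking the tree visits the root first, then its descendants in document order.
root = ET.fromstring(one_root)
print([element.tag for element in root.iter()])
```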

33
With XML, white space is preserved
  • With XML, the white space in your document is not
    truncated.
  • This is unlike HTML. With HTML, a sentence like
    this:
  • Hello              my name is John,
  • will be displayed like this:
  • Hello my name is John,
  • because HTML collapses runs of white space into
    a single space.
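
The difference can be observed directly. In this sketch the element name greeting is hypothetical, and the "collapsed" string only approximates what an HTML renderer would display.

```python
import xml.etree.ElementTree as ET

# <greeting> is a hypothetical element name.
element = ET.fromstring("<greeting>Hello              my name is John,</greeting>")
preserved = element.text                    # what the XML parser hands back
collapsed = " ".join(element.text.split())  # roughly what an HTML page would show
print(repr(preserved))
print(repr(collapsed))
```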

34
Element Naming
  • XML elements must follow these naming rules
  • Names can contain letters, numbers, and other
    characters
  • Names must not start with a number or punctuation
    character
  • Names must not start with the letters xml (or XML
    or Xml ..)
  • Names cannot contain spaces

35
Element Naming
  • Any name can be used, no words are reserved, but
    the idea is to make names descriptive
  • XML documents often have a corresponding
    database, in which fields exist corresponding to
    elements in the XML document. A good practice is
    to use the naming rules of your database for the
    elements in the XML documents.

36
Comments in XML
  • The syntax for writing comments in XML is the
    same as in HTML: <!-- This is a comment -->

37
XML Attributes
  • XML elements can have attributes in the start
    tag, just like HTML.
  • In HTML (and in XML) attributes provide
    additional information about elements.

38
XML Attributes
  • Attribute values must always be enclosed in
    quotes (single or double)

39
XML Attributes Cont.
  • John
  • Mary
  • --------------------------------------------------
  • John
  • Mary
  • The error in the first document is that the date
    attribute in the note element is not quoted.
  • The first line in each document is the XML
    declaration.
40
Use of Elements vs. Attributes
  • Data can be stored in child elements or in
    attributes.
  • Take a look at these examples
  • Anna
  • Smith
  • --------------------------------------------------
  • female
  • Anna
  • Smith
  • In the first example sex is an attribute. In the
    last, sex is a child element. Both examples
    provide the same information.
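
Both styles carry the same data but are read differently by a program. In this sketch only sex is named in the slide text; person, firstname and lastname are assumed tag names.

```python
import xml.etree.ElementTree as ET

# person, firstname and lastname are assumed names; "sex" comes from the slide.
as_attribute = ET.fromstring(
    '<person sex="female">'
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)
as_element = ET.fromstring(
    "<person><sex>female</sex>"
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)

print(as_attribute.get("sex"))     # read from the start tag
print(as_element.findtext("sex"))  # read from a child element
```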

41
Errors in XML will stop the XML program
  • The World Wide Web Consortium (W3C) XML
    specification states that a program should not
    continue to process an XML document if it finds a
    well-formedness error. The reason is that XML
    software should be easy to write, and that all
    XML documents should be compatible.
  • With HTML it was possible to create documents
    with lots of errors (like when you forget an end
    tag). One of the main reasons that HTML browsers
    are so big and incompatible is that they have
    their own ways to figure out what a document
    should look like when they encounter an HTML
    error.
  • With XML this should not be possible.

42
XML and Browsers
  • Netscape 6 or higher supports XML
  • Internet Explorer 5.0 or higher supports XML

43
Viewing XML Files
  • If you open an XML document in IE, it will
    display the document with color-coded root and
    child elements. A plus (+) or minus sign (-) to
    the left of the elements can be clicked to expand
    or collapse the element structure.
  • If you want to view the raw XML source, you must
    select "View Source" from the browser menu.
  • If an erroneous XML file is opened, the browser
    will report the error.

44
Other Examples
  • Viewing some XML documents will help you get the
    XML feeling.
  • An XML CD catalog: a CD collection, stored as
    XML data.
  • An XML plant catalog: a plant catalog from a
    plant shop, stored as XML data.
  • A simple food menu: a breakfast food menu from a
    restaurant, stored as XML data.

45
Why does XML display like this?
  • XML documents do not carry information about how
    to display the data.
  • Since XML tags are "invented" by the author of
    the XML document, browsers do not know if a tag
    like <table> describes an HTML table or a dining
    table.
  • Without any information about how to display the
    data, most browsers will just display the XML
    document as it is.

46
The XML Rules (Summary)
  • Single, unique root element
  • Matching open/close tags
  • Consistent capitalisation
  • Correctly nested elements (no overlapping
    elements)
  • Attribute values enclosed in quotes
  • No repeating attributes in an element
  • 3Months.com
  • Web Development
  • Wakefield st
  • Wellington
  • New Zealand
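
The summary rules can be exercised together. This is a sketch, not a validator: it feeds one deliberately broken sample per rule to Python's standard parser. All sample documents and the helper first_error are invented for illustration.

```python
import xml.etree.ElementTree as ET

def first_error(text):
    """Return None for well-formed text, else the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as error:
        return str(error)

samples = {
    "two roots": "<a/><b/>",
    "missing close tag": "<a><b></a>",
    "inconsistent case": "<a></A>",
    "overlapping elements": "<b><i>text</b></i>",
    "unquoted attribute": "<a id=1/>",
    "repeated attribute": '<a id="1" id="2"/>',
    "well formed": '<a id="1"><b>text</b></a>',
}
for label, text in samples.items():
    print(f"{label}: {first_error(text) or 'ok'}")
```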

47
Authoring XML Documents
  • A basic XML document is an XML element that can,
    but might not, include nested XML elements.
  • Example
  • Second Chance
  • Matthew Dunn

48
Converting Relational Database to XML
  • Example: export the following data into XML and
    group books by store
  • Relational Database
  • Store (sid, name, phone)
  • Book (bid, title, authors)
  • StoreBook (sid, bid, price, stock)

49
Converting Relational Database to XML (Contd)
  • XML
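
The XML output for this slide is missing from the transcript. The following sketch shows one possible export that groups books by store. The table and column names come from the previous slide; the sample rows, the element names (stores, store, book) and the function export_by_store are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows; only the table and column names come from the slides.
stores = [(1, "Store A", "555-0100"), (2, "Store B", "555-0101")]
books = [(10, "Title A", "Author A"), (11, "Title B", "Author B")]
store_books = [(1, 10, 12.50, 3), (1, 11, 30.00, 1), (2, 10, 11.75, 5)]

def export_by_store(stores, books, store_books):
    """Build an XML tree with each store's books nested under the store."""
    book_by_id = {bid: (title, authors) for bid, title, authors in books}
    root = ET.Element("stores")
    for sid, name, phone in stores:
        store = ET.SubElement(root, "store", sid=str(sid))
        ET.SubElement(store, "name").text = name
        ET.SubElement(store, "phone").text = phone
        # Join StoreBook to Book and nest each match under its store.
        for sb_sid, bid, price, stock in store_books:
            if sb_sid != sid:
                continue
            title, authors = book_by_id[bid]
            book = ET.SubElement(store, "book", bid=str(bid))
            ET.SubElement(book, "title").text = title
            ET.SubElement(book, "authors").text = authors
            ET.SubElement(book, "price").text = f"{price:.2f}"
            ET.SubElement(book, "stock").text = str(stock)
    return root

catalog = export_by_store(stores, books, store_books)
print(ET.tostring(catalog, encoding="unicode"))
```

Nesting repeats a book's title and authors under every store that stocks it, which trades the normalization of the relational design for a document that is easy to read store by store.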

50
Examples
  • example of database
  • Example of database converted to XML

51
XML representation of a sample Movie Database
  • <?xml ... standalone="yes"?>
  • The Notebook
  • Ryan Gosling
  • Rachel McAdams
  • Nick Cassavetes
  • FRIENDS
  • Seinfeld

52
RSS (Really Simple Syndication)
  • RSS is a family of web feed formats used to
    publish frequently updated digital content, such
    as blogs, news feeds or podcasts.
  • Users of RSS content use programs called feed
    "readers" or "aggregators". The user "subscribes"
    to a feed by supplying their reader with a link
    to the feed. The reader can then check the user's
    subscribed feeds to see if any of those feeds
    have new content since the last time it checked
    and, if so, retrieve that content and present it
    to the user.
  • RSS formats are specified in XML (a generic
    specification for data formats). RSS delivers its
    information as an XML file called an "RSS feed,"
    "webfeed," "RSS stream," or "RSS channel".
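
What a feed reader does on each check can be sketched in a few lines. The feed below is hand-written in RSS 2.0 layout (rss, channel, item, title, link, guid); its titles and example.com URLs are placeholders, and the function new_items is an illustration, not the algorithm of any particular reader. A real reader would download the feed from its URL first.

```python
import xml.etree.ElementTree as ET

# A hand-written RSS 2.0 feed; titles and example.com URLs are placeholders.
FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://example.com/</link>
    <description>Placeholder feed</description>
    <item>
      <title>First post</title>
      <link>https://example.com/1</link>
      <guid>https://example.com/1</guid>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/2</link>
      <guid>https://example.com/2</guid>
    </item>
  </channel>
</rss>"""

def new_items(feed_xml, seen_guids):
    """Return (title, link) pairs for items whose guid has not been seen."""
    channel = ET.fromstring(feed_xml).find("channel")
    fresh = []
    for item in channel.findall("item"):
        guid = item.findtext("guid")
        if guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh

seen = {"https://example.com/1"}   # the reader already showed the first post
print(new_items(FEED, seen))       # only the second post is new
print(new_items(FEED, seen))       # nothing new on the next check
```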

53
RSS Feed representation
  • On Web pages, web feeds (RSS) are typically
    linked with the word "Subscribe", an orange
    square, or a rectangle with the letters XML or
    RSS.
  • Many news aggregators such as msnbc.com publish
    subscription buttons for use on Web pages to
    simplify the process of adding news feeds.

54
Podcasting
  • A podcast is a media file that is distributed
    over the Internet using syndication feeds, for
    playback on portable media players and personal
    computers.
  • The term "podcast" is derived from Apple's
    portable music player, the iPod.
  • Though podcasters' web sites may also offer
    direct download or streaming of their content, a
    podcast is distinguished from other digital audio
    formats by its ability to be downloaded
    automatically, using software capable of reading
    feed formats such as RSS.

55
Podcasting
  • Podcasting is an automatic mechanism whereby
    multimedia computer files are transferred from a
    server to a client, which pulls down XML files
    containing the Internet addresses of the media
    files. In general, these files contain audio or
    video, but also could be images, text, PDF, or
    any file type.
  • Example: StFX Podcasts
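
In RSS 2.0 the address of a media file is carried in an item's enclosure element, through its url, length and type attributes. The sketch below pulls those addresses out of a hand-written feed; the feed content, the example.com URL and the function media_urls are placeholders, and a real podcast client would go on to download each file.

```python
import xml.etree.ElementTree as ET

# A hand-written podcast feed; the episode data is made up.
PODCAST = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" length="123456" type="audio/mpeg"/>
    </item>
    <item>
      <title>Show notes only</title>
    </item>
  </channel>
</rss>"""

def media_urls(feed_xml):
    """Collect the url attribute of every enclosure element in the feed."""
    root = ET.fromstring(feed_xml)
    return [enclosure.get("url") for enclosure in root.iter("enclosure")]

print(media_urls(PODCAST))
```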

56
XML Joke
  • Question: When should I use XML?
  • Answer: When you need a buzzword in your resume.