Title: Overview
1Overview
- Some hyperlinking history
- XML-Linking goals and status
- XML Base
- The XPointer language
- The XLink language
- Controlling traversal behavior
2Some Hyperlinking History
3History repeats itself
- In word processing
- Early systems knew text structure
- WYSIWYG took over but skipped structure
- As it scaled, people started to fake structure
- Software gradually built structure all over again
- In hypermedia
- Many early systems had linking structure
- HTML <A> took over but skipped structure
- As projects scale, people are faking link structure
- So
4Hypermedia in the 60s (!)
- For example, FRESS (Brown, 1969)
- 800,000 lines of 370 assembler -- still runs
- Grandfather of most word processors
- Teaching, databases, typesetting
- Some features still not on Web
- Links were structured w/ types, keywords
- Links were truly bidirectional
- Link overviews
- Supported document structures / ranges
- Out-of-line links later (1983, Intermedia)
5HTML and hypermedia
- WWW is powerful because of Internet, URIs, and documents; linking is weak
- HTML linking strengths
- Easy syntax: <a href="http://foo.net/x.html">
- Trivial to implement
- Can go to few specific targets (ID only)
- <a href="http://foo.net/x.html#sec2">
- <a name="sec2">
- Anchors move along with editing
- (unlike, say, byte offsets)
6Anatomy of an HTML link
<a href="http://z.uk/x.htm#ch2">Info.</a>
Local link-end
Locator
Link
<!DOCTYPE <html><a name="ch2"></h1>...</html>
Remote link-end
7HTML Linking Limitations
81 Limited elements
- People can't create their own subtypes
- What if you need several sub-types of link?
- REL/REV attributes non-standardized
- CLASS attributes don't make it
- Semantics/processing second-class
- Maybe no HTML element to start from
- It's like C++ without subclasses
- CSS cannot make elements into links
92 Link behavior tied to type
- A means (usually)
- Replaces document in same window
- Activated by user
- IMG means (usually)
- Embeds target inline
- Activated upon loading
- Graphics only
- Why can't I have other combinations?
- CSS doesn't have what you'd need
- Browsers can't embed HTML/XML yet
103 No real type system
- Few have tackled the link type problem
- Trigg, Bieber, DeRose, Connolly
- REL/REV provide a hook, at least
- No standard/portable set of types
- Need an orderly but extensible system
- Most proposed type-sets are grab-bags
- Mixing rhetorical, topological, and syntactic
types
114 A privileged end
- The link is physically at one of its ends
- Infeasible to allow annotation
- Thus no links from read-only documents
- Nearly all documents, for any one person
- So no annotations that readers of the annotated document can see
- Doesn't allow link databases, so
- Can't buy or sell link-sets
- Can't filter or select links on demand
- Can't share many users' links
125 Few possible destinations
- Can point to a whole document
- Can point to HTML <a name> if it exists
- <a href="http://xyz.com/foo.html#id37"> points to <a name="id37">
- Author controls what others can reference
- You can only quote paras 1, 12, or 20.
- Can't do some links at all
- Impossible to supply handles for links that partly overlap
- Impractical to supply handles even at word level
136 No aggregate destinations
- No multi-ended links in HTML
- No multiple locations for a single end
- No dynamic document assembly
- Some uses for multi-ended links
- Back-of-book indexes
- Pull up multiple commentaries at once
- Editing after multiple document reviews
- Parallel views of editions and translations
147 One way A or IMG to somewhere
- Why can't I ask who links to me?
- (without doing a global search)
- (even for a limited domain)
- Why can't I make a two-way link?
- Why can't I make multi-ended links?
- Go back ≠ 2-way link
- Only works after you've gone the one way
- You can't start at the other end
15Overview
- Some hyperlinking history
- XML-Linking introduction
- Goals
- History
- Pointing vs. Linking
- XML Base
- The XPointer language
- The XLink language
- Controlling traversal behavior
16XML-Linking goals end user
- Links from un-writable documents
- Which is most of the Web, for any person
- Perhaps the most important single feature
- -> Bidirectional and multi-ended links
- -> Annotations and annotation sharing
- Dynamic updates, patches, highlighting
- Precise link attachment in any media
- Large sets/databases of managed links
- An entirely new market for links per se
- Anyone can publish/sell their commentary
17The XML-Linking effort
- Started 1/1996 in the XML WG
- Separate Working Group 1998
- Specs
- XPath complete 11/1999 (joint w/ XSL WG)
- XLink complete 6/2001
- XML Base complete 6/2001
- XPointer being refactored
- first 3 documents out soon
18Pointing vs. linking
- In HTML, many things are combined
- <a href="eg.org/foo">wow</a>
- Technically
- "eg.org/foo" is a pointer (namely a URI)
- The abstract connection itself is the link
- The <a> element is a link representation
- "wow" is the local anchor
- Anchors are also called link-ends
- Data at eg.org is the remote anchor
- HTML specifies the link behavior
19Why pick all this apart?
- Engineering/theory
- Design each one better
- Ease implementation
- More easily combine them in new ways
- Practically
- Multiple locations can be linked at once
- Link-ends can be organized and labeled
- Meta-information can be added
- Links can live outside of linked docs
20XML-Linking specifications
- XPath: expressions on infoset nodes
- REC: http://www.w3.org/TR/xpath
- XPointer: XPath + ranges, in URIs
- CR: http://www.w3.org/TR/WD-xptr
- XPointer Schemes: failover for fragment IDs
- XPointer NS scheme (declare namespaces)
- XPointer Element scheme
- XLink: gather locations to make links
- REC: http://www.w3.org/TR/xlink/
- (XML Base)
21XPointer locators
<xml> <xref target="http://z.com/foo.xml#id('p37')">See Section 1.</xref>
A way of locating data in XML structure, used to attach link end(s) to data
A pointer identifies or locates some part of a
document -- this is only the yellow part above
22XLink connections
- Describes a relationship of referenced location(s),
- To each other
- To descriptions
- XLink provides some key ones
A link connects data and meta-data portions, including their relationship -- really just the lines
A link may be expressed at a unique source end, or out in a link database
[Diagram: several "Someplace" link ends joined by lines, each line labeled "role"]
23Overview
- Some hyperlinking history
- XML-Linking introduction
- XML Base
- The XML Pointer Language (XPointer)
- Fragment identifiers
- XPointer schemes
- XPointer syntax/semantics
- The XML Link language (XLink)
- Controlling traversal behavior
24XPointer
- Locates parts of XML resources
- Even things without IDs
- Even things that aren't whole nodes
- XPointer adds (beyond XPath)
- Way to refer to point and range selections
- Way to use inside URI fragment identifiers
- Typically, a browser might load a document and
scroll to/highlight the part
25Anatomy of a URI reference
URI reference: http://example.com/foo.htm#bing
- URI: http://example.com/foo.htm
- scheme: http
- domain: example.com
- path: /foo.htm
- fragment identifier: bing (XPointer defines this part)
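The same anatomy can be checked mechanically. A minimal sketch using Python's standard `urllib.parse`, assuming the slide's example reads `http://example.com/foo.htm#bing` (the `#` is taken to have been lost in extraction):

```python
from urllib.parse import urlsplit, urldefrag

# The slide's example URI reference (assumed spelling, see note above).
ref = "http://example.com/foo.htm#bing"

parts = urlsplit(ref)            # scheme, domain (netloc), path, fragment
uri, fragment = urldefrag(ref)   # the URI proper, and the fragment identifier

print("scheme:  ", parts.scheme)
print("domain:  ", parts.netloc)
print("path:    ", parts.path)
print("URI:     ", uri)
print("fragment:", fragment)     # the part XPointer defines for XML
```

Only the fragment is interpreted by the client after retrieval; everything before the `#` is what gets sent to the server.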
26Fragment identifiers
- Part of URIs after "#"
- Says where in document is actual target
- Separate form for each media type
- Identifiers for graphics ≠ for text
- IETF MIME definition specifies form
- HTML
- To scroll to <a name="coyote">
- http://example.com/hello.html#coyote
27The XPointer scheme system
- A way to have several pointing mechanisms in one fragment ID
- Each scheme in a fragment is tried in sequence
- scheme1(args) scheme2(args)
- Unimplemented schemes, and pointers that fail to locate, are simply ignored
- Recent changes
- Things are more split by scheme. Full XPointer delayed (again)
- Last call drafts imminent
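The try-in-sequence rule can be sketched in a few lines. This is an illustration only, not an implementation of any draft: `split_scheme_parts`, `resolve`, the toy `xpointer_handler`, and the `known_ids` table are all hypothetical names, and escaping of extra parentheses is deliberately left out.

```python
def split_scheme_parts(fragment):
    """Split 'scheme1(args) scheme2(args)' into [(scheme, args), ...].

    Parentheses inside args must be balanced; escaping is not handled.
    """
    parts = []
    i, n = 0, len(fragment)
    while i < n:
        if fragment[i].isspace():
            i += 1
            continue
        open_paren = fragment.find("(", i)
        if open_paren == -1:
            raise ValueError("expected scheme(args) at position %d" % i)
        name = fragment[i:open_paren]
        depth, j = 1, open_paren + 1
        while j < n and depth:
            if fragment[j] == "(":
                depth += 1
            elif fragment[j] == ")":
                depth -= 1
            j += 1
        if depth:
            raise ValueError("unbalanced parentheses in %r" % name)
        parts.append((name, fragment[open_paren + 1:j - 1]))
        i = j
    return parts


def resolve(fragment, handlers):
    """Try each scheme part left to right; return the first one that locates."""
    for name, args in split_scheme_parts(fragment):
        handler = handlers.get(name)
        if handler is None:
            continue              # unimplemented scheme: ignored
        result = handler(args)
        if result is not None:
            return name, result
        # pointer failed to locate: ignored, move on to the next part
    return None


# Toy data: a client that understands only a tiny subset of 'xpointer'.
known_ids = {"chap1": "<chapter 1 element>"}

def xpointer_handler(args):
    if args.startswith("id(") and args.endswith(")"):
        return known_ids.get(args[3:-1])
    return None

handlers = {"xpointer": xpointer_handler}   # no 'gif' handler installed
print(resolve("gif(0,0 1,1) xpointer(id(chap1))", handlers))
```

Because unknown schemes are skipped rather than treated as errors, a fragment can carry pointers for several media types at once.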
28The 3 XPointer/XPath schemes
- Bare names
- An XML "name" finds element with that ID
- Only exception to the scheme system
- element() scheme
- Counts stepwise down through elements: /1/4/27/2
- May start with an ID: element(intro/4/3/2)
- xmlns() binds namespaces for schemes
- xpointer scheme (awaits disposition of comments)
- Full XPointers, XPath and more
- For now, the only "scheme" is "xpointer"
Name: letters, digits, hyphen, underscore, period.
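The stepwise counting of the element() scheme can be sketched with the standard library's ElementTree. The function name `resolve_element_scheme` is hypothetical, and IDs are assumed to live in an attribute literally named `id` (ElementTree has no DTD-based ID typing), so this is an approximation of the idea rather than a conforming processor.

```python
import xml.etree.ElementTree as ET

def resolve_element_scheme(root, pointer, id_attr="id"):
    """Resolve the argument of element(...), e.g. 'intro/3' or '/1/2/1'."""
    steps = pointer.split("/")
    first = steps[0]
    if first == "":
        # Leading slash: the first number selects the document element (always 1).
        if len(steps) < 2 or steps[1] != "1":
            return None
        current = root
        rest = steps[2:]
    else:
        # Starts with a name: find the element carrying that ID.
        current = next((e for e in root.iter() if e.get(id_attr) == first), None)
        rest = steps[1:]
    for step in rest:
        if current is None:
            return None
        children = list(current)      # child elements only, in document order
        index = int(step)
        current = children[index - 1] if 1 <= index <= len(children) else None
    return current


doc = ET.fromstring(
    "<book>"
    "<chapter id='intro'><title>Intro</title><p>one</p><p>two</p></chapter>"
    "<chapter><title>Next</title></chapter>"
    "</book>"
)
print(resolve_element_scheme(doc, "intro/3").text)   # third child of the 'intro' element
print(resolve_element_scheme(doc, "/1/2/1").text)    # root, then 2nd child, then 1st child
```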
29All work on XML tree
- Address Infoset, not markup
- Count nodes of different types
- Root, elements, attributes, text chunks
- Comments, PIs, namespace dcls
- Can also work down in character data
- Note: most of XPointer is XPath
- Adds ranges
- Adds way to put into fragment identifier
30XPointer's 2 parts
- Provide 'scheme' mechanism
- Identify media-specific pointer types
- Allow multiple ones to co-exist
- Pointing methods for XML
- Point to ranges, sets, id's, coords
- Point descriptively
31XPointer schemes
- Media types other than XML may need their own pointer types
- pngRect(0,10 100,200)
- vrml(camera1,2,3 light4,50,500)
- map(W0°10/ N51°30)
- Schemes label fragment identifier types
- scheme1(args) scheme2(args)
- Escape any extra ( ) -- tlg('(apax')
- element(), xmlns() is the first scheme
32Multiple schemes in a URL?
- When a server responds to a URI, it
- Checks what media the client can handle
- Picks one of those to send
- content negotiation
- If a visually-impaired user clicks
- <a href="http://www.example.com/foo.gif#gif(0,0 1,1) xpointer(id(chap1))">
- The server may fall back to an XML file
- The client tries fragment identifiers left-to-right, and uses the first one that works
33Full XPointers
- The content of the 'xpointer' scheme
- Provides way to walk around trees
- Provides way to select nodes/ranges
34XPath/XPointer expressions
- XPointers produce a location set
- Location = node | point | range (XPath only produces nodes)
- Locating is stepwise
- A step generally looks along some axis
- Candidates are then filtered by predicates
- Special functions locate strings, ids, etc.
- Each step operates on a context, and
- Steps are separated by a slash
- id(foo) / child::SEC[3] / child::LIST[4]
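A small sketch of stepwise locating, reading the slide's example as `id(foo) / child::SEC[3] / child::LIST[4]` (the `::` and `[ ]` are assumed to have been lost in extraction). The helper `child_step` and the sample document are hypothetical; each step takes a context set, looks along the child axis, and filters by a positional predicate.

```python
import xml.etree.ElementTree as ET

def child_step(context_set, name, position):
    """Apply child::NAME[position] to every node in the context set."""
    result = []
    for node in context_set:
        candidates = [c for c in node if c.tag == name]   # look along the child axis
        if 1 <= position <= len(candidates):              # filter by the predicate
            result.append(candidates[position - 1])
    return result


doc = ET.fromstring(
    "<DOC><CHAP id='foo'>"
    "<SEC/><SEC/>"
    "<SEC><LIST/><LIST/><LIST/><LIST>target</LIST></SEC>"
    "</CHAP></DOC>"
)

start = [e for e in doc.iter() if e.get("id") == "foo"]   # id(foo)
secs = child_step(start, "SEC", 3)                        # child::SEC[3]
lists = child_step(secs, "LIST", 4)                       # child::LIST[4]
print([e.text for e in lists])
```

The output of one step is the context for the next, which is why a failed step anywhere yields an empty result.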
35Points and Ranges
Hello, world.
- Point
- What you get by click-selection
- Gap before/after node or char
- Range
- What you get by drag-selection
- From a start point to an end point
- Not generally a WF XML subtree
- May partially contain some elements
- <p>Hello, world.</p><p>Hi, back</p>
- Crucial for creating hypertext links
- How often do you click/drag exactly one entire
element?
Hello, world.
36Absolute linking functions
- origin()
- Locates where traversal came from
- So if paragraphs P1 and P2 are linked, and the user clicks on P1 to follow a link, P1 is origin
- Can make abstract links, "next chapter"...
- here()
- Locates the link representation itself
- With one-way links like HTML <a>, here = origin
- Can make a web-ring of one-ended links
37here() and origin()
href="xpointer(here()/following-sibling::chapter)"
origin()/ancestor2
Assuming you clicked on the title
Assuming you clicked on the xref
href"..."
38Summary axes and functions
- Absolute
- root( ), id( )
- Relative
- parent, self, child
- ancestor, ancestor-or-self
- descendant, descendant-or-self
- preceding-, following-sibling
- preceding, following
- attribute, namespace
- Absolute
- here( ), origin( )
39More about ranges
- Nodes are great handles for linking, but most selections are not whole nodes
- Typical link creation
- Select something, do "make link"
- A click indicates a point, not a node
- A drag indicates a range, not a node
- A range is not a set of nodes (!)
- It is one object, not a huge number
- Characters are not nodes in DOM
- Range does not have properties of node
Defined in the DOM 2 spec
40string-range
- string-range(n,"string", offset,length)
- Locates selections (think cursor)
- Finds nth occurrence of string provided
- Offset counts positions before characters
- 1 is before first character; -1 is before last
- Length specifies length of result
- In "the cards are dealt":
- string-range(1,"cards",1,1) locates "c"
- (1,"cards",1,0) locates the point before "c"
- (1,"cards") locates the point before "c"
- (1,"cards",3,2) locates "rd"
- (1,"cards",-1,5) locates "s are"
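The offset and length conventions on this slide can be reproduced on plain text. The sketch below follows the slide's argument order (n, string, offset, length), which may differ from the published drafts; `string_range` and `show` are hypothetical helpers, and a result is modelled as a pair of character positions rather than a real XPointer range.

```python
def string_range(text, n, needle, offset=1, length=0):
    """Return (start, end) character positions for the nth occurrence of needle.

    offset 1 is the position before the first matched character,
    offset -1 the position before the last; length 0 yields a point.
    """
    if n < 1 or offset == 0:
        raise ValueError("n must be >= 1 and offset must be non-zero")
    pos = -1
    for _ in range(n):
        pos = text.find(needle, pos + 1)
        if pos == -1:
            return None               # fewer than n occurrences
    if offset > 0:
        start = pos + offset - 1
    else:
        start = pos + len(needle) + offset
    return start, start + length


text = "the cards are dealt"

def show(*args):
    start, end = string_range(text, *args)
    return text[start:end]

print(repr(show(1, "cards", 1, 1)))
print(repr(show(1, "cards", 3, 2)))
print(repr(show(1, "cards", -1, 5)))
print(string_range(text, 1, "cards"))   # a point: start == end
```

Note that the last call runs past the end of the matched string, which is what lets a range cover text the search string only anchors.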
41range-to
- range-to(xptr)
- From start of context to end of xptr
- The xptr would typically be an ID or relative
- id('sec2')/range-to(id('sec4'))
- id('fn37')/range-to(following-sibling3)
- A typical form
- xpointer( id('sec2.1')/4
range-to(id('sec2.1')/2))
42Some range applications
- A typical hypertext interface
- User selects some text
- (a selection is hardly ever a node)
- "make link", "make annotation"
- Application generates an XPointer
- Application saves links in a link-db
- Great for referring to "milestones" or other
non-hierarchical units
43XPointer new datatypes
- Extending nodes
- Points
- Ranges
- Locations
- The supertype of the extensions.
- Location sets
- set of locations
44Reducing pointer/link fragility
- Consider what will break an XPointer
- If you just used offsets (easy but weak): the slightest edit
- If you describe paths through the tree: only changes on the path
- If you use paths with element type names: only ancestors or preceding siblings of same type
- If you use IDs: only a specific change to that ID
- When you can describe the true intent, they can re-attach when possible
- Use IDs, or relative from an ID
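A toy comparison of three pointer styles against one edit. The documents and helper names here are invented for illustration: a `<note>` is inserted ahead of the target paragraph, which is a change on the untyped path but not a change to preceding siblings of the same type, and not a change to the ID.

```python
import xml.etree.ElementTree as ET

before = ET.fromstring("<doc><p>intro</p><p id='key'>the claim</p></doc>")
after = ET.fromstring(
    "<doc><note>new</note><p>intro</p><p id='key'>the claim</p></doc>"
)

def by_child_number(root, n):
    """Untyped path: the nth child element of the root."""
    children = list(root)
    return children[n - 1] if 1 <= n <= len(children) else None

def by_typed_path(root, tag, n):
    """Typed path: the nth child element with the given element type name."""
    same_type = root.findall(tag)
    return same_type[n - 1] if 1 <= n <= len(same_type) else None

def by_id(root, value):
    """ID pointer (assumes IDs are stored in an attribute named 'id')."""
    return next((e for e in root.iter() if e.get("id") == value), None)

for label, root in (("before", before), ("after", after)):
    print(label,
          "| child 2:", by_child_number(root, 2).text,
          "| p[2]:", by_typed_path(root, "p", 2).text,
          "| id:", by_id(root, "key").text)
```

After the edit only the untyped child count lands on the wrong paragraph; the typed path and the ID still reach the intended one.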
45XPointer robustness
- Absolute robustness is not possible
- Servers go down
- Pages are deleted or moved
- Author can mangle documents any time
- Author can even change all IDs any time
- But, how you construct pointers counts
- IDs are good until explicitly changed
- Software can manage IDs to be persistent
- These are not true of counters, offsets, etc.
46Overview
- Some hyperlinking history
- XML-Linking goals and status
- XPath and XPointer in general
- The XML Pointer language (XPointer)
- The XML Linking Language (XLink)
- Terminology
- Inline and out-of-line links
- Arcs and behavior
- Linksets and link databases
47Remember...
- In style
- XPath identifies set of nodes
- XSLT itself does transforms on them
- Similarly in linking
- XPointer identifies sets of locations
- XLink connects and describes them
- These are very separate ideas
48XLink is a language that...
- Lets you invent your own linking elements and their meanings
- In keeping with XML approach overall
- Lets you create link databases
- Links become first-class objects in the model
- Provides some basic traversal behavior
- E.g., Open the target in a new window
- The rest is left to a style mechanism such as XSL
49XLink terminology
- Linking element
- Identifies, connects, and describes anchors
- Locator
- Locates some link end's (anchor's) data
- Link end or anchor
- A data portion reachable as part of a link
- Arc
- Explicit connection between two link ends
- Resource
- Anything you can point at on the Web
- Using an arc is called Traversal
50What links do with link-ends
- A link identifies where its ends are
- Using some kind of locators
- URI#XPointer will be the locator for XML
- URI#scheme()scheme() in general
- A link attaches metadata to each end
- Its formal role in relation to the other ends
- A title by which to refer to it (say, in menus)
- Some traversal behaviors
- Arcs to say which traversals happen
- Link itself can also have type, other info
51Inline links
- Linking element itself (better, the origin() end) is one of the link's ends
52Out-of-line links
- Linking element itself isn't automatically made
into one of its own resources
Requires that there be a way to find link
databases in the first place
53 Anatomy of an XML link
Link need not be at a link-end
<html>Knuth's right.</html>
<link type="annotated-reference">
  <loc role="ref" href="xptr.xml#child(2,div)"/>
  <loc role="src" href="knut73.tex#s4.2.2"/>
  <loc role="com" href="http://x.com/note.html"/>
</link>
Each link-end can be described
Link may have any number of ends
<!DOCTYPE spec...<spec><div></div><div> <head>...</head>...
Link-ends need not be XML
\ 4.2.2 A tree is a set of nodes where each node has one parent, except for a root node, which has none.
Link-ends need not be marked up
54XLink and namespaces
- XLink should be usable with any schema
- And have any old name
- XLink elements are in a namespace
- http://www.w3.org/1999/xlink
- Apply this in usual namespace way
- <xlink:locator xmlns:xlink="http://www.w3.org/1999/xlink">
- Namespaces can't rename, so true name goes on xlink:type attribute
- This is much like architectural forms
55Simple and extended links
- A simple link is a
- Starter kit of basic linking functionality
- Only two resources are ever involved
- It is always inline
- An extended link is a
- Generalized, all-purpose linking construct
- Any number of resources can be involved
- It might be inline or out of line
- It enables link databases
56The one-ended link
This is really important
Hilite='red'
- A link can identify and describe
- Can say the end "is a typo" or "important" or "highlighted in red"
- A one-ended link still describes an end
- Can attach tags or attributes to R/O data
- <claim href="hamlet#id(soliloquy)" isacrux/>
- The description isn't properly an end
- If it were, you'd expect to navigate there
- If it were, it would need a role and title
- Critical for adding "standoff" markup
57Arcs
- Arcs specify traversal rules
- Multi-ended links may restrict travel among their endpoints
- Restrictions generic or app-specific
- Arcs enable the description of both
- An arc is a pair of roles, plus metadata
- Enables traversal between ends with the given roles
- May be multiple locators per role (useful for document assembly, multiple-choice travel)
58Example vehicle annotations
[Diagram: link body 1 joins three "vehicle" ends, a "fuel-type" end (gasoline), and two "warning" ends ("Warning: explosive", "Warning: toxic")]
ARCS: vehicle → fuel-type, fuel-type → warning
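Since an arc is a pair of roles, the traversals it allows can be enumerated by expanding each role into its locators. The sketch below uses the vehicle example, reading its arcs as vehicle to fuel-type and fuel-type to warning; the function `traversals` and all the href values are made up for illustration.

```python
from collections import defaultdict

def traversals(locators, arcs):
    """locators: (role, href) pairs; arcs: (from_role, to_role) pairs.

    Returns every (source_href, target_href) traversal the arcs allow.
    """
    by_role = defaultdict(list)
    for role, href in locators:
        by_role[role].append(href)      # several locators may share a role
    allowed = []
    for from_role, to_role in arcs:
        for src in by_role[from_role]:
            for dst in by_role[to_role]:
                allowed.append((src, dst))
    return allowed


# Hypothetical hrefs standing in for the ends of the vehicle link.
locators = [
    ("vehicle", "car.xml"),
    ("vehicle", "truck.xml"),
    ("fuel-type", "gasoline.xml"),
    ("warning", "explosive.xml"),
    ("warning", "toxic.xml"),
]
arcs = [("vehicle", "fuel-type"), ("fuel-type", "warning")]

for src, dst in traversals(locators, arcs):
    print(src, "->", dst)
```

No arc runs from vehicle directly to warning, so that traversal is not offered even though both ends belong to the same link.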
59How to detect links
- Could have any name and content at all
- <footnote>, <criticism>,
- xlink:type attribute marks linking elements for applications to find
- <!ELEMENT footnote EMPTY>
  <!ATTLIST footnote
    xlink:type CDATA #FIXED "simple"
    xlink:href CDATA #REQUIRED>
- For example: ...has studied the issue.<footnote xlink:href="http://www.doctools.com" />
Default value for attribute
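Detection by attribute rather than by element name can be sketched with ElementTree, which exposes namespaced attributes as `{namespace-uri}local-name`. The function name `find_linking_elements` is hypothetical, and the `xlink:type` attribute is written out explicitly in the sample instead of being defaulted from a DTD, to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

def find_linking_elements(root):
    """Yield (tag, xlink:type, xlink:href) for every element carrying xlink:type."""
    for elem in root.iter():
        link_type = elem.get("{%s}type" % XLINK)
        if link_type is not None:
            yield elem.tag, link_type, elem.get("{%s}href" % XLINK)


doc = ET.fromstring(
    '<p xmlns:xlink="http://www.w3.org/1999/xlink">...has studied the issue.'
    '<footnote xlink:type="simple" xlink:href="http://www.doctools.com"/>'
    '<em>not a link</em></p>'
)
print(list(find_linking_elements(doc)))
```

The element could be called anything at all; only the namespaced attribute tells the application it is a link.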
60Types of links
- simple: easy, basic linking element
- Creates link solely by means of attributes
- So allows only one non-local end (cf <A>)
- extended
- Creates link using attributes / subelements
- Each sub-element represents an end
- This allows structured or multi-lingual titles
- Both can have
- Machine-readable role attribute
- Human-readable title attribute
61Simple links
- Like HTML's <a>
- Has two participating resources
- Linking element itself (if in-line)
- Resource pointed to by the href locator attribute
- Uni-directional
- From simple
- To remote resource
62Simple link declaration
Content may be an end
- <!ELEMENT simple ANY>
- <!ATTLIST simple
    xlink:type  CDATA #FIXED "simple"
    xlink:href  CDATA #REQUIRED
    xlink:role  CDATA #IMPLIED
    xlink:title CDATA #IMPLIED>
Identifies this as a link
Locates resource
Describe other end
63Extended links
- Can be in- (like <a>) or out-of-line
- Have any number of resources
- Each resource gets its own child locator element(s)
- Link ends are distinguished by roles
- Each locator has an href attribute
- Multi-directional
- Can start traversal from any resource
- Can control allowed traversals via arcs
- Application must be informed about the link
- E.g., user subscribes to link database XYZ
64The extended link container
- <!ELEMENT extended (locator|arc|title)*>
- <!ATTLIST extended
    xlink:type  CDATA #FIXED "extended"
    xlink:role  CDATA #IMPLIED
    xlink:title CDATA #IMPLIED>
- Other content/attributes may be included if needed
Allows for the link's ends
Identifies as a link
Can supply defaults for ends
65An extended link's ends
Identifies this as a locator
- <!ELEMENT locator EMPTY>
- <!ATTLIST locator
    xlink:type  CDATA #FIXED "locator"
    xlink:href  CDATA #REQUIRED
    xlink:role  CDATA #IMPLIED
    xlink:title CDATA #IMPLIED>
- Link end locators may also have content and/or application-specific attributes
Locates resource
Describe the end
66Declaring Arcs
Identifies this as an arc
- <!ELEMENT go ANY>
- <!ATTLIST go
    xlink:type    (arc) #FIXED "arc"
    xlink:from    NMTOKEN #IMPLIED
    xlink:to      NMTOKEN #IMPLIED
    xlink:show    (new|replace|embed|unknown) #IMPLIED
    xlink:actuate (onLoad|onRequest|unknown) #IMPLIED
    xlink:arcrole NMTOKEN #IMPLIED
    xlink:title   CDATA #IMPLIED>
Which roles to connect
Traversal semantics
Describe the arc
67Arcs and Traversals
- Traversal is split into
- Behavior
- Author's intention for behavior of a link.
- Input to style mechanism
- Not a presentation command
- Actuation
- Defines the event that triggers a link
- Events are very generic, intentionally
68Two kinds of behavior policies
- show attribute
- new to traverse and provide new context
- replace to display in existing context
- embed to display in the body of the initiating resource
- Some semantic details are left unspecified: combining multiple ends, style inheritance, etc.
- actuate attribute
- onRequest to require external request
- onLoad to traverse when link processed
69Replacement behavior
- replace plus onRequest
- Like clicking on the HTML <A> element
- replace plus onLoad
- Like a redirect: when the link is processed, the other resource replaces the original one in which the link was found
70New behavior
- new plus onRequest
- Like clicking on the HTML <A> element, except that it opens up a new window
- new plus onLoad
- Useful for revealing commentary: when the link is processed, a new window opens up in addition to the original one
71Embedding behavior
- embed plus onRequest
- Useful for expanding a thumbnail or when auto-loading of images is off: when the user clicks, the content appears in place
- embed plus onLoad
- Like the HTML <IMG> element or an external entity reference: the content automatically appears in place when the link is processed
- This is transclusion
- True transclusion requires that the transcluded
data also provide access to original context
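The show and actuate attributes are independent: one says where the target should appear, the other says which event triggers traversal. A minimal sketch of that split follows; the event names `load` and `request`, the function `plan_traversal`, and the returned descriptions are all assumptions made for illustration, and a real application would hand the result to its style mechanism rather than treat it as a presentation command.

```python
# actuate value -> the event that triggers traversal (event names are hypothetical)
TRIGGERS = {"onLoad": "load", "onRequest": "request"}

# show value -> where the target is displayed, in the slide's wording
CONTEXTS = {
    "new": "new context",
    "replace": "existing context",
    "embed": "body of the initiating resource",
}

def plan_traversal(show, actuate, event):
    """Return where the target should appear, or None if this event does not fire the link."""
    if show not in CONTEXTS or actuate not in TRIGGERS:
        raise ValueError("unsupported show/actuate value")
    if TRIGGERS[actuate] != event:
        return None
    return CONTEXTS[show]


print(plan_traversal("replace", "onRequest", "request"))  # like clicking an HTML <A>
print(plan_traversal("embed", "onLoad", "load"))          # like an HTML <IMG>
print(plan_traversal("new", "onRequest", "load"))         # nothing happens on load
```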
72Simple link syntax
- <!ELEMENT defref (#PCDATA)>
  <!ATTLIST defref
    xlink:type CDATA "simple"
    xlink:href CDATA #REQUIRED
    xlink:role CDATA "myns:cooking">
  <p>use any number of <defref xlink:href="filterdef"> filters </defref> (about 1 tsp)</p>
Note: can add "#FIXED" before default value so it will be the only value allowed.
73Extended links
- <!ELEMENT term-def (locator)*>
- <!ATTLIST term-def xlink:type CDATA "extended">
- <!ELEMENT locator EMPTY>
- <!ATTLIST locator
- xlink:type CDATA "locator"
- xlink:href CDATA #REQUIRED
- xlink:role (myns:mention|myns:def) #REQUIRED>
- <term-def><locator xlink:href="myterm1" xlink:role="myns:mention"/><locator xlink:href="myterm2" xlink:role="myns:mention"/><locator xlink:href="definition" xlink:role="myns:def"/>
- </term-def>
74Link databases let you
- Attach descriptive information from afar
- Annotate other people's stuff
- Maintain links more easily
- When a destination changes, you don't have to touch documents with links to it
- Engage in online commerce in links
- Express, package, and sell point-of-view
- Collect out of line links as databases
75External Linksets
- Users will have persistent linkdbs
- Subscriptions, interest groups, private,...
- Document can specify relevant link dbs
- Linked by special type of extended link
- Included within regular documents too
- LinkDBs enable link management
- Needed to author using external links
- Example Public annotations on.
76Declaring an External Linkset
- <!ELEMENT xls (linkbase)*>
- <!ATTLIST xls
- xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink"
- xlink:type (extended) "extended"
- xlink:role NMTOKEN "xlink:extended-linkset"
- xlink:title CDATA #IMPLIED>
- <!ATTLIST linkbase
- xlink:type (locator) "locator">
77An external Linkset Instance
- <xls><linkbase xlink:href="linkset1.xml" /><linkbase xlink:href="linkset2.xml" /><linkbase xlink:href="linkset3.xml" />
- </xls>
78Notes on implementation
- Best thing for any standard
- Lots of implementations
- Attention to conformance/interoperability
- IE does not yet have these
- Encourage implementors
- XPointer easy on top of XPath DOM
- XLink easy for inline, trickier for good LDBs
- Proxy server can merge document links
- Client app can do likewise
- Encourage distributed linkbase dev
79Some Resources
- On the Web Consortium site
- W3C XML site: www.w3.org/XML/
- Other specs: www.w3.org/TR/
- Elsewhere on the Web
- XML FAQ: www.ucc.ie/xml/
- XML Developers list: www.lists.ic.ac.uk/hypermail/xml-dev/
- TEI: etext.virginia.edu/
- Robin Cover's definitive site: www.oasis-open.org/cover/