Title: The Latest Web Developments: How Do I Deploy Them?
1The Latest Web Developments: How Do I Deploy Them?
- Brian Kelly, UK Web Focus
- Email address: B.Kelly_at_ukoln.ac.uk
- UKOLN, University of Bath
- http://www.ukoln.ac.uk/
UKOLN is funded by the British Library Research
and Innovation Centre, the Joint Information
Systems Committee of the Higher Education Funding
Councils, as well as by project funding from the
JISC's Electronic Libraries Programme and the
European Union. UKOLN also receives support
from the University of Bath where it is based.
2Contents
- Background
- Web Developments
- Data Formats
- Transport
- Addressing
- Metadata
- Deployment Issues
- Questions
- Aims of Talk
- To give an overview of the Web architecture
- To review new web developments
- To address implementation models
3Background
- UK Web Focus
- National web coordination post based at UKOLN, University of Bath
- Responsible for tracking web developments and informing and advising UK HE community
- Represents JISC on W3C
- W3C (World Wide Web Consortium)
- International organisation responsible for coordinating web standards
- See <URL: http://www.w3.org/>
- See list of recommendations, working drafts and notes at <URL: http://www.w3.org/TR/>
4Web and Standardisation
HTML extensions PDF and Java?
- Proprietary
- De facto standards
- Often initially appealing (cf PowerPoint)
- May emerge as standards
- W3C
- Produces W3C Recommendations on Web protocols
- Managed approach to developments
- Protocols initially developed by W3C members
- Decisions made by W3C, influenced by member and public review
- UK members include JISC, UKERNA, Southampton and Bristol
PNG HTML Z39.50 Java?
- ISO
- Produces ISO Standards
- Can be slow moving and bureaucratic
- Produce robust standards
- IETF
- Produces Internet Drafts on Internet protocols
- Bottom-up approach to developments
- Protocols developed by interested individuals
- "Rough consensus and working code"
PNG HTML HTTP
HTTP URN
5The Web Vision
- Tim Berners-Lee's vision for the Web
- Evolvability is critical
- Automation of information management: if a decision can be made by machine, it should be
- All structured data formats should be based on XML
- Migrate HTML to XML
- All logical assertions to map onto RDF model
- All metadata to use RDF
- See keynote talk at WWW 7 conference at <URL: http://www.w3.org/Talks/1998/0415-Evolvability/slide1-1.htm>
6Web Protocols
- Web initially based on three simple protocols
- Data formats: HTML (HyperText Markup Language) provides the data format for native documents
- Addressing: URLs (Uniform Resource Locators) provide an addressing mechanism for web resources
- Transport: HTTP (HyperText Transfer Protocol) defines transfer of resources between client and server
7HTML History
Dilemma: proprietary extensions cause problems, but experiments are needed
- HTML 1.0 (1992): Unpublished specification
- HTML 2.0 (1994): Spec. based on innovations from NCSA (forms and inline images!)
- HTML 3.0 (1994-5): Proposed spec. (renamed from HTML+). Very comprehensive. Failed to complete IETF standardisation. Little implementation experience
- Proprietary (1995): Introduction of proprietary HTML elements by Netscape and Microsoft
- HTML 3.2 (1997): Spec. based on description of mainstream innovations in marketplace
- HTML 4.0 (1998): Current recommendation
8HTML 4.0, CSS 2.0 and DOM
- HTML 4.0 used in conjunction with CSS 2.0
(Cascading Style Sheets) and the DOM provides an
architecturally pure, yet functionally rich
environment
- HTML 4.0 W3C-Rec
- Improved forms
- Hooks for stylesheets
- Hooks for scripting languages
- Table enhancements
- Better printing
- CSS 2.0 W3C-Rec
- Support for all HTML formatting
- Positioning of HTML elements
- Multiple media support
- DOM W3C-WD
- Document Object Model
- Hooks for scripting languages
- Permits changes to HTML and CSS properties and content (DHTML)
- CSS Problems
- Changes during CSS development
- Netscape / IE incompatibilities
- Continued use of browsers with known bugs
9HTML Limitations
- HTML 4.0 / CSS 2.0 have limitations
- Difficulties in introducing new elements
- Time-consuming standardisation process (<ABBREV>)
- Dictated by browser vendor (<BLINK>, <MARQUEE>)
- Area may be inappropriate for standardisation
- Covers specialist area (maths, music, ...)
- Application-specific (<STUD-NUM>)
- HTML is a display (output) format
- HTML's lack of arbitrary structure limits functionality
- Find all memos copied to John Smith
- How many unique tracks on Jackson Browne CDs?
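Queries of this kind become simple once the markup describes the data rather than its appearance. The following is a minimal sketch only: the memo markup and its element names (`memo`, `to`, `cc`, `subject`) are hypothetical and are not taken from any published DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo markup; the element names are assumptions for illustration.
MEMOS = """
<memos>
  <memo><to>Ann Jones</to><cc>John Smith</cc><subject>Budget</subject></memo>
  <memo><to>John Smith</to><subject>Timetable</subject></memo>
  <memo><to>Raj Patel</to><cc>Ann Jones</cc><cc>John Smith</cc><subject>Web site</subject></memo>
</memos>
"""

def memos_copied_to(xml_text, name):
    """Return the subjects of all memos whose <cc> list includes `name`."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("subject")
        for memo in root.iter("memo")
        if any(cc.text == name for cc in memo.findall("cc"))
    ]

print(memos_copied_to(MEMOS, "John Smith"))  # ['Budget', 'Web site']
```

The second memo is addressed to John Smith but not copied to him, so it is excluded. Display-oriented HTML has no element that records this distinction.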
10XML
- XML
- Extensible Markup Language
- A lightweight SGML designed for network use
- Addresses HTML's lack of evolvability
- Arbitrary elements can be defined
(<STUDENT-NUMBER>, <PART-NO>, etc.)
- Agreement achieved quickly: XML 1.0 became W3C Recommendation in Feb 1998
- Support from industry (SGML vendors, Microsoft, etc.)
- Various XML DTDs already agreed (MathML, CML)
- Support in Netscape 5 and IE 5
11XML Concepts
- Well-formed XML resources
- Make end-tags explicit: <LI>...</LI>
- Make empty elements explicit: <IMG .../>
- Quote attributes: <IMG SRC="logo" HEIGHT="20"/>
- Use consistent upper/lower case
- Valid XML resources
- Need DTD
- XML Namespaces
- Mechanism for ensuring unique XML elements
- <?xml:namespace ns="http://foo.org/1998-001" prefix="i"?>
- <P>Insert <i:PART>M-471</i:PART></P>
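The well-formedness rules above can be checked mechanically with any XML parser. The sketch below uses Python's standard library as one possible tool. It is an assumption of this example that the namespace is declared with an `xmlns:i` attribute, the syntax of the final Namespaces in XML Recommendation, rather than with the processing-instruction form shown on the slide.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if `text` parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# HTML-style markup: missing end-tags, unquoted attribute values.
print(is_well_formed("<UL><LI>One<LI>Two</UL>"))                # False
print(is_well_formed("<P><IMG SRC=logo HEIGHT=20></P>"))        # False
# The same content with explicit end-tags, empty-element syntax and quoting.
print(is_well_formed("<UL><LI>One</LI><LI>Two</LI></UL>"))      # True
print(is_well_formed('<P><IMG SRC="logo" HEIGHT="20"/></P>'))   # True

# A namespace-qualified element (xmlns attribute syntax is an assumption here).
doc = ET.fromstring(
    '<P xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></P>'
)
part = doc.find("{http://foo.org/1998-001}PART")
print(part.text)  # M-471
```

The parser identifies the element by its namespace URI rather than by the prefix `i`, which is what keeps element names unique across vocabularies.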
12XML Deployment
- Ariadne issue 15 has article on "What Is XML?"
- Describes how XML support can be provided
- Natively by new browsers
- Back end conversion of XML -> HTML
- Client-side conversion of XML -> HTML / CSS
- Java rendering of XML
- Examples of intermediaries
See <URL: http://www.ariadne.ac.uk/issue15/what-is/>
13XLink, XPointer and XSL
- XLink will provide sophisticated hyperlinking missing in HTML
- Links that lead user to multiple destinations
- Bidirectional links
- Links with special behaviors
- Expand-in-place / Replace / Create new window
- Link on load / Link on user action
- Link databases
- XPointer will provide access to arbitrary portions of XML resource. Interesting IPR issues!
- XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents)
<commentary xml:link="extended" inline="false">
<locator href="smith2.1" role="Essay"/>
<locator href="jones1.4" role="Rebuttal"/>
<locator href="robin3.2" role="Comparison"/>
</commentary>
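To illustrate what "multiple destinations" means in practice, the sketch below reads the extended link above and lists where it can lead. It treats the markup as plain XML and is not an implementation of XLink itself; the function name `destinations` is an assumption of this example.

```python
import xml.etree.ElementTree as ET

LINK = """
<commentary xml:link="extended" inline="false">
  <locator href="smith2.1" role="Essay"/>
  <locator href="jones1.4" role="Rebuttal"/>
  <locator href="robin3.2" role="Comparison"/>
</commentary>
"""

def destinations(xml_text):
    """Map each locator's role to its href for one extended link."""
    link = ET.fromstring(xml_text)
    return {loc.get("role"): loc.get("href") for loc in link.findall("locator")}

for role, href in destinations(LINK).items():
    print(role, "->", href)
```

A browser supporting such links could offer the three roles as a menu, whereas an HTML anchor can name only one target.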
14Addressing
- URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/latest.html) have limitations
- Lack of long-term persistency
- Organisation changes name
- Department / Product scrapped
- Directory structure reorganised
- Inability to support multiple versions of resources (mirroring)
- URNs (Uniform Resource Names)
- Proposed as solution
- Difficult to implement (no W3C activity in this area)
15Addressing - Solutions
- DOIs (Digital Object Identifiers)
- Proposed by publishing industry as a solution
- Aimed at supporting rights ownership
- Business model needed
- PURLs (Persistent URLs)
- Provide single level of redirection
- Pragmatic Solution
- URLs don't break - people break them
- Design URLs to have long life-span
- Further Information
- <URL: http://www.ukoln.ac.uk/metadata/resources/urn/>
- <URL: http://hosted.ukoln.ac.uk/biblink/wp2/links.html>
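The single level of redirection that PURLs provide can be sketched as a lookup table held by the resolver. This is an illustration only, not the PURL server's actual implementation: the table contents, the path `/net/music/latest`, the replacement address and both function names are hypothetical.

```python
# Hypothetical table mapping persistent paths to current locations.
PURL_TABLE = {
    "/net/music/latest": "http://www.bristol-poly.ac.uk/depts/music/latest.html",
}

def resolve(purl_path, table):
    """Return an (HTTP status, Location) pair for a persistent path."""
    target = table.get(purl_path)
    if target is None:
        return 404, None
    return 302, target

def move_resource(purl_path, new_url, table):
    """Record a new location; the persistent path itself never changes."""
    table[purl_path] = new_url

print(resolve("/net/music/latest", PURL_TABLE))
# The organisation changes its name: only the table entry is updated.
move_resource("/net/music/latest", "http://www.example.ac.uk/music/latest.html", PURL_TABLE)
print(resolve("/net/music/latest", PURL_TABLE))
```

Links that cite the persistent path keep working after the move, at the cost of one extra redirect per request.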
16Transport
- HTTP/0.9 and HTTP/1.0
- Design flaws and implementation problems
- HTTP/1.1
- Addresses some of these problems
- 60% server support
- Performance benefits! (60% packet traffic reduction)
- Is acting as fire-fighter
- Not sufficiently flexible or extensible
- HTTP/NG
- Radical redesign using object-oriented technologies
- Undergoing trials
- Gradual transition (using proxies)
17Metadata
- Metadata - the missing architectural component
from the initial implementation of the web
- Metadata Needs
- Resource discovery
- Content filtering
- Authentication
- Improved navigation
- Multiple format support
- Rights management
18Metadata Examples
- DSig (Digital Signatures initiative)
- Key component for providing trust on the web
- DSig 2.0 will be based on RDF and will support signed assertions
- This page is from the University of Bath
- This page is a legally-binding list of courses provided by the University
- P3P (Platform for Privacy Preferences)
- Developing methods for exchanging Privacy Practices of Web sites and users
- Note that discussions about additional rights management metadata are currently taking place
19RDF
- RDF (Resource Description Framework)
- Highlight of WWW 7 conference
- Provides a metadata framework ("machine understandable metadata for the web")
- Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping
- Based on a formal data model (directed labelled graphs)
- Applications include
- cataloging resources
- resource discovery
- electronic commerce
- intelligent agents
- digital signatures
- content rating
- intellectual property rights
- privacy
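The formal data model behind RDF can be pictured as a set of labelled arcs, each running from a resource to a value. The sketch below shows only that idea and does not use RDF's own syntax. The statements are hypothetical, and the property names `Creator` and `Title` are borrowed loosely from Dublin Core.

```python
from collections import namedtuple

# One arc of the graph: subject --predicate--> object.
Triple = namedtuple("Triple", "subject predicate object")

# Hypothetical statements used only for illustration.
GRAPH = [
    Triple("http://www.ukoln.ac.uk/", "Creator", "UKOLN"),
    Triple("http://www.ukoln.ac.uk/", "Title", "UKOLN home page"),
    Triple("http://www.w3.org/", "Creator", "W3C"),
]

def objects(graph, subject, predicate):
    """Follow the arcs labelled `predicate` that leave `subject`."""
    return [t.object for t in graph if t.subject == subject and t.predicate == predicate]

def subjects(graph, predicate, obj):
    """Find the resources that have an arc labelled `predicate` pointing at `obj`."""
    return [t.subject for t in graph if t.predicate == predicate and t.object == obj]

print(objects(GRAPH, "http://www.ukoln.ac.uk/", "Title"))  # ['UKOLN home page']
print(subjects(GRAPH, "Creator", "W3C"))                   # ['http://www.w3.org/']
```

Because every statement has the same three-part shape, one query routine serves resource discovery, content rating and the other applications listed above.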
20Browser Support for RDF
- Mozilla (Netscape's source code release) provides support for RDF.
- Mozilla supports site maps in RDF, as well as bookmarks and history lists
- See Netscape's or HotWired home page for a link to the RDF file.
[Figure: trusted 3rd party metadata and embedded metadata (e.g. sitemaps). Image from http://purl.oclc.org/net/eric/talks/www7/devday/]
21Deployment Issues
- Various interesting new technologies have been outlined
- How can they be deployed in our environment?
- Should we
- Ignore them?
- Accept them fully?
- Accept them partly?
22Ignore New Developments
- We can choose to ignore new developments, and continue to use HTML 3.2
- Safe option, with no new training, support or software costs
- Experience in effectiveness, limitations, etc.
- Fails to address current performance problems
- Fails to address accessibility problems
- Fails to provide new functionality
- Service likely to look "old-fashioned" compared with competition
23Fully Accept New Developments
- We can choose to move wholesale to, say, HTML 4.0 and CSS 2.0
- Can be exciting to be at leading edge
- Performance benefits
- Accessibility benefits
- Based on open standards
- Provides motivation for users to upgrade browsers
- Likely to be solution at some point (cf. Gopher)
- Backwards compatibility problems with old browsers
- Costly to deploy new authoring tools, training, ...
- Likely to be bugs and incompatibilities with new tools and browsers
24Implement "Safe" Solutions
- An alternative is to use "safe" parts of technologies which are backwards compatible and avoid major browser bugs
- Attractive sounding compromise position
- Lose some functionality, but not all
- Can be difficult or expensive to find "safe" options (does margin-left work on IE on SGI?)
- Tools may not allow safe options to be chosen
- Lack of validation tools for checking conformance with restricted set of specification
- Note: see <URL: www.webreview.com/guides/style/insafegrid.htm> for unsafe CSS 2.0 properties
25Decision Time
- What would you opt for?
- Stick with current technologies
- Cheap, default option. Continuation of performance and accessibility problems. Unlikely to be long term solution.
- Deploy new technologies
- More expensive option. Functionality, performance and accessibility benefits. Access problems for old browsers.
- Use "safe" new technologies
- May require home-grown tools and support. Avoids some of the problems of other solutions
26An Alternative
- An alternative approach to deploying new technologies is available
- Use more intelligent server-side software
- Use "proxies" to address limitations of browser technologies. The term intermediary was used in a paper [1] at the WWW 7 conference to describe this approach
- Protocol solutions, such as Transparent Content Negotiation (TCN)
[1] "Intermediaries: New Places For Producing and Manipulating Web Content"
27Intelligent Server Software
- Simple model
- Server receives request for resource
- Server delivers resource to client
- More sophisticated model
- Server receives request for resource
- Server processes header information from client
- Server delivers resource to client based on client information
- This is referred to as browser-sniffing or user-agent negotiation
- Note that server support is now available in Apache and in server add-ons such as PHP/FI and MS Active Server Pages
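The more sophisticated model can be sketched as a function that inspects the request headers before choosing what to send. This is a minimal sketch, not how Apache, PHP/FI or Active Server Pages implement it: the style sheet file names and the function name are hypothetical, and the matching rules are deliberately crude.

```python
def choose_stylesheet(headers):
    """Pick a style sheet name from the request's User-Agent header."""
    agent = headers.get("User-Agent", "")
    # Internet Explorer also announces itself as "Mozilla/...",
    # so it has to be tested for before Netscape.
    if "MSIE" in agent:
        return "style-ie.css"           # hypothetical file name
    if agent.startswith("Mozilla/4"):
        return "style-netscape4.css"    # hypothetical file name
    return "style-basic.css"            # safe default for everything else

print(choose_stylesheet({"User-Agent": "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"}))
print(choose_stylesheet({"User-Agent": "Mozilla/4.05 [en] (Win95; I)"}))
print(choose_stylesheet({"User-Agent": "Lynx/2.8"}))
```

Falling back to a basic style sheet for unrecognised agents keeps the page usable in browsers the author has never tested.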
28W3C CSS Gallery
- W3C have a link to a core style sampler service.
- The service provides 8 core style sheets which can be freely linked to.
- The style sheets use "browser sniffing". Different style sheets are delivered to different browsers.
H1 { font-family: Tahoma, ...; font-size-adjust: .53; margin-top: 1.33em; font-weight: 500 }
...
H1, H2, H3, H4, H5, H6, ... { color: black; background: white }
Portion of CSS file for Netscape (total: 169 lines)
Portion of CSS file for IE (total: 797 lines)
29Java Intermediaries
- Netscape and Internet Explorer don't support MathML
- Who cares? MathML Java renderers are available
- This concept can be generalised to deploying support for other new markup languages.
- For example see the Displets work at http://www.cs.unibo.it/fabio/displet/
30Deploying URNs
- Problem
- Today's browsers can't process URNs, such as
- urn:doi:10.1000/1
- Possible Solution
- A separate program could resolve URNs into URLs
- Andy Powell (UKOLN) has demonstrated use of Netscape's autoproxy to pass on URNs of the format above to Squid for resolution [1]
- Example of use of an intermediary to deploy new technologies not supported by current browsers
[1] "Resolving DOI Based URNs Using Squid" at http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/june98/06powell.html
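The resolution step can be sketched as a rewrite from a URN of the form `urn:doi:10.1000/1` to an ordinary URL that a browser can fetch. This is not the Squid configuration described in the article: the resolver address and the function name below are hypothetical, and a real resolver would look the identifier up rather than simply append it.

```python
RESOLVER = "http://resolver.example.org/"  # hypothetical resolver address

def urn_to_url(urn, resolver=RESOLVER):
    """Rewrite a URN of the form urn:doi:<identifier> as a resolver URL."""
    # Split into at most three parts so that any ":" in the identifier survives.
    scheme, namespace, identifier = urn.split(":", 2)
    if scheme.lower() != "urn" or namespace.lower() != "doi" or not identifier:
        raise ValueError("not a DOI-based URN: %r" % urn)
    return resolver + identifier

print(urn_to_url("urn:doi:10.1000/1"))  # http://resolver.example.org/10.1000/1
```

Placing this rewrite in a proxy means that no browser needs to be upgraded, which is the point of the intermediary approach.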
31Intermediaries
- Intermediaries
- Enable new functionality to be introduced to the web without extending the client or the server
- Intermediaries can be implemented using proxies
- Intermediaries can be used for applications such as web personalisation, document caching, content distillation and protocol extension
- Demonstration available using WBI (Web Browser Intelligence)
- See <URL: http://wwwcssrv.almaden.ibm.com/wbi/>
- Another example for web accessibility at <URL: http://www.inf.ethz.ch/department/IS/ea/blinds/>
32Web Applications
- An Example
- We're familiar with HTML validation services (e.g. HENSA mirror)
- We can "go there" and use the service
- We can also have a link from the page which will run the service (rather than just go to the form)
- Consider
- Web page is in Bath
- User is in Sheffield
- Application is in Kent
- An example of a web (intermediary?) application
33Examples
http://www.ukoln.ac.uk/web-focus/webwatch/services/url-info/
- Examples of remote web applications include
- Link checking
- Website analysis
- Document format conversion
- Accessibility support
- Imagine an intermediary service which called an XML -> HTML conversion service if the browser agent didn't support XML
http://wheel.compose.cs.cmu.edu:8001/cgi-bin/browse/objweb
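The intermediary imagined on this slide could be sketched as follows. Several assumptions are made here: the client signals XML support by listing `text/xml` in its Accept header, the conversion is a toy that turns each child element into a paragraph, and both function names are hypothetical. A real service would call a separate conversion application.

```python
import xml.etree.ElementTree as ET
from html import escape

def to_html(xml_text):
    """Toy XML -> HTML conversion: one paragraph per child element."""
    root = ET.fromstring(xml_text)
    rows = "".join(
        "<P><B>%s:</B> %s</P>" % (escape(child.tag), escape(child.text or ""))
        for child in root
    )
    return "<HTML><BODY>%s</BODY></HTML>" % rows

def intermediary(accept_header, xml_text):
    """Pass XML through to clients that accept it; convert it for the rest."""
    accepted = [item.split(";")[0].strip() for item in accept_header.split(",")]
    if "text/xml" in accepted:
        return "text/xml", xml_text
    return "text/html", to_html(xml_text)

memo = "<memo><to>John Smith</to><subject>Budget</subject></memo>"
print(intermediary("text/html, image/gif", memo))
print(intermediary("text/xml, text/html", memo))
```

The author publishes XML once, and older browsers still receive something they can display.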
34Content Negotiation
- Transparent Content Negotiation (TCN)
- Method of deploying new formats
- Client: Accept: image/gif, image/png
- Server: if foo.png exists, send it, else send foo.gif
- Used for logos on W3C website
- Not widely deployed
- Transparent Feature Negotiation
- Proposal for deploying new HTML elements
- Over-engineered? Requires naming authority
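The client and server exchange above can be sketched as a selection function on the server. This is a simplified illustration of negotiating on the Accept header, not the full TCN protocol: it ignores quality values, and the function name and variant list are assumptions of this example.

```python
def negotiate(accept_header, available):
    """Return the first available variant whose media type the client accepts.

    `available` lists (filename, media type) pairs in the server's order of
    preference, with the newer format first and the older one as a fallback.
    """
    accepted = {item.split(";")[0].strip() for item in accept_header.split(",")}
    for filename, media_type in available:
        if media_type in accepted:
            return filename
    return None

VARIANTS = [("foo.png", "image/png"), ("foo.gif", "image/gif")]

print(negotiate("image/gif, image/png", VARIANTS))  # foo.png
print(negotiate("image/gif", VARIANTS))             # foo.gif
```

An older browser that never mentions `image/png` continues to receive the GIF, so the new format can be introduced without breaking existing clients.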
35Fourth and Fifth Ways
- Several other options for deploying new web technologies (e.g. on low spec PCs)
- Run Browser on Server
- Use Windows Terminal Server, Citrix, etc.
- Browser runs on NT server
- Deploy JavaPC (e.g. for DOS)
- Use the JavaPC and run HotJava browser (min. spec 486 PC with 8Mb)
- Opera
- Supports CSS, Frames, on 486 PCs (8Mb)
- See <URL: http://www.operasoftware.com/>
36Conclusions
- To conclude
- New web protocols are still being developed
- Deployment of new technologies can be expensive or time-consuming, but is likely to be needed
- Various deployment models
- Don't implement
- Implement fully
- Implement via proxy
- Others (thin clients, ...)
- We can't do it all ourselves
- Experience in developing (wide-area) web
applications will help in developing
intermediaries