Title: Portals to Distributed Information Environments
1Portals to Distributed Information Environments
- CS 502, 2003-04-23
- Carl Lagoze Cornell University
Acknowledgements: Internet2, JASIG
2Defining the Problem
- Dynamic and diverse content
- Distributed administration
- Need to protect content
- Need to protect privacy
- Diverse user community
- Different needs
- Different preferences
3Technologies to meet the problem
- Caveat
- There are lots of available technologies
- Most are proprietary
- Most are vertical
- Open Source Solutions
- RSS: Web syndication format
- uPortal: Framework for integrating content from distributed sources
- Shibboleth: Framework to support distributed sharing of resources that are subject to access controls
4RSS
- Format to expose news and content of news-like sites
- Wired
- Slashdot
- Weblogs
- News has very wide meaning
- Any dynamic content that can be broken down into discrete items
- Wiki changes
- CVS checkins
- Roles
- Provider syndicates by placing an RSS-formatted XML file on the Web
- Aggregator runs RSS-aware program to check feeds for changes
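The aggregator role can be sketched as follows. This is a minimal illustration, not any particular aggregator's implementation: the two feed versions, the `example.org` URLs, and the `new_items` helper are all invented, and item links are assumed to be a good enough identity for "have I seen this before".

```python
import xml.etree.ElementTree as ET

def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for items whose link has not been seen yet."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link not in seen_links:
            fresh.append((item.findtext("title"), link))
            seen_links.add(link)  # remember it for the next poll
    return fresh

# Hypothetical feed as the provider first published it.
FEED_V1 = """<rss version="2.0"><channel>
  <title>Example Wiki Changes</title>
  <link>http://example.org/wiki</link>
  <description>Recent edits</description>
  <item><title>Page A edited</title><link>http://example.org/wiki/A</link></item>
</channel></rss>"""

# The same feed after the provider adds one more item.
FEED_V2 = FEED_V1.replace(
    "</channel>",
    "<item><title>Page B edited</title><link>http://example.org/wiki/B</link></item></channel>",
)

seen = set()
first_poll = new_items(FEED_V1, seen)   # everything is new on the first poll
second_poll = new_items(FEED_V2, seen)  # only the added item is reported
print(first_poll)
print(second_poll)
```

A real aggregator would fetch the file over HTTP on a schedule; the comparison step would look much the same.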
5RSS History
- Original design (0.90) for Netscape for building portals of headlines to news sites
- Loosely RDF based
- Simplified for 0.91, dropping RDF connections
- RDF branch was continued with namespaces and extensibility in RSS 1.0
- Non-RDF branch continued to 2.0 release
- Alternately called
- Rich Site Summary
- RDF Site Summary
- Really Simple Syndication
6RSS components
- Channel
- Single tag that encloses the main body of the RSS document
- Contains metadata about the channel: title, link, description, language, image
- Item
- Channel may contain multiple items
- Each item is a story
- Contains metadata about the story (title, description, etc.) and possibly a link to the story
7RSS 2.0 Example
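As an illustration of the channel and item structure, here is a small RSS 2.0 document embedded in a Python sketch that reads it back. The feed content (titles, `example.edu` links) is invented for illustration, and `summarize` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 2.0 document: one channel with metadata and two items.
RSS_EXAMPLE = """<rss version="2.0">
  <channel>
    <title>Example Campus News</title>
    <link>http://example.edu/news</link>
    <description>Headlines from a hypothetical campus news site</description>
    <language>en-us</language>
    <item>
      <title>Library extends opening hours</title>
      <link>http://example.edu/news/library-hours</link>
      <description>The library will stay open until midnight.</description>
    </item>
    <item>
      <title>New portal launched</title>
      <link>http://example.edu/news/portal</link>
      <description>A campus portal is now available.</description>
    </item>
  </channel>
</rss>"""

def summarize(rss_text):
    """Pull out the channel metadata and the (title, link) of each item."""
    channel = ET.fromstring(rss_text).find("channel")
    return {
        "title": channel.findtext("title"),
        "link": channel.findtext("link"),
        "items": [(i.findtext("title"), i.findtext("link"))
                  for i in channel.findall("item")],
    }

summary = summarize(RSS_EXAMPLE)
print(summary["title"])
for title, link in summary["items"]:
    print("-", title, link)
```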
8RSS applications
- http://www.syndic8.com/
- Automated discovery of RSS feeds
- <link rel="alternate" type="text/xml" title="XML" href="http://rss.benhammersley.com/index.rss" />
- Aggregators
- AmphetaDesk - http://disobey.com/amphetadesk/
- Radio Userland - http://radio.userland.com/
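Automated discovery can be sketched with Python's standard HTML parser: scan a page's `link` tags and keep those marked as an alternate XML representation. The sample page and the `FeedLinkFinder` class are illustrative assumptions; only the `link` tag itself comes from the slide.

```python
from html.parser import HTMLParser

class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags with an XML type."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        mime = (a.get("type") or "").lower()
        if "alternate" in rel and "xml" in mime and a.get("href"):
            self.feeds.append(a["href"])

# Hypothetical page head containing the autodiscovery tag from the slide.
PAGE = """<html><head>
<link rel="stylesheet" type="text/css" href="/style.css" />
<link rel="alternate" type="text/xml" title="XML"
      href="http://rss.benhammersley.com/index.rss" />
</head><body>...</body></html>"""

finder = FeedLinkFinder()
finder.feed(PAGE)
feeds = finder.feeds
print(feeds)
```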
9And of course.
10RSS and publish and subscribe
- <cloud> element of channel
- Specifies a web service that supports the rssCloud interface, which can be implemented in HTTP-POST, XML-RPC or SOAP 1.1
- Allows processes to register with a cloud to be notified of updates to the channel via a callback
- <cloud domain="radio.xmlstoragesystem.com" port="80" path="/RPC2" registerProcedure="xmlStorageSystem.rssPleaseNotify" protocol="xml-rpc" />
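A subscriber first has to read the `cloud` element to learn where and how to register. The sketch below only extracts that information and does not contact any service. The surrounding channel is invented, `cloud_endpoint` is a hypothetical helper, and building an `http://` URL from the domain, port, and path is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical channel carrying the <cloud> element shown on the slide.
CHANNEL = """<rss version="2.0"><channel>
  <title>Example</title>
  <link>http://example.org/</link>
  <description>Example channel</description>
  <cloud domain="radio.xmlstoragesystem.com" port="80" path="/RPC2"
         registerProcedure="xmlStorageSystem.rssPleaseNotify" protocol="xml-rpc" />
</channel></rss>"""

def cloud_endpoint(rss_text):
    """Return the registration details advertised by a channel, or None."""
    cloud = ET.fromstring(rss_text).find("channel/cloud")
    if cloud is None:
        return None
    return {
        # Assumption: the service is reached over plain HTTP.
        "url": "http://{domain}:{port}{path}".format(**cloud.attrib),
        "procedure": cloud.get("registerProcedure"),
        "protocol": cloud.get("protocol"),
    }

endpoint = cloud_endpoint(CHANNEL)
print(endpoint)
```

With these details a subscriber would call the named procedure using the stated protocol, passing its own callback address so the cloud can notify it of updates.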
11What is uPortal?
- Open source software from the Java in Administration Special Interest Group (JASIG)
- Framework for presenting aggregated content (channels)
- Personalization
- Role-based access control
- Open source, collaborative effort
- Java web application
12The content path to the user
Governing Body
Institutions
Schools
Departments
Faculty
User
13Aggregated Layout
A user's layout being constructed from pre-defined fragments
14Tab / Column Layout
15Tree / Column Layout
16uPortal General Mechanics
- Given a set of information sources (channels) and a recipe for how to arrange and frame them (stylesheets), the uPortal framework coordinates the compilation of the final document.
- User Layout document: base layer, abstract organization of channels
- Three-stage transformation
- Structure transformation: stylesheet to translate user layout to structural units of final presentation, e.g., into tab and column elements
- Theme transformation: stylesheet to translate structure document into target markup language (e.g., HTML tables corresponding to tabs and columns)
- Serialize to output device
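The three stages can be sketched as a pipeline of functions over an XML tree. This is an illustration only: plain Python functions stand in for the XSLT stylesheets that uPortal actually uses, and the element names (`layout`, `folder`, `tab`, `column`) and the sample layout are assumptions rather than uPortal's real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical abstract user layout: folders containing folders and channels.
USER_LAYOUT = """<layout>
  <folder name="Library">
    <folder name="Left">
      <channel name="Bookmarks"/>
    </folder>
    <folder name="Right">
      <channel name="Dictionary.com"/>
    </folder>
  </folder>
</layout>"""

def structure_transform(layout):
    """Stage 1: top-level folders become tabs, nested folders become columns."""
    out = ET.Element("structuredLayout")
    for top in layout.findall("folder"):
        tab = ET.SubElement(out, "tab", name=top.get("name"))
        for inner in top.findall("folder"):
            col = ET.SubElement(tab, "column")
            for ch in inner.findall("channel"):
                ET.SubElement(col, "channel", name=ch.get("name"))
    return out

def theme_transform(structured):
    """Stage 2: each tab becomes a heading plus an HTML table of columns."""
    body = ET.Element("body")
    for tab in structured.findall("tab"):
        ET.SubElement(body, "h2").text = tab.get("name")
        row = ET.SubElement(ET.SubElement(body, "table"), "tr")
        for col in tab.findall("column"):
            td = ET.SubElement(row, "td")
            for ch in col.findall("channel"):
                ET.SubElement(td, "div").text = ch.get("name")
    return body

def serialize(doc):
    """Stage 3: write the target markup out as text for the output device."""
    return ET.tostring(doc, encoding="unicode")

html = serialize(theme_transform(structure_transform(ET.fromstring(USER_LAYOUT))))
print(html)
```

Keeping the stages separate means the same structure document could be passed to a different theme stage to target another markup language.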
17Channel
- Elementary unit of presentation, defined by the
IChannel interface
[Diagram labels: User Interaction, External Information, Channel Content (Presentation), IChannel]
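The idea of a channel as a unit that takes user interaction and external information and produces XML content can be sketched as below. This is a hypothetical Python analogue, not uPortal's Java `IChannel` interface: the method names loosely mirror `setRuntimeData()` and `renderXML()`, while the `BookmarksChannel` class, its `filter` parameter, and the output element names are invented.

```python
from abc import ABC, abstractmethod
import xml.etree.ElementTree as ET

class Channel(ABC):
    """Hypothetical Python analogue of a channel; not uPortal's Java IChannel."""

    def set_runtime_data(self, params):
        # Per-request input coming from the user interaction.
        self.params = dict(params)

    @abstractmethod
    def render_xml(self):
        """Return the channel's content as an XML element."""

class BookmarksChannel(Channel):
    def __init__(self, bookmarks):
        self.bookmarks = bookmarks  # stands in for an external information source
        self.params = {}

    def render_xml(self):
        root = ET.Element("bookmarks")
        wanted = self.params.get("filter", "").lower()
        for title, url in self.bookmarks:
            if wanted in title.lower():
                ET.SubElement(root, "bookmark", href=url).text = title
        return root

channel = BookmarksChannel([
    ("Cornell", "http://www.cornell.edu/"),
    ("NSDL", "http://www.nsdl.org"),
])
channel.set_runtime_data({"filter": "nsdl"})
xml_out = ET.tostring(channel.render_xml(), encoding="unicode")
print(xml_out)
```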
18IChannel content must
- Be well-formed XML such as XHTML, RSS, SVG, SMIL, or a SOAP message (HTML is not necessarily well-formed XML)
- Be rendered by an XSL transformation using an XSL stylesheet
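The well-formedness requirement can be checked mechanically with any XML parser. The sketch below uses Python's standard library; the two fragments and the `is_well_formed` helper are illustrative.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Legal in HTML, but the <br> is never closed, so it is not well-formed XML.
html_fragment = "<p>Line one<br>Line two</p>"
# The XHTML form closes every element.
xhtml_fragment = "<p>Line one<br/>Line two</p>"

print(is_well_formed(html_fragment))   # False
print(is_well_formed(xhtml_fragment))  # True
```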
19Channel Types
20Framework Organization
[Diagram labels: User Interaction, Presentation, uPortal Framework, Channel (x3)]
21User Layout
- User Layout is an abstract structure defining the overall content available to the user
- userLayout is a tree structure consisting of folders and channels, the latter always being the leaf nodes
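A layout tree of this shape can be sketched as XML and walked recursively. The element names and the arrangement of channels into folders are assumptions for illustration, not uPortal's actual layout schema. The walk also enforces the rule that channels are always leaves.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout; which channel sits in which folder is invented.
LAYOUT = """<layout>
  <folder name="Jim Smith">
    <folder name="Column 1">
      <channel name="Bookmarks"/>
      <channel name="Cartoon"/>
    </folder>
  </folder>
  <folder name="Library">
    <channel name="Dictionary.com"/>
  </folder>
</layout>"""

def channel_paths(node, path=()):
    """Walk the tree depth-first and return the folder path to every channel."""
    found = []
    for child in node:
        if child.tag == "channel":
            if len(child):  # a channel with children breaks the leaf rule
                raise ValueError("a channel must be a leaf node")
            found.append("/".join(path + (child.get("name"),)))
        elif child.tag == "folder":
            found.extend(channel_paths(child, path + (child.get("name"),)))
    return found

paths = channel_paths(ET.fromstring(LAYOUT))
for p in paths:
    print(p)
```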
22User Layout
23Structure Transformation
24Theme Transformation
[Diagram: a User Layout with three tabs (Jim Smith, Financial Aid, Library); tabs contain columns, and columns contain channels such as Dictionary.com, Bookmarks, and Cartoon]
25Compiling the Presentation
[Diagram: userLayout, structure transformation (XSLT), structuredLayout, theme transformation (XSLT), output as HTML, WML, VoiceML...; channels are driven through setRuntimeData() and renderXML()]
26Content Transformation
[Diagram: an XSLT processor combines XML with a stylesheet to produce XHTML for a web browser, HTML for a PDA, or WML for a cell phone]
27Live Examples
- https://my.columbia.edu/render.userLayoutRootNode.uP
- http://guest.uportal.cornell.edu/render.userLayoutRootNode.uP
- http://www.nsdl.org
28Shibboleth
- A word which was made the criterion by which to distinguish the Ephraimites from the Gileadites. The Ephraimites, not being able to pronounce sh, called the word sibboleth. (See Judges xii.)
- Hence, the criterion, test, or watchword of a party; a party cry or pet phrase.
- Webster's Revised Unabridged Dictionary (1913)
29Interrealm authorization: current approaches
- Lots of ad hoc, non-scalable, difficult to maintain, and restrictive approaches
- Single ID and shared passwords are distributed, perhaps widely, presenting significant accountability risks.
- Content providers limit access by IP address, leaving campus users on DSL/cable modems at home frustrated.
- Campuses operate proxy services or VPNs that inconvenience users and present performance bottlenecks.
- Sometimes campuses must load user identities into vendor databases, incurring additional cost, stale data, and potential privacy violations.
- Users get new userids and passwords in each realm, incurring huge overhead (and they often set all their passwords to be the same).
30Shibboleth Basics
- An initiative to develop an architecture, policy framework, and practical technologies to support inter-institutional sharing of resources
- Based on a federated administration trust framework
- Provides the secure exchange of interoperable attributes which can be used in access control decisions
- Controlled dissemination of attribute information, based on administrative defaults and user preferences
- Shifts the model from passive privacy towards active privacy
31Federated Administration
- Leverage local authentication mechanisms (UID/PW to PKI)
- Origin Site
- May have created reasonable default attribute release policies
- Responsible for initial identification and registration of users
- Responsible for managing attributes (e.g., affiliation)
- Responsible for authenticating users prior to resource access
- Browser User
- Only needs to know the name of his/her origin domain
- May have created specific attribute release policies
- Target Resource Manager
- Must have joined the appropriate communities
- Manages policies governing access to the resource
32Rethinking Privacy
- Passive privacy: the current approach.
- A user passes identity to the target, and then worries about the target's privacy policy. To comply with privacy, targets have significant regulatory requirements. The user has no control, and no responsibility. And no one is happy...
- Active privacy: a new approach.
- A user (through their security domain) can release the attributes to the target that are appropriate and necessary. If the attributes are personally identifiable, the user decides whether to release them. The user has control, along with commensurate responsibility. All parties are happy.
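Active privacy amounts to filtering a user's attributes before they leave the origin site, combining administrative defaults with the user's own choices. The sketch below is a simplified illustration and not Shibboleth's actual policy format: the attribute names, target hostnames, the `"*"` wildcard, and the rule that user preferences override site defaults are all assumptions.

```python
def release_attributes(user_attrs, site_defaults, user_prefs, target):
    """Return only the attributes that may be released to the given target."""
    # Start from the site's default release list for this target (or "*").
    allowed = set(site_defaults.get(target, site_defaults.get("*", ())))
    # Apply the user's per-target choices: True adds, False removes.
    prefs = user_prefs.get(target, {})
    allowed |= {name for name, ok in prefs.items() if ok}
    allowed -= {name for name, ok in prefs.items() if not ok}
    return {k: v for k, v in user_attrs.items() if k in allowed}

# Hypothetical user record held at the origin site.
user_attrs = {
    "affiliation": "student",
    "name": "Jim Smith",
    "email": "jsmith@example.edu",
}
site_defaults = {"*": ["affiliation"]}  # non-identifying by default
user_prefs = {"library.example.org": {"name": True}}  # user opts in for one target

to_library = release_attributes(user_attrs, site_defaults, user_prefs, "library.example.org")
to_vendor = release_attributes(user_attrs, site_defaults, user_prefs, "vendor.example.com")
print(to_library)
print(to_vendor)
```

Here the vendor learns only that the user is a student, while the library also receives the name because the user chose to release it.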
33Shibboleth Architecture Components
- Handle: pointer to a user without exposing identity
- SHIRE (Shib Indexical Reference Establisher): part of web server that manages acquiring a handle
- Handle Server: makes sure user is authenticated locally and creates a handle that can be used to receive attributes
- WAYF (Where Are You From?): maps from user's location to Handle Server
- SHAR (Shib Attribute Requester): responsible for requesting attributes for a specified handle (user)
- Attribute Authority: responsible for sending attribute sets to target for a specific handle (user)
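How an opaque handle lets a target obtain attributes without learning who the user is can be sketched in a few lines. This is a toy model of the roles listed above, not the Shibboleth protocol: the classes are in-memory stand-ins, there are no network messages or signatures, and all names and data are invented.

```python
import secrets

class HandleServer:
    """Origin-side: issues opaque handles for locally authenticated users."""

    def __init__(self, authenticated_users):
        self.authenticated = set(authenticated_users)
        self.handles = {}  # handle -> user id, known only at the origin

    def issue_handle(self, user_id):
        if user_id not in self.authenticated:
            raise PermissionError("user must authenticate locally first")
        handle = secrets.token_urlsafe(16)  # random, reveals nothing about the user
        self.handles[handle] = user_id
        return handle

class AttributeAuthority:
    """Origin-side: answers attribute requests for a handle."""

    def __init__(self, handle_server, directory, releasable):
        self.handle_server = handle_server
        self.directory = directory
        self.releasable = releasable

    def attributes_for(self, handle):
        user_id = self.handle_server.handles[handle]
        record = self.directory[user_id]
        return {k: record[k] for k in self.releasable if k in record}

def shar_request(attribute_authority, handle):
    """Target-side: the SHAR asks the origin for attributes, given only a handle."""
    return attribute_authority.attributes_for(handle)

hs = HandleServer(authenticated_users={"jsmith"})
aa = AttributeAuthority(
    hs,
    directory={"jsmith": {"affiliation": "student", "name": "Jim Smith"}},
    releasable=["affiliation"],
)

handle = hs.issue_handle("jsmith")       # obtained by the SHIRE at the target
attrs = shar_request(aa, handle)         # the target sees attributes, not identity
access_granted = attrs.get("affiliation") == "student"
print(attrs, access_granted)
```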
34Establishing a User Context
35Getting Attributes and Determining Access
36Shibboleth Architecture Concepts (detail)
[Diagram components: Browser, WAYF, HS, Web Login Server, Attribute Server, Target Web Server; sites: Target Site, Origin Site]
Authentication Phase: First Access - Unauthenticated; Redirect User to Campus for Authn; Authentication; Second Access - Authenticated
Authorization Phase: Ask For Privileges; Pass Privileges for Authz Decision; Pass content if user is allowed; Success!