Title: Update on the Fedora Project
1Update on the Fedora Project
- Common Solutions Group
- September 2005
Tim Sigmon, University of Virginia
Special thanks to the Fedora Team for these slides!
2Fedora Development Team
- Cornell University
- Sandy Payette (co-director)
- Chris Wilper
- Carl Lagoze
- Eddie Shin
- University of Virginia
- Thorny Staples (co-director)
- Ross Wayland
- Ronda Grizzle
- Bill Niebel
- Bob Haschart
- Tim Sigmon
3Fedora Brief History
- Cornell Research (1997-present)
- DARPA and NSF-funded research
- First reference implementation developed
- Interoperable Repositories (experiments with CNRI)
- Policy Enforcement
- First Application (1999-2001)
- University of Virginia digital library prototype
- Technical implementation adapted to web RDBMS storage
- Scale/stress testing for 10,000,000 objects
- Open Source Software (2002-present)
- Andrew W. Mellon Foundation grants
- Technical implementation XML and web services
- Fedora 1.0 (May 2003)
- Fedora 2.0 (Jan 2005)
- Fedora 2.1 (coming soon!)
4Known Use Cases for Fedora Inside
- Digital Library Collections
- Institutional Repository
- Educational Software
- Information Network Overlay
- Digital Archives and Records Management
- Digital Asset Management
- File Cabinet / Document Management
- Scholarly publishing
5Fedora Repository 2.x
6The Fedora Digital Object Model
- Service Perspective: methods for disseminating views of content
- Item Perspective: set of content or metadata items
- Internal: key metadata necessary to manage the object
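The three perspectives on this slide can be pictured as one small data structure. The following is a minimal sketch only; the class names, field names, and the `demo:1` identifier are assumptions made for illustration and are not Fedora's actual classes or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Datastream:
    """One content or metadata item (item perspective). Hypothetical type."""
    ds_id: str
    mime_type: str
    content: bytes


@dataclass
class DigitalObject:
    """Hypothetical in-memory stand-in for a digital object."""
    pid: str  # persistent identifier
    # Internal: key metadata needed to manage the object.
    properties: Dict[str, str] = field(default_factory=dict)
    # Item perspective: the set of content or metadata items.
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    # Service perspective: named methods that disseminate views of content.
    disseminators: Dict[str, Callable[["DigitalObject"], str]] = field(
        default_factory=dict
    )

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def disseminate(self, method: str) -> str:
        # Look up the named method and apply it to this object.
        return self.disseminators[method](self)


def get_title(obj: DigitalObject) -> str:
    # A trivial "view": return the DC datastream decoded as text.
    return obj.datastreams["DC"].content.decode("utf-8")


obj = DigitalObject(pid="demo:1", properties={"state": "Active"})
obj.add_datastream(Datastream("DC", "text/xml", b"<title>Sample</title>"))
obj.disseminators["getTitle"] = get_title
```

Calling `obj.disseminate("getTitle")` goes through the service perspective, while `obj.datastreams` exposes the same content from the item perspective.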
7Fedora Repository 2.x
8Fedora: what's new (version 2.0)
- FOXML (Fedora Object XML)
- Simple XML format directly expresses Fedora object model
- Easily adapts to new and planned Fedora features
- Easily translated to other well-known formats
- Enhanced Ingest/Export of objects
- FOXML, METS (Fedora extension)
- Extensible to accommodate new XML formats
- Planned: METS 1.4, MPEG-21 DIDL
9Fedora 2.0 (continued)
- Object-to-object Relationships
- Ontology of common relationships (RDF schema)
- Relationships stored in special datastream (RELS-EXT)
- Resource Index (RI)
- RDF-based index of repository (Kowari triple-store)
- Graph-based index includes
- Object properties and Dublin Core
- Object Relationships
- Object Disseminations
- RI Search (Search the repository as a graph)
- Powerful querying of graph of inter-related objects
- REST-based query interface (using RDQL or ITQL)
- Results in different formats (triples, tuples, sparql)
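To illustrate what "search the repository as a graph" means, here is a minimal sketch of a triple index with wildcard matching and a transitive membership query. It is plain Python, not Kowari, RDQL, or ITQL; the identifiers and predicate names (`demo:...`, `rel:isMemberOf`, `dc:title`) are assumptions chosen for the example.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples standing in for relationships and Dublin Core properties.
TRIPLES: Set[Triple] = {
    ("demo:article1", "rel:isMemberOf", "demo:journal1"),
    ("demo:article2", "rel:isMemberOf", "demo:journal1"),
    ("demo:journal1", "rel:isMemberOf", "demo:collection1"),
    ("demo:article1", "dc:title", "First Article"),
}


def match(
    s: Optional[str] = None, p: Optional[str] = None, o: Optional[str] = None
) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def members_transitive(container: str) -> Set[str]:
    """Follow isMemberOf edges backwards to collect direct and indirect members."""
    found: Set[str] = set()
    frontier = [container]
    while frontier:
        current = frontier.pop()
        for subj, _, _ in match(p="rel:isMemberOf", o=current):
            if subj not in found:
                found.add(subj)
                frontier.append(subj)
    return found
```

A query such as `members_transitive("demo:collection1")` walks the graph rather than scanning object by object, which is the kind of question a graph index makes cheap to ask.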
10Fedora 2.0 (continued)
- New Utilities
- Batch Modify Utility
- Repository Administrator Reporting
- Performance Tuning (1 million objects)
- Ingest testing (800K objects, 40 millisec/object)
- Concurrency testing (access requests)
- Communications and Outreach
- New Fedora Web Site
- Improved Documentation
- Tutorials
11Preview Fedora 2.1 (Sept. 2005)
- ECL license
- Support for SSL
- Authentication plug-ins
- Tomcat realms and login modules
- Plug-in 1: Tomcat user/password file or database
- Plug-in 2: LDAP
- Plug-in 3: RADIUS Authentication
- Authorization module
- XML-based policies using XACML
- Repository-wide policies
- Object-specific policies
- Fine-grained policy enforcement
- API actions × subject attributes × object attributes
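Fine-grained enforcement over API actions, subject attributes, and object attributes can be sketched as a rule matcher. This is not XACML and not Fedora's authorization module; the `Rule` type, the attribute keys, the deny-overrides combining, and the default of "Deny" when nothing matches are all assumptions of this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rule:
    """Hypothetical rule: an effect plus constraints on the three dimensions."""
    effect: str                      # "Permit" or "Deny"
    actions: List[str]               # API method names; "*" matches any action
    subject: Dict[str, str] = field(default_factory=dict)   # required subject attributes
    resource: Dict[str, str] = field(default_factory=dict)  # required object attributes


def _attrs_match(required: Dict[str, str], actual: Dict[str, str]) -> bool:
    # Every required attribute must be present with the same value.
    return all(actual.get(key) == value for key, value in required.items())


def evaluate(
    rules: List[Rule], action: str, subject: Dict[str, str], resource: Dict[str, str]
) -> str:
    """Deny-overrides: a matching Deny wins; otherwise Permit if any rule permits."""
    decision = "Deny"  # assumed default when no rule matches
    for rule in rules:
        action_ok = "*" in rule.actions or action in rule.actions
        if action_ok and _attrs_match(rule.subject, subject) and _attrs_match(
            rule.resource, resource
        ):
            if rule.effect == "Deny":
                return "Deny"
            decision = "Permit"
    return decision


RULES = [
    Rule("Permit", ["*"], subject={"role": "administrator"}),
    Rule("Permit", ["deleteDatastream"], subject={"role": "author"}),
    Rule("Deny", ["*"], subject={"role": "guest"}, resource={"datastream": "secret"}),
]
```

For example, `evaluate(RULES, "deleteDatastream", {"role": "author"}, {"pid": "demo:1"})` yields "Permit", while the same subject requesting any other action falls through to the assumed default.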
12Authorization Example Policies
- Repository Policy
- Allow access to all API-M methods to administrator
- Allow access to the deleteDatastream method to author
- Specific Object Policy
- Allow access to object uva100 if user is Thorny.
- Group Object Policy
- Allow access to the getFullArticle dissemination of objects whose content model is journal-article if faculty
- Allow access to the secret datastream if user is not guest
13Authorization Example Policies
- Time-oriented Policy
- Permit students access to answers datastream of learning object cs125 after May 15, 2005
- Deny all access to learning object cs125 after June 15, 2005
- Backend Service Security Policy
- Deny callback by external service represented by Bmech10
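The two time-oriented policies for cs125 can be worked through as a small decision function. This is a sketch under stated assumptions: "after" is read as strictly later than the date, the Deny rule overrides the Permit rule, a student asking for the answers before the opening date is denied, and the function name and return values are invented for the example.

```python
from datetime import date

ANSWERS_OPEN = date(2005, 5, 15)   # students may read answers after this date
OBJECT_CLOSED = date(2005, 6, 15)  # all access is denied after this date


def decide(role: str, pid: str, datastream: str, today: date) -> str:
    """Return "Permit", "Deny", or "NotApplicable" for the cs125 policies."""
    if pid != "cs125":
        return "NotApplicable"  # these policies say nothing about other objects
    if today > OBJECT_CLOSED:
        return "Deny"           # the deny rule takes precedence (assumed)
    if role == "student" and datastream == "answers":
        return "Permit" if today > ANSWERS_OPEN else "Deny"
    return "NotApplicable"
```

Under these assumptions a student gets the answers only in the window between the two dates: `decide("student", "cs125", "answers", date(2005, 6, 1))` permits, while the same request on July 1 is denied.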
14Preview of Fedora 2.1 (Sept. 2005)
- Enhanced OAI Provider Service (prOAI)
- Harvest multiple metadata formats
- Harvest datastreams and disseminations
- Support for incremental harvest by modified date
- Support for OAI sets
- Highly configurable via queries against Resource Index
- Directory Ingest Service (and client tool)
- Facilitate ingest of hierarchical directories of files
- Submit files as .zip or .jar (with a METS manifest)
- Automatically asserts parent-child relationships in RELS-EXT
- Stages content and ingests as FOXML objects into repository
- Policy Builder Client
- Simple user interface to create access policies
- Automatically generate XACML
- Handle Generation Plug-in (PIDs as Handles)
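The Directory Ingest Service item on this slide mentions asserting parent-child relationships from a directory hierarchy. A minimal sketch of that derivation follows; it works on a list of relative paths (as might be read from an archive listing) and is not the service's actual code. The function name and sample paths are assumptions.

```python
from pathlib import PurePosixPath
from typing import List, Set, Tuple


def parent_child_pairs(paths: List[str]) -> List[Tuple[str, str]]:
    """Derive sorted (parent, child) pairs implied by relative file paths."""
    pairs: Set[Tuple[str, str]] = set()
    for raw in paths:
        node = PurePosixPath(raw)
        # Walk upward, linking each node to its parent, until the top level.
        while node.parent != node and node.parent != PurePosixPath("."):
            pairs.add((str(node.parent), str(node)))
            node = node.parent
    return sorted(pairs)


listing = ["thesis/abstract.txt", "thesis/ch1/fig1.jpg"]
relationships = parent_child_pairs(listing)
```

Each pair could then be written out as a relationship between the object created for the directory and the object created for the file or subdirectory inside it.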
15Fedora Service Framework (beginning Fedora 2.1)
16Fedora Service Framework (2005-2006)
17Fedora Service Framework (2006-2007)
18Fedora Web-based IR Client
- Web-based client for institutional repository
- Configurable
- End-user submission
- Object creation template for content models
- Basic Workflow
- Search/Browse
- Easy configuration of access policy
- Development to begin this fall.
19More Dev-Team Priorities
- Federated Repositories
- Federation with other repositories (DSpace, aDORE)
- Note the Cornell/LANL NSF Pathways project.
- Fedora Showcase and News (on new website)
- Content Model Specification Language
- Advanced Object Creation Workbenches
- Tools for RDF browse and graph traversal
- Performance tuning for millions of objects
- Web services security and Shibboleth
- Code Refactoring
- Fedora as web app (.war)
- MVC2 pattern for REST-based web exposures
- Other misc.
20VTLS
- Commercial support and value-add development
- Similar to RedHat and Linux
- Installation, training, support, hosting, etc.
- VITAL product (based on Fedora) contains VTLS-developed workflow extensions, management utilities, and enhanced searching capabilities
- In partnership with the ARROW project, VTLS is developing and contributing back to open source, e.g.,
- Handles integration
- SRU/SRW interface to expose Fedora content
- Metadata extraction and content validation via JHOVE
- Automatic capture of technical metadata from images
- Facilitate content exposure to web crawlers
- Creating custom content models
21ARROW
- Australian Research Repositories Online to the World
- Intention of the project is to achieve wider access to Australian research by making it available on-line with appropriate discovery facilities
- Identify and test software or solutions to support best practice institutional digital repositories comprising e-prints, digital theses and electronic publishing
- Selected Fedora and VITAL as its core repository solutions
- http://arrow.edu.au
22NSDL
- National Science Digital Library
- Mission: improve Science, Math, and Engineering education through digital libraries
- First implementation was a metadata repository using Oracle dbms to hold information about collections and items.
- Next generation NSDL will use Fedora to add value to digital content ... don't just provide access.
- Create an information network overlay that supports
- Rich and dynamic information objects
- Information reuse/refactoring
- Graph-based information model (ontology-based relationships)
- Fine-grained access management
- http://www.nsdl.org
23AGU Digital Archive
- American Geophysical Union (a publisher) is developing a system for long-term preservation (20, 50, even 100 yrs)
- Read the files
- Understand the structure of the files
- Ensure authentic copy of the work
- Selected Fedora based on an extensive list of requirements
- Emphasis on having good metadata
- Descriptive (author, title, volume, ...)
- Technical (formats, versions, ...)
- Administrative (rights, events, audits, ...)
24Preservation of University Records
- Tufts and Yale received a grant from the National Historical Publications and Records Commission (NHPRC)
- To synthesize electronic records preservation research with digital library repository research in an effort to develop systems capable of preserving university electronic records at both institutions
- To test the potential of Fedora to serve as the architecture for an electronic records preservation system.
25www.fedora.info