Fedora™ and Repository Implementation at UVa

1
Fedora™ and Repository Implementation at UVa
Leslie Johnston, UVa Library
DASER Summit, November 22, 2003
2
Fedora™ History
  • Research (1997-present)
      • DARPA- and NSF-funded research project at the Cornell
        University Digital Library Research Group.
      • Reference implementation developed at Cornell.
  • First Application (1999-2001)
      • University of Virginia Library Digital Library
        Research and Development prototype.
      • Scale/stress testing for 10,000,000 objects.
  • Open Source Software (2002-present)
      • The Andrew W. Mellon Foundation granted Virginia and
        Cornell $1 million to develop a
        production-quality Fedora system.
      • Fedora 1.0 released in May 2003.

3
What is Fedora™?
  • Fedora is a Digital Asset Management
    architecture, upon which many types of Digital
    Library systems might be built.
  • Fedora is based on object models that represent
    data objects (units of content) or collections of
    data objects.
  • The objects contain linkages between datastreams
    (internally managed or external media files),
    metadata (inline or external), and behaviors that
    are themselves code objects and link to
    disseminators (processes, mechanisms, and
    external software). A data object subscribes to a
    pair of behavior objects.
  • Object models can be thought of as containers
    that give a useful shape to information poured
    into them: if the information fits the container,
    it can immediately be used in predefined ways.

4
Fedora™ Data Object Components
  • Datastreams: represent content and metadata.
  • PID: persistent identifier, unique to the
    Repository.
  • System Metadata: metadata that the Repository
    keeps.
  • Disseminators: bindings to objects that can
    deliver software processes that can be used with
    the datastreams.

5
Fedora™ Data Objects
  • Digital object identifier: Persistent ID (PID)
      • PID uva-lib:100
  • Service view: methods for disseminating content
      • Default Disseminator
      • Extension: Image Disseminator
  • Content view: set of data and metadata items
      • Datastream (item): Image (mrsid)
      • Datastream (item): DC (xml)
      • Datastream (item): Thumbnail (jpeg)
  • Internal view: key metadata necessary to manage
    the object
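The three views on this slide (service, content, internal) can be sketched as a small data structure. This is only an illustration in Python, not Fedora's own Java classes or XML schema: the class names, field names, method names, and the `state` entry are all hypothetical, and the PID is written with a namespace colon on the assumption that the transcript dropped it.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    """One item in the content view: a piece of content or metadata."""
    label: str
    format: str  # format label as printed on the slide, not a MIME type


@dataclass
class Disseminator:
    """One entry in the service view: a named group of methods."""
    name: str
    methods: list = field(default_factory=list)


@dataclass
class DataObject:
    """A data object: an identifier plus content, service, and internal views."""
    pid: str
    datastreams: list = field(default_factory=list)      # content view
    disseminators: list = field(default_factory=list)    # service view
    system_metadata: dict = field(default_factory=dict)  # internal view

    def list_methods(self):
        """Report which disseminations this object can provide."""
        return [f"{d.name}.{m}" for d in self.disseminators for m in d.methods]


obj = DataObject(
    pid="uva-lib:100",
    datastreams=[
        Datastream("Image", "mrsid"),
        Datastream("DC", "xml"),
        Datastream("Thumbnail", "jpeg"),
    ],
    disseminators=[
        # Method names below are invented for illustration.
        Disseminator("Default Disseminator", ["view_dublin_core"]),
        Disseminator("Image Disseminator", ["get_thumbnail", "get_image"]),
    ],
    system_metadata={"state": "active"},  # hypothetical system metadata
)

print(obj.list_methods())
```

Keeping the service view separate from the content view mirrors the slide: the datastreams say what the object holds, while the disseminators say what can be done with it.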
6
[Diagram: a Data Object, a Behavior Definition Object, a Behavior Mechanism Object, and a Web Service, linked by a behavior subscription, a behavior contract, and a data contract]
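One way to read the labels on this slide, offered as an assumption rather than a statement of the diagram's exact arrows: a data object subscribes to a behavior definition, a behavior mechanism fulfills that definition (the behavior contract), and the mechanism declares which datastreams it needs from the object (the data contract). The Python sketch below uses hypothetical class and method names and replaces the web service call with a local lookup.

```python
from abc import ABC, abstractmethod


class ImageBehaviorDefinition(ABC):
    """Behavior definition: abstract methods with no implementation."""

    @abstractmethod
    def get_thumbnail(self, datastreams):
        """Return a thumbnail for the object that owns these datastreams."""


class SimpleImageMechanism(ImageBehaviorDefinition):
    """Behavior mechanism: implements the definition (behavior contract)."""

    # Data contract: datastreams the object must supply for this mechanism.
    required_datastreams = ("THUMBNAIL",)

    def get_thumbnail(self, datastreams):
        missing = [n for n in self.required_datastreams if n not in datastreams]
        if missing:
            raise KeyError(f"data contract not met, missing: {missing}")
        # A real mechanism would hand this off to a web service;
        # this sketch simply returns the stored location.
        return datastreams["THUMBNAIL"]


# Behavior subscription: the data object names the definition it follows.
data_object = {
    "pid": "demo:1",  # hypothetical identifier
    "subscribes_to": ImageBehaviorDefinition,
    "datastreams": {"THUMBNAIL": "thumb.jpg", "DC": "dc.xml"},
}

mechanism = SimpleImageMechanism()
if isinstance(mechanism, data_object["subscribes_to"]):
    print(mechanism.get_thumbnail(data_object["datastreams"]))
```

Separating the definition from the mechanism means a different mechanism could be swapped in later without changing the objects that subscribe to the definition.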
7
Fedora™ Service Interfaces
  • Management Service (API-M)
      • Ingest - XML-encoded object submission
      • Create - interactive object creation via API
        requests
      • Maintain - interactive object modification via
        API requests
      • Validate - application of integrity rules to
        objects
      • Identify - generate unique object identifiers
      • Security - authentication and access control
      • Preserve - automatic content versioning and audit
        trail
      • Export - XML-encoded object formats
  • Access Service (API-A and API-A-LITE)
      • Search - search repository for objects
      • Object Reflection - what disseminations can the
        object provide?
      • Object Dissemination - request a view of the
        object's content
  • OAI-PMH Provider Service
      • OAI-DC records
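As an illustration of the OAI-PMH provider service, the sketch below builds a harvesting request URL in Python. The `verb` and `metadataPrefix` arguments and the `oai_dc` prefix come from the OAI-PMH protocol itself; the base URL and the helper name `oai_request_url` are hypothetical, and nothing here reflects the actual address of a Fedora repository.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://repository.example.edu/oai"

# Ask the provider for every record in unqualified Dublin Core.
print(oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
# prints http://repository.example.edu/oai?verb=ListRecords&metadataPrefix=oai_dc
```

A harvester would fetch that URL over HTTP and parse the XML response; the fetch is left out so the sketch runs without a network connection.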

8
Fedora™ Distribution Package
  • Open Source (Mozilla Public License)
  • 100% Java (Sun Java J2SDK1.4)
  • Supporting Technologies
      • Apache Tomcat 4.1 and Apache Axis (SOAP)
      • Xerces 2-2.0.2 for XML parsing and validation
      • Saxon 6.5 for XSLT transformation
      • Schematron 1.5 for validation
      • MySQL and Mckoi relational databases
      • Oracle 9i support
  • Deployment Platforms
      • Windows 2000, NT, XP
      • Solaris
      • Linux

9
What Fedora™ Is Not
  • Fedora is not finished: the development process
    is only halfway complete.
  • Version 1.2 releases on December 10, 2003.
  • The scheduled date for implementation of all
    features outlined in the grant-funded project is
    early 2005.
  • Fedora is the underlying architecture for a
    digital repository, not a complete management,
    indexing, discovery, and delivery application.
  • Fedora by itself is not the UVa Library's Digital
    Library system - Fedora is the "plumbing" for our
    first phase production Central Digital
    Repository.

10
Process for Repository Development
  • Fedora developers met with content and format
    specialists, application developers, and user
    service librarians to understand what media files
    we have and how our users expect to find them and
    use them.
  • Priorities were set for phased development and
    content migration by format type:
      • First Phase: Electronic Texts, EAD, and Images
      • Second Phase: Datasets and GIS
      • Third Phase: Digital Audio and Video

11
Process for Repository Development
  • Specifications were set for:
      • Datastreams (formats; variation in deliverables:
        EAD vs. TEI vs. Ebooks, page images vs.
        documentary images)
      • Metadata
      • Discovery functionality and interface (simple and
        advanced searching, metadata vs. full-text
        searching, presentation of result sets, etc.)
      • Delivery (must support static and on-the-fly file
        delivery, and varied end user download and
        printing requirements)

12
Repository Prototype
  • A prototype discovery interface was released for
    review by Library staff during summer 2003.
  • Almost 150 comments on functionality, user
    interface, and proposed additional features were
    collected.
  • The comments were collated into categories which
    were prioritized by Library department heads,
    user services staff, and developers for
    implementation into a first release, scheduled
    for early 2004.

13
Proposed Searching Services
14-31
(No Transcript)
32
Issues - Standards
  • Collate, standardize, and document in-house
    production standards.
      • Slide and photograph scanning; book page
        scanning; and full-text markup
      • <http://www.lib.virginia.edu/digital/reports/best_practices.html>
  • Develop UVa DescMeta XML element set, and
    document minimum metadata elements and best use
    practices.
      • <http://www.lib.virginia.edu/digital/reports/metadata.html>
      • <http://www.lib.virginia.edu/digital/reports/DLMRPGroupReport.htm>
  • Develop the General Descriptive Modeling Scheme
    (GDMS) XML encoding standard to describe complex,
    structured collections.
      • <http://www.lib.virginia.edu/digital/resndev/gdms.html>
  • Recommend the in-house standards to faculty with
    digitization projects through our consulting
    services.
  • Born-digital faculty projects are selected for
    collection by the Library, assuring a smoother
    collection process.

33
Issues - Authoring Tools
  • User Collection Tool
      • Web-based database for the organization and
        annotation of personal media collections.
      • <http://iris.lib.virginia.edu/dmmc/collectiontool/>
  • GDMS Tool
      • XML authoring tool to create documents using a
        locally defined XML encoding standard to
        represent structured collections of images and
        metadata.
      • <http://www.lib.virginia.edu/digital/resndev/gdms.html>
  • A Data Workbench is planned to create
    relationships between objects and prepare files
    for ingest into the Repository.
  • A Scholarly Object Workbench is planned for
    faculty to use in creating their research and
    instructional resources in formats that can be
    more easily collected by the Library.

34
Upcoming - Modeling Virginia
  • Collaboration between Systems Engineering,
    Environmental Sciences, and the Library.
  • Weather datasets, traffic datasets, and the 2000
    census.
  • Proof-of-concept: Hampton Roads area.
  • Applying for funding for the entirety of
    Virginia.
  • Will drive the development of object models and
    disseminators for discovery and download of
    variables across datasets with DDI codebooks.

35
Upcoming - Aggregation Objects
  • On-the-fly collection objects where the content
    datastream contains rules, formatted as XQuery
    or XPath statements, rather than explicit
    collection relationships.
  • Child objects of the collection are assembled at
    dissemination time.
  • Disseminators can include such functions as
    building a full-text index, rendering a search
    page, etc.
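The idea of a rule-based collection can be sketched in Python with the standard library's limited XPath support. Everything in this sketch is an assumption made for illustration: the registry XML layout, the attribute names, the PIDs, and the `resolve_members` helper are hypothetical, not Fedora structures.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry of repository objects.
REGISTRY_XML = """
<registry>
  <object pid="demo:1" type="image" collection="maps"/>
  <object pid="demo:2" type="text" collection="maps"/>
  <object pid="demo:3" type="image" collection="letters"/>
</registry>
"""


def resolve_members(registry, rule):
    """Evaluate the collection's rule and return the matching PIDs."""
    return [element.get("pid") for element in registry.findall(rule)]


registry = ET.fromstring(REGISTRY_XML)

# The collection object stores this rule instead of a list of children.
maps_rule = ".//object[@collection='maps']"

# Membership is computed only when a dissemination is requested.
print(resolve_members(registry, maps_rule))
# prints ['demo:1', 'demo:2']
```

Because the rule is evaluated at request time, an object added to the registry later joins the collection without the collection object itself being edited.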

36
Upcoming - Fedora™ 1.2
  • Open Fedora APIs
      • Repository as web services (REST and SOAP
        bindings); WSDL interface definitions
  • Flexible Digital Object Model
      • Content View: objects as a bundle of items (content
        and metadata)
      • Service View: objects as a set of service methods
        (behaviors)
      • Extensible functionality by associating services
        with objects
  • Repository System
      • Core Services: Management, Access/Search, OAI-PMH
      • Storage: XML object store; relational DB object
        cache; relational DB object registry
      • Mediation - auto-dispatching to distributed web
        services for content transformation
      • Auto-Indexing: system metadata and DC record of
        each object
      • HTTP Basic Authentication and Access Control
      • Built-in disseminator services: XSLT transformation,
        image manipulation, XML-to-PDF
  • Content Versioning
      • Automatic version control (saves version of
        content/metadata when modified)
      • Enables date-time stamped API requests (see
        object as it looked at a point in time)
37
Fedora™ - December 2003-January 2005
  • Fedora Object XML (FOXML)
      • Internal storage format: direct expression of
        the Fedora object model
  • Better support for relationships (kinship
    metadata)
  • Better support for audit trail (event history)
  • Format identifiers for dynamic service binding
  • Shibboleth authentication
  • Policy Enforcement
      • XACML expression language
      • Fedora policy enforcement module
  • Web interface for easy content submission
  • Batch object modification utility
  • Administrative Reporting
  • Object Event History (ABC/RDF disseminations)
  • Better support for collections
  • New ingest and export formats (METS 1.3, DIDL)

38
Contact Information
  • www.fedora.info
  • www.lib.virginia.edu/digital/