WMS, RUcore and Fedora Mini-Conference - PowerPoint PPT Presentation

Slides: 28
Provided by: rja32
Transcript and Presenter's Notes

Title: WMS, RUcore and Fedora Mini-Conference


1
WMS, RUcore and Fedora Mini-Conference
  • Wednesday Morning
  • Greetings and Introduction - Grace
  • Collaboration and Architecture Overview - Ron
  • RUcore Data Model - Grace
  • WMS Tutorial - Mary Beth, Kalaivani, Sharon
  • Lunch (box lunch in conference room)
  • Wednesday Afternoon
  • Hands-On Experience - Mary Beth, Kalaivani,
    Sharon
  • Feedback from WMS sessions
  • Collaboration Discussion - All

2
WMS, RUcore and Fedora Mini-Conference
  • Thursday Morning
  • Brief Recap - Ron
  • WMS architecture - Yang
  • User Interface, Search engine and collections -
    Chad
  • Management services - Ron
  • Lunch (on your own)
  • Thursday Afternoon
  • Further collaboration discussion
  • Wrap-up and next steps

3
Possible Areas for Collaboration
  • Sharing Content
  • Exchange, harvesting
  • Federated Searching
  • Fedora Experimentation
  • Relationship services
  • Directory ingest
  • Use of xacml
  • Very large files
  • Event management
  • Data Registries
  • File formats
  • Content Models
  • Software Development
  • Requirements
  • Sharing software
  • Joint development
  • Life cycle support

4
Fedora Enterprise Architecture - Major Goals
2007 through 2009
  • Paradigm Focus
  • Scholarly Communication Collaboration
  • Libraries and Museums Access and Publishing
  • Infinite Scalability
  • Size of and number of objects
  • Capacity and throughput (e.g. ingest 20TB a day)
  • Life cycle preservation
  • Trust Model
  • Transactions - Begin/Commit
  • Transactions across repositories
  • Enable graph based objects (compound objects)

5
Persistence and Layered Architecture
Applications
Data
6
Layered Architecture - RUcore
Applications and Portals (NJDH, RUcore, workflow,
etc)
Middleware Services (searching, alerting,
integrity, etc)
Fedora Core Framework
FOXML Datastreams
7
RUcore - How it Works
RUCORE Portal
NJ Digital Highway
Custom Portals
Dissertations
User, Collection, Preservation Services
Workflow Management System
Fedora Repository Service
Faculty Submissions
Digital Object Repository (Fedora)
Digital Object Ingest
8
Simple and Compound Objects
Compound Object - Graph Model
Article Object (Simple)
Persistent ID
IsAnnotationOf
article
Metadata
Behaviors (Disseminators)
Data streams
IsAnnotationOf
SMAP1 StrMap (TOC)
A2
DJVU1- presentation
PDF1 - presentation
XML1 OCR text
A1
ARCH1- Archival master (tiffs of each page)
9
Collections In RUcore
  • A digital collection is simply a grouping of
    objects according to some criteria.
  • Types of digital collections in RUcore
  • Explicit: A digital collection whose object
    membership is specified explicitly within the
    descriptive metadata.
  • Dynamic: A digital collection of objects which
    are grouped according to user-specified criteria.
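
The two collection types can be illustrated with a minimal sketch. The class names, the `member_of` field, and the use of a predicate function as a stand-in for "user specified criteria" are assumptions made for illustration; they are not taken from the RUcore code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class DigitalObject:
    pid: str
    title: str
    # Explicit membership: collection IDs named in the descriptive metadata.
    member_of: Set[str] = field(default_factory=set)


@dataclass
class Collection:
    collection_id: str
    # Dynamic membership: a predicate standing in for stored search criteria.
    criteria: Optional[Callable[[DigitalObject], bool]] = None

    def members(self, objects: List[DigitalObject]) -> List[DigitalObject]:
        # Objects that name this collection in their metadata.
        explicit = [o for o in objects if self.collection_id in o.member_of]
        if self.criteria is None:
            return explicit
        # Objects that match the criteria and are not already listed.
        dynamic = [o for o in objects if o not in explicit and self.criteria(o)]
        return explicit + dynamic
```

A collection with no criteria behaves as a purely explicit collection; one with criteria and no explicit members behaves as a purely dynamic one.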

10
Using Explicit and Dynamic Collections
  • Personal Collections
  • Department Collections
  • Including Faculty Personal collections (e.g.
    preprints, reports, etc)
  • ETDs for the Department
  • Centers and Grant Funded Research
  • New Jersey Digital Highway
  • Center for Remote Sensing and Spatial Analysis
    (CRSSA): Access and preservation of GIS
    resources related to New Jersey

11
RUcore Collection Architecture
Circles: collection objects. Rectangles: content
objects.
RUCORE
NJDH (Grant Project)
Solid line: explicit membership. Dashed line:
dynamic membership.
Rutgers University Libraries
Rutgers University
Eagleton Archive
Centers/ Departments
General Collections
Special Collections
12
Collection Architecture - Lefty
RUCORE
NWestern (1782.1)
RUL (1782.1)
Center/Dept Collections
RU ETDs
FacColl One
FacColl Two
Dept. ETDs
  • http://hdl.rutgers.edu/1782.1/NorthwesternU.collection.165
  • http://hdl.rutgers.edu/1782.1/PennStateUniv.collection.164
  • http://hdl.rutgers.edu/1782.1/PrincetonUniv.collection.166

Solid line: explicit membership. Dashed line:
dynamic membership.
13
Management Services (incl. Collection and
Preservation)
  • Management
  • Super-user editing (handles, datastreams,
    metadata)
  • Purging an object
  • Export (foxml, mets)
  • Collections
  • Collection administration
  • Statistics
  • Preservation
  • Creation of archival master
  • Creation of persistent ID (handle)
  • Checksum verification

14
Management Services
  • Access to individual objects is provided by a
    special search portal using the same indexes as
    the public search but providing Fedora API
    management functionality
  • Viewing, Exporting and/or purging objects
  • Editing metadata, adding/changing datastreams
  • Validating objects, checking audit trails,
    testing signatures
  • There is a special Fedora database search
    allowing access to all objects whether or not
    they are members of an active collection.

15
Collection Administration
  • Edit collection information
  • Add parents to a collection
  • Add dynamic search terms to a collection
  • Generate an XML structure map

16
Collections - Indexing and Ingest
  • Active Collections may be indexed individually or
    all together at any time, though this is
    typically done using a nightly cron job.
  • Ingest is done through the management API and is
    typically called by the WMS program, but may be
    called directly from the management interface as
    well.

17
Preservation - Alerting
  • All Fedora API management functions trigger
    alerting messages, are stored in the Fedora audit
    trails, and are registered in the collection
    statistics database.
  • Statistics are kept for all object downloads as
    well as editing activities and may be accessed at
    collection or repository levels.

18
Preservation - PIDs and Handles
  • Handles are normally created as part of the
    ingest process, but may be manually created,
    changed, or purged on a per object basis using
    the management interface.
  • Three global registries for RU:
  • 1782.1: Rutgers University Libraries
  • 1782.2: Rutgers University
  • 1782.3: NJ Digital Highway
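
As a rough illustration of how the three prefixes relate to resolvable handle URLs, the sketch below builds a URL from a prefix and suffix. The resolver host is the one that appears in the collection URLs on the earlier slide; the function names are hypothetical and this is not the management interface's actual code.

```python
# Prefix-to-registry table, as listed on the slide.
HANDLE_PREFIXES = {
    "1782.1": "Rutgers University Libraries",
    "1782.2": "Rutgers University",
    "1782.3": "NJ Digital Highway",
}

# Resolver host as seen in the collection URLs on the earlier slide.
RESOLVER = "http://hdl.rutgers.edu"


def handle_url(prefix: str, suffix: str) -> str:
    """Build a resolvable URL for a handle, rejecting unknown prefixes."""
    if prefix not in HANDLE_PREFIXES:
        raise ValueError(f"unknown handle prefix: {prefix}")
    return f"{RESOLVER}/{prefix}/{suffix}"


def registry_for(handle: str) -> str:
    """Return the registry name for a handle of the form 'prefix/suffix'."""
    prefix = handle.split("/", 1)[0]
    return HANDLE_PREFIXES[prefix]
```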

19
Object Integrity - Verifying Checksums
  • Archival datastreams have SHA1 checksums, created
    during the WMS pipeline process, as well as
    filesize data, stored in the technical metadata
    section of each object.
  • SHA1 checksums are tested using the sha1sum
    checking algorithm in conjunction with a
    management function that polls the repository and
    extracts sha1sum character strings from the
    techMD of individual objects or groups of
    objects. It has a calendar feature that allows it
    to be run as a cron job on a subset of objects for
    each day of the week, with result reports emailed
    to the appropriate data managers.
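
The checking process described above can be sketched as follows. This is a minimal illustration, not the RUcore management function: the `records` dictionary stands in for values extracted from each object's techMD, the round-robin weekday split is an assumed way to pick a daily subset, and emailing the report is left out.

```python
import datetime
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def todays_subset(pids, day=None):
    """Pick the objects scheduled for a weekday (0 = Monday, 6 = Sunday)."""
    if day is None:
        day = datetime.date.today().weekday()
    return [pid for i, pid in enumerate(sorted(pids)) if i % 7 == day]


def verify(records, day=None):
    """Return the PIDs whose file size or SHA-1 checksum does not match.

    records maps pid -> (path, expected_sha1, expected_size); in a real
    system these values would come from the technical metadata.
    """
    failures = []
    for pid in todays_subset(records, day):
        path, expected_sha1, expected_size = records[pid]
        size_ok = os.path.getsize(path) == expected_size
        if not size_ok or sha1_of_file(path) != expected_sha1:
            failures.append(pid)
    return failures
```

Run daily from cron, `verify(records)` covers each object once per week; the returned list is what a report to the data managers would contain.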

20
Certification as a Trusted Repository
  • Ultimately, we want to become certified as a
    trusted repository. There are four major areas,
    each shown with an example requirement:
  • A. Organization: Repository staff have skills
    appropriate to their duties.
  • B. Repository Functions: Repository actively
    monitors Archival Information Package integrity.
  • C. Designated Community: Repository defines its
    Designated Community.
  • D. Technologies: Repository has technologies to
    monitor security.
  • RLG/NARA draft: An Audit Checklist for the
    Certification of Trusted Digital Repositories
21
Preservation Services Architecture
Preservation Portal
Preservation Services
. . .
Alerting
Migration
Statistics
Monitoring
Event Messaging
Preservation Integrity
Preservation Monitoring
Fedora Repository Service
Content Models
Digital Object Repository
Format Registry
Fedora Service Framework
22
Content Models (Content Model Dissemination
Architecture, CMDA)
  • The CM object specifies constraints on the
    digital object (DO)
  • MIME type and format
  • Min/max of number of datastreams
  • Whether multiple datastreams are ordered
  • The CM is used to determine runtime behavior
  • On ingest, Fedora validates DO based on CM
    constraints
  • Disseminators are not bound into the DO
  • Run time binding occurs through the CM object and
    the rels-ext datastream
  • The CM can point to a format registry
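
A content model's constraints and the ingest-time check can be sketched roughly as below. The `DatastreamRule` class, the role names, and the error strings are assumptions for illustration; Fedora's actual CMDA validation is not reproduced here, and the `ordered` flag is recorded but not enforced.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DatastreamRule:
    mime_type: str          # required MIME type for this datastream role
    min_count: int = 0      # minimum number of datastreams
    max_count: int = 1      # maximum number of datastreams
    ordered: bool = False   # whether multiple datastreams are ordered


def validate(datastreams: List[Tuple[str, str]],
             content_model: Dict[str, DatastreamRule]) -> List[str]:
    """Check (role, mime_type) pairs against the content model's rules."""
    errors = []
    for role, rule in content_model.items():
        matching = [mime for r, mime in datastreams if r == role]
        if not rule.min_count <= len(matching) <= rule.max_count:
            errors.append(
                f"{role}: expected {rule.min_count}-{rule.max_count} "
                f"datastreams, found {len(matching)}"
            )
        for mime in matching:
            if mime != rule.mime_type:
                errors.append(
                    f"{role}: MIME type {mime} does not match {rule.mime_type}"
                )
    return errors
```

An empty result means the digital object satisfies the model and ingest could proceed.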

23
Content Models, Formats, and Disseminators
24
Events and Outcomes
  • An event is an
  • . . . action that involves at least one object,
    agent, and/or rights entity (PREMIS).
  • . . . occurrence that is significant to the
    performance of a task
  • Event outcome: a situation or state that follows
    an event and is a result of the event.

25
Fedora Event Management
  • Generic Framework
  • Events can have messages which are associated
    with all types of services (preservation,
    collection, user, etc)
  • Messages represent events with actions and
    outcomes
  • Fedora will provide a middle-ware messaging
    solution based on open-source Java Messaging
    Service (JMS)
  • Fedora Working Group Focus
  • Preservation events are atomic (i.e. associated
    with a Fedora API)
  • The event message will be based on the PREMIS
    event entity
  • Initial types: ingest, delete, modify,
    fixityCheck
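
The messaging idea can be illustrated with a small in-memory publish/subscribe sketch. It is a stand-in for a JMS topic, not JMS itself; the `EventBus` class and its methods are hypothetical names.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for a topic: handlers subscribe by event type."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        # Register a callable to receive messages of one event type.
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, message):
        # Deliver the message to every subscriber of this event type
        # and report how many handlers received it.
        handlers = self._subscribers[event_type]
        for handler in handlers:
            handler(message)
        return len(handlers)
```

In this sketch, services such as alerting and statistics would each subscribe to the event types they care about (ingest, delete, modify, fixityCheck), so the publisher does not need to know which services exist.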

26
The Event Message
  • Event message structure
  • The message payload will be xml-based and use the
    PREMIS event entity semantic units
  • Global identifiers (URIs) will be used for event
    type and outcome
  • An example might look like the following

<event>
  <eventIdentifier>
    <eventIdentifierType>Rucore event</eventIdentifierType>
    <eventIdentifierValue>30169</eventIdentifierValue>
  </eventIdentifier>
  <eventType>info:premis/preservation/event/ingest</eventType>
  <eventDateTime>2006-07-16T19:20:30</eventDateTime>
  <eventDetail>(to be used for general information)</eventDetail>
  <eventOutcomeInformation>
    <eventOutcome>info:premis/preservation/outcome/success</eventOutcome>
    <eventOutcomeDetail>(more text)</eventOutcomeDetail>
  </eventOutcomeInformation>
  <linkingAgentIdentifier>rutgers-lib:200</linkingAgentIdentifier>
  <linkingAgentIdentifier>rutgers-lib:400</linkingAgentIdentifier>
  <linkingObjectIdentifier>rutgers-lib:4291</linkingObjectIdentifier>
</event>
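
An event message of this shape could be built and read back as sketched below. The element names follow the PREMIS event semantic units used in the example; the `info:premis/...` URI form, the "Rucore event" identifier type, and the function names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def build_event(event_id, event_type, date_time, outcome, agents, objects):
    """Serialize a PREMIS-style event message as an XML string."""
    event = ET.Element("event")
    ident = ET.SubElement(event, "eventIdentifier")
    ET.SubElement(ident, "eventIdentifierType").text = "Rucore event"
    ET.SubElement(ident, "eventIdentifierValue").text = str(event_id)
    # Assumed URI scheme for event types and outcomes.
    ET.SubElement(event, "eventType").text = (
        f"info:premis/preservation/event/{event_type}"
    )
    ET.SubElement(event, "eventDateTime").text = date_time
    info = ET.SubElement(event, "eventOutcomeInformation")
    ET.SubElement(info, "eventOutcome").text = (
        f"info:premis/preservation/outcome/{outcome}"
    )
    for agent in agents:
        ET.SubElement(event, "linkingAgentIdentifier").text = agent
    for obj in objects:
        ET.SubElement(event, "linkingObjectIdentifier").text = obj
    return ET.tostring(event, encoding="unicode")


def summarize(xml_text):
    """Extract the event type, outcome, and linked objects from a message."""
    event = ET.fromstring(xml_text)
    return {
        "type": event.findtext("eventType").rsplit("/", 1)[-1],
        "outcome": event.findtext(
            "eventOutcomeInformation/eventOutcome"
        ).rsplit("/", 1)[-1],
        "objects": [e.text for e in event.findall("linkingObjectIdentifier")],
    }
```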
27
Event Management - Ingest (Using the
publisher/subscriber model)
User Input
JMS Topic Queue
<eventType>ingest<>
<eventType>delete<>
<eventType>
<eventType>
Workflow Management System
<eventType>
Digital Object Repository (Fedora)
Digital Object Ingest