An Evaluation and Implementation for capturing, - PowerPoint PPT Presentation

Provided by: sha7170
1
An Institutional Repository System
(An Evaluation and Implementation for capturing,
describing and publishing digital works)
S. Shashi Nath, IKM, Jan-2004
2
Agenda
  • Project Title and Goal
  • Introduction
  • What is DSpace
  • A Little History and Milestones
  • What it Does
  • What can we put in
  • DSpace information model
  • Features
  • Architecture
  • DSpace Federation
  • Implementation
  • Wish-List
  • Opinion
  • Demo

3
Project Title and Goal
  • Title
  • Implementing the DSpace system to capture,
    describe and publish digital works
  • Goal
  • To demonstrate the implementation of the latest
    version of the DSpace system to
  • Capture and describe digital works using a
    submission workflow module
  • Distribute digital works over the web through a
    search and retrieval system
  • Preserve digital works over the long term

4
Introduction
  • "Where is the wisdom we have lost in knowledge?
    Where is the knowledge we have lost in
    information?" - T.S. Eliot
  • Institutional Repository: a container for an
    institution's scholarly output.
  • Mission is to provide reliable, long-term access
    to managed digital resources to its designated
    community, now and in the future.
  • Key Drivers
  • Open Access Movement: Self-Archiving
  • Open Source Software Development
  • Open Archives Initiative: Standards for Open
    Access
  • Focus on Long-Term Access and Preservation

5
About DSpace
  • What is DSpace
  • DSpace is a specialized digital asset
    management/Institutional Repository system
    developed by the MIT Libraries with support from
    the HP-MIT Alliance. It is designed to
  • Provide a platform to build an Institutional
    Repository
  • Support long-term preservation of digital
    material
  • Allow creation, indexing and searching of
    metadata
  • Enable institutions to capture and describe
    digital works using a custom submission workflow
    module
  • Distribute an institution's digital works over
    the web through a search and retrieval system
  • Be distributable as open source (available under
    the BSD open source license to other research
    institutions to run as-is, or to modify and
    extend as needed)

6
About DSpace
A Little History and Milestones
  • March 2000: HP awarded $1.8 million to the MIT
    Libraries for an 18-month collaboration to build
    DSpace.
  • HP Labs and MIT Libraries released the system
    worldwide on Nov. 4, 2002.
  • Who's working with DSpace
  • Cambridge University, Columbia University,
    Cornell University, MIT, University of Ohio,
    University of Rochester, University of Toronto,
    University of Washington and many other
    universities
  • Milestones
  • Early adopters beta test DSpace: March - Sept.,
    2002
  • DSpace adds new communities at MIT: September 30,
    2002
  • DSpace content becomes publicly accessible:
    September 30, 2002
  • DSpace launch event: November 4, 2002
  • DSpace source code released under Open Source BSD
    license: November 4, 2002
  • Early federators begin to collaborate with
    DSpace: Fall, 2002
7
What it Does
  • Captures
  • Digital research material in any format
  • Directly from creators (faculty)
  • Large-scale, stable, managed long-term storage
  • Describes
  • Descriptive, technical, rights metadata
  • Persistent identifiers
  • Distributes
  • Via WWW, with necessary access control
  • Preserves
  • Bitstream guaranteed

8
What can we put in
Possible DSpace Content
  • Articles
  • Preprints, e-prints
  • Technical Reports
  • Working Papers
  • Conference Papers
  • E-theses
  • Audio/Video
  • Datasets: statistical, geospatial etc.
  • Images: visual, scientific
  • Teaching material: lecture notes,
    visualizations, simulations
  • Digitized library collections
9
DSpace information model

10
DSpace information model
  • Communities
  • Departments, Labs, Research Centers, Schools
  • Collections (in communities)
  • Distinct groupings of like items
  • Items (in collections)
  • Logical content objects
  • Receive persistent identifier
  • Bitstreams (in items)
  • Individual files
  • Receive preservation treatment
  • Versioning: item versions can be
  • All instances of a work in different formats
  • E.g. the XML, PDF, and PostScript versions
  • All editions of a work over time
  • Metadata lists all available versions of items
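
The containment hierarchy on this slide (communities hold collections, collections hold items, items hold bitstreams) can be sketched as plain data classes. This is only an illustration of the model as described here, not DSpace's actual classes; all class and field names, the sample titles, and the handle suffix are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    """An individual file inside an item (hypothetical fields)."""
    name: str
    mime_type: str


@dataclass
class Item:
    """A logical content object that receives a persistent identifier."""
    title: str
    handle: str  # illustrative handle string, not a registered one
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Collection:
    """A distinct grouping of like items."""
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    """A department, lab, research center or school."""
    name: str
    collections: List[Collection] = field(default_factory=list)


def count_bitstreams(community: Community) -> int:
    """Count every file stored anywhere below a community."""
    return sum(
        len(item.bitstreams)
        for collection in community.collections
        for item in collection.items
    )


def build_example() -> Community:
    """One item held in three formats, as in the versioning bullet."""
    report = Item(
        title="Example technical report",
        handle="1875/1",  # hypothetical suffix
        bitstreams=[
            Bitstream("report.xml", "text/xml"),
            Bitstream("report.pdf", "application/pdf"),
            Bitstream("report.ps", "application/postscript"),
        ],
    )
    reports = Collection(name="Technical Reports", items=[report])
    return Community(name="Example Department", collections=[reports])
```

Calling `count_bitstreams(build_example())` walks the whole tree and counts the three format variants of the single item.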

11
Features
  • User Interface
  • Web-based, for submitters, end users and system
    administrators
  • Search and retrieval of items by browsing or
    searching the metadata
  • Workflow
  • Enables differing submission workflows for
    communities
  • Models "e-people" who have "roles" in the
    workflow of a particular community in the context
    of a given collection
  • Open Archives Initiative (OAI)
  • OAI-PMH 2.0 compatible; uses the OCLC OAICat
  • Persistent Identifiers (Handles)
  • Implements CNRI handles as the persistent
    identifier associated with each item
  • Access Control
  • DSpace allows contributors to limit access to
    items in DSpace, at both the collection and the
    individual item level.
  • Metadata Schema
  • Utilises Qualified Dublin Core.
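
To make the OAI-PMH 2.0 bullet concrete, here is a minimal sketch of how a harvester forms its requests: a base URL plus a `verb` argument and any further arguments. `Identify`, `ListRecords` and the `oai_dc` metadata prefix come from the OAI-PMH 2.0 protocol; the base URL and the helper name are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL; a real repository publishes its own.
BASE_URL = "http://repository.example.org/oai/request"


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH GET request URL for the given verb."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Ask the repository to describe itself.
identify_url = oai_request_url(BASE_URL, "Identify")

# Ask for all records in unqualified Dublin Core.
list_records_url = oai_request_url(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc"
)
```

The helper only builds URLs; fetching and parsing the response is left out so the sketch stays independent of any particular server.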

12
Features
  • Preservation
  • "Bit preservation", where a digital file is
    carefully preserved exactly as it was created
    without the slightest change (Known and
    Unsupported formats)
  • "Functional preservation", where the digital file
    is kept usable as technology formats, media, and
    paradigms evolve (Supported formats)
  • Technology platform
  • Designed to run on the UNIX platform; original
    code is in Java.
  • Includes an RDBMS (PostgreSQL), a web server and
    Java servlet engine (Apache and Tomcat), Jena (an
    RDF toolkit from HP Labs), OAICat from OCLC,
    Lucene 1.2 (index/search) and several other
    useful libraries.
  • Search and Retrieval
  • Items are described using a qualified version
    of the Dublin Core metadata schema. These
    descriptions are entered into a relational
    database, which is used by the search engine to
    retrieve items.
  • Browsing through title, date and author indices;
    keyword searching
  • Indexed by Search Engines
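
"Bit preservation" as described on this slide amounts to being able to show that a file has not changed since it was deposited. One common way to check that is to record a checksum at deposit time and recompute it later. The sketch below shows the general technique only; the choice of SHA-256 and the function names are assumptions, not a description of what DSpace itself does.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest that identifies the exact byte content."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_checksum: str) -> bool:
    """Compare the current content against the checksum stored at deposit."""
    return checksum(data) == recorded_checksum


# At deposit time: store the checksum alongside the file.
deposited = b"%PDF-1.4 example content"
recorded = checksum(deposited)

# At audit time: recompute and compare.
still_intact = is_unchanged(deposited, recorded)
was_altered = not is_unchanged(deposited + b" ", recorded)
```

Even a single added byte changes the digest, which is what makes the check useful for "without the slightest change".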

13
Architecture
14
Architecture
  • The DSpace architecture is a straightforward
    three-layer architecture, including storage,
    business, and application layers, each with a
    documented API to allow for future customization
    and enhancement.
  • The storage layer is implemented using the
    file system, as managed by PostgreSQL database
    tables.
  • The business layer is where the
    DSpace-specific functionality resides,
    including the workflow, content management,
    administration, and search and browse modules.
    Each module has an API to allow DSpace adopters
    to replace or enhance that function as desired.
  • The application layer covers the interfaces
    to the system: the web UI and batch loader, in
    particular, but also the OAI support and Handle
    server for resolving persistent identifiers to
    DSpace items.
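
The layering described above can be pictured with a toy example in which each layer talks only to the one directly below it. This is a sketch of the idea, not DSpace code: the class names, method names and the in-memory dictionary standing in for the file system and database are all assumptions.

```python
from typing import Dict, List


class StorageLayer:
    """Stands in for the file system and database tables."""

    def __init__(self) -> None:
        self._titles: Dict[str, str] = {}

    def save(self, identifier: str, title: str) -> None:
        self._titles[identifier] = title

    def all_titles(self) -> Dict[str, str]:
        return dict(self._titles)


class BusinessLayer:
    """Holds repository logic such as submission and search."""

    def __init__(self, storage: StorageLayer) -> None:
        self._storage = storage

    def submit(self, identifier: str, title: str) -> None:
        self._storage.save(identifier, title)

    def search(self, keyword: str) -> List[str]:
        """Return identifiers whose title contains the keyword."""
        needle = keyword.lower()
        return sorted(
            identifier
            for identifier, title in self._storage.all_titles().items()
            if needle in title.lower()
        )


class ApplicationLayer:
    """Stands in for an interface such as the web UI."""

    def __init__(self, business: BusinessLayer) -> None:
        self._business = business

    def handle_search_request(self, keyword: str) -> str:
        matches = self._business.search(keyword)
        return ", ".join(matches) if matches else "no results"


business = BusinessLayer(StorageLayer())
business.submit("1875/1", "Soil chemistry report")  # hypothetical items
business.submit("1875/2", "Lecture notes on chemistry")
ui = ApplicationLayer(business)
```

Because the application layer depends only on the business layer's methods, the storage class could be swapped for a different one without touching the interface code, which is the customization point the slide refers to.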

15
DSpace Federation
  • The DSpace Federation includes minimally all the
    research institutions, libraries, and other
    cultural heritage institutions that are using
    the DSpace digital repository system.
  • Members of the Federation share the following
    goals:
  • Sharing in the development and maintenance of the
    DSpace source code.
  • Developing a critical corpus of content that
    represents the intellectual output of the world's
    leading research institutions.
  • Promoting the continued development of the DSpace
    service through the open source community.
  • Promoting the interoperability of archival
    repositories.
  • Ensuring the long-term preservation of scholarly
    work by complying with published standards and
    supporting national and international initiatives
    to develop standards in this domain.

16
Implementation
  • Installed Prerequisite Software
  1. UNIX-like OS: installed RedHat Linux 9.0
  2. Java SDK 1.3 or later: installed Java SDK 1.4.2
  3. Ant 1.4 or later: installed Jakarta Ant 1.5.1
     (a make tool for Java applications)
  4. Tomcat 4.0: installed Tomcat 5.0 (a web server
     for Java servlets and JSP)
  5. PostgreSQL 7.3 or later: installed 7.3.2 (an
     RDBMS)
  6. Java libraries: activation.jar, mail.jar,
     servlet.jar
  • Installed DSpace
  • Release 1.1.1 (released on 29-Aug-2003; 1.2 due
    in March 2004)
  • Tweaked configuration of DSpace
17
Implementation
  • Installed Handle Server
  • Installed the included Handle server
  • Obtained Handle prefix from CNRI, Handle System
    (1875)
  • Tweaked configuration of DSpace for Handle Server
  • Administered Handle Server and made configuration
    changes
  • Resolved global handles successfully
  • Customized DSpace
  • Changed several JSPs to change the look of the
    repository
  • Added custom content like links, IISc. license
    etc.
  • Ensured OAI Compliance
  • Tweaked DSpace configuration for OAI compliance
  • Registered and tested DSpace with OAI Repository
    Explorer
  • Successfully harvested metadata via OAI
    Repository Explorer
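
A handle is a prefix and a suffix joined by a slash, and a global handle can be resolved through the public proxy at hdl.handle.net. The sketch below shows how such a link is formed. The prefix 1875 is the one named on this slide; the suffix `123` and the function names are made up for illustration.

```python
from typing import Tuple

# Public proxy of the Handle System.
HANDLE_PROXY = "http://hdl.handle.net/"


def split_handle(handle: str) -> Tuple[str, str]:
    """Split 'prefix/suffix' at the first slash."""
    prefix, _, suffix = handle.partition("/")
    if not prefix or not suffix:
        raise ValueError(f"not a valid handle: {handle!r}")
    return prefix, suffix


def handle_url(prefix: str, suffix: str) -> str:
    """Build the proxy URL that resolves a handle."""
    return f"{HANDLE_PROXY}{prefix}/{suffix}"


# Hypothetical item handle under the prefix obtained from CNRI.
example_url = handle_url("1875", "123")
example_parts = split_handle("1875/123")
```

The point of the indirection is that the proxy URL stays stable even if the repository later moves to a different host.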
18
Implementation
  • Ran DSpace via Secure Sockets Layer
  • Installed Tomcat SSL (https protocol) successfully
  • Installed OAI Harvester Plugin
  • Installed OAI Harvester Plugin for DSpace (from
    keplerDspace project)
  • Tweaked configuration such as Data Provider
    details etc.
  • Installed Simple API for XML (SAX) Parser
  • Successfully harvested metadata
  • Full text caching also possible without use of
    Handle Server
  • Working with DSpace
  • Created Communities/Collections (with a view to
    test different file types)
  • Tested WebUI for submission as well as Batch
    Import and Export for Items
  • Explored and administered the full Admin.
    Interface (Groups, Authorizations, E-persons,
    E-Mail Notifications, Workflows, Metadata
    Registry, Bitstream Registry etc.)
  • Tested Search, Advanced Search and Browse features
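
Harvesting with a SAX parser means reading the OAI-PMH response as a stream of element events and picking out the Dublin Core fields of interest. The sketch below does this for titles, using Python's standard `xml.sax` module. It is not the harvester plugin's code: the sample response is hand-written and trimmed (a real response carries more elements), and the identifier, title and creator in it are hypothetical. The XML namespaces are the ones defined by OAI-PMH 2.0 and Dublin Core.

```python
import io
import xml.sax
from typing import List, Optional
from xml.sax.handler import ContentHandler, feature_namespaces

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hand-written, trimmed sample response (hypothetical record).
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1875/123</identifier>
      </header>
      <metadata>
        <oai_dc:dc
            xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example technical report</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


class TitleHandler(ContentHandler):
    """Collect the text of every dc:title element."""

    def __init__(self) -> None:
        super().__init__()
        self.titles: List[str] = []
        self._buffer: Optional[List[str]] = None

    def startElementNS(self, name, qname, attrs):
        if name == (DC_NS, "title"):
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElementNS(self, name, qname):
        if name == (DC_NS, "title") and self._buffer is not None:
            self.titles.append("".join(self._buffer).strip())
            self._buffer = None


def extract_titles(xml_text: str) -> List[str]:
    """Parse an OAI-PMH response and return its dc:title values."""
    handler = TitleHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler.titles
```

Matching on the namespace URI rather than the `dc:` prefix keeps the handler working when a data provider chooses a different prefix.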
19
Wish-list
  • Better and Comprehensive Documentation (Maybe
    Mine)
  • Extensive Online Help (Really lacking)
  • Binaries would make life and installation easier
  • Stable OAI harvester and URL importing function
  • METS Schema Adherence
  • Browse by Subject
  • Custom Sorting of Search Results (by relevance,
    by title etc.)
  • Full Text Indexing
  • Multifile Linking in Item Support
  • Thumbnail Support for content
  • Collection/Item in Multiple Community/Collection
    Support
  • Sub-Communities Support
  • Bitstream Level Handles and Metadata

20
Opinion
  • In an institution like IISc, the workflow model
    suits perfectly, allowing stringent peer review.
  • Robust repository system. Champion in the making
    for the Open Access Movement.
  • Greater transparency in workflows and the
    submission process will lead to popular usage.
  • Tough work ahead to promote it and open access
  • IT'S NOT EASY, BUT SURELY NOT IMPOSSIBLE

21
Thanks for your patience. Any queries? Let's
move on to the LIVE DEMO.