Title: Rethinking Assumptions with the Our Americas Archive Partnership (OAAP)
1. Rethinking Assumptions with the Our Americas Archive Partnership (OAAP)
Geneva Henry, Rice University
6 April 2009
CNI Spring 2009 Task Force Meeting, Minneapolis, MN
2. Presentation overview
- About the Our Americas Archive Partnership project
- Vision and goals
- Approach we're taking with the development
- Building the collections
- Assumptions regarding growth
- Assumptions regarding metadata
- Challenges
3. Background
- Our Americas Archive Partnership (OAAP) awarded to Rice by the Institute of Museum and Library Services (IMLS)
- National Leadership Grant for digitization
- Digitize selected items in Woodson's Americas collection
- Add Web 2.0 technologies to enable use of the Rice collection and the University of Maryland's complementary Early Americas Digital Archive collection
4. The Partnership
- Rice University
- Fondren Library
- Humanities Research Center (HRC)
- Digitization, transcriptions, translations, metadata, markup, research modules, scholarly introductions
- University of Maryland
- Maryland Institute for Technology in the Humanities (MITH)
- Integration of collections, development of Web 2.0 features including social tagging and a geospatial interface
- Addition of Instituto Mora, Mexico City
- Rich collection of materials relating to the socioeconomic and historical conditions of Mexico
- Not part of IMLS grant
- Collaborative relationship with Rice's HRC
5. Description of Collections
- Early Americas Digital Archive (EADA, 1492-1820)
- A collection of electronic texts of transcribed literary-historical narratives written in or about the Americas
- Rice Americas Digital Archive (1597-1920)
- Includes approximately 25,000 pages of original letters, broadsides, pamphlets, printed materials and books documenting the political and cultural relationships between the United States, Mexico, Central and South America, Cuba, Spain, and Portugal
- Instituto Mora Collection
- 7,000 pages of additional archival items scanned, digitized, marked up, and fully integrated into the search tools
- Scanning started June 2008 and will continue through summer 2009
6. Our Vision
- Focus on the Americas from a hemispheric perspective rather than the nation state, driven by scholars' needs
- OAAP captures the cultural transformation spanning the five-hundred-year period that saw the making of modern and colonial cultures in the Americas
- OAAP will impact the study of American literary and cultural history by making it easier for scholars to understand cross-cultural influence
7. Goals
- Create unique new research and teaching opportunities
- Make unique archival collection digitally available
- Build common interface between partners' repositories, enabling additional digital archives to be added
- Address issues associated with the complexity of multilingual documents
8. Ubiquitous discovery opens new horizons
- OAAP supports new scholarly inquiry into understanding the development of the Americas
- Unrestricted access to scholarly resources that were previously only in nation-specific collections at a variety of institutions
- Collaboration that crosses institutions, crosses countries, and will grow as scholars need it to grow
- Power to the scholars
9. Federation Model
- Provide a common interface to multiple repositories with different content management approaches
- Search page allowing for multifaceted browsing
- MySQL database built from harvested content
- Federated digital environment allowing institutional partners to share holdings while retaining individual identity
- Extensible to allow for folksonomic tagging
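As an illustration of how a single database built from harvested content can drive multifaceted browsing, here is a minimal sketch. It is not the project's code: the table layout, the sample records, and the function names are hypothetical, and Python's built-in sqlite3 stands in for MySQL so the example runs on its own.

```python
import sqlite3

# Hypothetical harvested records: (identifier, repository, title, language, year).
HARVESTED = [
    ("rice:1", "Rice", "Letter from Havana", "es", 1851),
    ("rice:2", "Rice", "Broadside on trade", "en", 1846),
    ("eada:1", "EADA", "Narrative of a voyage", "en", 1625),
    ("mora:1", "Mora", "Carta sobre comercio", "es", 1846),
]

# Columns a user may browse by (an assumption for this sketch).
FACETS = ("repository", "language", "year")


def build_index(records):
    """Load records harvested from all partners into one shared table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record (identifier TEXT PRIMARY KEY, repository TEXT, "
        "title TEXT, language TEXT, year INTEGER)"
    )
    conn.executemany("INSERT INTO record VALUES (?, ?, ?, ?, ?)", records)
    conn.commit()
    return conn


def facet_counts(conn, facet, filters=None):
    """Count records per value of one facet, narrowed by facets already chosen."""
    filters = filters or {}
    for name in (facet, *filters):
        if name not in FACETS:
            raise ValueError(f"unknown facet: {name}")
    sql = f"SELECT {facet}, COUNT(*) FROM record"
    if filters:
        # Column names are checked against FACETS above; values are bound.
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
    sql += f" GROUP BY {facet} ORDER BY {facet}"
    return dict(conn.execute(sql, tuple(filters.values())).fetchall())


if __name__ == "__main__":
    conn = build_index(HARVESTED)
    print(facet_counts(conn, "repository"))
    print(facet_counts(conn, "language", {"year": 1846}))
```

Each record keeps its source repository as a column, which is one simple way for partners to share holdings in a common search while retaining their individual identity.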
10. Technical approach
- Technically diverse digital collections
- Digital assets stored in separate repositories
- Technical approach
- Capture metadata as Dublin Core
- Convert TEI-marked documents in EADA to Dublin Core and harvest repositories
- Texts encoded in TEI Lite
- Social tagging by scholars using their vocabulary
- Metadata harvesting
- Develop common descriptors
- Common text display
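The TEI-to-Dublin-Core conversion mentioned above can be pictured with a small sketch. Everything in it is an assumption for illustration: the sample header, the choice of four fields, and the mapping itself are hypothetical, and the header is assumed to be unnamespaced (a namespaced TEI file would need namespace-qualified paths).

```python
import xml.etree.ElementTree as ET

# Hypothetical, unnamespaced TEI document used only for this sketch.
SAMPLE_TEI = """<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>A Sample Narrative</title>
        <author>Example Author</author>
      </titleStmt>
      <publicationStmt>
        <publisher>Example Archive</publisher>
        <date>1625</date>
      </publicationStmt>
    </fileDesc>
  </teiHeader>
  <text><body><p>Sample text.</p></body></text>
</TEI.2>"""

# Assumed mapping from Dublin Core element names to paths in the TEI header.
TEI_TO_DC = {
    "title": "teiHeader/fileDesc/titleStmt/title",
    "creator": "teiHeader/fileDesc/titleStmt/author",
    "publisher": "teiHeader/fileDesc/publicationStmt/publisher",
    "date": "teiHeader/fileDesc/publicationStmt/date",
}


def tei_to_dublin_core(tei_xml):
    """Return a dict of Dublin Core element name -> list of values."""
    root = ET.fromstring(tei_xml)
    record = {}
    for dc_name, path in TEI_TO_DC.items():
        values = [
            el.text.strip()
            for el in root.findall(path)
            if el.text and el.text.strip()
        ]
        if values:  # Dublin Core elements are optional, so skip empty ones.
            record[dc_name] = values
    return record


if __name__ == "__main__":
    print(tei_to_dublin_core(SAMPLE_TEI))
```

Values are kept as lists because Dublin Core elements are repeatable, for example a text with several authors.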
11. DSpace Platform
- DSpace is one of the leading open source software platforms for an institutional repository
- Rice's Digital Scholarship Archive uses DSpace, with some significant customizations
- Provides permanent digital archive for materials
- Fine-grained access controls
- Metadata separate from actual objects allows for scalability of digital assets
12. Overview of DSpace Architecture
- Web-based user interface
- Runs on Unix-based OS; Rice's is running on Apple Xserves
- Production server for final collections
- Development and test servers for preparation
- Uses PostgreSQL database for managing content
- Includes Lucene search engine
- Support for full-text search
- Supports Dublin Core metadata standard
- Metadata harvested by OAI harvesters
- Storage demands are VERY high
- Using Isilon clustered storage solution to facilitate multimedia
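To make the OAI harvesting step concrete, here is a minimal sketch of an OAI-PMH harvester's two basic jobs: building a ListRecords request for Dublin Core (oai_dc) metadata and parsing the response. The base URL and the sample response are hypothetical, no network call is made, and resumption tokens and error responses are left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response, trimmed to the parts this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:123/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample letter</dc:title>
          <dc:language>es</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, set_spec=None):
    """Build a ListRecords request URL asking for oai_dc metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Extract identifier and Dublin Core fields from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        fields = {}
        for el in rec.findall("oai:metadata/oai_dc:dc/*", NS):
            name = el.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
            fields.setdefault(name, []).append((el.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai/request"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also follow the resumptionToken that a repository returns when a result list is split across several responses.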
13. Connexions
- Provide scholarly analysis of the archival documents or demonstrate their pedagogical uses in an online environment
- Connexions is a set of tools for developing and freely distributing educational material
14. Using archival materials
- Scanned images immediately provide visual cues as to the type of document
- A letter versus a governmental document
15. Multilingual documents
- Translations expand access to intellectual content of texts
- By providing the content in the language of the reader
- And
- In a format that facilitates visual scanning of content and full-text searching
16. Enhancing Multilingual documents
- Digital Image > Transcription > Translation
17. Example Item Record
- TEI file
- Digital Image
- Metadata
18. Rice Americas Archive Interface
19. EADA Browse Interface
20. OAAP Beta site interface
21. Geospatial view of results
22. Outcomes
- Allow scholarly examination of American literature from a hemispheric perspective
- Develop a collection of texts, curricular models, and teaching materials that embody a hemispheric approach to the study of the early Americas
- Generate professional and intellectual exchanges among scholars from various fields
- Support scholars from outside the US and their contributions
- Create digitized versions of primary sources not previously available to a wide-ranging and physically dispersed audience
- Support addition of other digital archives with minimal barrier to entry
23. Growth Assumptions
- Architectural approach assumed new partners would host their own digital collections
- Assumed familiarity with digitization practices
- Sustainability of collection assumed to be responsibility of each partner
- Assumed at least some level of processing (minimal) to be a contributing partner
24. Assumptions regarding metadata
- Dublin Core was assumed acceptable for partnering
- Following metadata best practices viewed as a good thing when project started
- Markup of text documents seen as valuable enhancement
- Geospatial information to support geospatial visualization of resources thought to be valuable to scholars
25. Collection Challenges
- Latin American institutions have rich collections but limited experience and resources with digitization
- Hosting collections presents issues of sustainability
- Should hosted collections follow practices of collection at hosting institution?
26. Metadata challenges
- New scholarly approach to understanding historic documents relies on new descriptions
- Cataloging/metadata best practices impose a previous organizational bias
- Deciding what geographic information is relevant is not so straightforward
- Scholars interested in shifting borders; geospatial presentation of little value to them
- Should minimal metadata with full-text search be the new model for supporting digital scholarship?
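To show what "minimal metadata with full-text search" could mean in practice, here is a toy sketch of an inverted index over title and transcribed text. It is an illustration only, not the Lucene-based search the repositories use; the sample documents and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical items carrying only a title (minimal metadata) and full text.
DOCS = {
    "rice:1": {
        "title": "Letter from Havana",
        "text": "Trade between Havana and New Orleans continues.",
    },
    "mora:1": {
        "title": "Carta sobre comercio",
        "text": "El comercio entre México y La Habana continúa.",
    },
}


def tokenize(text):
    """Lowercase and split on non-word characters (accented letters are kept)."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids whose title or text contains it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for token in tokenize(doc["title"] + " " + doc["text"]):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    index = build_index(DOCS)
    print(search(index, "havana"))
    print(search(index, "méxico habana"))
```

The sample also hints at the multilingual issue: a search for "havana" does not find the Spanish "Habana", which is the kind of gap that translations or richer descriptions would have to close.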
27. Project Website
- Website: http://oaap.rice.edu
- Updates on project developments
- Share team presentations to communities
- Share scripts and code for future participants
- Rice and Mora Americas collection at http://scholarship.rice.edu/handle/1911/9219
- EADA at http://www.mith2.umd.edu/eada/
28. Thank You and come visit us on the web
- Contacts
- Geneva Henry, PI (Rice)
- ghenry_at_rice.edu
- Caroline Levander, Co-PI (Rice)
- clevande_at_rice.edu
- Neil Fraistat, Co-PI (MITH)
- nfraistat_at_gmail.com