Title: Not So Different After All
1Not So Different After All
- Creating Access To Diverse Objects in Digital
Repositories
Jennifer O'Brien Roper, Gretchen Gueguen, and
Susan Schreibman, University of Maryland Libraries
2What We Will Cover
- The Building Blocks of Digital Repositories
- The Thomas MacGreevy Archive vs. the University
of Maryland's Digital Repository
- Future issues for exploration
3Barriers to Digital Library Integration
- Thematic Collections
  - Documenting the American South: http://docsouth.unc.edu
  - Historic Pittsburgh: http://digital.library.pitt.edu/pittsburgh
- Object Collections
  - Indiana University (XPat): http://www.dlib.indiana.edu/collections/
- Packaged Collections
  - Romantic Circles: http://www.rc.umd.edu
- After-the-fact Collections
  - NINES: http://www.nines.org
4Digital Project Building Blocks
- Metadata
- Vocabulary
- Interface Design
5Digital Projects Metadata Standard
- Initiatives to aggregate collections
  - Z39.50
    - Keep different metadata schemes
    - Keep localized repositories
    - Pressure on the search interface
  - Open Archives Initiative Metadata Harvesting Protocol (MHP)
    - Extract a normalized record
    - Create a central repository
    - Pressure on the central repository
6Digital Projects Metadata Standard
[Diagram labels: User Query; Centralized Repository for searching (normalized metadata); three Individual Repositories (local metadata), each linked to the central repository by arrows labeled "metadata" and "URL".]
7Digital Projects Metadata Standard
- Lessons from MHP
- Normalizing metadata for sharing
- Keeping local enhanced records
- Centralize functions like searching, but disperse
functions like displaying and storing
- Drawbacks
- Lack of enforcement for Metadata Standards
- Various levels of granularity in data
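The harvesting lesson above can be sketched in a few lines: each repository keeps its own enhanced records, a normalized subset is extracted for a central index, and a search returns only URLs that lead back to the owning repository. This is a minimal illustration, not the protocol itself. The record fields, the field maps, and the example.org URLs are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedRecord:
    """Minimal shared record; the field names are illustrative assumptions."""
    title: str
    creator: str
    subjects: tuple
    url: str  # points back to the repository that stores and displays the object


def normalize(local_record, field_map, base_url):
    """Extract a normalized record from a richer local one.

    field_map names the local field that feeds each shared field. Anything
    that is not mapped stays behind in the local, enhanced record.
    """
    return NormalizedRecord(
        title=local_record[field_map["title"]],
        creator=local_record.get(field_map["creator"], "unknown"),
        subjects=tuple(local_record.get(field_map["subjects"], [])),
        url=f"{base_url}/{local_record[field_map['id']]}",
    )


def search(index, term):
    """Search the central index only; return URLs into the local repositories."""
    term = term.lower()
    return [
        rec.url
        for rec in index
        if term in rec.title.lower() or any(term == s.lower() for s in rec.subjects)
    ]


# Two hypothetical repositories with different local metadata schemes.
archive_a = [{"id": "a1", "main_title": "The Entombment", "artist": "Poussin",
              "keywords": ["Art", "French"], "extent": "6 kb"}]
archive_b = [{"pid": "b7", "name": "Campus film reel", "topics": ["Film"]}]

central_index = [
    normalize(r, {"id": "id", "title": "main_title", "creator": "artist",
                  "subjects": "keywords"}, "http://a.example.org/objects")
    for r in archive_a
] + [
    normalize(r, {"id": "pid", "title": "name", "creator": "maker",
                  "subjects": "topics"}, "http://b.example.org/items")
    for r in archive_b
]

print(search(central_index, "art"))
```

The second repository has no creator field at all, so its normalized record falls back to "unknown". That is one small instance of the granularity drawback: the central index can only be as detailed as the weakest contributor.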
8Digital Projects Vocabularies
- The ideal language
- Pre-coordinated vs. post-coordinated
- Hierarchies and relationships are great for some
things, not for others
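One way to picture the pre-coordinated vs. post-coordinated distinction is the sketch below. The object identifiers and headings are invented for illustration. In the pre-coordinated index the concepts are combined into one heading when the object is described; in the post-coordinated index single-concept terms are combined only at search time.

```python
# Pre-coordinated: concepts are combined into one heading at indexing time.
precoordinated = {
    "letter-1": ["Art, French -- 17th century"],
    "review-2": ["Art, Irish -- 20th century"],
    "poem-3": ["Poetry, French -- 20th century"],
}

# Post-coordinated: each object carries single-concept terms that are
# combined only when the user searches.
postcoordinated = {
    "letter-1": {"Art", "French", "1600-1699"},
    "review-2": {"Art", "Irish", "1900-1999"},
    "poem-3": {"Poetry", "French", "1900-1999"},
}


def search_pre(index, heading):
    """Match only objects that carry the exact combined heading."""
    return sorted(obj for obj, headings in index.items() if heading in headings)


def search_post(index, *terms):
    """Match objects that carry every requested term, in any combination."""
    wanted = set(terms)
    return sorted(obj for obj, assigned in index.items() if wanted <= assigned)


print(search_pre(precoordinated, "Art, French -- 17th century"))
print(search_post(postcoordinated, "French"))
print(search_post(postcoordinated, "Art", "French"))
```

The post-coordinated index can answer "everything French" directly, while the pre-coordinated index would need every heading that happens to contain that concept. The price is that relationships built into a heading are no longer recorded.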
9Digital Projects Vocabularies
- Controlled vs. local
- Locally created languages
- Fit materials well
- Speak the user's language
- BUT
- Are difficult to enhance
- All controlled vocabularies are difficult to
combine
10Digital Projects Interface Issues
- Multiple Hierarchies
- Facilitate access to both individual and general
collections
- Multiple Object Types
- Film, audio, video, image, text in one interface
- Multiple Modes of Access
- Allow users to browse through objects in a manner
that promotes cross-collection discovery
11Thomas MacGreevy Archive
- http://www.macgreevy.org
- TEI P4-based repository of texts
- A few different collections, some searchable,
some not
- Changes Desired
- Add images to the searchable collection
- Add a collection of letters with special display
needs
12Thomas MacGreevy Archive Metadata Standard
- Limited to TEI P4
- Letters have irregularities in the body
- Images have multiple levels of being
- Collections haven't been named before
13Thomas MacGreevy Archive Metadata Standard
- Solution?
- Adapt things to fit
- The Entombment
- An Electronic Version
- Fit the title of the original (here, a painting by
Poussin) in the main title and indicate that it
is a digital copy in the version title. This is
typical TEI practice.
- Markup completed by
- Gretchen Gueguen
- 6 kb
- Keep the responsibility statement about the
creation of the TEI file. Record the creator of
the original when you record details about the
original.
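The values on this slide (title, version title, responsibility statement, extent) could sit in a TEI header fragment along the lines of the sketch below. The element names come from the TEI header, but the type attribute values "main" and "version" and the exact nesting are assumptions, and the fragment omits the other parts a complete header requires. It is built with Python's standard library only to show the shape.

```python
import xml.etree.ElementTree as ET

# Assumed layout: titles and the responsibility statement inside titleStmt,
# followed by the extent of the electronic file.
file_desc = ET.Element("fileDesc")
title_stmt = ET.SubElement(file_desc, "titleStmt")

# Title of the original work in the main title (type value is an assumption).
ET.SubElement(title_stmt, "title", {"type": "main"}).text = "The Entombment"
# The version title signals that this is a digital copy.
ET.SubElement(title_stmt, "title", {"type": "version"}).text = "An Electronic Version"

# Responsibility for the TEI file itself, not for the original painting.
resp_stmt = ET.SubElement(title_stmt, "respStmt")
ET.SubElement(resp_stmt, "resp").text = "Markup completed by"
ET.SubElement(resp_stmt, "name").text = "Gretchen Gueguen"

ET.SubElement(file_desc, "extent").text = "6 kb"

tei_fragment = ET.tostring(file_desc, encoding="unicode")
print(tei_fragment)
```

The creator of the original would be recorded wherever the header describes the source, which keeps the two kinds of responsibility apart.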
14Thomas MacGreevy Archive Metadata Standard
- Add only what is necessary
- Art
- French
- 1600-1699
- Keywords for cross-searching
- These details are for display and internal record
keeping, NOT searching
- The McGreevy Family Front row from left to
right ThomassName, NoraName
- A caption is searchable and provides many precise
points of access.
15Thomas MacGreevy Archive Metadata Standard
- Translate existing codes
- Image
- The collections were loosely designated before by
a combination of item type and value. We
continued to use this convention even when it was
redundant.
16Thomas MacGreevy Archive Metadata Standard
Only the information that is also available for
texts is displayed, such as title, creator, text,
and keywords
17Thomas MacGreevy Archive Vocabulary
- Limited vocabulary available
- Main descriptors based on Dewey
- Other descriptors based on nationality and date
of subject
- Some other fields used a consistent language
- Collection designations
- Text types (poem, art review, obituary, etc.)
- 500 texts finished
- Solution?
18Thomas MacGreevy Archive Vocabulary
- Can't create a new vocabulary for the collection
- Can't create a scalable vocabulary for everything
- So,
- Add some new words to the list and retrospectively
update
19Thomas MacGreevy Archive Vocabulary
- Existing Terms
  - Architecture
  - Art
  - Biography
  - Catholicism
  - Critical Method
  - Dance
  - Education
  - Film
  - Folklore
  - Great War
  - History
  - Journals
  - Music
  - Mythology
  - Opera
  - Sport
  - Theatre
  - Travel
- New Terms
  - Career & Finances
  - Domestic Life
  - Irish Culture
  - Literature
  - Politics & Government
  - Portraits
  - Social Life
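Extending a short controlled list and then updating finished records retrospectively might look like the sketch below. It uses only a handful of the terms from the list, and the record identifiers, the additions, and the function names are hypothetical. The point is that new keywords are checked against the enlarged vocabulary before they are merged into existing records.

```python
# A few terms from the existing list and a few of the newly added ones.
EXISTING_TERMS = {"Architecture", "Art", "Biography", "Music", "Theatre"}
NEW_TERMS = {"Domestic Life", "Irish Culture", "Literature", "Portraits", "Social Life"}
VOCABULARY = EXISTING_TERMS | NEW_TERMS


def unknown_terms(keywords):
    """Return the keywords that are not in the controlled vocabulary."""
    return sorted(set(keywords) - VOCABULARY)


def retrospective_update(records, additions):
    """Merge editor-supplied new terms into already finished records.

    records maps a record id to its current keywords; additions maps a
    record id to the new terms an editor wants to add (both hypothetical).
    """
    updated = {}
    for record_id, keywords in records.items():
        extra = additions.get(record_id, [])
        bad = unknown_terms(extra)
        if bad:
            raise ValueError(f"{record_id}: not in vocabulary: {bad}")
        updated[record_id] = sorted(set(keywords) | set(extra))
    return updated


records = {"review-12": ["Art"], "poem-40": ["Music"]}
additions = {"review-12": ["Portraits"]}
updated = retrospective_update(records, additions)
print(updated)
```

With roughly 500 texts already finished, the expensive part is deciding which records deserve a new term. The merge itself is mechanical once the list is fixed.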
20Thomas MacGreevy Archive Interface
- New collections affect how searching is done
- New object types need to be displayed in different
ways
- Viewing
- Search results
21Thomas MacGreevy Archive Interface Browse
Searchable collections
Unsearchable collections
22Thomas MacGreevy Archive Interface Search
Unclear approach to subjects
Confusing, often missed, options
23Thomas MacGreevy Archive Interface Revised
Search
Search in full-text as well as by type
Faceted subjects made explicit
Collection, author, and date of objects are
searchable nodes
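The revised search described here, full text combined with explicit facets such as type, collection, author, and subject, can be sketched as a simple filter. The object records and field names below are invented for illustration and do not reflect the archive's actual index.

```python
def _matches(value, wanted):
    """A facet matches a single value or any member of a multi-valued field."""
    if isinstance(value, (list, tuple, set)):
        return wanted in value
    return value == wanted


def faceted_search(objects, text=None, **facets):
    """Return ids of objects matching the full-text term and every facet."""
    hits = []
    for obj in objects:
        if text is not None and text.lower() not in obj["text"].lower():
            continue
        if all(_matches(obj.get(name), wanted) for name, wanted in facets.items()):
            hits.append(obj["id"])
    return hits


# Hypothetical objects of three different types in one searchable set.
objects = [
    {"id": "img-1", "doc_type": "image", "collection": "Images",
     "author": "Poussin", "subjects": ["Art", "French"],
     "text": "The Entombment"},
    {"id": "txt-1", "doc_type": "art review", "collection": "Texts",
     "author": "MacGreevy", "subjects": ["Art", "Irish Culture"],
     "text": "A review of an exhibition of paintings"},
    {"id": "let-1", "doc_type": "letter", "collection": "Letters",
     "author": "MacGreevy", "subjects": ["Literature"],
     "text": "Letter about a new poem"},
]

print(faceted_search(objects, subjects="Art"))
print(faceted_search(objects, text="poem", author="MacGreevy"))
```

Because every facet is an ordinary field, a new object type only needs to supply the same fields to appear in the same search.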
24Thomas MacGreevy Archive Interface Results
User determines relevancy by author and title
within results
25Thomas MacGreevy Archive Interface Revised
Results
Sort by document type in results
Show thumbnails and keywords for image objects
Indicate that images are associated with texts
List blurbs for articles and abstracts for letters
26Thomas MacGreevy Archive Interface Revised
Display
Tabbed interface for comparison
Unique, textual features of letters represented
in images
27Thomas MacGreevy Archive Lessons Learned
- Neutral metadata standard
- Scalable controlled vocabulary
- Starting over is sometimes worth it, sometimes
not
28University of Maryland Digital Repository
- FEDORA architecture
- Individual collections
- Unique look and feel
- Customized metadata design
- Cross collection search and browse
- Common search interface
- Rich minimum standards for metadata
29UM Digital Repository Metadata Standard -
Description
- Dublin Core
- Great for cross-collection description
- Too simple for rich description within a focused
collection
- VRA Core
- Excels at rich description
- Created for and focused solely on images
30UM Digital Repository Metadata Standard -
Description
- Hybrid standard
- University of Maryland Descriptive Metadata
(UMDM)
- Customized DTD
- Rigorous minimum standard
- Common base of granular data
- MODS
31UM Digital Repository Metadata Standard
Local Standard
- Coverage Place
- Coverage Time
- Media Type
- Physical description
- Culture
- Description
- Subject
- Title
- PID
- Relationships
- Repository
- Rights
- Identifier
- Agent
- Language
- Style
32UM Digital Repository Metadata Standard - METS
- Wrapper for all objects
- METS record for every object contains
- Header
- Descriptive Metadata
- Administrative Metadata
- File Section
- Structural Map
- Structural Links
- Behavior Section
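The seven parts listed above correspond to the top-level sections of a METS document. The sketch below builds an empty skeleton in that order using the METS namespace. It is only a structural outline: the sections are left empty, so the result would not validate against the schema, and the OBJID value is a made-up identifier.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

# Top-level METS sections, in the order the slide lists them.
SECTIONS = [
    "metsHdr",      # Header
    "dmdSec",       # Descriptive Metadata
    "amdSec",       # Administrative Metadata
    "fileSec",      # File Section
    "structMap",    # Structural Map
    "structLink",   # Structural Links
    "behaviorSec",  # Behavior Section
]


def mets_skeleton(object_id):
    """Return a bare METS wrapper with one empty element per section."""
    root = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})
    for name in SECTIONS:
        ET.SubElement(root, f"{{{METS_NS}}}{name}")
    return root


root = mets_skeleton("umd:0000")  # hypothetical identifier
print(ET.tostring(root, encoding="unicode"))
```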
33UM Digital Repository Metadata Standard - METS
- Flexibility to use external descriptive standards
- Behavioral control
- Map other standards to UMDM dynamically
34UM Digital Repository Metadata Standard -
Conversion
- Mapping existing data to UMDM
- Indicates where information is in the existing
dataset, and intended UMDM location
- Transformation notes
- Static information to be added
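A mapping of this kind has three ingredients: where a value lives in the existing dataset, where it should go in the target record, and any transformation or static value to apply. The sketch below shows one possible shape for such a crosswalk. The source column names, the target field names (loosely modeled on the local standard's element names), and the static values are all assumptions for illustration.

```python
# Each entry: (field in the existing dataset, target field, transformation).
CROSSWALK = [
    ("Title", "title", str.strip),
    ("Photographer", "agent", str.strip),
    # Transformation note: keep only the four-digit year.
    ("Year", "coverage_time", lambda value: value.strip()[:4]),
]

# Static information added to every converted record (assumed values).
STATIC = {"repository": "University of Maryland Libraries", "media_type": "Image"}


def convert(row):
    """Convert one row of the existing dataset into a target record."""
    record = dict(STATIC)
    for source, target, transform in CROSSWALK:
        if row.get(source):  # skip fields that are missing or empty
            record[target] = transform(row[source])
    return record


row = {"Title": " Campus view ", "Photographer": "Unknown", "Year": "1923-05"}
converted = convert(row)
print(converted)
```

Keeping the crosswalk as data rather than code means the mapping document and the conversion script can stay in step.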
35UM Digital Repository Metadata Standard -
Conversion
36UM Digital Repository Metadata Standard -
Conversion
- Ingestion
- As a distinct standard, with dynamic generation
- Batch uploaded from another source
- Incrementally built
37UM Digital Repository Metadata Standard -
Conversion
38UM Digital Repository Metadata Standard -
Conversion
39UM Digital Repository Metadata Standard -
Conversion
40UM Digital Repository Vocabularies
- Consistent input is key to cross-searchability
- Controlled vocabularies
- General descriptive
- Names and name authority
- Subjects
41UM Digital Repository Vocabularies General
Descriptive
- External vocabularies
- Media Type (DCMI Type Vocabulary)
- Language (Former ISO 639-2 values)
- Local vocabularies
- Repository
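Consistent input for these fields can be checked mechanically. The sketch below validates a record against the twelve terms of the DCMI Type Vocabulary and a small excerpt of ISO 639-2 language codes. The record layout and field names are assumptions, and a real check would load the full code list rather than three entries.

```python
# The DCMI Type Vocabulary terms.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# A small excerpt of ISO 639-2 codes, enough for the example.
ISO_639_2_EXCERPT = {"eng": "English", "fre": "French", "gle": "Irish"}


def check_record(record):
    """Return a list of fields whose values are outside the vocabularies."""
    problems = []
    if record.get("media_type") not in DCMI_TYPES:
        problems.append(f"media_type: {record.get('media_type')!r}")
    if record.get("language") not in ISO_639_2_EXCERPT:
        problems.append(f"language: {record.get('language')!r}")
    return problems


print(check_record({"media_type": "MovingImage", "language": "eng"}))
print(check_record({"media_type": "Video", "language": "English"}))
```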
42UM Digital Repository Vocabularies General
Descriptive
- Terms created as needed
- Culture
- nationality, ethnic, regional, organizational,
etc.
- Style
- architectural, literary, musical, etc.
43UM Digital Repository Vocabularies Name
Authority
- Existing terms
- LC Name Authority File
- Getty Thesaurus of Geographic Names
- Creating terms
- Name Authority Cooperative Program
44UM Digital Repository Vocabularies Subject
- Collection based
- Repository wide
- browse terms
45UM Digital Repository Vocabularies Subject
46UM Digital Repository Vocabularies Subject
Subject: Fine Arts
Subject: Architecture
47UM Digital Repository Vocabularies Subject
- Collection based
- Appropriate to project focus and scope
- Existing thesauri
- Library of Congress Subject Headings
- Art & Architecture Thesaurus
- Thesaurus for Graphic Materials
- Etc.
- Local thesauri
48UM Digital Repository Vocabularies Subject
- browse terms
- Defined independent of any project
- Applied to all objects, regardless of collection
- Intentionally general
- Only two levels of specificity
- Experimented with locally derived list based on
LC Call Number Scheme
49UM Digital Repository Interface Design
- Make clear through general and collection
interfaces that:
- Objects are in multiple hierarchies
- Users can access multiple object types
- Users can use multiple modes of access and
discovery
50UM Digital Repository Interface Design -
General
- University of Maryland theme
- Access to general metadata
- Accommodate multiple file types
- Simple and advanced searching
51UM Digital Repository Interface Design -
Collections
- Collection based theme
- Access to customized metadata
- Exploit file types specific to collection
- Customized search
- Extras
- Contextualized materials
- Exhibits
- Documentation
55Date limit options
Restricted explicit field search
56Embedded video player
Metadata to contextualize video
Amazon.com inspired feature
Linked subject display
58Option to limit or not by format
Drop down menus to help guide searchers
Wide variety of unique fields to explicitly search
59Display of customized metadata
thumbnail
60UM Digital Repository Interface Design - Browse
- Explore collection without searching
- Browse subjects as initial gateway
- Use common metadata elements to drill down
- Lists rebuilt weekly, not dynamically
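Rebuilding browse lists on a schedule rather than at request time amounts to precomputing an index from browse term to objects. A minimal sketch follows. The PIDs and the objects are hypothetical, and in practice the build step would be run by a scheduled job while page requests read only the stored result.

```python
from collections import defaultdict


def build_browse_lists(objects):
    """Group object PIDs under each browse term (run on a schedule)."""
    lists = defaultdict(list)
    for obj in objects:
        for term in obj["browse_terms"]:
            lists[term].append(obj["pid"])
    return {term: sorted(pids) for term, pids in sorted(lists.items())}


# Hypothetical objects drawn from different collections.
objects = [
    {"pid": "umd:3", "browse_terms": ["Fine Arts"]},
    {"pid": "umd:1", "browse_terms": ["Architecture", "Fine Arts"]},
    {"pid": "umd:2", "browse_terms": ["Architecture"]},
]

# Built once per cycle; the browse pages only read this structure.
browse_lists = build_browse_lists(objects)


def browse(term):
    """Look up a precomputed list; no searching happens at request time."""
    return browse_lists.get(term, [])


print(browse("Fine Arts"))
```

The trade-off is freshness: an object added after the last rebuild is searchable but does not appear in the browse lists until the next one.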
61UM Digital Repository Future Issues
- Difficulties managing multiple hierarchies
- Workflows for general collections with no
curator
- Balancing infrastructure and individual project
development
- Digital archiving questions unanswered
62Not So Different After All
- Creating Access To Diverse Objects in Digital
Repositories
http://www.lib.umd.edu/dcr/publications/lita.ppt
Jennifer O'Brien Roper jroper1_at_umd.edu
Gretchen Gueguen ggueguen_at_umd.edu
Susan Schreibman sschreib_at_umd.edu