Title: Cataloging, Metadata, and Information Architecture
1Cataloging, Metadata, and Information Architecture
- New Directions for Catalogers Old and New
- Steven J. Miller
- UW-Milwaukee
- WAAL 2006
2Information Organization Landscape Today
- This presentation is a very broad overview, primarily from a library cataloging perspective
- Themes
- New developments and directions for working catalogers, continuing education, LIS education
- Importance of common threads of principles and practice that run through
- Three inter-related contexts
- Cataloging
- Metadata
- Information Architecture
3Starting Point: User Needs
- Intellectual access to information resources
- intellectual and artistic output of human endeavor
- human cultural heritage from smallest to largest scale
- Requires organizing and structuring of metadata (bibliographic data)
- Requires in turn specialists who can do this
- The need is greater than ever, but shifting in scope
4Libraries and Information Resources
[Diagram: information resources arranged on two axes, stewardship (high/low) and uniqueness (low/high)]
5In Libraries
- Continuing need for traditional knowledge and skills
- AACR, MARC, LCSH, LCC, DDC, OCLC, local systems
- Broadening Landscape
- New conceptual models: FRBR
- Changes in cataloging rules: RDA
- Digital resources: e-books, e-journals, databases; licensing agreements; batch record processing
6In Libraries
- Broadening landscape
- Access to resources via online catalogs, federated searching, context-sensitive linking (OpenURL)
- Digitization projects and online digital collections
- Unique or rare local resources (text, still image, moving image, sound)
- Digital institutional repositories
- Digital intellectual output of university community
- Pre- and post-prints of scholarly papers, electronic theses and dissertations, faculty-created learning objects
7Common Threads
- Common to cataloging, metadata, and information architecture
- Metadata elements, content rules and guidelines
- Metadata encoding and communication
- Metadata-driven information retrieval systems (user interfaces)
- Back-end vs. front-end
- Organization, labeling, search and browse, navigation systems
- Controlled vocabularies, taxonomies, thesauri
- Facets: faceted searching, browsing, and navigation
8Information Retrieval Systems
- Back End: Staff Interface (Information Professionals)
- Front End: User Interface (Users)
- Serve needs of all users and search types, current and future
- User needs, user-centered design, usability
- Progress in human knowledge depends on cumulative scholarship
9Types of Controlled Vocabularies: Continuum of Complexity
- Types of Relationships
10Cataloging
11Something Old, Something New
- The Old and still relevant
- Panizzi, Cutter, Lubetzky, Paris Principles
- Cataloging principles (theory)
- Cataloging experience (practice)
- The New
- IFLA, JSC
- FRBR, RDA
- Metadata, IA, digital libraries
12Functions of the Catalog
- Panizzi, Cutter, Lubetzky, Paris Principles
- New international cataloging code (IFLA)
- IFLA Statement of International Cataloguing Principles (Jan. 2005 draft)
- http://www.loc.gov/loc/ifla/imeicc/source/statement-draft_jan05.pdf
- Resource description and access
- Identifying/finding, collocating/gathering, evaluating/selecting
13Functions of Descriptive Metadata
- Representation
- Represent the resource to the user
- Serve as a surrogate for resource itself
- Provide descriptive information
- Help user identify, evaluate and select
- Retrieval
- Provide means for search, browse, navigation
- Known item searches and exploratory searches
- Retrieve sets of results, not just individual items
- grouped according to one or more common characteristics
14FRBR: Functional Requirements for Bibliographic Records
- A report published by IFLA in 1998
- (IFLA: the International Federation of Library Associations)
- Highlights of the report
- 1. Defines four generic user tasks
- 2. Presents new conceptual model of bibliographic universe
- 3. Recommends basic requirements for bibliographic records
15FRBR User Tasks
- Find entities that relate to a user's search criteria (locate and collocate)
- Identify an entity, confirm it is what is sought, or distinguish it from similar entities
- Select an entity appropriate to the user's needs
- Obtain the information entity through loan, or access it electronically via a remote, networked computer
- Cutter's objectives for the catalog
- Finding
- Collocating
16FRBR Entities
- Group 1: Products of intellectual and artistic endeavor
- Work
- Expression
- Manifestation
- Item
- Group 2: Those responsible for the intellectual and artistic content
- Person
- Corporate body
- Group 3: Subjects of works
- Groups 1 and 2, and
- Concept
- Object
- Event
- Place
- For 80% of resources cataloged, work, expression, and manifestation are identical
- For 20% they differ (especially for works of literature and music, e.g., Shakespeare, Mozart); the value of the distinctions lies in that 20%
17FRBR Primary Entity Relationships
[Diagram: Work, Expression, Manifestation, and Item linked in sequence, with "one" and "many" labels on the links]
18FRBR in Practice
Work (e.g., Hamlet) (uniform title authority record)
Author of text (personal name authority record)
Editor of edition (personal name authority record)
Reader of audiobook (personal name authority record)
One audio recording of the text (partial bibliographic record)
One language edition of the text (partial bibliographic record)
Regular-print version (partial bibliographic record)
Audiocassette version (partial bibliographic record)
Large-print version (partial bibliographic record)
Audio CD version (partial bibliographic record)
Online PDF version (partial bibliographic record)
- Manifestation-level information
- ISBN
- Publication information
- Physical description
- Some notes
- Access points related to physical manifestation (links to authority records)
- Expression-level information
- Title and statement of responsibility
- Edition statement
- Some notes
- Access points related to intellectual content (links to authority records)
Copies owned by local institutions (multiple holdings and item records)
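The Hamlet example above can be sketched as a small data model. This is only an illustration of the Work / Expression / Manifestation / Item hierarchy, not a schema taken from the FRBR report; the class names, field names, call numbers, and library names are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    # A single copy held by a local institution (holdings/item record).
    holding_library: str
    call_number: str


@dataclass
class Manifestation:
    # A physical or digital embodiment: print, large print, CD, PDF, etc.
    carrier: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    # One realization of the work, such as a text edition or an audio reading.
    form: str
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    # The abstract intellectual creation (uniform title authority record).
    uniform_title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)


def collocate(work: Work):
    """List (expression form, carrier, number of copies) for every manifestation."""
    return [
        (expr.form, man.carrier, len(man.items))
        for expr in work.expressions
        for man in expr.manifestations
    ]


# Hypothetical holdings, invented for illustration only.
hamlet = Work("Hamlet", "Shakespeare, William, 1564-1616", [
    Expression("text", "eng", [
        Manifestation("regular print", [Item("Main Library", "PR2807 .A2")]),
        Manifestation("large print"),
        Manifestation("online PDF"),
    ]),
    Expression("spoken word", "eng", [
        Manifestation("audiocassette"),
        Manifestation("audio CD", [Item("Media Center", "CD 1234"),
                                   Item("Branch", "CD 1234")]),
    ]),
])

for row in collocate(hamlet):
    print(row)
```

Because every manifestation hangs off one work record, a catalog built this way can gather all versions of Hamlet under a single heading, which is the collocating function described earlier.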
19RDA: Resource Description and Access
- Successor to AACR2
- Part 1 on Description: draft was available for review and comment late 2005-early 2006
- Parts 2 and 3 on Access points and Authority control will be released in 2006-2007
- Input from community
- Intended publication date: 2008
20RDA Highlights
- Intended as a metadata content standard alongside others; digital context
- Focus on cataloging principles, and basis for cataloger judgment
- Incorporation of FRBR concepts and terminology
- Integration of all types of content, media, and publication
- Separation of content from display (i.e., ISBD punctuation) and from encoding and communication format (i.e., MARC)
- Primarily a digital resource rather than print
21RDA: What will be different?
- Resource description rules pretty much the same
- Different
- ISBD areas of description and punctuation not part of body of rules; now optional; an appendix will cover them
- Each data element separate, e.g., Title, Statement of responsibility, Publisher, Place of publication, Date
- Non-print, digital, and continuing resources no longer second-class status
- Some of the terminology is changed to cohere with other metadata standards, e.g., Identifier
- Rule of three will be optional rather than the norm
- Recognition of authority records and authority control built into the rules
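One way to picture the separation of content from display: the data elements are stored individually, and ISBD-style punctuation is applied only when a display is generated. The sketch below is a simplified assumption, not text from RDA or ISBD; the element names, the sample values, and the "Example Press" publisher are hypothetical.

```python
# Content: each data element recorded separately, with no display punctuation.
record = {
    "title": "Hamlet",
    "statement_of_responsibility": "William Shakespeare",
    "place_of_publication": "New York",
    "publisher": "Example Press",  # hypothetical publisher
    "date": "2006",
}


def isbd_display(rec):
    """Render the elements with simplified ISBD-style punctuation."""
    title_area = f'{rec["title"]} / {rec["statement_of_responsibility"]}'
    publication_area = (
        f'{rec["place_of_publication"]} : {rec["publisher"]}, {rec["date"]}'
    )
    return f"{title_area}. -- {publication_area}."


def labeled_display(rec):
    """Render the same elements as a labeled list, with no ISBD punctuation."""
    return "\n".join(
        f'{name.replace("_", " ").capitalize()}: {value}'
        for name, value in rec.items()
    )


print(isbd_display(record))
print(labeled_display(record))
```

The same stored record feeds both displays, so changing the display convention does not require touching the cataloged data.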
22Machine-generated cataloging
- Batch editing and loading of files of MARC records
- Use of tools such as MarcEdit
- Repurposing MARC data (reusing)
- Mapping to MARC-XML
- Mapping to different metadata schemes
- Use for digital projects
- Use for library web sites
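Mapping MARC data to a different metadata scheme can be sketched as a lookup table from MARC tag and subfield to a target element. The table below is a deliberately simplified assumption for illustration, not the official Library of Congress crosswalk, and the sample record (including "Example Press") is invented.

```python
# Illustrative (tag, subfield) -> Dublin Core element table. Simplified and
# assumed for this sketch; a real crosswalk covers many more fields.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
    ("020", "a"): "identifier",
}


def marc_to_dc(fields):
    """Map a list of (tag, [(subfield code, value), ...]) pairs to DC elements."""
    dc = {}
    for tag, subfields in fields:
        for code, value in subfields:
            element = CROSSWALK.get((tag, code))
            if element is None:
                continue  # unmapped subfields are simply dropped
            # Naive cleanup of trailing ISBD punctuation stored in MARC data;
            # it would also strip a meaningful final period (e.g. "Inc.").
            dc.setdefault(element, []).append(value.rstrip(" /:;,."))
    return dc


# Hypothetical MARC-like record, invented for illustration.
marc_fields = [
    ("100", [("a", "Shakespeare, William,"), ("d", "1564-1616.")]),
    ("245", [("a", "Hamlet /"), ("c", "William Shakespeare.")]),
    ("260", [("a", "New York :"), ("b", "Example Press,"), ("c", "2006.")]),
    ("650", [("a", "Princes"), ("v", "Drama.")]),
    ("650", [("a", "Revenge"), ("v", "Drama.")]),
]

dc_record = marc_to_dc(marc_fields)
print(dc_record)
```

Note how much detail is lost on the way (dates of the author, form subdivisions, place of publication); that loss is typical when a rich scheme is mapped to a simpler one.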
23Metadata
24New Knowledge, Skills, Jobs
- Metadata librarians
- Combination cataloger / metadata librarians
- Knowledge and skills in AACR, MARC, LCSH, OCLC, local systems, etc.
- And in Dublin Core, TEI, MODS, XML, OAI, CONTENTdm, DSpace, etc.
- Working catalogers broadening their job duties or moving into new positions
- New jobs for new LIS graduates
25Digital Projects and Working Catalogers
- Working catalogers are increasingly called upon to contribute to digital library projects
- Digital collections
- Digital institutional repositories
- Creating the metadata
- Selecting metadata standards
- Creating local application guidelines
26Metadata
- Cataloging is metadata
- Interrelated set of content, encoding, controlled vocabulary and classification standards, interfaces, databases
- Metadata
- Other content schemes and controlled vocabularies
- Other encoding standards and interfaces
- Common threads with cataloging
- Value of cataloging knowledge and working experience cannot be overestimated!
- The same issues, problems, challenges arise with other metadata schemes and vocabularies
27Dublin Core
- One of many possible content schemes
- Lowest-common-denominator scheme
- 15 simple elements
- May be enriched using qualifiers
- Widely used, especially for digital collections
- Original purpose vs. actual use today
- Plusses and minuses
- Need for best practice guides
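To make the 15 simple elements concrete, here is a minimal sketch that serializes a record as unqualified Dublin Core in XML using Python's standard library. The element names and namespace URI are those of the Dublin Core element set; the `record` wrapper element and the sample values are assumptions for illustration, not part of any standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of simple (unqualified) Dublin Core.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_dc_xml(record):
    """Serialize {element: [values]} as XML; every element is repeatable."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, values in record.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a simple Dublin Core element: {name}")
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


dc_xml = to_dc_xml({
    "title": ["Harbor lighthouse postcard"],
    "subject": ["Lighthouses", "Postcards"],
    "type": ["Still image"],
})
print(dc_xml)
```

Nothing in the scheme itself says how to enter a name, a date, or a subject term, which is why local application guidelines and best practice guides matter.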
28MODS: Metadata Object Description Schema
- Newer than DC, but starting to take off and grow in popularity as a general content scheme
- Richer than Dublin Core
- Based on MARC, but simpler
- A subset of MARC elements, but using language-based tags
- Designed for XML encoding and communication environment
29VRA: Visual Resources Association Core Categories
- For visual and museum cultural heritage resources
- Related to CDWA: Categories for the Description of Works of Art
- CCO: Cataloging Cultural Objects
- Newly-developing set of rules for creating content for VRA and CDWA elements
- Compare with AACR/RDA; 482 pages
- http://www.vraweb.org/ccoweb/index.html
30XML
- A markup language for computer-processing of data
- Intended for use in the Web environment
- Defines content rather than display
- A meta-language for creating specific definitions of tags and content
- Widely used for most metadata schemes today
31OAI: Open Archives Initiative
- OAI-PMH: OAI Protocol for Metadata Harvesting
- Harvesting metadata from diverse repositories and aggregating in a searchable database
- OAIster is the best-known and best-developed repository to date
- Can accept metadata in a variety of formats, but requires simple Dublin Core in XML as lowest common denominator
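A harvest begins with an ordinary HTTP request to the repository. The sketch below only builds such a request URL; it does not contact any server. `ListRecords`, `metadataPrefix`, `set`, `from`, and the `oai_dc` prefix are part of OAI-PMH, while the function name and the repository address are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc (simple Dublin Core) is the format every repository must support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # harvest one collection only
    if from_date:
        params["from"] = from_date    # records added or changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository base URL, used only for illustration.
url = list_records_url("https://repository.example.edu/oai",
                       from_date="2006-01-01")
print(url)
```

A harvester would fetch this URL, parse the XML response, and repeat the request with the resumption token the server returns until the full set has been collected.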
32Software and Interfaces
- CONTENTdm
- Widely-used software for building digital collections
- Includes interface for designing and creating metadata for digital collections
- DSpace
- Widely-used software for building digital institutional repositories
- Also includes metadata components
33Common Threads
- Metadata content elements
- AACR2, RDA, Dublin Core, MODS, VRA
- Application rules and guidelines
- AACR2, RDA, CCO, various DC best practice guides
- Machine encoding and communication
- MARC, XML
- Back-end interfaces
- OCLC Connexion, Endeavor, Innovative, CONTENTdm, DSpace
- Front-end user interfaces
- Web-based OPACs, digital collections and repository interfaces
- Controlled vocabularies
- LCSH, LCC, DDC, AAT (Art & Architecture Thesaurus), LCTGM (LC Thesaurus for Graphic Materials), TGN (Getty Thesaurus of Geographic Names), codes for languages and places
34Information Architecture
35Information Architecture
- Design of front-end user interface
- Based on underlying back-end database
- Metadata
- Controlled vocabularies
- And usability principles and testing
- Can include
- Corporate intranets
- Public Web sites
- Digital collection interfaces
- Online catalog design
36Organization Schemes
- Exact
- alphabetical, chronological, geographical, etc.
- Ambiguous
- topics, audience, tasks
37Organization Structures
- Top-down
- Based on hierarchy or taxonomy
- Bottom-up
- Based on tagged metadata
- Apply structure and power of relational databases
38Navigation Systems
- Searching
- Keywords, full text
- Fielded searching of tagged metadata
- Browsing
- Pre-selected categories (classification)
- Facets
- Search or browse by combining several facets
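Faceted navigation depends on consistently tagged metadata in the back end. As a rough sketch, each record carries values for a few facets, the interface filters on whatever combination the user has selected, and it shows counts for the choices that remain. The facet names and sample records are assumptions invented for the example.

```python
from collections import Counter

# Hypothetical digital-collection records with three facets each.
records = [
    {"title": "Harbor lighthouse postcard", "format": "still image",
     "subject": "Lighthouses", "decade": "1900s"},
    {"title": "Lighthouse keeper oral history", "format": "sound",
     "subject": "Lighthouses", "decade": "1970s"},
    {"title": "Main Street parade", "format": "still image",
     "subject": "Parades", "decade": "1900s"},
    {"title": "County fair film", "format": "moving image",
     "subject": "Fairs", "decade": "1950s"},
]


def filter_by_facets(items, **selected):
    """Keep records that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count the values of one facet, for display next to each browse link."""
    return Counter(item[facet] for item in items if facet in item)


# Combine two facets, then show which subjects remain to narrow by.
hits = filter_by_facets(records, format="still image", decade="1900s")
print([hit["title"] for hit in hits])
print(facet_counts(hits, "subject"))
```

The counts only make sense when the facet values come from a controlled vocabulary; free-text variants of the same subject would split into separate, misleading entries.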
39Taxonomies and Thesauri
- For organization
- For retrieval
- For navigation
- Search and browse
40Online Thesaurus Example
- State of Minnesota Thesaurus
- http://www.state.mn.us/portal/mn/jsp/bridges/thesaurus.jsp
- Searchable and browseable
41Thesaurus in Action
- Example: PubMed
- http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?DB=pubmed
- Search for "African Sleeping Sickness" as a natural language phrase
- Click on Details tab
- Query is matched against synonym ring and translated into National Library of Medicine's Medical Subject Headings (MeSH) controlled vocabulary terms
- This mapping takes place behind the scenes; the user sees it only by selecting "Details," and otherwise may be puzzled why their query term does not appear in the list of MeSH headings shown with the full metadata display for an item.
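The behind-the-scenes translation can be sketched as a synonym-ring lookup. This is a toy illustration of the idea, not PubMed's actual term-mapping algorithm; "Trypanosomiasis, African" is the MeSH heading for this disease, but the ring of variants and the function name are assumptions.

```python
# One synonym ring: a preferred heading and the entry terms that lead to it.
# The variants listed here are illustrative, not copied from MeSH.
SYNONYM_RINGS = {
    "Trypanosomiasis, African": {
        "african sleeping sickness",
        "sleeping sickness",
        "african trypanosomiasis",
    },
}

# Invert the rings so each variant points to its preferred heading.
ENTRY_TERMS = {
    variant: heading
    for heading, variants in SYNONYM_RINGS.items()
    for variant in variants
}


def translate_query(query):
    """Expand a natural-language phrase into a controlled-vocabulary search."""
    normalized = " ".join(query.lower().split())
    heading = ENTRY_TERMS.get(normalized)
    if heading is None:
        # No match in any ring: fall back to a plain keyword search.
        return f'"{normalized}"[All Fields]'
    # Search the controlled heading and keep the user's own words as well.
    return f'"{heading}"[MeSH Terms] OR "{normalized}"[All Fields]'


print(translate_query("African Sleeping Sickness"))
```

The user typed one phrase, but the retrieved records are indexed under a different heading, which is exactly the mismatch the Details tab exposes.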
42A Visible Taxonomy
- NCBI Entrez Taxonomy homepage
- http://0-www.ncbi.nlm.nih.gov.csulib.ctstateu.edu/Taxonomy/
- The user can click on any of the terms and open up the hierarchy and drill down to see the full taxonomic structure.
- Useful for medical researchers and scientists, perhaps not of tremendous interest to the general public.
43Visible Taxonomies
- Amazon.com uses explicit taxonomy / classification
- http://www.amazon.com
- Amazon offers browsing and searching by controlled subject terms arranged in hierarchical classifications or taxonomies.
44Amazon.com example
45OPAC Re-design
- Next Generation Catalogs
- Metadata (MARC bibliographic data) is the same (data in back-end database)
- Architecture of the OPAC (front-end user interface) is different
49New Careers in IA
- Libraries, archives, museums
- Business enterprises
- Intranets
- Web sites
50Conclusion
- It's a new world, but one in which time-tested
principles and hands-on experience are more vital
than ever.