Title: Metadata
1.
Ontologies
Knowledge Organization
Indexing
Information Architecture
Metadata
Information Organization
Categories
Bibliographic Control
Taxonomies
Classification
Dr. Sherry Vellucci Information Organization
Documentation
Concept Maps
Knowledge Representation
Cataloging
?
Abstracting
U B C
2.
- "In the colossal labor, which exhausts both body and soul, of making into an alphabetical catalog a multitude of books gathered from every corner of the earth there are many intricate and difficult problems that torture the mind."
- Thomas Hyde, Catalogue for the Bodleian Library, 1674.
3. Why Organize Information?
4. What is Information Organization?
- The process of creating, arranging, and maintaining systems for bibliographic information retrieval
- Organization of the materials and information that we collect or provide access to in libraries, museums, archives, and information centers
- Information organization differs depending on the environment
5. Functions of Information Organization
- Primary
- Provide access to recorded information for the purpose of retrieval
- Bring together related documents
- Distinguish between similar documents
- Secondary
- Keep inventory of what we have and where it is located
- Keep recorded information usable for posterity
6. Subsets of Information Organization
- Cataloging and metadata
- Classification
- Indexing and abstracting
- Database design
- Information architecture
- Content management
- Knowledge management
7. Trends in Catalog Creation
- Ancient times: Simple lists
- Middle Ages: Inventories
- Sixteenth and Seventeenth Centuries: Finding lists
- Eighteenth Century: Codification begins
- Nineteenth Century: Collocating devices
- Twentieth Century: Expanded codification and mechanization
- Twenty-first Century: ?
8. What Is a Catalog?
- A retrieval tool that provides access to individual items within collections of information packages (Taylor, 1999)
- An organized set of bibliographic records that represent the holdings of a particular collection (Wynar)
9. Bibliographic (Metadata) Records
- Surrogates for information packages in the collection
- Include standardized descriptions
- Form a catalog when arranged or accessed systematically
- (Also called bibliographic records, catalog records, entries)
10. Access Points
- Any term in a metadata record that may be used to locate that record
- A controlled access point
- An authorized (preferred) form of access point
- Constructed with information in a certain order
- Maintained under authority control
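The idea of a controlled access point can be sketched as a lookup from variant name forms to a single authorized form. This is only a minimal illustration, not a real authority file: the `AUTHORITY` table, the variant spellings, and the `authorized_form` helper are assumptions made for the example. Only the authorized heading itself is taken from the Tolkien records used elsewhere in this deck.

```python
from typing import Optional

# Authorized (preferred) heading, as it appears in the deck's Tolkien examples.
TOLKIEN = "Tolkien, J.R.R. (John Ronald Reuel), 1892-1973"

# Hypothetical miniature authority file: variant form -> authorized form.
AUTHORITY = {
    "tolkien, j.r.r.": TOLKIEN,
    "tolkien, john ronald reuel": TOLKIEN,
    "j.r.r. tolkien": TOLKIEN,
}


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def authorized_form(name: str) -> Optional[str]:
    """Return the authorized heading for a variant, or None if it is uncontrolled."""
    return AUTHORITY.get(normalize(name))
```

Under this sketch, a search on any listed variant resolves to the same heading, which is what lets the catalog bring together everything entered under that heading.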
11. Types of Bibliographic Control
- Control of a Body of Literature
- Indexes (and abstracts)
- Bibliographies
- Control of Collections
- Catalogs
- Finding Aids
- Museum Registers
- Control of Knowledge
- Knowledge Management
12. Levels of Access
- Macro level access
- Broad in scope
- entire book
- complete serial
- complete archival collection
- Macro level tools
- Catalogs
- Micro level access
- Narrower in scope of description
- Chapter in book
- Article in serial
- Individual items in archive or museum
- Micro level tools
- Indexes
- Abstracting services
- Databases
13. Cutter's Objects of the Catalog
- 1) To enable a person to find a book when one of the following is known
- The author
- The title
- The subject
- 2) To show what the library has
- By a given author
- On a given subject
- In a given kind of literature
14.
- 3) To assist in the choice of a book
- As to the edition (bibliographically)
- As to its character (literary or topical)
- From Rules for a Dictionary Catalog, 1876; 4th ed., 1904
- 1. Find
- 2. Collocate
- 3. Evaluate
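The first two objectives, find and collocate, can be sketched over a handful of surrogate records. This is an illustrative sketch only: the `Record` fields and the `find` and `collocate` helpers are hypothetical names, and the sample data reuses titles and dates that appear in this deck's Tolkien examples.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    """A minimal surrogate record with three access points."""
    author: str
    title: str
    year: str


CATALOG = [
    Record("Tolkien, J.R.R.", "The Lord of the Rings", "1954-55"),
    Record("Tolkien, J.R.R.", "The Lord of the Rings", "1965"),
    Record("Beard, Henry N.", "Bored of the Rings", "1969"),
]


def find(catalog: List[Record], **known: str) -> List[Record]:
    """Objective 1: locate records when an author, title, etc. is known."""
    return [r for r in catalog
            if all(getattr(r, k) == v for k, v in known.items())]


def collocate(catalog: List[Record], key: str) -> Dict[str, List[Record]]:
    """Objective 2: bring together all records sharing one access point."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for record in catalog:
        groups[getattr(record, key)].append(record)
    return dict(groups)
```

For example, `find(CATALOG, title="Bored of the Rings")` returns a single record, while `collocate(CATALOG, "author")` groups both Tolkien records under one heading. The third objective, evaluate, is left to the person reading the retrieved records.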
15. FRBR User Tasks
- Find (locate)
- Relate/Navigate (Collocate, per Svenonius)
- Identify
- Select
- Obtain
- Other possible tasks
- Attribute Royalties to
- Preserve
16. Assumptions
- Objective 1: User can express the information need and translate it into the language of the system
- Objective 2: User's need requires looking at related sets of information (all documents by a given author, on a given subject, in a certain genre)
- Objective 3: User finds multiple manifestations of a work and needs to evaluate the surrogate in order to select the appropriate document
17. Problems
- How do we operationalize open-ended objectives?
- Success of objective must be measurable
- To be measurable, must be specific
18. Intellectual Issues
- Representation: concise depiction of complex information
- Document surrogates
- Describe attributes of the document
- Classification: a scheme for organizing information packages or concepts
19. Problem: What Are We Organizing?
- Recorded information: meaningful symbols (letters, numbers, etc.), sounds, or images created or collected to convey a message
- Why do we use the term "recorded information" instead of just "information"?
- Document: an information package
- Often associated with text printed on paper
- Broader context includes videos, sound
recordings, graphics, computer files, etc.
20. Functional Requirements for Bibliographic Records
- What FRBR is
- a logical framework
- a conceptual model
- a "generalized" view of the bibliographic universe
- Available at http://www.ifla.org/VII/s13/frbr/frbr.htm
- What FRBR is not
- a data model
- an implementation model
- a conceptual model for authority records
- a conceptual model for subjects
21. FRBR Functions
- Specifically identify what is being described
- Improve catalog displays
- Provide common conceptual model language
22. Entity-Relationship Model
[Diagram: Entity 1 and Entity 2, each with Attributes, linked by a Relationship]
- Group 1 Entities: Products of intellectual or artistic endeavour
- Group 2 Entities: Those responsible for the intellectual and artistic content, physical production, or custodianship
- Group 3 Entities: Entities that serve as subjects of intellectual or artistic endeavour
23. Group 1 Entities and Their Relationships
An Expression realizes a Work
A Work is realized through an Expression
An Expression is embodied in a Manifestation
A Manifestation embodies an Expression
An Item exemplifies a Manifestation
A Manifestation is exemplified by an Item
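The chain of Group 1 relationships can be sketched as nested data structures. FRBR is a conceptual model rather than a data model, so this is only one possible illustration and not a prescribed implementation: the class fields and the `manifestations_of` helper are assumptions, and the sample data reuses the Tolkien publications cited in this deck.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar (copy) of a manifestation."""
    description: str


@dataclass
class Manifestation:
    """The physical embodiment of an expression, e.g. one publication."""
    publisher: str
    date: str
    items: List[Item] = field(default_factory=list)  # "is exemplified by"


@dataclass
class Expression:
    """A realization of a work, e.g. a text in one language."""
    form: str
    manifestations: List[Manifestation] = field(default_factory=list)  # "is embodied in"


@dataclass
class Work:
    """The abstract intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)  # "is realized through"


def manifestations_of(work: Work) -> List[Manifestation]:
    """Walk Work -> Expression -> Manifestation and flatten the result."""
    return [m for e in work.expressions for m in e.manifestations]


lotr = Work("The Lord of the Rings", [
    Expression("English text", [
        Manifestation("Allen & Unwin", "1954-55"),
        Manifestation("Facsimile Reprints", "1965"),
        Manifestation("Harper Collins", "1998"),
    ]),
    Expression("German text", [Manifestation("Ernst Klett", "1968")]),
])
```

Calling `manifestations_of(lotr)` gathers every publication of the work regardless of language, which is the kind of grouping a FRBR-based catalog display aims for.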
24. LS vs. IS Terminology Comparison
FRBR Terms       I. S. Terms
Work             Message
Expression       Text
Manifestation    Document
Item             Instantiation
25. Work/Expression/Manifestation/Item Relationships
Tolkien
Work
- W1 The Lord of the Rings
Expressions
- E1 English Text: The Lord of the Rings
- E2 German Text: Der Herr der Ringe. Translated by Margaret Carroux
- E3 Spoken Word Performance: The Lord of the Rings. Read by Ian Holm
Manifestations
- E1, M1 English: The Lord of the Rings. London: Allen & Unwin, 1954-55, 3 v.
- E1, M2 English: The Lord of the Rings. New York: Facsimile Reprints, 1965
- E1, M3 English: The Lord of the Rings. London: Harper Collins, 1998, 3 v.
- E2, M1 German: Der Herr der Ringe. Translated by Margaret Carroux. Stuttgart: Ernst Klett, 1968, 3 v.
- E3, M1 Sound Recording: The Lord of the Rings. Read by Ian Holm. BBC Audiobooks, 2003, 13 compact discs
Item
- I1 VUW Library, Copy 1, signed by the author
26. Bibliographic Relationships
- Equivalent
- Derivative
- Descriptive
- Whole-part
- Sequential
- Accompanying
- Shared characteristics
- Barbara Tillett
- Richard Smiraglia
- Sherry Vellucci
- Allyson Carlyle
27.
Barbara B. Tillett, "Bibliographic Relationships." In Relationships in the Organization of Knowledge, edited by Carol A. Bean and Rebecca Green, 19-35. Dordrecht: Kluwer Academic Publishers, 2001.
[Diagram: Family of Works, with regions labeled Same Expression, New Expression, and New Work. B. Tillett, Dec. 2001]
28. Equivalent Relationships
- Multiple manifestations with identical content
- W1 The Lord of the Rings
- E1 English language text
- M1 Allen & Unwin, 1954-55.
- M2 Facsimile Reprints, Inc., 1965.
- M3 Harper Collins, 1998.
- Tolkien, J.R.R. (John Ronald Reuel), 1892-1973. The Lord of the Rings
- Books: English
- London: Allen & Unwin, 1954-55.
- New York: Facsimile Reprints, Inc., 1965.
- London: Harper Collins, 1998.
29. Derivative Relationships: Same Work
- Editions
- Translations
- Performances
- Tolkien, J.R.R. (John Ronald Reuel), 1892-1973. The Lord of the Rings
- E1 Books: German
- M1 Trans. by Margaret Carroux. Stuttgart: Ernst Klett, 1968.
- E2 Spoken word recording: English
- M1 London: BBC Audio Books, 2003.
30. Derivative Relationships: New Works
- Parodies
- Adaptations
- Beard, Henry N. Bored of the Rings: A Parody of J.R.R. Tolkien's The Lord of the Rings. New York: New American Library, 1969.
- Strachey, Barbara. Journeys with Frodo: An Atlas of J.R.R. Tolkien's The Lord of the Rings. London: Grafton, 1992.
- The Lord of the Rings. Screenplay by Fran Walsh, Philippa Boyens and Peter Jackson; based on the books by J.R.R. Tolkien; produced by Barrie M. Osborne, Peter Jackson, Fran Walsh, Tim Sanders; directed by Peter Jackson. London?: New Line Cinema, 2002.
- Knizia, Reiner. The Lord of the Rings Board Game. Illustrations by John Howe. Cambridge: Sophisticated Games, 2001.
31. Whole-Part Relationships
- Components
- Aggregates
- The Lord of the Rings: aggregate work (work of works)
- The Fellowship of the Ring: component part work
- The Two Towers: component part work
- The Return of the King: component part work
- The Lord of the Rings Game
- contains 2 books, 2 map sheets, 9 character
sheets, rules, contents sheets, 4 red dice,
cardboard counters, map errata
32. Sequential Relationships
- Part to part (or chronological) relationship
- Part 1: The Fellowship of the Ring
- Part 2: The Two Towers
- Part 3: The Return of the King
- The Lord of the Rings Official Fan Club Magazine
- Vol. 1, no. 1; vol. 1, no. 2
33. Accompanying Relationships
- Manifestation is accompanied by additional material
- Shore, Howard. The Lord of the Rings: The Motion Picture Trilogy: Instrumental Solos. Music arranged for trombone by Tod Edmonsen. Miami: Warner Bros, 2004. 1 part (25 p.) + 1 sound disc (4 ¾ in.)
- The Lord of the Rings. Extended edition includes 4 DVDs: 1) Part One; 2) Part Two; 3) Appendices Part One: From Book to Vision; 4) Appendices Part Two: From Vision to Reality. 1 booklet with explanation of the extended edition; documentary appendices on the making of the movie; complete listing of scenes, with new scenes and extended scenes identified; and diagrams detailing how the book was transformed into visual form.
34. Descriptive Relationships: New Works
- Simpson, Dale. Modernized Myth: Beowulf, J.R.R. Tolkien and the Lord of the Rings.
- Miesel, Sandra. Myth, Symbol and Religion in the Lord of the Rings.
- Smith, Jim E. The Lord of the Rings: The Films, the Books, the Radio Series.
- Fisher, Jude. The Lord of the Rings: Location Guidebook.
- Astin, Sean. There and Back Again: Behind-the-Scenes on the Lord of the Rings.
35. FRBR Group 2 Entities
- The Group 2 entities represent those responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship of the entities in the first group (FRBR, p. 13)
- Group 2 entities include
- Persons
- Corporate bodies
36. Relationships of FRBR Group 1 Entities to FRBR Group 2 Entities (FRBR, p. 14)
[Diagram: Group 1 Entities linked to Group 2 Entities]
37. Group 1 and Group 2 Relationships
- w1 The Lord of the Rings
- created by
- p1 J.R.R. Tolkien
- e1 The Lord of the Rings, spoken word recording
- performed by
- p2 Ian Holm
- m1 The Lord of the Rings, motion picture, 2002
- distributed by
- cb1 New Line Cinema Home Entertainment
- i1 The Lord of the Rings, published English text, 1965
- owned by
- cb2 Victoria University Library
38. FRBR Group 3 Entities
- The Group 3 entities serve as the subjects of works
- The group includes
- concept (an abstract notion or idea)
- object (a material thing)
- event (an action or occurrence)
- place (a location)
- In addition, all entities in Groups 1 and 2 can
serve as subjects for a work
39. FRBR Relationships of a Work to entities that can serve as the subject of a work (FRBR, p. 15)
40. Group 1 and Group 3 Relationships
- c1 Mythology
- w1 J.R.R. Tolkien, The Lord of the Rings
- is the subject of
- w2 The Lord of the Rings: An Examination of Mythical Elements, by M.C. Stone
(FRBR, p. 63)
41. Information Representation
- Organized by a special purpose language (ontologies and taxonomies)
- Many such languages exist
- Linnaeus: taxonomy of living things
- Educational resources thesaurus
- Bibliographic language
- Subject language
- Document language
42. Information Organization in Libraries
- Traditional processes
- Organize items on shelf by classification
- Create and maintain catalog that provides access to information resources (surrogate records)
- Create indexes and databases
- Create bibliographies
- New processes
- Create library portals
- Provide access to variety of resources through unified interface
- Catalog, databases, resource links, archives, digital libraries, etc.
- Customize for personal information ("my library")
- Create and organize digital libraries
43. Information Organization in Archives
- Organize and arrange in groups by provenance (originator) and original order (closed stacks)
- Create accession record (information about collection source and physical content) and finding aid (contents of collection)
44. Information Organization in Museums
- Organize and describe objects in collection
- Create accession/field records (info. about source of object) and register (similar to catalog)
- Description of visual objects is more complex than text
- May also have libraries (include textual material) and archives in museums
45. Information Organization on the Internet
- Libraries
- Web bibliographies (Subject, Classification)
- Metadata (MARC, Dublin Core)
- Non-Libraries
- Search engines
- Subject directories
- Automatic indexing and classification
- Visual Organization
- Concept maps
- Ontologies
- Taxonomies
46. Information Organization for Digital Libraries
- Provides digitized resources with architecture and retrieval service
- Design of retrieval and description system is part of creating the digital library
- Increasing demand with distance education
47. Information Organization with Library Portals
- Provide access to variety of resources through unified interface
- Catalog, databases, resource links
- Customizable for personal information
48. Information Architecture
- Process of designing, implementing and evaluating information spaces that are humanly and socially acceptable to the intended stakeholders (Andrew Dillon)
- Determine information needs of users
- Create structural patterns for finding information
- Develop user interface for information retrieval and display
- Evaluate success of architecture for retrieval and display
49. Records Management
- Originally involved keeping, filing, and maintaining paper records
- Computer files on individual PCs created organizational problems
- Various systems used across organization (payroll, general ledger, accounts payable, inventories)
- Data modeling used to create conceptual model of records management activities (directories, files, programs, database field values)
50. Knowledge Management
- Identifying who knows what in an organization and capturing that knowledge using technology
- Expanded into managing the information explosion in organizations
- Tacit knowledge vs. explicit knowledge
- Software used to create knowledge repositories, improve knowledge access, enhance the knowledge environment, manage knowledge as an asset
51. Metadata
- Data about data
- Structured data that describes the attributes of a resource, characterizes its relationships, supports its discovery, management, and effective use, and exists in an electronic environment
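As a concrete illustration of "structured data that describes the attributes of a resource", the sketch below builds a small descriptive record using element names from the Dublin Core Metadata Element Set (title, creator, publisher, date, language, type) and renders it as HTML `<meta>` tags. The values come from the Harper Collins edition cited in this deck; the language code, the type value, and the `to_meta_tags` helper are assumptions made for the example.

```python
from html import escape
from typing import Dict, List

# A descriptive record keyed by Dublin Core element names.
RECORD: Dict[str, str] = {
    "title": "The Lord of the Rings",
    "creator": "Tolkien, J.R.R. (John Ronald Reuel), 1892-1973",
    "publisher": "Harper Collins",
    "date": "1998",
    "language": "en",   # assumed value for illustration
    "type": "Text",     # assumed value for illustration
}


def to_meta_tags(record: Dict[str, str]) -> List[str]:
    """Render each element as an HTML meta tag, escaping the content value."""
    return [
        f'<meta name="DC.{element}" content="{escape(value, quote=True)}">'
        for element, value in record.items()
    ]


for tag in to_meta_tags(RECORD):
    print(tag)
```

The same record could equally be serialized as XML or stored as database fields. The point of the sketch is that each value is labeled with the attribute it describes.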
52. The Structure of Information
- Structured Data
- Data has context and description
Q7 Timetable: Manhattan to Queens. Weekends only.
Departs Times Square    Departs Queens Plaza    Arrives Jamaica Station
6:58                    7:13                    7:32
7:15                    7:30                    7:49
[Figure: the same values shown without structure: Queens Plaza, 658, Jamaica Station, 732, Q7, Times Sq., 713]
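The timetable shows why structure matters: once each value is attached to a named field, questions can be answered by a program as well as by a reader. The sketch below is an illustration under stated assumptions. It reads the slide's values as two trips (6:58, 7:13, 7:32 and 7:15, 7:30, 7:49), writes the times with colons, and uses hypothetical names (`Trip`, `to_minutes`, `travel_minutes`).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Trip:
    """One row of the Q7 weekend timetable; field names give each time its context."""
    departs_times_square: str
    departs_queens_plaza: str
    arrives_jamaica_station: str


Q7_WEEKEND: List[Trip] = [
    Trip("6:58", "7:13", "7:32"),
    Trip("7:15", "7:30", "7:49"),
]


def to_minutes(clock: str) -> int:
    """Convert an 'H:MM' string to minutes after midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def travel_minutes(trip: Trip) -> int:
    """Total time from Times Square to Jamaica Station for one trip."""
    return to_minutes(trip.arrives_jamaica_station) - to_minutes(trip.departs_times_square)


for trip in Q7_WEEKEND:
    print(trip.departs_times_square, "->", trip.arrives_jamaica_station,
          f"({travel_minutes(trip)} min)")
```

The bare values "658", "732", and "713" on their own support no such calculation, because nothing says which station or event each number belongs to.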
53. Model of an Information Retrieval System
- Major function of an IR system is to act as an interface between a particular population of users and the universe of information resources in printed or other form. (Lancaster)
- Activities of IR Systems
- Acquire and store documents (or surrogates)
- Organize and control documents (or surrogates)
- Distribute documents (or surrogates)
54. Subject Analysis Is . . .
- The part of indexing or cataloging that deals with, first, the conceptual analysis of an information package
- and with translating the conceptual analysis into the conceptual framework of the classification or subject heading system (Taylor, p. 132)
55. Step 1: Conceptual Analysis
- determining what the information package is about
- and/or determining what an item is
- An indexer experienced with a controlled vocabulary may think of aboutness in the terms available
56. Problems in Determining Subject
- Deciding aboutness is subjective
- Predominance?
- Frequency?
- Deciding aboutness may depend on culture, background and knowledge of cataloger
- Behaviorally private
- Socially common ideas
- Grammatically different terms and concepts
- Deciding interpretive, thematic, or iconographic significance for non-textual material requires a specialist
57. Determining Form
- Form data are terms and phrases that designate specific kinds of genres or materials (Taylor, p. 255)
- Types of form
- Physical character
- Videocassettes, photographs, maps
- Type of data contained
- Text, visual, audio, numeric
- Arrangement of information contained
- Encyclopedias, dictionaries, indexes, diaries, outlines
- Style, technique, purpose or intended audience
- Drama, romance, cartoons, algebra text
58. Exhaustivity
- The number of terms that will be assigned by the cataloger/indexer
- Determined by local policy and desired level of bibliographic control
59. Dimensions of Exhaustivity
- Summarization Level
- Describes the overall subject content of the work as a whole, i.e., the dominant subject
- Cataloging is at summarization level
- Assign fewer, more general terms
- Depth Level
- Describes all main concepts of subject, including smaller units of information, i.e., chapters, articles, etc.
- Indexing is at depth level
- Assign more specific terms
Information Retrieval
Document Retrieval
60. Specificity
- The level of subject analysis provided for by a particular controlled vocabulary
- The closeness of fit between the meaning of an index term and the document's themes and/or subthemes
- The Care and Feeding of Siamese Cats
- Low specificity: Felines
- High specificity: Siamese cats
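One way to make specificity concrete is to treat a controlled vocabulary as a hierarchy of broader terms and count how far a term sits below the top. The sketch below assumes a tiny, invented hierarchy (Animals > Felines > Cats > Siamese cats), and the `depth` and `most_specific` helpers are hypothetical. Real subject heading systems are far richer and do not reduce specificity to a single number.

```python
from typing import Dict, Iterable

# Hypothetical vocabulary fragment: term -> its broader term.
BROADER: Dict[str, str] = {
    "Siamese cats": "Cats",
    "Cats": "Felines",
    "Felines": "Animals",
}


def depth(term: str) -> int:
    """Number of steps from the term up to the top of the hierarchy."""
    steps = 0
    while term in BROADER:
        term = BROADER[term]
        steps += 1
    return steps


def most_specific(candidates: Iterable[str]) -> str:
    """Pick the candidate heading that sits deepest in the hierarchy."""
    return max(candidates, key=depth)


print(most_specific(["Felines", "Siamese cats"]))
```

For a book on the care and feeding of Siamese cats, this sketch would prefer "Siamese cats" over "Felines", matching the high-specificity choice on the slide.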
61. Classification
- Oldest form of information organization (Aristotle)
- Based on thought process
- Mental models
- classify
- associate
- bring like things together
- Differentiate among things
- Primary types: hierarchical, faceted
- Often associated with coding of some type
- Symbols (numbers, letters, punctuation)
62. Theories of Categories
- Classical theory of categories based on commonalities
- 20th Century theories
- Family resemblance (Wittgenstein and Austin)
- Fuzzy Set Theory (Zadeh)
- Distinct categories/cultural and linguistic differences (Lounsbury; Berlin and Kay)
- Basic-level Categories (Brown)
- Universal level of human naming (Berlin)
- Prototype Theory (Rosch)
- Musical instruments
63. Bibliographic Classifications Differ from Taxonomic Groupings
- Documents are complex
- Have combinations of topics, not just mutually exclusive, generic relationships
- Documents classified based on literary warrant
- Document arrangement can only be one-dimensional (linear order), i.e., show one kind of relationship
- Need catalogue to supplement shelf order
64. For Whom Are We Organizing Information?
- Users: people who have an information need
- Users vary
- Experts
- librarians, information professionals, researchers
- people who know a domain and have some idea of vocabulary and the kind of information that's likely to be available
- Novices
- people who never learned to use retrieval tools
- people who only have a vague idea of what they're looking for, e.g., a student assigned a research topic or a person who just found out that their relative has an obscure disease
65. Problems with Information Organization
- Catalogers focus on bibliographic and authority control and languages
- Accurate description does not always lead to successful query results
- Does not link cataloging process with knowledge base of information retrieval
66. Understanding Users' Perspectives
- Move from system-centered to user-centered views of information systems
- Designed for the user based on user input: bottom up rather than top down
- Needs research into user needs, user modelling, and catalog information-seeking behavior
67. Broadened Perspective
- Metadata has brought information organization onto center stage
- Provides information that goes beyond description (administrative, structural, etc.)
- Focuses primarily on digital information
- Adopts/integrates use of search engines
- Objectives can be operationalized, connected and measured
- Representation
- Visualization
- Searching
- Interface usability
68.
- Metadata has become important to businesses
- Part of knowledge management
- Often used in proprietary systems