Title: Surrogates for Retrieval
1Surrogates for Retrieval
LIS 502 September 26, 2008
2[Diagram: information, analysis, translation, database, standards, translation, analysis, queries]
3[Diagram: information, analysis, create surrogates, translation, database, standards, translation, analysis, queries]
4Creating surrogates
- Analysis
- Translation
  - Code?
  - Relevant info from characteristics of info package
  - Selected info placed in order/format dictated by set of rules/conventions
- Standards
  - Standardized code that guides analysis (codes in libraries: arbitrary or conventional?)
  - Ensures concepts are expressed in standard manner
  - Dictates format
- Surrogate
  - Stands in the place of the info package
  - A representation of the content and characteristics
  - Provides descriptive data and access points
5Bibliographic Control
- The term bibliographic control refers to the operations by which recorded information is organized or arranged according to established standards and thereby made readily retrievable.
- Lois Mai Chan. (1994). Cataloging and Classification: An Introduction (2nd ed.). New York: McGraw-Hill. p. 3
6Charles Ammi Cutter
[Image of Charles Ammi Cutter, courtesy of the University of Maryland website]
- Identified three objects, or purposes, of the library catalogue; together they fulfill the definition of bibliographic control.
7Cutter's Objects of the Catalogue
- 1. To enable a person to find a book of which either (finding)
  - (A) the author
  - (B) the title
  - (C) the subject is known.
- 2. To show what the library has (gathering)
  - (D) by a given author
  - (E) on a given subject
  - (F) in a given kind of literature.
- 3. To assist in the choice of a book (advisory)
  - (G) as to its edition (bibliographically).
  - (H) as to its character (literary or topical).
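The finding and gathering objects can be sketched as one lookup over a list of surrogates. This is only an illustration: the `Surrogate` class, its field names, and the `search` function are hypothetical, and the subject heading on the second sample record is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Surrogate:
    """A minimal stand-in for a catalogue record (field names are illustrative)."""
    author: str
    title: str
    subjects: tuple = ()


# Sample data; the subject heading on the second record is invented for illustration.
CATALOGUE = [
    Surrogate("Harris, Roma M.",
              "Librarianship: the erosion of a woman's profession",
              ("Women librarians", "Women in library science")),
    Surrogate("Rowling, J.K.",
              "Harry Potter and the Chamber of Secrets",
              ("Fantasy fiction",)),
]


def search(catalogue, author=None, title=None, subject=None):
    """Return every surrogate that matches all of the criteria supplied."""
    hits = []
    for record in catalogue:
        if author is not None and record.author != author:
            continue
        if title is not None and record.title != title:
            continue
        if subject is not None and subject not in record.subjects:
            continue
        hits.append(record)
    return hits


by_author = search(CATALOGUE, author="Harris, Roma M.")
by_subject = search(CATALOGUE, subject="Women librarians")
```

Looking up one known item by author, title, or subject corresponds to the finding function; listing everything by a given author or on a given subject corresponds to gathering. The advisory function depends on descriptive detail (edition, character) that this sketch leaves out.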
8Tools of bibliographic control
- Bibliographic utilities: OCLC (WorldCat)
- Indexing and abstracting services (LISTA)
- Library catalogues (different formats, NEOS)
- Bibliographies
- References to other works: footnotes, etc.
9Bibliography example
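- Rowling, J.K. Harry Potter and the Chamber of Secrets. New York: Scholastic, 1999.
- Labelled parts: Author (Rowling, J.K.), Title (Harry Potter and the Chamber of Secrets), Place of publication (New York), Publisher (Scholastic), Year of publication (1999)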
10Standards as a means of achieving bibliographic
control
- Purpose of standards
  - To make surrogates complete and predictable for finding and advisory functions
  - To make surrogates compatible with each other and suitable for gathering in a database
  - To allow exchange of data
- Standards: created by different communities; serve as style manuals
11Standards for creating surrogates
- Description of the item
- Encoding the surrogate
- Options
- Determine descriptive content and then encode
- Begin with shell (encoding standard) and then
fill in the contents of each field
12[Diagram: information, analysis, Metadata Surrogates, translation, database, standards, translation, analysis, queries]
13What is Metadata?
- Data about data
- Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource
- Facilitates resource description and discovery
- Facilitates interoperability across different systems
- It is mainly used in the context of web-based resources
14Descriptive Data
- Data that is (1) derived from an information package (e.g. book, article, sound recording, Internet document, etc.) and (2) used to describe the information package (e.g. its title, author, date of publication, extent, and notes identifying pertinent features).
15Standards for Description
- International Standard Bibliographic Description (ISBD): goal is to facilitate international exchange of cataloguing records
- Anglo-American Cataloguing Rules, 2nd ed., revised (AACR2R): AACR was first issued in 1967, the second edition followed in 1978, and it was revised in 2002; used in a large number of libraries in Canada
- Archival: APPM (Archives, Personal Papers, and Manuscripts), for the description of archival materials, based on AACR2
- Dublin Core: metadata element set
- Visual Resources Association metadata: for describing artistic and architectural materials
16Standards for Description (cont.)
- GILS (Government Information Locator Service): US federal agencies
- FGDC (Federal Geographic Data Committee): set of metadata for describing geospatial data
- EAD (Encoded Archival Description): for archival material, intended to function as a library record
17Anglo-American Cataloguing Rules (AACR2)
- Is a description standard
- Set of rules designed for use in the construction of catalogues
- The rules cover the description of, and the provision of access points for, all library materials
18Describing an example
- Harris, Roma M.
- Librarianship : the erosion of a woman's profession / Roma M. Harris. -- Norwood, N.J. : Ablex Pub. Corp., c1992.
- xiv, 186 p. ; 24 cm. -- (Information management, policy, and services)
- Includes bibliographical references (p. 165-177) and index.
- ISBN 0893919411 (pp)
- 1. Women in library science. 2. Sex discrimination against women. 3. Sex discrimination in employment. 4. Library science -- Social aspects. 5. Librarians -- Professional ethics. 6. Women in information science. 7. Women information scientists. 8. Women librarians. I. Title.
19Dublin Core Metadata Element Set
- About 50 people discussed the issue of creating a set of elements for describing web-based resources (1995, Dublin, Ohio)
- The metadata element set is internationally recognized and widely applied
- Translated into more than 20 languages
- Provides a consistent and standard way of describing electronic resources
20Dublin Core Elements
- Title
- Subject
- Description
- Source
- Language
- Relation
- Coverage
- Creator
- Publisher
- Contributor
- Rights
- Date
- Resource type
- Format
- Identifier
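As a small illustration, the fifteen elements can be held in a set and used to check the element names in a record. The sketch assumes a record is a plain dictionary mapping lower-case element names to lists of values (elements are optional and repeatable), and it uses `type` for the element listed above as "Resource type". The function name `unknown_elements` is hypothetical.

```python
# The fifteen Dublin Core elements, using "type" for "Resource type".
DC_ELEMENTS = frozenset({
    "title", "subject", "description", "source", "language",
    "relation", "coverage", "creator", "publisher", "contributor",
    "rights", "date", "type", "format", "identifier",
})


def unknown_elements(record):
    """Return, in sorted order, the keys of record that are not Dublin Core elements."""
    return sorted(name for name in record if name.lower() not in DC_ELEMENTS)


# Every element maps to a list because Dublin Core elements may repeat.
record = {
    "title": ["Box"],
    "type": ["physical object", "original", "cultural"],
    "colour": ["brown"],  # not a Dublin Core element; included to show the check
}
problems = unknown_elements(record)
```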
21Dublin Core record for Ali Shiri's homepage
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DC.title" content="Ali Shiri's Homepage" />
<meta name="DC.creator" content="Ali Shiri" />
<meta name="DC.subject" content="knowledge organization information retrieval information science search behaviour thesauri digital libraries knowledge organization systems subject portals and gateways digital libraries metadata user interaction web-based thesauri information" />
<meta name="DC.description" content="This is the homepage of Ali Shiri at the School of Library and Information Studies in the University of Alberta" />
<meta name="DC.date" scheme="DCTERMS.W3CDTF" content="2004-09-13" />
<meta name="DC.type" scheme="DCTERMS.DCMIType" content="Text" />
<meta name="DC.format" content="text/html" />
<meta name="DC.format" content="12034 bytes" />
<meta name="DC.identifier" scheme="DCTERMS.URI" content="http://www.ualberta.ca/ashiri" />
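Tags of this kind could be generated from a simple data structure. The following is a minimal sketch, not the method used to produce the record above: the function name `dc_meta_tags` and the dictionary layout are assumptions, and `scheme` attributes are left out for brevity.

```python
from html import escape


def dc_meta_tags(record):
    """Render a {element: [values]} mapping as Dublin Core <link>/<meta> tags."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />']
    for element, values in record.items():
        for value in values:
            # Escape quotes and angle brackets so the attribute stays well-formed.
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}" />')
    return "\n".join(lines)


html_head = dc_meta_tags({
    "title": ["Ali Shiri's Homepage"],
    "creator": ["Ali Shiri"],
    "format": ["text/html", "12034 bytes"],  # repeated element, one tag per value
})
```

Each value becomes its own `<meta>` tag, which mirrors how the record repeats `DC.format` rather than packing two values into one attribute.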
22Consortium for the Computer Interchange of Museum Information (CIMI) Guide to Best Practice: Dublin Core (DC 1.0 = RFC 2413)
Final Version, 12 August 1999
Example C-2: Record describing an original cultural object
<?xml version="1.0" ?>
<dc-record>
<type>physical object</type>
<type>original</type>
<type>cultural</type>
<format>Dimension: L245mm, W105mm, H125mm</format>
<title>Box</title>
<description>Material: Wood</description>
<subject>Utensils Inuit</subject>
<publisher>The National Museum Denmark Ethnographic Collection</publisher>
<identifier>NMD L 19.246a</identifier>
<relation>IsPart Of NMD L19.246a-NMD L19.246k</relation>
<coverage>1900-1940</coverage>
<coverage>Ammassalik distrikt Tasiilak</coverage>
<coverage>East Greenland</coverage>
<rights>http://www.natmus.dk/skatkamre/intro.htm</rights>
</dc-record>
23Consortium for the Computer Interchange of Museum Information (CIMI) Guide to Best Practice: Dublin Core (DC 1.0 = RFC 2413)
Final Version, 12 August 1999
Example D-13: Record describing a digital surrogate of ethnographic object
<?xml version="1.0" ?>
<dc-record>
<type>image</type>
<type>surrogate</type>
<type>cultural</type>
<format>PCD Kodak Photo CD Image PAC</format>
<format>base 16</format>
<format>YCC color space</format>
<format>24-bit color</format>
<title>Digitized image of Kite, collected in China by Berthold Laufer in 1903, depicting an INSECT, possibly a fly.</title>
<description>PhotoCD file format base 16, YCC color space, 24-bit color, ISO9660 CD-ROM. Kite, collected in China by Berthold Laufer (1874-1934) in 1903, depicting an INSECT, possibly a fly. Materials: PAPER, BAMBOO, PIGMENT, STRING, MASKING TAPE (MODERN), METAL (MODERN). Native term: Ts'ang Yin</description>
<subject>Kites</subject>
<subject>Insect</subject>
<subject>Fly</subject>
<subject>Ts'ang Yin</subject>
<publisher>American Museum of Natural History Division of Anthropology Digital Imaging Project</publisher>
<contributor>Laufer, Berthold</contributor>
<contributor>American Museum of Natural History Division of Anthropology</contributor>
<date>1997-12-05</date>
<date>1903</date>
<identifier>AMNH CD269/CD269/70/10596.PCD</identifier>
<rights>American Museum of Natural History Division of Anthropology</rights>
</dc-record>
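Records in this XML form can be read back into a simple structure with Python's standard library. The sketch below assumes the flat `dc-record` layout of the CIMI examples (one child element per value, no attributes). `SAMPLE` is an abridged copy of Example C-2, and `parse_dc_record` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Abridged from Example C-2; only a few elements are kept for brevity.
SAMPLE = """<?xml version="1.0" ?>
<dc-record>
  <type>physical object</type>
  <type>original</type>
  <title>Box</title>
  <coverage>East Greenland</coverage>
</dc-record>"""


def parse_dc_record(xml_text):
    """Collect the children of <dc-record> into {element name: [values]}."""
    root = ET.fromstring(xml_text)
    fields = defaultdict(list)
    for child in root:
        # Repeated elements (e.g. several <type> values) accumulate in order.
        fields[child.tag].append((child.text or "").strip())
    return dict(fields)


record = parse_dc_record(SAMPLE)
```

Keeping a list per element preserves repeated values, which both CIMI examples rely on for `type`, `format`, `coverage`, and `subject`.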
24Encoding
- Relates to the container for content
- Setting off each part of the record in specified ways
- Facilitates searching and finding
- Allows integration of variant info packages
- Facilitates data transmission
- Some standards include both content (descriptive) and encoding specifications
25Encoding Standards
- MARC (e.g. 100 is a Personal Author in MARC coding)
- SGML (<title>Great Expectations</title>)
- DTDs
  - HTML-DTD for encoding web pages
  - MARC-DTD for encoding USMARC records
  - TEI-DTD for encoding literary texts
- XML (a subset of SGML but simpler and more functional)
26What is MARC?
- Machine-Readable Cataloging (MARC)
- "Machine-readable" means a computer can read and interpret the data in the cataloging record
- A cataloging record includes 1) a description of the item, 2) main entry and added entries, 3) subject headings, and 4) the classification or call number
- MARC records consist of a set of tagged fields.
- A tag: each field is associated with a 3-digit number called a "tag." A tag identifies the field (the kind of data) that follows. For example, 245 is a field for title.
- http://ualweb.library.ualberta.ca/uhtbin/cgisirsi/t82srkUTJg/UAARCHIVES/38850036/8/8586502/Thestoryoflibrariesfromtheinventionofwritingtothecomputerage5E2F
27
- 0XX Control information, identification and classification numbers, etc. (DDC, LCC, etc.)
- 1XX Main entries
- 2XX Titles and title paragraph
- 3XX Physical description, etc.
- 4XX Series statements
- 5XX Notes
- 6XX Subject access fields
- 7XX Added entries and linking fields
- 8XX Series added entries, etc.
- 9XX Other, including local
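Because the first digit of a tag determines its group, the groups can be expressed as a lookup table. This is a minimal sketch: the group labels are copied from the slide, while the names `MARC_TAG_GROUPS` and `tag_group` are hypothetical.

```python
# First digit of the tag -> group label, as listed on the slide.
MARC_TAG_GROUPS = {
    "0": "Control information, identification and classification numbers",
    "1": "Main entries",
    "2": "Titles and title paragraph",
    "3": "Physical description",
    "4": "Series statements",
    "5": "Notes",
    "6": "Subject access fields",
    "7": "Added entries and linking fields",
    "8": "Series added entries",
    "9": "Other, including local",
}


def tag_group(tag):
    """Return the group label for a 3-digit MARC tag given as a string."""
    if len(tag) != 3 or not (tag.isascii() and tag.isdigit()):
        raise ValueError(f"not a 3-digit MARC tag: {tag!r}")
    return MARC_TAG_GROUPS[tag[0]]


title_group = tag_group("245")
subject_group = tag_group("650")
```

Tags are handled as strings rather than integers so that leading zeros (for example `020` or `050`) are preserved.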
28Frequently Used MARC Tags
29http://catalog.loc.gov/ MARC Tags
000 01332cam 2200349 a 450
001 4395316
005 20000330141709.0
010 -- |a 92015603
020 -- |a 0893918407 (cl)
020 -- |a 0893919411 (pp)
040 -- |a DLC |c DLC |d DLC
050 00 |a Z682.4.W65 |b H37 1992
082 00 |a 025/.0082 |2 20
100 1- |a Harris, Roma M.
245 10 |a Librarianship : |b the erosion of a woman's profession / |c Roma M. Harris.
260 -- |a Norwood, N.J. : |b Ablex Pub. Corp., |c c1992.
300 -- |a xiv, 186 p. ; |c 24 cm.
440 -0 |a Information management, policy, and services
504 -- |a Includes bibliographical references (p. 165-177) and indexes.
650 -0 |a Women in library science.
650 -0 |a Sex discrimination against women.
650 -0 |a Sex discrimination in employment.
650 -0 |a Library science |x Social aspects.
650 -0 |a Librarians |x Professional ethics.
650 -0 |a Women in information science.
650 -0 |a Women information scientists.
650 -0 |a Women librarians.
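To make the structure of a field concrete, the sketch below splits one line of a tagged display into tag, indicators, and subfields. It assumes a layout of tag, space, two indicator characters, space, then subfields each introduced by a `|` delimiter and a one-character code. That layout is an assumption about the display, not the MARC exchange format itself, and `parse_field` is a hypothetical helper.

```python
def parse_field(line, delimiter="|"):
    """Split a displayed MARC data field into tag, indicators and subfields."""
    tag, _, rest = line.partition(" ")
    indicators, _, body = rest.partition(" ")
    subfields = []
    for chunk in body.split(delimiter)[1:]:
        if not chunk:
            continue  # skip empty pieces produced by doubled delimiters
        # The first character is the subfield code; the rest is its value.
        subfields.append((chunk[0], chunk[1:].strip()))
    return {"tag": tag, "indicators": indicators, "subfields": subfields}


# Title field of the Harris record, written in the assumed display layout.
TITLE_LINE = ("245 10 |a Librarianship : |b the erosion of a woman's "
              "profession / |c Roma M. Harris.")
field = parse_field(TITLE_LINE)
```

Control fields such as 001 and 005 carry no indicators or subfields, so they would need separate handling; the sketch covers data fields only.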
30The same record in a card catalog format
31Standards for Surrogates
- Description standards: ISBD, AACR2, DC, GILS, FGDC, EAD, VRA, APPM
- Encoding standards: MARC, SGML, XML
32Look at MARC & SGML
- Identify sections of the MARC (handout)
- Translate codes
- Identify MARC codes for familiar elements within
Savitch example
33First reflective paper: due next week, October 3/08 (9 a.m.).