Title: William Y. Arms
1Object models, overlay journals, and virtual collections
William Y. Arms
Corporation for National Research Initiatives
March 22, 1999
2Object models, overlay journals, and virtual collections
William Y. Arms
Department of Computer Science, Cornell University
March 22, 1999
4Physical and Logical Views of Information
Physical view
- Data structures, files, directories, servers
- Publishers, libraries, web sites
Logical view
- Works, expressions, manifestations, items
- Object models (document models)
- Overlay journals
- Virtual collections
5What is Content?
Works, expressions, manifestations, items
6Work
Work: The underlying abstraction.
Examples:
- Homer's The Iliad.
- Beethoven's Fifth Symphony.
- The Unix operating system.
7Expression
Expression: A work is realized through an expression.
Examples:
- The Iliad was first expressed orally, then it was written down as a fixed sequence of words.
- Beethoven's Fifth Symphony can be expressed as a printed score or by any one of many performances.
- The Unix operating system has separate expressions as source code and machine code.
8Works and Expressions
Works and Expressions: Many works are realized through a single expression.
Examples:
- The poem The Road Not Taken by Robert Frost.
- The picture
In such examples, there is no practical distinction between expression and work.
9Manifestations
Manifestation: An expression is given form in one or more manifestations.
Examples:
- The text of The Iliad has been manifest in numerous manuscripts and printed books.
- A musical performance can be distributed on CD, or broadcast on television.
- Software is manifest as files, which may be stored or transmitted in any digital medium.
10Items
Item: When many copies are made of a manifestation, each is a separate item.
Examples:
- A specific copy of a book.
- A copy of a computer file.
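The four levels above (work, expression, manifestation, item) nest naturally, so they can be sketched as a small data model. This is only an illustrative sketch, not a model taken from the talk: the class names, field names, and the item locations are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy of a manifestation."""
    location: str  # hypothetical: where this particular copy is held


@dataclass
class Manifestation:
    """A concrete form given to an expression (manuscript, CD, file)."""
    form: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """One realization of a work (oral, written text, source code)."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The underlying abstraction."""
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self) -> List[Item]:
        """Collect every item under every manifestation of every expression."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]


# Example data: the locations are invented for illustration.
iliad = Work("The Iliad", [
    Expression("oral recitation"),
    Expression("fixed written text", [
        Manifestation("manuscript", [Item("hypothetical archive")]),
        Manifestation("printed book", [
            Item("hypothetical library copy"),
            Item("hypothetical private copy"),
        ]),
    ]),
])

print(len(iliad.all_items()))
```

The oral expression has no manifestations in this sketch, which mirrors the point that an expression exists independently of the forms it is later given.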
11Object Models
12Beyond Simple Documents
Many digital objects are more than static files of data.
- Dynamic objects: What is presented to the user depends upon the execution of computer programs or other external activities.
- Complex objects: Objects are made up from many inter-related elements.
- Alternate disseminations: Digital objects may offer the user a choice of access methods.
- Databases: A database comprises many alternative records, with different records selected each time the database is accessed.
13Object Models and Structural Types
Web object
Digitized materials
- Digitized image
- Set of digitized page images
- Marked-up text with page images
- Digitized audio recording
Sets
- Set of digital objects
- Searchable set of digital objects
14Web Object: File with URL
Identifier: http://www.dlib.org/boats/swan56
Data: jpg
Metadata: data type
15Object Model: Digitized Image
Data: Several manifestations
- thumbnail image
- reference image
- archival image
Metadata: Each manifestation may have its own metadata
16Object Model: Digitized Image
Identifier: hdl:loc.ndlp/amrlp.1234567
Data:
- thumbnail (gif)
- reference (jpg)
- archive (jpg)
Metadata:
- object metadata
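The digitized-image object model (one identifier, several manifestations, metadata at both the object and the manifestation level) can be sketched as follows. This is a minimal illustration under stated assumptions: the class and field names are hypothetical, the metadata values are invented, and the identifier is the one from the slide written with an `hdl:` prefix, which is an assumption about the original notation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ImageManifestation:
    """One rendition of the image, with its own optional metadata."""
    role: str          # "thumbnail", "reference", or "archive"
    data_format: str   # e.g. "gif" or "jpg"
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class DigitizedImage:
    """A digital object: identifier, object metadata, and manifestations."""
    identifier: str
    object_metadata: Dict[str, str]
    manifestations: Dict[str, ImageManifestation] = field(default_factory=dict)

    def add(self, manifestation: ImageManifestation) -> None:
        """Register a manifestation under its role."""
        self.manifestations[manifestation.role] = manifestation

    def get(self, role: str) -> ImageManifestation:
        """Return the manifestation for a role; raises KeyError if absent."""
        return self.manifestations[role]


image = DigitizedImage(
    identifier="hdl:loc.ndlp/amrlp.1234567",
    object_metadata={"title": "hypothetical title"},  # invented value
)
image.add(ImageManifestation("thumbnail", "gif"))
image.add(ImageManifestation("reference", "jpg"))
# Per-manifestation metadata; the key and value are invented.
image.add(ImageManifestation("archive", "jpg", {"note": "hypothetical scan note"}))

print(image.get("thumbnail").data_format)
```

Keying the manifestations by role lets a client ask for the rendition it needs (a thumbnail for browsing, the archival copy for preservation) without knowing how each one is stored.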
17Object Model: Set of Digitized Page Images
Data: Each page is a separate image
Metadata: Structure of the work
- page sequence
- page numbers
- special pages
18Object Model: Set of Digitized Page Images
Identifier: hdl:loc.ndlp/amrlp.13579
Data:
- page 1 (gif)
- page 2 (gif)
- page 3 (gif)
Metadata:
- page map
19Page Map
A page map relates the page images to the
structure of the information, e.g.
- List of pages
- Numbers printed on pages
- Blocking of information on pages (columns, figures)
- Sequences of information across pages
A page map is metadata for a specific manifestation.
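A page map of this kind can be sketched as a list of entries, one per page image, that records the page sequence, the number printed on the page, and any special role. This is an illustrative sketch only: the class names, field names, and file names are hypothetical, and blocking of information within a page is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageEntry:
    sequence: int                   # position in the page sequence, from 1
    image_file: str                 # hypothetical file name of the page image
    printed_number: Optional[str]   # number printed on the page, if any
    special: Optional[str] = None   # e.g. "title page"


class PageMap:
    """Relates page images to the structure of one manifestation."""

    def __init__(self, entries: List[PageEntry]) -> None:
        # Keep entries in reading order regardless of input order.
        self.entries = sorted(entries, key=lambda entry: entry.sequence)

    def image_for_printed_number(self, number: str) -> Optional[str]:
        """Find the image that carries a given printed page number."""
        for entry in self.entries:
            if entry.printed_number == number:
                return entry.image_file
        return None

    def next_page(self, image_file: str) -> Optional[str]:
        """Return the image that follows the given one, or None at the end."""
        for index, entry in enumerate(self.entries):
            if entry.image_file == image_file and index + 1 < len(self.entries):
                return self.entries[index + 1].image_file
        return None


page_map = PageMap([
    PageEntry(3, "page3.gif", "2"),
    PageEntry(1, "page1.gif", None, special="title page"),
    PageEntry(2, "page2.gif", "1"),
])

print(page_map.image_for_printed_number("2"))
print(page_map.next_page("page1.gif"))
```

The example separates sequence position from printed number because the two often differ: here the unnumbered title page is first in sequence, so printed page 2 is the third image.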
20Overlay Journals and Virtual Collections
Logical organization of physically separate works
21The NSF SMETE Library
Soon, all scientific and engineering information will be available on-line:
- Journals, reports, papers, standards, patents
- Data sets, instruments, sensors
- Computer programs, simulations, designs
- Maps, images, films
- ... etc., etc., etc.
22The Instructor's Wish List
To discover materials and services
- Good science
- Comprehensible to students -- effective for teaching
- Stable -- will not change or disappear
Access to collections and services that are provided by many independent organizations
- No uniform catalog or index to everything
- Mixture of for-profit and open access information
24Conventional Journal
Contents
Articles
25Overlay Journal
Articles in Repository B
Articles in Repository A
Contents
26Overlay Journals
Articles in Repository B
Articles in Repository A
Contents of Journal I
Contents of Journal II
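The diagrams above suggest that an overlay journal holds only a table of contents, while the articles stay in their repositories. A minimal sketch of that idea follows, assuming each contents entry is a (repository, identifier) pair. The class names, method names, identifiers, and titles are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Article:
    identifier: str
    title: str


class Repository:
    """Physically stores articles."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._articles: Dict[str, Article] = {}

    def deposit(self, article: Article) -> None:
        self._articles[article.identifier] = article

    def fetch(self, identifier: str) -> Article:
        return self._articles[identifier]


class OverlayJournal:
    """Stores only references; articles remain in the repositories."""

    def __init__(self, name: str, repositories: List[Repository]) -> None:
        self.name = name
        self._repositories = {repo.name: repo for repo in repositories}
        self._contents: List[Tuple[str, str]] = []

    def add_entry(self, repository_name: str, identifier: str) -> None:
        if repository_name not in self._repositories:
            raise KeyError(f"unknown repository: {repository_name}")
        self._contents.append((repository_name, identifier))

    def table_of_contents(self) -> List[Article]:
        """Resolve each reference against its repository at access time."""
        return [
            self._repositories[name].fetch(identifier)
            for name, identifier in self._contents
        ]


repo_a = Repository("A")
repo_b = Repository("B")
repo_a.deposit(Article("a1", "Hypothetical article in A"))
repo_b.deposit(Article("b1", "Hypothetical article in B"))

journal_one = OverlayJournal("Journal I", [repo_a, repo_b])
journal_one.add_entry("A", "a1")
journal_one.add_entry("B", "b1")

journal_two = OverlayJournal("Journal II", [repo_a, repo_b])
journal_two.add_entry("B", "b1")

print([article.title for article in journal_one.table_of_contents()])
```

Because both journals refer to the same stored article rather than copying it, an article can appear in several overlay journals while existing once in its repository.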
27Overlay Journals with Preprint Servers
Preprint server
Research Web site
Contents of Journal I
Cornell CS Reports
CoRR
Contents of Journal II
28SMETE Library Physical Sites
User
CSTR
NCSTRL
ACM
D-Lib
CoRR
29SMETE Library Virtual Collections
SMETE
Links show the members of the virtual collection
30Metadata for Virtual Collections
Reference linking
- Identifiers (URLs, URNs, ...)
- Citations and reverse citations
Information discovery
- Cataloguing and indexing
Object models
- Structural types
- Disseminators
31Indexing and Cataloguing
- Conventional cataloguing and indexing: Skilled professionals, following quality guidelines.
- Web spiders and gatherers: Programs that gather information and build indexes (e.g., Infoseek, Harvest).
- Metadata in publishing: Addition of metadata by the creator to aid automatic indexing (e.g., Dublin Core).
- Content extraction: Indexing using structured text, speech recognition, or image content.
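To illustrate how creator-supplied metadata can aid automatic indexing, here is a small sketch that builds an inverted index from records with Dublin Core style fields. It is not how any named system works: the `title`, `creator`, and `subject` keys follow Dublin Core element names, but the records, identifiers, and functions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Hypothetical records keyed by identifier; all values are invented.
records: Dict[str, Dict[str, str]] = {
    "hypothetical:report-1": {
        "title": "Overlay journals and preprint servers",
        "creator": "A. Author",
        "subject": "digital libraries",
    },
    "hypothetical:report-2": {
        "title": "Page maps for digitized books",
        "creator": "B. Author",
        "subject": "digital libraries metadata",
    },
}


def build_index(
    records: Dict[str, Dict[str, str]],
    fields: Tuple[str, ...] = ("title", "subject"),
) -> Dict[str, Set[str]]:
    """Map each lower-cased word in the chosen fields to record identifiers."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for identifier, record in records.items():
        for field_name in fields:
            for word in record.get(field_name, "").lower().split():
                index[word].add(identifier)
    return dict(index)


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return identifiers of records that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(records)
print(sorted(search(index, "digital libraries")))
```

The sketch indexes only the metadata, not the documents themselves, which is the case where metadata added by the creator does the work that a cataloguer or content extraction would otherwise do.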
32The End
Physical view
- Data structures, files, directories, servers
- Publishers, libraries, web sites
Logical view
- Works, expressions, manifestations, items
- Object models (document models)
- Overlay journals
- Virtual collections