Title: Decisions for Your Digital Collection
1 Decisions for Your Digital Collection
2 An example of an audio collection
SCIAM Podcast: 60-Second Science
http://www.sciam.com/podcast/podcasts.cfm?type=60-second-science
Keyword search
Browse by year and month
Hundreds of audio clips
3
- SCIAM Podcast: 60-Second Science
- http://www.sciam.com/podcast/podcasts.cfm?type=60-second-science
- Hundreds of audio clips
- With basic metadata and transcripts
- Keyword search, but not fielded or categorized
- Difficult to browse by topic
4 Set the goal
- Goal
- To create metadata for the individual audio
objects in order to enable searching, browsing,
and sorting by topic, time, space, and other
attributes
5 Analysis of the collection
- What is needed to accomplish the goal?
- Questions to be asked
- If starting a new collection
- If the collection is already built
6 3. Knowing the difference
- 3.1 Textual vs. non-textual resources
- 3.2 Document-like vs. non-document-like objects
- 3.3 Original vs. digital surrogates of the works
- 3.4 Collection-level vs. item-level
7 3.1 Textual vs. Non-textual
- Text
- Would allow for full-text searching or automatic extraction of keywords.
- Marked by HTML or XML tags.
- Tags have semantic meanings.
Newspaper dated July 16, 1976, reporting the
initial discovery of burials in Granado Cave.
8 Example of a text document
Newspaper dated July 16, 1976, reporting the initial discovery of burials in Granado Cave.

Discovery of Granado Cave
In 1976, Mr. Frank Granado, a surveyor from Pecos, discovered prehistoric burials in a previously unknown cave in the Rustler Hills. An article soon appeared in the local paper. The site was brought to the attention of the state archaeologist, who suggested that Dr. Hamilton meet with Mr. Granado and arrange to see the cave. Mr. Granado was very cooperative and it soon became clear that the cave was located near Caldwell Shelter No. 1 and Brooks Cave. Granado Cave was named after its discoverer and given the official archaeological designation 41CU8.

During preliminary testing and excavation in 1976, the archaeological importance of the cave became evident. Mr. Shelby Brooks, the owner of the land on which Granado Cave and Brooks Cave are located, allowed both to be named State Archaeological Landmarks (SALs). As such, they became the first SALs located on private property in Texas. In June 1978, Dr. Hamilton returned to Granado Cave with a team of four archaeologists to undertake recording and excavation of the site. A private collection of artifacts belonging to Mr. Granado was also studied.

Previous Archaeological Research
Over the years, there have been numerous uncontrolled excavations in the various caves and sinkholes of the Rustler Hills. Even the excavations conducted by archaeologists are somewhat confusing. Not only is the sequence of excavation unclear, but often a single site has been excavated more than once and has been given varying names by different archaeologists. Important sites near Granado Cave include the Caldwell Shelters (41CU1 and 41CU2), the McAlpin Caves (41CU5 and 41CU6), Brooks Cave (41CU7), and ELCOR Cave (no assigned site number).
IST681 Metadata
9 Full text document can be indexed or marked up
- In 1976, Mr. Frank Granado, a surveyor from Pecos, discovered prehistoric burials in a previously unknown cave in the Rustler Hills.
- In <year>1976</year>, Mr. <person>Frank Granado</person>, a <occupation>surveyor</occupation> from <placeName>Pecos</placeName>, discovered <object>prehistoric burials</object> in a previously unknown <placeType>cave</placeType> in the <placeName>Rustler Hills</placeName>.
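As a minimal sketch of why semantic tags help, the marked-up sentence can be turned into a small index keyed by tag name. The wrapping `<s>` root element and the `build_index` helper are assumptions added for this illustration; they are not part of the slide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The slide's marked-up sentence, wrapped in a hypothetical <s> root
# element so that it parses as well-formed XML.
MARKED_UP = (
    "<s>In <year>1976</year>, Mr. <person>Frank Granado</person>, "
    "a <occupation>surveyor</occupation> from <placeName>Pecos</placeName>, "
    "discovered <object>prehistoric burials</object> in a previously unknown "
    "<placeType>cave</placeType> in the <placeName>Rustler Hills</placeName>.</s>"
)


def build_index(xml_text):
    """Group the text of each tagged span under its tag name."""
    index = defaultdict(list)
    for element in ET.fromstring(xml_text):  # direct children, in document order
        index[element.tag].append((element.text or "").strip())
    return dict(index)


index = build_index(MARKED_UP)
print(index["placeName"])  # ['Pecos', 'Rustler Hills']
```

With an index like this, a search for places can be kept separate from a search for people, which plain keyword search over the untagged sentence cannot do.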
10 Textual vs. Non-textual
- Non-textual, e.g., images
- Only the captions and file names can be searched, not the image itself.
- Need transcribing or interpreting.
- Need more detailed metadata to describe its contents.
- Need knowledge to give a deeper interpretation.
Newspaper dated July 16, 1976, reporting the
initial discovery of burials in Granado Cave.
PE19760716a1.gif
11
- Title
- Creator
- Source
- Publisher
- Language
- Date
- Format
- Subject: topic/things/placeType/person
- Coverage: year/place
- Description
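To make the element list concrete, here is a rough sketch of a record for the newspaper image. The values are illustrative only: they are drawn from the caption, the file name, and the article text shown earlier, the language value is an assumption, and elements the source does not state are left as `None`. The helper names are hypothetical.

```python
# Element names follow the slide; values are illustrative, not authoritative.
record = {
    "title": "Newspaper reporting the initial discovery of burials in Granado Cave",
    "creator": None,    # not given in the source
    "source": None,     # not given in the source
    "publisher": None,  # not given in the source
    "language": "en",   # assumption: the article is in English
    "date": "1976-07-16",
    "format": "image/gif",  # inferred from the file name PE19760716a1.gif
    "subject": {
        "topic": ["discovery of burials"],
        "things": ["prehistoric burials"],
        "placeType": ["cave"],
        "person": ["Frank Granado"],
    },
    "coverage": {"year": "1976", "place": "Rustler Hills"},
    "description": "Newspaper dated July 16, 1976, reporting the initial "
                   "discovery of burials in Granado Cave.",
}


def search_terms(rec):
    """Flatten every filled-in value into lowercase strings."""
    terms = []
    for value in rec.values():
        if value is None:
            continue
        if isinstance(value, dict):
            for item in value.values():
                terms.extend(item if isinstance(item, list) else [item])
        else:
            terms.append(value)
    return [term.lower() for term in terms]


def matches(rec, keyword):
    """True if the keyword occurs in any metadata value of the record."""
    return any(keyword.lower() in term for term in search_terms(rec))


missing = [name for name, value in record.items() if value is None]
print(matches(record, "Rustler Hills"))  # True
print(missing)  # ['creator', 'source', 'publisher']
```

The image itself cannot be searched, but once a record like this exists, a query on a place or person can find it.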
12 3.2 Document vs. non-document
- Non-document objects often
- contain multiple components
- carry information about history, culture, and society
- have detail about style, pattern, material, color, technique, etc.
13 3.3 Original vs. digital surrogates of the works
- What kinds of objects will be included in the digital collection?
- What kinds of objects will need to be described?
- What records are to be managed?
- born-digitals vs. digital doubles
- original works vs. digital surrogates of the works
14
Documented by Henry Fuermann in 1910
Designed by Frank L. Wright during 1906-1909
A slide made in 1985, scanned in 1997
15 Work vs. Image
- A digital collection needs to decide what the entity of its collection is
- works,
- images, or
- both?
- How many metadata records are needed for each item?
- Some part of the data can be reused.
- E.g., one work has different images or different formats
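One way to picture the reuse is to keep a single work record and let each image record point to it. The sketch below uses the dates from the earlier slide on the Wright building; the class and field names are assumptions made for illustration, and the building is left unnamed because the slides do not name it.

```python
from dataclasses import dataclass


@dataclass
class Work:
    """Data about the original work, entered once."""
    designer: str
    design_dates: str


@dataclass
class Image:
    """Data about one surrogate, linked to the work it depicts."""
    work: Work
    description: str
    year: int


work = Work(designer="Frank L. Wright", design_dates="1906-1909")
images = [
    Image(work, "Documented by Henry Fuermann", 1910),
    Image(work, "Slide", 1985),
    Image(work, "Scan of the slide", 1997),
]


def display_record(image):
    """Combine shared work data with image-specific data for display."""
    return {
        "designer": image.work.designer,
        "design_dates": image.work.design_dates,
        "image": image.description,
        "image_year": image.year,
    }


print(display_record(images[2]))
```

Three image records share one work record here, so a correction to the work data needs to be made only once.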
16 Photographs could be treated as works or as surrogates
Credits: Photographs: Various photographers, mostly William Ward Watkin.
The Construction of the Administration Building
http://www.rice.edu/fondren/woodson/exhibits/Watkin/adminconstruction.html
17 Entity Relationship Diagram
Source: http://www.getty.edu/research/conducting_research/standards/cdwa/entity.html
18 3.4 Collection-level vs. item-level
- Collection level
- Item level
- Relation?
- Is Version Of, Has Version, Is Replaced By, Replaces, Is Required By, Requires, Is Part Of, Has Part, Is Referenced By, References, Is Format Of, Has Format, Conforms To
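Most of these relation types come in inverse pairs, so a statement recorded at the item level implies one at the collection level. The sketch below writes the relation names in the camelCase form used by the Dublin Core relation refinements; the identifiers `item-001` and `collection-A` and the helper function are hypothetical.

```python
# Inverse pairs among the relation types listed on the slide.
INVERSE_PAIRS = [
    ("isVersionOf", "hasVersion"),
    ("isReplacedBy", "replaces"),
    ("isRequiredBy", "requires"),
    ("isPartOf", "hasPart"),
    ("isReferencedBy", "references"),
    ("isFormatOf", "hasFormat"),
]

# Build a lookup that works in both directions.
INVERSE = {}
for left, right in INVERSE_PAIRS:
    INVERSE[left] = right
    INVERSE[right] = left
# "conformsTo" has no listed inverse, so it is deliberately absent.


def inverse_statement(subject, relation, obj):
    """Return the implied reverse statement, or None if there is none."""
    inverse = INVERSE.get(relation)
    if inverse is None:
        return None
    return (obj, inverse, subject)


print(inverse_statement("item-001", "isPartOf", "collection-A"))
# ('collection-A', 'hasPart', 'item-001')
```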
19 Collection-level example (1)
- Dorothea Lange's "Migrant Mother" Photographs in the Farm Security Administration Collection
- http://www.loc.gov/rr/print/list/128_migm.html
- (see next slide)
21
(Left) Destitute peapickers in California, a 32 year old mother of seven children.
SUBJECTS: Mothers & children--California--1930-1940. Migrant agricultural laborers--California--1930-1940. Poor persons--California--1930-1940.
FORMAT: Group portraits 1930-1940. Portrait photographs 1930-1940. Photographic prints 1930-1940.
(Right lower) Look in her eyes! Illus. from Midweek pictorial, 1936 Oct. 17, p. 23. Reproduction of photograph by Dorothea Lange, Resettlement Administration. Original title: Destitute peapickers in California, a 32 year old mother of seven children.
(Right upper) MEDIUM: 1 photographic print on album page.
NOTES: FSA log sheet, with photograph in center, indicating publications in which picture appeared, 1936-1940. Original title of photograph in center: Destitute peapickers in California, a 32 year old mother of seven children.
22 Collection-level example (2)
- William Ward Watkin Papers, 1903-1956
- http://www.rice.edu/fondren/woodson/mss/ms352/
- Span Dates: 1903-1956
- Bulk Dates: 1912-1930
- 4 linear feet
- 4 1/2 cubic feet
- Series I. Biographical, 1903-1953. 10 linear inches
- Series II. Architectural Career, 1910-1952. 24 linear inches
- Series III. Academic Career, 1914-1949. 10 linear inches
23 4. Communicating about the Functional Requirements
- We would like users to be able to
- Search records by Title, Author name, Keyword, Type of document, Publication, Conference name, and Year
- Browse records by Year, Department, Classification, Object type, Subject matter, etc.
- View latest additions to the archive
- We would like to be able to
- Link together records from the same Conference, Publication
- Filter by Year and Language
-- Based on Guy, Powell, and Day. "Improving the Quality of Metadata in Eprint Archives." Ariadne, no. 38, 2004.
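Requirements like these can be checked against a candidate schema before any records are created. The following sketch lists the fields each function relies on, as transcribed from the slide, and reports which functions a schema cannot support. The lowercase field names and the sample schema are hypothetical.

```python
# Functions and the fields they rely on, transcribed from the slide.
# The lowercase field names are hypothetical labels for this sketch.
REQUIRED_FIELDS = {
    "search": ["title", "author", "keyword", "document_type",
               "publication", "conference", "year"],
    "browse": ["year", "department", "classification",
               "object_type", "subject"],
    "link": ["conference", "publication"],
    "filter": ["year", "language"],
}


def missing_fields(available):
    """Return, per function, the required fields the schema lacks."""
    gaps = {}
    for function, fields in REQUIRED_FIELDS.items():
        absent = [name for name in fields if name not in available]
        if absent:
            gaps[function] = absent
    return gaps


# Hypothetical schema that records neither conference nor language.
schema = {"title", "author", "keyword", "document_type", "publication",
          "year", "department", "classification", "object_type", "subject"}

gaps = missing_fields(schema)
print(gaps)
# {'search': ['conference'], 'link': ['conference'], 'filter': ['language']}
```

A gap report of this kind gives the team a concrete list to discuss: either add the missing element or drop the function that depends on it.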
24 An operational system
25 Will the metadata support these functions?
[Screenshot with callouts: "media type", "Video?", "Image?", "Text?", "No FORMAT information"]
26 Greenberg: Mapping Types to Functions
-- Greenberg, J. "Metadata Quality: A Layer Cake of Criteria," presentation at ASIST 2005 Conference
27 Back to the audio collection
SCIAM Podcast: 60-Second Science
http://www.sciam.com/podcast/podcasts.cfm?type=60-second-science
Keyword search
Browse by year and month
Hundreds of audio clips
28 Analysis of Metadata Elements
- Questions to be asked
- How should audio clips be described?
- Which metadata elements are important for accomplishing the goal?
- Task: Use the available audio clip descriptions to identify desired elements
29 Two audio clip examples
30 Current descriptions of audio clips: views from iTunes
31 Start the lists
- List the metadata elements you derived from the screenshots
- List the desired elements you want to include in order to accomplish the goal
32
- Derived elements from the file attributes
- Name (of the file)
- Artist (creator)
- Album (collection name)
- Grouping (Category)
- Comments (description of the collection)
- Genre (Type)
- Format, size
- Technical: bit rate, sample rate, channels, ID3 tag, encoding
- Derived elements from the front end display
- Date
- Title
- File location (URL)
- Description
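The parenthetical glosses above amount to a small crosswalk from iTunes fields to more general element names. A minimal sketch of applying such a crosswalk follows; the target element names and the sample tag values are hypothetical, and fields without a mapping are set aside instead of being dropped.

```python
# iTunes field -> generic element, following the glosses on the slide.
CROSSWALK = {
    "Name": "file_name",
    "Artist": "creator",
    "Album": "collection_name",
    "Grouping": "category",
    "Comments": "collection_description",
    "Genre": "type",
}


def apply_crosswalk(tags):
    """Split tags into a mapped record and a dict of unmapped fields."""
    record, unmapped = {}, {}
    for field_name, value in tags.items():
        if field_name in CROSSWALK:
            record[CROSSWALK[field_name]] = value
        else:
            unmapped[field_name] = value
    return record, unmapped


# Hypothetical tag values, not taken from the actual podcast files.
sample_tags = {
    "Name": "example-clip.mp3",
    "Artist": "Example Creator",
    "Genre": "Podcast",
    "Bit Rate": "64 kbps",
}

record, unmapped = apply_crosswalk(sample_tags)
print(record)
print(unmapped)  # {'Bit Rate': '64 kbps'}
```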
33 Goal-related elements
- Desired elements
- Subject category
- Topic
- Time
- Geographic area
- Population
- Event
- Publisher
- Rights
- Considerations for long-term and broader context
- Packaging subject-related podcasts to produce thematic albums for wider distribution?
- Collection-level metadata for discovery in large digital collections or digital libraries
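Once clips carry the desired elements, the goal of browsing by topic and sorting by time becomes a simple grouping operation. The sketch below assumes three made-up clip records; the titles, topics, dates, and field names are placeholders, not real podcast data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical clip records using a few of the desired elements.
clips = [
    {"title": "Clip C", "topic": "astronomy", "time": date(2008, 5, 2),
     "geographic_area": "global"},
    {"title": "Clip A", "topic": "health", "time": date(2008, 1, 15),
     "geographic_area": "United States"},
    {"title": "Clip B", "topic": "astronomy", "time": date(2008, 3, 9),
     "geographic_area": "global"},
]


def browse_by(records, element):
    """Group clip titles by one element, oldest clip first in each group."""
    groups = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["time"]):
        groups[rec.get(element, "unassigned")].append(rec["title"])
    return dict(groups)


print(browse_by(clips, "topic"))
# {'health': ['Clip A'], 'astronomy': ['Clip B', 'Clip C']}
```

The same function works for any element, for example `browse_by(clips, "geographic_area")`, which is why consistent element values matter more than the browsing code.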
34 Summary
- Goal for the metadata to accomplish
- Existing and desired metadata
- Metadata schema: whether to adopt, modify, reuse, or start a new one depends on many factors
- Relationships between associated objects
- Level of granularity
35 In-Class Exercise
- Team work to make a list of desired elements for either postcards or bookmarks
- Element Name, Definition, Example
- We will share the elements and discuss these issues
- In preparing your metadata project, what do you foresee as the most challenging decisions for your project?
- Why do you think they are challenging decisions to make?
- What factors will affect your decisions?