Title: Using the NASA Thesaurus to Support the Indexing of Streaming Media
1Using the NASA Thesaurus to Support the Indexing
of Streaming Media
- Gail Hodge
- Information International Associates, Inc.
- Janet Ormes, Patrick Healey
- NASA Goddard Space Flight Center Library
2Historic Context
- The Library has collected and circulated the
Center's colloquia on audio or video since 1967
- A catalog of these holdings has been posted on
the Library's web site since 2001
- Patrons were required to come to the Library,
resulting in limited accessibility of recorded
colloquia
- The Streaming Media Center Project began in 2001 as
part of the Library's response to Knowledge
Management initiatives
3Introducing the GSFC Media Center
4Streaming Media
- Streaming media
  - Video that is encoded for delivery across the
    internet/intranet
- Encoding
  - Computer processing of video to a format for web
    casting
- Web casting
  - The act of delivering audio and video content
    across the internet/intranet
  - Can be delivered live or on-demand
5The Goddard Library Streaming Media Center
- The Streaming Media Center is now available from
the Library website (http://library.gsfc.nasa.gov)
- Can be included in personalized portals
- Library has collected >350 hours of video
- >100 hours indexed
- Currently broadcasting 2 hours daily for the
Earth Observing Systems Knowledge Management Pilot
6Access Issues
- Current Needs
  - Need to know the overall topic of the video
  - More likely to remember the topic, presenter,
    date or series
- Permanent Access
  - Less likely that users will remember the video's
    metadata
  - More likely that users will want specific
    information
  - Terminology may change over time
7Indexing Video Content
- Video indexing is similar to a back-of-the-book
index for specific information
- Entering a keyword leads you to the specific
location of the subject
8Features of Selected Software
- Compares recognized speech with stored default
terminology
- Uses speaker inflection to identify meaningful
intervals
- Indexing and Search components included
9Incorporation of NASA Thesaurus
- Added specific scientific terminology
- Incorporated terms and their narrower term (NT),
related term (RT) and Used For/USE (UF/USE)
relationships
- Used text of Astrophysics Data System to provide
terms in grammatical structures
- Provides query expansion and improves relevancy
10Query Expansion
- Saturn Moons
- Ios
- Triton
- Or
- Scatha Satellite
- P78-2 Satellite
11Query Expansion (Illustrated)
Sample search (aurora) on the same one-hour lecture
entitled "Jupiter's Aurora." One file was
indexed using the NASA Thesaurus, the other was
indexed using a more basic scientific word list.
Benefits
GREATER overall relevance understanding
MORE relevant content found (2 min vs. 20 sec)
Ignores IRRELEVANT content (speech recognition
error)
12Relevance Interval Creation
- Relevance Interval Creation links related
concepts within media files, which drives
Relevance Intervals
- External knowledge from the thesaurus improves
the accuracy of the creation process because the
explicit knowledge in text is incomplete
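One simple way to picture interval creation is to merge nearby keyword hits into continuous playback segments. The sketch below is a swapped-in heuristic for illustration, not the selected software's algorithm (which, per the slides, also uses speaker inflection): the `relevance_intervals` function and its `max_gap` and `padding` parameters are assumptions.

```python
def relevance_intervals(hit_offsets, max_gap=30.0, padding=5.0):
    """Merge keyword hits (seconds) into (start, end) playback intervals.

    Each hit is padded on both sides; a padded hit that starts no more
    than max_gap seconds after the previous interval ends extends that
    interval instead of opening a new one.
    """
    intervals = []
    for t in sorted(hit_offsets):
        start, end = max(0.0, t - padding), t + padding
        if intervals and start - intervals[-1][1] <= max_gap:
            intervals[-1][1] = max(intervals[-1][1], end)
        else:
            intervals.append([start, end])
    return [tuple(interval) for interval in intervals]


# Hypothetical hits for "aurora" and its expanded thesaurus terms.
hits = [12.0, 20.0, 310.0]
print(relevance_intervals(hits))
```

Under this heuristic, hits contributed by expanded thesaurus terms fill the gaps between hits on the literal query word, which is one plausible reason a thesaurus-backed index would return a longer relevant segment than a plain word list.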
13Relevance Interval (Illustrated)
Sample search (aurora) on the same one-hour lecture
entitled "Jupiter's Aurora." One file was
indexed using the NASA Thesaurus, the other was
indexed using a more basic scientific word list.
Benefits
GREATER overall relevance understanding
MORE relevant content found (2 min vs. 20 sec)
Ignores IRRELEVANT content (speech recognition
error)
14Benefits
- Identify relevant pieces of content within a
longer video
- Stream more relevant, specific information
intervals to users
- Minimize manual processing
- Ultimately improve reuse of information and
increase opportunities for knowledge sharing