Title: AID, an Associative Interactive Dictionary for Onlin
1 Controlled Vocabularies in Searching
Tamas Doszkocs, Ph.D.
Computer Scientist
doszkocs_at_nlm.nih.gov
2 Controlled Vocabularies
- Definition
- Purpose and Role
- A Brief History
- Who is in Control?
- Spell Checkers
- Folksonomies
- Tagging
- Search Focus
- Search Refinement
- Web X.Y
3 Related Topics (that we won't talk about)
4 Definition and Purpose
- Controlled vocabulary is a list of terms that have been enumerated explicitly.
- In Library and Information Science, controlled vocabulary is a carefully selected list of words and phrases, which are used to tag units of information so that they may be more easily retrieved by a search. The terms are chosen and organized by trained professionals (including librarians and information scientists) who possess expertise in the subject area. Controlled vocabulary terms can accurately describe what a given document is actually about, even if the terms themselves do not occur within the document's text. Fully developed controlled vocabulary systems, such as the Library of Congress Subject Headings, are often published in a reference work that is called a thesaurus. Controlled vocabularies form part of a larger universe of nomenclatural approaches to data classification called metadata. (Wikipedia)
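The point that a controlled term can describe a document even when the term never occurs in its text can be illustrated with a minimal sketch. Everything in it is an assumption made for the example: the two toy documents, the terms assigned to them, and the small entry vocabulary are invented and are not taken from any real thesaurus or retrieval system.

```python
# Hypothetical toy collection: each document has free text plus
# indexer-assigned controlled vocabulary terms (all invented here).
DOCUMENTS = {
    1: {"text": "Patients reported chest pain after a heart attack.",
        "terms": {"Myocardial Infarction"}},
    2: {"text": "A survey of aspirin use in adults.",
        "terms": {"Aspirin"}},
}

# Hypothetical entry vocabulary: lead-in words mapped to preferred terms.
ENTRY_VOCABULARY = {
    "heart attack": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
    "acetylsalicylic acid": "Aspirin",
    "aspirin": "Aspirin",
}


def search_full_text(query):
    """Return ids of documents whose text contains the query string."""
    q = query.lower()
    return sorted(doc_id for doc_id, doc in DOCUMENTS.items()
                  if q in doc["text"].lower())


def search_controlled(query):
    """Map the query to a preferred term, then match assigned terms."""
    preferred = ENTRY_VOCABULARY.get(query.lower())
    if preferred is None:
        return []  # query is not in the entry vocabulary
    return sorted(doc_id for doc_id, doc in DOCUMENTS.items()
                  if preferred in doc["terms"])
```

In this sketch, a full-text search for "myocardial infarction" finds nothing because document 1 only says "heart attack", while the controlled search reaches document 1 through its assigned term.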
5 More Information
- Bridging the gap between languages used by authors, search systems and users
- http://sky.fit.qut.edu.au/middletm/cont_voc.html
- http://www.controlledvocabulary.com/
- http://php.iupui.edu/kcmcreyn/su03/control.html
- http://www.hsl.creighton.edu/hsl/Searching/c-vocab1.html
- http://www.dlese.org/Metadata/vocabularies/term_expln.htm
6 A Brief History
- The 1970s and 1980s: bloody battles and casualties
- Controlled vocabularies vs. natural language
- Command languages vs. free-form queries
- CVs vs. abstracts vs. full text
- Librarians vs. end users
- The 1990s and the Web: natural language for the masses
- The 21st Century: the best of both worlds
7 Vocabulary Control for Information Retrieval, 1972
- by F. Wilfrid Lancaster
- About this title. Contents: Why Vocabulary Control?; Pre-coordinate and Post-coordinate Systems; Vocabulary Structure and Display; Gathering the Raw Material; Standards and Guidelines; Organization of Terms: The Hierarchical Relationship; Organization of Terms: The Associative Relationship; Terms: Form and Compounding; The Entry Vocabulary; Homography and Scope Notes; Thesaurus Display; Vocabulary Growth and Updating; The Role of the Computer; Identifiers and Checklists; The Influences of Vocabulary on the Performance of a Retrieval System; Evaluation of Thesauri; Natural-language Searching and the Post-controlled Vocabulary; Hybrid Systems; Compatibility and Convertibility; Multilingual Aspects; Automatic Approaches to Thesaurus Construction; Some Cost-effectiveness Aspects of Vocabulary Control; Bibliography; Index.
- "The publisher's announcement claims that the original edition is an information science classic that has emerged as the 'bible' of indexing and retrieval vocabularies, (is the) first definitive monograph devoted exclusively to controlled vocabularies in information retrieval. ..."
8 An Associative Interactive Dictionary for Online Searching, 1978
- Title: AID, an Associative Interactive Dictionary for Online Searching.
- Authors: Doszkocs, Tamas E.
- Descriptors: Dictionaries; Information Retrieval; Online Systems; Search Strategies; Tables (Data); Word Frequency
- Source: On-Line Review, v2 n2 p163-73, Jun 1978
- AID meta-searched MEDLINE, TOXLINE and the Hepatitis Databank and displayed result clusters of keywords and MeSH headings
9 CITE, 1979
- Doszkocs T. E., Rapp B. A. Searching Medline in English: A prototype user interface with natural language query, ranked output and relevance feedback. Proc. ASIS Annu. Meet. Vol 16, pp 131-137, 1979.
- Automatic suggestion of Medical Subject Headings
- Used as NLM's OPAC 1979-1984
10 WebLine, 1994
- The first Web interface to an online retrieval system
- Associative Concept Navigation in MEDLINE and other NLM Databases via a Mosaic - Forms - WWW Interface Combining Natural Language Processing, Expert Systems and (un)Conventional Information Retrieval Techniques. Tamas E. Doszkocs, Seth B. Widoff, Bruno M. Vasta, National Library of Medicine, in Proceedings of the Second World Wide Web Conference, Chicago, 1994
- http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/doszkocs/doszkocs.html
- see also WebCrawler (Brian Pinkerton)
- The Open Web and the Hidden Web
11 Jerry's Guide to the Web, 1994
- Jerry Yang and David Filo's Yahoo!, 1995
- a directory of web sites, organized in a hierarchy of subject descriptors
- Librarians at Yahoo
- "Surfing is to Yahoo! what the Dewey Decimal System is to libraries. In other words, Surfing is the categorization of websites. It also happens to be how Yahoo! began. Today our Surfing team continues its passion for finding, evaluating, and organizing information on the Internet. They have a voracious appetite for learning about new topics. They are curious individuals who are skilled at intuitively and efficiently analyzing and classifying diverse, unstructured pieces of information across the Yahoo! network. Surfers are critical to the relevance and intuitive nature of information presented on Yahoo!." http://careers.yahoo.com/job_descriptions.html
12 The Remains of the Yahoo Directory
15 Open Directory Project
17 Transparent Query Mapping to Controlled Vocabulary Terms
18 Spell Checking as a Controlled Vocabulary Application
19 Correct spelling, correct results
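One way to read spell checking as a controlled vocabulary application is to treat the list of accepted spellings as the vocabulary and map each query word to its closest entry. The sketch below is only an illustration under stated assumptions: the four-word vocabulary and the 0.8 cutoff are invented, and the similarity measure is the one in Python's standard difflib module, swapped in for the example rather than the method of any system named in these slides.

```python
import difflib

# Hypothetical vocabulary of accepted spellings (invented for this sketch).
VOCABULARY = ["hepatitis", "toxicology", "thesaurus", "folksonomy"]


def correct_spelling(word, cutoff=0.8):
    """Return the closest vocabulary entry, or the word itself if none is close."""
    matches = difflib.get_close_matches(
        word.lower(), VOCABULARY, n=1, cutoff=cutoff
    )
    # Fall back to the original word when nothing clears the cutoff.
    return matches[0] if matches else word
```

A misspelled query word such as "hepatitus" is mapped to "hepatitis" before the search runs, while a word with no close vocabulary entry is passed through unchanged.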
20 Folksonomies and Social Tagging
21 Tagging in Flickr
22 Query Refinement with Phrases
23 Query Refinement with Subject Headings
24 Focusing in Search Results with Topical Clusters
25 Clustering of Search Results with Phrases
26 Clustering and Search Refinement with Natural Language and Controlled Vocabularies
27 Clustering with Multiple Criteria
28 Analyzing Search Results
29 Visualizing Search Results
30 Multi-faceted Clustering in an OPAC
31 AllPlus Web 2.0 Content Mashup
32 AllPlus Dynamic Cluster Visualization
33 Controlled Vocabularies in Searching
Tamas Doszkocs, Ph.D.
Computer Scientist
doszkocs_at_nlm.nih.gov