Indexing with a Controlled Vocabulary

Slides: 22
Provided by: Bernd243
Transcript and Presenter's Notes


1
Indexing with a Controlled Vocabulary
  • Basic Concepts

2
Indexing: Topics Covered
  • The concept triangle
  • The five-axiom theory of indexing
  • The indexing process

3
The Concept Triangle
  • Referent
  • Expression
  • Concept

4
The Referent
  • The referent is everything about which a
    meaningful statement can be made.
  • For example, about a certain table many
    statements can be made concerning the material of
    which it is made, its price, purpose, producer,
    weight, the structure of its surface, etc.

5
The Concept
  • We define the concept as the sum of the
    essential statements that can be made about a
    referent.
  • Essential statements are those which contribute
    to the characterization of the referent itself.
  • Inessential statements are those which do not
    contribute to the characterization of the
    referent itself.

6
Kinds of Concepts
  • General concepts
  • The general concept describes a class of
    interrelated referents.
  • For example: metal, oxidation, information.
  • Individual concepts
  • The individual concept is one to which no
    meaningful conceptual feature can be added.
  • For example: Albert Einstein, Fritz the Cat.

7
General vs. Individual Concepts in Indexing
  • It is the task of subject indexes to provide
    access to documents or text passages relevant to
    general concepts.
  • An information system that works quite well for
    individual concepts may fail completely when it is
    required to manage general concepts too.

8
The Mode of Expression
  • Lexical expressions: linear strings of characters
    commonly agreed upon to express concepts or
    concept connections
  • Non-lexical expressions: linear strings of
    characters by which concepts or concept relations
    are expressed and upon which no firm agreement
    has been made

9
Forms of Expression and Indexing
  • Lexical expressions require little indexing work
  • They often appear in Identifier fields rather than
    in Descriptor fields of database records
  • Non-lexical expressions require indexing work
  • Non-lexical expressions exhibit ambiguity and
    multiplicity

10
Concepts and Expressions
  • Individual concepts are almost always expressed
    lexically
  • General concepts are almost always expressed
    non-lexically
  • In natural, uncontrolled language there is an
    unlimited multitude of non-lexical, paraphrasing
    expressions for concepts
  • Multiplicity and ambiguity of natural language
    expressions are largely restricted to general
    concepts
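
The multiplicity problem can be made concrete with a small sketch. It assumes a hypothetical entry vocabulary (the phrases and the descriptor OXIDATION below are illustrative choices, not taken from any real thesaurus) that maps several paraphrasing expressions for one general concept onto a single controlled descriptor.

```python
# Hypothetical entry vocabulary: several natural-language paraphrases
# of one general concept all lead to the same controlled descriptor.
ENTRY_VOCABULARY = {
    "oxidation": "OXIDATION",
    "oxidizing": "OXIDATION",
    "reaction with oxygen": "OXIDATION",
    "uptake of oxygen": "OXIDATION",
}


def to_descriptor(expression):
    """Return the descriptor for an expression, or None if it is not listed."""
    return ENTRY_VOCABULARY.get(expression.strip().lower())


def index_document(expressions):
    """Collect the distinct descriptors for the expressions found in a text."""
    descriptors = {to_descriptor(e) for e in expressions}
    descriptors.discard(None)  # expressions outside the vocabulary are skipped
    return descriptors
```

A searcher who knows only the descriptor OXIDATION then reaches every document in which any of the listed paraphrases occurred, without having to guess the wording of the original text.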

11
Five-Axiom Theory of Indexing
  • Definability
  • Order
  • Sufficient degree of order
  • Representational predictability
  • Representational fidelity

12
Axiom of Definability
  • The compilation of information relevant to a
    topic can be delegated (to a skilled specialist
    or a programmed search mechanism) only to the
    extent to which the inquirer can define the topic
    in terms of concepts and concept relations.

13
Axiom of Order
  • Any compilation of information relevant to a
    topic is an order-creating process.
  • Order is defined as the meaningful proximity of
    the parts of a whole at a foreseeable place.

14
Axiom of Sufficient Degree of Order
  • The demands made on the degree of order increase
    as the size of the collection and/or the
    frequency of the searches and/or the specificity
    of the searches increases.

15
Axiom of Representational Predictability
  • The completeness of any search for documents
    relevant to a topic of interest depends on the
    predictability of the modes of expression for
    concepts in the search file.
  • Successful searches require a language with
    predictable modes of expression for concepts.

16
Axiom of Representational Fidelity
  • The precision of any search for documents
    relevant to a topic of interest depends on the
    fidelity with which the modes of expression for
    concepts can be expressed in the system's
    language.
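
The completeness and precision named in the last two axioms correspond to the standard retrieval measures of recall and precision. The following is a minimal sketch of how they are computed for one search; the function names and the treatment of empty sets are assumptions made for this illustration.

```python
def recall(retrieved, relevant):
    """Completeness: share of the relevant documents that were retrieved."""
    if not relevant:
        return 0.0  # convention chosen here for an empty relevant set
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Precision: share of the retrieved documents that are relevant."""
    if not retrieved:
        return 0.0  # convention chosen here for an empty result set
    return len(retrieved & relevant) / len(retrieved)


# Illustrative search: four relevant documents exist, three were retrieved,
# and two of the retrieved ones are relevant.
relevant_docs = {"d1", "d2", "d3", "d4"}
retrieved_docs = {"d1", "d2", "d5"}
```

With these sets, recall is 2/4 and precision is 2/3. Unpredictable modes of expression lower the first figure, because relevant documents are missed; poor fidelity lowers the second, because irrelevant documents are retrieved.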

17
The Indexing Process
  • Step 1: Determine the essence of a document
  • Step 2: Represent this essence with sufficient
    degrees of predictability and fidelity

18
Importance of Categories
  • The predictability of essence selection is
    markedly enhanced when the indexers have an
    orientation to conceptual categories.
  • For example, in some chemistry databases, all
    descriptors belong to the following categories:
  • MATTER
  • LIVING ENTITY
  • APPARATUS
  • PROCESS
  • In ERIC, the nine Descriptor Groups serve as
    categories.
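
One way categories can guide an indexer is as a checklist: after descriptors have been assigned, any category left empty prompts a second look at the document. The sketch below assumes this use, which the slide does not spell out; the descriptor-to-category assignments are invented for illustration.

```python
# The four categories named for chemistry databases.
CATEGORIES = ("MATTER", "LIVING ENTITY", "APPARATUS", "PROCESS")

# Hypothetical descriptors, each assigned to exactly one category.
CATEGORY_OF = {
    "COPPER": "MATTER",
    "YEAST": "LIVING ENTITY",
    "CENTRIFUGE": "APPARATUS",
    "OXIDATION": "PROCESS",
}


def uncovered_categories(descriptors):
    """List the categories for which no descriptor has been assigned yet.

    Raises KeyError for a descriptor that is not in CATEGORY_OF.
    """
    covered = {CATEGORY_OF[d] for d in descriptors}
    return [c for c in CATEGORIES if c not in covered]
```

For a document indexed with COPPER and OXIDATION, the function returns LIVING ENTITY and APPARATUS, which tells the indexer where part of the essence may have been overlooked.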

19
Natural Language Indexing
  • Natural language expressions, as derived from
    original texts, can only in the case of
    individual concepts lead to an information system
    of adequate quality and survival power.
  • The specificity of natural language expressions
    is compromised by their lack of predictability.

20
Importance of Cutter's Rule
  • Precise and complete searches require that the
    most specific descriptors that the vocabulary
    provides be chosen for the indexing of a subject.
  • A query with a specific descriptor must not
    retrieve documents indexed under concepts that are
    more general than the search descriptor.
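
Both points can be sketched over a small hierarchy. The hierarchy and the documents below are hypothetical. Expanding a query to narrower descriptors is an assumption of this sketch; the rule itself only requires that broader descriptors are never matched.

```python
# Hypothetical hierarchy: each descriptor points to its broader descriptor.
BROADER = {"COPPER": "METAL", "IRON": "METAL", "METAL": "MATTER"}


def ancestors(descriptor):
    """Return all broader descriptors, nearest first."""
    result = []
    while descriptor in BROADER:
        descriptor = BROADER[descriptor]
        result.append(descriptor)
    return result


def most_specific(candidates):
    """Drop every candidate that is broader than another candidate."""
    candidates = set(candidates)
    broader = {a for d in candidates for a in ancestors(d)}
    return candidates - broader


def search(index, query):
    """Match documents indexed with the query descriptor or a narrower one.

    Documents indexed only with broader descriptors are never returned.
    """
    return {
        doc_id
        for doc_id, descriptors in index.items()
        if any(d == query or query in ancestors(d) for d in descriptors)
    }


# doc1 is about copper, so only the most specific descriptor is kept.
index = {
    "doc1": most_specific({"COPPER", "METAL"}),
    "doc2": {"METAL"},
    "doc3": {"IRON"},
}
```

Here a search for COPPER returns only doc1, while a search for METAL returns all three documents. If doc1 had been indexed with METAL instead, the COPPER search would miss it and the search would be incomplete.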

21
Importance of Syntax
  • In the interests of enhanced representational
    fidelity, any advanced indexing language needs a
    syntax in addition to its vocabulary.
  • The syntax should represent the manner in which
    the concepts are connected with each other in the
    texts to be stored.
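
A small sketch shows why vocabulary alone can fall short. It assumes role indicators as the syntactic device (one of several possible devices); the two documents and the role names are made up for illustration. Document A concerns the export of cars from Germany to Japan, document B the export of cars from Japan to Germany.

```python
# Without syntax: both documents receive the same set of descriptors.
DOCS_FLAT = {
    "doc_a": {"EXPORT", "CAR", "GERMANY", "JAPAN"},
    "doc_b": {"EXPORT", "CAR", "GERMANY", "JAPAN"},
}

# With syntax: each descriptor carries a hypothetical role indicator
# that records how the concepts are connected in the text.
DOCS_WITH_ROLES = {
    "doc_a": {
        ("EXPORT", "process"),
        ("CAR", "object"),
        ("GERMANY", "origin"),
        ("JAPAN", "destination"),
    },
    "doc_b": {
        ("EXPORT", "process"),
        ("CAR", "object"),
        ("JAPAN", "origin"),
        ("GERMANY", "destination"),
    },
}


def search(index, query):
    """Return the documents whose index terms include every query term."""
    return {doc_id for doc_id, terms in index.items() if query <= terms}
```

A flat query for EXPORT, GERMANY and JAPAN retrieves both documents, although only one may be wanted. A query for GERMANY as origin and JAPAN as destination retrieves only doc_a, so the added syntax raises precision.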