1
A Flexible XML-Based Glossary Approach for the
Federal Government
  • By Ken Sall
  • for the US Federal
  • XML Community of Practice
  • January 19, 2005

2
Problem Statement
  • After examining standard glossary terminology
    (ISO 1087 and others), define an XML Schema or
    DTD that models all useful aspects of a term
    and its definition.
  • Should be applicable to any government agency.
  • Consider flexibility and collaborative
    development as key design criteria. Many
    different agencies may use the model and many
    individuals may author specific term definitions.
  • Create an XSLT stylesheet that knows about the
    model and displays an XML glossary instance
    document as HTML in any modern browser.
  • Eventually consider XSL-FO for PDF rendering of
    the glossary.

3
Design Goals
  • Standards-Based - XML element names are loosely
    based on an international standard, ISO 1087.
  • Flexible - The Glossary DTD, although initially a
    strawman to stimulate discussion, is fairly
    flexible with few required elements, many
    optional elements, and several repeatable
    elements.
  • Provides a Framework - Since so few elements are
    required, terms can be added even before
    definitions are known. These terms act as
    placeholders that are fully supported by the DTD
    and XSLT. (For example, see the stub terms "DTD"
    and "XSLT" in the example instance.)
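
The placeholder idea above can be sketched in Python. This is a minimal illustration, not the presentation's DTD or XSLT: the element names Glossary, Term, Name, DefinitionSection, and Definition come from the term list in this presentation, but their nesting, the sample instance, and the helper `stub_terms` are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical glossary instance. The nesting is assumed, because the
# Glossary DTD itself is not reproduced in this transcript.
GLOSSARY_XML = """
<Glossary>
  <Term>
    <Name>ontology</Name>
    <DefinitionSection>
      <Definition>Defines the common words and concepts used to
      describe and represent an area of knowledge.</Definition>
    </DefinitionSection>
  </Term>
  <Term><Name>DTD</Name></Term>
  <Term><Name>XSLT</Name></Term>
</Glossary>
"""


def stub_terms(xml_text):
    """Return the names of terms that have no DefinitionSection yet."""
    root = ET.fromstring(xml_text)
    return [
        term.findtext("Name")
        for term in root.findall("Term")
        if term.find("DefinitionSection") is None
    ]


if __name__ == "__main__":
    print(stub_terms(GLOSSARY_XML))
```

Because only Name is treated as required here, the two stub terms parse cleanly and can be listed as work still to be done.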

4
Design Goals
  • Specialized - Any term may have multiple
    definitions so that different agencies may use
    the same term with their own specialized meaning,
    where necessary.
  • Collaborative - Since an XSLT stylesheet is used
    to sort the terms alphabetically, many
    individuals can work on their own glossary
    fragments (XML instances of the Glossary DTD). At
    any time, the various contributions can be easily
    merged without manual editing.
  • Leverages Links - Search links are automatically
    generated for each term by means of the XSLT,
    both to help kick-start and to augment the
    definition.
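
The sorting and search-link goals above can be approximated outside XSLT as well. The sketch below is an assumption-laden Python stand-in for what the stylesheet is described as doing: the Glossary/Term/Name nesting, the function `sorted_terms_with_links`, and the placeholder search URL are all hypothetical, since the transcript does not say which search service the real XSLT targets.

```python
from urllib.parse import quote_plus
import xml.etree.ElementTree as ET

# Placeholder search endpoint (assumption); substitute a real search URL.
SEARCH_URL = "https://example.org/search?q="


def sorted_terms_with_links(xml_text):
    """Return (name, search_link) pairs sorted alphabetically, ignoring case."""
    root = ET.fromstring(xml_text)
    names = [term.findtext("Name", default="") for term in root.findall("Term")]
    names.sort(key=str.lower)
    return [(name, SEARCH_URL + quote_plus(name)) for name in names]


if __name__ == "__main__":
    sample = (
        "<Glossary>"
        "<Term><Name>XSLT</Name></Term>"
        "<Term><Name>ontology</Name></Term>"
        "<Term><Name>semantic web</Name></Term>"
        "<Term><Name>DTD</Name></Term>"
        "</Glossary>"
    )
    for name, link in sorted_terms_with_links(sample):
        print(name, link)
```

Sorting at render time rather than in the source file is what lets contributors append terms in any order.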

5
ISO 1087 Terminology (etc.)
Key: ISO 1087 Used; Used (not ISO 1087); ISO 1087 Unused
  • Characteristic: Abstraction of a property of an
    object or of a set of objects. Note -
    Characteristics are used for describing concepts.
    (ISO 1087-1:2000, 3.2.4)
  • Concept: A unit of thought constituted through
    abstraction on the basis of properties common to
    a set of objects. Note - Concepts are not bound
    to particular languages. They are, however,
    influenced by the social or cultural background.
    (ISO 1087:1990) Unit of knowledge created by a
    unique combination of characteristics. (ISO
    1087-1:2000, 3.2.1)
  • Definition: Statement which describes a concept
    and permits its differentiation from other
    concepts within a system of concepts. (ISO
    1087:1990) Representation of a concept by a
    descriptive statement which serves to
    differentiate it from related concepts. (ISO
    1087-1:2000, 3.3.1)

6
ISO 1087 Terminology (etc.)
  • Designation: Representation of a concept by a
    sign which denotes it. (ISO 1087-1:2000, 3.4.1)
  • Dictionary (see terminology and vocabulary):
    Structured collection of lexical units with
    linguistic information about each of them. (ISO
    1087:1990)

7
ISO 1087 Terminology (etc.)
  • Entry, Headword: The term headword appears in two
    different meanings. In lexicography, a headword
    is the word used as the heading in a dictionary
    entry or encyclopedia. In a descriptive
    terminology entry where no preference is given to
    any one term, there is no head term, but if
    preference is given to a term, head term is
    sometimes used in analogy to lexicography, as is
    main entry term. (Wright & Budin, 1997)

8
ISO 1087 Terminology (etc.)
  • Glossary (see dictionary, terminology,
    vocabulary): Alphabetical list of terms or words
    found in or relating to a specific topic or text.
    It may or may not include explanations. Note -
    The distinguishing criterion is that glossaries
    are considered to reside in backmatter attached
    to books and other publications rather than being
    independent works in their own right. Glossaries
    are sometimes perceived as being less scientific
    in intent and methodology than terminologies,
    terminology standards, and even vocabularies,
    although a certain degree of synonymy exists.
    (Wright & Budin, 1997)

9
ISO 1087 Terminology (etc.)
  • Nomenclature: System of terms which is elaborated
    according to pre-established naming rules. (ISO
    1087:1990)
  • Object: Anything perceivable or conceivable. Note
    - Objects may also be material (e.g. an engine, a
    sheet of paper, a diamond), immaterial (e.g. a
    conversion ratio, a project plan) or imagined
    (e.g. a unicorn). (Adapted from ISO 1087-1:2000,
    3.1.1)

10
ISO 1087 Terminology (etc.)
  • Synonym: A word with the same meaning or nearly
    the same meaning as another word in the same
    language. (Longman Dictionary of English Language
    and Culture: Longman Group UK Limited 1992) Note -
    Terminologists distinguish between real synonyms,
    i.e. terms which can be substituted with each
    other whatever the context, and the more common
    quasi-synonyms, which can differ from one another
    by context and sometimes by subject field. (Sager,
    1990)
  • Term: Designation of a defined concept in a
    special language by a linguistic expression. Note
    - A term may consist of one or more words or even
    contain symbols. (ISO 1087:1990)

11
ISO 1087 Terminology (etc.)
  • Terminological Dictionary (see dictionary and
    vocabulary): Dictionary containing terminological
    data from one or more specific subject fields.
    Note - admitted term: technical dictionary (ISO
    1087:1990)
  • Terminological Record: Structured collection of
    terminological data relevant to one concept. (ISO
    1087:1990)
  • Terminological Database: Structured sets of
    terminological records in an information
    processing system. (ISO 1087:1990)

12
ISO 1087 Terminology (etc.)
  • Terminology Work: Any activity concerned with the
    systematization and representation of concepts or
    with the presentation of terminologies on the
    basis of established principles and methods. (ISO
    1087:1990)
  • Vocabulary (see terminology, dictionary,
    glossary): Terminological dictionary containing
    the terminology of a specific subject field or of
    related subject fields and based on terminology
    work. (ISO 1087:1990)

13
Summary: ISO 1087 Terminology
  • Unused ISO 1087 Terms
      • Characteristic
      • Designation
      • Dictionary
      • Nomenclature
      • Object
      • PreferredTerm (TBD?)
      • Terminological Dictionary / technical dictionary
      • Terminological Record
      • Terminological Database
      • Terminology Work
      • Vocabulary
  • ISO 1087 Terms Used
      • Concept
      • Definition
      • Term
  • Used but not ISO 1087
      • Glossary
      • Synonym
      • RelatedTerm
  • Additional Terms by Sall (next slide)
      • Name
      • Acronym
      • ExpandedAcronym
      • DefinitionSection
      • Source
      • Usage

14
Additional (Non-Standard) Terminology
  • Glossary - change to Dictionary, Vocabulary,
    Technical Dictionary or Terminology?
  • Name - added only to allow Term to be a
    container; could change Term to Entry and Name to
    Term?
  • Acronym - necessary option for technical terms
  • ExpandedAcronym - ditto
  • DefinitionSection - added simply as a repeatable
    container to encompass all aspects pertaining to
    a specific definition of a term
  • Source - useful for traceability and credibility
  • Usage - useful to have an optional example
    sentence for a given definition (use in context)

15
XML Glossary Model Strawman
16
XML Example of One Term
ontology
semantic web
knowledge management
Defines the common words and concepts used to describe and represent an area of knowledge, and so standardizes the meanings. An ontology includes classes in the domains of interest, instances, relationships, properties and their values, functions of and processes involving the objects, and relevant constraints and rules.
Daconta, Obrst, Smith
An ontology can range from the simple notion of a taxonomy to a thesaurus, to a conceptual model, to a logical theory.
Daconta, Obrst, Smith
classification system
taxonomy
OWL

philosophy
sometimes "Ontology": the metaphysical study of the nature of being and existence
WordNet
Both the ontology and manner of human existence are of concern to Existentialism.
metaphysics
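
The XML markup of this slide did not survive in the transcript, so only the text values remain. The Python sketch below shows one possible reconstruction and how it could be read back. Everything about the tagging is an assumption: the element names are taken from the term lists on the earlier slides, but which value belongs in which element, the shortened definition text, and the helper `summarize` are guesses for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the "ontology" term. The assignment of
# values to elements is assumed, not recovered from the original slide.
TERM_XML = """
<Term>
  <Name>ontology</Name>
  <DefinitionSection>
    <Concept>semantic web</Concept>
    <Concept>knowledge management</Concept>
    <Definition>Defines the common words and concepts used to describe
    and represent an area of knowledge, and so standardizes the
    meanings.</Definition>
    <Source>Daconta, Obrst, Smith</Source>
    <RelatedTerm>taxonomy</RelatedTerm>
    <RelatedTerm>OWL</RelatedTerm>
  </DefinitionSection>
  <DefinitionSection>
    <Concept>philosophy</Concept>
    <Definition>The metaphysical study of the nature of being and
    existence.</Definition>
    <Source>WordNet</Source>
    <Usage>Both the ontology and manner of human existence are of
    concern to Existentialism.</Usage>
    <RelatedTerm>metaphysics</RelatedTerm>
  </DefinitionSection>
</Term>
"""


def summarize(term_xml):
    """List the concepts and source of each DefinitionSection of one Term."""
    term = ET.fromstring(term_xml)
    return [
        {
            "concepts": [c.text for c in section.findall("Concept")],
            "source": section.findtext("Source"),
        }
        for section in term.findall("DefinitionSection")
    ]


if __name__ == "__main__":
    for entry in summarize(TERM_XML):
        print(entry)
```

The point of the structure is that each DefinitionSection carries its own concept, source, and related terms, so one term can hold several agency-specific meanings side by side.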

17
XML Example: Client-Side XSLT (Firefox)
18
XML Example: Client-Side XSLT (Internet Explorer)
19
XML Example: XSLT Details
  • DefinitionSection based on Concept
  • CSS Styling
  • Optional and Repeatable Elements
  • New DefinitionSection based on 2nd Concept
  • Auto-generated Search Links
20
Collaboration: Merging Instances
  • Since a Glossary consists of one or more Terms, a
    relatively simple XSLT can be created to merge
    the Term elements for two or more XML instances.
  • This means different authors (from the same or
    different agencies) can work independently.
  • Issue: What if the same Term is defined by
    different authors? Automatically add each
    definition, even though they may overlap or
    conflict, or manually edit collisions (could
    generate a conflict message)?
  • Issue: Should the agency name be a Source or
    another element (e.g., AgencySource)? The
    advantage of a separate element is that custom
    XSLT could extract or render terms on a
    per-agency basis, if desired. Should there be an
    optional, repeatable SourceLink element for a URL?
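
The merge step and the first issue above can be explored with a short Python sketch in place of the XSLT the slide proposes. It takes one of the options raised (automatically add every definition and report the collisions) purely as an illustration, not as the presentation's decision. The Glossary/Term/Name/DefinitionSection nesting, the case-insensitive name matching, and the function `merge_glossaries` are assumptions.

```python
import copy
import xml.etree.ElementTree as ET


def merge_glossaries(*xml_texts):
    """Merge the Term elements of several glossary instances.

    Terms with the same name (compared case-insensitively, an assumption)
    are combined by appending their DefinitionSection elements. Returns the
    merged Glossary element and the names of terms that ended up with
    definitions from more than one instance.
    """
    by_name = {}
    conflicts = []
    for text in xml_texts:
        for term in ET.fromstring(text).findall("Term"):
            name = term.findtext("Name", default="").strip()
            key = name.lower()
            if key not in by_name:
                by_name[key] = copy.deepcopy(term)
                continue
            sections = [copy.deepcopy(s) for s in term.findall("DefinitionSection")]
            by_name[key].extend(sections)
            total = len(by_name[key].findall("DefinitionSection"))
            # A collision exists only if both sides contributed a definition.
            if sections and total > len(sections) and name not in conflicts:
                conflicts.append(name)
    merged = ET.Element("Glossary")
    for key in sorted(by_name):
        merged.append(by_name[key])
    return merged, conflicts
```

A stub term that later receives its first definition from another author is not reported, since nothing overlaps; only terms defined in two or more instances are flagged for manual review.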

21
Alternative: GlossXML
22
Alternative: XML Acronym Demystifier
23
Next Steps
  • Determine interested agencies.
  • Establish funding.
  • Resolve terminology issues for the Glossary
    model.
  • Consider merging with, or replacement by,
    GlossXML and/or XML Acronym Demystifier.
  • Need to finalize DTD or XML Schema before
    agencies start authoring.
  • Revise initial XSLT to match final Glossary
    model.
  • Determine repository and submission mechanisms.
  • Could be a good use for CORE.gov?
  • Coordinate with Plans for Derived XML Registry
    Prototype?
  • Write additional XSLT stylesheets for merging and
    pulling agency-specific terms, etc.
  • Develop XSL-FO stylesheets for PDF rendering of
    Glossary.