Multilingual Generation of Controlled Languages

1
Multilingual Generation of Controlled Languages
  • Richard Power (ITRI)
  • Donia Scott (ITRI)
  • Anthony Hartley (CTS)

ITRI: Information Technology Research Institute,
University of Brighton, UK
CTS: Centre for Translation Studies,
University of Leeds, UK
2
Background
  • Since 1993, NLG projects at ITRI have focussed on
    the problem of producing technical documentation
    in multiple languages (Drafter, CLIME, PILLS,
    CLEF).
  • A typical application is PILLS, in the
    pharmaceutical domain, where, for example, patient
    information leaflets are produced in around 150
    languages and revised often.
  • ITRI introduced the WYSIWYM (What You See Is What
    You Meant) method for editing knowledge for NLG.
    A similar idea is used in XRCE's MDA
    (Multilingual Document Authoring) approach.
  • The talk describes current work on widening the
    coverage of WYSIWYM so that it can edit complete
    patient information leaflets.

3
Overview
  • Problem: how to produce documents in controlled
    languages (CLs)
  • Approach: create a direct manipulation CL editor
    by analogy with a drawing tool
  • Examples of how such an editor might work
  • Snapshots of prototypes
  • Advantages and disadvantages
  • Future developments

4
Methods for controlling language (1)
  • A trained author writes a text, trying to comply
    with the rules of a CL.
  • Tools for checking terminology, grammar, and
    style identify non-compliant sentences and may
    generate possible alternatives.
  • If versions in other languages are needed, an MT
    system should make fewer mistakes if the source
    text is in a CL.
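
As a rough illustration of the checking tools mentioned above, the sketch below flags sentences that break two made-up rules: a word outside an approved vocabulary, or a sentence longer than a word limit. The vocabulary, the limit, and the name `check_sentence` are assumptions for illustration only; they are not rules of any real controlled language or part of any tool named in the talk.

```python
import re

# Hypothetical CL rules, invented for this sketch.
APPROVED_WORDS = {"the", "a", "patient", "swallows", "tablet", "with", "water"}
MAX_WORDS = 6


def check_sentence(sentence):
    """Return a list of rule violations (empty if the sentence complies)."""
    words = re.findall(r"[a-z]+", sentence.lower())
    problems = []
    if len(words) > MAX_WORDS:
        problems.append(f"too long: {len(words)} words (limit {MAX_WORDS})")
    for word in words:
        if word not in APPROVED_WORDS:
            problems.append(f"unapproved word: {word}")
    return problems


if __name__ == "__main__":
    print(check_sentence("The patient swallows a tablet."))
    print(check_sentence("The patient ingests a tablet."))
```

A checker of this kind only reports problems after the author has written the text, which is the contrast the later slides draw with an editor that never offers a non-compliant option.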

5
Methods for controlling language (2)
  • The content of a document is already encoded in a
    formal knowledge base.
  • A language generation tool generates text from
    this encoding of content, using a grammar and
    lexicon which guarantees compliance with a CL
    (Danlos et al., 2000).
  • Versions in other languages can be generated from
    the same knowledge base; no interpretation is
    required.

6
Methods for controlling language (3)
  • The author creates the text through a direct
    manipulation interface in which all options are
    generated by the program. These options guarantee
    compliance with a CL.
  • Editing options are linked to features in an
    underlying interlingua, so that as well as
    creating a text, the author implicitly creates a
    formal encoding of the content.
  • Versions in other languages can be generated from
    the same formal encoding; no interpretation is
    required.

7
Xfig editor for controlled drawings
Can we develop a CL editor by analogy with a
drawing tool?
8
Complex nested drawings using Xfig
9
Constraints of a drawing editor
  • The author can create instances of a number of
    predefined patterns (rectangle, oval, etc.).
  • Instances can be configured by changing a set of
    predefined features (colour, size, line
    thickness, etc.).
  • Instances can be located at various points in the
    drawing (depending on grid setting).
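
The three constraints above can be sketched in a few lines. This is a minimal illustration, not Xfig's actual data model: the pattern and feature names come from the slide, while `Shape`, `snap`, `create`, and the default grid size are assumptions.

```python
from dataclasses import dataclass

PATTERNS = {"rectangle", "oval"}                  # predefined patterns
FEATURES = {"colour", "size", "line_thickness"}   # predefined features


@dataclass
class Shape:
    pattern: str
    x: int
    y: int
    features: dict


def snap(value, grid):
    """Move a coordinate to the nearest grid line."""
    return round(value / grid) * grid


def create(pattern, x, y, grid=10, **features):
    """Create an instance; reject anything outside the predefined sets."""
    if pattern not in PATTERNS:
        raise ValueError(f"unknown pattern: {pattern}")
    unknown = set(features) - FEATURES
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return Shape(pattern, snap(x, grid), snap(y, grid), features)
```

The author never specifies an arbitrary shape or position: every request is either mapped to a permitted value (the grid) or refused (an unknown pattern or feature).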

10
Controlled character editing
Text editor
The author can create instances of predefined
patterns (letters, punctuation marks), configure
them by predefined parameters (font, bold, size,
colour, etc.), and place them at permitted
locations.
11
General requirements for editing tool
  • The tool allows users to create instances of
    predefined types, and to place them at
    constrained locations.
  • Once created, instances can be configured by
    varying a predefined set of parameters.
  • Instances can also be deleted, or cut, or copied,
    or pasted into other locations.

12
Editing tool for controlled languages
  • Author can create instances of patterns based on
    verbs, nouns etc. (e.g., sentences, noun
    phrases).
  • Once created, instances can be configured by
    varying parameters like tense, polarity, and
    number, or by introducing modifiers. They can
    also be deleted or cut/copied/pasted.
  • However, what counts as a location within a
    linguistic pattern (e.g., a sentence)?

13
Location in a CL editor
  • Text editor: location is a point within the
    character sequence.
  • Drawing tool: location is an area within a
    two-dimensional grid.
  • Controlled Language editor: since we are editing
    linguistic form rather than a character sequence,
    location might be defined as a node within a
    hierarchical structure (?).
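
One way to picture "location as a node" is to address each node by the path of role names leading to it. The sketch below is an assumption about how such a structure could be represented; the dictionary layout and the names `node_at` and `empty_locations` are hypothetical.

```python
# A sentence as a nested structure; a location is a path of role names.
sentence = {
    "concept": "swallow",
    "roles": {
        "ARG1": {"concept": "patient", "roles": {}},
        "ARG2": None,  # empty location, still to be filled
    },
}


def node_at(tree, path):
    """Follow a sequence of role names down from the root."""
    node = tree
    for role in path:
        node = node["roles"][role]
    return node


def empty_locations(tree, prefix=()):
    """Yield the path of every unfilled role in the structure."""
    for role, child in tree["roles"].items():
        path = prefix + (role,)
        if child is None:
            yield path
        else:
            yield from empty_locations(child, path)
```

Here the empty path `()` addresses the whole sentence, `("ARG1",)` addresses the agent, and `empty_locations` lists the places where a constituent can still be added.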
14
Editing a hierarchical structure (Step 1)
Some specialised drawing tools edit hierarchical
structures. In this example, the aim is to
configure a house. The first step is to choose a
basic house pattern.
In a hierarchical structure, locations are points
within an existing pattern where appropriate
constituents may be added.
15
Step 2: Selecting a constituent (door)
Once a pattern has been selected, it can be
reconfigured. Having chosen the one-door
one-window pattern we can for example add a
garage.
Instead of reconfiguring the basic house pattern,
the author can click on a location where a
constituent must be added.
16
Step 3: Choosing a basic door pattern
Highlighting in red shows which part of the
current design has been selected for adding a new
constituent, or for reconfiguring an existing one.
Having selected a location, the user is presented
with a set of suitable options. Each option is a
basic pattern which can be configured later.
17
Step 4: Configuring the door pattern
  • Three configuration parameters can be varied:
      • Cross on window
      • Letter box
      • Cat flap

Having chosen a basic door pattern, the user can
reconfigure it, for instance by adding a letter
box.
18
Step 5: Selecting a constituent (window)
The configuration options change once the letter
box has been added. The options for varying the
other parameters (window cross, cat flap) now
include the letter box.
Satisfied with the door, the user selects the
other location where a new constituent can be
added.
19
Step 6: Choosing a basic window pattern
The window location is now highlighted in red, to
show that it has been selected.
Once a basic window pattern has been selected,
the design will be potentially complete, because
all empty locations will be filled.
20
Result: Completed design for a house
To simplify, we assume there are no configuration
options for windows.
Editing could stop here. Alternatively, the user
could change the design by further operations
(delete window, reconfigure house, etc.).
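
The house walkthrough can be condensed into a small sketch: a pattern defines its empty locations, each location is filled with a basic pattern, a filled constituent can be configured, and the design counts as potentially complete once no location is empty. The data layout and function names are assumptions for illustration; the slot and parameter names follow the slides.

```python
# Hypothetical pattern catalogue for the house example.
HOUSE_PATTERNS = {"one-door one-window": ["door", "window"]}
DOOR_PARAMS = ("window_cross", "letter_box", "cat_flap")


def new_house(pattern):
    """Create a house whose constituent locations are all empty."""
    return {"pattern": pattern,
            "slots": {slot: None for slot in HOUSE_PATTERNS[pattern]}}


def fill(house, slot, basic_pattern):
    """Put a basic pattern into one of the house's locations."""
    if slot not in house["slots"]:
        raise KeyError(f"no such location: {slot}")
    house["slots"][slot] = {"pattern": basic_pattern, "params": set()}


def configure_door(house, param):
    """Switch on one of the predefined door parameters."""
    if param not in DOOR_PARAMS:
        raise ValueError(f"not a door parameter: {param}")
    house["slots"]["door"]["params"].add(param)


def is_complete(house):
    """The design is potentially complete once no location is empty."""
    return all(value is not None for value in house["slots"].values())
```

Following the slides, the calls would be `new_house("one-door one-window")`, `fill(house, "door", ...)`, `configure_door(house, "letter_box")`, and `fill(house, "window", ...)`, after which `is_complete(house)` holds.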
21
Editing a CL sentence (Step 1)
Document: [Something is the case]
Options:
  • Someone asks someone something
  • Something attacks something
  • - - - etc. - - -
  • Someone reads something
  • Someone swallows something
  • - - - etc. - - -
The Document pane shows an anchor, a generic
phrase in square brackets. This represents a
location where a specific event pattern may be
inserted. The pattern is selected from a list of
options.
22
Step 2: Selecting a constituent (agent)
Document: Someone swallows something
Options:
  • Someone might swallow something
  • Someone must swallow something
  • Someone does not swallow something
  • Someone swallowed something
  • Someone will swallow something
  • Someone swallows something somewhere
  • Someone swallows something in some way
  • - - - etc. - - -
Having selected the swallow pattern with its
parameters defaulted (e.g., present tense), we
can choose from configuration options.
Alternatively we can select a location within the
pattern, such as the agent role.

23
Step 3: Choosing a basic agent pattern
Document: Someone swallows something
Options:
  • a doctor
  • a man
  • a patient
  • a pharmacist
  • a woman
  • - - - etc. - - -
The location corresponding to the unspecified
agent is highlighted in red. As in the house
editor, options are offered only if they are
suitable for the location. The suitable options
in this case are noun phrases referring to agents.
24
Step 4: Configuring the agent pattern
Document: A patient swallows something
Options:
  • patients
  • the patient
  • a some kind of patient
  • a patient who does something
  • - - - etc. - - -
The configuration options for nominals vary
parameters corresponding to singular vs. plural,
definite vs. indefinite, and potential modifiers
(e.g., adjective, relative clause).
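
The first two nominal parameters (number and definiteness) are enough to reproduce the leading options on this slide. The sketch below is a simplification under stated assumptions: `Nominal`, `render`, and `configuration_options` are hypothetical names, plurals are formed by naively adding "s", and modifiers are left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Nominal:
    noun: str
    definite: bool = False
    plural: bool = False


def render(n):
    """Produce an English noun phrase for the current parameter values."""
    head = (n.noun + "s") if n.plural else n.noun   # naive regular plural
    if n.definite:
        return f"the {head}"
    if n.plural:
        return head                                  # bare indefinite plural
    article = "an" if n.noun[0] in "aeiou" else "a"
    return f"{article} {head}"


def configuration_options(n):
    """Phrases obtained by flipping exactly one parameter."""
    return [render(replace(n, plural=not n.plural)),
            render(replace(n, definite=not n.definite))]
```

Starting from "a patient", the options come out as "patients" and "the patient"; once "the patient" is chosen they become "the patients" and "a patient", matching the first two entries in the option lists of Steps 4 and 5.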
25
Step 5: Selecting a constituent (object)
Document: The patient swallows something
Options:
  • the patients
  • a patient
  • the some kind of patient
  • the patient who does something
  • - - - etc. - - -
Assuming the user does not want to configure the
agent any more, the next step is to select the
object location.
26
Step 6: Choosing a basic object pattern
Document: The patient swallows something
Options:
  • a button
  • a capsule
  • a cream
  • - - - etc. - - -
  • a medicine
  • a tablet
  • water
  • - - - etc. - - -

Once an object pattern has been selected, the
sentence is potentially complete, although it can
be configured further if desired.
27
Result: Completed event
Document: The patient swallows a tablet
Options:
  • tablets
  • the tablet
  • a some kind of tablet
  • a tablet which does something
  • - - - etc. - - -
28
What are we really editing?
Drawing editor
Underlying formal encoding:
  HEIGHT 3.0 in
  WIDTH 2.0 in
  LINE THICKNESS 1
  LINE COLOUR black
  FILL COLOUR green
Presentational form: [drawing not transcribed]
29
What are we really editing?
Text editor
Underlying formal encoding (character codes):
  84 104 101 32 112 97 116 105 101 110 116
Presentational forms: The patient (shown more than
once on the slide)
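
The character codes behind "The patient" can be checked directly with Python's built-in `ord` and `chr`; the variable names are only illustrative.

```python
text = "The patient"

# The underlying encoding: one integer code per character.
codes = [ord(ch) for ch in text]
print(codes)

# The same codes can be turned back into the displayed text.
restored = "".join(chr(code) for code in codes)
print(restored)
```

The list of integers stays the same whatever font, size, or colour is used to display the text, which is the point of the slide: presentation varies while the encoding does not.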
30
What are we really editing?
Controlled English editor
Underlying formal encoding:
  CATEGORY nominal
  HEAD NOUN patient
  DETERMINER the
  NUMBER singular
  MODIFIERS none
Presentational form: the patient
31
What are we really editing?
Controlled interlingua editor
Underlying formal encoding:
  CLASS person
  CONCEPT patient
  IDENTIFIABLE yes
  NUMBER single
  QUALIFIERS none
Presentational forms:
  • the patient
  • il paziente
  • o paciente
  • patienten
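
A minimal sketch of how one language-neutral encoding could yield several surface forms is given below. The lexicon entries, language codes, and the function `realise` are assumptions for illustration; only the single, identifiable case from the slide is handled, and the fourth form on the slide ("patienten", where definiteness is marked by a suffix) is left out to keep the sketch short.

```python
# Hypothetical lexicons: concept -> (singular noun, definite article).
LEXICON = {
    "en": {"patient": ("patient", "the")},
    "it": {"patient": ("paziente", "il")},
    "pt": {"patient": ("paciente", "o")},
}

encoding = {
    "CLASS": "person",
    "CONCEPT": "patient",
    "IDENTIFIABLE": "yes",
    "NUMBER": "single",
    "QUALIFIERS": "none",
}


def realise(enc, lang):
    """Render a single, identifiable entity; other settings are not covered."""
    if enc["NUMBER"] != "single" or enc["IDENTIFIABLE"] != "yes":
        raise NotImplementedError("this sketch handles only 'the X'")
    noun, article = LEXICON[lang][enc["CONCEPT"]]
    return f"{article} {noun}"


if __name__ == "__main__":
    for lang in LEXICON:
        print(lang, realise(encoding, lang))
```

Nothing in `encoding` mentions a determiner or a word of any particular language; the language-specific choices live entirely in the lexicon and the realisation step.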
32
Choosing an event concept
Document: [Something is the case]
Underlying node:
  CLASS event
  CONCEPT
  MODALITY
  POLARITY
  TIME
  QUALIFIERS
Concepts subsumed by event:
  • ask(person,person,fact)
  • attack(thing,thing)
  • - - - etc. - - -
  • read(person,thing)
  • swallow(person,thing)
  • - - - etc. - - -
Anchors in the feedback text correspond to
generic types in the ontology (e.g., event),
which subsume a set of specific conceptual
patterns from which users may choose.
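
Subsumption can be sketched as a walk up a parent table: a generic type subsumes a concept if it is the concept itself or one of its ancestors. The table below is a hypothetical fragment built from names on the slides, and `subsumes` and `choices` are illustrative names, not the system's.

```python
# Hypothetical ontology fragment: concept or type -> parent type.
PARENT = {
    "event": None, "person": None, "thing": None,
    "ask": "event", "attack": "event", "read": "event", "swallow": "event",
    "doctor": "person", "patient": "person",
    "tablet": "thing", "water": "thing",
}


def subsumes(general, specific):
    """True if `general` is `specific` or one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = PARENT[node]
    return False


def choices(anchor_type):
    """Specific concepts a user may choose at an anchor of the given type."""
    return sorted(c for c in PARENT
                  if c != anchor_type and subsumes(anchor_type, c))
```

With this table, an anchor of type `event` offers ask, attack, read, and swallow, while an anchor of type `person` offers only doctor and patient, so unsuitable options never reach the menu.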
33
Presenting event patterns
Document: [Something is the case]
Options:
  • Someone asks someone something
  • Something attacks something
  • - - - etc. - - -
  • Someone reads something
  • Someone swallows something
  • - - - etc. - - -
To present the options, a sentence pattern is
generated for each event pattern specified by the
ontology.
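
A sentence pattern can be produced from an event pattern by replacing each argument class with a generic phrase. The sketch below assumes a naive third-person form (verb plus "s") and a hypothetical mapping from classes to generic phrases; it reproduces the four options shown on the slide but is not a general generator.

```python
# Event patterns as (concept, argument classes), following the slide notation.
EVENT_PATTERNS = [
    ("ask", ("person", "person", "fact")),
    ("attack", ("thing", "thing")),
    ("read", ("person", "thing")),
    ("swallow", ("person", "thing")),
]

# Generic phrase shown for an unfilled argument of each class (assumed).
GENERIC = {"person": "someone", "thing": "something", "fact": "something"}


def sentence_pattern(concept, arg_classes):
    """Render an event pattern with every argument left generic."""
    subject, *rest = (GENERIC[c] for c in arg_classes)
    verb = concept + "s"  # naive third-person singular, enough for these verbs
    return " ".join([subject.capitalize(), verb, *rest])


if __name__ == "__main__":
    for concept, args in EVENT_PATTERNS:
        print(sentence_pattern(concept, args))
```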
34
Configuring an event
The heavy border on the rectangle means that this
node is currently selected.
Document: Someone swallows something
Event:
  CLASS event
  CONCEPT swallow
  MODALITY none (possible, obligatory)
  POLARITY positive (negative)
  TIME present (past, future)
  QUALIFIERS none (place, manner)
ARG1: CLASS person (CONCEPT, IDENTIFIABLE, NUMBER,
QUALIFIERS unset)
ARG2: CLASS thing (CONCEPT, IDENTIFIABLE, NUMBER,
QUALIFIERS unset)
When a pattern is chosen, its configuration
parameters are initially set to default values.
Configuration options are computed from the
alternative values for each parameter (shown here
in brackets).
35
Presenting configuration options
Document: Someone swallows something
Options:
  • Someone might swallow something
  • Someone must swallow something
  • Someone does not swallow something
  • Someone swallowed something
  • Someone will swallow something
  • Someone swallows something somewhere
  • Someone swallows something in some way
  • - - - etc. - - -
Each configuration option is generated from an
event pattern which is identical to the current
pattern except that one parameter is varied.
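
The rule "identical except that one parameter is varied" can be sketched as a loop over a parameter table. The parameter names and values are those listed for the event on the earlier slide; the dictionary representation and the function names are assumptions, and the step that turns each variant into an English sentence is omitted.

```python
# Default value first, then the alternatives, as listed for the event node.
PARAMETERS = {
    "MODALITY": ["none", "possible", "obligatory"],
    "POLARITY": ["positive", "negative"],
    "TIME": ["present", "past", "future"],
    "QUALIFIERS": ["none", "place", "manner"],
}


def default_event(concept):
    """A newly chosen pattern starts with every parameter at its default."""
    event = {"CONCEPT": concept}
    event.update({name: values[0] for name, values in PARAMETERS.items()})
    return event


def configuration_options(event):
    """Return copies of `event` that each differ in exactly one parameter."""
    options = []
    for name, values in PARAMETERS.items():
        for value in values:
            if value != event[name]:
                options.append({**event, name: value})
    return options
```

For a default swallow event this yields seven variants, one for each of the seven options listed on the slide (might, must, does not, past, future, somewhere, in some way).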
36
Choosing an agent concept
Document: Someone swallows something
Event:
  CLASS event
  CONCEPT swallow
  MODALITY none
  POLARITY positive
  TIME present
  QUALIFIERS none
ARG1: CLASS person (CONCEPT, IDENTIFIABLE, NUMBER,
QUALIFIERS unset)
ARG2: CLASS thing (CONCEPT, IDENTIFIABLE, NUMBER,
QUALIFIERS unset)
Concepts subsumed by person: doctor, man, patient,
pharmacist, woman, etc.
37
Presenting agent patterns
Document: Someone swallows something
Options:
  • a doctor
  • a man
  • a patient
  • a pharmacist
  • a woman
  • - - - etc. - - -
38
Configuring a person/object
Document: A patient swallows something
Event:
  CLASS event
  CONCEPT swallow
  MODALITY none
  POLARITY positive
  TIME present
  QUALIFIERS none
ARG1:
  CLASS person
  CONCEPT patient
  IDENTIFIABLE no (yes)
  NUMBER single (multiple)
  QUALIFIERS none (property, event)
ARG2: CLASS thing (CONCEPT, IDENTIFIABLE, NUMBER,
QUALIFIERS unset)
39
Presenting the configuration options
Document: A patient swallows something
Options:
  • patients
  • the patient
  • a some kind of patient
  • a patient who does something
  • - - - etc. - - -
40
Result of configuring operation
Document: The patient swallows something
Event:
  CLASS event
  CONCEPT swallow
  MODALITY none
  POLARITY positive
  TIME present
  QUALIFIERS none
ARG1:
  CLASS person
  CONCEPT patient
  IDENTIFIABLE yes (no)
  NUMBER single (multiple)
  QUALIFIERS none (property, event)
ARG2: CLASS thing (CONCEPT, IDENTIFIABLE, NUMBER,
QUALIFIERS unset)
41
Presenting new configuration options
Document: The patient swallows something
Options:
  • the patients
  • a patient
  • the some kind of patient
  • the patient who does something
  • - - - etc. - - -
42
Implementing the CL editor
  • So far, two programs have been implemented:
  • Editing patient information leaflets in English
    and Italian, using language-specific syntactic
    structure as the underlying representation. The
    English and Italian versions must be produced
    separately.
  • The same, using an interlingual semantic
    structure as the underlying representation. A
    single underlying representation is sufficient
    for both languages, so the author only needs to
    create one version.

(No attempt has been made yet to comply with the
rules of any particular controlled language.)
47
Advantages of CL editing
  • The author need not learn the rules of a CL.
    Compliance is guaranteed by the options offered
    by the program.
  • If the underlying representation is a semantic
    interlingua, equivalent versions can be generated
    in other languages.
  • If the content of a document changes, the author
    can use CL editing to modify the underlying
    representation, and then regenerate documents in
    all the required languages.

48
Disadvantages of CL editing
  • Within the limits of a CL, there are stylistic
    options which a human author can probably control
    better than a program.
  • An experienced author can create a CL document
    more quickly by typing into a text editor than by
    selecting options from menus.
  • While CL editing brings the added benefit of
    reliable generation in other languages, authors
    (and their bosses) may not perceive this as
    sufficient compensation.

49
Future developments
  • Evaluating the user interface (some pilot studies
    already under way)
  • Using CL editing to supplement and correct
    semantic models derived using information
    extraction from legacy documents
  • Allowing some control over stylistic options