Introduction to Natural Language Generation

1
Introduction to Natural Language Generation
  • Yael Netzer
  • Department of Computer Science
  • Ben Gurion University

2
Outline
  • Introduction: what is NLG?
  • Traditional architecture of an NLG system
  • Statistical methods in NLG
  • FUF/SURGE
  • An example in Hebrew: the noun phrase
  • A statistical method for generation

3
What is Natural Language Generation (NLG)?
  • "NLG is the process of constructing natural
    language outputs from non-linguistic inputs."
    [VanLinden]
  • "NLG is mapping some communication goal to some
    surface utterance that satisfies the goal."
    [Reiter & Dale]

4
Aspects in NLG
  • Theoretical and practical interests
  • Theoretical: modeling various depths of human
    language representation and production.
  • Practical: engineering human/computer interfaces
    (computer as an author or authoring aid).

5
Example systems
  • NLG as an author
  • Weather reports (FoG)
  • Stock market descriptions
  • Museum artifact descriptions (ILEX)
  • Personal letters to customers (AlethGen)
  • NLG as an authoring aid
  • Integrated (partial) NLG uses
  • NLG in augmentative and alternative communication
  • Summarization (integrating cut-and-paste
    techniques with generation)
  • Machine translation (generation from interlingua)

6
Inputs of NLG systems
  • Formally, the input to a system can be defined
    as a four-tuple (k, c, u, d)
  • k - knowledge source (tables of numbers, knowledge
    representation language): domain dependent, no
    generalizations.
  • c - communicative goal: the consequence of a
    given execution of the system (considering
    appropriate information).

7
Inputs of NLG systems (cont.)
  • u - user model: characterization of the hearer or
    intended audience for whom the text is to be
    generated.
  • d - discourse history: previous interactions
    between user and NLG system, used for controlling
    anaphoric forms and preventing repetitions.
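
The four-tuple can be written down as a small record type. This is a minimal sketch, not part of the original slides: the class name `NLGInput`, the field names, and the sample weather values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NLGInput:
    """Hypothetical container for the four-tuple (k, c, u, d)."""
    knowledge: dict                               # k: domain-dependent knowledge source
    goal: str                                     # c: communicative goal
    user_model: dict                              # u: characterization of the audience
    history: list = field(default_factory=list)   # d: discourse history


# Illustrative values only; no real system or data set is implied.
spec = NLGInput(
    knowledge={"month": "June", "mean_temp_c": 17.2, "rain_mm": 12.0},
    goal="summarize-monthly-weather",
    user_model={"expertise": "layperson"},
)
print(spec.goal, spec.history)
```

The discourse history starts empty and would grow as the system produces text, which is what lets later stages avoid repetition.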

8
The output for an NLG system
  • Any text conveying the communicative goal.
  • It can be a single word, like "yes" in a dialogue,
  • or a text consisting of many paragraphs in other
    cases.
  • The output should suit the medium:
  • web pages with hyperlinks, voice stream, etc.

9
Main (Pipeline) Architecture
  • Content determination
  • What information should be included in the text?
  • Document structuring
  • How should the text be organized?
  • Lexicalisation
  • Choosing particular words or phrases.
  • Aggregation
  • Composing chunks of information into sentences.
  • Referring expression generation
  • What properties should be used in referring to an
    entity?
  • Surface realization
  • Mapping the underlying content of the text to a
    grammatically correct sentence that expresses the
    desired meaning.
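
The six stages above can be sketched as functions applied in sequence. This is a toy illustration under stated assumptions, not the architecture of any particular system: every function name, the message format (plain dicts), the `LEXICON` table, and the weather figures are hypothetical.

```python
def determine_content(data):
    """Content determination: report only deviations from the average."""
    messages = []
    for prop in ("temperature", "rainfall"):
        delta = data[prop] - data["average"][prop]
        if delta != 0:
            messages.append({"entity": data["entity"], "property": prop, "delta": delta})
    return messages


def structure_document(messages):
    """Document structuring: a fixed order, temperature before rainfall."""
    order = {"temperature": 0, "rainfall": 1}
    return sorted(messages, key=lambda m: order[m["property"]])


# Hypothetical lexicon: (property, is_above_average) -> comparative adjective.
LEXICON = {
    ("temperature", True): "warmer", ("temperature", False): "cooler",
    ("rainfall", True): "wetter", ("rainfall", False): "drier",
}


def lexicalise(messages):
    """Lexicalisation: choose a word for each message."""
    return [dict(m, word=LEXICON[(m["property"], m["delta"] > 0)]) for m in messages]


def aggregate(messages):
    """Aggregation: merge messages about the same entity into one chunk."""
    chunks = {}
    for m in messages:
        chunks.setdefault(m["entity"], []).append(m["word"])
    return [{"entity": e, "words": ws} for e, ws in chunks.items()]


def refer(chunks):
    """Referring expression generation: a definite description per entity."""
    return [dict(c, np="the " + c["entity"]) for c in chunks]


def realise(chunks):
    """Surface realization: linearize each chunk into a sentence."""
    return [f"{c['np'].capitalize()} was {' and '.join(c['words'])} than average."
            for c in chunks]


def generate(data):
    """Run the stages in pipeline order, feeding each output to the next stage."""
    result = data
    for stage in (determine_content, structure_document, lexicalise,
                  aggregate, refer, realise):
        result = stage(result)
    return " ".join(result)


data = {"entity": "month", "temperature": 15.0, "rainfall": 20.0,
        "average": {"temperature": 17.0, "rainfall": 35.0}}
print(generate(data))  # The month was cooler and drier than average.
```

Each stage only sees the output of the previous one, which is the defining property of the pipeline design; real systems differ in where they draw the stage boundaries.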

10
Content Determination
  • Content determination
  • The process of deciding what to say.
  • No general rules: it is domain specific.
  • What is important: what should always be
    included, what is exceptional information, etc.
  • Practically: constructs a set of messages from
    the underlying data (entities, concepts and
    relations).

11
Document Structuring
  • Document Structuring
  • Imposing ordering and structure over the
    information:
  • conceptual grouping
  • rhetorical relationships.

12
Lexical choice
  • Lexical chooser
  • Determining the particular words to be used to
    express concepts and relations.
  • Trade-off: complexity of coding vs. richer language.
  • Choosing content words: information is mapped
    from a conceptual vocabulary.
  • The lexical chooser should supply a variety of
    words, consider the user model (precise vs.
    general description of a weather phenomenon), and
    account for pragmatic considerations (formal vs.
    casual style).

13
Aggregation
  • Aggregation can be performed at various stages:
  • In planning, the planner combines similar data.
  • In lexicalization, several concepts are aggregated
    into one lexical element.
  • Aggregation of sentences:
  • "The month was cooler than average. The month was
    drier than average." becomes "The month was cooler
    and drier than average."
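
Sentence-level aggregation of this kind can be sketched as a merge over clauses that differ in a single slot. The `Clause` type and its field names are assumptions made for this illustration; the slides do not prescribe a representation.

```python
from typing import NamedTuple


class Clause(NamedTuple):
    """Hypothetical flat clause representation."""
    subject: str
    verb: str
    adjective: str
    standard: str  # comparison standard, e.g. "than average"


def aggregate(clauses):
    """Merge adjacent clauses that differ only in their adjective."""
    merged = []  # list of ((subject, verb, standard), [adjectives])
    for c in clauses:
        key = (c.subject, c.verb, c.standard)
        if merged and merged[-1][0] == key:
            merged[-1][1].append(c.adjective)
        else:
            merged.append((key, [c.adjective]))
    return [f"{s} {v} {' and '.join(adjs)} {std}."
            for (s, v, std), adjs in merged]


clauses = [
    Clause("The month", "was", "cooler", "than average"),
    Clause("The month", "was", "drier", "than average"),
]
print(aggregate(clauses))  # ['The month was cooler and drier than average.']
```

Only adjacent clauses are merged, so the ordering chosen during document structuring is preserved. Joining more than two adjectives with "and" is a simplification.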

14
Referring expression generation
  • Referring Expression Generation
  • An entity can be referred to in many ways:
    initial reference, subsequent reference,
    distinguishing descriptions, definite
    descriptions, pronouns.
  • Proper names
  • [Hebrew examples; characters lost in the transcript]
  • Definite descriptions
  • The train that leaves at 10am
  • The next train.
  • Pronouns
  • it
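
One simple way to choose among these forms is to consult the discourse history. The sketch below is an assumption-laden toy: the `refer` function, the entity dictionary layout, and the reduced form "the train" are hypothetical, and real algorithms also check that the chosen description distinguishes the entity from others in context.

```python
def refer(entity, history):
    """Pick a referring expression based on the discourse history."""
    if history and history[-1] == entity["id"]:
        expr = entity["pronoun"]   # just mentioned: a pronoun is enough
    elif entity["id"] in history:
        expr = entity["short"]     # mentioned earlier: reduced description
    else:
        expr = entity["full"]      # first mention: full definite description
    history.append(entity["id"])   # record the mention
    return expr


# Hypothetical entities for illustration.
train = {"id": "t1", "full": "the train that leaves at 10am",
         "short": "the train", "pronoun": "it"}
station = {"id": "s1", "full": "the central station",
           "short": "the station", "pronoun": "it"}

history = []
print(refer(train, history))    # the train that leaves at 10am
print(refer(train, history))    # it
print(refer(station, history))  # the central station
print(refer(train, history))    # the train
```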

15
Syntactic realizer
  • Syntactic Realizer: syntax and morphology.
  • Most general, domain independent (but definitely
    language dependent).
  • Various Usage Scenarios
  • Input to syntactic realization is not observable
  • Input for syntactic realizers in NLG
  • What knowledge is needed to prepare input?
  • Who supplies this knowledge?
  • Can we find a common abstraction, common across
    languages and applications?

16
Possible techniques for realizers
  • Bi-directional grammar specification.
  • Grammar specifications tuned for generation.
  • Templates
  • Corpus statistics

17
A note on bi-directional grammar
  • Realization is, in some aspects, easier than
    parsing: no need to handle the full range of
    syntax that a human might use, no need to resolve
    ambiguities, no need to recover from ill-formed
    input.
  • A bi-directional grammar is, theoretically, a
    possible elegant approach.
  • However, most NLG systems use a
    generation-oriented grammar

18
Why not bi-directional?
  • Output of NLU parser is very different from the
    input to an NLG realizer.
  • Not obvious that lexicalization is a part of the
    realization.
  • Practically, not easy to engineer large
    bi-directional grammars.
  • Moreover, generation is a process of making
    choices, including the choice to use canned text
    when needed.

19
Syntactic Realizer
  • This work concerns Syntactic Realizers: the
    grammar.
  • Input for the grammar: a lexicalized
    representation of a phrase at various levels of
    abstraction.
  • Output of the grammar: a grammatical string,
    representing most accurately the information in
    the input.

20
The input question is

[Diagram labels: Application; Knowledge base; Content planner and lexicon; Syntactic Realizer; "Input??"]

21
FUF/SURGE - Implementation
  • The grammar is written in FUF (Functional
    Unification Formalism) [Elhadad]
  • FD (functional description): a list of (att val)
    pairs
  • val: atom | fd | path
  • Grammar: a meta-FD; disjunction with ALT, control
    with NONE, GIVEN, ANY.
  • All components in the generation process can be
    implemented with this formalism.
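
The core operation behind this formalism, unification of feature descriptions, can be sketched with nested dictionaries. This is not FUF itself: paths, NONE/GIVEN/ANY and backtracking are omitted, and the functions `unify` and `unify_alt` and the sample branches are hypothetical. The disjunction here commits to the first branch that unifies, which is a simplification.

```python
def unify(fd1, fd2):
    """Unify two FDs (nested dicts with atomic leaves); return None on failure."""
    if isinstance(fd1, dict) and isinstance(fd2, dict):
        result = dict(fd1)
        for att, val in fd2.items():
            if att in result:
                sub = unify(result[att], val)
                if sub is None:
                    return None      # conflicting values for the same attribute
                result[att] = sub
            else:
                result[att] = val    # attribute only in fd2: enrich the result
        return result
    return fd1 if fd1 == fd2 else None  # atoms unify only if equal


def unify_alt(fd, branches):
    """Disjunction in the spirit of ALT: first branch that unifies wins."""
    for branch in branches:
        result = unify(fd, branch)
        if result is not None:
            return result
    return None


# A toy "grammar" with two alternatives keyed on the cat feature.
grammar = [
    {"cat": "clause", "mood": "declarative"},
    {"cat": "np", "definite": "yes"},
]
input_fd = {"cat": "np", "head": {"lex": "girl"}}
print(unify_alt(input_fd, grammar))
# {'cat': 'np', 'head': {'lex': 'girl'}, 'definite': 'yes'}
```

The input is enriched rather than replaced: features supplied by the grammar branch are added to the input FD, and a clash on any shared attribute rejects that branch.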

22
Requirements for a syntactic realizer
  • Mapping thematic structure onto syntactic roles.
  • Control of syntactic paraphrasing and
    alternations.
  • Provision of defaults for syntactic features.
  • Propagation of agreement features.
  • Selection of closed class words.
  • The imposition of linear precedence constraints.
  • The inflection of open class words.

23
SURGE [Elhadad & Robin 96]
  • Draws on Functional Grammar, HPSG and descriptive
    studies of language.
  • Input for the grammar is a lexicalized
    representation of a phrase (a clause, NP, AP).
  • Keeping syntactic information in the input
    minimal isolates earlier stages of the process
    from purely syntactic knowledge, gives the
    grammar paraphrasing power, and is also useful
    for multilingual applications.

24
Input for SURGE in general
  • Each constituent has the feature cat which
    determines which part of the grammar it will be
    unified with.
  • The representation of the clause is mostly
    semantic: a process (in SFL terms) and its
    participants. Paraphrasing can be done using one
    feature, like focus.
  • The input of an NP uses mostly syntactic
    features.
  • Paraphrases require different input.

25
An Example
Input:
((cat clause)
 (tense past)
 (process ((type material)
           (agentless no)
           (lex kiss)))
 (participants
  ((agent ((cat proper)
           (lex John)))
   (affected ((cat common)
              (lex girl))))))

Output: John kissed the girl.
With the added feature (focus partic affected): The girl was kissed by John.
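
To make the effect of the focus feature concrete, the following toy realizer maps a dictionary version of the input above to the active or passive sentence. It is a sketch under heavy assumptions and is not SURGE: it handles only past-tense clauses with regular verbs, and the function names and the dictionary encoding of the FD are hypothetical.

```python
def noun_phrase(fd):
    """Proper nouns stand alone; common nouns get a definite article."""
    return fd["lex"] if fd["cat"] == "proper" else "the " + fd["lex"]


def realise_clause(fd):
    """Realize a past-tense material clause, passive if focus is on affected."""
    if fd["tense"] != "past":
        raise NotImplementedError("this sketch covers past tense only")
    agent = noun_phrase(fd["participants"]["agent"])
    affected = noun_phrase(fd["participants"]["affected"])
    verb = fd["process"]["lex"] + "ed"  # regular past form and participle only
    if fd.get("focus") == ("partic", "affected"):
        words = f"{affected} was {verb} by {agent}"   # passive voice
    else:
        words = f"{agent} {verb} {affected}"          # active voice (default)
    return words[0].upper() + words[1:] + "."


clause = {
    "cat": "clause",
    "tense": "past",
    "process": {"type": "material", "agentless": "no", "lex": "kiss"},
    "participants": {
        "agent": {"cat": "proper", "lex": "John"},
        "affected": {"cat": "common", "lex": "girl"},
    },
}
print(realise_clause(clause))                                   # John kissed the girl.
print(realise_clause(dict(clause, focus=("partic", "affected"))))  # The girl was kissed by John.
```

The semantic content (process and participants) is identical in both calls; only the single focus feature changes, which is the paraphrasing control the slides describe.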