7. Parsing in functional unification grammar

1
7. Parsing in functional unification grammar
  • Han gi-deuc

2
Contents
  • 7.1 Functional unification grammar
  • 7.1.1 Compilation
  • 7.1.2 Attributes and values
  • 7.1.3 Unification
  • 7.1.4 Patterns and constituent sets
  • 7.1.5 Grammar
  • 7.2 The parser
  • 7.2.1 The General Syntactic Processor
  • 7.2.2 The parsing grammar
  • 7.3 The compiler
  • 7.4 Conclusion

3
7.1 Functional unification grammar
  • The claim that this theory makes on the word
    "functional" in its title is therefore supported
    in three ways.
  • 1. It gives primary status to those aspects of
    language that have often been called functional;
    logical aspects are not privileged.
  • 2. It describes linguistic structures in terms of
    the function that a part fills in a whole, rather
    than in terms of parts of speech and ordering
    relations.
  • 3. Most important for this paper, it requires its
    grammars to function; that is, they must support
    the practical enterprises of language generation
    and analysis.

4
7.1.1 Compilation
  • This paper will concentrate on how this
    translation is actually carried out; it will, in
    short, be about machine translation between
    grammatical formalisms.
  • The kind of translation to be explored here is
    known in computer science as compilation, and the
    computer program that does it is called a
    compiler.
  • The term "compilation" almost always refers to a
    process that translates a text produced by a
    human into a text that is functionally
    equivalent, but not intended for human
    consumption.

5
7.1.2 Attributes and values
  • Functional unification grammar knows things by
    their functional descriptions (FDs). A simple FD
    is a set of descriptors, and a descriptor is a
    constituent set, a pattern, or an attribute with
    an associated value.
  • The list of descriptors that make up an FD is
    written in square brackets, with no significance
    attaching to the order. The attributes in an FD
    must be distinct from one another, so that if an
    FD F contains the attribute a, it is always
    possible to use the phrase "the a of F" to refer
    unambiguously to a value.
  • An attribute is a symbol, that is, a string of
    letters. A value is either a symbol or another FD.

6
7.1.2 Attributes and values
  • The sentence "He saw her"
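A minimal Python sketch of how an FD for such a sentence might be written down, assuming an FD is modelled as a nested dict. The attribute names and values below are illustrative guesses, not a copy of the slide's figure, and the helper `the` is a hypothetical name.

```python
# A simple FD modelled as a Python dict (an assumption of this sketch).
# Keys are attributes (symbols); values are symbols (str) or nested FDs (dict).
# Dict keys are unique and their order carries no meaning, which mirrors the
# two properties of FDs stated on the previous slide.
he_saw_her = {
    "CAT": "S",
    "SUBJ": {"CAT": "PRON", "CASE": "NOM", "NUMBER": "SING", "PERSON": "3"},
    "DOBJ": {"CAT": "PRON", "CASE": "ACC", "NUMBER": "SING", "PERSON": "3"},
    "VERB": {"CAT": "VERB", "LEX": "SEE"},
    "TENSE": "PAST",
}


def the(attribute, fd):
    """Return 'the <attribute> of <fd>'; unambiguous because attributes are distinct."""
    return fd[attribute]


# "The CASE of the SUBJ" of the sentence FD.
print(the("CASE", the("SUBJ", he_saw_her)))
```

A dict is a convenient stand-in because it enforces distinct attributes for free; constituent sets and patterns are left out of this sketch.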

7
7.1.3 Unification
  • A string of atoms enclosed in angle brackets
    constitutes a path, and there is at least one that
    identifies every value in an FD.
  • The path <a1 a2 ... ak> identifies the value of
    the attribute ak in the FD that is the value of
    <a1 a2 ... ak-1>. It can be read as "the ak of the
    ak-1 ... of the a1".
  • Paths are always interpreted as beginning in the
    largest FD that encloses them.
  • A pair consisting of a path in an FD and the
    value that the path leads to is a feature of the
    object described.
  • If the value is a symbol, the pair is a basic
    feature of the FD.
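The path and feature definitions above can be sketched in a few lines of Python, assuming FDs are nested dicts and a path is a tuple of attribute names. The function names `follow`, `features` and `basic_features` and the sample FD are hypothetical.

```python
def follow(fd, path):
    """Return the value that `path` identifies, starting at the outermost FD."""
    value = fd
    for attribute in path:
        value = value[attribute]
    return value


def features(fd, prefix=()):
    """Yield every (path, value) pair of the FD, including nested ones."""
    for attribute, value in fd.items():
        path = prefix + (attribute,)
        yield path, value
        if isinstance(value, dict):
            yield from features(value, path)


def basic_features(fd):
    """Keep only the features whose value is a symbol (here: a str)."""
    return {path: value for path, value in features(fd) if isinstance(value, str)}


# Illustrative FD; the attribute names are assumptions of this sketch.
sample = {"CAT": "S", "SUBJ": {"CAT": "PRON", "CASE": "NOM"}}

# <SUBJ CASE> reads as "the CASE of the SUBJ".
print(follow(sample, ("SUBJ", "CASE")))
print(basic_features(sample))
```

Because every attribute occurs at most once in an FD, each path leads to exactly one value, which is why `basic_features` can safely return a dict keyed by path.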

8
7.1.3 Unification
  • The sentence "He likes writing books"
  • Example of a path

9
7.1.3 Unification
  • The union of a pair of FDs is not, in general, a
    well-formed FD.
  • The reason is this: the requirement that a given
    attribute appear only once in an FD implies a
    similar constraint on the set of features
    corresponding to an FD.
  • A path must uniquely identify a value.

10
7.1.3 Unification
  • When two or more simple FDs are compatible, they
    can be combined into one simple FD describing
    those things that they both describe, by the
    process of unification.
  • Unification is the same as set union except that
    it yields the null set when applied to
    incompatible arguments.
  • A dedicated sign is used for unification; placed
    between α and β, it denotes the result of
    unifying α and β.
  • Unification is the fundamental operation
    underlying the analysis and synthesis of
    sentences using functional unification grammar.
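Unification of simple FDs can be sketched as a recursive merge, assuming FDs are nested dicts and symbols are strings. One deliberate departure from the slide: failure is signalled with `None` rather than an empty set, so that it cannot be confused with the empty FD. The function name `unify` is an assumption of this sketch.

```python
def unify(a, b):
    """Return the unification of two values (symbols or simple FDs), or None on failure."""
    # Symbols unify only with themselves; a symbol never unifies with an FD.
    if isinstance(a, str) or isinstance(b, str):
        return a if a == b else None
    # Both are FDs: start from a copy of `a` and fold in the descriptors of `b`.
    result = dict(a)
    for attribute, value in b.items():
        if attribute in result:
            merged = unify(result[attribute], value)
            if merged is None:
                return None  # the same path would lead to two different values
            result[attribute] = merged
        else:
            result[attribute] = value
    return result


verb = {"CAT": "VERB", "LEX": "RUN", "TENSE": "PRES"}
agreement = {"CAT": "VERB", "NUM": "SING", "PERS": "3"}

print(unify(verb, agreement))                      # compatible: behaves like set union
print(unify({"NUM": "SING"}, {"NUM": "PLUR"}))     # incompatible: fails
```

The sketch never mutates its arguments, so the same FD can be unified against many candidates.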

11
7.1.3 Unification
  • Example of Unification

12
7.1.4 Patterns and constituent sets
  • The value of SUBJ is the FD of a constituent of
    the sentence, whereas the value of ASPECT is not
  • The purpose of constituent sets and patterns is
    to identify constituents and to state constraints
    on the order of their occurrence
  • The value of the C-set attribute covers all
    constituents.

13
7.1.4 Patterns and constituent sets
  • Each pattern is a list whose members can be:
  • 1. A path. The path may have as its value:
    a. An FD. As in the case of the constituent set,
    the FD describes a constituent.
    b. A pattern. The pattern is inserted into the
    current one at this point.
  • 2. A string of dots. This matches any number of
    constituents.
  • 3. A special symbol. This matches any one
    constituent.
  • 4. An FD. This will match any constituent whose
    description is unifiable with it. The unification
    is made with a copy of the FD in the pattern,
    rather than with the FD itself, because the
    intention is to impute its properties to the
    constituent, but not to unify all the
    constituents that match this part of the pattern.
  • 5. An expression of the form ( fd), where fd is
    an FD. This matches zero or more constituents,
    provided they can all be unified with a copy of
    fd.
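A rough Python sketch of how members 2 to 5 of this list could be matched against a sequence of constituent FDs. Everything here is an assumption made for illustration: the sentinels `DOTS` and `ANY`, the `Star` wrapper for the "zero or more" form, and the choice to leave out path members altogether. The non-destructive `unify` plays the role of unifying with a copy of the pattern's FD.

```python
from dataclasses import dataclass

DOTS = object()  # stands in for the string of dots: any number of constituents
ANY = object()   # stands in for the symbol that matches exactly one constituent


@dataclass
class Star:
    """Zero or more constituents, each unifiable with a copy of `fd`."""
    fd: dict


def unify(a, b):
    """Non-destructive unification of symbols (str) or simple FDs (dict); None on failure."""
    if isinstance(a, str) or isinstance(b, str):
        return a if a == b else None
    result = dict(a)
    for attribute, value in b.items():
        if attribute in result:
            merged = unify(result[attribute], value)
            if merged is None:
                return None
            result[attribute] = merged
        else:
            result[attribute] = value
    return result


def matches(pattern, constituents):
    """True if the whole list of constituents fits the whole pattern."""
    if not pattern:
        return not constituents
    head, rest = pattern[0], pattern[1:]
    if head is DOTS:
        # Let the dots absorb 0, 1, 2, ... leading constituents.
        return any(matches(rest, constituents[i:]) for i in range(len(constituents) + 1))
    if isinstance(head, Star):
        # Either the star matches nothing, or it absorbs one more constituent.
        if matches(rest, constituents):
            return True
        return (bool(constituents)
                and unify(head.fd, constituents[0]) is not None
                and matches(pattern, constituents[1:]))
    if not constituents:
        return False
    if head is ANY:
        return matches(rest, constituents[1:])
    # A plain FD: the next constituent must be unifiable with it.
    return unify(head, constituents[0]) is not None and matches(rest, constituents[1:])


# Exactly one constituent with TRACE = NP, all others with TRACE = NONE.
one_trace = [Star({"TRACE": "NONE"}), {"TRACE": "NP"}, Star({"TRACE": "NONE"})]
print(matches(one_trace, [{"TRACE": "NONE"}, {"TRACE": "NP"}]))
```

The matcher only answers yes or no; a fuller version would also return the constituents enriched with the properties imputed by the pattern.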

14
7.1.4 Patterns and constituent sets
  • Expressions of pattern
  • The pattern (16) requires exactly one constituent
    to have the property [TRACE = NP]; all others
    must have the property [TRACE = NONE].

15
7.1.5 Grammar
  • A functional unification grammar is a single FD
  • Example (19) shows a simple grammar,
    corresponding to a context-free grammar
    containing the single rule (20)

16
7.2 The Parser
  • 7.2.1 The General Syntactic Processor
  • The input is an FD that constitutes the
    specification of a sentence to be uttered
  • There are two principal data structures, the
    chart and the agenda
  • The chart is a directed graph each of whose edges
    maps onto a substring of the sentence being
    analyzed

17
7.2.1 The General Syntactic Processor
  • Chart
  • k + 1 vertices for a sentence of k words
  • Each word in the sentence to be parsed is
    represented by an edge labeled with an FD
    obtained by looking that word up in the lexicon
  • If the word is ambiguous, that is, if it has more
    than one FD, it is represented by more than one
    edge.
  • All the edges for the i-th word clearly go from
    vertex i - 1 to vertex i
  • The label on an active edge has two parts, an FD
    describing what is known about the putative
    phrase, and a procedure that will carry the
    recognition of the phrase one step further forward
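The chart described above might be initialised as follows. This is a sketch under stated assumptions: the `Edge` class, the `initial_chart` function and the toy lexicon are all hypothetical, and an edge is treated as inactive when it carries no procedure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Edge:
    start: int                            # vertex the edge leaves
    end: int                              # vertex the edge enters
    fd: dict                              # what is known about the (putative) phrase
    procedure: Optional[Callable] = None  # present only on active edges

    @property
    def active(self):
        return self.procedure is not None


def initial_chart(words, lexicon):
    """Build the inactive lexical edges for a sentence of k words (k + 1 vertices)."""
    edges = []
    for i, word in enumerate(words, start=1):
        # One edge per lexical FD, so an ambiguous word contributes several edges.
        for fd in lexicon[word]:
            edges.append(Edge(i - 1, i, fd))
    return edges


# Toy lexicon (illustrative only): "saw" is ambiguous between verb and noun.
lexicon = {
    "he": [{"CAT": "PRON", "CASE": "NOM"}],
    "saw": [{"CAT": "VERB", "TENSE": "PAST"}, {"CAT": "NOUN"}],
    "her": [{"CAT": "PRON", "CASE": "ACC"}],
}

words = ["he", "saw", "her"]
chart = initial_chart(words, lexicon)
vertex_count = len(words) + 1
for edge in chart:
    print(edge.start, edge.end, edge.fd)
```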

18
7.2.1 The General Syntactic Processor
  • Parsing proceeds in a series of steps in each of
    which the procedure on an active edge is applied
    to a pair of FDs, one coming from that same
    active edge, and the other from an inactive edge
    that leaves the vertex where the active edge
    ends.
  • If a and i are an active and an inactive edge
    respectively, a being incident to the vertex that
    i is incident from, the step consists in
    evaluating Pa(fa, fi), where fa and fi are the
    FDs on a and i, and Pa is the procedure

19
7.2.1 The General Syntactic Processor
  • This process is carried out for every pair
    consisting of an active followed by an inactive
    edge that comes to be part of the chart. Each
    successful step leads to the introduction of one
    new edge, but this edge may result in several new
    pairs.
  • Each new pair produced therefore becomes a new
    item on the agenda, which serves as a queue of
    pairs waiting to be processed.
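The interplay of chart and agenda can be sketched as a small loop. The sketch assumes that a procedure takes the two FDs and returns either `None` (the step fails) or a pair of a new FD and the next procedure, with `None` as the next procedure marking an inactive edge. That return convention, the names `Edge`, `add_edge`, `run`, and the two toy procedures are assumptions, not the slides' definitions.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Edge:
    start: int
    end: int
    fd: dict
    procedure: Optional[Callable] = None  # None marks an inactive edge

    @property
    def active(self):
        return self.procedure is not None


def add_edge(edge, chart, agenda):
    """Put an edge in the chart and queue every new (active, inactive) pair it creates."""
    chart.append(edge)
    for other in chart:
        if edge.active and not other.active and other.start == edge.end:
            agenda.append((edge, other))
        elif not edge.active and other.active and other.end == edge.start:
            agenda.append((other, edge))


def run(chart, agenda):
    """Process queued pairs until the agenda is empty."""
    while agenda:
        active, inactive = agenda.popleft()
        result = active.procedure(active.fd, inactive.fd)
        if result is None:
            continue  # unsuccessful step: no new edge
        fd, next_procedure = result
        add_edge(Edge(active.start, inactive.end, fd, next_procedure), chart, agenda)


# Toy parsing procedures for "subject pronoun followed by verb" (illustrative only).
def expect_verb(matrix, constituent):
    if constituent.get("CAT") != "VERB":
        return None
    return {**matrix, "VERB": constituent}, None  # phrase complete: inactive edge


def expect_subject(matrix, constituent):
    if constituent.get("CAT") != "PRON":
        return None
    return {**matrix, "SUBJ": constituent}, expect_verb


chart, agenda = [], deque()
add_edge(Edge(0, 1, {"CAT": "PRON"}), chart, agenda)           # "he"
add_edge(Edge(1, 2, {"CAT": "VERB"}), chart, agenda)           # "saw"
add_edge(Edge(0, 0, {"CAT": "S"}, expect_subject), chart, agenda)
run(chart, agenda)
complete = [e for e in chart if not e.active and (e.start, e.end) == (0, 2)]
print(complete[0].fd)
```

Using a FIFO queue matches the slide's description of the agenda as a queue; other orderings would change the search order but not the set of edges found.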

20
7.2.2 The parsing grammar
  • The parsing grammar, as we have seen, takes the
    form of a set of procedures, each of which
    operates on a pair of FDs
  • One of these FDs, the matrix FD, is a partial
    description of a phrase, and the other, the
    constituent FD, is as complete a description as
    the parser will ever have of a candidate for
    inclusion as a constituent of that phrase

21
7.2.2 The parsing grammar

22
7.3 The compiler
  • The compiler has two major sections. The first
    part is a straightforward application of the
    generation program to put the grammar,
    effectively, into disjunctive normal form. The
    second is concerned with actually building the
    procedures
  • If F is a grammar, or indeed any complex FD, it is
    always possible to recast it in the form
    F1 ∨ F2 ∨ ... ∨ Fn, where the Fi (1 ≤ i ≤ n) each
    contain no alternations

23
7.3 The compiler
  • The process of generation from a particular input
    FD effectively selects those members of F1 ... Fn
    that can be unified with that FD, and then repeats
    this procedure recursively for each constituent. F
    is, in general, a conjunct containing some atomic
    terms and some alternations.

24
7.3 The compiler
  • Ignoring patterns for the moment, the procedure
    is as follows:
  • 1. Unify the atomic terms of F with the input FD.
    If this fails, the procedure as a whole fails.
    Some number of alternations now remain to be
    considered. In other words, the part of F that
    remains to be unified with the input FD is an
    expression F' of the form
    (a1.1 ∨ a1.2 ∨ ... ∨ a1.k1) ∧ (a2.1 ∨ a2.2 ∨ ... ∨ a2.k2) ∧ ... ∧ (an.1 ∨ an.2 ∨ ... ∨ an.kn)
  • 2. Rewrite F' as an alternation by multiplying out
    the terms of an arbitrary alternation in F', say
    the first one. This gives an expression F'' of the
    form
    (a1.1 ∧ (a2.1 ∨ a2.2 ∨ ... ∨ a2.k2) ∧ ... ∧ (an.1 ∨ an.2 ∨ ... ∨ an.kn))
    ∨ (a1.2 ∧ (a2.1 ∨ a2.2 ∨ ... ∨ a2.k2) ∧ ... ∧ (an.1 ∨ an.2 ∨ ... ∨ an.kn))
    ∨ ...
    ∨ (a1.k1 ∧ (a2.1 ∨ a2.2 ∨ ... ∨ a2.k2) ∧ ... ∧ (an.1 ∨ an.2 ∨ ... ∨ an.kn))
  • 3. Apply the whole procedure (steps 1-3)
    separately to each conjunct in F''
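Steps 1 to 3 can be sketched as a short recursive function. The representation is an assumption of this sketch: F is split into a dict of atomic terms plus a list of alternations, and each alternative is itself a simple FD with no nested alternation. The names `expand` and `unify` are hypothetical.

```python
def unify(a, b):
    """Non-destructive unification of symbols (str) or simple FDs (dict); None on failure."""
    if isinstance(a, str) or isinstance(b, str):
        return a if a == b else None
    result = dict(a)
    for attribute, value in b.items():
        if attribute in result:
            merged = unify(result[attribute], value)
            if merged is None:
                return None
            result[attribute] = merged
        else:
            result[attribute] = value
    return result


def expand(atomic_terms, alternations, target):
    """Return every alternation-free FD obtained by unifying F with `target`."""
    # Step 1: unify the atomic terms with the target; failure kills this branch.
    current = unify(atomic_terms, target)
    if current is None:
        return []
    if not alternations:
        return [current]
    # Step 2: multiply out one alternation (here, the first one).
    first, rest = alternations[0], alternations[1:]
    results = []
    for alternative in first:
        # Step 3: apply the whole procedure to each resulting conjunct.
        results.extend(expand(alternative, rest, current))
    return results


# Illustrative F: one atomic term and two alternations.
atomic = {"CAT": "S"}
alternations = [
    [{"VOICE": "ACTIVE"}, {"VOICE": "PASSIVE"}],
    [{"TENSE": "PAST"}, {"TENSE": "PRES"}],
]

selected = expand(atomic, alternations, {"TENSE": "PAST"})
normal_form = expand(atomic, alternations, {})
print(len(selected), len(normal_form))
```

Calling `expand` with an empty target unifies against nothing in particular, so it simply enumerates the disjuncts of the normal form; a non-empty target prunes incompatible branches as early as step 1 allows.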

25
7.3 The compiler
  • It remains to spell out the alternatives that are
    implicit in the patterns
  • The basic idea is to generate all permutations of
    the constituent set of the FD and to eliminate
    those that do not match all the patterns
  • The result of this phase of the compilation is a
    list of simple FDs, containing no alternations,
    and having either no pattern, or a single
    pattern that specifies the order of constituents
    uniquely
  • Those that have no pattern become lexical entries
    and they are of no further interest to the
    compiler
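The "generate all permutations and filter" idea can be sketched with `itertools.permutations`. To keep the example small, patterns are assumed to contain only constituent names and a dots marker; the names `DOTS`, `fits` and `admissible_orders` and the sample patterns are hypothetical.

```python
from itertools import permutations

DOTS = "..."  # stands in for the string of dots: any number of constituents


def fits(pattern, order):
    """True if the whole ordering of constituent names fits the whole pattern."""
    if not pattern:
        return not order
    head, rest = pattern[0], pattern[1:]
    if head == DOTS:
        return any(fits(rest, order[i:]) for i in range(len(order) + 1))
    return bool(order) and order[0] == head and fits(rest, order[1:])


def admissible_orders(constituent_set, patterns):
    """All orderings of the constituent set that satisfy every pattern."""
    return [
        order
        for order in permutations(sorted(constituent_set))
        if all(fits(tuple(pattern), order) for pattern in patterns)
    ]


constituents = {"SUBJ", "VERB", "DOBJ"}
patterns = [
    ("SUBJ", DOTS),                        # the subject comes first
    (DOTS, "VERB", DOTS, "DOBJ", DOTS),    # the verb precedes the object
]
print(admissible_orders(constituents, patterns))
```

Each surviving ordering corresponds to one simple FD with a single, fully specified pattern. The number of permutations grows factorially with the size of the constituent set, which is tolerable only because constituent sets are small.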

26
7.3 The compiler
  • The second phase of the compiler centers around a
    procedure which, given a list of simple FDs, and
    an integer n, attempts to find an attribute, or
    path, on the basis of which the nth constituent
    of those FDs can be distinguished
  • The result of this process is (1) a path A, (2) a
    set of values for A, each associated with the
    subset of the list of FDs whose nth constituent
    has that value of A, and (3) a residual subset of
    the list consisting of FDs whose nth constituent
    has no value of the attribute A
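The three-part result described above can be sketched as a partitioning function. This sketch takes the path A as given and does not show how a good path is searched for. It assumes each simple FD stores its ordered constituents in a list under a hypothetical `"constituents"` key, counts n from 1, and treats only symbol values as distinguishing; all names are illustrative.

```python
def follow(fd, path):
    """Return the value at `path`, or None if the path leads nowhere."""
    value = fd
    for attribute in path:
        if not isinstance(value, dict) or attribute not in value:
            return None
        value = value[attribute]
    return value


def partition(fds, n, path):
    """Split FDs by the value their nth constituent has for `path`.

    Returns (by_value, residual): by_value maps each value of the path to the
    FDs whose nth constituent has that value; residual holds the FDs whose
    nth constituent has no symbol value for the path.
    """
    by_value, residual = {}, []
    for fd in fds:
        nth = fd["constituents"][n - 1]  # n is 1-based, as in "the nth constituent"
        value = follow(nth, path)
        if isinstance(value, str):
            by_value.setdefault(value, []).append(fd)
        else:
            residual.append(fd)
    return by_value, residual


# Illustrative output of the first phase: simple FDs with ordered constituents.
rules = [
    {"CAT": "S", "constituents": [{"CAT": "NP"}, {"CAT": "VP"}]},
    {"CAT": "NP", "constituents": [{"CAT": "DET"}, {"CAT": "N"}]},
    {"CAT": "VP", "constituents": [{"CAT": "V"}, {"CAT": "NP"}]},
    {"CAT": "X", "constituents": [{"NUMBER": "SING"}, {"CAT": "N"}]},
]

by_value, residual = partition(rules, 1, ("CAT",))
print(sorted(by_value), len(residual))
```

A parsing procedure built from this result could branch on the value of the path in the incoming constituent and fall back to the residual FDs when the constituent says nothing about it.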

27
7.3 The compiler
  • Second process

28
7.4 Conclusion
  • Two things can be said to mitigate this to some
    extent. First, the parsing and generation
    grammars do indeed describe exactly the same
    languages, so that much of the work involved in
    testing prototype grammars can be done with a
    generator that works directly and efficiently off
    the competence grammar. The second point is this:
    the compiler behaves as though any
    attribute-value pair in the grammar that did not
    mention CAT was not there at all.
  • The resulting set of parsing procedures clearly
    recognizes at least all the sentences of the
    language intended, though possibly others in
    addition.