Chapter 16: Features and Unification - PowerPoint PPT Presentation

1
Chapter 16 Features and Unification
  • Heshaam Faili
  • hfaili_at_ece.ut.ac.ir
  • University of Tehran

2
Overview
  • Feature Structures and Unification
  • Unification-Based Grammars
  • Chart Parsing with Unification-Based Grammars
  • Type Hierarchies

3
Agenda
  • We introduce the idea that grammatical categories
    like VPto, Sthat, Non3sgAux, or 3sgNP, as well as
    the grammatical rules like S → NP VP that make use
    of them, should be thought of as objects that can
    have complex sets of properties associated with
    them
  • The information in these properties is
    represented by constraints, and so these kinds of
    models are often called constraint-based
    formalisms
  • Use these constraints for
  • Agreement (this flights)
  • Subcategorization ()
  • Adding properties (features) to words
  • Adding some operation to rules in order to test
    equalities

4
Feature structures
  • We had a problem adding agreement to CFGs. What
    we needed were features, e.g., a way to say
  • number sg
  • person 3
  • A structure like this allows us to state
    properties, e.g., about a noun phrase
  • cat NP
  • number sg
  • person 3
  • Each feature (e.g., number) is paired with a
    value (e.g., sg)
  • A bundle of feature-value pairs can be put into
    an attribute-value matrix (AVM)

5
examples
6
Feature paths
  • Values can be atomic (e.g. sg or NP or 3),
    or can be complex, and thus we can define feature
    paths
  • [cat NP, agreement [number sg, person 3]]
  • The value of the path <agreement number> is sg
  • A grammar with only atomic feature values can be
    converted to a CFG
  • e.g. AVM on previous page → NP3,sg
  • However, when the values are complex, it is more
    expressive than a CFG → can represent more
    linguistic phenomena
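
As a rough illustration of the AVM and feature-path ideas above, a feature structure can be sketched as a nested Python dict. The dict encoding and the helper name `get_path` are assumptions of this sketch, not something the slides prescribe.

```python
# A feature structure sketched as a nested dict (an assumption of this
# example; the slides draw it as an AVM).
np_fs = {
    "cat": "NP",
    "agreement": {"number": "sg", "person": "3"},
}


def get_path(fs, path):
    """Follow a feature path (a sequence of feature names) through fs.

    Returns the value at the end of the path, or None if the path is
    not defined in this structure.
    """
    value = fs
    for feature in path:
        if not isinstance(value, dict) or feature not in value:
            return None
        value = value[feature]
    return value


print(get_path(np_fs, ["agreement", "number"]))  # sg
```

Atomic values end a path, so asking for a feature below `cat` simply returns None here.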

7
An Example for FS
8
Reentrancy (structure-sharing)
  • Feature structures embedded in feature structures
    can share the same values
  • That is, two features have the exact same
    value: they share precisely the same object as
    their value
  • We'll indicate this with a tag like [1]
  • In this example, the agreement features of both
    the matrix sentence and the embedded subject are
    identical
  • This is referred to as reentrancy
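
One way to picture structure sharing, assuming feature structures are encoded as nested Python dicts (an encoding chosen for this sketch only), is object identity: a reentrant value is one object reachable by two paths, as opposed to two equal copies.

```python
# The shared value: one dict object reached by two different paths.
agr = {"number": "sg"}

sentence = {
    "cat": "S",
    "head": {
        "agreement": agr,                # tag [1]
        "subject": {"agreement": agr},   # the same tag [1]
    },
}

# A look-alike without sharing: equal values, but two separate objects.
unshared = {
    "agreement": {"number": "sg"},
    "subject": {"agreement": {"number": "sg"}},
}

head = sentence["head"]
print(head["agreement"] is head["subject"]["agreement"])          # True
print(unshared["agreement"] is unshared["subject"]["agreement"])  # False

# Because the value is shared, information added along one path is
# visible along the other path as well.
head["agreement"]["person"] = "3"
print(head["subject"]["agreement"]["person"])  # 3
```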

9
FS with shared value (Reentrant)
10
Feature structures as graphs
  • Technically, feature structures are directed
    acyclic graphs (DAGs)
  • So, the feature structure represented by the
    attribute-value matrix (AVM)
  • [cat NP, agreement [number sg, person 3]]
  • is really the graph

[Figure: DAG whose root has an arc CAT leading to np and an arc AGR leading to a node with arcs NUM (to sg) and PER (to 3)]
11
Unification
  • Unification (U): a basic operation to merge two
    feature structures into a resultant feature
    structure (FS)
  • The two feature structures must be compatible,
    i.e., have no values that conflict
  • Identical FSs
  • [number sg] U [number sg] = [number sg]
  • Conflicting FSs
  • [number sg] U [number pl] = Fail
  • Merging with an unspecified FS
  • [number sg] U [number [ ]] = [number sg]

12
Unification (cont.)
  • Merging FSs with different features specified
  • [number sg] U [person 3] = [number sg, person 3]
  • More examples
  • [cat NP] U [agreement [number sg]]
    = [cat NP, agreement [number sg]]
  • [agr [num sg], subj [agr [num sg]]] U [subj [agr [num sg]]]
    = [agr [num sg], subj [agr [num sg]]]
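
The cases above (identical, conflicting, unspecified, and differently specified structures) can be sketched as a small recursive function. This is a minimal, non-destructive sketch that assumes feature structures are nested dicts with atomic leaves, uses `{}` for an unspecified value, and ignores reentrancy; the names `unify` and `FAIL` are choices made for this example.

```python
FAIL = None  # sentinel for unification failure (an assumption of this sketch)


def unify(f, g):
    """Unify two feature structures given as nested dicts or atoms.

    Returns a new structure, or FAIL if some value conflicts. An empty
    dict {} plays the role of an unspecified value. Reentrancy is not
    modeled here.
    """
    if isinstance(f, dict) and isinstance(g, dict):
        result = dict(f)
        for feature, g_value in g.items():
            if feature in result:
                merged = unify(result[feature], g_value)
                if merged is FAIL:
                    return FAIL
                result[feature] = merged
            else:
                result[feature] = g_value
        return result
    if isinstance(f, dict) and not f:   # unspecified unifies with an atom
        return g
    if isinstance(g, dict) and not g:
        return f
    return f if f == g else FAIL


print(unify({"number": "sg"}, {"number": "sg"}))  # {'number': 'sg'}
print(unify({"number": "sg"}, {"number": "pl"}))  # None
print(unify({"number": "sg"}, {"number": {}}))    # {'number': 'sg'}
print(unify({"number": "sg"}, {"person": "3"}))   # {'number': 'sg', 'person': '3'}
```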

13
Unification with Reentrancies
  • Remember that structure-sharing means they are
    the same object

14
Unification with Reentrancies
  • When unification takes place, shared values are
    copied over

15
example
16
Subsumption
  • We can see that a more general feature structure
    (fewer values specified) subsumes a more specific
    feature structure
  • (1) [num sg]
  • (2) [per 3]
  • (3) [num sg, per 3]
  • So, we have the following subsumption relations,
    where
  • (1) subsumes (3)
  • (2) subsumes (3)
  • (1) does not subsume (2), and (2) does not
    subsume (1)

17
Subsumption
  • F ⊑ G if and only if
  • 1. For every feature x in F, F(x) ⊑ G(x) (where
    F(x) means the value of the feature x of feature
    structure F).
  • 2. For all paths p and q in F such that F(p) =
    F(q), it is also the case that G(p) = G(q).
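
Condition 1 of the definition can be sketched directly as a recursive check. The sketch below assumes nested dicts with atomic leaves and deliberately leaves out condition 2, since plain dicts here do not record which paths are shared; `subsumes` is a name chosen for this example.

```python
def subsumes(f, g):
    """Return True if f subsumes g (f is at least as general as g).

    Structures are nested dicts with atomic leaves; {} is the fully
    unspecified structure. Only condition 1 (feature values) is
    checked; condition 2 (shared paths) is not modeled in this sketch.
    """
    if isinstance(f, dict):
        if not f:
            return True              # [ ] subsumes everything
        if not isinstance(g, dict):
            return False
        return all(x in g and subsumes(f[x], g[x]) for x in f)
    return f == g


fs1 = {"num": "sg"}
fs2 = {"per": "3"}
fs3 = {"num": "sg", "per": "3"}

print(subsumes(fs1, fs3))  # True
print(subsumes(fs2, fs3))  # True
print(subsumes(fs1, fs2))  # False
print(subsumes(fs2, fs1))  # False
```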

18
Subsumption
19
Subsumption (partial order)
20
Semilattice, unification
  • [ ] ⊑ F for every F, so we can model subsumption as a
    semilattice, with [ ] at the top of it
  • Unification: F U G is the most general H such that
    F ⊑ H and G ⊑ H

21
Overview
  • Feature Structures and Unification
  • Unification-Based Grammars
  • Chart Parsing with Unification-Based Grammars
  • Type Hierarchies

22
Grammars with Feature Structures
  • CFG skeleton augmented with feature structure
    path equations, i.e., each category has a feature
    structure
  • CFG skeleton
  • S → NP VP
  • Path equations
  • <NP agreement> = <VP agreement>
  • 1. There can be zero or more path equations for
    each rule skeleton → no longer atomic
  • 2. When a path equation references constituents,
    they can only be constituents from the CFG rule
  • e.g., <Det agreement> = <Nom agreement> is an
    illegal equation for the above rule! (But it
    would be fine for NP → Det Nom)
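
The restriction in point 2 can be sketched as a well-formedness check on a rule object. The `Rule` class, its field names, and the tuple encoding of paths are all hypothetical and only cover equations between two paths (not equations with an atomic right-hand side).

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    lhs: str
    rhs: tuple
    # Each equation pairs two paths; a path is a tuple whose first
    # element names a constituent of the rule.
    equations: list = field(default_factory=list)

    def is_well_formed(self):
        """Every path must start at a constituent of this rule."""
        allowed = {self.lhs, *self.rhs}
        return all(path[0] in allowed
                   for equation in self.equations
                   for path in equation)


s_rule = Rule("S", ("NP", "VP"),
              [(("NP", "agreement"), ("VP", "agreement"))])
bad_rule = Rule("S", ("NP", "VP"),
                [(("Det", "agreement"), ("Nom", "agreement"))])
np_rule = Rule("NP", ("Det", "Nom"),
               [(("Det", "agreement"), ("Nom", "agreement"))])

print(s_rule.is_well_formed())    # True
print(bad_rule.is_well_formed())  # False
print(np_rule.is_well_formed())   # True
```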

23
FEATURE STRUCTURES IN THE GRAMMAR
24
agreement
  • subject-verb agreement and determiner nominal
    agreement.

25
Agreement in Feature-Based Grammars
  • S → NP VP
  • <S head> = <VP head>
  • <NP head agr> = <VP head agr>
  • VP → V NP
  • <VP head> = <V head>
  • NP → Det Nom
  • <NP head> = <Nom head>
  • <Det head agr> = <Nom head agr>
  • Nom → Noun
  • <Nom head> = <Noun head>
  • Noun → flights
  • <Noun head agr num> = pl
  • Compare with the CFG case
  • S → 3sgNP 3sgVP
  • S → PluralNP PluralVP
  • 3sgVP → 3sgVerb
  • 3sgVP → 3sgVerb NP
  • 3sgVP → 3sgVerb NP PP
  • 3sgVP → 3sgVerb PP
  • etc.
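
To see how one feature-based NP rule replaces a family of category-specific CFG rules, here is a small sketch of determiner-nominal agreement. The lexicon is a toy invented for this example (only the entry for "flights" corresponds to a rule on the slide), and `unify` and `build_np` are hypothetical helper names.

```python
def unify(f, g):
    """Unify nested dicts with atomic leaves; return None on a clash."""
    if isinstance(f, dict) and isinstance(g, dict):
        result = dict(f)
        for feature, value in g.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None
                result[feature] = merged
            else:
                result[feature] = value
        return result
    return f if f == g else None


# Toy lexicon (an assumption of this sketch).
LEXICON = {
    "this":    {"cat": "Det",  "head": {"agr": {"num": "sg"}}},
    "these":   {"cat": "Det",  "head": {"agr": {"num": "pl"}}},
    "flight":  {"cat": "Noun", "head": {"agr": {"num": "sg"}}},
    "flights": {"cat": "Noun", "head": {"agr": {"num": "pl"}}},
}


def build_np(det_word, noun_word):
    """Apply NP -> Det Nom with <Det head agr> = <Nom head agr> and
    <NP head> = <Nom head>; Nom -> Noun just passes the head up."""
    det, noun = LEXICON[det_word], LEXICON[noun_word]
    agr = unify(det["head"]["agr"], noun["head"]["agr"])
    if agr is None:
        return None  # agreement clash: no NP is built
    return {"cat": "NP", "head": {"agr": agr}}


print(build_np("these", "flights"))  # {'cat': 'NP', 'head': {'agr': {'num': 'pl'}}}
print(build_np("this", "flights"))   # None
```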

26
Percolating Agreement Features
  • S → NP VP
  • <NP head agr> = <VP head agr>
  • VP → V NP
  • <VP head> = <V head>
  • NP → Det Nom
  • <NP head> = <Nom head>
  • <Det head agr> = <Nom head agr>
  • Nom → Noun
  • <Nom head> = <Noun head>

27
agreement
28
Head features in the grammar
  • An important concept shown in the previous rules
    is that heads of grammar rules share properties
    with their mothers
  • VP → V NP
  • <VP head> = <V head>
  • Knowing the head will tell you about the whole
    phrase
  • This is important for many parsing techniques

29
Sub-categorization
  • We could specify subcategorization like so
  • VP → V
  • <VP head subcat> = intrans
  • VP → V NP
  • <VP head subcat> = trans
  • VP → V NP NP
  • <VP head subcat> = ditrans
  • But values like intrans do not correspond to
    anything that the rules actually look like
  • To make SUBCAT better match the rules, we can
    make it a list of a verb's arguments, e.g. <NP, PP>

30
Handling Subcategorization
  • VP → V NP PP
  • <VP head> = <V head>
  • <VP head subcat> = <NP, PP>
  • V → leaves
  • <V head agr num> = sg
  • <V head subcat> = <NP, PP>
  • There is also a longer, more formal way to
    specify lists
  • <NP, PP> is equivalent to
  • [FIRST NP, REST [FIRST PP, REST <>]]

[Figure: tree for VP → V NP PP. VP: [head [1] [subcat <[2], [3]>]]; V (leaves): [head [1] [agr [num sg], subcat <[cat np], [cat pp]>]]; NP: [cat [2]]; PP: [cat [3]]]
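
The equivalence between the list notation <NP, PP> and the FIRST/REST encoding can be sketched as a pair of conversion functions. Representing the empty list <> by an empty tuple, and the function names themselves, are assumptions of this sketch.

```python
def to_first_rest(items):
    """Encode a Python list as nested FIRST/REST feature structures.

    The empty list <> is represented by an empty tuple (an assumption
    of this sketch).
    """
    encoded = ()
    for item in reversed(items):
        encoded = {"FIRST": item, "REST": encoded}
    return encoded


def from_first_rest(encoded):
    """Decode a FIRST/REST structure back into a Python list."""
    items = []
    while encoded != ():
        items.append(encoded["FIRST"])
        encoded = encoded["REST"]
    return items


subcat = to_first_rest(["NP", "PP"])
print(subcat)  # {'FIRST': 'NP', 'REST': {'FIRST': 'PP', 'REST': ()}}
print(from_first_rest(subcat))  # ['NP', 'PP']
```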
31
Subcategorization
32
Subcategorization frames
  • Subcategorization, or valency, or dependency is a
    very important notion in capturing syntactic
    regularity, and there is a wide variety of
    arguments that a verb (or noun or adjective) can
    take.
  • Some subcategorization frames for ask
  • He asked [Q "What was it like?"]
  • He asked [Swh what it was like]
  • He asked [NP her] [Swh what it was like]
  • He asked [VPto to see you]
  • He asked [NP her] [VPto to tell you]
  • He asked [NP a question]
  • He asked [NP her] [NP a question]
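
The frames listed for ask can be written down as data, which makes the "wide variety of arguments" concrete. The table layout and the `licenses` helper are hypothetical; only the seven frames themselves come from the list above.

```python
# Frames for "ask" as listed above; each frame is a tuple of
# argument categories.
SUBCAT_FRAMES = {
    "ask": {
        ("Q",),
        ("Swh",),
        ("NP", "Swh"),
        ("VPto",),
        ("NP", "VPto"),
        ("NP",),
        ("NP", "NP"),
    },
}


def licenses(verb, arguments):
    """Return True if the verb has a frame matching the argument categories."""
    return tuple(arguments) in SUBCAT_FRAMES.get(verb, set())


print(licenses("ask", ["NP", "VPto"]))  # True
print(licenses("ask", ["PP"]))          # False
```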

33
Subcategorization frame
34
Long Distance Dependencies
  • What cities does Continental service?
  • What flights do you have from Boston to
    Baltimore?
  • What time does that flight leave Atlanta?
  • wh-non-subject-question
  • S → Wh-NP Aux NP VP
  • Should the Wh-NP agree with the NP?
  • A gap list, implemented by the feature GAP, is
    passed through the different trees
  • Fillers (what cities) are put on the top of the gap
    list and should be unified with the
    subcategorization frame of the verb

35
Long-Distance Dependencies
  • What is the earliest flight that you have _?
  • TOP (fill gap)
  • S → WH-word Be-copula NP
  • <NP gap> = <WH-word head>
  • MIDDLE (pass gap)
  • NP → D Nom
  • <NP gap> = <Nom gap>
  • Nom → Nom RelClause
  • <Nom gap> = <RelClause gap>
  • RelClause → RelPro NP VP
  • <RelClause gap> = <VP gap>
  • BOTTOM (identify gap)
  • VP → V
  • <VP gap> = <V subcat second>

36
Overview
  • Feature Structures and Unification
  • Unification-Based Grammars
  • Chart Parsing with Unification-Based Grammars
  • Type Hierarchies

37
Implementing Unification
  • How do we implement a check on unification?
  • i.e., given feature structures F1 and F2, return
    F, the unification of F1 and F2
  • Unification is a recursive operation
  • If a feature has an atomic value, see if the
    other FS has that feature with the same value
  • [F a] unifies with [ ], [F [ ]], and [F a]
  • If a feature has a complex value, follow the
    paths to see if they're compatible and have the
    same values at bottom
  • Does [F G1] unify with [F G2]? We have to
    inspect G1 and G2 to find out.
  • To avoid cycles, we have to do an occur check to
    see if we've seen a FS before or not

38
(No Transcript)
39
(No Transcript)
40
  • Base case
  • One or both of the arguments has a null value.
  • The arguments are identical.
  • The arguments are non-complex and non-identical.
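
The base cases and the recursive step can be sketched as a destructive unifier over DAG nodes that merges structures through forwarding pointers. This is a simplified sketch under stated assumptions: the `Node` class, the `pointer` field, and the helpers `deref` and `to_plain` are names chosen here, failure is signalled by None, and the occur check is left out.

```python
class Node:
    """A DAG node: atomic, complex (a dict of arcs), or null (neither)."""

    def __init__(self, atom=None, arcs=None):
        self.atom = atom
        self.arcs = arcs        # dict: feature name -> Node, or None
        self.pointer = None     # forwarding pointer set during unification

    def is_null(self):
        return self.atom is None and self.arcs is None


def deref(node):
    """Follow forwarding pointers to the node that holds the content."""
    while node.pointer is not None:
        node = node.pointer
    return node


def unify(f1, f2):
    """Destructively unify two nodes; return the result or None on failure."""
    f1, f2 = deref(f1), deref(f2)
    if f1 is f2:                       # base case: identical arguments
        return f1
    if f1.is_null():                   # base case: one argument is null
        f1.pointer = f2
        return f2
    if f2.is_null():
        f2.pointer = f1
        return f1
    if f1.atom is not None or f2.atom is not None:
        if f1.atom == f2.atom:         # same atomic value
            f1.pointer = f2
            return f2
        return None                    # base case: non-identical atoms
    f2.pointer = f1                    # both complex: merge f2 into f1
    for feature, value in f2.arcs.items():
        if feature not in f1.arcs:
            f1.arcs[feature] = Node()  # create a null value to unify with
        if unify(f1.arcs[feature], value) is None:
            return None
    return f1


def to_plain(node):
    """Convert a node to nested dicts for printing."""
    node = deref(node)
    if node.atom is not None:
        return node.atom
    if node.arcs is None:
        return {}
    return {k: to_plain(v) for k, v in node.arcs.items()}


agr = Node(arcs={"NUMBER": Node(atom="sg")})   # shared value, tag [1]
f1 = Node(arcs={"AGREEMENT": agr,
                "SUBJECT": Node(arcs={"AGREEMENT": agr})})
f2 = Node(arcs={"SUBJECT": Node(arcs={
    "AGREEMENT": Node(arcs={"PERSON": Node(atom="3")})})})
result = unify(f1, f2)
print(to_plain(result)["AGREEMENT"])  # {'NUMBER': 'sg', 'PERSON': '3'}
```

Because AGREEMENT is shared in f1, the PERSON value added under SUBJECT also shows up under the top-level AGREEMENT.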

41
An Example
42
(No Transcript)
43
  • The original arguments are neither identical, nor
    null, nor atomic, so the main loop is entered
  • The arguments of the recursive call are also
    non-identical, non-null, and non-atomic, so the
    loop is entered again, leading to a recursive
    check of the values of the AGREEMENT features

44
f1 and f2 after the recursion adds the value of
the new PERSON feature
45
The final structures of f1 and f2 at the end
46
Modifying an Earley Parser to Handle Unification
  • Our grammar still has a context-free backbone, so
    we could just parse a sentence with a CFG and use
    the features to filter out the ungrammatical
    sentences
  • But by utilizing unification as we parse, we can
    eliminate parses that won't work in the end
  • e.g., we'll eliminate NPs that don't match in
    agreement features with their VPs as we parse,
    instead of ruling them out later

47
Changes to the Chart Representation
  • Each state will be extended to include the LHS
    DAG (which can get augmented as it goes along).
  • i.e., Add a feature structure (in DAG form) to
    each state
  • So, S → • NP VP, [0,0]
  • Becomes S → • NP VP, [0,0], DagS
  • The predictor, scanner, and completer have to
    pass in the DAG, so all three operations have to
    be altered

48
Earley Parser with Unification
49
Predictor, Scanner, Completer
50
Unify States
51
Example (That flight)
52
Change to ENQUEUE
  • The enqueue procedure should also be changed to
    use a subsumption test
  • Do not add a state to the chart if an equivalent
    or more general state is already there.
  • So, if Enqueue wants to add a singular
    determiner state at [x, y], and the chart already
    has a determiner state at [x, y] unspecified for
    number, then Enqueue will not add it.

53
Why a Subsumption Test?
  • If we don't impose a subsumption restriction,
    enqueue will add two states at [x, y], one
    expecting to see a singular determiner, the other
    just a determiner.
  • On seeing a singular determiner, the parser will
    advance the dot on both rules, creating two edges
    (since singular will unify with both singular and
    with unspecified).
  • As a result, we would get duplicate edges.
  • If we impose the restriction, and we see either a
    singular or plural determiner, and we advance the
    dot, only one edge (singular or plural) gets
    created at [x, y].
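
The subsumption test in enqueue can be sketched as follows. A state is reduced here to a (rule, span, dag) triple with the DAG as a nested dict; the rule strings, the triple layout, and the function names are assumptions of this sketch, and the subsumption check covers feature values only.

```python
def subsumes(f, g):
    """True if f is at least as general as g (feature values only)."""
    if isinstance(f, dict):
        if not f:
            return True
        if not isinstance(g, dict):
            return False
        return all(x in g and subsumes(f[x], g[x]) for x in f)
    return f == g


def enqueue(state, chart_entry):
    """Add state unless an equivalent or more general one is present.

    Returns True if the state was added, False if it was skipped.
    """
    rule, span, dag = state
    for old_rule, old_span, old_dag in chart_entry:
        if (old_rule, old_span) == (rule, span) and subsumes(old_dag, dag):
            return False
    chart_entry.append(state)
    return True


chart_entry = []
general = ("NP -> . Det Nom", (0, 0), {"Det": {"agr": {}}})
singular = ("NP -> . Det Nom", (0, 0), {"Det": {"agr": {"num": "sg"}}})

print(enqueue(general, chart_entry))   # True
print(enqueue(singular, chart_entry))  # False
print(len(chart_entry))                # 1
```

Note that the test is one-directional in this sketch: if the singular state arrives first, a later, more general state is still added.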

54
The Need for Copying
  • show me morning flights

55
Overview
  • Feature Structures and Unification
  • Unification-Based Grammars
  • Chart Parsing with Unification-Based Grammars
  • Type Hierarchies

56
Using Type Hierarchies
  • Instead of simple feature structures, formalisms
    like Head-Driven Phrase Structure Grammar (HPSG)
    use typed feature structures
  • Two problems right now
  • What prevents us right now from specifying the
    following?
  • <number feminine>
  • How can we capture the fact that all values of
    NUMBER are the same sort of thing, i.e., make a
    generalization?
  • Solution: use types

57
Type Systems
  • 1. Each feature structure is labeled by a type.
  • noun [CASE case]
  • 2. Each type has appropriateness conditions
    specifying what features are appropriate for it.
  • noun → [CASE case]
  • verb → [VFORM vform]
  • 3. Types are organized into a type hierarchy.
  • 4. Unification is modified to allow two different
    types to unify.

58
Simple Type Hierarchy
59
Type Hierarchy
  • So, if
  • CASE is appropriate for noun, and
  • the value of CASE is case, and
  • we have the following type hierarchy
  • case, with subtypes nom, acc, dat
  • Then, the following are possible feature
    structures
  • noun [CASE nom], noun [CASE acc], noun [CASE dat]

60
Unification of types
  • Now, when we unify feature structures, we have to
    unify types, too
  • [CASE case] U [CASE nom] = [CASE nom]
  • [CASE nom] U [CASE acc] = fail
  • Let's also assume that acc and dat have a common
    subtype, obj
  • acc dat
  • obj
  • Then, we have the following unification
  • [CASE acc] U [CASE dat] = [CASE obj]
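
Type unification over this small hierarchy can be sketched as a search for the most general common subtype. The parent table mirrors the hierarchy described above; the table encoding and the function names are assumptions of this sketch.

```python
# Parent links for the hierarchy on the slides: nom, acc, dat are
# subtypes of case, and obj is a subtype of both acc and dat.
PARENTS = {
    "case": [],
    "nom": ["case"],
    "acc": ["case"],
    "dat": ["case"],
    "obj": ["acc", "dat"],
}


def supertypes(t):
    """Return t together with all of its ancestors."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unify_types(a, b):
    """Return the most general common subtype of a and b, or None."""
    common = [t for t in PARENTS if {a, b} <= supertypes(t)]
    # The most general candidate is one that every other candidate
    # has among its supertypes.
    for t in common:
        if all(t in supertypes(other) for other in common):
            return t
    return None


print(unify_types("case", "nom"))  # nom
print(unify_types("nom", "acc"))   # None
print(unify_types("acc", "dat"))   # obj
```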

61
Practices
  • 16.2, 16.3, 16.6, 16.7,