1
Grammar Engineering: Parsing with HPSG Grammars
  • Miguel Hormazábal

2
Overview
  • The Parsing Problem
  • Parsing with constraint-based grammars
  • Advantages and drawbacks
  • Three different approaches

3
The Parsing Problem
  • Given a Grammar and a Sentence,
  • Can the grammar ⟨S, T⟩ generate or rule out the input string?
  • A candidate sentence must satisfy all the
    principles of the Grammar
  • Coreference as the main explanatory mechanism in HPSG

4
Parsing with Constraint-based Grammars
  • Object-based formalism
  • Complex specifications on signs
  • Structure sharing imposed by the theory
  • Feature Structures
  • Sort-resolved and well-typed
  • Multiple information levels (PHON, SYNSEM)
  • Universal and language-specific principles to be met

5
Advantages and Drawbacks
  • Pros
  • A common formalism for all levels of linguistic information
  • All information simultaneously available
  • Cons
  • Hard to modularize
  • Computational overhead for the parser

6
1st Approach: Distributed Parsing
  • Two kinds of constraints
  • Genuine (syntactic): they work as filters on the input
  • Spurious (semantic): they build representational structures
  • The parser cannot distinguish between analytical and structure-building constraints
  • VERBMOBIL implementation
  • Input: word lattices of speech recognition hypotheses
  • The parser identifies the paths that form acceptable utterances
  • Lattices can contain hundreds of hypotheses, most of them ungrammatical
  • Goal: distribute the labour of evaluating the constraints in the grammar over several processes

7
Distributed Parsing
  • Analysis strategy
  • Two parser units
  • SYN-Parser
  • Works directly with word lattices
  • Performs as a filter for the SEM-Parser
  • SEM-Parser
  • Works only with successful analysis results
  • Operates under the control of the SYN-Parser

8
Distributed Parsing
  • Processing requirements
  • Incrementality
  • The SYN-Parser must NOT wait until it has a complete analysis before sending results, which would force the SEM-Parser to wait
  • Interactivity
  • Failed hypotheses must be reported back to the SYN-Parser
  • An efficient communication system between the parsers, based on the common grammar

9
Distributed Parsing
  • Centralized Parsing
  • Distributed Parsing

10
Distributed Parsing
  • Bottom-Up Hypotheses
  • Emitted by the SYN-Parser and sent to the SEM-Parser for semantic verification
  • Top-Down Hypotheses
  • Emitted by the SEM-Parser; failures are reported back to the SYN-Parser
  • Completion History: records which sub-constituents, with which spans, built a constituent (see the sketch below)
  • C-hist(NP-DET-N) = ((DET t0 t1) (N t1 t2))
  • C-hist(DET) = ((the t0 t1))
  • C-hist(N) = ((example t1 t2))
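A rough sketch of how completion histories could be represented, assuming edges carry a label, a lattice span, and pointers to their daughters; the names Edge and c_hist are invented for this illustration and are not taken from the VERBMOBIL system.

# Illustrative only: completion histories as compact messages exchanged
# between a hypothetical SYN-Parser and SEM-Parser.
from dataclasses import dataclass, field

@dataclass
class Edge:
    label: str        # category or rule name, e.g. "NP-DET-N"
    start: int        # lattice/chart position where the edge starts
    end: int          # lattice/chart position where the edge ends
    daughters: list = field(default_factory=list)  # sub-edges used to build it

def c_hist(edge: Edge):
    """Completion history: labels and spans of the immediate daughters."""
    return [(d.label, d.start, d.end) for d in edge.daughters]

# The example from the slide: "the example" analysed as NP -> DET N
det = Edge("DET", 0, 1, daughters=[Edge("the", 0, 1)])
n   = Edge("N",   1, 2, daughters=[Edge("example", 1, 2)])
np  = Edge("NP-DET-N", 0, 2, daughters=[det, n])

print(c_hist(np))   # [('DET', 0, 1), ('N', 1, 2)]
print(c_hist(det))  # [('the', 0, 1)]
print(c_hist(n))    # [('example', 1, 2)]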

11
Distributed Parsing
  • Compilation of Subgrammars
  • From a common source grammar
  • Straightforward option: split up the grammar into syntax and semantics strata
  • Grammar rules and lexical entries are manipulated to obtain Gsyn and Gsem (a sketch follows below)
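A minimal sketch of what such a split could look like, assuming feature structures are modelled as nested dicts with semantics under SYNSEM.LOCAL.CONT and syntax under SYNSEM.LOCAL.CAT; the paths and function names are illustrative simplifications, not the compilation procedure of the paper.

# Illustrative only: obtain Gsyn / Gsem by deleting one stratum from every
# rule and lexical entry of the common source grammar.
import copy

def restrict(fs: dict, drop_path: tuple) -> dict:
    """Return a copy of the feature structure with the sub-structure at
    drop_path removed (e.g. drop semantics to obtain the syntactic grammar)."""
    out = copy.deepcopy(fs)
    node = out
    for attr in drop_path[:-1]:
        node = node.get(attr, {})
    node.pop(drop_path[-1], None)
    return out

def compile_subgrammars(grammar: list):
    """Split each entry into a syntax-only and a semantics-only version;
    the two subgrammars are then run by the SYN- and SEM-Parser."""
    g_syn = [restrict(entry, ("SYNSEM", "LOCAL", "CONT")) for entry in grammar]
    g_sem = [restrict(entry, ("SYNSEM", "LOCAL", "CAT")) for entry in grammar]
    return g_syn, g_sem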

12
2nd Approach: Data-Oriented Parsing
  • Main goal: achieve domain adaptation to improve the efficiency of HPSG parsing
  • Assumption: exploiting the frequency and plausibility of linguistic structures within a certain domain will yield better results
  • DOP: process new input by combining structure fragments from a treebank
  • DOP allows probabilities to be assigned to arbitrarily large syntactic constructions

13
Data-Oriented Parsing
  • Procedure
  • Parse all sentences of a training corpus using an HPSG grammar and parser
  • Automatic acquisition of a stochastic lexicalized tree grammar (SLTG)
  • Each parse tree is decomposed into a set of subtrees
  • Assignment of probabilities to each subtree

14
Data-Oriented Parsing
  • Implementation uses the unification-based grammar parsing and generation platform LKB
  • First, parse each sentence of the training corpus
  • The resulting feature structure contains the parse tree
  • Each non-terminal node carries the label of the HPSG rule schema applied
  • Each terminal node carries the lexical type of the corresponding feature structure
  • After this, each parse tree is further processed

15
Data-Oriented Parsing
  • 1. Decomposition: two operations (see the sketch after this list)
  • Root: creates passive (closed, complete) fragments by extracting substructures
  • Frontier: creates active (open, incomplete) fragments by deleting pieces of substructure
  • Each non-head subtree is cut off, and the cutting point is marked for substitution
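A schematic rendering of the two operations on plain labelled trees, assuming head daughters are marked with a flag; this is a simplification of the feature-structure trees actually used.

# Illustrative only: Root and Frontier decomposition operations.
from dataclasses import dataclass
from typing import List

@dataclass
class Node:
    label: str
    children: List["Node"]
    is_head: bool = False      # True if this daughter is the head daughter

def root_op(node: Node) -> Node:
    """Root: extract the substructure below `node` as a complete fragment."""
    return node

def frontier_op(node: Node) -> Node:
    """Frontier: cut off every non-head subtree; the cutting points become
    open substitution nodes (children removed, label kept)."""
    new_children = []
    for child in node.children:
        if child.is_head:
            new_children.append(frontier_op(child))     # keep the head spine
        else:
            new_children.append(Node(child.label, []))  # substitution site
    return Node(node.label, new_children, node.is_head)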

16
Data-Oriented Parsing
  • 2. Specialization
  • Rule labels of the root node and substitution nodes are replaced with a corresponding category label
  • Example: signs with a LOCAL.CAT.HEAD value of type noun and an empty LOCAL.CAT.VAL.SUBJ list are classified as NPs
  • 3. Probability (a worked sketch follows after this list)
  • Count the total number n of all trees with the same root label a
  • Divide the frequency m of a tree t with root a by n: p(t) = m / n
  • The probabilities of all trees ti with root a sum to one: Σ_{ti : root(ti) = a} p(ti) = 1
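A small worked sketch of step 3, assuming fragments are identified by a (root label, fragment id) pair rather than by full feature structures.

# Illustrative only: relative-frequency estimation over SLTG fragments.
from collections import Counter, defaultdict

def estimate_probabilities(fragments):
    """fragments: iterable of (root_label, fragment_id) occurrences."""
    counts = Counter(fragments)              # m: frequency of each tree
    per_root = defaultdict(int)
    for (root, _), m in counts.items():
        per_root[root] += m                  # n: total for this root label
    return {frag: m / per_root[frag[0]] for frag, m in counts.items()}

# Example: three NP fragments, two of one shape and one of another
probs = estimate_probabilities([("NP", "det-n"), ("NP", "det-n"), ("NP", "n")])
# probs == {("NP", "det-n"): 2/3, ("NP", "n"): 1/3}; they sum to 1 per root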

17
Data-Oriented Parsing
  • This implementation for the VerbMobil project uses a chart-based, agenda-driven bottom-up parser (see the sketch below)
  • Step 1: selection of the set of SLTG trees associated with the lexical items in the input sentence
  • Step 2: parsing of the sentence with respect to this set
  • Step 3: each SLTG parse tree is expanded by unifying the feature constraints into the parse tree
  • If successful: a complete, valid feature structure
  • Otherwise, the next most likely tree is expanded
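A schematic control loop for steps 2 and 3, with sltg_parse and expand passed in as assumed helpers; it only illustrates the fall-back to the next most likely tree when unification fails, not the actual VerbMobil parser.

# Illustrative only: try the n-best SLTG parse trees in order of probability
# and return the first one whose feature constraints unify successfully.
def best_valid_analysis(sentence, sltg, hpsg, sltg_parse, expand, n_best=10):
    ranked_trees = sltg_parse(sltg, sentence)[:n_best]  # most probable first
    for tree in ranked_trees:
        feature_structure = expand(hpsg, tree)          # unify constraints
        if feature_structure is not None:               # unification succeeded
            return feature_structure
    return None                                         # no valid analysis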

18
3rd Approach: Probabilistic CFG Parsing
  • Main goal: obtain the Viterbi parse (the highest-probability parse) given an HPSG and a probabilistic model
  • One way
  • Parse the input without using probabilities
  • Then select the most probable parse by inspecting every result
  • Cost: exponential search space
  • This approach
  • Define an equivalence-class function (feature structure reduction)
  • Integrate semantic and syntactic preferences into figures of merit (FOMs)

19
Probabilistic CFG Parsing
  • Probabilistic Model
  • HPSG grammar G = ⟨L, R⟩, where
  • L = { l = ⟨w, F⟩ | w ∈ W, F ∈ 𝓕 } is the set of lexical entries (W: words, 𝓕: feature structures)
  • R is a set of grammar rules, i.e., r ∈ R is a partial function 𝓕 × 𝓕 → 𝓕

20
Probabilistic CFG Parsing
  • Probabilistic HPSG
  • Probability p(F | w) of a feature structure F assigned to a given sentence w (a reconstructed formula is sketched below),
  • where λi is a model parameter,
  • si is a fragment of a feature structure, and
  • σ(si, F) is a function counting the number of appearances of the fragment si in F
  • Probabilities represent the syntactic/semantic preferences expressed in a feature structure
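The formula itself is not reproduced in the transcript; assuming the usual log-linear (maximum-entropy) form over feature structure fragments, it would read roughly as follows, with Z_w a normalization constant over the feature structures admissible for w.

% Hedged reconstruction of the log-linear model over fragments s_i
p(F \mid w) = \frac{1}{Z_w} \exp\Bigl( \sum_i \lambda_i \, \sigma(s_i, F) \Bigr),
\qquad
Z_w = \sum_{F'} \exp\Bigl( \sum_i \lambda_i \, \sigma(s_i, F') \Bigr)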

21
Probabilistic CFG Parsing
  • Implementation: iterative CYK parsing algorithm (a sketch follows below)
  • Edges are pruned during parsing
  • The best N parses are tracked
  • Feature structures are reduced through equivalence classes
  • The reduction must neither overgenerate nor undergenerate
  • FOMs are computed on reduced feature structures equivalent to the original ones
  • The parser calculates the Viterbi parse, taking the maximum of the probabilities of the same non-terminal symbol at each point
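A schematic CYK parser with per-cell edge pruning and Viterbi maximisation, written over a plain binary CFG rather than reduced HPSG feature structures; the grammar/lexicon encodings and the beam parameter are illustrative assumptions, not the authors' implementation.

# Illustrative only: CYK with Viterbi maximisation and N-best edge pruning.
from collections import defaultdict

def cyk_viterbi(words, lexicon, grammar, beam=10):
    """lexicon[word] -> list of (symbol, prob); grammar[(left, right)] ->
    list of (parent, rule_prob); returns the best probability per root symbol."""
    n = len(words)
    chart = defaultdict(dict)                 # (i, j) -> {symbol: best prob}
    for i, w in enumerate(words):             # fill lexical cells
        for sym, p in lexicon.get(w, []):
            if p > chart[(i, i + 1)].get(sym, 0.0):
                chart[(i, i + 1)][sym] = p
    for span in range(2, n + 1):              # build larger spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            cell = {}
            for k in range(i + 1, j):         # split point
                for ls, lp in chart[(i, k)].items():
                    for rs, rp in chart[(k, j)].items():
                        for parent, rule_p in grammar.get((ls, rs), []):
                            p = lp * rp * rule_p
                            if p > cell.get(parent, 0.0):  # Viterbi: keep max
                                cell[parent] = p
            # pruning: keep only the N best edges in this cell
            best = sorted(cell.items(), key=lambda kv: kv[1], reverse=True)[:beam]
            chart[(i, j)] = dict(best)
    return chart[(0, n)]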

22
Assessment
  • The three approaches attempt to achieve higher efficiency of the parsing process
  • Distributed Parsing
  • Unification and copying become faster
  • Soundness of the grammar is affected: L(G) ⊆ L(Gsyn) ∩ L(Gsem)
  • DO Parsing
  • Fragments at the right level of generality
  • Straightforward probability computation
  • PCFG Parsing
  • Highly efficient CYK parsing implementation through reduced feature structures and edge pruning

23
References
  • Pollard, C. and Sag, I. A. (1994). Head-Driven Phrase Structure Grammar. Chicago, IL: University of Chicago Press.
  • Richter, F. (2004). A Web-based Course in Grammar Formalisms and Parsing. Textbook, MiLCA project A4, SfS, Universität Tübingen. http://milca.sfs.uni-tuebingen.de/A4/Course/PDF/gramandpars.pdf
  • Levine, Robert and Meurers, Detmar (2006). Head-Driven Phrase Structure Grammar: Linguistic Approach, Formal Foundations, and Computational Realization. In Keith Brown (ed.), Encyclopedia of Language and Linguistics, Second Edition. Oxford: Elsevier.
  • Diagne, Abdel Kader, Kasper, Walter and Krieger, Hans-Ulrich (1995). Distributed Parsing With HPSG Grammars. In Proceedings of the 4th International Workshop on Parsing Technologies, IWPT-95, pages 79-86.
  • Neumann, G. (2002). HPSG-DOP: Data-Oriented Parsing with HPSG. Unpublished manuscript, presented at the 9th International Conference on HPSG, HPSG-2002, Seoul, South Korea.
  • Tsuruoka, Yoshimasa, Miyao, Yusuke and Tsujii, Jun'ichi (2004). Towards Efficient Probabilistic HPSG Parsing: Integrating Semantic and Syntactic Preference to Guide the Parsing. In Proceedings of the IJCNLP-04 Workshop "Beyond Shallow Analyses: Formalisms and Statistical Modeling for Deep Analyses".