Whose presentation is this? SUBJ(present, Violeta Seretan) - PowerPoint PPT Presentation

About This Presentation
Description:

Title: Slide 1 Author: seretan Last modified by: seretan Created Date: 10/17/2005 10:49:27 PM Document presentation format: On-screen Show Company – PowerPoint PPT presentation

Slides: 31
Provided by: sere6
Transcript and Presenter's Notes

1
Whose presentation is this? SUBJ(present, Violeta Seretan)
(Decoding the predicate-argument structure of nominalizations)
OBL(collaborate, Lorenzo Thione) PP-OBJ(with, Lorenzo Thione)
SUBJ(supervise, Martin van den Berg)
2
Overview
  • nominalization problem
  • NOMLEX resource
  • Denominalizer service based on NOMLEX
  • additional resources (CSLI)
  • APIs for NOMLEX, CSLI
  • related and future work
  • demo

3
Text normalization for QA
  • Mark Twain published Adventures of Huckleberry
    Finn in 1885 in America.
  • Who published H.F.?
  • Where was H.F. published?
  • When was H.F. published?
  • QA/NLU needs to deal with a large spectrum of
    variation in text:
  • morphological: published, publishes
  • syntactic: H.F. was published
  • lexical: novel, book, masterpiece, work;
    publish, write, author, appear
  • nominalization: the publication
  • Normalization (via parsing):
  • base word form: publishes -> publish, published
    -> publish
  • canonical word order: SUBJ(publish, Mark Twain),
    OBJ(publish, H.F.)
  • Lexical semantic resources:
  • synonyms, hyponyms, hypernyms, ...

4
Nominalization
  • Since the publication of Huckleberry Finn in
    1885, there have been many reactions to the
    novel, some of them quite extreme.
  • When was H.F. published?

Nominalization: an NP having a systematic
correspondence with a clause structure (Quirk et
al. 1985).
Goal: decoding the clause structure.
5
Mapping nominal arguments into verbal roles
  • Mark Twain's publication of his book
  • possessive determiner, PP adjunct
    (nominal arguments)
  • the book publication by Mark Twain
  • modifier, PP adjunct (nominal
    arguments)
  • Mark Twain - publish - book
  • SUBJECT, OBJECT (verbal
    roles)

6
Role ambiguity
  • Rome's destruction: SUBJ or OBJ?
  • OBJ(destroy, Rome)
  • SUBJ(destroy, Rome)
  • Rome's destruction by barbarians: OBJ
  • Rome's destruction of Carthage: SUBJ
  • Rome's destruction: OBJ (by default)
  • John's admiration: SUBJ (by default)
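
The pattern on this slide can be sketched in a few lines. This is only an illustration, not the Denominalizer code: the function name `possessive_role` and the `DEFAULT_ROLE` table are hypothetical, and the table holds just the two defaults listed above.

```python
# Hypothetical per-noun defaults for a bare possessive (the slide's two examples).
DEFAULT_ROLE = {"destruction": "OBJ", "admiration": "SUBJ"}

def possessive_role(noun, prepositions=()):
    """Guess the verbal role of the possessive argument of `noun`.

    `prepositions` lists the prepositions of PP adjuncts attached to the noun.
    """
    if "by" in prepositions:   # the by-phrase is the SUBJ, so the possessive is OBJ
        return "OBJ"
    if "of" in prepositions:   # the of-phrase is the OBJ, so the possessive is SUBJ
        return "SUBJ"
    return DEFAULT_ROLE.get(noun)  # no PP: fall back to the noun's default

print(possessive_role("destruction", ["by"]))  # Rome's destruction by barbarians
print(possessive_role("destruction", ["of"]))  # Rome's destruction of Carthage
print(possessive_role("admiration"))           # John's admiration
```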

7
NOMLEX: NOMinalization LEXicon
  • Macleod et al., New York University
  • 1025 deverbal nouns
  • detailed mapping from nominal arguments to verb
    roles
  • ORTH "destruction"
  • VERB "destroy"
  • VERB-SUBC ((NOM-NP SUBJECT ((N-N-MOD)
                                (DET-POSS)
                                (PP PVAL ("by")))
                       OBJECT ((DET-POSS)
                               (N-N-MOD)
                               (PP PVAL ("of")))
                       REQUIRED ((OBJECT DET-POSS-ONLY T
                                         N-N-MOD-ONLY T))))

role to assign
default role
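
An entry in this notation can be read into nested lists with a small s-expression parser. The sketch below is an assumption-laden illustration, not the NOMLEX API presented later: `parse_sexpr` and `plist` are hypothetical helper names, and the entry text is the "destruction" fragment above wrapped in `(NOM ...)`, the wrapper used by the "accusation" entry on the next slide.

```python
import re

# Tokens: parentheses, double-quoted strings, or bare atoms.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')

def parse_sexpr(text):
    """Parse one parenthesized s-expression into nested Python lists."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume the closing ")"
            return items
        return tok.strip('"')

    return read()

def plist(items):
    """Turn a flat [KEY, value, KEY, value, ...] list into a dict."""
    return dict(zip(items[0::2], items[1::2]))

ENTRY = '''
(NOM ORTH "destruction" VERB "destroy"
 VERB-SUBC ((NOM-NP SUBJECT ((N-N-MOD) (DET-POSS) (PP PVAL ("by")))
                    OBJECT ((DET-POSS) (N-N-MOD) (PP PVAL ("of")))
                    REQUIRED ((OBJECT DET-POSS-ONLY T N-N-MOD-ONLY T)))))
'''

entry = plist(parse_sexpr(ENTRY)[1:])   # drop the leading NOM tag
frame = entry["VERB-SUBC"][0]           # first subcat: ['NOM-NP', 'SUBJECT', ...]
roles = plist(frame[1:])                # role name -> list of nominal positions
print(entry["VERB"], roles["OBJECT"])
```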
8
NOMLEXML
(NOM ORTH "accusation"
     PLURAL "accusations"
     PLURAL-FREQ "not rare"
     VERB "accuse"
     NOUN-SUBC ((NOUN-PP PVAL ("about")))
     NOM-TYPE ((VERB-NOM))
     VERB-SUBJ ((DET-POSS)
                (N-N-MOD)
                (PP PVAL ("by")))
     SUBJ-ATTRIBUTE ((COMMUNICATOR))
     OBJ-ATTRIBUTE ((COMMUNICATOR))
     VERB-SUBC ((NOM-NP-PP SUBJECT ((DET-POSS)
                                    (N-N-MOD)
                                    (PP PVAL ("by")))
                           OBJECT ((PP PVAL ("against")))
                           PVAL ("of"))
                (NOM-NP SUBJECT ((DET-POSS)
Perl
9
NOMLEX API in Java
  • com.fxpal.sake.test (NomLexInterface)
  • com.fxpal.ltng.services.normalization.noun.nomlex
    (NomLex, NomLexEntry, NomLexClassConstants,
    Subcat)

10
How useful?
  • Oracle acquired PeopleSoft at the end of last
    year.
  • Oracle's acquisition of PeopleSoft at the end of
    last year
  • Google hits, 10/25/2005:

"Oracle acquisition of PeopleSoft"    14500
"Oracle acquired PeopleSoft"            587
"Oracle's PeopleSoft acquisition"       693
More hits:
"Oracle acquires PeopleSoft"           1020
"Oracle has acquired PeopleSoft"        248
"Oracle will acquire PeopleSoft"        424
11
Argument-role mapping
  • Oracle's acquisition of PeopleSoft
  • possessive, PP (of)
  • ORTH "acquisition"
  • VERB "acquire"
  • VERB-SUBC ((NOM-NP SUBJECT ((DET-POSS)
                                (N-N-MOD)
                                (PP PVAL ("by")))
                       OBJECT ((N-N-MOD)
                               (PP PVAL ("of"))))

12
Denominalizer
  • Input: sentence
  • Output: pairs nominal argument - verb role
  • for each nominalization:
  • (noun, (argument - role))
  • Examples:
  • Oracle's acquisition of PeopleSoft finally
    materialized after an 18-month struggle between
    the two companies.
  • (acquisition, (Oracle - SUBJECT) (PeopleSoft -
    OBJECT))
  • Oracle acquisition finally materialized.
  • (acquisition, (Oracle - SUBJECT) (Oracle -
    OBJECT))

13
Algorithm
  • parse sentence
  • for each deverbal noun
  • get noun arguments
  • for each NOMLEX entry for noun
  • for each subcat of the entry
  • 1. match arguments against subcat
  • 2. filter assignment results
  • select a subcat
  • output assignments for selected subcat
  • Note: overlapping nominalizations are OK, e.g.
    an increase in product sales
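
The loop above could look roughly like the following sketch. Everything named here is hypothetical (`LEXICON`, `match`, `denominalize`, the `PP-by`/`PP-of` position labels); parsing is assumed to have happened already, and the subcat selection rule is a guess, since the slide does not say how a subcat is chosen.

```python
# Hypothetical miniature lexicon: noun -> subcat frames; each frame maps a
# verbal role to the nominal positions that may realize it.
LEXICON = {
    "acquisition": [
        {"SUBJECT": {"DET-POSS", "N-N-MOD", "PP-by"},
         "OBJECT": {"N-N-MOD", "PP-of"}},
    ],
}

def match(args, frame):
    """Step 1: list, for every argument, the roles its position can realize."""
    return {word: [role for role, positions in frame.items() if pos in positions]
            for word, pos in args}

def denominalize(noun_args, lexicon=LEXICON, filters=()):
    """noun_args maps each deverbal noun to its (argument, position) pairs,
    as a parser would supply them; filters are step-2 constraint functions."""
    output = []
    for noun, args in noun_args.items():
        for frame in lexicon.get(noun, []):
            candidates = match(args, frame)
            for apply_filter in filters:
                candidates = apply_filter(candidates)
            # Assumed selection rule: take the first subcat in which every
            # argument still has at least one possible role.
            if all(candidates.values()):
                output.append((noun, candidates))
                break
    return output

print(denominalize({"acquisition": [("Oracle", "DET-POSS"), ("PeopleSoft", "PP-of")]}))
print(denominalize({"acquisition": [("Oracle", "N-N-MOD")]}))
```

The second call leaves both roles open for "Oracle", which mirrors the ambiguous output shown for "Oracle acquisition" on the Denominalizer slide.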

com.fxpal.ltng.services.normalization.noun.
14
1. Matching
  • Oracle's acquisition of PeopleSoft finally
    materialized.
  • Arguments (acquisition):
  • POSS(acquisition, Oracle)
  • ADJUNCT(acquisition, of)
  • PP-OBJ(of, PeopleSoft)
  • NOM-NP
      SUBJECT ((DET-POSS)
               (N-N-MOD)
               (PP PVAL ("by")))
      OBJECT ((N-N-MOD)
              (PP PVAL ("of")))
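
As a rough sketch of this matching step, the parse triples can first be turned into NOMLEX positions (POSS to DET-POSS, MOD to N-N-MOD, ADJUNCT plus PP-OBJ to PP) and then looked up in the frame. The function names and the tuple encoding of positions are assumptions made for the example, not the service's actual data structures.

```python
def nominal_positions(noun, triples):
    """Map parser triples (relation, head, dependent) to (argument, position)."""
    # Preposition -> its object; assumes each preposition occurs once per sentence.
    pp_object = {head: dep for rel, head, dep in triples if rel == "PP-OBJ"}
    positions = []
    for rel, head, dep in triples:
        if head != noun:
            continue
        if rel == "POSS":
            positions.append((dep, ("DET-POSS",)))
        elif rel == "MOD":
            positions.append((dep, ("N-N-MOD",)))
        elif rel == "ADJUNCT" and dep in pp_object:
            positions.append((pp_object[dep], ("PP", dep)))
    return positions

# NOM-NP frame for "acquisition" from the slide: role -> allowed positions.
FRAME = {
    "SUBJECT": [("DET-POSS",), ("N-N-MOD",), ("PP", "by")],
    "OBJECT": [("N-N-MOD",), ("PP", "of")],
}

def candidate_roles(noun, triples, frame=FRAME):
    """For each argument of `noun`, list the roles whose positions it matches."""
    return {arg: [role for role, allowed in frame.items() if pos in allowed]
            for arg, pos in nominal_positions(noun, triples)}

triples = [("POSS", "acquisition", "Oracle"),
           ("ADJUNCT", "acquisition", "of"),
           ("PP-OBJ", "of", "PeopleSoft")]
print(candidate_roles("acquisition", triples))
# {'Oracle': ['SUBJECT'], 'PeopleSoft': ['OBJECT']}
```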

15
2. Filtering
  • Oracle's PeopleSoft acquisition finally
    materialized.
  • Arguments (acquisition):
  • POSS(acquisition, Oracle)
  • MOD(acquisition, PeopleSoft)
  • NOM-NP
      SUBJECT ((DET-POSS)
               (N-N-MOD)
               (PP PVAL ("by")))
      OBJECT ((N-N-MOD)
              (PP PVAL ("of")))

Alternatives: Oracle - SUBJECT; PeopleSoft -
SUBJECT, OBJECT
16
NOMLEX constraints (1)
  • Uniqueness Constraint
  • A verbal role may be filled only once.
  • Oracle's PeopleSoft acquisition
  • Matching alternatives:
  • Oracle: SUBJECT
  • PeopleSoft: SUBJECT, OBJECT
  • Result: Oracle already fills SUBJECT, so
    PeopleSoft must be OBJECT
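
One simple way to apply the uniqueness constraint is to enumerate every combination of alternatives and discard those that reuse a role. This is a sketch with a hypothetical function name; the slides do not say how the filter is actually implemented.

```python
from itertools import product

def unique_assignments(alternatives):
    """Enumerate role assignments in which no verbal role is filled twice."""
    args = list(alternatives)
    kept = []
    for roles in product(*(alternatives[a] for a in args)):
        if len(set(roles)) == len(roles):   # every role used at most once
            kept.append(dict(zip(args, roles)))
    return kept

alternatives = {"Oracle": ["SUBJECT"], "PeopleSoft": ["SUBJECT", "OBJECT"]}
print(unique_assignments(alternatives))
# [{'Oracle': 'SUBJECT', 'PeopleSoft': 'OBJECT'}]
```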

17
NOMLEX constraints (2)
  • Ordering Constraint
  • If there are multiple pre-nominal arguments,
    they must appear in the order
  • SUBJECT, INDIRECT OBJECT, DIRECT OBJECT,
    OBLIQUE.
  • FX's printer sales grew by 50%.
  • Matching alternatives:
  • FX: SUBJECT, OBJECT
  • printer: SUBJECT, OBJECT
  • order: FX, printer
  • verbal roles: SUBJECT, OBJECT
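
The ordering check can be sketched as follows, assuming the pre-nominal arguments are given in surface order. `ROLE_ORDER` and `ordered_assignments` are hypothetical names, and OBJECT stands in for the direct object, as in the slide's example.

```python
from itertools import product

# Required left-to-right order of roles among pre-nominal arguments.
ROLE_ORDER = ["SUBJECT", "INDIRECT-OBJECT", "OBJECT", "OBLIQUE"]

def ordered_assignments(prenominal, alternatives):
    """Keep assignments whose roles strictly follow ROLE_ORDER left to right.

    prenominal:   arguments in the order they appear before the noun
    alternatives: {argument: [candidate roles]}
    """
    kept = []
    for roles in product(*(alternatives[a] for a in prenominal)):
        ranks = [ROLE_ORDER.index(r) for r in roles]
        if all(a < b for a, b in zip(ranks, ranks[1:])):
            kept.append(list(zip(prenominal, roles)))
    return kept

alts = {"FX": ["SUBJECT", "OBJECT"], "printer": ["SUBJECT", "OBJECT"]}
print(ordered_assignments(["FX", "printer"], alts))
# [[('FX', 'SUBJECT'), ('printer', 'OBJECT')]]
```

Because the comparison is strict, this check also rules out assigning the same role twice, so it subsumes the uniqueness constraint for pre-nominal arguments.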

18
NOMLEX constraints (3)
  • Obligatoriness Constraint
  • By default, the subject and object are optional.
  • A NOMLEX entry can specify obligatory roles to
    be filled.
  • circulation - REQUIRED (SUBJECT)
  • blood circulation
  • SUBJ(circulate, blood)
  • destruction - REQUIRED ((OBJECT DET-POSS-ONLY T
                                    N-N-MOD-ONLY T))
  • Rome's destruction
  • OBJ(destroy, Rome)
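
A possible filter for obligatory roles is sketched below. The function name and data layout are hypothetical, and the reading of DET-POSS-ONLY / N-N-MOD-ONLY (the requirement applies only when the noun's sole argument is in one of those positions) is an assumption made for the example.

```python
def enforce_required(assignments, required, positions):
    """Keep assignments that fill every applicable required role.

    assignments: list of {argument: role} alternatives
    required:    {role: None} for an unconditional requirement, or
                 {role: set_of_positions} for one that applies only when the
                 noun's sole argument sits in one of those positions (assumed)
    positions:   {argument: nominal position}
    """
    sole = list(positions.values())
    applicable = set()
    for role, only in required.items():
        if only is None or (len(sole) == 1 and sole[0] in only):
            applicable.add(role)
    return [a for a in assignments if applicable <= set(a.values())]

# blood circulation: SUBJECT is required
print(enforce_required([{"blood": "SUBJECT"}, {"blood": "OBJECT"}],
                       {"SUBJECT": None}, {"blood": "N-N-MOD"}))
# Rome's destruction: OBJECT is required for a lone possessive or modifier
print(enforce_required([{"Rome": "SUBJECT"}, {"Rome": "OBJECT"}],
                       {"OBJECT": {"DET-POSS", "N-N-MOD"}}, {"Rome": "DET-POSS"}))
```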

19
Selectional Restrictions
com.fxpal.ltng.services.normalization.noun.csli
(Nouns, Verbs, NounsVerbs)
20
Applying selectional restrictions
  • room reservation
  • Alternatives:
  • room - SUBJECT, OBJECT
  • reserve - selectional restrictions: SUBJECT
    sentient, OBJECT
  • room - location, physobj
  • semantic types for about 5000 nouns
  • selectional restrictions for about 5000 verbs
  • 459/941 verbs from NOMLEX (48.77%)
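
A minimal sketch of how such restrictions could prune the alternatives for "room reservation". The two tables are tiny hypothetical stand-ins for the CSLI data; the OBJECT restriction of "reserve" is not legible in this transcript, so it is left unrestricted here.

```python
# Hypothetical excerpts of the noun and verb tables.
NOUN_TYPES = {"room": {"location", "physobj"}}
# SUBJECT of "reserve" must be sentient; OBJECT is left unrestricted (assumption).
VERB_RESTRICTIONS = {"reserve": {"SUBJECT": "sentient", "OBJECT": None}}

def restrict(verb, noun, roles):
    """Drop the roles whose selectional restriction the noun's types violate."""
    types = NOUN_TYPES.get(noun)
    if types is None:            # unknown noun: nothing to check against
        return list(roles)
    allowed = []
    for role in roles:
        needed = VERB_RESTRICTIONS.get(verb, {}).get(role)
        if needed is None or needed in types:
            allowed.append(role)
    return allowed

print(restrict("reserve", "room", ["SUBJECT", "OBJECT"]))  # ['OBJECT']
```

A room is not sentient, so it cannot be the subject of "reserve", and only the OBJECT reading survives.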

21
Coverage extension
  • What if a noun is not in NOMLEX?
  • additional deverbal nouns in the CSLI data
  • 4087 event nouns
  • 3348 new, 739 already in NOMLEX
  • 3348/1025: 326% more data
  • NOMLEX template:
  • NOM-NP
      SUBJECT ((DET-POSS)
               (N-N-MOD)
               (PP PVAL ("by")))
      OBJECT ((DET-POSS)
              (N-N-MOD)
              (PP PVAL ("of")))
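
The fallback could be sketched like this: use the NOMLEX frames when the noun has an entry, and otherwise apply the generic template to any noun listed as an event noun. The names, the position labels, and the two example word lists are hypothetical; only the template and the counts come from the slide.

```python
# Generic template from the slide: both roles accept all three positions.
DEFAULT_FRAME = {
    "SUBJECT": ["DET-POSS", "N-N-MOD", "PP-by"],
    "OBJECT": ["DET-POSS", "N-N-MOD", "PP-of"],
}

def frames_for(noun, nomlex, event_nouns):
    """Return the subcat frames to try for `noun`."""
    if noun in nomlex:
        return nomlex[noun]
    if noun in event_nouns:          # not in NOMLEX: fall back to the template
        return [DEFAULT_FRAME]
    return []                        # not a known deverbal noun

# Hypothetical data for illustration only.
nomlex = {"acquisition": [{"SUBJECT": ["DET-POSS", "N-N-MOD", "PP-by"],
                           "OBJECT": ["N-N-MOD", "PP-of"]}]}
event_nouns = {"acquisition", "reservation"}

print(frames_for("reservation", nomlex, event_nouns) == [DEFAULT_FRAME])

# Coverage figures from the slide.
new_nouns = 4087 - 739
print(new_nouns, round(new_nouns / 1025, 2))
```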

22
Future work
  • extensive test and evaluation
  • other nominalization data
  • deverbal noun recognition
  • mapping information (FrameNet)
  • other lexical resources
  • PropBank semantic roles
  • VerbLex selectional restrictions
  • role assignment in context
  • word sense disambiguation, anaphora, discourse
  • collocations
  • the author will make no accusation
  • SUBJ(make, author) -> SUBJ(accuse, author)

23
Related work
  • PUNDIT system (Dahl et al., 1987)
  • SNOWY QA system (Hull and Gomez 1996)
  • NOMLEX for IE (Meyers et al., 1998)
  • N-N interpretation (Lapata 2002, Girju et al.
    2004)

24
References
  • Dahl, Deborah A., Martha S. Palmer, and Rebecca
    J. Passonneau. 1987. Nominalizations in PUNDIT.
    In Proceedings of the 25th Annual Meeting of the
    Association for Computational Linguistics,
    Stanford, CA.
  • Girju, Roxana, Ana-Maria Giuglea, Marian Olteanu,
    Ovidiu Fortu, Orest Bolohan, and Dan Moldovan.
    2004. Support vector machines applied to the
    classification of semantic relations in
    nominalized noun phrases. In Proceedings of the
    HLT-NAACL Workshop on Computational Lexical
    Semantics.
  • Hull, Richard and Fernando Gomez. 1996. Semantic
    interpretation of nominalizations. In
    Proceedings of the Thirteenth National Conference
    on Artificial Intelligence, Portland, Oregon,
    pp. 1062-8.
  • Lapata, Maria. 2002. The disambiguation of
    nominalisations. Computational Linguistics 28(3),
    357-388.
  • Macleod, Catherine, Ralph Grishman, Adam Meyers,
    Leslie Barrett, and Ruth Reeves. 1998. NOMLEX: A
    lexicon of nominalizations. In Proceedings of the
    8th International Congress of the European
    Association for Lexicography, pages 187-193,
    Liège, Belgium.
  • Meyers, A., et al. 1998. Using NOMLEX to produce
    nominalization patterns for information
    extraction. In Proceedings of the COLING-ACL
    Workshop on Computational Treatment of Nominals.
  • Quirk, R., S. Greenbaum, G. Leech, and J.
    Svartvik. 1985. A Comprehensive Grammar of the
    English Language. Longman, Harlow.
  • Terada, Akira and Takenobu Tokunaga. 2003. Corpus
    based method of transforming nominalized phrases
    into clauses for text mining application. IEICE
    Transactions on Information and Systems,
    Vol. E86-D, No. 9, pp. 1736-1744.

25
  • Thank you!

26
Selectional restrictions data
  • CSLI resource
  • nouns: 4447
  • semantic types (ontology)
  • verbs: 4858
  • subcategorizations
  • selectional restrictions
  • noun-verb: 5700 verbs (9415 nouns)
  • noun-verb pairs

27
Grammatical Transfer
NOMLEX        XLE                                        Example
DET-POSS      POSS                                       Rome's destruction
PP            ADJUNCT, PP-OBJ (POSNOUN)                  destruction of Carthage
TO-INF        XCOMP                                      the desire to leave
AS-NP-PHRASE  ADJUNCT, PP-OBJ (as, POSNOUN)              his resignation as chairman
N-N-MOD       MOD                                        the room reservation
P-ING         ADJUNCT, PP-OBJ (POSVERB)                  the accusation against launching
ING           ADJUNCT, QA_PROG()                         my appreciation being there
FOR-TO-INF    ADJUNCT, SUBJ                              the wish for him to go
ADVP          ADJUNCT (POSADV)                           his departure abroad
AS-ING        ADJUNCT, PP-OBJ (as, POSVERB), QA_PROG()   characterization as being
AS-ADJP       ADJUNCT, PP-OBJ (as, POSADJ)               the characterization as useful
P-POSSING     ADJUNCT, PP-OBJ(POSVERB), POSS             the acceptance of his talking
28
FrameNet
  • aim: word semantico-syntactic mapping
  • semantic roles: frame elements (frame-specific)
  • BNC corpus (100M words); American English: LDC,
    ANC
  • more than 600 frames, about 9,000 words

Example: accusation, frame Judgment_communication
FEs (for this word) and their realizations:
communicator: not expressed (27/48), possessive determiner (6/48), PP (from) (2/48)
evaluee: not expressed (40/48), PP (against) (5/48), PP (about) (3/48)
reason: PP (of) (9/48), S (that) (9/48), not expressed (8/48), PP (about) (3/48)
29
NOMLEX constraints (4)
  • restrictions on possible combinations
  • specified in NOMLEX entry
  • adaptation
  • NOT ((AND SUBJECT ((DET-POSS) (N-N-MOD))
              OBJECT ((N-N-MOD))
  • plants' weather adaptation
  • plants' adaptation to weather
  • Note: not implemented (cannot decide which
    assignment to remove).

30
Denominalizer UI
com.fxpal.sake.test.DenominalizerTest
parse triples
output