RMRS - PowerPoint PPT Presentation

About This Presentation
Title:

RMRS

Description:

Integration experiments with broad-coverage systems/grammars (LinGO ERG and RASP) ... <rarg><rargname>ARG1</rargname><label>H3</label><var>X</var></rarg> ...

Slides: 32
Provided by: anncop
Tags: rmrs | var


Transcript and Presenter's Notes

Title: RMRS


1
RMRS
  • some background and current work

2
Talk overview
  • RMRS: integrating processors via semantics
  • Underspecified semantics from shallow processing
  • Integration experiments with broad-coverage
    systems/grammars (LinGO ERG and RASP)
  • Planned work

3
Integrating processing
  • No single system can do everything: deep and
    shallow processing have inherent strengths and
    weaknesses
  • Domain-dependent and domain-independent
    processing must be linked
  • Parsers and generators
  • Common representation for processing above
    sentence level (e.g., anaphora)

4
Compositional semantics as a common representation
  • Need a common representation language for
    systems: pairwise compatibility between systems
    is too limiting
  • Syntax is theory-specific and unnecessarily
    language-specific
  • Eventual goal should be semantics
  • Core idea: shallow processing gives
    underspecified semantic representation, so deep
    and shallow systems can be integrated
  • Full interlingua / common lexical semantics is
    too difficult (certainly currently), but can link
    predicates to ontologies, etc.

5
Shallow processing and underspecified semantics
  • Integrated parsing: shallow parsed phrases
    incorporated into deep parsed structures
  • Deep parsing invoked incrementally in response to
    information needs
  • Reuse of knowledge sources
  • domain knowledge, recognition of named entities,
    transfer rules in MT
  • Integrated generation
  • Formal properties clearer, representations more
    generally usable
  • Deep semantics taken as normative

6
RMRS approach: current and planned applications
  • Question answering
  • Cambridge CSTIT: deep parse questions, shallow
    parse answers
  • QA from structured knowledge: Frank et al.
  • Information extraction
  • Deep Thought
  • Chemistry texts (SciBorg (?))
  • Dictionary definition parsing for Japanese and
    English
  • Bond and Flickinger
  • Rhetorical structure, multi-document
    summarization, email response ...
  • also LOGON semantic transfer. MRSs from LFG
    used in HPSG generator.

7
RMRS: Extreme underspecification
  • Goal is to split up semantic representation into
    minimal components (cf Verbmobil VITs)
  • Scope underspecification (MRS)
  • Splitting up predicate argument structure
  • Explicit equalities
  • Hierarchies for predicates and sorts
  • Compatibility with deep grammars
  • Sorts and (some) closed class word information in
    SEM-I (API for grammar, more later)
  • No lexicon for shallow processing (apart from POS
    tags and possibly closed class words)

8
RMRS principles
  • Split up information content as much as possible
  • Accumulate information monotonically by simple
    operations
  • Don't represent what you don't know, but preserve
    everything you do know
  • Use a flat representation to allow pieces to be
    accessed individually

9
Separating arguments
  • lb1:every(x,h9,h6), lb2:cat(x), lb5:dog1(y),
    lb4:some(y,h8,h7), lb3:chase(e,x,y),
    h9=lb2, h8=lb5
  • goes to
  • lb1:every(x), RSTR(lb1,h9), BODY(lb1,h6),
    lb2:cat(x), lb5:dog1(y), lb4:some(y),
    RSTR(lb4,h8), BODY(lb4,h7), lb3:chase(e),
    ARG1(lb3,x), ARG2(lb3,y), h9=lb2, h8=lb5
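
The argument-splitting step on this slide can be sketched in a few lines of Python. This is only an illustration, not the converter used with the ERG: the function name `split_arguments`, the tuple layout, and the `label:predicate(...)` string notation are assumptions made for the example. It keeps the first argument with the predicate and turns every other argument into a separate binary relation on the label.

```python
from typing import List, Set, Tuple

# An MRS elementary predication: (label, predicate, arguments in order).
EP = Tuple[str, str, List[str]]

# Role names for the extra arguments of quantifiers, as on the slide.
QUANTIFIER_ROLES = ["RSTR", "BODY"]


def split_arguments(eps: List[EP], quantifiers: Set[str]) -> List[str]:
    """Split each predication into a unary predication plus role facts."""
    facts = []
    for label, pred, args in eps:
        # The first argument stays attached to the predicate itself.
        facts.append(f"{label}:{pred}({args[0]})")
        rest = args[1:]
        if pred in quantifiers:
            roles = QUANTIFIER_ROLES
        else:
            roles = [f"ARG{i}" for i in range(1, len(rest) + 1)]
        # Every remaining argument becomes its own fact about the label.
        for role, value in zip(roles, rest):
            facts.append(f"{role}({label},{value})")
    return facts


mrs = [
    ("lb1", "every", ["x", "h9", "h6"]),
    ("lb2", "cat", ["x"]),
    ("lb5", "dog1", ["y"]),
    ("lb4", "some", ["y", "h8", "h7"]),
    ("lb3", "chase", ["e", "x", "y"]),
]
rmrs = split_arguments(mrs, quantifiers={"every", "some"})
print(", ".join(rmrs))
```

Because each fact stands alone, a processor that knows only the predicate, or only one of the arguments, can still contribute its piece. The handle constraints are left out of the sketch because they carry over unchanged.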

10
Naming conventions: predicate names without a
lexicon
  • lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6),
  • lb2:_cat_n(x2sg),
  • lb5:_dog_n_1(x4sg),
  • lb4:_some_q(x3sg), RSTR(lb4,h8), BODY(lb4,h7),
  • lb3:_chase_v(esp), ARG1(lb3,x2sg), ARG2(lb3,x4sg)
  • h9=lb2, h8=lb5, x1sg=x2sg, x3sg=x4sg

11
POS output as underspecification
  • DEEP
  • lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6),
    lb2:_cat_n(x2sg), lb5:_dog_n_1(x4sg),
    lb4:_some_q(x3sg), RSTR(lb4,h8),
    BODY(lb4,h7), lb3:_chase_v(esp),
    ARG1(lb3,x2sg), ARG2(lb3,x4sg), h9=lb2, h8=lb5,
    x1sg=x2sg, x3sg=x4sg
  • POS
  • lb1:_every_q(x1), lb2:_cat_n(x2sg),
    lb3:_chase_v(epast), lb4:_some_q(x3),
    lb5:_dog_n(x4sg)

12
POS output as underspecification
  • DEEP
  • lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6),
    lb2:_cat_n(x2sg), lb5:_dog_n_1(x4sg),
    lb4:_some_q(x3sg), RSTR(lb4,h8),
    BODY(lb4,h7), lb3:_chase_v(esp),
    ARG1(lb3,x2sg), ARG2(lb3,x3sg), h9=lb2, h8=lb5,
    x1sg=x2sg, x3sg=x4sg
  • POS
  • lb1:_every_q(x1), lb2:_cat_n(x2sg),
    lb3:_chase_v(epast), lb4:_some_q(x3),
    lb5:_dog_n(x4sg)
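
One way to read "POS output as underspecification" is that every predicate name from the tagger should be a less specific version of the name the deep grammar produces. The sketch below checks only that much, and rests on assumptions not stated on the slides: names have the shape `_lemma_pos` with an optional `_sense` suffix, lemmas contain no underscore, and the two analyses are already paired up by label. Variable sorts such as `esp` versus `epast` would need a sort hierarchy and are ignored here.

```python
def pred_subsumes(general: str, specific: str) -> bool:
    """True if `general` names the same lemma and POS as `specific`
    and either gives no sense or gives the same sense."""
    g = general.strip("_").split("_")
    s = specific.strip("_").split("_")
    return s[:len(g)] == g


# Predicate names from the DEEP and POS analyses on the slide, keyed by label.
deep = {"lb1": "_every_q", "lb2": "_cat_n", "lb3": "_chase_v",
        "lb4": "_some_q", "lb5": "_dog_n_1"}
pos = {"lb1": "_every_q", "lb2": "_cat_n", "lb3": "_chase_v",
       "lb4": "_some_q", "lb5": "_dog_n"}

underspecified = all(pred_subsumes(pos[lb], deep[lb]) for lb in pos)
print(underspecified)
```

The check is deliberately one-directional: `_dog_n` subsumes `_dog_n_1`, but not the other way round, which matches the idea that the shallow output may say less than the deep output but should never say more.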

13
Semantics from RASP
  • RASP: robust, domain-independent, statistical
    parsing (Briscoe and Carroll)
  • can't produce conventional semantics because no
    subcategorization
  • can often identify arguments
  • S -> NP VP: NP supplies ARG1 for V
  • potential for partial identification
  • VP -> V NP
  • S -> NP S: NP might be ARG2 or ARG3

14
Underspecification of arguments
  • ARGN (most general)
  • ARG1or2, ARG2or3
  • ARG1, ARG2, ARG3 (fully specified)
  • RASP arguments can be specified as ARGN, ARG2or3,
    etc.
  • Also useful for Japanese deep parsing?
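
The role names on this slide suggest a small hierarchy, which can be sketched as a lookup table with two checks on top. The parent links below are inferred from the names alone (for example, that ARG2 falls under both ARG1or2 and ARG2or3), so treat the table and the helper names `subsumes` and `compatible` as assumptions for illustration.

```python
# Each role maps to its immediately more general roles (inferred from names).
PARENTS = {
    "ARG1": ["ARG1or2"],
    "ARG2": ["ARG1or2", "ARG2or3"],
    "ARG3": ["ARG2or3"],
    "ARG1or2": ["ARGN"],
    "ARG2or3": ["ARGN"],
    "ARGN": [],
}
LEAVES = ["ARG1", "ARG2", "ARG3"]


def ancestors(role: str) -> set:
    """Return the role itself plus every more general role above it."""
    seen = {role}
    stack = [role]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general: str, specific: str) -> bool:
    """True if `general` is `specific` or lies above it in the hierarchy."""
    return general in ancestors(specific)


def compatible(a: str, b: str) -> bool:
    """True if some fully specified role falls under both `a` and `b`."""
    return any(subsumes(a, leaf) and subsumes(b, leaf) for leaf in LEAVES)


print(subsumes("ARGN", "ARG3"))
print(compatible("ARG1or2", "ARG2or3"))
print(compatible("ARG1", "ARG2or3"))
```

With this reading, an ARG2or3 from RASP is compatible with an ARG2 or an ARG3 from the deep grammar but conflicts with an ARG1.
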
15
RMRS construction
  • ERG etc. uses MRS -> RMRS converter
  • argument splitting etc.
  • also RMRS -> MRS conversion
  • POS-RMRS: tag lexicon
  • RASP-RMRS: tag lexicon plus semantic rules
    associated with RASP rules to match ERG
  • defaults when no rule RMRS specified

16
RMRS composition with non-lexicalized grammars
  • MRS composition assumes a lexicalized approach:
    algebra defined in Copestake, Lascarides and
    Flickinger (2001)
  • RMRS with non-lexicalised grammars has similar
    basic algebra
  • without lexical subcategorization, rely on
    grammar rules to provide the ARGs
  • anchors rather than slots, to ground the ARGs
    (single anchor for RASP)
  • developed on basis of semantic test suite
  • most rules written by Anna Ritchie

17
Some cat sleeps (in RASP)
  • h3,e, <h3>, h3:_sleep(e)
  • sleeps
  • h,x, <h1>, h1:_some(x), RSTR(h1,h2), h2:_cat(x)
  • some cat
  • S -> NP VP
  • Head=VP, ARG1(<VP anchor>,<NP hook.index>)
  • h3,e, <h3>, h3:_sleep(e), ARG1(h3,x),
    h1:_some(x), RSTR(h1,h2), h2:_cat(x)
  • some cat sleeps
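
The composition step for "some cat sleeps" can be mimicked with a small data structure. This is a sketch under assumptions, not the RASP-RMRS implementation: the class name `Sement`, its fields, and the reading of the bracketed label on the slide as the anchor are all choices made for the example. The rule takes the hook and anchor from the head (VP) and adds an ARG1 linking the VP anchor to the NP index.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sement:
    label: str             # hook label
    index: str             # hook index
    anchor: Optional[str]  # label that new ARGs attach to, or None
    facts: List[str] = field(default_factory=list)


def s_np_vp(np: Sement, vp: Sement) -> Sement:
    """S -> NP VP: the VP is the head; the NP index becomes its ARG1."""
    if vp.anchor is None:
        raise ValueError("VP has no anchor to attach ARG1 to")
    arg1 = f"ARG1({vp.anchor},{np.index})"
    # Facts are only ever added, never changed (monotonic accumulation).
    return Sement(vp.label, vp.index, vp.anchor, vp.facts + [arg1] + np.facts)


sleeps = Sement("h3", "e", "h3", ["h3:_sleep(e)"])
some_cat = Sement("h", "x", "h1",
                  ["h1:_some(x)", "RSTR(h1,h2)", "h2:_cat(x)"])

sentence = s_np_vp(some_cat, sleeps)
print(sentence.label, sentence.index, sentence.anchor)
print(", ".join(sentence.facts))
```

Note that nothing in the rule consults a lexical entry for "sleeps": the ARG1 comes from the grammar rule alone, which is the point of using anchors when there is no subcategorization information.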

18
Real rule ...
  • <!--rule>
  • <name>S/np_vp</name>
  • <dtrs><dtr>NP</dtr><dtr>VP</dtr></dtrs>
  • <head>RULE</head>
  • <semstruct>
  • <hook><index>E</index><label>H1</label></hook>
  • <slots><noanchor/></slots>
  • <ep><gpred>PRPSTN_M_REL</gpred><label>H1</label><var>H2</var></ep>
  • <rarg><rargname>ARG1</rargname><label>H3</label><var>X</var></rarg>
  • <hcons hreln='qeq'><hi><var>H2</var></hi><lo><var>H</var></lo></hcons>
  • </semstruct>
  • <equalities><rv>X</rv><dh><dtr>NP</dtr><he>INDEX</he></dh></equalities>
  • <equalities><rv>H</rv><dh><dtr>VP</dtr><he>LABEL</he></dh></equalities>
  • <equalities><rv>H3</rv><dh><dtr>VP</dtr><he>ANCHOR</he></dh></equalities>
  • <equalities><rv>E</rv><dh><dtr>VP</dtr><he>INDEX</he></dh></equalities>
  • </rule-->
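
To see what the rule says, it can be read with Python's standard XML parser. The sketch below assumes the rule is written out with its angle brackets and with plain `<rule>` tags in place of the comment markers; only the elements needed for the ARG1 link are included. It resolves each rule variable through the `equalities` elements and prints which daughter features the ARG1 relation connects.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the S/np_vp rule (assumed markup, see the note above).
RULE_XML = """
<rule>
<name>S/np_vp</name>
<dtrs><dtr>NP</dtr><dtr>VP</dtr></dtrs>
<semstruct>
<hook><index>E</index><label>H1</label></hook>
<rarg><rargname>ARG1</rargname><label>H3</label><var>X</var></rarg>
</semstruct>
<equalities><rv>X</rv><dh><dtr>NP</dtr><he>INDEX</he></dh></equalities>
<equalities><rv>H3</rv><dh><dtr>VP</dtr><he>ANCHOR</he></dh></equalities>
<equalities><rv>E</rv><dh><dtr>VP</dtr><he>INDEX</he></dh></equalities>
</rule>
"""

rule = ET.fromstring(RULE_XML)

# Map each rule variable to the (daughter, feature) it is equated with.
bindings = {
    eq.findtext("rv"): (eq.findtext("dh/dtr"), eq.findtext("dh/he"))
    for eq in rule.findall("equalities")
}

rarg = rule.find("semstruct/rarg")
role = rarg.findtext("rargname")
label_dtr, label_feat = bindings[rarg.findtext("label")]
var_dtr, var_feat = bindings[rarg.findtext("var")]

summary = f"{role}({label_dtr}.{label_feat}, {var_dtr}.{var_feat})"
print(rule.findtext("name"), "=>", summary)
```

Read this way, the rule is the XML counterpart of the informal statement on the previous slide: ARG1 holds between the VP anchor and the NP index.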

19
ERG-RMRS / RASP-RMRS
20
Inchoative
21
Infinitival subject (unbound in RASP-RMRS)
22
Ditransitive: missing ARG3
23
Mismatch: Expletive it
24
Mismatch: larger numbers
25
Comments on RASP-RMRS
  • Fast enough (not significant compared to RASP
    processing time because no ambiguity)
  • Too many RASP rules! Need to generalise over
    classes.
  • Requires SEM-I API for MRS/RMRS from deep
    grammar
  • RASP and ERG may change
  • compatible test suites: semi-automatic rule
    update?
  • alternative technique for composition?
  • Parse selection: need to generalise over RMRSs
  • weighted intersections of RMRSs (cf. RASP
    grammatical relations)

26
SEM-I: semantic interface
  • Meta-level: manually specified grammar
    relations (constructions and closed-class)
  • Object-level: linked to lexical database for deep
    grammars
  • Object-level SEM-I auto-generated from expanded
    lexical entries in deep grammars (because type
    can contribute relations)
  • Validation of other lexicons
  • Need closed class items for RMRS construction
    from shallow processing

27
Alignment and XML
  • Comparing RMRSs for same text efficiently uses
    characterization
  • labels RMRSs according to their source in the
    text
  • currently characters, but byte offset? Japanese
    etc?
  • RMRS-XML
  • RMRS seen as levels of mark-up: standoff
    annotation
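
Characterization makes comparison cheap because predications from different processors can be matched by the stretch of text they came from. The following sketch assumes each predication is stored with a (start, end) character span; the spans and predicate lists are made up for the example sentence and are not actual ERG or RASP output.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]          # character offsets: start inclusive, end exclusive
Pred = Tuple[Span, str]         # (span, predicate name)


def align(rmrs_a: List[Pred], rmrs_b: List[Pred]) -> List[Tuple[Span, str, str]]:
    """Pair predications from two analyses that cover the same span."""
    by_span: Dict[Span, List[str]] = {}
    for span, pred in rmrs_b:
        by_span.setdefault(span, []).append(pred)
    return [(span, pred, other)
            for span, pred in rmrs_a
            for other in by_span.get(span, [])]


text = "every cat chased some dog"
deep = [((0, 5), "_every_q"), ((6, 9), "_cat_n"), ((10, 16), "_chase_v"),
        ((17, 21), "_some_q"), ((22, 25), "_dog_n_1")]
shallow = [((0, 5), "_every_q"), ((6, 9), "_cat_n"), ((10, 16), "_chase_v"),
           ((17, 21), "_some_q"), ((22, 25), "_dog_n")]

pairs = align(shallow, deep)
for span, shallow_pred, deep_pred in pairs:
    print(text[span[0]:span[1]], shallow_pred, deep_pred)
```

Indexing by span avoids comparing every predication against every other one. The open question on the slide, characters versus byte offsets, matters here because the two only coincide for single-byte encodings.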

28
SciBorg: Chemistry texts
  • eScience project starting in October at Cambridge
  • Computer Laboratory (Copestake, Teufel),
    Chemistry (Murray-Rust), CeSC (Parker)
  • Aims
  • Develop an NL markup language which will act as a
    platform for extraction of information. Link to
    semantic web languages.
  • Develop IE technology and core ontologies for use
    by publishers, researchers, readers, vendors and
    regulatory organisations.
  • Model scientific argumentation and citation
    purpose in order to support novel modes of
    information access.
  • Demonstrate the applicability of this
    infrastructure in a real-world eScience
    environment.

29
Research markup
  • Chemistry: The primary aims of the present study
    are (i) the synthesis of an amino acid derivative
    that can be incorporated into proteins /via/
    standard solid-phase synthesis methods, and (ii)
    a test of the ability of the derivative to
    function as a photoswitch in a biological
    environment.
  • Computational Linguistics: The goal of the work
    reported here is to develop a method that can
    automatically refine the Hidden Markov Models to
    produce a more accurate language model.

30
RMRS and research markup
  • Specify cues in RMRS
  • Deep process cues: feasible because
    domain-independent
  • more general and reliable than shallow techniques
  • allows for complex interrelationships
  • Use zones for advanced citation maps and other
    enhancements to repositories

31
Conclusions
  • RMRS: semantic representation language allowing
    linking of deep and shallower processors
  • RMRS construction: phrase-level compatibility
    between processors
  • Many potential applications