RMRS - PowerPoint PPT Presentation

About This Presentation

Title:

RMRS

Description:

Integration experiments with broad-coverage systems/grammars (LinGO ERG and RASP) ... <rarg><rargname>ARG1</rargname><label>H3</label><var>X</var></rarg> ...

Slides: 32

Provided by: anncop

Tags: rmrs | var

Transcript and Presenter's Notes

Title: RMRS

1
RMRS

some background and current work

2
Talk overview

RMRS: integrating processors via semantics
Underspecified semantics from shallow processing
Integration experiments with broad-coverage
systems/grammars (LinGO ERG and RASP)
Planned work

3
Integrating processing

No single system can do everything: deep and
shallow processing have inherent strengths and
weaknesses
Domain-dependent and domain-independent
processing must be linked
Parsers and generators
Common representation for processing above
sentence level (e.g., anaphora)

4
Compositional semantics as a common representation

Need a common representation language for
systems: pairwise compatibility between systems
is too limiting
Syntax is theory-specific and unnecessarily
language-specific
Eventual goal should be semantics
Core idea: shallow processing gives an
underspecified semantic representation, so deep
and shallow systems can be integrated
Full interlingua / common lexical semantics is
too difficult (certainly currently), but can link
predicates to ontologies, etc.

5
Shallow processing and underspecified semantics

Integrated parsing: shallow parsed phrases
incorporated into deep parsed structures
Deep parsing invoked incrementally in response to
information needs
Reuse of knowledge sources:
domain knowledge, recognition of named entities,
transfer rules in MT
Integrated generation
Formal properties clearer, representations more
generally usable
Deep semantics taken as normative

6
RMRS approach: current and planned applications

Question answering
Cambridge CSTIT: deep parse questions, shallow
parse answers
QA from structured knowledge: Frank et al.
Information extraction
Deep Thought
Chemistry texts (SciBorg (?))
Dictionary definition parsing for Japanese and
English: Bond and Flickinger
Rhetorical structure, multi-document
summarization, email response ...
Also LOGON: semantic transfer. MRSs from LFG
used in HPSG generator.

7
RMRS: Extreme underspecification

Goal is to split up semantic representation into
minimal components (cf. Verbmobil VITs)
Scope underspecification (MRS)
Splitting up predicate argument structure
Explicit equalities
Hierarchies for predicates and sorts
Compatibility with deep grammars
Sorts and (some) closed class word information in
SEM-I (API for grammar, more later)
No lexicon for shallow processing (apart from POS
tags and possibly closed class words)

8
RMRS principles

Split up information content as much as possible
Accumulate information monotonically by simple
operations
Don't represent what you don't know but preserve
everything you do know
Use a flat representation to allow pieces to be
accessed individually

9
Separating arguments

lb1:every(x,h9,h6), lb2:cat(x), lb5:dog1(y),
lb4:some(y,h8,h7), lb3:chase(e,x,y),
h9=lb2, h8=lb5
goes to
lb1:every(x), RSTR(lb1,h9), BODY(lb1,h6),
lb2:cat(x), lb5:dog1(y), lb4:some(y),
RSTR(lb4,h8), BODY(lb4,h7), lb3:chase(e),
ARG1(lb3,x), ARG2(lb3,y), h9=lb2, h8=lb5
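
The argument-splitting step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `EP` tuple, the `split_ep` helper and the hard-coded set of quantifier predicates are hypothetical names introduced here, and labels are written with a colon separator (`lb1:every(x)`). It is not the converter used with the ERG.

```python
from typing import NamedTuple

class EP(NamedTuple):
    """One MRS elementary predication (hypothetical container for this sketch)."""
    label: str
    pred: str
    args: tuple  # first element is the characteristic variable

def split_ep(ep, quantifiers=frozenset({"every", "some"})):
    """Split one EP into a one-place predicate plus separate argument relations."""
    head, rest = ep.args[0], ep.args[1:]
    if ep.pred in quantifiers:
        roles = ("RSTR", "BODY")  # quantifier arguments, as on the slide
    else:
        roles = tuple(f"ARG{i}" for i in range(1, len(rest) + 1))
    pieces = [f"{ep.label}:{ep.pred}({head})"]
    pieces += [f"{role}({ep.label},{arg})" for role, arg in zip(roles, rest)]
    return pieces

mrs = [
    EP("lb1", "every", ("x", "h9", "h6")),
    EP("lb2", "cat", ("x",)),
    EP("lb5", "dog1", ("y",)),
    EP("lb4", "some", ("y", "h8", "h7")),
    EP("lb3", "chase", ("e", "x", "y")),
]
rmrs = [piece for ep in mrs for piece in split_ep(ep)]
print(", ".join(rmrs))
```

The handle constraints (h9/lb2, h8/lb5) are left out because splitting does not touch them. Each resulting piece can be dropped or added independently, which is what lets a shallow processor supply only the pieces it knows.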

10
Naming conventions: predicate names without a
lexicon

lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6),
lb2:_cat_n(x2sg),
lb5:_dog_n_1(x4sg),
lb4:_some_q(x3sg), RSTR(lb4,h8), BODY(lb4,h7),
lb3:_chase_v(esp), ARG1(lb3,x2sg), ARG2(lb3,x4sg),
h9=lb2, h8=lb5, x1sg=x2sg, x3sg=x4sg

11
POS output as underspecification

DEEP
lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6),
lb2:_cat_n(x2sg), lb5:_dog_n_1(x4sg),
lb4:_some_q(x3sg), RSTR(lb4,h8),
BODY(lb4,h7), lb3:_chase_v(esp),
ARG1(lb3,x2sg), ARG2(lb3,x4sg), h9=lb2, h8=lb5,
x1sg=x2sg, x3sg=x4sg
POS
lb1:_every_q(x1), lb2:_cat_n(x2sg),
lb3:_chase_v(epast), lb4:_some_q(x3),
lb5:_dog_n(x4sg)

12
POS output as underspecification

DEEP
lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6),
lb2:_cat_n(x2sg), lb5:_dog_n_1(x4sg),
lb4:_some_q(x3sg), RSTR(lb4,h8),
BODY(lb4,h7), lb3:_chase_v(esp),
ARG1(lb3,x2sg), ARG2(lb3,x3sg), h9=lb2, h8=lb5,
x1sg=x2sg, x3sg=x4sg
POS
lb1:_every_q(x1), lb2:_cat_n(x2sg),
lb3:_chase_v(epast), lb4:_some_q(x3),
lb5:_dog_n(x4sg)
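
One way to read "POS output as underspecification" is that every predicate name from the tagger should be equal to, or a less specific form of, the corresponding deep-grammar name (`_dog_n` versus `_dog_n_1`). The Python sketch below checks only that property. The helper `pred_subsumes` is hypothetical, and pairing predications by their `lb` labels is a simplification made for this example. Variable sorts (e.g. `esp` versus `epast`) are not modelled.

```python
def pred_subsumes(general: str, specific: str) -> bool:
    """True if `general` equals `specific` or is a prefix of it on '_' boundaries."""
    g = general.split("_")
    s = specific.split("_")
    return s[:len(g)] == g

# Predicate names from the DEEP and POS analyses on the slide, keyed by label.
deep = {"lb1": "_every_q", "lb2": "_cat_n", "lb3": "_chase_v",
        "lb4": "_some_q", "lb5": "_dog_n_1"}
pos = {"lb1": "_every_q", "lb2": "_cat_n", "lb3": "_chase_v",
       "lb4": "_some_q", "lb5": "_dog_n"}

# The POS analysis is compatible if each of its names subsumes the deep name.
compatible = all(pred_subsumes(pos[lb], deep[lb]) for lb in pos)
print(compatible)
```

The check is deliberately one-directional: `_dog_n` subsumes `_dog_n_1`, but not the reverse, so the deep analysis can add sense information without contradicting the shallow one.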

13
Semantics from RASP

RASP: robust, domain-independent, statistical
parsing (Briscoe and Carroll)
can't produce conventional semantics because no
subcategorization
can often identify arguments
S -> NP VP: NP supplies ARG1 for V
potential for partial identification
VP -> V NP
S -> NP S: NP might be ARG2 or ARG3

14
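Underspecification of arguments

[Figure: hierarchy of argument roles. ARGN is the most general role, ARG1or2 and ARG2or3 are partially specified, and ARG1, ARG2 and ARG3 are fully specified.]
RASP arguments can be specified as ARGN, ARG2or3,
etc. Also useful for Japanese deep parsing?
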
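
A minimal Python sketch of how such a role hierarchy could be used when combining analyses: each role name stands for the set of fully specified roles it covers, and two roles are combined by intersecting those sets. The table `ROLE_SETS` and the function `unify_roles` are hypothetical, and the hierarchy is restricted to the six roles named on this slide.

```python
# Each role is modelled as the set of fully specified roles it can resolve to.
ROLE_SETS = {
    "ARGN": frozenset({"ARG1", "ARG2", "ARG3"}),
    "ARG1or2": frozenset({"ARG1", "ARG2"}),
    "ARG2or3": frozenset({"ARG2", "ARG3"}),
    "ARG1": frozenset({"ARG1"}),
    "ARG2": frozenset({"ARG2"}),
    "ARG3": frozenset({"ARG3"}),
}
NAME_OF = {roles: name for name, roles in ROLE_SETS.items()}

def unify_roles(a: str, b: str):
    """Return the most general role consistent with both inputs, or None on a clash."""
    common = ROLE_SETS[a] & ROLE_SETS[b]
    # An empty intersection is not a key in NAME_OF, so clashes yield None.
    return NAME_OF.get(common)

print(unify_roles("ARG2or3", "ARG2"))     # shallow ARG2or3 refined by deep ARG2
print(unify_roles("ARG1or2", "ARG2or3"))  # two partial analyses narrow to ARG2
print(unify_roles("ARG1", "ARG2or3"))     # incompatible roles
```

Because combining roles only ever narrows the set, information accumulates monotonically: a shallow ARGN can later be refined by a deep ARG2 but never contradicted silently.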
15
RMRS construction

ERG etc.: uses MRS -> RMRS converter
(argument splitting etc.)
also RMRS -> MRS conversion
POS-RMRS: tag lexicon
RASP-RMRS: tag lexicon plus semantic rules
associated with RASP rules to match ERG
defaults when no rule RMRS is specified

16
RMRS composition with non-lexicalized grammars

MRS composition assumes a lexicalized approach:
algebra defined in Copestake, Lascarides and
Flickinger (2001)
RMRS with non-lexicalised grammars has similar
basic algebra
without lexical subcategorization, rely on
grammar rules to provide the ARGs
anchors rather than slots, to ground the ARGs
(single anchor for RASP)
developed on basis of semantic test suite
most rules written by Anna Ritchie

17
Some cat sleeps (in RASP)

h3,e, <h3>, h3:_sleep(e)
sleeps
h,x, <h1>, h1:_some(x), RSTR(h1,h2), h2:_cat(x)
some cat
S -> NP VP
Head: VP, ARG1(<VP anchor>,<NP hook.index>)
h3,e, <h3>, h3:_sleep(e), ARG1(h3,x),
h1:_some(x), RSTR(h1,h2), h2:_cat(x)
some cat sleeps
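
The composition step for "some cat sleeps" can be sketched in Python as follows. This is an illustrative toy, not the RASP-RMRS implementation: the `SemStruct` class and the `s_np_vp` function are hypothetical names, each structure carries a hook (label and index), a single anchor and a list of relations, and the relations are written with a colon after the label.

```python
from dataclasses import dataclass, field

@dataclass
class SemStruct:
    """Hook (label, index), a single anchor as for RASP, and a bag of relations."""
    label: str
    index: str
    anchor: str
    rels: list = field(default_factory=list)

def s_np_vp(np: SemStruct, vp: SemStruct) -> SemStruct:
    """S -> NP VP: the VP is the head; the rule adds ARG1(<VP anchor>, <NP hook.index>)."""
    new_arg = f"ARG1({vp.anchor},{np.index})"
    # Build a new list so the daughters are left unchanged (monotonic accumulation).
    return SemStruct(vp.label, vp.index, vp.anchor, vp.rels + [new_arg] + np.rels)

sleeps = SemStruct("h3", "e", "h3", ["h3:_sleep(e)"])
some_cat = SemStruct("h", "x", "h1", ["h1:_some(x)", "RSTR(h1,h2)", "h2:_cat(x)"])

sentence = s_np_vp(some_cat, sleeps)
print(sentence.label, sentence.index, sentence.anchor)
print(", ".join(sentence.rels))
```

Note that the ARG1 relation comes from the grammar rule rather than from a lexical entry for "sleeps", which is the point of using anchors when no subcategorization information is available.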

18
Real rule ...

<!--rule>
<name>S/np_vp</name>
<dtrs><dtr>NP</dtr><dtr>VP</dtr></dtrs>
<head>RULE</head>
<semstruct>
<hook><index>E</index><label>H1</label></hook>
<slots><noanchor/></slots>
<ep><gpred>PRPSTN_M_REL</gpred><label>H1</label><var>H2</var></ep>
<rarg><rargname>ARG1</rargname><label>H3</label><var>X</var></rarg>
<hcons hreln='qeq'><hi><var>H2</var></hi><lo><var>H</var></lo></hcons>
</semstruct>
<equalities><rv>X</rv><dh><dtr>NP</dtr><he>INDEX</he></dh></equalities>
<equalities><rv>H</rv><dh><dtr>VP</dtr><he>LABEL</he></dh></equalities>
<equalities><rv>H3</rv><dh><dtr>VP</dtr><he>ANCHOR</he></dh></equalities>
<equalities><rv>E</rv><dh><dtr>VP</dtr><he>INDEX</he></dh></equalities>
</rule-->
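
To make the rule easier to read, the sketch below parses it with Python's standard `xml.etree.ElementTree` and resolves the rule variables through the `equalities` elements. Two assumptions are made for the example: the rule is wrapped in a plain `<rule>` element (the slide shows it inside comment markers), and the element names are those that appear in the slide text. This is not the loader used by the RASP-RMRS code.

```python
import xml.etree.ElementTree as ET

# The rule from the slide, wrapped in a plain <rule> element (assumption).
RULE_XML = """
<rule>
<name>S/np_vp</name>
<dtrs><dtr>NP</dtr><dtr>VP</dtr></dtrs>
<head>RULE</head>
<semstruct>
<hook><index>E</index><label>H1</label></hook>
<slots><noanchor/></slots>
<ep><gpred>PRPSTN_M_REL</gpred><label>H1</label><var>H2</var></ep>
<rarg><rargname>ARG1</rargname><label>H3</label><var>X</var></rarg>
<hcons hreln='qeq'><hi><var>H2</var></hi><lo><var>H</var></lo></hcons>
</semstruct>
<equalities><rv>X</rv><dh><dtr>NP</dtr><he>INDEX</he></dh></equalities>
<equalities><rv>H</rv><dh><dtr>VP</dtr><he>LABEL</he></dh></equalities>
<equalities><rv>H3</rv><dh><dtr>VP</dtr><he>ANCHOR</he></dh></equalities>
<equalities><rv>E</rv><dh><dtr>VP</dtr><he>INDEX</he></dh></equalities>
</rule>
"""

root = ET.fromstring(RULE_XML.strip())
daughters = [d.text for d in root.find("dtrs")]

# Map each rule variable to the (daughter, hook element) it is equated with.
bindings = {
    eq.findtext("rv"): (eq.findtext("dh/dtr"), eq.findtext("dh/he"))
    for eq in root.findall("equalities")
}

# Resolve the argument relation the rule contributes.
rarg = root.find("semstruct/rarg")
role = rarg.findtext("rargname")
resolved = f"{role}({'.'.join(bindings[rarg.findtext('label')])}, " \
           f"{'.'.join(bindings[rarg.findtext('var')])})"
print(daughters, resolved)
```

Read this way, the rule says the same thing as the simplified version on the previous slide: the ARG1 relation links the VP's anchor to the NP's index.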

19
ERG-RMRS / RASP-RMRS

20
Inchoative

21
Infinitival subject (unbound in RASP-RMRS)

22
Ditransitive: missing ARG3

23
Mismatch: expletive "it"

24
Mismatch: larger numbers

25
Comments on RASP-RMRS

Fast enough (not significant compared to RASP
processing time because no ambiguity)
Too many RASP rules! Need to generalise over
classes.
Requires SEM-I: API for MRS/RMRS from deep
grammar
RASP and ERG may change:
compatible test suites, semi-automatic rule
update?
alternative technique for composition?
Parse selection: need to generalise over RMRSs
weighted intersections of RMRSs (cf. RASP
grammatical relations)

26
SEM-I: semantic interface

Meta-level: manually specified grammar
relations (constructions and closed-class)
Object-level: linked to lexical database for deep
grammars
Object-level SEM-I auto-generated from expanded
lexical entries in deep grammars (because type
can contribute relations)
Validation of other lexicons
Need closed class items for RMRS construction
from shallow processing

27
Alignment and XML

Comparing RMRSs for the same text efficiently uses
characterization:
labels RMRSs according to their source in the
text
currently characters, but byte offset? Japanese
etc.?
RMRS-XML
RMRS seen as levels of mark-up: standoff
annotation
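
A small Python sketch of why characterization makes comparison cheap: if every predication records the character span it came from, predications from two processors can be paired by a dictionary lookup on the span. The function `align_by_span`, the tuple layout and the example sentence are all hypothetical and chosen only for illustration.

```python
def align_by_span(deep_eps, shallow_eps):
    """Pair predications from two analyses that cover the same character span."""
    by_span = {}
    for pred, span in shallow_eps:
        by_span.setdefault(span, []).append(pred)
    return [(pred, other, span)
            for pred, span in deep_eps
            for other in by_span.get(span, [])]

text = "every cat chased some dog"

# (predicate, (start, end)) with character offsets into `text`.
deep = [("_every_q", (0, 5)), ("_cat_n", (6, 9)), ("_chase_v", (10, 16)),
        ("_some_q", (17, 21)), ("_dog_n_1", (22, 25))]
shallow = [("_every_q", (0, 5)), ("_cat_n", (6, 9)), ("_chase_v", (10, 16)),
           ("_some_q", (17, 21)), ("_dog_n", (22, 25))]

pairs = align_by_span(deep, shallow)
for deep_pred, shallow_pred, (start, end) in pairs:
    print(text[start:end], deep_pred, shallow_pred)
```

The sketch uses character offsets, as the slide says is current practice. Switching to byte offsets would only change how the spans are computed, not the alignment step.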

28
SciBorg Chemistry texts

eScience project starting in October at Cambridge
Computer Laboratory (Copestake, Teufel),
Chemistry (Murray-Rust), CeSC (Parker)
Aims
Develop an NL markup language which will act as a
platform for extraction of information. Link to
semantic web languages.
Develop IE technology and core ontologies for use
by publishers, researchers, readers, vendors and
regulatory organisations.
Model scientific argumentation and citation
purpose in order to support novel modes of
information access.
Demonstrate the applicability of this
infrastructure in a real-world eScience
environment.

29
Research markup

Chemistry: The primary aims of the present study
are (i) the synthesis of an amino acid derivative
that can be incorporated into proteins /via/
standard solid-phase synthesis methods, and (ii)
a test of the ability of the derivative to
function as a photoswitch in a biological
environment.
Computational Linguistics: The goal of the work
reported here is to develop a method that can
automatically refine the Hidden Markov Models to
produce a more accurate language model.

30
RMRS and research markup

Specify cues in RMRS
Deep process cues: feasible because
domain-independent
more general and reliable than shallow techniques
allows for complex interrelationships
Use zones for advanced citation maps and other
enhancements to repositories

31
Conclusions

RMRS: semantic representation language allowing
linking of deep and shallower processors
RMRS construction: phrase-level compatibility
between processors
Many potential applications
