Text Analysis - PowerPoint PPT Presentation


Title: Text Analysis


1
Text Analysis
  • AQUAINT Phase I Final Workshop
  • Washington, DC
  • Sergei Nirenburg
  • ILIT/UMBC

2
Iran, Iraq and North Korea on Wednesday rejected
an accusation by President Bush that they are
developing weapons of mass destruction.
3
(No Transcript)
4
(No Transcript)
5
(No Transcript)
6
(No Transcript)
7
(No Transcript)
8
Overall goal: automatically deriving text meaning
representations that can serve as the basis of
reasoning at all the stages of the QA process:
question understanding, answer determination,
answer formulation.
Subgoal 1: do not, if at all possible, use
uninterpreted text strings as the basis for
deriving an answer; non-factoid questions can be
processed using interpreted knowledge.
Subgoal 2: use a constructed semantic
metalanguage that enables ambiguity resolution
and modeling of the user's (analyst's) background
knowledge.
9
An overall methodological choice: use a broad
inventory of methods and resources to improve
and enhance the heuristic algorithms for
deriving the various components of the
TMR. Wherever comparison of uninterpreted
strings does the job, use it, because this method
is at this point less expensive.
10
  • Two main lines of R&D within the
    ontological-semantic text analyzer:
  • Coverage: work toward raising the percentage of
    inputs that the analyzer processes according
    to overtly recorded expectations
  • further development of ontology, lexicons and
    heuristic algorithms
  • Robustness: work on methods for recovering from
    errors (some of the causes of errors are
    deliberately built in for efficiency
    purposes)
  • Constraint relaxation and tightening;
    probabilistic heuristics

11
  • Resources we started with 13 months ago:
  • an ontology of about 6,500 concepts (90,000
    statements) developed for a machine translation
    system (issue: iconic rather than explanatory
    description)
  • an English semantic lexicon of about 26,000 word
    senses developed for an MT system and to
    support an IR/IE system (issue: grain size
    of description too coarse)
  • an English grammar of average coverage
  • a basic semantic analyzer that supported mostly
    static constraint satisfaction processing
  • experience in developing static and dynamic
    knowledge resources for semantic processing

12
  • In the MOQA project, we have been developing a
    new semantic lexicon for English (> 11,000 word
    senses at this point) that reflects the need for
    finer grain-size descriptions for QA. The
    lexicon covers most of the closed-class lexical
    units, light verbs, most common main verbs: in
    short, the more complex elements.
  • To support more sophisticated contextual
    processing, the lexicon was augmented with
    procedural attachments (meaning procedures)
    such as find-anchor-time, combine-time,
    fix-case-role, specify-approximation,
    trigger-reference, etc.
  • We have been developing semantic lexicons of
    Arabic and Persian on the basis of the English
    lexicon.
  • We have been modifying the ontology (2,200
    concepts added, deleted or modified at this
    point).
  • We have been enhancing the analysis processing
    modules -- the preprocessor, the syntactic and
    the semantic analyzer, in part, by adding

13
The basic semantic dependency (representation
of who did what to whom) is derived, in the
general case, on the basis of a) lexical-semantic
expectations (selectional restrictions) recorded
in the ontology and the lexicon and b) syntactic
dependency.
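
As a rough illustration of how these two sources could combine, the sketch below checks each syntactic argument against a case-role expectation using a toy concept hierarchy. Everything in it is an assumption made for illustration: the concept names, the `DEVELOP` frame, the `SYN_TO_ROLE` mapping and the function names are hypothetical and are not taken from the actual ontology, lexicon or analyzer.

```python
# Toy ontology: each concept points to its parent (hypothetical names).
IS_A = {
    "HUMAN": "ANIMAL",
    "ANIMAL": "OBJECT",
    "NATION": "OBJECT",
    "WEAPON": "ARTIFACT",
    "ARTIFACT": "OBJECT",
    "OBJECT": None,
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or descends from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


# Hypothetical selectional restrictions for one event concept.
DEVELOP = {"agent": "HUMAN", "theme": "ARTIFACT"}

# Assumed mapping from syntactic slots to case roles.
SYN_TO_ROLE = {"subj": "agent", "obj": "theme"}


def basic_dependency(event_frame, syntactic_args):
    """Assign case roles to syntactic arguments that satisfy the restrictions."""
    result = {}
    for slot, concept in syntactic_args.items():
        role = SYN_TO_ROLE[slot]
        if is_a(concept, event_frame[role]):
            result[role] = concept
    return result


print(basic_dependency(DEVELOP, {"subj": "HUMAN", "obj": "WEAPON"}))
print(basic_dependency(DEVELOP, {"subj": "NATION", "obj": "WEAPON"}))
```

In the second call the subject fails the strict restriction and is simply dropped, which is the kind of outcome that the robustness work on constraint relaxation is meant to handle.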
14
In addition to the basic semantic dependency,
basic TMRs also include parameterized
information provided by the microtheories of
aspect, modality (including speaker attitudes),
time, style and others. Many of these
microtheories have been implemented, but all would
benefit from further work. There is also a
possibility of borrowing some microtheories.
15
Iran, Iraq and North Korea on Wednesday rejected
an accusation by President Bush that they may be
developing weapons of mass destruction.
16
MODALITY-34
  MODALITY-TYPE    EPISTEMIC
  MODALITY-SCOPE   DEVELOP-1224
  ATTRIBUTED-TO    HUMAN-15691
  MODALITY-VALUE   < 1
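
One possible way to hold such a frame in code is sketched below. The class, its field names and the `allows` helper are hypothetical, and the sketch assumes that a modality value is a number compared against a bound (here "less than 1"); it is not the system's own representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    instance: str        # frame name, e.g. "MODALITY-34"
    modality_type: str   # e.g. "EPISTEMIC"
    scope: str           # the event instance the modality applies to
    attributed_to: str   # whose attitude is being recorded
    value_op: str        # comparison operator: "<", ">" or "="
    value_bound: float   # bound that the operator compares against

    def allows(self, x: float) -> bool:
        """Return True if the value x is compatible with the recorded constraint."""
        if self.value_op == "<":
            return x < self.value_bound
        if self.value_op == ">":
            return x > self.value_bound
        return x == self.value_bound


# The frame from the slide, transcribed field by field.
m = Modality("MODALITY-34", "EPISTEMIC", "DEVELOP-1224", "HUMAN-15691", "<", 1.0)
print(m.allows(0.7), m.allows(1.0))
```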
17
try-v3
  syn-struc
    root   try   cat v
    subj   root var1   cat n
    xcomp  root var2   cat v   form OR infinitive gerund
  sem-struc
    set-1      element-type refsem-1   cardinality > 1
    refsem-1   sem event   agent var1   effect refsem-2
    modality   modality-type epiteuctic
               modality-scope refsem-2
               modality-value < 1
    refsem-2   value var2   sem event
18
The ideal case: 1 TMR
[Diagram] Processing modules: Preprocessor,
Syntactic Analyzer, Semantic Analyzer
Static Knowledge Resources: Grammar (Ecology,
Morphology, Syntax); Lexicon and Onomasticon;
Ontology and Fact Repository
19
Work on robustness is essential because the above
expectation will be violated regularly
20
(No Transcript)
21
Concerns of robustness are also included in the
work on coverage. Thus, selectional restrictions
in our system are multivalued and allow for a
finer-grain determination of preference in
analysis.
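
A minimal sketch of what a multivalued restriction could look like follows. The three levels ("default", "sem", "relaxable-to"), the numeric scores and the concept names are assumptions made for illustration only; the slides do not specify how the values are named or weighted.

```python
# Toy concept hierarchy (hypothetical names).
IS_A = {
    "HUMAN": "ANIMAL",
    "ANIMAL": "OBJECT",
    "NATION": "ORGANIZATION",
    "ORGANIZATION": "OBJECT",
    "OBJECT": None,
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or descends from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


# A multivalued restriction on one case role, ordered from the most
# preferred to the most relaxed level (levels and scores are assumed).
AGENT_RESTRICTION = [
    ("default", "HUMAN", 1.0),
    ("sem", "ANIMAL", 0.8),
    ("relaxable-to", "ORGANIZATION", 0.5),
]


def preference(concept, restriction):
    """Return (level, score) for the first level the concept satisfies."""
    for level, required, score in restriction:
        if is_a(concept, required):
            return level, score
    return None, 0.0


print(preference("HUMAN", AGENT_RESTRICTION))
print(preference("NATION", AGENT_RESTRICTION))
print(preference("OBJECT", AGENT_RESTRICTION))
```

A graded score of this kind lets the analyzer rank competing readings instead of rejecting every filler that misses the strictest constraint.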
22
Basic coverage-oriented process extensions under
way:
Microtheories: quantifiers, temporal
expression, pleonastic it, definite descriptions,
light verbs, reference
Meaning procedures (some microtheories are
implemented using meaning procedures):
seek-specification (make a cake),
fix-case-role (likely), specify-approximation,
trigger-reference, find-anchor-time, combine-time
(ago), resolve-semantic-ellipsis (prefer)
23
  • First year lessons learned
  • Need to lower expectations from and reliance on
    syntax
  • Need to accept and utilize analysis of sentence
    fragments, not only completely analyzed
    sentences
  • Need to output n-best candidate results, not to
    work extra to select the best one at each
    level
  • Quantitative evaluation must be carried out;
    preparation is under way; results to be
    presented in December
  • While robustness-type work is essential for
    immediate utility, coverage-type work toward
    improving original quality of analysis should
    not be neglected.
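
The n-best lesson above can be sketched as a small beam that carries several scored candidates from one level to the next instead of committing to the locally best one. The stage functions, scores and names here are hypothetical toys, not the analyzer's modules.

```python
import heapq


def n_best(candidates, n):
    """Keep the n highest-scoring (score, analysis) pairs."""
    return heapq.nlargest(n, candidates, key=lambda c: c[0])


def run_pipeline(text, stages, n=3):
    """Pass up to n scored candidates through each stage in turn."""
    beam = [(1.0, text)]
    for stage in stages:
        expanded = []
        for score, analysis in beam:
            for stage_score, new_analysis in stage(analysis):
                expanded.append((score * stage_score, new_analysis))
        beam = n_best(expanded, n)
    return beam


# Toy stages: each returns a list of (score, analysis) alternatives.
def toy_syntax(text):
    return [(0.6, text + "/parseA"), (0.4, text + "/parseB")]


def toy_semantics(parse):
    if parse.endswith("parseA"):
        return [(0.5, parse + "/tmrA")]
    return [(0.9, parse + "/tmrB")]


greedy = run_pipeline("input", [toy_syntax, toy_semantics], n=1)
wider = run_pipeline("input", [toy_syntax, toy_semantics], n=2)
print(greedy[0][1])
print(wider[0][1])
```

With `n=1` the second-ranked parse is discarded before semantics can show that it leads to the better overall reading; with `n=2` it survives and ends up on top.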