Stuff to Add

About This Presentation

Description:

Lucian Lita. Jeongwoo Ko. Scott Judy. Frank Lin. Curtis Huttenhower. Jeffrey Micher. Kevyn Collins ... Create a general QA planning system. How should a QA ...

Slides: 50

Provided by: ericn1

Transcript and Presenter's Notes

1
Language Technologies Institute
Carnegie Mellon University
AQUAINT 24-Month Workshop
December 2, 2003
2
JAVELIN Team
3
Outline

Goals of the Research
Overall Accomplishments
Summary of the Approach
Testing and Evaluation
Remaining Challenges
Future Work

4
Goals of the Research

QA as Planning
Create a general QA planning system
How should a QA system represent its chain of
reasoning?
QA and Auditability
How can we improve a QA system's ability to
justify its steps?
How can we make QA systems open to machine
learning?

5
Goals 2

Utility-Based Information Fusion
Perceived utility is a function of many different
factors
Create and tune utility metrics
Architectures for QA
Distributed development
Modular integration (mix & match components)
Loose integration via shared data standards
Impact of multilingual QA on design
Long-term memory

6
Overall Accomplishments

First end-to-end QA project at CMU
Completed system operational w/limited question
type coverage
Established distributed modular architecture for
plug-and-play of individual QA components
Established automatic testing framework
Created a long-term repository of questions,
answers and intermediate results (documents,
passages, etc.)
Created a web-based browser for answer
justification structures
Multi-strategy approach with dynamic planning

7
Overall Accomplishments 2

Implemented several approaches for answer
extraction (pattern-based, statistical, and
NLP-based)
Participated in TREC 11 & 12 QA track evaluations
Co-organized (with IBM) and participated in
Relationship Pilot evaluation
Preliminary system delivered to MITRE testbed
Extended architecture for multilingual document
bases (includes incorporation of Japanese and
Chinese text processing tools)
Graphical user interface with support for user
clarification

8
Summary of the Approach

Basic Steps (Modules) in the QA Process
Question Analyzer (QA)
Retrieval Strategist (RS)
Information Extractor (IX)
Answer Generator (AG)
Overall Architecture / Integration
Graphical User Interface (GUI)
Planner
Execution Manager (EM)
Repository
Answer Justification (AJ)
Extensions to the Architecture
NLP for Information Extraction
Multilingual document sources

9
Question Analyzer (QA)
Question input (XML format)
WordNet, KANTOO Lexicon
Brill Tagger, BBN IdentiFinder, KANTOO Lexifier
Tokenizer, Token information extraction, Phrase
Chunking

Combines pattern-matching and NLP
Request Object
Question, question type, answer type
Keywords, alternate forms (abbrevs, translations)
Syntactic analysis (f-structure)
Semantic analysis (logical representation)

Annotated Token List
Type Taxonomies Type-Specific Constraints
KANTOO grammars
Parser
F-structure
Request Object Builder
Extraction Patterns Heuristics
Request Object (XML format)
10
Question Analyzer Performance on TREC 11 Questions
11
Retrieval Strategist (RS)

Input
Keyword list (from Request Object)
Max. number of documents
Collection(s) to search
Each keyword is assigned a priority (1-5)
Likelihood that a keyword will appear in an
answer passage
Start with highly constrained search
All keywords, in close proximity
Iterate while more documents needed
Retrieve documents
Relax query by one step (up to 15 steps)
relax keywords, proximity window
Hybrid approach: start with structured queries,
switch to tf.idf (combination works better than
either alone)
Output: ranked document result list
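
The relaxation loop on this slide can be sketched roughly as follows. This is a minimal illustration and not the JAVELIN implementation: the `Query` class, the `relax` policy (widen the proximity window first, then drop the lowest-priority keyword), the assumption that a higher priority number means a more important keyword, and the `search` callback are all hypothetical. The switch to tf.idf is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Query:
    keywords: Tuple[str, ...]  # keywords that must all match
    window: int                # proximity window, in tokens


def relax(query: Query, priorities: Dict[str, int], max_window: int = 200) -> Query:
    """One relaxation step (assumed policy): widen the window, then drop keywords."""
    if query.window < max_window:
        return Query(query.keywords, min(query.window * 2, max_window))
    if len(query.keywords) > 1:
        # Assumption: the lowest priority value marks the least useful keyword.
        weakest = min(query.keywords, key=lambda k: priorities[k])
        return Query(tuple(k for k in query.keywords if k != weakest), query.window)
    return query  # nothing left to relax


def retrieve(
    priorities: Dict[str, int],
    search: Callable[[Query], Iterable[str]],
    max_docs: int,
    max_steps: int = 15,
    start_window: int = 10,
) -> List[str]:
    """Start with the most constrained query and relax until enough documents are found."""
    query = Query(tuple(priorities), start_window)
    results: List[str] = []
    for _ in range(max_steps + 1):  # initial query plus up to max_steps relaxations
        for doc_id in search(query):
            if doc_id not in results:
                results.append(doc_id)
        if len(results) >= max_docs:
            break
        relaxed = relax(query, priorities)
        if relaxed == query:
            break
        query = relaxed
    return results[:max_docs]
```

Documents found by tighter queries stay at the front of the list, so the output order reflects how constrained the matching query was.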

12
Retrieval Strategist Current Work

Retrieval based on Lemur 2.0 toolkit
Multiple retrieval models, very flexible
RS previously used Inquery
Uses structured query support from UMass
Extending for use with Chinese, Japanese
Distributed search (via Lemur)
Support for querying multiple QA resources
CORI collection selection algorithm

13
Information Extractor (IX)

Input
Question (Request Object from QA)
Set of relevant documents (from RS)
Output
Set of potentially useful extracted answers
Corresponding passages
Confidence scores
Role in JAVELIN: extract candidate answers and
passages from documents

14
Information Extractor Features

Self-contained algorithms that score passages in
different ways
Example Simple Features
Keywords present
Normalized window size
Average distance
Verbs encompassed Answer,Main Verb
Proper nouns phrases present
Example Pattern Features
cN .. cV .. in/on date
date, iN .. cV ..
Any procedure that returns a numeric value is a
valid feature
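
Since any procedure that returns a numeric value can serve as a feature, three of the simple features listed above might look like the sketch below. The exact definitions are assumptions made for illustration (the slides do not define them), and the answer is treated as a single token for simplicity.

```python
from typing import Callable, List

# A feature maps (question keywords, passage tokens, candidate answer) to a number.
Feature = Callable[[List[str], List[str], str], float]


def keywords_present(keywords: List[str], passage: List[str], answer: str) -> float:
    """Fraction of question keywords that occur in the passage."""
    if not keywords:
        return 0.0
    tokens = set(passage)
    return sum(1 for k in keywords if k in tokens) / len(keywords)


def normalized_window_size(keywords: List[str], passage: List[str], answer: str) -> float:
    """Span covering all keyword and answer occurrences, divided by passage length."""
    targets = set(keywords) | {answer}
    positions = [i for i, tok in enumerate(passage) if tok in targets]
    if not positions:
        return 1.0
    return (positions[-1] - positions[0] + 1) / len(passage)


def average_distance(keywords: List[str], passage: List[str], answer: str) -> float:
    """Mean token distance between the answer and each keyword occurrence."""
    if answer not in passage:
        return float(len(passage))
    answer_pos = passage.index(answer)
    wanted = set(keywords)
    dists = [abs(i - answer_pos) for i, tok in enumerate(passage) if tok in wanted]
    return sum(dists) / len(dists) if dists else float(len(passage))


def feature_vector(features: List[Feature], keywords: List[str],
                   passage: List[str], answer: str) -> List[float]:
    """Apply every feature procedure to the same (question, passage, answer) triple."""
    return [f(keywords, passage, answer) for f in features]
```

For example, `feature_vector([keywords_present, average_distance], ["bile", "produced"], "bile is produced in the liver".split(), "liver")` yields one number per feature.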

15
Answer Confidence Learning

Supervised learning
Model the probability of correctness given a
question q, a passage p, and an answer a from the
passage
$p(c \mid q, a, p) \approx \mathrm{Model}\big(f_1(q,a,p),\, f_2(q,a,p),\, \ldots,\, f_n(q,a,p)\big)$
where $f_i$ are features computed from q, a, and
p, and c denotes correctness of the answer
Supervised models
K-Nearest Neighbors (KNN)
Decision Tree (DT)
Support Vector Machine (SVM)
Finite State Transducers (FST)
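
As one illustration of the supervised models listed above, a K-Nearest Neighbors confidence estimate can be sketched in a few lines. This is a generic KNN over feature vectors, not the JAVELIN extractor; the training examples and the use of Euclidean distance are assumptions.

```python
import math
from typing import List, Sequence, Tuple

# One training example: (feature vector, whether the answer was judged correct).
Example = Tuple[Sequence[float], bool]


def knn_confidence(train: List[Example], x: Sequence[float], k: int = 3) -> float:
    """Estimate p(correct | features) as the fraction of correct answers
    among the k training examples closest to x (Euclidean distance)."""
    if not train:
        raise ValueError("training set must not be empty")
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], x))[:k]
    return sum(1 for _, correct in nearest if correct) / len(nearest)
```

The returned fraction can be used directly as the confidence score attached to an extracted answer.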

16
Information Extractor Steps

Filter passages
Match answer type?
Contain sufficient keywords?
Create variations on passages
POS tagging (Brill)
Cleansing (punctuation, tags, etc.)
Expand contractions
Reduce surface forms to lexemes
Calculate feature values
A classifier scores the passages, which are
output with confidence scores
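
The filtering and passage-variation steps could be sketched as below. The contraction table, the regular expressions, and the `min_hits` threshold are illustrative assumptions; POS tagging and reduction to lexemes are omitted because they depend on external tools.

```python
import re
from typing import Iterable

# Small illustrative table; a real system would need a much larger one.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is", "isn't": "is not"}
_CONTRACTION_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(c) for c in CONTRACTIONS) + r")\b", re.IGNORECASE
)


def cleanse(text: str) -> str:
    """Strip markup tags and punctuation (apostrophes are kept), collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"[^\w\s']", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def expand_contractions(text: str) -> str:
    """Replace known contractions with their expansions (output is lower-cased for those words)."""
    return _CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def has_enough_keywords(passage: str, keywords: Iterable[str], min_hits: int = 2) -> bool:
    """Filter step: keep a passage only if it contains at least min_hits keywords."""
    tokens = set(cleanse(passage).lower().split())
    return sum(1 for k in keywords if k.lower() in tokens) >= min_hits
```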

17
Answer Generator (AG)

Input: answer candidates, source passages
Output: ranked answers, or requests for more
information passed back to Planner
Not enough answer candidates
Can't distinguish answer candidates
Main tasks
Combination of different sorts of evidence for
answer verification.
Detection and combination of similar answer
candidates to address answer granularity.
Answer type checking to filter out improper
answers.
Generation of answers in required format.
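
Detecting and combining similar answer candidates could look like the sketch below. The slides do not say how evidence is combined; the noisy-OR rule used here, and the case- and whitespace-insensitive matching, are assumptions chosen only to make the idea concrete.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Candidate = Tuple[str, float]  # (answer text, confidence in [0, 1])


def merge_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Group candidates whose normalized text matches and combine their scores."""
    groups: Dict[str, List[Candidate]] = defaultdict(list)
    for text, score in candidates:
        key = " ".join(text.lower().split())  # assumed notion of "similar"
        groups[key].append((text, score))

    merged: List[Candidate] = []
    for members in groups.values():
        miss = 1.0
        for _, score in members:
            miss *= 1.0 - score  # noisy-OR: probability that every source is wrong
        best_text = max(members, key=lambda m: m[1])[0]
        merged.append((best_text, 1.0 - miss))
    return sorted(merged, key=lambda m: m[1], reverse=True)
```

With this rule, two independent extractions of the same answer rank above a single extraction with a slightly higher individual score.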

18
Answer Normalization

Request Filler/Answer Generator aware of NE
types: dates, times, people names, company names,
locations, currency expressions.
"April 14th, 1912", "14th of April 1912", "14
April 1912": instances of the same date, but different
strings.
For date expressions, normalization is performed to
ISO 8601 (YYYY-MM-DD) in the Answer Generator.
"summer", "last year", etc. remain as strings.
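
A minimal sketch of this kind of date normalization, covering only the three surface forms quoted above. The patterns are assumptions for illustration; anything they do not recognize is returned unchanged.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}
_MONTH = "|".join(MONTHS)
_PATTERNS = [
    # "April 14th, 1912"
    re.compile(rf"^(?P<month>{_MONTH})\s+(?P<day>\d{{1,2}})(?:st|nd|rd|th)?,?\s+(?P<year>\d{{4}})$", re.I),
    # "14th of April 1912" and "14 April 1912"
    re.compile(rf"^(?P<day>\d{{1,2}})(?:st|nd|rd|th)?\s+(?:of\s+)?(?P<month>{_MONTH}),?\s+(?P<year>\d{{4}})$", re.I),
]


def normalize_date(text: str) -> str:
    """Return an ISO 8601 date (YYYY-MM-DD) if text is a recognized full date, else text itself."""
    for pattern in _PATTERNS:
        match = pattern.match(text.strip())
        if match:
            try:
                parsed = date(int(match["year"]), MONTHS[match["month"].lower()], int(match["day"]))
            except ValueError:  # e.g. "31 April 1912"
                return text
            return parsed.isoformat()
    return text
```

Once normalized, the three variants compare equal as strings, so they can be counted as one answer.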

19
Answer Type Checking

Motivation
Errors in earlier modules or ambiguous
information in the document can generate improper
answer candidates.
Not all answer candidates from IX are
plausible answers.
Validate answer candidates by checking how
adequate each answer is with respect to the
answer type.
Current approaches
Use WordNet
Use Gazetteer for location questions
Use Google for object questions
Use internal patterns for numeric and date
questions
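
For the pattern-based case (numeric and date questions), a type check might be sketched as follows. The patterns and the answer-type names are hypothetical; the WordNet, gazetteer, and Google checks are not shown, and types without a pattern are passed through unfiltered.

```python
import re
from typing import List

# Hypothetical internal patterns keyed by answer type.
CHECKERS = {
    "numeric": re.compile(r"[-+]?\d{1,3}(?:,\d{3})*(?:\.\d+)?|[-+]?\d+(?:\.\d+)?"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}|\d{4}"),
}


def passes_type_check(candidate: str, answer_type: str) -> bool:
    """True if the candidate fits the pattern for its answer type (or no pattern exists)."""
    checker = CHECKERS.get(answer_type)
    if checker is None:
        return True  # no pattern for this type: do not filter here
    return checker.fullmatch(candidate.strip()) is not None


def filter_candidates(candidates: List[str], answer_type: str) -> List[str]:
    """Drop candidates that are improper for the expected answer type."""
    return [c for c in candidates if passes_type_check(c, answer_type)]
```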

20
System Architecture
Answer Justification
Web Browser
Domain Model
Data Repository
process history data
JAVELIN operator (action) models
Question Analyzer
Planner
JAVELIN GUI
Execution Manager
FST Extractor
Retrieval Strategist
KNN Extractor
Light Extractor
Information Extractors
...
SVM Extractor
Answer Generator
NLP Extractor
21
Graphical User Interface (GUI)
22
GUI/Planner Interaction
GUI
Planner

QUESTION XML containing
question text
planner settings
PAUSE
RESUME
QUIT (end session)
STOP (abort question)

ANSWER XML containing
answers in rank order
confidence scores
repository IDs
OK
ERROR description

GUI-Initiated

DIALOG XML containing
type of dialog (yes/no, multiple choice, text)
question to ask user
default response
choices to display (when applicable)

RESPONSE text containing
yes or no
text of selected choice
reply text

Planner-Initiated
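
A planner-initiated DIALOG message with the fields listed above could be built and parsed as in this sketch. The element and attribute names (`dialog`, `type`, `question`, `default`, `choice`) are invented for illustration; the slides list the message contents but not the actual JAVELIN XML schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Sequence

DIALOG_TYPES = {"yes/no", "multiple choice", "text"}


def build_dialog(kind: str, question: str, default: str, choices: Sequence[str] = ()) -> str:
    """Serialize a DIALOG request (hypothetical schema) to an XML string."""
    if kind not in DIALOG_TYPES:
        raise ValueError(f"unknown dialog type: {kind}")
    root = ET.Element("dialog", {"type": kind})
    ET.SubElement(root, "question").text = question
    ET.SubElement(root, "default").text = default
    for choice in choices:
        ET.SubElement(root, "choice").text = choice
    return ET.tostring(root, encoding="unicode")


def parse_dialog(xml_text: str) -> Dict[str, object]:
    """Recover the dialog fields on the GUI side."""
    root = ET.fromstring(xml_text)
    return {
        "type": root.get("type"),
        "question": root.findtext("question"),
        "default": root.findtext("default"),
        "choices": [c.text for c in root.findall("choice")],
    }
```

The GUI would reply with plain RESPONSE text: yes or no, the text of the selected choice, or free reply text.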
23
Motivation for Planning

Enable run-time generation of new
question-answering strategies
Improve ability to recover from bad decisions as
information is collected
Gain insight into when different QA components
are most useful

24
JAVELIN Planning Approach

Reasoning at a level above syntactic and lexical
details of individual requests
QA process steps map to planning domain operators
Information consumed/produced by the system maps to
planning state
Explicit models of state and action uncertainty
Utility-based forward-chaining planning algorithm
Choose actions with maximum expected utility of
information
Interleave planning and execution control of
JAVELIN QA components to manage information
uncertainty
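
Choosing the action with maximum expected utility can be illustrated with the sketch below. The outcome model, the action names, and every number in the usage note are made up for the example; the slides do not give JAVELIN's operator models or utility metrics.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Outcome:
    probability: float  # chance that this outcome occurs if the action is executed
    utility: float      # value of the information gained, minus cost


def expected_utility(outcomes: List[Outcome]) -> float:
    """Probability-weighted sum of outcome utilities."""
    return sum(o.probability * o.utility for o in outcomes)


def choose_action(actions: Dict[str, List[Outcome]]) -> str:
    """Pick the applicable action whose modeled outcomes have the highest expected utility."""
    if not actions:
        raise ValueError("no applicable actions")
    return max(actions, key=lambda name: expected_utility(actions[name]))
```

For instance, with `{"svm_extractor": [Outcome(0.6, 1.0), Outcome(0.4, -0.1)], "fst_extractor": [Outcome(0.3, 1.0), Outcome(0.7, -0.1)]}`, `choose_action` returns `"svm_extractor"`. In an interleaved scheme the planner would execute that action, update the state with what was actually produced, and choose again.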

25
System Architecture
Domain Model
Data Repository
Planner
JAVELIN GUI
Execution Manager
S0
Algorithm runs until goal is satisfied or failure
conditions are met
...
26
Role of the Execution Manager

Coordinates communication between Planner and
other question-answering components
Supports session architecture by storing all
planning steps and processing data in the
Repository
Simplifies integration of new modules
Provides centralized Repository access
Authenticates users for GUI
Runs batch end-to-end pipeline system tests

27
Sequence with Interactivity Enabled
Q: Where is bile produced?
A: 1. liver (0.99175) 2. tube (0.83664) 3. doctors
(0.81202) 4. operation (0.81031) 5. Guangdong
Province (0.78025) (136 additional answers)
28
Sequence with Interactivity Enabled
In comparison with non-interactive mode:
Q: Where is bile produced?
A: 1. China (0.96944) 2. Moscow (0.75011) 3. Cambridge
(0.75011) 4. Guangdong Province (0.60531) 5.
Chinese (0.49776) (4 additional answers)
300 DS5597 300 RO6180 DS5597 300 FS15985 SVM 300 Q17262 RANKED
time 51 sec
and intermediate results produced by the
interactive mode...
FS15952, AL5445 (SVM): 1. drug (0.73359) 2. liver
(0.6497) 3. acid sequestrants (0.49766) 4.
LDL-cholesterol (0.47154) 5. rheumatoid arthritis
(0.47154) (12 additional answers)
FS15935, FS15957 (FST): No answer found
FS15939, AL5440 (SVM): (same as non-interactive mode above)
FS15962, AL5446 (Light): 1. Moscow (0.25) 2.
Cambridge (0.25) 3. Dallas (0.01282) 4. China
(0.01259)
29
Javelin Repository

The repository stores all the decisions made by
the Planner and information produced by the
modules in a persistent database
Permits a detailed trace of the system's
operation (a move toward answer justification)
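
A persistent trace of this kind could be sketched with SQLite as below. The single `trace` table and its columns are assumptions for illustration and do not reflect the actual JAVELIN repository schema.

```python
import sqlite3
from typing import List, Tuple


def open_repository(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a repository with one hypothetical trace table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS trace (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT NOT NULL,
               module TEXT NOT NULL,   -- e.g. Planner, RS, IX, AG
               kind TEXT NOT NULL,     -- e.g. decision, documents, answers
               payload TEXT NOT NULL   -- serialized object, e.g. XML
           )"""
    )
    return conn


def record(conn: sqlite3.Connection, session_id: str, module: str, kind: str, payload: str) -> int:
    """Store one planner decision or module result and return its repository ID."""
    cur = conn.execute(
        "INSERT INTO trace (session_id, module, kind, payload) VALUES (?, ?, ?, ?)",
        (session_id, module, kind, payload),
    )
    conn.commit()
    return cur.lastrowid


def trace(conn: sqlite3.Connection, session_id: str) -> List[Tuple[str, str, str]]:
    """Replay everything recorded for a session, in the order it happened."""
    rows = conn.execute(
        "SELECT module, kind, payload FROM trace WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return rows.fetchall()
```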

30
Repository ERD
Request Object
Planner Objects
Answer Objects
31
(No Transcript)
32
Adding Shallow Semantics to JAVELIN NLP IX

Answer extraction module that makes use of
natural language processing capabilities
Currently depends on shallow, broad-coverage
parsing
Similarity-based unification strategy
Incorporates a general framework for text
processing plug-ins

33
Basic Idea
partial interpretation
Unification on simple predicates representing
basic argument structure will provide a
more accurate way to match questions with
appropriate answer(s)
Two Challenges: Where do predicates come
from? Flexibility in interpretation
34
Comparing LR for Question and Answer Passages
35
Text Processor (TP)

Complex question analysis requires many types of
language processing
Simple: Tokenization, POS tagging
Harder: Synonym expansion, syntax
Hardest: Semantic frames, temporal info
Collect all of these services into a single
module
CLAWS POS tagger
RASP syntactic parser
Link grammar parser
WordNet synsets
FrameNet semantic frames
BBN IdentiFinder NE tagging

36
Linguistic Reasoning about Domain Content

More complex questions require more complex
reasoning.
Joe has access to weapons-grade anthrax.
Joe is thought to possess warheads capable of
delivering biological agents.
Is Joe capable of mounting a biological attack?
Requires inference over information drawn from
multiple documents

Planner: reasoning about the QA process
FLOOD: reasoning about domain content
37
FLOOD Reasoner

FLOOD is an environment for developing reasoners
for complex question analysis
Consumes semantic frame information from the text
processor
Provides a planning platform for rule
specification
Allows complex operations such as subqueries, etc.

38
Pronoun Resolution

Same sentence
This article also states that intelligence
sources world-wide have been on a "manhunt" the
last several weeks for bin Ladin due to reports
that he had purchased nuclear weapons.
Previous sentence
There are still constraints on Saddam's power.
His economic infrastructure is in long-term
decline, and his ability to project power outside
Iraq's borders is severely limited, largely
because of the effectiveness and enforcement of
the No-Fly Zones.
Intervening discourse
Iraq is forging ahead with its outlawed
chemical, nuclear and germ weapon programs as
well as with the development of missiles to
deliver them, Defense Secretary Donald Rumsfeld
said on Friday. Saddam Hussein's appetites for
these weapons is enormous, he said in an
interview with the Fox News Channel.

39
General Process

Parse the retrieved text
morphology (POS, stem)
lexical information (NE tagger, WordNet)
syntactic structure (RASP)
grammatical functions (Link)
Assign agreement features: gender, person,
number, animacy
Select possible antecedents (NPs agreeing with
the pronoun)
Prune candidates according to
Known linguistic principles, where applicable
Heuristics (from Mitamura et al., 2003)
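
Selecting antecedents by agreement could be sketched as below. The feature values, the pronoun table, and the recency ordering are assumptions for illustration (only third-person pronouns are covered); they are not the heuristics of Mitamura et al.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mention:
    text: str
    gender: str     # "masc", "fem", or "neut"
    number: str     # "sg" or "pl"
    animate: bool
    sentence: int   # index of the sentence containing the NP


# Agreement features per pronoun; None means the feature is unconstrained.
PRONOUNS = {
    "he": ("masc", "sg", True), "his": ("masc", "sg", True),
    "she": ("fem", "sg", True), "her": ("fem", "sg", True),
    "it": ("neut", "sg", False), "its": ("neut", "sg", False),
    "they": (None, "pl", None), "them": (None, "pl", None),
}


def candidate_antecedents(pronoun: str, pronoun_sentence: int,
                          mentions: List[Mention]) -> List[Mention]:
    """Return NPs that agree with the pronoun, most recent sentence first."""
    gender, number, animate = PRONOUNS[pronoun.lower()]
    agreeing = [
        m for m in mentions
        if m.sentence <= pronoun_sentence
        and m.number == number
        and (gender is None or m.gender == gender)
        and (animate is None or m.animate == animate)
    ]
    return sorted(agreeing, key=lambda m: pronoun_sentence - m.sentence)
```

In the "intervening discourse" example, agreement alone leaves both Saddam Hussein and Donald Rumsfeld as candidates for "he", and recency would rank the wrong one first, which is why the additional pruning step matters.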

40
Multilingual JAVELIN Architecture
?s
Ongoing/Future Work
Chinese Request Object (UTF-8)
Chinese Answers (UTF-8)
Chinese IX module collection
Answer Generator
Question Analyzer
Japanese Answers (UTF-8)
Japanese Request Object (UTF-8)
Multilingual Retrieval Strategist (UTF-8)
Japanese IX module collection
Answer Generator
Multilingual Question Object (UTF-8)
Encoding Conversion
Answer Generator
English IX module collection
English Request Object (UTF-8)
English Answer (UTF-8)
English Answers (UTF-8)
Chinese Corpora (GBK)
English Corpora (ASCII)
Japanese Corpora (EUC-JP)
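
The encoding-conversion step in this diagram amounts to decoding each corpus from its native encoding and re-encoding it as UTF-8. A minimal sketch, in which the language codes and the mapping table are assumptions based on the labels above:

```python
# Corpus encodings as labeled in the architecture diagram; the keys are hypothetical.
SOURCE_ENCODINGS = {"zh": "gbk", "ja": "euc_jp", "en": "ascii"}


def to_utf8(raw: bytes, language: str) -> bytes:
    """Convert corpus bytes from the language's native encoding to UTF-8."""
    try:
        encoding = SOURCE_ENCODINGS[language]
    except KeyError:
        raise ValueError(f"no source encoding known for language {language!r}") from None
    return raw.decode(encoding).encode("utf-8")
```

After conversion every downstream module can treat text uniformly, whichever corpus it came from.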
41
Japanese Language Resources

Mainichi Shimbun Corpus
Full corpus of a major Japanese newspaper for
1998 and 1999 (About 240,000 articles)
Bilingual Dictionaries
EDICT (100,000 general entries, 200,000 Japanese
personal names, 87,000 Japanese place names,
14,000 scientific terms)
EIJIRO (English word to Japanese phrase harder
to use, but has 1,080,000 entries)
Web-based Machine Translation
Systran
Amikai
Named Entity Tagger / Dependency Structure
Analyzer
Cabocha
POS-tagger
Chasen

42
Chinese Language Resources

Corpora
Xinhua News corpora (in use)
Xinhua News from 1991-2001
NTCIR-3 CLIR IR/CLIR Test Collection (future)
Chinese news articles published in Taiwan in
1998-1999
Foreign Broadcast Information Service (future)
Mandarin-English parallel corpora
Preprocessing (tools from RADD-MT project)
ASCII character and digit normalization
Segmentation
Named entity tagging
Bilingual Dictionaries
LDC
Bilingual word-to-word dictionary
Bilingual phrase-to-phrase dictionary
CEDICT
Chinese-English dictionary

43
Testing and Evaluation

Daily test framework reporting
Evaluations
TREC 11 QA Track evaluation
Relationship Pilot evaluation
TREC 12 QA Track evaluation

Details available from the NIST web site or the
JAVELIN home page: http://www.lti.cs.cmu.edu/Research/JAVELIN
44
Evaluation Techniques

Execution Manager can run in "lights-out" batch
mode
Regular tests on different test suites (TREC
question suites, relationship pilot questions,
etc.)
Results include scores and logs for debugging
All intermediate results are stored in Repository
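
A batch test run that produces a score and a debugging log could be sketched as follows. The top-answer accuracy used here is a stand-in chosen for simplicity, not the TREC scoring method, and `answer_fn` is a hypothetical hook for the end-to-end pipeline.

```python
from typing import Callable, List, Sequence, Tuple

# One test item: (question text, acceptable answers).
TestItem = Tuple[str, Sequence[str]]


def run_suite(answer_fn: Callable[[str], List[str]],
              suite: List[TestItem]) -> Tuple[float, List[str]]:
    """Run every question unattended; return top-answer accuracy and one log line per question."""
    if not suite:
        raise ValueError("test suite must not be empty")
    log_lines: List[str] = []
    correct = 0
    for question, gold_answers in suite:
        ranked = answer_fn(question)
        top = ranked[0] if ranked else None
        gold = {g.lower() for g in gold_answers}
        hit = top is not None and top.lower() in gold
        correct += int(hit)
        log_lines.append(f"{'PASS' if hit else 'FAIL'}\t{question}\ttop={top!r}")
    return correct / len(suite), log_lines
```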

45
Sample Results
46
Sample Log File Excerpt
47
Remaining Challenges

Getting adequate training data for statistical
approaches
Getting adequate lexico-semantic resources for
NLP approaches
Combining existing NLP tools into an integrated
framework
Extending the data model and representations for
scenario-based QA

48
Future Work

Variable-Precision Knowledge Representation and
Reasoning
Scenario-Driven Dialogs
Scenario Representation
Multilingual, Distributed IR
Multi-Strategy Information Gathering
Answer Visualization and Scenario Refinement

49
Questions?
