Title: Answering Questions by Computer
1. Answering Questions by Computer
2. Terminology: Question Type
- Question Type: an idiomatic categorization of questions for purposes of distinguishing between different processing strategies and/or answer formats.
- E.g. TREC 2003:
  - FACTOID: How far is it from Earth to Mars?
  - LIST: List the names of chewing gums
  - DEFINITION: Who is Vlad the Impaler?
- Other possibilities:
  - RELATIONSHIP: What is the connection between Valentina Tereshkova and Sally Ride?
  - SUPERLATIVE: What is the largest city on Earth?
  - YES-NO: Is Saddam Hussein alive?
  - OPINION: What do most Americans think of gun control?
  - CAUSE-EFFECT: Why did Iraq invade Kuwait?
3. Terminology: Answer Type
- Answer Type: the class of object (or rhetorical type of sentence) sought by the question. E.g.:
  - PERSON (from "Who...")
  - PLACE (from "Where...")
  - DATE (from "When...")
  - NUMBER (from "How many...")
- but also:
  - EXPLANATION (from "Why...")
  - METHOD (from "How...")
- Answer types are usually tied intimately to the classes recognized by the system's Named Entity Recognizer.
4. Terminology: Question Focus
- Question Focus: the property or entity that is being sought by the question.
- E.g.:
  - In what state is the Grand Canyon?
  - What is the population of Bulgaria?
  - What colour is a pomegranate?
5. Terminology: Question Topic
- Question Topic: the object (person, place, ...) or event that the question is about. The question might well be about a property of the topic, which will be the question focus.
- E.g. What is the height of Mt. Everest?
  - height is the focus
  - Mt. Everest is the topic
6. Terminology: Candidate Passage
- Candidate Passage: a text passage (anything from a single sentence to a whole document) retrieved by a search engine in response to a question.
- Depending on the query and the kind of index used, there may or may not be a guarantee that a candidate passage has any candidate answers.
- Candidate passages will usually have associated scores from the search engine.
7. Terminology: Candidate Answer
- Candidate Answer: in the context of a question, a small quantity of text (anything from a single word to a sentence or bigger, but usually a noun phrase) that is of the same type as the Answer Type.
- In some systems, the type match may be approximate, if there is a concept of confusability.
- Candidate answers are found in candidate passages.
- E.g.:
  - 50
  - Queen Elizabeth II
  - September 8, 2003
  - by baking a mixture of flour and water
8. Terminology: Authority List
- Authority List (or File): a collection of instances of a class of interest, used to test a term for class membership.
- Instances should be derived from an authoritative source and be as close to complete as possible.
- Ideally, the class is small, easily enumerated, and its members have a limited number of lexical forms.
- Good:
  - Days of week
  - Planets
  - Elements
- Good statistically, but difficult to get 100% recall:
  - Animals
  - Plants
  - Colours
- Problematic:
  - People
  - Organizations
- Impossible:
  - All numeric quantities
  - Explanations and other clausal quantities
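The membership test an authority list supports can be sketched in a few lines; the class and normalization below are illustrative, not from any particular system.

```python
# Minimal sketch of an authority-list membership test: normalize the
# term, then test set membership against an enumerated class.
WEEKDAYS = {"monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"}

def is_weekday(term: str) -> bool:
    """Lowercase and trim the term, then test list membership."""
    return term.strip().lower() in WEEKDAYS

print(is_weekday("Tuesday"))   # True
print(is_weekday("Pluto"))     # False
```

A real system would also map lexical variants (e.g. "Tues.") to a canonical form before the lookup.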
9. Essence of Text-based QA (Single-source answers)
- Need to find a passage that answers the question:
  - Find a candidate passage (search)
  - Check that the semantics of the passage and the question match
  - Extract the answer
10. Ranking Candidate Answers
- Q066: Name the first private citizen to fly in space.
- Answer type: PERSON
- Text passage: "Among them was Christa McAuliffe, the first private citizen to fly in space. Karen Allen, best known for her starring role in Raiders of the Lost Ark, plays McAuliffe. Brian Kerwin is featured as shuttle pilot Mike Smith..."
11. Answer Extraction
- Also called Answer Selection/Pinpointing.
- Given a question and candidate passages, the process of selecting and ranking candidate answers.
- Usually, candidate answers are those terms in the passages which have the same answer type as that generated from the question.
- Ranking the candidate answers depends on assessing how well the passage context relates to the question.
- 3 approaches:
  - Heuristic features
  - Shallow parse fragments
  - Logical proof
12. Features for Answer Ranking
- Number of question terms matched in the answer passage
- Number of question terms matched in the same phrase as the candidate answer
- Number of question terms matched in the same sentence as the candidate answer
- Flag set to 1 if the candidate answer is followed by a punctuation sign
- Number of question terms matched, separated from the candidate answer by at most three words and one comma
- Number of terms occurring in the same order in the answer passage as in the question
- Average distance from the candidate answer to question term matches
(SIGIR '01)
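Two of these features can be sketched directly over tokenized text; this is a hedged illustration of the idea, not the SIGIR '01 implementation.

```python
# Compute two answer-ranking features: question-term matches in the
# candidate's sentence, and average distance (in tokens) from the
# candidate answer to those matches.
def ranking_features(question_terms, sentence_tokens, answer_index):
    matches = [i for i, tok in enumerate(sentence_tokens)
               if tok.lower() in question_terms]
    same_sentence = len(matches)
    avg_distance = (sum(abs(i - answer_index) for i in matches) / len(matches)
                    if matches else 0.0)
    return same_sentence, avg_distance

q = {"first", "private", "citizen", "fly", "space"}
s = "Christa McAuliffe the first private citizen to fly in space".split()
print(ranking_features(q, s, 1))  # candidate answer "McAuliffe" at index 1
```

A full ranker would combine many such features, typically with hand-tuned or learned weights.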
13. Heuristics for Answer Ranking in the Lasso System
- Same_word_sequence_score: number of words from the question that are recognized in the same sequence in the passage.
- Punctuation_sign_score: a flag set to 1 if the candidate answer is followed by a punctuation sign.
- Comma_3_word_score: the number of question words that follow the candidate answer, if the candidate is followed by a comma.
- Same_parse_subtree_score: number of question words found in the parse sub-tree of the answer.
- Same_sentence_score: number of question words found in the answer's sentence.
- Distance_score: adds the distance (measured in number of words) between the answer candidate and the other keywords in the window.
14. Heuristics for Answer Ranking in the Lasso System (continued)
15. Evaluation
- Evaluation of this kind of system is usually based on some kind of TREC-like metric.
- In QA the most frequent metric is Mean Reciprocal Rank (MRR):
  - You're allowed to return N answers. Your score is based on 1/rank of the first right answer.
  - Averaged over all the questions you answer.
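The metric is simple enough to state as code; the example ranks below are made up.

```python
# Mean reciprocal rank: score each question by 1/rank of its first
# correct answer (0 if no returned answer is correct), then average.
def mean_reciprocal_rank(ranks):
    """ranks: per-question rank of the first correct answer, or None."""
    return sum(1.0 / r for r in ranks if r) / len(ranks)

# e.g. first correct answers at ranks 1 and 2, and one question missed:
print(mean_reciprocal_rank([1, 2, None]))  # (1 + 0.5 + 0) / 3 = 0.5
```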
16. Answer Types and Modifiers
- Name 5 French cities
- Most likely there is no type for FRENCH CITY, so the system will look for CITY, and either:
  - include French/France in the bag of words, and hope for the best
  - include French/France in the bag of words, retrieve documents, and look for evidence (deep parsing, logic)
  - use high-precision language identification on the results
- If you have a list of French cities, you could either:
  - Filter results by the list
  - Use Answer-Based QA (see later)
  - Use longitude/latitude information of cities and countries
17. Answer Types and Modifiers
- Name a female figure skater
- Most likely there is no type for FEMALE FIGURE SKATER, or even for FIGURE SKATER.
- Look for PERSON, with query terms "figure", "skater".
- What to do about "female"? Two approaches:
  - Include "female" in the bag-of-words.
    - Relies on the logic that if femaleness is an interesting property, it might well be mentioned in answer passages.
    - Does not apply to, say, "singer".
  - Leave out "female" but test candidate answers for gender.
    - Needs either an authority file or a heuristic test.
    - The test may not be definitive.
18. Part II: Specific Approaches
- By genre:
  - Statistical QA
  - Pattern-based QA
  - Web-based QA
  - Answer-based QA (TREC only)
- By system:
  - SMU
  - LCC
  - USC-ISI
  - Insight
  - Microsoft
  - IBM statistical
  - IBM rule-based
19. Statistical QA
- Use statistical distributions to model likelihoods of answer type and answer.
- E.g. IBM (Ittycheriah, 2001); see later section.
20. Pattern-based QA
- For a given question type, identify the typical syntactic constructions used in text to express answers to such questions.
- Typically very high precision, but a lot of work to get decent recall.
21. Web-Based QA
- Exhaustive string transformations (Brill et al., 2002)
- Learning (Radev et al., 2001)
22. Answer-Based QA
- Problem: sometimes it is very easy to find an answer to a question using resource A, but the task demands that you find it in resource B.
- Solution: first find the answer in resource A, then locate the same answer, along with the original question terms, in resource B.
- An artificial problem, but a real one for TREC participants.
23. Answer-Based QA
- "When a QA system looks for answers within a relatively small textual collection, the chance of finding strings/sentences that closely match the question string is small. However, when a QA system looks for strings/sentences that closely match the question string on the web, the chance of finding a correct answer is much higher." (Hermjakob et al., 2002)
- Why this is true:
  - The Web is much larger than the TREC corpus (roughly 3,000:1).
  - TREC questions are generated from Web logs, and the style of language (and the subjects of interest) in these logs is more similar to Web content than to newswire collections.
24. Answer-Based QA
- Database/knowledge-base/ontology solution:
  - When question syntax is simple and reliably recognizable, it can be expressed as a logical form.
  - The logical form represents the entire semantics of the question, and can be used to access a structured resource:
    - WordNet
    - On-line dictionaries
    - Tables of facts and figures
    - Knowledge-bases such as Cyc
- Having found the answer:
  - construct a query with the original question terms plus the answer
  - retrieve passages
  - tell Answer Extraction the answer it is looking for
25. Approaches of Specific Systems
- SMU: Falcon
- LCC
- USC-ISI
- Insight
- Microsoft
- IBM
Note: some of the slides and/or examples in these sections are taken from papers or presentations by the respective system authors.
26. SMU: Falcon
(Harabagiu et al., 2000)
27. SMU: Falcon
- From the question, a dependency structure called the question semantic form is created.
- The query is a Boolean conjunction of terms.
- From answer passages that contain at least one instance of the answer type, generate the answer semantic form.
- 3 processing loops:
  - Loop 1: triggered when too few or too many passages are retrieved from the search engine
  - Loop 2: triggered when the question semantic form and the answer semantic form cannot be unified
  - Loop 3: triggered when unable to perform an abductive proof of answer correctness
28. SMU: Falcon
- The loops provide opportunities to perform alternations:
  - Loop 1: morphological expansions and nominalizations
  - Loop 2: lexical alternations (synonyms, direct hypernyms and hyponyms)
  - Loop 3: paraphrases
- Evaluation (Pasca & Harabagiu, 2001): increase in accuracy on the 50-byte task in TREC-9:
  - Loop 1: 40%
  - Loop 2: 52%
  - Loop 3: 8%
  - Combined: 76%
29. LCC
- (Moldovan & Rus, 2001)
- Uses a Logic Prover for answer justification, drawing on:
  - Question logical form
  - Candidate answers in logical form
  - XWN glosses
  - Linguistic axioms
  - Lexical chains
- The inference engine attempts to verify an answer by negating the question and proving a contradiction.
- If the proof fails, predicates in the question are gradually relaxed until the proof succeeds or the associated proof score falls below a threshold.
30. LCC: Lexical Chains
- Q1518: What year did Marco Polo travel to Asia?
- Answer: "Marco Polo divulged the truth after returning in 1292 from his travels, which included several months on Sumatra."
- Lexical chains:
  - (1) travel_to:v#1 -> GLOSS -> travel:v#1 -> RGLOSS -> travel:n#1
  - (2) travel_to:v#1 -> GLOSS -> travel:v#1 -> HYPONYM -> return:v#1
  - (3) Sumatra:n#1 -> ISPART -> Indonesia:n#1 -> ISPART -> Southeast_Asia:n#1 -> ISPART -> Asia:n#1
- Q1570: What is the legal age to vote in Argentina?
- Answer: "Voting is mandatory for all Argentines aged over 18."
- Lexical chains:
  - (1) legal:a#1 -> GLOSS -> rule:n#1 -> RGLOSS -> mandatory:a#1
  - (2) age:n#1 -> RGLOSS -> aged:a#3
  - (3) Argentine:a#1 -> GLOSS -> Argentina:n#1
31. LCC: Logic Prover
- Question: Which company created the Internet Browser Mosaic?
- QLF: (_organization_AT(x2)) & company_NN(x2) & create_VB(e1,x2,x6) & Internet_NN(x3) & browser_NN(x4) & Mosaic_NN(x5) & nn_NNC(x6,x3,x4,x5)
- Answer passage: "... Mosaic, developed by the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign ..."
- ALF: ... Mosaic_NN(x2) & develop_VB(e2,x2,x31) & by_IN(e2,x8) & National_NN(x3) & Center_NN(x4) & for_NN(x5) & Supercomputing_NN(x6) & application_NN(x7) & nn_NNC(x8,x3,x4,x5,x6,x7) & NCSA_NN(x9) & at_IN(e2,x15) & University_NN(x10) & of_NN(x11) & Illinois_NN(x12) & at_NN(x13) & Urbana_NN(x14) & nn_NNC(x15,x10,x11,x12,x13,x14) & Champaign_NN(x16) ...
- Lexical chains: develop <-> make and make <-> create:
  - exists x2 x3 x4 all e2 x1 x7 (develop_vb(e2,x7,x1) <-> make_vb(e2,x7,x1) & something_nn(x1) & new_jj(x1) & such_jj(x1) & product_nn(x2) & or_cc(x4,x1,x3) & mental_jj(x3) & artistic_jj(x3) & creation_nn(x3)).
  - all e1 x1 x2 (make_vb(e1,x1,x2) <-> create_vb(e1,x1,x2) & manufacture_vb(e1,x1,x2) & man-made_jj(x2) & product_nn(x2)).
- Linguistic axioms:
  - all x0 (mosaic_nn(x0) -> internet_nn(x0) & browser_nn(x0))
32. USC-ISI
- TextMap system (Ravichandran and Hovy, 2002; Hermjakob et al., 2003)
- Use of surface text patterns:
  - "When was X born?" ->
    - Mozart was born in 1756.
    - Gandhi (1869-1948)
  - Can be captured in expressions:
    - <NAME> was born in <BIRTHDATE>
    - <NAME> (<BIRTHDATE>-
  - These patterns can be learned.
33. USC-ISI: TextMap
- Use bootstrapping to learn patterns.
- For an identified question type ("When was X born?"), start with known answers for some values of X:
  - Mozart 1756
  - Gandhi 1869
  - Newton 1642
- Issue Web search engine queries (e.g. "Mozart 1756") and collect the top 1000 documents.
- Filter, tokenize, smooth, etc.
- Use a suffix-tree constructor to find the best substrings, e.g.:
  - Mozart (1756-1791)
- Filter:
  - Mozart (1756-
- Replace the query strings with e.g. <NAME> and <ANSWER>.
- Determine the precision of each pattern:
  - Find documents containing just the question term (Mozart)
  - Apply the patterns and calculate precision
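The precision-scoring step can be sketched as below; the template-to-regex conversion and the year-shaped answer slot are assumptions for illustration, not TextMap's actual machinery.

```python
import re

# Hedged sketch of pattern-precision scoring: apply a learned surface
# pattern to passages that contain only the question term, then score
# precision as (matches equal to the known answer) / (all matches).
def pattern_precision(template, texts, name, answer):
    rx = re.escape(template)               # treat '(' and '-' literally
    rx = rx.replace("<NAME>", re.escape(name))
    rx = rx.replace("<ANSWER>", r"(\d{4})")  # assume a year-like answer
    regex = re.compile(rx)
    hits = [m.group(1) for t in texts for m in regex.finditer(t)]
    return sum(h == answer for h in hits) / len(hits) if hits else 0.0

texts = ["Mozart (1756-1791) was a composer.",
         "A Mozart (1791- era program aired."]
print(pattern_precision("<NAME> (<ANSWER>-", texts, "Mozart", "1756"))  # 0.5
```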
34. USC-ISI: TextMap
- Finding answers:
  - Determine the question type
  - Perform an IR query
  - Do sentence segmentation and smoothing
  - Replace the question term by a question tag, i.e. replace "Mozart" with <NAME>
  - Search for instances of the patterns associated with the question type
  - Select words matching <ANSWER>
  - Assign scores according to the precision of the pattern
35. Insight
- (Soubbotin, 2002; Soubbotin & Soubbotin, 2003)
- Performed very well in TREC-10/11.
- Comprehensive and systematic use of indicative patterns. E.g.:
  - "capitalized word; paren; 4 digits; dash; 4 digits; paren" matches "Mozart (1756-1791)"
- The patterns are broader than named entities ("semantics in syntax").
- Patterns have intrinsic scores (reliability), independent of the question.
36. Insight
- Patterns with more sophisticated internal structure are more indicative of an answer.
- 2/3 of their correct entries in TREC-10 were answered by patterns. E.g.:
  - a = countries
  - b = official posts
  - w = proper names (first and last)
  - e = titles or honorifics
- Patterns for "Who is the President (Prime Minister) of a given country?":
  - abeww
  - ewwdb,a
  - b,aeww
- Definition questions (A is the primary query term, X is the answer):
  - <A; comma; a/an/the; X; comma/period>
    - For "Moulin Rouge, a cabaret"
  - <X; comma; also called; A; comma>
    - For "naturally occurring gas called methane"
  - <A; is/are; a/an/the; X>
37. Insight
- Emphasis on shallow techniques; little use of NLP.
- Look in the vicinity of a text string potentially matching a pattern for "zeroing" terms, e.g. for occupational roles:
  - Former
  - Elect
  - Deputy
  - Negation
- Comments:
  - Relies on the redundancy of a large corpus.
  - Works for the factoid question types of TREC-QA; not clear how it extends.
  - Not clear how they match questions to patterns.
  - Named entities within patterns have to be recognized.
38. Microsoft
- Data-Intensive QA (Brill et al., 2002).
- Overcoming the surface string mismatch between the question formulation and the string containing the answer.
- Approach based on the assumption/intuition that someone on the Web has answered the question in the same way it was asked.
- Want to avoid dealing with:
  - Lexical, syntactic, and semantic relationships (between Q and A)
  - Anaphora resolution
  - Synonymy
  - Alternate syntax
  - Indirect answers
- Take advantage of redundancy on the Web, then project to the TREC corpus (Answer-based QA).
39. Microsoft: AskMSR
- Formulate multiple queries; each rewrite has an intrinsic score. E.g. for "What is relative humidity?":
  - "is relative humidity", LEFT, 5
  - "relative is humidity", RIGHT, 5
  - "relative humidity is", RIGHT, 5
  - "relative humidity", NULL, 2
  - relative AND humidity, NULL, 1
- Get the top 100 documents from Google.
- Extract n-grams from the document summaries.
- Score n-grams by summing the scores of the rewrites they came from.
- Use tiling to merge n-grams.
- Search for supporting documents in the TREC corpus.
40. Microsoft: AskMSR
- Question: What is the rainiest place on Earth?
- Answer from the Web: Mount Waialeale.
- Passage in the TREC corpus: "In misty Seattle, Wash., last year, 32 inches of rain fell. Hong Kong gets about 80 inches a year, and even Pago Pago, noted for its prodigious showers, gets only about 196 inches annually. (The titleholder, according to the National Geographic Society, is Mount Waialeale in Hawaii, where about 460 inches of rain falls each year.)"
- Very difficult to imagine getting this passage by other means.
41. IBM Statistical QA (Ittycheriah, 2001)
- q = question, a = answer, c = correctness, e = answer type
- p(c|q,a) = Σ_e p(c,e|q,a) = Σ_e p(c|e,q,a) p(e|q,a)
- p(e|q,a) is the answer type model (ATM); p(c|e,q,a) is the answer selection model (ASM).
- The ATM predicts, from the question and a proposed answer, the answer type they both satisfy.
- Given a question, an answer, and the predicted answer type, the ASM seeks to model the correctness of this configuration.
- Distributions are modelled using a maximum entropy formulation.
- Training data: human judgments:
  - For the ATM, 13K questions annotated with 31 categories
  - For the ASM, 5K questions from TREC plus trivia
42. IBM Statistical QA (Ittycheriah)
- Question analysis (by the ATM):
  - Selects one of 31 categories
- Search:
  - Question expanded by Local Context Analysis
  - Top 1000 documents retrieved
- Passage extraction: top 100 passages that:
  - Maximize question word match
  - Have the desired answer type
  - Minimize dispersion of question words
  - Have similar syntactic structure to the question
- Answer extraction:
  - Candidate answers ranked using the ASM
43. IBM Rule-based
- Predictive Annotation (Prager 2000, Prager 2003)
- Want to make sure passages retrieved by the search engine have at least one candidate answer.
- Recognize that a candidate answer is of the correct answer type, which corresponds to a label (or several) generated by the Named Entity Recognizer.
- Annotate the entire corpus and index the semantic labels along with the text.
- Identify answer types in questions and include the corresponding labels in queries.
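The indexing idea can be illustrated with a toy in-memory "index" where semantic labels are stored as extra terms next to the tokens they label; the tiny tagger and query here are assumptions for illustration, not PIQUANT's components.

```python
# Minimal sketch of predictive annotation: index NE labels as
# pseudo-terms alongside the text, so a query like "PERSON invented
# baseball" only matches passages containing a person mention.
def annotate(tokens, tagger):
    out = []
    for tok in tokens:
        out.append(tok.lower())
        out += tagger.get(tok, [])   # add semantic labels as extra terms
    return out

TAGGER = {"Doubleday": ["PERSON"], "baseball": ["SPORT"]}  # toy NE tagger
doc = "baseball had been invented by Doubleday".split()
indexed_terms = set(annotate(doc, TAGGER))

query = {"person", "invented", "baseball"}
print(query <= {t.lower() for t in indexed_terms})  # True: passage matches
```

A real system would do this annotation over the whole corpus at indexing time, so the subsumption logic never runs inside the search engine.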
44. IBM: PIQUANT
- Predictive Annotation
- E.g. Question: Who invented baseball?
- "Who" can map to PERSON or ORGANIZATION.
- Suppose we assume only people invent things (it doesn't really matter).
- So "Who invented baseball?" -> PERSON invent baseball
- Consider the text: "... but its conclusion was based largely on the recollections of a man named Abner Graves, an elderly mining engineer, who reported that baseball had been 'invented' by Doubleday between 1839 and 1841."
45. IBM: PIQUANT
- Predictive Annotation
- Previous example: "Who invented baseball?" -> PERSON invent baseball
- However, the same structure is equally effective at answering:
  - "What sport did Doubleday invent?" -> SPORT invent Doubleday
46. IBM Rule-Based
- Handling subsumption and disjunction:
  - If an entity is of a type which has a parent type, how is annotation done?
  - If a proposed answer type has a parent type, what answer type should be used?
  - If an entity is ambiguous, what should the annotation be?
  - If the answer type is ambiguous, what should be used?
- Guidelines for these questions follow.
47. Subsumption and Disjunction
- Consider New York City: it is both a CITY and a PLACE.
  - To answer "Where did John Lennon die?", it needs to be a PLACE.
  - To answer "In what city is the Empire State Building?", it needs to be a CITY.
- Do NOT want to do the subsumption calculation in the search engine.
- Two scenarios:
  - 1. Expand the answer type and use the most specific entity annotation:
    - 1A: (CITY | PLACE) John_Lennon die, matches CITY
    - 1B: CITY Empire_State_Building, matches CITY
  - 2. Use the most specific answer type and multiple annotations of NYC:
    - 2A: PLACE John_Lennon die, matches (CITY | PLACE)
    - 2B: CITY Empire_State_Building, matches (CITY | PLACE)
- Case 2 is preferred for simplicity, because the disjunction in 1 would have to contain all hyponyms of PLACE, while the disjunction in 2 need only contain all hypernyms of CITY.
- Choice 2 suggests that a disjunction in the answer type can be used to represent ambiguity:
  - Who invented the laser? -> (PERSON | ORGANIZATION) invent laser
48. Clausal Classes
- Any structure that can be recognized in text can be annotated:
  - Quotations
  - Explanations
  - Methods
  - Opinions
  - ...
- Any semantic class label used in annotation can be indexed, and hence used as a target of search:
  - What did Karl Marx say about religion?
  - Why is the sky blue?
  - How do you make bread?
  - What does Arnold Schwarzenegger think about global warming?
49. Named Entity Recognition
50. IBM
- Predictive Annotation: improving precision at no cost to recall.
- E.g. Question: Where is Belize?
- "Where" can map to (CONTINENT, WORLDREGION, COUNTRY, STATE, CITY, CAPITAL, LAKE, RIVER, ...).
- But we know Belize is a country.
- So "Where is Belize?" -> (CONTINENT | WORLDREGION) Belize
- "Belize" occurs 1068 times in the TREC corpus.
- "Belize" and PLACE co-occur in only 537 sentences.
- "Belize" and CONTINENT or WORLDREGION co-occur in only 128 sentences.
53. Virtual Annotation (Prager 2001)
- Use WordNet to find all candidate answers (hypernyms).
- Use corpus co-occurrence statistics to select the best ones.
- Rather like the approach to WSD by Mihalcea and Moldovan (1999).
54. Parentage of "nematode"
Level Synset
0 nematode, roundworm
1 worm
2 invertebrate
3 animal, animate being, beast, brute, creature, fauna
4 life form, organism, being, living thing
5 entity, something
55. Parentage of "meerkat"
Level Synset
0 meerkat, mierkat
1 viverrine, viverrine mammal
2 carnivore
3 placental, placental mammal, eutherian, eutherian mammal
4 mammal
5 vertebrate, craniate
6 chordate
7 animal, animate being, beast, brute, creature, fauna
8 life form, organism, being, living thing
9 entity, something
56. Natural Categories
- "Basic Objects in Natural Categories", Rosch et al. (1976).
- According to psychological testing, these are categorization levels of intermediate specificity that people tend to use in unconstrained settings.
57. What is this?
58. What can we conclude?
- There are descriptive terms that people are drawn to use naturally.
- We can expect to find instances of these in text, in the right contexts.
- These terms will serve as good answers.
59. Virtual Annotation (cont.)
- Find all parents of the query term in WordNet.
- Look for co-occurrences of the query term and each parent in a text corpus.
- Expect to find snippets such as "... meerkats and other Y ...".
- Many different phrasings are possible, so we just look for proximity rather than parse.
- Scoring:
  - Count the co-occurrences of each parent with the search term, and divide by the level number (only for levels > 1), generating the Level-Adapted Count (LAC).
  - Exclude the very highest levels (too general).
  - Select the parent with the highest LAC plus any others with a LAC within 20%.
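The LAC selection can be sketched as follows; the counts are taken from the nematode example on the next slide, and the omission of the "exclude the highest levels" filter is a simplification.

```python
# Sketch of Level-Adapted Count (LAC) selection: divide each parent's
# co-occurrence count by its WordNet level, keep the best parent plus
# any within 20% of it.
def select_parents(counts_by_level):
    """counts_by_level: {parent: (wordnet_level, co-occurrence count)}."""
    lac = {p: c / max(l, 1) for p, (l, c) in counts_by_level.items()}
    best = max(lac.values())
    return sorted(p for p, v in lac.items() if v >= 0.8 * best)

parents = {"worm": (1, 13), "animal": (3, 2),
           "organism": (4, 3), "life form": (4, 2)}
print(select_parents(parents))  # ['worm']: LAC 13 dominates
```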
60. Parentage of "nematode" (with co-occurrence counts)
Level Synset
0 nematode, roundworm
1 worm(13)
2 invertebrate
3 animal(2), animate being, beast, brute, creature, fauna
4 life form(2), organism(3), being, living thing
5 entity, something
61. Parentage of "meerkat" (with co-occurrence counts)
Level Synset
0 meerkat, mierkat
1 viverrine, viverrine mammal
2 carnivore
3 placental, placental mammal, eutherian, eutherian mammal
4 mammal
5 vertebrate, craniate
6 chordate
7 animal(2), animate being, beast, brute, creature, fauna
8 life form, organism, being, living thing
9 entity, something
62. Sample Answer Passages
- Use Answer-based QA to locate answers.
- What is a nematode? ->
  - "Such genes have been found in nematode worms but not yet in higher animals."
- What is a meerkat? ->
  - "South African golfer Butch Kruger had a good round going in the central Orange Free State trials, until a mongoose-like animal grabbed his ball with its mouth and dropped down its hole. Kruger wrote on his card: 'Meerkat.'"
63. Use of Cyc as Sanity Checker
- Cyc: a large knowledge-base and inference engine (Lenat 1995).
- A post-hoc process for:
  - Rejecting insane answers:
    - How much does a grey wolf weigh? 300 tons
  - Boosting confidence for sane answers
- The sanity checker is invoked with:
  - Predicate, e.g. weight
  - Focus, e.g. grey wolf
  - Candidate value, e.g. 300 tons
- The sanity checker returns:
  - Sane: within 10% of the value in Cyc
  - Insane: outside of the reasonable range
    - Plan to use distributions instead of ranges
  - Don't know
- The confidence score is highly boosted when the answer is sane.
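The three-way contract just described can be sketched directly; the reference value and range for the grey wolf below are illustrative, not Cyc's actual facts.

```python
# Hedged sketch of the sanity-check contract: "sane" if within 10% of
# the reference value, "insane" if outside a reasonable range, else
# "don't know".
def sanity_check(reference, low, high, candidate):
    if reference and abs(candidate - reference) <= 0.10 * reference:
        return "sane"
    if candidate < low or candidate > high:
        return "insane"
    return "don't know"

# grey wolf weight in kg: assumed reference ~40, assumed range 20-80
print(sanity_check(40, 20, 80, 300_000))  # 300 tons -> "insane"
print(sanity_check(40, 20, 80, 38))       # within 10% -> "sane"
```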
64. Cyc Sanity Checking Example
- TREC-11 Q: What is the population of Maryland?
- Without sanity checking:
  - PIQUANT's top answer: 50,000
  - Justification: "Maryland's population is 50,000 and growing rapidly."
  - The passage discusses an exotic species (nutria), not humans.
- With sanity checking:
  - Cyc knows the population of Maryland is 5,296,486.
  - It rejects the insane top answers.
  - PIQUANT's new top answer: 5.1 million, with very high confidence.
65. AskMSR
- Process the question by:
  - Forming a search engine query from the original question
  - Detecting the answer type
- Get some results.
- Extract answers of the right type, based on how often they occur.
66. AskMSR
67. Step 1: Rewrite the questions
- Intuition: the user's question is often syntactically quite close to sentences that contain the answer.
  - Where is the Louvre Museum located? / The Louvre Museum is located in Paris.
  - Who created the character of Scrooge? / Charles Dickens created the character of Scrooge.
68. Query Rewriting
- Classify the question into one of seven categories:
  - Who is/was/are/were...?
  - When is/did/will/are/were...?
  - Where is/are/were...?
- Hand-crafted category-specific transformation rules, e.g. for "where" questions, move "is" to all possible locations:
  - Where is the Louvre Museum located?
    -> "is the Louvre Museum located"
    -> "the is Louvre Museum located"
    -> "the Louvre is Museum located"
    -> "the Louvre Museum is located"
    -> "the Louvre Museum located is"
- Look to the right of the query terms for the answer.
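The "move the copula" rewrite can be sketched in a few lines; dropping the wh-word and the bag-of-words backoff are assumptions modelled on the example, not the exact AskMSR rules.

```python
# Minimal sketch of AskMSR-style rewriting: drop the wh-word, then move
# "is" to every position; add a bag-of-words query as a backoff.
def rewrites(question):
    words = question.rstrip("?").split()
    words.pop(0)                             # drop the wh-word
    verb = words.pop(words.index("is"))      # assumes a single "is"
    cands = ['"%s"' % " ".join(words[:i] + [verb] + words[i:])
             for i in range(len(words) + 1)]
    cands.append(" AND ".join(words))        # bag-of-words backoff
    return cands

for q in rewrites("Where is the Louvre Museum located?"):
    print(q)
```

A real implementation would attach the per-rewrite weight and the LEFT/RIGHT/NULL hint shown on slide 39.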
69. Step 2: Query the search engine
- Send all rewrites to a Web search engine.
- Retrieve the top N answers (100-200).
- For speed, rely just on the search engine's snippets, not the full text of the actual documents.
70. Step 3: Gathering N-Grams
- Enumerate all n-grams (N = 1, 2, 3) in all retrieved snippets.
- Weight of an n-gram: its occurrence count, each occurrence weighted by the reliability (weight) of the rewrite rule that fetched the document.
- Example: Who created the character of Scrooge?
  - Dickens 117
  - Christmas Carol 78
  - Charles Dickens 75
  - Disney 72
  - Carl Banks 54
  - A Christmas 41
  - Christmas Carol 45
  - Uncle 31
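The n-gram gathering and weighting step can be sketched as follows; the snippets and weights are illustrative, not real search results.

```python
from collections import Counter

# Sketch of n-gram scoring: every unigram/bigram/trigram in a retrieved
# snippet earns the weight of the rewrite rule that fetched it.
def score_ngrams(snippets):
    scores = Counter()
    for text, weight in snippets:
        toks = text.lower().split()
        for n in (1, 2, 3):
            for i in range(len(toks) - n + 1):
                scores[" ".join(toks[i:i + n])] += weight
    return scores

snippets = [("Charles Dickens created the character", 5),
            ("character created by Charles Dickens", 5),
            ("a Christmas Carol by Dickens", 2)]
scores = score_ngrams(snippets)
print(scores["charles dickens"], scores["dickens"])  # 10 12
```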
71. Step 4: Filtering N-Grams
- Each question type is associated with one or more data-type filters (regular expressions for answer types).
- Boost the score of n-grams that match the expected answer type.
- Lower the score of n-grams that don't match.
72. Step 5: Tiling the Answers
- Overlapping n-grams are merged and the old n-grams discarded, e.g.:
  - "Charles Dickens" (20), "Dickens" (15), "Mr Charles" (10)
  - tile into "Mr Charles Dickens" (score 45)
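The tiling step can be sketched as a greedy merge over token sequences; this is toy logic illustrating the idea, not the AskMSR implementation.

```python
# Hedged sketch of answer tiling: repeatedly merge candidate n-grams
# whose token sequences overlap (or subsume one another), summing their
# scores, until no merge applies.
def overlap_merge(a, b):
    """Merge token lists if b overlaps a's tail or is contained in a."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return a + b[k:]
    if " ".join(b) in " ".join(a):      # naive containment check
        return a
    return None

def tile(candidates):
    cands = [(c.split(), s) for c, s in candidates]
    merged = True
    while merged:
        merged = False
        for i in range(len(cands)):
            for j in range(len(cands)):
                if i != j:
                    combo = overlap_merge(cands[i][0], cands[j][0])
                    if combo is not None:
                        cands[i] = (combo, cands[i][1] + cands[j][1])
                        del cands[j]
                        merged = True
                        break
            if merged:
                break
    return [(" ".join(t), s) for t, s in cands]

print(tile([("Charles Dickens", 20), ("Dickens", 15), ("Mr Charles", 10)]))
# -> [('Mr Charles Dickens', 45)], matching the slide's merged score
```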
73. Results
- Standard TREC test-bed (TREC 2001): ~1M documents, 900 questions.
- The technique does OK, not great (it would have placed in the top 9 of ~30 participants).
- But with access to the Web, they do much better: they would have come in second on TREC 2001.
74. Issues
- In many scenarios (e.g., monitoring an individual's email) we only have a small set of documents.
- Works best (or only) for Trivial Pursuit-style fact-based questions.
- Limited/brittle repertoire of:
  - question categories
  - answer data types/filters
  - query rewriting rules
75. ISI: Surface Patterns Approach
- Use of characteristic phrases:
  - "When was <person> born?"
- Typical answers:
  - "Mozart was born in 1756."
  - "Gandhi (1869-1948)..."
- Suggests phrases like:
  - "<NAME> was born in <BIRTHDATE>"
  - "<NAME> (<BIRTHDATE>-"
- as regular expressions, which can help locate the correct answer.
76. Use Pattern Learning
- Example:
  - "The great composer Mozart (1756-1791) achieved fame at a young age"
  - "Mozart (1756-1791) was a genius"
  - "The whole world would always be indebted to the great music of Mozart (1756-1791)"
- The longest matching substring for all 3 sentences is "Mozart (1756-1791)".
- A suffix tree would extract "Mozart (1756-1791)" as an output, with a score of 3.
- Reminiscent of IE pattern learning.
77. Pattern Learning (cont.)
- Repeat with different examples of the same question type:
  - Gandhi 1869, Newton 1642, etc.
- Some patterns learned for BIRTHDATE:
  - a. born in <ANSWER>, <NAME>
  - b. <NAME> was born on <ANSWER>,
  - c. <NAME> ( <ANSWER> -
  - d. <NAME> ( <ANSWER> - )
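Applying learned patterns at answer time can be sketched as below; the template-to-regex conversion and the year-shaped answer slot are illustrative assumptions.

```python
import re

# Sketch of applying learned BIRTHDATE patterns: <NAME> is replaced by
# the question term and <ANSWER> becomes a capture group.
PATTERNS = ["<NAME> was born on <ANSWER> ,",
            "<NAME> ( <ANSWER> -"]

def extract(name, sentence):
    answers = []
    for p in PATTERNS:
        rx = re.escape(p)                     # treat '(' etc. literally
        rx = rx.replace("<NAME>", re.escape(name))
        rx = rx.replace("<ANSWER>", r"(\d{4})")
        answers += re.findall(rx, sentence)
    return answers

print(extract("Mozart", "Mozart ( 1756 - 1791 ) was a composer"))  # ['1756']
```

Each extracted answer would then be scored by the precision of the pattern that produced it.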
78. Experiments
- 6 different question types, from the Webclopedia QA Typology (Hovy et al., 2002a):
  - BIRTHDATE
  - LOCATION
  - INVENTOR
  - DISCOVERER
  - DEFINITION
  - WHY-FAMOUS
79. Experiments: Pattern Precision
- BIRTHDATE:
  - 1.0   <NAME> ( <ANSWER> - )
  - 0.85  <NAME> was born on <ANSWER>,
  - 0.6   <NAME> was born in <ANSWER>
  - 0.59  <NAME> was born <ANSWER>
  - 0.53  <ANSWER> <NAME> was born
  - 0.50  - <NAME> ( <ANSWER>
  - 0.36  <NAME> ( <ANSWER> -
- INVENTOR:
  - 1.0   <ANSWER> invents <NAME>
  - 1.0   the <NAME> was invented by <ANSWER>
  - 1.0   <ANSWER> invented the <NAME> in
80. Experiments (cont.)
- DISCOVERER:
  - 1.0   when <ANSWER> discovered <NAME>
  - 1.0   <ANSWER>'s discovery of <NAME>
  - 0.9   <NAME> was discovered by <ANSWER> in
- DEFINITION:
  - 1.0   <NAME> and related <ANSWER>
  - 1.0   form of <ANSWER>, <NAME>
  - 0.94  as <NAME>, <ANSWER> and
81. Experiments (cont.)
- WHY-FAMOUS:
  - 1.0   <ANSWER> <NAME> called
  - 1.0   laureate <ANSWER> <NAME>
  - 0.71  <NAME> is the <ANSWER> of
- LOCATION:
  - 1.0   <ANSWER>'s <NAME>
  - 1.0   regional <ANSWER> <NAME>
  - 0.92  near <NAME> in <ANSWER>
- Depending on the question type, the system gets a high MRR (0.6-0.9), with higher results from use of the Web than the TREC QA collection.
82. Shortcomings and Extensions
- Need for POS and/or semantic types:
  - "Where are the Rocky Mountains?"
  - "Denver's new airport, topped with white fiberglass cones in imitation of the Rocky Mountains in the background, continues to lie empty"
  - The pattern <NAME> in <ANSWER> extracts "background".
- An NE tagger and/or ontology could enable the system to determine that "background" is not a location.
83. Shortcomings (cont.)
- Long-distance dependencies:
  - "Where is London?"
  - "London, which has one of the busiest airports in the world, lies on the banks of the river Thames"
  - would require a pattern like: <QUESTION>, (<any_word>), lies on <ANSWER>
- The abundance and variety of Web data helps the system find an instance of its patterns without losing answers to long-distance dependencies.
84. Shortcomings (cont.)
- The system currently has only one anchor word:
  - Doesn't work for question types requiring multiple words from the question to be in the answer.
  - "In which county does the city of Long Beach lie?"
  - "Long Beach is situated in Los Angeles County"
  - required pattern: <Q_TERM_1> is situated in <ANSWER> <Q_TERM_2>
- Does not use case:
  - "What is a micron?"
  - "...a spokesman for Micron, a maker of semiconductors, said SIMMs are..."
  - If "Micron" had been capitalized in the question, this would be a perfect answer.
85. QA Typology from ISI (USC)
- Typology of typical question forms: 94 nodes (47 leaf nodes).
- Analyzed 17,384 questions (from answers.com).
86. Question Answering Example
- How hot does the inside of an active volcano get?
- get(TEMPERATURE, inside(volcano(active)))
- "lava fragments belched out of the mountain were as hot as 300 degrees Fahrenheit"
- fragments(lava, TEMPERATURE(degrees(300)), belched(out, mountain))
- volcano ISA mountain
- lava ISPARTOF volcano -> lava inside volcano
- fragments of lava HAVEPROPERTIESOF lava
- The needed semantic information is in WordNet definitions, and was successfully translated into a form that was used for rough proofs.
87. References
- Michele Banko, Eric Brill, Susan Dumais, Jimmy Lin. "AskMSR: Question Answering Using the Worldwide Web." In Proceedings of the 2002 AAAI Spring Symposium on Mining Answers from Texts and Knowledge Bases, March 2002. http://www.ai.mit.edu/people/jimmylin/publications/Banko-etal-AAAI02.pdf
- Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng. "Web Question Answering: Is More Always Better?" http://research.microsoft.com/sdumais/SIGIR2002-QA-Submit-Conf.pdf
- D. Ravichandran and E.H. Hovy. "Learning Surface Text Patterns for a Question Answering System." In Proceedings of the ACL conference, July 2002.
88. Harder Questions
- Factoid question answering is really pretty silly.
- A more interesting task is one where the answers are fluid and depend on the fusion of material from disparate texts over time:
  - Who is Condoleezza Rice?
  - Who is Mahmoud Abbas?
  - Why was Arafat flown to Paris?