Answering Questions by Computer - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Answering Questions by Computer


1
Answering Questions by Computer
2
Terminology: Question Type
  • Question Type: an idiomatic categorization of
    questions for purposes of distinguishing between
    different processing strategies and/or answer
    formats
  • E.g. TREC 2003:
  • FACTOID: How far is it from Earth to Mars?
  • LIST: List the names of chewing gums
  • DEFINITION: Who is Vlad the Impaler?
  • Other possibilities:
  • RELATIONSHIP: What is the connection between
    Valentina Tereshkova and Sally Ride?
  • SUPERLATIVE: What is the largest city on Earth?
  • YES-NO: Is Saddam Hussein alive?
  • OPINION: What do most Americans think of gun
    control?
  • CAUSE-EFFECT: Why did Iraq invade Kuwait?

3
Terminology: Answer Type
  • Answer Type: the class of object (or rhetorical
    type of sentence) sought by the question. E.g.
  • PERSON (from Who ...)
  • PLACE (from Where ...)
  • DATE (from When ...)
  • NUMBER (from How many ...)
  • but also
  • EXPLANATION (from Why ...)
  • METHOD (from How ...)
  • Answer types are usually tied intimately to the
    classes recognized by the system's Named Entity
    Recognizer.

4
Terminology: Question Focus
  • Question Focus: the property or entity that is
    being sought by the question.
  • E.g.
  • In what state is the Grand Canyon?
  • What is the population of Bulgaria?
  • What colour is a pomegranate?

5
Terminology: Question Topic
  • Question Topic: the object (person, place, ...) or
    event that the question is about. The question
    might well be about a property of the topic,
    which will be the question focus.
  • E.g. What is the height of Mt. Everest?
  • height is the focus
  • Mt. Everest is the topic

6
Terminology: Candidate Passage
  • Candidate Passage: a text passage (anything from
    a single sentence to a whole document) retrieved
    by a search engine in response to a question.
  • Depending on the query and kind of index used,
    there may or may not be a guarantee that a
    candidate passage has any candidate answers.
  • Candidate passages will usually have associated
    scores, from the search engine.

7
Terminology: Candidate Answer
  • Candidate Answer: in the context of a question,
    a small quantity of text (anything from a single
    word to a sentence or bigger, but usually a noun
    phrase) that is of the same type as the Answer
    Type.
  • In some systems, the type match may be
    approximate, if there is the concept of
    confusability.
  • Candidate answers are found in candidate passages
  • E.g.
  • 50
  • Queen Elizabeth II
  • September 8, 2003
  • by baking a mixture of flour and water

8
Terminology: Authority List
  • Authority List (or File): a collection of
    instances of a class of interest, used to test a
    term for class membership.
  • Instances should be derived from an authoritative
    source and be as close to complete as possible.
  • Ideally, class is small, easily enumerated and
    with members with a limited number of lexical
    forms.
  • Good
  • Days of week
  • Planets
  • Elements
  • Good statistically, but difficult to get 100%
    recall
  • Animals
  • Plants
  • Colours
  • Problematic
  • People
  • Organizations
  • Impossible
  • All numeric quantities
  • Explanations and other clausal quantities

9
Essence of Text-based QA
(Single source answers)
  • Need to find a passage that answers the question.
  • Find a candidate passage (search)
  • Check that semantics of passage and question
    match
  • Extract the answer

10
Ranking Candidate Answers
Q066: Name the first private citizen to fly in
space.
  • Answer type: Person
  • Text passage
  • Among them was Christa McAuliffe, the first
    private citizen to fly in space. Karen Allen,
    best known for her starring role in Raiders of
    the Lost Ark, plays McAuliffe. Brian Kerwin is
    featured as shuttle pilot Mike Smith...

11
Answer Extraction
  • Also called Answer Selection/Pinpointing
  • Given a question and candidate passages, the
    process of selecting and ranking candidate
    answers.
  • Usually, candidate answers are those terms in the
    passages which have the same answer type as that
    generated from the question
  • Ranking the candidate answers depends on
    assessing how well the passage context relates to
    the question
  • 3 Approaches
  • Heuristic features
  • Shallow parse fragments
  • Logical proof

12
Features for Answer Ranking
  • Number of question terms matched in the answer
    passage
  • Number of question terms matched in the same
    phrase as the candidate answer
  • Number of question terms matched in the same
    sentence as the candidate answer
  • Flag set to 1 if the candidate answer is followed
    by a punctuation sign
  • Number of question terms matched, separated from
    the candidate answer by at most three words
    and one comma
  • Number of terms occurring in the same order in
    the answer passage as in the question
  • Average distance from candidate answer to
    question term matches

SIGIR 01
13
Heuristics for Answer Ranking in the Lasso System
  • Same_Word_Sequence_score: number of words from
    the question that are recognized in the same
    sequence in the passage.
  • Punctuation_sign_score: a flag set to 1 if the
    candidate answer is followed by a punctuation
    sign.
  • Comma_3_word_score: measures the number of
    question words that follow the candidate, if the
    candidate is followed by a comma.
  • Same_parse_subtree_score: number of question
    words found in the parse sub-tree of the answer.
  • Same_sentence_score: number of question words
    found in the answer's sentence.
  • Distance_score: adds the distance (measured in
    number of words) between the answer candidate and
    the other keywords in the window.
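
A minimal Python sketch of the heuristics above (illustrative only, not the Lasso implementation; the function and feature names are invented for the example):

import re

def lasso_scores(question_terms, passage_tokens, cand_start, cand_end):
    """Score a candidate answer spanning passage_tokens[cand_start:cand_end]."""
    qset = {t.lower() for t in question_terms}

    # Same_sentence_score: question words in the candidate's sentence
    # (here the whole passage stands in for the sentence).
    same_sentence = sum(1 for t in passage_tokens if t.lower() in qset)

    # Same_word_sequence_score: longest run of passage tokens that reproduce
    # a prefix of the question in order (a simplification).
    qlist = [t.lower() for t in question_terms]
    best = run = qi = 0
    for t in passage_tokens:
        if qi < len(qlist) and t.lower() == qlist[qi]:
            qi += 1
            run += 1
            best = max(best, run)
        else:
            run = qi = 0
    same_word_sequence = best

    # Punctuation_sign_score: 1 if the candidate is followed by punctuation.
    follows = passage_tokens[cand_end] if cand_end < len(passage_tokens) else ""
    punctuation = 1 if re.fullmatch(r"[.,;:!?]", follows) else 0

    # Comma_3_word_score: question words among the three tokens after a comma
    # that immediately follows the candidate.
    comma_3 = 0
    if follows == ",":
        comma_3 = sum(1 for t in passage_tokens[cand_end + 1:cand_end + 4]
                      if t.lower() in qset)

    # Distance_score: total distance in tokens between the candidate and each
    # matched question word (smaller is better).
    distance = sum(min(abs(i - cand_start), abs(i - (cand_end - 1)))
                   for i, t in enumerate(passage_tokens) if t.lower() in qset)

    return {"same_word_sequence": same_word_sequence,
            "punctuation_sign": punctuation,
            "comma_3_words": comma_3,
            "same_sentence": same_sentence,
            "distance": distance}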

14
Heuristics for Answer Ranking in the Lasso System
(continued)
  • Finally..

15
Evaluation
  • Evaluation of this kind of system is usually
    based on some kind of TREC-like metric.
  • In Q/A the most frequent metric is
  • Mean reciprocal rank
  • You're allowed to return N answers. Your score is
    based on 1/rank of the first right answer.
  • Averaged over all the questions you answer.
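
A small sketch of the metric, assuming a caller-supplied correctness judgment (this is standard MRR, not tied to any particular system):

def mean_reciprocal_rank(ranked_answers_per_question, is_correct):
    """ranked_answers_per_question: one ranked answer list per question.
    is_correct(question_index, answer) -> bool is assumed to be supplied
    by the caller (e.g. from TREC judgments)."""
    total = 0.0
    for qi, answers in enumerate(ranked_answers_per_question):
        for rank, ans in enumerate(answers, start=1):
            if is_correct(qi, ans):
                total += 1.0 / rank   # score for this question
                break                 # only the first correct answer counts
    return total / len(ranked_answers_per_question)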

16
Answer Types and Modifiers
Name 5 French Cities
  • Most likely there is no type for French Cities
  • So will look for CITY
  • include French/France in bag of words, and hope
    for the best
  • include French/France in bag of words, retrieve
    documents, and look for evidence (deep parsing,
    logic)
  • use high-precision Language Identification on
    results
  • If you have a list of French cities, could either
  • Filter results by list
  • Use Answer-Based QA (see later)
  • Use longitude/latitude information of cities and
    countries

17
Answer Types and Modifiers
Name a female figure skater
  • Most likely there is no type for "female figure
    skater"
  • Most likely there is no type for "figure skater"
  • Look for PERSON, with query terms "figure",
    "skater"
  • What to do about "female"? Two approaches.
  • Include "female" in the bag-of-words.
  • Relies on logic that if femaleness is an
    interesting property, it might well be mentioned
    in answer passages.
  • Does not apply to, say, "singer".
  • Leave out "female" but test candidate answers for
    gender.
  • Needs either an authority file or a heuristic
    test.
  • Test may not be definitive.

18
Part II - Specific Approaches
  • By Genre
  • Statistical QA
  • Pattern-based QA
  • Web-based QA
  • Answer-based QA (TREC only)
  • By System
  • SMU
  • LCC
  • USC-ISI
  • Insight
  • Microsoft
  • IBM Statistical
  • IBM Rule-based

19
Statistical QA
  • Use statistical distributions to model
    likelihoods of answer type and answer
  • E.g. IBM (Ittycheriah, 2001) see later section

20
Pattern-based QA
  • For a given question type, identify the typical
    syntactic constructions used in text to express
    answers to such questions
  • Typically very high precision, but a lot of work
    to get decent recall

21
Web-Based QA
  • Exhaustive string transformations
  • Brill et al. 2002
  • Learning
  • Radev et al. 2001

22
Answer-Based QA
  • Problem: Sometimes it is very easy to find an
    answer to a question using resource A, but the
    task demands that you find it in resource B.
  • Solution: First find the answer in resource A,
    then locate the same answer, along with original
    question terms, in resource B.
  • Artificial problem, but real for TREC
    participants.

23
Answer-Based QA
  • Web-Based solution

When a QA system looks for answers within a
relatively small textual collection, the chance
of finding strings/sentences that closely match
the question string is small. However, when a QA
system looks for strings/sentences that closely
match the question string on the web, the chance
of finding correct answer is much higher.

Hermjakob et al. 2002
  • Why this is true
  • The Web is much larger than the TREC Corpus
    (3,000:1)
  • TREC questions are generated from Web logs, and
    the style of language (and subjects of interest)
    in these logs are more similar to the Web content
    than to newswire collections.

24
Answer-Based QA
  • Database/Knowledge-base/Ontology solution
  • When question syntax is simple and reliably
    recognizable, can express as a logical form
  • Logical form represents entire semantics of
    question, and can be used to access structured
    resource
  • WordNet
  • On-line dictionaries
  • Tables of facts & figures
  • Knowledge-bases such as Cyc
  • Having found the answer:
  • construct a query with original question terms +
    answer
  • Retrieve passages
  • Tell Answer Extraction the answer it is looking
    for

25
Approaches of Specific Systems
  • SMU Falcon
  • LCC
  • USC-ISI
  • Insight
  • Microsoft
  • IBM

Note: Some of the slides and/or examples in
these sections are taken from papers or
presentations from the respective system authors.
26
SMU Falcon
Harabagiu et al. 2000
27
SMU Falcon
  • From question, dependency structure called
    question semantic form is created
  • Query is Boolean conjunction of terms
  • From answer passages that contain at least one
    instance of answer type, generate answer semantic
    form
  • 3 processing loops
  • Loop 1
  • Triggered when too few or too many passages are
    retrieved from search engine
  • Loop 2
  • Triggered when question semantic form and answer
    semantic form cannot be unified
  • Loop 3
  • Triggered when unable to perform abductive proof
    of answer correctness

28
SMU Falcon
  • Loops provide opportunities to perform
    alternations
  • Loop 1: morphological expansions and
    nominalizations
  • Loop 2: lexical alternations (synonyms, direct
    hypernyms and hyponyms)
  • Loop 3: paraphrases
  • Evaluation (Pasca & Harabagiu, 2001). Increase
    in accuracy in the 50-byte task in TREC-9:
  • Loop 1: 40%
  • Loop 2: 52%
  • Loop 3: 8%
  • Combined: 76%

29
LCC
  • Moldovan & Rus, 2001
  • Uses Logic Prover for answer justification
  • Question logical form
  • Candidate answers in logical form
  • XWN glosses
  • Linguistic axioms
  • Lexical chains
  • Inference engine attempts to verify answer by
    negating question and proving a contradiction
  • If proof fails, predicates in question are
    gradually relaxed until proof succeeds or
    associated proof score is below a threshold.

30
LCC Lexical Chains
  • Q1518: What year did Marco Polo travel to Asia?
  • Answer: Marco Polo divulged the truth after
    returning in 1292 from his travels, which
    included several months on Sumatra
  • Lexical Chains:
  • (1) travel_to:v#1 -> GLOSS -> travel:v#1
    -> RGLOSS -> travel:n#1
  • (2) travel_to:v#1 -> GLOSS -> travel:v#1
    -> HYPONYM -> return:v#1
  • (3) Sumatra:n#1 -> ISPART -> Indonesia:n#1
    -> ISPART -> Southeast_Asia:n#1 -> ISPART ->
    Asia:n#1
  • Q1570: What is the legal age to vote in
    Argentina?
  • Answer: Voting is mandatory for all Argentines
    aged over 18.
  • Lexical Chains: (1) legal:a#1 -> GLOSS ->
    rule:n#1 -> RGLOSS -> mandatory:a#1
  • (2) age:n#1 -> RGLOSS -> aged:a#3
  • (3) Argentine:a#1 -> GLOSS -> Argentina:n#1

31
LCC Logic Prover
  • Question:
  • Which company created the Internet Browser
    Mosaic?
  • QLF: (_organization_AT(x2)) & company_NN(x2) &
    create_VB(e1,x2,x6) & Internet_NN(x3) &
    browser_NN(x4) & Mosaic_NN(x5) &
    nn_NNC(x6,x3,x4,x5)
  • Answer passage:
  • ... Mosaic , developed by the National Center for
    Supercomputing Applications ( NCSA ) at the
    University of Illinois at Urbana - Champaign ...
  • ALF: ... Mosaic_NN(x2) & develop_VB(e2,x2,x31) &
    by_IN(e2,x8) & National_NN(x3) & Center_NN(x4) &
    for_NN(x5) & Supercomputing_NN(x6) &
    application_NN(x7) & nn_NNC(x8,x3,x4,x5,x6,x7) &
    NCSA_NN(x9) & at_IN(e2,x15) & University_NN(x10) &
    of_NN(x11) & Illinois_NN(x12) & at_NN(x13) &
    Urbana_NN(x14) & nn_NNC(x15,x10,x11,x12,x13,x14) &
    Champaign_NN(x16) ...
  • Lexical chains: develop <-> make and make <->
    create
  • exists x2 x3 x4 all e2 x1 x7 (develop_vb(e2,x7,x1)
    <-> make_vb(e2,x7,x1) & something_nn(x1) &
    new_jj(x1) & such_jj(x1) & product_nn(x2) &
    or_cc(x4,x1,x3) & mental_jj(x3) & artistic_jj(x3) &
    creation_nn(x3)).
  • all e1 x1 x2 (make_vb(e1,x1,x2) <->
    create_vb(e1,x1,x2) & manufacture_vb(e1,x1,x2) &
    man-made_jj(x2) & product_nn(x2)).
  • Linguistic axioms:
  • all x0 (mosaic_nn(x0) -> internet_nn(x0) &
    browser_nn(x0))

32
USC-ISI
  • Textmap system
  • Ravichandran and Hovy, 2002
  • Hermjakob et al. 2003
  • Use of Surface Text Patterns
  • When was X born? ->
  • Mozart was born in 1756
  • Gandhi (1869-1948)
  • Can be captured in expressions
  • <NAME> was born in <BIRTHDATE>
  • <NAME> ( <BIRTHDATE> -
  • These patterns can be learned

33
USC-ISI TextMap
  • Use bootstrapping to learn patterns.
  • For an identified question type (When was X
    born?), start with known answers for some values
    of X
  • Mozart 1756
  • Gandhi 1869
  • Newton 1642
  • Issue Web search engine queries (e.g. "Mozart
    1756")
  • Collect top 1000 documents
  • Filter, tokenize, smooth etc.
  • Use suffix tree constructor to find best
    substrings, e.g.
  • Mozart (1756-1791)
  • Filter
  • Mozart (1756-
  • Replace query strings with e.g. <NAME> and
    <ANSWER>
  • Determine precision of each pattern
  • Find documents with just question term (Mozart)
  • Apply patterns and calculate precision
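
A rough Python sketch of this bootstrapping loop, under the assumption that snippet retrieval is provided by the caller; the placeholder substitution stands in for the suffix-tree step, and the helper names are invented:

import re
from collections import Counter

def learn_patterns(examples, snippets_for):
    """examples: (name, answer) pairs, e.g. [("Mozart", "1756"), ("Gandhi", "1869")].
    snippets_for(name, answer) is assumed to return text snippets retrieved by
    a web query containing both strings."""
    counts = Counter()
    for name, answer in examples:
        for snippet in snippets_for(name, answer):
            for sent in snippet.split("."):
                if name in sent and answer in sent:
                    # Generalize: substitute placeholders for the two anchors.
                    pat = sent.strip().replace(name, "<NAME>").replace(answer, "<ANSWER>")
                    counts[pat] += 1
    return counts   # frequent substrings are kept as candidate patterns

def pattern_precision(pattern, name, true_answer, snippets):
    """Estimate a pattern's precision on snippets retrieved with the question
    term only (e.g. just "Mozart")."""
    assert "<ANSWER>" in pattern
    regex = r"(.+?)".join(re.escape(p) for p in pattern.split("<ANSWER>"))
    regex = regex.replace(re.escape("<NAME>"), re.escape(name))
    hits = correct = 0
    for snippet in snippets:
        m = re.search(regex, snippet)
        if m:
            hits += 1
            correct += int(true_answer in m.group(1))
    return correct / hits if hits else 0.0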

34
USC-ISI TextMap
  • Finding Answers
  • Determine Question type
  • Perform IR Query
  • Do sentence segmentation and smoothing
  • Replace question term by question tag
  • i.e. replace "Mozart" with <NAME>
  • Search for instances of patterns associated with
    question type
  • Select words matching <ANSWER>
  • Assign scores according to precision of pattern

35
Insight
  • Soubbotin, 2002; Soubbotin & Soubbotin, 2003.
  • Performed very well in TREC10/11
  • Comprehensive and systematic use of indicative
    patterns
  • E.g.
  • capitalized word; open paren; 4 digits; dash; 4
    digits; close paren
  • matches
  • Mozart (1756-1791)
  • The patterns are broader than named entities
  • Semantics in syntax
  • Patterns have intrinsic scores (reliability),
    independent of question
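
For illustration, the example pattern above can be written as a regular expression; the regex below is a hedged sketch, not the Insight system's actual pattern language:

import re

# Capitalized word followed by "(dddd-dddd)", e.g. a name with birth-death years.
BIRTH_DEATH = re.compile(r"\b([A-Z][a-z]+)\s*\(\s*(\d{4})\s*-\s*(\d{4})\s*\)")

m = BIRTH_DEATH.search("The great composer Mozart (1756-1791) achieved fame early.")
if m:
    name, birth, death = m.groups()
    print(name, birth, death)   # Mozart 1756 1791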

36
Insight
  • Patterns with more sophisticated internal
    structure are more indicative of answer
  • 2/3 of their correct entries in TREC10 were
    answered by patterns
  • E.g.
  • a: countries
  • b: official posts
  • w: proper names (first and last)
  • e: titles or honorifics
  • Patterns for "Who is the President (Prime
    Minister) of given country?"
  • abeww
  • ewwdb,a
  • b,aeww
  • Definition questions (A is the primary query term,
    X is the answer)
  • <A comma a/an/the X comma/period>
  • For Moulin Rouge, a cabaret
  • <X comma also called A comma>
  • For naturally occurring gas called methane
  • <A is/are a/an/the X>

37
Insight
  • Emphasis on shallow techniques, lack of NLP
  • Look in the vicinity of a text string potentially
    matching a pattern for "zeroing" terms that
    disqualify the match, e.g. for occupational roles:
  • Former
  • Elect
  • Deputy
  • Negation
  • Comments
  • Relies on redundancy of large corpus
  • Works for factoid question types of TREC-QA; not
    clear how it extends
  • Not clear how they match questions to patterns
  • Named entities within patterns have to be
    recognized

38
Microsoft
  • Data-Intensive QA. Brill et al. 2002
  • Overcoming the surface string mismatch between
    the question formulation and the string
    containing the answer
  • Approach based on the assumption/intuition that
    someone on the Web has answered the question in
    the same way it was asked.
  • Want to avoid dealing with
  • Lexical, syntactic, semantic relationships
    (between Q & A)
  • Anaphora resolution
  • Synonymy
  • Alternate syntax
  • Indirect answers
  • Take advantage of redundancy on Web, then project
    to TREC corpus (Answer-based QA)

39
Microsoft AskMSR
  • Formulate multiple queries; each rewrite has an
    intrinsic score. E.g. for "What is relative
    humidity?":
  • "is relative humidity", LEFT, 5
  • "relative is humidity", RIGHT, 5
  • "relative humidity is", RIGHT, 5
  • "relative humidity", NULL, 2
  • relative AND humidity, NULL, 1
  • Get top 100 documents from Google
  • Extract n-grams from document summaries
  • Score n-grams by summing the scores of the
    rewrites they came from
  • Use tiling to merge n-grams
  • Search for supporting documents in TREC corpus

40
Microsoft AskMSR
  • Question is "What is the rainiest place on
    Earth?"
  • Answer from the Web is "Mount Waialeale"
  • Passage in TREC corpus is "In misty Seattle,
    Wash., last year, 32 inches of rain fell. Hong
    Kong gets about 80 inches a year, and even Pago
    Pago, noted for its prodigious showers, gets only
    about 196 inches annually. (The titleholder,
    according to the National Geographic Society, is
    Mount Waialeale in Hawaii, where about 460 inches
    of rain falls each year.)"
  • Very difficult to imagine getting this passage by
    other means

41
IBM Statistical QA (Ittycheriah, 2001)
q = question, a = answer, c = correctness, e = answer type
p(c|q,a) = Σ_e p(c,e|q,a) = Σ_e p(c|e,q,a) p(e|q,a)
p(e|q,a) is the answer type model (ATM);
p(c|e,q,a) is the answer selection model (ASM)
(ASM)
  • ATM predicts, from the question and a proposed
    answer, the answer type they both satisfy
  • Given a question, an answer, and the predicted
    answer type, ASM seeks to model the correctness
    of this configuration.
  • Distributions are modelled using a maximum
    entropy formulation
  • Training data: human judgments
  • For ATM, 13K questions annotated with 31
    categories
  • For ASM, 5K questions from TREC plus trivia
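
A schematic Python sketch of the decomposition above; the two component models are placeholders here (the real ATM and ASM are trained maximum-entropy models over many features):

ANSWER_TYPES = ["PERSON", "PLACE", "DATE"]   # illustrative subset of the 31 categories

def answer_type_model(e, q, a):
    """Stand-in for p(e|q,a): a distribution over answer types."""
    return 1.0 / len(ANSWER_TYPES)           # uniform placeholder

def answer_selection_model(e, q, a):
    """Stand-in for p(c=correct|e,q,a)."""
    return 0.5                               # placeholder

def p_correct(q, a):
    # p(c|q,a) = sum over e of p(c|e,q,a) * p(e|q,a)
    return sum(answer_selection_model(e, q, a) * answer_type_model(e, q, a)
               for e in ANSWER_TYPES)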

42
IBM Statistical QA (Ittycheriah)
  • Question Analysis (by ATM)
  • Selects one out of 31 categories
  • Search
  • Question expanded by Local Context Analysis
  • Top 1000 documents retrieved
  • Passage Extraction: Top 100 passages that
  • Maximize question word match
  • Have desired answer type
  • Minimize dispersion of question words
  • Have similar syntactic structure to question
  • Answer Extraction
  • Candidate answers ranked using ASM

43
IBM Rule-based
  • Predictive Annotation (Prager 2000, Prager 2003)
  • Want to make sure passages retrieved by search
    engine have at least one candidate answer
  • Recognize that a candidate answer is of the correct
    answer type, which corresponds to a label (or
    several) generated by the Named Entity Recognizer
  • Annotate entire corpus and index semantic labels
    along with text
  • Identify answer types in questions and include
    corresponding labels in queries
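
A toy sketch of the indexing idea, assuming a trivial stand-in NE recognizer and a simple inverted index; the names and the recognizer are illustrative, not PIQUANT's implementation:

from collections import defaultdict

def toy_ner(token):
    """Stand-in for a real named-entity recognizer."""
    return ["PERSON"] if token[:1].isupper() else []

def build_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[token.lower()].add(doc_id)
            for label in toy_ner(token):      # index the semantic label alongside the text
                index[label].add(doc_id)
    return index

def search(index, query_terms):
    """Boolean AND over words and answer-type labels."""
    sets = [index.get(t if t.isupper() else t.lower(), set()) for t in query_terms]
    return set.intersection(*sets) if sets else set()

docs = {1: "baseball had been invented by Doubleday between 1839 and 1841"}
index = build_index(docs)
print(search(index, ["PERSON", "invented", "baseball"]))   # {1}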

44
IBM PIQUANT
  • Predictive Annotation
  • E.g. Question is "Who invented baseball?"
  • "Who" can map to PERSON or ORGANIZATION
  • Suppose we assume only people invent things (it
    doesn't really matter).
  • So "Who invented baseball?" -> PERSON invent
    baseball

Consider the text: "... but its conclusion was based
largely on the recollections of a man named Abner
Graves, an elderly mining engineer, who reported
that baseball had been "invented" by Doubleday
between 1839 and 1841."
45
IBM PIQUANT
  • Predictive Annotation
  • Previous example
  • Who invented baseball? -> PERSON invent
    baseball
  • However, same structure is equally effective at
    answering
  • What sport did Doubleday invent? -> SPORT
    invent Doubleday

46
IBM Rule-Based
Handling Subsumption & Disjunction
  • If an entity is of a type which has a parent
    type, then how is annotation done?
  • If a proposed answer type has a parent type, then
    what answer type should be used?
  • If an entity is ambiguous then what should the
    annotation be?
  • If the answer type is ambiguous, then what should
    be used?

47
Subsumption & Disjunction
  • Consider New York City: both a CITY and a PLACE
  • To answer Where did John Lennon die?, it needs
    to be a PLACE
  • To answer In what city is the Empire State
    Building?, it needs to be a CITY.
  • Do NOT want to do subsumption calculation in
    search engine
  • Two scenarios
  • 1. Expand the Answer Type and use the most
    specific entity annotation
  • 1A: (CITY ∨ PLACE) John_Lennon die matches
    CITY
  • 1B: CITY Empire_State_Building matches CITY
  • Or
  • 2. Use the most specific Answer Type and multiple
    annotations of NYC
  • 2A: PLACE John_Lennon die matches (CITY ∨
    PLACE)
  • 2B: CITY Empire_State_Building matches
    (CITY ∨ PLACE)
  • Case 2 preferred for simplicity, because the
    disjunction in 1 should contain all hyponyms of
    PLACE, while the disjunction in 2 should contain
    all hypernyms of CITY
  • Choice 2 suggests we can use a disjunction in the
    answer type to represent ambiguity
  • Who invented the laser? -> (PERSON ∨
    ORGANIZATION) invent laser
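
A tiny sketch of scenario 2, with illustrative annotations: each entity carries all of its hypernym labels, an ambiguous answer type is a set (disjunction) of labels, and a candidate matches if the two sets intersect:

ANNOTATIONS = {
    "New York City": {"CITY", "PLACE"},   # most specific label plus its hypernyms
    "Doubleday": {"PERSON"},
}

def matches(answer_type_disjunction, entity):
    return bool(answer_type_disjunction & ANNOTATIONS.get(entity, set()))

print(matches({"CITY"}, "New York City"))                # True  (In what city ...?)
print(matches({"PLACE"}, "New York City"))               # True  (Where ...?)
print(matches({"PERSON", "ORGANIZATION"}, "Doubleday"))  # True  (Who invented ...?)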

48
Clausal classes
  • Any structure that can be recognized in text can
    be annotated.
  • Quotations
  • Explanations
  • Methods
  • Opinions
  • Any semantic class label used in annotation can
    be indexed, and hence used as a target of search
  • What did Karl Marx say about religion?
  • Why is the sky blue?
  • How do you make bread?
  • What does Arnold Schwarzenegger think about
    global warming?

49
Named Entity Recognition
50
IBM
  • Predictive Annotation: Improving Precision at no
    cost to Recall
  • E.g. Question is "Where is Belize?"
  • "Where" can map to (CONTINENT, WORLDREGION,
    COUNTRY, STATE, CITY, CAPITAL, LAKE, RIVER,
    ...).
  • But we know Belize is a country.
  • So "Where is Belize?" -> (CONTINENT ∨
    WORLDREGION) Belize
  • "Belize" occurs 1068 times in the TREC corpus
  • "Belize" and PLACE co-occur in only 537 sentences
  • "Belize" and CONTINENT or WORLDREGION co-occur in
    only 128 sentences

51
(No Transcript)
52
(No Transcript)
53
Virtual Annotation (Prager 2001)
  • Use WordNet to find all candidate answers
    (hypernyms)
  • Use corpus co-occurrence statistics to select
    best ones
  • Rather like approach to WSD by Mihalcea and
    Moldovan (1999)

54
Parentage of nematode
Level Synset
0 nematode, roundworm
1 worm
2 invertebrate
3 animal, animate being, beast, brute, creature, fauna
4 life form, organism, being, living thing
5 entity, something
55
Parentage of meerkat
Level Synset
0 meerkat, mierkat
1 viverrine, viverrine mammal
2 carnivore
3 placental, placental mammal, eutherian, eutherian mammal
4 mammal
5 vertebrate, craniate
6 chordate
7 animal, animate being, beast, brute, creature, fauna
8 life form, organism, being, living thing
9 entity, something
56
Natural Categories
  • "Basic Objects in Natural Categories", Rosch et
    al. (1976)
  • According to psychological testing, these are
    categorization levels of intermediate specificity
    that people tend to use in unconstrained
    settings.

57
What is this?
58
What can we conclude?
  • There are descriptive terms that people are drawn
    to use naturally.
  • We can expect to find instances of these in text,
    in the right contexts.
  • These terms will serve as good answers.

59
Virtual Annotation (cont.)
  • Find all parents of query term in WordNet
  • Look for co-occurrences of query term and parent
    in text corpus
  • Expect to find snippets such as
    "meerkats and other Y"
  • Many different phrasings are possible, so we just
    look for proximity, rather than parse.
  • Scoring:
  • Count co-occurrences of each parent with the search
    term, and divide by level number (only levels >
    1), generating the Level-Adapted Count (LAC).
  • Exclude very highest levels (too general).
  • Select the parent with the highest LAC plus any
    others with LAC within 20%.
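
A rough sketch of this scoring, assuming NLTK's WordNet interface and a caller-supplied co-occurrence counter; the level cut-offs and the 20% margin follow the slide, but the rest is illustrative:

from nltk.corpus import wordnet as wn

def virtual_annotation(term, cooccurrence_count, skip_top_levels=2):
    path = wn.synsets(term, pos=wn.NOUN)[0].hypernym_paths()[0]  # root ... term
    path = list(reversed(path))                                  # term ... root
    scored = []
    for level, synset in enumerate(path):
        if level < 1 or level >= len(path) - skip_top_levels:
            continue            # skip the term itself and the very general top levels
        for parent in synset.lemma_names():
            parent = parent.replace("_", " ")
            lac = cooccurrence_count(term, parent) / level       # Level-Adapted Count
            scored.append((lac, parent))
    scored.sort(reverse=True)
    best = scored[0][0] if scored else 0
    return [p for lac, p in scored if lac >= 0.8 * best]         # within 20% of the best

# e.g. virtual_annotation("nematode", my_counter) might return ["worm"]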

60
Parentage of nematode
Level Synset
0 nematode, roundworm
1 worm(13)
2 invertebrate
3 animal(2), animate being, beast, brute, creature, fauna
4 life form(2), organism(3), being, living thing
5 entity, something
61
Parentage of meerkat
Level Synset
0 meerkat, mierkat
1 viverrine, viverrine mammal
2 carnivore
3 placental, placental mammal, eutherian, eutherian mammal
4 mammal
5 vertebrate, craniate
6 chordate
7 animal(2), animate being, beast, brute, creature, fauna
8 life form, organism, being, living thing
9 entity, something
62
Sample Answer Passages
Use Answer-based QA to locate answers
  • What is a nematode? ->
  • Such genes have been found in nematode worms but
    not yet in higher animals.
  • What is a meerkat? ->
  • South African golfer Butch Kruger had a good
    round going in the central Orange Free State
    trials, until a mongoose-like animal grabbed his
    ball with its mouth and dropped down its hole.
    Kruger wrote on his card "Meerkat."

63
Use of Cyc as Sanity Checker
  • Cyc: Large knowledge-base and inference engine
    (Lenat 1995)
  • A post-hoc process for
  • Rejecting insane answers
  • How much does a grey wolf weigh?
  • 300 tons
  • Boosting confidence for sane answers
  • Sanity checker invoked with
  • Predicate, e.g. weight
  • Focus, e.g. grey wolf
  • Candidate value, e.g. 300 tons
  • Sanity checker returns
  • Sane: within 10% of the value in Cyc
  • Insane: outside of the reasonable range
  • Plan to use distributions instead of ranges
  • Don't know
  • Confidence score highly boosted when answer is
    sane
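
An illustrative sanity-check sketch in the same spirit, with a stand-in lookup table instead of Cyc; the 10% margin mirrors the slide, and the reference value is invented for the example:

REFERENCE = {("weight", "grey wolf"): (30.0, "kg")}   # illustrative entry, not from Cyc

def sanity_check(predicate, focus, value, unit):
    ref = REFERENCE.get((predicate, focus))
    if ref is None:
        return "don't know"
    ref_value, ref_unit = ref
    if unit != ref_unit:
        return "don't know"          # a real checker would convert units
    return "sane" if abs(value - ref_value) <= 0.10 * ref_value else "insane"

print(sanity_check("weight", "grey wolf", 300_000.0, "kg"))   # insane (300 tons)
print(sanity_check("weight", "grey wolf", 32.0, "kg"))        # sane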

64
Cyc Sanity Checking Example
  • TREC11 Q: What is the population of Maryland?
  • Without sanity checking:
  • PIQUANT's top answer: 50,000
  • Justification: "Maryland's population is 50,000
    and growing rapidly."
  • Passage discusses an exotic species (nutria), not
    humans
  • With sanity checking:
  • Cyc knows the population of Maryland is 5,296,486
  • It rejects the top insane answers
  • PIQUANT's new top answer: 5.1 million, with very
    high confidence

65
AskMSR
  • Process the question by
  • Forming a search engine query from the original
    question
  • Detecting the answer type
  • Get some results
  • Extract answers of the right type based on
  • How often they occur

66
AskMSR
67
Step 1: Rewrite the questions
  • Intuition: The user's question is often
    syntactically quite close to sentences that
    contain the answer
  • Where is the Louvre Museum located?
  • The Louvre Museum is located in Paris
  • Who created the character of Scrooge?
  • Charles Dickens created the character of Scrooge.

68
Query rewriting
  • Classify question into seven categories
  • Who is/was/are/were?
  • When is/did/will/are/were ?
  • Where is/are/were ?
  • a. Hand-crafted category-specific transformation
    rules
  • e.g. For "Where" questions, move "is" to all
    possible locations
  • Look to the right of the query terms for the
    answer.
  • Where is the Louvre Museum located?
  • -> "is the Louvre Museum located"
  • -> "the is Louvre Museum located"
  • -> "the Louvre is Museum located"
  • -> "the Louvre Museum is located"
  • -> "the Louvre Museum located is"
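
A minimal sketch of the "move is" rewrite for Where-questions; the weights echo the earlier AskMSR example but are otherwise illustrative:

def where_rewrites(question):
    tokens = question.rstrip("?").split()
    assert tokens[0].lower() == "where" and tokens[1].lower() == "is"
    rest = tokens[2:]                    # e.g. ["the", "Louvre", "Museum", "located"]
    rewrites = []
    for i in range(len(rest) + 1):
        phrase = " ".join(rest[:i] + ["is"] + rest[i:])
        rewrites.append(('"%s"' % phrase, 5))      # exact-phrase query, weight 5
    rewrites.append((" AND ".join(rest), 1))       # bag-of-words fallback, weight 1
    return rewrites

for query, weight in where_rewrites("Where is the Louvre Museum located?"):
    print(weight, query)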

69
Step 2: Query search engine
  • Send all rewrites to a Web search engine
  • Retrieve top N answers (100-200)
  • For speed, rely just on the search engine's
    snippets, not the full text of the actual
    document

70
Step 3: Gathering N-Grams
  • Enumerate all N-grams (N = 1, 2, 3) in all retrieved
    snippets
  • Weight of an n-gram = sum of its occurrence counts,
    each weighted by the reliability (weight) of the
    rewrite rule that fetched the document
  • Example: "Who created the character of Scrooge?"
  • Dickens 117
  • Christmas Carol 78
  • Charles Dickens 75
  • Disney 72
  • Carl Banks 54
  • A Christmas 41
  • Christmas Carol 45
  • Uncle 31
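
A rough sketch of this step, assuming snippets are already paired with the weight of the rewrite that retrieved them:

from collections import Counter

def gather_ngrams(snippets_with_weights, max_n=3):
    """snippets_with_weights: iterable of (snippet_text, rewrite_weight)."""
    scores = Counter()
    for text, weight in snippets_with_weights:
        tokens = text.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                scores[" ".join(tokens[i:i + n])] += weight
    return scores

scores = gather_ngrams([("Charles Dickens created Scrooge", 5),
                        ("Scrooge by Charles Dickens", 2)])
print(scores["Charles Dickens"])   # 7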

71
Step 4: Filtering N-Grams
  • Each question type is associated with one or more
    data-type filters: regular expressions for
    answer types
  • Boost score of n-grams that match the expected
    answer type.
  • Lower score of n-grams that don't match.
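
A hedged sketch of such a filter; the regular expressions and boost/penalty factors are illustrative, not AskMSR's actual filters:

import re

TYPE_FILTERS = {
    "DATE": re.compile(r"\b\d{4}\b"),
    "PERSON": re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b"),
}

def filter_ngrams(scores, answer_type, boost=2.0, penalty=0.5):
    pattern = TYPE_FILTERS[answer_type]
    return {ng: s * (boost if pattern.search(ng) else penalty)
            for ng, s in scores.items()}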

72
Step 5: Tiling the Answers
  • Example (from the original slide graphic): the
    overlapping n-grams Charles Dickens (score 20),
    Dickens (score 15) and Mr Charles (score 10) are
    tiled into Mr Charles Dickens (score 45), and the
    old n-grams are discarded.
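
A simple greedy tiling sketch that reproduces the example above; it illustrates the idea rather than the exact AskMSR procedure:

def tile_pair(a, b):
    """Return the tiled string if one n-gram contains the other or a's suffix
    overlaps b's prefix, else None."""
    aw, bw = a.split(), b.split()
    if a in b:
        return b
    if b in a:
        return a
    for k in range(min(len(aw), len(bw)), 0, -1):
        if aw[-k:] == bw[:k]:
            return " ".join(aw + bw[k:])
    return None

def tile_answers(scores):
    items = dict(scores)
    merged = True
    while merged:
        merged = False
        for a in list(items):
            for b in list(items):
                if a == b or a not in items or b not in items:
                    continue
                t = tile_pair(a, b)
                if t:
                    new_score = items.pop(a) + items.pop(b)
                    items[t] = items.get(t, 0) + new_score
                    merged = True
                    break
            if merged:
                break
    return items

print(tile_answers({"Charles Dickens": 20, "Dickens": 15, "Mr Charles": 10}))
# e.g. {"Mr Charles Dickens": 45}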
73
Results
  • Standard TREC contest test-bed (TREC 2001): 1M
    documents, 900 questions
  • Technique does OK, not great (would have placed
    in the top 9 of 30 participants)
  • But with access to the Web, they do much better;
    would have come in second on TREC 2001

74
Issues
  • In many scenarios (e.g., monitoring an
    individual's email) we only have a small set of
    documents
  • Works best/only for Trivial Pursuit-style
    fact-based questions
  • Limited/brittle repertoire of
  • question categories
  • answer data types/filters
  • query rewriting rules

75
ISI: Surface patterns approach
  • Use of Characteristic Phrases
  • "When was <person> born?"
  • Typical answers:
  • "Mozart was born in 1756."
  • "Gandhi (1869-1948)..."
  • Suggests phrases like
  • "<NAME> was born in <BIRTHDATE>"
  • "<NAME> ( <BIRTHDATE> -"
  • as Regular Expressions can help locate correct
    answer

76
Use Pattern Learning
  • Example
  • The great composer Mozart (1756-1791) achieved
    fame at a young age
  • Mozart (1756-1791) was a genius
  • The whole world would always be indebted to the
    great music of Mozart (1756-1791)
  • Longest matching substring for all 3 sentences is
    "Mozart (1756-1791)"
  • Suffix tree would extract "Mozart (1756-1791)" as
    an output, with score of 3
  • Reminiscent of IE pattern learning

77
Pattern Learning (cont.)
  • Repeat with different examples of same question
    type
  • Gandhi 1869, Newton 1642, etc.
  • Some patterns learned for BIRTHDATE:
  • a. born in <ANSWER>, <NAME>
  • b. <NAME> was born on <ANSWER> ,
  • c. <NAME> ( <ANSWER> -
  • d. <NAME> ( <ANSWER> - )
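
A brief sketch of applying such learned patterns: each pattern carries a precision (values taken from the table on the next slide), and a candidate inherits the precision of the best matching pattern; the regex construction is illustrative:

import re

PATTERNS = [                           # (pattern, precision) from the BIRTHDATE table
    ("<NAME> was born on <ANSWER> ,", 0.85),
    ("<NAME> was born in <ANSWER>", 0.6),
    ("<NAME> ( <ANSWER> -", 0.36),
]

def extract_birthdate(name, sentences):
    """Return (precision, year) for the highest-precision pattern that matches."""
    best = (0.0, None)
    for pattern, precision in PATTERNS:
        parts = [re.escape(p).replace(re.escape("<NAME>"), re.escape(name))
                 for p in pattern.split("<ANSWER>")]
        regex = r"\s*(\d{4})\s*".join(parts)
        for sent in sentences:
            m = re.search(regex, sent)
            if m and precision > best[0]:
                best = (precision, m.group(1))
    return best

print(extract_birthdate("Mozart", ["Mozart ( 1756 - 1791 ) was a genius",
                                   "Mozart was born in 1756 in Salzburg"]))
# (0.6, '1756')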

78
Experiments
  • 6 different Q types
  • from Webclopedia QA Typology (Hovy et al., 2002a)
  • BIRTHDATE
  • LOCATION
  • INVENTOR
  • DISCOVERER
  • DEFINITION
  • WHY-FAMOUS

79
Experiments pattern precision
  • BIRTHDATE table:
  • 1.0 <NAME> ( <ANSWER> - )
  • 0.85 <NAME> was born on <ANSWER>,
  • 0.6 <NAME> was born in <ANSWER>
  • 0.59 <NAME> was born <ANSWER>
  • 0.53 <ANSWER> <NAME> was born
  • 0.50 - <NAME> ( <ANSWER>
  • 0.36 <NAME> ( <ANSWER> -
  • INVENTOR
  • 1.0 <ANSWER> invents <NAME>
  • 1.0 the <NAME> was invented by <ANSWER>
  • 1.0 <ANSWER> invented the <NAME> in

80
Experiments (cont.)
  • DISCOVERER
  • 1.0 when <ANSWER> discovered <NAME>
  • 1.0 <ANSWER>'s discovery of <NAME>
  • 0.9 <NAME> was discovered by <ANSWER> in
  • DEFINITION
  • 1.0 <NAME> and related <ANSWER>
  • 1.0 form of <ANSWER>, <NAME>
  • 0.94 as <NAME>, <ANSWER> and

81
Experiments (cont.)
  • WHY-FAMOUS
  • 1.0 <ANSWER> <NAME> called
  • 1.0 laureate <ANSWER> <NAME>
  • 0.71 <NAME> is the <ANSWER> of
  • LOCATION
  • 1.0 <ANSWER>'s <NAME>
  • 1.0 regional <ANSWER> <NAME>
  • 0.92 near <NAME> in <ANSWER>
  • Depending on question type, get high MRR
    (0.6-0.9), with higher results from use of the Web
    than the TREC QA collection

82
Shortcomings & Extensions
  • Need for POS and/or semantic types
  • "Where are the Rocky Mountains?"
  • "Denver's new airport, topped with white
    fiberglass cones in imitation of the Rocky
    Mountains in the background , continues to lie
    empty"
  • <NAME> in <ANSWER>
  • NE tagger and/or ontology could enable system to
    determine "background" is not a location

83
Shortcomings... (cont.)
  • Long distance dependencies
  • "Where is London?
  • "London, which has one of the most busiest
    airports in the world, lies on the banks of the
    river Thames
  • would require pattern likeltQUESTIONgt,
    (ltany_wordgt), lies on ltANSWERgt
  • Abundance variety of Web data helps system to
    find an instance of patterns w/o losing answers
    to long distance dependencies

84
Shortcomings... (cont.)
  • System currently has only one anchor word
  • Doesn't work for Q types requiring multiple words
    from question to be in answer
  • "In which county does the city of Long Beach
    lie?
  • "Long Beach is situated in Los Angeles County
  • required pattern ltQ_TERM_1gt is situated in
    ltANSWERgt ltQ_TERM_2gt
  • Does not use case
  • "What is a micron?
  • "...a spokesman for Micron, a maker of
    semiconductors, said SIMMs are..."
  • If Micron had been capitalized in question, would
    be a perfect answer

85
QA Typology from ISI (USC)
  • Typology of typical Q forms: 94 nodes (47 leaf
    nodes)
  • Analyzed 17,384 questions (from answers.com)

86
Question Answering Example
  • How hot does the inside of an active volcano get?
  • get(TEMPERATURE, inside(volcano(active)))
  • lava fragments belched out of the mountain were
    as hot as 300 degrees Fahrenheit
  • fragments(lava, TEMPERATURE(degrees(300)),
  • belched(out, mountain))
  • volcano ISA mountain
  • lava ISPARTOF volcano -> lava inside volcano
  • fragments of lava HAVEPROPERTIESOF lava
  • The needed semantic information is in WordNet
    definitions, and was successfully translated into
    a form that was used for rough proofs

87
References
  • AskMSR: Question Answering Using the Worldwide
    Web
  • Michele Banko, Eric Brill, Susan Dumais, Jimmy
    Lin
  • http://www.ai.mit.edu/people/jimmylin/publications/Banko-etal-AAAI02.pdf
  • In Proceedings of the 2002 AAAI Symposium on Mining
    Answers from Text and Knowledge Bases, March
    2002
  • Web Question Answering: Is More Always Better?
  • Susan Dumais, Michele Banko, Eric Brill, Jimmy
    Lin, Andrew Ng
  • http://research.microsoft.com/~sdumais/SIGIR2002-QA-Submit-Conf.pdf
  • D. Ravichandran and E.H. Hovy. 2002. Learning
    Surface Patterns for a Question Answering
    System. ACL conference, July 2002.

88
Harder Questions
  • Factoid question answering is really pretty
    silly.
  • A more interesting task is one where the answers
    are fluid and depend on the fusion of material
    from disparate texts over time.
  • Who is Condoleezza Rice?
  • Who is Mahmoud Abbas?
  • Why was Arafat flown to Paris?