Title: Minimally Supervised Learning of Semantic Knowledge from Query Logs
Slide 1: Minimally Supervised Learning of Semantic Knowledge from Query Logs
Mamoru Komachi (Nara Institute of Science and Technology, Japan)
Hisami Suzuki (Microsoft Research, USA)
- IJCNLP-08, Hyderabad, India
Slide 2: Task
[Figure: example instances connected by "similar" links, e.g. Darjeeling, Kombucha (Japanese tea), Chai (Indian tea)]
- Learn semantic categories from web search query logs by bootstrapping with minimal supervision
- Semantic category: a set of interrelated words
  - Named entities, technical terms, paraphrases, etc.
- Can be useful for search ads, etc.
2012/2/24
Slide 3: Approach
- Semantic categories
  - The objects search users frequently ask about (cf. Pasca and Van Durme 2007)
- Query logs
  - Reflect the interests of search users
  - Short but relevant for word categorization
  - Include word segmentation specified by users
- Bootstrapping
  - Adopted in many binary relation extraction tasks (Brin 1998; Collins and Singer 1999; Etzioni et al. 2005)
  - Can start from a small set of instances (cf. Sekine and Suzuki 2007)
Slide 4: Our Contribution
- First to use Japanese query logs for the task of learning named entities
- Propose an efficient method suited for query logs, based on the general-purpose Espresso algorithm (Pantel and Pennacchiotti 2006)
Slide 5: Table of Contents
- Related work
  - Bootstrapping techniques for relation extraction
  - Scoring metrics
- The Tchai algorithm
  - Problems of Espresso
  - Extensions to Espresso
- Experiments
  - System performance and comparison to other algorithms
  - Samples of extracted instances and patterns
Slide 6: Bootstrapping
- Iteratively conduct pattern induction and instance extraction, starting from seed instances
- Can grow a small set of seed instances into a large one
[Figure: from the query log (corpus), the seed instance "vaio" matches queries such as "compare vaio laptop", inducing the contextual pattern "compare # laptop" (# = slot); the pattern then matches "compare toshiba satellite laptop" and "compare HP xb3000 laptop", extracting the new instances "Toshiba satellite" and "HP xb3000"]
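The loop in the figure can be sketched in a few lines. This is a minimal illustration with the slide's toy queries, not the authors' implementation; the `#` slot marker and the simple substring matching are assumptions for the sketch.

```python
# Minimal bootstrapping sketch: alternate pattern induction and
# instance extraction, starting from seed instances.
def bootstrap(queries, seeds, iterations=2):
    """queries: list of query strings; seeds: initial instance set."""
    instances = set(seeds)
    patterns = set()
    for _ in range(iterations):
        # Pattern induction: replace a known instance with a '#' slot.
        for q in queries:
            for inst in instances:
                if inst in q:
                    patterns.add(q.replace(inst, "#"))
        # Instance extraction: find new strings that fill the '#' slot.
        for q in queries:
            for p in patterns:
                prefix, _, suffix = p.partition("#")
                if q.startswith(prefix) and q.endswith(suffix):
                    filler = q[len(prefix):len(q) - len(suffix)]
                    if filler:
                        instances.add(filler)
    return instances, patterns

queries = ["compare vaio laptop",
           "compare toshiba satellite laptop",
           "compare hp xb3000 laptop"]
insts, pats = bootstrap(queries, {"vaio"})
```

Starting from the single seed "vaio", the first iteration induces "compare # laptop" and extracts "toshiba satellite" and "hp xb3000" as new instances.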
7Instance lookup and pattern induction
ANA ??
ANA
??
query log
extracted pattern
instance
Restaurant reservation?
Flight reservation?
Broad coverage, Noisy patterns
Use all strings but instances Require no
segmentation
- Semantic drift
- Computational efficency
Generic patterns
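The segmentation-free induction above amounts to a substring replacement, which is why it also works on unsegmented Japanese queries. A toy sketch (the romanized "yoyaku" stands in for the Japanese word lost in the slide conversion; the function name is ours):

```python
# Sketch: pattern induction needs no word segmentation - the pattern is
# simply everything in the query other than the instance string.
def induce_pattern(query, instance, slot="#"):
    """Replace the instance substring with a slot marker."""
    return query.replace(instance, slot) if instance in query else None

# Works on an unsegmented (Japanese-style) query string:
induce_pattern("ANAyoyaku", "ANA")           # -> "#yoyaku"
induce_pattern("compare vaio laptop", "vaio")  # -> "compare # laptop"
```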
Slide 8: Instance/Pattern Scoring Metrics
- Sekine and Suzuki (2007)
  - Starts from a large named entity dictionary
  - Assigns low scores to generic patterns and ignores them
- Basilisk (Thelen and Riloff, 2002)
  - Balances the recall and precision of generic patterns
- Espresso (Pantel and Pennacchiotti, 2006)
  - The reliability of an instance and that of a pattern are mutually defined, with PMI normalized by the maximum over all patterns $P$ and instances $I$ in the corpus:
    $r_\iota(i) = \frac{1}{|P|} \sum_{p \in P} \frac{\mathrm{pmi}(i,p)}{\max_{\mathrm{pmi}}} \, r_\pi(p)$, $\quad r_\pi(p) = \frac{1}{|I|} \sum_{i \in I} \frac{\mathrm{pmi}(i,p)}{\max_{\mathrm{pmi}}} \, r_\iota(i)$
  - where $\mathrm{pmi}$ is pointwise mutual information and $r$ is the reliability score
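One step of the mutual reliability computation can be sketched as follows. The co-occurrence counts are made up for illustration (they are not from the paper), and for simplicity the average is taken over co-occurring instances only:

```python
import math

# Sketch of Espresso-style reliability scoring with toy counts.
cooc = {("vaio", "compare # laptop"): 10,
        ("toshiba satellite", "compare # laptop"): 8,
        ("vaio", "# password"): 1,
        ("hotmail", "# password"): 10}

insts = {i for i, _ in cooc}
pats = {p for _, p in cooc}
total = sum(cooc.values())
inst_tot = {i: sum(c for (x, _), c in cooc.items() if x == i) for i in insts}
pat_tot = {p: sum(c for (_, y), c in cooc.items() if y == p) for p in pats}

def pmi(i, p):
    """Pointwise mutual information of an (instance, pattern) pair."""
    return math.log(cooc[(i, p)] * total / (inst_tot[i] * pat_tot[p]))

pmis = {(i, p): pmi(i, p) for (i, p) in cooc}
max_pmi = max(pmis.values())  # Espresso's global normalizer

def r_pattern(p, r_inst):
    """Pattern reliability: average normalized PMI with each instance,
    weighted by that instance's reliability."""
    members = [i for i in insts if (i, p) in pmis]
    return sum(pmis[(i, p)] / max_pmi * r_inst[i] for i in members) / len(members)

# Seed the mutual recursion: known-good instances get reliability 1.0.
r_inst = {"vaio": 1.0, "toshiba satellite": 1.0, "hotmail": 0.0}
scores = {p: r_pattern(p, r_inst) for p in pats}
```

With these toy counts, "compare # laptop" scores above "# password" because it co-occurs with the reliable seed instances; instance reliability is updated symmetrically from the pattern scores on the next half-step.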
Slide 9: Problems of Espresso
- Generic patterns/instances
  - Generic patterns require a lot of computation
- Computational efficiency
  - Espresso computes the reliability of all patterns for ranking in each iteration
Slide 10: The Tchai Algorithm
- Filter generic patterns/instances
  - Do not select generic patterns and instances
- Replace the scaling factor in the reliability scores
  - Take the maximum PMI for a given instance/pattern rather than the maximum over all instances and patterns
  - This modification has a large impact on the effectiveness of our algorithm
- Induce patterns only at the beginning
  - Tchai runs 400x faster than Espresso
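The scaling-factor change can be sketched as swapping the normalizer in the reliability formula. The PMI values below are illustrative numbers, not from the paper:

```python
# Espresso normalizes every PMI by the single global maximum; the
# modification described here normalizes by the maximum PMI of the
# instance (or, symmetrically, the pattern) being scored.
pmis = {("vaio", "compare # laptop"): 0.38,
        ("vaio", "# password"): -1.43,
        ("hotmail", "# password"): 0.97}

global_max = max(pmis.values())

def espresso_scaled(i, p):
    """Espresso: divide by the maximum PMI over all pairs."""
    return pmis[(i, p)] / global_max

def tchai_scaled(i, p):
    """Tchai-style: divide by the maximum PMI among pairs with instance i."""
    local_max = max(v for (x, _), v in pmis.items() if x == i)
    return pmis[(i, p)] / local_max
```

Under the per-instance maximum, "vaio"'s best pattern is scaled to exactly 1.0; under the global maximum it is dampened by the higher-PMI pair elsewhere in the corpus. (How this produces the reported accuracy gain is for the experiments to show.)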
Slide 11: Comparison of Methods
[Table comparing the scoring methods - not preserved in this transcript]
Slide 12: Experiments
- Japanese query logs from January-February 2007
  - One million unique queries (166 million query tokens)
- Target categories
  - Manually classified the 10,000 most frequent search words (in the log of December 2006), hereafter referred to as the 10K list
  - Travel: the largest category (712 words)
  - Finance: the smallest category (240 words)
Slide 13: Results
- High precision (92.1%)
- Learned 251 novel words
- Some apparent errors are due to the ambiguity of hand labeling (e.g. Tokyo Disney Land)
- Extracted words include common nouns related to Travel (e.g. rental car)
Slide 14: Sample Instances (Travel Category)
[Table of sample extracted instances - not preserved in this transcript]
- Able to learn several sub-categories for which no seed words were given
Slide 15: Impact of Pattern Induction
- No degradation without pattern induction
- Can run 400x faster at no cost!
Slide 16: Effect of Each Modification
- Filtering consistently outperforms no filtering
- The scaling factor has the largest impact
Slide 17: System Performance
- High precision and recall
- High precision but low relative recall due to strict filtering
- Relative recall (Pantel et al., 2004) is reported in place of true recall
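Since true recall cannot be measured without exhaustively labeling the log, the slide reports relative recall. As we paraphrase it from Pantel et al. (2004), the relative recall of system $A$ with respect to system $B$ is:

```latex
R_{A|B} = \frac{R_A}{R_B} = \frac{C_A}{C_B} = \frac{P_A \times |A|}{P_B \times |B|}
```

where $C$ is the number of correct instances extracted, $P$ the precision, and $|A|$, $|B|$ the total numbers of instances extracted by each system.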
Slide 18: Cumulative Precision (Travel)
- Tchai achieved the best precision
Slide 19: Cumulative Precision (Finance)
- Both Basilisk and Espresso suffered from acquiring generic patterns in early iterations
Slide 20: Sample Extracted Patterns
- Basilisk and Espresso extracted location names as context patterns, which may be too generic for the Travel domain
- Tchai found context patterns that are characteristic of the domain
Slide 21: Conclusion and Future Work
- Conclusion
  - Used query logs for semantic category learning
  - Improved the Espresso algorithm in both precision and speed
- Future work
  - Generalize the bootstrapping method via graph-based matrix computation
Slide 22: Thank you for listening!
[Image: Tchai]