MultiTask Learning and Web Search Ranking



1
Multi-Task Learning and Web Search Ranking
  • Gordon Sun
  • Yahoo! Inc


March 2007
2
  • Outline
  • Brief review: machine learning in web search ranking and multi-task learning.
  • MLR with Adaptive Target Value Transformation: each query is a task.
  • MLR for multiple languages: each language is a task.
  • MLR for multiple query classes: each type of query is a task.
  • Future work and challenges.

3
  • MLR (Machine Learning Ranking)
  • General function estimation and risk minimization:
  • Input: x = (x1, x2, …, xn)
  • Output: y
  • Training set: {(yi, xi)}, i = 1, …, n
  • Goal: estimate the mapping function y = F(x)
  • In MLR work:
  • x = x(q, d) = (x1, x2, …, xn) --- ranking features
  • y = judgment label, e.g. P, E, G, F, B mapped to 0, 1, 2, 3, 4
  • Loss function: L(y, F(x)) = (y − F(x))²
  • Algorithm: MLR with regression
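The regression setup above can be sketched end to end. The toy features and labels are illustrative, and the linear form of F is a stand-in (the talk does not specify the regressor):

```python
import numpy as np

# Hypothetical toy data: 6 (query, doc) pairs with 3 ranking features each.
grade_map = {"P": 0, "E": 1, "G": 2, "F": 3, "B": 4}  # mapping from the slide
labels = ["P", "E", "G", "F", "B", "G"]
X = np.array([[0.9, 0.8, 0.7],
              [0.8, 0.7, 0.6],
              [0.6, 0.5, 0.5],
              [0.4, 0.4, 0.3],
              [0.1, 0.2, 0.1],
              [0.5, 0.6, 0.4]])
y = np.array([grade_map[l] for l in labels], dtype=float)

# Fit F(x) = w·x + b by minimizing the squared loss sum_i (y_i - F(x_i))^2.
A = np.hstack([X, np.ones((len(X), 1))])   # append a bias column
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# Rank documents by predicted score; with the slide's mapping (P = 0),
# ascending order puts the best-graded documents first.
scores = A @ w
ranking = np.argsort(scores)
```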

4
  • Rank feature construction
  • Query features:
  • query language, query word types (Latin, Kanji, …), …
  • Document features:
  • page_quality, page_spam, page_rank, …
  • Query-document dependent features:
  • text match scores in body, title, anchor text (TF/IDF, proximity), …
  • Evaluation metric: DCG (Discounted Cumulative Gain)
  • DCGn = Σ i=1..n Gi / log2(1 + i), where the grades Gi are the grade values for P, E, G, F, B; NDCG is the normalized variant. DCG5 (n = 5), DCG10 (n = 10).
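A minimal DCG implementation, assuming the common log2(1 + i) position discount (the exact discount used in the talk is not reproduced in the transcript):

```python
import math

def dcg(grades, n=5):
    """DCG_n = sum_{i=1..n} G_i / log2(1 + i), i being the 1-based rank.

    `grades` are the per-position grade values (e.g. for P, E, G, F, B).
    The log2(1 + i) discount is an assumption; the slide's formula was
    an image that did not survive transcription.
    """
    return sum(g / math.log2(1 + i) for i, g in enumerate(grades[:n], start=1))

# Demoting a top result lowers DCG5: a metric that rewards putting
# high grades early.
perfect = [4, 4, 3, 2, 1]
swapped = [3, 4, 4, 2, 1]
```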

5
Distribution of judgment grades
6
  • Multi-Task Learning
  • Single-Task Learning (STL)
  • One prediction task (classification/regression):
  • estimate a function from one training/testing set
  • T = {(yi, xi)}, i = 1, …, n
  • Multi-Task Learning (MTL)
  • Multiple prediction tasks, each with its own training/testing set
  • Tk = {(yki, xki)}, k = 1, …, m; i = 1, …, nk
  • Goal is to solve multiple tasks together
  • - Tasks share the same input space (at least partially)
  • - Tasks are related (say, in MLR: they share one mapping function)
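The STL/MTL data layouts above, as a small sketch with made-up task names and sample sizes n_k:

```python
import numpy as np

rng = np.random.default_rng(0)

# STL: one task, one training set T = {(y_i, x_i)}, i = 1..n.
X_stl = rng.normal(size=(100, 5))
y_stl = rng.normal(size=100)

# MTL: m tasks T_k = {(y_ki, x_ki)}, i = 1..n_k, sharing the same
# 5-dimensional input space but with different sample sizes n_k.
sizes = {"task_a": 80, "task_b": 15, "task_c": 40}  # hypothetical n_k
tasks = {name: (rng.normal(size=(n, 5)), rng.normal(size=n))
         for name, n in sizes.items()}

# Pooling related tasks increases the effective sample size, which is
# the empirical intuition behind transfer learning on the next slide.
X_pool = np.vstack([X for X, _ in tasks.values()])
```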

7
  • Multi-Task Learning: Intuition and Benefits
  • Empirical intuition:
  • data from related tasks could help ---
  • equivalent to increasing the effective sample size.
  • Goal: share data and knowledge from task to task --- transfer learning.
  • Benefits:
  • - when the # of training examples per task is limited
  • - when the # of tasks is large and cannot be handled by a separate MLR per task
  • - when it is difficult/expensive to obtain examples for some tasks
  • - when it is possible to obtain meta-level knowledge

8
  • Multi-Task Learning: Relatedness Approaches
  • Probabilistic modeling of task generation
  • [Baxter 00], [Heskes 00], [Teh, Seeger, Jordan 05],
  • [Zhang, Ghahramani, Yang 05]
  • Latent-variable correlations
  • Noise correlations [Greene 02]
  • Latent-variable modeling [Zhang 06]
  • Hidden common data structure and latent variables
  • Implicit structure (common kernels) [Evgeniou, Micchelli, Pontil 05]
  • Explicit structure (PCA) [Ando, Zhang 04]
  • Transformation relatedness [Shai 05]

9
  • Multi-Task Learning for MLR
  • Different levels of relatedness:
  • grouping data by query --- each query could be one task;
  • grouping data by query language --- each language is a task;
  • grouping data by query class --- each class is a task.

10
  • Outline
  • Brief review: machine learning in web search ranking and multi-task learning.
  • MLR with Adaptive Target Value Transformation: each query is a task.
  • MLR for multiple languages: each language is a task.
  • MLR for multiple query classes: each type of query is a task.
  • Future work and challenges.

11
  • Adaptive Target Value Transformation
  • Intuition:
  • Rank features vary a lot from query to query.
  • Rank features vary a lot from sample to sample with the same label.
  • MLR is a ranking problem, but regression minimizes prediction error.
  • Solution: adaptively adjust the training target values with a per-query transformation g(y),
  • where a linear (monotonic) transformation is required
  • (a nonlinear g(·) may not preserve the ordering of E(y|x)).

12
  • Adaptive Target Value Transformation
  • Implementation: empirical risk minimization,
  • min over F and {(αq, βq)} of Σq Σi (αq yqi + βq − F(xqi))² + λα‖α‖p + λβ‖β‖p,
  • where the linear transformation weights are regularized;
  • λα and λβ are regularization parameters and ‖·‖p is the p-norm.
  • The solution alternates between fitting F and the per-query (αq, βq).

13
  • Adaptive Target Value Transformation
  • Norm p = 2: solution for each (λα, λβ):
  • 1. For an initial (α, β), find F(x) by solving the regression.
  • 2. For the given F(x), solve for each (αq, βq), q = 1, 2, …, Q.
  • 3. Repeat from step 1 until convergence.
  • Norm p = 1: solve a constrained quadratic program (Lasso/LARS).
  • Convergence analysis: assuming …
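The p = 2 step for one query has a closed form. The regularizer used here, pulling αq toward 1 and βq toward 0 (the identity transform), is an assumption, since the slide's objective did not survive transcription:

```python
import numpy as np

def atvt_step(y, f, lam_a=1.0, lam_b=1.0):
    """One per-query (alpha_q, beta_q) update for the p = 2 case.

    Minimizes  sum_i (alpha*y_i + beta - f_i)^2
               + lam_a*(alpha - 1)^2 + lam_b*beta^2
    in closed form via the 2x2 normal equations (f_i = F(x_i) are the
    current model predictions). Pulling (alpha, beta) toward the
    identity transform is an assumed regularizer.
    """
    A = np.array([[y @ y + lam_a, y.sum()],
                  [y.sum(), len(y) + lam_b]])
    b = np.array([y @ f + lam_a, f.sum()])
    alpha, beta = np.linalg.solve(A, b)
    return alpha, beta

# If F(x) already reproduces the targets, the fitted transform stays
# near the identity: alpha ~ 1, beta ~ 0.
y = np.array([0.0, 1.0, 2.0, 3.0])
alpha, beta = atvt_step(y, y.copy(), lam_a=1e-6, lam_b=1e-6)
```

The outer loop alternates this per-query update with refitting F on the transformed targets, as in steps 1-3 above.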

14
Adaptive Target Value Transformation: Experiment Data
15
Adaptive Target Value Transformation: Evaluation of aTVT on US and CN Data
16
Adaptive Target Value Transformation
17
Adaptive Target Value Transformation
18
Adaptive Target Value Transformation: Observations
  1. The relevance gain (DCG5 +2%) is visible.
  2. Regularization is needed.
  3. Different query types gain differently from aTVT.
19
  • Outline
  • Brief review: machine learning in web search ranking and multi-task learning.
  • MLR with Adaptive Target Value Transformation: each query is a task.
  • MLR for multiple languages: each language is a task.
  • MLR for multiple query classes: each type of query is a task.
  • Future work and challenges.

20
Multi-Language MLR
  • Objective:
  • Make MLR globally scalable: >100 languages, >50 regions.
  • Improve MLR for small regions/languages using data from other languages.
  • Build a universal MLR for all regions that do not have data and editorial support.

21
Multi-Language MLR Part 1
  • Feature differences between languages
  • MLR function differences between languages

22
Multi-Language MLR: Distribution of Text Score
[Figure: Perf/Excellent urls vs. Bad urls. Legend: JP, CN, DE, UK, KR]
23
Multi-Language MLR: Distribution of Spam Score
[Figure: Perf/Excellent urls vs. Bad urls; JP and KR similar, DE and UK similar. Legend: JP, CN, DE, UK, KR]
24
Multi-Language MLR: Training and Testing on Different Languages
[Table: DCG improvement over the base function, by train language × test language]
25
Multi-Language MLR: Language Differences, Observations
  • Feature difference across languages is visible
    but not huge.
  • MLR trained for one language does not work well
    for other languages.

26
Multi-Language MLR Part 2
  • Transfer Learning with Region features

27
Multi-Language MLR Query Region Feature
  • New feature: query region
  • Multiple binary-valued features
  • Feature vector qr = (CN, JP, UK, DE, KR)
  • CN queries: (1, 0, 0, 0, 0)
  • JP queries: (0, 1, 0, 0, 0)
  • UK queries: (0, 0, 1, 0, 0)
  • To test the trained universal MLR on new languages, e.g. FR:
  • feature vector qr = (0, 0, 0, 0, 0)
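The one-hot encoding on this slide, directly as code:

```python
REGIONS = ["CN", "JP", "UK", "DE", "KR"]  # order from the slide

def region_feature(region):
    """Multiple binary-valued features: one-hot over the known regions.

    An unseen language (e.g. FR) gets the all-zeros vector, so the
    trained universal model falls back to region-agnostic behavior.
    """
    return [1 if r == region else 0 for r in REGIONS]
```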

28
Multi-Language MLR: Query Region Feature Experiment Results
[Table: DCG5 improvement over the base function]
29
Multi-Language MLR: Query Region Feature Experiment Results, CJK and UK/DE Models
All models include the query region feature.
30
Multi-Language MLR: Query Region Feature Observations
  • The query region feature seems to improve combined-model performance in every case, though not always statistically significantly.
  • It helped more when we had less data (KR).
  • It helped more when introducing near-language models (CJK, EU).
  • It would not help for languages with large training data (JP, CN).

31
Multi-Language MLR Experiments: Overweighting the Target Language
  • This method addresses the common case where a language has only a small amount of data available.
  • Use all available data, but change the weight of the data from the target language.
  • When weight = 1: the Universal Language Model.
  • As weight → ∞: becomes the Single-Language Model.
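A sketch of the per-sample weighting; the resulting weight vector would then be passed to any learner that accepts sample weights:

```python
import numpy as np

def sample_weights(languages, target, w):
    """Weight w for the target language's examples, 1 for all others.

    w = 1 recovers the universal model; as w grows, the pooled
    objective approaches training on the target language alone.
    """
    return np.where(np.asarray(languages) == target, float(w), 1.0)
```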

32
Multi-Language MLR Germany
33
Multi-Language MLR UK
34
Multi-Language MLR China
35
Multi-Language MLR Korea
36
Multi-Language MLR Japan
37
Multi-Language MLR Average DCG Gain For JP,
CN, DE, UK, KR
38
Multi-Language MLR: Overweighting Target Language Observations
  • It helps for certain languages with small data sizes (KR, DE).
  • It does not help for some languages (CN, JP).
  • For languages with enough data, it will not help.
  • A weight of 10 seems better than 1 or 100 on average.

39
Multi-Language MLR Part 3
  • Transfer Learning with
  • Language Neutral Data and Regression Diff

40
Multi-Language MLR: Selection of Language-Neutral Queries
  • For each of (CN, JP, KR, DE, UK), train an MLR on its own data.
  • Test each language's queries with all languages' MLRs.
  • Select the queries that showed the best DCG across the different language MLRs.
  • Treat these queries as language-neutral; they can be shared by all language MLR development.
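One plausible reading of "best DCG across the different language MLRs" is a high worst-case DCG; a sketch under that assumption, with hypothetical per-query DCG numbers:

```python
def language_neutral(dcg_by_query, k=2):
    """Select language-neutral queries.

    dcg_by_query: {query: {model_language: dcg}}. Queries are ranked by
    their worst-case DCG across the per-language models and the top k
    are kept; this 'high minimum' criterion is one plausible reading of
    the slide, not the confirmed one.
    """
    score = {q: min(d.values()) for q, d in dcg_by_query.items()}
    return sorted(score, key=score.get, reverse=True)[:k]

# Hypothetical per-query DCG under two per-language models:
demo = {"q1": {"CN": 9, "JP": 8},
        "q2": {"CN": 9, "JP": 2},   # good only under the CN model
        "q3": {"CN": 7, "JP": 7}}
```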

41
Multi-Language MLR: Evaluation of Language-Neutral Queries on the CN-simplified dataset (2,753 queries)
42
  • Outline
  • Brief review: machine learning in web search ranking and multi-task learning.
  • MLR with Adaptive Target Value Transformation: each query is a task.
  • MLR for multiple languages: each language is a task.
  • MLR for multiple query classes: each type of query is a task.
  • Future work and challenges.

43
Multi-Query Class MLR
  • Intuitions:
  • Different types of queries behave differently:
  • they require different ranking features
  • (time-sensitive queries → page_time_stamps);
  • they expect different results
  • (navigational queries → one official page at the top).
  • Also, different types of queries can share the same ranking features.
  • Multi-class learning could be done in a unified MLR by:
  • introducing query classification and using query class as an input ranking feature;
  • adding page-level features for the corresponding classes.

44
Multi-Query Class MLR
  • Time-recency experiments
  • Feature implementation:
  • binary query feature: time-sensitive (0/1);
  • binary page feature: discovered within the last three months.
  • Data:
  • 300 time-sensitive queries (editorial);
  • 2,000 ordinary queries;
  • time-sensitive queries overweighted by 3;
  • 10-fold cross-validation on MLR training/testing.
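The two binary features and the 3x overweighting can be sketched as follows (taking "last three months" as a 90-day cutoff is an approximation):

```python
def recency_features(query_is_time_sensitive, page_age_days):
    """Binary query feature + binary page feature from the slide.

    The page feature is 1 if the page was discovered within the last
    three months (approximated here as 90 days).
    """
    return [int(query_is_time_sensitive), int(page_age_days <= 90)]

def query_weight(query_is_time_sensitive):
    """Time-sensitive queries are overweighted by 3 in training."""
    return 3.0 if query_is_time_sensitive else 1.0
```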

45
Multi-Query Class MLR
  • Time-recency experiment results
  • Compare MLR with and without the page_time feature.

46
Multi-Query Class MLR
  • Named-entity queries
  • Feature implementation:
  • binary query feature: named-entity query (0/1);
  • 11 new page features implemented:
  • path length
  • host length
  • number of host components (URL depth)
  • path contains "index"
  • path contains "cgi", "asp", "jsp", or "php"
  • path contains "search" or "srch", …
  • Data:
  • 142 place-name entity queries;
  • 2,000 ordinary queries;
  • 10-fold cross-validation on MLR training/testing.
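Plausible implementations of a few of the 11 page features; the exact definitions are not given on the slide, so these are illustrative readings (e.g. "URL depth" is taken here as the number of path components):

```python
from urllib.parse import urlparse

def url_features(url):
    """A few of the page-level features named on the slide.

    All definitions here are assumed readings, not confirmed ones.
    """
    p = urlparse(url)
    path = p.path.lower()
    return {
        "path_length": len(p.path),
        "host_length": len(p.netloc),
        "url_depth": len([s for s in p.path.split("/") if s]),
        "path_has_index": int("index" in path),
        "path_has_dynamic": int(any(t in path for t in ("cgi", "asp", "jsp", "php"))),
        "path_has_search": int("search" in path or "srch" in path),
    }

feats = url_features("http://www.example.com/search/index.php")
```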

47
Multi-Query Class MLR
  • Named-entity query experiment results
  • Compared MLR against the base model without the named-entity features.

48
Multi-Query Class MLR
  • Observations:
  • Query class combined with page-level features could help MLR relevance.
  • More research is needed on query classification and page-level feature optimization.

49
  • Outline
  • Brief review: machine learning in web search ranking and multi-task learning.
  • MLR with Adaptive Target Value Transformation: each query is a task.
  • MLR for multiple languages: each language is a task.
  • MLR for multiple query classes: each type of query is a task.
  • Future work and challenges.

50
Future Work and Challenges
  • Multi-task learning extended to different types of training data:
  • editorial judgment data;
  • user click-through data.
  • Multi-task learning extended to different types of relevance judgments:
  • absolute relevance judgments;
  • relative relevance judgments.
  • Multi-task learning extended to use both:
  • labeled data;
  • unlabeled data.
  • Multi-task learning extended to different types of search user intentions.

51
  • Contributors from Yahoo! International Search
    Relevance team
  • Algorithm and model development
  • Zhaohui Zheng,
  • Hongyuan Zha,
  • Lukas Biewald,
  • Haoying Fu
  • Data exporting/processing/QA
  • Jianzhang He
  • Srihari Reddy
  • Director
  • Gordon Sun.

52
Thank you. Q&A?