Multi-Task Learning for HIV Therapy Screening


Provided by: sche55


Transcript and Presenter's Notes

1
Multi-Task Learning for HIV Therapy Screening

Steffen Bickel, Jasmina Bogojeska, Thomas
Lengauer, Tobias Scheffer

2
HIV Therapy Screening

Therapies usually combine 3-6 of around 17 available antiretroviral drugs.
The effects of different combinations on the virus are similar but not identical.
Training data from treatment records are scarce.
Challenge: Prediction of therapy outcome from genotypic information.

[Figure: training data for combinations 1, 2, and 3, marking successful and failed treatments]
3
Multi-Task Learning

Several related prediction problems (tasks).
The conditionals p(y|x) of label given input are not necessarily identical across tasks.
Usually, some conditionals are similar.
Challenge:
Use all available training data and account for the difference in distributions across tasks.
HIV therapy screening:
Can be modeled as a multi-task learning problem.
Drug combinations (tasks) have similar but not identical effects on the virus.

4
Overview

Motivation:
  HIV therapy screening.
  Multiple tasks with differing distributions.
Multi-task learning by distribution matching:
  Problem setting.
  Density ratio matches pool to target distribution.
  Discriminative estimation of matching weights.
Case study:
  HIV therapy screening.

5
Multi-Task Learning: Problem Setting
Target distribution
Labeled target data
6
Multi-Task Learning: Problem Setting

Goal: Minimize loss under target distribution.

Target distribution
Labeled target data
8
Multi-Task Learning: Problem Setting

Goal: Minimize loss under target distribution.

Target distribution
Auxiliary distributions
Labeled target data
11
Multi-Task Learning

Goal: Minimize loss under target distribution.

?
Target distribution
Pool distribution
Labeled target data
12
Distribution Matching

Goal: Minimize loss under target distribution.

Target distribution
Pool distribution
Labeled target data
13
Distribution Matching

Goal: Minimize loss under target distribution.

Target distribution
Pool distribution
Expected loss under target distribution
Rescale loss for each pool example
Expectation over training pool
Labeled target data
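
The three annotations on this slide (expected loss under the target distribution, per-example rescaling, expectation over the training pool) label an equation that did not survive in the transcript. A plausible reconstruction is the standard importance-weighting identity. The notation here is an assumption: t is the target task, z ranges over all tasks in the pool, f is the model, and \ell is the loss.

```latex
\begin{align*}
\mathbf{E}_{(x,y)\sim p(x,y\mid t)}\big[\ell(f(x),y)\big]
  &= \mathbf{E}_{(x,y)\sim p(x,y)}\big[\,r_t(x,y)\,\ell(f(x),y)\big],\\
r_t(x,y) &= \frac{p(x,y\mid t)}{p(x,y)},\qquad
p(x,y) = \sum_z p(z)\,p(x,y\mid z).
\end{align*}
```

The identity holds because the pool contains the target task with positive probability, so p(x,y) > 0 wherever p(x,y|t) > 0.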
14
Distribution Matching

Goal: Minimize loss under target distribution.

[Figure: target distribution and pool distribution over x, plotted separately for y = -1 and y = +1]
15
Distribution Matching

Goal: Minimize loss under target distribution.

[Figure: target distribution and pool distribution over x, plotted separately for y = -1 and y = +1, with example point x1 marked]
16
Distribution Matching

Goal: Minimize loss under target distribution.

[Figure: target distribution and pool distribution over x, plotted separately for y = -1 and y = +1, with example points x1 and x2 marked]
17
Estimation of Density Ratio

Goal: Minimize loss under target distribution.

18
Estimation of Density Ratio

Goal: Minimize loss under target distribution.
Theorem

Potentially high-dimensional densities
One binary conditional density
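
The theorem's formula is missing from the transcript. Assuming it is the usual Bayes' rule rewrite of the density ratio, it could read as follows, where t is the target task, p(x,y) is the pool density, and r_t is the rescaling weight (all notation assumed, not taken from the slides):

```latex
r_t(x,y)
  = \frac{p(x,y\mid t)}{p(x,y)}
  = \frac{p(t\mid x,y)\,p(x,y)}{p(t)\,p(x,y)}
  = \frac{p(t\mid x,y)}{p(t)}
```

Under this reading, the left-hand ratio involves the two potentially high-dimensional densities, while the right-hand side needs only one binary conditional (target task or not, given a labeled example) and the scalar prior p(t).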
19
Estimation of Density Ratio

Goal: Minimize loss under target distribution.
Theorem
Intuition: the weight reflects how much more likely an example is to be drawn from the target than from the auxiliary density.

[Figure: pool]
20
Estimation of Density Ratio

Goal: Minimize loss under target distribution.
Theorem
Intuition: the weight reflects how much more likely an example is to be drawn from the target than from the auxiliary density.

[Figure: pool of auxiliary-task examples and target examples]
21
Estimation of Density Ratio

Goal: Minimize loss under target distribution.
Theorem
Intuition: the weight reflects how much more likely an example is to be drawn from the target than from the auxiliary density.

Estimation of this likelihood with a probabilistic classifier (e.g., logistic regression).
[Figure: pool of auxiliary-task examples and target examples]
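
As a rough illustration of estimating the weights with a probabilistic classifier, the sketch below trains a small logistic regression to separate target examples from the rest of the pool and divides its output by the target fraction. It is a minimal stand-in under stated assumptions, not the authors' implementation: the function names, the toy data, the plain gradient-descent optimizer, and the use of (x, y) as classifier features are all hypothetical.

```python
import math

def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def fit_logreg(features, labels, lr=0.1, epochs=500, l2=1e-3):
    """Batch gradient descent for binary logistic regression (labels in {0, 1})."""
    dim, n = len(features[0]), len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for f, s in zip(features, labels):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - s
            for j in range(dim):
                grad_w[j] += err * f[j]
            grad_b += err
        w = [wi - lr * (g / n + l2 * wi) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def resampling_weights(pool, is_target):
    """Estimate p(target | x, y) / p(target) for every pool example.

    pool: list of (x, y) pairs, x a list of floats, y in {-1, +1}.
    is_target: list of 0/1 flags, 1 if the example comes from the target task.
    """
    feats = [list(x) + [float(y)] for x, y in pool]  # classifier sees x and y
    w, b = fit_logreg(feats, is_target)
    p_target = sum(is_target) / len(is_target)       # empirical prior of the target task
    return [sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) / p_target
            for f in feats]

# Hypothetical toy pool: target examples sit at larger x than auxiliary ones.
pool = [([2.0], 1), ([1.5], 1), ([1.0], -1), ([-1.0], 1),
        ([-1.5], -1), ([-2.0], -1), ([-2.5], 1), ([-3.0], -1)]
is_target = [1, 1, 1, 0, 0, 0, 0, 0]
weights = resampling_weights(pool, is_target)
```

Pool examples that look like target data receive larger weights than examples far from it, which is the effect the colored figure on these slides appears to depict.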
22
Estimation of Density Ratio

Goal: Minimize loss under target distribution.
Theorem
Intuition: the weight reflects how much more likely an example is to be drawn from the target than from the auxiliary density.

Towards blue: larger values, i.e., large resampling weights.
[Figure: pool of auxiliary-task examples and target examples]
23
Prior Knowledge on Task Similarity

Prior knowledge is given as a task similarity kernel.
The prior knowledge is encoded in a Gaussian prior on the parameters v of a multi-class logistic regression model for the resampling weights.
Main diagonal entries of the prior covariance matrix are set to a constant (standard regularizer).
Diagonals of its sub-matrices are set according to the task similarities.

24
Distribution Matching Algorithm

1. Weight model: Train logistic regression of target vs. auxiliary data, with task similarity encoded in the prior.
2. Target model: Minimize regularized empirical loss on the pool, weighted by the resampling weights.

Result of step 1: weight model.
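
The target-model step can be sketched as weighted, L2-regularized logistic regression on the pool. This is an illustrative sketch only: the weights are taken as given (hypothetical values standing in for the output of the weight model), and the optimizer, learning rate, and toy data are assumptions rather than details from the slides.

```python
import math

def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def fit_weighted_logreg(xs, ys, weights, lr=0.5, epochs=300, l2=0.01):
    """Minimize the weighted logistic loss plus an L2 penalty on w.

    xs: list of feature lists, ys: labels in {-1, +1},
    weights: one non-negative resampling weight per pool example.
    """
    dim = len(xs[0])
    w, b = [0.0] * dim, 0.0
    total = sum(weights)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y, r in zip(xs, ys, weights):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            g = -y * sigmoid(-y * z) * r   # weighted gradient of log(1 + exp(-y z))
            for j in range(dim):
                grad_w[j] += g * x[j]
            grad_b += g
        w = [wi - lr * (gw / total + l2 * wi) for wi, gw in zip(w, grad_w)]
        b -= lr * grad_b / total
    return w, b

# Hypothetical pool: the first two examples follow the target labeling
# (y has the sign of x); the four auxiliary examples follow the opposite one.
xs = [[1.0], [-1.0], [1.0], [-1.0], [2.0], [-2.0]]
ys = [1, -1, -1, 1, -1, 1]
matched = [2.0, 2.0, 0.1, 0.1, 0.1, 0.1]   # assumed output of the weight model
uniform = [1.0] * 6                        # one-size-fits-all pooling

w_matched, _ = fit_weighted_logreg(xs, ys, matched)
w_uniform, _ = fit_weighted_logreg(xs, ys, uniform)
```

With the matched weights the learned coefficient follows the target labeling, while uniform pooling is dominated by the auxiliary examples and learns the opposite sign.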
25
Overview

Motivation:
  HIV therapy screening.
  Multiple tasks with differing distributions.
Multi-task learning by distribution matching:
  Problem setting.
  Density ratio matches pool to target distribution.
  Discriminative estimation of matching weights.
Case study:
  HIV therapy screening.

26
HIV Therapy Screening: Prediction Problem

Information about each patient x: binary vector of resistance-relevant virus mutations and of previously given drugs.
Drug combination selected out of 17 drugs.
Drug combinations correspond to tasks z.
Target label y (success or failure of therapy).
Two different labelings (virus load and multi-conditional).

[Figure: virus load over time, with the labeling conditions]
27
HIV Therapy Screening: Data

Patients from hospitals in Italy, Germany, and Sweden.
3260 labeled treatments.
545 different drug combinations (tasks).
50% of combinations with only one labeled treatment.
Similarity of drug combinations: task kernel.
Drug feature kernel: product of drug indicator vectors.
Mutation table kernel: similarity of mutations that render drug ineffective.
80%/20% training/test split, consistent with time stamps.
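
One way to read "product of drug indicator vectors" is as the inner product of two binary vectors, which counts the drugs two combinations share. The sketch below follows that reading; the integer drug indices, the function names, and the absence of any normalization are assumptions.

```python
NUM_DRUGS = 17  # the slides mention combinations drawn from 17 drugs

def indicator(combination, num_drugs=NUM_DRUGS):
    """Binary vector with a 1 for every drug index in the combination."""
    vec = [0] * num_drugs
    for drug in combination:
        vec[drug] = 1
    return vec

def drug_feature_kernel(comb_a, comb_b):
    """Inner product of indicator vectors = number of shared drugs."""
    a, b = indicator(comb_a), indicator(comb_b)
    return sum(x * y for x, y in zip(a, b))

# Hypothetical combinations given as sets of drug indices in 0..16.
k_same = drug_feature_kernel({0, 3, 5}, {0, 3, 5})
k_overlap = drug_feature_kernel({0, 3, 5}, {3, 5, 9})
k_disjoint = drug_feature_kernel({0, 3, 5}, {1, 2, 4})
```

Identical combinations get the highest similarity, partially overlapping ones an intermediate value, and disjoint ones zero.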

[Figure: timeline with training data followed by test data]
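
A split that is "consistent with time stamps" can be sketched by sorting treatments chronologically and holding out the most recent 20%. The record layout and field names below are hypothetical, and the slides do not say exactly how the split was made.

```python
def time_consistent_split(records, train_fraction=0.8):
    """Earliest records go to training, the most recent ones to test."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Hypothetical treatment records with integer timestamps (e.g., day numbers).
records = [{"treatment_id": i, "timestamp": ts}
           for i, ts in enumerate([40, 5, 77, 12, 63, 29, 91, 18, 50, 84])]
train, test = time_consistent_split(records)
```

Every test treatment then starts no earlier than any training treatment, so the model is never evaluated on data older than what it was trained on.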
28
Reference Methods

Independent models (separately trained).
One-size-fits-all, product of task and feature kernel: Bonilla, Agakov, and Williams (2007).
Hierarchical Bayesian kernel: Evgeniou and Pontil (2004).
Hierarchical Bayesian Gaussian process: Yu, Tresp, and Schwaighofer (2005).
Logistic regression is the target model (except for the Gaussian process model).
RBF kernels.

29
Results: Distribution Matching vs. Other
[Figure: results for the virus load and multi-condition labelings; methods compared: separate, one-size-fits-all, hier. Bayes kernel, hier. Bayes Gauss. proc., distribution matching]

Distribution matching is always the best method (17 of 20 cases statistically significant) or as good as the best reference method.
Improvement over separately trained models: 10-14%.

30
Results: Benefit of Prior Knowledge
[Figure: results for the virus load and multi-condition labelings; variants compared: no prior knowledge, drug feature kernel, mutation table kernel]