Title: The Voted Perceptron for Ranking and Structured Classification
Slide 1: The Voted Perceptron for Ranking and Structured Classification
Slide 2: A few critique questions
- Why use a non-convergent method for computing expectations (for skip-CRFs)? Was that the only choice?
  - Sadly, the choice is provably fast or provably convergent -- pick only one.
- Does it matter that the structure is different at different nodes in the skip-chain CRF?
- Does it matter that some linear-chain nodes have only one neighbor?
- Does it matter that some documents have 100 words and some have 1000?
- What is all the loopy BP stuff about, anyway?
  - See chapter 8 of Bishop's textbook for an introduction.
Slide 3: The voted perceptron
[Figure: the on-line A/B protocol -- A sends B an instance xi, B replies with a prediction, and A answers with the correct label.]
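The figure only names the two players, but slides 4-5 and 9 spell out what B does: keep a guess vector, predict with it, and on a mistake add the example (with its sign) to the guess, remembering how many examples each guess survived. A minimal sketch of that loop in Python, assuming +1/-1 labels and numpy feature vectors; every name here is illustrative rather than taken from the slides.

import numpy as np

def train_voted_perceptron(examples, n_epochs=1):
    """examples: list of (x, y) pairs with y in {-1, +1}.
    Returns the (v_k, m_k) pairs: each guess vector and its survival count."""
    dim = len(examples[0][0])
    v = np.zeros(dim)      # current guess v_k
    survived = 0           # m_k: how many examples v_k got right before the next mistake
    history = []           # (v_k, m_k) pairs, used later for voting
    for _ in range(n_epochs):
        for x, y in examples:
            if y * np.dot(v, x) <= 0:      # mistake: B's guess disagrees with A
                history.append((v.copy(), survived))
                v = v + y * x              # e.g. v2 = v1 + x2, or v2 = v1 - x2 for a negative x2
                survived = 0
            else:
                survived += 1
    history.append((v.copy(), survived))
    return history

The (v_k, m_k) pairs this returns are exactly what the voting step on slide 9 needs.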
Slide 4: (1) A target u. (2) The guess v1 after one positive example.
Slide 5: (3a) The guess v2 after the two positive examples: v2 = v1 + x2. (3b) The guess v2 after the one positive and one negative example: v2 = v1 - x2.
[Figure: vectors u, -u, v1, v2, x1, x2 and the margin 2γ.]
Slides 6-7: (animation builds of the figure on slide 5)
Slide 8: (no transcript)
Slide 9: On-line to batch learning
- Pick a vk at random according to mk/m, the fraction of examples it was used for.
- Predict using the vk you just picked.
- (Actually, use some sort of deterministic approximation to this -- see the sketch below.)
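The usual deterministic stand-ins for "pick vk with probability mk/m" are to let every vk cast mk votes (the voted prediction) or to predict with the mk-weighted average of the vk's. A sketch of both, reusing the (v_k, m_k) history from the training sketch above; again, the names are illustrative.

import numpy as np

def predict_voted(history, x):
    """Each surviving guess v_k casts m_k votes for sign(v_k . x)."""
    total = sum(m * np.sign(np.dot(v, x)) for v, m in history)
    return 1 if total >= 0 else -1

def predict_averaged(history, x):
    """Cheaper variant: a single dot product with the m_k-weighted average of the v_k's."""
    v_avg = sum(m * v for v, m in history)
    return 1 if np.dot(v_avg, x) >= 0 else -1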
Slide 10: The voted perceptron for ranking
[Figure: the A/B protocol, with A now sending a set of instances x1, x2, x3, x4 to be ranked.]
Slide 11: Ranking some x's with the target vector u
[Figure: the instances laid out by their scores along u.]
Slide 12: Ranking some x's with some guess vector v -- part 1
[Figure: the instances laid out by their scores along v.]
Slide 13: Ranking some x's with some guess vector v -- part 2. The purple-circled x is xb*, the one v ranks highest; the green one is xb, the one A has chosen to rank highest.
[Figure: the instances along v, with xb* and xb circled.]
Slide 14: Correcting v by adding xb - xb*
[Figure: the correction xb - xb* applied to v.]
Slide 15: Correcting v by adding xb - xb* (part 2): vk+1 = vk + xb - xb*
[Figure: vk and the corrected vk+1.]
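Written as code, the correction just pictured is one line: score every candidate with v, and if v's top pick (xb*) is not the one A wanted (xb), add their difference. A sketch under those assumptions; the function and variable names are illustrative.

import numpy as np

def rank_update(v, instances, correct_index):
    """instances: the feature vectors x1..xn; correct_index: b, A's choice."""
    predicted_index = max(range(len(instances)),
                          key=lambda j: np.dot(v, instances[j]))   # b*, v's top-ranked instance
    if predicted_index != correct_index:                           # a ranking mistake
        v = v + instances[correct_index] - instances[predicted_index]   # vk+1 = vk + xb - xb*
    return v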
Slide 16: (3a) The guess v2 after the two positive examples: v2 = v1 + x2.
[Figure: the margin picture from slide 5 again -- u, -u, v1, x2, and the margin 2γ.]
Slide 17: (animation build of the figure on slide 16)
Slide 18: Notice this doesn't depend at all on the number of x's being ranked.
(3a) The guess v2 after the two positive examples: v2 = v1 + x2.
[Figure: the margin picture from slide 5 again.]
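The reason is the classic perceptron mistake-bound argument, which these margin pictures appear to be tracing: each mistake grows v.u by at least the margin while ||v||^2 grows by a bounded amount, and neither step mentions how many x's were in the ranking. In LaTeX, with the textbook symbols (||u|| = 1, γ the margin, R a bound on the norm of whatever gets added on a mistake -- for the ranking update that is ||xb - xb*||); these constants are the standard ones, not read off the slides.

% Block/Novikoff-style bound: after k mistakes, starting from v_1 = 0,
\begin{align*}
  v_{k+1}\cdot u &\ge v_k\cdot u + \gamma  &&\Longrightarrow\; v_{k+1}\cdot u \ge k\gamma, \\
  \|v_{k+1}\|^2  &\le \|v_k\|^2 + R^2      &&\Longrightarrow\; \|v_{k+1}\|^2 \le k R^2.
\end{align*}
% Since $v_{k+1}\cdot u \le \|v_{k+1}\|$, we get $k\gamma \le \sqrt{k}\,R$, i.e.
% $k \le R^2/\gamma^2$ -- a bound with no dependence on the number of x's ranked.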
Slide 19: The voted perceptron for ranking
[Figure: the A/B protocol over instances x1, x2, x3, x4.]
Change number one: replace x with z.
Slide 20: The voted perceptron for NER
[Figure: the A/B protocol over instances z1, z2, z3, z4.]
- A sends B the Sha & Pereira paper and instructions for creating the instances.
- A sends a word vector xi. Then B could create the instances F(xi, y) itself... but instead B just returns the y that gives the best score for the dot product vk . F(xi, y), using Viterbi.
- A sends B the correct label sequence yi.
- On errors, B sets vk+1 = vk + zb - zb* = vk + F(xi, yi) - F(xi, ŷ), where ŷ is the sequence B returned. (A sketch follows this list.)
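A sketch of one such A/B round in Python, assuming F(x, y) decomposes into per-position emission features and label-bigram transition features (the usual first-order structure that Viterbi needs). The feature layout and every name here are illustrative; the slides themselves only say "decode with Viterbi under vk, and on errors add F(xi, yi) - F(xi, ŷ)".

import numpy as np

def viterbi(emission_scores, transition_scores):
    """emission_scores: (T, L) scores of each label at each position under vk;
    transition_scores: (L, L) scores of each label bigram.
    Returns the highest-scoring label sequence as a list of label indices."""
    T, L = emission_scores.shape
    delta = np.zeros((T, L))
    back = np.zeros((T, L), dtype=int)
    delta[0] = emission_scores[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + transition_scores + emission_scores[t]
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0)
    y = [int(delta[T - 1].argmax())]
    for t in range(T - 1, 0, -1):
        y.append(int(back[t][y[-1]]))
    return y[::-1]

def feature_vector(x, y, n_labels):
    """F(x, y): summed (label, word-feature) emission counts concatenated with
    (label, label) transition counts.  x is a (T, D) array of word vectors."""
    T, D = x.shape
    emit = np.zeros((n_labels, D))
    trans = np.zeros((n_labels, n_labels))
    for t in range(T):
        emit[y[t]] += x[t]
        if t > 0:
            trans[y[t - 1], y[t]] += 1
    return np.concatenate([emit.ravel(), trans.ravel()])

def perceptron_step(v, x, y_true, n_labels):
    """One round: B decodes with Viterbi under v; on an error, v += F(x, yi) - F(x, y_hat)."""
    D = x.shape[1]
    emit_w = v[:n_labels * D].reshape(n_labels, D)
    trans_w = v[n_labels * D:].reshape(n_labels, n_labels)
    y_hat = viterbi(x @ emit_w.T, trans_w)
    if y_hat != list(y_true):
        v = v + feature_vector(x, y_true, n_labels) - feature_vector(x, y_hat, n_labels)
    return v

In the voted version, B would also store each intermediate vk with its survival count mk, as on slide 9, and decode at test time by voting or averaging.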
Slide 21: The voted perceptron for NER
[Figure: the A/B protocol over instances z1, z2, z3, z4.]
- A sends a word vector xi.
- B just returns the y that gives the best score for vk . F(xi, y).
- A sends B the correct label sequence yi.
- On errors, B sets vk+1 = vk + zb - zb* = vk + F(xi, yi) - F(xi, ŷ).
Slide 22: Collins' results