Title: The Voted Perceptron for Ranking and Structured Classification
Slide 1: The Voted Perceptron for Ranking and Structured Classification
Slide 2: A few critique questions
- Why use a non-convergent method for computing expectations (for skip-CRFs)? Was that the only choice?
  - Sadly, the choice is provably fast or provably convergent -- pick only one.
- Does it matter that the structure is different at different nodes in the skip-chain CRF?
- Does it matter that some linear-chain nodes have only one neighbor?
- Does it matter that some documents have 100 words and some have 1000?
- What is all the loopy BP stuff about, anyway?
  - See chapter 8 of Bishop's textbook for an introduction.
Slide 3: The voted perceptron
[Figure: the on-line A/B protocol -- A sends B an instance xi, B replies with a prediction, and A answers with the correct label.]
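The figure only names the two players, but slides 4-5 and 9 spell out what B does: keep a guess vector, predict with it, and on a mistake add the example (with its sign) to the guess, remembering how many examples each guess survived. A minimal sketch of that loop in Python, assuming +1/-1 labels and numpy feature vectors; every name here is illustrative rather than taken from the slides.

import numpy as np

def train_voted_perceptron(examples, n_epochs=1):
    """examples: list of (x, y) pairs with y in {-1, +1}.
    Returns the (v_k, m_k) pairs: each guess vector and its survival count."""
    dim = len(examples[0][0])
    v = np.zeros(dim)      # current guess v_k
    survived = 0           # m_k: how many examples v_k got right before the next mistake
    history = []           # (v_k, m_k) pairs, used later for voting
    for _ in range(n_epochs):
        for x, y in examples:
            if y * np.dot(v, x) <= 0:      # mistake: B's guess disagrees with A
                history.append((v.copy(), survived))
                v = v + y * x              # e.g. v2 = v1 + x2, or v2 = v1 - x2 for a negative x2
                survived = 0
            else:
                survived += 1
    history.append((v.copy(), survived))
    return history

The (v_k, m_k) pairs this returns are exactly what the voting step on slide 9 needs.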
Slide 4: (1) A target u. (2) The guess v1 after one positive example.
Slide 5: (3a) The guess v2 after the two positive examples: v2 = v1 + x2. (3b) The guess v2 after the one positive and one negative example: v2 = v1 - x2.
[Figure: vectors u, -u, v1, v2, x1, x2 and the margin 2γ.]
Slides 6-7: (animation builds of the figure on slide 5)
Slide 8: (no transcript)
Slide 9: On-line to batch learning
- Pick a vk at random according to mk/m, the fraction of examples it was used for.
- Predict using the vk you just picked.
- (Actually, use some sort of deterministic approximation to this -- see the sketch below.)
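The usual deterministic stand-ins for "pick vk with probability mk/m" are to let every vk cast mk votes (the voted prediction) or to predict with the mk-weighted average of the vk's. A sketch of both, reusing the (v_k, m_k) history from the training sketch above; again, the names are illustrative.

import numpy as np

def predict_voted(history, x):
    """Each surviving guess v_k casts m_k votes for sign(v_k . x)."""
    total = sum(m * np.sign(np.dot(v, x)) for v, m in history)
    return 1 if total >= 0 else -1

def predict_averaged(history, x):
    """Cheaper variant: a single dot product with the m_k-weighted average of the v_k's."""
    v_avg = sum(m * v for v, m in history)
    return 1 if np.dot(v_avg, x) >= 0 else -1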
Slide 10: The voted perceptron for ranking
[Figure: the A/B protocol, with A now sending a set of instances x1, x2, x3, x4 to be ranked.]
Slide 11: Ranking some x's with the target vector u
[Figure: the instances laid out by their scores along u.]
Slide 12: Ranking some x's with some guess vector v -- part 1
[Figure: the instances laid out by their scores along v.]
Slide 13: Ranking some x's with some guess vector v -- part 2. The purple-circled x is xb*, the one v ranks highest; the green one is xb, the one A has chosen to rank highest.
[Figure: the instances along v, with xb* and xb circled.]
Slide 14: Correcting v by adding xb - xb*
[Figure: the correction xb - xb* applied to v.]
Slide 15: Correcting v by adding xb - xb* (part 2): vk+1 = vk + xb - xb*
[Figure: vk and the corrected vk+1.]
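Written as code, the correction just pictured is one line: score every candidate with v, and if v's top pick (xb*) is not the one A wanted (xb), add their difference. A sketch under those assumptions; the function and variable names are illustrative.

import numpy as np

def rank_update(v, instances, correct_index):
    """instances: the feature vectors x1..xn; correct_index: b, A's choice."""
    predicted_index = max(range(len(instances)),
                          key=lambda j: np.dot(v, instances[j]))   # b*, v's top-ranked instance
    if predicted_index != correct_index:                           # a ranking mistake
        v = v + instances[correct_index] - instances[predicted_index]   # vk+1 = vk + xb - xb*
    return v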
Slide 16: (3a) The guess v2 after the two positive examples: v2 = v1 + x2.
[Figure: the margin picture from slide 5 again -- u, -u, v1, x2, and the margin 2γ.]
Slide 17: (animation build of the figure on slide 16)
Slide 18: Notice this doesn't depend at all on the number of x's being ranked.
(3a) The guess v2 after the two positive examples: v2 = v1 + x2.
[Figure: the margin picture from slide 5 again.]
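The reason is the classic perceptron mistake-bound argument, which these margin pictures appear to be tracing: each mistake grows v.u by at least the margin while ||v||^2 grows by a bounded amount, and neither step mentions how many x's were in the ranking. In LaTeX, with the textbook symbols (||u|| = 1, γ the margin, R a bound on the norm of whatever gets added on a mistake -- for the ranking update that is ||xb - xb*||); these constants are the standard ones, not read off the slides.

% Block/Novikoff-style bound: after k mistakes, starting from v_1 = 0,
\begin{align*}
  v_{k+1}\cdot u &\ge v_k\cdot u + \gamma  &&\Longrightarrow\; v_{k+1}\cdot u \ge k\gamma, \\
  \|v_{k+1}\|^2  &\le \|v_k\|^2 + R^2      &&\Longrightarrow\; \|v_{k+1}\|^2 \le k R^2.
\end{align*}
% Since $v_{k+1}\cdot u \le \|v_{k+1}\|$, we get $k\gamma \le \sqrt{k}\,R$, i.e.
% $k \le R^2/\gamma^2$ -- a bound with no dependence on the number of x's ranked.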
Slide 19: The voted perceptron for ranking
[Figure: the A/B protocol over instances x1, x2, x3, x4.]
Change number one: replace x with z.
Slide 20: The voted perceptron for NER
[Figure: the A/B protocol over instances z1, z2, z3, z4.]
- A sends B the Sha & Pereira paper and instructions for creating the instances.
- A sends a word vector xi. Then B could create the instances F(xi, y) itself... but instead B just returns the y that gives the best score for the dot product vk . F(xi, y), using Viterbi.
- A sends B the correct label sequence yi.
- On errors, B sets vk+1 = vk + zb - zb* = vk + F(xi, yi) - F(xi, ŷ), where ŷ is the sequence B returned. (A sketch follows this list.)
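A sketch of one such A/B round in Python, assuming F(x, y) decomposes into per-position emission features and label-bigram transition features (the usual first-order structure that Viterbi needs). The feature layout and every name here are illustrative; the slides themselves only say "decode with Viterbi under vk, and on errors add F(xi, yi) - F(xi, ŷ)".

import numpy as np

def viterbi(emission_scores, transition_scores):
    """emission_scores: (T, L) scores of each label at each position under vk;
    transition_scores: (L, L) scores of each label bigram.
    Returns the highest-scoring label sequence as a list of label indices."""
    T, L = emission_scores.shape
    delta = np.zeros((T, L))
    back = np.zeros((T, L), dtype=int)
    delta[0] = emission_scores[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + transition_scores + emission_scores[t]
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0)
    y = [int(delta[T - 1].argmax())]
    for t in range(T - 1, 0, -1):
        y.append(int(back[t][y[-1]]))
    return y[::-1]

def feature_vector(x, y, n_labels):
    """F(x, y): summed (label, word-feature) emission counts concatenated with
    (label, label) transition counts.  x is a (T, D) array of word vectors."""
    T, D = x.shape
    emit = np.zeros((n_labels, D))
    trans = np.zeros((n_labels, n_labels))
    for t in range(T):
        emit[y[t]] += x[t]
        if t > 0:
            trans[y[t - 1], y[t]] += 1
    return np.concatenate([emit.ravel(), trans.ravel()])

def perceptron_step(v, x, y_true, n_labels):
    """One round: B decodes with Viterbi under v; on an error, v += F(x, yi) - F(x, y_hat)."""
    D = x.shape[1]
    emit_w = v[:n_labels * D].reshape(n_labels, D)
    trans_w = v[n_labels * D:].reshape(n_labels, n_labels)
    y_hat = viterbi(x @ emit_w.T, trans_w)
    if y_hat != list(y_true):
        v = v + feature_vector(x, y_true, n_labels) - feature_vector(x, y_hat, n_labels)
    return v

In the voted version, B would also store each intermediate vk with its survival count mk, as on slide 9, and decode at test time by voting or averaging.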
Slide 21: The voted perceptron for NER
[Figure: the A/B protocol over instances z1, z2, z3, z4.]
- A sends a word vector xi.
- B just returns the y that gives the best score for vk . F(xi, y).
- A sends B the correct label sequence yi.
- On errors, B sets vk+1 = vk + zb - zb* = vk + F(xi, yi) - F(xi, ŷ).
Slide 22: Collins' results