Transformation-based error-driven learning (TBL)

Slides: 30

Provided by: facultyWa9

Learn more at: http://faculty.washington.edu

Transcript and Presenter's Notes

1
Transformation-based error-driven learning (TBL)

LING 572
Fei Xia
1/19/06

2
Outline

Basic concept and properties
Relation between DT, DL, and TBL
Case study

3
Basic concepts and properties
4
TBL overview

Introduced by Eric Brill (1992)
Intuition
Start with some simple solution to the problem
Then apply a sequence of transformations
Applications
Classification problems
Other kinds of problems, e.g., parsing

5
TBL flowchart
6
Transformations

A transformation has two components
A triggering environment: e.g., the previous tag is
DT
A rewrite rule: e.g., change the current tag from MD to
N
If (prev_tag = DT) then MD → N
Similar to a rule in a decision tree, but the
rewrite rule can be complicated (e.g., change a
parse tree)
⇒ A transformation list is a processor and not
(just) a classifier.
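
The two components can be written down as a small data structure. This is only a sketch, assuming tags are kept in a plain Python list; the names `Transformation`, `trigger`, and `apply_at` are illustrative and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Transformation:
    # Triggering environment: a predicate over (tags, position).
    trigger: Callable[[List[str], int], bool]
    from_tag: str  # tag the rewrite rule expects at the current position
    to_tag: str    # tag the rewrite rule produces

    def apply_at(self, tags: List[str], i: int) -> str:
        """Return the tag for position i after this transformation."""
        if tags[i] == self.from_tag and self.trigger(tags, i):
            return self.to_tag
        return tags[i]


# If (prev_tag = DT) then MD -> N
rule = Transformation(
    trigger=lambda tags, i: i > 0 and tags[i - 1] == "DT",
    from_tag="MD",
    to_tag="N",
)

tags = ["DT", "MD", "VB"]
new_tags = [rule.apply_at(tags, i) for i in range(len(tags))]
```

Here the rewrite rule is a simple tag substitution; a rule that edits a parse tree would need a richer `apply_at`, which is why a transformation list is better seen as a processor than as a classifier.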

7
Learning transformations

1. Initialize the label of each example in the training
data with a classifier.
2. Consider all the possible transformations, and
choose the one with the highest score.
3. Append it to the transformation list and apply it
to the training corpus to obtain a new corpus.
4. Repeat steps 2-3.
⇒ Steps 2-3 can be expensive. Various methods
try to address this problem.
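
The learning loop above can be sketched as a greedy search. This is a minimal illustration, not the procedure from the original lecture: the score is assumed to be the reduction in the number of errors, the stopping criterion (stop when no candidate reduces errors, or after `max_rules` rules) is an assumption, and the names `tbl_learn`, `md_to_n_after_dt`, and `vb_to_md` are hypothetical.

```python
def tbl_learn(initial, truth, candidates, max_rules=10):
    """Greedy TBL: repeatedly pick the candidate with the highest score.

    initial    -- labels produced by the initial classifier
    truth      -- gold labels
    candidates -- functions mapping a label list to a new label list
    """
    def errors(labels):
        return sum(a != b for a, b in zip(labels, truth))

    corpus, learned = list(initial), []
    for _ in range(max_rules):
        # Step 2: score every candidate by its error reduction.
        best = max(candidates, key=lambda t: errors(corpus) - errors(t(corpus)))
        gain = errors(corpus) - errors(best(corpus))
        if gain <= 0:  # assumed stopping criterion
            break
        # Step 3: record the rule and transform the corpus.
        learned.append(best)
        corpus = best(corpus)
    return learned, corpus


def md_to_n_after_dt(tags):
    # If (prev_tag = DT) then MD -> N
    return ["N" if t == "MD" and i > 0 and tags[i - 1] == "DT" else t
            for i, t in enumerate(tags)]


def vb_to_md(tags):
    # A deliberately unhelpful candidate: VB -> MD everywhere.
    return ["MD" if t == "VB" else t for t in tags]


initial = ["DT", "MD", "VB", "DT", "MD"]
truth = ["DT", "N", "VB", "DT", "N"]
learned, final = tbl_learn(initial, truth, [md_to_n_after_dt, vb_to_md])
```

Every candidate is re-applied to the whole corpus in every round, which shows why steps 2-3 are expensive when the transformation space is large.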

8
Using TBL

The initial state-annotator
The space of allowable transformations
Rewrite rules
Triggering environments
The objective function: minimize error rate
directly.
For comparing the corpus to the truth
For choosing a transformation

9
Using TBL (cont)

Two more parameters
Whether the effect of a transformation is visible
to following transformations
If so, what is the order in which transformations
are applied to a corpus?
left-to-right
right-to-left

10
The order matters

Transformation
If prev_label = A then change the cur_label from A
to B.
Input: A A A A A A
Output:
Not immediate results: A B B B B B
Immediate results, left-to-right: A B A B A B
Immediate results, right-to-left: A B B B B B
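
The three outputs above can be reproduced with a short simulation. This is a sketch under the assumption that labels are stored in a list and that "immediate" means each position reads the labels already rewritten in the same pass; the function name `apply_rule` is illustrative.

```python
def apply_rule(labels, immediate, order="left-to-right"):
    """Apply: if prev_label == A then change cur_label from A to B."""
    out = list(labels)
    # Read from the output when effects are immediately visible,
    # otherwise from the untouched input.
    view = out if immediate else labels
    positions = range(len(labels))
    if order == "right-to-left":
        positions = reversed(positions)
    for i in positions:
        if i > 0 and view[i - 1] == "A" and view[i] == "A":
            out[i] = "B"
    return out


inp = list("AAAAAA")
delayed = "".join(apply_rule(inp, immediate=False))
ltr = "".join(apply_rule(inp, immediate=True, order="left-to-right"))
rtl = "".join(apply_rule(inp, immediate=True, order="right-to-left"))
```

With immediate results and left-to-right order, each rewrite to B removes the trigger for the next position, which produces the alternating pattern.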

11
Relation between DT, DL, and TBL
12
DT and TBL

DT is a subset of TBL
(Proof)
when depth(DT) = 1

Label with S
If X then S → A
S → B

13
DT is a subset of TBL
Depth = n
L1: Label with S', L1'
L2: Label with S'', L2'
Depth = n+1
Label with S
If X then S → S'
S → S''
L1'
L2'
14
DT is a subset of TBL
Label with S
If X then S → S'
S → S''
L1' (renaming X with X')
L2' (renaming X with X'')
X' → X
X'' → X
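
The inductive construction can be sketched as a recursive conversion. This is an illustration under stated assumptions rather than the proof itself: a tree is either a class label (a string) or a tuple `(query, yes_subtree, no_subtree)`, a rule is `(query_or_None, from_label, to_label)`, and globally fresh state names (`_S0`, `_S1`, ...) are used in place of the renaming step, on the assumption that no class label has that form. All function names are hypothetical.

```python
import itertools


def dt_to_tbl(tree):
    """Convert a binary decision tree into (start_label, rules)."""
    fresh = (f"_S{i}" for i in itertools.count())
    rules = []

    def target(node):
        # Leaves map straight to their class; subtrees get a fresh state.
        return node if isinstance(node, str) else next(fresh)

    def build(node, state):
        query, yes, no = node
        s_yes, s_no = target(yes), target(no)
        rules.append((query, state, s_yes))  # If X then S -> S'
        rules.append((None, state, s_no))    # S -> S''
        if not isinstance(yes, str):
            build(yes, s_yes)                # rules of L1'
        if not isinstance(no, str):
            build(no, s_no)                  # rules of L2'

    start = next(fresh)
    if isinstance(tree, str):                # degenerate tree: a single leaf
        return start, [(None, start, tree)]
    build(tree, start)
    return start, rules


def run_tbl(start, rules, example):
    label = start                            # Label with S
    for query, src, dst in rules:
        if label == src and (query is None or example[query]):
            label = dst
    return label


def run_dt(tree, example):
    while not isinstance(tree, str):
        query, yes, no = tree
        tree = yes if example[query] else no
    return tree


tree = ("x", ("y", "A", "B"), "C")
start, rules = dt_to_tbl(tree)
examples = [{"x": x, "y": y} for x in (True, False) for y in (True, False)]
agree = all(run_tbl(start, rules, e) == run_dt(tree, e) for e in examples)
```

Because every rule rewrites only a state label and never a class label, an example keeps its class once it has been assigned, so the rules of the two subtrees cannot interfere.
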
15
DT is a proper subset of TBL

There exists a problem that can be solved by TBL
but not a DT, for a fixed set of primitive
queries.
Ex: Given a sequence of characters
Classify a char based on its position
If pos % 4 == 0 then yes else no
Input attributes available: previous two chars

Transformation list
Label with S
A/S A/S A/S A/S A/S A/S A/S
If there is no previous character, then S → F
A/F A/S A/S A/S A/S A/S A/S
If the char two to the left is labeled with F,
then S → F
A/F A/S A/F A/S A/F A/S A/F
If the char two to the left is labeled with F,
then F → S
A/F A/S A/S A/S A/F A/S A/S
F → yes
S → no
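
The transformation list above can be checked by simulation. This sketch assumes positions are counted from 0 and that each transformation is applied left-to-right with immediately visible results, which is what the intermediate labelings shown above require; the function names are illustrative.

```python
def apply_ltr(labels, trigger, src, dst):
    """Apply one transformation left-to-right with immediate results."""
    for i in range(len(labels)):
        if labels[i] == src and trigger(labels, i):
            labels[i] = dst
    return labels


def two_left_is_f(labels, i):
    return i >= 2 and labels[i - 2] == "F"


def classify_positions(n):
    labels = ["S"] * n                                  # Label with S
    apply_ltr(labels, lambda l, i: i == 0, "S", "F")    # no previous char
    apply_ltr(labels, two_left_is_f, "S", "F")          # every even position
    apply_ltr(labels, two_left_is_f, "F", "S")          # undo every other F
    return ["yes" if x == "F" else "no" for x in labels]


result = classify_positions(7)
```

The third transformation relies on seeing its own earlier rewrites: once position 2 is changed back to S, position 4 no longer triggers and keeps its F.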

17
DT and TBL

TBL is more powerful than DT
Extra power of TBL comes from
Transformations are applied in sequence
Results of previous transformations are visible
to following transformations.

18
DL and TBL

DL is a proper subset of TBL.
In two-class TBL
(if q then y' → y) ⇔ (if q then y)
If multiple transformations apply to an example,
only the last one matters

19
Two-class TBL ⇔ DL?

Two-class TBL ⇒ DL
Replace "if q then y' → y" with "if q then y"
Reverse the rule order
DL ⇒ two-class TBL
Replace "if q then y" with "if q then y' → y"
Reverse the rule order
The equivalence does not hold for dynamic problems
Dynamic problem: the answers to questions are not
static
Ex: in POS tagging, when the tag of a word is
changed, it changes the answers to questions for
nearby words.
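
For a static problem, the conversion by reversing the rule order can be sketched as follows. The representation is an assumption made for illustration: a rule is a pair `(query, y)`, the initial label of the transformation list is treated as the default of the decision list, and the function names are hypothetical.

```python
def run_tbl(rules, x, initial):
    """Two-class TBL on a static problem: the last rule that fires wins."""
    label = initial
    for query, y in rules:
        if query(x):
            label = y  # "if q then y' -> y" behaves like "if q then y"
    return label


def run_dl(rules, x, default):
    """Decision list: the first rule that fires wins."""
    for query, y in rules:
        if query(x):
            return y
    return default


def tbl_to_dl(rules):
    # Rules are already stored as (q, y), so only the order changes.
    return list(reversed(rules))


tbl_rules = [
    (lambda x: x % 2 == 0, "yes"),
    (lambda x: x % 3 == 0, "no"),
    (lambda x: x > 10, "yes"),
]
dl_rules = tbl_to_dl(tbl_rules)
same = all(run_tbl(tbl_rules, x, "no") == run_dl(dl_rules, x, "no")
           for x in range(20))
```

The queries here depend only on the input `x`. In a dynamic problem a query would depend on labels that earlier rules have changed, and reversing the order would no longer give the same answers.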

20
DT, DL, and TBL (summary)

k-DT is a proper subset of k-DL.
DL is a proper subset of TBL.
Extra power of TBL comes from
Transformations are applied in sequence
Results of previous transformations are visible
to following transformations.
TBL transforms training data. It does not split
training data.
TBL is a processor, not just a classifier

21
Case study
22
TBL for POS tagging

The initial state-annotator: most common tag for
a word.
The space of allowable transformations
Rewrite rules: change cur_tag from X to Y.
Triggering environments (feature types):
unlexicalized or lexicalized
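
A most-common-tag initial annotator can be sketched in a few lines. This assumes the training corpus is a list of `(word, tag)` pairs; the fallback tag `"NN"` for unseen words is an assumption, since the slides do not say how unknown words are handled, and the function names are illustrative.

```python
from collections import Counter, defaultdict


def train_initial_annotator(tagged_corpus):
    """Map each word to its most frequent tag in the training corpus."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}


def annotate(words, lexicon, default="NN"):
    # Unknown words fall back to `default` (an assumption).
    return [lexicon.get(w, default) for w in words]


corpus = [("the", "DT"), ("can", "MD"), ("can", "MD"), ("can", "NN")]
lexicon = train_initial_annotator(corpus)
tags = annotate(["the", "can", "opener"], lexicon)
```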

23
Unlexicalized features

t-1 is z
t-1 or t-2 is z
t-1 or t-2 or t-3 is z
t-1 is z and t+1 is w

24
Lexicalized features

w0 is w.
w-1 is w
w-1 or w-2 is w
t-1 is z and w0 is w.
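
The feature types on the last two slides can be expressed as predicates over a position in the sentence. This is a sketch covering a few of the templates, assuming `words` and `tags` are parallel lists; the helper `any_at` and the other function names are hypothetical.

```python
def any_at(seq, i, value, offsets):
    """True if seq holds `value` at any in-range offset from position i."""
    return any(0 <= i + d < len(seq) and seq[i + d] == value for d in offsets)


def t_prev1(tags, i, z):                 # t-1 is z
    return any_at(tags, i, z, [-1])


def t_prev1or2(tags, i, z):              # t-1 or t-2 is z
    return any_at(tags, i, z, [-1, -2])


def t_prev_and_next(tags, i, z, w):      # t-1 is z and t+1 is w
    return any_at(tags, i, z, [-1]) and any_at(tags, i, w, [1])


def w_cur(words, i, w):                  # w0 is w
    return any_at(words, i, w, [0])


def t_prev_and_w_cur(words, tags, i, z, w):   # t-1 is z and w0 is w
    return any_at(tags, i, z, [-1]) and any_at(words, i, w, [0])


words = ["the", "can", "rusted"]
tags = ["DT", "MD", "VBD"]
checks = (
    t_prev1(tags, 1, "DT"),
    t_prev1or2(tags, 2, "DT"),
    t_prev_and_next(tags, 1, "DT", "VBD"),
    t_prev_and_w_cur(words, tags, 1, "DT", "can"),
    t_prev1(tags, 0, "DT"),
)
```

Each template becomes a concrete triggering environment once `z` and `w` are filled in with actual tags or words, so the number of candidate transformations grows with the tag set and the vocabulary.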

25
TBL for POS tagging (cont)

The objective function: tagging accuracy
For comparing the corpus to the truth
For choosing a transformation: choose the one
that results in the greatest error reduction.
The order of applying transformations:
left-to-right.
The results of applying transformations are not
visible to other transformations.

26
Learned transformations
27
Experiments
28
Uncovered issues