Learning and Inference for Hierarchically Split PCFGs

Slides: 34

Provided by: EECS

1
Learning and Inference for Hierarchically Split PCFGs

Slav Petrov and Dan Klein

2
The Game of Designing a Grammar

Annotation refines base treebank symbols to improve the statistical fit of the grammar
Parent annotation [Johnson 98]

3
The Game of Designing a Grammar

Annotation refines base treebank symbols to improve the statistical fit of the grammar
Parent annotation [Johnson 98]
Head lexicalization [Collins 99, Charniak 00]

4
The Game of Designing a Grammar

Annotation refines base treebank symbols to improve the statistical fit of the grammar
Parent annotation [Johnson 98]
Head lexicalization [Collins 99, Charniak 00]
Automatic clustering?

5
Learning Latent Annotations
[Matsuzaki et al. 05]

EM algorithm

Brackets are known
Base categories are known
Only induce subcategories

Just like Forward-Backward for HMMs.
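
The E-step this slide alludes to can be sketched on a single tree. Because the brackets and base categories are fixed, the inside and outside passes only sum over the latent subcategory at each node, much as Forward-Backward sums over hidden states along a chain. The tree, the rule tables `binary` and `lexical`, and all probabilities below are made-up toy values, not taken from the talk; the M-step (renormalizing expected rule counts) is omitted.

```python
# Sketch under stated assumptions: toy grammar, toy probabilities.
# binary[(A, B, C)][x][y][z] stands for P(A_x -> B_y C_z)
# lexical[(A, w)][x]         stands for P(A_x -> w)

def inside(node, binary, lexical):
    """Inside score of each subcategory at this node (tree structure is fixed)."""
    if len(node) == 2:  # preterminal: (label, word)
        label, word = node
        return lexical[(label, word)]
    label, left, right = node
    in_l = inside(left, binary, lexical)
    in_r = inside(right, binary, lexical)
    rule = binary[(label, left[0], right[0])]
    return [
        sum(rule[x][y][z] * in_l[y] * in_r[z]
            for y in range(len(in_l)) for z in range(len(in_r)))
        for x in range(len(rule))
    ]

def node_posteriors(node, out, binary, lexical, path=(), result=None):
    """Unnormalized outside*inside per node, keyed by the path from the root.

    Inside scores are recomputed at every level for brevity; a real
    implementation would cache them.
    """
    if result is None:
        result = {}
    ins = inside(node, binary, lexical)
    result[path] = [o * i for o, i in zip(out, ins)]
    if len(node) == 3:
        label, left, right = node
        rule = binary[(label, left[0], right[0])]
        in_l = inside(left, binary, lexical)
        in_r = inside(right, binary, lexical)
        # Outside score of a child: parent's outside, the rule, and the sibling's inside.
        out_l = [sum(out[x] * rule[x][y][z] * in_r[z]
                     for x in range(len(out)) for z in range(len(in_r)))
                 for y in range(len(in_l))]
        out_r = [sum(out[x] * rule[x][y][z] * in_l[y]
                     for x in range(len(out)) for y in range(len(in_l)))
                 for z in range(len(in_r))]
        node_posteriors(left, out_l, binary, lexical, path + (0,), result)
        node_posteriors(right, out_r, binary, lexical, path + (1,), result)
    return result

# Toy example: an unsplit root S, with NP and VP each split into two subcategories.
tree = ("S", ("NP", "she"), ("VP", "runs"))
binary = {("S", "NP", "VP"): [[[0.4, 0.1], [0.2, 0.3]]]}
lexical = {("NP", "she"): [0.5, 0.1], ("VP", "runs"): [0.2, 0.3]}

raw = node_posteriors(tree, [1.0], binary, lexical)
likelihood = sum(raw[()])  # probability of the observed tree
post = {path: [v / likelihood for v in vec] for path, vec in raw.items()}
```

Here `post[(0,)]` is the posterior over the two NP subcategories given the whole tree. Accumulating such posteriors over a treebank would give the expected counts that the M-step renormalizes.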
6
Overview
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing
7
Refinement of the DT tag
DT
8
Refinement of the DT tag
DT
9
Hierarchical refinement of the DT tag
DT
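
One way to picture hierarchical refinement: each existing subcategory is split in two, the two copies are perturbed slightly so that EM can pull them apart, and the process repeats. The sketch below shows only the splitting step on a lexical tag. The emission table `dt`, the helper `split_in_two`, and the 1% noise level are illustrative assumptions, and the EM retraining that would run between rounds is left out.

```python
import random

def split_in_two(emissions, rng, noise=0.01):
    """Return two slightly perturbed, renormalized copies of one subcategory.

    `noise` is a hypothetical perturbation size used only to break symmetry.
    """
    children = []
    for _ in range(2):
        perturbed = {w: p * (1.0 + noise * (2.0 * rng.random() - 1.0))
                     for w, p in emissions.items()}
        total = sum(perturbed.values())
        children.append({w: p / total for w, p in perturbed.items()})
    return children

rng = random.Random(0)
dt = {"the": 0.6, "a": 0.3, "this": 0.1}  # made-up emission probabilities for DT

level = [dt]                 # one unsplit DT tag
for _ in range(3):           # three split rounds: 1 -> 2 -> 4 -> 8 subcategories
    level = [child for parent in level for child in split_in_two(parent, rng)]
    # (EM would be rerun here before the next split; omitted in this sketch.)
```

After three rounds `level` holds eight DT subcategories arranged as the leaves of a binary hierarchy, each still a proper distribution over words.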
10
Hierarchical Estimation Results
11
Refinement of the , (comma) tag

Splitting all categories by the same amount is wasteful

12
Adaptive Splitting

Want to split complex categories more
Idea: split everything, then roll back the splits that were least useful

13
Adaptive Splitting

Want to split complex categories more
Idea: split everything, then roll back the splits that were least useful
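
The roll-back step can be sketched as a ranking problem. Assume each fresh split already has an estimate of how much likelihood would be lost by merging it back; the splits with the smallest loss are the ones undone. The function name `choose_rollbacks`, the loss values, and the default `fraction` of one half are assumptions made for illustration, and the loss estimation itself is not shown.

```python
def choose_rollbacks(loss_if_merged, fraction=0.5):
    """Pick the splits to undo: those whose removal costs the least likelihood.

    loss_if_merged maps a split symbol to an (assumed precomputed) estimate
    of the likelihood lost if its two halves were merged again.
    """
    ranked = sorted(loss_if_merged, key=loss_if_merged.get)  # least useful first
    n_rollback = int(len(ranked) * fraction)
    return set(ranked[:n_rollback])

# Made-up loss estimates for one split round.
losses = {"NP": 12.4, "VP": 9.1, ",": 0.0, "TO": 0.02, "DT": 3.3, "POS": 0.01}

rolled_back = choose_rollbacks(losses)
kept = set(losses) - rolled_back
```

With these toy numbers the comma, POS, and TO splits are rolled back while NP, VP, and DT keep theirs, so complex categories end up with more subcategories than simple ones.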

14
Adaptive Splitting Results
15
Number of Phrasal Subcategories
16
Number of Phrasal Subcategories
NP
VP
PP
17
Number of Phrasal Subcategories
NAC
X
18
Number of Lexical Subcategories
POS
TO
,
19
Number of Lexical Subcategories
NNP
JJ
NNS
NN
20
Smoothing