Title: Generation in the Context of MT
1 Generation in the Context of MT
2 The Team
- Senior members, affiliate members
- Jan Hajic, Charles Univ., Prague
- Drago Radev, Univ. of Michigan
- Gerald Penn, Univ. of Toronto
- Jason Eisner, Johns Hopkins Univ.
- Owen Rambow, Univ. of Pennsylvania
- Dan Gildea, Univ. of Pennsylvania
- Bonnie Dorr, Univ. of Maryland
- Students
- Yuan Ding, Univ. of Pennsylvania
- Martin Cmejrek, Charles Univ., Prague
- Terry Koo, MIT
- Kristen Parton, Stanford Univ.
- Jan Curín, Charles University
- Ivona Kucerová, Charles University
- Pre-workshop work (Charles University)
- Zdenek Žabokrtský
- Petr Pajas
- Václav Honetschläger
- Alena Böhmová
- Vladislav Kubon
- Jirí Havelka
3 The Goal
- Generate English (linear surface form)
- from syntactic-semantic sentence representation (so-called tectogrammatical, or TR)
- Possible application setting
- machine translation
- other uses
- Front-end for QA systems, summarization
- Evaluate under various circumstances
4 Tectogrammatical Representation
According to his opinion UAL's executives were misinformed about the financing of the original transaction
5 Tectogrammatical Representation
According to he opinion UALs executive were
misinform about the financing of the original
transaction
6 TR in Machine Translation
Vedení UAL bylo podle jeho názoru o financování původní transakce nesprávně informováno.
NULL
7The MT Framework
Source language textCZECH
8The MT Framework
AR trees
CZECH
ENGLISH
9 Translating trees
[Figure: a pair of trees, one over the nodes a, b, c, d, e, f and one over the nodes A, B, CD, E, F]
10 Tools and Data Resources
- Tools
- WS98 Czech parser, other Czech tools (tagger)
- GIZA (WS99), ISI decoder
- Data
- PTB (40k sentences)
- PTB translation to Czech (11k sentences)
- Prague Dependency Treebank 1.0 (90k sentences)
- Prague Dependency Treebank 2.0 preliminary
- 15k sentences manually annotated
- Monolingual data
11 The Evaluation Metric: BLEU
- Plain English output (MT, Generation)
- difficult and/or expensive to evaluate subjectively
- BLEU (IBM)
- automatic method, score 0..1
- relative scores ≈ subjective human evaluation
- needs several reference gold standards
- n-gram-based metric with a small-length (brevity) penalty
- Different local evaluations throughout, too
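The n-gram matching and length penalty named above can be sketched in a few lines. This is a simplified, sentence-level approximation for illustration only, not the IBM scoring script used in the workshop; the function names `ngrams` and `bleu` are assumptions, and no smoothing is applied, so any missing n-gram order yields a score of 0.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU: clipped n-gram precision plus brevity penalty."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each n-gram count by its maximum count in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        matched = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # The brevity penalty uses the reference length closest to the candidate length.
    c_len = len(candidate)
    r_len = min((abs(len(r) - c_len), len(r)) for r in references)[1]
    bp = 1.0 if c_len >= r_len else math.exp(1.0 - r_len / c_len)
    return bp * math.exp(sum(log_precisions) / max_n)


REFERENCES = [
    "the girl kissed her kitty cat".split(),
    "the girl gave a kiss to her cat".split(),
]
SCORE = bleu("the girl kissed her cat".split(), REFERENCES)
```

A candidate identical to one of the references scores 1.0, and a shorter or partially matching candidate falls strictly between 0 and 1, which is why several references help: each one widens the set of n-grams that count as correct.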
12Presentation Outline
- The Systems and Their Inputs
- Getting the data tools ready
- The Statistical Generation System
- The channel model
- Word order, Punctuation, Morphology
- The Hybrid Approach
- Evaluation Results
- Student Project Proposals
- Conclusions and Future Directions
13Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
CZECH
ENGLISH
14 The Systems and Their Inputs
15 WS02GMT
- System 1: statistical
- System 2: hybrid
- Output: English linear surface form
- Input 1: automatically created English TR
- Input 2: manually created English TR
- Input 3: improved automatic English TR (PropBank)
- Input 4: Czenglish TR (simple translation)
16 Input 1: Automatic English TR
- Penn Treebank v. 3
- heads (Jason Eisner's code, modifications)
- lemmatization
- word IDs
- rule-based transformation to English AR, TR
- (by Kucerová, Žabokrtský)
- → English TR (I1), size: 40k sentences
17 Input 2: Manual English TR
- Penn Treebank v. 3
- Input 1
- manual annotation (correction) (IK)
- including
- deep word order, conversion of grammatical codes
- → English TR (I2), size: 1.5k sentences
18 Input 3: Enhanced Automatic English TR
- Penn Treebank v. 3
- Input 1
- PropBank
- additional sources
- → English TR (I3), size: 40k sentences
19 Input 4: Automatic Czenglish TR
- Linear Surface Czech
- Czech tagging, lemmatization
- Parsed to Czech AR, Czech TR
- Simple Transfer (Lemma translation)
- lexical replacement
- dictionary collected from web, MRDs
- trained on TR lemmas by GIZA
- → Czenglish TR (I4), 11k sentences
20 Dictionary Filtering
[Figure: data-flow diagram; legend: input data source, output data, tools]
- Frequencies on English monolingual corpus (North American News Text), 365 M words
- 4 Czech/English dictionary sources (WinGED, GNU/FDL, PCTrans, EuroWordNet)
- Merging, pruning
- Czech POS, English POS
- Czech/English parallel Penn TreeBank corpus
- GIZA training
- Czech/English dictionary for transfer
21 Word-by-word translation of TR lemmas
- Word-by-word dictionary: 42 835 entries, 65 408 translations
- format
- <e>tecka<t>N
- <tr>spot<trt>N<prob>0.353598
- <tr>dot<trt>N<prob>0.28792
- <tr>full _at_stop<trt>N<prob>0.28729
- 1-1, 1-2 (2-1 translations not yet implemented)
- packed forest representation for multiple translation choice
- simplified version: choose the first best
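The "choose the first best" simplification can be sketched as follows. The sketch assumes the dictionary lines read `<e>lemma<t>POS` for an entry and `<tr>lemma<trt>POS<prob>p` for each of its translations, which is a reading of the sample entry above; the function names and the fall-back to the untranslated lemma are assumptions, not part of the workshop code.

```python
import re

# Assumed line shapes, reconstructed from the sample dictionary entry.
ENTRY = re.compile(r"<e>(.+?)<t>(\w+)")
TRANSLATION = re.compile(r"<tr>(.+?)<trt>(\w+)<prob>([0-9.]+)")


def parse_dictionary(lines):
    """Map each (Czech lemma, POS) entry to its (English lemma, POS, prob) list."""
    entries = {}
    current = None
    for line in lines:
        line = line.strip()
        match = ENTRY.fullmatch(line)
        if match:
            current = (match.group(1), match.group(2))
            entries[current] = []
            continue
        match = TRANSLATION.fullmatch(line)
        if match and current is not None:
            entries[current].append(
                (match.group(1), match.group(2), float(match.group(3))))
    return entries


def first_best(entries, lemma, pos):
    """Return the most probable translation, or the lemma itself if unknown."""
    options = entries.get((lemma, pos))
    if not options:
        return lemma
    return max(options, key=lambda option: option[2])[0]


SAMPLE = [
    "<e>tecka<t>N",
    "<tr>spot<trt>N<prob>0.353598",
    "<tr>dot<trt>N<prob>0.28792",
    "<tr>full _at_stop<trt>N<prob>0.28729",
]
DICTIONARY = parse_dictionary(SAMPLE)
```

With the sample entry, `first_best(DICTIONARY, "tecka", "N")` picks `spot`, the translation with the highest probability. A packed-forest version would keep all three options instead of committing to one.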
22Where are we?
w/additional info
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
CZECH
ENGLISH
23 Automatically Annotating a Tectogrammatical Corpus
24 Goal
- Use PropBank annotations to
- Improve automatic construction of English TRs
- Allow generation from generic pred-arg structures
25Types of Corpus Annotation
- Surface Syntax
- Deep Syntax
- Local Lexical Semantics
- Global Lexical Semantics
- Hybrid Deep Syntactic/Global Semantic
- Tectogrammatical level used here
26 Surface Syntax: e.g., Penn Treebank
[Figure: surface dependency tree for the sentence below; nodes: loaded, is, hay, into, trucks, by, John; arc labels: subj, comp, prepobj]
Hay is loaded into trucks by John
27 Deep Syntax: e.g., TAG
John loads hay into trucks
Hay is loaded into trucks by John
28 Local Semantics: Penn PropBank (brand new)
John loads hay into trucks
John loads trucks with hay
29 Global Semantics: LCS (U. Md.)
John loads hay into trucks
John throws hay into trucks
30 Tectogrammatical Representation
- First two syntactic arguments of verb: deep-syntactic
- All other arguments: global semantic
[Figure: three TR trees, two headed by load and one by throw, each over the nodes John, hay, and truck; arc labels used: act, pat, dir3, acmp]
John loads hay into trucks
John throws hay into trucks
John loads trucks with hay
31 Why Use TR? Research Hypothesis
- Replacing function words by TR arc labels makes transfer easier
- Choice of realization: target-language-dependent
- Deep-syntactic labels for first two arguments: realization more verb-specific
- Global semantic labels on remaining arguments: realization just label-specific
32 Available Resources for Input 3
- Surface syntax: PTB corpus (hand, checked)
- Deep syntax: derived automatically from PTB (Chen01)
- Local semantics: PropBank corpus and frame lexicon (hand, checked)
- Global semantics: LCS lexicon (partially hand, partially checked)
- TR: PTB subset corpus (hand), PropBank → TR dictionary (hand, not checked) (I. Kucerová)
33 Experiment: Machine Learning of TR Labels Using Ripper
- Ripper (Cohen 1996): greedy symbolic rule learner, set- and bag-valued features
- Features
- Surface, deep syntactic info
- Local, global semantic info
- Kucerová's PropBank → TR dictionary (hand-crafted)
- Input 1 (Automatic English TR)
34 Results (TR Label Error Rates)
Rows: syntax features. Columns: semantics features.

Syntax \ Semantics    PB → TR dict   all    local-global   local   none
none                  37.7           22.6   23.7           25.9    58.8
Input 1               17.1           15.9   16.3           17.7    19.5
surface-deep          16.2           16.7   17.1           16.4    16.5
surface-deep-Inp1     14.4           16.1   16.2           15.9    15.5

Average accuracy on 5-fold cross-validation (1326
data points)
35 Conclusions
- Machine learning can improve on hand-written conversion rules (Input 1)
- PropBank is useful
- Best results
- All syntactic features + PropBank → TR dictionary
- Future work: use PropBank → LCS dictionary (developed during workshop)
36Where are we?
Transfer
Deep syntax (Czech)
CZECH
ENGLISH
37The MAGENTA System
- Statistically based
- The pipeline
- TR to AR by a channel model
- Word order by reordering on dep. trees
- Punctuation insertion
- Morphology
38Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
CZECH
ENGLISH
39 The Tree-to-Tree Transductions
[Figure: a pair of trees, one over the nodes a, b, c, d, e, f and one over the nodes A, B, CD, E, F, with additional prep and det nodes]
40 Translating trees
[Figure: the tree pair over the nodes a, b, c, d, e, f and A, B, CD, E, F, annotated as follows]
- learn this 2-1 mapping (or in dictionary)
- Also 1-2, 2-0, etc., rearrangements ...
- 0-1 mapping
41 Translating trees
[Figure: the tree pair over the nodes a, b, c, d, e, f and A, B, CD, E, F]
42 Statistical: Need a model of tree pairs
Mainly interested in (TR,AR) pairs. But our techniques are quite general. E.g., the example below is not a (TR,AR) pair:
the girl kissed her kitty cat
the girl gave a kiss to her cat
43 Training: Our team has many tree pairs
Should be nicer to model than string pairs - why we built them! What Czech trees went with what English trees in training? ... Learn parameters θ of a joint model Pθ(T1,T2).
the girl kissed her kitty cat
[Figure: dependency tree for the sentence above; nodes: Pred,kissed; Subj,girl; Obj,cat; Det,the; Det,her; kitty]
44 Decoding: Complete a tree pair
Training: given T1 and T2, find θ to maximize Pθ(T1,T2). Decoding: given T1 and θ, find T2 to maximize Pθ(T1,T2). Horrible sparse data problem - can't just do tree lookup.
the girl kissed her kitty cat
??
45 How should a model of tree pairs look?
Joint model Pθ(T1,T2).
Wise to use noisy-channel form Pθ(T1 | T2) · Pθ(T2).
But any joint model will do.
46 How should a model Pθ(T1,T2) of tree pairs look?
Intuition: some kind of correspondence between words.
Try to learn correspondence using EM alignment (could seed with a dictionary).
the girl kissed her kitty cat
the girl gave a kiss to her cat
47 How should a model Pθ(T1,T2) of tree pairs look?
Intuition: some kind of correspondence between words.
Try to learn correspondence using EM alignment (could seed with a dictionary).
the girl kissed her kitty cat
the girl gave a kiss to her cat
different, bad alignment!
48 How should a model Pθ(T1,T2) of tree pairs look?
Intuition: some kind of correspondence between words.
Try to learn correspondence using EM alignment (could seed with a dictionary).
- So model must consider alignment: Pθ(T1,T2,A)
- Why A is complicated
- The correspondence isn't 1 to 1
- Also need to model word order (indeed topology)
49 Solution: Use the right grammar formalism
Grammars can assemble words or phrases into trees. Let's work up to the right formalism.
- Model must consider alignment: Pθ(T1,T2,A)
- Why A is complicated
- The correspondence isn't 1 to 1
- Also need to model word order (indeed topology)
- kiss ↔ gave a kiss
- cat ↔ kitty cat
- NULL ↔ to
the girl kissed her kitty cat
the girl gave a kiss to her cat
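50 Context-Free Grammar
the girl kissed her cat
[Figure: phrase-structure tree rooted at S]
51 Augment CFG nonterminals with headwords
the girl kissed her cat
[Figure: phrase-structure tree rooted at S]
52 Augment CFG nonterminals with headwords
the girl kissed her cat
[Figure: phrase-structure tree rooted at S, annotated as follows]
- look at all the rules headed by kissed ...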
53 Lexicalized Tree Substitution Grammar
the girl kissed her cat
[Figure: tree fragment with the nodes S,kissed; VP,kissed; V,kissed; NP; NP; kissed, annotated as follows]
- look at all the rules headed by kissed ...
- a natural chunk
- open role waiting to be filled
- can fill open roles higher up
54 Lexicalized Tree Substitution Grammar
the girl kissed her cat
55 Lexicalized Tree Substitution Grammar
[Figure: elementary trees over the nodes S, NP, VP, Det, N, V and the words the, girl, kissed, her, cat]
56 Dependency-Style Lexicalized Tree Substitution Grammar
Simplify structure: eliminate extra internal nodes. Just one node per word (dependency style). Yields the kind of AR and TR trees we actually have.
57 Dependency-Style Lexicalized Tree Substitution Grammar
the girl kissed her kitty cat
58 Synchronous Dependency-Style Lexicalized Tree Substitution Grammar
the girl kissed her kitty cat
the girl gave a kiss to her cat
59 Synchronous Dependency-Style Lexicalized Tree Substitution Grammar
the girl kissed her kitty cat
the girl gave a kiss to her cat
60 Synchronous Dependency-Style Lexicalized Tree Substitution Grammar
the girl kissed her kitty cat
the girl gave a kiss to her cat
Det,a
61 Synchronous Dependency-Style Lexicalized Tree Substitution Grammar
the girl kissed her kitty cat
the girl gave a kiss to her cat
Det,a
62 P(T1, T2, A) = ∏ p(t1,t2,a | n)
So any aligned BIG TREE PAIR is built from a set of aligned LITTLE TREE PAIRS
Det,a
63 P(T1, T2, A) = ∏ p(t1,t2,a | n): How This Simplifies Things
- Alignment: find A to max Pθ(T1,T2,A)
- Decoding: find T2, A to max Pθ(T1,T2,A)
- Training: find θ to max ΣA Pθ(T1,T2,A)
- Do everything on little trees instead!
- Only need to train and decode a model of pθ(t1,t2,a)
- But not sure how to break up big tree correctly
- So try all possible little trees and all ways of combining them, by dynamic prog.
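Written out, the factorization and the three search problems on this slide take the following form. This is a restatement under two assumptions: the garbled operator in the slide title is read as a product over the aligned little-tree pairs that make up the big tree pair, and n is read as the state at which each little-tree pair attaches.

```latex
\begin{align*}
P_\theta(T_1, T_2, A) &= \prod_{(t_1, t_2, a, n)} p_\theta(t_1, t_2, a \mid n) \\
\hat{A} &= \arg\max_{A} P_\theta(T_1, T_2, A) && \text{(alignment)} \\
(\hat{T}_2, \hat{A}) &= \arg\max_{T_2,\, A} P_\theta(T_1, T_2, A) && \text{(decoding)} \\
\hat{\theta} &= \arg\max_{\theta} \sum_{A} P_\theta(T_1, T_2, A) && \text{(training)}
\end{align*}
```

Because the big-tree probability is a product of little-tree terms, the maximizations and the sum over A can all be carried out by dynamic programming over the ways of cutting the big trees into little ones.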
64 System Architecture
[Figure: architecture diagram with the following components]
- Probability Model pθ(t1,t2,a) of Little Trees
- score little trees: find p(...)
- propose little translations t2: make p(...) big
- update parameters θ: raise p(...)
- Decoder: scores all alignments between a big tree T1 and a forest of big trees T2
- Trainer: scores all alignments of two big trees T1,T2
- dynamic programming engine
65 System Architecture
[Figure: architecture diagram with the following components]
- Probability Model pθ(t1,t2,a) of Little Trees
- score little trees
- propose little translations t2
- update parameters θ
- Decoder
- Trainer
- dynamic programming engine
output
66 Related Work
- Synchronous grammars (Shieber & Schabes 1990)
- Statistical work has allowed only 1-1 (isomorphic trees)
- Stochastic inversion transduction grammars (Wu 1995)
- Head transducer grammars (Alshawi et al. 2000)
- Statistical tree translation
- Noisy channel model (Yamada & Knight 2000)
- Infers tree: trains on (string, tree) pair, not (tree, tree) pair
- But again, allows only 1-1, plus 1-0 at leaves
- Statistical tree generation - find most prob. expressing meaning
- Dynamic prog. search in packed forest (Langkilde 2000)
- Stack decoder (Ratnaparkhi 2000)
67Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
CZECH
ENGLISH
68 The Little Trees
[Figure: pθ( an aligned pair of little trees )]
69
[Figure: pθ( an aligned pair of little trees )]
- Data still sparse, but better than for big trees
- No alignment needed - already hypothesized for us
70 Form of the model for 1-1 (AR-TR)
- Base form
- p(cat,PL,PAT,cat,NNS,Obj,alignment)
- High-level Backoff
- p(cat,cat) p(PL,NNS) p(PAT,Obj) p(alignment)
- Low-level Backoff
- p(align) (1/LTF), where
- (L = size of <Tlemma,Alemma>, etc.)
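The three levels above can be read as a fall-through from the most specific estimate to the least specific one. The sketch below is one such reading and not the workshop's model: it assumes hard backoff (no interpolation weights), relative-frequency estimates, no renormalization, and that L, T, and F are the numbers of observed lemma, tag, and functor pair types. The class name `BackoffModel` and the event layout are hypothetical.

```python
from collections import Counter


class BackoffModel:
    """Three-level backoff for 1-1 (TR node, AR node) little-tree pairs."""

    def __init__(self, events):
        # Each event: (t_lemma, t_tag, t_functor, a_lemma, a_tag, a_functor).
        self.n = max(1, len(events))
        self.joint = Counter(events)
        self.lemmas = Counter((e[0], e[3]) for e in events)
        self.tags = Counter((e[1], e[4]) for e in events)
        self.functors = Counter((e[2], e[5]) for e in events)

    def prob(self, event, p_align=1.0):
        t_lemma, t_tag, t_fun, a_lemma, a_tag, a_fun = event
        # Base form: relative frequency of the full joint event.
        if self.joint[event]:
            return p_align * self.joint[event] / self.n
        # High-level backoff: product of independent pair probabilities.
        factors = (self.lemmas[(t_lemma, a_lemma)],
                   self.tags[(t_tag, a_tag)],
                   self.functors[(t_fun, a_fun)])
        if all(factors):
            p = p_align
            for count in factors:
                p *= count / self.n
            return p
        # Low-level backoff: uniform over the observed pair types, 1/(L*T*F).
        sizes = (len(self.lemmas), len(self.tags), len(self.functors))
        return p_align / max(1, sizes[0] * sizes[1] * sizes[2])


MODEL = BackoffModel([
    ("cat", "PL", "PAT", "cat", "NNS", "Obj"),
    ("dog", "SG", "ACT", "dog", "NN", "Sb"),
])
```

A seen event such as `("cat", "PL", "PAT", "cat", "NNS", "Obj")` gets its relative frequency, a new combination of seen pairs gets the product of the pair frequencies, and anything else falls to the uniform floor.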
71 Non-1-1 Correspondences
- Joint model
- 0-1
- p(to,TO,AuxY,alignment) k01
- 1-0
- p(GenNULL,ACT,align) k10
- 1-2
- p(home,SG,LOC,in,IN,AuxP,home,NNS,Adv,alignment) k12
- etc.: corresponding backoff scheme
72Smoothing issues
- Other backoff schemes?
- Too many to do all
- Graphical models?
- Derive from (manual) alignments
- esp. for types of alignment the model cannot handle (1-4, for example)
73Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
CZECH
ENGLISH
74 The Proposer
75 Map TR to AR
76 Proposer for Decoder
- Collecting Feature Patterns on TR
- Construct AR using observed possible TR-AR transform
- For unobserved TR, using naïve mapping onto AR
77 Proposer Observes during Training
78 Proposer During Decoding
79 Example
State
80Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
Evaluation
CZECH
ENGLISH
81The Classifier(s)
82Tree Transduction Model
Tree Transduction Models Decoder
Proposer
- Global information in labels suppress proposals
83 Preposition Insertion Labeler
- C5.0 decision tree classifier
- Labels: nothing, insert_of, ...
84Preposition Insertion Labeler
- Trained on Input 1 (Automatic English TR)
85Preposition Insertion Labeler
- Some TR nodes should be ignored
- fly to Baltimore and from Boston
86 Boosting Insertion Recall
- Overgenerating better than undergenerating
- Using C5.0's misclassification costs to discourage nothing
- Training on preposition-only data
87Boosting Insertion Recall
- N Best Labels
- Confidence Threshold
- N Average of Labels
- Aggressive Confidence Threshold
- N Average of Labels
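Two of the strategies named above, keeping the N best labels and keeping every label above a confidence threshold, can be sketched as simple selections over per-label classifier confidences. The confidence values, the labels `insert_to` and `insert_in`, and both function names are hypothetical; the slides do not say how the aggressive variant differs, so it is not shown.

```python
def n_best(confidences, n):
    """Keep the n labels with the highest classifier confidence."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:n]]


def above_threshold(confidences, threshold):
    """Keep every label whose confidence reaches the threshold (at least the top one)."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    kept = [label for label, score in ranked if score >= threshold]
    return kept or [ranked[0][0]]


# Hypothetical classifier output for one TR node.
CONFIDENCES = {
    "nothing": 0.50,
    "insert_of": 0.30,
    "insert_to": 0.15,
    "insert_in": 0.05,
}
```

Both selections overgenerate on purpose: passing several candidate labels on to the decoder raises the chance that the correct preposition is among them, at the cost of a larger search space.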
88 Insertion Recall vs N
[Figure: insertion recall R plotted against N for the N Best, Confidence Threshold, and Aggressive Confidence Threshold strategies; labeled points: N 5, R 84.35; N 3, R 80.26; N 4, R 80.59; N 3, R 76.39]
89 What should be done next?
- Clustering TR Lemmas into a tractable number of classes
- Ripper instead of C5.0
90Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
CZECH
ENGLISH
91 Word Order
92 Word order
- Tree-based models
- Analytical level: surface dependency, tree-based
- Collins model
- Uses function information (Sb, Obj, Atr, ...), POS, lemmas
- 94% of nodes have correct ordering of children (chance: 68%)
- No punctuation (inserted later)
- Input order completely irrelevant
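Ordering the children of a node independently of the input order can be illustrated with a toy scorer. This is not the tree-based model used in the workshop: it assumes a bigram model over function labels only, with made-up probabilities in `BIGRAMS`, and it enumerates every permutation, which is only practical for nodes with few children.

```python
import math
from itertools import permutations

# Hypothetical bigram probabilities over function labels.
BIGRAMS = {
    ("<s>", "Sb"): 0.9,
    ("Sb", "Pred"): 0.9,
    ("Pred", "Obj"): 0.9,
    ("Obj", "</s>"): 0.9,
}


def bigram_logprob(left, right, unseen=0.01):
    """Log-probability of one label following another, with a floor for unseen pairs."""
    return math.log(BIGRAMS.get((left, right), unseen))


def best_order(head, children):
    """Return the ordering of a head and its children with the highest bigram score."""
    def score(order):
        labels = ["<s>"] + [label for label, _ in order] + ["</s>"]
        return sum(bigram_logprob(a, b) for a, b in zip(labels, labels[1:]))

    return max(permutations([head] + list(children)), key=score)


# Nodes are (function label, lemma) pairs; the input order is arbitrary.
ORDER = best_order(("Pred", "kissed"), [("Obj", "cat"), ("Sb", "girl")])
```

Here the scorer recovers the order girl, kissed, cat regardless of how the children were listed, which is the sense in which the input order is irrelevant.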
93Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
CZECH
ENGLISH
94 Punctuation, Morphology
95 Punctuation Insertion: Motivation
- Important for sentence meaning, understanding
- BLEU: n-gram statistics
- commas are most frequent lemma in WSJ
- Focusing on commas (95% of intra-sentence punctuation)
- Difficulties
- English comma usage very flexible
- varies with style, meaning of sentence
- quotes not marked in TR trees
96 Why insert commas separately?
- Commas depend not only on underlying syntax/semantics but also on the surface realization of the sentence.
- Soon, she will realize her mistake.
- ? Soon she will realize her mistake.
- She will soon realize her mistake.
- ?? She will, soon, realize her mistake.
- She will soon, realize her mistake.
- She will, soon realize her mistake.
- Channel model deals with unordered trees
- Easier to do comma insertion after surface ordering
97 Commas in AR Trees
- TR tree: autosemantic words, commas deleted
- AR tree: commas are AuxX or Apos (apposition governors)
- Input data: ordered, unpunctuated AR tree, with AR and TR functors, POS
- Task: insert AuxX nodes into AR tree, and link them in correct surface order.
98Another Example
99 Comma Insertion Model
- C5.0 decision tree classifier
- Trained on English AR trees with TR functors (sect. 0-19 WSJ) with punctuation stripped
- Node labels: NO-ACTION, INSERT-RIGHT
- Feature vectors
- Local features (AFun, TFun, POS)
- For node, left/right brother, parent, grandparent
- Global features (Zhang 02) (position in sentence, ...)
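The local part of the feature vector can be sketched as a lookup of AFun, TFun, and POS on the node and its four neighbours. The `Node` class, the `NONE` placeholder for missing neighbours, and the feature names are assumptions for illustration; the global features are left out because the slide does not list them in full.

```python
class Node:
    """One node of an ordered, unpunctuated AR tree."""

    def __init__(self, afun, tfun, pos):
        self.afun = afun        # analytical (AR) function, e.g. "Sb"
        self.tfun = tfun        # tectogrammatical (TR) functor, e.g. "ACT"
        self.pos = pos          # part-of-speech tag
        self.parent = None
        self.children = []      # kept in surface order

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def local(node, prefix):
    """AFun, TFun and POS of one node, or placeholders when the node is absent."""
    values = ("NONE",) * 3 if node is None else (node.afun, node.tfun, node.pos)
    return dict(zip((f"{prefix}_afun", f"{prefix}_tfun", f"{prefix}_pos"), values))


def features(node):
    """Local features for the node, its left/right brothers, parent and grandparent."""
    parent = node.parent
    siblings = parent.children if parent else [node]
    i = next(k for k, sibling in enumerate(siblings) if sibling is node)
    left = siblings[i - 1] if i > 0 else None
    right = siblings[i + 1] if i + 1 < len(siblings) else None
    grandparent = parent.parent if parent else None
    feats = {}
    for prefix, other in (("node", node), ("left", left), ("right", right),
                          ("parent", parent), ("grandparent", grandparent)):
        feats.update(local(other, prefix))
    return feats


ROOT = Node("Pred", "PRED", "VBD")
SUBJECT = ROOT.add(Node("Sb", "ACT", "NN"))
OBJECT = ROOT.add(Node("Obj", "PAT", "NN"))
FEATS = features(SUBJECT)
```

Each node yields fifteen symbolic features, which a decision-tree learner can use directly to choose between NO-ACTION and INSERT-RIGHT.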
100 Decision Tree Model Results
Preliminary results - still based on hand-parsed WSJ
- Evaluation metric is sentence accuracy
- What is (human) upper bound?
- Systems are hard to compare: models and data sets very different
101Results for Generation
- Comma insertion improves BLEU score
- Possible improvements
- Adding n-gram information to insertion model
- Trying with other punctuation marks
102 Surface Morphology
- Morphology dictionary - 365 M words (Curín)
- morpha (morph analyzer) - lemmatize words, keep counts
- Word -> POS, surface_form, lemma, frequency
- NN want-you-babe want-you-babe 1
- VBD wanted want 45595
- VBD wanting want 1
- VBG wanting want 3708
- Task: Lemma + POS -> surface form (reverse lookup)
- Clashes resolved by frequency
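The reverse lookup with frequency-based clash resolution can be sketched directly from the four sample entries above. The function names and the fall-back to the bare lemma for unknown pairs are assumptions, not the workshop implementation.

```python
def build_generator(entries):
    """Build a (lemma, POS) -> surface form table, resolving clashes by frequency."""
    best = {}
    for pos, surface, lemma, freq in entries:
        key = (lemma, pos)
        if key not in best or freq > best[key][1]:
            best[key] = (surface, freq)
    return {key: surface for key, (surface, _) in best.items()}


def generate(table, lemma, pos):
    """Return the surface form, falling back to the lemma for unknown pairs."""
    return table.get((lemma, pos), lemma)


# (POS, surface form, lemma, frequency), as in the sample dictionary entries.
ENTRIES = [
    ("NN", "want-you-babe", "want-you-babe", 1),
    ("VBD", "wanted", "want", 45595),
    ("VBD", "wanting", "want", 1),
    ("VBG", "wanting", "want", 3708),
]
TABLE = build_generator(ENTRIES)
```

The clash between the two VBD entries for `want` is settled in favour of `wanted`, whose count of 45595 outweighs the single, presumably mistagged, `wanting`.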
103 Morphology Dictionary
- Initial tests: half of the errors were from the ambiguity in the verb "to be" between singular and plural.
- Be VBP -> (I) am or (we/they) are ??
- Be VBD -> (I/he/she/it) was or (we/you/they) were ??
- Introduced an entry for "be2" to correspond to plural subject.
- Test: use the full dictionary (plus "be2" entries) - 902,220 entries total - generate surface forms for lemmatized version of the WSJ sections 0-21.
Be2 VBP -> are
Be2 VBD -> were
Be VBP -> am
Be VBD -> was
104 Surface Morphology: Results
- OOV rate: 1.69%
- For many, surface form = lemma, so correct by default
- English morphology not complex: most OOV are proper nouns
- Non-OOV words: 99.74% accuracy
- 86% of mistakes were contractions: 'm for am, 've for have, etc. (Actually correct.) Ignoring these: 99.96% correct rate
- Overall, ignoring contractions: error rate of 0.03%
- High accuracy rate, fast runtime, good coverage - unnecessary to improve more
105Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
CZECH
ENGLISH
106 Improving Czech Parsing (AR-TR)
107 Improving Czech TR Parsing
- Pre-workshop state
- Czech Deep Syntax: mapping AR to TR (Boehmova, Honetschlaeger, Zabokrtsky)
- Two parts of the system
- rule-based
- 19 transformations by order-dependent perl code
- statistical
- C4.5-based labeling of TR functions: 84% accuracy
108 Czech Deep Syntax: mapping AR to TR
- New statistical system
- tree transduction model has same form as for generation
- little-tree model reversed for parsing (AR to TR mapping)
- initial EM pass uses simple model based on PDT (manual) node ID alignment
- (reversed) proposer not finished yet
109Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
CZECH
ENGLISH
110 The Hybrid Approach to Generation: The ARGENT System
111 Example
112 Alan Spoon, recently named Newsweek president, said Newsweek's ad rates would increase 5 pct in January.
113 NLG architectures
- Statistical approaches
- MAGENTA (Hajic et al. 02)
- Rule-based approaches
- FUF/Surge (Elhadad 93, Elhadad and Robin 98)
- KPML (Bateman 97)
- Hybrid approaches
- NitroGen (Knight and Hatzivassiloglou 95)
- HaloGen (Langkilde and Knight 00)
- Fergus (Bangalore and Rambow 00)
- ARGENT
114 FUF/Surge
- What FUF can do (given sufficient control information)
- Maps FUF-style thematic structure onto syntactic roles
- Performs syntactic paraphrasing and alternations (e.g., dative move, passive)
- Provides defaults for syntactic features (e.g., present tense, third person)
- Propagates agreement features
- Selects closed class words
- Inflects words
- Provides linear precedence constraints among syntactic constituents
- What FUF cannot do
- convert dependency to phrase-structure
- provide control for syntactic paraphrasing
- provide control for lexical features (conditionals, past tense, ...)
- choose determiners
- provide a robust grammar
115
;; FUF input (functional description) for the example sentence about Alan Spoon
(setq r
  '((process ((lex "say")
              (tense past)
              (object-clause that)))
    (circum ((time ((cat pp)
                    (prep ((lex "in")))
                    (np ((lex "January")
                         (determiner none)))))))
    (partic ((affected ((cat clause)
                        (process ((lex "increase")
                                  (tense past)))
                        (partic ((created ((cat measure)
                                           (quantity ((value 5)))
                                           (unit ((lex "pct")))))
                                 (agent ((cat np)
                                         (head ((lex "rate")
                                                (number plural)
                                                (determiner none)))
                                         (classifier ((lex "ad")
                                                      (determiner none)))
                                         (possessor ((lex "Newsweek")
                                                     (determiner none)))))))))
             (agent ((complex apposition)
                     (punctuation ((after ",")))
                     (distinct (((lex "Spoon")
                                 (classifier ((lex "Alan")))
                                 (determiner none))
                                ((lex "name")
                                 (classifier ((lex "president")))
                                 (determiner none))))))))))
116(No Transcript)
117 Grammar development
- translating TG → FUF (deterministic channel)
- write high coverage rules first
- problem: no aligned training data
- four types of rules (Langkilde-Geary 02): recasting, ordering, filling, morphing
- Three modules
- Top-level
- Recursion
- Bottom-level
118 Evaluation
- Robustness
- ARGENT: 245/248 sentences (98.7%)
- HaloGen: 80%
- Speed
- ARGENT: 1.4-2.9 sec/sentence
- HaloGen: 28.9-55.5 sec/sentence
- BLEU score: later
119 Future work
- Complete grammar
- improve coverage
- use other grammatemes
- degree of comparison (comparative), sentmod (interrogative), verbmod (imperative)
- Better error recovery
- inconsistent PTB markup, TR transformation, translation
- Grammar induction
- N-gram based insertion of missing words
- Integrate with MAGENTA
120Where are we?
Transfer
English TR to AR
Deep syntax (Czech)
Word Order
Punctuation
Morphology
Evaluation
CZECH
ENGLISH
121 The Implemented Systems: Creating Data for Generation
- Czech Tagger, Parser (WS98, pre-WS02)
- Czech-English Transfer (WS99, pre-WS02)
- New Statistical Czech Parser to TR
- Input 3: Improved English TR for training
122 The Generation Systems
- Aligner and Decoder
- Little Tree Joint Model
- Proposer
- Preposition Classifier
- Word Order by Tree LM
- Comma insertion
- Morphology
- The Hybrid System (TR to FUF translation)
123 Evaluation
- Evaluation data for BLEU (1-4grams)
- devtest/evaltest: 248/249 sentences, 5 ref. translations
- Inputs
- 1: Automatic English TR
- 2: Manual English TR
- 3: Enhanced Automatic English TR
- 4: Automatic Czenglish TR
- Systems: Statistical, Hybrid (FUF-based)
124 Upper estimate
- 5 reference translations
- 1 original WSJ text from PTB
- 4 retranslations from Czech to English
- 2 US, 2 Czech
- Evaluate the translations
- take one out
- evaluate against remaining 4
- Average BLEU score: 0.556
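The take-one-out procedure can be sketched as a loop that holds out each reference in turn and scores it against the rest. The metric is passed in as a function; `unigram_precision` below is only a stand-in so the example is self-contained and is not BLEU, and both function names are assumptions.

```python
def leave_one_out(references, metric):
    """Average metric(held_out, remaining) over every choice of held-out reference."""
    scores = []
    for i, held_out in enumerate(references):
        remaining = references[:i] + references[i + 1:]
        scores.append(metric(held_out, remaining))
    return sum(scores) / len(scores)


def unigram_precision(candidate, references):
    """Stand-in metric: share of candidate tokens found in any reference."""
    vocabulary = {token for ref in references for token in ref}
    return sum(token in vocabulary for token in candidate) / len(candidate)


# Toy references; the workshop used 5 references per sentence.
TOY_REFERENCES = [["a", "b"], ["a", "c"], ["a", "b"]]
AVERAGE = leave_one_out(TOY_REFERENCES, unigram_precision)
```

Scoring human translations this way gives an estimate of how high a system could plausibly score, since even a human reference does not match the other four perfectly.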
125Results
- Input 1 (Automatic English TR)
126Results
- Input 2 (Manual English TR)
127Results
- Input 3 (Improved Auto English TR)
128 Results
- Input 4 (Automatic Czenglish TR)
Unigram BLEU score for the reference set: 0.844
129 Conclusions and Future Work
130 The Good News and the Bad News
- Good news
- End-to-end, tree-transformation system running
- Written in 4 weeks, fully trainable from data
- Generates from semantic (TR) English significantly better than the baseline
- Datasets developed for generation/MT, evaluation
- Bad news
- not fully integrated (proposer, little tree model)
- on full MT, cannot beat baseline (and yes, GIZA)
131Things To Do (1)
- Integrate the proposer
- Integrate the preposition classifier
- Write more classifiers, integrate
- classifiers running in parallel/sequentially?
- True EM smoothing (by adaptation of aligner)
- Make the system more modular
- e.g., declarative specification of smoothing
132 Things To Do (2)
- The aligner/decoder
- Pruning during aligning/decoding
- Better smoothing of the little tree model
- More dependence among little trees
- through shared nonterminals or lexicalized nonterminals
- Little-tree joint model → noisy channel model
- i.e., integrate Gildea's tree LM directly
- Better initial model for EM
- ML training off manual alignments
- Nondeterministic transfer
133 Things To Do (3)
- Make use of TR's deep (discourse) word order
- More experiments
- with new smoothing, integrated proposer
- different order of modules
- other punctuation classifier or inside the model?
- Different settings/applications
- AR to TR (parsing)
- AR to AR (surface translation)
- TR to TR (translation), other languages
134 The End
The beginning!
135 Generation in the context of MT
- Project summary
- Explore ways of using semantic sentence representation for NL generation
- Use it in the machine translation context
- Evaluate / compare the results