Title: Quasi-Synchronous Grammars
1Quasi-Synchronous Grammars
- Alignment by Soft Projection of Syntactic Dependencies
David A. Smith and Jason Eisner
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University
2Synchronous Grammars
- Synchronous grammars elegantly model P(T1, T2, A)
- Conditionalizing for
- Alignment
- Translation
- Training?
- Observe parallel trees?
- Impute trees/links?
- Project known trees
[Figure: German "Im Anfang war das Wort" aligned with English "In the beginning was the word"]
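The two conditionalizations named on this slide follow from the joint model by the definition of conditional probability. This is a standard identity, written here with primed variables (an added notation) for the quantities being summed out:

```latex
P(A \mid T_1, T_2) = \frac{P(T_1, T_2, A)}{\sum_{A'} P(T_1, T_2, A')}
\qquad
P(T_2, A \mid T_1) = \frac{P(T_1, T_2, A)}{\sum_{T_2', A'} P(T_1, T_2', A')}
```

The first form is the one used for alignment (both trees given), the second for translation (only the source tree given).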
3Projection
- Train with bitext
- Parse one side
- Align words
- Project dependencies
- Many to one links?
- Non-projective and circular dependencies?
- Proposals in Hwa et al., Quirk et al., etc.
[Figure: projection example, "Im Anfang war das Wort" / "In the beginning was the word"]
4Divergent Projection
[Figure: "Auf diese Frage habe ich leider keine Antwort bekommen" (plus NULL) aligned with "I did not unfortunately receive an answer to this question"; links labeled null, siblings, head-swapping, monotonic]
5Free Translation
Bad dependencies
[Figure: "Tschernobyl könnte dann etwas später an die Reihe kommen" (plus NULL) aligned with "Then we could deal with Chernobyl some time later"]
Parent-ancestors?
6Dependency Menagerie
7Overview
- Divergent Sloppy Projection
- Modeling Motivation
- Quasi-Synchronous Grammars (QG)
- Basic Parameterization
- Modeling Experiments
- Alignment Experiments
8QG by Analogy
- HMM: noisy channel generating states
- MEMM: direct generative model of states
- CRF: undirected, globally normalized
[Figure: Target/Source diagrams for each model]
9Words with Senses
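Now senses in a particular (German) sentence
[Figure: English words (I, presented, the, have, paper, about, with) whose senses are words of a German sentence (Ich, habe, die, Veröffentlichung, über, präsentiert, das, Papier, mit). Callout on "paper": "I really mean 'conference paper'", i.e. Veröffentlichung]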
10Quasi-Synchronous Grammar
- QG: A target-language grammar that generates translations of a particular source-language sentence.
- A direct, conditional model of translation as P(T2, A | T1)
- This grammar can be CFG, TSG, TAG, etc.
11Generating QCFG from T1
- U = Target language grammar nonterminals
- V = Nodes of given source tree T1
- Binarized QCFG: A, B, C ∈ U; α, β, γ ∈ 2^V (the "senses")
- ⟨A, α⟩ → ⟨B, β⟩ ⟨C, γ⟩
- ⟨A, α⟩ → w
- Present modeling restrictions
- |α| ≤ 1
- Dependency grammars (1 node per word)
- Tie parameters that depend on α, β, γ
- Model 1 property: reuse of senses. Why?
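As a rough illustration of the rule shapes on this slide, the following Python sketch pairs a target-grammar symbol with a set of source-tree nodes and enforces the "at most one sense" restriction. The class names (QNonterminal, BinaryRule, LexicalRule), the integer node ids, and the use of the empty set for NULL are all assumptions made for illustration; they are not from the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class QNonterminal:
    """Hypothetical QCFG nonterminal: a target symbol A in U plus a set
    alpha of source-tree nodes (alpha in 2^V)."""
    symbol: str
    alpha: FrozenSet[int]  # ids of nodes in T1; empty set = NULL (assumption)

    def __post_init__(self):
        # Present modeling restriction: at most one sense per target node.
        if len(self.alpha) > 1:
            raise ValueError("restriction |alpha| <= 1 violated")


@dataclass(frozen=True)
class BinaryRule:
    """Rule of the binary shape: parent rewrites to two nonterminals."""
    parent: QNonterminal
    left: QNonterminal
    right: QNonterminal


@dataclass(frozen=True)
class LexicalRule:
    """Rule of the lexical shape: parent rewrites to a target word w."""
    parent: QNonterminal
    word: str


# Toy usage: source nodes 0..4 stand for "Im Anfang war das Wort".
noun = QNonterminal("NN", frozenset({4}))   # sense: source node "Wort"
rule = LexicalRule(noun, "word")
```

Because the alignment is part of the nonterminal, two target words can carry the same source node as their sense, which is one way to read the "reuse of senses" bullet.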
12Modeling Assumptions
- Tie params for all tokens of "im"
- At most 1 sense per English word
- Allow sense reuse
- Dependency grammar: one node/word
[Figure: "Im Anfang war das Wort" / "In the beginning was the word"]
13Dependency Relations
none of the above
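A minimal sketch of how such relations might be computed: given a target parent-child pair and the source nodes they align to, classify how those two source nodes relate in T1. The category names are taken from labels that appear elsewhere in the deck (null, monotonic, head-swapping, siblings, parent-ancestors, none of the above), plus a "same node" case for 2-to-1 alignments. The exact inventory on the original slide is not recoverable, and the toy parse below is an assumption.

```python
from typing import Dict, List, Optional


def ancestors(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """All proper ancestors of `node` in the source tree, nearest first."""
    out = []
    cur = parent[node]
    while cur is not None:
        out.append(cur)
        cur = parent[cur]
    return out


def configuration(child_src: Optional[str], parent_src: Optional[str],
                  parent: Dict[str, Optional[str]]) -> str:
    """Relation in T1 between the senses of a target child and its target parent."""
    if child_src is None or parent_src is None:
        return "null"
    if child_src == parent_src:
        return "same node"          # 2-to-1 alignment
    if parent[child_src] == parent_src:
        return "monotonic"          # parent-child preserved
    if parent[parent_src] == child_src:
        return "head-swapping"      # dependency reversed
    if parent_src in ancestors(child_src, parent):
        return "parent-ancestor"    # grandparent or higher
    if parent[child_src] is not None and parent[child_src] == parent[parent_src]:
        return "siblings"
    return "none of the above"


# Assumed toy dependency parse of "Im Anfang war das Wort" (child -> head).
PARENT = {"war": None, "Im": "war", "Anfang": "Im", "Wort": "war", "das": "Wort"}
```

For example, `configuration("Anfang", "Im", PARENT)` falls in the monotonic case, while `configuration("Im", "Wort", PARENT)` is a sibling configuration because both source words attach to "war".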
14QCFG Generative Story
[Figure: source sentence "Auf diese Frage habe ich leider keine Antwort bekommen" (plus NULL), labeled "observed", generating target "I did not unfortunately receive an answer to this question"]
- P(parent-child)
- P(breakage)
- P(I | ich)
- P(PRP | no left children of did)
- O(m^2 n^3)
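The factors labeled on this slide suggest that each generation step multiplies a monolingual factor, a configuration factor, and a word-translation factor. The sketch below only shows that bookkeeping in log space. The factorization into exactly three terms, the dictionary keys, and the numbers are illustrative assumptions, not the model's actual parameters.

```python
import math
from typing import Dict, List


def derivation_log_prob(steps: List[Dict[str, float]]) -> float:
    """Sum log-probabilities over generation steps.

    Each step carries three hypothetical factors:
      p_mono   : monolingual factor, e.g. P(PRP | no left children of did)
      p_config : configuration factor, e.g. P(parent-child) or P(breakage)
      p_word   : translation factor, e.g. P(I | ich)
    """
    total = 0.0
    for step in steps:
        total += math.log(step["p_mono"])
        total += math.log(step["p_config"])
        total += math.log(step["p_word"])
    return total


# Made-up numbers purely to exercise the function.
toy_steps = [
    {"p_mono": 0.5, "p_config": 0.5, "p_word": 0.5},
    {"p_mono": 0.25, "p_config": 0.5, "p_word": 1.0},
]
toy_score = derivation_log_prob(toy_steps)
```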
15Training the QCFG
- Rough surrogates for translation performance
- How can we best model target given source?
- How can we best match human alignments?
- German-English Europarl from SMT05
- 1k, 10k, 100k sentence pairs
- German parsed w/Stanford parser
- EM training of monolingual/bilingual parameters
- For efficiency, select alignments in training (not test) from IBM Model 4 union
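One plausible reading of the last bullet is that candidate links during training are limited to the union of the two directional IBM Model 4 alignments. A minimal sketch under that assumption follows; the function name and the (source_index, target_index) link convention are hypothetical.

```python
from typing import Iterable, Set, Tuple

Link = Tuple[int, int]  # (source_index, target_index)


def union_alignment(src_to_tgt: Iterable[Link],
                    tgt_to_src: Iterable[Link]) -> Set[Link]:
    """Union of two directional word alignments.

    `src_to_tgt` holds (source, target) pairs; `tgt_to_src` holds
    (target, source) pairs and is flipped before taking the union.
    """
    forward = set(src_to_tgt)
    backward = {(s, t) for (t, s) in tgt_to_src}
    return forward | backward


# Toy usage: training would then consider only links in `allowed`.
allowed = union_alignment({(0, 0), (1, 2)}, {(0, 0), (1, 1)})
```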
16Cross-Entropy Results
17AER Results
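For reference, alignment error rate in its standard form (Och and Ney) compares hypothesized links A against sure links S and possible links P, with S treated as a subset of P: AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). The sketch below implements that definition. Whether the gold alignments used here distinguish sure from possible links is not stated on the slides.

```python
from typing import Iterable, Tuple

Link = Tuple[int, int]


def alignment_error_rate(hypothesis: Iterable[Link],
                         sure: Iterable[Link],
                         possible: Iterable[Link]) -> float:
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|); lower is better."""
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s          # sure links also count as possible
    if not a and not s:
        return 0.0                 # nothing to align, nothing proposed
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Toy usage with made-up links.
toy_aer = alignment_error_rate(
    hypothesis={(0, 0), (2, 1), (3, 3)},
    sure={(0, 0), (1, 1)},
    possible={(2, 1)},
)
```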
18AER Comparison
IBM4 German-English
QG German-English
IBM4 English-German
19Conclusions
- Strict isomorphism hurts for
- Modeling translations
- Aligning bitext
- Breakages beyond local nodes help most
- "None of the above" beats simple head-swapping and 2-to-1 alignments
- Insignificant gains from further breakage taxonomy
20Continuing Research
- Senses of more than one word should help
- Maintaining O(m^2 n^3)
- Further refining monolingual features on monolingual data
- Comparison to other synchronizers
- Decoder in progress uses same direct model of P(T2, A | T1)
- Globally normalized and discriminatively trained
21Thanks
- David Yarowsky
- Sanjeev Khudanpur
- Noah Smith
- Markus Dreyer
- David Chiang
- Our reviewers
- The National Science Foundation
22Synchronous Grammar as QG
- Target nodes correspond to 1 or 0 source nodes
- For a rule whose parent aligns to α0 and whose children align to α1, ..., αk:
- (∀i ≠ j) αi ≠ αj unless αi = NULL
- (∀i > 0) αi is a child of α0 in T1, unless αi = NULL
- STSG, STAG operate on derivation trees
- Cf. Gildea's clone operation as a quasi-synchronous move
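Read this way, a strictly synchronous grammar is a QG whose rules pass two checks: no two non-NULL children share a source node, and every non-NULL child's source node is a child, in T1, of the parent's source node. A minimal sketch of that test follows, with NULL represented as None. The function name and data layout are assumptions, and the constraints themselves are reconstructed from a partly garbled slide.

```python
from typing import Dict, Hashable, Optional, Sequence


def is_synchronous_rule(alpha0: Optional[Hashable],
                        child_alphas: Sequence[Optional[Hashable]],
                        source_parent: Dict[Hashable, Optional[Hashable]]) -> bool:
    """Check the strict-synchrony constraints for one rule.

    alpha0        : source node aligned to the rule's parent (None = NULL)
    child_alphas  : source node aligned to each child (None = NULL)
    source_parent : child -> head map for the source tree T1
    """
    seen = set()
    for a in child_alphas:
        if a is None:
            continue                      # NULL-aligned children are exempt
        if a in seen:
            return False                  # two children reuse one source node
        seen.add(a)
        if alpha0 is None or source_parent.get(a) != alpha0:
            return False                  # not a child of alpha0 in T1
    return True


# Assumed toy parse of "Im Anfang war das Wort" (child -> head).
T1 = {"war": None, "Im": "war", "Anfang": "Im", "Wort": "war", "das": "Wort"}
```

A quasi-synchronous grammar simply drops these checks, so `is_synchronous_rule("war", ["Im", "Im"], T1)` fails under strict synchrony (sense reuse) while the corresponding QG rule is still permitted, at some cost in probability.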