Quasi-Synchronous Grammars - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Quasi-Synchronous Grammars


1
Quasi-Synchronous Grammars
  • Alignment by Soft Projection of Syntactic
    Dependencies

David A. Smith and Jason Eisner Center for
Language and Speech Processing Department of
Computer Science Johns Hopkins University
2
Synchronous Grammars
  • Synchronous grammars elegantly model
  • P(T1, T2, A)
  • Conditionalizing for
  • Alignment
  • Translation
  • Training?
  • Observe parallel trees?
  • Impute trees/links?
  • Project known trees
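Conditionalizing the joint model is standard probability algebra; as a sketch (these identities are not written out on the slide itself):

```latex
% Joint model over both trees and the alignment:
%   P(T_1, T_2, A)
% Conditionalize for alignment (both trees observed):
P(A \mid T_1, T_2) = \frac{P(T_1, T_2, A)}{\sum_{A'} P(T_1, T_2, A')}
% Conditionalize for translation (only the source tree observed):
P(T_2, A \mid T_1) = \frac{P(T_1, T_2, A)}{\sum_{T_2', A'} P(T_1, T_2', A')}
```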

[Figure: word alignment of 'Im Anfang war das Wort' with 'In the beginning was the word']
3
Projection
  • Train with bitext
  • Parse one side
  • Align words
  • Project dependencies
  • Many to one links?
  • Non-projective and circular dependencies?
  • Proposals in Hwa et al., Quirk et al., etc.

[Figure: dependencies parsed on 'Im Anfang war das Wort' and projected through the word alignment onto 'In the beginning was the word']
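The projection recipe on this slide can be sketched in a few lines. This is a minimal illustration, not the talk's implementation; all names and the toy heads/alignment are hypothetical, and it assumes a one-to-one alignment (the slide's open questions about many-to-one links and non-projectivity are exactly what it does not handle):

```python
# Minimal sketch of dependency projection through a word alignment.
# Assumes a one-to-one alignment; many-to-one links are not handled.

def project_dependencies(src_heads, alignment):
    """Project source-side dependencies onto the target sentence.

    src_heads: dict mapping each source index to its head index (root -> None)
    alignment: dict mapping target indices to source indices
    Returns target-side head links wherever both the dependent and its
    head have aligned counterparts; unaligned words get no head.
    """
    src_to_tgt = {s: t for t, s in alignment.items()}
    tgt_heads = {}
    for t, s in alignment.items():
        h = src_heads.get(s)
        if h is not None and h in src_to_tgt:
            tgt_heads[t] = src_to_tgt[h]
    return tgt_heads

# Toy example: "Im Anfang war das Wort" -> "In the beginning was the word"
src_heads = {0: 2, 1: 0, 2: None, 3: 4, 4: 2}   # 'war' is the root
alignment = {0: 0, 2: 1, 3: 2, 4: 3, 5: 4}      # English 'the' unaligned
print(project_dependencies(src_heads, alignment))  # → {0: 3, 2: 0, 4: 5, 5: 3}
```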
4
Divergent Projection
[Figure: 'Auf diese Frage habe ich leider keine Antwort bekommen' aligned (with a NULL token) to 'I did not unfortunately receive an answer to this question'; projected configurations labeled null, siblings, head-swapping, and monotonic]
5
Free Translation
Bad dependencies; parent-ancestors?
[Figure: 'Tschernobyl könnte dann etwas später an die Reihe kommen' freely translated as 'Then we could deal with Chernobyl some time later' (with a NULL token); the projected dependencies are bad]
6
Dependency Menagerie
7
Overview
  • Divergent, Sloppy Projection
  • Modeling Motivation
  • Quasi-Synchronous Grammars (QG)
  • Basic Parameterization
  • Modeling Experiments
  • Alignment Experiments

8
QG by Analogy
  • HMM: a noisy channel generating states
  • MEMM: a direct generative model of states
  • CRF: undirected, globally normalized
[Diagram: source and target chains illustrating each analogy]
9
Words with Senses
Now the senses come from a particular (German) sentence.
[Figure: a German sentence with the words Ich, habe, die, Veröffentlichung, über, das, Papier, mit, präsentiert, word-aligned to an English sentence with the words I, have, presented, the, paper, about, with; the note 'I really mean conference paper.' attaches to Veröffentlichung]
10
Quasi-Synchronous Grammar
  • QG: a target-language grammar that generates
    translations of a particular source-language
    sentence.
  • A direct, conditional model of translation:
  • P(T2, A | T1)
  • This grammar can be a CFG, TSG, TAG, etc.
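One way to spell this out, as a hedged sketch of a dependency-style factorization (the parameterization on the following slides adds configuration and breakage terms on top of this skeleton), is to generate each target word and its alignment conditioned on its parent's:

```latex
P(T_2, A \mid T_1) \;=\; \prod_{(j \to i) \in T_2} P\!\left(w_i, a_i \mid w_j, a_j, T_1\right)
```

where $j \to i$ ranges over the dependency links of $T_2$ and $a_i$ is the source node aligned to target word $w_i$.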

11
Generating a QCFG from T1
  • U: target-language grammar nonterminals
  • V: nodes of the given source tree T1
  • Binarized QCFG: A, B, C ∈ U; α, β, γ ∈ 2^V
  • A_α → B_β C_γ
  • A_α → w
  • Present modeling restrictions:
  • |α| = 1 (one sense per nonterminal)
  • Dependency grammars (1 node per word)
  • Tie parameters that depend on α, β, γ
  • Model 1 property: reuse of senses. Why?
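Under the restriction that each nonterminal carries at most one source-node sense, the QCFG's nonterminal set is simply the cross product of U with V plus an unaligned option. A toy sketch (function name and inputs are illustrative, not from the talk):

```python
from itertools import product

# Sketch: enumerating the nonterminals of a QCFG built from a source
# tree T1, under the slide's restriction |alpha| = 1 (each target
# nonterminal pairs with one source node, or with no node at all).

def qcfg_nonterminals(target_nts, source_nodes):
    """Cross target-grammar nonterminals U with source-tree nodes V.

    The 'sense' attached to a nonterminal is a single source node,
    or None for an unaligned nonterminal.
    """
    senses = [None] + list(source_nodes)
    return [(A, a) for A, a in product(target_nts, senses)]

U = ["S", "NP", "VP"]
V = ["Im", "Anfang", "war", "das", "Wort"]
nts = qcfg_nonterminals(U, V)
print(len(nts))  # 3 nonterminals x (5 nodes + None) = 18
```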
12
Modeling Assumptions
  • Tie parameters for all tokens of im
  • At most 1 sense per English word
  • Allow sense reuse
  • Dependency grammar: one node per word
[Figure: 'Im Anfang war das Wort' aligned to 'In the beginning was the word', illustrating these assumptions]
13
Dependency Relations
[Figure: classes of dependency configurations, including a catch-all 'none of the above']
14
QCFG Generative Story
[Figure: the target sentence 'I did not unfortunately receive an answer to this question' is generated from the observed source tree of 'Auf diese Frage habe ich leider keine Antwort bekommen' (plus NULL), with factors such as P(parent-child), P(breakage), P(I | ich), and P(PRP | no left children of did)]
Runtime: O(m²n³)
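The factors named on the slide multiply together to score one attachment in the derivation. A toy sketch of that combination in log space; the function name and all probability values are hypothetical, not the paper's actual tables:

```python
import math

# Sketch of the generative story's score for attaching one target child
# to its parent: a monolingual dependency term, a term for the alignment
# configuration (e.g. parent-child vs. a breakage), and a word-translation
# term. All probabilities below are toy values.

def attachment_logprob(dep_prob, config_prob, trans_prob):
    """Combine the three factors from the slide in log space."""
    return math.log(dep_prob) + math.log(config_prob) + math.log(trans_prob)

# Toy numbers: P(child | parent), P(parent-child config), P(I | ich)
score = attachment_logprob(0.1, 0.6, 0.3)
print(round(score, 4))  # → -4.0174
```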
15
Training the QCFG
  • Rough surrogates for translation performance
  • How can we best model target given source?
  • How can we best match human alignments?
  • German-English Europarl from SMT05
  • 1k, 10k, 100k sentence pairs
  • German parsed with the Stanford parser
  • EM training of monolingual/bilingual parameters
  • For efficiency, select alignments in training
    (not test) from the IBM Model 4 union
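The M-step of such EM training renormalizes expected counts into conditional probability tables. A minimal sketch for the bilingual (word-translation) parameters, with toy counts; the inside computation that produces real expected counts is out of scope here:

```python
from collections import defaultdict

# Sketch of the M-step: turn expected (source, target) counts into
# P(target | source). Counts below are toy values, not real statistics.

def normalize(counts):
    """counts: {(src, tgt): expected_count} -> {(src, tgt): P(tgt | src)}."""
    totals = defaultdict(float)
    for (s, t), c in counts.items():
        totals[s] += c
    return {(s, t): c / totals[s] for (s, t), c in counts.items()}

probs = normalize({("ich", "I"): 3.0, ("ich", "me"): 1.0, ("Wort", "word"): 2.0})
print(probs[("ich", "I")])  # 0.75
```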

16
Cross-Entropy Results
17
AER Results
18
AER Comparison
  • IBM Model 4, German-English
  • QG, German-English
  • IBM Model 4, English-German
19
Conclusions
  • Strict isomorphism hurts for
  • Modeling translations
  • Aligning bitext
  • Breakages beyond local nodes help most
  • The 'none of the above' breakage class beats
    simple head-swapping and 2-to-1 alignments
  • Insignificant gains from a further breakage
    taxonomy

20
Continuing Research
  • Senses of more than one word should help
  • Maintaining O(m²n³)
  • Further refining monolingual features on
    monolingual data
  • Comparison to other synchronizers
  • Decoder in progress uses the same direct model of
    P(T2, A | T1)
  • Globally normalized and discriminatively trained

21
Thanks
  • David Yarowsky
  • Sanjeev Khudanpur
  • Noah Smith
  • Markus Dreyer
  • David Chiang
  • Our reviewers
  • The National Science Foundation

22
Synchronous Grammar as QG
  • Target nodes correspond to 1 or 0 source nodes
  • Alignments are one-to-one apart from NULL:
  • (∀i ≠ j) a_i ≠ a_j unless a_i = NULL
  • (∀i > 0) a_i is a child of a_0 in T1, unless a_i =
    NULL
  • STSG, STAG operate on derivation trees
  • Cf. Gildea's clone operation as a
    quasi-synchronous move
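The one-to-one condition on the slide is easy to check mechanically. A small sketch (function name and representation are illustrative): an alignment is a list of a_i values with None for NULL, and it is strictly synchronous only if no source node is reused.

```python
# Sketch of the slide's one-to-one condition: no two non-NULL target
# nodes may share a source node.

def is_one_to_one(alignment):
    """alignment: list of a_i values, None for NULL."""
    seen = set()
    for a in alignment:
        if a is None:
            continue
        if a in seen:
            return False
        seen.add(a)
    return True

print(is_one_to_one([0, None, 2, None]))  # True
print(is_one_to_one([0, 2, 2]))           # False: source node 2 reused
```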