Title: Linguistics 187/287 Week 6
1. Linguistics 187/287 Week 6
Generation, Term-Rewrite System, Machine Translation
- Martin Forst, Ron Kaplan, and Tracy King
2. Generation
- Parsing: string to analysis
- Generation: analysis to string
- What type of input?
- How to generate
3. Why generate?
- Machine translation
- Lang1 string → Lang1 f-structure → Lang2 f-structure → Lang2 string
- Sentence condensation
- Long string → f-structure → smaller f-structure → new string
- Question answering
- Production of NL reports
- State of machine or process
- Explanation of logical deduction
- Grammar debugging
4. F-structures as input
- Use f-structures as input to the generator
- May parse sentences that shouldn't be generated
- May want to constrain the number of generated options
- Input f-structure may be underspecified
5. XLE generator
- Use the same grammar for parsing and generation
- Advantages
- maintainability
- write rules and lexicons once
- But
- special generation tokenizer
- different OT ranking
6. Generation tokenizer/morphology
- White space
- Parsing: multiple white spaces become a single TB
- John appears. → John TB appears TB . TB
- Generation: a single TB becomes a single space (or nothing)
- John TB appears TB . TB → John appears. / John appears .
- Suppress variant forms
- Parse both favor and favour
- Generate only one
7. Morphconfig for parsing and generation
- STANDARD ENGLISH MORPHOLOGY (1.0)
- TOKENIZE
- P!eng.tok.parse.fst G!eng.tok.gen.fst
- ANALYZE
- eng.infl-morph.fst G!amerbritfilter.fst
- G!amergen.fst
- ----
8. Reversing the parsing grammar
- The parsing grammar can be used directly as a generator
- Adapt the grammar with a special OT ranking: GENOPTIMALITYORDER
- Why do this?
- parse ungrammatical input
- have too many options
9. Ungrammatical input
- Linguistically ungrammatical
- They walks.
- They ate banana.
- Stylistically ungrammatical
- No ending punctuation: They appear
- Superfluous commas: John, and Mary appear.
- Shallow markup: [NP John and Mary] appear.
10. Too many options
- All the generated options can be linguistically valid, but too many for applications
- Occurs when more than one string has the same, legitimate f-structure
- PP placement
- In the morning I left. / I left in the morning.
11. Using the Gen OT ranking
- Generally much simpler than in the parsing direction
- Usually only use standard marks and NOGOOD
- no marks, no STOPPOINT
- Can have a few marks that are shared by several constructions
- one or two for dispreferred
- one or two for preferred
12. Example: Prefer initial PP
- S --> (PP: @ADJUNCT @(OT-MARK GenGood))
-       NP: @SUBJ
-       VP.
- VP --> V
-        (NP: @OBJ)
-        (PP: @ADJUNCT).
- GENOPTIMALITYORDER NOGOOD GenGood.
- parse: they appear in the morning.
- generate without OT: In the morning they appear. / They appear in the morning.
- with OT: In the morning they appear.
13. Debugging the generator
- When generating from an f-structure produced by the same grammar, XLE should always generate
- Unless:
- OT marks block the only possible string
- something is wrong with the tokenizer/morphology
- regenerate-morphemes: if this gets a string, the tokenizer/morphology is not the problem
- Hard to debug: XLE has robustness features to help
14. Underspecified Input
- F-structures provided by applications are not perfect
- may be missing features
- may have extra features
- may simply not match the grammar coverage
- Missing and extra features are often systematic
- specify in XLE which features can be added and deleted
- Not matching the grammar is a more serious problem
15. Adding features
- English to French translation
- English nouns have no gender
- French nouns need gender
- Solution: have XLE add gender
- the French morphology will control the value
- Specify additions in xlerc
- set-gen-adds add "GEND"
- can add multiple features
- set-gen-adds add "GEND CASE PCASE"
- XLE will optionally insert the feature
Note: Unconstrained additions make generation undecidable
16. Example
The cat sleeps. → Le chat dort.
Input f-structure:
- PRED 'dormir<SUBJ>'
- SUBJ [ PRED 'chat', NUM sg, SPEC def ]
- TENSE present
With GEND added in generation:
- PRED 'dormir<SUBJ>'
- SUBJ [ PRED 'chat', NUM sg, GEND masc, SPEC def ]
- TENSE present
17. Deleting features
- French to English translation
- delete the GEND feature
- Specify deletions in xlerc
- set-gen-adds remove "GEND"
- can remove multiple features
- set-gen-adds remove "GEND CASE PCASE"
- XLE obligatorily removes the features
- no GEND feature will remain in the f-structure
- if a feature takes an f-structure value, that f-structure is also removed
18. Changing values
- If the values of a feature do not match between the input f-structure and the grammar
- delete the feature and then add it
- Example: case assignment in translation
- set-gen-adds remove "CASE"
- set-gen-adds add "CASE"
- allows dative case in the input to become accusative
- e.g., an exceptional case-marking verb in the input language but regular case in the output language (see the xlerc sketch below)
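A minimal xlerc sketch of this remove-then-add pattern; the particular feature combination is illustrative, not prescribed by the slides:
  # xlerc sketch: strip the source CASE values, then let the target grammar
  # re-assign CASE (and supply GEND, as in the earlier French example)
  set-gen-adds remove "CASE"
  set-gen-adds add "CASE GEND"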
19. Generation for Debugging
- Checking for grammar and lexicon errors
- create-generator english.lfg
- reports ill-formed rules, templates, feature declarations, lexical entries
- Checking for ill-formed sentences that can be parsed
- parse a sentence
- see if all the results are legitimate strings
- regenerate they appear. (see the command sketch below)
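A minimal sketch of this debugging loop from the XLE command line, assuming the grammar file english.lfg named above; the test sentence is only an example:
  create-generator english.lfg
  # parse a sentence, then check what the same grammar generates for it
  parse {they appear.}
  regenerate {they appear.}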
20. Rewriting/Transfer System
21. Why a Rewrite System
- Grammars produce c-/f-structure output
- Applications may need to manipulate this
- Remove features
- Rearrange features
- Continue linguistic analysis (semantics, knowledge representation next week)
- XLE has a general-purpose rewrite system (aka "transfer" or "xfr" system)
22. Sample Uses of the Rewrite System
- Sentence condensation
- Machine translation
- Mapping to logic for knowledge representation and reasoning
- Tutoring systems
23. What does the system do?
- Input: a set of "facts"
- Apply a set of ordered rules to the facts
- this gradually changes the set of input facts
- Output: a new set of facts
- The rewrite system uses the same ambiguity management as XLE
- can efficiently rewrite packed structures, maintaining the packing
24. Example F-structure Facts
- PERS(var(1),3)
- PRED(var(1),girl)
- CASE(var(1),nom)
- NTYPE(var(1),common)
- NUM(var(1),pl)
- SUBJ(var(0),var(1))
- PRED(var(0),laugh)
- TNS-ASP(var(0),var(2))
- TENSE(var(2),pres)
- arg(var(0),1,var(1))
- lex_id(var(0),1)
- lex_id(var(1),0)
- F-structures get var(n) identifiers
- Special arg facts
- lex_id for each PRED
- Facts have two arguments (except arg)
- The rewrite system allows for any number of arguments
- (the facts above are rendered as an attribute-value matrix below)
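A rough rendering of the facts above as an attribute-value matrix; the arg and lex_id facts are argument-structure and bookkeeping information and are not shown inside the AVM (a sketch, not XLE output):
  var(0): [ PRED  'laugh'        "arg(var(0),1,var(1)) marks var(1) as its first argument"
            SUBJ  var(1): [ PRED 'girl', PERS 3, NUM pl, CASE nom, NTYPE common ]
            TNS-ASP var(2): [ TENSE pres ] ]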
25. Rule format
- Obligatory rule: LHS ==> RHS.
- Optional rule: LHS ?=> RHS.
- Unresourced fact: |- clause.
- LHS
- clause: match and delete
- +clause: match and keep
- -LHS: negation (don't have the fact)
- LHS, LHS: conjunction
- ( LHS | LHS ): disjunction
- ProcedureCall: procedural attachment
- RHS
- clause: replacement facts
- 0: empty set of replacement facts
- stop: abandon the analysis
- (a toy rule using several of these constructs is sketched below)
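A toy pair of rules illustrating several of the constructs above, written in the same notation as the example rules on the following slides; the facts and values are purely illustrative:
  "keep the NTYPE fact (+), require that no definite SPEC is present (-), add one"
  +NTYPE(F, common), -SPEC(F, def)
  ==> SPEC(F, def).

  "optionally (?=>) delete a TNS-ASP fact; 0 is the empty replacement"
  TNS-ASP(F, T)
  ?=> 0.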
26. Example rules
Input facts:
  PERS(var(1),3), PRED(var(1),girl), CASE(var(1),nom),
  NTYPE(var(1),common), NUM(var(1),pl), SUBJ(var(0),var(1)),
  PRED(var(0),laugh), TNS-ASP(var(0),var(2)), TENSE(var(2),pres),
  arg(var(0),1,var(1)), lex_id(var(0),1), lex_id(var(1),0)

Rule file:
  "PRS (1.0)"
  grammar toy_rules.

  "obligatorily add a determiner if there is a noun with no spec"
  NTYPE(F,), -SPEC(F,)
  ==> SPEC(F,def).

  "optionally make plural nouns singular; this will split the choice space"
  NUM(F, pl)
  ?=> NUM(F, sg).
27. Example Obligatory Rule
Input facts:
  PERS(var(1),3), PRED(var(1),girl), CASE(var(1),nom),
  NTYPE(var(1),common), NUM(var(1),pl), SUBJ(var(0),var(1)),
  PRED(var(0),laugh), TNS-ASP(var(0),var(2)), TENSE(var(2),pres),
  arg(var(0),1,var(1)), lex_id(var(0),1), lex_id(var(1),0)

  "obligatorily add a determiner if there is a noun with no spec"
  NTYPE(F,), -SPEC(F,)
  ==> SPEC(F,def).

Output facts: all the input facts plus SPEC(var(1),def)
28. Example Optional Rule
  "optionally make plural nouns singular; this will split the choice space"
  NUM(F, pl)
  ?=> NUM(F, sg).

Input facts:
  PERS(var(1),3), PRED(var(1),girl), CASE(var(1),nom),
  NTYPE(var(1),common), NUM(var(1),pl), SPEC(var(1),def),
  SUBJ(var(0),var(1)), PRED(var(0),laugh), TNS-ASP(var(0),var(2)),
  TENSE(var(2),pres), arg(var(0),1,var(1)), lex_id(var(0),1), lex_id(var(1),0)

Output facts: all the input facts plus a choice split
  A1: NUM(var(1),pl)
  A2: NUM(var(1),sg)
29. Output of the example rules
- Output is a packed f-structure
- Generation gives two sets of strings
- The girls laugh. / The girls laugh! / The girls laugh
- The girl laughs. / The girl laughs! / The girl laughs
30. Manipulating sets
- Sets are represented with an in_set feature
- He laughs in the park with the telescope
- ADJUNCT(var(0),var(2))
- in_set(var(4),var(2))
- in_set(var(5),var(2))
- PRED(var(4),in)
- PRED(var(5),with)
- Might want to optionally remove adjuncts
- but not negation
31. Example Adjunct Deletion Rules
- "optionally remove member of adjunct set"
- ADJUNCT(, AdjSet), in_set(Adj, AdjSet),
- -PRED(Adj, not)
- ?=> 0.
- "obligatorily remove adjunct with nothing in it"
- ADJUNCT(, Adj), -in_set(, Adj)
- ==> 0.
Generated strings:
  He laughs with the telescope in the park.
  He laughs in the park with the telescope.
  He laughs with the telescope.
  He laughs in the park.
  He laughs.
32. Manipulating PREDs
- Changing the value of a PRED is easy
- PRED(F,girl) ==> PRED(F,boy).
- Changing the argument structure is trickier
- Make any changes to the grammatical functions
- Make the arg facts correlate with these
33. Example Passive Rule
- "make actives passive:
   make the subject NULL, make the object the subject,
   put in features"
- SUBJ( Verb, Subj), arg( Verb, Num, Subj),
- OBJ( Verb, Obj), CASE( Obj, acc)
- ==>
- SUBJ( Verb, Obj), arg( Verb, Num, NULL), CASE( Obj, nom),
- PASSIVE( Verb, ), VFORM( Verb, pass).
the girls saw the monkeys → The monkeys were seen.
in the park the girls saw the monkeys → In the park the monkeys were seen.
34. Templates and Macros
- Rules can be encoded as templates
- n2n(Eng,Frn)
-   PRED(F,Eng), NTYPE(F,)
-   ==> PRED(F,Frn).
- @n2n(man, homme).
- @n2n(woman, femme).
- Macros encode groups of clauses/facts
- sg_noun(F)
-   NTYPE(F,), NUM(F,sg).
- @sg_noun(F), -SPEC(F)
-   ==> SPEC(F,def).
- (the first template call is expanded below)
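For concreteness, the first template call above is equivalent to writing out the following rule by hand; this is a mechanical expansion of the template on this slide, shown only as a sketch:
  "expansion of @n2n(man, homme)"
  PRED(F,man), NTYPE(F,)
  ==> PRED(F,homme).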
35. Unresourced Facts
- Facts can be stipulated in the rules and referred to
- Often used as a lexicon of information not encoded in the f-structure
- For example, a list of days and months for manipulating dates
- |- day(Monday). |- day(Tuesday). etc.
- |- month(January). |- month(February). etc.
- PRED(F,Pred), ( day(Pred) | month(Pred) ) ==> ... (see the sketch below)
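The rule above ends at the arrow on the slide; a hypothetical completion is sketched below. The DATE-WORD attribute on the right-hand side is invented for illustration only, and the PRED fact is kept with + so that it is not consumed:
  "flag predicates that name a day or a month (DATE-WORD is hypothetical)"
  +PRED(F,Pred), ( day(Pred) | month(Pred) )
  ==> DATE-WORD(F,+).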
36. Rule Ordering
- Rewrite rules are ordered (unlike LFG syntax rules, but like finite-state rules)
- Output of rule1 is input to rule2
- Output of rule2 is input to rule3
- This allows for feeding and bleeding
- Feeding: insert facts used by later rules
- Bleeding: remove facts needed by later rules
- Can make debugging challenging
37. Example of Rule Feeding
- Early Rule: Insert SPEC on nouns
- NTYPE(F,), -SPEC(F,) ==> SPEC(F, def).
- Later Rule: Allow plural nouns to become singular only if they have a specifier (to avoid bad count nouns)
- NUM(F,pl), SPEC(F,) ==> NUM(F,sg).
38. Example of Rule Bleeding
- Early Rule: Turn actives into passives (simplified)
- SUBJ(F,S), OBJ(F,O) ==> SUBJ(F,O), PASSIVE(F,).
- Later Rule: Impersonalize actives
- SUBJ(F,), -PASSIVE(F,) ==> SUBJ(F,S), PRED(S,they), PERS(S,3), NUM(S,pl).
- will apply to intransitives and verbs with (X)COMPs, but not transitives (worked through below)
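A worked sketch of how the bleeding plays out on a transitive clause, using the fact notation of the earlier slides; the variable names f, s, o are illustrative:
  input facts:           SUBJ(f,s), OBJ(f,o), PRED(f,see), ...
  after the early rule:  SUBJ(f,o), PASSIVE(f,), PRED(f,see), ...
                         "the OBJ fact was consumed and PASSIVE was added"
  the later rule:        SUBJ(F,), -PASSIVE(F,) ==> ...
                         "no longer applies: its -PASSIVE condition fails,
                          so transitives are never impersonalized (bleeding)"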
39. Debugging
- XLE command line: tdbg
- steps through the rules, stating how they apply

  Rule 1: (NTYPE(F,A)), -(SPEC(F,B)) ==> SPEC(F,def)
  File ~thking/courses/ling187/hws/thk.pl, lines 4-10
  Rule 1 matches (2) NTYPE(var(1),common)
  1 --> SPEC(var(1),def)

  Rule 2: NUM(F,pl) ?=> NUM(F,sg)
  File ~thking/courses/ling187/hws/thk.pl, lines 11-17
  Rule 2 matches 3 NUM(var(1),pl)
  1 --> NUM(var(1),sg)

  Rule 5: SUBJ(Verb,Subj), arg(Verb,Num,Subj), OBJ(Verb,Obj), CASE(Obj,acc)
          ==> SUBJ(Verb,Obj), arg(Verb,Num,NULL), CASE(Obj,nom), PASSIVE(Verb,), VFORM(Verb,pass)
  File ~thking/courses/ling187/hws/thk.pl, lines 28-37
  Rule does not apply

  Input sentence: girls laughed
40. Running the Rewrite System
- create-transfer: adds the menu items
- load-transfer-rules FILE: loads rules from the file
- the f-structure window, under commands, has:
- transfer: prints the output of the rules in the XLE window
- translate: runs the output through the generator
- Need to do (where the path is XLEPATH/lib):
- setenv LD_LIBRARY_PATH /afs/ir.stanford.edu/data/linguistics/XLE/SunOS/lib
- (a short session sketch follows below)
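A short sketch of this setup, assuming a hypothetical rule file toy_rules.pl; the environment variable is set in the shell before XLE is started:
  # in the shell (csh), before starting xle:
  #   setenv LD_LIBRARY_PATH /afs/ir.stanford.edu/data/linguistics/XLE/SunOS/lib
  # then, inside xle (or in xlerc):
  create-transfer
  load-transfer-rules toy_rules.pl
  parse {the girls laugh.}
  # now use commands > transfer or commands > translate in the f-structure window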
41. Rewrite Summary
- The XLE rewrite system lets you manipulate the output of parsing
- Creates versions of the output suitable for applications
- Can involve significant reprocessing
- Rules are ordered
- Ambiguity management is as with parsing
42. Grammatical Machine Translation
- Stefan Riezler and John Maxwell
43. Translation System
[System diagram: a German string is parsed with the German LFG grammar into f-structures; transfer rules (with lots of statistics) map them to English f-structures; XLE generation with the English LFG grammar produces the output string]
44. Transfer-Rule Induction from Aligned Bilingual Corpora
- Use standard techniques to find many-to-many candidate word alignments in source-target sentence pairs
- Parse the source and target sentences using the LFG grammars for German and English
- Select the most similar f-structures in source and target
- Define many-to-many correspondences between substructures of the f-structures based on the many-to-many word alignment
- Extract primitive transfer rules directly from aligned f-structure units
- Create the powerset of possible combinations of basic rules and filter according to contiguity and type-matching constraints
45. Induction
- Example sentences: Dafür bin ich zutiefst dankbar. / I have a deep appreciation for that.
- Many-to-many word alignment
- Dafür{6,7} bin{2} ich{1} zutiefst{3,4,5} dankbar{5}
- F-structure alignment
46. Extracting Primitive Transfer Rules
- Rule (1) maps lexical predicates
- Rule (2) maps lexical predicates and interprets the subj-to-subj link as an indication to map the subj of the source with this predicate into the subject of the target, and the xcomp of the source into the object of the target
- X1, X2, X3, ... are variables for f-structures

  (1) PRED(X1, ich) ==> PRED(X1, I)

  (2) PRED(X1, sein),
      SUBJ(X1, X2),
      XCOMP(X1, X3)
      ==>
      PRED(X1, have),
      SUBJ(X1, X2),
      OBJ(X1, X3)
47. Extracting Complex Transfer Rules
- Complex rules are created by taking all combinations of primitive rules and filtering
- (4) zutiefst dankbar sein ==> have a deep appreciation
- (5) zutiefst dankbar dafür sein ==> have a deep appreciation for that
- (6) ich bin zutiefst dankbar dafür ==> I have a deep appreciation for that
48. Transfer Contiguity Constraint
- Transfer contiguity constraint:
- The source and target f-structures each have to be connected
- F-structures in the transfer source can only be aligned with f-structures in the transfer target, and vice versa
- Analogous to the constraint on contiguous and alignment-consistent phrases in phrase-based SMT
- Prevents extraction of a rule that would translate dankbar directly into appreciation, since appreciation is also aligned to zutiefst
- Transfer contiguity allows learning idioms like es gibt - there is from configurations that are local in the f-structure but non-local in the string, e.g., es scheint zu geben - there seems to be
49. Linguistic Filters on Transfer Rules
- Morphological stemming of PRED values
- (Optional) filtering of f-structure snippets based on the consistency of linguistic categories
- Extraction of a snippet that translates zutiefst dankbar into a deep appreciation maps incompatible categories (adjectival and nominal), though it is valid in a string-based world
- The translation of sein to have might be discarded because of the adjectival vs. nominal types of their arguments
- The larger rule mapping zutiefst dankbar sein to have a deep appreciation is ok since the verbal types match
50. Transfer
- Parallel application of transfer rules in a non-deterministic fashion
- Unlike the XLE ordered-rule rewrite system
- Each fact must be transferred by exactly one rule
- A default rule transfers any fact as itself
- Transfer works on a chart, using the parser's unification mechanism for consistency checking
- Selection of the most probable transfer output is done by beam decoding on the transfer chart
51. Generation
- Bi-directionality allows us to use the same grammar for parsing the training data and for generation in the translation application
- The generator has to be fault-tolerant in cases where the transfer system operates on a FRAGMENT parse or produces non-valid f-structures from valid input f-structures
- Robust generation from unknown (e.g., untranslated) predicates and from unknown f-structures
52. Robust Generation
- Generation from unknown predicates
- The unknown German word Hunde is analyzed by the German grammar to extract the stem (e.g., PRED Hund, NUM pl) and is then inflected using English default morphology (Hunds)
- Generation from unknown constructions
- A default grammar that allows any attribute to be generated in any order is mixed in as a suboptimal option in the standard English grammar; e.g., if the SUBJ cannot be generated as a sentence-initial NP, it will be generated in any position as any category
- an extension/combination of set-gen-adds and OT ranking
53. Statistical Models
- Log-probability of source-to-target transfer rules, where the probability r(e|f) of a rule that transfers source snippet f into target snippet e is estimated by relative frequency
- Log-probability of target-to-source transfer rules, estimated by relative frequency
54. Statistical Models, cont.
- Log-probability of lexical translations l(e|f) from source to target snippets, estimated from the Viterbi alignment a between source word positions i = 1, ..., n and target word positions j = 1, ..., m for stems fi and ej in snippets f and e, with relative word-translation frequencies t(ej|fi) (see the formula below)
- Log-probability of lexical translations from target to source snippets
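A reconstruction of what this lexical-translation score presumably looks like; it follows the standard lexical weighting of Koehn et al. (2003), which the wording above matches, so the exact form should be treated as an assumption rather than a quotation from the paper:
  l(e \mid f, a) \;=\; \prod_{j=1}^{m} \frac{1}{\lvert \{\, i \mid (i,j) \in a \,\} \rvert} \sum_{(i,j) \in a} t(e_j \mid f_i)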
55. Statistical Models, cont.
- Number of transfer rules
- Number of transfer rules with frequency 1
- Number of default transfer rules
- Log-probability of strings of predicates from the root to the frontier of the target f-structure, estimated from predicate trigrams in English f-structures
- Number of predicates in the target f-structure
- Number of constituent movements during generation, based on the original order of the head predicates of the constituents
56. Statistical Models, cont.
- Number of generation repairs
- Log-probability of the target string as computed by a trigram language model
- Number of words in the target string
57. Experimental Evaluation
- Experimental setup
- German-to-English on the Europarl parallel corpus (Koehn 02)
- Training and evaluation on sentences of length 5-15, for quick experimental turnaround
- Resulting in a training set of 163,141 sentences, a development set of 1,967 sentences, and a test set of 1,755 sentences (as used in Koehn et al. HLT03)
- Improved bidirectional word alignment based on GIZA++ (Och et al. EMNLP99)
- LFG grammars for German and English (Butt et al. COLING02; Riezler et al. ACL02)
- SRI trigram language model (Stolcke 02)
- Comparison with PHARAOH (Koehn et al. HLT03) and IBM Model 4 as produced by GIZA++ (Och et al. EMNLP99)
58. Experimental Evaluation, cont.
- Around 700,000 transfer rules extracted from f-structures chosen by a dependency similarity measure
- The system operates on n-best lists of parses (n=1), transferred f-structures (n=10), and generated strings (n=1,000)
- Selection of the most probable translations in two steps
- Most probable f-structure by beam search (n=20) on the transfer chart using features 1-10
- Most probable string selected from the strings generated from the selected n-best f-structures using features 11-13
- Feature weights for the modules trained by MER on 750 in-coverage sentences of the development set
59. Automatic Evaluation
- NIST scores (ignoring punctuation); Approximate Randomization for significance testing (see above)
- 44% in coverage of the grammars; 51% FRAGMENT parses and/or generation repair; 5% timeouts
- In coverage: the difference between LFG and PHARAOH is not significant
- Suboptimal robustness techniques decrease overall quality
60. Manual Evaluation
- Closer look at in-coverage examples
- Random selection of 500 in-coverage examples
- Two independent judges indicated a preference for LFG or PHARAOH, or equality, in a blind test
- Separate evaluation under the criteria of grammaticality/fluency and translational/semantic adequacy
- Significance assessed by Approximate Randomization via stratified shuffling of preference ratings between systems
61. Manual Evaluation
- Result: differences on agreed-on ratings are statistically significant at p < 0.0001
- Net improvement in translational adequacy on agreed-on examples is 11.4% on 500 sentences (57/500), amounting to a 5% overall improvement in the hybrid system (44% of 11.4%)
- Net improvement in grammaticality on agreed-on examples is 15.4% on 500 sentences, amounting to a 6.7% overall improvement in the hybrid system
62. Examples: LFG > PHARAOH
- src: in diesem fall werde ich meine verantwortung wahrnehmen
- ref: then i will exercise my responsibility
- LFG: in this case i accept my responsibility
- P: in this case i shall my responsibilities
- src: die politische stabilität hängt ab von der besserung der lebensbedingungen
- ref: political stability depends upon the improvement of living conditions
- LFG: the political stability hinges on the recovery the conditions
- P: the political stability is rejects the recovery of the living conditions
63. Examples: PHARAOH > LFG
- src: das ist schon eine seltsame vorstellung von gleichheit
- ref: a strange notion of equality
- LFG: equality that is even a strange idea
- P: this is already a strange idea of equality
- src: frau präsidentin ich beglückwünsche herrn nicholson zu seinem ausgezeichneten bericht
- ref: madam president I congratulate mr nicholson on his excellent report
- LFG: madam president I congratulate mister nicholson on his report excellented
- P: madam president I congratulate mr nicholson for his excellent report
64. Discussion
- High percentage of out-of-coverage examples
- Accumulation of 2 × 20% error rates in parsing the training data
- Errors in rule extraction
- Together these result in ill-formed transfer rules causing a high number of generation failures/repairs
- Propagation of errors through the system, also for in-coverage examples
- Error analysis: 69% transfer errors, 10% due to parse errors
- Discrepancy between the NIST and manual evaluations
- Suboptimal integration of the generator, making training and translation with large n-best lists infeasible
- Language and distortion models applied after generation
65. Conclusion
- Integration of a grammar-based generator into a dependency-based SMT system achieves state-of-the-art NIST scores and improved grammaticality and adequacy on in-coverage examples
- Possibility of a hybrid system, since it is determinable when sentences are in the coverage of the system
66. Grammatical Machine Translation II
- Ji Fang, Martin Forst, John Maxwell, and Michael Tepper
67. Overview of different approaches to MT
68. Limitations of string-based approaches
- Transfer rules/correspondences of little generality
- Problems with long-distance dependencies
- Perform less well for morphologically rich (target) languages
- N-gram LM-based disambiguation seems to have leveled out
69. Limitations of string-based approaches - little generality
- From Europarl: Das tut mir leid. / I'm sorry about that.
- Google (SMT): I'm sorry. Perfect!
- But: As soon as the input changes a bit, we get garbage.
- Das tut ihr leid. (She is sorry about that.) → It does their suffering.
- Der Tod deines Vaters tut mir leid. (I am sorry about the death of your father.) → The death of your father I am sorry.
- Der Tod deines Vaters tut ihnen leid. (They are sorry about the death of your father.) → The death of your father is doing them sorry.
70. Limitations of string-based approaches - problems with LDDs
- From Europarl: Dies stellt eine der großen Herausforderungen für die französische Präsidentschaft dar. / This is one of the major issues of the French Presidency.
- Google (SMT): This is one of the major challenges for the French presidency represents.
- The particle verb is identified and translated correctly
- But the two verbs → ungrammatical; they seem to be too far apart to be filtered out by the LM
71. Limitations of string-based approaches - rich morphology
- Language pairs involving morphologically rich languages, e.g., Finnish, are hard
From Koehn (2005, MT Summit)
72. Limitations of string-based approaches - rich morphology
- Morphologically rich, free word order languages, e.g. German, are particularly hard as target languages.
Again from Koehn (2005, MT Summit)
73. Limitations of string-based approaches - n-gram LMs
- Even for morphologically poor languages, improving n-gram LMs becomes increasingly expensive.
- Adding data helps improve translation quality (BLEU scores), but not enough.
- Assuming the best improvement rate observed in Brants et al. (2007), 400 million times the available data would be needed to attain human translation quality by LM improvement alone.
74. Limitations of string-based approaches - n-gram LMs
- Best improvement rate: 0.7 BLEU points (BP) per doubling of the data
- Would need 40 more doublings to obtain human translation quality (42 + 0.7 × 40 = 70)
- Necessary training data in tokens: 1e22 (1e10 × 2^40 ≈ 1e22)
- 4e8 times the current English Web (estimate) (2.5e13 × 4e8 = 1e22)
- From Brants et al. (2007)
- (the calculation is worked out below)
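The back-of-the-envelope calculation behind these numbers, reconstructed from the figures on the slide; the 42-BLEU starting point and the 1e10-token baseline are read off the slide, not independently verified:
  k \approx \frac{70 - 42}{0.7} = 40 \ \text{doublings}, \qquad 10^{10} \cdot 2^{40} \approx 10^{22} \ \text{tokens}, \qquad \frac{10^{22}}{2.5 \cdot 10^{13}} = 4 \cdot 10^{8} \ \text{times the current English Web}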
75. Limitations of bitext-based approaches
- Generally available bitexts are limited in size and specialized in genre
- Parliament proceedings
- UN texts
- Judiciary texts (from multilingual countries)
- → Makes it hard to repurpose bitext-based systems to new genres
- Induced transfer rules/correspondences are often of mediocre quality
- Loose translations
- Bad alignments
76. Limitations of bitext-based approaches - availability and quality
- Readily available bitexts are limited in size and specialized in genre
- Approaches to auto-extracting bitexts from the web exist.
- Additional data help to some degree, but then the effect levels out.
- Still a genre bias in bitexts, despite automatic acquisition?
- Still more general problems with alignment quality etc.?
77. Limitations of bitext-based approaches - availability and quality
- Much more data needed to attain human translation quality
- Logarithmic gains (at best) by adding bitext data
- From Munteanu & Marcu (2005)
- Base line: 100K - 95M English words
- Mid line (auto): 90K - 2.1M
- Top line (oracle): 90K - 2.1M
78. Context-Based MT / Meaningful Machines
- Combines example-based MT (EBMT) and SMT
- Very large (target) language model; a large amount of monolingual text is required
- No transfer statistics, thus no parallel text required
- The translation lexicon is developed semi-automatically (i.e. hand-validated)
- The lexicon has slotted phrase pairs (like EBMT), e.g. NP1 biss ins Gras. / NP1 bit the dust.
79. Context-Based MT / Meaningful Machines - pros
- A high-quality translation lexicon seems to allow for
- Easier repurposing of the system(s) to new genres
- Better translation quality
From Carbonell (2006)
80. Context-Based MT / Meaningful Machines - cons
- Works really well for English-Spanish. How about other language pairs?
- Same problems with n-gram LMs as traditional SMT; probably affects pairs involving a morphologically rich (target) language particularly badly.
- How much manual labor is involved in the development of the translation lexicon?
- Computationally expensive
81. Grammatical Machine Translation
- Syntactic transfer-based approach
- Parsing and generation identical/similar between GMT I and GMT II
[MT pyramid diagram: parse the source and score f-structures → transfer and score target f-structures via f-structure transfer rules → generate and pick the best realization; string-level statistical methods sit at the base of the pyramid]
82. Grammatical Machine Translation: GMT I vs. GMT II
- GMT I
- Transfer rules induced from parsed bitexts
- Target f-structures ranked using individual transfer rule statistics
- GMT II
- Transfer rules induced from a manually/semi-automatically constructed phrase lexicon
- Target f-structures ranked using monolingually trained bilexical dependency statistics and general transfer rule statistics
83. GMT II
- Where do the transfer rules come from?
- Where do statistics/machine learning come in?
[Annotated pyramid diagram:
- Transfer rules: induced from manually/semi-automatically compiled phrase pairs with slots; potentially, but not necessarily, from bitexts
- Parse source, score f-structures: log-linear model trained on a syntactically annotated monolingual corpus
- Transfer, score target f-structures: log-linear model trained on bitext data; includes the score from the parse ranking model and very general transfer features
- Generate, pick best realization: log-linear model trained on bitext data; includes the scores from the other two models and the features/score of the monolingually trained model for realization ranking]
84. GMT II - The phrase dictionary
- Contains phrase pairs with slot categories (Ddeff, Ddef, NP1nom, NP1, etc.) that allow for well-formed phrases without being included in the induced rules
- Currently hand-written
- Will hopefully be compiled (semi-)automatically from bilingual dictionaries
- Bitexts might also be used; how exactly remains to be defined.
85. GMT II - Rule induction from the phrase dictionary
- Sub-f-structures of slot variables are not included
- FS attributes can be defined as irrelevant for translation, e.g. CASE (in both en and de) and GEND (in de). Attributes so defined are never included in induced rules.
- set-gen-adds remove CASE GEND
- FS attributes can be defined as remove_equal_features. Attributes defined as such are not included in induced rules when they are equal.
- set remove_equal_features NUM OBJ OBL-AG PASSIVE SUBJ TENSE
- → more general rules
86. GMT II - Rule induction from the phrase dictionary (noun)
- Ddeff Verfassung / Ddef constitution
- PRED(X1, Verfassung),
- NTYPE(X1, Z2),
- NSEM(Z2, Z3),
- COMMON(Z3, count),
- NSYN(Z2, common)
- ==>
- PRED(X1, constitution),
- NTYPE(X1, Z4),
- NSYN(Z4, common).
87. GMT II - Rule induction from the phrase dictionary (adjective)
- europäische / European
- PRED(X1, europäisch)
- ==>
- PRED(X1, European).
- To accommodate certain non-parallelisms with respect to the SUBJs of adjectives etc., a special mechanism removes the SUBJs of non-verbs and makes them addable in generation.
88. GMT II - Rule induction from the phrase dictionary (verb)
- NP1nom koordiniert NP2acc. / NP1 coordinates NP2.
- PRED(X1, koordinieren),
- arg(X1, 1, A2),
- arg(X1, 2, A3),
- VTYPE(X1, main)
- ==>
- PRED(X1, coordinate),
- arg(X1, 1, A2),
- arg(X1, 2, A3),
- VTYPE(X1, main).
89. GMT II - Rule induction (argument switching)
- NP1nom tut NP2dat leid. / NP2 is sorry about NP1.
- PRED(X1, leidtun),
- SUBJ(X1, A2),
- OBJ-TH(X1, A3),
- VTYPE(X1, main)
- ==>
- PRED(X1,be),
- SUBJ(X1,A3),
- XCOMP-PRED(X1,Z1),
- PRED(Z1, sorry),
- OBL(Z1,Z2),
- PRED(Z2,about),
- OBJ(Z2,A2),
- VTYPE(X1,copular).
90. GMT II - Rule induction (head switching)
- Ich versuche nur, mich jeder Demagogie zu enthalten. / It is just that I am trying not to indulge in demagoguery.
- NP1nom Vfin nur. / It is just that NP1 Vs.
- ADJUNCT(X1,Z2), in_set(X3,Z2),
  PRED(X3,nur), ADV-TYPE(X3,unspec)
- ==>
- PRED(Z4,be), SUBJ(Z4,X3), NTYPE(X3,Z5),
  NSYN(Z5,pronoun), GEND-SEM(Z5,nonhuman),
  HUMAN(Z5,-), NUM(Z5,sg), PERS(Z5,3),
  PRON-FORM(Z5,it), PRON-TYPE(Z5,expl_),
  arg(Z4,1,Z6), PRED(Z6, just), SUBJ(Z6,Z7),
  arg(Z6,1,A1), COMP-FORM(A1,that),
  COMP(Z6,A1), nonarg(Z6,1,Z7),
  ATYPE(Z6,predicative), DEGREE(Z6, positive),
  nonarg(Z4,1,X3), TNS-ASP(Z4,Z8),
  MOOD(Z8,indicative), TENSE(Z8, pres),
  XCOMP-PRED(Z4,Z6), CLAUSE-TYPE(Z4,decl),
  PASSIVE(Z4,-), VTYPE(A2,copular).
91. GMT II - Rule induction (more on head switching)
- In addition to rewriting terms, the system re-attaches the rewritten f-structure if necessary. Here, this might be the case for X1.
- ADJUNCT(X1,Z2), in_set(X3,Z2),
  PRED(X3,nur), ADV-TYPE(X3,unspec)
- ==>
- PRED(Z4,be), SUBJ(Z4,X3), NTYPE(X3,Z5),
  NSYN(Z5,pronoun), GEND-SEM(Z5,nonhuman),
  HUMAN(Z5,-), NUM(Z5,sg), PERS(Z5,3),
  PRON-FORM(Z5,it), PRON-TYPE(Z5,expl_),
  arg(Z4,1,Z6), PRED(Z6, just), SUBJ(Z6,Z7),
  arg(Z6,1,A1), COMP-FORM(A1,that),
  COMP(Z6,A1), nonarg(Z6,1,Z7),
  ATYPE(Z6,predicative), DEGREE(Z6, positive),
  nonarg(Z4,1,X3), TNS-ASP(Z4,Z8),
  MOOD(Z8,indicative), TENSE(Z8, pres),
  XCOMP-PRED(Z4,Z6), CLAUSE-TYPE(Z4,decl),
  PASSIVE(Z4,-), VTYPE(A2,copular).
92. GMT II - Pros and cons of rule induction from a phrase dictionary
- Development of phrase pairs can be carried out by someone with little knowledge of the grammar and transfer system; manual development of transfer rules would require experts (for boring, repetitive labor).
- Phrase pairs can remain stable while the grammars keep evolving. Since transfer rules are induced fully automatically, they can easily be kept in sync with the grammars.
- Induced rules are of much higher quality than rules induced from parsed bitexts (GMT I).
- Although there is hope that phrase pairs can be constructed semi-automatically from bilingual dictionaries, it is not yet clear to what extent this can be automated.
- If rule induction from parsed bitexts can be improved, the two approaches might well be complementary.
93. Lessons Learned for Parallel Grammar Development
- Absence of a feature like PERF +/- is not equivalent to PERF -.
- FS-internal features should not say anything about the function of the FS
- Example: PRON-TYPE poss instead of PRON-TYPE pers
- Compounds should be analyzed similarly, whether spelt together (de) or apart (en)
- Possible with SMOR
- Very hard or even impossible with DMOR
94. Absence of PERF ≠ PERF -
95. No function info in FS-internal features
- I think NP1 Vs. / In my opinion NP1 Vs.
96. Parallel analysis of compounds
97. More Lessons Learned for Parallel Grammar Development
- ParGram needs to agree on a parallel PRED value for (personal) pronouns
- We need an "interlingua" for numbers, clock times, dates, etc.
- Guessers should analyze (composite) names similarly
98. Parallel PRED values for (personal) pronouns
- Otherwise the number of rules we have to learn for them explodes.
- de-en: pro/er → he, pro/er → it, pro/sie → she, pro/sie → it, pro/es → it, pro/es → he, pro/es → she
- Also, the PRED-NUM-PERS combination may make no sense! Result: a lot of generator effort for nothing
- en-de: he → pro/er, she → pro/sie, it → pro/es, it → pro/er, it → pro/sie, ...
99. Interlingua for numbers, clock times, dates, etc.
- We cannot possibly learn transfer rules for all dates.
100. Guessed (composite) names
We cannot possibly learn transfer rules for all proper names in this world.
101. And Yet More Lessons Learned for Grammar Development
- Reflexive pronouns: PERS and NUM agreement should be ensured via inside-out function application, e.g. ((SUBJ ↑) PERS) = (↑ PERS).
- Semantically relevant features should not be hidden in CHECK
102. Reflexive pronouns
- Introduce their own values for PERS and NUM
- Overgeneration: Ich wasche sich.
- NUM ambiguity for (frequent) sich
- Less generalization possible in transfer rules for inherently reflexive verbs - 6 rules necessary instead of 1.
103. Reflexive pronouns
104. Semantically relevant features in CHECK
- sie → they
- Sie → you (formal)
- Since CHECK features are not used for translation, the distinction between sie and Sie is lost.
105. Planned experiments - Motivation
- We do not have the resources to develop a general-purpose phrase dictionary in the short or medium term.
- Nevertheless, we want to get an idea of how well our new approach may scale.
106. Planned Experiments 1
- Manually develop a phrase dictionary for a few hundred Europarl sentences
- Train the target FS ranking model and realization ranking model on those sentences
- Evaluate the output in terms of BLEU, NIST and manually
- Can we make this new idea work under ideal conditions? It seems we can.
107. Planned Experiments 2
- Manually develop a phrase dictionary for a few hundred Europarl sentences
- Use a bilingual dictionary to add possible phrase pairs that may distract the system
- Train the target FS ranking model and realization ranking model on those sentences
- Evaluate the output in terms of BLEU, NIST and manually
- How well can our system deal with the distractors?
108. Planned Experiments 3
- Manually develop a phrase dictionary for a few hundred Europarl sentences
- Use a bilingual dictionary to add possible phrase pairs that may distract the system
- Degrade the phrase dictionary at various levels of severity
- Take out a certain percentage of phrase pairs
- Shorter phrases may be penalized less than longer ones
- Train the target FS ranking model and realization ranking model on those sentences
- Evaluate the output in terms of BLEU, NIST and manually
- How good or bad is the output of the system when the bilingual phrase dictionary lacks coverage?
109. Main Remaining Challenges
- Get a comprehensive and high-quality dictionary of phrase pairs
- Get more and better (i.e. more normalized and parallel) analyses from the grammars
- Improve the ranking models, in particular on the source side
- Improve the generation behavior of the grammars - so far, grammar development has mostly been parsing-oriented.
- Efficiency, in particular on the generation side, i.a. packed transfer and generation