Title: MT with Limited Resources: Approaches and Results
1. MT with Limited Resources: Approaches and Results
- Ralf Brown, Stephan Vogel, Alon Lavie, Lori Levin, Jaime Carbonell
- Students: Christian Monson, Erik Peterson, Kathrin Probst, Ashish Venugopal, Ying Zhang
- Carnegie Mellon University
2. RADD: CMU's "incubator" for MT technologies
- Multiple techniques
  - Statistical
  - Example-Based
  - Transfer-Rule
- Common pre-processing
  - Segmentation
  - Conversion of numbers to Arabic numerals
  - Translation of month names to English month names
- Multi-engine combinations
3. Statistical MT
- The major improvement for the June evaluation was phrase-to-phrase alignments.
- Performance (NIST score):
    ME-Compatible-SMT   5.7354
    Full-SMT            6.1361
4. Example-Based MT
- Given an indexed training corpus:
  - find phrases in the corpus which occur in the input to be translated,
  - retrieve the sentence pairs containing matches, and
  - perform a word-level alignment to determine translations (a sketch of this lookup follows).
- Our "standard" EBMT system is actually a multi-engine combination of phrasal EBMT, the LDC lexicon, and a statistical dictionary extracted from the training text.
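Below is a minimal sketch of the phrasal lookup just described; the corpus index, the word-level aligner, and all names here are illustrative stand-ins, not the CMU system's actual data structures:

  def ngrams(tokens, max_len=6):
      """All contiguous phrases of the input, up to max_len words."""
      for i in range(len(tokens)):
          for j in range(i + 1, min(i + max_len, len(tokens)) + 1):
              yield tuple(tokens[i:j])

  def ebmt_candidates(input_tokens, corpus_index, word_align):
      """corpus_index: phrase -> list of (source_sent, target_sent) pairs.
      word_align(phrase, src, tgt): returns the target phrase that the
      matched source phrase aligns to, or None if no alignment is found."""
      candidates = {}
      for phrase in ngrams(input_tokens):
          for src_sent, tgt_sent in corpus_index.get(phrase, []):
              translation = word_align(phrase, src_sent, tgt_sent)
              if translation:
                  candidates.setdefault(phrase, []).append(translation)
      return candidates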
5. Example-Based MT (2)
- Inexact matching: a phrase can match even if one of the words inside the phrase differs, provided that the dictionary can provide a more-or-less unambiguous translation for the unmatched word.
- "More-or-less unambiguous" means that the translation with the second-highest frequency has a frequency less than THRESHOLD times the highest frequency.
- We experimentally determined the best threshold to be 0.55 (a sketch of this test follows).
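A minimal sketch of the "more-or-less unambiguous" test, assuming translation candidates come with corpus frequencies; only the 0.55 threshold is taken from the slide, and the function and its interface are hypothetical:

  THRESHOLD = 0.55   # experimentally determined value from the slide

  def is_unambiguous(translation_freqs, threshold=THRESHOLD):
      """translation_freqs: corpus frequencies of the candidate translations
      of the unmatched word, in any order."""
      if not translation_freqs:
          return False               # no translation available at all
      if len(translation_freqs) == 1:
          return True                # single candidate, trivially unambiguous
      freqs = sorted(translation_freqs, reverse=True)
      return freqs[1] < threshold * freqs[0]

  # Best candidate seen 20 times, runner-up 8 times: 8 < 0.55 * 20 = 11,
  # so the unmatched word counts as "more-or-less unambiguous".
  print(is_unambiguous([20, 8]))    # True
  print(is_unambiguous([20, 15]))   # False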
6. Transfer Rule MT
- Manually-developed transfer rules for translation with our newly developed Transfer Engine (71 hours development time),
- and a transfer lexicon automatically derived from the LDC 10k-word lexicon.
- A language model selects from among ambiguous translations.
- Performance (NIST score): XFER+LM 4.8404
7. Multi-Engine MT
- Hypothesis: by combining multiple translation methods, we can mitigate weaknesses and enhance strengths of individual methods.
- Each engine generates whatever partial translations it can and assigns an approximate quality score.
- The partial translations are then combined into a lattice, and a trigram model of the output language (plus other scoring heuristics) is used to select the best path through the lattice (sketched below).
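Below is a minimal sketch of the selection step under simplifying assumptions: each partial translation is an edge over source positions carrying an engine score, and a Viterbi search with a bigram stand-in for the trigram model (and no extra heuristics) picks the best path. Names and interfaces are illustrative, not the actual multi-engine code:

  import math

  def best_path(n_words, edges, lm_logprob, lm_weight=1.0):
      """edges: list of (start, end, target_words, engine_score) covering
      source positions [start, end); lm_logprob(prev, word) is a stand-in
      for the output language model."""
      best = {0: (0.0, [])}                      # position -> (score, output so far)
      for pos in range(n_words):                 # positions in source order
          if pos not in best:
              continue
          score, words = best[pos]
          for start, end, tgt, eng_score in edges:
              if start != pos:
                  continue
              prev = words[-1] if words else '<s>'
              lm = sum(lm_logprob(p, w)
                       for p, w in zip([prev] + list(tgt), tgt))
              cand = score + eng_score + lm_weight * lm
              if end not in best or cand > best[end][0]:
                  best[end] = (cand, words + list(tgt))
      return best.get(n_words, (-math.inf, []))[1]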
8. Multi-Engine MT Results
- Most combinations outperform the individual engines.
- We submitted two combinations and the engines they combined for official scores, with the results (NIST scores) shown below.

    Combination 1            Combination 2
    PhrEBMT   3.9668         PhrEBMT   3.9668
    SMT       5.7354         XFER      4.8404
    Combo     5.9524         Combo     5.2170

- Additionally, we see the effect of combining lexica with phrasal EBMT:

    PhrEBMT       3.9668
    PhrEBMT+lex   5.2883
9. CMU Small Data Results
- Official results were submitted with a segmentor trained on the full LDC word list (same as the large data track).
- We retrained our segmentor with only the 10K small dictionary and the words from the 100K Chinese treebank, and re-evaluated our latest systems.
- Results with the new (small) segmentation are reported in parentheses.
10. CMU Small Data Results: SMT
- Results with the full segmentor and with the re-trained small segmentor, for different versions of our SMT system.
11. Learning Transfer-Rules for Languages with Limited Resources
- Rationale:
  - Large bilingual corpora not available
  - Bilingual native informant(s) can translate and align a small pre-designed elicitation corpus, using an elicitation tool
  - Elicitation corpus designed to be typologically comprehensive and compositional
  - Transfer-rule engine and new learning approach support acquisition of generalized transfer rules from the data
12. AVENUE Transfer
13. Sample Transfer Rule
- Rules contain the necessary information for analysis, transfer, and generation
- Unification equations are used to build source and target feature structures
- Example: transfer of Chinese questions, which are formed by appending the particle MA, into English

    {S,2}                          ; Rule ID
    S::S [NP VP MA] -> [AUX NP VP] ; Source and target production rules
    (
     (x1::y2)                      ; Source NP aligns with target NP
     (x2::y3)                      ; Source VP aligns with target VP
     ((x0 subj) = x1)              ; Build the source feature structure
     ((x0 subj case) = nom)
     ((x0 act) = quest)
     (x0 = x2)
     ((y1 form) = do)              ; Set inserted constituent AUX's base form to "do"
     ((y3 vform) =c inf)           ; Constrain verb to infinitive form
     ((y1 agr) = (y2 agr))         ; Enforce agreement between "do" and subject
    )
14. Transfer Overview
- The AVENUE translation engine was developed internally and follows a three-step transfer approach:
  - Analysis
  - Transfer
  - Generation
- The engine can be run with manually developed transfer rules as a stand-alone system, or operate as part of our larger rule-learning system.
15. RADD Transfer Development
- Total Chinese-specific rule and lexicon development time: 71 hours
- Small and Large Tracks used the same transfer rules but different-sized lexicons (10K vs. 50K)
- Rules were developed by a bilingual speaker with a linguistics background, based upon manual evaluation of training data and personal grammatical knowledge.
- Development concentrated on translating noun phrases and structures where Chinese and English word order differ.
16. Analysis
- Analysis uses a unification-based chart parser to find the input sentence's grammatical structure.
- All possible analyses and transfer paths are efficiently packed together in a packed forest for later use.
17. Transfer
- Transfer rules manipulate the parse tree(s) created during analysis.
- Constituents (such as noun and verb phrases) can be reordered, inserted, or deleted.
- Words are translated using a transfer lexicon.
- For sentences without a complete parse, transfer occurs on the longest sub-parses found during analysis. (A simplified sketch of rule application follows.)
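As a rough illustration only (not the actual Transfer Engine, with feature-structure unification omitted and a simplified alignment representation), applying one rule to a matched constituent sequence could look like this: constituents are reordered per the rule's alignments, unaligned target constituents are inserted, and words are looked up in a transfer lexicon:

  def apply_rule(rule, src_constituents, lexicon):
      """rule: dict with 'rhs' (target constituent labels) and 'alignments'
      (target position -> source position, 1-based, missing = inserted);
      src_constituents: (label, word) pairs matched by the rule's source side;
      lexicon: source word -> target word."""
      output = []
      for tgt_pos, label in enumerate(rule['rhs'], start=1):
          src_pos = rule['alignments'].get(tgt_pos)
          if src_pos is None:
              output.append((label, None))            # inserted constituent, e.g. AUX
          else:
              word = src_constituents[src_pos - 1][1]
              output.append((label, lexicon.get(word, word)))
      return output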
18. Generation
- During generation, the engine checks that the target-language tree produced by transfer satisfies target-language constraints (e.g. subject-verb agreement in English).
- Finally, the target sentence is read from the leaves of the target tree and returned.
19. Rule Learning: Overview
- Goal: acquisition of syntactic transfer rules
- 1) Flat Seed Generation: produce rules from word-aligned sentence pairs, abstracted only to the POS level; no syntactic structure
- 2) Compositionality: add compositional structure to seed rules by exploiting previously learned rules
- 3) Seeded Version Space Learning: group seed rules by constituent sequences and alignments; seed rules form the s-boundary of the version space; generalize with validation
20. Flat Seed Generation
- Create a seed rule that is specific to the sentence pair, but abstracted to the POS level. Use SL information (e.g. parses) and any TL information. E.g. (a sketch of this step follows the example):
  - The highly qualified applicant visited the company.
  - Der äußerst qualifizierte Bewerber besuchte die Firma.
  - Alignment: ((1,1), (2,2), (3,3), (4,4), (5,5), (6,6), (7,7))

    S::S [det adv adj n v det n] -> [det adv adj n v det n]
    (
     (alignments: (x1::y1) (x2::y2) (x3::y3) (x4::y4) (x5::y5) (x6::y6) (x7::y7))
     (constraints:
      ((x1 def) = +) ((x4 agr) = 3-sing) ((x5 tense) = past) ...
      ((y1 def) = +) ((y3 case) = nom) ((y4 agr) = 3sg))
    )
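A minimal sketch of this step, producing a flat seed rule as a plain data structure from the POS sequences, word alignments, and whatever feature constraints are already known; names and representation are illustrative, not the AVENUE learner's:

  def flat_seed_rule(src_pos, tgt_pos, word_alignments, constraints):
      """src_pos/tgt_pos: POS-tag sequences; word_alignments: 1-based (i, j)
      pairs; constraints: feature constraints already known from SL/TL
      information, written as plain strings."""
      return {
          'lhs': list(src_pos),                       # source side, POS level only
          'rhs': list(tgt_pos),                       # target side, POS level only
          'alignments': [(f'x{i}', f'y{j}') for i, j in word_alignments],
          'constraints': list(constraints),
      }

  # Usage for the example above (constraints abbreviated):
  seed = flat_seed_rule(
      ['det', 'adv', 'adj', 'n', 'v', 'det', 'n'],
      ['det', 'adv', 'adj', 'n', 'v', 'det', 'n'],
      [(i, i) for i in range(1, 8)],
      ['((x4 agr) = 3-sing)', '((x5 tense) = past)', '((y4 agr) = 3sg)'])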
21. Compositionality
- If there is a previously learned rule that can account for part of the sentence, adjust the seed rule to reflect this compositional element.
- Adjust constituent sequences, alignments, and constraints; add context constraints (from possible translations), remove unnecessary ones.

    Seed rule:
    S::S [det adv adj n v det n] -> [det adv adj n v det n]
    (
     (alignments: (x1::y1) (x2::y2) (x3::y3) (x4::y4) (x5::y5) (x6::y6) (x7::y7))
     (constraints:
      ((x1 def) = +) ((x4 agr) = 3-sing) ((x5 tense) = past) ...
      ((y1 def) = +) ((y4 agr) = 3sg))
    )

    Previously learned rule:
    NP::NP [det adv adj n] -> [det adv adj n]
    ((x1::y1) ... ((y4 agr) = (x4 agr)) ...)

    Adjusted seed rule:
    S::S [NP v det n] -> [NP v det n]
    (
     (alignments: (x1::y1) (x2::y2) (x3::y3) (x4::y4) (x5::y5) (x6::y6) (x7::y7))
     (constraints:
      ((x5 tense) = past) ...
      ((y1 def) = +) ((y1 case) = nom) ((y1 agr) = 3sg))
    )
22. Seeded Version Space Learning
- (figure: version space over constituent sequences, e.g. [NP v det n], [NP VP])
- 1. Group seed rules into version spaces as above.
- 2. Make use of the partial order of rules in the version space. The partial order is defined via the f-structures satisfying the constraints.
- 3. Generalize in the space by repeated merging of rules (both merge operators are sketched after this slide):
  - Deletion of a constraint
  - Moving value constraints to agreement constraints, e.g.
      ((x1 num) = pl), ((x3 num) = pl)  ->  ((x1 num) = (x3 num))
- 4. Check the translation power of the generalized rules against sentence pairs.
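A minimal sketch of the two merge operators named in step 3, with constraints kept as (path, value) pairs; this illustrates the operators only, not the actual learner:

  def delete_constraint(constraints, path):
      """Operator 1: drop a value constraint entirely."""
      return [(p, v) for p, v in constraints if p != path]

  def to_agreement(constraints, path_a, path_b):
      """Operator 2: if two paths are constrained to the same value, replace
      both value constraints by a single agreement constraint path_a = path_b."""
      vals = dict(constraints)
      if path_a in vals and path_b in vals and vals[path_a] == vals[path_b]:
          rest = [(p, v) for p, v in constraints if p not in (path_a, path_b)]
          return rest + [(path_a, path_b)]            # agreement: (x1 num) = (x3 num)
      return constraints

  # ((x1 num) = pl), ((x3 num) = pl)  ->  ((x1 num) = (x3 num))
  cs = [(('x1', 'num'), 'pl'), (('x3', 'num'), 'pl')]
  print(to_agreement(cs, ('x1', 'num'), ('x3', 'num')))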
23. Future Work
- Baseline evaluation
- Adjust generalization step size
- Revisit generalization operators
- Introduce specialization operators to retract from overgeneralizations (including seed rules)
- Learn from an unstructured bilingual corpus
- Evaluate merges to pick the optimal one at any step, based on cross-validation and the number of sentences a rule can translate